87 changes: 82 additions & 5 deletions specification/index.bs
@@ -9,6 +9,7 @@ Editor: Mukul Purohit, Microsoft Corporation https://www.microsoft.com, mpurohit
Editor: Tomislav Jovanovic, Mozilla https://www.mozilla.org/, tjovanovic@mozilla.com
Editor: Oliver Dunk, Google https://www.google.com, oliverdunk@chromium.org
Abstract: [Placeholder] Abstract.
Repository: w3c/webextensions
Markup Shorthands: markdown yes
</pre>

@@ -108,6 +109,8 @@ This key must be present if the `_locales` subdirectory is present, must be abse

This key may be present.

Issue: Specify background scripts. For relevant discussion, see https://github.com/w3c/webextensions/issues/282

### Key `commands`

This key may be present.
@@ -158,15 +161,64 @@ The <a href="#key-trial_tokens">`trial_tokens`</a> key is an optional [=list=] o

Filenames beginning with an underscore (`_`) are reserved for use by user agent.

# Isolated worlds
# Execution contexts

Extensions can execute JavaScript code, in any of the following execution contexts:

* An <dfn>extension context</dfn> is a [=realm=] associated with an [=extension origin=].

Member

It seems like a realm is the definition we should align on, but I was previously under the impression that worlds today have some differences from realms. Is that the case and, if so, are they different enough that we are just documenting something idealistic?

Feel free to ignore this if the above is not the case :)

Member Author

> It seems like a realm is the definition we should align on, but I was previously under the impression that worlds today have some differences from realms.

Could you elaborate what you are thinking of?

A realm is the pre-existing abstract description of the context where JS code executes.

In Firefox, we create XPConnect sandboxes, each of which creates a realm and sets up the globals. If objects are shared between the page's realm and the content script sandbox's realm, the object is wrapped in an XrayWrapper that offers controlled access (and restrictions) as needed. That mechanism is documented at https://firefox-source-docs.mozilla.org/dom/scriptSecurity/xray_vision.html

In Chrome, the JS execution environments are kept strictly separate, which is how Chromium enforces the isolation. That is documented at https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/renderer/bindings/core/v8/V8BindingDesign.md . That makes zero mention of a "realm", but at some point JS needs to execute, so if I were to treat Chromium's implementation as a black box, a realm seems like a good abstraction.

Although "isolated world" is originally Chromium terminology that leaks outside through various API names, we should not specify it in terms of Chromium/v8 internals, but in terms of existing standard concepts. This is the first time I have written this kind of spec, so feedback is welcome!

Member

I did some further reading on realms and I think it's the right definition (@rdcronin to confirm).

> Although "isolated world" is originally Chromium terminology that leaks outside through various API names, we should not specify it in terms of Chromium/v8 internals, but in terms of existing standard concepts.

I agree... almost. We definitely shouldn't specify things in terms of the internals of a given engine. However, we also shouldn't use a standard concept just because it is desirable if there is a meaningful delta between the behavior developers would expect if we properly implemented that concept vs. the reality of how engines work today.

As best as I understand it, a V8 context is essentially a realm, so using realms in the specification matches reality. If that's the case, I'm on board.

Collaborator

I have the same concern as Oliver. While the concept of a realm in Ecma262 is very close to what we would use to describe an execution context for an extension, I don't think this is inherently true, nor that it is guaranteed to remain accurate. The spec for realms goes into details about the manner in which realms are created and initialized, and I'm not sure that this exact process is followed for v8 isolated worlds, nor do I think it is a fundamental requirement for them (that is, if a browser initialized its content script execution contexts differently from the exact realm spec, but in a way that still aligned with the properties we expect of an isolated world, I think that would still be fine from an extension perspective -- even though it would mean it deviated from the realm spec).

I'd prefer we avoid the "realm" usage here, instead using either nomenclature which is suitably abstract ("context") or defining our own with the set of principles required.

Member Author

> I have the same concern as Oliver. While the concept of a realm in Ecma262 is very close to what we would use to describe an execution context for an extension, I don't think this is inherently true, nor that it is guaranteed to remain accurate.

I think that, among standard terms, the concept of an ECMAScript realm is the right choice here. I'll show that Chromium (+Blink+v8) does indeed have an implementation that (trivially) resembles (and augments) it.
Safari has concepts similar to Chromium's, since the basics of isolated worlds (DOMWrapperWorld) existed in WebKit before Chromium created Blink as a fork of WebKit.

The literal text in the ECMA spec you linked (which I already cited in my proposed specification text) is:

> Before it is evaluated, all ECMAScript code must be associated with a realm. Conceptually, a realm consists of a set of intrinsic objects, an ECMAScript global environment, all of the ECMAScript code that is loaded within the scope of that global environment, and other associated state and resources.

The first sentence already provides a basis for the argument that any JS engine should already recognize the concept of a realm. If we remove all special behavior, an isolated world is a place where JS code executes - which is exactly what a realm represents.

Note that I intentionally chose the ECMAScript definition (which has minimal requirements, it is browser-agnostic) instead of the augmented realm + environment settings object from the HTML spec (https://html.spec.whatwg.org/multipage/webappapis.html#realms-and-their-counterparts), exactly because the HTML spec version of it imposes many expectations that none of our implementations have.

Now I'm going to enumerate some Chromium implementation details:

  • script_state.h class declaration starts with the following comment: "ScriptState is an abstraction class that holds all information about script execution (e.g., v8::Isolate, v8::Context, DOMWrapperWorld, ExecutionContext etc). If you need any info about the script execution, you're expected to pass around ScriptState in the code base. ScriptState is in a 1:1 relationship with v8::Context."
  • ScriptStateImpl::Create is the glue between Blink and v8, and calls V8Initializer::InitializeContext(context, execution_context); (with context a v8::Context and execution_context the "ExecutionContext" described above) and pairs the instance with a DOMWrapperWorld, together forming a ScriptState.
  • script_state.h also has several static methods on ScriptState that use the "realm" terminology (ForCurrentRealm / ForRelevantRealm) that maps an arbitrary v8 object, via v8::Context back to ScriptState.
  • Conclusion: The "realm" concept has a 1:1 mapping to a v8::Context, and together with the documentation at https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/renderer/bindings/core/v8/V8BindingDesign.md establishes a firm link between the "realm" concept and "world".

Chromium's point where it creates an isolated world is IsolatedWorldManager::GetOrCreateIsolatedWorldForHost, which shows (among others) that the instantiation of an isolated world also includes an identifier for the extension (host_id).

> The spec for realms goes into details about the manner in which realms are created and initialized, and I'm not sure that this exact process is followed for v8 isolated worlds, nor do I think it is a fundamental requirement for them (that is, if a browser initialized its content script execution contexts differently from the exact realm spec, but in a way that still aligned with the properties we expect of an isolated world, I think that would still be fine from an extension perspective -- even though it would mean it deviated from the realm spec).

The definition of a Realm covers the Realm record and its fields/properties (including [[HostDefined]], where anything goes) and does not require the exact steps for creation to be followed. For example, Chromium's v8 engine does not even follow these steps; it takes the smarter approach of deserializing a snapshot. There is a human-readable write-up at https://v8.dev/blog/lazy-deserialization for example.

What we care about are the observed behaviors. We need a place to execute (which a realm provides), some indication that makes it different from the main world and other extensions (the world type, extension origin association), and a guarantee that the intrinsics (builtin prototypes etc.) operate on the same underlying state (which is a bit handwavy, but we can expand later if we really want to). The text here is my attempt at expressing the requirements in terms of the bare minimum that is interoperable. I haven't even specified how to create an isolated world - as you remarked, that would be too detailed at this stage.

> I'd prefer we avoid the "realm" usage here, instead using either nomenclature which is suitably abstract ("context") or defining our own with the set of principles required.

In my first draft of this PR, I initially used `An <dfn>extension context</dfn> is any JavaScript execution context associated with an extension.` This sounds good at first, but if we look up the ECMA-262 definition of "execution context", the concept specified there does not match what we want from it. The "realm" concept, to me at least, does.

And then I continued looking for a specification concept capturing built-in DOM APIs and their prototype methods. These are specified by WebIDL and the platform object concept seems to be a perfect fit for that:

> Although at the time of this writing the JavaScript specification does not reflect this, every JavaScript object must have an associated realm. The mechanisms for associating objects with realms are, for now, underspecified. However, we note that in the case of platform objects, the associated realm is equal to the object’s relevant realm,

I think that "realm" is not only a reasonable but also an accurate basis for specifying the expected behaviors. I think that my attempt to specify extension contexts and isolated worlds is an accurate representation of how Chromium and Firefox behave. I'll need to refine the text; e.g. the `with its own [=global object=]` part of `A <dfn>world</dfn> is a [=realm=] with its own [=global object=].` is redundant, since the definition of a realm already includes [[GlobalObject]]. But before I get there, we first need to agree on the primitives used for the specification.

* A <dfn>privileged extension context</dfn> is an [=extension context=] with access to the full set of extension APIs available to the extension. For example, the background page or worker defined through [[#key-background]].
* A <dfn>content script context</dfn> is an [=extension context=] and [=isolated world=] with limited access to a subset of extension APIs. This is the default execution environment of all [=content scripts=] of an extension.
* The [=main world=] of a web page not associated with an [=extension origin=] is not an extension context. It does not have access to any extension API, except when an extension allows so through [[#key-externally_connectable]].

Some extension APIs may involve the execution of JavaScript code in contexts other than what is specified above. For example, the `userScripts` API allows the creation of `USER_SCRIPT` worlds that are isolated similarly to [=isolated worlds=] but with distinct API availability.

## Isolated worlds

A <dfn>world</dfn> is a [=realm=] with its own [=global object=].

The <dfn>main world</dfn> is the [=realm=] whose [=global object=] is the
associated document's {{Window}}, in which the document's own scripts run.
This is the realm implied throughout other specifications that assume a single
realm per document.

A document may also have a number of [=isolated worlds=], created by the user
agent to run [=content scripts=] in a [=content script context=].

An <dfn>isolated world</dfn> is a distinct [=realm=] whose [=global object=]'s
interface is {{Window}}, associated with the [=main world=]'s document. The
[=platform objects=] of this realm are distinct from their counterparts in the
[=main world=], but operate on the same underlying state. These operations
should maintain isolation across realms: no object in an [=isolated world=]'s
realm is observable from the [=main world=].
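
As a rough illustration of "distinct platform objects that operate on the same underlying state", the following toy model may help. It is not how any user agent implements worlds (that happens in native code), and `World`, `wrap`, and `documentState` are hypothetical names invented for this sketch. Each world keeps its own wrapper for a shared piece of state, so writes to the state are visible everywhere while properties added to a wrapper stay in that world.

```javascript
// Toy model only: real user agents implement worlds natively.
// `World`, `wrap`, and `documentState` are hypothetical names for this sketch.
class World {
  constructor(name) {
    this.name = name;
    // One wrapper per underlying object, private to this world.
    this.wrappers = new WeakMap();
  }

  wrap(state) {
    let wrapper = this.wrappers.get(state);
    if (wrapper === undefined) {
      // The wrapper forwards to the shared state but is itself a distinct object.
      wrapper = {
        get title() { return state.title; },
        set title(value) { state.title = String(value); },
      };
      this.wrappers.set(state, wrapper);
    }
    return wrapper;
  }
}

const documentState = { title: "Original" };
const mainWorld = new World("MAIN");
const isolatedWorld = new World("ISOLATED");

const pageDocument = mainWorld.wrap(documentState);
const scriptDocument = isolatedWorld.wrap(documentState);

// Writes go through to the shared underlying state.
scriptDocument.title = "Changed by content script";
// Extra properties live on this world's wrapper only.
scriptDocument.secret = { token: 42 };
```

In this model `pageDocument` observes the new title but has no `secret` property, which mirrors the isolation requirement in a simplified way.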

<div class="note">
For example, {{CustomEvent}} specifies the `event.detail` attribute that
"must return the value it was initialized to". Event dispatch can cross worlds,
and following this requirement to the letter would result in the exposure of
an object from one realm to another. Potential resolutions include:

- Throwing at `CustomEvent` construction when a non-primitive value is passed.
- Returning `null`.
- Returning a structured clone, per [[HTML#safe-passing-of-structured-data]].
- Returning a redacted version of the object.

<dfn>Worlds</dfn> are isolated JavaScript contexts with access to the same underlying DOM tree but their own set of wrappers around those DOM objects. Declarations in the global scope are also isolated.
</div>
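
To make the listed resolutions more concrete, here is a minimal sketch that combines two of them: returning a structured clone when the value crosses worlds, and falling back to `null` when the value cannot be cloned. The function name `detailForWorld` and the string world labels are assumptions made for this example. Because the sketch runs in a single realm, it only demonstrates that the listener receives a different object; a user agent would additionally create the clone in the listener's realm.

```javascript
// Hypothetical helper: decide what `event.detail` looks like to a listener.
// Combines the "structured clone" and "return null" options from the note.
function detailForWorld(detail, originWorld, listenerWorld) {
  // Same world: the listener may see the original value.
  if (originWorld === listenerWorld) return detail;

  // Primitive values are not objects, so nothing from the origin realm leaks.
  const isObjectLike =
    detail !== null && (typeof detail === "object" || typeof detail === "function");
  if (!isObjectLike) return detail;

  try {
    // Cross-world: hand out a copy rather than the original object.
    return structuredClone(detail);
  } catch {
    // Values that cannot be cloned (for example functions) become null.
    return null;
  }
}
```

Which of the resolutions a specification should adopt remains an open question in the note; the sketch only shows that the options can be expressed as a single boundary check.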

# Unavailable APIs

# The `browser` global

# Extension origin
[[webextensions-browser-global inline]] is the primary namespace hosting extension APIs, available to [=extension contexts=].

Although the [=main world=] of a web page is not an [=extension context=], it may also contain the `browser` global to offer access to functionality granted by [[#key-externally_connectable]].

# <dfn>Extension origin</dfn>

The extension origin is a [=tuple origin=] consisting of an [=extension scheme=] and an extension-specific host. An <dfn>extension scheme</dfn> is a browser-specific scheme reserved for extension use.

<div class="example">
Examples of [=extension schemes=] include `chrome-extension`, `moz-extension`, and `safari-web-extension`.
</div>
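
As an illustration of the scheme and host pair described above, the following sketch checks whether a URL string uses one of the example schemes and, if so, extracts the extension-specific host. The set of schemes is taken from the example and is not exhaustive; `parseExtensionOrigin` and `KNOWN_EXTENSION_SCHEMES` are hypothetical names, and the sketch relies only on the standard `URL` constructor.

```javascript
// Schemes from the example above; other user agents may reserve different ones.
const KNOWN_EXTENSION_SCHEMES = new Set([
  "chrome-extension",
  "moz-extension",
  "safari-web-extension",
]);

// Hypothetical helper: returns { scheme, host } for extension URLs, else null.
function parseExtensionOrigin(urlString) {
  let url;
  try {
    url = new URL(urlString);
  } catch {
    return null; // not a parseable absolute URL
  }
  const scheme = url.protocol.slice(0, -1); // drop the trailing ":"
  if (!KNOWN_EXTENSION_SCHEMES.has(scheme)) return null;
  return { scheme, host: url.hostname };
}
```

The sketch deliberately avoids `url.origin`, since the URL Standard serializes the origin of non-special schemes as `"null"`.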

# Localization

@@ -202,9 +254,9 @@ A <dfn>glob</dfn> can be any [=string=]. It can contain any number of wildcards

## Background content

## Content scripts
## <dfn>Content scripts</dfn>

<dfn>Content scripts</dfn> represent a set of JS and CSS files that should be injected into matching pages loaded by the user agent. They are injected using the steps in [[#inject-a-content-script]].
Content scripts represent a set of JS and CSS files that should be injected into matching pages loaded by the user agent. They are injected using the steps in [[#inject-a-content-script]]. Content scripts run in an [=isolated world=] by default.
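
The default stated above can be sketched as a small resolver over the `ExecutionWorld` values (`"ISOLATED"` and `"MAIN"`) that this specification defines. The function name `resolveExecutionWorld` is hypothetical, and throwing a `TypeError` for unknown values is an assumption of this sketch rather than behavior taken from the specification.

```javascript
// The two ExecutionWorld values described in this specification.
const EXECUTION_WORLDS = ["ISOLATED", "MAIN"];

// Hypothetical helper: pick the world a content script runs in.
function resolveExecutionWorld(world) {
  // Content scripts run in an isolated world by default.
  if (world === undefined) return "ISOLATED";
  // Assumption: reject anything that is not a known enum value.
  if (!EXECUTION_WORLDS.includes(world)) {
    throw new TypeError(`Unknown ExecutionWorld: ${world}`);
  }
  return world;
}
```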

### Key `matches`

@@ -277,6 +329,10 @@ enum ExecutionWorld {

The {{ExecutionWorld}} enum represents a JavaScript [=world=].

"ISOLATED" corresponds to an [=isolated world=].

"MAIN" corresponds to the [=main world=].

## Extension pages

# Classes of security risk
@@ -336,3 +392,24 @@ To determine if a content script should be injected in a document:
1. If |url| matches an entry in `exclude_matches` or `exclude_globs`, return.
1. If this is a child frame, and `all_frames` is not `true`, return.
1. Otherwise, inject the content script. This should be done based on the `run_at` setting.
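
The three steps visible in this hunk can be sketched as follows. Earlier steps of the algorithm are not shown in the diff and are therefore not modeled. The object shapes (`script` mirroring the manifest keys, `frame` with `url` and `isChildFrame`) and the `matchers` callbacks are hypothetical; match pattern and glob matching are left to the caller rather than guessed at here.

```javascript
// Hypothetical helper covering only the last three steps shown above.
// `matchers.matchPattern(url, pattern)` and `matchers.glob(url, glob)` are
// caller-supplied predicates; their semantics are not defined in this sketch.
function passesVisibleInjectionChecks(script, frame, matchers) {
  const excludedByPattern = (script.exclude_matches ?? []).some((pattern) =>
    matchers.matchPattern(frame.url, pattern)
  );
  const excludedByGlob = (script.exclude_globs ?? []).some((glob) =>
    matchers.glob(frame.url, glob)
  );
  // Step: a URL matching exclude_matches or exclude_globs is not injected.
  if (excludedByPattern || excludedByGlob) return false;

  // Step: child frames are skipped unless all_frames is true.
  if (frame.isChildFrame && script.all_frames !== true) return false;

  // Step: otherwise inject; the caller schedules this according to run_at.
  return true;
}
```

Passing the matchers in keeps the sketch independent of how patterns and globs are ultimately specified.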

<pre class="link-defaults">
spec:html; type:dfn; for:realm; text:global object
</pre>

<pre class="biblio">
{
"webextensions-browser-global": {
"authors": [
"Patrick Kettner"
],
"href": "https://w3c.github.io/webextensions/specification/window.browser.html",
"title": "window.browser",
"status": "CG-DRAFT",
"publisher": "WECG",
"deliveredBy": [
"https://www.w3.org/groups/cg/webextensions/"
]
}
}
</pre>
2 changes: 1 addition & 1 deletion specification/window.browser.bs
@@ -1,6 +1,6 @@
<pre class="metadata">
Title: window.browser
Shortname: wecg-browser
Shortname: webextensions-browser-global
Level: 1
Group: wecg
Status: w3c/CG-DRAFT