spec: clarify normative language for resolver validation (26 edits from cross-impl security audit)#282
Conversation
A four-implementation security audit (Rust, Python, TypeScript, Java) of did:webvh v1.0 surfaced 25+ vulnerabilities reproduced across multiple implementations. Six issues hit three of four implementations — strong evidence that the defects trace to under-normative spec language rather than implementation negligence. This commit applies 26 surgical edits that: - bind witness proofs to the resolved DID's URL and to distinct witness identities (cross-DID replay decentralized-identity#11, threshold inflation via duplicates decentralized-identity#3) - require updateKeys explicitly in every entry under pre-rotation (decentralized-identity#9) - bind state.id SCID to parameters.scid on every entry, including portable renames (decentralized-identity#14, decentralized-identity#15) - mandate cryptosuite enforcement on every proof, not just proofPurpose (decentralized-identity#10) - forbid silent downgrades of unknown method values (R6) - forbid "fast resolve" paths that skip intermediate-entry verification (T6) - harden the DID-to-HTTPS transformation against path traversal, IP-literal hosts, and percent-encoding case-folding bypasses (decentralized-identity#2, decentralized-identity#4, decentralized-identity#5, decentralized-identity#12, decentralized-identity#13) - anchor SCID-placeholder substitution to structural locations (P14, J6) - tighten versionTime monotonicity and future-time bounds on every entry (P1, P7, J8) - add a Resolver Transport Hardening subsection and a Resolver Validation Checklist to Security Considerations - add negative log-entry examples (what resolvers MUST reject) so implementers can self-test No changes to the DID URL grammar, JSON log-entry shape, witness-proof shape, multihash algorithms, or JCS canonicalization. The wire format is unchanged. Implementations that pass these clarified requirements remain interoperable with implementations that already passed the pre-clarification text. Full audit and per-bug rationale: see PR description. Signed-off-by: Glenn Gore <glenn.gore@gmail.com>
|
|
||
| ::: example | ||
|
|
||
| **Non-compliant log entries that resolvers MUST reject.** Illustrative — not exhaustive. |
There was a problem hiding this comment.
I don't think we should be using normative language here. This is an example. Suggest:
Non-compliant Log Entry Examples (informative). Illustrative — not exhaustive.
There was a problem hiding this comment.
Remove the "MUST reject" - though is it clear then as to what a resolver should do here?
| the previous step. | ||
| 2. **Replace the placeholder `{SCID}`** Replace the placeholder "`{SCID}`" with the calculated [[ref: SCID]] from the previous step **only** at the following structurally-anchored locations within the preliminary JSON object: `parameters.scid`; `state.id`; `state.controller` (string or each array entry); `id` and `controller` of every entry in `state.verificationMethod`; every string entry of `state.authentication`, `state.assertionMethod`, `state.keyAgreement`, `state.capabilityInvocation`, `state.capabilityDelegation` (and `id`/`controller` for any object entries); `id` of every entry in `state.service`. | ||
|
|
||
| The replacement **MUST NOT** be performed by unanchored string substitution over the serialized JSON. Unanchored substitution corrupts unrelated fields that may legitimately contain the placeholder as a substring (e.g., `alsoKnownAs`, `serviceEndpoint`). |
There was a problem hiding this comment.
I had thought about this when this was written and thought that a global replacement of all instances of "{SCID}" with the calculated SCID value was the right thing to do.
What is a realistic example where you wouldn't want this?
Why is that not sufficient?
There was a problem hiding this comment.
This is just being super pedantic, but there is an edge case where someone may, for some reason, want to include "scid" in an ID field for a key or service record. A basic string substitution may incorrectly pick up on the "scid" string and replace it with the calculated SCID.
This is just being explicit on where the string "scid" may be used elsewhere, independent of the webvh standard itself. Though reading this the 2nd paragraph is probably enough to protect on this rare edge case.
There was a problem hiding this comment.
To be clear, the placeholder is "{SCID}", so pretty unlikely. My concern is that we are trying to list off all the places to specifically change, but if we miss any, the user is hooped. I think we are better off leaving it and perhaps putting in a caveat saying that EVERY literal "{SCID}" will be replaced so you can't have a DIDDoc that contains that string.
| file, applying the [[ref: parameters]] set from the current and previous | ||
| entries. As noted in the [DID Log File](#the-did-log-file) section, [[ref: log entries]] | ||
| are each a JSON object with the following properties: | ||
| To process the retrieved [[ref: DID Log]] file, the resolver **MUST** carry out the following steps on each of the [[ref: log entries]] in the order they appear in the file, applying the [[ref: parameters]] from the current and previous entries. Every step **MUST** be performed for **every** entry; in particular, [[ref: Data Integrity]] proof verification (step 2) and `entryHash` verification (step 3) **MUST NOT** be skipped for intermediate entries on the grounds that the resolver only needs the latest [[ref: DIDDoc]]. A "fast-resolve" implementation that verifies only a proper subset of entries is **non-compliant**: the proof chain is what binds the latest entry's authority to the genesis SCID, and breaking it at any point invalidates all later entries. |
There was a problem hiding this comment.
I don't think normative language and all the words are needed here. This is already in the spec. I think some guidance is reasonable, but IMHO this equates to "Use the spec. Really. We mean it."
|
|
||
| ##### Reading did:webvh DID URLs | ||
|
|
||
| A `did:webvh` DID identifies a log by its SCID; host/path locate where the log is hosted. When a resolver receives a request, the SCID segment of the requested DID **MUST** equal the SCID segment of `state.id` in at least one entry of the retrieved log **and** equal `parameters.scid` from the first entry. A log whose `state.id` SCID does not match the requested DID — even if host/path matches — **MUST NOT** be returned; resolution **MUST** terminate. This rule applies independently of whether portability is enabled. |
There was a problem hiding this comment.
Why the "...equal the SCID segment of state.id in at least one entry of the retrieved log" -- it must be the same SCID in ALL entries.
| changed (`did.jsonl` to `did-witness.json`). The media type of the file | ||
| **SHOULD** be `application/json`. | ||
|
|
||
| Resolvers **MUST** retrieve `did-witness.json` from the location derived from **the DID being resolved**, not from a location supplied by another DID's log or out of band. The host, port, and path **MUST** match the `did.jsonl` URL (only the final element differs). A `did-witness.json` from a different host/path **MUST NOT** be combined with a `did.jsonl` for this DID; doing so permits cross-DID transplantation, even if every individual proof verifies as a `did:key` signature, because the payload `{"versionId":"..."}` does not bind to a DID. |
There was a problem hiding this comment.
Would this invalidate the use of a Watcher for resolving a DID?
Not sure about the last line. If the payload does not bind to the DID it must be rejected no matter where the file came from.
There was a problem hiding this comment.
Yes, this should be from the derived URL or a valid watcher.
There was a problem hiding this comment.
The statement "...doing so permits cross-DID transplantation..." is wrong. The did-witness.json file:
- must have
versionIdvalues that match that of the DID, and those values are chained to the DID. - must use a
did:keythat is referenced in thewitnesseslist in the DID parameters.
There is no reason to require the did-witness.json file to come from the defined web location (e.g. watchers -- official or not) are fine, and as long as the resolvers are properly checking what witness proofs to include/exclude, there is a direct tie between the DID and the witness file.
| 1. Complete all non-witness verifications of the [[ref: DID Log]] **before** processing any witness proof. Witness verification **MUST NOT** substitute for entry-hash or signature verification. | ||
| 2. Retrieve `did-witness.json` from the location derived from **this** DID via the [DID-to-HTTPS Transformation](#the-did-to-https-transformation). A file from a different location, or supplied out of band, **MUST NOT** be accepted unless provenance is independently established. | ||
| 3. For each entry in `did-witness.json`, confirm its `versionId` matches a `versionId` present in **this** `did.jsonl`. Non-matching entries **MUST** be discarded; resolvers **MUST NOT** treat them as evidence of witnessing for any other entry. | ||
| 4. Because the signed payload (`{"versionId": "<n-hash>"}`) does **not** itself encode the DID, the location-and-`versionId`-match check in step 3 is the **sole** mechanism binding a proof to this DID; implementations **MUST NOT** skip it. A witness signature lifted from another DID's `did-witness.json` with the same `versionId` **MUST NOT** be accepted. |
There was a problem hiding this comment.
Nothing incorrect about this, but it seems pretty unlikely and so unnecessary. Given that each versionID is GUID, it seems a stretch that a useful witness signature could be lifted from another DID's witness.json file.
There was a problem hiding this comment.
AI was getting paranoid by this stage of it's work - agreed it is not needed, as the witness checks should detect an invlaid signature that has been copied and inserted.
|
Reading through this -- it looks good. I noted a few things that looked weird to me as I was going through it, but nothing too serious. As always with AI it adds a lot of words, but in this case, not the end of the world. And if it makes it easier for another AI to read the spec and produce a compliant, secure implementation -- great. It appears that all of updates here are truly clarifications -- it doesn't change anything we intended in the v1.0 spec. Is that your read of it as well? If so, I think it would be best to keep the v1.0 spec and this "next" spec as the same. If you agree, copy the two updated files in /spec into /spec-v1.0, so that the two versions of the spec continue to match exactly. Interested in what others think. Thanks for this. |
|
We had a good discussion about this PR at the 20260604 did:webvh Work Item Meeting (Recording — starting at 14:15. While the vast majority of the PR is good, we do want some changes made to it to ensure it doesn’t change/break v1.0 and to address some mistakes. @stormer78 — how do you want to proceed with this PR? One option is that I could make changes to your fork, PR to you and you could accept/reject the changes and then update your PR. Happy to do it that way and I think it would preferred so that we have a human pass across the generated text. Here are the changes we want to see. The reference numbers is from the initial comment
Assuming these updates to the PR are made (especially 1, 3 and 4), we believe that the changes do not constitute breaking changes, just clarifications, and can thus be applied to both the “next” and “v1.0” versions of the spec. The current PR only covers “next”, but I would like them applied to “v1.0” as well. What do you think? . |
| 3. Form the set of **distinct** attributed `id` values. The threshold check applies to the size of this set, not to the raw proof count. | ||
| 4. If `|set| ≥ threshold`, the update is "[[ref: witnessed]]"; otherwise resolution **MUST** terminate with an error. | ||
|
|
||
| A single witness submitting multiple proofs (e.g., signed by additional keys) **MUST NOT** satisfy threshold > 1. |
There was a problem hiding this comment.
Removing this line as it is not correct. There is no way to know if a single witness used multiple keys.
Signed-off-by: Stephen Curran <swcurran@gmail.com>
…sion Signed-off-by: Stephen Curran <swcurran@gmail.com>
…-audit Corrections and rewording of the clarifications to the spec
|
Reiterating the comment I put in the PR to @stormer78 's PR: I reworded/adjusted most of the changes to be “better” aligned with the intent of the spec. The following is the list of changes, first with the actual errors, and then a rundown of the rewording, all of which are evident from the edits I made. The first list is the important one. Corrected errors:
Rewording rationale:
|
|
Would appreciate a review/approval from someone else @decentralized-identity/didwebvh-admins or @decentralized-identity/didwebvh-maintainers . This PR does not change the spec, merely clarifies sections. As such, the PR keeps in sync the v1.0 spec and the Editor's Draft. Let's get this merged so that we can tell the story about the use of Project Glasswing/Mythos to improve the implementations and the spec. |
PR: Clarify normative language for resolver validation
Base branch:
decentralized-identity/didwebvh:mainHead branch:
stormer78:spec-clarifications-from-security-auditLocal commit:
889eb7a(DCO-signed)Files changed:
spec/specification.md,spec/security_and_privacy.md(+149 / −72 lines)Summary
A cross-implementation security audit of four
did:webvhv1.0 referenceimplementations (Rust, Python, TypeScript, Java) found 25+ distinct vulnerabilities.
Six of those issues were reproduced by three of four implementations — strong
evidence that the defects trace to under-normative spec language rather than
isolated implementation negligence.
This PR applies 26 surgical edits that tighten normative requirements
without changing the wire format. The DID URL grammar, log-entry JSON
shape, witness-proof shape, multihash algorithm identifiers, and JCS
canonicalization process are untouched. Implementations that pass
the clarified text remain interoperable with implementations that passed
the pre-clarification text.
What changed
spec/specification.mdnextKeyHashes—updateKeysmandatory while pre-rotation activemethod+ § Authorized Keys — cryptosuite MUSTstate.idSCID MUST equalparameters.sciddid:keybody/fragment matchmethod— reject unknown values, no silent downgradeportable— codify retroactivity + SCID immutability in parameterdid:keybody onlyspec/security_and_privacy.mdHard constraints honoured
clarity, negative examples, or expanded Security Considerations — not
wire format
For one cross-impl spec gap (#11 cross-DID witness replay) the root
cause is in the witness signing payload, which doesn't bind to a DID.
Fixing that root cause would require a wire-format change and is out of
scope. The edits here propose the in-scope mitigation that closes the
practical exploit: requiring the URL-derived fetch +
versionId-matchas the binding mechanism (Edits 5, 16, 19, 20, 26).
Audit summary by severity
The audit findings the edits address:
did:new#14 SCID binding, Proposal 2:did:fractal#15 portable SCID swapCross-impl status tables are available with the audit; the per-bug
provenance is summarised in the test-suite PR (negative test vectors)
that pairs with this spec PR — see
decentralized-identity/didwebvh-test-suite#5.
Why we're confident the edits don't break interop
meaning.
A compliant implementation against the pre-clarification text was free
to apply or skip these checks; with these edits, "skip" is no longer
compliant. Implementations that already applied the checks (e.g., Python
gets #11 and #14 right; TypeScript gets #1 and #10 right) remain
compliant unchanged.
Reviewer checklist
intended security model
adding/removing entries
style and granularity of existing content
did:fractal#15 has Edits 7+ 17 + 23 reinforcing the SCID-immutability rule from three
angles), this is intentional defence-in-depth; the WG may prefer
to consolidate to a single location with cross-references
Provenance
The audit covered:
didwebvh-rs(#1–#15)R01–R10)P01–P16)T01–T16)J01–J13)finding from the other three
Companion artefacts:
decentralized-identity/didwebvh-test-suite#5.
exploitability matrix produced during the audit are available on
request from the author.