fix: apply Unicode full case fold in link reference label normalization by JSap0914 · Pull Request #3997 · markedjs/marked

JSap0914 · 2026-06-17T11:12:50Z

Bug

JavaScript's String.prototype.toLowerCase() performs Unicode lowercase mapping, not case folding. The CommonMark spec §5.6 requires Unicode case folding when matching link reference labels:

One label matches another just if their normalized forms are equal. To normalize a label, perform the Unicode case fold on the label, and collapse consecutive spaces, tabs, and line endings to a single space.

The key difference: full case folding allows a single character to expand to multiple characters. For example:

| Character | `toLowerCase()` | Full case fold |
| --- | --- | --- |
| ẞ (U+1E9E) | ß | ss |
| ß (U+00DF) | ß | ss |
| ﬁ (U+FB01) | ﬁ | fi |
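The table can be checked directly in any JavaScript runtime. The following is a minimal sketch; the names `foldSamples` and `lowerCaseEqualsFold` are made up for illustration and are not part of marked.

```typescript
// Characters from the table above, paired with their full case folds.
const foldSamples: Array<[string, string]> = [
  ['\u1E9E', 'ss'], // ẞ
  ['\u00DF', 'ss'], // ß
  ['\uFB01', 'fi'], // ﬁ
];

// True when toLowerCase() alone already yields the full case fold.
function lowerCaseEqualsFold(char: string, fold: string): boolean {
  return char.toLowerCase() === fold;
}

for (const [char, fold] of foldSamples) {
  // Each of the three samples reports false: toLowerCase() never expands them.
  console.log(`${char} -> ${char.toLowerCase()} (want ${fold}):`, lowerCaseEqualsFold(char, fold));
}
```

For plain ASCII labels the two operations agree, which is likely why the mismatch only shows up on inputs such as example 540.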

This means CommonMark spec example 540 fails:

[ẞ]

[SS]: /url

Expected: <p><a href="/url">ẞ</a></p>
Got: <p>[ẞ]</p> (unresolved reference)

Fix

Add normalizeLabel() to src/helpers.ts that chains toLowerCase() with replacements for the Unicode F-status (full) case fold pairs that expand to multiple characters. Update Tokenizer.def and Tokenizer.reflink to use it instead of plain toLowerCase().
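As a rough illustration of the approach described above, the sketch below chains `toLowerCase()` with replacements for the expansions named in this thread. It is a hypothetical reconstruction, not the code from the PR: the name `normalizeLabelSketch`, the whitespace handling, and the `Map` lookup are assumptions, and the list covers only the characters mentioned here rather than every F-status entry in Unicode's CaseFolding.txt.

```typescript
// Hypothetical reconstruction; the helper actually added by the PR may differ.
function normalizeLabelSketch(label: string): string {
  return label
    .toLowerCase() // also maps ẞ (U+1E9E) to ß (U+00DF)
    .replace(/\u00DF/g, 'ss') // ß
    .replace(/\uFB00/g, 'ff') // ﬀ
    .replace(/\uFB01/g, 'fi') // ﬁ
    .replace(/\uFB02/g, 'fl') // ﬂ
    .replace(/\uFB03/g, 'ffi') // ﬃ
    .replace(/\uFB04/g, 'ffl') // ﬄ
    .replace(/[\uFB05\uFB06]/g, 'st') // ﬅ, ﬆ
    .replace(/\s+/g, ' ') // collapse whitespace runs (assumed, per the spec text quoted above)
    .trim();
}

// Reference definitions keyed by normalized label, as in example 540.
const sketchDefinitions = new Map<string, string>();
sketchDefinitions.set(normalizeLabelSketch('SS'), '/url');

// [ẞ] now resolves to the definition written as [SS].
console.log(sketchDefinitions.get(normalizeLabelSketch('\u1E9E'))); // '/url'
```

Because both the definition and the reference pass through the same function, the comparison stays a plain string equality on the normalized keys.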

Verification

npm test

CommonMark Links: 77/90 → 78/90 (example 540 now passes)
GFM Links: same improvement
All 1743 spec tests pass; all 188 unit tests pass; lint clean

AI assistance disclosure: This fix was developed with AI-assisted tooling.

JavaScript's String.prototype.toLowerCase() performs Unicode lowercase mapping, not case folding. As a result, ẞ (U+1E9E, Latin Capital Letter Sharp S) is lowercased to ß (U+00DF) rather than folded to 'ss'. The same issue affects ß itself (U+00DF → 'ss' in full case fold) and the Latin ligatures (ﬁ → 'fi', ﬀ → 'ff', etc.).

The CommonMark spec (§5.6) requires Unicode case folding when matching link reference labels, so [ẞ] must match [SS]: /url (example 540). Previously it did not.

Fix: Add normalizeLabel() in helpers.ts that applies toLowerCase() and then replaces the Unicode F-status multi-character case fold pairs. Use it in Tokenizer.def and Tokenizer.reflink in place of plain toLowerCase(). Fixes CommonMark spec example 540.

vercel · 2026-06-17T11:12:54Z

@JSap0914 is attempting to deploy a commit to the MarkedJS Team on Vercel.

A member of the Team first needs to authorize it.

Copilot

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

UziTech

Thanks! 💯

styfle · 2026-06-22T04:33:16Z

+    .replace(/ﬂ/g, 'fl') // U+FB02
+    .replace(/ﬃ/g, 'ffi') // U+FB03
+    .replace(/ﬄ/g, 'ffl') // U+FB04
+    .replace(/ﬅ|ﬆ/g, 'st'); // U+FB05, U+FB06


Is this list exhaustive?

I'm wondering if we can use a built-in API like Intl or perhaps label.normalize('NFKC') instead?
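One way to weigh the `normalize('NFKC')` suggestion is to run it on the characters discussed in this PR. The sketch below is illustrative only; `viaNfkc` is a made-up name, and the snippet says nothing about what `Intl` might offer.

```typescript
// Candidate normalization: NFKC followed by toLowerCase().
function viaNfkc(label: string): string {
  return label.normalize('NFKC').toLowerCase();
}

console.log(viaNfkc('\uFB01') === 'fi'); // true: NFKC decomposes the ﬁ ligature
console.log(viaNfkc('\u1E9E') === 'ss'); // false: the result is ß (U+00DF)
console.log(viaNfkc('x\u00B2') === 'x2'); // true: superscript two becomes "2"
```

So NFKC handles the ligatures but not the sharp s that example 540 depends on, and it also rewrites characters that have nothing to do with case, which would make labels such as [x²] and [x2] match each other.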

styfle · 2026-06-22T04:34:05Z

+    // Apply Unicode full case fold before toLowerCase so that
+    // ẞ (U+1E9E) → ss and ß (U+00DF) → ss (simple fold maps ẞ to ß first).
+    .toLowerCase()
+    .replace(/ß/g, 'ss') // U+00DF + folded U+1E9E


I see the test for this first one, but do we have tests for the others?

Copilot AI review requested due to automatic review settings June 17, 2026 11:12

Copilot AI reviewed Jun 17, 2026

vercel Bot deployed to Preview June 18, 2026 04:09

UziTech approved these changes Jun 18, 2026

UziTech requested review from calculuschild and styfle June 18, 2026 04:11

styfle reviewed Jun 22, 2026

JSap0914 wants to merge 1 commit into markedjs:master from JSap0914:fix/autolink-email-scheme-normalization
