Reject hostnames with unpaired UTF-16 surrogates #9499

Open
eyupcanakman wants to merge 1 commit into square:master from eyupcanakman:fix/hostname-isascii-surrogates
Conversation

@eyupcanakman

`OkHostnameVerifier.isAscii()` used `length == utf8Size()` to decide whether a hostname is ASCII. That check breaks on unpaired UTF-16 surrogates: okio encodes a lone surrogate as a single `?` byte, so the string's UTF-8 size equals its length and `isAscii()` wrongly returns true. A hostname that starts with an unpaired surrogate such as U+D800 then slips past the ASCII guard in `verify()` and can match a wildcard certificate like `*.com`.
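For illustration, here is a minimal sketch of the failure mode. It is not OkHttp code: the class and method names are hypothetical, and it uses the JDK's UTF-8 encoder as a stand-in for okio's `utf8Size()`, on the assumption that both substitute a single `?` byte for a lone surrogate.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of OkHttp.
class LengthCheckDemo {
    // Stand-in for the old check: compare the UTF-16 length with the UTF-8 byte count.
    // Assumption: String.getBytes(UTF_8) replaces a lone surrogate with one '?' byte,
    // which mirrors the okio behavior described above.
    static boolean isAsciiByLength(String s) {
        return s.length() == s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String plain = "example.com";          // pure ASCII
        String accented = "ex\u00e4mple.com";  // U+00E4 takes two UTF-8 bytes
        String loneSurrogate = "\uD800.com";   // unpaired high surrogate

        System.out.println(isAsciiByLength(plain));         // true
        System.out.println(isAsciiByLength(accented));      // false
        System.out.println(isAsciiByLength(loneSurrogate)); // true, which is the bug
    }
}
```

The length comparison catches ordinary non-ASCII characters because they expand to multiple bytes, but a lone surrogate collapses to one byte and goes unnoticed.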

The fix scans the string and returns false as soon as a code unit falls outside the ASCII range, so any unpaired surrogate counts as non-ASCII. I added a test that builds a `*.com` certificate and checks that a surrogate hostname is rejected while a normal one still matches.
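The scanning approach can be sketched roughly as follows. This is an illustrative Java version under my own naming (`AsciiScanDemo` and `isAsciiByScan` are hypothetical), not the actual change in this PR; it only shows the idea of checking each UTF-16 code unit against the ASCII range.

```java
// Hypothetical demo class, not part of OkHttp.
class AsciiScanDemo {
    // Returns false at the first UTF-16 code unit above 0x7F.
    // Surrogates (0xD800..0xDFFF) are far above that limit, paired or not.
    static boolean isAsciiByScan(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7f) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAsciiByScan("example.com"));       // true
        System.out.println(isAsciiByScan("ex\u00e4mple.com"));  // false
        System.out.println(isAsciiByScan("\uD800.com"));        // false
    }
}
```

Because the check never encodes the string, the result does not depend on how an encoder handles malformed input.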

Fixes #6357

isAscii() treated a lone UTF-16 surrogate as ASCII because okio encodes it as one `?` byte. Scan the string so any non-ASCII code unit is detected.

Fixes square#6357
@JakeWharton JakeWharton requested a review from swankjesse June 22, 2026 21:37

Development

Successfully merging this pull request may close these issues.

isAscii is wrong for malformed surrogates

2 participants