[fix] correct small errors in string, struct, and system function docs #3714

Merged
morningman merged 1 commit into apache:master from boluor:fix-batch23-24-simple
May 20, 2026

boluor commented on May 20, 2026

Summary

Fixes 28 small documentation issues across the string, struct, and system function references. Each item below is independent.

string-functions

  • ascii.md / auto-partition-name.md / char-length.md / cut-to-first-significant-subdomain.md / format-number.md / length.md / ngram-search.md / regexp-extract-all.md — frontmatter description was truncated mid-sentence; completed each.
  • count_substrings.md — two list items were both numbered 7; the second was renumbered to 8.
  • char.md — three result-table headers showed char('utf8', ...) while the queries used CHAR(...); headers now match the queries.
  • is-uuid.md — the result-table header omitted the curly braces that were in the query; header now includes them.
  • mask-last-n.md — the query used 'Helloṭṛ123' while the result header and output used 'Hello你好123'; query aligned with the displayed result.
  • parse-url.md / protocol.md — a Chinese ## 相关命令 section (with a Chinese body and a full-width period) was sitting in English docs; translated to ## See Also with an English body.
  • position.md / repeat.md / reverse.md — each file had a complete duplicate Description/Syntax/Parameters/Examples block appended at the end; the trailing duplicate was removed.
  • quote.md — escape note contradicted itself (claimed both \\ -> \\\\ and \\\\ -> \\); removed the bad bullet. Also, quote("It's a test") showed 'It's a test' (single quote unescaped) in the output; corrected to 'It\\'s a test'.
  • parse-data-size.md — prose listed units up to ZB/YB but the table stops at EB; aligned prose with the table.
  • hamming_distance.md / levenshtein.md — legacy frontmatter (title: … only, no language/description) updated to the standard JSON form used by sibling docs.
  • regexp-count.md — typo "is n the total count" corrected; typos "paratemer" and "usr" corrected to "parameter" and "user"; the quadruple-backslash pattern [\\\\\\\\.:;] simplified to [\\\\.:;].
  • regexp-extract-or-null.md — example query pattern had ([[]ower:]]+) while the error message referenced ([[:lower:]+); query updated to match the error.
  • regexp.md — the greedy / non-greedy example invoked REGEXP_EXTRACT(...) instead of REGEXP(...); corrected. A stray ~ on its own line after ## Description (and a leading ~ in the frontmatter description) removed.
  • soundex.md — examples were numbered 1-7 then jumped to 9; corrected to 8.
  • url-decode.md / url-encode.md — result blocks were fenced as sql; corrected to text. url-encode.md also had a duplicate ## Required Parameters heading; removed.
  • frontmatter descriptions that referenced function names without underscores (e.g. REGEXPREPLACEONE, MASKLASTN, PARSEURL, CHARLENGTH, ...) corrected to the canonical underscored form (REGEXP_REPLACE_ONE, MASK_LAST_N, PARSE_URL, CHAR_LENGTH, ...).

system-functions

  • database.md — SELECT database(), schema() showed both result-table headers as database(); second header corrected to schema().

struct-functions

  • named-struct.md / struct.md / struct-element.md — frontmatter language: "en-US" -> "en".

Test plan

  • CI doc build passes
  • Spot-check the affected pages render correctly (completed descriptions, no duplicate trailing blocks, corrected result-table headers, no stray punctuation, language metadata in struct docs)
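Part of the spot-check could be scripted. As an illustration only (the helper name cjk_headings and the chosen Unicode ranges are assumptions, not something this PR ships), the sketch below flags Markdown headings in an English doc that still contain CJK characters, such as the ## 相关命令 heading fixed in parse-url.md and protocol.md. It only inspects lines that start with "#", so legitimate CJK in example data (for instance 'Hello你好123' in mask-last-n.md) is left alone, though a "#" comment inside a code fence would also be inspected.

```python
import re

# CJK punctuation (includes the full-width period), CJK ideographs,
# and half-width/full-width forms.
CJK = re.compile(r"[\u3000-\u303f\u4e00-\u9fff\uff00-\uffef]")


def cjk_headings(markdown_text):
    """Return (line_number, line) for headings that contain CJK characters."""
    hits = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        # Treat any line starting with '#' as a heading (a simplification).
        if line.lstrip().startswith("#") and CJK.search(line):
            hits.append((line_number, line))
    return hits
```

An empty result for every file under the English string-functions directory would suggest no untranslated headings remain.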

string-functions:
- ascii.md: frontmatter description truncated mid-sentence; completed.
- count_substrings.md: two list items numbered 7; renumbered the second to 8.
- char.md: result-table headers showed `char('utf8', ...)` while queries used `CHAR(...)`; headers now match the queries.
- is-uuid.md: result-table header omitted the curly braces present in the query; header now includes them.
- mask-last-n.md: query used the literal `'Helloṭṛ123'` while the result header and output used `'Hello你好123'`; query aligned with the displayed result.
- parse-url.md / protocol.md: a Chinese `## 相关命令` section (with Chinese body and full-width period) in English docs; translated to `## See Also` with English body.
- position.md / repeat.md / reverse.md: a complete duplicate Description/Syntax/Parameters/Examples block was appended at the end of each file; removed.
- quote.md: contradictory escape note (claimed both `\\` -> `\\\\` and `\\\\` -> `\\`); removed the bad bullet. Also, `quote("It's a test")` showed an unescaped `'` in its result; output now shows the escaped `\\'`.
- parse-data-size.md: prose listed units up to ZB/YB but the table stops at EB; aligned prose with the table.
- hamming_distance.md / levenshtein.md: legacy frontmatter `title: …` (no `language`/`description`) updated to the standard JSON form used by sibling docs.
- auto-partition-name.md / char-length.md / cut-to-first-significant-subdomain.md / format-number.md / length.md / ngram-search.md / regexp-extract-all.md: frontmatter description was truncated mid-sentence; completed.
- regexp-count.md: typo `is n the total count` corrected; typos `paratemer` and `usr` corrected to `parameter` and `user`; quadruple backslash `[\\\\\\\\.:;]` simplified to `[\\\\.:;]`.
- regexp-extract-or-null.md: query pattern had `([[]ower:]]+)` while the error message referenced `([[:lower:]+)`; query updated to match the error.
- regexp.md: example used `REGEXP_EXTRACT(...)` instead of `REGEXP(...)`; corrected. Stray `~` on its own line after `## Description` (and leading `~ ` in the frontmatter description) removed.
- soundex.md: examples numbered 1-7 then jumped to 9; corrected to 8.
- url-decode.md / url-encode.md: result blocks were fenced as ```sql; corrected to ```text. Duplicate `## Required Parameters` heading in url-encode.md removed.
- frontmatter description references to function names without underscores (e.g. `REGEXPREPLACEONE`, `MASKLASTN`, `PARSEURL`, `CHARLENGTH`, ...) corrected to the canonical underscored form.

system-functions:
- database.md: `SELECT database(), schema()` showed both result-table headers as `database()`; second header corrected to `schema()`.

struct-functions:
- named-struct.md / struct.md / struct-element.md: frontmatter `language: "en-US"` -> `"en"`.
morningman merged commit a17000b into apache:master on May 20, 2026
3 checks passed