3.9.0-beta1#50
Conversation
|
Moin WalkthroughDie Pull Request erhöht die Version auf 3.9.0-beta.1, führt eine URL-Normalisierungs-Pipeline mit automatischer Emoji-Entfernung ein, refaktoriert die Replacement-Logik in eine neue getReplacements()-Methode, entfernt redundanten Error-Flag-Code und löscht eine veraltete Rex-5.6.0-spezifische Konfigurationsspeicherung. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR prepares the yrewrite_scheme addon for a 3.9.0-beta release by updating URL normalization (including emoji/symbol removal), cleaning up legacy install/config code, and adding a changelog entry for the release.
Changes:
- Update
normalize()to pre-clean input (lowercase/trim/strip tags) and remove emojis/symbols before applying replacements. - Refactor global replacement table into a new
getReplacements()method and remove redundant uppercase mappings. - Bump addon version to
3.9.0-beta.1and add a newCHANGELOG.md.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| pages/yrewrite.yrewrite_scheme.php | Removes an obsolete REDAXO 5.6.0 config-save workaround from the settings save path. |
| package.yml | Bumps addon version and documents minimum REDAXO/yrewrite requirements. |
| lib/yrewrite_url_schemes.php | Refactors and extends URL normalization (emoji removal + extracted replacement map). |
| install.php | Removes placeholder “something is wrong” scaffolding code from install routine. |
| CHANGELOG.md | Adds changelog entry for 3.9.0-beta.1. |
Comments suppressed due to low confidence (1)
lib/yrewrite_url_schemes.php:147
normalize()now lowercases the input before applying the user-configuredlanguage_replaces_*rules. That makes any configuredsearchvalues that rely on uppercase characters stop matching (previously they could match before the later lowercasing step). To avoid a breaking behavior change, either applylanguage_replaces_*before lowercasing, or normalize the configuredsearchvalues to lowercase before callingstr_replace.
// 1. First cleanup: lowercasing and trimming
$string = strip_tags(trim(mb_strtolower($string)));
// 2. Remove Emojis and Symbols
// Match Emoticons, Misc Symbols and Pictographs, Transport and Map, Flags, etc.
$string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string);
// 3. Sprachspezifische Ersetzungen anwenden
$language_replaces = rex_config::get('yrewrite_scheme', 'language_replaces_' . $clang, []);
if (!empty($language_replaces) && is_array($language_replaces)) {
$search = [];
$replace = [];
foreach ($language_replaces as $item) {
if (isset($item['search']) && isset($item['replace'])) {
$search[] = $item['search'];
$replace[] = $item['replace'];
}
}
if (!empty($search) && !empty($replace)) {
$string = str_replace($search, $replace, $string);
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // 2. Remove Emojis and Symbols | ||
| // Match Emoticons, Misc Symbols and Pictographs, Transport and Map, Flags, etc. | ||
| $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string); | ||
|
|
There was a problem hiding this comment.
preg_replace(..., '/.../u', ...) can return null on UTF-8/regex errors (and emits a warning), which would then flow into later string operations. Consider guarding the result (e.g., fallback to the pre-regex $string) so normalize() always returns a string even for malformed input.
| @@ -1,5 +1,5 @@ | |||
| package: yrewrite_scheme | |||
| version: '3.8.0' | |||
| version: '3.9.0-beta.1' | |||
There was a problem hiding this comment.
The version string in package.yml uses 3.9.0-beta.1, but the PR title/description refers to 3.9.0-beta1. Please align the version formatting across the PR metadata, package.yml, and CHANGELOG.md to avoid publishing/tagging inconsistencies.
| ### Changed | ||
| - Refactored `normalize` method for better performance and readability. | ||
| - Removed redundant uppercase replacements since strings are lowercased beforehand. | ||
| - Moved replacement table to separated `getReplacements()` method. |
There was a problem hiding this comment.
Minor grammar: “Moved replacement table to separated getReplacements() method.” should be “Moved replacement table to a separate getReplacements() method.”
| - Moved replacement table to separated `getReplacements()` method. | |
| - Moved replacement table to a separate `getReplacements()` method. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
lib/yrewrite_url_schemes.php (3)
131-131: Moin! Gemischte Sprachen in den Kommentaren.Zeile 131 ist auf Deutsch (
Sprachspezifische Ersetzungen anwenden), der Rest der Kommentare ist Englisch. Für Konsistenz wäreApply language-specific replacementspassender.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/yrewrite_url_schemes.php` at line 131, Replace the German inline comment "Sprachspezifische Ersetzungen anwenden" in lib/yrewrite_url_schemes.php with its English equivalent to maintain consistency; locate the comment near the existing comment block around the language replacement logic (search for the phrase "Sprachspezifische Ersetzungen anwenden" or the surrounding comment block) and change it to "Apply language-specific replacements" so all comments are in English.
128-129: Moin! Emoji-Regex deckt nicht alle Unicode-Bereiche ab.Neuere Emoji-Ranges fehlen, z.B.
\x{1FA70}-\x{1FAFF}(Symbols Extended-A, u.a. 🩷, 🫠),\x{1F000}-\x{1F02F}(Mahjong Tiles) und\x{200D}(Zero-Width Joiner aus zusammengesetzten Emojis). Letzterer bleibt als unsichtbares Zeichen in der URL zurück, wenn compound Emojis nur teilweise entfernt werden.Für die Beta kein Blocker, aber empfehlenswert für den finalen Release.
♻️ Erweiterte Emoji-Regex
- $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string); + $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}\x{1FA00}-\x{1FA6F}\x{1FA70}-\x{1FAFF}\x{200D}\x{1F000}-\x{1F02F}\x{E0020}-\x{E007F}]/u', '', $string);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/yrewrite_url_schemes.php` around lines 128 - 129, The current preg_replace call that strips emoji in the assignment to $string misses newer ranges and the zero-width joiner: update the regex inside the preg_replace (the line assigning $string) to include the additional Unicode ranges \x{1FA70}-\x{1FAFF} and \x{1F000}-\x{1F02F} and also include \x{200D} (ZWJ) so compound/modern emojis are fully removed; keep the /u modifier and ensure the pattern still covers the original ranges while only modifying the character class used by preg_replace.
172-192: Moin!getReplacements()erzeugt bei jedem Aufruf ein neues Array.Wenn
normalize()häufig aufgerufen wird (z.B. bei Sitemap-Generierung für viele Artikel), wird das Replacement-Array jedes Mal neu allokiert. Ein statisches Caching würde das vermeiden.Kein Blocker, aber für den finalen Release überlegenswert.
♻️ Optionales Caching via static Variable
protected function getReplacements() { + static $replacements = null; + if ($replacements !== null) { + return $replacements; + } + - return [ + $replacements = [ 'search' => [ ... ], 'replace' => [ ... ] ]; + + return $replacements; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/yrewrite_url_schemes.php` around lines 172 - 192, getReplacements() currently allocates the large replacement array on every call; change it to use a static cached variable inside the function (e.g. static $cached = null) and only build the ['search'=>..., 'replace'=>...] array the first time (when $cached is null), then return $cached on subsequent calls to avoid repeated allocations; ensure you keep the exact return shape and the arrays for 'search' and 'replace' intact and referenced by function getReplacements().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 10: Fix the grammar in the user-facing changelog entry mentioning
getReplacements(): change "Moved replacement table to separated
`getReplacements()` method." to "Moved replacement table to a separate
`getReplacements()` method." and commit the updated CHANGELOG.md line.
In `@lib/yrewrite_url_schemes.php`:
- Around line 127-129: The preg_replace call that strips emojis can return null
on invalid UTF-8, which will break later calls (str_replace, parent::normalize);
change the code around the preg_replace in lib/yrewrite_url_schemes.php to
capture its return (e.g. $clean = preg_replace(...)) and then check if $clean
=== null and if so set $string to a safe fallback (e.g. an empty string or a
UTF-8 sanitized version of the original using mb_convert_encoding or
mb_check_encoding), otherwise assign $string = $clean; this ensures $string
passed to str_replace and parent::normalize is never null.
- Around line 180-191: The return value's 'search' and 'replace' arrays in the
function returning URL rewrite schemes are mismatched (search has 131 items,
replace 144) and also contain a duplicated '/' entry; remove the duplicate '/'
from the 'search' array and then reconcile the arrays so they are the same
length by either removing the 13 extraneous entries from the 'replace' array or
adding the corresponding missing entries to 'search', ensuring each element in
'search' has a semantically correct replacement in 'replace' (use the existing
element order in the 'search' and 'replace' arrays to guide which entries to
remove or add).
---
Nitpick comments:
In `@lib/yrewrite_url_schemes.php`:
- Line 131: Replace the German inline comment "Sprachspezifische Ersetzungen
anwenden" in lib/yrewrite_url_schemes.php with its English equivalent to
maintain consistency; locate the comment near the existing comment block around
the language replacement logic (search for the phrase "Sprachspezifische
Ersetzungen anwenden" or the surrounding comment block) and change it to "Apply
language-specific replacements" so all comments are in English.
- Around line 128-129: The current preg_replace call that strips emoji in the
assignment to $string misses newer ranges and the zero-width joiner: update the
regex inside the preg_replace (the line assigning $string) to include the
additional Unicode ranges \x{1FA70}-\x{1FAFF} and \x{1F000}-\x{1F02F} and also
include \x{200D} (ZWJ) so compound/modern emojis are fully removed; keep the /u
modifier and ensure the pattern still covers the original ranges while only
modifying the character class used by preg_replace.
- Around line 172-192: getReplacements() currently allocates the large
replacement array on every call; change it to use a static cached variable
inside the function (e.g. static $cached = null) and only build the
['search'=>..., 'replace'=>...] array the first time (when $cached is null),
then return $cached on subsequent calls to avoid repeated allocations; ensure
you keep the exact return shape and the arrays for 'search' and 'replace' intact
and referenced by function getReplacements().
| ### Changed | ||
| - Refactored `normalize` method for better performance and readability. | ||
| - Removed redundant uppercase replacements since strings are lowercased beforehand. | ||
| - Moved replacement table to separated `getReplacements()` method. |
There was a problem hiding this comment.
Moin! Kleiner Grammatik-Typo im User-facing Text:
Moved replacement table to separated
getReplacements()method.
Sollte heißen: "to a separate" statt "to separated".
✏️ Vorgeschlagene Korrektur
-- Moved replacement table to separated `getReplacements()` method.
+- Moved replacement table to a separate `getReplacements()` method.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - Moved replacement table to separated `getReplacements()` method. | |
| - Moved replacement table to a separate `getReplacements()` method. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` at line 10, Fix the grammar in the user-facing changelog entry
mentioning getReplacements(): change "Moved replacement table to separated
`getReplacements()` method." to "Moved replacement table to a separate
`getReplacements()` method." and commit the updated CHANGELOG.md line.
| // 2. Remove Emojis and Symbols | ||
| // Match Emoticons, Misc Symbols and Pictographs, Transport and Map, Flags, etc. | ||
| $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string); |
There was a problem hiding this comment.
Moin! preg_replace kann bei fehlerhaftem UTF-8 null zurückgeben – das bricht die nachfolgende Pipeline.
Mit dem /u-Flag liefert preg_replace bei ungültigem UTF-8-Input null. $string wird danach in str_replace (Zeile 146), parent::normalize (Zeile 151/161) etc. weitergereicht – in PHP 8.1+ gibt das Deprecation-Warnings, in PHP 9 dann TypeError. Da REDAXO 5.18 PHP 8.1+ voraussetzt, ist das relevant.
🛡️ Vorgeschlagener Fix: Null-Check nach preg_replace
- $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string);
+ $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string) ?? $string;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // 2. Remove Emojis and Symbols | |
| // Match Emoticons, Misc Symbols and Pictographs, Transport and Map, Flags, etc. | |
| $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string); | |
| // 2. Remove Emojis and Symbols | |
| // Match Emoticons, Misc Symbols and Pictographs, Transport and Map, Flags, etc. | |
| $string = preg_replace('/[\x{1F600}-\x{1F64F}\x{1F300}-\x{1F5FF}\x{1F680}-\x{1F6FF}\x{1F1E0}-\x{1F1FF}\x{2600}-\x{26FF}\x{2700}-\x{27BF}\x{FE00}-\x{FE0F}\x{1F900}-\x{1F9FF}]/u', '', $string) ?? $string; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/yrewrite_url_schemes.php` around lines 127 - 129, The preg_replace call
that strips emojis can return null on invalid UTF-8, which will break later
calls (str_replace, parent::normalize); change the code around the preg_replace
in lib/yrewrite_url_schemes.php to capture its return (e.g. $clean =
preg_replace(...)) and then check if $clean === null and if so set $string to a
safe fallback (e.g. an empty string or a UTF-8 sanitized version of the original
using mb_convert_encoding or mb_check_encoding), otherwise assign $string =
$clean; this ensures $string passed to str_replace and parent::normalize is
never null.
| return [ | ||
| 'search' => [ | ||
| 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', | ||
| 'ā', 'ă', 'ą', 'ć', 'ĉ', 'ċ', 'č', 'ď', 'đ', 'ē', 'ĕ', 'ė', 'ę', 'ě', 'ĝ', 'ğ', 'ġ', 'ģ', 'ĥ', 'ħ', 'ĩ', 'ī', 'ĭ', 'į', 'ı', 'ij', 'ĵ', 'ķ', 'ĺ', 'ļ', 'ľ', 'ŀ', 'ł', 'ń', 'ņ', 'ň', 'ʼn', 'ō', 'ŏ', 'ő', 'œ', 'ŕ', 'ŗ', 'ř', 'ś', 'ŝ', 'ş', 'š', 'ţ', 'ť', 'ŧ', 'ũ', 'ū', 'ŭ', 'ů', 'ű', 'ų', 'ŵ', 'ŷ', 'ź', 'ż', 'ž', 'ſ', 'ƒ', 'ơ', 'ư', 'ǎ', 'ǐ', 'ǒ', 'ǔ', 'ǖ', 'ǘ', 'ǚ', 'ǜ', 'ǻ', 'ǽ', 'ǿ', | ||
| '/', '®', '©', '™', ':', '#', '$', '%', '&', '(', ')', '*', '+', ',', '.', '/', '!', ';', '<', '=', '>', '?', '@', '[', ']', '^', '_', '`', '{', '}', '~', '–', '\'', '¿', '"', '"', ' ' | ||
| ], | ||
| 'replace' => [ | ||
| 'ss', 'a', 'a', 'a', 'a', 'ae', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'oe', 'oe', 'u', 'u', 'u', 'ue', 'y', 'y', | ||
| 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'd', 'd', 'e', 'e', 'e', 'e', 'e', 'g', 'g', 'g', 'g', 'h', 'h', 'i', 'i', 'i', 'i', 'i', 'ij', 'j', 'k', 'l', 'l', 'l', 'l', 'l', 'n', 'n', 'n', 'n', 'o', 'o', 'o', 'oe', 'r', 'r', 'r', 's', 's', 's', 's', 't', 't', 't', 'u', 'u', 'u', 'u', 'u', 'u', 'w', 'y', 'z', 'z', 'z', 's', 'f', 'o', 'u', 'a', 'i', 'o', 'u', 'u', 'u', 'u', 'u', 'a', 'ae', 'oe', | ||
| '-', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '-' | ||
| ] | ||
| ]; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's locate and examine the file
find . -type f -name "yrewrite_url_schemes.php"Repository: FriendsOfREDAXO/yrewrite_scheme
Length of output: 105
🏁 Script executed:
# Once we find the file, examine lines 180-191
sed -n '180,191p' lib/yrewrite_url_schemes.php | cat -nRepository: FriendsOfREDAXO/yrewrite_scheme
Length of output: 1780
🏁 Script executed:
# Extract and count the search and replace arrays
python3 << 'EOF'
import re
with open('lib/yrewrite_url_schemes.php', 'r', encoding='utf-8') as f:
content = f.read()
# Find the search array
search_match = re.search(r"'search'\s*=>\s*\[(.*?)\]", content, re.DOTALL)
if search_match:
search_array_str = search_match.group(1)
# Extract all quoted strings
search_items = re.findall(r"'([^']*)'", search_array_str)
print(f"Search array count: {len(search_items)}")
print(f"Search items: {search_items}")
# Check for duplicates
duplicates = {}
for idx, item in enumerate(search_items):
if item in duplicates:
duplicates[item].append(idx)
else:
duplicates[item] = [idx]
dupe_items = {k: v for k, v in duplicates.items() if len(v) > 1}
if dupe_items:
print(f"\nDuplicate entries in search array:")
for item, indices in dupe_items.items():
print(f" '{item}' at indices: {indices}")
else:
print("\nNo duplicates in search array")
# Find the replace array
replace_match = re.search(r"'replace'\s*=>\s*\[(.*?)\]", content, re.DOTALL)
if replace_match:
replace_array_str = replace_match.group(1)
replace_items = re.findall(r"'([^']*)'", replace_array_str)
print(f"\nReplace array count: {len(replace_items)}")
# Verify counts match
if len(search_items) != len(replace_items):
print(f"⚠️ MISMATCH: search has {len(search_items)}, replace has {len(replace_items)}")
else:
print(f"✓ Array counts match: {len(search_items)} == {len(replace_items)}")
EOFRepository: FriendsOfREDAXO/yrewrite_scheme
Length of output: 905
Moin! Array-Längen passen nicht zusammen – kritischeres Problem als nur das doppelte /.
Der Analyse zufolge hat das search-Array 131 Einträge, aber das replace-Array 144 Einträge. Das ist ein Mismatch von 13 Positionen. Zwar ist das doppelte / (an Index 107 und 122 im search-Array) real, aber das eigentliche Problem ist die fehlende Parität zwischen beiden Arrays.
Wenn str_replace mit Arrays unterschiedlicher Länge arbeitet, führt das zu unerwünschtem Verhalten: Ersetzungen werden fehlgeleitet oder gar nicht ausgeführt. Das replace-Array hat 13 mehr Einträge als erwartet – diese können nicht zugeordnet werden.
Beide Probleme müssen behoben werden:
- Das doppelte
/insearch(Position 122) entfernen oder korrigieren - Die Array-Längen synchronisieren – überprüfen, ob
replacezu lang ist odersearchgekürzt wurde
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/yrewrite_url_schemes.php` around lines 180 - 191, The return value's
'search' and 'replace' arrays in the function returning URL rewrite schemes are
mismatched (search has 131 items, replace 144) and also contain a duplicated '/'
entry; remove the duplicate '/' from the 'search' array and then reconcile the
arrays so they are the same length by either removing the 13 extraneous entries
from the 'replace' array or adding the corresponding missing entries to
'search', ensuring each element in 'search' has a semantically correct
replacement in 'replace' (use the existing element order in the 'search' and
'replace' arrays to guide which entries to remove or add).
|
@copilot open a new pull request to apply changes based on the comments in this thread |
|
endlich mal was, das emojis entfernt ;-) |
Added
Changed
normalizemethod for better performance and readability.getReplacements()method.Summary by CodeRabbit
Neue Funktionen
Refactor