Skip to content

⚡ Optimize parseChecksum to reduce allocations#57

Merged
AgusRdz merged 10 commits into
AgusRdz:mainfrom
giankpetrov:perf-optimize-parse-checksum-17229353039594671266
Apr 6, 2026
Merged

⚡ Optimize parseChecksum to reduce allocations#57
AgusRdz merged 10 commits into
AgusRdz:mainfrom
giankpetrov:perf-optimize-parse-checksum-17229353039594671266

Conversation

@giankpetrov

Copy link
Copy Markdown
Contributor

💡 What: Optimized the parseChecksum function in updater/updater.go by replacing strings.Split and strings.Fields with manual string manipulation.

🎯 Why: strings.Split allocates a new slice containing all lines, which is unnecessary memory overhead for potentially large checksum files. strings.Fields also creates a slice and multiple strings for each line.

📊 Measured Improvement:

Execution Time: ~6.8x faster (from 154,019 ns/op to 22,540 ns/op for 1000 lines).
Allocations: Reduced from 1,002 allocs/op (48,416 B/op) to 0 allocs/op (0 B/op).
Robustness: Now correctly handles filenames with spaces, which was a limitation of the previous strings.Fields implementation.

giankpetrov and others added 10 commits March 26, 2026 01:40
…ilters

- Implemented redactHeaders helper to mask sensitive HTTP headers.
- Updated curl and httpie filters to use redaction.
- Added redactAwareSanityCheck to prioritize security over compression heuristics.
- Added unit tests to verify redaction.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
- Created centralized filters/redact.go with expanded sensitive keywords.
- Implemented redactHeaders for plain-text and redactJSON for structured data.
- Integrated redactJSON into compressJSON to mask secrets before processing.
- Hardened docker inspect to always redact secrets, even for small output.
- Added comprehensive verification tests.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
- Created centralized filters/redact.go with expanded sensitive keywords.
- Implemented redactHeaders for plain-text (case-insensitive) and redactJSON for structured data (recursive).
- Integrated redactJSON into compressJSON to mask secrets before processing, ensuring small JSON preservation path is secure.
- Hardened docker inspect to always redact secrets, even for small output.
- Added comprehensive verification tests for small JSON, recursive arrays, and case-insensitivity.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This commit fixes a vulnerability where an attacker controlling a local
`.chop-filters.yml` file could potentially execute arbitrary commands via
the `exec` directive.

The fix resolves the issue at the root cause by strictly enforcing the
`Trusted` boundary during configuration parsing. Instead of relying solely
on runtime checks that could be bypassed or using easily bypassed shell
interpreter blocklists, `exec` directives from untrusted sources are now
completely rejected and wiped when the YAML is parsed.

Risk: An attacker could place a malicious `.chop-filters.yml` in a
repository. When a user runs `chop` in that repository, the configuration
could lead to arbitrary command execution on the user's machine.

Solution: Aggressively clear the `Exec` field in `parseCustomFiltersWithTrust`
if the configuration is loaded as untrusted (e.g., local project files).
This guarantees that untrusted configurations can never pass an arbitrary
command string to `buildExecFilter()`.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Added `config/paths_test.go` to verify platform-specific path resolution
logic in `ConfigDir` and `DataDir`. The tests cover:
- Default paths on Unix (XDG), Windows (AppData/LocalAppData), and macOS.
- Environment variable overrides (XDG_CONFIG_HOME, XDG_DATA_HOME).
- Proper isolation using t.TempDir and t.Setenv.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Replaced strings.Split and strings.Fields with manual string slicing using
strings.IndexByte. This avoids allocating a slice for all lines and
multiple intermediate strings per line.

Benchmark results (1000 lines):
- Before: 154,019 ns/op, 48,416 B/op, 1,002 allocs/op
- After: 22,540 ns/op, 0 B/op, 0 allocs/op

The new implementation also correctly handles filenames with spaces,
improving robustness over the previous strings.Fields approach.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@github-actions

github-actions Bot commented Apr 5, 2026

Copy link
Copy Markdown

Warning

This branch is 11 commit(s) behind main and will need to be updated before it can be applied cleanly.

Please sync your fork and rebase before continuing:

# If you have upstream configured
git fetch upstream
git rebase upstream/main
git push origin perf-optimize-parse-checksum-17229353039594671266 --force-with-lease

# Or if your fork's main is already synced via GitHub UI
git fetch origin main
git rebase origin/main
git push origin perf-optimize-parse-checksum-17229353039594671266 --force-with-lease

See CONTRIBUTING.md for the full guide.

@giankpetrov giankpetrov marked this pull request as draft April 5, 2026 22:52
@giankpetrov giankpetrov marked this pull request as ready for review April 5, 2026 23:09
@AgusRdz AgusRdz merged commit 2899738 into AgusRdz:main Apr 6, 2026
2 checks passed
@AgusRdz

AgusRdz commented Apr 6, 2026

Copy link
Copy Markdown
Owner

Thanks @giankpetrov! Really clean optimization — replacing strings.Split/strings.Fields with direct strings.IndexByte slicing is exactly the right call for a hot path like checksum parsing. Zero allocations, 6.8x faster, and the bonus fix for filenames with spaces is a nice touch. Appreciate the benchmark too, makes the improvement easy to verify. 🚀

@giankpetrov giankpetrov deleted the perf-optimize-parse-checksum-17229353039594671266 branch April 6, 2026 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants