Skip to content

perf(utilities): memoize sanitize_tool_name#6322

Open
HumphreySun98 wants to merge 1 commit into
crewAIInc:mainfrom
HumphreySun98:perf/memoize-sanitize-tool-name
Open

perf(utilities): memoize sanitize_tool_name#6322
HumphreySun98 wants to merge 1 commit into
crewAIInc:mainfrom
HumphreySun98:perf/memoize-sanitize-tool-name

Conversation

@HumphreySun98

@HumphreySun98 HumphreySun98 commented Jun 24, 2026

Copy link
Copy Markdown

Summary

sanitize_tool_name is a pure function called many times per agent iteration over the same small, stable set of tool names — tool selection, schema/name rendering, and name comparisons (139 call sites; 38 in tool_usage.py alone). Each call does an NFKD Unicode normalize, an ascii encode/decode round-trip, and several regex substitutions (plus a SHA-256 for over-length names).

Change

Memoize it with functools.lru_cache(maxsize=1024). The function depends only on its (name, max_length) arguments (both hashable) and has no side effects, so caching is safe; maxsize bounds memory, and the module already uses lru_cache for _duplicate_separator_pattern.

Benchmark

Repeated calls over a fixed name (local, timeit):

µs/call
uncached ~5.4
cached ~0.04

~120× on repeat calls. Within a single agent run the cache hit rate is ~100% since tool names are stable.

Tests

Added TestSanitizeToolName (sanitization rules preserved, memoization via cache_info, max_length cache-key correctness). ruff + mypy clean; test_tool_usage.py and test_base_tool_adapter.py green.


This PR was authored with Claude Code. Per CONTRIBUTING.md, AI-generated contributions require the llm-generated label — I don't have triage permission to set it, so could a maintainer please add it? 🤖 Generated with Claude Code

Summary by CodeRabbit

  • Performance

    • Improved tool-name sanitization by caching repeated results, which can speed up frequent agent operations.
  • Tests

    • Added coverage for sanitization behavior, deterministic truncation, cache usage, and support for custom maximum length values.

@corridor-security corridor-security Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary: This PR memoizes sanitize_tool_name and adds tests for existing sanitization behavior and cache correctness; no authentication, authorization, external I/O, or data boundary changes are introduced.

Risk: Low risk. No exploitable security vulnerabilities were identified because the change is limited to a bounded in-memory cache around a pure string utility function.

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1b694990-3f39-4955-8828-c4be890bcf42

📥 Commits

Reviewing files that changed from the base of the PR and between 5827abb and f3c034a.

📒 Files selected for processing (2)
  • lib/crewai/src/crewai/utilities/string_utils.py
  • lib/crewai/tests/utilities/test_string_utils.py

📝 Walkthrough

Walkthrough

sanitize_tool_name now memoizes results with an LRU cache, and the utility tests add coverage for sanitization output, cache behavior, and max_length truncation.

Changes

Tool name sanitization

Layer / File(s) Summary
Memoize and test tool-name sanitization
lib/crewai/src/crewai/utilities/string_utils.py, lib/crewai/tests/utilities/test_string_utils.py
sanitize_tool_name is wrapped with lru_cache(maxsize=1024), and tests assert sanitization rules, cache statistics, cache_clear(), and max_length-based truncation.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: memoizing sanitize_tool_name for performance.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

sanitize_tool_name is a pure function called many times per agent iteration
over the same small, stable set of tool names (tool selection, schema
rendering, name comparisons), and each call does an NFKD normalize, an ascii
round-trip and several regex substitutions. Cache results with an
lru_cache(maxsize=1024); repeat calls drop from ~5.4us to ~0.04us (~120x).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@HumphreySun98 HumphreySun98 force-pushed the perf/memoize-sanitize-tool-name branch from f3c034a to 63d41d1 Compare June 24, 2026 20:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant