Add runtime warnings when max_rpm and max_tokens are unset#6371
Add runtime warnings when max_rpm and max_tokens are unset#6371camgrimsec wants to merge 1 commit into
Conversation
Surfaces silent denial-of-wallet risks by logging warnings at construction time when resource controls are left unbounded: * Crew.max_rpm=None: warn that client-side rate limiting is disabled and the only safeguard is the upstream provider's rate limit. * LLM.max_tokens AND max_completion_tokens both None: warn that responses are uncapped and can produce very large outputs (e.g. 128k+ tokens) on modern models. Both warnings are emitted via @model_validator(mode='after') hooks using logging.warning, consistent with the existing logging patterns. No behavior change, no breaking changes, no new defaults, no raised errors. Also clarifies the max_rpm field description.
📝 WalkthroughWalkthroughAdds ChangesRuntime Configuration Warnings
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/crew.py`:
- Around line 300-301: `max_rpm=0` is still treated as “disabled” in the `Crew`
RPM flow, but the new docs/validation only mention `None`. Update the
`Crew`-related `max_rpm` handling so zero is covered the same way as `None` in
the truthiness checks that control controller hookup and cleanup, and align the
warning/description accordingly. Make sure the validator and any explanatory
text near `Crew`’s `max_rpm` parameter reflect both uncapped values, not just
`None`.
In `@lib/crewai/src/crewai/llm.py`:
- Around line 686-698: The uncapped-token warning in LLM._warn_tokens_uncapped
only checks for None, but runtime logic treats 0 the same as no cap via the
max_tokens/max_completion_tokens fallback. Update the validator and any related
cap-handling in LLM so zero-valued caps are normalized or rejected consistently,
and make the warning trigger for both None and 0 when the effective request will
be uncapped.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 99be48f0-e0d5-4407-af2f-be890e29ad76
📒 Files selected for processing (2)
lib/crewai/src/crewai/crew.pylib/crewai/src/crewai/llm.py
| "Maximum number of requests per minute for the crew execution. " | ||
| "Set to None to disable rate limiting (not recommended for production)." |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
max_rpm=0 still disables throttling without triggering this warning.
Line 767 and Line 2102 both gate RPM behavior on truthiness, so max_rpm=0 skips controller hookup/cleanup just like None. As written, the new description and validator only cover None, so this PR still misses one uncapped configuration path.
Also applies to: 617-630
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/crew.py` around lines 300 - 301, `max_rpm=0` is still
treated as “disabled” in the `Crew` RPM flow, but the new docs/validation only
mention `None`. Update the `Crew`-related `max_rpm` handling so zero is covered
the same way as `None` in the truthiness checks that control controller hookup
and cleanup, and align the warning/description accordingly. Make sure the
validator and any explanatory text near `Crew`’s `max_rpm` parameter reflect
both uncapped values, not just `None`.
| @model_validator(mode="after") | ||
| def _warn_tokens_uncapped(self) -> LLM: | ||
| """Warn when neither ``max_tokens`` nor ``max_completion_tokens`` is set. | ||
|
|
||
| Without an explicit cap, a single prompt can produce very large outputs | ||
| (e.g. 128k+ tokens on modern models), leading to runaway spend. This | ||
| warning surfaces the risk at runtime without changing behavior. | ||
| """ | ||
| if self.max_tokens is None and self.max_completion_tokens is None: | ||
| logging.warning( | ||
| "max_tokens/max_completion_tokens not set; LLM responses are uncapped. " | ||
| "Set a limit (e.g., max_tokens=4096) to control token costs and avoid runaway generation." | ||
| ) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Zero-valued caps bypass this warning but are still treated as uncapped.
Line 755 and Line 2168 both collapse 0 to “no cap” via self.max_tokens or self.max_completion_tokens. That means LLM(max_tokens=0) still sends an uncapped request without hitting this validator, so the warning does not cover the actual runtime behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/llm.py` around lines 686 - 698, The uncapped-token
warning in LLM._warn_tokens_uncapped only checks for None, but runtime logic
treats 0 the same as no cap via the max_tokens/max_completion_tokens fallback.
Update the validator and any related cap-handling in LLM so zero-valued caps are
normalized or rejected consistently, and make the warning trigger for both None
and 0 when the effective request will be uncapped.
This PR introduces informative warnings to alert developers when they are running with unbounded resource controls, as identified in a recent security audit.
max_rpm=Nonewarning — InCrew, a new@model_validator(mode="after")hook logs a warning whenmax_rpm is None, making operators aware that client-side rate limiting is disabled and upstream API provider limits are the only safeguard against rapid-fire requests.max_tokens=Nonewarning — InLLM, a new@model_validator(mode="after")hook logs a warning when bothmax_tokensandmax_completion_tokensareNone, surfacing that LLM responses are uncapped and may lead to very large token consumption (e.g. 128k+ output) on modern models.Why this is a safe, high-merge-likelihood PR
@model_validatorhooks plus animport loggingline.Both warnings use
logging.warning(consistent with the existing logging patterns in the codebase, includingllm.py's existingimport logging) and provide actionable advice to set explicit values.Implementation notes
The original audit recommendation suggested overriding
__init__, but bothCrewandLLMare Pydantic models that already use@model_validator(mode="after")hooks (~13 of them inCrew) for post-construction setup. Overriding__init__would conflict with Pydantic's validation machinery; using@model_validator(mode="after")is the idiomatic Pydantic v2 pattern and matches the surrounding code.Changes
lib/crewai/src/crewai/crew.pyimport logging.max_rpmfielddescriptionto call out the implication ofNone._warn_rate_limit_disabled@model_validator(mode="after").lib/crewai/src/crewai/llm.pyloggingis already imported._warn_tokens_uncapped@model_validator(mode="after").Stats
ast.parseclean on both files.Summary by CodeRabbit
Bug Fixes
Documentation