Fix #5930 - Add multimodal content formatting API to handle files as native provider content blocks #6241

Open
Whning0513 wants to merge 2 commits into crewAIInc:main from Whning0513:fix-issue-5930

Whning0513 commented Jun 19, 2026

[BUG] input_files (PDFFile) are passed as base64 via read_file tool, causing context overflow and inconsistent LLM behavior

When using input_files with PDFFile (or File), CrewAI does not hand the file to the provider as a native file input. Instead, the file is read through the read_file tool, and its content comes back as a binary/base64 representation that is injected into the agent execution context. As a result, the LLM context becomes extremely large, responses become inconsistent or fail with context overflow, and the same file is re-processed through tools during agent execution.

Reported by: Community user

Files passed via input_files (such as PDFs) were processed through the read_file tool, which returned base64-encoded binary data. That data was injected directly into the LLM context as text instead of being formatted as native provider content blocks (e.g., OpenAI's input_file type, Anthropic's image/file blocks). The result was massive context inflation and unreliable LLM behavior.
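
To give a rough sense of the inflation described above, here is a minimal sketch that base64-encodes a random byte buffer standing in for a PDF. The 300,000-byte size is an arbitrary assumption, not a figure from the report. The point is only that base64 emits 4 characters per 3 input bytes, and all of that text would then have to be tokenized as prompt content.

```python
import base64
import os

# Stand-in for the raw bytes of a PDF; the size is an arbitrary assumption.
raw = os.urandom(300_000)

# This is the kind of text that ends up in the context when a tool returns
# base64 instead of the provider receiving a native file block.
encoded = base64.b64encode(raw).decode("ascii")

# Base64 emits 4 output characters for every 3 input bytes (rounded up).
expected_len = 4 * ((len(raw) + 2) // 3)

print(f"raw bytes:     {len(raw)}")
print(f"base64 chars:  {len(encoded)}")
print(f"growth factor: {len(encoded) / len(raw):.3f}")
```

The encoded text is about a third larger than the file itself, before any tokenization overhead is taken into account.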

File 1: lib/crewai-files/src/crewai_files/formatting/api.py (new file)

  • Added format_multimodal_content() function that converts text and files into provider-specific multimodal content blocks
  • Added aformat_multimodal_content() async variant for parallel file resolution
  • Added _normalize_provider() to map provider strings to standardized types (gemini, anthropic, bedrock, openai, etc.)
  • Added _format_text_block() for provider-specific text block formatting
  • Integrated with file processing pipeline: file resolution, constraint checking, upload caching, and provider-specific formatting

File 2: lib/crewai-files/src/crewai_files/formatting/openai.py (new file)

  • Added OpenAIFormatter class for OpenAI Chat Completions API formatting (image_url blocks with base64 data)
  • Added OpenAIResponsesFormatter class for OpenAI Responses API formatting (input_image and input_file types)
  • Both handle resolution of file content to either base64-encoded inline data or uploaded file references
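
For orientation, the block shapes named above can be sketched as plain dictionaries. This is an illustration based on the publicly documented OpenAI request formats, not the code in this PR: the helper names (`to_data_url`, `chat_completions_image_block`, and so on) are hypothetical, and the actual formatter classes may build their output differently.

```python
import base64


def to_data_url(data: bytes, media_type: str) -> str:
    """Encode raw bytes as a base64 data URL (hypothetical helper)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{payload}"


def chat_completions_image_block(data: bytes, media_type: str) -> dict:
    """Chat Completions style: an image_url block wrapping a data URL."""
    return {"type": "image_url", "image_url": {"url": to_data_url(data, media_type)}}


def responses_image_block(data: bytes, media_type: str) -> dict:
    """Responses style: an input_image block with the data URL inline."""
    return {"type": "input_image", "image_url": to_data_url(data, media_type)}


def responses_file_block(data: bytes, filename: str) -> dict:
    """Responses style: an input_file block carrying inline PDF data."""
    return {
        "type": "input_file",
        "filename": filename,
        "file_data": to_data_url(data, "application/pdf"),
    }


def responses_uploaded_file_block(file_id: str) -> dict:
    """Responses style: an input_file block referencing an uploaded file."""
    return {"type": "input_file", "file_id": file_id}


if __name__ == "__main__":
    print(chat_completions_image_block(b"abc", "image/png"))
    print(responses_file_block(b"%PDF-", "report.pdf"))
```

The last two helpers correspond to the two resolution outcomes mentioned in the bullet above: inline base64 data or a reference to an uploaded file.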

Risk: Medium. The change introduces a new API for multimodal content formatting that fundamentally changes how files are handed to LLM providers. Files are now formatted as native provider content blocks rather than being injected as raw base64 text. This is a significant architectural improvement, but it introduces new code paths for file handling, so it must be integrated carefully with the existing input_files flows to avoid breaking changes.

Summary by CodeRabbit

  • Bug Fixes
    • Improved handling and error messaging for PDF attachments in the OpenAI integration, including clearer guidance when PDFs aren’t supported.
    • Added MIME-type normalization so PDF detection is more reliable across common content-type variants.
  • Tests
    • Added coverage to confirm PDF handling rejects/accepts the correct cases and returns the expected OpenAI “input_file” formatting for supported variants.

corridor-security Bot left a comment

Summary: This PR updates multimodal file formatting for OpenAI providers by passing MIME type information and rejecting unsupported PDF attachments for the Chat Completions API; no exploitable security vulnerabilities were identified.

Risk: Low risk. The changes narrow provider handling behavior and do not introduce new authentication, authorization, data exposure, or user-controlled execution paths.

coderabbitai Bot commented Jun 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a7db43a0-4c69-42d0-a627-bdb74c7b3641

📥 Commits

Reviewing files that changed from the base of the PR and between e2b9cc2 and 37271e6.

📒 Files selected for processing (2)
  • lib/crewai-files/src/crewai_files/formatting/openai.py
  • lib/crewai-files/tests/test_openai_formatting.py

📝 Walkthrough

OpenAIFormatter.format_block now accepts an optional content_type and rejects PDFs with a specific TypeError. _format_block forwards file_input.content_type only on the OpenAI path, and OpenAIResponsesFormatter.format_block normalizes MIME types before mapping PDF inputs to input_file blocks.

Changes

OpenAI PDF handling for Chat Completions and Responses

| Layer | File(s) | Summary |
| --- | --- | --- |
| Dispatcher and Chat Completions PDF rejection | lib/crewai-files/src/crewai_files/formatting/api.py, lib/crewai-files/src/crewai_files/formatting/openai.py | _format_block now passes file_input.content_type only to OpenAIFormatter. OpenAIFormatter.format_block accepts content_type: str = "", normalizes it, and raises a PDF-specific TypeError for Chat Completions while keeping non-PDF resolved file handling unchanged. |
| Responses formatter PDF mapping and tests | lib/crewai-files/src/crewai_files/formatting/openai.py, lib/crewai-files/tests/test_openai_formatting.py | OpenAIResponsesFormatter.format_block normalizes MIME types before checking image vs PDF handling and maps PDF inputs to Responses API input_file blocks. New tests cover PDF MIME-type variants for both the Chat Completions rejection path and the Responses acceptance path. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly summarizes the main change: adding multimodal formatting so files are handled as native provider content blocks. |
| Linked Issues Check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes Check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai-files/src/crewai_files/formatting/openai.py`:
- Around line 140-143: The PDF detection at the is_pdf assignment is using an
exact string comparison against "application/pdf" which can be bypassed by MIME
types with parameters or different casing (e.g., "application/pdf;
charset=binary" or "Application/PDF"). Normalize the content_type before the
comparison by converting it to lowercase and extracting only the media type
portion (removing any parameters after a semicolon), then compare the normalized
value against "application/pdf" to ensure all PDF variants are properly detected
and rejected.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 78230643-21ec-4abe-ba2c-625860a2fcbb

📥 Commits

Reviewing files that changed from the base of the PR and between 854c67d and e2b9cc2.

📒 Files selected for processing (2)
  • lib/crewai-files/src/crewai_files/formatting/api.py
  • lib/crewai-files/src/crewai_files/formatting/openai.py

Comment thread lib/crewai-files/src/crewai_files/formatting/openai.py Outdated
@samrusani

I independently validated the MIME normalization issue on PR head e2b9cc2938842c1952ca4868bd1ea1b1f183737f.

Current behavior on that head:

  • OpenAIFormatter.format_block(..., "Application/PDF; charset=binary") returns an image_url block instead of raising the intended Chat Completions PDF error.
  • OpenAIFormatter.format_block(..., "application/pdf; charset=binary") also returns an image_url block.
  • OpenAIResponsesFormatter.format_block(..., "Application/PDF; charset=binary") raises Unsupported content type for Responses API instead of returning an input_file block.

A minimal local patch that normalizes once with:

media_type = content_type.split(";", 1)[0].strip().lower()

and uses media_type for the PDF/image checks made those probes pass. I also added a temporary local regression test parameterized for application/pdf, application/pdf; charset=binary, and Application/PDF across both Chat Completions rejection and Responses formatting.

Focused validation after the local patch:

uv run pytest lib/crewai-files/tests/test_openai_formatting.py lib/crewai-files/tests/test_resolved.py -q
17 passed in 2.85s

So the normalization concern still looks valid, and it may be worth sharing the same media-type helper across both formatter classes.
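
The shared media-type helper suggested in this comment could look roughly like the sketch below. Only the normalization expression comes from the thread; the function names, the error messages, and the use of TypeError on the Responses path are assumptions made for illustration, and the real formatter classes take more arguments than shown here.

```python
PDF_MEDIA_TYPE = "application/pdf"


def normalize_media_type(content_type: str) -> str:
    """Drop MIME parameters, trim whitespace, and lowercase the media type."""
    return content_type.split(";", 1)[0].strip().lower()


def check_chat_completions(content_type: str = "") -> None:
    """Stand-in for the Chat Completions path: reject any PDF variant."""
    if normalize_media_type(content_type) == PDF_MEDIA_TYPE:
        # Hypothetical message; the PR's actual wording is not shown in the thread.
        raise TypeError("PDF attachments are not supported on this path")


def responses_block_type(content_type: str) -> str:
    """Stand-in for the Responses path: pick the block type from the media type."""
    media_type = normalize_media_type(content_type)
    if media_type.startswith("image/"):
        return "input_image"
    if media_type == PDF_MEDIA_TYPE:
        return "input_file"
    # Exception class is an assumption; the thread only quotes the message.
    raise TypeError(f"Unsupported content type for Responses API: {media_type}")


PDF_VARIANTS = [
    "application/pdf",
    "application/pdf; charset=binary",
    "Application/PDF",
    "Application/PDF; charset=binary",
]

for variant in PDF_VARIANTS:
    try:
        check_chat_completions(variant)
        rejected = False
    except TypeError:
        rejected = True
    print(f"{variant!r}: chat rejected={rejected}, responses -> {responses_block_type(variant)}")
```

Routing both checks through one helper keeps the two formatter paths from drifting apart, which is the failure mode the probes above exposed.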

Whning0513 commented Jun 28, 2026

Author

Addressed in the latest commit on fix-issue-5930. I normalized the MIME type before the PDF/image checks (split(";", 1), strip(), lower()) and added regression coverage for application/pdf, application/pdf; charset=binary, and Application/PDF across the OpenAI formatter paths.
