[codex] Add Mistral OCR client resource #152
Walkthrough
Adds Mistral OCR client support with request validation, typed OCR response objects, CLI examples, and unit tests, and narrows the Google Gemini chat adapter so response schema data is only included when the response MIME type is valid.
Changes
- Mistral OCR feature
- Google Gemini generation config fix
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/mistral/src/Resources/Ocr.php`:
- Around line 29-39: The OCR payload is missing the required document type
discriminator because `Ocr::process()` validates it via `detectDocumentType()`
but does not persist the result into `$parameters['document']`. Update the
normalization flow in `Ocr::process()` so the detected type is written back onto
the `document` array before calling `Payload::create('ocr', $parameters)`,
ensuring requests without an explicit type still send a valid `document.type`
value.
In `@packages/mistral/src/Responses/Ocr/ConfidenceScore.php`:
- Around line 33-37: The ConfidenceScore hydration logic rejects integer-valued
confidence inputs because `ConfidenceScore::fromArray` currently validates
`$confidence` with `Assert::float`; update this to accept numeric values
instead, then cast the value to float before constructing the object. Keep the
existing `$text` and `$startIndex` handling intact, and make the change in the
`ConfidenceScore` class where the `confidence` attribute is read and validated.
In `@packages/mistral/src/Responses/Ocr/ConfidenceScores.php`:
- Around line 40-41: The page confidence validation in ConfidenceScores::from is
too strict because Assert::float rejects whole-number values returned by the
API. Update the handling of averagePageConfidenceScore and
minimumPageConfidenceScore to accept any numeric value consistently with
ConfidenceScore::from, then normalize/cast to float before constructing the
object.
In `@packages/mistral/src/Responses/Ocr/Image.php`:
- Around line 36-42: The nullable OCR coordinate fields in Image::from are still
being read with direct array access, which can trigger undefined key warnings
before validation. Update the coordinate parsing in Image::from to use
null-coalescing defaults for top_left_x, top_left_y, bottom_right_x, and
bottom_right_y, matching the handling used in Block::from, while keeping the
existing Assert::nullOrInteger checks and constructor argument mapping intact.
In `@packages/mistral/src/Responses/Ocr/Page.php`:
- Around line 47-48: The Page::fromArray hydration is treating required OCR
fields as if they were optional, so update the image/dimension extraction to
follow the file’s existing defensive defaults pattern. In the Page class, adjust
the assignments for $rawImages and $rawDimensions so they use null-coalescing
defaults consistent with the other optional arrays in this parser, while keeping
the handling aligned with the Mistral OCR schema expectations.
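The first finding above (writing the detected type back onto `document`) can be sketched as follows. This is a minimal illustration, not the PR's code: the standalone `detectDocumentType()` and `normalizeOcrParameters()` functions are hypothetical, and the rule that the discriminator equals the name of the key that is present (`document_url` or `image_url`) is an assumption made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: infer the discriminator from the key that is present.
 * The supported keys are an assumption for illustration only.
 *
 * @param array<string, mixed> $document
 */
function detectDocumentType(array $document): string
{
    // An explicit type always wins, as long as it is a string.
    if (array_key_exists('type', $document)) {
        if (!is_string($document['type'])) {
            throw new InvalidArgumentException('The document "type" must be a string.');
        }

        return $document['type'];
    }

    foreach (['document_url', 'image_url'] as $key) {
        if (array_key_exists($key, $document)) {
            return $key;
        }
    }

    throw new InvalidArgumentException('Unable to detect the OCR document type.');
}

/**
 * Hypothetical normalization step that would run before the payload is built.
 *
 * @param array<string, mixed> $parameters
 * @return array<string, mixed>
 */
function normalizeOcrParameters(array $parameters): array
{
    $document = $parameters['document'] ?? null;

    if (!is_array($document)) {
        throw new InvalidArgumentException('The "document" parameter must be an array.');
    }

    // Persist the detected type so the outgoing payload always carries it,
    // instead of validating it and then discarding the result.
    $document['type'] = detectDocumentType($document);
    $parameters['document'] = $document;

    return $parameters;
}

$normalized = normalizeOcrParameters([
    'model' => 'mistral-ocr-latest',
    'document' => ['document_url' => 'https://example.com/sample.pdf'],
]);

echo json_encode($normalized['document'], JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Returning a new array rather than mutating the caller's input keeps the normalization easy to unit test: a regression test only needs to assert that `document.type` is present in the array handed to the payload factory.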
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8b6032bf-6f29-4409-aedd-87cb77be4bc5
📒 Files selected for processing (20)
- packages/google-gemini-adapter/src/Chat/GoogleGeminiChatAdapter.php
- packages/mistral/examples/ocr-blocks.php
- packages/mistral/examples/ocr.php
- packages/mistral/src/Client.php
- packages/mistral/src/ClientInterface.php
- packages/mistral/src/Model.php
- packages/mistral/src/Resources/Ocr.php
- packages/mistral/src/Resources/OcrInterface.php
- packages/mistral/src/Responses/Ocr/Block.php
- packages/mistral/src/Responses/Ocr/ConfidenceScore.php
- packages/mistral/src/Responses/Ocr/ConfidenceScores.php
- packages/mistral/src/Responses/Ocr/Dimensions.php
- packages/mistral/src/Responses/Ocr/Image.php
- packages/mistral/src/Responses/Ocr/Page.php
- packages/mistral/src/Responses/Ocr/ProcessResponse.php
- packages/mistral/src/Responses/Ocr/Table.php
- packages/mistral/src/Responses/Ocr/UsageInfo.php
- packages/mistral/tests/Unit/ClientTest.php
- packages/mistral/tests/Unit/Resources/OcrTest.php
- packages/mistral/tests/Unit/Responses/Ocr/ProcessResponseTest.php
- Persist detected document type into the OCR request payload so type-less requests still send the required document.type discriminator
- Accept integer-valued confidence scores by validating numeric and casting to float in ConfidenceScore and ConfidenceScores
- Use defensive null-coalescing for nullable Image coordinates and optional Page images/dimensions to avoid undefined-key warnings
- Add regression tests covering the above
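The numeric-cast and null-coalescing fixes in this commit can be illustrated with a small standalone sketch. It is not the PR's code: `toFloat()` and `toNullableInt()` are hypothetical helpers written in plain PHP rather than with the assertion library the review mentions, and the JSON key names are assumptions. The sketch is also stricter than a general "numeric" check, since it accepts only `int` and `float`, not numeric strings.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: accept int or float and normalize to float.
 * json_decode() yields an int for a whole-number value such as 1,
 * so a float-only check would reject it.
 */
function toFloat(mixed $value, string $field): float
{
    if (!is_int($value) && !is_float($value)) {
        throw new InvalidArgumentException(
            sprintf('Expected "%s" to be an int or float, got %s.', $field, get_debug_type($value))
        );
    }

    return (float) $value;
}

/**
 * Hypothetical helper: read an optional integer without touching a missing key.
 *
 * @param array<string, mixed> $attributes
 */
function toNullableInt(array $attributes, string $key): ?int
{
    // The null-coalescing operator avoids an "undefined array key" warning.
    $value = $attributes[$key] ?? null;

    if ($value !== null && !is_int($value)) {
        throw new InvalidArgumentException(
            sprintf('Expected "%s" to be an integer or null, got %s.', $key, get_debug_type($value))
        );
    }

    return $value;
}

// A whole-number confidence decodes to an int, not a float.
$score = json_decode('{"text": "Invoice", "confidence": 1, "start_index": 0}', true, 512, JSON_THROW_ON_ERROR);
$confidence = toFloat($score['confidence'], 'confidence');

// An image payload with only some coordinates present.
$image = ['id' => 'img-0', 'top_left_x' => 12];
$topLeftX = toNullableInt($image, 'top_left_x');
$bottomRightY = toNullableInt($image, 'bottom_right_y');

var_dump($confidence, $topLeftX, $bottomRightY);
```

Validating first and casting second keeps invalid input (strings, booleans, arrays) from being silently coerced to `0.0` or `1.0` by the `(float)` cast.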
What changed
- `ocr` endpoint.
- `Model::OCR` (`mistral-ocr-latest`) and `Model::OCR_4` (`mistral-ocr-4-0`).
- `Client::ocr()`, OCR request payload mapping, default model behavior, validation, and OCR 4 response block mapping.
Why
Downstream consumers need first-class Mistral OCR support instead of carrying a Composer patch in application repositories. OCR 4 block extraction is included so clients can consume structured page regions from the API.
Validation
Ran locally in packages/mistral:
- composer fix
- composer lint
- composer test
Ran locally in packages/google-gemini-adapter for the CI fix:
- composer fix
- composer lint
- composer test
GitHub Actions:
Summary by CodeRabbit
New Features
Bug Fixes
Tests