Reject empty extracted resume uploads #793

Open
spsilu wants to merge 2 commits into srbhr:main from spsilu:fix-reject-empty-resume-upload

spsilu commented May 6, 2026

Pull Request Title

Reject empty extracted resume uploads

Related Issue

#791

Description

Reject uploads whose PDF/DOCX text extraction comes back empty, so the empty result is neither stored nor passed to AI parsing, where it could otherwise produce a sample-style fallback resume.

Type

  • Bug Fix
  • Feature Enhancement
  • Documentation Update
  • Code Refactoring
  • Other (please specify):

Proposed Changes

  • Return 422 when uploaded files produce empty extracted text
  • Add a parser-level guard for empty resume content
  • Add an integration test covering empty extracted text uploads
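
A minimal sketch of how the two guards listed above could fit together, written in plain Python so it runs without the project's web framework. Every name here (`EmptyResumeTextError`, `ensure_resume_text`, `handle_upload`, `UploadResult`) is hypothetical and chosen for illustration; the actual identifiers, error message, and response shape in this PR may differ.

```python
from dataclasses import dataclass


class EmptyResumeTextError(ValueError):
    """Hypothetical error raised when extraction yields no usable text."""


@dataclass
class UploadResult:
    status_code: int
    detail: str


def ensure_resume_text(text):
    """Parser-level guard: reject None, empty, or whitespace-only text."""
    if text is None or not text.strip():
        raise EmptyResumeTextError(
            "No text could be extracted from the uploaded file. "
            "If this is a scanned document, run OCR first and re-upload."
        )
    return text.strip()


def handle_upload(extracted_text, store, parse):
    """Upload-level check: answer 422 before any storage or parsing happens.

    `store` and `parse` are assumed callables standing in for the
    database write and the AI parsing step.
    """
    try:
        text = ensure_resume_text(extracted_text)
    except EmptyResumeTextError as exc:
        return UploadResult(status_code=422, detail=str(exc))
    resume_id = store(text)
    parse(text)
    return UploadResult(status_code=200, detail=f"stored resume {resume_id}")
```

Running the guard before `store` and `parse` is what keeps a blank extraction from ever reaching the database or the AI parser. Keeping the same check inside the parser covers callers that bypass the upload path.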

Screenshots / Code Snippets (if applicable)

N/A

How to Test

  1. Upload a scanned or non-text PDF that produces empty extracted text.
  2. Confirm the API returns 422 with a text-extraction error.
  3. Confirm no resume record is created and AI parsing is not invoked.
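
The manual steps above can also be approximated in an automated check. The sketch below uses only the standard library; `upload_resume` is a hypothetical stand-in for the real upload handler, and the `db` and `parser` mocks are assumed substitutes for the project's storage and AI parsing layers, not its actual fixtures.

```python
import unittest
from unittest.mock import Mock


def upload_resume(extracted_text, db, parser):
    """Hypothetical stand-in for the upload handler under test."""
    if not extracted_text or not extracted_text.strip():
        return {"status": 422, "detail": "Could not extract text from the file."}
    db.save(extracted_text)
    parser.parse(extracted_text)
    return {"status": 200, "detail": "ok"}


class EmptyExtractionTest(unittest.TestCase):
    def test_blank_text_is_rejected_without_side_effects(self):
        db, parser = Mock(), Mock()

        # Whitespace-only text mimics a scanned PDF with no text layer.
        response = upload_resume("  \t\n", db, parser)

        self.assertEqual(response["status"], 422)
        db.save.assert_not_called()       # no resume record created
        parser.parse.assert_not_called()  # AI parsing never invoked


if __name__ == "__main__":
    unittest.main()
```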

Checklist

  • The code compiles successfully without any errors or warnings
  • The changes have been tested and verified
  • The documentation has been updated (if applicable)
  • The changes follow the project's coding guidelines and best practices
  • The commit messages are descriptive and follow the project's guidelines
  • All tests (if applicable) pass successfully
  • This pull request has been linked to the related issue (if applicable)

Additional Information

A lightweight HTTP-level verification confirmed the endpoint returns 422 and does not create a resume record when extracted text is blank. The branch is prepared locally as fix-reject-empty-resume-upload.


Summary by cubic

Reject uploads where text extraction returns empty content by returning 422 and skipping storage/parsing. Improves the error message with OCR guidance to prevent bogus resumes from scanned or non-text PDFs/DOCX.

  • Bug Fixes
    • Return 422 with a clearer message and OCR guidance when extracted text is blank.
    • Add a parser guard that raises on empty resume content.
    • Add an integration test verifying 422, no DB write, and no parser call.

Written for commit 4208a29. Summary will update on new commits.

cubic-dev-ai (bot) left a comment

Contributor

No issues found across 3 files
