Feature/evidence type filter #5

Open
borkarsaish65 wants to merge 3 commits into ELEVATE-Project:release-1.0 from borkarsaish65:feature/evidence-type-filter

Conversation

@borkarsaish65

No description provided.

borkarsaish65 and others added 3 commits June 18, 2026 16:43
Lets a user restrict which evidence types get validated by an
execution instead of always processing all three. Adds a generic
processing_config JSONB column (Alembic migration) on Execution so
future per-execution processing toggles have a home; evidence_types
rides inside it. Plumbed through to the pre-processor (skips
disallowed-type rows, reported in summary) and the processor
(defense-in-depth re-check in case a stale pre-split file is reused).
Supported on both create paths and on PATCH update, where it merges
into the existing processing_config instead of overwriting it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
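
A minimal sketch of the two behaviours this commit message describes: merging `evidence_types` into an existing `processing_config` on PATCH, and skipping disallowed-type rows in the pre-processor. The function names, the `evidence_type` row key, and the `other_toggle` key are hypothetical and not taken from the PR's code; rows are assumed to be plain dicts.

```python
from typing import Any, Iterable, List, Optional, Tuple


def merge_processing_config(existing: Optional[dict], updates: dict) -> dict:
    """Layer `updates` over `existing` without dropping unrelated keys.

    Returns a new dict so the stored config is not mutated in place.
    """
    merged = dict(existing or {})
    merged.update(updates)
    return merged


def split_by_evidence_type(
    rows: Iterable[dict], allowed_types: Optional[Iterable[str]]
) -> Tuple[List[dict], List[dict]]:
    """Partition rows into (kept, skipped) by their evidence type.

    `allowed_types=None` is treated as "no filter configured", so every row
    is kept. The "evidence_type" key is an assumed column name.
    """
    rows = list(rows)
    if allowed_types is None:
        return rows, []
    allowed = set(allowed_types)
    kept: List[dict] = []
    skipped: List[dict] = []
    for row in rows:
        (kept if row.get("evidence_type") in allowed else skipped).append(row)
    return kept, skipped


# PATCH-style update: the hypothetical "other_toggle" key survives the merge.
config = merge_processing_config({"other_toggle": True}, {"evidence_types": ["pdf"]})
rows = [{"id": 1, "evidence_type": "pdf"}, {"id": 2, "evidence_type": "excel"}]
kept, skipped = split_by_evidence_type(rows, config.get("evidence_types"))
```

Returning the skipped rows alongside the kept ones would let the caller report them in the summary, and the same helper could back the processor's second check.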
Excluded rows were tagged Relevance Tag='Irrelevant' even though no API
call was made for them, which inflated api_failures/failed_list in the
processor's run summary and skewed the relevance percentage shown in
reports (report_service.py folds every 'Irrelevant' row into the same
bucket regardless of whether it was actually evaluated).

Reuse 'notValidated' instead (same sentinel the relevant-evidence-cap
feature already uses for "not evaluated due to execution config") and
port that branch's two report/processor fixes onto this one: a 4th
notValidated bucket in RELEVANCE_TYPES/create_node/relevance_counts so
the breakdown still sums to total, and a rel_score() denominator that
excludes notValidated so the percentage reflects only evaluated rows.
Comment wording matches the sibling branch verbatim so the two branches'
overlapping hunks merge without conflict later.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
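
The bucket and denominator change described above can be sketched as follows. This is an illustration under stated assumptions, not the code in report_service.py: the real `rel_score()` formula is not shown in this PR page, so counting only `Relevant` rows in the numerator is an assumption, and a tuple is used instead of a set purely for stable ordering.

```python
from collections import Counter
from typing import Dict, Iterable

# Four buckets; "notValidated" marks rows that were never evaluated.
RELEVANCE_TYPES = ("Relevant", "Partially Relevant", "Irrelevant", "notValidated")


def relevance_counts(tags: Iterable[str]) -> Dict[str, int]:
    """Count rows per bucket; every bucket is present even when empty.

    Assumes every tag is one of RELEVANCE_TYPES, so the counts sum to total.
    """
    counts = Counter(tags)
    return {bucket: counts.get(bucket, 0) for bucket in RELEVANCE_TYPES}


def rel_score(counts: Dict[str, int]) -> float:
    """Percentage of evaluated rows that are Relevant.

    notValidated rows are left out of the denominator. The numerator
    (Relevant only) is an assumption made for this sketch.
    """
    evaluated = sum(n for bucket, n in counts.items() if bucket != "notValidated")
    if evaluated == 0:
        return 0.0
    return 100.0 * counts["Relevant"] / evaluated


counts = relevance_counts(["Relevant", "Relevant", "Irrelevant", "notValidated"])
score = rel_score(counts)  # 2 Relevant out of 3 evaluated rows
```

With the excluded row tagged `Irrelevant` instead, the same input would score 2 out of 4, which is the skew the commit message describes.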
pandas.read_excel() needs openpyxl to read .xlsx files; without it,
process_excel() in scripts/processor/1-main-parallel-script.py throws
"Missing optional dependency 'openpyxl'" on every excel row, retries
3x, then gives up and tags the row Irrelevant with no Q&A, so it looks
as if excel evidence was never analyzed. The sister repo already
carries this dependency (unpinned); pin it here to match how this repo
pins everything else.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
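
One way to make this class of failure loud rather than silent is a start-up check for optional dependencies. The sketch below is a suggestion only and not part of this PR; `require_module` is a hypothetical helper built on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def require_module(name: str) -> None:
    """Raise immediately if a top-level module cannot be imported.

    Intended to run once at start-up, before any rows are processed, so a
    missing optional dependency stops the run instead of failing per row.
    """
    if importlib.util.find_spec(name) is None:
        raise RuntimeError(
            f"Required module '{name}' is not installed; "
            "install it before running the processor."
        )


# Example: a module that ships with Python passes the check.
require_module("json")
```

Calling `require_module("openpyxl")` before the first `process_excel()` call would surface the missing package once, up front.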
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Comment thread models/schemas.py
        return None
    if not isinstance(v, list):
        raise ValueError("evidence_types must be an array of strings")
    allowed = {"image", "pdf", "excel"}

Collaborator

@borkarsaish65 Can we take this from the csv type config table? We can store the allowed types there.

Also include the file extensions.
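
A rough sketch of what this suggestion could look like: the validator receives the allowed types (with their extensions) from the caller instead of hardcoding them. How the mapping would be loaded from the config table is not shown in this PR, so `allowed` is passed in directly here, and the extension values in the example are hypothetical.

```python
from typing import List, Mapping, Optional, Sequence


def validate_evidence_types(
    v: Optional[list], allowed: Mapping[str, Sequence[str]]
) -> Optional[list]:
    """Validate `v` against allowed types supplied by the caller.

    `allowed` maps an evidence type to its file extensions and is assumed
    to come from configuration rather than from a literal in the code.
    """
    if v is None:
        return None
    if not isinstance(v, list):
        raise ValueError("evidence_types must be an array of strings")
    unknown = [t for t in v if not isinstance(t, str) or t not in allowed]
    if unknown:
        raise ValueError(
            f"unsupported evidence_types {unknown!r}; allowed: {sorted(allowed)}"
        )
    return v


def extensions_for(types: Sequence[str], allowed: Mapping[str, Sequence[str]]) -> List[str]:
    """Flatten the extensions of the selected types, keeping their order."""
    return [ext for t in types for ext in allowed[t]]


# Hypothetical configuration; real values would come from the config table.
example_allowed = {"image": [".jpg", ".png"], "pdf": [".pdf"], "excel": [".xlsx"]}
selected = validate_evidence_types(["pdf", "excel"], example_allowed)
```

Keeping the mapping as a parameter means a new evidence type only needs a configuration change, not a code change.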

Comment thread models/schemas.py
        return None
    if not isinstance(v, list):
        raise ValueError("evidence_types must be an array of strings")
    allowed = {"image", "pdf", "excel"}

Collaborator

@borkarsaish65 Don't hardcode the available evidence types in code.

states=request_data.states or [],
criterias_mode=criterias_mode,
threshold_config=threshold_config,
processing_config=processing_config,

Collaborator

@borkarsaish65 Why do we need it as JSON?

detail="states must be a non-empty array if provided."
)

if field == 'evidence_types':

Collaborator

@borkarsaish65 Use constants here.

# excluded by the execution's evidence-type filter. It's a distinct bucket: counted in
# "total" but never folded into Relevant/Partially Relevant/Irrelevant, and excluded from
# rel_score's numerator and denominator.
RELEVANCE_TYPES = {"Relevant", "Partially Relevant", "Irrelevant", "notValidated"}

Collaborator

@borkarsaish65 Can we move this to commons as well?

2 participants