feat: Scoring V2 and NewsAPI integration with budget enforcement#4

Merged
satriapamudji merged 2 commits into main from feature/scoring-v2-design on Jan 9, 2026

@satriapamudji

Owner

Summary

This PR implements two major features:

Scoring V2 (Task 23)

  • Macro-first scoring algorithm with source tier weighting, recency decay, and keyword boosting
  • Source tiers: T1 (20pts), T2 (15pts), T3 (10pts), unlisted (5pts)
  • Configurable recency decay with half-life parameter
  • Title keyword boosting for high-impact topics (Fed, earnings, geopolitical)
  • Enable via providers.scoring: heuristic_v2
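
A minimal sketch of how the three signals above could combine. Only the tier point values (20/15/10/5) come from this PR; the domain map, the keyword boosts, the 12-hour half-life, and the choice to multiply the base score by the decay factor are assumptions for illustration, not the contents of heuristics_v2.py.

```python
import re
from datetime import datetime, timedelta, timezone

# Tier points as listed in the PR summary.
TIER_POINTS = {"T1": 20, "T2": 15, "T3": 10}
UNLISTED_POINTS = 5

# Hypothetical values for illustration only (not taken from the PR).
DOMAIN_TIERS = {"reuters.com": "T1", "cnbc.com": "T2", "example-blog.com": "T3"}
KEYWORD_BOOSTS = {"fed": 10.0, "earnings": 6.0, "sanctions": 8.0}


def tier_points(domain: str) -> int:
    """Map a source domain to its tier points; unlisted domains get 5."""
    return TIER_POINTS.get(DOMAIN_TIERS.get(domain, ""), UNLISTED_POINTS)


def recency_factor(published: datetime, now: datetime, half_life_hours: float) -> float:
    """Exponential decay: the factor halves every `half_life_hours`."""
    age_hours = max((now - published).total_seconds() / 3600.0, 0.0)
    return 0.5 ** (age_hours / half_life_hours)


def keyword_boost(title: str) -> float:
    """Sum the boosts of every configured keyword that appears as a whole word."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return sum(boost for kw, boost in KEYWORD_BOOSTS.items() if kw in words)


def score(domain: str, title: str, published: datetime, now: datetime,
          half_life_hours: float = 12.0) -> float:
    """Assumed combination: (tier + keyword boost) scaled by recency."""
    base = tier_points(domain) + keyword_boost(title)
    return base * recency_factor(published, now, half_life_hours)


now = datetime(2026, 1, 9, 12, 0, tzinfo=timezone.utc)
fresh = score("reuters.com", "Fed holds rates", now, now)
stale = score("reuters.com", "Fed holds rates", now - timedelta(hours=12), now)
```

With these assumed numbers, an article loses half its score after one half-life, so a T1 story that is 12 hours old ranks level with a fresh T2 story carrying the same keywords.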

NewsAPI Integration (Task 24)

  • TheNewsAPI.com as alternative ingestion source to RSS
  • Multi-key rotation with round-robin or failover strategies
  • Smart pagination with sliding window duplicate detection
  • Budget enforcement: stops when usage_remaining <= min_remaining_budget
  • Configurable via apis/newsapi_{stream}.txt
  • Enable via providers.ingestion: api_newsapi
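
The rotation and budget rules above can be sketched roughly as follows. The class and function names are hypothetical and do not mirror news_api_client.py; only the two strategy names and the `usage_remaining <= min_remaining_budget` stop condition come from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class KeyRotator:
    """Hypothetical helper: hands out API keys by strategy."""
    keys: list
    strategy: str = "round_robin"  # or "failover"
    _index: int = 0
    _exhausted: set = field(default_factory=set)

    def next_key(self) -> str:
        live = [k for k in self.keys if k not in self._exhausted]
        if not live:
            raise RuntimeError("all API keys exhausted")
        if self.strategy == "failover":
            # Failover: stay on the first usable key until it is marked exhausted.
            return live[0]
        # Round-robin: cycle through the usable keys in order.
        key = live[self._index % len(live)]
        self._index += 1
        return key

    def mark_exhausted(self, key: str) -> None:
        self._exhausted.add(key)


def should_stop(usage_remaining: int, min_remaining_budget: int) -> bool:
    """Stop condition as stated in the PR description."""
    return usage_remaining <= min_remaining_budget


rr = KeyRotator(["a", "b", "c"])
rr_sequence = [rr.next_key() for _ in range(4)]

fo = KeyRotator(["a", "b"], strategy="failover")
fo_before = fo.next_key()
fo.mark_exhausted("a")
fo_after = fo.next_key()
```

Note that the comparison is inclusive: when the remaining budget equals the configured minimum, ingestion already stops.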

Analysis Tooling

  • Comparison eval framework for A/B testing scoring algorithms
  • Generates reports comparing v1 vs v2 output rankings
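
As an illustration of what a v1 vs v2 ranking comparison might report, here is a small sketch. The function name, the metrics (top-k overlap and per-item rank shift), and the report shape are assumptions; the PR does not describe the framework's actual output.

```python
def compare_rankings(v1_ranked: list, v2_ranked: list, k: int = 10) -> dict:
    """Compare two ranked lists of article IDs (best first)."""
    top_v1, top_v2 = v1_ranked[:k], v2_ranked[:k]
    v1_pos = {item: i for i, item in enumerate(v1_ranked)}
    # Positive shift means the item moved up under v2.
    shifts = {
        item: v1_pos[item] - i
        for i, item in enumerate(v2_ranked)
        if item in v1_pos
    }
    return {
        "overlap_at_k": len(set(top_v1) & set(top_v2)),
        "entered_top_k": [x for x in top_v2 if x not in top_v1],
        "left_top_k": [x for x in top_v1 if x not in top_v2],
        "rank_shifts": shifts,
    }


report = compare_rankings(["a", "b", "c", "d"], ["c", "a", "e", "b"], k=3)
```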

Key Files

| Category | Files |
| --- | --- |
| Scoring V2 | src/argus/scoring/heuristics_v2.py, src/argus/pipeline/providers/scoring_heuristic_v2.py |
| NewsAPI | src/argus/pipeline/providers/news_api_client.py, ingestion_api_newsapi.py |
| Config | src/argus/config.py (NewsApiConfig), apis/newsapi_us_markets.txt |
| Docs | docs/integrations/newsapi.md, tasks/03_archive/task_documentation/24_NewsAPIIntegration.md |
| Tests | tests/test_scoring_v2.py (26 tests), tests/test_ingestion_api_newsapi.py (23 tests) |

Testing

  • 559 tests passed, 5 skipped
  • E2E tested NewsAPI budget enforcement with real API calls
  • Scoring V2 evaluated against V1 using comparison framework

Configuration

# config.yaml
providers:
  ingestion: api_newsapi  # or "rss" (default)
  scoring: heuristic_v2   # or "heuristic_v1"

# .env
NEWS_API_KEYS="key1,key2,key3"
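
For reference, a comma-separated key list like the one above could be read along these lines. This is a sketch only: `load_news_api_keys` is a hypothetical helper, and how Argus actually loads `NEWS_API_KEYS` is not shown in this PR.

```python
import os


def load_news_api_keys(env=None) -> list:
    """Split NEWS_API_KEYS on commas, dropping blanks and stray quotes."""
    env = os.environ if env is None else env
    raw = env.get("NEWS_API_KEYS", "")
    # Some .env loaders leave the surrounding quotes in place; strip them.
    raw = raw.strip().strip("\"'")
    return [part.strip() for part in raw.split(",") if part.strip()]


keys = load_news_api_keys({"NEWS_API_KEYS": "key1, key2,,key3"})
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test.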

- Create comprehensive design doc (docs/design/scoring_v2_macro_first.md)
- Archive Task 22 (design-only task)
- Create Task 23 for implementation
- Document design decisions and acceptance criteria
…ement

Scoring V2 (Task 23):
- Add macro-first scoring algorithm (heuristics_v2.py)
- Source tier scoring (T1: 20pts, T2: 15pts, T3: 10pts)
- Recency decay with configurable half-life
- Title keyword boosting for high-impact topics
- Configurable via providers.scoring: heuristic_v2

NewsAPI Integration (Task 24):
- Add TheNewsAPI.com as alternative ingestion source
- Multi-key rotation (round-robin/failover strategies)
- Smart pagination with sliding window duplicate detection
- Budget enforcement: stops when usage_remaining <= threshold
- Fix: properly convert usage headers from string to int
- Configurable via apis/newsapi_{stream}.txt
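
The string-to-int fix matters because HTTP header values arrive as strings, and comparing them as strings gives the wrong answer for a budget check. A sketch of the conversion, with a hypothetical header name (`X-Usage-Remaining` is a placeholder, not necessarily what TheNewsAPI.com sends):

```python
from typing import Mapping, Optional


def parse_usage_remaining(
    headers: Mapping[str, str],
    header_name: str = "X-Usage-Remaining",  # hypothetical header name
) -> Optional[int]:
    """Return the remaining-usage header as an int, or None if absent/invalid."""
    raw = headers.get(header_name)
    if raw is None:
        return None
    try:
        return int(str(raw).strip())
    except ValueError:
        return None


# Lexicographic comparison of strings is misleading for a budget check.
string_compare = "9" <= "10"
remaining = parse_usage_remaining({"X-Usage-Remaining": "9"})
int_compare = remaining is not None and remaining <= 10
```

Returning None for a missing or malformed header lets the caller decide whether to stop or continue, rather than silently treating it as zero.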

Analysis tooling:
- Add comparison eval framework for A/B testing scoring algorithms
- Generate reports comparing v1 vs v2 output rankings

Tests: 559 passed, 5 skipped
@satriapamudji

Owner Author

@copilot check and verify this PR

Copilot AI commented Jan 9, 2026


@satriapamudji I've opened a new pull request, #5, to work on those changes. Once the pull request is ready, I'll request review from you.

satriapamudji requested a review from Copilot on January 9, 2026 17:16
satriapamudji merged commit fe7658e into main on Jan 9, 2026
4 checks passed
satriapamudji deleted the feature/scoring-v2-design branch on January 9, 2026 17:20

Copilot AI left a comment

Pull request overview

This PR implements two major features for the Argus news aggregation system:

Scoring V2 (Macro-First): A new scoring algorithm that prioritizes macro-economic news over clickbait content using domain-based tier weighting, recency decay, keyword boosting for high-impact topics, and penalties for low-quality content.

NewsAPI Integration: Adds TheNewsAPI.com as an alternative ingestion source to RSS feeds, featuring multi-key rotation, smart pagination with sliding window duplicate detection, and budget enforcement to prevent API quota exhaustion.
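
One way sliding window duplicate detection can work is to remember only the most recent N article IDs, so memory stays bounded across many pages. The sketch below assumes this design; the class name, window size, and eviction rule are illustrative and are not taken from the provider code.

```python
from collections import deque


class SlidingWindowDeduper:
    """Remembers the last `window_size` IDs and flags repeats among them."""

    def __init__(self, window_size: int = 500) -> None:
        self._window_size = window_size
        self._order = deque()  # insertion order, oldest on the left
        self._seen = set()     # fast membership test for the same IDs

    def is_new(self, article_id: str) -> bool:
        if article_id in self._seen:
            return False
        self._order.append(article_id)
        self._seen.add(article_id)
        if len(self._order) > self._window_size:
            # Evict the oldest ID once the window is full.
            self._seen.discard(self._order.popleft())
        return True


deduper = SlidingWindowDeduper(window_size=2)
results = [deduper.is_new(x) for x in ["a", "a", "b", "c", "a"]]
```

In this run the final "a" counts as new again because it was evicted when "c" arrived, which is the trade-off a bounded window accepts.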

Reviewed changes

Copilot reviewed 35 out of 37 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_scoring_v2.py | Comprehensive test suite (26 tests) for v2 scoring algorithm |
| tests/test_ingestion_api_newsapi.py | Test suite (23 tests) for NewsAPI ingestion provider |
| tests/test_pipeline_registry.py | Updated registry tests for new providers |
| tests/test_config_providers.py | Updated config tests for v2 default |
| src/argus/scoring/heuristics_v2.py | New v2 scoring implementation with macro-first prioritization |
| src/argus/scoring/types.py | Extended ScoringCandidate with feed_url and author fields |
| src/argus/scoring/worker.py | Updated to support v2 scoring |
| src/argus/pipeline/registry.py | Registered new providers |
| src/argus/pipeline/providers/* | New NewsAPI provider implementations |
| src/argus/config.py | Added NewsApiConfig and changed default scorer to v2 |
| config.yaml | Updated provider defaults |
| docs/integrations/newsapi.md | Comprehensive NewsAPI documentation |
| docs/design/scoring_v2_macro_first.md | Detailed design document for v2 |
| tasks/03_archive/* | Task documentation files |

    parse_iso_datetime,
)
from argus.pipeline.providers.ingestion_api_newsapi import NewsApiIngestionProvider
from argus.pipeline.providers.news_api_client import NewsApiResponse, NewsArticle

Copilot AI Jan 9, 2026

Import of 'NewsApiResponse' is not used.

# Unicode characters in titles/snippets. Ensure we never crash while printing.
try:
    sys.stdout.reconfigure(errors="backslashreplace")
except Exception:

Copilot AI Jan 9, 2026

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- except Exception:
+ except Exception:
+     # Best-effort: if stdout cannot be reconfigured (e.g., on unsupported platforms),
+     # continue without crashing; this only affects how Unicode errors are displayed.
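
For context, the complete pattern with the suggested comment applied might look like the sketch below. Wrapping it in a function that returns a bool is an addition made here for illustration; the reviewed script runs the try/except inline.

```python
import sys


def make_stdout_unicode_safe() -> bool:
    """Ask stdout to escape unencodable characters instead of raising."""
    try:
        # TextIOWrapper.reconfigure exists on Python 3.7+; replaced or
        # redirected stdout objects may not provide it.
        sys.stdout.reconfigure(errors="backslashreplace")
        return True
    except Exception:
        # Best-effort: if stdout cannot be reconfigured (e.g., on unsupported
        # platforms), continue without crashing; this only affects how
        # Unicode errors are displayed.
        return False


reconfigured = make_stdout_unicode_safe()
```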

3 participants