feat: retry 429 with backoff + document pagination date-range loss #178

Open
ranahaani wants to merge 2 commits into master from feat/429-backoff-and-paginate-docs

@ranahaani

Owner

Summary

  • _get_news retries HTTP 429 with capped exponential backoff + uniform jitter. Configurable via max_retries, retry_backoff_base, retry_backoff_max on the GNews constructor. Previously a single 429 bubbled straight to the caller, forcing every downstream consumer to reinvent retry policy.
  • _get_news_more_than_100 docstring + UserWarning now spell out that date ranges are discarded when paginating past 100 results, that the rolling window steps in 7-day chunks anchored on the earliest parsed published_date, and that per-call seen_urls dedup does not persist across calls. Lets callers building precise temporal pipelines pick the right max_results.
  • _fetch_feed and _sleep extracted as seams so retry behaviour can be unit tested without real HTTP or wall-clock waits.
  • README updated with the new retry params.
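The pagination behaviour described in the second bullet can be pictured roughly as follows. This is a minimal sketch, not the library's code: the names next_window, dedup and WINDOW, the "url" dictionary key, and the assumption that each window ends at the earliest date seen so far and reaches 7 days further back are all illustrative.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)  # chunk size named in the summary above


def next_window(published_dates):
    """Return (start, end) of the next window, anchored on the earliest date seen.

    Assumption: the window steps backwards in time from the anchor.
    """
    anchor = min(published_dates)
    return anchor - WINDOW, anchor


def dedup(articles, seen_urls):
    """Keep articles whose 'url' is new, recording each URL in seen_urls.

    seen_urls is owned by the caller of one paginated fetch, so nothing
    is remembered once that call returns.
    """
    fresh = []
    for article in articles:
        if article["url"] not in seen_urls:
            seen_urls.add(article["url"])
            fresh.append(article)
    return fresh
```

Because each window is derived only from the dates already fetched, any start/end range the caller supplied plays no part in the loop, which is the loss the warning calls out.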

Why

Came up while designing gnews-agent (a persistent news memory + MCP layer built on top of GNews). The agent's fetcher needed retry with backoff and precise temporal queries; both gaps belong in the library rather than being reinvented by every downstream user.

Compatibility

  • New params default to max_retries=3, retry_backoff_base=1.0, retry_backoff_max=60.0. No API changes are required: existing call sites keep working and now retry 429 responses by default.
  • Setting max_retries=0 restores the previous immediate-raise behaviour.
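To make the defaults concrete, here is a minimal sketch of capped exponential backoff with uniform jitter. It assumes the "full jitter" variant (a delay drawn uniformly between 0 and the capped ceiling); the exact formula used by the PR is documented in its retries guide and may differ. The function name backoff_delay and its rng parameter are hypothetical.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.uniform):
    """Seconds to wait before retry number `attempt` (0-based).

    Assumed formula: uniform(0, min(cap, base * 2**attempt)).
    `rng` is injectable so the jitter can be pinned in tests.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng(0.0, ceiling)
```

Under this assumption the ceilings for the three default retries are 1 s, 2 s and 4 s, so the 60 s cap only comes into play when max_retries is raised to 7 or more.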

Test plan

  • pytest tests/test_retry_backoff.py — 8/8 pass
  • pytest tests/test_exceptions.py tests/test_logging.py — 5/5 pass (no regression on offline tests)
  • Maintainer to run the network-dependent test_gnews.py suite as part of CI
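For readers who have not seen the seam pattern, the sketch below shows roughly how stubbing _fetch_feed and _sleep keeps such tests offline and instant. Fetcher and RateLimited are hypothetical stand-ins, not the GNews class or its exception types, the retry loop is simplified (jitter omitted), and run_with_stubs is an illustrative helper.

```python
from unittest import mock


class RateLimited(Exception):
    """Stand-in for whatever error the fetcher raises on HTTP 429."""


class Fetcher:
    """Hypothetical stand-in for the library class; only the seams matter."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries

    def _fetch_feed(self, url):
        raise NotImplementedError("real HTTP would live here")

    def _sleep(self, seconds):
        raise NotImplementedError("real time.sleep would live here")

    def get(self, url):
        for attempt in range(self.max_retries + 1):
            try:
                return self._fetch_feed(url)
            except RateLimited:
                if attempt == self.max_retries:
                    raise  # retries exhausted (or disabled with 0)
                self._sleep(2 ** attempt)  # jitter omitted for brevity


def run_with_stubs(max_retries, outcomes):
    """Drive Fetcher.get with both seams stubbed; return (result, sleeps)."""
    fetcher = Fetcher(max_retries=max_retries)
    with mock.patch.object(fetcher, "_fetch_feed", side_effect=outcomes), \
         mock.patch.object(fetcher, "_sleep") as fake_sleep:
        result = fetcher.get("https://example.invalid/feed")
    return result, [call.args[0] for call in fake_sleep.call_args_list]
```

Passing a list such as [RateLimited(), RateLimited(), "feed"] as outcomes simulates two 429s followed by a success, and the recorded sleeps can be asserted on without any real waiting.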

…ation date-range loss

- _get_news now retries HTTP 429 responses with capped exponential backoff and
  uniform jitter (max_retries, retry_backoff_base, retry_backoff_max constructor
  params). Previously a single 429 bubbled straight to the caller, which forced
  every downstream consumer to reinvent retry policy.
- _fetch_feed and _sleep extracted as seams so retry behaviour can be unit
  tested without real HTTP or wall-clock waits.
- _get_news_more_than_100 docstring + UserWarning now spell out that date
  ranges are discarded when paginating past 100 results, that the rolling
  window steps in 7-day chunks anchored on the earliest parsed published_date,
  and that per-call seen_urls dedup does not persist across calls. Helps
  callers building precise temporal pipelines pick the right max_results.
- README updated with the new retry parameters.
- tests/test_retry_backoff.py covers config validation, backoff math, retry
  exhaustion, retry disabled, and non-429 short-circuit (8 tests, all pass).
…changelog

- New docs/usage/retries.md walks through default behaviour, tuning,
  disabling, the backoff formula, and what is intentionally not retried.
  Linked from docs/index.rst so it appears in the ReadTheDocs sidebar.
- docs/reference/api.md GNews constructor signature now lists the three new
  retry parameters with defaults, plus a pointer to the retries guide.
- setup.py version bumped to 0.8.2 (additive feature, no API removal).
- docs/conf.py release synced to 0.8.2 (was stuck at 0.6.0).
- docs/changelog.md prepended with a 0.8.2 entry covering both the retry
  feature and the pagination warning/docstring tightening.