refactor: rework puzzle selection with quality score, denylist, and true motif caps #21

Merged: SKOHscripts merged 3 commits into main from claude/anki-guid-puzzle-id-4pdvG on May 28, 2026

@SKOHscripts

Owner

Summary

Complete overhaul of the puzzle-selection algorithm in lichess_optimized_puzzles_datasets.py, addressing six structural weaknesses identified in the original design.

Problems fixed

| # | Weakness | Fix |
|---|----------|-----|
| 1 | Within-theme selection used arbitrary CSV row order | Bayesian quality score: quality = (NbPlays×p + 30×0.5) / (NbPlays + 30), so a noisy 100%/3-plays puzzle no longer beats a solid 92%/5000-plays puzzle |
| 2 | Popularity was the only quality signal; NbPlays and RatingDeviation ignored | NbPlays is read tolerantly (absent columns fall back gracefully); RatingDeviation is used as a soft sort-key preference (never blocks a motif) |
| 3 | Per-theme cap only counted one "bucket" per puzzle, ignoring co-occurrences | motif_count tracks ALL motifs of every selected puzzle; quality top-up respects the true cap |
| 4 | Motifs only present in Popularity<90 puzzles were structurally impossible to cover | Theme-aware complement phase forces in the best available puzzle per uncovered motif, no popularity gate |
| 5 | Themes mixed tactical motifs with metadata tags, inflating coverage metric | THEME_DENYLIST excludes mateIn1..5, oneMove, short/long/veryLong, crushing/advantage/equality, master/masterVsMaster/superGM, opening/middlegame/endgame |
| 6 | iterrows() over millions of rows | Vectorized explode + sort_values + groupby().head() fast-pass |
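
To make the shrinkage in row 1 concrete, here is a minimal sketch of the quality formula quoted above. It assumes p is the approval rate as a fraction in [0, 1]; the function name and the keyword defaults are hypothetical and not taken from the script.

```python
def quality_score(p: float, nb_plays: int,
                  prior_weight: float = 30.0, prior_mean: float = 0.5) -> float:
    """Shrink the observed rate p toward prior_mean when nb_plays is small.

    Assumption: p is a fraction in [0, 1]. The constants 30 and 0.5 are the
    ones quoted in the PR table; the parameter names are illustrative.
    """
    return (nb_plays * p + prior_weight * prior_mean) / (nb_plays + prior_weight)


noisy = quality_score(1.00, 3)      # (3*1.00 + 15) / 33
solid = quality_score(0.92, 5000)   # (4600 + 15) / 5030

# The well-evidenced puzzle now ranks above the noisy one.
assert solid > noisy
```

With only 3 plays the prior dominates and the score stays near 0.5, while 5000 plays leave the observed rate almost unchanged.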

New capabilities

  • target_deck_size (default 1200) gives explicit control over cards per deck
  • UPPER_TRANCHE_EDGES: the unbounded ≥1800 tail is split into 1800–1900, 1900–2000, 2000–2200, 2200+, giving homogeneous Woodpecker decks (14 sub-decks total)
  • report_theme_coverage now returns unique_motifs_sample, unique_motifs_tranche, coverage_pct (motif-based, honest), and coverage_pct_all (all-theme, for reference); all existing keys preserved
  • ALL_DECKS in build_apkg.py updated to match the new filenames
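
The split of the ≥1800 tail can be sketched as a lookup over the four edges listed above. This is an illustration only: the list shape of UPPER_TRANCHE_EDGES, the half-open [low, high) intervals, and the helper name upper_tranche_label are assumptions, not the script's actual definitions.

```python
from bisect import bisect_right
from typing import Optional

# Edges as listed in the PR description; the constant's real shape may differ.
UPPER_TRANCHE_EDGES = [1800, 1900, 2000, 2200]


def upper_tranche_label(rating: int) -> Optional[str]:
    """Return the upper sub-deck label for a rating, or None below 1800.

    Assumes half-open intervals: 1900 falls in "1900-2000", not "1800-1900".
    """
    idx = bisect_right(UPPER_TRANCHE_EDGES, rating)
    if idx == 0:
        return None  # below 1800: covered by the lower tranches, not sketched here
    low = UPPER_TRANCHE_EDGES[idx - 1]
    if idx == len(UPPER_TRANCHE_EDGES):
        return f"{low}+"  # open-ended top tranche
    return f"{low}-{UPPER_TRANCHE_EDGES[idx]}"


assert upper_tranche_label(1850) == "1800-1900"
assert upper_tranche_label(2200) == "2200+"
```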

Test plan

  • 58 tests pass (pytest tests/ -q)
  • 10 new tests: quality score monotonicity, graceful NbPlays fallback, denylist filtering, theme-aware complement, _quality_topup cap enforcement, determinism under row shuffle, target deck size, motif coverage keys, upper tranche edges
  • Run extract_tranches on a real Lichess CSV sample and verify coverage_pct ≈ 100%, deck sizes ≈ 1200, no motif exceeds target_per_theme
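
The manual check in the last item could be approximated with a small stand-alone helper like the one below. It is a sketch, not the project's report_theme_coverage: check_deck is a hypothetical name, the denylist contains only the tags named in this description (the real constant may differ), and the Themes values are assumed to be space-separated strings as in the Lichess puzzle CSV.

```python
from collections import Counter

# Only the tags named in the PR description; the real THEME_DENYLIST may differ.
THEME_DENYLIST = {
    "mateIn1", "mateIn2", "mateIn3", "mateIn4", "mateIn5", "oneMove",
    "short", "long", "veryLong", "crushing", "advantage", "equality",
    "master", "masterVsMaster", "superGM", "opening", "middlegame", "endgame",
}


def motifs(themes: str) -> set:
    """Tactical motifs of one puzzle: its themes minus denylisted tags."""
    return {t for t in themes.split() if t not in THEME_DENYLIST}


def check_deck(sample_themes, tranche_themes, target_per_theme: int) -> dict:
    """Report motif coverage and any motif whose count exceeds the cap."""
    # Count every motif of every selected puzzle (co-occurrences included).
    counts = Counter(m for th in sample_themes for m in motifs(th))
    available = set().union(*(motifs(th) for th in tranche_themes))
    covered = available & set(counts)
    coverage = 100.0 * len(covered) / len(available) if available else 100.0
    over_cap = {m: c for m, c in counts.items() if c > target_per_theme}
    return {"coverage_pct": coverage, "over_cap": over_cap}


tranche = ["fork pin short", "skewer mateIn2", "fork endgame"]
report = check_deck(["fork pin short", "skewer mateIn2"], tranche, target_per_theme=1)
assert report == {"coverage_pct": 100.0, "over_cap": {}}
```

An empty over_cap together with coverage_pct near 100 corresponds to the two conditions named in the test plan.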

https://claude.ai/code/session_01VAUnQCt5CM2TVpRbsQbSBL


Generated by Claude Code

claude added 3 commits May 28, 2026 17:58
…rue motif caps

Replace the iterrows+dict sampling with a vectorized pipeline that addresses
six structural weaknesses in the original algorithm:

- Bayesian quality score (Popularity + NbPlays confidence-shrunk) so a noisy
  100%/3-plays puzzle no longer outranks a well-evidenced 92%/5000-plays puzzle;
  RD used as a soft sort-key preference without blocking any motif.
- THEME_DENYLIST excludes non-tactical metadata tags (mateIn1..5, crushing,
  master, oneMove, etc.) from the diversity objective and coverage metric.
- Vectorized fast-pass: explode+sort+groupby.head instead of iterrows over
  millions of rows; fully deterministic via PuzzleId tiebreak.
- Theme-aware complement: motifs present only in low-popularity puzzles are
  guaranteed ≥1 representative (no popularity gate in complement phase).
- True per-motif cap: motif_count tracks all co-occurring motifs of selected
  puzzles, preventing overrepresentation via the quality top-up phase.
- target_deck_size (default 1200) gives explicit volume control per tranche.
- The unbounded ≥1800 tail is split into 1800-1900, 1900-2000, 2000-2200,
  2200+ for homogeneous Woodpecker decks; ALL_DECKS updated accordingly.
- report_theme_coverage adds motif-based keys (unique_motifs_sample/tranche,
  coverage_pct_all) while keeping all backward-compat keys.
- 10 new tests covering quality monotonicity, denylist, complement, cap, and
  determinism under row shuffle.
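
The explode+sort+groupby.head fast-pass described in the list above can be sketched with pandas as follows. This is a simplified illustration under stated assumptions: fast_pass and the precomputed quality column are hypothetical, denylist filtering is omitted, and the true per-motif cap across co-occurring motifs (handled by a later phase per this commit message) is not enforced here.

```python
import pandas as pd


def fast_pass(df: pd.DataFrame, per_motif: int) -> pd.DataFrame:
    """Pick the top `per_motif` puzzles per motif without iterating rows.

    Assumes columns PuzzleId (unique), Themes (space-separated), and quality.
    """
    # One row per (puzzle, motif) pair.
    exploded = (
        df.assign(motif=df["Themes"].str.split())
          .explode("motif")
          .dropna(subset=["motif"])
    )
    # Best quality first; PuzzleId breaks ties so input row order is irrelevant.
    ranked = exploded.sort_values(
        ["motif", "quality", "PuzzleId"], ascending=[True, False, True]
    )
    top = ranked.groupby("motif", sort=False).head(per_motif)
    # A puzzle chosen for several motifs appears once in the result.
    return df[df["PuzzleId"].isin(top["PuzzleId"])].sort_values("PuzzleId")


puzzles = pd.DataFrame({
    "PuzzleId": ["a", "b", "c", "d"],
    "Themes": ["fork pin", "fork", "fork", "pin"],
    "quality": [0.9, 0.8, 0.8, 0.5],
})
shuffled = puzzles.sample(frac=1.0, random_state=0)

# Same selection regardless of CSV row order.
assert list(fast_pass(puzzles, 2)["PuzzleId"]) == list(fast_pass(shuffled, 2)["PuzzleId"])
```

Because the sort key ends in a unique PuzzleId, ties in quality (puzzles b and c above) resolve the same way on every run.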

https://claude.ai/code/session_01VAUnQCt5CM2TVpRbsQbSBL
@SKOHscripts SKOHscripts marked this pull request as ready for review May 28, 2026 18:19
@SKOHscripts SKOHscripts merged commit b04e05d into main May 28, 2026
14 checks passed
@SKOHscripts SKOHscripts deleted the claude/anki-guid-puzzle-id-4pdvG branch May 28, 2026 18:19