feat(research-areas): refresh all ResearchAreas pages for matrix completeness #35
…ations Expand the Scaffolding stub into the full deep-dive template: thematic sections (mould/scaffold geometry surrogates; mechanical & print-quality prediction), clickable Papers.md anchors for refs #19/#20/#34/#35, a Tools and data section, an open-challenges analysis, and a Further reading footer. Every claim grounded in the cited papers' full text and passed both adversarial reviewers (citation + claim).
…ded citations Reconcile the page against its full matrix column (refs #1-3, #15-18, #21, #23-25, #58, #169-170): thematic clusters (Bayesian optimization, hybrid surrogate/evolutionary search, active learning, explainable feature selection), clickable Papers.md anchors, Tools and data and open-challenges sections, and a Further reading footer. Kanda (#16) is framed as protocol optimization, not media optimization, per its full text. Every claim grounded in source full text via the caail Zotero library; passed both adversarial reviewers (the one flagged 'first' superlative removed).
…ded citations Reconcile against the matrix column (refs #7, #29-33, #59, #61-62): thematic sections (soft sensors; CFD-surrogate acceleration; small-data fermentation prediction; reduced metabolic models/digital twins; autonomous experimentation), clickable Papers.md anchors, Tools and data and open-challenges sections, and a Further reading footer. The microbial-vs-mammalian framing is grounded in the corpus. Ref #33's claim is held to its catalogued title (full text paywalled). Passed both adversarial reviewers; #29 (CFD error) and #31 (GA objective) corrected to match source full text.
…unded citations Expand the 3-paragraph stub to cover the full ~32-paper matrix column: single-cell characterization, biological foundation models (MLM / autoregressive / cross-species / LLM-native), perturbation-response prediction, genetic-part design, and autonomous cell-engineering agents, plus a cross-cutting section. DNABERT (#6) framed as a masked-LM model and Mathieu (#60) surfaced as the one explicitly cultivated-meat study, both per full text. Clickable Papers.md anchors, Software/Datasets/Databases cross-links, Tools and data, open-challenges, Further reading. Passed citation review and both halves of the claim review.
…y Prediction Weave in the matrix-column papers missing from the page: a new 'Computational prediction of taste and off-flavor' section covering the Niv-lab bitterness lineage (BitterPredict #102, BitterIntense #103, BitterMatch #104, BitterMasS #105), plus texture (#171) and image-based freshness QC (#195) in the applied section. Grounded in source full text; passed citation review and claim review (model-list enumeration corrected to be non-exhaustive).
…I Evaluation Cover the matrix-column benchmarks missing from the page: a new 'General-purpose frontier benchmarks (capability context)' subsection (SWE-bench #155, GPQA #156, MMLU-Pro #157, Humanity's Last Exam #158, FrontierScience #159) framed honestly as capability context rather than biology evals, plus SciHorizon-Gene #164 (cell-state benchmarks) and MeatScan #196 (domain-specific). Cross-linked to existing Datasets/Benchmarks.md and Databases.md leaderboard entries. Passed citation review and claim review (SciHorizon-Gene axes corrected to all four, 'literature influence' restored).
Weave the missing AI Tooling-column papers into their clusters: discovery agents (Co-Scientist #153, Robin #154, ERA #166), domain-specific agents (Talk2Biomodels/T2KG #167), robot scientists (the original King 2004 Robot Scientist #182), agent infrastructure (BioContextAI #133, Aviary #160, SciAtlas #162), chemistry agents (ether0 #161, MoleCode #163), and the 'other methodology' section (Epicure #197 plus the Gu CHI human-AI-verification studies #151/#152, framed accurately as HCI studies, not agents). Also fix a pre-existing broken Further-reading anchor (Talks section lives in Talks.md, not OtherResources.md). Passed citation review and claim review (King-2004 headword corrected from 'Adam' to 'The Robot Scientist'; agent order fixed).
… field-gap papers

Reconciles the eight ResearchAreas pages against main after the #32 field-gap additions and the #33 matrix-classification audit, which rewrote Papers.md (new rows incl. Reinforcement Learning and Chemometrics, Foundation Models split into five sub-rows, a 7th AI Evaluation & Benchmarking column, the Bioprocess column renamed to "Bioprocess & Scale-Up", and several references removed/reclassified).

Correctness:
- Remove four dangling anchors whose refs the audit deleted: #22 (Lao), #151/#152 (Gu CHI/HCI), #163 (MoleCode).
- Rename the Bioprocess page H1/lede to "Bioprocess & Scale-Up" and relabel every cross-link to match the renamed column.
- Reframe reclassified refs: #60 (Mathieu) is now a cultivated-meat perspective rather than "the one column study"; #90/#126/#62 corrected from Cellular-Engineering/Bioprocess cells to their AI Tooling cells; #68 recharacterized (LLM literature-extraction + hybrid GEM/DL predictor, not a RAG design workflow).

Completeness: integrate ~39 newly-added column papers, each grounded in the paper's full text via the caail Zotero library and gated by the caail-claim-reviewer (writer != reviewer):
- Media: GA/evolutionary + one-shot DoE media work (#210, #211, #212, #213).
- Cellular Engineering: CellFM (#235), porcine-adipocyte readout (#218), and the active-learning/strain-design multi-listings (#63, #66, #68).
- Bioprocess & Scale-Up: a reinforcement-learning control cluster (#200-#203), hybrid mechanistic+ML control (#204, #205), ML soft sensors (#206-#208), GA+ANN viral production (#209), and microbial volatile prediction (#27).
- Scaffolding: ML scaffold/print-quality prediction (#214, #215, #216) and a nondestructive-characterization section (#217).
- Sensory Prediction: MeatScan freshness data (#196) and the Tac generative burger study (#236).
- AI Tooling: ProCyon (#224), discovery/agent-infra refs (#219-#223), the chemometrics R packages (#198, #199), and D-GEX (#4).
- AI Evaluation: BioML-bench (#225), ARIEL (#53), and State/Cell-Eval (#57).

Verified: 0 dangling anchors, every matrix-column paper represented in its page, and `pnpm --dir site build` succeeds.
Updated: reconciled against
| Page | New refs |
|---|---|
| Media Optimization | #210, #211, #212, #213 |
| Cellular Engineering | #235, #218, #63, #66, #68 |
| Bioprocess & Scale-Up | #200–#203 (new RL cluster), #204, #205, #206–#208, #209, #27 |
| Scaffolding | #214, #215, #216, #217 |
| Sensory Prediction | #196, #236 |
| AI Tooling | #4, #198, #199, #219, #220, #221, #222, #223, #224, #126, #90 |
| AI Evaluation | #225, #53, #57 |
Adversarial review caught and fixed real errors before commit, e.g. #208 ("Raman > capacitance" is only true for viability — capacitance wins VCD), #209 ("coefficients above 0.99" → held-out test 0.99; validation was 0.9856), and the #68 RAG mischaracterization.
Verified: 0 dangling anchors · every matrix-column paper represented in its page · pnpm --dir site build succeeds (38 pages).
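
The two anchor checks reported above (no dangling anchors, every column paper represented) could be sketched roughly as follows. This is a minimal illustration, not the repo's actual verification script: the `Papers.md#N` link form is taken from this PR's description, while the function names and the toy data are hypothetical.

```python
from __future__ import annotations

import re

# Assumed anchor form: a link target such as "Papers.md#200".
# The real pages may use a different link syntax; adjust the pattern accordingly.
ANCHOR_RE = re.compile(r"Papers\.md#(\d+)")


def cited_ids(page_text: str) -> set[int]:
    """Return every Papers.md ID that a page links to."""
    return {int(m) for m in ANCHOR_RE.findall(page_text)}


def dangling_anchors(page_text: str, known_ids: set[int]) -> set[int]:
    """IDs the page cites that do not exist in Papers.md."""
    return cited_ids(page_text) - known_ids


def missing_from_page(page_text: str, column_ids: set[int]) -> set[int]:
    """IDs assigned to the page's matrix column but never cited in its prose."""
    return column_ids - cited_ids(page_text)


if __name__ == "__main__":
    # Toy data for illustration only; not the real page or matrix contents.
    page = "RL control (Papers.md#200, Papers.md#201); see also Papers.md#22."
    known = {200, 201, 207}
    column = {200, 201, 207}
    print(sorted(dangling_anchors(page, known)))    # [22]
    print(sorted(missing_from_page(page, column)))  # [207]
```

Both checks pass when each function returns an empty set for every page.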
MeatScan (Papers.md #196, Gyening et al. 2025) is a Data in Brief image dataset of fresh/spoiled cow-meat photos, not an AI/ML benchmark suite, and it sits in the Sensory Prediction matrix column (CNN), not AI Evaluation.

- Remove it from Datasets/Benchmarks.md (the AI/ML benchmark-datasets page); its canonical home stays Datasets/Cow.md (species inventory + narrative).
- Remove the stray MeatScan bullet from the AI Evaluation deep-dive's "Domain-specific predictive benchmarks" section.
- Drop the now-removed Benchmarks.md H2 from the dataset tally: it was double-counted (once as a Cow.md inventory row, once as a Benchmarks.md heading), so the catalogued-dataset total goes 147 -> 146 and benchmarkEntries 15 -> 14. Update the three pinned test ground-truths accordingly.

MeatScan's Sensory Prediction mention (framed as a dataset, linked to Datasets/Cow.md) and its Papers.md #196 entry are unchanged.
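
Why the total drops by exactly one can be shown with a toy tally. This is only an illustration under assumed data shapes: the `(name, file)` record form, the function names, and the placeholder dataset are hypothetical, and the actual fix in this commit removes the Benchmarks.md heading rather than deduplicating at count time.

```python
from __future__ import annotations


def naive_total(entries: list[tuple[str, str]]) -> int:
    """Count every catalogue entry, even if one dataset appears in two files."""
    return len(entries)


def distinct_total(entries: list[tuple[str, str]]) -> int:
    """Count each dataset name once, regardless of how many files list it."""
    return len({name for name, _ in entries})


if __name__ == "__main__":
    # Hypothetical entries as (dataset name, file where it is listed).
    entries = [
        ("MeatScan", "Datasets/Cow.md"),         # inventory row
        ("MeatScan", "Datasets/Benchmarks.md"),  # heading that caused the double count
        ("PlaceholderDataset", "Datasets/Cow.md"),
    ]
    print(naive_total(entries))     # 3
    print(distinct_total(entries))  # 2
```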
Follow-up: MeatScan recatalogued as a dataset, not a benchmark (
What & why
The `ResearchAreas/*.md` deep-dives behind the `Papers.md` matrix had drifted into two tiers: four mature pages and four near-empty stubs that covered only a fraction of their matrix columns (e.g. Cellular Engineering's column spans ~32 papers; its prose covered ~6). This PR brings all eight pages up to the mature template and reconciles each against its matrix column, so every paper assigned to a column is now represented in that column's page.

How it was built (grounding + adversarial review)
Every factual claim is grounded in the paper's full text, pulled from the locally-synced caail Zotero group library (`localhost:23119`) rather than abstracts, per the repo's AI-agent convention. Every drafted/edited page then passed the repo's two read-only adversarial reviewer subagents before commit (writer ≠ reviewer):

- `caail-citation-reviewer`: every `(… ref #N)` anchor resolves to the right `Papers.md` ID with matching author/year, and every `Software.md` / `Databases.md` / `Datasets/` / `Talks.md` cross-link resolves.
- `caail-claim-reviewer`: every count, metric, method description, and identity claim checked against source full text.

The review caught and fixed real issues, e.g. a contradicted CFD cross-validation figure (Bioprocess #29), a GA-objective drift (#31), an unsupported "first" superlative (Media #1), incomplete enumerations (Sensory #171, AIEval #164 SciHorizon-Gene), and a King-2004 headword ("Adam" → "The Robot Scientist").
Per-page coverage
Verification
- `Papers.md#N` anchors across the 8 pages resolve to existing IDs (0 dangling).
- `pnpm --dir site build` succeeds: all 8 `/research-areas/*` pages render.

Follow-ups (out of scope here, not modifying `Papers.md`)

Grounding surfaced a few likely matrix-placement nuances to consider in a separate PR: #34 Andrews 2025 sits in the SVM row but uses a genetic algorithm + DL surrogate (no SVM); #6 DNABERT sits in Deep Learning though it is a masked-LM model; #60 Mathieu and #145 MetaGEM are placed among foundation models but are an interactome perspective and a metabolic-model reconstruction respectively. The prose describes each accurately; the matrix cells were left untouched.
🤖 Generated with Claude Code