Releases: eijex/factorforge-cds
v3.2.5 — FactorForge
Fixed
- The published Docker image (ghcr.io/eijex/factorforge-cds) crashed on every POST /api/optimize request with AttributeError: 'FactorForgeHandler' object has no attribute 'validate_host', returning an empty response. Root cause: scripts/serve.py routed requests to the API handler via an unbound-method call (handler.do_POST(self)) instead of real inheritance, which broke once do_POST started calling self.validate_host / self.send_error_response for host-parameter validation. The hosted web app (factorforge.eijex.com) and the PyPI CLI were unaffected; only the Docker image and the local python scripts/serve.py dev server were broken. FactorForgeHandler now properly inherits from the optimize API handler.
- CITATION.cff's doi field was left pointing at the v3.2.3 exact-release DOI (10.5281/zenodo.20758131) after the v3.2.4 release commit bumped version / date-released but not doi. Updated to the v3.2.4 exact-release DOI (10.5281/zenodo.20826659), confirmed live on Zenodo. Found while updating the FactorForge Paper 1 manuscript's Availability section to cite the same DOI.
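The failure mode behind the Docker crash can be reproduced in a few lines. The sketch below is illustrative only: the class names ApiHandler, DelegatingHandler, and InheritingHandler are hypothetical stand-ins, not the classes in scripts/serve.py, and the host check is simplified to a set lookup.

```python
class ApiHandler:
    """Hypothetical stand-in for the optimize API handler."""

    def validate_host(self, host):
        # Simplified host check; the real validation logic is not shown in the notes.
        return host in {"nbenthamiana", "by2"}

    def do_POST(self, host):
        # do_POST depends on a helper that lives on the same class.
        if not self.validate_host(host):
            return 400
        return 200


class DelegatingHandler:
    """Broken pattern: borrows do_POST through an unbound-method call."""

    def do_POST(self, host):
        # `self` is a DelegatingHandler, which has no validate_host attribute,
        # so the borrowed method raises AttributeError at runtime.
        return ApiHandler.do_POST(self, host)


class InheritingHandler(ApiHandler):
    """Fixed pattern: real inheritance makes every helper available on self."""
```

Calling DelegatingHandler().do_POST("by2") raises AttributeError, while InheritingHandler().do_POST("by2") returns 200. The unbound call worked only as long as do_POST touched nothing else on self.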
v3.2.4 — FactorForge
Added
- "View Predicted Structure" (AlphaFold DB / ESM Atlas) buttons now show an explicit consent modal naming the actual external operator (EMBL-EBI/Google DeepMind for AlphaFold DB, Meta Platforms, Inc. for ESM Atlas) before the sequence is sent via URL, replacing the previous passive footnote-only disclosure.
- Runtime validation registry (factorforge.validation_registry) listing all 17 validation checks with per-execution-path enforcement metadata, and a canonical validation report builder (factorforge.validation_report) that scans the final returned CDS and reports results keyed by check_id. The web "Sequence Checks" panel and the /api/optimize response now expose all 9 advisory scanners plus restriction-site and MoClo-overhang results (previously only 3 of 9+ checks were visible). The validation.moclo / polya / gc legacy fields are unchanged.
- reproducibility/benchmark_v0.5.1/scripts/figures/make_cai_vs_multiconstraint_figure.py regenerates the "CAI-only baseline versus constraint-aware design" scatter figure (mean CAI vs. corrected multi-constraint pass rate per method) from benchmark_summary.frozen.json. No generation script for this figure existed anywhere in the repository or its Git history; this is a reconstruction verified against the figure's existing per-method values, not a recovery of the original code.
Docs
- Clarified that BY-2 (N. tabacum) experimental host support applies to the balanced / gc_target / assembly_friendly rule-based profiles only; high_cai and feasibility_best remain N. benthamiana-only by design.
- docs/how-it-works.md's "Design Objectives" table listed gc_target / high_cai as if they were working DP --objective values; passing either to --objective always raised an error (the DP engine only implements feasibility_best). Split the table into the DP engine's one real objective and the profile engine's --profile values, with an explicit note that the latter are not DP objectives.
- docs/profiles.md's "Stable Profiles" section claimed all four profiles are "fully supported ... via CLI, Python API, web app, and MCP" without disclosing that high_cai is N. benthamiana-only and rejected for other hosts at the CLI/REST/web layers. Added that qualifier.
Fixed
- Web UI "AA Preserved" badge in the Optimization Results panel displayed validator_status (assembly/restriction-site review outcome) instead of actual amino acid identity: a 33 aa test sequence with a flagged restriction site showed "⚠️ Review" even though its translated protein was 100% identical to the input. The API already exposed the correct value as constraint_report.aa_identity; getPrimaryResult() in web/js/app.js now reads that field instead. No API contract change.
- Success/info toast notifications (e.g. "History item loaded") were nearly invisible in dark mode: the generic .dark .bg-emerald-50 / .bg-blue-50 card-dimming rules turned the toast background into a low-alpha tint while the dark text color was left unchanged. Scoped an opaque, high-contrast override to #toastContainer only; light mode and other card UIs (e.g. the privacy banner) are unaffected.
- /api/optimize/compare and /api/optimize/batch accepted a host or host_profile field in the request body but never read it, silently ignoring the caller's host intent and returning HTTP 200 with default-host output. Both endpoints now reject any request containing either field with HTTP 400 (HOST_NOT_SUPPORTED_ON_ENDPOINT) instead of silently dropping it.
- CLI --objective declared gc_target and high_cai as valid DP objective values via click.Choice, but the DP engine only ever implements feasibility_best; passing either always raised "DP engine currently supports --objective feasibility_best.". Removed both from the click.Choice so the CLI rejects them at the argument-parsing stage with a clear "invalid choice" error instead of failing inside the engine.
- A direct library call (
RuleBasedOptimizer().optimize(profile="high_cai", host=<non-default>)) silently substituted N. benthamiana golden-set
output with no indication that the requested host was ignored (the
existing host/strategy compatibility guard only covers the REST/CLI/web
surfaces). Added a logger.warning at this call site; the host-invariant
output itself is unchanged (this is the documented design boundary, not a
bug) and the existing benchmark codon_table_path injection path is unaffected.
- Web UI "View Predicted Structure → ESM Atlas" link used a URL scheme
(esmatlas.com/explore?tab=fold&sequence=) that ESM Atlas's router never
branches on, so it always landed on the generic explore screen instead of
a fold view regardless of sequence. Switched to the route ESM Atlas's own
navigation actually uses (esmatlas.com/resources?action=fold&sequence=),
confirmed against ESM Atlas's production JS bundle and verified live —
it now lands on the correct "Fold Sequence" tool (ESM Atlas's UI does not
support pre-filling the sequence box from a URL parameter, so users still
paste it manually after arriving).
- Removed internal task-tracking references ("Job NNN") from the public changelog (CHANGELOG.md, docs/changelog.md, web/index.html) and from public docs (docs/strategy/eijex-tool-layer-classification.md, docs/validation/RELEASE_GATE.md, reproducibility/benchmark_v0.5.1/README.md). scripts/audit_public_surface.py's internal_reference pattern now also matches Job \d+ so future releases catch this automatically.
- Removed the same internal task-tracking references ("Job NNN", "analysis NNN") from source-level comments, docstrings, and registry provenance fields (src/, tests/), and deleted benchmarks/scripts/resume_job130_rerun.sh, a one-off maintainer rerun script that hardcoded a local developer machine path. Empirical citations (CAI/GC benchmark numbers) were kept; only the internal ID labels were removed.
- /api/optimize now rejects explicit objective=feasibility_best or profile=high_cai requests combined with a non-default host (HTTP 400) instead of silently returning N. benthamiana-table output. The implicit case (host-only request, no explicit strategy) now discloses requested_strategy / resolved_strategy / resolution_reason and resolves to balanced (previously silently resolved toward high_cai, which is also N. benthamiana-only). CLI and web UI now block high_cai for non-default hosts (--compare-profiles included), matching the existing feasibility_best guard; the web UI's auto-selected fallback for BY-2 changes from high_cai to gc_target.
- Release provenance hashing is now computed from the committed git blob (git show HEAD:<path>) instead of local working-tree bytes, fixing CRLF/LF drift on Windows that could silently produce incorrect SHA-256 values in reproducibility/benchmark_v0.5.1/MANIFEST.json and tests/test_docs_consistency.py.
- Public-surface DOI references (README.md, docs/index.md, AGENTS.md) switched from version-pinned Zenodo DOIs to the concept DOI, which always resolves to the latest release, so future releases no longer require manually updating these files.
- AGENTS.md no longer states a hardcoded "16 version-bearing files" count, which had gone stale; it points at scripts/release.py's build_targets() as the source of truth instead.
- docs/rule-engine-roadmap.md, docs/validation.md, docs/how-it-works.md, and docs/factorforge-architecture.md enumerated only 5-8 of the 9 default advisory RuleEngine scanners and omitted the MoClo overhang assembly-review check; all four now list the complete set. rule-engine-roadmap.md additionally mis-stated "Repeat patterns" as "Planned / Not yet implemented" (it is implemented and runs by default) and described an unused legacy GC-window calculation instead of the active scan_gc_extremes thresholds; both corrected. No runtime code changed.
- Web app "Sequence Checks" badge labeled "MoClo Overhang Check" actually reported a Type IIS restriction-site scan result, not MoClo overhang validity (the real MoClo overhang check lives in the opt-in construct-builder path and was never called here); label and changelog text corrected to "Restriction Site Check (Type IIS)". The validation.moclo JSON field name is kept unchanged for frontend compatibility.
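The "reject instead of silently drop" behavior described for /api/optimize/compare and /api/optimize/batch can be sketched as a small guard. The field names and the error code come from the note above. The helper name check_host_fields and the shape of the error payload are assumptions for illustration and are not taken from the project's code.

```python
from typing import Any, Mapping, Optional, Tuple

# Field names and error code as stated in the release note.
UNSUPPORTED_HOST_FIELDS = ("host", "host_profile")
ERROR_CODE = "HOST_NOT_SUPPORTED_ON_ENDPOINT"


def check_host_fields(body: Mapping[str, Any]) -> Tuple[int, Optional[dict]]:
    """Return (http_status, error_payload) for a parsed request body.

    Hypothetical helper: the payload layout is an assumption.
    """
    present = [name for name in UNSUPPORTED_HOST_FIELDS if name in body]
    if present:
        # Fail loudly so the caller learns the host intent was not honored.
        return 400, {"error": ERROR_CODE, "fields": present}
    return 200, None
```

With this guard, a body such as {"sequences": [], "host": "by2"} yields status 400, while a body without either field passes through unchanged.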
Added
- scripts/audit_public_surface.py and a CHANGELOG duplicate-[Unreleased]-header check now run on every push/PR in CI, instead of only when release.py --auto --audit-script is remembered.
- scripts/regen_manifest.py regenerates reproducibility/benchmark_v0.5.1/MANIFEST.json input hashes from committed git-blob content on demand.
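Hashing committed git-blob content rather than working-tree bytes can be sketched as follows. This is a minimal illustration, not the contents of scripts/regen_manifest.py: the function names are hypothetical, and committed_blob_sha256 assumes it runs inside a git checkout with a path given relative to the repository root.

```python
import hashlib
import subprocess


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def committed_blob_sha256(path: str, rev: str = "HEAD") -> str:
    """Hash the bytes that `git show <rev>:<path>` prints for a committed file.

    Hypothetical helper; requires git on PATH and a repository checkout.
    """
    blob = subprocess.run(
        ["git", "show", f"{rev}:{path}"],
        check=True,
        capture_output=True,
    ).stdout
    return sha256_hex(blob)


# Why working-tree bytes drift: identical JSON with LF vs. CRLF line endings
# produces different digests.
lf_bytes = b'{"seed": 320}\n'
crlf_bytes = b'{"seed": 320}\r\n'
hashes_differ = sha256_hex(lf_bytes) != sha256_hex(crlf_bytes)
```

Because a checkout on Windows may rewrite line endings in the working tree, reading the committed blob keeps the digest independent of the local end-of-line configuration.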
Changed
- Repositioned FactorForge as a claim-bounded pre-synthesis review harness across README.md, ROADMAP.md, docs/index.md, the new docs/roadmap.md, and web/index.html, separating the research-software journey from product roadmap themes without adding any new guarantees (expression, glycosylation, folding, yield, synthesis acceptance, regulatory approval).
v3.2.3 — FactorForge
Release v3.2.3
v3.2.2 — FactorForge
Fixed
- multi_constraint_pass definition corrected (scoring_contract v1.1): benchmarks/scoring.py score_cds() now defines multi_constraint_pass = biological_pass AND assembly_pass AND gc_in_target_range. The previous definition (biological_pass AND assembly_pass) omitted GC target compliance, producing inflated L3/L4 ablation values (89.0%/88.6%) that were mathematically inconsistent with their GC-in-range rates (3.7%/5.8%). Corrected values: L3=3.5%, L4=5.6%. All benchmark artifacts (benchmark_summary.json, ablation_summary.json, benchmark_v0.5.1 data/figures) regenerated from full rerun (seed=320, N=49,257). Zenodo benchmark_results.csv v2 (DOI: 10.5281/zenodo.20676276) supersedes v1. A canonical_multi_constraint_pass() helper added for recomputing from primitive columns in historical CSVs.
- Benchmark: source-profile codon-table injection now flows into both the design and scoring paths (design_table_sha256 == score_table_sha256 verified per run), fixing a prior gap where design always used the default table regardless of an injected profile (Job 130).
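The inconsistency in the old definition follows from a simple bound: the pass rate of a conjunction cannot exceed the pass rate of any one of its terms. The sketch below recomputes the flag from the three primitive columns named above on toy data. The function names are hypothetical and do not reproduce the signature of canonical_multi_constraint_pass(), which the notes do not show. The four rows are made-up illustration data, not benchmark results.

```python
from typing import Callable, List, Mapping

Row = Mapping[str, bool]


def old_definition(row: Row) -> bool:
    # Pre-v1.1 definition: GC target compliance was omitted.
    return row["biological_pass"] and row["assembly_pass"]


def corrected_definition(row: Row) -> bool:
    # scoring_contract v1.1: all three primitive flags must hold.
    return (
        row["biological_pass"]
        and row["assembly_pass"]
        and row["gc_in_target_range"]
    )


def pass_rate(rows: List[Row], predicate: Callable[[Row], bool]) -> float:
    """Fraction of rows for which the predicate is true."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


# Toy rows (illustrative only): every design passes the biological and
# assembly checks, but only one of four lands in the GC target range.
toy_rows = [
    {"biological_pass": True, "assembly_pass": True, "gc_in_target_range": True},
    {"biological_pass": True, "assembly_pass": True, "gc_in_target_range": False},
    {"biological_pass": True, "assembly_pass": True, "gc_in_target_range": False},
    {"biological_pass": True, "assembly_pass": True, "gc_in_target_range": False},
]

old_rate = pass_rate(toy_rows, old_definition)
new_rate = pass_rate(toy_rows, corrected_definition)
gc_rate = pass_rate(toy_rows, lambda row: row["gc_in_target_range"])
```

On the toy rows the old definition reports a rate of 1.0 while only 0.25 of the designs are in the GC range; the corrected definition reports 0.25 and can never exceed the GC-in-range rate.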
Added
- Data: added three genome-annotated N. benthamiana codon-usage profiles (SGN QLD183 v103 CDS-derived; SGN NbeV1.1 all-CDS-derived; SGN NbeV1.1 high-confidence-CDS-derived) built via scripts/build_codon_profile.py under strict_nuclear_cds_v1 filtering, alongside the existing packaged reference profile (Job 130).
Changed
- Web: Host System cards in web/index.html are now rendered dynamically from GET /api/optimize (supported_hosts + new host_metadata field) instead of being hardcoded, removing a 3-way duplication between the HTML, web/js/app.js, and the API (Job 133).
Docs
- Aligned public wet-lab validation contribution language with manual-review, public-safe submission rules.
- Clarified that public GitHub Issues must not contain raw sequences, confidential construct details, internal batch IDs, patient data, private contact information, exact process parameters, or confidential partner/customer data.
- Aligned public README, docs, web, citation, packaging, roadmap, and benchmark wording with the in-silico CDS design claim boundary.
- Removed maintainer-local file paths and internal repo references from
docs/release-checklist-template.md, replacing them with generic placeholders (Job 134).
v3.2.1 — FactorForge
Added
- Protein risk annotation layer for CDS sequences
- Transmembrane helix prediction (Kyte-Doolittle, window=19, threshold=1.6)
- Signal peptide heuristic (N-terminal 30 aa scan)
- Risk classification: HIGH / MEDIUM / LOW / UNKNOWN
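A Kyte-Doolittle sliding-window scan with the parameters listed above (window=19, threshold=1.6) can be sketched as below. The hydropathy values are the standard published Kyte-Doolittle scale. The function names, the use of a greater-than-or-equal comparison, and the handling of short sequences are assumptions, not FactorForge's implementation, and the mapping from this scan to the HIGH / MEDIUM / LOW / UNKNOWN classes is not shown because the notes do not specify it.

```python
from typing import List

# Standard Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(protein: str, window: int = 19) -> List[float]:
    """Mean hydropathy of every full-length window along the protein.

    Raises KeyError on residues outside the 20 standard amino acids.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


def has_candidate_tm_helix(
    protein: str, window: int = 19, threshold: float = 1.6
) -> bool:
    """True if any window's mean hydropathy reaches the threshold.

    Sequences shorter than the window yield no windows and return False.
    """
    return any(score >= threshold for score in hydropathy_windows(protein, window))
```

A stretch of 19 leucines averages 3.8 and is flagged, while a lysine-rich stretch averages -3.9 and is not.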
Fixed
- Correct CAI provenance annotation in benchmark output (Job 110)
- Correct Type IIS restriction site warning status (Job 110)
- Pin JSON files to LF line endings for Windows reproducibility
- Add pandas to dev test dependencies
- Manifest SHA-256 reproducibility drift on Windows (JSON/EOL normalization)
- wet-lab result GitHub template: add protein_class options (Reporter / Antigen / Cytokine) and validation consent checkbox
Changed
- Add Google Form as wet-lab submission channel alongside GitHub Issue and email
- Standardize wet-lab submission link labels across README, docs, and web app
Docs
- Add public-safe FactorForge agent guidance
- Expand release checklist with public surface audit steps
Chore
- Bump actions/configure-pages, codecov/codecov-action, actions/deploy-pages, actions/setup-python, softprops/action-gh-release (Dependabot)
v3.2.0 — FactorForge
Added
- MFE metadata fields: Design Package and API response now include mfe_used (bool), mfe_status (computed / not_computed), and mfe_warning (string when ViennaRNA unavailable). score_components added to expose per-term weights used in composite score calculation.
- Design Package schema v1.0.0: Formal IUPAC/FASTA I/O contracts and MFE null invariant established (090).
- Registry production constants export: DEFAULT_CAI_TARGET, DEFAULT_GC_LOW, DEFAULT_GC_HIGH importable as public production constants (091).
- Benchmark seed injection: --seed flag for deterministic reruns; most_frequent_codon tie-breaking deduplication (099).
- Codon table provenance disclosure: codon_table_manifest.json with sha256 pin, build_path_status: incomplete, and known limitations for nbenthamiana_codons.json (097).
Fixed
- Domestication Silence Fail: pipeline.py now raises ValueError when restriction-site domestication fails (previously returned the undomesticated sequence silently as success).
- Pipeline Output Validator: validate_cds_output() is now called in pipeline.py before final sequence return, catching AA identity violations and internal stops at the pipeline level.
- MFE not-computed value: mfe_kcal_mol is now null (not 0.0) when ViennaRNA is unavailable. Composite score is unchanged; this corrects misleading metadata only.
- Input validator: IUPAC ambiguous DNA/AA sequence misclassification corrected (098).
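The MFE null invariant can be illustrated with a small metadata builder. The field names (mfe_kcal_mol, mfe_used, mfe_status, mfe_warning) come from these notes. The function name build_mfe_metadata, the injected fold callable, and the warning text are hypothetical. The real pipeline presumably calls ViennaRNA directly, which this sketch deliberately avoids so that it runs without that dependency.

```python
import json
from typing import Callable, Optional


def build_mfe_metadata(
    sequence: str, fold: Optional[Callable[[str], float]] = None
) -> dict:
    """Build MFE metadata; `fold` stands in for an RNA folding backend.

    When no backend is available, mfe_kcal_mol is None (JSON null),
    never 0.0, so "not computed" cannot be mistaken for a real energy.
    """
    if fold is None:
        return {
            "mfe_kcal_mol": None,
            "mfe_used": False,
            "mfe_status": "not_computed",
            # Hypothetical wording for the warning string.
            "mfe_warning": "ViennaRNA unavailable; MFE not computed",
        }
    return {
        "mfe_kcal_mol": fold(sequence),
        "mfe_used": True,
        "mfe_status": "computed",
        "mfe_warning": None,
    }


not_computed_json = json.dumps(build_mfe_metadata("ATGGCT"))
```

Serializing the fallback result yields "mfe_kcal_mol": null, which a consumer can distinguish from a genuine 0.0 kcal/mol value.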
Documentation
- Stale constant corrections — 5 doc/comment locations corrected to match live code.
- Claim wording alignment — Public-facing API and CLI output wording unified; no expression-level or yield improvement claims (092).
- Formal benchmark — N. benthamiana SGN CDS (N=49,257, seed=320). All metrics are in-silico; no wet-lab validation claimed.
v3.1.9 — FactorForge
Documentation
- Internal housekeeping — project tracking references updated. No engine changes.
v3.1.8 — FactorForge
Breaking Changes
- gc_target profile default changed: calling gc_target without an explicit target_gc now produces sequences targeting ~60% GC (host midpoint) instead of the previous 42.5%. If you relied on the 42.5% default, pass target_gc=42.5 explicitly to preserve the old behavior.
Changed
- gc_target profile default: now targets the host-profile GC midpoint (60% for N. benthamiana) when target_gc is not supplied, instead of the legacy hardcoded 42.5%. To target a lower GC, pass target_gc explicitly. Output sequences from gc_target without an explicit target will differ.
- GC scoring: calculate_composite_score now scores GC using a band function (gc_band_score): full score inside [gc_min, gc_max], linear decay outside over gc_decay_width (default 20 pp). Replaces the previous 1 - |GC - GC_opt|/50 proximity formula, which under-discriminated GC quality.
- assembly_friendly scoring weights: changed from balanced-identical (0.5, 0.3, 0.2) to (0.3, 0.4, 0.3) (lower CAI pressure, higher GC/MFE weight) to align scoring with its Type IIS site-avoidance translation strategy.
- feasibility.py defaults: target_cai 0.92 → 0.82 (achievable; aligns with industry >0.8 practice); target_gc 41–44% → 55–65%; fallback GC ranges realigned to the 55–65% output distribution.
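The band function described above can be sketched directly from its description: full score inside the band, linear decay outside. This is a reading of the release note, not the library's code. The clamp at 0.0, the percent units, and the 55/65 default band (taken from the GC defaults mentioned in these notes) are assumptions. The legacy formula is included for comparison.

```python
def gc_band_score(
    gc: float,
    gc_min: float = 55.0,
    gc_max: float = 65.0,
    gc_decay_width: float = 20.0,
) -> float:
    """Score GC% with a flat top inside [gc_min, gc_max].

    Outside the band the score decays linearly to 0.0 over
    gc_decay_width percentage points (assumed to clamp at 0.0).
    """
    if gc_min <= gc <= gc_max:
        return 1.0
    distance = gc_min - gc if gc < gc_min else gc - gc_max
    return max(0.0, 1.0 - distance / gc_decay_width)


def legacy_proximity_score(gc: float, gc_opt: float) -> float:
    """Previous formula: 1 - |GC - GC_opt| / 50."""
    return 1.0 - abs(gc - gc_opt) / 50.0
```

Under these assumptions, 60% GC scores 1.0, 45% GC scores 0.5, and 30% GC scores 0.0. The legacy formula with GC_opt = 42.5 gives 0.65 for a 60% GC sequence, and its 50-point denominator makes the score fall off slowly, which matches the note's remark that it under-discriminated.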
Fixed
- Homopolymer thresholds documented: expression-stability (≥6 nt) and synthesis/manufacturing (≥8 nt) scans now use named constants (HOMOPOLYMER_EXPRESSION_WARN_NT, HOMOPOLYMER_SYNTHESIS_WARN_NT) and emit context / threshold_nt metadata so the two intentionally different thresholds are no longer mistaken for a bug.
- Misleading docs removed: gc_target no longer described as "42.5% (N. benthamiana optimal)"; 42.5% was a legacy assumption inconsistent with the 55–65% codon-table output.
- CLI docs corrected: docs/cli.md --gc-min / --gc-max defaults fixed from 40/55 to the actual 55/65.
Documentation
- docs/profiles.md: added missing assembly_friendly profile; corrected gc_target description.
- docs/tutorials/gfp-nbenthamiana.md: regenerated profile-comparison metrics under the new GC scoring and gc_target default.
v3.1.7 — FactorForge
Added
- Web UI host selector — expression host toggle (N. benthamiana / BY-2 Experimental) in the input panel. BY-2 selection disables Feasibility Best objective and shows experimental warning. Result panel displays active host profile.
- E2E smoke tests — 5 Playwright smoke tests covering UI load, protein input, BY-2 host toggle, Feasibility Best guard, and result rendering. Runs automatically after each deployment via
e2e.yml.
Documentation
- Eijex MCP access: added Eijex MCP as access option in README.md and docs/index.md
- API endpoints: added POST /api/optimize, /compare, /batch endpoints section to docs/cli.md
- MCP getting started: added Eijex MCP connection guide to docs/getting-started.md
- ml_enhanced profile: documented the ml_enhanced profile in docs/profiles.md
- AGENTS.md: specified the eijex-mcp synchronization item to follow when adding new API endpoints
v3.1.6 — FactorForge
Added
- SynCodonLM scoring dimension: optional 5th composite score component (w_syncodonlm, default 0.0). Integrates Boehringer-Ingelheim's BERT-based codon language model (SynCodonLM, NAR 2025; HuggingFace: jheuschkel/SynCodonLM-V2). Graceful fallback (score 0.5, WARNING) when transformers is not installed. No change to existing scoring behavior.
- ml_enhanced scoring profile: w_cai=0.35, w_gc=0.25, w_mfe=0.15, w_syncodonlm=0.25. Opt-in; existing four profiles unchanged.
- [ml] optional dependency group: pip install factorforge-cds[ml] installs transformers>=4.40 and torch>=2.0 for SynCodonLM inference.
- scoring_ml.py: SynCodonLMScorer class with lazy model loading; calculate_syncodonlm_score(sequence, organism).
- Profile comparison mode: factorforge optimize input.fasta --engine profile --compare-profiles balanced,high_cai,gc_target outputs a side-by-side CAI / GC% / score table. First profile result saved to --output when specified. POST /api/optimize/compare endpoint added with same functionality via JSON API.
- Tutorial (GFP N. benthamiana): end-to-end worked example at docs/tutorials/gfp-nbenthamiana.md. Covers CLI, Python API, profile comparison, and MoClo assembly preparation.
- Batch optimization API: POST /api/optimize/batch accepts up to 20 sequences in a single request. Returns per-sequence CAI, GC%, score, and optimized CDS. Auto-generates IDs (seq_1, seq_2, ...) when omitted. CLI multi-FASTA was already supported.
- Tobacco BY-2 host support (experimental): --host by2 CLI flag and "host": "by2" API field optimize for N. tabacum BY-2 suspension culture cells using a Kazusa-derived codon table (1,534 CDS, species 4097). Default host remains nbenthamiana. CAI difference between hosts is < 0.05. Experimental: uses N. tabacum codon usage as proxy; not wet-lab validated for BY-2 expression performance.
- Structure prediction links: AlphaFold DB and ESM Atlas fold links appear in the result panel after optimization. No API calls; links open external services with the input sequence.
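The batch endpoint's two stated rules, a 20-sequence limit and auto-generated IDs, can be sketched as a normalization step. The helper name normalize_batch and the item keys "id" and "sequence" are hypothetical. Numbering auto-generated IDs by 1-based position in the request is also an assumption, since the notes only give seq_1, seq_2, ... as examples.

```python
from typing import Any, Dict, List

MAX_BATCH_SIZE = 20  # limit stated in the release note


def normalize_batch(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Validate batch size and fill in missing IDs.

    Assumption: an item without an "id" gets seq_<position>, where
    position is its 1-based index in the request.
    """
    if len(items) > MAX_BATCH_SIZE:
        raise ValueError(
            f"batch accepts at most {MAX_BATCH_SIZE} sequences, got {len(items)}"
        )
    normalized = []
    for position, item in enumerate(items, start=1):
        entry = dict(item)  # copy so the caller's data is not mutated
        if not entry.get("id"):
            entry["id"] = f"seq_{position}"
        normalized.append(entry)
    return normalized
```

Under the positional assumption, a three-item request in which only the second item carries an ID ("gfp") comes back with the IDs seq_1, gfp, and seq_3.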