diff --git a/.codecov.yml b/.codecov.yml index d001c41da..af816b1a2 100644 --- a/.codecov.yml +++ b/.codecov.yml @@ -4,6 +4,11 @@ github_checks: annotations: true +ignore: + - 'src/easydiffraction/report/templates/html/vendor/**' + - 'src/easydiffraction/report/templates/tex/styles/**' + - 'src/easydiffraction/utils/_vendored/jupyter_dark_detect/**' + comment: layout: 'reach, diff, flags, files' behavior: default diff --git a/.codefactorignore b/.codefactorignore index 48d64e0bf..23d7fb1a1 100644 --- a/.codefactorignore +++ b/.codefactorignore @@ -1,11 +1,13 @@ # CodeFactor exclude patterns configuration must be done online: # https://www.codefactor.io/repository/github/easyscience/diffraction-lib/ignore # -# Last updated: 2026-01-05 +# Last updated: 2026-05-29 # # Exclude patterns: # deps/** # docs/** +# src/easydiffraction/report/templates/html/vendor/** +# src/easydiffraction/report/templates/tex/styles/** # src/easydiffraction/utils/_vendored/jupyter_dark_detect/** # tests/** # tmp/** diff --git a/.prettierignore b/.prettierignore index a08c3c48b..3891a9359 100644 --- a/.prettierignore +++ b/.prettierignore @@ -25,6 +25,11 @@ docs/docs/assets/ # Node node_modules +# Vendored snapshots +src/easydiffraction/report/templates/html/vendor/ +src/easydiffraction/report/templates/tex/styles/ +src/easydiffraction/utils/_vendored/jupyter_dark_detect/ + # Misc .benchmarks .cache diff --git a/THIRD_PARTY_LICENSES.md b/THIRD_PARTY_LICENSES.md new file mode 100644 index 000000000..ceb7fcf87 --- /dev/null +++ b/THIRD_PARTY_LICENSES.md @@ -0,0 +1,13 @@ +# Third-Party Licenses + +This file indexes third-party assets vendored in this repository. + +## Report LaTeX Styles + +The vendored report LaTeX style files are documented in +`src/easydiffraction/report/templates/tex/styles/LICENSES.md`. + +## Report HTML Assets + +The vendored report HTML assets are documented in +`src/easydiffraction/report/templates/html/vendor/LICENSES.md`. 
diff --git a/docs/dev/adrs/accepted/analysis-cif-fit-state.md b/docs/dev/adrs/accepted/analysis-cif-fit-state.md index b15b1ed52..9051c0ff4 100644 --- a/docs/dev/adrs/accepted/analysis-cif-fit-state.md +++ b/docs/dev/adrs/accepted/analysis-cif-fit-state.md @@ -25,6 +25,7 @@ Analysis-owned fit state needs to persist: - fit bounds and bound provenance - pre-fit scalar snapshots for recovery workflows - compact status metadata for the latest saved fit projection +- software-provenance snapshot for the latest successful fit - deterministic correlation summaries - minimizer-specific fit outputs on the paired `_fit_result.*` category - per-parameter posterior summaries on `_fit_parameter` @@ -104,6 +105,34 @@ round-trip project schema remains common. correlation summaries keyed by a persisted `id`. Only unique parameter pairs are stored. +### Software provenance + +`_software` stores the runtime software snapshot recorded after a +successful fit. It is part of the project save / load contract and feeds +report rendering plus the IUCr export software labels. It is not a user +configuration category. + +The category stores name, version, and URL triples for: + +- `framework` +- `calculator` +- `minimizer` + +Each role is persisted as scalar items on the same category: + +- `_name` +- `_version` +- `_url` + +The category also stores: + +- `timestamp` + +`timestamp` is an ISO-8601 UTC string for the fit that produced the +snapshot. Projects saved before this category existed load with all +software fields unset and `timestamp` set to `None`; rerunning a fit +populates the snapshot. + ### Minimizer fit projection The active `_minimizer.*` category stores user-selected solver inputs @@ -230,9 +259,10 @@ Load order is: 1. standard analysis configuration 2. `_minimizer.*` settings according to the active `_minimizer.type` -3. common and family-specific `_fit_result.*` fields on the paired class -4. `_fit_parameter` and `_fit_parameter_correlation` -5. 
posterior sidecar arrays when a Bayesian result is expected +3. `_software.*` provenance fields when present +4. common and family-specific `_fit_result.*` fields on the paired class +5. `_fit_parameter` and `_fit_parameter_correlation` +6. posterior sidecar arrays when a Bayesian result is expected Persist backend runtime objects, optimizer instances, and raw driver payloads nowhere in this design. diff --git a/docs/dev/adrs/accepted/iucr-cif-tag-alignment.md b/docs/dev/adrs/accepted/iucr-cif-tag-alignment.md index 119bb677e..3072bc81c 100644 --- a/docs/dev/adrs/accepted/iucr-cif-tag-alignment.md +++ b/docs/dev/adrs/accepted/iucr-cif-tag-alignment.md @@ -142,58 +142,61 @@ observation drives the policy: coefficient loop indexed by integer `power`), and pdCIF has no parametric peak-shape items at all. File path scopes them; no prefix needed. -- **Reports** — a separate write path, `project.save(report=True)`, that - pulls live Python state and emits a single journal-submission CIF to - `reports/.cif`. This path applies all IUCr renames, - structural reshapings, multi-datablock layout, and project-extension - namespacing (`_easydiffraction_*`). Lives under the new - `project.report` facade slot (replaces the unimplemented - `project.summary` placeholder). **Export only — no round-trip.** +- **Reports** — a separate `project.report` facade that pulls live + Python state and emits journal report artifacts under `reports/`. The + IUCr CIF one-off method is `project.report.save_cif()`; the regular + `project.save()` call emits configured reports from the + `project.report.{cif,html,tex,pdf}` booleans. This path applies all + IUCr renames, structural reshapings, multi-datablock layout, and + project-extension namespacing (`_easydiffraction_*`). It replaces the + unimplemented `project.summary` placeholder. **Export only — no + round-trip.** ## Current State Project CIF categories audited against `cif_core.dic` v3.4.0 and `cif_pow.dic` v2.5.0. 
The "Default-save tier" column shows whether the category changes in the default save; the "IUCr export" column shows the -dotted DDLm tag emitted under `project.save(report=True)`. - -| Category (current) | IUCr dictionary | Default-save tier | IUCr export (dotted DDLm) | -| ----------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `_cell.*` | core | Structure — unchanged | `_cell.length_a`, `_cell.angle_alpha`, etc. | -| `_atom_site.*` (most fields) | core | Structure — unchanged | `_atom_site.label`, `_atom_site.fract_x`, … | -| `_atom_site.adp_type` | core (`_atom_site.ADP_type`) | Structure — casing fix | `_atom_site.ADP_type` (uppercase ADP per dictionary). | -| `_atom_site.wyckoff_letter` | core (`_atom_site.Wyckoff_symbol`) | Structure — rename | `_atom_site.Wyckoff_symbol` (uppercase W, "symbol" not "letter"). | -| `_atom_site.B_iso_or_equiv` / `U_iso_or_equiv` | core | Structure — single-tag emit | `_atom_site.B_iso_or_equiv` xor `_atom_site.U_iso_or_equiv` per row, based on `_atom_site.ADP_type`. | -| `_atom_site_aniso.B_*` / `U_*` | core | Structure — single-tag emit | `_atom_site_aniso.B_*` xor `_atom_site_aniso.U_*` per row. | -| `_space_group.name_h_m` | core (`_space_group.name_H-M_alt`) | Structure — casing fix | `_space_group.name_H-M_alt`. | -| `_space_group.it_coordinate_system_code` | core (`_space_group.IT_coordinate_system_code`) | Structure — casing fix | `_space_group.IT_coordinate_system_code`. | -| symmetry operations | core (`_space_group_symop.*`) | (not emitted today) | `_space_group_symop.id` + `_space_group_symop.operation_xyz` loop alongside the H-M name. 
| -| `_diffrn.ambient_temperature`, `ambient_pressure` | core | Experiment — unchanged | `_diffrn.ambient_temperature`, `_diffrn.ambient_pressure`. | -| `_diffrn.ambient_magnetic_field`, `ambient_electric_field` | none | Experiment — unchanged | `_easydiffraction_diffrn.ambient_magnetic_field`, `…electric_field` (project extension). | -| `_refln.*` | core | (no default save under refln) | `_refln.*` reflections loop (column set differs by domain — see §2.3). | -| `_pd_meas.*`, `_pd_proc.*`, `_pd_calc.*`, `_pd_data.*` | pdCIF | Experiment — unchanged | `_pd_meas.*`, `_pd_proc.*`, `_pd_calc.*` profile-data loop (see §2.3). | -| `_pd_background.*` | pdCIF | Experiment — unchanged | `_pd_background.*`. | -| `_pd_phase_block.*` | pdCIF | Experiment — unchanged | `_pd_phase_block.*`. | -| `_sc_crystal_block.*` | community (no IUCr counterpart) | Experiment — unchanged | `_easydiffraction_sc_crystal_block.*` in IUCr export. | -| `_instr.wavelength` | core (`_diffrn_radiation_wavelength.value`) | Experiment — unchanged | `_diffrn_radiation_wavelength.{id, value, wt}` — single-row category for monochromatic; loop only for multi-λ. | -| `_instr.2theta_offset` | pdCIF (`_pd_calib.2theta_offset`) | Experiment — unchanged | `_pd_calib.2theta_offset`. | -| `_instr.2theta_bank`, `d_to_tof_*` | pdCIF (`_pd_calib_d_to_tof.*` loop) | Experiment — unchanged | Four-row loop `_pd_calib_d_to_tof.{id, coeff, power, coeff_su, diffractogram_id}`. | -| `_peak.*` (parametric profile shape) | none (pdCIF has no shape parameters) | Experiment — unchanged | `_easydiffraction_peak.*` + `_pd_proc_ls.profile_function` free-text descriptor. | -| `_extinction.*` | core (`_refine_ls.extinction_*` items) | Experiment — unchanged | `_easydiffraction_extinction.*` + dual emit `_refine_ls.extinction_{method,coef,expression}`. 
| -| `_excluded_region.*` | pdCIF (`_pd_proc.info_excluded_regions` free-text) | Experiment — unchanged | `_easydiffraction_excluded_region.*` + `_pd_proc.info_excluded_regions` free-text rendering. | -| `_expt_type.*` | none | Experiment — unchanged | `_easydiffraction_experiment_type.*`. | -| `_calculator.type`, `_minimizer.type` | none | Analysis — unchanged | Identification rolled into the `_easydiffraction_software.{framework, calculator, minimizer}` category; `_computing.structure_refinement` carries the same info as IUCr-standard free text. | -| `_minimizer.*` settings (tolerances, max_iter, …) | none | Analysis — unchanged | `_easydiffraction_minimizer.*` (settings only, separate from the identification triple). | -| `_fitting_mode.type`, `_background.type` | none | Analysis / Experiment — unchanged | `_easydiffraction_fitting_mode.type`, `_easydiffraction_background.type` selectors. | -| `_fit_result.reduced_chi_square`, `n_data_points`, `n_parameters` | core (`_refine_ls.*`) and pdCIF (`_pd_proc_ls.*`) | Analysis — unchanged (topology-neutral) | Shape-shifting per topology: see §1.2 and §3 transformers. | -| `_fit_result.*` (R-factors, counts, profile/background function) | core / pdCIF | Analysis — new fields under `_fit_result.*` | IUCr export remaps to per-topology `_refine_ls.*` / `_pd_proc_ls.*`; item names already match dictionary casing (§1.2). | -| `_fit_result.*` (Bayesian diagnostics, success, message, fitting_time, iterations, result_kind) | none | Analysis — unchanged | `_easydiffraction_fit_result.*`. | -| `_fit_parameter`, `_fit_parameter_correlation` | none / partial | Analysis — unchanged | `_easydiffraction_fit_parameter*` (no IUCr counterpart for per-parameter posterior). | -| `_alias`, `_constraint` | none | Analysis — unchanged | `_easydiffraction_alias*`, `_easydiffraction_constraint*`. | -| `_joint_fit`, `_sequential_fit*` | none | Analysis — unchanged | `_easydiffraction_joint_fit*`, `_easydiffraction_sequential_fit*`. 
| -| reflection-set aggregates | core (`_reflns.*`) | Analysis — new fields | `_reflns.number_total`, `_reflns.number_gt`, `_reflns.threshold_expression` (e.g. `'I>3\s(I)'`). | -| publication metadata | core (`_journal.*`, `_publ_author.*`, `_publ_contact_author.*`, `_audit.*`) | (not emitted today) | Emitted in `data_global` block per §2.3a with `?` placeholders. | -| analysis-stack identification | core (`_computing.structure_refinement`) | (not emitted today) | `_easydiffraction_software.{framework, calculator, minimizer}` triple + `_computing.structure_refinement` derived string in `data_global` (see §2.3a-i). | +dotted DDLm tag emitted by the IUCr CIF report writer. + +| Category (current) | IUCr dictionary | Default-save tier | IUCr export (dotted DDLm) | +| ----------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| `_cell.*` | core | Structure — unchanged | `_cell.length_a`, `_cell.angle_alpha`, etc. | +| `_atom_site.*` (most fields) | core | Structure — unchanged | `_atom_site.label`, `_atom_site.fract_x`, … | +| `_atom_site.adp_type` | core (`_atom_site.ADP_type`) | Structure — casing fix | `_atom_site.ADP_type` (uppercase ADP per dictionary). | +| `_atom_site.wyckoff_letter` | core (`_atom_site.Wyckoff_symbol`) | Structure — rename | `_atom_site.Wyckoff_symbol` (uppercase W, "symbol" not "letter"). | +| `_atom_site.B_iso_or_equiv` / `U_iso_or_equiv` | core | Structure — single-tag emit | `_atom_site.B_iso_or_equiv` xor `_atom_site.U_iso_or_equiv` per row, based on `_atom_site.ADP_type`. 
| +| `_atom_site_aniso.B_*` / `U_*` | core | Structure — single-tag emit | `_atom_site_aniso.B_*` xor `_atom_site_aniso.U_*` per row. | +| `_space_group.name_h_m` | core (`_space_group.name_H-M_alt`) | Structure — casing fix | `_space_group.name_H-M_alt`. | +| `_space_group.it_coordinate_system_code` | core (`_space_group.IT_coordinate_system_code`) | Structure — casing fix | `_space_group.IT_coordinate_system_code`. | +| symmetry operations | core (`_space_group_symop.*`) | (not emitted today) | `_space_group_symop.id` + `_space_group_symop.operation_xyz` loop alongside the H-M name. | +| `_diffrn.ambient_temperature`, `ambient_pressure` | core | Experiment — unchanged | `_diffrn.ambient_temperature`, `_diffrn.ambient_pressure`. | +| `_diffrn.ambient_magnetic_field`, `ambient_electric_field` | none | Experiment — unchanged | `_easydiffraction_diffrn.ambient_magnetic_field`, `…electric_field` (project extension). | +| `_refln.*` | core | (no default save under refln) | `_refln.*` reflections loop (column set differs by domain — see §2.3). | +| `_pd_meas.*`, `_pd_proc.*`, `_pd_calc.*`, `_pd_data.*` | pdCIF | Experiment — unchanged | `_pd_meas.*`, `_pd_proc.*`, `_pd_calc.*` profile-data loop (see §2.3). | +| `_pd_background.*` | pdCIF | Experiment — unchanged | `_pd_background.*`. | +| `_pd_phase_block.*` | pdCIF | Experiment — unchanged | `_pd_phase_block.*`. | +| `_sc_crystal_block.*` | community (no IUCr counterpart) | Experiment — unchanged | `_easydiffraction_sc_crystal_block.*` in IUCr export. | +| `_instr.wavelength` | core (`_diffrn_radiation_wavelength.value`) | Experiment — unchanged | `_diffrn_radiation_wavelength.{id, value, wt}` — single-row category for monochromatic; loop only for multi-λ. | +| `_instr.2theta_offset` | pdCIF (`_pd_calib.2theta_offset`) | Experiment — unchanged | `_pd_calib.2theta_offset`. 
| +| `_instr.2theta_bank`, `d_to_tof_*` | pdCIF (`_pd_calib_d_to_tof.*` loop) | Experiment — unchanged | Four-row loop `_pd_calib_d_to_tof.{id, coeff, power, coeff_su, diffractogram_id}`. | +| `_peak.*` (parametric profile shape) | none (pdCIF has no shape parameters) | Experiment — unchanged | `_easydiffraction_peak.*` + `_pd_proc_ls.profile_function` free-text descriptor. | +| `_extinction.*` | core (`_refine_ls.extinction_*` items) | Experiment — unchanged | `_easydiffraction_extinction.*` + dual emit `_refine_ls.extinction_{method,coef,expression}`. | +| `_excluded_region.*` | pdCIF (`_pd_proc.info_excluded_regions` free-text) | Experiment — unchanged | `_easydiffraction_excluded_region.*` + `_pd_proc.info_excluded_regions` free-text rendering. | +| `_expt_type.*` | none | Experiment — unchanged | `_easydiffraction_experiment_type.*`. | +| `_calculator.type`, `_minimizer.type` | none | Analysis — unchanged | Selection fields remain settings only; identity is read from `analysis.software` for `_easydiffraction_software.{framework, calculator, minimizer}` and `_computing.structure_refinement`. | +| `_software.*` | none | Analysis — new provenance category | Source for `_easydiffraction_software.{framework, calculator, minimizer}`, `_easydiffraction_software.fit_datetime`, and `_computing.structure_refinement` in `data_global`. | +| `_minimizer.*` settings (tolerances, max_iter, …) | none | Analysis — unchanged | `_easydiffraction_minimizer.*` (settings only, separate from the identification triple). | +| `_fitting_mode.type`, `_background.type` | none | Analysis / Experiment — unchanged | `_easydiffraction_fitting_mode.type`, `_easydiffraction_background.type` selectors. | +| `_fit_result.reduced_chi_square`, `n_data_points`, `n_parameters` | core (`_refine_ls.*`) and pdCIF (`_pd_proc_ls.*`) | Analysis — unchanged (topology-neutral) | Shape-shifting per topology: see §1.2 and §3 transformers. 
| +| `_fit_result.*` (R-factors, counts, profile/background function) | core / pdCIF | Analysis — new fields under `_fit_result.*` | IUCr export remaps to per-topology `_refine_ls.*` / `_pd_proc_ls.*`; item names already match dictionary casing (§1.2). | +| `_fit_result.*` (Bayesian diagnostics, success, message, fitting_time, iterations, result_kind) | none | Analysis — unchanged | `_easydiffraction_fit_result.*`. | +| `_fit_parameter`, `_fit_parameter_correlation` | none / partial | Analysis — unchanged | `_easydiffraction_fit_parameter*` (no IUCr counterpart for per-parameter posterior). | +| `_alias`, `_constraint` | none | Analysis — unchanged | `_easydiffraction_alias*`, `_easydiffraction_constraint*`. | +| `_joint_fit`, `_sequential_fit*` | none | Analysis — unchanged | `_easydiffraction_joint_fit*`, `_easydiffraction_sequential_fit*`. | +| reflection-set aggregates | core (`_reflns.*`) | Analysis — new fields | `_reflns.number_total`, `_reflns.number_gt`, `_reflns.threshold_expression` (e.g. `'I>3\s(I)'`). | +| publication metadata | core (`_journal.*`, `_publ_author.*`, `_publ_contact_author.*`, `_audit.*`) | (not emitted today) | Emitted in `data_global` block per §2.3a with `?` placeholders. | +| analysis-stack identification | core (`_computing.structure_refinement`) | Analysis — `_software.*` persisted | `_easydiffraction_software.{framework, calculator, minimizer}` triple + `_easydiffraction_software.fit_datetime` + `_computing.structure_refinement` derived from `analysis.software`. 
| ## Decision @@ -298,18 +301,19 @@ Specifically: #### 2.1 API ```python -project.save() # regular project save only -project.save(report=True) # regular save + reports/.cif -project.report.save() # write reports only, no regular save -project.report.check() # validate reports (see §2.5) +project.save() # project files + configured reports +project.report.save_cif() # one-off reports/.cif +project.report.save() # write configured reports only ``` `project.summary` (currently an unimplemented placeholder) is removed -and replaced by `project.report` — a new facade slot that owns the -journal-submission CIF generation and validation. The slot is named -generically because the same path can host additional report types in -the future (mmCIF export, figure bundles, etc.); the IUCr CIF is the -only kind shipped today. +and replaced by `project.report` — a facade slot that owns journal +report generation. The `project.report.{cif,html,tex,pdf}` booleans +control which reports `project.save()` emits. Per-format methods +(`save_cif()`, `save_html()`, `save_tex()`, `save_pdf()`) write one-off +artifacts without changing that configuration. The no-arg +`project.report.save()` uses those booleans and raises `ValueError` when +no formats are enabled. #### 2.2 Output location @@ -330,7 +334,7 @@ example files in the corpus). pd_xray.cif analysis/ analysis.cif - reports/ # written by save(report=True) + reports/ # written by report config or save_cif() .cif # single multi-block IUCr CIF ``` @@ -504,13 +508,15 @@ the project has source data, otherwise `?`. - `_audit.creation_method 'EasyDiffraction '`, `_audit.creation_date `. -- `_computing.structure_refinement` (single string concatenating the - framework + calculator + minimizer names and versions, e.g. +- `_computing.structure_refinement` (single string derived from + `analysis.software`; when calculator or minimizer provenance is unset + it falls back to the framework label only, e.g. 
`'EasyDiffraction 0.17.0 with lmfit 1.0.0 minimizer and cryspy 1.2.3 calculator'`). coreCIF standard channel for advertising the analysis-software stack to IUCr-aware tooling. - `_easydiffraction_software.*` triple holding the same three roles in - structured form (see §2.3a-i below). + structured form, plus `_easydiffraction_software.fit_datetime` when a + fit timestamp is available (see §2.3a-i below). - `_journal.*` placeholders, written as `?` when the project has no source data: `_journal.name_full`, `_journal.year`, `_journal.volume`, `_journal.issue`, `_journal.page_first`, `_journal.page_last`, @@ -540,13 +546,14 @@ similar) is deferred — see Deferred Work. #### 2.3a-i `_easydiffraction_software` framework The IUCr submission needs to identify the analysis stack. The project -emits one structured category in `data_global` carrying three role-keyed -strings: +emits one structured category in `data_global` from `analysis.software`, +carrying three role-keyed strings and an optional fit timestamp: ``` _easydiffraction_software.framework 'EasyDiffraction 0.17.0' _easydiffraction_software.calculator 'cryspy 1.2.3' _easydiffraction_software.minimizer 'lmfit 1.0.0' +_easydiffraction_software.fit_datetime 2026-05-26T13:45:00+00:00 ``` - `_easydiffraction_software.framework` — EasyDiffraction itself, the @@ -556,6 +563,9 @@ _easydiffraction_software.minimizer 'lmfit 1.0.0' - `_easydiffraction_software.minimizer` — the active minimizer (lmfit, scipy-lstsq, dfo-ls, emcee, …) with version. Bayesian sampler runs use the sampler name and version here. +- `_easydiffraction_software.fit_datetime` — ISO-8601 UTC timestamp of + the successful fit that populated `analysis.software`. Omitted when no + timestamp is recorded. 
The same three values are concatenated into the `_computing.structure_refinement` free-text string for IUCr-tooling @@ -630,12 +640,14 @@ _refln.index_k _refln.index_l _refln.F_squared_meas _refln.F_squared_calc -_refln.phase_calc +_pd_refln.phase_id _refln.d_spacing ``` Column set adapted from the corpus content (`bal5001.cif`, `hb8206.cif`) -with tag form taken from `cif_core.dic`. +with tag form taken from `cif_core.dic` and `cif_pow.dic`. The phase +identifier uses the powder dictionary's `_pd_refln.phase_id`; it is not +the calculated structure-factor phase angle `_refln.phase_calc`. #### 2.3e Powder profile-data loop @@ -775,44 +787,40 @@ The IUCr writer pass differs from the default writer: #### 2.5 Submission-side validation -`project.report.check()` runs the generated `reports/.cif` -through `gemmi` (already a project dependency per `pyproject.toml`) for -dictionary-compliance validation before submission. - -```python -project.report.check() # validate reports/.cif -project.save(report=True, check=True) # save + validate in one step -``` - -Validation checks performed by `gemmi`: - -- Every emitted tag exists in `cif_core.dic` or `cif_pow.dic` (the - shipped reference dictionaries, or fresh copies fetched on demand). - Unknown tags outside the project's `_easydiffraction_*` namespace - produce a warning. -- Value types match the dictionary's `_type.contents` declaration (Real, - Integer, Code, Text, …). -- Required category keys (`_category_key.name` per `_pd_calib_d_to_tof`, - `_atom_site`, etc.) are present in every loop row. -- Loop columns share the same parent category. -- DDLm dotted form is well-formed; underscore-form aliases resolve - correctly. - -Validation does **not** cover: - -- Crystallographic sanity checks (bond lengths, void volumes, density - plausibility, missed-symmetry detection, anisotropic-ADP - positive-definiteness). These need a full `checkCIF` implementation, - which `gemmi` does not provide. 
Treat `project.report.check()` as a - "spec compliance" pass, not a "scientific sanity" pass — a separate - IUCr-server upload remains the final check before submission. -- Verifying that `?` placeholders in `_journal.*` / `_publ_*` have been - filled in by the user (those are valid CIF; the project cannot decide - which are mandatory per journal). Flagged as a separate concern. - -The `_easydiffraction_*` project-extension namespace is excluded from -the unknown-tag warning by passing `gemmi`'s validator a prefix-skip -list. +**Superseded (2026-05-30): the runtime writer self-check described below +was removed.** The IUCr CIF writer no longer validates its own output +against `cif_core.dic` / `cif_pow.dic`; `reports/.cif` is +written directly. Rationale: + +- The report CIF is our own deterministic output. Checking it at write + time and raising `EasyDiffractionWriterError` ("…file a bug") turns a + developer-side test concern into a user-facing failure that blocks a + scientist's report over a defect only we can fix. +- The check resolved dictionaries from `tmp/iucr-dicts/` under the + repository root. That path never resolves for a pip-installed user, so + the self-check was a silent no-op for everyone except a developer who + had manually placed the dictionaries — where it only produced noise, + because the current COMCIFS DDLm/CIF2 dictionaries do not parse under + the helper's gemmi + regex approach. +- Spec compliance of the emitted tag set is maintained by authoring the + writer against the COMCIFS reference dictionaries (the dotted-tag set + is fixed in `iucr_writer.py`); a separate IUCr-server upload remains + the authoritative compliance check before submission. No part of the + library reads `tmp/iucr-dicts/` at runtime. 
+ +The original decision (retained for history): the writer ran generated +content through `gemmi` before writing, with public +`project.report.check()` / `check=True` entry points removed so that +dictionary compliance was an internal writer self-check rather than a +user choice. The intended gemmi checks were tag existence in +`cif_core.dic` / `cif_pow.dic` (unknown non-`_easydiffraction_*` tags +raising `EasyDiffractionWriterError`), value-type matching against +`_type.contents`, required category keys per loop row, single-category +loop columns, and well-formed DDLm dotted form. It never covered +crystallographic sanity checks (bond lengths, void volumes, density +plausibility, missed-symmetry detection, ADP positive-definiteness) or +whether `?` placeholders in `_journal.*` / `_publ_*` had been filled — +those remain a separate IUCr-server concern. ### 3. Handler mechanism — `iucr_name` + `IucrCategoryTransformer` @@ -976,10 +984,11 @@ Policy: recognisable to scientists familiar with `_refine_ls.*` / `_pd_proc_ls.*` from Rietveld publications; the IUCr export carries the matching dictionary-canonical category prefixes per topology. -- IUCr submission becomes a single command, with no manual editing - required: `project.save(report=True)` produces an upload-ready file at - `reports/.cif` matching the multi-datablock publication - convention. +- IUCr submission becomes a single explicit report command, with no + manual editing required: `project.report.save_cif()` produces an + upload-ready file at `reports/.cif` matching the + multi-datablock publication convention. Users who want CIF reports on + every project save can set `project.report.cif = True`. - Publication-metadata placeholders are emitted as `?` in `data_global` so users know where to fill in journal-required info before submission. @@ -1005,8 +1014,8 @@ Policy: `hb8169.cif` at 50K lines (DDL1 form; DDLm form would be of comparable size). - IUCr export is one-way. 
A user who hand-edits a file in `reports/` - loses those edits on the next `project.save(report=True)`. Documented - as such; treat `reports/` as generated output. + loses those edits on the next configured report save. Documented as + such; treat `reports/` as generated output. - Some external tooling chains (publCIF, journal in-house scripts) may still expect DDL1 underscore form. The dotted DDLm form is the dictionary spec; if real submissions surface a problem, a downstream @@ -1020,15 +1029,19 @@ Policy: function descriptors, reflns aggregates). `_fit_result.*` stays topology-neutral in `analysis/analysis.cif`; per-topology renaming to `_refine_ls.*` / `_pd_proc_ls.*` happens only in the IUCr export - (§1.2, §3 transformers). + (§1.2, §3 transformers). A later project-report amendment adds + `_software.*` as the persisted source for report software provenance. - [`minimizer-input-output-split.md`](minimizer-input-output-split.md) — `_fit_result.*` examples updated for the new fields. - [`project-facade-and-persistence.md`](project-facade-and-persistence.md) — `project.summary` facade slot is removed and replaced by - `project.report`. `summary.cif` is no longer written by default - `Project.save()`; the slot is repurposed for IUCr / journal report - generation in `reports/.cif` (see §2). The unimplemented - `summary_to_cif()` placeholder code path + `project.report`. The accepted `project.save(report=True)` flag is + superseded by report booleans for configured reports and + `project.report.save_cif()` for the IUCr CIF one-off path. + `summary.cif` is no longer written by default `Project.save()`; the + slot is repurposed for IUCr / journal report generation in + `reports/.cif` (see §2). 
The unimplemented `summary_to_cif()` + placeholder code path ([`project.py:464`](../../../../src/easydiffraction/project/project.py)) is removed as part of the implementation plan; no summary content survives the transition because nothing was being written there in the @@ -1038,14 +1051,21 @@ Policy: and replaced by `project.report.help()` (same responsibilities, new slot name). All other entries in the help-surface table are unaffected. +- [`project-summary-rendering.md`](project-summary-rendering.md) — + amends this ADR's report API: public `check()` / `check=True` are + removed, the `_easydiffraction_software.*` triple is read from + `analysis.software`, and `_easydiffraction_software.fit_datetime` is + added when fit provenance has a timestamp. (The write-path gemmi + validation this ADR introduced was later removed — see the §2.5 + amendment.) ## Open Questions (None blocking. Dictionary-side ambiguities have all been resolved -against `cif_core.dic` v3.4.0 / `cif_pow.dic` v2.5.0. The §2.5 gemmi -pass surfaces any remaining spec-compliance issue at generate-time, so -the ADR no longer relies on speculation about real-world tooling -behaviour.) +against `cif_core.dic` v3.4.0 / `cif_pow.dic` v2.5.0 while authoring the +writer. The runtime gemmi self-check originally described in §2.5 was +removed (see the §2.5 amendment); spec compliance now rests on authoring +discipline plus a final IUCr-server upload before submission.) ## Alternatives Considered @@ -1103,11 +1123,11 @@ it. ## Deferred Work - **Publication-metadata override hook.** A user-supplied - `reports/publ_info.json` (or `publ_info.toml`) read by - `project.save(report=True)` to replace the `?` placeholders in - `data_global` (`_journal.*`, `_publ_*`, `_publ_author.*` loop - entries). Out of scope for the first pass; revisit once the IUCr - export is shipping and users have feedback on workflow friction. 
+ `reports/publ_info.json` (or `publ_info.toml`) read by the IUCr report + writer to replace the `?` placeholders in `data_global` (`_journal.*`, + `_publ_*`, `_publ_author.*` loop entries). Out of scope for the first + pass; revisit once the IUCr export is shipping and users have feedback + on workflow friction. - **Crystallographic sanity validation.** The §2.5 validator covers spec compliance only. A future pass could integrate IUCr's web checkCIF (HTTP POST to the checkCIF endpoint) or bundle a local subset of its diff --git a/docs/dev/adrs/accepted/project-facade-and-persistence.md b/docs/dev/adrs/accepted/project-facade-and-persistence.md index 17bcdc1a9..6d16f7118 100644 --- a/docs/dev/adrs/accepted/project-facade-and-persistence.md +++ b/docs/dev/adrs/accepted/project-facade-and-persistence.md @@ -16,7 +16,8 @@ Persistence. `Project` is the top-level user facade. It owns project metadata, structures, experiments, rendering preferences, display helpers, -analysis, report helpers, verbosity, and save/load behavior. +analysis, report helpers, publication metadata, verbosity, and save/load +behavior. A later proposal considered renaming this facade to `Workspace` so that `project` could be reserved for the scientific project information @@ -50,9 +51,18 @@ through `project.report` and written only when requested, using `reports/.cif`; default project saves do not write `summary.cif`. -Expose submission-report helpers as `project.report`. The previous -`project.summary` placeholder and its `summary.cif` output are not part -of the persistence layout. +Expose submission-report helpers as `project.report`. This facade is a +hybrid surface: its scalar output configuration persists to +`project.cif` as `_report.*`, while its methods render report artifacts +under `reports/`. The previous `project.summary` placeholder and its +`summary.cif` output are not part of the persistence layout. + +Expose journal-submission metadata as `project.publication`. 
It is a +top-level owner with CIF-aligned sibling categories for `_journal.*`, +`_journal_date.*`, `_journal_coeditor.*`, `_publ_contact_author.*`, +`_publ_body.*`, and the `_publ_author.*` loop. These singleton +publication categories persist in `project.cif` and feed report exports; +`reports/.cif` remains export-only. Keep project information available as `project.info`. The Python name avoids a confusing `project.project` access path, while the persisted @@ -79,6 +89,11 @@ The saved project directory path is runtime file-I/O state, not a serialized project-information field. If the path is exposed in Python, it must not emit a `_project.path` CIF item. +The project-level singleton categories currently persisted in +`project.cif` are `_project.*`, `_chart.*`, `_report.*`, `_table.*`, +`_verbosity.*`, `_journal.*`, `_journal_date.*`, `_journal_coeditor.*`, +`_publ_contact_author.*`, `_publ_body.*`, and the `_publ_author.*` loop. + ## Consequences The saved layout mirrors the current object graph while preserving the diff --git a/docs/dev/adrs/accepted/project-summary-rendering.md b/docs/dev/adrs/accepted/project-summary-rendering.md new file mode 100644 index 000000000..8098b6d12 --- /dev/null +++ b/docs/dev/adrs/accepted/project-summary-rendering.md @@ -0,0 +1,2239 @@ +# ADR: Project Summary Rendering + +**Status:** Accepted +**Date:** 2026-05-26 + +Defines the **non-CIF** human-readable rendering surface for a project: +what the terminal/Jupyter summary, the auto-generated HTML report, the +on-demand journal-style LaTeX export, and (eventually) the GUI Summary +tab all consume and emit. + +Runs alongside, and **extends**, the accepted +[`iucr-cif-tag-alignment.md`](../accepted/iucr-cif-tag-alignment.md) ADR +(landed as PR #184). The alignment ADR established: + +- A new `project.report` facade slot (replaces the unimplemented + `project.summary` placeholder), with `save()` and `check()` methods. +- A single `reports/` directory at project root. 
+- A `project.save(report=True)` opt-in flag for the IUCr CIF. + +That ADR currently scopes `project.report` to **CIF only** — the +multi-datablock IUCr submission CIF written to `reports/.cif`. +This ADR keeps the facade and adds a **`project.report` configuration +category** with five scalar persisted fields (`cif`, `html`, `tex`, +`pdf`, `html_offline`) on `project.cif`, plus ad-hoc per-format methods +(`save_html()`, `save_cif()`, `save_tex()`, `save_pdf()`). The +Python-side API uses those same boolean descriptors directly, matching +the persisted CIF shape. The LaTeX writer hardcodes `iucrjournals` as +its document class — there is no style selector, no `_report.style` +field, no `style=` arg on `save_tex()` / `save_pdf()`. The accepted IUCr +`project.save(report=True)` flag is **removed**; reports come from the +config category, not from boolean flags. All four format booleans +default to `False` so `project.save()` writes nothing under `reports/` +until the user configures otherwise, preserving the "no surprise files" +property. + +Coordination points with the alignment ADR (no blocking conflicts; its +Open Questions section is empty): + +- **Software-stack identification** — the alignment ADR's §2.3a-i + defines `_easydiffraction_software.{framework, calculator, minimizer}` + as the structured CIF emission, plus a concatenated + `_computing.structure_refinement` free-text string. This ADR's §4 + adopts the same three-role triple as the Python-side attribute layout + so the same data flows into both write paths. +- **Spec-compliance validation** — the alignment ADR's §2.5 added + `project.report.check()` (gemmi-based) and a `check=True` flag on + `project.report.save()`. This ADR **removes the public surface** and + moves the gemmi pass to an internal pre-write step **inside the CIF + emission paths only** — `save_cif()` and the `cif` branch under + `project.save()` (§1.4). 
HTML, TeX, and PDF outputs are not + gemmi-validatable and get no pre-write validation; LaTeX errors + surface at PDF-compile time via the TeX engine. A writer that emits + non-compliant CIF raises `EasyDiffractionWriterError` instead. This + ADR's deferred `check_completeness()` (publication-side completeness) + is a separate concern that stays in Deferred Work. +- **Publication metadata source** — the alignment ADR's Deferred Work + proposes a user-supplied `reports/publ_info.{toml,json}` to replace + `?` placeholders. Both write paths read the same Python attribute, + **`project.publication`** — a new top-level on `Project`, sibling to + `project.info` and `project.analysis`. The schema is defined in §5 of + this ADR: six CIF-aligned sibling categories (`journal`, + `journal_date`, `journal_coeditor`, `contact_author`, `body`, + `authors`) with full IUCr-tag fidelity. The loader accepts TOML + (primary) and JSON (fallback); selection is by file extension. + +Also touches: + +- [`analysis-cif-fit-state.md`](../accepted/analysis-cif-fit-state.md) — + adds an `analysis.software` provenance category that serialises + through the analysis tier. +- [`minimizer-input-output-split.md`](../accepted/minimizer-input-output-split.md) + — the new provenance category lives alongside the existing + minimizer/fit-result pairing, not inside it. +- [`project-facade-and-persistence.md`](../accepted/project-facade-and-persistence.md) + — two changes: `project.report` gains a persisted configuration + category (`_report.*` in `project.cif`, see §1.3), turning the facade + into a hybrid of helper methods plus persisted config; and a new + top-level `project.publication` owner is added alongside the existing + `project.info`, `project.structures`, `project.experiments`, + `project.analysis`, `project.report` facade slots (see §5). 
+- [`python-cif-category-correspondence.md`](python-cif-category-correspondence.md) + — owns the Python↔CIF correspondence rule for **two** new + project-level singleton surfaces: `project.report.* ↔ _report.*` (five + scalar items, §1.3) and `project.publication.*` sibling categories ↔ + `_journal.*`, `_publ_author.*`, `_publ_contact_author.*`, etc. (§5). + +## Context + +The library today has four shapes of summary output: + +- `Report.show_report()` and friends — terminal/Jupyter rendering of + project metadata, crystallographic data per phase, experimental + configuration, and fit metrics + ([report.py](../../../../src/easydiffraction/report/report.py)). + (Pre-PR #184 this was `Summary.show_report()` on `project.summary`; + the IUCr alignment ADR replaced the unimplemented placeholder.) +- `summary.cif` — was written into the project root on every + `project.save()` as the literal string `"To be added..."` until PR + #184 removed both the writer call and the placeholder method. Not a + valid CIF block in any version that shipped. +- The old GUI's "Summary" tab — a single page listing project info, + crystal data, data collection, refinement engine + goodness-of-fit, + with an "Export summary" panel (Name, Format = HTML, Location). +- An eventual journal manuscript — currently produced by hand from the + scientist's notes and the values shown in the GUI Summary tab. + +These four are renderings of the **same** logical view. Every field the +GUI shows is already reachable from the live Python objects; the summary +is not a source of truth and has no field of its own that isn't +computable from `project`, its `structures`, its `experiments`, and +`analysis.fit_results`. The exception is software provenance — which +calculation engine and minimizer (with versions) produced the fit — +which the library does not currently capture anywhere. 
+ +Two pressures act on the design: + +- **GUI consistency.** The library and the GUI must show the same + numbers from the same data flow. The GUI Summary tab needs a + programmatic API, not a CIF or an HTML file to re-parse. +- **Submission-grade output.** Scientists publish in IUCr journals, + Phys. Rev. B, J. Appl. Cryst., and others. The CIF side of that is + covered by the alignment ADR's IUCr export. The **manuscript** side + (refinement tables formatted to journal style) is not. + +The default-save `summary.cif` placeholder was the visible artefact of +the unresolved design question. The alignment ADR has since replaced the +unimplemented `project.summary` slot with a `project.report` facade +scoped to IUCr CIF generation (`reports/.cif`). That resolves +the CIF half of the question but leaves the GUI Summary tab, the +terminal `show_report()`, the human-readable HTML, and the +manuscript-bound LaTeX/PDF without a definition. This ADR fills the gap +by extending the same `project.report` facade with non-CIF rendering +surfaces. + +## Scope + +In scope: + +- Extend the alignment ADR's `project.report` facade with + terminal/Jupyter, HTML, and LaTeX rendering surfaces, a configuration + category (five scalar fields — + `project.report.{cif, html, tex, pdf, html_offline}` — persisted in + `project.cif`), and ad-hoc per-format save methods. **All report + formats are opt-in via the configuration; every format defaults to + `False` so `project.save()` writes nothing under `reports/` until a + format is enabled** — see §1 and §2 for the rationale. +- Define the shared "summary data context" (one dictionary) that + terminal, HTML, LaTeX, and GUI renderers all consume. +- Add a Python-side software-provenance category on `analysis` + (`analysis.software`) recording calculation-engine and + minimization-engine name + version + URL stamped at fit time. 
+ Persisted in `analysis/analysis.cif` (amends the IUCr ADR's "Analysis + — unchanged" stance for these fields; see §4 and the ADRs-amended + list). +- Add a new top-level `project.publication` owner on `Project` (sibling + to `project.info`, `project.structures`, `project.experiments`, + `project.analysis`, `project.report`) carrying the `_publ_*` / + `_journal_*` publication metadata the IUCr writer otherwise emits as + `?` placeholders. See §5; amends `project-facade-and-persistence.md` + and complements `python-cif-category-correspondence.md`. +- Ship exactly one LaTeX style (`iucrjournals`) — no style selector, no + `ReportStyleEnum`, no `_report.style` field. Multi-style support + (REVTeX, Elsevier, etc.) is deferred to a follow-up ADR; see "Deferred + Work". + +Out of scope: + +- CIF tag-name decisions for any serialised field. Those are the + alignment ADR's job; this ADR notes recommended mappings and + cross-references. +- The IUCr CIF submission export tag policy and multi-datablock layout. + Covered by the alignment ADR; the output file lives at + `reports/.cif` and is opt-in via `project.report.cif = True`. +- Pre-existing project-level singleton categories (`_info.*`, + `_chart.*`, `_table.*`, `_verbosity.*`). Covered by the in-flight + [`python-cif-category-correspondence.md`](python-cif-category-correspondence.md). + This ADR **does** add one new project-level singleton category, + `_report.*`, alongside them (see §1.3 and the ADRs-amended list); that + surface is not delegated to the correspondence ADR. +- Static-image PDF generation. HTML prints from any browser; LaTeX + compiles to PDF locally. No bundled PDF writer. +- Markdown export. Trivial follow-on if the Jinja base templates are in + place, but no current user requirement. + +## Design Philosophy: Summary as a View + +Summary is a **derived view**, not a persisted artifact. 
The same data +dictionary feeds every renderer: + +``` +project, structures[], experiments[], analysis.fit_results, +analysis.software (new — see §4) + │ + ▼ + ReportDataContext (in-memory dict, single source of truth) + │ + ├──► terminal/Jupyter renderer (existing show_*; on demand) + ├──► HTML renderer (Jinja + Plotly + MathJax; opt-in via project.report.html) + ├──► LaTeX renderer (Jinja + pgfplots; opt-in via project.report.tex/pdf; iucrjournals style only) + └──► GUI Summary tab (programmatic; eventual) +``` + +Render targets never duplicate the data — they consume the same context. +New summary fields are added in one place (the context builder); every +renderer picks them up. + +## Decision + +### 1. Extend `project.report` with rendering methods and a config category + +The alignment ADR has already created the `project.report` facade +(replacing the unimplemented `project.summary` placeholder). This ADR +extends it along two axes: + +- A new **configuration category** on `project.report` — persisted in + `project.cif`, matching the existing `project.chart`, `project.table`, + `project.verbosity` config pattern — that records _which_ report + formats `project.save()` emits and _how_. +- A new set of **ad-hoc per-format methods** for explicit one-off writes + that bypass the configuration. + +The accepted IUCr `project.save(report=True)` flag and the public +`project.report.check()` method are both **removed** (see the +ADRs-amended list). Reports come from configuration, not boolean flags; +dictionary-spec validation runs internally before every CIF write (and +only before CIF writes — HTML, TeX, and PDF have no spec to validate +against; see §1.4) and surfaces as an error on writer bugs, not as a +user opt-in. 
+ +#### 1.1 Configuration category — `project.report.*` + +Persisted fields on `project.report`, populated by the user once and +read by `project.save()` thereafter: + +| Field | Type | Default | Effect | +| ----------------------------- | ------ | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `project.report.cif` | `bool` | `False` | When `True`, `project.save()` writes `reports/.cif`. | +| `project.report.html` | `bool` | `False` | When `True`, `project.save()` writes `reports/.html`. | +| `project.report.tex` | `bool` | `False` | When `True`, `project.save()` writes `reports/tex/{.tex, data/, styles/}`. | +| `project.report.pdf` | `bool` | `False` | When `True`, `project.save()` writes `reports/.pdf` (and `tex/` as a side-effect). | +| `project.report.html_offline` | `bool` | `False` | When `True`, the HTML report is **fully self-contained** — inline-bundles both Plotly and MathJax (~3 MB + ~1.5 MB on top of the otherwise-empty document). Otherwise both load from CDN. | + +Four per-format scalar booleans (`cif`, `html`, `tex`, `pdf`) plus +`html_offline` — **five fields total**, all single-row in CIF. Matches +the existing `project.chart`, `project.table`, `project.verbosity` +scalar-config shape verbatim. All booleans default to `False`, so an +unconfigured project produces no `reports/` directory at all. + +There is no `style` field. The LaTeX output ships exactly one class +(`iucrjournals`); adding another style is deferred work, not a v1 +selector. See §3 for the reasoning behind the single-style choice. + +There is no separate list-style `project.report.formats` property. The +Python API intentionally mirrors CIF and the other project-level +configuration categories: each persisted scalar descriptor is set +directly. 
+ +```python +import easydiffraction as ed + +project = ed.Project() +# … set up structures, experiments, run fit … + +# Configure once — persisted in project.cif (see §1.3 below). +project.report.cif = True +project.report.html = True +project.report.html_offline = False + +# Every subsequent save now emits the configured reports too. +project.save() +# → project.cif (with _report.* config block) +# → structures/<...>.cif, experiments/<...>.cif, analysis/analysis.cif +# → reports/.cif (because project.report.cif is True) +# → reports/.html (because project.report.html is True) +``` + +##### Enum backing per the closed-values ADR + +The set of report formats is a finite closed set, so per the accepted +[`enum-backed-closed-values.md`](../accepted/enum-backed-closed-values.md) +contract it is represented internally as `(str, Enum)`: + +```python +class ReportFormatEnum(str, Enum): + CIF = 'cif' + HTML = 'html' + TEX = 'tex' + PDF = 'pdf' +``` + +The four per-format booleans (`project.report.cif`, `.html`, `.tex`, +`.pdf`) are the public configuration API. Internal save dispatch may use +`ReportFormatEnum` members to keep the finite format set explicit, but +the enum is not a user-facing selector. + +There is no `ReportStyleEnum`. The LaTeX writer hardcodes `iucrjournals` +as its document class (see §3); when a future ADR adds a second style, +the `ReportStyleEnum` is reintroduced together with a new +`_report.style` config field. + +#### 1.2 Ad-hoc per-format methods + +Each format has its own explicit write method on the facade, independent +of the persisted report booleans. Use when a user wants to produce a +one-off artifact without changing the persistent configuration. 
+ +```python +project.report.save_cif() # writes reports/.cif +project.report.save_html(offline: bool = False) # writes reports/.html +project.report.save_tex() # writes reports/tex/{.tex, ...} +project.report.save_pdf() # writes reports/.pdf (compiles TeX too) + +# Convenience: write every report enabled by project.report booleans. +# Raises ValueError if no formats are configured (see below). +project.report.save() # reads config, no flags + +# Ad-hoc string returns: +project.report.as_html(offline: bool = False) -> str +project.report.as_tex() -> str + +# Shared data context (for GUI Summary tab + Jinja templates): +project.report.data_context() -> dict + +# Terminal / Jupyter renderers (existing methods, migrated to +# project.report by PR #184 — names preserved): +project.report.show_report() # full report — sections below +project.report.show_project_info() +project.report.show_crystallographic_data() +project.report.show_experimental_data() +project.report.show_fitting_details() +``` + +Per-format method signatures only carry the args that apply to that +format — `save_html(offline=True)` is unambiguous; there is no +`save_tex(style=...)` because the LaTeX writer ships exactly one style +(`iucrjournals`), so a style selector would be dead weight. The +cross-format mixing that the earlier flag-based draft had +(`html_offline` ignored when `html=False`, `style=` ignored without +`tex=True`) is gone. + +`project.save()` itself takes no report-related arguments. The accepted +IUCr `project.save(report=True)` flag is removed (see ADRs amended); +reports are configured on `project.report.*`. + +```python +project.save() # writes project files + enabled report booleans +``` + +`Summary.as_cif()` and `summary_to_cif()` were already deleted by the +alignment ADR; this ADR's removal of `project.save(report=True)` +finishes the flag-cleanup. 
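The configuration-driven dispatch described in §1.1 and §1.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `ReportConfig`, `enabled_formats()`, and `save_reports()` are hypothetical names, and the `writers` mapping stands in for the real per-format writers. Only `ReportFormatEnum` and the four boolean field names come from this ADR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class ReportFormatEnum(str, Enum):
    CIF = 'cif'
    HTML = 'html'
    TEX = 'tex'
    PDF = 'pdf'


@dataclass
class ReportConfig:
    """Hypothetical stand-in for the persisted project.report booleans."""

    cif: bool = False
    html: bool = False
    tex: bool = False
    pdf: bool = False

    def enabled_formats(self) -> List[ReportFormatEnum]:
        # Enum values double as attribute names, so one loop covers
        # every format in declaration order.
        return [fmt for fmt in ReportFormatEnum if getattr(self, fmt.value)]


def save_reports(
    config: ReportConfig,
    writers: Dict[ReportFormatEnum, Callable[[], None]],
    *,
    explicit: bool,
) -> List[ReportFormatEnum]:
    """Run the writer for each enabled format.

    ``explicit=True`` models an explicit report save, which treats an
    empty configuration as a user error; ``explicit=False`` models a
    project save, which simply writes no reports.
    """
    formats = config.enabled_formats()
    if explicit and not formats:
        raise ValueError('report save called with no formats enabled')
    for fmt in formats:
        writers[fmt]()
    return formats
```

Keeping the format set in one enum means a project save and an explicit report save can share the same loop and differ only in how they treat an empty configuration.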
+ +**Empty-configuration behaviour split.** + +The two entry points behave differently when no formats are enabled — a +deliberate split: `project.save()` writes the project regardless +(reports are a side-effect of configuration, not the point of the call); +`project.report.save()` is _only_ about reports, so calling it with +nothing configured is a user error. + +```python +# project.report.{cif,html,tex,pdf} == False (default — unconfigured) + +project.save() +# → writes project.cif + structures/ + experiments/ + analysis/ +# → reports/ is NOT created (no formats enabled — correct default +# behaviour, no error, no warning). + +project.report.save() +# → raises: +# ValueError( +# "project.report.save() called with no formats enabled. " +# "Set project.report.{cif,html,tex,pdf} = True, or call a per-format " +# "method directly (project.report.save_html(), etc.)." +# ) +``` + +The Python error applies only to the explicit report-save API: +`project.save()` keeps the no-report default because the user asked to +save the project, not the reports. CLI report generation happens through +project saves (`fit`), so it follows the persisted `_report.*` booleans +instead of offering a separate ad-hoc export command. + +The per-format methods (`save_cif()`, `save_html()`, etc.) never inspect +the persisted report booleans — they always write their format +unconditionally. They are explicit one-offs. + +#### 1.3 CIF persistence of the configuration + +The configuration category serialises to `project.cif` next to the other +project-level singleton categories (`_info.*`, `_chart.*`, `_table.*`, +`_verbosity.*`). The CIF tag prefix is `_report.*` — a Set category with +five scalar items, no loops: + +```text +data_ + +# ---- Project info (from project-facade-and-persistence ADR) ---- +_info.title 'Quartz at 300 K' +_info.description 'Refinement against XRD pattern xrd_300K.' 
+_info.created 2026-05-26T12:00:00 +_info.last_modified 2026-05-26T15:42:00 + +# ---- Chart / table / verbosity selectors (existing config) ---- +_chart.type plotly +_table.type plotly +_verbosity.fit short + +# ---- Report configuration (this ADR §1.3) ---- +_report.cif yes +_report.html yes +_report.tex no +_report.pdf no +_report.html_offline no +``` + +All five items are scalar DDLm dotted entries: the category is declared +`_definition.class Set` so a single value per item, no loops permitted. +Matches the existing `_chart.*`, `_table.*`, `_verbosity.*` category +shape exactly. The `yes`/`no` boolean encoding follows the project's +existing CIF boolean convention. + +The default unconfigured state writes four explicit `no` values for the +format booleans (not an absent or empty representation), so the "no +formats enabled" condition is always a concrete CIF value, never an +empty loop or missing block: + +```text +# Default (project.report.{cif,html,tex,pdf} = False): +_report.cif no +_report.html no +_report.tex no +_report.pdf no +_report.html_offline no +``` + +Load semantics are symmetric: every `no` reads back as `False` on its +descriptor, so a default project loads with no report formats enabled. + +The four per-format booleans give the IUCr-aware tooling (`gemmi`, +`publCIF`) a typed, validatable view of the configuration: each format +is a known item of type `Boolean`, not a parsed string. There is +no additional Python list view with separate storage; the booleans are +the source of truth. + +Adding a new format in the future (e.g. `markdown`) is a one-line schema +extension: add `_report.markdown` to the dictionary and a +`project.report.markdown` boolean to the descriptor, then include it in +the internal save dispatch. + +Loading a `project.cif` populates `project.report.*` per the +project-facade-and-persistence contract; on the next `project.save()`, +the configured formats emit automatically with no further user action.
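The symmetric `yes`/`no` round-trip for the five `_report.*` items can be sketched as below. This is an illustration only: `report_config_to_cif()` and `report_config_from_cif()` are hypothetical helpers, and the hand-rolled line parser is an assumption made to keep the example self-contained. The library presumably routes this through its existing CIF machinery.

```python
from typing import Dict

# The five scalar items of the _report.* Set category (§1.3).
REPORT_ITEMS = ('cif', 'html', 'tex', 'pdf', 'html_offline')


def report_config_to_cif(config: Dict[str, bool]) -> str:
    """Serialise the report booleans; unset items are written as 'no'."""
    lines = []
    for item in REPORT_ITEMS:
        value = 'yes' if config.get(item, False) else 'no'
        lines.append(f'_report.{item} {value}')
    return '\n'.join(lines)


def report_config_from_cif(text: str) -> Dict[str, bool]:
    """Parse `_report.<item> yes|no` lines back into booleans."""
    config = {item: False for item in REPORT_ITEMS}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0].startswith('_report.'):
            continue  # not a scalar _report.* entry
        item = parts[0][len('_report.'):]
        if item not in config:
            continue  # unknown item: ignored in this sketch
        if parts[1] not in ('yes', 'no'):
            raise ValueError(f'invalid boolean for {parts[0]}: {parts[1]!r}')
        config[item] = parts[1] == 'yes'
    return config
```

Because every item is always written explicitly, the default state serialises to five concrete `no` values rather than an absent block.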
+ +##### Why not its own CIF file? + +The project already has two distinct facade patterns for top-level +`project.*` slots, used deliberately for different purposes. +`project.report` is **Pattern A** — lightweight project-level singleton +config — not Pattern B — heavy datablock owner with its own CIF file. +The split is summarised below. + +| Slot | Pattern | CIF location | Python shape | +| ------------------------------------ | ------- | ---------------------------------------- | ---------------------------------------------------- | +| `project.info` | A | `project.cif` (`_info.*`) | small `CategoryItem` | +| `project.chart` | A | `project.cif` (`_chart.*`) | `CategoryItem` (one field) | +| `project.table` | A | `project.cif` (`_table.*`) | `CategoryItem` (one field) | +| `project.verbosity` | A | `project.cif` (`_verbosity.*`) | `CategoryItem` (one field) | +| **`project.report`** (this ADR) | **A** | **`project.cif` (`_report.*`)** | **`CategoryItem` (five fields) plus action methods** | +| `project.publication` (this ADR, §5) | A | `project.cif` (`_publ_*` / `_journal_*`) | `CategoryOwner` of six sibling categories | +| `project.analysis` | B | `analysis/analysis.cif` | `CategoryOwner` (heavy datablock) | +| `project.structures[name]` | B | `structures/.cif` | `CategoryOwner` (heavy datablock) | +| `project.experiments[name]` | B | `experiments/.cif` | `CategoryOwner` (heavy datablock) | + +Reasons `project.report` is Pattern A, not Pattern B: + +- Five scalar config items do not justify a separate file + (`reports/report.cif` would be a tiny file holding five lines). +- A `reports/report.cif` would force the `reports/` directory to exist + even when every format boolean is `False` and no reports are written — + breaks the "no surprise files" property the design is built around. 
+- Splits report configuration from chart / table / verbosity + configuration, which already share `project.cif` for the same reason — + they are all project-level preferences, not domain data. + +What makes `project.report` look heavier than `project.chart` / +`project.table` / `project.verbosity` is the action methods on the +facade (`save_cif()`, `save_html()`, `show_report()`, `data_context()`, +etc.). Those live on the Python class alongside the configuration +fields, which is the facade-hybrid amendment to +`project-facade-and-persistence.md` already recorded in the ADRs-amended +list. The action methods do not change where the configuration persists +— that stays in `project.cif`. + +#### 1.4 Validation moves internal — CIF only, writer-correctness only + +The accepted IUCr ADR §2.5 exposed `project.report.check()` and a +`check=True` flag for gemmi-based dictionary-spec validation. Both are +**removed** in favour of a pre-write self-check inside the CIF emission +paths only: + +```text +project.save() + └─ for each enabled format: + ├─ build the format content + ├─ if format ∈ {cif}: run gemmi against the content + │ └─ on failure → raise EasyDiffractionWriterError + │ pointing at the malformed tag, + │ with a "please file a bug" hint. + ├─ if format ∈ {html, tex, pdf}: no pre-write validation + │ — see scope below. 
+ └─ atomically write the file +``` + +**Scope split — what is validated and how:** + +| Output | Validation | Failure mode | +| ------------------------ | -------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | +| `reports/.cif` | none at write time | n/a — the report renders the data context; an IUCr-server upload is the spec-compliance check before submission | +| `reports/.html` | none at write time | n/a — HTML is a render of the data context, not a typed format | +| `reports/tex/` | none at write time | n/a — LaTeX errors surface at PDF-compile time, with the engine's message | +| `reports/.pdf` | TeX engine's own compilation (returns non-zero on error) | engine-specific message; the `.tex` and `data/` CSVs are still written | + +The report CIF is the writer's own deterministic output, so it is +written without a runtime dictionary self-check, and nothing in the +library reads `tmp/iucr-dicts/`. Spec compliance of the emitted tag set +is maintained by authoring the writer against the COMCIFS reference +dictionaries, with a final IUCr-server upload as the authoritative check +before submission. (This supersedes the earlier write-time gemmi +validation; see the §2.5 amendment in +[`iucr-cif-tag-alignment.md`](iucr-cif-tag-alignment.md).) + +User-input validation (e.g., "is the email address syntactically +valid?", "is the ORCID well-formed?") happens **upstream** at the +descriptor's `value_spec` validator — the same boundary where every +other user input is checked. That's a separate concern from the writer +self-check above: descriptor validators raise `typeguard.TypeCheckError` +or the project's `ValidationError` at _assignment time_, before any save +is attempted. 
By the time the writer runs, the values it receives are +already shape-correct; the gemmi pass on the CIF output catches _writer_ +bugs (wrong tag, wrong type, malformed loop), not user bugs. + +Rationale: dictionary compliance is a _writer-correctness_ property, not +a user choice. A user can't fix a non-compliant emission without +modifying project state — and even then, the writer should refuse to +emit a malformed file in the first place. Making validation a +user-visible API surface invites users to skip it; making it internal +makes it impossible to skip. Cached dictionary parsing keeps the +overhead to a one-time ~200 ms session cost. The +`EasyDiffractionWriterError` includes the full gemmi diagnostic so bug +reports are actionable. + +A separate, _completeness_-oriented check +(`project.report.check_completeness()`) — flagging unfilled `_publ_*` / +`_journal_*` placeholders for journal submission, which is a +publication-readiness question rather than a writer-correctness one — is +a different concern and stays in Deferred Work. + +#### 1.5 Descriptor display metadata — `DisplayHandler` + +Parameter names like `u_iso` and unit strings like `Ų` need prettier +representations for the HTML and PDF renderers. The ADR introduces a new +optional handler on the descriptor base classes (`Parameter`, +`NumericDescriptor`, `StringDescriptor`) that carries the typeset +variants in a single place, sibling to the existing `cif_handler`: + +```python +from dataclasses import dataclass + +@dataclass(frozen=True, slots=True) +class DisplayHandler: + """Pretty-printing metadata for descriptors. + + All four fields are optional strings. Renderers fall back + to the descriptor's plain ``name`` / ``units`` when a + field is unset; missing fields never raise. 
+ """ + display_name: str | None = None # HTML / GUI / show() label + display_units: str | None = None # HTML / GUI / show() unit string + latex_name: str | None = None # LaTeX inline-math label + latex_units: str | None = None # LaTeX text/math unit string +``` + +`DisplayHandler` lives at `src/easydiffraction/core/display_handler.py` +alongside the existing `CifHandler` in +`src/easydiffraction/io/cif/handler.py` — a frozen dataclass per the +project's value-object convention (matches `TypeInfo`, `Compatibility`, +`CalculatorSupport` per AGENTS.md). `slots=True` keeps memory overhead +constant per attached descriptor. + +The plain `name` and `units` fields keep their existing role on the +descriptor, but their **content convention changes**: + +- `name` — Python identifier; ASCII snake_case; unchanged. +- `units` — **ASCII only**, following the CIF DDLm `_units.code` + vocabulary from + [`cif_core.dic`](../../../../tmp/iucr-dicts/cif_core.dic) **verbatim** + when the dictionary defines a value for the unit. The dictionary's + vocabulary is a single source of truth, but it is **not** uniformly + plural — singular and plural forms appear mixed across units (each + unit is whatever the dictionary actually says). 
Verified codes from +`cif_core.dic`: + + | What we need | `_units.code` value | Source line in cif_core.dic | + | --------------------- | ---------------------------------------- | ----------------------------- | + | Å (length) | `angstroms` (plural) | line 1213 | + | Ų (area) | `angstrom_squared` (singular `angstrom`) | line 1178 | + | ° (angle) | `degrees` (plural) | line 500, 519, 1247, … | + | K (temperature) | `kelvins` (plural) | line 210, 232, 287, 316 | + | kPa (pressure) | `kilopascals` (plural) | line 115, 136, 161, 184 | + | µs (time) | `microseconds` (plural) | (from `cif_pow.dic` TOF text) | + | Da (mass) | `dalton` (singular) | line 753 | + | MGy (dose) | `megagray` (singular) | line 592, 607 | + | Å⁻¹ (reciprocal) | `reciprocal_angstroms` | line 795, 825 | + | Å⁻² (reciprocal area) | `reciprocal_angstrom_squared` | line 1552, 1587 | + | dimensionless | `none` | line 459, 480, … | + +- **Units the dictionary does not define.** The crystallographic + vocabulary includes a handful of compound units that `cif_core.dic` + does not assign a `_units.code` to; the one example currently in + scope is `deg²` (squared degrees, used for some angular variance + metrics). Convention for these: extend the same naming pattern + (`degrees_squared`) as a **project-internal code** with no + `_units.code` round-trip. The implementation plan keeps a small + `units_vocabulary.py` module listing every code (dictionary and + project-internal) so a sweep can validate every `units=` string at + descriptor-declaration time. + +The Unicode-symbol form (`Ų`) moves into `display_units`; the LaTeX +form (`\AA$^2$`) into `latex_units`.
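A minimal sketch of what the `units_vocabulary.py` check could look like is given below. The module layout and the names `DICTIONARY_UNIT_CODES`, `PROJECT_INTERNAL_UNIT_CODES`, `validate_units()`, and `has_cif_round_trip()` are assumptions for illustration; only the unit codes themselves are taken from the table and the convention above.

```python
# Codes the ADR lists as verified against the CIF dictionaries.
DICTIONARY_UNIT_CODES = frozenset({
    'angstroms',
    'angstrom_squared',
    'degrees',
    'kelvins',
    'kilopascals',
    'microseconds',
    'dalton',
    'megagray',
    'reciprocal_angstroms',
    'reciprocal_angstrom_squared',
    'none',
})

# Project-internal codes: same naming pattern, no `_units.code` round-trip.
PROJECT_INTERNAL_UNIT_CODES = frozenset({'degrees_squared'})

ALL_UNIT_CODES = DICTIONARY_UNIT_CODES | PROJECT_INTERNAL_UNIT_CODES


def validate_units(units: str) -> str:
    """Return ``units`` unchanged if it is a known ASCII code."""
    if not units.isascii():
        # Unicode symbols such as 'Ų' belong in display_units instead.
        raise ValueError(f'units must be ASCII, got {units!r}')
    if units not in ALL_UNIT_CODES:
        raise ValueError(f'unknown units code: {units!r}')
    return units


def has_cif_round_trip(units: str) -> bool:
    """True only for codes defined by the CIF dictionaries."""
    return units in DICTIONARY_UNIT_CODES
```

Calling `validate_units()` where a descriptor is declared would surface a stray Unicode unit or a misspelt code at import time rather than at render time.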
+ +##### Worked example — `u_iso` + +```python +self._u_iso = Parameter( + name='u_iso', + description='Isotropic atomic displacement parameter', + units='angstrom_squared', + value_spec=AttributeSpec(default=0.0, validator=RangeValidator(ge=0.0)), + cif_handler=CifHandler(names=['_atom_site.U_iso_or_equiv']), + display_handler=DisplayHandler( + display_name='Uiso', + display_units='Ų', + latex_name=r'$U_{\mathrm{iso}}$', + latex_units=r'\AA$^2$', + ), +) +``` + +| Renderer / context | Name uses | Units uses | +| ---------------------------------------- | --------------------------- | ---------------------------- | +| LaTeX (`save_tex`) | `$U_{\mathrm{iso}}$` | `\AA$^2$` | +| HTML (`save_html`, MathJax-rendered) | `$U_{\mathrm{iso}}$` | `\AA$^2$` | +| HTML pre-MathJax / GUI / `show_report()` | `Uiso` | `Ų` | +| `project.report.data_context()` raw dict | both available | both available | +| CIF emission | `_atom_site.U_iso_or_equiv` | (no `_units.code` row today) | +| Python code / repr | `u_iso` | `angstrom_squared` | + +##### Resolution rules + +The renderers consult the `DisplayHandler` (if attached) using a +per-context fallback chain: + +- **LaTeX context** (`save_tex`, `save_pdf`, `as_tex`): + `handler.latex_name or descriptor.name`, + `handler.latex_units or descriptor.units`. +- **HTML context** (`save_html`, `as_html`): + `handler.display_name or descriptor.name`, + `handler.display_units or descriptor.units`. The HTML template + additionally surrounds `handler.latex_name` / `handler.latex_units` + with `\(...\)` math delimiters so MathJax picks them up where the + descriptor has typeset variants — i.e., HTML can show the same + `$U_{\mathrm{iso}}$` the PDF shows, while a GUI tooltip or + `show_report()` printout falls back to `display_*`. +- **GUI / terminal / `show_*()` context**: + `handler.display_name or descriptor.name`, + `handler.display_units or descriptor.units`. 
+ +Each chain falls through to the descriptor's plain fields, so +**descriptors without a `display_handler` continue to work unchanged** — +they simply render as `u_iso` / `angstrom_squared` in all contexts. +Adding a `display_handler` is opt-in per descriptor. + +**Table-rendering paths MUST read through the resolution chain above, +not the plain `descriptor.units` field directly.** This is a strict +requirement because `units=` now holds ASCII CIF DDLm codes +(`'angstrom_squared'`) that would look ridiculous as a column header. +Concretely the following call sites migrate in the implementation sweep: + +- Every `show_*()` method on `Report` (terminal / Jupyter table + builders) — the unit column or row header is built from + `display_units or units`, not `units` alone. +- Every Jinja macro in `templates/base.j2` that formats a parameter row + — same resolution rule. +- The HTML template (`templates/html/report.html.j2`) uses + `display_units` for non-math contexts and the latex_units variant + inside `\(...\)` math delimiters where the descriptor declares both. +- The LaTeX template (`templates/tex/report.tex.j2`) uses `latex_units` + (falling through `display_units` then `units` if not declared). +- The shared `data_context()` (§6) builder exposes both rendered strings + per parameter so neither template has to re-derive the fallback chain + — the resolution happens once in the builder. + +External / third-party readers that hard-coded `parameter.units` to +compare against `'Ų'` (the prior Unicode form) are flagged in the Open +Questions section for a project-wide audit before the sweep lands. + +##### Why a handler class instead of four kwargs + +Two design pressures: + +- The fields cluster — they are all "how to display this parameter" — so + a single handler keeps the descriptor constructor flat. + `display_handler=DisplayHandler(latex_name=..., display_name=...)` + reads cleaner than four sibling kwargs. 
+- Future display targets (Markdown export, GUI tooltips, an + ASCII-fallback for terminal narrow-mode) can add fields to + `DisplayHandler` without growing the descriptor constructor signature. + +The mechanism mirrors the existing `cif_handler=CifHandler(...)` +pattern, so anyone reading the descriptor declarations sees the same +shape for CIF metadata and display metadata. + +##### Migration sweep + +Existing descriptors use `units='Å'` / `'Ų'` / `'°'` etc. (Unicode +short forms). The implementation plan owns the sweep that: + +- Converts every existing `units=` Unicode string to the ASCII CIF DDLm + form (`'Ų'` → `'angstrom_squared'`). +- Adds `display_handler=DisplayHandler(...)` to descriptors the + renderers benefit from prettifying (atom-site positions / ADPs, cell + parameters, fit-result R-factors, refinement statistics, peak + parameters, …). Descriptors the renderers don't show (CIF-only + internal state) get no handler — the fallback to `name`/`units` is + fine. +- Verifies the `_chart`, `_table`, `_verbosity` enum values and other + singleton-config CIF strings don't accidentally collide with the new + units vocabulary (they shouldn't — those are tag values, not unit + codes). + +The sweep is a Phase 1 step in the implementation plan, not an ADR-level +decision. + +### 2. HTML report — config-driven via `project.report.html` + +`project.report.html = True` causes `project.save()` to write +`reports/.html`. The all-`False` default keeps `reports/` from +being touched at all on plain `project.save()`. For one-off HTML without +changing the persistent config, call `project.report.save_html()` +directly. + +```python +# Persistent — every subsequent save writes the HTML report. +project.report.html = True +project.report.html_offline = False # CDN-Plotly (default) +project.save() # → reports/.html + +# Persistent + air-gapped readers — fully self-contained: +# inline Plotly AND inline MathJax. 
+project.report.html_offline = True +project.save() # → reports/.html (~4.5 MB) + +# One-off, ignoring config. +project.report.save_html() # CDN: Plotly + MathJax both from CDN +project.report.save_html(offline=True) # inline: Plotly + MathJax both inlined +``` + +Asset-bundling modes — `html_offline` controls **both** assets together +(single switch, single contract): + +- **CDN mode (default)** — Plotly via `include_plotlyjs='cdn'` (~50-300 + KB on top of the otherwise-empty document, depending on chart count); + MathJax from `https://cdn.jsdelivr.net/npm/mathjax@3/...` via + ``. + - `False`: + ``. + - When `html_offline=True`, the renderer copies the vendored file next + to the emitted `.html` so the relative + `