feat(thunk): add .parsed property to ComputedModelOutputThunk for structured output by planetf1 · Pull Request #1282 · generative-computing/mellea

planetf1 · 2026-06-17T08:54:06Z

Why

When format= is passed to act(), the model returns JSON and .value holds the raw string — not a Pydantic instance. The natural workaround is cast(MyModel, result.value), which satisfies the type checker but raises AttributeError at runtime. Because that error is often swallowed by a broad except handler, callers fall back silently on every invocation rather than surfacing a clear failure.

@generative avoids this, but it requires a fixed function signature. Dynamically-built prompts in a loop — a common pattern for batch classification — cannot use it.

Summary

Adds ComputedModelOutputThunk.parsed — validates the raw JSON string via model_validate_json and returns the Pydantic instance typed as S | None. Returns None when no format= type was passed.
All five built-in backends store the format type on the thunk so parsed can use it.
Docstring notes on value and act() point callers at .parsed.
Unit tests covering the happy path, None fallback, invalid JSON, value unchanged, copy/deepcopy _format preservation, and end-to-end through Ollama and HuggingFace backends.

Before / After

# Before
result = m.act(Instruction("Say yes or no"), format=Result)
classification = cast(Result, result.value)
print(classification.label)  # AttributeError: 'str' object has no attribute 'label'

# After
result = m.act(Instruction("Say yes or no"), format=Result)
print(result.parsed.label)   # works

Compatibility

.value is unchanged — existing callers are unaffected.
.parsed is a new property; nothing relies on it today.
Custom backends that subclass Backend without setting mot._format will return None from .parsed even when format= is passed. The .parsed docstring has a Note: calling this out explicitly for backend authors. The built-in backends are all updated.

What is not in this PR

.parsed is now typed S | None, so a ComputedModelOutputThunk[MyModel] exposes .parsed as MyModel | None with no cast at the call site. The remaining gap is that m.act(format=MyModel) does not yet infer ComputedModelOutputThunk[MyModel] — the call site still resolves to ComputedModelOutputThunk[Any]. That binding is tracked in #1274.

Test plan

uv run pytest test/core/test_base.py -k parsed
uv run pytest test/core/ -m "not qualitative"
uv run pytest test/backends/test_ollama.py -k parsed -m qualitative (requires Ollama)
uv run pytest test/backends/test_huggingface.py -k parsed -m qualitative (requires GPU)

…uctured output When `format=` is passed to `act()`/`instruct()`, the model returns a JSON string and `.value` has always held that raw JSON — not a Pydantic instance. Accessing `.label` (etc.) on `.value` silently raises `AttributeError` at runtime while pyright accepts the cast without complaint, leading to hard-to-debug silent failures. This commit adds: - `_format: type[pydantic.BaseModel] | None` attribute on `ModelOutputThunk` (initialised to `None`; propagated via `_copy_from`) - All five backends (`ollama`, `litellm`, `openai`, `huggingface`, `watsonx`) now set `mot._format = _format` in `post_processing()`, alongside the existing `generate_log.extra` artefact - `ComputedModelOutputThunk.parsed` property — calls `_format.model_validate_json(value)` when a format type is stored, returns `None` otherwise - Docstring updates on `ModelOutputThunk.value` and `Session.act()` pointing callers to `.parsed` when `format=` is used - Four unit tests covering: happy path, no-format returns None, invalid JSON raises `pydantic.ValidationError`, and `.value` is unaffected Closes generative-computing#1273. Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> Assisted-by: Claude Code

…es: to parsed - Add `_format = self._format` to `__copy__` and `__deepcopy__` so that copying a ComputedModelOutputThunk preserves the format type; previously a copied thunk would silently return None from .parsed even when the original had a format set. - Add `Raises: pydantic.ValidationError` to the `parsed` property docstring to document the exception callers must handle when the model returns malformed structured output. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

"no manual model_validate_json needed" is more accurate than "MyModel instance, no cast needed" — .parsed returns BaseModel | None, so static type narrowing still requires a cast; the value is just already deserialized. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

jakelorocco · 2026-06-17T12:47:25Z

I think we should reconcile this change with the existing .parsed_repr field. That field is technically supposed to support the structured form of the data. However, the action used to generate the mot currently determines the shape of that data (which is reflected in the type system). We need to rectify these competing typing options.

It seems to me that this change saves the user the hassle of validating the model but doesn't actually give them typing information since the parsed field returns a generic pydantic.BaseModel type. One solution is to propagate the pydantic type through the function signature to the returned model output thunk (which we do for action types right now), assuming we make the disparate typing options coherent.

planetf1 · 2026-06-19T12:52:23Z

@nrfulton @jakelorocco @ajbozarth — this PR has been open for 2 days with no reviews yet. Happy to answer questions.

ajbozarth

Some feedback from Claude — follow-ups inline.

planetf1 · 2026-06-22T10:35:46Z

parsed_repr and .parsed aren't actually overlapping yet — Instruction._parse is a string pass-through, so parsed_repr holds the raw JSON string even when format= is set. The two fields are independent with nothing connecting them. The real question — whether format= should override S so parsed_repr carries the validated model, at which point .parsed could just delegate to it — belongs in #1274. I've added that context there. Is there anything specific you'd want this PR to get right before that work happens?

The HuggingFace chat-path post_processing never assigned mot._format, so .parsed always returned None when format= was set via LocalHFBackend. All other backends (ollama, openai, litellm, watsonx) already set it. Also adds: - Copy/deepcopy unit tests verifying _format is preserved across copies - E2e tests in test_ollama and test_huggingface asserting .parsed returns a typed Pydantic instance end-to-end through each backend - Docstring note on ComputedModelOutputThunk.parsed warning custom-backend authors to set mot._format in their post_processing method Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

The format= overloads narrow the thunk's generic element type, observable on parsed_repr (S | None), not on .value — ComputedModelOutputThunk.value is typed `-> str` unconditionally. parsed_repr also currently routes through Instruction._parse (returns str), so parsed_repr.some_field type-checks but AttributeErrors at runtime: the same silent-failure shape generative-computing#1274 set out to fix, relocated to parsed_repr. Add a TODO pointing at the coordinated .parsed redesign (PR generative-computing#1282) as the proper fix, out of scope here. PR body updated to match (was claiming .value narrows to MyModel). Assisted-by: Claude Code

@ajbozarth

…mat type `.parsed` previously returned `pydantic.BaseModel | None`, so callers still needed `cast(MyModel, result.parsed)` for static narrowing — the gap @ajbozarth flagged on PR generative-computing#1282. Thread the format type through the thunk's existing type parameter `S`: `_format` is now `type[S] | None` and `.parsed` returns `S | None`. Reusing `S` (rather than a second TypeVar) composes with the companion `format=` overloads on generative-computing#1274, which bind `S` to the supplied model so `m.act(action, format=MyModel)` yields `ComputedModelOutputThunk[MyModel]` and `.parsed` is typed `MyModel | None`. The `.parsed` body narrows `_format` to a pydantic type to call `model_validate_json`, then re-asserts the result as `S` — `S` is unbounded (it is `str` for plain instructions) so neither cast can be elided. Add `test/typing/check_parsed.py` asserting `.parsed` tracks the type parameter for both a model-parameterized and a `str` thunk. Assisted-by: Claude Code

@ajbozarth

…mat type `.parsed` previously returned `pydantic.BaseModel | None`, so callers still needed `cast(MyModel, result.parsed)` for static narrowing — the gap @ajbozarth flagged on PR generative-computing#1282. Thread the format type through the thunk's existing type parameter `S`: `_format` is now `type[S] | None` and `.parsed` returns `S | None`. Reusing `S` (rather than a second TypeVar) composes with the companion `format=` overloads on generative-computing#1274, which bind `S` to the supplied model so `m.act(action, format=MyModel)` yields `ComputedModelOutputThunk[MyModel]` and `.parsed` is typed `MyModel | None`. The `.parsed` body narrows `_format` to a pydantic type to call `model_validate_json`, then re-asserts the result as `S` — `S` is unbounded (it is `str` for plain instructions) so neither cast can be elided. Add `test/typing/check_parsed.py` asserting `.parsed` tracks the type parameter for both a model-parameterized and a `str` thunk. Assisted-by: Claude Code

@ajbozarth

…mat type `.parsed` previously returned `pydantic.BaseModel | None`, so callers still needed `cast(MyModel, result.parsed)` for static narrowing — the gap @ajbozarth flagged on PR generative-computing#1282. Thread the format type through the thunk's existing type parameter `S`: `_format` is now `type[S] | None` and `.parsed` returns `S | None`. Reusing `S` (rather than a second TypeVar) composes with the companion `format=` overloads on generative-computing#1274, which bind `S` to the supplied model so `m.act(action, format=MyModel)` yields `ComputedModelOutputThunk[MyModel]` and `.parsed` is typed `MyModel | None`. The `.parsed` body narrows `_format` to a pydantic type to call `model_validate_json`, then re-asserts the result as `S` — `S` is unbounded (it is `str` for plain instructions) so neither cast can be elided. Add `test/typing/check_parsed.py` asserting `.parsed` tracks the type parameter for both a model-parameterized and a `str` thunk. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

… compat Two pyright-compatibility fixes for the `.parsed` property added in bb53ddb: 1. `_format` annotation: revert from `type[S] | None` to `type[pydantic.BaseModel] | None`. Using a covariant TypeVar (`S`) in the invariant `type[...]` position is semantically unsound and can confuse stricter pyright configurations. The field only ever holds pydantic model types at runtime; the concrete annotation is accurate and avoids the variance issue entirely. 2. `.parsed` body: replace the two-step string-quoted cast (`cast("type[pydantic.BaseModel]", …)` then `cast("S", …)`) with a single direct cast (`cast(S, self._format.model_validate_json(self.value))`). Pyright resolves TypeVar forward-references in cast strings differently across versions; using the TypeVar directly is unambiguous. 3. `check_parsed.py`: use `cast(X, cast(object, None))` instead of `cast(X, None)` to avoid basedpyright's `reportInvalidCast` diagnostic (None and X share no overlap); assign `assert_type(…)` results to `_` to silence `reportUnusedCallResult`. All three checkers (mypy, pyright 1.1.408+, basedpyright) now report clean on both changed files. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

@ajbozarth

…mat type `.parsed` previously returned `pydantic.BaseModel | None`, so callers still needed `cast(MyModel, result.parsed)` for static narrowing — the gap @ajbozarth flagged on PR generative-computing#1282. Thread the format type through the thunk's existing type parameter `S`: `_format` is now `type[S] | None` and `.parsed` returns `S | None`. Reusing `S` (rather than a second TypeVar) composes with the companion `format=` overloads on generative-computing#1274, which bind `S` to the supplied model so `m.act(action, format=MyModel)` yields `ComputedModelOutputThunk[MyModel]` and `.parsed` is typed `MyModel | None`. The `.parsed` body narrows `_format` to a pydantic type to call `model_validate_json`, then re-asserts the result as `S` — `S` is unbounded (it is `str` for plain instructions) so neither cast can be elided. Add `test/typing/check_parsed.py` asserting `.parsed` tracks the type parameter for both a model-parameterized and a `str` thunk. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

…rings - ComputedModelOutputThunk.value now carries the same raw-JSON guidance as the parent override so callers inspecting the subclass see it directly. - .parsed opening paragraph no longer overstates current type inference: the format= overloads do not yet bind S to the format model, so the cast idiom is required; removed the false claim that m.act(format=MyModel) yields a typed thunk without a cast. - Added one-line distinction from parsed_repr to prevent confusion between the two properties. Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

@ajbozarth

…mat type `.parsed` previously returned `pydantic.BaseModel | None`, so callers still needed `cast(MyModel, result.parsed)` for static narrowing — the gap @ajbozarth flagged on PR generative-computing#1282. Thread the format type through the thunk's existing type parameter `S`: `_format` is now `type[S] | None` and `.parsed` returns `S | None`. Reusing `S` (rather than a second TypeVar) composes with the companion `format=` overloads on generative-computing#1274, which bind `S` to the supplied model so `m.act(action, format=MyModel)` yields `ComputedModelOutputThunk[MyModel]` and `.parsed` is typed `MyModel | None`. The `.parsed` body narrows `_format` to a pydantic type to call `model_validate_json`, then re-asserts the result as `S` — `S` is unbounded (it is `str` for plain instructions) so neither cast can be elided. Add `test/typing/check_parsed.py` asserting `.parsed` tracks the type parameter for both a model-parameterized and a `str` thunk. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

The format= overloads narrow the thunk's generic element type, observable on parsed_repr (S | None), not on .value — ComputedModelOutputThunk.value is typed `-> str` unconditionally. parsed_repr also currently routes through Instruction._parse (returns str), so parsed_repr.some_field type-checks but AttributeErrors at runtime: the same silent-failure shape generative-computing#1274 set out to fix, relocated to parsed_repr. Add a TODO pointing at the coordinated .parsed redesign (PR generative-computing#1282) as the proper fix, out of scope here. PR body updated to match (was claiming .value narrows to MyModel). Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

planetf1 · 2026-06-22T15:08:26Z

Closing in favour of #1284, which combines this PR with the companion overload-binding work from #1274.

During review @jakelorocco correctly noted that without the format= overloads, .parsed is typed str | None at every call site — meaning callers still need a manual cast, which is exactly the ergonomic problem this feature was meant to solve. #1284 delivers both halves together so the feature is coherent end-to-end.

All commits from this branch are included in #1284 (rebased on top of upstream/main). The review history and resolved threads from this PR are referenced there. A backup of this branch is preserved at backup/issue-1273-pre-collapse on origin.

…re in one PR - session.py overload comment now explicitly names all three attributes (.parsed = typed + runtime-correct via model_validate_json, .parsed_repr = statically typed but runtime gap via Instruction._parse → str, .value = str unconditionally) and directs callers to .parsed for structured output. - check_session.py: add assert_type(r.parsed, _M | None) alongside the existing parsed_repr assertion; update the KNOWN LIMITATION comment now that .parsed is in this same PR rather than deferred to generative-computing#1282. - check_parsed.py: update module docstring to reflect that the format= overloads are now in this PR, closing the "end-to-end" gap previously noted as pending. Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

The format= overloads narrow the thunk's generic element type, observable on parsed_repr (S | None), not on .value — ComputedModelOutputThunk.value is typed `-> str` unconditionally. parsed_repr also currently routes through Instruction._parse (returns str), so parsed_repr.some_field type-checks but AttributeErrors at runtime: the same silent-failure shape generative-computing#1274 set out to fix, relocated to parsed_repr. Add a TODO pointing at the coordinated .parsed redesign (PR generative-computing#1282) as the proper fix, out of scope here. PR body updated to match (was claiming .value narrows to MyModel). Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

github-actions Bot added the enhancement New feature or request label Jun 17, 2026

planetf1 force-pushed the worktree-issue-1273 branch from b1f9251 to acb02f9 Compare June 17, 2026 09:07

planetf1 added 2 commits June 17, 2026 12:15

planetf1 marked this pull request as ready for review June 17, 2026 11:23

planetf1 requested review from a team, jakelorocco and nrfulton as code owners June 17, 2026 11:23

planetf1 requested a review from ajbozarth June 17, 2026 11:23

jakelorocco mentioned this pull request Jun 17, 2026

feat(thunk+types): add .parsed property and wire format= overloads for cast-free structured output #1284

Open

4 tasks

ajbozarth requested changes Jun 19, 2026

View reviewed changes

Comment thread mellea/backends/huggingface.py

Comment thread mellea/core/base.py Outdated

Comment thread mellea/core/base.py

Comment thread mellea/core/base.py

Comment thread test/core/test_base.py

ajbozarth mentioned this pull request Jun 19, 2026

refactor(core): split MOT internal state into _call / _gen sub-objects #1216

Open

6 tasks

planetf1 mentioned this pull request Jun 22, 2026

chore: thread format= through act()/instruct() return type overloads so structured output is cast-free #1274

Open

planetf1 added 2 commits June 22, 2026 11:45

nit(test): explain subclass-loss caveat in copy/deepcopy _format tests

4857a32

Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

planetf1 requested a review from ajbozarth June 22, 2026 10:51

planetf1 enabled auto-merge June 22, 2026 11:42

planetf1 added 2 commits June 22, 2026 15:15

planetf1 force-pushed the worktree-issue-1273 branch from 5e969cf to 6118cea Compare June 22, 2026 14:16

planetf1 closed this Jun 22, 2026

auto-merge was automatically disabled June 22, 2026 15:08
Pull request was closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(thunk): add .parsed property to ComputedModelOutputThunk for structured output#1282

feat(thunk): add .parsed property to ComputedModelOutputThunk for structured output#1282
planetf1 wants to merge 8 commits into
generative-computing:mainfrom
planetf1:worktree-issue-1273

planetf1 commented Jun 17, 2026 •

edited

Loading

Uh oh!

jakelorocco commented Jun 17, 2026

Uh oh!

planetf1 commented Jun 19, 2026

Uh oh!

ajbozarth left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

planetf1 commented Jun 22, 2026

Uh oh!

planetf1 commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

planetf1 commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Summary

Before / After

Compatibility

What is not in this PR

Test plan

Uh oh!

jakelorocco commented Jun 17, 2026

Uh oh!

planetf1 commented Jun 19, 2026

Uh oh!

ajbozarth left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

planetf1 commented Jun 22, 2026

Uh oh!

planetf1 commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

planetf1 commented Jun 17, 2026 •

edited

Loading