refactor: make BaseProvider a BaseModel with discriminated union#5323
Conversation
Replace the Protocol with a BaseModel + ABC so providers serialize and deserialize natively via pydantic. Each provider gets a Literal provider_type field. CheckpointConfig.provider uses a discriminated union so the correct provider class is reconstructed from checkpoint JSON.
iris-clawd
left a comment
There was a problem hiding this comment.
Review: BaseProvider → BaseModel with discriminated union
Good fix for the serialization round-trip issue. Protocol → BaseModel+ABC is the right call when you need Pydantic to own the serialization.
🟡 Extensibility for custom providers
The discriminated union on CheckpointConfig.provider is now a concrete JsonProvider | SqliteProvider. Third-party providers can't be plugged in without modifying this type annotation. The old Protocol approach allowed any conforming object.
For alpha this is fine — you only have two providers. But if custom providers are a planned use case, you'd want a registry pattern or Union that can be extended (e.g., entry points, or a class method that returns the updated Annotated type). Worth noting in docs.
🟡 BaseProvider.provider_type default
BaseProvider has provider_type: str = "base" as a non-abstract field. A subclass that forgets to override it with a Literal would still pass validation but break the discriminator at the CheckpointConfig level (Pydantic would fail to match "base" to any union member). Making provider_type abstract or removing the default would catch this at definition time. Minor since you only have two concrete providers.
🟢 Looks good
- Discriminated union via
provider_typeLiteral is clean Pydantic v2 pattern @abstractmethodon all five methods is correct — forces implementation- Removes the hacky
__get_pydantic_core_schema__workaround - Fixes the actual bug (TypeError on checkpoint restore)
Solid refactor. 💬 171
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 45e05e2. Configure here.

Summary
BaseProviderProtocol with BaseModel + ABC so providers serialize/deserialize natively via pydanticprovider_typeLiteral field to JsonProvider and SqliteProviderCheckpointConfig.providerso the correct provider class is reconstructed from checkpoint JSONTypeError: Expected a BaseProvider instance, got <class 'str'>on checkpoint restoreTest plan
CheckpointConfig(provider=SqliteProvider())round-trips throughmodel_dump_json/model_validate_jsonCrew.from_checkpoint(path, provider=SqliteProvider())worksNote
Medium Risk
Medium risk because it changes how checkpoint providers are typed/validated/serialized via Pydantic, which can affect checkpoint round-tripping and restore paths across existing stored snapshots.
Overview
Refactors checkpoint provider serialization by replacing
BaseProviderfrom aProtocolwith an abstractpydantic.BaseModelcarrying aprovider_typediscriminator.CheckpointConfig.provideris now a discriminated union (JsonProvider | SqliteProvider) keyed byprovider_type, and both providers add aprovider_typeLiteralfield so checkpoint configs can round-trip through JSON and reconstruct the correct provider implementation during restore.Reviewed by Cursor Bugbot for commit 45e05e2. Bugbot is set up for automated code reviews on this repo. Configure here.