
Recipes: add three starter modes (fast / balanced / accurate) #1809

Description

@RonShakutai

Is your feature request related to a problem? Please describe.
New Presidio users often don’t know where to start when choosing between spaCy, transformers, or LLM-augmented pipelines. This slows adoption and makes customization harder than necessary.

Describe the solution you'd like
Introduce three high-level, opinionated starter modes that users can immediately run and compare:

  • Fast (spaCy) – Minimal, lightweight, low-latency configuration.
  • Balanced (Transformers) – Uses a transformer NER model for improved accuracy with moderate latency.
  • Accurate (Transformers + LLM) – Hybrid pipeline prioritizing accuracy/recall, optionally calling an LLM for difficult cases.
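As a rough sketch of what the copy/paste configs for these modes might look like, the snippet below keeps one settings dictionary per mode and builds an `AnalyzerEngine` from it through `NlpEngineProvider`. The model names (`en_core_web_sm`, `dslim/bert-base-NER`), the `STARTER_MODES` and `build_analyzer` names, and the `llm_fallback` flag are assumptions for illustration only. In particular, `llm_fallback` is a hypothetical marker, not a Presidio option, since this issue does not specify how the LLM step would be wired in.

```python
from typing import Any


def _transformers_nlp() -> dict[str, Any]:
    """NLP engine settings for a transformer NER model (model names are assumptions)."""
    return {
        "nlp_engine_name": "transformers",
        "models": [
            {
                "lang_code": "en",
                "model_name": {
                    "spacy": "en_core_web_sm",  # used for tokenization
                    "transformers": "dslim/bert-base-NER",  # used for NER
                },
            }
        ],
    }


# One entry per starter mode. "llm_fallback" is a hypothetical flag that only
# marks where an LLM step would plug in; Presidio does not read it.
STARTER_MODES: dict[str, dict[str, Any]] = {
    "fast": {
        "nlp": {
            "nlp_engine_name": "spacy",
            "models": [{"lang_code": "en", "model_name": "en_core_web_sm"}],
        },
        "llm_fallback": False,
    },
    "balanced": {"nlp": _transformers_nlp(), "llm_fallback": False},
    "accurate": {"nlp": _transformers_nlp(), "llm_fallback": True},
}


def build_analyzer(mode: str):
    """Build an AnalyzerEngine for a starter mode (requires presidio-analyzer)."""
    # Imported lazily so the configs above can be inspected without Presidio installed.
    from presidio_analyzer import AnalyzerEngine
    from presidio_analyzer.nlp_engine import NlpEngineProvider

    settings = STARTER_MODES[mode]
    provider = NlpEngineProvider(nlp_configuration=settings["nlp"])
    return AnalyzerEngine(
        nlp_engine=provider.create_engine(), supported_languages=["en"]
    )
```

With the relevant models installed, `build_analyzer("fast").analyze(text="My name is Jane", language="en")` would return the detected entities. The transformers-based modes may need extra entity-mapping configuration in practice, and the LLM step for the accurate mode is left open here.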

Each mode includes:

  • A small config users can copy/paste.
  • A single notebook demonstrating all three modes on the same dataset.
  • Brief notes on expected performance/latency trade-offs.

This gives users a simple, structured starting point before diving into domain-specific customization.
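The comparison notebook could be organized around a small harness like the one below, which runs every mode over the same texts and records mean latency and the number of findings. This is only a sketch: `compare_modes` is a hypothetical helper, and the regex-based detectors are stand-ins so the example runs without any models. They are not the proposed modes.

```python
import re
import time
from statistics import mean
from typing import Callable, Sequence

# A detector takes a text and returns a list of findings.
Detector = Callable[[str], list]


def compare_modes(detectors: dict[str, Detector], texts: Sequence[str]) -> dict[str, dict]:
    """Run each detector over the same texts; report mean latency and finding count."""
    report = {}
    for name, detect in detectors.items():
        latencies = []
        n_findings = 0
        for text in texts:
            start = time.perf_counter()
            findings = detect(text)
            latencies.append(time.perf_counter() - start)
            n_findings += len(findings)
        report[name] = {
            "mean_latency_ms": 1000 * mean(latencies),
            "findings": n_findings,
        }
    return report


# Stand-in detectors (assumptions for illustration, not real Presidio pipelines).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

sample_texts = [
    "Contact jane@example.com or 555-123-4567.",
    "No personal data here.",
]
stand_ins = {
    "fast": lambda t: EMAIL.findall(t),
    "balanced": lambda t: EMAIL.findall(t) + PHONE.findall(t),
}

report = compare_modes(stand_ins, sample_texts)
for mode, stats in report.items():
    print(mode, stats)
```

With Presidio installed, each stand-in could be swapped for something like `lambda t: analyzer.analyze(text=t, language="en")`, using one `AnalyzerEngine` per mode, so that all three modes are measured on identical input.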

Describe alternatives you've considered

  • Expanding documentation only: too abstract, and it still leaves users without clear starting configurations.
  • Providing fully domain-specific recipes first: higher effort, and it gives users no generic baseline they can adopt anywhere.

Additional context
This is the first incremental step in building a broader “recipes” section for Presidio. The three modes establish a foundation on top of which domain-specific examples can later be added.
