Issue: CLI Option - Allow Uploading CSV File with Annotations
Description
Add a CLI option to accept an annotations CSV file that enriches the LLM conversion process with external legal metadata.
Proposed CLI Interface
# Basic usage with annotations
akoma-markup input.pdf \
--llm-inline '{"provider": "anthropic"}' \
--annotations-csv annotations.csv \
-o output.txt
# Multiple annotation sources
akoma-markup input.pdf \
--llm-json config.json \
--annotations-csv refs.csv \
--annotations-csv cases.csv \
-o output.txt
# With metadata only (no LLM conversion)
akoma-markup input.pdf \
--annotations-csv annotations.csv \
--annotations-only \
-o output.txt
CSV File Format
The CSV file should support multiple annotation types:
Required Columns
section_num - Section number (e.g., "1", "1A", "41")
content - The annotation text
Optional Columns
type - Annotation type: reference, citation, commentary, history
source - Source of the annotation
url - Link to external resource
priority - Display priority (1-10); sets the order in which annotations appear below the section
Example CSV:
section_num,type,content,source,url,priority
1,reference,See Section 5 for related provisions,BNSS 2023,https://...,1
41,citation,Kishan Singh v. State,Supreme Court,https://...,2
50,commentary,Amended by Act XX of 2024,Lok Sabha,https://...,3
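The column rules above could be checked row by row along these lines. This is a minimal sketch, not the project's implementation: the Annotation dataclass, the parse_row helper, and the choice to raise ValueError on bad data are all assumptions made for illustration, and the sample rows leave the url column empty because the example URLs above are elided.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

ALLOWED_TYPES = {"reference", "citation", "commentary", "history"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical container for one CSV row; not part of the existing package."""
    section_num: str
    content: str
    type: Optional[str] = None
    source: Optional[str] = None
    url: Optional[str] = None
    priority: Optional[int] = None


def parse_row(row: dict) -> Annotation:
    """Turn one csv.DictReader row into an Annotation; raise ValueError on bad data."""
    section_num = (row.get("section_num") or "").strip()
    content = (row.get("content") or "").strip()
    if not section_num or not content:
        raise ValueError("section_num and content are required")

    ann_type = (row.get("type") or "").strip() or None
    if ann_type is not None and ann_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown annotation type: {ann_type!r}")

    raw_priority = (row.get("priority") or "").strip()
    priority = None
    if raw_priority:
        priority = int(raw_priority)  # int() raises ValueError for non-numeric input
        if not 1 <= priority <= 10:
            raise ValueError(f"priority out of range 1-10: {priority}")

    return Annotation(
        section_num=section_num,
        content=content,
        type=ann_type,
        source=(row.get("source") or "").strip() or None,
        url=(row.get("url") or "").strip() or None,
        priority=priority,
    )


# Two rows from the example above, with the url column left empty.
SAMPLE = """section_num,type,content,source,url,priority
1,reference,See Section 5 for related provisions,BNSS 2023,,1
41,citation,Kishan Singh v. State,Supreme Court,,2
"""

annotations = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Section numbers stay as strings so that values such as "1A" survive unchanged.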
Implementation Plan
1. CSV Parsing Module
Create src/akoma_markup/annotations.py:
- load_annotations(csv_path) → dict keyed by section_num
- Validate CSV format
- Handle duplicates and conflicts
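One possible shape for load_annotations is sketched below. Only the function name, its csv_path argument, and the "dict keyed by section_num" return value come from this issue; the duplicate policy (silently dropping rows that repeat the same section, type, and content) is an assumption, since conflict handling has not been specified yet.

```python
import csv
from collections import defaultdict
from pathlib import Path

REQUIRED_COLUMNS = {"section_num", "content"}


def load_annotations(csv_path):
    """Read one annotations CSV and group its rows by section_num.

    Assumption: rows repeating the same (section_num, type, content) are
    dropped silently; the issue does not yet define conflict resolution.
    """
    grouped = defaultdict(list)
    seen = set()
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{csv_path}: missing required columns: {sorted(missing)}")
        for row in reader:
            section = (row["section_num"] or "").strip()
            content = (row["content"] or "").strip()
            if not section or not content:
                raise ValueError(
                    f"{csv_path}:{reader.line_num}: section_num and content are required"
                )
            key = (section, (row.get("type") or "").strip(), content)
            if key in seen:
                continue  # skip exact duplicates
            seen.add(key)
            # Drop the None key that DictReader uses for surplus fields.
            grouped[section].append(
                {k: (v or "").strip() for k, v in row.items() if k is not None}
            )
    return dict(grouped)
```

Each value is a list because a single section can carry several annotations. With multiple --annotations-csv files, the per-file dicts could be combined by extending these lists.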
2. CLI Integration
Update src/akoma_markup/cli.py:
- Add --annotations-csv option (multiple allowed)
- Pass annotations to convert()
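A repeatable option could be declared roughly as follows. The sketch assumes argparse purely for illustration; this issue does not say which CLI library cli.py uses, and the build_parser helper and help strings are hypothetical. The flag names are the ones proposed above.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags discussed in this issue."""
    parser = argparse.ArgumentParser(prog="akoma-markup")
    parser.add_argument("input", help="Path to the input PDF")
    parser.add_argument("-o", dest="output", help="Path to the output file")
    parser.add_argument(
        "--annotations-csv",
        action="append",  # each occurrence appends one path to the list
        default=[],
        metavar="PATH",
        help="Annotations CSV file; may be given more than once",
    )
    parser.add_argument(
        "--annotations-only",
        action="store_true",
        help="Attach annotations without running the LLM conversion",
    )
    return parser


args = build_parser().parse_args(
    ["input.pdf", "--annotations-csv", "refs.csv",
     "--annotations-csv", "cases.csv", "-o", "output.txt"]
)
```

Here args.annotations_csv holds the paths in the order given on the command line, which matters if later files are meant to override or follow earlier ones.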
3. Core Conversion Updates
Update src/akoma_markup/__init__.py:convert():
- Accept optional annotations_path or annotations_dict parameter
- Merge annotations with section data after LLM conversion
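The merge step might look like the sketch below. The section data structure is not defined in this issue, so the shape used here (a list of dicts with a "num" key) and the merge_annotations helper are hypothetical. The ordering rule is also an assumption: lower priority values come first, and annotations without a priority come last.

```python
def merge_annotations(sections, annotations):
    """Return new section dicts with an "annotations" list attached.

    sections: list of dicts, each with a "num" key (hypothetical shape).
    annotations: dict mapping section_num to a list of annotation dicts.
    """
    merged = []
    for section in sections:
        notes = annotations.get(str(section["num"]), [])
        # Assumption: lower priority values sort first; entries without a
        # priority sort after all prioritised entries.
        ordered = sorted(
            notes,
            key=lambda a: (a.get("priority") is None, a.get("priority") or 0),
        )
        merged.append({**section, "annotations": ordered})
    return merged


sections = [{"num": "1", "text": "..."}, {"num": "2", "text": "..."}]
annotations = {
    "1": [
        {"content": "Later note", "priority": 3},
        {"content": "Unranked note"},
        {"content": "First note", "priority": 1},
    ]
}
result = merge_annotations(sections, annotations)
```

The input sections are left unmodified, so the LLM output can still be inspected or written separately from the annotated version.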
Use Cases
- Legal Researcher - Add case law citations to relevant sections
- Law Student - Include explanatory notes and cross-references
- Publisher - Merge official amendment notifications
- Developer - Test with curated annotation sets
Dependencies
- Depends on: Annotations data structure design
- Relates to: Add annotations support feature
Acceptance Criteria
- --annotations-csv CLI option implemented