Migrate to content-addressable storage for archived sources

## Problem

Content is stored at `{YYYY}/{MM}/{DD}/{url_hash}.{ext}` where `url_hash = SHA256(url)[:16]`. This has two issues:

1. **Collision**: The same URL fetched twice on the same day writes to the same file path, so the second fetch silently overwrites the first. If the page changed between fetches, the first source's archived content is lost.
2. **Wrong abstraction**: The hash is derived from the URL, not the content, so it neither deduplicates identical content nor prevents overwrites.
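
To make the collision concrete, here is a minimal sketch of how the current path is derived. The function name `legacy_path`, its parameters, and the example URL are hypothetical; only the path template and the `SHA256(url)[:16]` rule come from the description above.

```python
import hashlib
from datetime import date


def legacy_path(url: str, fetched_on: date, ext: str) -> str:
    """Old layout: the path depends only on the URL and the fetch date."""
    # Assumption: the URL is hashed as UTF-8 and the hex digest is truncated.
    url_hash = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return f"{fetched_on:%Y/%m/%d}/{url_hash}.{ext}"


# Two fetches of the same URL on the same day resolve to the same path,
# regardless of what the page contained at each fetch.
day = date(2024, 1, 15)
first = legacy_path("https://example.com/article", day, "html")
second = legacy_path("https://example.com/article", day, "html")
assert first == second  # the second write would replace the first
```

Nothing in the path reflects the fetched bytes, which is why a changed page overwrites the earlier copy.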

## Solution

Switch to content-addressable storage (CAS), like Git's object store:

- **Hash the actual content** (SHA-256 of file bytes), not the URL
- **Path structure**: `{hash[:2]}/{hash[2:4]}/{hash}.{ext}` (2 levels of 2-char dirs)
- **Natural deduplication**: identical content is stored once, and different content never shares a path (barring a SHA-256 collision)
- **Idempotent writes**: skip the write if the target file already exists
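
The points above could be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: `cas_path` is named in this issue, but its signature here is a guess, and `content_hash`, `save_content`, `read_content`, and the `root` directory argument are hypothetical names.

```python
import hashlib
from pathlib import Path


def content_hash(data: bytes) -> str:
    """SHA-256 of the raw file bytes, as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def cas_path(digest: str, ext: str) -> Path:
    """Relative path {hash[:2]}/{hash[2:4]}/{hash}.{ext}."""
    return Path(digest[:2]) / digest[2:4] / f"{digest}.{ext}"


def save_content(root: Path, data: bytes, ext: str) -> str:
    """Store data under root and return its content hash.

    Writing the same bytes twice is a no-op the second time.
    """
    digest = content_hash(data)
    target = root / cas_path(digest, ext)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so a crash mid-write does not
        # leave a truncated file at the final content-addressed path.
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(target)
    return digest


def read_content(root: Path, digest: str, ext: str) -> bytes:
    """Read back content previously stored under the given hash."""
    return (root / cas_path(digest, ext)).read_bytes()
```

The temporary-file rename is an addition not mentioned in the issue. It is included because an existence check alone would treat a partially written file as already stored.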

## Changes needed

- Drop `url_hash` column entirely (the URL itself is already stored)
- Add `content_hash` and `html_content_hash` columns (set after fetch, when content is available)
- Replace `path_root` property with `cas_path()` deriving paths from content hashes
- Replace `save_archived_content`/`read_archived_content` with CAS equivalents
- Alembic migration, including moving existing files from the old paths to the new CAS paths
- Update API schema, frontend types, and tests
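
The file-migration step might look something like the sketch below. It covers only the filesystem side and leaves out the Alembic schema changes. The function name `migrate_tree`, both directory arguments, and the returned mapping are hypothetical. It also assumes the new tree lives outside the old one and copies files rather than moving them, so the old tree stays intact until the migration is verified.

```python
import hashlib
import shutil
from pathlib import Path


def migrate_tree(old_root: Path, new_root: Path) -> dict[str, str]:
    """Copy every file under old_root into the CAS layout under new_root.

    Returns a mapping of old relative path -> content hash, which could be
    used to backfill the new hash columns for each archived source.
    """
    mapping: dict[str, str] = {}
    for old_file in sorted(p for p in old_root.rglob("*") if p.is_file()):
        digest = hashlib.sha256(old_file.read_bytes()).hexdigest()
        ext = old_file.suffix.lstrip(".")
        name = f"{digest}.{ext}" if ext else digest
        target = new_root / digest[:2] / digest[2:4] / name
        if not target.exists():
            # Identical content from different old paths is copied once.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(old_file, target)
        mapping[old_file.relative_to(old_root).as_posix()] = digest
    return mapping
```

Content that was already overwritten under the old scheme cannot be recovered by this step. The migration only rehashes what is still on disk.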

Migrate to content-addressable storage for archived sources #145
