Problem
Streambed flushes per-table based on row count (--flush-rows) or time thresholds (--flush-interval), not transaction boundaries. A single Postgres transaction's writes can land in different Iceberg snapshots — either across tables or even within the same table if a flush fires mid-transaction.
Example
Postgres transaction:
BEGIN;
INSERT INTO orders (...);
INSERT INTO order_items (...);
UPDATE inventory SET qty = qty - 1 WHERE ...;
COMMIT;
With per-table flush, a reader querying streambed between flushes could see:
orders with the new row, but order_items without it (cross-table inconsistency)
- Half of a batch of inserts in one Iceberg snapshot, the other half in the next (within-table split)
The final state is always correct — all rows arrive eventually. But there is no point-in-time consistency guarantee across tables or across flush boundaries.
Why This Happens
pgoutput v1 streams events in transaction order (BEGIN → events → COMMIT), but streambed doesn't track transaction boundaries for flush decisions. The pipeline buffers rows per-table and flushes whenever thresholds are hit, regardless of whether the current transaction has committed yet.
Note: pgoutput v2 (streaming mode) doesn't help here — it adds streaming of large in-progress transactions but doesn't change the flush atomicity model.
Potential Approaches
1. Transaction-aware buffering
- Track
BEGIN/COMMIT messages
- Only make rows visible to flush after their
COMMIT arrives
- Flush at transaction boundaries when thresholds are met
- Trade-off: large transactions (e.g., bulk INSERT of 1M rows) must be fully buffered in memory before any flush
2. Cross-table atomic commit
- Buffer all tables touched by a transaction, flush them together on COMMIT
- Blocker: Iceberg has no cross-table atomic commit — each table is an independent snapshot. This would require a coordination layer (e.g., a "commit log" that readers consult)
3. Commit-timestamp watermark
- Tag each Iceberg snapshot with the Postgres
CommitMessage.CommitTime
- Readers query "as of timestamp T" across all tables
- Trade-off: requires reader-side logic; DuckDB/iceberg_scan doesn't support time-travel filtering natively in the current setup
Current Status
This is explicitly deferred in specs/rearchitecture.md:
Transaction-aligned flush (when per-table reader-visibility atomicity becomes a user-visible requirement)
For most analytics use cases (dashboards, reporting, aggregations on lagged data), eventual consistency within seconds is acceptable. This becomes important for workloads that need snapshot isolation across tables — e.g., financial reports joining accounts and transactions expecting consistency as of the same Postgres commit.
Scope
- Not a correctness bug — final state is always correct
- Optimization / feature — provides stronger read consistency guarantees
- Likely requires changes to
pipeline.go (transaction tracking), writer.go (flush coordination), and possibly the query server (time-travel support)
Problem
Streambed flushes per-table based on row count (
--flush-rows) or time thresholds (--flush-interval), not transaction boundaries. A single Postgres transaction's writes can land in different Iceberg snapshots — either across tables or even within the same table if a flush fires mid-transaction.Example
Postgres transaction:
With per-table flush, a reader querying streambed between flushes could see:
orderswith the new row, butorder_itemswithout it (cross-table inconsistency)The final state is always correct — all rows arrive eventually. But there is no point-in-time consistency guarantee across tables or across flush boundaries.
Why This Happens
pgoutput v1 streams events in transaction order (
BEGIN → events → COMMIT), but streambed doesn't track transaction boundaries for flush decisions. The pipeline buffers rows per-table and flushes whenever thresholds are hit, regardless of whether the current transaction has committed yet.Note: pgoutput v2 (streaming mode) doesn't help here — it adds streaming of large in-progress transactions but doesn't change the flush atomicity model.
Potential Approaches
1. Transaction-aware buffering
BEGIN/COMMITmessagesCOMMITarrives2. Cross-table atomic commit
3. Commit-timestamp watermark
CommitMessage.CommitTimeCurrent Status
This is explicitly deferred in
specs/rearchitecture.md:For most analytics use cases (dashboards, reporting, aggregations on lagged data), eventual consistency within seconds is acceptable. This becomes important for workloads that need snapshot isolation across tables — e.g., financial reports joining accounts and transactions expecting consistency as of the same Postgres commit.
Scope
pipeline.go(transaction tracking),writer.go(flush coordination), and possibly the query server (time-travel support)