feat(datasync): add chunk package for partitioned sync #19

Merged
jasoet merged 24 commits into main from feat/datasync-chunk on May 10, 2026
Conversation

@jasoet commented May 10, 2026

Owner

Summary

  • Adds datasync/chunk/ package with ChunkedSync[In, Out, K] and DateChunkedSync[In, Out] builders for Temporal-backed partitioned-sync workflows (fetch → map → write per partition).
  • Extracts heartbeat helpers into a new datasync/internal/heartbeat package shared by datasync/activity and datasync/chunk.
  • No changes to datasync/job.go, datasync/workflow/, datasync/builder/, or any caller. Para-sync is intentionally not modified — it will be migrated to consume the upstream package separately.
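
As a rough illustration of the fetch → map → write loop described above, here is a minimal, Temporal-free sketch. `runPartitions` and its function-typed parameters are hypothetical names chosen for this example; the real `ChunkedSync[In, Out, K]` builder runs these stages as Temporal activities and its interfaces may look different.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// runPartitions is a hypothetical stand-in for the per-partition loop.
// It processes keys in ascending order and stops at the first error,
// returning how many mapped records were written so far.
func runPartitions[In, Out any, K cmp.Ordered](
	keys []K,
	fetch func(key K) ([]In, error),
	mapFn func(in In) (Out, error),
	write func(batch []Out) error,
) (int, error) {
	// Sort a copy so the caller's slice is left untouched.
	sorted := slices.Clone(keys)
	slices.Sort(sorted)

	written := 0
	for _, key := range sorted {
		records, err := fetch(key)
		if err != nil {
			return written, fmt.Errorf("fetch partition %v: %w", key, err)
		}
		batch := make([]Out, 0, len(records))
		for _, rec := range records {
			out, err := mapFn(rec)
			if err != nil {
				return written, fmt.Errorf("map partition %v: %w", key, err)
			}
			batch = append(batch, out)
		}
		if err := write(batch); err != nil {
			return written, fmt.Errorf("write partition %v: %w", key, err)
		}
		written += len(batch)
	}
	return written, nil
}

func main() {
	n, err := runPartitions(
		[]int{2, 1},
		func(key int) ([]string, error) { return []string{fmt.Sprint("row-", key)}, nil },
		func(in string) (int, error) { return len(in), nil },
		func(batch []int) error { fmt.Println("wrote", batch); return nil },
	)
	fmt.Println(n, err)
}
```

The `cmp.Ordered` constraint is what lets the sketch sort partition keys without knowing their concrete type.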

Key design points

  • Generic K cmp.Ordered for partition keys; DateChunkedSync wraps K = int64 (Unix nanos) so users keep a time.Time API.
  • Optional ProgressTracker[K] for resumable syncs (cursor-based filtering + advance).
  • MaxPartitionsPerExecution + ContinueAsNew to bound history for large partition lists. Validated at Build() time — panics if set without a tracker.
  • Phase-aware heartbeats (starting/fetching/mapping/writing) via the shared internal/heartbeat package.
  • WithRateLimitRetry decorator + HeartbeatSleeper ported from para-sync.
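
The cursor-based filtering and advance mentioned for `ProgressTracker[K]` could look roughly like the sketch below. The interface shape, the in-memory `memTracker`, and the "strictly greater than the cursor" rule are assumptions made for illustration; in the PR the cursor is read and advanced through the ReadCursor and AdvanceCursor activities.

```go
package main

import (
	"cmp"
	"fmt"
)

// ProgressTracker is a hypothetical shape for the tracker; the real
// interface in datasync/chunk may take contexts and return errors.
type ProgressTracker[K cmp.Ordered] interface {
	ReadCursor() (cursor K, ok bool)
	AdvanceCursor(to K)
}

// memTracker keeps the cursor in memory, for demonstration only.
type memTracker[K cmp.Ordered] struct {
	cursor K
	set    bool
}

func (t *memTracker[K]) ReadCursor() (K, bool) { return t.cursor, t.set }
func (t *memTracker[K]) AdvanceCursor(to K)    { t.cursor, t.set = to, true }

// pending returns the keys that still need processing. Assumption: a key
// counts as done when it is less than or equal to the stored cursor.
func pending[K cmp.Ordered](keys []K, t ProgressTracker[K]) []K {
	cursor, ok := t.ReadCursor()
	if !ok {
		return keys // no cursor yet: everything is pending
	}
	out := make([]K, 0, len(keys))
	for _, k := range keys {
		if cmp.Compare(k, cursor) > 0 {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	var tracker ProgressTracker[int] = &memTracker[int]{}
	keys := []int{1, 2, 3}

	for _, k := range pending(keys, tracker) {
		fmt.Println("processing", k)
		tracker.AdvanceCursor(k) // advance only after the partition succeeds
		if k == 2 {
			break // simulate an interruption after partition 2
		}
	}
	fmt.Println("left after resume:", pending(keys, tracker))
}
```

Advancing the cursor only after a partition succeeds is what makes a restarted run skip finished work instead of repeating it.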

Spec & plan

  • Design: docs/superpowers/specs/2026-05-10-datasync-chunk-design.md
  • Plan: docs/superpowers/plans/2026-05-10-datasync-chunk.md

Test plan

  • task ci:test — unit, 24 packages green
  • task test:integration — integration, 25 packages green, datasync/chunk at 88% coverage; new file datasync/chunk/sync_integration_test.go spawns a real Temporal container via testutil.StartTemporalContainer
  • task lint — clean
  • examples/datasync/chunk_basic.go builds under -tags=example

jasoet added 24 commits May 11, 2026 00:06
Remove the invented "extend if trailing gap < ChunkSize/2" heuristic and
replace the Partitions method with the plain align+iterate+clamp algorithm
used by para-sync. Fix all three affected test expectations: BasicWindow
now uses now=midnight, LastChunkClampedToNow expects 3 partitions, and
AlignsToCalendarMidnightInTimezone expects 2 partitions with the comment
updated to document the corrected iteration.
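
A minimal sketch of an align+iterate+clamp partitioner, under stated assumptions: chunks are whole days, alignment is to calendar midnight in the start time's location, and the last chunk is clamped to `now`. `Partition`, `partitions`, and the `utc` helper are hypothetical names; the actual `Partitions` method and its tests may differ in signature and details.

```go
package main

import (
	"fmt"
	"time"
)

// Partition is a half-open window [Start, End).
type Partition struct {
	Start, End time.Time
}

// partitions aligns start down to midnight, walks forward in chunkDays
// steps, and clamps the final End to now. No trailing-gap heuristic.
func partitions(start, now time.Time, chunkDays int) []Partition {
	if chunkDays <= 0 || !start.Before(now) {
		return nil
	}
	// Align: truncate start to calendar midnight in its own location.
	cur := time.Date(start.Year(), start.Month(), start.Day(), 0, 0, 0, 0, start.Location())

	var out []Partition
	for cur.Before(now) {
		next := cur.AddDate(0, 0, chunkDays) // iterate by calendar days
		end := next
		if end.After(now) {
			end = now // clamp the last chunk
		}
		out = append(out, Partition{Start: cur, End: end})
		cur = next
	}
	return out
}

// utc is a small helper for building example timestamps.
func utc(year int, month time.Month, day, hour int) time.Time {
	return time.Date(year, month, day, hour, 0, 0, 0, time.UTC)
}

func main() {
	for _, p := range partitions(utc(2026, 5, 1, 10), utc(2026, 5, 3, 6), 1) {
		fmt.Println(p.Start.Format(time.RFC3339), "->", p.End.Format(time.RFC3339))
	}
}
```

Using `AddDate` rather than adding a fixed 24 hours keeps chunk boundaries on calendar midnights in locations with daylight-saving shifts.
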
Adds the ChunkedSync[In, Out, K] builder with all setters, Build()
validation (panics on missing required fields), and a stub run() method
returning an error; workflow logic follows in Tasks 13–15.

Extend chunkedSyncWorkflow.run to read cursor position via ReadCursor
activity when hasTracker is set, filter out already-processed partitions,
and advance cursor after each successful partition via AdvanceCursor.

…AsNew

When maxPerExec > 0 and the partition list exceeds the limit, run()
truncates to that many partitions and returns ContinueAsNewError so
Temporal restarts the workflow for the remainder.
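
The truncation step can be sketched without the Temporal SDK. `splitForExecution` and `simulate` are hypothetical names, and the loop in `simulate` only imitates successive workflow executions; in a real workflow the "continue" signal would typically be an error created with Temporal's `workflow.NewContinueAsNewError`.

```go
package main

import "fmt"

// splitForExecution returns the partitions to process in this execution
// and whether another execution is needed for the remainder.
func splitForExecution[K any](pending []K, maxPerExec int) (batch []K, continueAsNew bool) {
	if maxPerExec > 0 && len(pending) > maxPerExec {
		return pending[:maxPerExec], true
	}
	return pending, false
}

// simulate imitates repeated executions. Each one recomputes the pending
// list from a cursor, which is why a tracker is required: without it,
// every restart would see the same full list again.
// Assumes keys are positive and sorted ascending.
func simulate(keys []int, maxPerExec int) (executions int, processed []int) {
	cursor := 0
	for {
		executions++
		var pending []int
		for _, k := range keys {
			if k > cursor {
				pending = append(pending, k)
			}
		}
		batch, again := splitForExecution(pending, maxPerExec)
		for _, k := range batch {
			processed = append(processed, k)
			cursor = k
		}
		if !again {
			return executions, processed
		}
	}
}

func main() {
	executions, processed := simulate([]int{1, 2, 3, 4, 5, 6, 7}, 3)
	fmt.Println(executions, processed)
}
```
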
Wraps ChunkedSync[In, Out, int64] with a time.Time API. Internally,
time.Time keys are projected onto Unix-nanosecond int64 keys via
TimeToKey/KeyToTime. Introduces TimeFetcher and TimeProgressTracker
local interface types to work around the cmp.Ordered constraint that
time.Time cannot satisfy.
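
One plausible reading of the `TimeToKey`/`KeyToTime` projection is sketched below. The function names come from the commit message, but the bodies (including normalizing to UTC on the way back) are assumptions, and `smallest` and `earliest` are hypothetical helpers added to show an `int64` key passing a `cmp.Ordered` constraint that `time.Time` cannot satisfy.

```go
package main

import (
	"cmp"
	"fmt"
	"time"
)

// TimeToKey projects a time onto Unix nanoseconds. Note that UnixNano is
// only well defined for dates roughly between the years 1678 and 2262.
func TimeToKey(t time.Time) int64 { return t.UnixNano() }

// KeyToTime is the inverse projection. Returning UTC is an assumption.
func KeyToTime(k int64) time.Time { return time.Unix(0, k).UTC() }

// smallest works for any ordered key type. It assumes keys is non-empty.
func smallest[K cmp.Ordered](keys []K) K {
	m := keys[0]
	for _, k := range keys[1:] {
		if k < m {
			m = k
		}
	}
	return m
}

// earliest exposes a time.Time API on top of the int64-keyed helper.
// It assumes at least one argument is passed.
func earliest(ts ...time.Time) time.Time {
	keys := make([]int64, len(ts))
	for i, t := range ts {
		keys[i] = TimeToKey(t)
	}
	return KeyToTime(smallest(keys))
}

func main() {
	a := time.Date(2026, time.May, 10, 0, 0, 0, 0, time.UTC)
	b := a.Add(24 * time.Hour)
	fmt.Println(earliest(b, a).Equal(a))
	fmt.Println(KeyToTime(TimeToKey(a)).Equal(a))
}
```
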
…erExec requires tracker

Drop the unused [In any] type parameter from TimeProgressTracker and cascade
the change through timeTrackerAdapter, WithTracker, and test stubs. Add a
Build-time panic when MaxPartitionsPerExecution is set without WithTracker to
prevent silent infinite re-processing loops. Add godoc to
defaultCursorActivityOptions and MaxPartitionsPerExecution. Move
cursorAdvCtx allocation outside the partition loop (avoid per-iteration
re-alloc) using a distinct variable name to avoid shadowing. Add tests for
the new MaxPerExec+tracker validation and the partitionSleep workflow path.
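
The Build-time guard can be illustrated with a toy builder. Everything here is hypothetical (`syncBuilder`, the argument-free `WithTracker`, the panic messages, and the `panics` helper); it only mirrors the rule stated above that a per-execution limit without a tracker is rejected when building.

```go
package main

import "fmt"

// syncBuilder is a toy stand-in for the real generic builder.
type syncBuilder struct {
	name       string
	maxPerExec int
	hasTracker bool
}

func newSyncBuilder(name string) *syncBuilder { return &syncBuilder{name: name} }

// WithTracker takes no argument here; the real setter accepts a tracker.
func (b *syncBuilder) WithTracker() *syncBuilder {
	b.hasTracker = true
	return b
}

func (b *syncBuilder) MaxPartitionsPerExecution(n int) *syncBuilder {
	b.maxPerExec = n
	return b
}

// Build validates the configuration and panics on invalid combinations,
// so misconfiguration fails at wiring time rather than mid-workflow.
func (b *syncBuilder) Build() string {
	if b.name == "" {
		panic("chunk: name is required")
	}
	if b.maxPerExec > 0 && !b.hasTracker {
		// Without a cursor, each continued execution would recompute the
		// same partition list and re-process it indefinitely.
		panic("chunk: MaxPartitionsPerExecution requires WithTracker")
	}
	return fmt.Sprintf("%s(max=%d, tracker=%t)", b.name, b.maxPerExec, b.hasTracker)
}

// panics reports whether f panicked.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println(panics(func() { newSyncBuilder("orders").MaxPartitionsPerExecution(10).Build() }))
	fmt.Println(newSyncBuilder("orders").WithTracker().MaxPartitionsPerExecution(10).Build())
}
```
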
@jasoet merged commit c290b24 into main on May 10, 2026
1 check passed
@jasoet deleted the feat/datasync-chunk branch on May 10, 2026 21:44