[SS] Add a memory snapshot export monitor#12484
Conversation
ffd0f75 to
af4996f
Compare
af4996f to
c2f813d
Compare
There was a problem hiding this comment.
Adds two not-yet-wired building-block types for caching Firecracker memory-snapshot chunks during export: sequentialChunkWriteTracker and memorySnapshotExportMonitor. The design is clean and tests are readable, but there is a high-severity data race / dropped-final-chunk bug in Finish() that should be addressed (or covered by a test) before wiring to a real COWStore; remaining findings are non-blocking.
| // chunk as soon as writes move on to a later chunk, while the snapshot data is still in the page cache. | ||
| // This removes the need for another pass through the COWStore to cache chunks after the snapshot is generated, | ||
| // reducing memory and IO from processing the large snapshots twice. | ||
| type memorySnapshotExportMonitor struct { |
There was a problem hiding this comment.
"monitor" sounds like this is for observability but it's really for caching. maybe call it something like "memorySnapshotCacher" or "memorySnapshotCacheWriter"?
| log.CtxWarningf(m.egCtx, "Failed to finalize sequential chunk write tracker: %s", err) | ||
| return err |
There was a problem hiding this comment.
Generally we should either log an error or return the error, but not both. Otherwise we'll wind up polluting the logs.
If the logging was meant to add additional context (e.g. the error is happening due to "finalizing"), then you can instead use status.WrapError to add that extra context.
| "memory_snapshot_export_monitor.go", | ||
| "sequential_chunk_write_monitor.go", |
There was a problem hiding this comment.
nit: it seems a tad overkill to add two separate files for this - maybe consider putting both in a single file since the functionality is very closely related and the two things together are fairly small
| } | ||
|
|
||
| func (m *memorySnapshotExportMonitor) startUploadWorkers(uploadConcurrency int) { | ||
| for i := 0; i < uploadConcurrency; i++ { |
There was a problem hiding this comment.
nit: this can be for range i
|
|
||
| // queueChunkUpload schedules caching for a chunk that the sequential tracker has | ||
| // finalized. | ||
| func (m *memorySnapshotExportMonitor) queueChunkUpload(chunkOffset int64) error { |
There was a problem hiding this comment.
this always returns nil, so maybe it doesn't need to return an error?
If it didn't return an error, than sequentialChunkWriteTracker.Finish also wouldn't need to return an error.
| if chunkSizeBytes <= 0 { | ||
| return status.FailedPreconditionErrorf("chunk size must be positive: %d", chunkSizeBytes) | ||
| } |
There was a problem hiding this comment.
nit: chunkSizeBytes is the same on every iteration, but checked every time.
| chunkSizeBytes int64 | ||
|
|
||
| // finalizeChunk is called when no more writes are expected for the given chunk. | ||
| finalizeChunk func(chunkOffset int64) error |
There was a problem hiding this comment.
considering that memorySnapshotExportMonitor.OnWrite is a callback, adding another layer of callback is a bit tricky to understand. Two possible options for simplification:
- Instead of passing in a callback, just pass in the
chan int64 - Instead of callback, make Observe and Finish return the chunk offset that should be written, if any.
Just food for thought. I'm not sure that either option is actually better.
| } | ||
|
|
||
| func (m *memorySnapshotExportMonitor) run() error { | ||
| for { |
There was a problem hiding this comment.
what do you think about inlining the tracker implementation into this function?
It would just need currentChunkIndex and haveCurrentChunk variables, and Observe and Finish would be inlined. validateSequentialWriteEvent could still be a separate helper function.
Replacement for this PR. Rather than interleaving the logic into copy_on_write store, adds it to the firecracker package