feat(lifecycle): auto-archive and auto-delete stopped sandboxes with configurable idle thresholds #1096
Draft
zhangjaycee wants to merge 12 commits into
Introduce SandboxLifecycleConfig as the unified config entry point for sandbox lifecycle parameters (timeouts, auto-archive, auto-delete). ArchiveConfig (with OssConfig, AcrConfig) is nested under lifecycle.archive in YAML. RockConfig.from_env() parses the lifecycle section and __post_init__ coerces dicts to dataclasses.
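The dict-to-dataclass coercion described in this commit can be illustrated with a minimal sketch. It is not the PR's actual code: the field names on OssConfig, the `oss` attribute on ArchiveConfig, and the default for auto_delete_after_sec are assumptions made for illustration; only the class names, the `archive` nesting, and the 3600s auto-archive default come from the PR text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OssConfig:
    # Hypothetical fields; the real OssConfig is not shown in the PR text.
    bucket: str = ""
    endpoint: str = ""


@dataclass
class ArchiveConfig:
    oss: Optional[OssConfig] = None  # attribute name is an assumption

    def __post_init__(self) -> None:
        # A YAML loader yields plain dicts; coerce them into dataclasses.
        if isinstance(self.oss, dict):
            self.oss = OssConfig(**self.oss)


@dataclass
class SandboxLifecycleConfig:
    auto_archive_after_sec: int = 3600  # default stated in the PR
    auto_delete_after_sec: int = 0      # assumed default, not stated in the PR
    archive: Optional[ArchiveConfig] = None

    def __post_init__(self) -> None:
        # Mirrors the nesting lifecycle.archive from the YAML layout.
        if isinstance(self.archive, dict):
            self.archive = ArchiveConfig(**self.archive)


# What a parsed `lifecycle` section might look like after YAML loading.
raw = {
    "auto_archive_after_sec": 1800,
    "archive": {"oss": {"bucket": "sandbox-archives"}},
}
cfg = SandboxLifecycleConfig(**raw)
```

Doing the coercion in `__post_init__` keeps the parsing code free of type-specific branches: callers can pass either a ready-made dataclass or the raw dict.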
Add abstract interfaces (AbstractDirStorage, AbstractImageStorage) and concrete implementations: OssDirStorage, S3DirStorage for directory archives, DockerRegistryV2ImageStorage for container image snapshots with optional Bearer token authentication.
…, operator, and reconciler
- Add ARCHIVING/ARCHIVED states and archive/restore transitions to the state machine with on_* callbacks for metadata persistence.
- Extend AbstractOperator with start_archive/start_restore; implement in RayOperator with low-resource actor override for archive tasks.
- Add SandboxActor.archive() and restore_and_start() methods for commit+push and pull+download+start workflows.
- Add archive_sandbox() and restart_from_archived() to SandboxManager with archive cleanup on delete.
- Add _reconcile (30s) with _reconcile_pending (restore timeout + alive advancement) and _reconcile_archiving (completion check + retry).
- Extract _try_advance_pending helper from get_status for reuse.
- Wire up /archive API endpoint, SDK client, and admin storage injection.
…load
Add max_image_push_size and max_dir_upload_size to ArchiveConfig (default 16g each). Enforced in SandboxActor.archive() before push/upload. Supports Nacos dynamic override via RockConfig.update().
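A size limit written as "16g" has to be turned into a byte count before it can be compared with an image or directory size. The following is a rough sketch of such a check, not the code from this PR: the helper names parse_size and check_within_limit are hypothetical, and treating "g" as a binary unit (1024^3 bytes) is an assumption.

```python
import re

# Assumption: suffixes are binary multiples (1g = 1024**3 bytes).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text: str) -> int:
    """Convert strings such as '16g', '512m' or '1024' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([kmgt]?)b?\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def check_within_limit(actual_bytes: int, limit: str, what: str) -> None:
    """Raise ValueError when actual_bytes exceeds the configured limit."""
    limit_bytes = parse_size(limit)
    if actual_bytes > limit_bytes:
        raise ValueError(
            f"{what} is {actual_bytes} bytes, above the limit {limit} "
            f"({limit_bytes} bytes)"
        )


# Hypothetical usage before a push: a 2 GiB image against the 16g default.
check_within_limit(2 * 1024**3, "16g", "image")
```

Running the check before the push or upload starts means an oversized sandbox fails fast instead of after a long transfer.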
--table: generate DDL for specific tables instead of all.
--alter-from: compare current ORM against an old DDL file or git ref (commit/tag) and output ALTER TABLE ADD COLUMN / CREATE INDEX statements.
Unit tests cover storage/snapshot interfaces, state machine transitions, SandboxActor archive/restore, reconcile_archiving progress checks, and delete-clears-archive behavior. Integration tests cover Registry V2 push/pull/delete, S3 storage round-trip, and full archive E2E with MinIO + local registry fixtures.
Clarify intent: this long-interval scanner handles automatic state transitions (expired → STOPPED), not generic background checks.
… time
Add _auto_archive_stopped to _auto_transition: STOPPED sandboxes whose stop_time exceeds auto_archive_after_sec (default 3600s) are automatically archived. Set auto_archive_after_sec=0 to disable. Rename _check_job_background → _auto_transition to reflect expanded scope.
…o-clear default
- Add auto_delete_after_sec / auto_clear_default_sec to SandboxLifecycleConfig
- Implement _auto_delete_stopped(): delete STOPPED sandboxes past threshold (runs before auto-archive; deleted IDs excluded from archive scan)
- Add _apply_auto_clear_default(): use lifecycle.auto_clear_default_sec when SDK does not explicitly set auto_clear_time_minutes
- Support Nacos override for lifecycle config section
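The ordering rule in these two commits (auto-delete first, then auto-archive over whatever remains) can be sketched as a pure planning function. This is an illustration under assumptions, not the PR's implementation: SandboxInfo and plan_auto_transition are hypothetical names, stop_time is assumed to be epoch seconds, and treating a threshold of 0 as "disabled" is stated in the PR only for auto-archive and is assumed here for auto-delete as well.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SandboxInfo:
    # Hypothetical stand-in for the stored sandbox metadata.
    sandbox_id: str
    state: str
    stop_time: Optional[float] = None  # assumed to be epoch seconds


def plan_auto_transition(
    sandboxes: List[SandboxInfo],
    now: float,
    auto_delete_after_sec: int,
    auto_archive_after_sec: int,
) -> Tuple[List[str], List[str]]:
    """Return (ids to delete, ids to archive) for one scan."""

    def idle_past(info: SandboxInfo, threshold: int) -> bool:
        # Only STOPPED sandboxes with a known stop_time are considered.
        if threshold <= 0 or info.state != "STOPPED" or info.stop_time is None:
            return False
        return now - info.stop_time > threshold

    # Pass 1: auto-delete runs first.
    to_delete = [s.sandbox_id for s in sandboxes if idle_past(s, auto_delete_after_sec)]
    deleted = set(to_delete)

    # Pass 2: auto-archive skips anything already picked for deletion.
    to_archive = [
        s.sandbox_id
        for s in sandboxes
        if s.sandbox_id not in deleted and idle_past(s, auto_archive_after_sec)
    ]
    return to_delete, to_archive


fleet = [
    SandboxInfo("a", "STOPPED", stop_time=0.0),     # idle for 10000s
    SandboxInfo("b", "STOPPED", stop_time=9000.0),  # idle for 1000s
    SandboxInfo("c", "STOPPED"),                    # no stop_time
    SandboxInfo("d", "RUNNING", stop_time=0.0),     # not STOPPED
]
to_delete, to_archive = plan_auto_transition(
    fleet, now=10000.0, auto_delete_after_sec=7200, auto_archive_after_sec=600
)
```

Separating the decision from the side effects makes the ordering easy to unit test: the function only returns IDs, and the caller performs the actual delete and archive calls.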
close #1085
- auto_archive_after_sec: _auto_archive_stopped scans STOPPED sandboxes, auto-archives after idle threshold
- auto_delete_after_sec: _auto_delete_stopped scans STOPPED sandboxes, auto-deletes after idle threshold

Execution order
_auto_transition runs auto-delete first, then auto-archive. Sandboxes deleted in the first pass are excluded from the archive scan to avoid double-processing.

Idle time source
Both auto-archive and auto-delete measure idle time from stop_time in sandbox_info. Sandboxes without a stop_time are skipped.

Configuration (added by this change)