Fix Windows daemon crash during v1 to v2 local migration #1209
The v1->v2 data migration ran fsync/rename/remove on libSQL SQLite files whose native OS handle lingers past close() on Windows, crashing the daemon at boot with a fatal "Unknown error". POSIX never hits this, so it only reproduced on Windows (the v1.5.23 desktop/CLI startup regression).
- fsyncFileIfExists opens read-write, since Windows FlushFileBuffers needs a writable handle (a read-only fd throws EPERM), and treats fsync as best-effort durability hardening.
- remove of the source/staging file sets now retries EBUSY/EPERM the same way rename already did.
- rename/remove retries force a GC pass first so libSQL's native finalizer releases the handle before the next attempt.
Verified on a Windows VM: a seeded v1 database now migrates to v2 and the daemon announces ready.
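The best-effort flush described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `fsyncFileBestEffort` is a hypothetical stand-in for `fsyncFileIfExists`, and the boolean return value and the temp-file demo are assumptions added for the example.

```typescript
import { closeSync, existsSync, fsyncSync, mkdtempSync, openSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical sketch of a best-effort flush; the real fsyncFileIfExists may differ.
// Returns true when the flush succeeded, false when it was skipped or failed.
function fsyncFileBestEffort(path: string): boolean {
  if (!existsSync(path)) return false;
  let fd: number | undefined;
  try {
    // "r+" opens read-write without truncating. Windows FlushFileBuffers
    // rejects a read-only handle, so "r" would fail there with EPERM.
    fd = openSync(path, "r+");
    fsyncSync(fd);
    return true;
  } catch {
    // Best-effort: the preceding rename/copy already wrote the bytes,
    // so a failed flush only weakens durability and should not abort.
    return false;
  } finally {
    if (fd !== undefined) closeSync(fd);
  }
}

// Demo on a throwaway temp file (names are illustrative only).
const demoDir = mkdtempSync(join(tmpdir(), "fsync-sketch-"));
const demoFile = join(demoDir, "data.db");
writeFileSync(demoFile, "seed");
const flushed = fsyncFileBestEffort(demoFile);
const missing = fsyncFileBestEffort(join(demoDir, "absent.db"));
console.log({ flushed, missing });
```

Swallowing the error is only reasonable here because the flush is hardening on top of data that has already been written; a caller that depends on the flush for correctness would want to surface the failure instead.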
Deploying with
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | executor-marketing | 86da524 | Commit Preview URL, Branch Preview URL | Jun 29 2026, 07:22 AM |
Deploying with
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs | executor-cloud | 86da524 | Jun 29 2026, 07:23 AM |
Greptile Summary
This PR fixes a Windows-only daemon crash during the v1→v2 local database migration, caused by libSQL's OS file handles lingering past
Confidence Score: 4/5
Safe to merge; directly addresses a verified Windows startup crash with a targeted, well-commented fix. The migration logic is carefully structured: the retry helpers mirror the pre-existing apps/local/src/db/v1-v2-migration.ts - the
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant M as migrateLocalV1ToV2IfNeeded
participant C as libSQL client
participant GC as Bun.gc (nudgeNativeHandleRelease)
participant FS as File System
M->>C: client.close()
Note over C,FS: Windows: OS handle lingers in native finalizer
M->>GC: Bun.gc(true) — flush finalizer queue
GC-->>FS: libSQL finalizer releases OS handle
M->>FS: rmWithRetry / renameWithRetry (retry on EBUSY/EPERM)
FS-->>M: success
Note over M,FS: fsyncFileIfExists path
M->>FS: "openSync(path, "r+") <- writable handle for FlushFileBuffers"
FS-->>M: fd
M->>FS: "fsyncSync(fd) <- best-effort, errors swallowed"
M->>FS: closeSync(fd)
Cloudflare preview: torn down (the PR is closed).
Problem
The bundled CLI/desktop daemon crashed on first launch on Windows with a fatal `Unknown error` whenever a v1 local database was present. macOS and Linux were unaffected. This is the v1.5.23 Windows startup regression (the desktop `Smoke test bundled executor` job fails on Windows; mac/linux pass).

Root cause
The v1 to v2 data migration performs file operations on libSQL SQLite files (the canonical `data.db`, the `.source-*` copy, and the `.building-*` staging db, each with `-wal`/`-shm` sidecars). On Windows, libSQL releases its OS file handle in a native finalizer that only runs on GC, so the handle lingers past `client.close()`. POSIX releases immediately, which is why this never reproduced off Windows. The lingering handle caused:
- `EPERM` from `fsyncSync` on a read-only (`"r"`) fd: Windows `FlushFileBuffers` requires a writable handle.
- `EBUSY` from `rename`/`rm` of just-closed files. The existing `renameWithRetry` helped some sites, but its fixed backoff could not outlast a handle that was never finalized, and several `rm` sites had no retry at all.

The error surfaced only as `Unknown error` because the CLI error renderer collapses it; `--log-level debug` reveals the `EPERM: ... fsync` / `EBUSY: ... rename` chain.

Fix (`apps/local/src/db/v1-v2-migration.ts`)
- `fsyncFileIfExists` opens the file read-write so the flush is permitted on Windows, and treats fsync as best-effort durability hardening (the rename/copy already wrote the bytes).
- `.source-*`/`.building-*` removals now retry `EBUSY`/`EPERM` the same way `renameWithRetry` already did.
- `renameWithRetry` and `rmWithRetry` force a GC pass on each retry (`Bun.gc(true)`, no-op off Bun) so libSQL's native finalizer releases the handle before the next attempt.

Verification
Reproduced and fixed on a real Windows VM by running the published bundle against a seeded v1 database:
- Before the fix: `EPERM: operation not permitted, fsync` then `EBUSY` on `rm`/`rename`, daemon never ready.
- After the fix: `Migrated local Executor data to v2; moved old DB to ...data.db.v1-v2-<ts>`, then `EXECUTOR_READY:<port>` / `Daemon ready`. The v2 db and the `v1-v2-*` backup are present and temp files cleaned up.

`typecheck` passes; the migration unit tests pass except 3 pre-existing crash-recovery tests that fail identically on `main` in this environment (unrelated to this change). The Windows desktop smoke job in CI exercises this exact path.
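The retry-with-GC-nudge behaviour described for `renameWithRetry` and `rmWithRetry` can be sketched as below. This is an assumption-laden illustration rather than the PR's implementation: `withRetry`, `sleepSync`, the attempt count, and the linear backoff are all hypothetical, and the sketch is synchronous for brevity. Only the `EBUSY`/`EPERM` retry set and the `Bun.gc(true)` call come from the description above.

```typescript
// Error codes the description says are retried on Windows.
const RETRYABLE_CODES = new Set(["EBUSY", "EPERM"]);

// Ask the runtime to collect garbage so native finalizers can run.
// Bun.gc exists only under Bun; on other runtimes this is a no-op.
function nudgeNativeHandleRelease(): void {
  const bun = (globalThis as unknown as { Bun?: { gc?: (force: boolean) => void } }).Bun;
  bun?.gc?.(true);
}

// Blocking sleep (hypothetical helper) so the sketch stays synchronous.
function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Hypothetical generic wrapper: run op, and on EBUSY/EPERM nudge the GC,
// back off, and try again until the attempt budget is spent.
function withRetry<T>(op: () => T, attempts = 5, delayMs = 50): T {
  for (let attempt = 1; ; attempt++) {
    try {
      return op();
    } catch (error) {
      const code = (error as { code?: string }).code;
      const retryable = code !== undefined && RETRYABLE_CODES.has(code);
      if (!retryable || attempt >= attempts) throw error;
      nudgeNativeHandleRelease(); // let a lingering handle be finalized first
      sleepSync(delayMs * attempt); // assumed linear backoff
    }
  }
}

// Demo: an operation that reports a busy handle twice, then succeeds.
let calls = 0;
const result = withRetry(() => {
  calls++;
  if (calls < 3) throw Object.assign(new Error("simulated busy handle"), { code: "EBUSY" });
  return "removed";
}, 5, 1);
console.log({ calls, result });
```

In real use the callback would wrap a call such as `rmSync(path, { force: true })` or `renameSync(from, to)`. Forcing the GC before sleeping matters because a longer backoff alone cannot help when the handle is only released by a finalizer that has not yet run.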