Skip to content

fix(file-handler): delete-db no-ops when the storage file is already gone, resurrecting deleted files#56

Open
cosmic-fire-eng wants to merge 1 commit into
vrtmrz:mainfrom
cosmic-fire-eng:fix/offline-scanner-delete-tombstone
Open

fix(file-handler): delete-db no-ops when the storage file is already gone, resurrecting deleted files#56
cosmic-fire-eng wants to merge 1 commit into
vrtmrz:mainfrom
cosmic-fire-eng:fix/offline-scanner-delete-tombstone

Conversation

@cosmic-fire-eng

Copy link
Copy Markdown

What

In the offline-scanner delete-db path, a file that has already been removed from storage is never tombstoned in the database, so the next scan resurrects it. With headless / CLI synchronisation this shows up as deletions silently reappearing and zero tombstones ever being written (in one reproduction, a delete cycle logged doc_del_count=0 while the daemon emitted dozens of DELETE DATABASE lines — the deletes were attempted but never persisted).

Why

ServiceFileHandlerBase.deleteFileFromDB() calls infoToStub() first, which stats storage to build the stub. In the delete-db branch the file is by definition already gone from storage, so the stub is always null and the method returns false without touching the database. Worse, offlineScanner ignored that return value and cleared the file's last-seen entry from the status map regardless. On the following scan the document looks database-only with no last-seen record, is classified as update-storage, and the deleted file is written back from the database.

Fix (two small, independent parts)

  1. deleteFileFromDB: when the storage stub is null, fall back to a path-based database delete — fetch the entry by path and, if it exists and is not already deleted, delete it by path. This mirrors what the CLI rm command already does. It still returns false when there is genuinely nothing to delete.
  2. offlineScanner: only clear the fileStatusMap last-seen record when the database delete actually succeeded. If it returns false, keep the record and log a notice, so a no-op delete can never feed the resurrection path.

Net effect: a real tombstone reaches the database and the file stays deleted across scans.

Tests

  • npm run tsc-check and npm run test:unit pass (one unrelated, pre-existing bgWorker env failure aside).
  • Added a focused unit test for the delete-db false-return path: the last-seen record is retained, a notice is logged, and the file is not resurrected.
  • Manual repro: create + sync a file; delete it from storage; run the offline scanner twice. Pre-fix the file reappears and no tombstone is written; post-fix the first scan deletes by path, the DB entry is tombstoned, and the second scan does not resurrect it. Storage-present deletes (non-null stub) are unchanged.

…dy-removed file

In the offline-scanner delete-db path a file that is already gone from
storage is never tombstoned in the database, so the next scan resurrects
it. With headless or CLI synchronisation this shows up as deletions
silently reappearing and no tombstone ever being written.

deleteFileFromDB() calls infoToStub() first, which stats storage to build
the stub. In the delete-db branch the file is by definition already gone
from storage, so the stub is always null and the method returns false
without touching the database. The offline scanner then ignored that
return value and cleared the file's last-seen entry from the status map
regardless. On the following scan the document looks database-only with
no last-seen record, is classified as update-storage, and the deleted
file is written back from the database.

Two small, independent parts:

1. deleteFileFromDB: when the storage stub is null, fall back to a
   path-based database delete. Fetch the entry by path and, when it
   exists and is not already deleted, delete it by path. This mirrors
   what the CLI rm command already does. It still returns false when
   there is genuinely nothing to delete.

2. offlineScanner: only clear the fileStatusMap last-seen record when the
   database delete actually succeeded. When it returns false, keep the
   record and log a notice, so a no-op delete can never feed the
   resurrection path.

Adds a focused unit test covering the false-return path: the last-seen
record is retained and the file is not resurrected.

Co-Authored-By: Claude <noreply@anthropic.com>

@vrtmrz vrtmrz left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your PR! You are quite right. I also think that this is the correct way!
However, would you mind if I asked you to hear your intention, please? (If it was intentional, then I will merge it as it is).

this._log(`File ${tryGetFilePath(info)} is not exist on the storage`, LOG_LEVEL_VERBOSE);
return false;
}
const entryByPath = await this.db.fetchEntry(path as FilePathWithPrefix, undefined, true, true);

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are not going to use the contents, I think checking with fetchEntryMeta is another option, as it is lighter and does not involve a network request. Is this intentional?
Alternatively, if the idea is that the file should only be manipulated when all the chunks are present, that makes sense too.

The implementation in the CLI is a bit more naive (Given that it is a CLI, this is actually quite good in its own way), though. Of course, your PR's style is better in the plug-in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants