Manual merge stable into release-1.10#9533
Merged
Merged
Conversation
…9506) * test: reusable lock recording adapter (#9439) * adding new lock test adapter * make the lock installation a fixture * use constants * typing issues * docs(test): trim LockTimeline docstring to one line Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(test): give RecordingLock an explicit typed constructor ty (the CI type checker) does not honor the mypy-style `# type: ignore[arg-type]`, so forwarding `*args: object` into InfrahubLock.__init__ failed ty check. Mirror the base constructor signature and forward named arguments instead, dropping the ignore. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: accept a collection of lock names in assert_never_overlap Generalize assert_never_overlap to take a Collection[str] and verify that no two of the given locks are ever held simultaneously, instead of being limited to exactly two lock names. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: split checkpoint assertion and require injected timeline - Replace the confusing `expected` flag on assert_held_at_checkpoint with two intent-revealing methods (assert_held_at_checkpoint / assert_not_held_at_checkpoint) delegating to a shared private helper. - Make `timeline` a required argument of RecordingLockRegistry instead of constructing one internally; install_recording_lock_registry now owns the default construction and injects it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): run object import outside the repository lock during sync (#9440) * create a seam to inject the new test adapter * refactor(git): split repo sync into git-state collection and import Extract collect_pending_imports() which performs the on-disk git mutations (fetch, branch creation, pull, commit-worktree pinning) and returns one entry per branch to import, instead of importing inline. sync() now runs the returned imports. Behavior is preserved; this is the seam that lets the import run outside the repository lock. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): run object import outside the repository lock during sync The repository lock is held to serialize git working-copy mutations, but it was also kept reserved during the object import, which is the slow phase when a branch carries conflicting data. Narrow the lock so it covers only the git-state work: - sync_repository now locks collect_pending_imports() and runs the imports after releasing the lock; the import target is injectable for testing. - sync_remote_repositories no longer wraps the origin sync and the reinit-fallback import in its outer repository lock; only the local init/reinit stays locked. Imports read from immutable per-commit worktrees pinned during the locked phase, so they no longer need the lock. Resolves the long lock reservation reported in IFC-1566. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(git): assert object import runs outside the repository lock Drives sync_repository with a recording lock registry and an injected recording importer, asserting the repository lock is not held when the import runs. Replaces the earlier pydantic-recast/flow-override scaffold with the injected-importer seam. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(git): create RepositorySyncer and RepositoryImporter components Replace the injected callable with a RepositoryImporter implementation that checkpoints into the lock timeline, asserting the repository lock is not held when the import runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(git): drop redundant lock-scope comment in sync flow The narrow async-with scope is self-evident from the structure, and the intent is pinned by test_repository_lock_released_before_import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(git): rename BranchImport to PendingObjectImport The dataclass models a pending import of repository objects (the target Infrahub branch and the source commit), not a branch. Rename it and the corresponding parameter so they read as queued import work, matching collect_pending_imports / pending_imports. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(git): call assert_not_held_at_checkpoint for the released-lock check The test invoked assert_held_at_checkpoint with an unsupported expected= kwarg. Use the adapter's dedicated assert_not_held_at_checkpoint, which expresses the intent that the repository lock is released before import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): run object import outside the repository lock when adding a repository (#9441) * create a seam to inject the new test adapter * fix(git): run object import outside the repository lock during sync The repository lock is held to serialize git working-copy mutations, but it was also kept reserved during the object import, which is the slow phase when a branch carries conflicting data. Narrow the lock so it covers only the git-state work: - sync_repository now locks collect_pending_imports() and runs the imports after releasing the lock; the import target is injectable for testing. - sync_remote_repositories no longer wraps the origin sync and the reinit-fallback import in its outer repository lock; only the local init/reinit stays locked. Imports read from immutable per-commit worktrees pinned during the locked phase, so they no longer need the lock. Resolves the long lock reservation reported in IFC-1566. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(git): create RepositorySyncer and RepositoryImporter components Replace the injected callable with a RepositoryImporter implementation that checkpoints into the lock timeline, asserting the repository lock is not held when the import runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): run object import outside the repository lock when adding a repository add_git_repository held the repository lock for the whole flow (local clone, default-branch object import, and origin sync). Narrow it to cover only the local clone and pinning its commit worktree; the import (via the injected RepositoryImporter) and the origin sync (via RepositorySyncer) now run after the lock is released. Imports read from the immutable per-commit worktree, so they no longer need the lock. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(git): assert add_git_repository imports outside the repository lock Add a no-mock test driving add_git_repository with a recording lock registry and a recording RepositoryImporter, asserting the repository lock is not held when the import runs. Update the existing mock-based create test for the new structure (import via the importer, sync via RepositorySyncer). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(git): drop redundant add_git_repository lock comment The narrow async-with scope is self-evident, and the import-outside-lock invariant is pinned by test_add_git_repository_releases_lock_before_import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(git): restore create_commit_worktree as a local-only primitive Revert the on-demand remote fetch that was placed inside the worktree primitive. The fetch mutates the shared clone and belongs at the orchestration layer under the repository lock, not in a synchronous local working-copy primitive. The primitive again raises CommitNotFoundError when the commit is absent from the local clone. The fetch-on-miss safety net is re-homed under the repository lock in a follow-up commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): fetch a missing commit under the repository lock during init When a task references a commit that this worker's clone has not fetched yet, init() now fetches once under the repository lock and retries the worktree creation before surfacing CommitNotFoundError. The lock serializes the shared-clone fetch against concurrent resets and merges. This re-homes, at the orchestration layer, the recovery that previously lived inside the create_commit_worktree primitive, and keeps the safety net for the window the eventually-consistent sync fan-out leaves open: a dispatched task can outrun the fetch broadcast, or run on a restarted or newly added worker. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor: recording adapter test package + common class RecordingImporter * refactor: renamed locally overridden object * test(git): assert CommitNotFoundError message in worktree and init tests Add match= to the CommitNotFoundError assertions so they verify the commit SHA and repository identifier carried in the message, not just the exception type. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(git): tighten init fetch-retry comment Keep only the non-obvious rationale (the commit may be remote-only; the lock serializes shared-clone mutations); drop the unknowable database provenance and the enumerated operation list. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(git): use the recording lock adapter in the add-repo create test Replace the patched lock mock and the weak assert_any_call with install_recording_lock_registry and a timeline assertion that the repository lock was acquired, removing unittest.mock from the lock check in this test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * tests: use proper fixtures * refactor(git): extract RepositoryAdder and drop the add_git_repository importer param Move the locked clone/worktree and the outside-lock default-branch import into a RepositoryAdder component (sibling to RepositorySyncer). The task no longer needs an injectable importer parameter and always uses RepositoryFileImporter; the lock-release test injects the recording importer through the component instead. Invert the active-status check to return early. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): pin the sync import commit while the repository lock is held sync_remote_repositories resolved the import commit from the live branch head after releasing the lock, so a concurrent operation could advance the branch between lock release and import. Capture the commit while the lock is held and pass it to import_objects_from_files, matching the add and sync-via-syncer flows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): isolate per-branch failures while collecting pending imports collect_pending_imports performed the git working-copy side of a sync for every branch and only then returned the imports for the caller to apply. A failure on a later branch aborted the whole collection, dropping the imports for branches already processed in the same run. Wrap each branch in its own error boundary so a single failing branch is logged and skipped while the rest are still imported. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(git): mark git_repos_dir as a usefixtures dependency git_repos_dir is needed only for its side effect (pointing the repositories directory at a temp path); it was declared as an unused parameter. Request it via usefixtures so the dependency is explicit and not mistaken for dead code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor: split the method, too many statements * fix docstring collect_pending_imports * docs: clarify locked-phase scope in import docstring Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): log traceback when skipping a branch during import collection The per-branch error boundaries swallow the exception, so without exc_info the traceback was lost. Pass exc_info=exc to capture the skipped branch's call stack; the exception message is already part of the rendered traceback, so no separate error field is added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(git): catch CommitNotFoundError in repository failure handlers create_commit_worktree now raises CommitNotFoundError, a sibling of RepositoryError rather than a subclass, so the existing per-branch isolation and failure-tagging handlers no longer caught a missing-commit failure they previously did. Add CommitNotFoundError explicitly to the relevant except clauses so a missing commit skips the branch / tags and skips the repo instead of aborting the whole sync. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * chore: add changelog fragment for repository lock narrowing [#6639] Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs: update testing documentation for lock test adapter --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…to-release-1.10 # Conflicts: # backend/infrahub/git/integrator.py # backend/infrahub/git/tasks.py
Contributor
There was a problem hiding this comment.
1 issue found
Confidence score: 3/5
- There is a concrete regression risk in
backend/infrahub/git/repository.py: an uncaught staging pull failure can terminate sync early, which may skip applying imports already collected from other branches. - The issue is high-confidence and medium-high severity (7/10, confidence 9/10), so this is more than a minor edge-case and could affect user-visible sync completeness.
- Pay close attention to
backend/infrahub/git/repository.py- ensure staging pull errors are handled so sync can continue or fail in a controlled way without dropping previously gathered imports.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/infrahub/git/repository.py">
<violation number="1" location="backend/infrahub/git/repository.py:194">
P1: Uncaught staging pull failure can abort sync before applying already-collected imports from other branches.</violation>
</file>
Shadow auto-approve: would not auto-approve because issues were found.
Re-trigger cubic
| f"An update was detected but the commit remained the same after pull() ({commit_after}).", | ||
| repository=self.name, | ||
| branch=self.default_branch, | ||
| commit_after = await self.pull(branch_name=self.default_branch) |
Contributor
There was a problem hiding this comment.
P1: Uncaught staging pull failure can abort sync before applying already-collected imports from other branches.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/infrahub/git/repository.py, line 194:
<comment>Uncaught staging pull failure can abort sync before applying already-collected imports from other branches.</comment>
<file context>
@@ -121,26 +176,36 @@ async def sync(self, staging_branch: str | None = None) -> None:
- f"An update was detected but the commit remained the same after pull() ({commit_after}).",
- repository=self.name,
- branch=self.default_branch,
+ commit_after = await self.pull(branch_name=self.default_branch)
+ if isinstance(commit_after, str):
+ return [
</file context>
Suggested change
| commit_after = await self.pull(branch_name=self.default_branch) | |
| try: | |
| commit_after = await self.pull(branch_name=self.default_branch) | |
| except (RepositoryError, CommitNotFoundError, GitCommandError, ValueError) as exc: | |
| log.warning( | |
| "Failed to pull staging source branch for import, skipping it.", | |
| repository=self.name, | |
| branch=self.default_branch, | |
| exc_info=exc, | |
| ) | |
| return [] |
Contributor
Author
There was a problem hiding this comment.
This could be an issue in the future if the code changes that's true. But for now, there is no other branch to collect import from when reaching this code.
- If
self.internal_status == RepositoryInternalStatus.STAGING.valueisTrue, there is no import collected before reaching the point. - If
self.internal_status == RepositoryInternalStatus.ACTIVE.value, then this code returns early.
Contributor
There was a problem hiding this comment.
Got it — thanks for the clarification.
ajtmccarty
approved these changes
Jun 11, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Supersedes #9529, which was blocked on merge conflicts.
Summary
Merges
stableintorelease-1.10. This forward-ports the repository-lock-narrowing work (only guarding clone/fetch/worktree mutations, running object import after the lock is released) plus theCommitNotFoundErrorhardening.Conflicts resolved
backend/infrahub/git/integrator.py— both sides edited the same import region.stablewidenedfrom infrahub import configtoconfig, lock;release-1.10addedAnonymousSessionandInfrahubContext. Kept all three imports.backend/infrahub/git/tasks.py— both sides edited the same import region.stableaddedCommitNotFoundError;release-1.10addedGitRepositoryNodeQuery. Kept both.Both were mechanical import collisions. The rest of the refactor (
infrahub/git/sync.py,RepositorySyncer/RepositoryAdder/RepositoryFileImporter, thesync→collect_pending_importssplit, lock test adapter) auto-merged. Verified the auto-merge is correct: all kept imports are used, every.sync(call site uses the new component API (no stale old-API calls), andrelease-1.10'sGitRepositoryNodeQueryusages live in separate functions from the refactored sync path.Conflicts raised for review
None — both conflicts were mechanical.
Generated files regenerated
None — the merge touched no generated files.
Validation
py_compile,ruff check,ty check— all pass on the changed git module.backend/tests/unit/adapters/test_lock.py— 6/6 pass.backend/tests/component/git/{test_git_repository,test_git_rpc,test_sync_lock_scope}.pyand the integration suite — CI covers these.Summary by cubic
Merges
stableintorelease-1.10, forward‑porting narrower Git repository locking andCommitNotFoundErrorhardening. Imports now run outside the repo lock; missing commits are fetched once during init under the lock.Refactors
RepositorySyncer,RepositoryAdder, andRepositoryFileImporter.collect_pending_imports(under lock) and an import loop (outside lock); pin the exact commit during the locked phase before importing.Bug Fixes
create_commit_worktreeraisesCommitNotFoundErrorwhen a commit is missing locally;InfrahubRepository.initfetches once under the repository lock and retries.CommitNotFoundErroralongsideRepositoryErrorto tag/skip instead of aborting; worker notifications include the pinned commit viaRefreshGitFetch.Written for commit 1c0ad67. Summary will update on new commits.