fix: add auth self-healing after repeated unlock failures #131
Hi @pwojciechowski,
thanks for the PR. I have some minor concerns, and then it can be merged.
EDIT: Please make sure to write a changelog in the Chart.yaml and bump the versions there.
cease_continuous_run = threading.Event()

class ScheduleThread(threading.Thread):
    @classmethod
Please add this decorator back. We should keep this thread contained in its own object.
Good catch. I restored @classmethod on ScheduleThread.run.
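For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a background scheduler thread that keeps `@classmethod` on `run`. Only `cease_continuous_run`, `ScheduleThread`, and the decorator come from the diff. The `run_continuously` wrapper, the `run_pending` callable, and the `interval` parameter are assumptions for illustration and may not match the operator's actual code.

```python
import threading


def run_continuously(run_pending, interval=1.0):
    """Call run_pending() repeatedly on a background thread.

    Hypothetical helper: returns the stop event and the thread so the
    caller can shut the loop down and join it.
    """
    cease_continuous_run = threading.Event()

    class ScheduleThread(threading.Thread):
        @classmethod
        def run(cls):
            # Loop state lives in the enclosing closure, not on the instance.
            while not cease_continuous_run.is_set():
                run_pending()
                # wait() returns early once the event is set, so stopping is prompt.
                cease_continuous_run.wait(interval)

    thread = ScheduleThread(daemon=True)
    thread.start()
    return cease_continuous_run, thread
```

Calling `stop.set()` on the returned event ends the loop after the current iteration, and `thread.join()` then waits for the thread to finish.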
configured_threshold = os.environ.get(
    "BW_AUTH_FAILURE_THRESHOLD", str(AUTH_FAILURE_THRESHOLD)
)
try:
This try/except block seems to be used to manage program flow. Please refactor it into an if/else statement and do not use try/except for that.
Agreed. I refactored the auth threshold/signin path to explicit return-value and if/else logic. Rookie mistake ;)
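One way the threshold lookup could be written without try/except is sketched below. The environment variable name, the `AUTH_FAILURE_THRESHOLD` constant, and the default of 3 come from this PR. The function name `get_auth_failure_threshold` and the choice to fall back to the default on invalid input are assumptions, not necessarily what the refactored operator does.

```python
import os

AUTH_FAILURE_THRESHOLD = 3  # default stated in the PR description


def get_auth_failure_threshold(environ=os.environ):
    """Return the configured threshold, or the default if it is invalid.

    Hypothetical helper: validates the value up front instead of
    catching ValueError from int().
    """
    raw = environ.get(
        "BW_AUTH_FAILURE_THRESHOLD", str(AUTH_FAILURE_THRESHOLD)
    ).strip()
    # isdecimal() is False for empty strings, signs, and decimal points,
    # so int() cannot raise on anything that passes this check.
    if raw.isdecimal() and int(raw) > 0:
        return int(raw)
    return AUTH_FAILURE_THRESHOLD
```

Passing the environment as a parameter keeps the function easy to unit test without patching `os.environ`.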
    auth_failures = 0
    logger.info("Authentication recovery succeeded")
except BitwardenCommandException as exc:
    logger.error(f"Authentication recovery failed: {exc}")
Shouldn't we exit in this case to let Kubernetes retry with a fresh run?
Yes, implemented: if auth recovery still fails after threshold + recovery attempt, the operator now logs the failure and exits with sys.exit(1) so Kubernetes restarts it cleanly.
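The flow described in this reply (count failures, attempt recovery at the threshold, exit if recovery fails) might look roughly like the sketch below. The `AuthGuard` class, its `recover` callable, and the injectable `exit_fn` are hypothetical names chosen for illustration. Only the threshold, the log-and-exit behavior, and `sys.exit(1)` come from the discussion.

```python
import logging
import sys

logger = logging.getLogger(__name__)


class AuthGuard:
    """Track consecutive unlock failures and trigger one recovery attempt.

    Hypothetical helper, not the operator's actual implementation.
    """

    def __init__(self, threshold, recover, exit_fn=sys.exit):
        self.threshold = threshold
        self.recover = recover  # assumed callable: returns True on success
        self.exit_fn = exit_fn  # injectable so tests do not kill the process
        self.failures = 0

    def record(self, unlock_ok):
        if unlock_ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures < self.threshold:
            return
        if self.recover():
            self.failures = 0
            logger.info("Authentication recovery succeeded")
        else:
            logger.error("Authentication recovery failed")
            # A non-zero exit lets Kubernetes restart the pod cleanly.
            self.exit_fn(1)
```

Using explicit return values from `recover` keeps the control flow in plain if/else, in line with the earlier review comment.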
@pwojciechowski Please make sure to write a changelog in the Chart.yaml and bump the versions there.
Done. Chart.yaml updated.
Lerentis left a comment
LGTM
Thanks again for sending this PR
Summary
This PR adds bounded auth self-healing to the Bitwarden CRD operator and documents the new tuning option in Helm/root docs.
Problem
In some environments, the operator can enter a persistent auth failure loop:
- You are not logged in.
- Failed to unlock vault
- update_managed_secret failures

When this happens, periodic retries continue but may not recover without a manual pod restart.
What changed
Auth recovery logic
- bitwarden_signin
- BW_AUTH_FAILURE_THRESHOLD (default: 3)
- BW_SESSION
- bw logout (best effort)
- ~/.config/Bitwarden CLI/data.json

Documentation and chart values
- BW_AUTH_FAILURE_THRESHOLD to: values.yaml env snippet comments

Why this approach
Files changed
- src/bitwardenCrdOperator.py
- tests/test_bitwarden_signin_recovery.py
- README.md
- charts/bitwarden-crd-operator/values.yaml
- charts/bitwarden-crd-operator/README.md

Tests
Added unit tests for:
Local verification run: