
appium: runner Auto-Lock requirement (real appium-smoke flake fix) + idle-wait hardening#1054

Merged
kar merged 4 commits into main from fix/appium-smoke-notification-alert-race
May 19, 2026
@kar kar commented May 19, 2026

Root cause of the appium-smoke flakiness: CI-runner device Auto-Lock

A controlled experiment settled it. On the same commit, with no code change other than the runner iPhone's Auto-Lock disabled:

| Phase | Result |
| --- | --- |
| Device could sleep (Auto-Lock on) | Effectively all FAIL |
| Auto-Lock = Never | 4 / 4 PASS |

When the iPhone auto-locks or sleeps mid-run, WebDriverAgent returns black screenshots and an empty/SpringBoard accessibility tree, so element lookups fail intermittently. The failures surface as `element ("~automation.power_toggle") still not existing` or `Failed to find General / Allmänt in Settings`, and both pass on re-run. It was never the notification-permission race or WDA quiescence (earlier theories in this PR's history were wrong and are reverted).

The actual fix is runner configuration (documented here, not code)

automation/appium/README.md now requires, as one-time device setup: Settings ▸ Display & Brightness ▸ Auto-Lock ▸ Never, device kept unlocked and on power. CI self-hosted-runner devices must have this set.

What remains in this PR (low-risk hardening only)

  • appium:waitForIdleTimeout: 0 — legitimate, widely-recommended Flutter+Appium capability; reduces a real class of WDA quiescence stalls regardless of this flake.
  • Minimal acceptNotificationAlert change (don't bail on the first "no modal dialog" miss; poll within the timeout) + a notification-accept call in enableDnsProfile.
  • Reverted: the speculative aggressive "dismiss SpringBoard prompt without getAlertText" rewrite (unvalidated; not the cause).
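The capability in the first bullet can be sketched as follows. This is a minimal illustration, not the contents of `capabilities.mjs`: the helper name `buildIosCapabilities` and the `udid` / `bundleId` placeholders are assumptions, and only `appium:waitForIdleTimeout: 0` comes from this PR.

```typescript
// Sketch of an iOS capability set with idle-waiting disabled.
// Everything except "appium:waitForIdleTimeout" is a placeholder for illustration.
type Capabilities = Record<string, string | number | boolean>;

function buildIosCapabilities(udid: string, bundleId: string): Capabilities {
  return {
    platformName: "iOS",
    "appium:automationName": "XCUITest",
    "appium:udid": udid,
    "appium:bundleId": bundleId,
    // 0 disables WebDriverAgent's wait for the app to become idle before each command.
    "appium:waitForIdleTimeout": 0,
  };
}
```

Disabling the idle wait is a common choice for apps that animate continuously, where the app may never report itself as idle.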

Branch history shows the investigation (v1→v2→v3→revert+docs); squash-merge recommended.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 19, 2026 10:25

Copilot AI left a comment


Pull request overview

Fixes flakiness in the DNS onboarding smoke test caused by an asynchronously-presented iOS notification permission alert racing the Settings hand-off. The previous acceptNotificationAlert() returned immediately if no alert was present at the moment of first check, allowing a late-arriving prompt to overlay Settings and break navigation.

Changes:

  • alerts.ts: replace early return on "no modal dialog open" with continued polling until the timeout, so an async-delayed prompt is still caught.
  • settings.ts: call acceptNotificationAlert() at the start of enableDnsProfile() to clear any permission prompt before navigating Settings.
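The polling change described above might look roughly like the sketch below. It is not the code in `alerts.ts`: the `AlertDriver` interface, the function name `acceptAlertWithinTimeout`, and the default timings are assumptions made to keep the example self-contained.

```typescript
// Minimal stand-in for the two driver calls this sketch needs (hypothetical interface).
interface AlertDriver {
  getAlertText(): Promise<string>; // rejects, e.g. "no modal dialog open", when no alert is up
  acceptAlert(): Promise<void>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves true if an alert was accepted before the deadline, false otherwise.
async function acceptAlertWithinTimeout(
  driver: AlertDriver,
  timeoutMs = 5000,
  intervalMs = 250,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      await driver.getAlertText();
      await driver.acceptAlert();
      return true;
    } catch {
      // No alert visible yet: the prompt may still arrive, so wait and retry
      // instead of returning on the first miss.
      await sleep(intervalMs);
    }
  }
  return false;
}
```

A caller such as `enableDnsProfile()` could await this before navigating Settings and ignore a `false` result, since the absence of a prompt is not an error.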

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| automation/appium/wdio/src/flows/alerts.ts | Keep polling for an alert until the timeout instead of bailing on the first "no modal" error. |
| automation/appium/wdio/src/flows/settings.ts | Invoke acceptNotificationAlert() before navigating in enableDnsProfile() to clear a late-surfacing permission prompt. |


kar added a commit that referenced this pull request May 19, 2026
…tAlertText

PR #1054 v1 was insufficient — CI failure screenshots show the iOS
permission alert IS up at the Settings hand-off yet the run still fails
identically. Root cause refined: notification-permission prompts are
SpringBoard system alerts that driver.getAlertText() does NOT expose
(throws "no modal dialog open" while visible). All dismissal paths
(mobile: alert / ~Allow element tap / acceptAlert) were gated behind a
successful getAlertText(), so the function returned at `if (!alertText)`
and never tapped Allow — the prompt then blocked the hand-off.

Restructure acceptNotificationAlert: poll up to the timeout and, every
iteration, attempt mobile:alert accept, the Allow-style element tap, and
acceptAlert REGARDLESS of getAlertText. getAlertText is now only used to
bail on a *different* non-notification alert (kept safe) or take the
locale path when WDA does expose it. The SpringBoard prompt is reachable
as a tappable element even though the W3C alert API can't see it — the
path the previous code never reached.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
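
The restructuring described in this commit message (try every dismissal path on each iteration, regardless of what `getAlertText` reports) can be sketched as below. The names `DismissAttempt` and `dismissPromptWithinTimeout` are hypothetical, and the sketch omits the check that bails on a different, non-notification alert.

```typescript
// One dismissal path; resolves true if it believes it dismissed the prompt (hypothetical type).
type DismissAttempt = () => Promise<boolean>;

async function dismissPromptWithinTimeout(
  attempts: DismissAttempt[],
  timeoutMs = 5000,
  intervalMs = 250,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    // Try every path on every iteration; none is gated behind another succeeding.
    for (const attempt of attempts) {
      try {
        if (await attempt()) return true;
      } catch {
        // This path cannot see the prompt right now; fall through to the next one.
      }
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

In a WebdriverIO flow the `attempts` array would presumably wrap the three paths named above: the `mobile: alert` accept action, a tap on the Allow-style element, and `acceptAlert`.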
@kar kar changed the title from "fix(appium-smoke): stop notification-permission alert racing the Settings hand-off" to "appium: runner Auto-Lock requirement (real appium-smoke flake fix) + idle-wait hardening" May 19, 2026
…+ idle-wait hardening

A controlled experiment settled the long-standing appium-smoke flakiness:
on the SAME commit, with no code change other than disabling the runner
iPhone's Auto-Lock, the smoke went from effectively all-FAIL to 4/4 PASS.

Root cause: the CI-runner device auto-locking/sleeping mid-run. When the
iPhone sleeps, WebDriverAgent returns black screenshots and an empty /
SpringBoard accessibility tree, so element lookups fail intermittently
(`element ("~automation.power_toggle") still not existing`,
`Failed to find General / Allmänt in Settings`) and pass on re-run. It
was never the notification-permission race or WDA quiescence.

Changes:
- automation/appium/README.md: document the actual remedy — device
  Auto-Lock must be Never (one-time/runner setup); CI self-hosted-runner
  devices must have this set. This is the fix; it is device config.
- capabilities.mjs: appium:waitForIdleTimeout: 0 — legitimate, low-risk
  Flutter+Appium hardening (independent of this flake).
- alerts.ts: minimal robustness — acceptNotificationAlert keeps polling
  within its timeout instead of returning on the first "no modal" miss.
- settings.ts: clear a stray notification prompt before navigating in
  enableDnsProfile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kar kar force-pushed the fix/appium-smoke-notification-alert-race branch from 8d5d13b to d0792a4 Compare May 19, 2026 13:00
kar and others added 3 commits May 19, 2026 15:20
Reverting this in the cleanup was premature. Evidence: the 4/4 awake-phase
PASS runs were on a commit that still INCLUDED this v2 handling; after it
was reverted (v1-only), the awake device failed mode-1
("Failed to find General / Allmänt", notification prompt up at the
app->Settings hand-off) 2/2. So Auto-Lock fixed the device-sleep flake
(mode-2) and this v2 dismissal addresses the separate notification-prompt
flake (mode-1); both are needed.

Restores acceptNotificationAlert to the version that polls within the
timeout and attempts mobile:alert / Allow-element tap / acceptAlert every
iteration regardless of getAlertText (SpringBoard permission prompts are
not exposed via getAlertText but are tappable as elements).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fix)

The manual "Auto-Lock = Never" device setting does not persist (resets on
reconnect/reboot), so the device-asleep flake recurs. Make each run
self-sufficient instead of relying on that setting:

- make appium-test: after install, before run-wdio, relaunch the app via
  `xcrun devicectl device process launch --terminate-existing` — this
  wakes the screen and foregrounds the app, closing the gap where the
  device sleeps during the long build/install and WDA then runs against a
  slept device. Best-effort (WDA also activates the app); stderr surfaces
  failures.
- wdio.conf.ts: an `after` hook locks the device (browser.lock()) when
  the run finishes, so the CI iPhone is not left awake at full brightness
  between runs. Wrapped so a lock failure never fails the suite.

Requires the CI device to have no passcode (already required for any
automated unlock). Net cycle: sleep between runs, deterministically wake
for each test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
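
For illustration, the wake step's command line can be expressed as an argument list. The real step lives in the `make appium-test` target, so this TypeScript helper is only a sketch: `buildWakeArgs`, the `--device` selector, and the device and bundle identifiers are assumptions, while `--terminate-existing` is the flag named in the commit message.

```typescript
// Builds the arguments passed to `xcrun` for the wake-and-relaunch step.
// deviceId and bundleId are placeholders supplied by the caller.
function buildWakeArgs(deviceId: string, bundleId: string): string[] {
  return [
    "devicectl", "device", "process", "launch",
    "--device", deviceId,
    // Kill any running instance first so the relaunch foregrounds a fresh app.
    "--terminate-existing",
    bundleId,
  ];
}
```

The resulting array could be handed to a process runner such as Node's `execFile("xcrun", args)`; treating a non-zero exit as a warning rather than an error keeps the step best-effort.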
The wake step's success path was piped to /dev/null and the lock hook
only logged on failure, so a run could not prove either actually ran
(valid concern raised in review). Now both emit explicit log lines on
every run:
- make appium-test: "waking device ..." + "device wake OK|FAILED"
- wdio.conf.ts after hook: "Post-run: locking device screen" +
  "device screen locked" (or the existing failure warning)

No behavior change — observability only, so each smoke run self-verifies
the mode-2 wake/sleep lifecycle is active.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
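
A post-run lock that logs on every path and never fails the suite might look like the sketch below. The `LockableDevice` interface and `lockDeviceAfterRun` name are hypothetical, and the failure message is invented for illustration; the two success-path log lines are the ones quoted in the commit message.

```typescript
// Minimal stand-in for the WebdriverIO browser object (hypothetical interface).
interface LockableDevice {
  lock(): Promise<void>;
}

type Log = (message: string) => void;

// Resolves true if the device was locked, false if locking failed; never throws.
async function lockDeviceAfterRun(
  device: LockableDevice,
  log: Log = console.log,
): Promise<boolean> {
  log("Post-run: locking device screen");
  try {
    await device.lock();
    log("device screen locked");
    return true;
  } catch (err) {
    // Swallow the error so a lock failure cannot fail the suite.
    log(`device screen lock failed (ignored): ${String(err)}`);
    return false;
  }
}
```

In `wdio.conf.ts` this would presumably be called from the `after` hook, for example `after: async () => { await lockDeviceAfterRun(browser); }`.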

kar commented May 19, 2026

There were two main causes of the flaky smoke test runs:

  • The timing of the system notification prompt, which would sometimes interfere.
  • A sleeping device (screen off): the test can sometimes still execute, but it usually fails.

Both are addressed. The smoke test now wakes the device before the run and puts it to sleep afterwards. Tested with 5 runs in a row; all succeeded.

@kar kar merged commit 058f7c2 into main May 19, 2026
3 checks passed
@kar kar deleted the fix/appium-smoke-notification-alert-race branch May 19, 2026 15:07
