Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
87 changes: 87 additions & 0 deletions .github/workflows/tui-tests.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
name: TUI Tests

# Interactive terminal (TUI) smoke suite. Drives the real coven-code binary
# through a tmux pseudo-terminal and asserts on the rendered output.
# Offline / headless-first: no live model call, no network, no credentials.
# See scripts/tui-tests/README.md.

on:
push:
branches: [main]
paths:
- 'src-rust/**'
- 'scripts/tui-tests/**'
- '.github/workflows/tui-tests.yml'
pull_request:
paths:
- 'src-rust/**'
- 'scripts/tui-tests/**'
- '.github/workflows/tui-tests.yml'
workflow_dispatch:

# A newer push to the same ref supersedes an in-flight run.
concurrency:
group: tui-tests-${{ github.ref }}
cancel-in-progress: true

env:
CARGO_TERM_COLOR: always

jobs:
tui-tests:
name: Interactive terminal suite
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v5
with:
persist-credentials: false

- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@stable

# tmux drives the pseudo-terminal; ALSA + pkg-config are needed to build
# the binary (voice feature), mirroring the release workflow.
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y tmux libasound2-dev pkg-config

- name: Cache cargo registry & build
uses: actions/cache@v5
with:
path: |
~/.cargo/registry
~/.cargo/git
src-rust/target
key: tui-tests-cargo-${{ hashFiles('src-rust/Cargo.lock') }}
restore-keys: tui-tests-cargo-

- name: Build debug binary
working-directory: src-rust
run: cargo build --locked --package claurst

# Belt-and-suspenders: mark onboarding complete so a credential-less
# runner lands on the main screen rather than (potentially) a blocking
# provider-setup dialog. The suite makes no network calls.
- name: Seed offline settings
run: |
mkdir -p "$HOME/.coven-code"
printf '{"hasCompletedOnboarding": true}' > "$HOME/.coven-code/settings.json"

- name: Run interactive terminal suite
env:
# Fail loudly if tmux is missing instead of silently skipping the
# interactive cases, and persist pane captures for any failure.
REQUIRE_TMUX: '1'
TUI_LOG_DIR: ${{ runner.temp }}/tui-failures
run: bash scripts/tui-tests/run.sh

- name: Upload failure pane captures
if: failure()
uses: actions/upload-artifact@v4
with:
name: tui-failure-captures
path: ${{ runner.temp }}/tui-failures
if-no-files-found: ignore
retention-days: 14
107 changes: 107 additions & 0 deletions scripts/tui-tests/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
# Interactive terminal (TUI) test suite

Pre-release smoke tests that drive the real `coven-code` binary through a
`tmux` pseudo-terminal, send keystrokes, capture the rendered pane, and assert
on what the user actually sees. This automates the manual tmux workflow
described in [`AGENTS.md`](../../AGENTS.md) ("Testing the TUI in a controlled
terminal").

**Offline / headless-first:** no test makes a live model call or hits the
network. Every assertion is deterministic and runnable on any machine with the
binary and `tmux`.

## Usage

```bash
# From the repo root. Auto-detects target/{debug,release}/coven-code.
scripts/tui-tests/run.sh

# Build the debug binary first, then test it.
scripts/tui-tests/run.sh --build

# Test a specific binary (e.g. an installed release).
COVEN_BIN=/usr/local/bin/coven-code scripts/tui-tests/run.sh

# Run only matching case files (substring match on the filename).
scripts/tui-tests/run.sh 03 05
```

Exit code is `0` only if every assertion passed, `1` otherwise — suitable for
a release gate or CI step.

## CI

Runs automatically via [`.github/workflows/tui-tests.yml`](../../.github/workflows/tui-tests.yml)
on pushes to `main` and on PRs that touch `src-rust/**` or this suite (plus
manual `workflow_dispatch`). The job installs `tmux`, builds the debug binary
(`cargo build --locked --package claurst`), seeds an offline settings file
(`hasCompletedOnboarding: true`) so a credential-less runner lands on the main
screen, and runs `run.sh`. No secrets or network access required.

## Requirements

- The `coven-code` binary (built debug/release, or on `PATH`).
- `tmux` (3.x). If absent, the headless cases still run and the interactive
cases are reported as **skipped** rather than failed.

## What it covers

| Case file | Area | Sample assertions |
|---|---|---|
| `cases/01_headless.sh` | CLI surface (no TTY) | `--version` semver, `--help` usage + flags, unknown flag exits non-zero, `auth status`, `models` catalog, `--dump-system-prompt` |
| `cases/02_startup.sh` | TUI cold start | title banner, welcome/changelog sections, input glyph, footer keybindings, no panic on boot |
| `cases/03_command_palette.sh` | Slash palette | `/` opens it, lists `/clear` `/compact` `/config`, typing filters the list, `Ctrl+K` entry point |
| `cases/04_help_overlay.sh` | Help overlay | `?` opens keybinding + command reference, `Esc` closes it |
| `cases/05_input_editing.sh` | Prompt input | typed text echoes into the buffer, `Ctrl+U` clears it |
| `cases/06_quit.sh` | Shutdown | `Ctrl+C` twice exits cleanly back to the shell |

## Configuration

Override via environment variables:

| Var | Default | Purpose |
|---|---|---|
| `COVEN_BIN` | auto-detected | Binary under test |
| `TUI_WIDTH` / `TUI_HEIGHT` | `120` / `40` | tmux pane size (the TUI is layout-sensitive) |
| `TUI_WAIT_TIMEOUT` | `20` | Seconds to wait for an expected string |
| `TUI_SESSION` | `coven-tui-test` | tmux session name |
| `TUI_SOCKET` | `coven-tui-test` | Dedicated tmux server socket (`-L`) so the harness never touches your own tmux sessions |
| `TUI_LOG_DIR` | _(unset)_ | If set, the full pane capture for each failing case is written here (CI uploads these as artifacts) |
| `REQUIRE_TMUX` | `0` | If `1`, a missing `tmux` is a hard error instead of skipping the interactive cases (set in CI) |

## Writing a new case

Drop a `cases/NN_name.sh` file. Each file registers one function:

```bash
register_case tc_mything

tc_mything() {
describe "My thing"
if ! have_tmux; then _skip "tmux not installed"; return 0; fi
tui_start || { tui_stop; return 0; }

tui_keys C-k # send a binding (tmux key tokens)
tui_type "some text" # type literal characters
tui_settle # let it redraw

local s; s="$(tui_capture)"
assert_contains "$s" "expected" "palette shows expected"

tui_stop
}
```

Helpers from [`lib.sh`](lib.sh): `tui_start` / `tui_stop`, `tui_keys`,
`tui_type`, `tui_settle`, `tui_capture`, `wait_for`, and the assertions
`assert_contains` / `assert_absent` / `assert_matches` / `assert_eq`. For
headless checks, call `run_bin <args...>` and read `$RUN_OUT` / `$RUN_RC`.

## Known tmux limitations

- `Ctrl+Shift+<key>` chords (e.g. the `Ctrl+Shift+A` model picker) cannot be
sent reliably through `tmux send-keys`, so they are not asserted here. Use
the slash palette path instead where one exists.
- After spawning a session, the inner shell needs a moment before it accepts
input. `tui_start` proves readiness with a marker echo before launching the
binary; replicate that pattern if you script sessions by hand.
56 changes: 56 additions & 0 deletions scripts/tui-tests/cases/01_headless.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
#!/usr/bin/env bash
# shellcheck shell=bash
#
# Headless CLI surface — no tmux, no TTY, no network. These exercise the
# argument parser and the offline subcommands that must work before release.

register_case tc_headless

tc_headless() {
describe "Headless CLI surface (offline)"

# --version prints the workspace version.
run_bin --version
assert_eq "$RUN_RC" "0" "--version exits 0"
assert_contains "$RUN_OUT" "coven-code" "--version names the binary"
# Version is a semver-ish x.y.z; assert the shape, not a hardcoded number.
assert_matches "$RUN_OUT" '[0-9]+\.[0-9]+\.[0-9]+' "--version prints a semver"

# --help describes usage and a couple of stable flags.
run_bin --help
assert_eq "$RUN_RC" "0" "--help exits 0"
assert_contains "$RUN_OUT" "Usage: coven-code" "--help shows usage line"
assert_contains "$RUN_OUT" "--print" "--help lists --print"
assert_contains "$RUN_OUT" "--model" "--help lists --model"
assert_contains "$RUN_OUT" "--permission-mode" "--help lists --permission-mode"

# An unknown flag must error and exit non-zero (clap behavior).
run_bin --definitely-not-a-flag
if [ "$RUN_RC" -ne 0 ]; then
_pass "unknown flag exits non-zero (rc=$RUN_RC)"
else
_fail "unknown flag exits non-zero" "$RUN_OUT"
fi

# auth status: must run offline and report a status without crashing.
# Outcome (logged in / not) depends on the machine, so accept either,
# but require a clean exit and recognizable wording.
run_bin auth status
if [ "$RUN_RC" -eq 0 ] || [ "$RUN_RC" -eq 1 ]; then
_pass "auth status exits cleanly (rc=$RUN_RC)"
else
_fail "auth status exits cleanly" "$RUN_OUT"
fi
assert_matches "$RUN_OUT" '[Ll]ogged in|[Nn]ot logged in|[Ll]og in' \
"auth status reports a login state"

# models: the static model catalog renders offline.
run_bin models
assert_eq "$RUN_RC" "0" "models exits 0"
assert_matches "$RUN_OUT" 'ctx:.*in:.*out:' "models lists priced model rows"

# --dump-system-prompt: offline prompt assembly (hidden flag).
run_bin --dump-system-prompt
assert_eq "$RUN_RC" "0" "--dump-system-prompt exits 0"
assert_contains "$RUN_OUT" "Working directory" "system prompt includes working directory"
}
37 changes: 37 additions & 0 deletions scripts/tui-tests/cases/02_startup.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#!/usr/bin/env bash
# shellcheck shell=bash
#
# TUI cold-start: the welcome screen, status line, and footer render.

register_case tc_startup

tc_startup() {
describe "TUI startup & chrome"
if ! have_tmux; then _skip "tmux not installed"; return 0; fi
tui_start || { tui_stop; return 0; }

# tui_start only waits for the frame title; give the welcome card a beat to
# populate before snapshotting.
wait_for "What's new" 5
local s; s="$(tui_capture)"

assert_contains "$s" "Coven Code v" "title banner renders with version"

# Right-hand welcome column. Wording is stable; the username is not, so we
# assert on the static labels only.
assert_contains "$s" "Tips for getting started" "welcome shows tips section"
assert_contains "$s" "What's new" "welcome shows changelog section"

# Input affordance.
assert_contains "$s" "❯" "input prompt glyph present"

# Footer hint bar exposes the core keybindings.
assert_contains "$s" "help" "footer advertises help binding"
assert_contains "$s" "familiar" "footer advertises familiar binding"

# No panic / error banner on a clean boot.
assert_absent "$s" "panicked at" "no rust panic on startup"
assert_absent "$s" "RUST_BACKTRACE" "no backtrace prompt on startup"

tui_stop
}
46 changes: 46 additions & 0 deletions scripts/tui-tests/cases/03_command_palette.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
#!/usr/bin/env bash
# shellcheck shell=bash
#
# Slash command palette: opening, filtering, and Ctrl+K entry point.

register_case tc_palette

tc_palette() {
describe "Command palette"
if ! have_tmux; then _skip "tmux not installed"; return 0; fi
tui_start || { tui_stop; return 0; }

# Typing "/" as the first character opens the palette.
tui_type "/"
if ! wait_for "/config"; then
_fail "palette opened on '/' (/config never appeared)" "$(tui_capture)"
tui_stop; return 0
fi
local s; s="$(tui_capture)"
assert_contains "$s" "/clear" "palette lists /clear"
assert_contains "$s" "/compact" "palette lists /compact"
assert_contains "$s" "/config" "palette lists /config"

# Filtering narrows the list: "/config" should keep /config, drop /clear.
tui_type "config"
# Wait for the filter to take effect (a non-matching entry disappears).
wait_absent "/clear" 5
s="$(tui_capture)"
assert_contains "$s" "/config" "filter '/config' keeps /config"
assert_absent "$s" "/clear" "filter '/config' drops /clear"

# Esc dismisses the palette; Ctrl+U clears the leftover input.
tui_keys Escape; tui_settle
tui_keys C-u; tui_settle

# Ctrl+K is the second documented entry point to the palette.
tui_keys C-k
if wait_for "/clear" 5; then
_pass "Ctrl+K opens the command palette"
else
_fail "Ctrl+K opens the command palette" "$(tui_capture)"
fi
tui_keys Escape; tui_settle

tui_stop
}
38 changes: 38 additions & 0 deletions scripts/tui-tests/cases/04_help_overlay.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
#!/usr/bin/env bash
# shellcheck shell=bash
#
# Help overlay: "?" opens the keybinding + command reference, Esc closes it.

register_case tc_help

tc_help() {
describe "Help overlay"
if ! have_tmux; then _skip "tmux not installed"; return 0; fi
tui_start || { tui_stop; return 0; }

tui_keys "?"
# The overlay is tall; under load the lower rows render a beat after the
# top. Poll for a late item before snapshotting so the assertions don't
# race the draw.
if ! wait_for "/permissions"; then
_fail "help overlay rendered (/permissions never appeared)" "$(tui_capture)"
tui_stop; return 0
fi
local s; s="$(tui_capture)"
assert_contains "$s" "Toggle help" "help overlay documents the help toggle"
assert_contains "$s" "Command palette" "help overlay documents the command palette"
assert_contains "$s" "Model picker" "help overlay documents the model picker"
# Command reference section.
assert_contains "$s" "/login" "help overlay lists /login"
assert_contains "$s" "/permissions" "help overlay lists /permissions"

# Esc closes the overlay.
tui_keys Escape
if wait_absent "Toggle help"; then
_pass "Esc closes the help overlay"
else
_fail "Esc closes the help overlay" "$(tui_capture)"
fi

tui_stop
}
Loading