Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
159 changes: 159 additions & 0 deletions .github/codeql/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,159 @@
# CodeQL configuration

This directory contains the CodeQL configuration consumed by both
`.github/workflows/codeql.yml` and `.github/workflows/codeql-advanced.yml`
via the `config-file:` key on the `github/codeql-action/init` step.

## Files

- `config.yml` -- query filter rules. See inline comments for per-rule
rationale.

## Query suite vs config-file interaction

The `queries:` key on `github/codeql-action/init` (set to
`security-and-quality` in `codeql.yml`, default in `codeql-advanced.yml`)
selects which CodeQL query *suite* runs. The `config-file:` value points
at this directory's `config.yml`, which defines `query-filters`.

Per the [CodeQL Action docs][codeql-config-docs] and the
"queries listed here will override any specified in a config file"
comment block in `.github/workflows/codeql-advanced.yml`:

- A workflow-level `queries:` value *replaces* a config-file-level
`queries:` value unless the workflow value is prefixed with `+`.
- `query-filters` from the config-file always layer *on top* of whichever
suite is active.

`config.yml` here intentionally does not define a `queries:` key, so the
workflow-selected suite (`security-and-quality` in `codeql.yml`) remains
the active set; this config only contributes the `query-filters` layer.
If a future contributor adds `queries:` to `config.yml`, the suite will
*replace* the workflow's `security-and-quality` selection unless
explicitly merged with `+queries: ...` in the workflow.

[codeql-config-docs]: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/customizing-your-advanced-setup-for-code-scanning#using-a-custom-configuration-file

## Suppressed queries

### `py/unsafe-cyclic-import` (global)

**Why it fires:** the rule walks `if TYPE_CHECKING:` blocks for cycle
detection. `strands_robots/simulation/{base,policy_runner,benchmark}.py`
form a deliberate static-only cycle so `policy_runner` can call into the
`SimEngine` ABC defined in `base`, while `base` can advertise
`PolicyRunner`-typed return values to static checkers (mypy, IDE
navigation, `typing.get_type_hints` consumers).

**Why it is safe at runtime:** the cycle is asymmetric, not closed.

- `simulation/base.py`'s top-level `from
strands_robots.simulation.policy_runner import PolicyRunner,
VideoConfig` is the runtime edge that static analysers see and flag.
(Cited by symbol rather than line number so future edits above the
import don't silently invalidate this rationale.)
- `simulation/policy_runner.py`'s `if TYPE_CHECKING:` block imports
`SimEngine`, `BenchmarkProtocol`, and `Policy`. These are not real
edges at import time -- `TYPE_CHECKING` is `False` at runtime, the
block is skipped, and the names exist only for the static type
system.
- `simulation/benchmark.py`'s `if TYPE_CHECKING:` block similarly
imports `SimEngine`.

There is therefore no runtime back-edge from `policy_runner` (or
`benchmark`) into `base`. `base` finishes importing cleanly before
anything in `policy_runner` runs at module scope, so the cycle never
closes at import time. `from __future__ import annotations` (PEP 563)
on every affected file is the additional belt-and-braces guarantee
that type-hint resolution itself never re-enters `base` or
`policy_runner` via `typing.get_type_hints` consumers.

The CodeQL-independent regression contract pins this invariant:

- `tests/simulation/test_no_cyclic_imports.py` -- spawns a fresh Python
Comment thread
cagataycali marked this conversation as resolved.
interpreter for each of the three affected modules (and a combined
one-process run) and asserts each imports cleanly with no
`RecursionError` / `ImportError` traceback frames in stderr (anchored
on the start-of-line `ExceptionName:` framing rather than a substring
match, so benign log lines that mention the exception names by name
do not fail the pin). Catches the dynamic-import failure mode where
a top-level statement reintroduces a partial-module re-entry.
- `tests/simulation/test_no_import_cycle.py::test_no_runtime_import_cycles`
-- builds the runtime import graph (excluding `TYPE_CHECKING` and
inside-function edges) via `networkx.simple_cycles` and asserts it is
acyclic. Catches the static-graph failure mode where someone hoists a
`TYPE_CHECKING` import to module scope and re-creates the cycle.
- `tests/simulation/test_no_import_cycle.py::test_base_does_not_lazy_import_policy_runner`
-- counts module-level imports of `policy_runner` from `base.py` and
asserts the count stays at 1 (the documented module-level edge), so a
re-introduced inline lazy import inside a `SimEngine` method (the
prior bug shape) fails loudly.

If the runtime cycle is ever reintroduced, those pins fail loudly in CI
-- independent of CodeQL's view of the AST.

**Why a global suppression rather than path-scoped:** CodeQL's
`query-filters` keys filter by `id` / `tags` / `precision` only;
path-scoped exclusion of a single query is not supported (the only
path-aware key is `paths-ignore`, which excludes a file from all
queries -- too broad). The simulation triple is the only file shape in
the repository where this query fires today, so a global exclude is
equivalent in effect *today* and keeps the config small. The cost we
accept is that a future legitimate violation in unrelated code would
be silently suppressed too. Mitigation: the regression pins above
guard against runtime cycles independent of CodeQL, and a future
contributor who introduces a new legitimate violation should drop this
suppression and fix the new cycle properly rather than extend the
filter.

**Periodic audit reminder:** the assumption that the simulation triple

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] The "every minor release or every six months" cadence has no enforcement mechanism. After v0.4.0 ships at Summit, this becomes a calendar item that reliably falls off — the exact failure mode that turns documented trade-offs into silent regressions. Pair with the CI-enforcement follow-up on config.yml: either a scheduled workflow that runs the recipe at README.md:117-127 automatically and opens an issue on drift, or a make audit-codeql-suppressions target invoked from a quarterly cron. Track as a v0.4.1 issue.

is the only file shape that fires this rule today drifts silently as
the codebase grows. Maintainers should periodically (every minor
release or every six months, whichever is sooner) enumerate what
`py/unsafe-cyclic-import` would flag absent the filter, e.g. by running
the CodeQL CLI locally with the suppression dropped:

```bash
# Drop the exclude block in .github/codeql/config.yml temporarily, then:
codeql database create db --language=python --source-root=.
codeql database analyze db \
codeql/python-queries:Imports/UnsafeCyclicImport.ql \
--format=sarif-latest --output=cyclic-import.sarif
# Inspect cyclic-import.sarif: every result outside the simulation
# triple is a candidate true-positive that the global exclude is
# silently hiding.
```

If results outside the simulation triple appear, drop the suppression
(or narrow it via path-scoped surgery) and fix the new cycle properly.

**References:**

- PR #209 -- the multi-round attempt to satisfy CodeQL via code surgery.
Paused at draft after the constraint conflict between mypy and
`py/unsafe-cyclic-import` was confirmed locally; full analysis in
the S13/R6 changelog of that PR.
- Issue #215 -- this follow-up tracking the suppression.
- CodeQL alerts #83, #84 -- closed by PR #209 R4.
Comment thread
cagataycali marked this conversation as resolved.
- CodeQL alerts #85, #86, #87 -- raised by the rule against the
`if TYPE_CHECKING:` blocks in `benchmark.py` and `policy_runner.py`;
closed by this suppression. (Historical line numbers in the alert
bodies: `benchmark.py:42`, `policy_runner.py:49-50` at the SHA that
raised the alerts.)
- CodeQL alerts #253, #254, #255 -- recurrent alerts on `base.py`'s
`if TYPE_CHECKING:` block introduced by PR #209 R5 attempting to
recover mypy type info; closed by this suppression. (Historical line
number: `base.py:28` at the SHA that raised the alerts.)

## How to extend

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] The 'How to extend' recipe is contributor guidance, not enforcement -- a future suppression added to config.yml without the matching ## Suppressed queries README section or per-file breadcrumb will not fail any test. AGENTS.md > Review Learnings (#86) > 'Public API Hygiene' / 'centralise checks' suggests promoting this to a small lint test ("every exclude.id in config.yml has a matching H3 here and at least one source-file breadcrumb mentioning the id"). Track for v0.4.1; not blocking, since the suppression list is a single entry today.


When adding a new suppression:

1. Add the `query-filters` entry to `config.yml` with a thorough
inline comment explaining the rule, why it fires here, and why
the suppression is safe.
2. Add a section to this README mirroring the inline comment, plus
the regression contract (pin tests, runtime-safety argument).
3. Ensure the affected source files carry a top-of-file comment
pointing at the config, so future readers find the rationale
without grep-spelunking.
53 changes: 53 additions & 0 deletions .github/codeql/config.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# CodeQL configuration for strands-robots.
#
# Wired into both .github/workflows/codeql.yml (CodeQL) and
# .github/workflows/codeql-advanced.yml (CodeQL Advanced) via the
# `config-file:` key on the `github/codeql-action/init` step.
#
# See .github/codeql/README.md for the full rationale per query exclusion.

name: "strands-robots CodeQL config"

# Inherit the default query suite selected per workflow (security-and-quality
# in codeql.yml; default in codeql-advanced.yml). We layer query-filters on
# top to suppress documented false-positives.
query-filters:
# py/unsafe-cyclic-import:
#
# Globally suppressed. The rule walks `if TYPE_CHECKING:` blocks for cycle
# detection, which makes it fire on the deliberate static-only cycle in
# strands_robots/simulation/{base,policy_runner,benchmark}.py.
#
# Runtime is safe because the cycle is asymmetric, not closed:
# base.py's top-level ``from .policy_runner import PolicyRunner,
# VideoConfig`` statement is the only runtime edge. policy_runner.py
# imports SimEngine / BenchmarkProtocol / Policy from base only inside
# its ``if TYPE_CHECKING:`` block -- no runtime back-edge closes the
# cycle at import time. ``from __future__ import annotations`` (PEP
# 563) is additional belt-and-braces so type-hint resolution at runtime
# never re-enters either module.
#
# This is the only file shape in the repository where the rule fires
# today; a global suppression keeps the config small and avoids
# path-scoped filter syntax (CodeQL `query-filters` only filters by
# id / tags / precision, not by path). The cost we accept is that a
# future legitimate violation in unrelated code is silently suppressed
# too. Mitigation: the regression pins below guard runtime-safety
# invariants independent of CodeQL, and a future contributor who hits
# a new legitimate violation should drop this suppression and fix the
# new cycle properly rather than extend the filter.
#
# Regression contract: tests/simulation/test_no_cyclic_imports.py and
# tests/simulation/test_no_import_cycle.py pin the runtime-safety
# invariants. The first spawns fresh interpreters and asserts each
# affected module imports cleanly with no RecursionError / ImportError;
# the second builds the runtime import graph (excluding TYPE_CHECKING
# and inside-function edges) and asserts it is acyclic. Both fail
# loudly if the runtime cycle is ever reintroduced, independent of
# CodeQL's view of the AST.
#
# Refs: PR #209 (paused, full constraint analysis in S13/R6); issue #215
# (this suppression); CodeQL alerts #83-#87, #253-#255 (the recurrent
# alerts the suppression closes).
- exclude:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] Global suppression is justified in the README given today's repo shape, but there is no CI enforcement that the assumption ("simulation triple is the only file shape that fires this rule") still holds at any future commit. The audit recipe at README.md:117-127 is manual and load-bearing on a maintainer remembering it every six months. Track a follow-up to add a scheduled workflow (or weekly cron) that runs the rule with the suppression temporarily dropped and opens an issue if results outside the simulation triple appear. Not a blocker — the regression pins guard the runtime invariant independently — but the silent-suppression risk is real if the codebase grows.

No justification for MUST FIX: the trade-off is documented, the runtime contract is pinned, and the surfaceable risk (a brand-new legitimate violation slips through) is bounded by the manual recipe today.

id: py/unsafe-cyclic-import
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] No CI guard validates this file's schema or that the id: here matches a real CodeQL rule. A typo (e.g. py/unsafe-cyclic-imports plural) silently disables the suppression -- the runtime-pin tests in tests/simulation/test_no_cyclic_imports.py would still pass because they exercise import behaviour, not CodeQL semantics, so the regression contract has a blind spot here. Suggested follow-up: add a tiny tests/test_codeql_config.py that yaml.safe_loads this file and asserts each exclude.id matches the regex ^[a-z]+/[a-z0-9-]+$ (CodeQL rule-id shape). Doesn't block merge; tracks under v0.4.1.

3 changes: 3 additions & 0 deletions .github/workflows/codeql-advanced.yml
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,9 @@ jobs:
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
# Repo-level query filters (suppress documented false-positives).
# See .github/codeql/README.md for per-rule rationale.
config-file: ./.github/codeql/config.yml
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] This file is now wired to the new config-file:, but the surrounding init@v4 (line 70), analyze@v4 (line 102), and actions/checkout@v4 (line 60) are still floating tags, violating AGENTS.md > Review Learnings (#92) > Action Pinning ("All uses: references in workflows pin to a full 40-character commit SHA"). PR description scopes the SHA-pin to companion PR #234, which is the right call to keep this PR narrow.

Not a MUST FIX for #216 because (a) the floating tags pre-exist this PR — this diff only adds the config-file: block, it does not introduce the floating tags, and (b) #234 explicitly addresses it. Worth confirming #234 lands before or together with #216 so main is never in the floating-tag-on-@v4 state for longer than necessary.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] This PR wires config-file: correctly, but the surrounding action references on this workflow are still floating tags (actions/checkout@v4 line 60, github/codeql-action/init@v4 line 70, github/codeql-action/analyze@v4 line 102). That violates AGENTS.md > Review Learnings (#92) > 'Action Pinning' ("All uses: references in workflows pin to a full 40-character commit SHA") and is the supply-chain pattern that the tj-actions/changed-files incident exploited. The PR description openly defers this to companion PR #234, which is the right call -- not introduced by this PR, and out of scope for the suppression fix. Tracking only: confirm #234 lands before / alongside this PR so the gap is not left open after merge.

# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
Expand Down
5 changes: 4 additions & 1 deletion .github/workflows/codeql.yml
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,13 @@ jobs:
# surfaced in PR #85 (MJCF interpolation) and PR #90 (subprocess
# injection / path traversal).
queries: security-and-quality
# Repo-level filters layered on top of the selected suite. See
# .github/codeql/README.md for per-rule rationale.
config-file: ./.github/codeql/config.yml
Comment thread
cagataycali marked this conversation as resolved.

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@4e828ff8d448a8a6e532957b1811f387a63867e8 # v3.29.4
with:
category: "/language:python"
# Findings appear in the Security tab; PRs are not blocked on
# CodeQL alerts (false-positive rate makes hard-fail noisy).
# CodeQL alerts (false-positive rate makes hard-fail noisy).
33 changes: 28 additions & 5 deletions strands_robots/simulation/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,11 +26,34 @@
if TYPE_CHECKING:
from strands_robots.policies import Policy

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[FOLLOW-UP] This new top-of-file CodeQL block and the older Note (#191): block immediately below (line 58 in the head version, in this hunk's trailing context) both explain py/unsafe-cyclic-import and the TYPE_CHECKING cycle, from slightly different angles. After this PR lands, a future reader has to read two adjacent comment blocks to assemble the full rationale, and the OnFrame-specific guidance in the #191 block is functionally a corollary of the rule explained here.

Consolidating into a single block (with the OnFrame example as a sub-bullet) would tighten the breadcrumb without losing information. Cosmetic; defer to v0.4.1.

# PolicyRunner and VideoConfig are used by run_policy / replay / eval_policy.
# We could defer these with inline lazy imports (and historically did), but
# policy_runner.py only imports `SimEngine` from base under TYPE_CHECKING so
# the runtime cycle doesn't actually exist. Keep the imports at module level
# to break the AST-visible cycle that static analysers flag.
# CodeQL: the AST cycle base <-> policy_runner is a documented false-positive
# suppressed via .github/codeql/config.yml (py/unsafe-cyclic-import).
# See .github/codeql/README.md for the full rationale.
#
# Why this is safe at runtime: the cycle is asymmetric, not closed.
# This module imports ``PolicyRunner`` / ``VideoConfig`` at module level
# (the runtime edge), but ``policy_runner.py`` imports ``SimEngine`` from
# ``base`` only under ``TYPE_CHECKING``. There is no runtime back-edge
# from ``policy_runner`` into this module, so the cycle never closes at
# import time. Do NOT hoist the ``TYPE_CHECKING`` import in
# ``policy_runner.py`` to module scope -- that would close the loop and
# break ``import strands_robots.simulation.policy_runner`` with
# ``ImportError: cannot import name 'SimEngine' from partially initialized
# module ...``. ``from __future__ import annotations`` (PEP 563) is
# additional belt-and-braces, but the asymmetric edge is what makes the
# import work; PEP 563 only governs how type hints are resolved later.
#
# Pin tests guard this invariant independent of CodeQL:
# - tests/simulation/test_no_cyclic_imports.py: spawns fresh interpreters
# per affected module and asserts each imports cleanly.
# - tests/simulation/test_no_import_cycle.py: AST-builds the runtime
# import graph (excluding TYPE_CHECKING / inside-function edges) and
# asserts it is acyclic; also pins that this module keeps exactly one
# module-level import of policy_runner (re-introducing inline lazy
# imports inside SimEngine methods was the prior bug shape).
#
# PolicyRunner and VideoConfig are used by run_policy / replay /
# eval_policy on this class.
Comment thread
cagataycali marked this conversation as resolved.
Comment thread
cagataycali marked this conversation as resolved.
#
# Note (#191): we deliberately do NOT import ``OnFrame`` here, even under
# ``TYPE_CHECKING`` — CodeQL's ``py/unsafe-cyclic-import`` rule walks
Expand Down
4 changes: 4 additions & 0 deletions strands_robots/simulation/benchmark.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,10 @@
from dataclasses import dataclass, field
from typing import TYPE_CHECKING, Any

# CodeQL: the AST cycle benchmark <-> base is suppressed via
# .github/codeql/config.yml (py/unsafe-cyclic-import). The edge
# exists only under TYPE_CHECKING; runtime is safe under PEP 563.
# See .github/codeql/README.md.
if TYPE_CHECKING:
from strands_robots.simulation.base import SimEngine

Expand Down
4 changes: 4 additions & 0 deletions strands_robots/simulation/policy_runner.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,10 @@
from strands_robots._async_utils import _resolve_coroutine
from strands_robots.utils import require_optional

# CodeQL: the AST cycle policy_runner <-> base / benchmark is suppressed
# via .github/codeql/config.yml (py/unsafe-cyclic-import). These edges
# exist only under TYPE_CHECKING; runtime is safe under PEP 563.
# See .github/codeql/README.md.
if TYPE_CHECKING:
from strands_robots.policies.base import Policy
from strands_robots.simulation.base import SimEngine
Expand Down
Loading
Loading