Python: switch dataflow library to new (shared) CFG + SSA #21925

Draft
yoff wants to merge 144 commits into yoff/python-add-new-ssa from yoff/python-shared-cfg-dataflow-flip

yoff commented Jun 1, 2026

Contributor

Summary

The trunk-flip equivalent of the original draft PR #21894 (kept around as documentation of the end state), rebased on top of the four preparatory PRs in this stack:

  • P1: #21919 — Remove AstNode.getAFlowNode() and rewrite callers.
  • P2: #21920 — Qualify Flow.qll's AST references with Py:: prefix.
  • P3: #21921 — Add new shared-CFG-backed control flow graph (additive).
  • P4: #21923 — Add new shared-SSA-backed SSA adapter (additive).

Based on #21923 — merge P1–P4 first.

Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll) and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade (semmle.python.controlflow.internal.Cfg) and the new SSA adapter (semmle.python.dataflow.new.internal.SsaImpl).

What changes

Dataflow library

The Python dataflow library (semmle/python/dataflow/new/) now imports the new CFG facade and SSA adapter. All references to CFG types (ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are qualified with the Cfg:: prefix; SSA references switch from EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode redesign

GuardNode is redesigned to use the new CFG's outcome-node model (isAfterTrue/isAfterFalse) instead of the legacy ConditionBlock + flipped indirection. Only BarrierGuard<...> is preserved as public API — the rest of the legacy GuardNode surface (isSafeCheck, flipped) was Python-specific and had no callers outside this library.

Framework updates

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib, MarkupSafe, Pycurl, Pydantic, Gradio, ...) are updated to take CFG nodes from the new facade.

Dataflow consistency tweaks for the new CFG

  • Augmented-assignment targets are treated as both load and store.
  • from X import * produces uncertain SSA writes for unknown names.
  • CFG nodes are canonicalised so dataflow does not see equivalent pre/post-order pairs as distinct nodes.
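
To make the first two tweaks concrete, here is a minimal Python sketch of the source-level behaviour they model. It only uses CPython's standard `ast` module and `exec`; it is not the CodeQL implementation, and the module name `star_demo` is a hypothetical stand-in created just for this illustration.

```python
import ast
import sys
import types

# 1. Augmented assignment: CPython's own AST tags the target as a store only,
#    yet executing the statement has to read the old value first. An analysis
#    therefore needs to treat the target as both a load and a store.
tree = ast.parse("x += 1")
aug = tree.body[0]
target_ctx = type(aug.target.ctx).__name__

env = {"x": 41}
exec(compile(tree, "<aug>", "exec"), env)
result = env["x"]

# 2. Star import: the set of names written depends on the imported module,
#    so the importing scope cannot tell statically which names are bound.
mod = types.ModuleType("star_demo")  # hypothetical module, for illustration only
mod.alpha = 1
mod.beta = 2
sys.modules["star_demo"] = mod

ns = {}
exec("from star_demo import *", ns)
star_names = sorted(name for name in ns if not name.startswith("__"))

print(target_ctx, result, star_names)
```

The star import binds `alpha` and `beta` in the importing namespace even though neither name appears in the importing source, which is why such writes can only be modelled as uncertain.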

Two AST tweaks for the new CFG

  • AstNodeImpl: omit PEP 695 type-parameter names from FunctionDefExpr/ClassDefExpr children (they belong to the type-params block, not the function/class header).
  • ImportResolution: drop the legacy essa import.

Test churn (~175 files)

Reblessed library- and query-test .expected files reflect:

  • Slightly different CFG granularity (separate pre/post nodes per expression).
  • Different toString output where the legacy CFG used "ControlFlowNode for X".
  • A handful of true alert deltas in security queries (documented in the corresponding test diffs).

Verification

  • All 367 lib/ + src/ + consistency-queries/ queries compile clean against the new trunk.
  • All 653 ControlFlow + PointsTo + dataflow + dataflow-new-ssa + essa + consistency library-tests pass.

Copilot AI review requested due to automatic review settings June 1, 2026 12:29
@yoff yoff requested a review from a team as a code owner June 1, 2026 12:29
@github-actions github-actions Bot added the Python label Jun 1, 2026

Copilot AI left a comment

Pull request overview

Migrates Python’s “new” dataflow library to use the shared CFG facade (semmle.python.controlflow.internal.Cfg) and the new shared-SSA-backed adapter, updating framework models, query code, and reblessing test expectations to match the new CFG node canonicalization/toString behavior.

Changes:

  • Switch Python dataflow/type-tracking/guard plumbing to shared CFG + shared SSA adapter (with corresponding API updates across libraries and queries).
  • Update framework models and security/query libraries to reference Cfg:: nodes and new SSA entities.
  • Rebless a large set of library/query-test .expected files for new node labels (e.g. After ...) and known semantic deltas.

Summary per file (file path followed by description):

python/tools/recorded-call-graph-metrics/ql/lib/RecordedCalls.qll Adjust call-graph metrics plumbing for CFG changes (note: current diff introduces a removed API reference; see PR comment).
python/ql/test/query-tests/Statements/exit/UseOfExit.expected Reblessed expected output for new CFG node stringification.
python/ql/test/query-tests/Security/CWE-942-CorsMisconfigurationMiddleware/CorsMisconfigurationMiddleware.expected Reblessed expected results with After ... node labels.
python/ql/test/query-tests/Security/CWE-798-HardcodedCredentials/HardcodedCredentials.expected Reblessed path-graph labels (legacy ControlFlowNode for ... → new labels).
python/ql/test/query-tests/Security/CWE-732-WeakFilePermissions/WeakFilePermissions.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-614-InsecureCookie/InsecureCookie.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-327-InsecureDefaultProtocol/InsecureDefaultProtocol.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-327-BrokenCryptoAlgorithm/BrokenCryptoAlgorithm.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-326-WeakCryptoKey/WeakCryptoKey.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-312-CleartextStorage-py3/CleartextStorage.expected Reblessed dataflow path output for new CFG labels.
python/ql/test/query-tests/Security/CWE-295-RequestWithoutValidation/RequestWithoutValidation.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-295-MissingHostKeyValidation/MissingHostKeyValidation.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-215-FlaskDebug/FlaskDebug.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-209-StackTraceExposure/ExceptionInfo.expected Updates expectations to record known “missing result” regressions under new CFG exception modeling.
python/ql/test/query-tests/Security/CWE-1275-SameSiteNoneCookie/SameSiteNoneCookie.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-113-HeaderInjection/Tests2-with-wsgi-validator/HeaderWriteTest.expected Reblessed taint-tracking output labels for new CFG nodes.
python/ql/test/query-tests/Security/CWE-1004-NonHttpOnlyCookie/NonHttpOnlyCookie.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-089-SqlInjection/CONSISTENCY/DataFlowConsistency.expected Reblessed consistency-test expected results for new CFG labels.
python/ql/test/query-tests/Security/CWE-089-SqlInjection-local-threat-model/SqlInjection.expected Reblessed path-graph labels for new CFG nodes.
python/ql/test/query-tests/Security/CWE-079-Jinja2WithoutEscaping/Jinja2WithoutEscaping.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-074-TemplateInjection/TemplateInjection.expected Reblessed expected results for new CFG node representation.
python/ql/test/query-tests/Security/CWE-020-IncompleteHostnameRegExp/IncompleteHostnameRegExp.expected Reblessed expected output labels.
python/ql/test/query-tests/Numerics/Pythagorean.expected Reblessed expected output labels.
python/ql/test/query-tests/Imports/deprecated/DeprecatedModule.expected Reblessed and records an additional expected deprecated-module result.
python/ql/test/query-tests/Functions/ModificationOfParameterWithDefault/test.expected Updates expectations to record missing inline-annotation results.
python/ql/test/query-tests/Expressions/super/CallToSuperWrongClass.expected Reblessed expected output labels.
python/ql/test/query-tests/Exceptions/general/IncorrectExceptOrder.expected Updates expectations to record known missing alert under new exception reachability.
python/ql/test/query-tests/Exceptions/general/EmptyExcept.expected Reblessed expected results; includes additional expected findings.
python/ql/test/query-tests/Exceptions/general/CatchingBaseException.expected Reblessed expected results; removes one expected finding.
python/ql/test/query-tests/Classes/subclass-shadowing/SubclassShadowing.expected Reblessed expected output labels.
python/ql/test/query-tests/Classes/multiple/multiple-init/SuperclassInitCalledMultipleTimes.expected Reblessed expected output labels.
python/ql/test/query-tests/Classes/multiple/multiple-del/SuperclassDelCalledMultipleTimes.expected Reblessed expected output labels.
python/ql/test/query-tests/Classes/init-calls-subclass-method/InitCallsSubclassMethod.expected Reblessed expected output labels.
python/ql/test/library-tests/PointsTo/new/ImpliesDataflow.ql Updates points-to/dataflow bridge logic for new CFG/legacy interop.
python/ql/test/library-tests/PointsTo/new/ImpliesDataflow.expected Reblessed expected output labels.
python/ql/test/library-tests/frameworks/stdlib/InlineTaintTest.expected Updates expectations to record missing taint annotations under new CFG behavior.
python/ql/test/library-tests/frameworks/stdlib-py2/ConceptsTest.expected Updates expectations to record missing concept annotation.
python/ql/test/library-tests/frameworks/sqlalchemy/ConceptsTest.expected Updates expectations to record missing/spurious concept results.
python/ql/test/library-tests/frameworks/rest_framework/CONSISTENCY/DataFlowConsistency.expected Reblessed expected output labels.
python/ql/test/library-tests/frameworks/modeling-example/SharedCode.qll Updates example framework model to use Cfg:: node types.
python/ql/test/library-tests/frameworks/modeling-example/ProperModel.ql Updates example model steps to use Cfg:: node types.
python/ql/test/library-tests/frameworks/modeling-example/NaiveModel.ql Updates example model steps to use Cfg:: node types.
python/ql/test/library-tests/frameworks/lxml/InlineTaintTest.expected Updates expectations to record missing taint annotations under new CFG behavior.
python/ql/test/library-tests/frameworks/django/CONSISTENCY/DataFlowConsistency.expected Reblessed expected output labels.
python/ql/test/library-tests/frameworks/django-orm/NormalDataflowTest.expected Updates expectations to record an unexpected flow result under new CFG/SSA.
python/ql/test/library-tests/frameworks/cryptography/EcKeygenOrigin.expected Reblessed expected output labels.
python/ql/test/library-tests/frameworks/aiohttp/InlineTaintTest.ql Updates guard predicate signatures and uses Cfg:: call/name nodes.
python/ql/test/library-tests/frameworks/aiohttp/InlineTaintTest.expected Updates expectations to record missing taint annotations under new CFG behavior.
python/ql/test/library-tests/essa/ssa-compute/CONSISTENCY/TypeTrackingConsistency.expected Removes expected unreachable-node violations (rebless under new behavior).
python/ql/test/library-tests/dataflow/use-use-flow/use-use-counts.ql Switches SSA entities to SsaImpl::... and CFG node types to Cfg::....
python/ql/test/library-tests/dataflow/use-use-flow/use-use-counts.expected Reblessed expected output, including implicit-use behavior changes.
python/ql/test/library-tests/dataflow/typetracking/tracked.ql Updates AST↔CFG mapping usage to avoid removed getAFlowNode patterns.
python/ql/test/library-tests/dataflow/typetracking/test.py Updates inline annotations (marking tracked/spurious) to match new behavior.
python/ql/test/library-tests/dataflow/typetracking/moduleattr.expected Reblessed expected output labels and entry-definition naming.
python/ql/test/library-tests/dataflow/typetracking-summaries/tracked.ql Updates tracked-node selection to use Cfg::NameNode.
python/ql/test/library-tests/dataflow/typetracking_imports/highlight_problem.ql Updates SSA variable/type usage and normal-exit matching under new CFG.
python/ql/test/library-tests/dataflow/typetracking_imports/highlight_problem.expected Reblessed expected output labels for SSA defs.
python/ql/test/library-tests/dataflow/tainttracking/TestTaintLib.qll Updates taint-test config to use Cfg:: call/name nodes.
python/ql/test/library-tests/dataflow/tainttracking/customSanitizer/test.py Updates inline expectations to mark known missing taint results under exception reachability changes.
python/ql/test/library-tests/dataflow/tainttracking/customSanitizer/InlineTaintTest.ql Updates guard predicates to use Cfg::ControlFlowNode and Cfg::CallNode.
python/ql/test/library-tests/dataflow/tainttracking/customSanitizer/InlineTaintTest.expected Reblessed expected sanitizer-node labels.
python/ql/test/library-tests/dataflow/tainttracking/basic/LocalTaintStep.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/tainttracking/basic/GlobalTaintTracking.ql Updates taint config to use Cfg:: call/name nodes.
python/ql/test/library-tests/dataflow/tainttracking/basic/GlobalTaintTracking.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/summaries/TestSummaries.qll Updates summary matching to use Cfg::NameNode for call identification.
python/ql/test/library-tests/dataflow/strange-essaflow/testFlow.ql Switches ESSA entities to SsaImpl::... for import-flow test logic.
python/ql/test/library-tests/dataflow/strange-essaflow/testFlow.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/regression/custom_dataflow.ql Updates regression config to use Cfg:: call/name nodes.
python/ql/test/library-tests/dataflow/regression/custom_dataflow.expected Reblessed expected output and records additional flow.
python/ql/test/library-tests/dataflow/module-initialization/localFlow.ql Updates module-initialization flow test to use SsaImpl::... definitions.
python/ql/test/library-tests/dataflow/method-calls/test.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/global-flow/test.py Updates inline expectations to mark spurious write under new SSA pruning/behavior.
python/ql/test/library-tests/dataflow/fieldflow/UnresolvedCalls.ql Updates unresolved-call expectations to use Cfg::CallNode.
python/ql/test/library-tests/dataflow/fieldflow/UnresolvedCalls.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/def-use-flow/def_use_counts.ql Switches SSA entities to SsaImpl::... and CFG node types to Cfg::....
python/ql/test/library-tests/dataflow/coverage/test.py Updates inline expectation to record a missing flow result.
python/ql/test/library-tests/dataflow/coverage/localFlow.expected Reblessed expected output labels and entry-definition naming.
python/ql/test/library-tests/dataflow/coverage/argumentRoutingTest.ql Updates routing test to use Cfg:: node types and SsaImpl::... definitions.
python/ql/test/library-tests/dataflow/basic/sources.expected Reblessed expected output labels (including pre/post nodes).
python/ql/test/library-tests/dataflow/basic/sinks.expected Reblessed expected output labels (including pre/post nodes).
python/ql/test/library-tests/dataflow/basic/maximalFlows.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/basic/localStep.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/basic/callGraphSources.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/basic/callGraphSinks.expected Reblessed expected output labels.
python/ql/test/library-tests/dataflow/basic/callGraph.expected Reblessed expected output labels.
python/ql/test/library-tests/ApiGraphs/py3/test_crosstalk.expected Reblessed expected output labels.
python/ql/test/library-tests/ApiGraphs/py3/ModuleImportWithDots.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-611-SimpleXmlRpcServer/SimpleXmlRpcServer.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-347/JWTMissingSecretOrPublicKeyVerification.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-347/JWTEmptyKeyOrAlgorithm.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-346/CorsBypass.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-338/InsecureRandomness.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-287/ImproperLdapAuth.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-208/TimingAttackAgainstSensitiveInfo/PossibleTimingAttackAgainstSensitiveInfo.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-208/TimingAttackAgainstHeaderValue/TimingAttackAgainstHeaderValue.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-208/TimingAttackAgainstHash/TimingAttackAgainstHash.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-208/TimingAttackAgainstHash/PossibleTimingAttackAgainstHash.expected Reblessed expected output labels.
python/ql/test/experimental/query-tests/Security/CWE-094/Js2Py.expected Reblessed expected output labels.
python/ql/test/experimental/meta/InlineTaintTest.qll Updates inline taint-test harness to use Cfg::NameNode/Cfg::CallNode.
python/ql/test/experimental/library-tests/FindSubclass/Find.expected Updates expected results (removing one subclass member).
python/ql/test/experimental/library-tests/CallGraph/InlineCallGraphTest.ql Updates call-graph comparison tests to bridge legacy points-to calls with new CFG wrappers.
python/ql/test/experimental/library-tests/CallGraph-type-annotations/InlineCallGraphTest.expected Updates expectations for missing/fixed call-graph edges under new resolution behavior.
python/ql/test/experimental/import-resolution/importflow.ql Updates import-resolution test logic to use Cfg::CompareNode/Cfg::ControlFlowNode.
python/ql/test/experimental/import-resolution-namespace-relative/test.ql Updates taint config to use Cfg::NameNode in call identification.
python/ql/test/experimental/attrs/AttrWrites.expected Reblessed expected output labels.
python/ql/test/experimental/attrs/AttrReads.expected Reblessed expected output labels.
python/ql/test/2/query-tests/Expressions/UseofInput.expected Reblessed expected output labels.
python/ql/test/2/query-tests/Expressions/UseofApply.expected Reblessed expected output labels.
python/ql/test/2/query-tests/Exceptions/raising/RaisingTuple.expected Reblessed expected output labels.
python/ql/test/2/query-tests/Exceptions/generators/UnguardedNextInGenerator.expected Updates expected results (adds additional expected findings).
python/ql/src/Statements/UseOfExit.ql Updates query to use Cfg::CallNode from shared CFG facade.
python/ql/src/Statements/SideEffectInAssert.ql Updates CFG node typing to Cfg::ControlFlowNode.
python/ql/src/Statements/ModificationOfLocals.ql Updates query logic to use shared CFG facade node types.
python/ql/src/Security/CWE-798/HardcodedCredentials.ql Updates query to use shared CFG facade node types in matching.
python/ql/src/Security/CWE-327/Ssl.qll Updates modeling to use Cfg::AttrNode in augmented-assignment patterns.
python/ql/src/Security/CWE-327/PyOpenSSL.qll Updates attribute-node usage to Cfg::AttrNode.
python/ql/src/Security/CWE-079/Jinja2WithoutEscaping.ql Updates call AST access via Cfg::CallNode wrappers.
python/ql/src/Security/CWE-020-ExternalAPIs/ExternalAPIs.qll Updates resolved-call predicate to accept Cfg::CallNode.
python/ql/src/Resources/FileNotAlwaysClosedQuery.qll Updates BasicBlock typing to shared Cfg::BasicBlock facade.
python/ql/src/meta/analysis-quality/TTCallGraphShared.ql Updates analysis-quality query to use Cfg::CallNode.
python/ql/src/meta/analysis-quality/TTCallGraphOverview.ql Updates analysis-quality aggregation to use Cfg::CallNode.
python/ql/src/meta/analysis-quality/TTCallGraphNewAmbiguous.ql Updates analysis-quality query to use Cfg::CallNode.
python/ql/src/meta/analysis-quality/TTCallGraphNew.ql Updates analysis-quality query to use Cfg::CallNode.
python/ql/src/meta/analysis-quality/TTCallGraphMissing.ql Updates analysis-quality query to use Cfg::CallNode.
python/ql/src/meta/analysis-quality/TTCallGraph.ql Updates analysis-quality query to use Cfg::CallNode.
python/ql/src/meta/analysis-quality/CallGraphQuality.qll Updates call-graph quality plumbing to use Cfg::CallNode.
python/ql/src/Functions/SignatureOverriddenMethod.ql Updates call resolution to use Cfg::CallNode in DataFlow call nodes.
python/ql/src/Expressions/UseofApply.ql Updates query to use Cfg::CallNode.
python/ql/src/experimental/semmle/python/security/injection/CsvInjection.qll Updates guard predicate signature to use Cfg::ControlFlowNode.
python/ql/src/experimental/Security/UnsafeUnpackQuery.qll Updates CFG node type checks to Cfg::... nodes.
python/ql/src/experimental/Security/CWE-770/UnicodeDoS.ql Updates compare-node typing and guard signature to Cfg::... nodes.
python/ql/src/experimental/Security/CWE-346/CorsBypass.ql Removes legacy Flow import and rewrites to shared CFG facade node types.
python/ql/src/experimental/Security/CWE-340/TokenBuiltFromUUID.ql Updates definition-node typing to Cfg::DefinitionNode/Cfg::NameNode.
python/ql/src/experimental/Security/CWE-022bis/TarSlipImprov.ql Updates CFG node type checks to Cfg::... nodes.
python/ql/src/Exceptions/UnguardedNextInGenerator.ql Mixes shared CFG facade + legacy Flow for call matching under generator/exception behavior.
python/ql/lib/utils/test/dataflow/UnresolvedCalls.qll Updates unresolved-call test utility to use Cfg::CallNode and canonical representative filtering.
python/ql/lib/utils/test/dataflow/testTaintConfig.qll Updates taint test config to use Cfg::... node types.
python/ql/lib/utils/test/dataflow/testConfig.qll Updates dataflow test config to use Cfg::NameNode.
python/ql/lib/utils/test/dataflow/RoutingTest.qll Updates routing-test helpers to use Cfg::CallNode in name extraction.
python/ql/lib/utils/test/dataflow/NormalTaintTrackingTest.qll Updates sink reconstruction to use Cfg::NameNode.
python/ql/lib/utils/test/dataflow/NormalDataflowTest.qll Updates sink reconstruction to use Cfg::NameNode.
python/ql/lib/utils/test/dataflow/MaximalFlowTest.qll Updates maximal-flow config to use Cfg::... node types.
python/ql/lib/semmle/python/security/dataflow/UrlRedirectCustomizations.qll Updates binary-expr node typing to Cfg::BinaryExprNode.
python/ql/lib/semmle/python/security/dataflow/TarSlipCustomizations.qll Updates guard signature and attribute/name traversal to Cfg::... nodes.
python/ql/lib/semmle/python/security/dataflow/ServerSideRequestForgeryCustomizations.qll Updates binary-expr typing and guard signature to Cfg::... nodes.
python/ql/lib/semmle/python/security/dataflow/ExceptionInfo.qll Updates caught-exception modeling to use Cfg::NameNode.defines instead of legacy ESSA node definitions.
python/ql/lib/semmle/python/regexp/internal/ParseRegExp.qll Updates binary-expr typing to Cfg::BinaryExprNode.
python/ql/lib/semmle/python/frameworks/Yarl.qll Updates guard signature to use Cfg::ControlFlowNode.
python/ql/lib/semmle/python/frameworks/Yaml.qll Updates framework call node override type to Cfg::CallNode.
python/ql/lib/semmle/python/frameworks/Werkzeug.qll Updates subscript/definition typing to Cfg::SubscriptNode/Cfg::DefinitionNode.
python/ql/lib/semmle/python/frameworks/Twisted.qll Rewrites return-value flow node selection using AST Return rather than legacy helper.
python/ql/lib/semmle/python/frameworks/Tornado.qll Updates multiple node types to Cfg::... and adjusts routing helpers to new node classes.
python/ql/lib/semmle/python/frameworks/Stdlib/Urllib.qll Updates guard signature to use Cfg::ControlFlowNode.
python/ql/lib/semmle/python/frameworks/Pyramid.qll Rewrites return-value flow node selection using AST Return rather than legacy helper.
python/ql/lib/semmle/python/frameworks/Pydantic.qll Updates subscript node typing to Cfg::SubscriptNode.
python/ql/lib/semmle/python/frameworks/Pycurl.qll Updates attribute node typing to Cfg::AttrNode.
python/ql/lib/semmle/python/frameworks/MarkupSafe.qll Updates call/binary-expr override node types to Cfg::....
python/ql/lib/semmle/python/frameworks/internal/SubclassFinder.qll Updates CFG node typing in subclass-finding logic to Cfg::ControlFlowNode.
python/ql/lib/semmle/python/frameworks/Gradio.qll Updates list-node typing to Cfg::ListNode.
python/ql/lib/semmle/python/frameworks/FastApi.qll Rewrites return-value node selection using AST Return; updates subscript/definition typing to Cfg::....
python/ql/lib/semmle/python/frameworks/Bottle.qll Rewrites return-value node selection using AST Return; updates subscript/definition typing to Cfg::....
python/ql/lib/semmle/python/Flow.qll Minor comment text changes (some appear accidental; see PR comments).
python/ql/lib/semmle/python/dataflow/new/SensitiveDataSources.qll Updates sensitive-data modeling to use Cfg::... node types.
python/ql/lib/semmle/python/dataflow/new/internal/VariableCapture.qll Updates variable-capture integration to use shared CFG+SSA adapter types.
python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll Updates summary return-node matching and node-type checks to shared CFG/SSA types.
python/ql/lib/semmle/python/dataflow/new/internal/TaintTrackingPrivate.qll Updates binary/subscript/call node typing and with-definition SSA references.
python/ql/lib/semmle/python/dataflow/new/internal/MatchUnpacking.qll Updates pattern-alias/capture flow to use CFG-based name nodes.
python/ql/lib/semmle/python/dataflow/new/internal/LocalSources.qll Updates subscript node typing to Cfg::SubscriptNode.
python/ql/lib/semmle/python/dataflow/new/internal/ImportStar.qll Updates name node typing and import-star base to Cfg::... nodes.
python/ql/lib/semmle/python/dataflow/new/internal/Builtins.qll Updates builtin-access discovery to use Cfg::NameNode.
python/ql/lib/semmle/python/dataflow/new/BarrierGuards.qll Updates guard signatures and iterable/compare node typing to Cfg::....
python/ql/lib/semmle/python/Concepts.qll Updates guard signatures to use Cfg::ControlFlowNode.
python/ql/lib/semmle/python/ApiGraphs.qll Imports shared CFG facade and updates subscript/definition node typing; includes comment text tweaks (some need correction; see PR comments).
python/ql/consistency-queries/DataFlowConsistency.ql Updates dataflow consistency check to use Cfg::CallNode for call matching.

Copilot's findings

  • Files reviewed: 204/249 changed files
  • Comments generated: 8

Comment on lines 240 to 244
      ResolvableRecordedCall() {
    -   exists(Call call, XmlCallee xmlCallee, ControlFlowNode callCfg |
    +   exists(Call call, XmlCallee xmlCallee |
          call = this.getACall() and
    -     callCfg.getNode() = call and
    -     calleeValue.getACall() = callCfg and
    +     calleeValue.getACall() = call.getAFlowNode() and
          xmlCallee = this.getXmlCallee() and
Comment on lines 199 to 202
      predicate strictlyDominates(ControlFlowNode other) {
        // This predicate is gigantic, so it must be inlined.
    -   // About 1.4 billion tuples for OpenStack Cinder.
    +   // About 1.4 billion tuples for OpenStack Py::Cinder.
        this.getBasicBlock().strictlyDominates(other.getBasicBlock())
Comment on lines 1094 to 1098
       * Holds if this element is at the specified location.
       * The location spans column `startcolumn` of line `startline` to
       * column `endcolumn` of line `endline` in file `filepath`.
    -  * For more information, see
    +  * Py::For more information, see
       * [Locations](https://codeql.github.com/docs/writing-codeql-queries/providing-locations-in-codeql-queries/).
Comment on lines +9 to +11
// Importing python under the `py` namespace to avoid importing `Cfg::CallNode` from `Flow.qll` and thereby having a naming conflict with `API::CallNode`.
private import python as PY
private import semmle.python.controlflow.internal.Cfg as Cfg
Comment on lines 777 to 780
      // TODO: once convenient, this should be done at a higher level than the AST,
      // at least at the CFG layer, to take splitting into account.
    - // Also consider `SequenceNode for generality.
    + // Also consider `Cfg::SequenceNode for generality.
      exists(PY::List list | list = pred.(DataFlow::ExprNode).getNode().getNode() |
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch 2 times, most recently from 4304ddb to 5187e9a Compare June 1, 2026 12:49

yoff commented Jun 1, 2026

Contributor Author

Addressed in latest push (force-pushed amendment):

  • RecordedCalls.qll:245 — restored P1's ControlFlowNode callCfg | callCfg.getNode() = call form. (Already addressed in the prior amendment; this was an accidental regression because the flip's tree was taken from the older big PR before P1 landed.)
  • Flow.qll:202 (OpenStack Py::Cinder) — reverted to OpenStack Cinder. P2 already had this fix in its review-comment amendment; this regression came from the same older-tree path. Flip should not touch Flow.qll at all and now doesn't.
  • Flow.qll:1098 (Py::For more information) — same as above; reverted to For more information.
  • ApiGraphs.qll:9 — reworded comment to clarify that the conflict is with the unqualified CallNode from import python (via Flow.qll), not Cfg::CallNode.
  • ApiGraphs.qll:779 — closed the unclosed backtick; the comment now reads "Also consider `Cfg::SequenceNode` for generality."

@yoff yoff marked this pull request as draft June 1, 2026 13:15
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch from 5187e9a to 8178a8d Compare June 1, 2026 13:18

yoff commented Jun 1, 2026

Contributor Author

Re. PatternAliasDefinition / PatternCaptureDefinition replacement in MatchUnpacking.qll:

Precision analysis. The legacy code used PatternAliasDefinition pad (an EssaNodeDefinition produced by pattern_alias_definition(v, defn) in SsaDefinitions.qll), constrained as defn.getNode() = alias ∧ alias = v.getAStore(). The v.getAStore() filter is vacuous here — MatchAsPattern.getAlias() always returns a binding-position Name, so it is always a store. So pad.getDefiningNode() is just "the CFG node for the alias Name".

In the new CFG, a leaf Name AST has exactly one canonical Cfg::NameNode. So matching by getNode() = alias picks the same single CFG node the legacy ESSA wrapper picked. No precision is lost.

Simplification. Pushed 8178a8d, which inlines the intermediate exists(Cfg::ControlFlowNode aliasCfg | ...) into nodeTo.(CfgNode).getNode().getNode() = alias (and analogously for capture.getVariable()), and removes the now-unused Cfg import. library-tests/dataflow/match still passes.

@yoff yoff force-pushed the yoff/python-add-new-ssa branch from 28062e1 to 4d4b916 Compare June 1, 2026 13:28
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch from 8178a8d to 8ecea67 Compare June 1, 2026 13:37
@yoff yoff force-pushed the yoff/python-add-new-ssa branch from 4d4b916 to 6313e70 Compare June 1, 2026 14:05
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch from 8ecea67 to c476e5a Compare June 1, 2026 15:46
@yoff yoff force-pushed the yoff/python-add-new-ssa branch from 6313e70 to e3b9611 Compare June 2, 2026 08:30
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch from c476e5a to 483a64a Compare June 2, 2026 08:30
@yoff yoff force-pushed the yoff/python-add-new-ssa branch from e3b9611 to 9be5c62 Compare June 2, 2026 08:47
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch from 483a64a to ead93ca Compare June 2, 2026 08:47
@yoff yoff force-pushed the yoff/python-add-new-ssa branch from 9be5c62 to 944d56e Compare June 2, 2026 13:48
tausbn and others added 17 commits June 26, 2026 12:07
The `manual_rule!` macro is now fully subsumed by `rule!` + `@@name`, so
this commit removes the code that is no longer needed.
- unified/swift: Mark `binding_kind` as a raw `@@` capture in the
  property_declaration rule. It is only used to read its source text
  (`ctx.ast.source_text`), never as a translated node. With `@` the
  auto-translate prefix would route the unnamed `let`/`var` token
  through the catch-all `_ @node => {node}` fallback for a no-op
  roundtrip; `@@` makes the intent explicit and removes that reliance.

- shared/yeast/tests: Reword a stale comment in test_raw_capture_marker.
  The text claimed a "second assertion" exists in this test, but the
  explicit-translation check actually lives in the companion
  test_raw_capture_marker_explicit_translate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…al/gradle

Replace `jcenter()` and `mavenCentral()` with Maven Central mirror URL
unified: Add or_pattern and fix 'if case let' translation
Ruby: Address testFailures in inline expectations tests (part 1)
…ntax

yeast: Extend `rule!` macro with support for raw captures
Preparatory refactor for the shared-CFG dataflow migration. Adds the
adapter that mediates between the Python AST and the shared
codeql.controlflow.ControlFlowGraph signature, plus the test suites
that validate the new CFG directly against this adapter. The public
facade is added in the following commit.

Library additions:

- semmle.python.controlflow.internal.AstNodeImpl — wraps Python's
  Stmt/Expr/Scope/Pattern and adds two synthetic kinds of node
  (BlockStmt for body slots, intermediate nodes for multi-operand
  boolean expressions) to satisfy the shared CFG signature.

- lib/printCfgNew.ql — debug/visualisation query for the new CFG.

- consistency-queries/CfgConsistency.ql — consistency query running
  the shared CFG's standard checks against Python.
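
As a rough illustration of why multi-operand boolean expressions need intermediate control-flow points, the Python sketch below records which operands of a three-operand `and` are actually evaluated. The helper `operand` and the list `trace` are hypothetical names used only here. How AstNodeImpl places its synthetic nodes is not shown in this commit message; the sketch only demonstrates the runtime behaviour a CFG for such an expression has to model.

```python
trace = []


def operand(name, value):
    """Record that an operand was evaluated, then return its value."""
    trace.append(name)
    return value


# Evaluation stops at the first falsy operand, so "c" is never evaluated.
result = operand("a", True) and operand("b", False) and operand("c", True)

print(trace)
print(result)
```

After each operand there is a decision: either short-circuit out of the expression or continue with the next operand. With more than two operands there are several such decision points inside a single AST expression.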

Test additions (all driven directly off AstNodeImpl):

- ControlFlow/bindings/* — annotation-driven SSA-binding tests
  (annassign, compound, comprehension, decorated, except_handler,
  imports, match_pattern, parameters, simple, type_params,
  walrus_starred, with_stmt, dead_under_no_raise).

- ControlFlow/evaluation-order/NewCfg*.ql — mirrors of the existing
  OldCfg evaluation-order self-validation suite, run against the
  new CFG via NewCfgImpl.qll.

- Minor extensions to existing test_if.py / test_boolean.py +
  cosmetic .expected churn on a handful of OldCfg tests.

No dataflow, SSA, or production query is migrated yet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the public facade on top of the AstNodeImpl adapter from the
previous commit. Re-exposes the same API surface as
semmle/python/Flow.qll (ControlFlowNode, CallNode, BasicBlock,
NameNode, DefinitionNode, CompareNode, ...), backed by the shared
codeql.controlflow.ControlFlowGraph library.

- semmle.python.controlflow.internal.Cfg — public facade.
- ControlFlow/store-load/* — basic store/load coverage via the facade.

The new CFG library is added additively: it has zero callers in lib/
and src/, and the legacy CFG in semmle/python/Flow.qll remains the
default. Dataflow, SSA, and production query migration land in
follow-up PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…with

The new CFG previously only emitted exception edges for explicit `raise`
and `assert` statements. As a result, code that became reachable only
via the exception path of an arbitrary expression (e.g., the body of an
`except` handler following a try-body whose `call()` could raise) was
classified as dead, breaking analyses like StackTraceExposure,
FileNotAlwaysClosed, ExceptionInfo, UseOfExit, and CatchingBaseException.

This commit adds a `mayThrow` predicate over expressions that are known
sources of implicit exceptions in Python (calls, attribute access,
subscripts, arithmetic/comparison operators, imports, await/yield/yield
from) plus `from m import *` at the statement level, and routes them
through the shared CFG's `beginAbruptCompletion(_, _, ExceptionSuccessor,
always=false)` hook.

The set of exception sources is restricted to nodes that are
syntactically inside a `try`/`with` statement in the same scope.
This mirrors Java's `ControlFlowGraph::mayThrow`, which only emits
exception edges where local handling can observe them — outside such
contexts, the edges add CFG complexity (weakening BarrierGuard
precision and breaking SSA continuity around augmented assignments and
subscript stores) without analysis benefit, since exceptions just
propagate to the function exit anyway.
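
A small Python sketch of the situation described above, with the hypothetical function name `parse_port`: the `except` body can only be reached through an implicit exception raised by a call inside the `try` body, since there is no explicit `raise` or `assert`.

```python
def parse_port(text):
    try:
        # int(...) is a call, so it may raise implicitly (here: ValueError).
        port = int(text)
    except ValueError:
        # Reachable only via the exception edge out of the call above.
        return None
    return port


print(parse_port("8080"))
print(parse_port("http"))
```

If calls inside a `try` had no exception edge, the handler body in this example would have no incoming control flow and would be classified as dead.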

Net effect on the test suite: ~100 alerts restored across the exception-
related query tests (StackTraceExposure +29, ExceptionInfo +17,
FileNotAlwaysClosed +52, UseOfExit +1, CatchingBaseException restored)
with no precision regressions. Affected `.expected` files and the
regression-guard `dead_under_no_raise.py` are updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Preparatory refactor for the shared-CFG dataflow migration. Adds the
new Python SSA adapter additively, without changing any production
behaviour.

Library additions:

- semmle.python.dataflow.new.internal.SsaImpl — Python SSA
  implementation built on the new (shared) CFG. Mirrors the Java SSA
  adapter (java/ql/lib/semmle/code/java/dataflow/internal/SsaImpl.qll):
  an InputSig is defined in terms of positional (BasicBlock, int)
  variable references, and the shared
  codeql.ssa.Ssa::Make<Location, Cfg, Input> module is then
  instantiated.

  SourceVariable is the AST-level Py::Variable. Variable references
  are looked up via the new CFG facade's NameNode.defines/uses/deletes
  predicates (added in the preceding PR), which themselves are
  one-line bridges to AST-level Name.defines/uses/deletes.

  Implicit-entry definitions are inserted for non-local/global/builtin
  reads, captured variables, and (when needed) parameters.
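
To make the kinds of variable reference mentioned above concrete, here is a minimal Python sketch. The names `LIMIT`, `make_counter`, and `step` are illustrative only. The comments mark which reads have no defining write in the scope where they occur, which is my reading of the cases the commit message lists for implicit-entry definitions; the exact conditions are defined in SsaImpl and are not reproduced here.

```python
LIMIT = 3  # module-level global


def make_counter(start):       # `start` is a parameter
    count = start              # ordinary local definition, captured below

    def step():
        nonlocal count         # captured variable from the enclosing scope
        # `LIMIT` (global) and `min` (builtin) are read here but never
        # assigned in this scope; `count` is read before being rewritten.
        count = min(count + 1, LIMIT)
        return count

    return step


step = make_counter(1)
print(step(), step(), step())
```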

Test additions:

- library-tests/dataflow-new-ssa/ — exercises the new SSA over a
  representative test corpus and checks expected def/use chains.

- library-tests/dataflow-new-ssa-vs-legacy/ — runs both new SSA and
  legacy ESSA over the same corpus and diffs the results, so any
  semantic divergence shows up as a test failure.

Production impact:

None. The new SSA adapter has zero callers in lib/ and src/ — the
legacy ESSA SSA (semmle/python/essa/*) remains the default. The
dataflow library is not migrated yet; that lands in a follow-up PR.

Verified by:
- All 367 lib + src + consistency-queries compile clean.
- All 641 ControlFlow + PointsTo + dataflow + essa + consistency
  library-tests pass.
- Both new dataflow-new-ssa[/vs-legacy] test packs pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.
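
A hedged Python sketch of the source-level shape a barrier guard reasons about. The names `is_safe`, `render`, and `render_negated` are hypothetical. It shows the two outcomes of one check and a negated use of the same check; how GuardNode maps these to isAfterTrue / isAfterFalse is not shown in this commit message.

```python
def is_safe(name):
    """Hypothetical sanitiser check: accept alphanumeric names only."""
    return name.isalnum()


def render(name):
    if is_safe(name):
        # Reached only after the check evaluated to true.
        return f"hello {name}"
    # Reached only after the check evaluated to false.
    return "rejected"


def render_negated(name):
    if not is_safe(name):
        # The condition is true here, but is_safe(name) itself was false.
        return "rejected"
    return f"hello {name}"


print(render("bob"), render("a b"))
print(render_negated("bob"), render_negated("a b"))
```

In `render_negated`, the guarded use of `name` sits on the false branch of the `if`, while the underlying check still evaluated to true there.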

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.
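
The first tweak matches Python's runtime semantics, which the following sketch makes observable. The `Recorder` class is a hypothetical helper, and a subscript target is used because its load and store can be intercepted; a plain name target such as `x += 1` likewise reads `x` before rebinding it.

```python
class Recorder:
    """Dictionary wrapper that logs item loads and stores."""

    def __init__(self):
        self.data = {"hits": 0}
        self.events = []

    def __getitem__(self, key):
        self.events.append(("load", key))
        return self.data[key]

    def __setitem__(self, key, value):
        self.events.append(("store", key))
        self.data[key] = value


r = Recorder()
r["hits"] += 1  # the target is first loaded, then stored

print(r.events)
print(r.data["hits"])
```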

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yoff yoff force-pushed the yoff/python-shared-cfg-dataflow-flip branch from 62f34d5 to 44b8aad Compare June 29, 2026 11:40