11 changes: 9 additions & 2 deletions .github/workflows/arco-demo.yml
@@ -7,10 +7,11 @@ name: ARCO Demo Run
#
# What this gates (any failure fails the build):
# - "ALL CHECKS PASSED" signal in the pipeline output.
-# - Three regression test scripts return 0:
-# * test_gate_removal.py (each Annex III 1(a) gate is independently necessary)
+# - Four regression test scripts return 0:
+# * test_gate_removal.py (each Annex III 1(a) and 5(b) gate is independently necessary)
# * test_scenarios.py (multi-scenario classification correctness)
# * test_kiosk_html_no_false_concretization.py (L4.7 regression)
+# * test_adversarial_mechanism.py (decoy and ghost classification mechanism)
# - Five expected artifact files exist in runs/demo/:
# certificate.txt, summary.json, evidence.json, shacl_report.txt,
# determination_view.html.
@@ -95,6 +96,12 @@ jobs:
set -euo pipefail
python -u 03_TECHNICAL_CORE/scripts/test_kiosk_html_no_false_concretization.py

+- name: Run adversarial-mechanism regression tests
+shell: bash
+run: |
+set -euo pipefail
+python -u 03_TECHNICAL_CORE/scripts/test_adversarial_mechanism.py

- name: Verify artifact files exist
shell: bash
run: |
3 changes: 3 additions & 0 deletions .github/workflows/arco-smoke-test.yml
@@ -46,3 +46,6 @@ jobs:

- name: Run kiosk HTML no-false-concretization regression test (L4.7)
run: python 03_TECHNICAL_CORE/scripts/test_kiosk_html_no_false_concretization.py
+
+- name: Run adversarial-mechanism regression tests
+run: python 03_TECHNICAL_CORE/scripts/test_adversarial_mechanism.py
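Review note: the gate described in the arco-demo.yml header comment (regression scripts return 0, five artifacts exist in runs/demo/) can be reproduced locally with a small helper. The sketch below is illustrative only; `missing_artifacts` and `run_scripts` are hypothetical names and are not part of this PR. The artifact names are copied from the workflow comment.

```python
import subprocess
import sys
from pathlib import Path

# Artifact names as listed in the arco-demo.yml header comment.
EXPECTED_ARTIFACTS = [
    "certificate.txt",
    "summary.json",
    "evidence.json",
    "shacl_report.txt",
    "determination_view.html",
]


def missing_artifacts(run_dir):
    """Return the expected artifact names that are absent from run_dir."""
    run_dir = Path(run_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (run_dir / name).is_file()]


def run_scripts(script_paths):
    """Run each script with the current interpreter (unbuffered, like
    `python -u` in the workflow) and return the paths that exit nonzero."""
    failed = []
    for script in script_paths:
        result = subprocess.run([sys.executable, "-u", str(script)])
        if result.returncode != 0:
            failed.append(str(script))
    return failed
```

A local pre-push check could call `run_scripts` with the four script paths under 03_TECHNICAL_CORE/scripts/ and then `missing_artifacts("runs/demo")`, treating any non-empty result as a failure.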
38 changes: 38 additions & 0 deletions 03_TECHNICAL_CORE/ontology/ARCO_governance_extension.ttl
@@ -149,6 +149,44 @@ cco:Organization rdf:type owl:Class ;
rdfs:label "Organization"@en ;
rdfs:subClassOf bfo:0000027 . # Object Aggregate

+#################################################################
+# 3a) REGULATORY CONTENT — Annex III conditions (universal)
+#
+# These ICE instances describe what EU AI Act Regulation (EU) 2024/1689
+# Annex III prescribes for the modeled categories. They are universal
+# regulatory content (one ICE per Annex III condition modeled), not
+# per-fixture data. Every fixture references them via iao:0000136
+# from its :AssessmentDocumentation.
+#
+# Pattern: a regulatory condition ICE has type :RegulatoryContent,
+# prescribes the regulated process type via cco:prescribes, and is_about
+# the capability / process / role universals via iao:0000136.
+#
+# Moved from per-fixture files (Sentinel, CreditScoring) on 2026-05-14
+# to close regulatory_alignment FAIL on Adversarial and FlagTest fixtures.
+#################################################################
+
+:AnnexIII_List rdf:type :RegulatoryContent ;
+rdfs:label "Annex III List" ;
+bfo:0000051 :AnnexIII_Condition_1a ;
+bfo:0000051 :AnnexIII_Condition_5b .
+
+:AnnexIII_Condition_1a rdf:type :RegulatoryContent ;
+rdfs:label "Annex III 1(a) (Biometric Rule)" ;
+rdfs:comment "Annex III item 1(a): biometric identification of natural persons. cco:prescribes targets the regulated process TYPE (class IRI as concept-individual via OWL 2 punning) — the regulation prescribes process types, not deployment-specific tokens." ;
+cco:prescribes :RemoteBiometricIdentificationProcess ;
+iao:0000136 :BiometricIdentificationCapability ;
+iao:0000136 :RemoteBiometricIdentificationProcess ;
+iao:0000136 :NaturalPersonRole .
+
+:AnnexIII_Condition_5b rdf:type :RegulatoryContent ;
+rdfs:label "Annex III 5(b) (Creditworthiness Rule)" ;
+rdfs:comment "Annex III item 5(b): AI systems intended to evaluate the creditworthiness of natural persons or establish their credit score, with the exception of AI systems used for the purpose of detecting financial fraud. cco:prescribes targets the regulated process TYPE (class IRI as concept-individual via OWL 2 punning)." ;
+cco:prescribes :CreditworthinessEvaluationProcess ;
+iao:0000136 :CreditworthinessEvaluationCapability ;
+iao:0000136 :CreditworthinessEvaluationProcess ;
+iao:0000136 :NaturalPersonRole .
+
#################################################################
# 3b) REGULATORY BRIDGE AXIOMS
#
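Review note: the pattern stated in the 3a header comment (a condition is typed :RegulatoryContent, is a part of :AnnexIII_List via bfo:0000051, and prescribes a process type via cco:prescribes) can be checked mechanically. The sketch below models triples as plain Python tuples rather than using an RDF library, so it is a stand-in for illustration, not the project's validation code. `pattern_problems` is a hypothetical name, and the rule that the prescribed process type must also be an iao:0000136 target is an assumption inferred from the two condition blocks in this hunk.

```python
RDF_TYPE = "rdf:type"
HAS_PART = "bfo:0000051"
PRESCRIBES = "cco:prescribes"
IS_ABOUT = "iao:0000136"

# Triples transcribed from the :AnnexIII_List and :AnnexIII_Condition_5b
# blocks in the governance-extension hunk (labels and comments omitted).
GOVERNANCE_TRIPLES = {
    (":AnnexIII_List", RDF_TYPE, ":RegulatoryContent"),
    (":AnnexIII_List", HAS_PART, ":AnnexIII_Condition_1a"),
    (":AnnexIII_List", HAS_PART, ":AnnexIII_Condition_5b"),
    (":AnnexIII_Condition_5b", RDF_TYPE, ":RegulatoryContent"),
    (":AnnexIII_Condition_5b", PRESCRIBES, ":CreditworthinessEvaluationProcess"),
    (":AnnexIII_Condition_5b", IS_ABOUT, ":CreditworthinessEvaluationCapability"),
    (":AnnexIII_Condition_5b", IS_ABOUT, ":CreditworthinessEvaluationProcess"),
    (":AnnexIII_Condition_5b", IS_ABOUT, ":NaturalPersonRole"),
}


def pattern_problems(triples, condition, list_iri=":AnnexIII_List"):
    """Return a list of human-readable violations of the 3a pattern."""
    problems = []
    if (condition, RDF_TYPE, ":RegulatoryContent") not in triples:
        problems.append("not typed :RegulatoryContent")
    if (list_iri, HAS_PART, condition) not in triples:
        problems.append("not a part of " + list_iri)
    prescribed = {o for s, p, o in triples if s == condition and p == PRESCRIBES}
    about = {o for s, p, o in triples if s == condition and p == IS_ABOUT}
    if not prescribed:
        problems.append("prescribes no process type")
    elif not prescribed <= about:
        # Assumed rule: the prescribed type should also be an is_about target.
        problems.append("prescribed process type is not an is_about target")
    return problems
```

With the triples above, `:AnnexIII_Condition_5b` yields no problems, while `:AnnexIII_Condition_1a` (whose own block was deliberately left out of the sample set) is reported as untyped and as prescribing nothing.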
21 changes: 4 additions & 17 deletions 03_TECHNICAL_CORE/ontology/ARCO_instances_creditscoring.ttl
@@ -21,25 +21,12 @@
owl:imports <https://arco.ai/ontology/governance> .

#################################################################
-# 1) REGULATORY LAYER — Annex III 5(b)
+# 1) ANNEX III REGULATORY LAYER
+#
+# Annex III conditions moved to ARCO_governance_extension.ttl on 2026-05-14;
+# references via iao:0000136 stay below.
#################################################################

-# Mereological backbone (CLAUDE.md invariant 8): every modeled Annex III condition
-# is `bfo:0000051` of `:AnnexIII_List`. The list is also re-asserted here with its
-# rdf:type so this fixture is self-contained when loaded standalone (test_scenarios.py
-# loads each fixture independently). Duplicate triples across fixtures are deduped
-# at union time. See runs/loop/2026-05-09_beverley-research/audit_C_regulatory.md T2.
-:AnnexIII_List rdf:type :RegulatoryContent ;
-bfo:0000051 :AnnexIII_Condition_5b .
-
-:AnnexIII_Condition_5b rdf:type :RegulatoryContent ;
-rdfs:label "Annex III 5(b) (Creditworthiness Rule)" ;
-rdfs:comment "Annex III item 5(b): AI systems intended to be used to evaluate the creditworthiness of natural persons or establish their credit score, with the exception of AI systems used for the purpose of detecting financial fraud. cco:prescribes targets the regulated process TYPE (class IRI as concept-individual via OWL 2 punning) — the regulation prescribes process types, not deployment-specific tokens. This matches the Sentinel pattern and generalizes across multiple 5(b) assessments sharing this single regulatory ICE." ;
-cco:prescribes :CreditworthinessEvaluationProcess ;
-iao:0000136 :CreditworthinessEvaluationCapability ;
-iao:0000136 :CreditworthinessEvaluationProcess ;
-iao:0000136 :NaturalPersonRole .

#################################################################
# 2) SYSTEM LAYER (reality-side particulars)
#################################################################
30 changes: 25 additions & 5 deletions 03_TECHNICAL_CORE/ontology/ARCO_instances_flag_tests.ttl
@@ -11,11 +11,31 @@
rdfs:label "ARCO Flag Test Instances" ;
rdfs:comment """Two test cases for the audit-layer exception flags.

-NOTE: Running the full pipeline on these instances will show classification PASS
-but audit FAIL (traceability and regulatory alignment). This is expected and correct —
-these are minimal instances for flag testing only, without full regulatory content
-linkage. The same pattern applies to ARCO_instances_adversarial_*.ttl.
-The classification layer and flag behavior are the only results under test here.
+TEST TARGET: simultaneous OWL classification + audit-layer flag detection
+on the same system, demonstrating that classification and audit do not
+bleed into each other.
+
+Expected audit-row outcomes (post 2026-05-14 migration of regulatory content
+to ARCO_governance_extension.ttl):
+- classification: PASS (all three Annex III gates satisfied)
+- exception flag: FLAGGED (derogation or fraud, per fixture)
+- traceability: FAIL
+- regulatory_alignment: FAIL
+The traceability and regulatory_alignment FAILs are NOT the test target.
+They persist because the :AssessmentDocumentation instances below do not
+carry an iao:0000136 link to :AnnexIII_Condition_1a or :AnnexIII_Condition_5b,
+so the audit queries (which require ?doc iao:0000136 ?condition) return false.
+This is a fixture-authoring gap, not a defect in the classification or flag
+behavior under test. Adding those links would close the audit FAILs without
+affecting the classification or flag entailments — separate change.
+
+Prior to the 2026-05-14 migration, the audit FAILs also covered fixture-
+distribution effects (regulatory condition declarations were per-fixture
+inside ARCO_instances_sentinel.ttl and ARCO_instances_creditscoring.ttl, so
+the universal regulatory content was invisible to this fixture). The
+migration moved those declarations to the governance extension, removing
+the distribution issue; what remains is the local AssessmentDoc->condition
+linkage gap described above.

Test A — FlagTest_BiometricSystem_WithDerogationClaim:
A system that IS classified as AnnexIII1aApplicableSystem (all three gates satisfied)
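Review note: the new fixture comment says the audit rows stay FAIL because no :AssessmentDocumentation instance carries an iao:0000136 link to an Annex III condition, so the `?doc iao:0000136 ?condition` pattern finds no match. The sketch below imitates that pattern over plain Python tuples to show why moving the conditions into the governance extension is not sufficient on its own. It is an assumption-laden stand-in: the real audit queries are not shown in this PR, `doc_links_to_condition` is a hypothetical helper, and `:FlagTest_AssessmentDoc` is an invented instance name.

```python
RDF_TYPE = "rdf:type"
IS_ABOUT = "iao:0000136"


def doc_links_to_condition(triples,
                           doc_class=":AssessmentDocumentation",
                           condition_class=":RegulatoryContent"):
    """Mimic an ASK for: ?doc a doc_class ; iao:0000136 ?condition .
    ?condition a condition_class ."""
    docs = {s for s, p, o in triples if p == RDF_TYPE and o == doc_class}
    conditions = {s for s, p, o in triples if p == RDF_TYPE and o == condition_class}
    return any(p == IS_ABOUT and s in docs and o in conditions
               for s, p, o in triples)


# Universal content (now in the governance extension) plus a fixture
# document that does not reference it. The document name is hypothetical.
graph = {
    (":AnnexIII_Condition_1a", RDF_TYPE, ":RegulatoryContent"),
    (":FlagTest_AssessmentDoc", RDF_TYPE, ":AssessmentDocumentation"),
}
before = doc_links_to_condition(graph)  # condition visible, but no link

# Adding the missing link is the "separate change" the comment refers to.
graph.add((":FlagTest_AssessmentDoc", IS_ABOUT, ":AnnexIII_Condition_1a"))
after = doc_links_to_condition(graph)
```

Here `before` is False and `after` is True, which matches the comment's claim that the remaining FAILs come from the local document-to-condition linkage gap rather than from where the conditions are declared.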
16 changes: 4 additions & 12 deletions 03_TECHNICAL_CORE/ontology/ARCO_instances_sentinel.ttl
@@ -13,20 +13,12 @@
owl:imports <https://arco.ai/ontology/governance> .

#################################################################
-# 1) REGULATORY LAYER (ICE grounded to reality)
+# 1) ANNEX III REGULATORY LAYER (mereological backbone)
+#
+# Annex III conditions moved to ARCO_governance_extension.ttl on 2026-05-14
+# as universal regulatory content; references via iao:0000136 stay below.
#################################################################

-:AnnexIII_List rdf:type :RegulatoryContent ;
-rdfs:label "Annex III List" ;
-bfo:0000051 :AnnexIII_Condition_1a . # has part
-
-:AnnexIII_Condition_1a rdf:type :RegulatoryContent ;
-rdfs:label "Annex III 1(a) (Biometric Rule)" ;
-cco:prescribes :RemoteBiometricIdentificationProcess ; # directive: prescribes the regulated process type (Three D's — DirectiveICE → Process)
-iao:0000136 :BiometricIdentificationCapability ; # is_about the capability universal
-iao:0000136 :RemoteBiometricIdentificationProcess ; # is_about the regulated process type
-iao:0000136 :NaturalPersonRole . # is_about the affected role

#################################################################
# 2) SYSTEM LAYER (reality-side particulars) - UPDATED
#################################################################