feat(skills): cover CoreFileObject in schemas + auditing #55

Open
iddocohen wants to merge 1 commit into main from ic-rule-core-file-object

@iddocohen

Contributor

Summary

Adds schema- and audit-side rules for CoreFileObject, the built-in Infrahub generic that turns a node into a file-bearing entity (PDF, Visio, KMZ, image, certificate, contract, runbook, etc.) with automatic file metadata and a GraphQL upload mutation. The skills previously covered CoreArtifactTarget and generate_template but had no guidance for CoreFileObject. The change is motivated by the opsmill/schema-library circuit_contract.yml extension and the broader file-attachment use cases (network diagrams, rack photos, compliance docs, KMZ overlays).

The rule is deliberately file-type-agnostic. The eval prompt uses a Visio network diagram, not a contract, so the model has to apply the capability rather than memorize the example.

What's in this PR

Schema-side (infrahub-managing-schemas)

  • rules/extension-file-object.md — capability-flag rule modeled on extension-artifact-target.md. Decision matrix covering diagrams, contracts, KMZ, imagery, certs, runbooks, BOMs; the five reserved read-only attributes (file_name, file_size, file_type, checksum, storage_id) called out as do-not-redeclare; "concrete node, not generic" trap; mandatory back-relationship to the parent entity; the bypass antipattern (kind: Text url/path/filename string instead of using object storage).
  • SKILL.md — new row in the "Designing for Downstream Consumers" table and a bullet in "Production Patterns Worth Knowing".
  • reference.md: CoreFileObject added to the built-in generics list alongside CoreArtifactTarget.
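
As a rough illustration of the shape the schema-side rule steers toward, here is a minimal sketch written as a Python dict that mirrors the schema YAML layout. The node name `NetworkDiagram`, the `Documentation` namespace, the `title` attribute, and the `LocationSite` peer are hypothetical; only `CoreFileObject` and the five reserved attribute names are taken from this PR.

```python
# Hypothetical schema fragment, expressed as a dict mirroring the YAML layout.
RESERVED_FILE_ATTRS = {"file_name", "file_size", "file_type", "checksum", "storage_id"}

schema = {
    "version": "1.0",
    "nodes": [
        {
            "name": "NetworkDiagram",            # hypothetical node name
            "namespace": "Documentation",        # hypothetical namespace
            "inherit_from": ["CoreFileObject"],  # declared on a concrete node, not a generic
            "human_friendly_id": ["title__value"],
            "attributes": [
                # Domain attributes only; the reserved metadata comes from CoreFileObject.
                {"name": "title", "kind": "Text", "unique": True},
            ],
            "relationships": [
                # Back-relationship to the parent entity (the peer kind is an assumption).
                {"name": "site", "peer": "LocationSite", "cardinality": "one", "optional": False},
            ],
        }
    ],
}

node = schema["nodes"][0]
declared = {attr["name"] for attr in node["attributes"]}
redeclared = declared & RESERVED_FILE_ATTRS
print(sorted(redeclared))  # an empty list means no reserved attribute is redeclared
```

Note what is absent: no `file_url` or `file_path` Text attribute, and none of the five reserved names in `attributes`.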

Audit-side (infrahub-auditing-repo)

  • rules/schema-file-object-misuse.md — MEDIUM severity rule flagging three drift conditions: reserved-attribute collisions on CoreFileObject heirs, CoreFileObject declared on a generic, and the bypass antipattern of a Text url/path string on a node that should be file-bearing.

Eval coverage (Rule = Test policy)

  • New task network-diagram-file-object in eval.yaml.
  • Four new check functions in graders/managing-schemas/lib.py: core-file-object-inherited, file-object-on-node-not-generic, no-reserved-file-attrs, no-filename-text-bypass.
  • New task grader graders/managing-schemas/check_file_object.py bundling the four checks with schema-version and human-friendly-id baselines.
  • Regenerated evaluations/infrahub-managing-schemas.json.
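
The four checks could look roughly like the sketch below. This is not the code in graders/managing-schemas/lib.py; the function and helper names, the dict-based schema input, and the underscore-token matching are assumptions chosen to match the behavior described in this PR.

```python
RESERVED = {"file_name", "file_size", "file_type", "checksum", "storage_id"}
BYPASS_TOKENS = {"url", "path", "location", "filename"}  # assumed token list

def _inherits(entry):
    return "CoreFileObject" in entry.get("inherit_from", [])

def run_checks(schema):
    """Return {check-name: passed} for a schema given as a plain dict."""
    heirs = [n for n in schema.get("nodes", []) if _inherits(n)]
    heir_attrs = [a for n in heirs for a in n.get("attributes", [])]
    return {
        # At least one concrete node inherits from CoreFileObject.
        "core-file-object-inherited": bool(heirs),
        # No generic declares CoreFileObject.
        "file-object-on-node-not-generic": not any(
            _inherits(g) for g in schema.get("generics", [])
        ),
        # Heirs do not redeclare the reserved metadata attributes.
        "no-reserved-file-attrs": not any(a["name"] in RESERVED for a in heir_attrs),
        # Heirs carry no Text attribute that looks like a file pointer.
        "no-filename-text-bypass": not any(
            a.get("kind") == "Text"
            and a["name"] not in RESERVED
            and BYPASS_TOKENS & set(a["name"].split("_"))
            for a in heir_attrs
        ),
    }

bypass_fixture = {
    "nodes": [
        {
            "name": "NetworkDiagram",  # hypothetical node
            "inherit_from": ["CoreFileObject"],
            "attributes": [
                {"name": "title", "kind": "Text"},
                {"name": "file_url", "kind": "Text"},
            ],
        }
    ]
}
failed = [name for name, ok in run_checks(bypass_fixture).items() if not ok]
print(failed)  # ['no-filename-text-bypass']
```

Excluding the reserved names from the bypass check keeps a redeclared `file_name` from being counted twice.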

Grader verification

Verified the grader against four hand-crafted fixtures:

| Fixture | Score | Failing checks named |
| --- | --- | --- |
| compliant | 1.0 | (none) |
| redeclares file_name + checksum | 0.833 | no-reserved-file-attrs |
| CoreFileObject on a generic | 0.667 | file-object-on-node-not-generic, core-file-object-inherited |
| file_url Text bypass attribute | 0.833 | no-filename-text-bypass |
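
The scores are consistent with six equally weighted checks: the four new checks plus the schema-version and human-friendly-id baselines. The equal weighting is an inference from the numbers rather than something the PR states, so treat the sketch below as a sanity check on the arithmetic only.

```python
TOTAL_CHECKS = 6  # assumption: four new checks + two baselines, equally weighted

def score(failed_count, total=TOTAL_CHECKS):
    """Fraction of checks passed, rounded to three decimals."""
    return round((total - failed_count) / total, 3)

print(score(0))  # 1.0   (compliant)
print(score(1))  # 0.833 (one failing check)
print(score(2))  # 0.667 (two failing checks)
```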

CI lint status (local)

  • rumdl check . — 0 issues across 149 files
  • yamllint — exit 0 (only a pre-existing warning on eval.yaml:1 unchanged by this branch)
  • evaluations/*.json — in sync with eval.yaml

Test plan

  • CI: markdown-lint, yaml-lint, evals-sync all green
  • skillgrade --smoke passes on the new network-diagram-file-object task (deferred to follow-up commit if needed)
  • No regression on existing device-artifact-target task — the new lib.py functions are additive
  • Manual sanity-check: ask a model with the skill loaded "add a node for storing rack photos against a Rack" — model should produce inherit_from: [CoreFileObject] with a back-relationship and no redeclared metadata attributes

Adds schema- and audit-side rules for CoreFileObject, the built-in
Infrahub generic that turns a node into a file-bearing entity (PDF,
Visio, KMZ, image, certificate, contract, runbook, etc.) with auto
file metadata and a GraphQL upload mutation. The skills previously
covered CoreArtifactTarget and generate_template but had no guidance
for CoreFileObject — driven by the opsmill/schema-library
circuit_contract.yml extension and the broader file-attachment use
cases (network diagrams, rack photos, compliance docs).

Schema-side (managing-schemas):
- rules/extension-file-object.md — capability-flag rule modeled on
  extension-artifact-target.md. File-type-agnostic decision matrix
  (diagrams, contracts, KMZ, imagery, certs, runbooks, BOMs); the
  five reserved read-only attributes (file_name, file_size,
  file_type, checksum, storage_id) called out as do-not-redeclare;
  "concrete node, not generic" trap; mandatory back-relationship to
  the parent entity; bypass antipattern (kind: Text url/path/
  filename string instead of using object storage).
- SKILL.md — new row in the Designing for Downstream Consumers
  table and a bullet in Production Patterns.
- reference.md — CoreFileObject added to the built-in generics
  list alongside CoreArtifactTarget.

Audit-side (auditing-repo):
- rules/schema-file-object-misuse.md — MEDIUM severity rule
  flagging the three drift conditions: reserved-attribute
  collisions on CoreFileObject heirs, CoreFileObject declared on
  a generic, and the bypass antipattern of a Text url/path string
  on a node that should be file-bearing.

Eval coverage:
- new task `network-diagram-file-object` in eval.yaml — Visio
  diagram prompt (deliberately non-contract domain so the model
  has to apply the capability, not memorize the example).
- new lib.py check functions: core-file-object-inherited,
  file-object-on-node-not-generic, no-reserved-file-attrs,
  no-filename-text-bypass.
- new task grader graders/managing-schemas/check_file_object.py
  bundling the four checks with schema-version and
  human-friendly-id baselines.
- regenerated evaluations/infrahub-managing-schemas.json.

Verified grader against four fixtures: compliant (1.0); redeclared
file_name+checksum (0.833, names no-reserved-file-attrs);
CoreFileObject on a generic (0.667, names both
file-object-on-node-not-generic and core-file-object-inherited);
file_url bypass attribute (0.833, names no-filename-text-bypass).

CI: rumdl clean, yamllint clean (pre-existing warning on
eval.yaml:1 unrelated), evaluations/*.json in sync.
@github-actions (bot) added the labels skill/auditing-repo (Changes to the infrahub-auditing-repo skill) and skill/managing-schemas (Changes to the infrahub-managing-schemas skill) Jun 7, 2026
@iddocohen requested a review from BeArchiTek June 7, 2026 14:22
@iddocohen marked this pull request as ready for review June 7, 2026 14:22
@cloudflare-workers-and-pages

Deploying infrahub-skills with Cloudflare Pages

Latest commit: 75186bd
Status: ✅  Deploy successful!
Preview URL: https://aa148d69.infrahub-skills.pages.dev
Branch Preview URL: https://ic-rule-core-file-object.infrahub-skills.pages.dev

@BeArchiTek left a comment

Contributor

Review: CoreFileObject coverage

Reviewed the new rules, eval task, and grader. Overall this is in good shape — the rule prose explains the why (object storage, system-managed metadata, branch isolation) rather than leaning on bare imperatives, the cross-domain examples table keeps it from overfitting to one use case, and the antipattern section ties each failure mode to an observable symptom.

Verified locally

  • Grader discriminates correctly. Ran check_file_object.py against 1 compliant + 4 violating fixtures (redeclared reserved attr / CoreFileObject on a generic / Text-URL bypass / missing inheritance). Compliant scores 1.0; each violation fails exactly the intended sub-check, nothing else.
  • JSON projection in sync: scripts/sync-evals.py produces zero drift against eval.yaml.
  • Cross-references resolve: relationship-component-parent.md and artifact-target-inheritance.md both exist.
  • Section prefixes valid: extension- (schemas) and schema- (auditing) are registered in each _sections.md.
  • Factual backbone confirmed against Infrahub source. Checked backend/infrahub/core/protocols.py and docs/schema/file-object.mdx: the five reserved attributes (file_name, file_size, file_type, checksum, storage_id), file_type = MIME, checksum = SHA-1, and read-only — all exact. This matters because the grader hard-codes those names, so a wrong attribute would mis-flag.

Recommended follow-ups (non-blocking)

  1. No baseline run documented. The grader proves the structure is checkable, but there's no evidence the model actually fails the network-diagram-file-object task without the rule present. If a model already reaches for CoreFileObject unaided, the 212-line rule isn't earning its context cost. A single smoke run with vs. without the rule would confirm the lift.

  2. Discoverability — skill description not updated. managing-schemas/SKILL.md frontmatter gained no triggers for this capability. A request like "how do I store an uploaded PDF / attachment / file in Infrahub" may not surface the schema guidance. Worth adding file objects / attachments / uploads to the description triggers.

Minor (no action needed)

  • The bypass heuristic uses generic names (url, path, location, filename). The schema-side grader is safely narrow — it only fires on CoreFileObject heirs and excludes file_name to avoid double-counting with the reserved-attr check. The audit-rule version is intentionally broader/advisory and self-acknowledges legitimate external-pointer cases, so expect some false positives if it's ever automated.
  • The SKILL.md quick-table row and the prose bullet restate the same fact, but they serve different scan paths — fine to leave.
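
To illustrate the false-positive concern in the bypass-heuristic note above, here is a small sketch contrasting a narrow check (CoreFileObject heirs only) with a broad advisory one (any node). The function names, the `Vendor` node, and its `portal_url` attribute are hypothetical, and the underscore-token matching is an assumption about how the heuristic might be implemented.

```python
BYPASS_TOKENS = {"url", "path", "location", "filename"}

def suspicious_text_attrs(node):
    """Text attributes whose name contains a pointer-like token; file_name is excluded."""
    return [
        a["name"]
        for a in node.get("attributes", [])
        if a.get("kind") == "Text"
        and a["name"] != "file_name"
        and BYPASS_TOKENS & set(a["name"].split("_"))
    ]

def narrow_flags(node):
    """Schema-side style: inspect only nodes that inherit from CoreFileObject."""
    if "CoreFileObject" not in node.get("inherit_from", []):
        return []
    return suspicious_text_attrs(node)

def broad_flags(node):
    """Audit-side style: inspect every node, accepting some noise."""
    return suspicious_text_attrs(node)

# A legitimate external pointer on a node that is not file-bearing.
vendor = {"name": "Vendor", "attributes": [{"name": "portal_url", "kind": "Text"}]}
print(narrow_flags(vendor))  # []
print(broad_flags(vendor))   # ['portal_url']
```

The broad variant flags the external pointer, which is the kind of case a human auditor would dismiss but an automated run would report.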
