coasys/git-expression-language

Git Expression Language for AD4M

A global AD4M Expression Language that resolves the canonical git+https:// URI scheme to file content or directory listings from any reachable Git host.

Built with the modern ALDK (@coasys/ad4m-ldk) pattern and scaffolded from ad4m-expression-language-template.

Status: v0.1 — full initial release. 96/96 tests green; typecheck clean; bundle ~45 KB.


What this is

Any AD4M agent that installs this Language can resolve URIs like:

git+https://github.com/coasys/we-schemas.git#main:schemas/community-home.json
git+https://github.com/coasys/we-schemas.git#v1.2.0:schemas/community-home.json
git+https://github.com/coasys/we-schemas.git#a1b2c3d4…:schemas/community-home.json
git+https://github.com/coasys/we-schemas.git#pull/123:schemas/community-home.json
git+https://github.com/coasys/we-schemas.git#main:schemas/
git+https://github.com/coasys/we-schemas.git?lines=10-80#main:schemas/community-home.json
git+https://github.com/coasys/we-schemas.git?jsonpath=%24.title#v1.2.0:schemas/community-home.json

to an Expression<T> containing the file content (or, for trailing-slash paths, the directory listing).

The scheme is the canonical npm / Cargo / pip form for addressing Git remotes — not invented for AD4M. Fragment carries <ref>:<subject>; query parameters carry transforms applied after blob fetch.

See the full URI grammar in the proposal.
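
As a rough illustration of the fragment and query split described above, the sketch below separates a URI into remote, ref, subject path, and transform parameters. The names `parseGitUri` and `ParsedGitUri` are hypothetical and are not taken from src/uri.ts; the real parser covers more of the grammar (PR suffixes, validation, round-trip serialisation).

```typescript
// Hypothetical sketch; names and result shape are assumptions, not the API of src/uri.ts.
interface ParsedGitUri {
  remote: string;                 // https URL of the repository
  ref: string;                    // branch, tag, SHA, or pull/<n>
  path: string;                   // subject path inside the repository
  isTree: boolean;                // true when the subject ends with "/"
  params: Record<string, string>; // decoded query parameters (transforms)
}

function parseGitUri(uri: string): ParsedGitUri {
  if (!uri.startsWith("git+https://")) {
    throw new Error("this sketch only handles git+https:// URIs");
  }
  // Drop the "git+" prefix so the standard URL parser can split the rest.
  const url = new URL(uri.slice("git+".length));
  const fragment = decodeURIComponent(url.hash.slice(1));
  // The first colon separates <ref> from <subject>.
  const colon = fragment.indexOf(":");
  if (colon < 0) {
    throw new Error("fragment must have the form <ref>:<subject>");
  }
  const path = fragment.slice(colon + 1);
  const params: Record<string, string> = {};
  url.searchParams.forEach((value, key) => {
    params[key] = value;
  });
  return {
    remote: url.origin + url.pathname,
    ref: fragment.slice(0, colon),
    path,
    isTree: path.endsWith("/"),
    params,
  };
}
```

Because the split happens at the first colon, a ref such as pull/123 stays intact and everything after the colon is treated as the subject path.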


What v0.1 includes

A complete release covering the four originally-staged phases in one shot.

  • Canonical URI parser — full grammar: refs (SHA / branch / tag / pull/<n>[/head|merge]), subjects (file blob, tree listing), query params (lines, bytes, jsonpath, fields, format, recursive). Round-trip serialiser for cache-key normalisation.
  • GitHub provider — refs, tags, PRs, trees, blobs, default branch via the REST API.
  • GitLab provider — refs, tags, MRs (mapped to "pull"), trees, blobs via /api/v4.
  • Gitea provider — refs, tags, PRs, trees, blobs via /api/v1 (self-hosted hosts allowlisted via GITEA_HOSTS_CSV).
  • Raw-HTTP fallback — opt-in via ENABLE_RAW_HTTP_FALLBACK. Best-effort https://<host>/<o>/<r>/raw/<ref>/<path> for unrecognised hosts. Refs are passed as-is (no SHA resolution); tree listings and PR refs throw.
  • Tiered LRU + TTL cache — separate caches for blobs (permanent until LRU eviction), trees (permanent until LRU eviction), branch refs (60 s default), tag refs (24 h default), and default branches (1 h default). Entry limits and TTLs are configurable via template variables.
  • Request deduplication — concurrent identical fetches share one in-flight promise. Stampede protection on hot URIs.
  • Conditional ref polls — If-None-Match with cached ETags. 304s don't refresh the SHA but do extend the TTL.
  • Sub-file ranges — ?lines=<from>-<to> (1-indexed, inclusive) and ?bytes=<from>-<to> (0-indexed, inclusive). Both clamp out-of-bounds.
  • JSONPath transforms — ?jsonpath=$.foo[*].bar. Subset: root, field access, index, wildcard, recursive descent.
  • Field picker — ?fields=name,version for shallow JSON projection.
  • Format hints — ?format=raw|stripped|minified. stripped removes // + /* */ comments and trailing commas.
  • Tree listings — trailing-slash subject path returns directory contents. ?recursive=1 for the whole tree.
  • PR refs — pull/<n> (head) and pull/<n>/merge resolve to PR commits via the provider's PR endpoint.
  • Per-host auth — AUTH_TOKENS_JSON template variable carries a host→token map. Anonymous when absent.
  • isImmutable — returns true for SHA-pinned URIs; false otherwise. Enables downstream caches to pin content.
  • Comprehensive tests — 96 across URI parsing, fragment, transforms, cache + dedup, providers (mock transport), end-to-end resolution.
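
The sub-file range rule in the list above (lines are 1-indexed and inclusive, bytes are 0-indexed and inclusive, both clamp out-of-bounds) can be sketched as follows. The helper names `sliceLines` and `sliceBytes` are hypothetical; the actual logic lives in src/fragment.ts and may differ, for example in how it treats a trailing newline.

```typescript
// Hypothetical helpers illustrating the range rules; not the code in src/fragment.ts.

// Lines are 1-indexed and inclusive. Out-of-bounds values are clamped.
// Assumption: the text is split on "\n", so a trailing newline yields a final empty line.
function sliceLines(text: string, from: number, to: number): string {
  const lines = text.split("\n");
  const start = Math.max(1, from);
  const end = Math.min(lines.length, to);
  if (start > end) return "";
  return lines.slice(start - 1, end).join("\n");
}

// Bytes are 0-indexed and inclusive. Out-of-bounds values are clamped.
function sliceBytes(data: Uint8Array, from: number, to: number): Uint8Array {
  const start = Math.max(0, from);
  const end = Math.min(data.length - 1, to);
  if (start > end) return new Uint8Array(0);
  // subarray takes an exclusive end index, hence end + 1.
  return data.subarray(start, end + 1);
}
```

Under these assumptions, ?lines=2-3 on a four-line file returns lines two and three, and a range that starts past the end of the content yields an empty result rather than an error.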

Proposed for the future

  • Radicle provider — depends on the Radicle Service Language in the git-link-language successor spec.
  • git+ssh:// and git+http:// schemes — reserved in the grammar; v0.1 implements only git+https://. SSH requires a host SSH primitive (currently unavailable in the executor).
  • Submodule traversal — follow commit-typed tree entries into the referenced repo. Niche; off by default behind a flag.
  • Symlink following — currently returns the symlink target string. Optional opt-in to resolve transitively (mode 120000).
  • Per-org token maps — extend AUTH_TOKENS_JSON to nested { "github.com": { "myorg": "ghp_…", "*": "default" } } for users with multiple PATs per host.
  • Richer JSONPath — filter expressions, slices, multi-key selectors (RFC 9535 full grammar).
  • Streaming for very large blobs — bytes= ranges currently fetch the whole blob first. A streaming path would matter only for binary blobs > MBs.
  • expression.create for short-lived gist-like content — currently unsupported; v0.1 considers git+https:// addresses editorial. A future variant could integrate with the Gist API or repo contents endpoint for create-via-PR flows.
  • Inline image preview / MIME detection — return decoded binary as data URLs when consumers ask for them.
  • Provider health surface — expose rate-limit remaining quotas via a custom query / signal.

Architecture

index.ts
  ↓
defineLanguage({ expression: { get, isImmutable, icon, … } })
  ↓
GitResolver (src/resolve.ts)
  ├── parse URI         (src/uri.ts)         — canonical git+https grammar
  ├── ref → SHA         (provider)           — cached + ETag-conditional
  ├── tree at SHA       (provider)           — cached
  ├── blob fetch        (provider)           — cached
  ├── fragment slice    (src/fragment.ts)    — lines / bytes
  └── transforms        (src/transforms/)    — jsonpath / fields / format
  ↓
Expression<T>

Providers (src/providers/):
  github.ts   — github.com
  gitlab.ts   — gitlab.com + GITLAB_HOSTS_CSV
  gitea.ts    — GITEA_HOSTS_CSV
  raw-http.ts — fallback, ENABLE_RAW_HTTP_FALLBACK
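
The cached steps in the pipeline above rely on a cache that combines a TTL with LRU eviction. The sketch below shows one common way to build that on top of a Map. The class name `SketchTtlLruCache` and its constructor parameters are hypothetical; the real TtlLruCache in src/cache.ts may be structured differently.

```typescript
// Hypothetical sketch of a TTL + LRU cache; not the TtlLruCache in src/cache.ts.
class SketchTtlLruCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number; // Infinity means "keep until evicted"
  private readonly now: () => number; // injectable clock, useful in tests

  constructor(maxEntries: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so the Map's insertion order doubles as recency order.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

In this sketch a blob or tree tier would pass Infinity as the TTL so that entries leave only through LRU eviction, while a ref tier would pass a finite TTL such as 60000.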

Trust model

Expressions are not AD4M-signed. proof.signature is empty. The trust model is:

  1. The URI is the assertion — readers who trust github.com/coasys/we-schemas trust whoever has write access.
  2. SHA-pinned URIs are content-addressed by Git — the provider returns either the exact content or fails.
  3. Mutable references (branch, tag, PR) can change. Tags are by convention immutable; branches and PRs move by design.
  4. Higher-level Languages can wrap git+https:// URIs in signed link expressions when it matters who asserted that a URI is interesting.
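
The distinction between SHA-pinned and mutable references is what isImmutable reports. A minimal sketch of such a check follows. The function name `isShaPinned` is hypothetical, and the rule that only a full 40-hex or 64-hex object id counts as pinned is an assumption; how v0.1 treats abbreviated SHAs is not stated here.

```typescript
// Assumption: a ref counts as SHA-pinned only when it is a full 40-hex (SHA-1)
// or 64-hex (SHA-256) object id. This is a sketch, not the project's isImmutable.
const FULL_SHA = /^(?:[0-9a-f]{40}|[0-9a-f]{64})$/i;

function isShaPinned(uri: string): boolean {
  const hash = uri.indexOf("#");
  if (hash < 0) return false; // no fragment, so no explicit ref
  const fragment = uri.slice(hash + 1);
  const colon = fragment.indexOf(":");
  const ref = colon < 0 ? fragment : fragment.slice(0, colon);
  return FULL_SHA.test(ref);
}
```

A purely syntactic check like this cannot tell a SHA from a branch that happens to be named with 40 hex characters, so a real implementation would likely confirm the ref type with the provider.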

Template variables

//!@ad4m-template-variable
const AUTH_TOKENS_JSON = "{}";                // {"github.com":"ghp_…","gitlab.com":"glpat_…"}

//!@ad4m-template-variable
const BRANCH_REF_TTL_MS = "60000";

//!@ad4m-template-variable
const TAG_REF_TTL_MS = "86400000";

//!@ad4m-template-variable
const DEFAULT_BRANCH_TTL_MS = "3600000";

//!@ad4m-template-variable
const BLOB_CACHE_MAX_ENTRIES = "1024";

//!@ad4m-template-variable
const TREE_CACHE_MAX_ENTRIES = "256";

//!@ad4m-template-variable
const REF_CACHE_MAX_ENTRIES = "256";

//!@ad4m-template-variable
const ENABLE_RAW_HTTP_FALLBACK = "false";

//!@ad4m-template-variable
const GITEA_HOSTS_CSV = "";

//!@ad4m-template-variable
const GITLAB_HOSTS_CSV = "";
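
Every template variable above is a string, so the Language has to decode them before use. The sketch below shows one way that decoding might look. The helper names `parseAuthTokens`, `parseHostsCsv`, and `parseMs` are hypothetical, and falling back to anonymous access on malformed JSON is an assumption rather than documented behaviour of src/auth.ts.

```typescript
// Hypothetical helpers; src/auth.ts and the real config handling may differ.

// Decode a host -> token map. Assumption: malformed input means "no tokens".
function parseAuthTokens(json: string): Map<string, string> {
  const tokens = new Map<string, string>();
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return tokens;
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return tokens;
  }
  for (const [host, token] of Object.entries(parsed as Record<string, unknown>)) {
    // Keep only non-empty string tokens; host names are compared case-insensitively.
    if (typeof token === "string" && token.length > 0) {
      tokens.set(host.toLowerCase(), token);
    }
  }
  return tokens;
}

// Decode a comma-separated host list, ignoring blanks and surrounding whitespace.
function parseHostsCsv(csv: string): string[] {
  return csv
    .split(",")
    .map((host) => host.trim().toLowerCase())
    .filter((host) => host.length > 0);
}

// Decode a millisecond value, using the fallback for empty or invalid input.
function parseMs(value: string, fallback: number): number {
  if (value.trim() === "") return fallback;
  const n = Number(value);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}
```

With the defaults shown above, parseAuthTokens("{}") yields an empty map (anonymous access) and parseHostsCsv("") yields an empty allowlist.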

Build, test, type-check

NODE_ENV=development pnpm install
pnpm test       # 96 tests across 28 suites
pnpm typecheck  # tsc --noEmit
pnpm build      # → build/bundle.js (~45 KB)

Project structure

├── index.ts                            # defineLanguage entry
├── esbuild.ts                          # Build script
├── package.json
├── tsconfig.json
├── src/
│   ├── types.ts                        # Expression, BlobMeta, TreeMeta, TreeEntry
│   ├── adapters.ts                     # Transport / Storage / Runtime / Signing
│   ├── adapters-deno.ts                # Concrete adapters wrapping ad4m:host
│   ├── auth.ts                         # AUTH_TOKENS_JSON parser
│   ├── cache.ts                        # TtlLruCache (used for all four tiers)
│   ├── dedup.ts                        # In-flight request coalescer
│   ├── uri.ts                          # Canonical Git URI parser + serialiser
│   ├── provider.ts                     # GitProvider interface + registry
│   ├── providers/github.ts             # GitHub REST API client
│   ├── providers/gitlab.ts             # GitLab REST API client
│   ├── providers/gitea.ts              # Gitea REST API client
│   ├── providers/raw-http.ts           # Generic raw-HTTP fallback
│   ├── fragment.ts                     # Line / byte range slicing
│   ├── transforms/index.ts             # Transform pipeline
│   ├── transforms/jsonpath.ts          # JSONPath subset
│   ├── transforms/fields.ts            # Field picker
│   ├── transforms/format.ts            # raw / stripped / minified
│   └── resolve.ts                      # End-to-end resolver
└── tests/
    ├── uri.test.ts
    ├── auth.test.ts
    ├── cache.test.ts
    ├── fragment.test.ts
    ├── transforms.test.ts
    ├── providers-github.test.ts
    └── resolve.test.ts
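
The tree above lists src/dedup.ts as an in-flight request coalescer. A minimal sketch of that technique is shown below: concurrent callers asking for the same key share one promise, and the entry is dropped once the promise settles. The class name `InFlight` and its `run` method are hypothetical and are not taken from the repository.

```typescript
// Hypothetical sketch of an in-flight coalescer; not the code in src/dedup.ts.
class InFlight<V> {
  private readonly pending = new Map<string, Promise<V>>();

  run(key: string, fetcher: () => Promise<V>): Promise<V> {
    const existing = this.pending.get(key);
    if (existing) return existing; // join the request already in flight

    // Forget the entry once the request settles, whether it succeeded or failed,
    // so that later calls trigger a fresh fetch.
    const promise = fetcher().finally(() => {
      this.pending.delete(key);
    });
    this.pending.set(key, promise);
    return promise;
  }
}
```

Coalescing only covers requests that overlap in time; results that should outlive the request belong in the cache tiers instead.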

License

Cryptographic Autonomy License v1.0 (CAL-1.0)
