HolobiomicsLab/asb-skill-collections

ASB Skill Collections

Curated, evidence-grounded skill and software-tool collections for scientific AI agents, generated by the AgenticScienceBuilder (ASB) pipeline. Each skill is distilled from a peer-reviewed method paper and its public code repository, anchored to verbatim evidence, EDAM-annotated, and gated for licensing/PII before release.

This is a community effort, initiated at the Dagstuhl Seminar Computational Metabolomics (26181, 26–30 April 2026).

ASB pipeline: real research papers → ASB → domain benchmark (skills · tools · data) → agents evaluated → community validates, with a domain/community-defined feedback loop

This release, metabolomics-v0.1.0 (preliminary), is collections/metabolomics/v2: 5,865 skills across 909 tools, distilled from 568 papers, for computational metabolomics. Coverage is predominantly LC-MS/MS, but also includes LC-MS, GC-MS, mass-spectrometry imaging, ion mobility and lipidomics, with some NMR and multi-omics / statistics / pathway tools.

Artifacts in this release

| Artifact | Status |
| --- | --- |
| ASB-Skills: evidence-grounded procedural skills | released |
| ASB-Tools: deduplicated software-tool records (EDAM + DOIs) | released |
| ASB-Benchmark: per-paper tasks + claim-retrieval test sets | ⏳ to be released soon |
| ASB-Capsules: raw per-paper ASB pipeline outputs (full traceability) | ⏳ to be released soon |

Only ASB-Skills and ASB-Tools are published now; the benchmark and capsule layers follow (see PROVENANCE.md).

Structure & roadmap

asb-skill-collections is a multi-domain marketplace — it hosts ASB-generated skill collections for any scientific domain, organized by provenance (ASB-generated), not by field, so the repo name and marketplace stay domain-agnostic as new domains are added:

collections/<domain>/<version>/    # full collection per domain   (e.g. metabolomics/v2)
packs/<domain>/<technique>/        # lighter per-technique subsets (e.g. metabolomics/lc-ms)

Each domain ships a full plugin (<domain>) plus per-technique packs (<domain>-<technique>). Metabolomics is the first released collection; proteomics, transcriptomics, epigenomics and further domains follow under the same layout — no rename, just new entries under collections/ + marketplace.json.
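
As a rough illustration of the naming convention above, this Python sketch maps a plugin name to the directory it would correspond to. The function name `plugin_path`, the default version `"v2"`, and the assumption that domain names contain no hyphen are illustrative choices, not part of the repository's tooling.

```python
from pathlib import PurePosixPath


def plugin_path(plugin: str, version: str = "v2") -> PurePosixPath:
    """Map a plugin name to its directory under the documented layout.

    Assumption: a domain name never contains a hyphen, so the first hyphen
    separates <domain> from <technique> (as in "metabolomics-lc-ms").
    """
    domain, _, technique = plugin.partition("-")
    if technique:
        # Per-technique pack: packs/<domain>/<technique>/
        return PurePosixPath("packs") / domain / technique
    # Full collection: collections/<domain>/<version>/
    return PurePosixPath("collections") / domain / version


print(plugin_path("metabolomics"))             # collections/metabolomics/v2
print(plugin_path("metabolomics-ms-imaging"))  # packs/metabolomics/ms-imaging
```

A new domain would only need a new first path component; nothing in the mapping is specific to metabolomics.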


📦 Install

Tip: the fastest path is two lines in Claude Code, installing either the full collection or a lighter per-technique pack.

🚀 Claude Code (native plugin)

/plugin marketplace add HolobiomicsLab/asb-skill-collections
/plugin install metabolomics@asb-skill-collections          # full collection (5,865 skills)

🧩 Lighter per-technique packs — load only what you need

/plugin install metabolomics-lc-ms@asb-skill-collections    # also: gc-ms, nmr, ms-imaging,
                                                            # ion-mobility, ce-ms,
                                                            # direct-infusion, ms-generic

Note: packs overlap (a multi-technique skill appears in several), so install either the full plugin or a few packs, not both. See packs/metabolomics/.

🌐 Web UI — Claude · ChatGPT · Mistral

No CLI: upload the search indexes + the few skills you need as the assistant's knowledge (Claude Projects, ChatGPT Custom GPT/Project, Mistral Agent/Library) and paste a routing instruction — step-by-step in USAGE.md.

🤖 Any other agent / IDE

The collection is plain Markdown + JSON — point your agent at collections/metabolomics/v2/ and read the indexes. See AGENTS.md.

🌍 Install beyond Claude Code

Other agent runtimes have no /plugin install. Use the bundled asbb CLI from a local clone to materialize a pack into the runtime's own location.

git clone https://github.com/HolobiomicsLab/asb-skill-collections.git
cd asb-skill-collections
python3 -m scripts.asbb_cli install --list-runtimes      # see all targets

Skill-native runtimes (read SKILL.md directly):

# Codex + Copilot CLI + Gemini CLI all share ~/.agents/skills — one install:
python3 -m scripts.asbb_cli install metabolomics-lc-ms --runtime agents

# Or a specific home: --runtime codex | copilot | gemini
# Vendor into a project for Claude Code: --runtime claude  (add --user for ~/.claude)

Rules/instruction IDEs (a SKILL.md is rendered into their format — run from the target project):

python3 -m scripts.asbb_cli install metabolomics-lc-ms --runtime cursor          # .cursor/rules/*.mdc
python3 -m scripts.asbb_cli install metabolomics-lc-ms --runtime cline           # .clinerules/*.md
python3 -m scripts.asbb_cli install metabolomics-lc-ms --runtime vscode-copilot  # .github/instructions/*.instructions.md

Anything else (pi, Antigravity, or a runtime without a preset):

python3 -m scripts.asbb_cli install metabolomics-lc-ms --dest ~/some/skills/dir

Skill-native installs symlink by default (a git pull in the clone updates them); add --copy for a self-contained copy. --dry-run previews, --force overwrites unmanaged files, and asbb uninstall <pack> --runtime <id> cleanly removes exactly what was installed (tracked in ~/.asbb/installed.json).

For Claude Code, the plugin marketplace above remains the recommended path.

Use

Search → apply → ground. Find a skill via skills_index.json (by EDAM IRI, tool name, or keyword) or tools_index.json; read its SKILL.md and follow the procedure; then optionally ground it against the source paper/repo to verify a parameter or claim (see the 🔎 Grounding (Perspicacité) section below). Requirements (libraries, per-skill tool deps) are in USAGE.md §0.
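
The search step can be sketched in a few lines of Python. The schema of skills_index.json is not described here, so this sketch avoids relying on field names: it matches a term against every string value in an entry, which covers EDAM IRIs, tool names, and keywords alike. The sample entries and their field names (`slug`, `tools`, `edam`) are hypothetical and only illustrate the matching.

```python
import json
from pathlib import Path


def as_entries(data):
    """Accept either a JSON array of entries or an object keyed by slug."""
    return list(data.values()) if isinstance(data, dict) else list(data)


def load_index(path):
    """Read an index file such as collections/metabolomics/v2/skills_index.json."""
    return as_entries(json.loads(Path(path).read_text(encoding="utf-8")))


def iter_strings(value):
    """Yield every string value nested anywhere inside an entry."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def search(entries, term):
    """Return entries where any string value contains term (case-insensitive)."""
    needle = term.casefold()
    return [e for e in entries
            if any(needle in s.casefold() for s in iter_strings(e))]


# Hypothetical entries: field names and values are illustrative only.
SAMPLE_INDEX = [
    {"slug": "example-peak-picking", "tools": ["XCMS"],
     "edam": ["http://edamontology.org/operation_3215"]},
    {"slug": "example-spectral-matching", "tools": ["matchms"],
     "edam": ["http://edamontology.org/topic_3172"]},
]

print([e["slug"] for e in search(SAMPLE_INDEX, "xcms")])
print([e["slug"] for e in search(SAMPLE_INDEX, "topic_3172")])
```

Once the real field names are known, restricting the match to specific fields would give more precise results than this whole-entry scan.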

🔎 Grounding (Perspicacité)

Skills carry distilled procedure; to verify an exact parameter, threshold, or claim, ground a skill against the paper it was distilled from. Grounding is optional and additive — every skill works without it — and ships inside every plugin and pack.

It's powered by Perspicacité — Holobiomics Lab's local-first scientific literature-RAG engine. Two backends, KB-primary with a serverless fallback:

  • kb (Perspicacité): RAG over the source paper's full text + supplementary information, persistent and citable. The per-paper KB (asb-paper-<doi>) is auto-created and ingested on first use via the MCP tools ensure_kb / ground_paper.
  • local (serverless): no server needed. git clone the skill's source repo, fetch the open-access paper on a best-effort basis, then read the files directly.

Use it

  • In Claude Code: run /ground on the skill in play — it installs + queries the source KB, or falls back to a local clone when no server is running.
  • Anywhere: call the bundled bin/perspicacite_kb_bind.py (prepare / query / local).

Tip: ground before you act on a numeric parameter, threshold, or quantitative claim. It is the difference between "the skill says ~5 ppm" and "the paper specifies 5 ppm."

The kb backend needs a reachable Perspicacité (PERSPICACITE_BASE, default http://127.0.0.1:8000); the local backend needs only git + network. Full guide: USAGE.md §4.
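
The KB-primary, local-fallback choice described above can be sketched as follows. This is not the logic of the bundled binder, whose internals are not shown here; the function names `is_reachable` and `choose_backend` are hypothetical, and only the PERSPICACITE_BASE variable and its default come from this README.

```python
import os
import urllib.error
import urllib.request

DEFAULT_BASE = "http://127.0.0.1:8000"  # default stated in this README


def is_reachable(base_url, timeout=2.0):
    """Return True if anything answers HTTP at base_url."""
    try:
        urllib.request.urlopen(base_url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL.
        return False


def choose_backend(env=None, probe=is_reachable):
    """Pick "kb" when Perspicacité is reachable, otherwise fall back to "local"."""
    env = os.environ if env is None else env
    base = env.get("PERSPICACITE_BASE", DEFAULT_BASE)
    return "kb" if probe(base) else "local"


# Demonstration with a stub probe, so no network access is needed.
print(choose_backend(env={}, probe=lambda base: False))  # local
```

Passing the probe as a parameter keeps the decision testable without a running server; calling `choose_backend()` with no arguments performs the real HTTP check.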

What's in the collection

Metabolomics skill collection schematic: peer-reviewed papers → ASB → EDAM-typed, deduplicated skills routed via Perspicacité (EDAM filter + skill KB, search_skill_kb) to a Mimosa agent, spanning the untargeted LC-MS/MS workflow and wrapping community tools (GNPS, XCMS, matchms, MS-DIAL, MZmine, SIRIUS, spec2vec)

The LC-MS view of the collection (the metabolomics-lc-ms pack): papers → ASB → EDAM-typed skills, routed by Perspicacité. The full release spans 5,865 skills across all techniques.

| File | Contents |
| --- | --- |
| skills/<slug>/SKILL.md | one evidence-grounded skill each (frontmatter: EDAM IRIs, derived_from DOIs, evidence_spans, tools, attribution) |
| tools/<slug>.yaml | deduplicated software-tool records with EDAM + source DOIs |
| skills_index.json / tools_index.json | machine search indexes |
| kb_bundle.json | skill → source-paper KB slugs + repo_urls (grounding map) |
| bin/perspicacite_kb_bind.py · commands/ground.md · GROUNDING.md | packaged grounding: the binder, the /ground command, and a how-to (shipped in every plugin & pack) |
| collection.yaml · corpus.yaml | collection record · per-paper access basis (repo-oa) |
| CITATION.cff · PROVENANCE.md · gate_report.json | citation · how it was generated · release-gate verdict |

The default entry point is skills/_router/SKILL.md.

For a description of the collection's content (technique & EDAM-topic breakdown) and how it was selected (sources, inclusion/exclusion criteria, grounding, gating), see ABOUT.md.
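
Each SKILL.md carries its metadata in frontmatter, so an agent that cannot load a YAML library can still separate that block from the procedure text. The sketch below assumes the frontmatter is delimited by `---` lines, which is the common convention but is not confirmed here; the sample file content is hypothetical, reusing only field names mentioned in this README.

```python
def split_frontmatter(text):
    """Split a SKILL.md into (frontmatter, body).

    Assumption: frontmatter is the block between a leading '---' line and
    the next '---' line. Returns ("", text) when no such block exists.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text


def top_level_keys(frontmatter):
    """List unindented 'key:' names; nested values need a real YAML parser."""
    keys = []
    for line in frontmatter.splitlines():
        if not line or line[0].isspace() or line.startswith(("#", "-")):
            continue
        if ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


# Hypothetical SKILL.md content, for illustration only.
SAMPLE = """\
---
name: example-skill
derived_from:
  - 10.0000/example-doi
tools:
  - example-tool
metadata:
  license_tier: open
---
# Procedure
1. Hypothetical step.
"""

frontmatter, body = split_frontmatter(SAMPLE)
print(top_level_keys(frontmatter))  # ['name', 'derived_from', 'tools', 'metadata']
print(body.splitlines()[0])         # # Procedure
```

For anything beyond listing keys, a full YAML parser (for example `yaml.safe_load` from PyYAML) is the safer choice.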

How it was generated

The exact ASB build command and the mixed-model routing (Opus 4.8 for outline/card-revision, Haiku 4.5 for the rest, OpenAI embeddings) are documented in PROVENANCE.md and recorded per build in build_manifest.json. The raw per-paper ASB capsules and the benchmark layer (full end-to-end traceability) will be released later.

Attribution & citation

If you use this collection, cite both the collection and the original paper behind each skill you use (attribution.original_doi).

  • Collection authors (Zenodo, see CITATION.cff): AgenticScienceBuilder Community, Louis-Félix Nothias, HolobiomicsLab.cnrs.fr, MetaboLinkAI.net.
  • Per-skill roles (in each SKILL.md attribution: block): generator (the ASB pipeline) · curators (who modify/validate — none yet) · promoter (suggests use — Louis-Félix Nothias) · sponsor (paid the API cost — CNRS & Université Côte d'Azur) · original_doi (source paper).
  • Zenodo DOI: 10.5281/zenodo.20794027.

Funding & acknowledgements

This collaborative project was initiated at the Dagstuhl Seminar Computational Metabolomics (26181, 26–30 April 2026), and we thank the seminar's participants and organizers. API generation costs for this collection were sponsored by CNRS and Université Côte d'Azur. Built with the AgenticScienceBuilder pipeline and grounded with Perspicacité (Holobiomics Lab).

📚 Sources & provenance

Skills are distilled from peer-reviewed method papers anchored to the computational metabolomics review series (Misra → Enveda). See governance/SOURCES.md for the source inventory and the scientific inclusion criteria, and governance/CONTENT_POLICY.md for the legal/open-access policy.

Suggest or annotate a paper: see Contributing → Propose or annotate a paper.

License tiers

Every skill carries a license_tier field (in skills_index.json and in each SKILL.md frontmatter, under metadata.license_tier) that answers the question "what may I do with the underlying tool?"

| Tier | Meaning |
| --- | --- |
| open | Commercial use OK (MIT, Apache-2.0, GPL, CC-BY, …) |
| noncommercial | Academic / noncommercial only; confirm permitted use before applying the skill |
| restricted | No clear license detected; verify before commercial use or redistribution |

Discovery defaults to open skills; the asb-metabolomics meta-skill enforces the noncommercial acknowledgment gate. Non-open skills carry a one-line banner in their body. Full policy: governance/LICENSE_TIERS.md.

# list only open-tier skills
jq '[.[] | select(.license_tier=="open")]' collections/metabolomics/v2/skills_index.json
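
Where jq is not available, roughly the same filter can be written in Python. This sketch mirrors the jq expression above: `records` imitates `.[]` by accepting either an array or an object, and `select_tier` keeps entries whose `license_tier` equals the requested tier. The helper names and the `"(missing)"` label are illustrative; only the `license_tier` field and the index path come from this README.

```python
import json
from collections import Counter
from pathlib import Path

INDEX = Path("collections/metabolomics/v2/skills_index.json")


def records(data):
    """Mirror jq's `.[]`: array elements, or the values of an object."""
    return list(data.values()) if isinstance(data, dict) else list(data)


def select_tier(data, tier="open"):
    """Keep only the records whose license_tier equals tier."""
    return [r for r in records(data)
            if isinstance(r, dict) and r.get("license_tier") == tier]


def tier_counts(data):
    """Count records per tier; records without the field are '(missing)'."""
    return Counter(r.get("license_tier", "(missing)")
                   for r in records(data) if isinstance(r, dict))


# Runs only from a local clone of the repository, where the index exists.
if INDEX.exists():
    index = json.loads(INDEX.read_text(encoding="utf-8"))
    print(len(select_tier(index)), "open-tier skills")
    print(dict(tier_counts(index)))
```

The per-tier count is a quick way to see how much of a pack falls outside the open tier before relying on it commercially.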

License

Dual-licensed, by layer (see LICENSING.md):

  • Code (scripts, tooling) — Apache-2.0.
  • Collection content (skill descriptions, tool records, structured metadata) — CC-BY-4.0 (as stamped in every SKILL.md, collection.yaml, CITATION.cff).
  • Verbatim quotations from source papers (evidence_spans) — minimal, attributed, under fair-use / quotation right; a rights holder may request removal.

Maintainers & contributing

Maintained by Holobiomics Lab — see MAINTAINERS.md. Curator workflow: CONTRIBUTING.md; conflict-of-interest policy: COI_POLICY.md. All governance & policy docs now live in governance/.

Other collections

collections/ also contains epigenomics/v1, transcriptomics/v1, and the earlier metabolomics/v1. These are staged/internal and not part of this release — only metabolomics/v2 is published via the plugin.

Status & caveats

  • Zenodo DOI: 10.5281/zenodo.20794027.
  • w3id.org/holobiomicslab/… IRIs — reserved identifiers that do not resolve yet (the redirect is not live); treat as stable names, not links.
  • HuggingFace mirror & leaderboard — planned, not yet live.
  • Benchmark / capsules — to be released later.