Auto-Research Template

A PyTorch Lightning research template for human-AI collaborative research — Claude handles literature search, hypothesis generation, and experiment scaffolding; you direct the science.

Paradigm	`trainer_type`	`model_type` options
Language Model	`lm`	`causal`, `multimodal` (prefix / cross_attn / adaLN)
Generative	`generative`	`vae`, `ddpm`, `flow_matching`, `drift`
Representation	`representation`	`contrastive`, `byol`, `mae`

How It Works

Research here is a loop, not a pipeline. Each cycle moves from rough idea → structured hypotheses → running experiment → new questions.

┌─────────────────────────────────────────────────────────────────┐
│                     RESEARCH LOOP                               │
│                                                                 │
│   You have an idea                                              │
│        │                                                        │
│        ▼                                                        │
│   /discuss <idea>          ← Claude + Semantic Scholar          │
│        │                                                        │
│        ├─ Phase 1: Claude restates your idea → you correct it   │
│        ├─ Phase 2: 5 hypotheses generated → you filter them     │
│        ├─ Phase 3: Claude enriches selected → related work,     │
│        │           abstract, experiments, risk factors          │
│        └─ Phase 4: saved to .research/discussions/              │
│        │                                                        │
│        ▼                                                        │
│   Pick a hypothesis                                             │
│        │                                                        │
│        ▼                                                        │
│   git checkout -b experiment/<slug>                             │
│   python train.py --config config/...                           │
│   tensorboard --logdir ./exp/                                   │
│        │                                                        │
│        ▼                                                        │
│   Review results → /discuss again with what you learned  ◄──┐  │
│        │                                                     │  │
│        └─────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

You bring	Claude brings
Research intuition and domain taste	Literature search (Semantic Scholar)
Judgment on what to pursue	Structured hypothesis generation
Experiment decisions	Config scaffolding and code changes
Interpretation of results	Metric summaries and TensorBoard links

Part 1 — Setup

1.1 Install

git clone <this-repo> && cd auto-research-template
uv venv .venv && source .venv/bin/activate
uv pip install -e .

Requires Python ≥ 3.10 and at least one CUDA GPU. Replace uv with pip if preferred.

1.2 API Keys

You need at least Anthropic to run /discuss. The rest are optional.

Anthropic (required for `/discuss`)

Go to console.anthropic.com → API Keys → Create Key
Copy the key (starts with sk-ant-)

export ANTHROPIC_API_KEY="sk-ant-..."

Semantic Scholar (optional — raises `/discuss` rate limit)

Without a key: 100 requests / 5 minutes (usually enough for one /discuss session). With a key: 1 request / second.

Go to api.semanticscholar.org → Get API Key
Fill out the form (instant approval for academic use)

export SEMANTIC_SCHOLAR_API_KEY="your-key-here"

Slack (optional — epoch notifications)

Go to api.slack.com/apps → Create New App → From scratch
Incoming Webhooks → toggle on → Add New Webhook to Workspace → pick a channel
Copy the webhook URL

export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/T.../B.../xxx"

After each validation epoch, SlackNotificationCallback posts val_loss, train_loss, and a generated sample to that channel.

Obsidian (optional — experiment notes)

Writes .md notes to a git repo that syncs with Obsidian via the Obsidian Git plugin.

git clone https://github.com/<you>/obsidian-sync /workspace/obsidian-sync
export OBSIDIAN_VAULT_PATH="/workspace/obsidian-sync"

ObsidianLogCallback commits and pushes experiments/{model_name}.md with metrics and an ASCII loss curve after every validation epoch.

All at once

Edit scripts/setup_env.sh with your keys, then:

source scripts/setup_env.sh

1.3 Start Claude Code

claude

Permissions, hooks, and slash commands are pre-configured in .claude/settings.json.

Part 2 — Your First Research Cycle (end to end)

This walks through one full loop from idea to running experiment. Follow this the first time.

Step 1: Discuss your idea

/discuss flow matching for protein structure prediction

Claude runs a 4-phase pipeline. It pauses twice for your input before spending compute.

Phase 1 — Understand & Restate
  Claude states its understanding of your idea.
  ──────────────────────────────────────────────────────
  ★ YOU REVIEW: Is the domain and direction right?
    Correct it here — this is cheap to fix.
  ──────────────────────────────────────────────────────
  Reply: "yes" or describe corrections.

Phase 2 — Divergent Ideation  (5 parallel sub-agents)
  ┌─ Agent 1: Methodological angle
  ├─ Agent 2: Application transfer angle
  ├─ Agent 3: Efficiency & scalability angle
  ├─ Agent 4: Theoretical angle
  └─ Agent 5: Data & training strategy angle
  ──────────────────────────────────────────────────────
  ★ YOU REVIEW: Claude shows 5 short hypotheses.
    Drop any that are out of scope or over budget.
    Each dropped hypothesis = 5 fewer API calls.
  ──────────────────────────────────────────────────────
  Reply: "1, 3, 4" or "all"

Phase 3 — Enrichment  (per selected hypothesis)
  Runs semantic_scholar.py, fills related work,
  abstract, experiments, risk factors.

Phase 4 — Save
  .research/discussions/YYYY-MM-DD-<slug>.json
  Auto-staged by git hook.

Example correction at Phase 1:

Claude: "Domain: protein backbone folding, direction: flow matching..."
You:    "Close — target side-chain packing, not backbone."

Example filtering at Phase 2:

Claude: shows 5 hypotheses
You:    "Agent 3 (efficiency) is out of scope. Enrich 1, 2, 4."

Step 2: Pick a hypothesis and branch

After Phase 3, Claude asks which hypothesis to turn into an experiment. It will create the branch and copy the config.

# Claude does this, or you can do it manually:
git checkout -b experiment/flow-matching-protein
cp config/generative/flow_matching.yaml config/generative/protein_flow.yaml

Commit the discussion brief on the branch so the scientific rationale stays with the code:

git add .research/discussions/2026-05-02-flow-matching-protein.json
git commit -m "discuss: flow matching for protein side-chain packing"

Step 3: Train

Edit the config you copied (config/generative/protein_flow.yaml) to set model_name, data_dir, and any hyperparams.

python train.py --config config/generative/protein_flow.yaml
tensorboard --logdir ./exp/

Check progress:

/status    ← current training metrics
/results   ← latest val_loss and experiment list

Step 4: Interpret and iterate

When training finishes (or when you have early results), come back to /discuss with what you learned:

/discuss flow matching for protein side-chain packing —
  our OT-CFM baseline hits 1.2Å RMSD on CASP14 but struggles
  on long loops. ideas for improving loop modeling?

This starts a new cycle. The new brief gets saved alongside the old one in .research/discussions/, building a record of your research thread.

Part 3 — Running Hypotheses in Parallel

If you have multiple hypotheses worth testing at once, use git worktree to run them side by side without switching branches.

# Hypothesis 1 is already in the main working directory
# Add a worktree for hypothesis 2
git worktree add ../exp-masked-spectrogram experiment/masked-spectrogram-ssl

# GPU 0: hypothesis 1
python train.py --config config/representation/contrastive.yaml --gpu 0 &

# GPU 1: hypothesis 2 (in the worktree)
cd ../exp-masked-spectrogram
python train.py --config config/representation/byol.yaml --gpu 1 &

# Compare both in TensorBoard
tensorboard --logdir_spec \
  h1:./exp/representation/contrastive,\
  h2:../exp-masked-spectrogram/exp/representation/byol

When results are in:

# Merge what works
git checkout main
git merge experiment/masked-spectrogram-ssl

# Delete what doesn't
git branch -d experiment/contrastive-baseline
git worktree remove ../exp-masked-spectrogram

Part 4 — Training Reference

# Language models
python train.py --config config/lm/default.yaml

# Generative
python train.py --config config/generative/ddpm.yaml
python train.py --config config/generative/flow_matching.yaml
python train.py --config config/generative/drift.yaml
python train.py --config config/generative/vae.yaml

# Representation
python train.py --config config/representation/contrastive.yaml
python train.py --config config/representation/byol.yaml

# Options
python train.py --config <path> --gpu 0
python train.py --config <path> --resume ./exp/<project>/<name>/last.ckpt
tensorboard --logdir ./exp/

Config schema

project: "generative"
model_name: my_exp
steps: 100000
logdir: ./exp
expdir: ${logdir}/${project}/${model_name}

model:
  target: src.trainers.generative_trainer.GenerativeTrainer
  params:
    cfg:
      trainer_type: generative
      model_type: flow_matching

data:
  target: src.data.datamodule.ResearchDataModule
  params:
    cfg:
      data_dir: ./data
      batch_size: 16

lightning:
  slack_notify_every_n_epochs: 1
  obsidian_log_every_n_epochs: 1
  trainer:
    accelerator: gpu
    strategy: ddp
    max_steps: ${steps}

Part 5 — Extending the Template

New experiment

Copy the nearest config and adjust model_name and hyperparams — no code changes needed.

cp config/generative/ddpm.yaml config/generative/my_exp.yaml
# edit model_name, model_type, data_dir, hyperparams
python train.py --config config/generative/my_exp.yaml

New trainer type

vi src/trainers/my_trainer.py        # subclass BaseTrainer
vi src/trainers/__init__.py          # register in dispatch dict
cp config/generative/ddpm.yaml config/my_paradigm/default.yaml
# set target: src.trainers.my_trainer.MyTrainer
python train.py --config config/my_paradigm/default.yaml

Custom dataset

Subclass ResearchDataModule and override _build_datasets() in src/data/.

Slash Commands

Command	What it does
`/discuss <idea>`	Research ideation — 5 hypotheses, Semantic Scholar enrichment, saved to `.research/discussions/`
`/train`	Start a training run (prompts for config)
`/status`	Check current training metrics
`/results`	Show latest experiment results and discussion briefs

Project Layout

auto-research-template/
├── train.py
├── config/
│   ├── lm/default.yaml
│   ├── generative/{ddpm,vae,flow_matching,drift}.yaml
│   └── representation/{contrastive,byol}.yaml
├── src/
│   ├── trainers/          # base + lm + generative + representation
│   ├── data/              # JsonlDataset, ImageFolderDataset, ResearchDataModule
│   ├── callbacks/         # SlackNotificationCallback, ObsidianLogCallback
│   └── utils/             # instantiate_from_config, LinearWarmupCosineDecay
├── scripts/
│   ├── setup_env.sh
│   └── semantic_scholar.py
├── .research/
│   └── discussions/       # briefs saved by /discuss (auto-staged by hook)
└── .claude/
    ├── settings.json      # permissions + hooks
    ├── commands/
    │   └── discuss.md     # /discuss orchestration
    └── hooks/
        └── on_write.sh    # auto-stages discussion files after Write

Environment Variables

Variable	Required	Description
`ANTHROPIC_API_KEY`	Required for `/discuss`	Claude API access
`SEMANTIC_SCHOLAR_API_KEY`	Optional	1 req/sec instead of 100 req/5min
`SLACK_WEBHOOK_URL`	Optional	Epoch notifications
`OBSIDIAN_VAULT_PATH`	Optional	Auto-sync experiment notes
`CUDA_VISIBLE_DEVICES`	Optional	GPU selection (or use `--gpu`)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.claude		.claude
.research/discussions		.research/discussions
config		config
research		research
scripts		scripts
src		src
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Skills.md		Skills.md
pyproject.toml		pyproject.toml
readme.md		readme.md
sweep.config.yaml		sweep.config.yaml
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Auto-Research Template

How It Works

Part 1 — Setup

1.1 Install

1.2 API Keys

Anthropic (required for `/discuss`)

Semantic Scholar (optional — raises `/discuss` rate limit)

Slack (optional — epoch notifications)

Obsidian (optional — experiment notes)

All at once

1.3 Start Claude Code

Part 2 — Your First Research Cycle (end to end)

Step 1: Discuss your idea

Step 2: Pick a hypothesis and branch

Step 3: Train

Step 4: Interpret and iterate

Part 3 — Running Hypotheses in Parallel

Part 4 — Training Reference

Config schema

Part 5 — Extending the Template

New experiment

New trainer type

Custom dataset

Slash Commands

Project Layout

Environment Variables

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Auto-Research Template

How It Works

Part 1 — Setup

1.1 Install

1.2 API Keys

Anthropic (required for /discuss)

Semantic Scholar (optional — raises /discuss rate limit)

Slack (optional — epoch notifications)

Obsidian (optional — experiment notes)

All at once

1.3 Start Claude Code

Part 2 — Your First Research Cycle (end to end)

Step 1: Discuss your idea

Step 2: Pick a hypothesis and branch

Step 3: Train

Step 4: Interpret and iterate

Part 3 — Running Hypotheses in Parallel

Part 4 — Training Reference

Config schema

Part 5 — Extending the Template

New experiment

New trainer type

Custom dataset

Slash Commands

Project Layout

Environment Variables

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Anthropic (required for `/discuss`)

Semantic Scholar (optional — raises `/discuss` rate limit)

Packages