tjhavranek/README.md

Tomáš Havránek

Professor of economics at Charles University, Prague. Meta-analyst and applied macroeconomist working on publication bias, p-hacking, and Bayesian model averaging (often with Zuzana Iršová). Affiliated with CEPR and the Stanford METRICS center.

I use this page as a working notebook for a few AI-assisted research-auditing experiments: a small family of tools that bring adversarial review to empirical papers, grant proposals, and referee reports. The newest and most ambitious is CRUCIBLE, a Claude-only workshop that argues a paper down to the individual sentence and then, optionally, rebuilds it. It grew out of a manual, human-in-the-loop audit protocol run across several models (ChatGPT, Gemini, Grok, Claude) and out of that protocol's automation for the Claude Code and Codex command-line tools. The manual protocol is the methodology to cite. The same approach also has a domain application for ERC grant drafts, alongside the meta-analysis work in the labs below.

The tools surface criticism for a human to weigh. They are not a substitute for judgment.

The underlying Duel/MAD protocol has been independently endorsed by Prof. Bob Reed (University of Canterbury / MAER-Net).

Which repo should I use?

| If you want to… | Use | What it is |
| --- | --- | --- |
| Workshop and (optionally) rebuild a whole paper | paper-workshop | CRUCIBLE, the flagship. A Claude-only fleet of rival-tradition referees argues the paper to the sentence (every comment grounded in an exact quote, no acceptance-probability scores), then rebuilds it on request: a tracked redline, your own code re-run so the numbers are real, and a replication package. The newest of these tools and the least battle-tested. Effectiveness isn't measured yet. |
| Run a manual, human-in-the-loop adversarial audit of a high-stakes paper | research-audit-duel-protocol | The protocol CRUCIBLE grew out of: the Duel + MAD workflows across ChatGPT, Gemini, Grok, and Claude, with a worked example. Independently endorsed, and the methodology to cite. |
| Automate that audit across Claude + Codex, without writing code | mad-research | Three Claude Code skills: one-shot Codex calls, collaborative build with cross-review, and a three-stream adversarial audit. The cross-model (Claude + Codex) automation; it produces an audit memo. |
| Triage an ERC Starting/Consolidator draft before peer review | erc-ai-feedback | A prompt + rubric for a structured pre-review against ERC criteria. Doesn't replace human reviewers; it frees workshop time for what AI can't address. |
| Curious what kids can build? | race | A small two-player browser racing game our kids (ages 8–12) made with Claude Code, set in our town of Litomyšl. |

Cite the protocol

Iršová & Havránek (2026), doi:10.5281/zenodo.19105954. The core protocol repos ship a CITATION.cff, so GitHub's "Cite this repository" works there too.

Elsewhere

Getting started: CRUCIBLE and mad-research are Claude Code skills, so they need a Claude subscription (mad-research also calls Codex). research-audit-duel-protocol and erc-ai-feedback are just prompts and documents, with nothing to install. Each repo's README has the steps.

A note on confidentiality: these tools send your text to third-party AI providers. Each repo's README states what goes where, so read it before sending anything embargoed, under double-blind review, or otherwise non-public.

Popular repositories

  1. research-audit-duel-protocol

    Human-in-the-loop adversarial workflows for high-stakes research audit: from ChatGPT-Gemini duels to 4-model MAD.

  2. mad-research

    Three Claude Code skills for working with Codex CLI: codex-bridge (one-shot Codex calls), mad-build (Claude+Codex collaboration with cross-review), and mad-research (three-stream adversarial audit …

  3. race

    🏎️ A two-player browser racing game set in Litomyšl, Czech Republic. Built by kids (ages 8–12) with Claude Code.

  4. erc-ai-feedback

    Prompt + rubric for AI triage of ERC Starting/Consolidator Grant drafts before peer review. Doesn't replace human reviewers; it frees workshop time for what AI can't address.

  5. tjhavranek

    Profile - AI-assisted research-audit tools, meta-analysis, reproducibility

  6. paper-workshop

    An adversarial AI expert workshop that stress-tests a research paper (rival-tradition referees argue; every comment quote-grounded and independently re-verified) and then rebuilds it: tracked-chang…