Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage

Overview

This directory contains demo materials for our recent work on cognitive collusion attacks against LLM-based agents. We demonstrate how coordinated agents can manipulate victim beliefs using only truthful evidence fragments through strategic narrative construction.

Repository Contents

.
├── README.md
├── dataset/
│   └── CoPHEME/              # Complete CoPHEME dataset
├── Montage/
│   └── attack_plans/         # Pre-generated attack sequences using GPT-4.1-mini
│       ├── charlie_hebdo/
│       ├── sydney_siege/
│       ├── ferguson/
│       ├── ottawa_shooting/
│       ├── germanwings/
│       └── putin_missing/
|   └──src/ Demo code for our framework
├── requirements.txt          # Python dependencies
└── environment.yml           # Conda environment file

Quick Start

Installation

Option 1: Using pip

pip install -r requirements.txt

Option 2: Using conda

conda env create -f environment.yml
conda activate colusive_agent

Running the Demo

./run_demo.sh

Example output:

[Event: Charlie Hebdo]
Target Hypothesis: Ahmed Merabet was the first victim
Attack Success Rate: 81.7%
Average Confidence: 0.83
High-Confidence ASR: 67.4%

Dataset: CoPHEME

The CoPHEME dataset extends PHEME for cognitive collusion research:

6 rumor events: Charlie Hebdo, Sydney Siege, Ferguson, Ottawa Shooting, Germanwings, Putin Missing
evidence fragments: Verified truthful tweets satisfying Local Truth (LT=1)
target fabrications: High-impact false narratives for attack simulation

See CoPHEME/ for complete statistics.

Pre-computed Attack Plans

To facilitate reproducibility without requiring extensive API calls, we provide pre-generated attack sequences produced by our Generative Montage framework using GPT-4.1-mini:

Writer-Editor-Director optimized narratives
Validated montage sequences (passed Director acceptance threshold τ=7.0)
Ready-to-use for easy victim evaluation across different LLM families

API Configuration

Set your API keys as environment variables:

# OpenAI
export OPENAI_API_KEY="your-key-here"

# Anthropic
export ANTHROPIC_API_KEY="your-key-here"

# For open-weights models (if using hosted APIs)
export HUGGINGFACE_API_KEY="your-key-here"

Full code release: The complete implementation and dataset will be released upon paper acceptance.

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
CoPHEME		CoPHEME
attack_plan/GPT4.1		attack_plan/GPT4.1
configs		configs
eval		eval
src		src
README.md		README.md
complete_downstream_system.py		complete_downstream_system.py
demo.py		demo.py
environment.yml		environment.yml
generate_montage_plan.py		generate_montage_plan.py
requirements.txt		requirements.txt
run_demo.sh		run_demo.sh
tools.py		tools.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage

Overview

Repository Contents

Quick Start

Installation

Running the Demo

Dataset: CoPHEME

Pre-computed Attack Plans

API Configuration

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage

Overview

Repository Contents

Quick Start

Installation

Running the Demo

Dataset: CoPHEME

Pre-computed Attack Plans

API Configuration

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages