Skip to content

CharlesJW222/Lying_with_Truth

Repository files navigation

Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage

Overview

This directory contains demo materials for our recent work on cognitive collusion attacks against LLM-based agents. We demonstrate how coordinated agents can manipulate victim beliefs using only truthful evidence fragments through strategic narrative construction.

Repository Contents

.
├── README.md
├── dataset/
│   └── CoPHEME/              # Complete CoPHEME dataset
├── Montage/
│   └── attack_plans/         # Pre-generated attack sequences using GPT-4.1-mini
│       ├── charlie_hebdo/
│       ├── sydney_siege/
│       ├── ferguson/
│       ├── ottawa_shooting/
│       ├── germanwings/
│       └── putin_missing/
|   └──src/ Demo code for our framework
├── requirements.txt          # Python dependencies
└── environment.yml           # Conda environment file

Quick Start

Installation

Option 1: Using pip

pip install -r requirements.txt

Option 2: Using conda

conda env create -f environment.yml
conda activate colusive_agent

Running the Demo

./run_demo.sh

Example output:

[Event: Charlie Hebdo]
Target Hypothesis: Ahmed Merabet was the first victim
Attack Success Rate: 81.7%
Average Confidence: 0.83
High-Confidence ASR: 67.4%

Dataset: CoPHEME

The CoPHEME dataset extends PHEME for cognitive collusion research:

  • 6 rumor events: Charlie Hebdo, Sydney Siege, Ferguson, Ottawa Shooting, Germanwings, Putin Missing
  • evidence fragments: Verified truthful tweets satisfying Local Truth (LT=1)
  • target fabrications: High-impact false narratives for attack simulation

See CoPHEME/ for complete statistics.

Pre-computed Attack Plans

To facilitate reproducibility without requiring extensive API calls, we provide pre-generated attack sequences produced by our Generative Montage framework using GPT-4.1-mini:

  • Writer-Editor-Director optimized narratives
  • Validated montage sequences (passed Director acceptance threshold τ=7.0)
  • Ready-to-use for easy victim evaluation across different LLM families

API Configuration

Set your API keys as environment variables:

# OpenAI
export OPENAI_API_KEY="your-key-here"

# Anthropic
export ANTHROPIC_API_KEY="your-key-here"

# For open-weights models (if using hosted APIs)
export HUGGINGFACE_API_KEY="your-key-here"

Full code release: The complete implementation and dataset will be released upon paper acceptance.

About

Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage (ACL 2026)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors