- Clone: `git clone https://github.com/adamrossnelson/SurveyResponder.git`
- Environment: `python -m venv venv`
- Activate: `.\venv\Scripts\activate` (Windows)
- Install: `pip install -r requirements.txt`
Survey responses using LLMs. For researchers, developers, and psychometricians working on testing, scoring, and metrics evaluation.
SurveyResponder is a Python package and CLI tool that uses Large Language Models (LLMs), such as those accessed through Ollama (ollama.com), to generate synthetic responses to survey instruments.
Useful for:
- Testing and validating Likert-scale or multiple-choice instruments.
- Simulating responses across different personas.
- Exploring LLM behavior when prompted with surveys.
- Creating synthetic datasets for development and analysis.
A small collection of previous responses is available via Google Drive.
- ✅ Default Likert scale (Strongly Disagree to Strongly Agree, with neutral midpoint).
- ✅ Custom response options (passed as a list).
- ✅ Persona-driven simulation (via a JSON file with structured traits and descriptions).
- ✅ Supports simple text files (one question per line).
- ✅ Generates N responses per session.
- ✅ Outputs a tidy CSV file.
- ✅ Temperature setting for controlling LLM creativity.
- ✅ Parameter logging for reproducibility.
- ✅ Configurable LLM base URL for using remote instances.
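The temperature and base URL settings map naturally onto Ollama's `/api/generate` endpoint. The sketch below shows what such a request could look like. How SurveyResponder builds its requests internally is not shown in this README, so the helper names `build_payload` and `ask_ollama` are hypothetical and the code is an illustration only.

```python
import json
import urllib.request


def build_payload(prompt, model="llava-llama3:latest", temperature=1.0):
    """Build a JSON-serializable body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete JSON reply instead of a stream
        "options": {"temperature": temperature},
    }


def ask_ollama(prompt, base_url="http://localhost:11434/api/generate", **kwargs):
    """POST the payload and return the generated text (needs a running Ollama)."""
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        base_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]


# Building the payload does not require Ollama to be running
payload = build_payload("I enjoy working in teams.", temperature=0.7)
print(json.dumps(payload, indent=2))
```

Pointing `base_url` at another host is all that should be needed to target a remote Ollama instance, assuming that host exposes the same endpoint.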
SurveyResponder requires Python 3.7+ and Ollama for local LLM execution.
- Install Python and Pandas. The Anaconda distribution is recommended; otherwise, the latest version of Python can work.
- Install Ollama and/or AnythingLLM.
- Pull an LLM model with Ollama (e.g., `ollama pull llava-llama3:latest`).
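Before generating responses it can help to confirm that Ollama is reachable and that the model was pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. The function name `list_local_models` is hypothetical and is not part of SurveyResponder; the default URL assumes a local Ollama on its standard port.

```python
import json
import urllib.request


def list_local_models(tags_url="http://localhost:11434/api/tags", timeout=3):
    """Return the names of locally pulled models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(tags_url, timeout=timeout) as reply:
            data = json.loads(reply.read())
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers bad JSON
        return None
    return [entry["name"] for entry in data.get("models", [])]


models = list_local_models()
if models is None:
    print("Ollama does not appear to be running.")
else:
    print("Available models:", models)
```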
Because SurveyResponder is currently a single Python file (beta), installation is simple:
# Download the Python file
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/adamrossnelson/SurveyResponder/main/SurveyResponder.py" -OutFile "SurveyResponder.py"
# Download example files (optional)
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/adamrossnelson/SurveyResponder/main/questions.txt" -OutFile "questions.txt"
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/adamrossnelson/SurveyResponder/main/persona.json" -OutFile "persona.json"

# Download the Python file
curl -O https://raw.githubusercontent.com/adamrossnelson/SurveyResponder/main/SurveyResponder.py
# Download example files (optional)
curl -O https://raw.githubusercontent.com/adamrossnelson/SurveyResponder/main/questions.txt
curl -O https://raw.githubusercontent.com/adamrossnelson/SurveyResponder/main/persona.json

To use SurveyResponder, import it in your Python code:
# Import the SurveyResponder class
from SurveyResponder import SurveyResponder
# Create a responder with example data
responder = SurveyResponder()
# Make sure Ollama is running before executing
df = responder.run_write('responses.csv')
print(f"Generated {len(df)} responses successfully!")

from SurveyResponder import SurveyResponder
# Basic usage with defaults
responder = SurveyResponder()
df = responder.run()
df.to_csv("results.csv", index=False)
# Advanced usage with all parameters
responder = SurveyResponder(
    questions_path="questions.txt",
    persona_path="persona.json",
    model_name="llava-llama3:latest",
    response_options=["Never", "Rarely", "Sometimes", "Often", "Always"],
    num_responses=100,
    temperature=1.0,
    base_url="http://localhost:11434/api/generate"
)
# Option 1: Get DataFrame only
df = responder.run()
# Option 2: Get DataFrame and write to CSV file (records save as they're generated)
# Also creates results_params.json with configuration parameters
df = responder.run_write("results.csv")

- Run a survey: `python cli.py run --questions questions.txt --num-responses 10`
- Manage your questions: `python cli.py questions --list` or `python cli.py questions --add "I enjoy this research project."`
Full example with advanced options:
python cli.py run \
--questions questions.txt \
--persona persona.json \
--model llama3.1:latest \
--num-responses 100 \
--output results.csv \
--temperature 1.0 \
--response-options "Never,Rarely,Sometimes,Often,Always"
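The `--response-options` flag takes a comma-separated string, while the Python API expects a list. The source of `cli.py` is not shown in this README, so the following is only a sketch of how such a flag could be parsed with `argparse`; the converter `comma_list` and the defaults are assumptions.

```python
import argparse


def comma_list(text):
    """Split 'Never,Rarely,Often' into ['Never', 'Rarely', 'Often']."""
    return [item.strip() for item in text.split(",") if item.strip()]


parser = argparse.ArgumentParser(prog="cli.py")
subcommands = parser.add_subparsers(dest="command", required=True)

run = subcommands.add_parser("run", help="generate survey responses")
run.add_argument("--questions", default="questions.txt")
run.add_argument("--num-responses", type=int, default=10)
run.add_argument("--temperature", type=float, default=1.0)
run.add_argument("--response-options", type=comma_list, default=None)

# Parse an explicit argument list so the example runs without a real command line
args = parser.parse_args(
    ["run", "--num-responses", "100",
     "--response-options", "Never,Rarely,Sometimes,Often,Always"]
)
print(args.num_responses, args.response_options)
```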
---
## π οΈ Customization Options
Below are a few examples of ways to tailor SurveyResponder to specific use cases:
### Changing LLM Models
To test how responses differ among LLM models, you can change the LLM by pulling a different one from Ollama.
A full list of available LLMs is found here: https://ollama.com/library
```python
# Example: use mistral in the responder
# First pull the model from a terminal:  ollama pull mistral:latest
from SurveyResponder import SurveyResponder

responder = SurveyResponder(
    questions_path="questions.txt",
    persona_path="persona.json",
    model_name="mistral:latest",  # Changed to mistral
    response_options=["Disagree", "Slightly Disagree", "Neutral", "Slightly Agree", "Agree"],
    num_responses=100,
    temperature=1.0,
    base_url="http://localhost:11434/api/generate"
)
```

SurveyResponder uses two input files:
- questions.txt: plain text, one survey question per line.
- persona.json: a dictionary of traits where each key becomes a column and each value is a list of [value, description] pairs.
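Because each persona.json value is a list of [value, description] pairs, drawing a persona amounts to picking one pair per trait, keeping the value for the CSV and the description for the prompt. The sketch below illustrates that idea only. The function `sample_persona` and the exact sentence wording are assumptions, not SurveyResponder's actual implementation.

```python
import random

# Same shape as persona.json: trait -> list of [value, description] pairs
personas = {
    "age": [[16, "is 16 years old"], [18, "is 18 years old"]],
    "gender": [["male", "is male"], ["female", "is female"]],
    "hobbies": [["art", "who enjoys making art"], ["music", "who enjoys music"]],
}


def sample_persona(traits, rng=random):
    """Pick one [value, description] pair per trait."""
    record, phrases = {}, []
    for trait, options in traits.items():
        value, description = rng.choice(options)
        record[trait] = value        # stored in the CSV column named after the trait
        phrases.append(description)  # woven into the prompt text
    return record, "You are someone who " + ", ".join(phrases) + "."


# A seeded generator makes the draw repeatable
record, description = sample_persona(personas, random.Random(0))
print(record)
print(description)
```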
You can edit these files manually in a text editor, or programmatically like this:
import json

# Add a new question to questions.txt
with open("questions.txt", "a") as f:
    f.write("\nI feel confident solving programming problems.")

# Add a new trait to persona.json
with open("persona.json", "r") as f:
    personas = json.load(f)

# Add a new student status trait
personas["student_status"] = personas.get("student_status", [])
personas["student_status"].append(["full-time", "who is a full-time student"])

# Save the changes
with open("persona.json", "w") as f:
    json.dump(personas, f, indent=2)

The default Likert scale can be changed to better fit specific questions and personas:
responder = SurveyResponder(
    questions_path="questions.txt",
    persona_path="persona.json",
    model_name="mistral:latest",
    response_options=["Never", "Rarely", "Often", "Always"],  # Changed to a 4-point scale
    num_responses=100,
    temperature=1.0,
    base_url="http://localhost:11434/api/generate"
)

SurveyResponder includes methods to preview the personas and prompts that will be used, which can help verify that persona.json is specified correctly:
# Create a SurveyResponder
responder = SurveyResponder()
# Generate a random persona description
persona = responder.example_persona()
print(persona)
# Output: "You are a someone who is multiracial, who is from a family whose members go to and do well in college..."
# Generate multiple personas
personas = responder.example_persona(npersonas=3)
for i, p in enumerate(personas):
    print(f"Persona {i+1}: {p}")
# Generate an example prompt using the first question in questions.txt
prompt = responder.example_prompt()
print(prompt)
# Generate an example prompt with a custom question
prompt = responder.example_prompt("I enjoy Python programming.")
print(prompt)

Plain text file, one survey question per line:
I enjoy working in teams.
I prefer a structured schedule.
I feel confident in my abilities.
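Since the format is one question per line, a stray blank line (for example, after appending with a leading newline) could be read as an empty question. Whether SurveyResponder filters these itself is not stated here, so the sketch below shows one way to load the file defensively. The helper `load_questions` is hypothetical.

```python
import tempfile
from pathlib import Path


def load_questions(path):
    """Read one question per line, skipping blank lines and stray whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Demonstrate with a temporary file that contains an accidental blank line
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "questions.txt"
    sample.write_text(
        "I enjoy working in teams.\n\nI prefer a structured schedule.\n",
        encoding="utf-8",
    )
    questions = load_questions(sample)

print(questions)
```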
In persona.json, each key becomes a column in the output CSV. Each value is a list of [value, description] pairs: the first element is recorded in the CSV, and the second is included in the LLM prompt.
{
"age": [[16, "is 16 years old"], [18, "is 18 years old"], [20, "is 20 years old"]],
"gender": [["male", "is male"], ["female", "is female"]],
"hobbies": [["art", "who enjoys making art"], ["music", "who enjoys music"]]
}

Example format:
| resid | age | gender | hobbies | Q1 | Q2 | Q3 |
|---|---|---|---|---|---|---|
| 1 | 18 | male | music | Agree | Neutral | Strongly Agree |
| 2 | 20 | female | art | Disagree | Agree | Agree |
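Because the output is tidy (one row per respondent, one column per question), it can be tallied or scored with a few lines of code. The sketch below uses only the standard library and the two example rows from the table. The numeric scoring in `SCORES` is an assumption about the default five-point labels, not something SurveyResponder defines.

```python
import csv
import io
from collections import Counter

# Two rows mirroring the example table
csv_text = """resid,age,gender,hobbies,Q1,Q2,Q3
1,18,male,music,Agree,Neutral,Strongly Agree
2,20,female,art,Disagree,Agree,Agree
"""

# Assumed numeric scoring for the default five-point scale
SCORES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}

rows = list(csv.DictReader(io.StringIO(csv_text)))
question_columns = [name for name in rows[0] if name.startswith("Q")]

# Frequency of each response label per question
counts = {q: Counter(row[q] for row in rows) for q in question_columns}

# Summed scale score per respondent, keyed by resid
totals = {row["resid"]: sum(SCORES[row[q]] for q in question_columns) for row in rows}

print(counts["Q1"])
print(totals)
```

For a real run, replace `io.StringIO(csv_text)` with an open file handle on the generated CSV.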
Configuration parameters file for reproducibility:
{
"questions_path": "questions.txt",
"persona_path": "persona.json",
"model_name": "llava-llama3:latest",
"base_url": "http://localhost:11434/api/generate",
"num_responses": 100,
"temperature": 1.0,
"response_options": ["Never", "Rarely", "Sometimes", "Often", "Always"],
"run_date": "2025-04-03 21:04:23.123456",
"num_questions": 3
}

- Simulating data for scoring algorithm validation
- Exploring how LLMs might (or might not) reflect or replicate human biases
- Generating mock data for dashboards or demonstrations
Pull requests are welcome, especially if consistent with the roadmap below! Please open an issue first to discuss major changes, or work to address an existing issue.
SurveyResponder/
├── src/
│   └── surveyresponder/
│       ├── __init__.py          ← re-exports SurveyResponder class, __version__
│       ├── core.py              ← SurveyResponder class + helper functions
│       ├── cli.py               ← CLI entry point
│       └── data/
│           ├── questions.txt    ← default example questions
│           └── persona.json     ← default example persona
├── tests/
│   ├── conftest.py
│   └── test_core.py
├── examples/
│   └── PRCA_LLM_Original_FrequencyScale.csv
├── .github/
│   └── workflows/
│       └── test.yml
├── pyproject.toml               ← build metadata, dependencies, CLI entry point
├── README.md                    ← renamed from ReadMe.md
├── LICENSE
└── .gitignore                   ← simplified
| Priority | Change |
|---|---|
| 🔴 High | Add pyproject.toml to make the project installable (Issue) |
| 🔴 High | Move source into src/surveyresponder/ package directory (Issue) |
| 🔴 High | Rename SurveyResponder.py → core.py (PEP 8) |
| 🟡 Medium | Rename ReadMe.md → README.md |
| 🟡 Medium | Refactor run()/run_write() to eliminate duplication (Issue) |
| 🟡 Medium | Clean up imports (remove unused, move inline imports to top) (Issue) |
| 🟡 Medium | Add psutil to dependencies; trim requirements.txt to direct deps only (Issue) |
| 🟡 Medium | Register CLI entry point; update README to reflect CLI is implemented |
| 🟢 Low | Add CI workflow (GitHub Actions) (Issue) |
| 🟢 Low | Move example data into data/ or examples/ subdirectories (Issue) |
| 🟢 Low | Simplify .gitignore strategy |
| 🟢 Low | Add __version__ |
The following features are under consideration for future releases:
- Support for open-ended responses: Allow questions that require textual responses in addition to multiple-choice options.
- Persona templates: Provide predefined personas for ease of use.
- Expanded persona logic: Include sampling strategies, weights, and dependencies between persona traits.
- Question metadata support: Allow users to include additional metadata about questions (e.g., topic, valence) to inform response generation.
- Batch processing of surveys: Enable running multiple different surveys or question sets in one go.
- Psychometric summaries:
- Perform exploratory factor analysis (EFA) and provide outputs.
- Estimate internal consistency metrics (e.g., Cronbach's alpha).
- Visualize response patterns.
- Evaluation module: Compare LLM-generated responses with real human response distributions.
- Cloud deployment support: Make the tool available as a web service or via API.
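Until psychometric summaries are built in, an internal consistency estimate can be computed from numerically scored responses by hand. The sketch below implements the standard Cronbach's alpha formula, k/(k-1) × (1 − Σ item variances / variance of total scores), using only the standard library. The function `cronbach_alpha` and the sample scores are illustrative, not part of SurveyResponder or real output.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-question score lists.

    Each inner list holds one score per respondent, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance / variance(totals))


# Made-up scores for three questions answered by five respondents
q1 = [4, 2, 5, 3, 4]
q2 = [4, 3, 5, 2, 4]
q3 = [5, 2, 4, 3, 5]

alpha = cronbach_alpha([q1, q2, q3])
print(round(alpha, 3))
```

Note that the function divides by the variance of the total scores, so it fails if every respondent has the same total.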
If you use SurveyResponder in your research, please cite it using one of the following formats:
@software{nelson2025surveyresponder,
author = {Nelson, Adam Ross},
title = {SurveyResponder: Generate synthetic survey responses using LLMs},
year = 2025,
publisher = {Up Level Data, LLC},
version = {1.0},
url = {https://github.com/adamrossnelson/SurveyResponder}
}

Nelson, A. R. (2025). SurveyResponder: Generate synthetic survey responses using LLMs (Version 1.0) [Computer software]. Up Level Data, LLC. https://github.com/adamrossnelson/SurveyResponder
MIT License. See LICENSE file for details.