Does the "Google Effect" persist when the external tool is ChatGPT?
A replication and extension of Sparrow et al. (2011) and Storm et al. (2017) into the era of Generative AI — examining cognitive offloading, memory strategy, and digital dependence when participants have access to ChatGPT instead of a search engine, across both trivia (recall) and writing (generative) tasks.
📊 Live Dashboard · MSc Data Science & Informatics · University of Manchester · 2025
- Overview
- Key Findings
- Dashboard
- Methodology
- Repository Structure
- Quick Start
- Data
- Results & Figures
- Theoretical Background
- Contributing
- Citation
- References
The Google Effect describes how people offload memory to external systems — remembering where to find information rather than the information itself. This study asks a harder question: does the same phenomenon occur with Generative AI, and does it extend beyond fact retrieval into creative writing?
Unlike passive search engines that return links, ChatGPT produces fluent, synthesised answers and so acts as a cognitive co-author. This changes the nature and depth of offloading in ways the earlier search-engine studies did not examine.
| Measure | Easy | Hard/Mixed | F-stat | p | Cohen's d |
|---|---|---|---|---|---|
| Trivia Duration | 134s (SD=129) | 363s (SD=137) | 19.42 | <0.001 *** | 1.71 (large) |
| Trivia Accuracy | 96.3% (SD=6.0) | 83.4% (SD=6.2) | 29.11 | <0.001 *** | 2.09 (large) |
| Writing Duration | 1108s (SD=511) | 1095s (SD=1072) | 0.001 | 0.978 n.s. | 0.01 (negligible) |
Three headline results:
- Hard trivia took 2.7× longer, yet accuracy dropped by only about 13 percentage points (96.3% to 83.4%): ChatGPT appears to have been used as a compensatory mechanism to maintain correctness, not as a time-saver
- Writing showed no mean time difference, but the standard deviation roughly doubled (511s to 1072s): a bimodal split between fast AI-delegators (~500s) and slow independent writers (~1800s+) suggests two distinct strategies
- Computer Self-Efficacy (CSE) correlated positively with AI use (Pearson r=0.53, p<0.001): more digitally confident participants used ChatGPT more, not less, which challenges the "AI as a crutch" narrative
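The headline trivia figures can be re-derived from the summary statistics in the results table. The sketch below is only a sanity check on the arithmetic, not the code in analysis.py. It assumes equal group sizes when pooling the standard deviations, because per-group sample sizes are not stated here; all variable names are illustrative.

```python
import math

# Summary statistics copied from the results table (trivia rows).
easy_mean, easy_sd = 134.0, 129.0   # easy trivia duration, seconds
hard_mean, hard_sd = 363.0, 137.0   # hard/mixed trivia duration, seconds
easy_acc, hard_acc = 96.3, 83.4     # mean accuracy, percent

# "2.7x longer": ratio of the mean durations.
duration_ratio = hard_mean / easy_mean

# Accuracy drop expressed in percentage points (a difference of percentages).
accuracy_drop_pp = easy_acc - hard_acc

# Cohen's d with a pooled SD.
# ASSUMPTION: equal group sizes, so the pooled variance is a simple average.
pooled_sd = math.sqrt((easy_sd ** 2 + hard_sd ** 2) / 2)
cohens_d = (hard_mean - easy_mean) / pooled_sd

print(f"{duration_ratio:.2f}x longer, "
      f"{accuracy_drop_pp:.1f} points lower, d = {cohens_d:.2f}")
```

Under the equal-n assumption the effect size comes out near 1.72, close to the 1.71 reported in the table; a small gap of that kind is what unequal group sizes or rounding of the summary statistics would produce.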
An interactive research dashboard is included at dashboard.html. Open it in any browser — no server required.
What's in it:
| Section | Contents |
|---|---|
| Overview | Key metrics, effect size bars, full ANOVA table |
| Trivia | Per-participant charts, per-question accuracy breakdown |
| Writing | Bimodal scatter, density histogram, full prompt list |
| CSE & AI use | Score distribution, factor subscales, strip chart |
| Plain English | Accessible explanations + interactive duration simulator |
The simulator lets you select any task × difficulty × AI condition and see the real mean duration, a contextual note, and a live comparison against all other conditions — with animated transitions.
Deploy to GitHub Pages by pushing to main with GitHub Actions enabled — the pages.yml workflow handles it automatically.
A three-factor between-subjects design:
| Factor | Levels |
|---|---|
| Task Type | Trivia · Writing |
| Task Difficulty | Easy · Hard/Mixed |
| AI Condition | No-AI · Mid-AI · Full-AI |
| Condition | Description |
|---|---|
| No-AI | Memory only — no external resources permitted |
| Mid-AI | First 8 questions unaided; ChatGPT from question 9 onwards |
| Full-AI | ChatGPT available from question 1 |
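The three AI conditions amount to a simple rule about when ChatGPT may be used. A minimal sketch of that rule follows; the function name `chatgpt_allowed` and the 1-indexed question numbering are assumptions made for illustration and do not come from the study materials.

```python
def chatgpt_allowed(condition: str, question_number: int) -> bool:
    """Return True if ChatGPT may be used on this question.

    `question_number` is 1-indexed (hypothetical convention).
    The condition labels match the table above.
    """
    if question_number < 1:
        raise ValueError("question_number is 1-indexed")
    if condition == "No-AI":
        return False                  # memory only
    if condition == "Mid-AI":
        return question_number >= 9   # first 8 questions are unaided
    if condition == "Full-AI":
        return True                   # available from question 1
    raise ValueError(f"unknown condition: {condition!r}")


# Example: a Mid-AI participant gains access at question 9.
print([q for q in range(1, 17) if chatgpt_allowed("Mid-AI", q)])
```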
- N = 60 recruited; 46 provided complete CSE survey responses
- Eligibility: ≥18 years, fluent English, willingness to complete tasks
- Ethics approved by the University of Manchester; data anonymised before upload
Trivia tasks (scored via Qualtrics SC0):
- Easy (8 questions): general knowledge — Shakespeare, seasons, colours, Ford Mustang, etc.
- Hard (8 additional questions): obscure recall — Vitamin C formula, Montgolfier brothers, caffeine molecular formula, first dog in orbit, etc.
Writing tasks (70–100 words per prompt):
- Easy (8 prompts): personal/descriptive — "describe a memorable meal", "instructions for a sandwich", etc.
- Hard (8 prompts): constrained/abstract — "describe silence using only one-syllable words", "life imprisonment vs death penalty", "a modern fairytale commenting on a current political topic", etc.
CSE Questionnaire: Ward's (2013) validated 14-item, 3-factor scale (1–3 Likert) plus weekly ChatGPT use frequency.
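Internal consistency of a multi-item scale such as this one is usually summarised with Cronbach's α. The sketch below shows the standard formula using NumPy; it is not taken from analysis.py, and the function name and input layout (one row per respondent, one column per item) are assumptions.

```python
import numpy as np


def cronbach_alpha(items) -> float:
    """Cronbach's alpha for a respondents x items score matrix."""
    x = np.asarray(items, dtype=float)
    if x.ndim != 2 or x.shape[0] < 2 or x.shape[1] < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    k = x.shape[1]                          # number of items
    item_vars = x.var(axis=0, ddof=1)       # sample variance of each item
    total_var = x.sum(axis=1).var(ddof=1)   # variance of the summed score
    return float((k / (k - 1)) * (1 - item_vars.sum() / total_var))


# Toy data on a 1-3 scale (illustrative only, not study data):
# three perfectly consistent items give alpha = 1.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(toy))
```

For the real survey the input would be the 14 item columns of final_survey.csv, restricted to complete responses.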
All analysis implemented in analysis.py using scipy only (no statsmodels dependency):
- One-way ANOVA with F-statistic and p-value
- Welch's t-test (robust to unequal variance)
- Cohen's d (pooled standard deviation method)
- Eta-squared (η²) from first principles
- Levene's test for homogeneity of variance
- Pearson r for CSE × AI use correlation
- Cronbach's α for CSE internal consistency
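The group-comparison statistics listed above can be sketched for a single easy versus hard contrast as follows. This is a hedged illustration, not the contents of analysis.py: the function name `compare_two_groups` and the returned dictionary keys are hypothetical. Note that `scipy.stats.levene` centres on the median by default (the Brown-Forsythe variant); pass `center="mean"` for the classical Levene test.

```python
import numpy as np
from scipy import stats


def compare_two_groups(a, b) -> dict:
    """Compare two independent samples (e.g. easy vs hard durations)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)

    # One-way ANOVA, Welch's t-test, and Levene's test from scipy.
    f_stat, p_anova = stats.f_oneway(a, b)
    t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)
    lev_stat, p_levene = stats.levene(a, b)

    # Cohen's d using the pooled standard deviation.
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    cohens_d = (b.mean() - a.mean()) / np.sqrt(pooled_var)

    # Eta-squared from first principles: SS_between / SS_total.
    pooled = np.concatenate([a, b])
    grand_mean = pooled.mean()
    ss_between = na * (a.mean() - grand_mean) ** 2 + nb * (b.mean() - grand_mean) ** 2
    ss_total = ((pooled - grand_mean) ** 2).sum()

    return {
        "F": float(f_stat), "p_anova": float(p_anova),
        "t_welch": float(t_welch), "p_welch": float(p_welch),
        "levene": float(lev_stat), "p_levene": float(p_levene),
        "cohens_d": float(cohens_d),
        "eta_sq": float(ss_between / ss_total),
    }


# Toy data (illustrative only, not study data).
result = compare_two_groups([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(result["F"], result["cohens_d"], result["eta_sq"])
```

With two groups the ANOVA F equals the square of the equal-variance t statistic, so the Welch test is the more informative addition when variances differ, as they do for writing duration.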
google-effects-genai/
│
├── .github/
│   ├── workflows/
│   │   ├── ci.yml                      # runs analysis.py on every push
│   │   └── pages.yml                   # deploys dashboard.html to GitHub Pages
│   └── ISSUE_TEMPLATE/
│       └── bug_report.md
│
├── data/                               # real Qualtrics CSV exports (ethics-approved)
│   ├── trivia_easy.csv
│   ├── trivia_hard.csv
│   ├── writing_easy.csv
│   ├── writing_hard.csv
│   └── final_survey.csv
│
├── figures/                            # generated by analysis.py
│   ├── fig1_boxplots.png
│   ├── fig2_mean_sd.png
│   ├── fig3_accuracy_vs_duration.png
│   ├── fig4_writing_density.png
│   ├── fig5_cse.png
│   ├── fig6_ai_use.png
│   ├── fig7_effect_heatmap.png
│   ├── fig8_per_question.png
│   └── descriptive_stats.csv
│
├── notebooks/
│   └── code.ipynb                      # original exploratory Colab notebook
│
├── src/
│   └── generate_synthetic_data.py      # synthetic data calibrated to real distributions
│
├── analysis.py                         # full pipeline with CLI (argparse)
├── dashboard.html                      # interactive results dashboard (no server needed)
├── requirements.txt
├── CITATION.cff                        # machine-readable citation (GitHub + Zenodo)
├── LICENSE                             # MIT
├── .gitignore
└── README.md
Requires Python 3.9+ and pip.
git clone https://github.com/YOUR_USERNAME/google-effects-genai.git
cd google-effects-genai
pip install -r requirements.txt
python analysis.py
python analysis.py --help
# Stats only — no figures
python analysis.py --no-figures
# Custom data directory
python analysis.py --data-dir my_csvs/ --fig-dir results/
# Quiet mode (suppress table output)
python analysis.py --quiet
python src/generate_synthetic_data.py
Produces five CSVs in data/ calibrated to match the real study's distributions, which is useful for testing or demonstration without exposing participant data.
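The generator script itself is not reproduced here. As a rough sketch of what "calibrated to the real distributions" could mean, the example below draws trivia rows whose durations follow a log-normal distribution matched to a target mean and SD, and whose scores are binomial. The distribution choices, function names, and sample size are all assumptions for illustration; only the column names and the summary statistics come from this README.

```python
import csv
import io
import math
import random


def lognormal_params(mean: float, sd: float) -> tuple:
    """Convert a target mean/SD into (mu, sigma) of a log-normal."""
    sigma_sq = math.log(1 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2, math.sqrt(sigma_sq)


def synthetic_trivia_rows(n, mean_s, sd_s, n_questions, p_correct, seed=0):
    """Generate n synthetic participants (hypothetical helper)."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean_s, sd_s)
    rows = []
    for _ in range(n):
        # Log-normal keeps durations positive and right-skewed.
        duration = max(1, round(rng.lognormvariate(mu, sigma)))
        # Each question is answered correctly with probability p_correct.
        score = sum(rng.random() < p_correct for _ in range(n_questions))
        rows.append({"Duration (in seconds)": duration, "SC0": score})
    return rows


def to_csv_text(rows) -> str:
    """Serialise rows with the column names used in data/."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["Duration (in seconds)", "SC0"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Easy trivia: mean 134 s, SD 129 s, 96.3% accuracy over 8 questions.
# n=30 is an arbitrary illustrative sample size.
rows = synthetic_trivia_rows(n=30, mean_s=134, sd_s=129,
                             n_questions=8, p_correct=0.963)
print(to_csv_text(rows).splitlines()[0])
```

A log-normal is used instead of a normal distribution because the reported SD is almost as large as the mean, so a normal draw would frequently produce negative durations.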
open dashboard.html # macOS
xdg-open dashboard.html # Linux
# or just double-click it in any file explorer

| File | Key columns | Notes |
|---|---|---|
| trivia_easy.csv | Duration (in seconds), SC0 | SC0 = raw correct count out of 8 |
| trivia_hard.csv | Duration (in seconds), SC0 | SC0 = raw correct count out of 16 |
| writing_easy.csv | Duration (in seconds), Q1–Q8 | Free-text responses |
| writing_hard.csv | Duration (in seconds), Q1–Q16 | First 8 easy + 8 hard prompts |
| final_survey.csv | Q1–Q14, Q15_1 | CSE Likert (1–3) + AI use frequency |
# Easy trivia: 8 questions
accuracy_pct = (SC0 / 8) * 100
# Hard trivia: 16 questions (8 easy + 8 hard)
accuracy_pct = (SC0 / 16) * 100
Data is anonymised: no names, emails, or IP addresses are stored. Qualtrics metadata columns are stripped. Collected under University of Manchester ethics approval.
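The accuracy formulas can be applied to a whole export in a few lines. The sketch below assumes each CSV has a single header row containing an `SC0` column; raw Qualtrics exports often carry extra header rows, so real files may need those skipped first. The function name `accuracy_percentages` is illustrative.

```python
import csv
import io

# Question counts per file, as documented in the data table.
QUESTION_COUNTS = {"trivia_easy.csv": 8, "trivia_hard.csv": 16}


def accuracy_percentages(csv_file, n_questions: int) -> list:
    """Return accuracy (%) for every row that has an SC0 score."""
    reader = csv.DictReader(csv_file)
    scores = []
    for row in reader:
        raw = (row.get("SC0") or "").strip()
        if raw:  # skip rows with no recorded score
            scores.append(float(raw) / n_questions * 100)
    return scores


# In-memory stand-in for trivia_easy.csv (made-up rows, not study data).
sample = io.StringIO("Duration (in seconds),SC0\n120,8\n300,6\n")
print(accuracy_percentages(sample, QUESTION_COUNTS["trivia_easy.csv"]))
# prints [100.0, 75.0]
```

For a file on disk, open it with `open(path, newline="")` and pass the file object in place of the `StringIO` stand-in.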
| Figure | What it shows |
|---|---|
| fig1_boxplots.png | Duration and accuracy distributions (boxplots) |
| fig2_mean_sd.png | Group means ± SD with annotated F-statistics and effect sizes |
| fig3_accuracy_vs_duration.png | Scatter plot with the accuracy ceiling effect visible |
| fig4_writing_density.png | Density histogram showing the bimodal writing strategy split |
| fig5_cse.png | CSE score histogram + factor subscale boxplots |
| fig6_ai_use.png | Self-reported weekly ChatGPT use frequency |
| fig7_effect_heatmap.png | Effect size heatmap across all three tests |
| fig8_per_question.png | Per-question response rate breakdown |
| Theory | Authors | Role in this study |
|---|---|---|
| Google Effect | Sparrow, Liu & Wegner (2011) | Core paradigm being replicated |
| Habitual reliance | Storm, Stone & Benjamin (2017) | Predicts AI use extends to easy/solvable items |
| Transactive memory | Wegner (1987) | Explains distributed external cognition |
| Cognitive offloading | Risko & Gilbert (2016) | Framework for intentional AI delegation |
| Computer Self-Efficacy | Ward (2013) | Individual-differences moderator |
Issues and pull requests are welcome. Please open an issue first for any significant changes.
If you replicate or extend this study, please cite this repository (see below) and open a PR to add your work to a replications/ section.
For questions about the methodology or data, please open a GitHub issue rather than emailing directly.
@mastersthesis{14144847_2025_googleeffects,
  title  = {Google Effects and GenAI: Examining Cognitive Offloading
            in the Age of Generative Artificial Intelligence},
  author = {{University of Manchester, Student ID 14144847}},
  school = {University of Manchester, School of Computer Science},
  year   = {2025},
  type   = {MSc Dissertation},
  url    = {https://github.com/YOUR_USERNAME/google-effects-genai}
}
A CITATION.cff file is also included for GitHub's "Cite this repository" button and Zenodo DOI generation.
- Sparrow, B., Liu, J., & Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043), 776–778.
- Storm, B. C., Stone, S. M., & Benjamin, A. S. (2017). Using the internet to access information inflates future use of the internet to access other information. Memory, 25(6), 717–723.
- Ward, A. F. (2013). Supernormal: How the internet is changing our memories and our minds. Psychological Inquiry, 24(4), 341–348.
- Risko, E. F., & Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9), 676–688.
- Wegner, D. M. (1987). Transactive memory: A contemporary analysis of the group mind. In B. Mullen & G. R. Goethals (Eds.), Theories of group behavior (pp. 185–208). Springer.
Submitted in partial fulfilment of the requirements for the degree of MSc in Data Science and Informatics, The University of Manchester, 2025.