Add Fitbit data story and Dartmouth athletics data story; also update other READMEs by kate-marine · Pull Request #291 · ContextLab/storytelling-with-data

kate-marine · 2026-06-04T02:35:42Z

Adds Fitbit data story
Adds Dartmouth athletics data story
Add README for Collaborative story
Update README for sports betting story
Update README for coffee-consulting-story because I realized I never finished

Added an overview and approach for analyzing Fitbit data to predict memory-task performance.

Updated the README to include findings and approach details.

Added links to code (still waiting on video)

Added detailed project information, research questions, data sources, approach, findings, and acknowledgements for the Collaborative project.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds new data-story artifacts (README writeups + a small simulated dataset) to document project goals, approaches, and how to reproduce results.

Changes:

Added a new Collaborative project README with research questions, approach, findings, and data/code pointers
Added a simulated coffee-chain CSV dataset and refined the accompanying project README structure
Added an activity-patterns/memory README with methodology, findings, and runnable setup instructions

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| data-stories/collaborative/README.md | New project writeup documenting the RTU survey analysis and findings |
| data-stories/coffee-consulting-story/coffee_chain.csv | New simulated coffee-chain dataset for the MCP server demo story |
| data-stories/coffee-consulting-story/README.md | Refines narrative structure and adds “Approach” + limitations/future work |
| data-stories/activity-patterns-memory/README.md | New project writeup with modeling approach, findings, and run instructions |


+
+## Data
+
+We used data from a series of follow-up surveys that RTU asked participants to complete after each event. These surveys contain responses from both youth participants and their caring adults, 


+The first point where we found participation clearly starting to drop off was Event 4. Response counts fell by 149 compared to Event 3, and the drop was consistent across all participating schools (BBA fell by 56, Long Trail by 22, Leland & Gray by 17, Flood Brook by 16, and Maple Street by 10). 
+Some of this could be because Event 4 was in January, when weather can disrupt scheduling and the ski pass incentive starts to lose its power, since students may have already purchased passes for the season.
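
A quick arithmetic check on the figures in this hunk may be worth running before merging. The sketch below only uses the numbers quoted above; the variable names are illustrative. The five per-school drops add up to 121, so 28 of the 149 are not attributed to a listed school. That may simply mean other schools or unlabeled responses are in the data (an assumption; the README does not say), in which case a short note would help readers reconcile the totals.

```python
# Per-school drops in response counts from Event 3 to Event 4, as quoted in the README.
school_drops = {
    "BBA": 56,
    "Long Trail": 22,
    "Leland & Gray": 17,
    "Flood Brook": 16,
    "Maple Street": 10,
}
total_drop = 149  # overall drop quoted in the README

# Sum the listed schools and see how much of the total they explain.
listed_drop = sum(school_drops.values())
unattributed = total_drop - listed_drop

# Each listed school's share of the overall drop.
share_by_school = {name: drop / total_drop for name, drop in school_drops.items()}

print(f"listed schools account for {listed_drop} of {total_drop}")
print(f"not attributed to a listed school: {unattributed}")
```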



-Main question: what it would look like to build something that lets a language model actually do data analysis, instead of just write code for someone else to run?
-
 For this project I designed and built an MCP server that lets clients such as LLMs upload data, define visualizations, and retrieve results as part of a contextual workflow. It was originally part of a project for the DALI lab, but I expanded it to explore more practical applications. When working with an LLM on a dataset, as I have been doing with earlier data stories projects, it can produce great code snippets, but I still have to run everything in a notebook and continuously communicate back and forth with the chat (this was before I started using Claude Code).


+
+## Approach
+
+My server exposes a set of tools organized into four categories: dataset operations (upload, describe, list), transforms (filter, aggregate, sort, select columns), visualization specifications (create, suggest, update chart definitions), and rendering (generate PNG or interactive HTML plots). I built it in Python using FastMCP for the protocol layer, pandas for data manipulation, matplotlib for static charts, and Plotly for interactive ones.


+The biggest limitation I ran into was that suggest_vizspec (a tool that is supposed to interpret plain-English chart requests, which I didn't end up finishing) can pick the right plot type but doesn't know to chain in an aggregate_dataset call when the request implies it. For example, asking for "total revenue by city" implies two server-side operations, which means letting tools call other tools, and I didn't have time to implement that. Future work could build out this tool for non-LLM clients (it's redundant with a client like Claude Desktop).
+
+Also, the server currently supports only a limited set of very simple plot types. Future work could make the visualizations more appealing or better suited to more complex data. The same goes for the HTML rendering, which could support more complex interactive visualizations.
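
One possible way to approach the chaining limitation described in this hunk is to have the suggestion step return an ordered plan of tool calls rather than calling other tools itself. The sketch below is only an illustration under assumptions: `plan_request`, the `create_vizspec` tool name, the argument keys, and the keyword-to-function mapping are all hypothetical and not taken from the server's code. Only `aggregate_dataset` and the "total revenue by city" request come from the README.

```python
import re

# Hypothetical mapping from request wording to an aggregation function name.
AGG_WORDS = {"total": "sum", "average": "mean", "mean": "mean", "count": "count"}


def plan_request(request: str) -> list[dict]:
    """Turn a plain-English chart request into an ordered list of tool calls."""
    pattern = r"(total|average|mean|count)\s+(\w+)\s+by\s+(\w+)"
    match = re.search(pattern, request.lower())
    if match is None:
        # No aggregation implied: a single chart-spec step is enough.
        return [{"tool": "create_vizspec", "args": {"request": request}}]

    agg_word, value_col, group_col = match.groups()
    return [
        # Step 1: aggregate the value column within each group.
        {
            "tool": "aggregate_dataset",
            "args": {"group_by": group_col, "column": value_col, "func": AGG_WORDS[agg_word]},
        },
        # Step 2: chart the aggregated result (a bar chart is an assumed default).
        {
            "tool": "create_vizspec",
            "args": {"chart_type": "bar", "x": group_col, "y": value_col},
        },
    ]


plan = plan_request("total revenue by city")
for step in plan:
    print(step["tool"], step["args"])
```

Returning a plan keeps each tool independent, so the client (or a thin dispatcher on the server) decides whether to execute both steps. A regex is of course a crude stand-in for real request parsing.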


+### Challenges and potential next steps:
+I was somewhat limited in what I could include in the models, since measures like sleep and heart rate/HRV (which probably have strong ties to cognitive performance) were too sparse in the data. This could be revisited or replicated if I can get more data. As a next step I might look into a different target metric (rather than memory), such as one of the mental health measures like typical stress. Based on a Spearman screen I ran, I might also look into the correlation between mean__floors and vocab learning. 
+
+The biggest problem with the approach I've taken is that the sample size of 113 participants is too small for meaningful modeling, which led to significant overfitting. 


+
+## Findings
+
+Adding temporal dynamics did not help predict memory-task performance, and it actually did significantly worse than the baseline model using average activity level. The models are most likely overfitting, as is common when there are more predictors (163) than participants (113). The null result stayed the same even after three stress tests (expanding to 40 fine-grained outcomes, switching to Elastic Net and Random Forest models, and a univariate Spearman screen across 560 feature–target pairs where no dynamic feature appeared among the top correlates).
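
To illustrate why more predictors than participants invites overfitting, here is a minimal simulation, assuming NumPy is available. It uses pure random noise with the same dimensions quoted above (113 rows, 163 columns) and plain unregularized least squares; it is not the study's data or its models. With more columns than rows the fit can reproduce any training outcome almost exactly, while the held-out score should come out far lower.

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants, n_predictors = 113, 163  # counts quoted in the findings

# Pure-noise predictors and outcomes: there is nothing real to learn here.
X_train = rng.normal(size=(n_participants, n_predictors))
y_train = rng.normal(size=n_participants)
X_test = rng.normal(size=(n_participants, n_predictors))
y_test = rng.normal(size=n_participants)

# Unregularized least squares (minimum-norm solution when columns outnumber rows).
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


train_r2 = r_squared(y_train, X_train @ coef)
test_r2 = r_squared(y_test, X_test @ coef)
print(f"train R^2 = {train_r2:.3f}, held-out R^2 = {test_r2:.3f}")
```

A near-perfect training fit on noise is the warning sign here, which is consistent with the README's reading that the richer feature set adds variance rather than signal at this sample size.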


kate-marine added 9 commits June 2, 2026 23:12

Create README.md for activity patterns memory study

545f799

Added an overview and approach for analyzing Fitbit data to predict memory-task performance.

update README with approach and results

0e21964

Updated the README to include findings and approach details.

Update README.md

2565599

Update README with code links

e05c11a

Added links to code (still waiting on video)

Merge branch 'ContextLab:master' into master

c884dcf

Add README for Collaborative project

a2dcb65

Added detailed project information, research questions, data sources, approach, findings, and acknowledgements for the Collaborative project.

Add csv dataset

55098e3

add README for coffee-consulting-story

c2013a5

add video link to fitbit data story README

208b7ce

Copilot AI review requested due to automatic review settings June 4, 2026 02:35

Copilot AI reviewed Jun 4, 2026


kate-marine added 4 commits June 5, 2026 13:09

Create dartmouth athletics project

fcb0958

add code for catapult athletics story

b4625be

add full README for dartmouth athletics story

b61fe4a

add video link

fa4dc85

kate-marine changed the title ~~Add Fitbit data story and README file for Collaborative story~~ Add Fitbit data story and Dartmouth athletics data story; also update other READMEs Jun 6, 2026

kate-marine wants to merge 13 commits into ContextLab:master from kate-marine:master

2 participants

