
Add Fitbit data story and Dartmouth athletics data story; also update other READMEs#291

Open
kate-marine wants to merge 13 commits into
ContextLab:masterfrom
kate-marine:master


@kate-marine kate-marine commented Jun 4, 2026

Contributor
  • Adds Fitbit data story
  • Adds Dartmouth athletics data story
  • Adds README for Collaborative story
  • Updates README for sports betting story
  • Updates README for coffee-consulting-story, which I realized I had never finished

Added an overview and approach for analyzing Fitbit data to predict memory-task performance.
Updated the README to include findings and approach details.
Added links to code (still waiting on video)
Added detailed project information, research questions, data sources, approach, findings, and acknowledgements for the Collaborative project.
Copilot AI review requested due to automatic review settings June 4, 2026 02:35

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds new data-story artifacts (README writeups + a small simulated dataset) to document project goals, approaches, and how to reproduce results.

Changes:

  • Added a new Collaborative project README with research questions, approach, findings, and data/code pointers
  • Added a simulated coffee-chain CSV dataset and refined the accompanying project README structure
  • Added an activity-patterns/memory README with methodology, findings, and runnable setup instructions

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| data-stories/collaborative/README.md | New project writeup documenting the RTU survey analysis and findings |
| data-stories/coffee-consulting-story/coffee_chain.csv | New simulated coffee-chain dataset for the MCP server demo story |
| data-stories/coffee-consulting-story/README.md | Refines narrative structure and adds “Approach” + limitations/future work |
| data-stories/activity-patterns-memory/README.md | New project writeup with modeling approach, findings, and run instructions |



## Data

We used data from a series of follow-up surveys that RTU sent to participants after each event. These surveys contain responses from both youth participants and their caring adults,
Comment on lines +38 to +39
The first point at which we saw participation clearly start to drop off was Event 4. Response counts fell by 149 compared to Event 3, and the decline was consistent across all participating schools (BBA fell by 56, Long Trail by 22, Leland & Gray by 17, Flood Brook by 16, and Maple Street by 10).
Some of this is likely because Event 4 was in January, when weather can disrupt scheduling and the ski pass incentive starts to lose its power, since students may have already purchased passes for the season.
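As a quick arithmetic check on the figures above, the per-school drops can be summed and compared with the overall drop. This is only a sketch: the dictionary restates the numbers quoted in the paragraph, and the variable names are illustrative.

```python
# Per-school drops in response count from Event 3 to Event 4,
# copied from the paragraph above.
school_drops = {
    "BBA": 56,
    "Long Trail": 22,
    "Leland & Gray": 17,
    "Flood Brook": 16,
    "Maple Street": 10,
}
total_drop = 149  # overall drop quoted in the text

listed_drop = sum(school_drops.values())
unattributed = total_drop - listed_drop

print(f"Drop across the five listed schools: {listed_drop}")
print(f"Drop not attributed to a listed school: {unattributed}")
```

The five listed schools account for 121 of the 149 lost responses. The remaining 28 may come from responses not tied to one of these schools, which would be worth confirming against the survey data.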

Main question: what would it look like to build something that lets a language model actually do data analysis, instead of just writing code for someone else to run?

For this project I designed and built an MCP server that lets clients such as LLMs upload data, define visualizations, and retrieve results as part of a contextual workflow. It was originally part of a project for the DALI lab, but I expanded it to explore more practical applications. When I work with an LLM on a dataset, as I did in earlier data stories projects, it can produce great code snippets, but I still had to run everything in a notebook and keep relaying results back and forth in the chat (this was before I started using Claude Code).

## Approach

My server exposes a set of tools organized into four categories: dataset operations (upload, describe, list), transforms (filter, aggregate, sort, select columns), visualization specifications (create, suggest, update chart definitions), and rendering (generate PNG or interactive HTML plots). I built it in Python using FastMCP for the protocol layer, pandas for data manipulation, matplotlib for static charts, and Plotly for interactive ones.
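To make the four categories concrete, here is a minimal sketch of how such tools might be grouped. It is a plain-Python stand-in, not the actual server: it does not use FastMCP, pandas, matplotlib, or Plotly, and the `tool` decorator, every function name, and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical registry: category name -> {tool name: function}.
TOOLS = defaultdict(dict)
DATASETS = {}  # in-memory store: dataset name -> list of row dicts


def tool(category):
    """Register a function under one of the four tool categories."""
    def register(fn):
        TOOLS[category][fn.__name__] = fn
        return fn
    return register


@tool("dataset")
def upload_dataset(name, rows):
    DATASETS[name] = list(rows)
    return {"name": name, "n_rows": len(DATASETS[name])}


@tool("transform")
def filter_dataset(name, column, value):
    return [row for row in DATASETS[name] if row[column] == value]


@tool("vizspec")
def create_vizspec(dataset, kind, x, y):
    # A chart definition is just data; nothing is drawn at this point.
    return {"dataset": dataset, "kind": kind, "x": x, "y": y}


@tool("render")
def render(spec, fmt="png"):
    # A real server would draw the chart; this only reports what it would do.
    return f"{spec['kind']} chart of {spec['y']} by {spec['x']} as {fmt}"


# Made-up rows, purely for illustration.
info = upload_dataset("sales", [
    {"city": "Hanover", "revenue": 10},
    {"city": "Boston", "revenue": 7},
])
spec = create_vizspec("sales", "bar", "city", "revenue")
categories = sorted(TOOLS)
print(categories)
print(render(spec, "html"))
```

Keeping the chart definition separate from rendering means a client can create or update a spec several times and only render once it looks right.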
Comment on lines +36 to +38
The biggest limitation I ran into was that suggest_vizspec (a tool meant to interpret plain-English chart requests, which I didn't end up finishing) can pick the right plot type but doesn't know to chain in an aggregate_dataset call when the request implies one. For example, asking for "total revenue by city" implies two server-side operations, which means letting tools call other tools, and I didn't have time to implement that. Future work could build out this tool for non-LLM clients (it's redundant with a client like Claude Desktop).
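One way to picture the missing chaining step is a planner that returns an ordered list of tool calls instead of a single chart spec. The sketch below is an assumption about how that could work, not the behavior of suggest_vizspec: `plan_request`, the keyword table, the argument names, the `create_vizspec` tool name, and the simple "<aggregate> <value> by <group>" pattern are all hypothetical.

```python
import re

# Hypothetical mapping from request wording to an aggregation function.
AGG_WORDS = {"total": "sum", "average": "mean", "mean": "mean", "count": "count"}


def plan_request(request, dataset):
    """Turn a request like 'total revenue by city' into ordered tool calls."""
    match = re.fullmatch(r"(?:(\w+) )?(\w+) by (\w+)", request.strip().lower())
    if match is None:
        raise ValueError(f"unsupported request: {request!r}")
    agg_word, value, group = match.groups()

    steps = []
    source = dataset
    if agg_word in AGG_WORDS:
        # The wording implies an aggregation, so it must run before plotting.
        source = f"{dataset}_by_{group}"
        steps.append({
            "tool": "aggregate_dataset",
            "dataset": dataset,
            "group_by": group,
            "column": value,
            "agg": AGG_WORDS[agg_word],
            "output": source,
        })
    # The chart always comes last and reads from the most recent dataset.
    steps.append({
        "tool": "create_vizspec",
        "dataset": source,
        "kind": "bar",
        "x": group,
        "y": value,
    })
    return steps


plan = plan_request("total revenue by city", "sales")
for step in plan:
    print(step["tool"], "->", step["dataset"])
```

Returning a plan rather than executing it keeps each tool independent: the server (or a non-LLM client) can run the steps in order without any tool having to call another directly.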

Also, the server currently supports only a limited set of very simple plot types. Future work could make the visualizations more appealing or better suited to more complex data. The same goes for the HTML rendering, which could support more complex interactive visualizations.
### Challenges and potential next steps:
I was somewhat limited in what I could include in the models, since measures like sleep and heart rate/HRV (which probably have fairly strong ties to cognitive performance) were too sparse in the data. This could be revisited or replicated if more data becomes available. As a next step I might look into a different target metric (rather than memory), such as one of the mental health measures like typical stress. Based on a Spearman screen I ran, I might also look into the mean__floors vs. vocab learning correlation.

The biggest problem with the approach I've taken is that the sample size of 113 participants is too small for meaningful modeling and led to significant overfitting.

## Findings

Adding temporal dynamics did not help predict memory-task performance; it actually did significantly worse than the baseline model using average activity level. The models are most likely overfitting, as is common with more predictors (163) than participants (113). The null result held even after three stress tests (expanding to 40 fine-grained outcomes, switching to Elastic Net and Random Forest models, and a univariate Spearman screen across 560 feature–target pairs where no dynamic feature appeared among the top correlates).
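For readers unfamiliar with a univariate Spearman screen, the idea can be sketched as follows. This is an illustration only, written in plain Python rather than the code used in the project; the feature and target names and all values are made up.

```python
def ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no rank information
    return cov / (sx * sy)


def screen(features, targets, top=3):
    """Rank every feature-target pair by absolute Spearman correlation."""
    pairs = [
        (abs(spearman(fvals, tvals)), fname, tname)
        for fname, fvals in features.items()
        for tname, tvals in targets.items()
    ]
    return sorted(pairs, reverse=True)[:top]


# Made-up toy data: one "average" feature and one "dynamic" feature.
features = {
    "mean_steps": [4000, 5200, 6100, 7500, 9000],
    "steps_slope": [0.3, -0.1, 0.4, 0.0, -0.2],
}
targets = {"memory_score": [11, 13, 12, 15, 17]}

for rho, fname, tname in screen(features, targets):
    print(f"{fname} vs {tname}: |rho| = {rho:.2f}")
```

Because each pair is tested on its own, a screen like this does not suffer from having more predictors than participants, which makes it a useful cross-check on the multivariate models.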
@kate-marine kate-marine changed the title Add Fitbit data story and README file for Collaborative story Add Fitbit data story and Dartmouth athletics data story; also update other READMEs Jun 6, 2026