Please note that this schedule is subject to change. (And open to pull requests.) Class progress through early materials may warrant a change in the schedule or assigned work.
General weekly schedule:
Tuesday – submit previous week's assignment; quiz by the beginning of class about the assignment; review/discuss/clarify last assignment; introduce new topic and readings
Midweek homework - reading assignment
Thursday – quiz by the beginning of class on assigned readings; review quiz results and clarify difficulties revealed by quiz; in‑class partner/small‑group work applying readings; introduce Colab notebook assignment; if time, begin the notebook in class
Weekend homework – Colab notebook
Following Tuesday – submit notebooks; clicker quiz; review/clarify; introduce new topic and readings; etc.
- Bring laptop.
- Read: What is a relational database
- Introductions and course/unit overview
- Introduction to SQL and MySQL.
- Colab warm‑up: connect to MySQL from Python (connector provided), run first
SELECT. - Query basics:
SELECT,WHERE,ORDER BY,LIMIT; basic pattern matching withLIKEand wildcards. - Notebook 1 released: "Getting Data Out."
- MySQL Tutorial (PDF): Intro + Basic Queries — pp. 5–7 (mysql-tutorial.pdf on Canvas)
These documentation articles will have more information than you need, and they will reference some things we haven't covered yet. However, it is important to be familiar with them. Glean from them what you need in reference to the assigned notebook and class discussions, and tuck the rest away for later.
- SELECT statement (overview) — dev.mysql.com/doc/en/select.html
- Pattern matching (LIKE) — dev.mysql.com/doc/en/pattern-matching.html
- String functions (reference) — dev.mysql.com/doc/en/string-functions.html
- Master SQL in 16 Pages (on Canvas) — pp. 1–5 (SELECT, WHERE, ORDER BY, LIMIT, LIKE)
- SQL Cheat Sheet (on Canvas) — pp. 1–2
- Submit Notebook 1 to Canvas by the start of class. The notebook is in Colab, and linked from Canvas under Assignments.
- Submit Reading Quiz 1 on Canvas under Quizzes (also covers Notebook 1, so be sure to complete Notebook 1 before the quiz).
- Review Notebook 1 results.
- Introduce new topic: Aggregation + string/math functions; preview of core joins (INNER, LEFT).
Reading Quiz #2 in Canvas (after doing the following readings).
- MySQL Tutorial (PDF): pp. 13–25 (mysql-tutorial.pdf on Canvas).
These documentation articles will have more information than you need, and they will reference some things we haven't covered yet. However, it is important to be familiar with them. Glean from them what you need in reference to the assigned notebook and class discussions, and tuck the rest away for later.
- Aggregate functions (overview) — dev.mysql.com/doc/refman/8.4/en/aggregate-functions.html
- MySQL handling of
GROUP BY— dev.mysql.com/doc/refman/8.4/en/group-by-handling.html - GROUP BY modifiers (
WITH ROLLUP) — dev.mysql.com/doc/refman/8.4/en/group-by-modifiers.html
- Quiz debrief.
- Grouping and aggregation practice.
- If time: begin Notebook 2 in class.
- Submit Notebook 2 by the start of class.
- Review Notebook 2 results.
- Introduce new topic: Subqueries, CTEs (
WITH), and joins.
Reading Quiz #3 in Canvas (after doing the following readings).
- MySQL Tutorial (PDF): Using more than one table (intro to joins) — pp. 25–27 (mysql-tutorial.pdf on Canvas)
- Introduction to MySQL joins
- An Overview of Joins
- Subqueries
- Common Table Expressions (CTEs)
- Unions
- CTEs (
WITH) — dev.mysql.com/doc/refman/8.4/en/with.html - Subqueries — dev.mysql.com/doc/mysql/en/subqueries.html
- JOIN Clause - https://dev.mysql.com/doc/refman/8.4/en/join.html
- Set operations (UNION, INTERSECT, EXCEPT) — dev.mysql.com/doc/refman/8.4/en/set-operations.html
- Master SQL in 16 Pages — p. 6 (subqueries) (Master_SQL_in_16_Pages.pdf on Canvas)
- Master SQL in 16 Pages — pp. 7–9 (JOINS) (Master_SQL_in_16_Pages.pdf on Canvas).
- Master SQL in 16 Pages — p. 10 (set operations) (Master_SQL_in_16_Pages.pdf on Canvas)
Videos are embedded in the primary readings this week. I highly recommend watching them, as well as reading the articles, particularly for any concepts that are new for you.
- Quiz debrief.
- Small‑group refactors: rewrite a subquery as a CTE and as a join; compare approaches.
- Notebook 3 released: Subqueries, CTEs, and basic joins.
- If time: begin Notebook 3 in class.
- Submit Notebook 3 by the start of class.
- Review Notebook 3.
- More work on joins.
Reading Quiz #4 in Canvas (after doing the following readings).
- Installing MySQL on Windows (if you're a Windows user)
- Installing MySQL on macOS Using Native Packages (if you're a Mac user)
- Getting Started with MySQL
- Quiz debrief.
- Install and troubleshoot MySQL and MySQL Workbench on your laptop (if you haven't already).
- Make and break some database(s).
Supplemental resource:
Use the code that AlexTheAnalyst provided to accompany the video we watched, available here, and recreate the Parks and Rec database on your local MySQL instance.
Then write SQL queries that do the following:
- Show all the tables in the database.
- Show the schema details for one of the tables.
- Show the first few records for another table.
- Write a query that uses a
JOINto combine data from two tables. - Write a query that uses an aggregate function (e.g.,
COUNT,SUM,AVG) and aGROUP BYclause.
For the latter two, they do not need to be anything fancy, but make sure they are original queries (not anything in AlexTheAnalyst's video or his provided code).
Finally, make a screencast video no more than 2 minutes long (preferably less than a minute) in which you show me the database on your machine, and run each of the above queries on it, showing the results.
I recommend writing and testing the queries ahead of time, ensuring that the results match your expectations, and then copying and pasting the queries into the terminal as you make the video, to be nice to your professor. ;)
Submit the video to Canvas by 1:00.
- Review Assignment 4 and CRUD operations.
- SOURCE and *.sql files.
- Pandas to_sql().
- Introduce database normalization (if time).
No reading quiz this time, so you can get started on your midterm project.
- Introduction to Schema
- Introduction to Database Normalization
- Normalization in SQL (only read up through third normal form - 3NF)
- Introduction to Indexes
- B+ Trees
- Normalization strategies.
We will practice with the following datasets: - Midterm work session for peer and instructor feedback.
- Continue work on your midterm project.
Also...
- Primary Keys
- Primary Key Data Types
- Where to add indexes
- Indexing for wildcard searches
- Fulltext indexes
- Foreign keys
- Primary keys — choosing fields and data types.
- Generating keys with the IDE.
- Indexes — choosing fields and creating indexes.
- Midterm peer feedback, followed by brief work/(re)planning time.
- Continue midterm project work.
Also...
I have found that combining EXPLAIN with ChatGPT is an immensely helpful combination for database debugging and optimization. You can simply run EXPLAIN and cut-and-paste the output into GPT, or you can ask it a specific question about what you are trying to do and how to fix/improve/optimize it.
EXPLAINwalkthroughs.- More on indexes.
- Midterm debugging and feedback, unit wrap‑up.
- Midterm project due by 2:00pm.
- Introduction to Unit 2: NoSQL databases.
Reading Quiz #5 in Canvas (after doing the following readings).
- Quiz debrief.
- Practice working with JSON in Python, using data from a public social media API. See JSON practice notebook.
NO CLASS (Kris had Covid)
NO CLASS!!!
- Install MongoDB on your machine, following the instructions for your OS at Try MongoDB Community Edition.
- Do Reading Quiz #6 in Canvas (after doing the following readings).
- NoSQL for Mere Mortals (in Canvas), pp. 153–195.
- More JSON Practice.
- Quiz debrief.
- MongoDB basic operations and testing.
mongosh(MongoDB Shell): Perform CRUD Commands (Create, Read, Update, Delete)
- Verify working MongoDB installation.
- MongoDB CRUD operations practice, in
mongoshand Python (pymongo) using this notebook.
- Complete the Pymongo_Startup.ipynb notebook and submit to Canvas.
- Reading Quiz #7 in Canvas (after doing the following readings).
- Pymongo documentation: CRUD Operations
- Pymongo documentation: Query Documents
- Pymongo documentation: Find Documents
- Quiz debrief.
- Pair coding with the PyMongo Query Basics notebook.
- No reading quiz, but read the following and have it handy in class.
- More pair coding with the PyMongo Query Basics notebook.
- Complete the Pymongo_Query_Basics.ipynb notebook and submit to Canvas.
- Reading Quiz #8 in Canvas (after doing the following readings).
NOTE: There is more information in the reading this weekend, especially towards the end of the chapter, than we will interact with in our notebooks and project. Let the quiz guide you regarding what is most important, and we'll reinforce that with our notebook work in class this week.
- MongoDB, The Definitive Guide: Ch. 7, Introduction to the Aggregation Framework (in Canvas).
- Aggregation
- Pymongo_Query_Basics.ipynb debrief.
- Reading quiz debrief.
- Introduction to aggregation framework and pipeline stages.
- Pair coding: Pymongo_Aggregation.ipynb.
- No required reading or notebook, but review any concepts from the past couple readings that you found challenging in our notebook work on Tuesday.
- More pair coding: Pymongo_Aggregation.ipynb.
- Complete the Pymongo_Aggregation.ipynb notebook and submit to Canvas.
- Reading Quiz #9 in Canvas (after doing the following readings).
- MongoDB, The Definitive Guide: Ch. 4, Indexes (in Canvas).
- PyMongo documentation: Indexes.
- Introducing the MongoDB midterm project.
- Pymongo_Aggregation.ipynb debrief.
- Indexes and text searches in Pymongo_indexing_and_text_search.ipynb.
- No daily assignment, just get started on the midterm project.
- Indexes and text searches in Pymongo_indexing_and_text_search.ipynb.
- Complete Pymongo_indexing_and_text_search.ipynb and submit to Canvas.
- Reading Quiz #10 (after doing the following readings).
- Continue work on your midterm project.
- Vector Databases (early release edition): Ch. 1, Introduction to Vector Databases (in Canvas).
- Vector Databases (early release edition): Ch. 2, Embeddings (in Canvas).
- Intro to vector databases and vector embeddings with VectorDBs_Sentence_Embeddings.ipynb.
- Continue work on your midterm project.
- More on vector databases and vector embeddings with VectorDBs_Sentence_Embeddings.ipynb and VectorDBs_Sentence_Embeddings_pt2.ipynb.
- Midterm project due in Canvas.
- Introduction to Ollama (local AI chatbot) with Ollama test.ipynb.
- Connecting Ollama to your vector database.
- Make sure you have all installation issues worked out and can run all cells in the VectorDB notebooks so far. (Exception: if you have issues with ChromaDB, you can ignore/slip those.)
- FCQs
- Introducing the final project (local RAG app — an AI chatbot based on a custom vector database).
- Sentence tokenization (splitting documents into sentences).
- Final project workshop time.
- Begin work on your final project.
- Submit VectorDBs_Sentence_Embeddings.ipynb, VectorDBs_Sentence_Embeddings_pt2.ipynb, and Ollama test.ipynb to Canvas. These are easy points, as the code was already provided to you, but you will receive credit only if you show that all packages and external tools are installed and working properly.
- Hybrid queries.
- Final project workshop time.
- Continue work on your final project.
- Final project workshop time.
Submit your final project to Canvas by 4pm.