A practical collection of beginner-to-intermediate Python data analysis exercises using pandas, NumPy, and matplotlib.
This repository is designed to show consistent practice in data analysis, clean coding, and reproducible notebooks.
- Practice real data analysis workflows step by step
- Improve pandas, NumPy, and matplotlib skills
- Build a public GitHub portfolio with clean commits
- Create small, readable examples that can be reviewed by teachers, employers, or students
python-data-analysis-exercises/
├── data/ # Small sample datasets
├── exercises/ # Exercise files with TODO tasks
├── solutions/ # Completed reference solutions
├── notebooks/ # Jupyter notebooks for exploration
├── docs/ # Extra notes and learning guides
├── requirements.txt # Python dependencies
└── README.md # Project documentation
Create and activate a virtual environment:
python -m venv .venv

On Windows:
.venv\Scripts\activate

On macOS/Linux:
source .venv/bin/activate

Install dependencies:
pip install -r requirements.txt

| No. | Topic | File |
|---|---|---|
| 01 | Pandas basics | exercises/01_pandas_basics.py |
| 02 | Missing values and cleaning | exercises/02_cleaning_missing_values.py |
| 03 | GroupBy and aggregation | exercises/03_groupby_analysis.py |
| 04 | Basic visualization | exercises/04_visualization.py |
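
As a rough illustration of what exercises 01 to 03 in the table practice, here is a minimal sketch on a small in-memory DataFrame. The column names (`region`, `units`, `unit_price`) and the values are hypothetical and are not taken from the files in `data/`; filling with the median is just one possible cleaning strategy, not necessarily the one used in `solutions/`.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records (assumption: not the repository's real data).
sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "East", "South", "East"],
        "units": [10, 4, np.nan, 7, 3, np.nan],
        "unit_price": [2.5, 4.0, 2.5, 3.0, 4.0, 3.0],
    }
)

# 01 Pandas basics: inspect the shape and count missing values per column.
n_rows, n_cols = sales.shape
missing_per_column = sales.isna().sum()

# 02 Missing values: fill missing unit counts with the column median.
median_units = sales["units"].median()
cleaned = sales.assign(units=sales["units"].fillna(median_units))

# 03 GroupBy: compute revenue per row, then total revenue per region.
cleaned["revenue"] = cleaned["units"] * cleaned["unit_price"]
revenue_by_region = cleaned.groupby("region")["revenue"].sum()

print(missing_per_column)
print(revenue_by_region)
```

For exercise 04, a result like `revenue_by_region` could be passed to `revenue_by_region.plot(kind="bar")`, which draws a bar chart through matplotlib.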
Example:
python exercises/01_pandas_basics.py

Or open the notebook:
jupyter notebook notebooks/01_sales_analysis.ipynb

- Day 1: Complete pandas basics
- Day 2: Complete missing value cleaning
- Day 3: Complete groupby analysis
- Day 4: Complete visualization
- Day 5: Add one new dataset and one new exercise
- Day 6: Improve README and add screenshots
- Day 7: Refactor code and commit improvements
- Python programming
- Data loading and inspection
- Data cleaning
- Exploratory data analysis
- GroupBy operations
- Basic visualization
- Git and GitHub workflow
This project is licensed under the MIT License.