binhducvu/MAST30034-Project-1

MAST30034 Project 1 - Quantitative Analysis

  • Student Name: Brian Vu
  • Student ID: 1053531
  • Due Date: Friday 13th of August 11:59:00 am (AEST).

Dependencies

  • Language: Python 3.8.3
  • Packages / Libraries: pandas, pyspark, geopandas, numpy, folium, rtree, pygeos, statsmodels, sklearn
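Before running the notebooks, it may help to confirm that the listed packages are importable. The following is a minimal sketch, not part of the project itself: the helper name `missing_packages` is hypothetical, and the import names are assumed to match the package names listed above (for example, `sklearn` is the import name of scikit-learn).

```python
import importlib.util
import sys

# Import names copied from the dependency list above (assumed to be
# the names used in `import` statements).
REQUIRED = [
    "pandas", "pyspark", "geopandas", "numpy", "folium",
    "rtree", "pygeos", "statsmodels", "sklearn",
]


def missing_packages(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The project lists Python 3.8.3; print the running version for comparison.
    print("Python", sys.version.split()[0])
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

This only checks that each module can be located; it does not verify package versions, which the list above does not specify.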

Datasets

Directory

Change this to fit your needs when you have started the project.

  • raw_data: Contains the raw data files that are too large to upload to git. This includes only the taxi datasets.
  • raw_data_lite: Contains all other raw data that can be uploaded to git. This includes the shapefiles for the census tracts and taxi zones, as well as the census dataset that was used.
  • preprocessed_data: Contains all the preprocessed data files.
  • plots: Contains all plots and figures.
  • code: Contains the notebooks, which should be run in this order:

1. Extracting and serializing data. Some data was uploaded manually, but download links are provided.

2. Preprocessing census data.

3. Preprocessing taxi data 1.

4. Preprocessing taxi data 2.

5. Merging and visualizing.

6. Statistical modelling.

  • deprecated: Contains discarded plots and some code (not much, since most of it was deleted).
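The layout above can be checked with a short script before running the notebooks. This is a sketch under assumptions: the helper name `missing_dirs` is hypothetical, and the directories are assumed to sit directly under the repository root.

```python
from pathlib import Path

# Directory names taken from the list above.
EXPECTED_DIRS = [
    "raw_data", "raw_data_lite", "preprocessed_data",
    "plots", "code", "deprecated",
]


def missing_dirs(root, names=EXPECTED_DIRS):
    """Return the expected directory names that do not exist under root."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name in missing_dirs("."):
        print("Missing directory:", name)
```

Since `raw_data` holds files too large for git, it will likely be reported as missing on a fresh clone until the taxi datasets have been downloaded.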

About

Applied Data Science Project 1 on Quantitative Analysis of New York Taxi Data
