slitvinov/hzconvert

HouseZero one-year dataset build pipeline.

Source: figshare https://doi.org/10.6084/m9.figshare.30260233

Run:

    ./bootstrap.sh

Downloads per-table CSVs from figshare, merges them into one wide
data.csv with renamed canonical headers, and prints a summary.
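The merge step can be pictured roughly as follows. This is a minimal sketch, not the contents of build.py. It assumes every per-table CSV shares a timestamp column (called `time` here, a hypothetical name), that tables are outer-joined on it, and that the rename map is a plain dict. The table names, column names, and values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical per-table CSVs; the real ones are downloaded from figshare.
TABLES = {
    "hvac": "time,T1\n2024-01-01 00:00,20.5\n2024-01-01 00:01,20.6\n",
    "weather": "time,T1\n2024-01-01 00:01,3.2\n2024-01-01 00:02,3.1\n",
}

# Hypothetical (source_table, original) -> canonical mapping.
RENAME = {
    ("hvac", "T1"): "hvac_supply_temp",
    ("weather", "T1"): "outdoor_temp",
}


def merge_tables(tables, rename, key="time"):
    """Rename each table's columns, then outer-join all tables on `key`."""
    frames = []
    for name, text in tables.items():
        df = pd.read_csv(io.StringIO(text), index_col=key)
        # A column missing from the map raises KeyError, which is
        # deliberate here: every source column needs a canonical name.
        df = df.rename(columns={c: rename[(name, c)] for c in df.columns})
        frames.append(df)
    # axis=1 places the tables side by side; rows are aligned on the index,
    # and timestamps missing from a table become NaN in its columns.
    return pd.concat(frames, axis=1, join="outer").sort_index()


wide = merge_tables(TABLES, RENAME)
```

Renaming before joining matters in this sketch because both source tables use the same original header (`T1`); the canonical names keep the wide table's columns unique.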

Output: data.csv
Expected md5: 09d05a7aa9fb67ba8c86a179e7306006
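
One way to check the output against the expected checksum is sketched below. It uses only the Python standard library. The helper names `md5_of_file` and `matches_expected` are hypothetical and not part of this repository.

```python
import hashlib

EXPECTED_MD5 = "09d05a7aa9fb67ba8c86a179e7306006"  # value quoted in this README


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected
```

Calling `matches_expected("data.csv")` after a build would report whether the file matches. Reading in chunks keeps memory use flat regardless of the file's size.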

Files:
    build.py       merge + rename
    roundtrip.py   reconstruct per-table CSVs from data.csv and diff
    summary.py     print rows / columns / span / group counts
    rename.tsv     source_table  original  canonical
    rubric.tsv     canonical  unit  short_desc  long_desc
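
The rename table can be read into a lookup along these lines. This is a sketch under assumptions: it treats rename.tsv as tab-separated with exactly the three columns listed above and no header row, and the sample rows are invented. The inverse map illustrates what a round trip back to per-table headers would need; it is not taken from roundtrip.py.

```python
import csv
import io

# Hypothetical rename.tsv content: source_table <TAB> original <TAB> canonical
SAMPLE = "hvac\tT1\thvac_supply_temp\nweather\tT1\toutdoor_temp\n"


def load_rename(handle):
    """Map (source_table, original) -> canonical."""
    forward = {}
    for source_table, original, canonical in csv.reader(handle, delimiter="\t"):
        forward[(source_table, original)] = canonical
    return forward


def invert(forward):
    """Map canonical -> (source_table, original).

    Raises ValueError if two source columns share a canonical name,
    since such a mapping could not be reversed unambiguously.
    """
    inverse = {}
    for key, canonical in forward.items():
        if canonical in inverse:
            raise ValueError(f"duplicate canonical name: {canonical}")
        inverse[canonical] = key
    return inverse


forward = load_rename(io.StringIO(SAMPLE))
inverse = invert(forward)
```

With a real file, `load_rename(open("rename.tsv", newline=""))` would take the place of the in-memory sample, assuming the layout described above.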

Requirements: python3, pandas, curl.

Build pipeline for the HouseZero one-year minute dataset (figshare CSVs -> data.csv)
