"raw_data" vs "clean_data" folders in project structure #9

@cbedwards-dfw

https://phac-modelling-hub.github.io/dev-practices/projects/consistency.html

In general, your advice mirrors the systems I've used. However, for scientific projects I find it useful to keep the raw (original) data in one folder and the cleaned data in a separate one. I often have one or more scripts that programmatically modify or clean the data, so it's really helpful to have a clearly identifiable folder for unaltered data files (for me, raw_data/) and a separate one for cleaned data files that are ready for analysis (for me, cleaned_data/).
