VASTKnowledgeGraphVisualization/data/README.md at main · sclfnc/VASTKnowledgeGraphVisualization

Data Download Script

This folder includes download_github_zip.py, a helper script to:

Download one or more dataset links
Convert GitHub blob URLs to direct download URLs
Extract .zip files into this data/ directory
Delete the downloaded .zip file after successful extraction
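The blob-to-direct-URL conversion in the list above can be sketched as follows. This is a minimal illustration, not the code in download_github_zip.py: the function name `blob_to_raw` is hypothetical, and mapping to `raw.githubusercontent.com` is one common approach that the actual script may or may not use.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw(url: str) -> str:
    """Rewrite a github.com 'blob' URL to a raw.githubusercontent.com URL.

    Hypothetical helper for illustration; URLs that do not look like
    GitHub blob links are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: <owner>/<repo>/blob/<ref>/<file path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url
    owner, repo, _blob, ref, *file_path = segments
    raw_path = "/" + "/".join([owner, repo, ref, *file_path])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))


if __name__ == "__main__":
    print(blob_to_raw(
        "https://github.com/vast-challenge/2025-data/blob/main/MC1_release.zip"
    ))
```

Note that this sketch assumes the branch or tag name contains no slash; a ref such as `feature/x` cannot be told apart from a directory in the URL path alone.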

Prerequisite

Python 3

Usage

From the project root (VASTKnowledgeGraphVisualization), run:

python3 data/download_github_zip.py <url1> [url2 ...]

Example (single link)

python3 data/download_github_zip.py \
  https://github.com/vast-challenge/2025-data/blob/main/MC1_release.zip

Example (multiple links)

python3 data/download_github_zip.py \
  https://github.com/vast-challenge/2025-data/blob/main/MC1_release.zip \
  https://github.com/vast-challenge/2025-data/blob/main/DC_release.zip
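For each link, the script extracts the archive and deletes it only after extraction succeeds. A minimal sketch of that extract-then-delete step using the standard library is shown here; the function name `extract_and_remove` is hypothetical, and the real script may order or guard these steps differently.

```python
import zipfile
from pathlib import Path


def extract_and_remove(zip_path: Path, output_dir: Path) -> list:
    """Extract zip_path into output_dir, then delete the archive.

    Returns the member names of the archive. If extraction raises,
    the archive is left on disk so the download is not lost.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(output_dir)
    # Only reached when extractall() finished without raising.
    zip_path.unlink()
    return names
```

Deleting the archive after the `with` block, rather than inside it, ensures the file handle is closed first, which matters on platforms that refuse to delete open files.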

Optional flags

--dry-run: show what will be downloaded without downloading
--output-dir <path>: save and extract into another directory

Example:

python3 data/download_github_zip.py \
  --dry-run \
  https://github.com/vast-challenge/2025-data/blob/main/MC1_release.zip
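A command line with these flags could be declared with `argparse` roughly as below. This is a sketch under assumptions: the function names `build_parser` and `describe` are hypothetical, and the default of `data` (relative to the project root) is inferred from the description above rather than taken from the script.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Download dataset archives and extract them."
    )
    parser.add_argument("urls", nargs="+", metavar="url",
                        help="one or more dataset links")
    parser.add_argument("--dry-run", action="store_true",
                        help="show what will be downloaded without downloading")
    # Assumption: the default target is data/ relative to the project root.
    parser.add_argument("--output-dir", type=Path, default=Path("data"),
                        help="save and extract into another directory")
    return parser


def describe(args: argparse.Namespace) -> list:
    """Return one human-readable line per URL, honoring --dry-run."""
    verb = "would download" if args.dry_run else "downloading"
    return [f"{verb} {url} -> {args.output_dir}" for url in args.urls]


if __name__ == "__main__":
    for line in describe(build_parser().parse_args()):
        print(line)
```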

Dataset Creation Script

create_datasets.py creates two additional knowledge graphs that can be used to try out the application:

genre_influence.json: relationships between songs and albums from the VAST 2025 MC1 dataset, organized by musical genre.
asoiaf_interaction.json: an undirected interaction graph of characters from George R. R. Martin's A Song of Ice and Fire series, based on data from https://github.com/mathbeveridge/asoiaf by Andrew Beveridge, released under CC BY-NC-SA 4.0.
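To illustrate what "undirected interaction graph" means for the stored data, the sketch below aggregates character pairs so that (A, B) and (B, A) count as the same edge. It uses only the standard library and a made-up node-link layout; the actual script relies on NetworkX, and the real JSON schema of asoiaf_interaction.json is not documented here, so the function name `build_interaction_graph` and every key name are assumptions.

```python
import json
from collections import Counter


def build_interaction_graph(interactions):
    """Aggregate (a, b) pairs into an undirected, weighted node-link dict.

    Hypothetical layout: the keys 'directed', 'nodes', 'links', 'source',
    'target' and 'weight' are illustrative, not the project's schema.
    """
    weights = Counter()
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        # Sorting the pair makes (a, b) and (b, a) the same edge.
        weights[tuple(sorted((a, b)))] += 1
    nodes = sorted({name for pair in weights for name in pair})
    return {
        "directed": False,
        "nodes": [{"id": name} for name in nodes],
        "links": [
            {"source": s, "target": t, "weight": w}
            for (s, t), w in sorted(weights.items())
        ],
    }


if __name__ == "__main__":
    # Placeholder names, not real dataset entries.
    sample = [("B", "A"), ("A", "B"), ("A", "C")]
    print(json.dumps(build_interaction_graph(sample), indent=2))
```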

Prerequisites

Python 3
NetworkX

We recommend running ./dev.sh the first time to create the Python environment, and then activating that environment. From the project root (VASTKnowledgeGraphVisualization), run:

./dev.sh
source api/venv/bin/activate

Usage

From the project root (VASTKnowledgeGraphVisualization), run:

python3 data/create_datasets.py

You will find the datasets in the data directory.

Optional flags

--output-dir <path>: save the datasets into another directory.

Team

Contributions to this folder, from the git history:

Salvo Rinzivillo — the download script and this guide.
Giulia Fabiani — the dataset creation script and this guide.
