This repository contains a collection of Jupyter notebooks demonstrating practical use cases of LEI (Legal Entity Identifier) data. Each notebook focuses on a specific topic and provides example code to help data users explore, analyze, and apply LEI data effectively.
All notebooks are maintained and published by the GODIN members.
- Clone or download this repository
- Install the required dependencies:
pip install -r requirements.txt
- Create a new conda environment:
conda create -n gleif-mapping python=3.9
conda activate gleif-mapping
- Install the required packages:
pip install -r requirements.txt
- Upload the entire project folder to Google Drive
- Open the notebook in Google Colab
- Run the first cell and adjust sys.path if necessary
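The sys.path adjustment mentioned in the last step can be sketched as follows. This is a minimal illustration, not the notebooks' actual first cell: the helper name `add_project_to_path` and the Drive path in the comments are assumptions and should be replaced with the location of your uploaded folder.

```python
import sys
from pathlib import Path


def add_project_to_path(project_dir):
    """Put project_dir at the front of sys.path so `import utils` resolves.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    resolved = str(Path(project_dir).expanduser().resolve())
    if resolved not in sys.path:
        # Insert at position 0 so the project folder wins over other copies
        sys.path.insert(0, resolved)
    return resolved


# In Google Colab, Drive is usually mounted before the path is adjusted:
#   from google.colab import drive
#   drive.mount("/content/drive")
#   add_project_to_path("/content/drive/MyDrive/leibooks")  # assumed location
```

Calling the helper twice with the same folder leaves sys.path unchanged the second time, so re-running the cell is harmless.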
- Start Jupyter Notebook:
jupyter notebook
- Open the .ipynb file
- Run the cells in order
The notebooks support flexible configuration through flags in the second cell:
- USE_FULL_DATASET = True: Download and use all available columns
- USE_FULL_DATASET = False: Use only essential columns (faster, less memory)
- SAVE_TO_DISK = True: Save files to disk
- SAVE_TO_DISK = False: Keep data in memory only
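One way such flags could translate into loading decisions is sketched below. It is an assumption-laden illustration rather than the code in the utils package: the function `build_load_plan` and the `ESSENTIAL_COLUMNS` list are hypothetical, and the column names are only examples of what an essential subset might contain.

```python
from pathlib import Path

# Illustrative subset only; the notebooks define their own essential columns
ESSENTIAL_COLUMNS = ["LEI", "Entity.LegalName", "Entity.LegalAddress.Country"]


def build_load_plan(use_full_dataset, save_to_disk, download_dir="gc_downloads"):
    """Map the two flags to a column selection and an optional target folder.

    Hypothetical helper. `columns=None` follows the pandas.read_csv convention
    in which usecols=None means "read every column".
    """
    return {
        "columns": None if use_full_dataset else list(ESSENTIAL_COLUMNS),
        "target_dir": Path(download_dir) if save_to_disk else None,
    }


# Quick analysis: essential columns only, nothing written to disk
plan = build_load_plan(use_full_dataset=False, save_to_disk=False)
```

The resulting `plan["columns"]` could be passed as `usecols` to `pandas.read_csv`, and `plan["target_dir"]` decides whether a download is persisted.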
# Quick analysis with minimal resources
USE_FULL_DATASET = False
SAVE_TO_DISK = False
# Comprehensive analysis with persistence
USE_FULL_DATASET = True
SAVE_TO_DISK = True
# Balanced approach
USE_FULL_DATASET = False
SAVE_TO_DISK = True
leibooks/
├── LegalEntityEvents.ipynb # LegalEntityEvents notebook
├── MappingExercise.ipynb # Mapping notebook
├── ISO20022.ipynb # Name and Address Transformation to ISO 20022 notebook
├── requirements.txt # Python dependencies
├── README.md # This documentation file
├── LICENSE.md # License information
├── .gitignore # Git ignore patterns
├── utils/ # Utility modules package
│ ├── __init__.py # Package initialization and environment setup
│ ├── download_utils.py # GLEIF Golden Copy download utilities
│ ├── gleif_api_utils.py # GLEIF JSON:API client
│ ├── visualization_utils.py # Data visualization utilities
│ ├── codelist_utils.py # Registration Authority code list utilities
│ ├── column_names_utils.py # Golden Copy column names utilities
│ └── textxml_utils.py # Text and XML utilities
├── cache/ # Cached data files (auto-created)
├── gc_downloads/ # Golden Copy downloads (if SAVE_TO_DISK=True)
├── downloads/ # Mapping file downloads (if SAVE_TO_DISK=True)
└── lib/
- pandas (≥1.5.0): Data manipulation and analysis
- numpy (≥1.21.0): Numerical computing
- requests (≥2.28.0): HTTP library for web requests
- beautifulsoup4 (≥4.11.0): HTML parsing for web scraping
- matplotlib (≥3.5.0): Data visualization
- jupyter: Jupyter notebook support (for local development)
- google-colab: Google Colab integration (automatic in Colab)
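The minimum versions listed above can be checked against the current environment with a short script. The sketch below uses only the standard library (`importlib.metadata`); the function names are hypothetical, and the version comparison is deliberately rough, since it looks only at the leading numeric parts and ignores pre-release suffixes.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above
MINIMUM_VERSIONS = {
    "pandas": "1.5.0",
    "numpy": "1.21.0",
    "requests": "2.28.0",
    "beautifulsoup4": "4.11.0",
    "matplotlib": "3.5.0",
}


def version_tuple(version):
    """Return the leading numeric components, padded to three parts."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Report 'ok', 'missing', or 'too old (...)' for each package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report
```

Running `check_dependencies()` in the notebook environment returns a dictionary keyed by package name, which makes missing or outdated packages easy to spot before the first notebook cell is run.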
- Import Errors in Google Colab:
  - Make sure to upload the entire project folder to Google Drive
  - Run the first cell to set up the environment
  - Ensure that the utils folder is included in your upload
- Memory Issues with Large Datasets:
  - Set USE_FULL_DATASET = False to use only essential columns
  - Set SAVE_TO_DISK = True to avoid keeping large datasets in memory
  - Use the time-based download feature to get smaller datasets
- Time-based Download Issues:
  - The Golden Copy is published three times a day (UTC): 00:00, 08:00, 16:00. If a Golden Copy file is not available, choose an earlier publication.
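Choosing an earlier publication amounts to stepping back through the 00:00, 08:00, and 16:00 UTC slots. The sketch below shows that arithmetic with the standard library; the function names are hypothetical and it does not reflect how download_utils.py selects files. It also assumes timezone-aware datetimes, and that a file may not yet be available exactly at the nominal slot time.

```python
from datetime import datetime, timedelta, timezone

# Publication hours stated above (UTC)
PUBLICATION_HOURS = (0, 8, 16)


def latest_publication(now):
    """Return the most recent publication slot at or before `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    hour = max(h for h in PUBLICATION_HOURS if h <= now.hour)
    return now.replace(hour=hour, minute=0, second=0, microsecond=0)


def previous_publication(slot):
    """Step back one slot; slots are evenly spaced 8 hours apart."""
    return slot - timedelta(hours=8)


# Example: at 09:30 UTC the latest slot is 08:00; the fallback is 00:00
slot = latest_publication(datetime(2024, 3, 5, 9, 30, tzinfo=timezone.utc))
fallback = previous_publication(slot)
```

If the file for `slot` is not available, retrying with `fallback` (and stepping back again if needed) crosses midnight correctly, because the subtraction operates on full datetimes.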
If you encounter issues:
- Check that all dependencies are installed correctly
- Verify your internet connection
- Try different configuration options (memory vs disk, full vs subset)
- Check the GLEIF website for any service outages
- Contact godin@gleif.org
CC0 1.0 Universal – No rights reserved.
See the LICENSE file or https://creativecommons.org/publicdomain/zero/1.0/.