johan-Rm/schemaorg2yaml

SchemaORG2YAML

A CLI tool to generate YAML schemas from schema.org vocabulary.

Overview

SchemaORG2YAML downloads the schema.org vocabulary and generates normalized YAML files that describe schema.org types with their properties, inheritance, and metadata. It is aimed at developers who need readable, versionable schemas aligned with schema.org for data modeling.

Features

  • 🔄 Automatic vocabulary download from schema.org (JSON-LD/TTL/RDF)
  • 🧬 Inheritance resolution (rdfs:subClassOf) with property merging
  • 📄 YAML export for individual types or type lists
  • 🏷️ Alias mapping support for local naming conventions
  • 🔍 Type discovery with filtering capabilities
  • ⚡ CLI ergonomics with multiple subcommands
  • 🧪 Comprehensive tests and CI/CD
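
To make the inheritance feature concrete, here is a minimal Python sketch of resolving rdfs:subClassOf chains and merging properties. It is not the tool's actual implementation: the `PARENTS` and `OWN_PROPERTIES` tables and both function names are hypothetical, and the data is a tiny hand-written sample rather than the real vocabulary.

```python
from typing import Dict, List

# Hypothetical in-memory vocabulary: direct parents and directly declared properties.
PARENTS: Dict[str, List[str]] = {
    "Thing": [],
    "CreativeWork": ["Thing"],
    "Article": ["CreativeWork"],
}
OWN_PROPERTIES: Dict[str, List[str]] = {
    "Thing": ["name", "url"],
    "CreativeWork": ["author", "headline"],
    "Article": ["articleBody"],
}


def ancestors(type_name: str) -> List[str]:
    """Return all ancestors of a type, nearest first, without duplicates."""
    seen: List[str] = []
    queue = list(PARENTS.get(type_name, []))
    while queue:
        parent = queue.pop(0)
        if parent not in seen:
            seen.append(parent)
            queue.extend(PARENTS.get(parent, []))
    return seen


def merged_properties(type_name: str) -> List[str]:
    """Merge a type's own properties with those inherited from every ancestor."""
    names = set(OWN_PROPERTIES.get(type_name, []))
    for ancestor in ancestors(type_name):
        names.update(OWN_PROPERTIES.get(ancestor, []))
    return sorted(names)


print(ancestors("Article"))
print(merged_properties("Article"))
```

The breadth-first walk also copes with types that have several parents, which schema.org allows; the `seen` list keeps a shared ancestor from being visited twice.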

Installation

Using pipx (recommended)

pipx install schemaorg2yaml

Using pip

pip install schemaorg2yaml

From source

git clone https://github.com/johan-Rm/schemaorg2yaml.git
cd schemaorg2yaml
pip install -e .

Quick Start

  1. Fetch the schema.org vocabulary:

    schemaorg2yaml fetch-vocab
  2. List available types:

    schemaorg2yaml list-types --filter Person
  3. Generate YAML schemas:

    schemaorg2yaml generate --types Person,Article --out ./schema
  4. Preview a schema:

    schemaorg2yaml show --type WebPage

Usage Examples

Fetch Vocabulary

# Download latest schema.org vocabulary
schemaorg2yaml fetch-vocab

# Use custom source
schemaorg2yaml fetch-vocab --source https://schema.org/version/latest/schemaorg-current-https.jsonld --format jsonld

# Force refresh cache
schemaorg2yaml fetch-vocab --force
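
For orientation, the JSON-LD vocabulary file is a single document whose `@graph` array holds one node per class or property. The sketch below shows one way such a file could be indexed once downloaded. It illustrates the data shape only and says nothing about how fetch-vocab works internally; `as_list`, `index_classes`, and the `SAMPLE` fragment are hypothetical names introduced for this example.

```python
import json
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """JSON-LD values may be a single object or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def index_classes(vocab: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each rdfs:Class @id to the @ids of its direct rdfs:subClassOf parents."""
    classes: Dict[str, List[str]] = {}
    for node in vocab.get("@graph", []):
        if "rdfs:Class" in as_list(node.get("@type")):
            parents = [p["@id"] for p in as_list(node.get("rdfs:subClassOf"))]
            classes[node["@id"]] = parents
    return classes


# A tiny hand-written fragment in the shape of the schema.org JSON-LD file.
SAMPLE = json.loads("""
{
  "@graph": [
    {"@id": "schema:Thing", "@type": "rdfs:Class"},
    {"@id": "schema:Person", "@type": "rdfs:Class",
     "rdfs:subClassOf": {"@id": "schema:Thing"}},
    {"@id": "schema:birthDate", "@type": "rdf:Property"}
  ]
}
""")

print(index_classes(SAMPLE))
```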

Generate Schemas

# Generate specific types
schemaorg2yaml generate --types Person,WebPage --out ./schema

# Generate from file list
schemaorg2yaml generate --file types.txt --out ./schema

# Use aliases mapping
schemaorg2yaml generate --types Person --out ./schema --aliases ./aliases.yaml

# Include ancestor metadata
schemaorg2yaml generate --types Article --out ./schema --include-ancestors

# Minimal output (no descriptions)
schemaorg2yaml generate --types Person --out ./schema --no-descriptions

Explore Types

# List all available types
schemaorg2yaml list-types

# Filter types by pattern
schemaorg2yaml list-types --filter ".*Page.*"

# Show schema in console
schemaorg2yaml show --type Article
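
The ".*Page.*" example suggests that --filter takes a regular expression. As a rough illustration of that kind of filtering, the sketch below keeps every name in which the pattern matches anywhere. Whether the tool itself uses search or full-match semantics is an assumption here, and `filter_types` is a hypothetical helper, not part of the package.

```python
import re
from typing import Iterable, List


def filter_types(type_names: Iterable[str], pattern: str) -> List[str]:
    """Keep the type names in which the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return sorted(name for name in type_names if regex.search(name))


names = ["WebPage", "Person", "AboutPage", "Article"]
print(filter_types(names, ".*Page.*"))
print(filter_types(names, "Person"))
```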

YAML Schema Format

Each generated YAML file follows this stable structure:

$schema: "https://example.com/specs/schema-yaml/v1"
id: "schema:Person"
label: "Person"
description: "A person (alive, dead, undead, or fictional)."
parents: ["Thing"]
seeAlso:
  - "https://schema.org/Person"
properties:
  birthDate:
    id: "schema:birthDate"
    label: "birthDate"
    description: "Date of birth."
    range: ["Date", "DateTime", "Text"]
    required: false
    repeated: false
  givenName:
    id: "schema:givenName"
    label: "givenName"
    description: "Given name."
    range: ["Text"]
    required: false
    repeated: false
    aliases: ["first_name"]

Key Features:

  • Properties sorted alphabetically
  • Inheritance chain recorded in the parents field
  • Type ranges preserved from schema.org
  • Cardinality information (required, repeated)
  • Local aliases for property mapping
  • Stable format suited to version control
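
Because the layout is stable, a consumer can check a generated file against it. The following Python sketch works on a document that has already been parsed into a dict (for example with a YAML library) and reports deviations from the structure shown above. The key lists are read off the example rather than taken from a formal specification, plain codepoint ordering is assumed for "alphabetical", and `structure_problems` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Keys taken from the example above; descriptions are left out because
# they can be omitted with --no-descriptions.
TOP_LEVEL_KEYS = ("$schema", "id", "label", "parents", "properties")
PROPERTY_KEYS = ("id", "label", "range", "required", "repeated")


def structure_problems(doc: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the layout matches."""
    problems: List[str] = []
    for key in TOP_LEVEL_KEYS:
        if key not in doc:
            problems.append(f"missing top-level key: {key}")
    properties = doc.get("properties", {})
    if list(properties) != sorted(properties):
        problems.append("properties are not sorted alphabetically")
    for name, spec in properties.items():
        for key in PROPERTY_KEYS:
            if key not in spec:
                problems.append(f"{name}: missing key {key}")
    return problems


person = {
    "$schema": "https://example.com/specs/schema-yaml/v1",
    "id": "schema:Person",
    "label": "Person",
    "parents": ["Thing"],
    "properties": {
        "birthDate": {"id": "schema:birthDate", "label": "birthDate",
                      "range": ["Date"], "required": False, "repeated": False},
        "givenName": {"id": "schema:givenName", "label": "givenName",
                      "range": ["Text"], "required": False, "repeated": False},
    },
}
print(structure_problems(person))
```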

Aliases Configuration

Create an aliases.yaml file to map schema.org properties to local names:

Person:
  givenName: ["first_name", "prenom"]
  familyName: ["last_name", "nom"]
  birthDate: ["date_of_birth"]

WebPage:
  breadcrumb: ["fil_ariane"]
  mainContentOfPage: ["main_content"]
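
One way downstream code might use such a mapping is to translate records keyed by local names back to schema.org property names. The sketch below assumes the aliases file has already been parsed into a nested dict of the shape shown above; `local_to_schema` and `rename_record` are hypothetical helpers for illustration, not functions shipped with the tool.

```python
from typing import Dict, List

AliasMap = Dict[str, Dict[str, List[str]]]

# Same shape as the aliases.yaml example, already parsed into a dict.
ALIASES: AliasMap = {
    "Person": {
        "givenName": ["first_name", "prenom"],
        "familyName": ["last_name", "nom"],
        "birthDate": ["date_of_birth"],
    },
}


def local_to_schema(aliases: AliasMap, type_name: str) -> Dict[str, str]:
    """Invert one type's alias table: local name -> schema.org property name."""
    lookup: Dict[str, str] = {}
    for prop, local_names in aliases.get(type_name, {}).items():
        for local in local_names:
            lookup[local] = prop
    return lookup


def rename_record(record: Dict[str, object], lookup: Dict[str, str]) -> Dict[str, object]:
    """Rename a record's keys to schema.org names; unknown keys pass through."""
    return {lookup.get(key, key): value for key, value in record.items()}


lookup = local_to_schema(ALIASES, "Person")
print(rename_record({"first_name": "Ada", "nom": "Lovelace", "email": "ada@example.com"}, lookup))
```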

Development

Setup

git clone https://github.com/johan-Rm/schemaorg2yaml.git
cd schemaorg2yaml
pip install -e ".[dev]"

Code Quality

# Format code
black src/ tests/
ruff check src/ tests/ --fix

# Type checking
mypy src/

# Run tests
pytest

Testing

# Run all tests
pytest

# With coverage
pytest --cov=src/schemaorg2yaml --cov-report=html

# Specific test file
pytest tests/test_cli.py -v

CLI Reference

Commands

  • fetch-vocab - Download/update schema.org vocabulary
  • generate - Generate YAML files for specified types
  • list-types - List available schema.org types
  • show - Display YAML for a type in console

Global Options

  • --help - Show help message
  • --version - Show version information

Run schemaorg2yaml COMMAND --help for detailed command options.

Limitations (v1)

  • No runtime data validation
  • No reverse generation (YAML → RDF)
  • No web UI
  • Limited to schema.org vocabulary

Roadmap

  • 🗂️ Versioned vocabulary cache
  • 🎯 Profile-based property filtering (Web, SEO, e-commerce)
  • 📊 JSON export format
  • 📋 Multi-type index generation
  • ⚙️ Minimal vs extended property sets

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Run the test suite
  6. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments
