LA3D/fastkg

fastkg

This library provides optimized storage solutions for RDFLib graphs, focusing on:

  1. Parquet storage for efficient columnar compression
  2. SQLite storage for portable, indexed graph databases

Developer Guide

If you are new to nbdev, here are some useful pointers to get you started.

Install fastkg in development mode

# make sure the fastkg package is installed in development mode
$ pip install -e .

# make changes under the nbs/ directory
# ...

# compile so the changes apply to fastkg
$ nbdev_prepare

Usage

Installation

Install latest from the GitHub repository:

$ pip install git+https://github.com/la3d/fastkg.git

Quick Start

Using Parquet as a fast store

from fastkg.core import KnowledgeGraph
import rdflib

# Create a knowledge graph
kg = KnowledgeGraph()

# Add some triples
ex = rdflib.Namespace("http://example.org/")
kg.bind_ns("ex", ex)
kg.add((ex.John, rdflib.RDF.type, ex.Person))
kg.add((ex.John, ex.name, rdflib.Literal("John Doe")))
kg.add((ex.John, ex.knows, ex.Jane))

print(f"Created graph with {len(kg)} triples")

# Save to Parquet file
kg.save_parquet("example.parquet")
print("Saved graph to Parquet file")

# Load from Parquet file
kg2 = KnowledgeGraph().load_parquet("example.parquet")
print(f"Loaded {len(kg2)} triples from Parquet file")

# Query the graph
results = list(kg2.query("""
    SELECT ?name WHERE {
        ?person a <http://example.org/Person> .
        ?person <http://example.org/name> ?name .
    }
"""))

for row in results:
    print(f"Found person: {row[0]}")

Created graph with 3 triples
Saved graph to Parquet file
Loaded 3 triples from Parquet file
Found person: "John Doe"

Using SQLite as a simple triple store

from fastkg.core import KnowledgeGraph
from fastkg.sqlite import *
import rdflib

# Create a knowledge graph and connect to SQLite
kg = KnowledgeGraph()
kg.connect_sqlite("example.db", create=True)

# Add some triples directly to the SQLite-backed graph
ex = rdflib.Namespace("http://example.org/")
kg.bind_ns("ex", ex)
kg.add((ex.John, rdflib.RDF.type, ex.Person))
kg.add((ex.John, ex.name, rdflib.Literal("John Doe")))
kg.add((ex.John, ex.knows, ex.Jane))

print(f"Added {len(kg)} triples to the database")

# Close the connection when done
kg.close()

# Load from SQLite
kg2 = KnowledgeGraph()
kg2.connect_sqlite("example.db", create=False)

print(f"Loaded {len(kg2)} triples from the database")

# Query the graph
results = list(kg2.query("""
    SELECT ?name WHERE {
        ?person a <http://example.org/Person> .
        ?person <http://example.org/name> ?name .
    }
"""))

for row in results:
    print(f"Found person: {row[0]}")

# Don't forget to close the connection
kg2.close()

Added 3 triples to the database
Loaded 3 triples from the database
Found person: John Doe

Use Cases for RAG Systems

This library is particularly useful for LLM-based Retrieval Augmented Generation systems:

  • Agent Memory: Store structured knowledge that persists between sessions
  • Knowledge Graphs: Maintain entity relationships for complex reasoning
  • Efficient Retrieval: Query relevant subgraphs to include in LLM context windows

Core Features

The library includes:

  1. KnowledgeGraph class - A wrapper around RDFLib’s Graph with additional storage capabilities
  2. Parquet storage - Fast columnar storage for large graphs
  3. SQLite storage - Indexed, portable database storage
  4. Helper methods for common graph operations
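
To make "indexed, portable database storage" concrete, here is a minimal sketch of a triple table in SQLite using only Python's standard `sqlite3` module. The schema, the index names, and the `match` helper are assumptions made for illustration. They are not fastkg's actual storage layout, which this README does not describe.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; a file path works the same way.
conn = sqlite3.connect(":memory:")

# One row per triple. The UNIQUE constraint doubles as an (s, p, o) index.
conn.execute(
    "CREATE TABLE triples ("
    "s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL, UNIQUE (s, p, o))"
)
# Extra indexes so lookups by predicate or by object avoid a full table scan.
conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")

rows = [
    ("http://example.org/John", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://example.org/Person"),
    ("http://example.org/John", "http://example.org/name", "John Doe"),
    ("http://example.org/John", "http://example.org/knows", "http://example.org/Jane"),
    # Duplicate of the previous row; INSERT OR IGNORE drops it.
    ("http://example.org/John", "http://example.org/knows", "http://example.org/Jane"),
]
conn.executemany("INSERT OR IGNORE INTO triples VALUES (?, ?, ?)", rows)


def match(conn, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    clauses, params = [], []
    for column, value in (("s", s), ("p", p), ("o", o)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = f"SELECT s, p, o FROM triples{where} ORDER BY s, p, o"
    return conn.execute(sql, params).fetchall()


names = match(conn, p="http://example.org/name")
all_rows = match(conn)
conn.close()

print(names)
```

This sketch stores every term as plain text, so it cannot tell an IRI from a literal. A real store would also need to record each term's kind, datatype, and language tag.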

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Documentation is hosted on this repository’s GitHub Pages site. Package-manager-specific guidelines are available on conda and PyPI.

About

Utilities for working with RDFLib-based knowledge graphs using the fast.ai approach
