RTGL (Relational Task Generation Language) is a Python framework for writing compact predictive queries over relational data, aimed primarily at Relational Deep Learning.
It abstracts away temporal joins and complex aggregations, so queries stay short and expressive.
- ANTLR-based parser
  - Lexer and parser for RTGL syntax.
- Structured parse-tree visitor
  - Converts parsed queries into normalized dictionaries with source positions.
- Semantic validation
  - Schema-aware query validation with error reporting.
- Two converters
  - `SConverter` for static prediction queries.
  - `TConverter` for temporal prediction queries with timestamp windows.
- Dual output mode
  - `execute=False` returns the generated SQL.
  - `execute=True` executes the SQL and returns a `Table` object.

Install RTGL via pip:

```bash
pip install rtgl
```

1. Build your database as a RelBench `Database` object, or use the simplified RTGL version:

```python
# path to classes
from rtgl.base import Database, Table
```

2. Static prediction queries (`SConverter`):

```python
from rtgl.converter import SConverter

# db is the Database object built in step 1
converter = SConverter(db)

rtgl_query = """
PREDICT COUNT_DISTINCT(votes.*
    WHERE votes.votetypeid == 2)
FOR EACH posts.* WHERE posts.PostTypeId == 1
    AND posts.OwnerUserId IS NOT NULL
    AND posts.OwnerUserId != -1;
"""

# SQL only
sql_query = converter.convert(rtgl_query, execute=False)

# execute and get Table(fk, label)
table = converter.convert(rtgl_query, execute=True)
```

3. Temporal prediction queries (`TConverter`):

```python
import pandas as pd
from rtgl.converter import TConverter

timestamps = pd.Series(...)  # define the timestamps for which predictions must be made
converter = TConverter(db, timestamps)

# the prediction timestamps can also be updated later without recreating the converter
converter.set_timestamps(new_timestamps)

rtgl_query = """
PREDICT COUNT_DISTINCT(votes.*
    WHERE votes.votetypeid == 2, 0, 91, DAYS)
FOR EACH posts.* WHERE posts.PostTypeId == 1
    AND posts.OwnerUserId IS NOT NULL
    AND posts.OwnerUserId != -1;
"""

# SQL only
sql_query = converter.convert(rtgl_query, execute=False)

# execute and get Table(fk, timestamp, label)
table = converter.convert(rtgl_query, execute=True)
```

For more comprehensive examples and use cases, check out the `relbench_exp.ipynb` notebook.
You can also check the rtgl-tasks repository for more tasks.

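To make the static example above more concrete, the sketch below shows roughly what that kind of query computes, written by hand as plain SQL and run on an in-memory SQLite database. This is not the SQL that RTGL generates (the README names DuckDB as the execution engine); the toy rows, the `votes.PostId` foreign key, and the choice to give posts without matching votes a label of 0 are all assumptions made for illustration.

```python
import sqlite3

# Toy schema; the votes.PostId foreign key is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (Id INTEGER PRIMARY KEY, PostTypeId INTEGER, OwnerUserId INTEGER);
    CREATE TABLE votes (Id INTEGER PRIMARY KEY, PostId INTEGER, VoteTypeId INTEGER);
    """
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, None), (3, 2, 11), (4, 1, -1), (5, 1, 12)],
)
conn.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 2), (3, 1, 3), (4, 3, 2), (5, 5, 1)],
)

# Hand-written equivalent of the static query, NOT RTGL's generated SQL.
# LEFT JOIN keeps entities with no matching votes (label 0), which is an assumption.
HAND_WRITTEN_SQL = """
SELECT p.Id AS fk, COUNT(DISTINCT v.Id) AS label
FROM posts AS p
LEFT JOIN votes AS v
       ON v.PostId = p.Id AND v.VoteTypeId = 2
WHERE p.PostTypeId = 1
  AND p.OwnerUserId IS NOT NULL
  AND p.OwnerUserId != -1
GROUP BY p.Id
ORDER BY p.Id;
"""

rows = conn.execute(HAND_WRITTEN_SQL).fetchall()
print(rows)  # [(1, 2), (5, 0)]
```

Only posts 1 and 5 pass the entity filter; post 1 has two votes of type 2, and post 5 has none.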
Static queries:

```
PREDICT <aggregation | expression | table.column> [RANK TOP K | CLASSIFY]
FOR EACH <entity_table>.<primary_key>
[WHERE <static_condition | static_nested_expression>];
```

Temporal queries:

```
PREDICT <aggregation | temporal_expression> [RANK TOP K | CLASSIFY]
FOR EACH <entity_table>.<primary_key> [WHERE <static_condition | static_nested_expression>]
[ASSUMING <temporal_condition | temporal_nested_expression>]
[WHERE <temporal_condition | temporal_nested_expression>];
```

| Function | Meaning | Condition-Compatible |
|---|---|---|
| `AVG` | average | |
| `MAX` | maximum | |
| `MIN` | minimum | |
| `SUM` | sum | |
| `COUNT` | non-null count | |
| `COUNT_DISTINCT` | distinct count | |
| `FIRST` | earliest value by time | |
| `LAST` | latest value by time | |
| `LIST_DISTINCT` | list of distinct values | |

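The meanings in the table above can be illustrated with a small pure-Python reference, shown here as a sketch rather than RTGL's implementation. Each aggregation takes a list of `(event_time, value)` pairs. The helper names, the null handling (nulls are skipped except by `FIRST` and `LAST`), and the first-seen ordering of `LIST_DISTINCT` are assumptions; the README does not specify them.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List, Tuple

Row = Tuple[datetime, Any]  # (event time, value); value may be None


def non_null(rows: List[Row]) -> List[Any]:
    """Values with nulls dropped, in input order."""
    return [value for _, value in rows if value is not None]


def by_time(rows: List[Row]) -> List[Any]:
    """Values ordered by event time, nulls kept."""
    return [value for _, value in sorted(rows, key=lambda row: row[0])]


def avg(rows: List[Row]) -> Any:
    values = non_null(rows)
    return sum(values) / len(values) if values else None


# Hypothetical reference semantics for the aggregation names in the table.
AGGREGATIONS: Dict[str, Callable[[List[Row]], Any]] = {
    "AVG": avg,
    "MAX": lambda rows: max(non_null(rows), default=None),
    "MIN": lambda rows: min(non_null(rows), default=None),
    "SUM": lambda rows: sum(non_null(rows)),
    "COUNT": lambda rows: len(non_null(rows)),
    "COUNT_DISTINCT": lambda rows: len(set(non_null(rows))),
    "FIRST": lambda rows: by_time(rows)[0] if rows else None,
    "LAST": lambda rows: by_time(rows)[-1] if rows else None,
    "LIST_DISTINCT": lambda rows: list(dict.fromkeys(non_null(rows))),
}

rows = [
    (datetime(2024, 1, 3), 5),
    (datetime(2024, 1, 1), 2),
    (datetime(2024, 1, 2), None),
    (datetime(2024, 1, 4), 5),
]
results = {name: fn(rows) for name, fn in AGGREGATIONS.items()}
print(results)
```

Note that `FIRST` and `LAST` depend on event time rather than input order, which is why the rows are sorted before the first or last value is taken.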
- Window format: `<start>, <end>, <measure_unit>`.
- Supported units: `YEARS`, `MONTHS`, `WEEKS`, `DAYS`, `HOURS`, `MINUTES`, `SECONDS`.
- Window semantics are half-open: `(start, end]`.
- `PREDICT`/`WHERE`: `start` and `end` must be non-negative.
- `ASSUMING`: `start` and `end` must be non-positive.
- `start` must be strictly less than `end`.

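The window rules above can be sketched in a few lines of Python. This is an illustration, not RTGL's validator: the function names are hypothetical, offsets are assumed to be measured relative to the prediction timestamp, and only fixed-length units are covered (`MONTHS` and `YEARS` depend on the calendar and are left out).

```python
from datetime import datetime, timedelta

# Fixed-length units only; MONTHS and YEARS are omitted from this sketch.
UNIT_TO_DELTA = {
    "WEEKS": timedelta(weeks=1),
    "DAYS": timedelta(days=1),
    "HOURS": timedelta(hours=1),
    "MINUTES": timedelta(minutes=1),
    "SECONDS": timedelta(seconds=1),
}


def validate_window(start: int, end: int, clause: str) -> None:
    """Raise ValueError if the window breaks the rules listed above."""
    if start >= end:
        raise ValueError("start must be strictly less than end")
    if clause in ("PREDICT", "WHERE") and (start < 0 or end < 0):
        raise ValueError(f"{clause}: start and end must be non-negative")
    if clause == "ASSUMING" and (start > 0 or end > 0):
        raise ValueError("ASSUMING: start and end must be non-positive")


def in_window(event_time: datetime, prediction_time: datetime,
              start: int, end: int, unit: str) -> bool:
    """Half-open check: prediction_time + start < event_time <= prediction_time + end."""
    step = UNIT_TO_DELTA[unit]
    lower = prediction_time + start * step
    upper = prediction_time + end * step
    return lower < event_time <= upper


t0 = datetime(2024, 1, 1)
validate_window(0, 91, "PREDICT")     # valid forward-looking window
validate_window(-30, 0, "ASSUMING")   # valid backward-looking window
print(in_window(t0, t0, 0, 91, "DAYS"))                       # False: start is excluded
print(in_window(t0 + timedelta(days=91), t0, 0, 91, "DAYS"))  # True: end is included
```

With the `0, 91, DAYS` window from the temporal example, an event exactly at the prediction timestamp falls outside the window, while an event exactly 91 days later falls inside it.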
```
RTGL Query String
        ↓
[Lexer]     -> Tokens
        ↓
[Parser]    -> Parse Tree
        ↓
[Visitor]   -> Structured Dictionary
        ↓
[Validator] -> Semantic Checks
        ↓
[Converter] -> SQL Query
        ↓ (optional execute=True)
[DuckDB]    -> Result Table
```

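As an illustration of the first stage of the pipeline above, here is a toy regex-based tokenizer that records a source offset for each token. It is only a sketch: RTGL's real lexer is generated by ANTLR from its grammar files, and the token kinds, the keyword set (limited to keywords that appear in this README), and the `Token` type below are all hypothetical.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    pos: int  # character offset in the query string


# Keywords seen in this README; the real grammar may define more.
KEYWORDS = {
    "PREDICT", "FOR", "EACH", "WHERE", "ASSUMING", "AND",
    "IS", "NOT", "NULL", "RANK", "TOP", "CLASSIFY",
}

TOKEN_SPEC = [
    ("STRING", r"'[^']*'"),
    ("NUMBER", r"-?\d+(?:\.\d+)?"),
    ("OP", r"==|!=|<=|>=|<|>"),
    ("IDENT", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("PUNCT", r"[().,;*]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(query: str) -> Iterator[Token]:
    """Yield tokens with their offsets; raise SyntaxError on unknown characters."""
    for match in TOKEN_RE.finditer(query):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        if kind == "IDENT" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        yield Token(kind, text, match.start())


tokens = list(tokenize("PREDICT COUNT(votes.*) FOR EACH posts.*;"))
for token in tokens:
    print(token)
```

Keeping the offset on every token is what lets later stages, such as a validator, point at the exact place in the query where a problem occurs.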
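Install `uv`:

- macOS & Linux

  ```bash
  wget -qO- https://astral.sh/uv/install.sh | sh
  ```

- Windows

  ```powershell
  powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  ```

Install the dependencies:

```bash
uv sync --all-extras
```

If you modify lexer or parser grammar files (`*.g4`), regenerate ANTLR outputs from the repo root:

```bash
./regenerate_parser.sh
```

Run the tests:

```bash
pytest
```

Run the linter:

```bash
ruff check .
```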