
boluor/tabsdata

 
 


Tabsdata

Tabsdata Pub/Sub for Tables

Tabsdata is a publish-subscribe (pub/sub) server for tables.

Tabsdata has connectors to publish tables from, and subscribe tables to, local files, S3, Azure Storage, MySQL/MariaDB, Oracle, and PostgreSQL. It also provides a Connector Plugin API for writing custom connectors.

Tables can be populated with external data or with data from other tables that already exist in the Tabsdata server.

Tables can be manipulated using a TableFrame API (internally Tabsdata uses Polars) that supports selection, filtering, aggregation, and join operations.
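The four kinds of operation named above can be illustrated without Tabsdata installed. The following is a plain-Python sketch over lists of dictionaries. It is not the TableFrame API, and the sample rows and the helper names (`select`, `where`, `count_by`, `join`) are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical sample rows; not data from the Tabsdata project.
persons = [
    {"identifier": 1, "name": "Ana", "nationality": "spanish"},
    {"identifier": 2, "name": "Luc", "nationality": "french"},
    {"identifier": 3, "name": "Eva", "nationality": "spanish"},
]
languages = [
    {"nationality": "spanish", "language": "es"},
    {"nationality": "french", "language": "fr"},
]


def select(rows, columns):
    """Selection: keep only the named columns."""
    return [{c: row[c] for c in columns} for row in rows]


def where(rows, predicate):
    """Filtering: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def count_by(rows, column):
    """Aggregation: count rows per distinct value of a column."""
    return dict(Counter(row[column] for row in rows))


def join(left, right, on):
    """Inner join: merge rows whose `on` column values match."""
    index = {}
    for right_row in right:
        index.setdefault(right_row[on], []).append(right_row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[on], [])
    ]


spanish = select(
    where(persons, lambda row: row["nationality"] == "spanish"),
    ["identifier", "name"],
)
per_nationality = count_by(persons, "nationality")
with_language = join(persons, languages, "nationality")
```

In a TableFrame the same steps would be written with column expressions such as `td.col(...)`, as the transformer snippet in "How Does Tabsdata Work?" does for filtering and selection.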

Tabsdata Binary Distribution

The Tabsdata Enterprise Package is built on this open source foundation and adds further value-added features. It is available on PyPI as a binary distribution for Linux, macOS, and Windows.

To install the binary distribution, use the following command:

pip install "tabsdata[all]"

Tabsdata Binary Distribution Documentation

Contributing

Contributions are welcome! Please refer to the Contributing Guide for more information.

How Does Tabsdata Work?

The following snippets show how to publish and subscribe to tables in Tabsdata.

Publishing data from a MySQL Database

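import tabsdata as td

# Publish the rows of the MySQL CUSTOMERS table into the Tabsdata table "customers".
@td.publisher(
    td.MySQLSource(
        "mysql://127.0.0.1:3306/testing",
        ["select * from CUSTOMERS"],
        # The password is read from the DB_PASSWORD environment variable.
        td.UserPasswordCredentials("admin", td.EnvironmentSecret("DB_PASSWORD"))
    ),
    tables=["customers"]
)
def pub(customers: td.TableFrame) -> td.TableFrame:
    return customers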

Subscribing, transforming and publishing data within Tabsdata

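import tabsdata as td

# Read the "customers" table written by the publisher, keep only the Spanish
# rows and four of their columns, and write the result to the "spanish" table.
@td.transformer(
    input_tables=["customers"],
    output_tables=["spanish"]
)
def tfr(customers: td.TableFrame) -> td.TableFrame:
    return customers.filter(td.col("nationality").eq("spanish")).select(
        ["identifier", "name", "surname", "language"]
    )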

Subscribing to data in an S3 Bucket

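import tabsdata as td

# Export the "spanish" table to a Parquet file in S3.
@td.subscriber(
    "spanish",
    td.S3Destination(
        "s3://my_bucket/spanish.parquet",
        # Both AWS credentials are read from environment variables.
        td.S3AccessKeyCredentials(
            td.EnvironmentSecret("AWS_ACCESS_KEY_ID"),
            td.EnvironmentSecret("AWS_SECRET_KEY")
        )
    ),
)
def sub(spanish: td.TableFrame) -> td.TableFrame:
    return spanish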

Executing the Publisher

To publish data to Tabsdata, run the following command:

$ td fn trigger --coll examples --name pub

In the Tabsdata binary distribution, every time the pub publisher is executed, the tfr transformer and the sub subscriber are also executed.
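One way to picture this cascade is as a walk over table dependencies: a function runs when a table it reads has just been written. The sketch below is a plain-Python model of that idea, not Tabsdata's actual scheduler. It assumes that the transformer reads the table that `pub` writes (`customers`), and the `FUNCTIONS` mapping and the `downstream` helper are hypothetical names introduced for illustration.

```python
from collections import deque

# Hypothetical model: function name -> (tables read, tables written).
# Assumes tfr reads the table that pub writes.
FUNCTIONS = {
    "pub": ([], ["customers"]),
    "tfr": (["customers"], ["spanish"]),
    "sub": (["spanish"], []),
}


def downstream(trigger):
    """Return the functions that run, in order, once `trigger` is executed."""
    order = [trigger]
    queue = deque(FUNCTIONS[trigger][1])  # freshly written tables to propagate
    while queue:
        table = queue.popleft()
        for name, (reads, writes) in FUNCTIONS.items():
            if table in reads and name not in order:
                order.append(name)
                queue.extend(writes)
    return order


print(downstream("pub"))  # prints ['pub', 'tfr', 'sub']
```

The model is deliberately simplified: it runs a function as soon as any one of its input tables is updated and does not model versions, failures, or functions with several inputs.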
