aphia-sparql-sync

UPFRONT: important known limitations & issues

This is very recent work (started 2026-01-26). Use it with caution and feel free to report issues, preferably after checking the list of known issues on GitHub.

We are (and you should be) aware of the following:

- getting the full LDES synchronised needs more investigation (the problems seen so far could be due to limited laptop resources)
- an actual lookup by (partial) name should be added to the ipynb dashboard
- the results of that lookup should be compared to those obtained from the aphia-webservice
- it would be nice to make it easy to use an external sparql-endpoint (instead of the embedded graphdb)
- better docker-child-management support in ldes-consumer should be considered, through testcontainers

getting started

This stack requires docker and docker-compose to be available in the environment where you run it.

With those dependencies in place:

- check out this repo with `git clone` and move into the created folder with `cd ./aphia-sparql-sync`
- clone and modify the env settings with `./bin/initenv.sh` and `vi .env`
- get all needed docker images with `docker compose pull`
- start the stack with `./bin/msup.sh`
- check the output for indications that everything launched as expected:
  - there is a known timing issue with graphdb on initial start (on some slower platforms)
  - if this happens, just rerun the previous command; on the second run all should be well
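
The "rerun on a failed first start" advice above can be automated. The following is only a minimal sketch: it assumes that `./bin/msup.sh` reports the timing problem through a non-zero exit status, which this README does not confirm, and `run_with_retry` is a hypothetical helper name, not part of this repo.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=2, delay_s=5.0, runner=subprocess.run):
    """Run cmd and rerun it while it exits non-zero, up to `attempts` times.

    Returns the number of attempts that were needed.
    Raises RuntimeError when every attempt fails.
    ASSUMPTION: the command signals a failed start via its exit status.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Give graphdb some time to settle before trying again.
            time.sleep(delay_s)
    raise RuntimeError(f"{cmd!r} still failing after {attempts} attempts")


# Usage, from the repo root:
#   run_with_retry(["./bin/msup.sh"])
```

The `runner` parameter exists only so the retry logic can be exercised without actually starting docker.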

Optionally you can:

- keep an eye on the logs with `./bin/mslogs.sh`

When you are done working with the stack, you might want to:

- shut down the stack with `./bin/msdwn.sh`

The stack remembers its state and earlier harvested results.

using the stack

When the microservices are running you can interact with them through your browser:

triplestore graphdb

GRAPHDB @5200 provides the UI of the embedded graphdb instance.

yasgui

YASGUI @5210 provides the popular yasgui sparql editor in the browser.

Be sure to point it to the correct sparql endpoint (the default, http://localhost:5200/repositories/aphia-sync, should work).

Compared to the built-in sparql UI in the graphdb front-end, this has the advantage that it can also point to other sparql endpoints.
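
The same endpoint can also be queried from a script instead of the browser. Below is a minimal sketch using only the Python standard library and the standard SPARQL 1.1 Protocol (query sent as form data, JSON results requested). The function names `build_sparql_request` and `run_select` are illustrative and not part of this repo; the endpoint URL is the default mentioned above.

```python
import json
import urllib.parse
import urllib.request

# Default endpoint as mentioned in this README; adjust if your setup differs.
DEFAULT_ENDPOINT = "http://localhost:5200/repositories/aphia-sync"


def build_sparql_request(query, endpoint=DEFAULT_ENDPOINT):
    """Build a SPARQL 1.1 Protocol POST request carrying the query as form data."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_select(query, endpoint=DEFAULT_ENDPOINT, timeout=30):
    """Send the query and return the decoded SPARQL JSON results document."""
    request = build_sparql_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage, with the stack running:
#   doc = run_select("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
#   for row in doc["results"]["bindings"]:
#       print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

Keeping the request building separate from the sending makes it easy to point the same code at another endpoint.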

jupyter notebooks environment

JUPYTER @5220 provides a local jupyter notebook instance within the docker-networking stack. This means it can directly access the sparql endpoint on the graphdb service.

In combination with the handy jinja-templating-for-sparql feature of the py-sema library loaded into this python stack, this allows for quick and easy analysis of the harvested graph.
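
To illustrate the general idea of templated sparql (not the py-sema API itself, which is not shown here), the sketch below uses `string.Template` from the Python standard library as a stand-in for jinja. The template, the function name `render_predicates_query`, and the type URI in the usage comment are all hypothetical.

```python
from string import Template

# Query template: lists the predicates used on subjects of a given type.
# $type_uri and $limit are filled in by render_predicates_query().
PREDICATES_OF_TYPE = Template("""\
SELECT ?p (COUNT(*) AS ?uses)
WHERE {
  ?s a <$type_uri> ;
     ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?uses)
LIMIT $limit
""")

# Characters that are not allowed inside a SPARQL IRI reference <...>.
_FORBIDDEN_IN_IRI = set('<>"{}|\\^` ')


def render_predicates_query(type_uri, limit=50):
    """Fill in the template, rejecting values that could break out of <...>."""
    if _FORBIDDEN_IN_IRI & set(type_uri):
        raise ValueError(f"not a plain IRI: {type_uri!r}")
    return PREDICATES_OF_TYPE.substitute(type_uri=type_uri, limit=int(limit))


# Usage (the type URI is a placeholder, not taken from the harvested graph):
#   print(render_predicates_query("http://example.org/TaxonName", limit=10))
```

Validating the substituted values matters with any templating approach, since the rendered text is sent to the endpoint as-is.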

For convenience we provide one notebook, [./notebook/aphia-sync-dashboard.ipynb](http://localhost:5220/lab/tree/aphia-sync-dashboard.ipynb), that offers the following features:

- counting the aphia-related objects in the graph
- listing and inspecting the predicates of available taxname objects
- (todo) looking up a taxname <uri> by scientific name
- (todo) comparing that result with what can be retrieved through webservice calls
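
As an illustration of the counting feature, here is a minimal sketch of a count-per-type query and of reading its answer in the standard SPARQL 1.1 JSON results format. It is not the code of the notebook: the query counts every typed subject rather than only aphia-related ones, and `counts_by_type` is a hypothetical helper name.

```python
# Counts the distinct subjects per rdf:type in the whole repository.
COUNT_BY_TYPE_QUERY = """\
SELECT ?type (COUNT(DISTINCT ?s) AS ?n)
WHERE { ?s a ?type }
GROUP BY ?type
ORDER BY DESC(?n)
"""


def counts_by_type(results):
    """Turn a SPARQL JSON results document into a {type_uri: count} dict.

    `results` is the decoded JSON returned by the endpoint when asked for
    application/sparql-results+json.
    """
    counts = {}
    for row in results["results"]["bindings"]:
        counts[row["type"]["value"]] = int(row["n"]["value"])
    return counts
```

Each binding in the results document maps a variable name to an object with a `type` and a `value`; numeric values arrive as strings, hence the `int(...)` conversion.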

ref and explain stuff used here.

This stack builds on a number of other projects worth exploring:

- k-gap: the basic docker-based python-analysis platform for knowledge graphs that we reuse here
- py-sema: a python library adding convenience to scientific research that taps into knowledge graphs and semantics (works on top of py-rdflib and others)
- ldes2sparql: an application of the rdfconnect platform tuned to materialise LDES feeds into a SPARQL endpoint

configuration and customisation through `.env`

See the comments in the dot-env-example as well as the docker-compose.yml for tuning specific settings.
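
When dot-env-example gains new settings, it can help to check which of them are still missing from your own `.env`. The sketch below is a simplified reader for `KEY=VALUE` lines (it ignores the more advanced syntax docker compose accepts); `parse_dotenv` and `missing_keys` are hypothetical helpers, and no variable names from this repo are assumed.

```python
def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


def missing_keys(example_text, env_text):
    """Return the keys defined in the example but absent from the actual env."""
    example = parse_dotenv(example_text)
    actual = parse_dotenv(env_text)
    return sorted(set(example) - set(actual))


# Usage, from the repo root:
#   with open("dot-env-example") as ex, open(".env") as env:
#       print(missing_keys(ex.read(), env.read()))
```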
