
feat: v1 pipeline & api#5

Merged
chadoh merged 30 commits into main from feat/v4 on Apr 29, 2026

Conversation

@willemneal willemneal commented Apr 13, 2026

Contributor

v1! The first mainnet-ready pipeline & API. The db tables are now in their own schema, so rather than incrementing to v4 we thought it made sense to call it v1.

This indexes named subregistries (oz, blend, soroswap, defindex, etc.) and auto-discovers new subregistries using Goldsky's dynamic tables. A redeploy.sh script also creates views that join to the subregistry table to resolve subregistry names. Code has been reorganized to make the redeploy script mostly version-agnostic, though a Goldsky workaround is still hard-coded to v1 specifics.

Pipeline / schema

willemneal and others added 4 commits April 13, 2026 13:49
Pipeline: drop hardcoded contract_id allowlist in transform_1 and filter by
event symbol instead, so new registries are indexed the moment they emit any
of deploy/publish/register/rename/update_address/update_owner/sub_reg. Add
transform_3_subregistry_events + subregistry_events_pg sink that captures
sub_reg events from the root registry CB7WYFH2...SYA76J2W and writes them
to public.registries keyed on the subregistry's contract_id. The authority
filter on emitter_contract_id prevents rogue contracts from polluting the
lookup table.
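The two filters described above can be sketched as follows. This is a minimal illustration in Python, not the pipeline's transform SQL (which this page does not show). The event shape, the field names `symbol` and `emitter_contract_id`, and the placeholder root ID are assumptions.

```python
# Hypothetical event shape; the real dataset columns are not shown on this page.
REGISTRY_SYMBOLS = {
    "deploy", "publish", "register", "rename",
    "update_address", "update_owner", "sub_reg",
}

# Placeholder only, not the real root registry contract ID.
ROOT_REGISTRY_ID = "ROOT_PLACEHOLDER"


def registry_events(events):
    """Keep any event whose symbol is a registry symbol, whoever emitted it."""
    return [e for e in events if e["symbol"] in REGISTRY_SYMBOLS]


def subregistry_announcements(events, root_id=ROOT_REGISTRY_ID):
    """Authority filter: keep only sub_reg events emitted by the root registry."""
    return [
        e for e in events
        if e["symbol"] == "sub_reg" and e["emitter_contract_id"] == root_id
    ]
```

With this split, a `sub_reg` event from any other contract still passes the symbol filter but never reaches the lookup table.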

sql/v4_registries.sql: schema for the new registries lookup table.

fly-app: LEFT JOIN public.registries in every contract/wasm query and return
COALESCE(r.channel, table.channel) so unknown subregistries surface under
their raw contract_id until the root announces them, at which point the
friendly name resolves. Drop the hardcoded main/unverified allowlist from
the four endpoints that had it. Add GET /v1/registries listing the new
lookup table. Dedupe v4_published_wasms and v4_deployed_contracts in
contract queries via DISTINCT ON subqueries so shared wasm hashes don't
multiply registered rows.
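To illustrate the row multiplication that the dedupe addresses, here is a small sketch using Python's built-in `sqlite3`. The table and column names are simplified assumptions, and because SQLite has no `DISTINCT ON`, a `GROUP BY` subquery stands in for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registered (name TEXT, wasm_hash TEXT);
CREATE TABLE published (wasm_hash TEXT, version TEXT);
INSERT INTO registered VALUES ('my-contract', 'abc');
-- The same wasm hash appears twice in the published table.
INSERT INTO published VALUES ('abc', '1.0.0'), ('abc', '1.0.1');
""")

# Joining directly yields one output row per matching published row.
NAIVE = """
SELECT r.name, p.version FROM registered r
JOIN published p ON p.wasm_hash = r.wasm_hash
"""

# Collapse to one row per wasm_hash before joining.
# MAX(version) is a plain string comparison here, not semver ordering.
DEDUPED = """
SELECT r.name, p.version FROM registered r
JOIN (
    SELECT wasm_hash, MAX(version) AS version
    FROM published GROUP BY wasm_hash
) p ON p.wasm_hash = r.wasm_hash
"""

naive_rows = conn.execute(NAIVE).fetchall()      # two rows for one contract
deduped_rows = conn.execute(DEDUPED).fetchall()  # one row for one contract
```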

Agent-Id: bitswell
Session-Id: 277b25ec-bebf-4249-9719-104aae47e81a
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
transform_1: filter out op-none event variants. The Stellar dataset emits
every contract event twice — once at the transaction level (id suffix
op-none-event-N) and once at the operation level (op-0-event-M). Both
rows have identical data, only the id differs, so the postgres sink was
keeping both and every /v1/contracts row surfaced twice. NOT LIKE
'%-op-none-%' keeps only the operation-level copy.
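A rough Python equivalent of that `NOT LIKE` filter is shown below. The id prefixes in the usage note are made up; only the `op-none-event-N` and `op-0-event-M` suffix shapes come from the commit message.

```python
def drop_transaction_level_copies(events):
    """Mimic NOT LIKE '%-op-none-%': keep rows whose id does not contain '-op-none-'."""
    return [e for e in events if "-op-none-" not in e["id"]]
```

For example, given rows with ids `tx1-op-none-event-0` and `tx1-op-0-event-0`, only the second survives.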

transform_3_subregistry_events: swap authority filter from the old root
CB7WYFH2...SYA76J2W to the new root CBT6FE6W...OGUJH5JTI.

start_at: 2036131 -> 2038000 to skip the historical events from the two
defunct roots (CDVDJX2HX, CB7WYFH2) so a fresh restart doesn't drag their
orphaned register events back into v4_registered_contracts.

Agent-Id: bitswell
Session-Id: 277b25ec-bebf-4249-9719-104aae47e81a
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread registry-turbo-v4.yaml Outdated
from: transform_3_subregistry_events
schema: public
secret_name: HOSTED_POSTGRES_CMJD1KI8O0
table: registries

Member

Should this also be prefixed with v4_?

@willemneal willemneal Apr 17, 2026

Contributor Author

Updated to v4_registries in the yaml sink, sql/v4_registries.sql, and every public.registries JOIN in fly-app/src/main.rs (in cc3c35e). The /v1/registries endpoint path stays the same since that's the public API shape, independent of storage.

Comment thread fly-app/src/main.rs
willemneal and others added 4 commits April 17, 2026 12:52
Matches the v4_* prefix convention used by every other sink in
registry-turbo-v4.yaml. The SQL file was already named v4_registries.sql.

Agent-Id: bitswell
Session-Id: 4ff31451-91fe-4237-bd27-7143f59d56b0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the LEFT JOIN v4_registries + COALESCE(...) pattern duplicated
across seven query sites in fly-app/src/main.rs with two views
(v4_published_wasms_named, v4_registered_contracts_named) that expose a
resolved_channel column. Query plans are unchanged; the join is now
defined once instead of being copy-pasted per call site.
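The idea of defining the join once in a view can be sketched with `sqlite3`. The view and table names follow the commit message, but the column names and the join key (the raw `channel` column holding the subregistry's contract_id) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE v4_registries (contract_id TEXT PRIMARY KEY, channel TEXT);
CREATE TABLE v4_registered_contracts (contract_name TEXT, channel TEXT);

-- The join and the fallback live in one place; callers read resolved_channel.
CREATE VIEW v4_registered_contracts_named AS
SELECT c.contract_name,
       COALESCE(r.channel, c.channel) AS resolved_channel
FROM v4_registered_contracts c
LEFT JOIN v4_registries r ON r.contract_id = c.channel;

INSERT INTO v4_registries VALUES ('CKNOWN', 'blend');
INSERT INTO v4_registered_contracts VALUES
    ('pool', 'CKNOWN'), ('mystery', 'CUNKNOWN');
""")

rows = conn.execute(
    "SELECT contract_name, resolved_channel "
    "FROM v4_registered_contracts_named ORDER BY contract_name"
).fetchall()
```

The contract from the announced subregistry resolves to `blend`, while the other keeps its raw contract_id as its channel.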

Agent-Id: bitswell
Session-Id: 4ff31451-91fe-4237-bd27-7143f59d56b0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The root registry now emits a sub_reg event naming itself "root", so
v4_registries maps its contract_id to channel="root". The API's
default-channel routes were passing the literal "main" and returning
nothing. Align the literal with what the pipeline writes, and rename the
default-channel handlers from *_main* to *_root* so function names
match their behavior.

Agent-Id: bitswell
Session-Id: 4ff31451-91fe-4237-bd27-7143f59d56b0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the previous root CBT6FE6W with the new root
CBNBQND6EMYTTRTCUWUJ3VIKF7RUUISK5T4GAKTXRVIQRHGP4XQY4ID7 at the
emitter_contract_id filter in transform_3_subregistry_events, and fix a
stale reference to a pre-CBT6FE6W root in the v4_registries.sql header
comment so documentation matches the pipeline.

Agent-Id: bitswell
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@willemneal willemneal requested a review from chadoh April 20, 2026 16:19

@chadoh chadoh left a comment

Member

I kind of want you to throw this away and start over, this time without Claude. This PR now has many thousands of lines of code. These lines of code need to be maintained, and there's no reason this project needs to be so complicated that we shouldn't be able to use our feeble, multi-tasking human brains to maintain it.

Before I knew you were adding tests yourself, I had a conversation with @ifropc about how to test this. Our plan was to do what many web frameworks do: create factory files for test data from current database dumps, and load these into SQLite. We can then test every API endpoint to make sure it returns well-formed data, which is the main thing we care about testing. We want to exercise the queries in our API. The test suite designed by Claude seems to be testing external dependencies like Postgres and Goldsky, and it seems to add lots of complexity and slowness to the suite for not much value. The complexity includes a whole new language in the codebase, TypeScript! (TS was previously only used for a Lambda PoC, which was an early experiment to compare to "Pure Goldsky"; Claude probably mistakenly thought we already used it, so ¯\_(ツ)_/¯.) We can probably trust our external dependencies to work.
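The fixture-into-SQLite approach described in this comment might look roughly like the sketch below. It is written in Python for brevity rather than in the API's own language, and the table, columns, fixture rows, and the `list_contracts` helper are all hypothetical.

```python
import sqlite3

# Hypothetical fixture rows; the plan above would generate these from database dumps.
FIXTURE_SQL = """
CREATE TABLE contracts (contract_name TEXT, contract_id TEXT, channel TEXT);
INSERT INTO contracts VALUES
    ('registry', 'CAAA', 'root'),
    ('pool', 'CBBB', 'blend');
"""


def load_fixtures():
    """Create an in-memory database seeded with the fixture rows."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(FIXTURE_SQL)
    return conn


def list_contracts(conn, channel):
    """Stand-in for an API handler's query: contracts in one channel."""
    rows = conn.execute(
        "SELECT contract_name, contract_id, channel FROM contracts "
        "WHERE channel = ? ORDER BY contract_name",
        (channel,),
    ).fetchall()
    return [dict(row) for row in rows]


def is_well_formed(record):
    """Check that a record has exactly the expected fields, all strings."""
    expected = {"contract_name", "contract_id", "channel"}
    return set(record) == expected and all(
        isinstance(value, str) for value in record.values()
    )
```

A test then loads the fixtures, calls each query, and asserts that every returned record passes `is_well_formed`, with no Postgres or Goldsky involved.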

@willemneal willemneal changed the title from "feat: add tables v4" to "feat: v4 pipeline and schema" Apr 21, 2026
@chadoh chadoh force-pushed the feat/v4 branch 3 times, most recently from e4e3aa1 to 9796b61 on April 21, 2026
The events coming into the tables that now have the `*_raw` suffix could
come from any contract, not just ours. Anyone can deploy a contract that
emits a `deploy` event. So we shouldn't trust these events. We should
only trust them if they come from one of the contracts in the
`transform_3_subregistry_events` table.

`transform_4_events_with_name` now contains these clean events.

This also updates the view to use `INNER JOIN` rather than `LEFT JOIN`,
because we only want records that are in both tables (and with the new
transform logic, we should not have any "registered_contracts" or
"published_wasms" without a matching record in "subregistries" anyhow).
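The difference the join type makes can be seen in a small `sqlite3` sketch. The table and column names here are simplified assumptions, not the actual pipeline or view definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subregistries (contract_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE events_raw (emitter_contract_id TEXT, symbol TEXT);
INSERT INTO subregistries VALUES ('CTRUSTED', 'oz');
-- Anyone can emit a deploy event, including an unknown contract.
INSERT INTO events_raw VALUES ('CTRUSTED', 'deploy'), ('CROGUE', 'deploy');
""")

QUERY = """
SELECT s.name, e.emitter_contract_id, e.symbol
FROM events_raw e
{join} JOIN subregistries s ON s.contract_id = e.emitter_contract_id
ORDER BY e.emitter_contract_id
"""

# LEFT JOIN keeps the untrusted event, with a NULL subregistry name.
left_rows = conn.execute(QUERY.format(join="LEFT")).fetchall()
# INNER JOIN keeps only events whose emitter is a known subregistry.
inner_rows = conn.execute(QUERY.format(join="INNER")).fetchall()
```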
Comment thread sql/v4_named_views.sql Outdated
@chadoh chadoh force-pushed the feat/v4 branch 3 times, most recently from 1ba65ab to 3685a7a on April 23, 2026
as well as

- new testnet contract
- some syntax cleanup
- one trial join from v4_deployed_contracts to v4_registries (not yet
  tested)
willemneal and others added 3 commits April 24, 2026 09:54
Adds .github/workflows/test.yml running on every PR and push to main:
  1. Install Goldsky's turbo CLI on the runner (ubuntu-24.04 ships
     glibc 2.39, so no container wrapper needed unlike local dev).
  2. Run `turbo validate` against each registry-turbo-*.yaml via
     scripts/validate-pipelines.sh. registry-minimal.yaml has an
     independent pre-existing validation error and is intentionally
     out of scope.
  3. Run the jest unit project (no external services).
  4. Bring up postgres:17-alpine as a GitHub Actions service container
     and run the jest integration project — both synthetic and real
     suites — against TEST_PG_URL.

Also exposes TURBO_BIN in scripts/validate-pipelines.sh so developers
on older glibc can point at a docker-wrapper; the script tells them
where to install from when the binary is missing.

Agent-Id: bitswell
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The main https://goldsky.com/install installs the goldsky wrapper
CLI, not the standalone turbo binary the pipeline validator needs.
Switch to https://install-turbo.goldsky.com which puts a working turbo
at ~/.goldsky/bin/turbo. Also surface the correct URL in the
validate-pipelines.sh error message so local runs point at the same
command.

Agent-Id: bitswell
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
turbo validate requires an authenticated goldsky session even for
offline YAML schema checking — my earlier assumption that validate was
auth-free was wrong. Write the GOLDSKY_API_KEY secret to
~/.goldsky/auth_token before running the validator, and guard both
the auth step and the validate step on the secret being present so
fork PRs (which cannot access repo secrets) skip gracefully instead
of failing the workflow.

Agent-Id: bitswell
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
willemneal and others added 12 commits April 24, 2026 09:54
- new `goldsky` folder with `goldsky/v1` containing what was previously
  known as `v4` logic
- update `redeploy.sh` to call `v1/post_init.sql`, which is now where
  the views get created
- create all tables in a schema (the `v1` schema) rather than in the
  `public` namespace; this simplifies a bunch of logic (like
  `goldsky/scripts/drop_tables.sh` dropping the whole schema, rather
  than individual tables)
willemneal and others added 3 commits April 24, 2026 15:11
Events whose emitter contract_id is added to registries_dynamic_table
only a few ledgers before a downstream event can be dropped by
transform_4's dynamic_table_check, because the Postgres write from
transform_3 hasn't committed by the time the check reads. The race
window is non-deterministic (observed 10–70s).

- refresh.sh: wraps `turbo restart --clear-state`, which resets the
  source checkpoint while preserving the Postgres-backed dynamic
  table so a replay sees the fully-seeded membership set.
- audit-race.sql: compares raw_events_backup (upstream of the filter)
  against each sink to find events that should have passed but didn't.
- redeploy.sh --number-of-initial-subregistries N: polls the dynamic
  table to the expected size, runs the audit, runs refresh.sh on
  drops, re-audits.
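The audit-then-refresh flow in the list above can be sketched in Python. This is only an illustration of the logic, not the contents of audit-race.sql or redeploy.sh; the row shapes, function names, and the `max_refreshes` parameter are assumptions.

```python
def find_dropped_events(raw_backup, sink_rows, trusted_emitters):
    """Return ids of events that should have passed the filter but are missing from the sink."""
    expected = {
        e["id"] for e in raw_backup
        if e["emitter_contract_id"] in trusted_emitters
    }
    delivered = {row["id"] for row in sink_rows}
    return sorted(expected - delivered)


def audit_and_refresh(load_state, refresh, max_refreshes=1):
    """Audit; if events were dropped, call refresh() and audit again.

    load_state() returns (raw_backup, sink_rows, trusted_emitters).
    Returns the ids still missing after the final audit.
    """
    dropped = find_dropped_events(*load_state())
    refreshes = 0
    while dropped and refreshes < max_refreshes:
        refresh()  # stands in for a replay against the fully seeded membership set
        refreshes += 1
        dropped = find_dropped_events(*load_state())
    return dropped
```

An empty return value means every event from a trusted emitter reached its sink; a non-empty one lists the ids that are still missing after the replay.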

Tracking: #12

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chadoh added a commit to stellar-registry/ui that referenced this pull request Apr 27, 2026
Update Registry UI to deal with changes from stellar-registry/indexer#5

- New channels added (`oz`, `blend`, `soroswap`, `defindex`, etc) rather
  than just `unverified` or main/root
- Main channel now officially known (in the index, db, api) as `"root"`
- Channel of contract and the channel of the wasm it references may
  differ; new `wasm_channel` field now available

Note that at the start of stellar-registry/indexer#5, we were calling
this the "v4" index. Now that it has gone through a few more iterations,
we are confident that this is our starting "initial release" index &
API, rather than being essentially another prerelease. We've therefore
updated the underlying db & API to reference `v1`. (This does not
percolate to the UI itself, but makes naming this branch & PR awkward).
@chadoh chadoh changed the title from "feat: v4 pipeline and schema" to "feat: v1 pipeline & api" Apr 28, 2026
Comment thread fly-app/src/main.rs Outdated

@chadoh chadoh left a comment

Member

lfggggggg

chadoh added a commit to stellar-registry/ui that referenced this pull request Apr 29, 2026
@chadoh chadoh merged commit a56ee0d into main Apr 29, 2026
1 check passed
@chadoh chadoh deleted the feat/v4 branch April 29, 2026 17:48