Namespace secondary-schema proto types and enumerate all schemas in catalog metadata (cross-schema 4/6)#4215

Draft
arnaud-lacurie wants to merge 19 commits into FoundationDB:main from arnaud-lacurie:cross-schema/pr4

@arnaud-lacurie

Collaborator

Summary

  • LogicalOperator prefixes the proto type name of secondary-schema tables with the schema name and a dot (e.g. OTHER.things) to avoid TypeRepository collisions when two schemas define identically named tables; the FDB record type name (the scan key) is preserved unchanged
  • CatalogMetaData.getTables and getColumns now enumerate all schemas in the database when no schema filter is provided, enabling cross-schema metadata inspection
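
A minimal sketch of the namespacing idea in the first bullet, assuming a registry that rejects duplicate type names. `TypeNamespacingSketch`, `qualifiedTypeName` and `register` are hypothetical names for illustration only; they are not the Record Layer's TypeRepository or LogicalOperator API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the actual TypeRepository.
final class TypeNamespacingSketch {

    /** Name used for type registration; a null schemaName stands for the primary schema. */
    static String qualifiedTypeName(String schemaName, String tableName) {
        return schemaName == null ? tableName : schemaName + "." + tableName;
    }

    /** Registers a type definition and fails loudly on a name collision. */
    static void register(Map<String, String> registry, String typeName, String definition) {
        if (registry.putIfAbsent(typeName, definition) != null) {
            throw new IllegalStateException("type name collision: " + typeName);
        }
    }

    public static void main(String[] args) {
        Map<String, String> registry = new LinkedHashMap<>();
        // Both schemas define a table called "things"; only the registered type name differs.
        register(registry, qualifiedTypeName(null, "things"), "primary-schema things");
        register(registry, qualifiedTypeName("OTHER", "things"), "secondary-schema things");
        System.out.println(registry.keySet()); // prints [things, OTHER.things]
    }
}
```

In this sketch the scan would still use the unqualified name `things` against whichever store is bound, which mirrors the "record type name is preserved" half of the bullet.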

Stacking

This is PR 4 of 6 in the cross-schema join series. All PRs target main.

| # | PR | Topic |
|---|----|-------|
| 1 | #4212 | Plan-cache key for multi-schema queries |
| 2 | #4213 | SchemaIdentifier tag + store-binding infrastructure |
| 3 | #4214 | Secondary-store wiring into execution context |
| 4 | this | Type-namespace collision avoidance + catalog metadata |
| 5 | | Java integration tests |
| 6 | | YAML integration tests + documentation |

Test plan

  • Joining two schemas that each have a table named things works without TypeRepository collision
  • getTables(database, null, ...) returns tables from all schemas
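
The second test-plan item can be pictured with a small stand-in for the catalog. This is only a sketch under the assumption that a null schema filter means "every schema in the database"; `CatalogEnumerationSketch` and `listTables` are hypothetical and do not reflect the real CatalogMetaData signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for catalog metadata enumeration.
final class CatalogEnumerationSketch {

    /** Returns "SCHEMA.TABLE" entries; a null schemaFilter selects all schemas. */
    static List<String> listTables(Map<String, List<String>> tablesBySchema, String schemaFilter) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : tablesBySchema.entrySet()) {
            if (schemaFilter != null && !schemaFilter.equals(entry.getKey())) {
                continue; // a concrete filter restricts the output to one schema
            }
            for (String table : entry.getValue()) {
                result.add(entry.getKey() + "." + table);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // TreeMap keeps the schema iteration order deterministic.
        Map<String, List<String>> catalog = new TreeMap<>();
        catalog.put("PRIMARY", List.of("things"));
        catalog.put("OTHER", List.of("things", "widgets"));
        System.out.println(listTables(catalog, null));    // prints [OTHER.things, OTHER.widgets, PRIMARY.things]
        System.out.println(listTables(catalog, "OTHER")); // prints [OTHER.things, OTHER.widgets]
    }
}
```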

Introduce SchemaIdentifier value type (null = current schema) and attach
it as a metadata field to FullUnorderedScanExpression and
LogicalTypeFilterExpression. Both expressions gain a getSchemaId() accessor
and a second constructor/factory overload accepting a SchemaIdentifier.

The field does not participate in equalsWithoutChildren or
computeHashCodeWithoutChildren to keep planning behavior unchanged —
ImplementNestedLoopJoinRule will read it directly in a later step.
EvaluationContext gains an ImmutableMap<String, FDBRecordStoreBase<?>>
auxiliaryStores field (empty by default) plus getAuxiliaryStore(name) and
withAuxiliaryStores(map) accessors. withBinding() preserves the map.
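
A rough sketch of the auxiliary-store behaviour described above, assuming an immutable context whose `withBinding` copies carry the store map along unchanged. `EvaluationContextSketch` is a hypothetical simplification: stores are plain strings here rather than `FDBRecordStoreBase<?>`, and the JDK's `Map.copyOf` stands in for Guava's `ImmutableMap`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of an immutable evaluation context.
final class EvaluationContextSketch {
    private final Map<String, Object> bindings;
    private final Map<String, String> auxiliaryStores; // schema name -> store stand-in

    private EvaluationContextSketch(Map<String, Object> bindings, Map<String, String> auxiliaryStores) {
        this.bindings = Map.copyOf(bindings);
        this.auxiliaryStores = Map.copyOf(auxiliaryStores);
    }

    /** Context with no bindings and an empty auxiliary-store map (the default). */
    static EvaluationContextSketch empty() {
        return new EvaluationContextSketch(Map.of(), Map.of());
    }

    /** Returns a copy with one more binding; the auxiliary stores are preserved. */
    EvaluationContextSketch withBinding(String name, Object value) {
        Map<String, Object> copy = new HashMap<>(bindings);
        copy.put(name, value);
        return new EvaluationContextSketch(copy, auxiliaryStores);
    }

    /** Returns a copy with the given auxiliary stores; the bindings are preserved. */
    EvaluationContextSketch withAuxiliaryStores(Map<String, String> stores) {
        return new EvaluationContextSketch(bindings, stores);
    }

    /** Returns the store bound for the schema name, or null if none is bound. */
    String getAuxiliaryStore(String name) {
        return auxiliaryStores.get(name);
    }

    Object getBinding(String name) {
        return bindings.get(name);
    }
}
```

The point of the sketch is the ordering independence: calling `withAuxiliaryStores` before or after `withBinding` yields a context that still resolves the secondary store.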

RecordQueryStoreBindingPlan is a new transparent plan node that intercepts
executePlan() and re-dispatches to the secondary store bound for schemaId
in EvaluationContext. All planning properties delegate to the child plan.
Visitor implementations added for all RecordQueryPlanVisitor and
RelationalExpressionVisitor subscribers.
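
To illustrate the "transparent plan node" idea, here is a minimal sketch assuming a plan interface that receives the current store plus a map of auxiliary stores. All types below (`Store`, `Plan`, `ScanPlan`, `StoreBindingPlan`) are hypothetical stand-ins, not the real RecordQueryPlan hierarchy.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only.
final class StoreBindingSketch {

    /** Stand-in for a record store: record type name -> rows. */
    static final class Store {
        final Map<String, List<String>> rowsByType;

        Store(Map<String, List<String>> rowsByType) {
            this.rowsByType = rowsByType;
        }
    }

    interface Plan {
        List<String> execute(Store store, Map<String, Store> auxiliaryStores);
    }

    /** Scans one record type from whichever store it is handed. */
    static final class ScanPlan implements Plan {
        private final String recordTypeName;

        ScanPlan(String recordTypeName) {
            this.recordTypeName = recordTypeName;
        }

        @Override
        public List<String> execute(Store store, Map<String, Store> auxiliaryStores) {
            return store.rowsByType.getOrDefault(recordTypeName, List.of());
        }
    }

    /** Swaps in the store bound for schemaName, then delegates to the child unchanged. */
    static final class StoreBindingPlan implements Plan {
        private final String schemaName;
        private final Plan child;

        StoreBindingPlan(String schemaName, Plan child) {
            this.schemaName = schemaName;
            this.child = child;
        }

        @Override
        public List<String> execute(Store ignoredCurrentStore, Map<String, Store> auxiliaryStores) {
            Store bound = auxiliaryStores.get(schemaName);
            if (bound == null) {
                throw new IllegalStateException("no store bound for schema " + schemaName);
            }
            return child.execute(bound, auxiliaryStores);
        }
    }
}
```

Because the wrapper only changes which store the child sees, the same `ScanPlan("things")` can read from the primary store when unwrapped and from the secondary store when wrapped.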

The plan node is unreachable from any existing query path until
ImplementNestedLoopJoinRule starts emitting it (Step 11).

- SemanticAnalyzer: secondary schema lookup (setSecondarySchemaLookup),
  lazy loading and caching of secondary templates (tryLoadSecondarySchema),
  cross-schema tableExists/getTable, getSchemaIdFor, getLoadedSecondarySchemas,
  getAllTableStorageNamesForTemplate
- LogicalOperator.generateTableAccess: derive SchemaIdentifier per table,
  use secondary template for tableNames and isStoreRowVersions; tag
  LogicalTypeFilterExpression with the schema identifier
- PlanContext: secondary schema lookup function + additional schemas map
  (additionalSchemas); withAdditionalSchemas factory method; Builder adds
  withSecondarySchemaLookup setter, build passes new fields
- EmbeddedRelationalStatement / PreparedStatement: populate secondary
  schema lookup from backing catalog in createPlanContext
- PlanGenerator: create BaseVisitor before logical plan generation so the
  secondary schema lookup can be set; after generation collect loaded
  secondary schemas and enrich PlanContext via enrichWithSecondarySchemas
- CascadesPlanner: add planGraph overload accepting additional schemas that
  delegates to MetaDataPlanContext.forRootReferenceWithAdditionalSchemas
- QueryPlan.LogicalQueryPlan.optimize: dispatch to the multi-schema planGraph
  overload when additionalSchemas is non-empty
- QueryPlan.PhysicalQueryPlan.executeInternal: collect required secondary
  schema names from RecordQueryStoreBindingPlan nodes and inject the
  corresponding FDBRecordStoreBase instances into EvaluationContext via
  withAuxiliaryStores before executing the plan
… tests

When a PredicateWithValueAndRanges has a single range with multiple equality
comparisons (e.g. JOIN + WHERE on the same field), asComparisonRange() keeps
only the first comparison and turns the rest into residuals. The compensation lambda
previously returned noCompensationNeeded() whenever the Placeholder alias was
bound, causing the constant equality (the WHERE filter) to be silently dropped
from the generated plan.

Fix: replay the merge sequence inside the compensation lambda to detect residual
comparisons. When residuals exist, build a new single-range PVWR from them and
apply it as a filter via computeCompensationFunctionForLeaf, ensuring the plan
emits SCAN([IS ITEMS, EQUALS q6.ITEM_ID]) | FILTER(output.ID = 2) rather than
an unfiltered correlated scan.
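
The residual problem can be shown with a deliberately simplified model. This sketch assumes that all comparisons are integer equalities on one field, that the first equality becomes the scan key, and that the remaining ones must be re-applied as a filter. `ResidualCompensationSketch` and `scanIds` are hypothetical and far simpler than the planner's PredicateWithValueAndRanges and compensation machinery.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: equality comparisons on a single integer field.
final class ResidualCompensationSketch {

    /**
     * Uses the first equality as the scan key and re-applies every remaining
     * (residual) equality as a filter, so that no comparison is lost.
     */
    static List<Integer> scanIds(List<Integer> ids, List<Integer> equalities) {
        if (equalities.isEmpty()) {
            return new ArrayList<>(ids); // nothing to compare against: full scan
        }
        int scanKey = equalities.get(0);
        List<Integer> residuals = equalities.subList(1, equalities.size());

        List<Integer> out = new ArrayList<>();
        for (int id : ids) {
            if (id != scanKey) {
                continue; // models the keyed scan on the first comparison
            }
            boolean passesResiduals = true;
            for (int residual : residuals) {
                if (id != residual) {
                    passesResiduals = false; // models the compensating filter
                    break;
                }
            }
            if (passesResiduals) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3);
        // Join value 1 combined with a constant filter of 2: no row may match.
        System.out.println(scanIds(ids, List.of(1, 2))); // prints []
        // Join value 2 combined with a constant filter of 2: row 2 matches.
        System.out.println(scanIds(ids, List.of(2, 2))); // prints [2]
    }
}
```

Skipping the residual loop in this model would return `[1]` for the first call, which is the kind of silently unfiltered result the commit describes.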

Also fix schemaIdFromReference to recurse into quantifier child groups so that
cross-schema plans nested inside a partitioned SelectExpression are wrapped with
RecordQueryStoreBindingPlan. Covers AbstractDataAccessRule schema detection for
SelectExpression alternatives created by predicate push-down rules.

Add CrossSchemaJoinTest covering: plain cross-schema join, join with WHERE
filter on the primary-schema table, and standalone select from a secondary
schema.

Replace raw String primary cache key with PlanCacheSchemaKey holding an
ImmutableSortedSet of schema names. Replace single schemaTemplateVersion
in QueryCacheKey with ImmutableSortedMap<String,Integer> schemaVersions,
enabling cross-schema plan cache keying.

Affects: PlanCacheSchemaKey (new), QueryCacheKey, RelationalPlanCache,
AstNormalizer.NormalizationResult, PlanGenerator, and their tests.
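
A sketch of why sorted collections suit a cache key, assuming that the same set of schemas should hit the same cache entry regardless of the order in which they were encountered. `SchemaKey` and `QueryKey` are hypothetical simplifications of PlanCacheSchemaKey and QueryCacheKey, and the JDK's `TreeSet`/`TreeMap` stand in for Guava's `ImmutableSortedSet`/`ImmutableSortedMap`.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Objects;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical simplifications of the plan-cache key types.
final class PlanCacheKeySketch {

    /** Primary key: the set of schema names a query touches, in canonical order. */
    static final class SchemaKey {
        private final SortedSet<String> schemaNames;

        SchemaKey(Collection<String> names) {
            this.schemaNames = new TreeSet<>(names);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof SchemaKey && schemaNames.equals(((SchemaKey) o).schemaNames);
        }

        @Override
        public int hashCode() {
            return schemaNames.hashCode();
        }
    }

    /** Secondary key: the query text plus one template version per schema. */
    static final class QueryKey {
        private final String canonicalQuery;
        private final SortedMap<String, Integer> schemaVersions;

        QueryKey(String canonicalQuery, Map<String, Integer> schemaVersions) {
            this.canonicalQuery = canonicalQuery;
            this.schemaVersions = new TreeMap<>(schemaVersions);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof QueryKey)) {
                return false;
            }
            QueryKey other = (QueryKey) o;
            return canonicalQuery.equals(other.canonicalQuery)
                    && schemaVersions.equals(other.schemaVersions);
        }

        @Override
        public int hashCode() {
            return Objects.hash(canonicalQuery, schemaVersions);
        }
    }
}
```

With per-schema versions in the key, bumping the template version of any one participating schema produces a different `QueryKey`, so a plan compiled against the old version would not be reused.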
…to serialization to RecordQueryStoreBindingPlan

AbstractDataAccessRule.createScansForMatches now wraps scan plans in
RecordQueryStoreBindingPlan at creation time when the match candidate belongs
to a secondary schema. This places the store-binding node inside any
compensation wrappers (MAP, SELECT) produced by
Compensation.applyAllNeededCompensations, so the secondary store binding
survives to the final physical plan for standalone secondary-schema queries
as well as cross-schema joins.

The previous approach wrapped only at the dataAccessExpressions level after
compensation; the instanceof RecordQueryPlan check failed for logical
SelectExpression compensation wrappers, causing the secondary store to be
silently dropped.

RecordQueryStoreBindingPlan gains full proto serialization: adds
PRecordQueryStoreBindingPlan message (field 41 in PRecordQueryPlan.oneof),
implements toProto/toRecordQueryPlanProto/fromProto, and registers a
@AutoService(PlanDeserializer.class) Deserializer inner class. This
enables continuation support for queries spanning secondary schemas.
… time

TransactionBoundDatabase now populates storesBySchemaName from the
additional stores map on RecordStoreAndRecordContextTransaction, so that
loadRecordStore(schemaId) can return the secondary FDBRecordStoreBase
that RecordQueryStoreBindingPlan looks up at execution time.
…atalog metadata

LogicalOperator prefixes secondary-schema type names with the schema name
(e.g. "OTHER_SCHEMA.things") to avoid TypeRepository collisions when two
schemas define identically-named tables, while preserving the original
FDB record type name as the scan key.

CatalogMetaData.getTables/getColumns now enumerate all schemas in the
database when no schema filter is provided, enabling cross-schema metadata
inspection.
arnaud-lacurie added the enhancement (New feature or request) label on May 25, 2026
arnaud-lacurie changed the title from "Namespace secondary-schema proto types and enumerate all schemas in catalog metadata" to "Namespace secondary-schema proto types and enumerate all schemas in catalog metadata (cross-schema 4/6)" on May 25, 2026