Skip to content

N+1 query patterns in __make_cre_links and __get_all_nodes_and_cres #848

@shiwani42

Description

@shiwani42

Issue

What is the issue?

Two internal data-access methods issue a growing number of database queries as the dataset scales, creating N+1 query patterns.

__make_cre_links fetches all Links rows for a CRE in one query, then issues a separate SELECT for each linked Node. For a CRE with L links this is 1 + L queries. Since __make_cre_links is called from get_CREs for every CRE being loaded, the total across M CREs becomes M x (1 + L).

__get_all_nodes_and_cres fetches all primary keys first, then re-queries each record individually. For CREs this chains into get_cre_by_db_id, which adds a further SELECT external_id FROM cre WHERE id = ? per CRE before calling get_CREs. Each CRE costs at minimum three round-trips before __make_cre_links is even reached. The same IDs-first pattern is present in the node branch.

__get_all_nodes_and_cres is called on every with_graph() invocation, which loads the in-memory CRE graph, making this the most impactful path.

Expected Behaviour

__make_cre_links should resolve all links and their associated nodes in a single JOIN query rather than one query per node.

__get_all_nodes_and_cres should fetch full ORM objects directly instead of fetching IDs first and re-querying each record individually.

Actual Behaviour

__make_cre_links issues 1 + N queries per CRE call where N is the number of linked nodes.

__get_all_nodes_and_cres issues multiple redundant round-trips per record due to the IDs-first fetch pattern and the get_cre_by_db_id chain.

Steps to reproduce

The behaviour can be observed by enabling SQLAlchemy query logging (SQLALCHEMY_ECHO=True) and calling any endpoint that triggers with_graph() or get_CREs() on a dataset with a reasonable number of CREs and links.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions