refactor: split datacell factory interfaces#2002
Conversation
Move graph and flatten instance construction out of core interface headers so factory dependencies no longer fan out through widely included abstractions. This keeps datacell creation logic in dedicated factory units while preserving existing behavior and serialization paths. Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Merge ProtectionsYour pull request matches the following merge protections and will not be merged until they are valid. 🟢 Require kind labelWonderful, this rule succeeded.
🟢 Require version labelWonderful, this rule succeeded.
|
There was a problem hiding this comment.
Code Review
This pull request refactors the instantiation of FlattenInterface and GraphInterface by moving the factory logic from the interface classes into dedicated factory files (flatten_factory.cpp and graph_factory.cpp). This involves replacing static MakeInstance methods with standalone MakeFlattenInstance and MakeGraphInstance functions across the codebase. Feedback identifies a potential null pointer dereference in the graph factory and suggests refactoring the IO type selection logic in the flatten factory using a map to improve maintainability.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors datacell instance creation by extracting FlattenInterface/GraphInterface construction logic into dedicated factory headers/source files, reducing core header coupling and updating call sites accordingly.
Changes:
- Introduce
MakeFlattenInstanceandMakeGraphInstancein newflatten_factory.*andgraph_factory.*files. - Remove
MakeInstancestatic constructors fromflatten_interface.*andgraph_interface.*. - Update production and test code to include the new factory headers and call the new factory functions.
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/index/index_impl.h | Adds dataset_impl.h include after decoupling it from inner_index_interface.h. |
| src/index/hnsw_test.cpp | Switches test call sites to MakeFlattenInstance / MakeGraphInstance. |
| src/index/hnsw.h | Reduces header coupling by dropping logger include. |
| src/index/hnsw.cpp | Uses new factory functions and includes factory headers. |
| src/index/diskann.cpp | Uses MakeFlattenInstance and includes flatten factory headers. |
| src/impl/pruning_strategy_test.cpp | Updates tests to use new factory headers/functions. |
| src/impl/odescent/odescent_graph_builder_test.cpp | Updates tests to use new factory headers/functions. |
| src/datacell/sparse_vector_datacell_test.cpp | Updates tests to use MakeFlattenInstance. |
| src/datacell/sparse_graph_datacell_test.cpp | Updates tests to use MakeGraphInstance and adds factory include. |
| src/datacell/graph_interface.h | Removes static MakeInstance and related header dependencies. |
| src/datacell/graph_interface.cpp | Removes factory implementation from interface TU. |
| src/datacell/graph_factory.h | Adds new public graph factory API declaration. |
| src/datacell/graph_factory.cpp | Implements graph factory logic formerly in GraphInterface::MakeInstance. |
| src/datacell/graph_datacell_test.cpp | Updates tests to use MakeGraphInstance. |
| src/datacell/flatten_interface.h | Removes static MakeInstance and related header dependencies. |
| src/datacell/flatten_interface.cpp | Removes factory implementation from interface TU. |
| src/datacell/flatten_factory.h | Adds new public flatten factory API declaration. |
| src/datacell/flatten_factory.cpp | Implements flatten factory logic formerly in FlattenInterface::MakeInstance. |
| src/datacell/flatten_datacell_test.cpp | Updates tests to use MakeFlattenInstance. |
| src/datacell/compressed_graph_datacell_test.cpp | Updates tests to use MakeGraphInstance. |
| src/algorithm/warp.cpp | Uses MakeFlattenInstance for code storage creation. |
| src/algorithm/pyramid.h | Uses MakeFlattenInstance and includes factory headers. |
| src/algorithm/ivf.cpp | Uses MakeFlattenInstance and includes the flatten factory header. |
| src/algorithm/inner_index_interface.h | Removes dataset_impl.h to reduce coupling. |
| src/algorithm/hgraph.cpp | Uses both new factory functions for codes/graph creation. |
| src/algorithm/brute_force.cpp | Uses MakeFlattenInstance and includes factory header. |
| FlattenInterfacePtr | ||
| MakeFlattenInstance(const FlattenInterfaceParamPtr& param, const IndexCommonParam& common_param); |
| GraphInterfacePtr | ||
| MakeGraphInstance(const GraphInterfaceParamPtr& graph_param, const IndexCommonParam& common_param) { | ||
| switch (graph_param->graph_storage_type_) { | ||
| case GraphStorageTypes::GRAPH_STORAGE_TYPE_SPARSE: | ||
| return std::make_shared<SparseGraphDataCell>(graph_param, common_param); | ||
| case GraphStorageTypes::GRAPH_STORAGE_TYPE_VALUE_COMPRESSED: | ||
| return std::make_shared<CompressedGraphDataCell>(graph_param, common_param); | ||
| case GraphStorageTypes::GRAPH_STORAGE_TYPE_VALUE_FLAT: | ||
| auto io_string = std::dynamic_pointer_cast<GraphDataCellParameter>(graph_param) | ||
| ->io_parameter_->GetTypeName(); |
|
|
||
| #include <fmt/format.h> | ||
|
|
||
| #include "graph_factory.h" |
| @@ -13,204 +13,3 @@ | |||
| // limitations under the License. | |||
|
|
|||
Add the concrete parameter and common parameter headers at the translation units that still relied on transitive includes after the datacell factory split. This keeps the refactor intact while restoring CI builds across TSAN, CircleCI, and test targets. Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Run the repository formatter after adding the direct include dependencies introduced by the factory split follow-up. This keeps the CI-only include fixes aligned with the project include ordering rules. Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
| #include "datacell/flatten_datacell.h" | ||
| #include "datacell/flatten_datacell_parameter.h" | ||
| #include "datacell/flatten_factory.h" | ||
| #include "datacell/graph_datacell_parameter.h" | ||
| #include "datacell/graph_factory.h" |
Summary
Testing