Added GPU interfaces for shortest paths algorithm #3622

Draft
olegkkruglov wants to merge 71 commits into uxlfoundation:main from olegkkruglov:sp-gpu

Conversation

olegkkruglov commented May 6, 2026

Contributor

Description

Contains content from #3482, #3620, and #3621, and has the same CI problems.

  • GPU kernel — Created backend/gpu/traverse_default_kernel_gpu_dpc.cpp with a label-correcting SSSP algorithm using frontier advance. Includes a weighted_csr_graph_device_wrapper that wraps int64 row offsets + int32 column indices + typed edge weights for the frontier API. Distances are relaxed via sycl::atomic_ref::fetch_min, with optional predecessor tracking.
  • GPU kernel headers — Created backend/gpu/shortest_paths.hpp (kernel struct declaration) and backend/gpu/traverse_kernel.hpp (GPU view struct + run_shortest_paths_gpu declaration).
  • DPC++ dispatch — Created select_kernel_dpc.cpp that dispatches to CPU (host policy) or GPU (context_gpu) via dispatch_by_device. Extracts topology and edge values from the directed graph, passes to the GPU kernel. Explicit instantiations for both int32 and double edge weight types.
  • DPC++ specialization in select_kernel.hpp — Added #ifdef ONEDAL_DATA_PARALLEL block with backend_default<data_parallel_policy, ...> specialization.
  • Queue overloads — Added data_parallel_policy dispatcher specialization in traverse_ops.hpp (algo-level), DPC++ overloads in traverse_ops.hpp, and traverse(sycl::queue&, ...) overload in traverse.hpp.
  • GPU test — Created test/shortest_paths_gpu_test.cpp with 4 test cases: isolated vertices (double and int32), one-bucket graph, and a 21-vertex graph. Double tests gracefully skip on devices without fp64 support.
  • DPC++ example — Created shortest_paths_batch.cpp that runs SSSP on all available SYCL devices, with fp64 detection and skip.
  • BUILD updates — Added frontier primitive dependency, split tests into CPU-only and GPU suites, added shortest_paths to DPC++ examples BUILD.
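
For readers unfamiliar with the approach, the label-correcting scheme described in the GPU kernel item can be sketched on the host roughly as follows. This is not the code from this PR: it is a serial, plain C++ illustration in which `csr_graph_sketch`, `atomic_fetch_min_sketch`, `sssp_result_sketch`, and `sssp_label_correcting_sketch` are hypothetical names, `std::atomic` with a compare-exchange loop stands in for `sycl::atomic_ref::fetch_min`, and a simple vector stands in for the frontier primitive. It assumes non-negative edge weights.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical host-side stand-in for a weighted CSR view:
// int64 row offsets, int32 column indices, typed edge weights.
template <typename Weight>
struct csr_graph_sketch {
    std::vector<std::int64_t> row_offsets; // size = vertex_count + 1
    std::vector<std::int32_t> col_indices; // size = edge_count
    std::vector<Weight> edge_weights;      // size = edge_count
};

// Emulates an atomic fetch_min with a compare-exchange loop.
// Returns the value stored before the call, like fetch_min does.
template <typename T>
T atomic_fetch_min_sketch(std::atomic<T>& target, T candidate) {
    T observed = target.load();
    while (candidate < observed && !target.compare_exchange_weak(observed, candidate)) {
        // observed is refreshed by compare_exchange_weak on failure
    }
    return observed;
}

template <typename Weight>
struct sssp_result_sketch {
    std::vector<Weight> distances;          // max() means unreachable
    std::vector<std::int32_t> predecessors; // -1 means no predecessor
};

// Label-correcting SSSP: relax all edges leaving the current frontier and
// put every vertex whose distance improved into the next frontier.
template <typename Weight>
sssp_result_sketch<Weight> sssp_label_correcting_sketch(const csr_graph_sketch<Weight>& g,
                                                        std::int32_t source) {
    const std::size_t n = g.row_offsets.size() - 1;
    assert(source >= 0 && static_cast<std::size_t>(source) < n);
    const Weight inf = std::numeric_limits<Weight>::max();

    std::vector<std::atomic<Weight>> dist(n);
    for (auto& d : dist) d.store(inf);
    dist[source].store(Weight(0));
    std::vector<std::int32_t> pred(n, -1);

    std::vector<std::int32_t> frontier{ source };
    while (!frontier.empty()) {
        std::vector<char> queued(n, 0); // avoids duplicates in the next frontier
        std::vector<std::int32_t> next;
        for (std::int32_t u : frontier) {
            const Weight du = dist[u].load();
            for (std::int64_t e = g.row_offsets[u]; e < g.row_offsets[u + 1]; ++e) {
                const std::int32_t v = g.col_indices[e];
                const Weight candidate = du + g.edge_weights[e];
                if (candidate < atomic_fetch_min_sketch(dist[v], candidate)) {
                    pred[v] = u; // safe here only because this sketch is serial
                    if (!queued[v]) {
                        queued[v] = 1;
                        next.push_back(v);
                    }
                }
            }
        }
        frontier.swap(next);
    }

    sssp_result_sketch<Weight> result;
    result.predecessors = pred;
    result.distances.reserve(n);
    for (auto& d : dist) result.distances.push_back(d.load());
    return result;
}
```

Because a vertex re-enters the frontier whenever its distance improves, distances can be corrected several times before the frontier drains, which is what distinguishes this from a label-setting method such as Dijkstra. In a parallel kernel the predecessor write would need its own synchronization; the sketch sidesteps that by running serially.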

Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
  • I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

…consistency; update related functionality; fixed bug for SIGSEGV
- Removed the existing frontier_dpc.hpp file to streamline the codebase.
- Introduced new test files for advance operation, BFS, and basic frontier operations.
- Implemented comprehensive tests to validate the functionality of the frontier data structure.
- Enhanced the frontier class with additional methods for better performance and usability.
- Ensured compatibility with SYCL and improved device memory management.
…ble declarations in BitmapKernel and frontier_dpc implementations
…sentation and update test case names for clarity
…t32_t> for improved performance and memory efficiency
olegkkruglov force-pushed the sp-gpu branch 2 times, most recently from 587db32 to b70f36d, May 12, 2026 15:24
olegkkruglov added the "new algorithm" (New algorithm or method in oneDAL) and "interfaces" (Update interfaces) labels May 12, 2026
Labels

interfaces (Update interfaces), new algorithm (New algorithm or method in oneDAL)


2 participants