
Dev/asolovev dbscan ext #3623

Draft
Alexandr-Solovev wants to merge 38 commits into uxlfoundation:main from Alexandr-Solovev:dev/asolovev_dbscan_ext

@Alexandr-Solovev

Contributor

Description


Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended the testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
  • I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

Alexandr-Solovev and others added 30 commits April 8, 2026 01:07
…GPU tests

- Split monolithic single_task extract_clusters into 5 GPU kernels:
  K1 (single_task): Kruskal dendrogram via union-find
  K2 (parallel_for): Init node arrays from dendrogram
  K3 (single_task): Condensed tree + EOM stability selection
  K4 (parallel_for): Build dendro_parent from edges
  K5 (parallel_for): Label each point independently
- Add Method template parameter to DAAL kernel (follows standard pattern)
- Add hdbscan_types.h with Method enum (defaultDense)
- Add CPU vs GPU comparison tests (permutation-invariant partition check)
- Remove unused engines dependency from DAAL BUILD
- Add scripts/ to .gitignore
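
As a rough illustration of the K1 step above (Kruskal dendrogram via union-find), here is a minimal host-side sketch in plain C++. It is not the PR's SYCL kernel: the `MstEdge` and `Merge` structs, the `UnionFind` class, and `build_dendrogram` are hypothetical names chosen for this example, and the assumed node numbering (points are `0..n-1`, the i-th merge creates node `n + i`) is one common convention rather than something the commit message specifies.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical types for illustration only; not the oneDAL data layout.
struct MstEdge {
    std::size_t u;
    std::size_t v;
    double weight;
};

struct Merge {
    std::size_t left;   // dendrogram node id of the first merged cluster
    std::size_t right;  // dendrogram node id of the second merged cluster
    double distance;    // MST edge weight at which the two clusters merge
    std::size_t size;   // number of points in the newly created cluster
};

// Minimal union-find with path halving.
class UnionFind {
public:
    explicit UnionFind(std::size_t n) : parent_(n) {
        std::iota(parent_.begin(), parent_.end(), std::size_t{ 0 });
    }

    std::size_t find(std::size_t x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]]; // point x at its grandparent
            x = parent_[x];
        }
        return x;
    }

    void link(std::size_t child_root, std::size_t parent_root) {
        parent_[child_root] = parent_root;
    }

private:
    std::vector<std::size_t> parent_;
};

// Processes MST edges in ascending weight order. Each accepted edge merges two
// clusters and creates a new dendrogram node with id n_points + merge_index.
std::vector<Merge> build_dendrogram(std::size_t n_points, std::vector<MstEdge> edges) {
    std::sort(edges.begin(), edges.end(), [](const MstEdge& a, const MstEdge& b) {
        return a.weight < b.weight;
    });

    UnionFind uf(n_points);
    std::vector<std::size_t> cluster_id(n_points); // node id currently held by each root
    std::iota(cluster_id.begin(), cluster_id.end(), std::size_t{ 0 });
    std::vector<std::size_t> cluster_size(n_points, 1);

    std::vector<Merge> merges;
    merges.reserve(edges.size());
    std::size_t next_id = n_points;

    for (const MstEdge& e : edges) {
        const std::size_t ru = uf.find(e.u);
        const std::size_t rv = uf.find(e.v);
        if (ru == rv) {
            continue; // not expected for a valid MST; guards against malformed input
        }
        const std::size_t size = cluster_size[ru] + cluster_size[rv];
        merges.push_back({ cluster_id[ru], cluster_id[rv], e.weight, size });
        uf.link(rv, ru);
        cluster_id[ru] = next_id++;
        cluster_size[ru] = size;
    }
    return merges;
}
```

The loop is inherently sequential because every merge depends on the union-find state left by the previous one, which is consistent with K1 being listed as a `single_task` rather than a `parallel_for`.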

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
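
The "permutation-invariant partition check" mentioned for the CPU vs GPU comparison tests can be sketched as follows. The idea is that two label arrays describe the same clustering when cluster ids map one-to-one onto each other, even if the ids themselves differ. The function name `same_partition` is hypothetical, and treating `-1` as a noise label that must match exactly on both sides is an assumption of this sketch, not something stated in the commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Returns true when `a` and `b` induce the same partition of the points,
// i.e. there is a bijection between the label values used in `a` and in `b`.
// Assumption: -1 marks noise and is not allowed to be renamed.
bool same_partition(const std::vector<std::int32_t>& a,
                    const std::vector<std::int32_t>& b) {
    if (a.size() != b.size()) {
        return false;
    }
    std::unordered_map<std::int32_t, std::int32_t> a_to_b;
    std::unordered_map<std::int32_t, std::int32_t> b_to_a;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // A point must be noise in both results or in neither.
        if ((a[i] == -1) != (b[i] == -1)) {
            return false;
        }
        // emplace keeps the first mapping seen; any later mismatch breaks the bijection.
        const auto fwd = a_to_b.emplace(a[i], b[i]).first;
        const auto bwd = b_to_a.emplace(b[i], a[i]).first;
        if (fwd->second != b[i] || bwd->second != a[i]) {
            return false;
        }
    }
    return true;
}
```

Checking the mapping in both directions matters: a forward-only map would accept a result that merges two reference clusters into one.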
Add push trigger and remove repository == 'uxlfoundation/oneDAL'
guards from both Linux and Windows jobs so the workflow runs on
every commit in fork branches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename defaultDense to bruteForceDense in DAAL Method enum (keep alias)
- Simplify hdbscan_cluster_utils.h: extract subfunctions from 500-line method
  (sortMstEdges, buildKruskalDendrogram, buildCondensedTree, selectClusters, labelPoints)
- Move convert_metric to shared compute_kernel_common.hpp header
- Replace row_accessor with BlockDescriptor in CPU kernels where data is already DAAL table
- Remove _on_host postfix from CPU-only functions, replace std::vector with dal::array
- Split GPU build_condensed_tree_eom_kernel into two kernels for clarity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
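
For the first item of this commit (renaming `defaultDense` to `bruteForceDense` while keeping an alias), one way to keep the old name working is to give both enumerators the same value. The sketch below is only illustrative: the `hdbscan_sketch` namespace and the `method_id` helper are hypothetical, and the real contents of `hdbscan_types.h` are not shown in this PR page.

```cpp
namespace hdbscan_sketch {

// Two enumerators with the same value: code written against the old name
// keeps compiling and selects the same template instantiation.
enum Method {
    bruteForceDense = 0,
    defaultDense = bruteForceDense // alias kept for backward compatibility
};

// Hypothetical helper standing in for a kernel templated on Method.
template <Method method>
constexpr int method_id() {
    return static_cast<int>(method);
}

static_assert(method_id<defaultDense>() == method_id<bruteForceDense>(),
              "alias must resolve to the same method");

} // namespace hdbscan_sketch
```

Because both names carry the same value, `method_id<defaultDense>` and `method_id<bruteForceDense>` are the same instantiation, so no duplicate kernel code would be generated for the alias.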
@homksei

homksei commented May 18, 2026

Contributor

/intelci: run

2 participants