Add depth_to_point_cloud operator and demo#1588
Conversation
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new DepthToPointCloudOp Holoscan operator that performs GPU-side deprojection of an organized depth image into an organized point cloud, with optional per-frame intrinsics and color passthrough. Includes Python bindings, a standalone CUDA golden-reference test, Python pybind tests, and a hardware-free synthetic demo application.
Changes:
- New
depth_to_point_cloudoperator (C++/CUDA core, pybind11 bindings, headers, CMake, metadata). - New standalone CUDA kernel test plus a Python binding test suite.
- New
depth_to_point_cloud_demoPython application (synthetic generator → operator → stats sink) wired into the top-level CMake.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| operators/depth_to_point_cloud/deproject.{hpp,cu} | CUDA kernel + launcher for depth deprojection. |
| operators/depth_to_point_cloud/depth_to_point_cloud.{hpp,cpp} | Holoscan operator wrapping the kernel, with optional intrinsics/color inputs. |
| operators/depth_to_point_cloud/python/{CMakeLists.txt,depth_to_point_cloud.cpp} | pybind11 bindings exposing the op to Python. |
| operators/depth_to_point_cloud/test/{CMakeLists.txt,test_deproject.cu} | Standalone nvcc-only golden-reference kernel test registered with CTest. |
| operators/depth_to_point_cloud/test_depth_to_point_cloud.py | Pytest checks for construction, ports, required args, and parameter acceptance. |
| operators/depth_to_point_cloud/{CMakeLists.txt,metadata.json,README.md} | Build, metadata, and documentation for the operator. |
| operators/CMakeLists.txt | Registers the new operator subdirectory. |
| applications/depth_to_point_cloud_demo/* | Synthetic Python demo, README, metadata, and CTest hook. |
| applications/CMakeLists.txt | Registers the demo application with operator dependency. |
Greptile SummaryThis PR introduces
Confidence Score: 5/5The operator is safe to merge; the CUDA kernel is correct, all three input ports synchronize their producers' streams before the kernel launch, and input shape/dtype validation is thorough. The kernel validity logic, stream-ordering, and shape guards are all correct. The only finding is a minor coverage gap in the golden reference function: it omits the isfinite(z) check the kernel has, so NaN depth with a non-NaN invalid_value would go untested. This does not affect correctness of the operator itself. The test/test_deproject.cu golden function could be tightened to match the kernel's isfinite guard, but no correctness issues were found in any production-path file. Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Gen as SyntheticDepthGeneratorOp
participant Op as DepthToPointCloudOp
participant Kernel as deproject_kernel (CUDA)
participant Sink as PointCloudStatsOp / HolovizOp
Gen->>Op: depth (float32/uint16 [H,W])
Gen->>Op: color (uint8 [H,W,3])
Note over Op: receive_cuda_stream("depth") → stream<br/>receive_cuda_stream("intrinsics") if connected<br/>receive_cuda_stream("color") if connected
Op->>Op: validate shape, dtype, color dims
opt intrinsics port connected
Op->>Op: "cudaMemcpyAsync intrinsics [4×float32]<br/>cudaStreamSynchronize(stream)"
end
Op->>Op: allocate float32[H,W,3] xyz via BlockMemoryPool
Op->>Op: allocate uint8[H,W,3] colors via BlockMemoryPool
Op->>Kernel: launch_deproject(depth, color, out_xyz, out_color, stream)
Kernel-->>Op: cudaPeekAtLastError()
Op->>Sink: "Entity { point_cloud: float32[H,W,3], colors: uint8[H,W,3] }"
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant Gen as SyntheticDepthGeneratorOp
participant Op as DepthToPointCloudOp
participant Kernel as deproject_kernel (CUDA)
participant Sink as PointCloudStatsOp / HolovizOp
Gen->>Op: depth (float32/uint16 [H,W])
Gen->>Op: color (uint8 [H,W,3])
Note over Op: receive_cuda_stream("depth") → stream<br/>receive_cuda_stream("intrinsics") if connected<br/>receive_cuda_stream("color") if connected
Op->>Op: validate shape, dtype, color dims
opt intrinsics port connected
Op->>Op: "cudaMemcpyAsync intrinsics [4×float32]<br/>cudaStreamSynchronize(stream)"
end
Op->>Op: allocate float32[H,W,3] xyz via BlockMemoryPool
Op->>Op: allocate uint8[H,W,3] colors via BlockMemoryPool
Op->>Kernel: launch_deproject(depth, color, out_xyz, out_color, stream)
Kernel-->>Op: cudaPeekAtLastError()
Op->>Sink: "Entity { point_cloud: float32[H,W,3], colors: uint8[H,W,3] }"
Reviews (12): Last reviewed commit: "fix --visualize: use (N,3) points_3d sha..." | Re-trigger Greptile |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a CUDA deprojection kernel and launch API, a Holoscan C++ operator with CUDA-backed compute and Python bindings, CUDA golden and PyTest unit tests, CMake registration and build wiring, operator/demo README and metadata, and a synthetic GPU demo application with CLI and test registration. ChangesDepth to Point Cloud Operator and Demo
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@applications/depth_to_point_cloud_demo/depth_to_point_cloud_demo.py`:
- Around line 137-138: The focal-length fy is incorrectly computed from
self._width causing vertical scaling; update the assignment for fy in
depth_to_point_cloud_demo.py to derive from image height (use
float(self._height) * 0.8) while keeping fx as float(self._width) * 0.8 so fx
and fy reflect width and height respectively; modify the line that sets fy
(currently using self._width) to use self._height instead and ensure types
remain float.
- Around line 154-162: In main(), validate CLI values before constructing
DepthToPointCloudDemoApp: check args.frames, args.width, and args.height are
positive integers (>0) and if not call parser.error(...) (or raise SystemExit
with a clear message) referencing the offending argument; perform these checks
immediately after args = parser.parse_args() and only build app =
DepthToPointCloudDemoApp(...) when all values pass validation so array
allocation/scheduling errors are avoided.
In `@applications/depth_to_point_cloud_demo/metadata.json`:
- Around line 27-28: The first tag in the metadata.json "tags" array is an
unapproved top-level category ("Point Cloud"); update the tags array so the
first element is one of the approved top-level categories (e.g., "Robotics",
"Computer Vision", "3D", etc.) and move "Point Cloud" to a subsequent position
(e.g., ["Robotics", "Point Cloud", "Depth", "3D"]); ensure the "tags" array
order now has a valid approved category as the first entry so the category rule
is satisfied.
- Around line 37-39: The run.command in metadata.json incorrectly uses the
<holohub_app_source> placeholder; update run.command to point to the built app
path relative to the declared workdir ("holohub_bin") instead of a source
placeholder. Locate the "run" object and replace "python3
<holohub_app_source>/depth_to_point_cloud_demo.py" with a command that invokes
the script from the build-workspace layout under holohub_bin (e.g., the relative
path to the built depth_to_point_cloud_demo.py), ensuring the path matches the
build output location used by the project.
In `@operators/depth_to_point_cloud/deproject.cu`:
- Around line 75-86: The switch over dtype (DepthDType) in the deprojection
launch code does not handle unexpected enum values, so add an explicit default
branch in the switch that catches invalid DepthDType values (when dtype is not
kUint16 or kFloat32) and returns a non-success CUDA error (e.g., return
cudaErrorInvalidValue) or logs and throws as appropriate; update the function
surrounding the deproject_kernel launches to propagate that error instead of
falling through to a cudaSuccess return.
- Around line 67-73: Validate color_channels before casting the input color
pointer: instead of treating any non-4 value as 3, explicitly check for
color_channels == 4 and color_channels == 3 when assigning color4 or color3, and
handle other values as an error path (e.g., skip color processing, set out_color
to default, or return/log an error). Update the block that inspects color,
out_color, color_channels and assigns color4/color3 to only perform static_cast
when the channel count matches, and add a fallback branch for invalid channel
counts to avoid misinterpreting layout and corrupt reads.
In `@operators/depth_to_point_cloud/depth_to_point_cloud.cpp`:
- Around line 120-124: The current code accepts any depth tensor rank >=2 and
ignores extra trailing dimensions; change the rank check in the
DepthToPointCloudOp routine to only allow shape.size() == 2 or shape.size() == 3
and, if 3, require shape[2] == 1; otherwise throw a std::runtime_error
indicating unsupported depth tensor shape. Use the same shape vector to compute
height and width (height = static_cast<int>(shape[0]); width =
static_cast<int>(shape[1])) after the validation so higher-rank tensors like
[H,W,3] are rejected rather than silently processed.
- Around line 127-142: The code can proceed with fx_/fy_ == 0.0 when no
"intrinsics" input is provided, causing divide-by-zero in the deprojection
kernel; after constructing CameraIntrinsics intr (the
CameraIntrinsics{fx_.get(), fy_.get(), cx_.get(), cy_.get()} and after the
override from op_input.receive), add a sanity check that intr.fx and intr.fy are
non-zero (or abs(value) > small epsilon, e.g. 1e-6) and throw a
std::runtime_error (similar style to existing errors) if they are invalid; place
this check in depth_to_point_cloud.cpp before any CUDA memcpy/stream work or
before the kernel launch so the operator fails fast with a clear message
referencing fx/fy.
- Around line 147-160: The color-tensor validation is incomplete: ensure the
tensor has an explicit channel dimension and correct dtype before treating it as
uchar3/uchar4. In the block where maybe_color is handled (use get_tensor(),
color_tensor, color_tensor_name_, color_channels, cshape, color_ptr), require
cshape.size() >= 3 and that static_cast<int>(cshape[2]) is 3 or 4, verify the
tensor dtype is uint8 (or the operator's expected unsigned 8-bit type) and that
H and W match depth; only after these checks set color_channels and assign
color_ptr. Add clear runtime_errors when the channel dimension is missing or
dtype is wrong.
In `@operators/depth_to_point_cloud/metadata.json`:
- Line 26: The first entry in the tags array currently uses "Point Cloud" which
is not an approved category; update the tags array so the first tag is an
approved category (e.g., "Robotics" or "Computer Vision and Perception") and
leave the remaining tags ("Point Cloud", "Depth", "3D", "Camera") after it,
ensuring the "tags" array's first element matches the approved category list in
metadata.json.
In `@operators/depth_to_point_cloud/test/test_deproject.cu`:
- Around line 66-166: Add explicit CUDA error checks after every CUDA API call
and kernel launch in this test: check return values of
cudaMalloc/cudaMemcpy/cudaFree and verify kernel launches by capturing
launch_deproject's returned cudaError_t (or call cudaGetLastError()) and
checking cudaDeviceSynchronize() results; update all places where
launch_deproject is invoked (Cases 1–4) to store/validate its return, and insert
CHECK(...) or equivalent assertions after each
cudaMalloc/cudaMemcpy/cudaDeviceSynchronize/cudaFree to fail fast on
allocation/copy/sync errors (refer to functions/variables: launch_deproject,
cudaMalloc, cudaMemcpy, cudaDeviceSynchronize, cudaFree, and the CHECK macro).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0d9ac9ec-6e51-491a-935d-cf9230e6e59b
📒 Files selected for processing (18)
applications/CMakeLists.txtapplications/depth_to_point_cloud_demo/CMakeLists.txtapplications/depth_to_point_cloud_demo/README.mdapplications/depth_to_point_cloud_demo/depth_to_point_cloud_demo.pyapplications/depth_to_point_cloud_demo/metadata.jsonoperators/CMakeLists.txtoperators/depth_to_point_cloud/CMakeLists.txtoperators/depth_to_point_cloud/README.mdoperators/depth_to_point_cloud/deproject.cuoperators/depth_to_point_cloud/deproject.hppoperators/depth_to_point_cloud/depth_to_point_cloud.cppoperators/depth_to_point_cloud/depth_to_point_cloud.hppoperators/depth_to_point_cloud/metadata.jsonoperators/depth_to_point_cloud/python/CMakeLists.txtoperators/depth_to_point_cloud/python/depth_to_point_cloud.cppoperators/depth_to_point_cloud/test/CMakeLists.txtoperators/depth_to_point_cloud/test/test_deproject.cuoperators/depth_to_point_cloud/test_depth_to_point_cloud.py
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
operators/depth_to_point_cloud/test/test_deproject.cu (1)
66-168:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winComplete the CUDA error-checking coverage.
Progress since the last review: Cases 2–4 now check
launch_deprojectreturn values andcudaDeviceSynchronize. However, the following gaps remain:
- Case 1 (line 72):
cudaDeviceSynchronize()result is not checked.- All cases:
cudaMallocandcudaMemcpycalls are unchecked; silent allocation or copy failures could produce misleading test results.Pattern to complete the checks
Case 1 example (lines 66–74):
- cudaMalloc(&d_depth, N * sizeof(float)); - cudaMalloc(&d_xyz, N * sizeof(float3)); - cudaMemcpy(d_depth, depth.data(), N * sizeof(float), cudaMemcpyHostToDevice); + CHECK(cudaMalloc(&d_depth, N * sizeof(float)) == cudaSuccess, "Case1 cudaMalloc d_depth"); + CHECK(cudaMalloc(&d_xyz, N * sizeof(float3)) == cudaSuccess, "Case1 cudaMalloc d_xyz"); + CHECK(cudaMemcpy(d_depth, depth.data(), N * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess, + "Case1 H2D depth"); cudaError_t err = launch_deproject(d_depth, DepthDType::kFloat32, 1.0f, k, dmin, dmax, invalid, nullptr, 0, d_xyz, nullptr, W, H, 0); CHECK(err == cudaSuccess, "Case1 launch returned error"); - cudaDeviceSynchronize(); + CHECK(cudaDeviceSynchronize() == cudaSuccess, "Case1 sync"); std::vector<float3> out(N); - cudaMemcpy(out.data(), d_xyz, N * sizeof(float3), cudaMemcpyDeviceToHost); + CHECK(cudaMemcpy(out.data(), d_xyz, N * sizeof(float3), cudaMemcpyDeviceToHost) == cudaSuccess, + "Case1 D2H xyz");Apply the same pattern to Cases 2–4 for their
cudaMallocandcudaMemcpycalls.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@operators/depth_to_point_cloud/test/test_deproject.cu` around lines 66 - 168, The tests are missing CUDA error checks: ensure every cudaMalloc and cudaMemcpy in all cases (variables d_depth, d_xyz, d_color, d_outcolor, etc.) checks the returned cudaError_t and fails via CHECK on error, and in Case 1 also check the result of cudaDeviceSynchronize (like you do in Cases 2–4); keep existing launch_deproject checks as-is but add the same CHECK(cudaDeviceSynchronize() == cudaSuccess, ...) and CHECK(err == cudaSuccess, ...) pattern around each cudaMalloc/cudaMemcpy and after the cudaDeviceSynchronize calls to guarantee allocation/copy/sync failures are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@operators/depth_to_point_cloud/test/test_deproject.cu`:
- Around line 66-168: The tests are missing CUDA error checks: ensure every
cudaMalloc and cudaMemcpy in all cases (variables d_depth, d_xyz, d_color,
d_outcolor, etc.) checks the returned cudaError_t and fails via CHECK on error,
and in Case 1 also check the result of cudaDeviceSynchronize (like you do in
Cases 2–4); keep existing launch_deproject checks as-is but add the same
CHECK(cudaDeviceSynchronize() == cudaSuccess, ...) and CHECK(err == cudaSuccess,
...) pattern around each cudaMalloc/cudaMemcpy and after the
cudaDeviceSynchronize calls to guarantee allocation/copy/sync failures are
caught.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0ef060d9-b50f-4db2-8bdc-7906198b4c69
📒 Files selected for processing (5)
applications/depth_to_point_cloud_demo/CMakeLists.txtoperators/depth_to_point_cloud/deproject.cuoperators/depth_to_point_cloud/depth_to_point_cloud.cppoperators/depth_to_point_cloud/python/depth_to_point_cloud.cppoperators/depth_to_point_cloud/test/test_deproject.cu
🚧 Files skipped from review as they are similar to previous changes (4)
- applications/depth_to_point_cloud_demo/CMakeLists.txt
- operators/depth_to_point_cloud/deproject.cu
- operators/depth_to_point_cloud/python/depth_to_point_cloud.cpp
- operators/depth_to_point_cloud/depth_to_point_cloud.cpp
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
Signed-off-by: Nitheesh Kumar <snknitheesh@gmail.com>
Signed-off-by: Nitheesh Kumar <31176710+snknitheesh@users.noreply.github.com>
bhashemian
left a comment
There was a problem hiding this comment.
@snknitheesh thanks for this interesting contribution. I reviewed and left some comments inline. I'd appreciate it if you could address them.
| operator. It generates a synthetic organized depth image (a gently tilting plane) plus an aligned | ||
| RGB image entirely on the GPU, deprojects it into an organized `H x W x 3` point cloud, and reports | ||
| the valid-point count and Z range each frame. No camera or dataset is required. | ||
|
|
There was a problem hiding this comment.
Could you please add a splash image for this application? It could be anything that can represent this application, such as diagram, screenshot, or any elaborated illustration. Thanks
Signed-off-by: snknitheesh <snknitheesh@gmail.com>
Signed-off-by: snknitheesh <snknitheesh@gmail.com>
Summary
Adds
depth_to_point_cloud, a GPU operator that deprojects an organized depth image into an organized point cloud, plus a hardware-free demo application.It turns a depth image (from a depth camera, stereo matcher, or monocular depth network) plus pinhole intrinsics into per-pixel
XYZpoints in the camera optical frame — all GPU-resident, zero-copy on the hot path, via a single custom CUDA kernel (one thread per pixel). It supportsuint16andfloat32depth, an optional per-frame intrinsics override, and optionalXYZRGBcolored output. It is sensor-agnostic — no dependency on any specific camera SDK.What's included
operators/depth_to_point_cloud/— C++/CUDA operator (DepthToPointCloudOp) with pybind11 Python bindingstest/test_deproject.cu, registered with CTest) and apytestbinding test (test_depth_to_point_cloud.py)applications/depth_to_point_cloud_demo/— a synthetic, hardware-free demo (CI-friendly)Ports
depthuint16orfloat32,[H, W]/[H, W, 1], deviceintrinsicsfloat32[4][fx, fy, cx, cy](per-frame override)coloruint8[H, W, 3]or[H, W, 4]aligned to depthpoint_cloudfloat32 [H, W, 3](+colorsuint8 [H, W, 3]whencoloris connected)Testing
pytestbinding test passes (construction, port wiring, error handling, parameters)./holohub lintgate passes (black, isort, ruff, cpplint, markdownlint, metadata schema, cmakelint, codespell, copyright)Checklist
./holohub lintpassesREADME.mdandmetadata.jsonprovided for both operator and demoadd_holohub_operator/add_holohub_applicationSummary by CodeRabbit
New Features
Documentation
Tests
Chores