Dynamic and Batch Support for Object Detector #29

Open
kevchengcodes wants to merge 3 commits into apple:main from kevchengcodes:obj_detr_dynamic

@kevchengcodes

Summary

Enables the ObjectDetector to run dynamic models as well as static models with batch_size > 1.

Changes include resolving dynamic dimensions similar to how llm-runner does it, constructing a batched input tensor based on the list of input images, and post-processing each output.

Models can have a dynamic batch size and dynamic image input height/width. Note that batched images are resized to a common size when constructing the (B,C,H,W) input tensor. Users can optionally specify a custom input height/width for their dynamic batch.
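
To make the (B,C,H,W) layout concrete, here is a minimal sketch of packing already-resized images into one contiguous batch buffer. `PlanarImage` and `packBatch` are hypothetical names introduced for illustration; they are not the PR's types, and the sketch assumes each image is already resized to the common size and stored in channel-major (CHW) order.

```swift
/// Hypothetical planar image: `pixels` holds C*H*W floats in CHW order.
struct PlanarImage {
    let channels: Int
    let height: Int
    let width: Int
    let pixels: [Float]
}

/// Packs same-sized images into a flat buffer laid out as (B, C, H, W).
/// Returns nil for an empty list or when the images disagree on size.
func packBatch(_ images: [PlanarImage]) -> (shape: [Int], data: [Float])? {
    guard let first = images.first else { return nil }
    let elementCount = first.channels * first.height * first.width
    let sameSize = images.allSatisfy {
        $0.channels == first.channels
            && $0.height == first.height
            && $0.width == first.width
            && $0.pixels.count == elementCount
    }
    guard sameSize else { return nil }

    var data: [Float] = []
    data.reserveCapacity(images.count * elementCount)
    // Image b occupies the slice [b * C*H*W, (b + 1) * C*H*W) of the buffer.
    for image in images {
        data.append(contentsOf: image.pixels)
    }
    return ([images.count, first.channels, first.height, first.width], data)
}

let a = PlanarImage(channels: 1, height: 1, width: 2, pixels: [1, 2])
let b = PlanarImage(channels: 1, height: 1, width: 2, pixels: [3, 4])
let packed = packBatch([a, b])
```

Because every image shares one size, element (b, c, h, w) sits at offset `((b * C + c) * H + h) * W + w`, which is why the resize to a common size has to happen before packing.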

Verification

Tested with YOLOS:

uv run models/yolo/export.py --dynamic
swift run -c release object-detector --model exports/yolos-base_float32_dynamic.aimodel --image img.png --output-image out.png

Added new unit tests to validate that batch support works.

@kevchengcodes kevchengcodes self-assigned this Jun 12, 2026
let imageArray = NDArray(descriptor: imageDescriptor)
_ = try await function.run(inputs: [imageInputName: imageArray])
let defaults = DetectionParameters()
let warmupShape = zip(imageDescriptor.shape, [1, 3, defaults.inputHeight, defaults.inputWidth])

Contributor

Here we warm up with the default shape, not the shape the caller will actually use at inference time. We should check whether this is expected: kernels may get compiled for the default shape while inference runs at a different shape, which would need a different kernel, so the warmup cost would be wasted.

Maybe warmup() should accept the parameters you detect (DetectionParameters, or a BatchPlan?) so we compile for the right shape

Author

Updated to use the same shape during warmup
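
The concern in this thread can be illustrated with a toy model. The sketch below assumes the runtime compiles one kernel per concrete input shape, which is the reviewer's hypothesis rather than a confirmed property of the runtime. `KernelCache` and the shape values are hypothetical.

```swift
// Toy model of a runtime that compiles one kernel per concrete input shape.
// KernelCache is hypothetical and stands in for whatever the real runtime does.
struct KernelCache {
    var compiledShapes: Set<[Int]> = []

    /// Runs at `shape`; returns true when a new kernel had to be compiled first.
    mutating func run(shape: [Int]) -> Bool {
        return compiledShapes.insert(shape).inserted
    }
}

let defaultShape = [1, 3, 512, 512]    // assumed default shape, illustration only
let inferenceShape = [4, 3, 480, 640]  // assumed shape used at inference time

var cache = KernelCache()
_ = cache.run(shape: defaultShape)                   // warmup at the default shape
let wastedWarmup = cache.run(shape: inferenceShape)  // true: inference still compiles

var fixed = KernelCache()
_ = fixed.run(shape: inferenceShape)                 // warmup at the inference shape
let coldStart = fixed.run(shape: inferenceShape)     // false: kernel already compiled
```

Under that assumption, warming up at the inference shape is the only variant in which the first real call avoids a compile.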


let plan = try Self.planBatch(
expectedShape: expectedShape,
imageSizes: images.map { CGSize(width: $0.width, height: $0.height) },

Contributor

Do we actually use the width and height here? We seem to use only .count for batching, while H and W always come from expectedShape in the static case or from parameters.inputHeight/inputWidth in the dynamic case.

If you think this is expected, we should change the function signature :)

If not, maybe we should use them?

Author

This is expected, so I changed the function signature to show we're just using count
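
A rough sketch of what a count-based signature could look like, following the behavior described in this thread: H and W come from expectedShape when static and from the caller's parameters when dynamic. This is not the PR's actual `planBatch`. The `BatchPlan` fields, the use of -1 to mark a dynamic dimension, and the rule that a static batch rejects extra images are all assumptions made for illustration.

```swift
/// Hypothetical plan type; the real BatchPlan may carry different fields.
struct BatchPlan: Equatable {
    let batchSize: Int
    let height: Int
    let width: Int
}

/// Resolves a (B, C, H, W) shape. Dynamic dimensions are assumed to be -1.
/// Only the number of images matters; their individual sizes are ignored
/// because every image is resized to the planned height and width.
func planBatch(
    expectedShape: [Int],
    imageCount: Int,
    inputHeight: Int,
    inputWidth: Int
) -> BatchPlan? {
    guard expectedShape.count == 4, imageCount > 0 else { return nil }

    // Dynamic batch: follow the image count. Static batch: keep the model's value.
    let batch = expectedShape[0] < 0 ? imageCount : expectedShape[0]
    // Assumption: a static batch cannot hold more images than it was exported with.
    guard imageCount <= batch else { return nil }

    return BatchPlan(
        batchSize: batch,
        height: expectedShape[2] < 0 ? inputHeight : expectedShape[2],
        width: expectedShape[3] < 0 ? inputWidth : expectedShape[3]
    )
}

let dynamicPlan = planBatch(
    expectedShape: [-1, 3, -1, -1], imageCount: 2, inputHeight: 480, inputWidth: 640)
let staticPlan = planBatch(
    expectedShape: [4, 3, 512, 512], imageCount: 2, inputHeight: 480, inputWidth: 640)
```

Taking `imageCount: Int` instead of a list of sizes makes it visible at the call site that per-image dimensions play no part in the plan.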

@carinapeng carinapeng self-requested a review June 12, 2026 21:38

@carinapeng carinapeng left a comment

Contributor

Thank you!


2 participants