Add cancel() and isBusy to InferenceEngine protocol #32

Open
stikves wants to merge 3 commits into apple:main from stikves:sukru/engine-cancel-api

@stikves stikves commented Jun 12, 2026

Contributor

Adds lifecycle management APIs to the InferenceEngine protocol so callers can gracefully cancel in-flight generation and query busy state.

All three engine implementations (Sequential, Pipelined, StaticShape) and MockEngine are updated with concrete implementations.

CoreAILanguageModel now calls cancel() before reset() to ensure clean state transitions.
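
A minimal sketch of the API surface described above, assuming the protocol requirements match the `isBusy` and `cancel()` declarations visible in the diff. The `Sendable` refinement, the `reset()` signature, `ToyEngine`, `begin()`, `history`, and `resetCleanly` are hypothetical names for illustration and are not taken from this PR.

```swift
import Foundation

// Assumed shape of the protocol: the two members this PR adds, plus a
// reset() whose signature is a guess.
protocol InferenceEngine: AnyObject, Sendable {
    var isBusy: Bool { get }
    func cancel() async throws
    func reset() async throws
}

// Toy engine that only records the order of lifecycle calls.
// It is not one of the real engines (Sequential, Pipelined, StaticShape).
final class ToyEngine: InferenceEngine, @unchecked Sendable {
    private let lock = NSLock()
    private var generating = false
    private var events: [String] = []

    // Synchronous helper so the async methods never touch the lock directly.
    private func record(_ event: String, generating newValue: Bool? = nil) {
        lock.lock(); defer { lock.unlock() }
        if let newValue = newValue { generating = newValue }
        events.append(event)
    }

    var isBusy: Bool {
        lock.lock(); defer { lock.unlock() }
        return generating
    }

    var history: [String] {
        lock.lock(); defer { lock.unlock() }
        return events
    }

    // Stand-in for starting a generation.
    func begin() { record("begin", generating: true) }
    func cancel() async throws { record("cancel", generating: false) }
    func reset() async throws { record("reset") }
}

// Hypothetical caller-side helper: cancel first, then reset, so reset()
// never runs while a generation is still marked as in flight.
func resetCleanly(_ engine: any InferenceEngine) async throws {
    try await engine.cancel()
    try await engine.reset()
}
```

Calling `resetCleanly` on an engine that is mid-generation should leave `isBusy` false and a recorded call order of begin, cancel, reset.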

@stikves stikves force-pushed the sukru/engine-cancel-api branch from 4e5d422 to f1a8bf3 Compare June 12, 2026 20:11
@stikves stikves self-assigned this Jun 12, 2026
Comment thread swift/Sources/CoreAILanguageModels/InferenceEngines/CoreAIPipelinedEngine.swift Outdated
Comment thread swift/Sources/CoreAILanguageModels/InferenceEngines/CoreAISequentialEngine.swift Outdated
@stikves stikves force-pushed the sukru/engine-cancel-api branch from 1fd5faa to 50386a3 Compare June 12, 2026 21:23
private func drain() {
public var isBusy: Bool { generating.withLock { $0 } }

public func cancel() async throws {

Contributor


My understanding is that cancel() in the static engine has a slightly different meaning than in the pipelined engine, and the sequential engine shares the static engine's definition here.

The pipelined engine cancels a real background Task, so cancel() actually stops the work. The static and sequential engines have no background Task: generation runs inside the consumer's next() calls, so generating only becomes false when next() runs and sees that a cancel was requested.

So cancel() here kind of means: we ask the consumer to stop if it keeps pulling. Should we restructure the static/sequential path so cancellation doesn't depend on the consumer making progress?

The main problem I see here is that cancelRequested is only checked inside next(), so cancel() means something semantically different in the pipelined engine than in the other engines.
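
To illustrate the concern, here is a minimal sketch of a pull-based engine where the cancel flag is only observed inside `next()`. It is a simplified, synchronous stand-in, not the code under review: `PullEngine`, `start()`, `requestCancel()`, and `pausedConsumerDemo` are hypothetical names, and the real `cancel()` is `async throws`.

```swift
import Foundation

// Hypothetical stand-in for the static/sequential shape: there is no
// background Task, so all work (and all flag checks) happen inside next().
final class PullEngine: @unchecked Sendable {
    private let lock = NSLock()
    private var generating = false
    private var cancelRequested = false
    private var produced = 0

    var isBusy: Bool {
        lock.lock(); defer { lock.unlock() }
        return generating
    }

    func start() {
        lock.lock(); defer { lock.unlock() }
        generating = true
        cancelRequested = false
        produced = 0
    }

    // Only records the request. Nothing acts on it until next() runs.
    func requestCancel() {
        lock.lock(); defer { lock.unlock() }
        cancelRequested = true
    }

    // Returns one fake token per call, or nil once the cancel is observed.
    func next() -> Int? {
        lock.lock(); defer { lock.unlock() }
        guard generating else { return nil }
        if cancelRequested {
            generating = false
            return nil
        }
        produced += 1
        return produced
    }
}

// Simulates a consumer that pauses after one token, then pulls again later.
func pausedConsumerDemo() -> (busyWhilePaused: Bool, busyAfterPull: Bool) {
    let engine = PullEngine()
    engine.start()
    _ = engine.next()                  // consumer pulls one token
    engine.requestCancel()             // cancel arrives while consumer is paused
    let busyWhilePaused = engine.isBusy
    _ = engine.next()                  // consumer finally pulls again
    return (busyWhilePaused, engine.isBusy)
}
```

In this sketch the engine still reports busy after the cancel request and only flips to idle once the consumer calls `next()` again, which is the dependency on consumer progress described above.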

@carinapeng carinapeng Jun 12, 2026

Contributor


To make this more concrete: if a consumer is paused and not calling next(), the check never runs, so cancel() in that case means throwing a timeout.

Contributor Author


Yes, that is a valid concern. The iterator and the engine are a bit decoupled, so it might make sense to cancel on the engine side and have the iterator become invalid.

Contributor Author


Added a Token concept, so only the iterator that holds the token talks to the engine.
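
The updated code is not shown in this thread, so the following is only one possible reading of the token idea, sketched under stated assumptions. `GenerationToken`, `TokenEngine`, `TokenIterator`, and `step(for:)` are hypothetical names, and the methods are synchronous for brevity, whereas the real `cancel()` is `async throws`.

```swift
import Foundation

// Hypothetical token identifying one generation run.
struct GenerationToken: Equatable, Sendable {
    let id: UInt64
}

final class TokenEngine: @unchecked Sendable {
    private let lock = NSLock()
    private var current: GenerationToken?   // token of the live run, if any
    private var nextID: UInt64 = 0
    private var produced = 0

    var isBusy: Bool {
        lock.lock(); defer { lock.unlock() }
        return current != nil
    }

    // Starts a run and hands back an iterator bound to a fresh token.
    func start() -> TokenIterator {
        lock.lock(); defer { lock.unlock() }
        nextID += 1
        produced = 0
        let token = GenerationToken(id: nextID)
        current = token
        return TokenIterator(engine: self, token: token)
    }

    // Engine-side cancel: takes effect immediately, without waiting for
    // any iterator to call next().
    func cancel() {
        lock.lock(); defer { lock.unlock() }
        current = nil
    }

    // Called by iterators. Work is done only for the token of the live run,
    // so iterators from cancelled or superseded runs get nil.
    func step(for token: GenerationToken) -> Int? {
        lock.lock(); defer { lock.unlock() }
        guard current == token else { return nil }
        produced += 1
        return produced
    }
}

struct TokenIterator: IteratorProtocol {
    let engine: TokenEngine
    let token: GenerationToken

    mutating func next() -> Int? {
        engine.step(for: token)
    }
}
```

With this shape, `isBusy` turns false as soon as `cancel()` returns, an iterator from the cancelled run yields nil on its next pull, and a later `start()` does not revive the old iterator.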

@stikves stikves force-pushed the sukru/engine-cancel-api branch 3 times, most recently from 342fd81 to f666779 Compare June 12, 2026 22:40
@stikves stikves force-pushed the sukru/engine-cancel-api branch from f666779 to 1a754e1 Compare June 12, 2026 22:50

3 participants