Sous Functions Develop

Osvaldo edited this page May 19, 2026 · 1 revision

This page is brief on purpose. The authoritative reference for writing a Sous function — the directory layout, the entrypoint signature, the supported runtimes, the isolate's calling convention, the testing CLI — lives in the Sous repository. What this page covers is the contract a function has to satisfy in order to behave well as a unit of work in codeQ. The contract is short, it is technical, and it falls out of the lease semantics and the at-least-once delivery model described on the Concepts page.

A function that respects this contract survives worker crashes, lease expirations, and re-claims without corrupting external state. A function that violates it can leak side-effects, run twice when it should not, or hang past its lease and force the queue into pathological retry patterns. The four properties below are the ones to internalise.

Idempotency

A Sous function should produce the same observable outcome whether it runs once or twice with the same arguments. This is the consequence of codeQ's at-least-once delivery: a task that is claimed but whose result never reaches the server is requeued and may be claimed again. The lease prevents two workers from owning the same task simultaneously, but it does not prevent the same task from being executed twice on different workers across an expiry-and-re-claim cycle. The function author has to design for that case.

Two mechanisms help here, and only the first is sufficient on its own. The cleaner one is to make the function deterministically idempotent: design the side-effects so that a second run with the same arguments does no harm. Writes to a key-value store that are keyed by the function's input are idempotent by construction. Calls to external services that accept an idempotency key (Stripe, HTTP APIs with Idempotency-Key headers, anything event-sourced) can be made idempotent by deriving the key from the task ID or from a hash of the arguments.

The other mechanism is codeQ's producer-side idempotency: the Sous control plane can pass an IdempotencyKey on CreateTask, and codeQ will collapse duplicate enqueues with the same key to a single task. That covers retries originating at the producer; it does not cover retries originating after the task is enqueued, which is why the function still has to be idempotent at its own side-effect boundary.
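The key derivation and the keyed write described above can be sketched as follows. This is a minimal illustration, not Sous or codeQ API: the names `idempotency_key` and `keyed_write` are hypothetical, a plain dict stands in for the key-value store, and combining the task ID with a hash of the arguments is just one of the two options the text mentions.

```python
import hashlib
import json


def idempotency_key(task_id: str, args: dict) -> str:
    """Derive a stable key from the task ID and a canonical encoding of the arguments."""
    # sort_keys and fixed separators make the encoding independent of dict ordering.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{task_id}:{canonical}".encode("utf-8")).hexdigest()


def keyed_write(store: dict, key: str, value: str) -> None:
    """A write keyed by the input: running it twice leaves the same state."""
    store[key] = value


# Hypothetical task ID and arguments, for illustration only.
store = {}
key = idempotency_key("task-123", {"user": "u1", "amount": 10})
keyed_write(store, key, "processed")
keyed_write(store, key, "processed")  # simulated retry with the same arguments
```

The same derived key could be sent as an Idempotency-Key header to an external service that supports one, so that the service, rather than the function, collapses the duplicate call.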

The shape of an idempotency-violating function is easy to recognise: any function that increments a counter, appends to a log, charges a card, or sends a notification without checking whether the work has already been done. These are the kinds of operations that need an idempotency key built in or a server-side dedup table. The Sous documentation calls this out under "side-effect design". The codeQ-side observation is that duplicate side-effects on retry show up only in the producer's downstream system; codeQ itself cannot see them. The dead-letter set on Tasks and Results is still a useful read-out, because the dead-letter logs are usually the first place an operator looks.
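A dedup table of the kind mentioned above might look like the sketch below. It is an in-memory stand-in with hypothetical names (`DedupTable`, `run_once`); a real server-side table would need an atomic check-and-record step, for example a unique constraint, which a plain dict does not provide across processes.

```python
class DedupTable:
    """In-memory stand-in for a server-side dedup table (hypothetical)."""

    def __init__(self):
        self._seen = {}

    def run_once(self, key, action):
        # If this key was already processed, return the recorded outcome
        # instead of repeating the side-effect.
        if key in self._seen:
            return self._seen[key]
        result = action()
        self._seen[key] = result
        return result


sent = []


def send_notification():
    # A non-idempotent operation: each call sends one more message.
    sent.append("welcome-email")
    return len(sent)


table = DedupTable()
first = table.run_once("task-42", send_notification)
second = table.run_once("task-42", send_notification)  # retry of the same task
```

The retry returns the outcome recorded by the first run, and only one notification is sent.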

Side-effect awareness

A function that is not idempotent by construction has to be at least aware of the retry boundary. The simplest model is that everything before the function returns is at risk of being retried, and everything after is committed. The "return" point is the moment the worker's handler returns a Completed Result, which is the moment the worker SDK marshals the body and sends it to codeQ. If the function returns but the worker crashes before the result reaches codeQ, the lease expires and the task is retried: the function ran, but the producer never saw the outcome.

This boundary matters because the natural shape of a function — "do the work, return the result" — does not include the result-delivery step. A function author who imagines the work-and-return as one atomic unit will be surprised when the work is done but a retry happens anyway. The right mental model is that the function does the work, returns to the worker, the worker reports it, codeQ persists the report, and only then is the work safely "done". Anywhere in that chain a failure can cause the work to be re-attempted.
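The chain described above can be modelled with a toy simulation. Everything here is an assumption made for illustration: `FakeQueue` is not codeQ, `worker_attempt` is not the worker SDK, and the crash is a simple flag.

```python
class FakeQueue:
    """Toy stand-in for codeQ: a task counts as done only once a result is persisted."""

    def __init__(self):
        self.results = {}

    def is_done(self, task_id):
        return task_id in self.results

    def report(self, task_id, body):
        self.results[task_id] = body


executions = []


def function(args):
    executions.append(args)  # the "work", recorded as an observable side-effect
    return {"ok": True}


def worker_attempt(queue, task_id, args, crash_before_report):
    body = function(args)        # 1. the function does the work and returns
    if crash_before_report:      # 2. the worker dies before reporting
        return
    queue.report(task_id, body)  # 3. the queue persists the report


queue = FakeQueue()
worker_attempt(queue, "t1", {"n": 1}, crash_before_report=True)
# The lease expires; the task is claimed again because no result was persisted.
if not queue.is_done("t1"):
    worker_attempt(queue, "t1", {"n": 1}, crash_before_report=False)
```

The function body ran twice even though it returned successfully the first time, which is exactly the case the function author has to design for.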

The practical recommendation is to keep the function's externally visible side-effects to a small number of idempotent operations near the end of execution, and to keep the work that happens earlier — reads, computations, intermediate state — internal to the isolate. The isolate's state is discarded between invocations, so a partial execution that does not produce an external side-effect is invisible to everything outside the isolate. The function author can rely on this property: an exception thrown halfway through the function, before any external mutation, is safe to retry because nothing observable happened.
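One way to lay out a function along these lines is sketched below, assuming a hypothetical `handle(args, store)` signature in which `store` is a dict standing in for the external system. The real Sous entrypoint signature is documented in the Sous repository and may differ.

```python
def handle(args, store):
    # Phase 1: reads and computation only; nothing outside this call is mutated.
    items = args["items"]
    if not items:
        raise ValueError("empty order")  # safe to retry: nothing observable happened
    total = sum(item["price"] * item["qty"] for item in items)

    # Phase 2: a single idempotent write, keyed by the input, at the very end.
    store[f"order-total:{args['order_id']}"] = total
    return {"total": total}


store = {}
try:
    handle({"order_id": "o1", "items": []}, store)  # fails before any mutation
except ValueError:
    pass
store_after_failure = dict(store)

order = {"order_id": "o1", "items": [{"price": 10, "qty": 2}, {"price": 5, "qty": 1}]}
handle(order, store)
handle(order, store)  # simulated retry leaves the store unchanged
```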

Lease awareness

The third property is that the function should not exceed its lease budget without telling codeQ. The lease is the codeQ-side guarantee that a task is owned by exactly one worker at a time. If the function takes longer than LeaseSeconds, the lease expires and the task becomes claimable by another worker — and the original function continues to run, oblivious to the fact that it no longer owns the task. When it finishes, the worker's Result is rejected by codeQ with a not-owner error, and the work has been done twice.
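The expiry-and-re-claim sequence above can be traced with a small model. This is a sketch under stated assumptions: `LeasedTask` and `NotOwnerError` are hypothetical names, time is passed in explicitly rather than read from a clock, and the ownership check is a simplification of whatever codeQ actually does.

```python
class NotOwnerError(Exception):
    """Raised when a worker reports a result for a task it no longer owns."""


class LeasedTask:
    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.owner = None
        self.expires_at = 0.0
        self.result = None

    def claim(self, worker, now):
        # A live lease blocks other workers; an expired one allows a re-claim.
        if self.owner is not None and now < self.expires_at:
            return False
        self.owner = worker
        self.expires_at = now + self.lease_seconds
        return True

    def submit_result(self, worker, body):
        if worker != self.owner:
            raise NotOwnerError(worker)
        self.result = body


task = LeasedTask(lease_seconds=30)
events = []
events.append(task.claim("worker-a", now=0))   # worker-a takes the lease
events.append(task.claim("worker-b", now=10))  # refused: lease still live
events.append(task.claim("worker-b", now=31))  # lease expired, re-claimed
try:
    task.submit_result("worker-a", "late result")  # worker-a finishes too late
except NotOwnerError:
    events.append("not-owner")
task.submit_result("worker-b", "result")
```

Both workers executed the task, but only the current owner's result was accepted.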

Two mechanisms keep this from happening. The Sous runtime can issue heartbeats from inside the function's slot, extending the lease while the function continues to execute. This is the right answer for functions that occasionally take longer than expected but usually finish in time. The other mechanism is the resource cap on the isolate: if the function exceeds a Sous-imposed wall-clock limit, the isolate is torn down before the lease can expire, and the worker reports Failed or Nack instead of letting the lease silently lapse. Both mechanisms live on the Sous side and are documented in the Sous repository.
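The effect of heartbeats on a long-running function can be illustrated with simulated time. The `Lease` class and `run_steps` helper are hypothetical and only model the arithmetic; the actual heartbeat mechanism lives in the Sous runtime and is documented in the Sous repository.

```python
class Lease:
    def __init__(self, lease_seconds, now):
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds

    def heartbeat(self, now):
        """Extend the lease, but only if it is still live."""
        if now >= self.expires_at:
            return False
        self.expires_at = now + self.lease_seconds
        return True


def run_steps(lease, step_seconds, steps, heartbeat):
    now = 0.0
    for _ in range(steps):
        now += step_seconds  # simulated work
        if heartbeat and not lease.heartbeat(now):
            return "lease-lost"
    return "completed" if now < lease.expires_at else "lease-lost"


# Four steps of 4 seconds each (16 seconds total) against a 10-second lease.
without_heartbeat = run_steps(Lease(10, now=0.0), 4, 4, heartbeat=False)
with_heartbeat = run_steps(Lease(10, now=0.0), 4, 4, heartbeat=True)
```

Without heartbeats the work outlives the lease; with a heartbeat after each step the lease is always extended before it lapses.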

The function author's part is to choose realistic timeouts and to handle cancellation cleanly. A function that ignores its cancellation signal and keeps running after the isolate has been told to shut down will be killed by the kernel anyway; a function that cooperates with cancellation can release external resources, flush pending writes, or report partial progress before exiting. The cleaner shutdown path is usually worth the extra code.
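Cooperative cancellation can be sketched as a loop that checks a signal between units of work. The use of `threading.Event` as the cancellation signal is an assumption made for illustration; how Sous actually delivers cancellation to a function is documented in the Sous repository.

```python
import threading


def process(batches, cancel, sink):
    """Process batches until done or cancelled, then report partial progress."""
    done = 0
    cancelled = False
    for batch in batches:
        if cancel.is_set():
            cancelled = True  # stop before starting another unit of work
            break
        sink.append(batch)
        done += 1
    return {"cancelled": cancelled, "batches_done": done}


cancel = threading.Event()
sink = []


def batches():
    yield "a"
    yield "b"
    cancel.set()  # simulate the isolate being told to shut down
    yield "c"


report = process(batches(), cancel, sink)
```

The function stops at a batch boundary and returns how far it got, rather than being killed in the middle of a write.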

Determinism

The fourth property, which follows from the three above, is determinism. A function whose output depends only on its input (a pure function) is the easiest case: it is idempotent by construction, it has no side-effects to worry about, and a retry is indistinguishable from a first run. Many useful functions are not pure: they read from external systems, they call APIs, they consume time. The recommendation is to be aware of which sources of non-determinism the function uses and to choose them deliberately. A function that reads the current time and includes the timestamp in its output is not deterministic. If the downstream system treats outputs with different timestamps as distinct, the producer's idempotency key will not save the function, because two runs produce two timestamps and the downstream system accepts both.
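The timestamp case above can be made concrete with a short comparison. The names are hypothetical, the clock is passed in as a parameter so the two runs can be simulated, and `record_id` stands in for however a downstream system might tell two records apart.

```python
import hashlib
import json


def output_with_clock(args, now):
    # Non-deterministic: the timestamp comes from the wall clock at run time.
    return {"user": args["user"], "stamp": now}


def output_from_input(args):
    # Deterministic: the timestamp is carried in the task's arguments.
    return {"user": args["user"], "stamp": args["requested_at"]}


def record_id(output):
    """How a hypothetical downstream system might distinguish two records."""
    canonical = json.dumps(output, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


args = {"user": "u1", "requested_at": 1000}
# Two runs of the same task, 30 seconds apart.
clock_records = {record_id(output_with_clock(args, now)) for now in (1000, 1030)}
input_records = {record_id(output_from_input(args)) for _ in range(2)}
```

With the clock-based output the downstream system sees two distinct records; with the input-derived timestamp the retry collapses into the first run.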

The Sous documentation's authoring guide goes into detail on which non-determinism sources are safe and which are not. The codeQ-side perspective is that the queue does not enforce determinism — it cannot — and that a function which produces different outputs for the same input will end up with whichever output the most recent retry produced, since GetResult returns the latest ResultRecord. The function author has to be the one to decide whether that is acceptable.

How to test these properties

A function that satisfies these properties is most cheaply verified locally. The Sous CLI's cs fn test invokes the function in an isolate against a simulated invocation. To exercise the retry path, a developer can deploy the function against a local codeQ instance with a short LeaseSeconds, intentionally crash the worker mid-execution, and observe that the second claim runs cleanly and the external side-effect is not duplicated. The full integration loop, from invocation through claim and retry to completion, is the same locally as in production, which is the runtime-parity guarantee from the Sous project description.
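Before setting up the crash test described above, a cheaper in-process check is to run the handler twice with the same arguments and compare the resulting state. The helper below is a hypothetical sketch with an assumed `handler(args, state)` signature; it complements the Sous CLI and the local codeQ loop and does not replace either.

```python
import copy


def is_retry_safe(handler, args, make_state):
    """Run the handler twice on the same input and check that nothing changes."""
    state = make_state()
    first = handler(copy.deepcopy(args), state)
    after_first = copy.deepcopy(state)
    second = handler(copy.deepcopy(args), state)  # simulated retry
    return first == second and state == after_first


def keyed_handler(args, state):
    # Idempotent: a write keyed by the input.
    state[args["id"]] = args["value"]
    return {"stored": args["id"]}


def appending_handler(args, state):
    # Not idempotent: every run appends another entry.
    state.setdefault("log", []).append(args["value"])
    return {"entries": len(state["log"])}


safe = is_retry_safe(keyed_handler, {"id": "k1", "value": 7}, dict)
unsafe = is_retry_safe(appending_handler, {"id": "k1", "value": 7}, dict)
```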

The codeQ-side observation tools — per-command queue depth, per-worker active leases, lease-expiry counter, dead-letter set — are the same on a developer laptop as on a production cluster. A function that hits the dead-letter set during local testing is almost certainly going to hit it in production too, and the symptoms in both cases look identical.

What this page does not cover

The Sous repository covers everything that is specific to writing a function: the source layout, the entrypoint signature, the argument and return value encoding, the runtimes supported by the isolate, the testing CLI, the registration flow, and the artefact format. None of those are codeQ concerns and none of them appear here. If you are reading this page because you want to write your first function, the right next click is the Sous repository README. Come back here once your function runs and you are about to ship it; the contract above is what stands between a function that works locally and a function that survives a production retry.

Where to go next

If you want the full integration mechanics that explain why these contract points exist, Concepts is the long-form reference. If you are tuning lease and concurrency in production, Configure Workers and Performance Tuning Knobs are the right pages. The Sous repository at github.com/osvaldoandrade/sous is everything else.
