Add Poulpy as a backend to HEIR #3096

Description

@j2kun

This issue covers the steps required to integrate the Poulpy FHE library (specifically the poulpy-ckks crate) as a backend in HEIR.

Poulpy uses a bivariate Torus representation (base-2^{base2k} digits) instead of the traditional RNS representation. This allows bit-level precision and capacity management, which simplifies parameter selection and also introduces new optimization opportunities (such as lazy normalization).

Integration Steps

Add Poulpy as a Bazel Dependency

  • Add poulpy-ckks and a concrete backend crate (e.g., poulpy-cpu-ref for testing) to MODULE.bazel using the Rust crate extension.
  • Add any required transitive crate dependencies, if needed.

Define the poulpy Exit Dialect

  • Create the poulpy dialect in lib/Dialect/Poulpy.
  • Types:
    • !poulpy.ciphertext (maps to CKKSCiphertext)
    • !poulpy.unnormalized_ciphertext (maps to UnnormalizedCKKSCiphertext)
    • !poulpy.plaintext (maps to CKKSPlaintext)
    • !poulpy.module (maps to Module<BE>, the evaluator context)
    • !poulpy.scratch (maps to ScratchArena)
    • !poulpy.tensor_key (maps to GLWETensorKeyPrepared)
    • !poulpy.automorphism_key_map (maps to GLWEAutomorphismKeyHelper for rotations)
  • Operations:
    • Setup: module_create, scratch_create
    • Data: encrypt, decrypt, encode, decode
    • Arithmetic (Normalized): add, sub, mul (takes tensor_key), rotate (takes automorphism_key_map), rescale
    • Arithmetic (In-place/Assign): add_assign, sub_assign, mul_assign, rotate_assign, rescale_assign
    • Arithmetic (Unnormalized): add_unnormalized, sub_unnormalized
    • Maintenance: normalize (unnormalized -> normalized), compact_limbs
  • The poulpy exit dialect should be designed to work with memrefs rather than tensors, since we are migrating all backends to bufferized implementations.

Implement Code Generation (heir-translate)

  • Implement PoulpyEmitter in lib/Target/Poulpy.
  • Translate poulpy dialect operations to Rust code calling poulpy-ckks APIs.
  • Key generation/management: Prior passes will be responsible for generating poulpy dialect code that sets up the Module<BE> and allocates the ScratchArena before providing them to the main compiled function as function arguments.
  • Error handling: Poulpy APIs return Result. The generated Rust functions should return Result and use the ? operator for error propagation.
  • Generics: The target backend is already known when the functions are generated. If the generated functions cannot be generic over the backend (BE: Backend), the emitter should query the IR to determine which Poulpy backend is in use and emit code for that backend.
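The error-handling and generics points above can be sketched as follows. This is only an illustration of the shape the emitted Rust might take, not the real poulpy-ckks API: the `Backend` trait methods, `EvalError`, `Ciphertext`, `MockBackend`, and `compiled_fn` are all hypothetical stand-ins, and the mock tracks a cleartext value so the behaviour can be checked.

```rust
use std::fmt;

// Hypothetical error type; the real Poulpy error type may differ.
#[derive(Debug, PartialEq)]
pub enum EvalError {
    CapacityExhausted,
}

impl fmt::Display for EvalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EvalError::CapacityExhausted => write!(f, "ciphertext capacity exhausted"),
        }
    }
}

impl std::error::Error for EvalError {}

// Hypothetical ciphertext stand-in. `value` is a cleartext shadow used only
// to make this sketch checkable.
#[derive(Debug, Clone, PartialEq)]
pub struct Ciphertext {
    pub remaining_levels: u32,
    pub value: f64,
}

// Hypothetical evaluator interface standing in for `BE: Backend`.
pub trait Backend {
    fn add(&self, a: &Ciphertext, b: &Ciphertext) -> Result<Ciphertext, EvalError>;
    fn mul(&self, a: &Ciphertext, b: &Ciphertext) -> Result<Ciphertext, EvalError>;
}

// Mock backend: multiplication consumes one level and fails at level zero.
pub struct MockBackend;

impl Backend for MockBackend {
    fn add(&self, a: &Ciphertext, b: &Ciphertext) -> Result<Ciphertext, EvalError> {
        Ok(Ciphertext {
            remaining_levels: a.remaining_levels.min(b.remaining_levels),
            value: a.value + b.value,
        })
    }

    fn mul(&self, a: &Ciphertext, b: &Ciphertext) -> Result<Ciphertext, EvalError> {
        let levels = a.remaining_levels.min(b.remaining_levels);
        if levels == 0 {
            return Err(EvalError::CapacityExhausted);
        }
        Ok(Ciphertext {
            remaining_levels: levels - 1,
            value: a.value * b.value,
        })
    }
}

// Shape an emitted function might take: generic over the backend, returning
// Result, and propagating errors with `?`. Computes x * y + x.
pub fn compiled_fn<BE: Backend>(
    be: &BE,
    x: &Ciphertext,
    y: &Ciphertext,
) -> Result<Ciphertext, EvalError> {
    let prod = be.mul(x, y)?;
    let sum = be.add(&prod, x)?;
    Ok(sum)
}
```

If the emitted code cannot stay generic, the same body could be emitted with the concrete backend type substituted for `BE`; the `?`-based propagation would be unchanged.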

Implement a configure-crypto-context-like pass

  • As mentioned in "Implement Code Generation", there should be a pass that sets up any needed global objects or state required for the API. This is analogous to CryptoContext in the OpenFHE backend. This pass introduces new functions that the user will call at server startup time.
  • Signature Conversion: Modify function signatures to accept !poulpy.module, !poulpy.scratch, and required keys (tensor_key, automorphism_key_map) based on the operations used in the function.
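A minimal sketch of the signature-conversion decision, written as a plain Rust model rather than an MLIR pass. The `Op` and `ExtraArg` enums and `required_args` are hypothetical names for illustration, and the sketch assumes that the module and scratch arguments are always added while keys are added only when an operation needs them.

```rust
use std::collections::BTreeSet;

// Toy model of the operations that may appear in a function body.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Add,
    Sub,
    Mul,
    Rotate,
    Rescale,
}

// Extra function arguments the pass may need to add. The derived ordering
// follows declaration order, which gives a stable argument order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ExtraArg {
    Module,
    Scratch,
    TensorKey,
    AutomorphismKeyMap,
}

// Decide which extra arguments a function needs from the ops it uses.
pub fn required_args(ops: &[Op]) -> Vec<ExtraArg> {
    let mut args = BTreeSet::new();
    // Assumption: every compiled function takes the module and scratch arena.
    args.insert(ExtraArg::Module);
    args.insert(ExtraArg::Scratch);
    for op in ops {
        match op {
            // mul takes a tensor_key.
            Op::Mul => {
                args.insert(ExtraArg::TensorKey);
            }
            // rotate takes an automorphism_key_map.
            Op::Rotate => {
                args.insert(ExtraArg::AutomorphismKeyMap);
            }
            Op::Add | Op::Sub | Op::Rescale => {}
        }
    }
    args.into_iter().collect()
}
```

A function that only adds and multiplies would therefore gain the module, scratch, and tensor key arguments, but no automorphism key map.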

Implement Parameter Selection Pass

Poulpy's bivariate Torus representation simplifies parameter selection because we do not need to choose specific RNS prime chains.

  • Modulus Size (CT_K): We only need to decide the total bit size of the modulus. This is computed as $Q_{\text{bits}} = (L + 1) \cdot \Delta_{\text{bits}} + B_{\text{bits}}$, where $L$ is the multiplicative depth, $\Delta_{\text{bits}}$ is the scale (log_delta), and $B_{\text{bits}}$ is the headroom (log_budget).

  • Headroom (log_budget): HEIR's RangeAnalysis must be used to determine the maximum integer value size in the circuit, ensuring it fits in the headroom to prevent torus wrap-around.

  • Limb Size (base2k): Should be chosen as the largest value supported by the target hardware backend (typically 52 or 64).

  • Create a pass (e.g., GenerateParamPoulpy) to determine:

    • log_delta (scaling factor, typically constant per circuit).
    • log_budget (headroom, can we use HEIR's RangeAnalysis for this if we have bounds on the input ranges?).
    • CT_K (maximum capacity in bits): Computed as (multiplicative_depth + 1) * log_delta + log_budget.
    • base2k (limb size, e.g., 52).
  • Annotate the MLIR module with these parameters.
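The arithmetic such a pass would perform can be sketched in a few lines of Rust. The CT_K formula is the one given above. The `PoulpyParams` struct, the function names, the derivation of log_budget as the bit length of the largest magnitude reported by range analysis, and the limb count as a ceiling division are assumptions made for illustration; a real pass may need an extra safety margin.

```rust
// Hypothetical container for the selected parameters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PoulpyParams {
    pub log_delta: u32,
    pub log_budget: u32,
    pub ct_k: u32,
    pub base2k: u32,
    pub num_limbs: u32,
}

// Assumption: headroom is the bit length of the largest integer magnitude
// found by range analysis (0 bits for a maximum of 0).
pub fn log_budget_for(max_abs: u64) -> u32 {
    64 - max_abs.leading_zeros()
}

// Compute CT_K = (depth + 1) * log_delta + log_budget, plus the number of
// base-2^base2k limbs needed to hold CT_K bits (assumed ceiling division).
pub fn select_params(mult_depth: u32, log_delta: u32, max_abs: u64, base2k: u32) -> PoulpyParams {
    assert!(base2k > 0, "base2k must be positive");
    let log_budget = log_budget_for(max_abs);
    let ct_k = (mult_depth + 1) * log_delta + log_budget;
    let num_limbs = (ct_k + base2k - 1) / base2k;
    PoulpyParams {
        log_delta,
        log_budget,
        ct_k,
        base2k,
        num_limbs,
    }
}
```

For example, with multiplicative depth 3, log_delta = 40, a maximum magnitude of 255 (8 bits of headroom), and base2k = 52, this gives CT_K = 4 * 40 + 8 = 168 bits, which under the ceiling-division assumption occupies 4 limbs.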

Implement Lowering Pass SecretToPoulpy

The main difficulty here is that our LWE dialects are incompatible with poulpy's bivariate ring representation. They hard-code RNS concepts that Poulpy doesn't have. So while we could try to generalize/update the LWE and CKKS dialects, instead we should just implement a lowering from secret (arithmetic on secret packed cleartext tensors) to poulpy directly. Then after the initial e2e implementation is working, we can decide how/whether to have a full representation of the scheme at the LWE level, and what other changes to the pipeline are necessary to support that.

  • Create lib/Dialect/Secret/Conversions/SecretToPoulpy.
  • Scale Management: Since the upstream pipeline should be configured to disable explicit scale management (see Section 3), this pass typically will not encounter mgmt.rescale operations in standard flows.

Implement a poulpy-mgmt pass for ciphertext management

This should replace the existing insert-mgmt-ckks sub-pipeline when targeting the poulpy backend.

Sub-passes should include:

  • limb compaction: Insert compact_limbs after multiplications to reclaim unused limbs (since multiplication reduces effective_k).
  • lazy normalization: Identify chains of additions/subtractions and convert them to add_unnormalized/sub_unnormalized to avoid intermediate carry propagation.
    • Insert normalize operations only before operations that require normalized inputs:
      • Multiplication (mul, mul_assign)
      • Rotation (rotate, rotate_assign)
      • Function return
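The lazy normalization rewrite can be sketched on a toy straight-line IR. This is a simplified model, not HEIR pass code: `SrcOp`, `DstOp`, and `lazy_normalize` are hypothetical names, values are referenced by their index in the op list, and the sketch assumes that the unnormalized add/sub ops accept both normalized and unnormalized operands.

```rust
// Input program: ops reference earlier values by index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SrcOp {
    Input,
    Add(usize, usize),
    Sub(usize, usize),
    Mul(usize, usize),
    Rotate(usize),
    Return(usize),
}

// Output program after the rewrite.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DstOp {
    Input,
    AddUnnormalized(usize, usize),
    SubUnnormalized(usize, usize),
    Normalize(usize),
    Mul(usize, usize),
    Rotate(usize),
    Return(usize),
}

// Return the output index of a normalized version of source value `v`,
// inserting a Normalize op the first time one is needed.
fn normalized(v: usize, out: &mut Vec<DstOp>, raw: &[usize], norm: &mut [Option<usize>]) -> usize {
    if let Some(idx) = norm[v] {
        return idx;
    }
    out.push(DstOp::Normalize(raw[v]));
    let idx = out.len() - 1;
    norm[v] = Some(idx);
    idx
}

pub fn lazy_normalize(ops: &[SrcOp]) -> Vec<DstOp> {
    let mut out: Vec<DstOp> = Vec::new();
    // raw[i]: output index of source value i (possibly unnormalized).
    let mut raw: Vec<usize> = Vec::with_capacity(ops.len());
    // norm[i]: output index of a normalized copy of source value i, if any.
    let mut norm: Vec<Option<usize>> = Vec::with_capacity(ops.len());

    for op in ops {
        let (new_op, is_normalized) = match *op {
            SrcOp::Input => (DstOp::Input, true),
            // Additions and subtractions skip carry propagation.
            SrcOp::Add(a, b) => (DstOp::AddUnnormalized(raw[a], raw[b]), false),
            SrcOp::Sub(a, b) => (DstOp::SubUnnormalized(raw[a], raw[b]), false),
            // Multiplication, rotation, and return need normalized operands.
            SrcOp::Mul(a, b) => {
                let na = normalized(a, &mut out, &raw, &mut norm);
                let nb = normalized(b, &mut out, &raw, &mut norm);
                (DstOp::Mul(na, nb), true)
            }
            SrcOp::Rotate(a) => {
                let na = normalized(a, &mut out, &raw, &mut norm);
                (DstOp::Rotate(na), true)
            }
            SrcOp::Return(a) => {
                let na = normalized(a, &mut out, &raw, &mut norm);
                (DstOp::Return(na), true)
            }
        };
        out.push(new_op);
        let idx = out.len() - 1;
        raw.push(idx);
        norm.push(if is_normalized { Some(idx) } else { None });
    }
    out
}
```

In this model a chain such as (x + y) - x followed by a multiplication produces a single Normalize just before the multiplication, rather than one after each addition or subtraction. A real pass would also need to weigh cases where a lone addition is cheaper to keep normalized.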

End-to-End Testing

  • Create a new set of pipeline flags and configurations related to the poulpy backend, and assemble the right pipeline for the middle-end (secret + mgmt optimizations -> poulpy).
  • Create a Bazel macro heir_poulpy_lib (similar to heir_lattigo_lib) to automate heir-opt -> heir-translate -> rust_library compilation.
  • Write Rust test harnesses that import the generated code, initialize Poulpy backends (e.g., poulpy-cpu-ref), encrypt inputs, run the function, and verify outputs.

Add an in-place optimization pass at the poulpy level

This should be similar to the lattigo/openfhe in-place optimizations and can be considered an extra optimization (though it should be enabled by default).


Open questions

  • Do we need to represent the rescale operation in the poulpy dialect? The poulpy-ckks API has rescale and rescale_assign operations, but when would we use them? Maybe for apples-to-apples comparisons of CKKS programs? An LLM suggests we may want a ckks_rescale_into to copy ciphertexts into "structs with pre-defined smaller capacities," but this seems like it could be irrelevant to our purposes.


Labels: newcomer project (Project ideas for new contributors)
