Add cheddar emitter#2994
Conversation
| int main(int argc, char** argv) { | ||
| // MLIR-to-C++ via the EmitC dialect: used as the final stage of the CHEDDAR | ||
| // backend (`scheme-to-cheddar | --cheddar-to-emitc | heir-translate | ||
| // --mlir-to-cpp`), and available standalone for any EmitC IR. |
There was a problem hiding this comment.
Looks like there's still a bit of 🤖-y bla-bla in here - I'll come back tomorrow and clean those up
| return mlir::emitc::LoadOp::create(b, loc, t, lvalue); | ||
| } | ||
|
|
||
| // Emit `receiver->method(out, args..., extra);`. |
There was a problem hiding this comment.
Cf. https://discourse.llvm.org/t/method-calls-in-emitc/90898 I may end up adding support for method calls so as to support OpenFHE more naturally as well
|
|
||
| // SCF/MemRef/Arith -> EmitC, used by the cheddar pipeline to lower | ||
| // bufferized loop kernels through EmitC to C++. | ||
| registerPass([]() -> std::unique_ptr<Pass> { return createSCFToEmitC(); }); |
There was a problem hiding this comment.
This doesn't have its own registration hook already because the --convert-to-emitc pass uses dialect interface registrations to handle all dialects at once. Cf. https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Conversion/ConvertToEmitC/ConvertToEmitCPatternInterface.td
I'm guessing you found this and, due to the complexity of the emitc patterns for Cheddar, you weren't able to use it?
e2f17c7 to
88a0072
Compare
j2kun
left a comment
There was a problem hiding this comment.
This will have to wait for the LLVM commit bump to hit main, but otherwise some minor organizational comments. I admit I don't fully understand all the work that went into accommodating the move-only style.
| } | ||
|
|
||
| struct ConvertHRot : public OpConversionPattern<cheddar::HRotOp> { | ||
| using OpConversionPattern::OpConversionPattern; |
There was a problem hiding this comment.
Most of these uses of Verbatim seem like they should also be MemberCallOpaque. Any reason why not?
| shared type converter sees cheddar types (e.g. an `scf.for` carrying an | ||
| `!cheddar.*` iter_arg lowers to a move-assigned `emitc.variable`, and | ||
| loop-index values feed `memref` subscripts without a stranded | ||
| `index -> emitc.size_t` cast). |
There was a problem hiding this comment.
Send me an upstream patch for this please!
There was a problem hiding this comment.
I'm still a bit confused why you're not using the emitc conversion interfaces and https://mlir.llvm.org/docs/Passes/#-convert-to-emitc. Is there a concrete reason for that?
| // *now*, keyed by func op (whose identity persists through the in-place | ||
| // signature conversion). Consumed by Pass 1 below. | ||
| llvm::DenseMap<Operation*, llvm::SmallDenseSet<unsigned>> writtenArrayArgs; | ||
| getOperation()->walk([&](func::FuncOp fn) { |
There was a problem hiding this comment.
This loop can be simplified by walking just the StoreOps and using op->getParentOfType() to get the map key and for the owner check, instead of a nested walk of funcs and bodies.
| // dialect-level setting regardless of call order, so a dialect-level | ||
| // addIllegalDialect<SCFDialect> would be silently shadowed and the EmitC | ||
| // for/if patterns would never fire. Marking the ops illegal here forces the | ||
| // SCFToEmitC patterns to win. |
There was a problem hiding this comment.
Perhaps it would be simpler to replace addStructuralConversionPatterns with the subset of patterns from the implementation of addStructuralConversionPatterns that are relevant to this pass, then you don't need implicit tie-breaking with a long comment.
| }); | ||
|
|
||
| auto& ctxRef = *ctx; | ||
| getOperation()->walk([&ctxRef, &writtenArrayArgs](func::FuncOp op) { |
There was a problem hiding this comment.
I feel this section would greatly benefit from having the steps inside the walk split into sub-functions.
| for (unsigned i = 0; i < numResults; ++i) { | ||
| StringRef name; | ||
| if (!isMoveOnlyOpaque(funcType.getResult(i), name)) continue; | ||
| int argIdx = -1; | ||
| bool ok = !returns.empty(); | ||
| for (auto ret : returns) { | ||
| auto ba = dyn_cast<BlockArgument>(ret.getOperand(i)); | ||
| if (!ba || ba.getOwner() != &entry) { | ||
| ok = false; | ||
| break; | ||
| } | ||
| if (argIdx < 0) | ||
| argIdx = ba.getArgNumber(); | ||
| else if (argIdx != static_cast<int>(ba.getArgNumber())) { | ||
| ok = false; | ||
| break; | ||
| } | ||
| } | ||
| if (ok && argIdx >= 0) { | ||
| resultIsInout[i] = true; | ||
| argIsInout[argIdx] = true; | ||
| } | ||
| } |
There was a problem hiding this comment.
| for (unsigned i = 0; i < numResults; ++i) { | |
| StringRef name; | |
| if (!isMoveOnlyOpaque(funcType.getResult(i), name)) continue; | |
| int argIdx = -1; | |
| bool ok = !returns.empty(); | |
| for (auto ret : returns) { | |
| auto ba = dyn_cast<BlockArgument>(ret.getOperand(i)); | |
| if (!ba || ba.getOwner() != &entry) { | |
| ok = false; | |
| break; | |
| } | |
| if (argIdx < 0) | |
| argIdx = ba.getArgNumber(); | |
| else if (argIdx != static_cast<int>(ba.getArgNumber())) { | |
| ok = false; | |
| break; | |
| } | |
| } | |
| if (ok && argIdx >= 0) { | |
| resultIsInout[i] = true; | |
| argIsInout[argIdx] = true; | |
| } | |
| } | |
| for (unsigned i = 0; i < numResults; ++i) { | |
| StringRef name; | |
| if (!isMoveOnlyOpaque(funcType.getResult(i), name)) | |
| continue; | |
| if (returns.empty()) | |
| continue; | |
| auto firstBa = dyn_cast<BlockArgument>(returns.front().getOperand(i)); | |
| if (!firstBa || firstBa.getOwner() != &entry) | |
| continue; | |
| unsigned expectedArgIdx = firstBa.getArgNumber(); | |
| bool allMatch = llvm::all_of(returns, [&](auto ret) { | |
| auto ba = dyn_cast<BlockArgument>(ret.getOperand(i)); | |
| return ba && ba.getOwner() == &entry && ba.getArgNumber() == expectedArgIdx; | |
| }); | |
| if (allMatch) { | |
| resultIsInout[i] = true; | |
| argIsInout[expectedArgIdx] = true; | |
| } | |
| } |
nit: The nesting makes this quite hard to read, plus the use of llvm::all_of simplifies the structure a bit.
| // HEIR-internal and reference the unregistered `tensor_ext` dialect, which | ||
| // mlir-translate (mlir-to-cpp) cannot parse. | ||
| getOperation()->walk([](func::FuncOp fn) { | ||
| auto stripTensorExt = |
There was a problem hiding this comment.
Why not use AttributeUtils::clearAttrs ?
a48ae17 to
0c4a4e1
Compare
…ef messages, inliner interface)
So the op survives one-shot-bufferize (the diagonals operand becomes a memref).
…fficients Add a !cheddar.boot_context type (lowered to BootContext<word>*) and make cheddar.boot require it, so the emitter calls Boot() directly with no downcast. Most server-side ops accept either via the new Cheddar_AnyContext constraint. Also type eval_poly's coefficients as F64ArrayAttr (was a bare ArrayAttr the emitter cast element-wise to FloatAttr).
Add MemRefElementTypeInterface to the Cheddar_Type base class so tensor<Nx!cheddar.*> / memref<Nx!cheddar.*> (produced by the bufferization pipeline for looped CKKS kernels, and used by the emitter's loop/compile lit tests) are valid. Previously this lived in the LWEToCheddar/MNIST commit, above the emitter in the stack, so the emitter branch could not build those tests standalone.
0c4a4e1 to
fb6314e
Compare
…cision) in member_call_opaque; refresh lit test
CHEDDAR's LinearTransform and EvalPoly are classes (no Context method), so the emitter lowers cheddar.linear_transform / cheddar.eval_poly to a single structured emitc.call_opaque targeting a thin HEIR-side prelude shim (RunLinearTransform<W,word> / RunEvalPoly<word>) that builds the StripedMatrix / coefficients, constructs the extension object, and evaluates it. Adds the LinearTransform bufferization interface (the diagonals tensor operand) and grows the no-CUDA compile-guard stub + kernels accordingly.
…ast) cheddar.boot now takes a !cheddar.boot_context, lowered to BootContext<word>*, so ConvertBoot emits ctx->Boot(...) directly instead of recovering the derived type with an unchecked static_cast<BootContext<word>*>.
…or_ext strip - Add a runnable doctest (tests/Conversions/CheddarToEmitC/doctest.mlir, referenced from the pass description via an (* example filepath= *) marker) showing the out-parameter / move-only lowering of a cheddar.add. - Collapse the nested func-then-store walk that detects written array args into a single memref.store walk using getParentOfType<func::FuncOp>(). - Explain why AttributeUtils::clearAttrs does not fit the tensor_ext.* metadata strip (it clears one exact-named attr on an op; here we sweep a family off func arg/result attribute dicts).
Replace the sentinel-argIdx nested loop in the DPS in-place/inout detection with an llvm::all_of check (per review). Semantically identical: a result is treated as in-place only when every return hands back the same entry-block argument.
…tion Hoist the DPS in-place/destination-result analysis out of the func-rewrite walk into a named detectInoutResults() helper (per review: split the walk's steps into sub-functions). The two IR-rewriting passes (Pass 1 input tightening, Pass 2 result lifting) stay inline as clearly delimited sections -- extracting them would thread the builder and several parallel vectors for little gain.
Register the upstream EmitC-to-C++ translation in heir-translate so the cheddar backend's final stage (and the emitter's compile/ guard test) work. Previously this lived in the LWEToCheddar/MNIST commit above the emitter in the stack, so the emitter branch's compile test could not run standalone.
fb6314e to
3587e8e
Compare
Split out from #2896, follow-up to #2944