Commit e23f48b

compiler: lower polymorphic len() to wasm (string + array length) (#583)
## Lower polymorphic `len()` to wasm (string + array length)

A string-wall residual found while verifying that the migration corpus is actually unblocked: **`len()` did not lower to wasm** (`Codegen.UnboundVariable "len"`); only `string_length` did.

AffineScript arrays and strings share the same `[len: i32][payload…]` header (the list-concat handler already relies on this), so `len()` is the **same single `i32.load` at offset 0** whether its argument is a `String` or an array. Merged into the existing `string_length` arm.

This unblocks the **stdlib string layer** on the wasm backend (`starts_with`/`ends_with`/`substring`/`split`/`join`/`pad_*` all call `len()` on strings; `join` also on arrays), which the corpus's `String.length` pattern (30 files) needs.

**Verified:** `len("abcd")`=4 and `len([10,20,30])`=3 lower and run; full `run_codegen_wasm_tests.sh` green (no regressions).

**Separately noted (not fixed here):** string `==`/`!=` still lower to *pointer* comparison, not value comparison (`string_sub(...) == "ab"` is wrongly `0`). This is a #555-class interp-vs-wasm divergence, tracked for a follow-up slice. It only affects string-logic kernels; integer-brain extraction (strings kept host-side) is unaffected.

https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s

---

_Generated by [Claude Code](https://claude.ai/code/session_01WoKhFQePiRsAj7aqnxbG8s)_

Co-authored-by: Claude <noreply@anthropic.com>
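The shared header and the separately noted equality divergence can be modelled outside the compiler. The following is a minimal sketch, not code from this repository: it simulates wasm linear memory with an OCaml `Bytes` buffer, and every name in it (`mem`, `store_string`, `store_array`, `load_len`, `ptr_equal`, `value_equal`) is hypothetical. It also assumes little-endian 4-byte array elements, which the commit message does not state.

```ocaml
(* Hypothetical model of wasm linear memory; not the compiler's own code. *)
let mem = Bytes.make 256 '\000'

(* Write a string as [len: i32][bytes...] at [ptr]; returns [ptr]. *)
let store_string ptr s =
  Bytes.set_int32_le mem ptr (Int32.of_int (String.length s));
  Bytes.blit_string s 0 mem (ptr + 4) (String.length s);
  ptr

(* Write an int list as [len: i32][elems...] at [ptr]; returns [ptr].
   Assumption: each element occupies 4 bytes. *)
let store_array ptr xs =
  Bytes.set_int32_le mem ptr (Int32.of_int (List.length xs));
  List.iteri
    (fun i x -> Bytes.set_int32_le mem (ptr + 4 + (4 * i)) (Int32.of_int x))
    xs;
  ptr

(* Analogue of a single i32.load at offset 0: identical for both layouts. *)
let load_len ptr = Int32.to_int (Bytes.get_int32_le mem ptr)

(* Pointer comparison: two addresses are equal only if they are the same. *)
let ptr_equal (p : int) (q : int) = p = q

(* Value comparison: equal lengths and equal payload bytes. *)
let value_equal p q =
  let n = load_len p in
  n = load_len q
  && Bytes.sub_string mem (p + 4) n = Bytes.sub_string mem (q + 4) n

let s = store_string 0 "abcd"
let a = store_array 64 [10; 20; 30]

(* Two distinct allocations holding the same text. *)
let p = store_string 128 "ab"
let q = store_string 160 "ab"

let () =
  Printf.printf "len(s)=%d len(a)=%d\n" (load_len s) (load_len a);
  Printf.printf "ptr_equal=%b value_equal=%b\n" (ptr_equal p q) (value_equal p q)
```

In this model the same `load_len` serves both the string and the array, which is why one codegen arm suffices. The last two allocations show the divergence described under "Separately noted": equal contents at different addresses compare unequal by pointer and equal by value.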
1 parent f8230ce commit e23f48b

1 file changed

Lines changed: 12 additions & 2 deletions

lib/codegen.ml

@@ -1002,14 +1002,24 @@ let rec gen_expr (ctx : context) (expr : expr) : (context * instr list) result =
         in
         Ok (ctx_with_heap, code)
 
-    | ExprVar id when id.name = "string_length" && List.length args = 1 ->
+    | ExprVar id when (id.name = "string_length" || id.name = "len")
+                      && List.length args = 1 ->
         (* STDLIB-04e (#332) wasm-backend lowering. AS string layout is
            `[len: i32][bytes...]` at the pointer the arg evaluates to —
            reading the length is one i32.load at offset 0. The interp
            binding (lib/interp.ml) was wired in #362; this handler is
            the codegen sibling so tests/codegen/*.affine fixtures that
            call string_length (env_at / arg_at / env_count_and_at) can
-           compile end-to-end. *)
+           compile end-to-end.
+
+           `len` (the polymorphic length, issue #135) lowers identically:
+           AS arrays share the same `[len: i32][elems...]` header (the
+           list-concat handler above relies on exactly this), so reading
+           length is the same single i32.load whether `len` is applied to a
+           String or an array. Sharing the arm unblocks the stdlib string
+           layer (starts_with/ends_with/substring/split/join, which call
+           `len` on both strings and arrays) for the wasm backend — the
+           string-wall slice-8 residual flagged in the migration ledger. *)
         let* (ctx_with_arg, arg_code) = gen_expr ctx (List.hd args) in
         Ok (ctx_with_arg, arg_code @ [I32Load (2, 0)])
 
