Skip to content

Recognise HoleUnit and read dep_sig_mods correctly#33

Open
philippedev101 wants to merge 2 commits into
commercialhaskell:masterfrom
philippedev101:hole-unit-and-sig-mods
Open

Recognise HoleUnit and read dep_sig_mods correctly#33
philippedev101 wants to merge 2 commits into
commercialhaskell:masterfrom
philippedev101:hole-unit-and-sig-mods

Conversation

@philippedev101

Copy link
Copy Markdown

Two bugs in getInterfaceRecent that hi-file-parser hits the moment a .hi file references signatures: getModule doesn't recognise HoleUnit (tag 2), and dep_sig_mods is read as a list of Module when GHC actually writes a list of ModuleName. Both fixes are tiny (one extra case, one one-liner). The spec gets three fixtures, one targeted failing test per bug (each turns green from its own fix alone, independently of the other), and a "should fully deserialize" gate that requires both. Full GHC source pinpoints (introducing commits, first-release tags, file paths and line numbers) live inline as comments on each describe block, so the spec file itself is the reference and nothing important is buried only in this PR description.

Discovered while working on commercialhaskell/stack#6865, which adds cross-package Backpack support to Stack. Stack.ComponentFile.parseHI hits both bugs the moment Stack tries to load a .hi from an indefinite or instantiated library. The bugs aren't Backpack-specific (they're plain parser/format compliance), but Backpack is by far the easiest way to make GHC emit the byte patterns that trip them, which is probably why nobody had reported them before.

Bundled as one PR because the two fixes were discovered together, each is trivial, and shipping only one still leaves at least one of the three fixtures failing.

Two independent bugs in getInterfaceRecent stopped the parser from reading GHC interface files that reference signatures. Both surface together on commercialhaskell/stack PR #6865, where stack runs Iface.fromFile against three .hi files (GHC 9.10.3): LogHelper.hi (from an indefinite library), Consumer.hi (from an instantiated library) and Main.hi (from an instantiated executable). All three are committed under test-files/iface/x64/ghc9103-signatures/ as regression fixtures.

Bug A: getModule recognised only unit tags 0 (RealUnit) and 1 (VirtUnit). GHC's `Binary Unit` also writes byte 2 for `HoleUnit`, the sentinel for an un-filled signature reference, introduced for GHC 9.0.1 by Sylvain Henry on 2020-04-03 in GHC commit 10d15f1ec4ba ("Refactoring unit management code"; "Replace BackPack fake 'hole' UnitId by a proper HoleUnit constructor."). LogHelper.hi's own module reference is a VirtUnit whose substitution list maps each hole name to a Module wrapping HoleUnit, and the parser bailed at byte 2 before reaching the dependency section. Fix: one extra case in getModule that reads no payload for tag 2 and falls through to the trailing module-name read.

Bug B: dep_sig_mods was read as `skipList getModule`, but GHC declares it as `![ModuleName]` and serialises it as a list of cached FastStrings, added in GHC 9.4.1 by Matthew Pickering on 2021-05-05 in GHC commit 38faeea1a940 ("Remove transitive information about modules and packages from interface files", GHC issue #16885). For an empty sig_mods this accidentally worked (length 0, stop); for any non-empty value the parser walked into the wrong part of the stream and tripped on whatever byte happened to land at the next "unit tag" read. Consumer.hi and Main.hi both reproduced this as "Invalid unit type: 4". Fix: read the field as `skipList skipFastString`.

The new spec has one targeted assertion per bug (each becomes green as soon as that one bug is fixed, independently of the other) and a stricter "should fully deserialize" gate that requires both fixes. 24 examples total, all green. Full per-bug GHC source references (commit URLs, first release, release date, file paths and line numbers) are embedded in the spec's describe-block comments.
@mpilgrem

mpilgrem commented Jun 11, 2026

Copy link
Copy Markdown
Member

Also tested on Windows 11 by:

  • building the master branch of Stack with a dependency on it and then building Stack again with that Stack; and
  • building the backpack version of Stack (https://github.com/philippedev101/stack/tree/6b8f160e0ab1ae1b94cde1a2360e6a79a4c81aaf) with a dependency on it, building Stack again with that Stack; and
  • running Stack's backpack-cross-package-transitive integration test. This passed (EDIT: but only when moved to a short paths location), and did not generate any Failed to decode module interface: warnings. I'll post in the Stack repository.

@mpilgrem

Copy link
Copy Markdown
Member

@philippedev101, this repository contains a guide for the creation of the existing *.hi files used in tests at test-files/iface/x64/README.md - in practice it is the guide for Windows users that is most relevant.

Please could you post here how the three new *.hi files are to be created. That will enable me to write it up in the guide.

@mpilgrem

mpilgrem commented Jun 12, 2026

Copy link
Copy Markdown
Member

I updated the stale existing CI in the master branch. It now covers GHC 9.10.3, 9.12.4 and 9.14.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants