Recognise HoleUnit and read dep_sig_mods correctly#33
Open
philippedev101 wants to merge 2 commits into
Open
Conversation
Two independent bugs in getInterfaceRecent stopped the parser from reading GHC interface files that reference signatures. Both surface together on commercialhaskell/stack PR #6865, where stack runs Iface.fromFile against three .hi files (GHC 9.10.3): LogHelper.hi (from an indefinite library), Consumer.hi (from an instantiated library) and Main.hi (from an instantiated executable). All three are committed under test-files/iface/x64/ghc9103-signatures/ as regression fixtures.
Bug A: getModule recognised only unit tags 0 (RealUnit) and 1 (VirtUnit). GHC's `Binary Unit` also writes byte 2 for `HoleUnit`, the sentinel for an un-filled signature reference, introduced for GHC 9.0.1 by Sylvain Henry on 2020-04-03 in GHC commit 10d15f1ec4ba ("Refactoring unit management code"; "Replace BackPack fake 'hole' UnitId by a proper HoleUnit constructor."). LogHelper.hi's own module reference is a VirtUnit whose substitution list maps each hole name to a Module wrapping HoleUnit, and the parser bailed at byte 2 before reaching the dependency section. Fix: one extra case in getModule that reads no payload for tag 2 and falls through to the trailing module-name read.
Bug B: dep_sig_mods was read as `skipList getModule`, but GHC declares it as `![ModuleName]` and serialises it as a list of cached FastStrings, added in GHC 9.4.1 by Matthew Pickering on 2021-05-05 in GHC commit 38faeea1a940 ("Remove transitive information about modules and packages from interface files", GHC issue #16885). For an empty sig_mods this accidentally worked (length 0, stop); for any non-empty value the parser walked into the wrong part of the stream and tripped on whatever byte happened to land at the next "unit tag" read. Consumer.hi and Main.hi both reproduced this as "Invalid unit type: 4". Fix: read the field as `skipList skipFastString`.
The new spec has one targeted assertion per bug (each becomes green as soon as that one bug is fixed, independently of the other) and a stricter "should fully deserialize" gate that requires both fixes. 24 examples total, all green. Full per-bug GHC source references (commit URLs, first release, release date, file paths and line numbers) are embedded in the spec's describe-block comments.
Member
|
Also tested on Windows 11 by:
|
Member
|
@philippedev101, this repository contains a guide for the creation of the existing Please could you post here how the three new |
Member
|
I updated the stale existing CI in the |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two bugs in
getInterfaceRecentthat hi-file-parser hits the moment a.hifile references signatures:getModuledoesn't recogniseHoleUnit(tag 2), anddep_sig_modsis read as a list ofModulewhen GHC actually writes a list ofModuleName. Both fixes are tiny (one extra case, one one-liner). The spec gets three fixtures, one targeted failing test per bug (each turns green from its own fix alone, independently of the other), and a "should fully deserialize" gate that requires both. Full GHC source pinpoints (introducing commits, first-release tags, file paths and line numbers) live inline as comments on eachdescribeblock, so the spec file itself is the reference and nothing important is buried only in this PR description.Discovered while working on commercialhaskell/stack#6865, which adds cross-package Backpack support to Stack.
Stack.ComponentFile.parseHIhits both bugs the moment Stack tries to load a.hifrom an indefinite or instantiated library. The bugs aren't Backpack-specific (they're plain parser/format compliance), but Backpack is by far the easiest way to make GHC emit the byte patterns that trip them, which is probably why nobody had reported them before.Bundled as one PR because the two fixes were discovered together, each is trivial, and shipping only one still leaves at least one of the three fixtures failing.