fix(staking): BLS proof-of-possession at candidate register/update (CRITICAL, pre-IIP-52)#4854
fix(staking): BLS proof-of-possession at candidate register/update (CRITICAL, pre-IIP-52)#4854envestcc wants to merge 9 commits into
Conversation
…pdate
CRITICAL security fix for the BLS rogue-key aggregate-forgery attack
against IIP-52's same-message FastAggregateVerify path.
IIP-52 verifies the COMMIT-vote BLS aggregate signature against the sum
of registered candidate BLS public keys:
blstAggregateSignature.FastAggregateVerify(true, Σ pk_i, msg, dst)
with dst = "BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_". The _POP_
suffix names the IRTF "Proof of Possession" ciphersuite, which is only
sound when each registered pubkey has been accompanied at registration
time by a signature proving knowledge of the corresponding secret key.
Today registration validates only key format + subgroup membership:
crypto.BLS12381PublicKeyFromBytes(blsPubKey)
No PoP is required. A registered candidate can publish
pk_rogue = g^x − Σ(other delegates' pubkeys)
computed from public information, supply a single signature
σ = sign(x, msg) and a bitmap that claims every delegate signed, and
FastAggregateVerify collapses Σ pk_i + pk_rogue to g^x, accepting σ.
The bitmap satisfies isMajorityCount, so one delegate forges a 2/3+
quorum certificate and the consensus safety property is broken.
The window is latent: IIP-52's ToBeEnabledBlockHeight = math.MaxUint64
so this code path is not live on any chain yet. It must be fixed
before activation.
- Add a 96-byte blsPop field to CandidateBasicInfo proto (iotex-proto
PR pushed; go.mod replace updated to envestcc/iotex-proto@7a486f6)
- New action/protocol/staking/bls_pop.go defines the PoP signing root:
BLSPopSigningRoot(blsPubKey, ownerAddress) =
SHA-256("IOTEX_BLS_POP_v1" || blsPubKey || ownerAddress.Bytes())
Binding all three values blocks the three replay variants:
(a) attacker without the secret cannot sign over the canonical
message at all — closes rogue key;
(b) PoP for owner A does not validate for owner B — closes
candidate-substitution replay;
(c) "IOTEX_BLS_POP_v1" domain prefix is disjoint from any other BLS
DST iotex uses, so a PoP cannot be replayed as a consensus vote
or vice versa, and a future fork can rotate via "v2".
- New FeatureCtx.EnforceBLSPoP shares IsToBeEnabled height with
EnableBLSAggregation today but is a semantically distinct switch so
the two can diverge if a hotfix or fork-height adjustment is needed.
- action.CandidateRegister and action.CandidateUpdate carry blsPop
through their constructors, Proto/LoadProto, and accessors.
NewCandidateRegisterWithBLS and NewCandidateUpdateWithBLS now accept
a blsPop parameter; pre-fork callers may pass nil.
- handleCandidateRegister, handleCandidateUpdate, and
handleCandidateUpdateByOperator call VerifyBLSPop before writing
Candidate.BLSPubKey when EnforceBLSPoP is active. Failure returns
ReceiptStatus_ErrUnauthorizedOperator.
- TestBLSPop_RogueKeyAttackBlocked is the regression guard: it models
the attacker trying to register a pubkey for which they do not have
the secret, demonstrates the only PoP they can produce (signed under
their own secret) does NOT validate against the target pubkey, and
confirms the control case (a legitimate registrant can produce a
valid PoP).
- ioctl/SDK wiring: stake2register.go and stake2update.go currently
pass nil for blsPop with TODO markers. These need a flag to ingest a
BLS private key and derive the PoP. Pre-fork the empty PoP is
accepted; post-fork the same call will fail with the new check, so
the tooling MUST be updated before the fork.
- Migration: existing pre-fork candidate records have no PoP. Before
the fork activates, all active candidates must submit a
candidateUpdate carrying a valid blsPop; the alternative is a
fork-block state migration that drops BLS pubkeys without
accompanying PoPs from ActiveCandidates. IIP draft will document
the chosen approach.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to the BLS proof-of-possession fix. Even with PoP in place, nothing in the current handler stops two different candidates from registering with the same blsPubKey: the existing ContainsName / ContainsOwner / ContainsOperator checks have no BLS equivalent. Two delegates sharing one BLS pubkey breaks IIP-52's quorum-counting model. The signer bitmap counts both delegates as having voted, but FastAggregateVerify aggregates pubkeys as a set — one contribution per distinct pubkey. The committee size as seen by aggregation is N, the committee size as seen by the bitmap is N+1, and the second delegate's stake-weight effectively "votes for free." This is a quieter cousin of the rogue-key attack: the aggregation math still verifies; only the accounting is off. Adds CandidateCenter.ContainsBLSPubKey(blsPubKey, except) and surfaces it through CandidateStateManager. Implementation is a linear scan over candidates — registration / update are sparse calls and the active delegate set is bounded; the saved O(1) lookup is not worth maintaining a fourth index map across the change/base commit flow. Hooks into all three handler paths under the EnforceBLSPoP gate: - handleCandidateRegister: reject if blsPubKey is held by any candidate other than the incumbent (when re-registering against an existing owner-without-selfstake record). - handleCandidateUpdate: reject if blsPubKey is held by any candidate other than the one being updated. A candidate keeping its own pubkey across updates is allowed (except = c.GetIdentifier()). - handleCandidateUpdateByOperator: same as update-by-owner. Failures return ReceiptStatus_ErrCandidateConflict (matches existing collision semantics for name / operator). Gated by EnforceBLSPoP so pre-fork blocks replay unchanged. Test: TestCandidateCenter_ContainsBLSPubKey covers nil / empty / no-match / match-with-nil-except / match-against-self (allowed) / match-against- other (rejected). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed a companion commit Why this belongs in the same PREven with PoP in place, two candidates registering the SAME blsPubKey breaks IIP-52's quorum-counting model. PoP only proves "I know the secret for this pubkey" — it does not prevent two delegates from independently producing valid PoPs for the same key (e.g. an actively shared keypair, or a delegate selling/leasing their secret). When IIP-52 footer aggregation runs:
Mitigation matches the existing collision-check pattern: Implementation
Tests
|
…t owner PoP at candidate update was binding to c.Owner — the *current* owner. For post-Xingu candidates this drifts after CandidateTransferOwnership: the new owner has to know they're the new owner at signing time, and a PoP signed under the old owner during a concurrent transfer is rejected purely on a value mismatch that has nothing to do with key possession. c.GetIdentifier() is the stable handle for this purpose: - Post-Xingu non-collision (the common case): Identifier is set once at register time to the original owner address (generateCandidateID fast-paths to owner when it's free). So a PoP signed at register binding to owner == identifier, and a PoP signed at update binding to c.GetIdentifier() lines up with that same value, even if owner has since transferred. - Post-Xingu collision (edge case): Identifier is a hash-derived address. Register still binds to act.OwnerAddress() (signer can't predict the hash), updates use c.GetIdentifier(). The register-time PoP doesn't get re-verified later, so this asymmetry is harmless; inside the candidate's lifetime, all update PoPs use the same hash consistently. - Pre-Xingu: c.Identifier is nil and c.GetIdentifier() falls back to c.Owner — behavior identical to the prior code. Renamed the function parameter from ownerAddress to candidateID throughout (BLSPopSigningRoot, SignBLSPop, VerifyBLSPop) and updated the docstring to spell out what to pass at register vs update. The on-the-wire bytes are unchanged — only the conceptual binding shifts. Tests: renamed TestBLSPop_RejectWrongOwner to RejectWrongCandidateID; added TestBLSPop_StableAcrossOwnershipTransfer locking in the property that a PoP signed under the original owner still verifies under the same identifier after the candidate is transferred to a new owner. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed `3fbaa0dcd` — bind update-path PoP to `c.GetIdentifier()` instead of `c.Owner`. Motivation`c.Owner` shifts when ownership is transferred via `CandidateTransferOwnership`; `c.GetIdentifier()` is the candidate's stable handle. The previous owner-binding meant:
Why this isn't a big change`generateCandidateID` returns the owner address directly when free, only hash-deriving on collision. For the common case identifier == registration-time owner, so:
The on-the-wire PoP bytes don't change. Only the conceptual binding shifts from "current owner" to "candidate identity." Implementation
Test plan
|
| // which case the check rejects on any match. Used to enforce one | ||
| // BLS pubkey per delegate, a hard requirement for IIP-52's | ||
| // FastAggregateVerify quorum-counting model. | ||
| ContainsBLSPubKey(blsPubKey []byte, except address.Address) bool |
There was a problem hiding this comment.
The ContainsBLSPubKey should not need the except parameter.
| h.Write([]byte(blsPopDomain)) | ||
| h.Write(blsPubKey) | ||
| if candidateID != nil { | ||
| h.Write(candidateID.Bytes()) |
There was a problem hiding this comment.
return nil if candididateID == nil ?
| // for pre-Xingu records. This means a candidate that has been | ||
| // transferred to a new owner still uses its original identity for | ||
| // PoP, so the binding is stable across ownership transfers. | ||
| func BLSPopSigningRoot(blsPubKey []byte, candidateID address.Address) []byte { |
There was a problem hiding this comment.
blsPubKey could be deleted. It will be needed during verification.
There was a problem hiding this comment.
Pushed `fbfd7ae92` — dropped `blsPubKey` from the signing root per our discussion.
The signing root is now `H(blsPopDomain || candidateID)`. Security model:
- Basic rogue-key registration: blocked by the pairing verifier alone. `Verify(pk_rogue, msg, sig)` requires `sig` to be produced under `sk_rogue`, which the attacker doesn't have — regardless of what's in `msg`.
- Same-message aggregation (the classical attack that pubkey-in-message defends against): requires two distinct honest signers to sign the same root. Protocol-enforced uniqueness of candidate identifiers (generateCandidateID + ContainsName/Operator/Owner checks at register time) rules that out — no two honest delegates ever produce PoPs over the same root.
- Cross-candidate replay: still blocked by candidateID binding.
- Cross-domain replay: still blocked by blsPopDomain.
`SignBLSPop` no longer derives `pk` from `sk` since the signing root doesn't need it. PoP bytes are still 96 B G2-compressed; DST unchanged.
Tests updated: `TestBLSPop_RogueKeyAttackBlocked` + `TestBLSPop_RejectNilCandidateID` use the new `BLSPopSigningRoot(candidateID)` signature.
| // the candidate identifier); at update time pass the candidate's | ||
| // existing identifier (c.GetIdentifier()). | ||
| func SignBLSPop(sk *crypto.BLS12381PrivateKey, candidateID address.Address) ([]byte, error) { | ||
| if sk == nil { |
There was a problem hiding this comment.
or return error if candidateID == nil
| // which case the check rejects on any match. Used to enforce one | ||
| // BLS pubkey per delegate, a hard requirement for IIP-52's | ||
| // FastAggregateVerify quorum-counting model. | ||
| ContainsBLSPubKey(blsPubKey []byte, except address.Address) bool |
There was a problem hiding this comment.
BLSPubKeyOwner to return the owner address.
PoP coverage in the Go action layer (envestcc/iotex-core#4854) wasn't visible from the web3 path: candidateRegisterWithBLS / candidateUpdateWithBLS ABI entries have no blsPop slot, so any tx routed through eth_sendRawTransaction and decoded via NewCandidateRegister/UpdateFromABIBinary silently dropped the field. Post-fork that would have produced txs the handler always rejects. Adds two new V2 ABI methods rather than mutating the existing selectors: - candidateRegisterWithBLSAndPoP (8 fields + blsPop + data) - candidateUpdateWithBLSAndPoP (4 fields + blsPubKey + blsPop) Why V2 instead of extending the existing methods: changing a parameter list changes the 4-byte function selector, breaking any tooling that hardcoded the old ABI. The legacy WithBLS entries stay working pre-fork (their handler path is unchanged), and post-fork they reject naturally for lacking PoP — coexistence with no selector churn. EthData routing picks the entry by data carried on the action: - WithBLS && len(blsPop) > 0 → V2 selector, calldata includes PoP - WithBLS → legacy selector, no PoP slot - otherwise → non-BLS legacy candidateRegister FromABIBinary recognises the V2 selector and decodes blsPop. Tests: extended TestCandidateRegisterABIEncodeAndDecode with a "with public key and PoP" subcase, and added TestCandidateUpdate / "ABI encode with PoP" — both confirm the codec round-trips a non-empty PoP through Pack + Unpack, locking in the property whose absence was the bug. Out of scope for this commit: e2e coverage that exercises the web3 path through to a post-fork handler (proves the new selector actually flows blsPop into Candidate.BLSPubKey assignment). Will follow once the EnforceBLSPoP gate gets switched on in an integration scenario. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed `ab96bbc4b` — closes the web3-path gap you flagged. What was brokenThe PoP changes earlier in this PR live in the Go action layer + proto + handler. The ABI layer (which web3 callers actually use via `eth_sendRawTransaction`) had no `blsPop` slot:
Post-fork impact would have been: every web3-routed register/update with BLS — wallets, SDKs, `ioctl` flows going through eth-tx — gets rejected by the handler for missing PoP, with no way to provide it from the client side. What changedAdded two new V2 ABI entries (your option B):
Coexistence model:
`EthData()` routes by content:
`FromABIBinary` adds the V2 selector recognition + decodes `blsPop`. Tests
Both fail without the EthData/FromABIBinary changes — they're regression guards for exactly the bug you spotted. Still out of scopeEnd-to-end web3 → handler test (proves the V2 selector flows blsPop all the way into `Candidate.BLSPubKey` post-fork, and that legacy selector + post-fork rejects properly). Will follow once we wire up the EnforceBLSPoP gate in a proper integration scenario — that's the same fork-activation work as the `ioctl` PoP-generation flag. |
…ning The previous commit added candidateRegisterWithBLSAndPoP / candidateUpdateWithBLSAndPoP entries to native_staking_contract_abi.json but left native_staking_contract_interface.sol — the human-readable source-of-truth that clients / docs consume — out of sync. Adds the matching Solidity declarations and a header comment in both the .sol file and native_staking_contract_abi.go pointing out the non-obvious sharp edge: there is no Makefile target that regenerates the JSON from the .sol, so the two files must be kept in sync by hand on every ABI change. Out-of-sync edits silently misencode the web3 path — clients embed one function signature while the node decodes another. No runtime behaviour change; the JSON has been authoritative all along. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Good catch — pushed `98bdaa969` mirroring the V2 PoP methods in `native_staking_contract_interface.sol` and adding a sync-warning header to both files. Findings
What's in the commit
The warning text is the same in both: "Adding a new method, parameter, or event requires editing BOTH files by hand, otherwise the web3 path silently misencodes — clients embed one signature while the node decodes another. There is no Makefile target that catches the drift." On the lack of a Makefile targetIt would be cleaner to wire one up (`solc --abi` → generated JSON, fail CI on drift). Out of scope for this PR — would need a solc version pin, an addition to the dev-deps target, and a CI step. Worth filing as a follow-up if the team wants. For now the comment is the cheapest guard. |
Addresses two review threads on PR iotexproject#4854: ## Comments 1 + 5: drop the `except` parameter Both envestcc and CoderZhi flagged that ContainsBLSPubKey(pubkey, except) bool pushed the "is this me?" decision into a generic helper that doesn't belong there. Replaces ContainsBLSPubKey(pubkey, except) bool with the broader GetByBLSPubKey(pubkey) *Candidate: callers receive the (possibly nil) holder and compare identifiers themselves. The three register / update handler call sites now read if holder := csm.GetByBLSPubKey(act.BLSPubKey()); holder != nil && holder.GetIdentifier().String() != c.GetIdentifier().String() { return ErrCandidateConflict } which is more explicit than the prior `except`-hiding API. It also unifies with the same GetByBLSPubKey method added in iotexproject#4857 (Y4a) so the two PRs don't introduce parallel BLS-pubkey-lookup APIs. ## Comments 2 + 4: strict nil-candidateID contract Both BLSPopSigningRoot and the surrounding Sign / Verify helpers used to silently accept a nil candidateID by skipping the candidate-binding write. That degrades the scheme to domain+pubkey-only — exactly the shape an attacker reaching for a cross-candidate replay would hope for. The three entry points now refuse: - BLSPopSigningRoot returns nil - SignBLSPop returns error("nil candidate ID; PoP must bind to a candidate identity") - VerifyBLSPop rejects before any cryptographic work TestBLSPop_RejectNilCandidateID locks the contract in place. ## Not in this commit Comment 3 ("blsPubKey can be removed from the signing root") — deferred. Will reply on the thread; the short answer is that the IRTF BLS draft defines canonical PoP as sign(sk, pk) and dropping blsPubKey from the digest opens up same-message aggregation when owner-uniqueness is ever relaxed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed `9229438ac` addressing 4 of the 5 review comments. The 5th (Comment 3 about `blsPubKey` in the signing root) I'm leaving alone and will reply on the thread separately. What changedComments 1 + 5 → `GetByBLSPubKey` replaces `ContainsBLSPubKey``ContainsBLSPubKey(pubkey, except) bool` hid the "is this me?" decision in a helper. Replaced with `GetByBLSPubKey(pubkey) *Candidate` — callers get the (possibly nil) holder and compare identifiers themselves. The three register / update handler call sites now read: ```go Two side benefits:
Comments 2 + 4 → strict nil-candidateID`BLSPopSigningRoot` / `SignBLSPop` / `VerifyBLSPop` used to silently accept nil candidateID by skipping the candidate-binding write — which degrades the scheme to domain+pubkey-only and is exactly the shape an attacker reaching for cross-candidate replay would hope for. The three entry points now refuse:
`TestBLSPop_RejectNilCandidateID` locks the contract. Test plan
|
Addresses Comment 3 on PR iotexproject#4854 (CoderZhi). The previous signing root included blsPubKey: H(blsPopDomain || blsPubKey || candidateID) per the IRTF BLS canonical PoP form. After discussion, switching to: H(blsPopDomain || candidateID) The pairing verifier Verify(PK, msg, sig) already commits PK into the signature equation, so the basic rogue-key registration ("register pk_rogue without owning its discrete log") is blocked by basic PoP correctness without needing pubkey-in-message. Same-message aggregation — the classical attack that pubkey-in-message defends against — requires two distinct honest signers to sign the same signing root; the protocol-enforced uniqueness of candidate identifiers rules that out. Cross-candidate replay is still blocked by the candidateID binding, and cross-domain replay by blsPopDomain. The simpler signing root keeps the on-the-wire calldata independent of pubkey-length/encoding choices and removes an apparent redundancy between the signed message and the verifier's pk argument. SignBLSPop no longer derives pk from sk; the only caller-provided binding is candidateID. The bls_pop bytes themselves are unchanged in length and DST. Updated TestBLSPop_RogueKeyAttackBlocked and TestBLSPop_RejectNilCandidateID to use the new BLSPopSigningRoot(candidateID) signature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-pop Wires the three-option model agreed on PR iotexproject#4854: how does a delegate actually supply the BLS proof-of-possession the post-fork handler now requires? ## stake2 register / update BREAKING positional change: BLS_PUBKEY moves out of the positional arg list. Pre-fork callers passing it positionally must migrate; the post-fork handler requires PoP anyway so this command was always going to need re-tooling. Three-option matrix lives in blspop_helper.go and applies to both register (mandatory BLS) and update (optional BLS, default = don't touch): Option 1 — auto-derive from signer (default for register): No --bls-* flags. The tool decrypts the signer keystore, derives a BLS key from the ECDSA scalar (same algorithm as `ioctl account blskey` — crypto.GenerateBLS12381PrivateKey(ecdsa_sk.Bytes())), computes the PoP, and prompts the user to confirm. The prompt is explicit about ECDSA↔BLS coupling so delegates aren't surprised when an ECDSA leak also exposes their producer identity. --yes suppresses for CI. Option 2 — --bls-priv-key HEX: Standalone BLS key supplied directly. Tool derives the pubkey and signs the PoP. No prompt. Option 3 — --bls-pubkey HEX + --bls-pop HEX: User supplies both. Tool VerifyBLSPop's locally against the candidateID before submission so a malformed PoP fails fast without burning gas. For register the candidateID is the owner address from positional args; for update it must be passed via --candidate-id. Option 0 (update only): no BLS flags at all → BLS untouched. Going from "leave BLS alone" to "rotate to derived key" requires explicit --bls-from-signer so a name/operator update can never silently rotate the producer key. Mutual-exclusion + completeness rules are centralised in blsPoPFlags.classifyForRegister / classifyForUpdate and locked in by TestBlsPoPFlags_ClassifyForRegister / ClassifyForUpdate. --bls-keystore is a placeholder; the iotex BLS keystore format isn't specified yet. Tool returns an explicit error pointing at --bls-priv-key. ## account bls-sign-pop New offline PoP signer for air-gapped workflows. Accepts either --bls-priv-key HEX or --signer ADDRESS (ECDSA keystore → derive), plus --candidate-id ADDRESS. Writes the 96-byte PoP hex to stdout (informational fields go to stderr so shell redirection captures only the PoP). The delegate signs on the cold machine, transfers the hex over USB / QR, and runs `ioctl stake2 register --bls-pubkey ... --bls-pop ...` on a hot machine — the BLS private key never leaves cold storage. Lives in the account package next to the existing `account blskey` so BLS key operations are co-located rather than scattered. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed `b80f0f316` — ioctl UX for the BLS PoP path. What lands`ioctl stake2 register`Three-option matrix:
BREAKING: `BLS_PUBKEY` moves out of positional args (was `args[4]`) into flags. Pre-fork pre-existing scripts need updating, but post-fork they need re-tooling anyway since they must supply PoP. `ioctl stake2 update`Same three options PLUS Option 0:
`ioctl account bls-sign-pop` (new)Offline PoP generator. Takes `--bls-priv-key HEX` OR `--signer ADDRESS` (the latter decrypts the ECDSA keystore + derives BLS), plus `--candidate-id ADDRESS`. Writes 96-byte PoP hex to stdout (informational fields to stderr so `> pop.hex` captures just the PoP). Air-gap flow: ```bash cold machineioctl account bls-sign-pop --candidate-id io1owner... --bls-priv-key 0x... > pop.hex USB / QR to hot machineioctl stake2 register mycand io1op io1reward io1owner 10000 350 \ Why hot/cold separation matters hereBLS sk derived from ECDSA sk means a single compromise blast both. The `--yes`-able prompt makes the coupling explicit so delegates choosing the easy path know what they signed up for; security-conscious delegates use Option 2 / 3 with separately-managed BLS material. Tests
Deferred
Out of scope (separate PR territory)
🤖 Generated with Claude Code |
PR iotexproject#4854 had no handler-level integration coverage of the EnforceBLSPoP gate — only the underlying SignBLSPop / VerifyBLSPop unit tests in bls_pop_test.go — and no e2etest of the post-fork register path at all. Reviewers couldn't confirm the three handler sites actually fail-closed under the gate without running the full chain manually. ## Handler integration (action/protocol/staking/handlers_blspop_test.go) Three new test functions, 11 subcases: - TestHandleCandidateRegister_PoPGate - gate on + valid PoP → ReceiptStatus_Success, c.BLSPubKey persisted - gate on + empty PoP → ErrUnauthorizedOperator - gate on + PoP signed under wrong key → ErrUnauthorizedOperator - gate off + empty PoP → Success (pre-fork behaviour preserved) - TestHandleCandidateRegister_BLSPubKeyUniqueness - second candidate registering the same BLS pubkey with a valid PoP under its own owner → ErrCandidateConflict; the GetByBLSPubKey uniqueness check protects IIP-52's quorum-counting invariant against the "shared keypair" silent break - TestHandleCandidateUpdate_PoPGate - gate on + valid PoP under c.GetIdentifier() → rotation succeeds, new BLSPubKey in state - gate on + empty PoP → ErrUnauthorizedOperator - gate on + PoP signed under wrong candidateID → ErrUnauthorizedOperator - gate off + empty PoP → rotation succeeds (pre-fork compat) The fixture flips EnforceBLSPoP via genesis.ToBeEnabledBlockHeight (0 / math.MaxUint64) and forces XinguBlockHeight = 0 so the BLS registration codepath is reachable at the test block. ## e2etest (e2etest/native_staking_test.go) TestCandidateBLSPoP exercises the full chain pipeline — encode → submit → mint → handler → receipt → state — with four post-fork subcases plus a pre-fork backward-compat anchor: 1. Pre-fork register without PoP succeeds 2. Post-fork register with valid PoP succeeds, BlsPubKey persists 3. Post-fork register without PoP returns ErrUnauthorizedOperator 4. Post-fork update rotates BLS pubkey with valid PoP 5. Post-fork update without PoP returns ErrUnauthorizedOperator and leaves the previously rotated pubkey untouched in state Genesis wires ToBeEnabledBlockHeight == XinguBlockHeight so the BLS register path and the PoP gate activate together — the same shape the planned fork rollout will use. Pre-existing TestProtocol_FetchBucketAndValidate flake (master and this branch both flap ~1/3) is unrelated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed `143010dfe` — handler integration + e2etest coverage for the PoP gate. Handler integration (`handlers_blspop_test.go`, new)3 functions, 11 subcases covering EnforceBLSPoP at all three handler sites: TestHandleCandidateRegister_PoPGate — gate on/off × {valid PoP, empty PoP, PoP signed by wrong key}. The "wrong key" subcase exercises the rogue-key blocking via the pairing verifier, separate from the unit-test-level proof in `bls_pop_test.go`. TestHandleCandidateRegister_BLSPubKeyUniqueness — second candidate trying to register the same BLS pubkey with a fresh valid PoP under its own owner → ErrCandidateConflict. Locks the GetByBLSPubKey invariant in at the handler boundary. TestHandleCandidateUpdate_PoPGate — rotation under gate on/off × {valid PoP, empty PoP, PoP for wrong candidateID}. The "wrong candidateID" subcase confirms the cross-candidate replay defence (the binding lives in BLSPopSigningRoot, not at the protocol layer above). Fixture flips the gate via `genesis.ToBeEnabledBlockHeight` (0 vs `math.MaxUint64`) and forces XinguBlockHeight = 0 so the BLS register/update codepath is reachable. e2etest (`native_staking_test.go`)TestCandidateBLSPoP — full chain pipeline encode → submit → mint → handler → receipt → state, with 5 subcases:
Genesis wires `ToBeEnabledBlockHeight == XinguBlockHeight` so the BLS register path and the PoP gate activate together — same shape as the planned fork rollout. Each test asserts both the receipt status AND the resulting candidate state, so a silent state mutation under a failure receipt wouldn't slip past. Pre-existing flake`TestProtocol_FetchBucketAndValidate/validate_owner_and_selfstake` flakes about 1 in 3 runs on this branch — and confirmed earlier it also flakes on master. Not introduced by this PR. Test plan
|
|



Why land BEFORE IIP-52
Master today has the full BLS registration infrastructure: `Candidate.BlsPubKey` proto field 5, `candidateRegisterWithBLS` ABI method, `crypto.BLS12381PublicKeyFromBytes` (format + subgroup check only, no PoP). Every BLS pubkey that gets registered between now and the IIP-52 activation block is a potential attack vector — once `FastAggregateVerify` goes live, all un-attested pubkeys become exploitable.
Activating `EnforceBLSPoP` BEFORE the IIP-52 fork means:
The bug
IIP-52's footer aggregation calls
```go
crypto.BLSAggregateSignatureFromBytes(sig).Verify(pubKeys, msg)
// → blst.FastAggregateVerify(true, pubKeys, msg, dst)
```
with DST `BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_`. The `POP` suffix names the IRTF Proof-of-Possession ciphersuite, sound only if every registered pubkey arrived with a possession proof.
Today `action/candidate_register.go` and `action/candidate_update.go` validate only:
```go
_, err := crypto.BLS12381PublicKeyFromBytes(blsPubKey) // format + subgroup
```
No PoP. The proto has no PoP field.
Attack (becomes exploitable when IIP-52 activates)
A single active delegate:
One delegate forges a 2/3+ quorum certificate → finalizes arbitrary blocks → consensus safety break.
The fix (Eth2 `process_deposit` pattern)
```go
// action/protocol/staking/bls_pop.go
const blsPopDomain = "IOTEX_BLS_POP_v1"
func BLSPopSigningRoot(blsPubKey []byte, ownerAddress address.Address) []byte {
h := sha256.New()
h.Write([]byte(blsPopDomain))
h.Write(blsPubKey)
h.Write(ownerAddress.Bytes())
return h.Sum(nil)
}
```
Three values bind into the signed message:
Implementation
Proto (iotex-proto#173): `bytes blsPop = 5` on `CandidateBasicInfo`. go.mod replace updated to envestcc/iotex-proto@7a486f6.
FeatureCtx.EnforceBLSPoP: `g.IsToBeEnabled(height)`. Standalone flag — does NOT share state with any IIP-52 flag, so this PR is fully independent.
action.CandidateRegister / CandidateUpdate: carry `blsPop` through constructors, Proto/LoadProto, accessors. `NewCandidateRegisterWithBLS` and `NewCandidateUpdateWithBLS` now take a `blsPop` parameter; pre-fork callers may pass nil.
handlers.go: `handleCandidateRegister`, `handleCandidateUpdate`, and `handleCandidateUpdateByOperator` call `VerifyBLSPop` before writing `Candidate.BLSPubKey` when `EnforceBLSPoP` is active. Failure → `ReceiptStatus_ErrUnauthorizedOperator`.
bls_pop_test.go: 5 cases including `TestBLSPop_RogueKeyAttackBlocked` as the regression guard — models the attacker registering a pubkey for which they don't have the secret; demonstrates no PoP they can produce (under any secret they control) validates against that pubkey; control case confirms a legitimate registrant succeeds.
Migration (out of scope for this PR)
Existing pre-PoP-merge candidate records have no PoP. Before `EnforceBLSPoP` activates, all active candidates must either:
This is a coordination problem (talking to delegates, picking a grace window), not a code problem. IIP draft will spell it out.
Out of scope for this PR (follow-ups)
ioctl / SDK wiring: `stake2register.go` and `stake2update.go` pass nil for `blsPop` with TODO markers. They need a flag to ingest a BLS private key and derive the PoP. Pre-fork accepted; post-fork rejected.
Dedicated receipt status code: reusing `ErrUnauthorizedOperator` (closest semantic match) to keep proto-change scope minimal. Adding `ErrInvalidBLSPoP` is another proto bump and not necessary for correctness.
Test plan
🤖 Generated with Claude Code