fix(staking): BLS proof-of-possession at candidate register/update (CRITICAL, pre-IIP-52) by envestcc · Pull Request #4854 · iotexproject/iotex-core

envestcc · 2026-06-11T02:17:18Z

Security severity: CRITICAL, latent. Replaces #4853 — this PR is based directly on master, NOT stacked on the IIP-52 PR chain. The vulnerable registration path (`crypto.BLS12381PublicKeyFromBytes` only, no PoP) is already live on master; IIP-52 turns the latent bug into a live exploit, but the un-attested BLS pubkeys can be collected starting now.

Depends on iotex-proto#173 (adds the `blsPop` field on `CandidateBasicInfo`).

Why land BEFORE IIP-52

Master today has the full BLS registration infrastructure: `Candidate.BlsPubKey` proto field 5, `candidateRegisterWithBLS` ABI method, `crypto.BLS12381PublicKeyFromBytes` (format + subgroup check only, no PoP). Every BLS pubkey that gets registered between now and the IIP-52 activation block is a potential attack vector — once `FastAggregateVerify` goes live, all un-attested pubkeys become exploitable.

Activating `EnforceBLSPoP` BEFORE the IIP-52 fork means:

New registrations from this PR onward all carry verified PoPs
The pre-fork migration scope shrinks to existing pubkeys registered before this PR lands
IIP-52 doesn't have to ship the PoP check — it just relies on the gate already being in place

The bug

IIP-52's footer aggregation calls

```go
crypto.BLSAggregateSignatureFromBytes(sig).Verify(pubKeys, msg)
// → blst.FastAggregateVerify(true, pubKeys, msg, dst)
```

with DST `BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_`. The `POP` suffix names the IRTF Proof-of-Possession ciphersuite, sound only if every registered pubkey arrived with a possession proof.

Today `action/candidate_register.go` and `action/candidate_update.go` validate only:

```go
_, err := crypto.BLS12381PublicKeyFromBytes(blsPubKey) // format + subgroup
```

No PoP. The proto has no PoP field.

Attack (becomes exploitable when IIP-52 activates)

A single active delegate:

Reads other delegates' BLS pubkeys from chain
Picks any secret `x` they control
Computes `pk_rogue = g^x − Σ(other pubkeys)` in G1 (public information only)
Registers `pk_rogue` via `candidateRegisterWithBLS` — current handler accepts
After IIP-52 activates, submits a footer with all-ones bitmap and `σ = sign(x, msg)`
`FastAggregateVerify` collapses `Σ pk_i + pk_rogue` to `g^x`. `σ` verifies. `isMajorityCount` passes.

One delegate forges a 2/3+ quorum certificate → finalizes arbitrary blocks → consensus safety break.

The fix (Eth2 `process_deposit` pattern)

```go
// action/protocol/staking/bls_pop.go
const blsPopDomain = "IOTEX_BLS_POP_v1"

func BLSPopSigningRoot(blsPubKey []byte, ownerAddress address.Address) []byte {
h := sha256.New()
h.Write([]byte(blsPopDomain))
h.Write(blsPubKey)
h.Write(ownerAddress.Bytes())
return h.Sum(nil)
}
```

Three values bind into the signed message:

Binding	Closes
`blsPubKey`	Rogue key — attacker without the secret cannot sign the canonical message
`ownerAddress`	Candidate substitution — PoP for owner A doesn't validate for owner B
`IOTEX_BLS_POP_v1` prefix	Cross-domain replay — PoP disjoint from any other BLS-signed iotex message; version suffix reserves room for a future fork to rotate

Implementation

Proto (iotex-proto#173): `bytes blsPop = 5` on `CandidateBasicInfo`. go.mod replace updated to envestcc/iotex-proto@7a486f6.
FeatureCtx.EnforceBLSPoP: `g.IsToBeEnabled(height)`. Standalone flag — does NOT share state with any IIP-52 flag, so this PR is fully independent.
action.CandidateRegister / CandidateUpdate: carry `blsPop` through constructors, Proto/LoadProto, accessors. `NewCandidateRegisterWithBLS` and `NewCandidateUpdateWithBLS` now take a `blsPop` parameter; pre-fork callers may pass nil.
handlers.go: `handleCandidateRegister`, `handleCandidateUpdate`, and `handleCandidateUpdateByOperator` call `VerifyBLSPop` before writing `Candidate.BLSPubKey` when `EnforceBLSPoP` is active. Failure → `ReceiptStatus_ErrUnauthorizedOperator`.
bls_pop_test.go: 5 cases including `TestBLSPop_RogueKeyAttackBlocked` as the regression guard — models the attacker registering a pubkey for which they don't have the secret; demonstrates no PoP they can produce (under any secret they control) validates against that pubkey; control case confirms a legitimate registrant succeeds.

Migration (out of scope for this PR)

Existing pre-PoP-merge candidate records have no PoP. Before `EnforceBLSPoP` activates, all active candidates must either:

Submit a `candidateUpdate` carrying a valid `blsPop` (grace period), or
Be dropped from `ActiveCandidates` at the fork height via a state migration

This is a coordination problem (talking to delegates, picking a grace window), not a code problem. IIP draft will spell it out.

Out of scope for this PR (follow-ups)

ioctl / SDK wiring: `stake2register.go` and `stake2update.go` pass nil for `blsPop` with TODO markers. They need a flag to ingest a BLS private key and derive the PoP. Pre-fork accepted; post-fork rejected.
Dedicated receipt status code: reusing `ErrUnauthorizedOperator` (closest semantic match) to keep proto-change scope minimal. Adding `ErrInvalidBLSPoP` is another proto bump and not necessary for correctness.

Test plan

`go test ./action/protocol/staking/ ./action/` — passing locally (470 cases including 5 new PoP tests)
`go build ./...` clean
Rogue-key regression guard: `TestBLSPop_RogueKeyAttackBlocked` ✓
CI green on master + iotex-proto#173
Coordinate migration plan with delegates before flipping fork height

🤖 Generated with Claude Code

…pdate CRITICAL security fix for the BLS rogue-key aggregate-forgery attack against IIP-52's same-message FastAggregateVerify path. IIP-52 verifies the COMMIT-vote BLS aggregate signature against the sum of registered candidate BLS public keys: blstAggregateSignature.FastAggregateVerify(true, Σ pk_i, msg, dst) with dst = "BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_". The _POP_ suffix names the IRTF "Proof of Possession" ciphersuite, which is only sound when each registered pubkey has been accompanied at registration time by a signature proving knowledge of the corresponding secret key. Today registration validates only key format + subgroup membership: crypto.BLS12381PublicKeyFromBytes(blsPubKey) No PoP is required. A registered candidate can publish pk_rogue = g^x − Σ(other delegates' pubkeys) computed from public information, supply a single signature σ = sign(x, msg) and a bitmap that claims every delegate signed, and FastAggregateVerify collapses Σ pk_i + pk_rogue to g^x, accepting σ. The bitmap satisfies isMajorityCount, so one delegate forges a 2/3+ quorum certificate and the consensus safety property is broken. The window is latent: IIP-52's ToBeEnabledBlockHeight = math.MaxUint64 so this code path is not live on any chain yet. It must be fixed before activation. - Add a 96-byte blsPop field to CandidateBasicInfo proto (iotex-proto PR pushed; go.mod replace updated to envestcc/iotex-proto@7a486f6) - New action/protocol/staking/bls_pop.go defines the PoP signing root: BLSPopSigningRoot(blsPubKey, ownerAddress) = SHA-256("IOTEX_BLS_POP_v1" || blsPubKey || ownerAddress.Bytes()) Binding all three values blocks the three replay variants: (a) attacker without the secret cannot sign over the canonical message at all — closes rogue key; (b) PoP for owner A does not validate for owner B — closes candidate-substitution replay; (c) "IOTEX_BLS_POP_v1" domain prefix is disjoint from any other BLS DST iotex uses, so a PoP cannot be replayed as a consensus vote or vice versa, and a future fork can rotate via "v2". - New FeatureCtx.EnforceBLSPoP shares IsToBeEnabled height with EnableBLSAggregation today but is a semantically distinct switch so the two can diverge if a hotfix or fork-height adjustment is needed. - action.CandidateRegister and action.CandidateUpdate carry blsPop through their constructors, Proto/LoadProto, and accessors. NewCandidateRegisterWithBLS and NewCandidateUpdateWithBLS now accept a blsPop parameter; pre-fork callers may pass nil. - handleCandidateRegister, handleCandidateUpdate, and handleCandidateUpdateByOperator call VerifyBLSPop before writing Candidate.BLSPubKey when EnforceBLSPoP is active. Failure returns ReceiptStatus_ErrUnauthorizedOperator. - TestBLSPop_RogueKeyAttackBlocked is the regression guard: it models the attacker trying to register a pubkey for which they do not have the secret, demonstrates the only PoP they can produce (signed under their own secret) does NOT validate against the target pubkey, and confirms the control case (a legitimate registrant can produce a valid PoP). - ioctl/SDK wiring: stake2register.go and stake2update.go currently pass nil for blsPop with TODO markers. These need a flag to ingest a BLS private key and derive the PoP. Pre-fork the empty PoP is accepted; post-fork the same call will fail with the new check, so the tooling MUST be updated before the fork. - Migration: existing pre-fork candidate records have no PoP. Before the fork activates, all active candidates must submit a candidateUpdate carrying a valid blsPop; the alternative is a fork-block state migration that drops BLS pubkeys without accompanying PoPs from ActiveCandidates. IIP draft will document the chosen approach. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Companion to the BLS proof-of-possession fix. Even with PoP in place, nothing in the current handler stops two different candidates from registering with the same blsPubKey: the existing ContainsName / ContainsOwner / ContainsOperator checks have no BLS equivalent. Two delegates sharing one BLS pubkey breaks IIP-52's quorum-counting model. The signer bitmap counts both delegates as having voted, but FastAggregateVerify aggregates pubkeys as a set — one contribution per distinct pubkey. The committee size as seen by aggregation is N, the committee size as seen by the bitmap is N+1, and the second delegate's stake-weight effectively "votes for free." This is a quieter cousin of the rogue-key attack: the aggregation math still verifies; only the accounting is off. Adds CandidateCenter.ContainsBLSPubKey(blsPubKey, except) and surfaces it through CandidateStateManager. Implementation is a linear scan over candidates — registration / update are sparse calls and the active delegate set is bounded; the saved O(1) lookup is not worth maintaining a fourth index map across the change/base commit flow. Hooks into all three handler paths under the EnforceBLSPoP gate: - handleCandidateRegister: reject if blsPubKey is held by any candidate other than the incumbent (when re-registering against an existing owner-without-selfstake record). - handleCandidateUpdate: reject if blsPubKey is held by any candidate other than the one being updated. A candidate keeping its own pubkey across updates is allowed (except = c.GetIdentifier()). - handleCandidateUpdateByOperator: same as update-by-owner. Failures return ReceiptStatus_ErrCandidateConflict (matches existing collision semantics for name / operator). Gated by EnforceBLSPoP so pre-fork blocks replay unchanged. Test: TestCandidateCenter_ContainsBLSPubKey covers nil / empty / no-match / match-with-nil-except / match-against-self (allowed) / match-against- other (rejected). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-11T03:38:06Z

Pushed a companion commit 4d5389792 adding BLS pubkey uniqueness at register/update.

Why this belongs in the same PR

Even with PoP in place, two candidates registering the SAME blsPubKey breaks IIP-52's quorum-counting model. PoP only proves "I know the secret for this pubkey" — it does not prevent two delegates from independently producing valid PoPs for the same key (e.g. an actively shared keypair, or a delegate selling/leasing their secret).

When IIP-52 footer aggregation runs:

The signer bitmap counts each delegate as 1 — committee size = N+1
FastAggregateVerify sums pubkeys as a SET — committee size = N (duplicate dropped)
isMajorityCount(N+1) may pass even when only N distinct keys actually signed → quieter cousin of the rogue-key attack: aggregation math still verifies, accounting is off, second delegate's stake-weight votes for free

Mitigation matches the existing collision-check pattern: csm.ContainsBLSPubKey(blsPubKey, except) returns true if any candidate other than `except` already holds the pubkey, parallel to `ContainsName` / `ContainsOperator`.

Implementation

`CandidateCenter.ContainsBLSPubKey` — linear scan over `m.All()`. Registration is sparse and active-set is bounded; not worth maintaining a 4th index map across the change/base commit flow.
`CandidateStateManager` interface adds the new method; `candSM` delegates to `candCenter`.
All 3 handler paths (`handleCandidateRegister`, `handleCandidateUpdate`, `handleCandidateUpdateByOperator`) call it after PoP verification, returning `ErrCandidateConflict` on collision.
Gated under the same `EnforceBLSPoP` feature flag — pre-fork replay unchanged.
Update paths exempt the candidate-being-updated (`except = c.GetIdentifier()`) so re-registering / updating with one's own pubkey is allowed (no-op pubkey case).

Tests

`TestCandidateCenter_ContainsBLSPubKey` covers nil / empty / no-match / match-with-nil-except / match-against-self (allowed) / match-against-other (rejected).
Full staking + action suite: 472 passing.

…t owner PoP at candidate update was binding to c.Owner — the *current* owner. For post-Xingu candidates this drifts after CandidateTransferOwnership: the new owner has to know they're the new owner at signing time, and a PoP signed under the old owner during a concurrent transfer is rejected purely on a value mismatch that has nothing to do with key possession. c.GetIdentifier() is the stable handle for this purpose: - Post-Xingu non-collision (the common case): Identifier is set once at register time to the original owner address (generateCandidateID fast-paths to owner when it's free). So a PoP signed at register binding to owner == identifier, and a PoP signed at update binding to c.GetIdentifier() lines up with that same value, even if owner has since transferred. - Post-Xingu collision (edge case): Identifier is a hash-derived address. Register still binds to act.OwnerAddress() (signer can't predict the hash), updates use c.GetIdentifier(). The register-time PoP doesn't get re-verified later, so this asymmetry is harmless; inside the candidate's lifetime, all update PoPs use the same hash consistently. - Pre-Xingu: c.Identifier is nil and c.GetIdentifier() falls back to c.Owner — behavior identical to the prior code. Renamed the function parameter from ownerAddress to candidateID throughout (BLSPopSigningRoot, SignBLSPop, VerifyBLSPop) and updated the docstring to spell out what to pass at register vs update. The on-the-wire bytes are unchanged — only the conceptual binding shifts. Tests: renamed TestBLSPop_RejectWrongOwner to RejectWrongCandidateID; added TestBLSPop_StableAcrossOwnershipTransfer locking in the property that a PoP signed under the original owner still verifies under the same identifier after the candidate is transferred to a new owner. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-11T04:13:41Z

Pushed `3fbaa0dcd` — bind update-path PoP to `c.GetIdentifier()` instead of `c.Owner`.

Motivation

`c.Owner` shifts when ownership is transferred via `CandidateTransferOwnership`; `c.GetIdentifier()` is the candidate's stable handle. The previous owner-binding meant:

a new owner doing BLS rotation right after receiving a transfer had to sign with the new owner address — not wrong, but conceptually awkward
mid-flight concurrent transfers (`Alice → Bob → Carol` while Bob's update tx is in mempool) would invalidate Bob's PoP for a non-cryptographic reason

Why this isn't a big change

`generateCandidateID` returns the owner address directly when free, only hash-deriving on collision. For the common case identifier == registration-time owner, so:

register: PoP binds to `act.OwnerAddress()` (the future identifier in the non-collision case). Unchanged from before.
update: PoP binds to `c.GetIdentifier()`:
- post-Xingu non-collision: same value as the register-time owner — register and update PoPs use identical bindings
- post-Xingu collision (edge case): identifier is a hash; updates consistently use it across the candidate's lifetime
- pre-Xingu: `GetIdentifier()` falls back to `c.Owner` — identical to the previous behavior

The on-the-wire PoP bytes don't change. Only the conceptual binding shifts from "current owner" to "candidate identity."

Implementation

`bls_pop.go`: renamed the parameter from `ownerAddress` to `candidateID` throughout (`BLSPopSigningRoot`, `SignBLSPop`, `VerifyBLSPop`); docstring spells out what to pass at register vs update
`handlers.go`: `handleCandidateUpdate` and `handleCandidateUpdateByOperator` switched from `c.Owner` to `c.GetIdentifier()` for the PoP binding (uniqueness check already used `c.GetIdentifier()`)
`bls_pop_test.go`: `TestBLSPop_RejectWrongOwner` → `TestBLSPop_RejectWrongCandidateID`; added `TestBLSPop_StableAcrossOwnershipTransfer` locking in the stability property

Test plan

6 PoP tests pass including the new transfer-stability case
No on-the-wire changes — pre-fork replay unaffected

envestcc · 2026-06-15T02:41:35Z

+		// which case the check rejects on any match. Used to enforce one
+		// BLS pubkey per delegate, a hard requirement for IIP-52's
+		// FastAggregateVerify quorum-counting model.
+		ContainsBLSPubKey(blsPubKey []byte, except address.Address) bool


The ContainsBLSPubKey should not need the except parameter.

CoderZhi · 2026-06-15T06:53:32Z

+	h.Write([]byte(blsPopDomain))
+	h.Write(blsPubKey)
+	if candidateID != nil {
+		h.Write(candidateID.Bytes())


return nil if candididateID == nil ?

CoderZhi · 2026-06-15T06:54:07Z

+//     for pre-Xingu records. This means a candidate that has been
+//     transferred to a new owner still uses its original identity for
+//     PoP, so the binding is stable across ownership transfers.
+func BLSPopSigningRoot(blsPubKey []byte, candidateID address.Address) []byte {


blsPubKey could be deleted. It will be needed during verification.

Pushed `fbfd7ae92` — dropped `blsPubKey` from the signing root per our discussion.

The signing root is now `H(blsPopDomain || candidateID)`. Security model:

Basic rogue-key registration: blocked by the pairing verifier alone. `Verify(pk_rogue, msg, sig)` requires `sig` to be produced under `sk_rogue`, which the attacker doesn't have — regardless of what's in `msg`.

Same-message aggregation (the classical attack that pubkey-in-message defends against): requires two distinct honest signers to sign the same root. Protocol-enforced uniqueness of candidate identifiers (generateCandidateID + ContainsName/Operator/Owner checks at register time) rules that out — no two honest delegates ever produce PoPs over the same root.

Cross-candidate replay: still blocked by candidateID binding.

Cross-domain replay: still blocked by blsPopDomain.

`SignBLSPop` no longer derives `pk` from `sk` since the signing root doesn't need it. PoP bytes are still 96 B G2-compressed; DST unchanged.

Tests updated: `TestBLSPop_RogueKeyAttackBlocked` + `TestBLSPop_RejectNilCandidateID` use the new `BLSPopSigningRoot(candidateID)` signature.

CoderZhi · 2026-06-15T06:56:07Z

+// the candidate identifier); at update time pass the candidate's
+// existing identifier (c.GetIdentifier()).
+func SignBLSPop(sk *crypto.BLS12381PrivateKey, candidateID address.Address) ([]byte, error) {
+	if sk == nil {


or return error if candidateID == nil

CoderZhi · 2026-06-15T07:04:58Z

+		// which case the check rejects on any match. Used to enforce one
+		// BLS pubkey per delegate, a hard requirement for IIP-52's
+		// FastAggregateVerify quorum-counting model.
+		ContainsBLSPubKey(blsPubKey []byte, except address.Address) bool


BLSPubKeyOwner to return the owner address.

PoP coverage in the Go action layer (envestcc/iotex-core#4854) wasn't visible from the web3 path: candidateRegisterWithBLS / candidateUpdateWithBLS ABI entries have no blsPop slot, so any tx routed through eth_sendRawTransaction and decoded via NewCandidateRegister/UpdateFromABIBinary silently dropped the field. Post-fork that would have produced txs the handler always rejects. Adds two new V2 ABI methods rather than mutating the existing selectors: - candidateRegisterWithBLSAndPoP (8 fields + blsPop + data) - candidateUpdateWithBLSAndPoP (4 fields + blsPubKey + blsPop) Why V2 instead of extending the existing methods: changing a parameter list changes the 4-byte function selector, breaking any tooling that hardcoded the old ABI. The legacy WithBLS entries stay working pre-fork (their handler path is unchanged), and post-fork they reject naturally for lacking PoP — coexistence with no selector churn. EthData routing picks the entry by data carried on the action: - WithBLS && len(blsPop) > 0 → V2 selector, calldata includes PoP - WithBLS → legacy selector, no PoP slot - otherwise → non-BLS legacy candidateRegister FromABIBinary recognises the V2 selector and decodes blsPop. Tests: extended TestCandidateRegisterABIEncodeAndDecode with a "with public key and PoP" subcase, and added TestCandidateUpdate / "ABI encode with PoP" — both confirm the codec round-trips a non-empty PoP through Pack + Unpack, locking in the property whose absence was the bug. Out of scope for this commit: e2e coverage that exercises the web3 path through to a post-fork handler (proves the new selector actually flows blsPop into Candidate.BLSPubKey assignment). Will follow once the EnforceBLSPoP gate gets switched on in an integration scenario. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-16T01:14:19Z

Pushed `ab96bbc4b` — closes the web3-path gap you flagged.

What was broken

The PoP changes earlier in this PR live in the Go action layer + proto + handler. The ABI layer (which web3 callers actually use via `eth_sendRawTransaction`) had no `blsPop` slot:

`native_staking_contract_abi.json` defined `candidateRegisterWithBLS` with 8 fields (no PoP) and `candidateUpdateWithBLS` with 4 fields (no PoP).
`CandidateRegister.EthData()` / `CandidateUpdate.EthData()` packed only those 8 / 4 fields — `cr.blsPop` was never encoded into calldata.
`NewCandidateRegister/UpdateFromABIBinary` unpacked only those 8 / 4 fields — even if a client somehow embedded PoP bytes, the decoder would have ignored them.
e2etest `SignedCandidateRegisterWithBLS` / `SignedCandidateUpdateWithBLS` were called with `blsPop = nil` everywhere, so the missing codec coverage went undetected.

Post-fork impact would have been: every web3-routed register/update with BLS — wallets, SDKs, `ioctl` flows going through eth-tx — gets rejected by the handler for missing PoP, with no way to provide it from the client side.

What changed

Added two new V2 ABI entries (your option B):

Method	Inputs
`candidateRegisterWithBLSAndPoP`	name, operatorAddress, rewardAddress, ownerAddress, duration, autoStake, blsPubKey, blsPop, data
`candidateUpdateWithBLSAndPoP`	name, operatorAddress, rewardAddress, blsPubKey, blsPop

Coexistence model:

Legacy `candidateRegisterWithBLS` / `candidateUpdateWithBLS` keep their selector IDs. Tooling that hardcodes them keeps working pre-fork.
Post-fork the legacy entries are still accepted at ABI decode time, but the handler rejects them for lacking PoP — clients must switch to the V2 selector to register.

`EthData()` routes by content:

`WithBLS() && len(blsPop) > 0` → V2 selector
`WithBLS()` → legacy selector (pre-fork OK, post-fork rejected at handler)
otherwise → non-BLS `candidateRegister`

`FromABIBinary` adds the V2 selector recognition + decodes `blsPop`.

Tests

`TestCandidateRegisterABIEncodeAndDecode` gets a new "with public key and PoP" subcase that round-trips a non-empty PoP through Pack/Unpack via the V2 method.
`TestCandidateUpdate` gets "ABI encode with PoP" doing the same on the update side.

Both fail without the EthData/FromABIBinary changes — they're regression guards for exactly the bug you spotted.

Still out of scope

End-to-end web3 → handler test (proves the V2 selector flows blsPop all the way into `Candidate.BLSPubKey` post-fork, and that legacy selector + post-fork rejects properly). Will follow once we wire up the EnforceBLSPoP gate in a proper integration scenario — that's the same fork-activation work as the `ioctl` PoP-generation flag.

…ning The previous commit added candidateRegisterWithBLSAndPoP / candidateUpdateWithBLSAndPoP entries to native_staking_contract_abi.json but left native_staking_contract_interface.sol — the human-readable source-of-truth that clients / docs consume — out of sync. Adds the matching Solidity declarations and a header comment in both the .sol file and native_staking_contract_abi.go pointing out the non-obvious sharp edge: there is no Makefile target that regenerates the JSON from the .sol, so the two files must be kept in sync by hand on every ABI change. Out-of-sync edits silently misencode the web3 path — clients embed one function signature while the node decodes another. No runtime behaviour change; the JSON has been authoritative all along. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-16T01:55:04Z

Good catch — pushed `98bdaa969` mirroring the V2 PoP methods in `native_staking_contract_interface.sol` and adding a sync-warning header to both files.

Findings

`action/native_staking_contract_interface.sol` is the human-readable source-of-truth (clients / docs consume it)
`action/native_staking_contract_abi.json` is what iotex-core actually loads via `NativeStakingContractABI()` (`go:embed` + parse at init)
Makefile has no target regenerating one from the other — confirmed by grepping for `solc` / `abigen` / staking-related targets. The two are kept in sync by hand.

What's in the commit

`candidateRegisterWithBLSAndPoP` and `candidateUpdateWithBLSAndPoP` Solidity declarations added to the .sol interface, mirroring the JSON I added in `ab96bbc4b`.
Header comment at the top of `native_staking_contract_interface.sol` calling out the no-auto-gen invariant.
Header comment in `native_staking_contract_abi.go` (above the `//go:embed` directive) saying the same thing — so anyone touching the JSON via Go-tooling code sees it.

The warning text is the same in both: "Adding a new method, parameter, or event requires editing BOTH files by hand, otherwise the web3 path silently misencodes — clients embed one signature while the node decodes another. There is no Makefile target that catches the drift."

On the lack of a Makefile target

It would be cleaner to wire one up (`solc --abi` → generated JSON, fail CI on drift). Out of scope for this PR — would need a solc version pin, an addition to the dev-deps target, and a CI step. Worth filing as a follow-up if the team wants. For now the comment is the cheapest guard.

Addresses two review threads on PR iotexproject#4854: ## Comments 1 + 5: drop the `except` parameter Both envestcc and CoderZhi flagged that ContainsBLSPubKey(pubkey, except) bool pushed the "is this me?" decision into a generic helper that doesn't belong there. Replaces ContainsBLSPubKey(pubkey, except) bool with the broader GetByBLSPubKey(pubkey) *Candidate: callers receive the (possibly nil) holder and compare identifiers themselves. The three register / update handler call sites now read if holder := csm.GetByBLSPubKey(act.BLSPubKey()); holder != nil && holder.GetIdentifier().String() != c.GetIdentifier().String() { return ErrCandidateConflict } which is more explicit than the prior `except`-hiding API. It also unifies with the same GetByBLSPubKey method added in iotexproject#4857 (Y4a) so the two PRs don't introduce parallel BLS-pubkey-lookup APIs. ## Comments 2 + 4: strict nil-candidateID contract Both BLSPopSigningRoot and the surrounding Sign / Verify helpers used to silently accept a nil candidateID by skipping the candidate-binding write. That degrades the scheme to domain+pubkey-only — exactly the shape an attacker reaching for a cross-candidate replay would hope for. The three entry points now refuse: - BLSPopSigningRoot returns nil - SignBLSPop returns error("nil candidate ID; PoP must bind to a candidate identity") - VerifyBLSPop rejects before any cryptographic work TestBLSPop_RejectNilCandidateID locks the contract in place. ## Not in this commit Comment 3 ("blsPubKey can be removed from the signing root") — deferred. Will reply on the thread; the short answer is that the IRTF BLS draft defines canonical PoP as sign(sk, pk) and dropping blsPubKey from the digest opens up same-message aggregation when owner-uniqueness is ever relaxed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-16T02:28:57Z

Pushed `9229438ac` addressing 4 of the 5 review comments. The 5th (Comment 3 about `blsPubKey` in the signing root) I'm leaving alone and will reply on the thread separately.

What changed

Comments 1 + 5 → `GetByBLSPubKey` replaces `ContainsBLSPubKey`

`ContainsBLSPubKey(pubkey, except) bool` hid the "is this me?" decision in a helper. Replaced with `GetByBLSPubKey(pubkey) *Candidate` — callers get the (possibly nil) holder and compare identifiers themselves. The three register / update handler call sites now read:

```go
if holder := csm.GetByBLSPubKey(act.BLSPubKey()); holder != nil &&
holder.GetIdentifier().String() != c.GetIdentifier().String() {
return ErrCandidateConflict
}
```

Two side benefits:

Unifies with the `GetByBLSPubKey` method added in feat(blockchain): BlockCtx.ProducerPubKey + GetByBLSPubKey lookup (Y4a foundation) #4857 (Y4a) so the two PRs don't introduce parallel BLS-pubkey-lookup APIs.
Returns the full `*Candidate` so future consumers can read Owner / Operator / Reward without another API extension.

Comments 2 + 4 → strict nil-candidateID

`BLSPopSigningRoot` / `SignBLSPop` / `VerifyBLSPop` used to silently accept nil candidateID by skipping the candidate-binding write — which degrades the scheme to domain+pubkey-only and is exactly the shape an attacker reaching for cross-candidate replay would hope for. The three entry points now refuse:

`BLSPopSigningRoot` returns nil
`SignBLSPop` returns `error("nil candidate ID; PoP must bind to a candidate identity")`
`VerifyBLSPop` rejects before any cryptographic work

`TestBLSPop_RejectNilCandidateID` locks the contract.

Test plan

`TestBLSPop_RejectNilCandidateID` covers all three entry points
`TestCandidateCenter_GetByBLSPubKey` covers the new lookup (nil / empty / unregistered / hit)
`go test ./action/protocol/staking/ ./action/` — 476 passing
`go build ./...` clean

Addresses Comment 3 on PR iotexproject#4854 (CoderZhi). The previous signing root included blsPubKey: H(blsPopDomain || blsPubKey || candidateID) per the IRTF BLS canonical PoP form. After discussion, switching to: H(blsPopDomain || candidateID) The pairing verifier Verify(PK, msg, sig) already commits PK into the signature equation, so the basic rogue-key registration ("register pk_rogue without owning its discrete log") is blocked by basic PoP correctness without needing pubkey-in-message. Same-message aggregation — the classical attack that pubkey-in-message defends against — requires two distinct honest signers to sign the same signing root; the protocol-enforced uniqueness of candidate identifiers rules that out. Cross-candidate replay is still blocked by the candidateID binding, and cross-domain replay by blsPopDomain. The simpler signing root keeps the on-the-wire calldata independent of pubkey-length/encoding choices and removes an apparent redundancy between the signed message and the verifier's pk argument. SignBLSPop no longer derives pk from sk; the only caller-provided binding is candidateID. The bls_pop bytes themselves are unchanged in length and DST. Updated TestBLSPop_RogueKeyAttackBlocked and TestBLSPop_RejectNilCandidateID to use the new BLSPopSigningRoot(candidateID) signature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-pop Wires the three-option model agreed on PR iotexproject#4854: how does a delegate actually supply the BLS proof-of-possession the post-fork handler now requires? ## stake2 register / update BREAKING positional change: BLS_PUBKEY moves out of the positional arg list. Pre-fork callers passing it positionally must migrate; the post-fork handler requires PoP anyway so this command was always going to need re-tooling. Three-option matrix lives in blspop_helper.go and applies to both register (mandatory BLS) and update (optional BLS, default = don't touch): Option 1 — auto-derive from signer (default for register): No --bls-* flags. The tool decrypts the signer keystore, derives a BLS key from the ECDSA scalar (same algorithm as `ioctl account blskey` — crypto.GenerateBLS12381PrivateKey(ecdsa_sk.Bytes())), computes the PoP, and prompts the user to confirm. The prompt is explicit about ECDSA↔BLS coupling so delegates aren't surprised when an ECDSA leak also exposes their producer identity. --yes suppresses for CI. Option 2 — --bls-priv-key HEX: Standalone BLS key supplied directly. Tool derives the pubkey and signs the PoP. No prompt. Option 3 — --bls-pubkey HEX + --bls-pop HEX: User supplies both. Tool VerifyBLSPop's locally against the candidateID before submission so a malformed PoP fails fast without burning gas. For register the candidateID is the owner address from positional args; for update it must be passed via --candidate-id. Option 0 (update only): no BLS flags at all → BLS untouched. Going from "leave BLS alone" to "rotate to derived key" requires explicit --bls-from-signer so a name/operator update can never silently rotate the producer key. Mutual-exclusion + completeness rules are centralised in blsPoPFlags.classifyForRegister / classifyForUpdate and locked in by TestBlsPoPFlags_ClassifyForRegister / ClassifyForUpdate. --bls-keystore is a placeholder; the iotex BLS keystore format isn't specified yet. Tool returns an explicit error pointing at --bls-priv-key. ## account bls-sign-pop New offline PoP signer for air-gapped workflows. Accepts either --bls-priv-key HEX or --signer ADDRESS (ECDSA keystore → derive), plus --candidate-id ADDRESS. Writes the 96-byte PoP hex to stdout (informational fields go to stderr so shell redirection captures only the PoP). The delegate signs on the cold machine, transfers the hex over USB / QR, and runs `ioctl stake2 register --bls-pubkey ... --bls-pop ...` on a hot machine — the BLS private key never leaves cold storage. Lives in the account package next to the existing `account blskey` so BLS key operations are co-located rather than scattered. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-18T01:51:33Z

Pushed `b80f0f316` — ioctl UX for the BLS PoP path.

What lands

`ioctl stake2 register`

Three-option matrix:

Option 1 (default) — no `--bls-*` flags: tool derives BLS from the signer's ECDSA key, computes PoP, prompts the user to confirm the derived pubkey + the ECDSA↔BLS coupling warning. `--yes` for CI.
Option 2 — `--bls-priv-key HEX`: standalone BLS key.
Option 3 — `--bls-pubkey HEX --bls-pop HEX`: explicit, local `VerifyBLSPop` before submission.

BREAKING: `BLS_PUBKEY` moves out of positional args (was `args[4]`) into flags. Pre-fork pre-existing scripts need updating, but post-fork they need re-tooling anyway since they must supply PoP.

`ioctl stake2 update`

Same three options PLUS Option 0:

Option 0 — no BLS flags: BLS untouched (the common case when changing only name / operator / reward).
Option 1 for update requires explicit `--bls-from-signer` opt-in so a name-only update never silently rotates the producer key.
Update path needs `--candidate-id` for any non-Option-0 mode (RPC fallback is a future improvement).

`ioctl account bls-sign-pop` (new)

Offline PoP generator. Takes `--bls-priv-key HEX` OR `--signer ADDRESS` (the latter decrypts the ECDSA keystore + derives BLS), plus `--candidate-id ADDRESS`. Writes 96-byte PoP hex to stdout (informational fields to stderr so `> pop.hex` captures just the PoP). Air-gap flow:

```bash

cold machine

ioctl account bls-sign-pop --candidate-id io1owner... --bls-priv-key 0x... > pop.hex

USB / QR to hot machine

ioctl stake2 register mycand io1op io1reward io1owner 10000 350 \
--bls-pubkey 0x... --bls-pop $(cat pop.hex) \
--signer io1owner --auto-stake
```

Why hot/cold separation matters here

BLS sk derived from ECDSA sk means a single compromise blast both. The `--yes`-able prompt makes the coupling explicit so delegates choosing the easy path know what they signed up for; security-conscious delegates use Option 2 / 3 with separately-managed BLS material.

Tests

`TestBlsPoPFlags_ClassifyForRegister` — 6 cases: auto-derive default, explicit key, explicit PoP, partial input rejection, mutual-exclusion rejection
`TestBlsPoPFlags_ClassifyForUpdate` — 8 cases: Option 0 + each of 1/2/3 + reject branches including `--bls-from-signer + --bls-priv-key` mutual exclusion
`TestLoadBLSPrivKey` — hex parse: good / 0x-prefixed / empty / non-hex / wrong-length

Deferred

`--bls-keystore PATH` — placeholder, errors out pointing at `--bls-priv-key`. iotex doesn't have a BLS keystore format yet; separate RFC.
RPC fallback to resolve `--candidate-id` from the signer on update. Requires a gRPC roundtrip; for v1 the user supplies it explicitly.
e2e test exercising the whole chain (decrypt → derive → PoP → V2 ABI → handler accept) — depends on `EnforceBLSPoP` activation fixture.

Out of scope (separate PR territory)

Migration command for existing pre-PoP BLS registrations (just re-run `stake2 update --bls-from-signer` — no new command needed).
JS / Go SDK exposure of the same helpers — that lands in the respective SDK repos.

🤖 Generated with Claude Code

PR iotexproject#4854 had no handler-level integration coverage of the EnforceBLSPoP gate — only the underlying SignBLSPop / VerifyBLSPop unit tests in bls_pop_test.go — and no e2etest of the post-fork register path at all. Reviewers couldn't confirm the three handler sites actually fail-closed under the gate without running the full chain manually. ## Handler integration (action/protocol/staking/handlers_blspop_test.go) Three new test functions, 11 subcases: - TestHandleCandidateRegister_PoPGate - gate on + valid PoP → ReceiptStatus_Success, c.BLSPubKey persisted - gate on + empty PoP → ErrUnauthorizedOperator - gate on + PoP signed under wrong key → ErrUnauthorizedOperator - gate off + empty PoP → Success (pre-fork behaviour preserved) - TestHandleCandidateRegister_BLSPubKeyUniqueness - second candidate registering the same BLS pubkey with a valid PoP under its own owner → ErrCandidateConflict; the GetByBLSPubKey uniqueness check protects IIP-52's quorum-counting invariant against the "shared keypair" silent break - TestHandleCandidateUpdate_PoPGate - gate on + valid PoP under c.GetIdentifier() → rotation succeeds, new BLSPubKey in state - gate on + empty PoP → ErrUnauthorizedOperator - gate on + PoP signed under wrong candidateID → ErrUnauthorizedOperator - gate off + empty PoP → rotation succeeds (pre-fork compat) The fixture flips EnforceBLSPoP via genesis.ToBeEnabledBlockHeight (0 / math.MaxUint64) and forces XinguBlockHeight = 0 so the BLS registration codepath is reachable at the test block. ## e2etest (e2etest/native_staking_test.go) TestCandidateBLSPoP exercises the full chain pipeline — encode → submit → mint → handler → receipt → state — with four post-fork subcases plus a pre-fork backward-compat anchor: 1. Pre-fork register without PoP succeeds 2. Post-fork register with valid PoP succeeds, BlsPubKey persists 3. Post-fork register without PoP returns ErrUnauthorizedOperator 4. Post-fork update rotates BLS pubkey with valid PoP 5. Post-fork update without PoP returns ErrUnauthorizedOperator and leaves the previously rotated pubkey untouched in state Genesis wires ToBeEnabledBlockHeight == XinguBlockHeight so the BLS register path and the PoP gate activate together — the same shape the planned fork rollout will use. Pre-existing TestProtocol_FetchBucketAndValidate flake (master and this branch both flap ~1/3) is unrelated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

envestcc · 2026-06-18T02:25:30Z

Pushed `143010dfe` — handler integration + e2etest coverage for the PoP gate.

Handler integration (`handlers_blspop_test.go`, new)

3 functions, 11 subcases covering EnforceBLSPoP at all three handler sites:

TestHandleCandidateRegister_PoPGate — gate on/off × {valid PoP, empty PoP, PoP signed by wrong key}. The "wrong key" subcase exercises the rogue-key blocking via the pairing verifier, separate from the unit-test-level proof in `bls_pop_test.go`.

TestHandleCandidateRegister_BLSPubKeyUniqueness — second candidate trying to register the same BLS pubkey with a fresh valid PoP under its own owner → ErrCandidateConflict. Locks the GetByBLSPubKey invariant in at the handler boundary.

TestHandleCandidateUpdate_PoPGate — rotation under gate on/off × {valid PoP, empty PoP, PoP for wrong candidateID}. The "wrong candidateID" subcase confirms the cross-candidate replay defence (the binding lives in BLSPopSigningRoot, not at the protocol layer above).

Fixture flips the gate via `genesis.ToBeEnabledBlockHeight` (0 vs `math.MaxUint64`) and forces XinguBlockHeight = 0 so the BLS register/update codepath is reachable.

e2etest (`native_staking_test.go`)

TestCandidateBLSPoP — full chain pipeline encode → submit → mint → handler → receipt → state, with 5 subcases:

Pre-fork register without PoP → Success (backward compat anchor)
Post-fork register with valid PoP → Success, `Candidate.BlsPubKey` persisted in state
Post-fork register without PoP → ErrUnauthorizedOperator
Post-fork update rotates BLS pubkey with valid PoP → `Candidate.BlsPubKey` reflects the new key
Post-fork update without PoP → ErrUnauthorizedOperator, previous BLS pubkey untouched in state

Genesis wires `ToBeEnabledBlockHeight == XinguBlockHeight` so the BLS register path and the PoP gate activate together — same shape as the planned fork rollout. Each test asserts both the receipt status AND the resulting candidate state, so a silent state mutation under a failure receipt wouldn't slip past.

Pre-existing flake

`TestProtocol_FetchBucketAndValidate/validate_owner_and_selfstake` flakes about 1 in 3 runs on this branch — and confirmed earlier it also flakes on master. Not introduced by this PR.

Test plan

`go test -count=1 -run "TestHandleCandidateRegister_PoPGate|TestHandleCandidateRegister_BLSPubKeyUniqueness|TestHandleCandidateUpdate_PoPGate" ./action/protocol/staking/` — 11/11 passing
`go test -count=1 -run "TestCandidateBLSPoP" ./e2etest/` — 5/5 passing
`go test -count=1 ./action/protocol/staking/ ./action/ ./ioctl/cmd/action/` — clean except for the pre-existing flake noted above
`go build ./...` clean

sonarqubecloud · 2026-06-18T02:26:55Z

Quality Gate passed

Issues
10 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
2.6% Duplication on New Code

See analysis details on SonarQube Cloud

envestcc and others added 2 commits June 11, 2026 10:14

This was referenced Jun 11, 2026

feat(consensus): roundCtx.IsProducer matches delegates by BLS pubkey #4855

Draft

feat(block): HeaderSigner interface — decouple block signing from scheme #4856

Draft

envestcc added this to the v2.5 milestone Jun 15, 2026

envestcc commented Jun 15, 2026

View reviewed changes

CoderZhi reviewed Jun 15, 2026

View reviewed changes

envestcc marked this pull request as ready for review June 18, 2026 01:03

envestcc requested a review from a team as a code owner June 18, 2026 01:03

Conversation

envestcc commented Jun 11, 2026

Why land BEFORE IIP-52

The bug

Attack (becomes exploitable when IIP-52 activates)

The fix (Eth2 `process_deposit` pattern)

Implementation

Migration (out of scope for this PR)

Out of scope for this PR (follow-ups)

Test plan

Uh oh!

envestcc commented Jun 11, 2026

Why this belongs in the same PR

Implementation

Tests

Uh oh!

envestcc commented Jun 11, 2026

Motivation

Why this isn't a big change

Implementation

Test plan

Uh oh!

envestcc Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

CoderZhi Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

CoderZhi Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

envestcc Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

CoderZhi Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

CoderZhi Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

envestcc commented Jun 16, 2026

What was broken

What changed

Tests

Still out of scope

Uh oh!

envestcc commented Jun 16, 2026

Findings

What's in the commit

On the lack of a Makefile target

Uh oh!

envestcc commented Jun 16, 2026

What changed

Comments 1 + 5 → `GetByBLSPubKey` replaces `ContainsBLSPubKey`

Comments 2 + 4 → strict nil-candidateID

Test plan

Uh oh!

envestcc commented Jun 18, 2026

What lands

`ioctl stake2 register`

`ioctl stake2 update`

`ioctl account bls-sign-pop` (new)

cold machine

USB / QR to hot machine

Why hot/cold separation matters here

Tests

Deferred

Out of scope (separate PR territory)

Uh oh!

envestcc commented Jun 18, 2026

Handler integration (`handlers_blspop_test.go`, new)

e2etest (`native_staking_test.go`)

Pre-existing flake

Test plan

Uh oh!

sonarqubecloud Bot commented Jun 18, 2026

Quality Gate passed

Uh oh!

Reviewers

Assignees