fix(staking): BLS proof-of-possession blocks rogue-key aggregate forgery (CRITICAL)#4853
Closed
envestcc wants to merge 9 commits into
Closed
fix(staking): BLS proof-of-possession blocks rogue-key aggregate forgery (CRITICAL)#4853envestcc wants to merge 9 commits into
envestcc wants to merge 9 commits into
Conversation
…te (IIP-52) Scaffolding for the BLS signature aggregation work tracked in IIP-52. No behavior change yet: EnableBLSAggregation is gated on IsToBeEnabled, and the BLS keys plumbed into rollDPoSCtx are not yet used to sign or verify endorsements. - blockchain/config.go: add Chain.BLSProducerPrivKey (comma-separated hex) and BLSProducerPrivateKeys(). Empty value falls back to deriving each BLS key from the corresponding ECDSA producer key via crypto.GenerateBLS12381PrivateKey. - consensus/scheme/rolldpos: Builder.SetBLSPriKey; NewRollDPoSCtx accepts []*crypto.BLS12381PrivateKey aligned 1:1 with producer ECDSA keys; rollDPoSCtx stores them on blsPriKeys for the upcoming signing path. - consensus/consensus.go: wire SetBLSPriKey(cfg.Chain.BLSProducerPrivateKeys()). - action/protocol/context.go: FeatureCtx.EnableBLSAggregation gated on g.IsToBeEnabled(height); flips to a named hardfork height later. - go.mod: bump iotex-proto to envestcc/iotex-proto bls-aggregate (52e72a6) for the BlockFooter aggregated_signature / signer_bitmap fields and the BLSEndorsement message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches consensus vote signing to BLS12-381 post-fork, reusing the existing Endorsement type and dispatching on signature length (65B secp256k1 vs 96B BLS). Receiver verification and endorsement-manager quorum integration land in a follow-up PR; with the feature gate parked at IsToBeEnabled this commit is dead code in production. - endorsement.EndorseBLS / VerifyBLSEndorsement: thin helpers that produce / verify a regular *Endorsement whose signature field carries a BLS sig. Endorser remains the delegate's secp256k1 producer key so the existing Endorser().Address() path still resolves the iotex address; receivers look up the BLS verifying key from candidate state by that address. - ConsensusConfig.BLSAggregationEnabled(height): feature gate wired off Genesis.ToBeEnabledBlockHeight. Will be re-pointed at a named hardfork height once the full Phase-2 stack lands. - rollDPoSCtx.newEndorsement / endorseBlockProposal: post-fork, sign PROPOSAL, LOCK and COMMIT votes plus the proposer's wrapping endorsement with BLS (skipping delegates without a configured BLS key). Block header signing remains on the ECDSA producer key — that signature ties the block to chain identity and is unrelated to the consensus vote layer. - The proposer's producerKey (ECDSA + BLS + address) is threaded through Proposal / mintNewBlock / endorseBlockProposal so the branch can pick the right key without a separate lookup. - iotex-proto bump to envestcc/iotex-proto@e4439ef (PR iotexproject#169): clarifies Endorsement.signature semantics (pre-fork 65B secp256k1, post-fork 96B BLS, distinguished by length). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stacks on top of the BLS sender PR. Wires up the receiver side so BLS- signed consensus endorsements are accepted, verified against pubkeys resolved from candidate state, and counted toward quorum. With the feature gate parked at IsToBeEnabled this is dead code in production; intended for local iteration until the sender PRs land. - BLSPubKeysByEpochFunc callback type + Builder.SetBLSPubKeysByEpochFunc wired through to NewRollDPoSCtx. - consensus.go provides the implementation: reads the epoch's delegate list from candidate state and extracts each candidate's registered BLSPubKey, returning a map keyed by operator iotex address. - roundCalculator caches the BLS pubkey index per round, decoded as *crypto.BLS12381PublicKey. UpdateRound carries it across height transitions inside an epoch and re-fetches on epoch boundaries. - roundCtx.BLSPubKey(addr) accessor; roundCtx.verifyEndorsement(doc, en) dispatches on signature length (65B secp256k1 vs 96B BLS). - rollDPoSCtx.VerifyEndorsement(height, doc, en) is the public entry point; same length-aware dispatch plus pre/post-fork gating — pre-fork rejects 96B sigs, post-fork rejects 65B sigs. - HandleConsensusMsg replaces the unconditional ECDSA verify with the new VerifyEndorsement; CheckBlockProposer's proofOfLock replay path flows through round.AddVoteEndorsement which now dispatches on signature length internally, so BLS endorsements in proof-of-lock are verified transparently.
- endorsement/bls_endorsement_test.go: round-trip EndorseBLS / VerifyBLSEndorsement, plus negative cases (wrong pubkey, tampered document, tampered signature, nil inputs). Uses deterministic in-test keys (no identityset dep from this package). - consensus/scheme/rolldpos/bls_verify_test.go: covers the two-layer dispatch — roundCtx.verifyEndorsement (signature-length branch, BLS pubkey lookup miss, mismatched pubkey) and rollDPoSCtx.VerifyEndorsement (length gating with the BLS aggregation feature flag in both directions). Plus sender tests that newEndorsement emits the expected signature scheme based on the feature gate. 11 new tests; everything in the affected packages still passes (51 tests total across endorsement/ and consensus/scheme/rolldpos/).
- Bundle delegate address with BLS pubkey into a single 'delegate' struct; roundCtx.delegates becomes []delegate, dropping the parallel blsPubKeys map. roundCalc.delegatesAt replaces blsPubKeysFor and merges both callbacks into one slice — single source of truth for the address/pubkey alignment. - rollDPoSCtx.VerifyEndorsement takes a single *EndorsedConsensusMessage instead of (height, doc, en); the message already carries all three. - Shrink VerifyEndorsement lock scope: snapshot round + feature flag under RLock, release before doing the (potentially slow) signature verification. Safe because *roundCtx is replaced, never mutated in place. Test helpers + the two consumers (HandleConsensusMsg, roundCtx_test) updated. All targeted tests pass.
Per follow-up review on PR iotexproject#4843: remove the separate BLSPubKeysByEpochFunc callback. NodesSelectionByEpochFunc now returns []*Delegate (exported), where each Delegate pairs the operator address with its decoded BLS12-381 public key. consensus.go builds these from candidate state in one pass; roundCalculator stores them directly and no longer needs a parallel lookup/merge step. - Export delegate -> Delegate{Address, BLSPubKey}. - NodesSelectionByEpochFunc: ([]string) -> ([]*Delegate). - Drop BLSPubKeysByEpochFunc type, Builder.SetBLSPubKeysByEpochFunc, the NewRollDPoSCtx param, and roundCalculator.delegatesAt / blsPubKeysFor. - roundCalculator.Delegates returns []*Delegate; Proposers extracts addresses; IsDelegate scans by address. - consensus.go decodes each candidate's BLS pubkey once per epoch in delegatesByEpochFunc; proposersByEpochFunc reuses it. Net -68 lines. Build + vet clean; targeted tests pass.
Per PR iotexproject#4843 review: UpdateRound was reaching into round.delegates directly because Delegates() returned []string while the field is []*Delegate. Make the accessor return []*Delegate so UpdateRound (and any future caller) can use round.Delegates() consistently. - roundCtx.Delegates() now returns []*Delegate. - endorsementManager.Log's (unused) delegates param retyped to []*Delegate. - The one genuine []string consumer (ConsensusMetrics.LatestDelegates, an external metrics field) extracts addresses at the call site.
Phase 2 deliverable for IIP-52: proposer aggregates the per-block COMMIT BLS signatures into a single 96-byte sig + a signer bitmap, stored in BlockFooter.aggregated_signature and BlockFooter.signer_bitmap. Verifiers reconstruct the signer set from the bitmap, look up each BLS pubkey from the round's delegate index, and FastAggregateVerify the aggregate against the shared COMMIT-vote hash. - blockchain/block/footer.go: new fields aggregatedSignature, signerBitmap; proto round-trip; IsAggregated / AggregatedSignature / SignerBitmap accessors. - blockchain/block/block.go: new Block.FinalizeWithAggregate; the one-shot contract is preserved via a commitTime witness so either path errors on second call. - consensus/scheme/rolldpos/aggregate.go: aggregateCommitEndorsements builds the aggregate sig + bitmap from a slice of BLS COMMIT endorsements indexed against the round's delegates; bitmapSigners is the inverse for the verifier. - consensus/scheme/rolldpos/rolldposctx.go: at commit time, branch on BLSAggregationEnabled and call FinalizeWithAggregate post-fork. - consensus/scheme/rolldpos/rolldpos.go: ValidateBlockFooter routes aggregated footers through validateAggregatedFooter — bitmap → delegates → BLS pubkeys → BLSAggregateSignature.Verify, with a separate 2/3 majority check. - action/protocol/staking/protocol.go: ActiveCandidates filters out candidates without a registered BLS pubkey once aggregation is enabled, so the aggregate signer set is well-defined. - endorsement/endorsement.go: expose SigningHash so the verifier can reconstruct the COMMIT-vote hash from blk.CommitTime() outside an Endorsement struct. All signers sign the same hash (deterministic ts from round start + TTL sum), which is what FastAggregateVerify requires. 8 new unit tests cover the aggregate round-trip, partial signer sets, rejection of non-BLS endorsements / unknown endorsers, and bitmap edge cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pdate
CRITICAL security fix for the BLS rogue-key aggregate-forgery attack
against IIP-52's same-message FastAggregateVerify path.
## The bug
IIP-52 verifies the COMMIT-vote BLS aggregate signature against the sum
of registered candidate BLS public keys:
blstAggregateSignature.FastAggregateVerify(true, Σ pk_i, msg, dst)
with dst = "BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_". The _POP_
suffix names the IRTF "Proof of Possession" ciphersuite, which is only
sound when each registered pubkey has been accompanied at registration
time by a signature proving knowledge of the corresponding secret key.
Today registration validates only key format + subgroup membership:
crypto.BLS12381PublicKeyFromBytes(blsPubKey)
No PoP is required. A registered candidate can publish
pk_rogue = g^x − Σ(other delegates' pubkeys)
computed from public information, supply a single signature
σ = sign(x, msg) and a bitmap that claims every delegate signed, and
FastAggregateVerify collapses Σ pk_i + pk_rogue to g^x, accepting σ.
The bitmap satisfies isMajorityCount, so one delegate forges a 2/3+
quorum certificate and the consensus safety property is broken.
The window is latent: IIP-52's ToBeEnabledBlockHeight = math.MaxUint64
so this code path is not live on any chain yet. It must be fixed
before activation.
## The fix
- Add a 96-byte blsPop field to CandidateBasicInfo proto (iotex-proto
PR pushed; go.mod replace updated to envestcc/iotex-proto@7a486f6)
- New action/protocol/staking/bls_pop.go defines the PoP signing root:
BLSPopSigningRoot(blsPubKey, ownerAddress) =
SHA-256("IOTEX_BLS_POP_v1" || blsPubKey || ownerAddress.Bytes())
Binding all three values blocks the three replay variants:
(a) attacker without the secret cannot sign over the canonical
message at all — closes rogue key;
(b) PoP for owner A does not validate for owner B — closes
candidate-substitution replay;
(c) "IOTEX_BLS_POP_v1" domain prefix is disjoint from any other BLS
DST iotex uses, so a PoP cannot be replayed as a consensus vote
or vice versa, and a future fork can rotate via "v2".
- New FeatureCtx.EnforceBLSPoP shares IsToBeEnabled height with
EnableBLSAggregation today but is a semantically distinct switch so
the two can diverge if a hotfix or fork-height adjustment is needed.
- action.CandidateRegister and action.CandidateUpdate carry blsPop
through their constructors, Proto/LoadProto, and accessors.
NewCandidateRegisterWithBLS and NewCandidateUpdateWithBLS now accept
a blsPop parameter; pre-fork callers may pass nil.
- handleCandidateRegister, handleCandidateUpdate, and
handleCandidateUpdateByOperator call VerifyBLSPop before writing
Candidate.BLSPubKey when EnforceBLSPoP is active. Failure returns
ReceiptStatus_ErrUnauthorizedOperator.
- TestBLSPop_RogueKeyAttackBlocked is the regression guard: it models
the attacker trying to register a pubkey for which they do not have
the secret, demonstrates the only PoP they can produce (signed under
their own secret) does NOT validate against the target pubkey, and
confirms the control case (a legitimate registrant can produce a
valid PoP).
## Follow-ups (out of scope for this PR)
- ioctl/SDK wiring: stake2register.go and stake2update.go currently
pass nil for blsPop with TODO markers. These need a flag to ingest a
BLS private key and derive the PoP. Pre-fork the empty PoP is
accepted; post-fork the same call will fail with the new check, so
the tooling MUST be updated before the fork.
- Migration: existing pre-fork candidate records have no PoP. Before
the fork activates, all active candidates must submit a
candidateUpdate carrying a valid blsPop; the alternative is a
fork-block state migration that drops BLS pubkeys without
accompanying PoPs from ActiveCandidates. IIP draft will document
the chosen approach.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Member
Author
|
Closing in favor of a clean master-based PR — the BLS PoP fix doesn't depend on IIP-52 (the vulnerable registration path is already on master), so stacking it on the IIP-52 PR chain was unnecessary. New PR: #4854 |
Open
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



The bug
`consensus/scheme/rolldpos/rolldpos.go` `validateAggregatedFooter` verifies the IIP-52 COMMIT-vote BLS aggregate signature using:
```go
crypto.BLSAggregateSignatureFromBytes(sig).Verify(pubKeys, msg)
// → blst.FastAggregateVerify(true, pubKeys, msg, dst)
```
with DST = `BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_`. The `POP` suffix names the IRTF Proof-of-Possession ciphersuite, which is only sound if each registered BLS pubkey arrived with a possession proof at registration.
Today registration validates only key format and subgroup membership:
```go
// action/candidate_register.go, action/candidate_update.go
_, err := crypto.BLS12381PublicKeyFromBytes(blsPubKey)
```
No PoP is checked. The proto has no PoP field.
Attack
An attacker with a single active delegate slot:
One delegate forges a 2/3+ quorum certificate → finalizes arbitrary blocks → consensus safety break / chain split.
Reachability gate: requires the attacker's candidate to be in `round.delegates` (economic, not cryptographic) AND `EnableBLSAggregation` active.
The fix (Ethereum 2.0 style)
Eth2's mitigation for the same primitive is the `process_deposit` PoP: the validator's deposit signature signs a message that includes the pubkey. Adopting the equivalent pattern at `candidateRegister` / `candidateUpdate`:
```go
// action/protocol/staking/bls_pop.go
const blsPopDomain = "IOTEX_BLS_POP_v1"
func BLSPopSigningRoot(blsPubKey []byte, ownerAddress address.Address) []byte {
h := sha256.New()
h.Write([]byte(blsPopDomain))
h.Write(blsPubKey)
h.Write(ownerAddress.Bytes())
return h.Sum(nil)
}
```
Three values bind into the message — closes three replay variants:
Implementation
Proto (iotex-proto#173): `bytes blsPop = 5` on `CandidateBasicInfo`. go.mod replace updated to envestcc/iotex-proto@7a486f6.
FeatureCtx.EnforceBLSPoP: shares `IsToBeEnabled` height with `EnableBLSAggregation` today but is a semantically distinct switch — call sites read self-descriptively and the two can diverge if needed.
action.CandidateRegister / CandidateUpdate: carry `blsPop` through constructors, Proto/LoadProto, accessors. `NewCandidateRegisterWithBLS` and `NewCandidateUpdateWithBLS` now accept `blsPop`; pre-fork callers may pass nil.
handlers.go: `handleCandidateRegister`, `handleCandidateUpdate`, and `handleCandidateUpdateByOperator` call `VerifyBLSPop` before writing `Candidate.BLSPubKey` when `EnforceBLSPoP` is active. Failure → `ReceiptStatus_ErrUnauthorizedOperator`.
bls_pop_test.go: 5 tests including `TestBLSPop_RogueKeyAttackBlocked` as the regression guard — models the attacker registering a pubkey they don't own, demonstrates the only PoP they can produce (signed under their own secret) does NOT validate against the target pubkey, plus the control case confirming a legitimate registrant succeeds.
Out of scope for this PR (follow-ups)
ioctl / SDK wiring: `stake2register.go` and `stake2update.go` pass nil for `blsPop` with TODO markers. They need a flag to ingest a BLS private key and derive the PoP. Pre-fork the empty PoP is accepted; post-fork the same call will fail with the new check, so the tooling MUST be updated before the fork.
Migration: existing pre-fork candidate records have no PoP. Before the fork activates, all active candidates must submit a `candidateUpdate` carrying a valid `blsPop`; alternative is a fork-block state migration that drops un-attested BLS pubkeys from `ActiveCandidates`. IIP draft (separate doc) covers the chosen approach.
Receipt status code: reusing `ErrUnauthorizedOperator` (closest semantic match). Adding a dedicated `ErrInvalidBLSPoP` is another proto bump and not necessary for the fix.
Test plan
🤖 Generated with Claude Code