Skip to content

fix(staking): BLS proof-of-possession blocks rogue-key aggregate forgery (CRITICAL)#4853

Closed
envestcc wants to merge 9 commits into
iotexproject:masterfrom
envestcc:bls-rogue-key-fix
Closed

fix(staking): BLS proof-of-possession blocks rogue-key aggregate forgery (CRITICAL)#4853
envestcc wants to merge 9 commits into
iotexproject:masterfrom
envestcc:bls-rogue-key-fix

Conversation

@envestcc

Copy link
Copy Markdown
Member

Security severity: CRITICAL, latent. Stacks on iotex-proto#173 and the IIP-52 PR chain (#4841#4842#4843#4847). The diff includes the IIP-52 PRs' changes until they merge; review the last commit (`fix(staking): require BLS proof-of-possession at candidate register/update`).

The vulnerability is gated today by IIP-52's `ToBeEnabledBlockHeight = math.MaxUint64` — no live chain is at risk — but must be fixed before activation. IIP-52 PR stack should NOT be allowed to merge + activate without this PR also landing.

The bug

`consensus/scheme/rolldpos/rolldpos.go` `validateAggregatedFooter` verifies the IIP-52 COMMIT-vote BLS aggregate signature using:

```go
crypto.BLSAggregateSignatureFromBytes(sig).Verify(pubKeys, msg)
// → blst.FastAggregateVerify(true, pubKeys, msg, dst)
```

with DST = `BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_`. The `POP` suffix names the IRTF Proof-of-Possession ciphersuite, which is only sound if each registered BLS pubkey arrived with a possession proof at registration.

Today registration validates only key format and subgroup membership:

```go
// action/candidate_register.go, action/candidate_update.go
_, err := crypto.BLS12381PublicKeyFromBytes(blsPubKey)
```

No PoP is checked. The proto has no PoP field.

Attack

An attacker with a single active delegate slot:

  1. Reads other delegates' BLS pubkeys from chain
  2. Picks any secret `x` they control
  3. Computes `pk_rogue = g^x − Σ(other pubkeys)` in G1 (public-information-only)
  4. Registers `pk_rogue` via `candidateRegisterWithBLS` — current handler accepts (no PoP gate)
  5. For any block they want to finalize: submits a footer with bitmap set to all delegates and signature `σ = sign(x, msg)`
  6. `FastAggregateVerify(Σ pubKeys, σ, msg)` collapses `Σ pk_i + pk_rogue` to `g^x`. `σ` verifies. `isMajorityCount` passes.

One delegate forges a 2/3+ quorum certificate → finalizes arbitrary blocks → consensus safety break / chain split.

Reachability gate: requires the attacker's candidate to be in `round.delegates` (economic, not cryptographic) AND `EnableBLSAggregation` active.

The fix (Ethereum 2.0 style)

Eth2's mitigation for the same primitive is the `process_deposit` PoP: the validator's deposit signature signs a message that includes the pubkey. Adopting the equivalent pattern at `candidateRegister` / `candidateUpdate`:

```go
// action/protocol/staking/bls_pop.go
const blsPopDomain = "IOTEX_BLS_POP_v1"

func BLSPopSigningRoot(blsPubKey []byte, ownerAddress address.Address) []byte {
h := sha256.New()
h.Write([]byte(blsPopDomain))
h.Write(blsPubKey)
h.Write(ownerAddress.Bytes())
return h.Sum(nil)
}
```

Three values bind into the message — closes three replay variants:

Binding Closes
`blsPubKey` Rogue key — attacker without the secret can't sign over the canonical message at all
`ownerAddress` Candidate substitution — PoP for owner A doesn't validate for owner B
`IOTEX_BLS_POP_v1` prefix Cross-domain replay — PoP can't be reused as a consensus vote or vice versa; version suffix reserves room for a future fork to rotate

Implementation

  • Proto (iotex-proto#173): `bytes blsPop = 5` on `CandidateBasicInfo`. go.mod replace updated to envestcc/iotex-proto@7a486f6.

  • FeatureCtx.EnforceBLSPoP: shares `IsToBeEnabled` height with `EnableBLSAggregation` today but is a semantically distinct switch — call sites read self-descriptively and the two can diverge if needed.

  • action.CandidateRegister / CandidateUpdate: carry `blsPop` through constructors, Proto/LoadProto, accessors. `NewCandidateRegisterWithBLS` and `NewCandidateUpdateWithBLS` now accept `blsPop`; pre-fork callers may pass nil.

  • handlers.go: `handleCandidateRegister`, `handleCandidateUpdate`, and `handleCandidateUpdateByOperator` call `VerifyBLSPop` before writing `Candidate.BLSPubKey` when `EnforceBLSPoP` is active. Failure → `ReceiptStatus_ErrUnauthorizedOperator`.

  • bls_pop_test.go: 5 tests including `TestBLSPop_RogueKeyAttackBlocked` as the regression guard — models the attacker registering a pubkey they don't own, demonstrates the only PoP they can produce (signed under their own secret) does NOT validate against the target pubkey, plus the control case confirming a legitimate registrant succeeds.

Out of scope for this PR (follow-ups)

  • ioctl / SDK wiring: `stake2register.go` and `stake2update.go` pass nil for `blsPop` with TODO markers. They need a flag to ingest a BLS private key and derive the PoP. Pre-fork the empty PoP is accepted; post-fork the same call will fail with the new check, so the tooling MUST be updated before the fork.

  • Migration: existing pre-fork candidate records have no PoP. Before the fork activates, all active candidates must submit a `candidateUpdate` carrying a valid `blsPop`; alternative is a fork-block state migration that drops un-attested BLS pubkeys from `ActiveCandidates`. IIP draft (separate doc) covers the chosen approach.

  • Receipt status code: reusing `ErrUnauthorizedOperator` (closest semantic match). Adding a dedicated `ErrInvalidBLSPoP` is another proto bump and not necessary for the fix.

Test plan

  • Unit tests `TestBLSPop_*` — 5 cases, including the rogue-key regression guard
  • `go test ./action/protocol/staking/ ./action/` — 470 passing locally
  • `go build ./...` clean
  • CI green on the IIP-52 stack base (feat(consensus): BlockFooter BLS aggregation (IIP-52) #4847) + iotex-proto#173
  • Coordinate migration plan with delegates before flipping fork height

🤖 Generated with Claude Code

envestcc and others added 9 commits May 27, 2026 10:53
…te (IIP-52)

Scaffolding for the BLS signature aggregation work tracked in IIP-52. No
behavior change yet: EnableBLSAggregation is gated on IsToBeEnabled, and the
BLS keys plumbed into rollDPoSCtx are not yet used to sign or verify
endorsements.

- blockchain/config.go: add Chain.BLSProducerPrivKey (comma-separated hex)
  and BLSProducerPrivateKeys(). Empty value falls back to deriving each BLS
  key from the corresponding ECDSA producer key via
  crypto.GenerateBLS12381PrivateKey.
- consensus/scheme/rolldpos: Builder.SetBLSPriKey; NewRollDPoSCtx accepts
  []*crypto.BLS12381PrivateKey aligned 1:1 with producer ECDSA keys;
  rollDPoSCtx stores them on blsPriKeys for the upcoming signing path.
- consensus/consensus.go: wire SetBLSPriKey(cfg.Chain.BLSProducerPrivateKeys()).
- action/protocol/context.go: FeatureCtx.EnableBLSAggregation gated on
  g.IsToBeEnabled(height); flips to a named hardfork height later.
- go.mod: bump iotex-proto to envestcc/iotex-proto bls-aggregate (52e72a6)
  for the BlockFooter aggregated_signature / signer_bitmap fields and the
  BLSEndorsement message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches consensus vote signing to BLS12-381 post-fork, reusing the
existing Endorsement type and dispatching on signature length (65B
secp256k1 vs 96B BLS). Receiver verification and endorsement-manager
quorum integration land in a follow-up PR; with the feature gate parked
at IsToBeEnabled this commit is dead code in production.

- endorsement.EndorseBLS / VerifyBLSEndorsement: thin helpers that
  produce / verify a regular *Endorsement whose signature field carries
  a BLS sig. Endorser remains the delegate's secp256k1 producer key so
  the existing Endorser().Address() path still resolves the iotex
  address; receivers look up the BLS verifying key from candidate state
  by that address.
- ConsensusConfig.BLSAggregationEnabled(height): feature gate wired off
  Genesis.ToBeEnabledBlockHeight. Will be re-pointed at a named hardfork
  height once the full Phase-2 stack lands.
- rollDPoSCtx.newEndorsement / endorseBlockProposal: post-fork, sign
  PROPOSAL, LOCK and COMMIT votes plus the proposer's wrapping
  endorsement with BLS (skipping delegates without a configured BLS
  key). Block header signing remains on the ECDSA producer key — that
  signature ties the block to chain identity and is unrelated to the
  consensus vote layer.
- The proposer's producerKey (ECDSA + BLS + address) is threaded
  through Proposal / mintNewBlock / endorseBlockProposal so the branch
  can pick the right key without a separate lookup.
- iotex-proto bump to envestcc/iotex-proto@e4439ef (PR iotexproject#169): clarifies
  Endorsement.signature semantics (pre-fork 65B secp256k1, post-fork
  96B BLS, distinguished by length).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stacks on top of the BLS sender PR. Wires up the receiver side so BLS-
signed consensus endorsements are accepted, verified against pubkeys
resolved from candidate state, and counted toward quorum. With the
feature gate parked at IsToBeEnabled this is dead code in production;
intended for local iteration until the sender PRs land.

- BLSPubKeysByEpochFunc callback type + Builder.SetBLSPubKeysByEpochFunc
  wired through to NewRollDPoSCtx.
- consensus.go provides the implementation: reads the epoch's delegate
  list from candidate state and extracts each candidate's registered
  BLSPubKey, returning a map keyed by operator iotex address.
- roundCalculator caches the BLS pubkey index per round, decoded as
  *crypto.BLS12381PublicKey. UpdateRound carries it across height
  transitions inside an epoch and re-fetches on epoch boundaries.
- roundCtx.BLSPubKey(addr) accessor; roundCtx.verifyEndorsement(doc,
  en) dispatches on signature length (65B secp256k1 vs 96B BLS).
- rollDPoSCtx.VerifyEndorsement(height, doc, en) is the public entry
  point; same length-aware dispatch plus pre/post-fork gating —
  pre-fork rejects 96B sigs, post-fork rejects 65B sigs.
- HandleConsensusMsg replaces the unconditional ECDSA verify with the
  new VerifyEndorsement; CheckBlockProposer's proofOfLock replay path
  flows through round.AddVoteEndorsement which now dispatches on
  signature length internally, so BLS endorsements in proof-of-lock
  are verified transparently.
- endorsement/bls_endorsement_test.go: round-trip EndorseBLS /
  VerifyBLSEndorsement, plus negative cases (wrong pubkey, tampered
  document, tampered signature, nil inputs). Uses deterministic
  in-test keys (no identityset dep from this package).
- consensus/scheme/rolldpos/bls_verify_test.go: covers the two-layer
  dispatch — roundCtx.verifyEndorsement (signature-length branch, BLS
  pubkey lookup miss, mismatched pubkey) and rollDPoSCtx.VerifyEndorsement
  (length gating with the BLS aggregation feature flag in both
  directions). Plus sender tests that newEndorsement emits the
  expected signature scheme based on the feature gate.

11 new tests; everything in the affected packages still passes (51
tests total across endorsement/ and consensus/scheme/rolldpos/).
- Bundle delegate address with BLS pubkey into a single 'delegate'
  struct; roundCtx.delegates becomes []delegate, dropping the parallel
  blsPubKeys map. roundCalc.delegatesAt replaces blsPubKeysFor and
  merges both callbacks into one slice — single source of truth for
  the address/pubkey alignment.
- rollDPoSCtx.VerifyEndorsement takes a single *EndorsedConsensusMessage
  instead of (height, doc, en); the message already carries all three.
- Shrink VerifyEndorsement lock scope: snapshot round + feature flag
  under RLock, release before doing the (potentially slow) signature
  verification. Safe because *roundCtx is replaced, never mutated in
  place.

Test helpers + the two consumers (HandleConsensusMsg, roundCtx_test)
updated. All targeted tests pass.
Per follow-up review on PR iotexproject#4843: remove the separate
BLSPubKeysByEpochFunc callback. NodesSelectionByEpochFunc now returns
[]*Delegate (exported), where each Delegate pairs the operator address
with its decoded BLS12-381 public key. consensus.go builds these from
candidate state in one pass; roundCalculator stores them directly and
no longer needs a parallel lookup/merge step.

- Export delegate -> Delegate{Address, BLSPubKey}.
- NodesSelectionByEpochFunc: ([]string) -> ([]*Delegate).
- Drop BLSPubKeysByEpochFunc type, Builder.SetBLSPubKeysByEpochFunc,
  the NewRollDPoSCtx param, and roundCalculator.delegatesAt /
  blsPubKeysFor.
- roundCalculator.Delegates returns []*Delegate; Proposers extracts
  addresses; IsDelegate scans by address.
- consensus.go decodes each candidate's BLS pubkey once per epoch in
  delegatesByEpochFunc; proposersByEpochFunc reuses it.

Net -68 lines. Build + vet clean; targeted tests pass.
Per PR iotexproject#4843 review: UpdateRound was reaching into round.delegates
directly because Delegates() returned []string while the field is
[]*Delegate. Make the accessor return []*Delegate so UpdateRound (and
any future caller) can use round.Delegates() consistently.

- roundCtx.Delegates() now returns []*Delegate.
- endorsementManager.Log's (unused) delegates param retyped to
  []*Delegate.
- The one genuine []string consumer (ConsensusMetrics.LatestDelegates,
  an external metrics field) extracts addresses at the call site.
Phase 2 deliverable for IIP-52: proposer aggregates the per-block COMMIT
BLS signatures into a single 96-byte sig + a signer bitmap, stored in
BlockFooter.aggregated_signature and BlockFooter.signer_bitmap. Verifiers
reconstruct the signer set from the bitmap, look up each BLS pubkey from
the round's delegate index, and FastAggregateVerify the aggregate against
the shared COMMIT-vote hash.

- blockchain/block/footer.go: new fields aggregatedSignature, signerBitmap;
  proto round-trip; IsAggregated / AggregatedSignature / SignerBitmap
  accessors.
- blockchain/block/block.go: new Block.FinalizeWithAggregate; the one-shot
  contract is preserved via a commitTime witness so either path errors on
  second call.
- consensus/scheme/rolldpos/aggregate.go: aggregateCommitEndorsements
  builds the aggregate sig + bitmap from a slice of BLS COMMIT
  endorsements indexed against the round's delegates;
  bitmapSigners is the inverse for the verifier.
- consensus/scheme/rolldpos/rolldposctx.go: at commit time, branch on
  BLSAggregationEnabled and call FinalizeWithAggregate post-fork.
- consensus/scheme/rolldpos/rolldpos.go: ValidateBlockFooter routes
  aggregated footers through validateAggregatedFooter — bitmap →
  delegates → BLS pubkeys → BLSAggregateSignature.Verify, with a
  separate 2/3 majority check.
- action/protocol/staking/protocol.go: ActiveCandidates filters out
  candidates without a registered BLS pubkey once aggregation is
  enabled, so the aggregate signer set is well-defined.
- endorsement/endorsement.go: expose SigningHash so the verifier can
  reconstruct the COMMIT-vote hash from blk.CommitTime() outside an
  Endorsement struct.

All signers sign the same hash (deterministic ts from round start + TTL
sum), which is what FastAggregateVerify requires. 8 new unit tests cover
the aggregate round-trip, partial signer sets, rejection of non-BLS
endorsements / unknown endorsers, and bitmap edge cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pdate

CRITICAL security fix for the BLS rogue-key aggregate-forgery attack
against IIP-52's same-message FastAggregateVerify path.

## The bug

IIP-52 verifies the COMMIT-vote BLS aggregate signature against the sum
of registered candidate BLS public keys:

    blstAggregateSignature.FastAggregateVerify(true, Σ pk_i, msg, dst)

with dst = "BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_POP_". The _POP_
suffix names the IRTF "Proof of Possession" ciphersuite, which is only
sound when each registered pubkey has been accompanied at registration
time by a signature proving knowledge of the corresponding secret key.

Today registration validates only key format + subgroup membership:

    crypto.BLS12381PublicKeyFromBytes(blsPubKey)

No PoP is required. A registered candidate can publish

    pk_rogue = g^x − Σ(other delegates' pubkeys)

computed from public information, supply a single signature
σ = sign(x, msg) and a bitmap that claims every delegate signed, and
FastAggregateVerify collapses Σ pk_i + pk_rogue to g^x, accepting σ.
The bitmap satisfies isMajorityCount, so one delegate forges a 2/3+
quorum certificate and the consensus safety property is broken.

The window is latent: IIP-52's ToBeEnabledBlockHeight = math.MaxUint64
so this code path is not live on any chain yet. It must be fixed
before activation.

## The fix

- Add a 96-byte blsPop field to CandidateBasicInfo proto (iotex-proto
  PR pushed; go.mod replace updated to envestcc/iotex-proto@7a486f6)

- New action/protocol/staking/bls_pop.go defines the PoP signing root:

    BLSPopSigningRoot(blsPubKey, ownerAddress) =
        SHA-256("IOTEX_BLS_POP_v1" || blsPubKey || ownerAddress.Bytes())

  Binding all three values blocks the three replay variants:
  (a) attacker without the secret cannot sign over the canonical
      message at all — closes rogue key;
  (b) PoP for owner A does not validate for owner B — closes
      candidate-substitution replay;
  (c) "IOTEX_BLS_POP_v1" domain prefix is disjoint from any other BLS
      DST iotex uses, so a PoP cannot be replayed as a consensus vote
      or vice versa, and a future fork can rotate via "v2".

- New FeatureCtx.EnforceBLSPoP shares IsToBeEnabled height with
  EnableBLSAggregation today but is a semantically distinct switch so
  the two can diverge if a hotfix or fork-height adjustment is needed.

- action.CandidateRegister and action.CandidateUpdate carry blsPop
  through their constructors, Proto/LoadProto, and accessors.
  NewCandidateRegisterWithBLS and NewCandidateUpdateWithBLS now accept
  a blsPop parameter; pre-fork callers may pass nil.

- handleCandidateRegister, handleCandidateUpdate, and
  handleCandidateUpdateByOperator call VerifyBLSPop before writing
  Candidate.BLSPubKey when EnforceBLSPoP is active. Failure returns
  ReceiptStatus_ErrUnauthorizedOperator.

- TestBLSPop_RogueKeyAttackBlocked is the regression guard: it models
  the attacker trying to register a pubkey for which they do not have
  the secret, demonstrates the only PoP they can produce (signed under
  their own secret) does NOT validate against the target pubkey, and
  confirms the control case (a legitimate registrant can produce a
  valid PoP).

## Follow-ups (out of scope for this PR)

- ioctl/SDK wiring: stake2register.go and stake2update.go currently
  pass nil for blsPop with TODO markers. These need a flag to ingest a
  BLS private key and derive the PoP. Pre-fork the empty PoP is
  accepted; post-fork the same call will fail with the new check, so
  the tooling MUST be updated before the fork.

- Migration: existing pre-fork candidate records have no PoP. Before
  the fork activates, all active candidates must submit a
  candidateUpdate carrying a valid blsPop; the alternative is a
  fork-block state migration that drops BLS pubkeys without
  accompanying PoPs from ActiveCandidates. IIP draft will document
  the chosen approach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud

Copy link
Copy Markdown

@envestcc

Copy link
Copy Markdown
Member Author

Closing in favor of a clean master-based PR — the BLS PoP fix doesn't depend on IIP-52 (the vulnerable registration path is already on master), so stacking it on the IIP-52 PR chain was unnecessary. New PR: #4854

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant