External heartbeat ticker #674

@spiridonov

Hi!

I am working on https://github.com/evrblk/monstera. In a few words, it is basically a multi-raft server. Each application is split into shards, each shard is replicated and forms a Raft group. Each Monstera node holds multiple replicas, one for each of those shards. And they all talk to each other!

The problem is that, even when idle, dozens of Raft instances send heartbeats frequently and without any synchronization. That creates unnecessary load on the transport layer and the Go runtime (too many goroutines to synchronize and too many small objects to dispatch). I was thinking about grouping/batching those heartbeats by destination node (there is a high chance that, for several Raft groups with a replica on node 1, the other replicas are all located on node 2). But I have no control over those heartbeats whatsoever.
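To make the batching idea concrete, here is a rough sketch. All names in it (`Heartbeat`, `Pending`, `BatchByDest`) are hypothetical and do not exist in hashicorp/raft; it only shows the grouping step, not the transport.

```go
package main

import (
	"fmt"
	"sort"
)

// Heartbeat is a hypothetical minimal heartbeat for one Raft group.
type Heartbeat struct {
	GroupID string // shard / Raft group identifier
	Term    uint64 // leader's current term
}

// Pending pairs a heartbeat with the node that hosts the target replica.
type Pending struct {
	Dest string
	HB   Heartbeat
}

// BatchByDest groups pending heartbeats by destination node, so each node
// could receive one message per tick instead of one per Raft group.
func BatchByDest(pending []Pending) map[string][]Heartbeat {
	batches := make(map[string][]Heartbeat)
	for _, p := range pending {
		batches[p.Dest] = append(batches[p.Dest], p.HB)
	}
	return batches
}

func main() {
	pending := []Pending{
		{Dest: "node2", HB: Heartbeat{GroupID: "shard-a", Term: 3}},
		{Dest: "node2", HB: Heartbeat{GroupID: "shard-b", Term: 7}},
		{Dest: "node3", HB: Heartbeat{GroupID: "shard-a", Term: 3}},
	}
	batches := BatchByDest(pending)

	// Sort destinations only to make the printed order stable.
	dests := make([]string, 0, len(batches))
	for d := range batches {
		dests = append(dests, d)
	}
	sort.Strings(dests)
	for _, d := range dests {
		fmt.Printf("%s: %d heartbeats in one message\n", d, len(batches[d]))
	}
}
```

With three pending heartbeats for two destinations, only two messages would go out on the wire.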

etcd/raft has an external ticker, which gives users more control over when heartbeats are triggered. hashicorp/raft has a deeply hardcoded goroutine (https://github.com/hashicorp/raft/blob/main/replication.go#L141) and a randomized timer in randomTimeout (https://github.com/hashicorp/raft/blob/main/util.go#L29). What do you think about making this configurable so that I could provide a single handler for multiple Raft instances?

Also, there is no dedicated message type for heartbeats. They are all AppendEntries requests with empty Entries. What do you think about an optional pluggable handler that would allow sending heartbeats some other way than via AppendEntries?
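As a rough illustration of the pluggable handler, something along these lines. `HeartbeatSender`, `appendEntriesSender`, `recordingSender`, and `beat` are all hypothetical names, and `AppendEntries` here is a simplified stand-in rather than the real request type.

```go
package main

import "fmt"

// AppendEntries is a simplified stand-in for the real RPC request.
type AppendEntries struct {
	Term    uint64
	Entries [][]byte
}

// HeartbeatSender is a hypothetical hook for delivering heartbeats.
type HeartbeatSender interface {
	SendHeartbeat(target string, term uint64) error
}

// appendEntriesSender mirrors the current behavior described above:
// a heartbeat is an AppendEntries request with no entries.
type appendEntriesSender struct {
	send func(target string, req AppendEntries) error
}

func (s appendEntriesSender) SendHeartbeat(target string, term uint64) error {
	return s.send(target, AppendEntries{Term: term})
}

// recordingSender is a custom sender that only records what it was asked
// to send; a real one could enqueue into a per-node batch instead.
type recordingSender struct {
	sent []string
}

func (r *recordingSender) SendHeartbeat(target string, term uint64) error {
	r.sent = append(r.sent, fmt.Sprintf("%s@%d", target, term))
	return nil
}

// beat is what a leader loop would call once per heartbeat interval.
func beat(s HeartbeatSender, followers []string, term uint64) error {
	for _, f := range followers {
		if err := s.SendHeartbeat(f, term); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	followers := []string{"node2", "node3"}

	def := appendEntriesSender{send: func(target string, req AppendEntries) error {
		fmt.Printf("AppendEntries to %s, term %d, %d entries\n", target, req.Term, len(req.Entries))
		return nil
	}}
	_ = beat(def, followers, 5)

	custom := &recordingSender{}
	_ = beat(custom, followers, 5)
	fmt.Println(custom.sent)
}
```

The default implementation keeps existing behavior, so the hook would stay optional.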

I can work on a PR, but first would like to hear your thoughts. Thanks!
