diff --git a/docs/learn/fundamentals/stellar-data-structures/ledgers.mdx b/docs/learn/fundamentals/stellar-data-structures/ledgers.mdx index 9673cefbf0..7c91cf0ead 100644 --- a/docs/learn/fundamentals/stellar-data-structures/ledgers.mdx +++ b/docs/learn/fundamentals/stellar-data-structures/ledgers.mdx @@ -89,7 +89,7 @@ The close time is a UNIX timestamp indicating when the ledger closes. Its accura ### Upgrades -How the network adjusts overall values (like the base fee) and agrees to network-wide changes (like switching to a new protocol version). This field is usually empty. When there is a network-wide upgrade, the SDF will inform and help coordinate participants using the #validators channel on the Dev Discord and the Stellar Validators Google Group. +How the network adjusts overall values (like the base fee) and agrees to network-wide changes (like switching to a new protocol version). This field is usually empty. When there is a network-wide upgrade, the SDF will inform and help coordinate participants using the #validator channel on the Dev Discord and the Stellar Validators Google Group. ### Transaction set result hash diff --git a/docs/validators/README.mdx b/docs/validators/README.mdx index b2c709339b..d9f3c40519 100644 --- a/docs/validators/README.mdx +++ b/docs/validators/README.mdx @@ -16,6 +16,12 @@ This section of the docs explains how to run a validator node, which participate If you are interested in running a validator node — because you issue an asset that you would like to help secure through transaction validation, because you want to help increase network health and decentralization, or because you want to participate in network governance — then this section of the docs is for you. It explains the technical and operational aspects of installing, configuring, and maintaining a Stellar Core validator node, and should help you figure out the best way to set up your Stellar integration. +:::tip Interested in Tier 1? 
+ +If your organization is evaluating or planning to join the Tier 1 quorum — the core group of organizations whose validators bear the safety and liveness of the network — see [Tier 1 Organizations](/docs/validators/tier-1-orgs) for requirements, estimated costs, and a step-by-step onboarding path. The Admin Guide below covers the technical setup for a single validator; the Tier 1 page covers what it takes to run three and join the quorum. + +::: + ## Node Setup Process The basic flow, which you can navigate through using the "Admin Guide" on the left, goes (roughly) like this: diff --git a/docs/validators/admin-guide/configuring.mdx b/docs/validators/admin-guide/configuring.mdx index 6156398529..8594a31a6a 100644 --- a/docs/validators/admin-guide/configuring.mdx +++ b/docs/validators/admin-guide/configuring.mdx @@ -21,7 +21,7 @@ stellar-core --conf betterfile.cfg When installing using official Debian packages systemd unit file is configured to use `/etc/stellar/stellar-core.cfg` file. The examples in these docs don't specify `--conf betterfile.cfg` for the sake of brevity. -This page will walk you through the key fields you'll need to include in your config file to get your node up and runninig. +This page will walk you through the key fields you'll need to include in your config file to get your node up and running. :::info @@ -94,6 +94,12 @@ HOME_DOMAIN= QUALITY="MEDIUM" ``` +:::note Tier 1 quality rating + +The example above uses `QUALITY="MEDIUM"`, which is appropriate for a new validator building a track record. If your organization is a [Tier 1 participant](/docs/validators/tier-1-orgs) (or aspiring to become one), you should declare your own organization as `QUALITY="HIGH"`. Note that `HIGH` quality requires [publishing a history archive](/docs/validators/admin-guide/publishing-history-archives) — the requirement is programmatically enforced. 
Declaring a lower quality level significantly reduces your weight in [leader election](#impact-of-validator-quality-on-nomination) and may limit your participation in consensus. + +::: + If you don't include a `NODE_SEED` or set `NODE_IS_VALIDATOR=true`, your node will still watch SCP and see all the data in the network, but it will not send validation messages. If you run multiple validators, make sure to set a common `HOME_DOMAIN` for them by setting the `NODE_HOME_DOMAIN` property to the same value. This will ensure your nodes get grouped correctly during [quorum set generation](#home-domains-array). You also need to include your other nodes in in your config file's [`VALIDATORS` array](#validators-array). diff --git a/docs/validators/admin-guide/maintenance.mdx b/docs/validators/admin-guide/maintenance.mdx index 135b89934c..fdb782a783 100644 --- a/docs/validators/admin-guide/maintenance.mdx +++ b/docs/validators/admin-guide/maintenance.mdx @@ -25,7 +25,7 @@ We recommend performing the following steps in order (repeat sequentially as nee ## Special Considerations During Quorum Set Updates -When you join the ranks of node operators, it's also important to join the conversation. The best way to do that: follow the`#validators` channel on the [Stellar Developer Discord](https://discord.gg/stellardev). If you can't do that for some reason, sign up for the [Stellar Validators Google Group](https://groups.google.com/forum/#!forum/stellar-validators). +When you join the ranks of node operators, it's also important to join the conversation. The best way to do that: follow the `#validator` channel on the [Stellar Developer Discord](https://discord.gg/stellardev). If you can't do that for some reason, sign up for the [Stellar Validators Google Group](https://groups.google.com/forum/#!forum/stellar-validators). 
 Sometimes an organization needs to make changes that will impact the quorum sets of others:
 
diff --git a/docs/validators/admin-guide/monitoring.mdx b/docs/validators/admin-guide/monitoring.mdx
index f30c949675..4200308bd5 100644
--- a/docs/validators/admin-guide/monitoring.mdx
+++ b/docs/validators/admin-guide/monitoring.mdx
@@ -81,6 +81,20 @@ Some notable fields from this `info` endpoint are:
 - `state`: the node's synchronization status relative to the network
 - `quorum`: summary of the state of the SCP protocol participants, which is the same information returned by the `quorum` command ([see below](#quorum-health)).
 
+### Quick-Reference Health Indicators
+
+The table below summarizes the most important fields to check at a glance. For full details on each, see the explanations above and the [quorum health](#quorum-health) section below.
+
+| Field | Where | Healthy | Investigate |
+| --- | --- | --- | --- |
+| `state` | `info` | `Synced!` | Any other value — node is not participating in consensus |
+| `ledger.age` | `info` | < 10 seconds | > 10 seconds — node may be falling behind |
+| `peers.authenticated_count` | `info` | ≥ 8 | < 3 — limited connectivity, may miss messages |
+| `quorum.qset.phase` | `info` | `EXTERNALIZE` | Other phases for extended periods — consensus stalled |
+| `quorum.qset.fail_at` | `info` / `quorum` | ≥ 2 | ≤ 1 — one more failure will halt your node |
+| `quorum.qset.missing` | `info` / `quorum` | None or few | Multiple nodes — check if quorum peers are down |
+| `quorum.transitive.intersection` | `info` / `quorum` | `true` | `false` — **critical**, network at risk of splitting |
+
 ## Peer Information
 
 The `peers` command returns information on the peers your node is connected to.
 
@@ -663,7 +677,11 @@ You may find the below exporters useful for monitoring your infrastructure as th
 
 Once you've configured Prometheus to scrape and store your stellar-core metrics, you will want a nice way to render this data for human consumption. 
Grafana offers the simplest and most effective way to achieve this. Installing Grafana is out of scope of this document but is a very simple process, especially when using the [prebuilt apt packages](https://grafana.com/docs/installation/debian/#apt-repository)
 
-We recommend that administrators import the following two dashboards into their grafana deployments:
+We recommend that administrators import the following two dashboards into their Grafana deployments:
+
+| Dashboard | Grafana ID | Purpose |
+| --- | --- | --- |
+| [Stellar Core Monitoring](https://grafana.com/grafana/dashboards/10603) | `10603` | Key metrics, node status, and common problems. Start here for troubleshooting. |
+| [Stellar Core Full](https://grafana.com/grafana/dashboards/10334) | `10334` | All metrics from the exporter. For in-depth analysis. |
 
-- [**Stellar Core Monitoring**](https://grafana.com/grafana/dashboards/10603) - shows the most important metrics, node status and tries to surface common problems. It's a good troubleshooting starting point
-- [**Stellar Core Full**](https://grafana.com/grafana/dashboards/10334) - shows a simple health summary as well as all metrics exposed by the `stellar-core-prometheus-exporter`. It's much more detailed than the _Stellar Core Monitoring_ and might be useful during in-depth troubleshooting
+To import: in Grafana, go to **Dashboards → Import**, enter the dashboard ID, and select your Prometheus data source.
diff --git a/docs/validators/admin-guide/prerequisites.mdx b/docs/validators/admin-guide/prerequisites.mdx
index 8256843f52..8c43ddea88 100644
--- a/docs/validators/admin-guide/prerequisites.mdx
+++ b/docs/validators/admin-guide/prerequisites.mdx
@@ -13,15 +13,25 @@ CPU, RAM, Disk and network depends on network activity. 
If you decide to colloca ::: -At the time of writing (April 2024), Stellar Core with PostgreSQL running on the same machine worked well on a [c5d.2xlarge](https://aws.amazon.com/ec2/instance-types/c5/) instance in AWS (8x Intel Xeon vCPUs at 3.4 GHz; 16 GB RAM; 100 GB NVMe SSD (10,000 iops)). +:::tip For Tier 1 organizations + +[Tier 1 organizations](/docs/validators/tier-1-orgs) run three geographically dispersed Full Validators, each meeting these requirements independently. Each node needs its own hardware, its own unique validator key, and its own history archive. Plan for three times the resources listed below, spread across different data centers or cloud regions. See [Tier 1 Organizations](/docs/validators/tier-1-orgs) for the full requirements and onboarding path. + +::: Stellar Core is designed to run on relatively modest hardware so that a whole range of individuals and organizations can participate in the network, and basic nodes should be able to function pretty well without tremendous overhead. That said, the more you ask of your node, the greater the requirements. +The following recommendations were verified against production nodes in April 2024. Hardware requirements grow with network activity; check the [stellar-core releases](https://github.com/stellar/stellar-core/releases) for any notes on updated requirements. + | Node Type | CPU | RAM | Disk | AWS SKU | Google Cloud SKU | | --- | --- | --- | --- | --- | --- | -| Core Validator Node | 8x Intel Xeon @ 3.4 GHz | 16 GB | 100 GB NVMe SSD | [c5d.2xlarge] | [n4-highcpu-8] | +| Core Validator Node | 8 vCPUs @ 3.4 GHz | 16 GB | 100 GB NVMe SSD\* (10,000 IOPS) | [c5d.2xlarge] | [n4-highcpu-8] | + +PostgreSQL co-located on the same machine performs well at this spec — a separate database host is not required for a single validator. -_\* Assuming a 30-day retention window for data storage._ +_\* Disk sizing assumes a 30-day retention window (`AUTOMATIC_MAINTENANCE_COUNT` at default). 
See [Storage](#storage) below for details._
+
+
 
 ## Stellar Network Access
 
@@ -60,13 +70,37 @@ If you need to expose this endpoint to other hosts in your local network, we str
 
 ## Storage
 
-Most storage needs come from stellar-core's database backend, which needs to store the entire ledger state. For the most part, the contents of both the database and related directories (such as `buckets`) can be ignored, since they are completely managed by Stellar Core. In terms of storage space, 100 GB is enough (as of April 2024).
+Stellar Core's local storage needs come from two sources: the **buckets directory** (which serves as the primary database backend under BucketListDB, the default since stellar-core 21.0) and a much smaller **SQL database** for metadata. Both are managed entirely by Stellar Core. Local disk usage stays bounded over time — see [Why local disk stays bounded](#why-local-disk-stays-bounded) below.
+
+### How storage breaks down
+
+Approximate sizes for a current default-config validator on Mainnet. These figures should be treated as planning estimates rather than precise measurements.
+
+| Component | Approximate size | Notes |
+| --- | --- | --- |
+| Buckets directory (BucketListDB) | 20–40 GB | Primary store for live ledger state since stellar-core 21.0. |
+| SQL database | A few GB | Post-BucketListDB, used only for non-ledger metadata, transaction history within the retention window, and some DEX queries. Most ledger state tables are dropped at migration. |
+| WAL logs, temp files | 5–15 GB | PostgreSQL write-ahead logs and temporary space during maintenance operations. SQLite users will see lower numbers. |
+
+A working set of roughly 30–60 GB is typical. The 100 GB local NVMe included with the recommended `c5d.2xlarge` (and comparable on Hetzner, OVH, Contabo, and others) leaves comfortable operational headroom on top of that — room for debug captures, re-syncs, and unforeseen operational needs. 
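
As a rough cross-check of the working-set figure, the component ranges from the storage table can be summed directly. The sketch below is illustrative only: the 2–5 GB range for the SQL database is an assumption standing in for "a few GB", and the function names are hypothetical rather than part of any Stellar tooling.

```python
# Planning estimates from the storage table, in GB, as (low, high) pairs.
COMPONENTS_GB = {
    "buckets": (20, 40),
    "sql_database": (2, 5),  # assumed range; the table only says "a few GB"
    "wal_and_temp": (5, 15),
}
DISK_GB = 100  # local NVMe size cited for the recommended instance


def working_set(components):
    """Sum the low and high ends of every component range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


def headroom(disk_gb, components):
    """Return (worst-case, best-case) free space left on the disk."""
    low, high = working_set(components)
    return disk_gb - high, disk_gb - low


if __name__ == "__main__":
    print("working set (GB):", working_set(COMPONENTS_GB))
    print("headroom on 100 GB (GB):", headroom(DISK_GB, COMPONENTS_GB))
```

Under these assumed inputs the sum comes to 27–60 GB, in line with the "roughly 30–60 GB" planning figure, which leaves at least 40 GB free on a 100 GB disk.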
+ +### Why local disk stays bounded + +A common misconception is that validators need to provision storage proportional to network history. They do not. + +**Soroban (smart contract) state is bounded by [state archival](https://developers.stellar.org/docs/learn/encyclopedia/storage/state-archival).** Contract data and contract code entries carry a rent balance; when a Temporary entry's balance reaches zero, it is deleted from live state. Temporary entries are cheaper to create than Persistent entries and dominate total contract data volume, so as they expire and are removed, a significant amount of state is freed up — keeping the live state on disk from growing unboundedly. + +**Classic ledger entries — accounts, trustlines, offers, claimable balances, liquidity pool shares, and data entries — do not expire.** They persist on the live state indefinitely. They grow slowly, though, because [reserve requirements](/docs/learn/fundamentals/stellar-data-structures/accounts#base-reserves-and-subentries) act as anti-spam friction on creation. In practice, the resulting working set still lands in the 30–60 GB range cited above, so local validator state stays compact even as cumulative network history grows. + +**History archives live on object storage, not on the validator.** Full validators publish history archives to a separate object store (S3, R2, Backblaze B2, etc.) — that's where the multi-TB archive data lives. The validator process itself doesn't hold the archive on its local disk. See [Publishing History Archives](/docs/validators/admin-guide/publishing-history-archives) for the recommended setup. + +**`CATCHUP_COMPLETE=true` is almost never the right choice.** This setting makes the node sync the entire ledger from genesis on startup and is rarely appropriate for a validator. 
The standard pattern for new validators — including new Tier 1 candidates — is to sync against current network state, publish a history archive forward from that point, and use `stellar-archivist mirror` to backfill historical data into the published archive as a separate operation. The validator's local disk requirements are determined by the live state model above, not by historical depth. ### Database -The database is consulted during consensus, and modified atomically when a transaction set is applied to the ledger. It's random access, fine-grained, and fast. +Even with BucketListDB as the primary store, Stellar Core still requires a SQL database — either SQLite or PostgreSQL (recommended for production) — for metadata and transaction history. -As of August 2024, Stellar Core officially switched to BucketListDB as its primary database backend. Note that BucketListDB still requires SQL database, either SQLite or Postgres (recommended). +The SQL database is consulted during consensus and modified atomically when a transaction set is applied to the ledger. Access patterns are random, fine-grained, and fast. If you're using PostgreSQL, we recommend you configure your local database to be accessed over a Unix domain socket, as well as updating the below PostgreSQL configuration parameters: @@ -80,13 +114,37 @@ If you're using PostgreSQL, we recommend you configure your local database to be ### Buckets -Stellar-core stores ledger state in the form of flat XDR files called "buckets." The bucket files are used for hashing and transmission of ledger differences to history archives. If BucketListDB is used, the `buckets` directory is also used as a primary database backend. If SQL is used, the `buckets` directory is only used for hashing and history archives, and simply represents a copy of the ledger state stored in SQL. +Stellar Core stores ledger state in the form of flat XDR files called "buckets." 
These files are used for hashing and transmission of ledger differences to history archives. Under BucketListDB (the default since stellar-core 21.0), the `buckets` directory also serves as the primary database backend — making it the largest single component of validator storage. + +Buckets should be stored on a fast, local disk with sufficient space for several times the current ledger size. NVMe SSDs with 10,000+ IOPS are recommended for production validators. Network-attached or remote storage is not recommended; latency on the buckets path directly affects consensus performance. + + ## Kubernetes considerations -We currently do not recommend running validator nodes in Kubernetes. If you choose to do so, consider the following: +We currently do not recommend running validator nodes in Kubernetes. Standard VM-based deployments (bare metal or cloud instances) are the well-tested path for production validators. + +If you choose to use Kubernetes regardless, consider the following: - Sensitive data such as node seeds will be stored in Kubernetes etcd. Consider consuming credentials using tools like [vault agent](https://developer.hashicorp.com/vault/docs/platform/k8s/injector) or the [AWS Secrets Store CSI driver](https://github.com/aws/secrets-store-csi-driver-provider-aws) to improve security - Consider how external traffic will reach the pods. Tier 1 nodes need public DNS names and necessary ports must be accessible from the internet diff --git a/docs/validators/admin-guide/publishing-history-archives.mdx b/docs/validators/admin-guide/publishing-history-archives.mdx index 7c0b2ba221..68037046ba 100644 --- a/docs/validators/admin-guide/publishing-history-archives.mdx +++ b/docs/validators/admin-guide/publishing-history-archives.mdx @@ -5,6 +5,12 @@ sidebar_position: 50 If you want to run a [Full Validator](../README.mdx#full-validator), you need to set up your node to publish a history archive. 
You can host an archive using a blob store such as Amazon's S3 or Digital Ocean's spaces, or you can simply serve a local archive directly via an HTTP server such as Nginx or Apache. If you're setting up a [Basic Validator](../README.mdx#basic-validator), you can skip this section. No matter what kind of node you're planning to run, make sure to set it up to `get` history, which is covered in [Environment Preparation](./environment-preparation.mdx). +:::caution One archive per node + +Each node must publish to its own dedicated archive. Writing to the same archive from multiple nodes is not supported and will result in undefined behavior, potentially including data loss. If you're running multiple Full Validators (as [Tier 1 organizations](/docs/validators/tier-1-orgs) do), configure a separate archive for each — for example, `history-a.example.com`, `history-b.example.com`, and `history-c.example.com`. + +::: + ## Caching and History Archives The _primary_ cost of running a validator will very likely be egress bandwidth. A crucial part of your strategy to manage those costs should be caching. You can significantly reduce these data transfer costs by using common caching techniques or a CDN. Three simple rules apply to caching the History archives: diff --git a/docs/validators/admin-guide/running-node.mdx b/docs/validators/admin-guide/running-node.mdx index 03481c2e07..3197a2b7b4 100644 --- a/docs/validators/admin-guide/running-node.mdx +++ b/docs/validators/admin-guide/running-node.mdx @@ -93,7 +93,9 @@ And will then move through the various phases of downloading and applying state, "status" : [ "Catching up: downloading ledger files 20094/119803 (16%)" ] ``` -You can specify how far back in time your node goes to catch up in your config file. If you set the `CATCHUP_COMPLETE` field to `true`, your node will replay the _entire history_ of the network, which can take a long time. Weeks. 
You can speed up the process by copying existing archives from another full validator, using [`archivist` tool](./publishing-history-archives.mdx#complete-history-archive). Note that you only need to replay the complete network history if you're setting up a Full Validator. Otherwise, you can specify a starting point for catchup using the `CATCHUP_RECENT` field. To get in sync as fast as possible, simply use configuration defaults for `CATCHUP_RECENT` and `CATCHUP_COMPLETE`. See the [complete example configuration] for more details. +For the fastest sync — including for new Full Validators and Tier 1 candidates — leave `CATCHUP_RECENT` at its default and do **not** set `CATCHUP_COMPLETE=true`. Your node will sync against current network state in minutes to hours rather than weeks. + +`CATCHUP_COMPLETE=true` makes the node replay the _entire history_ of the network on startup, which takes weeks and is almost never the right choice for a validator — see [Why local disk stays bounded](./prerequisites.mdx#why-local-disk-stays-bounded). The standard pattern for new validators that want to publish a complete history archive is to sync against current state first, publish forward from that point, and use the [`stellar-archivist mirror`](https://github.com/stellar/go/tree/master/tools/stellar-archivist) tool to backfill historical data into the published archive as a separate offline operation. See the [complete example configuration] for more details. 
:::info diff --git a/docs/validators/tier-1-orgs.mdx b/docs/validators/tier-1-orgs.mdx index 72f85b271b..cec7480fec 100644 --- a/docs/validators/tier-1-orgs.mdx +++ b/docs/validators/tier-1-orgs.mdx @@ -3,84 +3,174 @@ title: Tier 1 Organizations sidebar_position: 30 --- -To help with Stellar’s decentralization, the most advanced teams building on Stellar run validators and strive to join the ranks of “Tier 1 organizations.” +# Tier 1 Organizations -Remember that the Stellar network consists of organizations that each run validators, and each organization decides for itself, by configuring a quorum set, which and how many other organizations it requires agreement from in order to commit to a particular new ledger. Tier 1 organizations are a group of organizations that, due to the fact that most other organizations require agreement from them, bear the safety and liveness of the Stellar network on their shoulders.[^1] +Tier 1 organizations are a group of organizations that bear the safety and liveness of the Stellar network on their shoulders. They earn this role because most other organizations on the network require agreement from them — by including them in their quorum sets — in order to commit to a new ledger.[^1] -To become a Tier 1 organization, a team running validators must convince enough other organizations in the Stellar network to trust them by including them in their quorum sets. As part of this process, they must meet some requirements that are accepted by the community of Stellar validators. For example, Tier 1 organizations generally run three validators, coordinate any changes to their quorum sets with each other, and hold themselves to a higher standard of uptime and responsiveness. +To become a Tier 1 organization, a team running validators must convince enough other organizations to trust them. 
SDF works closely with Tier 1 organizations to ensure network health, maintain robust quorum intersection, and build in redundancy to minimize disruptions. This guide outlines the requirements, costs, and process for organizations that want to join (or evaluate joining) the Tier 1 quorum. -As a steward of the Stellar network, the SDF works closely with Tier 1 organizations to ensure the health of the network, maintain robust quorum intersection, and build in redundancy to minimize network disruptions. This guide outlines the minimum requirements recommended by the SDF in order to be a Tier 1 organization. However, in the end, the SDF on its own cannot add or remove a Tier 1 organization; this depends on the quorum sets of many other organizations in the network. +:::info In short + +Run three geographically dispersed Full Validators, achieve sustained 99.9%+ uptime visible on [Obsrvr Radar](https://radar.withobsrvr.com/), complete [SEP-20](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0020.md) and [SEP-1](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0001.md) self-verification, and coordinate actively with the existing Tier 1 community. Joining is not a unilateral SDF decision — every existing Tier 1 organization independently decides whether to add you to their quorum set. Expect a process measured in months, not weeks. + +::: + +:::info Not ready for Tier 1? + +Running a single [Basic or Full Validator](/docs/validators) is a meaningful contribution to network decentralization and a great way to build operational experience. The [Admin Guide](/docs/validators/admin-guide) covers everything you need for single-node setup. This page covers the additional requirements for running three Full Validators as a Tier 1 organization. 
+
+:::
+
+## What Tier 1 Requires
+
+| Requirement | Details |
+| --- | --- |
+| **Full Validators** | 3, each [publishing a separate history archive](/docs/validators/admin-guide/publishing-history-archives) |
+| **Geographic dispersion** | Nodes in different data centers or cloud regions |
+| **Self-verification** | [SEP-20](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0020.md) on-chain identity linking + [SEP-1](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0001.md) stellar.toml |
+| **Uptime** | 99.9%+ target with 24/7 [monitoring and alerting](/docs/validators/admin-guide/monitoring) |
+| **Coordination** | Active communication with other Tier 1 organizations |
+| **Trust** | Other Tier 1 organizations must choose to include you in their quorum sets |
 
 ## Why Three Validators
 
-The most important recommendation for a Tier 1 organization is to set up and maintain three full validators. Why three?
+On Stellar, validators choose to trust _organizations_ when they configure their quorum set. If you are a trustworthy organization, you want your presence on the network to persist even if a node fails or you take it down for maintenance. A trio of validating nodes allows that to happen: other participants can require ⅔ of your nodes to agree. If one has issues, the other two still vote on your organization's behalf.
 
-On Stellar, validators choose to trust organizations when they configure their quorum set. If you are a trustworthy organization, you want your presence on the network to persist even if a node fails or you take it down for maintenance. A trio of validating nodes allows that to happen: when configuring their quorum sets, other participants can requires ⅔ of your validating nodes to agree. If 1 has issues, no big deal: the other two still vote on your organization’s behalf, so the show goes on. 
To ensure redundancy, it's also important that those three full validators be geographically dispersed: if they're in the same data center, they run the risk of going down at the same time. +To ensure redundancy, those three Full Validators must be **geographically dispersed** — different data centers, ideally different cloud regions or providers. If all three are in the same facility, a single outage takes out your entire organization's voting power. -Here’s what else Tier 1 organizations should expect of one another: +## Estimated Costs -## Publish History Archives +Running three Full Validators is more affordable than many newcomers expect, but the spread between cost-conscious and managed-cloud setups is wide. The dominant cost driver is bandwidth associated with serving your history archive — and that varies significantly by provider, by archive consumption patterns, and by whether you use a CDN or an egress-free object store. -In addition to participating in the Stellar Consensus Protocol, a full validator publishes an archive of network transactions. To do that, you need to configure Stellar Core to record history to a publicly accessible archive, and add the location of that archive to your stellar.toml. We recommend that, as a Tier 1 organization, you should set each of your nodes to record history to a separate archive. +The table below groups setups into three archetypes that reflect actual choices made by current Tier 1 operators. Per-node figures assume a single Full Validator publishing its own history archive; three-node figures assume a Tier 1 deployment with geographic dispersion. -Public archives make the network more resilient: when new nodes come online, or when existing nodes lose synch, they need to consult an archive to figure out what they missed. Sharing snapshots of the ledger, which detail transactions and their results, allows those nodes to catch up, and more archives mean more redundancy and greater decentralization. 
Plus, sharing history keeps everyone honest.
+The table below groups setups into three archetypes that reflect actual choices made by current Tier 1 operators. Per-node figures assume a single Full Validator publishing its own history archive; three-node figures assume a Tier 1 deployment with geographic dispersion.
 
-## Set Up a Safe Quorum Set
+| Setup archetype | Per node / month | 3 nodes / month | Typical providers |
+| --- | --- | --- | --- |
+| **Lean** | \$60–200 | \$180–600 | Contabo, Hetzner; archive on Cloudflare R2 or Backblaze B2 |
+| **Standard** | \$200–500 | \$600–1,500 | Hetzner, OVH, DigitalOcean, Akamai; archive on Backblaze, Wasabi, or self-hosted |
+| **Hyperscaler** | \$500–1,000+ | \$1,500–3,000+ | AWS, GCP, Azure; archive on S3/GCS/Blob without a CDN |
 
-For simplicity, we’re recommending that every Tier 1 node use the same quorum set configuration, which is made up of inner quorum sets representing each Tier 1 organization.
+_Last reviewed: May 2026. Sanity-check against current provider pricing before budgeting. Costs reported by current Tier 1 operators range from roughly \$160 to \$1,800 per month for three nodes._
 
-To configure a quorum set for your validator, we recommend including several Tier 1 organizations or copying the existing Tier 1 Qset and optionally adding additional organizations that you trust to it. Using existing Tier 1 organizations as a safety net, we can work together to expand the quorum methodically and deliberately.
+### What Drives the Variance
 
-To see what the current recommended quorum set looks like, check out the [example Full Validator config file](https://github.com/stellar/packages/blob/master/docs/examples/pubnet-validator-full/stellar-core.cfg).
+**History archive bandwidth is the largest swing factor.** Validators that are heavily consumed as a catch-up source — by other validators, archivers, RPC nodes, and ecosystem services — can serve 10–30 TB per month from a single archive. Whether that costs \$0 or \$2,700 depends almost entirely on your provider:
 
-:::note
+- **Egress-free object storage** (Cloudflare R2, Backblaze B2 with Cloudflare in front) charges nothing for outbound bandwidth. 
Several Tier 1 operators use this approach specifically to remove archive bandwidth from the cost equation. +- **Bundled bandwidth allowances** (Hetzner includes 20 TB/month in EU regions; OVH includes generous traffic on dedicated servers) cover most realistic archive workloads at no marginal cost. +- **Hyperscaler egress** (\$0.09/GB on AWS, similar on GCP and Azure) turns a 10 TB archive month into a \$900 line item, before any compute or storage. -:::note +**Archive consumption is uneven across your three nodes.** Operators commonly report large variance between sibling nodes — one archive may serve tens of TB per month while another in the same fleet serves under 100 GB, depending on how each is configured in other operators' archive lists. Plan capacity for at least one of your three nodes to be heavily consumed. -Be sure to properly add your own validators to your validators' quorum sets. As a Tier 1 organization with three validators and history archives, SDF recommends declaring your organization [`HIGH` quality](./admin-guide/configuring.mdx#validator-quality) in your quorum set configuration. Declaring your organization at a lower quality level [may limit its participation in consensus](./admin-guide/configuring.mdx#impact-of-validator-quality-on-nomination). +**Compute is a smaller line item than newcomers often assume.** A node meeting the recommended hardware spec (8 vCPU, 16 GB RAM, NVMe SSD) costs \$40–80/month on dedicated providers like Contabo, Hetzner, or OVH, and \$150–300/month on hyperscalers. Reserved or committed-use pricing on hyperscalers narrows this gap considerably. -::: +### What Tier 1 Operators Actually Do + +Among current Tier 1 organizations, the dominant pattern is **dedicated or bare-metal providers with generous egress allowances**, paired with **egress-free or low-cost object storage for history archives**. 
Hetzner, OVH, Contabo, and DigitalOcean appear repeatedly across the cohort; AWS, GCP, and Azure are notably underrepresented. This is not a recommendation to avoid hyperscalers — some operators use them deliberately for compliance, geographic, or operational reasons — but it is a pattern worth understanding before committing to a stack. + +If you are evaluating providers, the `#validator` channel on the [Stellar Developer Discord](https://discord.gg/stellardev) is the best place to ask current operators what they are actually paying. + +### Other Line Items + +The archetypes above bundle these into the totals, but they are worth listing for completeness: + +- **DNS / domain registration:** \$1–5/month total. Negligible. +- **Monitoring:** \$0–50/month total. Self-hosted Prometheus + Grafana on one of your existing nodes is functionally free; managed services like Grafana Cloud or Datadog push to the high end. + +## What SDF and Existing Tier 1 Organizations Evaluate + +Becoming Tier 1 is not a unilateral decision by SDF — it depends on the quorum set choices of all existing Tier 1 organizations. But SDF does steward the process. When evaluating candidates, the community considers: + +1. **Organizational mission alignment** — Does your organization have a genuine stake in Stellar's success? Do you issue assets, process payments, or build infrastructure on the network? +2. **Identity and real-world presence** — Are you transparent about who you are? Public leadership, registered entity, visible operations? +3. **Operational excellence** — Can you demonstrate sustained high uptime (99.9%+)? Is your monitoring professional-grade? +4. **Geographic diversity** — Are your nodes in different regions than existing Tier 1 organizations? +5. **Jurisdictional diversity** — Are you in a different legal jurisdiction, reducing correlated regulatory risk? +6. **Infrastructure diversity** — Do you use different cloud providers, ISPs, or hardware than existing organizations? +7. 
**Economic diversity** — Does your business model differ from existing Tier 1 organizations? +8. **Responsiveness** — Do you respond quickly to incidents, upgrade requests, and coordination needs? + +No single dimension is disqualifying on its own, but organizations that strengthen the network across multiple dimensions are the strongest candidates. + +## Step-by-Step Path to Tier 1 + +The checklist below covers the operational work. Completing it is necessary but not sufficient — see [What SDF and Existing Tier 1 Organizations Evaluate](#what-sdf-and-existing-tier-1-organizations-evaluate) above for the qualitative dimensions that determine whether existing Tier 1 organizations add you to their quorum sets. + +### Phase 1: First Full Validator + +Work through the [Admin Guide](/docs/validators/admin-guide) to stand up a single Full Validator on Mainnet: -## Declare Your Node +- [ ] Review [prerequisites](/docs/validators/admin-guide/prerequisites) and provision a server +- [ ] [Install](/docs/validators/admin-guide/installation) Stellar Core +- [ ] [Configure](/docs/validators/admin-guide/configuring) for Mainnet with a validator key pair (`stellar-core gen-seed`) +- [ ] [Prepare your environment](/docs/validators/admin-guide/environment-preparation) and initialize the database +- [ ] Set up a [history archive](/docs/validators/admin-guide/publishing-history-archives) +- [ ] [Start your node](/docs/validators/admin-guide/running-node) and sync to the network +- [ ] Set up [monitoring](/docs/validators/admin-guide/monitoring) with Prometheus + Grafana -[SEP-20](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0020.md) is an open spec that explains how self-verification of validator nodes works. The fields it specifies are pretty simple: you set the home domain of your validator’s Stellar account to your website, where you publish information about your node and your organization in a stellar.toml file. 
+### Phase 2: Three Full Validators -It’s an easy way to propagate information, and it harnesses the network to allow other participants to discover your node and add it to their quorum sets without the need for a centralized database. +Scale to the Tier 1 architecture: -## Keep Your Nodes Up To Date +- [ ] Provision 2 additional servers in **different geographic regions** +- [ ] Generate **unique** key pairs for each node — never share seeds +- [ ] Configure each with its own [history archive](/docs/validators/admin-guide/publishing-history-archives) (one archive per node) +- [ ] Deploy the standard Tier 1 [quorum set](/docs/validators/admin-guide/configuring#choosing-your-quorum-set) on all three nodes, declaring your own organization as `HIGH` quality +- [ ] Extend monitoring to cover all three nodes -Running a validator requires vigilance. You need to keep an eye on your nodes, keep them up to date with the latest version of Stellar Core, and check in on public channels for information about what’s currently happening with other validators. As organizations join or leave the network, you might need to update the quorum set configuration of your validators to ensure that your validators have robust quorum intersection with Tier 1 and robust quorum availability. 
+### Phase 3: Identity and Verification -The best two ways to do that: +Make your organization discoverable and verifiable: -- Join the validators [email list](https://groups.google.com/forum/#!forum/stellar-validators) -- download Keybase and join the [#validators channel](https://keybase.io/team/stellar.public) on the stellar.public team +- [ ] Configure public DNS for each node (e.g., `core-a.example.com`, `core-b.example.com`, `core-c.example.com`) +- [ ] Create funded Stellar accounts for each validator node +- [ ] Set home domain on-chain for each ([SEP-20](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0020.md)) +- [ ] Publish a complete stellar.toml ([SEP-1](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0001.md)) with organization info and all three `[[VALIDATORS]]` entries +- [ ] Verify your nodes appear correctly on [Obsrvr Radar](https://radar.withobsrvr.com/) -We always announce new Stellar Core releases in those channels. You can also find those releases on our github. +### Phase 4: Build Trust (Ongoing) -It’s also critical that you pay attention to information about what those updates mean: often, you’ll need to set your validators to vote on something timely, such as when to vote to upgrade to a new protocol version, or how high to set the operations-per-ledger limit. 
+Earn the confidence of existing Tier 1 organizations: -## Coordinate With Other Validators +- [ ] Join the [Stellar Dev Discord](https://discord.gg/stellardev) `#validator` channel +- [ ] Maintain 99.9%+ uptime for 3+ months, visible on Obsrvr Radar +- [ ] Participate in upgrade coordination and governance discussions +- [ ] Respond promptly to maintenance coordination from other validators +- [ ] Engage with SDF about Tier 1 candidacy when your track record is established -Whether you run a trio of validators or a single node, it’s important that you coordinate with other validators when you make a significant change or notice something wrong. You should let them know when you plan to: +### What Happens When You're Ready -- Take your node down for maintenance -- Make changes to your quorum set +Once SDF and existing Tier 1 organizations agree you meet the bar: -Letting other validators know when you plan to take your node down for maintenance or to upgrade to the latest version of stellar-core prevents a critical mass of nodes from going offline at the same time. +1. Existing Tier 1 organizations update their quorum sets to include your validators +2. The standard Tier 1 quorum set configuration is updated to include your organization +3. You begin appearing as Tier 1 on network monitoring tools +4. You assume all the responsibilities described on this page -Letting other validators know when you plan to change your quorum set allows them to respond, adjust, and think through the implications of the change. For the Stellar network to expand safely, the SDF recommends that validators coordinate off-chain to maintain good quorum intersection. +This is not a unilateral decision by SDF. Every existing Tier 1 organization independently decides whether to trust you by adding you to their quorum set. 
-## Monitor your quorum set +## Working with the Tier 1 Community -We recommend using Prometheus to scrape and store your stellar-core metrics, and Grafana to render that data for human consumption. You can find step-by-step instructions for setting up monitoring and alerts in [Monitoring and Diagnostics](./admin-guide/monitoring.mdx), along with links to Grafana dashboards we’ve created to make things easier. +Coordinate with other validators when you make significant changes. Specifically, let them know when you plan to: -You can also use [Obsrvr Radar](https://radar.withobsrvr.com/) to view validators’ quorum configurations, get information about their availability and uptime, and the quorum command to diagnose problems with the quorum set of the local node. +- **Take a node down for maintenance** — so a critical mass of nodes don't go offline simultaneously +- **Make changes to your quorum set** — so they can respond, adjust, and ensure quorum intersection is maintained +- **Upgrade to a new Stellar Core version** — especially when the release involves protocol upgrades or Soroban settings votes -You should do regular check-ins on your quorum set. If nodes have bad uptime or prove otherwise unreliable, you may need to remove them from your quorum set so that you don’t get stuck and so that the network doesn’t halt. You may also want to add new organizations that come online and prove reliable. If you plan to do either of those things, remember to communicate and coordinate with other validators. +For the Stellar network to expand safely, validators must coordinate off-chain to maintain good quorum intersection. Never change your quorum set without discussing it with other Tier 1 organizations first. 
-## Get in touch +The `#validator` channel on the [Stellar Dev Discord](https://discord.gg/stellardev) is the primary place to do this coordination — and also where prospective Tier 1 candidates can introduce themselves and get guidance from existing operators and SDF. As Stellar grows and more businesses build on the network, Tier 1 organizations will be crucial to healthy expansion of the network. -If you think you can be a Tier 1 organization, let us know on the `#validators` channel on the [Stellar Developers' Discord](https://discord.gg/stellardev). Community members can help you through the process, and once you’re up and running, SDF team members will help you join Tier 1, so that you can take your rightful place as a pillar of the network. Once you’ve proven that you are responsive, reliable, and maintain good uptime, the SDF may recommend that other validators adjust their quorum set to include your validators. +## Resources -As Stellar grows, and more and more businesses build on the network, Tier 1 organizations will be crucial to a healthy expansion of the network. +| Resource | Link | +| --- | --- | +| Stellar Dev Discord (`#validator`) — primary coordination channel for validator operators. Release announcements, upgrade coordination, incident response. 
| [discord.gg/stellardev](https://discord.gg/stellardev) | +| stellar-core GitHub releases — official release notes | [github.com/stellar/stellar-core/releases](https://github.com/stellar/stellar-core/releases) | +| Obsrvr Radar — public validator uptime and quorum monitoring | [radar.withobsrvr.com](https://radar.withobsrvr.com/) | +| Admin Guide | [/docs/validators/admin-guide](/docs/validators/admin-guide) | +| Example Mainnet Full Validator config | [stellar/packages on GitHub](https://github.com/stellar/packages/blob/master/docs/examples/pubnet-validator-full/stellar-core.cfg) | +| SEP-20 (Self-Verification) | [stellar-protocol on GitHub](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0020.md) | +| SEP-1 (stellar.toml) | [stellar-protocol on GitHub](https://github.com/stellar/stellar-protocol/blob/master/ecosystem/sep-0001.md) | -[^1]: The notion of Tier 1 organization can be defined precisely, but this is besides the point of this page. +[^1]: The notion of Tier 1 organization can be defined precisely, but this is beside the point of this page.
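The archive egress figures quoted in the cost section can be sanity-checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1 TB = 1,000 GB) and a flat per-GB rate; `egress_cost_usd` is a hypothetical helper, and the overage rate used for the bundled-allowance case is a placeholder, not a quoted price.

```python
def egress_cost_usd(tb_served: float, usd_per_gb: float, included_tb: float = 0.0) -> float:
    """Estimate monthly egress cost for a history archive.

    Only traffic above `included_tb` is billed. Assumes 1 TB = 1,000 GB.
    """
    billable_tb = max(tb_served - included_tb, 0.0)
    return round(billable_tb * 1000 * usd_per_gb, 2)


# Hyperscaler-style pricing at $0.09/GB, the rate cited in the cost section.
for tb in (10, 30):
    print(f"{tb} TB at $0.09/GB: ${egress_cost_usd(tb, 0.09):,.0f}")

# Bundled allowance: 20 TB/month included, so a 10 TB month bills nothing.
# The 0.01 overage rate is a placeholder for illustration only.
print(f"10 TB with 20 TB included: ${egress_cost_usd(10, 0.01, included_tb=20):,.0f}")
```

The 10 TB and 30 TB cases reproduce the \$900 and \$2,700 figures in the text; swap in your provider's current rate and allowance before budgeting.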