
feat(scheduler): use GCP Cloud Scheduler as HA scheduler backend #291

Description

@ptone

Feature Request

When Scion runs in HA mode (multiple hub replicas), the built-in scheduler needs a distributed backend that fires each scheduled job once per tick rather than once per replica. GCP Cloud Scheduler is a natural fit for GCP-hosted deployments.

Context

The current scheduler implementation is designed for single-instance deployments. In HA mode, a naive in-process scheduler would fire on every hub replica, so each job would run once per replica. A distributed scheduler backend is required.

Proposal

Use GCP Cloud Scheduler as the scheduler engine when:

  • Scion is running in HA mode (multiple hub replicas)
  • The deployment is GCP-hosted (Cloud Run or GKE)
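
A minimal sketch of that selection rule, to make the two conditions concrete. `DeploymentInfo`, `SchedulerBackend`, and `select_backend` are hypothetical names for illustration, not existing Scion code, and how the hub would actually detect replica count or GCP hosting is left open.

```python
from dataclasses import dataclass
from enum import Enum


class SchedulerBackend(Enum):
    IN_PROCESS = "in-process"
    CLOUD_SCHEDULER = "cloud-scheduler"


@dataclass(frozen=True)
class DeploymentInfo:
    # Hypothetical fields: how Scion detects these is an assumption.
    hub_replicas: int
    on_gcp: bool  # True for Cloud Run or GKE deployments


def select_backend(info: DeploymentInfo) -> SchedulerBackend:
    """Pick Cloud Scheduler only when both conditions from the proposal hold."""
    if info.hub_replicas > 1 and info.on_gcp:
        return SchedulerBackend.CLOUD_SCHEDULER
    # Single-instance or non-GCP: keep the existing in-process scheduler.
    return SchedulerBackend.IN_PROCESS
```

Note that under this rule an HA deployment outside GCP still lands on the in-process scheduler, so that case would need a separate answer.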

How it would work

  • Scion creates/manages Cloud Scheduler jobs via the GCP Cloud Scheduler API
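  • Cloud Scheduler fires an HTTP request at the hub load balancer (or publishes to a Pub/Sub topic that delivers to the hub)
  • The load balancer routes each trigger to a single replica, which executes the scheduled action. Cloud Scheduler delivery is at-least-once rather than exactly-once, so a trigger can occasionally be repeated; the handler should be idempotent (for example, by deduplicating on job name and scheduled time)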
  • Fallback: for non-GCP or single-instance deployments, keep the existing in-process scheduler
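
Since Cloud Scheduler may occasionally deliver a trigger more than once, the hub-side handler probably wants a dedupe step. Below is a rough sketch, assuming HTTP targets and the `X-CloudScheduler-JobName` / `X-CloudScheduler-ScheduleTime` request headers that Cloud Scheduler sets on HTTP target requests. `InMemoryClaimStore` and `handle_trigger` are hypothetical names. The in-memory store is only a stand-in: with multiple replicas the claim would have to live in shared storage (for example, a unique constraint in the hub database).

```python
import threading
from typing import Callable, Mapping


class InMemoryClaimStore:
    """Stand-in for a shared store; a real HA setup needs cross-replica storage."""

    def __init__(self) -> None:
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, job_name: str, schedule_time: str) -> bool:
        """Return True only for the first caller to claim this (job, tick) pair."""
        key = (job_name, schedule_time)
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def handle_trigger(
    headers: Mapping[str, str],
    store: InMemoryClaimStore,
    action: Callable[[], None],
) -> int:
    """Run the scheduled action at most once per (job, tick); return an HTTP status."""
    job_name = headers.get("X-CloudScheduler-JobName")
    schedule_time = headers.get("X-CloudScheduler-ScheduleTime")
    if not job_name or not schedule_time:
        return 400  # not a recognizable Cloud Scheduler request
    if not store.claim(job_name, schedule_time):
        return 200  # duplicate delivery: acknowledge without re-running
    action()
    return 200
```

One design question this leaves open: claiming before running means a failed action is not retried on redelivery. Releasing the claim on failure, or recording completion instead of start, would trade that for possible re-execution.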

Benefits

  • No custom leader election or distributed locking needed to decide which replica fires a job
  • Cloud Scheduler handles retries, time zones, and delivery reliability
  • Integrates naturally with existing GCP auth (already used for Vertex AI, Cloud Run deployments)
  • Operations visibility via Cloud Console

Scope to consider

  • API for registering/deregistering Cloud Scheduler jobs
  • Auth requirements (service account that can create, update, and delete jobs, e.g. the Cloud Scheduler Admin role, `roles/cloudscheduler.admin`)
  • Job naming convention to map Scion schedule IDs → Cloud Scheduler job names
  • Whether to use HTTP targets (simpler) or Pub/Sub targets (more decoupled)
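
For the naming convention, one possible mapping is sketched below. It assumes Cloud Scheduler job IDs are limited to letters, digits, hyphens, and underscores with a maximum length of 500 characters, and that full job names take the form `projects/PROJECT/locations/LOCATION/jobs/JOB_ID`. The `scion-` prefix and the function names are hypothetical. A short hash of the original ID is appended so that two schedule IDs that sanitize to the same string do not collide.

```python
import hashlib
import re

MAX_JOB_ID_LEN = 500  # assumed Cloud Scheduler limit on the job ID
_INVALID = re.compile(r"[^A-Za-z0-9_-]")
_DIGEST_LEN = 8


def job_id_for(schedule_id: str, prefix: str = "scion-") -> str:
    """Map a Scion schedule ID to a deterministic, Cloud Scheduler-safe job ID."""
    digest = hashlib.sha256(schedule_id.encode("utf-8")).hexdigest()[:_DIGEST_LEN]
    # Leave room for the prefix, a separator, and the digest.
    budget = MAX_JOB_ID_LEN - len(prefix) - 1 - _DIGEST_LEN
    safe = _INVALID.sub("-", schedule_id)[:budget]
    return f"{prefix}{safe}-{digest}"


def job_name_for(project: str, location: str, schedule_id: str) -> str:
    """Build the fully qualified job name used by the Cloud Scheduler API."""
    return f"projects/{project}/locations/{location}/jobs/{job_id_for(schedule_id)}"
```

Because the mapping is deterministic, deregistration can recompute the job name from the schedule ID instead of storing it separately.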

Labels: enhancement