kube-cost-lens

Lightweight Kubernetes FinOps CLI — analyze resource usage, detect waste, and get actionable rightsizing recommendations without installing anything in your cluster.

The Problem

Kubernetes clusters are notoriously over-provisioned. Teams set generous CPU and memory requests "just in case," and nobody revisits them. The result:

30-60% of requested resources sit idle in a typical cluster.
Platform teams lack visibility into which workloads are wasteful.
FinOps conversations happen in spreadsheets, disconnected from the actual cluster state.
Existing tools (Kubecost, CAST AI, etc.) require in-cluster agents, are commercial, or demand heavy setup.

kube-cost-lens solves this by providing a zero-install, read-only CLI that connects to your cluster, analyzes real usage vs. declared requests, and outputs clear, actionable recommendations.

Core Principles

Principle	Description
Zero footprint	Nothing gets installed in the cluster. Read-only access via kubeconfig.
Actionable output	Every recommendation includes the exact patch (resource values) to apply.
No vendor lock-in	Works with any Kubernetes cluster (EKS, GKE, AKS, on-prem, k3s, etc.).
Prometheus-optional	Works with the Kubernetes Metrics API by default; Prometheus integration adds historical analysis.
CI/CD friendly	Machine-readable output (JSON, YAML) for pipeline integration.

Features (Planned)

Phase 1 — Core Analysis

Connect to any cluster via kubeconfig / context selection
Scan all namespaces (or filtered subset) for Deployments, StatefulSets, DaemonSets, Jobs, CronJobs
Compare resource requests and limits vs. actual usage (via Metrics API)
Calculate waste score per workload (percentage of requested resources unused)
Classify workloads: oversized, undersized, right-sized, no-requests-set, no-limits-set
Generate per-workload rightsizing recommendations with suggested values
Output: table (terminal), JSON, YAML, CSV
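
The waste score and classification steps above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names, the order of the checks, and the 20% cut-off (borrowed from the "right-sized" bucket in the planned example output) are all assumptions, and each resource (CPU or memory) is scored on its own.

```python
from typing import Optional


def waste_percent(requested: float, used: float) -> float:
    """Percentage of the requested amount that is not being used."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    # Usage above the request means nothing is wasted, so clamp at 0.
    return max(0.0, (requested - used) / requested * 100.0)


def classify(
    requested: Optional[float],
    limit: Optional[float],
    used: float,
    oversized_above: float = 20.0,  # assumed threshold, in percent
) -> str:
    """Map one resource of a workload to one of the planned status labels."""
    if not requested:
        return "no-requests-set"
    if limit is None:
        return "no-limits-set"
    if used > requested:
        return "undersized"
    if waste_percent(requested, used) > oversized_above:
        return "oversized"
    return "right-sized"


# Hypothetical workload: 2000m CPU requested, 4000m limit, 340m used.
print(round(waste_percent(2000, 340)))  # 83
print(classify(2000, 4000, 340))        # oversized
print(classify(1000, 2000, 900))        # right-sized
```

How CPU and memory scores are combined into a single per-workload figure is left open here, since the plan does not specify it.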

Phase 2 — Namespace & Cluster Aggregation

Aggregate waste by namespace with total cost impact estimation
Cluster-level summary dashboard in the terminal (Rich/Textual)
Support custom cost-per-cpu and cost-per-gb-memory inputs for cost estimation
ResourceQuota analysis: how much of the namespace quota is actually consumed
Detect namespaces without ResourceQuotas (governance gap)
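
One way the per-namespace cost impact could be estimated from custom rates is sketched below. The `WorkloadUsage` type, the function name, and the 730-hours-per-month constant are assumptions made for illustration; the rates correspond to the planned cost-per-cpu and cost-per-gb-memory inputs.

```python
from collections import defaultdict
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumption: average month length in hours


@dataclass
class WorkloadUsage:
    """Hypothetical record of one workload's requests and observed usage."""
    namespace: str
    cpu_requested_cores: float
    cpu_used_cores: float
    mem_requested_gb: float
    mem_used_gb: float


def monthly_waste_by_namespace(
    workloads: list[WorkloadUsage],
    cpu_cost_hour: float,
    memory_cost_gb_hour: float,
) -> dict[str, float]:
    """Sum the monthly cost of requested-but-idle resources per namespace."""
    totals: dict[str, float] = defaultdict(float)
    for w in workloads:
        # Only idle capacity counts as waste; over-use is clamped to zero.
        idle_cpu = max(0.0, w.cpu_requested_cores - w.cpu_used_cores)
        idle_mem = max(0.0, w.mem_requested_gb - w.mem_used_gb)
        hourly = idle_cpu * cpu_cost_hour + idle_mem * memory_cost_gb_hour
        totals[w.namespace] += hourly * HOURS_PER_MONTH
    return dict(totals)


workloads = [
    WorkloadUsage("production", 2.0, 0.34, 4.0, 1.2),
    WorkloadUsage("staging", 1.0, 0.012, 2.0, 0.0625),
]
for namespace, cost in monthly_waste_by_namespace(workloads, 0.032, 0.004).items():
    print(f"{namespace}: ${cost:.2f}/month")
```

The estimate prices idle requests only; it says nothing about node-level costs such as unallocated capacity.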

Phase 3 — Historical Analysis (Prometheus)

Connect to Prometheus / Thanos / VictoriaMetrics
Analyze usage patterns over configurable time windows (7d, 30d, 90d)
Detect workloads with periodic spikes (candidates for HPA)
Detect workloads with flat low usage (candidates for aggressive downsizing)
P95/P99 usage-based recommendations (not just average)
Time-series waste trend: is the cluster getting more or less efficient?
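
To illustrate why percentile-based recommendations differ from average-based ones, here is a small sketch using the nearest-rank percentile. The function names and the headroom factor are assumptions; the samples stand in for usage values that would come from a Prometheus range query.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample covering pct% of the data."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def recommend_request(samples: list[float], pct: float = 95.0, headroom: float = 0.2) -> float:
    """Suggested request: the chosen percentile plus an assumed safety margin."""
    return percentile(samples, pct) * (1 + headroom)


# Hypothetical CPU usage in millicores: flat at 100m with a single spike.
samples = [100.0] * 19 + [900.0]
average = sum(samples) / len(samples)
print(average)                   # 140.0
print(percentile(samples, 95))   # 100.0
print(percentile(samples, 99))   # 900.0
```

In this sample the average is pulled up by one spike, P95 ignores it, and P99 sizes for it. Which of these is the right basis depends on how much throttling or eviction risk the workload can tolerate.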

Phase 4 — Automation & Integration

Generate Kubernetes patches (JSON patch format) ready to apply
Generate Kustomize overlays with recommended values
CI mode: exit with non-zero code if waste exceeds configurable threshold
GitHub Actions integration example
Slack/webhook notifications for periodic reports
VPA (Vertical Pod Autoscaler) recommendation comparison: kube-cost-lens vs. VPA
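
A generated patch might look roughly like the output of the sketch below, which builds a JSON Patch (RFC 6902) document for one container of a Deployment-style workload. The function name and the suggested values are hypothetical. The path assumes a pod template at /spec/template (CronJobs nest theirs deeper) and an existing `resources` object on the container.

```python
import json


def resources_patch(container_index: int, cpu: str, memory: str) -> list[dict]:
    """Build a JSON Patch that sets the requests of one container."""
    base = f"/spec/template/spec/containers/{container_index}/resources"
    return [
        {
            # "add" on an object member creates it or replaces its current value.
            "op": "add",
            "path": f"{base}/requests",
            "value": {"cpu": cpu, "memory": memory},
        }
    ]


# Hypothetical recommendation for the first container of a workload.
patch = resources_patch(0, "400m", "1536Mi")
print(json.dumps(patch, indent=2))
```

Saved to a file, a patch of this shape could be applied with `kubectl patch deployment <name> --type=json --patch-file=<file>`. Note that it replaces the whole `requests` map, so any other requested resources (for example ephemeral-storage) would need to be carried over.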

Architecture

┌─────────────────────────────────────────────────────────┐
│                    kube-cost-lens CLI                    │
│                                                         │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────────┐  │
│  │   Scanner    │  │   Analyzer   │  │   Reporter     │  │
│  │             │  │              │  │                │  │
│  │ - K8s API   │  │ - Waste calc │  │ - Table        │  │
│  │ - Metrics   │  │ - Classify   │  │ - JSON/YAML    │  │
│  │ - Prometheus│  │ - Recommend  │  │ - CSV          │  │
│  │             │  │ - Score      │  │ - CI exit code │  │
│  └──────┬──────┘  └──────┬───────┘  └───────┬────────┘  │
│         │                │                   │           │
│         └────────────────┴───────────────────┘           │
│                          │                               │
└──────────────────────────┼───────────────────────────────┘
                           │
              ┌────────────┼────────────────┐
              │            │                │
              ▼            ▼                ▼
        ┌──────────┐ ┌──────────┐   ┌─────────────┐
        │  K8s API │ │ Metrics  │   │ Prometheus  │
        │          │ │   API    │   │  (optional) │
        └──────────┘ └──────────┘   └─────────────┘

Component Responsibilities

Scanner: Connects to the cluster and collects workload specs (requests, limits) and real-time metrics. Handles kubeconfig, context switching, and namespace filtering.
Analyzer: Compares declared resources against actual usage. Calculates waste scores, classifies workloads, and generates rightsizing recommendations using configurable strategies (average, P95, P99).
Reporter: Formats the analysis output. Supports multiple formats and handles CI/CD exit codes based on configurable thresholds.
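
As a rough illustration of the Reporter's role, the sketch below renders the same analysis rows as JSON or CSV. It uses standard-library dataclasses to stay self-contained, whereas the planned models are Pydantic; the `WorkloadReport` fields and the `render` function are assumptions.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class WorkloadReport:
    """Hypothetical row produced by the Analyzer."""
    namespace: str
    workload: str
    waste_percent: float
    status: str


def render(rows: list[WorkloadReport], fmt: str) -> str:
    """Serialize analysis rows in the requested machine-readable format."""
    records = [asdict(row) for row in rows]
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(WorkloadReport)])
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


rows = [WorkloadReport("production", "api-gateway", 72.0, "oversized")]
print(render(rows, "csv"))
```

Keeping the formatting behind a single function like this is one way to let the Scanner and Analyzer stay unaware of the output format.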

Tech Stack

Component	Technology	Rationale
Language	Python 3.11+	Fast prototyping, rich K8s client ecosystem
K8s client	kubernetes (official)	Stable, well-maintained, supports all auth methods
Prometheus client	prometheus-api-client	Lightweight, query-focused
CLI framework	Typer	Modern, type-hinted, auto-generated help
Terminal UI	Rich	Beautiful tables, progress bars, dashboards
Package manager	uv	Fast dependency resolution and virtualenv management
Build system	pyproject.toml (hatch/hatchling)	Modern Python packaging standard
Testing	pytest + pytest-mock	Industry standard, good K8s mocking support
Linting	Ruff	Fast, replaces flake8 + isort + black
Type checking	mypy	Catch bugs early

Planned CLI Interface

# Basic scan of current cluster context
kube-cost-lens scan

# Scan specific namespaces
kube-cost-lens scan --namespace production --namespace staging

# Exclude system namespaces
kube-cost-lens scan --exclude-namespace kube-system --exclude-namespace cert-manager

# Output as JSON for pipeline consumption
kube-cost-lens scan --output json > report.json

# Set custom cost rates
kube-cost-lens scan --cpu-cost-hour 0.032 --memory-cost-gb-hour 0.004

# Use a specific kubeconfig / context
kube-cost-lens scan --kubeconfig ~/.kube/prod-config --context prod-eu-west-1

# Historical analysis via Prometheus
kube-cost-lens scan --prometheus-url http://prometheus:9090 --window 30d

# CI mode: fail if cluster waste exceeds 40%
kube-cost-lens scan --ci --max-waste-percent 40

# Generate Kubernetes patches
kube-cost-lens recommend --output patches --target-dir ./patches/

# Cluster-level summary
kube-cost-lens summary
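
The flag handling for a few of the commands above, including the CI threshold check, could be prototyped as in the following sketch. It uses the standard library's argparse rather than Typer so that it runs without extra dependencies; the default output format and the `ci_exit_code` helper are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for a subset of the planned `scan` flags."""
    parser = argparse.ArgumentParser(prog="kube-cost-lens")
    commands = parser.add_subparsers(dest="command", required=True)
    scan = commands.add_parser("scan")
    # Repeatable flags collect their values into lists.
    scan.add_argument("--namespace", action="append", default=[])
    scan.add_argument("--exclude-namespace", action="append", default=[])
    scan.add_argument("--output", choices=["table", "json", "yaml", "csv"], default="table")
    scan.add_argument("--ci", action="store_true")
    scan.add_argument("--max-waste-percent", type=float, default=None)
    return parser


def ci_exit_code(args: argparse.Namespace, measured_waste_percent: float) -> int:
    """Return 1 only in CI mode when measured waste exceeds the threshold."""
    if args.ci and args.max_waste_percent is not None:
        return 1 if measured_waste_percent > args.max_waste_percent else 0
    return 0


args = build_parser().parse_args(
    ["scan", "--namespace", "production", "--ci", "--max-waste-percent", "40"]
)
print(ci_exit_code(args, 66.5))  # 1
```

A pipeline step would pass the returned value to `sys.exit`, so the job fails when waste is above the threshold.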

Example Output (Planned)

╭─────────────────────────────────────────────────────────────────╮
│                  kube-cost-lens · Cluster Report                │
│                  Context: prod-eu-west-1                        │
│                  Namespaces: 12 scanned                         │
╰─────────────────────────────────────────────────────────────────╯

 Namespace     Workload              CPU Req  CPU Used  Mem Req   Mem Used  Waste   Status
─────────────────────────────────────────────────────────────────────────────────────────────
 production    api-gateway           2000m    340m      4Gi       1.2Gi     72%     ⚠ oversized
 production    auth-service          1000m    780m      2Gi       1.8Gi     14%     ✓ right-sized
 production    payment-worker        500m     45m       1Gi       128Mi     91%     ✗ oversized
 staging       frontend              1000m    12m       2Gi       64Mi      97%     ✗ oversized
 monitoring    prometheus            4000m    1200m     8Gi       5.2Gi     42%     ⚠ oversized
 ...

 Cluster Summary
──────────────────────────────────────
  Total CPU requested:      24.5 cores
  Total CPU used:            8.2 cores    (33%)
  Total Memory requested:   48 Gi
  Total Memory used:        18.4 Gi      (38%)
  Estimated monthly waste:  $847.20

  3 workloads critically oversized (>80% waste)
  5 workloads moderately oversized (40-80% waste)
  4 workloads right-sized (<20% waste)
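
The report mixes units such as 2000m, 1.2Gi, and 128Mi, so quantities have to be normalized before they can be compared. The sketch below is a simplified parser covering the common suffixes, including the nanocore values ("n") that the Metrics API typically returns for CPU. The function name is hypothetical, and the P/E suffixes of the full Kubernetes quantity grammar are deliberately left out.

```python
from decimal import Decimal

# Multipliers for common Kubernetes quantity suffixes (subset only).
_SUFFIXES = {
    "Ki": Decimal(2**10), "Mi": Decimal(2**20), "Gi": Decimal(2**30), "Ti": Decimal(2**40),
    "k": Decimal(10**3), "M": Decimal(10**6), "G": Decimal(10**9), "T": Decimal(10**12),
    "m": Decimal("1e-3"), "u": Decimal("1e-6"), "n": Decimal("1e-9"),
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string to base units (cores for CPU, bytes for memory)."""
    # Try two-letter suffixes first so "Mi" is not mistaken for another unit.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


print(parse_quantity("2000m"))   # 2.000
print(parse_quantity("128Mi"))   # 134217728
used_fraction = parse_quantity("340m") / parse_quantity("2000m")
print(used_fraction)             # 0.17
```

Decimal is used instead of float so that fractional values such as 1.2Gi do not pick up binary rounding noise before percentages are computed.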

Project Structure (Planned)

kube-cost-lens/
├── src/
│   └── kube_cost_lens/
│       ├── __init__.py
│       ├── cli.py              # Typer CLI entrypoint and commands
│       ├── scanner/
│       │   ├── __init__.py
│       │   ├── kubernetes.py   # K8s API client, workload discovery
│       │   ├── metrics.py      # Metrics API integration
│       │   └── prometheus.py   # Prometheus/Thanos query client
│       ├── analyzer/
│       │   ├── __init__.py
│       │   ├── waste.py        # Waste calculation engine
│       │   ├── classifier.py   # Workload classification logic
│       │   └── recommender.py  # Rightsizing recommendation engine
│       ├── reporter/
│       │   ├── __init__.py
│       │   ├── table.py        # Rich terminal table output
│       │   ├── json.py         # JSON/YAML serialization
│       │   ├── csv.py          # CSV export
│       │   └── ci.py           # CI mode (exit codes, thresholds)
│       └── models.py           # Pydantic models for workloads, metrics, recommendations
├── tests/
│   ├── conftest.py
│   ├── fixtures/               # Sample K8s API responses
│   ├── test_scanner.py
│   ├── test_analyzer.py
│   └── test_reporter.py
├── examples/
│   ├── github-actions.yml      # Example CI workflow
│   └── sample-report.json      # Example output
├── pyproject.toml
├── LICENSE
└── README.md

Design Decisions

Why not just use VPA recommendations?

VPA (Vertical Pod Autoscaler) is great but has limitations:

Requires installation in the cluster (CRDs + controller).
Recommendations are per container of each targeted workload, not aggregated per namespace or cluster.
No cost estimation or waste scoring.
No CI/CD integration or threshold-based alerts.

kube-cost-lens complements VPA by providing a bird's-eye view with cost context, and can even compare its recommendations against VPA's.

Why read-only / zero-install?

Security-conscious teams (especially in regulated environments) resist installing agents in production clusters. A read-only CLI that runs from a developer's laptop or a CI pipeline removes that friction. The only requirement is a kubeconfig whose credentials grant get and list on the resources listed under RBAC Requirements.

Why Python over Go?

Faster iteration for a CLI-focused tool.
Excellent Kubernetes client library.
Rich ecosystem for terminal UI (Rich, Textual).
Lower barrier for contributions from platform/DevOps engineers.
Performance is not a bottleneck — the limiting factor is API call latency, not computation.

RBAC Requirements

The tool requires minimal read-only permissions. Example ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-cost-lens-reader
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces", "resourcequotas"]
    verbs: ["get", "list"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets", "daemonsets", "replicasets"]
    verbs: ["get", "list"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list"]
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods"]
    verbs: ["get", "list"]

Related & Prior Art

Tool	Comparison
Kubecost	Full platform, requires in-cluster install, commercial tiers
CAST AI	SaaS, agent-based, broader scope (autoscaling, spot)
kubectl top	Real-time only, no recommendations, no aggregation
Goldilocks	VPA-based, requires VPA install, namespace-scoped dashboard
Krr	Closest alternative — Prometheus-based, Python, good inspiration

kube-cost-lens differentiates by working without any in-cluster dependency, providing CI/CD integration, and generating ready-to-apply patches.

Contributing

Contributions are welcome. Please open an issue to discuss your idea before submitting a PR.

This project follows:

Conventional Commits for commit messages.
Trunk-based development with short-lived feature branches.
A pre-merge quality gate: all code must pass ruff check, ruff format --check, and mypy.

License

Apache License 2.0
