Skip to content

arki05/pve-storage-test-lab

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

52 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pve-storage-test-lab

Spin up a nested Proxmox VE cluster on a real PVE host, then run a pytest suite against pluggable storage backends (ZFS, LVM-thin, MooseFS, ...). Designed to exercise the operations PVE storage plugins are expected to handle correctly — snapshots, migration, resize, backup, concurrent ops — and to surface real bugs in plugin implementations.

The MooseFS run on this harness uncovered five real bugs in the upstream pve-moosefs plugin, all now fixed in arki05/pve-moosefs.

Origin

Driven by my (real) testing needs while developing the arki05/pve-moosefs fork; written largely by Claude (Anthropic) through pair programming. I set the direction, Claude produced most of the code, I caught it when it was wrong (often). Treat this as a working tool with rough edges, not a polished framework.

What it does

  1. Builds a node template (lab/template/build-template.sh) — uses Proxmox's official automated installer (proxmox-auto-install-assistant) to produce a real native PVE install on a Debian 13 / PVE 9 base. Cached, only rebuilt when the version marker changes.
  2. Provisions a cluster (lab/create.sh) — full-clones N nodes from the template, boots them on vmbr0, discovers IPs via the guest agent, deploys SSH keys, joins them into a PVE cluster.
  3. Sets up a storage profile (payloads/storage-test/profiles/<name>/setup.sh) — installs whatever the backend needs across all nodes and registers it as a PVE storage.
  4. Runs the test suite (payloads/storage-test/run.sh) — copies the pytest suite to node1, builds an inner test-guest template (small Debian cloud image with qemu-guest-agent + fio installed via cloud-init), then runs pytest against the configured storage.

Requirements

  • A real Proxmox VE 8.x or 9.x host with nested virt enabled
  • Enough free space and RAM (default: 3 nodes × 4 GB RAM × 50 GB disk)
  • Internet access (downloads PVE ISO, Debian cloud image)

Quick start

# Clone, build template, create lab
git clone https://github.com/arki05/pve-storage-test-lab.git
cd pve-storage-test-lab
./entrypoint.sh        # one-shot: template + lab + (optionally) tests

# Or step by step
bash lab/template/build-template.sh
bash lab/create.sh --nodes 3

# Run tests against a backend (ZFS as a sanity baseline)
bash payloads/storage-test/run.sh --storage-profile zfspool

# Tear down
bash lab/destroy.sh

Running against MooseFS (your fork or upstream)

# Upstream
bash payloads/storage-test/profiles/moosefs/setup.sh

# Your fork on a specific branch
export MOOSEFS_PLUGIN_FORK=https://github.com/arki05/pve-moosefs.git
export MOOSEFS_PLUGIN_BRANCH=main
bash payloads/storage-test/profiles/moosefs/setup.sh

# Then run tests
bash payloads/storage-test/run.sh --storage-profile moosefs

What the test suite covers

51 tests across 6 files:

  • test_vm_lifecycle.py — create/start/stop/destroy + agent + disk I/O for VMs and LXCs
  • test_data_integrity.py — data survives stop/start, snapshot rollback, fio verify
  • test_snapshot.py — create/rollback/delete, multiple snapshots, LXC snapshots
  • test_migration.py — offline + live migration, with data verification
  • test_operations.py — resize, backup/restore, storage move
  • test_composed.py — chained operations where bugs hide:
    • snapshot → resize → rollback (size + data both restored?)
    • migrate → snapshot → rollback (works on the new node?)
    • backup → resize → restore (right state?)
    • snapshot/backup of a running VM under fio (the gold-standard corruption test — fio writes pseudo-random data with crc32c verification; if the operation under test corrupts in-flight writes, fio's final verify pass fails)
    • online resize under fio
    • concurrent snapshots on two VMs
    • daemon resilience (pvestatd restart doesn't break storage access)

Tests are gated by per-storage capability flags (profiles/<name>/capabilities.env) — e.g., live-migration tests skip on storage without shared semantics.

Adding a storage profile

payloads/storage-test/profiles/<name>/
├── capabilities.env   # SUPPORTS_SNAPSHOTS, SHARED_STORAGE, SUPPORTS_LIVE_MIGRATION, ...
├── setup.sh           # install + register the storage on all nodes
└── teardown.sh        # remove it

Look at profiles/zfspool/ for the simplest example, profiles/moosefs/ for the most involved.

Test infrastructure

  • helpers/data_guard.pyDataGuard seeds files at known offsets with MD5 checksums; BackgroundFio runs fio --rw=randwrite --verify=crc32c so you can do an op with live I/O in flight and detect corruption.
  • helpers/ops.py — composable storage ops (snapshot_create, offline_migrate, resize_disk, backup, ...) that wait for PVE tasks and propagate errors.
  • helpers/guest.py — runs commands inside test VMs via qm guest exec (with proper SSH-fallback for cross-node and shlex.quote-correct argv handling).
  • conftest.pycreate_vm / create_lxc fixtures that clone from the inner test-guest template, with cleanup on teardown.

What's deliberately not here

  • Production-grade error handling in the lab provisioner. It does enough for repeatable test runs; if something gets stuck mid-flight you may have to clean up VMs/CTs manually.
  • CI integration. Designed to run on a homelab PVE host you have shell access to.
  • Coverage of every PVE storage feature. Focused on the ops most likely to bite real plugin implementations.

Known rough edges

  • Inner test-guest template build can take 10–15 min on first run (downloads cloud image, runs cloud-init through apt).
  • LXC fixture teardown fires pve.delete() but doesn't wait_for_task — destroys finish asynchronously, harmless but you'll see stale CTs in pct list mid-run.
  • Some failures are flaky on storage backends with fragile state (e.g., NBD device reuse on MooseFS). When that happens, the harness exposes the underlying plugin issue rather than papering over it.

License

MIT (see LICENSE).

About

Nested PVE cluster + pytest harness for stress-testing Proxmox storage plugins (ZFS, LVM-thin, MooseFS, ...). Originally built to validate the arki05/pve-moosefs fork.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors