DLPX-97279 cpu_online cron generates excessive syslog noise, reducing supportability #563
Merged
When a hypervisor hot-inserts a vCPU while the VM is running, the kernel fires an add event for the new CPU device. This rule writes 1 to the device's online sysfs attribute, replacing the per-minute cpu_online cron that was removed from dlpx-app-gate.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
david-mendez1
approved these changes
May 19, 2026
nealquigley
approved these changes
May 19, 2026
Problem
The `/opt/delphix/server/bin/cpu_online` cron fires unconditionally every minute on every
engine, writing a syslog entry even when all CPUs are already online.
At one entry per minute, this adds roughly 1,440 noise entries per engine per day, making
it harder to find relevant signals (surfaced during an ESCL-5998 investigation).
Solution
Add a udev rule (`/etc/udev/rules.d/70-cpu-online.rules`) shipped via the
delphix-platform package. The rule fires only when the kernel raises a cpu `add` event,
i.e., when a hypervisor actually hot-inserts a vCPU while the VM is running, and writes
`1` to the device's sysfs `online` attribute. This replaces the per-minute polling
behavior of the cpu_online cron.
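The contents of the rule file are not reproduced in this description. As a rough illustration only, a commonly used form of such a rule is sketched below; treat the exact match keys as an assumption, not as the text that this PR ships.

```text
# Illustrative sketch only; not necessarily the rule shipped in this PR.
# On a cpu "add" event, if the device exposes an "online" attribute that
# currently reads 0, write 1 to bring the CPU online.
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
```

Because the rule is event-driven, it does nothing (and logs nothing) unless a CPU device is actually added.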
Landing order: this PR should land before the companion app-gate PR for DLPX-97279,
which removes the cron. Landing this first ensures the udev rule is present before the
cron is removed, avoiding any window where hot-added CPUs would not be brought online.
Testing Done
Tested in two phases to verify that (1) the cpu_online mechanism is genuinely required for
hot-add functionality, and (2) the udev rule is a sufficient replacement.
Phase 1, control (failing test): Deployed only the companion app-gate change (cron and
script removed, no udev rule) to a fresh ESX-hosted DCenter VM (`dlpx-develop` group, engine
`2026.4.0.0-snapshot.20260518083647291`). Ran the CPU hotplug suite: `test_hot_add_cpu`
errored with `Number of detected CPUs by the OS was not 3 after 12 retries with polling interval of 10 seconds.`
This confirms that Linux does not automatically online hot-plugged CPUs; explicit action
is required, and the udev rule is a necessary part of this change.
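To check the same condition by hand on an engine, a small helper along these lines can list CPUs that are present but still offline. This is a hypothetical sketch, not part of the PR or the QA suite: the function name `list_offline_cpus` and its optional root argument are invented for illustration, and it assumes the standard sysfs layout under `/sys/devices/system/cpu`.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the names of CPUs whose
# sysfs "online" attribute reads 0. The sysfs root can be overridden so the
# function can be exercised against a fake directory tree.
list_offline_cpus() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/online; do
        # Skip the unexpanded glob when no CPU exposes an "online" attribute.
        [ -e "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            # Print the device name, e.g. "cpu2".
            basename "$(dirname "$f")"
        fi
    done
}
```

Calling `list_offline_cpus` with no argument after a hot-add should print nothing once the udev rule has run. A CPU that has no `online` attribute at all (often the boot CPU) is skipped.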
Phase 2, with udev rule (passing test): Deployed both the app-gate cron removal and
this udev rule to a fresh VM (engine `2026.4.0.0-snapshot.20260519100011891`). All 7
tests passed. `test_hot_add_cpu` went from ERROR to SUCCESS, confirming the udev rule is
a sufficient replacement.
The `platform.hypervisor.cpu.positive` and `platform.hypervisor.cpu.negative` QA suites
are sufficient to verify these changes do not regress CPU hotplug product functionality.