DLPX-96942 Replace ntpsec with chrony (appliance-build changes) #857

Merged
manoj-joseph merged 1 commit into develop from dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0
May 8, 2026

manoj-joseph commented Feb 18, 2026

Background

The Delphix appliance previously used ntpsec as its NTP daemon (and before that, the legacy ntp package). Both packages are being replaced with chrony (DLPX-96940/96941/96942), motivated by two factors:

  1. Security (DLPX-86999, Qualys QID 38293): ntpsec and ntp respond to NTP mode 6/7 queries, exposing sensitive system information (OS version, kernel, stratum, etc.) that can aid further attacks against the system. chrony does not support these query modes by default, eliminating the disclosure.
  2. Ubuntu 25.04 compatibility: The Delphix Engine is planned to move to Ubuntu 25.04 later this year, where chrony is the default NTP daemon. Making this switch now avoids duplicating the migration effort at that time.

Companion PRs

This PR is part of a set of four that must merge together:

Problem

This PR handles the upgrade path: when an appliance running ntpsec (or ntp) upgrades to a version with chrony, the existing NTP configuration must be migrated so that the user's configured NTP servers are preserved and chrony is started in the same enabled/disabled state as the outgoing NTP service.

Without this migration, an appliance that had NTP enabled would silently end up with chrony masked and no servers configured after upgrade.

Solution

The upgrade/upgrade-scripts/execute script now includes an NTP migration block that runs after the new packages are installed. It detects which of the two possible predecessor services was active and reads the corresponding config:

  • If ntpsec.service was enabled → reads from /etc/ntpsec/ntp.conf
  • If ntp.service was enabled → reads from /etc/ntp.conf and purges the ntp package (whose removal script would otherwise leave ntp.service in a masked state)
  • If neither was enabled → no migration; chrony.service stays masked (handled by fix_and_migrate_services)
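The detection order above can be sketched as a small shell function. This is an illustration only, not the code in upgrade/upgrade-scripts/execute: the helper names `service_enabled` and `ntp_source_config` are hypothetical, and the sketch assumes `systemctl is-enabled --quiet` is an acceptable way to test the unit state.

```shell
#!/bin/bash

# Hypothetical helper: succeeds when the given unit is enabled.
service_enabled() {
    systemctl is-enabled --quiet "$1" 2>/dev/null
}

# Prints the path of the predecessor config to migrate.
# Prints nothing when neither ntpsec nor ntp was enabled.
ntp_source_config() {
    if service_enabled ntpsec.service; then
        echo /etc/ntpsec/ntp.conf
    elif service_enabled ntp.service; then
        echo /etc/ntp.conf
    fi
}
```

An empty result corresponds to the "no migration" case, where chrony.service is left masked.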

When a source config is found, the script writes a new /etc/chrony/chrony.conf by extracting the relevant entries and translating them to chrony's format:

  • pool <server> iburst lines are preserved as-is (chrony supports the same pool directive)
  • multicastclient <addr> entries are converted to server <addr> iburst (the chrony equivalent)
  • ntpsec-specific directives (restrict, driftfile path) are dropped and replaced with chrony equivalents (driftfile /var/lib/chrony/chrony.drift, makestep 1.0 3, rtcsync)
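As a rough sketch of the translation rules above, the function below reads a source config and prints a chrony config on stdout. The function name `translate_ntp_conf` is hypothetical, and the output ordering (time sources first, then the fixed chrony directives) is an assumption; the actual script may lay the file out differently.

```shell
#!/bin/bash

# Hypothetical sketch: translate an ntp/ntpsec config ($1) to chrony format.
translate_ntp_conf() {
    local src="$1"

    # Keep "pool" lines verbatim and rewrite "multicastclient <addr>" as
    # "server <addr> iburst". Every other directive (restrict, driftfile,
    # tos, leapfile, ...) is dropped.
    awk '
        $1 == "pool" { print; next }
        $1 == "multicastclient" && NF >= 2 { print "server " $2 " iburst" }
    ' "$src"

    # Fixed chrony directives that replace the ntpsec-specific ones.
    printf '%s\n' \
        'driftfile /var/lib/chrony/chrony.drift' \
        'makestep 1.0 3' \
        'rtcsync'
}
```

Under these assumptions, `translate_ntp_conf /etc/ntpsec/ntp.conf > /etc/chrony/chrony.conf` would produce the new config file.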

Finally, chrony.service is unmasked and enabled.

upgrade/upgrade-scripts/common.sh (fix_and_migrate_services) is also updated to mask chrony.service (instead of the removed ntpsec.service and ntpsec-rotate-stats.timer) when those services are not-found or disabled, ensuring the default disabled state is preserved across not-in-place upgrades.
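The masking decision can be pictured as the predicate below. This is a hedged sketch rather than the code in common.sh: the function name `should_mask_chrony` is hypothetical, and it assumes the unit state arrives as a string like the ones `systemctl is-enabled` prints.

```shell
#!/bin/bash

# Hypothetical sketch: succeeds when chrony.service should be masked.
# $1 is the unit state string observed for the service (assumed format:
# the words printed by `systemctl is-enabled`, e.g. "disabled").
should_mask_chrony() {
    case "$1" in
        not-found | disabled) return 0 ;;
        *) return 1 ;;
    esac
}
```

A caller would then run something like `should_mask_chrony "$state" && systemctl mask chrony.service`, where `$state` is however the script determines the unit state.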

Note on config format: The Delphix appliance only ever writes two kinds of entries to the NTP config via bos_mgmt.sh in dlpx-app-gate: pool <server> iburst and multicastclient <addr>. This has been true for both the ntp and ntpsec eras, so the migration covers all configurations that can exist on a Delphix-managed appliance.

The remaining directives present in a typical Ubuntu ntpsec installation are package defaults, not appliance or user configuration. The table below documents each one and how chrony handles it, verified against man chrony.conf (chrony 4.5) and the Ubuntu default chrony.conf from the package:

| ntpsec directive | Source | Chrony handling |
| --- | --- | --- |
| driftfile /var/lib/ntpsec/ntp.drift | Ubuntu default | Replaced with driftfile /var/lib/chrony/chrony.drift (written explicitly by the migration script, matching the Ubuntu default chrony path) |
| leapfile /usr/share/zoneinfo/leap-seconds.list | Ubuntu default | Chrony uses leapsectz right/UTC instead (reads leap seconds from the system timezone database). The migration script does not add this directive; for a pure NTP client, leap seconds are handled automatically via NTP server announcements. |
| tos maxclock 11 | Ubuntu default | No equivalent; chrony selects the best sources automatically using its own algorithm |
| tos minclock 4 minsane 3 | Ubuntu default | No equivalent; chrony's source selection handles minimum source requirements natively |
| server ntp.ubuntu.com | Ubuntu default fallback | Not added; Ubuntu's default chrony.conf uses pool ntp.ubuntu.com iburst maxsources 4 instead of a bare server line. Since the migration writes a new chrony.conf from the pool entries in the original ntpsec config (which include the four N.ubuntu.pool.ntp.org entries), coverage is equivalent. |
| restrict default kod nomodify nopeer noquery limited | Ubuntu default | Not needed; per man chrony.conf: "The default is that no clients are allowed access, i.e. chronyd operates purely as an NTP client." |
| restrict 127.0.0.1 / restrict ::1 | Ubuntu default | Not needed; chrony binds its command sockets to loopback by default, blocking all access except from localhost |

Testing Done

Migration logic (script-level test):

The NTP migration block was extracted and tested in isolation against simulated input configs, covering all meaningful scenarios:

  • NTP disabled: migration block is skipped; no chrony.conf changes (chrony stays masked via fix_and_migrate_services)
  • ntpsec with pool servers: pool lines preserved verbatim, restrict and ntpsec-specific directives dropped, chrony directives (driftfile, makestep, rtcsync) added correctly
  • ntpsec with multicastclient: multicastclient <addr> correctly converted to server <addr> iburst
  • Mixed pool + multicast: both handled correctly in the same config

All 14 test assertions passed.

CI builds

| Build | Description | Status |
| --- | --- | --- |
| #13927 | Full build + upgrade from 2025.3.0.1 | UNSTABLE (dx-integration-tests: NtpServerMonitoringTest flakiness; fixed, see the second set of runs, #13938 to #13941) |
| #13928 | Full build + upgrade from 2025.6.0.0 | SUCCESS |
| #13929 | Full build + upgrade from 2026.2.0.0 | UNSTABLE (upgrade-testing: pre-existing failure unrelated to this change) |
| #13930 | Full build | UNSTABLE (upgrade-testing: pre-existing failure unrelated to this change) |
| #13941 | Full build + upgrade from 2025.3.0.1 | UNSTABLE (upgrade-testing: pre-existing failure unrelated to this change) |
| #13938 | Full build + upgrade from 2025.6.0.0 | SUCCESS |
| #13940 | Full build + upgrade from 2026.2.0.0 | UNSTABLE (upgrade-testing: pre-existing failure unrelated to this change) |
| #13939 | Full build | FAILURE (upgrade-testing: pre-existing failure unrelated to this change) |

Note: The git workflow runs with only this repo's changes, without the companion changes in dlpx-app-gate#4244 and delphix-platform#556. The appliance-build-orchestrator-pre-push runs pull in all three companion PRs together, so their results reflect the full end-to-end change.

Upgrade testing:

End-to-end upgrade tests were run against a VM running 2026.4.0.0-snapshot with ntpsec configured, upgrading to the build from #13897. All three migration scenarios passed on blackbox-self-service:

| Scenario | Result |
| --- | --- |
| ntpsec with pool servers (#215235) | SUCCESS |
| ntpsec with multicastclient (#215236) | SUCCESS |
| NTP disabled (#215237) | SUCCESS |

Long-path upgrade test (DLPX-95255):

Full upgrade path 21.0.0.0 → 2025.2.0.1 → 2026.4.0.0, exercising migration from the legacy ntp package:

| Build | NTP state | Result |
| --- | --- | --- |
| #9260 | enabled (ntp.service) | FAILURE (wrong qa_branch, dlpx-2025.2.0.1; fixed in #9262) |
| #9262 | enabled (ntp.service) | SUCCESS |

The NTP-never-enabled path (ntp.service masked since 21.0; chrony.service should stay masked post-upgrade) was tested manually with AI assist (DLPX-95255). The upgrade was performed on mj-ntp-off-21000.dlpxdc.co (21.0.0.0 → 2025.2.0.1 → 2026.4.0.0 pre-push #6498). Post-upgrade state confirmed: chrony installed, chrony.service masked, and GET /service/time returning ntpConfig.enabled: false.

manoj-joseph force-pushed the dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0 branch from 4f8e032 to 578a639, February 18, 2026 23:55
manoj-joseph changed the title from "Replace ntpsec with systemd-timesyncd" to "DLPX-96942 Replace ntpsec with chrony (appliance-build changes)", Apr 13, 2026
manoj-joseph force-pushed the dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0 branch 4 times, most recently from 21985e3 to 0445988, April 14, 2026 23:48
manoj-joseph force-pushed the dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0 branch from 0445988 to 7bcb7e4, April 29, 2026 14:18
manoj-joseph requested a review from Copilot, May 1, 2026 06:17

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

manoj-joseph marked this pull request as ready for review, May 1, 2026 08:01
manoj-joseph force-pushed the dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0 branch 2 times, most recently from d2d2796 to 870f6e1, May 4, 2026 22:30

prakashsurya left a comment

We hit a few ESCLs due to the migration logic for ntp to ntpsec. I don't mean to imply there's anything wrong with the logic here, but I just want to emphasize that we should be a bit careful with it, since I think that's often the more difficult part to get right.

I see there's logic to handle both ntp and ntpsec migrations, so I think we're good, but I just wanted to mention it, since I got this subtly wrong last time.

manoj-joseph force-pushed the dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0 branch from 870f6e1 to 4304305, May 8, 2026 18:00
manoj-joseph enabled auto-merge (squash), May 8, 2026 18:00
manoj-joseph merged commit 3ae2aa3 into develop, May 8, 2026
3 of 4 checks passed
manoj-joseph deleted the dlpx/pr/manoj-joseph/ff7c6166-3554-4aab-b2a2-04ea84a3a6b0 branch, May 8, 2026 18:01