
Fix multi-ASIC package feature management #4445

Open
william8545 wants to merge 2 commits into sonic-net:master from william8545:multi_asic_feature_sync_internal_public
@william8545 william8545 commented Apr 10, 2026

Contributor

What I did

Fixed two issues with dynamically installed packages on multi-ASIC platforms:

  1. sonic-package-manager install now writes complete FEATURE entries to all per-ASIC namespace CONFIG_DBs, not just the host CONFIG_DB.
  2. config feature state and config feature autorestart no longer crash with KeyError when per-ASIC FEATURE entries are incomplete.

How I did it

sonic_package_manager/service_creator/sonic_db.py: Added get_namespace_db_connectors() to SonicDB that returns per-ASIC namespace CONFIG_DB connectors on multi-ASIC platforms. get_connectors() now yields these after the host connector, so FeatureRegistry.register() and deregister() write to all CONFIG_DBs.
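
To illustrate the connector ordering described above, here is a minimal, self-contained sketch. `FakeConnector`, `SonicDBSketch`, and `register_feature` are hypothetical stand-ins invented for this example; they are not the real `SonicDB` class, `FeatureRegistry`, or `swsscommon` connectors, and only the host connector plus per-namespace connectors are modelled.

```python
from typing import Dict, Iterator, List


class FakeConnector:
    """Hypothetical stand-in for a CONFIG_DB connector (not the swsscommon API)."""

    def __init__(self, namespace: str = "") -> None:
        self.namespace = namespace  # '' represents the host
        self.tables: Dict[str, Dict[str, dict]] = {}

    def set_entry(self, table: str, key: str, data: dict) -> None:
        self.tables.setdefault(table, {})[key] = dict(data)


class SonicDBSketch:
    """Illustrative only: yields the host connector, then one per ASIC namespace."""

    def __init__(self, namespaces: List[str]) -> None:
        self._host = FakeConnector()
        self._namespace_conns = [FakeConnector(ns) for ns in namespaces]

    def get_connectors(self) -> Iterator[FakeConnector]:
        yield self._host
        yield from self._namespace_conns


def register_feature(db: SonicDBSketch, name: str, entry: dict) -> None:
    # Writing through every connector keeps host and namespace rows identical.
    for conn in db.get_connectors():
        conn.set_entry("FEATURE", name, entry)


db = SonicDBSketch(["asic0", "asic1"])
register_feature(db, "xxx", {"state": "disabled", "auto_restart": "enabled"})
written = [(c.namespace, c.tables["FEATURE"]["xxx"]) for c in db.get_connectors()]
```

Because the writer simply loops over whatever `get_connectors()` yields, a single-ASIC system (an empty namespace list in this sketch) behaves exactly as before.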

config/feature.py: Changed entry_data['state'] to entry_data.get('state') in set_feature_state(), and entry_data['auto_restart'] to entry_data.get('auto_restart') in feature_autorestart(). Namespace entries missing the field are skipped during validation. Added a guard for the case where no namespace has the field at all. Fixed the always_enabled check to use the collected set instead of relying on the last loop variable.
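
The validation behavior described above can be sketched roughly as follows. This is not the code in `config/feature.py`: the function name `validate_feature_field`, the plain-dict input, and the error messages are assumptions made for illustration. It shows `.get()` in place of indexing, skipping namespaces that lack the field, the guard for the field being absent everywhere, and the `always_enabled` check against the collected set.

```python
from typing import Dict, Optional


def validate_feature_field(
    entries_by_ns: Dict[str, Optional[dict]], name: str, field: str
) -> str:
    """Return the single agreed value of `field`, or raise ValueError."""
    values = set()
    for ns, entry in entries_by_ns.items():
        if not entry:
            raise ValueError(f"Feature '{name}' doesn't exist in namespace '{ns}'")
        value = entry.get(field)  # .get() avoids KeyError on incomplete rows
        if value is None:
            continue  # skip namespaces that lack the field
        values.add(value)

    if not values:
        # Guard: no namespace carries the field at all.
        raise ValueError(f"Feature '{name}' has no '{field}' in any namespace")
    if len(values) > 1:
        raise ValueError(
            f"Feature '{name}' has inconsistent '{field}': {sorted(values)}"
        )
    # Check the collected set, not whichever entry the loop happened to see last.
    if "always_enabled" in values:
        raise ValueError(f"Feature '{name}' {field} is always_enabled")
    return next(iter(values))


entries = {
    "": {"state": "disabled", "auto_restart": "enabled"},
    "asic0": {"auto_restart": "enabled"},  # incomplete row: no 'state'
}
current_state = validate_feature_field(entries, "xxx", "state")
```

With the indexing form (`entry['state']`), the incomplete `asic0` row in this example would raise `KeyError` before any consistency check ran.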

How to verify it

On a multi-ASIC system:

sudo sonic-package-manager install xxx

# Verify per-ASIC entries are complete
sonic-db-cli CONFIG_DB HGETALL "FEATURE|xxx"
sonic-db-cli -n asic0 CONFIG_DB HGETALL "FEATURE|xxx"
# Both should show all fields 

# Verify CLI commands work
sudo config feature state xxx enabled
sudo config feature autorestart xxx disabled

# Clean up
sudo sonic-package-manager uninstall xxx -y

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@william8545 william8545 marked this pull request as draft April 10, 2026 18:50
Signed-off-by: William Tsai <willtsai@nvidia.com>
@william8545 william8545 force-pushed the multi_asic_feature_sync_internal_public branch from c0d677a to 6b7d048 Compare April 10, 2026 19:50
@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@william8545 william8545 marked this pull request as ready for review April 10, 2026 22:11
@Sourabh-Kumar7

Member

@lguohan @abdosi can you pls help review this?

@securely1g securely1g left a comment

Contributor

Review on behalf of @lguohan:

The fix for multi-ASIC package feature management — please provide:

  1. What specific failure does this address? (The PR description is truncated)
  2. How does this interact with config feature commands on multi-ASIC platforms?
  3. Are there sonic-mgmt tests that validate multi-ASIC feature enable/disable?

@william8545

Contributor Author

> Review on behalf of @lguohan:
>
> The fix for multi-ASIC package feature management - please provide:
>
>   1. What specific failure does this address? (The PR description is truncated)
>   2. How does this interact with config feature commands on multi-ASIC platforms?
>   3. Are there sonic-mgmt tests that validate multi-ASIC feature enable/disable?

  1. Two CLI- and package-manager-side failures after installing a new package:
     • config feature state <pkg> enabled crashes or no-ops: the command prints status, but the feature is not enabled.
     • Host vs. ASIC CONFIG_DB divergence after install / uninstall. sonic-package-manager install only populated the host's FEATURE table, so the ASIC namespaces never received the row (autorestart and state included). sonic-package-manager uninstall likewise removed only the host row, leaving stale entries in the ASIC namespaces.
  2. config feature state / config feature autorestart iterate db.cfgdb_clients, which utilities_common/db.py::Db builds as {'': host_cfgdb, 'asic0': ..., 'asic1': ..., ...} whenever multi_asic.is_multi_asic() is true. The command (a) validates that the entry exists, (b) checks that the value is consistent across all namespaces, and (c) writes the new value to every namespace.
  3. There are no tests in sonic-mgmt that exercise sonic-package-manager install/uninstall.
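
The host vs. ASIC divergence described in this reply can be pictured with a small check over per-namespace FEATURE rows. This is only a sketch: `find_divergence` is a hypothetical helper, not part of sonic-utilities, and it takes plain dicts keyed the same way as the `{'': host, 'asic0': ...}` layout mentioned above rather than real CONFIG_DB clients.

```python
from typing import Dict, List


def find_divergence(rows_by_ns: Dict[str, dict]) -> Dict[str, List[str]]:
    """Map each ASIC namespace to the FEATURE fields that differ from the host.

    The '' key is the host row. Namespaces that match the host are omitted.
    """
    host_row = rows_by_ns.get("", {})
    report: Dict[str, List[str]] = {}
    for ns, row in rows_by_ns.items():
        if ns == "":
            continue
        # Fields the host has that this namespace lacks or disagrees on.
        differing = sorted(f for f, v in host_row.items() if row.get(f) != v)
        # Fields left behind in the namespace that the host no longer has.
        stale = sorted(f for f in row if f not in host_row)
        if differing or stale:
            report[ns] = differing + stale
    return report


# After an install that only wrote the host row:
after_install = find_divergence({
    "": {"state": "enabled", "auto_restart": "enabled"},
    "asic0": {},
    "asic1": {"state": "enabled", "auto_restart": "enabled"},
})

# After an uninstall that only removed the host row:
after_uninstall = find_divergence({"": {}, "asic0": {"state": "enabled"}})
```

In this sketch, `after_install` reports `asic0` as missing both fields, and `after_uninstall` reports the stale `state` field left in `asic0`.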

@Sourabh-Kumar7

Member

@lguohan @securely1g could you please re-review and approve the changes? Thanks

Copilot AI left a comment

Contributor

Pull request overview

This PR fixes multi-ASIC handling for dynamically installed package features by ensuring FEATURE table updates are applied across all per-ASIC namespace CONFIG_DB instances, and by hardening config feature commands against incomplete per-namespace FEATURE entries.

Changes:

  • Extend SonicDB.get_connectors() to include per-ASIC namespace CONFIG_DB connectors on multi-ASIC systems.
  • Make config feature state and config feature autorestart tolerant of missing state / auto_restart fields across namespaces (avoids KeyError).
  • Add targeted unit tests for the new SonicDB namespace connector behavior and the CLI/feature validation behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/sonic_package_manager/test_sonic_db.py | Adds unit tests for per-ASIC namespace connector discovery and connector enumeration behavior. |
| tests/feature_test.py | Adds tests to cover missing state / auto_restart fields and expected CLI behavior. |
| sonic_package_manager/service_creator/sonic_db.py | Adds per-ASIC namespace CONFIG_DB connector support and extends connector iteration to cover namespaces. |
| config/feature.py | Avoids crashing on incomplete FEATURE entries by using .get() and adds guards when fields are missing everywhere. |

Comment on lines +123 to +136
        if cls._namespace_db_conns is None:
            cls._namespace_db_conns = []
            if device_info.is_multi_npu():
                from swsscommon.swsscommon import SonicDBConfig
                SonicDBConfig.initializeGlobalConfig()
                for ns in device_info.get_namespaces():
                    try:
                        conn = swsscommon.ConfigDBConnector(namespace=ns)
                        conn.connect()
                        cls._namespace_db_conns.append(conn)
                    except RuntimeError:
                        pass

        return cls._namespace_db_conns

@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Address the Copilot review on get_namespace_db_connectors():

- Fail fast on a namespace connect error instead of swallowing
  RuntimeError and returning a partial connector set; a missing
  namespace CONFIG_DB is a real fault, not something to silently skip.
- Cache only after every namespace connects (local accumulator assigned
  to _namespace_db_conns at the end), so a failed run leaves the cache
  unset and the next call retries rather than caching a partial list.
- Guard the global DB init with isGlobalInit() and call it through the
  module-level swsscommon import, matching the pattern used in
  utilities_common/db.py; drop the in-function lazy import.

Update tests to cover fail-fast propagation, no-cache-on-failure retry,
both isGlobalInit branches, and single/multi-ASIC caching.

Signed-off-by: William Tsai <willtsai@nvidia.com>
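
The fail-fast and cache-only-on-success behavior described in this commit message can be sketched as below. This is an illustration under stated assumptions, not the code in `sonic_db.py`: `NamespaceConnCache` and `flaky_connect` are hypothetical, and the connect factory stands in for creating and connecting a real namespace connector.

```python
from typing import Callable, List, Optional


class NamespaceConnCache:
    """Illustrative cache: populated only when every namespace connects."""

    def __init__(self, namespaces: List[str], connect: Callable[[str], object]) -> None:
        self._namespaces = namespaces
        self._connect = connect  # hypothetical factory; raises RuntimeError on failure
        self._conns: Optional[List[object]] = None

    def get(self) -> List[object]:
        if self._conns is None:
            conns = []  # local accumulator, not the cache itself
            for ns in self._namespaces:
                # No try/except here: a RuntimeError propagates (fail fast).
                conns.append(self._connect(ns))
            self._conns = conns  # assigned only after all namespaces succeeded
        return self._conns


attempts: List[str] = []


def flaky_connect(ns: str) -> str:
    """Fails for asic1 on its first attempt, succeeds afterwards."""
    attempts.append(ns)
    if ns == "asic1" and attempts.count("asic1") == 1:
        raise RuntimeError(f"cannot connect to {ns}")
    return f"conn:{ns}"


cache = NamespaceConnCache(["asic0", "asic1"], flaky_connect)
try:
    cache.get()
    first_call_failed = False
except RuntimeError:
    first_call_failed = True  # nothing was cached, so the next call retries

conns = cache.get()  # second call reconnects every namespace and caches the list
```

Appending directly to the cached list, as in the reviewed snippet, would leave a partial list behind after a failure; the local accumulator avoids that.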
@william8545 william8545 force-pushed the multi_asic_feature_sync_internal_public branch from d14ff99 to 9edd74b Compare June 12, 2026 00:02
@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

6 participants