Skip to content

[dhcp4relay] Recognize VxLAN tunnel interfaces via CONFIG_DB#109

Open
cshivashgit wants to merge 6 commits into
sonic-net:masterfrom
cshivashgit:dhcp4relay-vxlan-tunnel-support
Open

[dhcp4relay] Recognize VxLAN tunnel interfaces via CONFIG_DB#109
cshivashgit wants to merge 6 commits into
sonic-net:masterfrom
cshivashgit:dhcp4relay-vxlan-tunnel-support

Conversation

@cshivashgit

Copy link
Copy Markdown
Contributor

When a DHCP BOOTREQUEST arrives on a VxLAN tunnel interface whose name does not begin with the literal prefix "VXLAN" (for example a "Vtep-*" tunnel), the existing intf.rfind("VXLAN", 0) check in pkt_in_callback() silently drops the packet and the client never gets an address.

Replace the prefix check with is_vxlan_interface(), backed by an unordered_set populated from CONFIG_DB:VXLAN_TUNNEL. The set is primed once during DHCPMgr startup via swss::Table::getKeys() and is kept in sync at runtime through a SubscriberStateTable on VXLAN_TUNNEL whose events are dispatched to process_vxlan_tunnel_notification(). The lookup strips any "-" suffix, so "VXLAN-101" / "Vtep-200" forms map back to the tunnel name configured in CONFIG_DB.

Priming on startup closes the early-startup window in which tunnels already configured in CONFIG_DB would otherwise not be recognized until the SubscriberStateTable delivers their first events.

When a DHCP BOOTREQUEST arrives on a VxLAN tunnel interface whose name
does not begin with the literal prefix "VXLAN" (for example a "Vtep-*"
tunnel), the existing intf.rfind("VXLAN", 0) check in pkt_in_callback()
silently drops the packet and the client never gets an address.

Replace the prefix check with is_vxlan_interface(), backed by an
unordered_set populated from CONFIG_DB:VXLAN_TUNNEL. The set is primed
once during DHCPMgr startup via swss::Table::getKeys() and is kept in
sync at runtime through a SubscriberStateTable on VXLAN_TUNNEL whose
events are dispatched to process_vxlan_tunnel_notification(). The
lookup strips any "<name>-<vlan>" suffix, so "VXLAN-101" / "Vtep-200"
forms map back to the tunnel name configured in CONFIG_DB.

Priming on startup closes the early-startup window in which tunnels
already configured in CONFIG_DB would otherwise not be recognized
until the SubscriberStateTable delivers their first events.

Signed-off-by: Shivashankar CR <shivashankar.c.r@gmail.com>
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@cshivashgit

Copy link
Copy Markdown
Contributor Author

@Xichen96 please review

CC: @ashutosh-agrawal

@ashutosh-agrawal ashutosh-agrawal requested a review from Xichen96 May 7, 2026 16:17

@Xichen96 Xichen96 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_vxlan_interface() has two failure modes worth addressing before merge.

Failure 1 — silent drop for hyphenated tunnel names. VXLAN_TUNNEL.name in YANG (sonic-vxlan.yang) is unconstrained string. A tunnel named vtep-1 lands in the set as vtep-1, but a packet on vtep-1-200 calls find('-') → strips to vtep → MISS → packet dropped silently. Same bug class this PR is fixing.

Failure 2 — false positive for prefix-sharing names. Any non-VXLAN interface whose name strips to a tunnel name passes the check. Set has VXLAN; a bridge / docker netdev named VXLAN-Mgmt strips to VXLAN → HIT → packet dispatched via the VXLAN path it shouldn't be on.

Failure 1 is solvable cheaply (try-full-name first, then rfind('-') instead of find('-')). Failure 2 isn't — robustly identifying a VXLAN netdev needs the actual mapping from VXLAN_TUNNEL_MAP (key is <tunnel>|<mapname>, netdev is <tunnel>-<mapname>). Either subscribe to VXLAN_TUNNEL_MAP and store full expected netdev names in the set, or be explicit in code that we accept the false-positive risk.

Tests required. dhcp4relay/test/ has the mock infra (mock_table.cpp, mock_consumerstatetable.cpp). Please add at minimum:

  • is_vxlan_interface("VXLAN") after primed set → true
  • is_vxlan_interface("VXLAN-101") after primed set → true
  • is_vxlan_interface("Ethernet0") → false
  • is_vxlan_interface("vtep-1-200") with set = {vtep-1} → true (covers Failure 1)
  • is_vxlan_interface("VXLAN-Mgmt") with set = {VXLAN} → false (covers Failure 2)
  • process_vxlan_tunnel_notification SET / DEL roundtrip
  • Primer: pre-existing VXLAN_TUNNEL keys appear in set after SYNC_BARRIER

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

The original VxLAN tunnel-cache plumbing in this PR landed before
dhcp4relay's logger migration (sonic-net#110) and still emitted seven raw
syslog() calls. Now that the rest of dhcp4relay routes through
swss::Logger and is controllable at runtime via swssloglevel, the
VxLAN paths must follow the same convention so its log lines can be
silenced under sustained DHCP traffic without rebuilding the daemon.

Replace those calls with the equivalent SWSS_LOG_* macros, matching
the mapping used in sonic-net#110:
Trailing "\n" is dropped from the format strings since SWSS_LOG
appends its own newline.

No functional change.

Signed-off-by: Shivashankar CR <shivashankar.c.r@gmail.com>
… match

Addresses review feedback on PR sonic-net#109. The original PR cached
VXLAN_TUNNEL keys ("Vtep-1") and stripped any '-' suffix from the
incoming interface name before lookup, which had two failure modes:

  Failure 1 - silent drop for hyphenated tunnel names. A tunnel
  configured as "Vtep-1-100" produces kernel netdev "Vtep-1-100-100";
  the find('-')-based parser stripped that to "Vtep" and missed the
  set, dropping the BOOTREQUEST without any clue why.

  Failure 2 - false positive for prefix-sharing names. The parser
  accepted any non-vxlan netdev whose strip-on-'-' prefix happened
  to land on a configured tunnel name (e.g. "VXLAN-Mgmt" -> "VXLAN"
  hit), funneling unrelated traffic through the vxlan path.

Both are eliminated by storing the full kernel netdev name verbatim
and doing a single unordered_set::count() per packet -- no string
parsing, no prefix games. The kernel netdev name follows the same
"<tunnel_name>-<vlan_numeric_id>" convention used by vxlanmgr in
sonic-swss, derived from VXLAN_TUNNEL_MAP rows.

VXLAN_TUNNEL is intentionally not subscribed anymore: a tunnel
without a corresponding TUNNEL_MAP entry has no kernel netdev, so
DHCP packets cannot arrive on it. Subscribing only to
VXLAN_TUNNEL_MAP is semantically tighter and immune to both failure
modes by construction.

Changes
-------
- dhcp4relay.h: vxlan_tunnel_config now carries the full netdev name
  (renamed field tunnel_name -> netdev_name).
- dhcp4relay.cpp: rename vxlan_tunnel_set -> vxlan_netdev_set to
  reflect what it actually stores; rewrite is_vxlan_interface() to a
  one-line exact lookup; update the apply_config_event() VxLAN branch
  to insert/erase the netdev name verbatim.
- dhcp4relay_mgr.h: add private vxlan_tunnel_map_netdev_cache_ keyed
  by "<tunnel>|<mapname>". Needed for DEL because SubscriberStateTable
  delivers DEL events with empty FieldValuesTuple (row already gone
  from CONFIG_DB), so we cannot recompute the netdev name from the
  notification payload alone. Renames the processor to
  process_vxlan_tunnel_map_notification().
- dhcp4relay_mgr.cpp: replace the VXLAN_TUNNEL subscription with
  VXLAN_TUNNEL_MAP. The hand-rolled getKeys()-based primer is
  replaced with a SubscriberStateTable::pops() snapshot fed through
  the same processor used at runtime, matching the existing
  DHCPV4_RELAY priming pattern. The processor performs strict
  validation (key has '|' delimiter, vlan field present, "Vlan"
  prefix, numeric tail); any malformed row is skipped at INFO with
  the offending key logged.
- test/mock_relay.{h,cpp}: expose vxlan_netdev_set and
  is_vxlan_interface() to the harness. Cover all reviewer-requested
  cases:
    * exact-match + ethernet/docker negative cases
    * Failure 1 reproducer (Vtep-1-100-100)
    * Failure 2 prefix-sharing rejections (VXLAN-Mgmt,
      Vtep-1-100-eth0, VXLAN-100-eth0)
    * empty-set boundary
    * VXLAN_TUNNEL_UPDATE SET/DEL roundtrip via config_event_callback
    * process_vxlan_tunnel_map_notification SET/DEL roundtrip
      including DEL via the mgr-side cache lookup
    * stale DEL with no cached netdev is dropped (Times(0) on write)
    * malformed-entry rejection branches (no vlan, bad vlan, no
      "Vlan" prefix, no '|' in key)
    * primer wiring: rows pre-existing in CONFIG_DB reach
      vxlan_netdev_set through the same SubscriberStateTable::pops()
      -> processor -> applier path used at runtime, for both
      hyphenated (Vtep-1-100-100) and plain (vtep1-200) tunnel names

No functional change for non-vxlan packet paths.

Signed-off-by: Shivashankar CR <shivashankar.c.r@gmail.com>
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@cshivashgit

Copy link
Copy Markdown
Contributor Author

is_vxlan_interface() has two failure modes worth addressing before merge.

Failure 1 — silent drop for hyphenated tunnel names. VXLAN_TUNNEL.name in YANG (sonic-vxlan.yang) is unconstrained string. A tunnel named vtep-1 lands in the set as vtep-1, but a packet on vtep-1-200 calls find('-') → strips to vtep → MISS → packet dropped silently. Same bug class this PR is fixing.

Failure 2 — false positive for prefix-sharing names. Any non-VXLAN interface whose name strips to a tunnel name passes the check. Set has VXLAN; a bridge / docker netdev named VXLAN-Mgmt strips to VXLAN → HIT → packet dispatched via the VXLAN path it shouldn't be on.

Failure 1 is solvable cheaply (try-full-name first, then rfind('-') instead of find('-')). Failure 2 isn't — robustly identifying a VXLAN netdev needs the actual mapping from VXLAN_TUNNEL_MAP (key is <tunnel>|<mapname>, netdev is <tunnel>-<mapname>). Either subscribe to VXLAN_TUNNEL_MAP and store full expected netdev names in the set, or be explicit in code that we accept the false-positive risk.

Tests required. dhcp4relay/test/ has the mock infra (mock_table.cpp, mock_consumerstatetable.cpp). Please add at minimum:

  • is_vxlan_interface("VXLAN") after primed set → true
  • is_vxlan_interface("VXLAN-101") after primed set → true
  • is_vxlan_interface("Ethernet0") → false
  • is_vxlan_interface("vtep-1-200") with set = {vtep-1} → true (covers Failure 1)
  • is_vxlan_interface("VXLAN-Mgmt") with set = {VXLAN} → false (covers Failure 2)
  • process_vxlan_tunnel_notification SET / DEL roundtrip
  • Primer: pre-existing VXLAN_TUNNEL keys appear in set after SYNC_BARRIER

@Xichen96 Addressed

@cshivashgit

Copy link
Copy Markdown
Contributor Author

@Xichen96 please review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants