[vnetorch] VNET_ROUTE_TUNNEL_TABLE with monitoring=custom_bfd and directly-connected local-DPU primary fails to install #4593

@zjswhhh

Bug summary

VNET routes with monitoring=custom_bfd fail to install in the NPU when the primary endpoint is a directly-connected local DPU IP. The route stays inactive in STATE_DB even though the BFD session is Up and the kernel has a valid ARP entry for the DPU on the VLAN interface.

In SmartSwitch HA deployments where hamgrd programs the primary as the local-DPU IP, inbound VIP traffic falls back to the secondary (remote NH via VxLAN tunnel) or fails entirely.

Symptom

$ redis-cli -n 6 hgetall 'VNET_ROUTE_TUNNEL_TABLE|Vnet-default|<prefix>'
'active_endpoints'  -> ''
'state'             -> 'inactive'
2026 May 21 05:58:53.784109 NOTICE swss#orchagent: :- createNextHopGroup: Next hop 100.117.156.35@ not found in neighorch, skipping.
2026 May 21 05:58:53.784109 WARNING swss#orchagent: :- updateVnetTunnelCustomMonitor: Failed to create primary based custom next hop group. Cannot proceed.

Root cause

NextHopGroupKey for VNET routes is built from VNET_ROUTE_TUNNEL_TABLE.endpoint, which only carries IPs. For a directly-connected local endpoint the NextHopKey ends up with an empty interface alias (<IP>@). NeighOrch::hasNextHop then misses the stored entry <IP>@<intf> (e.g. <IP>@Vlan32), and createNextHopGroup returns false.

The Down path (NeighOrch::updateNextHop from BfdUpdate) already resolves the interface correctly (setNextHopFlag on <IP> seen on port <intf>). The Up / create path just needs to do the same.
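
The mismatch and the proposed resolution can be modelled in a few lines. This is a minimal Python sketch, not the orchagent C++ code: `NextHopKey`, `NeighborTable`, `has_next_hop` and `resolve_alias` are hypothetical stand-ins, and the `10.0.0.1` / `Vlan32` values are illustrative. It assumes the neighbor store is keyed by the full IP and interface alias pair, as the `<IP>@<intf>` notation above suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NextHopKey:
    """Hypothetical stand-in for a next-hop key: an IP plus an interface alias."""
    ip: str
    alias: str = ""  # empty when the key is built from an endpoint that only carries an IP

    def __str__(self) -> str:
        return f"{self.ip}@{self.alias}"


class NeighborTable:
    """Toy model of a neighbor store keyed by the full (ip, alias) pair."""

    def __init__(self) -> None:
        self._entries = set()

    def add(self, key: NextHopKey) -> None:
        self._entries.add(key)

    def has_next_hop(self, key: NextHopKey) -> bool:
        # Exact match on both fields, so "<IP>@" never matches "<IP>@Vlan32".
        return key in self._entries

    def resolve_alias(self, key: NextHopKey) -> Optional[NextHopKey]:
        """Fill in a missing alias by searching stored neighbors for the same IP."""
        if key.alias:
            return key if key in self._entries else None
        matches = [entry for entry in self._entries if entry.ip == key.ip]
        # Resolve only when the IP maps to exactly one interface (assumption:
        # an ambiguous match should be treated as unresolved).
        return matches[0] if len(matches) == 1 else None


neighbors = NeighborTable()
neighbors.add(NextHopKey("10.0.0.1", "Vlan32"))   # neighbor learned on the VLAN interface

endpoint_key = NextHopKey("10.0.0.1")             # built from the endpoint IP only
missed = neighbors.has_next_hop(endpoint_key)     # lookup with the empty alias
resolved = neighbors.resolve_alias(endpoint_key)  # lookup after resolving the interface
```

In this model the plain lookup misses because the alias is empty, while the resolving lookup returns the stored `10.0.0.1@Vlan32` key, which is the behaviour the Up / create path would need.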

  • SONiC image: 20251110.21 (202511 branch)

Related

Why vstest didn't catch it

The vs tests don't use a VLAN configuration, so a directly-connected neighbor on a VLAN interface is never exercised.

Why sonic-mgmt didn't catch it

  1. There is also no assertion on STATE_DB:VNET_ROUTE_TUNNEL_TABLE.state anywhere in tests/.
    1. Open question: how did traffic get forwarded from the NPU to the DPU?
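
A check on the STATE_DB entry could look like the sketch below. It is a hedged illustration in plain Python, not an existing sonic-mgmt helper: `check_vnet_route_state` is a hypothetical name, the field dictionary is assumed to come from reading the `VNET_ROUTE_TUNNEL_TABLE|<vnet>|<prefix>` key by whatever means the test already uses, and `active_endpoints` is assumed to be a comma-separated list of IPs.

```python
def check_vnet_route_state(fields, expected_endpoints):
    """Return a list of problems found in one VNET_ROUTE_TUNNEL_TABLE STATE_DB entry.

    fields: dict of field name -> string value read from STATE_DB.
    expected_endpoints: endpoint IPs the test expects to be active.
    """
    problems = []

    state = fields.get("state", "")
    if state != "active":
        problems.append(f"state is {state!r}, expected 'active'")

    # Assumption: active_endpoints is a comma-separated list of IPs.
    active = [ep for ep in fields.get("active_endpoints", "").split(",") if ep]
    missing = sorted(set(expected_endpoints) - set(active))
    if missing:
        problems.append("endpoints not active: " + ", ".join(missing))

    return problems


# The entry from the symptom above, and what a healthy entry might look like.
broken = {"active_endpoints": "", "state": "inactive"}
healthy = {"active_endpoints": "100.117.156.35", "state": "active"}

broken_problems = check_vnet_route_state(broken, ["100.117.156.35"])
healthy_problems = check_vnet_route_state(healthy, ["100.117.156.35"])
```

A test could then assert that the returned list is empty after the BFD session comes Up, which would have flagged the inactive route even when traffic still reached the DPU through another path.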
