[bgp] Fix native OVN BGP CI workarounds#4004
The gateway IP (e.g. `192.168.133.1`) was configured on the router loopback interface so that VMs could ping the external subnet gateway. This worked with `ovn-bgp-agent`, but with native OVN BGP (which replaces `ovn-bgp-agent` in RHOSO) pinging the gateway IP fails: OVN's `arp_proxy` responds to the ARP request and the ICMP request reaches the router, but an anti-loop flow in `lr_in_ip_input` drops the reply because its source IP matches the router port address. Since BGP routing does not depend on this loopback entry, remove it.

Related-Issue: #OSPRH-30905
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Eduardo Olivares <eolivare@redhat.com>
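The drop described above can be illustrated with a toy model. This is only a sketch of the reasoning, not OVN's actual flow table: the function name `ip_input_verdict`, the `Packet` type, and the VM address are all hypothetical, and the single rule modelled is "a packet whose source is one of the router's own port addresses is dropped".

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Packet:
    src: IPv4Address
    dst: IPv4Address


def ip_input_verdict(packet: Packet, router_port_ips: set) -> str:
    """Toy stand-in for the anti-loop check: drop packets that claim to
    come from an address owned by the logical router itself."""
    if packet.src in router_port_ips:
        return "drop"
    return "forward"


gateway = IPv4Address("192.168.133.1")  # gateway IP from the commit message
vm = IPv4Address("10.0.0.5")            # hypothetical VM address
router_port_ips = {gateway}             # the router port owns the gateway IP

echo_request = Packet(src=vm, dst=gateway)
# The reply comes back from the loopback with the gateway IP as its source.
echo_reply = Packet(src=gateway, dst=vm)

print(ip_input_verdict(echo_request, router_port_ips))
print(ip_input_verdict(echo_reply, router_port_ips))
```

In this model the request passes and the reply is dropped, which matches the symptom: the ping never completes even though the ICMP request reaches the router.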
After a fresh deployment, the BGP reconciler's `full_sync()` can be skipped if the OVSDB lock is not yet held at startup, and it is never retried. This leaves `arp_proxy` unset on interconnect LSPs. Restarting the neutron pods triggers a new `full_sync()` that completes the setup. This workaround should be removed once the bug is fixed in neutron.

Related-Issue: #OSPRH-30900
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Eduardo Olivares <eolivare@redhat.com>
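The failure mode above is a one-shot, lock-gated sync with no retry. The following is a minimal sketch of that pattern, not neutron's code: the `Reconciler` class and its `start()` method are hypothetical, and the model assumes the lock is already held when the restarted process runs its startup sync.

```python
class Reconciler:
    """Toy model of a reconciler whose initial full sync is tried once."""

    def __init__(self) -> None:
        self.synced = False

    def start(self, lock_held: bool) -> None:
        # Single attempt at startup. If the lock is not held yet, the sync
        # is skipped and nothing in this model schedules a retry.
        if lock_held:
            self.full_sync()

    def full_sync(self) -> None:
        # Stands in for the work that sets arp_proxy on interconnect LSPs.
        self.synced = True


# Fresh deployment: the lock has not been acquired when start() runs.
first = Reconciler()
first.start(lock_held=False)

# After a restart: assumed here that the lock is held at startup.
restarted = Reconciler()
restarted.start(lock_held=True)

print(first.synced, restarted.synced)
```

The first instance stays unsynced indefinitely, which is why only a restart recovers it. Retrying the sync once the lock is acquired would remove the need for the restart.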
Build failed (check pipeline). ✔️ openstack-k8s-operators-content-provider SUCCESS in 22m 34s
Summary
- Remove the gateway IP (`192.168.133.1`) from the router loopback in `prepare-bgp-spines-leaves.yaml`. This IP was needed with `ovn-bgp-agent`, but with native OVN BGP the ping reply is dropped by an OVN anti-loop flow in `lr_in_ip_input`. BGP routing does not depend on this loopback entry.
- Add a neutron pod restart workaround in `prepare-bgp-computes.yaml`. After a fresh deployment the BGP reconciler's `full_sync()` can be skipped if the OVSDB lock is not yet held, leaving `arp_proxy` unset on interconnect LSPs. A pod restart triggers a new `full_sync()`. This workaround should be removed once the bug is fixed in neutron.
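Whether the restart had the intended effect can be checked by looking for interconnect LSPs that still lack `arp_proxy`. The sketch below assumes the LSP options have already been read from the OVN northbound database and filtered to interconnect ports. The helper name `lsps_missing_arp_proxy`, the port names, and the option values are hypothetical sample data.

```python
def lsps_missing_arp_proxy(lsp_options: dict) -> list:
    """Return the names of LSPs whose options have no non-empty arp_proxy."""
    return sorted(
        name
        for name, options in lsp_options.items()
        if not options.get("arp_proxy")
    )


# Hypothetical snapshot: LSP name -> options column as a dict.
before_restart = {
    "interconnect-a": {},
    "interconnect-b": {"arp_proxy": "192.168.133.1"},
}
after_restart = {
    "interconnect-a": {"arp_proxy": "192.168.133.1"},
    "interconnect-b": {"arp_proxy": "192.168.133.1"},
}

print(lsps_missing_arp_proxy(before_restart))
print(lsps_missing_arp_proxy(after_restart))
```

An empty result after the restart indicates that the new `full_sync()` completed the setup on every port in the snapshot.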
Related-Issue: #OSPRH-30905
Related-Issue: #OSPRH-30900
Test plan
- … loopback gateway IP task
- … `OpenStackControlPlane` reconciles successfully
- … `arp_proxy` is set on interconnect LSPs after the restart

Assisted-By: Claude Opus 4.6 (1M context) noreply@anthropic.com