LAG LACP & L3 Support #146

Open
wants to merge 13 commits into
base: main
100 changes: 87 additions & 13 deletions docs/HLD/vpp-lag.md
@@ -1,5 +1,5 @@
# SONIC VPP LAG Bridging support
Rev v0.1
# SONIC VPP LAG Support
Rev v0.2

<br/>
<br/>
@@ -11,26 +11,30 @@ Rev v0.1
3. [Definitions/Abbreviations](#item-3)
4. [Introduction](#item-4)
5. [LAG and Bridging](#item-5)
6. [LACP Support](#item-6)
7. [LAG L3 Support](#item-7)
8. [Status](#item-8)

<br/>
<br/>

<a id="item-1"></a>
## Revisions

| Rev | Date | Author(s) |
|-----|------|-----------|
|v0.1 | 25/02/2024 | Bendrapu Balareddy (Cisco), Sameer Nanajkar (Cisco) |
| Rev | Date | Author(s) | Changes |
|-----|------|-----------|---------|
|v0.1 | 25/02/2024 | Bendrapu Balareddy (Cisco), Sameer Nanajkar (Cisco) | LAG & Bridging Support |
|v0.2 | 17/12/2024 | Akeel Ali (Cisco) | LACP & L3 Support |


<br/>
<br/>

<a id="item-2"></a>
## Scope
This document describes the high level design of integrating LAG and bridging functionality between SONIC and VPP. It provides
- LAG
- 802.1Q bridging
This document describes the high level design of integrating LAG between SONIC and VPP. It covers:
- LAG and 802.1Q bridging
- LACP

<br/>

@@ -47,21 +51,18 @@ This document describes the high level design of integrating LAG and bridging fu
<a id="item-4"></a>
## Introduction
LAG functionality is supported in SONiC with port channel interfaces.
VPP port channel supports modes as active-standby , active-active or LACP mode. But SONiC does not have a way program
VPP port channels support active-standby, active-active, and LACP modes, but SONiC has no way to program
the mode through configuration. Hence the portchannel mode defaults to active-active, set by passing the ROUND_ROBIN value. Similarly,
VPP supports several load-balancing algorithms for distributing traffic in active-active mode, but SONiC
has no way to configure one, so it is programmed as BOND_API_LB_ALGO_L2 by default.
If a future requirement calls for a different default load-balancing algorithm, it can be
changed accordingly.

Some parameters that SONiC supports, such as min-links for a portchannel group and fast-rate,
are not supported by VPP.

To enable LACP on VPP, VPP must be rebuilt with the LACP plugin added. The sonic-platform-vpp repo currently uses a prebuilt
VPP binary, so the way this repo consumes VPP would need to change.


<a id="item-5"></a>
## LAG and Bridging
![LAG Bridging](../LAG-Dot1q-Bridging-Topo.png)

SONiC supports port channel interfaces. These interfaces can be configured for layer routing or as members of a vlan
@@ -76,6 +77,79 @@ SONiC supports port channel interfaces. These interfaces can be configured for l
that this subinterface is only specific to vpp and does not appear in SONiC control plane data bases.
3. set the vlan subinterface as bridge port

<a id="item-6"></a>
## LACP Support

Initially, the PortChannel was reported down because the LACP packets generated by SONiC were dropped by VPP and not forwarded to the peer.
This section explores how LACP was implemented in SONiC-VPP.
<br/>
<br/>
Both SONiC and VPP can run LACP. If we choose to run LACP in VPP, the PortChannel interface in SONiC will not be reported up.
Instead, we need to let SONiC run LACP and instruct VPP to carry SONiC's LACP packets without generating or consuming them.

To achieve this, we can use the VPP [linux-cp plugin](https://s3-docs.fd.io/vpp/22.06/developer/plugins/lcp.html) to mirror packets between the member tap interfaces (on host)
and the corresponding VPP phy interface.

However, the linux-cp plugin does not support LACP packets. We need to add support in VPP to use this plugin for this purpose.

As an experiment, a new node (modeled on lip_punt_node) was added to the plugin to punt and inject packets between host and phy, and was registered with ethernet-input to handle ETHERNET_TYPE_SLOW_PROTOCOLS packets. As a result, the LACP packets injected by SONiC were forwarded to the peer and punted to the peer SONiC device, allowing the PortChannel to come up and the LACP protocol to run successfully.
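To illustrate the classification step, here is a minimal sketch of how a frame can be recognized as LACP. It is not the code from the VPP change: `is_lacp_frame` and the constant names are hypothetical, and the sketch assumes an untagged Ethernet frame. It relies only on the IEEE 802.3 values, where the Slow Protocols EtherType is 0x8809 and subtype 0x01 is LACP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// IEEE 802.3 Slow Protocols EtherType (the value behind ETHERNET_TYPE_SLOW_PROTOCOLS).
constexpr uint16_t kEtherTypeSlowProtocols = 0x8809;
// Slow Protocols subtype 0x01 identifies LACP (0x02 is the Marker protocol).
constexpr uint8_t kSlowProtocolSubtypeLacp = 0x01;
// Untagged Ethernet header: 6-byte destination, 6-byte source, 2-byte EtherType.
constexpr std::size_t kEthHeaderLen = 14;

// Hypothetical helper (not part of VPP): true when an untagged frame is an LACPDU.
bool is_lacp_frame(const uint8_t *frame, std::size_t len)
{
    // Need the full header plus the one-byte subtype that follows it.
    if (frame == nullptr || len < kEthHeaderLen + 1) {
        return false;
    }
    // The EtherType is carried in network (big-endian) byte order.
    const uint16_t ethertype =
        static_cast<uint16_t>((frame[12] << 8) | frame[13]);
    return ethertype == kEtherTypeSlowProtocols &&
           frame[kEthHeaderLen] == kSlowProtocolSubtypeLacp;
}

// Usage example: an LACPDU header is matched, an ARP header is not.
void lacp_example()
{
    const uint8_t lacp[] = {0x01, 0x80, 0xC2, 0x00, 0x00, 0x02,  // Slow Protocols multicast
                            0x02, 0x00, 0x00, 0x00, 0x00, 0x01,  // source MAC (made up)
                            0x88, 0x09,                          // EtherType
                            0x01};                               // subtype: LACP
    const uint8_t arp[] = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,   // broadcast
                           0x02, 0x00, 0x00, 0x00, 0x00, 0x01,   // source MAC (made up)
                           0x08, 0x06,                           // EtherType: ARP
                           0x00};
    assert(is_lacp_frame(lacp, sizeof(lacp)));
    assert(!is_lacp_frame(arp, sizeof(arp)));
}
```

In the actual design, frames that match are punted to or injected from the host tap rather than consumed by VPP, which is what lets SONiC own the LACP state machine.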

This VPP solution was productized and submitted for review to the VPP project: https://gerrit.fd.io/r/c/vpp/+/42124

Note that the lacp plugin should not be enabled, and the BondEthernet should not be configured in lacp mode, so that VPP does not run its own LACP protocol. Instead, we configure the BondEthernet in `VPP_BOND_API_MODE_XOR` so that we can use the load-balancing algorithms (`VPP_BOND_API_LB_ALGO_L23` or `VPP_BOND_API_LB_ALGO_L34`).

<a id="item-7"></a>
## LAG L3 Support

### Configuring IP
Adding an IP address to a PortChannel was not originally supported.
This is the SAI update we receive when an address is applied:

2024-11-27.22:47:22.421304|c|SAI_OBJECT_TYPE_ROUTE_ENTRY:{"dest":"10.0.1.1/32","switch_id":"oid:0x21000000000000","vr":"oid:0x3000000000022"}|SAI_ROUTE_ENTRY_ATTR_PACKET_ACTION=SAI_PACKET_ACTION_FORWARD|SAI_ROUTE_ENTRY_ATTR_NEXT_HOP_ID=oid:0x1000000000001

The SONiC-VPP SAI code that handles this call (`vpp_add_del_intf_ip_addr_norif`) uses the output of the `ip` command to derive the interface to which the address belongs, because the SAI update does not identify the interface (we get the CPU port id, which is common to all interfaces). For a PortChannel, we can derive the PortChannelX name in the same way, but we do not know the associated VPP BondEthernetY on which to configure the IP address (X and Y are not necessarily equal). When the LAG is created, the SAI call carries no attributes, i.e. we do not even get the PortChannelX name. This makes it difficult to apply the IP address under the current design.

To resolve this, code was added to detect which PortChannelX is being created during LAG create (again using the `ip` command, by scanning its output for the newly added PortChannel). We then make sure the BondEthernet is created with the same bond_id (i.e. set Y=X). This way, we can easily derive the BondEthernet in question from the PortChannel when applying the IP address to VPP in `vpp_add_del_intf_ip_addr_norif`.
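The name mapping that this relies on can be sketched as follows. This is an illustration only, not the code in the PR: `parse_portchannel_id` and `bond_id_to_bondethernet` are hypothetical helpers, and the two prefixes repeat the literals of the `PORTCHANNEL_PREFIX` and `BONDETHERNET_PREFIX` macros added to `SwitchStateBase.h`. Scanning the `ip` output itself is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Same literals as PORTCHANNEL_PREFIX / BONDETHERNET_PREFIX in SwitchStateBase.h.
static const std::string kPortChannelPrefix = "PortChannel";
static const std::string kBondEthernetPrefix = "BondEthernet";

// Hypothetical helper: extracts X from "PortChannelX".
// Returns false (leaving id untouched) when the name does not have that form.
bool parse_portchannel_id(const std::string &ifname, uint32_t &id)
{
    if (ifname.compare(0, kPortChannelPrefix.size(), kPortChannelPrefix) != 0) {
        return false;
    }
    const std::string digits = ifname.substr(kPortChannelPrefix.size());
    // Cap at 9 digits so the value always fits in a uint32_t.
    if (digits.empty() || digits.size() > 9) {
        return false;
    }
    uint32_t value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') {
            return false;
        }
        value = value * 10 + static_cast<uint32_t>(c - '0');
    }
    id = value;
    return true;
}

// Hypothetical helper: the VPP interface name for a bond created with this bond_id.
std::string bond_id_to_bondethernet(uint32_t bond_id)
{
    return kBondEthernetPrefix + std::to_string(bond_id);
}

// Usage example: PortChannel12 maps to BondEthernet12 because X == Y by construction.
void portchannel_example()
{
    uint32_t id = 0;
    assert(parse_portchannel_id("PortChannel12", id));
    assert(bond_id_to_bondethernet(id) == "BondEthernet12");
}
```

Because the bond is created with the PortChannel's own number, no separate lookup table between the two names is needed when the IP address arrives later.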

### Ping

ARP resolution was failing because the ARP packets were not punted to SONiC. They are received on the member interface in VPP and processed as part of the BondEthernet in bond-input, but are not punted because no linux-cp pair exists for the bond interface.

As such, we need to call `configure_lcp_pair` for the BondEthernet interface to create a tap interface to which the ARP packets are punted. There are two issues here:

1. This call needs to be made after the first member is added, to ensure that the MAC address of the tap interface matches that of the BondEthernet and member interfaces.

2. While other interfaces in SONiC are actually tap interfaces themselves (e.g. the Ethernet interfaces), the PortChannel is not a tap interface and is already created by teamd before the SAI call. Deleting it and delegating its creation (as a tap) to the lcp plugin, as is done for a Loopback interface, does not work for a PortChannel because state is maintained elsewhere in the system based on the original interface index.

To resolve (1), we defer calling `configure_lcp_pair` until the first member is added, and delete the pair when the LAG itself is deleted.
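The deferral can be sketched with a small piece of bookkeeping. This is a simplified model under stated assumptions, not the PR's implementation: `LagLcpTracker` and its methods are hypothetical, `BondInfo` mirrors the fields of `platform_bond_info_t`, the LAG object id is modeled as a plain `uint64_t`, and the actual `configure_lcp_pair` calls are left to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Mirrors the fields of platform_bond_info_t.
struct BondInfo {
    uint32_t sw_if_index;
    uint32_t id;
    bool lcp_created;
};

// Hypothetical bookkeeping class; in the PR this state lives in SwitchStateBase.
class LagLcpTracker {
public:
    // LAG create: remember the bond, but do not create the LCP pair yet.
    void on_lag_create(uint64_t lag_oid, uint32_t sw_if_index, uint32_t bond_id)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_bonds[lag_oid] = BondInfo{sw_if_index, bond_id, false};
    }

    // Member add: returns true only for the first member of a known LAG,
    // which is when the caller should create the LCP pair.
    bool on_member_add(uint64_t lag_oid)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_bonds.find(lag_oid);
        if (it == m_bonds.end() || it->second.lcp_created) {
            return false;
        }
        it->second.lcp_created = true;
        return true;
    }

    // LAG remove: returns true when an LCP pair exists and should be deleted.
    bool on_lag_remove(uint64_t lag_oid)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_bonds.find(lag_oid);
        if (it == m_bonds.end()) {
            return false;
        }
        const bool had_pair = it->second.lcp_created;
        m_bonds.erase(it);
        return had_pair;
    }

private:
    std::mutex m_mutex;                   // plays the role of LagMutex
    std::map<uint64_t, BondInfo> m_bonds; // plays the role of m_lag_bond_map
};

// Usage example: the pair is created once, on the first member, and removed with the LAG.
void lag_lcp_example()
{
    LagLcpTracker tracker;
    const uint64_t lag = 0x2000000000001ULL;  // made-up object id
    tracker.on_lag_create(lag, 5, 1);
    assert(tracker.on_member_add(lag));       // first member: create the pair
    assert(!tracker.on_member_add(lag));      // later members: nothing to do
    assert(tracker.on_lag_remove(lag));       // pair existed: delete it
}
```

Keeping the `lcp_created` flag next to the bond's `sw_if_index` is what makes the decision a single map lookup at member-add time.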

Addressing (2) is a little more complicated. The solution that worked is to use the Traffic Control (tc) utility in the syncd container to create a filter and redirect packets from the tap interface to the PortChannel. This resolves the ARP entry and ping.

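```
# Attach an ingress qdisc to the bond's tap interface
tc qdisc add dev <bond-tap-interface> ingress
# Redirect every ARP packet arriving on the tap to the PortChannel interface
tc filter add dev <bond-tap-interface> parent ffff: \
protocol arp prio 2 u32 \
match u32 0 0 flowid 1:1 \
action mirred ingress redirect dev <port-channel-interface>
```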
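If the commands are issued from code, one way to keep them in a single place is to build them from the two interface names. The sketch below only assembles the command strings and does not run them. `build_tc_arp_redirect` is a hypothetical helper, and the tap name used in the example is a placeholder, since the naming of the bond's tap interface is not specified here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the two tc commands for one tap/PortChannel pair,
// in the order they must be executed (qdisc first, then the filter).
std::vector<std::string> build_tc_arp_redirect(const std::string &tap_if,
                                               const std::string &portchannel_if)
{
    std::vector<std::string> cmds;
    cmds.push_back("tc qdisc add dev " + tap_if + " ingress");
    cmds.push_back("tc filter add dev " + tap_if + " parent ffff: "
                   "protocol arp prio 2 u32 match u32 0 0 flowid 1:1 "
                   "action mirred ingress redirect dev " + portchannel_if);
    return cmds;
}

// Usage example with placeholder interface names.
void tc_example()
{
    const std::vector<std::string> cmds =
        build_tc_arp_redirect("bond-tap1", "PortChannel1");
    assert(cmds.size() == 2);
    assert(cmds[0] == "tc qdisc add dev bond-tap1 ingress");
}
```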


<a id="item-8"></a>
## Status


Item | Done | Remaining
--- | --- | ---
LACP punt/inject support in VPP | Coded and submitted for review in VPP https://gerrit.fd.io/r/c/vpp/+/42124. Addressed first round of review comments. | <li>Complete review and commit</li>
Add IP to PortChannel | Coded solution using `ip` command to detect newly added PortChannel and create BondEthernet with same id | <li>Complete review and commit</li>
Ping between PortChannels | Coded solution using `tc` utility to redirect traffic between tap and Sonic intf | <li>(Future) Explore alternative design using common punt port (Potentially larger project)</li>
Testing | Bring-up, ping of PortChannel with 2 members, add/remove members, v4/v6/multiple-members, multiple-portchannels | Sonic-mgmt testing
Sonic-mgmt t1-28-lag testing | Brought-up t1-28-lag TOPO. Ran LAG tests | <li> Debug and fix failing LAG TCs </li>
Adapt to new LCP VPP API once upstreamed | | Potentially Phase 2 given slow VPP review process. For now, we can commit the tested non-API version of the changes to the vpp.patch file.
Hashing algo selection | Coded using XOR & L2L3 | (Optional, TBD) Ability to switchover between L2L3 and L3L4 depending on presence of IP (or other scheme)
(Phase 2 - T0-lag) PortChannel Subinterfaces | Identified changes required to provision PortChannel subintf in VPP and apply IP | <li>Ping fails</li><li>May require redesign to add PortChannel support to existing interface APIs</li><li>Bring-up t0-lag topo and run LAG tests</li>



## References

27 changes: 25 additions & 2 deletions saivpp/src/SwitchStateBase.h
@@ -39,13 +39,20 @@
#include "SwitchStateBaseAcl.h"

#define BFD_MUTEX std::lock_guard<std::mutex> lock(bfdMapMutex);
#define LAG_MUTEX std::lock_guard<std::mutex> lock(LagMutex);

#define SAI_VPP_FDB_INFO "SAI_VPP_FDB_INFO"

#define DEFAULT_VLAN_NUMBER 1

#define MAX_OBJLIST_LEN 128

#define IP_CMD "/sbin/ip"

#define PORTCHANNEL_PREFIX "PortChannel"

#define BONDETHERNET_PREFIX "BondEthernet"

#define CHECK_STATUS(status) { \
sai_status_t _status = (status); \
if (_status != SAI_STATUS_SUCCESS) { SWSS_LOG_ERROR("ERROR status %d", status); return _status; } }
@@ -66,6 +73,12 @@ typedef struct vpp_ace_cntr_info_ {
uint32_t ace_index;
} vpp_ace_cntr_info_t;

typedef struct platform_bond_info_ {
uint32_t sw_if_index;
uint32_t id;
bool lcp_created;
} platform_bond_info_t;

namespace saivpp
{
class SwitchStateBase:
@@ -895,6 +908,12 @@ namespace saivpp
sai_status_t removeRouterif(
_In_ sai_object_id_t objectId);

sai_status_t vpp_set_interface_attributes(
_In_ sai_object_id_t obj_id,
_In_ uint32_t attr_count,
_In_ const sai_attribute_t *attr_list,
_In_ uint16_t vlan_id);

sai_status_t vpp_create_router_interface(
_In_ uint32_t attr_count,
_In_ const sai_attribute_t *attr_list);
@@ -1319,7 +1338,11 @@ namespace saivpp
void populate_if_mapping();
const char *tap_to_hwif_name(const char *name);
const char *hwif_to_tap_name(const char *name);
uint32_t lag_to_bond_if_idx (const sai_object_id_t lag_id);

std::mutex LagMutex;

uint32_t find_bond_id();
sai_status_t get_lag_bond_info(const sai_object_id_t lag_id, platform_bond_info_t &bond_info);
int remove_lag_to_bond_entry (const sai_object_id_t lag_id);

void vppProcessEvents ();
@@ -1332,7 +1355,7 @@ namespace saivpp
bool m_run_vpp_events_thread = true;
bool VppEventsThreadStarted = false;
std::shared_ptr<std::thread> m_vpp_thread;
std::map<sai_object_id_t, uint32_t> m_lag_bond_map;
std::map<sai_object_id_t, platform_bond_info_t> m_lag_bond_map;

private:
static int currentMaxInstance;