diff --git a/docs/HLD/vpp-lag.md b/docs/HLD/vpp-lag.md
index 1954b68..3d4409f 100644
--- a/docs/HLD/vpp-lag.md
+++ b/docs/HLD/vpp-lag.md
@@ -1,5 +1,5 @@
-# SONIC VPP LAG Bridging support
-Rev v0.1
+# SONIC VPP LAG Support
+Rev v0.2
@@ -11,6 +11,9 @@ Rev v0.1
3. [Definitions/Abbreviations](#item-3)
4. [Introduction](#item-4)
5. [LAG and Bridging](#item-5)
+6. [LACP Support](#item-6)
+7. [LAG L3 Support](#item-7)
+8. [Status](#item-8)
@@ -18,9 +21,10 @@ Rev v0.1
## Revisions
-| Rev | Date | Author(s) |
-|-----|------|-----------|
-|v0.1 | 25/02/2024 | Bendrapu Balareddy (Cisco), Sameer Nanajkar (Cisco) |
+| Rev | Date | Author(s) | Changes |
+|-----|------|-----------|---------|
+|v0.1 | 25/02/2024 | Bendrapu Balareddy (Cisco), Sameer Nanajkar (Cisco) | LAG & Bridging Support |
+|v0.2 | 17/12/2024 | Akeel Ali (Cisco) | LACP & L3 Support |
@@ -28,9 +32,9 @@ Rev v0.1
## Scope
-This document describes the high level design of integrating LAG and bridging functionality between SONIC and VPP. It provides
- - LAG
- - 802.1Q bridging
+This document describes the high-level design of integrating LAG between SONiC and VPP. It covers:
+ - LAG and 802.1Q bridging
+ - LACP
@@ -47,21 +51,18 @@ This document describes the high level design of integrating LAG and bridging fu
## Introduction
LAG functionality is supported in SONiC with port channel interfaces.
-VPP port channel supports modes as active-standby , active-active or LACP mode. But SONiC does not have a way program
+VPP port channels support active-standby, active-active, or LACP mode, but SONiC does not have a way to program
this through configuration. Hence the portchannel mode as default active-active by giving ROUND_ROBIN value. Similarly
VPP supports different load balancing algorithms for distributing the traffic in active-active mode. But again SONIC
does not support a way to configure any load balancing algorithm, hence it's programmed as BOND_API_LB_ALGO_L2 by default.
-If there is any defined requirement in future to use different load balancing algorithm to use by default, it can be
-changed accordingly.
Some parameters the SONIC supports are not supported by VPP. The attributes like min-links for portchannel group, fast-rate
etc are not supported by VPP.
-To enable LACP on VPP, the VPP should be rebuilt with LACP plugin added. Currently sonic-platform-vpp repo uses prebuilt
-vpp binary. Hence it needs some changes the way this repo is using the VPP
+## LAG and Bridging
![LAG Bridging](../LAG-Dot1q-Bridging-Topo.png)
SONiC supports port channel interfaces. These interfaces can be configured for layer routing or as members of a vlan
@@ -76,6 +77,79 @@ SONiC supports port channel interfaces. These interfaces can be configured for l
that this subinterface is only specific to vpp and does not appear in SONiC control plane data bases.
3. set the vlan subinterface as bridge port
+
+## LACP Support
+
+Initially, the PortChannel was reported down because the LACP packets generated by SONiC were dropped by VPP and not forwarded to the peer.
+This section explores how LACP was implemented in SONiC-VPP.
+
+
+Both SONiC and VPP can run LACP. If we choose to run LACP in VPP, the PortChannel interface in SONiC will not be reported up.
+Instead, we need to let SONiC run LACP and instruct VPP to carry SONiC's LACP packets without generating or consuming them.
+
+To achieve this, we can use the VPP [linux-cp plugin](https://s3-docs.fd.io/vpp/22.06/developer/plugins/lcp.html) to mirror packets between the member tap interfaces (on host)
+and the corresponding VPP phy interface.
+
+However, the linux-cp plugin does not support LACP packets. We need to add support in VPP to use this plugin for this purpose.
+
+As an experiment, a new node (modeled on lip_punt_node) was added to the plugin to punt and inject packets between host and phy, and was registered with ethernet-input to handle ETHERNET_TYPE_SLOW_PROTOCOLS packets. With this node in place, the LACP packets injected by SONiC were forwarded to the peer and punted to the peer SONiC device, allowing the PortChannel to come up and LACP to run successfully.
+
+This VPP solution was productized and submitted for review to the VPP project: https://gerrit.fd.io/r/c/vpp/+/42124
+
+Note that the lacp plugin should not be enabled, and the BondEthernet should not be configured in lacp mode, so that VPP does not run its own LACP protocol. Instead, we configure the BondEthernet in `VPP_BOND_API_MODE_XOR` so that we can use the load-balancing algorithms (`VPP_BOND_API_LB_ALGO_L23` or `VPP_BOND_API_LB_ALGO_L34`).
+
+
+## LAG L3 Support
+
+### Configuring IP
+Adding an IP address to a PortChannel was not originally supported.
+This is the SAI update we get for applying an address:
+
+ 2024-11-27.22:47:22.421304|c|SAI_OBJECT_TYPE_ROUTE_ENTRY:{"dest":"10.0.1.1/32","switch_id":"oid:0x21000000000000","vr":"oid:0x3000000000022"}|SAI_ROUTE_ENTRY_ATTR_PACKET_ACTION=SAI_PACKET_ACTION_FORWARD|SAI_ROUTE_ENTRY_ATTR_NEXT_HOP_ID=oid:0x1000000000001
+
+The SONiC-VPP SAI code that handles this call (`vpp_add_del_intf_ip_addr_norif`) uses the output of the `ip` command to derive the interface to which the address belongs, because the SAI update does not identify the interface (we get the CPU port id, which is common to all). For a PortChannel, we can derive the PortChannelX name in the same way, but we do not know the associated VPP BondEthernetY on which to configure the IP (X and Y are not necessarily equal). When the LAG is created, the SAI call carries no attributes, i.e. we do not even get the PortChannelX name. This makes it difficult to apply the IP address with the current design.
+
+To resolve this, code was added to detect which PortChannelX is being created during LAG create (also using the `ip` command, by scanning its output for the newly added PortChannel). We then create the BondEthernet with the same bond_id (i.e. set X=Y). This way, `vpp_add_del_intf_ip_addr_norif` can derive the BondEthernet directly from the PortChannel name when applying the IP address to VPP.
+
+### Ping
+
+ARP resolution was failing because ARP packets were not punted to SONiC. They are received on the member interface in VPP and processed as part of the BondEthernet in bond-input, but they are not punted because no linux-cp pair exists for the bond interface.
+
+As such, we need to call `configure_lcp_interface` for the BondEthernet interface to create a tap interface to which the ARP packets are punted. There are two issues here:
+
+1. This call needs to be done after the first member is added to ensure the mac address of the tap interface matches that of the BondEthernet & member interfaces.
+
+2. While other interfaces in SONiC are themselves tap interfaces (e.g. the Ethernet interfaces), the PortChannel is not a tap interface and is already created by teamd before the SAI call. Deleting it and delegating its creation (as a tap) to the lcp plugin, as is done for a Loopback interface, does not work for a PortChannel because state elsewhere in the system is keyed on the original interface index.
+
+To resolve (1), we defer calling `configure_lcp_interface` until the first member is added, and delete the pair when the LAG itself is deleted.
+
+Addressing (2) is a little more complicated. The solution that worked is to use the Traffic Control (tc) utility in the syncd container to create a filter and redirect packets from the tap interface to the PortChannel. This resolves the ARP entry and ping.
+
+```
+tc qdisc add dev be<X> ingress
+tc filter add dev be<X> parent ffff: \
+ protocol arp prio 2 u32 \
+ match u32 0 0 flowid 1:1 \
+ action mirred ingress redirect dev PortChannel<X>
+```
+
+
+
+## Status
+
+
+Item | Done | Remaining
+--- | --- | ---
+LACP punt/inject support in VPP | Coded and submitted for review in VPP https://gerrit.fd.io/r/c/vpp/+/42124. Addressed first round of review comments. | Complete review and commit
+Add IP to PortChannel | Coded solution using `ip` command to detect newly added PortChannel and create BondEthernet with same id | Complete review and commit
+Ping between PortChannels | Coded solution using `tc` utility to redirect traffic between tap and Sonic intf | (Future) Explore alternative design using common punt port (Potentially larger project)
+Testing | Bring-up, ping of PortChannel with 2 members, add/remove members, v4/v6/multiple-members, multiple-portchannels | Sonic-mgmt testing
+Sonic-mgmt t1-28-lag testing | Brought-up t1-28-lag TOPO. Ran LAG tests | Debug and fix failing LAG TCs
+Adapt to new LCP VPP API once upstreamed | | Potentially Phase 2 given slow VPP review process. For now, we can commit the tested non-API version of the changes to the vpp.patch file.
+Hashing algo selection | Coded using XOR & L2L3 | (Optional, TBD) Ability to switchover between L2L3 and L3L4 depending on presence of IP (or other scheme)
+(Phase 2 - T0-lag) PortChannel Subinterfaces | Identified changes required to provision PortChannel subintf in VPP and apply IP | Ping fails<br>May require redesign to add PortChannel support to existing interface APIs<br>Bring-up t0-lag topo and run LAG tests
+
+
## References
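
A minimal sketch of the tap-to-PortChannel redirect described in the Ping section of the document above, written as a standalone C++ helper. `build_tc_redirect_cmds` is a hypothetical name and not part of this change. It assumes the tap is named `be<X>` and the kernel netdev `PortChannel<X>`, as in `vpp_create_lag_member`, and it takes the tc protocol as a parameter because the document shows `arp` while the code in this change passes `all`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of the change): builds the two tc commands
// that redirect ingress traffic from the bond's tap ("be<id>") to the
// matching PortChannel<id> netdev.
std::vector<std::string> build_tc_redirect_cmds(uint32_t bond_id,
                                                const std::string &protocol)
{
    const std::string tap = "be" + std::to_string(bond_id);
    const std::string pc = "PortChannel" + std::to_string(bond_id);

    return {
        // Attach an ingress qdisc so that filters can be installed on the tap.
        "tc qdisc add dev " + tap + " ingress",
        // Match every packet of the given protocol and redirect it to the PortChannel.
        "tc filter add dev " + tap + " parent ffff: protocol " + protocol +
            " prio 2 u32 match u32 0 0 flowid 1:1"
            " action mirred ingress redirect dev " + pc,
    };
}
```

Keeping command construction separate from `swss::exec` would make the exact strings easy to check in a unit test; whether that split is worth it here is a judgment call for the authors.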
diff --git a/saivpp/src/SwitchStateBase.h b/saivpp/src/SwitchStateBase.h
index 384275f..82b9ae5 100644
--- a/saivpp/src/SwitchStateBase.h
+++ b/saivpp/src/SwitchStateBase.h
@@ -39,6 +39,7 @@
#include "SwitchStateBaseAcl.h"
#define BFD_MUTEX std::lock_guard<std::mutex> lock(bfdMapMutex);
+#define LAG_MUTEX std::lock_guard<std::mutex> lock(LagMapMutex);
#define SAI_VPP_FDB_INFO "SAI_VPP_FDB_INFO"
@@ -46,6 +47,12 @@
#define MAX_OBJLIST_LEN 128
+#define IP_CMD "/sbin/ip"
+
+#define PORTCHANNEL_PREFIX "PortChannel"
+
+#define BONDETHERNET_PREFIX "BondEthernet"
+
#define CHECK_STATUS(status) { \
sai_status_t _status = (status); \
if (_status != SAI_STATUS_SUCCESS) { SWSS_LOG_ERROR("ERROR status %d", status); return _status; } }
@@ -66,6 +73,12 @@ typedef struct vpp_ace_cntr_info_ {
uint32_t ace_index;
} vpp_ace_cntr_info_t;
+typedef struct platform_bond_info_ {
+ uint32_t sw_if_index;
+ uint32_t id;
+ bool lcp_created;
+} platform_bond_info_t;
+
namespace saivpp
{
class SwitchStateBase:
@@ -895,6 +908,12 @@ namespace saivpp
sai_status_t removeRouterif(
_In_ sai_object_id_t objectId);
+ sai_status_t vpp_set_interface_attributes(
+ _In_ sai_object_id_t obj_id,
+ _In_ uint32_t attr_count,
+ _In_ const sai_attribute_t *attr_list,
+ _In_ uint16_t vlan_id);
+
sai_status_t vpp_create_router_interface(
_In_ uint32_t attr_count,
_In_ const sai_attribute_t *attr_list);
@@ -1319,7 +1338,9 @@ namespace saivpp
void populate_if_mapping();
const char *tap_to_hwif_name(const char *name);
const char *hwif_to_tap_name(const char *name);
- uint32_t lag_to_bond_if_idx (const sai_object_id_t lag_id);
+
+ uint32_t find_bond_id();
+ sai_status_t get_lag_bond_info(const sai_object_id_t lag_id, platform_bond_info_t &bond_info);
int remove_lag_to_bond_entry (const sai_object_id_t lag_id);
void vppProcessEvents ();
@@ -1332,7 +1353,8 @@ namespace saivpp
bool m_run_vpp_events_thread = true;
bool VppEventsThreadStarted = false;
std::shared_ptr m_vpp_thread;
- std::map<sai_object_id_t, uint32_t> m_lag_bond_map;
+ std::map<sai_object_id_t, platform_bond_info_t> m_lag_bond_map;
+ std::mutex LagMapMutex;
private:
static int currentMaxInstance;
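
The header above introduces `platform_bond_info_t`, `m_lag_bond_map` and `LAG_MUTEX`. In `vpp_create_lag_member` the entry is copied out with `get_lag_bond_info` before `LAG_MUTEX` is taken, and the modified copy is written back afterwards. The following is a sketch, under assumptions, of one way to keep the check and the update of `lcp_created` under a single lock. `BondInfo` stands in for `platform_bond_info_t`, `uint64_t` stands in for `sai_object_id_t`, and `LagBondTable` with its methods is a hypothetical wrapper that does not exist in the change.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

struct BondInfo {            // stand-in for platform_bond_info_t
    uint32_t sw_if_index;
    uint32_t id;
    bool lcp_created;
};

class LagBondTable {         // hypothetical wrapper, not in the change
public:
    void set(uint64_t lag_id, const BondInfo &info)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_map[lag_id] = info;
    }

    // Copies the entry out; returns false when lag_id is unknown.
    bool get(uint64_t lag_id, BondInfo &out) const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_map.find(lag_id);
        if (it == m_map.end())
            return false;
        out = it->second;
        return true;
    }

    // Sets lcp_created under the lock; returns true only for the caller
    // that changed it from false to true.
    bool claim_lcp_creation(uint64_t lag_id)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_map.find(lag_id);
        if (it == m_map.end() || it->second.lcp_created)
            return false;
        it->second.lcp_created = true;
        return true;
    }

private:
    mutable std::mutex m_mutex;
    std::map<uint64_t, BondInfo> m_map;
};
```

With this shape, only the caller for which `claim_lcp_creation` returns true would go on to create the lcp pair and install the tc filter.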
diff --git a/saivpp/src/SwitchStateBaseFdb.cpp b/saivpp/src/SwitchStateBaseFdb.cpp
index 2a474cd..b223f52 100644
--- a/saivpp/src/SwitchStateBaseFdb.cpp
+++ b/saivpp/src/SwitchStateBaseFdb.cpp
@@ -18,6 +18,7 @@
#include "swss/logger.h"
#include "swss/select.h"
+#include "swss/exec.h"
#include "sai_serialize.h"
#include "NotificationFdbEvent.h"
@@ -723,7 +724,9 @@ sai_status_t SwitchStateBase::vpp_create_vlan_member(
return SAI_STATUS_FAILURE;
}
} else if (obj_type == SAI_OBJECT_TYPE_LAG) {
- lag_swif_idx = lag_to_bond_if_idx(port_id);
+ platform_bond_info_t bond_info;
+ CHECK_STATUS(get_lag_bond_info(port_id, bond_info));
+ lag_swif_idx = bond_info.sw_if_index;
SWSS_LOG_NOTICE("lag swif idx :%d",lag_swif_idx);
hwifname = vpp_get_swif_name(lag_swif_idx);
SWSS_LOG_NOTICE("lag swif idx :%d swif_name:%s",lag_swif_idx, hwifname);
@@ -908,7 +911,9 @@ sai_status_t SwitchStateBase::vpp_remove_vlan_member(
return SAI_STATUS_FAILURE;
}
} else if (obj_type == SAI_OBJECT_TYPE_LAG) {
- uint32_t lag_swif_idx = lag_to_bond_if_idx(port_id);
+ platform_bond_info_t bond_info;
+ CHECK_STATUS(get_lag_bond_info(port_id, bond_info));
+ uint32_t lag_swif_idx = bond_info.sw_if_index;
SWSS_LOG_NOTICE("lag swif idx :%d",lag_swif_idx);
hw_ifname = vpp_get_swif_name(lag_swif_idx);
SWSS_LOG_NOTICE("lag swif idx :%d swif_name:%s",lag_swif_idx, hw_ifname);
@@ -1118,16 +1123,16 @@ sai_status_t SwitchStateBase::vpp_delete_bvi_interface(
return SAI_STATUS_SUCCESS;
}
-uint32_t SwitchStateBase::lag_to_bond_if_idx (const sai_object_id_t lag_id)
+sai_status_t SwitchStateBase::get_lag_bond_info(const sai_object_id_t lag_id, platform_bond_info_t &bond_info)
{
auto it = m_lag_bond_map.find(lag_id);
-
if (it == m_lag_bond_map.end())
{
- SWSS_LOG_ERROR("failed to find bond if idx for lag id: %x",lag_id);
- return ~0;
+ SWSS_LOG_ERROR("failed to find bond info for lag id: %s", sai_serialize_object_id(lag_id).c_str());
+ return SAI_STATUS_ITEM_NOT_FOUND;
}
- return it->second;
+ bond_info = it->second;
+ return SAI_STATUS_SUCCESS;
}
int SwitchStateBase::remove_lag_to_bond_entry(const sai_object_id_t lag_oid)
@@ -1159,28 +1164,86 @@ sai_status_t SwitchStateBase:: createLag(
}
+uint32_t SwitchStateBase::find_bond_id()
+{
+ std::stringstream cmd;
+ std::string res;
+ uint32_t bond_id = ~0;
+
+ // Get list of PortChannels from ip command
+ cmd << IP_CMD << " -o link show | awk -F': ' '{print $2}' | grep " << PORTCHANNEL_PREFIX;
+
+ int ret = swss::exec(cmd.str(), res);
+ if (ret) {
+ SWSS_LOG_ERROR("Command '%s' failed with rc %d", cmd.str().c_str(), ret);
+ return ~0;
+ }
+
+ if (res.length() == 0) {
+ SWSS_LOG_ERROR("No PortChannels found in output of command '%s': %s", cmd.str().c_str(), res.c_str());
+ return ~0;
+ }
+
+ SWSS_LOG_DEBUG("Output of ip command: %s", res.c_str());
+
+ std::unordered_set<uint32_t> existing_bond_ids;
+ for (const auto& entry : m_lag_bond_map) {
+ existing_bond_ids.insert(entry.second.id);
+ }
+
+ std::istringstream iss(res);
+ std::string line;
+ bool found_new_bond_id = false;
+ while (std::getline(iss, line)) {
+ std::string portchannel_name = line.substr(0, line.find('\n'));
+ bond_id = std::stoi(portchannel_name.substr(strlen(PORTCHANNEL_PREFIX)));
+
+ if (existing_bond_ids.find(bond_id) == existing_bond_ids.end()) {
+ SWSS_LOG_NOTICE("Found new bond id from PortChannel name: %d", bond_id);
+ found_new_bond_id = true;
+ break;
+ }
+ }
+
+ return found_new_bond_id ? bond_id : ~0;
+}
+
sai_status_t SwitchStateBase::vpp_create_lag(
- _In_ sai_object_id_t lag_id,
+ _In_ sai_object_id_t lag_id,
_In_ uint32_t attr_count,
_In_ const sai_attribute_t *attr_list)
{
- uint32_t bond_id, mode, lb;
+ uint32_t mode, lb;
+ uint32_t bond_id = ~0;
uint32_t swif_idx = ~0;
const char *hw_ifname;
SWSS_LOG_ENTER();
- //set mode and lb. ONiC config does not have provision to pass mode and load balancing algorithm
- mode = VPP_BOND_API_MODE_ROUND_ROBIN;
+ LAG_MUTEX;
- //if LACP is to be enabled in VPP
- //mode = VPP_BOND_API_MODE_LACP;
+ // Extract bond_id from PortChannel name
+ bond_id = find_bond_id();
+ if (bond_id == ~0)
+ {
+ SWSS_LOG_ERROR("Bond id could not be found");
+ return SAI_STATUS_FAILURE;
+ }
+
+ // Set mode and lb. SONiC config does not have provision to pass mode and load balancing algorithm
+ mode = VPP_BOND_API_MODE_XOR;
lb = VPP_BOND_API_LB_ALGO_L23;
- bond_id = ~0;
create_bond_interface(bond_id, mode, lb, &swif_idx);
- SWSS_LOG_NOTICE("vpp bond interfae created if index:%d\n", swif_idx);
- //update the lag to bond map
- m_lag_bond_map[lag_id] = swif_idx;
+ if (swif_idx == ~0)
+ {
+ SWSS_LOG_ERROR("failed to create bond interface in VPP for %s", sai_serialize_object_id(lag_id).c_str());
+ return SAI_STATUS_FAILURE;
+ }
+
+ // Update the lag to bond map
+ platform_bond_info_t bond_info = {swif_idx, bond_id, false};
+ m_lag_bond_map[lag_id] = bond_info;
+ SWSS_LOG_NOTICE("vpp bond interface created for lag_id:%s, swif index:%d, bond_id:%d\n", sai_serialize_object_id(lag_id).c_str(), swif_idx, bond_id);
refresh_interfaces_list();
// Set the bond interface state up
@@ -1190,12 +1253,12 @@ sai_status_t SwitchStateBase::vpp_create_lag(
return SAI_STATUS_SUCCESS;
}
-sai_status_t SwitchStateBase:: removeLag(
+sai_status_t SwitchStateBase::removeLag(
_In_ sai_object_id_t lag_oid)
{
SWSS_LOG_ENTER();
- vpp_remove_lag(lag_oid);
+ CHECK_STATUS_QUIET(vpp_remove_lag(lag_oid));
auto sid = sai_serialize_object_id(lag_oid);
CHECK_STATUS(remove_internal(SAI_OBJECT_TYPE_LAG, sid));
return SAI_STATUS_SUCCESS;
@@ -1204,10 +1267,14 @@ sai_status_t SwitchStateBase:: removeLag(
sai_status_t SwitchStateBase::vpp_remove_lag(
_In_ sai_object_id_t lag_oid)
{
+ int ret;
SWSS_LOG_ENTER();
- uint32_t lag_swif_idx = lag_to_bond_if_idx(lag_oid);
- SWSS_LOG_NOTICE("lag swif idx :%d",lag_swif_idx);
+ LAG_MUTEX;
+
+ platform_bond_info_t bond_info;
+ CHECK_STATUS(get_lag_bond_info(lag_oid, bond_info));
+ uint32_t lag_swif_idx = bond_info.sw_if_index;
auto lag_ifname = vpp_get_swif_name(lag_swif_idx);
SWSS_LOG_NOTICE("lag swif idx :%d swif_name:%s",lag_swif_idx, lag_ifname);
if (lag_ifname == NULL)
@@ -1216,8 +1283,13 @@ sai_status_t SwitchStateBase::vpp_remove_lag(
return SAI_STATUS_FAILURE;
}
- //Delete the Bond interface
- delete_bond_interface(lag_ifname);
+ //Delete the Bond interface (also deletes the lcp pair)
+ ret = delete_bond_interface(lag_ifname);
+ if (ret != 0)
+ {
+ SWSS_LOG_ERROR("failed to delete bond interface in VPP for %s", sai_serialize_object_id(lag_oid).c_str());
+ return SAI_STATUS_FAILURE;
+ }
remove_lag_to_bond_entry(lag_oid);
refresh_interfaces_list();
@@ -1245,7 +1317,9 @@ sai_status_t SwitchStateBase::vpp_create_lag_member(
{
bool is_long_timeout = false;
bool is_passive = false;
+ int ret;
uint32_t bond_if_idx;
+ uint32_t bond_id;
sai_object_id_t lag_oid, lag_port_oid;
SWSS_LOG_ENTER();
@@ -1266,15 +1340,18 @@ sai_status_t SwitchStateBase::vpp_create_lag_member(
sai_serialize_object_type(obj_type).c_str());
return SAI_STATUS_FAILURE;
}
- bond_if_idx = lag_to_bond_if_idx(lag_oid);
- SWSS_LOG_NOTICE("bond if index is %d\n",bond_if_idx);
+
+ platform_bond_info_t bond_info;
+ CHECK_STATUS(get_lag_bond_info(lag_oid, bond_info));
+ bond_if_idx = bond_info.sw_if_index;
+ SWSS_LOG_NOTICE("bond if index is %d\n", bond_if_idx);
attr_type = sai_metadata_get_attr_by_id(SAI_LAG_MEMBER_ATTR_PORT_ID, attr_count, attr_list);
if (attr_type == NULL)
{
SWSS_LOG_ERROR("attr SAI_LAG_MEMBER_ATTR_PORT_ID was not present\n");
- return SAI_STATUS_FAILURE;
+ return SAI_STATUS_FAILURE;
}
lag_port_oid = attr_type->value.oid;
@@ -1294,13 +1371,55 @@ sai_status_t SwitchStateBase::vpp_create_lag_member(
if (found == true)
{
hwifname = tap_to_hwif_name(if_name.c_str());
- SWSS_LOG_NOTICE("hwif name for port is %s",hwifname);
+ SWSS_LOG_NOTICE("hwif name for port is %s",hwifname);
}else {
SWSS_LOG_NOTICE("No ports found for lag port id :%s",sai_serialize_object_id(lag_port_oid).c_str());
return SAI_STATUS_FAILURE;
}
- create_bond_member(bond_if_idx, hwifname,is_passive,is_long_timeout);
+ LAG_MUTEX;
+
+ ret = create_bond_member(bond_if_idx, hwifname, is_passive, is_long_timeout);
+ if (ret != 0)
+ {
+ SWSS_LOG_ERROR("failed to add bond member in VPP for %s", sai_serialize_object_id(lag_port_oid).c_str());
+ return SAI_STATUS_FAILURE;
+ }
+
+ if (!bond_info.lcp_created) {
+ // create tap and lcp for the Bond intf after first member is added to ensure tap mac = member mac = bond mac
+ std::ostringstream tap_stream;
+ bond_id = bond_info.id;
+ tap_stream << "be" << bond_id;
+ std::string tap = tap_stream.str();
+
+ const char *hw_ifname;
+ hw_ifname = vpp_get_swif_name(bond_if_idx);
+ configure_lcp_interface(hw_ifname, tap.c_str(), true);
+
+ // add tc filter to redirect traffic from tap to PortChannel
+ std::stringstream cmd;
+ std::string res;
+
+ cmd << "tc qdisc add dev be" << bond_id << " ingress";
+ ret = swss::exec(cmd.str(), res);
+ if (ret) {
+ SWSS_LOG_ERROR("Command '%s' failed with rc %d", cmd.str().c_str(), ret);
+ }
+
+ cmd.str("");
+ cmd.clear();
+ cmd << "tc filter add dev be" << bond_id << " parent ffff: protocol all prio 2 u32 match u32 0 0 flowid 1:1 action mirred ingress redirect dev PortChannel" << bond_id;
+ ret = swss::exec(cmd.str(), res);
+ if (ret) {
+ SWSS_LOG_ERROR("Command '%s' failed with rc %d", cmd.str().c_str(), ret);
+ }
+
+ // update the lag to bond map
+ bond_info.lcp_created = true;
+ m_lag_bond_map[lag_oid] = bond_info;
+ }
+
return SAI_STATUS_SUCCESS;
}
@@ -1309,7 +1428,7 @@ sai_status_t SwitchStateBase::removeLagMember(
{
SWSS_LOG_ENTER();
- vpp_remove_lag_member(lag_member_oid);
+ CHECK_STATUS_QUIET(vpp_remove_lag_member(lag_member_oid));
auto sid = sai_serialize_object_id(lag_member_oid);
@@ -1321,6 +1440,7 @@ sai_status_t SwitchStateBase::removeLagMember(
sai_status_t SwitchStateBase::vpp_remove_lag_member(
_In_ sai_object_id_t lag_member_oid)
{
+ int ret;
SWSS_LOG_ENTER();
sai_attribute_t attr;
@@ -1377,7 +1497,13 @@ sai_status_t SwitchStateBase::vpp_remove_lag_member(
return SAI_STATUS_FAILURE;
}
- delete_bond_member(lag_member_ifname);
+ ret = delete_bond_member(lag_member_ifname);
+ if (ret != 0)
+ {
+ SWSS_LOG_ERROR("failed to delete bond member in VPP for %s", sai_serialize_object_id(port_oid).c_str());
+ return SAI_STATUS_FAILURE;
+ }
+
return SAI_STATUS_SUCCESS;
}
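
`find_bond_id` above parses the PortChannel suffix with `std::stoi`, which throws if the suffix is not numeric. The following is a minimal sketch of the same selection logic as a pure function that skips malformed names instead. `parse_portchannel_id` and `find_new_bond_id` are hypothetical names; the input is assumed to be one interface name per line, as produced by the `ip ... | awk ... | grep` pipeline in the change.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_set>

// Parse "PortChannel<N>" into N; returns false for any other shape.
bool parse_portchannel_id(const std::string &name, uint32_t &id)
{
    static const std::string prefix = "PortChannel";
    if (name.size() <= prefix.size() || name.compare(0, prefix.size(), prefix) != 0)
        return false;

    uint64_t value = 0;
    for (size_t i = prefix.size(); i < name.size(); ++i) {
        if (name[i] < '0' || name[i] > '9')
            return false;                 // non-digit in the suffix
        value = value * 10 + static_cast<uint64_t>(name[i] - '0');
        if (value > UINT32_MAX)
            return false;                 // does not fit in a bond id
    }
    id = static_cast<uint32_t>(value);
    return true;
}

// Set new_id to the first PortChannel id in `ip_output` that is not in
// `known_ids`; returns false when there is none.
bool find_new_bond_id(const std::string &ip_output,
                      const std::unordered_set<uint32_t> &known_ids,
                      uint32_t &new_id)
{
    std::istringstream iss(ip_output);
    std::string line;
    while (std::getline(iss, line)) {
        uint32_t id = 0;
        if (!parse_portchannel_id(line, id))
            continue;                     // skip empty or malformed lines
        if (known_ids.count(id) == 0) {
            new_id = id;
            return true;
        }
    }
    return false;
}
```

Note that a zero-padded name parses to the same id as its unpadded form, so rebuilding the PortChannel name from the id alone would not round-trip for such names; the sketch does not address that.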
diff --git a/saivpp/src/SwitchStateBaseHostif.cpp b/saivpp/src/SwitchStateBaseHostif.cpp
index c7266af..95c9900 100644
--- a/saivpp/src/SwitchStateBaseHostif.cpp
+++ b/saivpp/src/SwitchStateBaseHostif.cpp
@@ -708,7 +708,7 @@ sai_status_t SwitchStateBase::vpp_create_hostif_tap_interface(
return SAI_STATUS_FAILURE;
}
- SWSS_LOG_ERROR("created TAP device for %s, fd: %d", name.c_str(), tapfd);
+ SWSS_LOG_NOTICE("created TAP device for %s, fd: %d", name.c_str(), tapfd);
const char *dev = name.c_str();
const char *hwif_name = tap_to_hwif_name(dev);
diff --git a/saivpp/src/SwitchStateBaseRif.cpp b/saivpp/src/SwitchStateBaseRif.cpp
index d449bfe..99a92d3 100644
--- a/saivpp/src/SwitchStateBaseRif.cpp
+++ b/saivpp/src/SwitchStateBaseRif.cpp
@@ -40,8 +40,6 @@
using namespace saivpp;
-#define IP_CMD "/sbin/ip"
-
int SwitchStateBase::currentMaxInstance = 0;
IpVrfInfo::IpVrfInfo(
@@ -364,6 +362,18 @@ bool SwitchStateBase::vpp_get_hwif_name (
_In_ uint32_t vlan_id,
_Out_ std::string& ifname)
{
+
+ if (objectTypeQuery(object_id) == SAI_OBJECT_TYPE_LAG) {
+ platform_bond_info_t bond_info;
+ sai_status_t status = get_lag_bond_info(object_id, bond_info);
+ if (status != SAI_STATUS_SUCCESS)
+ {
+ return false;
+ }
+ ifname = std::string(BONDETHERNET_PREFIX) + std::to_string(bond_info.id);
+ return true;
+ }
+
std::string if_name;
bool found = getTapNameFromPortId(object_id, if_name);
@@ -979,11 +989,16 @@ sai_status_t SwitchStateBase::vpp_add_del_intf_ip_addr_norif (
const char *hwifname;
char hw_subifname[32];
char hw_bviifname[32];
+ char hw_bondifname[32];
const char *hw_ifname;
if (full_if_name.compare(0, vlan_prefix.length(), vlan_prefix) == 0)
{
snprintf(hw_bviifname, sizeof(hw_bviifname), "%s%d","bvi",vlan_id);
hw_ifname = hw_bviifname;
+ } else if (full_if_name.compare(0, strlen(PORTCHANNEL_PREFIX), PORTCHANNEL_PREFIX) == 0) {
+ uint32_t bond_id = std::stoi(full_if_name.substr(strlen(PORTCHANNEL_PREFIX)));
+ snprintf(hw_bondifname, sizeof(hw_bondifname), "%s%d", BONDETHERNET_PREFIX, bond_id);
+ hw_ifname = hw_bondifname;
} else {
hwifname = tap_to_hwif_name(if_name.c_str());
if (vlan_id) {
@@ -993,6 +1008,7 @@ sai_status_t SwitchStateBase::vpp_add_del_intf_ip_addr_norif (
hw_ifname = hwifname;
}
}
+ SWSS_LOG_NOTICE("Setting ip on hw_ifname %s", hw_ifname);
int ret = interface_ip_address_add_del(hw_ifname, &vpp_ip_prefix, is_add);
@@ -1465,6 +1481,43 @@ int SwitchStateBase::vpp_get_vrf_id (const char *linux_ifname, uint32_t *vrf_id)
return 0;
}
+sai_status_t SwitchStateBase::vpp_set_interface_attributes(
+ _In_ sai_object_id_t obj_id,
+ _In_ uint32_t attr_count,
+ _In_ const sai_attribute_t *attr_list,
+ _In_ uint16_t vlan_id)
+{
+ auto attr_type_mtu = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_MTU, attr_count, attr_list);
+
+ if (attr_type_mtu != NULL)
+ {
+ vpp_set_interface_mtu(obj_id, vlan_id, attr_type_mtu->value.u32, AF_INET);
+ vpp_set_interface_mtu(obj_id, vlan_id, attr_type_mtu->value.u32, AF_INET6);
+ }
+
+ bool v4_is_up = false, v6_is_up = false;
+
+ auto attr_type_v4 = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_ADMIN_V4_STATE, attr_count, attr_list);
+
+ if (attr_type_v4 != NULL)
+ {
+ v4_is_up = attr_type_v4->value.booldata;
+ }
+ auto attr_type_v6 = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_ADMIN_V6_STATE, attr_count, attr_list);
+
+ if (attr_type_v6 != NULL)
+ {
+ v6_is_up = attr_type_v6->value.booldata;
+ }
+
+ if (attr_type_v4 != NULL || attr_type_v6 != NULL)
+ {
+ return vpp_set_interface_state(obj_id, vlan_id, (v4_is_up || v6_is_up));
+ } else {
+ return SAI_STATUS_SUCCESS;
+ }
+}
+
sai_status_t SwitchStateBase::vpp_create_router_interface(
_In_ uint32_t attr_count,
_In_ const sai_attribute_t *attr_list)
@@ -1511,9 +1564,9 @@ sai_status_t SwitchStateBase::vpp_create_router_interface(
return SAI_STATUS_SUCCESS;
}
- if (ot != SAI_OBJECT_TYPE_PORT)
+ if (ot != SAI_OBJECT_TYPE_PORT && ot != SAI_OBJECT_TYPE_LAG)
{
- SWSS_LOG_ERROR("SAI_ROUTER_INTERFACE_ATTR_PORT_ID=%s expected to be PORT but is: %s",
+ SWSS_LOG_ERROR("SAI_ROUTER_INTERFACE_ATTR_PORT_ID=%s expected to be PORT or LAG but is: %s",
sai_serialize_object_id(obj_id).c_str(),
sai_serialize_object_type(ot).c_str());
@@ -1534,7 +1587,16 @@ sai_status_t SwitchStateBase::vpp_create_router_interface(
}
std::string if_name;
- bool found = getTapNameFromPortId(obj_id, if_name);
+ bool found = false;
+ platform_bond_info_t bond_info;
+ if (ot == SAI_OBJECT_TYPE_LAG) {
+ CHECK_STATUS(get_lag_bond_info(obj_id, bond_info));
+ if_name = std::string(PORTCHANNEL_PREFIX) + std::to_string(bond_info.id);
+ found = true;
+ } else {
+ found = getTapNameFromPortId(obj_id, if_name);
+ }
+
if (found == false)
{
SWSS_LOG_ERROR("host interface for port id %s not found", sai_serialize_object_id(obj_id).c_str());
@@ -1559,6 +1621,7 @@ sai_status_t SwitchStateBase::vpp_create_router_interface(
} else {
linux_ifname = dev;
}
+
sai_object_id_t vrf_obj_id = 0;
auto attr_vrf_id = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_VIRTUAL_ROUTER_ID, attr_count, attr_list);
@@ -1577,37 +1640,19 @@ sai_status_t SwitchStateBase::vpp_create_router_interface(
vpp_add_ip_vrf(vrf_obj_id, vrf_id);
if (ret == 0 && vrf_id != 0) {
- set_interface_vrf(tap_to_hwif_name(dev), vlan_id, vrf_id, false);
- }
- auto attr_type_mtu = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_MTU, attr_count, attr_list);
-
- if (attr_type_mtu != NULL)
- {
- vpp_set_interface_mtu(obj_id, vlan_id, attr_type_mtu->value.u32, AF_INET);
- vpp_set_interface_mtu(obj_id, vlan_id, attr_type_mtu->value.u32, AF_INET6);
- }
-
- bool v4_is_up = false, v6_is_up = false;
-
- auto attr_type_v4 = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_ADMIN_V4_STATE, attr_count, attr_list);
-
- if (attr_type_v4 != NULL)
- {
- v4_is_up = attr_type_v4->value.booldata;
- }
- auto attr_type_v6 = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_ADMIN_V6_STATE, attr_count, attr_list);
-
- if (attr_type_v6 != NULL)
- {
- v6_is_up = attr_type_v6->value.booldata;
+ const char *hwif_name;
+ char hw_bondifname[32];
+ if (ot == SAI_OBJECT_TYPE_LAG) {
+ snprintf(hw_bondifname, sizeof(hw_bondifname), "%s%d", BONDETHERNET_PREFIX, bond_info.id);
+ hwif_name = hw_bondifname;
+ } else {
+ hwif_name = tap_to_hwif_name(dev);
+ }
+ SWSS_LOG_NOTICE("Setting interface vrf on hwif_name %s", hwif_name);
+ set_interface_vrf(hwif_name, vlan_id, vrf_id, false);
}
- if (attr_type_v4 != NULL || attr_type_v6 != NULL)
- {
- return vpp_set_interface_state(obj_id, vlan_id, (v4_is_up || v6_is_up));
- } else {
- return SAI_STATUS_SUCCESS;
- }
+ return vpp_set_interface_attributes(obj_id, attr_count, attr_list, vlan_id);
}
sai_status_t SwitchStateBase::vpp_update_router_interface(
@@ -1650,9 +1695,9 @@ sai_status_t SwitchStateBase::vpp_update_router_interface(
return SAI_STATUS_SUCCESS;
}
- if (ot != SAI_OBJECT_TYPE_PORT)
+ if (ot != SAI_OBJECT_TYPE_PORT && ot != SAI_OBJECT_TYPE_LAG)
{
- SWSS_LOG_ERROR("SAI_ROUTER_INTERFACE_ATTR_PORT_ID=%s expected to be PORT but is: %s",
+ SWSS_LOG_ERROR("SAI_ROUTER_INTERFACE_ATTR_PORT_ID=%s expected to be PORT or LAG but is: %s",
sai_serialize_object_id(obj_id).c_str(),
sai_serialize_object_type(ot).c_str());
@@ -1679,35 +1724,7 @@ sai_status_t SwitchStateBase::vpp_update_router_interface(
uint16_t vlan_id = attr.value.u16;
- auto attr_type_mtu = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_MTU, attr_count, attr_list);
-
- if (attr_type_mtu != NULL)
- {
- vpp_set_interface_mtu(obj_id, vlan_id, attr_type_mtu->value.u32, AF_INET);
- vpp_set_interface_mtu(obj_id, vlan_id, attr_type_mtu->value.u32, AF_INET6);
- }
-
- bool v4_is_up = false, v6_is_up = false;
-
- auto attr_type_v4 = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_ADMIN_V4_STATE, attr_count, attr_list);
-
- if (attr_type_v4 != NULL)
- {
- v4_is_up = attr_type_v4->value.booldata;
- }
- auto attr_type_v6 = sai_metadata_get_attr_by_id(SAI_ROUTER_INTERFACE_ATTR_ADMIN_V6_STATE, attr_count, attr_list);
-
- if (attr_type_v6 != NULL)
- {
- v6_is_up = attr_type_v6->value.booldata;
- }
-
- if (attr_type_v4 != NULL || attr_type_v6 != NULL)
- {
- return vpp_set_interface_state(obj_id, vlan_id, (v4_is_up || v6_is_up));
- } else {
- return SAI_STATUS_SUCCESS;
- }
+ return vpp_set_interface_attributes(obj_id, attr_count, attr_list, vlan_id);
}
sai_status_t SwitchStateBase::vpp_router_interface_remove_vrf(
@@ -1716,7 +1733,16 @@ sai_status_t SwitchStateBase::vpp_router_interface_remove_vrf(
SWSS_LOG_ENTER();
std::string if_name;
- bool found = getTapNameFromPortId(obj_id, if_name);
+ bool found = false;
+ platform_bond_info_t bond_info;
+ if (objectTypeQuery(obj_id) == SAI_OBJECT_TYPE_LAG) {
+ CHECK_STATUS(get_lag_bond_info(obj_id, bond_info));
+ if_name = std::string(PORTCHANNEL_PREFIX) + std::to_string(bond_info.id);
+ found = true;
+ } else {
+ found = getTapNameFromPortId(obj_id, if_name);
+ }
+
if (found == false)
{
SWSS_LOG_ERROR("host interface for port id %s not found", sai_serialize_object_id(obj_id).c_str());
@@ -1726,9 +1752,16 @@ sai_status_t SwitchStateBase::vpp_router_interface_remove_vrf(
linux_ifname = if_name.c_str();
- const char *hwif_name = tap_to_hwif_name(if_name.c_str());
+ const char *hwif_name;
+ char hw_bondifname[32];
+ if (objectTypeQuery(obj_id) == SAI_OBJECT_TYPE_LAG) {
+ snprintf(hw_bondifname, sizeof(hw_bondifname), "%s%d", BONDETHERNET_PREFIX, bond_info.id);
+ hwif_name = hw_bondifname;
+ } else {
+ hwif_name = tap_to_hwif_name(if_name.c_str());
+ }
- SWSS_LOG_NOTICE("Resetting to default vrf for interface %s", linux_ifname);
+ SWSS_LOG_NOTICE("Resetting to default vrf for interface %s, %s", linux_ifname, hwif_name);
uint32_t vrf_id = 0;
/* For now support is only for ipv4 tables */
@@ -1780,9 +1813,9 @@ sai_status_t SwitchStateBase::vpp_remove_router_interface(sai_object_id_t rif_id
return SAI_STATUS_SUCCESS;
}
- if (ot != SAI_OBJECT_TYPE_PORT)
+ if (ot != SAI_OBJECT_TYPE_PORT && ot != SAI_OBJECT_TYPE_LAG)
{
- SWSS_LOG_ERROR("SAI_ROUTER_INTERFACE_ATTR_PORT_ID=%s expected to be PORT but is: %s",
+ SWSS_LOG_ERROR("SAI_ROUTER_INTERFACE_ATTR_PORT_ID=%s expected to be PORT or LAG but is: %s",
sai_serialize_object_id(obj_id).c_str(),
sai_serialize_object_type(ot).c_str());
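
The `BondEthernet<id>` name is built with `snprintf` in several places in this file (`vpp_add_del_intf_ip_addr_norif`, `vpp_create_router_interface`, `vpp_router_interface_remove_vrf`), and the `PortChannel<id>` and `be<id>` names are built elsewhere in the change. A small sketch of how the X=Y naming convention could live in one place follows. All four function names are hypothetical, and the string prefixes are assumed to mirror `PORTCHANNEL_PREFIX`, `BONDETHERNET_PREFIX` and the `be` tap prefix used in `vpp_create_lag_member`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helpers: one function per name derived from a bond id.
inline std::string portchannel_name(uint32_t bond_id)
{
    return "PortChannel" + std::to_string(bond_id);   // SONiC netdev name
}

inline std::string bond_hwif_name(uint32_t bond_id)
{
    return "BondEthernet" + std::to_string(bond_id);  // VPP interface name
}

inline std::string bond_tap_name(uint32_t bond_id)
{
    return "be" + std::to_string(bond_id);            // linux-cp tap name
}

// Pick the VPP interface name for a router interface: the bond name for a
// LAG, otherwise the hardware name already resolved for the port.
inline std::string rif_hwif_name(bool is_lag, uint32_t bond_id,
                                 const std::string &port_hwif)
{
    return is_lag ? bond_hwif_name(bond_id) : port_hwif;
}
```

Returning `std::string` avoids the fixed 32-byte buffers; callers that need a `const char *` can call `c_str()` on a local copy.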
diff --git a/saivpp/src/vppxlate/SaiVppXlate.c b/saivpp/src/vppxlate/SaiVppXlate.c
index 7c70e6a..a3350cb 100644
--- a/saivpp/src/vppxlate/SaiVppXlate.c
+++ b/saivpp/src/vppxlate/SaiVppXlate.c
@@ -330,9 +330,10 @@
void classify_get_trace_chain(void ){}
void os_exit(int code) {}
-#define SAIVPP_DEBUG(format,args...) {}
-#define SAIVPP_WARN clib_warning
-#define SAIVPP_ERROR clib_error
+//#define SAIVPP_DEBUG(format,args...) {}
+#define SAIVPP_DEBUG(format,args...) clib_warning("PID: %d, TID: %ld, " format, getpid(), syscall(SYS_gettid), ##args)
+#define SAIVPP_WARN(format,args...) clib_warning("PID: %d, TID: %ld, " format, getpid(), syscall(SYS_gettid), ##args)
+#define SAIVPP_ERROR(format,args...) clib_error("PID: %d, TID: %ld, " format, getpid(), syscall(SYS_gettid), ##args)
/**
* Wait for result and retry if necessary. The retry is necessary because there could be unsolicited
@@ -3463,7 +3464,7 @@ int create_bond_interface(uint32_t bond_id, uint32_t mode, uint32_t lb, uint32_t
S (mp);
- W (ret);
+ WR (ret);
VPP_UNLOCK();
@@ -3503,7 +3504,7 @@ int delete_bond_interface(const char *hwif_name)
S (mp);
- W (ret);
+ WR (ret);
VPP_UNLOCK();
@@ -3516,7 +3517,7 @@ int create_bond_member(uint32_t bond_sw_if_index, const char *hwif_name, bool is
int ret;
- SAIVPP_WARN("Adding member to bond interface: \n");
+ SAIVPP_WARN("Adding member %s to bond interface %u\n", hwif_name, bond_sw_if_index);
VPP_LOCK();
__plugin_msg_base = bond_msg_id_base;
@@ -3545,7 +3546,7 @@ int create_bond_member(uint32_t bond_sw_if_index, const char *hwif_name, bool is
S (mp);
- W (ret);
+ WR (ret);
VPP_UNLOCK();
@@ -3589,7 +3590,7 @@ int delete_bond_member(const char * hwif_name)
S (mp);
- W (ret);
+ WR (ret);
VPP_UNLOCK();
diff --git a/vppbld/vpp.patch b/vppbld/vpp.patch
index 74a6016..72be36b 100644
--- a/vppbld/vpp.patch
+++ b/vppbld/vpp.patch
@@ -1,8 +1,8 @@
diff --git a/Makefile b/Makefile
-index 98866e9be..f9d5b349e 100644
+index 3144905f3..04560cf2c 100644
--- a/Makefile
+++ b/Makefile
-@@ -76,7 +76,7 @@ DEB_DEPENDS += libffi-dev python3-ply libunwind-dev
+@@ -78,7 +78,7 @@ DEB_DEPENDS += libffi-dev python3-ply libunwind-dev
DEB_DEPENDS += cmake ninja-build python3-jsonschema python3-yaml
DEB_DEPENDS += python3-venv # ensurepip
DEB_DEPENDS += python3-dev python3-pip
@@ -11,7 +11,7 @@ index 98866e9be..f9d5b349e 100644
# DEB_DEPENDS += enchant # for docs
DEB_DEPENDS += python3-virtualenv
DEB_DEPENDS += libssl-dev
-@@ -85,7 +85,7 @@ DEB_DEPENDS += iperf3 # for 'make test TEST=vcl'
+@@ -87,7 +87,7 @@ DEB_DEPENDS += iperf3 # for 'make test TEST=vcl'
DEB_DEPENDS += nasm
DEB_DEPENDS += iperf ethtool # for 'make test TEST=vm_vpp_interfaces'
DEB_DEPENDS += libpcap-dev
@@ -21,10 +21,10 @@ index 98866e9be..f9d5b349e 100644
LIBFFI=libffi6 # works on all but 20.04 and debian-testing
diff --git a/build/external/packages/xdp-tools.mk b/build/external/packages/xdp-tools.mk
-index b9285971f..c38acc598 100644
+index 08d94e424..1fbbef9b7 100644
--- a/build/external/packages/xdp-tools.mk
+++ b/build/external/packages/xdp-tools.mk
-@@ -24,7 +24,7 @@ define xdp-tools_config_cmds
+@@ -25,7 +25,7 @@ define xdp-tools_config_cmds
endef
define xdp-tools_build_cmds
@@ -33,6 +33,254 @@ index b9285971f..c38acc598 100644
endef
define xdp-tools_install_cmds
+diff --git a/src/plugins/lacp/lacp.c b/src/plugins/lacp/lacp.c
+index ba66f7b24..3d4316b13 100644
+--- a/src/plugins/lacp/lacp.c
++++ b/src/plugins/lacp/lacp.c
+@@ -24,6 +24,7 @@
+ #include
+
+ lacp_main_t lacp_main;
++__clib_export int lacp_plugin_enabled = 1;
+
+ /*
+ * Generate lacp pdu
+diff --git a/src/plugins/linux-cp/lcp_interface.c b/src/plugins/linux-cp/lcp_interface.c
+index 61665ad41..ec4324acd 100644
+--- a/src/plugins/linux-cp/lcp_interface.c
++++ b/src/plugins/linux-cp/lcp_interface.c
+@@ -1216,6 +1216,15 @@ lcp_itf_pair_link_up_down (vnet_main_t *vnm, u32 hw_if_index, u32 flags)
+
+ VNET_HW_INTERFACE_LINK_UP_DOWN_FUNCTION (lcp_itf_pair_link_up_down);
+
++static bool
++is_lacp_plugin_enabled ()
++{
++ int *lacp_plugin_enabled =
++ vlib_get_plugin_symbol ("lacp_plugin.so", "lacp_plugin_enabled");
++
++ return lacp_plugin_enabled ? *lacp_plugin_enabled : false;
++}
++
+ static clib_error_t *
+ lcp_interface_init (vlib_main_t *vm)
+ {
+@@ -1233,6 +1242,17 @@ lcp_interface_init (vlib_main_t *vm)
+ tcp_punt_unknown (vm, 0, 1);
+ tcp_punt_unknown (vm, 1, 1);
+
++ /* punt/x-connect LACP pkts via linux-cp-punt-xc if lacp_plugin disabled */
++ if (!is_lacp_plugin_enabled ())
++ {
++ vlib_node_t *n = vlib_get_node_by_name (vm, (u8 *) "linux-cp-punt-xc");
++ if (n)
++ {
++ ethernet_register_input_type (vm, ETHERNET_TYPE_SLOW_PROTOCOLS,
++ n->index);
++ }
++ }
++
+ lcp_itf_pair_logger = vlib_log_register_class ("linux-cp", "itf");
+
+ return NULL;
+diff --git a/src/plugins/linux-cp/lcp_node.c b/src/plugins/linux-cp/lcp_node.c
+index 241cc5e4b..6545f567a 100644
+--- a/src/plugins/linux-cp/lcp_node.c
++++ b/src/plugins/linux-cp/lcp_node.c
+@@ -39,40 +39,51 @@
+
+ typedef enum
+ {
+-#define _(sym, str) LIP_PUNT_NEXT_##sym,
++#define _(sym, str) LIP_PUNT_XC_NEXT_##sym,
+ foreach_lip_punt
+ #undef _
+- LIP_PUNT_N_NEXT,
+-} lip_punt_next_t;
++ LIP_PUNT_XC_N_NEXT,
++} lip_punt_xc_next_t;
+
+-typedef struct lip_punt_trace_t_
++typedef struct lip_punt_xc_trace_t_
+ {
++ u8 direction; // 0 = punt phy to host (default), 1 = xc host to phy
+ u32 phy_sw_if_index;
+ u32 host_sw_if_index;
+-} lip_punt_trace_t;
++} lip_punt_xc_trace_t;
+
+ /* packet trace format function */
+ static u8 *
+-format_lip_punt_trace (u8 *s, va_list *args)
++format_lip_punt_xc_trace (u8 *s, va_list *args)
+ {
+ CLIB_UNUSED (vlib_main_t * vm) = va_arg (*args, vlib_main_t *);
+ CLIB_UNUSED (vlib_node_t * node) = va_arg (*args, vlib_node_t *);
+- lip_punt_trace_t *t = va_arg (*args, lip_punt_trace_t *);
++ lip_punt_xc_trace_t *t = va_arg (*args, lip_punt_xc_trace_t *);
+
+- s =
+- format (s, "lip-punt: %u -> %u", t->phy_sw_if_index, t->host_sw_if_index);
++ if (t->direction)
++ {
++ s = format (s, "lip-xc: %u -> %u", t->host_sw_if_index,
++ t->phy_sw_if_index);
++ }
++ else
++ {
++ s = format (s, "lip-punt: %u -> %u", t->phy_sw_if_index,
++ t->host_sw_if_index);
++ }
+
+ return s;
+ }
+
+ /**
+ * Pass punted packets from the PHY to the HOST.
++ * Conditionally x-connect packets from the HOST to the PHY.
+ */
+-VLIB_NODE_FN (lip_punt_node)
+-(vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *frame)
++static_always_inline u32
++lip_punt_xc_inline (vlib_main_t *vm, vlib_node_runtime_t *node,
++ vlib_frame_t *frame, bool xc)
+ {
+ u32 n_left_from, *from, *to_next, n_left_to_next;
+- lip_punt_next_t next_index;
++ lip_punt_xc_next_t next_index;
+
+ next_index = node->cached_next_index;
+ n_left_from = frame->n_vectors;
+@@ -89,6 +100,7 @@ VLIB_NODE_FN (lip_punt_node)
+ u32 next0 = ~0;
+ u32 bi0, lipi0;
+ u32 sw_if_index0;
++ u8 direction0 = 0;
+ u8 len0;
+
+ bi0 = to_next[0] = from[0];
+@@ -97,18 +109,34 @@ VLIB_NODE_FN (lip_punt_node)
+ to_next += 1;
+ n_left_from -= 1;
+ n_left_to_next -= 1;
+- next0 = LIP_PUNT_NEXT_DROP;
++ next0 = LIP_PUNT_XC_NEXT_DROP;
+
+ b0 = vlib_get_buffer (vm, bi0);
+
+ sw_if_index0 = vnet_buffer (b0)->sw_if_index[VLIB_RX];
+ lipi0 = lcp_itf_pair_find_by_phy (sw_if_index0);
+- if (PREDICT_FALSE (lipi0 == INDEX_INVALID))
+- goto trace0;
++
++ /*
++ * lip_punt_node: expect sw_if_index0 is phy in an itf pair
++ * lip_punt_xc_node: if sw_if_index0 is not phy, expect it is host
++ */
++ if (!xc && (PREDICT_FALSE (lipi0 == INDEX_INVALID)))
++ {
++ goto trace0;
++ }
++ else if (xc && (lipi0 == INDEX_INVALID))
++ {
++ direction0 = 1;
++ lipi0 = lcp_itf_pair_find_by_host (sw_if_index0);
++ if (PREDICT_FALSE (lipi0 == INDEX_INVALID))
++ goto trace0;
++ }
+
+ lip0 = lcp_itf_pair_get (lipi0);
+- next0 = LIP_PUNT_NEXT_IO;
+- vnet_buffer (b0)->sw_if_index[VLIB_TX] = lip0->lip_host_sw_if_index;
++ next0 = LIP_PUNT_XC_NEXT_IO;
++ vnet_buffer (b0)->sw_if_index[VLIB_TX] =
++ direction0 ? lip0->lip_phy_sw_if_index :
++ lip0->lip_host_sw_if_index;
+
+ if (PREDICT_TRUE (lip0->lip_host_type == LCP_ITF_HOST_TAP))
+ {
+@@ -129,10 +157,22 @@ VLIB_NODE_FN (lip_punt_node)
+ trace0:
+ if (PREDICT_FALSE ((b0->flags & VLIB_BUFFER_IS_TRACED)))
+ {
+- lip_punt_trace_t *t = vlib_add_trace (vm, node, b0, sizeof (*t));
+- t->phy_sw_if_index = sw_if_index0;
+- t->host_sw_if_index =
+- (lipi0 == INDEX_INVALID) ? ~0 : lip0->lip_host_sw_if_index;
++ lip_punt_xc_trace_t *t =
++ vlib_add_trace (vm, node, b0, sizeof (*t));
++
++ t->direction = direction0;
++ if (direction0)
++ {
++ t->phy_sw_if_index =
++ (lipi0 == INDEX_INVALID) ? ~0 : lip0->lip_phy_sw_if_index;
++ t->host_sw_if_index = sw_if_index0;
++ }
++ else
++ {
++ t->phy_sw_if_index = sw_if_index0;
++ t->host_sw_if_index =
++ (lipi0 == INDEX_INVALID) ? ~0 : lip0->lip_host_sw_if_index;
++ }
+ }
+
+ vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
+@@ -145,16 +185,41 @@ VLIB_NODE_FN (lip_punt_node)
+ return frame->n_vectors;
+ }
+
++VLIB_NODE_FN (lip_punt_node)
++(vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *frame)
++{
++ return (lip_punt_xc_inline (vm, node, frame, false /* xc */));
++}
++
++VLIB_NODE_FN (lip_punt_xc_node)
++(vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *frame)
++{
++ return (lip_punt_xc_inline (vm, node, frame, true /* xc */));
++}
++
+ VLIB_REGISTER_NODE (lip_punt_node) = {
+ .name = "linux-cp-punt",
+ .vector_size = sizeof (u32),
+- .format_trace = format_lip_punt_trace,
++ .format_trace = format_lip_punt_xc_trace,
++ .type = VLIB_NODE_TYPE_INTERNAL,
++
++ .n_next_nodes = LIP_PUNT_XC_N_NEXT,
++ .next_nodes = {
++ [LIP_PUNT_XC_NEXT_DROP] = "error-drop",
++ [LIP_PUNT_XC_NEXT_IO] = "interface-output",
++ },
++};
++
++VLIB_REGISTER_NODE (lip_punt_xc_node) = {
++ .name = "linux-cp-punt-xc",
++ .vector_size = sizeof (u32),
++ .format_trace = format_lip_punt_xc_trace,
+ .type = VLIB_NODE_TYPE_INTERNAL,
+
+- .n_next_nodes = LIP_PUNT_N_NEXT,
++ .n_next_nodes = LIP_PUNT_XC_N_NEXT,
+ .next_nodes = {
+- [LIP_PUNT_NEXT_DROP] = "error-drop",
+- [LIP_PUNT_NEXT_IO] = "interface-output",
++ [LIP_PUNT_XC_NEXT_DROP] = "error-drop",
++ [LIP_PUNT_XC_NEXT_IO] = "interface-output",
+ },
+ };
+
+@@ -190,7 +255,7 @@ VLIB_NODE_FN (lcp_punt_l3_node)
+ (vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *frame)
+ {
+ u32 n_left_from, *from, *to_next, n_left_to_next;
+- lip_punt_next_t next_index;
++ lip_punt_xc_next_t next_index;
+
+ next_index = node->cached_next_index;
+ n_left_from = frame->n_vectors;
diff --git a/src/plugins/vxlan/vxlan.c b/src/plugins/vxlan/vxlan.c
index 0885550d2..8b8cd66e4 100644
--- a/src/plugins/vxlan/vxlan.c