2020-06-08 21:03:24

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next 0/4] bonding: initial support for hardware crypto offload

This is an initial functional implementation for doing pass-through of
hardware encryption from bonding device to capable slaves, in active-backup
bond setups. This was developed and tested using ixgbe-driven Intel x520
interfaces with libreswan and a transport mode connection, primarily using
netperf, with assorted connection failures forced during transmission. The
failover works quite well in my testing, and overall performance is right
on par with offload when running on a bare interface, no bond involved.

Caveats: this is ONLY enabled for active-backup, because I'm not sure
how one would manage multiple offload handles for different devices all
running at the same time in the same xfrm, and it relies on some minor
changes to both the xfrm code and slave device driver code to get things
to behave, and I don't have immediate access to any other hardware that
could function similarly, but the NIC driver changes are minimal and
straight-forward enough that I've included what I think ought to be
enough for mlx5 devices too.

Earlier RFC submissions of this set didn't get any feedback, other than
from the build bot, so I'm hoping silence means nobody hated it...

Jarod Wilson (4):
xfrm: bail early on slave pass over skb
ixgbe_ipsec: become aware of when running as a bonding slave
bonding: support hardware encryption offload to slaves
mlx5: support crypto offload as a bonding slave

drivers/net/Kconfig | 11 ++
drivers/net/bonding/bond_main.c | 111 +++++++++++++++++-
.../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 39 ++++--
.../mellanox/mlx5/core/en_accel/ipsec.c | 6 +
include/net/bonding.h | 3 +
include/net/xfrm.h | 1 +
net/xfrm/xfrm_device.c | 34 +++---
7 files changed, 177 insertions(+), 28 deletions(-)

--
2.20.1


2020-06-08 21:03:34

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next 1/4] xfrm: bail early on slave pass over skb

This is prep work for initial support of bonding hardware encryption
pass-through support. The bonding driver will fill in the slave_dev
pointer, and we use that to know not to skb_push() again on a given
skb that was already processed on the bond device.

CC: Jay Vosburgh <[email protected]>
CC: Veaceslav Falico <[email protected]>
CC: Andy Gospodarek <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: Jeff Kirsher <[email protected]>
CC: Jakub Kicinski <[email protected]>
CC: Steffen Klassert <[email protected]>
CC: Herbert Xu <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
include/net/xfrm.h | 1 +
net/xfrm/xfrm_device.c | 34 +++++++++++++++++-----------------
2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 094fe682f5d7..e20b2b27ec48 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -127,6 +127,7 @@ struct xfrm_state_walk {

struct xfrm_state_offload {
struct net_device *dev;
+ struct net_device *slave_dev;
unsigned long offload_handle;
unsigned int num_exthdrs;
u8 flags;
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index f50d1f97cf8e..b8918fc5248b 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -106,6 +106,7 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
struct sk_buff *skb2, *nskb, *pskb = NULL;
netdev_features_t esp_features = features;
struct xfrm_offload *xo = xfrm_offload(skb);
+ struct net_device *dev = skb->dev;
struct sec_path *sp;

if (!xo)
@@ -119,6 +120,10 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
if (xo->flags & XFRM_GRO || x->xso.flags & XFRM_OFFLOAD_INBOUND)
return skb;

+ /* This skb was already validated on the master dev */
+ if ((x->xso.dev != dev) && (x->xso.slave_dev == dev))
+ return skb;
+
local_irq_save(flags);
sd = this_cpu_ptr(&softnet_data);
err = !skb_queue_empty(&sd->xfrm_backlog);
@@ -129,25 +134,20 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
return skb;
}

- if (skb_is_gso(skb)) {
- struct net_device *dev = skb->dev;
-
- if (unlikely(x->xso.dev != dev)) {
- struct sk_buff *segs;
+ if (skb_is_gso(skb) && unlikely(x->xso.dev != dev)) {
+ struct sk_buff *segs;

- /* Packet got rerouted, fixup features and segment it. */
- esp_features = esp_features & ~(NETIF_F_HW_ESP
- | NETIF_F_GSO_ESP);
+ /* Packet got rerouted, fixup features and segment it. */
+ esp_features = esp_features & ~(NETIF_F_HW_ESP | NETIF_F_GSO_ESP);

- segs = skb_gso_segment(skb, esp_features);
- if (IS_ERR(segs)) {
- kfree_skb(skb);
- atomic_long_inc(&dev->tx_dropped);
- return NULL;
- } else {
- consume_skb(skb);
- skb = segs;
- }
+ segs = skb_gso_segment(skb, esp_features);
+ if (IS_ERR(segs)) {
+ kfree_skb(skb);
+ atomic_long_inc(&dev->tx_dropped);
+ return NULL;
+ } else {
+ consume_skb(skb);
+ skb = segs;
}
}

--
2.20.1

2020-06-08 21:03:46

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next 3/4] bonding: support hardware encryption offload to slaves

Currently, this support is limited to active-backup mode, as I'm not sure
about the feasilibity of mapping an xfrm_state's offload handle to
multiple hardware devices simultaneously, and we rely on being able to
pass some hints to both the xfrm and NIC driver about whether or not
they're operating on a slave device.

I've tested this atop an Intel x520 device (ixgbe) using libreswan in
transport mode, succesfully achieving ~4.3Gbps throughput with netperf
(more or less identical to throughput on a bare NIC in this system),
as well as successful failover and recovery mid-netperf.

v2: rebase on latest net-next and wrap with #ifdef CONFIG_XFRM_OFFLOAD
v3: add new CONFIG_BOND_XFRM_OFFLOAD option and fix shutdown path

CC: Jay Vosburgh <[email protected]>
CC: Veaceslav Falico <[email protected]>
CC: Andy Gospodarek <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: Jeff Kirsher <[email protected]>
CC: Jakub Kicinski <[email protected]>
CC: Steffen Klassert <[email protected]>
CC: Herbert Xu <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>

Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/net/Kconfig | 11 ++++
drivers/net/bonding/bond_main.c | 111 +++++++++++++++++++++++++++++++-
include/net/bonding.h | 3 +
3 files changed, 122 insertions(+), 3 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index c7d310ef1c83..938c4dd9bfb9 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -56,6 +56,17 @@ config BONDING
To compile this driver as a module, choose M here: the module
will be called bonding.

+config BONDING_XFRM_OFFLOAD
+ bool "Bonding driver IPSec XFRM cryptography-offload pass-through support"
+ depends on BONDING
+ depends on XFRM_OFFLOAD
+ default y
+ select XFRM_ALGO
+ ---help---
+ Enable support for IPSec offload pass-through in the bonding driver.
+ Currently limited to active-backup mode only, and requires slave
+ devices that support hardware crypto offload.
+
config DUMMY
tristate "Dummy net driver support"
---help---
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index a25c65d4af71..01b80cef492a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -79,6 +79,7 @@
#include <net/pkt_sched.h>
#include <linux/rculist.h>
#include <net/flow_dissector.h>
+#include <net/xfrm.h>
#include <net/bonding.h>
#include <net/bond_3ad.h>
#include <net/bond_alb.h>
@@ -278,8 +279,6 @@ const char *bond_mode_name(int mode)
return names[mode];
}

-/*---------------------------------- VLAN -----------------------------------*/
-
/**
* bond_dev_queue_xmit - Prepare skb for xmit.
*
@@ -302,6 +301,8 @@ netdev_tx_t bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
return dev_queue_xmit(skb);
}

+/*---------------------------------- VLAN -----------------------------------*/
+
/* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid,
* We don't protect the slave list iteration with a lock because:
* a. This operation is performed in IOCTL context,
@@ -372,6 +373,84 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
return 0;
}

+/*---------------------------------- XFRM -----------------------------------*/
+
+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
+/**
+ * bond_ipsec_add_sa - program device with a security association
+ * @xs: pointer to transformer state struct
+ **/
+static int bond_ipsec_add_sa(struct xfrm_state *xs)
+{
+ struct net_device *bond_dev = xs->xso.dev;
+ struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *slave = rtnl_dereference(bond->curr_active_slave);
+
+ xs->xso.slave_dev = slave->dev;
+ bond->xs = xs;
+
+ if (!(slave->dev->xfrmdev_ops
+ && slave->dev->xfrmdev_ops->xdo_dev_state_add)) {
+ slave_warn(bond_dev, slave->dev, "Slave does not support ipsec offload\n");
+ return -EINVAL;
+ }
+
+ return slave->dev->xfrmdev_ops->xdo_dev_state_add(xs);
+}
+
+/**
+ * bond_ipsec_del_sa - clear out this specific SA
+ * @xs: pointer to transformer state struct
+ **/
+static void bond_ipsec_del_sa(struct xfrm_state *xs)
+{
+ struct net_device *bond_dev = xs->xso.dev;
+ struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *slave = rtnl_dereference(bond->curr_active_slave);
+
+ if (!slave)
+ return;
+
+ xs->xso.slave_dev = slave->dev;
+
+ if (!(slave->dev->xfrmdev_ops
+ && slave->dev->xfrmdev_ops->xdo_dev_state_delete)) {
+ slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__);
+ return;
+ }
+
+ slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs);
+}
+
+/**
+ * bond_ipsec_offload_ok - can this packet use the xfrm hw offload
+ * @skb: current data packet
+ * @xs: pointer to transformer state struct
+ **/
+static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
+{
+ struct net_device *bond_dev = xs->xso.dev;
+ struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *curr_active = rtnl_dereference(bond->curr_active_slave);
+ struct net_device *slave_dev = curr_active->dev;
+
+ if (!(slave_dev->xfrmdev_ops
+ && slave_dev->xfrmdev_ops->xdo_dev_offload_ok)) {
+ slave_warn(bond_dev, slave_dev, "%s: no slave xdo_dev_offload_ok\n", __func__);
+ return false;
+ }
+
+ xs->xso.slave_dev = slave_dev;
+ return slave_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs);
+}
+
+static const struct xfrmdev_ops bond_xfrmdev_ops = {
+ .xdo_dev_state_add = bond_ipsec_add_sa,
+ .xdo_dev_state_delete = bond_ipsec_del_sa,
+ .xdo_dev_offload_ok = bond_ipsec_offload_ok,
+};
+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
+
/*------------------------------- Link status -------------------------------*/

/* Set the carrier state for the master according to the state of its
@@ -879,6 +958,11 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
return;

if (new_active) {
+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) && bond->xs)
+ bond_ipsec_del_sa(bond->xs);
+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
+
new_active->last_link_up = jiffies;

if (new_active->link == BOND_LINK_BACK) {
@@ -941,6 +1025,13 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
bond_should_notify_peers(bond);
}

+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
+ if (old_active && bond->xs) {
+ xfrm_dev_state_flush(dev_net(bond->dev), bond->dev, true);
+ bond_ipsec_add_sa(bond->xs);
+ }
+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
+
call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, bond->dev);
if (should_notify_peers) {
bond->send_peer_notif--;
@@ -1125,7 +1216,9 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
NETIF_F_HIGHDMA | NETIF_F_LRO)

#define BOND_ENC_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
- NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
+ NETIF_F_RXCSUM | NETIF_F_ALL_TSO | \
+ NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | \
+ NETIF_F_GSO_ESP)

#define BOND_MPLS_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
NETIF_F_ALL_TSO)
@@ -1464,6 +1557,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
slave_dbg(bond_dev, slave_dev, "is !NETIF_F_VLAN_CHALLENGED\n");
}

+ if (slave_dev->features & NETIF_F_HW_ESP)
+ slave_dbg(bond_dev, slave_dev, "is esp-hw-offload capable\n");
+
/* Old ifenslave binaries are no longer supported. These can
* be identified with moderate accuracy by the state of the slave:
* the current ifenslave will set the interface down prior to
@@ -4542,6 +4638,13 @@ void bond_setup(struct net_device *bond_dev)
bond_dev->priv_flags |= IFF_BONDING | IFF_UNICAST_FLT | IFF_NO_QUEUE;
bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);

+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
+ /* set up xfrm device ops (only supported in active-backup right now) */
+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
+ bond_dev->xfrmdev_ops = &bond_xfrmdev_ops;
+ bond->xs = NULL;
+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
+
/* don't acquire bond device's netif_tx_lock when transmitting */
bond_dev->features |= NETIF_F_LLTX;

@@ -4560,6 +4663,8 @@ void bond_setup(struct net_device *bond_dev)
NETIF_F_HW_VLAN_CTAG_FILTER;

bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
+ bond_dev->hw_features |= BOND_ENC_FEATURES;
bond_dev->features |= bond_dev->hw_features;
bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
}
diff --git a/include/net/bonding.h b/include/net/bonding.h
index aa854a9c01e2..29a25098e2a6 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -238,6 +238,9 @@ struct bonding {
struct dentry *debug_dir;
#endif /* CONFIG_DEBUG_FS */
struct rtnl_link_stats64 bond_stats;
+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
+ struct xfrm_state *xs;
+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
};

#define bond_slave_get_rcu(dev) \
--
2.20.1

2020-06-08 21:05:24

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next 4/4] mlx5: become aware of when running as a bonding slave

I've been unable to get my hands on suitable supported hardware to date,
but I believe this ought to be all that is needed to enable the mlx5
driver to also work with bonding active-backup crypto offload passthru.

CC: Boris Pismenny <[email protected]>
CC: Saeed Mahameed <[email protected]>
CC: Leon Romanovsky <[email protected]>
CC: Jay Vosburgh <[email protected]>
CC: Veaceslav Falico <[email protected]>
CC: Andy Gospodarek <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: Jeff Kirsher <[email protected]>
CC: Jakub Kicinski <[email protected]>
CC: Steffen Klassert <[email protected]>
CC: Herbert Xu <[email protected]>
CC: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index 92eb3bad4acd..72ad6664bd73 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -210,6 +210,9 @@ static inline int mlx5e_xfrm_validate_state(struct xfrm_state *x)
struct net_device *netdev = x->xso.dev;
struct mlx5e_priv *priv;

+ if (x->xso.slave_dev)
+ netdev = x->xso.slave_dev;
+
priv = netdev_priv(netdev);

if (x->props.aalgo != SADB_AALG_NONE) {
@@ -291,6 +294,9 @@ static int mlx5e_xfrm_add_state(struct xfrm_state *x)
unsigned int sa_handle;
int err;

+ if (x->xso.slave_dev)
+ netdev = x->xso.slave_dev;
+
priv = netdev_priv(netdev);

err = mlx5e_xfrm_validate_state(x);
--
2.20.1

2020-06-08 21:05:31

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next 2/4] ixgbe_ipsec: become aware of when running as a bonding slave

Slave devices in a bond doing hardware encryption also need to be aware
that they're slaves, so we operate on the slave instead of the bonding
master to do the actual hardware encryption offload bits.

CC: Jay Vosburgh <[email protected]>
CC: Veaceslav Falico <[email protected]>
CC: Andy Gospodarek <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: Jeff Kirsher <[email protected]>
CC: Jakub Kicinski <[email protected]>
CC: Steffen Klassert <[email protected]>
CC: Herbert Xu <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
.../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 39 +++++++++++++++----
1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 113f6087c7c9..26b0a58a064d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -432,6 +432,9 @@ static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
char *alg_name = NULL;
int key_len;

+ if (xs->xso.slave_dev)
+ dev = xs->xso.slave_dev;
+
if (!xs->aead) {
netdev_err(dev, "Unsupported IPsec algorithm\n");
return -EINVAL;
@@ -478,8 +481,8 @@ static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
static int ixgbe_ipsec_check_mgmt_ip(struct xfrm_state *xs)
{
struct net_device *dev = xs->xso.dev;
- struct ixgbe_adapter *adapter = netdev_priv(dev);
- struct ixgbe_hw *hw = &adapter->hw;
+ struct ixgbe_adapter *adapter;
+ struct ixgbe_hw *hw;
u32 mfval, manc, reg;
int num_filters = 4;
bool manc_ipv4;
@@ -497,6 +500,12 @@ static int ixgbe_ipsec_check_mgmt_ip(struct xfrm_state *xs)
#define BMCIP_V6 0x3
#define BMCIP_MASK 0x3

+ if (xs->xso.slave_dev)
+ dev = xs->xso.slave_dev;
+
+ adapter = netdev_priv(dev);
+ hw = &adapter->hw;
+
manc = IXGBE_READ_REG(hw, IXGBE_MANC);
manc_ipv4 = !!(manc & MANC_EN_IPV4_FILTER);
mfval = IXGBE_READ_REG(hw, IXGBE_MFVAL);
@@ -561,14 +570,21 @@ static int ixgbe_ipsec_check_mgmt_ip(struct xfrm_state *xs)
static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
{
struct net_device *dev = xs->xso.dev;
- struct ixgbe_adapter *adapter = netdev_priv(dev);
- struct ixgbe_ipsec *ipsec = adapter->ipsec;
- struct ixgbe_hw *hw = &adapter->hw;
+ struct ixgbe_adapter *adapter;
+ struct ixgbe_ipsec *ipsec;
+ struct ixgbe_hw *hw;
int checked, match, first;
u16 sa_idx;
int ret;
int i;

+ if (xs->xso.slave_dev)
+ dev = xs->xso.slave_dev;
+
+ adapter = netdev_priv(dev);
+ ipsec = adapter->ipsec;
+ hw = &adapter->hw;
+
if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
xs->id.proto);
@@ -746,12 +762,19 @@ static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
{
struct net_device *dev = xs->xso.dev;
- struct ixgbe_adapter *adapter = netdev_priv(dev);
- struct ixgbe_ipsec *ipsec = adapter->ipsec;
- struct ixgbe_hw *hw = &adapter->hw;
+ struct ixgbe_adapter *adapter;
+ struct ixgbe_ipsec *ipsec;
+ struct ixgbe_hw *hw;
u32 zerobuf[4] = {0, 0, 0, 0};
u16 sa_idx;

+ if (xs->xso.slave_dev)
+ dev = xs->xso.slave_dev;
+
+ adapter = netdev_priv(dev);
+ ipsec = adapter->ipsec;
+ hw = &adapter->hw;
+
if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
struct rx_sa *rsa;
u8 ipi;
--
2.20.1

2020-06-09 00:14:11

by Jeff Kirsher

[permalink] [raw]
Subject: RE: [PATCH net-next 2/4] ixgbe_ipsec: become aware of when running as a bonding slave

> -----Original Message-----
> From: Jarod Wilson <[email protected]>
> Sent: Monday, June 8, 2020 14:01
> To: [email protected]
> Cc: Jarod Wilson <[email protected]>; Jay Vosburgh
> <[email protected]>; Veaceslav Falico <[email protected]>; Andy
> Gospodarek <[email protected]>; David S. Miller <[email protected]>;
> Kirsher, Jeffrey T <[email protected]>; Jakub Kicinski
> <[email protected]>; Steffen Klassert <[email protected]>;
> Herbert Xu <[email protected]>; [email protected]; intel-
> [email protected]
> Subject: [PATCH net-next 2/4] ixgbe_ipsec: become aware of when running as
> a bonding slave
>
> Slave devices in a bond doing hardware encryption also need to be aware that
> they're slaves, so we operate on the slave instead of the bonding master to do
> the actual hardware encryption offload bits.
>
> CC: Jay Vosburgh <[email protected]>
> CC: Veaceslav Falico <[email protected]>
> CC: Andy Gospodarek <[email protected]>
> CC: "David S. Miller" <[email protected]>
> CC: Jeff Kirsher <[email protected]>
> CC: Jakub Kicinski <[email protected]>
> CC: Steffen Klassert <[email protected]>
> CC: Herbert Xu <[email protected]>
> CC: [email protected]
> CC: [email protected]
> Signed-off-by: Jarod Wilson <[email protected]>

Acked-by: Jeff Kirsher <[email protected]>

> ---
> .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 39 +++++++++++++++----
> 1 file changed, 31 insertions(+), 8 deletions(-)

2020-06-09 00:17:12

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH net-next 3/4] bonding: support hardware encryption offload to slaves

On Mon, Jun 8, 2020 at 7:48 PM Jay Vosburgh <[email protected]> wrote:
>
> Jarod Wilson <[email protected]> wrote:
>
> >Currently, this support is limited to active-backup mode, as I'm not sure
> >about the feasilibity of mapping an xfrm_state's offload handle to
> >multiple hardware devices simultaneously, and we rely on being able to
> >pass some hints to both the xfrm and NIC driver about whether or not
> >they're operating on a slave device.
> >
> >I've tested this atop an Intel x520 device (ixgbe) using libreswan in
> >transport mode, succesfully achieving ~4.3Gbps throughput with netperf
> >(more or less identical to throughput on a bare NIC in this system),
> >as well as successful failover and recovery mid-netperf.
> >
> >v2: rebase on latest net-next and wrap with #ifdef CONFIG_XFRM_OFFLOAD
> >v3: add new CONFIG_BOND_XFRM_OFFLOAD option and fix shutdown path
> >
> >CC: Jay Vosburgh <[email protected]>
> >CC: Veaceslav Falico <[email protected]>
> >CC: Andy Gospodarek <[email protected]>
> >CC: "David S. Miller" <[email protected]>
> >CC: Jeff Kirsher <[email protected]>
> >CC: Jakub Kicinski <[email protected]>
> >CC: Steffen Klassert <[email protected]>
> >CC: Herbert Xu <[email protected]>
> >CC: [email protected]
> >CC: [email protected]
> >Signed-off-by: Jarod Wilson <[email protected]>
> >
> >Signed-off-by: Jarod Wilson <[email protected]>
> >---
> > drivers/net/Kconfig | 11 ++++
> > drivers/net/bonding/bond_main.c | 111 +++++++++++++++++++++++++++++++-
> > include/net/bonding.h | 3 +
> > 3 files changed, 122 insertions(+), 3 deletions(-)
> >
> >diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> >index c7d310ef1c83..938c4dd9bfb9 100644
> >--- a/drivers/net/Kconfig
> >+++ b/drivers/net/Kconfig
> >@@ -56,6 +56,17 @@ config BONDING
> > To compile this driver as a module, choose M here: the module
> > will be called bonding.
> >
> >+config BONDING_XFRM_OFFLOAD
> >+ bool "Bonding driver IPSec XFRM cryptography-offload pass-through support"
> >+ depends on BONDING
> >+ depends on XFRM_OFFLOAD
> >+ default y
> >+ select XFRM_ALGO
> >+ ---help---
> >+ Enable support for IPSec offload pass-through in the bonding driver.
> >+ Currently limited to active-backup mode only, and requires slave
> >+ devices that support hardware crypto offload.
> >+
>
> Why is this a separate Kconfig option? Is it reasonable to
> expect users to enable XFRM_OFFLOAD but not BONDING_XFRM_OFFLOAD?

I'd originally just wrapped it with XFRM_OFFLOAD, but in an
overabundance of caution, thought maybe gating it behind its own flag
was better. I didn't get any feedback on the initial posting, so I've
been sort of winging it. :)

> >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >index a25c65d4af71..01b80cef492a 100644
> >--- a/drivers/net/bonding/bond_main.c
> >+++ b/drivers/net/bonding/bond_main.c
...
> >@@ -4560,6 +4663,8 @@ void bond_setup(struct net_device *bond_dev)
> > NETIF_F_HW_VLAN_CTAG_FILTER;
> >
> > bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
> >+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
> >+ bond_dev->hw_features |= BOND_ENC_FEATURES;
>
> Why is adding the ESP features to hw_features (here, and added
> to BOND_ENC_FEATURES, above) not behind CONFIG_BONDING_XFRM_OFFLOAD?
>
> If adding these features makes sense regardless of the
> XFRM_OFFLOAD configuration, then shouldn't this change to feature
> handling be a separate patch? The feature handling is complex, and is
> worth its own patch so it stands out in the log.

No, that would be an oversight by me. The build bot yelled at me on v1
about builds with XFRM_OFFLOAD not enabled, and I neglected to wrap
that bit too.

I'll do that in the next revision. I'm also fine with dropping the
extra kconfig and just using XFRM_OFFLOAD for all of it, if that's
sufficient.

--
Jarod Wilson
[email protected]

2020-06-09 02:07:09

by Jay Vosburgh

[permalink] [raw]
Subject: Re: [PATCH net-next 3/4] bonding: support hardware encryption offload to slaves

Jarod Wilson <[email protected]> wrote:

>Currently, this support is limited to active-backup mode, as I'm not sure
>about the feasilibity of mapping an xfrm_state's offload handle to
>multiple hardware devices simultaneously, and we rely on being able to
>pass some hints to both the xfrm and NIC driver about whether or not
>they're operating on a slave device.
>
>I've tested this atop an Intel x520 device (ixgbe) using libreswan in
>transport mode, succesfully achieving ~4.3Gbps throughput with netperf
>(more or less identical to throughput on a bare NIC in this system),
>as well as successful failover and recovery mid-netperf.
>
>v2: rebase on latest net-next and wrap with #ifdef CONFIG_XFRM_OFFLOAD
>v3: add new CONFIG_BOND_XFRM_OFFLOAD option and fix shutdown path
>
>CC: Jay Vosburgh <[email protected]>
>CC: Veaceslav Falico <[email protected]>
>CC: Andy Gospodarek <[email protected]>
>CC: "David S. Miller" <[email protected]>
>CC: Jeff Kirsher <[email protected]>
>CC: Jakub Kicinski <[email protected]>
>CC: Steffen Klassert <[email protected]>
>CC: Herbert Xu <[email protected]>
>CC: [email protected]
>CC: [email protected]
>Signed-off-by: Jarod Wilson <[email protected]>
>
>Signed-off-by: Jarod Wilson <[email protected]>
>---
> drivers/net/Kconfig | 11 ++++
> drivers/net/bonding/bond_main.c | 111 +++++++++++++++++++++++++++++++-
> include/net/bonding.h | 3 +
> 3 files changed, 122 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
>index c7d310ef1c83..938c4dd9bfb9 100644
>--- a/drivers/net/Kconfig
>+++ b/drivers/net/Kconfig
>@@ -56,6 +56,17 @@ config BONDING
> To compile this driver as a module, choose M here: the module
> will be called bonding.
>
>+config BONDING_XFRM_OFFLOAD
>+ bool "Bonding driver IPSec XFRM cryptography-offload pass-through support"
>+ depends on BONDING
>+ depends on XFRM_OFFLOAD
>+ default y
>+ select XFRM_ALGO
>+ ---help---
>+ Enable support for IPSec offload pass-through in the bonding driver.
>+ Currently limited to active-backup mode only, and requires slave
>+ devices that support hardware crypto offload.
>+

Why is this a separate Kconfig option? Is it reasonable to
expect users to enable XFRM_OFFLOAD but not BONDING_XFRM_OFFLOAD?

> config DUMMY
> tristate "Dummy net driver support"
> ---help---
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index a25c65d4af71..01b80cef492a 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -79,6 +79,7 @@
> #include <net/pkt_sched.h>
> #include <linux/rculist.h>
> #include <net/flow_dissector.h>
>+#include <net/xfrm.h>
> #include <net/bonding.h>
> #include <net/bond_3ad.h>
> #include <net/bond_alb.h>
>@@ -278,8 +279,6 @@ const char *bond_mode_name(int mode)
> return names[mode];
> }
>
>-/*---------------------------------- VLAN -----------------------------------*/
>-
> /**
> * bond_dev_queue_xmit - Prepare skb for xmit.
> *
>@@ -302,6 +301,8 @@ netdev_tx_t bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
> return dev_queue_xmit(skb);
> }
>
>+/*---------------------------------- VLAN -----------------------------------*/
>+
> /* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid,
> * We don't protect the slave list iteration with a lock because:
> * a. This operation is performed in IOCTL context,
>@@ -372,6 +373,84 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
> return 0;
> }
>
>+/*---------------------------------- XFRM -----------------------------------*/
>+
>+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
>+/**
>+ * bond_ipsec_add_sa - program device with a security association
>+ * @xs: pointer to transformer state struct
>+ **/
>+static int bond_ipsec_add_sa(struct xfrm_state *xs)
>+{
>+ struct net_device *bond_dev = xs->xso.dev;
>+ struct bonding *bond = netdev_priv(bond_dev);
>+ struct slave *slave = rtnl_dereference(bond->curr_active_slave);
>+
>+ xs->xso.slave_dev = slave->dev;
>+ bond->xs = xs;
>+
>+ if (!(slave->dev->xfrmdev_ops
>+ && slave->dev->xfrmdev_ops->xdo_dev_state_add)) {
>+ slave_warn(bond_dev, slave->dev, "Slave does not support ipsec offload\n");
>+ return -EINVAL;
>+ }
>+
>+ return slave->dev->xfrmdev_ops->xdo_dev_state_add(xs);
>+}
>+
>+/**
>+ * bond_ipsec_del_sa - clear out this specific SA
>+ * @xs: pointer to transformer state struct
>+ **/
>+static void bond_ipsec_del_sa(struct xfrm_state *xs)
>+{
>+ struct net_device *bond_dev = xs->xso.dev;
>+ struct bonding *bond = netdev_priv(bond_dev);
>+ struct slave *slave = rtnl_dereference(bond->curr_active_slave);
>+
>+ if (!slave)
>+ return;
>+
>+ xs->xso.slave_dev = slave->dev;
>+
>+ if (!(slave->dev->xfrmdev_ops
>+ && slave->dev->xfrmdev_ops->xdo_dev_state_delete)) {
>+ slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__);
>+ return;
>+ }
>+
>+ slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs);
>+}
>+
>+/**
>+ * bond_ipsec_offload_ok - can this packet use the xfrm hw offload
>+ * @skb: current data packet
>+ * @xs: pointer to transformer state struct
>+ **/
>+static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
>+{
>+ struct net_device *bond_dev = xs->xso.dev;
>+ struct bonding *bond = netdev_priv(bond_dev);
>+ struct slave *curr_active = rtnl_dereference(bond->curr_active_slave);
>+ struct net_device *slave_dev = curr_active->dev;
>+
>+ if (!(slave_dev->xfrmdev_ops
>+ && slave_dev->xfrmdev_ops->xdo_dev_offload_ok)) {
>+ slave_warn(bond_dev, slave_dev, "%s: no slave xdo_dev_offload_ok\n", __func__);
>+ return false;
>+ }
>+
>+ xs->xso.slave_dev = slave_dev;
>+ return slave_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs);
>+}
>+
>+static const struct xfrmdev_ops bond_xfrmdev_ops = {
>+ .xdo_dev_state_add = bond_ipsec_add_sa,
>+ .xdo_dev_state_delete = bond_ipsec_del_sa,
>+ .xdo_dev_offload_ok = bond_ipsec_offload_ok,
>+};
>+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
>+
> /*------------------------------- Link status -------------------------------*/
>
> /* Set the carrier state for the master according to the state of its
>@@ -879,6 +958,11 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
> return;
>
> if (new_active) {
>+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
>+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) && bond->xs)
>+ bond_ipsec_del_sa(bond->xs);
>+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
>+
> new_active->last_link_up = jiffies;
>
> if (new_active->link == BOND_LINK_BACK) {
>@@ -941,6 +1025,13 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
> bond_should_notify_peers(bond);
> }
>
>+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
>+ if (old_active && bond->xs) {
>+ xfrm_dev_state_flush(dev_net(bond->dev), bond->dev, true);
>+ bond_ipsec_add_sa(bond->xs);
>+ }
>+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
>+
> call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, bond->dev);
> if (should_notify_peers) {
> bond->send_peer_notif--;
>@@ -1125,7 +1216,9 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
> NETIF_F_HIGHDMA | NETIF_F_LRO)
>
> #define BOND_ENC_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
>- NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
>+ NETIF_F_RXCSUM | NETIF_F_ALL_TSO | \
>+ NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | \
>+ NETIF_F_GSO_ESP)
>
> #define BOND_MPLS_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
> NETIF_F_ALL_TSO)
>@@ -1464,6 +1557,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> slave_dbg(bond_dev, slave_dev, "is !NETIF_F_VLAN_CHALLENGED\n");
> }
>
>+ if (slave_dev->features & NETIF_F_HW_ESP)
>+ slave_dbg(bond_dev, slave_dev, "is esp-hw-offload capable\n");
>+
> /* Old ifenslave binaries are no longer supported. These can
> * be identified with moderate accuracy by the state of the slave:
> * the current ifenslave will set the interface down prior to
>@@ -4542,6 +4638,13 @@ void bond_setup(struct net_device *bond_dev)
> bond_dev->priv_flags |= IFF_BONDING | IFF_UNICAST_FLT | IFF_NO_QUEUE;
> bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
>
>+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
>+ /* set up xfrm device ops (only supported in active-backup right now) */
>+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
>+ bond_dev->xfrmdev_ops = &bond_xfrmdev_ops;
>+ bond->xs = NULL;
>+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
>+
> /* don't acquire bond device's netif_tx_lock when transmitting */
> bond_dev->features |= NETIF_F_LLTX;
>
>@@ -4560,6 +4663,8 @@ void bond_setup(struct net_device *bond_dev)
> NETIF_F_HW_VLAN_CTAG_FILTER;
>
> bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
>+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
>+ bond_dev->hw_features |= BOND_ENC_FEATURES;

Why is adding the ESP features to hw_features (here, and added
to BOND_ENC_FEATURES, above) not behind CONFIG_BONDING_XFRM_OFFLOAD?

If adding these features makes sense regardless of the
XFRM_OFFLOAD configuration, then shouldn't this change to feature
handling be a separate patch? The feature handling is complex, and is
worth its own patch so it stands out in the log.

-J

> bond_dev->features |= bond_dev->hw_features;
> bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> }
>diff --git a/include/net/bonding.h b/include/net/bonding.h
>index aa854a9c01e2..29a25098e2a6 100644
>--- a/include/net/bonding.h
>+++ b/include/net/bonding.h
>@@ -238,6 +238,9 @@ struct bonding {
> struct dentry *debug_dir;
> #endif /* CONFIG_DEBUG_FS */
> struct rtnl_link_stats64 bond_stats;
>+#ifdef CONFIG_BONDING_XFRM_OFFLOAD
>+ struct xfrm_state *xs;
>+#endif /* CONFIG_BONDING_XFRM_OFFLOAD */
> };
>
> #define bond_slave_get_rcu(dev) \
>--
>2.20.1
>

---
-Jay Vosburgh, [email protected]

2020-06-09 02:09:29

by David Miller

[permalink] [raw]

2020-06-10 20:21:43

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next v2 0/4] bonding: initial support for hardware crypto offload

This is an initial functional implementation for doing pass-through of
hardware encryption from bonding device to capable slaves, in active-backup
bond setups. This was developed and tested using ixgbe-driven Intel x520
interfaces with libreswan and a transport mode connection, primarily using
netperf, with assorted connection failures forced during transmission. The
failover works quite well in my testing, and overall performance is right
on par with offload when running on a bare interface, no bond involved.

Caveats: this is ONLY enabled for active-backup, because I'm not sure
how one would manage multiple offload handles for different devices all
running at the same time in the same xfrm, and it relies on some minor
changes to both the xfrm code and slave device driver code to get things
to behave, and I don't have immediate access to any other hardware that
could function similarly, but the NIC driver changes are minimal and
straight-forward enough that I've included what I think ought to be
enough for mlx5 devices too.

v2: reordered patches, switched (back) to using CONFIG_XFRM_OFFLOAD
to wrap the code additions and wrapped overlooked additions.

Jarod Wilson (4):
xfrm: bail early on slave pass over skb
ixgbe_ipsec: become aware of when running as a bonding slave
mlx5: become aware of when running as a bonding slave
bonding: support hardware encryption offload to slaves

drivers/net/bonding/bond_main.c | 127 +++++++++++++++++-
.../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 39 ++++--
.../mellanox/mlx5/core/en_accel/ipsec.c | 6 +
include/net/bonding.h | 3 +
include/net/xfrm.h | 1 +
net/xfrm/xfrm_device.c | 34 ++---
6 files changed, 183 insertions(+), 27 deletions(-)

--
2.20.1

2020-06-10 21:12:14

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next v2 1/4] xfrm: bail early on slave pass over skb

This is prep work for initial support of bonding hardware encryption
pass-through support. The bonding driver will fill in the slave_dev
pointer, and we use that to know not to skb_push() again on a given
skb that was already processed on the bond device.

CC: Jay Vosburgh <[email protected]>
CC: Veaceslav Falico <[email protected]>
CC: Andy Gospodarek <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: Jeff Kirsher <[email protected]>
CC: Jakub Kicinski <[email protected]>
CC: Steffen Klassert <[email protected]>
CC: Herbert Xu <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
include/net/xfrm.h | 1 +
net/xfrm/xfrm_device.c | 34 +++++++++++++++++-----------------
2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 094fe682f5d7..e20b2b27ec48 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -127,6 +127,7 @@ struct xfrm_state_walk {

struct xfrm_state_offload {
struct net_device *dev;
+ struct net_device *slave_dev;
unsigned long offload_handle;
unsigned int num_exthdrs;
u8 flags;
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index f50d1f97cf8e..b8918fc5248b 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -106,6 +106,7 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
struct sk_buff *skb2, *nskb, *pskb = NULL;
netdev_features_t esp_features = features;
struct xfrm_offload *xo = xfrm_offload(skb);
+ struct net_device *dev = skb->dev;
struct sec_path *sp;

if (!xo)
@@ -119,6 +120,10 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
if (xo->flags & XFRM_GRO || x->xso.flags & XFRM_OFFLOAD_INBOUND)
return skb;

+ /* This skb was already validated on the master dev */
+ if ((x->xso.dev != dev) && (x->xso.slave_dev == dev))
+ return skb;
+
local_irq_save(flags);
sd = this_cpu_ptr(&softnet_data);
err = !skb_queue_empty(&sd->xfrm_backlog);
@@ -129,25 +134,20 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
return skb;
}

- if (skb_is_gso(skb)) {
- struct net_device *dev = skb->dev;
-
- if (unlikely(x->xso.dev != dev)) {
- struct sk_buff *segs;
+ if (skb_is_gso(skb) && unlikely(x->xso.dev != dev)) {
+ struct sk_buff *segs;

- /* Packet got rerouted, fixup features and segment it. */
- esp_features = esp_features & ~(NETIF_F_HW_ESP
- | NETIF_F_GSO_ESP);
+ /* Packet got rerouted, fixup features and segment it. */
+ esp_features = esp_features & ~(NETIF_F_HW_ESP | NETIF_F_GSO_ESP);

- segs = skb_gso_segment(skb, esp_features);
- if (IS_ERR(segs)) {
- kfree_skb(skb);
- atomic_long_inc(&dev->tx_dropped);
- return NULL;
- } else {
- consume_skb(skb);
- skb = segs;
- }
+ segs = skb_gso_segment(skb, esp_features);
+ if (IS_ERR(segs)) {
+ kfree_skb(skb);
+ atomic_long_inc(&dev->tx_dropped);
+ return NULL;
+ } else {
+ consume_skb(skb);
+ skb = segs;
}
}

--
2.20.1

2020-06-10 21:12:22

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH net-next v2 4/4] bonding: support hardware encryption offload to slaves

Currently, this support is limited to active-backup mode, as I'm not sure
about the feasilibity of mapping an xfrm_state's offload handle to
multiple hardware devices simultaneously, and we rely on being able to
pass some hints to both the xfrm and NIC driver about whether or not
they're operating on a slave device.

I've tested this atop an Intel x520 device (ixgbe) using libreswan in
transport mode, succesfully achieving ~4.3Gbps throughput with netperf
(more or less identical to throughput on a bare NIC in this system),
as well as successful failover and recovery mid-netperf.

v2: just use CONFIG_XFRM_OFFLOAD for wrapping, isolate more code with it

CC: Jay Vosburgh <[email protected]>
CC: Veaceslav Falico <[email protected]>
CC: Andy Gospodarek <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: Jeff Kirsher <[email protected]>
CC: Jakub Kicinski <[email protected]>
CC: Steffen Klassert <[email protected]>
CC: Herbert Xu <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/net/bonding/bond_main.c | 127 +++++++++++++++++++++++++++++++-
include/net/bonding.h | 3 +
2 files changed, 128 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index a25c65d4af71..882b57328308 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -79,6 +79,7 @@
#include <net/pkt_sched.h>
#include <linux/rculist.h>
#include <net/flow_dissector.h>
+#include <net/xfrm.h>
#include <net/bonding.h>
#include <net/bond_3ad.h>
#include <net/bond_alb.h>
@@ -278,8 +279,6 @@ const char *bond_mode_name(int mode)
return names[mode];
}

-/*---------------------------------- VLAN -----------------------------------*/
-
/**
* bond_dev_queue_xmit - Prepare skb for xmit.
*
@@ -302,6 +301,8 @@ netdev_tx_t bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
return dev_queue_xmit(skb);
}

+/*---------------------------------- VLAN -----------------------------------*/
+
/* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid,
* We don't protect the slave list iteration with a lock because:
* a. This operation is performed in IOCTL context,
@@ -372,6 +373,84 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
return 0;
}

+/*---------------------------------- XFRM -----------------------------------*/
+
+#ifdef CONFIG_XFRM_OFFLOAD
+/**
+ * bond_ipsec_add_sa - program device with a security association
+ * @xs: pointer to transformer state struct
+ **/
+static int bond_ipsec_add_sa(struct xfrm_state *xs)
+{
+ struct net_device *bond_dev = xs->xso.dev;
+ struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *slave = rtnl_dereference(bond->curr_active_slave);
+
+ xs->xso.slave_dev = slave->dev;
+ bond->xs = xs;
+
+ if (!(slave->dev->xfrmdev_ops
+ && slave->dev->xfrmdev_ops->xdo_dev_state_add)) {
+ slave_warn(bond_dev, slave->dev, "Slave does not support ipsec offload\n");
+ return -EINVAL;
+ }
+
+ return slave->dev->xfrmdev_ops->xdo_dev_state_add(xs);
+}
+
+/**
+ * bond_ipsec_del_sa - clear out this specific SA
+ * @xs: pointer to transformer state struct
+ **/
+static void bond_ipsec_del_sa(struct xfrm_state *xs)
+{
+ struct net_device *bond_dev = xs->xso.dev;
+ struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *slave = rtnl_dereference(bond->curr_active_slave);
+
+ if (!slave)
+ return;
+
+ xs->xso.slave_dev = slave->dev;
+
+ if (!(slave->dev->xfrmdev_ops
+ && slave->dev->xfrmdev_ops->xdo_dev_state_delete)) {
+ slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__);
+ return;
+ }
+
+ slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs);
+}
+
+/**
+ * bond_ipsec_offload_ok - can this packet use the xfrm hw offload
+ * @skb: current data packet
+ * @xs: pointer to transformer state struct
+ **/
+static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
+{
+ struct net_device *bond_dev = xs->xso.dev;
+ struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *curr_active = rtnl_dereference(bond->curr_active_slave);
+ struct net_device *slave_dev = curr_active->dev;
+
+ if (!(slave_dev->xfrmdev_ops
+ && slave_dev->xfrmdev_ops->xdo_dev_offload_ok)) {
+ slave_warn(bond_dev, slave_dev, "%s: no slave xdo_dev_offload_ok\n", __func__);
+ return false;
+ }
+
+ xs->xso.slave_dev = slave_dev;
+ return slave_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs);
+}
+
+static const struct xfrmdev_ops bond_xfrmdev_ops = {
+ .xdo_dev_state_add = bond_ipsec_add_sa,
+ .xdo_dev_state_delete = bond_ipsec_del_sa,
+ .xdo_dev_offload_ok = bond_ipsec_offload_ok,
+};
+#endif /* CONFIG_XFRM_OFFLOAD */
+
/*------------------------------- Link status -------------------------------*/

/* Set the carrier state for the master according to the state of its
@@ -879,6 +958,11 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
return;

if (new_active) {
+#ifdef CONFIG_XFRM_OFFLOAD
+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) && bond->xs)
+ bond_ipsec_del_sa(bond->xs);
+#endif /* CONFIG_XFRM_OFFLOAD */
+
new_active->last_link_up = jiffies;

if (new_active->link == BOND_LINK_BACK) {
@@ -941,6 +1025,13 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
bond_should_notify_peers(bond);
}

+#ifdef CONFIG_XFRM_OFFLOAD
+ if (old_active && bond->xs) {
+ xfrm_dev_state_flush(dev_net(bond->dev), bond->dev, true);
+ bond_ipsec_add_sa(bond->xs);
+ }
+#endif /* CONFIG_XFRM_OFFLOAD */
+
call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, bond->dev);
if (should_notify_peers) {
bond->send_peer_notif--;
@@ -1127,15 +1218,24 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
#define BOND_ENC_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
NETIF_F_RXCSUM | NETIF_F_ALL_TSO)

+#ifdef CONFIG_XFRM_OFFLOAD
+#define BOND_XFRM_FEATURES (NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | \
+ NETIF_F_GSO_ESP)
+#endif /* CONFIG_XFRM_OFFLOAD */
+
#define BOND_MPLS_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
NETIF_F_ALL_TSO)

+
static void bond_compute_features(struct bonding *bond)
{
unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |
IFF_XMIT_DST_RELEASE_PERM;
netdev_features_t vlan_features = BOND_VLAN_FEATURES;
netdev_features_t enc_features = BOND_ENC_FEATURES;
+#ifdef CONFIG_XFRM_OFFLOAD
+ netdev_features_t xfrm_features = BOND_XFRM_FEATURES;
+#endif /* CONFIG_XFRM_OFFLOAD */
netdev_features_t mpls_features = BOND_MPLS_FEATURES;
struct net_device *bond_dev = bond->dev;
struct list_head *iter;
@@ -1157,6 +1257,12 @@ static void bond_compute_features(struct bonding *bond)
slave->dev->hw_enc_features,
BOND_ENC_FEATURES);

+#ifdef CONFIG_XFRM_OFFLOAD
+ xfrm_features = netdev_increment_features(xfrm_features,
+ slave->dev->hw_enc_features,
+ BOND_XFRM_FEATURES);
+#endif /* CONFIG_XFRM_OFFLOAD */
+
mpls_features = netdev_increment_features(mpls_features,
slave->dev->mpls_features,
BOND_MPLS_FEATURES);
@@ -1176,6 +1282,9 @@ static void bond_compute_features(struct bonding *bond)
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_STAG_TX |
NETIF_F_GSO_UDP_L4;
+#ifdef CONFIG_XFRM_OFFLOAD
+ bond_dev->hw_enc_features |= xfrm_features;
+#endif /* CONFIG_XFRM_OFFLOAD */
bond_dev->mpls_features = mpls_features;
bond_dev->gso_max_segs = gso_max_segs;
netif_set_gso_max_size(bond_dev, gso_max_size);
@@ -1464,6 +1573,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
slave_dbg(bond_dev, slave_dev, "is !NETIF_F_VLAN_CHALLENGED\n");
}

+ if (slave_dev->features & NETIF_F_HW_ESP)
+ slave_dbg(bond_dev, slave_dev, "is esp-hw-offload capable\n");
+
/* Old ifenslave binaries are no longer supported. These can
* be identified with moderate accuracy by the state of the slave:
* the current ifenslave will set the interface down prior to
@@ -4542,6 +4654,13 @@ void bond_setup(struct net_device *bond_dev)
bond_dev->priv_flags |= IFF_BONDING | IFF_UNICAST_FLT | IFF_NO_QUEUE;
bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);

+#ifdef CONFIG_XFRM_OFFLOAD
+ /* set up xfrm device ops (only supported in active-backup right now) */
+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
+ bond_dev->xfrmdev_ops = &bond_xfrmdev_ops;
+ bond->xs = NULL;
+#endif /* CONFIG_XFRM_OFFLOAD */
+
/* don't acquire bond device's netif_tx_lock when transmitting */
bond_dev->features |= NETIF_F_LLTX;

@@ -4560,6 +4679,10 @@ void bond_setup(struct net_device *bond_dev)
NETIF_F_HW_VLAN_CTAG_FILTER;

bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
+#ifdef CONFIG_XFRM_OFFLOAD
+ if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
+ bond_dev->hw_features |= BOND_XFRM_FEATURES;
+#endif /* CONFIG_XFRM_OFFLOAD */
bond_dev->features |= bond_dev->hw_features;
bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
}
diff --git a/include/net/bonding.h b/include/net/bonding.h
index aa854a9c01e2..a00e1764e9b1 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -238,6 +238,9 @@ struct bonding {
struct dentry *debug_dir;
#endif /* CONFIG_DEBUG_FS */
struct rtnl_link_stats64 bond_stats;
+#ifdef CONFIG_XFRM_OFFLOAD
+ struct xfrm_state *xs;
+#endif /* CONFIG_XFRM_OFFLOAD */
};

#define bond_slave_get_rcu(dev) \
--
2.20.1

2020-06-19 14:30:35

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH net-next v2 0/4] bonding: initial support for hardware crypto offload

On Fri, Jun 19, 2020 at 6:26 AM Jeff Kirsher
<[email protected]> wrote:
>
> On Wed, Jun 10, 2020 at 1:18 PM Jarod Wilson <[email protected]> wrote:
> >
> > This is an initial functional implementation for doing pass-through of
> > hardware encryption from bonding device to capable slaves, in active-backup
> > bond setups. This was developed and tested using ixgbe-driven Intel x520
> > interfaces with libreswan and a transport mode connection, primarily using
> > netperf, with assorted connection failures forced during transmission. The
> > failover works quite well in my testing, and overall performance is right
> > on par with offload when running on a bare interface, no bond involved.
> >
> > Caveats: this is ONLY enabled for active-backup, because I'm not sure
> > how one would manage multiple offload handles for different devices all
> > running at the same time in the same xfrm, and it relies on some minor
> > changes to both the xfrm code and slave device driver code to get things
> > to behave, and I don't have immediate access to any other hardware that
> > could function similarly, but the NIC driver changes are minimal and
> > straight-forward enough that I've included what I think ought to be
> > enough for mlx5 devices too.
> >
> > v2: reordered patches, switched (back) to using CONFIG_XFRM_OFFLOAD
> > to wrap the code additions and wrapped overlooked additions.
> >
> > Jarod Wilson (4):
> > xfrm: bail early on slave pass over skb
> > ixgbe_ipsec: become aware of when running as a bonding slave
> > mlx5: become aware of when running as a bonding slave
> > bonding: support hardware encryption offload to slaves
...
> Was this ever sent to netdev (the more appropriate ML)?

I believe so, but I'd neglected to notice net-next was closed at the
time, so I was holding on to it to resubmit once net-next is opened
back up.

--
Jarod Wilson
[email protected]

2020-06-19 17:18:28

by Jeff Kirsher

[permalink] [raw]
Subject: Re: [PATCH net-next v2 0/4] bonding: initial support for hardware crypto offload

On Wed, Jun 10, 2020 at 1:18 PM Jarod Wilson <[email protected]> wrote:
>
> This is an initial functional implementation for doing pass-through of
> hardware encryption from bonding device to capable slaves, in active-backup
> bond setups. This was developed and tested using ixgbe-driven Intel x520
> interfaces with libreswan and a transport mode connection, primarily using
> netperf, with assorted connection failures forced during transmission. The
> failover works quite well in my testing, and overall performance is right
> on par with offload when running on a bare interface, no bond involved.
>
> Caveats: this is ONLY enabled for active-backup, because I'm not sure
> how one would manage multiple offload handles for different devices all
> running at the same time in the same xfrm, and it relies on some minor
> changes to both the xfrm code and slave device driver code to get things
> to behave, and I don't have immediate access to any other hardware that
> could function similarly, but the NIC driver changes are minimal and
> straight-forward enough that I've included what I think ought to be
> enough for mlx5 devices too.
>
> v2: reordered patches, switched (back) to using CONFIG_XFRM_OFFLOAD
> to wrap the code additions and wrapped overlooked additions.
>
> Jarod Wilson (4):
> xfrm: bail early on slave pass over skb
> ixgbe_ipsec: become aware of when running as a bonding slave
> mlx5: become aware of when running as a bonding slave
> bonding: support hardware encryption offload to slaves
>
> drivers/net/bonding/bond_main.c | 127 +++++++++++++++++-
> .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 39 ++++--
> .../mellanox/mlx5/core/en_accel/ipsec.c | 6 +
> include/net/bonding.h | 3 +
> include/net/xfrm.h | 1 +
> net/xfrm/xfrm_device.c | 34 ++---
> 6 files changed, 183 insertions(+), 27 deletions(-)

Was this ever sent to netdev (the more appropriate ML)?

--
Cheers,
Jeff