Subject: Re: [PATCH net-next v2] xfrm: Add ESN support for IPSec HW offload
To: Aviad Yehezkel <aviadye@dev.mellanox.co.il>,
        Yossi Kuperman <yossiku@mellanox.com>
Cc: Yossef Efraim <yossefe@mellanox.com>,
        Jonathan Corbet <corbet@lwn.net>,
        "David S. Miller" <davem@davemloft.net>,
        Steffen Klassert <steffen.klassert@secunet.com>,
        Herbert Xu <herbert@gondor.apana.org.au>,
        "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
        Boris Pismenny <borisp@mellanox.com>,
        Yevgeny Kliteynik <kliteyn@mellanox.com>
References: <1515580453-18470-1-git-send-email-yossefe@mellanox.com>
 <b9e24887-ae26-59c2-9ccf-d71c1429c8c9@oracle.com>
 <B8CEC0B8-8A38-40A2-A1F1-F8967D2ACCC2@mellanox.com>
 <bd01455a-6fe8-7f71-d09a-05272a55c144@oracle.com>
 <DB6PR0501MB271023B10CD4A65C5F017A75C4160@DB6PR0501MB2710.eurprd05.prod.outlook.com>
 <c42283e8-1843-79ea-5dab-c4ad3082fa26@dev.mellanox.co.il>
From: Shannon Nelson <shannon.nelson@oracle.com>
Organization: Oracle Corporation
Message-ID: <c7a56d5b-557e-4c4e-4d52-092b26195a77@oracle.com>
Date: Thu, 11 Jan 2018 11:15:39 -0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.5.2
MIME-Version: 1.0
In-Reply-To: <c42283e8-1843-79ea-5dab-c4ad3082fa26@dev.mellanox.co.il>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org

On 1/11/2018 5:51 AM, Aviad Yehezkel wrote:
> 
> On 1/11/2018 10:28 AM, Yossi Kuperman wrote:
>>> From: Shannon Nelson [mailto:shannon.nelson@oracle.com]
>>> Sent: Thursday, January 11, 2018 5:21 AM
>>>
>>> On 1/10/2018 3:09 PM, Yossi Kuperman wrote:
>>>>> On 10 Jan 2018, at 19:36, Shannon Nelson 
>>>>> <shannon.nelson@oracle.com> wrote:
>>>>>
>>>>>> On 1/10/2018 2:34 AM, yossefe@mellanox.com wrote:
>>>>>> From: Yossef Efraim <yossefe@mellanox.com>
>>>>>> This patch adds ESN support to IPsec device offload.
>>>>>> Adding new xfrm device operation to synchronize device ESN.
>>>>>> Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
>>>>>> ---
>>>>>> Changes from v1:
>>>>>>    - Added documentation
>>>>>> ---
>>>>>>    Documentation/networking/xfrm_device.txt |  3 +++
>>>>>>    include/linux/netdevice.h                |  1 +
>>>>>>    include/net/xfrm.h                       | 12 ++++++++++++
>>>>>>    net/xfrm/xfrm_device.c                   |  4 ++--
>>>>>>    net/xfrm/xfrm_replay.c                   |  2 ++
>>>>>>    5 files changed, 20 insertions(+), 2 deletions(-)
>>> [...]
>>>
>>>>>> diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
>>>>>> index 7598250..704a055 100644
>>>>>> --- a/net/xfrm/xfrm_device.c
>>>>>> +++ b/net/xfrm/xfrm_device.c
>>>>>> @@ -147,8 +147,8 @@ int xfrm_dev_state_add(struct net *net, struct 
>>>>>> xfrm_state *x,
>>>>>>        if (!x->type_offload)
>>>>>>            return -EINVAL;
>>>>>>    -    /* We don't yet support UDP encapsulation, TFC padding and 
>>>>>> ESN. */
>>>>>> -    if (x->encap || x->tfcpad || (x->props.flags & XFRM_STATE_ESN))
>>>>>> +    /* We don't yet support UDP encapsulation and TFC padding. */
>>>>>> +    if (x->encap || x->tfcpad)
>>>>> As I mentioned before, this will cause issues when working with 
>>>>> hardware that has no ESN support, such as Intel's x540: the stack will
>>> expect the driver to do ESN, and nothing actually happens but a 
>>> rollover of the numbers.  Sure, the driver could look for the ESN 
>>> attribute
>>> and fail the add, but that's a mode where we have to update every 
>>> driver to fend off problems every time we add a new feature.  Much
>>> better is to only update drivers that actively support the new feature.
>>>> You are right.
>>>>
>>>> I’m not sure why this check is here in the first place. IMO it 
>>>> should take place in xdo_dev_state_add—a driver-specific callback.
>>>>
>>> If you say I'm right, then why do you say it should take place in the
>>> driver callback?  I just wrote that it should *not*.
>>>
>> Sorry, I wasn't clear; you are right with respect that this change 
>> will break Intel's x540 driver.
>>
>> However, I do think that this is the purpose of xdo_dev_state_add(). 
>> Again, As far as I can understand, and please correct me if I'm wrong, 
>> this shouldn’t be here in the first place.
>>
>> Please have a look at mlx5e_xfrm_validate_state(). Currently, it 
>> return an error if the user requests ESN, regardless of the underlying 
>> device's capabilities. Subsequent patch to mlx5 driver, will allow 
>> such a request if the device does support it; maintaining backward 
>> compatibility.
>>
>> Here is a code snippet:
>>
>> -       if (x->props.flags & XFRM_STATE_ESN) {
>> +       if (x->props.flags & XFRM_STATE_ESN &&
>> +           !(mlx5_accel_ipsec_device_caps(priv->mdev) & 
>> MLX5_ACCEL_IPSEC_ESN)) {
>>                  netdev_info(netdev, "Cannot offload ESN xfrm states\n");
>>                  return -EINVAL;
>>          }
>>
>>> This code seems to be assuming that all drivers/NICs with the offload
>>> will be able to do ESN, and this is not the case.  If this code is put
>>> into place, suddenly the ixgbe driver's offload will have a failure
>>> case: the driver doesn't support ESN, and doesn't know to NAK the
>>> state_add if the ESN bit is on.  This is a generic capabilities issue
>>> for which we already have a solution "pattern".
>>>
> I guess you are right but ixgbe driver is already checking many other 
> caps during add_sa callback (below code from v3 patches for ixgbe ipsec):
> 
> +    if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
> +        netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
> +               xs->id.proto);
> +        return -EINVAL;
> +    }
> +
> +    if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
> +        struct rx_sa rsa;
> +
> +        if (xs->calg) {
> +            netdev_err(dev, "Compression offload not supported\n");
> +            return -EINVAL;
> +        }
> 
> 
> What is the difference for checking xs->calg exists in state to ESN?

Yes, the currently existing 2 drivers are doing this: I imagine that 
mlx5e did it this way because it happened to be the first driver and it 
got things working; ixgbe followed because there wasn't any other way to 
do this.  But this doesn't mean it is the right thing to do, and this is 
good that we're having this discussion before too many other drivers end 
up following this same example.

If I've read the patch correctly, the SA with ESN enabled will be added 
for offload, but nothing will happen when the ESN needs to be advanced 
if the driver hasn't implemented xdo_dev_state_advance_esn().  At this 
point the ipsec conversation will fail, correct?  How do we protect the 
XFRM stack's new feature from drivers that don't support it?

The quick and dirty answer is for this patch to include code for ixgbe 
and any other ipsec-offload drivers.  However, this becomes a burden for 
the author of the any feature, where every driver will need to be 
updated for it to work correctly, and every driver will need to have the 
same code to do it.  This is opening the door to mistakes.

When we look at code like mlx5e_xfrm_validate_state(), and similar 
things in ixgbe, we can see there are many capabilities that every ipsec 
offload driver needs to check for.  If drivers have to copy code to do 
the same checks, let's push these common requirements up the stack so we 
only need the code in one place, rather than code it in every driver.

> I think in long term we can refactor to cap mask declaration by the 
> driver and call add_sa only if mask exists but
> this can be a totally different patch.

Let's do this now before more drivers are enabled for ipsec, while the 
problem is still small.

In the meantime, while we're still hashing this out, please at least add 
something in xfrm_dev_state_add() to return -EINVAL if the driver hasn't 
implemented xdo_dev_state_advance_esn().  Perhaps something like this:

@@ -172,10 +172,12 @@ int xfrm_dev_state_add(struct net *net, struct 
xfrm_state *x,
                 dst_release(dst);
         }

-       if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add) {
+       if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add ||
+           ((x->props.flags & XFRM_STATE_ESN) &&
+                      !dev->xfrmdev_ops->xdo_dev_state_advance_esn)) {
                 xso->dev = NULL;
                 dev_put(dev);
-               return 0;
+               return -EINVAL;
         }

         xso->dev = dev;


sln


> 
> 
>> We weren't assuming that, please see above.
>>
>>>   > What do you suggest?
>>>   >
>>>
>>> There should be a capabilities/feature flag for the driver to set and
>>> the XFRM code shouldn't try the state_add with ESN if the driver hasn't
>>> set an ESN bit in its capabilities.  Other capabilities that might make
>>> sense here are IPv6, TSO, and CSUM; there may be others.
>>>
>>>>> Look at how feature bits are added to netdev->features to signify 
>>>>> what the driver can do.  I think that's a much better approach.
>>>>>
>>>> It looks like an overkill?
>>> Alternatively, just solve this by failing to add the SA that has ESN set
>>> if the driver hasn't defined your new xdo_dev_state_advance_esn().
>>>
>>> sln
>>>
>>>
>>>>> sln
>>>>>
>>>>>
>>>>>>            return -EINVAL;
>>>>>>          dev = dev_get_by_index(net, xuo->ifindex);
>>>>>> diff --git a/net/xfrm/xfrm_replay.c b/net/xfrm/xfrm_replay.c
>>>>>> index 0250181..1d38c6a 100644
>>>>>> --- a/net/xfrm/xfrm_replay.c
>>>>>> +++ b/net/xfrm/xfrm_replay.c
>>>>>> @@ -551,6 +551,8 @@ static void xfrm_replay_advance_esn(struct 
>>>>>> xfrm_state *x, __be32 net_seq)
>>>>>>                bitnr = replay_esn->replay_window - (diff - pos);
>>>>>>        }
>>>>>>    +    xfrm_dev_state_advance_esn(x);
>>>>>> +
>>>>>>        nr = bitnr >> 5;
>>>>>>        bitnr = bitnr & 0x1F;
>>>>>>        replay_esn->bmp[nr] |= (1U << bitnr);
>