Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp499358pxb; Tue, 9 Feb 2021 05:55:56 -0800 (PST) X-Google-Smtp-Source: ABdhPJy+Ck3wxOqO7/fSQQ9WdnR7U8ZYLkpBIC07m8o5YFveslIL1gedOLHlosjLLU9OqY1vAIe1 X-Received: by 2002:a50:c88d:: with SMTP id d13mr23025252edh.206.1612878955873; Tue, 09 Feb 2021 05:55:55 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1612878955; cv=none; d=google.com; s=arc-20160816; b=v1D0SkVUGOFtGgvXLXx3h5TF+dtGHBqU8usvrftvg3VXCFqGFFKjYX23fV+aBxxDih UczsjSKLgYEe2GDu9aX+9vNpkRpMlUQ8hpR/cZYkq2Uw4CsJpeqyAAhkd+eG/41pKhue eH/k6UhstfrWLn0xu+ALNEUi48/IVZCK4dKdAthzgCqvS9nY8RfnNwlnB84MLreEpqTV BsRDb9noEayohkPtH2RGS6ktnICroZ7NsbnW9D1o35NyzeDlJi4FGgRPF1/YwVhMamiE sMj9wFpORLjh9HktKMZOdLWAt6sDIwbieryDm601PlWCHKKwN+m8SKdtIRqwAy991Nbs 5cCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=EFHWJqOuZ5oCfySHPR3lGIAEM6/c2S56gFx0hAWTits=; b=IkqaWo5HFwoNsTVqqg+ive2Tsh4uFEXCe/38i3gucaD2kVPrLY31ZUhdUm2BfBrz4w SOLkUYoI9Yhm9Cr04tmSXZ/kPnKWhZBFLtLID1mo/lv8aasRkP/IUlyY1GrYtkJw49jQ P2HvF/bB74QgnVvX7c/mmPXwsfRXbKdddCw5JMShhrXJGxNa+RrbtFj3NcCyqCxYfpak +VLCp2EIoZEvEH8tK5nCNVOCoMyy/HwqIFXhBg74Pw7NGV7RW33E0rODnbg/tTySJ8qe Laxrs1I3HrDnIAZhyF1IGFHXPUtZ7mv1XZoO+a5IAXnb3k/0O0MfyW+OFt48mWJoQimm /qKw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Ppv7Q7ej; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p12si12952231ejx.70.2021.02.09.05.55.32; Tue, 09 Feb 2021 05:55:55 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Ppv7Q7ej; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231881AbhBINvu (ORCPT + 99 others); Tue, 9 Feb 2021 08:51:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32814 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231956AbhBINtu (ORCPT ); Tue, 9 Feb 2021 08:49:50 -0500 Received: from mail-ej1-x633.google.com (mail-ej1-x633.google.com [IPv6:2a00:1450:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D20D3C061797; Tue, 9 Feb 2021 05:49:08 -0800 (PST) Received: by mail-ej1-x633.google.com with SMTP id a9so31540402ejr.2; Tue, 09 Feb 2021 05:49:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=EFHWJqOuZ5oCfySHPR3lGIAEM6/c2S56gFx0hAWTits=; b=Ppv7Q7ejOG9vxA58OJEbZVdzNl7aX/qp2kwYiGA43KDSoKyQ/O+Xuo1h5pbXDi7Ihv m+nOm1oG/rMqw5wq7RA/qdEqVjoIzBBzxVOVhZbbSNNzSm59kmnQ8CrSYf4VHuABfoRz q7z+ZLHD5lB0VJIEvIG4s4hjEUscbocpcQ092Ui9Pqivm98hvHh+ji9xVq66RSU7/MIi iokQyZCNIaUO1P+vfSBagS6Mwol9B3pNSpFjPvzeq8LCtMnyJz1VgrLbQ1XpDeuRIbVj +asGO79iHe9LOxGKRC1ZCR8SMd6BhhNjnWuJFBbAV57WADu3Pxp1B+snMkjhRRPbILce GmtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=EFHWJqOuZ5oCfySHPR3lGIAEM6/c2S56gFx0hAWTits=; b=fGfNVFWp464zCeDR6SJnDU8j1rXJO74woAm7MQfk5GjVP/xjI/T51eoCXKCYju700B /3okztziAcz1vAGKl6NBm77Vy87CNQnMcJKH8ol5GWqbE0lkzvQgR58SGWulX9renn+a 3TtdysE9kvMQkI7DTnUUJN2corgsneM/Nu2nms7bQroz0CRPpqQoLvPwmpjRXuXle0dg 3f/la18uNzRfD1XOMdM46TV1IgrX7Nx9U7HdbeCEYotoRkQWb7Qa5+No67blfT/vwBCb joWusy75c6MjFcSNaL9QJ3nEKRb77h2ADjsAklbOUp2K5LisOz/l/KGp5gRbstpr/1gn T55Q== X-Gm-Message-State: AOAM5319f1A5g4H1PpaURxi/fXopv2wCqb8eqZmFFMUFdxdJklRaLgcz nwIl7EvPmZsHBTHFKP23/SpvVccPCT+OrfkxSDI= X-Received: by 2002:a17:906:158c:: with SMTP id k12mr22774572ejd.119.1612878542051; Tue, 09 Feb 2021 05:49:02 -0800 (PST) MIME-Version: 1.0 References: <20210208113810.11118-1-hxseverything@gmail.com> <8552C5F8-8410-4E81-8AF4-7018878AFCDC@gmail.com> In-Reply-To: <8552C5F8-8410-4E81-8AF4-7018878AFCDC@gmail.com> From: Willem de Bruijn Date: Tue, 9 Feb 2021 08:48:26 -0500 Message-ID: Subject: Re: [PATCH] bpf: in bpf_skb_adjust_room correct inner protocol for vxlan To: =?UTF-8?B?6buE5a2m5qOu?= Cc: David Miller , bpf , Daniel Borkmann , Network Development , linux-kernel , chengzhiyong , wangli , Alan Maguire Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 9, 2021 at 5:41 AM =E9=BB=84=E5=AD=A6=E6=A3=AE wrote: > > Appreciate for your reply Willem! > > The original intention of this commit is that when we use bpf_skb_adjust_= room to encapsulate > Vxlan packets, we find some powerful device features disabled. > > Setting the inner_protocol directly as skb->protocol is the root cause. > > I understand that it=E2=80=99s not easy to handle all tunnel protocol in = one bpf helper function. But for my > immature idea, when pushing Ethernet header, setting the inner_protocol a= s ETH_P_TEB may > be better. > > Now the flag BPF_F_ADJ_ROOM_ENCAP_L4_UDP includes many udp tunnel types( = e.g. > udp+mpls, geneve, vxlan). Adding an independent flag to represents Vxlan = looks a little > reduplicative. What=E2=80=99s your suggestion? Agreed. I don't mean to add a vxlan specific flag. Instead, a way to identify that the encapsulation includes a mac header. To a certain extent, that already exists as of commit 58dfc900faff ("bpf: add layer 2 encap support to bpf_skb_adjust_room"). That computes an inner_maclen. It makes sense that inner_protocol needs to be updated if inner_maclen indicates a mac header. I would only not infer it based on some imprecise measure, such as inner_maclen being 14. But add a new explicit flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH. Update inner protocol if the flag is passed and inner_maclen >=3D ETH_HLEN. Fail the operation if the flag is passed and inner_maclen is too short. > Thanks again for your reply! > > > > > 2021=E5=B9=B42=E6=9C=888=E6=97=A5 =E4=B8=8B=E5=8D=889:06=EF=BC=8CWillem= de Bruijn =E5=86=99=E9=81=93=EF=BC=9A > > > > On Mon, Feb 8, 2021 at 7:16 AM huangxuesen wr= ote: > >> > >> From: huangxuesen > >> > >> When pushing vxlan tunnel header, set inner protocol as ETH_P_TEB in s= kb > >> to avoid HW device disabling udp tunnel segmentation offload, just lik= e > >> vxlan_build_skb does. > >> > >> Drivers for NIC may invoke vxlan_features_check to check the > >> inner_protocol in skb for vxlan packets to decide whether to disable > >> NETIF_F_GSO_MASK. Currently it sets inner_protocol as the original > >> skb->protocol, that will make mlx5_core disable TSO and lead to huge > >> performance degradation. > >> > >> Signed-off-by: huangxuesen > >> Signed-off-by: chengzhiyong > >> Signed-off-by: wangli > >> --- > >> net/core/filter.c | 7 ++++++- > >> 1 file changed, 6 insertions(+), 1 deletion(-) > >> > >> diff --git a/net/core/filter.c b/net/core/filter.c > >> index 255aeee72402..f8d3ba3fe10f 100644 > >> --- a/net/core/filter.c > >> +++ b/net/core/filter.c > >> @@ -3466,7 +3466,12 @@ static int bpf_skb_net_grow(struct sk_buff *skb= , u32 off, u32 len_diff, > >> skb->inner_mac_header =3D inner_net - inner_mac_len; > >> skb->inner_network_header =3D inner_net; > >> skb->inner_transport_header =3D inner_trans; > >> - skb_set_inner_protocol(skb, skb->protocol); > >> + > >> + if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP && > >> + inner_mac_len =3D=3D ETH_HLEN) > >> + skb_set_inner_protocol(skb, htons(ETH_P_TEB)); > > > > This may be used by vxlan, but it does not imply it. > > > > Adding ETH_HLEN bytes likely means pushing an Ethernet header, but same= point. > > > > Conversely, pushing an Ethernet header is not limited to UDP encap. > > > > This probably needs a new explicit BPF_F_ADJ_ROOM_.. flag, rather than > > trying to infer from imprecise heuristics. >