From: Willem de Bruijn
Date: Wed, 3 Mar 2021 13:53:34 -0500
Subject: Re: [PATCH/v4] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH
To: Xuesen Huang
Cc: Daniel Borkmann, David Miller, bpf, Willem de Bruijn, Network Development, linux-kernel, Cong Wang, Xuesen Huang, Zhiyong Cheng, Li Wang
In-Reply-To: <20210303123338.99089-1-hxseverything@gmail.com>

On Wed, Mar 3, 2021 at 7:33 AM Xuesen Huang wrote:
>
> From: Xuesen Huang
>
> bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets
> encapsulation. But that is not appropriate when pushing Ethernet header.
>
> Add an option to further specify encap L2 type and set the inner_protocol
> as ETH_P_TEB.
>
> Update test_tc_tunnel to verify adding vxlan encapsulation works with
> this flag.
>
> Suggested-by: Willem de Bruijn
> Signed-off-by: Xuesen Huang
> Signed-off-by: Zhiyong Cheng
> Signed-off-by: Li Wang

Thanks for adding the test. Perhaps that is better in a separate patch?

Overall looks great to me.

The patch has not (yet?) arrived on patchwork.

>  enum {
> diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> index 37bce7a..6e144db 100644
> --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> @@ -20,6 +20,14 @@
>  #include
>  #include
>
> +#define encap_ipv4(...) __encap_ipv4(__VA_ARGS__, 0)
> +
> +#define encap_ipv4_with_ext_proto(...) __encap_ipv4(__VA_ARGS__)
> +
> +#define encap_ipv6(...) __encap_ipv6(__VA_ARGS__, 0)
> +
> +#define encap_ipv6_with_ext_proto(...) __encap_ipv6(__VA_ARGS__)
> +

Instead of untyped macros, I'd define encap_ipv4 as a function that
calls __encap_ipv4.

And no need for encap_ipv4_with_ext_proto equivalent to __encap_ipv4.

>  static const int cfg_port = 8000;
>
>  static const int cfg_udp_src = 20000;
> @@ -27,11 +35,24 @@
>  #define UDP_PORT 5555
>  #define MPLS_OVER_UDP_PORT 6635
>  #define ETH_OVER_UDP_PORT 7777
> +#define VXLAN_UDP_PORT 8472
> +
> +#define EXTPROTO_VXLAN 0x1
> +
> +#define VXLAN_N_VID (1u << 24)
> +#define VXLAN_VNI_MASK bpf_htonl((VXLAN_N_VID - 1) << 8)
> +#define VXLAN_FLAGS 0x8
> +#define VXLAN_VNI 1
>
>  /* MPLS label 1000 with S bit (last label) set and ttl of 255.
>   */
>  static const __u32 mpls_label = __bpf_constant_htonl(1000 << 12 |
>  						     MPLS_LS_S_MASK | 0xff);
>
> +struct vxlanhdr {
> +	__be32 vx_flags;
> +	__be32 vx_vni;
> +} __attribute__((packed));
> +
>  struct gre_hdr {
>  	__be16 flags;
>  	__be16 protocol;
> @@ -45,13 +66,13 @@ struct gre_hdr {
>  struct v4hdr {
>  	struct iphdr ip;
>  	union l4hdr l4hdr;
> -	__u8 pad[16]; /* enough space for L2 header */
> +	__u8 pad[24]; /* space for L2 header / vxlan header ... */

could we use something like sizeof(..) instead of a constant?

> @@ -171,14 +197,26 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, __u8 encap_proto,
>  	}
>
>  	/* add L2 encap (if specified) */
> +	l2_hdr = (__u8 *)&h_outer + olen;
>  	switch (l2_proto) {
>  	case ETH_P_MPLS_UC:
> -		*((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label;
> +		*(__u32 *)l2_hdr = mpls_label;
>  		break;
>  	case ETH_P_TEB:
> -		if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen,
> -				       ETH_HLEN))

This is non-standard indentation? Here and elsewhere.

> @@ -249,7 +288,11 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
>  		break;
>  	case ETH_P_TEB:
>  		l2_len = ETH_HLEN;
> -		udp_dst = ETH_OVER_UDP_PORT;
> +		if (ext_proto & EXTPROTO_VXLAN) {
> +			udp_dst = VXLAN_UDP_PORT;
> +			l2_len += sizeof(struct vxlanhdr);
> +		} else
> +			udp_dst = ETH_OVER_UDP_PORT;
>  		break;
>  	}
>  	flags |= BPF_F_ADJ_ROOM_ENCAP_L2(l2_len);
> @@ -267,7 +310,7 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
>  		h_outer.l4hdr.udp.source = __bpf_constant_htons(cfg_udp_src);
>  		h_outer.l4hdr.udp.dest = bpf_htons(udp_dst);
>  		tot_len = bpf_ntohs(iph_inner.payload_len) + sizeof(iph_inner) +
> -			  sizeof(h_outer.l4hdr.udp);
> +			  sizeof(h_outer.l4hdr.udp) + l2_len;

Was this a bug previously?
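
For context on the question above: per RFC 768 the UDP length field counts the UDP header plus every byte carried after it, so an L2 header placed between the outer UDP header and the inner IPv6 packet would be part of that total. The following is only a host-order sketch of that arithmetic. The helper name `outer_udp_len` and the enum constants are made up for illustration and are not taken from test_tc_tunnel.c.

```c
#include <stdint.h>

/* On-wire header sizes in bytes (illustrative constants, not kernel types). */
enum {
	IPV6_HDR_LEN  = 40,	/* fixed IPv6 header, i.e. sizeof(iph_inner) */
	UDP_HDR_LEN   = 8,	/* sizeof(h_outer.l4hdr.udp) */
	ETH_HDR_LEN   = 14,	/* ETH_HLEN */
	VXLAN_HDR_LEN = 8,	/* sizeof(struct vxlanhdr) as defined in the patch */
};

/*
 * Hypothetical helper: host-order value for the outer UDP length field
 * when the UDP payload is [optional L2 encap][inner IPv6 header][payload].
 * inner_payload_len corresponds to bpf_ntohs(iph_inner.payload_len).
 */
static inline uint16_t outer_udp_len(uint16_t inner_payload_len, uint16_t l2_len)
{
	return (uint16_t)(inner_payload_len + IPV6_HDR_LEN + UDP_HDR_LEN + l2_len);
}
```

For a 100-byte inner payload this gives 148 with no L2 header, 162 with an Ethernet header (l2_len = 14), and 170 with Ethernet plus VXLAN (l2_len = 22).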
>  		h_outer.l4hdr.udp.check = 0;
>  		h_outer.l4hdr.udp.len = bpf_htons(tot_len);
>  		break;
> @@ -278,13 +321,24 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
>  	}
>
>  	/* add L2 encap (if specified) */
> +	l2_hdr = (__u8 *)&h_outer + olen;
>  	switch (l2_proto) {
>  	case ETH_P_MPLS_UC:
> -		*((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label;
> +		*(__u32 *)l2_hdr = mpls_label;
>  		break;
>  	case ETH_P_TEB:
> -		if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen,
> -				       ETH_HLEN))
> +		flags |= BPF_F_ADJ_ROOM_ENCAP_L2_ETH;

This is a change also for the existing case. Correctly so, I imagine.
But the test used to pass with the wrong protocol?
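
The suggestion earlier in this review to replace the variadic macros with a typed function could look roughly like the sketch below. It is not the code from the patch. The parameter list of `__encap_ipv4` is an assumption, since the quoted hunk header is truncated, and `struct skb_stub` is a stand-in for `struct __sk_buff` so that the fragment compiles outside the BPF selftest tree.

```c
#include <stdint.h>

/* Stand-in for the kernel's struct __sk_buff (assumption, for illustration). */
struct skb_stub {
	uint32_t len;
};

#define EXTPROTO_VXLAN 0x1

/*
 * Assumed signature of the extended helper. This stub body just returns
 * ext_proto so that the value forwarded by the wrapper is observable.
 */
static inline int __encap_ipv4(struct skb_stub *skb, uint8_t encap_proto,
			       uint16_t l2_proto, uint16_t ext_proto)
{
	(void)skb;
	(void)encap_proto;
	(void)l2_proto;
	return ext_proto;
}

/*
 * Typed wrapper in place of "#define encap_ipv4(...)": the compiler now
 * checks the argument count and types at every call site.
 */
static inline int encap_ipv4(struct skb_stub *skb, uint8_t encap_proto,
			     uint16_t l2_proto)
{
	return __encap_ipv4(skb, encap_proto, l2_proto, 0);
}
```

Callers that need VXLAN would call `__encap_ipv4(..., EXTPROTO_VXLAN)` directly, which is why a separate `encap_ipv4_with_ext_proto` name adds nothing.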
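
On the `pad[24]` comment earlier in this review, one way to derive the size from the header types rather than a literal is sketched here. `struct eth_hdr_stub`, `L2_PAD_SZ` and `struct v4hdr_pad_sketch` are hypothetical names used so that the fragment stands alone. In the selftest the operands would presumably be `struct ethhdr` and the patch's `struct vxlanhdr`.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in with the same layout as an Ethernet header (14 bytes). */
struct eth_hdr_stub {
	uint8_t  dst[6];
	uint8_t  src[6];
	uint16_t proto;
} __attribute__((packed));

/* Same layout as the vxlanhdr added by the patch (8 bytes). */
struct vxlanhdr {
	uint32_t vx_flags;
	uint32_t vx_vni;
} __attribute__((packed));

/* Largest L2 encap handled here: VXLAN header plus Ethernet header. */
#define L2_PAD_SZ (sizeof(struct eth_hdr_stub) + sizeof(struct vxlanhdr))

/* Only the pad member is shown; the real struct also holds ip and l4hdr. */
struct v4hdr_pad_sketch {
	uint8_t pad[L2_PAD_SZ];
};

/* Compile-time check that the derived size is what the comment claims. */
static_assert(L2_PAD_SZ == 22, "Ethernet (14) + VXLAN (8) headers");
```

The derived size is 22 bytes, while the patch uses 24. Whether the extra two bytes are deliberate rounding is not stated in the thread.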