Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp1935341iob; Thu, 5 May 2022 11:13:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzLwM2nG7tylNpmv9PbWzF+vFWdTj3vQtMgKPsxTxAMkv7llBk3OpKIgJQFTBRmJGqb9IBg X-Received: by 2002:a17:902:edc7:b0:15b:4196:1957 with SMTP id q7-20020a170902edc700b0015b41961957mr28193627plk.161.1651774432434; Thu, 05 May 2022 11:13:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1651774432; cv=none; d=google.com; s=arc-20160816; b=sHhKwn5rQ3tpHDM5wpyM9PGBxdlsgOOr4pl09PlIskbHT3QG8t/gTzwEGWdIv1A5QI xW0yiNzBhL7xjExX+jl4fDmGOFNx0WJDx+2NEgXa2VF9mnPIJy3Yz1E9xKKs0FfWCdBR XwPWVbbf18qzmZ6TqhsHbtLcM5kCFZVx9ZkTC6MosapZ9rcjMBlW9CU1+IkPKigYUvCD 6bQaMv/NijZQwzWjIc3D7AVDtjT5QBd9VV2LZvSfM4L3RudcKfyjMzhTSJCSTRElCoOH dGQJZFeysHtdR3/dSKR46JktKQfsFPGe2JUkGhj2qxvNtN8DiH2J0LuqNn8JMBugHpnr hk/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=Ae/KJVBrg23yHeWDmom4aC7hOAo1ytOlnRlW5Cw2mBI=; b=fvwMBpQsiikuU4L308Ce0OBk5dHTjk6NiYM9UtJKoGijTYTwcUllfGRlPvE9TmzMeO scTJNOhJpTORE90ijgVfdG38JMGcbKdNdac+ZuuVJaDwjU7UChQTR+kQboCltP2PNkxX IDGHfg0t4BLATANs0Eb4BwXql8xJIQ7aQL/9/h/8icibjTc/rVsl1se7QMgDkA3KafCW 4LlNB/HzDgQT+OMh8U/2gJcp3VZjou6LaVqsa8F6ZRT46c5uvuF50QRQOfAtJaMxmdqM iJ0Me4036SCdUWkZDAJ6q/Xl8Q/NV60N7/TPZ7WHpKeN+njemKHzCS9ziBKnD075WpyX rv+A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=W8fg19Cq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e28-20020a63371c000000b003c62ba4c1bfsi1263575pga.499.2022.05.05.11.13.36; Thu, 05 May 2022 11:13:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=W8fg19Cq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355868AbiEDRKn (ORCPT + 99 others); Wed, 4 May 2022 13:10:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355449AbiEDRAD (ORCPT ); Wed, 4 May 2022 13:00:03 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D189D4B41B; Wed, 4 May 2022 09:51:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9CB7EB827A5; Wed, 4 May 2022 16:51:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 45182C385A5; Wed, 4 May 2022 16:51:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1651683086; bh=MqJvWmbwug2SkXstXXEDsrTMt85TRcrWnBTzLiAds1s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=W8fg19CqsAH5Mg624ZWXpNnhT1xEhnwhKedfgMnjSX7cZa0/5Gu1bRdJ9DbwPXYQG lPu3ijEw9YVt+YouGPaCbDymZG4xA02DhXT3rrsGM3uQf1Jhv+9eUnoxS7j28O3mSo iBBm4tBTqFrKdrzOW2MZhKApuvEc2Ib/Ga4f1u1k= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Jakub Kicinski , Peilin Ye , William Tu , "David S. Miller" , Sasha Levin Subject: [PATCH 5.10 082/129] ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode Date: Wed, 4 May 2022 18:44:34 +0200 Message-Id: <20220504153027.594805386@linuxfoundation.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220504153021.299025455@linuxfoundation.org> References: <20220504153021.299025455@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Peilin Ye [ Upstream commit 31c417c948d7f6909cb63f0ac3298f3c38f8ce20 ] As pointed out by Jakub Kicinski, currently using TUNNEL_SEQ in collect_md mode is racy for [IP6]GRE[TAP] devices. Consider the following sequence of events: 1. An [IP6]GRE[TAP] device is created in collect_md mode using "ip link add ... external". "ip" ignores "[o]seq" if "external" is specified, so TUNNEL_SEQ is off, and the device is marked as NETIF_F_LLTX (i.e. it uses lockless TX); 2. Someone sets TUNNEL_SEQ on outgoing skb's, using e.g. bpf_skb_set_tunnel_key() in an eBPF program attached to this device; 3. gre_fb_xmit() or __gre6_xmit() processes these skb's: gre_build_header(skb, tun_hlen, flags, protocol, tunnel_id_to_key32(tun_info->key.tun_id), (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0); ^^^^^^^^^^^^^^^^^ Since we are not using the TX lock (&txq->_xmit_lock), multiple CPUs may try to do this tunnel->o_seqno++ in parallel, which is racy. Fix it by making o_seqno atomic_t. As mentioned by Eric Dumazet in commit b790e01aee74 ("ip_gre: lockless xmit"), making o_seqno atomic_t increases "chance for packets being out of order at receiver" when NETIF_F_LLTX is on. Maybe a better fix would be: 1. Do not ignore "oseq" in external mode. Users MUST specify "oseq" if they want the kernel to allow sequencing of outgoing packets; 2. Reject all outgoing TUNNEL_SEQ packets if the device was not created with "oseq". Unfortunately, that would break userspace. We could now make [IP6]GRE[TAP] devices always NETIF_F_LLTX, but let us do it in separate patches to keep this fix minimal. Suggested-by: Jakub Kicinski Fixes: 77a5196a804e ("gre: add sequence number for collect md mode.") Signed-off-by: Peilin Ye Acked-by: William Tu Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/net/ip6_tunnel.h | 2 +- include/net/ip_tunnels.h | 2 +- net/ipv4/ip_gre.c | 6 +++--- net/ipv6/ip6_gre.c | 7 ++++--- 4 files changed, 9 insertions(+), 8 deletions(-) diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index 028eaea1c854..42d50856fcf2 100644 --- a/include/net/ip6_tunnel.h +++ b/include/net/ip6_tunnel.h @@ -57,7 +57,7 @@ struct ip6_tnl { /* These fields used only by GRE */ __u32 i_seqno; /* The last seen seqno */ - __u32 o_seqno; /* The last output seqno */ + atomic_t o_seqno; /* The last output seqno */ int hlen; /* tun_hlen + encap_hlen */ int tun_hlen; /* Precalculated header length */ int encap_hlen; /* Encap header length (FOU,GUE) */ diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 61620677b034..c3e55a9ae585 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -113,7 +113,7 @@ struct ip_tunnel { /* These four fields used only by GRE */ u32 i_seqno; /* The last seen seqno */ - u32 o_seqno; /* The last output seqno */ + atomic_t o_seqno; /* The last output seqno */ int tun_hlen; /* Precalculated header length */ /* These four fields used only by ERSPAN */ diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 801c540db33e..2a80038575d2 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -459,7 +459,7 @@ static void __gre_xmit(struct sk_buff *skb, struct net_device *dev, /* Push GRE header. */ gre_build_header(skb, tunnel->tun_hlen, flags, proto, tunnel->parms.o_key, - (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0); + (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0); ip_tunnel_xmit(skb, dev, tnl_params, tnl_params->protocol); } @@ -497,7 +497,7 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev, (TUNNEL_CSUM | TUNNEL_KEY | TUNNEL_SEQ); gre_build_header(skb, tunnel_hlen, flags, proto, tunnel_id_to_key32(tun_info->key.tun_id), - (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0); + (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0); ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen); @@ -574,7 +574,7 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev) } gre_build_header(skb, 8, TUNNEL_SEQ, - proto, 0, htonl(tunnel->o_seqno++)); + proto, 0, htonl(atomic_fetch_inc(&tunnel->o_seqno))); ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 09262adf004e..3f88ba6555ab 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -766,7 +766,7 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb, gre_build_header(skb, tun_hlen, flags, protocol, tunnel_id_to_key32(tun_info->key.tun_id), - (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) + (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0); } else { @@ -777,7 +777,8 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb, gre_build_header(skb, tunnel->tun_hlen, flags, protocol, tunnel->parms.o_key, - (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0); + (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) + : 0); } return ip6_tnl_xmit(skb, dev, dsfield, fl6, encap_limit, pmtu, @@ -1055,7 +1056,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb, /* Push GRE header. */ proto = (t->parms.erspan_ver == 1) ? htons(ETH_P_ERSPAN) : htons(ETH_P_ERSPAN2); - gre_build_header(skb, 8, TUNNEL_SEQ, proto, 0, htonl(t->o_seqno++)); + gre_build_header(skb, 8, TUNNEL_SEQ, proto, 0, htonl(atomic_fetch_inc(&t->o_seqno))); /* TooBig packet may have updated dst->dev's mtu */ if (!t->parms.collect_md && dst && dst_mtu(dst) > dst->dev->mtu) -- 2.35.1