From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Jozsef Kadlecsik, Pablo Neira Ayuso, Sasha Levin, fw@strlen.de,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, justinstitt@google.com, kuniyu@amazon.com,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
	netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 6.5 05/15] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
Date: Wed, 22 Nov 2023 10:33:07 -0500
Message-ID: <20231122153340.852434-5-sashal@kernel.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231122153340.852434-1-sashal@kernel.org>
References: <20231122153340.852434-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.5.12
Content-Transfer-Encoding: 8bit

From: Jozsef Kadlecsik

[ Upstream commit 28628fa952fefc7f2072ce6e8016968cc452b1ba ]

Linkui Xiao reported that there's a race condition when ipset swap and
destroy are called, which can lead to a crash in add/del/test element
operations. Swap followed by destroy is the usual way to replace a set
with another one in a production system.

The issue can in some cases be reproduced with the script:

ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576
ipset add hash_ip1 172.20.0.0/16
ipset add hash_ip1 192.168.0.0/16
iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
while [ 1 ]
do
	# ... Ongoing traffic...
	ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
	ipset add hash_ip2 172.20.0.0/16
	ipset swap hash_ip1 hash_ip2
	ipset destroy hash_ip2
	sleep 0.05
done

In the race case, a possible order of the operations is

	CPU0			CPU1
	ip_set_test
				ipset swap hash_ip1 hash_ip2
				ipset destroy hash_ip2
	hash_net_kadt

Swap replaces hash_ip1 with hash_ip2 and then destroy removes
hash_ip2, which is the original hash_ip1. ip_set_test was called on
hash_ip1 and, because destroy removed it, hash_net_kadt crashes.

The fix is to force ip_set_swap() to wait for all readers to finish
accessing the old set pointers by calling synchronize_rcu().

The first version of the patch was written by Linkui Xiao.

v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
    ip_set_destroy() unnecessarily when all sets are destroyed.
v3: Florian Westphal pointed out that all netfilter hooks run with
    rcu_read_lock() held and em_ipset.c wraps the entire ip_set_test()
    in an rcu read lock/unlock pair. So there's no need to extend the
    rcu read locked area in ipset itself.

Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/
Reported by: Linkui Xiao
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Sasha Levin
---
 net/netfilter/ipset/ip_set_core.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 58608460cf6df..809b9b1a88afc 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -61,6 +61,8 @@ MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_IPSET);
 	ip_set_dereference((inst)->ip_set_list)[id]
 #define ip_set_ref_netlink(inst,id)	\
 	rcu_dereference_raw((inst)->ip_set_list)[id]
+#define ip_set_dereference_nfnl(p)	\
+	rcu_dereference_check(p, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
 
 /* The set types are implemented in modules and registered set types
  * can be found in ip_set_type_list. Adding/deleting types is
@@ -708,15 +710,10 @@ __ip_set_put_netlink(struct ip_set *set)
 static struct ip_set *
 ip_set_rcu_get(struct net *net, ip_set_id_t index)
 {
-	struct ip_set *set;
 	struct ip_set_net *inst = ip_set_pernet(net);
 
-	rcu_read_lock();
-	/* ip_set_list itself needs to be protected */
-	set = rcu_dereference(inst->ip_set_list)[index];
-	rcu_read_unlock();
-
-	return set;
+	/* ip_set_list and the set pointer need to be protected */
+	return ip_set_dereference_nfnl(inst->ip_set_list)[index];
 }
 
 static inline void
@@ -1397,6 +1394,9 @@ static int ip_set_swap(struct sk_buff *skb, const struct nfnl_info *info,
 	ip_set(inst, to_id) = from;
 	write_unlock_bh(&ip_set_ref_lock);
 
+	/* Make sure all readers of the old set pointers are completed. */
+	synchronize_rcu();
+
 	return 0;
 }
 
-- 
2.42.0
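
For illustration only, the ordering the fix relies on (swap the set
pointers, then wait for in-flight readers before the old set can be
destroyed) can be sketched in userspace C. This is not the kernel code:
every toy_* name is hypothetical, and the reader counter is a crude
stand-in for RCU, assumed here only to make the ordering visible.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ip_set: just an element count. */
struct toy_set {
	int elements;
};

#define TOY_MAX_SETS 4

/* Stand-in for inst->ip_set_list: slots indexed by set id. */
static _Atomic(struct toy_set *) toy_list[TOY_MAX_SETS];

/* Number of readers currently inside a read-side section. */
static atomic_int toy_readers;

static void toy_read_lock(void)   { atomic_fetch_add(&toy_readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&toy_readers, 1); }

/* Toy replacement for synchronize_rcu(): spin until no reader is inside
 * a read-side section. Unlike real RCU, a steady stream of readers can
 * starve this loop; it only models the "wait for old readers" step.
 */
static void toy_synchronize(void)
{
	while (atomic_load(&toy_readers) != 0)
		sched_yield();
}

static struct toy_set *toy_create(int id, int elements)
{
	struct toy_set *set;

	assert(id >= 0 && id < TOY_MAX_SETS);
	set = calloc(1, sizeof(*set));
	if (!set)
		return NULL;
	set->elements = elements;
	atomic_store(&toy_list[id], set);
	return set;
}

/* Reader side, playing the role of ip_set_test(): the pointer is loaded
 * and used entirely inside the read-side section. Returns -1 if the
 * slot is empty.
 */
static int toy_test(int id)
{
	struct toy_set *set;
	int n;

	assert(id >= 0 && id < TOY_MAX_SETS);
	toy_read_lock();
	set = atomic_load(&toy_list[id]);
	n = set ? set->elements : -1;
	toy_read_unlock();
	return n;
}

/* Writer side: exchange the two slots, then wait so that no reader can
 * still hold a pointer fetched from a slot before the swap.
 */
static void toy_swap(int from_id, int to_id)
{
	struct toy_set *from = atomic_load(&toy_list[from_id]);
	struct toy_set *to = atomic_load(&toy_list[to_id]);

	atomic_store(&toy_list[from_id], to);
	atomic_store(&toy_list[to_id], from);
	toy_synchronize();
}

/* Frees the set without waiting. This sketch assumes no reader looks up
 * the slot being destroyed; the wait has already happened in toy_swap().
 */
static void toy_destroy(int id)
{
	struct toy_set *set;

	assert(id >= 0 && id < TOY_MAX_SETS);
	set = atomic_exchange(&toy_list[id], (struct toy_set *)NULL);
	free(set);
}
```

Calling toy_create() for slots 0 and 1, then toy_swap(0, 1) followed by
toy_destroy(1), mirrors the swap-then-destroy sequence in the
reproducer script. Placing the wait in toy_swap() rather than in
toy_destroy() follows the v2 note: destroying many sets does not pay
for a grace period each time.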