From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Eric Dumazet, "David S. Miller", Sasha Levin
Subject: [PATCH 5.19 0367/1157] ping: convert to RCU lookups, get rid of rwlock
Date: Mon, 15 Aug 2022 19:55:23 +0200
Message-Id: <20220815180454.402115916@linuxfoundation.org>
In-Reply-To: <20220815180439.416659447@linuxfoundation.org>
References: <20220815180439.416659447@linuxfoundation.org>

From: Eric Dumazet

[ Upstream commit dbca1596bbb08318f5e3b3b99f8ca0a0d3830a65 ]

Using rwlock in networking code is extremely risky.
Writers can starve if enough readers are constantly
grabbing the rwlock.

I thought rwlocks were at fault and sent this patch:

https://lkml.org/lkml/2022/6/17/272

But Peter and Linus essentially told me rwlock had to be unfair.

We need to get rid of rwlock in networking code.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
---
 net/ipv4/ping.c | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 3c6101def7d6..b83c2bd9d722 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -50,7 +50,7 @@
 
 struct ping_table {
 	struct hlist_nulls_head	hash[PING_HTABLE_SIZE];
-	rwlock_t		lock;
+	spinlock_t		lock;
 };
 
 static struct ping_table ping_table;
@@ -82,7 +82,7 @@ int ping_get_port(struct sock *sk, unsigned short ident)
 	struct sock *sk2 = NULL;
 
 	isk = inet_sk(sk);
-	write_lock_bh(&ping_table.lock);
+	spin_lock(&ping_table.lock);
 	if (ident == 0) {
 		u32 i;
 		u16 result = ping_port_rover + 1;
@@ -128,14 +128,15 @@ int ping_get_port(struct sock *sk, unsigned short ident)
 	if (sk_unhashed(sk)) {
 		pr_debug("was not hashed\n");
 		sock_hold(sk);
-		hlist_nulls_add_head(&sk->sk_nulls_node, hlist);
+		sock_set_flag(sk, SOCK_RCU_FREE);
+		hlist_nulls_add_head_rcu(&sk->sk_nulls_node, hlist);
 		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
 	}
-	write_unlock_bh(&ping_table.lock);
+	spin_unlock(&ping_table.lock);
 	return 0;
 
 fail:
-	write_unlock_bh(&ping_table.lock);
+	spin_unlock(&ping_table.lock);
 	return 1;
 }
 EXPORT_SYMBOL_GPL(ping_get_port);
@@ -153,19 +154,19 @@ void ping_unhash(struct sock *sk)
 	struct inet_sock *isk = inet_sk(sk);
 
 	pr_debug("ping_unhash(isk=%p,isk->num=%u)\n", isk, isk->inet_num);
-	write_lock_bh(&ping_table.lock);
+	spin_lock(&ping_table.lock);
 	if (sk_hashed(sk)) {
-		hlist_nulls_del(&sk->sk_nulls_node);
-		sk_nulls_node_init(&sk->sk_nulls_node);
+		hlist_nulls_del_init_rcu(&sk->sk_nulls_node);
 		sock_put(sk);
 		isk->inet_num = 0;
 		isk->inet_sport = 0;
 		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
 	}
-	write_unlock_bh(&ping_table.lock);
+	spin_unlock(&ping_table.lock);
 }
 EXPORT_SYMBOL_GPL(ping_unhash);
 
+/* Called under rcu_read_lock() */
 static struct sock *ping_lookup(struct net *net, struct sk_buff *skb, u16 ident)
 {
 	struct hlist_nulls_head *hslot = ping_hashslot(&ping_table, net, ident);
@@ -190,8 +191,6 @@ static struct sock *ping_lookup(struct net *net, struct sk_buff *skb, u16 ident)
 		return NULL;
 	}
 
-	read_lock_bh(&ping_table.lock);
-
 	ping_portaddr_for_each_entry(sk, hnode, hslot) {
 		isk = inet_sk(sk);
 
@@ -230,13 +229,11 @@ static struct sock *ping_lookup(struct net *net, struct sk_buff *skb, u16 ident)
 		    sk->sk_bound_dev_if != sdif)
 			continue;
 
-		sock_hold(sk);
 		goto exit;
 	}
 
 	sk = NULL;
 exit:
-	read_unlock_bh(&ping_table.lock);
 
 	return sk;
 }
@@ -592,7 +589,7 @@ void ping_err(struct sk_buff *skb, int offset, u32 info)
 	sk->sk_err = err;
 	sk_error_report(sk);
 out:
-	sock_put(sk);
+	return;
 }
 EXPORT_SYMBOL_GPL(ping_err);
 
@@ -998,7 +995,6 @@ enum skb_drop_reason ping_rcv(struct sk_buff *skb)
 			reason = __ping_queue_rcv_skb(sk, skb2);
 		else
 			reason = SKB_DROP_REASON_NOMEM;
-		sock_put(sk);
 	}
 
 	if (reason)
@@ -1084,13 +1080,13 @@ static struct sock *ping_get_idx(struct seq_file *seq, loff_t pos)
 }
 
 void *ping_seq_start(struct seq_file *seq, loff_t *pos, sa_family_t family)
-	__acquires(ping_table.lock)
+	__acquires(RCU)
 {
 	struct ping_iter_state *state = seq->private;
 	state->bucket = 0;
 	state->family = family;
 
-	read_lock_bh(&ping_table.lock);
+	rcu_read_lock();
 
 	return *pos ? ping_get_idx(seq, *pos-1) : SEQ_START_TOKEN;
 }
@@ -1116,9 +1112,9 @@ void *ping_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 EXPORT_SYMBOL_GPL(ping_seq_next);
 
 void ping_seq_stop(struct seq_file *seq, void *v)
-	__releases(ping_table.lock)
+	__releases(RCU)
 {
-	read_unlock_bh(&ping_table.lock);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(ping_seq_stop);
 
@@ -1202,5 +1198,5 @@ void __init ping_init(void)
 
 	for (i = 0; i < PING_HTABLE_SIZE; i++)
 		INIT_HLIST_NULLS_HEAD(&ping_table.hash[i], i);
-	rwlock_init(&ping_table.lock);
+	spin_lock_init(&ping_table.lock);
 }
-- 
2.35.1
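
For readers unfamiliar with the locking split the patch moves to (writers serialized by a plain lock, lookups taking no lock at all), here is a rough userspace sketch in C11. It is not the kernel code and not real RCU: `struct entry`, `struct table`, `table_init()`, `table_add()` and `table_contains()` are hypothetical names, a pthread mutex stands in for the spinlock, and entries are never removed or freed, so the grace-period reclamation that RCU (and SOCK_RCU_FREE in the patch) provides is deliberately left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one ping hash slot: a singly linked list. */
struct entry {
	unsigned short ident;
	struct entry *_Atomic next;
};

struct table {
	struct entry *_Atomic head;
	pthread_mutex_t lock;	/* serializes writers only; readers never take it */
};

static void table_init(struct table *t)
{
	assert(t != NULL);
	atomic_init(&t->head, NULL);
	pthread_mutex_init(&t->lock, NULL);
}

/*
 * Writer side: fully initialize the new entry under the lock, then publish
 * it with a release store so that a reader which observes the pointer also
 * observes the initialized fields.
 */
static int table_add(struct table *t, unsigned short ident)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->ident = ident;

	pthread_mutex_lock(&t->lock);
	atomic_init(&e->next,
		    atomic_load_explicit(&t->head, memory_order_relaxed));
	atomic_store_explicit(&t->head, e, memory_order_release);
	pthread_mutex_unlock(&t->lock);
	return 0;
}

/*
 * Reader side: no lock is taken. Acquire loads pair with the writer's
 * release store. This is only safe here because the sketch never frees
 * entries; a real implementation needs RCU-style deferred reclamation.
 */
static int table_contains(struct table *t, unsigned short ident)
{
	struct entry *e = atomic_load_explicit(&t->head, memory_order_acquire);

	while (e) {
		if (e->ident == ident)
			return 1;
		e = atomic_load_explicit(&e->next, memory_order_acquire);
	}
	return 0;
}
```

Because readers never touch the lock in this arrangement, a steady stream of lookups cannot starve a writer, which is the failure mode the changelog describes for rwlock.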