Received: by 2002:a05:6512:3d0e:0:0:0:0 with SMTP id d14csp58766lfv; Tue, 12 Apr 2022 17:11:38 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzNx8R7PD9hVsUA1K5m4qQZk7Pf9s8Az4YrujlXdddlItD2FeDKu7USaYojCvY/TcbacZHf X-Received: by 2002:a17:903:22cc:b0:158:52d6:1b0b with SMTP id y12-20020a17090322cc00b0015852d61b0bmr14833115plg.47.1649808698213; Tue, 12 Apr 2022 17:11:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649808698; cv=none; d=google.com; s=arc-20160816; b=I681ba8w0iaohUfi6BvVoDHyjWTJRnsoSd7i9X2hGBmEwFOQtw1PxaXNkxYHqf2+Sg rfOoIVNUDdy4pdDMpTIS+vOrW0jFgBe+wu1c5xlFHiJagqrBGPq5Bx3TeDtSxMp3Y4vB K0jL5DFHzzV8LLgGY481oj32UCOX83MP6Ap0Q3lupNIqRKtX3EAEHrv11gk2kOlHAohZ s2kd46iuluSjzMw+sllwYbwpq8D3b60iEGYTJ+xwSLQgz6e5r0A/WhFruFdqPXBZtVCC Rq4Z75W5ji5tOVkM0EYQEOt6wqADycb2PK/U3Jc0twZ4KyTmwf8zTw74WosZBg0ceDdW SyUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=F/QnSD9AVvm0lt2W5TQRacqPOQUycJu4LaxbuM6oMxc=; b=pdAoVwKTCOx9mH/8YMpfC1wKtQZhbmW07aNfz9xTS57px+BBcXisVFxhpTh/yGnsyb Xa3U5vG1U3wzPM2NgKwPyBb+9i6fF0PI+99xdh+OSB+NDcMOgmcVj2Xa0XIxta1v3wqe TnAv+nFkJ1qH8toRFvvFZdSo/fH3kjdAUCjMXoDweCkKAmMbrZPUMJC9IFdQ9N6YA6+T 2eI21Z1HWzgSCvoAx5wgWdqG9JBMY1ojagRXnianKTiCrTffyiFG1qJNJLM2Zy0hLAcj R2Tzz84TToLHBJWiLq/nzIzFMInKj2hRo+syWL5eD/eeFtMIOvZkq9eUerkGdHrWiaR9 Y4tw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=vvBeFgIo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id u34-20020a632362000000b003816043ef57si4105815pgm.332.2022.04.12.17.11.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Apr 2022 17:11:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=vvBeFgIo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 65459DCE29; Tue, 12 Apr 2022 15:24:32 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350948AbiDLGnk (ORCPT + 99 others); Tue, 12 Apr 2022 02:43:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350258AbiDLGkm (ORCPT ); Tue, 12 Apr 2022 02:40:42 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC80D1116D; Mon, 11 Apr 2022 23:36:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5ADA8618FF; Tue, 12 Apr 2022 06:36:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 65BEBC385A6; Tue, 12 Apr 2022 06:36:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1649745365; bh=wdYPro3k5WfmoGlF+aTuYueHR2pQQgno184Tf5huwgA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vvBeFgIoty6LanrvDQDNIB4NpT0elDNpxMATvLCa35EJsC4HJhc0p/lTqZP36LjNi oJPbz0Zo1KA5UwkVhwXU90vNZJhaC5Y6pI5zj0q0gitN6l+FntMReJPLb2juJTlyLQ K6d6rMJJOZ3zG5PbO3BC1C8TBIgWd3chrcWf/8Oo= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Mike Galbraith , Sebastian Andrzej Siewior , Jakub Kicinski , Sasha Levin Subject: [PATCH 5.10 032/171] tcp: Dont acquire inet_listen_hashbucket::lock with disabled BH. Date: Tue, 12 Apr 2022 08:28:43 +0200 Message-Id: <20220412062928.816380293@linuxfoundation.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220412062927.870347203@linuxfoundation.org> References: <20220412062927.870347203@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Sebastian Andrzej Siewior [ Upstream commit 4f9bf2a2f5aacf988e6d5e56b961ba45c5a25248 ] Commit 9652dc2eb9e40 ("tcp: relax listening_hash operations") removed the need to disable bottom half while acquiring listening_hash.lock. There are still two callers left which disable bottom half before the lock is acquired. On PREEMPT_RT the softirqs are preemptible and local_bh_disable() acts as a lock to ensure that resources, that are protected by disabling bottom halves, remain protected. This leads to a circular locking dependency if the lock acquired with disabled bottom halves is also acquired with enabled bottom halves followed by disabling bottom halves. This is the reverse locking order. It has been observed with inet_listen_hashbucket::lock: local_bh_disable() + spin_lock(&ilb->lock): inet_listen() inet_csk_listen_start() sk->sk_prot->hash() := inet_hash() local_bh_disable() __inet_hash() spin_lock(&ilb->lock); acquire(&ilb->lock); Reverse order: spin_lock(&ilb2->lock) + local_bh_disable(): tcp_seq_next() listening_get_next() spin_lock(&ilb2->lock); acquire(&ilb2->lock); tcp4_seq_show() get_tcp4_sock() sock_i_ino() read_lock_bh(&sk->sk_callback_lock); acquire(softirq_ctrl) // <---- whoops acquire(&sk->sk_callback_lock) Drop local_bh_disable() around __inet_hash() which acquires listening_hash->lock. Split inet_unhash() and acquire the listen_hashbucket lock without disabling bottom halves; the inet_ehash lock with disabled bottom halves. Reported-by: Mike Galbraith Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/12d6f9879a97cd56c09fb53dee343cbb14f7f1f7.camel@gmx.de Link: https://lkml.kernel.org/r/X9CheYjuXWc75Spa@hirez.programming.kicks-ass.net Link: https://lore.kernel.org/r/YgQOebeZ10eNx1W6@linutronix.de Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/ipv4/inet_hashtables.c | 53 ++++++++++++++++++++++--------------- net/ipv6/inet6_hashtables.c | 5 +--- 2 files changed, 33 insertions(+), 25 deletions(-) diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index e093847c334d..915b8e1bd9ef 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -637,7 +637,9 @@ int __inet_hash(struct sock *sk, struct sock *osk) int err = 0; if (sk->sk_state != TCP_LISTEN) { + local_bh_disable(); inet_ehash_nolisten(sk, osk, NULL); + local_bh_enable(); return 0; } WARN_ON(!sk_unhashed(sk)); @@ -669,45 +671,54 @@ int inet_hash(struct sock *sk) { int err = 0; - if (sk->sk_state != TCP_CLOSE) { - local_bh_disable(); + if (sk->sk_state != TCP_CLOSE) err = __inet_hash(sk, NULL); - local_bh_enable(); - } return err; } EXPORT_SYMBOL_GPL(inet_hash); -void inet_unhash(struct sock *sk) +static void __inet_unhash(struct sock *sk, struct inet_listen_hashbucket *ilb) { - struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; - struct inet_listen_hashbucket *ilb = NULL; - spinlock_t *lock; - if (sk_unhashed(sk)) return; - if (sk->sk_state == TCP_LISTEN) { - ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)]; - lock = &ilb->lock; - } else { - lock = inet_ehash_lockp(hashinfo, sk->sk_hash); - } - spin_lock_bh(lock); - if (sk_unhashed(sk)) - goto unlock; - if (rcu_access_pointer(sk->sk_reuseport_cb)) reuseport_detach_sock(sk); if (ilb) { + struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; + inet_unhash2(hashinfo, sk); ilb->count--; } __sk_nulls_del_node_init_rcu(sk); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); -unlock: - spin_unlock_bh(lock); +} + +void inet_unhash(struct sock *sk) +{ + struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; + + if (sk_unhashed(sk)) + return; + + if (sk->sk_state == TCP_LISTEN) { + struct inet_listen_hashbucket *ilb; + + ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)]; + /* Don't disable bottom halves while acquiring the lock to + * avoid circular locking dependency on PREEMPT_RT. + */ + spin_lock(&ilb->lock); + __inet_unhash(sk, ilb); + spin_unlock(&ilb->lock); + } else { + spinlock_t *lock = inet_ehash_lockp(hashinfo, sk->sk_hash); + + spin_lock_bh(lock); + __inet_unhash(sk, NULL); + spin_unlock_bh(lock); + } } EXPORT_SYMBOL_GPL(inet_unhash); diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c index 67c9114835c8..0a2e7f228391 100644 --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -333,11 +333,8 @@ int inet6_hash(struct sock *sk) { int err = 0; - if (sk->sk_state != TCP_CLOSE) { - local_bh_disable(); + if (sk->sk_state != TCP_CLOSE) err = __inet_hash(sk, NULL); - local_bh_enable(); - } return err; } -- 2.35.1