Date: Thu, 7 Jul 2022 12:04:12 -0700
From: Boqun Feng
To: Waiman Long
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, linux-kernel@vger.kernel.org,
    Thomas Gleixner, Sebastian Andrzej Siewior, Juri Lelli, Mike Stowell
Subject: Re: [PATCH v3] locking/rtmutex: Limit # of lock stealing for non-RT waiters
References: <20220706135916.980580-1-longman@redhat.com> <3e43bc07-053f-80d0-7ea1-93a2897ef03e@redhat.com>
In-Reply-To: <3e43bc07-053f-80d0-7ea1-93a2897ef03e@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 07, 2022 at 02:45:10PM -0400, Waiman Long wrote:
> On 7/7/22 14:22, Boqun Feng wrote:
> > On Wed, Jul 06, 2022 at 10:03:10AM -0400, Waiman Long wrote:
> > > On 7/6/22 09:59, Waiman Long wrote:
> > > > Commit 48eb3f4fcfd3 ("locking/rtmutex: Implement equal priority lock
> > > > stealing") allows unlimited number of lock stealing's for non-RT
> > > > tasks. That can lead to lock starvation of non-RT top waiter tasks if
> > > > there is a constant incoming stream of non-RT lockers. This can cause
> > > > rcu_preempt self-detected stall or even task lockup in PREEMPT_RT kernel.
> > > > For example,
> > > >
> > > > [77107.424943] rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > [ 1249.921363] INFO: task systemd:2178 blocked for more than 622 seconds.
> > > >
> > > > Avoiding this problem and ensuring forward progress by limiting the
> > > > number of times that a lock can be stolen from each waiter. This patch
> > > > sets a threshold of 32. That number is arbitrary and can be changed
> > > > if needed.
> > > >
> > > > Fixes: 48eb3f4fcfd3 ("locking/rtmutex: Implement equal priority lock stealing")
> > > > Signed-off-by: Waiman Long
> > > > ---
> > > >   kernel/locking/rtmutex.c        | 9 ++++++---
> > > >   kernel/locking/rtmutex_common.h | 8 ++++++++
> > > >   2 files changed, 14 insertions(+), 3 deletions(-)
> > > >
> > > > [v3: Increase threshold to 32 and add rcu_preempt self-detected stall]
> > > Note that I decided to increase the threshold to 32 from 10 to reduce the
> > > potential performance impact of this change, if any. We also found out that
> > > this patch can fix some of the rcu_preempt self-detected stall problems that
> > > we saw with the PREEMPT_RT kernel. So I added that information in the patch
> > > description.
> > >
> > Have you considered (and tested) whether we can set the threshold
> > directly proportional to nr_cpu_ids? Because IIUC, the favorable case
> > for lock stealing is that every CPU gets a chance to steal once. If one
> > CPU can steal twice, 1) either there is a context switch between two
> > tasks, which costs similarly as waking up the waiter, or 2) a task drops
> > and re-graps a lock, which means the task wants to yield to other
> > waiters of the lock.
>
> There is no inherent restriction on not allowing the same cpu stealing the
> lock twice or more. With rtmutex, the top waiter may be sleeping and the

Well, I'm not saying we need to restrict the same CPU from stealing a
lock twice or more.
Think about this: when a task running on CPU 1 has already stolen the
lock once, for example:

	{task C is the top waiter}

	CPU 1
	=====
	lock(); // steal the lock
	...
	unlock():
	  // set owner to NULL
	  // similar cost to wake up A
	lock(); // steal the lock

This means that if a CPU steals a lock twice or more, it's almost
certain that a context switch happened between the two steals ("almost"
because there could be a case where task A does lock()+unlock() twice,
but as I said, that means task A is willing to yield).

Therefore, if there are @nr_cpu_ids lock steals, it means either there
was a context switch somewhere or a task has been willing to yield. And
I think that's a reasonable signal to stop lock stealing.

Thoughts?

Regards,
Boqun

> wakeup latency can be considerable. By allowing another ready lock waiter to
> steal the lock for productive use, it can improve system throughput. There
> is no fairness in lock stealing and I don't believe it is a worthwhile goal
> to allow each cpu to steal the lock once. It will just complicate the code.
>
> On the other hand, unlimited lock stealing is bad and we have a put a limit
> somehow to ensure forward progress.
>
> Cheers,
> Longman
>
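
For illustration only, the two thresholds discussed in this thread can be
compared with a small user-space C sketch. This is not the code from the
patch and not the kernel's rtmutex implementation. Every name in it
(top_waiter, nr_steals, may_steal(), steal_limit_for_cpus(),
count_steals(), STEAL_LIMIT_FIXED) is hypothetical, and the 1:1 scaling
with the CPU count is an assumption, since the thread only says
"proportional to nr_cpu_ids".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fixed cap from the patch description (32, described there as arbitrary). */
#define STEAL_LIMIT_FIXED 32u

/* Hypothetical stand-in for the top waiter; NOT struct rt_mutex_waiter. */
struct top_waiter {
	unsigned int nr_steals;	/* times the lock was stolen from this waiter */
};

/*
 * Decide whether a non-RT locker may steal the lock from the top waiter.
 * Each granted steal is counted; once the count reaches 'limit', further
 * steals are refused so the top waiter can make forward progress.
 */
static bool may_steal(struct top_waiter *w, unsigned int limit)
{
	assert(w != NULL);

	if (w->nr_steals >= limit)
		return false;

	w->nr_steals++;
	return true;
}

/*
 * Alternative limit scaled with the CPU count. The factor of 1 is an
 * assumption made for this sketch only.
 */
static unsigned int steal_limit_for_cpus(unsigned int nr_cpus)
{
	return nr_cpus ? nr_cpus : 1;
}

/* Simulate 'attempts' back-to-back steal attempts against one waiter. */
static unsigned int count_steals(unsigned int attempts, unsigned int limit)
{
	struct top_waiter w = { .nr_steals = 0 };
	unsigned int granted = 0;

	for (unsigned int i = 0; i < attempts; i++) {
		if (may_steal(&w, limit))
			granted++;
	}
	return granted;
}
```

Under these assumptions, count_steals(100, STEAL_LIMIT_FIXED) grants 32
steals before the waiter is protected, while
count_steals(100, steal_limit_for_cpus(8)) grants 8. The sketch only
models the counting and says nothing about the throughput trade-off
raised in the thread.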