Date: Wed, 11 Jan 2023 16:36:27 -0800
From: "Paul E. McKenney"
To: riel@surriel.com, davej@codemonkey.org.uk
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH diagnostic qspinlock] Diagnostics for excessive lock-drop wait loop time
Message-ID: <20230112003627.GA3133092@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

We see systems stuck in the queued_spin_lock_slowpath() loop that waits
for the lock to become unlocked in the case where the current CPU has
set pending state.  Therefore, this not-for-mainline commit gives a
warning that includes the lock word state if the loop has been spinning
for more than 10 seconds.  It also adds a WARN_ON_ONCE() that complains
if the lock is not in pending state.

If this is to be placed in production, some reporting mechanism not
involving spinlocks is likely needed, for example, BPF, trace events,
or some combination thereof.

Signed-off-by: Paul E. McKenney

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ac5a3e6d3b564..be1440782c4b3 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -379,8 +379,22 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * clear_pending_set_locked() implementations imply full
 	 * barriers.
 	 */
-	if (val & _Q_LOCKED_MASK)
-		atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
+	if (val & _Q_LOCKED_MASK) {
+		int cnt = _Q_PENDING_LOOPS;
+		unsigned long j = jiffies + 10 * HZ;
+		struct qspinlock qval;
+		int val;
+
+		for (;;) {
+			val = atomic_read_acquire(&lock->val);
+			atomic_set(&qval.val, val);
+			WARN_ON_ONCE(!(val & _Q_PENDING_VAL));
+			if (!(val & _Q_LOCKED_MASK))
+				break;
+			if (!--cnt && !WARN(time_after(jiffies, j), "%s: Still pending and locked: %#x (%c%c%#x)\n", __func__, val, ".L"[!!qval.locked], ".P"[!!qval.pending], qval.tail))
+				cnt = _Q_PENDING_LOOPS;
+		}
+	}
 
 	/*
 	 * take ownership and clear the pending bit.