Date: Sun, 4 Apr 2021 12:02:31 -0400
From: Steven Rostedt
To: Waiman Long
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Bharata B Rao, Phil Auld, Daniel Thompson, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only
Message-ID: <20210404120231.13843854@oasis.local.home>
References: <20210401181030.7689-1-longman@redhat.com> <20210402164014.53c84f05@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2 Apr 2021 23:09:09 -0400
Waiman Long wrote:

> The main problem with sched_debug_lock is that under certain
> circumstances, a lock waiter may wait a long time to acquire the lock
> (in seconds). We can't insert touch_nmi_watchdog() while the cpu is
> waiting for the spinlock.

The problem I have with the patch is that it seems to be a hack (as it
doesn't fix the issue in all cases). Since sched_debug_lock is "special",
perhaps we can add wrappers to take it, and instead of doing the
spin_lock_irqsave(), do a trylock loop. Add lockdep annotation to tell
lockdep that this is not a try lock (so that it can still detect
deadlocks).

Then have the strategically placed touch_nmi_watchdog() also increment a
counter. Then in that trylock loop, if it sees the counter get
incremented, it knows that forward progress is being made by the lock
holder, and it too can call touch_nmi_watchdog().

-- Steve
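
For illustration only, the trylock loop with a progress counter might look
roughly like the userspace sketch below. This is not kernel code and not a
proposed patch: it uses C11 atomics in place of the real spinlock, every name
in it (debug_trylock, holder_touch_watchdog, waiter_poll_once, and so on) is
hypothetical, touch_watchdog_stub() merely stands in for touch_nmi_watchdog(),
and the lockdep annotation and IRQ handling are left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; a userspace analogy, not kernel code. */
static atomic_flag debug_lock = ATOMIC_FLAG_INIT; /* stands in for sched_debug_lock */
static atomic_ulong progress_count;   /* bumped by the lock holder as it makes progress */
static atomic_ulong watchdog_touches; /* counts stub calls so they can be observed */

/* Stand-in for touch_nmi_watchdog(); here it only counts invocations. */
static void touch_watchdog_stub(void)
{
	atomic_fetch_add(&watchdog_touches, 1);
}

/* Lock holder: touch the watchdog and advertise forward progress. */
static void holder_touch_watchdog(void)
{
	atomic_fetch_add(&progress_count, 1);
	touch_watchdog_stub();
}

/* Returns true if the lock was free and is now held by the caller. */
static bool debug_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&debug_lock,
						  memory_order_acquire);
}

static void debug_unlock(void)
{
	atomic_flag_clear_explicit(&debug_lock, memory_order_release);
}

/*
 * One iteration of the waiter's loop. If the trylock fails but the
 * holder's counter has moved since the last look, the holder is making
 * forward progress, so the waiter touches the watchdog as well.
 */
static bool waiter_poll_once(unsigned long *seen)
{
	unsigned long now;

	if (debug_trylock())
		return true;

	now = atomic_load(&progress_count);
	if (now != *seen) {
		*seen = now;
		touch_watchdog_stub();
	}
	return false;
}

/* Wrapper that would replace the plain blocking lock call. */
static void debug_lock_acquire(void)
{
	unsigned long seen = atomic_load(&progress_count);

	while (!waiter_poll_once(&seen))
		; /* a kernel version would relax the CPU here */
}
```

The point of the counter is that a waiter only touches the watchdog when it
has evidence the holder is still moving. If the holder is truly stuck, the
counter stops changing, the waiter stops touching, and the watchdog can still
fire.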