Date: Sat, 27 Jun 2020 16:44:31 -0700
From: "Paul E. McKenney"
To: Andy Lutomirski
Cc: Andy Lutomirski, Frederic Weisbecker, Thomas Gleixner, Ingo Molnar, LKML, kernel-team
Subject: Re: [PATCH tick-sched] Clarify "NOHZ: local_softirq_pending" warning
Message-ID: <20200627234431.GJ9247@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <20200627214629.GH9247@paulmck-ThinkPad-P72> <83B12EF8-3792-4943-A548-5DB0C6FC71D1@amacapital.net>
In-Reply-To: <83B12EF8-3792-4943-A548-5DB0C6FC71D1@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jun 27, 2020 at 03:14:14PM -0700, Andy Lutomirski wrote:
>
> > On Jun 27, 2020, at 2:46 PM, Paul E. McKenney wrote:
> >
> > On Sat, Jun 27, 2020 at 02:02:15PM -0700, Andy Lutomirski wrote:
> >>> On Fri, Jun 26, 2020 at 2:05 PM Paul E. McKenney wrote:
> >>>
> >>> Currently, can_stop_idle_tick() prints "NOHZ: local_softirq_pending HH"
> >>> (where "HH" is the hexadecimal softirq vector number) when one or more
> >>> non-RCU softirq handlers are still enabled when checking to stop the
> >>> scheduler-tick interrupt.  This message is not as enlightening as one
> >>> might hope, so this commit changes it to "NOHZ tick-stop error: Non-RCU
> >>> local softirq work is pending, handler #HH".
> >>
> >> Thank you!  It would be even better if it would explain *why* the
> >> problem happened, but I suppose this code doesn't actually know.
> >
> > Glad to help!
> >
> > To your point, is it possible to bisect the appearance of this message,
> > or is it as usual non-reproducible?  (Hey, had to ask!)
> >
>
> In this particular case, I tracked it down by good old-fashioned
> sleuthing for bugs, but it’s still unclear to me precisely how NOHZ
> gets involved.  The bug is that we were entering the kernel from
> usermode, doing nmi_enter(), turning on interrupts, maybe getting a
> page fault, raising a signal, turning off interrupts, nmi_exit(), and
> back to usermode, with the signal still queued and undelivered.  This
> is all kinds of bad, but I still don’t understand what softirqs or
> idle have to do with it.
>
> But I have the bug fixed now!

Glad you found it!

							Thanx, Paul
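
For anyone reading this thread later with such a warning in their log, here is a rough sketch of how the "HH" value might be decoded by hand. It assumes that the printed value is the per-CPU pending-softirq bitmask (bit N set when softirq number N is pending) and that the softirq numbering matches the enum in include/linux/interrupt.h as of mid-2020 kernels. The function name decode_softirq_pending() is made up for this illustration and is not part of any kernel or tool.

```python
from typing import List

# Assumption: softirq names in kernel enum order (include/linux/interrupt.h,
# mid-2020 kernels). Index in this list == softirq number == bit position.
SOFTIRQ_NAMES = [
    "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
    "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
]


def decode_softirq_pending(hex_value: str) -> List[str]:
    """Hypothetical helper: map the hex value from the warning to names.

    Assumes the value is a bitmask of pending softirqs, not a single
    vector number.
    """
    mask = int(hex_value, 16)
    names = [
        name for bit, name in enumerate(SOFTIRQ_NAMES) if mask & (1 << bit)
    ]
    # Report any bits beyond the known softirqs instead of dropping them.
    known_bits = len(SOFTIRQ_NAMES)
    leftover = (mask >> known_bits) << known_bits
    if leftover:
        names.append("unknown(0x%x)" % leftover)
    return names


if __name__ == "__main__":
    # Example: a warning ending in "handler #08" under the assumptions above.
    print(decode_softirq_pending("08"))
```

Under those assumptions, a value with a single bit set names one softirq, and a value with several bits set names several, which may be a hint about which subsystem raised the work before the tick-stop check.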