Date: Thu, 16 Sep 2021 10:02:27 +0200
From: Petr Mladek
To: Pingfan Liu
Cc: Peter Zijlstra, Pingfan Liu, linux-kernel@vger.kernel.org, Andrew Morton, Wang Qing,
	Santosh Sivaraj, Sumit Garg, Will Deacon, Mark Rutland
Subject: Re: [PATCH 2/5] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create()
References: <20210915035103.15586-1-kernelfans@gmail.com> <20210915035103.15586-3-kernelfans@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 2021-09-16 11:57:44, Pingfan Liu wrote:
> On Wed, Sep 15, 2021 at 03:45:06PM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 15, 2021 at 11:51:00AM +0800, Pingfan Liu wrote:
> > > hardlockup_detector_event_create() indirectly calls
> > > kmem_cache_alloc_node(), which is blockable.
> > >
> > > So here, the really planned context is is_percpu_thread().
> > >
> > > Signed-off-by: Pingfan Liu
> > > Cc: Petr Mladek
> > > Cc: Andrew Morton
> > > Cc: Wang Qing
> > > Cc: "Peter Zijlstra (Intel)"
> > > Cc: Santosh Sivaraj
> > > Cc: Sumit Garg
> > > Cc: Will Deacon
> > > Cc: Mark Rutland
> > > To: linux-kernel@vger.kernel.org
> > > ---
> > >  kernel/watchdog_hld.c | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> > > index 247bf0b1582c..6876e796dbf5 100644
> > > --- a/kernel/watchdog_hld.c
> > > +++ b/kernel/watchdog_hld.c
> > > @@ -165,10 +165,13 @@ static void watchdog_overflow_callback(struct perf_event *event,
> > >
> > >  static int hardlockup_detector_event_create(void)
> > >  {
> > > -	unsigned int cpu = smp_processor_id();
> > > +	unsigned int cpu;
> > >  	struct perf_event_attr *wd_attr;
> > >  	struct perf_event *evt;
> > >
> > > +	/* This function plans to execute in cpu bound kthread */
> > > +	BUG_ON(!is_percpu_thread());
> > > +	cpu = raw_smp_processor_id();
> > >  	wd_attr = &wd_hw_attr;
> > >  	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
> >
> > This patch makes no sense.
>
> This patch aims to disable any attempt such as using get_cpu()/put_cpu() to
> shut up the check_preemption_disabled().

I have to say that the description of the problem is really cryptic.
Please provide more context, code paths, and sample code next time.

Well, I probably got it. The code might sleep. But it should run
on the same CPU even after waking up. You try to achieve this
by running the code in a process that is bound to a single CPU.

IMHO, this is not reliable. Anyone could change the affinity
of the process in the meantime.

I see two solutions. Either avoid the sleep or make sure that
the code accesses the per-CPU variables on the same CPU all the time.
For example, you might use

	*per_cpu_ptr(watchdog_ev, cpu) = evt;

instead of

	this_cpu_write(watchdog_ev, evt);

Best Regards,
Petr
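
To make the difference between the two stores concrete, here is a
minimal userspace sketch. It is not kernel code. Everything in it
(watchdog_ev_slots, current_cpu, sim_smp_processor_id(),
sim_sleep_and_migrate(), store_this_cpu(), store_on_cpu(), struct
perf_event_stub) is a hypothetical stand-in that models a per-CPU
variable as a plain array and models migration as a change of the
"current CPU" during a sleep. It only illustrates why a CPU number that
was read before a sleep should be used explicitly for the later store.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for struct perf_event. */
struct perf_event_stub {
	int id;
};

/* Models a per-CPU pointer variable as one slot per CPU (assumption). */
static struct perf_event_stub *watchdog_ev_slots[NR_CPUS];

/* Models the CPU that the task is currently running on (assumption). */
static unsigned int current_cpu;

static unsigned int sim_smp_processor_id(void)
{
	return current_cpu;
}

/* Models a blocking call during which the task gets moved to new_cpu. */
static void sim_sleep_and_migrate(unsigned int new_cpu)
{
	assert(new_cpu < NR_CPUS);
	current_cpu = new_cpu;
}

/* Analogue of this_cpu_write(): the slot is picked at store time. */
static void store_this_cpu(struct perf_event_stub *evt)
{
	watchdog_ev_slots[sim_smp_processor_id()] = evt;
}

/* Analogue of a store through an explicit CPU number. */
static void store_on_cpu(unsigned int cpu, struct perf_event_stub *evt)
{
	assert(cpu < NR_CPUS);
	watchdog_ev_slots[cpu] = evt;
}

/* Reads the CPU, sleeps, then stores implicitly. Returns the CPU read. */
static unsigned int create_event_implicit(struct perf_event_stub *evt,
					  unsigned int migrate_to)
{
	unsigned int cpu = sim_smp_processor_id();

	sim_sleep_and_migrate(migrate_to);
	store_this_cpu(evt);	/* lands in the slot of the new CPU */
	return cpu;
}

/* Reads the CPU, sleeps, then stores to the CPU read earlier. */
static unsigned int create_event_explicit(struct perf_event_stub *evt,
					  unsigned int migrate_to)
{
	unsigned int cpu = sim_smp_processor_id();

	sim_sleep_and_migrate(migrate_to);
	store_on_cpu(cpu, evt);	/* lands in the slot of the original CPU */
	return cpu;
}
```

In this model, the implicit variant records the event in the slot of
whatever CPU the task ended up on, while the explicit variant keeps the
event and the slot on the same CPU even if the task was moved. As far
as I know, for a variable defined with DEFINE_PER_CPU() the explicit
store in the kernel is usually spelled per_cpu(watchdog_ev, cpu) = evt;
please double check against the per-CPU headers.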