Date: Thu, 18 Jul 2019 11:58:47 +0200 (CEST)
From: Thomas Gleixner
To: luferry
cc: "Peter Zijlstra (Intel)" , Rik van Riel , Greg Kroah-Hartman , Josh Poimboeuf , linux-kernel@vger.kernel.org
Subject: Re:Re: [PATCH v2] smp: avoid generic_exec_single cause system lockup
In-Reply-To: <5f5fbd7.1073c.16c0446ea63.Coremail.luferry@163.com>
Message-ID:
References: <20190718080308.48381-1-luferry@163.com> <5f5fbd7.1073c.16c0446ea63.Coremail.luferry@163.com>
User-Agent: Alpine 2.21 (DEB 202 2017-01-01)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 18 Jul 2019, luferry wrote:
> At 2019-07-18 16:07:58, "Thomas Gleixner" wrote:
> >On Thu, 18 Jul 2019, luferry@163.com wrote:
> >
> >> From: luferry
> >>
> >> The race can reproduced by sending wait enabled IPI in softirq/irq env
> >
> >Which code path is doing that?
>
> I checked kernel and found no code path can run into this.

For a good reason.

> Actually , i encounter with this problem by my own code.
> I need to do some specific urgent work periodicity and these
> work may run for quite a while. So i can't disable irq during these work
> which stops me from using hrtimer to do this. So i did add an extra
> sofitrq action which may invoke smp_call.

Well, from softirq handling context the only allowed interface is
smp_call_function_single_async(). The code is actually missing a warning to
that effect. See below.

Vs. your proposed change. It's broken in various ways and no, we are not
going to support that and definitely we are not going to disable interrupts
around a loop over all cpus in a mask.

Thanks,

	tglx

8<--------------

Subject: smp: Warn on function calls from softirq context
From: Thomas Gleixner
Date: Thu, 18 Jul 2019 11:20:09 +0200

It's clearly documented that smp function calls cannot be invoked from
softirq handling context. Unfortunately nothing enforces that or emits a
warning.

A single function call can be invoked from softirq context only via
smp_call_function_single_async().

Reported-by: luferry
Signed-off-by: Thomas Gleixner
---
 kernel/smp.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -291,6 +291,15 @@ int smp_call_function_single(int cpu, sm
 	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
 		     && !oops_in_progress);
 
+	/*
+	 * Can deadlock when the softirq is executed on return from
+	 * interrupt and the interrupt hit between llist_add() and
+	 * arch_send_call_function_single_ipi() because then this
+	 * invocation sees the list non-empty, skips the IPI send
+	 * and waits forever.
+	 */
+	WARN_ON_ONCE(in_serving_softirq() && wait);
+
 	csd = &csd_stack;
 	if (!wait) {
 		csd = this_cpu_ptr(&csd_data);
@@ -416,6 +425,13 @@ void smp_call_function_many(const struct
 	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
 		     && !oops_in_progress && !early_boot_irqs_disabled);
 
+	/*
+	 * Bottom half handlers are not allowed to call this as they might
+	 * corrupt cfd_data when the interrupt which triggered softirq
+	 * processing hit this function.
+	 */
+	WARN_ON_ONCE(in_serving_softirq());
+
 	/* Try to fastpath.  So, what's a CPU they want? Ignoring this one. */
 	cpu = cpumask_first_and(mask, cpu_online_mask);
 	if (cpu == this_cpu)