Date: Mon, 2 Jul 2018 14:15:00 +0200
From: Peter Zijlstra
To: "Isaac J. Manjarres"
Cc: matt@codeblueprint.co.uk, mingo@kernel.org, tglx@linutronix.de, bigeasy@linutronix.de, linux-kernel@vger.kernel.org, psodagud@codeaurora.org, pkondeti@codeaurora.org
Subject: Re: [PATCH v2] stop_machine: Disable preemption when waking two stopper threads
Message-ID: <20180702121500.GK2494@hirez.programming.kicks-ass.net>
References: <1530305712-16416-1-git-send-email-isaacm@codeaurora.org>
In-Reply-To: <1530305712-16416-1-git-send-email-isaacm@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 29, 2018 at 01:55:12PM -0700, Isaac J. Manjarres wrote:
> When cpu_stop_queue_two_works() begins to wake the stopper
> threads, it does so without preemption disabled, which leads
> to the following race condition:
>
> The source CPU calls cpu_stop_queue_two_works(), with cpu1
> as the source CPU, and cpu2 as the destination CPU. When
> adding the stopper threads to the wake queue used in this
> function, the source CPU stopper thread is added first,
> and the destination CPU stopper thread is added last.
>
> When wake_up_q() is invoked to wake the stopper threads, the
> threads are woken up in the order that they are queued in,
> so the source CPU's stopper thread is woken up first, and
> it preempts the thread running on the source CPU.
>
> The stopper thread will then execute on the source CPU,
> disable preemption, and begin executing multi_cpu_stop(),
> and wait for an ack from the destination CPU's stopper thread,
> with preemption still disabled.
> Since the worker thread that
> woke up the stopper thread on the source CPU is affine to the
> source CPU, and preemption is disabled on the source CPU, that
> thread will never run to dequeue the destination CPU's stopper
> thread from the wake queue, and thus, the destination CPU's
> stopper thread will never run, causing the source CPU's stopper
> thread to wait forever, and stall.
>
> Disable preemption when waking the stopper threads in
> cpu_stop_queue_two_works() to ensure that the worker thread
> that is waking up the stopper threads isn't preempted
> by the source CPU's stopper thread, and permanently
> scheduled out, leaving the remaining stopper thread asleep
> in the wake queue.
>
> Co-developed-by: Pavankumar Kondeti
> Signed-off-by: Prasad Sodagudi
> Signed-off-by: Pavankumar Kondeti
> Signed-off-by: Isaac J. Manjarres

That SoB chain is broken; if Prasad wrote the patch then there needs to
be a From: line somewhere.

But yes, that looks about right.

> ---
>  kernel/stop_machine.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index f89014a..1ff523d 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
>  		goto retry;
>  	}
>
> -	wake_up_q(&wakeq);
> +	if (!err) {
> +		preempt_disable();
> +		wake_up_q(&wakeq);
> +		preempt_enable();
> +	}
>
>  	return err;
>  }
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>