From: "Isaac J. Manjarres"
To: peterz@infradead.org, matt@codeblueprint.co.uk, mingo@kernel.org,
    tglx@linutronix.de, bigeasy@linutronix.de
Cc: Prasad Sodagudi, linux-kernel@vger.kernel.org,
    gregkh@linuxfoundation.org, "Isaac J. Manjarres",
    stable@vger.kernel.org
Subject: [PATCH] stop_machine: Atomically queue and wake stopper threads
Date: Fri, 3 Aug 2018 13:56:06 -0700
Message-Id: <1533329766-4856-1-git-send-email-isaacm@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Prasad Sodagudi

When cpu_stop_queue_work() releases the lock for the stopper
thread that was queued into its wake queue, preemption is
enabled, which leads to the following deadlock:

CPU0                              CPU1
sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)              stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)       cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
                                  -spins with preemption disabled,
                                   waiting for migration/0's lock to be
                                   released-

-adds work items for migration/0
 and queues migration/0 to its
 wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted,
 and __set_cpus_allowed_ptr
 has changed the thread's
 cpu allowed mask to CPU1 only-

                                  -acquires migration/0 and migration/1's
                                   locks-

                                  -adds work for migration/0 but does not
                                   add migration/0 to wake_q, since it is
                                   already in a wake_q-

                                  -adds work for migration/1 and adds
                                   migration/1 to its wake_q-

                                  -releases migration/0 and migration/1's
                                   locks, wakes migration/1, and enables
                                   preemption-

                                  -since migration/1 is requested to run,
                                   migration/1 begins to run and waits on
                                   migration/0, but migration/0 will never
                                   be able to run, since the thread that
                                   can wake it is affine to CPU1-

Disable preemption in cpu_stop_queue_work() before queueing works for
stopper threads, and queueing the stopper thread in the wake queue, to
ensure that the operation of queueing the works and waking the stopper
threads is atomic.

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Isaac J. Manjarres
Signed-off-by: Prasad Sodagudi
Signed-off-by: Isaac J. Manjarres
Cc: stable@vger.kernel.org
---
 kernel/stop_machine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 34b6652..067cb83 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -81,6 +81,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
 	unsigned long flags;
 	bool enabled;
 
+	preempt_disable();
 	raw_spin_lock_irqsave(&stopper->lock, flags);
 	enabled = stopper->enabled;
 	if (enabled)
@@ -90,6 +91,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
 	raw_spin_unlock_irqrestore(&stopper->lock, flags);
 
 	wake_up_q(&wakeq);
+	preempt_enable();
 
 	return enabled;
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
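
For readers who want to step through the interleaving described in the
commit message, here is a rough userspace sketch. It is a toy model and
not kernel code: struct toy_task, toy_wake_q_add(), toy_wake() and
toy_run() are hypothetical names invented for illustration. The only
behaviour it tries to mirror is the one the commit message relies on,
namely that a task already sitting in a wake queue is not added to a
second one, so a delayed wake on CPU0 can leave migration/0 asleep.

```c
#include <stdbool.h>

/* Toy stand-in for a stopper thread; not a kernel structure. */
struct toy_task {
	bool in_wake_q;	/* already linked into some wake queue */
	bool awake;	/* has been woken by somebody */
};

/* A task that is already in a wake queue is not queued a second time. */
static bool toy_wake_q_add(struct toy_task *t)
{
	if (t->in_wake_q)
		return false;
	t->in_wake_q = true;
	return true;
}

/* Deliver the wakeup and drop the task from its wake queue. */
static void toy_wake(struct toy_task *t)
{
	t->in_wake_q = false;
	t->awake = true;
}

/*
 * Replays the sequence from the commit message in a fixed order.
 * Returns true when both stoppers end up awake, false when
 * migration/0 is left asleep (the deadlock).
 */
static bool toy_run(bool preempt_disabled_across_wake)
{
	struct toy_task m0 = { false, false };	/* migration/0 */
	struct toy_task m1 = { false, false };	/* migration/1 */

	/* CPU0: queue work for migration/0 and add it to the local wake_q. */
	bool cpu0_queued = toy_wake_q_add(&m0);

	/*
	 * CPU0 drops the stopper lock here. With preemption disabled the
	 * wake follows immediately. Without it, the model assumes the
	 * worst case: the thread is preempted and becomes affine to CPU1
	 * before it can issue the wake.
	 */
	if (preempt_disabled_across_wake && cpu0_queued)
		toy_wake(&m0);

	/* CPU1: queue works for both stoppers, wake whatever was queued. */
	bool cpu1_queued_m0 = toy_wake_q_add(&m0);
	bool cpu1_queued_m1 = toy_wake_q_add(&m1);

	if (cpu1_queued_m0)
		toy_wake(&m0);
	if (cpu1_queued_m1)
		toy_wake(&m1);

	/*
	 * migration/1 now occupies CPU1 and waits for migration/0. The
	 * preempted waker can only run on CPU1, so in the unfixed case
	 * its pending wake is never delivered.
	 */
	return m0.awake && m1.awake;
}
```

Calling toy_run(false) replays the unpatched ordering and toy_run(true)
replays the ordering with the preempt_disable()/preempt_enable() pair in
place. The model hard-codes the worst-case preemption point, so it says
nothing about how likely the race is on real hardware.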