Subject: Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup
To: Peter Zijlstra
Cc: tglx@linutronix.de, mpe@ellerman.id.au, dzickus@redhat.com,
 mingo@kernel.org, bigeasy@linutronix.de, linux-kernel@vger.kernel.org,
 linux-arm-msm@vger.kernel.org, Neeraj Upadhyay
References: <1524562105-31026-1-git-send-email-gkohli@codeaurora.org>
 <20180424182628.GW4043@hirez.programming.kicks-ass.net>
From: "Kohli, Gaurav"
Message-ID: <75ced7f3-e596-1942-f843-d43cf162103b@codeaurora.org>
Date: Wed, 25 Apr 2018 00:16:19 +0530
In-Reply-To: <20180424182628.GW4043@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/24/2018 11:56 PM, Peter Zijlstra wrote:
> On Tue, Apr 24, 2018 at 02:58:25PM +0530, Gaurav Kohli wrote:
>> The control cpu thread which initiates hotplug calls kthread_park()
>> for hotplug thread and sets KTHREAD_SHOULD_PARK. After this control
>> thread wakes up the hotplug thread.
>> There is a chance that wakeup
>> code sees the hotplug thread (running on AP core) in INTERRUPTIBLE
>> state, but sets its state to RUNNING after hotplug thread has entered
>> kthread_parkme() and changed its state to TASK_PARKED. This can result
>> in panic later on in kthread_unpark(), as it sees KTHREAD_IS_PARKED
>> flag set but fails to rebind the kthread, due to it being not in
>> TASK_PARKED state. Fix this, by serializing wakeup state change,
>> against state change before parking the kthread.
>>
>> Below is the possible race:
>>
>> Control thread                            Hotplug Thread
>>
>> kthread_park()
>> set KTHREAD_SHOULD_PARK
>>                                           smpboot_thread_fn
>>                                           set_current_state(TASK_INTERRUPTIBLE);
>>                                           kthread_parkme
>>
>> wake_up_process()
>>
>> raw_spin_lock_irqsave(&p->pi_lock, flags);
>> if (!(p->state & state)) -> this will fail
>>         goto out;
>>
>>                                           __kthread_parkme
>>                                           __set_current_state(TASK_PARKED);
>>
>> if (p->on_rq && ttwu_remote(p, wake_flags))
>>   ttwu_remote()
>>   p->state = TASK_RUNNING;
>>                                           schedule();
>>
>> So to avoid this race, take pi_lock to serial state changes.
>>
>> Suggested-by: Pavankumar Kondeti
>> Co-developed-by: Neeraj Upadhyay
>> Signed-off-by: Neeraj Upadhyay
>> Signed-off-by: Gaurav Kohli
>>
>> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
>> index 1650578..514b232 100644
>> --- a/kernel/smpboot.c
>> +++ b/kernel/smpboot.c
>> @@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
>>  		}
>>  
>>  		if (kthread_should_park()) {
>> +			raw_spin_lock(&current->pi_lock);
>>  			__set_current_state(TASK_RUNNING);
>> +			raw_spin_unlock(&current->pi_lock);
>>  			preempt_enable();
>>  			if (ht->park && td->status == HP_THREAD_ACTIVE) {
>>  				BUG_ON(td->cpu != smp_processor_id());
> Note how in your scenario above you didn't actually need the
> TASK_RUNNING state; so how is this change going to fix anything?

Hi Peter,

With this change, if the kthread_should_park() path runs first, the
controller's wake_up_process() call exits early because the task is
already set to TASK_RUNNING. Otherwise, if the controller runs first,
we block here on pi_lock, then set TASK_RUNNING, and only after that
set TASK_PARKED. So there is no chance of the cpuhp thread being set
to running during the kthread_parkme() call.

But as we discussed, this can be fixed by your 2nd patch as well. Once
you have had time to look at it, please let us know. Or would you like
me to test your 2nd patch first?

>
> But yes, I suspect it is right, but it definitely needs a comment
> explaining wth we take that lock there.
>
> Like I said earlier, my brain is entirely fried for the day; but I'll
> have a try tomorrow.
>

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
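
For illustration only, the two orderings described in the reply above can be modeled in user space. This is a minimal sketch, not kernel code: struct model_task, the MODEL_* state values, MODEL_TASK_INIT and the model_*() helpers are all hypothetical names invented for this example, and a C11 atomic_flag spinlock stands in for p->pi_lock. It shows only the claim that, once the parker's TASK_RUNNING store is made under the same lock as the waker's check-and-set, the later TASK_PARKED store is not overwritten in either ordering. It says nothing about whether the patch is sufficient in the real scheduler.

```c
/* User-space model only; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdatomic.h>

/* Model values; they only need to be distinct bits (RUNNING is 0). */
enum {
	MODEL_RUNNING       = 0,
	MODEL_INTERRUPTIBLE = 1,
	MODEL_PARKED        = 2,
};

struct model_task {
	atomic_flag pi_lock;	/* stands in for p->pi_lock */
	int state;		/* stands in for p->state   */
};

#define MODEL_TASK_INIT(s) { ATOMIC_FLAG_INIT, (s) }

static void model_lock(struct model_task *t)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&t->pi_lock,
						 memory_order_acquire))
		;
}

static void model_unlock(struct model_task *t)
{
	atomic_flag_clear_explicit(&t->pi_lock, memory_order_release);
}

/*
 * Waker side: check the state and set RUNNING as one step under the lock.
 * Returns 1 if the task was woken, 0 if the state did not match the mask.
 */
static int model_wake_up(struct model_task *t, int mask)
{
	int woke = 0;

	model_lock(t);
	if (t->state & mask) {
		t->state = MODEL_RUNNING;
		woke = 1;
	}
	model_unlock(t);
	return woke;
}

/* Parker side, step 1: the locked RUNNING store that the patch adds. */
static void model_set_running_locked(struct model_task *t)
{
	model_lock(t);
	t->state = MODEL_RUNNING;
	model_unlock(t);
}

/* Parker side, step 2: park; only reached after step 1 in this model. */
static void model_parkme(struct model_task *t)
{
	assert(t->state == MODEL_RUNNING);
	t->state = MODEL_PARKED;
}
```

Calling model_set_running_locked() before model_wake_up() models the "kthread_should_park runs first" case, where the wakeup bails out. Calling them in the opposite order models the "controller runs first" case. In both, model_parkme() comes last and the task ends up in MODEL_PARKED.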