Subject: Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
To: Peter Zijlstra
Cc: tglx@linutronix.de, mpe@ellerman.id.au, mingo@kernel.org,
    bigeasy@linutronix.de, linux-kernel@vger.kernel.org,
    linux-arm-msm@vger.kernel.org, Neeraj Upadhyay, Will Deacon,
    Oleg Nesterov
References: <1524645199-5596-1-git-send-email-gkohli@codeaurora.org>
    <20180425200917.GZ4082@hirez.programming.kicks-ass.net>
    <20180426084131.GV4129@hirez.programming.kicks-ass.net>
    <20180426085719.GW4129@hirez.programming.kicks-ass.net>
    <4d3f68f8-e599-6b27-a2e8-9e96b401d57a@codeaurora.org>
    <20180430111744.GE4082@hirez.programming.kicks-ass.net>
From: "Kohli, Gaurav"
Message-ID: <3af3365b-4e3f-e388-8e90-45a3bd4120fd@codeaurora.org>
Date: Tue, 1 May 2018 13:20:26 +0530
In-Reply-To: <20180430111744.GE4082@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Sorry for the spam; adding the list.

On 4/30/2018
4:47 PM, Peter Zijlstra wrote:
> On Thu, Apr 26, 2018 at 09:23:25PM +0530, Kohli, Gaurav wrote:
>> On 4/26/2018 2:27 PM, Peter Zijlstra wrote:
>>
>>> On Thu, Apr 26, 2018 at 10:41:31AM +0200, Peter Zijlstra wrote:
>>>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>>>> index cd50e99202b0..4b6503c6a029 100644
>>>> --- a/kernel/kthread.c
>>>> +++ b/kernel/kthread.c
>>>> @@ -177,12 +177,13 @@ void *kthread_probe_data(struct task_struct *task)
>>>>  static void __kthread_parkme(struct kthread *self)
>>>>  {
>>>> -	__set_current_state(TASK_PARKED);
>>>> -	while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
>>>> +	for (;;) {
>>>> +		__set_task_state(TASK_PARKED);
>>> set_current_state(TASK_PARKED);
>>>
>>> of course..
>>
>> Hi Peter,
>>
>> Maybe i am missing something , but still that race can come as we don't put task_parked on special state.
>>
>> Controller                      Hotplug
>>
>>                                 Loop
>>                                 Task_Interruptible
>> Set SHOULD_PARK
>> wakeup -> Proceeds
>>                                 Set Running
>>                                 kthread_parkme
>>                                 SET TASK_PARKED
>>                                 schedule
>> Set TASK_RUNNING
>>
>> Can you please correct ME, if I misunderstood this.
>
> If that could happen, all wait-loops would be broken. However,
> AFAICT that cannot happen, because ttwu_remote() and schedule()
> serialize on rq->lock.
> See:
>
>
>       A                               B
>
>       for (;;) {
>               set_current_state(UNINTERRUPTIBLE);
>
>                                       cond1 = true;
>                                       wake_up_process(A)
>                                         lock(A->pi_lock)
>                                         smp_mb__after_spinlock()
>                                         if (A->state & TASK_NORMAL)
>                                           A->on_rq && ttwu_remote()
>               if (cond1) // true
>                       break;
>               schedule();
>       }
>       __set_current_state(RUNNING);
>

Hi Peter,

Sorry for the late reply; I was on leave. Thanks for the new patches. We
will apply them and test for issue reproduction.

But in our older case, where we have seen the failure, the wakeup path and
ftraces are below. The wakeup occurred and completed before the schedule()
call, so the final state of the cpuhp thread is running, not parked. I have
also pasted the debug ftraces that we got during issue reproduction.

The wakeup path for cpuhp is:

takedown_cpu -> kthread_park -> wake_up_process

39,034,311,742,395 apps (10240) Trace Printk cpuhp/0 (16) [000] 39015.625000: __kthread_parkme state=512 task=ffffffcc7458e680 flags: 0x5
    -> state 5 -> state is parked inside parkme function

39,034,311,846,510 apps (10240) Trace Printk cpuhp/0 (16) [000] 39015.625000: before schedule __kthread_parkme state=0 task=ffffffcc7458e680 flags: 0xd
    -> just before schedule call, state is running

static void __kthread_parkme(struct kthread *self)
{
	__set_current_state(TASK_PARKED);
	while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
		if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
			complete(&self->parked);
		schedule();
		__set_current_state(TASK_PARKED);
	}
	clear_bit(KTHREAD_IS_PARKED, &self->flags);
	__set_current_state(TASK_RUNNING);
}

So my point is that here too, if the thread is rescheduled it can set
TASK_PARKED again, but it seems that after the takedown_cpu call this thread
never gets a chance to run, so the final state is TASK_RUNNING.

With the current fix, can't we observe the same scenario, where the final
state is TASK_RUNNING?
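To make the state= and flags: values in the two traces easier to read, here
is a small user-space sketch that decodes them. It is only an illustration
under assumptions: it takes TASK_PARKED to be 512 (as the first trace
annotation implies; the value differs between kernel versions) and takes the
flag bit positions from enum KTHREAD_BITS in kernel/kthread.c (IS_PER_CPU = 0,
SHOULD_STOP, SHOULD_PARK, IS_PARKED). The helper names state_name() and
decode_flags() are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, see the note above; they are not taken from the traces. */
#define TASK_RUNNING 0
#define TASK_PARKED  512

/* Hypothetical helper: name the two task states that appear in the traces. */
static const char *state_name(long state)
{
	switch (state) {
	case TASK_RUNNING:
		return "TASK_RUNNING";
	case TASK_PARKED:
		return "TASK_PARKED";
	default:
		return "other";
	}
}

/* Hypothetical helper: write the set kthread flag bits into buf as "A|B". */
static void decode_flags(unsigned long flags, char *buf, size_t len)
{
	/* Index in this table == assumed bit number in enum KTHREAD_BITS. */
	static const char *const names[] = {
		"IS_PER_CPU", "SHOULD_STOP", "SHOULD_PARK", "IS_PARKED",
	};
	size_t i, used = 0;

	assert(buf && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		int n;

		if (!(flags & (1UL << i)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", names[i]);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)n;
	}
}
```

Under those assumptions, 0x5 decodes to IS_PER_CPU|SHOULD_PARK and 0xd to
IS_PER_CPU|SHOULD_PARK|IS_PARKED, so between the two traces only
KTHREAD_IS_PARKED was set while state went from 512 to 0.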
Regards
Gaurav

>       for (;;) {
>               set_current_state(UNINTERRUPTIBLE);
>               if (cond2)
>                       break;
>
>               schedule();
>                 lock(rq->lock)
>                 smp_mb__after_spinlock();
>                 deactivate_task(A);
>
>                 unlock(rq->lock);
>                                         rq = __task_rq_lock(A)
>                                         if (A->on_rq) // false
>                                           A->state = TASK_RUNNING;
>                                         __task_rq_unlock(rq)
>
>
> Either A's schedule() must observe RUNNING (not shown) or B must
> observe !A->on_rq (shown) and not issue the store.

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
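P.S. The ordering rule quoted in this mail ("either A's schedule() must
observe RUNNING or B must observe !A->on_rq and not issue the store") can be
replayed in a single-threaded toy model. This is only a sketch under stated
assumptions: struct toy_task and the toy_*() helpers are hypothetical names,
each helper stands in for one critical section under rq->lock, and pi_lock,
the memory barriers and the rest of try_to_wake_up() are not modelled. It
replays the two orderings and proves nothing about the real kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define TASK_RUNNING         0
#define TASK_UNINTERRUPTIBLE 2

/* Hypothetical stand-in for the two task fields the quoted diagram uses. */
struct toy_task {
	long state;
	bool on_rq;
};

/* B's side: the store happens only while A is still on the runqueue. */
static bool toy_ttwu_remote(struct toy_task *p)
{
	assert(p);
	if (!p->on_rq)
		return false;		/* B observes !on_rq: no store */
	p->state = TASK_RUNNING;
	return true;
}

/* A's side: the part of schedule() that runs under rq->lock. */
static bool toy_schedule(struct toy_task *p)
{
	assert(p);
	if (p->state == TASK_RUNNING)
		return false;		/* A observes RUNNING: does not sleep */
	p->on_rq = false;		/* deactivate_task() */
	return true;
}

/* Ordering 1: B's critical section runs before A's. */
static struct toy_task toy_wake_then_schedule(void)
{
	struct toy_task a = { TASK_RUNNING, true };

	a.state = TASK_UNINTERRUPTIBLE;	/* A: set_current_state() */
	toy_ttwu_remote(&a);		/* B: stores TASK_RUNNING */
	toy_schedule(&a);		/* A: sees RUNNING, stays queued */
	return a;
}

/* Ordering 2: A's critical section runs before B's. */
static struct toy_task toy_schedule_then_wake(void)
{
	struct toy_task a = { TASK_RUNNING, true };

	a.state = TASK_UNINTERRUPTIBLE;	/* A: set_current_state() */
	toy_schedule(&a);		/* A: dequeued */
	toy_ttwu_remote(&a);		/* B: sees !on_rq, issues no store */
	return a;
}
```

In the first ordering the task ends up TASK_RUNNING and still queued; in the
second it is dequeued and its state is left untouched by this step. The full
wakeup that would follow in the second case is outside this model.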