Subject: Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
From: "Kohli, Gaurav"
To: Peter Zijlstra
Cc: tglx@linutronix.de, mpe@ellerman.id.au, mingo@kernel.org, bigeasy@linutronix.de, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, Neeraj Upadhyay, Will Deacon
Date: Thu, 26 Apr 2018 09:34:36 +0530
References: <1524645199-5596-1-git-send-email-gkohli@codeaurora.org> <20180425200917.GZ4082@hirez.programming.kicks-ass.net>
In-Reply-To: <20180425200917.GZ4082@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/26/2018 1:39 AM, Peter Zijlstra wrote:
> On Wed, Apr 25, 2018 at 02:03:19PM +0530, Gaurav Kohli wrote:
>> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
>> index 5043e74..c5c5184 100644
>> --- a/kernel/smpboot.c
>> +++ b/kernel/smpboot.c
>> @@ -122,7 +122,45 @@ static int smpboot_thread_fn(void *data)
>>  		}
>>
>>  		if (kthread_should_park()) {
>> +			/*
>> +			 * Serialize against wakeup.
> 			 *
> 			 * Prior wakeups must complete and later wakeups
> 			 * will observe TASK_RUNNING.
> 			 *
> 			 * This avoids the case where the TASK_RUNNING
> 			 * store from ttwu() competes with the
> 			 * TASK_PARKED store from kthread_parkme().
> 			 *
> 			 * If the TASK_PARKED store loses that
> 			 * competition, kthread_unpark() will go wobbly.
>> +			 */
>> +			raw_spin_lock(&current->pi_lock);
>>  			__set_current_state(TASK_RUNNING);
>> +			raw_spin_unlock(&current->pi_lock);
>>  			preempt_enable();
>>  			if (ht->park && td->status == HP_THREAD_ACTIVE) {
>>  				BUG_ON(td->cpu != smp_processor_id());
> Does that work for you?

We have submitted the patch for testing. Reproduction usually takes around
2-3 days; we will update with the results.

>
> But looking at this a bit more; don't we have the exact same problem
> with the TASK_RUNNING store in the !ht->thread_should_run() case?
> Suppose a ttwu() happens concurrently there, it can end up competing
> against the TASK_INTERRUPTIBLE store, no?
>
> Of course, that race is not fatal, we'll just end up going around the
> loop once again I suppose. Maybe a comment there too?
>
> 	/*
> 	 * A similar race is possible here, but losing
> 	 * the TASK_INTERRUPTIBLE store is harmless and
> 	 * will make us go around the loop once more.
> 	 */

Actually, rather than a race, I am seeing a missed-wakeup problem, which is
very rare. Take the case of the hotplug thread:

Controller                      Hotplug

                                Loop start
                                set_current_state(TASK_INTERRUPTIBLE);
                                if (kthread_should_park()) {   -> fails
Set should_park,
then wake_up
                                if (!ht->thread_should_run(td->cpu)) {
                                        preempt_enable_no_resched();
                                        schedule();

The thread then goes into schedule() again (this is very rare; I am not sure
whether it is actually hit).

>
> And of course, I suspect we actually want to use TASK_IDLE, smpboot
> threads don't want signals do they? But that probably ought to be a
> separate patch.

Yes, I agree; we can control the race from here as well. Would the change
below be of any help here?

                } else {
                        __set_current_state(TASK_RUNNING);
                        preempt_enable();
                        ht->thread_fn(td->cpu);
+                       set_current_state(TASK_INTERRUPTIBLE);
+                       schedule();
                }

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation
Collaborative Project.
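
Illustration (not part of the patch): the store competition described in the
quoted comment can be replayed with a small user-space model. This is only a
sketch under stated assumptions. It is single-threaded, it is not kernel
code, and the names replay(), enum step and the replay_*() helpers are
hypothetical. It assumes the waker first samples the task state and only
later stores TASK_RUNNING, and that holding pi_lock around the thread's
TASK_RUNNING store means the waker's check and store can no longer straddle
that store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy task states; the names mirror the kernel's, the values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_PARKED };

/* Individual steps taken by the two actors in the model. */
enum step {
	WAKER_CHECK,		/* waker samples the state and decides whether to wake */
	WAKER_STORE,		/* waker stores TASK_RUNNING if its check passed */
	THREAD_SET_RUNNING,	/* thread: __set_current_state(TASK_RUNNING) */
	THREAD_SET_PARKED,	/* thread: TASK_PARKED store from kthread_parkme() */
};

/* Replay one interleaving and return the final task state. */
static enum task_state replay(const enum step *steps, size_t n)
{
	enum task_state state = TASK_INTERRUPTIBLE;	/* thread is about to sleep */
	bool will_wake = false;
	size_t i;

	assert(steps != NULL);
	for (i = 0; i < n; i++) {
		switch (steps[i]) {
		case WAKER_CHECK:
			will_wake = (state == TASK_INTERRUPTIBLE);
			break;
		case WAKER_STORE:
			if (will_wake)
				state = TASK_RUNNING;
			break;
		case THREAD_SET_RUNNING:
			state = TASK_RUNNING;
			break;
		case THREAD_SET_PARKED:
			state = TASK_PARKED;
			break;
		}
	}
	return state;
}

/* No serialization: the waker's check and store straddle the thread's stores. */
static enum task_state replay_unserialized(void)
{
	const enum step s[] = {
		WAKER_CHECK, THREAD_SET_RUNNING, THREAD_SET_PARKED, WAKER_STORE,
	};
	return replay(s, sizeof(s) / sizeof(s[0]));
}

/* Serialized, prior wakeup: check and store complete before the thread's store. */
static enum task_state replay_waker_first(void)
{
	const enum step s[] = {
		WAKER_CHECK, WAKER_STORE, THREAD_SET_RUNNING, THREAD_SET_PARKED,
	};
	return replay(s, sizeof(s) / sizeof(s[0]));
}

/* Serialized, later wakeup: the waker observes TASK_RUNNING and does not store. */
static enum task_state replay_waker_later(void)
{
	const enum step s[] = {
		THREAD_SET_RUNNING, WAKER_CHECK, WAKER_STORE, THREAD_SET_PARKED,
	};
	return replay(s, sizeof(s) / sizeof(s[0]));
}
```

In this model only the unserialized ordering ends with TASK_PARKED
overwritten by TASK_RUNNING; both serialized orderings leave the task in
TASK_PARKED. It does not model the missed-wakeup timeline above.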