Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752045AbdGERXp (ORCPT ); Wed, 5 Jul 2017 13:23:45 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:53518 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751772AbdGERXo (ORCPT ); Wed, 5 Jul 2017 13:23:44 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 4814C6043F Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=markivx@codeaurora.org Subject: Re: [PATCH] kthread: Atomically set completion and perform dequeue in __kthread_parkme To: Thomas Gleixner Cc: Rusty Russell , Tejun Heo , Andrew Morton , LKML , Peter Zijlstra , Sebastian Sewior References: <1498515483-12743-1-git-send-email-markivx@codeaurora.org> <318fac36-66cd-7f90-df61-44042119ee2e@codeaurora.org> From: Vikram Mulukutla Message-ID: <51ef82b6-23b6-899e-ed5f-2e89c53f5399@codeaurora.org> Date: Wed, 5 Jul 2017 10:23:42 -0700 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2270 Lines: 49 On 7/4/2017 12:49 PM, Thomas Gleixner wrote: > On Mon, 26 Jun 2017, Vikram Mulukutla wrote: >> On 6/26/2017 3:18 PM, Vikram Mulukutla wrote: >>> kthread_park waits for the target kthread to park itself with >>> __kthread_parkme using a completion variable. __kthread_parkme - which is >>> invoked by the target kthread - sets the completion variable before >>> calling schedule() to voluntarily get itself off of the runqueue. >>> >>> This causes an interesting race in the hotplug path. takedown_cpu() >>> invoked for CPU_X attempts to park the cpuhp/X hotplug kthread before >>> running the stopper thread on CPU_X. kthread_unpark doesn't guarantee that >>> cpuhp/X is off of X's runqueue, only that the thread has executed >>> __kthread_parkme and set the completion. cpuhp/X may have been preempted >>> out before calling schedule() to voluntarily sleep. takedown_cpu proceeds >>> to run the stopper thread on CPU_X which promptly migrates off the >>> still-on-rq cpuhp/X thread to another cpu CPU_Y, setting its affinity >>> mask to something other than CPU_X alone. >>> >>> This is OK - cpuhp/X may finally get itself off of CPU_Y's runqueue at >>> some later point. But if that doesn't happen (for example, if there's >>> an RT thread on CPU_Y), the kthread_unpark in a subsequent cpu_up call >>> for CPU_X will race with the still-on-rq condition. Even now we're >>> functionally OK because there is a wait_task_inactive in the >>> kthread_unpark(), BUT the following happens: >>> >>> [ 12.472745] BUG: scheduling while atomic: swapper/7/0/0x00000002 > > Thats not the worst problem. We could simply enable preemption there, but > the real issue is that this is the idle task of the upcoming CPU which is > not supposed to schedule in the first place. > > So no, your 'fix' is just papering over the underlying issue. > > And yes, the moron who did not think about wait_task_inactive() being > called via kthread_unpark() -> kthread_bind() is me. > > I'm testing a proper fix for it right now. Will post later. Thanks, it did totally wrong to have any sort of scheduling in the idle thread as the subsequent warnings do indicate, but I didn't feel confident enough to mess around with the hotplug state machine. > > Thanks, > > tglx > Thanks, Vikram