Date: Wed, 6 Jun 2018 15:51:16 +0200
From: Oleg Nesterov
To: Peter Zijlstra
Cc: "Kohli, Gaurav", tglx@linutronix.de, mpe@ellerman.id.au,
 mingo@kernel.org, bigeasy@linutronix.de, linux-kernel@vger.kernel.org,
 linux-arm-msm@vger.kernel.org, Neeraj Upadhyay, Will Deacon
Subject: Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
Message-ID: <20180606135115.GA4609@redhat.com>
References: <20180502082011.GB12180@hirez.programming.kicks-ass.net>
 <830d7225-af90-a55a-991a-bb2023d538f1@codeaurora.org>
 <55221a5b-dd52-3359-f582-86830dd9f205@codeaurora.org>
 <20180605150841.GA24053@redhat.com>
 <20180605152212.GY12180@hirez.programming.kicks-ass.net>
 <20180605154053.GB12235@hirez.programming.kicks-ass.net>
 <20180605163515.GB24053@redhat.com>
 <20180605201316.GZ12198@hirez.programming.kicks-ass.net>
In-Reply-To: <20180605201316.GZ12198@hirez.programming.kicks-ass.net>

On 06/05, Peter Zijlstra wrote:
>
> Also, I think we still need TASK_PARKED as a special state for that.

I think it would be nice to kill the TASK_PARKED state altogether. But I
don't know how. I'll try to look at this code later, but I am not sure I
will find a way to clean it up...

> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -177,12 +177,24 @@ void *kthread_probe_data(struct task_struct *task)
>  static void __kthread_parkme(struct kthread *self)
>  {
>  	for (;;) {
> -		set_current_state(TASK_PARKED);
> +		/*
> +		 * TASK_PARKED is a special state; we must serialize against
> +		 * possible pending wakeups to avoid store-store collisions on
> +		 * task->state.
> +		 *
> +		 * Such a collision might possibly result in the task state
> +		 * changing from TASK_PARKED and us failing the
> +		 * wait_task_inactive() in kthread_park().
> +		 */
> +		set_special_state(TASK_PARKED);

Agreed,

>  		if (!test_bit(KTHREAD_SHOULD_PARK, &self->flags))
>  			break;
> +
> +		complete_all(&self->parked);
>  		schedule();
>  	}
>  	__set_current_state(TASK_RUNNING);
> +	reinit_completion(&self->parked);

But how can we know that all the callers of kthread_park() have already
returned from wait_for_completion() ?

Oh. The very fact that __kthread_parkme() does complete_all() proves that
we need some serious cleanups. In particular, I think that kthread_park()
on a parked kthread must not be possible.

Just look at this code. It looks as if __kthread_parkme() can race with
_unpark() and thus we need this wait-event-like loop. But if it can race
with _unpark() then kthread_park() can block forever.

For a start, can't we change kthread_park()

	-	set_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
	+	if (test_and_set_bit(...))
	+		return -EAGAIN;

and s/complete_all/complete/ in __kthread_parkme() ?

IIUC, this will only affect smpboot_update_cpumask_percpu_thread() which
can hit an already parked thread, but it doesn't need to wait.

And it seems that smpboot_update_cpumask_percpu_thread() in turn needs
some cleanups. Hmm, and its single user: kernel/watchdog.c. And speaking
of watchdog.c, can't we simply kill the "watchdog/%u" threads? This is
off-topic, but can't watchdog_timer_fn() use
stop_one_cpu_nowait(watchdog) ?

And I really think we should unexport kthread_park/unpark(), only
smpboot_thread_fn() should use them. kthread() should not play with
__kthread_parkme(). And even KTHREAD_SHOULD_PARK must die, I mean it
should live in struct smp_hotplug_thread, not in struct kthread. OK, this
is off-topic too.

In short, I think this patch is fine but I didn't read it carefully, will
try tomorrow. And, let me repeat, can't we avoid complete_all() ?

Oleg.
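
For readers following the test_and_set_bit() suggestion above, here is a
minimal userspace sketch of the idea, not kernel code. It assumes C11
atomics as a stand-in for the kernel bitops, and every model_* name is
hypothetical. It only illustrates that a second park request on an already
pending/parked thread gets -EAGAIN instead of waiting, which is what would
let __kthread_parkme() use complete() rather than complete_all().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the KTHREAD_SHOULD_PARK flag bit. */
enum { MODEL_SHOULD_PARK = 1 };

/* Hypothetical userspace model of the flags word in struct kthread. */
struct model_kthread {
	atomic_uint flags;
};

/*
 * Models test_and_set_bit(): atomically sets the bit and reports
 * whether it was already set before this call.
 */
static bool model_test_and_set(atomic_uint *flags, unsigned int bit)
{
	return (atomic_fetch_or(flags, bit) & bit) != 0;
}

/*
 * Models the proposed check in kthread_park(): only the caller that
 * actually sets the bit proceeds (and would then wait for a single
 * complete()); anyone else gets -EAGAIN and does not wait.
 */
static int model_park_request(struct model_kthread *k)
{
	if (model_test_and_set(&k->flags, MODEL_SHOULD_PARK))
		return -EAGAIN;
	return 0;
}

/* Models the unpark side clearing the flag again. */
static void model_unpark(struct model_kthread *k)
{
	atomic_fetch_and(&k->flags, ~(unsigned int)MODEL_SHOULD_PARK);
}

/* Small walk-through of the intended behaviour; returns 0 on success. */
static int model_demo(void)
{
	struct model_kthread k;

	atomic_init(&k.flags, 0);
	assert(model_park_request(&k) == 0);       /* first caller wins */
	assert(model_park_request(&k) == -EAGAIN); /* already requested */
	model_unpark(&k);
	assert(model_park_request(&k) == 0);       /* can park again */
	return 0;
}
```

With at most one successful requester per park cycle there is at most one
waiter, so in this model a plain complete() would be enough and there is no
reinit_completion() window to worry about. Whether that holds for all real
callers is exactly the open question in the mail.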