Received: by 2002:a6b:fb09:0:0:0:0:0 with SMTP id h9csp1467330iog; Sat, 25 Jun 2022 10:04:23 -0700 (PDT) X-Google-Smtp-Source: AGRyM1vCqFNo/x9QNuzCt4tnF8fn8O/JkkjX/J2WylyHxDWnKGHFS/KmVmv6SGl+PEvN+WnjOj7B X-Received: by 2002:a05:6402:2077:b0:435:a428:76e4 with SMTP id bd23-20020a056402207700b00435a42876e4mr6023536edb.367.1656176663644; Sat, 25 Jun 2022 10:04:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1656176663; cv=none; d=google.com; s=arc-20160816; b=lTwLLpWW/3j87ZwnANfB2aEb1I7M0PdDAgP1st0Eyr/82zn5DjiqwqvrmNZYpXEnrB auBdLB8kM0LrOjHDOm+5fg6/queK+99Wjb5O+Mjqph2apBrpNpskaF9s/ojTt3ZtSbEy DcPW7X/bXAsttuFEWUMwR+w5zRnt6eOwdvgJ3MOe84c6KmpFCaWpb3cLdymgdd4W6Ozk J9YQ2oC2fO9eT+E8u1X0V0JtDozVbEsAdXqrJf+sv5Zfgah80e4KhnmwRlC6PfYzVR0R 418Gjk3qMGeSZeXA4rypXX1wn7FYgdgUKu2M9jgpxioTfxAEsTj2S2NMBo9OuMW1F1xA MU1A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=0zs6mVdLWta+BbG4DcGuj6Z7cHElVuJa5tXJ/0ojnqI=; b=sY8LBFOPyQdxh9HAghhqL8A6rNi6iPzD6wEjSlKoCQ4fTbW2IZeyuX9wJrB6x+pJYD QKPnFrk9cNS9LMaSFO8b+kv3QsQ4vKvd2ZwZDtCsN2Wta4keCQRB7g9US9kLxUk+LkjS qUOE4/4f5e0eaWngfKCtCM5BnB57wkq2m3CsHWcgbPhZjoGUTNNmqWwA3M4bhIwNOzCD kpwg5xtmW+rCGgcsGxW0DVGEH4jSpYNChQBBfsJF2A6ZaakL2nqQnv2Pyw85t4j1JpM6 Ss1IpFo4Fp77YKYwWNeKXrsIb89A5OUPvqteFMdXkdkPtsyM+flK18tpUfhddPpMLlvv SsXA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=MMbGvUr8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h11-20020a056402280b00b00435dad43328si9335831ede.569.2022.06.25.10.03.59; Sat, 25 Jun 2022 10:04:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=MMbGvUr8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233207AbiFYRB6 (ORCPT + 99 others); Sat, 25 Jun 2022 13:01:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39528 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232827AbiFYRB5 (ORCPT ); Sat, 25 Jun 2022 13:01:57 -0400 Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B00E1582B for ; Sat, 25 Jun 2022 10:01:56 -0700 (PDT) Received: by mail-ej1-x62d.google.com with SMTP id ay16so10732639ejb.6 for ; Sat, 25 Jun 2022 10:01:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=0zs6mVdLWta+BbG4DcGuj6Z7cHElVuJa5tXJ/0ojnqI=; b=MMbGvUr8JbF8zGULiMCG0NN+n39KsHqVeu1e1xyMJ51gWEf8pH3qJksL6oKwcuqjnZ z1wsiGDRdvMOFE0gQ2Fv3k8wYcsU0FpW8SpLkOITiCyjF42i0JWH2hCz6s0jQURzZfcD +e+pzUcLCnPMdTR3uYcr1IZgur97M7SQHmw8Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=0zs6mVdLWta+BbG4DcGuj6Z7cHElVuJa5tXJ/0ojnqI=; b=ki86l2DdP28JhhMzbv4qQn0cRxO1bgXHXSnHXhV92BWGAeJddyx8fwqQSlMBb+77Ks A6y86mHFfgLWVcWt/aGLpbvsPJ94y4R1xJFg0Us2xxtHc5Q8n3r3MrUCVby5n+Lg2qjp pfLrXVXtz6IH/VHwU0e+5Hksxgx3HbQDI2sOvh56QM9fAFQZBcbiL+QNASXB8ocB1Taz IEoB4eVazoWdPk7sfMyVMNo1jzjj8QcAmXYSspW1ZjEEAjZXx2TokjE1vinT42SHON8Z R0sWzclGSTh4yuQPC2KJa/wIqGYOxLNxK87l8u2GMDbYztfMjA2gpV84UzvPT/5PoFjo 38rQ== X-Gm-Message-State: AJIora8mfgR95uT8lb1ZCwMkFjbkwv8iOjA1G6rGX1o2oIj7i3wLM1zQ nxeJirgeB11Tu513HrxFoc6p0LIlXOtwsADR X-Received: by 2002:a17:906:ce:b0:718:bff8:58c3 with SMTP id 14-20020a17090600ce00b00718bff858c3mr4553661eji.512.1656176514484; Sat, 25 Jun 2022 10:01:54 -0700 (PDT) Received: from mail-wm1-f41.google.com (mail-wm1-f41.google.com. [209.85.128.41]) by smtp.gmail.com with ESMTPSA id hz11-20020a1709072ceb00b00722fc014e8csm2786384ejc.104.2022.06.25.10.01.51 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 25 Jun 2022 10:01:52 -0700 (PDT) Received: by mail-wm1-f41.google.com with SMTP id c130-20020a1c3588000000b0039c6fd897b4so4943414wma.4 for ; Sat, 25 Jun 2022 10:01:51 -0700 (PDT) X-Received: by 2002:a05:600c:354c:b0:39c:7e86:6ff5 with SMTP id i12-20020a05600c354c00b0039c7e866ff5mr9826295wmq.145.1656176511175; Sat, 25 Jun 2022 10:01:51 -0700 (PDT) MIME-Version: 1.0 References: <20220622140853.31383-1-pmladek@suse.com> In-Reply-To: From: Linus Torvalds Date: Sat, 25 Jun 2022 10:01:35 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: re. Spurious wakeup on a newly created kthread To: Tejun Heo Cc: Petr Mladek , Lai Jiangshan , Michal Hocko , Linux Kernel Mailing List , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Andrew Morton , Oleg Nesterov , "Eric W. Biederman" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 24, 2022 at 10:00 PM Tejun Heo wrote: > > So, Petr debugged a NULL deref in workqueue code to a spurious wakeup > on a newly created kthread. What? No. That patch can't be right for several reasons. What we call "spurious wakeups" exist, but they are about wakeups that happen from being on a _previous_ wait-queue, and having already been removed from it. They aren't "really" spurious, they are just asynchronous enough (and thus unexpected) that you basically should never have a "sleep on wait-queue" without then looping and re-testing the condition. There is no _truly_ spurious wakeup. You were always woken up for a reason, it's just that there are more reasons than the entirely obvious ones. For example, the reason that quoted patch cannot be right is that this code pattern: while (wait_for_completion_interruptible(&worker->ready_to_start)) ; is not valid kernel code. EVER. There is absolutely no way that can be correct. Either that code can take a signal, or it cannot. If it can take a signal, it had better react to said signal. If it cannot, it must not use an interruptble sleep - since now that loop turned into a kernel-side busy-loop. So NAK on this kind of crazy "I don't know what happened, so I'll just add *more* bugs to the code" voodoo programming. And no, we don't "fix" that by then adding a timeout. Stop this "add random code" thing. If you cannot be woken up before X happens, then you should: - don't go to sleep before X happens - don't add yourself to any wait-queues before X happens - don't expose your process to others before X happens The solution is *not* to add some completion with random "wait for this before waking". I think the problem here is much more fundamental: you expect a new thread to not wake up until you've told it to. We do have that infrastructure in the kernel: when you create a new thread, you can do various setup, and the thread won't actually run until you do "wake_up_new_task()" on it. However, that's not how kernel_thread() (or kernel_clone()) works. Those will call wake_up_new_task(p) for you, and as such a new kernel thread will immediately start running. So I think the expectations here are entirely wrong. I think create_worker() is fundamentally buggy, in how it does that /* start the newly created worker */ .. wake_up_process(worker->task); because that wake_up_process() is already much too late. The process got woken up already, because it was created by create_kthread() -> kernel_thread() -> kernel_clone, which does that wake_up_new_task() and it starts running. If you want to do thread setup *bnefore* the kernel is running, it needs to be done before that wake_up_new_task(). That's very possible. Look at what create_io_thread() does, for example: it never calls wake_up_new_process() at all, and leaves that to the caller, which has done the extra setup. Or the kernel_clone_args could have a "init" function that gets called before doing the wake_up_new_task() is done. Or a number of other solutions. But no, we're not randomly adding some new completion because people were confused and thought they were waking things up when it was already awake from before. Linus