Date: Sat, 25 Jun 2022 14:00:10 +0900
From: Tejun Heo
To: Petr Mladek
Cc: Lai Jiangshan, Michal Hocko, linux-kernel@vger.kernel.org, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Andrew Morton, Linus Torvalds, Oleg Nesterov, "Eric W. Biederman"
Subject: re. Spurious wakeup on a newly created kthread
In-Reply-To: <20220622140853.31383-1-pmladek@suse.com>

Hello,

cc'ing random assortment of ppl who touched kernel/kthread.c and
others who would know better.

So, Petr debugged a NULL deref in workqueue code to a spurious wakeup
on a newly created kthread. The abbreviated patch description follows.
The original message is at

  http://lkml.kernel.org/r/20220622140853.31383-1-pmladek@suse.com

On Wed, Jun 22, 2022 at 04:08:53PM +0200, Petr Mladek wrote:
> A system crashed with the following BUG() report:
>
> [115147.050484] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
> [115147.050524] Call Trace:
> [115147.050533]  worker_thread+0xb4/0x3c0
> [115147.050540]  kthread+0x152/0x170
> [115147.050544]  ret_from_fork+0x35/0x40
>
> Further debugging showed that the worker thread was woken
> before worker_attach_to_pool() finished in create_worker().
>
> Any kthread is supposed to stay in TASK_UNINTERRUPTIBLE sleep
> until it is explicitly woken. But a spurious wakeup might
> break this expectation.
>
> As a result, worker_thread() might read worker->pool before
> it was set in create_worker() by worker_attach_to_pool().
> Also manage_workers() might want to create yet another worker
> before worker->pool->nr_workers is updated. It is a kind of
> a chicken & egg problem.
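To make the race in the quoted report concrete, here is a rough userspace analogy. It is only a sketch: it uses POSIX threads, not kernel APIs, and every name in it (struct worker, struct pool, the attached flag, create_worker(), worker_thread()) is a hypothetical stand-in rather than the actual workqueue code. The new thread waits in a loop on an explicit "attached" flag, so a wakeup that arrives before the creator has finished setup is harmless.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct pool {
	int nr_workers;
};

struct worker {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int attached;		/* set by the creator once setup is complete */
	struct pool *pool;	/* NULL until the creator attaches the worker */
	int seen_nr_workers;	/* what the worker observed after waking up */
};

static void *worker_thread(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	/*
	 * Re-check the predicate after every wakeup. A stray or spurious
	 * wakeup just sends us around the loop again.
	 */
	while (!w->attached)
		pthread_cond_wait(&w->cond, &w->lock);

	/* Only now is it safe to dereference w->pool. */
	assert(w->pool != NULL);
	w->seen_nr_workers = w->pool->nr_workers;
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Returns 0 on success, -1 if the thread could not be created. */
static int create_worker(struct worker *w, struct pool *p)
{
	pthread_t tid;

	w->attached = 0;
	w->pool = NULL;
	w->seen_nr_workers = -1;
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);

	if (pthread_create(&tid, NULL, worker_thread, w) != 0)
		return -1;

	/*
	 * Simulate an unexpected wakeup before setup is done. Whether or
	 * not the worker is already waiting, the predicate loop keeps it
	 * from touching w->pool early.
	 */
	pthread_mutex_lock(&w->lock);
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);

	/* Finish setup, then send the real wakeup. */
	pthread_mutex_lock(&w->lock);
	p->nr_workers++;
	w->pool = p;
	w->attached = 1;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->lock);

	pthread_join(tid, NULL);
	pthread_cond_destroy(&w->cond);
	pthread_mutex_destroy(&w->lock);
	return 0;
}
```

Calling create_worker(&w, &p) with a zero-initialized struct pool p should leave both p.nr_workers and w.seen_nr_workers at 1. If the while loop were replaced by a single unconditional pthread_cond_wait(), the early broadcast could let the worker run with w->pool still NULL, which is the userspace counterpart of the crash above.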

tl;dr is that the worker creation code expects a newly created worker
kthread to sit tight until the creator finishes setting up stuff and
sends the initial wakeup. However, something, which wasn't identified
in the report (Petr, it'd be great if you can find out who did the
wakeup), wakes up the new kthread before the creation path is done
with init, which causes the new kthread to try to deref a NULL pointer.

Petr fixed the problem by adding an extra handshake step so that the
new kthread explicitly waits for the creation path, which is fine, but
the picture isn't making sense to me.

* Are spurious wakeups allowed? The way that we do set_current_state()
  in every iteration in wait_event() seems to suggest that we expect
  someone to spuriously flip task state to RUNNING.

* However, if we're to expect spurious wakeups for anybody anytime,
  why does a newly created kthread bother with
  schedule_preempt_disabled() in kernel/kthread.c::kthread() at all?
  It can't guarantee anything and all it does is mask subtle bugs.

What am I missing here?

Thanks.

-- 
tejun