From: "Eric W. Biederman"
To: Linus Torvalds
Cc: Christian Brauner, Tejun Heo, Petr Mladek, Lai Jiangshan, Michal Hocko, Linux Kernel Mailing List, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Andrew Morton, Oleg Nesterov
References: <20220622140853.31383-1-pmladek@suse.com> <874k0863x8.fsf@email.froward.int.ebiederm.org>
Date: Sat, 25 Jun 2022 18:28:01 -0500
In-Reply-To: (Linus Torvalds's message of "Sat, 25 Jun 2022 11:43:15 -0700")
Message-ID: <87pmiw1fy6.fsf@email.froward.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: re.
Spurious wakeup on a newly created kthread
X-Mailing-List: linux-kernel@vger.kernel.org

Linus Torvalds writes:

> On Sat, Jun 25, 2022 at 11:25 AM Linus Torvalds wrote:
>>
>> And that's not at all what the kthread code wants. It wants to set
>> affinity masks, it wants to create a name for the thread, it wants to
>> do all those other things.
>>
>> That code really wants to just do copy_process().
>
> Honestly, I think kernel/kthread.c should be almost rewritten from scratch.
>
> I do not understand why it does all those odd keventd games at all,
> and why kthread_create_info exists in the first place.

I presume you mean kthreadd games?

> Why does kthread_create() not just create the thread directly itself,
> and instead does that odd queue it onto a work function?
>
> Some of that goes back to before the git history, and very little of
> it seems to make any sense. It's as if the code is meant to be able to
> run from interrupt context, but that can't be it: it's literally doing
> a GFP_KERNEL kmalloc, it's doing spin-locks without irq safety etc.
>
> So why is it calling kthreadd_task() to create the thread? Purely for
> some crazy odd "make that the parent" reason?
>
> I dunno. The code is odd, unexplained, looks buggy, and most of the
> reasons are probably entirely historical.

I can explain why kthreadd exists and why it creates the threads.

Very long ago in the context of random userspace processes people would
use kernel_thread to create threads and a helper function that I think
was called something like kernel_daemonize to scrub the userspace bits
off.

It was an unending source of problems as the scrub was never complete
nor correct.

So with the introduction of kthreadd the kernel threads were moved
out of the userspace process tree, and userspace stopped being able to
influence the kernel threads.
AKA instead of doing the equivalent of a suid exec the code started
doing the equivalent of sshing into the local box.

We *need* to preserve that kind of separation.

I want to say that all that is required is that copy_process copies
from kthreadd. Unfortunately that means that it needs to be kthreadd
doing the work, as copy_process always copies from current. It would
take quite a bit of work to untangle that mess.

It does appear possible to write a parallel function to copy_process
that is used only for creating kernel threads, and can streamline itself
because it knows it is creating kernel threads.

Short of that the code needs to keep routing through kthreadd.

Using create_io_thread or a dedicated wrapper around copy_process
certainly looks like it could simplify some of kthread creation.

> I'm adding Christian to this thread too, since I get the feeling that
> it really should be more tightly integrated with copy_process(), and
> that Christian might have comments.
>
> Christian, see some context in the thread here:
>
>   https://lore.kernel.org/all/CAHk-=wiC7rj1o7vTnYUPfD7YxAu09MZiZbahHqvLm9+Cgg1dFw@mail.gmail.com/
>
> for some of this.
>
>           Linus

Eric
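
The point about copy_process always copying from current can be
illustrated with a loose userspace analogy. This is not kernel code, and
it is only a sketch under stated assumptions: Python's threading.Thread
copies its daemon flag from whichever thread constructs it, so that flag
stands in here for "state inherited from the creator". The names
Spawner, create, and demo are hypothetical and exist only for this
illustration. Routing every creation request through one dedicated
spawner thread means each new thread inherits from the spawner, never
from the requester.

```python
import queue
import threading


class Spawner:
    """Hypothetical stand-in for kthreadd: the only thread that creates workers."""

    def __init__(self):
        self._requests = queue.Queue()
        # The spawner is a daemon thread, so every thread it constructs
        # inherits daemon=True regardless of who asked for it.
        self._thread = threading.Thread(
            target=self._loop, name="spawner", daemon=True
        )
        self._thread.start()

    def _loop(self):
        while True:
            target, name, reply = self._requests.get()
            # No daemon argument is passed: threading.Thread copies the flag
            # from the current thread, which here is always the spawner.
            worker = threading.Thread(target=target, name=name)
            worker.start()
            reply.put(worker)

    def create(self, target, name):
        """Queue a creation request and block until the spawner has served it."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((target, name, reply))
        return reply.get()


def demo():
    """Compare direct creation with creation routed through the spawner."""
    results = {}
    spawner = Spawner()

    def noop():
        pass

    def requester():
        # Created directly: inherits the requester's daemon flag (False).
        direct = threading.Thread(target=noop)
        direct.start()
        direct.join()
        results["direct"] = direct.daemon

        # Created via the spawner: inherits the spawner's flag (True).
        routed = spawner.create(noop, "worker")
        routed.join()
        results["routed"] = routed.daemon

    caller = threading.Thread(target=requester, daemon=False)
    caller.start()
    caller.join()
    return results


if __name__ == "__main__":
    print(demo())
```

In this sketch the directly created thread picks up the requester's
flag, while the routed one picks up the spawner's, which is the same
reason the kernel code has to keep routing through kthreadd unless a
creation path exists that does not copy from current.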