Subject: Re: scheduler problems in -next (was: Re: [PATCH 6.4 000/227] 6.4.7-rc1 review)
From: Roy Hopkins
To: Peter Zijlstra
Cc: Guenter Roeck, Joel Fernandes, paulmck@kernel.org, Pavel Machek,
 Greg Kroah-Hartman, stable@vger.kernel.org, patches@lists.linux.dev,
 linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org, shuah@kernel.org, patches@kernelci.org,
 lkft-triage@lists.linaro.org, jonathanh@nvidia.com, f.fainelli@gmail.com,
 sudipm.mukherjee@gmail.com, srw@sladewatkins.net, rwarsow@gmx.de,
 conor@kernel.org, rcu@vger.kernel.org, Ingo Molnar
Date: Mon, 31 Jul 2023 17:30:17 +0100
In-Reply-To: <20230731161452.GA40850@hirez.programming.kicks-ass.net>
References: <2cfc68cc-3a2f-4350-a711-ef0c0d8385fd@paulmck-laptop>
 <3da81a5c-700b-8e21-1bde-27dd3a0b8945@roeck-us.net>
 <20230731141934.GK29590@hirez.programming.kicks-ass.net>
 <20230731143954.GB37820@hirez.programming.kicks-ass.net>
 <20230731145232.GM29590@hirez.programming.kicks-ass.net>
 <7ff2a2393d78275b14ff867f3af902b5d4b93ea2.camel@suse.de>
 <20230731161452.GA40850@hirez.programming.kicks-ass.net>

On Mon, 2023-07-31 at 18:14 +0200, Peter Zijlstra wrote:
> Ha!, I was poking around the same thing. My hack below seems to (so far,
> <20 boots) help things.
>
>
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 56c470a489c8..b083b5a30025 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -652,7 +658,11 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp)
>  	t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname);
>  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name))
>  		return;
> -	smp_mb(); /* Ensure others see full kthread. */
> +	for (;;) {
> +		cond_resched();
> +		if (smp_load_acquire(&rtp->kthread_ptr))
> +			break;
> +	}
>  }
>
>  #ifndef CONFIG_TINY_RCU

FWIW, here's my hack which seems to fix it.

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 9b9ce09f8f35..2e76fbfff9c6 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -52,6 +52,7 @@ struct rcu_tasks_percpu {
  * @cbs_gbl_lock: Lock protecting callback list.
  * @tasks_gp_mutex: Mutex protecting grace period, needed during mid-boot dead zone.
  * @kthread_ptr: This flavor's grace-period/callback-invocation kthread.
+ * @kthread_started: Flag that indicates whether kthread has been launched.
  * @gp_func: This flavor's grace-period-wait function.
  * @gp_state: Grace period's most recent state transition (debugging).
  * @gp_sleep: Per-grace-period sleep to prevent CPU-bound looping.
@@ -92,6 +93,7 @@ struct rcu_tasks {
 	unsigned long n_ipis;
 	unsigned long n_ipis_fails;
 	struct task_struct *kthread_ptr;
+	int kthread_started;
 	rcu_tasks_gp_func_t gp_func;
 	pregp_func_t pregp_func;
 	pertask_func_t pertask_func;
@@ -582,7 +584,7 @@ static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
 		return;
 
 	// If the grace-period kthread is running, use it.
-	if (READ_ONCE(rtp->kthread_ptr)) {
+	if (READ_ONCE(rtp->kthread_started)) {
 		wait_rcu_gp(rtp->call_func);
 		return;
 	}
@@ -595,6 +597,7 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp)
 	struct task_struct *t;
 
 	t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname);
+	rtp->kthread_started = 1;
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name))
 		return;
 	smp_mb(); /* Ensure others see full kthread. */
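
For illustration only, here is a rough user-space analogue of the two hacks in this thread. It is not kernel code. It assumes that the kthread itself is what stores kthread_ptr once it runs, which is what the quoted wait loop suggests. The quoted hack keeps the spawner waiting until that store is visible, while the hack above has the spawner set a separate flag right after kthread_run(). The names struct worker_state, worker_main(), spawn_and_wait(), spawn_and_flag() and worker_available() are hypothetical and exist only in this sketch, which uses pthreads and C11 atomics in place of kthreads and the kernel's barrier primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct rcu_tasks (illustration only). */
struct worker_state {
	_Atomic(pthread_t *) worker_ptr; /* published by the worker once it runs */
	atomic_int worker_started;       /* set by the spawner after creation */
	pthread_t thread;
};

/* Worker: publishes a handle to itself, as assumed for kthread_ptr. */
static void *worker_main(void *arg)
{
	struct worker_state *ws = arg;

	atomic_store_explicit(&ws->worker_ptr, &ws->thread,
			      memory_order_release);
	return NULL;
}

/* Variant A: the spawner does not return until the worker has published. */
static int spawn_and_wait(struct worker_state *ws)
{
	if (pthread_create(&ws->thread, NULL, worker_main, ws) != 0)
		return -1;
	while (!atomic_load_explicit(&ws->worker_ptr, memory_order_acquire))
		sched_yield(); /* rough analogue of cond_resched() */
	return 0;
}

/* Variant B: the spawner records that the worker was launched. */
static int spawn_and_flag(struct worker_state *ws)
{
	if (pthread_create(&ws->thread, NULL, worker_main, ws) != 0)
		return -1;
	atomic_store_explicit(&ws->worker_started, 1, memory_order_release);
	return 0;
}

/* Reader side for variant B: test the flag rather than the pointer. */
static int worker_available(struct worker_state *ws)
{
	return atomic_load_explicit(&ws->worker_started,
				    memory_order_acquire);
}
```

In variant A the pointer is guaranteed to be set when spawn_and_wait() returns. In variant B the flag can be set before the worker has run at all, so readers that test the flag may see the worker as available earlier than readers that test the pointer.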