Message-ID: <2568f0ca-af88-4001-79c4-571a9b6a8fb3@roeck-us.net>
Date: Wed, 2 Aug 2023 08:45:06 -0700
From: Guenter Roeck
To: paulmck@kernel.org, Roy Hopkins
Cc: Peter Zijlstra, Joel Fernandes, Pavel Machek, Greg Kroah-Hartman,
 stable@vger.kernel.org, patches@lists.linux.dev,
 linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org, shuah@kernel.org, patches@kernelci.org,
 lkft-triage@lists.linaro.org, jonathanh@nvidia.com, f.fainelli@gmail.com,
 sudipm.mukherjee@gmail.com, srw@sladewatkins.net, rwarsow@gmx.de,
 conor@kernel.org, rcu@vger.kernel.org, Ingo Molnar
Subject: Re: scheduler problems in -next (was: Re: [PATCH 6.4 000/227] 6.4.7-rc1 review)
In-Reply-To: <063a2eba-6b5e-40bc-afd4-7d26f12762e4@paulmck-laptop>
References: <20230731143954.GB37820@hirez.programming.kicks-ass.net>
 <20230731145232.GM29590@hirez.programming.kicks-ass.net>
 <7ff2a2393d78275b14ff867f3af902b5d4b93ea2.camel@suse.de>
 <20230731161452.GA40850@hirez.programming.kicks-ass.net>
 <20230731211517.GA51835@hirez.programming.kicks-ass.net>
 <8215f037-63e9-4e92-8403-c5431ada9cc9@paulmck-laptop>
 <4f18d78411a5477690640a168e0e5d9f28d1c015.camel@suse.de>
 <063a2eba-6b5e-40bc-afd4-7d26f12762e4@paulmck-laptop>
X-Mailing-List: linux-kernel@vger.kernel.org

On 8/2/23 08:05, Paul E. McKenney wrote:
> On Wed, Aug 02, 2023 at 02:57:56PM +0100, Roy Hopkins wrote:
>> On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
>>> On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
>>>
>>>
>>> Please see below for my preferred fix.  Does this work for you guys?
>>>
>>> Back to figuring out why recent kernels occasionally to blow up all
>>> rcutorture guest OSes...
>>>
>>>                                                         Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
>>> index 7294be62727b..2d5b8385c357 100644
>>> --- a/kernel/rcu/tasks.h
>>> +++ b/kernel/rcu/tasks.h
>>> @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot)
>>>         if (unlikely(midboot)) {
>>>                 needgpcb = 0x2;
>>>         } else {
>>> +               mutex_unlock(&rtp->tasks_gp_mutex);
>>>                 set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
>>>                 rcuwait_wait_event(&rtp->cbs_wait,
>>>                                    (needgpcb = rcu_tasks_need_gpcb(rtp)),
>>>                                    TASK_IDLE);
>>> +               mutex_lock(&rtp->tasks_gp_mutex);
>>>         }
>>>
>>>         if (needgpcb & 0x2) {
>>
>> Your preferred fix looks good to me.
>>
>> With the original code I can quite easily reproduce the problem on my
>> system every 10 reboots or so. With your fix in place the problem no
>> longer occurs.
>
> Very good, thank you!  May I add your Tested-by?
>

FWIW, I am still working on it.
So far I get

[ 8.191589] KTAP version 1
[ 8.191769] # Subtest: kunit_executor_test
[ 8.191972] # module: kunit
[ 8.192012] 1..8
[ 8.197643] ok 1 parse_filter_test
[ 8.201851] ok 2 filter_suites_test
[ 8.206713] ok 3 filter_suites_test_glob_test
[ 8.211806] ok 4 filter_suites_to_empty_test
[ 8.214077] kunit executor: filter operation not found: speed>slow, module!=example
[ 8.217933] # parse_filter_attr_test: ASSERTION FAILED at lib/kunit/executor_test.c:126
[ 8.217933] Expected err == 0, but
[ 8.217933] err == -22 (0xffffffffffffffea)
[ 8.217933]
[ 8.217933] failed to parse filter '(efault)'
[ 8.221266] not ok 5 parse_filter_attr_test
[ 8.224224] kunit executor: filter operation not found: speed>slow
[ 8.225837] # filter_attr_test: ASSERTION FAILED at lib/kunit/executor_test.c:165
[ 8.225837] Expected err == 0, but
[ 8.225837] err == -22 (0xffffffffffffffea)
[ 8.228850] not ok 6 filter_attr_test
[ 8.230942] kunit executor: filter operation not found: module!=dummy
[ 8.232167] # filter_attr_empty_test: ASSERTION FAILED at lib/kunit/executor_test.c:190
[ 8.232167] Expected err == 0, but
[ 8.232167] err == -22 (0xffffffffffffffea)
[ 8.235317] not ok 7 filter_attr_empty_test
[ 8.237065] kunit executor: filter operation not found: speed>slow
[ 8.238796] # filter_attr_skip_test: ASSERTION FAILED at lib/kunit/executor_test.c:209
[ 8.238796] Expected err == 0, but
[ 8.238796] err == -22 (0xffffffffffffffea)
[ 8.241897] not ok 8 filter_attr_skip_test
[ 8.241947] # kunit_executor_test: pass:4 fail:4 skip:0 total:8
[ 8.242144] # Totals: pass:4 fail:4 skip:0 total:8

and it looks like the console no longer works. Most likely this is some
other problem that was introduced while tests were broken. It will take
me some time to track that down.

Guenter
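
For anyone who wants to pull the failing test names out of a console
log like the one above, here is a minimal sketch. It is not part of
KUnit or any kernel tooling; the function name tally_ktap and the
regular expression are made up for illustration, and it assumes each
result line has the form "[timestamp] ok N name" or
"[timestamp] not ok N name", with the timestamp optional.

```python
import re

# Hypothetical helper, not a KUnit API. Matches KTAP result lines such as
# "ok 1 parse_filter_test" or "not ok 5 parse_filter_attr_test", with an
# optional dmesg-style timestamp prefix like "[ 8.197643]".
RESULT_RE = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?(not ok|ok)\s+(\d+)\s+(\S+)")


def tally_ktap(lines):
    """Return (passed, failed) lists of test names found in the given lines."""
    passed, failed = [], []
    for line in lines:
        m = RESULT_RE.match(line.strip())
        if not m:
            continue  # diagnostics, plan lines, and totals are skipped
        status, _number, name = m.groups()
        (passed if status == "ok" else failed).append(name)
    return passed, failed


if __name__ == "__main__":
    sample = [
        "[ 8.197643] ok 1 parse_filter_test",
        "[ 8.214077] kunit executor: filter operation not found: speed>slow, module!=example",
        "[ 8.221266] not ok 5 parse_filter_attr_test",
        "[ 8.241947] # kunit_executor_test: pass:4 fail:4 skip:0 total:8",
    ]
    passed, failed = tally_ktap(sample)
    print("passed:", passed)
    print("failed:", failed)
```

Feeding it the full log should list the four filter_attr tests as
failed. It only looks at result lines, so the "filter operation not
found" diagnostics still have to be read by hand.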