From: Joel Fernandes
Subject: Re: [PATCH] torture: Fix hang during kthread shutdown phase
Date: Sun, 1 Jan 2023 09:16:22 -0500
To: Zhouyi Zhou
Cc: linux-kernel@vger.kernel.org, Paul McKenney, Frederic Weisbecker, stable@vger.kernel.org, Davidlohr Bueso, Josh Triplett
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jan 1, 2023, at 8:02 AM, Zhouyi Zhou wrote:
>
> On Sun, Jan 1, 2023 at 2:16 PM Joel Fernandes (Google) wrote:
>>
>> During shutdown of rcutorture, the shutdown thread in
>> rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop
>> to FULLSTOP_RMMOD. This is enough to cause the rcutorture threads for
>> readers and fakewriters to break out of their main while loop and start
>> shutting down.
>>
>> Once out of their main loop, they call torture_kthread_stopping(),
>> which in turn waits for kthread_stop() to be called. However,
>> rcu_torture_cleanup() has not yet called kthread_stop() on those
>> threads; it does that a bit later. Before it gets a chance to do so,
>> torture_kthread_stopping() calls schedule_timeout_uninterruptible(1)
>> in a tight loop. Tracing confirmed that this makes the timer softirq
>> constantly execute timer callbacks without ever returning to the
>> softirq exit path, so it is essentially "locked up". If the softirq
>> preempts the shutdown thread, kthread_stop() may never be called.
>>
>> This commit improves the situation dramatically by increasing the
>> timeout passed to schedule_timeout_uninterruptible() to 1/20th of a
>> second. This keeps the timer softirq from locking up a CPU, and
>> everything works fine. Testing has shown 100 runs of TREE07 passing
>> reliably, which was not the case before because of RCU stalls.
> On my Dell PowerEdge R720 with two Intel(R) Xeon(R) CPU E5-2660 and 128G memory:
> 1) before this patch:
> 3 of 80 rounds failed with "rcu: INFO: rcu_sched detected stalls on
> CPUs/tasks" [1]
> 2) after this patch:
> all 80 rounds passed
>
> Tested-by: Zhouyi Zhou
>

Thanks! Glad to see your tests look good now.
- Joel

> Thanks
> Zhouyi
>
> [1] http://154.220.3.115/logs/20230101/console.log
>>
>> Cc: Paul McKenney
>> Cc: Frederic Weisbecker
>> Cc: Zhouyi Zhou
>> Cc: # 6.0.x
>> Signed-off-by: Joel Fernandes (Google)
>> ---
>> kernel/torture.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/torture.c b/kernel/torture.c
>> index 29afc62f2bfe..d024f3b7181f 100644
>> --- a/kernel/torture.c
>> +++ b/kernel/torture.c
>> @@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
>> 	VERBOSE_TOROUT_STRING(buf);
>> 	while (!kthread_should_stop()) {
>> 		torture_shutdown_absorb(title);
>> -		schedule_timeout_uninterruptible(1);
>> +		schedule_timeout_uninterruptible(HZ/20);
>> 	}
>> }
>> EXPORT_SYMBOL_GPL(torture_kthread_stopping);
>> --
>> 2.39.0.314.g84b9a713c41-goog
>>