Date: Tue, 12 Jul 2022 09:25:32 -0700
Subject: Re: [PATCH v4 4/4] pseries/mobility: set NMI watchdog factor during LPM
To: Laurent Dufour, mpe@ellerman.id.au, npiggin@gmail.com, christophe.leroy@csgroup.eu, wim@linux-watchdog.org, linux@roeck-us.net, nathanl@linux.ibm.com
Cc: haren@linux.vnet.ibm.com, hch@infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-watchdog@vger.kernel.org
References: <20220712143202.23144-1-ldufour@linux.ibm.com> <20220712143202.23144-5-ldufour@linux.ibm.com>
From: Randy Dunlap
In-Reply-To: <20220712143202.23144-5-ldufour@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi--

On 7/12/22 07:32, Laurent Dufour wrote:
> During a LPM, while the memory transfer is in progress on the arrival side,
> some latencies is generated when accessing not yet transferred pages on the

                are

> arrival side. Thus, the NMI watchdog may be triggered too frequently, which
> increases the risk to hit a NMI interrupt in a bad place in the kernel,

                            an NMI

> leading to a kernel panic.
>
> Disabling the Hard Lockup Watchdog until the memory transfer could be a too
> strong work around, some users would want this timeout to be eventually
> triggered if the system is hanging even during LPM.
>
> Introduce a new sysctl variable nmi_watchdog_factor. It allows to apply
> a factor to the NMI watchdog timeout during a LPM. Just before the CPU are

                                              an LPM.
                                                      the CPU is

> stopped for the switchover sequence, the NMI watchdog timer is set to
> watchdog_tresh + factor%

  watchdog_thresh

>
> A value of 0 has no effect. The default value is 200, meaning that the NMI
> watchdog is set to 30s during LPM (based on a 10s watchdog_tresh value).

                                                    watchdog_thresh

> Once the memory transfer is achieved, the factor is reset to 0.
>
> Setting this value to a high number is like disabling the NMI watchdog
> during a LPM.

         an LPM.

>
> Reviewed-by: Nicholas Piggin
> Signed-off-by: Laurent Dufour
> ---
>  Documentation/admin-guide/sysctl/kernel.rst | 12 ++++++
>  arch/powerpc/platforms/pseries/mobility.c   | 43 +++++++++++++++++++++
>  2 files changed, 55 insertions(+)
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index ddccd1077462..0bb0b7f27e96 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -592,6 +592,18 @@ to the guest kernel command line (see
>  Documentation/admin-guide/kernel-parameters.rst).
>

This entire block should be in kernel-parameters.txt, not .rst,
and it should be formatted like everything else in the .txt file.

>
> +nmi_watchdog_factor (PPC only)
> +==================================
> +
> +Factor apply to to the NMI watchdog timeout (only when ``nmi_watchdog`` is

   Factor to apply to the NMI

> +set to 1). This factor represents the percentage added to
> +``watchdog_thresh`` when calculating the NMI watchdog timeout during a

                                                                 during an

> +LPM. The soft lockup timeout is not impacted.
> +
> +A value of 0 means no change. The default value is 200 meaning the NMI
> +watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10).
> +
> +
>  numa_balancing
>  ==============
>

-- 
~Randy
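
For reference, the timeout arithmetic described in the quoted changelog
("watchdog_thresh + factor%") can be sketched as below. This is only an
illustration of the formula as worded there: the helper name
lpm_nmi_timeout_secs() and the integer rounding are assumptions, not code
taken from the patch.

```c
/*
 * Illustrative sketch only. lpm_nmi_timeout_secs() is a hypothetical
 * helper that mirrors the changelog wording; it is not a function from
 * the patch or from the kernel.
 */
unsigned int lpm_nmi_timeout_secs(unsigned int watchdog_thresh,
                                  unsigned int factor_pct)
{
    /* factor_pct is the percentage of watchdog_thresh added on top of it. */
    return watchdog_thresh + (watchdog_thresh * factor_pct) / 100;
}
```

With watchdog_thresh = 10 and the default factor of 200 this gives 30,
matching the 30s quoted above, and a factor of 0 leaves the timeout at 10.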