Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp569834imw; Wed, 13 Jul 2022 04:08:28 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sGd+fAIZyP6qIOwPN+izMZmZdRER5+oGQjVH9r63DjrEEHOFoy9B+hZmrAyNW4cvGxPnRC X-Received: by 2002:a17:907:7b95:b0:6f4:ff03:981a with SMTP id ne21-20020a1709077b9500b006f4ff03981amr2905177ejc.653.1657710507811; Wed, 13 Jul 2022 04:08:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657710507; cv=none; d=google.com; s=arc-20160816; b=OFCA2+hrVYSvH81dArY7fuVgJLur1CEl0UWvPEYhQi5jsLMV8VHlVUyZleOq0WfWQd ofuPynjoy8v6UCgdH9A2RBgp2E7KV/jmUT+34BvKUeb/s4ZEiSt5Br3oFNoq9u0nA1cz WKMPIe+WA8JSzKtwd98l/tIbxpJL5OvZxE7o1jm5DWtT7be6cdknn8/SmwaUuSkJQ2v5 XZOXn6owDz56/Iqf0uIjexlfrEb24qMHBXmXqjQ4QhpdPSAklZbUwdLgriihjqmH5UU5 OcnVmYDdssvYG1jwVTYg39o1Z54v2mDH2T4WSl3prSAKto1lNisNxeXBCrXC3OF0btGi 1gJw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=vBKz24vsY24wRxgj8ZLIKAAwqiimXAeKjx6naAU6jvI=; b=alZ5txBDpWkpBumvBmZZPUZwVSUKkP5caSLF+RK9jRyF/Kcs8YjpbVs1HY9PlelpZ1 KnSkFOeVroP7QYlYTPZTwWTY+CMCfcbrMr25/katkIQL6kOtEaOp99oYQjSy8J8Lu5+F n3Y5H+bapZZcQPssvGz2Uzs59AMzATGA81/C/YBAcoilvpKa81dAQSJq83BYD4jllA/+ pBskuqpOZaHE8h1MK0Zq4W92o8VYfLvJWbe0NspG7ptfydAiF9ydjnMAasfm32B6ZW68 C+w6KCig6u58nSomKvBlB8QqQwBN8RJZW30M09YlB+/2oRBvoNCcs+pCs2cQMyiKwXId L0DQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=Vmwv5ifg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id dm22-20020a170907949600b0072625d8a873si555412ejc.927.2022.07.13.04.08.02; Wed, 13 Jul 2022 04:08:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=Vmwv5ifg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235703AbiGMK5D (ORCPT + 99 others); Wed, 13 Jul 2022 06:57:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230141AbiGMK46 (ORCPT ); Wed, 13 Jul 2022 06:56:58 -0400 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C768BFF5B3; Wed, 13 Jul 2022 03:56:56 -0700 (PDT) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 26DApDnf002349; Wed, 13 Jul 2022 10:56:27 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=vBKz24vsY24wRxgj8ZLIKAAwqiimXAeKjx6naAU6jvI=; b=Vmwv5ifg/AGy4KQSihoiH7BDEL+DYfSC5AgjmPE5oGYqLeFBWZ5NoBw2deQFC0vvFGyj bdnVA4FisUEhDFpC0zxvYGln+36pYFWbUyVCNHm0d4m4Qg/ZFeH0yIdLkOTH33A0cMIS Q7/K8+WWle01LXpUfh2OwASiLZNwS9BN2FOqoOBwudccEhEZW0O5MKmB24C4aS4c4Ouk bL5cVQ59V0mFCrVtje3aU6XgVkWoVHEaH6xKOZyiVjvCNSpJoN2x9Jmz+hzk0PzTgYXu q/UIVnlCUMK6MUNFcCcHLJb9t0avAF7UqtK480gyHkZjNc8E62EBkdtUgqIwXszX+v9f +w== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3h9vneg29s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 13 Jul 2022 10:56:27 +0000 Received: from m0098404.ppops.net (m0098404.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 26DAtTZ0026656; Wed, 13 Jul 2022 10:56:27 GMT Received: from ppma04fra.de.ibm.com (6a.4a.5195.ip4.static.sl-reverse.com [149.81.74.106]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3h9vneg294-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 13 Jul 2022 10:56:26 +0000 Received: from pps.filterd (ppma04fra.de.ibm.com [127.0.0.1]) by ppma04fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 26DAqBCg024794; Wed, 13 Jul 2022 10:56:24 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma04fra.de.ibm.com with ESMTP id 3h99s78ys2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 13 Jul 2022 10:56:24 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 26DAuMCn15991270 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 13 Jul 2022 10:56:22 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 53DAFAE04D; Wed, 13 Jul 2022 10:56:22 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DD5E2AE045; Wed, 13 Jul 2022 10:56:21 +0000 (GMT) Received: from [9.145.2.121] (unknown [9.145.2.121]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 13 Jul 2022 10:56:21 +0000 (GMT) Message-ID: Date: Wed, 13 Jul 2022 12:56:21 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.0.1 Subject: Re: [PATCH v4 4/4] pseries/mobility: set NMI watchdog factor during LPM Content-Language: fr To: Randy Dunlap , mpe@ellerman.id.au, npiggin@gmail.com, christophe.leroy@csgroup.eu, wim@linux-watchdog.org, linux@roeck-us.net, nathanl@linux.ibm.com Cc: haren@linux.vnet.ibm.com, hch@infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-watchdog@vger.kernel.org References: <20220712143202.23144-1-ldufour@linux.ibm.com> <20220712143202.23144-5-ldufour@linux.ibm.com> From: Laurent Dufour In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: A9dxdZgGAg81nFs8xBByt43Gc7N6zqAz X-Proofpoint-ORIG-GUID: EbrQsNumQJtaBYZCcd5TFFSi_vcfX3sm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-12_14,2022-07-13_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 spamscore=0 malwarescore=0 priorityscore=1501 adultscore=0 clxscore=1011 bulkscore=0 impostorscore=0 mlxscore=0 lowpriorityscore=0 phishscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2206140000 definitions=main-2207130043 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Le 12/07/2022 à 18:25, Randy Dunlap a écrit : > Hi-- > > On 7/12/22 07:32, Laurent Dufour wrote: >> During a LPM, while the memory transfer is in progress on the arrival side, >> some latencies is generated when accessing not yet transferred pages on the > > are > >> arrival side. Thus, the NMI watchdog may be triggered too frequently, which >> increases the risk to hit a NMI interrupt in a bad place in the kernel, > > an NMI > >> leading to a kernel panic. >> >> Disabling the Hard Lockup Watchdog until the memory transfer could be a too >> strong work around, some users would want this timeout to be eventually >> triggered if the system is hanging even during LPM. >> >> Introduce a new sysctl variable nmi_watchdog_factor. It allows to apply >> a factor to the NMI watchdog timeout during a LPM. Just before the CPU are > > an LPM. the CPU is > >> stopped for the switchover sequence, the NMI watchdog timer is set to >> watchdog_tresh + factor% > > watchdog_thresh > >> >> A value of 0 has no effect. The default value is 200, meaning that the NMI >> watchdog is set to 30s during LPM (based on a 10s watchdog_tresh value). > > watchdog_thresh > >> Once the memory transfer is achieved, the factor is reset to 0. >> >> Setting this value to a high number is like disabling the NMI watchdog >> during a LPM. > > an LPM. > >> >> Reviewed-by: Nicholas Piggin >> Signed-off-by: Laurent Dufour >> --- >> Documentation/admin-guide/sysctl/kernel.rst | 12 ++++++ >> arch/powerpc/platforms/pseries/mobility.c | 43 +++++++++++++++++++++ >> 2 files changed, 55 insertions(+) >> >> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst >> index ddccd1077462..0bb0b7f27e96 100644 >> --- a/Documentation/admin-guide/sysctl/kernel.rst >> +++ b/Documentation/admin-guide/sysctl/kernel.rst >> @@ -592,6 +592,18 @@ to the guest kernel command line (see >> Documentation/admin-guide/kernel-parameters.rst). >> > > This entire block should be in kernel-parameters.txt, not .rst, > and it should be formatted like everything else in the .txt file. Thanks for reviewing this patch. I'll apply your requests in the next version. However, regarding the change in kernel-parameters.txt, I'm confused. The newly introduced parameter is only exposed through sysctl. Not as a kernel boot option. In that case, should it be mentioned in kernel-parameters.txt? Documentation/process/4.Coding.rst says: The file :ref:`Documentation/admin-guide/kernel-parameters.rst ` describes all of the kernel's boot-time parameters. Any patch which adds new parameters should add the appropriate entries to this file. And Documentation/process/submit-checklist.rst says: 16) All new kernel boot parameters are documented in ``Documentation/admin-guide/kernel-parameters.rst``. What are the rules about editing .txt or .rst files? >> >> +nmi_watchdog_factor (PPC only) >> +================================== >> + >> +Factor apply to to the NMI watchdog timeout (only when ``nmi_watchdog`` is > > Factor to apply to the NMI > >> +set to 1). This factor represents the percentage added to >> +``watchdog_thresh`` when calculating the NMI watchdog timeout during a > > during an > >> +LPM. The soft lockup timeout is not impacted. >> + >> +A value of 0 means no change. The default value is 200 meaning the NMI >> +watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10). >> + >> + >> numa_balancing >> ============== >> > >