From: Sasha Levin
To: "linux-kernel@vger.kernel.org", "stable@vger.kernel.org"
CC: Tom Hromatka, John Stultz, Sasha Levin
Subject: [PATCH AUTOSEL for 4.4 028/115] sysrq: Reset the watchdog timers while displaying high-resolution timers
Date: Sat, 3 Mar 2018 22:31:04 +0000
Message-ID: <20180303223010.27106-28-alexander.levin@microsoft.com>
References: <20180303223010.27106-1-alexander.levin@microsoft.com>
In-Reply-To: <20180303223010.27106-1-alexander.levin@microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Tom Hromatka

[ Upstream commit 0107042768658fea9f5f5a9c00b1c90f5dab6a06 ]

On systems with a large number of CPUs, running sysrq- can cause
watchdog timeouts.  There are two slow sections of code in the sysrq-
path in timer_list.c.

1. print_active_timers() - This function is called by print_cpu() and
   contains a slow goto loop.  On a machine with hundreds of CPUs, this
   loop took approximately 100ms for the first CPU in a NUMA node.
   (Subsequent CPUs in the same node ran much quicker.)  The total time
   to print all of the CPUs is ultimately long enough to trigger the
   soft lockup watchdog.

2. print_tickdevice() - This function outputs a large amount of textual
   information.  This function also took approximately 100ms per CPU.

Since sysrq- is not a performance critical path, there should be no harm
in touching the nmi watchdog in both slow sections above.  Touching it
in just one location was insufficient on systems with hundreds of CPUs
as occasional timeouts were still observed during testing.

This issue was observed on an Oracle T7 machine with 128 CPUs, but I
anticipate it may affect other systems with similarly large numbers of
CPUs.

Signed-off-by: Tom Hromatka
Reviewed-by: Rob Gardner
Signed-off-by: John Stultz
Signed-off-by: Sasha Levin
---
 kernel/time/timer_list.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index ba7d8b288bb3..ef4f16e81283 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -96,6 +97,9 @@ print_active_timers(struct seq_file *m, struct hrtimer_clock_base *base,
 
 next_one:
 	i = 0;
+
+	touch_nmi_watchdog();
+
 	raw_spin_lock_irqsave(&base->cpu_base->lock, flags);
 
 	curr = timerqueue_getnext(&base->active);
@@ -207,6 +211,8 @@ print_tickdevice(struct seq_file *m, struct tick_device *td, int cpu)
 {
 	struct clock_event_device *dev = td->evtdev;
 
+	touch_nmi_watchdog();
+
 	SEQ_printf(m, "Tick Device: mode: %d\n", td->mode);
 	if (cpu < 0)
 		SEQ_printf(m, "Broadcast device\n");
-- 
2.14.1
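
For readers who want to see why touching the watchdog inside the slow
sections matters, here is a rough userspace sketch. It is not kernel code
and not part of the patch. Every name in it (struct sim_watchdog,
sim_touch(), sim_advance(), sim_dump_timers()) is hypothetical, the clock is
simulated, and the 10-second timeout used in the note below is an assumed
value chosen for illustration. In the kernel, touch_nmi_watchdog() is
declared in <linux/nmi.h>. The model only captures the basic idea: a
watchdog fires when too much time passes since it was last reset, so a
128-CPU dump at roughly 100ms per CPU (about 12.8s in total) trips it unless
each iteration resets it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lockup detector; not a kernel structure. */
struct sim_watchdog {
	long now_ms;        /* simulated clock, in milliseconds */
	long last_touch_ms; /* simulated time of the most recent reset */
	long threshold_ms;  /* assumed timeout before the watchdog fires */
	bool fired;         /* set once the timeout has been exceeded */
};

/* Reset the simulated watchdog (plays the role of touch_nmi_watchdog()). */
static void sim_touch(struct sim_watchdog *wd)
{
	wd->last_touch_ms = wd->now_ms;
}

/* Advance simulated time and check whether the watchdog would fire. */
static void sim_advance(struct sim_watchdog *wd, long ms)
{
	wd->now_ms += ms;
	if (wd->now_ms - wd->last_touch_ms > wd->threshold_ms)
		wd->fired = true;
}

/*
 * Simulate dumping timer state for ncpus CPUs, spending per_cpu_ms on each.
 * When touch is true, the watchdog is reset at the top of every iteration,
 * mirroring where the patch places its calls. Returns true if the simulated
 * watchdog fired at any point during the dump.
 */
bool sim_dump_timers(int ncpus, long per_cpu_ms, long threshold_ms, bool touch)
{
	struct sim_watchdog wd = { 0, 0, threshold_ms, false };

	assert(ncpus >= 0 && per_cpu_ms >= 0 && threshold_ms > 0);

	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (touch)
			sim_touch(&wd);
		sim_advance(&wd, per_cpu_ms);
	}
	return wd.fired;
}
```

With the assumed 10000ms timeout, sim_dump_timers(128, 100, 10000, false)
reports that the watchdog fired, while the same call with touch set to true
does not, because no single iteration comes anywhere near the timeout. The
sketch has only one slow section per CPU; it does not model the commit
message's observation that touching in just one of the two real locations
was insufficient.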