From: Sasha Levin
To: "linux-kernel@vger.kernel.org", "stable@vger.kernel.org"
CC: Tom Hromatka, John Stultz, Sasha Levin
Subject: [PATCH AUTOSEL for 4.9 059/219] sysrq: Reset the watchdog timers while displaying high-resolution timers
Date: Sat, 3 Mar 2018 22:28:30 +0000
Message-ID: <20180303222716.26640-59-alexander.levin@microsoft.com>
References: <20180303222716.26640-1-alexander.levin@microsoft.com>
In-Reply-To: <20180303222716.26640-1-alexander.levin@microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Tom Hromatka

[ Upstream commit 0107042768658fea9f5f5a9c00b1c90f5dab6a06 ]

On systems with a large number of CPUs, running sysrq-<q> can cause
watchdog timeouts.  There are two slow sections of code in the sysrq-<q>
path in timer_list.c.

1. print_active_timers() - This function is called by print_cpu() and
   contains a slow goto loop.  On a machine with hundreds of CPUs,
   this loop took approximately 100ms for the first CPU in a NUMA node.
   (Subsequent CPUs in the same node ran much quicker.)  The total time
   to print all of the CPUs is ultimately long enough to trigger the
   soft lockup watchdog.

2. print_tickdevice() - This function outputs a large amount of textual
   information.  This function also took approximately 100ms per CPU.

Since sysrq-<q> is not a performance critical path, there should be no
harm in touching the nmi watchdog in both slow sections above.  Touching
it in just one location was insufficient on systems with hundreds of
CPUs as occasional timeouts were still observed during testing.

This issue was observed on an Oracle T7 machine with 128 CPUs, but I
anticipate it may affect other systems with similarly large numbers of
CPUs.

Signed-off-by: Tom Hromatka
Reviewed-by: Rob Gardner
Signed-off-by: John Stultz
Signed-off-by: Sasha Levin
---
 kernel/time/timer_list.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index ba7d8b288bb3..ef4f16e81283 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include <linux/nmi.h>
 
 #include
 
@@ -96,6 +97,9 @@ print_active_timers(struct seq_file *m, struct hrtimer_clock_base *base,
 
 next_one:
 	i = 0;
+
+	touch_nmi_watchdog();
+
 	raw_spin_lock_irqsave(&base->cpu_base->lock, flags);
 
 	curr = timerqueue_getnext(&base->active);
@@ -207,6 +211,8 @@ print_tickdevice(struct seq_file *m, struct tick_device *td, int cpu)
 {
 	struct clock_event_device *dev = td->evtdev;
 
+	touch_nmi_watchdog();
+
 	SEQ_printf(m, "Tick Device: mode: %d\n", td->mode);
 	if (cpu < 0)
 		SEQ_printf(m, "Broadcast device\n");
-- 
2.14.1
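
For readers who want to see why touching the watchdog in only one of the
two slow sections might not be enough, here is a rough user-space model
of the reasoning in the commit message.  It is only a sketch: it is not
kernel code, every name in it (sim_watchdog, sim_touch, sim_work,
sim_dump) is hypothetical, the clock is simulated, and the 10000 ms
threshold used in the note below is an illustrative assumption rather
than a value taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct sim_watchdog {
	unsigned long now_ms;        /* simulated clock */
	unsigned long last_touch_ms; /* time of the most recent touch */
	unsigned long threshold_ms;  /* longer silence counts as a lockup */
	bool fired;                  /* set once a timeout is detected */
};

/* Model of "touching" the watchdog: restart its timeout window. */
static void sim_touch(struct sim_watchdog *wd)
{
	wd->last_touch_ms = wd->now_ms;
}

/* Spend cost_ms of simulated time, then let the watchdog check itself. */
static void sim_work(struct sim_watchdog *wd, unsigned long cost_ms)
{
	wd->now_ms += cost_ms;
	if (wd->now_ms - wd->last_touch_ms > wd->threshold_ms)
		wd->fired = true;
}

/*
 * Model a dump that walks all CPUs twice, once per slow section, with
 * each CPU costing per_cpu_ms in each section.  The two flags choose
 * which section touches the watchdog before doing its per-CPU work.
 * Returns true if the simulated watchdog fired at any point.
 */
static bool sim_dump(size_t ncpus, unsigned long per_cpu_ms,
		     unsigned long threshold_ms,
		     bool touch_first, bool touch_second)
{
	struct sim_watchdog wd = { 0, 0, threshold_ms, false };
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {	/* first slow section */
		if (touch_first)
			sim_touch(&wd);
		sim_work(&wd, per_cpu_ms);
	}

	for (cpu = 0; cpu < ncpus; cpu++) {	/* second slow section */
		if (touch_second)
			sim_touch(&wd);
		sim_work(&wd, per_cpu_ms);
	}

	return wd.fired;
}
```

Calling sim_dump(128, 100, 10000, true, true) models touching in both
sections and reports no timeout, while passing false for either flag
leaves one section running long enough without a touch for the
simulated watchdog to fire.  The real kernel timings are less uniform
than this model, so treat it as an illustration of the idea only.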