Received: by 2002:a05:6358:c692:b0:131:369:b2a3 with SMTP id fe18csp5888528rwb; Tue, 1 Aug 2023 09:12:09 -0700 (PDT) X-Google-Smtp-Source: APBJJlGrAJ4TSsmGs9ifBJCrzYV/QDiZfRo/7GbveA5l20uhOTULEBD5D1sXCDyeVi6sefGu3PtD X-Received: by 2002:a05:6a20:12cb:b0:13e:427e:ff18 with SMTP id v11-20020a056a2012cb00b0013e427eff18mr5051162pzg.62.1690906329558; Tue, 01 Aug 2023 09:12:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690906329; cv=none; d=google.com; s=arc-20160816; b=h5VF6vmMGGz0+KI0QeiVD/bGPkGZQCgJ3dMed8aD0yey+BwkWbFwNgQJxPmEVgqnqx l/tELGisA8uhczhEDHZEiTsUm2oqFIG4mk6ItqhYjfKYqWG25DNfrTdAy8SZ+/eBoW+F RC1RYxKm9o3/IlgxCWQnwioTRRHrn50J2JBZdokX0jfEqPmPeNzYmF8EQ38Q8Deuzdav 0QR8ErN2vIGqDQ07cRdXfDOKD1F+jUMBDI7Gd2iwFhNSPpOsfRTGcQZNCNavc75UWt2l Zz9oy2Bzbbsk1NOpDX6V3GtANpva2NpGtC0NETrqxv8uRbPpEZyplLzgDPCY640Yay/3 Jt9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=N7CFtV5ddM82lM2QlcRQgSFRjK14XjJ23RmURcDGvlM=; fh=uEUmYYc6C5/pWlHrvsRI/JhzvJiTA7ApEvpZdQnbfhU=; b=ENRJ/PjpLu2PLgi4e8fcHF8oxUztj6CkQAjFXzhKDxB8Hmi8NsGMfzGNsD03v7rxxx p1XhmLsRGJLmPvEzocCRD+kPoxjGEJBcUhKNDJ4XUgkYl6hZDQ6biCpDZOZFC5Ht0p4U KF+9tAgqI7mkJr8yDdfunh6hmkCXpHeWIv7nj+zVU9yRicX9MI7Xq3CiUBNsiwCFS+nV yxOx4irtd5rSGbmKsEcmunFgFP6QNe86kMQQVc1a80DeDqP9fz2iMkZUyOXRcpqILyP3 9/FvOk9jwKPDYeKdlqmUYgVA8gJ0JzPp2Ac7qqEjLlDf3HUldE26yiDSK6B7gr+mgZtz hjvw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=Zyj0DvYX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id cm25-20020a056a00339900b006871527e5cdsi7031154pfb.345.2023.08.01.09.11.56; Tue, 01 Aug 2023 09:12:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=Zyj0DvYX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234009AbjHAP0u (ORCPT + 99 others); Tue, 1 Aug 2023 11:26:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233978AbjHAP0t (ORCPT ); Tue, 1 Aug 2023 11:26:49 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF0771FDC for ; Tue, 1 Aug 2023 08:26:47 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 684E021D58; Tue, 1 Aug 2023 15:26:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1690903606; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=N7CFtV5ddM82lM2QlcRQgSFRjK14XjJ23RmURcDGvlM=; b=Zyj0DvYXZgQWODc6h8cfHjOtdKp7bbmTjxynSyzmQKzXLhbRNUyQTSpiLxY7uUrdBGx04T RrywyeKiG/5kSKT/ONVAV/ELrGLiki9Um0vVePeJRUi8xAA7rrPQ69qwYElXWbEFeHxbAd G0o5uaICQyOq1yC/GYgwev0BGh6s6ig= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 5951B139BD; Tue, 1 Aug 2023 15:26:46 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id g0QpFTYkyWTdBAAAMHmgww (envelope-from ); Tue, 01 Aug 2023 15:26:46 +0000 Date: Tue, 1 Aug 2023 17:26:46 +0200 From: Michal Hocko To: Doug Anderson Cc: Andrew Morton , Petr Mladek , kernel test robot , Lecopzer Chen , Pingfan Liu , linux-kernel@vger.kernel.org Subject: Re: [PATCH] watchdog/hardlockup: Avoid large stack frames in watchdog_hardlockup_check() Message-ID: References: <20230731091754.1.I501ab68cb926ee33a7c87e063d207abf09b9943c@changeid> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 01-08-23 07:16:15, Doug Anderson wrote: > Hi, > > On Tue, Aug 1, 2023 at 5:58 AM Michal Hocko wrote: > > > > On Mon 31-07-23 09:17:59, Douglas Anderson wrote: > > > After commit 77c12fc95980 ("watchdog/hardlockup: add a "cpu" param to > > > watchdog_hardlockup_check()") we started storing a `struct cpumask` on > > > the stack in watchdog_hardlockup_check(). On systems with > > > CONFIG_NR_CPUS set to 8192 this takes up 1K on the stack. That > > > triggers warnings with `CONFIG_FRAME_WARN` set to 1024. > > > > > > Instead of putting this `struct cpumask` on the stack, let's declare > > > it as `static`. This has the downside of taking up 1K of memory all > > > the time on systems with `CONFIG_NR_CPUS` to 8192, but on systems with > > > smaller `CONFIG_NR_CPUS` it's not much emory (with 128 CPUs it's only > > > 16 bytes of memory). Presumably anyone building a system with > > > `CONFIG_NR_CPUS=8192` can afford the extra 1K of memory. > > > > > > NOTE: as part of this change, we no longer check the return value of > > > trigger_single_cpu_backtrace(). While we could do this and only call > > > cpumask_clear_cpu() if trigger_single_cpu_backtrace() didn't fail, > > > that's probably not worth it. There's no reason to believe that > > > trigger_cpumask_backtrace() will succeed at backtracing the CPU when > > > trigger_single_cpu_backtrace() failed. > > > > > > Alternatives considered: > > > - Use kmalloc with GFP_ATOMIC to allocate. I decided against this > > > since relying on kmalloc when the system is hard locked up seems > > > like a bad idea. > > > - Change the arch_trigger_cpumask_backtrace() across all architectures > > > to take an extra parameter to get the needed behavior. This seems > > > like a lot of churn for a small savings. > > > > > > Fixes: 77c12fc95980 ("watchdog/hardlockup: add a "cpu" param to watchdog_hardlockup_check()") > > > Reported-by: kernel test robot > > > Closes: https://lore.kernel.org/r/202307310955.pLZDhpnl-lkp@intel.com > > > Signed-off-by: Douglas Anderson > > > --- > > > > > > kernel/watchdog.c | 14 +++++++------- > > > 1 file changed, 7 insertions(+), 7 deletions(-) > > > > > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > > > index be38276a365f..19db2357969a 100644 > > > --- a/kernel/watchdog.c > > > +++ b/kernel/watchdog.c > > > @@ -151,9 +151,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) > > > */ > > > if (is_hardlockup(cpu)) { > > > unsigned int this_cpu = smp_processor_id(); > > > - struct cpumask backtrace_mask; > > > - > > > - cpumask_copy(&backtrace_mask, cpu_online_mask); > > > > > > /* Only print hardlockups once. */ > > > if (per_cpu(watchdog_hardlockup_warned, cpu)) > > > @@ -167,10 +164,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) > > > show_regs(regs); > > > else > > > dump_stack(); > > > - cpumask_clear_cpu(cpu, &backtrace_mask); > > > } else { > > > - if (trigger_single_cpu_backtrace(cpu)) > > > - cpumask_clear_cpu(cpu, &backtrace_mask); > > > + trigger_single_cpu_backtrace(cpu); > > > } > > > > > > /* > > > @@ -178,8 +173,13 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) > > > * hardlockups generating interleaving traces > > > */ > > > if (sysctl_hardlockup_all_cpu_backtrace && > > > - !test_and_set_bit(0, &watchdog_hardlockup_all_cpu_dumped)) > > > + !test_and_set_bit(0, &watchdog_hardlockup_all_cpu_dumped)) { > > > + static struct cpumask backtrace_mask; > > > + > > > + cpumask_copy(&backtrace_mask, cpu_online_mask); > > > + cpumask_clear_cpu(cpu, &backtrace_mask); > > > trigger_cpumask_backtrace(&backtrace_mask); > > > > This looks rather wasteful to just copy the cpumask over to > > backtrace_mask in nmi_trigger_cpumask_backtrace (which all but sparc > > arches do AFAICS). > > > > Would it be possible to use arch_trigger_cpumask_backtrace(cpu_online_mask, false) > > and special case cpu != this_cpu && sysctl_hardlockup_all_cpu_backtrace? > > So you're saying optimize the case where cpu == this_cpu and then have > a special case (where we still copy) for cpu != this_cpu? I can do > that if that's what people want, but (assuming I understand correctly) > that's making the wrong tradeoff. Specifically, this code runs one > time right as we're crashing and if it takes an extra millisecond to > run it's not a huge deal. It feels better to avoid the special case > and keep the code smaller. > > Let me know if I misunderstood. I meant something like this (untested but it should give an idea what I mean hopefully). Maybe it can be simplified even further. TBH I am not entirely sure why cpu == this_cpu needs this special handling. --- diff --git a/kernel/watchdog.c b/kernel/watchdog.c index be38276a365f..0eedac9f1019 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -151,9 +151,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) */ if (is_hardlockup(cpu)) { unsigned int this_cpu = smp_processor_id(); - struct cpumask backtrace_mask; - - cpumask_copy(&backtrace_mask, cpu_online_mask); + bool dump_all = false; /* Only print hardlockups once. */ if (per_cpu(watchdog_hardlockup_warned, cpu)) @@ -167,10 +165,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) show_regs(regs); else dump_stack(); - cpumask_clear_cpu(cpu, &backtrace_mask); - } else { - if (trigger_single_cpu_backtrace(cpu)) - cpumask_clear_cpu(cpu, &backtrace_mask); } /* @@ -179,7 +173,12 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) */ if (sysctl_hardlockup_all_cpu_backtrace && !test_and_set_bit(0, &watchdog_hardlockup_all_cpu_dumped)) - trigger_cpumask_backtrace(&backtrace_mask); + dump_all = true; + + if (dump_all) + arch_trigger_cpumask_backtrace(cpu_online_mask, cpu != this_cpu); + else if (cpu != this_cpu) + trigger_single_cpu_backtrace(cpu); if (hardlockup_panic) nmi_panic(regs, "Hard LOCKUP"); -- Michal Hocko SUSE Labs