From: lizhe.67@bytedance.com
To: dianders@chromium.org
Cc: akpm@linux-foundation.org, kernelfans@gmail.com, lecopzer.chen@mediatek.com, linux-kernel@vger.kernel.org, lizefan.x@bytedance.com, lizhe.67@bytedance.com, pmladek@suse.com
Subject: Re: [RFC] softlockup: serialized softlockup's log
Date: Wed, 22 Nov 2023 11:53:04 +0800
Message-Id: <20231122035304.57483-1-lizhe.67@bytedance.com>

On Fri, 17 Nov 2023 13:45:21 wrote:
>>
>> From: Li Zhe
>>
>> If multiple CPUs trigger softlockup at the same time, the softlockup's
>> logs will appear staggeredly in dmesg, which will affect the viewing of
>> the logs for developer. Since the code path for outputting softlockup logs
>> is not a kernel hotspot and the performance requirements for the code
>> are not strict, locks are used to serialize the softlockup log output
>> to improve the readability of the logs.
>>
>> Signed-off-by: Li Zhe
>> ---
>> kernel/watchdog.c | 3 +++
>> 1 file changed, 3 insertions(+)
>
>This seems reasonable to me. It might be interesting to talk about in
>your commit message how this interacts with the various options. From
>code inspection, I believe:

Thanks for your advice. I will send a v2 patch with an improved commit
message.

>* If `softlockup_all_cpu_backtrace` then this is a no-op since other
>CPUs will be prevented from running the printing code while one is
>already printing.

Yes, you are right. If `softlockup_all_cpu_backtrace` is set, the
interleaving problem does not occur. We also don't need to worry about
interleaving inside trigger_allbutcpu_cpu_backtrace(), because that
function already serializes the logs.

>* I'm not 100% sure what happens if `softlockup_panic` is set and I
>haven't sat down to test this myself. Will one CPUs panic message
>interleave the other CPUs traces. I guess in the end both CPUs will
>call panic()? Maybe you could experiment and describe the behavior in
>your commit message?

I ran experiments and checked the implementation of panic(). With this
patch applied, I have not been able to reproduce the interleaving
problem. panic() internally serializes its own logs using the variable
'panic_cpu'. In addition, panic() stops the other CPUs before
outputting its logs, so I think the softlockup logs from CPU A cannot
interleave with the panic logs from softlockup CPU B.

>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 5cd6d4e26915..8324ac194d0a 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -448,6 +448,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>  	struct pt_regs *regs = get_irq_regs();
>>  	int duration;
>>  	int softlockup_all_cpu_backtrace = sysctl_softlockup_all_cpu_backtrace;
>> +	static DEFINE_SPINLOCK(watchdog_timer_lock);
>
>I'd be tempted to define this outside the scope of this function. I
>need to dig more, but I'm pretty sure I've seen cases where a soft
>lockup could trigger while I was trying to print traces for a
>hardlockup, so it might be useful to grab the same spinlock in both
>places...

I've tried several times, but unfortunately I haven't been able to
reproduce the problem you mentioned. My concern is that sharing the
lock between the two paths could introduce deadlocks, because
hardlockup detection relies on NMI.
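
To make the deadlock concern concrete, here is a minimal userspace
sketch using C11 atomics. It is not kernel code and not part of the
patch. The names report_lock, report_trylock(), report_unlock() and
hardlockup_report() are hypothetical, and the sketch assumes the
hardlockup NMI can arrive on a CPU that already holds the shared lock
in the softlockup path. In that situation a blocking lock in the NMI
handler would spin forever, so the sketch only tries the lock and
falls back to unserialized output when it is busy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock shared by both report paths (stand-in for a spinlock). */
static atomic_flag report_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool report_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&report_lock,
						  memory_order_acquire);
}

static void report_unlock(void)
{
	atomic_flag_clear_explicit(&report_lock, memory_order_release);
}

/*
 * Stand-in for the hardlockup (NMI) report path. It must never spin on
 * the lock: if the NMI interrupted the lock holder on this CPU, the
 * holder cannot run again until the NMI handler returns.
 * Returns true if the report was serialized, false if it fell back.
 */
static bool hardlockup_report(int cpu)
{
	bool locked = report_trylock();

	printf("hardlockup report on CPU %d (%s)\n", cpu,
	       locked ? "serialized" : "lock busy, not serialized");
	if (locked)
		report_unlock();
	return locked;
}
```

If hardlockup_report() is called while the lock is held, as if the NMI
had interrupted the softlockup path, it returns false instead of
hanging. Once the lock has been released it returns true. Whether a
fallback like this is acceptable for the real watchdog code is an open
question.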