Date: Wed, 4 Aug 2021 14:12:22 +0200
From: Petr Mladek
To: Daniel Thompson
Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
    linux-kernel@vger.kernel.org, Michael Ellerman, Benjamin Herrenschmidt,
    Paul Mackerras, Ingo Molnar, Borislav Petkov, x86@kernel.org,
    "H. Peter Anvin", Jason Wessel, Douglas Anderson, Srikar Dronamraju,
    "Gautham R. Shenoy", Chengyang Fan, Christophe Leroy, Bhaskar Chowdhury,
    Nicholas Piggin, Cédric Le Goater, "Gustavo A. R. Silva", Peter Zijlstra,
    linuxppc-dev@lists.ozlabs.org, kgdb-bugreport@lists.sourceforge.net
Subject: Re: [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock
References: <20210803131301.5588-1-john.ogness@linutronix.de>
    <20210803131301.5588-4-john.ogness@linutronix.de>
    <20210803142558.cz7apumpgijs5y4y@maple.lan>
    <87tuk635rb.fsf@jogness.linutronix.de>
    <20210804113159.lsnoyylifg6v5i35@maple.lan>
In-Reply-To: <20210804113159.lsnoyylifg6v5i35@maple.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 2021-08-04 12:31:59, Daniel Thompson wrote:
> On Tue, Aug 03, 2021 at 05:36:32PM +0206, John Ogness wrote:
> > On 2021-08-03, Daniel Thompson wrote:
> > > On Tue, Aug 03, 2021 at 03:18:54PM +0206, John Ogness wrote:
> > >> kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
> > >> during cpu roundup. This will conflict with the printk cpulock.
> > >
> > > When the full vision is realized what will be the purpose of the printk
> > > cpulock?
> > >
> > > I'm asking largely because its current role is actively unhelpful
> > > w.r.t. kdb. It is possible that cautious use of in_dbg_master() might
> > > be a better (and safer) solution. However it sounds like there is a
> > > larger role planned for the printk cpulock...
> >
> > The printk cpulock is used as a synchronization mechanism for
> > implementing atomic consoles, which need to be able to safely interrupt
> > the console write() activity at any time and immediately continue with
> > their own printing.
> > The ultimate goal is to move all console printing
> > into per-console dedicated kthreads, so the primary function of the
> > printk cpulock is really to immediately _stop_ the CPU/kthread
> > performing write() in order to allow write_atomic() (from any context on
> > any CPU) to safely and reliably take over.
>
> I see.
>
> Is there any mileage in allowing in_dbg_master() to suppress taking
> the console lock?
>
> There's a couple of reasons to worry about the current approach.
>
> The first is that we don't want this code to trigger in the case when
> kgdb is enabled and kdb is not since it is only kdb (a self-hosted
> debugger) that uses the consoles. This case is relatively trivial to
> address since we can rename it kdb_roundup_delay() and alter the way it
> is conditionally compiled.
>
> The second is more of a problem however. kdb will only call into the
> console code from the debug master. By default this is the CPU that
> takes the debug trap so initial prints will work fine. However it is
> possible to switch to a different master (so we can read per-CPU
> registers and things like that). This will result in one of the CPUs
> that did the IPI round up calling into console code and this is unsafe
> in that instance.
>
> There are a couple of tricks we could adopt to work around this but
> given the slightly odd calling context for kdb (all CPUs quiesced, no
> log interleaving possible) it sounds like it would remain safe to
> bypass the lock if in_dbg_master() is true.
>
> Bypassing an inconvenient lock might sound icky but:
>
> 1. If the lock is not owned by any CPU then what kdb will do is safe.
>
> 2. If the lock is owned by any CPU then we have quiesced it anyway
>    and this makes it safe for the owning CPU to share its ownership
>    (since it isn't much different to recursive acquisition on a single
>    CPU)

I am thinking about something like the following:

	void kgdb_roundup_cpus(void)
	{
		/* Become the well-defined owner of the printk_cpu lock first. */
		__printk_cpu_lock();
		/* Then round up the other CPUs as before. */
		__kgdb_roundup_cpus();
	}

where

	__printk_cpu_lock() waits for and takes printk_cpu_lock()
	__kgdb_roundup_cpus() is the original kgdb_roundup_cpus()

The idea is that the caller of kgdb_roundup_cpus() takes the printk_cpu
lock, so the owner is well defined. As a result, no other CPU can take
the printk_cpu lock as long as it is owned by the CPU that did the
roundup.

But as you say, kgdb will make sure that everything is serialized at
this stage. So the original raw_printk_cpu_lock_irqsave() might just
disable IRQs when called under the debugger.

Does it make any sense?

I have to say that it is a bit hairy. But it looks slightly better
than the delayed/repeated IPI proposed by this patch.

Best Regards,
Petr
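
For illustration only, the ownership idea above can be modeled in user space. This is a minimal sketch under loud assumptions: every sim_* name is hypothetical, none of it is the kernel's printk cpulock or kgdb code, and a single try-lock attempt stands in for the place where real code would wait. It only shows the intended invariant: the CPU doing the roundup owns the lock, other CPUs cannot take it, and console code treats the lock as a no-op while the debugger has everything serialized.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SIM_NO_CPU (-1)

/* Hypothetical model of a CPU-reentrant lock (NOT the kernel implementation). */
struct sim_cpu_lock {
	atomic_int owner;	/* id of the owning CPU, or SIM_NO_CPU */
	int nested;		/* recursion depth on the owning CPU */
};

static struct sim_cpu_lock sim_lock = { SIM_NO_CPU, 0 };
static bool sim_dbg_active;	/* assumed flag: true while CPUs are rounded up */

/* One acquisition attempt for @cpu; a real lock would spin until it succeeds. */
static bool sim_cpu_trylock(struct sim_cpu_lock *l, int cpu)
{
	int expected = SIM_NO_CPU;

	if (atomic_load(&l->owner) == cpu) {
		l->nested++;	/* recursive acquisition by the owner */
		return true;
	}
	return atomic_compare_exchange_strong(&l->owner, &expected, cpu);
}

static void sim_cpu_unlock(struct sim_cpu_lock *l, int cpu)
{
	assert(atomic_load(&l->owner) == cpu);
	if (l->nested > 0)
		l->nested--;
	else
		atomic_store(&l->owner, SIM_NO_CPU);
}

/* Models the proposed kgdb_roundup_cpus(): take the lock, then round up. */
static bool sim_roundup(struct sim_cpu_lock *l, int cpu)
{
	if (!sim_cpu_trylock(l, cpu))
		return false;
	sim_dbg_active = true;
	return true;
}

/* Models leaving the debugger on the CPU that did the roundup. */
static void sim_resume(struct sim_cpu_lock *l, int cpu)
{
	sim_dbg_active = false;
	sim_cpu_unlock(l, cpu);
}

/* Models the lock as seen by console code: a no-op under the debugger. */
static bool sim_console_trylock(struct sim_cpu_lock *l, int cpu)
{
	if (sim_dbg_active)
		return true;	/* already serialized by the debugger */
	return sim_cpu_trylock(l, cpu);
}

static void sim_console_unlock(struct sim_cpu_lock *l, int cpu)
{
	if (!sim_dbg_active)
		sim_cpu_unlock(l, cpu);
}
```

In this model a switched debug master (any CPU other than the one that did the roundup) can still reach the console path, because sim_console_trylock() never touches the owner field while sim_dbg_active is set.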