Received: by 2002:a05:7412:2a8c:b0:e2:908c:2ebd with SMTP id u12csp1993906rdh; Tue, 26 Sep 2023 09:10:11 -0700 (PDT) X-Google-Smtp-Source: AGHT+IH+g6mh3BX1Dj/4iouvJFjzkUKnIMyb/6/rmHGWvqcJt597Dxc52Z7KVhgFH94RNMFEJLBH X-Received: by 2002:a05:6a20:12d6:b0:14c:c393:692 with SMTP id v22-20020a056a2012d600b0014cc3930692mr11423367pzg.7.1695744611199; Tue, 26 Sep 2023 09:10:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695744611; cv=none; d=google.com; s=arc-20160816; b=ydsRP764Gr6VfbQ1pR0JTVKmoAsiCxqORR2LObEdWTVNTlP/ZTXG46p4Ly6BOJUfzd xGZXhxqNcJsZmr/YWJcOKhPtoMuz/5TRYDeLJPzpG9SmMm9PKoRm/upWgcsfjvZ5/oiw q6zziM/+qPKmNz3AKreUvojAvX5et5PA7nAlXRQG572SgRIcHS9ixst1gktzEkIrPmZv 716A1VZ3DD0UAbBs+n4uAoSvJv8hkAJZHI9z+ABZBfxrea2Kx/BXfoQo2Iq3M6gaMETx Bb4gzaRqDFtCvxzSgaatg4M/MkJrgMghBYOa2zeaL/kz5nTkSw5AiFieHuwxlhwlbHdc Jpig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=st5BNw7K4yGiN/VpqqgBm7geCu8TMuOpnD9++GzYwQ8=; fh=3oUSWWYQ0rMHaNqDlm8j09K7pkfJE062n+5jIRWafQ4=; b=NdR1fad5JxiS9Vn75kSV0w2Vw28dDMzNtmfs3fc7xGgaC+B5nu/E93BGl5rBUz4uhE Im79VFQVeeQ1H5KCwzX+cdGXGUWdKu1q80eyIsBnhbuPII8ptvY2Q73I29CSMJKnnF8M caiilr9rHTYpw9X85tc1wXuyzg5H6CMLKwIH7zogxrmHh3FxOtoKiGU5ZrrFXrdX3etH XW0ycNHhuoZTE6Z/+NLjM+ggKXE97h8ZFzVoT1qEkjXgtwNT50iZleVsebTPPW7DhJyX EoqFNMyp+6yJsdeuLVO7HKNmP1fafNAn420lXWj+0P87WFwNR/o6zHIhignK84rryDvo xINg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=oD0SwkL5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Return-Path: Received: from howler.vger.email (howler.vger.email. [23.128.96.34]) by mx.google.com with ESMTPS id j70-20020a638049000000b005777535a67bsi13179759pgd.746.2023.09.26.09.10.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Sep 2023 09:10:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) client-ip=23.128.96.34; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=oD0SwkL5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=suse.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id 84A928370D9D; Tue, 26 Sep 2023 04:34:46 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232828AbjIZLep (ORCPT + 99 others); Tue, 26 Sep 2023 07:34:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54352 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233360AbjIZLek (ORCPT ); Tue, 26 Sep 2023 07:34:40 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE82E10E for ; Tue, 26 Sep 2023 04:34:32 -0700 (PDT) Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 68C7D21857; Tue, 26 Sep 2023 11:34:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1695728071; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=st5BNw7K4yGiN/VpqqgBm7geCu8TMuOpnD9++GzYwQ8=; b=oD0SwkL5UOwOgmmF8SrCH01KlQ3IutFtBUlRuVLk4hABjPOKHNYdpsLtaUM4MuXm9a4tZS iTGN1uqAuU6BEMrYcdHvcQdTcr2MT1Ze2iHq9noteUMKz9LY7g7oj9cpD5woAgvQE+Vq8/ 6igvtc3e4j67c1tMLrefOSVKwWIpxkk= Received: from suse.cz (pmladek.tcp.ovpn2.prg.suse.de [10.100.208.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id EB0462C142; Tue, 26 Sep 2023 11:34:30 +0000 (UTC) Date: Tue, 26 Sep 2023 13:34:30 +0200 From: Petr Mladek To: John Ogness Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org Subject: Re: [PATCH printk v2 07/11] printk: nbcon: Wire up nbcon into console_flush_all() Message-ID: References: <20230919230856.661435-1-john.ogness@linutronix.de> <20230919230856.661435-8-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230919230856.661435-8-john.ogness@linutronix.de> X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Tue, 26 Sep 2023 04:34:46 -0700 (PDT) On Wed 2023-09-20 01:14:52, John Ogness wrote: > In atomic printing sections, the atomic nbcon consoles have > their own behavior of allowing all urgent messages to be > stored into the ringbuffer and then flushing afterwards. > > However, the nbcon consoles must also emit messages when not > within atomic printing sections. For this, the existing > console_flush_all() function can be used. I am not sure if I get it correctly. My understanding of the above two paragraphs is that urgent messages will be buffered and flushed later while normal messages would be flushed immediately. This is exactly the opposite from what I would expect. My expectations are that printk() would try to flush also nbcon consoles during early boot when the kthreads are not available or in the emergency and panic contexts. Of course, it might fail when the console has another owner and it has the same or higher priority or when it is in an unsafe state. > Provide nbcon_console_emit_next_record(), which acts as the > nbcon variant of console_emit_next_record(). Call this variant > within console_flush_all() for nbcon consoles. + nbcon_console_emit_next_record() + nbcon_atomic_emit_one() + nbcon_emit_next_record() + con->write_atomic() It interleaves API with and without "atomic" in the name. Also I guess that nbcon_emit_next_record() will be used in the printk thread in "non-atomic" mode. But I might be wrong. The more I think about it the more it looks that the term "atomic" creates more harm (confusion) than good. > Note that when in an atomic printing section, > nbcon_console_emit_next_record() does nothing. This is because > atomic printing sections will handle the nbcon flushing when > exiting the outermost atomic printing section. > > Signed-off-by: John Ogness > --- > kernel/printk/internal.h | 2 ++ > kernel/printk/nbcon.c | 46 ++++++++++++++++++++++++++++++++++++++++ > kernel/printk/printk.c | 26 +++++++++++++++-------- > 3 files changed, 65 insertions(+), 9 deletions(-) > > diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h > index 6780911fa8f2..a3c6b5ce80e4 100644 > --- a/kernel/printk/internal.h > +++ b/kernel/printk/internal.h > @@ -77,6 +77,7 @@ void nbcon_seq_force(struct console *con, u64 seq); > bool nbcon_alloc(struct console *con); > void nbcon_init(struct console *con); > void nbcon_free(struct console *con); > +bool nbcon_console_emit_next_record(struct console *con); > > /* > * Check if the given console is currently capable and allowed to print > @@ -131,6 +132,7 @@ static inline void nbcon_seq_force(struct console *con, u64 seq) { } > static inline bool nbcon_alloc(struct console *con) { return false; } > static inline void nbcon_init(struct console *con) { } > static inline void nbcon_free(struct console *con) { } > +static bool nbcon_console_emit_next_record(struct console *con) { return false; } > > static inline bool console_is_usable(struct console *con, short flags) { return false; } > > diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c > index 82e6a1678363..de473a1003d8 100644 > --- a/kernel/printk/nbcon.c > +++ b/kernel/printk/nbcon.c > @@ -1017,6 +1017,52 @@ static bool nbcon_atomic_emit_one(struct nbcon_write_context *wctxt) > return prb_read_valid(prb, ctxt->seq, NULL); > } > > +/** > + * nbcon_console_emit_next_record - Print one record for an nbcon console > + * in atomic mode I am completely lost now. The API does not have "atomic" in the name but it is supposed to be called in the atomic conext whatever it means. > + * @con: The console to print on > + * > + * Return: True if a record could be printed, otherwise false. > + * > + * This function is meant to be called by console_flush_all() to atomically > + * print records on nbcon consoles. Essentially it is the nbcon version of > + * console_emit_next_record(). > + * > + * This function also returns false if the current CPU is in an elevated > + * atomic priority state in order to allow the CPU to get all of the > + * emergency messages into the ringbuffer first. > + */ > +bool nbcon_console_emit_next_record(struct console *con) > +{ > + struct nbcon_cpu_state *cpu_state; > + bool progress = false; > + > + migrate_disable(); > + > + cpu_state = nbcon_get_cpu_state(); > + > + /* > + * Atomic printing from console_flush_all() only occurs if this > + * CPU is not in an elevated atomic priority state. If it is, the > + * atomic printing will occur when this CPU exits that state. This > + * allows a set of emergency messages to be completely stored in > + * the ringbuffer before this CPU begins flushing. > + */ > + if (cpu_state->prio <= NBCON_PRIO_NORMAL) { It does something only when the CPU in "default normal state" or "explicitely forced normal state" . Where the "explicitely forced normal" probably happens after nbcon_atomic_enter(NBCON_PRIO_NORMAL). Do we want anyone call this? I would expect that the explicit context with direct flush would be used only for emergency and panic messages. > + struct nbcon_write_context wctxt = { }; > + struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt); > + > + ctxt->console = con; > + ctxt->prio = NBCON_PRIO_NORMAL; > + > + progress = nbcon_atomic_emit_one(&wctxt); > + } > + > + migrate_enable(); > + > + return progress; > +} My proposal: 1. Add the API for entering emergency context and getting the context. I have already mentioned this in reply to 4rt patch. Here it is: /* Allow to enter printk context with emergency prio */ #define PRINTK_STATE_EMERGENCY_MASK 0x000000ff #define PRINTK_STATE_EMERGENCY_OFFSET 0x00000001 /* Allow unsafe takeover */ #define PRINTK_STATE_ALLOW_UNSAFE_TAKEOVER_BIT 0x00000100 void nbcon_emergency_enter(&printk_state) { *printk_state = current->printk_state; WARN_ON_ONCE((*printk_state & PRINTK_STATE_EMERGENCY_MASK) == PRINTK_STATE_EMERGENCY_MASK); current->printk_state = *printk_state + PRINTK_STATE_EMERGENCY_OFFSET; } void nbcon_emergency_exit(printk_state) { WARN_ON_ONCE(!(current->printk_state & PRINTK_STATE_EMERGENCY_MASK)) current->printk_state = printk_state; } enum nbcon_prio nbcon_get_default_prio(void) { if (panic_cpu == raw_smp_processor_id() return NBCON_PANIC_PRIO; /* Does current always exist? What about fork? */ if (current && (current->printk_state && PRINTK_STATE_EMERGENCY_MASK)) return NBCON_PRIO_EMERGENCY; return NBCON_PRIO_NORMAL; } /* * Extend this by adding a bit which would allow to enable the unsafe * takeover. */ bool nbcon_is_allowed_unsafe_takeover(void) { bool allow_unsafe_takeover; allow_unsafe_takeover = (current->printk_state & PRINTK_STATE_ALLOW_UNSAFE_TAKEOVER_BIT); if (allow_unsafe_takeover) if (WARN_ON_ONCE(raw_smp_processor_id() != panic_cpu)) allow_unsafe_takeover - false; return allow_hostile_takeover; } 2. Have a function for emitting a single line in the given write context and with already acquired console lock. Simply rename the existing: nbcon_emit_next_record() -> nbcon_wctxt_emit_next_record_acquired() or -> nbcon_wctxt_emit_next_record_locked() 3. Add the following APIs /** * nbcon_wctxt_emit_next_record - try to emit next record using the * given write context. * * @wctxt: An initialized write context struct to use * for this context * * Returns false if the given console could not print a record or there are * no more records to print, otherwise true. */ bool nbcon_wctxt_emit_next_record(struct nbcon_write_context *wctxt) { struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt); if (!nbcon_context_try_acquire(ctxt)) return false; if (!nbcon_wctxt_emit_next_record_acquired(wctxt)) return false; nbcon_context_release(ctxt); return prb_read_valid(prb, ctxt->seq, NULL); } /** * nbcon_emit_next_record_no_wait - try to emit next record on the given console * It will not wait when the console is locked. * @con: The console to print on * * Return: True if a record could be printed, otherwise false. */ bool nbcon_emit_next_record(struct console *con) { struct nbcon_write_context wctxt = { }; enum nbcon_prio; bool allow_unnsafe_takeover; prio = nbcon_get_default_prio(); allow_unsafe_takeover = nbcon_allowed_unsafe_takeover(); ctxt->console = con; ctxt->prio = prio; return nbcon_wctxt_emit_next_record(wctxt); } /* Add to the existing console_flush_all() */ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handover) { [...] do { any_progress = false; cookie = console_srcu_read_lock(); for_each_console_srcu(con) { short flags = console_srcu_read_flags(con); u64 printk_seq; bool progress; if (!console_is_usable(con, flags)) continue; any_usable = true; + if (flags & CON_NBCON) { + progress = nbcon_emit_next_record_no_wait(con); + printk_seq = nbcon_seq_read(con); } else { progress = console_emit_next_record(con, handover, cookie); [...] } /* * Like console_flush_all() but flushing only nbcon consoles. * * It might be used in console_flush_on_panic() * before flushing also legacy consoles. */ void nbcon_flush_all(void) { struct nbcon_write_context wctxt = { }; struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt); struct console *con; bool any_progress; enum nbcon_prio; int cookie; prio = nbcon_get_default_prio(); allow_unsafe_takeover = nbcon_is_allowed_unsafe_takeover(); do { any_progress = false; cookie = console_srcu_read_lock(); for_each_console_srcu(con) { short flags = console_srcu_read_flags(con); bool progress; if (!(flags & CON_NBCON)) continue; if (!console_is_usable(con, flags)) continue; memset(ctxt, 0, sizeof(*ctxt)); ctxt->console = con; ctxt->spinwait_max_us = 2000; ctxt->prio = prio; ctxt->allow_unsafe_takeover = allow_unsafe_takeover; progress = nbcon_wctxt_emit_next_record(&wctxt); if (!progress) continue; any_progress = true; } console_srcu_read_unlock(cookie); } while (any_progress); } void nbcon_flush_on_panic(void) { if (WARN_ON_ONCE(raw_smp_processor_id() != panic_cpu)) return; current->printk_state |= PRINTK_STATE_ALLOW_UNSAFE_TAKEOVER_BIT; nbcon_flush_all(); /* * Maybe we should not clear it. This should be called at * the very end of panic. If we allowed an unsafe takeover once * then we might also keep it allowed. * * Then the additional explicit flushes would not be needed. * I mean the one after "Rebooting in %d seconds.." or * after "---[ end Kernel panic - not syncing.." * * Also it would allow to see messages printed in * emergency_restart(). * * By other words. The unsafe flush either worked or not. * It can't be worse when trying more unsafe takeovers. */ /* current->printk_state &= ~PRINTK_STATE_ALLOW_UNSAFE_TAKEOVER_BIT; */ } IMHO, this would simplify the API and logic a lot. Note that all the complexity with different priorities and unsafe flushes is already handled by nbcon_context_try_acquire(). I do not see any reason to add yet another logic above. Also I would prefer to rename con->atomic_write() -> con->nbcon_write() and get rid of the "atomic" word completely. I know that it was mentioned during the original session with Linus. But it is just a name. The behavior is more important. And I think that the word "atomic" adds quite some confusion and even complexity in this implementation. Does it make any sense, please? Best Regards, Petr