Received: by 2002:a05:6a10:c604:0:0:0:0 with SMTP id y4csp826980pxt; Thu, 5 Aug 2021 12:38:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzpBAC3JiZrlZJ9rIgDQAUbLH2Oy3psk6z2GAOfAp9MteqvqJYLBFP+G1oOUuzdFAaCigA2 X-Received: by 2002:a05:6402:35c7:: with SMTP id z7mr8551512edc.85.1628192337888; Thu, 05 Aug 2021 12:38:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1628192337; cv=none; d=google.com; s=arc-20160816; b=Q4qclwdVF5D7UzOqDoVaKmxvgWgM1iwlSEu6phGTsGauOI0LlL0/QJMX6ZW7AEp27C PwnxT6ZecxAgKHTDWSCoWlyQa1LASwYRP//URsZu83aSB0IByo2Dns/2CmbLMCaHG2Kp FFxgSHqubgKa87a4XpAaLPVQbdsa27lK9+us7hFtNQhPVNikWKCyRAh5vaXg6yTIlDbw LcmpUedMyygfzMKv+grr4dqD7x4i86UOq3x680JbBVk+jvzTdGnJ5ZXU2ar9n+gagLL2 ShkoLlh5590cG/0OHw2r4UKS+F0RqxrQnhxpncX8JlaEpgvqv5dFy5JK1AqpX/fUYLLK FUrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=hbSKxhFL75AVaZrCZ/aIHYq28wd3Dn4cdIO5Rcoxzmw=; b=dCdKqxhUhe9hi6b//ViArEQEiLepMaLUKjqf7Hj6VVO5BaVub+cXrCQukh0ZQXR9c4 ZuObHTU0mn7sBmz0UM1tMhdRvI935Cwh/a2D/IP4YB6i683r6z7LYpnal//1CDCGJj1L dJOHEMo+rfP+xoNj6hBMeOC/DZfM1Wfg4KaFlMBPGX/8Tsr6LYE1WMCqP08chW/kyx3e gRzRIbsOaK18lC5SIXL7rCyUp5LLgYPkfh6VXn9OcIkUSvQd4euroQjyl+RAX+BB2nOv fNCRmmZsj6QeaLuP/aNMGSc/vnWs2gZTxJBF0+jSyksEAW7yfTsJgdh7/ETEE1qKIjrT 3bPg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=qoAe+iPY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=NONE dis=NONE) header.from=suse.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o20si6909464ejm.682.2021.08.05.12.38.30; Thu, 05 Aug 2021 12:38:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=qoAe+iPY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=NONE dis=NONE) header.from=suse.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238328AbhHERLt (ORCPT + 99 others); Thu, 5 Aug 2021 13:11:49 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:48390 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238633AbhHERLp (ORCPT ); Thu, 5 Aug 2021 13:11:45 -0400 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 82F1F1FDC9; Thu, 5 Aug 2021 17:11:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1628183490; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=hbSKxhFL75AVaZrCZ/aIHYq28wd3Dn4cdIO5Rcoxzmw=; b=qoAe+iPYymbW4ChAM2aO9a3bdN9Cfk5zmqN52MscYMYdGCbTBMTj1+xjqHcfHk2O3gBfl0 TlI20oFmnygpLuJlFlSJId7LUuC2t2BQwiqepV2t2JAsgsnf5jZLaCqNK3UQ3CzP34BZsd Fw9V3bFjGopoodBK6umxMZGlHCIQgdk= Received: from suse.cz (unknown [10.100.224.162]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 5FC2AA49F6; Thu, 5 Aug 2021 17:11:30 +0000 (UTC) Date: Thu, 5 Aug 2021 19:11:28 +0200 From: Petr Mladek To: John Ogness Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: Re: [PATCH printk v1 08/10] printk: introduce kernel sync mode Message-ID: References: <20210803131301.5588-1-john.ogness@linutronix.de> <20210803131301.5588-9-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210803131301.5588-9-john.ogness@linutronix.de> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 2021-08-03 15:18:59, John Ogness wrote: > Introduce "sync mode", which means that all printk calls will > synchronously write to the console. Once activated, this mode is > never deactivated. It is used when the kernel is about to end > (such as panic, oops, shutdown, reboot). > > Sync mode can only be activated if atomic consoles are available. > > In sync mode: > > - only atomic consoles (write_atomic() callback) will print > - printing occurs within vprintk_store() instead of console_unlock() > > CONSOLE_LOG_MAX is moved to printk.h to support the per-console > buffer used in sync mode. > > diff --git a/include/linux/console.h b/include/linux/console.h > index 2f11b604e487..eda9b96e3fb6 100644 > --- a/include/linux/console.h > +++ b/include/linux/console.h > @@ -151,6 +151,9 @@ struct console { > short flags; > short index; > int cflag; > +#if defined(CONFIG_PRINTK) && defined(CONFIG_HAVE_ATOMIC_CONSOLE) > + char sync_buf[CONSOLE_LOG_MAX]; Could it be allocated in register_console()? It is needed only when sync_write() callback is defined... > +#endif > void *data; > struct console *next; > }; > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > +static bool print_sync(struct console *con, u64 *seq) > +{ > + struct printk_info info; > + struct printk_record r; > + size_t text_len; > + > + prb_rec_init_rd(&r, &info, &con->sync_buf[0], sizeof(con->sync_buf)); > + > + if (!prb_read_valid(prb, *seq, &r)) > + return false; > + It should check suppress_message_printing(). > + text_len = record_print_text(&r, console_msg_format & MSG_FORMAT_SYSLOG, printk_time); > + > + con->write_atomic(con, &con->sync_buf[0], text_len); > + > + *seq = r.info->seq; > + > + touch_softlockup_watchdog_sync(); > + clocksource_touch_watchdog(); > + rcu_cpu_stall_reset(); > + touch_nmi_watchdog(); > + > + if (text_len) > + printk_delay(r.info->level); > + > + return true; > +} > + > +static void print_sync_to(struct console *con, u64 seq) > +{ > + u64 printk_seq; > + > + while (!__printk_cpu_trylock()) > + __printk_wait_on_cpu_lock(); > + > + for (;;) { > + printk_seq = read_console_seq(); > + if (printk_seq > seq) > + break; > + if (!print_sync(con, &printk_seq)) > + break; > +#ifdef CONFIG_PRINTK_NMI > + if (in_nmi()) { > + latched_seq_write(&console_sync_nmi_seq, printk_seq + 1); > + continue; > + } > +#endif > + latched_seq_write(&console_sync_seq, printk_seq + 1); > + } > + > + __printk_cpu_unlock(); > +} > + > +static void call_sync_console_drivers(u64 seq) > +{ > + struct console *con; > + > + for_each_console(con) { > + if (!(con->flags & CON_ENABLED)) > + continue; > + if (!con->write_atomic) > + continue; > + print_sync_to(con, seq); > + } > +} > +#else > + > +#define read_console_seq() latched_seq_read_nolock(&console_seq) > +#define in_sync_mode() false > +#define enable_sync_mode() > +#define call_sync_console_drivers(seq) ((void)seq) > + > +#endif /* CONFIG_HAVE_ATOMIC_CONSOLE */ > + > /* > * Special console_lock variants that help to reduce the risk of soft-lockups. > * They allow to pass console_lock to another printk() call using a busy wait. > @@ -2084,6 +2231,8 @@ static void call_console_drivers(const char *ext_text, size_t ext_len, > if (!cpu_online(smp_processor_id()) && > !(con->flags & CON_ANYTIME)) > continue; > + if (in_sync_mode()) > + continue; IMHO, this causes that console_unlock() will iterate over all pending messsages without calling console. But it will still increment console_seq. As a result, the messages will be skipped also by print_sync_to() because read_console_seq() will return the incremented value. I think that we need to break the iteration in console_unlock(). Or do I miss something? > if (con->flags & CON_EXTENDED) > con->write(con, ext_text, ext_len); > else { > @@ -2251,6 +2400,7 @@ int vprintk_store(int facility, int level, > const u32 caller_id = printk_caller_id(); > struct prb_reserved_entry e; > enum printk_info_flags flags = 0; > + bool final_commit = false; > struct printk_record r; > unsigned long irqflags; > u16 trunc_msg_len = 0; > @@ -2261,6 +2411,7 @@ int vprintk_store(int facility, int level, > u16 text_len; > int ret = 0; > u64 ts_nsec; > + u64 seq; > > /* > * Since the duration of printk() can vary depending on the message > @@ -2299,6 +2450,7 @@ int vprintk_store(int facility, int level, > if (flags & LOG_CONT) { > prb_rec_init_wr(&r, reserve_size); > if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { > + seq = r.info->seq; > text_len = printk_sprint(&r.text_buf[r.info->text_len], reserve_size, > facility, &flags, fmt, args); > r.info->text_len += text_len; > @@ -2306,6 +2458,7 @@ int vprintk_store(int facility, int level, > if (flags & LOG_NEWLINE) { > r.info->flags |= LOG_NEWLINE; > prb_final_commit(&e); > + final_commit = true; > } else { > prb_commit(&e); > } > @@ -2330,6 +2483,8 @@ int vprintk_store(int facility, int level, > goto out; > } > > + seq = r.info->seq; > + > /* fill message */ > text_len = printk_sprint(&r.text_buf[0], reserve_size, facility, &flags, fmt, args); > if (trunc_msg_len) > @@ -2344,13 +2499,19 @@ int vprintk_store(int facility, int level, > memcpy(&r.info->dev_info, dev_info, sizeof(r.info->dev_info)); > > /* A message without a trailing newline can be continued. */ > - if (!(flags & LOG_NEWLINE)) > + if (!(flags & LOG_NEWLINE)) { > prb_commit(&e); > - else > + } else { > prb_final_commit(&e); > + final_commit = true; > + } > > ret = text_len + trunc_msg_len; > out: > + /* only the kernel may perform synchronous printing */ > + if (in_sync_mode() && facility == 0 && final_commit) > + call_sync_console_drivers(seq); Is there any reason why this is called from vprintk_emit()? I guess that you wanted to call it before releasing IRQ. But is it really necessary? call_sync_console_drivers(seq) reads the message again via the seq number anyway. I have to say that the new code makes the printk() code/api much more twisted. It is a combination of the naming scheme and design. The original code path is: + printk() + vprintk_emit() + vprintk_store() + console_trylock() + console_unlock() + prb_read_valid() + record_print_text() + call_console_drivers() + con->write() The new code path is: + printk() + vprintk_emit() + vprintk_store() + call_sync_console_drivers() + printk_sync_to() + print_sync() + prb_read_valid() + record_print_text() + con->write_atomic() One thing is the ordering of the api names: + printk -> vprintk -> console -> record_print -> call_console -> con -> write vs. + printk -> vprintk -> call_console -> print -> record_print -> con -> write The original patch called console() API from printk() API. The most ugly things were: + console_unlock() flushed the messages to the console. A cleaner API would be: console_lock(); console_flush(); console_unlock(); + record_print() called from console_unlock(). The "print" name name makes it hard to distinguish from the "printk" API. But it does a completely different job: + "printk" API stores the message and call console + "record_print" API converts the message into the console format The new code adds even more twists: + Adds yet another "print" API. It has another meaning than "printk" or "record_print" API: + "printk" API stores the message and call console + "print" API prints the message to the console + "record_print" API converts the message into the console format + call_sync_console_drivers() does similar job as console_unlock() (iterates over all pending messages, read, format, call console). While the original call_console_drivers() only called the console. The logic is somehow inside out. This is why I opened the discussion about the behavior with console kthreads. I think that we might need another synchronous mode also for the early boot, suspend, kexec, panic, for non-atomic consoles. We might need another cycle/solution when there are per-console kthreads. I would prefer to somehow refactor the existing console_unlock() so that the iteration over pending messages and call_console_drivers() might be usable also in the other sync modes, console kthreads, ... Best Regards, Petr PS: I have vacation the following two weeks. I am still going to work tomorrow (Friday) but I am not sure how much time I will have to discuss this. I am afraid that I won't be able to help much before I am back from the vacation.