2022-08-30 15:49:13

by Daniel Vetter

[permalink] [raw]
Subject: [PATCH] kernel/panic: Drop unblank_screen call

console_unblank() does this too (called in both places right after),
and with a lot more confidence inspiring approach to locking.

Reconstructing this story is very strange:

In b61312d353da ("oops handling: ensure that any oops is flushed to
the mtdoops console") it is claimed that a printk(" "); flushed out
the console buffer, which was removed in e3e8a75d2acf ("[PATCH]
Extract and use wake_up_klogd()"). In todays kernels this is done way
earlier in console_flush_on_panic with some really nasty tricks. I
didn't bother to fully reconstruct this all, least because the call to
bust_spinlock(0); gets moved every few years, depending upon how the
wind blows (or well, who screamed loudest about the various issue each
call site caused).

Before that commit the only calls to console_unblank() where in s390
arch code.

The other side here is the console->unblank callback, which was
introduced in 2.1.31 for the vt driver. Which predates the
console_unblank() function by a lot, which was added (without users)
in 2.4.14.3. So pretty much impossible to guess at any motivation
here. Also afaict the vt driver is the only (and always was the only)
console driver implementing the unblank callback, so no idea why a
call to console_unblank() was added for the mtdooops driver - the
action actually flushing out the console buffers is done from
console_unlock() only.

Note that as prep for the s390 users the locking was adjusted in
2.5.22 (I couldn't figure out how to properly reference the BK commit
from the historical git trees) from a normal semaphore to a trylock.

Note that a copy of the direct unblank_screen() call was added to
panic() in c7c3f05e341a ("panic: avoid deadlocks in re-entrant console
drivers"), which partially inlined the bust_spinlocks(0); call.

Long story short, I have no idea why the direct call to unblank_screen
survived for so long (the infrastructure to do it properly existed for
years), nor why it wasn't removed when the console_unblank() call was
finally added. But it makes a ton more sense to finally do that than
not - it's just better encapsulation to go through the console
functions instead of doing a direct call, so let's dare. Plus it
really does not make much sense to call the only unblank
implementation there is twice, once without, and once with appropriate
locking.

Signed-off-by: Daniel Vetter <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: "Ilpo Järvinen" <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Xuezhi Zhang <[email protected]>
Cc: Yangxi Xiang <[email protected]>
Cc: nick black <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: "Guilherme G. Piccoli" <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: David Gow <[email protected]>
Cc: tangmeng <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Chris Wilson <[email protected]>
---
drivers/tty/vt/vt.c | 3 ++-
include/linux/vt_kern.h | 1 -
kernel/panic.c | 3 ---
lib/bust_spinlocks.c | 3 ---
4 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index a6be32798fad..08498fcf080a 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -154,6 +154,7 @@ static void console_callback(struct work_struct *ignored);
static void con_driver_unregister_callback(struct work_struct *ignored);
static void blank_screen_t(struct timer_list *unused);
static void set_palette(struct vc_data *vc);
+static void unblank_screen(void);

#define vt_get_kmsg_redirect() vt_kmsg_redirect(-1)

@@ -4450,7 +4451,7 @@ EXPORT_SYMBOL(do_unblank_screen);
* call it with 1 as an argument and so force a mode restore... that may kill
* X or at least garbage the screen but would also make the Oops visible...
*/
-void unblank_screen(void)
+static void unblank_screen(void)
{
do_unblank_screen(0);
}
diff --git a/include/linux/vt_kern.h b/include/linux/vt_kern.h
index b5ab452fca5b..c1f5aebef170 100644
--- a/include/linux/vt_kern.h
+++ b/include/linux/vt_kern.h
@@ -30,7 +30,6 @@ struct vc_data *vc_deallocate(unsigned int console);
void reset_palette(struct vc_data *vc);
void do_blank_screen(int entering_gfx);
void do_unblank_screen(int leaving_gfx);
-void unblank_screen(void);
void poke_blanked_console(void);
int con_font_op(struct vc_data *vc, struct console_font_op *op);
int con_set_cmap(unsigned char __user *cmap);
diff --git a/kernel/panic.c b/kernel/panic.c
index 1fc3d98417d1..74a7b55851c4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -330,9 +330,6 @@ void panic(const char *fmt, ...)
if (_crash_kexec_post_notifiers)
__crash_kexec(NULL);

-#ifdef CONFIG_VT
- unblank_screen();
-#endif
console_unblank();

/*
diff --git a/lib/bust_spinlocks.c b/lib/bust_spinlocks.c
index 8be59f84eaea..bfd53972a4d8 100644
--- a/lib/bust_spinlocks.c
+++ b/lib/bust_spinlocks.c
@@ -22,9 +22,6 @@ void bust_spinlocks(int yes)
if (yes) {
++oops_in_progress;
} else {
-#ifdef CONFIG_VT
- unblank_screen();
-#endif
console_unblank();
if (--oops_in_progress == 0)
wake_up_klogd();
--
2.37.2


Subject: Re: [PATCH] kernel/panic: Drop unblank_screen call

On 2022-08-30 16:50:04 [+0200], Daniel Vetter wrote:
> Long story short, I have no idea why the direct call to unblank_screen
> survived for so long (the infrastructure to do it properly existed for
> years), nor why it wasn't removed when the console_unblank() call was
> finally added. But it makes a ton more sense to finally do that than
> not - it's just better encapsulation to go through the console
> functions instead of doing a direct call, so let's dare. Plus it
> really does not make much sense to call the only unblank
> implementation there is twice, once without, and once with appropriate
> locking.

Yup, calling it twice is redundant.
The only difference I see is that the console implementation relies on
CONFIG_VT_CONSOLE while the former relied only on CONFIG_VT. There
should be no console output without CONFIG_VT_CONSOLE so no need to
unblank it.

Acked-by: Sebastian Andrzej Siewior <[email protected]>

Sebastian

2022-09-01 11:51:17

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] kernel/panic: Drop unblank_screen call

On Tue 2022-08-30 16:50:04, Daniel Vetter wrote:
> console_unblank() does this too (called in both places right after),
> and with a lot more confidence inspiring approach to locking.
>
> Reconstructing this story is very strange:
>
> In b61312d353da ("oops handling: ensure that any oops is flushed to
> the mtdoops console") it is claimed that a printk(" "); flushed out
> the console buffer, which was removed in e3e8a75d2acf ("[PATCH]
> Extract and use wake_up_klogd()"). In todays kernels this is done way
> earlier in console_flush_on_panic with some really nasty tricks. I
> didn't bother to fully reconstruct this all, least because the call to
> bust_spinlock(0); gets moved every few years, depending upon how the
> wind blows (or well, who screamed loudest about the various issue each
> call site caused).
>
> Before that commit the only calls to console_unblank() where in s390
> arch code.
>
> The other side here is the console->unblank callback, which was
> introduced in 2.1.31 for the vt driver. Which predates the
> console_unblank() function by a lot, which was added (without users)
> in 2.4.14.3. So pretty much impossible to guess at any motivation
> here. Also afaict the vt driver is the only (and always was the only)
> console driver implementing the unblank callback, so no idea why a
> call to console_unblank() was added for the mtdooops driver - the
> action actually flushing out the console buffers is done from
> console_unlock() only.

My understanding is that mtdoops is not a real console. The commit
4b23aff083649eafa141 ("[MTD] oops and panic message logging to MTD
device") suggests that it was just (mis)using the console
infrastructure.

The commit 2e386e4bac90554887e73d ("mtd: mtdoops: refactor as a
kmsg_dumper") converted it to use the new kmsg_dumper API that
was created for this use case.

So, I would consider all the mtdoops-related changes as a misuse
of the console API.


> Note that as prep for the s390 users the locking was adjusted in
> 2.5.22 (I couldn't figure out how to properly reference the BK commit
> from the historical git trees) from a normal semaphore to a trylock.
>
> Note that a copy of the direct unblank_screen() call was added to
> panic() in c7c3f05e341a ("panic: avoid deadlocks in re-entrant console
> drivers"), which partially inlined the bust_spinlocks(0); call.
>
> Long story short, I have no idea why the direct call to unblank_screen
> survived for so long (the infrastructure to do it properly existed for
> years), nor why it wasn't removed when the console_unblank() call was
> finally added. But it makes a ton more sense to finally do that than
> not - it's just better encapsulation to go through the console
> functions instead of doing a direct call, so let's dare. Plus it
> really does not make much sense to call the only unblank
> implementation there is twice, once without, and once with appropriate
> locking.
>
> Signed-off-by: Daniel Vetter <[email protected]>

Nice analyze. The change makes perfect sense from my POV:

Reviewed-by: Petr Mladek <[email protected]>

Best Regards,
Petr

2022-09-01 20:00:48

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH] kernel/panic: Drop unblank_screen call

On Thu, 1 Sept 2022 at 13:35, Petr Mladek <[email protected]> wrote:
>
> On Tue 2022-08-30 16:50:04, Daniel Vetter wrote:
> > console_unblank() does this too (called in both places right after),
> > and with a lot more confidence inspiring approach to locking.
> >
> > Reconstructing this story is very strange:
> >
> > In b61312d353da ("oops handling: ensure that any oops is flushed to
> > the mtdoops console") it is claimed that a printk(" "); flushed out
> > the console buffer, which was removed in e3e8a75d2acf ("[PATCH]
> > Extract and use wake_up_klogd()"). In todays kernels this is done way
> > earlier in console_flush_on_panic with some really nasty tricks. I
> > didn't bother to fully reconstruct this all, least because the call to
> > bust_spinlock(0); gets moved every few years, depending upon how the
> > wind blows (or well, who screamed loudest about the various issue each
> > call site caused).
> >
> > Before that commit the only calls to console_unblank() where in s390
> > arch code.
> >
> > The other side here is the console->unblank callback, which was
> > introduced in 2.1.31 for the vt driver. Which predates the
> > console_unblank() function by a lot, which was added (without users)
> > in 2.4.14.3. So pretty much impossible to guess at any motivation
> > here. Also afaict the vt driver is the only (and always was the only)
> > console driver implementing the unblank callback, so no idea why a
> > call to console_unblank() was added for the mtdooops driver - the
> > action actually flushing out the console buffers is done from
> > console_unlock() only.
>
> My understanding is that mtdoops is not a real console. The commit
> 4b23aff083649eafa141 ("[MTD] oops and panic message logging to MTD
> device") suggests that it was just (mis)using the console
> infrastructure.
>
> The commit 2e386e4bac90554887e73d ("mtd: mtdoops: refactor as a
> kmsg_dumper") converted it to use the new kmsg_dumper API that
> was created for this use case.
>
> So, I would consider all the mtdoops-related changes as a misuse
> of the console API.

Ah, that's a good piece of information that I didn't figure out.

Greg, if you haven't baked in the patch yet, can you perhaps add the
above information from Petr to the commit message?

Thanks, Daniel

>
>
> > Note that as prep for the s390 users the locking was adjusted in
> > 2.5.22 (I couldn't figure out how to properly reference the BK commit
> > from the historical git trees) from a normal semaphore to a trylock.
> >
> > Note that a copy of the direct unblank_screen() call was added to
> > panic() in c7c3f05e341a ("panic: avoid deadlocks in re-entrant console
> > drivers"), which partially inlined the bust_spinlocks(0); call.
> >
> > Long story short, I have no idea why the direct call to unblank_screen
> > survived for so long (the infrastructure to do it properly existed for
> > years), nor why it wasn't removed when the console_unblank() call was
> > finally added. But it makes a ton more sense to finally do that than
> > not - it's just better encapsulation to go through the console
> > functions instead of doing a direct call, so let's dare. Plus it
> > really does not make much sense to call the only unblank
> > implementation there is twice, once without, and once with appropriate
> > locking.
> >
> > Signed-off-by: Daniel Vetter <[email protected]>
>
> Nice analyze. The change makes perfect sense from my POV:
>
> Reviewed-by: Petr Mladek <[email protected]>
>
> Best Regards,
> Petr



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2022-09-02 14:30:53

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] kernel/panic: Drop unblank_screen call

On Thu, Sep 01, 2022 at 09:26:27PM +0200, Daniel Vetter wrote:
> On Thu, 1 Sept 2022 at 13:35, Petr Mladek <[email protected]> wrote:
> >
> > On Tue 2022-08-30 16:50:04, Daniel Vetter wrote:
> > > console_unblank() does this too (called in both places right after),
> > > and with a lot more confidence inspiring approach to locking.
> > >
> > > Reconstructing this story is very strange:
> > >
> > > In b61312d353da ("oops handling: ensure that any oops is flushed to
> > > the mtdoops console") it is claimed that a printk(" "); flushed out
> > > the console buffer, which was removed in e3e8a75d2acf ("[PATCH]
> > > Extract and use wake_up_klogd()"). In todays kernels this is done way
> > > earlier in console_flush_on_panic with some really nasty tricks. I
> > > didn't bother to fully reconstruct this all, least because the call to
> > > bust_spinlock(0); gets moved every few years, depending upon how the
> > > wind blows (or well, who screamed loudest about the various issue each
> > > call site caused).
> > >
> > > Before that commit the only calls to console_unblank() where in s390
> > > arch code.
> > >
> > > The other side here is the console->unblank callback, which was
> > > introduced in 2.1.31 for the vt driver. Which predates the
> > > console_unblank() function by a lot, which was added (without users)
> > > in 2.4.14.3. So pretty much impossible to guess at any motivation
> > > here. Also afaict the vt driver is the only (and always was the only)
> > > console driver implementing the unblank callback, so no idea why a
> > > call to console_unblank() was added for the mtdooops driver - the
> > > action actually flushing out the console buffers is done from
> > > console_unlock() only.
> >
> > My understanding is that mtdoops is not a real console. The commit
> > 4b23aff083649eafa141 ("[MTD] oops and panic message logging to MTD
> > device") suggests that it was just (mis)using the console
> > infrastructure.
> >
> > The commit 2e386e4bac90554887e73d ("mtd: mtdoops: refactor as a
> > kmsg_dumper") converted it to use the new kmsg_dumper API that
> > was created for this use case.
> >
> > So, I would consider all the mtdoops-related changes as a misuse
> > of the console API.
>
> Ah, that's a good piece of information that I didn't figure out.
>
> Greg, if you haven't baked in the patch yet, can you perhaps add the
> above information from Petr to the commit message?

It's already baked, sorry :(