On Thu 2024-02-01 13:12:41, Sreenath Vijayan wrote:
> When terminal is unresponsive, one cannot use dmesg to view printk
> ring buffer messages. Also, syslog services may be disabled,
> to check the messages after a reboot, especially on embedded systems.
> In this scenario, dump the printk ring buffer messages via sysrq
> by pressing sysrq+D.
I would use sysrq-R and say that it replays the kernel log on
consoles.
The word "dump" is ambiguous. People might think that it calls
dmesg dumpers.
Also the messages would be shown on the terminal only when
console_loglevel is set to show all messages. This is done
in __handle_sysrq(). But it is not done in the workqueue
context.
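To illustrate the loglevel point, here is a minimal user-space sketch. It is not kernel code: every model_* name is hypothetical, and the starting loglevel of 4 is an assumption. It only models the usual rule that a message reaches the console when its level is numerically below console_loglevel, so a replay done without raising the level would silently skip most of the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_record {
	int level;		/* 0 (emergency) .. 7 (debug), as in syslog */
	const char *text;
};

static int model_console_loglevel = 4;	/* assumed current setting */

/* A record is shown only when its level is below the loglevel. */
static int model_is_shown(const struct model_record *r)
{
	return r->level < model_console_loglevel;
}

/* Replay the log with the current loglevel; return how many records were shown. */
static size_t model_replay(const struct model_record *log, size_t n)
{
	size_t shown = 0;

	assert(log != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (model_is_shown(&log[i])) {
			printf("<%d>%s\n", log[i].level, log[i].text);
			shown++;
		}
	}
	return shown;
}

/* Replay with the loglevel raised for the duration of the replay only. */
static size_t model_replay_all(const struct model_record *log, size_t n)
{
	int saved = model_console_loglevel;
	size_t shown;

	model_console_loglevel = 8;	/* above debug (7): nothing is filtered */
	shown = model_replay(log, n);
	model_console_loglevel = saved;	/* restore the previous setting */
	return shown;
}
```

In this model, model_replay() corresponds to replaying from a context where nobody has raised the loglevel, while model_replay_all() corresponds to doing the replay while the level is still raised.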
Finally, the commit message should explain why workqueues are used
and what the limitations are. Something like:
<add>
The log is replayed using workqueues. The reason is that it has to
be done in a safe way (in contrast to panic context).
This also means that the sysrq won't have the desired effect
when the system is in such a bad state that workqueues do not
make any progress.
</add>
Another reason might be that we do not want to do it in
an interrupt context. But this reason is questionable.
Many other sysrq commands do complicated work and
print many messages as well.
Another reason is that the function needs to use console_lock()
which can't be called in IRQ context. Maybe we should use
console_trylock() instead.
The function would replay the messages only when console_trylock()
succeeds. Users could repeat the sysrq when it fails.
Idea:
Using console_trylock() actually might be more reliable than
workqueues. console_trylock() might fail repeatedly when:
+ the console_lock() owner is stuck. But workqueues would fail
in this case as well.
+ there is a flood of messages. In this case, replaying
the log would not help much.
Another advantage is that the consoles would be flushed
in sysrq context with the manipulated console_loglevel.
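The trylock idea can be sketched in user space as follows. This is only a model under stated assumptions, not the proposed kernel patch: all model_* names are hypothetical, the lock is a C11 atomic_flag standing in for the console lock, and the "replay" is reduced to a counter. What it shows is the control flow: never wait for the lock, replay with the raised loglevel while holding it, and report failure so that the user can repeat the sysrq.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
static atomic_flag model_console_busy = ATOMIC_FLAG_INIT;
static int model_console_loglevel = 4;	/* assumed current setting */
static int model_replays_done;		/* stand-in for the real replay */

/* Take the lock only if it is free; never wait for the owner. */
static bool model_console_trylock(void)
{
	return !atomic_flag_test_and_set(&model_console_busy);
}

static void model_console_unlock(void)
{
	atomic_flag_clear(&model_console_busy);
}

/*
 * Returns true when the log was replayed and false when the lock was
 * busy, in which case the caller is expected to retry later.
 */
static bool model_sysrq_replay(void)
{
	int saved;

	if (!model_console_trylock())
		return false;

	saved = model_console_loglevel;
	model_console_loglevel = 8;	/* show everything during the replay */
	model_replays_done++;		/* the replay and flush would go here */
	model_console_loglevel = saved;

	model_console_unlock();
	return true;
}
```

The design choice being modelled is that a failed attempt costs nothing and leaves no deferred work behind, so the only cases where it keeps failing are the ones listed above.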
Best Regards,
Petr
On Wed, Feb 07, 2024 at 04:09:34PM +0100, Petr Mladek wrote:
> On Thu 2024-02-01 13:12:41, Sreenath Vijayan wrote:
> > When terminal is unresponsive, one cannot use dmesg to view printk
> > ring buffer messages. Also, syslog services may be disabled,
> > to check the messages after a reboot, especially on embedded systems.
> > In this scenario, dump the printk ring buffer messages via sysrq
> > by pressing sysrq+D.
>
> I would use sysrq-R and say that it replays the kernel log on
> consoles.
>
> The word "dump" is ambiguous. People might think that it calls
> dmesg dumpers.
>
> Also the messages would be shown on the terminal only when
> console_loglevel is set to show all messages. This is done
> in __handle_sysrq(). But it is not done in the workqueue
> context.
>
> Finally, the commit message should explain why workqueues are used
> and what the limitations are. Something like:
>
> <add>
> The log is replayed using workqueues. The reason is that it has to
> be done in a safe way (in contrast to panic context).
>
> This also means that the sysrq won't have the desired effect
> when the system is in such a bad state that workqueues do not
> make any progress.
> </add>
>
> Another reason might be that we do not want to do it in
> an interrupt context. But this reason is questionable.
> Many other sysrq commands do complicated work and
> print many messages as well.
>
> Another reason is that the function needs to use console_lock()
> which can't be called in IRQ context. Maybe we should use
> console_trylock() instead.
>
> The function would replay the messages only when console_trylock()
> succeeds. Users could repeat the sysrq when it fails.
>
> Idea:
>
> Using console_trylock() actually might be more reliable than
> workqueues. console_trylock() might fail repeatedly when:
>
> + the console_lock() owner is stuck. But workqueues would fail
> in this case as well.
>
> + there is a flood of messages. In this case, replaying
> the log would not help much.
>
> Another advantage is that the consoles would be flushed
> in sysrq context with the manipulated console_loglevel.
I just remembered all the rt-changes coming down the pipe for
consoles/printk, is this going to mess with that?
And thinking about it, the workqueue is a worry: sysrq is usually only
hit when you have a lockup, and a workqueue isn't going to work well,
if at all, in that situation.
So if this option fails when people need it the most, perhaps it's not
worth adding? When else would people want to use it?
thanks,
greg k-h
On 2024-02-08, Greg KH wrote:
> I just remembered all the rt-changes coming down the pipe for
> consoles/printk, is this going to mess with that?
It will not mess with the changes because we will continue to support
the legacy consoles anyway.
> So when this option fails when people need it the most, perhaps it's not
> worth adding? When else would people want to use it?
The feature could be massively improved once the rt-changes (atomic
consoles) become available.
Petr also brought up valid points about this feature (such as the
loglevel) that should be considered. We should clarify what exactly we
want this feature to do. The actual implementation is the easy part.
John