On Wed 2022-04-20 01:52:35, John Ogness wrote:
> Create a kthread for each console to perform console printing. During
> normal operation (@system_state == SYSTEM_RUNNING), the kthread
> printers are responsible for all printing on their respective
> consoles.
>
> During non-normal operation, console printing is done as it has been:
> within the context of the printk caller or within irqwork triggered
> by the printk caller, referred to as direct printing.
>
> Since threaded console printers are responsible for all printing
> during normal operation, this also includes messages generated via
> deferred printk calls. If direct printing is in effect during a
> deferred printk call, the queued irqwork will perform the direct
> printing. To make it clear that this is the only time that the
> irqwork will perform direct printing, rename the flag
> PRINTK_PENDING_OUTPUT to PRINTK_PENDING_DIRECT_OUTPUT.
>
> Threaded console printers synchronize against each other and against
> console lockers by taking the console lock for each message that is
> printed.
>
> Note that the kthread printers do not care about direct printing.
> They will always try to print if new records are available. They can
> be blocked by direct printing, but will be woken again once direct
> printing is finished.
>
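For reference, the per-console printer loop described above boils down to
something like this (the "has record" / "emit record" helpers are
illustrative stand-ins, not the patch's actual interfaces):

static int printk_kthread_func(void *data)
{
	struct console *con = data;

	for (;;) {
		/* Sleep until a new record shows up or we are asked to stop. */
		wait_event_interruptible(log_wait,
				kthread_should_stop() ||
				con_has_next_record(con));

		if (kthread_should_stop())
			break;

		/*
		 * Take the console lock for each message so this kthread
		 * serializes against the other kthread printers and against
		 * console_lock() users doing direct printing.
		 */
		console_lock();
		if (con_has_next_record(con))
			con_emit_next_record(con);
		console_unlock();
	}

	return 0;
}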
> Console unregistration is a bit tricky because the associated
> kthread printer cannot be stopped while the console lock is held.
> A policy is implemented that states: whichever task clears
> con->thread (under the console lock) is responsible for stopping
> the kthread. unregister_console() will clear con->thread while
> the console lock is held and then stop the kthread after releasing
> the console lock.
>
> For consoles that have implemented the exit() callback, the kthread
> is stopped before exit() is called.
>
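Roughly, the unregistration ordering described above looks like this
(a simplified sketch; list handling and error paths are omitted):

int unregister_console(struct console *console)
{
	struct task_struct *thd;

	console_lock();

	/*
	 * Whichever task clears con->thread (under the console lock) is
	 * responsible for stopping the kthread.
	 */
	thd = console->thread;
	console->thread = NULL;

	/* ... unlink @console from the console list ... */

	console_unlock();

	/*
	 * kthread_stop() must not be called under the console lock because
	 * the kthread may be blocked waiting for that very lock.
	 */
	if (thd)
		kthread_stop(thd);

	/* The kthread is already stopped when exit() is called. */
	if (console->exit)
		console->exit(console);

	return 0;
}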
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -390,6 +397,14 @@ void printk_prefer_direct_exit(void)
> 	WARN_ON(atomic_dec_if_positive(&printk_prefer_direct) < 0);
> }
>
> +static inline bool allow_direct_printing(void)
> +{
> +	return (!printk_kthreads_available ||
> +		system_state > SYSTEM_RUNNING ||
> +		oops_in_progress ||
> +		atomic_read(&printk_prefer_direct));
> +}
> +
> DECLARE_WAIT_QUEUE_HEAD(log_wait);
> /* All 3 protected by @syslog_lock. */
> /* the next printk record to read by syslog(READ) or /proc/kmsg */
> @@ -2280,10 +2295,10 @@ asmlinkage int vprintk_emit(int facility, int level,
> 	printed_len = vprintk_store(facility, level, dev_info, fmt, args);
>
> 	/* If called from the scheduler, we can not call up(). */
> -	if (!in_sched) {
> +	if (!in_sched && allow_direct_printing()) {
allow_direct_printing() is racy here. But I think that we could live
with it, see below.
> 		/*
> 		 * The caller may be holding system-critical or
> -		 * timing-sensitive locks. Disable preemption during
> +		 * timing-sensitive locks. Disable preemption during direct
> 		 * printing of all remaining records to all consoles so that
> 		 * this context can return as soon as possible. Hopefully
> 		 * another printk() caller will take over the printing.
[...]
> @@ -3475,10 +3720,14 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
> {
> 	int pending = this_cpu_xchg(printk_pending, 0);
>
> -	if (pending & PRINTK_PENDING_OUTPUT) {
> +	if (pending & PRINTK_PENDING_DIRECT_OUTPUT) {
> +		printk_prefer_direct_enter();
> +
> 		/* If trylock fails, someone else is doing the printing */
> 		if (console_trylock())
> 			console_unlock();
> +
> +		printk_prefer_direct_exit();
> 	}
>
> 	if (pending & PRINTK_PENDING_WAKEUP)
> @@ -3503,10 +3752,11 @@ static void __wake_up_klogd(int val)
> 	 * prepare_to_wait_event(), which is called after ___wait_event() adds
> 	 * the waiter but before it has checked the wait condition.
> 	 *
> -	 * This pairs with devkmsg_read:A and syslog_print:A.
> +	 * This pairs with devkmsg_read:A, syslog_print:A, and
> +	 * printk_kthread_func:A.
> 	 */
> 	if (wq_has_sleeper(&log_wait) || /* LMM(__wake_up_klogd:A) */
> -	    (val & PRINTK_PENDING_OUTPUT)) {
> +	    (val & PRINTK_PENDING_DIRECT_OUTPUT)) {
> 		this_cpu_or(printk_pending, val);
> 		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> 	}
> @@ -3524,7 +3774,16 @@ void defer_console_output(void)
> 	 * New messages may have been added directly to the ringbuffer
> 	 * using vprintk_store(), so wake any waiters as well.
> 	 */
> -	__wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
> +	int val = PRINTK_PENDING_WAKEUP;
> +
> +	/*
> +	 * If console deferring was called with preferred direct printing,
> +	 * make the irqwork perform the direct printing.
> +	 */
> +	if (atomic_read(&printk_prefer_direct))
> +		val |= PRINTK_PENDING_DIRECT_OUTPUT;
We actually need:

	/*
	 * Make sure that someone will handle the messages when direct
	 * printing is allowed. It happens when the kthreads are less
	 * reliable or unusable at all.
	 */
	if (allow_direct_printing())
		val |= PRINTK_PENDING_DIRECT_OUTPUT;
It is racy. But the same race is also in vprintk_emit().

False positive is fine. console_flush_all() will bail out when
the direct printing gets disabled in the meantime.

False negative is worse. But we will still queue PRINTK_PENDING_WAKEUP
that will try to wake up the kthreads that should still be around.

And it was always a problem, even with the console_trylock() approach.
Failure means an expectation that someone else is doing the printing.
It might be either a kthread or the current console_lock owner.
But it is never guaranteed because both might be sleeping.

We do our best by calling pr_flush() or console_flush_on_panic()
in various places. Also PRINTK_PENDING_WAKEUP will always try to wake
up the kthreads.

Anyway, we should document this somewhere. At least in the commit
message.

My dream is Documentation/core-api/printk-design.rst but I do not
want to force you to do it ;-)
> +	__wake_up_klogd(val);
> }
>
> void printk_trigger_flush(void)
Otherwise, it looks good.

Best Regards,
Petr
On 2022-04-20, Petr Mladek <[email protected]> wrote:
> On Wed 2022-04-20 01:52:35, John Ogness wrote:
>> @@ -2280,10 +2295,10 @@ asmlinkage int vprintk_emit(int facility, int level,
>> 	printed_len = vprintk_store(facility, level, dev_info, fmt, args);
>>
>> 	/* If called from the scheduler, we can not call up(). */
>> -	if (!in_sched) {
>> +	if (!in_sched && allow_direct_printing()) {
>
> allow_direct_printing() is racy here. But I think that we could live
> with it, see below.
Well, it is not racy for its intended purpose, which is a context that
does:

	printk_prefer_direct_enter();
	printk();
	printk_prefer_direct_exit();

It is only racy for _other_ contexts that might end up direct
printing. But since those other contexts don't have a preference, I see
no problem with it.
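Concretely, the intended pattern is something like this (the caller below
is just an illustration, not code from the patch):

static void report_critical_condition(void)
{
	printk_prefer_direct_enter();

	/*
	 * For printk() calls issued here, allow_direct_printing() is
	 * guaranteed to see printk_prefer_direct > 0, so this context
	 * always attempts direct printing.
	 */
	pr_emerg("critical condition detected\n");

	printk_prefer_direct_exit();
}

An unrelated printk() that races with the exit above may or may not print
directly, but either way the record is in the ringbuffer and will be
handled by a kthread or another direct printer.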
>> @@ -3524,7 +3774,16 @@ void defer_console_output(void)
>> 	 * New messages may have been added directly to the ringbuffer
>> 	 * using vprintk_store(), so wake any waiters as well.
>> 	 */
>> -	__wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
>> +	int val = PRINTK_PENDING_WAKEUP;
>> +
>> +	/*
>> +	 * If console deferring was called with preferred direct printing,
>> +	 * make the irqwork perform the direct printing.
>> +	 */
>> +	if (atomic_read(&printk_prefer_direct))
>> +		val |= PRINTK_PENDING_DIRECT_OUTPUT;
>
> We actually need:
>
> 	/*
> 	 * Make sure that someone will handle the messages when direct
> 	 * printing is allowed. It happens when the kthreads are less
> 	 * reliable or unusable at all.
> 	 */
> 	if (allow_direct_printing())
> 		val |= PRINTK_PENDING_DIRECT_OUTPUT;
Agreed. I will update the comments appropriately as well.
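Putting it together, the resulting defer_console_output() would look
roughly like this (the final comment wording may still change):

void defer_console_output(void)
{
	/*
	 * New messages may have been added directly to the ringbuffer
	 * using vprintk_store(), so wake any waiters as well.
	 */
	int val = PRINTK_PENDING_WAKEUP;

	/*
	 * Make sure that someone will handle the messages when direct
	 * printing is allowed. It happens when the kthreads are less
	 * reliable or unusable at all.
	 */
	if (allow_direct_printing())
		val |= PRINTK_PENDING_DIRECT_OUTPUT;

	__wake_up_klogd(val);
}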
> It is racy. But the same race is also in vprintk_emit().
It is not racy for the intended purpose, so I think it is fine.
> False positive is fine. console_flush_all() will bail out when
> the direct printing gets disabled in the meantime.
>
> False negative is worse. But we will still queue PRINTK_PENDING_WAKEUP
> that will try to wake up the kthreads that should still be around.
>
> And it was always a problem, even with the console_trylock() approach.
> Failure means an expectation that someone else is doing the printing.
> It might be either a kthread or the current console_lock owner.
> But it is never guaranteed because both might be sleeping.
By "sleeping" I guess you mean "scheduled out". The console_lock owner
or mutex/atomic_t holder will be within printing code. And if a kthread
sees new records available, it will continue rather than wait.
> We do our best by calling pr_flush() or console_flush_on_panic()
> in various places. Also PRINTK_PENDING_WAKEUP will always try to wake
> up the kthreads.
Yes.
> Anyway, we should document this somewhere. At least in the commit
> message.
>
> My dream is Documentation/core-api/printk-design.rst but I do not
> want to force you to do it ;-)
I would be happy to contribute to such a document.
John