2022-02-09 12:51:39

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH REPOST] irq_poll: Add local_bh_disable() in cpu_dead notifier

On Tue, Feb 08, 2022 at 03:34:05PM +0100, Sebastian Andrzej Siewior wrote:
> __raise_softirq_irqoff() adds a bit to the pending sofirq mask and this
> is it. The softirq won't be handled in a deterministic way but randomly
> when an interrupt fires and handles the softirq in its irq_exit() routine or
> if something randomly checks and handles pending softirqs in the call
> chain before the CPU goes idle.
>
> Add a local_bh_disable/enable() around the IRQ-off section which will
> handle pending softirqs.

And I still haven't seen any good explanation of why this is useful.


Subject: Re: [PATCH REPOST] irq_poll: Add local_bh_disable() in cpu_dead notifier

On 2022-02-08 23:56:34 [-0800], Christoph Hellwig wrote:
> On Tue, Feb 08, 2022 at 03:34:05PM +0100, Sebastian Andrzej Siewior wrote:
> > __raise_softirq_irqoff() adds a bit to the pending sofirq mask and this
> > is it. The softirq won't be handled in a deterministic way but randomly
> > when an interrupt fires and handles the softirq in its irq_exit() routine or
> > if something randomly checks and handles pending softirqs in the call
> > chain before the CPU goes idle.
> >
> > Add a local_bh_disable/enable() around the IRQ-off section which will
> > handle pending softirqs.
>
> And I still haven't seen any good explanation of why this is useful.

You need to handle the pending softirqs. If you don't handle them
immediately or in a deterministic say (like on IRQ exit) then they will
be handled at a random point. If you don't handle them at all, the CPU
will go idle and at least the NO_HZ will complain about pending softirqs
(can_stop_idle_tick()).

You could still argue that the CPU will go down and the there are
latencies involved but…
I want to avoid waking ksoftirqd for that since there is no need to wake
it and the pending work can be done in-context, right away.

Sebastian

2022-02-11 07:20:29

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH REPOST] irq_poll: Add local_bh_disable() in cpu_dead notifier

On Thu, Feb 10, 2022 at 01:33:39PM +0100, Sebastian Andrzej Siewior wrote:
> You need to handle the pending softirqs. If you don't handle them
> immediately or in a deterministic say (like on IRQ exit) then they will
> be handled at a random point.

Yes. Just like regular interrupts.

Subject: Re: [PATCH REPOST] irq_poll: Add local_bh_disable() in cpu_dead notifier

On 2022-02-10 22:34:32 [-0800], Christoph Hellwig wrote:
> On Thu, Feb 10, 2022 at 01:33:39PM +0100, Sebastian Andrzej Siewior wrote:
> > You need to handle the pending softirqs. If you don't handle them
> > immediately or in a deterministic say (like on IRQ exit) then they will
> > be handled at a random point.
>
> Yes. Just like regular interrupts.

With the exception that this one was already handled and should be
handled and not delayed until the next interrupt.
And as I said, on NO_HZ you get a warning about unhandled soft-irqs if
the CPU goes idle.

Sebastian

2022-04-11 09:01:16

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH REPOST] irq_poll: Add local_bh_disable() in cpu_dead notifier

On Thu, Feb 10 2022 at 22:34, Christoph Hellwig wrote:
> On Thu, Feb 10, 2022 at 01:33:39PM +0100, Sebastian Andrzej Siewior wrote:
>> You need to handle the pending softirqs. If you don't handle them
>> immediately or in a deterministic say (like on IRQ exit) then they will
>> be handled at a random point.
>
> Yes. Just like regular interrupts.

But interrupts make sure they are handled. This code does not and as
Sebastian pointed out:

"If you don't handle them at all, the CPU will go idle and at least
the NO_HZ will complain about pending softirqs (can_stop_idle_tick())."

That's clearly a bug, but this should be part of the changelog.

Thanks,

tglx

2022-04-12 17:42:45

by Thomas Gleixner

[permalink] [raw]
Subject: [PATCH V2] lib/irq_poll: Add local_bh_disable() in irq_poll_cpu_dead()

From: Sebastian Andrzej Siewior <[email protected]>

irq_poll_cpu_dead() pulls the blk_cpu_iopoll backlog from the dead CPU and
raises the POLL softirq with __raise_softirq_irqoff() on the CPU it is
running on. That just sets the bit in the pending softirq mask.

This means the handling of the softirq is delayed until the next interrupt
or a local_bh_disable/enable() pair. As a consequence the CPU on which this
code runs can reach idle with the POLL softirq pending, which triggers a
warning in the NOHZ idle code.

Add a local_bh_disable/enable() pair around the interrupts disabled section
in irq_poll_cpu_dead(). local_bh_enable will handle the pending softirq.

[tglx: Massaged changelog and comment]

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
V2: Updated changelog and comment
---
lib/irq_poll.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -188,14 +188,18 @@ EXPORT_SYMBOL(irq_poll_init);
static int irq_poll_cpu_dead(unsigned int cpu)
{
/*
- * If a CPU goes away, splice its entries to the current CPU
- * and trigger a run of the softirq
+ * If a CPU goes away, splice its entries to the current CPU and
+ * set the POLL softirq bit. The local_bh_disable()/enable() pair
+ * ensures that it is handled. Otherwise the current CPU could
+ * reach idle with the POLL softirq pending.
*/
+ local_bh_disable();
local_irq_disable();
list_splice_init(&per_cpu(blk_cpu_iopoll, cpu),
this_cpu_ptr(&blk_cpu_iopoll));
__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
local_irq_enable();
+ local_bh_enable();

return 0;
}

Subject: [tip: core/core] lib/irq_poll: Prevent softirq pending leak in irq_poll_cpu_dead()

The following commit has been merged into the core/core branch of tip:

Commit-ID: 75d8cce128c516fe6cf4b8683e8fe1a59e919902
Gitweb: https://git.kernel.org/tip/75d8cce128c516fe6cf4b8683e8fe1a59e919902
Author: Sebastian Andrzej Siewior <[email protected]>
AuthorDate: Sun, 10 Apr 2022 14:49:36 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 13 Apr 2022 21:32:21 +02:00

lib/irq_poll: Prevent softirq pending leak in irq_poll_cpu_dead()

irq_poll_cpu_dead() pulls the blk_cpu_iopoll backlog from the dead CPU and
raises the POLL softirq with __raise_softirq_irqoff() on the CPU it is
running on. That just sets the bit in the pending softirq mask.

This means the handling of the softirq is delayed until the next interrupt
or a local_bh_disable/enable() pair. As a consequence the CPU on which this
code runs can reach idle with the POLL softirq pending, which triggers a
warning in the NOHZ idle code.

Add a local_bh_disable/enable() pair around the interrupts disabled section
in irq_poll_cpu_dead(). local_bh_enable will handle the pending softirq.

[tglx: Massaged changelog and comment]

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/87k0bxgl27.ffs@tglx

---
lib/irq_poll.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2f17b48..2d5329a 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -188,14 +188,18 @@ EXPORT_SYMBOL(irq_poll_init);
static int irq_poll_cpu_dead(unsigned int cpu)
{
/*
- * If a CPU goes away, splice its entries to the current CPU
- * and trigger a run of the softirq
+ * If a CPU goes away, splice its entries to the current CPU and
+ * set the POLL softirq bit. The local_bh_disable()/enable() pair
+ * ensures that it is handled. Otherwise the current CPU could
+ * reach idle with the POLL softirq pending.
*/
+ local_bh_disable();
local_irq_disable();
list_splice_init(&per_cpu(blk_cpu_iopoll, cpu),
this_cpu_ptr(&blk_cpu_iopoll));
__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
local_irq_enable();
+ local_bh_enable();

return 0;
}