Ingo and Thomas,
John and I have been discussing all the "run softirq from IRQ thread"
work lately and discovered something nasty.
Now it is a nice optimization to run softirqs from the IRQ thread, but
it may not be feasible because of the semantics of the IRQ thread
compared with the softirq thread. Namely, the softirq thread is bound to
a single CPU and the IRQ thread is not.
We used to think that it would be fine to simply bind an IRQ thread to a
single CPU, either at the start of the IRQ thread code, or just while it
is running the softirq code. But this has a major flaw, as John Stultz
discovered.
If an RT hog that is of higher priority than the IRQ thread preempts the
IRQ thread while it is bound to a CPU (more likely with the latest
code that always binds the IRQ thread to one CPU), then that IRQ is, in
essence, masked. That means no more actions will take place for that
IRQ while the RT thread is running. Normally, if the IRQ has its
affinity set to all CPUs and an RT thread preempts the IRQ thread and
runs for a long time, one would expect the IRQ thread to migrate to
another CPU and finish, letting more interrupts in from the IRQ line
(remember that the IRQ line is masked until the IRQ thread finishes its
handler).
This patch will only run the softirq functions if the IRQ thread and the
softirq thread have the same priority **and** the IRQ thread is already
bound to a single CPU. If we are running on UP, or the IRQ thread is
bound to a single CPU anyway, we already have the possibility of an RT
hog starving the IRQ. But we should not add that scenario when the IRQ
thread has its affinity set to run on other CPUs that don't have RT hogs
on them.
Signed-off-by: Steven Rostedt <[email protected]>
Index: linux-2.6.23-rc2-rt2/kernel/irq/manage.c
===================================================================
--- linux-2.6.23-rc2-rt2.orig/kernel/irq/manage.c
+++ linux-2.6.23-rc2-rt2/kernel/irq/manage.c
@@ -769,17 +769,28 @@ static int do_irqd(void * __desc)
{
struct sched_param param = { 0, };
struct irq_desc *desc = __desc;
+ int run_softirq = 1;
#ifdef CONFIG_SMP
cpumask_t cpus_allowed, mask;
cpus_allowed = desc->affinity;
/*
- * Restrict it to one cpu so we avoid being migrated inside of
- * do_softirq_from_hardirq()
+ * If the irqd is bound to one CPU we let it run softirqs
+ * that have the same priority as the irqd thread. We do
+ * not run it if the irqd is bound to more than one CPU
+ * due to the fact that it can
+ * 1) migrate to other CPUS while running the softirqd
+ * 2) if we pin the irqd to a CPU to run the softirqd, then
+ * we risk a high priority process from waking up and
+ * preempting the irqd. Although the irqd may be able to
+ * run on other CPUS due to its irq affinity, it will not
+ * be able to since we bound it to a CPU to run softirqs.
+ * So a RT hog could starve the irqd from running on
+ * other CPUS that it's allowed to run on.
*/
- mask = cpumask_of_cpu(first_cpu(desc->affinity));
- set_cpus_allowed(current, mask);
+ if (cpus_weight(cpus_allowed) != 1)
+ run_softirq = 0; /* turn it off */
#endif
current->flags |= PF_NOFREEZE | PF_HARDIRQ;
@@ -795,7 +806,8 @@ static int do_irqd(void * __desc)
do {
set_current_state(TASK_INTERRUPTIBLE);
do_hardirq(desc);
- do_softirq_from_hardirq();
+ if (run_softirq)
+ do_softirq_from_hardirq();
} while (current->state == TASK_RUNNING);
local_irq_enable_nort();
@@ -806,12 +818,10 @@ static int do_irqd(void * __desc)
if (!cpus_equal(cpus_allowed, desc->affinity)) {
cpus_allowed = desc->affinity;
/*
- * Restrict it to one cpu so we avoid being
- * migrated inside of
- * do_softirq_from_hardirq()
+ * Only allow the irq thread to run the softirqs
+ * if it is bound to a single CPU.
*/
- mask = cpumask_of_cpu(first_cpu(desc->affinity));
- set_cpus_allowed(current, mask);
+ run_softirq = (cpus_weight(cpus_allowed) == 1);
}
#endif
schedule();
Index: linux-2.6.23-rc2-rt2/kernel/softirq.c
===================================================================
--- linux-2.6.23-rc2-rt2.orig/kernel/softirq.c
+++ linux-2.6.23-rc2-rt2/kernel/softirq.c
@@ -114,7 +114,14 @@ static void wakeup_softirqd(int softirq)
* context processing it later on.
*/
if ((current->flags & PF_HARDIRQ) && !hardirq_count() &&
- (tsk->normal_prio == current->normal_prio))
+ (tsk->normal_prio == current->normal_prio) &&
+ /*
+ * The hard irq thread must be bound to a single CPU to run
+ * a softirq. Don't worry about locking, the irq thread
+ * should be the only one to modify the cpus_allowed, when
+ * the irq affinity changes.
+ */
+ (cpus_weight(current->cpus_allowed) == 1))
return;
#endif
/*
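For clarity, the test that the softirq.c hunk adds to wakeup_softirqd()
boils down to the following (a sketch only; the helper name is made up,
the patch itself open-codes the condition, and tsk is the softirqd
thread for that softirq):

	/*
	 * Sketch: may the current hard-IRQ thread run the softirq inline
	 * instead of waking softirqd?  Mirrors the check added above.
	 */
	static int irq_thread_may_run_softirq(struct task_struct *tsk)
	{
		return (current->flags & PF_HARDIRQ) &&		/* we are an IRQ thread      */
			!hardirq_count() &&			/* not in hard irq context   */
			tsk->normal_prio == current->normal_prio && /* same prio as softirqd */
			cpus_weight(current->cpus_allowed) == 1; /* bound to a single CPU    */
	}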
On Wednesday 08 August 2007 13:42:32 you wrote:
> Ingo and Thomas,
>
> John and I have been discussing all the "run softirq from IRQ thread"
> lately and discovered something nasty.
>
> Now it is a nice optimization to run softirqs from the IRQ thread, but
> it may not be feasible because of the semantics of the IRQ thread
> compared with the softirq thread. Namely, the softirq thread is bound to
> a single CPU and the IRQ thread is not.
>
> We use to think that it would be fine to simply bind an IRQ thread to a
> single CPU, either at the start of the IRQ thread code, or just while it
> is running the softirq code. But this has a major flaw as John Stultz
> discovered.
>
> If a RT hog that is of higher priority than the IRQ thread preempts the
> IRQ thread while it is bound to the CPU (more likely with the latest
> code that always binds the IRQ thread to 1 CPU), then that IRQ is, in
> essence, masked. That means no more actions will be taken place by that
> IRQ while the RT thread is running. Normally, one would expect, that if
> the IRQ has its affinity set to all CPUS, if a RT thread were to preempt
> the IRQ thread and run for a long time, it would be expected that the
> IRQ thread would migrate to another CPU and finish. Letting more
> interrupts from the IRQ line in (remember that the IRQ line is masked
> until the IRQ finishes its handler).
>
> This patch will only run the softirq functions if the IRQ thread and the
> softirq thread have the same priority **and** the IRQ thread is already
> bound to a single CPU. If we are running on UP or the IRQ thread is
> bound to a single CPU, we already have the possibility of having a RT
> hog starve the IRQ. But we should not add that scenario when the IRQ
> thread has its affinity set to run on other CPUS that don't have RT hogs
> on them.
It seems to me that this patch will reduce the frequency of irqd/softirqd
starvation, but the core problem still exists: softirq tasks can't migrate to
other CPUs to perform their work if a higher priority task preempts them.
I'm wondering if we want to keep special-casing things to minimize the
problem or not - it seems to me the worst case is still the same - and isn't
the worst case the only case that matters (for -rt)?
--
Darren Hart
IBM Linux Technology Center
Realtime Linux Team
On Wed, 2007-08-08 at 14:16 -0700, Darren Hart wrote:
> It seems to me that this patch will reduce the frequency of irqd/softirqd
> starvation, but the core problem still exists: softirq tasks can't migrate to
> other CPUs to perform their work if a higher priority task preempts them.
> I'm wondering if we want to keep special casing things to minimize the
> problem or not - seems to me the worst case is still the same - and isn't the
> worst case the only case that matters (for -rt)?
>
Softirq tasks should never migrate to other CPUs. A softirq exists on
every CPU, so if you trigger a softirq on CPU1 it will only run on CPU1.
If a high priority task preempts it, that same softirq can still run on
other CPUs; only the thread that was preempted won't migrate. But that's
the characteristic of softirqs, and that's how people who use them in
development expect them to work.
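To make that concrete, here is a rough sketch (the driver handler below
is made up, it is not from any real driver) of why the work stays on the
CPU that raised it:

	/*
	 * raise_softirq() marks the softirq pending on the CPU it is called
	 * from, so the work is handled by that CPU's softirq context (the
	 * per-CPU softirqd thread in -rt).  It does not migrate by itself.
	 */
	static irqreturn_t my_driver_handler(int irq, void *dev_id)
	{
		/* ... acknowledge the hardware ... */

		raise_softirq(NET_RX_SOFTIRQ);	/* pends on the local CPU only */

		return IRQ_HANDLED;
	}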
The problem with the IRQ thread is that there is only one. If you
preempt it, it will not run anywhere. Even if we tried to do something
like the softirqs, where we put one on every CPU, you still have the IRQ
line itself masked, and if the IRQ thread gets preempted, that IRQ won't
happen again on another CPU.
Now, I can see a problem with softirqs still being on one CPU, because
we basically have the same problem if a softirq gets preempted while
holding a mutex: other softirqs must wait until the RT task that
preempted the softirq finishes. This could cause problems as well,
especially if the designer of the RT app doesn't understand this.
/me will make a note of this in his RT chapter on "Building Embedded
Linux Systems".
-- Steve
On 8/8/07, Steven Rostedt <[email protected]> wrote:
> On Wed, 2007-08-08 at 14:16 -0700, Darren Hart wrote:
>
> > It seems to me that this patch will reduce the frequency of irqd/softirqd
> > starvation, but the core problem still exists: softirq tasks can't migrate to
> > other CPUs to perform their work if a higher priority task preempts them.
> > I'm wondering if we want to keep special casing things to minimize the
> > problem or not - seems to me the worst case is still the same - and isn't the
> > worst case the only case that matters (for -rt)?
> >
>
> softirq tasks should never migrate to other CPUs. A softirq exists in
> every CPU. So if you trigger a softirq on CPU1 it will only run on CPU1.
> If a high priority task preempts it, that same softirq can still run on
> other CPUS. Only the thread that was preempted wont switch. But that's
> the characteristic of softirqs, and that's how people who use them in
> development expect them to work.
Wouldn't a developer of a real-time system configure the system so
that interrupts do not interfere with the real-time tasks running on a
specific CPU?
In other words, is this problem not simply a misconfiguration of the system?
I personally redirect all interrupts away from the CPUs where my
real-time tasks run, and only allow the interrupts that I want to
handle in my application on the CPUs where I handle them, so as to
minimize latency.
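Roughly, the setup looks like this (just a sketch; the IRQ number, the
affinity mask and the CPU numbers are only examples):

	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdio.h>

	/*
	 * Steer IRQ 19 onto CPU 0 via /proc/irq/19/smp_affinity, then pin
	 * the calling real-time task to CPU 1, so the RT task and the
	 * interrupt never share a CPU.
	 */
	static int shield_rt_cpu(void)
	{
		cpu_set_t set;
		FILE *f = fopen("/proc/irq/19/smp_affinity", "w");

		if (!f)
			return -1;
		fprintf(f, "1\n");		/* hex bitmask: CPU 0 only */
		fclose(f);

		CPU_ZERO(&set);
		CPU_SET(1, &set);		/* RT task runs on CPU 1 only */
		return sched_setaffinity(0, sizeof(set), &set);
	}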
Robert
--
On Thu, 9 Aug 2007, Robert de Vries wrote:
> On 8/8/07, Steven Rostedt <[email protected]> wrote:
>
> Wouldn't a developer of a real-time system configure the system so
> that interrupts do not interfere with the real-time tasks running on a
> specific CPU?
You would think. But the Linux kernel should still expect systems that are
not set up that way. This can happen on a system that's not hard RT, but
more soft RT (God I hate that term). Meaning, the "optimization" of running
softirqs from the IRQ thread can actually cause longer latencies for the
interrupt, since any time an RT task preempts the IRQ thread while it is
bound to a CPU, it is like masking that interrupt for the full time the RT
task is running.
> In other words, is this problem not simply a misconfiguration of the system?
> I personally redirect all interrupts away from the CPU's where my
> real-time tasks run and only allow the interrupts that I want to
> handle in my application on the CPU's where I handle them so as to
> minimize latency.
This can't always be done. Say you only have two CPUs, and there are two RT
tasks that run at high priority, one on each. But they could be periodic, so
the two don't always run at the same time. You don't want to stop an IRQ
whenever its thread gets preempted by one of the RT tasks.
So my answer to you is, no, it is not just a misconfiguration of the
system. By making an optimization, we just added a latency that would not
be there without that optimization. And it wouldn't be something that one
would expect.
-- Steve
On Thursday 09 August 2007 02:49:27 Robert de Vries wrote:
> On 8/8/07, Steven Rostedt <[email protected]> wrote:
> > On Wed, 2007-08-08 at 14:16 -0700, Darren Hart wrote:
> > > It seems to me that this patch will reduce the frequency of
> > > irqd/softirqd starvation, but the core problem still exists: softirq
> > > tasks can't migrate to other CPUs to perform their work if a higher
> > > priority task preempts them. I'm wondering if we want to keep special
> > > casing things to minimize the problem or not - seems to me the worst
> > > case is still the same - and isn't the worst case the only case that
> > > matters (for -rt)?
> >
> > softirq tasks should never migrate to other CPUs. A softirq exists in
> > every CPU. So if you trigger a softirq on CPU1 it will only run on CPU1.
> > If a high priority task preempts it, that same softirq can still run on
> > other CPUS. Only the thread that was preempted wont switch. But that's
> > the characteristic of softirqs, and that's how people who use them in
> > development expect them to work.
>
> Wouldn't a developer of a real-time system configure the system so
> that interrupts do not interfere with the real-time tasks running on a
> specific CPU?
> In other words, is this problem not simply a misconfiguration of the
> system? I personally redirect all interrupts away from the CPU's where my
> real-time tasks run and only allow the interrupts that I want to
> handle in my application on the CPU's where I handle them so as to
> minimize latency.
I think the goal is to try and have a generally robust default setup - but the
interrupt shielding you are suggesting seems like it will become more and
more relevant as the number of CPUs increases.
--
Darren Hart
IBM Linux Technology Center
Realtime Linux Team