Hi all,
Attached is a patch i use to get my sparc LX to boot (against 2.4.14).
Without it the SUN does not work. I'm not sure if this is the right way to
fix it, but i fail to see way softirq should be raised again if there are no
enabled tasklets left. Am i missing anything?
The SUN got in a deadlock. The schedule() function didn't return
execution to spawn_ksoftirqd, because ksoftirqd kept on doing do_softirqd to
handle a disabled tasklet (sun_kbd_bh).
Both my i386 and sun4m seem to run stable with this patch.
me
diff -ruN linux-2.4.14/kernel/softirq.c linux/kernel/softirq.c
--- linux-2.4.14/kernel/softirq.c Thu Nov 8 15:58:24 2001
+++ linux/kernel/softirq.c Thu Nov 8 15:59:40 2001
@@ -193,10 +193,9 @@
if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
BUG();
t->func(t->data);
- tasklet_unlock(t);
- continue;
}
tasklet_unlock(t);
+ continue;
}
local_irq_disable();
On Thursday 08 November 2001 17:36, Mathijs Mohlmann wrote:
> Attached is a patch i use to get my sparc LX to boot (against 2.4.14).
> Without it the SUN does not work. I'm not sure if this is the right way to
> fix it, but i fail to see way softirq should be raised again if there are
> no enabled tasklets left. Am i missing anything?
Oke, forget that last patch :(
There is however -i feel- an issue to be addressed here. To clarify i wrote a
module. Here it is:
#define MODULE
#include <linux/sched.h>
#include <linux/module.h>
#include <linux/interrupt.h>
void
dummy_lockit(unsigned long nill)
{
printk("doing the lockit\n");
return;
}
DECLARE_TASKLET_DISABLED(lockit, dummy_lockit, 0);
int
init_module(void)
{
printk("going to lock\n");
tasklet_schedule(&lockit);
return(0);
}
void
cleanup_module(void)
{
tasklet_enable(&lockit);
set_current_state(TASK_INTERRUPTIBLE);
schedule_timeout(5*HZ); /* some time to do dummy_lockit */
printk("bye\n");
}
When you insert this in a 2.4.x machine, the CPU ide time will go to 0%. (and
your kernel will be tained ;). When the idletask is trying to boot your
system, this is bad.
Looking at kernel/softirq.c tasklet_action: i don't see:
a) why we have to retry to run a tasklet that is already runnning.
b) when a tasklet is disabled, why we immediately have to a
__cpu_raise_softirq. This will starve the rest of the system.
(we could consider doing __cpu_raise_softirq in tasklet_enable?)
IMHO the following patch will "fix" this. I was unable to test it on a SMP
system, but my SUN4M is running 2.4.14 now!
me
--- linux-2.4.14/kernel/softirq.c Thu Nov 8 15:58:24 2001
+++ linux-sparc/kernel/softirq.c Fri Nov 9 08:19:43 2001
@@ -188,21 +188,19 @@
list = list->next;
- if (tasklet_trylock(t)) {
- if (!atomic_read(&t->count)) {
+ if (!atomic_read(&t->count)) {
+ if (tasklet_trylock(t)) {
if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
BUG();
t->func(t->data);
tasklet_unlock(t);
- continue;
}
- tasklet_unlock(t);
+ continue;
}
local_irq_disable();
t->next = tasklet_vec[cpu].list;
tasklet_vec[cpu].list = t;
- __cpu_raise_softirq(cpu, TASKLET_SOFTIRQ);
local_irq_enable();
}
}
@@ -222,21 +220,19 @@
list = list->next;
- if (tasklet_trylock(t)) {
- if (!atomic_read(&t->count)) {
+ if (!atomic_read(&t->count)) {
+ if (tasklet_trylock(t)) {
if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
BUG();
t->func(t->data);
tasklet_unlock(t);
- continue;
}
- tasklet_unlock(t);
+ continue;
}
local_irq_disable();
t->next = tasklet_hi_vec[cpu].list;
tasklet_hi_vec[cpu].list = t;
- __cpu_raise_softirq(cpu, HI_SOFTIRQ);
local_irq_enable();
}
}