2001-07-06 06:42:48

by Ville Nummela

[permalink] [raw]
Subject: tasklets in 2.4.6

Hi,

I have a little question about tasklets in kernel 2.4.6:

In kernel/softirq.c, line 178:

if (test_bit(TASKLET_STATE_SCHED, &t->state))
tasklet_schedule(t);

What's the idea behind this line? If the tasklet is already scheduled,
schedule it again? This does not make much sense to me.

Also, a few lines before:

if (test_bit(TASKLET_STATE_SCHED, &t->state))
goto repeat;

Here we'll loop forever if the tasklet should schedule itself.

So if the tasklet schedules itself we'll loop forever, and if it doesn't,
it'll never get run again.
If we changed line 178 to:

if (!test_bit(TASKLET_STATE_SCHED, &t->state))

the tasklet would get scheduled if it was NOT scheduled, and everything would
go on happily forever :)

But anyway, I'm probably missing something here, perhaps someone could educate
me a bit ;-)

--
| [email protected] tel: +358-40-8480344
| So Linus, what are we doing tonight?
| The same thing we do every night, Tux. Try to take over the world!


2001-07-06 07:08:36

by Jeff Garzik

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

Ville Nummela wrote:
> In kernel/softirq.c, line 178:
>
> if (test_bit(TASKLET_STATE_SCHED, &t->state))
> tasklet_schedule(t);
>
> What's the idea behind this line? If the tasklet is already scheduled,
> schedule it again? This does not make much sense to me.
>
> Also, a few lines before:
>
> if (test_bit(TASKLET_STATE_SCHED, &t->state))
> goto repeat;
>
> Here we'll loop forever if the tasklet should schedule itself.

hmmm, it looks almost ok to me.

The tasklet is locked before the "repeat" label, so any tasklet
scheduling itself will set the bit, but NOT actually schedule itself.
For this, see the tasklet_schedule code, for the case where the bit is
not set but the tasklet is locked.

The first statement above schedules the tasklet if the bit was set while
the tasklet was locked. The second statement, as the comment right
above it indicates, causes the tasklet to repeat itself.

The only thing that appears fishy is that if the tasklet constantly
reschedules itself, it will never leave the loop AFAICS. This affects
tasklet_hi_action as well as tasklet_action.

> repeat:
> if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> BUG();
> if (!atomic_read(&t->count)) {
> local_irq_enable();
> t->func(t->data);
> local_irq_disable();
> /*
> * One more run if the tasklet got reactivated:
> */
> if (test_bit(TASKLET_STATE_SCHED, &t->state))
> goto repeat;
> }
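As a quick sanity check of that reading, here is a tiny user-space model of the quoted loop (the names and the "budget" counter are made up for illustration; this is not kernel code). A function that reschedules itself N times gets run N+1 times before the loop exits:

```c
#include <assert.h>

static unsigned sched_bit;      /* models TASKLET_STATE_SCHED */
static int runs;
static int budget;              /* how many times func reschedules itself */

static void self_resched_func(void)
{
    runs++;
    if (budget-- > 0)
        sched_bit = 1;          /* tasklet_schedule() from inside func */
}

/* The "repeat:" loop quoted above, in miniature */
static int repeat_loop(int n)
{
    budget = n;
    sched_bit = 1;              /* initially scheduled */
    runs = 0;
    do {
        sched_bit = 0;          /* test_and_clear_bit(TASKLET_STATE_SCHED) */
        self_resched_func();
    } while (sched_bit);        /* "One more run if ... reactivated" */
    return runs;
}
```

A func that always sets sched_bit would never let the do/while exit, which is exactly the lockup in question.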


Jeff


--
Jeff Garzik | A recent study has shown that too much soup
Building 1024 | can cause malaise in laboratory mice.
MandrakeSoft |

2001-07-06 08:26:25

by Jeff Garzik

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

Jeff Garzik wrote:
> The only thing that appears fishy is that if the tasklet constantly
> reschedules itself, it will never leave the loop AFAICS. This affects
> tasklet_hi_action as well as tasklet_action.

As I hacked around to fix this, I noticed Andrea's ksoftirq patch
already fixed this. So, I decided to look over his patch instead.


Quoted from 00_ksoftirqd-7:
> +#ifdef CONFIG_SMP
> + movl processor(%ebx),%eax
> + shll $CONFIG_X86_L1_CACHE_SHIFT,%eax
> + testl $0, SYMBOL_NAME(irq_stat)(,%eax) # softirq_pending
> +#else
> + testl $0, SYMBOL_NAME(irq_stat) # softirq_pending
> +#endif
> + jne handle_softirq

Did you get this from Alpha? :) This is very similar to a
micro-optimization I put in for UP machines on Alpha.

Useful, yes, but unrelated to the ksoftirqd issue.

> +handle_softirq_back:
> cmpl $0,need_resched(%ebx)
> jne reschedule
> +reschedule_back:
> cmpl $0,sigpending(%ebx)
> jne signal_return
> restore_all:
> @@ -256,9 +266,14 @@
> jmp restore_all
>
> ALIGN
> +handle_softirq:
> + call SYMBOL_NAME(do_softirq)
> + jmp handle_softirq_back
> +
> + ALIGN
> reschedule:
> call SYMBOL_NAME(schedule) # test
> - jmp ret_from_sys_call
> + jmp reschedule_back

I'm too slack to look at the code flow and see what is going on here, so
"no comment" :)

> diff -urN 2.4.6pre5/include/asm-alpha/softirq.h softirq/include/asm-alpha/softirq.h
> --- 2.4.6pre5/include/asm-alpha/softirq.h Thu Jun 21 08:03:51 2001
> +++ softirq/include/asm-alpha/softirq.h Thu Jun 21 15:58:06 2001
> @@ -8,21 +8,28 @@
> extern inline void cpu_bh_disable(int cpu)
> {
> local_bh_count(cpu)++;
> - mb();
> + barrier();
> }
>
> -extern inline void cpu_bh_enable(int cpu)
> +extern inline void __cpu_bh_enable(int cpu)
> {
> - mb();
> + barrier();
> local_bh_count(cpu)--;
> }

I do not say this is wrong... but why reinvent the wheel? Is it
possible to use atomic_t for local

>
> -#define local_bh_enable() cpu_bh_enable(smp_processor_id())
> -#define __local_bh_enable local_bh_enable
> +#define __local_bh_enable() __cpu_bh_enable(smp_processor_id())
> #define local_bh_disable() cpu_bh_disable(smp_processor_id())
>
> -#define in_softirq() (local_bh_count(smp_processor_id()) != 0)
> +#define local_bh_enable() \
> +do { \
> + int cpu; \
> + \
> + barrier(); \
> + cpu = smp_processor_id(); \
> + if (!--local_bh_count(cpu) && softirq_pending(cpu)) \
> + do_softirq(); \
> +} while (0)

makes sense

> diff -urN 2.4.6pre5/include/asm-i386/softirq.h softirq/include/asm-i386/softirq.h
> --- 2.4.6pre5/include/asm-i386/softirq.h Thu Jun 21 08:03:52 2001
> +++ softirq/include/asm-i386/softirq.h Thu Jun 21 15:58:06 2001
> @@ -28,6 +26,7 @@
> do { \
> unsigned int *ptr = &local_bh_count(smp_processor_id()); \
> \
> + barrier(); \
> if (!--*ptr) \
> __asm__ __volatile__ ( \
> "cmpl $0, -8(%0);" \

If you don't mind, why is barrier() needed here? The __volatile__
doesn't take care of that?

I am not yet completely familiar with the conditions under which
barrier() is needed.


> diff -urN 2.4.6pre5/include/linux/interrupt.h softirq/include/linux/interrupt.h
> --- 2.4.6pre5/include/linux/interrupt.h Thu Jun 21 08:03:56 2001
> +++ softirq/include/linux/interrupt.h Thu Jun 21 15:58:06 2001
> @@ -74,6 +74,22 @@
> asmlinkage void do_softirq(void);
> extern void open_softirq(int nr, void (*action)(struct softirq_action*), void *data);
>
> +static inline void __cpu_raise_softirq(int cpu, int nr)
> +{
> + softirq_pending(cpu) |= (1<<nr);
> +}
> +
> +
> +/* I do not want to use atomic variables now, so that cli/sti */
> +static inline void raise_softirq(int nr)
> +{
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + __cpu_raise_softirq(smp_processor_id(), nr);
> + local_irq_restore(flags);
> +}
> +
> extern void softirq_init(void);
>
>
> @@ -129,7 +145,7 @@
> extern struct tasklet_head tasklet_hi_vec[NR_CPUS];
>
> #define tasklet_trylock(t) (!test_and_set_bit(TASKLET_STATE_RUN, &(t)->state))
> -#define tasklet_unlock(t) clear_bit(TASKLET_STATE_RUN, &(t)->state)
> +#define tasklet_unlock(t) do { smp_mb__before_clear_bit(); clear_bit(TASKLET_STATE_RUN, &(t)->state); } while(0)
> #define tasklet_unlock_wait(t) while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
>
> extern void tasklet_schedule(struct tasklet_struct *t);
> diff -urN 2.4.6pre5/include/linux/irq_cpustat.h softirq/include/linux/irq_cpustat.h
> diff -urN 2.4.6pre5/kernel/softirq.c softirq/kernel/softirq.c
> --- 2.4.6pre5/kernel/softirq.c Thu Jun 21 08:03:57 2001
> +++ softirq/kernel/softirq.c Thu Jun 21 15:58:06 2001
> @@ -51,17 +51,20 @@
> {
> int cpu = smp_processor_id();
> __u32 pending;
> + long flags;
> + __u32 mask;
>
> if (in_interrupt())
> return;
>
> - local_irq_disable();
> + local_irq_save(flags);
>
> pending = softirq_pending(cpu);
>
> if (pending) {
> struct softirq_action *h;
>
> + mask = ~pending;
> local_bh_disable();
> restart:
> /* Reset the pending bitmask before enabling irqs */
> @@ -81,12 +84,26 @@
> local_irq_disable();
>
> pending = softirq_pending(cpu);
> - if (pending)
> + if (pending & mask) {
> + mask &= ~pending;
> goto restart;
> + }

Cool, I was wondering why the old code did not do pending&mask...

> @@ -112,8 +129,7 @@
> * If nobody is running it then add it to this CPU's
> * tasklet queue.
> */
> - if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state) &&
> - tasklet_trylock(t)) {
> + if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
> t->next = tasklet_vec[cpu].list;
> tasklet_vec[cpu].list = t;
> __cpu_raise_softirq(cpu, TASKLET_SOFTIRQ);
> @@ -130,8 +146,7 @@
> cpu = smp_processor_id();
> local_irq_save(flags);
>
> - if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state) &&
> - tasklet_trylock(t)) {
> + if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
> t->next = tasklet_hi_vec[cpu].list;
> tasklet_hi_vec[cpu].list = t;
> __cpu_raise_softirq(cpu, HI_SOFTIRQ);

Why does tasklet_trylock go away?

It seems like this removes the supported capability for a tasklet to
reschedule itself, which would occur when test_and_set_bit succeeds but
tasklet_trylock fails, in the code above.

> @@ -148,37 +163,29 @@
> local_irq_disable();
> list = tasklet_vec[cpu].list;
> tasklet_vec[cpu].list = NULL;
> + local_irq_enable();
>
> while (list) {
> struct tasklet_struct *t = list;
>
> list = list->next;
>
> - /*
> - * A tasklet is only added to the queue while it's
> - * locked, so no other CPU can have this tasklet
> - * pending:
> - */
> if (!tasklet_trylock(t))
> BUG();
> -repeat:
> - if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> - BUG();
> if (!atomic_read(&t->count)) {
> - local_irq_enable();
> + clear_bit(TASKLET_STATE_SCHED, &t->state);
> t->func(t->data);
> - local_irq_disable();
> - /*
> - * One more run if the tasklet got reactivated:
> - */
> - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> - goto repeat;
> + tasklet_unlock(t);
> + continue;

Here is the key problem area, the reason for this entire patch.

Andrea your fix appears -almost- correct, but it missed a key point:
re-running the tasklet in tasklet_action is i/dcache friendly.

I suggest preserving the original intent of the code (as I interpret
from the comment), by re-running the tasklet function -once-, if it
rescheduled itself.

Comments?

> }
> tasklet_unlock(t);
> - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> - tasklet_schedule(t);
> +
> + local_irq_disable();
> + t->next = tasklet_vec[cpu].list;
> + tasklet_vec[cpu].list = t;
> + __cpu_raise_softirq(cpu, TASKLET_SOFTIRQ);
> + local_irq_enable();

please create __tasklet_schedule and __tasklet_hi_schedule instead of
duplicating code here, between local_irq_disable and local_irq_enable.

> }
> - local_irq_enable();
> }
>
>
> @@ -193,6 +200,7 @@
> local_irq_disable();
> list = tasklet_hi_vec[cpu].list;
> tasklet_hi_vec[cpu].list = NULL;
> + local_irq_enable();
>
> while (list) {
> struct tasklet_struct *t = list;
> @@ -201,21 +209,20 @@
>
> if (!tasklet_trylock(t))
> BUG();
> -repeat:
> - if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> - BUG();
> if (!atomic_read(&t->count)) {
> - local_irq_enable();
> + clear_bit(TASKLET_STATE_SCHED, &t->state);
> t->func(t->data);
> - local_irq_disable();
> - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> - goto repeat;
> + tasklet_unlock(t);
> + continue;
> }
> tasklet_unlock(t);
> - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> - tasklet_hi_schedule(t);
> +
> + local_irq_disable();
> + t->next = tasklet_hi_vec[cpu].list;
> + tasklet_hi_vec[cpu].list = t;
> + __cpu_raise_softirq(cpu, HI_SOFTIRQ);
> + local_irq_enable();

this should be a __tasklet_hi_schedule call, not inlined duplicate code


> +static int ksoftirqd(void * __bind_cpu)
> +{
> + int bind_cpu = *(int *) __bind_cpu;
> + int cpu = cpu_logical_map(bind_cpu);
> +
> + daemonize();
> + current->nice = 19;
> + sigfillset(&current->blocked);
> +
> + /* Migrate to the right CPU */
> + current->cpus_allowed = 1UL << cpu;
> + while (smp_processor_id() != cpu)
> + schedule();
> +
> + sprintf(current->comm, "ksoftirqd_CPU%d", bind_cpu);
> +
> + __set_current_state(TASK_INTERRUPTIBLE);
> + mb();
> +
> + ksoftirqd_task(cpu) = current;
> +
> + for (;;) {
> + if (!softirq_pending(cpu))
> + schedule();
> +
> + __set_current_state(TASK_RUNNING);
> +
> + while (softirq_pending(cpu)) {
> + do_softirq();
> + if (current->need_resched)
> + schedule();
> + }
> +
> + __set_current_state(TASK_INTERRUPTIBLE);
> + }
> +}

should there perhaps be a
current->policy |= SCHED_YIELD
in here, or does setting current->nice make that unnecessary?

So basically Andrea's ksoftirq patch indeed appears to fix the infinite
loop, but IMHO it needs a tiny bit of work...

Jeff



--
Jeff Garzik | A recent study has shown that too much soup
Building 1024 | can cause malaise in laboratory mice.
MandrakeSoft |

2001-07-06 08:28:05

by Jeff Garzik

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

Jeff Garzik wrote:
> > --- 2.4.6pre5/include/asm-alpha/softirq.h Thu Jun 21 08:03:51 2001
> > +++ softirq/include/asm-alpha/softirq.h Thu Jun 21 15:58:06 2001
> > @@ -8,21 +8,28 @@
> > extern inline void cpu_bh_disable(int cpu)
> > {
> > local_bh_count(cpu)++;
> > - mb();
> > + barrier();
> > }
> >
> > -extern inline void cpu_bh_enable(int cpu)
> > +extern inline void __cpu_bh_enable(int cpu)
> > {
> > - mb();
> > + barrier();
> > local_bh_count(cpu)--;
> > }
>
> I do not say this is wrong... but why reinvent the wheel? Is it
> possible to use atomic_t for local

ignore this. I see the reason, and forgot to delete this text :)

--
Jeff Garzik | A recent study has shown that too much soup
Building 1024 | can cause malaise in laboratory mice.
MandrakeSoft |

2001-07-06 11:28:57

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

On Fri, Jul 06, 2001 at 03:08:05AM -0400, Jeff Garzik wrote:
> Ville Nummela wrote:
> > In kernel/softirq.c, line 178:
> >
> > if (test_bit(TASKLET_STATE_SCHED, &t->state))
> > tasklet_schedule(t);
> >
> > What's the idea behind this line? If the tasklet is already scheduled,
> > schedule it again? This does not make much sense to me.
> >
> > Also, a few lines before:
> >
> > if (test_bit(TASKLET_STATE_SCHED, &t->state))
> > goto repeat;
> >
> > Here we'll loop forever if the tasklet should schedule itself.
>
> hmmm, it looks almost ok to me.
>
> The tasklet is locked before the "repeat" label, so any tasklet
> scheduling itself will set the bit, but NOT actually schedule itself.
> For this, see the tasklet_schedule code, for the case where the bit is
> not set but the tasklet is locked.
>
> The first statement above schedules the tasklet if the bit was set while
> the tasklet was locked. The second statement, as the comment right
> above it indicates, causes the tasklet to repeat itself.
>
> The only thing that appears fishy is that if the tasklet constantly
> reschedules itself, it will never leave the loop AFAICS. This affects
> tasklet_hi_action as well as tasklet_action.

another bug triggers when the tasklet is re-enabled by another cpu at
this point:

if (test_bit(TASKLET_STATE_SCHED, &t->state))
goto repeat;
}
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx
tasklet_unlock(t);
if (test_bit(TASKLET_STATE_SCHED, &t->state))
tasklet_schedule(t);

then the check for test_bit(TASKLET_STATE_SCHED, &t->state) will
trigger, but tasklet_schedule won't queue the tasklet because
TASKLET_STATE_SCHED is already set in t->state.
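A minimal single-threaded model of the lost event (invented names; bit 0 stands in for TASKLET_STATE_SCHED). Once the bit is set while the tasklet is not on any list, a mainline-style tasklet_schedule becomes a no-op and the wakeup is lost:

```c
#include <assert.h>

static unsigned state;          /* bit 0 models TASKLET_STATE_SCHED */
static int queued;

/* mainline tasklet_schedule(): queue only if the bit was clear */
static void old_tasklet_schedule(void)
{
    if (!(state & 1)) {         /* test_and_set_bit(...) succeeded */
        state |= 1;
        queued = 1;             /* add to list, raise softirq */
    }
    /* bit already set: returns without queueing anything */
}

static int demo_lost_event(void)
{
    state = 1;                  /* set at the XXXX point, NOT queued */
    queued = 0;
    old_tasklet_schedule();     /* the tail's tasklet_schedule(t) */
    return queued;              /* stays 0: nobody will run the tasklet */
}
```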

This is fixed in my ksoftirqd patch. However one thing I'd like to
change in my patch is to add the BUG() that triggers in the case
mentioned yesterday.

Andrea

2001-07-06 11:58:31

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

On Fri, Jul 06, 2001 at 04:25:49AM -0400, Jeff Garzik wrote:
> Jeff Garzik wrote:
> > The only thing that appears fishy is that if the tasklet constantly
> > reschedules itself, it will never leave the loop AFAICS. This affects
> > tasklet_hi_action as well as tasklet_action.
>
> As I hacked around to fix this, I noticed Andrea's ksoftirq patch
> already fixed this. So, I decided to look over his patch instead.
>
>
> Quoted from 00_ksoftirqd-7:
> > +#ifdef CONFIG_SMP
> > + movl processor(%ebx),%eax
> > + shll $CONFIG_X86_L1_CACHE_SHIFT,%eax
> > + testl $0, SYMBOL_NAME(irq_stat)(,%eax) # softirq_pending
> > +#else
> > + testl $0, SYMBOL_NAME(irq_stat) # softirq_pending
> > +#endif
> > + jne handle_softirq
>
> Did you get this from Alpha? :) This is very similar to a
> micro-optimization I put in for UP machines on Alpha.

Not sure where I got it, I think I wrote it myself while adapting to the
removal of the mask field.

> > diff -urN 2.4.6pre5/include/asm-i386/softirq.h softirq/include/asm-i386/softirq.h
> > --- 2.4.6pre5/include/asm-i386/softirq.h Thu Jun 21 08:03:52 2001
> > +++ softirq/include/asm-i386/softirq.h Thu Jun 21 15:58:06 2001
> > @@ -28,6 +26,7 @@
> > do { \
> > unsigned int *ptr = &local_bh_count(smp_processor_id()); \
> > \
> > + barrier(); \
> > if (!--*ptr) \
> > __asm__ __volatile__ ( \
> > "cmpl $0, -8(%0);" \
>
> If you don't mind, why is barrier() needed here? The __volatile__
> doesn't take care of that?

The asm volatile is _after_ the --*ptr, while a barrier() must be _before_
the --*ptr, that's why an explicit barrier is needed there too.

That is a bug in mainline and that's the fix.
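A sketch of the point, using the usual GCC definition of barrier() (the function and variable names here are illustrative, not the kernel's): the compiler barrier has to sit before the decrement so that no memory access from the bh-protected region can be moved below it.

```c
#include <assert.h>

/* GCC compiler barrier: no memory access may be reordered across it */
#define barrier() __asm__ __volatile__("" : : : "memory")

static unsigned int bh_count = 1;   /* models local_bh_count(cpu) */
static int critical_store;

static void local_bh_enable_model(void)
{
    critical_store = 1;  /* last store of the bh-protected section */
    barrier();           /* forbid sinking that store below --bh_count */
    --bh_count;          /* from here on, softirqs may run again */
}

static int demo_enable(void)
{
    bh_count = 1;
    critical_store = 0;
    local_bh_enable_model();
    return bh_count == 0 && critical_store == 1;
}
```

The asm volatile in the i386 macro only constrains accesses relative to itself, and it sits after the decrement; hence the explicit barrier() before it.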

> > diff -urN 2.4.6pre5/kernel/softirq.c softirq/kernel/softirq.c
> > --- 2.4.6pre5/kernel/softirq.c Thu Jun 21 08:03:57 2001
> > +++ softirq/kernel/softirq.c Thu Jun 21 15:58:06 2001
> > @@ -51,17 +51,20 @@
> > {
> > int cpu = smp_processor_id();
> > __u32 pending;
> > + long flags;
> > + __u32 mask;
> >
> > if (in_interrupt())
> > return;
> >
> > - local_irq_disable();
> > + local_irq_save(flags);
> >
> > pending = softirq_pending(cpu);
> >
> > if (pending) {
> > struct softirq_action *h;
> >
> > + mask = ~pending;
> > local_bh_disable();
> > restart:
> > /* Reset the pending bitmask before enabling irqs */
> > @@ -81,12 +84,26 @@
> > local_irq_disable();
> >
> > pending = softirq_pending(cpu);
> > - if (pending)
> > + if (pending & mask) {
> > + mask &= ~pending;
> > goto restart;
> > + }
>
> Cool, I was wondering why the old code did not do pending&mask...

It didn't do it, with the only effect being to loop forever and
livelock under flood.

Also worth noting: the unconditional local_irq_enable and lack of
save_flags in do_softirq are completely broken in mainline; it can
stack overflow. My patch fixes that. That's one of the most important
bits in my patch.

> > @@ -112,8 +129,7 @@
> > * If nobody is running it then add it to this CPU's
> > * tasklet queue.
> > */
> > - if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state) &&
> > - tasklet_trylock(t)) {
> > + if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
> > t->next = tasklet_vec[cpu].list;
> > tasklet_vec[cpu].list = t;
> > __cpu_raise_softirq(cpu, TASKLET_SOFTIRQ);
> > @@ -130,8 +146,7 @@
> > cpu = smp_processor_id();
> > local_irq_save(flags);
> >
> > - if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state) &&
> > - tasklet_trylock(t)) {
> > + if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
> > t->next = tasklet_hi_vec[cpu].list;
> > tasklet_hi_vec[cpu].list = t;
> > __cpu_raise_softirq(cpu, HI_SOFTIRQ);
>
> Why does tasklet_trylock go away?

Because the logic is different; that's perfectly ok. In short, when
irqs are enabled and the TASKLET_STATE_SCHED bit is set, the tasklet is
also queued. That's the old, cleaner, non-buggy logic Alexey
implemented originally; it fixes the missed tasklet event in the other
case mentioned in the other email. It is also more efficient.
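The invariant can be modeled in a few lines of user-space C (invented names; bit 0 stands in for TASKLET_STATE_SCHED): because scheduling always queues when it sets the bit, a handler that reschedules itself is already back on the list by the time the action finishes, with no re-check needed:

```c
#include <assert.h>

static unsigned state;      /* bit 0 models TASKLET_STATE_SCHED */
static int on_list;

/* ksoftirqd-patch tasklet_schedule(): test_and_set_bit alone decides */
static void new_tasklet_schedule(void)
{
    if (!(state & 1)) {
        state |= 1;
        on_list = 1;        /* SCHED set implies queued, always */
    }
}

static void tasklet_action_model(void (*func)(void))
{
    on_list = 0;            /* the list was grabbed and emptied */
    state &= ~1u;           /* clear SCHED before running func */
    func();                 /* func may call new_tasklet_schedule() */
    /* no trylock, no re-check: a reschedule already re-queued it */
}

static int demo_self_resched(void)
{
    state = 0;
    on_list = 0;
    new_tasklet_schedule();                     /* initial schedule */
    tasklet_action_model(new_tasklet_schedule); /* self-rescheduling */
    return on_list && (state & 1);              /* wakeup not lost */
}
```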

> It seems like this removes the supported capability for a tasklet to
> reschedule itself, which would occur when test_and_set_bit succeeds but
> tasklet_trylock fails, in the code above.

No. The mainline code does the trylock only because, for mainline,
knowing (with irqs disabled) that TASKLET_STATE_SCHED was not set is
not enough to ensure we need to queue the tasklet in the list.

> > @@ -148,37 +163,29 @@
> > local_irq_disable();
> > list = tasklet_vec[cpu].list;
> > tasklet_vec[cpu].list = NULL;
> > + local_irq_enable();
> >
> > while (list) {
> > struct tasklet_struct *t = list;
> >
> > list = list->next;
> >
> > - /*
> > - * A tasklet is only added to the queue while it's
> > - * locked, so no other CPU can have this tasklet
> > - * pending:
> > - */
> > if (!tasklet_trylock(t))
> > BUG();
> > -repeat:
> > - if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> > - BUG();
> > if (!atomic_read(&t->count)) {
> > - local_irq_enable();
> > + clear_bit(TASKLET_STATE_SCHED, &t->state);
> > t->func(t->data);
> > - local_irq_disable();
> > - /*
> > - * One more run if the tasklet got reactivated:
> > - */
> > - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> > - goto repeat;
> > + tasklet_unlock(t);
> > + continue;
>
> Here is the key problem area, the reason for this entire patch.

Not the reason for the entire patch; there are other loops in
do_softirq as well, and many other mainline bugs... but yes, this is
one of the reasons.

> Andrea your fix appears -almost- correct, but it missed a key point:
> re-running the tasklet in tasklet_action is i/dcache friendly.

I don't think it makes sense to run the same tasklet in a loop more
than once. More likely it has to do some work ASAP, but not
immediately.

> I suggest preserving the original intent of the code (as I interpret
> from the comment), by re-running the tasklet function -once-, if it
> rescheduled itself.
>
> Comments?

That would be possible: instead of running it only once, we could run
it twice and then break the loop to avoid the lockups, but I don't
think it is necessary. If somebody can raise a real-world case where we
need it, I can change my mind about it. In all the real-world cases I
can think of, we need the tasklet run very soon, but not immediately.

Also, for example, when the cpu is idle ksoftirqd will make sure to
keep banging on the tasklet in a loop. It is not a tight loop inside
the tasklet code but a loop coming from do_softirq; still, I don't
think that is far enough away to make a relevant i/dcache difference.
So I think the current way is just fine, and much cleaner and faster
for the fast path than having a magic number of times to run a tasklet
before breaking the loop.

> > }
> > tasklet_unlock(t);
> > - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> > - tasklet_schedule(t);
> > +
> > + local_irq_disable();
> > + t->next = tasklet_vec[cpu].list;
> > + tasklet_vec[cpu].list = t;
> > + __cpu_raise_softirq(cpu, TASKLET_SOFTIRQ);
> > + local_irq_enable();
>
> please create __tasklet_schedule and __tasklet_hi_schedule instead of
> duplicating code here, between local_irq_disable and local_irq_enable.

Well, the code duplication is not that huge; I was using 2.4.5 code,
and it had better be inline anyway, removing any runtime difference.
But I will certainly be ok with somebody cleaning it up.

> > - local_irq_enable();
> > }
> >
> >
> > @@ -193,6 +200,7 @@
> > local_irq_disable();
> > list = tasklet_hi_vec[cpu].list;
> > tasklet_hi_vec[cpu].list = NULL;
> > + local_irq_enable();
> >
> > while (list) {
> > struct tasklet_struct *t = list;
> > @@ -201,21 +209,20 @@
> >
> > if (!tasklet_trylock(t))
> > BUG();
> > -repeat:
> > - if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> > - BUG();
> > if (!atomic_read(&t->count)) {
> > - local_irq_enable();
> > + clear_bit(TASKLET_STATE_SCHED, &t->state);
> > t->func(t->data);
> > - local_irq_disable();
> > - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> > - goto repeat;
> > + tasklet_unlock(t);
> > + continue;
> > }
> > tasklet_unlock(t);
> > - if (test_bit(TASKLET_STATE_SCHED, &t->state))
> > - tasklet_hi_schedule(t);
> > +
> > + local_irq_disable();
> > + t->next = tasklet_hi_vec[cpu].list;
> > + tasklet_hi_vec[cpu].list = t;
> > + __cpu_raise_softirq(cpu, HI_SOFTIRQ);
> > + local_irq_enable();
>
> this should be __tasklet_hi_schedule call, not inlined duplicate code

same as above.

>
>
> > +static int ksoftirqd(void * __bind_cpu)
> > +{
> > + int bind_cpu = *(int *) __bind_cpu;
> > + int cpu = cpu_logical_map(bind_cpu);
> > +
> > + daemonize();
> > + current->nice = 19;
> > + sigfillset(&current->blocked);
> > +
> > + /* Migrate to the right CPU */
> > + current->cpus_allowed = 1UL << cpu;
> > + while (smp_processor_id() != cpu)
> > + schedule();
> > +
> > + sprintf(current->comm, "ksoftirqd_CPU%d", bind_cpu);
> > +
> > + __set_current_state(TASK_INTERRUPTIBLE);
> > + mb();
> > +
> > + ksoftirqd_task(cpu) = current;
> > +
> > + for (;;) {
> > + if (!softirq_pending(cpu))
> > + schedule();
> > +
> > + __set_current_state(TASK_RUNNING);
> > +
> > + while (softirq_pending(cpu)) {
> > + do_softirq();
> > + if (current->need_resched)
> > + schedule();
> > + }
> > +
> > + __set_current_state(TASK_INTERRUPTIBLE);
> > + }
> > +}
>
> should there perhaps be a
> current->policy |= SCHED_YIELD
> in here, or does setting current->nice make that unnecessary?

When current->need_resched is set we need to schedule; that's a
completely different issue from when we want to release the cpu. When
we want to release the cpu intentionally we use sched_yield instead.
The renice makes sure need_resched is set more frequently, so we don't
spend too much cpu time there if there's some userspace work to do.

> So basically Andrea's ksoftirq patch indeed appears to fix the infinite
> loop, but IMHO it needs a tiny bit of work...

The tiny bit of work is the cleanup of the tasklet queueing code,
which is duplicated a few times and has to stay inline; it's extremely
low priority for me, so if nobody sends me a patch I will clean it up
eventually.

Many thanks for auditing the patch, this was a very valuable feedback.

Andrea

2001-07-06 12:07:31

by Jeff Garzik

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

Andrea Arcangeli wrote:
> On Fri, Jul 06, 2001 at 04:25:49AM -0400, Jeff Garzik wrote:
> > Why does tasklet_trylock go away?
>
> because the logic is different, that's perfectly ok. In short when irq
> are enabled and the TASKLET_STATE_SCHED bit is set, then the tasklet is
> also queued. That's the old cleaner and not buggy logic implemented by
> Alexey originally, it fixes the missed tasklet event in the other case
> mentioned in the other email. It is also more efficient.

nice :)

--
Jeff Garzik | A recent study has shown that too much soup
Building 1024 | can cause malaise in laboratory mice.
MandrakeSoft |

2001-07-09 09:11:05

by Ville Nummela

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

Jeff Garzik <[email protected]> writes:

> Jeff Garzik wrote:
> > The only thing that appears fishy is that if the tasklet constantly
> > reschedules itself, it will never leave the loop AFAICS. This affects
> > tasklet_hi_action as well as tasklet_action.
> As I hacked around to fix this, I noticed Andrea's ksoftirq patch
> already fixed this. So, I decided to look over his patch instead.

I tried that patch too, but with not so good results: the code LOOKS
good to me too, but for some reason it still seems to get stuck looping
in the tasklet code. Btw, I'm trying this on PPC; it might have
something to do with that.. :)

I'll try to hack this out :-/

--
| [email protected] tel: +358-40-8480344
| So Linus, what are we doing tonight?
| The same thing we do every night, Tux. Try to take over the world!

2001-07-09 11:40:16

by Ville Nummela

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

Ville Nummela <[email protected]> writes:

> Jeff Garzik <[email protected]> writes:
> > As I hacked around to fix this, I noticed Andrea's ksoftirq patch
> > already fixed this. So, I decided to look over his patch instead.
> I tried that patch too, but with not so good results: the code LOOKS
> good to me too, but for some reason it still seems to get stuck
> looping in the tasklet code. Btw, I'm trying this on PPC; it might
> have something to do with that.. :)

Stupid is as stupid does.. I had applied Andrea's patch to the wrong
kernel version, and therefore the patch hadn't quite succeeded. The
right patch works well on PPC too :-)

--
| [email protected] tel: +358-40-8480344
| So Linus, what are we doing tonight?
| The same thing we do every night, Tux. Try to take over the world!

2001-07-09 14:25:08

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: tasklets in 2.4.6

On Mon, Jul 09, 2001 at 02:39:46PM +0300, Ville Nummela wrote:
> Ville Nummela <[email protected]> writes:
>
> > Jeff Garzik <[email protected]> writes:
> > > As I hacked around to fix this, I noticed Andrea's ksoftirq patch
> > > already fixed this. So, I decided to look over his patch instead.
> > I tried that patch too, but with not so good results: the code LOOKS
> > good to me too, but for some reason it still seems to get stuck
> > looping in the tasklet code. Btw, I'm trying this on PPC; it might
> > have something to do with that.. :)
>
> Stupid is as stupid does.. I had applied Andrea's patch to the wrong
> kernel version, and therefore the patch hadn't quite succeeded. The
> right patch works well on PPC too :-)

Btw, I'll tell you too: if you have an SMP machine you also need to fix
this plain scheduler cpu-affinity bug, or you can deadlock at boot when
ksoftirqd kicks in:

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.7pre3aa1/00_cpus_allowed-1

attached here too:

--- 2.4.4pre3aa/kernel/sched.c.~1~ Sat Apr 14 15:49:11 2001
+++ 2.4.4pre3aa/kernel/sched.c Sun Apr 15 18:31:14 2001
@@ -765,6 +765,8 @@
goto repeat_schedule;

still_running:
+ if (!(prev->cpus_allowed & (1UL << this_cpu)))
+ goto still_running_back;
c = goodness(prev, this_cpu, prev->active_mm);
next = prev;
goto still_running_back;

Andrea