2000-12-17 03:21:28

by Petr Vandrovec

Subject: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

Hi,
my 2.4.0-test13-pre1 just stopped responding to my keystrokes.
I've found that it is looping in tqueue_bh and flush_to_ldisc
over and over again.

To my surprise I found that flush_to_ldisc() does

if (test_bit(TTY_DONT_FLIP, &tty->flags)) {
    queue_task(&tty->flip.tqueue, &tq_timer);
    return;
}

Looks OK, but only until you look at run_task_queue().
It now contains

while (!list_empty(list)) {
    ...
}

So postponing the event to the next timer tick does not work anymore.
It never stops cycling in run_task_queue and the machine is dead
(unless you have some spare CPU).
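
For illustration, here is a minimal user-space sketch (hypothetical,
not the kernel code) of the livelock: a handler that re-queues itself
keeps the list non-empty, so a drain-until-empty loop never finishes.

/* `queued` stands in for !list_empty(list); handler() re-queues
 * itself the way flush_to_ldisc() does while TTY_DONT_FLIP is set.
 * The cutoff exists only so the demo terminates. */
#include <stdio.h>

static int queued = 1;
static int passes;

static void handler(void)
{
    queued = 1;    /* postpone to the "next tick" by re-queueing */
}

int main(void)
{
    while (queued) {    /* run_task_queue: while (!list_empty(list)) */
        queued = 0;
        handler();
        if (++passes > 5) {
            printf("still looping after %d passes\n", passes);
            return 1;    /* the kernel never gets this escape hatch */
        }
    }
    return 0;
}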

Is the current run_task_queue() behavior intentional, or is it
just a bug introduced by changing the task queue to a list-based
implementation?
Thanks,
Petr Vandrovec
[email protected]

P.S.: Yes, I know that I should have bought a faster computer so that
the ldisc buffer does not overflow, but...


2000-12-17 03:41:44

by Linus Torvalds

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong



On Sun, 17 Dec 2000, Petr Vandrovec wrote:
> my 2.4.0-test13-pre1 just stopped responding to my keystrokes.
> I've found that it is looping in tqueue_bh and flush_to_ldisc
> over and over again.

Thanks, I think you found the big problem with test12.

> Is the current run_task_queue() behavior intentional, or is it
> just a bug introduced by changing the task queue to a list-based
> implementation?

It's a bug.

Ho humm. I'll have to check what the proper fix is. Right now the rule is
that nobody can _ever_ remove himself from a task queue, although there is
one bad user that does exactly that, and that means that it should be ok
to just cache the "next" pointer in run_task_queue(), and make it look
something like

static void run_task_queue(task_queue *list)
{
    if (!list_empty(list)) {
        unsigned long flags;
        struct list_head *next;

        spin_lock_irqsave(&tqueue_lock, flags);
        next = list->next;
        while (next != list) {
            struct list_head *curr;
            void *arg;
            void (*f) (void *);
            struct tq_struct *p;

            curr = next;
            next = next->next;
            list_del(curr);

            p = list_entry(curr, struct tq_struct, list);
            arg = p->data;
            f = p->routine;
            p->sync = 0;
            spin_unlock_irqrestore(&tqueue_lock, flags);

            if (f)
                f(arg);

            spin_lock_irq(&tqueue_lock);
        }
        spin_unlock_irqrestore(&tqueue_lock, flags);
    }
}

which basically will never re-start at the head if a tq re-inserts itself
(side note: new entries are inserted at the end, but if it was already the
last entry then re-inserting it will not re-execute it, so this should
avoid your problem).
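
To see the idiom in isolation, here is a user-space sketch (simplified
list helpers, no locking, hypothetical names): because `next` is saved
before the callback runs, an entry that re-inserts itself at the tail
is not revisited in the same pass.

#include <stdio.h>

struct node { struct node *next, *prev; const char *name; };

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static void list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

int main(void)
{
    struct node head = { &head, &head, "head" };
    struct node task = { NULL, NULL, "task" };
    list_add_tail(&task, &head);

    struct node *next = head.next;
    while (next != &head) {
        struct node *curr = next;
        next = next->next;              /* cached before the callback */
        list_del(curr);
        printf("running %s\n", curr->name);
        list_add_tail(curr, &head);     /* self re-queue lands behind `next` */
    }
    printf("pass done; %s waits for the next run\n", head.next->name);
    return 0;
}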

Does this fix the problem for you?

Linus

2000-12-17 04:21:24

by Petr Vandrovec

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

On Sat, Dec 16, 2000 at 07:09:56PM -0800, Linus Torvalds wrote:
>
> which basically will never re-start at the head if a tq re-inserts iself
> (side note: new entires are inserted at the end, but if it was already the
> last entry then re-inserting it will not re-execute it, should this should
> avoid your problem).
>
> Does this fix the problem for you?

It looks like it does. I hope that there are not two such users in
the kernel, because if there are, it can loop again.
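
To make the worry concrete, a user-space sketch (simplified list, no
locking, hypothetical names): with the cached-next scheme, two tasks
that each re-queue themselves keep landing behind the cached pointer,
so the walk ping-pongs between them and never reaches the head.

#include <stdio.h>

struct node { struct node *next, *prev; const char *name; };

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static void list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

int main(void)
{
    struct node head = { &head, &head, "head" };
    struct node a = { NULL, NULL, "a" };
    struct node b = { NULL, NULL, "b" };
    int steps = 0;

    list_add_tail(&a, &head);
    list_add_tail(&b, &head);

    struct node *next = head.next;
    while (next != &head) {
        struct node *curr = next;
        next = next->next;            /* cached before the callback runs */
        list_del(curr);
        list_add_tail(curr, &head);   /* both tasks re-queue themselves */
        if (++steps > 6) {
            printf("still cycling (%s) after %d steps\n", curr->name, steps);
            return 1;                 /* would spin forever otherwise */
        }
    }
    return 0;
}
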
Thanks,
Petr Vandrovec
[email protected]

2000-12-17 07:52:26

by Andrew Morton

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

Linus Torvalds wrote:
>
> On Sun, 17 Dec 2000, Petr Vandrovec wrote:
> > my 2.4.0-test13-pre1 just stopped responding to my keystrokes.
> > I've found that it is looping in tqueue_bh and flush_to_ldisc
> > over and over again.
>
> Thanks, I think you found the big problem with test12.
>
> > Is the current run_task_queue() behavior intentional, or is it
> > just a bug introduced by changing the task queue to a list-based
> > implementation?
>
> It's a bug.
>
> Ho humm. I'll have to check what the proper fix is. Right now the rule is
> that nobody can _ever_ remove himself from a task queue, although there is
> one bad user that does exactly that, and that means that it should be ok
> to just cache the "next" pointer in run_task_queue(), and make it look
> something like
>

How about this?

- Create a private copy of the list, zap the original and then
walk the copy in privacy. New list additions will then go
onto the next instantiation. (A sketch of this trick follows the list.)

- Make the function non-inline. It's too big. Semi-randomly
chose softirq.c for this. Shrinks kernel by ~2k.

The test for list emptiness outside the spinlock is racy,
but this doesn't matter.

- Make the commentary slightly less inaccurate.

- As we walk the list, set each entry's list pointers to NULL. This
will force an oops from anyone who tries to remove entries
during or after the walk.

- If anyone wants to remove a live entry from a list that's OK, but
they should do it by locking the list and walking it under the
lock to verify that the entry is still on it. If you find it
then fine: delete it. If you didn't find it then your callback
could currently be running. Beware.
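
For reference, a user-space sketch (simplified list helpers, no
locking, hypothetical names) of the splice trick used in the patch
below: the walker adopts the whole chain in O(1) and re-initialises
the public head, so anything queued while the handlers run waits for
the next pass.

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del_init(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = e;
}

int main(void)
{
    struct list_head queue = { &queue, &queue };
    struct list_head task;
    list_add(&task, &queue);

    /* the splice: `head` adopts the whole chain, `queue` goes empty */
    struct list_head head;
    list_add(&head, &queue);
    list_del_init(&queue);

    printf("public queue empty: %d\n", queue.next == &queue);       /* 1 */
    printf("private head has the task: %d\n", head.next == &task);  /* 1 */
    return 0;
}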




--- linux-2.4.0-test13-pre2/include/linux/tqueue.h Tue Dec 12 19:24:23 2000
+++ linux-akpm/include/linux/tqueue.h Sun Dec 17 18:09:19 2000
@@ -53,22 +53,22 @@
* To implement your own list of active bottom halfs, use the following
* two definitions:
*
- * DECLARE_TASK_QUEUE(my_bh);
- * struct tq_struct run_my_bh = {
- * routine: (void (*)(void *)) run_task_queue,
- * data: &my_bh
+ * DECLARE_TASK_QUEUE(my_tqueue);
+ * struct tq_struct my_task = {
+ * routine: (void (*)(void *)) my_routine,
+ * data: &my_data
* };
*
- * To activate a bottom half on your list, use:
+ * To activate a bottom half on a list, use:
*
- * queue_task(tq_pointer, &my_bh);
+ * queue_task(&my_task, &my_tqueue);
*
- * To run the bottom halfs on your list put them on the immediate list by:
+ * To later run the queued tasks use
*
- * queue_task(&run_my_bh, &tq_immediate);
+ * run_task_queue(&my_tqueue);
*
- * This allows you to do deferred procession. For example, you could
- * have a bottom half list tq_timer, which is marked active by the timer
+ * This allows you to do deferred processing. For example, you could
+ * have a task queue called tq_timer, which is executed within the timer
* interrupt.
*/

@@ -78,8 +78,7 @@
* Queue a task on a tq. Return non-zero if it was successfully
* added.
*/
-static inline int queue_task(struct tq_struct *bh_pointer,
- task_queue *bh_list)
+static inline int queue_task(struct tq_struct *bh_pointer, task_queue *bh_list)
{
int ret = 0;
if (!test_and_set_bit(0,&bh_pointer->sync)) {
@@ -95,32 +94,13 @@
/*
* Call all "bottom halfs" on a given list.
*/
-static inline void run_task_queue(task_queue *list)
-{
- while (!list_empty(list)) {
- unsigned long flags;
- struct list_head *next;
-
- spin_lock_irqsave(&tqueue_lock, flags);
- next = list->next;
- if (next != list) {
- void *arg;
- void (*f) (void *);
- struct tq_struct *p;

- list_del(next);
- p = list_entry(next, struct tq_struct, list);
- arg = p->data;
- f = p->routine;
- p->sync = 0;
- spin_unlock_irqrestore(&tqueue_lock, flags);
+extern void __run_task_queue(task_queue *list);

- if (f)
- f(arg);
- continue;
- }
- spin_unlock_irqrestore(&tqueue_lock, flags);
- }
+static inline void run_task_queue(task_queue *list)
+{
+ if (TQ_ACTIVE(*list))
+ __run_task_queue(list);
}

#endif /* _LINUX_TQUEUE_H */
--- linux-2.4.0-test13-pre2/kernel/softirq.c Sun Oct 15 01:27:46 2000
+++ linux-akpm/kernel/softirq.c Sun Dec 17 17:50:01 2000
@@ -15,6 +15,7 @@
#include <linux/interrupt.h>
#include <linux/smp_lock.h>
#include <linux/init.h>
+#include <linux/tqueue.h>

/*
- No shared variables, all the data are CPU local.
@@ -288,4 +289,28 @@
open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
}

+void __run_task_queue(task_queue *list)
+{
+ struct list_head head, *next;
+ unsigned long flags;

+ spin_lock_irqsave(&tqueue_lock, flags);
+ list_add(&head, list);
+ list_del_init(list);
+ spin_unlock_irqrestore(&tqueue_lock, flags);
+
+ next = head.next;
+ while (next != &head) {
+ void (*f) (void *);
+ struct tq_struct *p;
+
+ p = list_entry(next, struct tq_struct, list);
+ next = next->next;
+ /* Debug: force an oops from people who delete entries */
+ next->prev->next = next->prev->prev = 0;
+ f = p->routine;
+ p->sync = 0;
+ if (f)
+ f(p->data);
+ }
+}
--- linux-2.4.0-test13-pre2/kernel/ksyms.c Tue Dec 12 19:24:23 2000
+++ linux-akpm/kernel/ksyms.c Sun Dec 17 17:50:41 2000
@@ -524,6 +524,7 @@
EXPORT_SYMBOL(remove_bh);
EXPORT_SYMBOL(tasklet_init);
EXPORT_SYMBOL(tasklet_kill);
+EXPORT_SYMBOL(__run_task_queue);

/* init task, for moving kthread roots - ought to export a function ?? */

2000-12-17 18:56:25

by Jamie Lokier

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

Linus Torvalds wrote:
> Ho humm. I'll have to check what the proper fix is. Right now the rule is
> that nobody can _ever_ remove himself from a task queue, although there is
> one bad user that does exactly that, and that means that it should be ok
> to just cache the "next" pointer in run_task_queue(), and make it look
> something like

How about using a sentinel list entry representing the current position
in run_task_queue's loop?

The sentinel's next pointer isn't invalidated by other operations on the
list, provided each operation is protected by tqueue_lock. Each
iteration step is a matter of removing the sentinel and inserting it at
the next position. A task removing itself would then be perfectly OK.
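
To sketch it in user space (simplified, no locking, hypothetical
names): an on-stack cursor node marks the walker's position, is
stepped past each entry before the entry runs, and is skipped when it
belongs to another walker.

#include <stdio.h>

struct node { struct node *next, *prev; int is_sentinel; const char *name; };

static void insert_after(struct node *n, struct node *pos)
{
    n->prev = pos;
    n->next = pos->next;
    pos->next->prev = n;
    pos->next = n;
}

static void del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

int main(void)
{
    struct node head = { &head, &head, 1, "head" };
    struct node a = { NULL, NULL, 0, "a" };
    struct node b = { NULL, NULL, 0, "b" };
    insert_after(&b, &head);
    insert_after(&a, &head);                /* list: head, a, b */

    struct node cursor = { NULL, NULL, 1, "cursor" };  /* on-stack sentinel */
    insert_after(&cursor, &head);

    while (cursor.next != &head) {
        struct node *p = cursor.next;
        del(&cursor);
        insert_after(&cursor, p);           /* step the cursor past p */
        if (!p->is_sentinel)                /* skip other walkers' cursors */
            printf("visiting %s\n", p->name);
        /* p may remove itself here; the cursor's links are untouched */
    }
    del(&cursor);
    return 0;
}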

-- Jamie

2000-12-17 20:33:47

by Linus Torvalds

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong



On Sun, 17 Dec 2000, Jamie Lokier wrote:
>
> How about using a sentinel list entry representing the current position
> in run_task_queue's loop?

Nope.

There may be multiple concurrent run_task_queue's executing, so for now
I've applied Andrew Morton's patch that most closely gets the old
behaviour of having a private list.

HOWEVER, this does need to be revisited. The task-queue handling is
potentially something that should be completely revamped in the future.

Linus

2000-12-17 20:40:46

by Jeff V. Merkey

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

On Sun, Dec 17, 2000 at 12:02:31PM -0800, Linus Torvalds wrote:
>
>
> On Sun, 17 Dec 2000, Jamie Lokier wrote:
> >
> > How about using a sentinel list entry representing the current position
> > in run_task_queue's loop?
>
> Nope.
>
> There may be multiple concurrent run_task_queue's executing, so for now
> I've applied Andrew Morton's patch that most closely gets the old
> behaviour of having a private list.
>
> HOWEVER, this does need to be revisited. The task-queue handling is
> potentially something that should be completely revamped in the future.
>
> Linus


Linus,

Try thinking about the work-to-do model (since task queues are so similar).
Having to "kick" these things should be automatic in the kernel. I could
do a lot of cool stuff with this in there, manos aside.....

:-)

Jeff



2000-12-17 20:54:38

by Linus Torvalds

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

In article <[email protected]>,
Jeff V. Merkey <[email protected]> wrote:
>
>Try thinking about the work-to-do model (since task queues are so similar).
>Having to "kick" these things should be automatic in the kernel. I could
>do a lot of cool stuff with this in there, manos aside.....

No, the "kicking" should _not_ be automatic.

The whole point of a lot of the task queues is that they are manual. The
main use for them (apart from the tty layer, which could easily use
something else) is the disk starter - where we want to delay the
submission of requests to the disk until we've aggregated as many as
possible. Which means that the "tq_disk" queue must absolutely not be
kicked automatically.
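
As a user-space analogy (all names hypothetical), the point of the
manual kick is that requests only accumulate on the queue, and a
single explicit kick submits the whole batch, which is what makes
merging possible:

#include <stdio.h>

#define MAX_BATCH 8

static int batch[MAX_BATCH];
static int pending;

static void queue_request(int sector)
{
    batch[pending++] = sector;      /* aggregate only - no I/O starts here */
}

static void run_queue(void)         /* the explicit "kick" */
{
    printf("starting one pass over %d merged requests\n", pending);
    pending = 0;
}

int main(void)
{
    queue_request(10);
    queue_request(11);              /* adjacent sectors could be merged */
    queue_request(12);
    run_queue();                    /* one kick submits the whole batch */
    return 0;
}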

Linus

2000-12-18 21:23:26

by Jamie Lokier

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

Linus Torvalds wrote:
> On Sun, 17 Dec 2000, Jamie Lokier wrote:
> > How about using a sentinel list entry representing the current position
> > in run_task_queue's loop?
>
> Nope.
>
> There may be multiple concurrent run_task_queue's executing, so for now
> I've applied Andrew Morton's patch that most closely gets the old
> behaviour of having a private list.

I wasn't clear. The sentinel is a local structure on the stack, and
only exists while run_task_queue is executing. Another name for this is
"deletion-safe pointer".

It works very nicely with concurrent run_task_queues: each one inserts
its own sentinel and skips over the others.

-- Jamie

2000-12-19 00:29:32

by Linus Torvalds

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong



On Mon, 18 Dec 2000, Jamie Lokier wrote:
> >
> > Nope.
> >
> > There may be multiple concurrent run_task_queue's executing, so for now
> > I've applied Andrew Morton's patch that most closely gets the old
> > behaviour of having a private list.
>
> I wasn't clear. The sentinel is a local structure on the stack, and
> only exists while run_task_queue is executing. Another name for this is
> "deletion-safe pointer".

Yes, except run_task_queue removes every object it finds. So two
concurrent run_task_queues would be bad.

Sure, we could just make it skip sentinels by adding a magic flag to them,
but that is pretty ugly.

Linus

2000-12-19 02:04:53

by Jamie Lokier

Subject: Re: 2.4.0-test13-pre1 lockup: run_task_queue or tty_io are wrong

Linus Torvalds wrote:
> > I wasn't clear. The sentinel is a local structure on the stack, and
> > only exists while run_task_queue is executing. Another name for this is
> > "deletion-safe pointer".
>
> Yes, except run_task_queue removes every object it finds. So two
> concurrent run_task_queues would be bad.

That could work, but forget it. I've just looked at Andrew's patch and
it's much nicer :-)

If you put a spinlock around the list operations in Andrew's version,
you'd have safe tqueue deletions again (if you felt that was worth
having). With some tricks you could make it a different spinlock, but I
doubt that would be a net benefit.
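
For completeness, a user-space sketch (a pthread mutex standing in for
tqueue_lock, all names hypothetical) of the safe-deletion protocol
Andrew described: lock the queue, walk it to confirm the entry is
still queued, and only then unlink it.

#include <pthread.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

/* Returns 1 if the entry was found and unlinked; 0 means it is no
 * longer queued, so its callback could be running right now - beware. */
static int remove_if_queued(struct list_head *queue, struct list_head *entry)
{
    struct list_head *p;
    int found = 0;

    pthread_mutex_lock(&queue_lock);
    for (p = queue->next; p != queue; p = p->next) {
        if (p == entry) {
            list_del(entry);
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&queue_lock);
    return found;
}

int main(void)
{
    struct list_head queue = { &queue, &queue };
    struct list_head task;

    /* queue the task: insert right after the head */
    task.next = queue.next;
    task.prev = &queue;
    queue.next->prev = &task;
    queue.next = &task;

    printf("removed: %d\n", remove_if_queued(&queue, &task));        /* 1 */
    printf("removed again: %d\n", remove_if_queued(&queue, &task));  /* 0 */
    return 0;
}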

-- Jamie