2005-01-19 17:19:29

by Badari Pulavarty

Subject: 2.6.10-mm1 hang

Hi Andrew,

I was playing with kexec+kdump and ran into this on 2.6.10-mm1.
I have seen similar behaviour on 2.6.10.

I am using a 4-way P-III machine. I have a module which tries
to take the same spinlock twice. When I "insmod" this module,
my system hangs. All my windows froze, no new logins were possible,
the console froze, and it doesn't respond to sysrq. I wasn't expecting
a system hang. Why? Any ideas?

Here is the code.

Thanks,
Badari



Attachments:
test.c (268.00 B)

2005-01-19 21:32:04

by Andrew Morton

Subject: Re: 2.6.10-mm1 hang

Badari Pulavarty wrote:
>
> I was playing with kexec+kdump and ran into this on 2.6.10-mm1.
> I have seen similar behaviour on 2.6.10.
>
> I am using a 4-way P-III machine. I have a module which tries
> gets same spinlock twice. When I try to "insmod" this module,
> my system hangs. All my windows froze, no more new logins,
> console froze, doesn't respond to sysrq. I wasn't expecting
> a system hang. Why ? Ideas ?
>

Maybe all the other CPUs are stuck trying to send an IPI to this one? An
NMI watchdog trace would tell.

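> #include <linux/init.h>
> #include <linux/module.h>	/* for module_init(); assumed, not in the quoted text */
> #include <asm/uaccess.h>
> #include <linux/spinlock.h>
>
> spinlock_t mylock = SPIN_LOCK_UNLOCKED;
>
> static int __init panic_init(void)
> {
> 	spin_lock_irq(&mylock);	/* takes the lock, disables local interrupts */
> 	spin_lock_irq(&mylock);	/* same lock again: spins forever, interrupts off */
> 	return 1;		/* never reached */
> }
>
> module_init(panic_init);	/* assumed, not in the quoted text */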

2005-01-19 22:08:20

by linux-os (Dick Johnson)

Subject: Re: 2.6.10-mm1 hang

On Wed, 19 Jan 2005, Andrew Morton wrote:

> Badari Pulavarty wrote:
>>
>> I was playing with kexec+kdump and ran into this on 2.6.10-mm1.
>> I have seen similar behaviour on 2.6.10.
>>
>> I am using a 4-way P-III machine. I have a module which tries
>> gets same spinlock twice. When I try to "insmod" this module,
>> my system hangs. All my windows froze, no more new logins,
>> console froze, doesn't respond to sysrq. I wasn't expecting
>> a system hang. Why ? Ideas ?
>>
>
> Maybe all the other CPUs are stuck trying to send an IPI to this one? An
> NMI watchdog trace would tell.
>
>> #include <linux/init.h>
>> #include <asm/uaccess.h>
>> #include <linux/spinlock.h>
>> spinlock_t mylock = SPIN_LOCK_UNLOCKED;
>> static int __init panic_init(void)
>> {
>> spin_lock_irq(&mylock);
>> spin_lock_irq(&mylock);
>> return 1;
>> }
> -

What would you expect this to do? After the first lock is
obtained, the second attempt MUST spin forever, or else the
spin-lock doesn't work. The code above just proves that
spin-locks work!
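
The point about the second acquisition can be illustrated outside the kernel. What follows is only a userspace sketch built on C11 atomics, not the kernel's spinlock implementation; the names toy_spinlock, toy_trylock, toy_unlock and toy_lock_bounded are hypothetical, and the spin bound exists only so that the demonstration terminates (a real spin_lock() has no such bound).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy test-and-set lock; hypothetical, not the kernel's spinlock_t. */
typedef struct {
	atomic_flag held;
} toy_spinlock;

/* One acquisition attempt: returns true only if the lock was free. */
static bool toy_trylock(toy_spinlock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

/* Release the lock so that a later attempt can succeed. */
static void toy_unlock(toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Spin for at most max_spins attempts and report whether the lock
 * was obtained. The bound is an assumption added for the demo only.
 */
static bool toy_lock_bounded(toy_spinlock *l, unsigned long max_spins)
{
	assert(max_spins > 0);
	for (unsigned long i = 0; i < max_spins; i++) {
		if (toy_trylock(l))
			return true;
	}
	return false;
}
```

With a lock initialised from ATOMIC_FLAG_INIT, the first toy_lock_bounded() call succeeds and a second call from the same holder fails however large the bound is, until toy_unlock() is called. This models only the self-deadlock of the caller; it says nothing about why the other CPUs stop.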


Cheers,
Dick Johnson
Penguin : Linux version 2.6.10 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.

2005-01-19 22:18:26

by Badari Pulavarty

Subject: Re: 2.6.10-mm1 hang

On Wed, 2005-01-19 at 14:01, linux-os wrote:
> On Wed, 19 Jan 2005, Andrew Morton wrote:
>
> > Badari Pulavarty wrote:
> >>
> >> I was playing with kexec+kdump and ran into this on 2.6.10-mm1.
> >> I have seen similar behaviour on 2.6.10.
> >>
> >> I am using a 4-way P-III machine. I have a module which tries
> >> gets same spinlock twice. When I try to "insmod" this module,
> >> my system hangs. All my windows froze, no more new logins,
> >> console froze, doesn't respond to sysrq. I wasn't expecting
> >> a system hang. Why ? Ideas ?
> >>
> >
> > Maybe all the other CPUs are stuck trying to send an IPI to this one? An
> > NMI watchdog trace would tell.
> >
> >> #include <linux/init.h>
> >> #include <asm/uaccess.h>
> >> #include <linux/spinlock.h>
> >> spinlock_t mylock = SPIN_LOCK_UNLOCKED;
> >> static int __init panic_init(void)
> >> {
> >> spin_lock_irq(&mylock);
> >> spin_lock_irq(&mylock);
> >> return 1;
> >> }
> > -
>
> What would you expect this to do? After the first lock is
> obtained, the second MUST fail forever or else the spin-lock
> doesn't work. The code, above, just proves that spin-locks
> work!
>

I was expecting that one CPU would spin on the lock while the
other 3 CPUs do real useful work (on a 4-proc machine). Instead
my machine is hung: all my windows froze up, no more "ssh", and
it doesn't respond to sysrq to get traces. The only thing it
still does is respond to "ping".

Thanks,
Badari

2005-01-19 22:39:07

by Robert Love

Subject: Re: 2.6.10-mm1 hang

On Wed, 2005-01-19 at 17:01 -0500, linux-os wrote:

> What would you expect this to do? After the first lock is
> obtained, the second MUST fail forever or else the spin-lock
> doesn't work. The code, above, just proves that spin-locks
> work!

He has a four processor machine. Since the lock is local, it is
somewhat odd that the other three lock up.

Robert Love


2005-01-20 15:59:58

by Badari Pulavarty

Subject: Re: 2.6.10-mm1 hang

I see different behaviours on different architectures:

i386 - machine hangs
Power5 ppc64 - only the process hangs
Power3 ppc64 - machine hangs

I modified the module to use spin_lock() instead of spin_lock_irq(),
and things behave the way I was expecting: only the process hangs,
not the system.

You may be right about the other CPUs being stuck on an IPI.
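
The IPI hypothesis, which this thread does not confirm with a trace, can be written down as a toy userspace model. It assumes that a cross-CPU call waits until every other CPU has acknowledged the interrupt, and that a CPU acknowledges only while its local interrupts are enabled. The struct and function names are hypothetical and nothing here is kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state for the model; not a kernel structure. */
struct toy_cpu {
	bool irqs_enabled;
};

/* A CPU stuck in the second spin_lock_irq(): interrupts are off. */
static struct toy_cpu toy_cpu_stuck_in_spin_lock_irq(void)
{
	struct toy_cpu c = { .irqs_enabled = false };
	return c;
}

/* A CPU stuck in the second spin_lock(): it spins, interrupts stay on. */
static struct toy_cpu toy_cpu_stuck_in_spin_lock(void)
{
	struct toy_cpu c = { .irqs_enabled = true };
	return c;
}

/*
 * Model assumption: the sender waits for an acknowledgement from every
 * other CPU, and only a CPU with interrupts enabled can acknowledge.
 * Returns true if the broadcast would complete.
 */
static bool toy_ipi_broadcast_completes(const struct toy_cpu *cpus,
					size_t ncpus, size_t sender)
{
	assert(sender < ncpus);
	for (size_t i = 0; i < ncpus; i++) {
		if (i == sender)
			continue;
		if (!cpus[i].irqs_enabled)
			return false;	/* sender would wait forever */
	}
	return true;
}
```

In this model a 4-CPU array with one CPU from toy_cpu_stuck_in_spin_lock_irq() makes a broadcast from any other CPU fail to complete, while replacing it with toy_cpu_stuck_in_spin_lock() lets the broadcast finish. That matches the spin_lock() result on i386, but it does not explain the Power5 behaviour.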

Thanks,
Badari

On Wed, 2005-01-19 at 13:31, Andrew Morton wrote:
> Badari Pulavarty wrote:
> >
> > I was playing with kexec+kdump and ran into this on 2.6.10-mm1.
> > I have seen similar behaviour on 2.6.10.
> >
> > I am using a 4-way P-III machine. I have a module which tries
> > gets same spinlock twice. When I try to "insmod" this module,
> > my system hangs. All my windows froze, no more new logins,
> > console froze, doesn't respond to sysrq. I wasn't expecting
> > a system hang. Why ? Ideas ?
> >
>
> Maybe all the other CPUs are stuck trying to send an IPI to this one? An
> NMI watchdog trace would tell.
>
> > #include <linux/init.h>
> > #include <asm/uaccess.h>
> > #include <linux/spinlock.h>
> > spinlock_t mylock = SPIN_LOCK_UNLOCKED;
> > static int __init panic_init(void)
> > {
> > spin_lock_irq(&mylock);
> > spin_lock_irq(&mylock);
> > return 1;
> > }
>