2002-08-30 20:54:07

by Terence Ripperda

[permalink] [raw]
Subject: lockup on Athlon systems, kernel race condition?

I'm working on a problem where leaving viewperf running overnight with our drivers on a dual Athlon system locks up. The same problem is not reproducible on a dual P3 or dual P4.

Using kdb, I was able to break in and take a look at things once they hung. It appears to be a core kernel race condition. I'm not 100% sure, so I wanted to lay the information out for you to see if the diagnosis looks correct.


The debugger breaks into cpu 1, with the following backtrace:

Entering kdb (current=0xe1eea000, pid 2103) on processor 1 due to Keyboard Entry
[1]kdb> bt
EBP EIP Function (args)
0xe1eeb910 0xc0115739 smp_call_function+0x99 (0xc0115550, 0x0, 0x1, 0x1, 0x400000)
kernel .text 0xc0100000 0xc01156a0 0xc0115790
0xe1eeb92c 0xc01155b4 flush_tlb_all+0x14 (0x1000, 0xf9b82000, 0x2d29f000, 0xc0101f9c, 0x73)
kernel .text 0xc0100000 0xc01155a0 0xc0115600
0xe1eeb96c 0xc011883f remap_area_pages+0x1ef (0xf9c00000, 0x2d29f000, 0x1000, 0x10, 0x1000)
kernel .text 0xc0100000 0xc0118650 0xc0118850
0xe1eeb994 0xc011890c __ioremap+0xbc (0x2d29f000, 0x1000, 0x10, 0xf891eb7f, 0x2d29f000)
kernel .text 0xc0100000 0xc0118850 0xc0118940
0xe1eeb9c4 0xf890cf3c [NVdriver]os_map_kernel_space+0x68 (0x2d29f000, 0x1000, 0x1, 0xf891da57, 0xf9b82000)
NVdriver .text 0xf8909060 0xf890ced4 0xf890cf80
... nvidia agp functions ...
0xe1eebf94 0xf890a7a4 [NVdriver]nv_kern_ioctl+0x2d0 (0xf5099380, 0xeb01ccc0, 0xc0284627, 0xbfffe810, 0xe1eea000)
NVdriver .text 0xf8909060 0xf890a4d4 0xf890a7c4
0xe1eebfbc 0xc0154d29 sys_ioctl+0x209 (0x6, 0xc0284627, 0xbfffe810, 0x2100, 0x8147048)
kernel .text 0xc0100000 0xc0154b20 0xc0154d90
0xc0109415 system_call+0x55
kernel .text 0xc0100000 0xc01093c0 0xc010941c

the basic idea is that when handling agp allocations, we're mapping and unmapping pages, which is triggering tlb flushes. (this lockup occurs with nvagp or agpgart). I originally suspected we were causing a race condition by holding a lock, then calling a blocking kernel function, but that is not the case. We are not holding any locks here.

checking out the other cpu reveals:

[1]kdb> cpu 0

Entering kdb (current=0xf1612000, pid 1661) on processor 0 due to cpu switch
[0]kdb> bt
EBP EIP Function (args)
0xc0154d97 .text.lock.ioctl+0x7
kernel .text 0xc0100000 0xc0154d90 0xc0154dc0
0xf1613fbc 0xc0154b8a sys_ioctl+0x6a (0x4, 0x541b, 0xbfffe15c, 0x805b050, 0x805ab10)
kernel .text 0xc0100000 0xc0154b20 0xc0154d90
0xc0109415 system_call+0x55
kernel .text 0xc0100000 0xc01093c0 0xc010941c

disassembling the code at cpu 0's EIP shows:

0xc0154b8a <sys_ioctl+106>: lock decb 0xc03cae40

looking up 0xc03cae40 in the System.map files shows:

c03cae40 D kernel_flag_cacheline


This code appears to be the following code from fs/ioctl.c (grabbing the big kernel lock):

asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
struct file * filp;
unsigned int flag;
int on, error = -EBADF;

filp = fget(fd);
if (!filp)
goto out;
error = 0;
lock_kernel(); <====
switch (cmd) {

Recall that the earlier backtrace on cpu 1 was also handling an ioctl, which suggests cpu 1 has the big kernel lock (by nature of being in an ioctl), and cpu 0 is attempting to grab the big kernel lock. While handling the ioctl, cpu 1 is mapping/unmapping pages, which causes an IPI to flush all cpu's tlbs. cpu 0 cannot respond to the IPI, due to being in an atomic operation grabbing the big kernel lock. But cpu 1 has the lock and won't surrender the lock until cpu 0 finishes the IPI.

I believe this is a core kernel race condition. I would also suspect that the same problem could happen with other drivers when run through such a stress test. Perhaps the same problem could be reproduced with another graphics card (calling to the kernel to allocate (agp) memory) left running viewperf overnight.


Thanks,
Terence


2002-08-30 21:13:04

by Andrew Morton

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

Terence Ripperda wrote:
>
> ...
>
> asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
> {
> struct file * filp;
> unsigned int flag;
> int on, error = -EBADF;
>
> filp = fget(fd);
> if (!filp)
> goto out;
> error = 0;
> lock_kernel(); <====
> switch (cmd) {

This CPU is spinning, waiting for kernel_flag. It will take the IPI
and the other CPU's smp_call_function() will succeed.

Possibly the IPI has got lost - seems that this is a popular failure mode
for flakey chipsets/motherboards.

Or someone has called sys_ioctl() with interrupts disabled. That's very
doubtful.

2002-08-31 00:33:15

by Alan

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

Duplicate the problem without NVdriver having ever been loaded that
boot, then it might be interesting

2002-09-03 18:31:14

by Terence Ripperda

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

On Fri, Aug 30, 2002 at 02:15:14PM -0700, [email protected] wrote:
> This CPU is spinning, waiting for kernel_flag. It will take the IPI
> and the other CPU's smp_call_function() will succeed.

thank you for the clarification, Andrew. I was unclear whether the 'lock' command would block out the IPI or not.

> Possibly the IPI has got lost - seems that this is a popular failure mode
> for flakey chipsets/motherboards.

this sounds like the most likely candidate. I'm working on tracking down documentation for further study. Is there an easy way to determine this as the cause?

> Or someone has called sys_ioctl() with interrupts disabled. That's very
> doubtful.

I agree, that's highly doubtful.

one of our QA guys tried to reproduce this issue on an ATI card, but was unable to get through viewperf long enough to repro this problem before hitting other system lockups. I'm also installing an ATI card to try and see if I can reproduce the problem.

Terence

2002-09-03 18:51:38

by Andrew Morton

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

Terence Ripperda wrote:
>
> ..
> > Possibly the IPI has got lost - seems that this is a popular failure mode
> > for flakey chipsets/motherboards.
>
> this sounds like the most likely candidate. I'm working on tracking down
> documentation for further study. Is there an easy way to determine this
> as the cause?

Some systems will drop nasty messages in the logs when APIC checksum
errors are detected. And such systems will also be prone to lockups
due to failed delivery. But whether IPIs can be lost without any such
warning signs: don't know, sorry.

2002-09-03 20:41:30

by Alan

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

On Tue, 2002-09-03 at 19:54, Andrew Morton wrote:
> Some systems will drop nasty messages in the logs when APIC checksum
> errors are detected. And such systems will also be prone to lockups
> due to failed delivery. But whether IPIs can be lost without any such
> warning signs: don't know, sorry.

Our IPI code requires the other end ack, so its pretty much impossible
for it to occur at the software end. The hardware in theory can also not
lose messages both because of queueing and checksums. On pentiums there
are documented errata with 3 interrupts matched to the same priority but
not on Athlon

2002-09-03 21:42:27

by Manfred Spraul

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

> Terence Ripperda wrote:
>>
>> ...
>>
>> asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
>> {
>> struct file * filp;
>> unsigned int flag;
>> int on, error = -EBADF;
>>
>> filp = fget(fd);
>> if (!filp)
>> goto out;
>> error = 0;
>> lock_kernel(); <====
Which compiler to you use, and which kernel? Which additional patches?

With my 2.4.20-pre4-ac1 kernel, the lock_kernel is at offset +3a,
according to your dump it's at +6a.

>> switch (cmd) {
>
> This CPU is spinning, waiting for kernel_flag. It will take the IPI
> and the other CPU's smp_call_function() will succeed.
>
> Possibly the IPI has got lost - seems that this is a popular failure mode
> for flakey chipsets/motherboards.
>
> Or someone has called sys_ioctl() with interrupts disabled. That's very
> doubtful.

Is it possible to display the cpu registers with kdb? Could you check
that the interrupts are enabled?

I'd add a quick test into sys_ioctl() or lock_kernel: save_flags, and
check that bit 9 is always enabled. Check __global_cli for sample code.
The X server probably runs with enough priveledges to disable the
interrupts, perhaps it's doing something stupid.

--
Manfred


2002-09-03 22:00:15

by Terence Ripperda

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

On Tue, Sep 03, 2002 at 11:46:11PM +0200, [email protected] wrote:
> Which compiler to you use, and which kernel? Which additional patches?
>
> With my 2.4.20-pre4-ac1 kernel, the lock_kernel is at offset +3a,
> according to your dump it's at +6a.

I'm using gcc 2.95.4 (on debian). The kernel I'm currently using is 2.4.19, with the kdb patch and apic_routing patches applied. Neither of these is required to reproduce the problem, they're just to help debug the problem.

that reminds me. when I originally tried to use the nmi watchdog to break into the debugger when the hang occured, I got the following at bootup:

Sep 3 14:56:49 localhost kernel: activating NMI Watchdog ... done.
Sep 3 14:56:49 localhost kernel: testing NMI watchdog ... CPU#0: NMI appears to be stuck!


> Is it possible to display the cpu registers with kdb? Could you check
> that the interrupts are enabled?

I don't have things caught in the debugger currently, but I did check the registers at the time. bit 9 of eflags was enabled on both cpus.

Terence

2002-09-04 15:01:13

by Manfred Spraul

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

Terence Ripperda wrote:
>
> I don't have things caught in the debugger currently, but I did check the registers at the time. bit 9 of eflags was enabled on both cpus.
>
That rules out a disabled interrupt flag.

What was cpu1 doing? There are 3 loops in smp_call_function:
spin_lock(&call_lock), and 2 while loops with barrier() [first one for
IPI arrival, second one for synchroneous ipi's. TLB flush is sync].

--
Manfred

2002-09-04 16:13:20

by Terence Ripperda

[permalink] [raw]
Subject: Re: lockup on Athlon systems, kernel race condition?

On Wed, Sep 04, 2002 at 05:05:58PM +0200, [email protected] wrote:
> What was cpu1 doing? There are 3 loops in smp_call_function:
> spin_lock(&call_lock), and 2 while loops with barrier() [first one for
> IPI arrival, second one for synchroneous ipi's. TLB flush is sync].

it was spinning on the first barrier loop, waiting for verification of the IPI arrival:

/* Wait for response */
while (atomic_read(&data.started) != cpus)
barrier();


Terence