2002-08-05 17:18:05

by Marc-Christian Petersen

Subject: AIO together with SMPtimers-A0 oops and freezing

Hi Ben, Hi Ingo,

Ben, I am using your AIO 20020619 patch + relevant fixes from the AIO
mailinglist together with your patch Ingo, SMPtimers-A0.

As almost everything in WOLK is selectable via kernel configuration, I was able
to track down the issue we've experienced with both together. If I use AIO
without SMPtimers, Oracle 9i works horribly fine :), no oops, no panic, no
freeze, just fast as light ;) ... Unfortunately, if I use them both, AIO +
SMPtimers, and put some heavy traffic on the Oracle 9i database, the system
oopses and after that it is almost frozen (only sysrq works).

If I use either AIO _or_ SMPtimers, no problem occurs, all works fine.

Is there any chance of getting the two to cooperate with each other? :)
Otherwise I have to disable SMPtimers completely in the kernel config if AIO
gets selected, as I am not able to fix that issue myself :|


Kernel is : 2.4.18-wolk3.5-rc4 (sf.net/projects/wolk)
Hardware is: Compaq ML570, Quad Xeon 900MHz, 16GB RAM, u2w scsi raid.
System is : Debian Woody 3.0r0


Many thanks for your help and your time!!


Error follows:
--------------

Unable to handle kernel paging request at virtual address 00c03ab9
printing eip:
c01221e1
*pde = 00000000
Oops: 0002
CPU: 1
EIP: 0010:[<c01221e1>] Not tainted
EFLAGS: 00010002
eax: 08c03ab5 ebx: d1008000 ecx: c03ab501 edx: 00c03ab5
esi: c03ab520 edi: 10c03ab5 ebp: 10c03ab5 esp: d1009f38
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=d1009000)
Stack: d1008000 00000080 00001020 c03ab520 00000086 c0122844 c03ab520 00000080
       00000001 00000000 00000000 c011252d d1008000 d1008000 c0105470 c02af3a8
       d1009f7c c0105470 d1008000 d1008000 d1008000 c0105470 00000000 00000000
Call Trace: [<c0122884>] [<c011252d>] [<c0105470>] [<c105470>] [<c105470>]
[<c0105499>] [<c0105502>] [<c0119326>] [<c01191c4>]

Code: 89 42 04 89 10 c7 41 04 00 00 00 00 c7 01 00 00 00 00 89 4e
>>EIP; c01221e1 <del_timer_sync+32d/a44> <=====

>>eax; 08c03ab5 Before first symbol
>>ebx; d1008000 <___strtok+10c35f1c/11cebf1c>
>>ecx; c03ab501 <xtime+1011/1530>
>>edx; 00c03ab5 Before first symbol
>>esi; c03ab520 <xtime+1030/1530>
>>edi; 10c03ab5 Before first symbol

Trace; c0122884 <del_timer_sync+9d0/a44>
Trace; c011252d <smp_call_function+7e9/1d24>
Trace; c0105470 <enable_hlt+8/190>
Trace; 0c105470 Before first symbol
Trace; 0c105470 Before first symbol
Trace; c0105499 <enable_hlt+31/190>
Trace; c0105502 <enable_hlt+9a/190>
Trace; c0119326 <acquire_console_sem+136/164>
Trace; c01191c4 <printk+144/170>

Code; c01221e1 <del_timer_sync+32d/a44>
00000000 <_EIP>:
Code; c01221e1 <del_timer_sync+32d/a44> <=====
0: 89 42 04 mov %eax,0x4(%edx) <=====
Code; c01221e4 <del_timer_sync+330/a44>
3: 89 10 mov %edx,(%eax)
Code; c01221e6 <del_timer_sync+332/a44>
5: c7 41 04 00 00 00 00 movl $0x0,0x4(%ecx)
Code; c01221ed <del_timer_sync+339/a44>
c: c7 01 00 00 00 00 movl $0x0,(%ecx)
Code; c01221f3 <del_timer_sync+33f/a44>
12: 89 4e 00 mov %ecx,0x0(%esi)
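
Read against the 2.4 list primitives, the instructions above look like a
doubly linked list unlink followed by clearing the removed entry's pointers,
which is what detaching a timer from its list involves. The sketch below is a
userspace model of that pattern, not the kernel source; the function names and
the register-to-variable mapping in the comments are assumptions inferred from
the disassembly. With next at offset 0 and prev at offset 4 on 32-bit x86, the
faulting store would be next->prev = prev with a bogus next pointer in edx
(00c03ab5).

```c
#include <assert.h>
#include <stddef.h>

/* Same field order as the kernel's struct list_head: next first, then prev
 * (offsets 0 and 4 on 32-bit x86). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* An empty list points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and clear its pointers, in the same order as the oops code.
 * The register names in the comments are an inferred mapping. */
static void detach(struct list_head *entry)
{
    struct list_head *next = entry->next; /* edx */
    struct list_head *prev = entry->prev; /* eax */

    next->prev = prev;   /* mov %eax,0x4(%edx)  <- the faulting store */
    prev->next = next;   /* mov %edx,(%eax) */
    entry->prev = NULL;  /* movl $0x0,0x4(%ecx) */
    entry->next = NULL;  /* movl $0x0,(%ecx) */
}

static int list_empty(const struct list_head *head)
{
    return head->next == head;
}
```

If entry->next has been overwritten, for example because the memory holding
the entry was reused, the first store in detach() goes through a garbage
pointer, which would match an "Oops: 0002" (write fault) at that instruction.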



--
Kind regards
Marc-Christian Petersen

http://sourceforge.net/projects/wolk

PGP/GnuPG Key: 1024D/408B2D54947750EC
Fingerprint: 8602 69E0 A9C2 A509 8661 2B0B 408B 2D54 9477 50EC
Key available at http://www.keyserver.net. Encrypted e-mail preferred.


2002-08-06 20:03:50

by Benjamin LaHaise

Subject: Re: AIO together with SMPtimers-A0 oops and freezing

On Mon, Aug 05, 2002 at 07:20:29PM +0200, Marc-Christian Petersen wrote:
> Hi Ben, Hi Ingo,

> Ben, I am using your AIO 20020619 patch + relevant fixes from the AIO
> mailinglist together with your patch Ingo, SMPtimers-A0.

Hmmm, the only problem I can see in the aio code wrt timer usage is
the following. Does this patch make a difference? If not, I'm guessing
that the problem is something in SMPtimers-A0 that aio happens to
trigger. The only timer aio uses is for the timeout when waiting for an
event, and the structure for that is put on the stack.

-ben


Index: aio.c
===================================================================
RCS file: /bcrl/cvs/CVSROOT/net-aio/linux/fs/aio.c,v
retrieving revision 1.13
diff -u -u -r1.13 aio.c
--- aio.c 6 Aug 2002 20:02:23 -0000 1.13
+++ aio.c 6 Aug 2002 20:04:40 -0000
@@ -774,8 +774,10 @@
 			goto out;

 		set_timeout(&to, &ts);
-		if (to.timed_out)
+		if (to.timed_out) {
 			timeout = 0;
+			clear_timeout(&to);
+		}
 	}

 	while (likely(i < nr)) {
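
The hazard with a timer structure on the stack is that it must be unlinked
from the pending list before the function returns, otherwise the list keeps a
pointer into a dead stack frame. The following is a minimal userspace model of
that rule, assuming a simple pending list; sketch_timer, arm(), disarm() and
wait_sketch() are hypothetical names and do not reproduce set_timeout() or
clear_timeout() from aio.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a timer that links itself into a pending list. */
struct sketch_timer {
    struct sketch_timer *next;
    struct sketch_timer *prev;
};

/* Global list of pending timers; an empty list points at itself. */
static struct sketch_timer pending = { &pending, &pending };

/* Link the timer at the front of the pending list. */
static void arm(struct sketch_timer *t)
{
    t->next = pending.next;
    t->prev = &pending;
    pending.next->prev = t;
    pending.next = t;
}

/* Unlink the timer if it is linked; safe to call twice. */
static void disarm(struct sketch_timer *t)
{
    if (t->next == NULL)
        return;
    t->next->prev = t->prev;
    t->prev->next = t->next;
    t->next = NULL;
    t->prev = NULL;
}

/* Returns 1 if the stack timer would still be on the pending list at the
 * point where the function is about to return, 0 otherwise. */
static int wait_sketch(int clear_on_exit)
{
    struct sketch_timer to = { NULL, NULL }; /* lives in this stack frame */
    int still_linked;

    arm(&to);
    if (clear_on_exit)
        disarm(&to);

    still_linked = (pending.next != &pending);

    /* Demo only: unlink unconditionally so the program itself never leaves
     * a dangling pointer behind. Real code that skipped the clear would
     * return here with the list still pointing into this frame. */
    disarm(&to);
    return still_linked;
}
```

Calling wait_sketch(0) reports the leak that a later list walk or unlink would
trip over, while wait_sketch(1) leaves the pending list empty.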

2002-08-06 22:32:18

by J.A. Magallon

Subject: Re: AIO together with SMPtimers-A0 oops and freezing


On 2002.08.06 Benjamin LaHaise wrote:
>On Mon, Aug 05, 2002 at 07:20:29PM +0200, Marc-Christian Petersen wrote:
>> Hi Ben, Hi Ingo,
>
>> Ben, I am using your AIO 20020619 patch + relevant fixes from the AIO
>> mailinglist together with your patch Ingo, SMPtimers-A0.
>
>Hmmm, the only problem I can see in the aio code wrt timer usage is
>the following. Does this patch make a difference? If not, I'm guessing
>that the problem is something in SMPtimers-A0 that aio happens to
>trigger. The only timer aio uses is for the timeout when waiting for an
>event, and the structure for that is put on the stack.
>

Hmm, I forgot to comment, but I apply smptimers on top of latest -aa, that
includes aio (is it different implementation?), and the kernel works fine.

--
J.A. Magallon \ Software is like sex: It's better when it's free
mailto:[email protected] \ -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-jam0, Mandrake Linux 9.0 (Cooker) for i586
gcc (GCC) 3.2 (Mandrake Linux 9.0 3.2-0.2mdk)

2002-08-06 23:11:08

by Benjamin LaHaise

Subject: Re: AIO together with SMPtimers-A0 oops and freezing

On Wed, Aug 07, 2002 at 12:35:50AM +0200, J.A. Magallon wrote:
> Hmm, I forgot to comment, but I apply smptimers on top of latest -aa, that
> includes aio (is it different implementation?), and the kernel works fine.

That would point to a merge error, or one of the changes that -aa made as
being relevant. Someone has to extract the differences to track it down.

-ben
--
"You will be reincarnated as a toad; and you will be much happier."

2002-08-06 23:32:47

by J.A. Magallon

Subject: Re: AIO together with SMPtimers-A0 oops and freezing


On 2002.08.07 Benjamin LaHaise wrote:
>On Wed, Aug 07, 2002 at 12:35:50AM +0200, J.A. Magallon wrote:
>> Hmm, I forgot to comment, but I apply smptimers on top of latest -aa, that
>> includes aio (is it different implementation?), and the kernel works fine.
>
>That would point to a merge error, or one of the changes that -aa made as
>being relevant. Someone has to extract the differences to track it down.
>

Latest thing I run is here:

http://giga.cps.unizar.es/~magallon/linux/kernel/2.4.19-jam0/30-smptimers-A0.bz2

It applies on top of 00-aa0.bz2 (in the same location), which is a port of
the latest -aa (-rc5-aa1) to 2.4.19-final.
But the aio patches from -aa are accessible directly.


--
J.A. Magallon \ Software is like sex: It's better when it's free
mailto:[email protected] \ -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-jam0, Mandrake Linux 9.0 (Cooker) for i586
gcc (GCC) 3.2 (Mandrake Linux 9.0 3.2-0.2mdk)

2002-08-09 20:00:26

by Marc-Christian Petersen

Subject: Re: AIO together with SMPtimers-A0 oops and freezing

On Wednesday 07 August 2002 01:14, Benjamin LaHaise wrote:

Hi Benjamin,

>> Ben, I am using your AIO 20020619 patch + relevant fixes from the AIO
>> mailinglist together with your patch Ingo, SMPtimers-A0.

> Hmmm, the only problem I can see in the aio code wrt timer usage is
> the following. Does this patch make a difference? If not, I'm guessing
> that the problem is something in SMPtimers-A0 that aio happens to
> trigger. The only timer aio uses is for the timeout when waiting for an
> event, and the structure for that is put on the stack.

Perfect!! That really small patch makes it work perfectly. The system has been
up for some hours now, stress testing Oracle, without any oops() or any
other problem.

Thanks a lot, Ben! :-)

I'll let you know after some days of stress testing Oracle whether it still
works. I expect it will! :)


--
Kind regards
Marc-Christian Petersen

http://sourceforge.net/projects/wolk

PGP/GnuPG Key: 1024D/569DE2E3DB441A16
Fingerprint: 3469 0CF8 CA7E 0042 7824 080A 569D E2E3 DB44 1A16
Key available at http://www.keyserver.net. Encrypted e-mail preferred.