Hello!
With 2.4.28-pre3 and 2.4.28-rc1 i see strange situation - sendmail some
times get stuck into release_task() call.
System - Tyan Tiger MPX, dual Athlon MP 2800+ with 1Gb memory.
--- SysRq-T output ---
ksymoops 2.4.9 on i686 2.4.28-rc1. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.28-rc1/ (default)
-m /boot/System.map-2.4.28-rc1 (default)
Reading Oops report from the terminal
sendmail S C012073D 0 15814 1 32701 14365 (NOTLB)
Using defaults from ksymoops -t elf32-i386 -a i386
Call Trace: [<c012073d>] [<c0106582>] [<c0107717>]
sendmail Z 00000000 4 30459 15814 30669 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c011547d>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 30669 15814 30707 30459 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 4 30707 15814 31549 30669 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 2624 31549 15814 31708 30707 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 31708 15814 32269 31549 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 32269 15814 32352 31708 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 20 32352 15814 32403 32269 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 32403 15814 32413 32352 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 624 32413 15814 32468 32403 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 32468 15814 32473 32413 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 32473 15814 32482 32468 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
sendmail Z 00000000 0 32482 15814 32499 32473 (L-TLB)
Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
..... many sendmail zombies ......
Warning (Oops_read): Code line not seen, dumping what data is available
Proc; sendmail
>>EIP; c012073d <release_task+1fd/230> <=====
Trace; c012073d <release_task+1fd/230>
Trace; c0106582 <sys_rt_sigsuspend+122/160>
Trace; c0107717 <system_call+33/38>
Proc; sendmail
>>EIP; 00000000 Before first symbol
Trace; c0120f53 <exit_notify+103/3c0>
Trace; c0121600 <do_exit+3f0/4e0>
Trace; c011547d <smp_apic_timer_interrupt+12d/130>
Trace; c0121725 <sys_exit+15/20>
Trace; c0107717 <system_call+33/38>
Proc; sendmail
>>EIP; 00000000 Before first symbol
Trace; c0120f53 <exit_notify+103/3c0>
Trace; c0121600 <do_exit+3f0/4e0>
Trace; c0121725 <sys_exit+15/20>
Trace; c0107717 <system_call+33/38>
Proc; sendmail
.... same trace with other zombies ......
disassemble show other result - process stuck into free_pages() call:
c0120540 <release_task>:
c0120540: 55 push %ebp
....
c0120736: 89 d8 mov %ebx,%eax
c0120738: e8 73 dd 01 00 call c013e4b0 <free_pages> <= here
c012073d: 83 c4 10 add $0x10,%esp
c0120740: 5b pop %ebx
c0120741: 5e pop %esi
c0120742: c9 leave
....
Any hints ?
--
Best regards, TEMHOTA-RIPN aka MJA13-RIPE
System Administrator. mailto:[email protected]
Hi Andrey,
On Tue, Nov 09, 2004 at 07:24:45PM +0300, Andrey J. Melnikoff (TEMHOTA) wrote:
> Hello!
>
> With 2.4.28-pre3 and 2.4.28-rc1 i see strange situation - sendmail some
> times get stuck into release_task() call.
>
> System - Tyan Tiger MPX, dual Athlon MP 2800+ with 1Gb memory.
>
> --- SysRq-T output ---
> ksymoops 2.4.9 on i686 2.4.28-rc1. Options used
> -V (default)
> -k /proc/ksyms (default)
> -l /proc/modules (default)
> -o /lib/modules/2.4.28-rc1/ (default)
> -m /boot/System.map-2.4.28-rc1 (default)
>
> Reading Oops report from the terminal
> sendmail S C012073D 0 15814 1 32701 14365 (NOTLB)
> Using defaults from ksymoops -t elf32-i386 -a i386
> Call Trace: [<c012073d>] [<c0106582>] [<c0107717>]
> sendmail Z 00000000 4 30459 15814 30669 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c011547d>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 30669 15814 30707 30459 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 4 30707 15814 31549 30669 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 2624 31549 15814 31708 30707 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 31708 15814 32269 31549 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 32269 15814 32352 31708 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 20 32352 15814 32403 32269 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 32403 15814 32413 32352 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 624 32413 15814 32468 32403 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 32468 15814 32473 32413 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 32473 15814 32482 32468 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> sendmail Z 00000000 0 32482 15814 32499 32473 (L-TLB)
> Call Trace: [<c0120f53>] [<c0121600>] [<c0121725>] [<c0107717>]
> ..... many sendmail zombies ......
>
> Warning (Oops_read): Code line not seen, dumping what data is available
>
> Proc; sendmail
>
> >>EIP; c012073d <release_task+1fd/230> <=====
>
> Trace; c012073d <release_task+1fd/230>
> Trace; c0106582 <sys_rt_sigsuspend+122/160>
> Trace; c0107717 <system_call+33/38>
> Proc; sendmail
>
> >>EIP; 00000000 Before first symbol
>
> Trace; c0120f53 <exit_notify+103/3c0>
> Trace; c0121600 <do_exit+3f0/4e0>
> Trace; c011547d <smp_apic_timer_interrupt+12d/130>
> Trace; c0121725 <sys_exit+15/20>
> Trace; c0107717 <system_call+33/38>
> Proc; sendmail
>
> >>EIP; 00000000 Before first symbol
>
> Trace; c0120f53 <exit_notify+103/3c0>
> Trace; c0121600 <do_exit+3f0/4e0>
> Trace; c0121725 <sys_exit+15/20>
> Trace; c0107717 <system_call+33/38>
> Proc; sendmail
>
> .... same trace with other zombies ......
>
>
> disassemble show other result - process stuck into free_pages() call:
>
> c0120540 <release_task>:
> c0120540: 55 push %ebp
> ....
> c0120736: 89 d8 mov %ebx,%eax
> c0120738: e8 73 dd 01 00 call c013e4b0 <free_pages> <= here
is this release_task+1fd? Can you send me the full disassemble of release_task?
It can't be blocked here, its a "call" instruction.
free_pages can't block either. Odd.
It is reproducible?
First wild guess (because I haven't got much of a clue really)
would be to revert the mm/page_alloc.c __free_pages() "fastcall"
gcc3.4 change.
Hi Marcelo,
> > >>EIP; c012073d <release_task+1fd/230> <=====
(...)
> > c0120540 <release_task>:
> > c0120540: 55 push %ebp
> > ....
> > c0120736: 89 d8 mov %ebx,%eax
> > c0120738: e8 73 dd 01 00 call c013e4b0 <free_pages> <= here
>
> is this release_task+1fd? Can you send me the full disassemble of release_task?
Yes it is because the next instruction after call will be at c0120738+5 =
c012073d = release_task+1fd. (the return address on the stack is the
address of the next instruction after the call).
> It can't be blocked here, its a "call" instruction.
Seems rather strange indeed ! Perhaps this is not the disassembled function
of the *running* kernel ? it would be good to disassemble vmlinux and ensure
that it is exactly the one currently running. I too have already lost lots
of time searching a wrong bug because I disassembled the wrong kernel, so
I'm certain it can happen even when we're very careful :-(
> free_pages can't block either. Odd.
Marcelo, I have two questions for my own understanding :
- free_pages does spin_lock(&zone->lock) around the while() loop.
Considering that someone else could hold the lock (bug, etc...), it
could block here. But my feeling is that if such a lock were kept held,
the system would be totally frozen because everything which would want
to free memory would get stuck (even a process exit). Am I right ?
- would it enhance performance a bit to put a bunch of 'unlikely()' in all
the ifs which end in BUG(), especially inside the loop ?
Regards,
Willy
On Thu, Nov 11, 2004 at 09:33:12AM +0100, Willy Tarreau wrote:
> Hi Marcelo,
>
> > > >>EIP; c012073d <release_task+1fd/230> <=====
> (...)
> > > c0120540 <release_task>:
> > > c0120540: 55 push %ebp
> > > ....
> > > c0120736: 89 d8 mov %ebx,%eax
> > > c0120738: e8 73 dd 01 00 call c013e4b0 <free_pages> <= here
> >
> > is this release_task+1fd? Can you send me the full disassemble of release_task?
>
> Yes it is because the next instruction after call will be at c0120738+5 =
> c012073d = release_task+1fd. (the return address on the stack is the
> address of the next instruction after the call).
OK.
> > It can't be blocked here, its a "call" instruction.
>
> Seems rather strange indeed ! Perhaps this is not the disassembled function
> of the *running* kernel ? it would be good to disassemble vmlinux and ensure
> that it is exactly the one currently running. I too have already lost lots
> of time searching a wrong bug because I disassembled the wrong kernel, so
> I'm certain it can happen even when we're very careful :-(
>
> > free_pages can't block either. Odd.
>
> Marcelo, I have two questions for my own understanding :
> - free_pages does spin_lock(&zone->lock) around the while() loop.
> Considering that someone else could hold the lock (bug, etc...), it
> could block here. But my feeling is that if such a lock were kept held,
> the system would be totally frozen because everything which would want
> to free memory would get stuck (even a process exit). Am I right ?
Right, the system will be totally frozen spinning on the lock.
> - would it enhance performance a bit to put a bunch of 'unlikely()' in all
> the ifs which end in BUG(), especially inside the loop ?
Yes, it should generate better code.
Try it and see how the generated code differs from the original without unlikely.
I'm not aware of the internals of unlikely however, so I can't
explain how it works in details... the GCC documentation
should do it. :)
Hello Willy Tarreau!
In article <[email protected]> you wrote:
> > > >>EIP; c012073d <release_task+1fd/230> <=====
> (...)
> > > c0120540 <release_task>:
> > > c0120540: 55 push %ebp
> > > ....
> > > c0120736: 89 d8 mov %ebx,%eax
> > > c0120738: e8 73 dd 01 00 call c013e4b0 <free_pages> <= here
> >
> > is this release_task+1fd? Can you send me the full disassemble of release_task?
> Yes it is because the next instruction after call will be at c0120738+5 =
> c012073d = release_task+1fd. (the return address on the stack is the
> address of the next instruction after the call).
> > It can't be blocked here, its a "call" instruction.
> Seems rather strange indeed ! Perhaps this is not the disassembled function
> of the *running* kernel ?
This is also strange for me. Stack trace should point into __free_pages_ok()
address space. Only this function work with lock's.
> it would be good to disassemble vmlinux and ensure that it is exactly
> the one currently running.
This is code from vmlinux-2.4.28-rc1.
> I too have already lost lots of time searching a wrong bug because I
> disassembled the wrong kernel, so I'm certain it can happen even when
> we're very careful :-(
[skipp]
PS: Please, keep CC: to me.