2004-11-27 13:11:35

by Nick Warne

Subject: kswapd0 oops -> debug information

Hi all,

I keep getting this oops at random, so ('RIGHT, YOU BUGGER') I have
attempted to supply proper debug info. What follows is as far as I got with
what I learnt today, so I am a bit stuck after finding the area of code.

ksymoops provides:

>>EIP; c0151239 <__iget+29/4c> <=====
Code; c015120e <init_once+1a/1c>
00000000 <_EIP>:
Code; c015120e <init_once+1a/1c>
0: 76 00 jbe 2 <_EIP+0x2> c0151210 <__iget+0/4c>
Code; c0151210 <__iget+0/4c>
2: 53 push %ebx
Code; c0151211 <__iget+1/4c>
3: 8b 5c 24 08 mov 0x8(%esp,1),%ebx
Code; c0151215 <__iget+5/4c>
7: 8b 43 1c mov 0x1c(%ebx),%eax
Code; c0151218 <__iget+8/4c>
a: 85 c0 test %eax,%eax
Code; c015121a <__iget+a/4c>
c: 74 05 je 13 <_EIP+0x13> c0151221 <__iget+11/4c>
Code; c015121c <__iget+c/4c>
e: ff 43 1c incl 0x1c(%ebx)
Code; c015121f <__iget+f/4c>
11: eb 38 jmp 4b <_EIP+0x4b> c0151259 <__iget+49/4c>
Code; c0151221 <__iget+11/4c>
13: ff 43 1c incl 0x1c(%ebx)
Code; c0151224 <__iget+14/4c>
16: f6 83 1c 01 00 00 0f testb $0xf,0x11c(%ebx)
Code; c015122b <__iget+1b/4c>
1d: 75 26 jne 45 <_EIP+0x45> c0151253 <__iget+43/4c>
Code; c015122d <__iget+1d/4c>
1f: 8d 53 08 lea 0x8(%ebx),%edx
Code; c0151230 <__iget+20/4c>
22: 8b 4a 04 mov 0x4(%edx),%ecx
Code; c0151233 <__iget+23/4c>
25: 8b 43 08 mov 0x8(%ebx),%eax
Code; c0151236 <__iget+26/4c>
28: 89 48 04 mov %ecx,0x4(%eax)
Code; c0151239 <__iget+29/4c> <=====
2b: 89 01 mov %eax,(%ecx) <=====
Code; c015123b <__iget+2b/4c>
2d: a1 6c 9c 30 c0 mov 0xc0309c6c,%eax
Code; c0151240 <__iget+30/4c>
32: 89 50 04 mov %edx,0x4(%eax)
Code; c0151243 <__iget+33/4c>
35: 89 43 08 mov %eax,0x8(%ebx)
Code; c0151246 <__iget+36/4c>
38: c7 42 04 6c 9c 30 c0 movl $0xc0309c6c,0x4(%edx)
Code; c015124d <__iget+3d/4c>
3f: 89 .byte 0x89
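
For reference, each ksymoops annotation of the form <symbol+offset/size>
gives the hex offset into the function and the function's total size. The
sketch below is only an illustration (the decode() helper is hypothetical,
not part of ksymoops); it recovers the function start from the EIP line and
shows that it agrees with the "Code; c0151210 <__iget+0/4c>" line above.

```python
import re

# Matches e.g. "c0151239 <__iget+29/4c>": address, symbol, offset, size (all hex).
SYM_RE = re.compile(r'([0-9a-f]+) <(\w+)\+([0-9a-f]+)/([0-9a-f]+)>')


def decode(annotation):
    """Split a ksymoops '<symbol+offset/size>' annotation into its parts."""
    m = SYM_RE.search(annotation)
    if m is None:
        raise ValueError("no <symbol+offset/size> annotation found")
    address = int(m.group(1), 16)
    offset = int(m.group(3), 16)
    size = int(m.group(4), 16)
    return {
        "symbol": m.group(2),
        "address": address,
        "offset": offset,
        "size": size,
        "start": address - offset,  # first byte of the function
    }


info = decode(">>EIP; c0151239 <__iget+29/4c> <=====")
print("%s starts at %x, fault at +0x%x of 0x%x bytes"
      % (info["symbol"], info["start"], info["offset"], info["size"]))
# prints: __iget starts at c0151210, fault at +0x29 of 0x4c bytes
```

The faulting address (c0151239 here) is the value one would look up in a
symbol map or a debug-enabled vmlinux.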



I have traced this code to fs/inode.c. Producing assembler of inode.c gives
this (snipped):

__iget:
pushl %ebx
movl 8(%esp),%ebx
movl 28(%ebx),%eax
testl %eax,%eax
je .L3337
#APP
incl 28(%ebx)
#NO_APP
jmp .L3336
.p2align 4,,7
.L3337:
#APP
incl 28(%ebx)
#NO_APP
testb $15,284(%ebx)
jne .L3340
leal 8(%ebx),%edx
movl 4(%edx),%ecx
movl 8(%ebx),%eax
movl %ecx,4(%eax)
movl %eax,(%ecx) <===== >>EIP; c0151239 <__iget+29/4c>
movl inode_in_use,%eax
movl %edx,4(%eax)
movl %eax,8(%ebx)
movl $inode_in_use,4(%edx)
movl %edx,inode_in_use
.L3340:


Which quite nicely matches the ksymoops output. My books tell me the inode.s
file _should_ give me line numbers in inode.c so I can then locate area of
code - but I can't see how to match the produced assembler to the C source.

Hope this helps someone - and if you know how to get assembler code to match C
code via line numbers, I would like to know please.

TIA,

Nick
--
"When you're chewing on life's gristle,
Don't grumble, Give a whistle..."


2004-11-27 17:05:49

by Randy.Dunlap

Subject: Re: kswapd0 oops -> debug information

Nick Warne wrote:
> Hi all,
>
> I keep getting this oops at random, so ('RIGHT, YOU BUGGER') I have
> attempted to supply proper debug info. What follows is as far as I got with
> what I learnt today, so I am a bit stuck after finding the area of code.

kernel version?
.config file?
full oops message, with stack backtrace?
The stack backtrace could tell us who a bad caller is.
It can just be a caller's problem, not a bug in (this)
one isolated function.

Did you read/check linux/REPORTING-BUGS ?

> ksymoops provides:
>
>
>>>EIP; c0151239 <__iget+29/4c> <=====
>
> Code; c015120e <init_once+1a/1c>
> 00000000 <_EIP>:
> Code; c015120e <init_once+1a/1c>
> 0: 76 00 jbe 2 <_EIP+0x2> c0151210 <__iget+0/4c>
> Code; c0151210 <__iget+0/4c>
> 2: 53 push %ebx
> Code; c0151211 <__iget+1/4c>
> 3: 8b 5c 24 08 mov 0x8(%esp,1),%ebx
> Code; c0151215 <__iget+5/4c>
> 7: 8b 43 1c mov 0x1c(%ebx),%eax
> Code; c0151218 <__iget+8/4c>
> a: 85 c0 test %eax,%eax
> Code; c015121a <__iget+a/4c>
> c: 74 05 je 13 <_EIP+0x13> c0151221 <__iget+11/4c>
> Code; c015121c <__iget+c/4c>
> e: ff 43 1c incl 0x1c(%ebx)
> Code; c015121f <__iget+f/4c>
> 11: eb 38 jmp 4b <_EIP+0x4b> c0151259 <__iget+49/4c>
> Code; c0151221 <__iget+11/4c>
> 13: ff 43 1c incl 0x1c(%ebx)
> Code; c0151224 <__iget+14/4c>
> 16: f6 83 1c 01 00 00 0f testb $0xf,0x11c(%ebx)
> Code; c015122b <__iget+1b/4c>
> 1d: 75 26 jne 45 <_EIP+0x45> c0151253 <__iget+43/4c>
> Code; c015122d <__iget+1d/4c>
> 1f: 8d 53 08 lea 0x8(%ebx),%edx
> Code; c0151230 <__iget+20/4c>
> 22: 8b 4a 04 mov 0x4(%edx),%ecx
> Code; c0151233 <__iget+23/4c>
> 25: 8b 43 08 mov 0x8(%ebx),%eax
> Code; c0151236 <__iget+26/4c>
> 28: 89 48 04 mov %ecx,0x4(%eax)
> Code; c0151239 <__iget+29/4c> <=====
> 2b: 89 01 mov %eax,(%ecx) <=====
> Code; c015123b <__iget+2b/4c>
> 2d: a1 6c 9c 30 c0 mov 0xc0309c6c,%eax
> Code; c0151240 <__iget+30/4c>
> 32: 89 50 04 mov %edx,0x4(%eax)
> Code; c0151243 <__iget+33/4c>
> 35: 89 43 08 mov %eax,0x8(%ebx)
> Code; c0151246 <__iget+36/4c>
> 38: c7 42 04 6c 9c 30 c0 movl $0xc0309c6c,0x4(%edx)
> Code; c015124d <__iget+3d/4c>
> 3f: 89 .byte 0x89
>
>
>
> I have traced this code to fs/inode.c. Producing assembler of inode.c gives
> this (snipped):
>
> __iget:
> pushl %ebx
> movl 8(%esp),%ebx
> movl 28(%ebx),%eax
> testl %eax,%eax
> je .L3337
> #APP
> incl 28(%ebx)
> #NO_APP
> jmp .L3336
> .p2align 4,,7
> .L3337:
> #APP
> incl 28(%ebx)
> #NO_APP
> testb $15,284(%ebx)
> jne .L3340
> leal 8(%ebx),%edx
> movl 4(%edx),%ecx
> movl 8(%ebx),%eax
> movl %ecx,4(%eax)
> movl %eax,(%ecx) <===== >>EIP; c0151239 <__iget+29/4c>
> movl inode_in_use,%eax
> movl %edx,4(%eax)
> movl %eax,8(%ebx)
> movl $inode_in_use,4(%edx)
> movl %edx,inode_in_use
> .L3340:
>
>
> Which quite nicely matches the ksymoops output. My books tell me the inode.s
> file _should_ give me line numbers in inode.c so I can then locate area of
> code - but I can't see how to match the produced assembler to the C source.
>
> Hope this helps someone - and if you know how to get assembler code to match C
> code via line numbers, I would like to know please.
>
> TIA,
>
> Nick

--
~Randy

2004-11-27 17:21:45

by Nick Warne

Subject: Re: kswapd0 oops -> debug information

On Saturday 27 November 2004 17:01, Randy.Dunlap wrote:

> kernel version?

Heh. My great debug attempt, eh?

kernel 2.6.9

> .config file?
> full oops message, with stack backtrace?
> The stack backtrace could tell us who a bad caller is.
> It can just be a caller's problem, not a bug in (this)
> one isolated function.

http://linicks.net/kdebug/

> Did you read/check linux/REPORTING-BUGS ?

Yes, but I wanted to try to learn for myself what was going on, rather than
push the onus onto other people.

The book I have re the make /dir/file.s states that it will produce assembler
with _line_ numbers to corresponding C code. That is where I got lost, as it
doesn't.

Thanks,

Nick.

--
"When you're chewing on life's gristle,
Don't grumble, Give a whistle..."

2005-01-02 07:41:22

by Herbert Poetzl

Subject: Re: kswapd0 oops -> debug information

On Sat, Nov 27, 2004 at 05:21:21PM +0000, Nick Warne wrote:
> On Saturday 27 November 2004 17:01, Randy.Dunlap wrote:
>
> > kernel version?
>
> Heh. My great debug attempt, eh?
>
> kernel 2.6.9
>
> > .config file?
> > full oops message, with stack backtrace?
> > The stack backtrace could tell us who a bad caller is.
> > It can just be a caller's problem, not a bug in (this)
> > one isolated function.
>
> http://linicks.net/kdebug/
>
> > Did you read/check linux/REPORTING-BUGS ?
>
> Yes, but I wanted to try to learn for myself what was going on, rather than
> push the onus onto other people.
>
> The book I have re the make /dir/file.s states that it will produce assembler
> with _line_ numbers to corresponding C code. That is where I got lost, as it
> doesn't.

hmm, sorry for the late reply, but better late
than not at all ...

if you do

make fs/file.s V=1

you'll see what make actually does to compile
the source code into assembler code ...

make -f scripts/Makefile.build obj=scripts/basic
make -f scripts/Makefile.build obj=scripts
make -f scripts/Makefile.build obj=fs fs/file.s
gcc -Wp,-MD,fs/.file.s.d -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -O2 -fomit-frame-pointer -g -pipe -msoft-float -mpreferred-stack-boundary=2 -march=i586 -Iinclude/asm-i386/mach-default -DKBUILD_BASENAME=file -DKBUILD_MODNAME=file -S -o fs/file.s fs/file.c

and if that final gcc command does include a -g
(which can be controlled by CONFIG_DEBUG_INFO, or
simply added by hand), then the output will contain
lines like this:

.loc 1 45 0
.loc 1 46 0

which reference the file and line number in the
source code. files are 'declared' with lines:

.file "file.c"
.file 1 "fs/file.c"
.file 2 "include/linux/posix_types.h"

so you can pretty easily find the code in the
source. a different, but sometimes easier approach
is to use 'addr2line' on the kernel binary (if it
was compiled with CONFIG_DEBUG_INFO) to get the
source line from a kernel address ...

addr2line -e vmlinux c0123456
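
The .file/.loc bookkeeping described above can also be followed
mechanically. The sketch below is only an illustration, not a kernel tool:
annotate() is a hypothetical helper, and the three instructions in SAMPLE
are invented so the .loc lines have something to attach to. It tags each
instruction with the file and line of the most recent .loc directive.

```python
import re

FILE_RE = re.compile(r'^\.file\s+(\d+)\s+"([^"]+)"')  # .file <index> "<path>"
LOC_RE = re.compile(r'^\.loc\s+(\d+)\s+(\d+)')        # .loc <file index> <line>


def annotate(asm_lines):
    """Return (source_file, source_line, instruction) for each instruction."""
    files = {}      # file index -> path, from the numbered .file directives
    current = None  # (file index, line) of the most recent .loc
    result = []
    for raw in asm_lines:
        line = raw.strip()
        m = FILE_RE.match(line)
        if m:
            files[int(m.group(1))] = m.group(2)
            continue
        m = LOC_RE.match(line)
        if m:
            current = (int(m.group(1)), int(m.group(2)))
            continue
        # Skip blanks, other directives, #APP/#NO_APP markers and labels.
        if not line or line.startswith((".", "#")) or line.endswith(":"):
            continue
        if current is not None:
            result.append((files.get(current[0], "?"), current[1], line))
    return result


# Directives copied from the example above; the instructions are made up.
SAMPLE = """\
    .file "file.c"
    .file 1 "fs/file.c"
    .file 2 "include/linux/posix_types.h"
    .loc 1 45 0
    pushl %ebx
    movl 8(%esp),%ebx
    .loc 1 46 0
    testl %eax,%eax
"""

for path, lineno, insn in annotate(SAMPLE.splitlines()):
    print("%s:%d\t%s" % (path, lineno, insn))
```

Fed a real fs/inode.s built with -g, the same loop should show which line
of which source file each instruction came from, assuming the compiler
emits .loc in the form shown above.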

HTH,
Herbert

> Thanks,
>
> Nick.
>
> --
> "When you're chewing on life's gristle,
> Don't grumble, Give a whistle..."

2005-01-02 11:01:36

by Nick Warne

Subject: Re: kswapd0 oops -> debug information

On Sunday 02 January 2005 07:41, Herbert Poetzl wrote:

> > The book I have re the make /dir/file.s states that it will produce
> > assembler with _line_ numbers to corresponding C code. That is where I
> > got lost, as it doesn't.
>
> hmm, sorry for the late reply, but better late
> than not at all ...
>
> if you do
>
> make fs/file.s V=1
>
> you'll see what make actually does to compile
> the source code into assembler code ...
>
> make -f scripts/Makefile.build obj=scripts/basic
> make -f scripts/Makefile.build obj=scripts
> make -f scripts/Makefile.build obj=fs fs/file.s
> gcc -Wp,-MD,fs/.file.s.d -nostdinc -iwithprefix include -D__KERNEL__
> -Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing
> -fno-common -O2 -fomit-frame-pointer -g -pipe -msoft-float
> -mpreferred-stack-boundary=2 -march=i586 -Iinclude/asm-i386/mach-default
> -DKBUILD_BASENAME=file -DKBUILD_MODNAME=file -S -o fs/file.s fs/file.c
>
> and if that final gcc command does include a -g
> (which can be controlled by CONFIG_DEBUG_INFO, or
> simply added by hand), then the output will contain
> lines like this:
>
> .loc 1 45 0
> .loc 1 46 0
>
> which reference the file and line number in the
> source code. files are 'declared' with lines:
>
> .file "file.c"
> .file 1 "fs/file.c"
> .file 2 "include/linux/posix_types.h"
>
> so you can pretty easily find the code in the
> source. a different, but sometimes easier approach
> is to use 'addr2line' on the kernel binary (if it
> was compiled with CONFIG_DEBUG_INFO) to get the
> source line from a kernel address ...
>
> addr2line -e vmlinux c0123456
>
> HTH,
> Herbert

Hi Herbert,

Thanks for the reply.

I will file this for future reference, as I have moved back to the 2.6.4
kernel; that tree has never once produced a kswapd oops and just runs and
runs and runs.

I just don't know why kernels newer than 2.6.4 produce an oops on my system -
I have been through the changelog looking at all the relevant stuff, but can't
really see anything obvious. I have built kernels 'clean' from the bottom up,
but all produce a kswapd oops within a few days - except 2.6.4.

I wish this box wasn't my LAN gateway, otherwise I wouldn't mind when it does
go AWOL and I could debug at my leisure.

Thanks for help,

Nick
--
"When you're chewing on life's gristle,
Don't grumble, Give a whistle..."