2003-03-02 01:28:13

by Randy.Dunlap

Subject: ntfs OOPS (2.5.63)

Hi,

This is plain vanilla 2.5.63.

The NTFS filesystem is mounted and I tried to cd several levels
deep into it...and voila.

Mar 1 13:35:29 midway kernel: SysRq : Changing Loglevel
Mar 1 13:35:29 midway kernel: Loglevel set to 9
Mar 1 13:35:44 midway kernel: Unable to handle kernel paging request at
virtual address 0001029a
Mar 1 13:35:44 midway kernel: printing eip:
Mar 1 13:35:44 midway kernel: c01f40f9
Mar 1 13:35:44 midway kernel: *pde = 00000000
Mar 1 13:35:44 midway kernel: Oops: 0002
Mar 1 13:35:44 midway kernel: CPU: 0
Mar 1 13:35:44 midway kernel: EIP: 0060:[__ntfs_init_inode+169/400] Not
tainted
Mar 1 13:35:44 midway kernel: EIP: 0060:[<c01f40f9>] Not tainted
Mar 1 13:35:44 midway kernel: EFLAGS: 00010282
Mar 1 13:35:44 midway kernel: EIP is at __ntfs_init_inode+0xa9/0x190
Mar 1 13:35:44 midway kernel: eax: f6c0f080 ebx: 0000416d ecx: 00010282
edx: f6c0f0f8
Mar 1 13:35:44 midway kernel: esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc
esp: f6dd1db4
Mar 1 13:35:44 midway su(pam_unix)[1839]: session closed for user root
Mar 1 13:35:44 midway kernel: ds: 007b es: 007b ss: 0068
Mar 1 13:35:44 midway kernel: Process bash (pid: 1840, threadinfo=f6dd0000
task=f7fa4100)
Mar 1 13:35:44 midway kernel: Stack: 00000000 f77dedec f6dd1df8 c01f4350
f7854000 f6c0f080 c01409f2 f7db7c74
Mar 1 13:35:44 midway kernel: f6c1de00 00000000 0000416d 000e0000
f77dedec f77dedec f6c0f178 00000000
Mar 1 13:35:44 midway kernel: f77dedec f6dd1e1c c01f3ec1 f6c0f178
0000416d 00000000 00000000 00000000
Mar 1 13:35:44 midway kernel: Call Trace:
Mar 1 13:35:44 midway kernel: [ntfs_read_locked_inode+96/3344]
ntfs_read_locked_inode+0x60/0xd10
Mar 1 13:35:44 midway kernel: [<c01f4350>] ntfs_read_locked_inode+0x60/0xd10
Mar 1 13:35:44 midway kernel: [kmem_cache_free+418/496]
kmem_cache_free+0x1a2/0x1f0
Mar 1 13:35:44 midway kernel: [<c01409f2>] kmem_cache_free+0x1a2/0x1f0
Mar 1 13:35:44 midway kernel: [ntfs_iget+97/144] ntfs_iget+0x61/0x90
Mar 1 13:35:44 midway kernel: [<c01f3ec1>] ntfs_iget+0x61/0x90
Mar 1 13:35:44 midway kernel: [ntfs_lookup+158/1040] ntfs_lookup+0x9e/0x410
Mar 1 13:35:44 midway kernel: [<c01f6c3e>] ntfs_lookup+0x9e/0x410
Mar 1 13:35:44 midway kernel: [d_alloc+22/720] d_alloc+0x16/0x2d0
Mar 1 13:35:44 midway kernel: [<c0172c26>] d_alloc+0x16/0x2d0
Mar 1 13:35:44 midway kernel: [real_lookup+104/224] real_lookup+0x68/0xe0
Mar 1 13:35:44 midway kernel: [<c0168618>] real_lookup+0x68/0xe0
Mar 1 13:35:44 midway kernel: [do_lookup+78/144] do_lookup+0x4e/0x90
Mar 1 13:35:44 midway kernel: [<c0168d0e>] do_lookup+0x4e/0x90
Mar 1 13:35:44 midway kernel: [link_path_walk+2234/2928]
link_path_walk+0x8ba/0xb70
Mar 1 13:35:44 midway kernel: [<c016960a>] link_path_walk+0x8ba/0xb70
Mar 1 13:35:44 midway kernel: [getname+99/176] getname+0x63/0xb0
Mar 1 13:35:44 midway kernel: [<c0168033>] getname+0x63/0xb0
Mar 1 13:35:44 midway kernel: [__user_walk+43/80] __user_walk+0x2b/0x50
Mar 1 13:35:44 midway kernel: [<c0169c5b>] __user_walk+0x2b/0x50
Mar 1 13:35:44 midway kernel: [vfs_stat+26/80] vfs_stat+0x1a/0x50
Mar 1 13:35:44 midway kernel: [<c0163ffa>] vfs_stat+0x1a/0x50
Mar 1 13:35:44 midway kernel: [sys_stat64+20/48] sys_stat64+0x14/0x30
Mar 1 13:35:44 midway kernel: [<c0164594>] sys_stat64+0x14/0x30
Mar 1 13:35:44 midway kernel: [grab_super+352/944] grab_super+0x160/0x3b0
Mar 1 13:35:44 midway kernel: [<c0160000>] grab_super+0x160/0x3b0
Mar 1 13:35:44 midway kernel: [syscall_call+7/11] syscall_call+0x7/0xb
Mar 1 13:35:44 midway kernel: [<c0109a8b>] syscall_call+0x7/0xb
Mar 1 13:35:44 midway kernel:
Mar 1 13:35:44 midway kernel: Code: 89 51 18 89 51 1c 31 f6 31 c9 89 b0 80 00
00 00 31 f6 31 d2

~Randy




2003-03-04 14:51:54

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Sat, 1 Mar 2003, Randy.Dunlap wrote:

> This is plain vanilla 2.5.63.

Ditto, no modules enabled, gcc 3.2.2 (Mandrake Linux 9.1 3.2.2-2mdk)

> The NTFS filesystem is mounted and I tried to cd several levels
> deep into it...and voila.

I guess it's not reproducible; I couldn't reproduce it.

> Mar 1 13:35:44 midway kernel: Unable to handle kernel paging request at
> virtual address 0001029a
> Mar 1 13:35:44 midway kernel: *pde = 00000000
> Mar 1 13:35:44 midway kernel: Oops: 0002
> Mar 1 13:35:44 midway kernel: CPU: 0
> Mar 1 13:35:44 midway kernel: EIP: 0060:[__ntfs_init_inode+169/400] Not
> tainted
> Mar 1 13:35:44 midway kernel: EIP: 0060:[<c01f40f9>] Not tainted
> Mar 1 13:35:44 midway kernel: EFLAGS: 00010282
> Mar 1 13:35:44 midway kernel: EIP is at __ntfs_init_inode+0xa9/0x190
> Mar 1 13:35:44 midway kernel: eax: f6c0f080 ebx: 0000416d ecx: 00010282
> edx: f6c0f0f8
> Mar 1 13:35:44 midway kernel: esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc
> esp: f6dd1db4
> Mar 1 13:35:44 midway su(pam_unix)[1839]: session closed for user root
> Mar 1 13:35:44 midway kernel: ds: 007b es: 007b ss: 0068

[...]

> Mar 1 13:35:44 midway kernel: Code: 89 51 18 89 51 1c 31 f6 31 c9 89 b0 80 00
> 00 00 31 f6 31 d2

0: 89 51 18 mov %edx,0x18(%ecx)
3: 89 51 1c mov %edx,0x1c(%ecx)
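
The decoding of those two instructions can be checked by hand. A minimal Python sketch, assuming only the register-to-memory form of opcode 0x89 with an 8-bit displacement and no SIB byte (the function name `decode_mov_store` is made up for illustration):

```python
# 32-bit general-purpose registers in x86 encoding order.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov_store(code):
    """Decode one 'mov r32 -> disp8(r32)' instruction (opcode 0x89, mod=01).

    Returns the AT&T-syntax text, or None for any other encoding.
    """
    if len(code) < 3 or code[0] != 0x89:
        return None
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:          # only disp8 forms without a SIB byte
        return None
    # Sign-extend the 8-bit displacement.
    disp = code[2] - 0x100 if code[2] >= 0x80 else code[2]
    sign = "-" if disp < 0 else ""
    return f"mov %{REGS[reg]},{sign}0x{abs(disp):x}(%{REGS[rm]})"

oops_code = bytes.fromhex("89 51 18 89 51 1c")
print(decode_mov_store(oops_code[0:3]))   # mov %edx,0x18(%ecx)
print(decode_mov_store(oops_code[3:6]))   # mov %edx,0x1c(%ecx)
```

The ModRM byte 0x51 splits into mod=01, reg=010 (edx) and rm=001 (ecx), which is where the two register names come from.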

The only potential match is in this part of __ntfs_init_inode (gcc
3.2.2 generates totally different and overall 50% less code for
__ntfs_init_inode):

ni->seq_no = 0;
atomic_set(&ni->count, 1);

However neither the above machine code nor the edx and ecx values are
correct. How reliable is the oopser? What compiler did you use? Could
you disassemble __ntfs_init_inode?

gdb fs/ntfs/ntfs.o
gdb> disassemble __ntfs_init_inode

Szaka


2003-03-05 18:59:26

by Anton Altaparmakov

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

Hi Randy,

Could you try to turn on debugging in the NTFS driver (compile option in
the menus), then once ntfs module is loaded (or otherwise anytime) as root
do:

echo -1 > /proc/sys/fs/ntfs-debug

Then mount and do the directory changes. Assuming that you get the bug
again could you send me the captured kernel log output? (Note there will
be massive amounts of output.)

The code looks ok and I can't reproduce here so it would be helpful to see
if there are any oddities on your partition. Just to make sure it is not
the compiler, could you do a "make fs/ntfs/inode.S" and send me that as
well?

Thanks,

Anton

On Sat, 1 Mar 2003, Randy.Dunlap wrote:
> This is plain vanilla 2.5.63.
>
> The NTFS filesystem is mounted and I tried to cd several levels
> deep into it...and voila.
>
> Mar 1 13:35:29 midway kernel: SysRq : Changing Loglevel
> Mar 1 13:35:29 midway kernel: Loglevel set to 9
> Mar 1 13:35:44 midway kernel: Unable to handle kernel paging request at
> virtual address 0001029a
> Mar 1 13:35:44 midway kernel: printing eip:
> Mar 1 13:35:44 midway kernel: c01f40f9
> Mar 1 13:35:44 midway kernel: *pde = 00000000
> Mar 1 13:35:44 midway kernel: Oops: 0002
> Mar 1 13:35:44 midway kernel: CPU: 0
> Mar 1 13:35:44 midway kernel: EIP: 0060:[__ntfs_init_inode+169/400] Not
> tainted
> Mar 1 13:35:44 midway kernel: EIP: 0060:[<c01f40f9>] Not tainted
> Mar 1 13:35:44 midway kernel: EFLAGS: 00010282
> Mar 1 13:35:44 midway kernel: EIP is at __ntfs_init_inode+0xa9/0x190
> Mar 1 13:35:44 midway kernel: eax: f6c0f080 ebx: 0000416d ecx: 00010282
> edx: f6c0f0f8
> Mar 1 13:35:44 midway kernel: esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc
> esp: f6dd1db4
> Mar 1 13:35:44 midway su(pam_unix)[1839]: session closed for user root
> Mar 1 13:35:44 midway kernel: ds: 007b es: 007b ss: 0068
> Mar 1 13:35:44 midway kernel: Process bash (pid: 1840, threadinfo=f6dd0000
> task=f7fa4100)
> Mar 1 13:35:44 midway kernel: Stack: 00000000 f77dedec f6dd1df8 c01f4350
> f7854000 f6c0f080 c01409f2 f7db7c74
> Mar 1 13:35:44 midway kernel: f6c1de00 00000000 0000416d 000e0000
> f77dedec f77dedec f6c0f178 00000000
> Mar 1 13:35:44 midway kernel: f77dedec f6dd1e1c c01f3ec1 f6c0f178
> 0000416d 00000000 00000000 00000000
> Mar 1 13:35:44 midway kernel: Call Trace:
> Mar 1 13:35:44 midway kernel: [ntfs_read_locked_inode+96/3344]
> ntfs_read_locked_inode+0x60/0xd10
> Mar 1 13:35:44 midway kernel: [<c01f4350>] ntfs_read_locked_inode+0x60/0xd10
> Mar 1 13:35:44 midway kernel: [kmem_cache_free+418/496]
> kmem_cache_free+0x1a2/0x1f0
> Mar 1 13:35:44 midway kernel: [<c01409f2>] kmem_cache_free+0x1a2/0x1f0
> Mar 1 13:35:44 midway kernel: [ntfs_iget+97/144] ntfs_iget+0x61/0x90
> Mar 1 13:35:44 midway kernel: [<c01f3ec1>] ntfs_iget+0x61/0x90
> Mar 1 13:35:44 midway kernel: [ntfs_lookup+158/1040] ntfs_lookup+0x9e/0x410
> Mar 1 13:35:44 midway kernel: [<c01f6c3e>] ntfs_lookup+0x9e/0x410
> Mar 1 13:35:44 midway kernel: [d_alloc+22/720] d_alloc+0x16/0x2d0
> Mar 1 13:35:44 midway kernel: [<c0172c26>] d_alloc+0x16/0x2d0
> Mar 1 13:35:44 midway kernel: [real_lookup+104/224] real_lookup+0x68/0xe0
> Mar 1 13:35:44 midway kernel: [<c0168618>] real_lookup+0x68/0xe0
> Mar 1 13:35:44 midway kernel: [do_lookup+78/144] do_lookup+0x4e/0x90
> Mar 1 13:35:44 midway kernel: [<c0168d0e>] do_lookup+0x4e/0x90
> Mar 1 13:35:44 midway kernel: [link_path_walk+2234/2928]
> link_path_walk+0x8ba/0xb70
> Mar 1 13:35:44 midway kernel: [<c016960a>] link_path_walk+0x8ba/0xb70
> Mar 1 13:35:44 midway kernel: [getname+99/176] getname+0x63/0xb0
> Mar 1 13:35:44 midway kernel: [<c0168033>] getname+0x63/0xb0
> Mar 1 13:35:44 midway kernel: [__user_walk+43/80] __user_walk+0x2b/0x50
> Mar 1 13:35:44 midway kernel: [<c0169c5b>] __user_walk+0x2b/0x50
> Mar 1 13:35:44 midway kernel: [vfs_stat+26/80] vfs_stat+0x1a/0x50
> Mar 1 13:35:44 midway kernel: [<c0163ffa>] vfs_stat+0x1a/0x50
> Mar 1 13:35:44 midway kernel: [sys_stat64+20/48] sys_stat64+0x14/0x30
> Mar 1 13:35:44 midway kernel: [<c0164594>] sys_stat64+0x14/0x30
> Mar 1 13:35:44 midway kernel: [grab_super+352/944] grab_super+0x160/0x3b0
> Mar 1 13:35:44 midway kernel: [<c0160000>] grab_super+0x160/0x3b0
> Mar 1 13:35:44 midway kernel: [syscall_call+7/11] syscall_call+0x7/0xb
> Mar 1 13:35:44 midway kernel: [<c0109a8b>] syscall_call+0x7/0xb
> Mar 1 13:35:44 midway kernel:
> Mar 1 13:35:44 midway kernel: Code: 89 51 18 89 51 1c 31 f6 31 c9 89 b0 80 00
> 00 00 31 f6 31 d2
>
> ~Randy

--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2003-03-06 06:09:05

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

> Hi Randy,
>
> Could you try to turn on debugging in the NTFS driver (compile option in the
> menus), then once ntfs module is loaded (or otherwise anytime) as root do:
>
> echo -1 > /proc/sys/fs/ntfs-debug
>
> Then mount and to the directory changes. Assuming that you get the bug again
> could you send me the captured kernel log output? (Note there will be
> massive amounts of output.)
>
> The code looks ok and I can't reproduce here so it would be helpful to see
> if there are any oddities on your partition. Just to make sure it is not the
> compiler, could you do a "make fs/ntfs/inode.S" and send me that as well?
>
> Thanks,

Anton,

I'll get to this in another day or so.

The help text for NTFS_DEBUG says to use 1 to enable it
or 0 to disable it. What does -1 do?

Thanks,
~Randy



2003-03-06 06:27:20

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Wed, 5 Mar 2003, Randy.Dunlap wrote:

> > Could you try to turn on debugging in the NTFS driver (compile option in the
> > menus), then once ntfs module is loaded (or otherwise anytime) as root do:
> >
> > echo -1 > /proc/sys/fs/ntfs-debug
> >
> > Then mount and to the directory changes. Assuming that you get the bug again
> > could you send me the captured kernel log output? (Note there will be
> > massive amounts of output.)
> >
> > The code looks ok and I can't reproduce here so it would be helpful to see
> > if there are any oddities on your partition. Just to make sure it is not the
> > compiler, could you do a "make fs/ntfs/inode.S" and send me that as well?
> >
> > Thanks,
>
> Anton,
>
> I'll get to this in another day or so.
>
> The help text for NTFS_DEBUG says to use 1 to enable it
> or 0 to disable it. What does -1 do?

Same as 1. However, I doubt NTFS_DEBUG gives anything useful in your case,
and if you had some NTFS "oddities" then it would be reproducible.

What would be really useful is the disassembly of __ntfs_init_inode that I
asked for 2 days ago (note, not the above 'make fs/ntfs/inode.S', because
it will not tell what machine code you have on disk), your .config and
exact CPU version (cat /proc/cpuinfo).

Szaka

2003-03-06 06:32:34

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

>
> On Wed, 5 Mar 2003, Randy.Dunlap wrote:
>
>> > Could you try to turn on debugging in the NTFS driver (compile option in
>> the menus), then once ntfs module is loaded (or otherwise anytime) as
>> root do:
>> >
>> > echo -1 > /proc/sys/fs/ntfs-debug
>> >
>> > Then mount and to the directory changes. Assuming that you get the bug
>> again could you send me the captured kernel log output? (Note there will
>> be massive amounts of output.)
>> >
>> > The code looks ok and I can't reproduce here so it would be helpful to
>> see if there are any oddities on your partition. Just to make sure it is
>> not the compiler, could you do a "make fs/ntfs/inode.S" and send me that
>> as well?
>> >
>> > Thanks,
>>
>> Anton,
>>
>> I'll get to this in another day or so.
>>
>> The help text for NTFS_DEBUG says to use 1 to enable it
>> or 0 to disable it. What does -1 do?
>
> Same as 1. However I doubt NTFS_DEBUG gives any useful in your case and if
> you had some NTFS "oddities" then it would be reproducible.
>
> What would be really useful is to disassemble __ntfs_init_inode what I asked
> 2 days ago (note, not the above 'make fs/ntfs/inode.S' because it will not
> tell what machine code you have on disk), your .config and exact CPU version
> (cat /proc/cpuinfo).

OK, I'll do that too. Somehow I missed that request. My bad.

~Randy



2003-03-06 12:22:18

by Anton Altaparmakov

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
> On Wed, 5 Mar 2003, Randy.Dunlap wrote:
>
> > > Could you try to turn on debugging in the NTFS driver (compile option in the
> > > menus), then once ntfs module is loaded (or otherwise anytime) as root do:
> > >
> > > echo -1 > /proc/sys/fs/ntfs-debug
> > >
> > > Then mount and to the directory changes. Assuming that you get the bug again
> > > could you send me the captured kernel log output? (Note there will be
> > > massive amounts of output.)
> > >
> > > The code looks ok and I can't reproduce here so it would be helpful to see
> > > if there are any oddities on your partition. Just to make sure it is not the
> > > compiler, could you do a "make fs/ntfs/inode.S" and send me that as well?
> > >
> > > Thanks,
> >
> > Anton,
> >
> > I'll get to this in another day or so.
> >
> > The help text for NTFS_DEBUG says to use 1 to enable it
> > or 0 to disable it. What does -1 do?
>
> Same as 1. However I doubt NTFS_DEBUG gives any useful in your case
> and if you had some NTFS "oddities" then it would be reproducible.
>
> What would be really useful is to disassemble __ntfs_init_inode what I
> asked 2 days ago (note, not the above 'make fs/ntfs/inode.S' because
> it will not tell what machine code you have on disk), your .config and
> exact CPU version (cat /proc/cpuinfo).

Yes it will, unless you suspect the assembler to get it wrong which is
highly unlikely. All compiler bugs I have ever seen have been quite well
visible in the .S assembler file.

The .S file is the only easy way to find out which instruction in the
oops output is the faulting one. Once you know the instruction, you
reverse compile to find which C statement it was; once you know that,
you know which variable was NULL or a random value, and then you can start
looking for the answer to "how the fsck did that happen?"... (-; At least
I have managed to find and fix quite a few bugs using that approach
before. But without your .S file + oops output it is impossible, as the
oops output is not much use in combination with a .S file built with my
own compiler/.config...
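
The matching step described above (find the oops `Code:` bytes in the compiled output, then read off the instruction) can be sketched in Python. The byte string below is the stretch of `__ntfs_init_inode` at object offsets 0x3cc-0x3db as it appears in the objdump listing posted later in this thread; the helper name `find_code` is hypothetical.

```python
def find_code(listing_start, listing_bytes, oops_code):
    """Return every offset at which oops_code occurs inside listing_bytes.

    listing_start is the object-file offset of listing_bytes[0].
    """
    hits = []
    pos = listing_bytes.find(oops_code)
    while pos != -1:
        hits.append(listing_start + pos)
        pos = listing_bytes.find(oops_code, pos + 1)
    return hits

# Machine code at 0x3cc..0x3db, from the objdump listing in this thread.
listing = bytes.fromhex("89d7 8955f4 f3a5 8d5078 8b4df4 895118")
# First instruction of the oops "Code:" line.
faulting = bytes.fromhex("895118")

hits = find_code(0x3CC, listing, faulting)
print([hex(h) for h in hits])             # ['0x3d9']

# EIP was __ntfs_init_inode+0xa9, so the function would start at:
print(hex(hits[0] - 0xA9))                # 0x330
```

A short byte pattern can match in more than one place, so in practice the longer the run of `Code:` bytes used, the better.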

Cheers,

Anton
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2003-03-06 12:16:57

by Anton Altaparmakov

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Wed, 5 Mar 2003, Randy.Dunlap wrote:

> > Hi Randy,
> >
> > Could you try to turn on debugging in the NTFS driver (compile option in the
> > menus), then once ntfs module is loaded (or otherwise anytime) as root do:
> >
> > echo -1 > /proc/sys/fs/ntfs-debug
> >
> > Then mount and to the directory changes. Assuming that you get the bug again
> > could you send me the captured kernel log output? (Note there will be
> > massive amounts of output.)
> >
> > The code looks ok and I can't reproduce here so it would be helpful to see
> > if there are any oddities on your partition. Just to make sure it is not the
> > compiler, could you do a "make fs/ntfs/inode.S" and send me that as well?
> >
> > Thanks,
>
> Anton,
>
> I'll get to this in another day or so.
>
> The help text for NTFS_DEBUG says to use 1 to enable it
> or 0 to disable it. What does -1 do?

It doesn't matter. We just test "if not zero do debug output". The old
ntfs driver used to use different bits for different error messages so -1
would enable all of them and I have stuck to using -1...
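
To illustrate why `-1` and `1` behave the same under a plain non-zero test, yet differ under a bitmask: a small Python sketch, assuming a 32-bit integer setting and made-up category bits (`DEBUG_INODE`, `DEBUG_DIR` and `DEBUG_ATTR` are not taken from either driver).

```python
MASK32 = 0xFFFFFFFF

# Hypothetical per-category bits, in the style of a bitmask debug setting.
DEBUG_INODE, DEBUG_DIR, DEBUG_ATTR = 0x1, 0x2, 0x4

def enabled_simple(value):
    """Non-zero test: any value other than 0 turns debugging on."""
    return value != 0

def enabled_categories(value):
    """Bitmask test: report which hypothetical categories are switched on."""
    bits = value & MASK32            # view the value as 32-bit two's complement
    names = {"inode": DEBUG_INODE, "dir": DEBUG_DIR, "attr": DEBUG_ATTR}
    return [name for name, bit in names.items() if bits & bit]

print(enabled_simple(1), enabled_simple(-1))   # True True
print(enabled_categories(1))                   # ['inode']
print(enabled_categories(-1))                  # ['inode', 'dir', 'attr']
```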

Thanks,

Anton
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2003-03-06 14:33:55

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Thu, 6 Mar 2003, Anton Altaparmakov wrote:
> On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
> >
> > What would be really useful is to disassemble __ntfs_init_inode what I
> > asked 2 days ago (note, not the above 'make fs/ntfs/inode.S' because
> > it will not tell what machine code you have on disk), your .config and
> > exact CPU version (cat /proc/cpuinfo).
>
> Yes it will, unless you suspect the assembler [...]

I suspect everything :) It was also a polite way of saying (on a completely
configured, etc. kernel):
% make fs/ntfs/inode.S
make: *** No rule to make target `fs/ntfs/inode.S'. Stop.

Anyway, considering how bogus the oops was, and that Randy already had two
oopses before this NTFS one, I think the NTFS driver was the victim of
some other trouble rather than its originator. So unless one can reproduce
something close to this one (or Randy sends his first [two] oopses), I
would just trash

http://bugme.osdl.org/show_bug.cgi?id=432

Szaka

2003-03-06 14:45:10

by Anton Altaparmakov

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:

> On Thu, 6 Mar 2003, Anton Altaparmakov wrote:
> > On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
> > >
> > > What would be really useful is to disassemble __ntfs_init_inode what I
> > > asked 2 days ago (note, not the above 'make fs/ntfs/inode.S' because
> > > it will not tell what machine code you have on disk), your .config and
> > > exact CPU version (cat /proc/cpuinfo).
> >
> > Yes it will, unless you suspect the assembler [...]
>
> I suspect everything :) It was also a polite way saying (on a completely
> configured, etc kernel):
> % make fs/ntfs/inode.S
> make: *** No rule to make target `fs/ntfs/inode.S'. Stop.

Oops. Sorry. I meant
make fs/ntfs/inode.s

Just tested and works...

Anton

>
> Anyway, considering how bogus the oops was and Randy already had two
> oops'es before this NTFS one, I think the NTFS driver was a sufferer
> of other trouble(s) than the originator. So unless one can reproduce
> something close to this one (or Randy sends his first [two] oops), I
> would just trash
>
> http://bugme.osdl.org/show_bug.cgi?id=432
>
> Szaka

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2003-03-06 19:30:50

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Thu, 6 Mar 2003 15:34:52 +0100 (MET) Szakacsits Szabolcs wrote:

|
| On Thu, 6 Mar 2003, Anton Altaparmakov wrote:
| > On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
| > >
| > > What would be really useful is to disassemble __ntfs_init_inode what I
| > > asked 2 days ago (note, not the above 'make fs/ntfs/inode.S' because
| > > it will not tell what machine code you have on disk), your .config and
| > > exact CPU version (cat /proc/cpuinfo).
| >
| > Yes it will, unless you suspect the assembler [...]
|
| I suspect everything :) It was also a polite way saying (on a completely
| configured, etc kernel):
| % make fs/ntfs/inode.S
| make: *** No rule to make target `fs/ntfs/inode.S'. Stop.
|
| Anyway, considering how bogus the oops was and Randy already had two
| oops'es before this NTFS one, I think the NTFS driver was a sufferer
| of other trouble(s) than the originator. So unless one can reproduce
| something close to this one (or Randy sends his first [two] oops), I
| would just trash
|
| http://bugme.osdl.org/show_bug.cgi?id=432

I must have missed something here. What other 2 oopses are you referring to?
I understand that other oopses could make a third one bogus, but I didn't
see 2 others. Did I miss them? How did you get that information?
I'll look in the kernel log tonight (at home) to see if I missed them.
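
Checking a log for earlier oopses can be done mechanically. A minimal Python sketch, assuming syslog-style lines like the ones in the report; `find_oops_lines` is an illustrative name, and the sample text is cut down from the report at the top of this thread.

```python
import re

# Matches the "Oops: NNNN" line as it appears in the syslog output above.
OOPS_RE = re.compile(r"kernel: Oops: ([0-9a-fA-F]{4})\b")

def find_oops_lines(log_text):
    """Return (line_number, error_code) for every Oops line in log_text."""
    found = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = OOPS_RE.search(line)
        if match:
            found.append((number, int(match.group(1), 16)))
    return found

sample = """\
Mar 1 13:35:29 midway kernel: Loglevel set to 9
Mar 1 13:35:44 midway kernel: *pde = 00000000
Mar 1 13:35:44 midway kernel: Oops: 0002
Mar 1 13:35:44 midway kernel: CPU: 0
"""

print(find_oops_lines(sample))   # [(3, 2)]
```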

As for closing bug reports because they are not reproducible...
sure, people do it, and whoever wants to can close bug reports for that reason,
but you won't catch me doing that. It's a poor reason to close a bug report
IMO.

--
~Randy

2003-03-06 19:40:45

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Thu, 6 Mar 2003, Randy.Dunlap wrote:
> On Thu, 6 Mar 2003 15:34:52 +0100 (MET) Szakacsits Szabolcs wrote:

> | Anyway, considering how bogus the oops was and Randy already had two
> | oops'es before this NTFS one, I think the NTFS driver was a sufferer
> | of other trouble(s) than the originator. So unless one can reproduce
> | something close to this one (or Randy sends his first [two] oops), I
> | would just trash
> |
> | http://bugme.osdl.org/show_bug.cgi?id=432
>
> I must have missed something here. What other 2 oopses are you referring to?

Quoting from your report:

==> Mar 1 13:35:44 midway kernel: Oops: 0002

This means oops counter is 2. So there were two oopses before with
counter value 0 and 1.

> As for closing bug reports because they are not reproducible...

No. Not because it's not reproducible, but because it's untrustworthy
and bogus. Unless, as I mentioned before ... please see above. Thanks!

Szaka

2003-03-06 20:14:40

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
> On Thu, 6 Mar 2003, Randy.Dunlap wrote:
> > I must have missed something here. What other 2 oopses are you
> > referring to?
>
> Quoting from your report:
>
> ==> Mar 1 13:35:44 midway kernel: Oops: 0002
>
> This means oops counter is 2. So there were two oopses before with
> counter value 0 and 1.

I just checked, this is not true (I could dig up the false source
of information if interested). It's error_code: no page found,
kernel-mode write fault. Sorry for the confusion :(
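
For reference, the low three bits of the i386 page-fault error code printed after `Oops:` can be decoded as below. This is a Python sketch of the bit layout only; the function name `decode_fault_code` is made up.

```python
def decode_fault_code(error_code):
    """Describe bits 0-2 of an i386 page-fault error code."""
    # bit 0: 0 = no page found, 1 = protection fault
    cause = "protection fault" if error_code & 0x1 else "no page found"
    # bit 1: 0 = read access, 1 = write access
    access = "write" if error_code & 0x2 else "read"
    # bit 2: 0 = kernel mode, 1 = user mode
    mode = "user-mode" if error_code & 0x4 else "kernel-mode"
    return f"{cause}, {mode} {access} fault"

print(decode_fault_code(0x0002))   # no page found, kernel-mode write fault
```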

> > As for closing bug reports because they are not reproducible...
>
> No. Not because it's not reproducible however because it's untrustable
> and bogus. Unless as I mentioned before ... please see above. Thanks!

So this is also invalid ... Could you please send the 'objdump -S
fs/ntfs/inode.o' output? The __ntfs_init_inode part would be enough
also.

Szaka

2003-03-06 20:27:10

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Thu, 6 Mar 2003 21:15:35 +0100 (MET) Szakacsits Szabolcs wrote:

|
| On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
| > On Thu, 6 Mar 2003, Randy.Dunlap wrote:
| > > I must have missed something here. What other 2 oopses are you
| > > referring to?
| >
| > Quoting from your report:
| >
| > ==> Mar 1 13:35:44 midway kernel: Oops: 0002
| >
| > This means oops counter is 2. So there were two oopses before with
| > counter value 0 and 1.
|
| I just checked, this is not true (I could dig up the false source
| of information if interested). It's error_code: no page found,
| kernel-mode write fault. Sorry for the confusion :(
|
| > > As for closing bug reports because they are not reproducible...
| >
| > No. Not because it's not reproducible however because it's untrustable
| > and bogus. Unless as I mentioned before ... please see above. Thanks!
|
| So this is also invalid ... Could you please send the 'objdump -S
| fs/ntfs/inode.o' output? The __ntfs_init_inode part would be enough
| also.

I'm glad that this little confusion is cleared up.
I was about to correct it, but you beat me to it.
However, such an oops counter could be useful...

--
~Randy

2003-03-06 21:45:20

by Szabolcs Szakacsits

Subject: Oops counter (was Re: ntfs OOPS (2.5.63))


> | > ==> Mar 1 13:35:44 midway kernel: Oops: 0002
> | >
> | > This means oops counter is 2. So there were two oopses before with
> | > counter value 0 and 1.
> |
> | I just checked, this is not true (I could dig up the false source
> | of information if interested).

I didn't have to: Google listed it in the top 5 hits searching for
"oops counter" ...

> However, such an oops counter could be useful...

I immediately believed this feature had been added; it looked like such a good idea.

Szaka

2003-03-07 07:39:57

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

> On Thu, 6 Mar 2003, Szakacsits Szabolcs wrote:
>> On Wed, 5 Mar 2003, Randy.Dunlap wrote:
>>
>> > > Could you try to turn on debugging in the NTFS driver (compile option
>> in the menus), then once ntfs module is loaded (or otherwise anytime)
>> as root do:
>> > >
>> > > echo -1 > /proc/sys/fs/ntfs-debug

Did that, got lots of output, but the oops isn't reproducible
AFAIK, so I haven't collected all of that debug output.

>> > > Then mount and to the directory changes. Assuming that you get the bug
>> again could you send me the captured kernel log output? (Note there
>> will be massive amounts of output.)
>> > >
>> > > The code looks ok and I can't reproduce here so it would be helpful to
>> see if there are any oddities on your partition. Just to make sure it
>> is not the compiler, could you do a "make fs/ntfs/inode.S" and send me
>> that as well?

The .config file, gcc -v output, and /proc/cpuinfo are added as
attachments at
http://bugme.osdl.org/show_bug.cgi?id=432

objdump disassembly and make fs/ntfs/inode.s files are also
attached there.

I tried to decode the disassembly, got lots of it done,
but I bogged down on something that may be outside of the
NTFS realm. I have ALL kernel hacking options enabled
(=y), and it's a bit hairy (for me) to decode all of the
extra/added code, and this may be where the oops is
happening. Dunno really, just wanted to warn you.

Thanks,
~Randy



2003-03-07 07:51:27

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Thu, 6 Mar 2003, Randy.Dunlap wrote:

> I tried to decode the disassembly, got lots of it done,
> but I bogged down on something that may be outside of the
> NTFS realm. I have ALL kernel hacking options enabled
> (=y), and it's a bit hairy (for me) to decode all of the
> extra/added code, and this may be where the oops is
> happening. Dunno really, just wanted to warn you.

This was one of the issues I suspected (lots of hacking options), which
is why I asked for the .config as well. Your __ntfs_init_inode was *huge*
and the oops Code didn't resemble anything written in __ntfs_init_inode ...
unless you have some hardware issue (bit flips, memory/CPU, etc). When
I have time I'll also take a closer look. I don't exclude some
alignment issues either ...

Szaka

2003-03-07 17:08:37

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Fri, 7 Mar 2003 08:52:42 +0100 (MET) Szakacsits Szabolcs wrote:

|
| On Thu, 6 Mar 2003, Randy.Dunlap wrote:
|
| > I tried to decode the disassembly, got lots of it done,
| > but I bogged down on something that may be outside of the
| > NTFS realm. I have ALL kernel hacking options enabled
| > (=y), and it's a bit hairy (for me) to decode all of the
| > extra/added code, and this may be where the oops is
| > happening. Dunno really, just wanted to warn you.
|
| This was one of the issues I suspected (lots of hacking option) and
| asked for .config also. Your __ntfs_init_inode was *huge* and the oops
| Code didn't resembled to any of written in __ntfs_init_inode ...
| unless you have some hardware issue (bit flips, memory/CPU, etc). When
| I have time I'll also take a closer look. I don't exclude some
| alignment issues either ...

BTW, I think that this would be a reasonable reason (huh?) to dismiss
this bug against NTFS -- i.e., if it's found to be a problem in general
kernel debug helpers. Still be nice to find where it happened,
of course.

--
~Randy

2003-03-07 18:00:01

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

On Fri, 7 Mar 2003 18:56:41 +0100 (MET) Szakacsits Szabolcs wrote:

|
| On Fri, 7 Mar 2003, Randy.Dunlap wrote:
|
| > BTW, I think that this would be a reasonable reason (huh?) to dismiss
| > this bug against NTFS -- i.e., if it's found to be a problem in general
| > kernel debug helpers. Still be nice to find where it happened,
| > of course.
|
| It seems (and CONFIG_DEBUG_SPINLOCK also seems to contribute)
| init_MUTEX(&ni->mrec_lock);
| ...
| INIT_LIST_HEAD(...)
|
| and IMHO that shouldn't happen :) But you have the infamous 2.96
| compiler, there were several updates for Red Hat [remember how buggy
| code it complied?] but I don't know how many updates were issued for
| Mandrake and if you did those. gcc 3.2.2 generates much nicer code for
| __ntfs_init_inode.

OK, I'm fine with closing it as not an NTFS issue or as a tools issue
or something along those lines.

Thanks for looking into this.
--
~Randy

2003-03-07 17:55:14

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Fri, 7 Mar 2003, Randy.Dunlap wrote:

> BTW, I think that this would be a reasonable reason (huh?) to dismiss
> this bug against NTFS -- i.e., if it's found to be a problem in general
> kernel debug helpers. Still be nice to find where it happened,
> of course.

It seems to be somewhere in here (and CONFIG_DEBUG_SPINLOCK also seems to contribute):
init_MUTEX(&ni->mrec_lock);
...
INIT_LIST_HEAD(...)

and IMHO that shouldn't happen :) But you have the infamous 2.96
compiler. There were several updates for Red Hat [remember how buggy
the code it compiled was?], but I don't know how many updates were issued
for Mandrake or whether you applied them. gcc 3.2.2 generates much nicer
code for __ntfs_init_inode.

Szaka

2003-03-08 13:23:03

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Fri, 7 Mar 2003, Randy.Dunlap wrote:
> On Fri, 7 Mar 2003 18:56:41 +0100 (MET) Szakacsits Szabolcs wrote:
> |
> | It seems (and CONFIG_DEBUG_SPINLOCK also seems to contribute)
> | init_MUTEX(&ni->mrec_lock);
> | ...
> | INIT_LIST_HEAD(...)
>
> OK, I'm fine with closing it as not an NTFS issue or as a tools issue
> or something along those lines.

I took a closer look now:

EFLAGS: 00010282
eax: f6c0f080 ebx: 0000416d ecx: 00010282 edx: f6c0f0f8
esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc esp: f6dd1db4
ds: 007b es: 007b ss: 0068

3c0: b9 06 00 00 00 mov $0x6,%ecx
... not important ...
3cc: 89 d7 mov %edx,%edi
3ce: 89 55 f4 mov %edx,0xfffffff4(%ebp)
3d1: f3 a5 repz movsl %ds:(%esi),%es:(%edi)
3d3: 8d 50 78 lea 0x78(%eax),%edx
3d6: 8b 4d f4 mov 0xfffffff4(%ebp),%ecx
3d9: 89 51 18 mov %edx,0x18(%ecx) ## OOPS ##

So %ecx should be %edi-24 = f6c0f0e0, instead it's EFLAGS. Oops [indeed].
%ebp value is correct, I checked. So it seems to be a hardware fault, strong
radiation, or an interrupt that didn't restore ecx.

> Thanks for looking into this.

Thanks for reporting.

Szaka

2003-03-08 15:45:34

by Szabolcs Szakacsits

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)


On Sat, 8 Mar 2003, Szakacsits Szabolcs wrote:
>
> EFLAGS: 00010282
> eax: f6c0f080 ebx: 0000416d ecx: 00010282 edx: f6c0f0f8
> esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc esp: f6dd1db4
> ds: 007b es: 007b ss: 0068
>
> 3c0: b9 06 00 00 00 mov $0x6,%ecx
> ... not important ...
> 3cc: 89 d7 mov %edx,%edi
> 3ce: 89 55 f4 mov %edx,0xfffffff4(%ebp)
> 3d1: f3 a5 repz movsl %ds:(%esi),%es:(%edi)
> 3d3: 8d 50 78 lea 0x78(%eax),%edx
> 3d6: 8b 4d f4 mov 0xfffffff4(%ebp),%ecx
> 3d9: 89 51 18 mov %edx,0x18(%ecx) ## OOPS ##
>
> So %ecx should be %edi-24 = f6c0f0e0, instead it's EFLAGS. Oops [indeed].
> %ebp value is correct, I checked. So it seems a hardware, strong
> radiation or an interrupt that didn't restore ecx.

Actually the "interrupt" did a pushfl and overwrote 0xfffffff4(%ebp).
esp = 0xfffffff4(%ebp). For kernel code the compiler shouldn't have
generated the above code.

Szaka

2003-03-10 04:06:08

by Randy.Dunlap

Subject: Re: [Linux-NTFS-Dev] ntfs OOPS (2.5.63)

>
> On Sat, 8 Mar 2003, Szakacsits Szabolcs wrote:
>>
>> EFLAGS: 00010282
>> eax: f6c0f080 ebx: 0000416d ecx: 00010282 edx: f6c0f0f8
>> esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc esp: f6dd1db4
>> ds: 007b es: 007b ss: 0068
>>
>> 3c0: b9 06 00 00 00 mov $0x6,%ecx
>> ... not important ...
>> 3cc: 89 d7 mov %edx,%edi
>> 3ce: 89 55 f4 mov %edx,0xfffffff4(%ebp)
>> 3d1: f3 a5 repz movsl %ds:(%esi),%es:(%edi)
>> 3d3: 8d 50 78 lea 0x78(%eax),%edx
>> 3d6: 8b 4d f4 mov 0xfffffff4(%ebp),%ecx
>> 3d9: 89 51 18 mov %edx,0x18(%ecx) ## OOPS ##
>>
>> So %ecx should be %edi-24 = f6c0f0e0, instead it's EFLAGS. Oops [indeed].
>> %ebp value is correct, I checked. So it seems a hardware, strong radiation
>> or an interrupt that didn't restore ecx.
>
> Actually the "interrupt" did a pushfl and overwrote 0xfffffff4(%ebp). esp =
> 0xfffffff4(%ebp). For kernel code the compiler shouldn't have generated the
> above code.
>
> Szaka

Hi Szaka,

Should I just close this bugzilla entry as invalid or not an NTFS problem?
I don't mind doing that.

Thanks,
~Randy



2003-03-10 07:19:49

by Szabolcs Szakacsits

Subject: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Sun, 9 Mar 2003, Randy.Dunlap wrote:
> > On Sat, 8 Mar 2003, Szakacsits Szabolcs wrote:
> >>
> >> EFLAGS: 00010282
> >> eax: f6c0f080 ebx: 0000416d ecx: 00010282 edx: f6c0f0f8
> >> esi: c040b078 edi: f6c0f0f8 ebp: f6dd1dbc esp: f6dd1db4
> >> ds: 007b es: 007b ss: 0068
> >>
> >> 3c0: b9 06 00 00 00 mov $0x6,%ecx
> >> ... not important ...
> >> 3cc: 89 d7 mov %edx,%edi
> >> 3ce: 89 55 f4 mov %edx,0xfffffff4(%ebp)
> >> 3d1: f3 a5 repz movsl %ds:(%esi),%es:(%edi)
> >> 3d3: 8d 50 78 lea 0x78(%eax),%edx
> >> 3d6: 8b 4d f4 mov 0xfffffff4(%ebp),%ecx
> >> 3d9: 89 51 18 mov %edx,0x18(%ecx) ## OOPS ##
> >>
> >> So %ecx should be %edi-24 = f6c0f0e0, instead it's EFLAGS. Oops [indeed].
> >> %ebp value is correct, I checked. So it seems a hardware, strong radiation
> >> or an interrupt that didn't restore ecx.
> >
> > Actually the "interrupt" did a pushfl and overwrote 0xfffffff4(%ebp). esp =
> > 0xfffffff4(%ebp).

Actually 0xfffffff4(%ebp) = %esp - 4.
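
The register values in the oops make this easy to check by hand. Below is a minimal user-space sketch, not kernel code: it recomputes the spill-slot address from the reported %ebp and %esp, and derives the address that the faulting store would touch once the slot holds EFLAGS instead of the saved pointer. The macro and function names are made up for illustration; only the hex values come from the oops report.

```c
#include <stdint.h>

/* Register values copied from the oops report. */
#define OOPS_EBP    0xf6dd1dbcu
#define OOPS_ESP    0xf6dd1db4u
#define OOPS_EDI    0xf6c0f0f8u
#define OOPS_EFLAGS 0x00010282u

/* Slot the compiler spilled %edx to: 0xfffffff4(%ebp), i.e. %ebp - 12. */
static uint32_t spill_slot(uint32_t ebp)
{
    return ebp - 12;
}

/* Address written by the first 4-byte push when the stack pointer is esp. */
static uint32_t first_push_target(uint32_t esp)
{
    return esp - 4;
}

/* Value %ecx should have reloaded: %edi before "repz movsl" copied 6 dwords. */
static uint32_t expected_ecx(uint32_t edi_after_copy)
{
    return edi_after_copy - 6 * 4;
}

/* Address touched by "mov %edx,0x18(%ecx)" for a given %ecx. */
static uint32_t store_target(uint32_t ecx)
{
    return ecx + 0x18;
}
```

With the reported values, spill_slot(OOPS_EBP) and first_push_target(OOPS_ESP) both give f6dd1db0, expected_ecx(OOPS_EDI) gives f6c0f0e0, and store_target(OOPS_EFLAGS) gives 0001029a, which is the faulting virtual address printed at the top of the oops.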

> Should I just close this bugzilla entry as invalid or not an NTFS problem?
> I don't mind doing that.

It's very valid, and personally I think it's a serious kernel-wide issue. I
grepped recent linux-kernel oopses for this type of bug and there seem to
be several hits, e.g. search for handling faults around EFLAGS.

The question is if we want to support the buggy 2.9[56] compilers or
not. I checked Red Hat 7.3 and the latest errata gcc fixes this issue,
the generated code is ok. But your compiler didn't and probably many
more out there don't.

At least spinlock debugging triggers this bad code generation in the
widely used init_waitqueue_head() but quite probably there are others.
AFAIK -fomit-frame-pointer was used earlier to work around this but
apparently not anymore, so the bug came back. Maybe the new kernel
build broke it or it was just forgotten or it's a new policy not
supporting broken compilers, etc. I don't know.

But something should be done about it, IMHO.

Szaka

2003-03-11 15:42:57

by Alan

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Mon, 2003-03-10 at 07:22, Szakacsits Szabolcs wrote:
> The question is if we want to support the buggy 2.9[56] compilers or
> not. I checked Red Hat 7.3 and the latest errata gcc fixes this issue,
> the generated code is ok. But your compiler didn't and probably many
> more out there don't.

I don't think gcc 2.96 had that problem. I've not seen it there, but
gcc 3.0.x certainly does, and kernels built with gcc 3.0.x and early 3.1.x
seem to explode randomly under load, probably for this reason.

I've also not seen any problems with this on gcc 3.2. Valgrind has some
notes on affected compilers as the valgrind app also picks up this
violation by the compiler.

2003-03-11 16:27:17

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On 11 Mar 2003, Alan Cox wrote:
> On Mon, 2003-03-10 at 07:22, Szakacsits Szabolcs wrote:
> > The question is if we want to support the buggy 2.9[56] compilers or
> > not. I checked Red Hat 7.3 and the latest errata gcc fixes this issue,
> > the generated code is ok. But your compiler didn't and probably many
> > more out there don't.
>
> I don't think gcc 2.96 had that problem.

Randy's compiler is 2.96 and it forgot to do a 'sub $0xc,%esp'. See
for yourself all the data at http://bugme.osdl.org/show_bug.cgi?id=432

Red Hat also had this bug for 1-1.5 years (there is a bugzilla entry
submitted by a Parasoft employee; the bug also screw[s|ed] user space
apps, e.g. via signal handling).

Szaka

2003-03-11 23:51:16

by Alan

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Tue, 2003-03-11 at 16:29, Szakacsits Szabolcs wrote:
> Randy's compiler is 2.96 and it forgot to do a 'sub $0xc,%esp'. See
> yourself all the data at http://bugme.osdl.org/show_bug.cgi?id=432
>
> Red Hat had this bug also for 1-1.5 year (there is a bugzilla entry
> submitted by a Parasoft employee, the bug also screw[s|ed] user space
> apps, e.g. by signal handling).

Thanks for the reference. It does indeed look like it's a longer-standing
bug than some of us thought.

2003-03-12 00:29:22

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

In article <[email protected]>,
Szakacsits Szabolcs <[email protected]> wrote:
>
>At least spinlock debugging triggers this bad code generation in the
>widely used init_waitqueue_head() but quite probably there are others.
>AFAIK -fomit-frame-pointer was used earlier to work around this but
>apparently not anymore, so the bug came back. Maybe the new kernel
>build broke it or it was just forgotten or it's a new policy not
>supporting broken compilers, etc. I don't know.
>
>But something should be done about it, IMHO.

Ouch, hell yes. Compiler bugs are nasty to chase down.

If there is a well-known list of compilers, we should put a BIG warning
in some core kernel file to guide people to upgrade (or maybe work
around it by forcing -fomit-frame-pointer if that fixes it for the
affected compilers).

Do we have a list?

Linus

2003-03-12 06:04:30

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On 11 Mar 2003, Linus Torvalds wrote:
>
> If there is a well-known list of compilers, we should put a BIG warning
> in some core kernel file to guide people to upgrade (or maybe work

Not enough: nobody would notice, and today most end users don't
compile the kernel themselves; they are just shipped broken kernels.

We *also* need a mechanism to know the kernel was compiled with a
broken compiler (from kernel point of view of course, not the latest
C++ features). Like the 'tainted' approach but this would be marked as
broken/miscompiled/etc. To be able to tell the user *immediately* to
complain to his vendor instead of hunting/finding the bug again and
again, as is happening now.

I know all compilers are broken, but the severities differ.

> around it by forcing -fomit-frame-pointer if that fixes it for the
> affected compilers).

I also doubt this. There are things that assume the code was compiled
otherwise.

> Do we have a list?

Impossible. Vendors have their own versioning when they patch the
compiler.

Only way I see is to detect it at build time, BIG warning to the user
of compiler, print a well visible "kernel was built by broken tools"
message at boot time to end users and marking the kernel 'broken' in
case it oopses for developers.

And the badly broken compilers' case is closed. Once ..... well, at
least for a given compiler bug, but the infrastructure would be there
to deal with them in the future, if kernel can't workaround them.

BTW, if possible having the Code both before and after when a fault
happens could also help a lot in the future.

And in general, an oops counter would also be useful, to avoid spending too
much time decoding potentially bogus oopses.

Szaka

2003-03-12 07:41:44

by Richard Henderson

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, Mar 12, 2003 at 07:07:26AM +0100, Szakacsits Szabolcs wrote:
> Only way I see is to detect it at build time, BIG warning to the user
> of compiler, print a well visible "kernel was built by broken tools"
> message at boot time to end users ...

You don't have to let things go that far. If you have a test
case, then you can run the test case at build time, and have
the make actively fail. So the kernel never gets built at all.


r~

2003-03-12 07:58:39

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Tue, 11 Mar 2003, Richard Henderson wrote:
> On Wed, Mar 12, 2003 at 07:07:26AM +0100, Szakacsits Szabolcs wrote:
> > Only way I see is to detect it at build time, BIG warning to the user
> > of compiler, print a well visible "kernel was built by broken tools"
> > message at boot time to end users ...
>
> You don't have to let things go that far. If you have a test
> case,

The gcc team must have one, mustn't it? Do you know?

> then you can run the test case at build time, and have
> the make actively fail. So the kernel never gets built at all.

I thought about it, I'm just afraid too many kernels wouldn't build
(this bug is in most 2.95, 2.96 and, according to Alan, in 3.0 and early
3.1) and people would just start "working around" it by commenting out
the check to get something working quickly, then forgetting about
the issue completely.

Szaka

2003-03-12 08:06:59

by Richard Henderson

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, Mar 12, 2003 at 09:02:08AM +0100, Szakacsits Szabolcs wrote:
> gcc team must have, haven't it? Do you know?

I have one test case. It was never turned into anything
that you could run.

> I thought about it, I'm just afraid too much kernel wouldn't build.

Then it won't build. Use a different compiler.

> This bug is in most 2.95, 2.96 and according to Alan in 3.0 and early
> 3.1) and people would just start "working around" it by commenting out
> the check for getting something to work quickly then forgetting about
> the issue completely.

The bug report I can find,

http://gcc.gnu.org/ml/gcc-patches/2001-06/msg00746.html

was fixed before gcc 3.0.0 was released. So if this is
a different bug...


r~

2003-03-12 08:42:08

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Richard Henderson wrote:
> On Wed, Mar 12, 2003 at 09:02:08AM +0100, Szakacsits Szabolcs wrote:
> > gcc team must have, haven't it? Do you know?
>
> I have one test case. It was never turned into anything
> that you could run.

The simplest test case I've found is at

http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=57760

I did/do not have time to check it closely now but at first sight it's
the same I've found.

> > I thought about it, I'm just afraid too much kernel wouldn't build.
>
> Then it won't build. Use a different compiler.

If we knew the impact [how widely broken compilers are used] we could
decide about the approach, IMHO.

Please note, this is not only about how to make the build fail but how not
to waste potentially significant user and developer time if someone
circumvents the blocking phase and starts distributing [intentionally or
not] the broken binaries. The bigger the impact, the bigger the chance it
happens.

> > This bug is in most 2.95, 2.96 and according to Alan in 3.0 and early
> > 3.1) and people would just start "working around" it by commenting out
> > the check for getting something to work quickly then forgetting about
> > the issue completely.
>
> The bug report I can find,
>
> http://gcc.gnu.org/ml/gcc-patches/2001-06/msg00746.html
>
> was fixed before gcc 3.0.0 was released. So if this is
> a different bug...

Could it also be the classical "copy-paste [slightly change one occasion]
then fix one occasion" type of bug? I've never looked at the quality of
the gcc source. I really don't know.

Szaka

2003-03-12 09:14:14

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
> On Wed, 12 Mar 2003, Richard Henderson wrote:
> > > This bug is in most 2.95, 2.96 and according to Alan in 3.0 and early
> > > 3.1) and people would just start "working around" it by commenting out
> > > the check for getting something to work quickly then forgetting about
> > > the issue completely.
> >
> > The bug report I can find,
> >
> > http://gcc.gnu.org/ml/gcc-patches/2001-06/msg00746.html
> >
> > was fixed before gcc 3.0.0 was released. So if this is
> > a different bug...
>
> Could be also the classical "copy-paste [slightly change one occasion]
> then fix one occasion" type of bug? I've never looked the quality of
> the gcc source. I really don't know.

Sorry, looks "a bit" more complex. And quoting you "Confirmed. I'd
classify this as a fairly serious bug. I am currently testing the
following fix."

I can confirm that the latest Red Hat 7.3 gcc errata (issued 13
months later) indeed generate correct code for at least
__ntfs_init_inode() with Randy's .config file. The compiler version,

% gcc --version
2.96

Oops, not too talkative considering how heavily patched ...

% rpm -qf =gcc
gcc-2.96-113

gcc generated code from Mandrake 9.1 RC2 is also ok.

% gcc --version
gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-2mdk)

Szaka

2003-03-12 10:09:15

by Arjan van de Ven

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, 2003-03-12 at 07:07, Szakacsits Szabolcs wrote:
> On 11 Mar 2003, Linus Torvalds wrote:
> >
> > If there is a well-known list of compilers, we should put a BIG warning
> > in some core kernel file to guide people to upgrade (or maybe work
>
> Not enough, nobody would notice and today most end user doesn't
> compile the kernel himself, they are just shipped by a broken kernels.

and all vendors have always shipped -fomit-frame-pointer kernels so far, so those
users are ok! Until recently there was no way to build a kernel without
-fomit-frame-pointer!



2003-03-12 15:12:01

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On 12 Mar 2003, Arjan van de Ven wrote:
>
> and all vendors have always shipped -fomit-frame-pointer kernels so far, so those
> users are ok! Until recently there was no way to build a kernel without
> -fomit-frame-pointer!

Not entirely true.

Even with the traditional -fomit-frame-pointer build, "sched.c" has always
been built with -fno-omit-frame-pointer in order to get the correct
"wchan" of callers of schedule() and wait_on().

See kernel/Makefile for details.

So yes, old kernels (and CONFIG_FRAME_POINTER=n) have traditionally
avoided the bug _mostly_. But it could still bite us in some rather
important functions.

Linus

2003-03-12 15:14:18

by Arjan van de Ven

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, Mar 12, 2003 at 07:20:39AM -0800, Linus Torvalds wrote:
>
> On 12 Mar 2003, Arjan van de Ven wrote:
> >
> > and all vendors have always shipped -fomit-frame-pointer kernels so far, so those
> > users are ok! Until recently there was no way to build a kernel without
> > -fomit-frame-pointer!
>
> Not entirely true.
>
> Even with the traditional -fomit-frame-pointer build, "sched.c" has always
> been built with -fno-omit-frame-pointer in order to get the correct
> "wchan" of callers of schedule() and wait_on().
>
> See kernel/Makefile for details.
>
> So yes, old kernels (and CONFIG_FRAME_POINTER=n) have traditionally
> avoided the bug _mostly_. But it could still bite us in some rather
> important functions.

I know. And when the gcc bug was found (and fixed)
we audited the disassembly of sched.o for this and it
didn't get triggered by this bug.

2003-03-12 15:24:45

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
> On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
> > On Wed, 12 Mar 2003, Richard Henderson wrote:
> > > > This bug is in most 2.95, 2.96 and according to Alan in 3.0 and early
> > > > 3.1) and people would just start "working around" it by commenting out
> > > > the check for getting something to work quickly then forgetting about
> > > > the issue completely.
> > >
> > > The bug report I can find,
> > >
> > > http://gcc.gnu.org/ml/gcc-patches/2001-06/msg00746.html
> > >
> > > was fixed before gcc 3.0.0 was released. So if this is
> > > a different bug...

Some data points, in time order.

SuSE 8.0        2.95.3-216      no bug yet      [1]
Debian 3.0      2.95.4-14       no bug yet      [1]
Red Hat 7.[23]  2.96-81         no bug yet      [2,3]
Red Hat 7.[23]  2.96-98         bug introduced  [2,3]
Mandrake 8.1    2.96-0.62mdk    bug introduced  [4]
Red Hat 7.[23]  2.96-103        bug fixed       [2,3]
SuSE 8.0        3.0.4 (SuSE)    bug fixed       [1]
Mandrake 9.1    3.2.2-2mdk      bug fixed       [1]

[1] I checked
[2] http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=57760
[3] https://rhn.redhat.com/errata/RHBA-2002-055.html
[4] http://bugme.osdl.org/show_bug.cgi?id=432

So it's not as serious as we first thought. Probably the "halt build
if broken compiler detected" approach is enough.

Szaka

2003-03-12 15:32:30

by Arjan van de Ven

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, Mar 12, 2003 at 04:35:10PM +0100, Szakacsits Szabolcs wrote:
> If all vendors is Red Hat then I believe you.

I say All Vendors simply because no vendor ships 2.5 kernels yet which
have the CONFIG option to NOT use -fomit-frame-pointer

2003-03-12 15:31:33

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On 12 Mar 2003, Arjan van de Ven wrote:
> On Wed, 2003-03-12 at 07:07, Szakacsits Szabolcs wrote:
> > On 11 Mar 2003, Linus Torvalds wrote:
> > >
> > > If there is a well-known list of compilers, we should put a BIG warning
> > > in some core kernel file to guide people to upgrade (or maybe work
> >
> > Not enough, nobody would notice and today most end user doesn't
> > compile the kernel himself, they are just shipped by a broken kernels.
>
> and all vendors have always shipped -fomit-frame-pointer kernels so far, so those
> users are ok! Until recently there was no way to build a kernel without
> -fomit-frame-pointer!

If all vendors is Red Hat then I believe you. I know Stephen C.
Tweedie audited the kernel. Please don't take things personally,
this is the kernel list.

Szaka

2003-03-12 15:29:23

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
>
> Some data points, in time order.
>
> SuSE 8.0        2.95.3-216      no bug yet      [1]
> Debian 3.0      2.95.4-14       no bug yet      [1]
> Red Hat 7.[23]  2.96-81         no bug yet      [2,3]
> Red Hat 7.[23]  2.96-98         bug introduced  [2,3]
> Mandrake 8.1    2.96-0.62mdk    bug introduced  [4]
> Red Hat 7.[23]  2.96-103        bug fixed       [2,3]
> SuSE 8.0        3.0.4 (SuSE)    bug fixed       [1]
> Mandrake 9.1    3.2.2-2mdk      bug fixed       [1]

Ok. So the test really is for one particular version only.

That's easy. I'll just add a

#ifdef CONFIG_FRAME_POINTER
#if __GNUC__ == 2 && __GNUC_MINOR__ == 96
#error This compiler is not safe with frame pointers
#endif
#endif

to <linux/compiler.h>. Yeah, it will get some fixed compilers too, but
that's just not worth worrying about - people will just have to turn off
CONFIG_FRAME_POINTER and be happy.

Linus

2003-03-12 15:38:50

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Arjan van de Ven wrote:
>
> On Wed, Mar 12, 2003 at 04:35:10PM +0100, Szakacsits Szabolcs wrote:
> > If all vendors is Red Hat then I believe you.
>
> I say All Vendors simply because no vendor ships 2.5 kernels yet which
> have the CONFIG option to NOT use -fomit-frame-pointer

Actually, that config option came from the 2.4.x gdb tree, since gdb users
want to be able to see "where". So any vendor that included the remote gdb
patch would have gotten it too.. (except in that kernel it's called
CONFIG_REMOTE_DEBUG and brings in a lot more).

I don't know if any vendor kernels come with the kgdb patch..

Linus

2003-03-12 16:29:49

by Randy.Dunlap

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, 12 Mar 2003 07:47:08 -0800 (PST) Linus Torvalds <[email protected]> wrote:

|
| On Wed, 12 Mar 2003, Arjan van de Ven wrote:
| >
| > On Wed, Mar 12, 2003 at 04:35:10PM +0100, Szakacsits Szabolcs wrote:
| > > If all vendors is Red Hat then I believe you.
| >
| > I say All Vendors simply because no vendor ships 2.5 kernels yet which
| > have the CONFIG option to NOT use -fomit-frame-pointer
|
| Actually, that config option came from the 2.4.x gdb tree, since gdb users
| want to be able to see "where". So any vendor that included the remote gdb
| patch would have gotten it too.. (except in that kernel it's called
| CONFIG_REMOTE_DEBUG and brings in a lot more).
|
| I don't know if any vendor kernels come with the kgdb patch..

The kdb patch also adds the CONFIG_FRAME_POINTER option (2.4 and 2.5)
IIRC -- haven't looked lately.

--
~Randy

2003-03-12 16:42:12

by Randy.Dunlap

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, 12 Mar 2003 07:07:26 +0100 (MET) Szakacsits Szabolcs <[email protected]> wrote:

|
| On 11 Mar 2003, Linus Torvalds wrote:
| >
| > If there is a well-known list of compilers, we should put a BIG warning
| > in some core kernel file to guide people to upgrade (or maybe work
|
| Not enough, nobody would notice and today most end user doesn't
| compile the kernel himself, they are just shipped by a broken kernels.
|
| We *also* need a mechanism to know the kernel was compiled with a
| broken compiler (from kernel point of view of course, not the latest
| C++ features). Like the 'tainted' approach but this would be marked as
| broken/miscompiled/etc. To be able to tell the user *immediately* to
| complain to his vendor instead hunting/finding the bug again and
| again, as happening now.

Not quite what you describe, but the in-kernel-config (ikconfig) patch by
me & some HP folks saves a "built-with" string along with the .config file
(as a CONFIG option, of course).


| BTW, if possible having the Code both before and after when a fault
| happens could also help a lot in the future.

Do you just mean more opcode decoding?

| And in general, an oops counter would be also useful, not spending too
| much time decoding potentially bogus oopses.

Yes.

--
~Randy

2003-03-12 18:25:10

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
>
> The Code part of the Oops shows what's after EIP (i386). It's also
> important (if not more) what's before. I fail to see the difficulties
> to add this feature (or was it dropped?), ksymoops should handle it.

The difficulty is finding the right instruction boundary. It's basically
impossible.

If you want to get the instructions before that point, just use

gdb vmlinux

and disassemble it by hand. Because the kernel _cannot_ do it reliably.

Linus

2003-03-12 18:21:22

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Randy.Dunlap wrote:
>
> Not quite what you describe, but the in-kernel-config (ikconfig) patch by
> me & some HP folks saves a "built-with" string along with the .config file
> (as a CONFIG option, of course).

From a slightly different point of view, and in general, this is much better.

> | BTW, if possible having the Code both before and after when a fault
> | happens could also help a lot in the future.
>
> Do you just mean more opcode decoding?

The Code part of the Oops shows what's after EIP (i386). It's also
important (if not more) what's before. I fail to see the difficulties
to add this feature (or was it dropped?), ksymoops should handle it.

Real world example: I've spent 2 days trying to find a kernel
configuration that both builds and works (ordinary hardware, and I've been
using Linux for 8 years), then 3 days waiting for data I asked for. If
the above info had been in the Oops, somebody who already knew about this
compiler bug could have identified it immediately, with very high probability.

Szaka

2003-03-12 21:02:59

by Horst H. von Brand

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

Szakacsits Szabolcs <[email protected]> said:

[...]

> The Code part of the Oops shows what's after EIP (i386). It's also
> important (if not more) what's before. I fail to see the difficulties
> to add this feature (or was it dropped?), ksymoops should handle it.

It is _hard_ to do with variable length instructions (CISC, remember?), the
code is designed to be easily decoded forward, no one executes code going
backwards. Finding out what starts at EIP is easy.

When I needed to look at the code in an Oops I'd either objdump(1)ed it or
compiled the offending stuff to assembler (possibly with custom CFLAGS to
get info on line numbers and such in the output).
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-12 21:50:22

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Linus Torvalds wrote:
>
> The difficulty is finding the right instruction boundary. It's basically
> impossible.

If I understand you correctly, no. We have the boundary at EIP.
Decoding what's before is max 7-8 tries by a human and one can figure
out the real code from the context (with high probability). 2-3 times
more code before EIP than after could significantly help of course.

> If you want to get the instructions before that point, just use
>
> gdb vmlinux

This approach frequently fails because vmlinux is on a user's computer
far away and he

1) doesn't bother answering anymore
2) recompiled with different .config
3) reinstalled another distro
4) etc

> and disassemble it by hand. Because the kernel _cannot_ do it reliably.

The kernel shouldn't do it; it's not a disassembler. It should just give
enough data for a human and a disassembler. Nothing is lost but much can be
gained.

Szaka

2003-03-12 21:59:37

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Horst von Brand wrote:

> It is _hard_ to do with variable length instructions (CISC, remember?), the
> code is designed to be easily decoded forward, noone executes code going
> backwards.

Of course, it's a bad approach. You start earlier and stop at EIP.
Repeat this for max(instruction length) different offsets and you will
have the winner. Figure it out from the context after EIP.
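
The "start earlier and stop at EIP" procedure can be sketched in a few lines of C. This is only a toy under stated assumptions: toy_insn_len() is a hypothetical length decoder that knows just the handful of 32-bit opcodes needed for the 13 code bytes at 3cc..3d8 from the disassembly quoted earlier in this thread, and the function names are invented for illustration. It is not a real x86 disassembler and not anything the kernel or ksymoops does.

```c
#include <stddef.h>

/* Code bytes 3cc..3d8 from the quoted disassembly; the faulting
 * instruction (at EIP) begins immediately after them. */
static const unsigned char before_eip[] = {
    0x89, 0xd7,        /* mov  %edx,%edi */
    0x89, 0x55, 0xf4,  /* mov  %edx,0xfffffff4(%ebp) */
    0xf3, 0xa5,        /* repz movsl */
    0x8d, 0x50, 0x78,  /* lea  0x78(%eax),%edx */
    0x8b, 0x4d, 0xf4   /* mov  0xfffffff4(%ebp),%ecx */
};

/* Toy length decoder: returns 0 for anything unknown or truncated. */
static size_t toy_insn_len(const unsigned char *p, size_t avail)
{
    size_t len;
    unsigned mod, rm;

    if (avail == 0)
        return 0;
    /* dec/push reg (0x48..0x57), movsl, xlat, hlt: one byte each */
    if ((p[0] >= 0x48 && p[0] <= 0x57) || p[0] == 0xa5 ||
        p[0] == 0xd7 || p[0] == 0xf4)
        return 1;
    if (p[0] == 0xf3) {               /* rep prefix + next instruction */
        len = toy_insn_len(p + 1, avail - 1);
        return len ? len + 1 : 0;
    }
    if (p[0] == 0x78)                 /* js rel8 */
        return avail >= 2 ? 2 : 0;
    if (p[0] != 0x89 && p[0] != 0x8b && p[0] != 0x8d)
        return 0;                     /* not in the toy table */
    if (avail < 2)
        return 0;
    mod = p[1] >> 6;                  /* ModRM addressing mode */
    rm = p[1] & 7;
    len = 2;                          /* opcode + ModRM */
    if (mod != 3 && rm == 4)
        len += 1;                     /* SIB byte */
    if (mod == 1)
        len += 1;                     /* disp8 */
    else if (mod == 2 || (mod == 0 && rm == 5))
        len += 4;                     /* disp32 */
    return len <= avail ? len : 0;
}

/* Decode forward from start; succeed only if an instruction ends exactly at eip. */
static int lands_on_eip(const unsigned char *code, size_t start, size_t eip)
{
    size_t pos = start;

    while (pos < eip) {
        size_t n = toy_insn_len(code + pos, eip - pos);
        if (n == 0)
            return 0;
        pos += n;
    }
    return pos == eip;
}

/* Count how many start offsets yield a decoding that ends exactly at eip. */
static size_t count_landing_offsets(const unsigned char *code, size_t eip)
{
    size_t start, hits = 0;

    for (start = 0; start < eip; start++)
        hits += lands_on_eip(code, start, eip);
    return hits;
}
```

Calling count_landing_offsets(before_eip, sizeof before_eip) tries each start offset in turn. For these particular bytes every offset decodes cleanly up to EIP under the toy table, so reaching the boundary does not by itself single out the real instruction stream; the true decoding (offset 0) is among the candidates, and choosing it is the by-context step.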

> When I needed to look at the code in an Oops I'd either objdump(1)ed it or
> compiled the offending stuff to assembler (possibly with custom CFLAGS to
> get info on line numbers and such in the output).

I was talking about cases when you can't do these.

Szaka

2003-03-12 22:09:42

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
>
> If I understand you correctly, no. We have the boundary at EIP.

Yes.

> Decoding what's before is max 7-8 tries by a human and one can figure
> out the real code from the context (with high probability).

The point being "with high probability".

I'm not adding uncertain instruction decoding to the kernel.

Linus

2003-03-12 22:24:40

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Linus Torvalds wrote:
>
> I'm not adding uncertain instruction decoding to the kernel.

From some point of view I understand. But it's not uncertain. The
correct one is 100% included. The probability is the developer, not
the dumped code ... from another point of view.

Szaka

2003-03-12 23:07:25

by Bill Davidsen

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Wed, 12 Mar 2003, Linus Torvalds wrote:

>
> On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
> >
> > Some data points, in time order.
> >
> > SuSE 8.0        2.95.3-216      no bug yet      [1]
> > Debian 3.0      2.95.4-14       no bug yet      [1]
> > Red Hat 7.[23]  2.96-81         no bug yet      [2,3]
> > Red Hat 7.[23]  2.96-98         bug introduced  [2,3]
> > Mandrake 8.1    2.96-0.62mdk    bug introduced  [4]
> > Red Hat 7.[23]  2.96-103        bug fixed       [2,3]
> > SuSE 8.0        3.0.4 (SuSE)    bug fixed       [1]
> > Mandrake 9.1    3.2.2-2mdk      bug fixed       [1]
>
> Ok. So the test really is for one particular version only.
>
> That's easy. I'll just add a
>
> #ifdef CONFIG_FRAME_POINTER
> #if __GNUC__ == 2 && __GNUC_MINOR__ == 96
> #error This compiler is not safe with frame pointers
> #endif
> #endif
>
> to <linux/compiler.h>. Yeah, it will get some fixed compilers too, but
> that's just not worth worrying about - people will just have to turn off
> CONFIG_FRAME_POINTER and be happy.

Please don't use a hammer on that tack... The Redhat errata gcc (as an
example) seems to be fine (see the nice list in the first post) and id's
as:
oddball:davidsen> gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)

People trying to get 2.5 kernels working want frame pointers and kernel
symbols so problems can be reported in a useful way. People running 2.5
kernels are probably more likely to have installed errata.

Perhaps a warning (UNSAFE?) in the config would be better, or whatever
else would avoid just blocking the ability to compile a debug kernel.

Yes, I see there's another newer yet gcc, 2.96-113, but it's been working
since -103.
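
The softer alternative suggested above (warn instead of refusing to build) might look roughly like the sketch below. It is only an illustration: gcc_reports_296() is a hypothetical helper, not kernel code, and the guard reuses the CONFIG_FRAME_POINTER test from the quoted proposal. It also shows the underlying limitation: the version macros carry no errata level, so a broken 2.96-98 and a fixed 2.96-110 are indistinguishable.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: only the major and minor numbers
 * are visible to the preprocessor, so every 2.96 build looks the same.
 */
static int gcc_reports_296(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    return major == 2 && minor == 96;
}

/* Warn rather than #error, so a debug kernel can still be built. */
#if defined(CONFIG_FRAME_POINTER) && defined(__GNUC__)
#if __GNUC__ == 2 && __GNUC_MINOR__ == 96
#warning gcc 2.96 may miscompile with frame pointers unless errata are installed
#endif
#endif
```

With a warning the decision stays with whoever builds the kernel, which is the point of the objection above.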

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-03-12 23:24:12

by Randy.Dunlap

Subject: [PATCH] OOPS counters

On Wed, 12 Mar 2003 07:07:26 +0100 (MET) Szakacsits Szabolcs <[email protected]> wrote:

| And in general, an oops counter would be also useful, not spending too
| much time decoding potentially bogus oopses.

Hi,

This patch (to 2.5.64) adds an Oops counter to all die() and __die()
functions that I could find and prints the counter on each Oops: message
that looks like so (the "[#n]" part):

Oops: 0002 [#2]

Comments?
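
For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the same technique: a function-local static counter that is pre-incremented each time the header line is formatted. format_oops_header() is a made-up stand-in for the i386 die() printk, and unlike the kernel versions it is not protected by a lock.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the "Oops: 0002 [#n]" header line. */
static int format_oops_header(char *buf, size_t len, const char *str, long err)
{
    static int die_counter = 0;    /* persists across calls, as in the patch */

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: %04lx [#%d]", str,
                    (unsigned long)err & 0xffff, ++die_counter);
}
```

Calling it twice with ("Oops", 2) yields "Oops: 0002 [#1]" and then "Oops: 0002 [#2]", so a reader can tell at a glance whether a reported oops was the first one since boot.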

--
~Randy


patch_name: oops_counter.patch
patch_version: 2003-03-12.14:50:05
author: Randy.Dunlap <[email protected]>
description: Add an Oops counter to oops messages.
product: Linux
product_versions: 2.5.64
changelog: Add an oops counter message in all die() or __die() functions.
diffstat: =
arch/arm/kernel/traps.c | 3 ++-
arch/i386/kernel/traps.c | 3 ++-
arch/ia64/kernel/traps.c | 4 +++-
arch/mips/kernel/traps.c | 3 ++-
arch/mips64/kernel/traps.c | 3 ++-
arch/ppc/kernel/traps.c | 3 ++-
arch/ppc64/kernel/traps.c | 3 ++-
arch/s390/kernel/traps.c | 3 ++-
arch/s390x/kernel/traps.c | 3 ++-
arch/sh/kernel/traps.c | 3 ++-
arch/x86_64/kernel/traps.c | 3 ++-
11 files changed, 23 insertions(+), 11 deletions(-)


diff -Naur ./arch/ppc/kernel/traps.c%OOPSC ./arch/ppc/kernel/traps.c
--- ./arch/ppc/kernel/traps.c%OOPSC Tue Mar 4 19:29:03 2003
+++ ./arch/ppc/kernel/traps.c Wed Mar 12 14:48:47 2003
@@ -86,13 +86,14 @@

void die(const char * str, struct pt_regs * fp, long err)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
#ifdef CONFIG_PMAC_BACKLIGHT
set_backlight_enable(1);
set_backlight_level(BACKLIGHT_MAX);
#endif
- printk("Oops: %s, sig: %ld\n", str, err);
+ printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter);
show_regs(fp);
spin_unlock_irq(&die_lock);
/* do_exit() should take care of panic'ing from an interrupt
diff -Naur ./arch/i386/kernel/traps.c%OOPSC ./arch/i386/kernel/traps.c
--- ./arch/i386/kernel/traps.c%OOPSC Tue Mar 4 19:29:01 2003
+++ ./arch/i386/kernel/traps.c Wed Mar 12 13:10:33 2003
@@ -247,11 +247,12 @@

void die(const char * str, struct pt_regs * regs, long err)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
bust_spinlocks(1);
handle_BUG(regs);
- printk("%s: %04lx\n", str, err & 0xffff);
+ printk("%s: %04lx [#%d]\n", str, err & 0xffff, ++die_counter);
show_registers(regs);
bust_spinlocks(0);
spin_unlock_irq(&die_lock);
diff -Naur ./arch/mips/kernel/traps.c%OOPSC ./arch/mips/kernel/traps.c
--- ./arch/mips/kernel/traps.c%OOPSC Tue Mar 4 19:29:17 2003
+++ ./arch/mips/kernel/traps.c Wed Mar 12 14:38:40 2003
@@ -191,12 +191,13 @@
extern void __die(const char * str, struct pt_regs * regs, const char *where,
unsigned long line)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
printk("%s", str);
if (where)
printk(" in %s, line %ld", where, line);
- printk(":\n");
+ printk("[#%d]:\n", ++die_counter);
show_regs(regs);
printk("Process %s (pid: %d, stackpage=%08lx)\n",
current->comm, current->pid, (unsigned long) current);
diff -Naur ./arch/ppc64/kernel/traps.c%OOPSC ./arch/ppc64/kernel/traps.c
--- ./arch/ppc64/kernel/traps.c%OOPSC Tue Mar 4 19:29:19 2003
+++ ./arch/ppc64/kernel/traps.c Wed Mar 12 14:47:46 2003
@@ -62,10 +62,11 @@

void die(const char *str, struct pt_regs *regs, long err)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
bust_spinlocks(1);
- printk("Oops: %s, sig: %ld\n", str, err);
+ printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter);
show_regs(regs);
bust_spinlocks(0);
spin_unlock_irq(&die_lock);
diff -Naur ./arch/mips64/kernel/traps.c%OOPSC ./arch/mips64/kernel/traps.c
--- ./arch/mips64/kernel/traps.c%OOPSC Tue Mar 4 19:29:30 2003
+++ ./arch/mips64/kernel/traps.c Wed Mar 12 14:47:11 2003
@@ -161,12 +161,13 @@

void die(const char * str, struct pt_regs * regs, unsigned long err)
{
+ static int die_counter = 0;
if (user_mode(regs)) /* Just return if in user mode. */
return;

console_verbose();
spin_lock_irq(&die_lock);
- printk("%s: %04lx\n", str, err & 0xffff);
+ printk("%s: %04lx [#%d]\n", str, err & 0xffff, ++die_counter);
show_regs(regs);
printk("Process %s (pid: %d, stackpage=%08lx)\n",
current->comm, current->pid, (unsigned long) current);
diff -Naur ./arch/ia64/kernel/traps.c%OOPSC ./arch/ia64/kernel/traps.c
--- ./arch/ia64/kernel/traps.c%OOPSC Tue Mar 4 19:29:52 2003
+++ ./arch/ia64/kernel/traps.c Wed Mar 12 14:46:28 2003
@@ -101,6 +101,7 @@
.lock_owner = -1,
.lock_owner_depth = 0
};
+ static int die_counter = 0;

if (die.lock_owner != smp_processor_id()) {
console_verbose();
@@ -111,7 +112,8 @@
}

if (++die.lock_owner_depth < 3) {
- printk("%s[%d]: %s %ld\n", current->comm, current->pid, str, err);
+ printk("%s[%d]: %s %ld [%d]\n",
+ current->comm, current->pid, str, err, ++die_counter);
show_regs(regs);
} else
printk(KERN_ERR "Recursive die() failure, output suppressed\n");
diff -Naur ./arch/arm/kernel/traps.c%OOPSC ./arch/arm/kernel/traps.c
--- ./arch/arm/kernel/traps.c%OOPSC Tue Mar 4 19:29:17 2003
+++ ./arch/arm/kernel/traps.c Wed Mar 12 14:45:13 2003
@@ -208,12 +208,13 @@
NORET_TYPE void die(const char *str, struct pt_regs *regs, int err)
{
struct task_struct *tsk = current;
+ static int die_counter = 0;

console_verbose();
spin_lock_irq(&die_lock);
bust_spinlocks(1);

- printk("Internal error: %s: %x\n", str, err);
+ printk("Internal error: %s: %x [#%d]\n", str, err, ++die_counter);
print_modules();
printk("CPU: %d\n", smp_processor_id());
show_regs(regs);
diff -Naur ./arch/x86_64/kernel/traps.c%OOPSC ./arch/x86_64/kernel/traps.c
--- ./arch/x86_64/kernel/traps.c%OOPSC Tue Mar 4 19:28:53 2003
+++ ./arch/x86_64/kernel/traps.c Wed Mar 12 14:44:07 2003
@@ -325,11 +325,12 @@
{
int cpu;
struct die_args args = { regs, str, err };
+ static int die_counter = 0;
console_verbose();
notifier_call_chain(&die_chain, DIE_DIE, &args);
bust_spinlocks(1);
handle_BUG(regs);
- printk("%s: %04lx\n", str, err & 0xffff);
+ printk("%s: %04lx [#%d]\n", str, err & 0xffff, ++die_counter);
cpu = safe_smp_processor_id();
/* racy, but better than risking deadlock. */
local_irq_disable();
diff -Naur ./arch/s390x/kernel/traps.c%OOPSC ./arch/s390x/kernel/traps.c
--- ./arch/s390x/kernel/traps.c%OOPSC Tue Mar 4 19:29:32 2003
+++ ./arch/s390x/kernel/traps.c Wed Mar 12 14:43:29 2003
@@ -228,10 +228,11 @@

void die(const char * str, struct pt_regs * regs, long err)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
bust_spinlocks(1);
- printk("%s: %04lx\n", str, err & 0xffff);
+ printk("%s: %04lx [#%d]\n", str, err & 0xffff, ++die_counter);
show_regs(regs);
bust_spinlocks(0);
spin_unlock_irq(&die_lock);
diff -Naur ./arch/sh/kernel/traps.c%OOPSC ./arch/sh/kernel/traps.c
--- ./arch/sh/kernel/traps.c%OOPSC Tue Mar 4 19:28:56 2003
+++ ./arch/sh/kernel/traps.c Wed Mar 12 14:43:03 2003
@@ -58,9 +58,10 @@

void die(const char * str, struct pt_regs * regs, long err)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
- printk("%s: %04lx\n", str, err & 0xffff);
+ printk("%s: %04lx [#%d]\n", str, err & 0xffff, ++die_counter);
show_regs(regs);
spin_unlock_irq(&die_lock);
do_exit(SIGSEGV);
diff -Naur ./arch/s390/kernel/traps.c%OOPSC ./arch/s390/kernel/traps.c
--- ./arch/s390/kernel/traps.c%OOPSC Tue Mar 4 19:29:15 2003
+++ ./arch/s390/kernel/traps.c Wed Mar 12 14:42:21 2003
@@ -226,10 +226,11 @@

void die(const char * str, struct pt_regs * regs, long err)
{
+ static int die_counter = 0;
console_verbose();
spin_lock_irq(&die_lock);
bust_spinlocks(1);
- printk("%s: %04lx\n", str, err & 0xffff);
+ printk("%s: %04lx [#%d]\n", str, err & 0xffff, ++die_counter);
show_regs(regs);
bust_spinlocks(0);
spin_unlock_irq(&die_lock);

2003-03-13 00:58:27

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:
> > I'm not adding uncertain instruction decoding to the kernel.
>
> From some point of view I understand. But it's not uncertain. The
> correct one is 100% included.

Sorry, there is _no_ way you can do it correctly.

The preceding bytes may not even be code - they can be constant data in
the code segment. Trying to decode them as code just generates garbage in
those circumstances.

Linus

2003-03-13 17:52:03

by Zach Brown

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Tue, Mar 11, 2003 at 05:29:46PM +0100, Szakacsits Szabolcs wrote:

> Randy's compiler is 2.96 and it forgot to do a 'sub $0xc,%esp'. See
> for yourself; all the data is at http://bugme.osdl.org/show_bug.cgi?id=432

we definitely ran into this in Lustre, too:

https://lxr.lustre.org/source/configure.in#027
https://lxr.lustre.org/source/lib/obd_pack.c#049

- z

2003-03-13 21:56:47

by Horst H. von Brand

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

Szakacsits Szabolcs <[email protected]> said:
> On Wed, 12 Mar 2003, Horst von Brand wrote:
> > It is _hard_ to do with variable length instructions (CISC, remember?), the
> > code is designed to be easily decoded forward, no one executes code going
> > backwards.

> Of course, it's a bad approach. You start earlier and stop at EIP.
> Repeat this for max(instruction length) different offsets and you will
> have the winner. Figure it out from the context after EIP.

By hand, OK. Automatically, no.

> > When I needed to look at the code in an Oops I'd either objdump(1)ed it or
> > compiled the offending stuff to assembler (possibly with custom CFLAGS to
> > get info on line numbers and such in the output).

> I was talking about cases when you can't do these.

I did this to find out where in the source it went south, and then look
around to find out why. A copy of that kernel's source is required anyway.

If you can divine the breakage just from the asm, more power to you. For us
mere mortals it isn't enough.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-13 21:58:39

by Horst H. von Brand

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

Linus Torvalds <[email protected]> said:
> On Wed, 12 Mar 2003, Szakacsits Szabolcs wrote:

[...]

> > Decoding what's before is max 7-8 tries by a human and one can figure
> > out the real code from the context (with high probability).
>
> The point being "with high probability".
>
> I'm not adding uncertain instruction decoding to the kernel.

No need. Just dump some bytes before EIP raw, plus raw bytes + decoded
after EIP. Could be of some help.
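
A raw dump of this kind could be formatted as in the sketch below. Everything here is illustrative: format_code_line() is a hypothetical helper, and marking the byte at EIP with <> is borrowed from the 'Code:' line convention that ksymoops is described as recognising later in this thread. No decoding is attempted, so nothing uncertain is printed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Writes "Code: aa bb <cc> dd ..." into out, where code[eip_index] is the
 * byte at EIP. Returns the string length, or -1 if out is too small.
 */
static int format_code_line(char *out, size_t outlen,
                            const unsigned char *code, size_t n,
                            size_t eip_index)
{
    size_t i, used;
    int w;

    assert(out != NULL && code != NULL && eip_index < n);
    w = snprintf(out, outlen, "Code:");
    if (w < 0 || (size_t)w >= outlen)
        return -1;
    used = (size_t)w;
    for (i = 0; i < n; i++) {
        if (i == eip_index)
            w = snprintf(out + used, outlen - used, " <%02x>", code[i]);
        else
            w = snprintf(out + used, outlen - used, " %02x", code[i]);
        if (w < 0 || (size_t)w >= outlen - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

For the bytes 89 e5 8b 45 with eip_index 2 this produces "Code: 89 e5 <8b> 45".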
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-13 23:15:22

by Linus Torvalds

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Thu, 13 Mar 2003, Horst von Brand wrote:
>
> No need. Just dump some bytes before EIP raw, plus raw bytes + decoded
> after EIP. Could be of some help.

Alpha does this. Of course, there you don't have any of the partial
instruction issues.

Linus

2003-03-14 01:02:10

by Jonathan Lundell

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

At 3:24pm -0800 3/13/03, Linus Torvalds wrote:
>On Thu, 13 Mar 2003, Horst von Brand wrote:
>>
>> No need. Just dump some bytes before EIP raw, plus raw bytes + decoded
>> after EIP. Could be of some help.
>
>Alpha does this. Of course, there you don't have any of the partial
>instruction issues.

If you've got a symbol some reasonable distance before EIP, you could
decode from there. I wrote a little code that does that (using
kallsyms) very crudely in the stack trace in order to give the reader
a hint about stack frames. Go to the prior symbol, which is usually
an entry point, and find the %esp arithmetic. Works pretty well for
figuring out the real call chain.
--
/Jonathan Lundell.

2003-03-14 04:18:20

by Randy.Dunlap

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

> At 3:24pm -0800 3/13/03, Linus Torvalds wrote:
>>On Thu, 13 Mar 2003, Horst von Brand wrote:
>>>
>>> No need. Just dump some bytes before EIP raw, plus raw bytes + decoded
>>> after EIP. Could be of some help.
>>
>>Alpha does this. Of course, there you don't have any of the partial
>> instruction issues.
>
> If you've got a symbol some reasonable distance before EIP, you could
> decode from there. I wrote a little code that does that (using
> kallsyms) very crudely in the stack trace in order to give the reader a
> hint about stack frames. Go to the prior symbol, which is usually an entry
> point, and find the %esp arithmetic. Works pretty well for figuring out the
> real call chain.

as long as it's not a data symbol...
can you determine that?

~Randy



2003-03-14 06:16:11

by Jonathan Lundell

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

At 8:29pm -0800 3/13/03, Randy.Dunlap wrote:
> > If you've got a symbol some reasonable distance before EIP, you could
>> decode from there. I wrote a little code that does that (using
>> kallsyms) very crudely in the stack trace in order to give the reader a
>> hint about stack frames. Go to the prior symbol, which is usually an entry
>> point, and find the %esp arithmetic. Works pretty well for figuring out the
>> real call chain.
>
>as long as it's not a data symbol...
>can you determine that?

Sometimes/mostly, and btw my code is i386-only. The trace in question
is arch/i386/kernel/traps.c:show_trace(). It already makes the test
kernel_text_address(), which works in the kernel, but not for modules
(at least in the kernel I'm using: 2.4.9 (don't ask)).

For addresses in the trace (as opposed to the trapped EIP), I look
for a call instruction preceding the putative return address. That's
backwards assembly, but since there are relatively few possibilities,
it seems to work fairly well.
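
The "look for a call before the return address" test can be sketched as follows. This is not the code described above, only a guess at its general shape: call_before() is a hypothetical name, and it recognises just two i386 encodings, the 5-byte direct call (e8 rel32) and the 2-byte indirect call through a register (ff /2 with mod = 3). Memory-indirect calls are ignored, and a stray e8 byte inside some other instruction will produce a false positive.

```c
#include <assert.h>
#include <stddef.h>

/*
 * text:    bytes of a candidate text region
 * ret_off: offset of the putative return address within text
 * Returns the length of a call instruction that ends at ret_off, or 0.
 */
static size_t call_before(const unsigned char *text, size_t ret_off)
{
    assert(text != NULL);
    /* e8 rel32: direct near call, 5 bytes */
    if (ret_off >= 5 && text[ret_off - 5] == 0xe8)
        return 5;
    /* ff d0..d7: call through a register (ModRM mod=3, reg=2), 2 bytes */
    if (ret_off >= 2 && text[ret_off - 2] == 0xff &&
        (text[ret_off - 1] & 0xf8) == 0xd0)
        return 2;
    return 0;
}
```

A non-zero result is only a hint that the stack word is a genuine return address, which matches the "clue, not proof" spirit of the description.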

So finding a call is a good clue that we're looking at text. Look
back from the call for argument pushes (I stop at the first non-push,
because of the backwards-disassembly problem), then go to the
previous symbol and scan forward for pushes and subtracts from %esp.
The sum of all those, plus four bytes for the return link, gives me a
lower limit on frame size. It's not perfect; a real disassembly
forward from the symbol would maybe be better, but that seems like
overkill (what to do with branches, etc).
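
The forward scan for pushes and %esp subtractions might be sketched like this. It is an assumption-laden simplification rather than the code described above: prologue_frame_bytes() is a hypothetical name, it understands only push %reg (50..57), mov %esp,%ebp (89 e5), sub $imm8,%esp (83 ec ib) and sub $imm32,%esp (81 ec id), and it stops at the first byte it does not recognise instead of continuing through the function.

```c
#include <assert.h>
#include <stddef.h>

/* Lower bound on the frame size implied by a function's first bytes. */
static unsigned long prologue_frame_bytes(const unsigned char *p, size_t n)
{
    unsigned long bytes = 4;    /* return link pushed by the call */
    size_t i = 0;

    assert(p != NULL);
    while (i < n) {
        if (p[i] >= 0x50 && p[i] <= 0x57) {
            bytes += 4;                     /* push %reg */
            i += 1;
        } else if (i + 1 < n && p[i] == 0x89 && p[i + 1] == 0xe5) {
            i += 2;                         /* mov %esp,%ebp: no stack change */
        } else if (i + 2 < n && p[i] == 0x83 && p[i + 1] == 0xec &&
                   p[i + 2] < 0x80) {
            bytes += p[i + 2];              /* sub $imm8,%esp */
            i += 3;
        } else if (i + 5 < n && p[i] == 0x81 && p[i + 1] == 0xec &&
                   p[i + 5] < 0x80) {
            bytes += (unsigned long)p[i + 2] |
                     (unsigned long)p[i + 3] << 8 |
                     (unsigned long)p[i + 4] << 16 |
                     (unsigned long)p[i + 5] << 24;  /* sub $imm32,%esp */
            i += 6;
        } else {
            break;                          /* unrecognised byte: stop */
        }
    }
    return bytes;
}
```

For the prologue 55 89 e5 83 ec 0c 53 this gives 4 + 4 + 12 + 4 = 24 bytes. A missing 'sub $0xc,%esp', as in the miscompiled code that started this thread, would show up as a smaller total.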

The idea isn't to be perfect anyway, but to give me hints for
manually reconstructing the call chain. Way better than nothing.

But for your purposes, disassembling from the previous symbol gives
you a code dump, and you know that EIP had better be pointing to text.
--
/Jonathan Lundell.

2003-03-14 07:33:41

by Denis Vlasenko

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On 13 March 2003 23:04, Horst von Brand wrote:
> Szakacsits Szabolcs <[email protected]> said:
> > On Wed, 12 Mar 2003, Horst von Brand wrote:
> > > It is _hard_ to do with variable length instructions (CISC,
> > > remember?), the code is designed to be easily decoded forward,
> > > no one executes code going backwards.
> >
> > Of course, it's a bad approach. You start earlier and stop at EIP.
> > Repeat this for max(instruction length) different offsets and you
> > will have the winner. Figure it out from the context after EIP.
>
> By hand, OK. Automatically, no.

Why not? Disassemble from, say, EIP-16 and check whether you
have an instruction starting exactly at EIP. If no, repeat from EIP-15, -14...
You are guaranteed to succeed at EIP-0 ;)
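
The loop described above can be sketched with a stand-in length decoder. Note the loud assumption: toy_insn_len() is a HYPOTHETICAL toy encoding (the low two bits of the first byte give a length of 1..4), not x86. A real version would need a full i386 instruction-length decoder in its place; lands_on() and first_sync() are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL toy encoding, not x86: low two bits + 1 = length (1..4). */
static size_t toy_insn_len(const unsigned char *p)
{
    return (size_t)(p[0] & 3) + 1;
}

/* Does a forward decode starting at `start` land exactly on `eip`? */
static int lands_on(const unsigned char *code, size_t start, size_t eip)
{
    size_t pos = start;

    while (pos < eip)
        pos += toy_insn_len(code + pos);
    return pos == eip;
}

/* Try EIP-back, EIP-back+1, ...; return the first start that syncs. */
static size_t first_sync(const unsigned char *code, size_t eip, size_t back)
{
    size_t start;

    assert(code != NULL);
    start = eip > back ? eip - back : 0;
    while (!lands_on(code, start, eip))
        start++;                /* start == eip always succeeds */
    return start;
}
```

With the toy bytes 03 02 01 00 00 01 ff 00 and EIP at offset 7, the true instruction boundaries are 0, 4 and 5, yet a scan that begins at offset 1 (inside the first instruction) also reaches EIP. The first match is therefore not necessarily the correct decoding of what precedes EIP.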
--
vda

2003-03-14 07:59:54

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Wed, 12 Mar 2003, Linus Torvalds wrote:
>
> The preceding bytes may not even be code - they can be constant data in
> the code segment. Trying to decode them as code just generates garbage in
> those circumstances.

What exactly do you mean by "garbage"? There could be several kinds
(e.g. caused by a jump to EIP). My best guess is that you don't want to
dump the bytes before EIP if they don't start on the correct instruction
boundary: the one the CPU was or could be executing, or the one a
reliable "off-line" disassembly of the oopsed function would give.

Bcode, meaning "before code" [well, a wrong choice, it could be
misunderstood as byte code], would be the bytes before Code. They do
not necessarily start on the _correct_ instruction boundary (14% of the
time they do). One should disassemble them separately from offsets
0,1,...6 (for pedantic coders, or in case of a later failure, up to 14)
and choose the one that makes sense based on

1) next instruction boundary is on EIP (can be automated)
and
2) has something to do with the C source code
and
3) the assembly makes sense (considering compiler
optimizations, generated dead/bad code, etc)
and
4) the assembly fits the context after EIP.

If you think this would result in more confusion than benefit, I
understand (promised to my fiancee to say so ;)

On the other hand, if the kernel did this, a simple script could be
written to analyse the last two years' kernel oopses [and future ones]
on linux-kernel and tell which oopses resulted from this
access-below-stack compiler bug. Yes, some minimal human intervention
would still be needed to confirm all of them, but IMHO it's more
productive than just leaving them unsolved and having doubts about
kernel quality.

Szaka

2003-03-14 09:48:43

by Helge Hafting

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

Szakacsits Szabolcs wrote:
[...]
> Bcode, meaning before code [well, wrong choice, could be misunderstood
> as byte code], would mean it's the bytes before Code. They are not
> necessarily start on the _correct_ instruction boundary (14% they
> are). One should disassemble them separately from offset 0,1,...6
> (pedantic coders or in case of a later failure to 14) and choose the
> one that makes sense based on
>
> 1) next instruction boundary is on EIP (can be automated)
> and
The problem is that several offsets will fit into this. Going
backwards from those positions gives even more options when
going two instructions back and so on. And if you run into
an illegal opcode - was it a "wrong" attempt or did you
merely go beyond the start of the function?

> 2) has something to do with the C source code
And how do you plan on achieving that? This one is
impossible for the kernel, as the kernel doesn't know its
own sources. (Now that _can_ be arranged, but it
won't be easy without regular file access in the kernel.)
But even with the source, how would you determine that the
disassembled stuff "has something to do with the source?"
Even programmers are sometimes surprised by what compilers,
and particularly optimizing compilers do. I don't think
you or anybody else can provide a tool that reliably maps
assembly to source. And if it isn't reliable, it is no use.


> and
> 3) the assembly makes sense (considering compiler
> optimizations, generated dead/bad code, etc)
> and
> 4) the assembly fits the context after EIP.
>
> If you think this would result in more confusion than benefit, I
> understand (promised to my fiancee to say so ;)
>
A tool doing this would be nice, but achieving 2 and 3 is impossible.
And even if you could do backwards disassembly with 95% success per
instruction you'd run into more and more trouble the farther backwards
you get. And then there's the problem of loops and jumps. Perhaps
the code did a nice long jump to the instruction that faulted. The
"previous instructions" are then useless because they weren't executed.
But there won't be any hint of that in the oops.

If you want an interesting exercise, try implementing (1) above.
Make a tool and try to disassemble perhaps 2-3 instructions
backwards from some random point in an object file. Make sure
your tool outputs _all_ valid combinations of instructions, not
just the first one. See how many you get. Then see how
far you get with 2,3 and 4.
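
A first cut at the counting part of this exercise could look like the sketch below. It deliberately does not use x86: toy_insn_len() is a HYPOTHETICAL encoding (low two bits of the first byte give a length of 1..4) so that the example stays short, and count_chains() is a made-up name. It counts every sequence of exactly k instructions whose last instruction ends at EIP.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEN 4

/* HYPOTHETICAL toy encoding, not x86: low two bits + 1 = length (1..4). */
static size_t toy_insn_len(const unsigned char *p)
{
    return (size_t)(p[0] & 3) + 1;
}

/* Number of distinct chains of exactly k instructions that end at eip. */
static unsigned long count_chains(const unsigned char *code, size_t eip,
                                  unsigned k)
{
    unsigned long total = 0;
    size_t len;

    assert(code != NULL);
    if (k == 0)
        return 1;               /* empty chain: nothing left to explain */
    for (len = 1; len <= TOY_MAX_LEN && len <= eip; len++)
        if (toy_insn_len(code + eip - len) == len)
            total += count_chains(code, eip - len, k - 1);
    return total;
}
```

On the toy bytes 03 02 01 00 00 01 ff 00 with EIP at offset 7 there is one candidate for the last instruction and one for the last two, but four different candidates for the last three. Even in this tiny case the ambiguity grows with distance from EIP.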


> On the other hand, if the kernel did this, a simple script could be
> written analysing the last two years kernel oopses [and future ones]

Most really old oopses are either fixed or otherwise irrelevant. The
code they refer to is changed, and there are newer versions of
the compiler.

Helge Hafting

2003-03-14 10:57:56

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Fri, 14 Mar 2003, Helge Hafting wrote:

> The problem is that several offsets will fit into this. Going
> backwards

You never ever go backwards. It's impossible. You always go ahead.
Even if you want to disassemble backwards you go ahead. How? You start
earlier, and you only have to do it a very limited number of times
(compared to the number of variations going backwards), each with a
different offset.

> > 2) has something to do with the C source code
> And how do you plan on achieving that? This one is

Manual investigation. I don't expect the kernel to start dumping the
"before code" at the correct instruction boundary, even if Jonathan's
idea/code sounds brilliant for doing so (I didn't check it).

What I tried to ask for is purely reading and dumping some bytes
before EIP as well, for postmortem analysis. That's all, nothing
complicated.

Szaka

2003-03-14 12:12:12

by Szabolcs Szakacsits

Subject: Backward disassembling (was: Re: 2.5.63 accesses below %esp)


On Fri, 14 Mar 2003, Denis Vlasenko wrote:
> On 13 March 2003 23:04, Horst von Brand wrote:
> > Szakacsits Szabolcs <[email protected]> said:
> > >
> > > Of course, it's a bad approach. You start earlier and stop at EIP.
> > > Repeat this for max(instruction length) different offsets and you
> > > will have the winner. Figure it out from the context after EIP.
> >
> > By hand, OK. Automatically, no.
>
> Why not? Disassemble from, say, EIP-16 and check whether you have
> an instruction starting exactly at EIP. If no, repeat from EIP-15,
> -14... You are guaranteed to succeed at EIP-0 ;)

Disassembling must be started "much" earlier. From your example one
could get the impression you want to get the instruction right before
EIP. It's not possible to go back this way. For example if you want to
disassemble 100 bytes before EIP you must start at EIP-100 and EIP-101
and ... and EIP-100-max_instruction_length+1. Then you have the right
one among them (well, 99.9% but let's not be too pedantic).

You also can't stop the above max_instruction_length iteration when
the next instruction address matches EIP. You can even have
max_instruction_length matches. But from the additional info (code
after EIP, assembly "quality", available source where the crash
happened) you could choose the right one.
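
The "try max_instruction_length consecutive start offsets and keep every match" procedure can be sketched as below. The decoder is again a HYPOTHETICAL toy (low two bits of the first byte give a length of 1..4), not x86, and sync_mask() is a made-up name. It returns a bitmask rather than stopping at the first hit, since several start offsets may all arrive at EIP.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEN 4

/* HYPOTHETICAL toy encoding, not x86: low two bits + 1 = length (1..4). */
static size_t toy_insn_len(const unsigned char *p)
{
    return (size_t)(p[0] & 3) + 1;
}

/* Does a forward decode starting at `start` land exactly on `eip`? */
static int lands_on(const unsigned char *code, size_t start, size_t eip)
{
    size_t pos = start;

    while (pos < eip)
        pos += toy_insn_len(code + pos);
    return pos == eip;
}

/*
 * Bit j of the result is set when decoding from eip - window - j reaches
 * eip. One of these TOY_MAX_LEN starts is a true instruction boundary,
 * provided the region really is contiguous code.
 */
static unsigned sync_mask(const unsigned char *code, size_t eip, size_t window)
{
    unsigned mask = 0;
    size_t j;

    assert(code != NULL && window <= eip);
    for (j = 0; j < TOY_MAX_LEN && window + j <= eip; j++)
        if (lands_on(code, eip - window - j, eip))
            mask |= 1u << j;
    return mask;
}
```

On the toy bytes 03 02 01 00 00 01 ff 00 with EIP at offset 7 and a window of 3, all four start offsets match (mask 0xf). Choosing among them still requires the extra context listed above.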

Szaka

2003-03-14 16:43:01

by Jonathan Lundell

Subject: Re: Backward disassembling (was: Re: 2.5.63 accesses below %esp)

At 1:16pm +0100 3/14/03, Szakacsits Szabolcs wrote:
> > Why not? Disassemble from, say, EIP-16 and check whether you have
>> an instruction starting exactly at EIP. If no, repeat from EIP-15,
>> -14... You are guaranteed to succeed at EIP-0 ;)
>
>Disassembling must be started "much" earlier. From your example one
>could get the impression you want to get the instruction right before
>EIP. It's not possible to go back this way. For example if you want to
>disassemble 100 bytes before EIP you must start at EIP-100 and EIP-101
>and ... and EIP-100-max_instruction_length+1. Then you have the right
>one among them (well, 99.9% but let's not be too pedantic).
>
>You also can't stop the above max_instruction_length iteration when
>the next instruction address matches EIP. You can have even
>max_instruction_length matches. But from the additional info (code
>after EIP, assembly "quality", available source where the crash
>happened) you could choose the right one.

Sounds similar to the problem of recognizing valid plaintext when
breaking a code.

As a practical matter (and in the context of this being a heuristic
debugging aid, not a guaranteed 100%-correct method), I wonder
whether one might not tend to sync up fairly quickly to the correct
code. For example, strings of one-byte instructions provide a
"landing zone" for disassembly leading up to them, and illegal
instructions provide clues that you're out of sync (not perfect, but
perhaps good enough).

I'm not in a position to do it right now, but I'd suggest trying it:
disassemble hunks of random code on random boundaries, and see how
many ways there tend to be of arriving at EIP+0, given enough of a
BEIP running start (for some definition of "enough").
--
/Jonathan Lundell.

2003-03-14 17:56:20

by Olaf Titz

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

> code is designed to be easily decoded forward, no one executes code going
> backwards. Finding out what starts at EIP is easy.

I remember reading once in a magazine that there exists an
undocumented/illegal instruction in the x86 which causes the IP to run
backwards, similar to setting the D flag.

Was an April 1st issue though ;-)

Olaf

2003-03-14 18:42:39

by Richard B. Johnson

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Fri, 14 Mar 2003, Olaf Titz wrote:

> > code is designed to be easily decoded forward, no one executes code going
> > backwards. Finding out what starts at EIP is easy.
>
> I remember reading once in a magazine that there exists an
> undocumented/illegal instruction in the x86 which causes the IP to run
> backwards, similar to setting the D flag.
>
> Was an April 1st issue though ;-)
>
> Olaf

There was a whole operating system written upon this principle.
I think it was called "retrograde", erm, "Redmond", yes, that's
what it was, something out of Redmond, Washington, ASU ^M^M^M USA


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-03-15 19:18:58

by Horst H. von Brand

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

"Randy.Dunlap" <[email protected]> said:

[...]

> > If you've got a symbol some reasonable distance before EIP,

What is a "reasonable distance"? What if no such symbol is found? What if
it is data?
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-15 19:28:23

by Horst H. von Brand

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

Denis Vlasenko <[email protected]> said:
> On 13 March 2003 23:04, Horst von Brand wrote:
> > Szakacsits Szabolcs <[email protected]> said:
> > > On Wed, 12 Mar 2003, Horst von Brand wrote:
> > > > It is _hard_ to do with variable length instructions (CISC,
> > > > remember?), the code is designed to be easily decoded forward,
> > > > no one executes code going backwards.
> > >
> > > Of course, it's a bad approach. You start earlier and stop at EIP.
> > > Repeat this for max(instruction length) different offsets and you
> > > will have the winner. Figure it out from the context after EIP.
> >
> > By hand, OK. Automatically, no.
>
> Why not? Disassemble from, say, EIP-16 and check whether you
> have an instruction starting exactly at EIP. If no, repeat from EIP-15, -14...
> You are guaranteed to succeed at EIP-0 ;)

But your previous success (if any) doesn't mean anything, and might even
screw up the decoding after EIP (if accidentally an address looks like an
instruction, say). This is too much work (to get right) for something of
purely informational value (if that much), generated by a suspect kernel
(an Oops is when something went wrong...).
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-15 19:36:50

by Randy.Dunlap

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

> "Randy.Dunlap" <[email protected]> said:
>
> [...]
>
>> > If you've got a symbol some reasonable distance before EIP,
>
> What is a "reasonable distance"? What if no such symbol is found? What if it
> is data?
> --

Come on, Horst, you can do better than that.

I mean the (almost) attribution, although the quote has "> >"...
so see http://marc.theaimsgroup.com/?l=linux-kernel&m=104760482905929&w=2,
for the writer of that clause.

It would be better just to omit the attribution than to bungle it.

~Randy



2003-03-17 07:32:30

by Denis Vlasenko

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On 15 March 2003 20:34, Horst von Brand wrote:
> Denis Vlasenko <[email protected]> said:
> > On 13 March 2003 23:04, Horst von Brand wrote:
> > > Szakacsits Szabolcs <[email protected]> said:
> > > > On Wed, 12 Mar 2003, Horst von Brand wrote:
> > > > > It is _hard_ to do with variable length instructions (CISC,
> > > > > remember?), the code is designed to be easily decoded
> > > > > forward, no one executes code going backwards.
> > > >
> > > > Of course, it's a bad approach. You start earlier and stop at
> > > > EIP. Repeat this for max(instruction length) different offsets
> > > > and you will have the winner. Figure it out from the context
> > > > after EIP.
> > >
> > > By hand, OK. Automatically, no.
> >
> > Why not? Disassemble from, say, EIP-16 and check whether you
> > have an instruction starting exactly at EIP. If no, repeat from
> > EIP-15, -14... You are guaranteed to succeed at EIP-0 ;)
>
> But your previous success (if any) doesn't mean anything, and might
> even screw up the decoding after EIP

How come? If I started to decode at EIP-n and got a sequence of
instructions at EIP-n, EIP-n+k1, EIP-n+k2, EIP-n+k3..., EIP,
instructions prior to EIP can be wrong. Instruction at EIP
and all subsequent ones ought to be right.

> (if accidentally an address
> looks like an instruction, say). This is too much work (to get right)
> for something of purely informational value (if that much), generated
> by a suspect kernel (an Oops is when something went wrong...).
--
vda

2003-03-17 21:33:03

by Horst H. von Brand

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

Denis Vlasenko <[email protected]> said:
> On 15 March 2003 20:34, Horst von Brand wrote:
> > Denis Vlasenko <[email protected]> said:

[...]

> > > Why not? Disassemble from, say, EIP-16 and check whether you
> > > have an instruction starting exactly at EIP. If no, repeat from
> > > EIP-15, -14... You are guaranteed to succeed at EIP-0 ;)

> > But your previous success (if any) doesn't mean anything, and might
> > even screw up the decoding after EIP

> How come? If I started to decode at EIP-n and got a sequence of
> instructions at EIP-n, EIP-n+k1, EIP-n+k2, EIP-n+k3..., EIP,
> instructions prior to EIP can be wrong. Instruction at EIP
> and all subsequent ones ought to be right.

Iff you exactly hit EIP that way (sure, should check). But wrong previous
instructions _will_ confuse people or start them on all kinds of wild goose
chases. Too much work for a dubious gain.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2003-03-18 04:37:16

by Keith Owens

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Mon, 17 Mar 2003 17:43:21 -0400,
Horst von Brand <[email protected]> wrote:
>Denis Vlasenko <[email protected]> said:
>> How come? If I started to decode at EIP-n and got a sequence of
>> instructions at EIP-n, EIP-n+k1, EIP-n+k2, EIP-n+k3..., EIP,
>> instructions prior to EIP can be wrong. Instruction at EIP
>> and all subsequent ones ought to be right.
>
>Iff you exactly hit EIP that way (sure, should check). But wrong previous
>instructions _will_ confuse people or start them on all kinds of wild goose
>chases. Too much work for a dubious gain.

At the risk of stating the obvious: the only program that cares about
the 'Code:' line is ksymoops. It already handles code around the EIP
by looking for a byte enclosed in <> and assuming that byte is at EIP.
ksymoops can happily decode around the failing instruction and does so
for most architectures with fixed length instructions.

I can change ksymoops to add a special case for architectures with
variable length instructions - i386, s390 and their 64 bit equivalents,
are there any others? For variable length instructions, ksymoops will
extract the bytes up to but not including eip, decode and print them
with a warning

    This architecture has variable length instructions, decoding before eip is
    unreliable, take these instructions with a pinch of salt.

Then the code from eip onwards will be decoded as normal, with the
heading 'This code should be reliable'. If a kernel with variable
length instructions prints 'Code:' with a byte enclosed in <> then you
get two decodes with suitable warning messages. No <> in the code line
means no change from current decode state, everybody is happy.
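
A minimal sketch of the split described above, assuming a 'Code:' line of space-separated hex bytes with the byte at EIP enclosed in <>. The function name split_code_line and the sample line are made up for illustration; this is not ksymoops code.

```python
import re

def split_code_line(line):
    """Split an Oops 'Code:' line at the byte enclosed in <>.

    Returns (bytes_before_eip, bytes_from_eip). If no byte is marked,
    everything is treated as starting at EIP, i.e. no change from the
    current behaviour described above.
    """
    tokens = line.split(":", 1)[1].split()
    before, after, seen_marker = [], [], False
    for tok in tokens:
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            seen_marker = True
            after.append(int(m.group(1), 16))
        elif seen_marker:
            after.append(int(tok, 16))
        else:
            before.append(int(tok, 16))
    if not seen_marker:
        return b"", bytes(before)
    return bytes(before), bytes(after)

# Hypothetical sample line, not taken from a real Oops.
unreliable, reliable = split_code_line("Code: 8b 45 08 <89> 50 04")
```

The first part would be decoded under the "pinch of salt" warning and the second under the "should be reliable" heading.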

2003-03-18 06:06:07

by Denis Vlasenko

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On 17 March 2003 23:43, Horst von Brand wrote:
> Denis Vlasenko <[email protected]> said:
> > On 15 March 2003 20:34, Horst von Brand wrote:
> > > Denis Vlasenko <[email protected]> said:
>
> [...]
>
> > > > Why not? Disassemble from, say, EIP-16 and check whether you
> > > > have an instruction starting exactly at EIP. If no, repeat from
> > > > EIP-15, -14... You are guaranteed to succeed at EIP-0 ;)
> > >
> > > But your previous success (if any) doesn't mean anything, and
> > > might even screw up the decoding after EIP
> >
> > How come? If I started to decode at EIP-n and got a sequence of
> > instructions at EIP-n, EIP-n+k1, EIP-n+k2, EIP-n+k3..., EIP,
> > instructions prior to EIP can be wrong. Instruction at EIP
> > and all subsequent ones ought to be right.
>
> Iff you exactly hit EIP that way (sure, should check). But wrong
> previous instructions _will_ confuse people or start them on all kinds
> of wild goose chases. Too much work for a dubious gain.

You are right. But that is better than showing no prior instructions
at all. And most of the time (can I say 90% ?) prior instructions
will be ok.
--
vda

2003-03-18 06:24:46

by John Alvord

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Tue, 18 Mar 2003 08:05:30 +0200, Denis Vlasenko
<[email protected]> wrote:

>On 17 March 2003 23:43, Horst von Brand wrote:
>> Denis Vlasenko <[email protected]> said:
>> > On 15 March 2003 20:34, Horst von Brand wrote:
>> > > Denis Vlasenko <[email protected]> said:
>>
>> [...]
>>
>> > > > Why not? Disassemble from, say, EIP-16 and check whether you
>> > > > have an instruction starting exactly at EIP. If no, repeat from
>> > > > EIP-15, -14... You are guaranteed to succeed at EIP-0 ;)
>> > >
>> > > But your previous success (if any) doesn't mean anything, and
>> > > might even screw up the decoding after EIP
>> >
>> > How come? If I started to decode at EIP-n and got a sequence of
>> > instructions at EIP-n, EIP-n+k1, EIP-n+k2, EIP-n+k3..., EIP,
>> > instructions prior to EIP can be wrong. Instruction at EIP
>> > and all subsequent ones ought to be right.
>>
>> Iff you exactly hit EIP that way (sure, should check). But wrong
>> previous instructions _will_ confuse people or start them on all kinds
>> of wild goose chases. Too much work for a dubious gain.
>
>You are right. But that is better than showing no prior instructions
>at all. And most of the time (can I say 90% ?) prior instructions
>will be ok.

You can also show the instruction sequences that make sense and let
the human figure out the correct sequence when there are multiples.

john

2003-03-18 07:00:40

by Hugh Dickins

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Tue, 18 Mar 2003, Keith Owens wrote:
>
> I can change ksymoops to add a special case for architectures with
> variable length instructions - i386, s390 and their 64 bit equivalents,
> are there any others? For variable length instructions, ksymoops will
> extract the bytes up to but not including eip, decode and print them
> with a warning
>
> This architecture has variable length instructions, decoding before eip is
> unreliable, take these instructions with a pinch of salt.
>
> Then the code from eip onwards will be decoded as normal, with the
> heading 'This code should be reliable'.

If you go ahead with this (I'm indifferent), please remember that to
get reliable code from eip onwards, you need to handle the way both
2.4 and 2.5 nowadays pack short __LINE__ number and long __FILE__
pointer after BUG()'s ud2a (on i386).

Hugh
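
To illustrate the packing mentioned above: a rough sketch, assuming the bytes at EIP are ud2a (0f 0b) followed by a 16-bit line number and a 32-bit file-name pointer, both little-endian as on i386. The helper name skip_bug_payload and the sample values are hypothetical.

```python
import struct

UD2A = b"\x0f\x0b"  # i386 ud2a opcode bytes

def skip_bug_payload(code):
    """`code` holds the bytes starting at EIP.

    If it begins with ud2a and is long enough, unpack the short line
    number and long file pointer that follow, and return
    (line, file_ptr, rest) where `rest` is what a disassembler should
    be fed next. Otherwise return (None, None, code) unchanged.
    """
    if code[:2] == UD2A and len(code) >= 8:
        line, file_ptr = struct.unpack_from("<HI", code, 2)
        return line, file_ptr, code[8:]
    return None, None, code

# Made-up sample: BUG() at line 123, file string at 0xc0301234,
# followed by one more code byte.
sample = UD2A + struct.pack("<HI", 123, 0xC0301234) + b"\x90"
line, file_ptr, rest = skip_bug_payload(sample)
```

Without a step like this, the six data bytes are decoded as if they were instructions, which is where the nonsense after a BUG() comes from.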

2003-03-18 19:38:28

by Szabolcs Szakacsits

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))


On Tue, 18 Mar 2003, Keith Owens wrote:
> At the risk of stating the obvious: the only program that cares about
> the 'Code:' line is ksymoops. It already handles code around the EIP
> by looking for a byte enclosed in <> and assuming that byte is at EIP.
> ksymoops can happily decode around the failing instruction and does so
> for most architectures with fixed length instructions.
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Ah, this is the reason it didn't work for x86 when I looked at this issue
with ksymoops days ago and tried all possible bracketing combinations
(no such limit is described in the man page). I didn't mention this
before because it's a non-issue if the kernel doesn't dump the backwards
bytes for these archs.

> I can change ksymoops to add a special case for architectures with
> variable length instructions - i386, s390 and their 64 bit equivalents,
> are there any others?

Please don't bother. Linus has already indicated four times in this
thread that he will not dump backwards code if it doesn't start at an
instruction boundary.

> For variable length instructions, ksymoops will extract the bytes
> up to but not including eip, decode and print them with a warning
>
> This architecture has variable length instructions, decoding before eip is
> unreliable, take these instructions with a pinch of salt.

86% of the time it will be incorrect on x86. But the right code can be
dumped 100% of the time among a maximum of 7 decoded lists to choose from
(for pedants: yes, in theory it's not exactly 100% but a bit less; it is
"100%" only from a practical, problem-solving point of view). I've found
the number of lists is usually 1, 2 or 3, with decreasing probabilities,
but I didn't do an exhaustive statistical analysis and it also depends on
the compiler (version).

Szaka
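
One way to read the "decoded lists to choose from" idea is sketched below: collect every backwards decode that lands exactly on EIP and drop those that are merely the tail of a longer one. The encoding is again a made-up toy (low two bits of the first byte give the length), not x86, and walk and candidate_lists are hypothetical names; the counts it produces say nothing about the real x86 figures quoted above.

```python
# Toy encoding, NOT x86: low two bits of the first byte give the length.
def insn_len(opcode):
    return (opcode & 0x03) + 1

def walk(code, start, eip):
    """Offsets visited when decoding from `start`, or None if the
    decode does not land exactly on `eip`."""
    pos, seen = start, []
    while pos < eip:
        seen.append(pos)
        pos += insn_len(code[pos])
    return tuple(seen) if pos == eip else None

def candidate_lists(code, eip, max_back=16):
    """All distinct decodes before EIP that a human would choose among."""
    walks = []
    for start in range(max(0, eip - max_back), eip):
        w = walk(code, start, eip)
        if w is not None:
            walks.append(w)
    # A walk that is just the tail of a longer walk adds no information.
    return [w for w in walks
            if not any(o != w and o[-len(w):] == w for o in walks)]

# Real instruction starts are 0, 4, 6; EIP is at offset 7.
code = bytes([0x03, 0x00, 0x00, 0x00, 0x01, 0x02, 0x00, 0x90])
lists = candidate_lists(code, eip=7)
```

For this sample two lists survive: the real one starting at offset 0 and a bogus one starting at offset 1. Picking between them is left to the reader of the Oops.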

2003-03-20 10:37:41

by Keith Owens

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Tue, 18 Mar 2003 07:13:18 +0000 (GMT),
Hugh Dickins <[email protected]> wrote:
>If you go ahead with this (I'm indifferent)

ksymoops 2.4.9 can decode variable length instructions before eip
without affecting the reliability of the code from eip onwards. It is
up to the kernel whether it dumps before eip or not.

>please remember that to
>get reliable code from eip onwards, you need to handle the way both
>2.4 and 2.5 nowadays pack short __LINE__ number and long __FILE__
>pointer after BUG()'s ud2a (on i386).

Nothing I can do about that. ksymoops uses objdump to decode the
instructions and objdump does not know that the kernel abuses ud2a to
add embedded line and file numbers. In any case it is irrelevant, the
only thing that ud2a ever tells you is "here there be BUG()". For
BUG() the code before eip is much more useful, see above.

2003-03-20 10:51:58

by Hugh Dickins

Subject: Re: 2.5.63 accesses below %esp (was: Re: ntfs OOPS (2.5.63))

On Thu, 20 Mar 2003, Keith Owens wrote:
> On Tue, 18 Mar 2003 07:13:18 +0000 (GMT),
> Hugh Dickins <[email protected]> wrote:
> >If you go ahead with this (I'm indifferent)
>
> ksymoops 2.4.9 can decode variable length instructions before eip
> without affecting the reliability of the code from eip onwards. It is
> up to the kernel whether it dumps before eip or not.
>
> >please remember that to
> >get reliable code from eip onwards, you need to handle the way both
> >2.4 and 2.5 nowadays pack short __LINE__ number and long __FILE__
> >pointer after BUG()'s ud2a (on i386).
>
> Nothing I can do about that. ksymoops uses objdump to decode the
> instructions and objdump does not know that the kernel abuses ud2a to
> add embedded line and file numbers. In any case it is irrelevant, the
> only thing that ud2a ever tells you is "here there be BUG()". For
> BUG() the code before eip is much more useful, see above.

But better not to describe the code shown from eip onwards as
"always reliable": if after a BUG() it's alarming nonsense!

Hugh