2002-04-09 08:43:44

by Andrey Nekrasov

Subject: BUG: 2.4.19-pre6aa1

Hello.

1. kernel 2.4.19-pre6aa1, 1CPU, highmem 4Gb, userspace 3.5Gb

2. log from serial console:

...
VFS: Mounted root (nfs filesystem).
Freeing unused kernel memory: 252k freed
INIT: version 2.78 booting
kernel BUG at panic.c:139!
invalid operand: 0000
CPU: 0
EIP: 0010:[<e0115c1c>] Not tainted
EFLAGS: 00010202
eax: e27d3260 ebx: 3ffe5005 ecx: 00000120 edx: ffff02b0
esi: ffff02b0 edi: ffff12b0 ebp: 080ad000 esp: e1c17f1c
ds: 0018 es: 0018 ss: 0018
Process init (pid: 1, stackpage=e1c17000)
Stack: e012069b e27d3130 e1c16000 e27d3158 e27d5134 00000001 080ad000 e27de080
e27d4080 e0114caa e27d5134 e27d50a4 e27d3260 e27ca000 e27d6584 e27d6aa4
00000011 e27d3260 e27d50a4 e27d50c0 e1c16000 e27d326c e01154b7 00000011
Call Trace: [<e012069b>] [<e0114caa>] [<e01154b7>] [<e0107270>] [<e010858b>]

Code: 0f 0b 8b 00 36 f5 25 e0 eb fe 8d 76 00 8d bc 27 00 00 00 00
<0>Kernel panic: Attempted to kill init!


3. ksymoops:

..

>>EIP; e0115c1c <out_of_line_bug+0/14> <=====
Trace; e012069a <copy_page_range+1da/334>
Trace; e0114caa <copy_mm+222/2bc>
Trace; e01154b6 <do_fork+42e/744>
Trace; e0107270 <sys_fork+14/1c>
Trace; e010858a <system_call+32/38>
Code; e0115c1c <out_of_line_bug+0/14>
00000000 <_EIP>:
Code; e0115c1c <out_of_line_bug+0/14> <=====
0: 0f 0b ud2a <=====
Code; e0115c1e <out_of_line_bug+2/14>
2: 8b 00 mov (%eax),%eax
Code; e0115c20 <out_of_line_bug+4/14>
4: 36 ss
Code; e0115c20 <out_of_line_bug+4/14>
5: f5 cmc
Code; e0115c22 <out_of_line_bug+6/14>
6: 25 e0 eb fe 8d and $0x8dfeebe0,%eax
Code; e0115c26 <out_of_line_bug+a/14>
b: 76 00 jbe d <_EIP+0xd> e0115c28 <out_of_line_bug+c/14>
Code; e0115c28 <out_of_line_bug+c/14>
d: 8d bc 27 00 00 00 00 lea 0x0(%edi,1),%edi




--
bye.
Andrey Nekrasov, SpyLOG.


2002-04-09 09:13:17

by Andrew Morton

Subject: Re: BUG: 2.4.19-pre6aa1

Andrey Nekrasov wrote:
>
> ..
> >>EIP; e0115c1c <out_of_line_bug+0/14> <=====
> Trace; e012069a <copy_page_range+1da/334>
> Trace; e0114caa <copy_mm+222/2bc>
> Trace; e01154b6 <do_fork+42e/744>
> Trace; e0107270 <sys_fork+14/1c>

hmm. That out-of-line stuff has obfuscated the trace
a bit. It died in kunmap_atomic or kmap_atomic, part
of Andrea's pte-highmem additions.

I guess the out-of-line bug should be if !CONFIG_DEBUG_KERNEL.


2002-04-09 16:39:29

by Andrea Arcangeli

Subject: Re: BUG: 2.4.19-pre6aa1

On Tue, Apr 09, 2002 at 02:13:00AM -0700, Andrew Morton wrote:
> Andrey Nekrasov wrote:
> >
> > ..
> > >>EIP; e0115c1c <out_of_line_bug+0/14> <=====
> > Trace; e012069a <copy_page_range+1da/334>
> > Trace; e0114caa <copy_mm+222/2bc>
> > Trace; e01154b6 <do_fork+42e/744>
> > Trace; e0107270 <sys_fork+14/1c>
>
> hmm. That out-of-line stuff has obfuscated the trace
> a bit. It died in kunmap_atomic or kmap_atomic, part
> of Andrea's pte-highmem additions.
>
> I guess the out-of-line bug should be if !CONFIG_DEBUG_KERNEL.

I hadn't complained yet, but the whole point of BUG() was to get such
a printk in the right place. The above report is trivial: the
debugging check triggered a false-positive bugcheck due to
CONFIG_DEBUG_HIGHMEM=y (I always compile with =n, which is why I didn't
trigger it here). But sometimes it isn't that easy to find out, in
particular when there are plenty of BUG()s in a row as in
page_alloc.c, so I disagree with the merge of out_of_line_bug into
mainline.

I will fix the false-positive bugcheck in the next -aa; for now you can
simply recompile the kernel with CONFIG_DEBUG_HIGHMEM=n (kernel hacking
menu) and you'll be just fine.

thanks for the feedback Andrey,

Andrea

2002-04-09 19:53:07

by Andrew Morton

Subject: Re: BUG: 2.4.19-pre6aa1

Andrea Arcangeli wrote:
>
> On Tue, Apr 09, 2002 at 02:13:00AM -0700, Andrew Morton wrote:
> > Andrey Nekrasov wrote:
> > >
> > > ..
> > > >>EIP; e0115c1c <out_of_line_bug+0/14> <=====
> > > Trace; e012069a <copy_page_range+1da/334>
> > > Trace; e0114caa <copy_mm+222/2bc>
> > > Trace; e01154b6 <do_fork+42e/744>
> > > Trace; e0107270 <sys_fork+14/1c>
> >
> > hmm. That out-of-line stuff has obfuscated the trace
> > a bit. It died in kunmap_atomic or kmap_atomic, part
> > of Andrea's pte-highmem additions.
> >
> > I guess the out-of-line bug should be if !CONFIG_DEBUG_KERNEL.
>
> I hadn't complained yet, but the whole point of BUG() was to get such
> a printk in the right place. The above report is trivial: the
> debugging check triggered a false-positive bugcheck due to
> CONFIG_DEBUG_HIGHMEM=y (I always compile with =n, which is why I didn't
> trigger it here). But sometimes it isn't that easy to find out, in
> particular when there are plenty of BUG()s in a row as in
> page_alloc.c, so I disagree with the merge of out_of_line_bug into
> mainline.

No, you misunderstand. All the BUG()s in .c files are unchanged.

out_of_line_bug() is used in one place only: in inline functions
which appear in commonly-included header files.

There are only ten or fifteen out_of_line_bug()s. We just happened
to hit one here. They were added by a process of peering at the
kernel image and asking "why does the same string appear 120 times?".

Yeah, it's all a bit sad. It's a workaround for a toolchain shortcoming,
and it does save 100 to 200 kbytes. If I'd been smarter I'd have
passed __LINE__ into out_of_line_bug(). It's only the string which
is a problem.
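
As a rough illustration of the __LINE__ idea (a user-space sketch, not code from the kernel tree): the helper below takes only an integer line number, so an inline function in a shared header adds no per-call-site string. The names out_of_line_bug_at, OUT_OF_LINE_BUG, checked_index and last_bug_line are hypothetical, and the sketch prints a message instead of trapping.

```c
#include <stdio.h>

/* Hypothetical: remembers the last reported line so the sketch can be checked. */
static int last_bug_line;

/*
 * Single out-of-line reporter. The message text exists once, here,
 * rather than once per file that includes the header.
 */
void out_of_line_bug_at(int line)
{
    last_bug_line = line;
    fprintf(stderr, "BUG in inline header code at line %d\n", line);
}

/* Hypothetical macro: pass only the integer line, no __FILE__ string. */
#define OUT_OF_LINE_BUG() out_of_line_bug_at(__LINE__)

/* Stand-in for an inline function that lives in a commonly-included header. */
static inline int checked_index(int idx, int limit)
{
    if (idx < 0 || idx >= limit) {
        OUT_OF_LINE_BUG();
        return -1;
    }
    return idx;
}
```

The cost at each call site is one integer argument. The file name is still lost, so two headers could report the same line number.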


There is a sneaky new featurette, btw. We sometimes see BUG
reports where the reporter failed to report the file-and-line.
But it's still available in the oops record:

Code: 0f 0b c2 05 d8 36 92 f0 83 c4 14 5b 5e 5f 5d c3 8d 76 00 8d
            ^^^^^
            This is the line number
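
A minimal sketch of reading that value back out of a "Code:" line, assuming the two bytes after the ud2 opcode (0f 0b) hold the line number as a little-endian 16-bit word. The function name oops_bug_line is hypothetical; it is not part of the kernel or of ksymoops.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given the text of an oops "Code:" line, return the
 * 16-bit little-endian value that follows the first ud2 opcode (0f 0b),
 * or -1 if no such sequence is found.
 */
int oops_bug_line(const char *code_line)
{
    unsigned char bytes[64];
    size_t n = 0;
    const char *p;

    assert(code_line != NULL);

    /* Skip the "Code:" prefix if present. */
    p = strstr(code_line, "Code:");
    p = p ? p + 5 : code_line;

    /* Parse whitespace-separated hex bytes until something else appears. */
    while (n < sizeof(bytes)) {
        char *end;
        unsigned long v = strtoul(p, &end, 16);
        if (end == p || v > 0xff)
            break;
        bytes[n++] = (unsigned char)v;
        p = end;
    }

    /* Find ud2 and combine the next two bytes, low byte first. */
    for (size_t i = 0; i + 3 < n; i++) {
        if (bytes[i] == 0x0f && bytes[i + 1] == 0x0b)
            return bytes[i + 2] | (bytes[i + 3] << 8);
    }
    return -1;
}
```

Applied to the Code: line above, c2 05 gives 0x05c2, which is 1474. Applied to the Code: line in the original report (0f 0b 8b 00 ...), it gives 139, matching "kernel BUG at panic.c:139!".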



2002-04-09 21:08:14

by Andrea Arcangeli

Subject: Re: BUG: 2.4.19-pre6aa1

On Tue, Apr 09, 2002 at 11:50:55AM -0700, Andrew Morton wrote:
> Andrea Arcangeli wrote:
> >
> > On Tue, Apr 09, 2002 at 02:13:00AM -0700, Andrew Morton wrote:
> > > Andrey Nekrasov wrote:
> > > >
> > > > ..
> > > > >>EIP; e0115c1c <out_of_line_bug+0/14> <=====
> > > > Trace; e012069a <copy_page_range+1da/334>
> > > > Trace; e0114caa <copy_mm+222/2bc>
> > > > Trace; e01154b6 <do_fork+42e/744>
> > > > Trace; e0107270 <sys_fork+14/1c>
> > >
> > > hmm. That out-of-line stuff has obfuscated the trace
> > > a bit. It died in kunmap_atomic or kmap_atomic, part
> > > of Andrea's pte-highmem additions.
> > >
> > > I guess the out-of-line bug should be if !CONFIG_DEBUG_KERNEL.
> >
> > I hadn't complained yet, but the whole point of BUG() was to get such
> > a printk in the right place. The above report is trivial: the
> > debugging check triggered a false-positive bugcheck due to
> > CONFIG_DEBUG_HIGHMEM=y (I always compile with =n, which is why I didn't
> > trigger it here). But sometimes it isn't that easy to find out, in
> > particular when there are plenty of BUG()s in a row as in
> > page_alloc.c, so I disagree with the merge of out_of_line_bug into
> > mainline.
>
> No, you misunderstand. All the BUG()s in .c files are unchanged.
>
> out_of_line_bug() is used in one place only: in inline functions
> which appear in commonly-included header files.

This particular oops could have been triggered by more than one
out_of_line_bug in the same function, copy_page_range. I found the
problem by re-reading the source code, certainly not by looking at the C
line where the out_of_line_bug was.

> There are only ten or fifteen out_of_line_bug()s. We just happened
> to hit one here. They were added by a process of peering at the
> kernel image and asking "why does the same string appear 120 times?".

The string duplication is ugly, but dropping the bug line is even
uglier. For embedded systems there was already a "turn off the strings"
compile-time switch; for embedded systems the other strings matter as
well.

> Yeah, it's all a bit sad. It's a workaround for a toolchain shortcoming,
> and it does save 100 to 200 kbytes. If I'd been smarter I'd have
> passed __LINE__ into out_of_line_bug(). It's only the string which
> is a problem.

That would have been much better indeed: the probability of a line
collision in a non-obvious place is very low, and a push on the stack
before calling the extern function won't grow the size of the kernel
image significantly.

> There is a sneaky new featurette, btw. We sometimes see BUG
> reports where the reporter failed to report the file-and-line.
> But it's still available in the oops record:
>
> Code: 0f 0b c2 05 d8 36 92 f0 83 c4 14 5b 5e 5f 5d c3 8d 76 00 8d
>             ^^^^^

Cute, I'll surely make use of this featurette in the future :). (It's
off-topic for the out-of-line bug, though; it still doesn't help there.)

Andrea