2000-11-20 21:34:29

by Oliver Poths

Subject: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

Hello again,

I tried again to build my soft-RAID the same way, with the same
result. But this time I am sending you the nice message the kernel
showed me:

Unable to handle kernel NULL pointer dereference at virtual address 00000010
printing eip:
c01c8e66
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01c8e66>]
EFLAGS: 00010246
eax: 00000000 ebx: c7f5f000 ecx: c7f61000 edx: c7fe1e64
esi: c7f58000 edi: c7f6abc0 ebp: 00001000 esp: c7fe1e18
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:1, stackpage=c7fe1000)
Stack: 00001000 c7f58000 c7f5f000 c7f61000 c7fe1e64 00000003 c7f6abc0 00000003
c01cbda5 00000003 c7fe1e64 00000003 00000003 c1249bc0 c1240400 c7fe1e78
00000000 c7f6abc0 c1240400 c7f6abc0 c7f75340 c7f752c0 00000000 00000000
Call Trace: [<c01cbda5>] [<c01cbe7c>] [<c01cc274>] [<c01c448c>]
[<c011ad30>] [<c01c47e1>] [<c024174f>]
[<c01c49f2>] [<c0107007>] [<c0108d58>]
Code: 8b 40 10 ff d0 83 c4 10 eb 3b 8b 42 0c 8b 78 34 83 7c 24 14
Kernel panic: Attempted to kill init!



looks fascinating...

Do you need the kernel-config?

Best regards
Oliver Poths


2000-11-20 21:50:59

by Tigran Aivazian

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

On Mon, 20 Nov 2000, Oliver Poths wrote:
> looks fascinating...

you know, it looks even more fascinating when you pass it through ksymoops
like this:

ksymoops < rawoops > oops

and then mail the result.

Regards,
Tigran


2000-11-20 22:38:31

by Oliver Poths

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

Here's the output of ksymoops:

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Warning (compare_maps): ksyms_base symbol machine_real_restart_R__ver_machine_real_restart not found in System.map. Ignoring ksyms_base entry
Unable to handle kernel NULL pointer dereference at virtual address 00000010
printing eip:
c01c8e66
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01c8e66>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000 ebx: c7f5f000 ecx: c7f61000 edx: c7fe1e64
esi: c7f58000 edi: c7f6abc0 ebp: 00001000 esp: c7fe1e18
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:1, stackpage=c7fe1000)
Stack: 00001000 c7f58000 c7f5f000 c7f61000 c7fe1e64 00000003 c7f6abc0 00000003
c01cbda5 00000003 c7fe1e64 00000003 00000003 c1249bc0 c1240400 c7fe1e78
00000000 c7f6abc0 c1240400 c7f6abc0 c7f75340 c7f752c0 00000000 00000000
Call Trace: [<c01cbda5>] [<c01cbe7c>] [<c01cc274>] [<c01c448c>]
[<c011ad30>] [<c01c47e1>] [<c024174f>]
[<c01c49f2>] [<c0107007>] [<c0108d58>]
Code: 8b 40 10 ff d0 83 c4 10 eb 3b 8b 42 0c 8b 78 34 83 7c 24 14

>>EIP; c01c8e66 <xor_block+46/90> <=====
Trace; c01cbda5 <__check_consistency+165/230>
Trace; c01cbe7c <check_consistency+c/20>
Trace; c01cc274 <raid5_run+3e4/6b0>
Trace; c01c448c <do_md_run+2ac/320>
Trace; c011ad30 <printk+150/160>
Trace; c01c47e1 <autorun_array+81/b0>
Trace; c024174f <usb_bandwidth_option+3da3/8339>
Trace; c01c49f2 <autorun_devices+1e2/210>
Trace; c0107007 <init+7/110>
Trace; c0108d58 <kernel_thread+28/40>
Code; c01c8e66 <xor_block+46/90>
00000000 <_EIP>:
Code; c01c8e66 <xor_block+46/90> <=====
0: 8b 40 10 mov 0x10(%eax),%eax <=====
Code; c01c8e69 <xor_block+49/90>
3: ff d0 call *%eax
Code; c01c8e6b <xor_block+4b/90>
5: 83 c4 10 add $0x10,%esp
Code; c01c8e6e <xor_block+4e/90>
8: eb 3b jmp 45 <_EIP+0x45> c01c8eab <xor_block+8b/90>
Code; c01c8e70 <xor_block+50/90>
a: 8b 42 0c mov 0xc(%edx),%eax
Code; c01c8e73 <xor_block+53/90>
d: 8b 78 34 mov 0x34(%eax),%edi
Code; c01c8e76 <xor_block+56/90>
10: 83 7c 24 14 00 cmpl $0x0,0x14(%esp,1)

Kernel panic: Attempted to kill init!

3 warnings issued. Results may not be reliable.

Hope this helps!
Oliver Poths


>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<

On 20.11.00, 22:22:30, Tigran Aivazian <[email protected]> wrote on the
subject Re: kernel-2.4.0-test11 crashed again; this time i send you the
Oops-message:


> On Mon, 20 Nov 2000, Oliver Poths wrote:
> > looks fascinating...

> you know, it looks even more fascinating when you pass it through ksymoops
> like this:

> ksymoops < rawoops > oops

> and then mail the result.

> Regards,
> Tigran

2000-11-20 23:23:32

by NeilBrown

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

On Monday November 20, [email protected] wrote:
> Here's the output of ksymoops:
>
>
> >>EIP; c01c8e66 <xor_block+46/90> <=====

In drivers/md/Makefile, swap the order of "raid5.o xor.o" to be "xor.o
raid5.o", recompile, install, reboot.

NeilBrown
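
A minimal model of why the order of the two objects can matter; it is not the kernel code. It assumes that built-in initializers run in link order and that xor.o's initializer fills in a function-pointer table that raid5.o's start-up path calls through, which is consistent with the oops above (`mov 0x10(%eax),%eax` faulting with eax = 0, followed by `call *%eax`). The names `XorState`, `xor_init`, `raid5_init` and `run_initcalls` are hypothetical.

```python
from functools import reduce
from operator import xor


class XorState:
    """Stands in for a global function-pointer table (hypothetical)."""
    active_template = None  # unset until xor_init() has run


def xor_blocks(blocks):
    # XOR equally sized byte blocks column by column.
    return bytes(reduce(xor, column) for column in zip(*blocks))


def xor_init():
    # Assumption: the real initializer selects an XOR routine at boot.
    XorState.active_template = xor_blocks


def raid5_init():
    # Assumption: array start-up runs a check that calls through the table.
    if XorState.active_template is None:
        raise RuntimeError("NULL template: xor not initialised yet")
    return XorState.active_template([b"\x0f\x0f", b"\xf0\xf0", b"\xff\xff"])


def run_initcalls(link_order):
    # Run each object's initializer in the order the objects were linked.
    XorState.active_template = None
    initcalls = {"xor.o": xor_init, "raid5.o": raid5_init}
    return [initcalls[obj]() for obj in link_order]
```

With `["xor.o", "raid5.o"]` the check runs against an initialised table; with `["raid5.o", "xor.o"]` it fails before xor.o has had a chance to set anything up, which is the situation the Makefile swap avoids.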

2000-11-23 01:38:17

by Peter Samuelson

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message


[Neil Brown]
> In drivers/md/Makefile, swap the order of "raid5.o xor.o" to be
> "xor.o raid5.o", recompile, install, reboot.

Don't forget the part about adding a comment saying that xor.c does in
fact need to come before raid5.c. This is the part that most likely
will not happen, so that two months from now nobody will remember it
and eventually it will trip us up again.

That's one of the things that our infamous LINK_FIRST infrastructure
would have done: pointed out special cases automatically so that even
*without* a comment people would look at it and immediately know "there
is *something* link-order-dependent here". Oh well.

Peter

2000-11-23 02:29:12

by Albert D. Cahalan

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

Peter Samuelson writes:
> [Neil Brown]

>> In drivers/md/Makefile, swap the order of "raid5.o xor.o" to be
>> "xor.o raid5.o", recompile, install, reboot.
>
> Don't forget the part about adding a comment saying that xor.c does in
> fact need to come before raid5.c. This is the part that most likely
> will not happen, so that two months from now nobody will remember it
> and eventually it will trip us up again.
>
> That's one of the things that our infamous LINK_FIRST infrastructure
> would have done: pointed out special cases automatically so that even
> *without* a comment people would look at it and immediately know "there
> is *something* link-order-dependent here". Oh well.

The infamous LINK_FIRST infrastructure was sort of half-way done.

It would be best to cause drivers with an unspecified link order
to move around a bit, so that errors may be discovered more quickly.

LINK_FIRST is pretty coarse. One would want a topological sort,
or at least LINK_0 through LINK_9 _without_ anything else.
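
One possible reading of the topological-sort suggestion, sketched under the assumption that each object could declare which objects must be linked before it. The `must_follow` mapping and the function name are hypothetical; the thread does not describe any such declaration format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def link_order(objects, must_follow):
    """Order objects so that every declared predecessor is linked first.

    must_follow maps an object name to the names that have to precede it.
    Objects without an entry are unconstrained.
    """
    sorter = TopologicalSorter()
    for obj in sorted(objects):
        sorter.add(obj, *must_follow.get(obj, ()))
    # static_order() raises graphlib.CycleError on contradictory constraints.
    return list(sorter.static_order())


order = link_order(
    ["raid5.o", "xor.o", "md.o"],
    {"raid5.o": ["xor.o"]},  # the one constraint known from this thread
)
```

Only the relative position of xor.o and raid5.o is constrained here; where md.o lands is left to the sorter.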

2000-11-23 02:49:52

by Keith Owens

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

On Wed, 22 Nov 2000 20:58:28 -0500 (EST),
"Albert D. Cahalan" <[email protected]> wrote:
>The infamous LINK_FIRST infrastructure was sort of half-way done.
>
>It would be best to cause drivers with an unspecified link order
>to move around a bit, so that errors may be discovered more quickly.

The "other" list in LINK_FIRST is sorted by name. It could be changed
to a random sort, probably based on a hash of size and mtime. It would
be relatively expensive so would have to be restricted to an "exercise
the kernel" CONFIG option.

>LINK_FIRST is pretty coarse. One would want a topological sort,
>or at least LINK_0 through LINK_9 _without_ anything else.

There is no need for multiple LINK_n entries; the objects partition
neatly into three groups: LINK_FIRST objects, in the order they are
defined; the rest of the objects (object list - (LINK_FIRST +
LINK_LAST)), in an undefined order; and LINK_LAST objects, in the order
they are defined.

If you can come up with a concrete link order example that cannot be
handled by a three partition model then I will listen. Otherwise it is
just over engineering.
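
The three-partition model can be sketched as follows. This is an illustration of the description above, not the LINK_FIRST patch itself: the function name is hypothetical, the middle group is sorted by name as stated for the "other" list, and the example placement of md.o in LINK_LAST is made up.

```python
def three_partition_order(objects, link_first=(), link_last=()):
    """Return LINK_FIRST objects, then the rest, then LINK_LAST objects."""
    present = set(objects)
    pinned = set(link_first) | set(link_last)
    # Pinned groups keep the order in which they were declared.
    first = [obj for obj in link_first if obj in present]
    last = [obj for obj in link_last if obj in present]
    # Everything else: object list - (LINK_FIRST + LINK_LAST), by name.
    middle = sorted(obj for obj in objects if obj not in pinned)
    return first + middle + last


order = three_partition_order(
    ["raid5.o", "xor.o", "md.o", "linear.o"],
    link_first=["xor.o"],
    link_last=["md.o"],  # illustrative only
)
```

The middle group is the only place where the order is free to change, which is where a random sort would plug in.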

2000-11-23 03:13:18

by Peter Samuelson

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message


[Albert D. Cahalan]
> The infamous LINK_FIRST infrastructure was sort of half-way done.

I disagree: it could handle all cases I could see that we might
reasonably care about. I challenge anyone to come up with a
non-pathological case that could not be taken care of with a single
LINK_FIRST and/or a single LINK_LAST.

The worst I can think of is something like "all PCI drivers must come
before all ISA drivers" which would require listing all of one set or
the other. But when you see a case like that, it often means "this
directory really needs to be split", because you have two different
classes of things in a single directory.

> It would be best to cause drivers with an unspecified link order
> to move around a bit, so that errors may be discovered more quickly.

That was the plan -- in 2.5. (The 2.4 version did not disturb any
order at all, unless you explicitly put a LINK_FIRST declaration in the
individual makefile.) Now that LINK_FIRST is officially dead, none of
this will probably happen at all.

> LINK_FIRST is pretty coarse. One would want a topological sort, or at
> least LINK_0 through LINK_9 _without_ anything else.

Too complex, no easy upgrade path (read: too different from status
quo), very little benefit over LINK_FIRST + LINK_LAST. For the
topological sort in particular I'm interested in how it's even
possible to do in a non-intrusive and maintainable way.

Peter

2000-11-25 06:11:36

by Albert D. Cahalan

Subject: Re: kernel-2.4.0-test11 crashed again; this time i send you the Oops-message

Keith Owens writes:
> "Albert D. Cahalan" <[email protected]> wrote:

>> The infamous LINK_FIRST infrastructure was sort of half-way done.
>>
>> It would be best to cause drivers with an unspecified link order
>> to move around a bit, so that errors may be discovered more quickly.
>
> The "other" list in LINK_FIRST is sorted by name. It could be changed
> to a random sort, probably based on a hash of size and mtime. It would
> be relatively expensive so would have to be restricted to an "exercise
> the kernel" CONFIG option.

Yes, throwing out the low bits of mtime so that everybody gets the
same link order for a week. (must be able to reproduce failures)
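
A rough sketch of that idea, assuming the caller supplies a (name, size, mtime) triple per object. Coarsening is done here by integer division by one week rather than by literally masking low bits, and the name is used only as a tie-breaker; both choices, and the function names, are assumptions.

```python
import hashlib

WEEK = 7 * 24 * 60 * 60  # seconds


def shuffle_key(name, size, mtime, period=WEEK):
    # Drop the fine-grained part of mtime so the key is stable for a period.
    coarse_mtime = int(mtime) // period
    digest = hashlib.sha256(f"{size}:{coarse_mtime}".encode()).hexdigest()
    return (digest, name)


def exercise_order(objects, period=WEEK):
    """objects: iterable of (name, size, mtime) tuples."""
    ranked = sorted(objects, key=lambda o: shuffle_key(*o, period=period))
    return [name for name, _size, _mtime in ranked]


base = 1000 * WEEK
monday = [("raid5.o", 4096, base + 10), ("xor.o", 2048, base + 20)]
tuesday = [(n, s, t + 86400) for n, s, t in monday]  # same week bucket
```

Two builds whose mtimes fall into the same bucket get the same order, so a failure seen by one person can be reproduced by another.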

>> LINK_FIRST is pretty coarse. One would want a topological sort,
>> or at least LINK_0 through LINK_9 _without_ anything else.
>
> There is no need for multiple LINK_n entries; the objects partition
> neatly into three groups: LINK_FIRST objects, in the order they are
> defined; the rest of the objects (object list - (LINK_FIRST +
> LINK_LAST)), in an undefined order; and LINK_LAST objects, in the order
> they are defined.

Ah, but then Linus has an argument to crush you. There is no
reason left to have anything but LINK_FIRST, and so the rest
is redundant and you can just kill the whole idea.

Going with multiple LINK_n entries and nothing else makes it
possible to make order dynamic within any LINK_n group. This
forces eventual discovery of any problems.

> If you can come up with a concrete link order example that
> cannot be handled by a three partition model then I will
> listen. Otherwise it is just over engineering.

The three-partition model is over-engineering, because it adds
extra complexity without getting rid of the fixed-order group.
Instead you get two fixed-order groups and one dynamic-order one.

If we are to leave the current model of a single fixed-order
group, then we ought to switch to something else uniform.
The three groups you describe just aren't very regular.
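
For comparison, a sketch of the LINK_n alternative: numbered groups are emitted in ascending order, and the order inside each group is deliberately shuffled. The function name, the group assignments and the use of a caller-supplied seed are assumptions made for illustration.

```python
import random


def link_n_order(groups, seed):
    """groups maps a level (0..9) to the object names assigned to it."""
    rng = random.Random(seed)  # seeded, so a given order can be reproduced
    order = []
    for level in sorted(groups):
        members = sorted(groups[level])
        rng.shuffle(members)  # dynamic order within one LINK_n group
        order.extend(members)
    return order


groups = {0: ["xor.o"], 1: ["raid5.o", "raid1.o", "linear.o"]}
order = link_n_order(groups, seed=1)
```

Every group is treated the same way, so there is no fixed-order group left; any ordering requirement has to be expressed by putting objects in different levels.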