LinuxLists.cc - mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

2008-12-02 23:10:08

Subject: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

Hi,

while writing out warning about empty mtrr (or any other in the early stage),
I'm getting this:
PANIC: early exception 0e rip 10:ffffffff8025be26 error 0 cr2 20
Pid: 0, comm: swapper Not tainted 2.6.28-rc6-mm1_64 #484
Call Trace:
[<ffffffff80770195>] early_idt_handler+0x55/0x69
[<ffffffff8025be26>] ? getnstimeofday+0x46/0xc0
[<ffffffff80258a81>] ktime_get_real+0x11/0x50
[<ffffffff803aa8d1>] seed_std_data+0x11/0x30
[<ffffffff8020ff60>] ? show_trace+0x10/0x20
[<ffffffff803aa900>] seed_random_pools+0x10/0x30
[<ffffffff8023dda9>] init_oops_id+0x9/0x40
[<ffffffff8023dde9>] print_oops_end_marker+0x9/0x20
[<ffffffff8023dfed>] warn_slowpath+0x9d/0xd0
[<ffffffff802595e4>] ? up+0x34/0x50
[<ffffffff8023e77d>] ? release_console_sem+0x1bd/0x210
[<ffffffff80592c38>] ? printk+0x3c/0x44
[<ffffffff80776e67>] mtrr_trim_uncached_memory+0x166/0x390
[<ffffffff8077fca0>] ? early_gart_iommu_check+0xaf/0x2a7
[<ffffffff80772574>] setup_arch+0x3ec/0x6a8
[<ffffffff80770adf>] start_kernel+0x65/0x3c6
[<ffffffff8077027d>] x86_64_start_reservations+0x7d/0x89
[<ffffffff80770384>] x86_64_start_kernel+0xd8/0xdf
RIP 0x10

The clock is still null in kernel/time/timekeeping.c.

This one is to blame:
random-add-a-way-to-get-some-random-bits-into-the-entropy-pools-early-on.patch
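
A cr2 of 20 is what one would expect if the code loaded a member at offset 0x20 through a NULL pointer. The userspace sketch below illustrates that arithmetic. It assumes a layout loosely modelled on the leading members of a clocksource-like struct (name, list, rating, read). The names fake_clocksource, fake_list_head and read_member_address are hypothetical, the real kernel definition may differ, and the 0x20 figure holds only on LP64 targets such as x86_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel list head: two pointers. */
struct fake_list_head {
	struct fake_list_head *next, *prev;
};

/* Hypothetical stand-in for the start of a clocksource-like struct.
 * Offsets in the comments assume LP64 (8-byte pointers). */
struct fake_clocksource {
	char *name;                  /* 0x00 */
	struct fake_list_head list;  /* 0x08 */
	int rating;                  /* 0x18, followed by 4 bytes of padding */
	uint64_t (*read)(void);      /* 0x20 */
};

/* The read hook sits four pointer-widths into the struct. */
static_assert(offsetof(struct fake_clocksource, read) == 4 * sizeof(void *),
	      "unexpected layout for fake_clocksource");

/* Address the CPU would touch when loading ->read through `base`. */
static uintptr_t read_member_address(uintptr_t base)
{
	return base + offsetof(struct fake_clocksource, read);
}
```

With base set to 0 (a NULL clock pointer), read_member_address() yields 0x20 on a 64-bit build, matching the faulting address in the trace.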

2008-12-02 23:34:31

by Andrew Morton

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On Wed, 03 Dec 2008 00:09:46 +0100
Jiri Slaby <[email protected]> wrote:

> Hi,
>
> while writing out warning about empty mtrr (or any other in the early stage),
> I'm getting this:
> PANIC: early exception 0e rip 10:ffffffff8025be26 error 0 cr2 20
> Pid: 0, comm: swapper Not tainted 2.6.28-rc6-mm1_64 #484
> Call Trace:
> [<ffffffff80770195>] early_idt_handler+0x55/0x69
> [<ffffffff8025be26>] ? getnstimeofday+0x46/0xc0
> [<ffffffff80258a81>] ktime_get_real+0x11/0x50
> [<ffffffff803aa8d1>] seed_std_data+0x11/0x30
> [<ffffffff8020ff60>] ? show_trace+0x10/0x20
> [<ffffffff803aa900>] seed_random_pools+0x10/0x30
> [<ffffffff8023dda9>] init_oops_id+0x9/0x40
> [<ffffffff8023dde9>] print_oops_end_marker+0x9/0x20
> [<ffffffff8023dfed>] warn_slowpath+0x9d/0xd0
> [<ffffffff802595e4>] ? up+0x34/0x50
> [<ffffffff8023e77d>] ? release_console_sem+0x1bd/0x210
> [<ffffffff80592c38>] ? printk+0x3c/0x44
> [<ffffffff80776e67>] mtrr_trim_uncached_memory+0x166/0x390
> [<ffffffff8077fca0>] ? early_gart_iommu_check+0xaf/0x2a7
> [<ffffffff80772574>] setup_arch+0x3ec/0x6a8
> [<ffffffff80770adf>] start_kernel+0x65/0x3c6
> [<ffffffff8077027d>] x86_64_start_reservations+0x7d/0x89
> [<ffffffff80770384>] x86_64_start_kernel+0xd8/0xdf
> RIP 0x10
>
> The clock is still null in kernel/time/timekeeping.c.
>
> This one is to blame:
> random-add-a-way-to-get-some-random-bits-into-the-entropy-pools-early-on.patch

urgh, OK, thanks, I'll drop it.

Presumably your machine picked a different clocksource from Arjan's,
mine and others. Now, how do we work out what clocksource you're
using?

dmesg -s 1000000 | grep -i clock
nope
dmesg -s 1000000 | grep -i using
nope

hrm.

2008-12-02 23:43:36

by Matt Mackall

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On Wed, 2008-12-03 at 00:09 +0100, Jiri Slaby wrote:
> Hi,
>
> while writing out warning about empty mtrr (or any other in the early stage),
> I'm getting this:
> PANIC: early exception 0e rip 10:ffffffff8025be26 error 0 cr2 20
> Pid: 0, comm: swapper Not tainted 2.6.28-rc6-mm1_64 #484
> Call Trace:
> [<ffffffff80770195>] early_idt_handler+0x55/0x69
> [<ffffffff8025be26>] ? getnstimeofday+0x46/0xc0
> [<ffffffff80258a81>] ktime_get_real+0x11/0x50
> [<ffffffff803aa8d1>] seed_std_data+0x11/0x30
> [<ffffffff8020ff60>] ? show_trace+0x10/0x20
> [<ffffffff803aa900>] seed_random_pools+0x10/0x30
> [<ffffffff8023dda9>] init_oops_id+0x9/0x40
> [<ffffffff8023dde9>] print_oops_end_marker+0x9/0x20
> [<ffffffff8023dfed>] warn_slowpath+0x9d/0xd0
> [<ffffffff802595e4>] ? up+0x34/0x50
> [<ffffffff8023e77d>] ? release_console_sem+0x1bd/0x210
> [<ffffffff80592c38>] ? printk+0x3c/0x44
> [<ffffffff80776e67>] mtrr_trim_uncached_memory+0x166/0x390
> [<ffffffff8077fca0>] ? early_gart_iommu_check+0xaf/0x2a7
> [<ffffffff80772574>] setup_arch+0x3ec/0x6a8
> [<ffffffff80770adf>] start_kernel+0x65/0x3c6
> [<ffffffff8077027d>] x86_64_start_reservations+0x7d/0x89
> [<ffffffff80770384>] x86_64_start_kernel+0xd8/0xdf
> RIP 0x10
>
> The clock is still null in kernel/time/timekeeping.c.
>
> This one is to blame:
> random-add-a-way-to-get-some-random-bits-into-the-entropy-pools-early-on.patch

Nice. I didn't like that patch anyway.

Seems to me we really need to simplify all our startup dependencies.
Some things should -always- work. Time is one of them, random number
generation may be another, memory allocation ought to be one as well.

So I think:

a) we need to come up with a solution/rule for early ktime faulting
b) the RNG should initialize itself on demand
c) seeding the RNG should include a (void *, len)
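
Points b) and c) can be illustrated with a toy userspace sketch: a pool that sets itself up on first use and a seeding call that takes a (void *, len) buffer. Everything here (toy_pool, toy_pool_init_on_demand, toy_pool_seed, the rotate-and-xor mixing) is hypothetical and is not the kernel's random.c code. The mixing is for illustration only and is not cryptographically sound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_POOL_WORDS 16

/* Must start zeroed (e.g. static storage) so `initialized` is false. */
struct toy_pool {
	uint32_t words[TOY_POOL_WORDS];
	size_t pos;
	bool initialized;
};

/* Set the pool up the first time anybody touches it, so callers never
 * depend on an init call having run earlier in boot. */
static void toy_pool_init_on_demand(struct toy_pool *p)
{
	if (p->initialized)
		return;
	for (size_t i = 0; i < TOY_POOL_WORDS; i++)
		p->words[i] = 0x9e3779b9u * (uint32_t)(i + 1);
	p->pos = 0;
	p->initialized = true;
}

/* Seed with an arbitrary buffer: rotate the current word, xor in one
 * input byte, then advance to the next word. */
static void toy_pool_seed(struct toy_pool *p, const void *buf, size_t len)
{
	const unsigned char *bytes = buf;

	assert(p != NULL);
	assert(buf != NULL || len == 0);
	toy_pool_init_on_demand(p);
	for (size_t i = 0; i < len; i++) {
		uint32_t w = p->words[p->pos];

		w = (w << 7) | (w >> 25);
		p->words[p->pos] = w ^ bytes[i];
		p->pos = (p->pos + 1) % TOY_POOL_WORDS;
	}
}
```

Because toy_pool_seed() performs the lazy initialization itself, a caller in an early warning path would not need to know whether anything had initialized the pool before it.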

--
Mathematics is the supreme nostalgia of our time.

2008-12-02 23:43:51

by Matt Mackall

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On Tue, 2008-12-02 at 15:33 -0800, Andrew Morton wrote:
> On Wed, 03 Dec 2008 00:09:46 +0100
> Jiri Slaby <[email protected]> wrote:
>
> > Hi,
> >
> > while writing out warning about empty mtrr (or any other in the early stage),
> > I'm getting this:
> > PANIC: early exception 0e rip 10:ffffffff8025be26 error 0 cr2 20
> > Pid: 0, comm: swapper Not tainted 2.6.28-rc6-mm1_64 #484
> > Call Trace:
> > [<ffffffff80770195>] early_idt_handler+0x55/0x69
> > [<ffffffff8025be26>] ? getnstimeofday+0x46/0xc0
> > [<ffffffff80258a81>] ktime_get_real+0x11/0x50
> > [<ffffffff803aa8d1>] seed_std_data+0x11/0x30
> > [<ffffffff8020ff60>] ? show_trace+0x10/0x20
> > [<ffffffff803aa900>] seed_random_pools+0x10/0x30
> > [<ffffffff8023dda9>] init_oops_id+0x9/0x40
> > [<ffffffff8023dde9>] print_oops_end_marker+0x9/0x20
> > [<ffffffff8023dfed>] warn_slowpath+0x9d/0xd0
> > [<ffffffff802595e4>] ? up+0x34/0x50
> > [<ffffffff8023e77d>] ? release_console_sem+0x1bd/0x210
> > [<ffffffff80592c38>] ? printk+0x3c/0x44
> > [<ffffffff80776e67>] mtrr_trim_uncached_memory+0x166/0x390
> > [<ffffffff8077fca0>] ? early_gart_iommu_check+0xaf/0x2a7
> > [<ffffffff80772574>] setup_arch+0x3ec/0x6a8
> > [<ffffffff80770adf>] start_kernel+0x65/0x3c6
> > [<ffffffff8077027d>] x86_64_start_reservations+0x7d/0x89
> > [<ffffffff80770384>] x86_64_start_kernel+0xd8/0xdf
> > RIP 0x10
> >
> > The clock is still null in kernel/time/timekeeping.c.
> >
> > This one is to blame:
> > random-add-a-way-to-get-some-random-bits-into-the-entropy-pools-early-on.patch
>
> urgh, OK, thanks, I'll drop it.
>
> Presumably your machine picked a different clocksource from Arjan's,
> mine and others. Now, how do we work out what clocksource you're
> using?
>
> dmesg -s 1000000 | grep -i clock
> nope
> dmesg -s 1000000 | grep -i using
> nope

If we oops or warn while picking a timesource, we'll have lots of fun?

--
Mathematics is the supreme nostalgia of our time.

2008-12-02 23:50:49

by Arjan van de Ven

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

Matt Mackall wrote:
>
> If we oops or warn while picking a timesource, we'll have lots of fun?
>

we really only need to mix in the tsc; ktime_get() is just an arch friendly way to get that
I supposed (wrongly).

but yes we need to do a few things
1) seed on demand with a platform time source
2) have a way where arch init can just hand semi random data during the boot process to
increase the randomness (even if it doesn't count as entropy)
I got pulled in some big project at work so I might not get to it this week, but if nobody beats
me to it I will get to it ;-(

2008-12-03 13:01:58

by Jiri Kosina

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On Tue, 2 Dec 2008, Andrew Morton wrote:

> Presumably your machine picked a different clocksource from Arjan's,
> mine and others. Now, how do we work out what clocksource you're using?
> dmesg -s 1000000 | grep -i clock
> nope
> dmesg -s 1000000 | grep -i using
> nope
> hrm.

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
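
The same lookup can be done from a program. This is a minimal sketch that reads the first line of a file; the helper name read_first_line is hypothetical, and the only thing taken from the thread is the sysfs path above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of `path` into `out` (at most outlen - 1 chars)
 * and strip the trailing newline. Returns 0 on success, -1 on error. */
static int read_first_line(const char *path, char *out, size_t outlen)
{
	FILE *f;

	assert(path != NULL && out != NULL);
	if (outlen == 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)outlen, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

Calling read_first_line() with "/sys/devices/system/clocksource/clocksource0/current_clocksource" and a small buffer should leave the active clocksource name in the buffer on systems that expose that file.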

--
Jiri Kosina
SUSE Labs

2008-12-03 17:07:52

by Matt Mackall

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On Tue, 2008-12-02 at 15:50 -0800, Arjan van de Ven wrote:
> Matt Mackall wrote:
> >
> > If we oops or warn while picking a timesource, we'll have lots of fun?
> >
>
> we really only need to mix in the tsc; ktime_get() is just an arch friendly way to get that
> I supposed (wrongly).
>
> but yes we need to do a few things
> 1) seed on demand with a platform time source

Currently we use jiffies + get_cycles(). That's going to have somewhere
between, oh, 3 bits of entropy (very stable boot with only jiffies) and
25 bits of entropy (TSC with lots of waiting for hardware) at boot.
Ideally, we'd have access to a wall clock of some sort as well. But
that's also a fairly limited source - wall clocks are both low
resolution and predictable/collision-prone.

> 2) have a way where arch init can just hand semi random data during the boot process to
> increase the randomness (even if it doesn't count as entropy)

A simple wrapper around mix_pool_bytes probably fits the ticket.
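
A rough userspace sketch of what such a wrapper might look like follows. It keeps mixing separate from entropy accounting, so boot data perturbs the pool without being credited as entropy. All names (toy_store, toy_mix_bytes, toy_add_boot_data) are hypothetical stand-ins and not the kernel's mix_pool_bytes; the mixing step is borrowed from FNV-1a purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_store {
	uint32_t state;         /* toy pool contents */
	unsigned entropy_bits;  /* bits of entropy credited so far */
};

/* Stand-in for a pool-mixing primitive: folds bytes in, credits nothing. */
static void toy_mix_bytes(struct toy_store *s, const void *in, size_t len)
{
	const unsigned char *p = in;

	for (size_t i = 0; i < len; i++)
		s->state = (s->state ^ p[i]) * 16777619u;  /* FNV-1a step */
}

/* Hypothetical wrapper that arch code could call with semi-random boot
 * data (MAC address, serial number, ...). It changes the pool contents
 * but leaves entropy_bits untouched. */
static void toy_add_boot_data(struct toy_store *s, const void *buf, size_t len)
{
	assert(s != NULL);
	if (buf == NULL || len == 0)
		return;
	toy_mix_bytes(s, buf, len);
}
```

Two machines that differ only in one byte of such data would end up with different pool states, while the credited entropy stays at whatever it was before the call.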

But I don't think this will solve the general problem of 'large numbers
of practically identical machines booting up with the same pre-init
random number pools'. Beyond things like MAC addresses and serial
numbers (predictable/observable but at least not collision-prone), we
have no way to differentiate some boxes. We may need to forcibly
generate some timing entropy. Perhaps something like this:

http://markmail.org/message/xwsbywr6ziil2qu2

(which is way too slow in its current form)

There's a related problem of systems with no way to store a seed across
boots.

--
Mathematics is the supreme nostalgia of our time.

2008-12-03 17:14:31

by Arjan van de Ven

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

Matt Mackall wrote:
> On Tue, 2008-12-02 at 15:50 -0800, Arjan van de Ven wrote:
>> Matt Mackall wrote:
>>> If we oops or warn while picking a timesource, we'll have lots of fun?
>>>
>> we really only need to mix in the tsc; ktime_get() is just an arch friendly way to get that
>> I supposed (wrongly).
>>
>> but yes we need to do a few things
>> 1) seed on demand with a platform time source
>
> Currently we use jiffies + get_cycles(). That's going to have somewhere
> between, oh, 3 bits of entropy (very stable boot with only jiffies) and
> 25 bits of entropy (TSC with lots of waiting for hardware) at boot.
> Ideally, we'd have access to a wall clock of some sort as well. But
> that's also a fairly limited source - wall clocks are both low
> resolution and predictable/collision-prone.

If the tsc is sampled at points quite some time apart during boot, each sample gives us another bunch of entropy.
This comes from variations in the time various hardware operations take (due to spread spectrum clocks)
and from cpu speculation not being deterministic at this multi-millisecond scale.

2008-12-22 12:04:37

by Jiri Slaby

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On 12/03/2008 02:00 PM, Jiri Kosina wrote:
> On Tue, 2 Dec 2008, Andrew Morton wrote:
>
>> Presumably your machine picked a different clocksource from Arjan's,
>> mine and others. Now, how do we work out what clocksource you're using?
>> dmesg -s 1000000 | grep -i clock
>> nope
>> dmesg -s 1000000 | grep -i using
>> nope
>> hrm.
>
> cat /sys/devices/system/clocksource/clocksource0/current_clocksource

tsc

It ran from qemu, but it oopsed even on my desktop (tsc too).

2008-12-22 12:25:45

by Jiri Kosina

Subject: Re: mmotm 2008-12-01-19-41: early exception (page fault -- deref of 0x20)

On Mon, 22 Dec 2008, Jiri Slaby wrote:

> >> Presumably your machine picked a different clocksource from Arjan's,
> >> mine and others. Now, how do we work out what clocksource you're using?
> >> dmesg -s 1000000 | grep -i clock
> >> nope
> >> dmesg -s 1000000 | grep -i using
> >> nope
> >> hrm.
> > cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc
> It ran from qemu, but it oopsed even on my desktop (tsc too).

Could you please send the full oops to Thomas and Ingo?

--
Jiri Kosina
SUSE Labs