2009-03-04 14:12:56

by Patrick McHardy

Subject: Re: lguest: unhandled trap 13 in current -rc

Patrick McHardy wrote:
> When trying to run lguest in the current -rc, I get an "unhandled
> trap 13" and it stops. The address resolves to the rdmsr instruction
> in native_read_msr_safe(). -rc2 works fine, but I couldn't find
> any changes that look related.
>
> .config is attached, more information available on request.

For the record, this is still broken in -rc7.


2009-03-06 06:56:20

by Rusty Russell

Subject: Re: lguest: unhandled trap 13 in current -rc

On Thursday 05 March 2009 00:42:39 Patrick McHardy wrote:
> Patrick McHardy wrote:
> > When trying to run lguest in the current -rc, I get an "unhandled
> > trap 13" and it stops. The address resolves to the rdmsr instruction
> > in native_read_msr_safe(). -rc2 works fine, but I couldn't find
> > any changes that look related.
> >
> > .config is attached, more information available on request.
>
> For the record, this is still broken in -rc7.

(Sorry, I missed the first mail to lkml).

Reproduced on one of my test machines (kvm doesn't show the problem here).

Subject: lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'

Impact: fix lguest boot crash on modern Intel machines

The code in early_init_intel does:

	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
		u64 misc_enable;

		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);

And that rdmsr faults (not allowed from non-0 PL). We can get around
this by mugging the family ID part of the cpuid. 5 seems like a good
number.

Of course, this is a hack (how very lguest!). We could just indicate
that we don't support MSRs, or implement lguest_rdmsr.

Reported-by: Patrick McHardy <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -343,6 +350,11 @@ static void lguest_cpuid(unsigned int *a
 		 * flush_tlb_user() for both user and kernel mappings unless
 		 * the Page Global Enable (PGE) feature bit is set. */
 		*dx |= 0x00002000;
+		/* We also lie, and say we're family id 5.  6 or greater
+		 * leads to a rdmsr in early_init_intel which we can't handle.
+		 * Family ID is returned as bits 8-11 in ax. */
+		*ax &= 0xFFFFF0FF;
+		*ax |= 0x00000500;
 		break;
 	case 0x80000000:
 		/* Futureproof this a little: if they ask how much extended
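
For anyone who wants to check the masking outside the kernel, here is a minimal user-space sketch. It is not part of the patch. The helper names (cpuid_family, cpuid_model, fake_family_5, would_read_misc_enable) are made up for illustration, and the family/model decoding is assumed to follow the usual x86 convention (extended family added when the base family is 0xf, extended model added when the family is 6 or greater). The two mask constants and the family/model condition are taken from the mail above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decode the family from CPUID leaf 1 EAX.
 * Base family is bits 8-11; the extended family (bits 20-27) is
 * added only when the base family is 0xf (assumed convention). */
static inline uint32_t cpuid_family(uint32_t eax)
{
	uint32_t family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Hypothetical helper: decode the model. Base model is bits 4-7;
 * the extended model (bits 16-19) is folded in for family >= 6. */
static inline uint32_t cpuid_model(uint32_t eax)
{
	uint32_t model = (eax >> 4) & 0xf;

	if (cpuid_family(eax) >= 6)
		model += ((eax >> 16) & 0xf) << 4;
	return model;
}

/* The same two operations the patch applies to *ax: clear the
 * base family field, then set it to 5. */
static inline uint32_t fake_family_5(uint32_t eax)
{
	eax &= 0xFFFFF0FFu;
	eax |= 0x00000500u;
	assert(((eax >> 8) & 0xf) == 5);
	return eax;
}

/* Mirrors the condition quoted from early_init_intel: true means
 * the MSR_IA32_MISC_ENABLE read would be attempted. */
static inline bool would_read_misc_enable(uint32_t eax)
{
	uint32_t family = cpuid_family(eax);
	uint32_t model = cpuid_model(eax);

	return family > 6 || (family == 6 && model >= 0xd);
}
```

With a sample EAX of 0x000106A5 (family 6, model 0x1a under the decoding above), would_read_misc_enable() is true before the masking and false after fake_family_5(), since family 5 fails both halves of the condition. Note that the mask leaves the extended family bits alone; they no longer matter once the base family is not 0xf.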

2009-03-06 11:17:35

by Patrick McHardy

Subject: Re: lguest: unhandled trap 13 in current -rc

Rusty Russell wrote:
> On Thursday 05 March 2009 00:42:39 Patrick McHardy wrote:
>> Patrick McHardy wrote:
>>> When trying to run lguest in the current -rc, I get an "unhandled
>>> trap 13" and it stops. The address resolves to the rdmsr instruction
>>> in native_read_msr_safe(). -rc2 works fine, but I couldn't find
>>> any changes that look related.
>>>
>>> .config is attached, more information available on request.
>> For the record, this is still broken in -rc7.
>
> (Sorry, I missed the first mail to lkml).
>
> Reproduced on one of my test machines (kvm doesn't show the problem here).
>
> Subject: lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'


Thanks Rusty, this fixed the problem for me.

There's a second problem: outgoing network performance over a routed
tap interface is in the range of 20-30 kbit/second. I noticed this
previously (which is why I wanted to test the latest version). In
-rc2, turning TSO/GSO etc. off still worked around it; in -rc7 this
seems to have no effect. I'll have a look at this myself tomorrow
unless someone beats me to it :)