>>> On 30.05.12 at 15:10, Andre Przywara <[email protected]> wrote:
> Because we are behind a family check before tweaking the topology
> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> register.
> This fixes a crash when using the kernel as a Xen Dom0 on affected
> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> yet (this will be fixed in another patch).
I'm not following: If the AMD variants (putting a special value into
%edi) can be freely replaced by the non-AMD variants, why did
the AMD special ones get used in the first place?
Further, I can't see how checking_wrmsrl() is being paravirtualized
any better than wrmsrl_amd_safe() - both have nothing but an
exception handling fixup attached to the wrmsr invocation. Care
to point out what actual crash it is that was seen?
Finally, I would question whether re-enabling the topology
extensions under Xen shouldn't be skipped altogether, perhaps
even on Dom0 (as the hypervisor is controlling this MSR, but in
any case on DomU - the hypervisor won't allow (read: ignore,
not fault on) the write anyway (and will log a message for each
(v)CPU that attempts this).
Jan
> Signed-off-by: Andre Przywara <[email protected]>
> Cc: [email protected] # 3.4+
> ---
> arch/x86/kernel/cpu/amd.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 146bb62..80ccd99 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
> !cpu_has(c, X86_FEATURE_TOPOEXT)) {
> u64 val;
>
> - if (!rdmsrl_amd_safe(0xc0011005, &val)) {
> + if (!rdmsrl_safe(0xc0011005, &val)) {
> val |= 1ULL << 54;
> - wrmsrl_amd_safe(0xc0011005, val);
> + checking_wrmsrl(0xc0011005, val);
> rdmsrl(0xc0011005, val);
> if (val & (1ULL << 54)) {
> set_cpu_cap(c, X86_FEATURE_TOPOEXT);
>>> On 30.05.12 at 16:02, Andre Przywara <[email protected]> wrote:
> On 05/30/2012 03:33 PM, Jan Beulich wrote:
>> Further, I can't see how checking_wrmsrl() is being paravirtualized
>> any better than wrmsrl_amd_safe() - both have nothing but an
>> exception handling fixup attached to the wrmsr invocation. Care
>> to point out what actual crash it is that was seen?
>
> AFAIK, the difference is between the "l" and the regs version for
> rd/wrmsr. We have a patch already here to fix this. Will send it out
> soon. Jacob, can you comment on this?
I see - the Xen code blindly overwrites pv_cpu_ops, despite not
having initialized all members. That's an obvious oversight of the
patch that introduced the _regs variants.
Plus having secondary instances of things like rdmsrl_amd_safe()
in asm/paravirt.h seems pretty strange an approach (which was
why initially I didn't spot how a crash could happen there) - only
the lowest level functions should get re-implemented here.
>> Finally, I would question whether re-enabling the topology
>> extensions under Xen shouldn't be skipped altogether, perhaps
>> even on Dom0 (as the hypervisor is controlling this MSR, but in
>> any case on DomU - the hypervisor won't allow (read: ignore,
>> not fault on) the write anyway (and will log a message for each
>> (v)CPU that attempts this).
>
> This is probably right. Let me think about this.
I'll submit a respective hypervisor side patch soonish.
Jan
On 05/30/2012 03:33 PM, Jan Beulich wrote:
>>>> On 30.05.12 at 15:10, Andre Przywara<[email protected]> wrote:
>> Because we are behind a family check before tweaking the topology
>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>> register.
>> This fixes a crash when using the kernel as a Xen Dom0 on affected
>> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
>> yet (this will be fixed in another patch).
>
> I'm not following: If the AMD variants (putting a special value into
> %edi) can be freely replaced by the non-AMD variants, why did
> the AMD special ones get used in the first place?
Older CPUs (K8) needed the AMD variants, starting with family 10h we can
use the normal versions.
> Further, I can't see how checking_wrmsrl() is being paravirtualized
> any better than wrmsrl_amd_safe() - both have nothing but an
> exception handling fixup attached to the wrmsr invocation. Care
> to point out what actual crash it is that was seen?
AFAIK, the difference is between the "l" and the regs version for
rd/wrmsr. We have a patch already here to fix this. Will send it out
soon. Jacob, can you comment on this?
The crash dump:
[ 1.601021] ------------[ cut here ]------------
[ 1.601025] kernel BUG at
/buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
[ 1.601031] invalid opcode: 0000 [#1] SMP
[ 1.601038] CPU 0
[ 1.601041] Modules linked in:
[ 1.601047]
[ 1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD Virgo
Platform/Annapurna
[ 1.601061] RIP: e030:[<ffffffff8169b4fe>] [<ffffffff8169b4fe>]
init_amd+0x21d/0x603
[ 1.601072] RSP: e02b:ffffffff81c01df8 EFLAGS: 00010246
[ 1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX:
0000000000000000
[ 1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI:
ffffffff81c01e78
[ 1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09:
00000000ffffffff
[ 1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12:
ffffffff81ca7574
[ 1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15:
ffffffff81ca756c
[ 1.601103] FS: 0000000000000000(0000) GS:ffff8801ce600000(0000)
knlGS:0000000000000000
[ 1.601108] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4:
0000000000040660
[ 1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 1.601127] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000,
task ffffffff81c14020)
[ 1.601131] Stack:
[ 1.601134] ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017
0000000000000000
[ 1.601146] ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48
ffffffff81140118
[ 1.601157] ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480
00000000000000d0
[ 1.601169] Call Trace:
[ 1.601175] [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
[ 1.601183] [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
[ 1.601189] [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
[ 1.601195] [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
[ 1.601201] [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
[ 1.601209] [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
[ 1.601216] [<ffffffff81cd2052>] check_bugs+0x9/0x2d
[ 1.601222] [<ffffffff81cc7e08>] start_kernel+0x445/0x461
[ 1.601227] [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
[ 1.601233] [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
[ 1.601240] [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
[ 1.601247] [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
[ 1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00 48
8d 7d b0 b9 08 00 00 00 f3
ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c 8d
75 b0 4c 89 f7 ff 14 25 b0 11
c2 81 85 c0 8b
[ 1.601374] RIP [<ffffffff8169b4fe>] init_amd+0x21d/0x603
[ 1.601381] RSP <ffffffff81c01df8>
[ 1.601391] ---[ end trace a7919e7f17c0a725 ]---
[ 1.601397] Kernel panic - not syncing: Attempted to kill the idle task!
> Finally, I would question whether re-enabling the topology
> extensions under Xen shouldn't be skipped altogether, perhaps
> even on Dom0 (as the hypervisor is controlling this MSR, but in
> any case on DomU - the hypervisor won't allow (read: ignore,
> not fault on) the write anyway (and will log a message for each
> (v)CPU that attempts this).
This is probably right. Let me think about this.
Thanks for picking this up.
Regards,
Andre.
>
>> Signed-off-by: Andre Przywara<[email protected]>
>> Cc: [email protected] # 3.4+
>> ---
>> arch/x86/kernel/cpu/amd.c | 4 ++--
>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 146bb62..80ccd99 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>> !cpu_has(c, X86_FEATURE_TOPOEXT)) {
>> u64 val;
>>
>> - if (!rdmsrl_amd_safe(0xc0011005,&val)) {
>> + if (!rdmsrl_safe(0xc0011005,&val)) {
>> val |= 1ULL<< 54;
>> - wrmsrl_amd_safe(0xc0011005, val);
>> + checking_wrmsrl(0xc0011005, val);
>> rdmsrl(0xc0011005, val);
>> if (val& (1ULL<< 54)) {
>> set_cpu_cap(c, X86_FEATURE_TOPOEXT);
>
>
>
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
On 05/30/2012 07:23 AM, Jan Beulich wrote:
>
> I see - the Xen code blindly overwrites pv_cpu_ops, despite not
> having initialized all members. That's an obvious oversight of the
> patch that introduced the _regs variants.
>
> Plus having secondary instances of things like rdmsrl_amd_safe()
> in asm/paravirt.h seems pretty strange an approach (which was
> why initially I didn't spot how a crash could happen there) - only
> the lowest level functions should get re-implemented here.
>
This kinds of things are part of why Xen makes me want to cry regularly.
-hpa
On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> On 05/30/2012 03:33 PM, Jan Beulich wrote:
> >>>>On 30.05.12 at 15:10, Andre Przywara<[email protected]> wrote:
> >>Because we are behind a family check before tweaking the topology
> >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> >>register.
> >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> >>yet (this will be fixed in another patch).
> >
> >I'm not following: If the AMD variants (putting a special value into
> >%edi) can be freely replaced by the non-AMD variants, why did
> >the AMD special ones get used in the first place?
>
> Older CPUs (K8) needed the AMD variants, starting with family 10h we
> can use the normal versions.
>
> >Further, I can't see how checking_wrmsrl() is being paravirtualized
> >any better than wrmsrl_amd_safe() - both have nothing but an
> >exception handling fixup attached to the wrmsr invocation. Care
> >to point out what actual crash it is that was seen?
>
> AFAIK, the difference is between the "l" and the regs version for
> rd/wrmsr. We have a patch already here to fix this. Will send it out
> soon. Jacob, can you comment on this?
Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
by enlighten.
>
> The crash dump:
>
> [ 1.601021] ------------[ cut here ]------------
> [ 1.601025] kernel BUG at
> /buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
> [ 1.601031] invalid opcode: 0000 [#1] SMP
> [ 1.601038] CPU 0
> [ 1.601041] Modules linked in:
> [ 1.601047]
> [ 1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD
> Virgo Platform/Annapurna
> [ 1.601061] RIP: e030:[<ffffffff8169b4fe>] [<ffffffff8169b4fe>]
> init_amd+0x21d/0x603
> [ 1.601072] RSP: e02b:ffffffff81c01df8 EFLAGS: 00010246
> [ 1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX:
> 0000000000000000
> [ 1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI:
> ffffffff81c01e78
> [ 1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09:
> 00000000ffffffff
> [ 1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12:
> ffffffff81ca7574
> [ 1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15:
> ffffffff81ca756c
> [ 1.601103] FS: 0000000000000000(0000) GS:ffff8801ce600000(0000)
> knlGS:0000000000000000
> [ 1.601108] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4:
> 0000000000040660
> [ 1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [ 1.601127] Process swapper/0 (pid: 0, threadinfo
> ffffffff81c00000, task ffffffff81c14020)
> [ 1.601131] Stack:
> [ 1.601134] ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017
> 0000000000000000
> [ 1.601146] ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48
> ffffffff81140118
> [ 1.601157] ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480
> 00000000000000d0
> [ 1.601169] Call Trace:
> [ 1.601175] [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
> [ 1.601183] [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
> [ 1.601189] [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
> [ 1.601195] [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
> [ 1.601201] [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
> [ 1.601209] [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
> [ 1.601216] [<ffffffff81cd2052>] check_bugs+0x9/0x2d
> [ 1.601222] [<ffffffff81cc7e08>] start_kernel+0x445/0x461
> [ 1.601227] [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
> [ 1.601233] [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
> [ 1.601240] [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
> [ 1.601247] [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
> [ 1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00
> 48 8d 7d b0 b9 08 00 00 00 f3
> ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c
> 8d 75 b0 4c 89 f7 ff 14 25 b0 11
> c2 81 85 c0 8b
> [ 1.601374] RIP [<ffffffff8169b4fe>] init_amd+0x21d/0x603
> [ 1.601381] RSP <ffffffff81c01df8>
> [ 1.601391] ---[ end trace a7919e7f17c0a725 ]---
> [ 1.601397] Kernel panic - not syncing: Attempted to kill the idle task!
>
> >Finally, I would question whether re-enabling the topology
> >extensions under Xen shouldn't be skipped altogether, perhaps
> >even on Dom0 (as the hypervisor is controlling this MSR, but in
> >any case on DomU - the hypervisor won't allow (read: ignore,
> >not fault on) the write anyway (and will log a message for each
> >(v)CPU that attempts this).
>
> This is probably right. Let me think about this.
>
> Thanks for picking this up.
>
> Regards,
> Andre.
>
> >
> >>Signed-off-by: Andre Przywara<[email protected]>
> >>Cc: [email protected] # 3.4+
> >>---
> >> arch/x86/kernel/cpu/amd.c | 4 ++--
> >> 1 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> >>index 146bb62..80ccd99 100644
> >>--- a/arch/x86/kernel/cpu/amd.c
> >>+++ b/arch/x86/kernel/cpu/amd.c
> >>@@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
> >> !cpu_has(c, X86_FEATURE_TOPOEXT)) {
> >> u64 val;
> >>
> >>- if (!rdmsrl_amd_safe(0xc0011005,&val)) {
> >>+ if (!rdmsrl_safe(0xc0011005,&val)) {
> >> val |= 1ULL<< 54;
> >>- wrmsrl_amd_safe(0xc0011005, val);
> >>+ checking_wrmsrl(0xc0011005, val);
> >> rdmsrl(0xc0011005, val);
> >> if (val& (1ULL<< 54)) {
> >> set_cpu_cap(c, X86_FEATURE_TOPOEXT);
> >
> >
> >
>
>
> --
> Andre Przywara
> AMD-Operating System Research Center (OSRC), Dresden, Germany
On Wed, May 30, 2012 at 07:42:56AM -0700, H. Peter Anvin wrote:
> On 05/30/2012 07:23 AM, Jan Beulich wrote:
> >
> > I see - the Xen code blindly overwrites pv_cpu_ops, despite not
> > having initialized all members. That's an obvious oversight of the
> > patch that introduced the _regs variants.
> >
> > Plus having secondary instances of things like rdmsrl_amd_safe()
> > in asm/paravirt.h seems pretty strange an approach (which was
> > why initially I didn't spot how a crash could happen there) - only
> > the lowest level functions should get re-implemented here.
> >
>
> This kinds of things are part of why Xen makes me want to cry regularly.
It looks like an oversight by Borislav (177fed1ee8d727c39601ce9fc2299b4cb25a718e
and 132ec92f) and Yinghai (b05f78f5c713eda2c34e495d92495ee4f1c3b5e1) where
they added these wrappers way back in 2009!
On Wed, May 30, 2012 at 09:48:51AM -0500, Jacob Shin wrote:
> On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> > On 05/30/2012 03:33 PM, Jan Beulich wrote:
> > >>>>On 30.05.12 at 15:10, Andre Przywara<[email protected]> wrote:
> > >>Because we are behind a family check before tweaking the topology
> > >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> > >>register.
> > >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> > >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> > >>yet (this will be fixed in another patch).
> > >
> > >I'm not following: If the AMD variants (putting a special value into
> > >%edi) can be freely replaced by the non-AMD variants, why did
> > >the AMD special ones get used in the first place?
> >
> > Older CPUs (K8) needed the AMD variants, starting with family 10h we
> > can use the normal versions.
> >
> > >Further, I can't see how checking_wrmsrl() is being paravirtualized
> > >any better than wrmsrl_amd_safe() - both have nothing but an
> > >exception handling fixup attached to the wrmsr invocation. Care
> > >to point out what actual crash it is that was seen?
> >
> > AFAIK, the difference is between the "l" and the regs version for
> > rd/wrmsr. We have a patch already here to fix this. Will send it out
> > soon. Jacob, can you comment on this?
>
> Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
> but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
> by enlighten.
So would a patch to implements the rdmsr_regs fix this crash?
On Wed, May 30, 2012 at 10:50:05AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 30, 2012 at 09:48:51AM -0500, Jacob Shin wrote:
> > On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> > > On 05/30/2012 03:33 PM, Jan Beulich wrote:
> > > >>>>On 30.05.12 at 15:10, Andre Przywara<[email protected]> wrote:
> > > >>Because we are behind a family check before tweaking the topology
> > > >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> > > >>register.
> > > >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> > > >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> > > >>yet (this will be fixed in another patch).
> > > >
> > > >I'm not following: If the AMD variants (putting a special value into
> > > >%edi) can be freely replaced by the non-AMD variants, why did
> > > >the AMD special ones get used in the first place?
> > >
> > > Older CPUs (K8) needed the AMD variants, starting with family 10h we
> > > can use the normal versions.
> > >
> > > >Further, I can't see how checking_wrmsrl() is being paravirtualized
> > > >any better than wrmsrl_amd_safe() - both have nothing but an
> > > >exception handling fixup attached to the wrmsr invocation. Care
> > > >to point out what actual crash it is that was seen?
> > >
> > > AFAIK, the difference is between the "l" and the regs version for
> > > rd/wrmsr. We have a patch already here to fix this. Will send it out
> > > soon. Jacob, can you comment on this?
> >
> > Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
> > but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
> > by enlighten.
>
> So would a patch to implements the rdmsr_regs fix this crash?
Yes, the following patch also fixed the crash for us:
Implement rdmsr_regs and wrmsr_regs for Xen pvops.
Signed-off-by: Jacob Shin <[email protected]>
Cc: [email protected] # 3.4+
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..f3ae5ec 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -945,9 +945,12 @@ static void xen_write_cr4(unsigned long cr4)
native_write_cr4(cr4);
}
-static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
+static int xen_wrmsr_safe_regs(u32 regs[8])
{
int ret;
+ unsigned int msr = regs[1];
+ unsigned low = regs[0];
+ unsigned high = regs[2];
ret = 0;
@@ -985,12 +988,23 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
break;
default:
- ret = native_write_msr_safe(msr, low, high);
+ ret = native_wrmsr_safe_regs(regs);
}
return ret;
}
+static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
+{
+ u32 regs[8] = { 0 };
+
+ regs[0] = low;
+ regs[1] = msr;
+ regs[2] = high;
+
+ return xen_wrmsr_safe_regs(regs);
+}
+
void xen_setup_shared_info(void)
{
if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -1116,7 +1130,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
.wbinvd = native_wbinvd,
.read_msr = native_read_msr_safe,
+ .rdmsr_regs = native_rdmsr_safe_regs,
.write_msr = xen_write_msr_safe,
+ .wrmsr_regs = xen_wrmsr_safe_regs,
.read_tsc = native_read_tsc,
.read_pmc = native_read_pmc,
On Wed, May 30, 2012 at 10:49:29AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 30, 2012 at 07:42:56AM -0700, H. Peter Anvin wrote:
> > On 05/30/2012 07:23 AM, Jan Beulich wrote:
> > >
> > > I see - the Xen code blindly overwrites pv_cpu_ops, despite not
> > > having initialized all members. That's an obvious oversight of the
> > > patch that introduced the _regs variants.
> > >
> > > Plus having secondary instances of things like rdmsrl_amd_safe()
> > > in asm/paravirt.h seems pretty strange an approach (which was
> > > why initially I didn't spot how a crash could happen there) - only
> > > the lowest level functions should get re-implemented here.
> > >
> >
> > This kinds of things are part of why Xen makes me want to cry regularly.
>
> It looks like an oversight by Borislav (177fed1ee8d727c39601ce9fc2299b4cb25a718e
> and 132ec92f) and Yinghai (b05f78f5c713eda2c34e495d92495ee4f1c3b5e1) where
> they added these wrappers way back in 2009!
This is exactly why xen has nothing to do in arch/x86/. It is not an
oversight - I simply didn't test it on xen because I don't care about
it.
Remember our last discussion about mcelog?
This current case should be a perfect example for why xen shouldn't be
sprinkling code all over the place.
--
Regards/Gruss,
Boris.
>>> On 30.05.12 at 17:12, Borislav Petkov <[email protected]> wrote:
> This current case should be a perfect example for why xen shouldn't be
> sprinkling code all over the place.
Which means you're denying the benefits of para-virtualization
(at the base system level; perhaps it's less of a problem for you
when it comes to pv device drivers, which are generally
standalone entities) as that's what distinguishes Xen from all
other virtualization solutions Linux supports.
Jan
On 05/30/2012 08:40 AM, Jan Beulich wrote:
>>>> On 30.05.12 at 17:12, Borislav Petkov <[email protected]> wrote:
>> This current case should be a perfect example for why xen shouldn't be
>> sprinkling code all over the place.
>
> Which means you're denying the benefits of para-virtualization
> (at the base system level; perhaps it's less of a problem for you
> when it comes to pv device drivers, which are generally
> standalone entities) as that's what distinguishes Xen from all
> other virtualization solutions Linux supports.
>
And it also denies the just atrocious cost of Xen grabbing random kernel
internals and turning them into ABIs that we have to work around from
that point on. This is particularly obnoxious because the cost is
largely not borne by the Xen community but by the general Linux development.
-hpa
On Wed, May 30, 2012 at 04:40:37PM +0100, Jan Beulich wrote:
> >>> On 30.05.12 at 17:12, Borislav Petkov <[email protected]> wrote:
> > This current case should be a perfect example for why xen shouldn't be
> > sprinkling code all over the place.
>
> Which means you're denying the benefits of para-virtualization
Does it really mean that?
> (at the base system level; perhaps it's less of a problem for
> you when it comes to pv device drivers, which are generally
> standalone entities) as that's what distinguishes Xen from all other
> virtualization solutions Linux supports.
All I'm saying is, xen should solve the whole deal of what it wants to
do (whatever that is) _without_ and _agnostic_ from arch/x86/. Otherwise
x86 changes break xen.
I couldn't care less about what distinguishes xen from all other
solutions.
--
Regards/Gruss,
Boris.
> Yes, the following patch also fixed the crash for us:
>
> Implement rdmsr_regs and wrmsr_regs for Xen pvops.
That needs more data. Such as the reason for it, the crash
tombstone, and an analysis of the bug.
But at this point I am not sure what we are going to do.
I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
function out altogether (so altering the amd_rdmsr... to use the
.rdmsr/.wrdmsr). At which point I think this would all work
just fine?
I am tempted to write a patch that checks all the pv-cpu-ops
to see if there are any that are NULL and throw a warning so
that this does not hit us in the future - to be at least more
proactive about this sort of thing.
>
> Signed-off-by: Jacob Shin <[email protected]>
> Cc: [email protected] # 3.4+
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..f3ae5ec 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -945,9 +945,12 @@ static void xen_write_cr4(unsigned long cr4)
> native_write_cr4(cr4);
> }
>
> -static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
> +static int xen_wrmsr_safe_regs(u32 regs[8])
> {
> int ret;
> + unsigned int msr = regs[1];
> + unsigned low = regs[0];
> + unsigned high = regs[2];
>
> ret = 0;
>
> @@ -985,12 +988,23 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
> break;
>
> default:
> - ret = native_write_msr_safe(msr, low, high);
> + ret = native_wrmsr_safe_regs(regs);
> }
>
> return ret;
> }
>
> +static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
> +{
> + u32 regs[8] = { 0 };
> +
> + regs[0] = low;
> + regs[1] = msr;
> + regs[2] = high;
> +
> + return xen_wrmsr_safe_regs(regs);
> +}
> +
> void xen_setup_shared_info(void)
> {
> if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> @@ -1116,7 +1130,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> .wbinvd = native_wbinvd,
>
> .read_msr = native_read_msr_safe,
> + .rdmsr_regs = native_rdmsr_safe_regs,
> .write_msr = xen_write_msr_safe,
> + .wrmsr_regs = xen_wrmsr_safe_regs,
> .read_tsc = native_read_tsc,
> .read_pmc = native_read_pmc,
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> http://lists.xen.org/xen-devel
On Wed, May 30, 2012 at 01:17:54PM -0400, Konrad Rzeszutek Wilk wrote:
> > Yes, the following patch also fixed the crash for us:
> >
> > Implement rdmsr_regs and wrmsr_regs for Xen pvops.
>
> That needs more data. Such as the reason for it, the crash
> tombstone, and an analysis of the bug.
>
> But at this point I am not sure what we are going to do.
>
> I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
> function out altogether (so altering the amd_rdmsr... to use the
> .rdmsr/.wrdmsr). At which point I think this would all work
> just fine?
I wouldn't do that. Andre's patch switches to the
rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
_regs functions for pvops.
I'll send it now with corrected commit message.
> I am tempted to write a patch that checks all the pv-cpu-ops to see if
> there are any that are NULL and throw a warning so that this does not
> hit us in the future - to be at least more proactive about this sort
> of thing.
This could be a prudent thing to do.
--
Regards/Gruss,
Boris.
On 05/30/2012 10:17 AM, Konrad Rzeszutek Wilk wrote:
>> Yes, the following patch also fixed the crash for us:
>>
>> Implement rdmsr_regs and wrmsr_regs for Xen pvops.
>
> That needs more data. Such as the reason for it, the crash
> tombstone, and an analysis of the bug.
>
> But at this point I am not sure what we are going to do.
>
> I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
> function out altogether (so altering the amd_rdmsr... to use the
> .rdmsr/.wrdmsr). At which point I think this would all work
> just fine?
>
No, you can't just do that. Rip them out as in trap and emulate.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On 05/30/2012 10:32 AM, Borislav Petkov wrote:
>
> I wouldn't do that. Andre's patch switches to the
> rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
> _regs functions for pvops.
>
But we just stated - it is wrong for K8?
-hpa
On Wed, May 30, 2012 at 10:47:32AM -0700, H. Peter Anvin wrote:
> > I wouldn't do that. Andre's patch switches to the
> > rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
> > _regs functions for pvops.
> >
> But we just stated - it is wrong for K8?
Not executed on K8 - only on F15h, models 0x10 - 0x1f.
--
Regards/Gruss,
Boris.
On 05/30/2012 10:51 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 10:47:32AM -0700, H. Peter Anvin wrote:
>>> I wouldn't do that. Andre's patch switches to the
>>> rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
>>> _regs functions for pvops.
>>>
>> But we just stated - it is wrong for K8?
>
> Not executed on K8 - only on F15h, models 0x10 - 0x1f.
>
OK. But there is still the general problem, no?
-hpa
On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> OK. But there is still the general problem, no?
With this patch xen crashes go away because they paravirt
native_{read,write}_msr_safe.
The other place where we use the amd_safe variants is an obscure K8,
revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
so I'm assuming people haven't run xen on such boxes yet. Does it need
fixing? Probably, if we really really have to.
Now, someone probably needs to paravirt the *safe_regs variants in case
something else decides to use them. I don't know what to do here, do I
want more paravirt code in there? No. I guess if this is done carefully
and cleanly, then it should be ok but it can't be done like that - it
needs to adhere to the current pv_cpu_ops thing which is already there.
Hmm...
--
Regards/Gruss,
Boris.
On Wed, May 30, 2012 at 08:17:23PM +0200, Borislav Petkov wrote:
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.
I'm being told we're safe here. Those boxes didn't have virtualization
support yet (SVM) so this one doesn't need fixing.
--
Regards/Gruss,
Boris.
On 05/30/2012 11:17 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
>> OK. But there is still the general problem, no?
>
> With this patch xen crashes go away because they paravirt
> native_{read,write}_msr_safe.
>
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.
>
> Now, someone probably needs to paravirt the *safe_regs variants in case
> something else decides to use them. I don't know what to do here, do I
> want more paravirt code in there? No. I guess if this is done carefully
> and cleanly, then it should be ok but it can't be done like that - it
> needs to adhere to the current pv_cpu_ops thing which is already there.
>
I thought I was being told that Xen would trap and emulate the
rdmsr/wrmsr instructions. I guess they don't want to do that for the
handful of performance-sensitive MSRs there are, but those don't use the
*_regs variants.
-hpa
On 05/30/2012 11:19 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 08:17:23PM +0200, Borislav Petkov wrote:
>> The other place where we use the amd_safe variants is an obscure K8,
>> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
>> so I'm assuming people haven't run xen on such boxes yet. Does it need
>> fixing? Probably, if we really really have to.
>
> I'm being told we're safe here. Those boxes didn't have virtualization
> support yet (SVM) so this one doesn't need fixing.
>
This is for PV, not for HVM (VT-x/SVM). HVM doesn't have this kind of
drain bamage.
-hpa
On Wed, May 30, 2012 at 11:21:00AM -0700, H. Peter Anvin wrote:
> This is for PV, not for HVM (VT-x/SVM). HVM doesn't have this kind of
> drain bamage.
Frankly, I wouldn't want to fix this for K8 only - they're going away
anyway. And besides, this is a BIOS fix for a very small subset of very
early K8s. It probably shouldnt've been there in the first place but
at the time I did it I wanted to fix the whole world (I'm much more
reasonable now :-)).
IOW, I'd like to leave sleeping dogs lie if you don't mind.
--
Regards/Gruss,
Boris.
On Wed, May 30, 2012 at 10:31:41AM -0700, H. Peter Anvin wrote:
> On 05/30/2012 10:17 AM, Konrad Rzeszutek Wilk wrote:
> >> Yes, the following patch also fixed the crash for us:
> >>
> >> Implement rdmsr_regs and wrmsr_regs for Xen pvops.
> >
> > That needs more data. Such as the reason for it, the crash
> > tombstone, and an analysis of the bug.
> >
> > But at this point I am not sure what we are going to do.
> >
> > I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
> > function out altogether (so altering the amd_rdmsr... to use the
> > .rdmsr/.wrdmsr). At which point I think this would all work
> > just fine?
> >
>
> No, you can't just do that. Rip them out as in trap and emulate.
Then this should do it:
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..e74df95 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
.wbinvd = native_wbinvd,
.read_msr = native_read_msr_safe,
+ .rdmsr_regs = native_rdmsr_safe_regs,
.write_msr = xen_write_msr_safe,
+ .wrmsr_regs = native_wrmsr_safe_regs,
+
.read_tsc = native_read_tsc,
.read_pmc = native_read_pmc,
> > Now, someone probably needs to paravirt the *safe_regs variants in case
> > something else decides to use them. I don't know what to do here, do I
> > want more paravirt code in there? No. I guess if this is done carefully
> > and cleanly, then it should be ok but it can't be done like that - it
> > needs to adhere to the current pv_cpu_ops thing which is already there.
Using the native variant seems the right thing to do.
> >
>
> I thought I was being told that Xen would trap and emulate the
> rdmsr/wrmsr instructions. I guess they don't want to do that for the
It does.
> handful of performance-sensitive MSRs there are, but those don't use the
> *_regs variants.
The underlaying issue (as I understand) was that .rdmsr_regs
(and the corresponding write) was NULL and that caused the crash.
This tiny patch should do it:
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..e74df95 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
.wbinvd = native_wbinvd,
.read_msr = native_read_msr_safe,
+ .rdmsr_regs = native_rdmsr_safe_regs,
.write_msr = xen_write_msr_safe,
+ .wrmsr_regs = native_wrmsr_safe_regs,
+
.read_tsc = native_read_tsc,
.read_pmc = native_read_pmc,
On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..e74df95 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> .wbinvd = native_wbinvd,
>
> .read_msr = native_read_msr_safe,
> + .rdmsr_regs = native_rdmsr_safe_regs,
> .write_msr = xen_write_msr_safe,
> + .wrmsr_regs = native_wrmsr_safe_regs,
> +
> .read_tsc = native_read_tsc,
> .read_pmc = native_read_pmc,
>
Okay, tell me again:
why do we have these hooks again? Can we please weed out paravirt
methods that have no users?
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Commit-ID: 1ab46fd319bcf1fcd9fb6311727d532b580e4eba
Gitweb: http://git.kernel.org/tip/1ab46fd319bcf1fcd9fb6311727d532b580e4eba
Author: Konrad Rzeszutek Wilk <[email protected]>
AuthorDate: Wed, 30 May 2012 18:23:56 -0400
Committer: H. Peter Anvin <[email protected]>
CommitDate: Wed, 30 May 2012 16:15:02 -0700
x86, amd, xen: Avoid NULL pointer paravirt references
Stub out MSR methods that aren't actually needed. This fixes a crash
as Xen Dom0 on AMD Trinity systems. A bigger patch should be added to
remove the paravirt machinery completely for the methods which
apparently have no users!
Reported-by: Andre Przywara <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: <[email protected]>
---
arch/x86/xen/enlighten.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..e74df95 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
.wbinvd = native_wbinvd,
.read_msr = native_read_msr_safe,
+ .rdmsr_regs = native_rdmsr_safe_regs,
.write_msr = xen_write_msr_safe,
+ .wrmsr_regs = native_wrmsr_safe_regs,
+
.read_tsc = native_read_tsc,
.read_pmc = native_read_pmc,
>>> On 30.05.12 at 19:17, Konrad Rzeszutek Wilk <[email protected]> wrote:
> I am tempted to write a patch that checks all the pv-cpu-ops
> to see if there are any that are NULL and throw a warning so
> that this does not hit us in the future - to be at least more
> proactive about this sort of thing.
Perhaps rather than using C99 initializers, using old-style ones
would be an alternative (assuming that the signatures of the
respective entries [or at least immediately neighboring ones]
are different), with a sentinel that is required to remain last
(i.e. adding at the very end would be prohibited)?
Or rather than doing a full structure assignment, assign
individual members directly to pv_cpu_ops (thus leaving
everything that's not explicitly overridden at its "native"
default)? After all, this is being done on __init code, so the
few extra code bytes shouldn't matter much? (All this of
course in the context of hpa's valid request that there be
no unused paravirt hooks in the first place.)
Jan
>>> On 30.05.12 at 20:17, Borislav Petkov <[email protected]> wrote:
> On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.
This again is something that shouldn't even be attempted under
Xen. The hypervisor, unless really old, does this (and wouldn't
allow the write by any domain - privileged or not - anyway).
There's one more user though - the code triggered by the
"show_msr=" command line option. This one indeed requires
rdmsr_safe_regs to be functional (albeit under Xen, once
again, this won't work currently anyway for those MRS on
old CPUs where the special key in %edi is required, which the
emulation code in Xen doesn't support).
Jan
On 05/31/2012 12:33 AM, Konrad Rzeszutek Wilk wrote:
>>> Now, someone probably needs to paravirt the *safe_regs variants in case
>>> something else decides to use them. I don't know what to do here, do I
>>> want more paravirt code in there? No. I guess if this is done carefully
>>> and cleanly, then it should be ok but it can't be done like that - it
>>> needs to adhere to the current pv_cpu_ops thing which is already there.
>
> Using the native variant seems the right thing to do.
>>>
>>
>> I thought I was being told that Xen would trap and emulate the
>> rdmsr/wrmsr instructions. I guess they don't want to do that for the
>
> It does.
>> handful of performance-sensitive MSRs there are, but those don't use the
>> *_regs variants.
>
> The underlaying issue (as I understand) was that .rdmsr_regs
> (and the corresponding write) was NULL and that caused the crash.
> This tiny patch should do it:
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..e74df95 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> .wbinvd = native_wbinvd,
>
> .read_msr = native_read_msr_safe,
> + .rdmsr_regs = native_rdmsr_safe_regs,
> .write_msr = xen_write_msr_safe,
> + .wrmsr_regs = native_wrmsr_safe_regs,
> +
> .read_tsc = native_read_tsc,
> .read_pmc = native_read_pmc,
>
>
Acked-by: Andre Przywara <[email protected]>
This works on the test machine.
Though I'd still like to have my original patch applied, because it
makes the thing a bit cleaner.
And I made a patch to remove the {rd,wr}msr_regs hooks from paravirt_ops
completely. Shall I send it out or do you want to make this part of
larger patch series to clean up pvops?
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
> >diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> >index 75f33b2..e74df95 100644
> >--- a/arch/x86/xen/enlighten.c
> >+++ b/arch/x86/xen/enlighten.c
> >@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> > .wbinvd = native_wbinvd,
> >
> > .read_msr = native_read_msr_safe,
> >+ .rdmsr_regs = native_rdmsr_safe_regs,
> > .write_msr = xen_write_msr_safe,
> >+ .wrmsr_regs = native_wrmsr_safe_regs,
> >+
> > .read_tsc = native_read_tsc,
> > .read_pmc = native_read_pmc,
> >
> >
>
> Acked-by: Andre Przywara <[email protected]>
>
> This works on the test machine.
Great! Thanks for doing a quick test for this.
>
> Though I'd still like to have my original patch applied, because it
> makes the thing a bit cleaner.
OK. Please re-send with an up-to-date git commit as suggested
by Peter.
>
> And I made a patch to remove the {rd,wr}msr_regs hooks from
> paravirt_ops completely. Shall I send it out or do you want to make
> this part of larger patch series to clean up pvops?
Please do send it out. Thanks again!
On 05/31/2012 12:17 AM, Jan Beulich wrote:
>>>> On 30.05.12 at 19:17, Konrad Rzeszutek Wilk <[email protected]> wrote:
>> I am tempted to write a patch that checks all the pv-cpu-ops
>> to see if there are any that are NULL and throw a warning so
>> that this does not hit us in the future - to be at least more
>> proactive about this sort of thing.
>
> Perhaps rather than using C99 initializers, using old-style ones
> would be an alternative (assuming that the signatures of the
> respective entries [or at least immediately neighboring ones]
> are different), with a sentinel that is required to remain last
> (i.e. adding at the very end would be prohibited)?
>
> Or rather than doing a full structure assignment, assign
> individual members directly to pv_cpu_ops (thus leaving
> everything that's not explicitly overridden at its "native"
> default)? After all, this is being done on __init code, so the
> few extra code bytes shouldn't matter much? (All this of
> course in the context of hpa's valid request that there be
> no unused paravirt hooks in the first place.)
>
Actually there is a really easy way to do this with C99 initializers:
create a macro with all the default assignments, and put that one first.
This is because it is legal to have more than one C99 initializer for
the same member, the last one is the one that takes effect.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Thu, May 31, 2012 at 08:39:15AM +0100, Jan Beulich wrote:
> >>> On 30.05.12 at 20:17, Borislav Petkov <[email protected]> wrote:
> > On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> > The other place where we use the amd_safe variants is an obscure K8,
> > revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> > so I'm assuming people haven't run xen on such boxes yet. Does it need
> > fixing? Probably, if we really really have to.
>
> This again is something that shouldn't even be attempted under
> Xen. The hypervisor, unless really old, does this (and wouldn't
> allow the write by any domain - privileged or not - anyway).
>
> There's one more user though - the code triggered by the
> "show_msr=" command line option. This one indeed requires
> rdmsr_safe_regs to be functional (albeit under Xen, once
> again, this won't work currently anyway for those MRS on
> old CPUs where the special key in %edi is required, which the
> emulation code in Xen doesn't support).
This doesn't look right. Yinghai, why does generic x86 code use
rdmsrl_amd_safe - it should simply use rdmsrl_safe.
--
Regards/Gruss,
Boris.
* H. Peter Anvin <[email protected]> wrote:
> On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 75f33b2..e74df95 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> > .wbinvd = native_wbinvd,
> >
> > .read_msr = native_read_msr_safe,
> > + .rdmsr_regs = native_rdmsr_safe_regs,
> > .write_msr = xen_write_msr_safe,
> > + .wrmsr_regs = native_wrmsr_safe_regs,
> > +
> > .read_tsc = native_read_tsc,
> > .read_pmc = native_read_pmc,
> >
>
> Okay, tell me again:
>
> why do we have these hooks again? Can we please weed out
> paravirt methods that have no users?
ping?
Ingo
On Wed, Jun 06, 2012 at 11:27:13AM +0200, Ingo Molnar wrote:
>
> * H. Peter Anvin <[email protected]> wrote:
>
> > On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > > index 75f33b2..e74df95 100644
> > > --- a/arch/x86/xen/enlighten.c
> > > +++ b/arch/x86/xen/enlighten.c
> > > @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> > > .wbinvd = native_wbinvd,
> > >
> > > .read_msr = native_read_msr_safe,
> > > + .rdmsr_regs = native_rdmsr_safe_regs,
> > > .write_msr = xen_write_msr_safe,
> > > + .wrmsr_regs = native_wrmsr_safe_regs,
> > > +
> > > .read_tsc = native_read_tsc,
> > > .read_pmc = native_read_pmc,
> > >
> >
> > Okay, tell me again:
> >
> > why do we have these hooks again? Can we please weed out
> > paravirt methods that have no users?
>
> ping?
I think we solved all that and hpa wanted to queue them up shortly:
http://marc.info/?l=linux-kernel&m=133856342618027&w=2
If it's easier, I can send you a pull request later.
Thanks.
--
Regards/Gruss,
Boris.
* Borislav Petkov <[email protected]> wrote:
> On Wed, Jun 06, 2012 at 11:27:13AM +0200, Ingo Molnar wrote:
> >
> > * H. Peter Anvin <[email protected]> wrote:
> >
> > > On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> > > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > > > index 75f33b2..e74df95 100644
> > > > --- a/arch/x86/xen/enlighten.c
> > > > +++ b/arch/x86/xen/enlighten.c
> > > > @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> > > > .wbinvd = native_wbinvd,
> > > >
> > > > .read_msr = native_read_msr_safe,
> > > > + .rdmsr_regs = native_rdmsr_safe_regs,
> > > > .write_msr = xen_write_msr_safe,
> > > > + .wrmsr_regs = native_wrmsr_safe_regs,
> > > > +
> > > > .read_tsc = native_read_tsc,
> > > > .read_pmc = native_read_pmc,
> > > >
> > >
> > > Okay, tell me again:
> > >
> > > why do we have these hooks again? Can we please weed out
> > > paravirt methods that have no users?
> >
> > ping?
>
> I think we solved all that and hpa wanted to queue them up shortly:
>
> http://marc.info/?l=linux-kernel&m=133856342618027&w=2
Great, will wait for hpa then.
Thanks,
Ingo