2013-04-07 05:54:37

by Zhenzhong Duan

Subject: [PATCH-v2] xen: Don't call arch_trigger_all_cpu_backtrace in dom0(pvm)

NMI isn't supported in dom0, so fall back to the generic all-CPU backtrace code.

Without this fix, on an xAPIC system, sysrq+l shows no backtrace,
because xen_send_IPI_all is called and it doesn't support the NMI vector.

On an x2APIC-enabled system, a NULL pointer dereference occurs, as below.

SysRq : Show backtrace of all active CPUs
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120
Call Trace:
[<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160
[<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20
[<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0
[<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50
[<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10
[<ffffffff8131616d>] __handle_sysrq+0x7d/0x140
[<ffffffff81316230>] ? __handle_sysrq+0x140/0x140
[<ffffffff81316287>] write_sysrq_trigger+0x57/0x60
[<ffffffff811ca996>] proc_reg_write+0x86/0xc0
[<ffffffff8116dd8e>] vfs_write+0xce/0x190
[<ffffffff8116e3e5>] sys_write+0x55/0x90
[<ffffffff8150a242>] system_call_fastpath+0x16/0x1b

That's because apic points to apic_x2apic_cluster or apic_x2apic_phys,
but basic elements such as the cpumask aren't initialized.

-v2: Mask x2APIC feature in pvm to avoid overwrite of apic pointer,
update commit message per Konrad's suggestion.

Signed-off-by: Zhenzhong Duan <[email protected]>
Tested-by: Tamon Shiose <[email protected]>
---
arch/x86/xen/enlighten.c | 3 +++
include/linux/nmi.h | 2 ++
2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index c8e1c7b..12b0718 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -386,6 +386,9 @@ static void __init xen_init_cpuid_mask(void)
cpuid_leaf1_edx_mask &=
~((1 << X86_FEATURE_APIC) | /* disable local APIC */
(1 << X86_FEATURE_ACPI)); /* disable ACPI */
+
+ cpuid_leaf1_ecx_mask &= ~(1 << (X86_FEATURE_X2APIC % 32));
+
ax = 1;
cx = 0;
xen_cpuid(&ax, &bx, &cx, &dx);
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index db50840..b845757 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -32,6 +32,8 @@ static inline void touch_nmi_watchdog(void)
#ifdef arch_trigger_all_cpu_backtrace
static inline bool trigger_all_cpu_backtrace(void)
{
+ if (xen_domain())
+ return false;
arch_trigger_all_cpu_backtrace();

return true;
--
1.7.3
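
For readers following the enlighten.c hunk: the expression (X86_FEATURE_X2APIC % 32) turns a kernel feature number (encoded as word * 32 + bit) into a bit position inside one CPUID register. Below is a minimal userspace sketch of that arithmetic, not the kernel code itself. The feature constants mirror the kernel's cpufeature.h encoding; mask_out_feature() and apply_cpuid_mask() are hypothetical helper names introduced only for illustration.

```c
#include <stdint.h>

/* Feature numbers are encoded as word * 32 + bit. The values are
 * copied here so the sketch builds on its own. */
#define X86_FEATURE_APIC    (0 * 32 + 9)   /* CPUID.1:EDX bit 9  */
#define X86_FEATURE_ACPI    (0 * 32 + 22)  /* CPUID.1:EDX bit 22 */
#define X86_FEATURE_X2APIC  (4 * 32 + 21)  /* CPUID.1:ECX bit 21 */

/* Hypothetical helper: clear one feature bit in a 32-bit mask that
 * is later ANDed with a CPUID output register. */
static uint32_t mask_out_feature(uint32_t mask, unsigned int feature)
{
	unsigned int bit = feature % 32;  /* position inside its word */

	return mask & ~(UINT32_C(1) << bit);
}

/* Hypothetical helper: what the guest sees after masking. */
static uint32_t apply_cpuid_mask(uint32_t raw, uint32_t mask)
{
	return raw & mask;
}
```

Starting from an all-ones ECX mask, mask_out_feature(mask, X86_FEATURE_X2APIC) clears only bit 21, so a PV guest that has this mask applied no longer sees x2APIC advertised while every other ECX bit passes through unchanged.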


2013-04-08 07:42:51

by Jan Beulich

Subject: Re: [Xen-devel] [PATCH-v2] xen: Don't call arch_trigger_all_cpu_backtrace in dom0(pvm)

>>> On 07.04.13 at 07:54, Zhenzhong Duan <[email protected]> wrote:
> nmi isn't supported in dom0, fallback to general all cpu backtrace code.

Since when is sending NMIs not supported, and since when is this
Dom0-specific? If you want to deal with this, you should do so
properly: Special case sending NMIs in the respective Xen specific
code (using VCPUOP_send_nmi), and carry this out in a way not
dependent upon running (un)privileged.

> Without fix, on xAPIC system, doing sysrq+l, no backtrace is showed,
> as xen_send_IPI_all is called and it doesn't support nmi vector.
>
> On x2APIC enabled system, got NULL pointer dereference as below.
>
> SysRq : Show backtrace of all active CPUs
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120
> Call Trace:
> [<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160
> [<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20
> [<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0
> [<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50
> [<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10
> [<ffffffff8131616d>] __handle_sysrq+0x7d/0x140
> [<ffffffff81316230>] ? __handle_sysrq+0x140/0x140
> [<ffffffff81316287>] write_sysrq_trigger+0x57/0x60
> [<ffffffff811ca996>] proc_reg_write+0x86/0xc0
> [<ffffffff8116dd8e>] vfs_write+0xce/0x190
> [<ffffffff8116e3e5>] sys_write+0x55/0x90
> [<ffffffff8150a242>] system_call_fastpath+0x16/0x1b
>
> That's because apic points to apic_x2apic_cluster or apic_x2apic_phys
> but the basic element like cpumask isn't initialized.

That's of course a bug on its own, fixing of which would go under
a suitable subject/title.

> -v2: Mask x2APIC feature in pvm to avoid overwrite of apic pointer,
> update commit message per Konrad's suggestion.
>
> Signed-off-by: Zhenzhong Duan <[email protected]>
> Tested-by: Tamon Shiose <[email protected]>
> ---
> arch/x86/xen/enlighten.c | 3 +++
> include/linux/nmi.h | 2 ++
> 2 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index c8e1c7b..12b0718 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -386,6 +386,9 @@ static void __init xen_init_cpuid_mask(void)
> cpuid_leaf1_edx_mask &=
> ~((1 << X86_FEATURE_APIC) | /* disable local APIC */
> (1 << X86_FEATURE_ACPI)); /* disable ACPI */
> +
> + cpuid_leaf1_ecx_mask &= ~(1 << (X86_FEATURE_X2APIC % 32));
> +

Bottom line - while this part may be fine (under a different title), ...

> ax = 1;
> cx = 0;
> xen_cpuid(&ax, &bx, &cx, &dx);
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index db50840..b845757 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -32,6 +32,8 @@ static inline void touch_nmi_watchdog(void)
> #ifdef arch_trigger_all_cpu_backtrace
> static inline bool trigger_all_cpu_backtrace(void)
> {
> + if (xen_domain())
> + return false;

... this part clearly isn't.

Jan

> arch_trigger_all_cpu_backtrace();
>
> return true;
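
The alternative suggested at the top of this reply (special-casing NMIs in the Xen-specific IPI code via VCPUOP_send_nmi, independent of whether the domain is privileged) could look roughly like the userspace model below. This is only a sketch under stated assumptions: the hypercall and the event-channel path are replaced by stubs that record what was requested, the numeric value of VCPUOP_send_nmi is assumed from Xen's public vcpu.h, and every sketch_* and stub_* name is hypothetical.

```c
#include <stddef.h>

#define NMI_VECTOR       0x02  /* x86 NMI vector number */
#define VCPUOP_send_nmi  11    /* assumed value from Xen's public vcpu.h */
#define MAX_VCPUS        8     /* arbitrary size for this sketch */

/* Stub state: counts of what the fake hypervisor was asked to do. */
static int nmi_sent[MAX_VCPUS];
static int evtchn_sent[MAX_VCPUS];

/* Stand-in for HYPERVISOR_vcpu_op(); the real one is a hypercall. */
static int stub_vcpu_op(int cmd, int vcpu, void *extra)
{
	(void)extra;
	if (cmd != VCPUOP_send_nmi || vcpu < 0 || vcpu >= MAX_VCPUS)
		return -1;
	nmi_sent[vcpu]++;
	return 0;
}

/* Stand-in for the ordinary event-channel based IPI path. */
static int stub_notify_evtchn(int vcpu, int vector)
{
	(void)vector;
	if (vcpu < 0 || vcpu >= MAX_VCPUS)
		return -1;
	evtchn_sent[vcpu]++;
	return 0;
}

/* NMIs take the dedicated hypercall; everything else is unchanged.
 * Note there is no dom0/domU check anywhere in this path. */
static int sketch_send_ipi_one(int vcpu, int vector)
{
	if (vector == NMI_VECTOR)
		return stub_vcpu_op(VCPUOP_send_nmi, vcpu, NULL);
	return stub_notify_evtchn(vcpu, vector);
}

/* Send to every vCPU whose bit is set; returns the failure count. */
static int sketch_send_ipi_mask(unsigned int mask, int vector)
{
	int vcpu, failures = 0;

	for (vcpu = 0; vcpu < MAX_VCPUS; vcpu++)
		if ((mask & (1u << vcpu)) && sketch_send_ipi_one(vcpu, vector))
			failures++;
	return failures;
}
```

The sketch only covers the sending side. The receiving vCPU still needs a way to take the NMI, which this model does not attempt to show.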

2013-04-09 16:36:19

by Ian Campbell

Subject: Re: [Xen-devel] [PATCH-v2] xen: Don't call arch_trigger_all_cpu_backtrace in dom0(pvm)

On Mon, 2013-04-08 at 08:42 +0100, Jan Beulich wrote:
> >>> On 07.04.13 at 07:54, Zhenzhong Duan <[email protected]> wrote:
> > nmi isn't supported in dom0, fallback to general all cpu backtrace code.
>
> Since when is sending NMIs not supported, and since when is this
> Dom0-specific? If you want to deal with this, you should do so
> properly: Special case sending NMIs in the respective Xen specific
> code (using VCPUOP_send_nmi), and carry this out in a way not
> dependent upon running (un)privileged.

You'd also need to implement the upcall support for receiving NMIs,
which IIRC isn't yet done for pvops.

Ian.

2013-05-15 08:40:32

by Zhenzhong Duan

Subject: Re: [Xen-devel] [PATCH-v2] xen: Don't call arch_trigger_all_cpu_backtrace in dom0(pvm)


On 2013-04-10 00:36, Ian Campbell wrote:
> On Mon, 2013-04-08 at 08:42 +0100, Jan Beulich wrote:
>>>>> On 07.04.13 at 07:54, Zhenzhong Duan <[email protected]> wrote:
>>> nmi isn't supported in dom0, fallback to general all cpu backtrace code.
>> Since when is sending NMIs not supported, and since when is this
>> Dom0-specific? If you want to deal with this, you should do so
>> properly: Special case sending NMIs in the respective Xen specific
>> code (using VCPUOP_send_nmi), and carry this out in a way not
>> dependent upon running (un)privileged.
> You'd also need to implement the upcall support for receiving NMIs,
> which IIRC isn't yet done for pvops.
Hi Ian,

Could you suggest which file to change to support the NMI upcall?
I compared with the vMCE code and made a similar change, using
VCPUOP_send_nmi to send an NMI between PVM guest vCPUs, but the NMI
isn't triggered.

thanks
zduan

2013-05-15 09:33:09

by Stefan Bader

Subject: Re: [Xen-devel] [PATCH-v2] xen: Don't call arch_trigger_all_cpu_backtrace in dom0(pvm)

On 08.04.2013 09:42, Jan Beulich wrote:
>>>> On 07.04.13 at 07:54, Zhenzhong Duan <[email protected]> wrote:
>> nmi isn't supported in dom0, fallback to general all cpu backtrace code.
>
> Since when is sending NMIs not supported, and since when is this
> Dom0-specific? If you want to deal with this, you should do so
> properly: Special case sending NMIs in the respective Xen specific
> code (using VCPUOP_send_nmi), and carry this out in a way not
> dependent upon running (un)privileged.

FWIW, it seems different because PVM domUs end up using noop_apic; only dom0
seems to use the Xen replacements for sending IPIs through apic->...
And the last time I looked there was no mapping from NMI_VECTOR to a Xen
vector/handler.

>
>> Without fix, on xAPIC system, doing sysrq+l, no backtrace is showed,
>> as xen_send_IPI_all is called and it doesn't support nmi vector.
>>
>> On x2APIC enabled system, got NULL pointer dereference as below.
>>
>> SysRq : Show backtrace of all active CPUs
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120
>> Call Trace:
>> [<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160
>> [<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20
>> [<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0
>> [<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50
>> [<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10
>> [<ffffffff8131616d>] __handle_sysrq+0x7d/0x140
>> [<ffffffff81316230>] ? __handle_sysrq+0x140/0x140
>> [<ffffffff81316287>] write_sysrq_trigger+0x57/0x60
>> [<ffffffff811ca996>] proc_reg_write+0x86/0xc0
>> [<ffffffff8116dd8e>] vfs_write+0xce/0x190
>> [<ffffffff8116e3e5>] sys_write+0x55/0x90
>> [<ffffffff8150a242>] system_call_fastpath+0x16/0x1b
>>
>> That's because apic points to apic_x2apic_cluster or apic_x2apic_phys
>> but the basic element like cpumask isn't initialized.
>
> That's of course a bug on its own, fixing of which would go under
> a suitable subject/title.
>
>> -v2: Mask x2APIC feature in pvm to avoid overwrite of apic pointer,
>> update commit message per Konrad's suggestion.
>>
>> Signed-off-by: Zhenzhong Duan <[email protected]>
>> Tested-by: Tamon Shiose <[email protected]>
>> ---
>> arch/x86/xen/enlighten.c | 3 +++
>> include/linux/nmi.h | 2 ++
>> 2 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index c8e1c7b..12b0718 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -386,6 +386,9 @@ static void __init xen_init_cpuid_mask(void)
>> cpuid_leaf1_edx_mask &=
>> ~((1 << X86_FEATURE_APIC) | /* disable local APIC */
>> (1 << X86_FEATURE_ACPI)); /* disable ACPI */
>> +
>> + cpuid_leaf1_ecx_mask &= ~(1 << (X86_FEATURE_X2APIC % 32));
>> +
>
> Bottom line - while this part may be fine (under a different title), ...
>
>> ax = 1;
>> cx = 0;
>> xen_cpuid(&ax, &bx, &cx, &dx);
>> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
>> index db50840..b845757 100644
>> --- a/include/linux/nmi.h
>> +++ b/include/linux/nmi.h
>> @@ -32,6 +32,8 @@ static inline void touch_nmi_watchdog(void)
>> #ifdef arch_trigger_all_cpu_backtrace
>> static inline bool trigger_all_cpu_backtrace(void)
>> {
>> + if (xen_domain())
>> + return false;
>
> ... this part clearly isn't.
>
> Jan
>
>> arch_trigger_all_cpu_backtrace();
>>
>> return true;




2013-05-15 10:21:15

by Ian Campbell

Subject: Re: [Xen-devel] [PATCH-v2] xen: Don't call arch_trigger_all_cpu_backtrace in dom0(pvm)

On Wed, 2013-05-15 at 16:40 +0800, Zhenzhong Duan wrote:
> On 2013-04-10 00:36, Ian Campbell wrote:
> > On Mon, 2013-04-08 at 08:42 +0100, Jan Beulich wrote:
> >>>>> On 07.04.13 at 07:54, Zhenzhong Duan <[email protected]> wrote:
> >>> nmi isn't supported in dom0, fallback to general all cpu backtrace code.
> >> Since when is sending NMIs not supported, and since when is this
> >> Dom0-specific? If you want to deal with this, you should do so
> >> properly: Special case sending NMIs in the respective Xen specific
> >> code (using VCPUOP_send_nmi), and carry this out in a way not
> >> dependent upon running (un)privileged.
> > You'd also need to implement the upcall support for receiving NMIs,
> > which IIRC isn't yet done for pvops.
> Hi Ian,
>
> Could you give a suggestion on which file to change to support NMI upcall?
> I compare with vMCE code, made similar change.
> Use VCPUOP_send_nmi to send nmi between pvm guest vcpus, but nmi isn't
> triggered.

You need to register a callback with CALLBACKOP_register
CALLBACKTYPE_nmi. You also need to write the code in entry.S to receive
that callback. IIRC you also need to arrange that returning from an NMI
is always done with HYPERVISOR_iret and not optimised to a direct iret
as it can be otherwise. This is to allow the hypervisor to implement NMI
masking correctly.

The linux-2.6.18-xen.hg tree implements NMI callbacks so you may find
inspiration there, although how upstreamable that approach is I'm not
sure. In particular stealing an EFLAGS bit to force the HYPERVISOR_iret
is certain to not be acceptable upstream.

Ian.
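
The registration step described above can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the pvops implementation: the hypercall is replaced by a stub that records the registered address, the struct layout and the numeric values of CALLBACKOP_register and CALLBACKTYPE_nmi are assumed from Xen's public callback.h, the entry.S stub is represented by an empty C function, and all sketch_* and stub_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CALLBACKOP_register  0  /* assumed value from Xen's callback.h */
#define CALLBACKTYPE_nmi     4  /* assumed value from Xen's callback.h */
#define NR_CALLBACK_TYPES    8  /* arbitrary bound for this sketch */

/* Assumed layout; the real header uses xen_callback_t for address. */
struct callback_register {
	uint16_t type;
	uint16_t flags;
	uintptr_t address;
};

/* Stub hypervisor state: one registered entry point per type. */
static uintptr_t registered[NR_CALLBACK_TYPES];

/* Stand-in for HYPERVISOR_callback_op(); the real one is a hypercall. */
static int stub_callback_op(int cmd, const struct callback_register *reg)
{
	if (cmd != CALLBACKOP_register || reg == NULL)
		return -1;
	if (reg->type >= NR_CALLBACK_TYPES || reg->address == 0)
		return -1;
	registered[reg->type] = reg->address;
	return 0;
}

/* Placeholder for the assembly entry point that would live in entry.S. */
static void sketch_nmi_entry(void)
{
}

/* Register the NMI entry point with the (stubbed) hypervisor. */
static int sketch_register_nmi_callback(void (*entry)(void))
{
	struct callback_register reg = {
		.type = CALLBACKTYPE_nmi,
		.flags = 0,
		.address = (uintptr_t)entry,
	};

	return stub_callback_op(CALLBACKOP_register, &reg);
}

enum sketch_ret_path { RET_DIRECT_IRET, RET_HYPERVISOR_IRET };

/* Models the rule above: leaving an NMI must always go through
 * HYPERVISOR_iret so the hypervisor can track NMI masking. */
static enum sketch_ret_path sketch_return_path(int in_nmi)
{
	return in_nmi ? RET_HYPERVISOR_IRET : RET_DIRECT_IRET;
}
```

How the return path learns that it is leaving an NMI is the open design question mentioned above; the sketch simply takes it as a parameter.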