2013-05-06 13:05:08

by Konrad Rzeszutek Wilk

Subject: [PATCH] Patches to make PVHVM VCPU hotplug work with VCPUOP_register_vcpu_info (v1).

Hey,

The first patch is a good candidate for stable; the others are
cleanups that I spotted.

The patch:
[PATCH 1/3] xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.

along with the two earlier patches:
66ff0fe xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline
888b65b xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.

make it possible to do VCPU hotplug in a PVHVM guest. This means that
with Xen 4.1 it works. Xen 4.2 and Xen 4.3 had a regression wherein the
VCPUOP_register_vcpu_info hypercall did not work in HVM mode, which meant:
- No events delivered to more than 32 VCPUs. I am not exactly
sure what that means, except I think that IPIs would stop working on
guests with more than 32 VCPUs.

- Could not take advantage of the per-CPU page allocation for events
offered by the hypercall.

Anyhow, the regression is fixed in Xen 4.3 (and should appear in Xen 4.2.?)
and with these attached patches the VCPU hotplug mechanism will work.
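
For testing, the hotplug path can be exercised both from the toolstack
side and from inside the guest. A minimal example (the domain name is
illustrative):

  # In dom0: change the number of active VCPUs for the guest.
  xl vcpu-set guest-domain 4

  # Inside the guest: offline and then online a VCPU via sysfs.
  echo 0 > /sys/devices/system/cpu/cpu1/online
  echo 1 > /sys/devices/system/cpu/cpu1/online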

There are also miscellaneous cleanup patches here.

Note that during testing I found that the combination maxvcpus >= vcpus
for PVHVM with v3.9 hits a generic bug: a dead-lock in the microcode
update code. I have asked the x86 folks for assistance on that, as it
would seem to appear on generic platforms too.


arch/x86/xen/enlighten.c | 39 ++++++++++++++++++++++++++++++++++++++-
arch/x86/xen/spinlock.c | 2 +-
2 files changed, 39 insertions(+), 2 deletions(-)

Konrad Rzeszutek Wilk (4):
xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.
xen/vcpu: Document the xen_vcpu_info and xen_vcpu
xen/smp/pvhvm: Don't point per_cpu(xen_vcpu, 33 and larger) to shared_info
xen/spinlock: Fix check from greater than to also be greater or equal to.


2013-05-06 13:05:14

by Konrad Rzeszutek Wilk

Subject: [PATCH 1/4] xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.

If a user did:

echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online

we would (this is a build with DEBUG enabled) get to:
smpboot: ++++++++++++++++++++=_---CPU UP 1
.. snip..
smpboot: Stack at about ffff880074c0ff44
smpboot: CPU1: has booted.

and hang. The RCU mechanism would kick in and try to IPI CPU1,
but the IPIs (and all other interrupts) would never arrive at
CPU1. At first glance, at least. A bit of digging in the hypervisor
trace (using xenanalyze) shows that:

[vla] d4v1 vec 243 injecting
0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
] 0.043163639 --|x d4v1 vmentry cycles 1468
] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
0.043164913 --|x d4v1 inj_virq vec 243 real
[vla] d4v1 vec 243 injecting
0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
] 0.043165526 --|x d4v1 vmentry cycles 1472
] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
0.043166800 --|x d4v1 inj_virq vec 243 real
[vla] d4v1 vec 243 injecting

there is a pending event (subsequent debugging shows it is the IPI
from VCPU0, sent when smpboot.c on VCPU1 has done
"set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is
interrupted with the callback IPI (0xf3, aka 243), which ends up calling
__xen_evtchn_do_upcall.

__xen_evtchn_do_upcall seems to do *something*, but it does not
acknowledge the pending events. And the moment the guest does a 'cli'
(that is the ffffffff81673254 in the log above) the hypervisor is
invoked again to inject the IPI (0xf3) to tell the guest it has pending
interrupts. This repeats itself forever.

The culprit was the per_cpu(xen_vcpu, cpu) pointer. At bootup
we set each per_cpu(xen_vcpu, cpu) to point to
shared_info->vcpu_info[vcpu], but later on we use the
VCPUOP_register_vcpu_info hypercall to register per-CPU structures
(xen_vcpu_setup). This is done to allow events for more than 32 VCPUs
and for performance reasons.

When the user performs the VCPU hotplug we end up calling
xen_vcpu_setup once more. We make the hypercall, which returns
-EINVAL as the hypervisor does not allow multiple registration calls
(and has already recorded where the events are being set). We pick
the fallback case and set per_cpu(xen_vcpu, cpu) to point to
shared_info->vcpu_info[vcpu] (which is a good fallback during bootup).
However, the hypervisor is still setting events in the registered
per-CPU structure (per_cpu(xen_vcpu_info, cpu)).

As such, when events are set by the hypervisor (such as the timer
one) and we iterate in __xen_evtchn_do_upcall, we end up reading stale
events from shared_info->vcpu_info[vcpu] instead of from the
per_cpu(xen_vcpu_info, cpu) structure. Hence we never acknowledge the
events that the hypervisor has set, and the hypervisor keeps on
reminding us to ack the events, which we never do.

The fix is simple: the second time xen_vcpu_setup is called, don't
over-write per_cpu(xen_vcpu, cpu) if it already points to
per_cpu(xen_vcpu_info, cpu).
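
To make the flow concrete, here is a condensed sketch of the relevant
logic in xen_vcpu_setup (simplified from the v3.9-era code; the
have_vcpu_info_placement bookkeeping and CPU-count clamping are elided):

  static void xen_vcpu_setup(int cpu)
  {
      struct vcpu_register_vcpu_info info;
      struct vcpu_info *vcpup = &per_cpu(xen_vcpu_info, cpu);
      int err;

      /* Bootup default: point at the shared_info slot (only 32 exist). */
      if (cpu < MAX_VIRT_CPUS)
          per_cpu(xen_vcpu, cpu) =
              &HYPERVISOR_shared_info->vcpu_info[cpu];

      info.mfn = arbitrary_virt_to_mfn(vcpup);
      info.offset = offset_in_page(vcpup);

      /* Succeeds only once per VCPU; any later call returns -EINVAL. */
      err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
      if (err == 0)
          per_cpu(xen_vcpu, cpu) = vcpup;
      /* On error we keep the shared_info slot - a fine fallback at
       * bootup, but wrong after a successful registration, which is
       * why this patch bails out early on the second invocation. */
  }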

CC: [email protected]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/enlighten.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index ddbd54a..94a81f4 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -157,6 +157,21 @@ static void xen_vcpu_setup(int cpu)

BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);

+ /*
+ * This path is called twice on PVHVM - first during bootup via
+ * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
+ * hotplugged: cpu_up -> xen_hvm_cpu_notify.
+ * As we can only do the VCPUOP_register_vcpu_info hypercall once,
+ * let's not over-write its result.
+ *
+ * For PV it is called during restore (xen_vcpu_restore) and bootup
+ * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
+ * use this function.
+ */
+ if (xen_hvm_domain()) {
+ if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
+ return;
+ }
if (cpu < MAX_VIRT_CPUS)
per_cpu(xen_vcpu,cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];

--
1.8.1.4

2013-05-06 13:05:13

by Konrad Rzeszutek Wilk

Subject: [PATCH 2/4] xen/vcpu: Document the xen_vcpu_info and xen_vcpu

They are important structures and it is not clear at first
glance what they are for.

The xen_vcpu is a pointer. By default it points to the shared_info
structure (at this CPU's offset within it). However, if the
VCPUOP_register_vcpu_info hypercall is implemented, we can make the
xen_vcpu pointer point to a per-CPU location instead.
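
A condensed sketch of how the pointer is consumed, based on the
__xen_evtchn_do_upcall loop in drivers/xen/events.c of that era (the
event-channel scanning is elided):

  static void __xen_evtchn_do_upcall(void)
  {
      /* Wherever xen_vcpu points is where we look for, and clear,
       * pending events - so it must be the location the hypervisor
       * is actually writing them to. */
      struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);

      do {
          vcpu_info->evtchn_upcall_pending = 0;

          /* ... scan the pending-event bitmaps and dispatch the
           * handlers bound to each raised event channel ... */
      } while (vcpu_info->evtchn_upcall_pending);
  }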

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/enlighten.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 94a81f4..9b34475 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -85,7 +85,21 @@

EXPORT_SYMBOL_GPL(hypercall_page);

+/*
+ * Pointer to the xen_vcpu_info structure or
+ * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
+ * and xen_vcpu_setup for details. By default it points to shared_info->vcpu_info
+ * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
+ * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
+ * acknowledge pending events.
+ */
DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+/*
+ * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
+ * hypercall. This can be used both in PV and PVHVM mode. The structure
+ * overrides the default per_cpu(xen_vcpu, cpu) value.
+ */
DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);

enum xen_domain_type xen_domain_type = XEN_NATIVE;
@@ -187,7 +201,12 @@ static void xen_vcpu_setup(int cpu)

/* Check to see if the hypervisor will put the vcpu_info
structure where we want it, which allows direct access via
- a percpu-variable. */
+ a percpu-variable.
+ N.B. This hypercall can _only_ be called once per CPU. Subsequent
+ calls will error out with -EINVAL. This is because the
+ hypervisor has no unregister variant and this hypercall does not
+ allow over-writing info.mfn and info.offset.
+ */
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);

if (err) {
--
1.8.1.4
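
For reference, the argument structure for VCPUOP_register_vcpu_info that
the comment above refers to, from Xen's public interface header
(xen/include/public/vcpu.h, abridged):

  struct vcpu_register_vcpu_info {
      uint64_t mfn;    /* mfn of page to place vcpu_info */
      uint32_t offset; /* offset within page */
      uint32_t rsvd;   /* unused */
  };

The hypervisor records mfn and offset on the first successful call and,
as the comment says, offers no way to unregister or change them later.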

2013-05-06 13:05:11

by Konrad Rzeszutek Wilk

Subject: [PATCH 3/4] xen/smp/pvhvm: Don't point per_cpu(xen_vcpu, 33 and larger) to shared_info

As it would point to some data, but not to event channel data (the
vcpu_info array in shared_info is limited to 32 entries).

This means that for PVHVM guests with more than 32 VCPUs, without
the use of VCPUOP_register_vcpu_info any interrupts to VCPUs
larger than 32 would have gone unnoticed during early bootup.

That is OK, as during early bootup, in smp_init, we end up calling
the hotplug mechanism (xen_hvm_cpu_notify), which makes the
VCPUOP_register_vcpu_info call for all VCPUs, and we can then receive
interrupts on VCPUs 33 and further.

This is just a cleanup.
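
To see where the limit of 32 comes from, here is the relevant part of the
shared_info layout (abridged from the Xen interface headers; x86 defines
MAX_VIRT_CPUS as 32):

  struct shared_info {
      /* Only the first 32 VCPUs get a slot here; the rest must use
       * the VCPUOP_register_vcpu_info hypercall instead. */
      struct vcpu_info vcpu_info[MAX_VIRT_CPUS];

      xen_ulong_t evtchn_pending[sizeof(xen_ulong_t) * 8];
      xen_ulong_t evtchn_mask[sizeof(xen_ulong_t) * 8];
      /* ... wallclock time and arch-specific fields follow ... */
  };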

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/enlighten.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 9b34475..18a4009 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1638,6 +1638,9 @@ void __ref xen_hvm_init_shared_info(void)
* online but xen_hvm_init_shared_info is run at resume time too and
* in that case multiple vcpus might be online. */
for_each_online_cpu(cpu) {
+ /* Leave it as NULL. */
+ if (cpu >= MAX_VIRT_CPUS)
+ continue;
per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
}
}
--
1.8.1.4

2013-05-06 13:06:08

by Konrad Rzeszutek Wilk

Subject: [PATCH 4/4] xen/spinlock: Fix check from greater than to also be greater or equal to.

During review of git commit cb9c6f15f318aa3aeb62fe525aa5c6dcf6eee159
("xen/spinlock: Check against default value of -1 for IRQ line.")
Stefano pointed out a bug in the patch: since the unused value of
lock_kicker_irq is -1, any IRQ value greater than or equal to zero
means the line is already in use, so the WARN check must use >= 0
rather than > 0. Unfortunately, due to vacation timing the fix was not
applied then; this patch fixes it up.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/spinlock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 8b54603..3002ec1 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -364,7 +364,7 @@ void __cpuinit xen_init_lock_cpu(int cpu)
int irq;
const char *name;

- WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
+ WARN(per_cpu(lock_kicker_irq, cpu) >= 0, "spinlock on CPU%d exists on IRQ%d!\n",
cpu, per_cpu(lock_kicker_irq, cpu));

/*
--
1.8.1.4

2013-05-06 14:58:06

by Stefano Stabellini

Subject: Re: [PATCH 4/4] xen/spinlock: Fix check from greater than to also be greater or equal to.

On Mon, 6 May 2013, Konrad Rzeszutek Wilk wrote:
> During review of git commit cb9c6f15f318aa3aeb62fe525aa5c6dcf6eee159
> ("xen/spinlock: Check against default value of -1 for IRQ line.")
> Stefano pointed out a bug in the patch: since the unused value of
> lock_kicker_irq is -1, any IRQ value greater than or equal to zero
> means the line is already in use, so the WARN check must use >= 0
> rather than > 0. Unfortunately, due to vacation timing the fix was not
> applied then; this patch fixes it up.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Acked-by: Stefano Stabellini <[email protected]>


> arch/x86/xen/spinlock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> index 8b54603..3002ec1 100644
> --- a/arch/x86/xen/spinlock.c
> +++ b/arch/x86/xen/spinlock.c
> @@ -364,7 +364,7 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> int irq;
> const char *name;
>
> - WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
> + WARN(per_cpu(lock_kicker_irq, cpu) >= 0, "spinlock on CPU%d exists on IRQ%d!\n",
> cpu, per_cpu(lock_kicker_irq, cpu));
>
> /*
> --
> 1.8.1.4
>

2013-05-06 14:59:00

by Stefano Stabellini

Subject: Re: [PATCH 2/4] xen/vcpu: Document the xen_vcpu_info and xen_vcpu

On Mon, 6 May 2013, Konrad Rzeszutek Wilk wrote:
> They are important structures and it is not clear at first
> glance what they are for.
>
> The xen_vcpu is a pointer. By default it points to the shared_info
> structure (at this CPU's offset within it). However, if the
> VCPUOP_register_vcpu_info hypercall is implemented, we can make the
> xen_vcpu pointer point to a per-CPU location instead.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>


Acked-by: Stefano Stabellini <[email protected]>

> arch/x86/xen/enlighten.c | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 94a81f4..9b34475 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -85,7 +85,21 @@
>
> EXPORT_SYMBOL_GPL(hypercall_page);
>
> +/*
> + * Pointer to the xen_vcpu_info structure or
> + * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
> + * and xen_vcpu_setup for details. By default it points to shared_info->vcpu_info
> + * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
> + * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
> + * acknowledge pending events.
> + */
> DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
> +
> +/*
> + * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
> + * hypercall. This can be used both in PV and PVHVM mode. The structure
> + * overrides the default per_cpu(xen_vcpu, cpu) value.
> + */
> DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
>
> enum xen_domain_type xen_domain_type = XEN_NATIVE;
> @@ -187,7 +201,12 @@ static void xen_vcpu_setup(int cpu)
>
> /* Check to see if the hypervisor will put the vcpu_info
> structure where we want it, which allows direct access via
> - a percpu-variable. */
> + a percpu-variable.
> + N.B. This hypercall can _only_ be called once per CPU. Subsequent
> + calls will error out with -EINVAL. This is because the
> + hypervisor has no unregister variant and this hypercall does not
> + allow over-writing info.mfn and info.offset.
> + */
> err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
>
> if (err) {
> --
> 1.8.1.4
>

2013-05-06 15:00:27

by Stefano Stabellini

Subject: Re: [PATCH 3/4] xen/smp/pvhvm: Don't point per_cpu(xen_vcpu, 33 and larger) to shared_info

On Mon, 6 May 2013, Konrad Rzeszutek Wilk wrote:
> As it would point to some data, but not to event channel data (the
> vcpu_info array in shared_info is limited to 32 entries).
>
> This means that for PVHVM guests with more than 32 VCPUs, without
> the use of VCPUOP_register_vcpu_info any interrupts to VCPUs
> larger than 32 would have gone unnoticed during early bootup.
>
> That is OK, as during early bootup, in smp_init, we end up calling
> the hotplug mechanism (xen_hvm_cpu_notify), which makes the
> VCPUOP_register_vcpu_info call for all VCPUs, and we can then receive
> interrupts on VCPUs 33 and further.
>
> This is just a cleanup.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Acked-by: Stefano Stabellini <[email protected]>


> arch/x86/xen/enlighten.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 9b34475..18a4009 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1638,6 +1638,9 @@ void __ref xen_hvm_init_shared_info(void)
> * online but xen_hvm_init_shared_info is run at resume time too and
> * in that case multiple vcpus might be online. */
> for_each_online_cpu(cpu) {
> + /* Leave it as NULL. */
> + if (cpu >= MAX_VIRT_CPUS)
> + continue;
> per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
> }
> }
> --
> 1.8.1.4
>

2013-05-06 15:08:17

by Stefano Stabellini

Subject: Re: [PATCH 1/4] xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.

On Mon, 6 May 2013, Konrad Rzeszutek Wilk wrote:
> If a user did:
>
> echo 0 > /sys/devices/system/cpu/cpu1/online
> echo 1 > /sys/devices/system/cpu/cpu1/online
>
> we would (this is a build with DEBUG enabled) get to:
> smpboot: ++++++++++++++++++++=_---CPU UP 1
> .. snip..
> smpboot: Stack at about ffff880074c0ff44
> smpboot: CPU1: has booted.
>
> and hang. The RCU mechanism would kick in and try to IPI CPU1,
> but the IPIs (and all other interrupts) would never arrive at
> CPU1. At first glance, at least. A bit of digging in the hypervisor
> trace (using xenanalyze) shows that:
>
> [vla] d4v1 vec 243 injecting
> 0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
> ] 0.043163639 --|x d4v1 vmentry cycles 1468
> ] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
> 0.043164913 --|x d4v1 inj_virq vec 243 real
> [vla] d4v1 vec 243 injecting
> 0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
> ] 0.043165526 --|x d4v1 vmentry cycles 1472
> ] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
> 0.043166800 --|x d4v1 inj_virq vec 243 real
> [vla] d4v1 vec 243 injecting
>
> there is a pending event (subsequent debugging shows it is the IPI
> from VCPU0, sent when smpboot.c on VCPU1 has done
> "set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is
> interrupted with the callback IPI (0xf3, aka 243), which ends up calling
> __xen_evtchn_do_upcall.
>
> __xen_evtchn_do_upcall seems to do *something*, but it does not
> acknowledge the pending events. And the moment the guest does a 'cli'
> (that is the ffffffff81673254 in the log above) the hypervisor is
> invoked again to inject the IPI (0xf3) to tell the guest it has pending
> interrupts. This repeats itself forever.
>
> The culprit was the per_cpu(xen_vcpu, cpu) pointer. At bootup
> we set each per_cpu(xen_vcpu, cpu) to point to
> shared_info->vcpu_info[vcpu], but later on we use the
> VCPUOP_register_vcpu_info hypercall to register per-CPU structures
> (xen_vcpu_setup). This is done to allow events for more than 32 VCPUs
> and for performance reasons.
>
> When the user performs the VCPU hotplug we end up calling
> xen_vcpu_setup once more. We make the hypercall, which returns
> -EINVAL as the hypervisor does not allow multiple registration calls
> (and has already recorded where the events are being set). We pick
> the fallback case and set per_cpu(xen_vcpu, cpu) to point to
> shared_info->vcpu_info[vcpu] (which is a good fallback during bootup).
> However, the hypervisor is still setting events in the registered
> per-CPU structure (per_cpu(xen_vcpu_info, cpu)).
>
> As such, when events are set by the hypervisor (such as the timer
> one) and we iterate in __xen_evtchn_do_upcall, we end up reading stale
> events from shared_info->vcpu_info[vcpu] instead of from the
> per_cpu(xen_vcpu_info, cpu) structure. Hence we never acknowledge the
> events that the hypervisor has set, and the hypervisor keeps on
> reminding us to ack the events, which we never do.
>
> The fix is simple: the second time xen_vcpu_setup is called, don't
> over-write per_cpu(xen_vcpu, cpu) if it already points to
> per_cpu(xen_vcpu_info, cpu).

Acked-by: Stefano Stabellini <[email protected]>


> CC: [email protected]
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> arch/x86/xen/enlighten.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index ddbd54a..94a81f4 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -157,6 +157,21 @@ static void xen_vcpu_setup(int cpu)
>
> BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
>
> + /*
> + * This path is called twice on PVHVM - first during bootup via
> + * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
> + * hotplugged: cpu_up -> xen_hvm_cpu_notify.
> + * As we can only do the VCPUOP_register_vcpu_info hypercall once,
> + * let's not over-write its result.
> + *
> + * For PV it is called during restore (xen_vcpu_restore) and bootup
> + * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
> + * use this function.
> + */
> + if (xen_hvm_domain()) {
> + if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
> + return;
> + }
> if (cpu < MAX_VIRT_CPUS)
> per_cpu(xen_vcpu,cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
>
> --
> 1.8.1.4
>

2013-05-07 09:30:52

by Ian Campbell

Subject: Re: [Xen-devel] [PATCH 2/4] xen/vcpu: Document the xen_vcpu_info and xen_vcpu

On Mon, 2013-05-06 at 14:04 +0100, Konrad Rzeszutek Wilk wrote:
> They are important structures and it is not clear at first
> glance what they are for.
>
> The xen_vcpu is a pointer. By default it points to the shared_info
> structure (at this CPU's offset within it). However, if the
> VCPUOP_register_vcpu_info hypercall is implemented, we can make the
> xen_vcpu pointer point to a per-CPU location instead.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> arch/x86/xen/enlighten.c | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 94a81f4..9b34475 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -85,7 +85,21 @@
>
> EXPORT_SYMBOL_GPL(hypercall_page);
>
> +/*
> + * Pointer to the xen_vcpu_info structure or
> + * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
> + * and xen_vcpu_setup for details. By default it points to shared_info->vcpu_info
> + * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
> + * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
> + * acknowledge pending events.

Also, more subtly, it is used by the patched versions of irq
enable/disable, e.g. xen_irq_enable_direct and xen_iret.

The desire to be able to do those mask/unmask operations as a single
instruction by using the per-cpu offset held in %gs is the real reason
vcpu info is in a per-cpu pointer and for the whole register_vcpu_info
thing IIRC.
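
For illustration, the C version of this fast path looks roughly like the
following (patterned on arch/x86/xen/irq.c; the _direct variants are
hand-written assembly that do the store straight through %gs):

  static void xen_irq_enable(void)
  {
      struct vcpu_info *vcpu = this_cpu_read(xen_vcpu);

      /* Unmask event delivery for this VCPU; in the assembly version
       * this is a single %gs-relative store. */
      vcpu->evtchn_upcall_mask = 0;

      barrier(); /* unmask before checking for pending events */

      /* If an event arrived while we were masked, force an upcall. */
      if (vcpu->evtchn_upcall_pending)
          xen_force_evtchn_callback();
  }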

> + */
> DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
> +
> +/*
> + * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
> + * hypercall. This can be used both in PV and PVHVM mode. The structure
> + * overrides the default per_cpu(xen_vcpu, cpu) value.
> + */
> DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
>
> enum xen_domain_type xen_domain_type = XEN_NATIVE;
> @@ -187,7 +201,12 @@ static void xen_vcpu_setup(int cpu)
>
> /* Check to see if the hypervisor will put the vcpu_info
> structure where we want it, which allows direct access via
> - a percpu-variable. */
> + a percpu-variable.
> + N.B. This hypercall can _only_ be called once per CPU. Subsequent
> + calls will error out with -EINVAL. This is because the
> + hypervisor has no unregister variant and this hypercall does not
> + allow over-writing info.mfn and info.offset.
> + */
> err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
>
> if (err) {

2013-05-07 14:06:29

by Konrad Rzeszutek Wilk

Subject: Re: [Xen-devel] [PATCH 2/4] xen/vcpu: Document the xen_vcpu_info and xen_vcpu

On Tue, May 07, 2013 at 10:30:47AM +0100, Ian Campbell wrote:
> On Mon, 2013-05-06 at 14:04 +0100, Konrad Rzeszutek Wilk wrote:
> > They are important structures and it is not clear at first
> > glance what they are for.
> >
> > The xen_vcpu is a pointer. By default it points to the shared_info
> > structure (at this CPU's offset within it). However, if the
> > VCPUOP_register_vcpu_info hypercall is implemented, we can make the
> > xen_vcpu pointer point to a per-CPU location instead.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> > ---
> > arch/x86/xen/enlighten.c | 21 ++++++++++++++++++++-
> > 1 file changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 94a81f4..9b34475 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -85,7 +85,21 @@
> >
> > EXPORT_SYMBOL_GPL(hypercall_page);
> >
> > +/*
> > + * Pointer to the xen_vcpu_info structure or
> > + * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
> > + * and xen_vcpu_setup for details. By default it points to shared_info->vcpu_info
> > + * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
> > + * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
> > + * acknowledge pending events.
>
> Also more subtly it is used by the patched version of irq enable/disable
> e.g. xen_irq_enable_direct and xen_iret.
>
> The desire to be able to do those mask/unmask operations as a single
> instruction by using the per-cpu offset held in %gs is the real reason
> vcpu info is in a per-cpu pointer and for the whole register_vcpu_info
> thing IIRC.

From a520996ae2e2792e1f90b74e67c974120c8a3b83 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <[email protected]>
Date: Sun, 5 May 2013 08:51:47 -0400
Subject: [PATCH] xen/vcpu: Document the xen_vcpu_info and xen_vcpu

They are important structures and it is not clear at first
glance what they are for.

The xen_vcpu is a pointer. By default it points to the shared_info
structure (at this CPU's offset within it). However, if the
VCPUOP_register_vcpu_info hypercall is implemented, we can make the
xen_vcpu pointer point to a per-CPU location instead.

Acked-by: Stefano Stabellini <[email protected]>
[v1: Added comments from Ian Campbell]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/enlighten.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 94a81f4..a2babdb 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -85,7 +85,29 @@

EXPORT_SYMBOL_GPL(hypercall_page);

+/*
+ * Pointer to the xen_vcpu_info structure or
+ * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
+ * and xen_vcpu_setup for details. By default it points to shared_info->vcpu_info
+ * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
+ * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
+ * acknowledge pending events.
+ * Also, more subtly, it is used by the patched versions of irq
+ * enable/disable, e.g. xen_irq_enable_direct and xen_iret in PV mode.
+ *
+ * The desire to be able to do those mask/unmask operations as a single
+ * instruction by using the per-cpu offset held in %gs is the real reason
+ * vcpu info is in a per-cpu pointer and the original reason for this
+ * hypercall.
+ *
+ */
DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+/*
+ * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
+ * hypercall. This can be used both in PV and PVHVM mode. The structure
+ * overrides the default per_cpu(xen_vcpu, cpu) value.
+ */
DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);

enum xen_domain_type xen_domain_type = XEN_NATIVE;
@@ -187,7 +209,12 @@ static void xen_vcpu_setup(int cpu)

/* Check to see if the hypervisor will put the vcpu_info
structure where we want it, which allows direct access via
- a percpu-variable. */
+ a percpu-variable.
+ N.B. This hypercall can _only_ be called once per CPU. Subsequent
+ calls will error out with -EINVAL. This is because the
+ hypervisor has no unregister variant and this hypercall does not
+ allow over-writing info.mfn and info.offset.
+ */
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);

if (err) {
--
1.8.1.4