2023-04-14 23:46:23

by Thomas Gleixner

[permalink] [raw]
Subject: [patch 16/37] x86/xen/smp_pv: Remove wait for CPU online

Now that the core code drops sparse_irq_lock after the idle thread
synchronized, it's pointless to wait for the AP to mark itself online.

Whether the control CPU runs in a wait loop or sleeps in the core code
waiting for the online operation to complete makes no difference.

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: [email protected]
---
arch/x86/xen/smp_pv.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -340,11 +340,11 @@ static int xen_pv_cpu_up(unsigned int cp

xen_pmu_init(cpu);

- rc = HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL);
- BUG_ON(rc);
-
- while (cpu_report_state(cpu) != CPU_ONLINE)
- HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
+ /*
+ * Why is this a BUG? If the hypercall fails then everything can be
+ * rolled back, no?
+ */
+ BUG_ON(HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL));

return 0;
}


2023-04-17 20:53:05

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [patch 16/37] x86/xen/smp_pv: Remove wait for CPU online



On 4/14/23 7:44 PM, Thomas Gleixner wrote:
> Now that the core code drops sparse_irq_lock after the idle thread
> synchronized, it's pointless to wait for the AP to mark itself online.
>
> Whether the control CPU runs in a wait loop or sleeps in the core code
> waiting for the online operation to complete makes no difference.
>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Boris Ostrovsky <[email protected]>
> Cc: [email protected]
> ---
> arch/x86/xen/smp_pv.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- a/arch/x86/xen/smp_pv.c
> +++ b/arch/x86/xen/smp_pv.c
> @@ -340,11 +340,11 @@ static int xen_pv_cpu_up(unsigned int cp
>
> xen_pmu_init(cpu);
>
> - rc = HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL);
> - BUG_ON(rc);
> -
> - while (cpu_report_state(cpu) != CPU_ONLINE)
> - HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
> + /*
> + * Why is this a BUG? If the hypercall fails then everything can be
> + * rolled back, no?
> + */


In many cases this indicates either some sort of hypervisor internal error or broken logic in the guest, so it is, well, a bug. But I suppose it may also be some transient condition in the hypervisor (I don't see it now but it can happen in the future) so perhaps we should indeed try not to die on the spot.



-boris


> + BUG_ON(HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL));
>
> return 0;
> }
>