This started out with me noticing that "dom0_max_vcpus=<N>" with <N>
larger than the number of physical CPUs reported through ACPI tables
would not bring up the "excess" vCPU-s.
Noticing that xen_fill_possible_map() gets called way too early, whereas
xen_filter_cpu_maps() gets called too late (after per-CPU areas were
already set up), and further observing that each of the functions serves
only one of Dom0 or DomU, it looked like it was better to simplify this.
Use the .get_smp_config hook instead, uniformly for Dom0 and DomU.
xen_fill_possible_map() can be dropped altogether, while
xen_filter_cpu_maps() gets re-purposed but not otherwise changed.
Signed-off-by: Jan Beulich <[email protected]>
---
arch/x86/xen/enlighten_pv.c | 4 ----
arch/x86/xen/smp_pv.c | 26 ++++++--------------------
2 files changed, 6 insertions(+), 24 deletions(-)
--- 5.1-rc2/arch/x86/xen/enlighten_pv.c
+++ 5.1-rc2-xen-x86-Dom0-more-vCPUs/arch/x86/xen/enlighten_pv.c
@@ -1381,10 +1381,6 @@ asmlinkage __visible void __init xen_sta
xen_acpi_sleep_register();
- /* Avoid searching for BIOS MP tables */
- x86_init.mpparse.find_smp_config = x86_init_noop;
- x86_init.mpparse.get_smp_config = x86_init_uint_noop;
-
xen_boot_params_init_edd();
}
--- 5.1-rc2/arch/x86/xen/smp_pv.c
+++ 5.1-rc2-xen-x86-Dom0-more-vCPUs/arch/x86/xen/smp_pv.c
@@ -146,28 +146,12 @@ int xen_smp_intr_init_pv(unsigned int cp
return rc;
}
-static void __init xen_fill_possible_map(void)
-{
- int i, rc;
-
- if (xen_initial_domain())
- return;
-
- for (i = 0; i < nr_cpu_ids; i++) {
- rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL);
- if (rc >= 0) {
- num_processors++;
- set_cpu_possible(i, true);
- }
- }
-}
-
-static void __init xen_filter_cpu_maps(void)
+static void __init _get_smp_config(unsigned int early)
{
int i, rc;
unsigned int subtract = 0;
- if (!xen_initial_domain())
+ if (early)
return;
num_processors = 0;
@@ -217,7 +201,6 @@ static void __init xen_pv_smp_prepare_bo
loadsegment(es, __USER_DS);
#endif
- xen_filter_cpu_maps();
xen_setup_vcpu_info_placement();
/*
@@ -503,5 +486,8 @@ static const struct smp_ops xen_smp_ops
void __init xen_smp_init(void)
{
smp_ops = xen_smp_ops;
- xen_fill_possible_map();
+
+ /* Avoid searching for BIOS MP tables */
+ x86_init.mpparse.find_smp_config = x86_init_noop;
+ x86_init.mpparse.get_smp_config = _get_smp_config;
}
On 3/27/19 11:12 AM, Jan Beulich wrote:
> -
> -static void __init xen_filter_cpu_maps(void)
> +static void __init _get_smp_config(unsigned int early)
> {
> int i, rc;
> unsigned int subtract = 0;
>
> - if (!xen_initial_domain())
> + if (early)
> return;
>
> num_processors = 0;
Is there a reason to set_cpu_possible() here (not in the diff, but in
this routine)? This will be called from setup_arch() before
prefill_possible_map(), which will clear and then re-initialize
__cpu_possible_mask.
-boris
>>> On 27.03.19 at 18:07, <[email protected]> wrote:
> On 3/27/19 11:12 AM, Jan Beulich wrote:
>> -
>> -static void __init xen_filter_cpu_maps(void)
>> +static void __init _get_smp_config(unsigned int early)
>> {
>> int i, rc;
>> unsigned int subtract = 0;
>>
>> - if (!xen_initial_domain())
>> + if (early)
>> return;
>>
>> num_processors = 0;
>
>
> Is there a reason to set_cpu_possible() here (not in the diff, but in
> this routine)? This will be called from setup_arch() before
> prefill_possible_map(), which will clear and then re-initialize
> __cpu_possible_mask.
I don't think it's needed before my change either, so I'd call
removing this an orthogonal change. As said in the commit
message, the goal was to leave this function alone as far as
possible.
Before my patch, xen_filter_cpu_maps() gets called long after
prefill_possible_map(), and by the purpose of the latter function
the possible map shouldn't be altered anymore once that has
run. Adding bits to it is surely not going to have the intended
effect (setup_per_cpu_areas() has already run), while removing
bits may have some effect, but comes too late at least for
setup_per_cpu_areas().
And if we were to remove this, I think the CONFIG_HOTPLUG_CPU
section should go away as well. After all prefill_possible_map()
also sets nr_cpu_ids. To be honest, it was largely this code
fragment which made me want not touch the function more than
necessary: The comment there makes not clear to me at all why
all of this needs to be in an #ifdef in the first place.
Let me know whether you really want me to fold this extra
cleanup into this patch. If so, I'd then wonder whether the
set_cpu_present() from xen_pv_smp_prepare_cpus() shouldn't
be moved here, too. And the fiddling with the possible map
there looks bogus as well: Bring-up of CPUs past the command
line option should be avoided at boot time, but they shouldn't
be excluded from getting brought up later on. Note how
native_smp_prepare_cpus() ignores its parameter altogether.
Jan
On 3/28/19 5:03 AM, Jan Beulich wrote:
>>>> On 27.03.19 at 18:07, <[email protected]> wrote:
>> On 3/27/19 11:12 AM, Jan Beulich wrote:
>>> -
>>> -static void __init xen_filter_cpu_maps(void)
>>> +static void __init _get_smp_config(unsigned int early)
>>> {
>>> int i, rc;
>>> unsigned int subtract = 0;
>>>
>>> - if (!xen_initial_domain())
>>> + if (early)
>>> return;
>>>
>>> num_processors = 0;
>>
>> Is there a reason to set_cpu_possible() here (not in the diff, but in
>> this routine)? This will be called from setup_arch() before
>> prefill_possible_map(), which will clear and then re-initialize
>> __cpu_possible_mask.
> I don't think it's needed before my change either, so I'd call
> removing this an orthogonal change. As said in the commit
> message, the goal was to leave this function alone as far as
> possible.
>
> Before my patch, xen_filter_cpu_maps() gets called long after
> prefill_possible_map(), and by the purpose of the latter function
> the possible map shouldn't be altered anymore once that has
> run. Adding bits to it is surely not going to have the intended
> effect (setup_per_cpu_areas() has already run), while removing
> bits may have some effect, but comes too late at least for
> setup_per_cpu_areas().
OK. Then it looks like even though your patch changes behavior slightly
(so technically I guess it's not purely a cleanup) this shouldn't makes
any difference at least as far as possible cpu mask is concerned: if we
don't have percpu areas set up we can't do much for that vcpu since it
seems to me xen_vcpu_setup(), for example, won't do well.
>
> And if we were to remove this, I think the CONFIG_HOTPLUG_CPU
> section should go away as well. After all prefill_possible_map()
> also sets nr_cpu_ids. To be honest, it was largely this code
> fragment which made me want not touch the function more than
> necessary: The comment there makes not clear to me at all why
> all of this needs to be in an #ifdef in the first place.
This was introduced by cf405ae612b0 ("xen/smp: Fix crash when booting
with ACPI hotplug CPUs.").
I am not sure this is still relevant. The ACPI hotplug code had changed,
not significantly but sufficiently enough to alter behavior.
acpi_processor_hotadd_init() now fails before it gets a chance to call
arch_register_cpu() for vcpu>dom0_max_vcpus.
>
> Let me know whether you really want me to fold this extra
> cleanup into this patch. If so, I'd then wonder whether the
> set_cpu_present() from xen_pv_smp_prepare_cpus() shouldn't
> be moved here, too. And the fiddling with the possible map
> there looks bogus as well: Bring-up of CPUs past the command
> line option should be avoided at boot time, but they shouldn't
> be excluded from getting brought up later on. Note how
> native_smp_prepare_cpus() ignores its parameter altogether.
Yes, that does look strange.
Given especially xen_pv_smp_prepare_cpus(), I think re-working proper
setting of present/possible masks is well beyond the scope of your
original patch.
-boris
>>> On 28.03.19 at 17:50, <[email protected]> wrote:
> On 3/28/19 5:03 AM, Jan Beulich wrote:
>>>>> On 27.03.19 at 18:07, <[email protected]> wrote:
>>> On 3/27/19 11:12 AM, Jan Beulich wrote:
>>>> -
>>>> -static void __init xen_filter_cpu_maps(void)
>>>> +static void __init _get_smp_config(unsigned int early)
>>>> {
>>>> int i, rc;
>>>> unsigned int subtract = 0;
>>>>
>>>> - if (!xen_initial_domain())
>>>> + if (early)
>>>> return;
>>>>
>>>> num_processors = 0;
>>>
>>> Is there a reason to set_cpu_possible() here (not in the diff, but in
>>> this routine)? This will be called from setup_arch() before
>>> prefill_possible_map(), which will clear and then re-initialize
>>> __cpu_possible_mask.
>> I don't think it's needed before my change either, so I'd call
>> removing this an orthogonal change. As said in the commit
>> message, the goal was to leave this function alone as far as
>> possible.
>>
>> Before my patch, xen_filter_cpu_maps() gets called long after
>> prefill_possible_map(), and by the purpose of the latter function
>> the possible map shouldn't be altered anymore once that has
>> run. Adding bits to it is surely not going to have the intended
>> effect (setup_per_cpu_areas() has already run), while removing
>> bits may have some effect, but comes too late at least for
>> setup_per_cpu_areas().
>
> OK. Then it looks like even though your patch changes behavior slightly
> (so technically I guess it's not purely a cleanup) this shouldn't makes
> any difference at least as far as possible cpu mask is concerned: if we
> don't have percpu areas set up we can't do much for that vcpu since it
> seems to me xen_vcpu_setup(), for example, won't do well.
>
>> And if we were to remove this, I think the CONFIG_HOTPLUG_CPU
>> section should go away as well. After all prefill_possible_map()
>> also sets nr_cpu_ids. To be honest, it was largely this code
>> fragment which made me want not touch the function more than
>> necessary: The comment there makes not clear to me at all why
>> all of this needs to be in an #ifdef in the first place.
>
> This was introduced by cf405ae612b0 ("xen/smp: Fix crash when booting
> with ACPI hotplug CPUs.").
>
> I am not sure this is still relevant. The ACPI hotplug code had changed,
> not significantly but sufficiently enough to alter behavior.
> acpi_processor_hotadd_init() now fails before it gets a chance to call
> arch_register_cpu() for vcpu>dom0_max_vcpus.
>
>> Let me know whether you really want me to fold this extra
>> cleanup into this patch. If so, I'd then wonder whether the
>> set_cpu_present() from xen_pv_smp_prepare_cpus() shouldn't
>> be moved here, too. And the fiddling with the possible map
>> there looks bogus as well: Bring-up of CPUs past the command
>> line option should be avoided at boot time, but they shouldn't
>> be excluded from getting brought up later on. Note how
>> native_smp_prepare_cpus() ignores its parameter altogether.
>
> Yes, that does look strange.
>
> Given especially xen_pv_smp_prepare_cpus(), I think re-working proper
> setting of present/possible masks is well beyond the scope of your
> original patch.
Well, then the question is, what (if any) changes are you
expecting me to make for this change to be acceptable? Or do
you perhaps want me to add a 2nd patch on top addressing
the other outlined anomalies?
Jan
On 3/29/19 4:54 AM, Jan Beulich wrote:
>>>> On 28.03.19 at 17:50, <[email protected]> wrote:
>>
>> Given especially xen_pv_smp_prepare_cpus(), I think re-working proper
>> setting of present/possible masks is well beyond the scope of your
>> original patch.
> Well, then the question is, what (if any) changes are you
> expecting me to make for this change to be acceptable? Or do
> you perhaps want me to add a 2nd patch on top addressing
> the other outlined anomalies?
If your goal is just to fix the dom0_max_vcpus issue then this patch is
sufficient (but the commit message should say that this is what the
patch is for).
But if you are trying to make cpu masks management done properly then I
think this patch alone does not address this fully.
-boris
>>> On 29.03.19 at 14:42, <[email protected]> wrote:
> On 3/29/19 4:54 AM, Jan Beulich wrote:
>>>>> On 28.03.19 at 17:50, <[email protected]> wrote:
>>>
>>> Given especially xen_pv_smp_prepare_cpus(), I think re-working proper
>>> setting of present/possible masks is well beyond the scope of your
>>> original patch.
>> Well, then the question is, what (if any) changes are you
>> expecting me to make for this change to be acceptable? Or do
>> you perhaps want me to add a 2nd patch on top addressing
>> the other outlined anomalies?
>
> If your goal is just to fix the dom0_max_vcpus issue then this patch is
> sufficient (but the commit message should say that this is what the
> patch is for).
>
> But if you are trying to make cpu masks management done properly then I
> think this patch alone does not address this fully.
I.e. folding the further items discussed into this patch would be
fine by you? (I'm just trying to avoid having to go through
several more rounds of patch submissions.)
Jan
On 3/29/19 11:12 AM, Jan Beulich wrote:
>>>> On 29.03.19 at 14:42, <[email protected]> wrote:
>> On 3/29/19 4:54 AM, Jan Beulich wrote:
>>>>>> On 28.03.19 at 17:50, <[email protected]> wrote:
>>>> Given especially xen_pv_smp_prepare_cpus(), I think re-working proper
>>>> setting of present/possible masks is well beyond the scope of your
>>>> original patch.
>>> Well, then the question is, what (if any) changes are you
>>> expecting me to make for this change to be acceptable? Or do
>>> you perhaps want me to add a 2nd patch on top addressing
>>> the other outlined anomalies?
>> If your goal is just to fix the dom0_max_vcpus issue then this patch is
>> sufficient (but the commit message should say that this is what the
>> patch is for).
>>
>> But if you are trying to make cpu masks management done properly then I
>> think this patch alone does not address this fully.
> I.e. folding the further items discussed into this patch would be
> fine by you? (I'm just trying to avoid having to go through
> several more rounds of patch submissions.)
If you decide to pursue this (mask management) then it's up to you ---
if you feel you can/should do it all in a single patch then I don't see
why not.
-boris