2016-11-25 12:55:39

by Vitaly Kuznetsov

Subject: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements

Some time ago we forbade CPU offlining for Hyper-V, and this was sufficient
if you boot with all CPUs online. It turns out people may want to limit the
number of online CPUs by passing the 'maxcpus=' kernel parameter, and we hit
a crash in Hyper-V code in this case. After some thinking, I believe we can
not only fix the crash but also make the offlining prevention fine-grained:
we only need to prevent offlining of CPUs which have VMBus channels attached
to them. Offlined CPUs may always be onlined again.

PATCH1 fixes a bug which is not directly related to the series; I hit it
while testing hv_vmbus module unload with this series.

Vitaly Kuznetsov (7):
hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
hv: allocate synic pages for all present CPUs
hv: init percpu_list in hv_synic_alloc()
hv: change clockevents unbind tactics
hv: check all present cpus in vmbus_wait_for_unload()
hv: switch to cpuhp state machine for synic init/cleanup
hv: make CPU offlining prevention fine-grained

drivers/hv/channel_mgmt.c | 6 +++--
drivers/hv/hv.c | 60 ++++++++++++++++++++++++++++++++++++-----------
drivers/hv/hyperv_vmbus.h | 4 ++--
drivers/hv/vmbus_drv.c | 28 ++++++++++++----------
4 files changed, 67 insertions(+), 31 deletions(-)

--
2.7.4


2016-11-25 12:56:12

by Vitaly Kuznetsov

Subject: [PATCH 4/7] hv: change clockevents unbind tactics

To prepare for CPU offlining support we need to change the way we unbind
clockevent devices. As one CPU may go online/offline multiple times, we need
to bind the device in hv_synic_init() and unbind it in hv_synic_cleanup().
There is an additional corner case: when we unload the module completely we
need to switch to some other clockevent mechanism before stopping VMBus, or
we will hang. We can't call hv_synic_cleanup() before unloading VMBus, as we
would then be unable to send the UNLOAD request and receive a response, so
hv_synic_clockevents_cleanup() has to stay. Luckily, we can always call
clockevents_unbind_device(): it is safe even if the device was never bound,
and there is no issue if we call it twice.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/hv.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index a2567a4..c11393c 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -575,7 +575,7 @@ void hv_synic_clockevents_cleanup(void)
if (!(ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE))
return;

- for_each_online_cpu(cpu)
+ for_each_present_cpu(cpu)
clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
}

@@ -594,8 +594,10 @@ void hv_synic_cleanup(void *arg)
return;

/* Turn off clockevent device */
- if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE)
+ if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) {
+ clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
hv_ce_shutdown(hv_context.clk_evt[cpu]);
+ }

rdmsrl(HV_X64_MSR_SINT0 + VMBUS_MESSAGE_SINT, shared_sint.as_uint64);

--
2.7.4

2016-11-25 12:57:14

by Vitaly Kuznetsov

Subject: [PATCH 5/7] hv: check all present cpus in vmbus_wait_for_unload()

It should never happen, but let's be prepared to receive the confirmation
for a VMBus unload on an offlined CPU. Now that we allocate all structures
for all present CPUs, this is safe.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 1bc1d479..5011c95 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -674,7 +674,7 @@ static void vmbus_wait_for_unload(void)
if (completion_done(&vmbus_connection.unload_event))
break;

- for_each_online_cpu(cpu) {
+ for_each_present_cpu(cpu) {
page_addr = hv_context.synic_message_page[cpu];
msg = (struct hv_message *)page_addr +
VMBUS_MESSAGE_SINT;
@@ -700,7 +700,7 @@ static void vmbus_wait_for_unload(void)
* maybe-pending messages on all CPUs to be able to receive new
* messages after we reconnect.
*/
- for_each_online_cpu(cpu) {
+ for_each_present_cpu(cpu) {
page_addr = hv_context.synic_message_page[cpu];
msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
msg->header.message_type = HVMSG_NONE;
--
2.7.4

2016-11-25 12:57:25

by Vitaly Kuznetsov

Subject: [PATCH 7/7] hv: make CPU offlining prevention fine-grained

Since commit e513229b4c38 ("Drivers: hv: vmbus: prevent cpu offlining on
newer hypervisors") CPU offlining has been disabled. It is still true that we
can't offline CPUs which have VMBus channels bound to them, but we may have
'free' CPUs (e.g. we booted with the maxcpus= parameter and onlined CPUs
after VMBus was initialized); these CPUs may be offlined without issues.

In the future, we may even allow offlining CPUs which have only sub-channels
assigned to them by closing these sub-channels first. All devices would
continue to work.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/hv.c | 31 +++++++++++++++++++++++++++++++
drivers/hv/vmbus_drv.c | 9 ++++-----
2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index b3f6a1b..4ece040 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -586,10 +586,41 @@ int hv_synic_cleanup(unsigned int cpu)
union hv_synic_simp simp;
union hv_synic_siefp siefp;
union hv_synic_scontrol sctrl;
+ struct vmbus_channel *channel, *sc;
+ bool channel_found = false;
+ unsigned long flags;

if (!hv_context.synic_initialized)
return -EFAULT;

+ /*
+ * Search for channels which are bound to the CPU we're about to
+ * cleanup. In case we find one and vmbus is still connected we need to
+ * fail, this will effectively prevent CPU offlining. There is no way
+ * we can re-bind channels to different CPUs for now.
+ */
+ mutex_lock(&vmbus_connection.channel_mutex);
+ list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
+ if (channel->target_cpu == cpu) {
+ channel_found = true;
+ break;
+ }
+ spin_lock_irqsave(&channel->lock, flags);
+ list_for_each_entry(sc, &channel->sc_list, sc_list) {
+ if (sc->target_cpu == cpu) {
+ channel_found = true;
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&channel->lock, flags);
+ if (channel_found)
+ break;
+ }
+ mutex_unlock(&vmbus_connection.channel_mutex);
+
+ if (channel_found && vmbus_connection.conn_state == CONNECTED)
+ return -EBUSY;
+
/* Turn off clockevent device */
if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) {
clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index a115c90..61d879f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -855,9 +855,6 @@ static int vmbus_bus_init(void)
if (ret)
goto err_connect;

- if (vmbus_proto_version > VERSION_WIN7)
- cpu_hotplug_disable();
-
/*
* Only register if the crash MSRs are available
*/
@@ -1328,6 +1325,9 @@ static void hv_kexec_handler(void)
{
hv_synic_clockevents_cleanup();
vmbus_initiate_unload(false);
+ vmbus_connection.conn_state = DISCONNECTED;
+ /* Make sure conn_state is set as hv_synic_cleanup checks for it */
+ mb();
cpuhp_remove_state(hyperv_cpuhp_online);
hv_cleanup(false);
};
@@ -1340,6 +1340,7 @@ static void hv_crash_handler(struct pt_regs *regs)
* doing the cleanup for current CPU only. This should be sufficient
* for kdump.
*/
+ vmbus_connection.conn_state = DISCONNECTED;
hv_synic_cleanup(smp_processor_id());
hv_cleanup(true);
};
@@ -1408,8 +1409,6 @@ static void __exit vmbus_exit(void)
cpuhp_remove_state(hyperv_cpuhp_online);
hv_synic_free();
acpi_bus_unregister_driver(&vmbus_acpi_driver);
- if (vmbus_proto_version > VERSION_WIN7)
- cpu_hotplug_enable();
}


--
2.7.4

2016-11-25 12:57:37

by Vitaly Kuznetsov

Subject: [PATCH 6/7] hv: switch to cpuhp state machine for synic init/cleanup

To make it possible to online/offline CPUs, switch to the cpuhp state
machine infrastructure for doing hv_synic_init()/hv_synic_cleanup().

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/hv.c | 15 +++++++--------
drivers/hv/hyperv_vmbus.h | 4 ++--
drivers/hv/vmbus_drv.c | 19 +++++++++++--------
3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index c11393c..b3f6a1b 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -495,7 +495,7 @@ void hv_synic_free(void)
* retrieve the initialized message and event pages. Otherwise, we create and
* initialize the message and event pages.
*/
-void hv_synic_init(void *arg)
+int hv_synic_init(unsigned int cpu)
{
u64 version;
union hv_synic_simp simp;
@@ -504,10 +504,8 @@ void hv_synic_init(void *arg)
union hv_synic_scontrol sctrl;
u64 vp_index;

- int cpu = smp_processor_id();
-
if (!hv_context.hypercall_page)
- return;
+ return -EFAULT;

/* Check the version */
rdmsrl(HV_X64_MSR_SVERSION, version);
@@ -562,7 +560,7 @@ void hv_synic_init(void *arg)
HV_TIMER_FREQUENCY,
HV_MIN_DELTA_TICKS,
HV_MAX_MAX_DELTA_TICKS);
- return;
+ return 0;
}

/*
@@ -582,16 +580,15 @@ void hv_synic_clockevents_cleanup(void)
/*
* hv_synic_cleanup - Cleanup routine for hv_synic_init().
*/
-void hv_synic_cleanup(void *arg)
+int hv_synic_cleanup(unsigned int cpu)
{
union hv_synic_sint shared_sint;
union hv_synic_simp simp;
union hv_synic_siefp siefp;
union hv_synic_scontrol sctrl;
- int cpu = smp_processor_id();

if (!hv_context.synic_initialized)
- return;
+ return -EFAULT;

/* Turn off clockevent device */
if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE) {
@@ -623,4 +620,6 @@ void hv_synic_cleanup(void *arg)
rdmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
sctrl.enable = 0;
wrmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
+
+ return 0;
}
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index a5b4442..d50d56c 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -505,9 +505,9 @@ extern int hv_synic_alloc(void);

extern void hv_synic_free(void);

-extern void hv_synic_init(void *irqarg);
+extern int hv_synic_init(unsigned int cpu);

-extern void hv_synic_cleanup(void *arg);
+extern int hv_synic_cleanup(unsigned int cpu);

extern void hv_synic_clockevents_cleanup(void);

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 0276d2e..a115c90 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -49,6 +49,7 @@ static struct acpi_device *hv_acpi_dev;

static struct completion probe_event;

+static int hyperv_cpuhp_online;

static void hyperv_report_panic(struct pt_regs *regs)
{
@@ -844,7 +845,12 @@ static int vmbus_bus_init(void)
* Initialize the per-cpu interrupt state and
* connect to the host.
*/
- on_each_cpu(hv_synic_init, NULL, 1);
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv:online",
+ hv_synic_init, hv_synic_cleanup);
+ if (ret < 0)
+ goto err_alloc;
+ hyperv_cpuhp_online = ret;
+
ret = vmbus_connect();
if (ret)
goto err_connect;
@@ -866,7 +872,7 @@ static int vmbus_bus_init(void)
return 0;

err_connect:
- on_each_cpu(hv_synic_cleanup, NULL, 1);
+ cpuhp_remove_state(hyperv_cpuhp_online);
err_alloc:
hv_synic_free();
hv_remove_vmbus_irq();
@@ -1320,12 +1326,9 @@ static struct acpi_driver vmbus_acpi_driver = {

static void hv_kexec_handler(void)
{
- int cpu;
-
hv_synic_clockevents_cleanup();
vmbus_initiate_unload(false);
- for_each_online_cpu(cpu)
- smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
+ cpuhp_remove_state(hyperv_cpuhp_online);
hv_cleanup(false);
};

@@ -1337,7 +1340,7 @@ static void hv_crash_handler(struct pt_regs *regs)
* doing the cleanup for current CPU only. This should be sufficient
* for kdump.
*/
- hv_synic_cleanup(NULL);
+ hv_synic_cleanup(smp_processor_id());
hv_cleanup(true);
};

@@ -1401,8 +1404,8 @@ static void __exit vmbus_exit(void)
hv_cleanup(false);
for_each_online_cpu(cpu) {
tasklet_kill(hv_context.event_dpc[cpu]);
- smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
}
+ cpuhp_remove_state(hyperv_cpuhp_online);
hv_synic_free();
acpi_bus_unregister_driver(&vmbus_acpi_driver);
if (vmbus_proto_version > VERSION_WIN7)
--
2.7.4

2016-11-25 12:59:56

by Vitaly Kuznetsov

Subject: [PATCH 3/7] hv: init percpu_list in hv_synic_alloc()

Initializing hv_context.percpu_list in hv_synic_alloc() helps to prevent a
crash in percpu_channel_enq() when not all CPUs were online during
initialization, and the initialization naturally belongs there anyway.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/hv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index e4bb498..a2567a4 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -457,6 +457,8 @@ int hv_synic_alloc(void)
pr_err("Unable to allocate post msg page\n");
goto err;
}
+
+ INIT_LIST_HEAD(&hv_context.percpu_list[cpu]);
}

return 0;
@@ -552,8 +554,6 @@ void hv_synic_init(void *arg)
rdmsrl(HV_X64_MSR_VP_INDEX, vp_index);
hv_context.vp_index[cpu] = (u32)vp_index;

- INIT_LIST_HEAD(&hv_context.percpu_list[cpu]);
-
/*
* Register the per-cpu clockevent source.
*/
--
2.7.4

2016-11-25 13:00:06

by Vitaly Kuznetsov

Subject: [PATCH 2/7] hv: allocate synic pages for all present CPUs

It may happen that not all CPUs are online when we do hv_synic_alloc(), and
if more CPUs come online later we may end up accessing structures that were
never allocated. Allocate (and free) the per-CPU structures for all present
CPUs instead.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/hv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 60dbd6c..e4bb498 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -411,7 +411,7 @@ int hv_synic_alloc(void)
goto err;
}

- for_each_online_cpu(cpu) {
+ for_each_present_cpu(cpu) {
hv_context.event_dpc[cpu] = kmalloc(size, GFP_ATOMIC);
if (hv_context.event_dpc[cpu] == NULL) {
pr_err("Unable to allocate event dpc\n");
@@ -482,7 +482,7 @@ void hv_synic_free(void)
int cpu;

kfree(hv_context.hv_numa_map);
- for_each_online_cpu(cpu)
+ for_each_present_cpu(cpu)
hv_synic_free_cpu(cpu);
}

--
2.7.4

2016-11-25 13:00:15

by Vitaly Kuznetsov

Subject: [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()

"kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
module is unloaded. BUG_ON() was introduced in commit 85d9aa705184
("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
vmbus_free_channels() codepath was apparently forgotten.

Fixes: 85d9aa705184 ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()")
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
drivers/hv/channel_mgmt.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 96a85cd..1bc1d479 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -389,6 +389,7 @@ void vmbus_free_channels(void)
{
struct vmbus_channel *channel, *tmp;

+ mutex_lock(&vmbus_connection.channel_mutex);
list_for_each_entry_safe(channel, tmp, &vmbus_connection.chn_list,
listentry) {
/* hv_process_channel_removal() needs this */
@@ -396,6 +397,7 @@ void vmbus_free_channels(void)

vmbus_device_unregister(channel->device_obj);
}
+ mutex_unlock(&vmbus_connection.channel_mutex);
}

/*
--
2.7.4

2016-11-26 17:06:05

by Stephen Hemminger

Subject: Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements

On Fri, 25 Nov 2016 13:48:36 +0100
Vitaly Kuznetsov <[email protected]> wrote:

> Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
> if you boot with all CPUs onlined. Turns out, people may want to limit the
> number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
> crash in Hyper-V code in this case. After some thinking, I think we may not
> only fix the crash but also make the offlining prevention fine-grained: we
> need to prevent from offlining CPUs which have VMBus channels attached
> only. All offlined CPUs may always be onlined.
>
> PATCH1 fixes a bug which is not directly related to the series, I hit it
> while testing hv_vmbus module unload with this series.
>
> Vitaly Kuznetsov (7):
> hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
> hv: allocate synic pages for all present CPUs
> hv: init percpu_list in hv_synic_alloc()
> hv: change clockevents unbind tactics
> hv: check all present cpus in vmbus_wait_for_unload()
> hv: switch to cpuhp state machine for synic init/cleanup
> hv: make CPU offlining prevention fine-grained
>
> drivers/hv/channel_mgmt.c | 6 +++--
> drivers/hv/hv.c | 60 ++++++++++++++++++++++++++++++++++++-----------
> drivers/hv/hyperv_vmbus.h | 4 ++--
> drivers/hv/vmbus_drv.c | 28 ++++++++++++----------
> 4 files changed, 67 insertions(+), 31 deletions(-)
>

As a temporary solution this is ok, but long term we need to support
dynamic CPU online/offline.

2016-11-28 06:35:42

by Dexuan Cui

Subject: RE: [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()

> From: Vitaly Kuznetsov [mailto:[email protected]]
> Sent: Friday, November 25, 2016 20:49
> To: [email protected]
> Cc: [email protected]; KY Srinivasan <[email protected]>; Haiyang
> Zhang <[email protected]>; Dexuan Cui <[email protected]>;
> Stephen Hemminger <[email protected]>
> Subject: [PATCH 1/7] hv: acquire vmbus_connection.channel_mutex in
> vmbus_free_channels()
>
> "kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
> module is unloaded. BUG_ON() was introduced in commit 85d9aa705184
> ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
> vmbus_free_channels() codepath was apparently forgotten.
>
> Fixes: 85d9aa705184 ("Drivers: hv: vmbus: add an API
> vmbus_hvsock_device_unregister()")
> Signed-off-by: Vitaly Kuznetsov <[email protected]>

Reviewed-by: Dexuan Cui <[email protected]>

Thanks for the fix!

-- Dexuan

2016-11-28 06:56:32

by Dexuan Cui

Subject: RE: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements

> From: Stephen Hemminger [mailto:[email protected]]
> Sent: Sunday, November 27, 2016 01:06
> To: Vitaly Kuznetsov <[email protected]>
> Cc: [email protected]; [email protected]; KY Srinivasan
> <[email protected]>; Haiyang Zhang <[email protected]>; Dexuan Cui
> <[email protected]>
> Subject: Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
>
> On Fri, 25 Nov 2016 13:48:36 +0100
> Vitaly Kuznetsov <[email protected]> wrote:
>
> > Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
> > if you boot with all CPUs onlined. Turns out, people may want to limit the
> > number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
> > crash in Hyper-V code in this case. After some thinking, I think we may not
> > only fix the crash but also make the offlining prevention fine-grained: we
> > need to prevent from offlining CPUs which have VMBus channels attached
> > only. All offlined CPUs may always be onlined.
> >
>
> As a temporary solution this is ok, but long term we need to support
> dynamic CPU online/offline.

If a CPU is bound to some channels, it seems impossible to offline it,
unless Hyper-V supplies a mechanism to dynamically rebind the channels (i.e.
without closing & opening them) to another CPU, e.g. CPU0.
Currently it looks like there is no such mechanism.

For CPU onlining, my understanding is that Vitaly's patchset actually makes
it work (see PATCH 6/7).

Thanks,
-- Dexuan

2016-11-28 09:12:15

by Vitaly Kuznetsov

Subject: Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements

Dexuan Cui <[email protected]> writes:

>> From: Stephen Hemminger [mailto:[email protected]]
>> Sent: Sunday, November 27, 2016 01:06
>> To: Vitaly Kuznetsov <[email protected]>
>> Cc: [email protected]; [email protected]; KY Srinivasan
>> <[email protected]>; Haiyang Zhang <[email protected]>; Dexuan Cui
>> <[email protected]>
>> Subject: Re: [PATCH 0/7] hv: CPU onlining/offlining fixes and improvements
>>
>> On Fri, 25 Nov 2016 13:48:36 +0100
>> Vitaly Kuznetsov <[email protected]> wrote:
>>
>> > Some time ago we forbade CPU offlining for Hyper-V and this was sufficient
>> > if you boot with all CPUs onlined. Turns out, people may want to limit the
>> > number online CPUs by passing 'maxcpus=' kernel parameter and we hit a
>> > crash in Hyper-V code in this case. After some thinking, I think we may not
>> > only fix the crash but also make the offlining prevention fine-grained: we
>> > need to prevent from offlining CPUs which have VMBus channels attached
>> > only. All offlined CPUs may always be onlined.
>> >
>>
>> As a temporary solution this is ok, but long term we need to support
>> dynamic CPU online/offline.
>
> If a CPU is bound to some channels, it seems impossible to make it offline,
> unless Hyper-V supplies a mechanism to dynamically rebind the channels (i.e.
> without closing & opening the channels) to another CPU, e.g. CPU0.
> Currently it looks there is no such mechanism.
>

Exactly. Offlining a CPU with an attached active VMBus channel means
killing the device, and unless we're offered an API to rebind it to some
other CPU there is no point in allowing that. We may, however, allow
offlining CPUs with only sub-channels attached by closing these
sub-channels; not sure if we want to.

--
Vitaly