2014-02-05 22:10:06

by Srivatsa S. Bhat

Subject: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

Hi,

Many subsystems and drivers need to register CPU hotplug callbacks from their
init routines and also perform initialization for the CPUs that are already
online. Unfortunately, there is no race-free way to achieve this today.

For example, consider this piece of code:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is not safe because there is a possibility of an ABBA deadlock involving
the cpu_add_remove_lock and the cpu_hotplug.lock.

          CPU 0                                         CPU 1
          -----                                         -----

   Acquire cpu_hotplug.lock
   [via get_online_cpus()]

                                              CPU online/offline operation
                                              takes cpu_add_remove_lock
                                              [via cpu_maps_update_begin()]

   Try to acquire
   cpu_add_remove_lock
   [via register_cpu_notifier()]

                                              CPU online/offline operation
                                              tries to acquire cpu_hotplug.lock
                                              [via cpu_hotplug_begin()]

*** DEADLOCK! ***
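
The lock-order inversion can be checked mechanically. The following is a
minimal userspace sketch, not kernel code: struct lock_path, lock_pos() and
can_abba_deadlock() are hypothetical helpers invented for illustration, and
each path is reduced to the order in which it takes the two locks named above.

```c
#include <assert.h>
#include <stddef.h>

enum lock_id { CPU_HOTPLUG_LOCK, CPU_ADD_REMOVE_LOCK };

/* One code path, described only by the order in which it takes locks. */
struct lock_path {
    const char *name;
    enum lock_id order[2];
    size_t nlocks;
};

/* Index at which a path takes the given lock, or -1 if it never does. */
static int lock_pos(const struct lock_path *p, enum lock_id id)
{
    for (size_t i = 0; i < p->nlocks; i++)
        if (p->order[i] == id)
            return (int)i;
    return -1;
}

/*
 * Two paths can ABBA-deadlock when both take both locks and they
 * disagree about which one comes first.
 */
int can_abba_deadlock(const struct lock_path *a, const struct lock_path *b)
{
    int a_hp = lock_pos(a, CPU_HOTPLUG_LOCK);
    int a_ar = lock_pos(a, CPU_ADD_REMOVE_LOCK);
    int b_hp = lock_pos(b, CPU_HOTPLUG_LOCK);
    int b_ar = lock_pos(b, CPU_ADD_REMOVE_LOCK);

    if (a_hp < 0 || a_ar < 0 || b_hp < 0 || b_ar < 0)
        return 0;
    return (a_hp < a_ar) != (b_hp < b_ar);
}

/* get_online_cpus(), then register_cpu_notifier() */
const struct lock_path old_registration = {
    "old registration", { CPU_HOTPLUG_LOCK, CPU_ADD_REMOVE_LOCK }, 2
};

/* cpu_maps_update_begin(), then cpu_hotplug_begin() */
const struct lock_path hotplug_operation = {
    "CPU online/offline", { CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK }, 2
};

/* cpu_maps_update_begin() plus a registration call that takes no lock */
const struct lock_path new_registration = {
    "new registration", { CPU_ADD_REMOVE_LOCK }, 1
};
```

In this model, can_abba_deadlock(&old_registration, &hotplug_operation) is
nonzero, while a path that only ever takes cpu_add_remove_lock cannot invert
against the hotplug path.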


Other combinations of callback registration also don't work correctly.
Examples:

register_cpu_notifier(&foobar_cpu_notifier);

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

put_online_cpus();

This can lead to double initialization if a hotplug operation occurs after
registering the notifier and before invoking get_online_cpus().

On the other hand, the following piece of code can miss hotplug events
altogether:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

put_online_cpus();
^
| Race window; Can miss hotplug events here
v
register_cpu_notifier(&foobar_cpu_notifier);
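
Both failure modes can be reproduced deterministically in a single-threaded
userspace model. This is only a sketch under simplifying assumptions: struct
model, hotplug_online() and the two scenario functions are hypothetical names,
the notifier callback is assumed to do nothing but call init_cpu(), and the
hotplug event is placed by hand in the race window.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct model {
    bool online[NR_CPUS];
    bool notifier_registered;
    int init_count[NR_CPUS];
};

static void init_cpu(struct model *m, int cpu)
{
    m->init_count[cpu]++;
}

/* A CPU comes online; a registered notifier would initialize it. */
static void hotplug_online(struct model *m, int cpu)
{
    m->online[cpu] = true;
    if (m->notifier_registered)
        init_cpu(m, cpu);
}

/* Stand-in for the for_each_online_cpu() initialization loop. */
static void init_online_cpus(struct model *m)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (m->online[cpu])
            init_cpu(m, cpu);
}

/* Register first; the hotplug event lands before the init loop. */
int register_then_init(int new_cpu)
{
    struct model m = { .online = { true } };    /* only CPU 0 online */

    m.notifier_registered = true;
    hotplug_online(&m, new_cpu);                /* race window */
    init_online_cpus(&m);
    return m.init_count[new_cpu];
}

/* Init first; the hotplug event lands before registration. */
int init_then_register(int new_cpu)
{
    struct model m = { .online = { true } };    /* only CPU 0 online */

    init_online_cpus(&m);
    hotplug_online(&m, new_cpu);                /* race window */
    m.notifier_registered = true;
    return m.init_count[new_cpu];
}
```

Under these assumptions, register_then_init(1) returns 2 (CPU 1 is
initialized twice) and init_then_register(1) returns 0 (CPU 1 is never
initialized).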


To solve these issues and provide a race-free method to register CPU hotplug
callbacks, this patchset introduces new variants of the callback registration
APIs that don't hold the cpu_add_remove_lock, and exports the
cpu_add_remove_lock via cpu_maps_update_begin/done() for use by various
subsystems. With this in place, the following code snippet will register a
hotplug callback as well as initialize already online CPUs without any race
conditions.

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* This doesn't take the cpu_add_remove_lock */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();
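
The effect of this pattern can be sketched in userspace with pthreads. This is
an illustrative model, not the kernel implementation: a pthread mutex stands in
for cpu_add_remove_lock, the notifier chain is reduced to one function pointer,
register_cpu_notifier_nolock() stands in for __register_cpu_notifier(), and
hotplug_thread(), foobar_register() and run_model() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8

static pthread_mutex_t cpu_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cpu_online[NR_CPUS] = { true };    /* CPU 0 starts online */
static int init_count[NR_CPUS];
static void (*cpu_chain)(int cpu);             /* one-entry notifier chain */

static void cpu_maps_update_begin(void)
{
    pthread_mutex_lock(&cpu_add_remove_lock);
}

static void cpu_maps_update_done(void)
{
    pthread_mutex_unlock(&cpu_add_remove_lock);
}

/* Takes no lock itself; the caller must hold cpu_add_remove_lock. */
static void register_cpu_notifier_nolock(void (*fn)(int cpu))
{
    cpu_chain = fn;
}

static void init_cpu(int cpu)
{
    init_count[cpu]++;
}

/* Models hotplug operations bringing CPUs 1..NR_CPUS-1 online. */
static void *hotplug_thread(void *unused)
{
    (void)unused;
    for (int cpu = 1; cpu < NR_CPUS; cpu++) {
        cpu_maps_update_begin();
        cpu_online[cpu] = true;
        if (cpu_chain)
            cpu_chain(cpu);
        cpu_maps_update_done();
    }
    return NULL;
}

/* Init loop and registration form one critical section. */
static void foobar_register(void)
{
    cpu_maps_update_begin();
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_online[cpu])
            init_cpu(cpu);
    register_cpu_notifier_nolock(init_cpu);
    cpu_maps_update_done();
}

/* Returns how many online CPUs were initialized exactly once. */
int run_model(void)
{
    pthread_t t;
    int ok = 0;

    pthread_create(&t, NULL, hotplug_thread, NULL);
    foobar_register();
    pthread_join(t, NULL);

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_online[cpu] && init_count[cpu] == 1)
            ok++;
    return ok;
}
```

Because the initialization loop and the registration sit in the same critical
section as every modelled hotplug operation, each CPU is initialized exactly
once regardless of how the two threads interleave, so run_model() returns
NR_CPUS.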


This patchset introduces this infrastructure in patch 1, and performs
tree-wide conversions (to use this model) in the remaining patches.

This patchset is hosted in the git tree below. It applies cleanly on
v3.14-rc1.

git://github.com/srivatsabhat/linux.git cpuhp-registration-fixes-v1


Oleg, I have incorporated your fix for raid5 in this patchset, and modified
the patch a bit to handle the unregister_cpu_notifier() case as well. If you
are fine with that patch (patch 45), can you kindly provide your Signed-off-by
for that? Thank you!


Oleg Nesterov (1):
md, raid5: Fix CPU hotplug callback registration

Srivatsa S. Bhat (50):
CPU hotplug: Provide lockless versions of callback registration functions
Doc/cpu-hotplug: Specify race-free way to register CPU hotplug callbacks
CPU hotplug, perf: Fix CPU hotplug callback registration
ia64, salinfo: Fix hotplug callback registration
ia64, palinfo: Fix CPU hotplug callback registration
ia64, topology: Fix CPU hotplug callback registration
ia64, err-inject: Fix CPU hotplug callback registration
arm, hw-breakpoint: Fix CPU hotplug callback registration
arm, kvm: Fix CPU hotplug callback registration
s390, cacheinfo: Fix CPU hotplug callback registration
s390, smp: Fix CPU hotplug callback registration
sparc, sysfs: Fix CPU hotplug callback registration
powerpc, sysfs: Fix CPU hotplug callback registration
x86, msr: Fix CPU hotplug callback registration
x86, cpuid: Fix CPU hotplug callback registration
x86, vsyscall: Fix CPU hotplug callback registration
x86, intel, uncore: Fix CPU hotplug callback registration
x86, mce: Fix CPU hotplug callback registration
x86, therm_throt.c: Fix CPU hotplug callback registration
x86, amd, ibs: Fix CPU hotplug callback registration
x86, intel, cacheinfo: Fix CPU hotplug callback registration
x86, intel, rapl: Fix CPU hotplug callback registration
x86, amd, uncore: Fix CPU hotplug callback registration
x86, hpet: Fix CPU hotplug callback registration
x86, pci, amd-bus: Fix CPU hotplug callback registration
x86, oprofile, nmi: Fix CPU hotplug callback registration
x86, kvm: Fix CPU hotplug callback registration
arm64, hw_breakpoint.c: Fix CPU hotplug callback registration
arm64, debug-monitors: Fix CPU hotplug callback registration
powercap, intel-rapl: Fix CPU hotplug callback registration
scsi, bnx2i: Fix CPU hotplug callback registration
scsi, bnx2fc: Fix CPU hotplug callback registration
scsi, fcoe: Fix CPU hotplug callback registration
zsmalloc: Fix CPU hotplug callback registration
acpi-cpufreq: Fix CPU hotplug callback registration
drivers/base/topology.c: Fix CPU hotplug callback registration
clocksource, dummy-timer: Fix CPU hotplug callback registration
intel-idle: Fix CPU hotplug callback registration
oprofile, nmi-timer: Fix CPU hotplug callback registration
octeon, watchdog: Fix CPU hotplug callback registration
thermal, x86-pkg-temp: Fix CPU hotplug callback registration
hwmon, coretemp: Fix CPU hotplug callback registration
hwmon, via-cputemp: Fix CPU hotplug callback registration
xen, balloon: Fix CPU hotplug callback registration
trace, ring-buffer: Fix CPU hotplug callback registration
profile: Fix CPU hotplug callback registration
mm, vmstat: Fix CPU hotplug callback registration
mm, zswap: Fix CPU hotplug callback registration
net/core/flow.c: Fix CPU hotplug callback registration
net/iucv/iucv.c: Fix CPU hotplug callback registration

Documentation/cpu-hotplug.txt | 45 +++++++++
arch/arm/kernel/hw_breakpoint.c | 8 +-
arch/arm/kvm/arm.c | 7 +
arch/arm64/kernel/debug-monitors.c | 6 +
arch/arm64/kernel/hw_breakpoint.c | 7 +
arch/ia64/kernel/err_inject.c | 15 +++
arch/ia64/kernel/palinfo.c | 6 +
arch/ia64/kernel/salinfo.c | 6 +
arch/ia64/kernel/topology.c | 6 +
arch/powerpc/kernel/sysfs.c | 8 +-
arch/s390/kernel/cache.c | 5 +
arch/s390/kernel/smp.c | 13 ++-
arch/sparc/kernel/sysfs.c | 6 +
arch/x86/kernel/cpu/intel_cacheinfo.c | 13 ++-
arch/x86/kernel/cpu/mcheck/mce.c | 8 +-
arch/x86/kernel/cpu/mcheck/therm_throt.c | 5 +
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 6 +
arch/x86/kernel/cpu/perf_event_amd_uncore.c | 7 +
arch/x86/kernel/cpu/perf_event_intel_rapl.c | 9 +-
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 6 +
arch/x86/kernel/cpuid.c | 15 ++-
arch/x86/kernel/hpet.c | 4 +
arch/x86/kernel/msr.c | 16 ++-
arch/x86/kernel/vsyscall_64.c | 6 +
arch/x86/kvm/x86.c | 7 +
arch/x86/oprofile/nmi_int.c | 15 +++
arch/x86/pci/amd_bus.c | 5 +
drivers/base/topology.c | 12 ++
drivers/clocksource/dummy_timer.c | 11 ++
drivers/cpufreq/acpi-cpufreq.c | 7 +
drivers/hwmon/coretemp.c | 14 +--
drivers/hwmon/via-cputemp.c | 14 +--
drivers/idle/intel_idle.c | 12 ++
drivers/md/raid5.c | 90 +++++++++----------
drivers/oprofile/nmi_timer_int.c | 23 +++--
drivers/powercap/intel_rapl.c | 10 ++
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 12 ++
drivers/scsi/bnx2i/bnx2i_init.c | 12 ++
drivers/scsi/fcoe/fcoe.c | 15 +++
drivers/thermal/x86_pkg_temp_thermal.c | 14 +--
drivers/watchdog/octeon-wdt-main.c | 11 ++
drivers/xen/balloon.c | 35 +++++--
include/linux/cpu.h | 36 +++++++
include/linux/perf_event.h | 16 +++
kernel/cpu.c | 20 ++++
kernel/profile.c | 20 +++-
kernel/trace/ring_buffer.c | 19 ++--
mm/vmstat.c | 6 +
mm/zsmalloc.c | 17 +++-
mm/zswap.c | 8 +-
net/core/flow.c | 8 +-
net/iucv/iucv.c | 121 ++++++++++++-------------
52 files changed, 564 insertions(+), 259 deletions(-)


Regards,
Srivatsa S. Bhat
IBM Linux Technology Center


2014-02-05 22:10:16

by Srivatsa S. Bhat

Subject: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

The following method of CPU hotplug callback registration is not safe
due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
and the cpu_hotplug.lock.

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

The deadlock is shown below:

          CPU 0                                         CPU 1
          -----                                         -----

   Acquire cpu_hotplug.lock
   [via get_online_cpus()]

                                              CPU online/offline operation
                                              takes cpu_add_remove_lock
                                              [via cpu_maps_update_begin()]

   Try to acquire
   cpu_add_remove_lock
   [via register_cpu_notifier()]

                                              CPU online/offline operation
                                              tries to acquire cpu_hotplug.lock
                                              [via cpu_hotplug_begin()]


*** DEADLOCK! ***

The problem here is that callback registration takes the locks in one order
whereas the CPU hotplug operations take the same locks in the opposite order.
To avoid this issue and to provide a race-free method to register CPU hotplug
callbacks (along with initialization of already online CPUs), introduce new
variants of the callback registration APIs that simply register the callbacks
without holding the cpu_add_remove_lock during the registration. That way,
we can avoid the ABBA scenario. However, we will need to hold the
cpu_add_remove_lock throughout the entire critical section, to protect updates
to the callback/notifier chain.

This can be achieved by writing the callback registration code as follows:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* This doesn't take the cpu_add_remove_lock */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();

Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
notifiers, and hence get_online_cpus() cannot provide the necessary
synchronization to protect the callback/notifier chains against concurrent
reads and writes. On the other hand, since the cpu_add_remove_lock protects
the entire hotplug operation (including CPU_POST_DEAD), we can use
cpu_maps_update_begin/done() to guarantee proper synchronization.
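
The coverage argument can be written down as a small table. This is a
simplified sketch for a CPU offline operation: the enum, the struct and the
helper functions are hypothetical names, and the entries for the phases other
than CPU_POST_DEAD are an assumption about how the offline path holds its
locks, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Phases of an offline operation in which notifiers are invoked. */
enum hotplug_phase {
    PHASE_DOWN_PREPARE,
    PHASE_DEAD,
    PHASE_POST_DEAD,
    NR_PHASES
};

struct locks_held {
    bool cpu_add_remove_lock;
    bool cpu_hotplug_lock;
};

/* Which lock the hotplug path holds while walking the notifier chain. */
static const struct locks_held held_during[NR_PHASES] = {
    [PHASE_DOWN_PREPARE] = { true, true },
    [PHASE_DEAD]         = { true, true },
    [PHASE_POST_DEAD]    = { true, false },   /* cpu_hotplug.lock dropped */
};

/* True if cpu_add_remove_lock is held across every chain traversal. */
bool add_remove_lock_covers_all_phases(void)
{
    for (int p = 0; p < NR_PHASES; p++)
        if (!held_during[p].cpu_add_remove_lock)
            return false;
    return true;
}

/* True if cpu_hotplug.lock is held across every chain traversal. */
bool hotplug_lock_covers_all_phases(void)
{
    for (int p = 0; p < NR_PHASES; p++)
        if (!held_during[p].cpu_hotplug_lock)
            return false;
    return true;
}
```

Only a lock that is held in every row can keep a registration from running
concurrently with a chain traversal, which in this table is
cpu_add_remove_lock alone.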

Also, since cpu_maps_update_begin/done() is effectively a superset of
get/put_online_cpus(), the former naturally protects the critical sections
from concurrent hotplug operations.

So, introduce the lockless variants of un/register_cpu_notifier() and also
export the cpu_maps_update_begin/done() APIs for use by modules. This way,
we provide a race-free way to register hotplug callbacks as well as perform
initialization for the CPUs that are already online.

Cc: Thomas Gleixner <[email protected]>
Cc: Toshi Kani <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

include/linux/cpu.h | 36 ++++++++++++++++++++++++++++++++++++
kernel/cpu.c | 20 ++++++++++++++++++--
2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 03e235ad..eb97e37 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -122,26 +122,46 @@ enum {
{ .notifier_call = fn, .priority = pri }; \
register_cpu_notifier(&fn##_nb); \
}
+
+#define __cpu_notifier(fn, pri) { \
+ static struct notifier_block fn##_nb = \
+ { .notifier_call = fn, .priority = pri }; \
+ __register_cpu_notifier(&fn##_nb); \
+}
#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#define __cpu_notifier(fn, pri) do { (void)(fn); } while (0)
#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
+
#ifdef CONFIG_HOTPLUG_CPU
extern int register_cpu_notifier(struct notifier_block *nb);
+extern int __register_cpu_notifier(struct notifier_block *nb);
extern void unregister_cpu_notifier(struct notifier_block *nb);
+extern void __unregister_cpu_notifier(struct notifier_block *nb);
#else

#ifndef MODULE
extern int register_cpu_notifier(struct notifier_block *nb);
+extern int __register_cpu_notifier(struct notifier_block *nb);
#else
static inline int register_cpu_notifier(struct notifier_block *nb)
{
return 0;
}
+
+static inline int __register_cpu_notifier(struct notifier_block *nb)
+{
+ return 0;
+}
#endif

static inline void unregister_cpu_notifier(struct notifier_block *nb)
{
}
+
+static inline void __unregister_cpu_notifier(struct notifier_block *nb)
+{
+}
#endif

int cpu_up(unsigned int cpu);
@@ -152,16 +172,26 @@ extern void cpu_maps_update_done(void);
#else /* CONFIG_SMP */

#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#define __cpu_notifier(fn, pri) do { (void)(fn); } while (0)

static inline int register_cpu_notifier(struct notifier_block *nb)
{
return 0;
}

+static inline int __register_cpu_notifier(struct notifier_block *nb)
+{
+ return 0;
+}
+
static inline void unregister_cpu_notifier(struct notifier_block *nb)
{
}

+static inline void __unregister_cpu_notifier(struct notifier_block *nb)
+{
+}
+
static inline void cpu_maps_update_begin(void)
{
}
@@ -183,8 +213,11 @@ extern void put_online_cpus(void);
extern void cpu_hotplug_disable(void);
extern void cpu_hotplug_enable(void);
#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
+#define __hotcpu_notifier(fn, pri) __cpu_notifier(fn, pri)
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
+#define __register_hotcpu_notifier(nb) __register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
+#define __unregister_hotcpu_notifier(nb) __unregister_cpu_notifier(nb)
void clear_tasks_mm_cpumask(int cpu);
int cpu_down(unsigned int cpu);

@@ -197,9 +230,12 @@ static inline void cpu_hotplug_done(void) {}
#define cpu_hotplug_disable() do { } while (0)
#define cpu_hotplug_enable() do { } while (0)
#define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#define __hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
+#define __register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
#define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
+#define __unregister_hotcpu_notifier(nb) ({ (void)(nb); })
#endif /* CONFIG_HOTPLUG_CPU */

#ifdef CONFIG_PM_SLEEP_SMP
diff --git a/kernel/cpu.c b/kernel/cpu.c
index deff2e6..12a3a74 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -27,18 +27,22 @@
static DEFINE_MUTEX(cpu_add_remove_lock);

/*
- * The following two API's must be used when attempting
- * to serialize the updates to cpu_online_mask, cpu_present_mask.
+ * The following two API's must be used when attempting to serialize
+ * the updates to cpu_online_mask, cpu_present_mask. Also, they must
+ * be used to protect CPU hotplug callback (un)registration performed
+ * using __register_cpu_notifier() or __unregister_cpu_notifier().
*/
void cpu_maps_update_begin(void)
{
mutex_lock(&cpu_add_remove_lock);
}
+EXPORT_SYMBOL(cpu_maps_update_begin);

void cpu_maps_update_done(void)
{
mutex_unlock(&cpu_add_remove_lock);
}
+EXPORT_SYMBOL(cpu_maps_update_done);

static RAW_NOTIFIER_HEAD(cpu_chain);

@@ -166,6 +170,11 @@ int __ref register_cpu_notifier(struct notifier_block *nb)
return ret;
}

+int __ref __register_cpu_notifier(struct notifier_block *nb)
+{
+ return raw_notifier_chain_register(&cpu_chain, nb);
+}
+
static int __cpu_notify(unsigned long val, void *v, int nr_to_call,
int *nr_calls)
{
@@ -189,6 +198,7 @@ static void cpu_notify_nofail(unsigned long val, void *v)
BUG_ON(cpu_notify(val, v));
}
EXPORT_SYMBOL(register_cpu_notifier);
+EXPORT_SYMBOL(__register_cpu_notifier);

void __ref unregister_cpu_notifier(struct notifier_block *nb)
{
@@ -198,6 +208,12 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL(unregister_cpu_notifier);

+void __ref __unregister_cpu_notifier(struct notifier_block *nb)
+{
+ raw_notifier_chain_unregister(&cpu_chain, nb);
+}
+EXPORT_SYMBOL(__unregister_cpu_notifier);
+
/**
* clear_tasks_mm_cpumask - Safely clear tasks' mm_cpumask for a CPU
* @cpu: a CPU id

2014-02-05 22:10:42

by Srivatsa S. Bhat

Subject: [PATCH 03/51] CPU hotplug, perf: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the perf subsystem's hotplug notifier by using this latter form of
callback registration.

Also provide a bare-bones version of perf_cpu_notifier() that doesn't
invoke the notifiers for the already online CPUs. This would be useful
for subsystems that need to perform a different set of initialization
for the already online CPUs, or don't need the initialization altogether.

Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

include/linux/perf_event.h | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..f4057da 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -835,6 +835,8 @@ do { \
{ .notifier_call = fn, .priority = CPU_PRI_PERF }; \
unsigned long cpu = smp_processor_id(); \
unsigned long flags; \
+ \
+ cpu_maps_update_begin(); \
fn(&fn##_nb, (unsigned long)CPU_UP_PREPARE, \
(void *)(unsigned long)cpu); \
local_irq_save(flags); \
@@ -843,9 +845,21 @@ do { \
local_irq_restore(flags); \
fn(&fn##_nb, (unsigned long)CPU_ONLINE, \
(void *)(unsigned long)cpu); \
- register_cpu_notifier(&fn##_nb); \
+ __register_cpu_notifier(&fn##_nb); \
+ cpu_maps_update_done(); \
} while (0)

+/*
+ * Bare-bones version of perf_cpu_notifier(), which doesn't invoke the
+ * callback for already online CPUs.
+ */
+#define __perf_cpu_notifier(fn) \
+do { \
+ static struct notifier_block fn##_nb = \
+ { .notifier_call = fn, .priority = CPU_PRI_PERF }; \
+ \
+ __register_cpu_notifier(&fn##_nb); \
+} while (0)

struct perf_pmu_events_attr {
struct device_attribute attr;

2014-02-05 22:10:51

by Srivatsa S. Bhat

Subject: [PATCH 04/51] ia64, salinfo: Fix hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the salinfo code in ia64 by using this latter form of callback
registration.

Cc: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/ia64/kernel/salinfo.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c
index 960a396..e630bce 100644
--- a/arch/ia64/kernel/salinfo.c
+++ b/arch/ia64/kernel/salinfo.c
@@ -635,6 +635,8 @@ salinfo_init(void)
(void *)salinfo_entries[i].feature);
}

+ cpu_maps_update_begin();
+
for (i = 0; i < ARRAY_SIZE(salinfo_log_name); i++) {
data = salinfo_data + i;
data->type = i;
@@ -669,7 +671,9 @@ salinfo_init(void)
salinfo_timer.function = &salinfo_timeout;
add_timer(&salinfo_timer);

- register_hotcpu_notifier(&salinfo_cpu_notifier);
+ __register_hotcpu_notifier(&salinfo_cpu_notifier);
+
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:10:58

by Srivatsa S. Bhat

Subject: [PATCH 05/51] ia64, palinfo: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the palinfo code in ia64 by using this latter form of callback
registration.

Cc: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/ia64/kernel/palinfo.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/palinfo.c b/arch/ia64/kernel/palinfo.c
index ab33328..cfaa06f 100644
--- a/arch/ia64/kernel/palinfo.c
+++ b/arch/ia64/kernel/palinfo.c
@@ -996,13 +996,17 @@ palinfo_init(void)
if (!palinfo_dir)
return -ENOMEM;

+ cpu_maps_update_begin();
+
/* Create palinfo dirs in /proc for all online cpus */
for_each_online_cpu(i) {
create_palinfo_proc_entries(i);
}

/* Register for future delivery via notify registration */
- register_hotcpu_notifier(&palinfo_cpu_notifier);
+ __register_hotcpu_notifier(&palinfo_cpu_notifier);
+
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:11:10

by Srivatsa S. Bhat

Subject: [PATCH 06/51] ia64, topology: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the topology code in ia64 by using this latter form of callback
registration.

Cc: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/ia64/kernel/topology.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
index ca69a5a..a6aa6ed 100644
--- a/arch/ia64/kernel/topology.c
+++ b/arch/ia64/kernel/topology.c
@@ -454,12 +454,16 @@ static int __init cache_sysfs_init(void)
{
int i;

+ cpu_maps_update_begin();
+
for_each_online_cpu(i) {
struct device *sys_dev = get_cpu_device((unsigned int)i);
cache_add_dev(sys_dev);
}

- register_hotcpu_notifier(&cache_cpu_notifier);
+ __register_hotcpu_notifier(&cache_cpu_notifier);
+
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:11:22

by Srivatsa S. Bhat

Subject: [PATCH 07/51] ia64, err-inject: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the error injection code in ia64 by using this latter form of callback
registration.

Cc: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/ia64/kernel/err_inject.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/kernel/err_inject.c b/arch/ia64/kernel/err_inject.c
index f59c0b8..b7b1403 100644
--- a/arch/ia64/kernel/err_inject.c
+++ b/arch/ia64/kernel/err_inject.c
@@ -269,12 +269,17 @@ err_inject_init(void)
#ifdef ERR_INJ_DEBUG
printk(KERN_INFO "Enter error injection driver.\n");
#endif
+
+ cpu_maps_update_begin();
+
for_each_online_cpu(i) {
err_inject_cpu_callback(&err_inject_cpu_notifier, CPU_ONLINE,
(void *)(long)i);
}

- register_hotcpu_notifier(&err_inject_cpu_notifier);
+ __register_hotcpu_notifier(&err_inject_cpu_notifier);
+
+ cpu_maps_update_done();

return 0;
}
@@ -288,11 +293,17 @@ err_inject_exit(void)
#ifdef ERR_INJ_DEBUG
printk(KERN_INFO "Exit error injection driver.\n");
#endif
+
+ cpu_maps_update_begin();
+
for_each_online_cpu(i) {
sys_dev = get_cpu_device(i);
sysfs_remove_group(&sys_dev->kobj, &err_inject_attr_group);
}
- unregister_hotcpu_notifier(&err_inject_cpu_notifier);
+
+ __unregister_hotcpu_notifier(&err_inject_cpu_notifier);
+
+ cpu_maps_update_done();
}

module_init(err_inject_init);

2014-02-05 22:11:32

by Srivatsa S. Bhat

Subject: [PATCH 08/51] arm, hw-breakpoint: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the hw-breakpoint code in arm by using this latter form of callback
registration.

Cc: Will Deacon <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/arm/kernel/hw_breakpoint.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 3d44660..eaa7fcf 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -1072,6 +1072,8 @@ static int __init arch_hw_breakpoint_init(void)
core_num_brps = get_num_brps();
core_num_wrps = get_num_wrps();

+ cpu_maps_update_begin();
+
/*
* We need to tread carefully here because DBGSWENABLE may be
* driven low on this core and there isn't an architected way to
@@ -1088,6 +1090,7 @@ static int __init arch_hw_breakpoint_init(void)
if (!cpumask_empty(&debug_err_mask)) {
core_num_brps = 0;
core_num_wrps = 0;
+ cpu_maps_update_done();
return 0;
}

@@ -1107,7 +1110,10 @@ static int __init arch_hw_breakpoint_init(void)
TRAP_HWBKPT, "breakpoint debug exception");

/* Register hotplug and PM notifiers. */
- register_cpu_notifier(&dbg_reset_nb);
+ __register_cpu_notifier(&dbg_reset_nb);
+
+ cpu_maps_update_done();
+
pm_init();
return 0;
}

2014-02-05 22:11:42

by Srivatsa S. Bhat

Subject: [PATCH 09/51] arm, kvm: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the kvm code in arm by using this latter form of callback registration.

Cc: Christoffer Dall <[email protected]>
Cc: Gleb Natapov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/arm/kvm/arm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 1d8248e..e2ef4c4 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1050,21 +1050,26 @@ int kvm_arch_init(void *opaque)
}
}

+ cpu_maps_update_begin();
+
err = init_hyp_mode();
if (err)
goto out_err;

- err = register_cpu_notifier(&hyp_init_cpu_nb);
+ err = __register_cpu_notifier(&hyp_init_cpu_nb);
if (err) {
kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
goto out_err;
}

+ cpu_maps_update_done();
+
hyp_cpu_pm_init();

kvm_coproc_table_init();
return 0;
out_err:
+ cpu_maps_update_done();
return err;
}

2014-02-05 22:11:49

by Srivatsa S. Bhat

Subject: [PATCH 10/51] s390, cacheinfo: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
        init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
        init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the cacheinfo code in s390 by using this latter form of callback
registration.

Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/s390/kernel/cache.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/s390/kernel/cache.c b/arch/s390/kernel/cache.c
index 3a414c0..075af62 100644
--- a/arch/s390/kernel/cache.c
+++ b/arch/s390/kernel/cache.c
@@ -378,9 +378,12 @@ static int __init cache_init(void)
if (!test_facility(34))
return 0;
cache_build_info();
+
+ cpu_maps_update_begin();
for_each_online_cpu(cpu)
cache_add_cpu(cpu);
- hotcpu_notifier(cache_hotplug, 0);
+ __hotcpu_notifier(cache_hotplug, 0);
+ cpu_maps_update_done();
return 0;
}
device_initcall(cache_init);

2014-02-05 22:11:58

by Srivatsa S. Bhat

Subject: [PATCH 11/51] s390, smp: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the smp code in s390 by using this latter form of callback registration.

Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/s390/kernel/smp.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index a7125b6..415a44e 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -1057,19 +1057,24 @@ static DEVICE_ATTR(rescan, 0200, NULL, rescan_store);

static int __init s390_smp_init(void)
{
- int cpu, rc;
+ int cpu, rc = 0;

- hotcpu_notifier(smp_cpu_notify, 0);
#ifdef CONFIG_HOTPLUG_CPU
rc = device_create_file(cpu_subsys.dev_root, &dev_attr_rescan);
if (rc)
return rc;
#endif
+ cpu_maps_update_begin();
for_each_present_cpu(cpu) {
rc = smp_add_present_cpu(cpu);
if (rc)
- return rc;
+ goto out;
}
- return 0;
+
+ __hotcpu_notifier(smp_cpu_notify, 0);
+
+out:
+ cpu_maps_update_done();
+ return rc;
}
subsys_initcall(s390_smp_init);

2014-02-05 22:12:08

by Srivatsa S. Bhat

Subject: [PATCH 12/51] sparc, sysfs: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the sysfs code in sparc by using this latter form of callback
registration.

Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/sparc/kernel/sysfs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/sysfs.c b/arch/sparc/kernel/sysfs.c
index c21c673..2177b9e 100644
--- a/arch/sparc/kernel/sysfs.c
+++ b/arch/sparc/kernel/sysfs.c
@@ -300,7 +300,7 @@ static int __init topology_init(void)

check_mmu_stats();

- register_cpu_notifier(&sysfs_cpu_nb);
+ cpu_maps_update_begin();

for_each_possible_cpu(cpu) {
struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -310,6 +310,10 @@ static int __init topology_init(void)
register_cpu_online(cpu);
}

+ __register_cpu_notifier(&sysfs_cpu_nb);
+
+ cpu_maps_update_done();
+
return 0;
}

2014-02-05 22:12:33

by Srivatsa S. Bhat

Subject: [PATCH 14/51] x86, msr: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the msr code in x86 by using this latter form of callback registration.

Cc: "H. Peter Anvin" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/msr.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/msr.c b/arch/x86/kernel/msr.c
index 05266b5..f9e65ff 100644
--- a/arch/x86/kernel/msr.c
+++ b/arch/x86/kernel/msr.c
@@ -259,14 +259,15 @@ static int __init msr_init(void)
goto out_chrdev;
}
msr_class->devnode = msr_devnode;
- get_online_cpus();
+
+ cpu_maps_update_begin();
for_each_online_cpu(i) {
err = msr_device_create(i);
if (err != 0)
goto out_class;
}
- register_hotcpu_notifier(&msr_class_cpu_notifier);
- put_online_cpus();
+ __register_hotcpu_notifier(&msr_class_cpu_notifier);
+ cpu_maps_update_done();

err = 0;
goto out;
@@ -275,7 +276,7 @@ out_class:
i = 0;
for_each_online_cpu(i)
msr_device_destroy(i);
- put_online_cpus();
+ cpu_maps_update_done();
class_destroy(msr_class);
out_chrdev:
__unregister_chrdev(MSR_MAJOR, 0, NR_CPUS, "cpu/msr");
@@ -286,13 +287,14 @@ out:
static void __exit msr_exit(void)
{
int cpu = 0;
- get_online_cpus();
+
+ cpu_maps_update_begin();
for_each_online_cpu(cpu)
msr_device_destroy(cpu);
class_destroy(msr_class);
__unregister_chrdev(MSR_MAJOR, 0, NR_CPUS, "cpu/msr");
- unregister_hotcpu_notifier(&msr_class_cpu_notifier);
- put_online_cpus();
+ __unregister_hotcpu_notifier(&msr_class_cpu_notifier);
+ cpu_maps_update_done();
}

module_init(msr_init);

2014-02-05 22:12:23

by Srivatsa S. Bhat

Subject: [PATCH 13/51] powerpc, sysfs: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the sysfs code in powerpc by using this latter form of callback
registration.

Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Olof Johansson <[email protected]>
Cc: Wang Dongsheng <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/powerpc/kernel/sysfs.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 97e1dc9..c29ad44 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -975,7 +975,8 @@ static int __init topology_init(void)
int cpu;

register_nodes();
- register_cpu_notifier(&sysfs_cpu_nb);
+
+ cpu_maps_update_begin();

for_each_possible_cpu(cpu) {
struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -999,6 +1000,11 @@ static int __init topology_init(void)
if (cpu_online(cpu))
register_cpu_online(cpu);
}
+
+ __register_cpu_notifier(&sysfs_cpu_nb);
+
+ cpu_maps_update_done();
+
#ifdef CONFIG_PPC64
sysfs_create_dscr_default();
#endif /* CONFIG_PPC64 */

2014-02-05 22:12:42

by Srivatsa S. Bhat

Subject: [PATCH 15/51] x86, cpuid: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the cpuid code in x86 by using this latter form of callback registration.

Cc: "H. Peter Anvin" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpuid.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpuid.c b/arch/x86/kernel/cpuid.c
index 7d9481c..1df179a 100644
--- a/arch/x86/kernel/cpuid.c
+++ b/arch/x86/kernel/cpuid.c
@@ -198,14 +198,15 @@ static int __init cpuid_init(void)
goto out_chrdev;
}
cpuid_class->devnode = cpuid_devnode;
- get_online_cpus();
+
+ cpu_maps_update_begin();
for_each_online_cpu(i) {
err = cpuid_device_create(i);
if (err != 0)
goto out_class;
}
- register_hotcpu_notifier(&cpuid_class_cpu_notifier);
- put_online_cpus();
+ __register_hotcpu_notifier(&cpuid_class_cpu_notifier);
+ cpu_maps_update_done();

err = 0;
goto out;
@@ -215,7 +216,7 @@ out_class:
for_each_online_cpu(i) {
cpuid_device_destroy(i);
}
- put_online_cpus();
+ cpu_maps_update_done();
class_destroy(cpuid_class);
out_chrdev:
__unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
@@ -227,13 +228,13 @@ static void __exit cpuid_exit(void)
{
int cpu = 0;

- get_online_cpus();
+ cpu_maps_update_begin();
for_each_online_cpu(cpu)
cpuid_device_destroy(cpu);
class_destroy(cpuid_class);
__unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
- unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);
- put_online_cpus();
+ __unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);
+ cpu_maps_update_done();
}

module_init(cpuid_init);

2014-02-05 22:12:52

by Srivatsa S. Bhat

Subject: [PATCH 16/51] x86, vsyscall: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the vsyscall code in x86 by using this latter form of callback
registration.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/vsyscall_64.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
index 1f96f93..ae68465 100644
--- a/arch/x86/kernel/vsyscall_64.c
+++ b/arch/x86/kernel/vsyscall_64.c
@@ -393,9 +393,13 @@ static int __init vsyscall_init(void)
{
BUG_ON(VSYSCALL_ADDR(0) != __fix_to_virt(VSYSCALL_FIRST_PAGE));

+ cpu_maps_update_begin();
+
on_each_cpu(cpu_vsyscall_init, NULL, 1);
/* notifier priority > KVM */
- hotcpu_notifier(cpu_vsyscall_notifier, 30);
+ __hotcpu_notifier(cpu_vsyscall_notifier, 30);
+
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:13:05

by Srivatsa S. Bhat

Subject: [PATCH 17/51] x86, intel, uncore: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the uncore code in intel-x86 by using this latter form of callback
registration.

Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/perf_event_intel_uncore.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index 29c2487..e8a8a48 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -3808,7 +3808,7 @@ static int __init uncore_cpu_init(void)
if (ret)
return ret;

- get_online_cpus();
+ cpu_maps_update_begin();

for_each_online_cpu(cpu) {
int i, phys_id = topology_physical_package_id(cpu);
@@ -3827,9 +3827,9 @@ static int __init uncore_cpu_init(void)
}
on_each_cpu(uncore_cpu_setup, NULL, 1);

- register_cpu_notifier(&uncore_cpu_nb);
+ __register_cpu_notifier(&uncore_cpu_nb);

- put_online_cpus();
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:13:17

by Srivatsa S. Bhat

Subject: [PATCH 18/51] x86, mce: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the mce code in x86 by using this latter form of callback registration.

Cc: Tony Luck <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/mcheck/mce.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 4d5419b..613b080 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2434,14 +2434,18 @@ static __init int mcheck_init_device(void)
if (err)
return err;

+ cpu_maps_update_begin();
for_each_online_cpu(i) {
err = mce_device_create(i);
- if (err)
+ if (err) {
+ cpu_maps_update_done();
return err;
+ }
}

register_syscore_ops(&mce_syscore_ops);
- register_hotcpu_notifier(&mce_cpu_notifier);
+ __register_hotcpu_notifier(&mce_cpu_notifier);
+ cpu_maps_update_done();

/* register character device /dev/mcelog */
misc_register(&mce_chrdev_device);

2014-02-05 22:13:27

by Srivatsa S. Bhat

Subject: [PATCH 19/51] x86, therm_throt.c: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the thermal throttle code in x86 by using this latter form of callback
registration.

Cc: Tony Luck <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/mcheck/therm_throt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 3eec7de..c0de00a 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -319,7 +319,7 @@ static __init int thermal_throttle_init_device(void)
if (!atomic_read(&therm_throt_en))
return 0;

- register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
+ cpu_maps_update_begin();

#ifdef CONFIG_HOTPLUG_CPU
mutex_lock(&therm_cpu_lock);
@@ -333,6 +333,9 @@ static __init int thermal_throttle_init_device(void)
mutex_unlock(&therm_cpu_lock);
#endif

+ __register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
+ cpu_maps_update_done();
+
return 0;
}
device_initcall(thermal_throttle_init_device);

2014-02-05 22:13:38

by Srivatsa S. Bhat

Subject: [PATCH 20/51] x86, amd, ibs: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the amd-ibs code in x86 by using this latter form of callback
registration.

Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/perf_event_amd_ibs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 4b8e4d3..f35a2c6 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -926,13 +926,13 @@ static __init int amd_ibs_init(void)
goto out;

perf_ibs_pm_init();
- get_online_cpus();
+ cpu_maps_update_begin();
ibs_caps = caps;
/* make ibs_caps visible to other cpus: */
smp_mb();
- perf_cpu_notifier(perf_ibs_cpu_notifier);
smp_call_function(setup_APIC_ibs, NULL, 1);
- put_online_cpus();
+ __perf_cpu_notifier(perf_ibs_cpu_notifier);
+ cpu_maps_update_done();

ret = perf_event_ibs_init();
out:

2014-02-05 22:13:50

by Srivatsa S. Bhat

Subject: [PATCH 21/51] x86, intel, cacheinfo: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the intel cacheinfo code in x86 by using this latter form of callback
registration.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Borislav Petkov <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/intel_cacheinfo.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 0641113..d15c0dc 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -1225,21 +1225,24 @@ static struct notifier_block cacheinfo_cpu_notifier = {

static int __init cache_sysfs_init(void)
{
- int i;
+ int i, err = 0;

if (num_cache_leaves == 0)
return 0;

+ cpu_maps_update_begin();
for_each_online_cpu(i) {
- int err;
struct device *dev = get_cpu_device(i);

err = cache_add_dev(dev);
if (err)
- return err;
+ goto out;
}
- register_hotcpu_notifier(&cacheinfo_cpu_notifier);
- return 0;
+ __register_hotcpu_notifier(&cacheinfo_cpu_notifier);
+
+out:
+ cpu_maps_update_done();
+ return err;
}

device_initcall(cache_sysfs_init);
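
The error handling in this patch follows a single-exit shape: once the lock is taken, every path leaves through one label that drops it. A minimal userspace sketch of that shape is shown below, assuming a pthread mutex in place of cpu_add_remove_lock. All ep_* names, including the fail_cpu parameter used to inject an error, are hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define EP_NR_CPUS 4

/* Stand-in for cpu_add_remove_lock (an assumption of this model). */
static pthread_mutex_t ep_lock = PTHREAD_MUTEX_INITIALIZER;
static bool ep_registered;

/* Hypothetical per-CPU setup that fails with -ENOMEM for 'fail_cpu'. */
static int ep_add_dev(int cpu, int fail_cpu)
{
	return cpu == fail_cpu ? -ENOMEM : 0;
}

/* Init routine with one exit path, so the lock is released on every return. */
int ep_sysfs_init(int fail_cpu)
{
	int cpu, err = 0;

	pthread_mutex_lock(&ep_lock);		/* cpu_maps_update_begin() */
	for (cpu = 0; cpu < EP_NR_CPUS; cpu++) {
		err = ep_add_dev(cpu, fail_cpu);
		if (err)
			goto out;		/* skip registration, still unlock */
	}
	ep_registered = true;			/* __register_hotcpu_notifier() */

out:
	pthread_mutex_unlock(&ep_lock);		/* cpu_maps_update_done() */
	return err;
}

/* Returns true if the lock is currently not held by anyone. */
bool ep_lock_is_free(void)
{
	if (pthread_mutex_trylock(&ep_lock) != 0)
		return false;
	pthread_mutex_unlock(&ep_lock);
	return true;
}

bool ep_is_registered(void)
{
	return ep_registered;
}
```

With fail_cpu set to a valid CPU number the function returns -ENOMEM, leaves the callback unregistered, and still releases the lock. Passing -1 (no failure injected) registers the callback and returns 0.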

2014-02-05 22:14:06

by Srivatsa S. Bhat

Subject: [PATCH 22/51] x86, intel, rapl: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the intel rapl code in x86 by using this latter form of callback
registration.

Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/perf_event_intel_rapl.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
index 5ad35ad..bc41c02 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -646,19 +646,20 @@ static int __init rapl_pmu_init(void)
/* unsupported */
return 0;
}
- get_online_cpus();
+
+ cpu_maps_update_begin();

for_each_online_cpu(cpu) {
rapl_cpu_prepare(cpu);
rapl_cpu_init(cpu);
}

- perf_cpu_notifier(rapl_cpu_notifier);
+ __perf_cpu_notifier(rapl_cpu_notifier);

ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
if (WARN_ON(ret)) {
pr_info("RAPL PMU detected, registration failed (%d), RAPL PMU disabled\n", ret);
- put_online_cpus();
+ cpu_maps_update_done();
return -1;
}

@@ -672,7 +673,7 @@ static int __init rapl_pmu_init(void)
hweight32(rapl_cntr_mask),
ktime_to_ms(pmu->timer_interval));

- put_online_cpus();
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:14:17

by Srivatsa S. Bhat

Subject: [PATCH 23/51] x86, amd, uncore: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the amd-uncore code in x86 by using this latter form of callback
registration.

Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/cpu/perf_event_amd_uncore.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_uncore.c b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
index 754291a..3982ef0 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
@@ -531,15 +531,16 @@ static int __init amd_uncore_init(void)
if (ret)
return -ENODEV;

- get_online_cpus();
+ cpu_maps_update_begin();
+
/* init cpus already online before registering for hotplug notifier */
for_each_online_cpu(cpu) {
amd_uncore_cpu_up_prepare(cpu);
smp_call_function_single(cpu, init_cpu_already_online, NULL, 1);
}

- register_cpu_notifier(&amd_uncore_cpu_notifier_block);
- put_online_cpus();
+ __register_cpu_notifier(&amd_uncore_cpu_notifier_block);
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:14:28

by Srivatsa S. Bhat

Subject: [PATCH 24/51] x86, hpet: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the hpet code in x86 by using this latter form of callback registration.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kernel/hpet.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index da85a8e..199aaae 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -943,12 +943,14 @@ static __init int hpet_late_init(void)
if (boot_cpu_has(X86_FEATURE_ARAT))
return 0;

+ cpu_maps_update_begin();
for_each_online_cpu(cpu) {
hpet_cpuhp_notify(NULL, CPU_ONLINE, (void *)(long)cpu);
}

/* This notifier should be called after workqueue is ready */
- hotcpu_notifier(hpet_cpuhp_notify, -20);
+ __hotcpu_notifier(hpet_cpuhp_notify, -20);
+ cpu_maps_update_done();

return 0;
}

2014-02-05 22:14:40

by Srivatsa S. Bhat

Subject: [PATCH 25/51] x86, pci, amd-bus: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the amd-bus code in x86 by using this latter form of callback
registration.

Cc: Bjorn Helgaas <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/pci/amd_bus.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
index a48be98..7cd5d84 100644
--- a/arch/x86/pci/amd_bus.c
+++ b/arch/x86/pci/amd_bus.c
@@ -380,10 +380,13 @@ static int __init pci_io_ecs_init(void)
if (early_pci_allowed())
pci_enable_pci_io_ecs();

- register_cpu_notifier(&amd_cpu_notifier);
+ cpu_maps_update_begin();
for_each_online_cpu(cpu)
amd_cpu_notify(&amd_cpu_notifier, (unsigned long)CPU_ONLINE,
(void *)(long)cpu);
+ __register_cpu_notifier(&amd_cpu_notifier);
+ cpu_maps_update_done();
+
pci_probe |= PCI_HAS_IO_ECS;

return 0;

2014-02-05 22:14:51

by Srivatsa S. Bhat

Subject: [PATCH 26/51] x86, oprofile, nmi: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the oprofile code in x86 by using this latter form of callback
registration. But retain the calls to get/put_online_cpus(), since they
also protect the variables 'nmi_enabled' and 'ctr_running'. By nesting
get/put_online_cpus() *inside* cpu_maps_update_begin/done(), we avoid
the ABBA deadlock possibility mentioned above.

Cc: Robert Richter <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/oprofile/nmi_int.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index 6890d84..85e5f6e 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -494,14 +494,19 @@ static int nmi_setup(void)
if (err)
goto fail;

+ cpu_maps_update_begin();
+
+ /* Use get/put_online_cpus() to protect 'nmi_enabled' */
get_online_cpus();
- register_cpu_notifier(&oprofile_cpu_nb);
nmi_enabled = 1;
/* make nmi_enabled visible to the nmi handler: */
smp_mb();
on_each_cpu(nmi_cpu_setup, NULL, 1);
+ __register_cpu_notifier(&oprofile_cpu_nb);
put_online_cpus();

+ cpu_maps_update_done();
+
return 0;
fail:
free_msrs();
@@ -512,12 +517,18 @@ static void nmi_shutdown(void)
{
struct op_msrs *msrs;

+ cpu_maps_update_begin();
+
+ /* Use get/put_online_cpus() to protect 'nmi_enabled' & 'ctr_running' */
get_online_cpus();
- unregister_cpu_notifier(&oprofile_cpu_nb);
on_each_cpu(nmi_cpu_shutdown, NULL, 1);
nmi_enabled = 0;
ctr_running = 0;
+ __unregister_cpu_notifier(&oprofile_cpu_nb);
put_online_cpus();
+
+ cpu_maps_update_done();
+
/* make variables visible to the nmi handler: */
smp_mb();
unregister_nmi_handler(NMI_LOCAL, "oprofile");

2014-02-05 22:15:01

by Srivatsa S. Bhat

Subject: [PATCH 27/51] x86, kvm: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the kvm code in x86 by using this latter form of callback registration.

Cc: Gleb Natapov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/x86/kvm/x86.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 39c28f09..e3893b7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5365,7 +5365,8 @@ static void kvm_timer_init(void)
int cpu;

max_tsc_khz = tsc_khz;
- register_hotcpu_notifier(&kvmclock_cpu_notifier_block);
+
+ cpu_maps_update_begin();
if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
#ifdef CONFIG_CPU_FREQ
struct cpufreq_policy policy;
@@ -5382,6 +5383,10 @@ static void kvm_timer_init(void)
pr_debug("kvm: max_tsc_khz = %ld\n", max_tsc_khz);
for_each_online_cpu(cpu)
smp_call_function_single(cpu, tsc_khz_changed, NULL, 1);
+
+ __register_hotcpu_notifier(&kvmclock_cpu_notifier_block);
+ cpu_maps_update_done();
+
}

static DEFINE_PER_CPU(struct kvm_vcpu *, current_vcpu);

2014-02-05 22:15:16

by Srivatsa S. Bhat

Subject: [PATCH 28/51] arm64, hw_breakpoint.c: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the hw-breakpoint code in arm64 by using this latter form of callback
registration.

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/arm64/kernel/hw_breakpoint.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index f17f581..24e88d0 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -913,6 +913,8 @@ static int __init arch_hw_breakpoint_init(void)
pr_info("found %d breakpoint and %d watchpoint registers.\n",
core_num_brps, core_num_wrps);

+ cpu_maps_update_begin();
+
/*
* Reset the breakpoint resources. We assume that a halting
* debugger will leave the world in a nice state for us.
@@ -927,7 +929,10 @@ static int __init arch_hw_breakpoint_init(void)
TRAP_HWBKPT, "hw-watchpoint handler");

/* Register hotplug notifier. */
- register_cpu_notifier(&hw_breakpoint_reset_nb);
+ __register_cpu_notifier(&hw_breakpoint_reset_nb);
+
+ cpu_maps_update_done();
+
/* Register cpu_suspend hw breakpoint restore hook */
cpu_suspend_set_dbg_restorer(hw_breakpoint_reset);

2014-02-05 22:15:24

by Srivatsa S. Bhat

Subject: [PATCH 29/51] arm64, debug-monitors: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the debug-monitors code in arm64 by using this latter form of callback
registration.

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

arch/arm64/kernel/debug-monitors.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index 636ba8b..959a16b 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -155,12 +155,16 @@ static struct notifier_block os_lock_nb = {

static int debug_monitors_init(void)
{
+ cpu_maps_update_begin();
+
/* Clear the OS lock. */
smp_call_function(clear_os_lock, NULL, 1);
clear_os_lock(NULL);

/* Register hotplug handler. */
- register_cpu_notifier(&os_lock_nb);
+ __register_cpu_notifier(&os_lock_nb);
+
+ cpu_maps_update_done();
return 0;
}
postcore_initcall(debug_monitors_init);

2014-02-05 22:15:36

by Srivatsa S. Bhat

Subject: [PATCH 30/51] powercap, intel-rapl: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the intel-rapl code in the powercap driver by using this latter form
of callback registration. But retain the calls to get/put_online_cpus(),
since they also protect the function rapl_cleanup_data(). By nesting
get/put_online_cpus() *inside* cpu_maps_update_begin/done(), we avoid
the ABBA deadlock possibility mentioned above.

Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Jacob Pan <[email protected]>
Cc: Srinivas Pandruvada <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/powercap/intel_rapl.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 3c67683..b460d46 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -1369,6 +1369,9 @@ static int __init rapl_init(void)

return -ENODEV;
}
+
+ cpu_maps_update_begin();
+
/* prevent CPU hotplug during detection */
get_online_cpus();
ret = rapl_detect_topology();
@@ -1380,20 +1383,23 @@ static int __init rapl_init(void)
ret = -ENODEV;
goto done;
}
- register_hotcpu_notifier(&rapl_cpu_notifier);
+ __register_hotcpu_notifier(&rapl_cpu_notifier);
done:
put_online_cpus();
+ cpu_maps_update_done();

return ret;
}

static void __exit rapl_exit(void)
{
+ cpu_maps_update_begin();
get_online_cpus();
- unregister_hotcpu_notifier(&rapl_cpu_notifier);
+ __unregister_hotcpu_notifier(&rapl_cpu_notifier);
rapl_unregister_powercap();
rapl_cleanup_data();
put_online_cpus();
+ cpu_maps_update_done();
}

module_init(rapl_init);

2014-02-05 22:15:43

by Srivatsa S. Bhat

Subject: [PATCH 31/51] scsi, bnx2i: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the bnx2i code in scsi by using this latter form of callback registration.

Cc: Eddie Wai <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/scsi/bnx2i/bnx2i_init.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/bnx2i/bnx2i_init.c b/drivers/scsi/bnx2i/bnx2i_init.c
index 34c294b..5c4e413 100644
--- a/drivers/scsi/bnx2i/bnx2i_init.c
+++ b/drivers/scsi/bnx2i/bnx2i_init.c
@@ -537,11 +537,15 @@ static int __init bnx2i_mod_init(void)
p->iothread = NULL;
}

+ cpu_maps_update_begin();
+
for_each_online_cpu(cpu)
bnx2i_percpu_thread_create(cpu);

/* Initialize per CPU interrupt thread */
- register_hotcpu_notifier(&bnx2i_cpu_notifier);
+ __register_hotcpu_notifier(&bnx2i_cpu_notifier);
+
+ cpu_maps_update_done();

return 0;

@@ -581,11 +585,15 @@ static void __exit bnx2i_mod_exit(void)
}
mutex_unlock(&bnx2i_dev_lock);

- unregister_hotcpu_notifier(&bnx2i_cpu_notifier);
+ cpu_maps_update_begin();

for_each_online_cpu(cpu)
bnx2i_percpu_thread_destroy(cpu);

+ __unregister_hotcpu_notifier(&bnx2i_cpu_notifier);
+
+ cpu_maps_update_done();
+
iscsi_unregister_transport(&bnx2i_iscsi_transport);
cnic_unregister_driver(CNIC_ULP_ISCSI);
}

2014-02-05 22:15:52

by Srivatsa S. Bhat

Subject: [PATCH 32/51] scsi, bnx2fc: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the bnx2fc code in scsi by using this latter form of callback
registration.

Cc: Eddie Wai <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index 9b94850..f6c10c5 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -2586,12 +2586,16 @@ static int __init bnx2fc_mod_init(void)
spin_lock_init(&p->fp_work_lock);
}

+ cpu_maps_update_begin();
+
for_each_online_cpu(cpu) {
bnx2fc_percpu_thread_create(cpu);
}

/* Initialize per CPU interrupt thread */
- register_hotcpu_notifier(&bnx2fc_cpu_notifier);
+ __register_hotcpu_notifier(&bnx2fc_cpu_notifier);
+
+ cpu_maps_update_done();

cnic_register_driver(CNIC_ULP_FCOE, &bnx2fc_cnic_cb);

@@ -2656,13 +2660,17 @@ static void __exit bnx2fc_mod_exit(void)
if (l2_thread)
kthread_stop(l2_thread);

- unregister_hotcpu_notifier(&bnx2fc_cpu_notifier);
+ cpu_maps_update_begin();

/* Destroy per cpu threads */
for_each_online_cpu(cpu) {
bnx2fc_percpu_thread_destroy(cpu);
}

+ __unregister_hotcpu_notifier(&bnx2fc_cpu_notifier);
+
+ cpu_maps_update_done();
+
destroy_workqueue(bnx2fc_wq);
/*
* detach from scsi transport

2014-02-05 22:16:01

by Srivatsa S. Bhat

Subject: [PATCH 33/51] scsi, fcoe: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the fcoe code in scsi by using this latter form of callback registration.

Cc: Robert Love <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/scsi/fcoe/fcoe.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index f317000..1c299de 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -2633,14 +2633,18 @@ static int __init fcoe_init(void)
skb_queue_head_init(&p->fcoe_rx_list);
}

+ cpu_maps_update_begin();
+
for_each_online_cpu(cpu)
fcoe_percpu_thread_create(cpu);

/* Initialize per CPU interrupt thread */
- rc = register_hotcpu_notifier(&fcoe_cpu_notifier);
+ rc = __register_hotcpu_notifier(&fcoe_cpu_notifier);
if (rc)
goto out_free;

+ cpu_maps_update_done();
+
/* Setup link change notification */
fcoe_dev_setup();

@@ -2655,6 +2659,9 @@ out_free:
for_each_online_cpu(cpu) {
fcoe_percpu_thread_destroy(cpu);
}
+
+ cpu_maps_update_done();
+
mutex_unlock(&fcoe_config_mutex);
destroy_workqueue(fcoe_wq);
return rc;
@@ -2687,11 +2694,15 @@ static void __exit fcoe_exit(void)
}
rtnl_unlock();

- unregister_hotcpu_notifier(&fcoe_cpu_notifier);
+ cpu_maps_update_begin();

for_each_online_cpu(cpu)
fcoe_percpu_thread_destroy(cpu);

+ __unregister_hotcpu_notifier(&fcoe_cpu_notifier);
+
+ cpu_maps_update_done();
+
mutex_unlock(&fcoe_config_mutex);

/*

2014-02-05 22:16:11

by Srivatsa S. Bhat

Subject: [PATCH 34/51] zsmalloc: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the zsmalloc code by using this latter form of callback registration.
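
Note that zs_init() in the diff below registers the notifier before walking
the online CPUs, the reverse of the order in the snippet above. Since both
steps run with cpu_add_remove_lock held, no hotplug operation can slip in
between them, so either order is safe. The following single-threaded
userspace model is a sketch of that reasoning only; hp_model and all of its
helpers are hypothetical names, and a hotplug request that would block on
the mutex is modelled as being deferred until the lock is released.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct hp_model {
	bool lock_held;           /* stands in for cpu_add_remove_lock */
	bool online[NR_CPUS];
	bool notifier_registered;
	int init_count[NR_CPUS];  /* how often each CPU was initialized */
	int pending_cpu;          /* deferred online request, -1 if none */
};

static void hp_model_init(struct hp_model *m)
{
	*m = (struct hp_model){ .pending_cpu = -1 };
	m->online[0] = true;      /* boot CPU is already online */
}

static void do_online(struct hp_model *m, int cpu)
{
	m->online[cpu] = true;
	if (m->notifier_registered)
		m->init_count[cpu]++;   /* notifier callback initializes it */
}

/* A hotplug request: runs now, or waits if the lock is held. */
static void request_online(struct hp_model *m, int cpu)
{
	if (m->lock_held)
		m->pending_cpu = cpu;
	else
		do_online(m, cpu);
}

static void maps_update_begin(struct hp_model *m)
{
	assert(!m->lock_held);
	m->lock_held = true;
}

static void maps_update_done(struct hp_model *m)
{
	m->lock_held = false;
	if (m->pending_cpu >= 0) {
		int cpu = m->pending_cpu;

		m->pending_cpu = -1;
		do_online(m, cpu);
	}
}

static void init_online_cpus(struct hp_model *m)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (m->online[cpu])
			m->init_count[cpu]++;
}

/* Order used by zs_init() in the diff: register first, then initialize. */
static void locked_register_then_init(struct hp_model *m, int racing_cpu)
{
	maps_update_begin(m);
	m->notifier_registered = true;
	request_online(m, racing_cpu);  /* arrives mid-way, must wait */
	init_online_cpus(m);
	maps_update_done(m);
}

/* Same order without the lock: the racing CPU is initialized twice. */
static void unlocked_register_then_init(struct hp_model *m, int racing_cpu)
{
	m->notifier_registered = true;
	request_online(m, racing_cpu);
	init_online_cpus(m);
}
```

With CPU 1 as the racing CPU, the locked variant leaves init_count at 1 for
both CPU 0 and CPU 1, while the unlocked variant initializes CPU 1 twice,
which is the double-initialization case described in the cover letter.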

Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

mm/zsmalloc.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index c03ca5e..6f7364c 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -814,21 +814,32 @@ static void zs_exit(void)
{
int cpu;

+ cpu_maps_update_begin();
+
for_each_online_cpu(cpu)
zs_cpu_notifier(NULL, CPU_DEAD, (void *)(long)cpu);
- unregister_cpu_notifier(&zs_cpu_nb);
+ __unregister_cpu_notifier(&zs_cpu_nb);
+
+ cpu_maps_update_done();
}

static int zs_init(void)
{
int cpu, ret;

- register_cpu_notifier(&zs_cpu_nb);
+ cpu_maps_update_begin();
+
+ __register_cpu_notifier(&zs_cpu_nb);
for_each_online_cpu(cpu) {
ret = zs_cpu_notifier(NULL, CPU_UP_PREPARE, (void *)(long)cpu);
- if (notifier_to_errno(ret))
+ if (notifier_to_errno(ret)) {
+ cpu_maps_update_done();
goto fail;
+ }
}
+
+ cpu_maps_update_done();
+
return 0;
fail:
zs_exit();

2014-02-05 22:16:21

by Srivatsa S. Bhat

Subject: [PATCH 35/51] acpi-cpufreq: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the acpi-cpufreq code by using this latter form of callback registration.

Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/cpufreq/acpi-cpufreq.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 18448a7..e2eb471 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -907,15 +907,16 @@ static void __init acpi_cpufreq_boost_init(void)

acpi_cpufreq_driver.boost_supported = true;
acpi_cpufreq_driver.boost_enabled = boost_state(0);
- get_online_cpus();
+
+ cpu_maps_update_begin();

/* Force all MSRs to the same value */
boost_set_msrs(acpi_cpufreq_driver.boost_enabled,
cpu_online_mask);

- register_cpu_notifier(&boost_nb);
+ __register_cpu_notifier(&boost_nb);

- put_online_cpus();
+ cpu_maps_update_done();
}
}

2014-02-05 22:16:30

by Srivatsa S. Bhat

Subject: [PATCH 36/51] drivers/base/topology.c: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the topology code by using this latter form of callback registration.

Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/base/topology.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 94ffee3..9db29cc 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -161,16 +161,20 @@ static int topology_cpu_callback(struct notifier_block *nfb,
static int topology_sysfs_init(void)
{
int cpu;
- int rc;
+ int rc = 0;
+
+ cpu_maps_update_begin();

for_each_online_cpu(cpu) {
rc = topology_add_dev(cpu);
if (rc)
- return rc;
+ goto out;
}
- hotcpu_notifier(topology_cpu_callback, 0);
+ __hotcpu_notifier(topology_cpu_callback, 0);

- return 0;
+out:
+ cpu_maps_update_done();
+ return rc;
}

device_initcall(topology_sysfs_init);

2014-02-05 22:16:38

by Srivatsa S. Bhat

Subject: [PATCH 37/51] clocksource, dummy-timer: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the clocksource dummy-timer code by using this latter form of callback
registration.

Cc: Daniel Lezcano <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/clocksource/dummy_timer.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/dummy_timer.c b/drivers/clocksource/dummy_timer.c
index b3eb582..995d446 100644
--- a/drivers/clocksource/dummy_timer.c
+++ b/drivers/clocksource/dummy_timer.c
@@ -56,14 +56,19 @@ static struct notifier_block dummy_timer_cpu_nb = {

static int __init dummy_timer_register(void)
{
- int err = register_cpu_notifier(&dummy_timer_cpu_nb);
+ int err = 0;
+
+ cpu_maps_update_begin();
+ err = __register_cpu_notifier(&dummy_timer_cpu_nb);
if (err)
- return err;
+ goto out;

/* We won't get a call on the boot CPU, so register immediately */
if (num_possible_cpus() > 1)
dummy_timer_setup();

- return 0;
+out:
+ cpu_maps_update_done();
+ return err;
}
early_initcall(dummy_timer_register);

2014-02-05 22:16:59

by Srivatsa S. Bhat

Subject: [PATCH 39/51] oprofile, nmi-timer: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the nmi-timer code in oprofile by using this latter form of callback
registration.
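
One detail of the diff below: the error path in nmi_timer_setup() calls
cpu_maps_update_done() before nmi_timer_shutdown(), because the latter now
takes cpu_maps_update_begin() itself and cpu_add_remove_lock is an
ordinary, non-recursive mutex. A minimal userspace sketch of that
constraint follows; nr_mutex and the helpers are hypothetical, and the
self-deadlock a real mutex would hit is modelled as an -EDEADLK return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a non-recursive mutex. */
struct nr_mutex {
	bool held;
};

/* Returns 0 on success, -EDEADLK where a real mutex would self-deadlock. */
static int nr_lock(struct nr_mutex *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

static void nr_unlock(struct nr_mutex *l)
{
	assert(l->held);
	l->held = false;
}

/* Cleanup helper that takes the lock itself, like nmi_timer_shutdown(). */
static int model_shutdown(struct nr_mutex *l)
{
	int err = nr_lock(l);

	if (err)
		return err;
	/* ... per-CPU teardown would go here ... */
	nr_unlock(l);
	return 0;
}

/* Error path that calls the cleanup helper with the lock still held. */
static int fail_path_lock_held(struct nr_mutex *l)
{
	int err;

	if (nr_lock(l))
		return -EDEADLK;
	err = model_shutdown(l);  /* tries to re-take a lock we hold */
	nr_unlock(l);
	return err;
}

/* Error path that drops the lock first, as the patch does. */
static int fail_path_unlock_first(struct nr_mutex *l)
{
	if (nr_lock(l))
		return -EDEADLK;
	nr_unlock(l);
	return model_shutdown(l);
}
```

In this model fail_path_lock_held() returns -EDEADLK, whereas
fail_path_unlock_first() returns 0 and leaves the lock released.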

Cc: Robert Richter <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/oprofile/nmi_timer_int.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/oprofile/nmi_timer_int.c b/drivers/oprofile/nmi_timer_int.c
index 76f1c93..bbceee6 100644
--- a/drivers/oprofile/nmi_timer_int.c
+++ b/drivers/oprofile/nmi_timer_int.c
@@ -108,8 +108,8 @@ static void nmi_timer_shutdown(void)
struct perf_event *event;
int cpu;

- get_online_cpus();
- unregister_cpu_notifier(&nmi_timer_cpu_nb);
+ cpu_maps_update_begin();
+ __unregister_cpu_notifier(&nmi_timer_cpu_nb);
for_each_possible_cpu(cpu) {
event = per_cpu(nmi_timer_events, cpu);
if (!event)
@@ -119,7 +119,7 @@ static void nmi_timer_shutdown(void)
perf_event_release_kernel(event);
}

- put_online_cpus();
+ cpu_maps_update_done();
}

static int nmi_timer_setup(void)
@@ -132,20 +132,23 @@ static int nmi_timer_setup(void)
do_div(period, HZ);
nmi_timer_attr.sample_period = period;

- get_online_cpus();
- err = register_cpu_notifier(&nmi_timer_cpu_nb);
+ cpu_maps_update_begin();
+ err = __register_cpu_notifier(&nmi_timer_cpu_nb);
if (err)
goto out;
+
/* can't attach events to offline cpus: */
for_each_online_cpu(cpu) {
err = nmi_timer_start_cpu(cpu);
- if (err)
- break;
+ if (err) {
+ cpu_maps_update_done();
+ nmi_timer_shutdown();
+ return err;
+ }
}
- if (err)
- nmi_timer_shutdown();
+
out:
- put_online_cpus();
+ cpu_maps_update_done();
return err;
}

2014-02-05 22:17:16

by Srivatsa S. Bhat

Subject: [PATCH 40/51] octeon, watchdog: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the watchdog code in octeon by using this latter form of callback
registration.

Cc: Wim Van Sebroeck <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/watchdog/octeon-wdt-main.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/watchdog/octeon-wdt-main.c b/drivers/watchdog/octeon-wdt-main.c
index 4612088..c991acf 100644
--- a/drivers/watchdog/octeon-wdt-main.c
+++ b/drivers/watchdog/octeon-wdt-main.c
@@ -708,10 +708,13 @@ static int __init octeon_wdt_init(void)

cpumask_clear(&irq_enabled_cpus);

+ cpu_maps_update_begin();
for_each_online_cpu(cpu)
octeon_wdt_setup_interrupt(cpu);

- register_hotcpu_notifier(&octeon_wdt_cpu_notifier);
+ __register_hotcpu_notifier(&octeon_wdt_cpu_notifier);
+ cpu_maps_update_done();
+
out:
return ret;
}
@@ -725,7 +728,8 @@ static void __exit octeon_wdt_cleanup(void)

misc_deregister(&octeon_wdt_miscdev);

- unregister_hotcpu_notifier(&octeon_wdt_cpu_notifier);
+ cpu_maps_update_begin();
+ __unregister_hotcpu_notifier(&octeon_wdt_cpu_notifier);

for_each_online_cpu(cpu) {
int core = cpu2core(cpu);
@@ -734,6 +738,9 @@ static void __exit octeon_wdt_cleanup(void)
/* Free the interrupt handler */
free_irq(OCTEON_IRQ_WDOG0 + core, octeon_wdt_poke_irq);
}
+
+ cpu_maps_update_done();
+
/*
* Disable the boot-bus memory, the code it points to is soon
* to go missing.

2014-02-05 22:17:24

by Srivatsa S. Bhat

Subject: [PATCH 41/51] thermal, x86-pkg-temp: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the thermal x86-pkg-temp code by using this latter form of callback
registration.

Cc: Zhang Rui <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/thermal/x86_pkg_temp_thermal.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/x86_pkg_temp_thermal.c b/drivers/thermal/x86_pkg_temp_thermal.c
index 972e1c7..55f9eac 100644
--- a/drivers/thermal/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/x86_pkg_temp_thermal.c
@@ -589,12 +589,12 @@ static int __init pkg_temp_thermal_init(void)
platform_thermal_package_rate_control =
pkg_temp_thermal_platform_thermal_rate_control;

- get_online_cpus();
+ cpu_maps_update_begin();
for_each_online_cpu(i)
if (get_core_online(i))
goto err_ret;
- register_hotcpu_notifier(&pkg_temp_thermal_notifier);
- put_online_cpus();
+ __register_hotcpu_notifier(&pkg_temp_thermal_notifier);
+ cpu_maps_update_done();

pkg_temp_debugfs_init(); /* Don't care if fails */

@@ -603,7 +603,7 @@ static int __init pkg_temp_thermal_init(void)
err_ret:
for_each_online_cpu(i)
put_core_offline(i);
- put_online_cpus();
+ cpu_maps_update_done();
kfree(pkg_work_scheduled);
platform_thermal_package_notify = NULL;
platform_thermal_package_rate_control = NULL;
@@ -616,8 +616,8 @@ static void __exit pkg_temp_thermal_exit(void)
struct phy_dev_entry *phdev, *n;
int i;

- get_online_cpus();
- unregister_hotcpu_notifier(&pkg_temp_thermal_notifier);
+ cpu_maps_update_begin();
+ __unregister_hotcpu_notifier(&pkg_temp_thermal_notifier);
mutex_lock(&phy_dev_list_mutex);
list_for_each_entry_safe(phdev, n, &phy_dev_list, list) {
/* Retore old MSR value for package thermal interrupt */
@@ -635,7 +635,7 @@ static void __exit pkg_temp_thermal_exit(void)
for_each_online_cpu(i)
cancel_delayed_work_sync(
&per_cpu(pkg_temp_thermal_threshold_work, i));
- put_online_cpus();
+ cpu_maps_update_done();

kfree(pkg_work_scheduled);

2014-02-05 22:17:37

by Srivatsa S. Bhat

Subject: [PATCH 42/51] hwmon, coretemp: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the hwmon coretemp code by using this latter form of callback
registration.
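
As a rough user-space illustration of the locking rule (not kernel code), the
sketch below models cpu_add_remove_lock with a pthread mutex. All model_*
names, the NR_CPUS value and the boolean arrays are hypothetical stand-ins;
only the ordering of the calls mirrors the pattern described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical; small fixed CPU count for the model */

/* Stand-in for cpu_add_remove_lock; everything here is a user-space model. */
static pthread_mutex_t model_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;

static bool cpu_online[NR_CPUS] = { true, true, false, false };
static bool cpu_initialized[NR_CPUS];
static bool notifier_registered;

static void model_cpu_maps_update_begin(void)
{
	pthread_mutex_lock(&model_add_remove_lock);
}

static void model_cpu_maps_update_done(void)
{
	pthread_mutex_unlock(&model_add_remove_lock);
}

/* Models the double-underscore variant: the caller already holds the lock. */
static void model_register_notifier_locked(void)
{
	notifier_registered = true;
}

/* Models a CPU-online operation, which serializes on the same lock. */
static void model_cpu_up(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	model_cpu_maps_update_begin();
	cpu_online[cpu] = true;
	if (notifier_registered)
		cpu_initialized[cpu] = true;	/* what the callback would do */
	model_cpu_maps_update_done();
}

/* Init for online CPUs and registration happen in one critical section. */
static void model_subsystem_init(void)
{
	int cpu;

	model_cpu_maps_update_begin();
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			cpu_initialized[cpu] = true;
	model_register_notifier_locked();
	model_cpu_maps_update_done();
}

/* True when every online CPU has been initialized. */
static bool model_all_online_initialized(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu] && !cpu_initialized[cpu])
			return false;
	return true;
}
```

Because model_cpu_up() has to take the same mutex, in this model it can only
run before model_subsystem_init() starts or after the notifier is registered,
so a CPU is neither skipped nor initialized twice.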

Cc: Fenghua Yu <[email protected]>
Cc: Jean Delvare <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/hwmon/coretemp.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index bbb0b0d..43c436b 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -849,20 +849,20 @@ static int __init coretemp_init(void)
if (err)
goto exit;

- get_online_cpus();
+ cpu_maps_update_begin();
for_each_online_cpu(i)
get_core_online(i);

#ifndef CONFIG_HOTPLUG_CPU
if (list_empty(&pdev_list)) {
- put_online_cpus();
+ cpu_maps_update_done();
err = -ENODEV;
goto exit_driver_unreg;
}
#endif

- register_hotcpu_notifier(&coretemp_cpu_notifier);
- put_online_cpus();
+ __register_hotcpu_notifier(&coretemp_cpu_notifier);
+ cpu_maps_update_done();
return 0;

#ifndef CONFIG_HOTPLUG_CPU
@@ -877,8 +877,8 @@ static void __exit coretemp_exit(void)
{
struct pdev_entry *p, *n;

- get_online_cpus();
- unregister_hotcpu_notifier(&coretemp_cpu_notifier);
+ cpu_maps_update_begin();
+ __unregister_hotcpu_notifier(&coretemp_cpu_notifier);
mutex_lock(&pdev_list_mutex);
list_for_each_entry_safe(p, n, &pdev_list, list) {
platform_device_unregister(p->pdev);
@@ -886,7 +886,7 @@ static void __exit coretemp_exit(void)
kfree(p);
}
mutex_unlock(&pdev_list_mutex);
- put_online_cpus();
+ cpu_maps_update_done();
platform_driver_unregister(&coretemp_driver);
}

2014-02-05 22:17:50

by Srivatsa S. Bhat

Subject: [PATCH 43/51] hwmon, via-cputemp: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the hwmon via-cputemp code by using this latter form of callback
registration.

Cc: Jean Delvare <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/hwmon/via-cputemp.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/hwmon/via-cputemp.c b/drivers/hwmon/via-cputemp.c
index 38944e9..a5c9b2b 100644
--- a/drivers/hwmon/via-cputemp.c
+++ b/drivers/hwmon/via-cputemp.c
@@ -319,7 +319,7 @@ static int __init via_cputemp_init(void)
if (err)
goto exit;

- get_online_cpus();
+ cpu_maps_update_begin();
for_each_online_cpu(i) {
struct cpuinfo_x86 *c = &cpu_data(i);

@@ -339,14 +339,14 @@ static int __init via_cputemp_init(void)

#ifndef CONFIG_HOTPLUG_CPU
if (list_empty(&pdev_list)) {
- put_online_cpus();
+ cpu_maps_update_done();
err = -ENODEV;
goto exit_driver_unreg;
}
#endif

- register_hotcpu_notifier(&via_cputemp_cpu_notifier);
- put_online_cpus();
+ __register_hotcpu_notifier(&via_cputemp_cpu_notifier);
+ cpu_maps_update_done();
return 0;

#ifndef CONFIG_HOTPLUG_CPU
@@ -361,8 +361,8 @@ static void __exit via_cputemp_exit(void)
{
struct pdev_entry *p, *n;

- get_online_cpus();
- unregister_hotcpu_notifier(&via_cputemp_cpu_notifier);
+ cpu_maps_update_begin();
+ __unregister_hotcpu_notifier(&via_cputemp_cpu_notifier);
mutex_lock(&pdev_list_mutex);
list_for_each_entry_safe(p, n, &pdev_list, list) {
platform_device_unregister(p->pdev);
@@ -370,7 +370,7 @@ static void __exit via_cputemp_exit(void)
kfree(p);
}
mutex_unlock(&pdev_list_mutex);
- put_online_cpus();
+ cpu_maps_update_done();
platform_driver_unregister(&via_cputemp_driver);
}

2014-02-05 22:18:02

by Srivatsa S. Bhat

Subject: [PATCH 44/51] xen, balloon: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Interestingly, the balloon code in xen can actually prevent double
initialization and hence can use the following simplified form of callback
registration:

register_cpu_notifier(&foobar_cpu_notifier);

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

put_online_cpus();

A hotplug operation that occurs between registering the notifier and calling
get_online_cpus() won't disrupt anything, because the code takes care to
perform the memory allocations only once.

So reorganize the balloon code in xen this way to fix the deadlock with
callback registration.
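
The "only once" property that makes the register-first ordering safe can be
sketched in user space as follows. This is an illustrative model, assuming
malloc() in place of alloc_page(); scratch_page[], SCRATCH_SIZE, alloc_calls
and model_alloc_scratch_page() are hypothetical names, not the driver's.

```c
#include <errno.h>
#include <stdlib.h>

#define NR_CPUS 4		/* hypothetical CPU count for the model */
#define SCRATCH_SIZE 4096	/* hypothetical stand-in for one page */

static void *scratch_page[NR_CPUS];
static int alloc_calls;		/* counts real allocations, for illustration */

/*
 * Idempotent per-CPU allocation: a second call for the same CPU is a no-op,
 * so it does not matter whether the notifier or the init loop gets there
 * first.
 */
static int model_alloc_scratch_page(int cpu)
{
	if (scratch_page[cpu] != NULL)
		return 0;

	scratch_page[cpu] = malloc(SCRATCH_SIZE);
	if (scratch_page[cpu] == NULL)
		return -ENOMEM;

	alloc_calls++;
	return 0;
}
```

Calling model_alloc_scratch_page() twice for the same CPU leaves the pointer
unchanged and performs a single allocation, which is the behaviour the
changelog relies on.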

Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: David Vrabel <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/xen/balloon.c | 35 +++++++++++++++++++++++------------
1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 37d06ea..afe1a3f 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -592,19 +592,29 @@ static void __init balloon_add_region(unsigned long start_pfn,
}
}

+static int alloc_balloon_scratch_page(int cpu)
+{
+ if (per_cpu(balloon_scratch_page, cpu) != NULL)
+ return 0;
+
+ per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
+ if (per_cpu(balloon_scratch_page, cpu) == NULL) {
+ pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+
static int balloon_cpu_notify(struct notifier_block *self,
unsigned long action, void *hcpu)
{
int cpu = (long)hcpu;
switch (action) {
case CPU_UP_PREPARE:
- if (per_cpu(balloon_scratch_page, cpu) != NULL)
- break;
- per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
- if (per_cpu(balloon_scratch_page, cpu) == NULL) {
- pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+ if (alloc_balloon_scratch_page(cpu))
return NOTIFY_BAD;
- }
break;
default:
break;
@@ -624,15 +634,16 @@ static int __init balloon_init(void)
return -ENODEV;

if (!xen_feature(XENFEAT_auto_translated_physmap)) {
- for_each_online_cpu(cpu)
- {
- per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
- if (per_cpu(balloon_scratch_page, cpu) == NULL) {
- pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+ register_cpu_notifier(&balloon_cpu_notifier);
+
+ get_online_cpus();
+ for_each_online_cpu(cpu) {
+ if (alloc_balloon_scratch_page(cpu)) {
+ put_online_cpus();
return -ENOMEM;
}
}
- register_cpu_notifier(&balloon_cpu_notifier);
+ put_online_cpus();
}

pr_info("Initialising balloon driver\n");

2014-02-05 22:18:15

by Srivatsa S. Bhat

Subject: [PATCH 45/51] md, raid5: Fix CPU hotplug callback registration

From: Oleg Nesterov <[email protected]>

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Interestingly, the raid5 code can actually prevent double initialization and
hence can use the following simplified form of callback registration:

register_cpu_notifier(&foobar_cpu_notifier);

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

put_online_cpus();

A hotplug operation that occurs between registering the notifier and calling
get_online_cpus() won't disrupt anything, because the code takes care to
perform the memory allocations only once.

So reorganize the code in raid5 this way to fix the deadlock with callback
registration.

Cc: Neil Brown <[email protected]>
Cc: [email protected]
Cc: [email protected]
[Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
free_scratch_buffer() helper to condense code further and wrote the changelog.]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/md/raid5.c | 90 +++++++++++++++++++++++++---------------------------
1 file changed, 44 insertions(+), 46 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index f1feade..16f5c21 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5514,23 +5514,43 @@ raid5_size(struct mddev *mddev, sector_t sectors, int raid_disks)
return sectors * (raid_disks - conf->max_degraded);
}

+static void free_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
+{
+ safe_put_page(percpu->spare_page);
+ kfree(percpu->scribble);
+ percpu->spare_page = NULL;
+ percpu->scribble = NULL;
+}
+
+static int alloc_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
+{
+ if (conf->level == 6 && !percpu->spare_page)
+ percpu->spare_page = alloc_page(GFP_KERNEL);
+ if (!percpu->scribble)
+ percpu->scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
+
+ if (!percpu->scribble || (conf->level == 6 && !percpu->spare_page)) {
+ free_scratch_buffer(conf, percpu);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
static void raid5_free_percpu(struct r5conf *conf)
{
- struct raid5_percpu *percpu;
unsigned long cpu;

if (!conf->percpu)
return;

- get_online_cpus();
- for_each_possible_cpu(cpu) {
- percpu = per_cpu_ptr(conf->percpu, cpu);
- safe_put_page(percpu->spare_page);
- kfree(percpu->scribble);
- }
#ifdef CONFIG_HOTPLUG_CPU
unregister_cpu_notifier(&conf->cpu_notify);
#endif
+
+ get_online_cpus();
+ for_each_possible_cpu(cpu)
+ free_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
put_online_cpus();

free_percpu(conf->percpu);
@@ -5557,15 +5577,7 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
switch (action) {
case CPU_UP_PREPARE:
case CPU_UP_PREPARE_FROZEN:
- if (conf->level == 6 && !percpu->spare_page)
- percpu->spare_page = alloc_page(GFP_KERNEL);
- if (!percpu->scribble)
- percpu->scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
-
- if (!percpu->scribble ||
- (conf->level == 6 && !percpu->spare_page)) {
- safe_put_page(percpu->spare_page);
- kfree(percpu->scribble);
+ if (alloc_scratch_buffer(conf, percpu)) {
pr_err("%s: failed memory allocation for cpu%ld\n",
__func__, cpu);
return notifier_from_errno(-ENOMEM);
@@ -5573,10 +5585,7 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
break;
case CPU_DEAD:
case CPU_DEAD_FROZEN:
- safe_put_page(percpu->spare_page);
- kfree(percpu->scribble);
- percpu->spare_page = NULL;
- percpu->scribble = NULL;
+ free_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
break;
default:
break;
@@ -5588,40 +5597,29 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
static int raid5_alloc_percpu(struct r5conf *conf)
{
unsigned long cpu;
- struct page *spare_page;
- struct raid5_percpu __percpu *allcpus;
- void *scribble;
- int err;
+ int err = 0;

- allcpus = alloc_percpu(struct raid5_percpu);
- if (!allcpus)
+ conf->percpu = alloc_percpu(struct raid5_percpu);
+ if (!conf->percpu)
return -ENOMEM;
- conf->percpu = allcpus;
+
+#ifdef CONFIG_HOTPLUG_CPU
+ conf->cpu_notify.notifier_call = raid456_cpu_notify;
+ conf->cpu_notify.priority = 0;
+ err = register_cpu_notifier(&conf->cpu_notify);
+ if (err)
+ return err;
+#endif

get_online_cpus();
- err = 0;
for_each_present_cpu(cpu) {
- if (conf->level == 6) {
- spare_page = alloc_page(GFP_KERNEL);
- if (!spare_page) {
- err = -ENOMEM;
- break;
- }
- per_cpu_ptr(conf->percpu, cpu)->spare_page = spare_page;
- }
- scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
- if (!scribble) {
- err = -ENOMEM;
+ err = alloc_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
+ if (err) {
+ pr_err("%s: failed memory allocation for cpu%ld\n",
+ __func__, cpu);
break;
}
- per_cpu_ptr(conf->percpu, cpu)->scribble = scribble;
}
-#ifdef CONFIG_HOTPLUG_CPU
- conf->cpu_notify.notifier_call = raid456_cpu_notify;
- conf->cpu_notify.priority = 0;
- if (err == 0)
- err = register_cpu_notifier(&conf->cpu_notify);
-#endif
put_online_cpus();

return err;

2014-02-05 22:18:26

by Srivatsa S. Bhat

Subject: [PATCH 46/51] trace, ring-buffer: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the tracing ring-buffer code by using this latter form of callback
registration.

Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

kernel/trace/ring_buffer.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 294b8a2..ca3eb61 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1301,7 +1301,7 @@ struct ring_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags,
* In that off case, we need to allocate for all possible cpus.
*/
#ifdef CONFIG_HOTPLUG_CPU
- get_online_cpus();
+ cpu_maps_update_begin();
cpumask_copy(buffer->cpumask, cpu_online_mask);
#else
cpumask_copy(buffer->cpumask, cpu_possible_mask);
@@ -1324,10 +1324,10 @@ struct ring_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags,
#ifdef CONFIG_HOTPLUG_CPU
buffer->cpu_notify.notifier_call = rb_cpu_notify;
buffer->cpu_notify.priority = 0;
- register_cpu_notifier(&buffer->cpu_notify);
+ __register_cpu_notifier(&buffer->cpu_notify);
+ cpu_maps_update_done();
#endif

- put_online_cpus();
mutex_init(&buffer->mutex);

return buffer;
@@ -1341,7 +1341,9 @@ struct ring_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags,

fail_free_cpumask:
free_cpumask_var(buffer->cpumask);
- put_online_cpus();
+#ifdef CONFIG_HOTPLUG_CPU
+ cpu_maps_update_done();
+#endif

fail_free_buffer:
kfree(buffer);
@@ -1358,16 +1360,17 @@ ring_buffer_free(struct ring_buffer *buffer)
{
int cpu;

- get_online_cpus();
-
#ifdef CONFIG_HOTPLUG_CPU
- unregister_cpu_notifier(&buffer->cpu_notify);
+ cpu_maps_update_begin();
+ __unregister_cpu_notifier(&buffer->cpu_notify);
#endif

for_each_buffer_cpu(buffer, cpu)
rb_free_cpu_buffer(buffer->buffers[cpu]);

- put_online_cpus();
+#ifdef CONFIG_HOTPLUG_CPU
+ cpu_maps_update_done();
+#endif

kfree(buffer->buffers);
free_cpumask_var(buffer->cpumask);

2014-02-05 22:18:39

by Srivatsa S. Bhat

Subject: [PATCH 47/51] profile: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the profile code by using this latter form of callback registration.

Cc: Al Viro <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

kernel/profile.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/kernel/profile.c b/kernel/profile.c
index 6631e1e..df27769 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -591,18 +591,28 @@ out_cleanup:
int __ref create_proc_profile(void) /* false positive from hotcpu_notifier */
{
struct proc_dir_entry *entry;
+ int err = 0;

if (!prof_on)
return 0;
- if (create_hash_tables())
- return -ENOMEM;
+
+ cpu_maps_update_begin();
+
+ if (create_hash_tables()) {
+ err = -ENOMEM;
+ goto out;
+ }
+
entry = proc_create("profile", S_IWUSR | S_IRUGO,
NULL, &proc_profile_operations);
if (!entry)
- return 0;
+ goto out;
proc_set_size(entry, (1 + prof_len) * sizeof(atomic_t));
- hotcpu_notifier(profile_cpu_callback, 0);
- return 0;
+ __hotcpu_notifier(profile_cpu_callback, 0);
+
+out:
+ cpu_maps_update_done();
+ return err;
}
module_init(create_proc_profile);
#endif /* CONFIG_PROC_FS */

2014-02-05 22:18:56

by Srivatsa S. Bhat

Subject: [PATCH 48/51] mm, vmstat: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the vmstat code in the MM subsystem by using this latter form of callback
registration.

Cc: Andrew Morton <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Cody P Schafer <[email protected]>
Cc: Toshi Kani <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

mm/vmstat.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7249614..70668ba 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1290,14 +1290,14 @@ static int __init setup_vmstat(void)
#ifdef CONFIG_SMP
int cpu;

- register_cpu_notifier(&vmstat_notifier);
+ cpu_maps_update_begin();
+ __register_cpu_notifier(&vmstat_notifier);

- get_online_cpus();
for_each_online_cpu(cpu) {
start_cpu_timer(cpu);
node_set_state(cpu_to_node(cpu), N_CPU);
}
- put_online_cpus();
+ cpu_maps_update_done();
#endif
#ifdef CONFIG_PROC_FS
proc_create("buddyinfo", S_IRUGO, NULL, &fragmentation_file_operations);

2014-02-05 22:19:04

by Srivatsa S. Bhat

Subject: [PATCH 49/51] mm, zswap: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the zswap code by using this latter form of callback registration.

Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

mm/zswap.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index e55bab9..681fa3f 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -387,18 +387,18 @@ static int zswap_cpu_init(void)
{
unsigned long cpu;

- get_online_cpus();
+ cpu_maps_update_begin();
for_each_online_cpu(cpu)
if (__zswap_cpu_notifier(CPU_UP_PREPARE, cpu) != NOTIFY_OK)
goto cleanup;
- register_cpu_notifier(&zswap_cpu_notifier_block);
- put_online_cpus();
+ __register_cpu_notifier(&zswap_cpu_notifier_block);
+ cpu_maps_update_done();
return 0;

cleanup:
for_each_online_cpu(cpu)
__zswap_cpu_notifier(CPU_UP_CANCELED, cpu);
- put_online_cpus();
+ cpu_maps_update_done();
return -ENOMEM;
}

2014-02-05 22:19:15

by Srivatsa S. Bhat

Subject: [PATCH 50/51] net/core/flow.c: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the code in net/core/flow.c by using this latter form of callback
registration.

Cc: "David S. Miller" <[email protected]>
Cc: Li RongQing <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

net/core/flow.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/core/flow.c b/net/core/flow.c
index dfa602c..0f2c995 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -456,6 +456,8 @@ static int __init flow_cache_init(struct flow_cache *fc)
if (!fc->percpu)
return -ENOMEM;

+ cpu_maps_update_begin();
+
for_each_online_cpu(i) {
if (flow_cache_cpu_prepare(fc, i))
goto err;
@@ -463,7 +465,9 @@ static int __init flow_cache_init(struct flow_cache *fc)
fc->hotcpu_notifier = (struct notifier_block){
.notifier_call = flow_cache_cpu,
};
- register_hotcpu_notifier(&fc->hotcpu_notifier);
+ __register_hotcpu_notifier(&fc->hotcpu_notifier);
+
+ cpu_maps_update_done();

setup_timer(&fc->rnd_timer, flow_cache_new_hashrnd,
(unsigned long) fc);
@@ -479,6 +483,8 @@ err:
fcp->hash_table = NULL;
}

+ cpu_maps_update_done();
+
free_percpu(fc->percpu);
fc->percpu = NULL;

2014-02-05 22:19:21

by Srivatsa S. Bhat

Subject: [PATCH 51/51] net/iucv/iucv.c: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the code in net/iucv/iucv.c by using this latter form of callback
registration. Also, provide helper functions to perform the common memory
allocations and frees, to condense repetitive code.
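
The shape of such helpers can be sketched in user space as an all-or-nothing
allocator with a single cleanup path. This is only a model: malloc()/free()
stand in for kmalloc_node()/kfree(), and the arrays, BUF_SIZE, the
allocs_allowed test hook and the model_* functions are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */
#define BUF_SIZE 64	/* hypothetical buffer size */

static void *irq_data[NR_CPUS];
static void *param[NR_CPUS];
static void *param_irq[NR_CPUS];

/* Test hook: allocations still allowed to succeed; negative means no limit. */
static int allocs_allowed = -1;

static void *model_try_alloc(size_t size)
{
	if (allocs_allowed == 0)
		return NULL;
	if (allocs_allowed > 0)
		allocs_allowed--;
	return malloc(size);
}

/* One place that frees everything; free(NULL) is a no-op, like kfree(NULL). */
static void model_free_cpu_data(int cpu)
{
	free(param_irq[cpu]);
	param_irq[cpu] = NULL;
	free(param[cpu]);
	param[cpu] = NULL;
	free(irq_data[cpu]);
	irq_data[cpu] = NULL;
}

/* Either all three buffers exist on return, or none of them do. */
static int model_alloc_cpu_data(int cpu)
{
	irq_data[cpu] = model_try_alloc(BUF_SIZE);
	if (!irq_data[cpu])
		goto out_free;

	param[cpu] = model_try_alloc(BUF_SIZE);
	if (!param[cpu])
		goto out_free;

	param_irq[cpu] = model_try_alloc(BUF_SIZE);
	if (!param_irq[cpu])
		goto out_free;

	return 0;

out_free:
	model_free_cpu_data(cpu);
	return -ENOMEM;
}
```

With the cleanup centralized like this, the notifier, the init path and the
exit path can all share the same two calls instead of repeating the
kfree()/NULL sequences.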

Cc: Ursula Braun <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

net/iucv/iucv.c | 121 ++++++++++++++++++++++++++-----------------------------
1 file changed, 57 insertions(+), 64 deletions(-)

diff --git a/net/iucv/iucv.c b/net/iucv/iucv.c
index cd5b8ec..f92348b 100644
--- a/net/iucv/iucv.c
+++ b/net/iucv/iucv.c
@@ -621,6 +621,42 @@ static void iucv_disable(void)
put_online_cpus();
}

+static void free_iucv_data(int cpu)
+{
+ kfree(iucv_param_irq[cpu]);
+ iucv_param_irq[cpu] = NULL;
+ kfree(iucv_param[cpu]);
+ iucv_param[cpu] = NULL;
+ kfree(iucv_irq_data[cpu]);
+ iucv_irq_data[cpu] = NULL;
+}
+
+static int alloc_iucv_data(int cpu)
+{
+ /* Note: GFP_DMA used to get memory below 2G */
+ iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
+ GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+ if (!iucv_irq_data[cpu])
+ goto out_free;
+
+ /* Allocate parameter blocks. */
+ iucv_param[cpu] = kmalloc_node(sizeof(union iucv_param),
+ GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+ if (!iucv_param[cpu])
+ goto out_free;
+
+ iucv_param_irq[cpu] = kmalloc_node(sizeof(union iucv_param),
+ GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+ if (!iucv_param_irq[cpu])
+ goto out_free;
+
+ return 0;
+
+out_free:
+ free_iucv_data(cpu);
+ return -ENOMEM;
+}
+
static int iucv_cpu_notify(struct notifier_block *self,
unsigned long action, void *hcpu)
{
@@ -630,38 +666,14 @@ static int iucv_cpu_notify(struct notifier_block *self,
switch (action) {
case CPU_UP_PREPARE:
case CPU_UP_PREPARE_FROZEN:
- iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
- GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
- if (!iucv_irq_data[cpu])
- return notifier_from_errno(-ENOMEM);
-
- iucv_param[cpu] = kmalloc_node(sizeof(union iucv_param),
- GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
- if (!iucv_param[cpu]) {
- kfree(iucv_irq_data[cpu]);
- iucv_irq_data[cpu] = NULL;
+ if (alloc_iucv_data(cpu))
return notifier_from_errno(-ENOMEM);
- }
- iucv_param_irq[cpu] = kmalloc_node(sizeof(union iucv_param),
- GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
- if (!iucv_param_irq[cpu]) {
- kfree(iucv_param[cpu]);
- iucv_param[cpu] = NULL;
- kfree(iucv_irq_data[cpu]);
- iucv_irq_data[cpu] = NULL;
- return notifier_from_errno(-ENOMEM);
- }
break;
case CPU_UP_CANCELED:
case CPU_UP_CANCELED_FROZEN:
case CPU_DEAD:
case CPU_DEAD_FROZEN:
- kfree(iucv_param_irq[cpu]);
- iucv_param_irq[cpu] = NULL;
- kfree(iucv_param[cpu]);
- iucv_param[cpu] = NULL;
- kfree(iucv_irq_data[cpu]);
- iucv_irq_data[cpu] = NULL;
+ free_iucv_data(cpu);
break;
case CPU_ONLINE:
case CPU_ONLINE_FROZEN:
@@ -2025,33 +2037,20 @@ static int __init iucv_init(void)
goto out_int;
}

- for_each_online_cpu(cpu) {
- /* Note: GFP_DMA used to get memory below 2G */
- iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
- GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
- if (!iucv_irq_data[cpu]) {
- rc = -ENOMEM;
- goto out_free;
- }
+ cpu_maps_update_begin();

- /* Allocate parameter blocks. */
- iucv_param[cpu] = kmalloc_node(sizeof(union iucv_param),
- GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
- if (!iucv_param[cpu]) {
- rc = -ENOMEM;
- goto out_free;
- }
- iucv_param_irq[cpu] = kmalloc_node(sizeof(union iucv_param),
- GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
- if (!iucv_param_irq[cpu]) {
+ for_each_online_cpu(cpu) {
+ if (alloc_iucv_data(cpu)) {
rc = -ENOMEM;
goto out_free;
}
-
}
- rc = register_hotcpu_notifier(&iucv_cpu_notifier);
+ rc = __register_hotcpu_notifier(&iucv_cpu_notifier);
if (rc)
goto out_free;
+
+ cpu_maps_update_done();
+
rc = register_reboot_notifier(&iucv_reboot_notifier);
if (rc)
goto out_cpu;
@@ -2069,16 +2068,14 @@ static int __init iucv_init(void)
out_reboot:
unregister_reboot_notifier(&iucv_reboot_notifier);
out_cpu:
- unregister_hotcpu_notifier(&iucv_cpu_notifier);
+ cpu_maps_update_begin();
+ __unregister_hotcpu_notifier(&iucv_cpu_notifier);
out_free:
- for_each_possible_cpu(cpu) {
- kfree(iucv_param_irq[cpu]);
- iucv_param_irq[cpu] = NULL;
- kfree(iucv_param[cpu]);
- iucv_param[cpu] = NULL;
- kfree(iucv_irq_data[cpu]);
- iucv_irq_data[cpu] = NULL;
- }
+ for_each_possible_cpu(cpu)
+ free_iucv_data(cpu);
+
+ cpu_maps_update_done();
+
root_device_unregister(iucv_root);
out_int:
unregister_external_interrupt(0x4000, iucv_external_interrupt);
@@ -2105,15 +2102,11 @@ static void __exit iucv_exit(void)
kfree(p);
spin_unlock_irq(&iucv_queue_lock);
unregister_reboot_notifier(&iucv_reboot_notifier);
- unregister_hotcpu_notifier(&iucv_cpu_notifier);
- for_each_possible_cpu(cpu) {
- kfree(iucv_param_irq[cpu]);
- iucv_param_irq[cpu] = NULL;
- kfree(iucv_param[cpu]);
- iucv_param[cpu] = NULL;
- kfree(iucv_irq_data[cpu]);
- iucv_irq_data[cpu] = NULL;
- }
+ cpu_maps_update_begin();
+ __unregister_hotcpu_notifier(&iucv_cpu_notifier);
+ for_each_possible_cpu(cpu)
+ free_iucv_data(cpu);
+ cpu_maps_update_done();
root_device_unregister(iucv_root);
bus_unregister(&iucv_bus);
unregister_external_interrupt(0x4000, iucv_external_interrupt);

2014-02-05 22:21:20

by Srivatsa S. Bhat

Subject: [PATCH 38/51] intel-idle: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_maps_update_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_maps_update_done();


Fix the intel-idle code by using this latter form of callback registration.

Cc: Len Brown <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/idle/intel_idle.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 8e1939f..716ee5a 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -681,14 +681,19 @@ static int __init intel_idle_init(void)
if (intel_idle_cpuidle_devices == NULL)
return -ENOMEM;

+ cpu_maps_update_begin();
+
for_each_online_cpu(i) {
retval = intel_idle_cpu_init(i);
if (retval) {
+ cpu_maps_update_done();
cpuidle_unregister_driver(&intel_idle_driver);
return retval;
}
}
- register_cpu_notifier(&cpu_hotplug_notifier);
+ __register_cpu_notifier(&cpu_hotplug_notifier);
+
+ cpu_maps_update_done();

return 0;
}
@@ -698,10 +703,13 @@ static void __exit intel_idle_exit(void)
intel_idle_cpuidle_devices_uninit();
cpuidle_unregister_driver(&intel_idle_driver);

+ cpu_maps_update_begin();

if (lapic_timer_reliable_states != LAPIC_TIMER_ALWAYS_RELIABLE)
on_each_cpu(__setup_broadcast_timer, (void *)false, 1);
- unregister_cpu_notifier(&cpu_hotplug_notifier);
+ __unregister_cpu_notifier(&cpu_hotplug_notifier);
+
+ cpu_maps_update_done();

return;
}

2014-02-05 22:26:10

by Srivatsa S. Bhat

Subject: [PATCH 02/51] Doc/cpu-hotplug: Specify race-free way to register CPU hotplug callbacks

Recommend the usage of the new CPU hotplug callback registration APIs
(__register_cpu_notifier() etc), when subsystems need to also perform
initialization for already online CPUs. Provide examples of correct
and race-free ways of achieving this, and point out the kinds of code
that are error-prone.

Cc: Rob Landley <[email protected]>
Cc: [email protected]
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

Documentation/cpu-hotplug.txt | 45 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)

diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt
index be675d2..42831b7 100644
--- a/Documentation/cpu-hotplug.txt
+++ b/Documentation/cpu-hotplug.txt
@@ -312,12 +312,57 @@ things will happen if a notifier in path sent a BAD notify code.
Q: I don't see my action being called for all CPUs already up and running?
A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
If you need to perform some action for each cpu already in the system, then
+ do this:

for_each_online_cpu(i) {
foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE, i);
foobar_cpu_callback(&foobar_cpu_notifier, CPU_ONLINE, i);
}

+ However, if you want to register a hotplug callback, as well as perform
+ some initialization for CPUs that are already online, then do this:
+
+ Version 1: (Correct)
+ ---------
+
+ cpu_maps_update_begin();
+
+ for_each_online_cpu(i) {
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_UP_PREPARE, i);
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_ONLINE, i);
+ }
+
+ /* Note the use of the double underscored version of the API */
+ __register_cpu_notifier(&foobar_cpu_notifier);
+
+ cpu_maps_update_done();
+
+ Note that the following code is *NOT* the right way to achieve this,
+ because it is prone to an ABBA deadlock between the cpu_add_remove_lock
+ and the cpu_hotplug.lock.
+
+ Version 2: (Wrong!)
+ ---------
+
+ get_online_cpus();
+
+ for_each_online_cpu(i) {
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_UP_PREPARE, i);
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_ONLINE, i);
+ }
+
+ register_cpu_notifier(&foobar_cpu_notifier);
+
+ put_online_cpus();
+
+ So always use the first version shown above when you want to register
+ callbacks as well as initialize the already online CPUs.
+
+
Q: If i would like to develop cpu hotplug support for a new architecture,
what do i need at a minimum?
A: The following are what is required for CPU hotplug infrastructure to work

2014-02-05 23:41:43

by Steven Rostedt

Subject: Re: [PATCH 46/51] trace, ring-buffer: Fix CPU hotplug callback registration

On Thu, 06 Feb 2014 03:42:58 +0530
"Srivatsa S. Bhat" <[email protected]> wrote:

> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* Note the use of the double underscored version of the API */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
>
> Fix the tracing ring-buffer code by using this latter form of callback
> registration.
>
> Cc: Steven Rostedt <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>

Acked-by: Steven Rostedt <[email protected]>

-- Steve

2014-02-06 00:44:14

by Guenter Roeck

Subject: Re: [PATCH 42/51] hwmon, coretemp: Fix CPU hotplug callback registration

On Thu, Feb 06, 2014 at 03:42:06AM +0530, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* Note the use of the double underscored version of the API */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
>
> Fix the hwmon coretemp code by using this latter form of callback
> registration.
>
> Cc: Fenghua Yu <[email protected]>
> Cc: Jean Delvare <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Applied.

Guenter

2014-02-06 00:44:33

by Guenter Roeck

Subject: Re: [PATCH 43/51] hwmon, via-cputemp: Fix CPU hotplug callback registration

On Thu, Feb 06, 2014 at 03:42:19AM +0530, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* Note the use of the double underscored version of the API */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
>
> Fix the hwmon via-cputemp code by using this latter form of callback
> registration.
>
> Cc: Jean Delvare <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Applied.

Thanks,
Guenter

2014-02-06 01:12:33

by NeilBrown

Subject: Re: [PATCH 45/51] md, raid5: Fix CPU hotplug callback registration

On Thu, 06 Feb 2014 03:42:45 +0530 "Srivatsa S. Bhat"
<[email protected]> wrote:

> From: Oleg Nesterov <[email protected]>
>
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Interestingly, the raid5 code can actually prevent double initialization and
> hence can use the following simplified form of callback registration:
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> put_online_cpus();
>
> A hotplug operation that occurs between registering the notifier and calling
> get_online_cpus(), won't disrupt anything, because the code takes care to
> perform the memory allocations only once.
>
> So reorganize the code in raid5 this way to fix the deadlock with callback
> registration.
>
> Cc: Neil Brown <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> [Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
> free_scratch_buffer() helper to condense code further and wrote the changelog.]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> drivers/md/raid5.c | 90 +++++++++++++++++++++++++---------------------------
> 1 file changed, 44 insertions(+), 46 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index f1feade..16f5c21 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -5514,23 +5514,43 @@ raid5_size(struct mddev *mddev, sector_t sectors, int raid_disks)
> return sectors * (raid_disks - conf->max_degraded);
> }
>
> +static void free_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
> +{
> + safe_put_page(percpu->spare_page);
> + kfree(percpu->scribble);
> + percpu->spare_page = NULL;
> + percpu->scribble = NULL;
> +}
> +
> +static int alloc_scratch_buffer(struct r5conf *conf, struct raid5_percpu *percpu)
> +{
> + if (conf->level == 6 && !percpu->spare_page)
> + percpu->spare_page = alloc_page(GFP_KERNEL);
> + if (!percpu->scribble)
> + percpu->scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
> +
> + if (!percpu->scribble || (conf->level == 6 && !percpu->spare_page)) {
> + free_scratch_buffer(conf, percpu);
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> static void raid5_free_percpu(struct r5conf *conf)
> {
> - struct raid5_percpu *percpu;
> unsigned long cpu;
>
> if (!conf->percpu)
> return;
>
> - get_online_cpus();
> - for_each_possible_cpu(cpu) {
> - percpu = per_cpu_ptr(conf->percpu, cpu);
> - safe_put_page(percpu->spare_page);
> - kfree(percpu->scribble);
> - }
> #ifdef CONFIG_HOTPLUG_CPU
> unregister_cpu_notifier(&conf->cpu_notify);
> #endif
> +
> + get_online_cpus();
> + for_each_possible_cpu(cpu)
> + free_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
> put_online_cpus();
>
> free_percpu(conf->percpu);
> @@ -5557,15 +5577,7 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
> switch (action) {
> case CPU_UP_PREPARE:
> case CPU_UP_PREPARE_FROZEN:
> - if (conf->level == 6 && !percpu->spare_page)
> - percpu->spare_page = alloc_page(GFP_KERNEL);
> - if (!percpu->scribble)
> - percpu->scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
> -
> - if (!percpu->scribble ||
> - (conf->level == 6 && !percpu->spare_page)) {
> - safe_put_page(percpu->spare_page);
> - kfree(percpu->scribble);
> + if (alloc_scratch_buffer(conf, percpu)) {
> pr_err("%s: failed memory allocation for cpu%ld\n",
> __func__, cpu);
> return notifier_from_errno(-ENOMEM);
> @@ -5573,10 +5585,7 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
> break;
> case CPU_DEAD:
> case CPU_DEAD_FROZEN:
> - safe_put_page(percpu->spare_page);
> - kfree(percpu->scribble);
> - percpu->spare_page = NULL;
> - percpu->scribble = NULL;
> + free_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
> break;
> default:
> break;
> @@ -5588,40 +5597,29 @@ static int raid456_cpu_notify(struct notifier_block *nfb, unsigned long action,
> static int raid5_alloc_percpu(struct r5conf *conf)
> {
> unsigned long cpu;
> - struct page *spare_page;
> - struct raid5_percpu __percpu *allcpus;
> - void *scribble;
> - int err;
> + int err = 0;
>
> - allcpus = alloc_percpu(struct raid5_percpu);
> - if (!allcpus)
> + conf->percpu = alloc_percpu(struct raid5_percpu);
> + if (!conf->percpu)
> return -ENOMEM;
> - conf->percpu = allcpus;
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> + conf->cpu_notify.notifier_call = raid456_cpu_notify;
> + conf->cpu_notify.priority = 0;
> + err = register_cpu_notifier(&conf->cpu_notify);
> + if (err)
> + return err;
> +#endif
>
> get_online_cpus();
> - err = 0;
> for_each_present_cpu(cpu) {
> - if (conf->level == 6) {
> - spare_page = alloc_page(GFP_KERNEL);
> - if (!spare_page) {
> - err = -ENOMEM;
> - break;
> - }
> - per_cpu_ptr(conf->percpu, cpu)->spare_page = spare_page;
> - }
> - scribble = kmalloc(conf->scribble_len, GFP_KERNEL);
> - if (!scribble) {
> - err = -ENOMEM;
> + err = alloc_scratch_buffer(conf, per_cpu_ptr(conf->percpu, cpu));
> + if (err) {
> + pr_err("%s: failed memory allocation for cpu%ld\n",
> + __func__, cpu);
> break;
> }
> - per_cpu_ptr(conf->percpu, cpu)->scribble = scribble;
> }
> -#ifdef CONFIG_HOTPLUG_CPU
> - conf->cpu_notify.notifier_call = raid456_cpu_notify;
> - conf->cpu_notify.priority = 0;
> - if (err == 0)
> - err = register_cpu_notifier(&conf->cpu_notify);
> -#endif
> put_online_cpus();
>
> return err;


Looks good, thanks.
Shall I wait for a signed-off-by from Oleg, then queue it through my md tree?

NeilBrown



2014-02-06 01:19:50

by Boris Ostrovsky

Subject: Re: [PATCH 44/51] xen, balloon: Fix CPU hotplug callback registration


----- [email protected] wrote:

> Subsystems that want to register CPU hotplug callbacks, as well as
> perform
> initialization for the CPUs that are already online, often do it as
> shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running
> concurrently
> with CPU hotplug operations).
>
> Interestingly, the balloon code in xen can actually prevent double
> initialization and hence can use the following simplified form of
> callback
> registration:
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> put_online_cpus();
>
> A hotplug operation that occurs between registering the notifier and
> calling
> get_online_cpus(), won't disrupt anything, because the code takes care
> to
> perform the memory allocations only once.
>
> So reorganize the balloon code in xen this way to fix the deadlock
> with
> callback registration.
>
> Cc: Konrad Rzeszutek Wilk <[email protected]>
> Cc: Boris Ostrovsky <[email protected]>
> Cc: David Vrabel <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> drivers/xen/balloon.c | 35 +++++++++++++++++++++++------------
> 1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 37d06ea..afe1a3f 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -592,19 +592,29 @@ static void __init balloon_add_region(unsigned
> long start_pfn,
> }
> }
>
> +static int alloc_balloon_scratch_page(int cpu)
> +{
> + if (per_cpu(balloon_scratch_page, cpu) != NULL)
> + return 0;
> +
> + per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
> + if (per_cpu(balloon_scratch_page, cpu) == NULL) {
> + pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n",
> cpu);
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> +
> static int balloon_cpu_notify(struct notifier_block *self,
> unsigned long action, void *hcpu)
> {
> int cpu = (long)hcpu;
> switch (action) {
> case CPU_UP_PREPARE:
> - if (per_cpu(balloon_scratch_page, cpu) != NULL)
> - break;
> - per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
> - if (per_cpu(balloon_scratch_page, cpu) == NULL) {
> - pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n",
> cpu);
> + if (alloc_balloon_scratch_page(cpu))
> return NOTIFY_BAD;
> - }
> break;
> default:
> break;
> @@ -624,15 +634,16 @@ static int __init balloon_init(void)
> return -ENODEV;
>
> if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> - for_each_online_cpu(cpu)
> - {
> - per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
> - if (per_cpu(balloon_scratch_page, cpu) == NULL) {
> - pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n",
> cpu);
> + register_cpu_notifier(&balloon_cpu_notifier);
> +
> + get_online_cpus();
> + for_each_online_cpu(cpu) {
> + if (alloc_balloon_scratch_page(cpu)) {
> + put_online_cpus();
> return -ENOMEM;


Not that the original code was doing a particularly thorough job of cleaning up on allocation failure, but if it couldn't get memory it would not register the notifier. So perhaps you should unregister it before returning here.

I am also not sure how we were susceptible to the deadlock here, since we didn't call get_online_cpus(). (We probably should have, but then the commit description should say so.)
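
To make the first point concrete, here is a rough userspace sketch of an error path that undoes the registration when an allocation fails. It is not the actual balloon code: every *_stub function, the fail_on_cpu knob and balloon_init_sketch() are hypothetical stand-ins that only record what happened. The unregister call is placed after put_online_cpus() on the assumption that unregister_cpu_notifier() takes the cpu_add_remove_lock, so calling it under get_online_cpus() would recreate the same lock ordering problem.

```c
#include <errno.h>

/*
 * Hypothetical userspace stand-ins for the kernel calls used by
 * balloon_init(); they only record state so the flow can be checked.
 */
static int notifier_registered;
static int hotplug_readers;
static int fail_on_cpu = -1;	/* CPU whose allocation should fail, -1 for none */

static void register_cpu_notifier_stub(void)   { notifier_registered = 1; }
static void unregister_cpu_notifier_stub(void) { notifier_registered = 0; }
static void get_online_cpus_stub(void)         { hotplug_readers++; }
static void put_online_cpus_stub(void)         { hotplug_readers--; }

static int alloc_scratch_page_stub(int cpu)
{
	return cpu == fail_on_cpu ? -ENOMEM : 0;
}

/* Register first, then initialize; undo the registration on failure. */
static int balloon_init_sketch(int nr_cpus)
{
	int cpu, err = 0;

	register_cpu_notifier_stub();

	get_online_cpus_stub();
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		err = alloc_scratch_page_stub(cpu);
		if (err)
			break;
	}
	put_online_cpus_stub();

	/* Unregister only after dropping the hotplug read-side reference. */
	if (err)
		unregister_cpu_notifier_stub();

	return err;
}
```

The sketch leaves out freeing the pages that were already allocated; whether that is needed depends on how the rest of the driver treats a failed init.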

-boris

> }
> }
> - register_cpu_notifier(&balloon_cpu_notifier);
> + put_online_cpus();
> }
>
> pr_info("Initialising balloon driver\n");

2014-02-06 01:25:35

by Guenter Roeck

Subject: Re: [PATCH 42/51] hwmon, coretemp: Fix CPU hotplug callback registration

On Wed, Feb 05, 2014 at 04:44:12PM -0800, Guenter Roeck wrote:
> On Thu, Feb 06, 2014 at 03:42:06AM +0530, Srivatsa S. Bhat wrote:
> > Subsystems that want to register CPU hotplug callbacks, as well as perform
> > initialization for the CPUs that are already online, often do it as shown
> > below:
> >
> > get_online_cpus();
> >
> > for_each_online_cpu(cpu)
> > init_cpu(cpu);
> >
> > register_cpu_notifier(&foobar_cpu_notifier);
> >
> > put_online_cpus();
> >
> > This is wrong, since it is prone to ABBA deadlocks involving the
> > cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> > with CPU hotplug operations).
> >
> > Instead, the correct and race-free way of performing the callback
> > registration is:
> >
> > cpu_maps_update_begin();
> >
> > for_each_online_cpu(cpu)
> > init_cpu(cpu);
> >
> > /* Note the use of the double underscored version of the API */
> > __register_cpu_notifier(&foobar_cpu_notifier);
> >
> > cpu_maps_update_done();
> >
> >
> > Fix the hwmon coretemp code by using this latter form of callback
> > registration.
> >
> > Cc: Fenghua Yu <[email protected]>
> > Cc: Jean Delvare <[email protected]>
> > Cc: Guenter Roeck <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Srivatsa S. Bhat <[email protected]>
>
> Applied.
>
That obviously doesn't build ;-). Replace with

Acked-by: Guenter Roeck <[email protected]>

Guenter

2014-02-06 01:26:10

by Guenter Roeck

Subject: Re: [PATCH 43/51] hwmon, via-cputemp: Fix CPU hotplug callback registration

On Wed, Feb 05, 2014 at 04:44:31PM -0800, Guenter Roeck wrote:
> On Thu, Feb 06, 2014 at 03:42:19AM +0530, Srivatsa S. Bhat wrote:
> > Subsystems that want to register CPU hotplug callbacks, as well as perform
> > initialization for the CPUs that are already online, often do it as shown
> > below:
> >
> > get_online_cpus();
> >
> > for_each_online_cpu(cpu)
> > init_cpu(cpu);
> >
> > register_cpu_notifier(&foobar_cpu_notifier);
> >
> > put_online_cpus();
> >
> > This is wrong, since it is prone to ABBA deadlocks involving the
> > cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> > with CPU hotplug operations).
> >
> > Instead, the correct and race-free way of performing the callback
> > registration is:
> >
> > cpu_maps_update_begin();
> >
> > for_each_online_cpu(cpu)
> > init_cpu(cpu);
> >
> > /* Note the use of the double underscored version of the API */
> > __register_cpu_notifier(&foobar_cpu_notifier);
> >
> > cpu_maps_update_done();
> >
> >
> > Fix the hwmon via-cputemp code by using this latter form of callback
> > registration.
> >
> > Cc: Jean Delvare <[email protected]>
> > Cc: Guenter Roeck <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Srivatsa S. Bhat <[email protected]>
>
> Applied.
>
Same here ...

Acked-by: Guenter Roeck <[email protected]>

2014-02-06 09:39:28

by Gautham R Shenoy

Subject: Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

Hi,

On Thu, Feb 06, 2014 at 03:34:36AM +0530, Srivatsa S. Bhat wrote:
> Hi,
>
>
> To solve these issues and provide a race-free method to register CPU hotplug
> callbacks, this patchset introduces new variants of the callback registration
> APIs that don't hold the cpu_add_remove_lock, and exports the
> cpu_add_remove_lock via cpu_maps_update_begin/done() for use by various
> subsystems. With this in place, the following code snippet will register a
> hotplug callback as well as initialize already online CPUs without any race
> conditions.
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* This doesn't take the cpu_add_remove_lock */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>

Couple of comments:

Right now, cpu_add_remove_lock is being used to
1) Serialize the cpu-hotplug writers.

2) Serialize accesses to cpu_present/possible_map.

3) Serialize updates to the cpu_chain (the cpu hotplug notifier chain).

- This is necessary to ensure that registration of notifiers and
invocation of CPU_POST_DEAD notifications don't race with each
other. Else we could have used get/put_online_cpus() in
register_cpu_notifier() and this patch series wouldn't have been
necessary.

4) Bulk cpu-hotplug (disable/enable_non_boot_cpus), but this is a
special case of 1).

The CPU_POST_DEAD notification is invoked with the cpu_hotplug.lock
dropped. This was necessary for subsystems which would be waiting for
some other thread to finish some work, and that other thread could
invoke get_online_cpus(). If CPU_POST_DEAD notification were issued
without dropping the cpu_hotplug.lock, this would lead to a deadlock
as the notifier would be left stuck waiting for the thread which is
blocked in get_online_cpus().

It was introduced to ensure that multithreaded workqueues can safely
use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
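
The ordering described above can be modelled in a few lines of userspace C. This is only a sketch: the two locks are plain flags, and cpu_down_model(), notify_model() and the MODEL_* names are made up for illustration. It assumes, as in the CPU-offline path of kernels from this period, that CPU_DEAD is sent before cpu_hotplug_done() and CPU_POST_DEAD after it, with the cpu_add_remove_lock held across both.

```c
/* Hypothetical model of which locks are held when each notification fires. */
enum model_event { MODEL_CPU_DEAD, MODEL_CPU_POST_DEAD, MODEL_NR_EVENTS };

static int add_remove_lock_held;	/* models cpu_add_remove_lock */
static int hotplug_lock_held;		/* models cpu_hotplug.lock */

static int add_remove_held_at[MODEL_NR_EVENTS];
static int hotplug_held_at[MODEL_NR_EVENTS];

/* Record the lock state seen by a notifier for the given event. */
static void notify_model(enum model_event event)
{
	add_remove_held_at[event] = add_remove_lock_held;
	hotplug_held_at[event] = hotplug_lock_held;
}

/* Offline path, reduced to lock transitions and notifications. */
static void cpu_down_model(void)
{
	add_remove_lock_held = 1;		/* cpu_maps_update_begin() */
	hotplug_lock_held = 1;			/* cpu_hotplug_begin() */

	notify_model(MODEL_CPU_DEAD);

	hotplug_lock_held = 0;			/* cpu_hotplug_done() */

	/* get_online_cpus() callers can make progress from here on. */
	notify_model(MODEL_CPU_POST_DEAD);

	add_remove_lock_held = 0;		/* cpu_maps_update_done() */
}
```

In this model the only lock still held at CPU_POST_DEAD is the cpu_add_remove_lock, which is why notifier registration has to serialize on that lock rather than on get/put_online_cpus().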

As of now, only two subsystems use this notification and workqueues is
_not_ one of them!
* arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
* drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
I haven't yet audited these two cases to see if they really need this
to be handled in CPU_POST_DEAD or if they can be handled in CPU_DEAD.

Also can we have an alternate API, something like
cpu_hotplug_register_begin/end() instead of reusing
cpu_maps_update_begin/done() for this usage, since in most of the
patches that follow, we're not touching any of the cpu_*_maps!

>
>
> Regards,
> Srivatsa S. Bhat
> IBM Linux Technology Center

--
Thanks and Regards
gautham.

2014-02-06 10:08:39

by Srivatsa S. Bhat

Subject: Re: [PATCH 42/51] hwmon, coretemp: Fix CPU hotplug callback registration

On 02/06/2014 06:55 AM, Guenter Roeck wrote:
> On Wed, Feb 05, 2014 at 04:44:12PM -0800, Guenter Roeck wrote:
>> On Thu, Feb 06, 2014 at 03:42:06AM +0530, Srivatsa S. Bhat wrote:
>>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>>> initialization for the CPUs that are already online, often do it as shown
>>> below:
[...]
>>> Fix the hwmon coretemp code by using this latter form of callback
>>> registration.
>>>
>>> Cc: Fenghua Yu <[email protected]>
>>> Cc: Jean Delvare <[email protected]>
>>> Cc: Guenter Roeck <[email protected]>
>>> Cc: [email protected]
>>> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>>
>> Applied.
>>
> That obviously doesn't build ;-). Replace with
>
> Acked-by: Guenter Roeck <[email protected]>
>

Thanks a lot for the Ack! This patch has a dependency on Patch 1,
hence the trouble with applying it individually :(

Link to Patch 1:
http://article.gmane.org/gmane.linux.kernel/1641640

Regards,
Srivatsa S. Bhat

2014-02-06 10:10:55

by Srivatsa S. Bhat

Subject: Re: [PATCH 45/51] md, raid5: Fix CPU hotplug callback registration

On 02/06/2014 06:41 AM, NeilBrown wrote:
> On Thu, 06 Feb 2014 03:42:45 +0530 "Srivatsa S. Bhat"
> <[email protected]> wrote:
>
>> From: Oleg Nesterov <[email protected]>
>>
>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>> initialization for the CPUs that are already online, often do it as shown
>> below:
>>
>> get_online_cpus();
>>
>> for_each_online_cpu(cpu)
>> init_cpu(cpu);
>>
>> register_cpu_notifier(&foobar_cpu_notifier);
>>
>> put_online_cpus();
>>
>> This is wrong, since it is prone to ABBA deadlocks involving the
>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>> with CPU hotplug operations).
>>
>> Interestingly, the raid5 code can actually prevent double initialization and
>> hence can use the following simplified form of callback registration:
>>
>> register_cpu_notifier(&foobar_cpu_notifier);
>>
>> get_online_cpus();
>>
>> for_each_online_cpu(cpu)
>> init_cpu(cpu);
>>
>> put_online_cpus();
>>
>> A hotplug operation that occurs between registering the notifier and calling
>> get_online_cpus(), won't disrupt anything, because the code takes care to
>> perform the memory allocations only once.
>>
>> So reorganize the code in raid5 this way to fix the deadlock with callback
>> registration.
>>
>> Cc: Neil Brown <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> [Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
>> free_scratch_buffer() helper to condense code further and wrote the changelog.]
>> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>> ---
[...]
>
>
> Looks good, thanks.
> Shall I wait for a signed-off-by from Oleg, then queue it through my md tree?
>

Sure, that sounds great, since this patch doesn't have any dependency.
Thanks a lot!

Oleg, it would be great if you could kindly add your S-O-B to this patch.
Thanks!

Regards,
Srivatsa S. Bhat

2014-02-06 10:57:23

by Will Deacon

Subject: Re: [PATCH 08/51] arm, hw-breakpoint: Fix CPU hotplug callback registration

Hi Srivatsa,

On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).

Hmm, the code in question (for this patch) runs from an arch_initcall. How
can you generate CPU hotplug operations at that stage?

> Instead, the correct and race-free way of performing the callback
> registration is:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* Note the use of the double underscored version of the API */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
>
> Fix the hw-breakpoint code in arm by using this latter form of callback
> registration.

I guess you introduce __register_cpu_notifier somewhere earlier in the
series, so it's best if you take this all via your tree.

Will

2014-02-06 11:09:58

by Srivatsa S. Bhat

Subject: Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

On 02/06/2014 03:08 PM, Gautham R Shenoy wrote:
> Hi,
>
> On Thu, Feb 06, 2014 at 03:34:36AM +0530, Srivatsa S. Bhat wrote:
>> Hi,
>>
>>
>> To solve these issues and provide a race-free method to register CPU hotplug
>> callbacks, this patchset introduces new variants of the callback registration
>> APIs that don't hold the cpu_add_remove_lock, and exports the
>> cpu_add_remove_lock via cpu_maps_update_begin/done() for use by various
>> subsystems. With this in place, the following code snippet will register a
>> hotplug callback as well as initialize already online CPUs without any race
>> conditions.
>>
>> cpu_maps_update_begin();
>>
>> for_each_online_cpu(cpu)
>> init_cpu(cpu);
>>
>> /* This doesn't take the cpu_add_remove_lock */
>> __register_cpu_notifier(&foobar_cpu_notifier);
>>
>> cpu_maps_update_done();
>>
>
> Couple of comments:
>
> Right now, cpu_add_remove_lock is being used to
> 1) Serialize the cpu-hotplug writers.
>
> 2) Serialize accesses to cpu_present/possible_map.
>
> 3) Serialize updates to the cpu_chain (the cpu hotplug notifier chain).
>
> - This is necessary to ensure that registration of notifiers and
> invocation of CPU_POST_DEAD notifications don't race with each
> other. Else we could have used get/put_online_cpus() in
> register_cpu_notifier() and this patch series wouldn't have been
> necessary.
>
> 4) Bulk cpu-hotplug (disable/enable_non_boot_cpus), but this is a
> special case of 1).
>
> The CPU_POST_DEAD notification is invoked with the cpu_hotplug.lock
> dropped. This was necessary for subsystems which would be waiting for
> some other thread to finish some work, and that other thread could
> invoke get_online_cpus(). If CPU_POST_DEAD notification were issued
> without dropping the cpu_hotplug.lock, this would lead to a deadlock
> as the notifier would be left stuck waiting for the thread which is
> blocked in get_online_cpus().
>
> It was introduced to ensure that multithreaded workqueues can safely
> use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
>
> As of now, only two subsystems use this notification and workqueues is
> _not_ one of them!
> * arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
> * drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
> I haven't yet audited these two cases to see if they really need this
> to be handled in CPU_POST_DEAD or if they can be handled in CPU_DEAD.
>

Well, cpufreq had a legitimate need to use POST_DEAD to avoid the
deadlock described in commit 1aee40ac9c. However, there had been some
discussion some time ago about reorganizing the cpufreq's hotplug callback
so as to move most (but not all) of its work outside of POST_DEAD [1].
But as it stands, I don't think it would be easy to totally get rid of
cpufreq's dependence on the POST_DEAD notifier.

Besides, I think it's good to retain the POST_DEAD notifier option in
the CPU hotplug core code. It has come in handy several times to fix hard
deadlock issues.

> Also can we have an alternate API, something like
> cpu_hotplug_register_begin/end() instead of reusing
> cpu_maps_update_begin/done() for this usage, since in most of the
> patches that follow, we're not touching any of the cpu_*_maps!
>

Yes, the function names cpu_maps_update_begin/done() don't really suit
the kind of usage I'm proposing in this patchset, and hence are something
of a misnomer. For better readability, I'm thinking of defining macros
such as, say, cpu_hotplug_notifier_lock()/unlock() that redirect to
cpu_maps_update_begin/done() respectively. That way, we can export just
those former symbols for use by modules, and thereby the code would look
more intuitive, like this:

cpu_hotplug_notifier_lock();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* This doesn't take the cpu_add_remove_lock */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_hotplug_notifier_unlock();
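
For illustration only, the wrappers could be as thin as the two #defines in the following userspace sketch. Nothing here is from the patchset: cpu_maps_update_begin/done() are replaced by stand-ins that track a lock depth, register_cpu_notifier_nolock_stub() stands in for __register_cpu_notifier(), and foobar_init_sketch() is a made-up caller.

```c
/* Hypothetical stand-ins: the real functions take/release cpu_add_remove_lock. */
static int cpu_add_remove_lock_depth;

static void cpu_maps_update_begin(void) { cpu_add_remove_lock_depth++; }
static void cpu_maps_update_done(void)  { cpu_add_remove_lock_depth--; }

/* The proposed names simply redirect to the existing functions. */
#define cpu_hotplug_notifier_lock()	cpu_maps_update_begin()
#define cpu_hotplug_notifier_unlock()	cpu_maps_update_done()

static int nr_cpus_initialized;
static int notifier_registered_under_lock;

static void init_cpu(int cpu)
{
	(void)cpu;
	nr_cpus_initialized++;
}

/* Stand-in for __register_cpu_notifier(): expects the caller to hold the lock. */
static void register_cpu_notifier_nolock_stub(void)
{
	notifier_registered_under_lock = (cpu_add_remove_lock_depth == 1);
}

static void foobar_init_sketch(int nr_cpus)
{
	int cpu;

	cpu_hotplug_notifier_lock();

	for (cpu = 0; cpu < nr_cpus; cpu++)
		init_cpu(cpu);

	/* This doesn't take the cpu_add_remove_lock itself. */
	register_cpu_notifier_nolock_stub();

	cpu_hotplug_notifier_unlock();
}
```

With plain macros the underlying cpu_maps_update_begin/done() symbols would still need to be reachable from modules, so real out-of-line functions or static inlines may be the better fit; the sketch only shows the naming.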

What do you think?

Regards,
Srivatsa S. Bhat

2014-02-06 11:13:24

by Srivatsa S. Bhat

Subject: Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

On 02/06/2014 04:34 PM, Srivatsa S. Bhat wrote:
> On 02/06/2014 03:08 PM, Gautham R Shenoy wrote:
>> Hi,
>>
>> On Thu, Feb 06, 2014 at 03:34:36AM +0530, Srivatsa S. Bhat wrote:
>>> Hi,
>>>
[...]
>> Couple of comments:
>>
>> Right now, cpu_add_remove_lock is being used to
>> 1) Serialize the cpu-hotplug writers.
>>
>> 2) Serialize accesses to cpu_present/possible_map.
>>
>> 3) Serialize updates to the cpu_chain (the cpu hotplug notifier chain).
>>
>> - This is necessary to ensure that registration of notifiers and
>> invocation of CPU_POST_DEAD notifications don't race with each
>> other. Else we could have used get/put_online_cpus() in
>> register_cpu_notifier() and this patch series wouldn't have been
>> necessary.
>>
>> 4) Bulk cpu-hotplug (disable/enable_non_boot_cpus), but this is a
>> special case of 1).
>>
>> The CPU_POST_DEAD notification is invoked with the cpu_hotplug.lock
>> dropped. This was necessary for subsystems which would be waiting for
>> some other thread to finish some work, and that other thread could
>> invoke get_online_cpus(). If CPU_POST_DEAD notification were issued
>> without dropping the cpu_hotplug.lock, this would lead to a deadlock
>> as the notifier would be left stuck waiting for the thread which is
>> blocked in get_online_cpus().
>>
>> It was introduced to ensure that multithreaded workqueues can safely
>> use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
>>
>> As of now, only two subsystems use this notification and workqueues is
>> _not_ one of them!
>> * arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
>> * drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
>> I haven't yet audited these two cases to see if they really need this
>> to be handled in CPU_POST_DEAD or if they can be handled in CPU_DEAD.
>>
>
> Well, cpufreq had a legitimate need to use POST_DEAD to avoid the
> deadlock described in commit 1aee40ac9c. However, there had been some
> discussion some time ago about reorganizing the cpufreq's hotplug callback
> so as to move most (but not all) of its work outside of POST_DEAD [1].

Forgot to give the link. Here it is:
http://article.gmane.org/gmane.linux.kernel/1571276

Regards,
Srivatsa S. Bhat

2014-02-06 11:31:16

by Srivatsa S. Bhat

Subject: Re: [PATCH 08/51] arm, hw-breakpoint: Fix CPU hotplug callback registration

Hi Will,

On 02/06/2014 04:27 PM, Will Deacon wrote:
> Hi Srivatsa,
>
> On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>> initialization for the CPUs that are already online, often do it as shown
>> below:
>>
>>     get_online_cpus();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     put_online_cpus();
>>
>> This is wrong, since it is prone to ABBA deadlocks involving the
>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>> with CPU hotplug operations).
>
> Hmm, the code in question (for this patch) runs from an arch_initcall. How
> can you generate CPU hotplug operations at that stage?
>

You are right - in today's design of the init sequence, CPU hotplug
operations can't be triggered at that time during boot.

However, there have been proposals to boot CPUs in parallel along with the
rest of the kernel initialization [1] (and that would need full synchronization
with CPU hotplug even at the initcall stage). Of course this needs a lot of
auditing and modifications to the CPU hotplug notifiers of various subsystems
to make them robust enough to handle the parallel boot; so it's not going to
happen very soon. But I felt that it would be a good idea to ensure that we
get the locking/synchronization right, even if the registrations happen very
early during boot today. You know, just to be on the safer side and also to
make the job easier for whoever tries to implement parallel CPU booting
again in the future ;-)

[1]. http://thread.gmane.org/gmane.linux.kernel/1246209


>> Instead, the correct and race-free way of performing the callback
>> registration is:
>>
>>     cpu_maps_update_begin();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     /* Note the use of the double underscored version of the API */
>>     __register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     cpu_maps_update_done();
>>
>>
>> Fix the hw-breakpoint code in arm by using this latter form of callback
>> registration.
>
> I guess you introduce __register_cpu_notifier somewhere earlier in the
> series,

Yes, patch 1 adds that API..
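
For anyone reading this patch on its own, here is a simplified userspace
model of how the two registration variants relate. It is only a sketch:
the model_ names, the list-based chain and the pthread mutex are
assumptions made for illustration, not the code from patch 1.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical notifier type; the kernel's struct notifier_block differs. */
struct model_notifier {
    const char *name;
    struct model_notifier *next;
};

/* Stand-in for cpu_add_remove_lock (assumption). */
static pthread_mutex_t model_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_lock_held;
static struct model_notifier *model_chain;

/* Two sample notifiers used to exercise the model. */
static struct model_notifier model_nb_a = { "a", NULL };
static struct model_notifier model_nb_b = { "b", NULL };

static void model_maps_update_begin(void)
{
    pthread_mutex_lock(&model_add_remove_lock);
    model_lock_held = 1;
}

static void model_maps_update_done(void)
{
    model_lock_held = 0;
    pthread_mutex_unlock(&model_add_remove_lock);
}

/* Lockless variant: the caller must already hold the lock. */
static int model_register_nolock(struct model_notifier *nb)
{
    assert(model_lock_held);
    nb->next = model_chain;
    model_chain = nb;
    return 0;
}

/* Locked variant: a thin wrapper around the lockless one. */
static int model_register(struct model_notifier *nb)
{
    int ret;

    model_maps_update_begin();
    ret = model_register_nolock(nb);
    model_maps_update_done();
    return ret;
}

/* Counts the notifiers currently on the chain. */
static int model_chain_length(void)
{
    int n = 0;
    struct model_notifier *nb;

    for (nb = model_chain; nb != NULL; nb = nb->next)
        n++;
    return n;
}
```

A caller that needs to do per-CPU initialization under the same lock would
take it once, do the initialization and then call the lockless variant;
everyone else can keep calling the locked wrapper.
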

> so it's best if you take this all via your tree.
>

Hmm.. I'm not a maintainer myself, so I'm hoping that either Oleg or Rusty
or any of the other CPU hotplug maintainers (Thomas/Peter/Ingo) would be
willing to take these patches through their tree.

Regards,
Srivatsa S. Bhat

2014-02-06 11:39:37

by Will Deacon

Subject: Re: [PATCH 08/51] arm, hw-breakpoint: Fix CPU hotplug callback registration

On Thu, Feb 06, 2014 at 11:25:46AM +0000, Srivatsa S. Bhat wrote:
> Hi Will,

Hello,

> On 02/06/2014 04:27 PM, Will Deacon wrote:
> > On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
> >> Subsystems that want to register CPU hotplug callbacks, as well as perform
> >> initialization for the CPUs that are already online, often do it as shown
> >> below:
> >>
> >>     get_online_cpus();
> >>
> >>     for_each_online_cpu(cpu)
> >>         init_cpu(cpu);
> >>
> >>     register_cpu_notifier(&foobar_cpu_notifier);
> >>
> >>     put_online_cpus();
> >>
> >> This is wrong, since it is prone to ABBA deadlocks involving the
> >> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> >> with CPU hotplug operations).
> >
> > Hmm, the code in question (for this patch) runs from an arch_initcall. How
> > can you generate CPU hotplug operations at that stage?
> >
>
> You are right - in today's design of the init sequence, CPU hotplug
> operations can't be triggered at that time during boot.

Phew, so we don't have a bug as the code stands today.

> However, there have been proposals to boot CPUs in parallel along with the
> rest of the kernel initialization [1] (and that would need full synchronization
> with CPU hotplug even at the initcall stage). Of course this needs a lot of
> auditing and modifications to the CPU hotplug notifiers of various subsystems
> to make them robust enough to handle the parallel boot; so it's not going to
> happen very soon. But I felt that it would be a good idea to ensure that we
> get the locking/synchronization right, even if the registrations happen very
> early during boot today. You know, just to be on the safer side and also to
> make the job easier for whoever tries to implement parallel CPU booting
> again in the future ;-)
>
> [1]. http://thread.gmane.org/gmane.linux.kernel/1246209

Makes sense, and this seems like a good start.

> > so it's best if you take this all via your tree.
> >
>
> Hmm.. I'm not a maintainer myself, so I'm hoping that either Oleg or Rusty
> or any of the other CPU hotplug maintainers (Thomas/Peter/Ingo) would be
> willing to take these patches through their tree.

Well, you can have my ack for this patch:

Acked-by: Will Deacon <[email protected]>

Cheers,

Will

2014-02-06 11:41:06

by Will Deacon

Subject: Re: [PATCH 28/51] arm64, hw_breakpoint.c: Fix CPU hotplug callback registration

On Wed, Feb 05, 2014 at 10:09:45PM +0000, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:

[...]

> Fix the hw-breakpoint code in arm64 by using this latter form of callback
> registration.
>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Lorenzo Pieralisi <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> arch/arm64/kernel/hw_breakpoint.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
> index f17f581..24e88d0 100644
> --- a/arch/arm64/kernel/hw_breakpoint.c
> +++ b/arch/arm64/kernel/hw_breakpoint.c
> @@ -913,6 +913,8 @@ static int __init arch_hw_breakpoint_init(void)
> pr_info("found %d breakpoint and %d watchpoint registers.\n",
> core_num_brps, core_num_wrps);
>
> + cpu_maps_update_begin();
> +
> /*
> * Reset the breakpoint resources. We assume that a halting
> * debugger will leave the world in a nice state for us.
> @@ -927,7 +929,10 @@ static int __init arch_hw_breakpoint_init(void)
> TRAP_HWBKPT, "hw-watchpoint handler");
>
> /* Register hotplug notifier. */
> - register_cpu_notifier(&hw_breakpoint_reset_nb);
> + __register_cpu_notifier(&hw_breakpoint_reset_nb);
> +
> + cpu_maps_update_done();
> +
> /* Register cpu_suspend hw breakpoint restore hook */
> cpu_suspend_set_dbg_restorer(hw_breakpoint_reset);

Acked-by: Will Deacon <[email protected]>

Will

2014-02-06 11:41:59

by Will Deacon

Subject: Re: [PATCH 29/51] arm64, debug-monitors: Fix CPU hotplug callback registration

On Wed, Feb 05, 2014 at 10:09:58PM +0000, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:

[...]

> Fix the debug-monitors code in arm64 by using this latter form of callback
> registration.
>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> arch/arm64/kernel/debug-monitors.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> index 636ba8b..959a16b 100644
> --- a/arch/arm64/kernel/debug-monitors.c
> +++ b/arch/arm64/kernel/debug-monitors.c
> @@ -155,12 +155,16 @@ static struct notifier_block os_lock_nb = {
>
> static int debug_monitors_init(void)
> {
> + cpu_maps_update_begin();
> +
> /* Clear the OS lock. */
> smp_call_function(clear_os_lock, NULL, 1);
> clear_os_lock(NULL);
>
> /* Register hotplug handler. */
> - register_cpu_notifier(&os_lock_nb);
> + __register_cpu_notifier(&os_lock_nb);
> +
> + cpu_maps_update_done();
> return 0;
> }
> postcore_initcall(debug_monitors_init);

Acked-by: Will Deacon <[email protected]>

Will

2014-02-06 11:44:01

by Srivatsa S. Bhat

Subject: Re: [PATCH 08/51] arm, hw-breakpoint: Fix CPU hotplug callback registration

On 02/06/2014 05:09 PM, Will Deacon wrote:
> On Thu, Feb 06, 2014 at 11:25:46AM +0000, Srivatsa S. Bhat wrote:
>> Hi Will,
>
> Hello,
>
>> On 02/06/2014 04:27 PM, Will Deacon wrote:
>>> On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
>>>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>>>> initialization for the CPUs that are already online, often do it as shown
>>>> below:
>>>>
>>>>     get_online_cpus();
>>>>
>>>>     for_each_online_cpu(cpu)
>>>>         init_cpu(cpu);
>>>>
>>>>     register_cpu_notifier(&foobar_cpu_notifier);
>>>>
>>>>     put_online_cpus();
>>>>
>>>> This is wrong, since it is prone to ABBA deadlocks involving the
>>>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>>>> with CPU hotplug operations).
>>>
>>> Hmm, the code in question (for this patch) runs from an arch_initcall. How
>>> can you generate CPU hotplug operations at that stage?
>>>
>>
>> You are right - in today's design of the init sequence, CPU hotplug
>> operations can't be triggered at that time during boot.
>
> Phew, so we don't have a bug as the code stands today.

Yes, that's right.

>
>> However, there have been proposals to boot CPUs in parallel along with the
>> rest of the kernel initialization [1] (and that would need full synchronization
>> with CPU hotplug even at the initcall stage). Of course this needs a lot of
>> auditing and modifications to the CPU hotplug notifiers of various subsystems
>> to make them robust enough to handle the parallel boot; so it's not going to
>> happen very soon. But I felt that it would be a good idea to ensure that we
>> get the locking/synchronization right, even if the registrations happen very
>> early during boot today. You know, just to be on the safer side and also to
>> make the job easier for whoever tries to implement parallel CPU booting
>> again in the future ;-)
>>
>> [1]. http://thread.gmane.org/gmane.linux.kernel/1246209
>
> Makes sense, and this seems like a good start.
>
>>> so it's best if you take this all via your tree.
>>>
>>
>> Hmm.. I'm not a maintainer myself, so I'm hoping that either Oleg or Rusty
>> or any of the other CPU hotplug maintainers (Thomas/Peter/Ingo) would be
>> willing to take these patches through their tree.
>
> Well, you can have my ack for this patch:
>
> Acked-by: Will Deacon <[email protected]>
>

Great! Thanks a lot Will!

Regards,
Srivatsa S. Bhat

2014-02-06 12:14:26

by Gautham R Shenoy

Subject: Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

On Thu, Feb 06, 2014 at 04:34:33PM +0530, Srivatsa S. Bhat wrote:
> >
> > The CPU_POST_DEAD notification is invoked with the cpu_hotplug.lock
> > dropped. This was necessary for subsystems which would be waiting for
> > some other thread to finish some work, and that other thread could
> > invoke get_online_cpus(). If CPU_POST_DEAD notification were issued
> > without dropping the cpu_hotplug.lock, this would lead to a deadlock
> > as the notifier would be left stuck waiting for the thread which is
> > blocked in get_online_cpus().
> >
> > It was introduced to ensure that multithreaded workqueues can safely
> > use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
> >
> > As of now, only two subsystems use this notification and workqueues is
> > _not_ one of them!
> > * arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
> > * drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
> > I haven't yet audited these two cases to see if they really need this
> > to be handled in CPU_POST_DEAD or if they can be handled in CPU_DEAD.
> >
>
> Well, cpufreq had a legitimate need to use POST_DEAD to avoid the
> deadlock described in commit 1aee40ac9c. However, there had been some
> discussion some time ago about reorganizing the cpufreq's hotplug callback
> so as to move most (but not all) of its work outside of POST_DEAD [1].
> But as it stands, I don't think it would be easy to totally get rid of
> cpufreq's dependence on the POST_DEAD notifier.
>

Right, I see the reason why cpufreq needs POST_DEAD.

> Besides, I think it's good to retain the POST_DEAD notifier option in
> the CPU hotplug core code. It has come in handy several times to fix hard
> deadlock issues.
>

I know. I am not denying the usefulness of POST_DEAD. But the fact
that some of the CPU_* notifiers are invoked with the cpu_hotplug.lock
held while CPU_POST_DEAD is invoked with the lock dropped looks a bit
asymmetrical. At the moment I cannot think of a simpler alternative.
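
To make the asymmetry concrete, here is a small userspace model of the
offline path. It is a sketch only: the model_ names and the event enum are
hypothetical, the two pthread mutexes stand in for cpu_add_remove_lock and
cpu_hotplug.lock, and the real path has more steps and error handling.

```c
#include <assert.h>
#include <pthread.h>

enum model_event {
    MODEL_DOWN_PREPARE,
    MODEL_DEAD,
    MODEL_POST_DEAD,
    MODEL_NR_EVENTS
};

/* Stand-ins for cpu_add_remove_lock and cpu_hotplug.lock (assumption). */
static pthread_mutex_t model_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_hotplug_lock_held;

/* Per event: 1 = delivered with the hotplug lock held, 0 = dropped. */
static int model_seen_with_lock[MODEL_NR_EVENTS];

static void model_notify(enum model_event ev)
{
    assert(ev < MODEL_NR_EVENTS);
    model_seen_with_lock[ev] = model_hotplug_lock_held;
}

static void model_cpu_down(void)
{
    pthread_mutex_lock(&model_add_remove_lock);   /* cpu_maps_update_begin() */
    pthread_mutex_lock(&model_hotplug_lock);      /* cpu_hotplug_begin() */
    model_hotplug_lock_held = 1;

    model_notify(MODEL_DOWN_PREPARE);
    model_notify(MODEL_DEAD);

    model_hotplug_lock_held = 0;
    pthread_mutex_unlock(&model_hotplug_lock);    /* cpu_hotplug_done() */

    /* Only the outer lock is held here, so readers are no longer blocked. */
    model_notify(MODEL_POST_DEAD);

    pthread_mutex_unlock(&model_add_remove_lock); /* cpu_maps_update_done() */
}
```

In this model the POST_DEAD event is still covered by the outer lock,
which is why notifier registration has to serialize on that lock rather
than on get/put_online_cpus().
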


> > Also, can we have an alternate API, something like
> > cpu_hotplug_register_begin/end(), instead of reusing
> > cpu_maps_update_begin/done() for this usage, since in most of the
> > patches that follow, we're not touching any of the cpu_*_maps!
> >
>
> Yes, the function names cpu_maps_update_begin/done() don't really suit
> the kind of usage I'm proposing in this patchset, so they are something of
> a misnomer here. For better readability, I'm thinking of defining macros
> such as cpu_hotplug_notifier_lock()/unlock() that redirect to
> cpu_maps_update_begin/done() respectively. That way, we can export just
> the new symbols for use by modules, and the code would look more
> intuitive, like this:
>
>     cpu_hotplug_notifier_lock();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     /* This doesn't take the cpu_add_remove_lock */
>     __register_cpu_notifier(&foobar_cpu_notifier);
>
>     cpu_hotplug_notifier_unlock();
>
> What do you think?

Sounds good.
>
> Regards,
> Srivatsa S. Bhat
>

Thanks and Regards
gautham.

2014-02-06 12:28:38

by Rafael J. Wysocki

Subject: Re: [PATCH 38/51] intel-idle: Fix CPU hotplug callback registration

On Thursday, February 06, 2014 03:41:23 AM Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
>     get_online_cpus();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     register_cpu_notifier(&foobar_cpu_notifier);
>
>     put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
>     cpu_maps_update_begin();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     /* Note the use of the double underscored version of the API */
>     __register_cpu_notifier(&foobar_cpu_notifier);
>
>     cpu_maps_update_done();
>
>
> Fix the intel-idle code by using this latter form of callback registration.
>
> Cc: Len Brown <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

This looks good to me. Len, what do you think?

Srivatsa, how does it depend on the rest of your series?

> ---
>
> drivers/idle/intel_idle.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 8e1939f..716ee5a 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -681,14 +681,19 @@ static int __init intel_idle_init(void)
> if (intel_idle_cpuidle_devices == NULL)
> return -ENOMEM;
>
> + cpu_maps_update_begin();
> +
> for_each_online_cpu(i) {
> retval = intel_idle_cpu_init(i);
> if (retval) {
> + cpu_maps_update_done();
> cpuidle_unregister_driver(&intel_idle_driver);
> return retval;
> }
> }
> - register_cpu_notifier(&cpu_hotplug_notifier);
> + __register_cpu_notifier(&cpu_hotplug_notifier);
> +
> + cpu_maps_update_done();
>
> return 0;
> }
> @@ -698,10 +703,13 @@ static void __exit intel_idle_exit(void)
> intel_idle_cpuidle_devices_uninit();
> cpuidle_unregister_driver(&intel_idle_driver);
>
> + cpu_maps_update_begin();
>
> if (lapic_timer_reliable_states != LAPIC_TIMER_ALWAYS_RELIABLE)
> on_each_cpu(__setup_broadcast_timer, (void *)false, 1);
> - unregister_cpu_notifier(&cpu_hotplug_notifier);
> + __unregister_cpu_notifier(&cpu_hotplug_notifier);
> +
> + cpu_maps_update_done();
>
> return;
> }
>

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2014-02-06 12:29:30

by Rafael J. Wysocki

Subject: Re: [PATCH 35/51] acpi-cpufreq: Fix CPU hotplug callback registration

On Thursday, February 06, 2014 03:40:53 AM Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
>     get_online_cpus();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     register_cpu_notifier(&foobar_cpu_notifier);
>
>     put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
>     cpu_maps_update_begin();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     /* Note the use of the double underscored version of the API */
>     __register_cpu_notifier(&foobar_cpu_notifier);
>
>     cpu_maps_update_done();
>
>
> Fix the acpi-cpufreq code by using this latter form of callback registration.
>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Viresh Kumar <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Looks OK to me. How does it depend on the rest of your series?

> ---
>
> drivers/cpufreq/acpi-cpufreq.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> index 18448a7..e2eb471 100644
> --- a/drivers/cpufreq/acpi-cpufreq.c
> +++ b/drivers/cpufreq/acpi-cpufreq.c
> @@ -907,15 +907,16 @@ static void __init acpi_cpufreq_boost_init(void)
>
> acpi_cpufreq_driver.boost_supported = true;
> acpi_cpufreq_driver.boost_enabled = boost_state(0);
> - get_online_cpus();
> +
> + cpu_maps_update_begin();
>
> /* Force all MSRs to the same value */
> boost_set_msrs(acpi_cpufreq_driver.boost_enabled,
> cpu_online_mask);
>
> - register_cpu_notifier(&boost_nb);
> + __register_cpu_notifier(&boost_nb);
>
> - put_online_cpus();
> + cpu_maps_update_done();
> }
> }
>
>

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Subject: Re: [PATCH 48/51] mm, vmstat: Fix CPU hotplug callback registration

On Thu, 6 Feb 2014, Srivatsa S. Bhat wrote:

> Fix the vmstat code in the MM subsystem by using this latter form of callback
> registration.

Acked-by: Christoph Lameter <[email protected]>

2014-02-06 16:09:28

by Srivatsa S. Bhat

Subject: Re: [PATCH 38/51] intel-idle: Fix CPU hotplug callback registration

On 02/06/2014 06:13 PM, Rafael J. Wysocki wrote:
> On Thursday, February 06, 2014 03:41:23 AM Srivatsa S. Bhat wrote:
>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>> initialization for the CPUs that are already online, often do it as shown
>> below:
>>
>>     get_online_cpus();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     put_online_cpus();
>>
>> This is wrong, since it is prone to ABBA deadlocks involving the
>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>> with CPU hotplug operations).
>>
>> Instead, the correct and race-free way of performing the callback
>> registration is:
>>
>>     cpu_maps_update_begin();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     /* Note the use of the double underscored version of the API */
>>     __register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     cpu_maps_update_done();
>>
>>
>> Fix the intel-idle code by using this latter form of callback registration.
>>
>> Cc: Len Brown <[email protected]>
>> Cc: "Rafael J. Wysocki" <[email protected]>
>> Cc: [email protected]
>> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>
> This looks good to me. Len, what do you think?
>

Thanks a lot Rafael!

> Srivatsa, how does it depend on the rest of your series?
>

It depends only on the first patch in the series:
http://article.gmane.org/gmane.linux.kernel/1641640

But don't take this patch yet, we are discussing a possible rename
of the function cpu_maps_update_begin()/done(). So I'll post a v2
after the name is finalized.

Thank you!

Regards,
Srivatsa S. Bhat




2014-02-06 16:11:18

by Srivatsa S. Bhat

Subject: Re: [PATCH 35/51] acpi-cpufreq: Fix CPU hotplug callback registration

On 02/06/2014 06:13 PM, Rafael J. Wysocki wrote:
> On Thursday, February 06, 2014 03:40:53 AM Srivatsa S. Bhat wrote:
>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>> initialization for the CPUs that are already online, often do it as shown
>> below:
>>
>>     get_online_cpus();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     put_online_cpus();
>>
>> This is wrong, since it is prone to ABBA deadlocks involving the
>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>> with CPU hotplug operations).
>>
>> Instead, the correct and race-free way of performing the callback
>> registration is:
>>
>>     cpu_maps_update_begin();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     /* Note the use of the double underscored version of the API */
>>     __register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     cpu_maps_update_done();
>>
>>
>> Fix the acpi-cpufreq code by using this latter form of callback registration.
>>
>> Cc: "Rafael J. Wysocki" <[email protected]>
>> Cc: Viresh Kumar <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>
> Looks OK to me. How does it depend on the rest of your series?
>

Thank you! Same here, every patch depends only on the first patch in
the series. (Except the raid5 and the xen/balloon patches which don't
have any dependency).

But I'll be posting a v2 of this patchset soon with a rename of the
API..

Regards,
Srivatsa S. Bhat

>> ---
>>
>> drivers/cpufreq/acpi-cpufreq.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
>> index 18448a7..e2eb471 100644
>> --- a/drivers/cpufreq/acpi-cpufreq.c
>> +++ b/drivers/cpufreq/acpi-cpufreq.c
>> @@ -907,15 +907,16 @@ static void __init acpi_cpufreq_boost_init(void)
>>
>> acpi_cpufreq_driver.boost_supported = true;
>> acpi_cpufreq_driver.boost_enabled = boost_state(0);
>> - get_online_cpus();
>> +
>> + cpu_maps_update_begin();
>>
>> /* Force all MSRs to the same value */
>> boost_set_msrs(acpi_cpufreq_driver.boost_enabled,
>> cpu_online_mask);
>>
>> - register_cpu_notifier(&boost_nb);
>> + __register_cpu_notifier(&boost_nb);
>>
>> - put_online_cpus();
>> + cpu_maps_update_done();
>> }
>> }
>>
>>
>


--
Regards,
Srivatsa S. Bhat

2014-02-06 16:15:23

by Srivatsa S. Bhat

Subject: Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

On 02/06/2014 05:44 PM, Gautham R Shenoy wrote:
> On Thu, Feb 06, 2014 at 04:34:33PM +0530, Srivatsa S. Bhat wrote:
>>>
>>> The CPU_POST_DEAD notification is invoked with the cpu_hotplug.lock
>>> dropped. This was necessary for subsystems which would be waiting for
>>> some other thread to finish some work, and that other thread could
>>> invoke get_online_cpus(). If CPU_POST_DEAD notification were issued
>>> without dropping the cpu_hotplug.lock, this would lead to a deadlock
>>> as the notifier would be left stuck waiting for the thread which is
>>> blocked in get_online_cpus().
>>>
>>> It was introduced to ensure that multithreaded workqueues can safely
>>> use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
>>>
>>> As of now, only two subsystems use this notification and workqueues is
>>> _not_ one of them!
>>> * arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
>>> * drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
>>> I haven't yet audited these two cases to see if they really need this
>>> to be handled in CPU_POST_DEAD or if they can be handled in CPU_DEAD.
>>>
>>
>> Well, cpufreq had a legitimate need to use POST_DEAD to avoid the
>> deadlock described in commit 1aee40ac9c. However, there had been some
>> discussion some time ago about reorganizing the cpufreq's hotplug callback
>> so as to move most (but not all) of its work outside of POST_DEAD [1].
>> But as it stands, I don't think it would be easy to totally get rid of
>> cpufreq's dependence on the POST_DEAD notifier.
>>
>
> Right, I see the reason why cpufreq needs POST_DEAD.
>
>> Besides, I think it's good to retain the POST_DEAD notifier option in
>> the CPU hotplug core code. It has come in handy several times to fix hard
>> deadlock issues.
>>
>
> I know. I am not denying the usefulness of POST_DEAD. But the fact
> that some of the CPU_* notifiers are invoked with the cpu_hotplug.lock
> held while CPU_POST_DEAD is invoked with the lock dropped looks a bit
> asymmetrical. At the moment I cannot think of a simpler alternative.
>

Hmmm...

>
>>> Also, can we have an alternate API, something like
>>> cpu_hotplug_register_begin/end(), instead of reusing
>>> cpu_maps_update_begin/done() for this usage, since in most of the
>>> patches that follow, we're not touching any of the cpu_*_maps!
>>>
>>
>> Yes, the function names cpu_maps_update_begin/done() don't really suit
>> the kind of usage I'm proposing in this patchset, so they are something of
>> a misnomer here. For better readability, I'm thinking of defining macros
>> such as cpu_hotplug_notifier_lock()/unlock() that redirect to
>> cpu_maps_update_begin/done() respectively. That way, we can export just
>> the new symbols for use by modules, and the code would look more
>> intuitive, like this:
>>
>>     cpu_hotplug_notifier_lock();
>>
>>     for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>>     /* This doesn't take the cpu_add_remove_lock */
>>     __register_cpu_notifier(&foobar_cpu_notifier);
>>
>>     cpu_hotplug_notifier_unlock();
>>
>> What do you think?
>
> Sounds good.

Cool! If there are no objections, I'll use this naming for the APIs
and spin a v2 of the patchset soon.

Thank you!

Regards,
Srivatsa S. Bhat

2014-02-06 18:41:52

by Oleg Nesterov

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/06, Srivatsa S. Bhat wrote:
>
> The following method of CPU hotplug callback registration is not safe
> due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
> and the cpu_hotplug.lock.

Off-topic, but perhaps it also makes sense to add the lockdep annotations
later, to catch other similar problems. Currently get_online_cpus() acquires
nothing from lockdep pov.
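
To illustrate the kind of check such annotations would enable, here is a
toy userspace order tracker. It is only a sketch and not how lockdep is
implemented: the model_ names are hypothetical, and get_online_cpus() is
treated as if it acquired a lock class, which is exactly the assumption
the annotation would introduce.

```c
#include <assert.h>
#include <stdbool.h>

enum model_lock_class {
    MODEL_ADD_REMOVE_LOCK,  /* stands in for cpu_add_remove_lock */
    MODEL_HOTPLUG_LOCK,     /* stands in for cpu_hotplug.lock */
    MODEL_NR_CLASSES
};

static bool model_held[MODEL_NR_CLASSES];
/* model_order[a][b] is set once b has been acquired while a was held. */
static bool model_order[MODEL_NR_CLASSES][MODEL_NR_CLASSES];
static int model_inversions;

static void model_acquire(enum model_lock_class c)
{
    int h;

    assert(!model_held[c]);
    for (h = 0; h < MODEL_NR_CLASSES; h++) {
        if (!model_held[h])
            continue;
        if (model_order[c][h])  /* the opposite order was seen before */
            model_inversions++;
        model_order[h][c] = true;
    }
    model_held[c] = true;
}

static void model_release(enum model_lock_class c)
{
    assert(model_held[c]);
    model_held[c] = false;
}

/* Hotplug writer: cpu_maps_update_begin(), then cpu_hotplug_begin(). */
static void model_hotplug_writer(void)
{
    model_acquire(MODEL_ADD_REMOVE_LOCK);
    model_acquire(MODEL_HOTPLUG_LOCK);
    model_release(MODEL_HOTPLUG_LOCK);
    model_release(MODEL_ADD_REMOVE_LOCK);
}

/* Unsafe pattern: get_online_cpus(), then register_cpu_notifier(). */
static void model_unsafe_registration(void)
{
    model_acquire(MODEL_HOTPLUG_LOCK);
    model_acquire(MODEL_ADD_REMOVE_LOCK);
    model_release(MODEL_ADD_REMOVE_LOCK);
    model_release(MODEL_HOTPLUG_LOCK);
}

/* Pattern from this series: only the outer lock is taken. */
static void model_safe_registration(void)
{
    model_acquire(MODEL_ADD_REMOVE_LOCK);
    model_release(MODEL_ADD_REMOVE_LOCK);
}
```

The tracker flags the unsafe pattern even when the two paths run one after
the other, without an actual deadlock, which is what makes this kind of
annotation useful for catching similar problems early.
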


As for this patch/series, personally I agree.

Oleg.

2014-02-06 18:43:53

by Oleg Nesterov

Subject: Re: [PATCH 45/51] md, raid5: Fix CPU hotplug callback registration

On 02/06, Srivatsa S. Bhat wrote:
>
> On 02/06/2014 06:41 AM, NeilBrown wrote:
> > Shall I wait for a Signed-off-by from Oleg, then queue it through my md tree?
> >
>
> Sure, that sounds great, since this patch doesn't have any dependency.
> Thanks a lot!
>
> Oleg, it would be great if you could kindly add your S-O-B to this patch.
> Thanks!

Thanks Neil and Srivatsa,

Sure, feel free to add

Signed-off-by: Oleg Nesterov <[email protected]>

2014-02-07 02:53:16

by Yasuaki Ishimatsu

Subject: Re: [PATCH 48/51] mm, vmstat: Fix CPU hotplug callback registration

(2014/02/06 7:13), Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
>     get_online_cpus();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     register_cpu_notifier(&foobar_cpu_notifier);
>
>     put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
>     cpu_maps_update_begin();
>
>     for_each_online_cpu(cpu)
>         init_cpu(cpu);
>
>     /* Note the use of the double underscored version of the API */
>     __register_cpu_notifier(&foobar_cpu_notifier);
>
>     cpu_maps_update_done();
>
>
> Fix the vmstat code in the MM subsystem by using this latter form of callback
> registration.
>
> Cc: Andrew Morton <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Rik van Riel <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Yasuaki Ishimatsu <[email protected]>
> Cc: Cody P Schafer <[email protected]>
> Cc: Toshi Kani <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---

Looks good to me.

Reviewed-by: Yasuaki Ishimatsu <[email protected]>

Thanks,
Yasuaki Ishimatsu


>
> mm/vmstat.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 7249614..70668ba 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1290,14 +1290,14 @@ static int __init setup_vmstat(void)
> #ifdef CONFIG_SMP
> int cpu;
>
> - register_cpu_notifier(&vmstat_notifier);
> + cpu_maps_update_begin();
> + __register_cpu_notifier(&vmstat_notifier);
>
> - get_online_cpus();
> for_each_online_cpu(cpu) {
> start_cpu_timer(cpu);
> node_set_state(cpu_to_node(cpu), N_CPU);
> }
> - put_online_cpus();
> + cpu_maps_update_done();
> #endif
> #ifdef CONFIG_PROC_FS
> proc_create("buddyinfo", S_IRUGO, NULL, &fragmentation_file_operations);
>
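
To make the pattern from the commit message concrete, here is a minimal userspace sketch of it. It is only a model, not kernel code: a pthread mutex stands in for cpu_add_remove_lock, and every name in it (add_remove_lock, model_cpu_up(), model_subsys_init(), init_count) is made up for illustration. The point it tries to show is that when the walk over online CPUs and the notifier registration share one critical section with hotplug itself, each CPU ends up initialized exactly once, whether it came online before or after registration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Stand-in for cpu_add_remove_lock: serializes hotplug and registration. */
static pthread_mutex_t add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cpu_online[NR_CPUS] = { true, true, false, false };
static int init_count[NR_CPUS];
static bool notifier_registered;

static void init_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	init_count[cpu]++;
}

/* Model of a CPU-online operation: runs entirely under the lock. */
static void model_cpu_up(int cpu)
{
	pthread_mutex_lock(&add_remove_lock);
	if (!cpu_online[cpu]) {
		cpu_online[cpu] = true;
		if (notifier_registered)
			init_cpu(cpu);	/* what the hotplug callback would do */
	}
	pthread_mutex_unlock(&add_remove_lock);
}

/* Model of subsystem init: walk online CPUs and register in one section. */
static void model_subsys_init(void)
{
	pthread_mutex_lock(&add_remove_lock);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			init_cpu(cpu);
	notifier_registered = true;	/* lockless registration, lock already held */
	pthread_mutex_unlock(&add_remove_lock);
}
```

Because both steps sit inside the same critical section, the order of the loop and the registration does not matter in this model, which is consistent with the vmstat hunk above registering the notifier before the loop.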

2014-02-07 04:09:39

by Viresh Kumar

Subject: Re: [PATCH 35/51] acpi-cpufreq: Fix CPU hotplug callback registration

On 6 February 2014 03:40, Srivatsa S. Bhat
<[email protected]> wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* Note the use of the double underscored version of the API */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
>
> Fix the acpi-cpufreq code by using this latter form of callback registration.
>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Viresh Kumar <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Acked-by: Viresh Kumar <[email protected]>

2014-02-07 04:39:27

by David Miller

Subject: Re: [PATCH 50/51] net/core/flow.c: Fix CPU hotplug callback registration

From: "Srivatsa S. Bhat" <[email protected]>
Date: Thu, 06 Feb 2014 03:43:46 +0530

> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
...
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
...
> Fix the code in net/core/flow.c by using this latter form of callback
> registration.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: Li RongQing <[email protected]>
> Cc: Sasha Levin <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Chris Metcalf <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Applied.

2014-02-07 04:39:39

by David Miller

Subject: Re: [PATCH 51/51] net/iucv/iucv.c: Fix CPU hotplug callback registration

From: "Srivatsa S. Bhat" <[email protected]>
Date: Thu, 06 Feb 2014 03:43:55 +0530

> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
...
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
...
> Fix the code in net/iucv/iucv.c by using this latter form of callback
> registration. Also, provide helper functions to perform the common memory
> allocations and frees, to condense repetitive code.
>
> Cc: Ursula Braun <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Applied.

2014-02-07 05:19:45

by David Miller

Subject: Re: [PATCH 51/51] net/iucv/iucv.c: Fix CPU hotplug callback registration

From: David Miller <[email protected]>
Date: Thu, 06 Feb 2014 20:39:35 -0800 (PST)

> Applied.

I just realized that this has a dependency not in the 'net'
tree, so I reverted and assume you will merge this with the
patch that provides the necessary interface(s).

Signed-off-by: David S. Miller <[email protected]>

2014-02-07 05:19:54

by David Miller

Subject: Re: [PATCH 50/51] net/core/flow.c: Fix CPU hotplug callback registration

From: David Miller <[email protected]>
Date: Thu, 06 Feb 2014 20:39:21 -0800 (PST)

> From: "Srivatsa S. Bhat" <[email protected]>
> Date: Thu, 06 Feb 2014 03:43:46 +0530
>
>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>> initialization for the CPUs that are already online, often do it as shown
>> below:
> ...
>> This is wrong, since it is prone to ABBA deadlocks involving the
>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>> with CPU hotplug operations).
>>
>> Instead, the correct and race-free way of performing the callback
>> registration is:
> ...
>> Fix the code in net/core/flow.c by using this latter form of callback
>> registration.
>>
>> Cc: "David S. Miller" <[email protected]>
>> Cc: Li RongQing <[email protected]>
>> Cc: Sasha Levin <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Chris Metcalf <[email protected]>
>> Cc: [email protected]
>> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>
> Applied.

I just realized that this has a dependency not in the 'net'
tree, so I reverted and assume you will merge this with the
patch that provides the necessary interface(s).

Signed-off-by: David S. Miller <[email protected]>

2014-02-07 19:11:42

by Gautham R Shenoy

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Thu, Feb 06, 2014 at 07:41:03PM +0100, Oleg Nesterov wrote:
> On 02/06, Srivatsa S. Bhat wrote:
> >
> > The following method of CPU hotplug callback registration is not safe
> > due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
> > and the cpu_hotplug.lock.
>
> Off-topic, but perhaps it also makes sense to add the lockdep annotations
> later, to catch other similar problems. Currently get_online_cpus() acquires
> nothing from lockdep pov.

Well, both get/put_online_cpus() and cpu_hotplug_begin/done()
take the cpu_hotplug.lock mutex. So ideally the lockdep annotations of
mutex_lock/unlock() should have worked. If it hasn't, then the
following lockdep annotations to cpu-hotplug locking should do the
trick.

Signed-off-by: Gautham R. Shenoy <[email protected]>
---
kernel/cpu.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index deff2e6..3d2dd1c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -19,6 +19,7 @@
#include <linux/mutex.h>
#include <linux/gfp.h>
#include <linux/suspend.h>
+#include <linux/lockdep.h>

#include "smpboot.h"

@@ -57,21 +58,34 @@ static struct {
* an ongoing cpu hotplug operation.
*/
int refcount;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
} cpu_hotplug = {
.active_writer = NULL,
.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
.refcount = 0,
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ .dep_map = {.name = "cpu_hotplug.lock" },
+#endif
};

+#define cphp_lock_acquire_read(l, s, t, i) lock_acquire_shared_recursive(l, s, t, NULL, i)
+#define cphp_lock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
+#define cphp_lock_release(l, n, i) lock_release(l, n, i)
+
void get_online_cpus(void)
{
might_sleep();
if (cpu_hotplug.active_writer == current)
return;
+ cphp_lock_acquire_read(&cpu_hotplug.dep_map, 0, 0, _RET_IP_);
mutex_lock(&cpu_hotplug.lock);
cpu_hotplug.refcount++;
mutex_unlock(&cpu_hotplug.lock);

+
}
EXPORT_SYMBOL_GPL(get_online_cpus);

@@ -79,6 +93,7 @@ void put_online_cpus(void)
{
if (cpu_hotplug.active_writer == current)
return;
+
mutex_lock(&cpu_hotplug.lock);

if (WARN_ON(!cpu_hotplug.refcount))
@@ -87,6 +102,7 @@ void put_online_cpus(void)
if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
wake_up_process(cpu_hotplug.active_writer);
mutex_unlock(&cpu_hotplug.lock);
+ cphp_lock_release(&cpu_hotplug.dep_map, 1, _RET_IP_);

}
EXPORT_SYMBOL_GPL(put_online_cpus);
@@ -117,6 +133,7 @@ void cpu_hotplug_begin(void)
{
cpu_hotplug.active_writer = current;

+ cphp_lock_acquire(&cpu_hotplug.dep_map, 0, 0, _RET_IP_);
for (;;) {
mutex_lock(&cpu_hotplug.lock);
if (likely(!cpu_hotplug.refcount))
@@ -131,6 +148,7 @@ void cpu_hotplug_done(void)
{
cpu_hotplug.active_writer = NULL;
mutex_unlock(&cpu_hotplug.lock);
+ cphp_lock_release(&cpu_hotplug.dep_map, 1, _RET_IP_);
}

/*
--
1.8.3.1

2014-02-10 09:21:35

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

Hi Gautham,

On 02/08/2014 12:41 AM, Gautham R Shenoy wrote:
> On Thu, Feb 06, 2014 at 07:41:03PM +0100, Oleg Nesterov wrote:
>> On 02/06, Srivatsa S. Bhat wrote:
>>>
>>> The following method of CPU hotplug callback registration is not safe
>>> due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
>>> and the cpu_hotplug.lock.
>>
>> Off-topic, but perhaps it also makes sense to add the lockdep annotations
>> later, to catch other similar problems. Currently get_online_cpus() acquires
>> nothing from lockdep pov.
>
> Well, both get/put_online_cpus() and cpu_hotplug_begin/done()
> take the cpu_hotplug.lock mutex. So ideally the lockdep annotations of
> mutex_lock/unlock() should have worked.

The reason lockdep doesn't catch the lock-inversion (ABBA) deadlock between
cpu_hotplug.lock (from get_online_cpus) and cpu_add_remove_lock (from
cpu_maps_update_begin) is because, in the following path, the
cpu_add_remove_lock is acquired after *releasing* the cpu_hotplug.lock mutex.

get_online_cpus(); // acquire mutex; update counter; release mutex

register_cpu_notifier(); // acquire cpu_add_remove_lock ...

put_online_cpus();
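
A small userspace sketch may help show why a held-lock tracker sees nothing here. It is a model only, assuming a pthread mutex in place of cpu_hotplug.lock; all names (hotplug_model, model_get_online_cpus(), model_mutex_is_free(), model_writer_may_proceed()) are hypothetical. After the reader side returns, the mutex itself is free again, yet the elevated refcount still keeps a writer out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static struct {
	pthread_mutex_t lock;
	int refcount;
} hotplug_model = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Reader side: the mutex is held only long enough to bump the counter. */
static void model_get_online_cpus(void)
{
	pthread_mutex_lock(&hotplug_model.lock);
	hotplug_model.refcount++;
	pthread_mutex_unlock(&hotplug_model.lock);
}

static void model_put_online_cpus(void)
{
	pthread_mutex_lock(&hotplug_model.lock);
	assert(hotplug_model.refcount > 0);
	hotplug_model.refcount--;
	pthread_mutex_unlock(&hotplug_model.lock);
}

/* What a tracker of held mutexes can observe: is the mutex free right now? */
static bool model_mutex_is_free(void)
{
	if (pthread_mutex_trylock(&hotplug_model.lock) != 0)
		return false;
	pthread_mutex_unlock(&hotplug_model.lock);
	return true;
}

/* What actually matters to a writer: the refcount must have dropped to zero. */
static bool model_writer_may_proceed(void)
{
	bool ok;

	pthread_mutex_lock(&hotplug_model.lock);
	ok = (hotplug_model.refcount == 0);
	pthread_mutex_unlock(&hotplug_model.lock);
	return ok;
}
```

Between the get and the put, model_mutex_is_free() returns true while model_writer_may_proceed() returns false. In other words, the exclusion lives in the counter rather than in the mutex.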

> If it hasn't, then the
> following lockdep annotations to cpu-hotplug locking should do the
> trick.
>

This patch looks good to me. I have a couple of suggestions though..

> Signed-off-by: Gautham R. Shenoy <[email protected]>
> ---
> kernel/cpu.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index deff2e6..3d2dd1c 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -19,6 +19,7 @@
> #include <linux/mutex.h>
> #include <linux/gfp.h>
> #include <linux/suspend.h>
> +#include <linux/lockdep.h>
>
> #include "smpboot.h"
>
> @@ -57,21 +58,34 @@ static struct {
> * an ongoing cpu hotplug operation.
> */
> int refcount;
> +
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> + struct lockdep_map dep_map;
> +#endif
> } cpu_hotplug = {
> .active_writer = NULL,
> .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
> .refcount = 0,
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> + .dep_map = {.name = "cpu_hotplug.lock" },
> +#endif
> };
>
> +#define cphp_lock_acquire_read(l, s, t, i) lock_acquire_shared_recursive(l, s, t, NULL, i)
> +#define cphp_lock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
> +#define cphp_lock_release(l, n, i) lock_release(l, n, i)
> +

Can you make them cpuhp_* instead of cphp_*? That way it would suit better as
a short-form of "cpu hotplug".

Also, perhaps we could use the lock_map_acquire(), lock_map_acquire_read()
and lock_map_release() macros to make the call-sites look neater.

Regards,
Srivatsa S. Bhat


> void get_online_cpus(void)
> {
> might_sleep();
> if (cpu_hotplug.active_writer == current)
> return;
> + cphp_lock_acquire_read(&cpu_hotplug.dep_map, 0, 0, _RET_IP_);
> mutex_lock(&cpu_hotplug.lock);
> cpu_hotplug.refcount++;
> mutex_unlock(&cpu_hotplug.lock);
>
> +
> }
> EXPORT_SYMBOL_GPL(get_online_cpus);
>
> @@ -79,6 +93,7 @@ void put_online_cpus(void)
> {
> if (cpu_hotplug.active_writer == current)
> return;
> +
> mutex_lock(&cpu_hotplug.lock);
>
> if (WARN_ON(!cpu_hotplug.refcount))
> @@ -87,6 +102,7 @@ void put_online_cpus(void)
> if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> wake_up_process(cpu_hotplug.active_writer);
> mutex_unlock(&cpu_hotplug.lock);
> + cphp_lock_release(&cpu_hotplug.dep_map, 1, _RET_IP_);
>
> }
> EXPORT_SYMBOL_GPL(put_online_cpus);
> @@ -117,6 +133,7 @@ void cpu_hotplug_begin(void)
> {
> cpu_hotplug.active_writer = current;
>
> + cphp_lock_acquire(&cpu_hotplug.dep_map, 0, 0, _RET_IP_);
> for (;;) {
> mutex_lock(&cpu_hotplug.lock);
> if (likely(!cpu_hotplug.refcount))
> @@ -131,6 +148,7 @@ void cpu_hotplug_done(void)
> {
> cpu_hotplug.active_writer = NULL;
> mutex_unlock(&cpu_hotplug.lock);
> + cphp_lock_release(&cpu_hotplug.dep_map, 1, _RET_IP_);
> }
>
> /*
>

2014-02-10 10:53:59

by Gautham R Shenoy

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Mon, Feb 10, 2014 at 02:45:55PM +0530, Srivatsa S. Bhat wrote:
> Hi Gautham,
>
> On 02/08/2014 12:41 AM, Gautham R Shenoy wrote:
> > On Thu, Feb 06, 2014 at 07:41:03PM +0100, Oleg Nesterov wrote:
> >> On 02/06, Srivatsa S. Bhat wrote:
> >>>
> >>> The following method of CPU hotplug callback registration is not safe
> >>> due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
> >>> and the cpu_hotplug.lock.
> >>
> >> Off-topic, but perhaps it also makes sense to add the lockdep annotations
> >> later, to catch other similar problems. Currently get_online_cpus() acquires
> >> nothing from lockdep pov.
> >
> > Well, both get/put_online_cpus() and cpu_hotplug_begin/done()
> > take the cpu_hotplug.lock mutex. So ideally the lockdep annotations of
> > mutex_lock/unlock() should have worked.
>
> The reason lockdep doesn't catch the lock-inversion (ABBA) deadlock between
> cpu_hotplug.lock (from get_online_cpus) and cpu_add_remove_lock (from
> cpu_maps_update_begin) is because, in the following path, the
> cpu_add_remove_lock is acquired after *releasing* the cpu_hotplug.lock mutex.
>

Right. I get it now!

> get_online_cpus(); // acquire mutex; update counter; release mutex
>
> register_cpu_notifier(); // acquire cpu_add_remove_lock ...
>
> put_online_cpus();
>
> > If it hasn't, then the
> > following lockdep annotations to cpu-hotplug locking should do the
> > trick.
> >
>
> This patch looks good to me. I have a couple of suggestions though..
>

Thanks. I have incorporated the suggestions. Could you check if the
following looks good ?

---
Add lockdep annotations for get/put_online_cpus() and
cpu_hotplug_begin()/cpu_hotplug_done().

Cc: Oleg Nesterov <[email protected]>
Cc: Srivatsa Bhat <[email protected]>
Signed-off-by: Gautham R. Shenoy <[email protected]>
---
kernel/cpu.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index deff2e6..33caf5e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -19,6 +19,7 @@
#include <linux/mutex.h>
#include <linux/gfp.h>
#include <linux/suspend.h>
+#include <linux/lockdep.h>

#include "smpboot.h"

@@ -57,17 +58,30 @@ static struct {
* an ongoing cpu hotplug operation.
*/
int refcount;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
} cpu_hotplug = {
.active_writer = NULL,
.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
.refcount = 0,
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ .dep_map = {.name = "cpu_hotplug.lock" },
+#endif
};

+/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/done() */
+#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
+#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map)
+#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map)
+
void get_online_cpus(void)
{
might_sleep();
if (cpu_hotplug.active_writer == current)
return;
+ cpuhp_lock_acquire_read();
mutex_lock(&cpu_hotplug.lock);
cpu_hotplug.refcount++;
mutex_unlock(&cpu_hotplug.lock);
@@ -87,6 +101,7 @@ void put_online_cpus(void)
if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
wake_up_process(cpu_hotplug.active_writer);
mutex_unlock(&cpu_hotplug.lock);
+ cpuhp_lock_release();

}
EXPORT_SYMBOL_GPL(put_online_cpus);
@@ -117,6 +132,7 @@ void cpu_hotplug_begin(void)
{
cpu_hotplug.active_writer = current;

+ cpuhp_lock_acquire();
for (;;) {
mutex_lock(&cpu_hotplug.lock);
if (likely(!cpu_hotplug.refcount))
@@ -131,6 +147,7 @@ void cpu_hotplug_done(void)
{
cpu_hotplug.active_writer = NULL;
mutex_unlock(&cpu_hotplug.lock);
+ cpuhp_lock_release();
}

/*
--
1.8.3.1

2014-02-10 11:16:31

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/10/2014 04:21 PM, Gautham R Shenoy wrote:
> On Mon, Feb 10, 2014 at 02:45:55PM +0530, Srivatsa S. Bhat wrote:
>> Hi Gautham,
>>
>> On 02/08/2014 12:41 AM, Gautham R Shenoy wrote:
>>> On Thu, Feb 06, 2014 at 07:41:03PM +0100, Oleg Nesterov wrote:
>>>> On 02/06, Srivatsa S. Bhat wrote:
>>>>>
>>>>> The following method of CPU hotplug callback registration is not safe
>>>>> due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
>>>>> and the cpu_hotplug.lock.
>>>>
[...]
>> get_online_cpus(); // acquire mutex; update counter; release mutex
>>
>> register_cpu_notifier(); // acquire cpu_add_remove_lock ...
>>
>> put_online_cpus();
>>
>>> If it hasn't, then the
>>> following lockdep annotations to cpu-hotplug locking should do the
>>> trick.
>>>
>>
>> This patch looks good to me. I have a couple of suggestions though..
>>
>
> Thanks. I have incorporated the suggestions. Could you check if the
> following looks good ?
>
> ---
> Add lockdep annotations for get/put_online_cpus() and
> cpu_hotplug_begin()/cpu_hotplug_done().
>
> Cc: Oleg Nesterov <[email protected]>
> Cc: Srivatsa Bhat <[email protected]>
> Signed-off-by: Gautham R. Shenoy <[email protected]>
> ---
[...]
> +/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/done() */
> +#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
> +#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map)
> +#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map)
> +
> void get_online_cpus(void)
> {
> might_sleep();
> if (cpu_hotplug.active_writer == current)
> return;
> + cpuhp_lock_acquire_read();
> mutex_lock(&cpu_hotplug.lock);
> cpu_hotplug.refcount++;
> mutex_unlock(&cpu_hotplug.lock);
> @@ -87,6 +101,7 @@ void put_online_cpus(void)
> if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> wake_up_process(cpu_hotplug.active_writer);
> mutex_unlock(&cpu_hotplug.lock);
> + cpuhp_lock_release();
>
> }
> EXPORT_SYMBOL_GPL(put_online_cpus);
> @@ -117,6 +132,7 @@ void cpu_hotplug_begin(void)
> {
> cpu_hotplug.active_writer = current;
>
> + cpuhp_lock_acquire();

Shouldn't we move this to _after_ the for-loop? Because, that's when the
hotplug writer is really in a state equivalent to exclusive access to the
hotplug lock... Else, we might fool lockdep into believing that the hotplug
writer has the lock for write, and at the same time several readers have
the lock for read as well.. no?

Sorry I didn't notice this earlier.

> for (;;) {
> mutex_lock(&cpu_hotplug.lock);
> if (likely(!cpu_hotplug.refcount))
> @@ -131,6 +147,7 @@ void cpu_hotplug_done(void)
> {
> cpu_hotplug.active_writer = NULL;
> mutex_unlock(&cpu_hotplug.lock);
> + cpuhp_lock_release();
> }
>
> /*
>

Regards,
Srivatsa S. Bhat

2014-02-10 12:07:10

by Gautham R Shenoy

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Mon, Feb 10, 2014 at 04:41:04PM +0530, Srivatsa S. Bhat wrote:
> > ---
> [...]
> > +/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/done() */
> > +#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
> > +#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map)
> > +#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map)
> > +
> > void get_online_cpus(void)
> > {
> > might_sleep();
> > if (cpu_hotplug.active_writer == current)
> > return;
> > + cpuhp_lock_acquire_read();
> > mutex_lock(&cpu_hotplug.lock);
> > cpu_hotplug.refcount++;
> > mutex_unlock(&cpu_hotplug.lock);
> > @@ -87,6 +101,7 @@ void put_online_cpus(void)
> > if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> > wake_up_process(cpu_hotplug.active_writer);
> > mutex_unlock(&cpu_hotplug.lock);
> > + cpuhp_lock_release();
> >
> > }
> > EXPORT_SYMBOL_GPL(put_online_cpus);
> > @@ -117,6 +132,7 @@ void cpu_hotplug_begin(void)
> > {
> > cpu_hotplug.active_writer = current;
> >
> > + cpuhp_lock_acquire();
>
> Shouldn't we move this to _after_ the for-loop?

No, if we move this to after the for-loop, we won't be able to catch
the ABBA dependency that you had mentioned earlier.

Consider the case

Thread 1:                               Thread 2:
------------------------------------------------------------------------
get_online_cpus()
// lockdep knows about this.
                                        cpu_maps_update_begin()
                                        // lockdep knows about this.

register_cpu_notifier()
   |
   |-> cpu_maps_update_begin()
       // lockdep knows about this.


                                        cpu_hotplug_begin()
                                          |
                                          |--> for (;;) {
                                                 Wait for all the
                                                 readers to exit.

                                                 This will never
                                                 happen now and
                                                 we're stuck here
                                                 forever without
                                                 telling anyone why!
                                               }

                                               cpuhp_lock_acquire();

--------------------------------------------------------------------------
> Because, that's when the
> hotplug writer is really in a state equivalent to exclusive access to the
> hotplug lock... Else, we might fool lockdep into believing that the hotplug
> writer has the lock for write, and at the same time several readers have
> the lock for read as well.. no?
>

Well as I understand it, the purpose of lockdep annotations is to
signal the intent of acquiring a lock as opposed to reporting the
status that the lock has been acquired.

The annotation in the earlier patch is consistent with the lockdep
annotations in rwlocks. Except for the fact that the readers of
cpu_hotplug.lock can sleep having acquired the lock, there's no
difference between rwlock semantics and cpu-hotplug lock behaviour.
Both are unfair to the writer as they allow new readers to acquire the
lock as long as there's some reader which holds the lock.

--
Thanks and Regards
gautham.
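
The "annotate the intent" argument can be sketched with a toy lock-order recorder. This is a drastically simplified stand-in for lockdep, not its real algorithm: it only tracks direct A-then-B edges between two lock classes, and all names (note_acquire(), note_release(), struct ctx, the enum values) are invented for this illustration. The recorder is told about a lock before the caller would block on it, which is where the annotation sits in cpu_hotplug_begin() in the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum lock_class { HOTPLUG_LOCK, ADD_REMOVE_LOCK, NR_CLASSES };

/* edge[a][b] becomes true once some context wanted b while holding a. */
static bool edge[NR_CLASSES][NR_CLASSES];
static int inversions;

/* Per-thread view of which lock classes are currently held. */
struct ctx {
	bool held[NR_CLASSES];
};

/* Record the intent to take 'want'; meant to be called before blocking. */
static void note_acquire(struct ctx *c, enum lock_class want)
{
	assert(want < NR_CLASSES);
	for (int h = 0; h < NR_CLASSES; h++) {
		if (!c->held[h] || h == (int)want)
			continue;
		if (edge[want][h])
			inversions++;	/* opposite order seen earlier: ABBA */
		edge[h][want] = true;
	}
	c->held[want] = true;
}

static void note_release(struct ctx *c, enum lock_class l)
{
	c->held[l] = false;
}
```

Replaying the scenario above: thread 1 notes HOTPLUG_LOCK (read side) and then ADD_REMOVE_LOCK, thread 2 notes ADD_REMOVE_LOCK and then HOTPLUG_LOCK. The inversion is counted at thread 2's last call. If that call were placed after the wait loop instead, a thread stuck in the loop would never reach it and the recorder would stay silent.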

2014-02-10 13:34:21

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/10/2014 05:35 PM, Gautham R Shenoy wrote:
> On Mon, Feb 10, 2014 at 04:41:04PM +0530, Srivatsa S. Bhat wrote:
>>> ---
>> [...]
>>> +/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/done() */
>>> +#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
>>> +#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map)
>>> +#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map)
>>> +
>>> void get_online_cpus(void)
>>> {
>>> might_sleep();
>>> if (cpu_hotplug.active_writer == current)
>>> return;
>>> + cpuhp_lock_acquire_read();
>>> mutex_lock(&cpu_hotplug.lock);
>>> cpu_hotplug.refcount++;
>>> mutex_unlock(&cpu_hotplug.lock);
>>> @@ -87,6 +101,7 @@ void put_online_cpus(void)
>>> if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
>>> wake_up_process(cpu_hotplug.active_writer);
>>> mutex_unlock(&cpu_hotplug.lock);
>>> + cpuhp_lock_release();
>>>
>>> }
>>> EXPORT_SYMBOL_GPL(put_online_cpus);
>>> @@ -117,6 +132,7 @@ void cpu_hotplug_begin(void)
>>> {
>>> cpu_hotplug.active_writer = current;
>>>
>>> + cpuhp_lock_acquire();
>>
>> Shouldn't we move this to _after_ the for-loop?
>
> No, if we move this to after the for-loop, we won't be able to catch
> the ABBA dependency that you had mentioned earlier.
>
> Consider the case
>
> Thread 1:                               Thread 2:
> ------------------------------------------------------------------------
> get_online_cpus()
> // lockdep knows about this.
>                                         cpu_maps_update_begin()
>                                         // lockdep knows about this.
>
> register_cpu_notifier()
>    |
>    |-> cpu_maps_update_begin()
>        // lockdep knows about this.
>
>
>                                         cpu_hotplug_begin()
>                                           |
>                                           |--> for (;;) {
>                                                  Wait for all the
>                                                  readers to exit.
>
>                                                  This will never
>                                                  happen now and
>                                                  we're stuck here
>                                                  forever without
>                                                  telling anyone why!
>                                                }
>
>                                                cpuhp_lock_acquire();
>

Ok, that is a very convincing explanation!

> --------------------------------------------------------------------------
>> Because, that's when the
>> hotplug writer is really in a state equivalent to exclusive access to the
>> hotplug lock... Else, we might fool lockdep into believing that the hotplug
>> writer has the lock for write, and at the same time several readers have
>> the lock for read as well.. no?
>>
>
> Well as I understand it, the purpose of lockdep annotations is to
> signal the intent of acquiring a lock as opposed to reporting the
> status that the lock has been acquired.
>
> The annotation in the earlier patch is consistent with the lockdep
> annotations in rwlocks. Except for the fact that the readers of
> cpu_hotplug.lock can sleep having acquired the lock, there's no
> difference between rwlock semantics and cpu-hotplug lock behaviour.
> Both are unfair to the writer as they allow new readers to acquire the
> lock as long as there's some reader which holds the lock.
>

Ah, ok.. Thanks a lot for the clarification!

Regards,
Srivatsa S. Bhat

2014-02-10 13:35:56

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/10/2014 04:21 PM, Gautham R Shenoy wrote:
> On Mon, Feb 10, 2014 at 02:45:55PM +0530, Srivatsa S. Bhat wrote:
>> Hi Gautham,
>>
>> On 02/08/2014 12:41 AM, Gautham R Shenoy wrote:
>>> On Thu, Feb 06, 2014 at 07:41:03PM +0100, Oleg Nesterov wrote:
>>>> On 02/06, Srivatsa S. Bhat wrote:
>>>>>
>>>>> The following method of CPU hotplug callback registration is not safe
>>>>> due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
>>>>> and the cpu_hotplug.lock.
>>>>
[...]
>> This patch looks good to me. I have a couple of suggestions though..
>>
>
> Thanks. I have incorporated the suggestions. Could you check if the
> following looks good ?
>
> ---
> Add lockdep annotations for get/put_online_cpus() and
> cpu_hotplug_begin()/cpu_hotplug_done().
>
> Cc: Oleg Nesterov <[email protected]>
> Cc: Srivatsa Bhat <[email protected]>
> Signed-off-by: Gautham R. Shenoy <[email protected]>

Reviewed-by: Srivatsa S. Bhat <[email protected]>

Regards,
Srivatsa S. Bhat

> ---
> kernel/cpu.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index deff2e6..33caf5e 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -19,6 +19,7 @@
> #include <linux/mutex.h>
> #include <linux/gfp.h>
> #include <linux/suspend.h>
> +#include <linux/lockdep.h>
>
> #include "smpboot.h"
>
> @@ -57,17 +58,30 @@ static struct {
> * an ongoing cpu hotplug operation.
> */
> int refcount;
> +
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> + struct lockdep_map dep_map;
> +#endif
> } cpu_hotplug = {
> .active_writer = NULL,
> .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
> .refcount = 0,
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> + .dep_map = {.name = "cpu_hotplug.lock" },
> +#endif
> };
>
> +/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/done() */
> +#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
> +#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map)
> +#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map)
> +
> void get_online_cpus(void)
> {
> might_sleep();
> if (cpu_hotplug.active_writer == current)
> return;
> + cpuhp_lock_acquire_read();
> mutex_lock(&cpu_hotplug.lock);
> cpu_hotplug.refcount++;
> mutex_unlock(&cpu_hotplug.lock);
> @@ -87,6 +101,7 @@ void put_online_cpus(void)
> if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> wake_up_process(cpu_hotplug.active_writer);
> mutex_unlock(&cpu_hotplug.lock);
> + cpuhp_lock_release();
>
> }
> EXPORT_SYMBOL_GPL(put_online_cpus);
> @@ -117,6 +132,7 @@ void cpu_hotplug_begin(void)
> {
> cpu_hotplug.active_writer = current;
>
> + cpuhp_lock_acquire();
> for (;;) {
> mutex_lock(&cpu_hotplug.lock);
> if (likely(!cpu_hotplug.refcount))
> @@ -131,6 +147,7 @@ void cpu_hotplug_done(void)
> {
> cpu_hotplug.active_writer = NULL;
> mutex_unlock(&cpu_hotplug.lock);
> + cpuhp_lock_release();
> }
>
> /*
>

2014-02-10 15:31:42

by Oleg Nesterov

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/10, Gautham R Shenoy wrote:
>
> Add lockdep annotations for get/put_online_cpus() and
> cpu_hotplug_begin()/cpu_hotplug_done().

Thanks, looks good.

Reviewed-by: Oleg Nesterov <[email protected]>

2014-02-10 15:54:21

by Oleg Nesterov

Subject: Re: [PATCH 19/51] x86, therm_throt.c: Fix CPU hotplug callback registration

On 02/06, Srivatsa S. Bhat wrote:
>
> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> @@ -319,7 +319,7 @@ static __init int thermal_throttle_init_device(void)
> if (!atomic_read(&therm_throt_en))
> return 0;
>
> - register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
> + cpu_maps_update_begin();
>
> #ifdef CONFIG_HOTPLUG_CPU
> mutex_lock(&therm_cpu_lock);
> @@ -333,6 +333,9 @@ static __init int thermal_throttle_init_device(void)
> mutex_unlock(&therm_cpu_lock);
> #endif
>
> + __register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
> + cpu_maps_update_done();


Off-topic, but it seems that after this change therm_cpu_lock can die.
Of course this needs another patch (if I am right).

Oleg.

2014-02-10 17:27:33

by Balbir Singh

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Mon, Feb 10, 2014 at 04:21:30PM +0530, Gautham R Shenoy wrote:
> On Mon, Feb 10, 2014 at 02:45:55PM +0530, Srivatsa S. Bhat wrote:
>
> + cpuhp_lock_acquire_read();
> mutex_lock(&cpu_hotplug.lock);

Don't you want to abstract cpuhp_lock_acquire_read and mutex_lock into a
more useful primitive? Ditto for the unlock bits - specifically if they
will always be used together.

2014-02-10 17:35:20

by Srivatsa S. Bhat

Subject: Re: [PATCH 19/51] x86, therm_throt.c: Fix CPU hotplug callback registration

On 02/10/2014 09:23 PM, Oleg Nesterov wrote:
> On 02/06, Srivatsa S. Bhat wrote:
>>
>> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
>> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
>> @@ -319,7 +319,7 @@ static __init int thermal_throttle_init_device(void)
>> if (!atomic_read(&therm_throt_en))
>> return 0;
>>
>> - register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
>> + cpu_maps_update_begin();
>>
>> #ifdef CONFIG_HOTPLUG_CPU
>> mutex_lock(&therm_cpu_lock);
>> @@ -333,6 +333,9 @@ static __init int thermal_throttle_init_device(void)
>> mutex_unlock(&therm_cpu_lock);
>> #endif
>>
>> + __register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
>> + cpu_maps_update_done();
>
>
> Off-topic, but it seems that after this change therm_cpu_lock can die.
> Of course this needs another patch (if I am right).
>

I'm not sure I understood that clearly. Can you please explain?

I'm guessing that you are referring to some problem with the #ifdef
CONFIG_HOTPLUG_CPU around mutex_lock/unlock(&therm_cpu_lock) ?

Regards,
Srivatsa S. Bhat

2014-02-10 18:09:56

by Srivatsa S. Bhat

Subject: Re: [PATCH 19/51] x86, therm_throt.c: Fix CPU hotplug callback registration

On 02/10/2014 10:59 PM, Srivatsa S. Bhat wrote:
> On 02/10/2014 09:23 PM, Oleg Nesterov wrote:
>> On 02/06, Srivatsa S. Bhat wrote:
>>>
>>> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
>>> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
>>> @@ -319,7 +319,7 @@ static __init int thermal_throttle_init_device(void)
>>> if (!atomic_read(&therm_throt_en))
>>> return 0;
>>>
>>> - register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
>>> + cpu_maps_update_begin();
>>>
>>> #ifdef CONFIG_HOTPLUG_CPU
>>> mutex_lock(&therm_cpu_lock);
>>> @@ -333,6 +333,9 @@ static __init int thermal_throttle_init_device(void)
>>> mutex_unlock(&therm_cpu_lock);
>>> #endif
>>>
>>> + __register_hotcpu_notifier(&thermal_throttle_cpu_notifier);
>>> + cpu_maps_update_done();
>>
>>
>> Off-topic, but it seems that after this change therm_cpu_lock can die.
>> Of course this needs another patch (if I am right).
>>
>
> I'm not sure I understood that clearly. Can you please explain?
>
> I'm guessing that you are referring to some problem with the #ifdef
> CONFIG_HOTPLUG_CPU around mutex_lock/unlock(&therm_cpu_lock) ?
>

Oh, never mind, now I see it. Basically you are suggesting that therm_cpu_lock
is useless after this patch and hence can be removed entirely. Yep, that
makes sense. (I hadn't noticed it earlier). I'll incorporate that change
in a separate patch in my v2 of the patchset. Thank you!

Regards,
Srivatsa S. Bhat

2014-02-10 18:51:19

by Gautham R Shenoy

Subject: Re: [PATCH 16/51] x86, vsyscall: Fix CPU hotplug callback registration

Hi,

On Thu, Feb 06, 2014 at 03:37:27AM +0530, Srivatsa S. Bhat wrote:
> @@ -393,9 +393,13 @@ static int __init vsyscall_init(void)
> {
> BUG_ON(VSYSCALL_ADDR(0) != __fix_to_virt(VSYSCALL_FIRST_PAGE));
>
> + cpu_maps_update_begin();
> +
> on_each_cpu(cpu_vsyscall_init, NULL, 1);
> /* notifier priority > KVM */
> - hotcpu_notifier(cpu_vsyscall_notifier, 30);
> + __hotcpu_notifier(cpu_vsyscall_notifier, 30);

While we're at it, we could also #define the VSYSCALL_PRIO relative to
KVM_PRIO instead of hard-coding the value here, no ?

> +
> + cpu_maps_update_done();
>
> return 0;
> }
>

2014-02-10 18:59:32

by Gautham R Shenoy

Subject: Re: [PATCH 24/51] x86, hpet: Fix CPU hotplug callback registration

Hi,

On Thu, Feb 06, 2014 at 03:39:00AM +0530, Srivatsa S. Bhat wrote:
> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
> index da85a8e..199aaae 100644
> --- a/arch/x86/kernel/hpet.c
> +++ b/arch/x86/kernel/hpet.c
> @@ -943,12 +943,14 @@ static __init int hpet_late_init(void)
> if (boot_cpu_has(X86_FEATURE_ARAT))
> return 0;
>
> + cpu_maps_update_begin();
> for_each_online_cpu(cpu) {
> hpet_cpuhp_notify(NULL, CPU_ONLINE, (void *)(long)cpu);
> }
>
> /* This notifier should be called after workqueue is ready */
> - hotcpu_notifier(hpet_cpuhp_notify, -20);
> + __hotcpu_notifier(hpet_cpuhp_notify, -20);

We could #define HPET_CPUHP_PRIO relative to the workqueue priority instead of
hard-coding it this way.
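
A small sketch of that idea, under stated assumptions: WORKQUEUE_CPUHP_PRIO
and its value are hypothetical placeholders for whatever priority the
workqueue notifier really uses, and the offset is chosen only so that the
result matches the -20 in the hunk above. Notifier chains call
higher-priority entries first, so "after workqueue" means a strictly lower
value.

```c
#include <assert.h>

/* Hypothetical value; stands in for the workqueue notifier's priority. */
#define WORKQUEUE_CPUHP_PRIO	(-5)

/* Defined relative to the workqueue priority; evaluates to -20 here. */
#define HPET_CPUHP_PRIO		(WORKQUEUE_CPUHP_PRIO - 15)

/* Returns 1 if a notifier with priority a is called before one with b. */
int runs_before(int a, int b)
{
	return a > b;
}
```

With such a definition, a change to the workqueue priority would move the
HPET notifier along with it instead of silently breaking the ordering.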


--
Thanks and Regards
gautham.

2014-02-10 19:08:27

by Gautham R Shenoy

Subject: Re: [PATCH 26/51] x86, oprofile, nmi: Fix CPU hotplug callback registration

Hi,

On Thu, Feb 06, 2014 at 03:39:22AM +0530, Srivatsa S. Bhat wrote:
> Fix the oprofile code in x86 by using this latter form of callback
> registration. But retain the calls to get/put_online_cpus(), since they
> also protect the variables 'nmi_enabled' and 'ctr_running'.

get/put_online_cpus() protect us against cpu_hotplug_begin/end(). The
latter is always nested inside cpu_maps_update_begin/end(), which we
are already using here.

So what additional protection are we getting by retaining
get/put_online_cpus() ?

> By nesting
> get/put_online_cpus() *inside* cpu_maps_update_begin/done(), we avoid
> the ABBA deadlock possibility mentioned above.
>
> Cc: Robert Richter <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> arch/x86/oprofile/nmi_int.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
> index 6890d84..85e5f6e 100644
> --- a/arch/x86/oprofile/nmi_int.c
> +++ b/arch/x86/oprofile/nmi_int.c
> @@ -494,14 +494,19 @@ static int nmi_setup(void)
> if (err)
> goto fail;
>
> + cpu_maps_update_begin();
> +
> + /* Use get/put_online_cpus() to protect 'nmi_enabled' */
> get_online_cpus();
> - register_cpu_notifier(&oprofile_cpu_nb);
> nmi_enabled = 1;
> /* make nmi_enabled visible to the nmi handler: */
> smp_mb();
> on_each_cpu(nmi_cpu_setup, NULL, 1);
> + __register_cpu_notifier(&oprofile_cpu_nb);
> put_online_cpus();
>
> + cpu_maps_update_done();
> +
> return 0;
> fail:
> free_msrs();
> @@ -512,12 +517,18 @@ static void nmi_shutdown(void)
> {
> struct op_msrs *msrs;
>
> + cpu_maps_update_begin();
> +
> + /* Use get/put_online_cpus() to protect 'nmi_enabled' & 'ctr_running' */
> get_online_cpus();
> - unregister_cpu_notifier(&oprofile_cpu_nb);
> on_each_cpu(nmi_cpu_shutdown, NULL, 1);
> nmi_enabled = 0;
> ctr_running = 0;
> + __unregister_cpu_notifier(&oprofile_cpu_nb);
> put_online_cpus();
> +
> + cpu_maps_update_done();
> +
> /* make variables visible to the nmi handler: */
> smp_mb();
> unregister_nmi_handler(NMI_LOCAL, "oprofile");
>

2014-02-10 19:28:36

by Gautham R Shenoy

Subject: Re: [PATCH 26/51] x86, oprofile, nmi: Fix CPU hotplug callback registration

On Tue, Feb 11, 2014 at 12:37:37AM +0530, Gautham R Shenoy wrote:
> Hi,
>
> On Thu, Feb 06, 2014 at 03:39:22AM +0530, Srivatsa S. Bhat wrote:
> > Fix the oprofile code in x86 by using this latter form of callback
> > registration. But retain the calls to get/put_online_cpus(), since they
> > also protect the variables 'nmi_enabled' and 'ctr_running'.
>
> get/put_online_cpus() protect us against cpu_hotplug_begin/end(). The
> latter is always nested inside cpu_maps_update_begin/end(), which we
> are already using here.
>
> So what additional protection are we getting by retaining
> get/put_online_cpus() ?

Probably you mean to say that there are other places which access
'nmi_enabled' and 'ctr_running' with the cpu-hotplug protection
provided only by get/put_online_cpus() and you are retaining the calls
in this patch to be consistent with those other places. If so, could
you reword the changelog to reflect this instead of saying "they also
protect the variables ..." ?

--
Thanks and Regards
gautham.

2014-02-11 01:32:51

by Toshi Kani

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Thu, 2014-02-06 at 03:34 +0530, Srivatsa S. Bhat wrote:
:
> The problem here is that callback registration takes the locks in one order
> whereas the CPU hotplug operations take the same locks in the opposite order.
> To avoid this issue and to provide a race-free method to register CPU hotplug
> callbacks (along with initialization of already online CPUs), introduce new
> variants of the callback registration APIs that simply register the callbacks
> without holding the cpu_add_remove_lock during the registration. That way,
> we can avoid the ABBA scenario. However, we will need to hold the
> cpu_add_remove_lock throughout the entire critical section, to protect updates
> to the callback/notifier chain.
>
> This can be achieved by writing the callback registration code as follows:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* This doesn't take the cpu_add_remove_lock */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
> Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
> because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
> notifiers, and hence get_online_cpus() cannot provide the necessary
> synchronization to protect the callback/notifier chains against concurrent
> reads and writes. On the other hand, since the cpu_add_remove_lock protects
> the entire hotplug operation (including CPU_POST_DEAD), we can use
> cpu_maps_update_begin/done() to guarantee proper synchronization.
>
> Also, since cpu_maps_update_begin/done() is like a super-set of
> get/put_online_cpus(), the former naturally protects the critical sections
> from concurrent hotplug operations.

get/put_online_cpus() is a reader-lock and concurrent executions are
allowed among the readers. They won't be serialized until a cpu
online/offline operation begins. By replacing this lock with
cpu_maps_update_begin/done(), we now serialize all readers. Isn't that
too restrictive? Can we fix the issue with CPU_POST_DEAD and continue
to use get_online_cpus()?

Thanks,
-Toshi


2014-02-11 07:04:23

by Srivatsa S. Bhat

Subject: Re: [PATCH 16/51] x86, vsyscall: Fix CPU hotplug callback registration

On 02/11/2014 12:20 AM, Gautham R Shenoy wrote:
> Hi,
>
> On Thu, Feb 06, 2014 at 03:37:27AM +0530, Srivatsa S. Bhat wrote:
>> @@ -393,9 +393,13 @@ static int __init vsyscall_init(void)
>> {
>> BUG_ON(VSYSCALL_ADDR(0) != __fix_to_virt(VSYSCALL_FIRST_PAGE));
>>
>> + cpu_maps_update_begin();
>> +
>> on_each_cpu(cpu_vsyscall_init, NULL, 1);
>> /* notifier priority > KVM */
>> - hotcpu_notifier(cpu_vsyscall_notifier, 30);
>> + __hotcpu_notifier(cpu_vsyscall_notifier, 30);
>
> While we're at it, we could also #define the VSYSCALL_PRIO relative to
> KVM_PRIO instead of hard-coding the value here, no ?
>

Yeah, that sounds like a good idea, but I guess we can do these
cleanups in a separate patch series.

>> +
>> + cpu_maps_update_done();
>>
>> return 0;
>> }
>>

Regards,
Srivatsa S. Bhat

2014-02-11 07:05:06

by Srivatsa S. Bhat

Subject: Re: [PATCH 24/51] x86, hpet: Fix CPU hotplug callback registration

On 02/11/2014 12:28 AM, Gautham R Shenoy wrote:
> Hi,
>
> On Thu, Feb 06, 2014 at 03:39:00AM +0530, Srivatsa S. Bhat wrote:
>> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
>> index da85a8e..199aaae 100644
>> --- a/arch/x86/kernel/hpet.c
>> +++ b/arch/x86/kernel/hpet.c
>> @@ -943,12 +943,14 @@ static __init int hpet_late_init(void)
>> if (boot_cpu_has(X86_FEATURE_ARAT))
>> return 0;
>>
>> + cpu_maps_update_begin();
>> for_each_online_cpu(cpu) {
>> hpet_cpuhp_notify(NULL, CPU_ONLINE, (void *)(long)cpu);
>> }
>>
>> /* This notifier should be called after workqueue is ready */
>> - hotcpu_notifier(hpet_cpuhp_notify, -20);
>> + __hotcpu_notifier(hpet_cpuhp_notify, -20);
>
> We could #define HPET_CPUHP_PRIO relative to the workqueue priority instead of
> hardcoding this it this way.
>

Sure, we can do that. Thanks for pointing that out!

Regards,
Srivatsa S. Bhat

2014-02-11 07:06:32

by Srivatsa S. Bhat

Subject: Re: [PATCH 26/51] x86, oprofile, nmi: Fix CPU hotplug callback registration

On 02/11/2014 12:57 AM, Gautham R Shenoy wrote:
> On Tue, Feb 11, 2014 at 12:37:37AM +0530, Gautham R Shenoy wrote:
>> Hi,
>>
>> On Thu, Feb 06, 2014 at 03:39:22AM +0530, Srivatsa S. Bhat wrote:
>>> Fix the oprofile code in x86 by using this latter form of callback
>>> registration. But retain the calls to get/put_online_cpus(), since they
>>> also protect the variables 'nmi_enabled' and 'ctr_running'.
>>
>> get/put_online_cpus() protect us against cpu_hotplug_begin/end(). The
>> latter is always nested inside cpu_maps_update_begin/end(), which we
>> are already using here.
>>
>> So what additional protection are we getting by retaining
>> get/put_online_cpus() ?
>
> Probably you mean to say that there are other places which access
> 'nmi_enabled' and 'ctr_running' with the cpu-hotplug protection
> provided only by get/put_online_cpus() and you are retaining the calls
> in this patch to be consistent with those other places.

Yep, exactly!

> If so, could
> you reword the changelog to reflect this instead of saying "they also
> protect the variables ..." ?
>

Ok, will do! Thanks a lot!

Regards,
Srivatsa S. Bhat

2014-02-11 09:33:33

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/11/2014 06:56 AM, Toshi Kani wrote:
> On Thu, 2014-02-06 at 03:34 +0530, Srivatsa S. Bhat wrote:
> :
[...]
>>
>> Also, since cpu_maps_update_begin/done() is like a super-set of
>> get/put_online_cpus(), the former naturally protects the critical sections
>> from concurrent hotplug operations.
>
> get/put_online_cpus() is a reader-lock and concurrent executions are
> allowed among the readers. They won't be serialized until a cpu
> online/offline operation begins. By replacing this lock with
> cpu_maps_update_begin/done(), we now serialize all readers. Isn't that
> too restrictive?

That's an excellent line of thought! It doesn't really hurt at the moment
because the for_each_online_cpu() kind of loop that the initcalls of various
subsystems run (before registering the notifier) is really tiny (typically
the loop runs for just 1 cpu, the boot-cpu). In other words, this change
represents a tiny increase in the critical section size; so its effect
shouldn't be noticeable. (Note that in the old model, register_cpu_notifier()
already takes the cpu_add_remove_lock, so they will be serialized at that
point, and this is necessary).

However, going forward, when we start using more aggressive CPU onlining
techniques during boot (such as parallel CPU hotplug), the issue you pointed
out can become a real bottleneck, since for_each_online_cpu() can become
quite a large loop, and hence explicit (and unnecessary) mutual exclusion
will start hurting.

> Can we fix the issue with CPU_POST_DEAD and continue
> to use get_online_cpus()?
>

We don't want to get rid of CPU_POST_DEAD, so unfortunately we can't continue
to use get_online_cpus(). However, I am thinking of introducing a Reader-Writer
semaphore for this purpose, so that the registration routines can run in
parallel most of the time. (Basically, the rw-semaphore is like
get/put_online_cpus(), except that it protects the full hotplug critical section,
including the CPU_POST_DEAD stage.)

The usage would be like this:

cpu_notifier_register_begin(); //does down_read(&cpu_hotplug_rwsem);

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Takes cpu_add_remove_lock mutex */
register_cpu_notifier(&foobar_cpu_notifier);

cpu_notifier_register_end(); //does up_read(&cpu_hotplug_rwsem);

An untested RFC patch is shown below. With this, the task performing CPU hotplug
will take the locks in this order:

down_write(&cpu_hotplug_rwsem);
mutex_lock(&cpu_add_remove_lock);

mutex_lock(&cpu_hotplug.lock);

//Perform CPU hotplug

mutex_unlock(&cpu_hotplug.lock);

mutex_unlock(&cpu_add_remove_lock);
up_write(&cpu_hotplug_rwsem);


The APIs register/unregister_cpu_notifiers will take only the
cpu_add_remove_lock mutex during callback registration.


-----------------------------------------------------------------------------

diff --git a/kernel/cpu.c b/kernel/cpu.c
index deff2e6..8054e6f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -23,6 +23,19 @@
#include "smpboot.h"

#ifdef CONFIG_SMP
+
+static DECLARE_RWSEM(cpu_hotplug_rwsem);
+
+void cpu_notifier_register_begin(void)
+{
+ down_read(&cpu_hotplug_rwsem);
+}
+
+void cpu_notifier_register_end(void)
+{
+ up_read(&cpu_hotplug_rwsem);
+}
+
/* Serializes the updates to cpu_online_mask, cpu_present_mask */
static DEFINE_MUTEX(cpu_add_remove_lock);

@@ -32,12 +45,14 @@ static DEFINE_MUTEX(cpu_add_remove_lock);
*/
void cpu_maps_update_begin(void)
{
+ down_write(&cpu_hotplug_rwsem);
mutex_lock(&cpu_add_remove_lock);
}

void cpu_maps_update_done(void)
{
mutex_unlock(&cpu_add_remove_lock);
+ up_write(&cpu_hotplug_rwsem);
}

static RAW_NOTIFIER_HEAD(cpu_chain);
@@ -160,9 +175,10 @@ void cpu_hotplug_enable(void)
int __ref register_cpu_notifier(struct notifier_block *nb)
{
int ret;
- cpu_maps_update_begin();
+
+ mutex_lock(&cpu_add_remove_lock);
ret = raw_notifier_chain_register(&cpu_chain, nb);
- cpu_maps_update_done();
+ mutex_unlock(&cpu_add_remove_lock);
return ret;
}

@@ -192,9 +208,9 @@ EXPORT_SYMBOL(register_cpu_notifier);

void __ref unregister_cpu_notifier(struct notifier_block *nb)
{
- cpu_maps_update_begin();
+ mutex_lock(&cpu_add_remove_lock);
raw_notifier_chain_unregister(&cpu_chain, nb);
- cpu_maps_update_done();
+ mutex_unlock(&cpu_add_remove_lock);
}
EXPORT_SYMBOL(unregister_cpu_notifier);


Regards,
Srivatsa S. Bhat

2014-02-11 16:40:30

by Toshi Kani

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Tue, 2014-02-11 at 09:27 +0000, Srivatsa S. Bhat wrote:
> On 02/11/2014 06:56 AM, Toshi Kani wrote:
> > On Thu, 2014-02-06 at 03:34 +0530, Srivatsa S. Bhat wrote:
> > :
> [...]
> >>
> >> Also, since cpu_maps_update_begin/done() is like a super-set of
> >> get/put_online_cpus(), the former naturally protects the critical sections
> >> from concurrent hotplug operations.
> >
> > get/put_online_cpus() is a reader-lock and concurrent executions are
> > allowed among the readers. They won't be serialized until a cpu
> > online/offline operation begins. By replacing this lock with
> > cpu_maps_update_begin/done(), we now serialize all readers. Isn't that
> > too restrictive?
>
> That's an excellent line of thought! It doesn't really hurt at the moment
> because the for_each_online_cpu() kind of loop that the initcalls of various
> subsystems run (before registering the notifier) are really tiny (typically
> the loop runs for just 1 cpu, the boot-cpu). In other words, this change
> represents a tiny increase in the critical section size; so its effect
> shouldn't be noticeable. (Note that in the old model, register_cpu_notifier()
> already takes the cpu_add_remove_lock, so they will be serialized at that
> point, and this is necessary).
>
> However, going forward, when we start using more aggressive CPU onlining
> techniques during boot (such as parallel CPU hotplug), the issue you pointed
> out can become a real bottleneck, since for_each_online_cpu() can become
> quite a large loop, and hence explicit (and unnecessary) mutual exclusion
> will start hurting.
>
> > Can we fix the issue with CPU_POST_DEAD and continue
> > to use get_online_cpus()?
> >
>
> We don't want to get rid of CPU_POST_DEAD, so unfortunately we can't continue
> to use get_online_cpus(). However, I am thinking of introducing a Reader-Writer
> semaphore for this purpose, so that the registration routines can run in
> parallel most of the time. (Basically, the rw-semaphore is like
> get/put_online_cpus(), except that it protects the full hotplug critical section,
> including the CPU_POST_DEAD stage.)

I agree that introducing a reader-writer semaphore allows concurrent
executions. Adding yet another hotplug lock is a bit unfortunate,
though.

This may be a dumb question, but can't we simply do it this way?

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

put_online_cpus();

register_cpu_notifier(&foobar_cpu_notifier);

Thanks,
-Toshi


2014-02-11 17:17:25

by Oleg Nesterov

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/11, Srivatsa S. Bhat wrote:
>
> +static DECLARE_RWSEM(cpu_hotplug_rwsem);
> +
> +void cpu_notifier_register_begin(void)
> +{
> + down_read(&cpu_hotplug_rwsem);
> +}
> +
> +void cpu_notifier_register_end(void)
> +{
> + up_read(&cpu_hotplug_rwsem);
> +}
> +
> /* Serializes the updates to cpu_online_mask, cpu_present_mask */
> static DEFINE_MUTEX(cpu_add_remove_lock);
>
> @@ -32,12 +45,14 @@ static DEFINE_MUTEX(cpu_add_remove_lock);
> */
> void cpu_maps_update_begin(void)
> {
> + down_write(&cpu_hotplug_rwsem);
> mutex_lock(&cpu_add_remove_lock);
> }
>
> void cpu_maps_update_done(void)
> {
> mutex_unlock(&cpu_add_remove_lock);
> + up_write(&cpu_hotplug_rwsem);
> }

I am a bit confused... If we do this, why can't we simply turn
cpu_add_remove_lock into rw_semaphore?

Oleg.

2014-02-11 17:18:46

by Gautham R Shenoy

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Tue, Feb 11, 2014 at 09:33:56AM -0700, Toshi Kani wrote:
>
> I agree that introducing a reader-writer semaphore allows concurrent
> executions. Adding yet another hotplug lock is a bit unfortunate,
> though.
>

I agree with this last part. We already have enough locks for
cpu-hotplug. Another one sounds one too many!!


> This may be a dumb question, but can't we simply do this way?
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> put_online_cpus();
>
-------- Someone chooses to hotplug a cpu here ------
-------- But this subsystem might miss out on knowing
about it since it hasn't registered its
notifier yet!

> register_cpu_notifier(&foobar_cpu_notifier);
>
> Thanks,
> -Toshi
>

--
Thanks and Regards
gautham.
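
The race window pointed out in this reply can be replayed deterministically
with a toy model. It is a sketch only: plain arrays stand in for the online
mask and the per-CPU state, hotplug_online() is a made-up helper, and no real
locking is involved. CPU 1 comes online after the init loop but before the
notifier is registered, so nobody initializes it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_online[NR_CPUS];
static bool cpu_inited[NR_CPUS];
static bool notifier_registered;

static void init_cpu(int cpu)
{
	cpu_inited[cpu] = true;
}

/* Models a CPU coming online: the callback runs only if registered. */
static void hotplug_online(int cpu)
{
	cpu_online[cpu] = true;
	if (notifier_registered)
		init_cpu(cpu);
}

/* Returns the number of online CPUs that were never initialized. */
int missed_cpus_with_late_registration(void)
{
	int missed = 0;

	for (int i = 0; i < NR_CPUS; i++)
		cpu_online[i] = cpu_inited[i] = false;
	notifier_registered = false;
	cpu_online[0] = true;			/* boot CPU */

	/* get_online_cpus(); for_each_online_cpu(); put_online_cpus(); */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			init_cpu(cpu);

	hotplug_online(1);			/* lands in the race window */
	notifier_registered = true;		/* register_cpu_notifier() */
	hotplug_online(2);			/* seen by the notifier */

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu] && !cpu_inited[cpu])
			missed++;
	return missed;
}
```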

2014-02-11 17:41:52

by Toshi Kani

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Tue, 2014-02-11 at 22:48 +0530, Gautham R Shenoy wrote:
> On Tue, Feb 11, 2014 at 09:33:56AM -0700, Toshi Kani wrote:
> >
> > I agree that introducing a reader-writer semaphore allows concurrent
> > executions. Adding yet another hotplug lock is a bit unfortunate,
> > though.
> >
>
> I agree with this last part. We already have enough locks for
> cpu-hotplug. Another one sounds one too many!!
>
>
> > This may be a dumb question, but can't we simply do this way?
> >
> > get_online_cpus();
> >
> > for_each_online_cpu(cpu)
> > init_cpu(cpu);
> >
> > put_online_cpus();
> >
> -------- Someone chooses to hotplug a cpu here ------
> -------- But this subsystem might miss out on knowing
> about it since it hasn't registered its
> notifier yet!
>
> > register_cpu_notifier(&foobar_cpu_notifier);


How about this? foo_cpu_notifier returns NOP when foo_notifier_ready is
false.

register_cpu_notifier(&foobar_cpu_notifier);

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

foo_notifier_ready = true;

put_online_cpus();
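
A toy model of this proposal, to make the intended behaviour concrete. It is
a sketch under assumptions: arrays replace the real online mask and per-CPU
state, hotplug_online() and foo_cpu_callback() are illustrative helpers, and
the get/put_online_cpus() section is represented only by the fact that no
hotplug event is injected inside it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_online[NR_CPUS];
static int init_count[NR_CPUS];
static bool notifier_registered;
static bool foo_notifier_ready;

static void init_cpu(int cpu)
{
	init_count[cpu]++;
}

/* The callback is a no-op until the init loop has finished. */
static void foo_cpu_callback(int cpu)
{
	if (!foo_notifier_ready)
		return;
	init_cpu(cpu);
}

static void hotplug_online(int cpu)
{
	cpu_online[cpu] = true;
	if (notifier_registered)
		foo_cpu_callback(cpu);
}

/* Returns 1 if every online CPU was initialized exactly once, else 0. */
int ready_flag_demo(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		cpu_online[i] = false;
		init_count[i] = 0;
	}
	notifier_registered = foo_notifier_ready = false;
	cpu_online[0] = true;			/* boot CPU */

	notifier_registered = true;		/* register_cpu_notifier() */
	hotplug_online(1);			/* callback ignores it */

	/* get_online_cpus(): no hotplug until put_online_cpus() */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			init_cpu(cpu);
	foo_notifier_ready = true;
	/* put_online_cpus() */

	hotplug_online(2);			/* handled by the callback */

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu] && init_count[cpu] != 1)
			return 0;
	return 1;
}
```

In this model CPU 1 is picked up by the loop and CPU 2 by the callback, so
nothing is missed or initialized twice, at the cost of a flag check in every
callback.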

Thanks,
-Toshi

2014-02-11 19:14:11

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/11/2014 10:45 PM, Oleg Nesterov wrote:
> On 02/11, Srivatsa S. Bhat wrote:
>>
>> +static DECLARE_RWSEM(cpu_hotplug_rwsem);
>> +
>> +void cpu_notifier_register_begin(void)
>> +{
>> + down_read(&cpu_hotplug_rwsem);
>> +}
>> +
>> +void cpu_notifier_register_end(void)
>> +{
>> + up_read(&cpu_hotplug_rwsem);
>> +}
>> +
>> /* Serializes the updates to cpu_online_mask, cpu_present_mask */
>> static DEFINE_MUTEX(cpu_add_remove_lock);
>>
>> @@ -32,12 +45,14 @@ static DEFINE_MUTEX(cpu_add_remove_lock);
>> */
>> void cpu_maps_update_begin(void)
>> {
>> + down_write(&cpu_hotplug_rwsem);
>> mutex_lock(&cpu_add_remove_lock);
>> }
>>
>> void cpu_maps_update_done(void)
>> {
>> mutex_unlock(&cpu_add_remove_lock);
>> + up_write(&cpu_hotplug_rwsem);
>> }
>
> I am a bit confused... If we do this, why we can't simply turn
> cpu_add_remove_lock into rw_semaphore?
>

Short answer: Being a mutex, cpu_add_remove_lock ensures that the updates to
the cpu notifier chain get serialized. If we make that an rw-semaphore, then
the notifier chain mutations (during callback registration) will run in
parallel, wreaking havoc.

Long answer: There are two distinct phases in the critical section involving
the callback registration - one that should run in parallel with other
readers (other such critical sections) and the other one which should run
serially, as depicted below.

cpu_notifier_register_begin(); | Run in parallel
| with similar phases
for_each_online_cpu(cpu) | from other subsystems.
init_cpu(cpu); |

/* Updates the cpu notifier chain. */
register_cpu_notifier(&foobar_cpu_notifier); ||| -- Must run serially

cpu_notifier_register_done();


So, for the first part, we can use an rw-semaphore, to allow the init
routines of various subsystems to run in parallel. For the second part,
we need strict mutual exclusion; so we can use the cpu_add_remove_lock
mutex as it is. But it so happens that the length of the critical section
for both these locks is exactly the same on the hotplug writer side - they
both need to cover the full hotplug code, including the CPU_POST_DEAD stage.

I do agree that this approach introduces yet another lock in the hotplug
path. However, we can nicely abstract it into APIs that the rest of the
subsystems can call (as shown above), without needing to know the internal
lock ordering etc.

Thoughts?

Regards,
Srivatsa S. Bhat

2014-02-11 19:25:40

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/11/2014 11:05 PM, Toshi Kani wrote:
> On Tue, 2014-02-11 at 22:48 +0530, Gautham R Shenoy wrote:
>> On Tue, Feb 11, 2014 at 09:33:56AM -0700, Toshi Kani wrote:
>>>
>>> I agree that introducing a reader-writer semaphore allows concurrent
>>> executions. Adding yet another hotplug lock is a bit unfortunate,
>>> though.
>>>
>>
>> I agree with this last part. We already have enough locks for
>> cpu-hotplug. Another one sounds one too many!!
>>
>>
>>> This may be a dumb question, but can't we simply do this way?
>>>
>>> get_online_cpus();
>>>
>>> for_each_online_cpu(cpu)
>>> init_cpu(cpu);
>>>
>>> put_online_cpus();
>>>
>> -------- Someone chooses to hotplug a cpu here ------
>> -------- But this subsystem might miss out on knowing
>> about it since it hasn't registered its
>> notifier yet!
>>
>>> register_cpu_notifier(&foobar_cpu_notifier);
>
>
> How about this? foo_cpu_notifier returns NOP when foo_notifier_ready is
> false.
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> foo_notifier_ready = true;
>
> put_online_cpus();
>

Nah, that looks a lot like some quick-n-dirty hack ;-(
It would also amount to burdening the various subsystems to add weird-looking
pieces of code such as this in their callbacks:

if (!foo_notifier_ready)
return NOTIFY_OK;

This only makes it all the more evident that the callback registration APIs
exposed by the CPU hotplug core are poorly designed.

What we need instead, is an elegant, well-defined and easy-to-use set of
interfaces/APIs exposed by the core CPU hotplug code to the various
subsystems. I don't think we should worry so much about the fact that
we can't use the familiar get/put_online_cpus() in this type of callback
registration scenario. We can introduce a sane set of APIs that work
well in such situations and use them consistently.

For example, something like the code snippet shown below looks pretty
neat to me:

cpu_notifier_register_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

cpu_notifier_register_done();

What do you think?

Regards,
Srivatsa S. Bhat

2014-02-11 20:57:42

by Toshi Kani

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Wed, 2014-02-12 at 00:50 +0530, Srivatsa S. Bhat wrote:
> On 02/11/2014 11:05 PM, Toshi Kani wrote:
:
> > How about this? foo_cpu_notifier returns NOP when foo_notifier_ready is
> > false.
> >
> > register_cpu_notifier(&foobar_cpu_notifier);
> >
> > get_online_cpus();
> >
> > for_each_online_cpu(cpu)
> > init_cpu(cpu);
> >
> > foo_notifier_ready = true;
> >
> > put_online_cpus();
> >
>
> Nah, that looks a lot like some quick-n-dirty hack ;-(
> It would also amount to burdening the various subsystems to add weird-looking
> pieces of code such as this in their callbacks:
>
> if (!foo_notifier_ready)
> return NOTIFY_OK;
>
> This only makes it all the more evident that the callback registration APIs
> exposed by the CPU hotplug core is poorly designed.
>
> What we need instead, is an elegant, well-defined and easy-to-use set of
> interfaces/APIs exposed by the core CPU hotplug code to the various
> subsystems. I don't think we should worry so much about the fact that
> we can't use the familiar get/put_online_cpus() in this type of callback
> registration scenario. We can introduce a sane set of APIs that work
> well in such situations and use them consistently.

> For example, something like the code snippet shown below looks pretty
> neat to me:
>
> cpu_notifier_register_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_notifier_register_done();
>
> What do you think?

I agree that it is cleaner for the callers as long as people understand
how to use them. Can you document them properly so that they know when
they need to use them instead of the familiar get/put_online_cpus()?

Thanks,
-Toshi

2014-02-12 06:24:16

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/12/2014 02:21 AM, Toshi Kani wrote:
> On Wed, 2014-02-12 at 00:50 +0530, Srivatsa S. Bhat wrote:
>> On 02/11/2014 11:05 PM, Toshi Kani wrote:
> :
>>> How about this? foo_cpu_notifier returns NOP when foo_notifier_ready is
>>> false.
>>>
>>> register_cpu_notifier(&foobar_cpu_notifier);
>>>
>>> get_online_cpus();
>>>
>>> for_each_online_cpu(cpu)
>>> init_cpu(cpu);
>>>
>>> foo_notifier_ready = true;
>>>
>>> put_online_cpus();
>>>
>>
>> Nah, that looks a lot like some quick-n-dirty hack ;-(
>> It would also amount to burdening the various subsystems to add weird-looking
>> pieces of code such as this in their callbacks:
>>
>> if (!foo_notifier_ready)
>> return NOTIFY_OK;
>>
>> This only makes it all the more evident that the callback registration APIs
>> exposed by the CPU hotplug core is poorly designed.
>>
>> What we need instead, is an elegant, well-defined and easy-to-use set of
>> interfaces/APIs exposed by the core CPU hotplug code to the various
>> subsystems. I don't think we should worry so much about the fact that
>> we can't use the familiar get/put_online_cpus() in this type of callback
>> registration scenario. We can introduce a sane set of APIs that work
>> well in such situations and use them consistently.
>
>> For example, something like the code snippet shown below looks pretty
>> neat to me:
>>
>> cpu_notifier_register_begin();
>>
>> for_each_online_cpu(cpu)
>> init_cpu(cpu);
>>
>> register_cpu_notifier(&foobar_cpu_notifier);
>>
>> cpu_notifier_register_done();
>>
>> What do you think?
>
> I agree that it is cleaner for the callers as long as people understand
> how to use them. Can you document them properly so that they know when
> they need to use them instead of the familiar get/put_online_cpus()?
>

Sure.. I had updated the documentation with the semantics introduced in
this patchset, in patch 2:

http://thread.gmane.org/gmane.linux.kernel/1641638/focus=1641695

Similarly I'll keep the docs updated with these new APIs in v2 as well.

Thank you!

Regards,
Srivatsa S. Bhat

2014-02-13 11:02:19

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/12/2014 11:48 AM, Srivatsa S. Bhat wrote:
> On 02/12/2014 02:21 AM, Toshi Kani wrote:
>> On Wed, 2014-02-12 at 00:50 +0530, Srivatsa S. Bhat wrote:
>>> On 02/11/2014 11:05 PM, Toshi Kani wrote:
>> :
>>>> How about this? foo_cpu_notifier returns NOP when foo_notifier_ready is
>>>> false.
>>>>
>>>> register_cpu_notifier(&foobar_cpu_notifier);
>>>>
>>>> get_online_cpus();
>>>>
>>>> for_each_online_cpu(cpu)
>>>> init_cpu(cpu);
>>>>
>>>> foo_notifier_ready = true;
>>>>
>>>> put_online_cpus();
>>>>
>>>
>>> Nah, that looks a lot like some quick-n-dirty hack ;-(
>>> It would also amount to burdening the various subsystems to add weird-looking
>>> pieces of code such as this in their callbacks:
>>>
>>> if (!foo_notifier_ready)
>>> return NOTIFY_OK;
>>>
>>> This only makes it all the more evident that the callback registration APIs
>>> exposed by the CPU hotplug core is poorly designed.
>>>
>>> What we need instead, is an elegant, well-defined and easy-to-use set of
>>> interfaces/APIs exposed by the core CPU hotplug code to the various
>>> subsystems. I don't think we should worry so much about the fact that
>>> we can't use the familiar get/put_online_cpus() in this type of callback
>>> registration scenario. We can introduce a sane set of APIs that work
>>> well in such situations and use them consistently.
>>
>>> For example, something like the code snippet shown below looks pretty
>>> neat to me:
>>>
>>> cpu_notifier_register_begin();
>>>
>>> for_each_online_cpu(cpu)
>>> init_cpu(cpu);
>>>
>>> register_cpu_notifier(&foobar_cpu_notifier);
>>>
>>> cpu_notifier_register_done();
>>>
>>> What do you think?
>>
>> I agree that it is cleaner for the callers as long as people understand
>> how to use them. Can you document them properly so that they know when
>> they need to use them instead of the familiar get/put_online_cpus()?
>>
>
> Sure.. I had updated the documentation with the semantics introduced in
> this patchset, in patch 2:
>
> http://thread.gmane.org/gmane.linux.kernel/1641638/focus=1641695
>
> Similarly I'll keep the docs updated with these new APIs in v2 as well.
>

For now, however, let us not add the new rw-semaphore to the CPU hotplug
core yet. It's very unlikely that we'll see any performance issue immediately,
due to serialized initialization of cpu hotplug notifiers, since early boot
is mostly sequential anyway.

Some time in the future, if we start hitting bottlenecks in the cpu hotplug
notifier registration phase (perhaps when we implement parallel CPU boot-up
infrastructure), then we can directly use the rw-semaphore solution, since
we have already worked it out. Besides, like Gautham said, we might want
to be more careful and have a very good justification before adding more
locks to the CPU hotplug core code. So we'll add the new rw-semaphore if
and when it becomes necessary.

I'll post v2 with the earlier design, adding the new symbols
cpu_notifier_register_begin/done() (to improve readability) and mapping
them to cpu_maps_update_begin/done().
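
To make the mapping concrete, here is a minimal userspace sketch (not the
actual patch). It assumes a pthread mutex as a stand-in for
cpu_add_remove_lock; NR_CPUS, cpu_online_map, cpu_initialized,
notifier_registered, model_register_cpu_notifier() and foobar_init() are
made-up names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS 4	/* illustrative value only */

/* Stand-in for cpu_add_remove_lock; the real one is a kernel mutex. */
static pthread_mutex_t cpu_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;

static void cpu_maps_update_begin(void)
{
	pthread_mutex_lock(&cpu_add_remove_lock);
}

static void cpu_maps_update_done(void)
{
	pthread_mutex_unlock(&cpu_add_remove_lock);
}

/* The new, more readable names simply map onto the existing helpers. */
static void cpu_notifier_register_begin(void)
{
	cpu_maps_update_begin();
}

static void cpu_notifier_register_done(void)
{
	cpu_maps_update_done();
}

/* Hypothetical model state: which CPUs are online / initialized. */
bool cpu_online_map[NR_CPUS] = { true, true, false, false };
bool cpu_initialized[NR_CPUS];
bool notifier_registered;

/*
 * Hypothetical model of the lockless registration variant: it takes no
 * lock itself, so the caller must already hold cpu_add_remove_lock.
 */
static void model_register_cpu_notifier(void)
{
	/* trylock on an already-held default mutex reports EBUSY. */
	assert(pthread_mutex_trylock(&cpu_add_remove_lock) == EBUSY);
	notifier_registered = true;
}

/* Initialize online CPUs and register in one critical section. */
int foobar_init(void)
{
	int cpu, count = 0;

	cpu_notifier_register_begin();

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu_online_map[cpu]) {
			cpu_initialized[cpu] = true;
			count++;
		}
	}

	model_register_cpu_notifier();

	cpu_notifier_register_done();

	return count;
}
```

Since a hotplug operation takes the same lock via cpu_maps_update_begin(),
it cannot slip in between the initialization loop and the registration.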

Thank you!

Regards,
Srivatsa S. Bhat

2014-02-13 11:06:36

by Gautham R Shenoy

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Mon, Feb 10, 2014 at 06:26:20PM -0700, Toshi Kani wrote:
>
> get/put_online_cpus() is a reader-lock and concurrent executions are
> allowed among the readers. They won't be serialized until a cpu
> online/offline operation begins. By replacing this lock with
> cpu_maps_update_begin/done(), we now serialize all readers.

We're not serializing all the readers, just the ones that want to
register/unregister their cpu-hotplug notifiers. This is a one-off
event that typically happens at module_init() or module_exit() time.
So this patchset does not replace get/put_online_cpus(), if that is
the concern!

--
Thanks and Regards
gautham.


2014-02-13 17:45:46

by Oleg Nesterov

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/12, Srivatsa S. Bhat wrote:
>
> On 02/11/2014 10:45 PM, Oleg Nesterov wrote:
> >
> > I am a bit confused... If we do this, why we can't simply turn
> > cpu_add_remove_lock into rw_semaphore?

[...snip...]

> cpu_notifier_register_begin(); | Run in parallel
> | with similar phases
> for_each_online_cpu(cpu) | from other subsystems.
> init_cpu(cpu); |
>
> /* Updates the cpu notifier chain. */
> register_cpu_notifier(&foobar_cpu_notifier); ||| -- Must run serially

Ah indeed, we can't use a single lock, thanks. Perhaps we can simply
add a spinlock_t which only protects cpu_chain though, but I am not
sure and currently this is off-topic anyway.

Thanks,

Oleg.

2014-02-13 18:00:22

by Srivatsa S. Bhat

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On 02/13/2014 11:14 PM, Oleg Nesterov wrote:
> On 02/12, Srivatsa S. Bhat wrote:
>>
>> On 02/11/2014 10:45 PM, Oleg Nesterov wrote:
>>>
>>> I am a bit confused... If we do this, why we can't simply turn
>>> cpu_add_remove_lock into rw_semaphore?
>
> [...snip...]
>
>> cpu_notifier_register_begin(); | Run in parallel
>> | with similar phases
>> for_each_online_cpu(cpu) | from other subsystems.
>> init_cpu(cpu); |
>>
>> /* Updates the cpu notifier chain. */
>> register_cpu_notifier(&foobar_cpu_notifier); ||| -- Must run serially
>
> Ah indeed, we can't use a single lock, thanks. Perhaps we can simply
> add a spinlock_t which only protects cpu_chain though, but I am not
> sure and currently this is off-topic anyway.
>

The problem with that would be that the chain invocations (during CPU hotplug)
would have to take the spinlock (to prevent running concurrently with chain
updaters). But unfortunately CPU hotplug notifier callbacks can sleep, so we
can't hold spinlocks while invoking them.
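
A rough userspace illustration of that constraint (a model, not kernel
code): the chain is guarded by a pthread mutex, which stands in for a
sleeping lock, and one callback blocks while the lock is held. That is
fine under a mutex but would not be allowed under a kernel spinlock.
The names chain_lock, chain_register(), chain_call() and sleepy_callback()
are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() */

#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified notifier entry. */
struct notifier {
	int (*call)(int cpu);
	struct notifier *next;
};

/* A sleeping lock protects both chain updates and chain invocations. */
static pthread_mutex_t chain_lock = PTHREAD_MUTEX_INITIALIZER;
static struct notifier *chain_head;

/* Chain updater: link a new entry at the head under the lock. */
void chain_register(struct notifier *n)
{
	assert(n != NULL);

	pthread_mutex_lock(&chain_lock);
	n->next = chain_head;
	chain_head = n;
	pthread_mutex_unlock(&chain_lock);
}

/*
 * Chain invocation: runs every callback with the lock held, so it can
 * never race with chain_register(). Returns the number of callbacks run.
 */
int chain_call(int cpu)
{
	struct notifier *n;
	int calls = 0;

	pthread_mutex_lock(&chain_lock);
	for (n = chain_head; n != NULL; n = n->next) {
		n->call(cpu);
		calls++;
	}
	pthread_mutex_unlock(&chain_lock);

	return calls;
}

/* A callback that blocks for about 1 ms, like a notifier that sleeps. */
int sleepy_callback(int cpu)
{
	struct timespec ts = { 0, 1000000L };

	(void)cpu;
	nanosleep(&ts, NULL);
	return 0;
}
```

Registering sleepy_callback() and then calling chain_call(0) runs it to
completion with chain_lock held the whole time.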

Regards,
Srivatsa S. Bhat

2014-02-13 21:00:22

by Toshi Kani

Subject: Re: [PATCH 01/51] CPU hotplug: Provide lockless versions of callback registration functions

On Thu, 2014-02-13 at 10:56 +0000, Srivatsa S. Bhat wrote:
> On 02/12/2014 11:48 AM, Srivatsa S. Bhat wrote:
:
> >>> For example, something like the code snippet shown below looks pretty
> >>> neat to me:
> >>>
> >>> cpu_notifier_register_begin();
> >>>
> >>> for_each_online_cpu(cpu)
> >>> init_cpu(cpu);
> >>>
> >>> register_cpu_notifier(&foobar_cpu_notifier);
> >>>
> >>> cpu_notifier_register_done();
> >>>
> >>> What do you think?
> >>
> >> I agree that it is cleaner for the callers as long as people understand
> >> how to use them. Can you document them properly so that they know when
> >> they need to use them instead of the familiar get/put_online_cpus()?
> >>
> >
> > Sure.. I had updated the documentation with the semantics introduced in
> > this patchset, in patch 2:
> >
> > http://thread.gmane.org/gmane.linux.kernel/1641638/focus=1641695
> >
> > Similarly I'll keep the docs updated with these new APIs in v2 as well.
> >
>
> For now, however, let us not add the new rw-semaphore to the CPU hotplug
> core yet. It's very unlikely that we'll see any performance issue immediately,
> due to serialized initialization of cpu hotplug notifiers, since early boot
> is mostly sequential anyway.
>
> Some time in the future, if we start hitting bottlenecks in the cpu hotplug
> notifier registration phase (perhaps when we implement parallel CPU boot-up
> infrastructure), then we can directly use the rw-semaphore solution, since
> we have already worked it out. Besides, like Gautham said, we might want
> to be more careful and have a very good justification before adding more
> locks to the CPU hotplug core code. So we'll add the new rw-semaphore if
> and when it becomes necessary.
>
> I'll post the v2 with the earlier design itself, by adding the new symbols
> cpu_notifier_register_begin/done() (to enhance the readability) and map
> them to cpu_maps_update_begin/done().

Sounds reasonable to me. I was also concerned about exporting and
overloading cpu_maps_update_begin/done() for a different purpose (their
purpose is to update cpu_maps). So, I think adding the new interfaces
is good when we cannot use get/put_online_cpus() for this.

Thanks,
-Toshi


2014-02-14 06:47:18

by Madhavan Srinivasan

Subject: Re: [PATCH 13/51] powerpc, sysfs: Fix CPU hotplug callback registration

On Thursday 06 February 2014 03:36 AM, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> get_online_cpus();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> register_cpu_notifier(&foobar_cpu_notifier);
>
> put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Instead, the correct and race-free way of performing the callback
> registration is:
>
> cpu_maps_update_begin();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* Note the use of the double underscored version of the API */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_maps_update_done();
>
>
> Fix the sysfs code in powerpc by using this latter form of callback
> registration.

Acked-by: Madhavan Srinivasan <[email protected]>

>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Madhavan Srinivasan <[email protected]>
> Cc: Olof Johansson <[email protected]>
> Cc: Wang Dongsheng <[email protected]>
> Cc: [email protected]
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> arch/powerpc/kernel/sysfs.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index 97e1dc9..c29ad44 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -975,7 +975,8 @@ static int __init topology_init(void)
> int cpu;
>
> register_nodes();
> - register_cpu_notifier(&sysfs_cpu_nb);
> +
> + cpu_maps_update_begin();
>
> for_each_possible_cpu(cpu) {
> struct cpu *c = &per_cpu(cpu_devices, cpu);
> @@ -999,6 +1000,11 @@ static int __init topology_init(void)
> if (cpu_online(cpu))
> register_cpu_online(cpu);
> }
> +
> + __register_cpu_notifier(&sysfs_cpu_nb);
> +
> + cpu_maps_update_done();
> +
> #ifdef CONFIG_PPC64
> sysfs_create_dscr_default();
> #endif /* CONFIG_PPC64 */
>