2011-06-13 18:01:11

by Suresh Siddha

[permalink] [raw]
Subject: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path

Currently stop machine infrastructure can be called only from a cpu that is
online. But for !CONFIG_SMP, we do allow calling __stop_machine() before the
cpu is online.

x86 for example requires stop machine infrastructure in the cpu online path
and currently implements its own stop machine (using stop_one_cpu_nowait())
for MTRR initialization in the cpu online path.

Enhance the __stop_machine() so that it can be called before the cpu
is onlined. This will pave the way for code consolidation and address potential
deadlocks caused by multiple mechanisms of doing system wide rendezvous.

This will also address the behavioral differences of __stop_machine()
between SMP and UP builds.

Signed-off-by: Suresh Siddha <[email protected]>
Cc: [email protected] # v2.6.35+
---
include/linux/stop_machine.h | 11 +++--
kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
2 files changed, 93 insertions(+), 9 deletions(-)

Index: linux-2.6-tip/kernel/stop_machine.c
===================================================================
--- linux-2.6-tip.orig/kernel/stop_machine.c
+++ linux-2.6-tip/kernel/stop_machine.c
@@ -28,6 +28,7 @@
struct cpu_stop_done {
atomic_t nr_todo; /* nr left to execute */
bool executed; /* actually executed? */
+ bool caller_offline; /* stop_cpu caller cpu status */
int ret; /* collected return value */
struct completion completion; /* fired if nr_todo reaches 0 */
};
@@ -47,15 +48,24 @@ static void cpu_stop_init_done(struct cp
memset(done, 0, sizeof(*done));
atomic_set(&done->nr_todo, nr_todo);
init_completion(&done->completion);
+ done->caller_offline = !cpu_online(raw_smp_processor_id());
}

/* signal completion unless @done is NULL */
static void cpu_stop_signal_done(struct cpu_stop_done *done, bool executed)
{
if (done) {
+ bool caller_offline;
+
+ /*
+ * Get the offline status before we do the atomic_dec_and_test
+ * below. This will avoid unsafe references to the 'done'
+ * memory that is present on the stop_cpu caller's stack.
+ */
+ caller_offline = done->caller_offline;
if (executed)
done->executed = true;
- if (atomic_dec_and_test(&done->nr_todo))
+ if (atomic_dec_and_test(&done->nr_todo) && !caller_offline)
complete(&done->completion);
}
}
@@ -136,6 +146,56 @@ void stop_one_cpu_nowait(unsigned int cp
static DEFINE_MUTEX(stop_cpus_mutex);
static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work);

+/* stop all the online cpu's aswell as the calling cpu that is offline */
+static int __stop_all_cpus(cpu_stop_fn_t fn, void *arg)
+{
+ struct cpu_stop_work *work;
+ struct cpu_stop_done done;
+ unsigned int cpu;
+ int ret;
+
+ /*
+ * This cpu is not yet online, so loop till we get the mutex.
+ */
+ while (!mutex_trylock(&stop_cpus_mutex))
+ cpu_relax();
+
+ /* initialize works and done */
+ for_each_cpu(cpu, cpu_online_mask) {
+ work = &per_cpu(stop_cpus_work, cpu);
+ work->fn = fn;
+ work->arg = arg;
+ work->done = &done;
+ }
+
+ cpu_stop_init_done(&done, num_online_cpus());
+
+ /*
+ * Queue the work on all the online cpu's first.
+ */
+ for_each_cpu(cpu, cpu_online_mask)
+ cpu_stop_queue_work(&per_cpu(cpu_stopper, cpu),
+ &per_cpu(stop_cpus_work, cpu));
+
+ /*
+ * This cpu is not yet online. @fn needs to be run on this
+ * cpu, run it now.
+ */
+ ret = fn(arg);
+ if (ret)
+ done.ret = ret;
+
+ /*
+ * This cpu is not yet online. Wait synchronously for others to
+ * complete the stop cpu work.
+ */
+ while (atomic_read(&done.nr_todo))
+ cpu_relax();
+
+ mutex_unlock(&stop_cpus_mutex);
+ return done.ret;
+}
+
int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
{
struct cpu_stop_work *work;
@@ -431,6 +491,7 @@ static int stop_machine_cpu_stop(void *d
struct stop_machine_data *smdata = data;
enum stopmachine_state curstate = STOPMACHINE_NONE;
int cpu = smp_processor_id(), err = 0;
+ unsigned long flags = 0;
bool is_active;

if (!smdata->active_cpus)
@@ -446,7 +507,13 @@ static int stop_machine_cpu_stop(void *d
curstate = smdata->state;
switch (curstate) {
case STOPMACHINE_DISABLE_IRQ:
- local_irq_disable();
+ /*
+ * In most of the cases this gets called from
+ * the stop_cpu thread context. But this can
+ * also be called directly from the cpu online
+ * path where interrupts are disabled.
+ */
+ local_irq_save(flags);
hard_irq_disable();
break;
case STOPMACHINE_RUN:
@@ -460,7 +527,7 @@ static int stop_machine_cpu_stop(void *d
}
} while (curstate != STOPMACHINE_EXIT);

- local_irq_enable();
+ local_irq_restore(flags);
return err;
}

@@ -470,9 +537,23 @@ int __stop_machine(int (*fn)(void *), vo
.num_threads = num_online_cpus(),
.active_cpus = cpus };

- /* Set the initial state and stop all online cpus. */
+ /*
+ * Include the calling cpu that might not be online yet. We are
+ * checking only for the online status here, so no need to worry about
+ * preemption status (can't be preempted to a cpu that is not
+ * yet online). So raw_smp_processor_id() is safe here.
+ */
+ if (!cpu_online(raw_smp_processor_id()))
+ smdata.num_threads++;
+
+ /* Set the initial state and stop all cpus. */
set_state(&smdata, STOPMACHINE_PREPARE);
- return stop_cpus(cpu_online_mask, stop_machine_cpu_stop, &smdata);
+
+ if (!cpu_online(raw_smp_processor_id()))
+ return __stop_all_cpus(stop_machine_cpu_stop, &smdata);
+ else
+ return stop_cpus(cpu_online_mask, stop_machine_cpu_stop,
+ &smdata);
}

int stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus)
Index: linux-2.6-tip/include/linux/stop_machine.h
===================================================================
--- linux-2.6-tip.orig/include/linux/stop_machine.h
+++ linux-2.6-tip/include/linux/stop_machine.h
@@ -104,7 +104,7 @@ static inline int try_stop_cpus(const st
* @data: the data ptr for the @fn()
* @cpus: the cpus to run the @fn() on (NULL = any online cpu)
*
- * Description: This causes a thread to be scheduled on every cpu,
+ * Description: This causes a thread to be scheduled on every online cpu,
* each of which disables interrupts. The result is that no one is
* holding a spinlock or inside any other preempt-disabled region when
* @fn() runs.
@@ -120,7 +120,9 @@ int stop_machine(int (*fn)(void *), void
* @cpus: the cpus to run the @fn() on (NULL = any online cpu)
*
* Description: This is a special version of the above, which assumes cpus
- * won't come or go while it's being called. Used by hotplug cpu.
+ * won't come or go while it's being called. Used by hotplug cpu. This can
+ * be also called from the cpu online path, where the calling cpu is not
+ * yet online.
*/
int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus);

@@ -129,10 +131,11 @@ int __stop_machine(int (*fn)(void *), vo
static inline int __stop_machine(int (*fn)(void *), void *data,
const struct cpumask *cpus)
{
+ unsigned long flags;
int ret;
- local_irq_disable();
+ local_irq_save(flags);
ret = fn(data);
- local_irq_enable();
+ local_irq_restore(flags);
return ret;
}



2011-06-13 19:56:31

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path


* Suresh Siddha <[email protected]> wrote:

> Currently stop machine infrastructure can be called only from a cpu that is
> online. But for !CONFIG_SMP, we do allow calling __stop_machine() before the
> cpu is online.
>
> x86 for example requires stop machine infrastructure in the cpu online path
> and currently implements its own stop machine (using stop_one_cpu_nowait())
> for MTRR initialization in the cpu online path.
>
> Enhance the __stop_machine() so that it can be called before the cpu
> is onlined. This will pave the way for code consolidation and address potential
> deadlocks caused by multiple mechanisms of doing system wide rendezvous.
>
> This will also address the behavioral differences of __stop_machine()
> between SMP and UP builds.
>
> Signed-off-by: Suresh Siddha <[email protected]>
> Cc: [email protected] # v2.6.35+
> ---
> include/linux/stop_machine.h | 11 +++--
> kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> 2 files changed, 93 insertions(+), 9 deletions(-)

Btw., this is *way* too risky for a -stable backport.

Thanks,

Ingo

2011-06-13 20:49:13

by Suresh Siddha

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path

On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> * Suresh Siddha <[email protected]> wrote:
>
> > include/linux/stop_machine.h | 11 +++--
> > kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> > 2 files changed, 93 insertions(+), 9 deletions(-)
>
> Btw., this is *way* too risky for a -stable backport.
>

Ingo, we can have a smaller patch (appended) for the -stable. How do you
want to go ahead? Take this small patch for both mainline and -stable
and the two code cleanup/consolidation patches for -tip (to go into
3.1?). Thanks.

---
From: Suresh Siddha <[email protected]>
Subject: x86, mtrr: lock stop machine during MTRR rendezvous sequence

MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_cpus mutex in the MTRR rendezvous sequence that
prevents the above mentioned deadlock. If this gets called in the cpu
online path, then we loop using try_lock. Otherwise we wait till the
mutex is available.

TBD: Extend the stop_machine() infrastructure to handle the case where
the calling cpu is still not online and use this for MTRR rendezvous
sequence (rather than implementing own stop machine using
stop_one_cpu_nowait()). This will consolidate the code.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
Signed-off-by: Suresh Siddha <[email protected]>
Cc: [email protected] # v2.6.35 - v2.6.39
---
arch/x86/kernel/cpu/mtrr/main.c | 15 ++++++++++++++-
include/linux/stop_machine.h | 17 +++++++++++++++++
kernel/stop_machine.c | 15 +++++++++++++++
3 files changed, 46 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 929739a..1e99048 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -244,9 +244,21 @@ static inline int types_compatible(mtrr_type type1, mtrr_type type2)
static void
set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type type)
{
+ int cpu = raw_smp_processor_id();
+ int online = cpu_online(cpu);
struct set_mtrr_data data;
unsigned long flags;
- int cpu;
+
+ /*
+ * If we are not yet online, then loop using the try lock, as we
+ * can't sleep.
+ */
+ if (online) {
+ lock_stop_cpus();
+ } else {
+ while (!try_lock_stop_cpus())
+ cpu_relax();
+ }

preempt_disable();

@@ -330,6 +342,7 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ

local_irq_restore(flags);
preempt_enable();
+ unlock_stop_cpus();
}

/**
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 092dc9b..8a28d4c 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -33,6 +33,10 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);

+void lock_stop_cpus(void);
+int try_lock_stop_cpus(void);
+void unlock_stop_cpus(void);
+
#else /* CONFIG_SMP */

#include <linux/workqueue.h>
@@ -88,6 +92,19 @@ static inline int try_stop_cpus(const struct cpumask *cpumask,
return stop_cpus(cpumask, fn, arg);
}

+static inline void lock_stop_cpus(void)
+{
+}
+
+static inline try_lock_stop_cpus(void)
+{
+ return 1;
+}
+
+static inline void unlock_stop_cpus(void)
+{
+}
+
#endif /* CONFIG_SMP */

/*
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e3516b2..d239a48 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -136,6 +136,21 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
static DEFINE_MUTEX(stop_cpus_mutex);
static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work);

+void lock_stop_cpus(void)
+{
+ mutex_lock(&stop_cpus_mutex);
+}
+
+int try_lock_stop_cpus(void)
+{
+ return mutex_trylock(&stop_cpus_mutex);
+}
+
+void unlock_stop_cpus(void)
+{
+ mutex_unlock(&stop_cpus_mutex);
+}
+
int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
{
struct cpu_stop_work *work;

2011-06-13 21:05:36

by Suresh Siddha

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path

On Mon, 2011-06-13 at 13:49 -0700, Suresh Siddha wrote:
> On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > * Suresh Siddha <[email protected]> wrote:
> >
> > > include/linux/stop_machine.h | 11 +++--
> > > kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> > > 2 files changed, 93 insertions(+), 9 deletions(-)
> >
> > Btw., this is *way* too risky for a -stable backport.
> >
>
> Ingo, we can have a smaller patch (appended) for the -stable. How do you
> want to go ahead? Take this small patch for both mainline and -stable
> and the two code cleanup/consolidation patches for -tip (to go into
> 3.1?). Thanks.

oops. sent it too soon. fixed a !CONFIG_SMP compilation issue. Updated
simple patch appended.
---

From: Suresh Siddha <[email protected]>
Subject: x86, mtrr: lock stop machine during MTRR rendezvous sequence

MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_cpus mutex in the MTRR rendezvous sequence that
prevents the above mentioned deadlock. If this gets called in the cpu
online path, then we loop using try_lock. Otherwise we wait till the
mutex is available.

TBD: Extend the stop_machine() infrastructure to handle the case where
the calling cpu is still not online and use this for MTRR rendezvous
sequence (rather than implementing own stop machine using
stop_one_cpu_nowait()). This will consolidate the code.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
Signed-off-by: Suresh Siddha <[email protected]>
Cc: [email protected] # v2.6.35+
---
arch/x86/kernel/cpu/mtrr/main.c | 14 +++++++++++++-
include/linux/stop_machine.h | 17 +++++++++++++++++
kernel/stop_machine.c | 15 +++++++++++++++
3 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 929739a..3d15798 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -244,9 +244,20 @@ static inline int types_compatible(mtrr_type type1, mtrr_type type2)
static void
set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type type)
{
+ int cpu = raw_smp_processor_id();
struct set_mtrr_data data;
unsigned long flags;
- int cpu;
+
+ /*
+ * If we are not yet online, then loop using the try lock, as we
+ * can't sleep.
+ */
+ if (cpu_online(cpu)) {
+ lock_stop_cpus();
+ } else {
+ while (!try_lock_stop_cpus())
+ cpu_relax();
+ }

preempt_disable();

@@ -330,6 +341,7 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ

local_irq_restore(flags);
preempt_enable();
+ unlock_stop_cpus();
}

/**
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 092dc9b..7731e57 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -33,6 +33,10 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);

+void lock_stop_cpus(void);
+int try_lock_stop_cpus(void);
+void unlock_stop_cpus(void);
+
#else /* CONFIG_SMP */

#include <linux/workqueue.h>
@@ -88,6 +92,19 @@ static inline int try_stop_cpus(const struct cpumask *cpumask,
return stop_cpus(cpumask, fn, arg);
}

+static inline void lock_stop_cpus(void)
+{
+}
+
+static inline int try_lock_stop_cpus(void)
+{
+ return 1;
+}
+
+static inline void unlock_stop_cpus(void)
+{
+}
+
#endif /* CONFIG_SMP */

/*
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e3516b2..d239a48 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -136,6 +136,21 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
static DEFINE_MUTEX(stop_cpus_mutex);
static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work);

+void lock_stop_cpus(void)
+{
+ mutex_lock(&stop_cpus_mutex);
+}
+
+int try_lock_stop_cpus(void)
+{
+ return mutex_trylock(&stop_cpus_mutex);
+}
+
+void unlock_stop_cpus(void)
+{
+ mutex_unlock(&stop_cpus_mutex);
+}
+
int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
{
struct cpu_stop_work *work;

2011-06-14 08:08:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path


* Suresh Siddha <[email protected]> wrote:

> On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > * Suresh Siddha <[email protected]> wrote:
> >
> > > include/linux/stop_machine.h | 11 +++--
> > > kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> > > 2 files changed, 93 insertions(+), 9 deletions(-)
> >
> > Btw., this is *way* too risky for a -stable backport.
> >
>
> Ingo, we can have a smaller patch (appended) for the -stable. How
> do you want to go ahead? Take this small patch for both mainline
> and -stable and the two code cleanup/consolidation patches for -tip
> (to go into 3.1?). Thanks.

this:

> arch/x86/kernel/cpu/mtrr/main.c | 15 ++++++++++++++-
> include/linux/stop_machine.h | 17 +++++++++++++++++
> kernel/stop_machine.c | 15 +++++++++++++++
> 3 files changed, 46 insertions(+), 1 deletions(-)

looks pretty risky as well, this is core kernel code that is
relatively rarely used and if it breaks it causes various high impact
regressions.

Once Tejun is fine with the code we can do the larger patch upstream
but not mark it for -stable backport. Once it's been upstream for a
couple of weeks, once we are sure it does not regress, can we perhaps
forward it to -stable ...

Thanks,

Ingo

2011-06-14 08:09:36

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path


* Suresh Siddha <[email protected]> wrote:

> On Mon, 2011-06-13 at 13:49 -0700, Suresh Siddha wrote:
> > On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > > * Suresh Siddha <[email protected]> wrote:
> > >
> > > > include/linux/stop_machine.h | 11 +++--
> > > > kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> > > > 2 files changed, 93 insertions(+), 9 deletions(-)
> > >
> > > Btw., this is *way* too risky for a -stable backport.
> > >
> >
> > Ingo, we can have a smaller patch (appended) for the -stable. How do you
> > want to go ahead? Take this small patch for both mainline and -stable
> > and the two code cleanup/consolidation patches for -tip (to go into
> > 3.1?). Thanks.
>
> oops. sent it too soon. fixed a !CONFIG_SMP compilation issue.
> Updated simple patch appended.

See? It's by definition not a "simple" patch.

Thanks,

Ingo

2011-06-14 10:10:45

by Tejun Heo

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path

Hello, Ingo, Suresh.

On Mon, Jun 13, 2011 at 01:49:18PM -0700, Suresh Siddha wrote:
> On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > * Suresh Siddha <[email protected]> wrote:
> >
> > > include/linux/stop_machine.h | 11 +++--
> > > kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> > > 2 files changed, 93 insertions(+), 9 deletions(-)
> >
> > Btw., this is *way* too risky for a -stable backport.
> >
>
> Ingo, we can have a smaller patch (appended) for the -stable. How do you
> want to go ahead? Take this small patch for both mainline and -stable
> and the two code cleanup/consolidation patches for -tip (to go into
> 3.1?). Thanks.

So, here's what I think we should do.

* Polish up this simpler patch and send it for 3.0 through -tip. It's
slightly scary but not too much and fixes a real bug. After a
while, we can ask -stable to pull the simple version.

* Work on proper update which drops custom implementation from mtrr
code for 3.1 window. BTW, even after the recent revisions, I think
the stop machine change is a bit too hacky. I'll reply to that
separately.

> diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
> index 092dc9b..8a28d4c 100644
> --- a/include/linux/stop_machine.h
> +++ b/include/linux/stop_machine.h
> @@ -33,6 +33,10 @@ void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
> int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
> int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
>
> +void lock_stop_cpus(void);
> +int try_lock_stop_cpus(void);
> +void unlock_stop_cpus(void);

Ugh... Can you please just export stop_cpus_mutex and have CONFIG_SMP
in mtrr code. After all, it's a temporary workaround for mtrr. No
reason to add three functions for that.

Thank you.

--
tejun

2011-06-14 21:17:29

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch v4 1/2] stop_machine: enable __stop_machine() to be called from the cpu online path


* [email protected] <[email protected]> wrote:

> Hello, Ingo, Suresh.
>
> On Mon, Jun 13, 2011 at 01:49:18PM -0700, Suresh Siddha wrote:
> > On Mon, 2011-06-13 at 12:56 -0700, Ingo Molnar wrote:
> > > * Suresh Siddha <[email protected]> wrote:
> > >
> > > > include/linux/stop_machine.h | 11 +++--
> > > > kernel/stop_machine.c | 91 ++++++++++++++++++++++++++++++++++++++++---
> > > > 2 files changed, 93 insertions(+), 9 deletions(-)
> > >
> > > Btw., this is *way* too risky for a -stable backport.
> > >
> >
> > Ingo, we can have a smaller patch (appended) for the -stable. How do you
> > want to go ahead? Take this small patch for both mainline and -stable
> > and the two code cleanup/consolidation patches for -tip (to go into
> > 3.1?). Thanks.
>
> So, here's what I think we should do.
>
> * Polish up this simpler patch and send it for 3.0 through -tip. It's
> slightly scary but not too much and fixes a real bug. After a
> while, we can ask -stable to pull the simple version.

Looks good to me.

> * Work on proper update which drops custom implementation from mtrr
> code for 3.1 window. BTW, even after the recent revisions, I think
> the stop machine change is a bit too hacky. I'll reply to that
> separately.

ok.

Thanks,

Ingo