Hi team,
commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635
Author: Heiko Carstens <[email protected]>
Date: Mon Oct 13 23:50:08 2008 +0200
Call init_workqueues before pre smp initcalls.
This commit causes the console login prompt to ignore my keystrokes
so I'm unable to log in (though ssh works fine). Oddly, if I hit
ctrl-alt-delete the machine does reboot, but no other keystrokes
appear on the screen or have any effect that I can detect.
No surprise that this problem appears only on my dual-core amd64
machine, while the k7 single processor machine works normally.
I've not seen anyone else complaining on the linux list, so maybe
this is a BIOS bug on the affected machine? It's an ECS mobo with
a GeForce6100PM-M2 chipset, and I use a ps/2 keyboard.
Thanks.
On Thu, Oct 23, 2008 at 05:09:24PM -0700, walt wrote:
> Hi team,
>
> commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635
> Author: Heiko Carstens <[email protected]>
> Date: Mon Oct 13 23:50:08 2008 +0200
>
> Call init_workqueues before pre smp initcalls.
>
> This commit causes the console login prompt to ignore my keystrokes
> so I'm unable to log in (though ssh works fine). Oddly, if I hit
> ctrl-alt-delete the machine does reboot, but no other keystrokes
> appear on the screen or have any effect that I can detect.
>
> No surprise that this problem appears only on my dual-core amd64
> machine, while the k7 single processor machine works normally.
>
> I've not seen anyone else complaining on the linux list, so maybe
> this is a BIOS bug on the affected machine? It's an ECS mobo with
> a GeForce6100PM-M2 chipset, and I use a ps/2 keyboard.
Uh? How did you get to this commit? I assume a git bisect?
Strange.. I'm going to have a look at it.
On Thu, Oct 23, 2008 at 05:09:24PM -0700, walt wrote:
> Hi team,
>
> commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635
> Author: Heiko Carstens <[email protected]>
> Date: Mon Oct 13 23:50:08 2008 +0200
>
> Call init_workqueues before pre smp initcalls.
>
> This commit causes the console login prompt to ignore my keystrokes
> so I'm unable to log in (though ssh works fine). Oddly, if I hit
> ctrl-alt-delete the machine does reboot, but no other keystrokes
> appear on the screen or have any effect that I can detect.
>
> No surprise that this problem appears only on my dual-core amd64
> machine, while the k7 single processor machine works normally.
>
> I've not seen anyone else complaining on the linux list, so maybe
> this is a BIOS bug on the affected machine? It's an ECS mobo with
> a GeForce6100PM-M2 chipset, and I use a ps/2 keyboard.
Could you try the patch below and see if it makes any difference?
Thanks!
---
include/linux/stop_machine.h | 16 +++++++++++-----
init/main.c | 3 +--
kernel/stop_machine.c | 8 ++++++++
3 files changed, 20 insertions(+), 7 deletions(-)
Index: linux-2.6/include/linux/stop_machine.h
===================================================================
--- linux-2.6.orig/include/linux/stop_machine.h
+++ linux-2.6/include/linux/stop_machine.h
@@ -8,6 +8,16 @@
#include <linux/cpumask.h>
#include <asm/system.h>
+static inline int stop_machine_simple(int (*fn)(void *), void *data,
+ const cpumask_t *cpus)
+{
+ int ret;
+ local_irq_disable();
+ ret = fn(data);
+ local_irq_enable();
+ return ret;
+}
+
#if defined(CONFIG_STOP_MACHINE) && defined(CONFIG_SMP)
/**
@@ -40,11 +50,7 @@ int __stop_machine(int (*fn)(void *), vo
static inline int stop_machine(int (*fn)(void *), void *data,
const cpumask_t *cpus)
{
- int ret;
- local_irq_disable();
- ret = fn(data);
- local_irq_enable();
- return ret;
+ return stop_machine_simple(fn, data, cpus);
}
#endif /* CONFIG_SMP */
#endif /* _LINUX_STOP_MACHINE */
Index: linux-2.6/kernel/stop_machine.c
===================================================================
--- linux-2.6.orig/kernel/stop_machine.c
+++ linux-2.6/kernel/stop_machine.c
@@ -43,6 +43,7 @@ static struct workqueue_struct *stop_mac
static struct stop_machine_data active, idle;
static const cpumask_t *active_cpus;
static void *stop_machine_work;
+static int sm_initialized;
static void set_state(enum stopmachine_state newstate)
{
@@ -114,6 +115,10 @@ int __stop_machine(int (*fn)(void *), vo
struct work_struct *sm_work;
int i;
+ WARN_ON_ONCE(!sm_initialized);
+ if (num_online_cpus() == 1)
+ return stop_machine_simple(fn, data, cpus);
+
/* Set up initial state. */
mutex_lock(&lock);
num_threads = num_online_cpus();
@@ -157,7 +162,10 @@ EXPORT_SYMBOL_GPL(stop_machine);
static int __init stop_machine_init(void)
{
stop_machine_wq = create_rt_workqueue("kstop");
+ BUG_ON(!stop_machine_wq);
stop_machine_work = alloc_percpu(struct work_struct);
+ BUG_ON(!stop_machine_work);
+ sm_initialized = 1;
return 0;
}
early_initcall(stop_machine_init);
Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -851,10 +851,9 @@ static int __init kernel_init(void * unu
cad_pid = task_pid(current);
- init_workqueues();
-
smp_prepare_cpus(setup_max_cpus);
+ init_workqueues();
do_pre_smp_initcalls();
start_boot_trace();
On Fri, 24 Oct 2008, Heiko Carstens wrote:
> On Thu, Oct 23, 2008 at 05:09:24PM -0700, walt wrote:
> > Hi team,
> >
> > commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635
> > Author: Heiko Carstens <[email protected]>
> > Date: Mon Oct 13 23:50:08 2008 +0200
> >
> > Call init_workqueues before pre smp initcalls.
> >
> > This commit causes the console login prompt to ignore my keystrokes
> > so I'm unable to log in (though ssh works fine). Oddly, if I hit
> > ctrl-alt-delete the machine does reboot, but no other keystrokes
> > appear on the screen or have any effect that I can detect.
> >
> > No surprise that this problem appears only on my dual-core amd64
> > machine, while the k7 single processor machine works normally.
> >
> > I've not seen anyone else complaining on the linux list, so maybe
> > this is a BIOS bug on the affected machine? It's an ECS mobo with
> > a GeForce6100PM-M2 chipset, and I use a ps/2 keyboard.
>
> Could you try the patch below and see if it makes any difference?
> Thanks!
<patch snipped for brevity>
No difference that I can detect, sorry. The login prompt still
ignores my keystrokes. I'm happy to try any other ideas or patches.
(Yes, I did use git bisect to find this commit.)
Thanks.
On Fri, Oct 24, 2008 at 05:52:57AM -0700, [email protected] wrote:
> On Fri, 24 Oct 2008, Heiko Carstens wrote:
> > On Thu, Oct 23, 2008 at 05:09:24PM -0700, walt wrote:
> > > Hi team,
> > >
> > > commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635
> > > Author: Heiko Carstens <[email protected]>
> > > Date: Mon Oct 13 23:50:08 2008 +0200
> > >
> > > Call init_workqueues before pre smp initcalls.
> > >
> > > This commit causes the console login prompt to ignore my keystrokes
> > > so I'm unable to log in (though ssh works fine). Oddly, if I hit
> > > ctrl-alt-delete the machine does reboot, but no other keystrokes
> > > appear on the screen or have any effect that I can detect.
> > >
> > > No surprise that this problem appears only on my dual-core amd64
> > > machine, while the k7 single processor machine works normally.
> > >
> > > I've not seen anyone else complaining on the linux list, so maybe
> > > this is a BIOS bug on the affected machine? It's an ECS mobo with
> > > a GeForce6100PM-M2 chipset, and I use a ps/2 keyboard.
> >
> > Could you try the patch below and see if it makes any difference?
> > Thanks!
>
> <patch snipped for brevity>
>
> No difference that I can detect, sorry. The login prompt still
> ignores my keystrokes. I'm happy to try any other ideas or patches.
> (Yes, I did use git bisect to find this commit.)
Could you cc me on replies, please? Mails sent via vger.kernel.org reach
me with a lag of several hours.. so I might miss your reply. :)
Please try the patch below. It basically reverts the whole stop_machine
patches. That way we should know if it really comes from the moved init
call or if it is some other weird bug. I would have expected more bug
reports if the stop_machine patches would be broken. Hmmm...
diff -purN a/include/linux/workqueue.h b/include/linux/workqueue.h
--- a/include/linux/workqueue.h 2008-10-24 21:30:55.000000000 +0200
+++ b/include/linux/workqueue.h 2008-10-24 21:34:05.000000000 +0200
@@ -149,11 +149,11 @@ struct execute_work {
extern struct workqueue_struct *
__create_workqueue_key(const char *name, int singlethread,
- int freezeable, int rt, struct lock_class_key *key,
+ int freezeable, struct lock_class_key *key,
const char *lock_name);
#ifdef CONFIG_LOCKDEP
-#define __create_workqueue(name, singlethread, freezeable, rt) \
+#define __create_workqueue(name, singlethread, freezeable) \
({ \
static struct lock_class_key __key; \
const char *__lock_name; \
@@ -164,19 +164,17 @@ __create_workqueue_key(const char *name,
__lock_name = #name; \
\
__create_workqueue_key((name), (singlethread), \
- (freezeable), (rt), &__key, \
+ (freezeable), &__key, \
__lock_name); \
})
#else
-#define __create_workqueue(name, singlethread, freezeable, rt) \
- __create_workqueue_key((name), (singlethread), (freezeable), (rt), \
- NULL, NULL)
+#define __create_workqueue(name, singlethread, freezeable) \
+ __create_workqueue_key((name), (singlethread), (freezeable), NULL, NULL)
#endif
-#define create_workqueue(name) __create_workqueue((name), 0, 0, 0)
-#define create_rt_workqueue(name) __create_workqueue((name), 0, 0, 1)
-#define create_freezeable_workqueue(name) __create_workqueue((name), 1, 1, 0)
-#define create_singlethread_workqueue(name) __create_workqueue((name), 1, 0, 0)
+#define create_workqueue(name) __create_workqueue((name), 0, 0)
+#define create_freezeable_workqueue(name) __create_workqueue((name), 1, 1)
+#define create_singlethread_workqueue(name) __create_workqueue((name), 1, 0)
extern void destroy_workqueue(struct workqueue_struct *wq);
diff -purN a/init/main.c b/init/main.c
--- a/init/main.c 2008-10-24 21:30:55.000000000 +0200
+++ b/init/main.c 2008-10-24 21:34:09.000000000 +0200
@@ -768,6 +768,8 @@ static void __init do_initcalls(void)
static void __init do_basic_setup(void)
{
rcu_init_sched(); /* needed by module_init stage. */
+ /* drivers will send hotplug events */
+ init_workqueues();
usermodehelper_init();
driver_init();
init_irq_proc();
@@ -851,8 +853,6 @@ static int __init kernel_init(void * unu
cad_pid = task_pid(current);
- init_workqueues();
-
smp_prepare_cpus(setup_max_cpus);
do_pre_smp_initcalls();
diff -purN a/kernel/stop_machine.c b/kernel/stop_machine.c
--- a/kernel/stop_machine.c 2008-10-24 21:30:55.000000000 +0200
+++ b/kernel/stop_machine.c 2008-10-24 21:34:01.000000000 +0200
@@ -37,13 +37,9 @@ struct stop_machine_data {
/* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */
static unsigned int num_threads;
static atomic_t thread_ack;
+static struct completion finished;
static DEFINE_MUTEX(lock);
-static struct workqueue_struct *stop_machine_wq;
-static struct stop_machine_data active, idle;
-static const cpumask_t *active_cpus;
-static void *stop_machine_work;
-
static void set_state(enum stopmachine_state newstate)
{
/* Reset ack counter. */
@@ -55,26 +51,21 @@ static void set_state(enum stopmachine_s
/* Last one to ack a state moves to the next state. */
static void ack_state(void)
{
- if (atomic_dec_and_test(&thread_ack))
- set_state(state + 1);
+ if (atomic_dec_and_test(&thread_ack)) {
+ /* If we're the last one to ack the EXIT, we're finished. */
+ if (state == STOPMACHINE_EXIT)
+ complete(&finished);
+ else
+ set_state(state + 1);
+ }
}
-/* This is the actual function which stops the CPU. It runs
- * in the context of a dedicated stopmachine workqueue. */
-static void stop_cpu(struct work_struct *unused)
+/* This is the actual thread which stops the CPU. It exits by itself rather
+ * than waiting for kthread_stop(), because it's easier for hotplug CPU. */
+static int stop_cpu(struct stop_machine_data *smdata)
{
enum stopmachine_state curstate = STOPMACHINE_NONE;
- struct stop_machine_data *smdata = &idle;
- int cpu = smp_processor_id();
- int err;
-
- if (!active_cpus) {
- if (cpu == first_cpu(cpu_online_map))
- smdata = &active;
- } else {
- if (cpu_isset(cpu, *active_cpus))
- smdata = &active;
- }
+
/* Simple state machine */
do {
/* Chill out and ensure we re-read stopmachine_state. */
@@ -87,11 +78,9 @@ static void stop_cpu(struct work_struct
hard_irq_disable();
break;
case STOPMACHINE_RUN:
- /* On multiple CPUs only a single error code
- * is needed to tell that something failed. */
- err = smdata->fn(smdata->data);
- if (err)
- smdata->fnret = err;
+ /* |= allows error detection if functions on
+ * multiple CPUs. */
+ smdata->fnret |= smdata->fn(smdata->data);
break;
default:
break;
@@ -101,6 +90,7 @@ static void stop_cpu(struct work_struct
} while (curstate != STOPMACHINE_EXIT);
local_irq_enable();
+ do_exit(0);
}
/* Callback for CPUs which aren't supposed to do anything. */
@@ -111,34 +101,78 @@ static int chill(void *unused)
int __stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
{
- struct work_struct *sm_work;
- int i;
+ int i, err;
+ struct stop_machine_data active, idle;
+ struct task_struct **threads;
- /* Set up initial state. */
- mutex_lock(&lock);
- num_threads = num_online_cpus();
- active_cpus = cpus;
active.fn = fn;
active.data = data;
active.fnret = 0;
idle.fn = chill;
idle.data = NULL;
+ /* This could be too big for stack on large machines. */
+ threads = kcalloc(NR_CPUS, sizeof(threads[0]), GFP_KERNEL);
+ if (!threads)
+ return -ENOMEM;
+
+ /* Set up initial state. */
+ mutex_lock(&lock);
+ init_completion(&finished);
+ num_threads = num_online_cpus();
set_state(STOPMACHINE_PREPARE);
- /* Schedule the stop_cpu work on all cpus: hold this CPU so one
- * doesn't hit this CPU until we're ready. */
- get_cpu();
for_each_online_cpu(i) {
- sm_work = percpu_ptr(stop_machine_work, i);
- INIT_WORK(sm_work, stop_cpu);
- queue_work_on(i, stop_machine_wq, sm_work);
+ struct stop_machine_data *smdata = &idle;
+ struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
+
+ if (!cpus) {
+ if (i == first_cpu(cpu_online_map))
+ smdata = &active;
+ } else {
+ if (cpu_isset(i, *cpus))
+ smdata = &active;
+ }
+
+ threads[i] = kthread_create((void *)stop_cpu, smdata, "kstop%u",
+ i);
+ if (IS_ERR(threads[i])) {
+ err = PTR_ERR(threads[i]);
+ threads[i] = NULL;
+ goto kill_threads;
+ }
+
+ /* Place it onto correct cpu. */
+ kthread_bind(threads[i], i);
+
+ /* Make it highest prio. */
+ if (sched_setscheduler_nocheck(threads[i], SCHED_FIFO, ¶m))
+ BUG();
}
+
+ /* We've created all the threads. Wake them all: hold this CPU so one
+ * doesn't hit this CPU until we're ready. */
+ get_cpu();
+ for_each_online_cpu(i)
+ wake_up_process(threads[i]);
+
/* This will release the thread on our CPU. */
put_cpu();
- flush_workqueue(stop_machine_wq);
+ wait_for_completion(&finished);
mutex_unlock(&lock);
+
+ kfree(threads);
+
return active.fnret;
+
+kill_threads:
+ for_each_online_cpu(i)
+ if (threads[i])
+ kthread_stop(threads[i]);
+ mutex_unlock(&lock);
+
+ kfree(threads);
+ return err;
}
int stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
@@ -153,11 +187,3 @@ int stop_machine(int (*fn)(void *), void
return ret;
}
EXPORT_SYMBOL_GPL(stop_machine);
-
-static int __init stop_machine_init(void)
-{
- stop_machine_wq = create_rt_workqueue("kstop");
- stop_machine_work = alloc_percpu(struct work_struct);
- return 0;
-}
-early_initcall(stop_machine_init);
diff -purN a/kernel/workqueue.c b/kernel/workqueue.c
--- a/kernel/workqueue.c 2008-10-24 21:30:55.000000000 +0200
+++ b/kernel/workqueue.c 2008-10-24 21:34:05.000000000 +0200
@@ -62,7 +62,6 @@ struct workqueue_struct {
const char *name;
int singlethread;
int freezeable; /* Freeze threads during suspend */
- int rt;
#ifdef CONFIG_LOCKDEP
struct lockdep_map lockdep_map;
#endif
@@ -767,7 +766,6 @@ init_cpu_workqueue(struct workqueue_stru
static int create_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
{
- struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
struct workqueue_struct *wq = cwq->wq;
const char *fmt = is_single_threaded(wq) ? "%s" : "%s/%d";
struct task_struct *p;
@@ -783,8 +781,7 @@ static int create_workqueue_thread(struc
*/
if (IS_ERR(p))
return PTR_ERR(p);
- if (cwq->wq->rt)
- sched_setscheduler_nocheck(p, SCHED_FIFO, ¶m);
+
cwq->thread = p;
return 0;
@@ -804,7 +801,6 @@ static void start_workqueue_thread(struc
struct workqueue_struct *__create_workqueue_key(const char *name,
int singlethread,
int freezeable,
- int rt,
struct lock_class_key *key,
const char *lock_name)
{
@@ -826,7 +822,6 @@ struct workqueue_struct *__create_workqu
lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
wq->singlethread = singlethread;
wq->freezeable = freezeable;
- wq->rt = rt;
INIT_LIST_HEAD(&wq->list);
if (singlethread) {
Heiko Carstens wrote:
> On Fri, Oct 24, 2008 at 05:52:57AM -0700, [email protected] wrote:
>> On Fri, 24 Oct 2008, Heiko Carstens wrote:
>>> On Thu, Oct 23, 2008 at 05:09:24PM -0700, walt wrote:
>>>> Hi team,
>>>>
>>>> commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635
>>>> Author: Heiko Carstens<[email protected]>
>>>> Date: Mon Oct 13 23:50:08 2008 +0200
>>>>
>>>> Call init_workqueues before pre smp initcalls.
>>>>
>>>> This commit causes the console login prompt to ignore my keystrokes
>>>> so I'm unable to log in (though ssh works fine). Oddly, if I hit
>>>> ctrl-alt-delete the machine does reboot, but no other keystrokes
>>>> appear on the screen or have any effect that I can detect...
> Please try the patch below. It basically reverts the whole stop_machine
> patches. That way we should know if it really comes from the moved init
> call or if it is some other weird bug. I would have expected more bug
> reports if the stop_machine patches would be broken. Hmmm...
<second patch snipped for brevity>
Yes your second patch fixes the keyboard problem, thanks. The lack of
other complainers is what made me think of a BIOS bug, but that's just
wild guesswork on my part. I'm happy to keep trying patches if you can
keep making them :-)
On Fri, Oct 24, 2008 at 04:13:59PM -0700, walt wrote:
>> Please try the patch below. It basically reverts the whole stop_machine
>> patches. That way we should know if it really comes from the moved init
>> call or if it is some other weird bug. I would have expected more bug
>> reports if the stop_machine patches would be broken. Hmmm...
>
> <second patch snipped for brevity>
>
> Yes your second patch fixes the keyboard problem, thanks. The lack of
> other complainers is what made me think of a BIOS bug, but that's just
> wild guesswork on my part. I'm happy to keep trying patches if you can
> keep making them :-)
Thanks a lot for testing and your patience!
The patch below moves the init call to its old place but makes the
stop_machine initialization happen later. I added a BUG_ON to catch possible
early calls, but there shouldn't be any. I think the patch should solve
the regression you are seeing. Could you give it a try please?
The patch applies on top of latest Linus' git tree.
Thanks,
Heiko
---
include/linux/stop_machine.h | 6 ++++++
init/main.c | 5 +++--
kernel/stop_machine.c | 9 ++++++---
3 files changed, 15 insertions(+), 5 deletions(-)
Index: linux-2.6/include/linux/stop_machine.h
===================================================================
--- linux-2.6.orig/include/linux/stop_machine.h
+++ linux-2.6/include/linux/stop_machine.h
@@ -35,6 +35,9 @@ int stop_machine(int (*fn)(void *), void
* won't come or go while it's being called. Used by hotplug cpu.
*/
int __stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus);
+
+void stop_machine_init(void);
+
#else
static inline int stop_machine(int (*fn)(void *), void *data,
@@ -46,5 +49,8 @@ static inline int stop_machine(int (*fn)
local_irq_enable();
return ret;
}
+
+static inline void stop_machine_init(void) { }
+
#endif /* CONFIG_SMP */
#endif /* _LINUX_STOP_MACHINE */
Index: linux-2.6/init/main.c
===================================================================
--- linux-2.6.orig/init/main.c
+++ linux-2.6/init/main.c
@@ -38,6 +38,7 @@
#include <linux/moduleparam.h>
#include <linux/kallsyms.h>
#include <linux/writeback.h>
+#include <linux/stop_machine.h>
#include <linux/cpu.h>
#include <linux/cpuset.h>
#include <linux/cgroup.h>
@@ -768,6 +769,8 @@ static void __init do_initcalls(void)
static void __init do_basic_setup(void)
{
rcu_init_sched(); /* needed by module_init stage. */
+ init_workqueues();
+ stop_machine_init();
usermodehelper_init();
driver_init();
init_irq_proc();
@@ -851,8 +854,6 @@ static int __init kernel_init(void * unu
cad_pid = task_pid(current);
- init_workqueues();
-
smp_prepare_cpus(setup_max_cpus);
do_pre_smp_initcalls();
Index: linux-2.6/kernel/stop_machine.c
===================================================================
--- linux-2.6.orig/kernel/stop_machine.c
+++ linux-2.6/kernel/stop_machine.c
@@ -43,6 +43,7 @@ static struct workqueue_struct *stop_mac
static struct stop_machine_data active, idle;
static const cpumask_t *active_cpus;
static void *stop_machine_work;
+static int stop_machine_initialized;
static void set_state(enum stopmachine_state newstate)
{
@@ -114,6 +115,7 @@ int __stop_machine(int (*fn)(void *), vo
struct work_struct *sm_work;
int i;
+ BUG_ON(!stop_machine_initialized);
/* Set up initial state. */
mutex_lock(&lock);
num_threads = num_online_cpus();
@@ -154,10 +156,11 @@ int stop_machine(int (*fn)(void *), void
}
EXPORT_SYMBOL_GPL(stop_machine);
-static int __init stop_machine_init(void)
+void __init stop_machine_init(void)
{
stop_machine_wq = create_rt_workqueue("kstop");
+ BUG_ON(!stop_machine_wq);
stop_machine_work = alloc_percpu(struct work_struct);
- return 0;
+ BUG_ON(!stop_machine_work);
+ stop_machine_initialized = 1;
}
-early_initcall(stop_machine_init);
On Sat, 25 Oct 2008, Heiko Carstens wrote:
> On Fri, Oct 24, 2008 at 04:13:59PM -0700, walt wrote:
> >> Please try the patch below. It basically reverts the whole stop_machine
> >> patches. That way we should know if it really comes from the moved init
> >> call or if it is some other weird bug. I would have expected more bug
> >> reports if the stop_machine patches would be broken. Hmmm...
> >
> > <second patch snipped for brevity>
> >
> > Yes your second patch fixes the keyboard problem, thanks. The lack of
> > other complainers is what made me think of a BIOS bug, but that's just
> > wild guesswork on my part. I'm happy to keep trying patches if you can
> > keep making them :-)
>
> Thanks a lot for testing and your patience!
> The patch below moves the init call to its old place but makes the
> stop_machine initialization happen later. I added a BUG_ON to catch possible
> early calls, but there shouldn't be any. I think the patch should solve
> the regression you are seeing. Could you give it a try please?
> The patch applies on top of latest Linus' git tree.
Thanks, Heiko: this patch, or the revert before it, gets my PowerMac G5
working with 2.6.28-rc1. I had the same console keyboard issue as Walt
(though USB not PS/2 here), and also the Broadcom tg3 appeared not to
be initializing (to judge by bootup messages - but without a keyboard
I didn't investigate, and forgot to try pinging from elsewhere).
Either of your patches fix both issues (but I've not reviewed them
or tried them on other machines which were having no problem).
Hugh
>
> Thanks,
> Heiko
>
> ---
> include/linux/stop_machine.h | 6 ++++++
> init/main.c | 5 +++--
> kernel/stop_machine.c | 9 ++++++---
> 3 files changed, 15 insertions(+), 5 deletions(-)
>
> Index: linux-2.6/include/linux/stop_machine.h
> ===================================================================
> --- linux-2.6.orig/include/linux/stop_machine.h
> +++ linux-2.6/include/linux/stop_machine.h
> @@ -35,6 +35,9 @@ int stop_machine(int (*fn)(void *), void
> * won't come or go while it's being called. Used by hotplug cpu.
> */
> int __stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus);
> +
> +void stop_machine_init(void);
> +
> #else
>
> static inline int stop_machine(int (*fn)(void *), void *data,
> @@ -46,5 +49,8 @@ static inline int stop_machine(int (*fn)
> local_irq_enable();
> return ret;
> }
> +
> +static inline void stop_machine_init(void) { }
> +
> #endif /* CONFIG_SMP */
> #endif /* _LINUX_STOP_MACHINE */
> Index: linux-2.6/init/main.c
> ===================================================================
> --- linux-2.6.orig/init/main.c
> +++ linux-2.6/init/main.c
> @@ -38,6 +38,7 @@
> #include <linux/moduleparam.h>
> #include <linux/kallsyms.h>
> #include <linux/writeback.h>
> +#include <linux/stop_machine.h>
> #include <linux/cpu.h>
> #include <linux/cpuset.h>
> #include <linux/cgroup.h>
> @@ -768,6 +769,8 @@ static void __init do_initcalls(void)
> static void __init do_basic_setup(void)
> {
> rcu_init_sched(); /* needed by module_init stage. */
> + init_workqueues();
> + stop_machine_init();
> usermodehelper_init();
> driver_init();
> init_irq_proc();
> @@ -851,8 +854,6 @@ static int __init kernel_init(void * unu
>
> cad_pid = task_pid(current);
>
> - init_workqueues();
> -
> smp_prepare_cpus(setup_max_cpus);
>
> do_pre_smp_initcalls();
> Index: linux-2.6/kernel/stop_machine.c
> ===================================================================
> --- linux-2.6.orig/kernel/stop_machine.c
> +++ linux-2.6/kernel/stop_machine.c
> @@ -43,6 +43,7 @@ static struct workqueue_struct *stop_mac
> static struct stop_machine_data active, idle;
> static const cpumask_t *active_cpus;
> static void *stop_machine_work;
> +static int stop_machine_initialized;
>
> static void set_state(enum stopmachine_state newstate)
> {
> @@ -114,6 +115,7 @@ int __stop_machine(int (*fn)(void *), vo
> struct work_struct *sm_work;
> int i;
>
> + BUG_ON(!stop_machine_initialized);
> /* Set up initial state. */
> mutex_lock(&lock);
> num_threads = num_online_cpus();
> @@ -154,10 +156,11 @@ int stop_machine(int (*fn)(void *), void
> }
> EXPORT_SYMBOL_GPL(stop_machine);
>
> -static int __init stop_machine_init(void)
> +void __init stop_machine_init(void)
> {
> stop_machine_wq = create_rt_workqueue("kstop");
> + BUG_ON(!stop_machine_wq);
> stop_machine_work = alloc_percpu(struct work_struct);
> - return 0;
> + BUG_ON(!stop_machine_work);
> + stop_machine_initialized = 1;
> }
> -early_initcall(stop_machine_init);
Heiko Carstens wrote:
> On Fri, Oct 24, 2008 at 04:13:59PM -0700, walt wrote:
>>> Please try the patch below. It basically reverts the whole stop_machine
>>> patches. That way we should know if it really comes from the moved init
>>> call or if it is some other weird bug. I would have expected more bug
>>> reports if the stop_machine patches would be broken. Hmmm...
>> <second patch snipped for brevity>
>>
>> Yes your second patch fixes the keyboard problem, thanks. The lack of
>> other complainers is what made me think of a BIOS bug, but that's just
>> wild guesswork on my part. I'm happy to keep trying patches if you can
>> keep making them :-)
>
> Thanks a lot for testing and your patience!
> The patch below moves the init call to its old place but makes the
> stop_machine initialization happen later. I added a BUG_ON to catch possible
> early calls, but there shouldn't be any. I think the patch should solve
> the regression you are seeing. Could you give it a try please?
> The patch applies on top of latest Linus' git tree...
<patch snipped for brevity>
Yes, everything seems back to normal with this patch, thanks.