2004-09-09 03:21:24

by Shaohua Li

[permalink] [raw]
Subject: RE: [ACPI] Re: [PATCH] Oops and panic while unloading holder of pm_idle

On Thursday, September 09, 2004, Zwane Mwaikambo wrote:
>Hello Shaohua,
>
>> + /* Wait the idle thread to read the new value,
>> + * otherwise we Oops.
>> + */
>
>I should have mentioned this earlier, but the comment needs work,
>something like this;
>
>"We are about to unload the current idle thread pm callback (pm_idle),
> Wait for all processors to update cached/local copies of pm_idle
before
> proceeding."
>
>> + /* If pm_idle is in a module and is preempted,
>> + * oops occurs. Disable preempt.
>> + */
>
>"Mark this as an RCU critical section so that synchronize_kernel() in
the
> unload path waits for our completion."
>
>Then resend the patch, i swear it'll be the last time ;)
Zwane,
It's fine. Here is the updated patch.

Thanks,
Shaohua

Signed-off-by: Li Shaohua <[email protected]>
===== arch/i386/kernel/apm.c 1.67 vs edited =====
--- 1.67/arch/i386/kernel/apm.c 2004-08-24 17:08:47 +08:00
+++ edited/arch/i386/kernel/apm.c 2004-09-09 10:57:16 +08:00
@@ -2362,8 +2362,15 @@ static void __exit apm_exit(void)
{
int error;

- if (set_pm_idle)
+ if (set_pm_idle) {
pm_idle = original_pm_idle;
+ /*
+ * We are about to unload the current idle thread pm
callback
+ * (pm_idle), Wait for all processors to update
cached/local
+ * copies of pm_idle before proceeding.
+ */
+ synchronize_kernel();
+ }
if (((apm_info.bios.flags & APM_BIOS_DISENGAGED) == 0)
&& (apm_info.connection_version > 0x0100)) {
error = apm_engage_power_management(APM_DEVICE_ALL, 0);
===== arch/i386/kernel/process.c 1.75 vs edited =====
--- 1.75/arch/i386/kernel/process.c 2004-08-24 17:08:44 +08:00
+++ edited/arch/i386/kernel/process.c 2004-09-09 10:59:47 +08:00
@@ -142,6 +142,12 @@ void cpu_idle (void)
/* endless idle loop with no priority at all */
while (1) {
while (!need_resched()) {
+ /*
+ * Mark this as an RCU critical section so that
+ * synchronize_kernel() in the unload path waits
+ * for our completion.
+ */
+ rcu_read_lock();
void (*idle)(void) = pm_idle;

if (!idle)
@@ -149,6 +155,7 @@ void cpu_idle (void)

irq_stat[smp_processor_id()].idle_timestamp =
jiffies;
idle();
+ rcu_read_unlock();
}
schedule();
}
===== arch/ia64/kernel/process.c 1.62 vs edited =====
--- 1.62/arch/ia64/kernel/process.c 2004-07-27 13:26:50 +08:00
+++ edited/arch/ia64/kernel/process.c 2004-09-09 11:01:12 +08:00
@@ -228,18 +228,26 @@ cpu_idle (void *unused)

/* endless idle loop with no priority at all */
while (1) {
- void (*idle)(void) = pm_idle;
- if (!idle)
- idle = default_idle;
-
#ifdef CONFIG_SMP
if (!need_resched())
min_xtp();
#endif
while (!need_resched()) {
+ void (*idle)(void);
+
if (mark_idle)
(*mark_idle)(1);
+ /*
+ * Mark this as an RCU critical section so that
+ * synchronize_kernel() in the unload path waits
+ * for our completion.
+ */
+ rcu_read_lock();
+ idle= pm_idle;
+ if (!idle)
+ idle = default_idle;
(*idle)();
+ rcu_read_unlock();
}

if (mark_idle)
===== arch/x86_64/kernel/process.c 1.35 vs edited =====
--- 1.35/arch/x86_64/kernel/process.c 2004-08-24 17:08:44 +08:00
+++ edited/arch/x86_64/kernel/process.c 2004-09-09 11:01:57 +08:00
@@ -130,11 +130,20 @@ void cpu_idle (void)
{
/* endless idle loop with no priority at all */
while (1) {
- void (*idle)(void) = pm_idle;
- if (!idle)
- idle = default_idle;
- while (!need_resched())
+ while (!need_resched()) {
+ void (*idle)(void);
+ /*
+ * Mark this as an RCU critical section so that
+ * synchronize_kernel() in the unload path waits
+ * for our completion.
+ */
+ rcu_read_lock();
+ idle = pm_idle;
+ if (!idle)
+ idle = default_idle;
idle();
+ rcu_read_unlock();
+ }
schedule();
}
}
===== drivers/acpi/processor.c 1.61 vs edited =====
--- 1.61/drivers/acpi/processor.c 2004-08-31 23:27:29 +08:00
+++ edited/drivers/acpi/processor.c 2004-09-09 11:02:56 +08:00
@@ -2419,6 +2419,11 @@ acpi_processor_remove (
/* Unregister the idle handler when processor #0 is removed. */
if (pr->id == 0) {
pm_idle = pm_idle_save;
+ /*
+ * We are about to unload the current idle thread pm
callback
+ * (pm_idle), Wait for all processors to update
cached/local
+ * copies of pm_idle before proceeding.
+ */
synchronize_kernel();
}

<<idle.patch>>


Attachments:
idle.patch (3.43 kB)
idle.patch

2004-09-09 12:03:18

by Zwane Mwaikambo

[permalink] [raw]
Subject: [PATCH] Close race with preempt and modular pm_idle callbacks

The following patch from Shaohua Li fixes a race with preempt enabled when
a module containing a pm_idle callback is unloaded. Cached values in local
variables need to be protected as RCU critical sections so that the
synchronize_kernel() call in the unload path waits for all processors.
There original bugzilla entry can be found at

Shaohua, i had to make a small change (variable declaration after
code in code block) so that it compiles with geriatric compilers such as
the ones Andrew is attached to ;)

http://bugzilla.kernel.org/show_bug.cgi?id=1716

arch/i386/kernel/apm.c | 9 ++++++++-
arch/i386/kernel/process.c | 10 +++++++++-
arch/ia64/kernel/process.c | 16 ++++++++++++----
arch/x86_64/kernel/process.c | 17 +++++++++++++----
drivers/acpi/processor.c | 5 +++++
5 files changed, 47 insertions(+), 10 deletions(-)

Signed-off-by: Li Shaohua <[email protected]>
Signed-off-by: Zwane Mwaikambo <[email protected]>

Index: linux-2.6.9-rc1-mm4/drivers/acpi/processor.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.9-rc1-mm4/drivers/acpi/processor.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 processor.c
--- linux-2.6.9-rc1-mm4/drivers/acpi/processor.c 7 Sep 2004 11:52:02 -0000 1.1.1.1
+++ linux-2.6.9-rc1-mm4/drivers/acpi/processor.c 9 Sep 2004 11:55:03 -0000
@@ -2424,6 +2424,11 @@ acpi_processor_remove (
/* Unregister the idle handler when processor #0 is removed. */
if (pr->id == 0) {
pm_idle = pm_idle_save;
+ /*
+ * We are about to unload the current idle thread pm callback
+ * (pm_idle), Wait for all processors to update cached/local
+ * copies of pm_idle before proceeding.
+ */
synchronize_kernel();
}

Index: linux-2.6.9-rc1-mm4/arch/i386/kernel/apm.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.9-rc1-mm4/arch/i386/kernel/apm.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 apm.c
--- linux-2.6.9-rc1-mm4/arch/i386/kernel/apm.c 7 Sep 2004 11:51:59 -0000 1.1.1.1
+++ linux-2.6.9-rc1-mm4/arch/i386/kernel/apm.c 9 Sep 2004 11:55:03 -0000
@@ -2362,8 +2362,15 @@ static void __exit apm_exit(void)
{
int error;

- if (set_pm_idle)
+ if (set_pm_idle) {
pm_idle = original_pm_idle;
+ /*
+ * We are about to unload the current idle thread pm callback
+ * (pm_idle), Wait for all processors to update cached/local
+ * copies of pm_idle before proceeding.
+ */
+ synchronize_kernel();
+ }
if (((apm_info.bios.flags & APM_BIOS_DISENGAGED) == 0)
&& (apm_info.connection_version > 0x0100)) {
error = apm_engage_power_management(APM_DEVICE_ALL, 0);
Index: linux-2.6.9-rc1-mm4/arch/i386/kernel/process.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.9-rc1-mm4/arch/i386/kernel/process.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 process.c
--- linux-2.6.9-rc1-mm4/arch/i386/kernel/process.c 7 Sep 2004 11:51:59 -0000 1.1.1.1
+++ linux-2.6.9-rc1-mm4/arch/i386/kernel/process.c 9 Sep 2004 12:00:44 -0000
@@ -175,7 +175,14 @@ void cpu_idle (void)
/* endless idle loop with no priority at all */
while (1) {
while (!need_resched()) {
- void (*idle)(void) = pm_idle;
+ void (*idle)(void);
+ /*
+ * Mark this as an RCU critical section so that
+ * synchronize_kernel() in the unload path waits
+ * for our completion.
+ */
+ rcu_read_lock();
+ idle = pm_idle;

if (!idle)
idle = default_idle;
@@ -184,6 +191,7 @@ void cpu_idle (void)
play_dead();
irq_stat[smp_processor_id()].idle_timestamp = jiffies;
idle();
+ rcu_read_unlock();
}
schedule();
}
Index: linux-2.6.9-rc1-mm4/arch/ia64/kernel/process.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.9-rc1-mm4/arch/ia64/kernel/process.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 process.c
--- linux-2.6.9-rc1-mm4/arch/ia64/kernel/process.c 7 Sep 2004 11:51:59 -0000 1.1.1.1
+++ linux-2.6.9-rc1-mm4/arch/ia64/kernel/process.c 9 Sep 2004 11:58:21 -0000
@@ -228,18 +228,26 @@ cpu_idle (void *unused)

/* endless idle loop with no priority at all */
while (1) {
- void (*idle)(void) = pm_idle;
- if (!idle)
- idle = default_idle;
-
#ifdef CONFIG_SMP
if (!need_resched())
min_xtp();
#endif
while (!need_resched()) {
+ void (*idle)(void);
+
if (mark_idle)
(*mark_idle)(1);
+ /*
+ * Mark this as an RCU critical section so that
+ * synchronize_kernel() in the unload path waits
+ * for our completion.
+ */
+ rcu_read_lock();
+ idle = pm_idle;
+ if (!idle)
+ idle = default_idle;
(*idle)();
+ rcu_read_unlock();
}

if (mark_idle)
Index: linux-2.6.9-rc1-mm4/arch/x86_64/kernel/process.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.9-rc1-mm4/arch/x86_64/kernel/process.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 process.c
--- linux-2.6.9-rc1-mm4/arch/x86_64/kernel/process.c 7 Sep 2004 11:52:00 -0000 1.1.1.1
+++ linux-2.6.9-rc1-mm4/arch/x86_64/kernel/process.c 9 Sep 2004 11:55:03 -0000
@@ -131,11 +131,20 @@ void cpu_idle (void)
{
/* endless idle loop with no priority at all */
while (1) {
- void (*idle)(void) = pm_idle;
- if (!idle)
- idle = default_idle;
- while (!need_resched())
+ while (!need_resched()) {
+ void (*idle)(void);
+ /*
+ * Mark this as an RCU critical section so that
+ * synchronize_kernel() in the unload path waits
+ * for our completion.
+ */
+ rcu_read_lock();
+ idle = pm_idle;
+ if (!idle)
+ idle = default_idle;
idle();
+ rcu_read_unlock();
+ }
schedule();
}
}