2009-03-01 22:30:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume

Hi,

The following patches modifiy the way in which we handle disabling interrupts
during suspend and enabling them during resume. They also change the ordering
of the core suspend and hibernation code.

Namely, interrupts are currently disabled on the boot CPU as soon as the
nonboot CPUs have been disabled, which doesn't allow device drivers' "late"
suspend and "early" resume callbacks to sleep. Among other things this means
they cannot execute ACPI AML routines, which leads to problems with
suspend-resume of PCI devices, as recently discussed.

1/4 modifies the [suspend|hibernation] and resume code, as well as the other
code using the device PM framework, so that device drivers will not receive
interrupts during the "late" suspend phase, although interrupts will only be
disabled on the CPU right before calling sysdev_suspend() (and analogously
during resume). [Ingo, I didn't add your ACK to the patch, because it's
changed since you saw it last time.]

2/4 - 4/4 modify the suspend, hibernation and kexec jump code, respectively,
so that the "late" phase of suspending devices will happen before the platform
"prepare" callback and the disabling of nonboot CPUs (and analogously during
resume).

Comments welcome.

Thanks,
Rafael


2009-03-01 22:30:27

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 3/4] PM: Change hibernation code ordering

From: Rafael J. Wysocki <[email protected]>

Change the ordering of the hibernation core code so that the platform
"prepare" callbacks are executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change (along with the previous analogous change of the suspend
core code) will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
kernel/power/disk.c | 109 +++++++++++++++++++++++++++++-----------------------
1 file changed, 61 insertions(+), 48 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -228,13 +228,22 @@ static int create_image(int platform_mod
goto Unlock;
}

+ error = platform_pre_snapshot(platform_mode);
+ if (error || hibernation_test(TEST_PLATFORM))
+ goto Platform_finish;
+
+ error = disable_nonboot_cpus();
+ if (error || hibernation_test(TEST_CPUS)
+ || hibernation_testmode(HIBERNATION_TEST))
+ goto Enable_cpus;
+
local_irq_disable();

sysdev_suspend(PMSG_FREEZE);
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
"aborting hibernation\n");
- goto Power_up_devices;
+ goto Enable_irqs;
}

if (hibernation_test(TEST_CORE))
@@ -250,15 +259,22 @@ static int create_image(int platform_mod
restore_processor_state();
if (!in_suspend)
platform_leave(platform_mode);
+
Power_up:
sysdev_resume();
/* NOTE: device_power_up() is just a resume() for devices
* that suspended with irqs off ... no overall powerup.
*/

- Power_up_devices:
+ Enable_irqs:
local_irq_enable();

+ Enable_cpus:
+ enable_nonboot_cpus();
+
+ Platform_finish:
+ platform_finish(platform_mode);
+
device_power_up(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);

@@ -298,25 +314,9 @@ int hibernation_snapshot(int platform_mo
if (hibernation_test(TEST_DEVICES))
goto Recover_platform;

- error = platform_pre_snapshot(platform_mode);
- if (error || hibernation_test(TEST_PLATFORM))
- goto Finish;
-
- error = disable_nonboot_cpus();
- if (!error) {
- if (hibernation_test(TEST_CPUS))
- goto Enable_cpus;
-
- if (hibernation_testmode(HIBERNATION_TEST))
- goto Enable_cpus;
+ error = create_image(platform_mode);
+ /* Control returns here after successful restore */

- error = create_image(platform_mode);
- /* Control returns here after successful restore */
- }
- Enable_cpus:
- enable_nonboot_cpus();
- Finish:
- platform_finish(platform_mode);
Resume_devices:
device_resume(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
@@ -338,7 +338,7 @@ int hibernation_snapshot(int platform_mo
* kernel.
*/

-static int resume_target_kernel(void)
+static int resume_target_kernel(bool platform_mode)
{
int error;

@@ -351,9 +351,20 @@ static int resume_target_kernel(void)
goto Unlock;
}

+ error = platform_pre_restore(platform_mode);
+ if (error)
+ goto Cleanup;
+
+ error = disable_nonboot_cpus();
+ if (error)
+ goto Enable_cpus;
+
local_irq_disable();

- sysdev_suspend(PMSG_QUIESCE);
+ error = sysdev_suspend(PMSG_QUIESCE);
+ if (error)
+ goto Enable_irqs;
+
/* We'll ignore saved state, but this gets preempt count (etc) right */
save_processor_state();
error = restore_highmem();
@@ -379,8 +390,15 @@ static int resume_target_kernel(void)

sysdev_resume();

+ Enable_irqs:
local_irq_enable();

+ Enable_cpus:
+ enable_nonboot_cpus();
+
+ Cleanup:
+ platform_restore_cleanup(platform_mode);
+
device_power_up(PMSG_RECOVER);

Unlock:
@@ -405,19 +423,10 @@ int hibernation_restore(int platform_mod
pm_prepare_console();
suspend_console();
error = device_suspend(PMSG_QUIESCE);
- if (error)
- goto Finish;
-
- error = platform_pre_restore(platform_mode);
if (!error) {
- error = disable_nonboot_cpus();
- if (!error)
- error = resume_target_kernel();
- enable_nonboot_cpus();
+ error = resume_target_kernel(platform_mode);
+ device_resume(PMSG_RECOVER);
}
- platform_restore_cleanup(platform_mode);
- device_resume(PMSG_RECOVER);
- Finish:
resume_console();
pm_restore_console();
return error;
@@ -453,34 +462,38 @@ int hibernation_platform_enter(void)
goto Resume_devices;
}

+ device_pm_lock();
+
+ error = device_power_down(PMSG_HIBERNATE);
+ if (error)
+ goto Unlock;
+
error = hibernation_ops->prepare();
if (error)
- goto Resume_devices;
+ goto Platofrm_finish;

error = disable_nonboot_cpus();
if (error)
- goto Finish;
-
- device_pm_lock();
-
- error = device_power_down(PMSG_HIBERNATE);
- if (!error) {
- local_irq_disable();
- sysdev_suspend(PMSG_HIBERNATE);
- hibernation_ops->enter();
- /* We should never get here */
- while (1);
- }
+ goto Platofrm_finish;

- device_pm_unlock();
+ local_irq_disable();
+ sysdev_suspend(PMSG_HIBERNATE);
+ hibernation_ops->enter();
+ /* We should never get here */
+ while (1);

/*
* We don't need to reenable the nonboot CPUs or resume consoles, since
* the system is going to be halted anyway.
*/
- Finish:
+ Platofrm_finish:
hibernation_ops->finish();

+ device_power_up(PMSG_RESTORE);
+
+ Unlock:
+ device_pm_unlock();
+
Resume_devices:
entering_platform_hibernation = false;
device_resume(PMSG_RESTORE);

2009-03-01 22:30:59

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 4/4] kexec: Change kexec jump code ordering

From: Rafael J. Wysocki <[email protected]>

Change the ordering of the kexec jump code so that the nonboot CPUs
are disabled after calling device drivers' "late suspend" methods.

This change reflects the recent modifications of the power management
code that is also used by kexec jump.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
kernel/kexec.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1450,9 +1450,6 @@ int kernel_kexec(void)
error = device_suspend(PMSG_FREEZE);
if (error)
goto Resume_console;
- error = disable_nonboot_cpus();
- if (error)
- goto Resume_devices;
device_pm_lock();
/* At this point, device_suspend() has been called,
* but *not* device_power_down(). We *must*
@@ -1463,13 +1460,15 @@ int kernel_kexec(void)
*/
error = device_power_down(PMSG_FREEZE);
if (error)
- goto Unlock_pm;
-
+ goto Resume_devices;
+ error = disable_nonboot_cpus();
+ if (error)
+ goto Enable_cpus;
local_irq_disable();
/* Suspend system devices */
error = sysdev_suspend(PMSG_FREEZE);
if (error)
- goto Power_up_devices;
+ goto Enable_irqs;
} else
#endif
{
@@ -1483,13 +1482,13 @@ int kernel_kexec(void)
#ifdef CONFIG_KEXEC_JUMP
if (kexec_image->preserve_context) {
sysdev_resume();
- Power_up_devices:
+ Enable_irqs:
local_irq_enable();
- device_power_up(PMSG_RESTORE);
- Unlock_pm:
- device_pm_unlock();
+ Enable_cpus:
enable_nonboot_cpus();
+ device_power_up(PMSG_RESTORE);
Resume_devices:
+ device_pm_unlock();
device_resume(PMSG_RESTORE);
Resume_console:
resume_console();

2009-03-01 22:31:23

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 2/4] PM: Change suspend code ordering

From: Rafael J. Wysocki <[email protected]>

Change the ordering of the suspend core code so that the platform
"prepare" callback is executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
kernel/power/main.c | 38 ++++++++++++++++++++++----------------
1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -297,6 +297,19 @@ static int suspend_enter(suspend_state_t
goto Done;
}

+ if (suspend_ops->prepare) {
+ error = suspend_ops->prepare();
+ if (error)
+ goto Power_up_devices;
+ }
+
+ if (suspend_test(TEST_PLATFORM))
+ goto Platfrom_finish;
+
+ error = disable_nonboot_cpus();
+ if (error || suspend_test(TEST_CPUS))
+ goto Enable_cpus;
+
arch_suspend_disable_irqs();
BUG_ON(!irqs_disabled());

@@ -310,6 +323,14 @@ static int suspend_enter(suspend_state_t
arch_suspend_enable_irqs();
BUG_ON(irqs_disabled());

+ Enable_cpus:
+ enable_nonboot_cpus();
+
+ Platfrom_finish:
+ if (suspend_ops->finish)
+ suspend_ops->finish();
+
+ Power_up_devices:
device_power_up(PMSG_RESUME);

Done:
@@ -346,23 +367,8 @@ int suspend_devices_and_enter(suspend_st
if (suspend_test(TEST_DEVICES))
goto Recover_platform;

- if (suspend_ops->prepare) {
- error = suspend_ops->prepare();
- if (error)
- goto Resume_devices;
- }
-
- if (suspend_test(TEST_PLATFORM))
- goto Finish;
+ suspend_enter(state);

- error = disable_nonboot_cpus();
- if (!error && !suspend_test(TEST_CPUS))
- suspend_enter(state);
-
- enable_nonboot_cpus();
- Finish:
- if (suspend_ops->finish)
- suspend_ops->finish();
Resume_devices:
suspend_test_start();
device_resume(PMSG_RESUME);

2009-03-01 22:31:38

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

From: Rafael J. Wysocki <[email protected]>

Introduce two helper functions allowing us to prevent device drivers
from getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume, respectively. These
functions make it possible to keep timer interrupts enabled while the
"late" suspend and "early" resume callbacks provided by device
drivers are being executed.

Use these functions to rework the handling of interrupts during
suspend (hibernation) and resume. Namely, interrupts will only be
disabled on the CPU right before suspending sysdevs, while device
drivers will be prevented from receiving interrupts, with the help of
the new helper function, before their "late" suspend callbacks run
(and analogously during resume).

In addition, since the device interrups are now disabled before the
CPU has turned all interrupts off and the CPU will ACK the interrupts
setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
any wake-up interrupts are pending and abort suspend if that's the
case.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
arch/x86/kernel/apm_32.c | 15 ++++++--
drivers/base/power/main.c | 20 ++++++-----
drivers/base/sys.c | 8 ++++
drivers/xen/manage.c | 16 +++++----
include/linux/interrupt.h | 5 ++
include/linux/irq.h | 1
kernel/irq/Makefile | 1
kernel/irq/manage.c | 3 +
kernel/irq/pm.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++
kernel/kexec.c | 8 ++--
kernel/power/disk.c | 39 +++++++++++++++++------
kernel/power/main.c | 17 ++++++----
12 files changed, 170 insertions(+), 41 deletions(-)

Index: linux-2.6/include/linux/interrupt.h
===================================================================
--- linux-2.6.orig/include/linux/interrupt.h
+++ linux-2.6/include/linux/interrupt.h
@@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned
extern void disable_irq(unsigned int irq);
extern void enable_irq(unsigned int irq);

+/* The following three functions are for the core kernel use only. */
+extern void suspend_device_irqs(void);
+extern void resume_device_irqs(void);
+extern int check_wakeup_irqs(void);
+
#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)

extern cpumask_var_t irq_default_affinity;
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend
*/
static int suspend_enter(suspend_state_t state)
{
- int error = 0;
+ int error;

device_pm_lock();
- arch_suspend_disable_irqs();
- BUG_ON(!irqs_disabled());

- if ((error = device_power_down(PMSG_SUSPEND))) {
+ error = device_power_down(PMSG_SUSPEND);
+ if (error) {
printk(KERN_ERR "PM: Some devices failed to power down\n");
goto Done;
}

+ arch_suspend_disable_irqs();
+ BUG_ON(!irqs_disabled());
+
error = sysdev_suspend(PMSG_SUSPEND);
if (!error) {
if (!suspend_test(TEST_CORE))
@@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t
sysdev_resume();
}

- device_power_up(PMSG_RESUME);
- Done:
arch_suspend_enable_irqs();
BUG_ON(irqs_disabled());
+
+ device_power_up(PMSG_RESUME);
+
+ Done:
device_pm_unlock();
+
return error;
}

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -214,7 +214,7 @@ static int create_image(int platform_mod
return error;

device_pm_lock();
- local_irq_disable();
+
/* At this point, device_suspend() has been called, but *not*
* device_power_down(). We *must* call device_power_down() now.
* Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -225,8 +225,11 @@ static int create_image(int platform_mod
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
"aborting hibernation\n");
- goto Enable_irqs;
+ goto Unlock;
}
+
+ local_irq_disable();
+
sysdev_suspend(PMSG_FREEZE);
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
@@ -252,12 +255,16 @@ static int create_image(int platform_mod
/* NOTE: device_power_up() is just a resume() for devices
* that suspended with irqs off ... no overall powerup.
*/
+
Power_up_devices:
+ local_irq_enable();
+
device_power_up(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
- Enable_irqs:
- local_irq_enable();
+
+ Unlock:
device_pm_unlock();
+
return error;
}

@@ -336,13 +343,16 @@ static int resume_target_kernel(void)
int error;

device_pm_lock();
- local_irq_disable();
+
error = device_power_down(PMSG_QUIESCE);
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
"aborting resume\n");
- goto Enable_irqs;
+ goto Unlock;
}
+
+ local_irq_disable();
+
sysdev_suspend(PMSG_QUIESCE);
/* We'll ignore saved state, but this gets preempt count (etc) right */
save_processor_state();
@@ -366,11 +376,16 @@ static int resume_target_kernel(void)
swsusp_free();
restore_processor_state();
touch_softlockup_watchdog();
+
sysdev_resume();
- device_power_up(PMSG_RECOVER);
- Enable_irqs:
+
local_irq_enable();
+
+ device_power_up(PMSG_RECOVER);
+
+ Unlock:
device_pm_unlock();
+
return error;
}

@@ -447,15 +462,16 @@ int hibernation_platform_enter(void)
goto Finish;

device_pm_lock();
- local_irq_disable();
+
error = device_power_down(PMSG_HIBERNATE);
if (!error) {
+ local_irq_disable();
sysdev_suspend(PMSG_HIBERNATE);
hibernation_ops->enter();
/* We should never get here */
while (1);
}
- local_irq_enable();
+
device_pm_unlock();

/*
@@ -464,12 +480,15 @@ int hibernation_platform_enter(void)
*/
Finish:
hibernation_ops->finish();
+
Resume_devices:
entering_platform_hibernation = false;
device_resume(PMSG_RESTORE);
resume_console();
+
Close:
hibernation_ops->end();
+
return error;
}

Index: linux-2.6/arch/x86/kernel/apm_32.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apm_32.c
+++ linux-2.6/arch/x86/kernel/apm_32.c
@@ -1190,8 +1190,10 @@ static int suspend(int vetoable)
struct apm_user *as;

device_suspend(PMSG_SUSPEND);
- local_irq_disable();
+
device_power_down(PMSG_SUSPEND);
+
+ local_irq_disable();
sysdev_suspend(PMSG_SUSPEND);

local_irq_enable();
@@ -1209,9 +1211,12 @@ static int suspend(int vetoable)
if (err != APM_SUCCESS)
apm_error("suspend", err);
err = (err == APM_SUCCESS) ? 0 : -EIO;
+
sysdev_resume();
- device_power_up(PMSG_RESUME);
local_irq_enable();
+
+ device_power_up(PMSG_RESUME);
+
device_resume(PMSG_RESUME);
queue_event(APM_NORMAL_RESUME, NULL);
spin_lock(&user_list_lock);
@@ -1228,8 +1233,9 @@ static void standby(void)
{
int err;

- local_irq_disable();
device_power_down(PMSG_SUSPEND);
+
+ local_irq_disable();
sysdev_suspend(PMSG_SUSPEND);
local_irq_enable();

@@ -1239,8 +1245,9 @@ static void standby(void)

local_irq_disable();
sysdev_resume();
- device_power_up(PMSG_RESUME);
local_irq_enable();
+
+ device_power_up(PMSG_RESUME);
}

static apm_event_t get_event(void)
Index: linux-2.6/drivers/xen/manage.c
===================================================================
--- linux-2.6.orig/drivers/xen/manage.c
+++ linux-2.6/drivers/xen/manage.c
@@ -39,12 +39,6 @@ static int xen_suspend(void *data)

BUG_ON(!irqs_disabled());

- err = device_power_down(PMSG_SUSPEND);
- if (err) {
- printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n",
- err);
- return err;
- }
err = sysdev_suspend(PMSG_SUSPEND);
if (err) {
printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
@@ -69,7 +63,6 @@ static int xen_suspend(void *data)
xen_mm_unpin_all();

sysdev_resume();
- device_power_up(PMSG_RESUME);

if (!*cancelled) {
xen_irq_resume();
@@ -108,6 +101,12 @@ static void do_suspend(void)
/* XXX use normal device tree? */
xenbus_suspend();

+ err = device_power_down(PMSG_SUSPEND);
+ if (err) {
+ printk(KERN_ERR "device_power_down failed: %d\n", err);
+ goto resume_devices;
+ }
+
err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
if (err) {
printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
} else
xenbus_suspend_cancel();

+ device_power_up(PMSG_RESUME);
+
+resume_devices:
device_resume(PMSG_RESUME);

/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
if (error)
goto Resume_devices;
device_pm_lock();
- local_irq_disable();
/* At this point, device_suspend() has been called,
* but *not* device_power_down(). We *must*
* device_power_down() now. Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
*/
error = device_power_down(PMSG_FREEZE);
if (error)
- goto Enable_irqs;
+ goto Unlock_pm;

+ local_irq_disable();
/* Suspend system devices */
error = sysdev_suspend(PMSG_FREEZE);
if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
if (kexec_image->preserve_context) {
sysdev_resume();
Power_up_devices:
- device_power_up(PMSG_RESTORE);
- Enable_irqs:
local_irq_enable();
+ device_power_up(PMSG_RESTORE);
+ Unlock_pm:
device_pm_unlock();
enable_nonboot_cpus();
Resume_devices:
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig
#define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */
#define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */
#define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */

#ifdef CONFIG_IRQ_PER_CPU
# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,78 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <[email protected]>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+ struct irq_desc *desc;
+ int irq;
+
+ for_each_irq_desc(irq, desc) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&desc->lock, flags);
+
+ if (!desc->depth && desc->action
+ && !(desc->action->flags & IRQF_TIMER)) {
+ desc->depth++;
+ desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
+ desc->chip->disable(irq);
+ }
+
+ spin_unlock_irqrestore(&desc->lock, flags);
+ }
+
+ for_each_irq_desc(irq, desc) {
+ if (desc->status & IRQ_SUSPENDED)
+ synchronize_irq(irq);
+ }
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+ struct irq_desc *desc;
+ int irq;
+
+ for_each_irq_desc(irq, desc)
+ if (desc->status & IRQ_SUSPENDED)
+ enable_irq(irq);
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+ struct irq_desc *desc;
+ int irq;
+
+ for_each_irq_desc(irq, desc)
+ if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+ return -EBUSY;
+
+ return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
obj-$(CONFIG_PROC_FS) += proc.o
obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
break;
case 1: {
- unsigned int status = desc->status & ~IRQ_DISABLED;
+ unsigned int status;

+ status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
/* Prevent probing on this irq: */
desc->status = status | IRQ_NOPROBE;
check_irq_resend(desc, irq);
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
#include <linux/pm.h>
#include <linux/resume-trace.h>
#include <linux/rwsem.h>
+#include <linux/interrupt.h>

#include "../base.h"
#include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
* Execute the appropriate "noirq resume" callback for all devices marked
* as DPM_OFF_IRQ.
*
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx. Device drivers should not receive
+ * interrupts while it's being executed.
*/
static void dpm_power_up(pm_message_t state)
{
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
* device_power_up - Turn on all devices that need special attention.
* @state: PM transition of the system being carried out.
*
- * Power on system devices, then devices that required we shut them down
- * with interrupts disabled.
- *
- * Must be called with interrupts disabled.
+ * Call the "early" resume handlers and enable device drivers to receive
+ * interrupts.
*/
void device_power_up(pm_message_t state)
{
dpm_power_up(state);
+ resume_device_irqs();
}
EXPORT_SYMBOL_GPL(device_power_up);

@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
* device_power_down - Shut down special devices.
* @state: PM transition of the system being carried out.
*
- * Power down devices that require interrupts to be disabled.
- * Then power down system devices.
+ * Prevent device drivers from receiving interrupts and call the "late"
+ * suspend handlers.
*
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.
*/
int device_power_down(pm_message_t state)
{
struct device *dev;
int error = 0;

+ suspend_device_irqs();
list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
error = suspend_device_noirq(dev, state);
if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
dev->power.status = DPM_OFF_IRQ;
}
if (error)
- dpm_power_up(resume_event(state));
+ device_power_up(resume_event(state));
return error;
}
EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
#include <linux/pm.h>
#include <linux/device.h>
#include <linux/mutex.h>
+#include <linux/interrupt.h>

#include "base.h"

@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
struct sysdev_driver *drv, *err_drv;
int ret;

+ pr_debug("Checking wake-up interrupts\n");
+
+ /* Return error code if there are any wake-up interrupts pending */
+ ret = check_wakeup_irqs();
+ if (ret)
+ return ret;
+
pr_debug("Suspending System Devices\n");

list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {

2009-03-02 20:50:47

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/4] PM: Change suspend code ordering



On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
>
> From: Rafael J. Wysocki <[email protected]>
>
> Change the ordering of the suspend core code so that the platform
> "prepare" callback is executed and the nonboot CPUs are disabled
> after calling device drivers' "late suspend" methods.

Ok, ack on this whole series, looks fine.

I'd like to see a 5/4 though:

> This change will allow us to rework the PCI PM core so that the power
> state of devices is changed in the "late" phase of suspend (and
> analogously in the "early" phase of resume)

.. doing this. Right now we have that hacky "avoid ACPI by doing a special
limited form of pci_set_power_state() and pci_enable() in the
early_resume. I'd love to see the actual PCI code cleanup too.

Linus

2009-03-02 22:03:20

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 2/4] PM: Change suspend code ordering

On Monday 02 March 2009, Linus Torvalds wrote:
>
> On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
> >
> > From: Rafael J. Wysocki <[email protected]>
> >
> > Change the ordering of the suspend core code so that the platform
> > "prepare" callback is executed and the nonboot CPUs are disabled
> > after calling device drivers' "late suspend" methods.
>
> Ok, ack on this whole series, looks fine.

Thanks!

> I'd like to see a 5/4 though:
>
> > This change will allow us to rework the PCI PM core so that the power
> > state of devices is changed in the "late" phase of suspend (and
> > analogously in the "early" phase of resume)
>
> .. doing this. Right now we have that hacky "avoid ACPI by doing a special
> limited form of pci_set_power_state() and pci_enable() in the
> early_resume. I'd love to see the actual PCI code cleanup too.

Sure, that's the next step, but I wanted to get the ack on the preliminary
patches first. :-)

Rafael

2009-03-02 23:01:46

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Introduce two helper functions allowing us to prevent device drivers
> from getting any interrupts (without disabling interrupts on the CPU)
> during suspend (or hibernation) and to make them start to receive
> interrupts again during the subsequent resume, respectively. ?These
> functions make it possible to keep timer interrupts enabled while the
> "late" suspend and "early" resume callbacks provided by device
> drivers are being executed.
>
> Use these functions to rework the handling of interrupts during
> suspend (hibernation) and resume. ?Namely, interrupts will only be
> disabled on the CPU right before suspending sysdevs, while device
> drivers will be prevented from receiving interrupts, with the help of
> the new helper function, before their "late" suspend callbacks run
> (and analogously during resume).
>
> In addition, since the device interrups are now disabled before the
> CPU has turned all interrupts off and the CPU will ACK the interrupts
> setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
> any wake-up interrupts are pending and abort suspend if that's the
> case.
>


> +void resume_device_irqs(void)
> +{
> + ? ? ? struct irq_desc *desc;
> + ? ? ? int irq;
> +
> + ? ? ? for_each_irq_desc(irq, desc)
> + ? ? ? ? ? ? ? if (desc->status & IRQ_SUSPENDED)
> + ? ? ? ? ? ? ? ? ? ? ? enable_irq(irq);
> +}

I think you need to clear IRQ_SUSPENDED here, not in enable_irq.

> @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
> ? ? ? ? ? ? ? ?WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
> ? ? ? ? ? ? ? ?break;
> ? ? ? ?case 1: {
> - ? ? ? ? ? ? ? unsigned int status = desc->status & ~IRQ_DISABLED;
> + ? ? ? ? ? ? ? unsigned int status;
>
> + ? ? ? ? ? ? ? status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
> ? ? ? ? ? ? ? ?/* Prevent probing on this irq: */
> ? ? ? ? ? ? ? ?desc->status = status | IRQ_NOPROBE;
> ? ? ? ? ? ? ? ?check_irq_resend(desc, irq);

This only clears IRQ_SUSPENDED if the interrupt was not disabled
elsewhere. If a driver calls interrupt_disable in suspend_late, but
calls interrupt_enable lazily, resume_device_irqs will reenable the
interrupt even though the driver has a disable reference.

The rest of the patch looks good.

--
Arve Hj?nnev?g

2009-03-02 23:13:34

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > Introduce two helper functions allowing us to prevent device drivers
> > from getting any interrupts (without disabling interrupts on the CPU)
> > during suspend (or hibernation) and to make them start to receive
> > interrupts again during the subsequent resume, respectively. These
> > functions make it possible to keep timer interrupts enabled while the
> > "late" suspend and "early" resume callbacks provided by device
> > drivers are being executed.
> >
> > Use these functions to rework the handling of interrupts during
> > suspend (hibernation) and resume. Namely, interrupts will only be
> > disabled on the CPU right before suspending sysdevs, while device
> > drivers will be prevented from receiving interrupts, with the help of
> > the new helper function, before their "late" suspend callbacks run
> > (and analogously during resume).
> >
> > In addition, since the device interrups are now disabled before the
> > CPU has turned all interrupts off and the CPU will ACK the interrupts
> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
> > any wake-up interrupts are pending and abort suspend if that's the
> > case.
> >
>
>
> > +void resume_device_irqs(void)
> > +{
> > + struct irq_desc *desc;
> > + int irq;
> > +
> > + for_each_irq_desc(irq, desc)
> > + if (desc->status & IRQ_SUSPENDED)
> > + enable_irq(irq);
> > +}
>
> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.

enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw.

> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
> > WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
> > break;
> > case 1: {
> > - unsigned int status = desc->status & ~IRQ_DISABLED;
> > + unsigned int status;
> >
> > + status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
> > /* Prevent probing on this irq: */
> > desc->status = status | IRQ_NOPROBE;
> > check_irq_resend(desc, irq);
>
> This only clears IRQ_SUSPENDED if the interrupt was not disabled
> elsewhere. If a driver calls interrupt_disable in suspend_late, but
> calls interrupt_enable lazily, resume_device_irqs will reenable the
> interrupt even though the driver has a disable reference.

Then I'd regard the driver as buggy.

> The rest of the patch looks good.

I'm glad you like it.

Thanks,
Rafael

2009-03-02 23:19:18

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
>> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
>> > From: Rafael J. Wysocki <[email protected]>
>> >
>> > Introduce two helper functions allowing us to prevent device drivers
>> > from getting any interrupts (without disabling interrupts on the CPU)
>> > during suspend (or hibernation) and to make them start to receive
>> > interrupts again during the subsequent resume, respectively. ?These
>> > functions make it possible to keep timer interrupts enabled while the
>> > "late" suspend and "early" resume callbacks provided by device
>> > drivers are being executed.
>> >
>> > Use these functions to rework the handling of interrupts during
>> > suspend (hibernation) and resume. ?Namely, interrupts will only be
>> > disabled on the CPU right before suspending sysdevs, while device
>> > drivers will be prevented from receiving interrupts, with the help of
>> > the new helper function, before their "late" suspend callbacks run
>> > (and analogously during resume).
>> >
>> > In addition, since the device interrups are now disabled before the
>> > CPU has turned all interrupts off and the CPU will ACK the interrupts
>> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
>> > any wake-up interrupts are pending and abort suspend if that's the
>> > case.
>> >
>>
>>
>> > +void resume_device_irqs(void)
>> > +{
>> > + ? ? ? struct irq_desc *desc;
>> > + ? ? ? int irq;
>> > +
>> > + ? ? ? for_each_irq_desc(irq, desc)
>> > + ? ? ? ? ? ? ? if (desc->status & IRQ_SUSPENDED)
>> > + ? ? ? ? ? ? ? ? ? ? ? enable_irq(irq);
>> > +}
>>
>> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.
>
> enable_irq() clears IRQ_SUSPENDED. ?This has already been discussed btw.
>

I'm if I missed that discussion, but enable_irq cannot know who is
calling it and therefore cannot know if IRQ_SUSPENDED should be
cleared.

>> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
>> > ? ? ? ? ? ? ? ?WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
>> > ? ? ? ? ? ? ? ?break;
>> > ? ? ? ?case 1: {
>> > - ? ? ? ? ? ? ? unsigned int status = desc->status & ~IRQ_DISABLED;
>> > + ? ? ? ? ? ? ? unsigned int status;
>> >
>> > + ? ? ? ? ? ? ? status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
>> > ? ? ? ? ? ? ? ?/* Prevent probing on this irq: */
>> > ? ? ? ? ? ? ? ?desc->status = status | IRQ_NOPROBE;
>> > ? ? ? ? ? ? ? ?check_irq_resend(desc, irq);
>>
>> This only clears IRQ_SUSPENDED if the interrupt was not disabled
>> elsewhere. If a driver calls interrupt_disable in suspend_late, but
>> calls interrupt_enable lazily, resume_device_irqs will reenable the
>> interrupt even though the driver has a disable reference.
>
> Then I'd regard the driver as buggy.

The bug is not in the driver. The driver called disable_irq once. You
called disable_irq once, but enable_irq twice.

--
Arve Hj?nnev?g

2009-03-02 23:27:43

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
> >> > From: Rafael J. Wysocki <[email protected]>
> >> >
> >> > Introduce two helper functions allowing us to prevent device drivers
> >> > from getting any interrupts (without disabling interrupts on the CPU)
> >> > during suspend (or hibernation) and to make them start to receive
> >> > interrupts again during the subsequent resume, respectively. These
> >> > functions make it possible to keep timer interrupts enabled while the
> >> > "late" suspend and "early" resume callbacks provided by device
> >> > drivers are being executed.
> >> >
> >> > Use these functions to rework the handling of interrupts during
> >> > suspend (hibernation) and resume. Namely, interrupts will only be
> >> > disabled on the CPU right before suspending sysdevs, while device
> >> > drivers will be prevented from receiving interrupts, with the help of
> >> > the new helper function, before their "late" suspend callbacks run
> >> > (and analogously during resume).
> >> >
> >> > In addition, since the device interrups are now disabled before the
> >> > CPU has turned all interrupts off and the CPU will ACK the interrupts
> >> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
> >> > any wake-up interrupts are pending and abort suspend if that's the
> >> > case.
> >> >
> >>
> >>
> >> > +void resume_device_irqs(void)
> >> > +{
> >> > + struct irq_desc *desc;
> >> > + int irq;
> >> > +
> >> > + for_each_irq_desc(irq, desc)
> >> > + if (desc->status & IRQ_SUSPENDED)
> >> > + enable_irq(irq);
> >> > +}
> >>
> >> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.
> >
> > enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw.
> >
>
> I'm if I missed that discussion, but enable_irq cannot know who is
> calling it and therefore cannot know if IRQ_SUSPENDED should be
> cleared.

This change has been requested by Ingo and for a reason.

Ingo, what's your opinion?

> >> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
> >> > WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
> >> > break;
> >> > case 1: {
> >> > - unsigned int status = desc->status & ~IRQ_DISABLED;
> >> > + unsigned int status;
> >> >
> >> > + status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
> >> > /* Prevent probing on this irq: */
> >> > desc->status = status | IRQ_NOPROBE;
> >> > check_irq_resend(desc, irq);
> >>
> >> This only clears IRQ_SUSPENDED if the interrupt was not disabled
> >> elsewhere. If a driver calls interrupt_disable in suspend_late, but
> >> calls interrupt_enable lazily, resume_device_irqs will reenable the
> >> interrupt even though the driver has a disable reference.
> >
> > Then I'd regard the driver as buggy.
>
> The bug is not in the driver. The driver called disable_irq once. You
> called disable_irq once, but enable_irq twice.

Please.

Can you show me a _single_ _driver_ currently in the tree doing something
like you describe in suspend_late and resume_early? If you can't, then please
give up.

Thanks,
Rafael

2009-03-02 23:34:30

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)



On Mon, 2 Mar 2009, Arve Hj?nnev?g wrote:
>
> > enable_irq() clears IRQ_SUSPENDED. ?This has already been discussed btw.
>
> I'm if I missed that discussion, but enable_irq cannot know who is
> calling it and therefore cannot know if IRQ_SUSPENDED should be
> cleared.

Sure it can.

If IRQ_SUSPENDED is not set, then clearing it is a no-op, so that's fine.

If IRQ_SUSPENDED _is_ set, then that means that we're after the
suspend_late() sequence and before the resume_early() sequence, and no
device driver is possibly called in between, so they'd sure better not be
doing anything that does an enable_irq().

IOW, we know who the caller is, simply because there can be no other valid
caller!

Linus

2009-03-02 23:36:25

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)



On Mon, 2 Mar 2009, Linus Torvalds wrote:
>
> If IRQ_SUSPENDED _is_ set, then that means that we're after the
> suspend_late() sequence and before the resume_early() sequence

Sorry, after the suspend, and before the resume.

We could be _in_ the suspend_late/resume_early sequence, but a driver that
were to try to play with interrupts at that stage would be broken. It
can't very well do a enable_irq(), because that would be a MAJOR BUG - it
would make the whole irq suspend thing pointless, since now interrupts
would start to happen exactly where they must not happen!

Linus

2009-03-03 00:09:08

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Mon, Mar 2, 2009 at 3:35 PM, Linus Torvalds
<[email protected]> wrote:
>
>
> On Mon, 2 Mar 2009, Linus Torvalds wrote:
>>
>> If IRQ_SUSPENDED _is_ set, then that means that we're after the
>> suspend_late() sequence and before the resume_early() sequence
>
> Sorry, after the suspend, and before the resume.
>
> We could be _in_ the suspend_late/resume_early sequence, but a driver that
> were to try to play with interrupts at that stage would be broken. It
> can't very well do a enable_irq(), because that would be a MAJOR BUG - it
> would make the whole irq suspend thing pointless, since now interrupts
> would start to happen exactly where they must not happen!

It may be pointless for a driver to call disable_irq and enable_irq
from suspend_late or resume_early (instead of suspend and resume), but
I would not call it a bug. Since disable_irq and enable_irq are
reference counted all this is doing is indicating that this driver can
or cannot accept interrupts. If you want to make an additional
restriction that drivers are not allowed to call disable_irq or
enable_irq from suspend_late and resume_early, then yes you can tell
that enable_irq was called from resume_device_irqs.

I don't know of any drivers that do this, I was just pointing out the
danger of releasing a reference without knowing if you acquired that
reference.

--
Arve Hj?nnev?g

2009-03-03 08:41:23

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Mon, Mar 2, 2009 at 4:08 PM, Arve Hj?nnev?g <[email protected]> wrote:
> On Mon, Mar 2, 2009 at 3:35 PM, Linus Torvalds
> <[email protected]> wrote:
>>
>>
>> On Mon, 2 Mar 2009, Linus Torvalds wrote:
>>>
>>> If IRQ_SUSPENDED _is_ set, then that means that we're after the
>>> suspend_late() sequence and before the resume_early() sequence
>>
>> Sorry, after the suspend, and before the resume.
>>
>> We could be _in_ the suspend_late/resume_early sequence, but a driver that
>> were to try to play with interrupts at that stage would be broken. It
>> can't very well do a enable_irq(), because that would be a MAJOR BUG - it
>> would make the whole irq suspend thing pointless, since now interrupts
>> would start to happen exactly where they must not happen!
>
> It may be pointless for a driver to call disable_irq and enable_irq
> from suspend_late or resume_early (instead of suspend and resume), but
> I would not call it a bug. Since disable_irq and enable_irq are
> reference counted all this is doing is indicating that this driver can
> or cannot accept interrupts. If you want to make an additional
> restriction that drivers are not allowed to call disable_irq or
> enable_irq from suspend_late and resume_early, then yes you can tell
> that enable_irq was called from resume_device_irqs.
>
> I don't know of any drivers that do this, I was just pointing out the
> danger of releasing a reference without knowing if you acquired that
> reference.

I did think of a driver that can call enable_irq during the
suspend_late phase with this patch. This will not cause an extra
enable_irq, but it will enable the interrupt since suspend_device_irqs
never incremented depth. Our keypad driver disables its interrupt(s)
as soon as you press a key and starts a timer to scan the keypad. When
the timer detects that no keys are pressed, it re-enables the
interrupt. Since timers now run during suspend_late, this enable_irq
call can happen after suspend_device_irqs. If suspend_device_irqs
increments depth even if it is not zero, this can be avoided.

--
Arve Hj?nnev?g

2009-03-03 22:56:25

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
>> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <[email protected]> wrote:
>> > On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
>> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
>> >> > From: Rafael J. Wysocki <[email protected]>
>> >> >
>> >> > Introduce two helper functions allowing us to prevent device drivers
>> >> > from getting any interrupts (without disabling interrupts on the CPU)
>> >> > during suspend (or hibernation) and to make them start to receive
>> >> > interrupts again during the subsequent resume, respectively. ?These
>> >> > functions make it possible to keep timer interrupts enabled while the
>> >> > "late" suspend and "early" resume callbacks provided by device
>> >> > drivers are being executed.
>> >> >
>> >> > Use these functions to rework the handling of interrupts during
>> >> > suspend (hibernation) and resume. ?Namely, interrupts will only be
>> >> > disabled on the CPU right before suspending sysdevs, while device
>> >> > drivers will be prevented from receiving interrupts, with the help of
>> >> > the new helper function, before their "late" suspend callbacks run
>> >> > (and analogously during resume).
>> >> >
>> >> > In addition, since the device interrups are now disabled before the
>> >> > CPU has turned all interrupts off and the CPU will ACK the interrupts
>> >> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
>> >> > any wake-up interrupts are pending and abort suspend if that's the
>> >> > case.
>> >> >
>> >>
>> >>
>> >> > +void resume_device_irqs(void)
>> >> > +{
>> >> > + ? ? ? struct irq_desc *desc;
>> >> > + ? ? ? int irq;
>> >> > +
>> >> > + ? ? ? for_each_irq_desc(irq, desc)
>> >> > + ? ? ? ? ? ? ? if (desc->status & IRQ_SUSPENDED)
>> >> > + ? ? ? ? ? ? ? ? ? ? ? enable_irq(irq);
>> >> > +}
>> >>
>> >> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.
>> >
>> > enable_irq() clears IRQ_SUSPENDED. ?This has already been discussed btw.
>> >
>>
>> I'm if I missed that discussion, but enable_irq cannot know who is
>> calling it and therefore cannot know if IRQ_SUSPENDED should be
>> cleared.
>
> This change has been requested by Ingo and for a reason.
>
> Ingo, what's your opinion?
>
>> >> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
>> >> > ? ? ? ? ? ? ? ?WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
>> >> > ? ? ? ? ? ? ? ?break;
>> >> > ? ? ? ?case 1: {
>> >> > - ? ? ? ? ? ? ? unsigned int status = desc->status & ~IRQ_DISABLED;
>> >> > + ? ? ? ? ? ? ? unsigned int status;
>> >> >
>> >> > + ? ? ? ? ? ? ? status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
>> >> > ? ? ? ? ? ? ? ?/* Prevent probing on this irq: */
>> >> > ? ? ? ? ? ? ? ?desc->status = status | IRQ_NOPROBE;
>> >> > ? ? ? ? ? ? ? ?check_irq_resend(desc, irq);
>> >>
>> >> This only clears IRQ_SUSPENDED if the interrupt was not disabled
>> >> elsewhere. If a driver calls interrupt_disable in suspend_late, but
>> >> calls interrupt_enable lazily, resume_device_irqs will reenable the
>> >> interrupt even though the driver has a disable reference.
>> >
>> > Then I'd regard the driver as buggy.
>>
>> The bug is not in the driver. The driver called disable_irq once. You
>> called disable_irq once, but enable_irq twice.
>
> Please.
>
> Can you show me a _single_ _driver_ currently in the tree doing something
> like you describe in suspend_late and resume_early? ?If you can't, then please
> give up.

I don't know if any drivers call disable_irq or enable_irq in their
suspend hooks, but your change also allow timers, and I assume kernel
threads, to run during this phase.

There are several drivers (keypad drivers in particular), in tree and
out of tree, that call enable_irq from timers, and disable_irq from
their interrupt handler. If you also apply your later change to
disable non boot cpus after suspend_device_irqs, then on smp systems
the interrupt handler may run at the same time as suspend_device_irqs.
If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
gets set. If another suspend/resume cycle happens before the timer
runs, you will incorrectly enable the interrupt.

--
Arve Hj?nnev?g

2009-03-04 22:04:10

by Rafael J. Wysocki

[permalink] [raw]
Subject: [Update, rev. 5] Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)

On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> >> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <[email protected]> wrote:
> >> > On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> >> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
[--snip--]
> > Can you show me a _single_ _driver_ currently in the tree doing something
> > like you describe in suspend_late and resume_early? If you can't, then please
> > give up.
>
> I don't know if any drivers call disable_irq or enable_irq in their
> suspend hooks, but your change also allow timers, and I assume kernel
> threads, to run during this phase.
>
> There are several drivers (keypad drivers in particular), in tree and
> out of tree, that call enable_irq from timers, and disable_irq from
> their interrupt handler. If you also apply your later change to
> disable non boot cpus after suspend_device_irqs, then on smp systems
> the interrupt handler may run at the same time as suspend_device_irqs.
> If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
> gets set. If another suspend/resume cycle happens before the timer
> runs, you will incorrectly enable the interrupt.

Well, unfortunately this is a valid point IMO. I've been thinking for quite a
while how to fix it nicely, but I'm not sure if there is a nice fix.

Below is an updated patch, hopefully everyone will be fine with it.

Ingo, is making __enable_irq() an extern function acceptable?

Rafael

---
From: Rafael J. Wysocki <[email protected]>
Subject: PM: Rework handling of interrupts during suspend-resume (rev. 5)

Introduce two helper functions allowing us to prevent device drivers
from getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume, respectively. These
functions make it possible to keep timer interrupts enabled while the
"late" suspend and "early" resume callbacks provided by device
drivers are being executed.

Use these functions to rework the handling of interrupts during
suspend (hibernation) and resume. Namely, interrupts will only be
disabled on the CPU right before suspending sysdevs, while device
drivers will be prevented from receiving interrupts, with the help of
the new helper function, before their "late" suspend callbacks run
(and analogously during resume).

In addition, since the device interrups are now disabled before the
CPU has turned all interrupts off and the CPU will ACK the interrupts
setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
any wake-up interrupts are pending and abort suspend if that's the
case.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
arch/x86/kernel/apm_32.c | 15 +++++--
drivers/base/power/main.c | 20 +++++-----
drivers/base/sys.c | 8 ++++
drivers/xen/manage.c | 16 ++++----
include/linux/interrupt.h | 5 ++
include/linux/irq.h | 1
kernel/irq/Makefile | 1
kernel/irq/internals.h | 1
kernel/irq/manage.c | 2 -
kernel/irq/pm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++
kernel/kexec.c | 8 ++--
kernel/power/disk.c | 39 ++++++++++++++------
kernel/power/main.c | 17 +++++---
13 files changed, 181 insertions(+), 41 deletions(-)

Index: linux-2.6/include/linux/interrupt.h
===================================================================
--- linux-2.6.orig/include/linux/interrupt.h
+++ linux-2.6/include/linux/interrupt.h
@@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned
extern void disable_irq(unsigned int irq);
extern void enable_irq(unsigned int irq);

+/* The following three functions are for the core kernel use only. */
+extern void suspend_device_irqs(void);
+extern void resume_device_irqs(void);
+extern int check_wakeup_irqs(void);
+
#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)

extern cpumask_var_t irq_default_affinity;
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend
*/
static int suspend_enter(suspend_state_t state)
{
- int error = 0;
+ int error;

device_pm_lock();
- arch_suspend_disable_irqs();
- BUG_ON(!irqs_disabled());

- if ((error = device_power_down(PMSG_SUSPEND))) {
+ error = device_power_down(PMSG_SUSPEND);
+ if (error) {
printk(KERN_ERR "PM: Some devices failed to power down\n");
goto Done;
}

+ arch_suspend_disable_irqs();
+ BUG_ON(!irqs_disabled());
+
error = sysdev_suspend(PMSG_SUSPEND);
if (!error) {
if (!suspend_test(TEST_CORE))
@@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t
sysdev_resume();
}

- device_power_up(PMSG_RESUME);
- Done:
arch_suspend_enable_irqs();
BUG_ON(irqs_disabled());
+
+ device_power_up(PMSG_RESUME);
+
+ Done:
device_pm_unlock();
+
return error;
}

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -214,7 +214,7 @@ static int create_image(int platform_mod
return error;

device_pm_lock();
- local_irq_disable();
+
/* At this point, device_suspend() has been called, but *not*
* device_power_down(). We *must* call device_power_down() now.
* Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -225,8 +225,11 @@ static int create_image(int platform_mod
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
"aborting hibernation\n");
- goto Enable_irqs;
+ goto Unlock;
}
+
+ local_irq_disable();
+
sysdev_suspend(PMSG_FREEZE);
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
@@ -252,12 +255,16 @@ static int create_image(int platform_mod
/* NOTE: device_power_up() is just a resume() for devices
* that suspended with irqs off ... no overall powerup.
*/
+
Power_up_devices:
+ local_irq_enable();
+
device_power_up(in_suspend ?
(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
- Enable_irqs:
- local_irq_enable();
+
+ Unlock:
device_pm_unlock();
+
return error;
}

@@ -336,13 +343,16 @@ static int resume_target_kernel(void)
int error;

device_pm_lock();
- local_irq_disable();
+
error = device_power_down(PMSG_QUIESCE);
if (error) {
printk(KERN_ERR "PM: Some devices failed to power down, "
"aborting resume\n");
- goto Enable_irqs;
+ goto Unlock;
}
+
+ local_irq_disable();
+
sysdev_suspend(PMSG_QUIESCE);
/* We'll ignore saved state, but this gets preempt count (etc) right */
save_processor_state();
@@ -366,11 +376,16 @@ static int resume_target_kernel(void)
swsusp_free();
restore_processor_state();
touch_softlockup_watchdog();
+
sysdev_resume();
- device_power_up(PMSG_RECOVER);
- Enable_irqs:
+
local_irq_enable();
+
+ device_power_up(PMSG_RECOVER);
+
+ Unlock:
device_pm_unlock();
+
return error;
}

@@ -447,15 +462,16 @@ int hibernation_platform_enter(void)
goto Finish;

device_pm_lock();
- local_irq_disable();
+
error = device_power_down(PMSG_HIBERNATE);
if (!error) {
+ local_irq_disable();
sysdev_suspend(PMSG_HIBERNATE);
hibernation_ops->enter();
/* We should never get here */
while (1);
}
- local_irq_enable();
+
device_pm_unlock();

/*
@@ -464,12 +480,15 @@ int hibernation_platform_enter(void)
*/
Finish:
hibernation_ops->finish();
+
Resume_devices:
entering_platform_hibernation = false;
device_resume(PMSG_RESTORE);
resume_console();
+
Close:
hibernation_ops->end();
+
return error;
}

Index: linux-2.6/arch/x86/kernel/apm_32.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apm_32.c
+++ linux-2.6/arch/x86/kernel/apm_32.c
@@ -1190,8 +1190,10 @@ static int suspend(int vetoable)
struct apm_user *as;

device_suspend(PMSG_SUSPEND);
- local_irq_disable();
+
device_power_down(PMSG_SUSPEND);
+
+ local_irq_disable();
sysdev_suspend(PMSG_SUSPEND);

local_irq_enable();
@@ -1209,9 +1211,12 @@ static int suspend(int vetoable)
if (err != APM_SUCCESS)
apm_error("suspend", err);
err = (err == APM_SUCCESS) ? 0 : -EIO;
+
sysdev_resume();
- device_power_up(PMSG_RESUME);
local_irq_enable();
+
+ device_power_up(PMSG_RESUME);
+
device_resume(PMSG_RESUME);
queue_event(APM_NORMAL_RESUME, NULL);
spin_lock(&user_list_lock);
@@ -1228,8 +1233,9 @@ static void standby(void)
{
int err;

- local_irq_disable();
device_power_down(PMSG_SUSPEND);
+
+ local_irq_disable();
sysdev_suspend(PMSG_SUSPEND);
local_irq_enable();

@@ -1239,8 +1245,9 @@ static void standby(void)

local_irq_disable();
sysdev_resume();
- device_power_up(PMSG_RESUME);
local_irq_enable();
+
+ device_power_up(PMSG_RESUME);
}

static apm_event_t get_event(void)
Index: linux-2.6/drivers/xen/manage.c
===================================================================
--- linux-2.6.orig/drivers/xen/manage.c
+++ linux-2.6/drivers/xen/manage.c
@@ -39,12 +39,6 @@ static int xen_suspend(void *data)

BUG_ON(!irqs_disabled());

- err = device_power_down(PMSG_SUSPEND);
- if (err) {
- printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n",
- err);
- return err;
- }
err = sysdev_suspend(PMSG_SUSPEND);
if (err) {
printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
@@ -69,7 +63,6 @@ static int xen_suspend(void *data)
xen_mm_unpin_all();

sysdev_resume();
- device_power_up(PMSG_RESUME);

if (!*cancelled) {
xen_irq_resume();
@@ -108,6 +101,12 @@ static void do_suspend(void)
/* XXX use normal device tree? */
xenbus_suspend();

+ err = device_power_down(PMSG_SUSPEND);
+ if (err) {
+ printk(KERN_ERR "device_power_down failed: %d\n", err);
+ goto resume_devices;
+ }
+
err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
if (err) {
printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
} else
xenbus_suspend_cancel();

+ device_power_up(PMSG_RESUME);
+
+resume_devices:
device_resume(PMSG_RESUME);

/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
if (error)
goto Resume_devices;
device_pm_lock();
- local_irq_disable();
/* At this point, device_suspend() has been called,
* but *not* device_power_down(). We *must*
* device_power_down() now. Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
*/
error = device_power_down(PMSG_FREEZE);
if (error)
- goto Enable_irqs;
+ goto Unlock_pm;

+ local_irq_disable();
/* Suspend system devices */
error = sysdev_suspend(PMSG_FREEZE);
if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
if (kexec_image->preserve_context) {
sysdev_resume();
Power_up_devices:
- device_power_up(PMSG_RESTORE);
- Enable_irqs:
local_irq_enable();
+ device_power_up(PMSG_RESTORE);
+ Unlock_pm:
device_pm_unlock();
enable_nonboot_cpus();
Resume_devices:
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig
#define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */
#define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */
#define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */

#ifdef CONFIG_IRQ_PER_CPU
# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,89 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <[email protected]>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+#include "internals.h"
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+ struct irq_desc *desc;
+ int irq;
+
+ for_each_irq_desc(irq, desc) {
+ unsigned long flags;
+ bool sync = false;
+
+ spin_lock_irqsave(&desc->lock, flags);
+
+ if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
+ if (!desc->depth++) {
+ desc->status |= IRQ_DISABLED;
+ desc->chip->disable(irq);
+ sync = true;
+ }
+ desc->status |= IRQ_SUSPENDED;
+ }
+
+ spin_unlock_irqrestore(&desc->lock, flags);
+
+ if (sync)
+ synchronize_irq(irq);
+ }
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+ struct irq_desc *desc;
+ int irq;
+
+ for_each_irq_desc(irq, desc) {
+ unsigned long flags;
+
+ if (!(desc->status & IRQ_SUSPENDED))
+ continue;
+
+ spin_lock_irqsave(&desc->lock, flags);
+ desc->status &= ~IRQ_SUSPENDED;
+ __enable_irq(desc, irq);
+ spin_unlock_irqrestore(&desc->lock, flags);
+ }
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+ struct irq_desc *desc;
+ int irq;
+
+ for_each_irq_desc(irq, desc)
+ if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+ return -EBUSY;
+
+ return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
obj-$(CONFIG_PROC_FS) += proc.o
obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -215,7 +215,7 @@ void disable_irq(unsigned int irq)
}
EXPORT_SYMBOL(disable_irq);

-static void __enable_irq(struct irq_desc *desc, unsigned int irq)
+void __enable_irq(struct irq_desc *desc, unsigned int irq)
{
switch (desc->depth) {
case 0:
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
#include <linux/pm.h>
#include <linux/resume-trace.h>
#include <linux/rwsem.h>
+#include <linux/interrupt.h>

#include "../base.h"
#include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
* Execute the appropriate "noirq resume" callback for all devices marked
* as DPM_OFF_IRQ.
*
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx. Device drivers should not receive
+ * interrupts while it's being executed.
*/
static void dpm_power_up(pm_message_t state)
{
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
* device_power_up - Turn on all devices that need special attention.
* @state: PM transition of the system being carried out.
*
- * Power on system devices, then devices that required we shut them down
- * with interrupts disabled.
- *
- * Must be called with interrupts disabled.
+ * Call the "early" resume handlers and enable device drivers to receive
+ * interrupts.
*/
void device_power_up(pm_message_t state)
{
dpm_power_up(state);
+ resume_device_irqs();
}
EXPORT_SYMBOL_GPL(device_power_up);

@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
* device_power_down - Shut down special devices.
* @state: PM transition of the system being carried out.
*
- * Power down devices that require interrupts to be disabled.
- * Then power down system devices.
+ * Prevent device drivers from receiving interrupts and call the "late"
+ * suspend handlers.
*
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.
*/
int device_power_down(pm_message_t state)
{
struct device *dev;
int error = 0;

+ suspend_device_irqs();
list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
error = suspend_device_noirq(dev, state);
if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
dev->power.status = DPM_OFF_IRQ;
}
if (error)
- dpm_power_up(resume_event(state));
+ device_power_up(resume_event(state));
return error;
}
EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
#include <linux/pm.h>
#include <linux/device.h>
#include <linux/mutex.h>
+#include <linux/interrupt.h>

#include "base.h"

@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
struct sysdev_driver *drv, *err_drv;
int ret;

+ pr_debug("Checking wake-up interrupts\n");
+
+ /* Return error code if there are any wake-up interrupts pending */
+ ret = check_wakeup_irqs();
+ if (ret)
+ return ret;
+
pr_debug("Suspending System Devices\n");

list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {
Index: linux-2.6/kernel/irq/internals.h
===================================================================
--- linux-2.6.orig/kernel/irq/internals.h
+++ linux-2.6/kernel/irq/internals.h
@@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_

extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq,
unsigned long flags);
+extern void __enable_irq(struct irq_desc *desc, unsigned int irq);

extern struct lock_class_key irq_desc_lock_class;
extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);

2009-03-05 10:36:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: [Update, rev. 5] Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)


* Rafael J. Wysocki <[email protected]> wrote:

> On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> > On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <[email protected]> wrote:
> > > On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> > >> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <[email protected]> wrote:
> > >> > On Tuesday 03 March 2009, Arve Hj?nnev?g wrote:
> > >> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <[email protected]> wrote:
> [--snip--]
> > > Can you show me a _single_ _driver_ currently in the tree doing something
> > > like you describe in suspend_late and resume_early? If you can't, then please
> > > give up.
> >
> > I don't know if any drivers call disable_irq or enable_irq in their
> > suspend hooks, but your change also allow timers, and I assume kernel
> > threads, to run during this phase.
> >
> > There are several drivers (keypad drivers in particular), in tree and
> > out of tree, that call enable_irq from timers, and disable_irq from
> > their interrupt handler. If you also apply your later change to
> > disable non boot cpus after suspend_device_irqs, then on smp systems
> > the interrupt handler may run at the same time as suspend_device_irqs.
> > If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
> > gets set. If another suspend/resume cycle happens before the timer
> > runs, you will incorrectly enable the interrupt.
>
> Well, unfortunately this is a valid point IMO. I've been thinking for quite a
> while how to fix it nicely, but I'm not sure if there is a nice fix.
>
> Below is an updated patch, hopefully everyone will be fine with it.
>
> Ingo, is making __enable_irq() an extern function acceptable?

Sure, that's fine - it's a genirq internal function still
between kernel/irq/manage.c and kernel/irq/pm.c.

Ingo

2009-03-05 23:46:34

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume



On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
>
> The following patches modifiy the way in which we handle disabling interrupts
> during suspend and enabling them during resume. They also change the ordering
> of the core suspend and hibernation code.

Side note - I've tested them on the EeePC that had trouble resuming due to
interrupt timings, and it suspends and resumes fine with these patches
(modulo some new X problems, but that's what I get for living with
Fedora-11 testing).

Of course, it also suspends and resumes without them, since the CPU "cli"
was sufficient for that machine, and it doesn't have any ACPI issues. But
it's still an ack that at least nothing breaks that I can tell.

Linus

2009-03-06 06:50:52

by Sitsofe Wheeler

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume

On Thu, Mar 05, 2009 at 03:44:22PM -0800, Linus Torvalds wrote:
>
Side note - I've tested them on the EeePC that had trouble resuming due to
> interrupt timings, and it suspends and resumes fine with these patches
> (modulo some new X problems, but that's what I get for living with
> Fedora-11 testing).

Since you have an EeePC I'm guessing your graphics card is an i9xx of
some variety. The i9xx kernel driver seems to have been recently
reworked so even those people who aren't using GEM yet are seeing issues
here and there (some with VT switching, some with suspend to ram and
others with suspend to disk). I actually don't know if the stuff poping
up is any more than usual but if so I'm hoping most new issues go away
once the dust settles...

--
Sitsofe | http://sucs.org/~sits/

2009-03-06 10:20:12

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume

On Friday 06 March 2009, Linus Torvalds wrote:
>
> On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
> >
> > The following patches modifiy the way in which we handle disabling interrupts
> > during suspend and enabling them during resume. They also change the ordering
> > of the core suspend and hibernation code.
>
> Side note - I've tested them on the EeePC that had trouble resuming due to
> interrupt timings, and it suspends and resumes fine with these patches
> (modulo some new X problems, but that's what I get for living with
> Fedora-11 testing).
>
> Of course, it also suspends and resumes without them, since the CPU "cli"
> was sufficient for that machine, and it doesn't have any ACPI issues. But
> it's still an ack that at least nothing breaks that I can tell.

OK, thanks for testing! The next-step patches (ie. PCI suspend-resume rework)
are in the works.

Rafael