2008-02-21 01:00:59

by Julian Blake Kongslie

Subject: Hang on suspend

I'm getting a hang at suspend on 2.6.25-rc2. Booting with
no_console_suspend let me catch the error, which I photographed and
transcribed (hopefully without error) below.

This is a Lenovo Thinkpad T43p.

I've attached my config and a partial dmesg (this machine generates a
lot of messages at boot, and overfills the buffer before klogd gets to
it).

BUG: unable to handle kernel NULL pointer dereference at 00000090
IP: [<00000090>]
*pde = 00000000
Oops: 0000 [#1] PREEMPT
Modules linked in: (really big list that I won't copy unless asked)

Pid: 3877, comm: s2ram Not tainted (2.6.25-rc2-dukephillips #1)
EIP: 0060:[<00000090>] EFLAGS: 00210086 CPU: 0
EIP is at 0x90
EAX: f7ba9800 EBX: f7ba9808 ECX: 00000090 EDX: 00000002
ESI: 00000000 EDI: f7ba98d8 EBP: f5d91ee0 ESP: f5d91edc
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process s2ram (pid: 3877, ti=f5d90000 task=f5d92000 task.ti=f5d90000)
Stack: c0253ca9 f5d91ef8 c025581a 00000002 00000000 00000003 f5eb3000 f5d91f08
c013e14d fffffff5 00000003 f5d91f18 c013e2aa 00000003 00000003 f5d91f30
c013e38e c02d8b64 c013e2f8 f7828500 00000003 f5d91f44 c01ffa59 00000003
Call Trace:
[<c0253ca9>] ? platform_suspend_late+0x19/0x1f
[<c025581a>] ? device_power_down+0x58/0x107
[<c013e14d>] ? suspend_devices_and_enter+0x80/0xe0
[<c013e2aa>] ? enter_state+0xb2/0x100
[<c013e38e>] ? state_store+0x96/0xac
[<c013e2f8>] ? state_store+0x0/0xac
[<c01ffa59>] ? kobj_attr_store+0x1a/0x22
[<c019e56b>] ? sysfs_write_file+0xb3/0xde
[<c019e4b8>] ? sysfs_write_file+0x0/0xde
[<c016c57c>] ? vfs_write+0x8c/0x131
[<c016caa5>] ? sys_write+0x3b/0x60
[<c01040ea>] ? sysenter_past_esp+0x5f/0x85
========================
Code: Bad EIP value.
EIP: [<00000090>] 0x90 SS:ESP 0068:f5d91edc
---[ end trace 28f0dcbff44f7e50 ]---

Any help is appreciated.

Thanks,

--
-Julian Blake Kongslie
<[email protected]>

If this is a mailing list, please CC me on replies.
vim: set ft=text :


Attachments:
dmesg (15.38 kB)
.config (45.02 kB)
signature.asc (287.00 B)
This is a digitally signed message part

2008-02-23 20:45:20

by Rafael J. Wysocki

Subject: Re: Hang on suspend

On Thursday, 21 of February 2008, Julian Blake Kongslie wrote:
> I'm getting a hang at suspend on 2.6.25-rc2. Booting with
> no_console_suspend let me catch the error, which I photographed and
> transcribed (hopefully without error) below.
>
> This is a Lenovo Thinkpad T43p.
>
> I've attached my config and a partial dmesg (this machine generates a
> lot of messages at boot, and overfills the buffer before klogd gets to
> it).

Can you please apply the appended patch and retest?

Thanks,
Rafael

---
drivers/base/power/main.c | 84 ++--------------------------------------------
1 file changed, 4 insertions(+), 80 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -48,7 +48,6 @@
*/

LIST_HEAD(dpm_active);
-static LIST_HEAD(dpm_locked);
static LIST_HEAD(dpm_off);
static LIST_HEAD(dpm_off_irq);
static LIST_HEAD(dpm_destroy);
@@ -81,28 +80,6 @@ void device_pm_add(struct device *dev)
*/
void device_pm_remove(struct device *dev)
{
- /*
- * If this function is called during a suspend, it will be blocked,
- * because we're holding the device's semaphore at that time, which may
- * lead to a deadlock. In that case we want to print a warning.
- * However, it may also be called by unregister_dropped_devices() with
- * the device's semaphore released, in which case the warning should
- * not be printed.
- */
- if (down_trylock(&dev->sem)) {
- if (down_read_trylock(&pm_sleep_rwsem)) {
- /* No suspend in progress, wait on dev->sem */
- down(&dev->sem);
- up_read(&pm_sleep_rwsem);
- } else {
- /* Suspend in progress, we may deadlock */
- dev_warn(dev, "Suspicious %s during suspend\n",
- __FUNCTION__);
- dump_stack();
- /* The user has been warned ... */
- down(&dev->sem);
- }
- }
pr_debug("PM: Removing info for %s:%s\n",
dev->bus ? dev->bus->name : "No Bus",
kobject_name(&dev->kobj));
@@ -110,7 +87,6 @@ void device_pm_remove(struct device *dev
dpm_sysfs_remove(dev);
list_del_init(&dev->power.entry);
mutex_unlock(&dpm_list_mtx);
- up(&dev->sem);
}

/**
@@ -266,7 +242,7 @@ static void dpm_resume(void)
struct list_head *entry = dpm_off.next;
struct device *dev = to_device(entry);

- list_move_tail(entry, &dpm_locked);
+ list_move_tail(entry, &dpm_active);
mutex_unlock(&dpm_list_mtx);
resume_device(dev);
mutex_lock(&dpm_list_mtx);
@@ -275,25 +251,6 @@ static void dpm_resume(void)
}

/**
- * unlock_all_devices - Release each device's semaphore
- *
- * Go through the dpm_off list. Put each device on the dpm_active
- * list and unlock it.
- */
-static void unlock_all_devices(void)
-{
- mutex_lock(&dpm_list_mtx);
- while (!list_empty(&dpm_locked)) {
- struct list_head *entry = dpm_locked.prev;
- struct device *dev = to_device(entry);
-
- list_move(entry, &dpm_active);
- up(&dev->sem);
- }
- mutex_unlock(&dpm_list_mtx);
-}
-
-/**
* unregister_dropped_devices - Unregister devices scheduled for removal
*
* Unregister all devices on the dpm_destroy list.
@@ -305,7 +262,6 @@ static void unregister_dropped_devices(v
struct list_head *entry = dpm_destroy.next;
struct device *dev = to_device(entry);

- up(&dev->sem);
mutex_unlock(&dpm_list_mtx);
/* This also removes the device from the list */
device_unregister(dev);
@@ -324,7 +280,6 @@ void device_resume(void)
{
might_sleep();
dpm_resume();
- unlock_all_devices();
unregister_dropped_devices();
up_write(&pm_sleep_rwsem);
}
@@ -461,8 +416,8 @@ static int dpm_suspend(pm_message_t stat
int error = 0;

mutex_lock(&dpm_list_mtx);
- while (!list_empty(&dpm_locked)) {
- struct list_head *entry = dpm_locked.prev;
+ while (!list_empty(&dpm_active)) {
+ struct list_head *entry = dpm_active.prev;
struct device *dev = to_device(entry);

list_del_init(&dev->power.entry);
@@ -478,7 +433,7 @@ static int dpm_suspend(pm_message_t stat
""));
mutex_lock(&dpm_list_mtx);
if (list_empty(&dev->power.entry))
- list_add(&dev->power.entry, &dpm_locked);
+ list_add_tail(&dev->power.entry, &dpm_active);
break;
}
mutex_lock(&dpm_list_mtx);
@@ -491,36 +446,6 @@ static int dpm_suspend(pm_message_t stat
}

/**
- * lock_all_devices - Acquire every device's semaphore
- *
- * Go through the dpm_active list. Carefully lock each device's
- * semaphore and put it in on the dpm_locked list.
- */
-static void lock_all_devices(void)
-{
- mutex_lock(&dpm_list_mtx);
- while (!list_empty(&dpm_active)) {
- struct list_head *entry = dpm_active.next;
- struct device *dev = to_device(entry);
-
- /* Required locking order is dev->sem first,
- * then dpm_list_mutex. Hence this awkward code.
- */
- get_device(dev);
- mutex_unlock(&dpm_list_mtx);
- down(&dev->sem);
- mutex_lock(&dpm_list_mtx);
-
- if (list_empty(entry))
- up(&dev->sem); /* Device was removed */
- else
- list_move_tail(entry, &dpm_locked);
- put_device(dev);
- }
- mutex_unlock(&dpm_list_mtx);
-}
-
-/**
* device_suspend - Save state and stop all devices in system.
* @state: new power management state
*
@@ -533,7 +458,6 @@ int device_suspend(pm_message_t state)

might_sleep();
down_write(&pm_sleep_rwsem);
- lock_all_devices();
error = dpm_suspend(state);
if (error)
device_resume();

2008-02-24 04:31:08

by Julian Blake Kongslie

Subject: Re: Hang on suspend

On Sat, 2008-02-23 at 21:43 +0100, Rafael J. Wysocki wrote:
> Can you please apply the appended patch and retest?

Didn't apply cleanly to v2.6.25-rc2; I had to mangle one or two lines.
The patch I applied follows at the end of this message.

Unfortunately, it's about the same as before. I got:

Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
ACPI: Preparing to enter system sleep state S3
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
nsc-ircc 00:0b: disabled
parport_pc 00:0a: disabled
eth1: Going into suspend...
ACPI: PCI interrupt for device 0000:0b:02.0 disabled
ACPI: PCI interrupt for device 0000:00:1e.3 disabled
ACPI: PCI interrupt for device 0000:00:1e.2 disabled
ACPI: PCI interrupt for device 0000:00:1d.7 disabled
ACPI: PCI interrupt for device 0000:00:1d.3 disabled
ACPI: PCI interrupt for device 0000:00:1d.2 disabled
ACPI: PCI interrupt for device 0000:00:1d.1 disabled
ACPI: PCI interrupt for device 0000:00:1d.0 disabled
BUG: unable to handle kernel NULL pointer dereference at 00000090

And the oops and stacktrace that follows appears essentially identical
to the one I already transcribed. If you need it copied, I've got
photographs, but I'd like to save myself the typing...

Those PCI devices, in case it matters, are:
0b:02.0 Wireless card (Intel 2200BG)
00:1e.3 Modem (ICH6 AC'97)
00:1e.2 Audio (ICH6 AC'97)
00:1e.0 82801 Mobile PCI Bridge
00:1d.7 USB2 EHCI
00:1d.3 USB UHCI
00:1d.2 USB UHCI
00:1d.1 USB UHCI
00:1d.0 USB UHCI

The other devices not mentioned are:
0b:00.0 CardBus Bridge: Ricoh Co Ltd RL5c476 II (rev 8d)
02:00.0 Ethernet (Broadcom BCM5751M)
01:00.0 ATI Mobility FireGL V3200
00:1f.3 SMBus (ICH6)
00:1f.2 ICH6M SATA
00:1f.0 ICH6M LPC Interface Bridge
00:1c.2 ICH6 PCI Express Port 3
00:1c.0 ICH6 PCI Express Port 1
00:01.0 915GM/PM PCI Express Root Port
00:00.0 Processor to DRAM Controller

> Thanks,
> Rafael

Thanks for the help!

--
-Julian Blake Kongslie
<[email protected]>

If this is a mailing list, please CC me on replies.
vim: set ft=text :

Here's the patch against v2.6.25-rc2 I used:

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index bdc03f7..e3095c7 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -48,7 +48,6 @@
*/

LIST_HEAD(dpm_active);
-static LIST_HEAD(dpm_locked);
static LIST_HEAD(dpm_off);
static LIST_HEAD(dpm_off_irq);
static LIST_HEAD(dpm_destroy);
@@ -81,28 +80,6 @@ void device_pm_add(struct device *dev)
*/
void device_pm_remove(struct device *dev)
{
- /*
- * If this function is called during a suspend, it will be blocked,
- * because we're holding the device's semaphore at that time, which may
- * lead to a deadlock. In that case we want to print a warning.
- * However, it may also be called by unregister_dropped_devices() with
- * the device's semaphore released, in which case the warning should
- * not be printed.
- */
- if (down_trylock(&dev->sem)) {
- if (down_read_trylock(&pm_sleep_rwsem)) {
- /* No suspend in progress, wait on dev->sem */
- down(&dev->sem);
- up_read(&pm_sleep_rwsem);
- } else {
- /* Suspend in progress, we may deadlock */
- dev_warn(dev, "Suspicious %s during suspend\n",
- __FUNCTION__);
- dump_stack();
- /* The user has been warned ... */
- down(&dev->sem);
- }
- }
pr_debug("PM: Removing info for %s:%s\n",
dev->bus ? dev->bus->name : "No Bus",
kobject_name(&dev->kobj));
@@ -110,7 +87,6 @@ void device_pm_remove(struct device *dev)
dpm_sysfs_remove(dev);
list_del_init(&dev->power.entry);
mutex_unlock(&dpm_list_mtx);
- up(&dev->sem);
}

/**
@@ -266,7 +242,7 @@ static void dpm_resume(void)
struct list_head *entry = dpm_off.next;
struct device *dev = to_device(entry);

- list_move_tail(entry, &dpm_locked);
+ list_move_tail(entry, &dpm_active);
mutex_unlock(&dpm_list_mtx);
resume_device(dev);
mutex_lock(&dpm_list_mtx);
@@ -275,25 +251,6 @@ static void dpm_resume(void)
}

/**
- * unlock_all_devices - Release each device's semaphore
- *
- * Go through the dpm_off list. Put each device on the dpm_active
- * list and unlock it.
- */
-static void unlock_all_devices(void)
-{
- mutex_lock(&dpm_list_mtx);
- while (!list_empty(&dpm_locked)) {
- struct list_head *entry = dpm_locked.prev;
- struct device *dev = to_device(entry);
-
- list_move(entry, &dpm_active);
- up(&dev->sem);
- }
- mutex_unlock(&dpm_list_mtx);
-}
-
-/**
* unregister_dropped_devices - Unregister devices scheduled for removal
*
* Unregister all devices on the dpm_destroy list.
@@ -305,7 +262,6 @@ static void unregister_dropped_devices(void)
struct list_head *entry = dpm_destroy.next;
struct device *dev = to_device(entry);

- up(&dev->sem);
mutex_unlock(&dpm_list_mtx);
/* This also removes the device from the list */
device_unregister(dev);
@@ -324,7 +280,6 @@ void device_resume(void)
{
might_sleep();
dpm_resume();
- unlock_all_devices();
unregister_dropped_devices();
up_write(&pm_sleep_rwsem);
}
@@ -461,8 +416,8 @@ static int dpm_suspend(pm_message_t state)
int error = 0;

mutex_lock(&dpm_list_mtx);
- while (!list_empty(&dpm_locked)) {
- struct list_head *entry = dpm_locked.prev;
+ while (!list_empty(&dpm_active)) {
+ struct list_head *entry = dpm_active.prev;
struct device *dev = to_device(entry);

list_del_init(&dev->power.entry);
@@ -478,7 +433,7 @@ static int dpm_suspend(pm_message_t state)
""));
mutex_lock(&dpm_list_mtx);
if (list_empty(&dev->power.entry))
- list_add(&dev->power.entry, &dpm_locked);
+ list_add_tail(&dev->power.entry, &dpm_active);
mutex_unlock(&dpm_list_mtx);
break;
}
@@ -492,36 +447,6 @@ static int dpm_suspend(pm_message_t state)
}

/**
- * lock_all_devices - Acquire every device's semaphore
- *
- * Go through the dpm_active list. Carefully lock each device's
- * semaphore and put it in on the dpm_locked list.
- */
-static void lock_all_devices(void)
-{
- mutex_lock(&dpm_list_mtx);
- while (!list_empty(&dpm_active)) {
- struct list_head *entry = dpm_active.next;
- struct device *dev = to_device(entry);
-
- /* Required locking order is dev->sem first,
- * then dpm_list_mutex. Hence this awkward code.
- */
- get_device(dev);
- mutex_unlock(&dpm_list_mtx);
- down(&dev->sem);
- mutex_lock(&dpm_list_mtx);
-
- if (list_empty(entry))
- up(&dev->sem); /* Device was removed */
- else
- list_move_tail(entry, &dpm_locked);
- put_device(dev);
- }
- mutex_unlock(&dpm_list_mtx);
-}
-
-/**
* device_suspend - Save state and stop all devices in system.
*
* Prevent new devices from being registered, then lock all devices
@@ -533,7 +458,6 @@ int device_suspend(pm_message_t state)

might_sleep();
down_write(&pm_sleep_rwsem);
- lock_all_devices();
error = dpm_suspend(state);
if (error)
device_resume();



2008-02-24 11:25:17

by Rafael J. Wysocki

Subject: Re: Hang on suspend

On Sunday, 24 of February 2008, Julian Blake Kongslie wrote:
> On Sat, 2008-02-23 at 21:43 +0100, Rafael J. Wysocki wrote:
> > Can you please apply the appended patch and retest?
>
> Didn't apply cleanly to v2.6.25-rc2; I had to mangle one or two lines.
> The patch I applied follows at the end of this message.
>
> Unfortunately, it's about the same as before. I got:
>
> Freezing user space processes ... (elapsed 0.00 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
> ACPI: Preparing to enter system sleep state S3
> sd 0:0:0:0: [sda] Synchronizing SCSI cache
> sd 0:0:0:0: [sda] Stopping disk
> nsc-ircc 00:0b: disabled
> parport_pc 00:0a: disabled
> eth1: Going into suspend...
> ACPI: PCI interrupt for device 0000:0b:02.0 disabled
> ACPI: PCI interrupt for device 0000:00:1e.3 disabled
> ACPI: PCI interrupt for device 0000:00:1e.2 disabled
> ACPI: PCI interrupt for device 0000:00:1d.7 disabled
> ACPI: PCI interrupt for device 0000:00:1d.3 disabled
> ACPI: PCI interrupt for device 0000:00:1d.2 disabled
> ACPI: PCI interrupt for device 0000:00:1d.1 disabled
> ACPI: PCI interrupt for device 0000:00:1d.0 disabled
> BUG: unable to handle kernel NULL pointer dereference at 00000090

Can you please check what's at the address platform_suspend_late+0x19?

You can run "gdb vmlinux" and then "l *platform_suspend_late+0x19" at the
gdb prompt for this purpose.

Thanks,
Rafael

2008-02-24 21:20:30

by Julian Blake Kongslie

Subject: Re: Hang on suspend

On Sun, 2008-02-24 at 12:23 +0100, Rafael J. Wysocki wrote:
> Can you please check what's at the address platform_suspend_late+0x19?

0xc0253ca9 is in platform_suspend_late (drivers/base/platform.c:579).
574 {
575 struct platform_driver *drv = to_platform_driver(dev->driver);
576 struct platform_device *pdev;
577 int ret = 0;
578
579 pdev = container_of(dev, struct platform_device, dev);
580 if (dev->driver && drv->suspend_late)
581 ret = drv->suspend_late(pdev, mesg);
582
583 return ret;

> Thanks,
> Rafael

Thanks,

--
-Julian Blake Kongslie
<[email protected]>

If this is a mailing list, please CC me on replies.
vim: set ft=text :



2008-02-24 22:36:27

by Rafael J. Wysocki

Subject: Re: Hang on suspend

[Please don't drop the CC list from replies.]

On Sunday, 24 of February 2008, Julian Blake Kongslie wrote:
> On Sun, 2008-02-24 at 12:23 +0100, Rafael J. Wysocki wrote:
> > Can you please check what's at the address platform_suspend_late+0x19?
>
> 0xc0253ca9 is in platform_suspend_late (drivers/base/platform.c:579).
> 574 {
> 575 struct platform_driver *drv = to_platform_driver(dev->driver);
> 576 struct platform_device *pdev;
> 577 int ret = 0;
> 578
> 579 pdev = container_of(dev, struct platform_device, dev);
> 580 if (dev->driver && drv->suspend_late)
> 581 ret = drv->suspend_late(pdev, mesg);
> 582
> 583 return ret;

That's strange, it looks like the container_of() is accessing beyond the
structure.

Is this 100% reproducible?

Rafael

2008-02-24 22:44:30

by Julian Blake Kongslie

Subject: Re: Hang on suspend

On Sun, 2008-02-24 at 23:35 +0100, Rafael J. Wysocki wrote:
> [Please don't drop the CC list from replies.]

Yeah, sorry, I accidentally whacked the wrong reply button.

> On Sunday, 24 of February 2008, Julian Blake Kongslie wrote:
> > On Sun, 2008-02-24 at 12:23 +0100, Rafael J. Wysocki wrote:
> > > Can you please check what's at the address platform_suspend_late+0x19?
> >
> > 0xc0253ca9 is in platform_suspend_late (drivers/base/platform.c:579).
> > 574 {
> > 575 struct platform_driver *drv = to_platform_driver(dev->driver);
> > 576 struct platform_device *pdev;
> > 577 int ret = 0;
> > 578
> > 579 pdev = container_of(dev, struct platform_device, dev);
> > 580 if (dev->driver && drv->suspend_late)
> > 581 ret = drv->suspend_late(pdev, mesg);
> > 582
> > 583 return ret;
>
> That's strange, it looks like the container_of() is accessing beyond the
> structure.
>
> Is this 100% reproducible?

Yes, it is. I've had this hang on suspend since at least some time in
early 2.6.24-rc, although because I have to transcribe the oops manually
I haven't had enough incentive to get it reported until now.
Unfortunately, I just don't need to suspend very often, so it may have
been introduced some time before then.

If you don't have any other ideas, I can try to bisect and find the
change that introduced it, but it'll probably take me a day or two; I
have other things I need to use my computer for, so it's a bit difficult
to do the constant-reboot-and-test cycle.

> Rafael

Thanks,

--
-Julian Blake Kongslie
<[email protected]>

If this is a mailing list, please CC me on replies.
vim: set ft=text :



2008-02-24 22:50:53

by Rafael J. Wysocki

Subject: Re: Hang on suspend

On Sunday, 24 of February 2008, Julian Blake Kongslie wrote:
> On Sun, 2008-02-24 at 23:35 +0100, Rafael J. Wysocki wrote:
> > [Please don't drop the CC list from replies.]
>
> Yeah, sorry, I accidentally whacked the wrong reply button.
>
> > On Sunday, 24 of February 2008, Julian Blake Kongslie wrote:
> > > On Sun, 2008-02-24 at 12:23 +0100, Rafael J. Wysocki wrote:
> > > > Can you please check what's at the address platform_suspend_late+0x19?
> > >
> > > 0xc0253ca9 is in platform_suspend_late (drivers/base/platform.c:579).
> > > 574 {
> > > 575 struct platform_driver *drv = to_platform_driver(dev->driver);
> > > 576 struct platform_device *pdev;
> > > 577 int ret = 0;
> > > 578
> > > 579 pdev = container_of(dev, struct platform_device, dev);
> > > 580 if (dev->driver && drv->suspend_late)
> > > 581 ret = drv->suspend_late(pdev, mesg);
> > > 582
> > > 583 return ret;
> >
> > That's strange, it looks like the container_of() is accessing beyond the
> > structure.
> >
> > Is this 100% reproducible?
>
> Yes, it is. I've had this hang on suspend since at least some time in
> early 2.6.24-rc, although because I have to transcribe the oops manually
> I haven't had enough incentive to get it reported until now.
> Unfortunately, I just don't need to suspend very often, so it may have
> been introduced some time before then.
>
> If you don't have any other ideas,

No, I don't.

> I can try to bisect and find the change that introduced it,

That would be great.

> but it'll probably take me a day or two; I have other things I need to use my
> computer for, so it's a bit difficult to do the constant-reboot-and-test
> cycle.

Not a problem at all.

Thanks,
Rafael

2008-03-13 10:12:01

by Pavel Machek

Subject: Re: Hang on suspend

On Sun 2008-02-24 14:44:09, Julian Blake Kongslie wrote:
> On Sun, 2008-02-24 at 23:35 +0100, Rafael J. Wysocki wrote:
> > [Please don't drop the CC list from replies.]
>
> Yeah, sorry, I accidentally whacked the wrong reply button.
>
> > On Sunday, 24 of February 2008, Julian Blake Kongslie wrote:
> > > On Sun, 2008-02-24 at 12:23 +0100, Rafael J. Wysocki wrote:
> > > > Can you please check what's at the address platform_suspend_late+0x19?
> > >
> > > 0xc0253ca9 is in platform_suspend_late (drivers/base/platform.c:579).
> > > 574 {
> > > 575 struct platform_driver *drv = to_platform_driver(dev->driver);
> > > 576 struct platform_device *pdev;
> > > 577 int ret = 0;
> > > 578
> > > 579 pdev = container_of(dev, struct platform_device, dev);
> > > 580 if (dev->driver && drv->suspend_late)
> > > 581 ret = drv->suspend_late(pdev, mesg);
> > > 582
> > > 583 return ret;
> >
> > That's strange, it looks like the container_of() is accessing beyond the
> > structure.
> >
> > Is this 100% reproducible?
>
> Yes, it is. I've had this hang on suspend since at least some time in
> early 2.6.24-rc, although because I have to transcribe the oops manually
> I haven't had enough incentive to get it reported until now.
> Unfortunately, I just don't need to suspend very often, so it may have
> been introduced some time before then.
>
> If you don't have any other ideas, I can try to bisect and find the
> change that introduced it, but it'll probably take me a day or two; I
> have other things I need to use my computer for, so it's a bit difficult
> to do the constant-reboot-and-test cycle.

Any news on this? Did bisect show something interesting?



--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-03-14 17:44:46

by Julian Blake Kongslie

Subject: Re: Hang on suspend

On Thu, 2008-03-13 at 09:13 +0100, Pavel Machek wrote:
> Any news on this? Did bisect show something interesting?

I'm afraid not; I think this may be hardware-related on my end, because
it seems to vary with the size of the kernel image, of all things.

--
-Julian Blake Kongslie
<[email protected]>

If this is a mailing list, please CC me on replies.
vim: set ft=text :

