2004-10-27 03:00:39

by Benjamin Herrenschmidt

Subject: Strange IO behaviour on wakeup from sleep

Hi !

Not much data at this point yet, but paulus and I noticed that current
bk (this already happened last Saturday or so) has a very strange problem
when waking up from sleep (suspend to RAM) on our laptops.

This doesn't seem to be directly related to the PM code, at least not
the arch one, as far as I know. The IDE throughput goes down to less
than 100k/sec on hdparm. We haven't yet figured out where the time is
lost. The disk seems to be properly restored to UDMA4 as usual, and that
code hasn't changed for ages, so I don't think it's a problem at that
level in IDE.

I'm not sure yet how to track that down. It could be the IO scheduler
getting messed up on wakeup for some reason. Any clue appreciated.

Ben.
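
The "less than 100k/sec on hdparm" figure above comes from a timed sequential read. As a rough illustration of that kind of measurement, here is a minimal sketch in C. The helper names (kib_per_sec, timed_read_kib_per_sec) are made up for this note and are not part of hdparm or the kernel. The sketch reads through the normal page cache, so it is only comparable between runs when the data is not already cached.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Convert a byte count and an elapsed time into KiB per second. */
static double kib_per_sec(unsigned long long bytes, double seconds)
{
	assert(seconds > 0.0);
	return (double)bytes / 1024.0 / seconds;
}

/*
 * Hypothetical helper: read up to max_bytes sequentially from path and
 * return the observed rate in KiB/s, or -1.0 on error or empty input.
 */
static double timed_read_kib_per_sec(const char *path,
				     unsigned long long max_bytes)
{
	static char buf[1 << 16];
	struct timespec t0, t1;
	unsigned long long total = 0;
	double secs;
	FILE *f = fopen(path, "rb");

	if (!f)
		return -1.0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (total < max_bytes) {
		size_t want = sizeof buf;
		size_t got;

		if (max_bytes - total < want)
			want = (size_t)(max_bytes - total);
		got = fread(buf, 1, want, f);
		if (got == 0)
			break;	/* end of file or read error */
		total += got;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	fclose(f);

	secs = (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	if (total == 0 || secs <= 0.0)
		return -1.0;
	return kib_per_sec(total, secs);
}
```

Calling timed_read_kib_per_sec() on the same large file or device node once before suspend and once after resume would give two numbers to compare. Which path to read is left to the reader.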




2004-10-27 06:27:25

by Jens Axboe

Subject: Re: Strange IO behaviour on wakeup from sleep

On Wed, Oct 27 2004, Benjamin Herrenschmidt wrote:
> Hi !
>
> Not much data at this point yet, but paulus and I noticed that current
> bk (this already happened last Saturday or so) has a very strange problem
> when waking up from sleep (suspend to RAM) on our laptops.
>
> This doesn't seem to be directly related to the PM code, at least not
> the arch one, as far as I know. The IDE throughput goes down to less
> than 100k/sec on hdparm. We haven't yet figured out where the time is
> lost. The disk seems to be properly restored to UDMA4 as usual, and that
> code hasn't changed for ages, so I don't think it's a problem at that
> level in IDE.
>
> I'm not sure yet how to track that down. It could be the IO scheduler
> getting messed up on wakeup for some reason. Any clue appreciated.

Just saw the same thing here yesterday. It's not IO scheduler related
(it happened even with noop, if you switch to it), but apart from that I
have no clues so far either.

--
Jens Axboe

2004-10-27 11:22:53

by Tim Schmielau

Subject: Re: Strange IO behaviour on wakeup from sleep

On Wed, 27 Oct 2004, Benjamin Herrenschmidt wrote:

> Not much data at this point yet, but paulus and I noticed that current
> bk (this already happened last Saturday or so) has a very strange problem
> when waking up from sleep (suspend to RAM) on our laptops.

It's a shot in the dark, but I am concerned about whether timers continue
to work correctly after suspend with the following patch from Linus' bk
tree. I think jiffies must not be changed behind the back of the timer
subsystem, though it may work if we can guarantee there are no timers
scheduled.

It might be worth backing out and retesting.

Tim


[PATCH] swsusp: fix process start times after resume

http://linus.bkbits.net:8080/linux-2.5/cset@4174ae167_Yica8ChkiLcj_rmOcG1Q?nav=index.html|ChangeSet@-2w

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/10/18 23:03:02-07:00 [email protected]
# [PATCH] swsusp: fix process start times after resume
#
# Currently, process start times change after swsusp (because they are
# derived from jiffies and current time, oops). This should fix it.
#
# Signed-off-by: Andrew Morton <[email protected]>
# Signed-off-by: Linus Torvalds <[email protected]>
#
# arch/i386/kernel/time.c
# 2004/10/18 22:26:45-07:00 [email protected] +5 -1
# swsusp: fix process start times after resume
#
diff -Nru a/arch/i386/kernel/time.c b/arch/i386/kernel/time.c
--- a/arch/i386/kernel/time.c 2004-10-27 03:58:08 -07:00
+++ b/arch/i386/kernel/time.c 2004-10-27 03:58:08 -07:00
@@ -319,7 +319,7 @@
 	return retval;
 }

-static long clock_cmos_diff;
+static long clock_cmos_diff, sleep_start;

 static int time_suspend(struct sys_device *dev, u32 state)
 {
@@ -328,6 +328,7 @@
 	 */
 	clock_cmos_diff = -get_cmos_time();
 	clock_cmos_diff += get_seconds();
+	sleep_start = get_cmos_time();
 	return 0;
 }

@@ -335,10 +336,13 @@
 {
 	unsigned long flags;
 	unsigned long sec = get_cmos_time() + clock_cmos_diff;
+	unsigned long sleep_length = get_cmos_time() - sleep_start;
+
 	write_seqlock_irqsave(&xtime_lock, flags);
 	xtime.tv_sec = sec;
 	xtime.tv_nsec = 0;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
+	jiffies += sleep_length * HZ;
 	return 0;
 }

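To make the concern about the patch above concrete, here is a small userspace model in C of the arithmetic behind "jiffies += sleep_length * HZ". It is only a sketch. MODEL_HZ and the model_* function names are invented for this note, the tick rate of 1000 is an assumption, and nothing here models the kernel's actual timer data structures. It only shows that a deadline computed from the old jiffies value looks long overdue once the counter is stepped forward.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_HZ 1000UL	/* assumed tick rate, for this model only */

/* Wrap-safe "a is after b" test, the same idea as the kernel's time_after(). */
static int model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Ticks added on resume, mirroring sleep_length * HZ in the patch. */
static unsigned long model_resume_jump(unsigned long sleep_seconds)
{
	assert(sleep_seconds <= ULONG_MAX / MODEL_HZ);	/* no overflow */
	return sleep_seconds * MODEL_HZ;
}

/* How far past its deadline a timer appears to be; 0 if not yet due. */
static unsigned long model_ticks_overdue(unsigned long now,
					 unsigned long expires)
{
	return model_time_after(now, expires) ? now - expires : 0;
}
```

For example, a timer armed 5 seconds ahead at tick 100000 expires at tick 105000. After a modelled 60 second sleep the counter reads 160000, so the timer appears 55000 ticks late even though no tick interrupts ran in between.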
2004-10-27 12:11:53

by Benjamin Herrenschmidt

Subject: Re: Strange IO behaviour on wakeup from sleep

On Wed, 2004-10-27 at 13:20 +0200, Tim Schmielau wrote:
> On Wed, 27 Oct 2004, Benjamin Herrenschmidt wrote:
>
> > Not much data at this point yet, but paulus and I noticed that current
> > bk (this already happened last Saturday or so) has a very strange problem
> > when waking up from sleep (suspend to RAM) on our laptops.
>
> It's a shot in the dark, but I am concerned about whether timers continue
> to work correctly after suspend with the following patch from Linus' bk
> tree. I think jiffies must not be changed behind the back of the timer
> subsystem, though it may work if we can guarantee there are no timers
> scheduled.
>
> It might be worth backing out and retesting.

The problem has been observed on ppc, while this patch only affects
i386...

Ben.


2004-10-27 12:14:19

by Tim Schmielau

Subject: Re: Strange IO behaviour on wakeup from sleep

On Wed, 27 Oct 2004, Benjamin Herrenschmidt wrote:

> The problem has been observed on ppc, while this patch only affects
> i386...

Oops, sorry for the noise.

Still need to check whether this patch is a problem or not.

Tim

2004-10-27 13:17:49

by Benjamin Herrenschmidt

Subject: Re: Strange IO behaviour on wakeup from sleep

On Wed, 2004-10-27 at 23:01 +1000, Nigel Cunningham wrote:
> Hi.
>
> On Wed, 2004-10-27 at 22:06, Benjamin Herrenschmidt wrote:
> > The problem has been observed on ppc, while this patch only affects
> > i386...
>
> Another shot in the dark....
>
> Nothing interesting about /proc/interrupts?

Nope, looked already, interrupts seem to flow normally... the box works,
there are no errors or lost interrupts, it's just that disk IOs are
_extremely_ slow...

Ben.
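
The /proc/interrupts check discussed above amounts to taking two snapshots and comparing the counts for the IDE interrupt line. Here is a minimal sketch in C of the parsing step. The function names parse_irq_line and irq_delta are hypothetical, and the line layout (label, colon, one count per CPU, then descriptive text) is an assumption about the file's format rather than something stated in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one /proc/interrupts style line such as
 *   " 14:      12345     XT-PIC  ide0"
 * Copies the label before the colon into label and sums the numeric
 * columns after it into *total. Returns 0 on success, -1 if the line
 * has no usable "label:" prefix (for example the CPU header line).
 */
static int parse_irq_line(const char *line, char *label, size_t label_sz,
			  unsigned long long *total)
{
	const char *p = line;
	const char *colon;
	size_t n;

	assert(line && label && total);
	while (*p == ' ' || *p == '\t')
		p++;
	colon = strchr(p, ':');
	if (!colon || colon == p)
		return -1;
	n = (size_t)(colon - p);
	if (n >= label_sz)
		return -1;
	memcpy(label, p, n);
	label[n] = '\0';

	p = colon + 1;
	*total = 0;
	for (;;) {
		char *end;
		unsigned long long v;

		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;	/* reached the descriptive text */
		v = strtoull(p, &end, 10);
		/* A count must be followed by whitespace or end of line. */
		if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
			break;
		*total += v;
		p = end;
	}
	return 0;
}

/* Interrupts delivered between two snapshots; 0 if the counter went back. */
static unsigned long long irq_delta(unsigned long long before,
				    unsigned long long after)
{
	return after >= before ? after - before : 0;
}
```

Feeding the same IRQ line from a snapshot taken before a timed read and one taken after it into irq_delta() gives the number of interrupts delivered during the read, which can then be compared between a fresh boot and a resumed system.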


2004-10-27 13:17:50

by Nigel Cunningham

Subject: Re: Strange IO behaviour on wakeup from sleep

Hi.

On Wed, 2004-10-27 at 22:06, Benjamin Herrenschmidt wrote:
> The problem has been observed on ppc, while this patch only affects
> i386...

Another shot in the dark....

Nothing interesting about /proc/interrupts?

Regards,

Nigel
--
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

Everyone lives by faith. Some people just don't believe it.
Want proof? Try to prove that the theory of evolution is true.

2004-10-27 13:31:39

by Nigel Cunningham

Subject: Re: Strange IO behaviour on wakeup from sleep

Hi again.

On Wed, 2004-10-27 at 23:08, Benjamin Herrenschmidt wrote:
> On Wed, 2004-10-27 at 23:01 +1000, Nigel Cunningham wrote:
> > Hi.
> >
> > On Wed, 2004-10-27 at 22:06, Benjamin Herrenschmidt wrote:
> > > The problem has been observed on ppc, while this patch only affects
> > > i386...
> >
> > Another shot in the dark....
> >
> > Nothing interesting about /proc/interrupts?
>
> Nope, looked already, interrupts seem to flow normally... the box works,
> there are no errors or lost interrupts, it's just that disk IOs are
> _extremely_ slow...

One more, if I may... no processes sucking CPU? (That would indicate a
thread not properly handled by the refrigerator).

Regards,

Nigel
--
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

Everyone lives by faith. Some people just don't believe it.
Want proof? Try to prove that the theory of evolution is true.

2004-10-27 14:22:30

by Zachary Amsden

Subject: Re: Strange IO behaviour on wakeup from sleep

>
>
>Hi !
>
>Not much data at this point yet, but paulus and I noticed that current
>bk (this already happened last Saturday or so) has a very strange problem
>when waking up from sleep (suspend to RAM) on our laptops.
>
>This doesn't seem to be directly related to the PM code, at least not
>the arch one, as far as I know. The IDE throughput goes down to less
>than 100k/sec on hdparm. We haven't yet figured out where the time is
>lost. The disk seems to be properly restored to UDMA4 as usual, and that
>code hasn't changed for ages, so I don't think it's a problem at that
>level in IDE.
>
>

I would tend to be very suspicious of DMA not being restored correctly
because on some systems, prior to or during suspend, DMA may be shut down
to conserve power. There are changes afloat that touch suspend/resume,
and there have been historical problems with DMA not being restored
properly after wakeup on some laptops.

Although this may be another shot in the dark, it might rule out the DMA
problem: try cat /proc/ide/yourchipset before and after suspend and
note any changes. Failing that, use hdparm to turn off DMA before
suspend and see if the performance suffers to the same degree as after
wakeup.

Zachary Amsden
[email protected]

2004-10-27 22:49:37

by Benjamin Herrenschmidt

Subject: Re: Strange IO behaviour on wakeup from sleep

On Wed, 2004-10-27 at 07:18 -0700, Zachary Amsden wrote:

> I would tend to be very suspicious of DMA not being restored correctly
> because on some systems, prior to or during suspend, DMA may be shut down
> to conserve power. There are changes afloat that touch suspend/resume,
> and there have been historical problems with DMA not being restored
> properly after wakeup on some laptops.

DMA is restored, and the resulting throughput is way slower than what PIO
would explain anyway. I get less than 100Kb/sec !

(I wrote the IDE suspend/resume code and the driver for this chipset, so
I'm fairly sure that side is OK. It hasn't changed for a while, but I'll
double check in case Bart's latest updates broke something).

> Although this may be another shot in the dark, it might rule out the DMA
> problem: try cat /proc/ide/yourchipset before and after suspend and
> note any changes. Failing that, use hdparm to turn off DMA before
> suspend and see if the performance suffers to the same degree as after
> wakeup.

Tried all of that.

Ben.