2002-03-04 23:13:14

by Pavel Machek

[permalink] [raw]
Subject: swsusp is at it... again

Hi!

After about 20 resume cycles (compiled kernel with swsusp making
machine suspend/resume) I got that nasty FS corruption, again.

So...

1) Maybe your ext3 patches are not at fault.

2) Be carefull using swsusp patch. Real carefull.

3) Don't trust fsck. At this kind of corruption, e2fsck 1.19 will
report "clean" but will not repair it, putting your fs into
self-destruct mode. Bad bad. Its fixed on new versions. Always run
fsck twice, second time with -f.
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa


2002-03-05 16:23:30

by fchabaud

[permalink] [raw]
Subject: Re: swsusp is at it... again

Le 5 Mar, Pavel Machek a ?crit :
> Hi!
>
> After about 20 resume cycles (compiled kernel with swsusp making
> machine suspend/resume) I got that nasty FS corruption, again.
>
> So...
>
> 1) Maybe your ext3 patches are not at fault.

I suspect all this come from suspension failure and immediate resume. I
have reenabled your panic ! I believe that if a task isn't stopped and
suspension is aborted (calling thaw_process and so on) something is
altered. Maybe resuming assumes implicitely a state that is not
completely reached when a task cannot be stopped.

I also made a modification in stopping task to stop normal task and then
kernel threads (I had to add a new PF_KERNTHREAD flag). Perhaps the bug
has to do with the *order* of stopping processes (I think of that
because kernel messages are written to log files: what happens if
kjournald thread is stopped and a task still writes ?)

> 2) Be carefull using swsusp patch. Real carefull.
>
> 3) Don't trust fsck. At this kind of corruption, e2fsck 1.19 will
> report "clean" but will not repair it, putting your fs into
> self-destruct mode. Bad bad. Its fixed on new versions. Always run
> fsck twice, second time with -f.

tune2fs -e panic
is also a good precaution at least for ext3 filesystems because all my
root inode crashes were preceded by ext3-error messages and these
messages were sometimes several hours before effective crash.


--
Florent Chabaud
http://www.ssi.gouv.fr | http://fchabaud.free.fr

2002-03-06 13:44:51

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp is at it... again

Hi!

> > After about 20 resume cycles (compiled kernel with swsusp making
> > machine suspend/resume) I got that nasty FS corruption, again.
> >
> > So...
> >
> > 1) Maybe your ext3 patches are not at fault.
>
> I suspect all this come from suspension failure and immediate resume. I
> have reenabled your panic ! I believe that if a task isn't stopped

Okay, I think I can try that. [Do you think you can send me your diffs
to ext3?]

> I also made a modification in stopping task to stop normal task and then
> kernel threads (I had to add a new PF_KERNTHREAD flag). Perhaps the bug
> has to do with the *order* of stopping processes (I think of that
> because kernel messages are written to log files: what happens if
> kjournald thread is stopped and a task still writes ?)

Nothing that bad should happen... kjournald is only _delayed_ right?
And it could be delayed by scheduling as well.

> > 2) Be carefull using swsusp patch. Real carefull.
> >
> > 3) Don't trust fsck. At this kind of corruption, e2fsck 1.19 will
> > report "clean" but will not repair it, putting your fs into
> > self-destruct mode. Bad bad. Its fixed on new versions. Always run
> > fsck twice, second time with -f.
>
> tune2fs -e panic
> is also a good precaution at least for ext3 filesystems because all my
> root inode crashes were preceded by ext3-error messages and these
> messages were sometimes several hours before effective crash.

Yes.
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-03-07 10:01:11

by fchabaud

[permalink] [raw]
Subject: Re: swsusp is at it... again

Le 5 Mar, Pavel Machek a ?crit :
> Hi!
>
>> > After about 20 resume cycles (compiled kernel with swsusp making
>> > machine suspend/resume) I got that nasty FS corruption, again.
>> >
>> > So...
>> >
>> > 1) Maybe your ext3 patches are not at fault.
>>
>> I suspect all this come from suspension failure and immediate resume. I
>> have reenabled your panic ! I believe that if a task isn't stopped
>
> Okay, I think I can try that. [Do you think you can send me your diffs
> to ext3?]

I attach my diff with your patch that I applied on 2.4.18.

>
>> I also made a modification in stopping task to stop normal task and then
>> kernel threads (I had to add a new PF_KERNTHREAD flag). Perhaps the bug
>> has to do with the *order* of stopping processes (I think of that
>> because kernel messages are written to log files: what happens if
>> kjournald thread is stopped and a task still writes ?)
>
> Nothing that bad should happen... kjournald is only _delayed_ right?
> And it could be delayed by scheduling as well.

Actually, I'm not sure that so simple. I have passed hours trying to
figure out exactly what's happening but I'm not confident with that
assumption. All transactions in journal have an expiration time based on
jiffies and I'm not sure jiffies are correctly resumed, are they ? Maybe
this expiration of transaction is handled in a way that is not
inocuitous in our context.

>
>> > 2) Be carefull using swsusp patch. Real carefull.
>> >
>> > 3) Don't trust fsck. At this kind of corruption, e2fsck 1.19 will
>> > report "clean" but will not repair it, putting your fs into
>> > self-destruct mode. Bad bad. Its fixed on new versions. Always run
>> > fsck twice, second time with -f.
>>
>> tune2fs -e panic
>> is also a good precaution at least for ext3 filesystems because all my
>> root inode crashes were preceded by ext3-error messages and these
>> messages were sometimes several hours before effective crash.
>
> Yes.

Unfortunately not sufficient at least in my last crash. The good point
is that this time, I had to reinstall, so the filesystem should now be
clean: no trace of ancient crashes should remain ;-)

Last but not least I won't be able to work on swsusp for a while due to
other priorities :-(


--
Florent Chabaud
http://www.ssi.gouv.fr | http://fchabaud.free.fr


Attachments:
pavel (23.55 kB)

2002-03-07 10:19:55

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp is at it... again and again

Hi!

> > After about 20 resume cycles (compiled kernel with swsusp making
> > machine suspend/resume) I got that nasty FS corruption, again.
> >
> > So...
> >
> > 1) Maybe your ext3 patches are not at fault.
>
> I suspect all this come from suspension failure and immediate resume. I
> have reenabled your panic ! I believe that if a task isn't stopped and
> suspension is aborted (calling thaw_process and so on) something is
> altered. Maybe resuming assumes implicitely a state that is not
> completely reached when a task cannot be stopped.

I don't think that's it. But I have another suspect:

mark_swapfiles in do_magic_resume_2. Oh, and you should also kill it
from do_magic_suspend_*. Its writing on filesystem during resume, and
it does not seem too safe.

However my test machine is hosed so badly I can not repair it with
fsck. [Another day, another bug in fsck ;-)], so it would be great if
you could test this....
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-03-07 21:04:02

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp is at it... again and again

Hi!

> >> > After about 20 resume cycles (compiled kernel with swsusp making
> >> > machine suspend/resume) I got that nasty FS corruption, again.
> >> >
> >> > So...
> >> >
> >> > 1) Maybe your ext3 patches are not at fault.
> >>
> >> I suspect all this come from suspension failure and immediate resume. I
> >> have reenabled your panic ! I believe that if a task isn't stopped and
> >> suspension is aborted (calling thaw_process and so on) something is
> >> altered. Maybe resuming assumes implicitely a state that is not
> >> completely reached when a task cannot be stopped.
> >
> > I don't think that's it. But I have another suspect:
> >
> > mark_swapfiles in do_magic_resume_2. Oh, and you should also kill it
> > from do_magic_suspend_*. Its writing on filesystem during resume, and
> > it does not seem too safe.
>
> Sh... That's a mail I sent you and that was lost: all the oops I have
> observed occur with calltrace
> c0123398 t do_magic_resume_2
> c0123694 t do_magic
> c012389c t do_software_suspend
> c011a8bc T __run_task_queue
> with EIP
> c0122754 t mark_swapfiles
>
> Do you mean we should simply not call mark_swapfiles any more ?

I gave it second thought...

We *must* write to swapfiles during suspend. Doing it during resume
should be very similar to writing during suspend....

If you simply comment it out, it will refuse to suspend second
time. If you comment out even "SWAP-FILE" check, it should work... I'd
be very interested if it helps, but it would also confuse me a lot.
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

2002-03-08 11:21:49

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp is at it... again

Hi!

> >> I also made a modification in stopping task to stop normal task and then
> >> kernel threads (I had to add a new PF_KERNTHREAD flag). Perhaps the bug
> >> has to do with the *order* of stopping processes (I think of that
> >> because kernel messages are written to log files: what happens if
> >> kjournald thread is stopped and a task still writes ?)
> >
> > Nothing that bad should happen... kjournald is only _delayed_ right?
> > And it could be delayed by scheduling as well.
>
> Actually, I'm not sure that so simple. I have passed hours trying to
> figure out exactly what's happening but I'm not confident with that
> assumption. All transactions in journal have an expiration time based on
> jiffies and I'm not sure jiffies are correctly resumed, are they ? Maybe
> this expiration of transaction is handled in a way that is not
> inocuitous in our context.

Unless we made jiffies go back, we did nothing that could not happen
before, right? Your machine could for some reason freeze for half an
hour, then resume. That should be equivalent to swsusp.

> Index: linux/arch/i386/kernel/apm.c

apm.c parts not applied, I'm trying to keep it small.

> diff -u linux/drivers/char/agp/agpgart_be.c:1.1.1.2 linux/drivers/char/agp/agpgart_be.c:1.1.1.1.4.1
> --- linux/drivers/char/agp/agpgart_be.c:1.1.1.2 Mon Mar 4 20:17:17 2002
> +++ linux/drivers/char/agp/agpgart_be.c Thu Mar 7 10:10:42

This is needed even without swsusp, no?

> +++ linux/fs/buffer.c Thu Mar 7 10:11:23 2002

Applied.

> +++ linux/fs/jbd/journal.c Thu Mar 7 10:11:28 2002


Applied.

> - jbd_debug(2, "transaction too old, requesting commit for "
> - "handle %p\n", handle);
> + jbd_debug(2, "transaction %d too old, requesting commit for "
> + "handle %p\n", tid, handle);

I have enough of my own debugging hooks ;-)).

> +++ linux/fs/reiserfs/journal.c Thu Mar 7 10:11:30 2002

Applied.

> +++ linux/include/asm-i386/suspend.h Thu Mar 7 10:11:34 2002
> @@ -260,9 +260,9 @@
> if (!resume) {
> do_magic_suspend_1();
> save_processor_context(); /* We need to capture registers and memory at "same time" */
> - do_magic_suspend_2();
> - restore_processor_context();
> - return;
> + do_magic_suspend_2(); /* We should never return from here except when suspension fails */
> + /*restore_processor_context();
> + return;*/

I kept comment, but I'm not sure I want to go through normal resume at
this point.

> +++ linux/include/linux/sched.h Thu Mar 7 10:11:43 2002

Applied.

> +++ linux/include/linux/suspend.h Thu Mar 7 10:11:44 2002

Applied.

> diff -u linux/kernel/suspend.c:1.1.1.1 linux/kernel/suspend.c:1.1.2.1.2.2
> --- linux/kernel/suspend.c:1.1.1.1 Thu Mar 7 10:32:18 2002
> +++ linux/kernel/suspend.c Thu Mar 7 10:11:46 2002
> @@ -78,7 +78,10 @@
> #include <linux/major.h>
> #include <linux/blk.h>
> #include <linux/swap.h>
> -
> +#include <linux/pm.h>
> +#ifdef CONFIG_APM
> +# include <linux/apm_bios.h>
> +#endif
> #include <asm/uaccess.h>
> #include <asm/mmu_context.h>
> #include <asm/pgtable.h>

Is CONFIG_APM really neccessary?

> @@ -95,7 +98,8 @@
> #undef SUSPEND_CONSOLE
> #endif
>
> -#define TIMEOUT (6 * HZ) /* Timeout for stopping processes */
> +#define TIMEOUT (12 * HZ) /* Timeout for stopping processes (flushing ext3 journal
> + may take a while */
> #define ADDRESS(x) ((unsigned long) phys_to_virt(((x) << PAGE_SHIFT)))
>
> extern void wakeup_bdflush(void);

But you are not flushing journal just now, are you?

> @@ -197,16 +204,28 @@
> */
>
> #define INTERESTING(p) \
> - /* We don't want to touch kernel_threads..*/ \
> if (p->flags & PF_IOTHREAD) \
> continue; \
> + /* We don't want to touch kernel_threads on first pass...*/ \
> + if (p->flags & PF_KERNTHREAD) \
> + continue; \
> if (p == current) \
> continue; \
> if (p->state == TASK_ZOMBIE) \
> continue;
> +#define INTERESTING_FORCE(p) \
> + if (p->flags & PF_IOTHREAD) \
> + continue; \
> + /* We want to touch *only* kernel_threads on second pass...*/ \
> + if (!(p->flags & PF_KERNTHREAD)) \
> + continue; \
> + if (p == current) \
> + continue; \
> + if (p->state == TASK_ZOMBIE) \
> + continue;
>

I hope this 2-phase suspend is not really neccessary. I tried not to
apply it.

> + if(flag)
> + flush_signals(current); /* We have signaled a kernel thread, which isn't normal behaviour
> + and that may lead to 100%CPU sucking because those threads
> + just don't manage signals. */

But this looks critical.

> +/* Make disk drivers accept operations, again */
> +static void drivers_unsuspend(void)
> +{
> +#ifdef CONFIG_BLK_DEV_IDE
> + ide_disk_unsuspend();
> +#endif
> +}
> +
> +/* Called from process context */
> +static int drivers_suspend(void)
> +{
> +#ifdef CONFIG_BLK_DEV_IDE
> + ide_disk_suspend();
> +#else
> +#error Are you sure your disk driver supports suspend?
> +#endif
> + if(!pm_suspend_state) {
> + if(pm_send_all(PM_SUSPEND,(void *)3)) {
> + printk(KERN_WARNING "Problem while sending suspend event\n");
> + return(1);
> + }
> + pm_suspend_state=1;
> + } else
> + printk(KERN_WARNING "PM suspend state already raised\n");
> +
> + return(0);
> +}
> +
> +/* Called from interrupts disabled */
> +#define RESUME_PHASE1 1
> +#define RESUME_PHASE2 2
> +#define RESUME_ALL_PHASES (RESUME_PHASE1 | RESUME_PHASE2)

This looks better than what I have... So PHASE1 is interrupts off,
PHASE2 is interrupts on, ok?

> -#ifdef CONFIG_BLK_DEV_IDE
> - ide_disk_unsuspend();
> -#endif
> + drivers_unsuspend();

This was already there.

> +++ linux/mm/page_alloc.c Thu Mar 7 10:11:46 2002

I do not like this. loopcount seems very arbitrary, and what I have
here seems to work with 2.4.18...

> +++ linux/mm/vmscan.c Thu Mar 7 10:11:46 2002

Applied.

Oh, and you probably want to create little linux/CREDITS diff
;-))).
Pavel
--
(about SSSCA) "I don't say this lightly. However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa