2004-04-26 09:51:12

by Herbert Xu

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

tags 234976 pending
quit

On Fri, Mar 26, 2004 at 07:17:52PM +0100, Roland Stigge wrote:
>
> On Fri, 2004-03-26 at 17:17, Pavel Machek wrote:
> > > Please first read the full conversation I had with Herbert at
> > > http://bugs.debian.org/234976 . It contains documented debug sessions of
> > > this kind. Please tell me what to try.
> >
> > Intel AGP, by chance?
>
> Yes, but as I wrote, only on one of the machines in question.

OK, I've finally found out why agpgart locks up the machine upon
resuming from swsusp/pmdisk.

The reason is that the gatt table is remapped with ioremap_nocache,
which on i386 machines capable of PSE will result in 4M pages being
split.

When swsusp/pmdisk copies the pages back, the top page table is
written before the entries that it points to are filled in.
Depending on whether the second-level table lies before or after
the 4M-page in question, this will result in a page fault.

A simple solution is to copy the pages in reverse. This way the
top page table is filled in last which should resolve this particular
issue. The following patch does exactly that and fixes the problem
for me.

Cheers,
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Attachments:
(No filename) (1.33 kB)
p (1.05 kB)
Download all attachments

2004-04-26 10:52:24

by Herbert Xu

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

On Mon, Apr 26, 2004 at 07:48:34PM +1000, herbert wrote:
>
> A simple solution is to copy the pages in reverse. This way the
> top page table is filled in last which should resolve this particular
> issue. The following patch does exactly that and fixes the problem
> for me.

Of course this doesn't work for machines without PSE. But then the
original code didn't work either. Since resuming from 486's isn't
that cool anyway, IMHO someone should just add a PSE check in the
swsusp/pmdisk init code on i386.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2004-04-26 11:49:50

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi.

On Mon, 26 Apr 2004 20:40:15 +1000, Herbert Xu
<[email protected]> wrote:

> On Mon, Apr 26, 2004 at 07:48:34PM +1000, herbert wrote:
>>
>> A simple solution is to copy the pages in reverse. This way the
>> top page table is filled in last which should resolve this particular
>> issue. The following patch does exactly that and fixes the problem
>> for me.
>
> Of course this doesn't work for machines without PSE. But then the
> original code didn't work either. Since resuming from 486's isn't
> that cool anyway, IMHO someone should just add a PSE check in the
> swsusp/pmdisk init code on i386.

There used to be such a check. Centrinos, however, if I recall correctly,
don't have PSE but can suspend with our current method. Perhaps we can
come up with a more nuanced test? Better still, though, we should just get
proper AGP support for suspending and resuming in.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-26 12:13:45

by Herbert Xu

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

On Mon, Apr 26, 2004 at 09:27:13PM +1000, Nigel Cunningham wrote:
>
> There used to be such a check. Centrinos, however, if I recall correctly,
> don't have PSE but can suspend with our current method. Perhaps we can

Then it's just pure luck. Whenever you have a page whose page table
lies in a page beyond that page itself the non-PSE case will fail.

> come up with a more nuanced test? Better still, though, we should just get
> proper AGP support for suspending and resuming in.

It's got nothing to do with AGP. This is a flaw in the swsusp code.
It can be triggered by anything that plays with page attributes.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2004-04-26 12:39:42

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi.

On Mon, 26 Apr 2004 22:11:45 +1000, Herbert Xu
<[email protected]> wrote:

> On Mon, Apr 26, 2004 at 09:27:13PM +1000, Nigel Cunningham wrote:
>>
>> There used to be such a check. Centrinos, however, if I recall
>> correctly,
>> don't have PSE but can suspend with our current method. Perhaps we can
>
> Then it's just pure luck. Whenever you have a page whose page table
> lies in a page beyond that page itself the non-PSE case will fail.

I'm no expert on the hardware side of things, but from what I know, it's
really only these hardware devices that are accessing memory while we're
doing the copyback that are the problem. All processes are stopped and
we've called device_suspend(). Nothing but us should be using/modifying
the page tables.

>> come up with a more nuanced test? Better still, though, we should just
>> get
>> proper AGP support for suspending and resuming in.
>
> It's got nothing to do with AGP. This is a flaw in the swsusp code.
> It can be triggered by anything that plays with page attributes.

Not so much a flaw in the suspend code as something that needs to be dealt
with: it's not a bug for pages to have protection, and its not a bug for
us to need it temporarily removed in order to do the copyback. We just
need the support in the drivers to achieve that. When we have it (as we do
in some cases in 2.4), all is well.

Regards,

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-26 14:05:27

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!
On Mon 26-04-04 19:48:34, Herbert Xu wrote:
> > Yes, but as I wrote, only on one of the machines in question.
>
> OK, I've finally found out why agpgart locks up the machine upon
> resuming from swsusp/pmdisk.
>
> The reason is that the gatt table is remapped with ioremap_nocache,
> which on i386 machines capable of PSE will result in 4M pages being
> split.
>
> When swsusp/pmdisk copies the pages back, the top page table is
> written before the entries that it points to are filled in.
> Depending on whether the second-level table lies before or after
> the 4M-page in question, this will result in a page fault.
>
> A simple solution is to copy the pages in reverse. This way the
> top page table is filled in last which should resolve this particular
> issue. The following patch does exactly that and fixes the problem
> for me.

Thanks a lot for figuring this out!

But... I do not like the fix. It does depend on memory layout on
very subtle way.

What about switching to temporary, PSE-enabled pagetables
in nosave area for suspend? Copying pagetables soon after boot
should do the trick.

Pavel

--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2004-04-26 14:12:53

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> >>come up with a more nuanced test? Better still, though, we should just
> >>get
> >>proper AGP support for suspending and resuming in.
> >
> >It's got nothing to do with AGP. This is a flaw in the swsusp code.
> >It can be triggered by anything that plays with page attributes.
>
> Not so much a flaw in the suspend code as something that needs to be dealt
> with: it's not a bug for pages to have protection, and its not a bug for
> us to need it temporarily removed in order to do the copyback. We just
> need the support in the drivers to achieve that. When we have it (as we do
> in some cases in 2.4), all is well.

No, Herbert is right here. This *is* swsusp fault. Swsusp assumes 4MB
tables which is not guaranteed even on PSE machines.
Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

2004-04-26 14:12:51

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> >>A simple solution is to copy the pages in reverse. This way the
> >>top page table is filled in last which should resolve this
> >>particular
> >>issue. The following patch does exactly that and fixes the problem
> >>for me.
> >
> >Of course this doesn't work for machines without PSE. But then the
> >original code didn't work either. Since resuming from 486's isn't
> >that cool anyway, IMHO someone should just add a PSE check in the
> >swsusp/pmdisk init code on i386.
>
> There used to be such a check. Centrinos, however, if I recall
> correctly, don't have PSE but can suspend with our current method.
> Perhaps we can come up with a more nuanced test? Better still,

Test should still be there. Switching to temporary page tables
seems to be tbe solution.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2004-04-26 21:01:51

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

On Mon, 26 Apr 2004 15:46:36 +0200, Pavel Machek <[email protected]> wrote:
> No, Herbert is right here. This *is* swsusp fault. Swsusp assumes 4MB
> tables which is not guaranteed even on PSE machines.

Okay then. Who wants to roll the patch? :>

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-27 07:02:56

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

> re nuanced test? Better still,
>
> Test should still be there. Switching to temporary page tables
> seems to be tbe solution.

This is close to the problem I talked about when that PPC version
appeared, which is why, at least on resume, I run everything with
MMU off in the patch I proposed :)

(BTW, Nigel, did you merge the PPC support at all in swsusp2 ?)

Ben.


2004-04-27 08:04:46

by Herbert Xu

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

On Mon, Apr 26, 2004 at 03:08:07PM +0200, Pavel Machek wrote:
>
> What about switching to temporary, PSE-enabled pagetables
> in nosave area for suspend? Copying pagetables soon after boot
> should do the trick.

Yes that would solve the non-PSE problem as well. Only problem
is that I don't have any more time to spend on this issue so one
of you guys will need to write the code.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2004-04-27 09:01:29

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi.

On Tue, 27 Apr 2004 16:56:25 +1000, Benjamin Herrenschmidt
<[email protected]> wrote:

>> re nuanced test? Better still,
>>
>> Test should still be there. Switching to temporary page tables
>> seems to be tbe solution.
>
> This is close to the problem I talked about when that PPC version
> appeared, which is why, at least on resume, I run everything with
> MMU off in the patch I proposed :)
>
> (BTW, Nigel, did you merge the PPC support at all in swsusp2 ?)

Yes. I have a recent patch still to apply, but I did put previous patches
in.

Regards,

Nigel


--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-27 10:21:54

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> > re nuanced test? Better still,
> >
> > Test should still be there. Switching to temporary page tables
> > seems to be tbe solution.
>
> This is close to the problem I talked about when that PPC version
> appeared, which is why, at least on resume, I run everything with
> MMU off in the patch I proposed :)

I'm not sure if I can turn off MMU on i386 so easily. That would
certainly fix it, too.

BTW what is performance penalty of not running 4MB pages on kernel?
Every user with intel-agp (etc) eats it, even if they are not using 3D
on the machine...
Pavel
--
934a471f20d6580d5aad759bf0d97ddc

2004-04-27 10:26:33

by Herbert Xu

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

On Tue, Apr 27, 2004 at 12:21:27PM +0200, Pavel Machek wrote:
>
> BTW what is performance penalty of not running 4MB pages on kernel?
> Every user with intel-agp (etc) eats it, even if they are not using 3D
> on the machine...

The penalty only applies to the 4M region around the gatt table.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2004-04-27 10:27:06

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> > BTW what is performance penalty of not running 4MB pages on kernel?
> > Every user with intel-agp (etc) eats it, even if they are not using 3D
> > on the machine...
>
> The penalty only applies to the 4M region around the gatt table.

I thought that kernel should pretty much fit to single 4M region, but
I guess that's not case here. Thanks.

(Compiled kernel is still <4M, so having gatt near to kernel code
would hurt quite a bit; but that's probably not the case).
Pavel
--
934a471f20d6580d5aad759bf0d97ddc

2004-04-27 12:48:55

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

This should be better solution, could anyone test it? [It compiles,
and I'm out of time now].

Pavel

--- tmp/linux/arch/i386/mm/init.c 2004-04-05 10:45:11.000000000 +0200
+++ linux/arch/i386/mm/init.c 2004-04-27 14:44:00.000000000 +0200
@@ -343,6 +343,12 @@
#else
set_pgd(swapper_pg_dir+i, __pgd(0));
#endif
+#ifdef CONFIG_SOFTWARE_SUSPEND
+ {
+ extern char swsusp_pg_dir[PAGE_SIZE];
+ memcpy(swsusp_pg_dir, swapper_pg_dir, PAGE_SIZE);
+ }
+#endif
flush_tlb_all();
}

--- tmp/linux/arch/i386/power/cpu.c 2003-09-28 22:05:30.000000000 +0200
+++ linux/arch/i386/power/cpu.c 2004-04-27 14:44:03.000000000 +0200
@@ -35,6 +35,9 @@
unsigned long saved_context_esi, saved_context_edi;
unsigned long saved_context_eflags;

+/* Special page directory for resume */
+char swsusp_pg_dir[PAGE_SIZE];
+
extern void enable_sep_cpu(void *);

void save_processor_state(void)
--- tmp/linux/arch/i386/power/swsusp.S 2003-09-28 22:05:30.000000000 +0200
+++ linux/arch/i386/power/swsusp.S 2004-04-27 14:41:54.000000000 +0200
@@ -29,7 +38,7 @@
jmp .L1449
.p2align 4,,7
.L1450:
- movl $swapper_pg_dir-__PAGE_OFFSET,%ecx
+ movl $swsusp_pg_dir-__PAGE_OFFSET,%ecx
movl %ecx,%cr3

call do_magic_resume_1

--
934a471f20d6580d5aad759bf0d97ddc

2004-04-27 12:56:52

by Herbert Xu

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

On Tue, Apr 27, 2004 at 02:48:38PM +0200, Pavel Machek wrote:
>
> This should be better solution, could anyone test it? [It compiles,
> and I'm out of time now].

Well it still doen't solve the non-PSE case since you're only copying the
top-level page table.

> --- tmp/linux/arch/i386/power/cpu.c 2003-09-28 22:05:30.000000000 +0200
> +++ linux/arch/i386/power/cpu.c 2004-04-27 14:44:03.000000000 +0200
> @@ -35,6 +35,9 @@
> unsigned long saved_context_esi, saved_context_edi;
> unsigned long saved_context_eflags;
>
> +/* Special page directory for resume */
> +char swsusp_pg_dir[PAGE_SIZE];
> +

You forgot to mark this as nosave.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2004-04-27 13:54:23

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> > This should be better solution, could anyone test it? [It compiles,
> > and I'm out of time now].
>
> Well it still doen't solve the non-PSE case since you're only copying the
> top-level page table.

Yes, right, but your patch does not solve that, too, right?
Someone else will have to do that one.

Rather than adding up-to 4M of nonsave pagetables, turning off
paging might be solution there.

And this will actually help. If we move saving few statments before,
we'll have identity mapping so we can turn paging off...

> > --- tmp/linux/arch/i386/power/cpu.c 2003-09-28 22:05:30.000000000 +0200
> > +++ linux/arch/i386/power/cpu.c 2004-04-27 14:44:03.000000000 +0200
> > @@ -35,6 +35,9 @@
> > unsigned long saved_context_esi, saved_context_edi;
> > unsigned long saved_context_eflags;
> >
> > +/* Special page directory for resume */
> > +char swsusp_pg_dir[PAGE_SIZE];
> > +
>
> You forgot to mark this as nosave.

Hmmm, right. Well, it should work anyway because this is
same in new and old kernel, so I'd still like some testing.
Pavel
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2004-04-27 21:53:35

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> > --- tmp/linux/arch/i386/power/cpu.c 2003-09-28 22:05:30.000000000 +0200
> > +++ linux/arch/i386/power/cpu.c 2004-04-27 14:44:03.000000000 +0200
> > @@ -35,6 +35,9 @@
> > unsigned long saved_context_esi, saved_context_edi;
> > unsigned long saved_context_eflags;
> >
> > +/* Special page directory for resume */
> > +char swsusp_pg_dir[PAGE_SIZE];
> > +
>
> You forgot to mark this as nosave.

More importantly, I forgot to mark it as aligned on PAGE_SIZE. Oops
(er... double fault). Here's fixed patch, and this one should work.

Andrew, the crashes with intel-agp were not driver fault after
all. swsusp assumed 4MB pages, and intel-agp driver broke 4MB page
down, resulting in nasty crash.

Herbert's solution was to copy memory backwards, and avoid the crash
by luck (But thanks a lot for explaining me the problem!).

Non-PSE cpus are still not supported; but it should be easier when we
are running in pagedir with identity-mapped pages.

This solution copies page table at boot, where it is "known good",
still 4MB. Could you apply it?

Pavel

--- tmp/linux/arch/i386/mm/init.c 2004-04-05 10:45:11.000000000 +0200
+++ linux/arch/i386/mm/init.c 2004-04-27 23:39:07.000000000 +0200
@@ -331,6 +331,13 @@
void zap_low_mappings (void)
{
int i;
+
+#ifdef CONFIG_SOFTWARE_SUSPEND
+ {
+ extern char swsusp_pg_dir[PAGE_SIZE];
+ memcpy(swsusp_pg_dir, swapper_pg_dir, PAGE_SIZE);
+ }
+#endif
/*
* Zap initial low-memory mappings.
*
--- tmp/linux/arch/i386/power/cpu.c 2003-09-28 22:05:30.000000000 +0200
+++ linux/arch/i386/power/cpu.c 2004-04-27 23:41:01.000000000 +0200
@@ -35,6 +35,10 @@
unsigned long saved_context_esi, saved_context_edi;
unsigned long saved_context_eflags;

+/* Special page directory for resume */
+char __nosavedata swsusp_pg_dir[PAGE_SIZE]
+ __attribute__ ((aligned (PAGE_SIZE)));
+
extern void enable_sep_cpu(void *);

void save_processor_state(void)
--- tmp/linux/arch/i386/power/swsusp.S 2003-09-28 22:05:30.000000000 +0200
+++ linux/arch/i386/power/swsusp.S 2004-04-27 14:41:54.000000000 +0200
@@ -29,7 +38,7 @@
jmp .L1449
.p2align 4,,7
.L1450:
- movl $swapper_pg_dir-__PAGE_OFFSET,%ecx
+ movl $swsusp_pg_dir-__PAGE_OFFSET,%ecx
movl %ecx,%cr3

call do_magic_resume_1
--- tmp/linux/include/asm-i386/suspend.h 2003-09-28 22:06:36.000000000 +0200
+++ linux/include/asm-i386/suspend.h 2004-04-27 23:10:24.000000000 +0200
@@ -9,6 +9,9 @@
static inline int
arch_prepare_suspend(void)
{
+ /* If you want to make non-PSE machine work, turn off paging
+ in do_magic. swsusp_pg_dir should have identity mapping, so
+ it could work... */
if (!cpu_has_pse)
return -EPERM;
return 0;

--
934a471f20d6580d5aad759bf0d97ddc

2004-04-27 23:10:51

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi Pavel et al.

On Tue, 27 Apr 2004 23:52:36 +0200, Pavel Machek <[email protected]> wrote:

> +#ifdef CONFIG_SOFTWARE_SUSPEND
> + {
> + extern char swsusp_pg_dir[PAGE_SIZE];
> + memcpy(swsusp_pg_dir, swapper_pg_dir, PAGE_SIZE);
> + }
> +#endif

Would you consider making that #ifdef CONFIG_PM, so that I could use it
too without needing to patch it further? (I'm using
CONFIG_SOFTWARE_SUSPEND2 if you prefer something more specific).

Regards,

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-27 23:16:48

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> On Tue, 27 Apr 2004 23:52:36 +0200, Pavel Machek <[email protected]> wrote:
>
> >+#ifdef CONFIG_SOFTWARE_SUSPEND
> >+ {
> >+ extern char swsusp_pg_dir[PAGE_SIZE];
> >+ memcpy(swsusp_pg_dir, swapper_pg_dir, PAGE_SIZE);
> >+ }
> >+#endif
>
> Would you consider making that #ifdef CONFIG_PM, so that I could use it
> too without needing to patch it further? (I'm using
> CONFIG_SOFTWARE_SUSPEND2 if you prefer something more specific).
>

Well, swsusp_pg_dir is defined in kernel/power/cpu.c, so it is not as
easy as defining it CONFIG_PM.

What about make CONFIG_SOFTWARE_SUSPEND2 defining
CONFIG_SOFTWARE_SUSPEND, too? We want the merged, anyway...

Pavel
--
934a471f20d6580d5aad759bf0d97ddc

2004-04-27 23:28:08

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi.

On Wed, 28 Apr 2004 01:16:26 +0200, Pavel Machek <[email protected]> wrote:

> Hi!
>
>> On Tue, 27 Apr 2004 23:52:36 +0200, Pavel Machek <[email protected]> wrote:
>>
>> >+#ifdef CONFIG_SOFTWARE_SUSPEND
>> >+ {
>> >+ extern char swsusp_pg_dir[PAGE_SIZE];
>> >+ memcpy(swsusp_pg_dir, swapper_pg_dir, PAGE_SIZE);
>> >+ }
>> >+#endif
>>
>> Would you consider making that #ifdef CONFIG_PM, so that I could use it
>> too without needing to patch it further? (I'm using
>> CONFIG_SOFTWARE_SUSPEND2 if you prefer something more specific).
>>
>
> Well, swsusp_pg_dir is defined in kernel/power/cpu.c, so it is not as
> easy as defining it CONFIG_PM.

Ah. Of course. Humble apologies.

> What about make CONFIG_SOFTWARE_SUSPEND2 defining
> CONFIG_SOFTWARE_SUSPEND, too? We want the merged, anyway...

Could do. And on the top of merging, sorry for the delays I'm getting
there. I figured out yesterday what was holding me back with getting SMP &
HighMem going under 2.6. It was really simple: the compile was using -O2.
A quick change to the Makefile and I can now use a C file as I do with 2.4.

Regards,

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-27 23:33:03

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

> ged, anyway...
>
> Could do. And on the top of merging, sorry for the delays I'm getting
> there. I figured out yesterday what was holding me back with getting SMP &
> HighMem going under 2.6. It was really simple: the compile was using -O2.
> A quick change to the Makefile and I can now use a C file as I do with 2.4.

Which, as I keep saying, is plain broken ... You simply cannot control
what side effects the compiler will generate, like touching the stack,
etc... Such a critical routine _has_ to be written in assembly (and
properly commented of course). Anything else is asking for trouble.

Ben.

2004-04-27 23:45:30

by Nigel Cunningham

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi.

On Wed, 28 Apr 2004 09:24:11 +1000, Benjamin Herrenschmidt
<[email protected]> wrote:
> Which, as I keep saying, is plain broken ... You simply cannot control
> what side effects the compiler will generate, like touching the stack,
> etc... Such a critical routine _has_ to be written in assembly (and
> properly commented of course). Anything else is asking for trouble.

No, it doesn't have to be written in assembly. If that was the case, I
wouldn't have managed to get it working under a variety of compiler
versions already. So long as assemblers honour the directives, we're okay.
Of course I'll freely admit that hand coding would probably result in
nicer, tidier and maybe faster code, and you would know that it was doing
the right thing. In the long time I would prefer to do that. But right now
I'm being a pragmatist: .c works, I don't know x86 assembly and don't have
the time to learn it and the code is still changing a little, so I'll
delay making a .S file for now.

Please, forgive me!

Regards,

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614, Australia.
+61 (2) 6251 7727 (wk)

2004-04-28 00:21:01

by Pavel Machek

[permalink] [raw]
Subject: Re: Bug#234976: kernel-source-2.6.4: Software Suspend doesn't work

Hi!

> > ged, anyway...
> >
> > Could do. And on the top of merging, sorry for the delays I'm getting
> > there. I figured out yesterday what was holding me back with getting SMP &
> > HighMem going under 2.6. It was really simple: the compile was using -O2.
> > A quick change to the Makefile and I can now use a C file as I do with 2.4.
>
> Which, as I keep saying, is plain broken ... You simply cannot control
> what side effects the compiler will generate, like touching the stack,
> etc... Such a critical routine _has_ to be written in assembly (and
> properly commented of course). Anything else is asking for trouble.

You are of course right.

OTOH this is easy to solve... gcc -E and use resulting assembly after
verifying it (and probably cleaning it up).

BTW do not do the same mistake I did... I basically lost original .c
file. Make sure .c file is still there somewhere, so assembly can be
regenerated after big changes.
Pavel
--
934a471f20d6580d5aad759bf0d97ddc