2008-02-27 22:10:56

by Klaus S. Madsen

Subject: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Hi all,

I have a Thinkpad T61p, which I'm able to suspend with s2ram on
Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3, s2ram
dies after changing to vt1, with a segfault. I'm using s2ram from cvs,
and libx86 version 0.99 from http://www.codon.org.uk/~mjg59/libx86/.

Some details about the segfault:

$ sudo gdb ./s2ram
(gdb) run
Starting program: /home/ksm/downloads/suspend/s2ram
Switching from vt7 to vt1
Calling get_mode

Program received signal SIGSEGV, Segmentation fault.
0xb7facf4a in run_vm86 () at lrmi.c:526
526 asm volatile (
(gdb) list
521 static int
522 lrmi_vm86(struct vm86_struct *vm)
523 {
524 int r;
525 #ifdef __PIC__
526 asm volatile (
527 "pushl %%ebx\n\t"
528 "movl %2, %%ebx\n\t"
529 "int $0x80\n\t"
530 "popl %%ebx"
(gdb) bt
#0 0xb7facf4a in run_vm86 () at lrmi.c:526
#1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
#2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
at vbetool/vbetool.c:158
#3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
#4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
#5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92

I have tried to bisect the problem, and it fingered the following
commit:

commit 82bc03fc158e28c90d7ed9919410776039cb4e14
Author: Ingo Molnar

x86: add PWT to NOCACHE flags

Reverting this commit in the bisected tree (by executing git show
82bc03fc158e28c90d7ed9919410776039cb4e14 | patch -R -p1), makes the
segfault go away. I've run make clean between each kernel compile, to be
sure the tree was correctly compiled.

I have attached the .config I'm using for 2.6.25-rc3 (when bisecting, I
just chose the default on every question). If necessary, I can try to
reconstruct the one I ended up with after the bisection.

Hope this makes sense to someone, and thanks in advance.

--
Kind regards
Klaus S. Madsen


2008-02-27 22:21:00

by Rafael J. Wysocki

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Wednesday, 27 of February 2008, Klaus S. Madsen wrote:
> Hi all,
>
> I have a Thinkpad T61p, which I'm able to suspend with s2ram on
> Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3, s2ram
> dies after changing to vt1, with a segfault. I'm using s2ram from cvs,
> and libx86 version 0.99 from http://www.codon.org.uk/~mjg59/libx86/.

There's a known suspend problem with 2.6.25-rc3 that has already been fixed
in Linus' tree. Can you test the current head of Linus' tree, please?

Rafael

2008-02-27 22:40:20

by Pavel Machek

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Wed 2008-02-27 23:19:11, Rafael J. Wysocki wrote:
> On Wednesday, 27 of February 2008, Klaus S. Madsen wrote:
> > Hi all,
> >
> > I have a Thinkpad T61p, which I'm able to suspend with s2ram on
> > Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3, s2ram
> > dies after changing to vt1, with a segfault. I'm using s2ram from cvs,
> > and libx86 version 0.99 from http://www.codon.org.uk/~mjg59/libx86/.
>
> There's a known suspend problem with 2.6.25-rc3 that has been fixed already
> in the Linus' tree. Can you test the current head of the Linus' tree, please?

This does not look like a known problem, actually... s2ram segfaults
somewhere in the emulator...?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-02-28 06:50:46

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Wed, Feb 27, 2008 at 23:19:11 +0100, Rafael J. Wysocki wrote:
> On Wednesday, 27 of February 2008, Klaus S. Madsen wrote:
> > Hi all,
> >
> > I have a Thinkpad T61p, which I'm able to suspend with s2ram on
> > Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3, s2ram
> > dies after changing to vt1, with a segfault. I'm using s2ram from cvs,
> > and libx86 version 0.99 from http://www.codon.org.uk/~mjg59/libx86/.
>
> There's a known suspend problem with 2.6.25-rc3 that has been fixed
> already in the Linus' tree. Can you test the current head of the
> Linus' tree, please?
I've tested the head of Linus' git as of this morning, and the problem
still exists. Note however, that I don't even get to the suspend part,
as s2ram crashes before it initiates the kernel part of STR.

--
Kind regards
Klaus S. Madsen

2008-02-28 07:02:48

by Pavel Machek

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Thu 2008-02-28 07:47:56, Klaus S. Madsen wrote:
> On Wed, Feb 27, 2008 at 23:19:11 +0100, Rafael J. Wysocki wrote:
> > On Wednesday, 27 of February 2008, Klaus S. Madsen wrote:
> > > Hi all,
> > >
> > > I have a Thinkpad T61p, which I'm able to suspend with s2ram on
> > > Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3, s2ram
> > > dies after changing to vt1, with a segfault. I'm using s2ram from cvs,
> > > and libx86 version 0.99 from http://www.codon.org.uk/~mjg59/libx86/.
> >
> > There's a known suspend problem with 2.6.25-rc3 that has been fixed
> > already in the Linus' tree. Can you test the current head of the
> > Linus' tree, please?
> I've tested the head of Linus' git as of this morning, and the problem
> still exists. Note however, that I don't even get to the suspend part,
> as s2ram crashes before it initiates the kernel part of STR.

Yes, looks like Ingo broke vm86 emulation, or something like that...?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-02-28 07:09:16

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Wed, Feb 27, 2008 at 23:10:33 +0100, Klaus S. Madsen wrote:
> Hi all,
>
> I have a Thinkpad T61p, which I'm able to suspend with s2ram on
> Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3, s2ram
> dies after changing to vt1, with a segfault. I'm using s2ram from cvs,
> and libx86 version 0.99 from http://www.codon.org.uk/~mjg59/libx86/.
>
> Some details about the segfault:
>
> $ sudo gdb ./s2ram
> (gdb) run
> Starting program: /home/ksm/downloads/suspend/s2ram
> Switching from vt7 to vt1
> Calling get_mode
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xb7facf4a in run_vm86 () at lrmi.c:526
> 526 asm volatile (
> (gdb) list
> 521 static int
> 522 lrmi_vm86(struct vm86_struct *vm)
> 523 {
> 524 int r;
> 525 #ifdef __PIC__
> 526 asm volatile (
> 527 "pushl %%ebx\n\t"
> 528 "movl %2, %%ebx\n\t"
> 529 "int $0x80\n\t"
> 530 "popl %%ebx"
> (gdb) bt
> #0 0xb7facf4a in run_vm86 () at lrmi.c:526
> #1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
> #2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
> at vbetool/vbetool.c:158
> #3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
> #4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
> #5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92
>
> I have tried to bisect the problem, and it fingered the following
> commit:
>
> commit 82bc03fc158e28c90d7ed9919410776039cb4e14
> Author: Ingo Molnar
>
> x86: add PWT to NOCACHE flags
>
> Reverting this commit in the bisected tree (by executing git show
> 82bc03fc158e28c90d7ed9919410776039cb4e14 | patch -R -p1), makes the
> segfault go away. I've run make clean between each kernel compile, to be
> sure the tree was correctly compiled.
>
> I have attached the .config I'm using for 2.6.25-rc3 (when bisecting, I
> just choose the default on every question). If necessary, I can try to
> reconstruct the one I ended up with after the bisection.
I have attached my .config now, in case it's important.

--
Kind regards
Klaus S. Madsen


Attachments:
(No filename) (2.10 kB)
config-t61p.txt (49.38 kB)

2008-02-28 09:17:22

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* Klaus S. Madsen wrote:

> Hi all,
>
> I have a Thinkpad T61p, which I'm able to suspend with s2ram
> on Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3,
> s2ram dies after changing to vt1, with a segfault. I'm using s2ram
> from cvs, and libx86 version 0.99 from
> http://www.codon.org.uk/~mjg59/libx86/.
>
> Some details about the segfault:
>
> $ sudo gdb ./s2ram
> (gdb) run
> Starting program: /home/ksm/downloads/suspend/s2ram
> Switching from vt7 to vt1
> Calling get_mode
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xb7facf4a in run_vm86 () at lrmi.c:526
> 526 asm volatile (
> (gdb) list
> 521 static int
> 522 lrmi_vm86(struct vm86_struct *vm)
> 523 {
> 524 int r;
> 525 #ifdef __PIC__
> 526 asm volatile (
> 527 "pushl %%ebx\n\t"
> 528 "movl %2, %%ebx\n\t"
> 529 "int $0x80\n\t"
> 530 "popl %%ebx"
> (gdb) bt
> #0 0xb7facf4a in run_vm86 () at lrmi.c:526
> #1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
> #2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
> at vbetool/vbetool.c:158
> #3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
> #4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
> #5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92
>
> I have tried to bisect the problem, and it fingered the following
> commit:
>
> commit 82bc03fc158e28c90d7ed9919410776039cb4e14
> Author: Ingo Molnar
>
> x86: add PWT to NOCACHE flags
>
> Reverting this commit in the bisected tree (by executing git show
> 82bc03fc158e28c90d7ed9919410776039cb4e14 | patch -R -p1), makes the
> segfault go away. I've run make clean between each kernel compile, to
> be sure the tree was correctly compiled.

thanks for tracking this down. It would be nice to figure out why this
change made a difference. Perhaps VM86 mode has some restrictions in
what type of pagetables it can operate in - and the CPU just refuses to
properly emulate those 16-bit instructions? (this would be very weird).
We are trying to execute 16-bit BIOS code here, right?

which instruction is the segfault coming from - the int $0x80? So in
vm86 mode we generated a #GPF which shows up as a SIGSEGV?

Ingo

2008-02-28 09:28:58

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Thu, Feb 28, 2008 at 10:16:39 +0100, Ingo Molnar wrote:
>
> * Klaus S. Madsen wrote:
>
> > Hi all,
> >
> > I have a Thinkpad T61p, which I'm able to suspend with s2ram
> > on Linux 2.6.24.3. However when I try to suspend it on 2.6.25-rc3,
> > s2ram dies after changing to vt1, with a segfault. I'm using s2ram
> > from cvs, and libx86 version 0.99 from
> > http://www.codon.org.uk/~mjg59/libx86/.
> >
> > Some details about the segfault:
> >
> > $ sudo gdb ./s2ram
> > (gdb) run
> > Starting program: /home/ksm/downloads/suspend/s2ram
> > Switching from vt7 to vt1
> > Calling get_mode
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0xb7facf4a in run_vm86 () at lrmi.c:526
> > 526 asm volatile (
> > (gdb) list
> > 521 static int
> > 522 lrmi_vm86(struct vm86_struct *vm)
> > 523 {
> > 524 int r;
> > 525 #ifdef __PIC__
> > 526 asm volatile (
> > 527 "pushl %%ebx\n\t"
> > 528 "movl %2, %%ebx\n\t"
> > 529 "int $0x80\n\t"
> > 530 "popl %%ebx"
> > (gdb) bt
> > #0 0xb7facf4a in run_vm86 () at lrmi.c:526
> > #1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
> > #2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
> > at vbetool/vbetool.c:158
> > #3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
> > #4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
> > #5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92
> >
> > I have tried to bisect the problem, and it fingered the following
> > commit:
> >
> > commit 82bc03fc158e28c90d7ed9919410776039cb4e14
> > Author: Ingo Molnar
> >
> > x86: add PWT to NOCACHE flags
> >
> > Reverting this commit in the bisected tree (by executing git show
> > 82bc03fc158e28c90d7ed9919410776039cb4e14 | patch -R -p1), makes the
> > segfault go away. I've run make clean between each kernel compile, to
> > be sure the tree was correctly compiled.
>
> thanks for tracking this down. It would be nice to figure out why this
> change made a difference. Perhaps VM86 mode has some restrictions in
> what type of pagetables it can operate in - and the CPU just refuses to
> properly emulate those 16-bit instructions? (this would be very weird).
> We are trying to execute 16-bit BIOS code here, right?
>
> which instruction is the segfault coming from - the int $0x80? So in
> vm86 mode we generated a #GPF which shows up as a SIGSEGV?
I must say that I don't quite understand why gdb fingers the "asm
volatile" line and not one of the assembly lines when reporting the
segfault. But I'm not really well versed in low-level gdb use, so if you
could give me a hint about how to get gdb to disassemble the code at the
instruction pointer, I'll return with the result.

Thanks for taking the time to look at this.

--
Kind regards
Klaus S. Madsen

2008-02-28 09:40:31

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* Klaus S. Madsen wrote:

> > > 524 int r;
> > > 525 #ifdef __PIC__
> > > 526 asm volatile (
> > > 527 "pushl %%ebx\n\t"
> > > 528 "movl %2, %%ebx\n\t"
> > > 529 "int $0x80\n\t"
> > > 530 "popl %%ebx"
> > > (gdb) bt
> > > #0 0xb7facf4a in run_vm86 () at lrmi.c:526
> > > #1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
> > > #2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
> > > at vbetool/vbetool.c:158
> > > #3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
> > > #4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
> > > #5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92
> > >
> > > I have tried to bisect the problem, and it fingered the following
> > > commit:
> > >
> > > commit 82bc03fc158e28c90d7ed9919410776039cb4e14
> > > Author: Ingo Molnar
> > >
> > > x86: add PWT to NOCACHE flags
> > >
> > > Reverting this commit in the bisected tree (by executing git show
> > > 82bc03fc158e28c90d7ed9919410776039cb4e14 | patch -R -p1), makes the
> > > segfault go away. I've run make clean between each kernel compile, to
> > > be sure the tree was correctly compiled.
> >
> > thanks for tracking this down. It would be nice to figure out why this
> > change made a difference. Perhaps VM86 mode has some restrictions in
> > what type of pagetables it can operate in - and the CPU just refuses to
> > properly emulate those 16-bit instructions? (this would be very weird).
> > We are trying to execute 16-bit BIOS code here, right?
> >
> > which instruction is the segfault coming from - the int $0x80? So in
> > vm86 mode we generated a #GPF which shows up as a SIGSEGV?
> I must say, that I don't quite understand why gdb fingers the "asm
> volatile" line and not one of the assembly lines, when reporting the
> segfault. But I'm not really well versed in lowlevel gdb use, so if you
> could give me a about how I get gdb to disassemble the code at the
> instruction pointer, I'll return with the result.

typing 'disassemble' should do the trick.

If you have a specific address outside of the current instruction
pointer, then doing disassembly on a range:

disassemble 0xb7facf4a 0xb7facf8a

should work too.

Ingo

2008-02-28 15:04:54

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Thu, Feb 28, 2008 at 10:40:00 +0100, Ingo Molnar wrote:
>
> * Klaus S. Madsen wrote:
>
> > > > 524 int r;
> > > > 525 #ifdef __PIC__
> > > > 526 asm volatile (
> > > > 527 "pushl %%ebx\n\t"
> > > > 528 "movl %2, %%ebx\n\t"
> > > > 529 "int $0x80\n\t"
> > > > 530 "popl %%ebx"
> > > > (gdb) bt
> > > > #0 0xb7facf4a in run_vm86 () at lrmi.c:526
> > > > #1 0xb7fad61b in LRMI_int (i=16, r=0xbffca670) at lrmi.c:844
> > > > #2 0x0804acfc in do_vbe_service (AX=20227, BX=0, regs=0xbffca670)
> > > > at vbetool/vbetool.c:158
> > > > #3 0x0804af7e in __get_mode () at vbetool/vbetool.c:453
> > > > #4 0x0804a30f in s2ram_hacks () at s2ram-x86.c:268
> > > > #5 0x0804954f in main (argc=1, argv=0x0) at s2ram-main.c:92

[snip]

> > > thanks for tracking this down. It would be nice to figure out why this
> > > change made a difference. Perhaps VM86 mode has some restrictions in
> > > what type of pagetables it can operate in - and the CPU just refuses to
> > > properly emulate those 16-bit instructions? (this would be very weird).
> > > We are trying to execute 16-bit BIOS code here, right?
> > >
> > > which instruction is the segfault coming from - the int $0x80? So in
> > > vm86 mode we generated a #GPF which shows up as a SIGSEGV?

The segfault was at address 0xb7f59f4a, and the disassembly of
the run_vm86 function is:

0xb7f59f20 <run_vm86+0>: push %ebp
0xb7f59f21 <run_vm86+1>: mov %esp,%ebp
0xb7f59f23 <run_vm86+3>: push %edi
0xb7f59f24 <run_vm86+4>: push %esi
0xb7f59f25 <run_vm86+5>: push %ebx
0xb7f59f26 <run_vm86+6>: call 0xb7f59697 <__i686.get_pc_thunk.bx>
0xb7f59f2b <run_vm86+11>: add $0x18b5,%ebx
0xb7f59f31 <run_vm86+17>: sub $0x3c,%esp
0xb7f59f34 <run_vm86+20>: lea 0x48c(%ebx),%eax
0xb7f59f3a <run_vm86+26>: mov %eax,0xffffffc0(%ebp)
0xb7f59f3d <run_vm86+29>: mov $0x71,%eax
0xb7f59f42 <run_vm86+34>: mov 0xffffffc0(%ebp),%ecx
0xb7f59f45 <run_vm86+37>: push %ebx
0xb7f59f46 <run_vm86+38>: mov %ecx,%ebx
0xb7f59f48 <run_vm86+40>: int $0x80
0xb7f59f4a <run_vm86+42>: pop %ebx
0xb7f59f4b <run_vm86+43>: mov %eax,%edx
0xb7f59f4d <run_vm86+45>: and $0xff,%eax
0xb7f59f52 <run_vm86+50>: cmp $0x2,%eax
0xb7f59f55 <run_vm86+53>: je 0xb7f5a0b5 <run_vm86+405>
0xb7f59f5b <run_vm86+59>: sub $0x1,%eax
0xb7f59f5e <run_vm86+62>: jne 0xb7f5a28a <run_vm86+874>

Hope this helps.

--
Kind regards
Klaus S. Madsen
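
The disassembly above can be read mechanically. The following is a minimal Python sketch, not part of the original exchange: the names `I386_VM86_SYSCALLS` and `syscall_name` are made up for illustration, and the two syscall numbers are taken from the i386 syscall table. It suggests that the `mov $0x71,%eax` before `int $0x80` selects vm86old, and that the reported fault address is simply the instruction after the two-byte `int $0x80`, i.e. the return address of the system call.

```python
# Hypothetical helper for reading the run_vm86 disassembly above.
# Syscall numbers are from the i386 table: 113 = vm86old, 166 = vm86.
I386_VM86_SYSCALLS = {113: "vm86old", 166: "vm86"}


def syscall_name(eax: int) -> str:
    """Map the value loaded into %eax to a vm86-related syscall name."""
    return I386_VM86_SYSCALLS.get(eax, f"syscall {eax}")


# Addresses copied from the disassembly listing.
INT80_ADDR = 0xB7F59F48   # int $0x80
FAULT_ADDR = 0xB7F59F4A   # pop %ebx, where gdb reports the SIGSEGV

print(syscall_name(0x71))          # value from `mov $0x71,%eax`
print(FAULT_ADDR - INT80_ADDR)     # length of the `int $0x80` instruction
```

If this reading is right, the signal is delivered while the process is inside the vm86 system call, so the user-space program counter shown by gdb says little about where the 16-bit code itself faulted.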

2008-02-28 17:54:21

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Klaus S. Madsen wrote:
>
> Hope this helps.
>

What this seems to indicate is a segfault inside VM86 mode that causes it
to exit to deliver the SIGSEGV, so without more information, such as
signal context, there isn't much to know about it.

It looks like the fault happens inside the VESA BIOS, specifically VBE
function 3:

--------V-104F03-----------------------------
INT 10 - VESA SuperVGA BIOS - GET CURRENT VIDEO MODE
AX = 4F03h
Return: AL = 4Fh if function supported
AH = status
00h successful
BX = video mode (see #00083,#00084)
bit 13: VBE/AF v1.0P accelerated video mode
bit 14: linear frame buffer enabled (VBE v2.0+)
bit 15: don't clear video memory
01h failed
SeeAlso: AH=0Fh,AX=4E04h,AX=4F02h

... which normally would be a trivial function which only reads a couple
of status words out of internal state and returns.

****

Typically, when the kernel reflects an error in VM86 mode it will update
the structure in memory (in your case, the vm86plus_struct) to reflect
the context. Would it be possible for you to read it out?

[FWIW, that code looks like it's using assembly for no good current
reason. Not sure if it'd help to clean it up.]

-hpa
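
The decimal arguments in the gdb backtrace (`LRMI_int (i=16, ...)` and `do_vbe_service (AX=20227, ...)`) line up with the VBE function identified above. A minimal Python sketch of the conversion follows; the helper name `decode_vbe_call` is hypothetical, and the lookup table deliberately contains only the one function described in this thread.

```python
# Hypothetical helper: turn the decimal values gdb printed in the
# backtrace into the INT/AX notation used by the interrupt list.
VBE_FUNCTIONS = {
    0x03: "GET CURRENT VIDEO MODE",  # the only entry quoted in this thread
}


def decode_vbe_call(int_no: int, ax: int) -> str:
    """Format an interrupt number and AX value, naming the VBE call if known."""
    ah, al = ax >> 8, ax & 0xFF
    if ah == 0x4F:                       # AH=4Fh marks a VBE call
        name = VBE_FUNCTIONS.get(al, "unknown VBE function")
    else:
        name = "not a VBE call"
    return f"INT {int_no:02X}h AX={ax:04X}h ({name})"


print(decode_vbe_call(16, 20227))
```

With the values from the backtrace this yields INT 10h with AX=4F03h, which matches the entry quoted above.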

2008-02-28 19:24:34

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Thu, Feb 28, 2008 at 09:52:57 -0800, H. Peter Anvin wrote:
> Klaus S. Madsen wrote:
> >
> >Hope this helps.
> >
>
> What this seems to indicate is a segfault inside VM mode that causes it
> to exit to deliver the SIGSEGV, so without more information, such as
> signal context, there isn't much to know about it.
>
> It looks like the fault happens inside the VESA BIOS, specifically VBE
> function 3:
>
> --------V-104F03-----------------------------
> INT 10 - VESA SuperVGA BIOS - GET CURRENT VIDEO MODE
> AX = 4F03h
> Return: AL = 4Fh if function supported
> AH = status
> 00h successful
> BX = video mode (see #00083,#00084)
> bit 13: VBE/AF v1.0P accelerated video mode
> bit 14: linear frame buffer enabled (VBE v2.0+)
> bit 15: don't clear video memory
> 01h failed
> SeeAlso: AH=0Fh,AX=4E04h,AX=4F02h
>
> ... which normally would be a trivial function which only reads a couple
> of status words out of internal state and returns.
>
> ****
>
> Typically, when the kernel reflects an error in VM86 mode it will update
> the structure in memory (in your case, the vm86plus_struct) to reflect
> the context. Would it be possible for you to read it out?
Hmm. As far as I can tell, it's actually using the vm86old system call?
That's at least what the comment in libx86 states.

However, the contents of struct vm86_struct after the segfault are:

(gdb) print context.vm
$2 = {regs = {ebx = 0, ecx = 0, edx = 0, esi = 0, edi = 0, ebp = 0,
eax = 20227, __null_ds = 0, __null_es = 0, __null_fs = -1071579136,
__null_gs = 0, orig_eax = -1, eip = 6326, cs = 49152, __csh = 0,
eflags = 209410, esp = 4090, ss = 256, __ssh = 0, es = 0, __esh = 0,
ds = 64, __dsh = 0, fs = 0, __fsh = 0, gs = 0, __gsh = 0}, flags = 0,
screen_bitmap = 0, cpu_type = 0, int_revectored = {__map = {0, 0, 0,0, 0,
0, 0, 2147483648}}, int21_revectored = {__map = {0, 0, 0, 0, 0, 0, 0,
0}}}

My version of glibc does not seem to have vm86old declared, so I haven't
tried to remove the assembly code.

Should I try to change it to use vm86, instead of vm86old?

--
Kind regards
Klaus S. Madsen
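
gdb prints the saved vm86 registers in decimal, which hides the interesting values. The sketch below is a hedged aid for reading the dump, not something from the original mail: the function names are made up, the EFLAGS bit positions are the standard x86 ones, and it assumes the usual real-mode address rule (segment * 16 + offset) and that bit n of `int_revectored` stands for interrupt n.

```python
# Values copied from the `print context.vm` output above.
regs = {"eax": 20227, "eip": 6326, "cs": 49152,
        "eflags": 209410, "esp": 4090, "ss": 256, "ds": 64}
int_revectored = [0, 0, 0, 0, 0, 0, 0, 2147483648]

# Standard x86 EFLAGS bit positions for the flags of interest here.
EFLAGS_BITS = {9: "IF", 16: "RF", 17: "VM"}


def linear(seg: int, off: int) -> int:
    """Real-mode style linear address: segment * 16 + 16-bit offset."""
    return (seg << 4) + (off & 0xFFFF)


def eflags_names(eflags: int) -> list:
    """Names of the set flags from EFLAGS_BITS, plus the IOPL field."""
    names = [n for bit, n in sorted(EFLAGS_BITS.items()) if (eflags >> bit) & 1]
    names.append(f"IOPL={(eflags >> 12) & 3}")
    return names


def revectored(bitmap: list) -> list:
    """Interrupt numbers whose bit is set in the 8 x 32-bit bitmap."""
    return [32 * w + b for w, word in enumerate(bitmap)
            for b in range(32) if (word >> b) & 1]


print(f"AX     = {regs['eax']:04X}h")
print(f"CS:IP  = {regs['cs']:04X}:{regs['eip']:04X} -> {linear(regs['cs'], regs['eip']):05X}h")
print(f"SS:SP  = {regs['ss']:04X}:{regs['esp']:04X} -> {linear(regs['ss'], regs['esp']):05X}h")
print(f"EFLAGS = {regs['eflags']:05X}h {eflags_names(regs['eflags'])}")
print(f"revectored interrupts: {[f'{n:02X}h' for n in revectored(int_revectored)]}")
```

Read this way, the saved CS:IP is C000:18B6, which would place the saved instruction pointer in the C0000h segment where the video BIOS normally lives, and the VM flag is set, consistent with the state having been saved while in vm86 mode.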

2008-02-28 19:32:43

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Klaus S. Madsen wrote:
>>
>> Typically, when the kernel reflects an error in VM86 mode it will update
>> the structure in memory (in your case, the vm86plus_struct) to reflect
>> the context. Would it be possible for you to read it out?
> Hmm. As far as I can tell, its actually using the vm86old system call?
> That's at least what the comment in libx86 states.
>
> However the contents of struct vm86_struct after the segfault is:
>
> (gdb) print context.vm
> $2 = {regs = {ebx = 0, ecx = 0, edx = 0, esi = 0, edi = 0, ebp = 0,
> eax = 20227, __null_ds = 0, __null_es = 0, __null_fs = -1071579136,
> __null_gs = 0, orig_eax = -1, eip = 6326, cs = 49152, __csh = 0,
> eflags = 209410, esp = 4090, ss = 256, __ssh = 0, es = 0, __esh = 0,
> ds = 64, __dsh = 0, fs = 0, __fsh = 0, gs = 0, __gsh = 0}, flags = 0,
> screen_bitmap = 0, cpu_type = 0, int_revectored = {__map = {0, 0, 0,0, 0,
> 0, 0, 2147483648}}, int21_revectored = {__map = {0, 0, 0, 0, 0, 0, 0,
> 0}}}
>
> My version of glibc does not seem to have vm86old declared, so I haven't
> tried to remove the assembly code.
>
> Should I try to change it to use vm86, instead of vm86old?
>

Yes, that would probably be a good idea. To some degree, I guess it
really has nothing to do with the more fundamental issue, but it's
somewhat odd.

I'll pick apart the state above looking for fishiness as soon as I get
back from lunch.

-hpa

2008-02-28 19:54:48

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Thu, Feb 28, 2008 at 11:31:13 -0800, H. Peter Anvin wrote:
> Klaus S. Madsen wrote:
> >>
> >>Typically, when the kernel reflects an error in VM86 mode it will update
> >>the structure in memory (in your case, the vm86plus_struct) to reflect
> >>the context. Would it be possible for you to read it out?
> >Hmm. As far as I can tell, its actually using the vm86old system call?
> >That's at least what the comment in libx86 states.
> >
> >However the contents of struct vm86_struct after the segfault is:
> >
> >(gdb) print context.vm
> >$2 = {regs = {ebx = 0, ecx = 0, edx = 0, esi = 0, edi = 0, ebp = 0,
> > eax = 20227, __null_ds = 0, __null_es = 0, __null_fs = -1071579136,
> > __null_gs = 0, orig_eax = -1, eip = 6326, cs = 49152, __csh = 0,
> > eflags = 209410, esp = 4090, ss = 256, __ssh = 0, es = 0, __esh = 0,
> > ds = 64, __dsh = 0, fs = 0, __fsh = 0, gs = 0, __gsh = 0}, flags = 0,
> > screen_bitmap = 0, cpu_type = 0, int_revectored = {__map = {0, 0, 0,0,
> > 0, 0, 0, 2147483648}}, int21_revectored = {__map = {0, 0, 0, 0, 0,
> > 0, 0, 0}}}
> >
> >My version of glibc does not seem to have vm86old declared, so I haven't
> >tried to remove the assembly code.
> >
> >Should I try to change it to use vm86, instead of vm86old?
> >
>
> Yes, that would probably be a good idea. To some degree, I guess it
> really has nothing to do with the more fundamental issue, but it's
> somewhat odd.
Ok. I tried changing the code from the assembly to do:

vm86(VM86_ENTER, vm)

instead, where vm is a vm86plus_struct structure filled in exactly the
same way as the vm86_struct.

The output from GDB after the segfault is exactly the same as
previously, and all the fields in vm86plus_info_struct are zero.

> I'll pick apart the state above looking for fishiness as soon as I get
> back from lunch.
Great, thanks.

--
Kind regards
Klaus S. Madsen

2008-02-28 22:47:17

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

I just realized something important.

Could you please send me a *complete* strace of the execution of s2ram
from the beginning?

-hpa

2008-02-29 07:00:40

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Thu, Feb 28, 2008 at 14:45:58 -0800, H. Peter Anvin wrote:
> I just realized something important.
>
> Could you please send me a *complete* strace of the execution of s2ram
> from the beginning?
Sure. Here you go:

execve("/sbin/s2ram", ["s2ram"], [/* 20 vars */]) = 0
brk(0) = 0x8058000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f15000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/i686/sse2/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/i686/sse2/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/i686/sse2/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/i686/sse2", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/i686/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/i686/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/i686/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/i686", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/sse2/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/sse2/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/sse2/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/sse2", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/tls/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/tls", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/i686/sse2/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/i686/sse2/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/i686/sse2/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/i686/sse2", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/i686/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/i686/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/i686/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/i686", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/sse2/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/sse2/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/sse2/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/sse2", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/cmov/libx86.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/ksm/downloads/libx86-0.99/cmov", 0xbff318d8) = -1 ENOENT (No such file or directory)
open("/home/ksm/downloads/libx86-0.99/libx86.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0P\6\0\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=23446, ...}) = 0
mmap2(NULL, 11664, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7f12000
mmap2(0xb7f14000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1) = 0xb7f14000
close(3) = 0
open("/home/ksm/downloads/libx86-0.99/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=122255, ...}) = 0
mmap2(NULL, 122255, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ef4000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/libz.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20\31\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=80504, ...}) = 0
mmap2(NULL, 83232, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7edf000
mmap2(0xb7ef3000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13) = 0xb7ef3000
close(3) = 0
open("/home/ksm/downloads/libx86-0.99/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/cmov/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260a\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=1339816, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ede000
mmap2(NULL, 1349136, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7d94000
mmap2(0xb7ed8000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x143) = 0xb7ed8000
mmap2(0xb7edb000, 9744, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7edb000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d93000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7d936b0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7ed8000, 4096, PROT_READ) = 0
munmap(0xb7ef4000, 122255) = 0
open("/dev/mem", O_RDONLY|O_LARGEFILE) = 3
brk(0) = 0x8058000
brk(0x8089000) = 0x8089000
mmap2(NULL, 65536, PROT_READ, MAP_SHARED, 3, 0xf0) = 0xb7f02000
munmap(0xb7f02000, 65536) = 0
close(3) = 0
open("/dev/mem", O_RDONLY|O_LARGEFILE) = 3
mmap2(NULL, 2453, PROT_READ, MAP_SHARED, 3, 0xe0) = 0xb7f11000
munmap(0xb7f11000, 2453) = 0
close(3) = 0
brk(0x8079000) = 0x8079000
open("/dev/tty", O_RDWR|O_LARGEFILE) = 3
ioctl(3, KDGKBTYPE, 0xbff32007) = -1 EINVAL (Invalid argument)
close(3) = 0
open("/dev/tty0", O_RDWR|O_LARGEFILE) = 3
ioctl(3, KDGKBTYPE, 0xbff32007) = 0
ioctl(3, VT_GETSTATE, 0xbff32062) = 0
fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 1), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f11000
write(1, "Switching from vt7 to vt1\n", 26) = 26
open("/dev/tty", O_RDWR|O_LARGEFILE) = 4
ioctl(4, KDGKBTYPE, 0xbff32017) = -1 EINVAL (Invalid argument)
close(4) = 0
open("/dev/tty0", O_RDWR|O_LARGEFILE) = 4
ioctl(4, KDGKBTYPE, 0xbff32017) = 0
ioctl(4, VIDIOC_G_COMP or VT_ACTIVATE, 0x1) = 0
ioctl(4, VIDIOC_S_COMP or VT_WAITACTIVE, 0x1) = 0
open("/proc/sys/kernel/acpi_video_flags", O_RDONLY|O_LARGEFILE) = 5
fstat64(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f10000
read(5, "0\n", 1024) = 2
close(5) = 0
munmap(0xb7f10000, 4096) = 0
open("/proc/sys/kernel/acpi_video_flags", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 5
fstat64(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f10000
write(5, "2", 1) = 1
close(5) = 0
munmap(0xb7f10000, 4096) = 0
open("/dev/zero", O_RDWR) = 5
mmap2(0x1000, 655360, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0x1000
close(5) = 0
open("/dev/mem", O_RDWR) = 5
mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
close(5) = 0
ioperm(0, 0x400, 0x1) = 0
iopl(0x3) = 0
access("/sys/bus/pci", R_OK) = 0
write(1, "Calling get_mode\n", 17) = 17
vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV (core dumped) +++

--
Kind regards
Klaus S. Madsen

2008-02-29 21:06:53

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Klaus S. Madsen wrote:
> open("/dev/mem", O_RDWR) = 5
> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
                         ^^^^^^^^^^^^^^^^^^^^
> close(5) = 0
> ioperm(0, 0x400, 0x1) = 0
> iopl(0x3) = 0
> access("/sys/bus/pci", R_OK) = 0
> write(1, "Calling get_mode\n", 17) = 17
> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV (core dumped) +++

This is the VGA BIOS being mapped; it's mapped PROT_READ|PROT_WRITE, but
not PROT_EXEC. If the kernel is NX-capable it *should* segfault trying
to execute out of this area, which is exactly what will happen when vm86
executes INT 10h.

It would be interesting to find that mmap() in the s2ram source code and
add PROT_EXEC to it.
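
To make the reading of that strace line concrete, here is a minimal
sketch, assuming a Linux <sys/mman.h>. The helper names are hypothetical
and are not part of s2ram or libx86; they only show which bit is missing
from the second mmap2() call and how its page offset relates to the
physical address.

```c
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical helper: true if a mapping created with these protection
 * bits may be executed from once NX is enforced. */
static bool prot_allows_exec(int prot)
{
    return (prot & PROT_EXEC) != 0;
}

/* mmap2() takes its file offset in 4096-byte units, so the last
 * argument 0xa0 in the trace corresponds to byte offset 0xa0000. */
static unsigned long mmap2_offset_to_bytes(unsigned long pgoff)
{
    return pgoff * 4096UL;
}
```

With these, prot_allows_exec(PROT_READ | PROT_WRITE) is false, which is
the case hit by the 0xa0000 mapping, while the first mapping in the
trace (the one with PROT_EXEC) would pass the check.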

-hpa

2008-02-29 21:27:22

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* H. Peter Anvin <[email protected]> wrote:

> Klaus S. Madsen wrote:
>> open("/dev/mem", O_RDWR) = 5
>> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
>> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
> ^^^^^^^^^^^^^^^^^^^^
>> close(5) = 0
>> ioperm(0, 0x400, 0x1) = 0
>> iopl(0x3) = 0
>> access("/sys/bus/pci", R_OK) = 0
>> write(1, "Calling get_mode\n", 17) = 17
>> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
>> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>> +++ killed by SIGSEGV (core dumped) +++
>
> This is the VGA BIOS being mapped, it's mapped PROT_READ|PROT_WRITE, but
> no PROT_EXEC; if the kernel is NX-capable it *should* segfault trying to
> execute out of this area, which is exactly what will happen when vm86
> executes INT 10h.
>
> If we can find that mmap() in the s2ram source code and add PROT_EXEC
> to it, it would be interesting.

Klaus, could you send your .config as well? Let's make sure that NX is
even relevant in this context.

Ingo

2008-03-01 01:20:33

by Rafael J. Wysocki

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Friday, 29 of February 2008, H. Peter Anvin wrote:
> Klaus S. Madsen wrote:
> > open("/dev/mem", O_RDWR) = 5
> > mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
> > mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
> ^^^^^^^^^^^^^^^^^^^^
> > close(5) = 0
> > ioperm(0, 0x400, 0x1) = 0
> > iopl(0x3) = 0
> > access("/sys/bus/pci", R_OK) = 0
> > write(1, "Calling get_mode\n", 17) = 17
> > vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
> > --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > +++ killed by SIGSEGV (core dumped) +++
>
> This is the VGA BIOS being mapped, it's mapped PROT_READ|PROT_WRITE, but
> no PROT_EXEC; if the kernel is NX-capable it *should* segfault trying
> to execute out of this area, which is exactly what will happen when vm86
> executes INT 10h.
>
> If we can find that mmap() in the s2ram source code and add PROT_EXEC to
> it, it would be interesting.

This is in radeontool.c, line 91, AFAICS.

Thanks,
Rafael

2008-03-01 09:45:44

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Fri, Feb 29, 2008 at 22:26:54 +0100, Ingo Molnar wrote:
>
> * H. Peter Anvin <[email protected]> wrote:
>
> > Klaus S. Madsen wrote:
> >> open("/dev/mem", O_RDWR) = 5
> >> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
> >> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
> > ^^^^^^^^^^^^^^^^^^^^
> >> close(5) = 0
> >> ioperm(0, 0x400, 0x1) = 0
> >> iopl(0x3) = 0
> >> access("/sys/bus/pci", R_OK) = 0
> >> write(1, "Calling get_mode\n", 17) = 17
> >> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
> >> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> >> +++ killed by SIGSEGV (core dumped) +++
> >
> > This is the VGA BIOS being mapped, it's mapped PROT_READ|PROT_WRITE, but
> > no PROT_EXEC; if the kernel is NX-capable it *should* segfault trying to
> > execute out of this area, which is exactly what will happen when vm86
> > executes INT 10h.
> >
> > If we can find that mmap() in the s2ram source code and add PROT_EXEC
> > to it, it would be interesting.
>
> Klaus, could you send your .config as well? Lets make sure that NX is
> even relevant in this context.
All right. The mmap in question is in the x86-common.c file in libx86,
and adding PROT_EXEC to it solves the problem.

I have attached my .config.

The only thing I don't understand is why this is suddenly a problem with
2.6.25 and not with 2.6.24. Is there a bug in 2.6.24 and earlier
that allows real-mode execution of non-executable pages?

--
Kind regards
Klaus S. Madsen


Attachments:
(No filename) (1.65 kB)
config.t61p.txt (49.38 kB)

2008-03-01 19:54:51

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Klaus S. Madsen wrote:
> On Fri, Feb 29, 2008 at 22:26:54 +0100, Ingo Molnar wrote:
>> * H. Peter Anvin <[email protected]> wrote:
>>
>>> Klaus S. Madsen wrote:
>>>> open("/dev/mem", O_RDWR) = 5
>>>> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
>>>> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
>>> ^^^^^^^^^^^^^^^^^^^^
>>>> close(5) = 0
>>>> ioperm(0, 0x400, 0x1) = 0
>>>> iopl(0x3) = 0
>>>> access("/sys/bus/pci", R_OK) = 0
>>>> write(1, "Calling get_mode\n", 17) = 17
>>>> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
>>>> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>>>> +++ killed by SIGSEGV (core dumped) +++
>>> This is the VGA BIOS being mapped, it's mapped PROT_READ|PROT_WRITE, but
>>> no PROT_EXEC; if the kernel is NX-capable it *should* segfault trying to
>>> execute out of this area, which is exactly what will happen when vm86
>>> executes INT 10h.
>>>
>>> If we can find that mmap() in the s2ram source code and add PROT_EXEC
>>> to it, it would be interesting.
>> Klaus, could you send your .config as well? Lets make sure that NX is
>> even relevant in this context.
> Allright. The mmap in question is in the x86-common.c file in libx86,
> and adding PROT_EXEC to it solves the problem.
>
> I have attached my .config.
>
> The only thing I don't understand is why this is suddenly a problem with
> 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
> that allows real-mode execution of non-executable pages?
>

One wonders, especially since the checkin it was bisected to had
nothing to do with NX. I suspect there is either a bug in the NX logic,
which this checkin inadvertently fixed(!), or there is an ad hoc hack
that should probably never have existed. More investigation is necessary,
but now we know a lot more.

-hpa

2008-03-03 14:45:41

by Pavel Machek

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Hi!

> > > Klaus S. Madsen wrote:
> > >> open("/dev/mem", O_RDWR) = 5
> > >> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
> > >> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
> > > ^^^^^^^^^^^^^^^^^^^^
> > >> close(5) = 0
> > >> ioperm(0, 0x400, 0x1) = 0
> > >> iopl(0x3) = 0
> > >> access("/sys/bus/pci", R_OK) = 0
> > >> write(1, "Calling get_mode\n", 17) = 17
> > >> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
> > >> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > >> +++ killed by SIGSEGV (core dumped) +++
> > >
> > > This is the VGA BIOS being mapped, it's mapped PROT_READ|PROT_WRITE, but
> > > no PROT_EXEC; if the kernel is NX-capable it *should* segfault trying to
> > > execute out of this area, which is exactly what will happen when vm86
> > > executes INT 10h.
> > >
> > > If we can find that mmap() in the s2ram source code and add PROT_EXEC
> > > to it, it would be interesting.
> >
> > Klaus, could you send your .config as well? Lets make sure that NX is
> > even relevant in this context.
> Allright. The mmap in question is in the x86-common.c file in libx86,
> and adding PROT_EXEC to it solves the problem.

Ok, we should probably fix that in s2ram... can you mail a patch to
me and suspend-devel?

> I have attached my .config.
>
> The only thing I don't understand is why this is suddenly a problem with
> 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
> that allows real-mode execution of non-executable pages?

It is strange indeed... Should it be tracked as a regression?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-03-03 15:12:17

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Mon, Mar 03, 2008 at 13:17:35 +0100, Pavel Machek wrote:
> Hi!
>
> > > > Klaus S. Madsen wrote:
> > > >> open("/dev/mem", O_RDWR) = 5
> > > >> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
> > > >> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
> > > > ^^^^^^^^^^^^^^^^^^^^
> > > >> close(5) = 0
> > > >> ioperm(0, 0x400, 0x1) = 0
> > > >> iopl(0x3) = 0
> > > >> access("/sys/bus/pci", R_OK) = 0
> > > >> write(1, "Calling get_mode\n", 17) = 17
> > > >> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
> > > >> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > > >> +++ killed by SIGSEGV (core dumped) +++
> > > >
> > > > This is the VGA BIOS being mapped, it's mapped
> > > > PROT_READ|PROT_WRITE, but no PROT_EXEC; if the kernel is
> > > > NX-capable it *should* segfault trying to execute out of this
> > > > area, which is exactly what will happen when vm86 executes INT
> > > > 10h.
> > > >
> > > > If we can find that mmap() in the s2ram source code and add
> > > > PROT_EXEC to it, it would be interesting.
> > >
> > > Klaus, could you send your .config as well? Lets make sure that NX is
> > > even relevant in this context.
> > Allright. The mmap in question is in the x86-common.c file in libx86,
> > and adding PROT_EXEC to it solves the problem.
>
> Ok, sw should probably fix that in s2ram... can you mail a patch to
> me, and suspend-devel?
As stated above, the problem is actually in libx86, and not in s2ram. I
have included Matthew Garrett in the CC list, as he's hosting the libx86
source code I have used as a base for my patch.

The following patch solves the segfault by changing the mmap protection
of the legacy video memory and BIOS area to allow execution. The patch is
against libx86-0.99, available from http://www.codon.org.uk/~mjg59/libx86/

--- libx86-0.99/x86-common.c 2006-09-08 00:44:27.000000000 +0200
+++ libx86-0.99.new/x86-common.c 2008-03-01 10:08:25.000000000 +0100
@@ -232,7 +232,7 @@
         }
 
         m = mmap((void *)0xa0000, 0x100000 - 0xa0000,
-                 PROT_READ | PROT_WRITE,
+                 PROT_READ | PROT_WRITE | PROT_EXEC,
                  MAP_FIXED | MAP_SHARED, fd_mem, 0xa0000);
 
         if (m == (void *)-1) {
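
For reference, the address arithmetic behind that mmap() call can be
sketched as follows. This is only an illustration: the macro and
function names are made up for the example, and segment 0xc000 as the
location of the video BIOS is the conventional PC layout, assumed here
rather than read from this machine.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEGACY_START 0xa0000u   /* first byte covered by the mmap() */
#define LEGACY_END   0x100000u  /* one past the last byte covered */

/* Real-mode (and vm86) linear address: segment * 16 + offset. */
static uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* True if the linear address falls inside the region that the patch
 * above makes executable. */
static bool in_legacy_mapping(uint32_t linear)
{
    return linear >= LEGACY_START && linear < LEGACY_END;
}
```

LEGACY_END - LEGACY_START is 393216, the length shown in the strace
output, and real_mode_linear(0xc000, 0) is 0xc0000, which lies inside
the mapping. Code fetched from there by vm86 therefore needs the region
to be PROT_EXEC once NX is enforced.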

> > The only thing I don't understand is why this is suddenly a problem with
> > 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
> > that allows real-mode execution of non-executable pages?
>
> It is strange indeed... Should it be traced as an regression?
I honestly don't know. From hpa's comments, it seems as if the segfault
would be the expected behaviour, but as 2.6.24 and earlier don't
segfault, I guess it's a matter of determining whether the original
non-segfaulting behaviour is by design or a bug.

--
Kind regards,
Klaus S. Madsen

2008-03-03 15:41:18

by Rafael J. Wysocki

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Monday, 3 of March 2008, Pavel Machek wrote:
> Hi!
>
> > > > Klaus S. Madsen wrote:
> > > >> open("/dev/mem", O_RDWR) = 5
> > > >> mmap2(NULL, 1282, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED|MAP_FIXED, 5, 0) = 0
> > > >> mmap2(0xa0000, 393216, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 5, 0xa0) = 0xa0000
> > > > ^^^^^^^^^^^^^^^^^^^^
> > > >> close(5) = 0
> > > >> ioperm(0, 0x400, 0x1) = 0
> > > >> iopl(0x3) = 0
> > > >> access("/sys/bus/pci", R_OK) = 0
> > > >> write(1, "Calling get_mode\n", 17) = 17
> > > >> vm86(0x1, 0xb7f14ccc, 0xb7f14830, 0xc000, 0x18b6 <unfinished ...>
> > > >> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > > >> +++ killed by SIGSEGV (core dumped) +++
> > > >
> > > > This is the VGA BIOS being mapped, it's mapped PROT_READ|PROT_WRITE, but
> > > > no PROT_EXEC; if the kernel is NX-capable it *should* segfault trying to
> > > > execute out of this area, which is exactly what will happen when vm86
> > > > executes INT 10h.
> > > >
> > > > If we can find that mmap() in the s2ram source code and add PROT_EXEC
> > > > to it, it would be interesting.
> > >
> > > Klaus, could you send your .config as well? Lets make sure that NX is
> > > even relevant in this context.
> > Allright. The mmap in question is in the x86-common.c file in libx86,
> > and adding PROT_EXEC to it solves the problem.
>
> Ok, sw should probably fix that in s2ram... can you mail a patch to
> me, and suspend-devel?
>
> > I have attached my .config.
> >
> > The only thing I don't understand is why this is suddenly a problem with
> > 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
> > that allows real-mode execution of non-executable pages?
>
> It is strange indeed... Should it be traced as an regression?

I'm tracking it, FWIW. However, it would be good to know whether all of the
previous kernels were buggy and 2.6.25 fixed the problem, or whether it's the
other way around.

Thanks,
Rafael

2008-03-03 17:11:40

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Pavel Machek wrote:
>>
>> The only thing I don't understand is why this is suddenly a problem with
>> 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
>> that allows real-mode execution of non-executable pages?
>
> It is strange indeed... Should it be traced as an regression?
> Pavel

I'd like to understand what the heck happened, but as far as we can
observe right now, it's a *progression*, not a regression, since
executing out of a non-PROT_EXEC area isn't *supposed* to work...

-hpa

2008-03-03 17:47:14

by Pavel Machek

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Mon 2008-03-03 09:10:35, H. Peter Anvin wrote:
> Pavel Machek wrote:
>>>
>>> The only thing I don't understand is why this is suddenly a problem with
>>> 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
>>> that allows real-mode execution of non-executable pages?
>>
>> It is strange indeed... Should it be traced as an regression?
>
> I'd like to understand what the heck happened, but as far as we can observe
> right now, it's a *progression*, not a regression, since executing out of a
> non-PROT_EXEC area isn't *supposed* to work...

Okay, I guess this depends on the eye of the beholder... because s2ram
*is* supposed to work ;-).

Ideally, I'd like to keep 2.6.24 behaviour for at least a while, so we
can try to fix the libx86 out there or something...
Pavel
PS: Matthew, there's a problem in libx86: it tries to execute from an
area not marked as PROT_EXEC.
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-03-03 17:49:38

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* Klaus S. Madsen <[email protected]> wrote:

> The following patch solves the segfault, by changing the mmap flags of
> the video memory area, to allow execution. The patch is against
> libx86-0.99 available from http://www.codon.org.uk/~mjg59/libx86/
>
> --- libx86-0.99/x86-common.c 2006-09-08 00:44:27.000000000 +0200
> +++ libx86-0.99.new/x86-common.c 2008-03-01 10:08:25.000000000 +0100
> @@ -232,7 +232,7 @@
> }
>
> m = mmap((void *)0xa0000, 0x100000 - 0xa0000,
> - PROT_READ | PROT_WRITE,
> + PROT_READ | PROT_WRITE | PROT_EXEC,

are you sure you ID-ed the right commit that broke things?

while requiring PROT_EXEC is fine, breaking existing user-space apps
over that is not fine. So are you absolutely sure that by reverting that
PWT|PCD commit, s2ram again starts to work? That's utmost weird...

perhaps there's some CPU bug that causes NX to _NOT_ work if only PCD is
used (not PCD|PWT). Seems like a pretty unlikely scenario though.

Ingo

2008-03-03 17:51:03

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Pavel Machek wrote:
> On Mon 2008-03-03 09:10:35, H. Peter Anvin wrote:
>> Pavel Machek wrote:
>>>> The only thing I don't understand is why this is suddenly a problem with
>>>> 2.6.25, and not with 2.6.24? Is there a bug in 2.6.24 and previously
>>>> that allows real-mode execution of non-executable pages?
>>> It is strange indeed... Should it be traced as an regression?
>> I'd like to understand what the heck happened, but as far as we can observe
>> right now, it's a *progression*, not a regression, since executing out of a
>> non-PROT_EXEC area isn't *supposed* to work...
>
> Okay, I guess this depends on the eye of the beholder... because s2ram
> *is* supposed to work ;-).
>
> Ideally, I'd like to keep 2.6.24 behaviour for at least a while, so we
> can try to fix the libx86 out there or something...
> Pavel
> PS: Matthew, there's problem in libx86: it tries to execute from area
> not marked as PROT_EXEC.

Allowing execution of a non-PROT_EXEC area is a security hole. The fact
that you happened to benefit from it doesn't change its nature as a
security hole.

-hpa

2008-03-03 17:53:51

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Ingo Molnar wrote:
> * Klaus S. Madsen <[email protected]> wrote:
>
>> The following patch solves the segfault, by changing the mmap flags of
>> the video memory area, to allow execution. The patch is against
>> libx86-0.99 available from http://www.codon.org.uk/~mjg59/libx86/
>>
>> --- libx86-0.99/x86-common.c 2006-09-08 00:44:27.000000000 +0200
>> +++ libx86-0.99.new/x86-common.c 2008-03-01 10:08:25.000000000 +0100
>> @@ -232,7 +232,7 @@
>> }
>>
>> m = mmap((void *)0xa0000, 0x100000 - 0xa0000,
>> - PROT_READ | PROT_WRITE,
>> + PROT_READ | PROT_WRITE | PROT_EXEC,
>
> are you sure you ID-ed the right commit that broke things?
>
> while requiring PROT_EXEC is fine, breaking existing user-space apps
> over that is not fine. So are you absolutely sure that by reverting that
> PWT|PCD commit, s2ram again starts to work? That's utmost weird...
>
> perhaps there's some CPU bug that causes NX to _NOT_ work if only PCD is
> used (not PCD|PWT). Seems like a pretty unlikely scenario though.
>

It really does. What would be much more likely is that the PCD ->
(PCD|PWT) triggered something in the kernel proper.
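
To see why the flag change by itself looks unrelated to NX, here is a
small sketch of the relevant x86 page-table entry bits. The bit
positions are the architectural ones; the PTE_* names and the helper are
chosen for this example and are not the kernel's own identifiers.

```c
#include <stdint.h>

#define PTE_PRESENT  (1ULL << 0)
#define PTE_RW       (1ULL << 1)
#define PTE_PWT      (1ULL << 3)   /* page-level write-through */
#define PTE_PCD      (1ULL << 4)   /* page-level cache disable */
#define PTE_ACCESSED (1ULL << 5)
#define PTE_DIRTY    (1ULL << 6)
#define PTE_NX       (1ULL << 63)  /* no-execute; PAE and 64-bit only */

/* Bits that differ between two sets of PTE flags. */
static uint64_t pte_flag_diff(uint64_t before, uint64_t after)
{
    return before ^ after;
}
```

Comparing a NOCACHE-style flag set before and after adding PTE_PWT
yields exactly PTE_PWT: bit 3 changes and bit 63 is untouched, so any
effect on NX would have to come from somewhere else in the kernel.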

-hpa

2008-03-03 17:54:12

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* H. Peter Anvin <[email protected]> wrote:

>> It is strange indeed... Should it be traced as an regression?
>
> I'd like to understand what the heck happened, but as far as we can
> observe right now, it's a *progression*, not a regression, since
> executing out of a non-PROT_EXEC area isn't *supposed* to work...

if s2ram worked before, and breaks with v2.6.25 then it's a regression.
It doesn't really matter that this segfault is the right thing to do ...

what would be _really_ nice to know is how this all links up to the PCD ->
PCD|PWT commit. I.e. how did the commit below suddenly _trigger_ this
segfault? Are we absolutely sure that the bisection is correct, that
it's this commit which broke things? [i.e. if this commit is unapplied,
does s2ram work?]

there are a few "nearby" commits in the bisection tree:

commit 6c3866558213ff706d8331053386915371ad63ec
Author: Jeremy Fitzhardinge <[email protected]>
Date: Wed Jan 30 13:32:55 2008 +0100

x86: move all asm/pgtable constants into one place

commit 61f38226def55d972cfd0e789971e952525ff8e5
Author: Ingo Molnar <[email protected]>
Date: Wed Jan 30 13:32:55 2008 +0100

x86/pgtable: fix constant sign extension problem

commit dcbae6b377d78190954055ef2d8909ae83ff57de
Author: Ingo Molnar <[email protected]>
Date: Wed Jan 30 13:32:55 2008 +0100

x86/pgtable: unify pagetable accessors, #1

which are far more likely candidates IMO of causing this NX related
segfault than the PCD->PCD|PWT change that was bisected originally.
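
Nothing in this thread establishes that any of these commits is the
cause. Purely as an illustration of the class of problem named in the
second commit title, the following sketch shows how the type of a mask
constant can decide whether bit 63 (NX) of a 64-bit PTE survives. The
names are hypothetical and the code is not taken from the kernel.

```c
#include <stdint.h>

#define PTE_RW  0x002ULL       /* x86 PTE bit 1: writable */
#define PTE_NX  (1ULL << 63)   /* x86 PTE bit 63: no-execute */

/* Mask built from an unsigned 32-bit constant: ~0x2u is 0xfffffffd,
 * which is zero-extended to 64 bits and so clears bits 32..63. */
static uint64_t clear_rw_unsigned32(uint64_t pte)
{
    return pte & ~0x2u;
}

/* Mask built from a signed int: ~0x2 is -3, which is sign-extended to
 * 0xfffffffffffffffd and leaves the upper bits alone. */
static uint64_t clear_rw_signed(uint64_t pte)
{
    return pte & ~0x2;
}
```

Applied to PTE_NX | PTE_RW, the first variant silently drops PTE_NX
along with PTE_RW, while the second clears only PTE_RW. A slip of this
kind would make pages executable (or not) independently of what the
caller asked for.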

Ingo

---------->
commit 82bc03fc158e28c90d7ed9919410776039cb4e14
Author: Ingo Molnar <[email protected]>
Date: Wed Jan 30 13:32:54 2008 +0100

x86: add PWT to NOCACHE flags

add PWT bit to NOCACHE flags. No real difference to CPUs, but needed
later for PAT.

Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

diff --git a/include/asm-x86/pgtable_32.h b/include/asm-x86/pgtable_32.h
index a8be1ee..16da5d5 100644
--- a/include/asm-x86/pgtable_32.h
+++ b/include/asm-x86/pgtable_32.h
@@ -156,7 +156,7 @@ void paging_init(void);
extern unsigned long long __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
#define __PAGE_KERNEL_RO (__PAGE_KERNEL & ~_PAGE_RW)
#define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
-#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD)
+#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
#define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
#define __PAGE_KERNEL_LARGE_EXEC (__PAGE_KERNEL_EXEC | _PAGE_PSE)

diff --git a/include/asm-x86/pgtable_64.h b/include/asm-x86/pgtable_64.h
index 3f28010..9c9cddf 100644
--- a/include/asm-x86/pgtable_64.h
+++ b/include/asm-x86/pgtable_64.h
@@ -189,13 +189,13 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, unsigned long
#define __PAGE_KERNEL_EXEC \
(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED)
#define __PAGE_KERNEL_NOCACHE \
- (_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_PCD | _PAGE_ACCESSED | _PAGE_NX)
+ (_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_PCD | _PAGE_PWT | _PAGE_ACCESSED | _PAGE_NX)
#define __PAGE_KERNEL_RO \
(_PAGE_PRESENT | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_NX)
#define __PAGE_KERNEL_VSYSCALL \
(_PAGE_PRESENT | _PAGE_USER | _PAGE_ACCESSED)
#define __PAGE_KERNEL_VSYSCALL_NOCACHE \
- (_PAGE_PRESENT | _PAGE_USER | _PAGE_ACCESSED | _PAGE_PCD)
+ (_PAGE_PRESENT | _PAGE_USER | _PAGE_ACCESSED | _PAGE_PCD | _PAGE_PWT)
#define __PAGE_KERNEL_LARGE \
(__PAGE_KERNEL | _PAGE_PSE)
#define __PAGE_KERNEL_LARGE_EXEC \

2008-03-03 17:59:41

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Ingo Molnar wrote:
>
> if s2ram worked before, and breaks with v2.6.25 then it's a regression.
> It doesnt really matter that this segfault is the right thing to do ...
>

I'm sorry, I have to disagree with this one, simply because it's a
security model violation. We can't just say "well, app X does this even
though it's forbidden, well, let's just put in a security hole for it."

-hpa

2008-03-03 20:52:38

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Mon, Mar 03, 2008 at 18:48:58 +0100, Ingo Molnar wrote:
>
> * Klaus S. Madsen <[email protected]> wrote:
>
> > The following patch solves the segfault, by changing the mmap flags of
> > the video memory area, to allow execution. The patch is against
> > libx86-0.99 available from http://www.codon.org.uk/~mjg59/libx86/
> >
> > --- libx86-0.99/x86-common.c 2006-09-08 00:44:27.000000000 +0200
> > +++ libx86-0.99.new/x86-common.c 2008-03-01 10:08:25.000000000 +0100
> > @@ -232,7 +232,7 @@
> > }
> >
> > m = mmap((void *)0xa0000, 0x100000 - 0xa0000,
> > - PROT_READ | PROT_WRITE,
> > + PROT_READ | PROT_WRITE | PROT_EXEC,
>
> are you sure you ID-ed the right commit that broke things?
I can't be sure. It was my third attempt, and there seems to be some
sort of Makefile trouble in that area, which causes the problem to
appear and disappear at random, unless I do a make clean && make. But
the triggering commit was found with make clean && make, and I made sure
that reverting the resulting commit did actually solve the problem...

However, I wasn't able to make the problem go away by removing the
_PAGE_PWT constants from __PAGE_KERNEL_NOCACHE and
__PAGE_KERNEL_VSYSCALL_NOCACHE in include/asm-x86/pgtable.h in the newest
2.6.25:

diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
index 174b877..f81c968 100644
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -84,9 +84,9 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
#define __PAGE_KERNEL_RO (__PAGE_KERNEL & ~_PAGE_RW)
#define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
#define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
-#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD)
#define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
-#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD)
#define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
#define __PAGE_KERNEL_LARGE_EXEC (__PAGE_KERNEL_EXEC | _PAGE_PSE)

So while I'm fairly confident that I bisected correctly, the number
of attempts I had to go through to get a reliable result, and the fact
that I cannot make the problem go away by reverting the current code to
something similar, count quite a lot against me.

However I'm 100% confident that the problem appears between
cf8fa920cb4271f17e0265c863d64bea1b31941a and
925596a017bbd045ff711b778256f459e50a119, which is something like 16
commits. I have been at both points in the tree at least 2 times, and
confirmed that cf8fa920cb4271f17e0265c863d64bea1b31941a worked for me,
and 925596a017bbd045ff711b778256f459e50a119 didn't.

> while requiring PROT_EXEC is fine, breaking existing user-space apps
> over that is not fine. So are you absolutely sure that by reverting that
> PWT|PCD commit, s2ram again starts to work? That's utmost weird...
I'm sure that it fixed the problem for me, yes, and I'm fairly confident
that I ran make clean && make to compile the kernel during the entire
bisection between the two commits mentioned above.

> perhaps there's some CPU bug that causes NX to _NOT_ work if only PCD is
> used (not PCD|PWT). Seems like a pretty unlikely scenario though.
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
stepping : 10

But I'm a bit puzzled by the fact that I'm apparently the only one who
has encountered the problem. Maybe it's only a problem if one also uses
PAE? (That's just a wild guess to explain why I'm the only one seeing
this.)

--
Kind regards
Klaus S. Madsen

2008-03-03 20:59:54

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Klaus S. Madsen wrote:
> $ cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
> stepping : 10
>
> But I'm a bit puzzled by the fact that I'm aparently the only one how
> have encountered the problem? Maybe it's only a problem if one also uses
> PAE? (Thats just a wild guess to explain why I'm the only one seeing
> this).

NX protection is only available with PAE page tables (or on 64-bit
kernels).
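
A quick way to check whether a CPU advertises both features is to look
for "pae" and "nx" in the flags line of /proc/cpuinfo. The sketch below
is a hypothetical helper for that check, not code from s2ram or the
kernel, and the flag strings used with it are made-up samples. Note that
the flags only show what the CPU can do; whether a 32-bit kernel
actually enforces NX also depends on it being built with PAE support.

```c
#include <stdbool.h>
#include <string.h>

/* True if `name` appears as a whole space-separated word in `flags`,
 * e.g. the value part of the "flags" line in /proc/cpuinfo. */
static bool has_cpu_flag(const char *flags, const char *name)
{
    size_t n = strlen(name);
    const char *p = flags;

    if (n == 0)
        return false;
    while ((p = strstr(p, name)) != NULL) {
        bool starts = (p == flags) || (p[-1] == ' ');
        bool ends = (p[n] == '\0') || (p[n] == ' ') || (p[n] == '\n');
        if (starts && ends)
            return true;
        p += n;  /* skip this partial match and keep searching */
    }
    return false;
}
```

For a sample string such as "fpu vme pae nx lm", both
has_cpu_flag(flags, "pae") and has_cpu_flag(flags, "nx") return true; a
word that merely begins with "nx" does not match.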

-hpa

2008-03-03 21:05:25

by Pavel Machek

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Hi!

> However I wasn't able to make the problem go away, by removing the
> _PAGE_PWT constants from __PAGE_KERNEL_NOCACHE and
> __PAGE_KERNEL_VSYSCALL_NOCACHE in include-asm/pgtable.h in the newest
> 2.6.25:
>
> diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
> index 174b877..f81c968 100644
> --- a/include/asm-x86/pgtable.h
> +++ b/include/asm-x86/pgtable.h
> @@ -84,9 +84,9 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
> #define __PAGE_KERNEL_RO (__PAGE_KERNEL & ~_PAGE_RW)
> #define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
> #define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
> -#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
> +#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD)
> #define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
> -#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
> +#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD)
> #define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
> #define __PAGE_KERNEL_LARGE_EXEC (__PAGE_KERNEL_EXEC | _PAGE_PSE)
>
> So while I'm fairly confident in that I bisected correctly, the number
> of attempts I had to go through to get a reliable result, and the fact
> that I cannot make the problem go away by reverting the current code to
> something similar, counts quite a lot against me.
>
> However I'm 100% confident that the problem appears between
> cf8fa920cb4271f17e0265c863d64bea1b31941a and
> 925596a017bbd045ff711b778256f459e50a119, which is something like 16
> commits. I have been at both points in the tree at least 2 times, and
> confirmed that cf8fa920cb4271f17e0265c863d64bea1b31941a worked for me,
> and 925596a017bbd045ff711b778256f459e50a119 didn't.
>
> > while requiring PROT_EXEC is fine, breaking existing user-space apps
> > over that is not fine. So are you absolutely sure that by reverting that
> > PWT|PCD commit, s2ram again starts to work? That's utmost weird...
> I'm sure that it fixed the problem for me, yes, and I'm fairly confident
> that I ran make clean && make to compile the kernel during the entire
> bisection between the two commites mentioned above.
>
> > perhaps there's some CPU bug that causes NX to _NOT_ work if only PCD is
> > used (not PCD|PWT). Seems like a pretty unlikely scenario though.
> $ cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
> stepping : 10
>
> But I'm a bit puzzled by the fact that I'm apparently the only one who
> has encountered the problem? Maybe it's only a problem if one also uses
> PAE? (That's just a wild guess to explain why I'm the only one seeing
> this).

I do not have an NX-capable CPU in the machine I use for suspend
testing. ... and you are using 32-bit userspace on a 64-bit kernel,
right?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-03-03 21:06:30

by Pavel Machek

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Mon 2008-03-03 22:05:26, Pavel Machek wrote:
> Hi!
>
> > However I wasn't able to make the problem go away, by removing the
> > _PAGE_PWT constants from __PAGE_KERNEL_NOCACHE and
> > __PAGE_KERNEL_VSYSCALL_NOCACHE in include/asm-x86/pgtable.h in the newest
> > 2.6.25:
> >
> > diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
> > index 174b877..f81c968 100644
> > --- a/include/asm-x86/pgtable.h
> > +++ b/include/asm-x86/pgtable.h
> > @@ -84,9 +84,9 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
> > #define __PAGE_KERNEL_RO (__PAGE_KERNEL & ~_PAGE_RW)
> > #define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
> > #define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
> > -#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
> > +#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD)
> > #define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
> > -#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
> > +#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD)
> > #define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
> > #define __PAGE_KERNEL_LARGE_EXEC (__PAGE_KERNEL_EXEC | _PAGE_PSE)
> >
> > So while I'm fairly confident that I bisected correctly, the number
> > of attempts I had to go through to get a reliable result, and the fact
> > that I cannot make the problem go away by reverting the current code to
> > something similar, counts quite a lot against me.
> >
> > However I'm 100% confident that the problem appears between
> > cf8fa920cb4271f17e0265c863d64bea1b31941a and
> > 925596a017bbd045ff711b778256f459e50a119, which is something like 16
> > commits. I have been at both points in the tree at least 2 times, and
> > confirmed that cf8fa920cb4271f17e0265c863d64bea1b31941a worked for me,
> > and 925596a017bbd045ff711b778256f459e50a119 didn't.
> >
> > > while requiring PROT_EXEC is fine, breaking existing user-space apps
> > > over that is not fine. So are you absolutely sure that by reverting that
> > > PWT|PCD commit, s2ram again starts to work? That's utmost weird...
> > I'm sure that it fixed the problem for me, yes, and I'm fairly confident
> > that I ran make clean && make to compile the kernel during the entire
> > bisection between the two commits mentioned above.
> >
> > > perhaps there's some CPU bug that causes NX to _NOT_ work if only PCD is
> > > used (not PCD|PWT). Seems like a pretty unlikely scenario though.
> > $ cat /proc/cpuinfo
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 15
> > model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
> > stepping : 10
> >
> > But I'm a bit puzzled by the fact that I'm apparently the only one who
> > has encountered the problem? Maybe it's only a problem if one also uses
> > PAE? (That's just a wild guess to explain why I'm the only one seeing
> > this).
>
> I do not have an NX-capable CPU in the machine I use for suspend
> testing. ... and you are using 32-bit userspace on a 64-bit kernel,
> right?

Sorry, I meant "32-bit userspace on a 32-bit kernel"... but on a
64-bit capable CPU.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-03-03 21:06:49

by Matthew Garrett

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Mon, Mar 03, 2008 at 10:05:26PM +0100, Pavel Machek wrote:

> I do not have an NX-capable CPU in the machine I use for suspend
> testing. ... and you are using 32-bit userspace on a 64-bit kernel,
> right?

Not if he's expecting vm86 to work...

--
Matthew Garrett | [email protected]

2008-03-03 21:22:51

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Mon, Mar 03, 2008 at 22:06:47 +0100, Pavel Machek wrote:
> > > But I'm a bit puzzled by the fact that I'm apparently the only one who
> > > has encountered the problem? Maybe it's only a problem if one also uses
> > > PAE? (That's just a wild guess to explain why I'm the only one seeing
> > > this).
> >
> > I do not have an NX-capable CPU in the machine I use for suspend
> > testing. ... and you are using 32-bit userspace on a 64-bit kernel,
> > right?
>
> Sorry, I meant "32-bit userspace on a 32-bit kernel"... but on a
> 64-bit capable CPU.
Yes. With 4 GB of RAM, so I use PAE, which is also required for NX. Just
to be sure, I disabled PAE, and sure enough, the problem disappeared.

I'll double-check that the commit I fingered actually caused the
problem, but I will not have time for this before tomorrow evening.

Just to be sure: if I want to check out the revision just before the bad
commit, I can do something like:

git checkout commit-id~1

Right?

Thanks for spending time on this.

--
Kind regards
Klaus S. Madsen

2008-03-04 12:36:42

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* Klaus S. Madsen <[email protected]> wrote:

> So while I'm fairly confident that I bisected correctly, the number
> of attempts I had to go through to get a reliable result, and the fact
> that I cannot make the problem go away by reverting the current code
> to something similar, counts quite a lot against me.
>
> However I'm 100% confident that the problem appears between
> cf8fa920cb4271f17e0265c863d64bea1b31941a and
> 925596a017bbd045ff711b778256f459e50a119, which is something like 16
> commits. I have been at both points in the tree at least 2 times, and
> confirmed that cf8fa920cb4271f17e0265c863d64bea1b31941a worked for me,
> and 925596a017bbd045ff711b778256f459e50a119 didn't.

my guess would be that it's this commit that causes it:

| commit 6c3866558213ff706d8331053386915371ad63ec
| Author: Jeremy Fitzhardinge <[email protected]>
| Date: Wed Jan 30 13:32:55 2008 +0100
|
| x86: move all asm/pgtable constants into one place

> But I'm a bit puzzled by the fact that I'm apparently the only one who
> has encountered the problem? Maybe it's only a problem if one also
> uses PAE? (That's just a wild guess to explain why I'm the only one
> seeing this).

PAE activates NX on 32-bit. So we probably had an NX regression that got
fixed by the side-effects of one of the unifications. Does it start
working if you disable NX via the noexec=off boot option?

Ingo

2008-03-04 12:43:17

by Ingo Molnar

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending


* Klaus S. Madsen <[email protected]> wrote:

> > are you sure you ID-ed the right commit that broke things?
>
> I can't be sure. It was my third attempt, and there seems to be some
> sort of Makefile trouble in that area, which causes the problem to
> appear and disappear at random, unless I do a make clean && make. But
> the triggering commit was found with make clean && make, and I made
> sure that reverting the resulting commit did actually solve the
> problem...

btw., even if it turns out to be the wrong commit, you sure poked in the
right general area. This is one of the recurring problems with git
bisection: a small mistake near the end of a long bisection session can
point to the wrong commit. Especially with more sporadic failure modes
it can be quite a challenge.

Ingo

2008-03-04 21:59:07

by Klaus S. Madsen

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Tue, Mar 04, 2008 at 13:36:02 +0100, Ingo Molnar wrote:
> my guess would be that it's this commit that causes it:
>
> | commit 6c3866558213ff706d8331053386915371ad63ec
> | Author: Jeremy Fitzhardinge <[email protected]>
> | Date: Wed Jan 30 13:32:55 2008 +0100
> |
> | x86: move all asm/pgtable constants into one place
All right. I just tried the following:

git checkout 82bc03fc158e28c90d7ed9919410776039cb4e14
make clean && make -j3 && sudo make install modules_install

And the resulting kernel works OK. However, if I check out commit
6c3866558213ff706d8331053386915371ad63ec and compile it using the
same command line, s2ram fails, so it seems you're right.

> > But I'm a bit puzzled by the fact that I'm apparently the only one who
> > has encountered the problem? Maybe it's only a problem if one also
> > uses PAE? (That's just a wild guess to explain why I'm the only one
> > seeing this).
>
> PAE activates NX on 32-bit. So we probably had an NX regression that got
> fixed by the side-effects of one of the unifications. Does it start
> working if you disable NX via the noexec=off boot option?
If I boot with noexec=off, s2ram starts working again.

--
Kind regards
Klaus S. Madsen

2008-03-04 22:16:10

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Klaus S. Madsen wrote:
>>> But I'm a bit puzzled by the fact that I'm apparently the only one who
>>> has encountered the problem? Maybe it's only a problem if one also
>>> uses PAE? (That's just a wild guess to explain why I'm the only one
>>> seeing this).
>> PAE activates NX on 32-bit. So we probably had an NX regression that got
>> fixed by the side-effects of one of the unifications. Does it start
>> working if you disable NX via the noexec=off boot option?
> If I boot with noexec=off, s2ram starts working again.

As one can expect.

-hpa

2008-03-04 23:16:50

by Jeremy Fitzhardinge

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Ingo Molnar wrote:
> * Klaus S. Madsen <[email protected]> wrote:
>
>
>> So while I'm fairly confident that I bisected correctly, the number
>> of attempts I had to go through to get a reliable result, and the fact
>> that I cannot make the problem go away by reverting the current code
>> to something similar, counts quite a lot against me.
>>
>> However I'm 100% confident that the problem appears between
>> cf8fa920cb4271f17e0265c863d64bea1b31941a and
>> 925596a017bbd045ff711b778256f459e50a119, which is something like 16
>> commits. I have been at both points in the tree at least 2 times, and
>> confirmed that cf8fa920cb4271f17e0265c863d64bea1b31941a worked for me,
>> and 925596a017bbd045ff711b778256f459e50a119 didn't.
>>
>
> my guess would be that it's this commit that causes it:
>
> | commit 6c3866558213ff706d8331053386915371ad63ec
> | Author: Jeremy Fitzhardinge <[email protected]>
> | Date: Wed Jan 30 13:32:55 2008 +0100
> |
> | x86: move all asm/pgtable constants into one place
>
>
>> But I'm a bit puzzled by the fact that I'm apparently the only one who
>> has encountered the problem? Maybe it's only a problem if one also
>> uses PAE? (That's just a wild guess to explain why I'm the only one
>> seeing this).
>>
>
> PAE activates NX on 32-bit. So we probably had an NX regression that got
> fixed by the side-effects of one of the unifications. Does it start
> working if you disable NX via the noexec=off boot option?

What's the state of play here? Is the upshot that this change fixed a bug
which broke s2ram, or caused a bug which broke s2ram?

J

2008-03-04 23:23:31

by H. Peter Anvin

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

Jeremy Fitzhardinge wrote:
>>
>> PAE activates NX on 32-bit. So we probably had an NX regression that
>> got fixed by the side-effects of one of the unifications. Does it
>> start working if you disable NX via the noexec=off boot option?
>
> What's the state of play here? Is the upshot that this change fixed a bug
> which broke s2ram, or caused a bug which broke s2ram?
>

As far as I can tell, this change fixed a bug, and the fact that the bug
was fixed triggered a s2ram bug.

-hpa

2008-03-04 23:31:39

by Jeremy Fitzhardinge

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

H. Peter Anvin wrote:
> As far as I can tell, this change fixed a bug, and the fact that the
> bug was fixed triggered a s2ram bug.

OK. Doesn't surprise me. There were a few cases of dubious pte
handling which could have accidentally lost the NX bit.

J

2008-03-04 23:34:54

by Matthew Garrett

Subject: Re: Regression in 2.6.25-rc3: s2ram segfaults before suspending

On Tue, Mar 04, 2008 at 03:11:17PM -0800, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
> >>
> >>PAE activates NX on 32-bit. So we probably had an NX regression that
> >>got fixed by the side-effects of one of the unifications. Does it
> >>start working if you disable NX via the noexec=off boot option?
> >
> >What's the state of play here? Is the upshot that this change fixed a bug
> >which broke s2ram, or caused a bug which broke s2ram?
> >
>
> As far as I can tell, this change fixed a bug, and the fact that the bug
> was fixed triggered a s2ram bug.

Strictly a libx86 bug, so I'll try to get an updated version uploaded in
the near future. This won't hit x86_64 users, since the code is run
through x86emu rather than executed directly.

--
Matthew Garrett | [email protected]