2001-11-07 23:23:42

by David Chandler

Subject: Bug Report: Dereferencing a bad pointer

Bug Report

Summary:
Dereferencing a bad pointer in user space hangs rather than causing a
segmentation fault in 2.4.x kernels.

Keywords:
memory protection address dereference segmentation fault SIGSEGV


Full Description:

The following one-line C program, when compiled by gcc 2.96 without
optimization, should produce a SIGSEGV segmentation fault (on a machine
with 3 or less gigabytes of virtual memory, at least):

int main() { int k = *(int *)0xc0000000; }

However, it does not do so under 2.4.x -- it does cause a seg fault
under 2.2.x kernels.

Specifically, no seg fault occurs under kernels 2.4.2-2 (Red Hat build),
2.4.13, 2.4.13UML, 2.4.9UML, or 2.4.8UML. This one-liner does cause a
seg fault on 2.2.5-15 (Red Hat build) and 2.2.14-5.0 (Red Hat build).
All these were run on Pentium II, Pentium III, and Pentium 4 chips.
The "UML" kernels are Linus's official releases patched with the
user-mode linux patches and run on a Red Hat 7.1 2.4.2-2 Pentium 4 host;
Tom's rtbt was the UML file system.

Note that UML uses arch/um rather than arch/i386; this seems to remove
some suspicion from 'arch/i386/mm/fault.c', which has changed
considerably from 2.2.x to 2.4.x.

Rather than seg faulting, the 2.4.x kernels just sit at the offending
dereference until you interrupt the process. Interruption works
flawlessly; you can use 'kill -INT', 'kill -SEGV' or 'kill -BUS' to
interrupt the process.
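
For anyone trying to reproduce this, the three possible outcomes (fault,
normal exit, hang) can be told apart without a debugger by doing the read
in a child process with a timeout. The following is only a sketch,
assuming POSIX fork(), waitpid() and alarm(); the names probe_read and
enum probe_result are made up for illustration, and the 2-second timeout
is arbitrary. It relies on the hung process still being killable by a
signal, as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names, not part of the original report. */
enum probe_result { PROBE_READ_OK, PROBE_SEGV, PROBE_HUNG, PROBE_OTHER };

/* Read one int from addr in a child process and classify what happened. */
static enum probe_result probe_read(uintptr_t addr)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return PROBE_OTHER;

    if (pid == 0) {
        /* Child: make sure both signals keep their default (fatal) action. */
        signal(SIGSEGV, SIG_DFL);
        signal(SIGALRM, SIG_DFL);
        alarm(2);                 /* arbitrary timeout, covers the hang case */
        volatile int k = *(volatile int *)addr;
        (void)k;
        _exit(0);                 /* the read completed */
    }

    /* Parent: wait for the child and look at how it ended. */
    if (waitpid(pid, &status, 0) != pid)
        return PROBE_OTHER;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return PROBE_READ_OK;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
        return PROBE_SEGV;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return PROBE_HUNG;
    return PROBE_OTHER;
}
```

Calling probe_read((uintptr_t)0xc0000000UL) from main() would be expected
to return PROBE_SEGV where the fault is delivered, and PROBE_HUNG on a
kernel that shows the behaviour reported here.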

Please Cc: me on any responses -- the linux-kernel traffic is too much
for me.


David Chandler

--

_____
David L. Chandler. GrammaTech, Inc.
mailto:[email protected] http://www.grammatech.com




2001-11-07 23:40:42

by Benjamin LaHaise

Subject: Re: Bug Report: Dereferencing a bad pointer

On Wed, Nov 07, 2001 at 06:23:13PM -0500, David Chandler wrote:
> The following one-line C program, when compiled by gcc 2.96 without
> optimization, should produce a SIGSEGV segmentation fault (on a machine
> with 3 or less gigabytes of virtual memory, at least):
>
> int main() { int k = *(int *)0xc0000000; }
>
> However, it does not do so under 2.4.x -- it does cause a seg fault
> under 2.2.x kernels.

Works here running 2.4.13-ac8+bits. Are you sure you didn't compile with
optimization enabled?

-ben

2001-11-08 15:29:50

by David Chandler

Subject: Re: Bug Report: Dereferencing a bad pointer

Benjamin LaHaise wrote:

> On Wed, Nov 07, 2001 at 06:23:13PM -0500, David Chandler wrote:
> > The following one-line C program, when compiled by gcc 2.96 without
> > optimization, should produce a SIGSEGV segmentation fault (on a machine
> > with 3 or less gigabytes of virtual memory, at least):
> >
> > int main() { int k = *(int *)0xc0000000; }
> >
> > However, it does not do so under 2.4.x -- it does cause a seg fault
> > under 2.2.x kernels.
>
> Works here running 2.4.13-ac8+bits. Are you sure you didn't compile with
> optimization enabled?
>
> -ben

I'm quite sure -- an optimized build exits immediately, whereas I'm seeing a
hung process with 2.4 kernels that has to be killed (most any signal,
including SIGSEGV, will do the trick). With 2.2 kernels, the program seg
faults, as it should. By "Works here" I assume you mean that you received a
segmentation fault.

I get the same result with gcc 3.0.1 and gcc 2.96 (and yes, the relevant
generated code differs slightly). I have tried Linus's official 2.4.13+UML
on UML, but I've not tried 2.4.13-ac8.

Please Cc: me on any replies.


David Chandler

_____
David L. Chandler. GrammaTech, Inc.
mailto:[email protected] http://www.grammatech.com



2001-11-08 16:02:53

by Richard B. Johnson

Subject: Re: Bug Report: Dereferencing a bad pointer

On Thu, 8 Nov 2001, David Chandler wrote:

> Benjamin LaHaise wrote:
>
> > On Wed, Nov 07, 2001 at 06:23:13PM -0500, David Chandler wrote:
> > > The following one-line C program, when compiled by gcc 2.96 without
> > > optimization, should produce a SIGSEGV segmentation fault (on a machine
> > > with 3 or less gigabytes of virtual memory, at least):
> > >
> > > int main() { int k = *(int *)0xc0000000; }
> > >

This may not necessarily produce a seg-fault! If this virtual
address is mapped within the current process (.bss, .stack, etc.),
it's perfectly all right to write to it although you probably
broke malloc() by doing it. The actual value of the number in
the pointer depends upon PAGE_OFFSET and other kernel variables.
If you change the kernel, this number may change. It has nothing
to do with the size of virtual address space, really.

Script started on Thu Nov 8 10:44:03 2001
# cat >xxx.c
#include <stdio.h>
int bss;
int data = 0x100;
const char cons[]="X";

main()
{
    int stack;

    printf("main() = %p\n", main);
    printf("stack = %p\n", &stack);
    printf("const = %p\n", cons);
    printf(" data = %p\n", &data);
    printf(" bss = %p\n", &bss);
    return 0;
}

# gcc -o xxx xxx.c
# ./xxx
main() = 0x80484cc
stack = 0xbffff6fc
const = 0x8048584
data = 0x80495d4
bss = 0x80496b8
# exit
exit

Script done on Thu Nov 8 10:44:27 2001

All this stuff you "own". You can write to most all of it because
the kernel has allocated it for you. Whether or not 'const' is
really read-only is "implementation dependent".

In your case, it looks as though you scribbled over the top of
your user stack, in some harmless place.

You cannot presume that a program that doesn't seg-fault is
memory-error free. Protection is in pages, not bytes, and you
already own a lot of address-space that you may think that
you don't. FYI, if you allocate a lot of memory using malloc(),
it sets the break address to acquire more memory. Then if you
free that memory, it does not necessarily give back the memory.
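
Which ranges a process actually owns can be listed on Linux from
/proc/self/maps. A minimal sketch, assuming that file is available and
that each of its lines starts with a "start-end" pair in hex;
addr_is_mapped is a hypothetical helper name, not anything from the
kernel or libc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return 1 if addr lies inside one of this process's mappings,
 * 0 if it does not, and -1 if /proc/self/maps cannot be opened.
 */
static int addr_is_mapped(uintptr_t addr)
{
    unsigned long start, end;
    int found = 0;
    int c;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL)
        return -1;

    /* Each line begins with "start-end" in hex; end is exclusive. */
    while (fscanf(f, "%lx-%lx", &start, &end) == 2) {
        if (addr >= start && addr < end)
            found = 1;
        /* Skip the rest of the line (perms, offset, device, inode, path). */
        while ((c = fgetc(f)) != EOF && c != '\n')
            ;
    }

    fclose(f);
    return found;
}
```

Note that "mapped" here only means the address falls in a listed range;
the permission column, which this sketch skips, decides whether a read or
a write to it is actually allowed.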

You may be able to write to freed memory without a seg-fault.
However, subsequent calls to malloc() may fail because you have
ticked-off malloc() and it's gonna get even.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2001-11-08 16:27:43

by Benjamin LaHaise

Subject: Re: Bug Report: Dereferencing a bad pointer

On Thu, Nov 08, 2001 at 10:29:19AM -0500, David Chandler wrote:
> I get the same result with gcc 3.0.1 and gcc 2.96 (and yes, the relevant
> generated code differs slightly). I have tried Linus's official 2.4.13+UML
> on UML, but I've not tried 2.4.13-ac8.

Perhaps you should try -ac?

-ben

2001-11-08 17:17:44

by David Chandler

Subject: Re: Bug Report: Dereferencing a bad pointer

Dick,

You're right that the one-liner below may not necessarily produce a seg
fault, but shouldn't it terminate normally if it doesn't? After all,
the program just *reads*. Hanging does not seem to be an option!

BTW, your example program produces very similar output for the 2.4 and
2.2 kernels to which I have access. I apologize for any confusion my
original report created -- 0xc0000000 was chosen because of its relation
to the start of the stack frame, and indeed it has nothing
to do with the size of virtual address space.


David Chandler


"Richard B. Johnson" wrote:
>
> > > On Wed, Nov 07, 2001 at 06:23:13PM -0500, David Chandler wrote:
> > > > The following one-line C program, when compiled by gcc 2.96 without
> > > > optimization, should produce a SIGSEGV segmentation fault (on a machine
> > > > with 3 or less gigabytes of virtual memory, at least):
> > > >
> > > > int main() { int k = *(int *)0xc0000000; }
> > > >
>
> This may not necessarily produce a seg-fault! If this virtual
> address is mapped within the current process (.bss, .stack, etc.),
> it's perfectly all right to write to it although you probably
> broke malloc() by doing it. The actual value of the number in
> the pointer depends upon PAGE_OFFSET and other kernel variables.
> If you change the kernel, this number may change. It has nothing
> to do with the size of virtual address space, really.


>
> All this stuff you "own". You can write to most all of it because
> the kernel has allocated it for you. Whether or not 'const' is
> really read-only is "implementation dependent".
>
> In your case, it looks as though you scribbled over the top of
> your user stack, in some harmless place.


> Cheers,
> Dick Johnson


--

_____
David L. Chandler. GrammaTech, Inc.
mailto:[email protected] http://www.grammatech.com

2001-11-08 17:55:15

by Tahar

Subject: Re: Bug Report: Dereferencing a bad pointer

Richard,

Your explanation shows why the process is not killed with a SIGSEGV, but
it doesn't explain why the process hangs!

"Richard B. Johnson" wrote:
>
> On Thu, 8 Nov 2001, David Chandler wrote:
>
> > Benjamin LaHaise wrote:
> >
> > > On Wed, Nov 07, 2001 at 06:23:13PM -0500, David Chandler wrote:
> > > > The following one-line C program, when compiled by gcc 2.96 without
> > > > optimization, should produce a SIGSEGV segmentation fault (on a machine
> > > > with 3 or less gigabytes of virtual memory, at least):
> > > >
> > > > int main() { int k = *(int *)0xc0000000; }
> > > >
>
> This may not necessarily produce a seg-fault! If this virtual
> address is mapped within the current process (.bss, .stack, etc.),
> it's perfectly all right to write to it although you probably
> broke malloc() by doing it. The actual value of the number in
> the pointer depends upon PAGE_OFFSET and other kernel variables.
> If you change the kernel, this number may change. It has nothing
> to do with the size of virtual address space, really.
>
> Script started on Thu Nov 8 10:44:03 2001
> # cat >xxx.c
> #include <stdio.h>
> int bss;
> int data = 0x100;
> const char cons[]="X";
>
> main()
> {
> int stack;
>
> printf("main() = %p\n", main);
> printf("stack = %p\n", &stack);
> printf("const = %p\n", cons);
> printf(" data = %p\n", &data);
> printf(" bss = %p\n", &bss);
> return 0;
>
> }
>
> # gcc -o xxx xxx.c
> # ./xxx
> main() = 0x80484cc
> stack = 0xbffff6fc
> const = 0x8048584
> data = 0x80495d4
> bss = 0x80496b8
> # exit
> exit
>
> Script done on Thu Nov 8 10:44:27 2001
>
> All this stuff you "own". You can write to most all of it because
> the kernel has allocated it for you. Whether or not 'const' is
> really read-only is "implementation dependent".
>
> In your case, it looks as though you scribbled over the top of
> your user stack, in some harmless place.
>
> You cannot presume that a program that doesn't seg-fault is
> memory-error free. Protection is in pages, not bytes, and you
> already own a lot of address-space that you may think that
> you don't. FYI, if you allocate a lot of memory using malloc(),
> it sets the break address to acquire more memory. Then if you
> free that memory, it does not necessarily give back the memory.
>
> You may be able to write to freed memory without a seg-fault.
> However, subsequent calls to malloc() may fail because you have
> ticked-off malloc() and it's gonna get even.
>
> Cheers,
> Dick Johnson
>
> Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).
>
> I was going to compile a list of innovations that could be
> attributed to Microsoft. Once I realized that Ctrl-Alt-Del
> was handled in the BIOS, I found that there aren't any.
>

2001-11-08 17:59:57

by Alan

Subject: Re: Bug Report: Dereferencing a bad pointer

> On Thu, Nov 08, 2001 at 10:29:19AM -0500, David Chandler wrote:
> > I get the same result with gcc 3.0.1 and gcc 2.96 (and yes, the relevant
> > generated code differs slightly). I have tried Linus's official 2.4.13+UML
> > on UML, but I've not tried 2.4.13-ac8.
>
> Perhaps you should try -ac?

If you do then use ac7 for x86

2001-11-08 21:33:39

by Richard B. Johnson

Subject: Re: Bug Report: Dereferencing a bad pointer

On Thu, 8 Nov 2001, David Chandler wrote:

> Dick,
>
> You're right that the one-liner below may not necessarily produce a seg
> fault, but shouldn't it terminate normally if it doesn't? After all,
> the program just *reads*. Hanging does not seem to be an option!
>
You may want to see if any deliberate seg-fault actually gets
delivered. Try to read *(0). If that works (seg-faults), then
there may be a problem with some boundary condition on paging.
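
A way to check that SIGSEGV is really delivered, and for which address,
is to catch it with a handler and jump back out. Sketch only, assuming
POSIX sigaction() with SA_SIGINFO and sigsetjmp()/siglongjmp();
read_faults and on_segv are illustrative names. If the read hangs rather
than faults, this function will not return.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

static sigjmp_buf fault_jmp;
static volatile uintptr_t fault_addr;

/* SIGSEGV handler: remember the faulting address and unwind. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    fault_addr = (uintptr_t)info->si_addr;
    siglongjmp(fault_jmp, 1);
}

/*
 * Try to read one int from addr.
 * Returns 1 if SIGSEGV was delivered (faulting address stored in *where),
 * 0 if the read succeeded, -1 if the handler could not be installed.
 */
static int read_faults(uintptr_t addr, uintptr_t *where)
{
    struct sigaction sa, old;
    int faulted;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(fault_jmp, 1) == 0) {
        volatile int k = *(volatile int *)addr;
        (void)k;
        faulted = 0;
    } else {
        /* We got here from the handler via siglongjmp(). */
        faulted = 1;
        if (where != NULL)
            *where = fault_addr;
    }

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous handler */
    return faulted;
}
```

Trying this first on an address near 0 and then on 0xc0000000 separates
"faults are not delivered at all" from "this particular address behaves
differently".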

I can't duplicate the problem here. You can also try to trace
the code execution to see if it falls into some user-space loop.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2001-11-08 21:57:49

by David Chandler

Subject: Re: Bug Report: Dereferencing a bad pointer

I get a seg fault on both 2.2 and 2.4 kernels by running the following
one-line C program:
int main() { int k = *(int *)0x0; }

Debugging the offender,
int main() { int k = *(int *)0xc0000000; }
is not very informative: single-stepping over the sole command just
hangs, and you have to press Control-C to interrupt gdb, at which point
you can single-step right into the same problem again.

When the program hangs, 'top' says that the CPU is fully utilized and
the system is spending 80% of its time in the kernel and 20% in the
offending process.

Have you not been able to duplicate it on a 2.4 kernel on x86? If not,
please tell me which 2.4 kernel correctly seg faults.


David Chandler

--

_____
David L. Chandler. GrammaTech, Inc.
mailto:[email protected] http://www.grammatech.com


"Richard B. Johnson" wrote:
>
> On Thu, 8 Nov 2001, David Chandler wrote:
>
> > Dick,
> >
> > You're right that the one-liner below may not necessarily produce a seg
> > fault, but shouldn't it terminate normally if it doesn't? After all,
> > the program just *reads*. Hanging does not seem to be an option!
> >
> You may want to see if any deliberate seg-fault actually gets
> delivered. Try to read *(0). If that works (seg-faults), then
> there may be a problem with some boundary condition on paging.
>
> I can't duplicate the problem here. You can also try to trace
> the code execution to see if it falls into some user-space loop.
>
> Cheers,
> Dick Johnson
>
> Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).
>
> I was going to compile a list of innovations that could be
> attributed to Microsoft. Once I realized that Ctrl-Alt-Del
> was handled in the BIOS, I found that there aren't any.

2001-11-08 22:39:33

by Brian Gerst

Subject: Re: Bug Report: Dereferencing a bad pointer

David Chandler wrote:
>
> I get a seg fault on both 2.2 and 2.4 kernels by running the following
> one-line C program:
> int main() { int k = *(int *)0x0; }
>
> Debugging the offender,
> int main() { int k = *(int *)0xc0000000; }
> is not very informative: single-stepping over the sole command just
> hangs, and you have to press Control-C to interrupt gdb, at which point
> you can single-step right into the same problem again.
>
> When the program hangs, 'top' says that the CPU is fully utilized and
> the system is spending 80% of its time in the kernel and 20% in the
> offending process.
>
> Have you not been able to duplicate it on a 2.4 kernel on x86? If not,
> please tell me which 2.4 kernel correctly seg faults.

How about address 0xc0001000? I have been unable to reproduce this on a
PII running 2.4.9, and an Athlon running 2.4.14.

--

Brian Gerst

2001-11-08 23:15:45

by David Chandler

Subject: Re: Bug Report: Dereferencing a bad pointer

0xc0001000 hangs the same way that 0xc0000000 does. I have reproduced
this on a 2.4.9+UML kernel running in user-mode linux on top of a
Pentium-4 2.4.2-2(RedHat) host. 'top' says that 75% of CPU is going to
the system in that case also.

Please Cc: me on any replies.


David Chandler
--
_____
David L. Chandler. GrammaTech, Inc.
mailto:[email protected] http://www.grammatech.com



Brian Gerst wrote:
>
> David Chandler wrote:
> >
> > Debugging the offender,
> > int main() { int k = *(int *)0xc0000000; }
> > is not very informative: single-stepping over the sole command just
> > hangs, and you have to press Control-C to interrupt gdb, at which point
> > you can single-step right into the same problem again.
> >
> > When the program hangs, 'top' says that the CPU is fully utilized and
> > the system is spending 80% of its time in the kernel and 20% in the
> > offending process.
> >
> > Have you not been able to duplicate it on a 2.4 kernel on x86? If not,
> > please tell me which 2.4 kernel correctly seg faults.
>
> How about address 0xc0001000? I have been unable to reproduce this on a
> PII running 2.4.9, and an Athlon running 2.4.14.
>
> --
>
> Brian Gerst

2001-11-09 13:33:44

by Richard B. Johnson

Subject: Re: Bug Report: Dereferencing a bad pointer

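# Build and run two tiny programs: one that only loads the address into a
# register, and one that also reads from it. A core file is taken as the
# sign that the program was killed by a fault, so core dumps must be
# enabled (see 'ulimit -c') for the checks below to mean anything.
FILE=/tmp/grok

# First program: the read is commented out, so it should just exit.
cat <<EOF >${FILE}.s
.section .text
.global _start
.type _start,@function

_start:
        movl \$0xc0000000, %ebx
#       movl (%ebx), %eax
        movl \$1, %eax
        xorl %ebx,%ebx
        int \$0x80
EOF
as -o ${FILE}.o ${FILE}.s
ld -o ${FILE} ${FILE}.o
chmod +x ${FILE}
echo "This should execute fine"
rm -f core
${FILE}
if [ -f core ] ; then
    echo "Failed"
else
    echo "Okay"
fi

# Second program: same code, but with the read from 0xc0000000 enabled.
cat <<EOF >${FILE}.s
.section .text
.global _start
.type _start,@function

_start:
        movl \$0xc0000000, %ebx
        movl (%ebx), %eax
        movl \$1, %eax
        xorl %ebx,%ebx
        int \$0x80
EOF
as -o ${FILE}.o ${FILE}.s
ld -o ${FILE} ${FILE}.o
chmod +x ${FILE}
echo "This should seg-fault"
${FILE}
if [ -f core ] ; then
    echo "Okay"
else
    echo "Failed"
fi
rm -f core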