2005-03-30 03:23:38

by Horst H. von Brand

[permalink] [raw]
Subject: Re: Do not misuse Coverity please (Was: sound/oss/cs46xx.c: fix a check after use)

"Jean Delvare" <[email protected]> said:

[Sttributions missing, sorry]

> > > Think about it. If the pointer could be NULL, then it's unlikely that
> > > the bug would have gone unnoticed so far (unless the code is very
> > > recent). Coverity found 3 such bugs in one i2c driver [1], and the
> > > correct solution was to NOT check for NULL because it just couldn't
> > > happen.

> > No, there is a third case: the pointer can be NULL, but the compiler
> > happened to move the dereference down to after the check.

> Wow. Great point. I completely missed that possibility. In fact I didn't
> know that the compiler could possibly alter the order of the
> instructions. For one thing, I thought it was simply not allowed to. For
> another, I didn't know that it had been made so aware that it could
> actually figure out how to do this kind of things. What a mess. Let's
> just hope that the gcc folks know their business :)

The compiler is most definitely /not/ allowed to change the results the
code gives.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513


2005-03-30 08:00:57

by Jean Delvare

[permalink] [raw]
Subject: Re: Do not misuse Coverity please


Hi Horst,

> > > No, there is a third case: the pointer can be NULL, but the compiler
> > > happened to move the dereference down to after the check.
>
> > Wow. Great point. I completely missed that possibility. In fact I didn't
> > know that the compiler could possibly alter the order of the
> > instructions. For one thing, I thought it was simply not allowed to. For
> > another, I didn't know that it had been made so aware that it could
> > actually figure out how to do this kind of things. What a mess. Let's
> > just hope that the gcc folks know their business :)
>
> The compiler is most definitely /not/ allowed to change the results the
> code gives.

I think that Andrew's point was that the compiler could change the order
of the instructions *when this doesn't change the result*, not just in
the general case, of course. In our example, The instructions:

v = p->field;
if (!p) return;

can be seen as equivalent to

if (!p) return;
v = p->field;

by the compiler, which might actually optimize the first into the second
(and quite rightly so, as it actually is faster in the null case). The
fact that doing so prevents an oops is unknown to the compiler, so it
just wouldn't care.

Now I don't know what gcc actually does or not, but Andrew's point
makes perfect sense to me in the theory, and effectively voids my own
argument if gcc performs this kind of optimizations. (The prefered
approach to fix these bugs is a different issue though.)

--
Jean Delvare

2005-03-30 18:30:56

by Shankar Unni

[permalink] [raw]
Subject: Re: Do not misuse Coverity please

Jean Delvare wrote:

> v = p->field;
> if (!p) return;
>
> can be seen as equivalent to
>
> if (!p) return;
> v = p->field;

Heck, no.

You're missing the side-effect of a null pointer dereference crash (for
p->field) (even though v is unused before the return). The optimizer is
not allowed to make exceptions go away as a result of the hoisting.

2005-03-30 18:59:53

by Olivier Galibert

[permalink] [raw]
Subject: Re: Do not misuse Coverity please

On Wed, Mar 30, 2005 at 10:29:43AM -0800, Shankar Unni wrote:
> Jean Delvare wrote:
>
> > v = p->field;
> > if (!p) return;
> >
> >can be seen as equivalent to
> >
> > if (!p) return;
> > v = p->field;
>
> Heck, no.
>
> You're missing the side-effect of a null pointer dereference crash (for
> p->field) (even though v is unused before the return). The optimizer is
> not allowed to make exceptions go away as a result of the hoisting.

Actually it is. Dereferencing a null pointer is either undefined or
implementation-dependant in the standard (don't remember which), and
as such the compiler can do whatever it wants, be it starting nethack
or not doing the dereference in the first place.

The principle of least surprise makes doing such an "optimisation" not
so smart in practice. A compiler capable of detecting that situation
would be better off spitting a warning in red blinking letter.

OG.

2005-03-30 19:20:26

by Paulo Marques

[permalink] [raw]
Subject: Re: Do not misuse Coverity please

Shankar Unni wrote:
> Jean Delvare wrote:
>
>> v = p->field;
>> if (!p) return;
>>
>> can be seen as equivalent to
>>
>> if (!p) return;
>> v = p->field;
>
>
> Heck, no.
>
> You're missing the side-effect of a null pointer dereference crash (for
> p->field) (even though v is unused before the return). The optimizer is
> not allowed to make exceptions go away as a result of the hoisting.

I just had to try this out :)

Using gcc 3.3.2 this code sample:

> struct test {
> int code;
> };
>
> int test_func(struct test *a)
> {
> int ret;
> if (!a) return -1;
> ret = a->code;
> return ret;
> }

is compiled into:

> 0: 8b 54 24 04 mov 0x4(%esp,1),%edx
> 4: 83 c8 ff or $0xffffffff,%eax
> 7: 85 d2 test %edx,%edx
> 9: 74 02 je d <test_func+0xd>
> b: 8b 02 mov (%edx),%eax
> d: c3 ret

whereas this one:

> int test_func(struct test *a)
> {
> int ret;
> ret = a->code;
> if (!a) return -1;
> return ret;
> }

is simply compiled into:

> 0: 8b 44 24 04 mov 0x4(%esp,1),%eax
> 4: 8b 00 mov (%eax),%eax
> 6: c3 ret

It seems that gcc is smart enough to know that after we've dereferenced
a pointer, if it was NULL, it doesn't matter any more. So it just
assumes that if execution reaches that "if" statement then the pointer
can not be NULL at all.

So the 2 versions aren't equivalent, and gcc doesn't treat them as such
either.

Just a minor nitpick, though: wouldn't it be possible for an application
to catch the SIGSEGV and let the code proceed, making invalid the
assumption made by gcc?

--
Paulo Marques - http://www.grupopie.com

All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke (1729 - 1797)

2005-03-30 23:13:29

by Kyle Moffett

[permalink] [raw]
Subject: Big GCC bug!!! [Was: Re: Do not misuse Coverity please]

On Mar 30, 2005, at 14:14, Paulo Marques wrote:
> Just a minor nitpick, though: wouldn't it be possible for an
> application to catch the SIGSEGV and let the code proceed,
> making invalid the assumption made by gcc?

Uhh, it's even worse than that. Have a look at the following code:
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <errno.h>
> #include <sys/types.h>
> #include <sys/mman.h>
>
> struct test {
> int code;
> };
> int test_check_first(struct test *a) {
> int ret;
> if (!a) return -1;
> ret = a->code;
> return ret;
> }
> int test_check_last(struct test *a) {
> int ret;
> ret = a->code;
> if (!a) return -1;
> return ret;
> }
>
> int main() {
> int i;
> struct test *nullmem = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
> MAP_ANON|MAP_FIXED|MAP_PRIVATE, -1, 0);
> if (nullmem == MAP_FAILED) {
> fprintf(stderr,"mmap: %s\n",strerror(errno));
> exit(1);
> }
> for (i = 0; i < 2; i++) {
> nullmem[i].code = i;
> printf("nullmem[%d].code = %d\n",i,i);
> printf("test_check_first(&nullmem[%d]) = %d\n",i,
> test_check_first(&nullmem[i]));
> printf("test_check_last(&nullmem[%d]) = %d\n",i,
> test_check_last(&nullmem[i]));
> }
> munmap(nullmem,4096);
> exit(0);
> }

Without optimization:
> king:~# gcc -o mmapnull mmapnull.c
> king:~# ./mmapnull
> nullmem[0].code = 0
> test_check_first(&nullmem[0]) = -1
> test_check_last(&nullmem[0]) = -1
> nullmem[1].code = 1
> test_check_first(&nullmem[1]) = 1
> test_check_last(&nullmem[1]) = 1

With optimization:
> king:~# gcc -O2 -o mmapnull mmapnull.c
> king:~# ./mmapnull
> nullmem[0].code = 0
> test_check_first(&nullmem[0]) = -1
> test_check_last(&nullmem[0]) = 0
BUG ==> ^^^
> nullmem[1].code = 1
> test_check_first(&nullmem[1]) = 1
> test_check_last(&nullmem[1]) = 1

This is on multiple platforms, including PPC Linux, X86 Linux, and
PPC Mac OS X. All exhibit the exact same behavior and output. I
think I'll probably go report a GCC bug now :-D

Dereferencing null pointers is relied upon by a number of various
emulators and such, and is "platform-defined" in the standard, so
since Linux allows mmap at NULL, GCC shouldn't optimize that case
any differently.

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-03-30 23:40:22

by Jakub Jelinek

[permalink] [raw]
Subject: Not a GCC bug (was Re: Big GCC bug!!! [Was: Re: Do not misuse Coverity please])

On Wed, Mar 30, 2005 at 06:11:43PM -0500, Kyle Moffett wrote:
> On Mar 30, 2005, at 14:14, Paulo Marques wrote:
> >Just a minor nitpick, though: wouldn't it be possible for an
> >application to catch the SIGSEGV and let the code proceed,
> >making invalid the assumption made by gcc?
>
> Uhh, it's even worse than that. Have a look at the following code:
> >#include <stdio.h>
> >#include <stdlib.h>
> >#include <string.h>
> >#include <errno.h>
> >#include <sys/types.h>
> >#include <sys/mman.h>
> >
> >struct test {
> > int code;
> >};
> >int test_check_first(struct test *a) {
> > int ret;
> > if (!a) return -1;
> > ret = a->code;
> > return ret;
> >}
> >int test_check_last(struct test *a) {
> > int ret;
> > ret = a->code;
> > if (!a) return -1;
> > return ret;
> >}
> >
> >int main() {
> > int i;
> > struct test *nullmem = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
> > MAP_ANON|MAP_FIXED|MAP_PRIVATE, -1, 0);
> > if (nullmem == MAP_FAILED) {
> > fprintf(stderr,"mmap: %s\n",strerror(errno));
> > exit(1);
> > }
> > for (i = 0; i < 2; i++) {
> > nullmem[i].code = i;
> > printf("nullmem[%d].code = %d\n",i,i);
> > printf("test_check_first(&nullmem[%d]) = %d\n",i,
> > test_check_first(&nullmem[i]));
> > printf("test_check_last(&nullmem[%d]) = %d\n",i,
> > test_check_last(&nullmem[i]));
> > }
> > munmap(nullmem,4096);
> > exit(0);
> >}

This testcase violates ISO C99 6.3.2.3:
If a null pointer constant is converted to a pointer type, the resulting
pointer, called a null pointer, is guaranteed to compare unequal to a
pointer to any object or function.

But in the testcase above, there is an object whose pointer compares equal
to the null pointer. So there is nothing wrong with what GCC does.

Jakub

2005-03-30 23:58:08

by Robert Hancock

[permalink] [raw]
Subject: Re: Big GCC bug!!! [Was: Re: Do not misuse Coverity please]

Kyle Moffett wrote:
> Dereferencing null pointers is relied upon by a number of various
> emulators and such, and is "platform-defined" in the standard, so
> since Linux allows mmap at NULL, GCC shouldn't optimize that case
> any differently.

From the GCC manual: "The compiler assumes that dereferencing a null
pointer would have halted the program. If a pointer is checked after it
has already been dereferenced, it cannot be null. In some environments,
this assumption is not true, and programs can safely dereference null
pointers. Use -fno-delete-null-pointer-checks to disable this
optimization for programs which depend on that behavior. "

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2005-03-31 00:58:35

by Kyle Moffett

[permalink] [raw]
Subject: Re: Not a GCC bug (was Re: Big GCC bug!!! [Was: Re: Do not misuse Coverity please])

On Mar 30, 2005, at 18:38, Jakub Jelinek wrote:
> This testcase violates ISO C99 6.3.2.3:
> If a null pointer constant is converted to a pointer type, the
> resulting
> pointer, called a null pointer, is guaranteed to compare unequal to a
> pointer to any object or function.

Except that the result of dereferencing a null pointer is implementation
defined according to the C99 standard. My implementation allows me to
mmap
stuff at NULL, and therefore its compiler should be able to handle that
case. I would have no problem with either the standard or
implementation
if it either properly handled the case or didn't allow it in the first
place.

On another note, I've discovered the flag
"-fno-delete-null-pointer-checks",
which should probably be included in the kernel makefiles to disable
that
optimization for the kernel. (Ok, yes, I apologize, this isn't really
a GCC
bug, the behavior is documented, although it can be quite confusing. I
suspect it may bite some platform-specific code someday. It also
muddies
the waters somewhat with respect to the original note (and the effects
on
the generated code):

> int x = my_struct->the_x;
> if (!my_struct) return;

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-03-31 01:12:49

by Nick Piggin

[permalink] [raw]
Subject: Re: Not a GCC bug (was Re: Big GCC bug!!! [Was: Re: Do not misuse Coverity please])

Kyle Moffett wrote:
> On Mar 30, 2005, at 18:38, Jakub Jelinek wrote:
>
>> This testcase violates ISO C99 6.3.2.3:
>> If a null pointer constant is converted to a pointer type, the resulting
>> pointer, called a null pointer, is guaranteed to compare unequal to a
>> pointer to any object or function.
>
>
> Except that the result of dereferencing a null pointer is implementation
> defined according to the C99 standard. My implementation allows me to mmap
> stuff at NULL, and therefore its compiler should be able to handle that
> case. I would have no problem with either the standard or implementation
> if it either properly handled the case or didn't allow it in the first
> place.
>
> On another note, I've discovered the flag
> "-fno-delete-null-pointer-checks",
> which should probably be included in the kernel makefiles to disable that
> optimization for the kernel. (Ok, yes, I apologize, this isn't really a
> GCC
> bug, the behavior is documented, although it can be quite confusing. I
> suspect it may bite some platform-specific code someday. It also muddies
> the waters somewhat with respect to the original note (and the effects on
> the generated code):
>
>> int x = my_struct->the_x;
>> if (!my_struct) return;
>

Why should this be in the kernel makefiles? If my_struct is NULL,
then the kernel will never reach the if statement.

A warning might be nice though.

2005-03-31 01:27:58

by Kyle Moffett

[permalink] [raw]
Subject: Re: Not a GCC bug (was Re: Big GCC bug!!! [Was: Re: Do not misuse Coverity please])

On Mar 30, 2005, at 20:12, Nick Piggin wrote:
> Why should this be in the kernel makefiles? If my_struct is NULL,
> then the kernel will never reach the if statement.

Well, I think there is probably some arch code that uses 16-bit
that might use a null pointer, or at least a struct that starts
at the 0 address, which would have problems. I think it would
be better to avoid that issue just in case, especially since
this optimization does not save anything in the case of properly
written code.

> A warning might be nice though.

If we could turn off the optimization and add a warning, I
would support that. Even if we could only add the warning, then
at least people would know.


Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-03-31 02:01:42

by Patrick McFarland

[permalink] [raw]
Subject: Re: Do not misuse Coverity please

On Wednesday 30 March 2005 01:55 pm, Olivier Galibert wrote:
> Actually it is. Dereferencing a null pointer is either undefined or
> implementation-dependant in the standard (don't remember which), and
> as such the compiler can do whatever it wants, be it starting nethack

Can this be configured to start slashem instead? ;)

> or not doing the dereference in the first place.

What scares me is, "Is there any code in the kernel that does this, or
something similarly stupid?". I regard the linux kernel as mostly sane, and
I'd rather not have that illusion of safety broken* any time soon. Sure, its
great for an app to randomly segfault, but having the kernel oops randomly is
not much fun.

To quote a sig I once saw on usenet: "Why Linux is better than Windows, Reason
#325: No needing to reboot every 15 seconds; or every 3 seconds during full
moons."



* I am by no means an expert on kernel programing

--
Patrick "Diablo-D3" McFarland || [email protected]
"Computer games don't affect kids; I mean if Pac-Man affected us as kids, we'd
all be running around in darkened rooms, munching magic pills and listening to
repetitive electronic music." -- Kristian Wilson, Nintendo, Inc, 1989


Attachments:
(No filename) (1.18 kB)
(No filename) (189.00 B)
Download all attachments