2007-08-02 19:55:54

by Guennadi Liakhovetski

Subject: gcc fixed size char array initialization bug - known?

Hi

I've run across the following gcc "feature":

char c[4] = "01234";

gcc emits a nice warning

warning: initializer-string for array of chars is too long

But do a

char c[4] = "0123";

and - a wonder - no warning. No warning with gcc 3.3.2, 3.3.5, 3.4.5,
4.1.2. I was told 4.2.x does produce a warning. Now do a

struct {
    char c[4];
    int i;
} t;
t.i = 0x12345678;
strcpy(t.c, c);

and t.i is silently corrupted. Just wanted to ask if this is known,
really...

Thanks
Guennadi
---
Guennadi Liakhovetski


2007-08-02 20:00:10

by Jan Engelhardt

Subject: Re: gcc fixed size char array initialization bug - known?


On Aug 2 2007 21:55, Guennadi Liakhovetski wrote:
>Hi
>
>I've run across the following gcc "feature":
>
> char c[4] = "01234";
>
>gcc emits a nice warning
>
>warning: initializer-string for array of chars is too long
>
>But do a
>
> char c[4] = "0123";
>
>and - a wonder - no warning. No warning with gcc 3.3.2, 3.3.5, 3.4.5,
>4.1.2. I was told 4.2.x does produce a warning. Now do a
>
> struct {
> char c[4];
> int i;
> } t;
> t.i = 0x12345678;
> strcpy(t.c, c);
>
>and t.i is silently corrupted. Just wanted to ask if this is known,
>really...

What does this have to do with the kernel? The string "0123" is
generally _five_ characters long, so c[4] is not enough.
Or use strncpy.



Jan
--

2007-08-02 20:03:17

by Jesper Juhl

Subject: Re: gcc fixed size char array initialization bug - known?

On 02/08/07, Jan Engelhardt wrote:
>
> On Aug 2 2007 21:55, Guennadi Liakhovetski wrote:
> >Hi
> >
> >I've run across the following gcc "feature":
> >
> > char c[4] = "01234";
> >
> >gcc emits a nice warning
> >
> >warning: initializer-string for array of chars is too long
> >
> >But do a
> >
> > char c[4] = "0123";
> >
> >and - a wonder - no warning. No warning with gcc 3.3.2, 3.3.5, 3.4.5,
> >4.1.2. I was told 4.2.x does produce a warning. Now do a
> >
> > struct {
> > char c[4];
> > int i;
> > } t;
> > t.i = 0x12345678;
> > strcpy(t.c, c);
> >
> >and t.i is silently corrupted. Just wanted to ask if this is known,
> >really...
>
> What does this have to do with the kernel? The string "0123" is
> generally _five_ characters long, so c[4] is not enough.
> Or use strncpy.
>
I believe Guennadi's point is that gcc does not warn about it in the
case of c[4] = "0123"; but only in the case of c[4] = "01234" - so if
we do have such initializations in the kernel we may have some bugs
hiding there that gcc doesn't warn us about.

--
Jesper Juhl
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2007-08-02 20:09:11

by Al Viro

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, Aug 02, 2007 at 09:55:51PM +0200, Guennadi Liakhovetski wrote:
> But do a
>
> char c[4] = "0123";
>
> and - a wonder - no warning.

And this is correct behaviour. You get a valid initializer for the array;
see 6.7.8[14] for details. Moreover, that kind of code is often
quite deliberate.

>No warning with gcc 3.3.2, 3.3.5, 3.4.5,
> 4.1.2. I was told 4.2.x does produce a warning. Now do a
>
> struct {
> char c[4];
> int i;
> } t;
> t.i = 0x12345678;
> strcpy(t.c, c);
>
> and t.i is silently corrupted. Just wanted to ask if this is known,
> really...

strcpy() from an array that doesn't contain a 0 is undefined behaviour,
nothing new about that...
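
A minimal sketch of the alternative, not taken from the thread: when the
source is a fixed-size array that may lack a terminator, copy by size rather
than by searching for '\0'. The struct mirrors the one in the original
report; the names struct tagged and copy_tag() are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout mirroring the struct from the original report. */
struct tagged {
    char c[4];
    int  i;
};

/*
 * Copy exactly sizeof dst->c bytes from a 4-byte source array.
 * memcpy() never looks for a terminator, so an unterminated source is
 * fine and the neighbouring member i is left untouched.
 */
static void copy_tag(struct tagged *dst, const char src[4])
{
    memcpy(dst->c, src, sizeof dst->c);
}
```

After copy_tag(&t, c), t.c holds the four bytes of c and t.i keeps its
value. t.c is still not a string, so it must not be passed to strlen() or
printed with %s.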

2007-08-02 20:10:42

by Al Viro

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, Aug 02, 2007 at 10:03:03PM +0200, Jesper Juhl wrote:
> I believe Guennadi's point is that gcc does not warn about it in the
> case of c[4] = "0123"; but only in the case of c[4] = "01234" - so if
> we do have such initializations in the kernel we may have some bugs
> hiding there that gcc doesn't warn us about.

Who said it's a bug? Or that all arrays of char have to contain '\0'
anywhere in them?

2007-08-02 20:12:23

by Jesper Juhl

Subject: Re: gcc fixed size char array initialization bug - known?

On 02/08/07, Al Viro wrote:
> On Thu, Aug 02, 2007 at 10:03:03PM +0200, Jesper Juhl wrote:
> > I believe Guennadi's point is that gcc does not warn about it in the
> > case of c[4] = "0123"; but only in the case of c[4] = "01234" - so if
> > we do have such initializations in the kernel we may have some bugs
> > hiding there that gcc doesn't warn us about.
>
> Who said it's a bug? Or that all arrays of char have to contain '\0'
> anywhere in them?
>
I was simply trying to explain what I thought Guennadi meant. I was
not commenting on whether or not there's a bug there.

--
Jesper Juhl
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2007-08-02 20:12:42

by Andi Kleen

Subject: Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski writes:

> Hi
>
> I've run across the following gcc "feature":
>
> char c[4] = "01234";
>
> gcc emits a nice warning
>
> warning: initializer-string for array of chars is too long
>
> But do a
>
> char c[4] = "0123";
> and - a wonder - no warning.

It's required by the C standard.

6.7.8.14 of C99:
``
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
''

Note the "if there is room".

I believe the rationale is that it still lets you conveniently initialize
strings that are not zero-terminated.

-Andi
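
A small sketch of the three cases covered by the quoted paragraph, added for
illustration and not part of the original mail. The array names and the
helper has_terminator() are assumptions.

```c
#include <stddef.h>
#include <string.h>

static const char exact[4] = "0123"; /* no room: the '\0' is dropped        */
static const char roomy[8] = "0123"; /* room: '\0' stored, the rest zeroed  */
static const char sized[]  = "0123"; /* unknown size: becomes char[5]       */

/* Return 1 if the n-byte buffer contains a '\0' anywhere, else 0. */
static int has_terminator(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}
```

Here sizeof exact is 4 and has_terminator(exact, sizeof exact) is 0, while
sizeof sized is 5 and both sized and roomy contain a terminator.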

2007-08-02 20:15:27

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Jesper Juhl wrote:

> I believe Guennadi's point is that gcc does not warn about it in the
> case of c[4] = "0123"; but only in the case of c[4] = "01234" - so if
> we do have such initializations in the kernel we may have some bugs
> hiding there that gcc doesn't warn us about.

Exactly. Think of all structs with fixed-length char arrays (various
device name fields, etc.) static instances of which are scattered across
all possible drivers... Usually those strings should be long "enough", but
if someone manages to exactly hit the length, there won't be a warning.

Thanks
Guennadi
---
Guennadi Liakhovetski
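
One possible guard for the "exactly hit the length" case, sketched under
assumptions: it relies on C11 static_assert, which postdates this thread,
and the names DEV_NAME_LEN, struct dev_info and CHECK_DEV_NAME are
hypothetical, not kernel API.

```c
#include <assert.h>   /* static_assert (C11) */

#define DEV_NAME_LEN 8   /* hypothetical field width */

struct dev_info {
    char name[DEV_NAME_LEN];
    int  id;
};

/*
 * sizeof(lit) counts the literal's terminating '\0', so the build fails
 * when a literal would fill the field without leaving room for it.
 */
#define CHECK_DEV_NAME(lit) \
    static_assert(sizeof(lit) <= DEV_NAME_LEN, "name needs room for '\\0'")

CHECK_DEV_NAME("eth0");     /* 5 bytes including '\0': fits         */
CHECK_DEV_NAME("serial0");  /* 8 bytes including '\0': fits exactly */
/* CHECK_DEV_NAME("serial10"); would not compile: 9 bytes including '\0' */

static const struct dev_info eth0 = { "eth0", 0 };
```

The check has to be written next to each initializer, so it is a convention
rather than something the compiler enforces on its own.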

2007-08-02 20:18:30

by Jan Engelhardt

Subject: Re: gcc fixed size char array initialization bug - known?


On Aug 2 2007 21:55, Guennadi Liakhovetski wrote:
>Hi
>
>I've run across the following gcc "feature":
>
> char c[4] = "01234";
>
>gcc emits a nice warning
>
>warning: initializer-string for array of chars is too long
>
>But do a
>
> char c[4] = "0123";
>
>and - a wonder - no warning. No warning with gcc 3.3.2, 3.3.5, 3.4.5,
>4.1.2. I was told 4.2.x does produce a warning. Now do a

Unfortunately, gcc 4.2.1 does not produce a warning for

char a[4] = "haha";


Jan
--

2007-08-02 20:21:38

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Al Viro wrote:

> On Thu, Aug 02, 2007 at 09:55:51PM +0200, Guennadi Liakhovetski wrote:
> > But do a
> >
> > char c[4] = "0123";
> >
> > and - a wonder - no warning.
>
> And this is correct behaviour. You get a valid initializer for the array;
> see 6.7.8[14] for details. Moreover, that kind of code is often
> quite deliberate.

But why does 4.2 warn?

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 20:42:16

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Andi Kleen wrote:

> Guennadi Liakhovetski writes:
>
> > char c[4] = "0123";
> > and - a wonder - no warning.
>
> It's required by the C standard.
>
> 6.7.8.14 of C99:
> ``
> An array of character type may be initialized by a character string literal, optionally
> enclosed in braces. Successive characters of the character string literal (including the
> terminating null character if there is room or if the array is of unknown size) initialize the
> elements of the array.
> ''
>
> Note the "if there is room".
>
> I believe the rationale is that it still allows to conveniently initialize
> non zero terminated strings.

Right, I accept that it will compile, but I don't understand why "01234"
produces a warning and "0123" doesn't? Don't think C99 says anything about
that. And, AFAIU, using structs with fixed-size char array we more or less
rely on the compiler warning us if anyone initializes it with too long a
string.

Also interesting, that with

char c[4] = "012345";

the compiler warns, but actually allocates a 6-byte long array...

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 20:42:28

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Al Viro wrote:

> On Thu, Aug 02, 2007 at 09:55:51PM +0200, Guennadi Liakhovetski wrote:
> > But do a
> >
> > char c[4] = "0123";
> >
> > and - a wonder - no warning.
>
> And this is correct behaviour. You get a valid initializer for the array;
> see 6.7.8[14] for details. Moreover, that kind of code is often

What is 6.7.8[14]? If you're referring to the ANSI standard, then,
unfortunately, I don't have it.

> quite deliberate.

Worse yet, K&R explicitly writes:

<quote>

char pattern[] = "ould";

is a shorthand for the longer but equivalent

char pattern[] = { 'o', 'u', 'l', 'd', '\0' };

</quote>

In the latter spelling gcc < 4.2 DOES warn too.

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 21:09:39

by Al Viro

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, Aug 02, 2007 at 10:26:37PM +0200, Guennadi Liakhovetski wrote:
>
> Worse yet, K&R explicitly writes:
>
> <quote>
>
> char pattern[] = "ould";
>
> is a shorthand for the longer but equivalent
>
> char pattern[] = { 'o', 'u', 'l', 'd', '\0' };
>
> </quote>
>
> In the latter spelling gcc < 4.2 DOES warn too.

Does warn for what? Array with known size? Sure, so it should - you
have excess initializer list elements.

Note the [] in the quoted - it does matter.

Again, it's perfectly legitimate to use string literal to initialize
any kind of array of character type. \0 goes there only if there's
space for it; if array size is unknown, the space is left. That's it.

2007-08-02 21:26:55

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Al Viro wrote:

> On Thu, Aug 02, 2007 at 10:26:37PM +0200, Guennadi Liakhovetski wrote:
> >
> > Worse yet, K&R explicitly writes:
> >
> > <quote>
> >
> > char pattern[] = "ould";
> >
> > is a shorthand for the longer but equivalent
> >
> > char pattern[] = { 'o', 'u', 'l', 'd', '\0' };
> >
> > </quote>
> >
> > In the latter spelling gcc < 4.2 DOES warn too.
>
> Does warn for what? Array with known size? Sure, so it should - you
> have excess initializer list elements.
>
> Note the [] in the quoted - it does matter.
>
> Again, it's perfectly legitimate to use string literal to initialize
> any kind of array of character type. \0 goes there only if there's
> space for it; if array size is unknown, the space is left. That's it.

Sure. Doing 'char c[4] = "01234";' is just like doing '"0123"': only
those bytes, for which there's space in the array go in, everything makes
perfect sense. What doesn't make sense to me though, is that in the former
case gcc warns, but not in the latter.

Maybe you're right in your interpretation of the standard / K&R, but it
still doesn't make it logical to me, sorry. I always thought "0123" was 5
bytes long (ok, ascii) and the terminating '\0' was an integral part of
the string, no different from any other character, and not some optional
token. Thus all the characters should be handled equally.

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 21:42:42

by Robert Hancock

Subject: Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski wrote:
> On Thu, 2 Aug 2007, Andi Kleen wrote:
>
>> Guennadi Liakhovetski writes:
>>
>>> char c[4] = "0123";
>>> and - a wonder - no warning.
>> It's required by the C standard.
>>
>> 6.7.8.14 of C99:
>> ``
>> An array of character type may be initialized by a character string literal, optionally
>> enclosed in braces. Successive characters of the character string literal (including the
>> terminating null character if there is room or if the array is of unknown size) initialize the
>> elements of the array.
>> ''
>>
>> Note the "if there is room".
>>
>> I believe the rationale is that it still allows to conveniently initialize
>> non zero terminated strings.
>
> Right, I accept that it will compile, but I don't understand why "01234"
> produces a warning and "0123" doesn't? Don't think C99 says anything about

Because 5 characters will not fit in a 4 character array, even without
the null terminator.

> that. And, AFAIU, using structs with fixed-size char array we more or less
> rely on the compiler warning us if anyone initializes it with too long a
> string.
>
> Also interesting, that with
>
> char c[4] = "012345";
>
> the compiler warns, but actually allocates a 6-byte long array...
>
> Thanks
> Guennadi


--
Robert Hancock Saskatoon, SK, Canada
Home Page: http://www.roberthancock.com/

2007-08-02 22:15:57

by Stefan Richter

Subject: Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski wrote:
> On Thu, 2 Aug 2007, Andi Kleen wrote:
>> 6.7.8.14 of C99:
>> ``
>> An array of character type may be initialized by a character string literal, optionally
>> enclosed in braces. Successive characters of the character string literal (including the
>> terminating null character if there is room or if the array is of unknown size) initialize the
>> elements of the array.
>> ''
>>
>> Note the "if there is room".
>>
>> I believe the rationale is that it still allows to conveniently initialize
>> non zero terminated strings.
>
> Right, I accept that it will compile, but I don't understand why "01234"
> produces a warning and "0123" doesn't? Don't think C99 says anything about

How should gcc know whether you actually wanted that char foo[len] to
contain a \0 as last element?

Given the respective command line switches, gcc does warn in some cases
where it is guesswork whether what you typed is what you intended. For
example

if (i = j())

is reason for gcc to warn even if that might exactly be what you wanted.
However this construct can easily be annotated as

if ((i = j()))

to show to gcc and to carbon-based bipedals that you indeed wanted this.

Now there is no nice way to make an annotation that says "look, I'm
going to initialize an array of char with a string literal now, and the
resulting array will contain a non-zero member as last element, and I
mean it". And since there is no such annotation possible, gcc does not
warn and demand that you annotate the perfectly valid and 100%
spec-compliant construct char a[4] = "1234";.
--
Stefan Richter
-=====-=-=== =--- ---==
http://arcgraph.de/sr/

2007-08-02 22:32:19

by Stefan Richter

Subject: Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski wrote:
> with
>
> char c[4] = "012345";
>
> the compiler warns, but actually allocates a 6-byte long array...

Off-topic here, but: sizeof c / sizeof *c == 4.
--
Stefan Richter
-=====-=-=== =--- ---==
http://arcgraph.de/sr/

2007-08-02 22:36:46

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Robert Hancock wrote:

> Because 5 characters will not fit in a 4 character array, even without the
> null terminator.

On Fri, 3 Aug 2007, Stefan Richter wrote:

> How should gcc know whether you actually wanted that char foo[len] to
> contain a \0 as last element?

Robert, Stefan, I am sorry, I think, you are VERY wrong here. There is no
"even" and no guessing. The "string" DOES include a terminating '\0'. It
is EQUIVALENT to {'s', 't', 'r', 'i', 'n', 'g', '\0'}. And it contains
SEVEN characters. Please, re-read your K&R. Specifically, the Section
"Initialization" in the "Function and Program Structure" chapter (section
4.9 in my copy), the paragraph about initialization with a string, which I
quoted in an earlier email.

And, Stefan, there is a perfect way to specify a "0123" without the '\0' -
{'0', '1', '2', '3'}.

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 22:43:16

by Stefan Richter

Subject: (off-topic) Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski wrote:
> Robert, Stefan, I am sorry, I think, you are VERY wrong here.

You meant to say "C99 is very wrong".

> And, Stefan, there is a perfect way to specify a "0123" without the '\0' -
> {'0', '1', '2', '3'}.

C99 says char c[4] = "0123"; is a perfect way to say char c[4] = {'0',
'1', '2', '3'};. Do you want gcc to choose otherwise and ban the former
of the two idioms?
--
Stefan Richter
-=====-=-=== =--- ---==
http://arcgraph.de/sr/
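
The equivalence of the two idioms can be checked directly. This is an
illustrative sketch only; the array names and same_bytes() are assumptions,
not code from the thread.

```c
#include <string.h>

static const char from_literal[4] = "0123";
static const char from_braces[4]  = { '0', '1', '2', '3' };

/* Return 1 if both 4-byte arrays hold identical bytes, else 0. */
static int same_bytes(const char a[4], const char b[4])
{
    return memcmp(a, b, 4) == 0;
}
```

same_bytes(from_literal, from_braces) yields 1: both arrays are four bytes
long and neither stores a '\0'.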

2007-08-02 22:48:59

by Randy Dunlap

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, 3 Aug 2007 00:36:40 +0200 (CEST) Guennadi Liakhovetski wrote:

> On Thu, 2 Aug 2007, Robert Hancock wrote:
>
> > Because 5 characters will not fit in a 4 character array, even without the
> > null terminator.
>
> On Fri, 3 Aug 2007, Stefan Richter wrote:
>
> > How should gcc know whether you actually wanted that char foo[len] to
> > contain a \0 as last element?
>
> Robert, Stefan, I am sorry, I think, you are VERY wrong here. There is no
> "even" and no guessing. The "string" DOES include a terminating '\0'. It
> is EQUIVALENT to {'s', 't', 'r', 'i', 'n', 'g', '\0'}. And it contains
> SEVEN characters. Please, re-read your K&R. Specifically, the Section
> "Initialization" in the "Function and Program Structure" chapter (section
> 4.9 in my copy), the paragraph about initialization with a string, which I
> quoted in an earlier email.
>
> And, Stefan, there is a perfect way to specify a "0123" without the '\0' -
> {'0', '1', '2', '3'}.

We are actually a bit beyond traditional K&R, fwiw.

C99 spec that Al referred you to (available for around US$18 as a pdf)
says in 6.7.8, para. 14 (where Al said):

"An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array."


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2007-08-02 22:51:20

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, 3 Aug 2007, Stefan Richter wrote:

> Guennadi Liakhovetski wrote:
> > with
> >
> > char c[4] = "012345";
> >
> > the compiler warns, but actually allocates a 6-byte long array...
>
> Off-topic here, but: sizeof c / sizeof *c == 4.

Don't think it is OT here - kernel depends on gcc. And, what I meant, is,
that gcc places all 7 (sorry, not 6 as I said above) characters in the
.rodata section of the compiled object file. Of course, it doesn't mean,
that c is 7 characters long.

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 23:02:30

by Al Viro

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, Aug 03, 2007 at 12:36:40AM +0200, Guennadi Liakhovetski wrote:
> On Thu, 2 Aug 2007, Robert Hancock wrote:
>
> > Because 5 characters will not fit in a 4 character array, even without the
> > null terminator.
>
> On Fri, 3 Aug 2007, Stefan Richter wrote:
>
> > How should gcc know whether you actually wanted that char foo[len] to
> > contain a \0 as last element?
>
> Robert, Stefan, I am sorry, I think, you are VERY wrong here. There is no
> "even" and no guessing. The "string" DOES include a terminating '\0'.

Read the fucking standard. In particular, notice that meaning of
string literals outside of initializer is *defined* via that in
initializers. IOW, string literals contain _nothing_ - not '\0', not
anything else. The entire reason why use of string literal ends up
with anon array containing \0 is exactly there - it's "how do we
determine the actual length of array of character with unknown length
initialized by string literal". _That_ is where \0 comes from.

And yes, all quotes you've given are correct. You are blatantly ignoring
the context even when you are including all relevant parts into the quoted
text. This stuff hadn't changed since K&R.

2007-08-02 23:03:46

by Al Viro

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, Aug 02, 2007 at 03:54:34PM -0700, Randy Dunlap wrote:
> >
> > And, Stefan, there is a perfect way to specify a "0123" without the '\0' -
> > {'0', '1', '2', '3'}.
>
> We are actually a bit beyond traditional K&R, fwiw.

Not in that area - this behaviour is precisely what traditional K&R
had all along. Unchanged.

2007-08-02 23:09:26

by Al Viro

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, Aug 03, 2007 at 12:51:16AM +0200, Guennadi Liakhovetski wrote:
> On Fri, 3 Aug 2007, Stefan Richter wrote:
>
> > Guennadi Liakhovetski wrote:
> > > with
> > >
> > > char c[4] = "012345";
> > >
> > > the compiler warns, but actually allocates a 6-byte long array...
> >
> > Off-topic here, but: sizeof c / sizeof *c == 4.
>
> Don't think it is OT here - kernel depends on gcc. And, what I meant, is,
> that gcc places all 7 (sorry, not 6 as I said above) characters in the
> .rodata section of the compiled object file. Of course, it doesn't mean,
> that c is 7 characters long.

So gcc does that kind of recovery, after having warned you. Makes sense,
as long as it's for ordinary variables (and not, say it, struct fields) -
you get less likely runtime breakage on the undefined behaviour (e.g.
passing c to string functions). So gcc has generated some padding between
the global variables, that's all.

It doesn't change the fact that use of c[4] or strlen(c) or strcpy(..., c)
means nasal demon country for you.

Now, if gcc does that for similar situation with struct fields, you'd have
a cause to complain.

2007-08-02 23:26:46

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, 2 Aug 2007, Randy Dunlap wrote:

> C99 spec that Al referred you to (available for around US$18 as a pdf)
> says in 6.7.8, para. 14 (where Al said):
>
> "An array of character type may be initialized by a character string literal, optionally
> enclosed in braces. Successive characters of the character string literal (including the
> terminating null character if there is room or if the array is of unknown size) initialize the
> elements of the array."

Wow... So, the terminating '\0' in the string constant IS "special" and
"optional"... Ok, then, THIS does answer my question, THIS I can
understand, and, ghm, accept...

Thanks to all who tried to explain this to me and sorry it took so long...

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 23:27:56

by Stefan Richter

Subject: Re: gcc fixed size char array initialization bug - known?

Al Viro wrote:
> On Fri, Aug 03, 2007 at 12:51:16AM +0200, Guennadi Liakhovetski wrote:
>> On Fri, 3 Aug 2007, Stefan Richter wrote:
>>
>>> Guennadi Liakhovetski wrote:
>>>> with
>>>>
>>>> char c[4] = "012345";
>>>>
>>>> the compiler warns, but actually allocates a 6-byte long array...
>>> Off-topic here, but: sizeof c / sizeof *c == 4.
>> Don't think it is OT here - kernel depends on gcc. And, what I meant, is,
>> that gcc places all 7 (sorry, not 6 as I said above) characters in the
>> .rodata section of the compiled object file. Of course, it doesn't mean,
>> that c is 7 characters long.
>
> So gcc does that kind of recovery, after having warned you. Makes sense,
> as long as it's for ordinary variables (and not, say it, struct fields) -
> you get less likely runtime breakage on the undefined behaviour (e.g.
> passing c to string functions). So gcc has generated some padding between
> the global variables, that's all.

No, the fact that the full 012345\0 ends up in the object file is
apparently unrelated to what happens to the variable c...

> It doesn't change the fact that use of c[4] or strlen(c) or strcpy(..., c)
> means nasal demon country for you.
>
> Now, if gcc does that for similar situation with struct fields, you'd have
> a cause to complain.

...since only 0123 will get into c at runtime, i.e. a 4 bytes long array
without \0 appendix or other extraordinary padding.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char c[4] = "012345";

    /* strlen(c) and %s deliberately read past the unterminated array (undefined behaviour) */
    printf("%zu %zu _%s_\n", sizeof c / sizeof *c, strlen(c), c);
    return 0;
}

$ ./a.out
4 8 _01230??_

$ strings a.out |grep 0123
012345

--
Stefan Richter
-=====-=-=== =--- ---==
http://arcgraph.de/sr/

2007-08-02 23:30:30

by Guennadi Liakhovetski

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, 3 Aug 2007, Al Viro wrote:

> It doesn't change the fact that use of c[4] or strlen(c) or strcpy(..., c)
> means nasal demon country for you.

Haha, funny. You, certainly, may think whatever you want, I'm anyway
grateful to you and to all the rest for the trouble you took to find THE
quote that actually answers the question.

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-08-02 23:37:21

by Rene Herman

Subject: Re: gcc fixed size char array initialization bug - known?

On 08/03/2007 01:26 AM, Guennadi Liakhovetski wrote:

> On Thu, 2 Aug 2007, Randy Dunlap wrote:
>
>> C99 spec that Al referred you to (available for around US$18 as a pdf)
>> says in 6.7.8, para. 14 (where Al said):
>>
>> "An array of character type may be initialized by a character string
>> literal, optionally enclosed in braces. Successive characters of the
>> character string literal (including the terminating null character if
>> there is room or if the array is of unknown size) initialize the
>> elements of the array."
>
> Wow... So, the terminating '\0' in the string constant IS "special" and
> "optional"... Ok, then, THIS does answer my question, THIS I can
> understand, and, ghm, accept...
>
> Thanks to all who tried to explain this to me and sorry it took so
> long...

Ah come on, it would be great fun to now make the argument that that quoted
bit doesn't actually say what should happen when there's _no_ room for the
terminating null character...

Rene.

2007-08-02 23:42:41

by Jakub Jelinek

Subject: Re: gcc fixed size char array initialization bug - known?

On Thu, Aug 02, 2007 at 09:55:51PM +0200, Guennadi Liakhovetski wrote:
> I've run across the following gcc "feature":
>
> char c[4] = "01234";
>
> gcc emits a nice warning
>
> warning: initializer-string for array of chars is too long
>
> But do a
>
> char c[4] = "0123";
>
> and - a wonder - no warning. No warning with gcc 3.3.2, 3.3.5, 3.4.5,
> 4.1.2. I was told 4.2.x does produce a warning.

Neither 4.2.x nor 4.3 warns, and it is correct not to warn about
perfectly valid code.
ISO C99 is very obvious in that the terminating '\0' (resp. L'\0') from
the string literal is only added if there is room in the array or if the
array has unknown size.

Jakub

2007-08-03 03:05:37

by Satyam Sharma

Subject: Re: gcc fixed size char array initialization bug - known?



On Thu, 2 Aug 2007, Jan Engelhardt wrote:

> On Aug 2 2007 21:55, Guennadi Liakhovetski wrote:
> > [...]
> >
> > struct {
> > char c[4];
> > int i;
> > } t;
> > t.i = 0x12345678;
> > strcpy(t.c, c);
> >
> >and t.i is silently corrupted. Just wanted to ask if this is known,
> >really...
>
> What does this have to do with the kernel? The string "0123" is
> generally _five_ characters long, so c[4] is not enough.
> Or use strncpy.

<nitpicking>

While we're talking of null-termination of strings, then I bet you
generally want to be using strlcpy(), really. Often strncpy() isn't
what you want. Of course, if that buffer isn't a string at all, then
you should be using memfoo() functions and not strbar() ones in the
first place ...

2007-08-03 03:39:21

by Cong Wang

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, Aug 03, 2007 at 08:47:56AM +0530, Satyam Sharma wrote:
>
>
>On Thu, 2 Aug 2007, Jan Engelhardt wrote:
>
>> On Aug 2 2007 21:55, Guennadi Liakhovetski wrote:
>> > [...]
>> >
>> > struct {
>> > char c[4];
>> > int i;
>> > } t;
>> > t.i = 0x12345678;
>> > strcpy(t.c, c);
>> >
>> >and t.i is silently corrupted. Just wanted to ask if this is known,
>> >really...
>>
>> What does this have to do with the kernel? The string "0123" is
>> generally _five_ characters long, so c[4] is not enough.
>> Or use strncpy.
>
><nitpicking>
>
>While we're talking of null-termination of strings, then I bet you
>generally want to be using strlcpy(), really. Often strncpy() isn't
>what you want. Of course, if that buffer isn't a string at all, then
>you should be using memfoo() functions and not strbar() ones in the
>first place ...

Afaik, strlcpy() and strlcat() are NOT standard C library functions.
But, I know, they are available in Linux kernel. ;) And yes, they
are better than strn{cpy,cat}().

Regards.
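
Since strlcpy() is not in the standard C library, a portable stand-in is
sketched below. copy_trunc() is a hypothetical helper written in the spirit
of strlcpy(), not the kernel implementation. Unlike strncpy(), it always
terminates the destination.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (capacity dstsize) and always terminate, truncating
 * if needed. Returns strlen(src), so a result >= dstsize signals
 * truncation. src must itself be a terminated string.
 */
static size_t copy_trunc(char *dst, const char *src, size_t dstsize)
{
    size_t len = strlen(src);

    if (dstsize != 0) {
        size_t n = len < dstsize - 1 ? len : dstsize - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

With char buf[4], copy_trunc(buf, "01234", sizeof buf) stores "012" and
returns 5. By contrast, strncpy(buf, "0123", sizeof buf) fills all four
bytes and leaves buf unterminated.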

--
_ /|
\'o.O'
=(___)=
U

2007-08-03 04:57:21

by Carlo Florendo

Subject: Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski wrote:
> On Thu, 2 Aug 2007, Robert Hancock wrote:
>
>> Because 5 characters will not fit in a 4 character array, even without the
>> null terminator.
>
> On Fri, 3 Aug 2007, Stefan Richter wrote:
>
>> How should gcc know whether you actually wanted that char foo[len] to
>> contain a \0 as last element?
>
> Robert, Stefan, I am sorry, I think, you are VERY wrong here. There is no
> "even" and no guessing. The "string" DOES include a terminating '\0'. It
> is EQUIVALENT to {'s', 't', 'r', 'i', 'n', 'g', '\0'}. And it contains
> SEVEN characters. Please, re-read your K&R. Specifically, the Section
> "Initialization" in the "Function and Program Structure" chapter (section
> 4.9 in my copy), the paragraph about initialization with a string, which I
> quoted in an earlier email.

Guennadi,

The declaration

char c[4] = "abcd";

is perfectly valid.

If other versions of gcc give warnings with that declaration, then those
warnings may be useful but it does not mean to say that other versions of
gcc follow the standards or not.

K&R is good as a reference but not as an authority. They drafted the book
as an informal specification of C. C has evolved throughout the decades.

The current standard is C99. And as quoted earlier in this thread,
character array initializations are described as:

6.7.8.14 of C99:
An array of character type may be initialized by a character string
literal, optionally enclosed in braces. Successive characters of the
character string literal (including the terminating null character if there
is room or if the array is of unknown size) initialize the elements of the
array.

The warning you see in some gcc versions has nothing to do with the
current C standard. IOW, the fact that a certain gcc version does not emit
a warning for this initialization does not mean that it is buggy in that
respect.

Thank you very much.

Best Regards,

Carlo

--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, UP Campus Diliman
1101 Quezon City, Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp

2007-08-03 05:00:43

by Carlo Florendo

Subject: Re: gcc fixed size char array initialization bug - known?

Guennadi Liakhovetski wrote:
> On Thu, 2 Aug 2007, Randy Dunlap wrote:
>
>> C99 spec that Al referred you to (available for around US$18 as a pdf)
>> says in 6.7.8, para. 14 (where Al said):
>>
>> "An array of character type may be initialized by a character string literal, optionally
>> enclosed in braces. Successive characters of the character string literal (including the
>> terminating null character if there is room or if the array is of unknown size) initialize the
>> elements of the array."
>
> Wow... So, the terminating '\0' in the string constant IS "special" and
> "optional"... Ok, then, THIS does answer my question, THIS I can
> understand, and, ghm, accept...
>
> Thanks to all who tried to explain this to me and sorry it took so long...

You should not have asked in the first place. The declaration

char c[4] = "abcd";

is perfectly valid. There is no cause for debate about it :)

Thank you very much.

Best Regards,

Carlo

--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, UP Campus Diliman
1101 Quezon City, Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp

2007-08-03 07:33:17

by Bernd Petrovitsch

Subject: Re: gcc fixed size char array initialization bug - known?

On Fri, 2007-08-03 at 11:40 +0800, WANG Cong wrote:
> On Fri, Aug 03, 2007 at 08:47:56AM +0530, Satyam Sharma wrote:
[....]
> >While we're talking of null-termination of strings, then I bet you
> >generally want to be using strlcpy(), really. Often strncpy() isn't
> >what you want. Of course, if that buffer isn't a string at all, then
> >you should be using memfoo() functions and not strbar() ones in the
> >first place ...
>
> Afaik, strlcpy() and strlcat() are NOT standard C library functions.

Yes, because they are not old enough as they are results of lessons
learned with strncpy() and strcpy() and other buffer overflows.

> But, I know, they are available in Linux kernel. ;) And yes, they
> are better than strn{cpy,cat}().

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services

2007-08-03 07:56:38

by Jan Engelhardt

Subject: Re: gcc fixed size char array initialization bug - known?


On Aug 3 2007 01:30, Guennadi Liakhovetski wrote:
>On Fri, 3 Aug 2007, Al Viro wrote:
>
>> It doesn't change the fact that use of c[4] or strlen(c) or strcpy(..., c)
>> means nasal demon country for you.
>
>Haha, funny. You, certainly, may think whatever you want, I'm anyway
>grateful to you and to all the rest for the trouble you took to find THE
>quote that actually answers the question.

So back to the topic - if you want to check whether the kernel
'accidentally' uses

char foo[4] = "abcd";

change gcc or sparse to warn about this and then sort out the pieces
which are good. :)


Jan
--

2007-08-03 14:04:22

by Alexander van Heukelum

Subject: Re: gcc fixed size char array initialization bug - known?


On Fri, 3 Aug 2007 00:09:15 +0100, "Al Viro" <[email protected]> said:
> On Fri, Aug 03, 2007 at 12:51:16AM +0200, Guennadi Liakhovetski wrote:
> > On Fri, 3 Aug 2007, Stefan Richter wrote:
> >
> > > Guennadi Liakhovetski wrote:
> > > > with
> > > >
> > > > char c[4] = "012345";
> > > >
> > > > the compiler warns, but actually allocates a 6-byte long array...
> > >
> > > Off-topic here, but: sizeof c / sizeof *c == 4.
> >
> > Don't think it is OT here - kernel depends on gcc. And, what I meant, is,
> > that gcc places all 7 (sorry, not 6 as I said above) characters in the
> > .rodata section of the compiled object file. Of course, it doesn't mean,
> > that c is 7 characters long.
>
> So gcc does that kind of recovery, after having warned you. Makes sense,
> as long as it's for ordinary variables (and not, say it, struct fields) -
> you get less likely runtime breakage on the undefined behaviour (e.g.
> passing c to string functions). So gcc has generated some padding
> between the global variables, that's all.
>
> It doesn't change the fact that use of c[4] or strlen(c) or strcpy(...,
> c) means nasal demon country for you.
>
> Now, if gcc does that for similar situation with struct fields, you'd
> have a cause to complain.

Hi!

(It took me a while to understand that the last "that" referred to
padding inside a struct generated by gcc due to overlong initializers.)
But from the rest of the thread it seems that some people expect the
compiler to warn about the following...

struct {char c[4];} s1 = {"abcd"};

It doesn't. Of course, if one wants to be warned in such cases
(initialisation of a character array of specified length using a string
constant), one could tell the compiler that the 0 at the end should
really be there:

struct {char c[4];} s2 = {"abcd" "\0"};

Writing it like this will give them the expected warning.

Greetings,
Alexander
--
Alexander van Heukelum
[email protected]


2007-08-03 15:16:21

by Stefan Richter

Subject: Re: gcc fixed size char array initialization bug - known?

Jakub Jelinek wrote:
> ISO C99 is very obvious in that the terminating '\0' (resp. L'\0') from
> the string literal is only added if there is room in the array or if the
> array has unknown size.

I would say C99 is /explicit/ in this regard.
It doesn't seem like an overly /obvious/ language feature to me.
--
Stefan Richter
-=====-=-=== =--- ---==
http://arcgraph.de/sr/