2005-04-03 11:55:49

by Dag Arne Osvik

Subject: Use of C99 int types

Hi,

I've been working on a new DES implementation for Linux, and ran into
the problem of how to get access to C99 types like uint_fast32_t for
internal (not interface) use. In my tests, key setup on Athlon 64 slows
down by 40% when using u32 instead of uint_fast32_t.

So I wonder if there is any standard way of, say, including stdint.h for
internal use in kernel code?
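
For readers unfamiliar with the C99 names, here is a minimal userspace sketch of the three flavours of "32-bit" unsigned type from <stdint.h>. The struct and function names are made up for illustration and are not part of any kernel header.

```c
#include <stddef.h>
#include <stdint.h>

/* Width in bytes of the three C99 flavours of 32-bit unsigned integer. */
struct u32_widths {
    size_t exact;   /* uint32_t: exactly 32 bits, no more */
    size_t least;   /* uint_least32_t: smallest type with at least 32 bits */
    size_t fast;    /* uint_fast32_t: at least 32 bits, chosen for speed */
};

/* Collect the sizes the current compiler and C library have chosen. */
struct u32_widths query_u32_widths(void)
{
    struct u32_widths w = {
        sizeof(uint32_t),
        sizeof(uint_least32_t),
        sizeof(uint_fast32_t),
    };
    return w;
}
```

Only the "exact" member is fixed by the standard; the "fast" member is whatever the C library considers quickest, so it may differ between a 32-bit and a 64-bit build of the same source.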

Dag Arne


2005-04-03 12:05:24

by Stephen Rothwell

Subject: Re: Use of C99 int types

On Sun, 03 Apr 2005 13:55:39 +0200 Dag Arne Osvik <[email protected]> wrote:
>
> I've been working on a new DES implementation for Linux, and ran into
> the problem of how to get access to C99 types like uint_fast32_t for
> internal (not interface) use. In my tests, key setup on Athlon 64 slows
> down by 40% when using u32 instead of uint_fast32_t.

If you look in stdint.h you may find that uint_fast32_t is actually
64 bits on Athlon 64 ... so does it help if you use u64?

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/



2005-04-03 12:30:20

by Dag Arne Osvik

Subject: Re: Use of C99 int types

Stephen Rothwell wrote:

>On Sun, 03 Apr 2005 13:55:39 +0200 Dag Arne Osvik <[email protected]> wrote:
>
>
>>I've been working on a new DES implementation for Linux, and ran into
>>the problem of how to get access to C99 types like uint_fast32_t for
>>internal (not interface) use. In my tests, key setup on Athlon 64 slows
>>down by 40% when using u32 instead of uint_fast32_t.
>>
>>
>
>If you look in stdint.h you may find that uint_fast32_t is actually
>64 bits on Athlon 64 ... so does it help if you use u64?
>
>
>

Yes, but wouldn't it be much better to avoid code like the following,
which may also be wrong (in terms of speed)?

#ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
#define fast_u32 u64
#else
#define fast_u32 u32
#endif

2005-04-03 13:27:44

by Andreas Schwab

Subject: Re: Use of C99 int types

Dag Arne Osvik <[email protected]> writes:

> Yes, but wouldn't it be much better to avoid code like the following,
> which may also be wrong (in terms of speed)?
>
> #ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
> #define fast_u32 u64
> #else
> #define fast_u32 u32
> #endif

How about using just unsigned long instead?

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2005-04-03 18:13:23

by Al Viro

Subject: Re: Use of C99 int types

On Sun, Apr 03, 2005 at 02:30:11PM +0200, Dag Arne Osvik wrote:
> Yes, but wouldn't it be much better to avoid code like the following,
> which may also be wrong (in terms of speed)?
>
> #ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
> #define fast_u32 u64
> #else
> #define fast_u32 u32
> #endif

... and with such a name 99% will assume (at least at the first reading)
that it _is_ 32 bits. We have more than enough portability bugs as it
is, no need to invite more by bad names.
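
The kind of bug such a name invites can be sketched as follows. This is an illustrative userspace example, not code from the thread: the function names are hypothetical, and it assumes only <stdint.h>.

```c
#include <stdint.h>

/*
 * Naive addition: correct only if the type wraps at 2^32.
 * Where uint_fast32_t is 64 bits wide, 0xFFFFFFFF + 1 yields
 * 0x100000000 here instead of 0.
 */
uint_fast32_t add_naive(uint_fast32_t a, uint_fast32_t b)
{
    return a + b;
}

/* Portable addition modulo 2^32: the mask makes the wrap explicit. */
uint_fast32_t add_mod32(uint_fast32_t a, uint_fast32_t b)
{
    return (a + b) & UINT32_C(0xFFFFFFFF);
}
```

Code written against the name alone tends to look like add_naive(); only the masked version behaves the same regardless of how wide the "fast" type turns out to be.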

2005-04-03 19:18:32

by Renate Meijer

Subject: Re: Use of C99 int types


On Apr 3, 2005, at 2:30 PM, Dag Arne Osvik wrote:

> Stephen Rothwell wrote:
>
>> On Sun, 03 Apr 2005 13:55:39 +0200 Dag Arne Osvik <[email protected]> wrote:
>>
>>> I've been working on a new DES implementation for Linux, and ran into
>>> the problem of how to get access to C99 types like uint_fast32_t for
>>> internal (not interface) use. In my tests, key setup on Athlon 64
>>> slows
>>> down by 40% when using u32 instead of uint_fast32_t.
>>>
>>
>> If you look in stdint.h you may find that uint_fast32_t is actually
>> 64 bits on Athlon 64 ... so does it help if you use u64?
>>
>>
>
> Yes, but wouldn't it be much better to avoid code like the following,
> which may also be wrong (in terms of speed)?
>
> #ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
> #define fast_u32 u64
> #else
> #define fast_u32 u32
> #endif

Isn't it better to use a general integer type, reflecting the CPU's
native register size, and let the compiler sort it out? Restrict all
uses of explicit-width types to where they're *really* needed, that is,
in drivers, network code, etc. I firmly oppose any definition of
"#define fast_u32 u64". This kind of definition will only create
needless confusion.

I wonder how much other code is suffering from this kind of overly
explicit typing. It's much easier to make assumptions about integer
size unwittingly than it is to avoid them. I used to assume (for
instance) that sizeof(int) == sizeof(long) == sizeof(void *) at one
point in my career. Fortunately, reality soon asserted itself again.
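
A small sketch of how that assumption can be tested rather than taken on faith. The function names are hypothetical; uintptr_t is the optional C99 type from <stdint.h> intended for holding a converted pointer, and it is assumed to be available here.

```c
#include <stdint.h>

/* Returns 1 only where int, long and pointers all have the same size. */
int all_word_sizes_equal(void)
{
    return sizeof(int) == sizeof(long) && sizeof(long) == sizeof(void *);
}

/*
 * Round-trip a pointer through an integer without guessing its width:
 * uintptr_t is wide enough by definition, whereas int may not be.
 */
int pointer_roundtrips(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}
```

On a typical 32-bit Linux build all_word_sizes_equal() returns 1; on a 64-bit build, where int stays at 4 bytes while long and pointers grow to 8, it returns 0.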

Regards,

Renate Meijer.

2005-04-03 20:26:00

by Kenneth Johansson

Subject: Re: Use of C99 int types

On Sun, 2005-04-03 at 21:23 +0200, Renate Meijer wrote:
> On Apr 3, 2005, at 2:30 PM, Dag Arne Osvik wrote:
>
> > Stephen Rothwell wrote:
> >
> >> On Sun, 03 Apr 2005 13:55:39 +0200 Dag Arne Osvik <[email protected]> wrote:
> >>
> >>> I've been working on a new DES implementation for Linux, and ran into
> >>> the problem of how to get access to C99 types like uint_fast32_t for
> >>> internal (not interface) use. In my tests, key setup on Athlon 64
> >>> slows
> >>> down by 40% when using u32 instead of uint_fast32_t.
> >>>
> >>
> >> If you look in stdint.h you may find that uint_fast32_t is actually
> >> 64 bits on Athlon 64 ... so does it help if you use u64?
> >>
> >>
> >
> > Yes, but wouldn't it be much better to avoid code like the following,
> > which may also be wrong (in terms of speed)?
> >
> > #ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
> > #define fast_u32 u64
> > #else
> > #define fast_u32 u32
> > #endif
>
> Isn't it better to use a general integer type, reflecting the cpu's
> native register-size and let the compiler sort it out? Restrict all
> uses of explicit width types to where it's *really* needed, that is, in

But is this not exactly what Dag Arne Osvik was trying to do ??
uint_fast32_t means that we want at least 32 bits but it's OK with more
if that happens to be faster on this particular architecture. The
problem was that the C99 standard types are not defined anywhere in the
kernel headers so they can not be used.

Perhaps they should be added to asm/types.h ?





2005-04-03 22:09:10

by Kyle Moffett

Subject: Re: Use of C99 int types

On Apr 03, 2005, at 16:25, Kenneth Johansson wrote:
> But is this not exactly what Dag Arne Osvik was trying to do ??
> uint_fast32_t means that we want at least 32 bits but it's OK with
> more if that happens to be faster on this particular architecture.
> The problem was that the C99 standard types are not defined anywhere
> in the kernel headers so they can not be used.

Uhh, so what's wrong with "int" or "long"? On all existing archs
supported by Linux, "int" is 32 bits, "long long" is 64 bits, and
"long" is an efficient word-sized value that can hold a cast
pointer. In theory Linux could be ported to some arch where int is
16 bits, but so much stuff implicitly depends on at least 32 bits in
int that I think that's unlikely. GCC will generally do the right
thing if you just tell it "int".

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-04-03 22:48:15

by Dag Arne Osvik

Subject: Re: Use of C99 int types

Andreas Schwab wrote:

>Dag Arne Osvik <[email protected]> writes:
>
>
>
>>Yes, but wouldn't it be much better to avoid code like the following,
>>which may also be wrong (in terms of speed)?
>>
>>#ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
>> #define fast_u32 u64
>>#else
>> #define fast_u32 u32
>>#endif
>>
>>
>
>How about using just unsigned long instead?
>
>

unsigned long happens to coincide with uint_fast32_t for x86 and x86-64,
but there's no guarantee that it will on other architectures. And, at
least in theory, long may even provide less than 32 bits.

--
Dag Arne

2005-04-03 23:03:38

by Dag Arne Osvik

Subject: Re: Use of C99 int types

Al Viro wrote:

>On Sun, Apr 03, 2005 at 02:30:11PM +0200, Dag Arne Osvik wrote:
>
>
>>Yes, but wouldn't it be much better to avoid code like the following,
>>which may also be wrong (in terms of speed)?
>>
>>#ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
>> #define fast_u32 u64
>>#else
>> #define fast_u32 u32
>>#endif
>>
>>
>
>... and with such name 99% will assume (at least at the first reading)
>that it _is_ 32bits. We have more than enough portability bugs as it
>is, no need to invite more by bad names.
>
>

Agreed. The way I see it there are two reasonable options. One is to
just use u32, which is always correct but sacrifices speed (at least
with the current gcc). The other is to introduce C99 types, which Linus
doesn't seem to object to when they are kept away from interfaces
(http://infocenter.guardiandigital.com/archive/linux-kernel/2004/Dec/0117.html).

--
Dag Arne

2005-04-03 23:06:03

by Al Viro

Subject: Re: Use of C99 int types

On Mon, Apr 04, 2005 at 12:48:04AM +0200, Dag Arne Osvik wrote:
> unsigned long happens to coincide with uint_fast32_t for x86 and x86-64,
> but there's no guarantee that it will on other architectures. And, at
> least in theory, long may even provide less than 32 bits.

To port to such a platform we'd have to do a lot of rewriting - so much that
the impact of this issue will be lost in the noise.

Look, it's very simple:
* too many people blindly assume that all the world is 32-bit little-endian.
* too many of those who try to do portable code have very little
idea of what that means - see the drivers that try and mix e.g. size_t with
int, etc.
* stdint is not widely understood, to put it mildly.
* ...fast... types have very unfortunate names - these are guaranteed
to create a lot of confusion.
* pretty much everything in the kernel assumes that
4 = sizeof(int) <=
sizeof(long) = sizeof(pointer) = sizeof(size_t) = sizeof(ptrdiff_t) <=
sizeof(long long) = 8
and any platform that doesn't satisfy the above will require very serious
work on porting anyway.
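
The size relations in the last bullet can be written down as a check. This is a userspace sketch with a hypothetical function name; it says nothing about how the kernel itself verifies anything.

```c
#include <stddef.h>

/*
 * Returns 1 when the platform matches the size model listed above:
 * 4 = int <= long = pointer = size_t = ptrdiff_t <= long long = 8.
 */
int matches_kernel_size_model(void)
{
    return sizeof(int) == 4
        && sizeof(int) <= sizeof(long)
        && sizeof(long) == sizeof(void *)
        && sizeof(long) == sizeof(size_t)
        && sizeof(long) == sizeof(ptrdiff_t)
        && sizeof(long) <= sizeof(long long)
        && sizeof(long long) == 8;
}
```

Both the ILP32 and LP64 models used by Linux satisfy the chain. A platform such as 64-bit Windows, where long stays at 32 bits while pointers are 64 bits, does not, which is the sort of target the message says would need serious porting work.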

2005-04-03 23:14:46

by Grzegorz Kulewski

Subject: Re: Use of C99 int types

On Mon, 4 Apr 2005, Dag Arne Osvik wrote:
> (...) And, at least in
> theory, long may even provide less than 32 bits.

Are you sure?

My copy of the famous C book by B. W. Kernighan and D. Ritchie says that

sizeof(short) <= sizeof(int) <= sizeof(long)

and

sizeof(short) >= 16,
sizeof(int) >= 16,
sizeof(long) >= 32.

The book is about ANSI C not C99 but I think this is still valid.

Am I wrong?
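
The guarantees quoted above are minimum widths in bits, while sizeof itself counts bytes. The C standard expresses them as minimum value ranges in <limits.h>, so they can be checked as in this sketch (function names are illustrative only):

```c
#include <limits.h>
#include <stddef.h>

/* Minimum ranges required by the C standard, read from <limits.h>. */
int short_has_16_bits(void) { return USHRT_MAX >= 0xFFFFu; }
int int_has_16_bits(void)   { return UINT_MAX  >= 0xFFFFu; }
int long_has_32_bits(void)  { return ULONG_MAX >= 0xFFFFFFFFul; }

/* Storage size of unsigned long in bits: sizeof counts bytes, not bits. */
size_t bits_in_unsigned_long(void)
{
    return sizeof(unsigned long) * CHAR_BIT;
}
```

All three predicates return 1 on any conforming implementation, which is why unsigned long can safely hold any 32-bit quantity.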


Grzegorz Kulewski

2005-04-03 23:20:53

by Dag Arne Osvik

Subject: Re: Use of C99 int types

Grzegorz Kulewski wrote:

> On Mon, 4 Apr 2005, Dag Arne Osvik wrote:
>
>> (...) And, at least in theory, long may even provide less than 32 bits.
>
>
> Are you sure?
>
> My copy of famous C book by B. W. Kernighan and D. Ritchie says that
>
> sizeof(short) <= sizeof(int) <= sizeof(long)
>
> and
>
> sizeof(short) >= 16,
> sizeof(int) >= 16,
> sizeof(long) >= 32.
>
> The book is about ANSI C not C99 but I think this is still valid.
>
> Am I wrong?


No, I just looked it up (section 2.2), and you're right.

--
Dag Arne

2005-04-04 00:06:08

by Adrian Bunk

Subject: Re: Use of C99 int types

On Mon, Apr 04, 2005 at 12:48:04AM +0200, Dag Arne Osvik wrote:
> Andreas Schwab wrote:
>
> >Dag Arne Osvik <[email protected]> writes:
> >
> >
> >
> >>Yes, but wouldn't it be much better to avoid code like the following,
> >>which may also be wrong (in terms of speed)?
> >>
> >>#ifdef CONFIG_64BIT // or maybe CONFIG_X86_64?
> >>#define fast_u32 u64
> >>#else
> >>#define fast_u32 u32
> >>#endif
> >>
> >>
> >
> >How about using just unsigned long instead?
> >
> >
>
> unsigned long happens to coincide with uint_fast32_t for x86 and x86-64,
> but there's no guarantee that it will on other architectures.
>...

The stdint.h shipped with glibc says:

<-- snip -->

/* Unsigned. */
typedef unsigned char uint_fast8_t;
#if __WORDSIZE == 64
typedef unsigned long int uint_fast16_t;
typedef unsigned long int uint_fast32_t;
typedef unsigned long int uint_fast64_t;
#else
typedef unsigned int uint_fast16_t;
typedef unsigned int uint_fast32_t;
__extension__
typedef unsigned long long int uint_fast64_t;
#endif

<-- snip -->

> Dag Arne

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-04-04 03:11:30

by Herbert Xu

Subject: Re: Use of C99 int types

Dag Arne Osvik <[email protected]> wrote:
>
>>... and with such name 99% will assume (at least at the first reading)
>>that it _is_ 32bits. We have more than enough portability bugs as it
>>is, no need to invite more by bad names.
>
> Agreed. The way I see it there are two reasonable options. One is to
> just use u32, which is always correct but sacrifices speed (at least
> with the current gcc). The other is to introduce C99 types, which Linus
> doesn't seem to object to when they are kept away from interfaces
> (http://infocenter.guardiandigital.com/archive/linux-kernel/2004/Dec/0117.html).

There is a third option which has already been pointed out before:

Use unsigned long.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2005-04-04 08:42:23

by Dag Arne Osvik

Subject: Re: Use of C99 int types

Herbert Xu wrote:

>Dag Arne Osvik <[email protected]> wrote:
>
>
>>>... and with such name 99% will assume (at least at the first reading)
>>>that it _is_ 32bits. We have more than enough portability bugs as it
>>>is, no need to invite more by bad names.
>>>
>>>
>>Agreed. The way I see it there are two reasonable options. One is to
>>just use u32, which is always correct but sacrifices speed (at least
>>with the current gcc). The other is to introduce C99 types, which Linus
>>doesn't seem to object to when they are kept away from interfaces
>>(http://infocenter.guardiandigital.com/archive/linux-kernel/2004/Dec/0117.html).
>>
>>
>
>There is a third option which has already been pointed out before:
>
>Use unsigned long.
>
>

Yes, as Kulewski pointed out, unsigned long is at least 32 bits wide and
therefore correct. Whether it's also fastest is less of a concern, but
it is so for at least the x86* architectures. So, sure, I'll use it.
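
As an illustration of what using unsigned long for a "32 bits or wider" quantity looks like in practice, here is a sketch of a 28-bit rotate of the kind the DES key schedule applies to each half of the key. It is not taken from my implementation; the function name and the masking style are assumptions made for the example.

```c
/*
 * Rotate a 28-bit value left by n bits (0 < n < 28).
 * The value lives in an unsigned long, which has at least 32 bits;
 * the explicit mask keeps the result correct if it has more.
 */
unsigned long rol28(unsigned long v, unsigned int n)
{
    const unsigned long mask28 = 0x0FFFFFFFul;

    v &= mask28;
    return ((v << n) | (v >> (28 - n))) & mask28;
}
```

Because every result is masked, the function gives the same answer whether unsigned long is 32 or 64 bits wide, so no width-specific typedef is needed.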

Cheers all,

--
Dag Arne

2005-04-04 10:00:08

by Renate Meijer

Subject: Re: Use of C99 int types


On Apr 4, 2005, at 12:08 AM, Kyle Moffett wrote:

> On Apr 03, 2005, at 16:25, Kenneth Johansson wrote:
>> But is this not exactly what Dag Arne Osvik was trying to do ??
>> uint_fast32_t means that we want at least 32 bits but it's OK with
>> more if that happens to be faster on this particular architecture.
>> The problem was that the C99 standard types are not defined anywhere
>> in the kernel headers so they can not be used.
>
> Uhh, so what's wrong with "int" or "long"?

My point exactly, though I agree with Kenneth that adding the C99 types
would be a Good Thing.

> GCC will generally do the right thing if you just tell it "int".

And if you don't, you imply some special requirement, which, if none
really exists, is misleading.

Regards,

Renate.

timeo hominem unius libri

Thomas van Aquino

2005-04-04 10:50:14

by Dag Arne Osvik

Subject: Re: Use of C99 int types

Renate Meijer wrote:

>
> On Apr 4, 2005, at 12:08 AM, Kyle Moffett wrote:
>
>> On Apr 03, 2005, at 16:25, Kenneth Johansson wrote:
>>
>>> But is this not exactly what Dag Arne Osvik was trying to do ??
>>> uint_fast32_t means that we want at least 32 bits but it's OK with
>>> more if that happens to be faster on this particular architecture.
>>> The problem was that the C99 standard types are not defined anywhere
>>> in the kernel headers so they can not be used.
>>
>>
>> Uhh, so what's wrong with "int" or "long"?
>

Nothing, as long as they work as required. And Grzegorz Kulewski
pointed out that unsigned long is required to be at least 32 bits,
fulfilling the present need for a 32-bit or wider type.

>
> My point exactly, though I agree with Kenneth that adding the C99 types
> would be a Good Thing.


If it leads to better code, then indeed it would be. However, Al Viro
disagrees and strongly hints they would lead to worse code.

>
>> GCC will generally do the right thing if you just tell it "int".
>
>
> And if you don't, you imply some special requirement, which, if none
> really exists, is
> misleading.


And in this case there is such a requirement. Anyway, I've already
decided to use unsigned long as a replacement for uint_fast32_t in my
implementation.

--
Dag Arne

2005-04-04 20:50:43

by Renate Meijer

Subject: Re: Use of C99 int types


On Apr 4, 2005, at 12:50 PM, Dag Arne Osvik wrote:

> Renate Meijer wrote:
>
>>
>> On Apr 4, 2005, at 12:08 AM, Kyle Moffett wrote:
>>
>>> On Apr 03, 2005, at 16:25, Kenneth Johansson wrote:
>>>
>>>> But is this not exactly what Dag Arne Osvik was trying to do ??
>>>> uint_fast32_t means that we want at least 32 bits but it's OK with
>>>> more if that happens to be faster on this particular architecture.
>>>> The problem was that the C99 standard types are not defined anywhere
>>>> in the kernel headers so they can not be used.
>>>
>>>
>>> Uhh, so what's wrong with "int" or "long"?
>>
>
> Nothing, as long as they work as required. And Grzegorz Kulewski
> pointed out that unsigned long is required to be at least 32 bits,
> fulfilling the present need for a 32-bit or wider type.

>> My point exactly, though I agree with Kenneth that adding the C99
>> types
>> would be a Good Thing.
>
>
> If it leads to better code, then indeed it would be.

At least a 32-bit integer is guaranteed to stay a 32-bit integer
(should one be required) through multiple incarnations of the compiler.

> However, Al Viro disagrees and strongly hints they would lead to
> worse code.

When used improperly. The #define Al Viro objected to is objectionable.
It's highly misleading, as Mr. Viro pointed out. I fail to see where he
made comments on stdint.h as such.

>> And if you don't, you imply some special requirement, which, if none
>> really exists, is
>> misleading.
>
> And in this case there is such a requirement.

Apart from the integer having 32 bits?

> Anyway, I've already decided to use unsigned long as a replacement
> for uint_fast32_t in my implementation.

Ok. I can live with that.

Regards,

Renate Meijer.

2005-04-04 21:02:34

by Al Viro

Subject: Re: Use of C99 int types

On Mon, Apr 04, 2005 at 10:30:52PM +0200, Renate Meijer wrote:

> When used improperly. The #define Al Viro objected to, is
> objectionable. It's highly
> misleading, as Mr. Viro pointed out. I fail to see where he made
> comments on stdint.h
> as such.

Comments on stdint.h are very simple: ...fast... type names are misleading
in exactly the same way as that define. The fact that they are in the standard
does not outweigh the confusion potential.

2005-04-04 21:38:56

by linux-os (Dick Johnson)

Subject: Re: Use of C99 int types

On Mon, 4 Apr 2005, Al Viro wrote:

> On Mon, Apr 04, 2005 at 10:30:52PM +0200, Renate Meijer wrote:
>
>> When used improperly. The #define Al Viro objected to, is
>> objectionable. It's highly
>> misleading, as Mr. Viro pointed out. I fail to see where he made
>> comments on stdint.h
>> as such.
>
> Comments on stdint.h are very simple: ...fast... type names are misleading
> in exactly the same way as that define. The fact that they are in standard
> does not outweight the confusion potential.

I don't find stdint.h in the kernel source (up to 2.6.11). Is this
going to be a new addition?

It would be very helpful to start using the uint(8,16,32,64)_t types
because they are self-evident, a lot more than size_t or, my favorite
wchar_t.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.

2005-04-04 21:52:58

by Kyle Moffett

Subject: Re: Use of C99 int types

On Apr 04, 2005, at 17:25, Richard B. Johnson wrote:
> I don't find stdint.h in the kernel source (up to 2.6.11). Is this
> going to be a new addition?

Uhh, no. stdint.h is part of glibc, not the kernel.

> It would be very helpful to start using the uint(8,16,32,64)_t types
> because they are self-evident, a lot more than size_t or, my favorite
> wchar_t.

You miss the point of size_t and ssize_t/ptrdiff_t. They are types
guaranteed to be at least as big as the pointer size. uint8/16/32/64,
on the other hand, are specific bit-sizes, which may not be as fast or
correct as a simple size_t. Linus has pointed out that while it
doesn't matter which of __u32, u32, uint32_t, etc you use for kernel
private interfaces, you *cannot* use anything other than __u32 in the
parts of headers that userspace will see, because __u32 is defined
only by the kernel and so there is no risk for conflicts, as opposed
to uint32_t, which is also defined by libc, resulting in collisions
in naming.

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-04-05 08:44:17

by Renate Meijer

Subject: Re: Use of C99 int types


On Apr 4, 2005, at 10:57 PM, Al Viro wrote:

> On Mon, Apr 04, 2005 at 10:30:52PM +0200, Renate Meijer wrote:
>
>> When used improperly. The #define Al Viro objected to, is
>> objectionable. It's highly
>> misleading, as Mr. Viro pointed out. I fail to see where he made
>> comments on stdint.h
>> as such.
>
> Comments on stdint.h are very simple: ...fast... type names are
> misleading
> in exactly the same way as that define.

Yes. However, the consistent designation ...fast... does alleviate that
somewhat. It suffices to remember that in the case of 'fast', the width
mentioned is a minimum value.

> The fact that they are in standard does not outweight the confusion
> potential.

I'm not so sure. Again, these types are quite clearly designated,
something the #define in question lacks. The other types in stdint.h,
however, come in quite handy, specifically since they are guaranteed
to represent correct widths by the compiler guys.

Something to take up with the guys at 'comp.lang.c', I'd say.

Regards,

Renate Meijer.

2005-04-05 09:20:46

by Renate Meijer

Subject: Re: Use of C99 int types


On Apr 4, 2005, at 11:49 PM, Kyle Moffett wrote:

> On Apr 04, 2005, at 17:25, Richard B. Johnson wrote:
>> I don't find stdint.h in the kernel source (up to 2.6.11). Is this
>> going to be a new addition?
>
> Uhh, no. stdint.h is part of glibc, not the kernel.
>
>> It would be very helpful to start using the uint(8,16,32,64)_t types
>> because they are self-evident, a lot more than size_t or, my favorite
>> wchar_t.
>
> You miss the point of size_t and ssize_t/ptrdiff_t. They are types
> guaranteed to be at least as big as the pointer size.

IIRC, it is guaranteed that size_t can correctly represent the size of
the largest object which can be malloced. This usually coincides with
the width of a pointer, but not necessarily.

> uint8/16/32/64,
> on the other hand, are specific bit-sizes, which may not be as fast or
> correct as a simple size_t.

Using specific widths may yield benefits on one platform, whilst
proving a real bottleneck when porting something to another. A
potential source of problems, easily avoided by using plain-vanilla
integers.

> Linus has pointed out that while it
> doesn't matter which of __u32, u32, uint32_t, etc you use for kernel
> private interfaces, you *cannot* use anything other than __u32 in the
> parts of headers that userspace will see, because __u32 is defined
> only by the kernel and so there is no risk for conflicts, as opposed
> to uint32_t, which is also defined by libc, resulting in collisions
> in naming.

Strictly speaking, a definition starting with a double underscore is
reserved for use by the compiler and associated libs, thus such a
declaration would invade implementation namespace. The compiler's
implementation, that is.

In this case, the boundary is a bit vague, I see that, since a lot of
header definitions also reside in the /usr/include hierarchy.

I think it would be useful to at least *agree* on a standard type for
8/16/32/64-bit integer types. What I see now as a result of grepping
for 'uint32' is a lot more confusing than stdint.h.

There is u32, __u32, uint32, uint32_t, __uint32_t...

Especially the types with leading underscores look cool, but in reality
may cause a conflict with compiler internals and should only be used
when defining compiler libraries. The '__' have explicitly been put in
by ISO in order to avoid conflicts between user-code and the standard
libraries, so if non-compiler-library code also starts using '__', just
coz it looks cool, that cunning plan is undone.

Furthermore, I think it's wise to convince the community that if not
needed, integers should not be specified by any specific width.

Regards,

Renate Meijer.

2005-04-05 11:28:37

by Kyle Moffett

Subject: Re: Use of C99 int types

On Apr 05, 2005, at 05:23, Renate Meijer wrote:
>> uint8/16/32/64, on the other hand, are specific bit-sizes, which
>> may not be as fast or correct as a simple size_t.
>
> Using specific widths may yield benefits on one platform, whilst
> proving a real bottleneck when porting something to another. A
> potential of problems easily avoided by using plain-vanilla
> integers.

The point of specific-width integers is to preserve a specific
binary format, such as a filesystem on-disk data structure, or a
kernel-userspace ABI, etc. If you just need a number, use a
different type.
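
A minimal sketch of what "preserving a binary format" means. The structure and its field names are hypothetical, and it uses the <stdint.h> names only because it is a userspace example; kernel code would use its own types as discussed in this thread. Byte order is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header; every field has a fixed width. */
struct disk_header {
    uint32_t magic;        /* offset 0 */
    uint16_t version;      /* offset 4 */
    uint16_t flags;        /* offset 6 */
    uint64_t block_count;  /* offset 8 */
};

/* Compile-time checks (C11) that the layout is the intended one. */
_Static_assert(sizeof(struct disk_header) == 16, "size must not vary");
_Static_assert(offsetof(struct disk_header, block_count) == 8,
               "no hidden padding before block_count");
```

Had the fields been declared as int, short and long, the size of the structure would follow the compiler's data model and the same disk could not be read by a 32-bit and a 64-bit build.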

> Strictly speaking, a definition starting with a double
> underscore is reserved for use by the compiler and associated
> libs

Well, _strictly_speaking_, it's "implementation defined", where the
"implementation" includes the kernel (due to the syscall interface).

> this such a declaration would invade implementation namespace.
> The compilers implementation, that is.

But the C library is implicitly dependent on the kernel headers for
a wide variety of datatypes.

> In this case, the boundary is a bit vague, i see that, since a lot
> of header definitions also reside in the /usr/include hierarchy.

Some of which are produced by kernel sources: /usr/include/linux,
/usr/include/asm, etc.

> I think it would be usefull to at least *agree* on a standard type
> for 8/16/32/64-bit integer types. What I see now as a result of
> grepping for 'uint32' is a lot more confusing than stdint.h

Well, Linus has backed the position that there is no standard, except
where the ABI is concerned; there we must use __u32 so that it does not
clash with libc or user programs.

> Especially the types with leading underscores look cool, but in
> reality may cause a conflict with compiler internals and should only
> be used when defining compiler libraries.

It's "implementation" (kernel+libc+gcc) defined. It just means that
gcc, the kernel, and libc have to be much more careful not to tread
on each others toes.

> The '__' have explicitly been put in by ISO in order to avoid
> conflicts between user-code and the standard libraries,

The "standard libraries" includes the syscall interface here. If
the kernel types could not be prefixed with __, then what _should_
we prefix them with?

> Furthermore, I think it's wise to convince the community that if
> not needed, integers should not be specified by any specific width.

That doesn't work for an ABI. If you switch compilers (or go from 32-bit
to 64-bit, as from x86 to x86-64), you _must_ be able to specify
certain widths for all the ABI numbers to preserve compatibility.

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-04-05 12:22:52

by linux-os (Dick Johnson)

Subject: Re: Use of C99 int types

On Mon, 4 Apr 2005, Kyle Moffett wrote:

> On Apr 04, 2005, at 17:25, Richard B. Johnson wrote:
>> I don't find stdint.h in the kernel source (up to 2.6.11). Is this
>> going to be a new addition?
>
> Uhh, no. stdint.h is part of glibc, not the kernel.
>
>> It would be very helpful to start using the uint(8,16,32,64)_t types
>> because they are self-evident, a lot more than size_t or, my favorite
>> wchar_t.
>
> You miss the point of size_t and ssize_t/ptrdiff_t. They are types
> guaranteed to be at least as big as the pointer size. uint8/16/32/64,
> on the other hand, are specific bit-sizes, which may not be as fast or
> correct as a simple size_t. Linus has pointed out that while it
> doesn't matter which of __u32, u32, uint32_t, etc you use for kernel
> private interfaces, you *cannot* use anything other than __u32 in the
> parts of headers that userspace will see, because __u32 is defined
> only by the kernel and so there is no risk for conflicts, as opposed
> to uint32_t, which is also defined by libc, resulting in collisions
> in naming.
>
> Cheers,
> Kyle Moffett
>

Actually not. I think the whole point of the C99 (POSIX integer)
types is to avoid problems like you cite. Nobody should be using
types that begin with an underscore in user-code anyway. That
name-space is reserved.

One cannot just use 'int' or 'long', in particular when interfacing
with an operating system. For example, look at the socket interface
code. Parameters are put into an array of longs and a pointer to
this array is passed to the socket interface. It's a mess when
converting this code to 64-bit world. If originally one used a
structure of the correct POSIX integer types, and a pointer to
the structure was passed, then absolutely nothing in the source-
code would have to be changed at all when compiling that interface
for a 64-bit machine. The continual short-cuts, with the continual
"special-case" hacks, are what make porting difficult. That's what
the POSIX types were supposed to help prevent.

That's why I think if there was a stdint.h file in the kernel, when
people were performing maintenance or porting their code, they
could start using those types.
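
The contrast I have in mind can be sketched like this. Both declarations are hypothetical and simplified; they are not the real socket call layout, only an illustration of why one form changes size between 32-bit and 64-bit builds and the other does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Style 1: arguments packed into an array of longs; size follows the ABI. */
typedef unsigned long args_as_longs[3];

/* Style 2: a structure of fixed-width fields; size is the same everywhere. */
struct args_fixed {
    uint32_t domain;
    uint32_t type;
    uint32_t protocol;
};

size_t longs_size(void) { return sizeof(args_as_longs); }
size_t fixed_size(void) { return sizeof(struct args_fixed); }
```

fixed_size() is 12 on both kinds of machine, while longs_size() is three times whatever sizeof(long) happens to be, so a 32-bit caller and a 64-bit callee would disagree about the first form.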

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.

2005-04-05 21:51:49

by Kyle Moffett

Subject: Re: Use of C99 int types

On Apr 05, 2005, at 08:18, Richard B. Johnson wrote:
> One cannot just use 'int' or 'long', in particular when interfacing
> with an operating system. For example, look at the socket interface
> code. Parameters are put into an array of longs and a pointer to
> this array is passed to the socket interface. It's a mess when
> converting this code to 64-bit world.

Exactly

> If originally one used a structure of the correct POSIX integer
> types, and a pointer to the structure was passed, then absolutely
> nothing in the source-code would have to be changed at all when
> compiling that interface for a 64-bit machine.

But you _can't_ use the POSIX integer types. When compiling the
kernel, if you use the types, you must define them in the kernel
headers. On the other hand, when compiling userspace stuff, you
_can't_ have them defined in the kernel headers because libc also
defines them. The solution is to use __{s,u}{8,16,32,64}, which
are _only_ defined by the kernel, not by libc or gcc, and can
therefore be used in the ABI.

> The continual short-cuts, with the continual "special-case"
> hacks, are what make porting difficult. That's what the POSIX
> types were supposed to help prevent.

Except the POSIX types themselves are not usable for the boundary
code because of the double definition. Google for Linus'
posts on this topic from a couple of months ago.

> That's why I think if there was a stdint.h file in the kernel,
> when people were performing maintenance or porting their code,
> they could start using those types.

The types _are_ available from the kernel headers, but only when
compiling with __KERNEL__, to avoid conflicts with the libc
definitions.

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r
!y?(-)
------END GEEK CODE BLOCK------


2005-04-05 22:13:43

by Kyle Moffett

Subject: Re: Use of C99 int types

Please don't remove Linux-Kernel from the CC, I think this is an
important discussion.

On Apr 05, 2005, at 15:17, Renate Meijer wrote:
>>> Strictly speaking, a definition starting with a double
>>> underscore is reserved for use by the compiler and associated
>>> libs
>>
>> Well, _strictly_speaking_, it's "implementation defined", where the
>> "implementation" includes the kernel (due to the syscall interface).
>
> Beg to differ. As far as i'm aware, the syscall interface is not part
> of C. Hence the kernel, in compiler terms, is not part of "the
> implementation" of the compiler.

POSIX and such include information about signal handling and the
user-mode environment for C programs, both of which are completely
irrelevant from the compiler's point of view, including libc stuff.

>> But the C library is implicitly dependent on the kernel headers for
>> a wide variety of datatypes.
>
> Correct. It is, however, not only dependent on the definitions as
> provided by linux, but also of those provided by just about any other
> OS the compiler is running on. Which, by the last count, was a pretty
> impressive number.

I don't see how this applies. We're only talking about the Linux
kernel here, right?

>> Well, Linus has supported that there is no standard, except where
>> ABI is concerned, there we must use __u32 so that it does not clash
>> with libc or user programs.
>
> The fact that there is no standard is not an argument against at
> least reaching some compromise. Surely 5 different names for a
> simple, generic 32-bit integer is a bit much.

Personally, I don't care what you feel like requiring for purely
in-kernel interfaces, but __{s,u}{8,16,32,64} must stay to avoid
namespace collisions with glibc in the kernel include files as used
by userspace.

>>> Especially the types with leading underscores look cool, but in
>>> reality may cause a conflict with compiler internals and should only
>>> be used when defining compiler libraries.
>>
>> It's "implementation" (kernel+libc+gcc) defined.
>
> I don't think the kernel has any place in that list.
>
> <quote>
> 3.10
> [#1] implementation
> a particular set of software, running in a particular
> translation environment under particular control options,
> that performs translation of programs for, and supports
> execution of functions in, a particular execution
> environment
> </quote>

This is kinda arguing semantics, but:
A particular set of software (linux+libc+gcc), running in a particular
translation environment (userspace) under particular control options
(Signals, nice values, etc), that performs translation of programs for
(emulating missing instructions), and supports execution of functions
(syscalls) in, a particular execution environment (also userspace).

Without the kernel userspace wouldn't have anything, because anything
syscall-related (which is basically everything) involves the kernel.
Heck, the kernel and its ABI is _more_ a part of the implementation
than glibc is! I can write an assembly program that doesn't link to
or use libc, but without using syscalls I can do nothing whatsoever.

That's not to say that I _like_ the way things are set up, but it's not
practical to change them at the moment.

<Wishful Thinking>
It would be nice if GCC provided a set of __gcc_foo inline definitions
for all sorts of useful functions and types, including various types of
memory barriers, sized types, etc and other platform-related garbage
that it would be good to have in the same place. Then the kernel and
glibc could both just assume that they are there and not worry nearly
as much about what platform you're on.
</Wishful Thinking>

> But that goes only for those definitions that will eventually wind up
> in /usr/include/*, not any code internal to (say) a driver and only
> affects a minimal set of interfaces. That is, in comparison to
>
> renate@indigo:~/linux-2.6.11.6$ find . -name \*.h -exec grep __uint32
> {} \; -print
>
> or worse
>
> renate@indigo:~/linux-2.6.11.6$ find . -name \*.c -exec grep __uint32
> {} \; -print
>
> On the bright side, most of it is in linux/fs/xfs so it's pretty
> localized, on the other side, none of it is related to the ABI in
> any way.

Uhh, how about:
grep -rl __u32 . | egrep '\.h$'
or:
grep -rl __u32 . | egrep '\.c$'

Both of those return a _LOT_ of stuff.

> Nope. The syscall interface is employed by the library, no more,
> no less. The C standard does not include *any* platform specific
> stuff.

Which is why it reserves __ for use by the implementation so it can
play wherever it wants.

> Quite on purpose, by the way. Not all the world is a linux machine
> and an AVR doesn't even have syscalls.

But when I write my framebuffer library, I do:
#include <linux/fb.h>
#include <stdlib.h>
And I expect it to work! I want it to get the correct types, I
don't want it to clash with or require the libc types (my old
sources might redefine some stdint.h names, and I don't want it
to clash with my user-defined types).

> Anything you like. 'kernel_' or simply 'k_' would be appropriate.
> As long as you do not invade compiler namespace. It is separated
> and uglyfied for a purpose.

But the _entire_ non _ namespace is reserved for anything user
programs want to do with it. I think most of the kernel types in
the current headers use __kernel_, which is safe enough.

> Does not work when you are touching externally defined interfaces
> in general, including that of a CPU. There are places for uint32_t
> and friends and even for __uint32_t and its kin, but abusing them
> will cause trouble in a world that is accommodating more than one
> register-size. This is all I am saying.

But in a world with more than one register size, you _must_ use them.
For example, the x86-64 code uses them to handle 32-bit backwards
compatibility, and the ppc64 code does likewise. When a program
compiled as ppc32 gets run on my ppc64 box, the kernel understands
that anything pushed onto the stack as arguments is 32-bit, and must
use specifically sized types to handle that properly.

Cheers,
Kyle Moffett



2005-04-06 21:13:10

by Kyle Moffett

Subject: Re: Use of C99 int types

On Apr 06, 2005, at 07:41, Renate Meijer wrote:
> On Apr 6, 2005, at 12:11 AM, Kyle Moffett wrote:
>> Please don't remove Linux-Kernel from the CC, I think this is an
>> important discussion.

GAAH!!! Read my lips!!! Quit removing Linux-Kernel from the CC!!!

> As I see it, there are a number of issues
>
> - Use of double underscores invades compiler namespace (except in
>   those cases where kernel definitions end up as the basis for
>   definitions in /usr/include/*, i.e. those that actually are part
>   of the C-implementation for Linux).

It is these that I'm talking about. This is exactly my point (the
cases where the kernel definitions are part of /usr/include).

> - Some type that does not conflict with compiler namespace to replace
>   the variety of definitions for e.g. 32-bit unsigned integers we
>   have now.

As I said, I don't care about this, so do whatever you want.

> - Removal of anything prefixed with a double underscore from
>   non-C-implementation files.

ATM, much of the stuff in include/linux and include/asm-* is considered
"C-implementation" because it is used from userspace. If you want to
clean that up and start moving ABI files to include/kernel-abi or
somesuch, feel free, but that's a lot of work.

>> Personally, I don't care what you feel like requiring for purely
>> in-kernel interfaces, but __{s,u}{8,16,32,64} must stay to avoid
>> namespace collisions with glibc in the kernel include files as used
>> by userspace.
>
> Aye, but as I have pointed out several times, these types should be
> restricted to those files and *only* those files which eventually end
> up in the compiler's includes. In every other place, they invite
> exactly the trouble they are intended to avoid.

Precisely.

So if you want to make the millions of patches, go right ahead, be my
guest. :-P
Until somebody steps forward to clean up the huge mess, nothing will
get done.

> So in every place except those files which may actually cause a
> namespace conflict or a bug because some newer version does not
> support __foobar, or changed the semantics. Since using any __foobar
> type implies relying on the compiler internals, which may change
> without prior notice, it is ipso facto undesirable.

Except the kernel wants to be optimized and work and use what features
are available. The kernel uses __foobar stuff provided by the compiler
because it has gccX.h files specifically designed to take compiler
interfaces, provide backups when they don't exist, and use them (and
their better checking) when they do.

>> This is kinda arguing semantics, but:
>> A particular set of software (linux+libc+gcc), running in a particular
>> translation environment (userspace) under particular control options
>> (Signals, nice values, etc), that performs translation of programs for
>> (emulating missing instructions), and supports execution of functions
>> (syscalls) in, a particular execution environment (also userspace).
>
> Ok. And where exactly are linux and libc when compiling code for an
> Atmel ATmega32 (40 pin DIL) using gcc?

Where do you get Atmel ATmega32 from? I _only_ care about what symbols
Linux can use, and as I've mentioned, when running under *Linux*, then
it just so happens that *Linux* is part of my implementation, therefore
the *Linux* sources, which by definition aren't used elsewhere, can
assume they are part of said implementation.

> The 'set of software' does
> *not* include any OS. Not Windows, not Linux, not MacOSX, since the
> whole thing might be directed at a lowly microcontroller, which DOES
> NOT HAVE ANY OPERATING SYSTEM WHATSOEVER.
>
> Nevertheless, gcc works fine.

This is unrelated and off topic. Heck, you've even consented above that
Linux can use

>> Without the kernel userspace wouldn't have anything, because anything
>> syscall-related (which is basically everything) involves the kernel.
>
> Sure. The same goes for every other program. However, it would be
> pretty stoopid to say the kernel is an integral part of (say) the
> Gimp. More so, since the Gimp and GCC run on completely different
> architectures as well.
>
> By the same token, linux is part of XFree86 despite the fact XFree86
> does not require linux to run.

But an XFree86 binary compiled on FreeBSD, or a GIMP binary compiled on
FreeBSD, for the most part, will not run on Linux, because the compiler
uses the _Linux_ environment to build the binary, including the _Linux_
headers and such. The built binary is nearly useless without Linux, but
not vice-versa, hence even though the binary is not a derivative work of
linux, it requires it to run.

>> Heck, the kernel and its ABI is _more_ a part of the implementation
>> than glibc is! I can write an assembly program that doesn't link to
>> or use libc, but without using syscalls I can do nothing whatsoever.
>
> I can write entire applications using gcc without even thinking of
> using any 'syscall' or any other part of linux/bsd/whatever.
> Still... it's gcc.

Uhh, what exactly is your application going to do? So it wants to
access memory, it faults to the kernel and gets stuff paged in. It
wants to access a file, it does a syscall. If it wants to allocate more
memory, it calls into the kernel. This is all platform specific, and
part of the implementation.

> <Wishful Thinking>
> It would be nice if Linux became totally independent of any compiler,
> or at least that coupling between them would be minimal and that the
> amount of assembly needed would be minimal.

If you feel like fixing the compiler to provide good enough interfaces,
or fixing the kernel to abstract all of that out, then fine, but
remember that the kernel has to deal _directly_ with the hardware and
is generally dependent on direct MMU twiddling, which _can't_ be done
from C :-P.

> It would be nice if linux defined and documented its own platform
> specific types somewhere in the arch directory, using a consistent
> (across platforms) naming scheme and used those types consistently
> throughout the kernel, drivers, daemons and other associated code.

Got a patch?

> </Wishful Thinking>
>
> <Nightmare>
> Your scenario above. Never-ending streams of compatibility issues and
> gcc drifting further and further from the ISO-C standard and more and
> more developers depending on non-standard interfaces, linux growing
> ever more dependent on support for features ABC and XYZ being
> implemented consistently cross platform, so that if I want to use gcc
> to compile for an AVR, i'm stuck with a shitload of linux issues, kept
> "for backward compatibility".
> </Nightmare>

Linux has a bunch of gcc headers that configure it based on the
compiler version. Change the compiler and we'll add another header,
which, though messy, provides us some safety. It would be _nice_,
however, if the compiler had a gcc/types.h file that just provided it
all for us, so we don't need to hardcode it all based on the
architecture specified. Directly defined __gcc_u32 types (or whatever
the GCC people like) would be even better.

>>> Nope. The syscall interface is employed by the library, no more,
>>> no less. The C standard does not include *any* platform specific
>>> stuff.
>>
>> Which is why it reserves __ for use by the implementation so it can
>> play wherever it wants.
>
> The C-implementation, which still does not include the kernel. At
> most a few header files, which are used as a basis for standard types
> by the C implementation, but no more. Any double underscore in a .c
> file is a blatant error. Most used in .h files are, too.

So how do I get <linux/fb.h> to work? There aren't "just a few", there
are __u32, etc in _everything_ with an ioctl or syscall interface,
basically anything with an ABI.

> Fine. I assume it does. But #include <linux/fb.h> does not make the
> framebuffer (nor linux, for that matter) part of the c-implementation.
> From the two files mentioned above, only stdlib.h is.

Ok, so how do I fix linux/fb.h to _not_ use __u32?

>> I want it to get the correct types, I don't want it to clash with or
>> require the libc types (my old sources might redefine some stdint.h
>> names, and I don't want it to clash with my user-defined types).
>
> Redefining stdint types is (for this reason) a Bad Idea.

So how do I use them in <linux/*.h>?

>>> Anything you like. 'kernel_' or simply 'k_' would be appropriate.
>>> As long as you do not invade compiler namespace. It is separated
>>> and uglyfied for a purpose.
>>
>> But the _entire_ non _ namespace is reserved for anything user
>> programs want to do with it.
>
> The above prefix was an alternative to using a double underscore
> prefix. Using *no* prefix should not conflict with the compiler,
> excepting, of course, the types required by the standard.

But these are also used by c programs.

>> When a program
>> compiled as ppc32 gets run on my ppc64 box, the kernel understands
>> that anything pushed onto the stack as arguments is 32-bit, and must
>> use specifically sized types to handle that properly.
>
> And thus you end up using a 32-bit interface between a 64 bit OS and
> a 64 bit application? Or two separate syscall interfaces?

What about "When a program compiled as *ppc32*..." don't you get? I
have my ppc32 program. It doesn't support the new ppc64 syscall or
ioctl interface, because it's 32-bit. I didn't say anything about ppc64
programs, which use the new ppc64 syscall interface.

> Neither option seems very desirable. What about pointers which are
> 32 bit on one platform and 64 on the other? IOW, i'm not sure
> "backwards compatibility" is the thing to strive for. We all know
> what it did to Intel-processors and if it means having to jam data
> from a 64-bit App to a 64 bit OS through a 32-bit syscall interface,
> it stinks.
>
> Especially since most packages need only to be recompiled for the new
> situation and source (commonly) is available.

But what about when I boot between ppc32 and ppc64 on my G5, because
PPC64 doesn't support the driver for some piece of hardware? Why can't
I just use all my old 32-bit binaries? The G5 has full 32-bit
compatibility, why shouldn't the kernel?

Cheers,
Kyle Moffett



2005-04-07 11:25:26

by Renate Meijer

Subject: Re: Use of C99 int types


On Apr 6, 2005, at 11:11 PM, Kyle Moffett wrote:

> On Apr 06, 2005, at 07:41, Renate Meijer wrote:
>> On Apr 6, 2005, at 12:11 AM, Kyle Moffett wrote:
>>> Please don't remove Linux-Kernel from the CC, I think this is an
>>> important discussion.
>
> GAAH!!! Read my lips!!! Quit removing Linux-Kernel from the CC!!!
>
>> As I see it, there are a number of issues
>>
>> - Use of double underscores invades compiler namespace (except in
>>   those cases where kernel definitions end up as the basis for
>>   definitions in /usr/include/*, i.e. those that actually are part
>>   of the C-implementation for Linux).
>
> It is these that I'm talking about. This is exactly my point (the
> cases where the kernel definitions are part of /usr/include).

Yes. And my point was all the other occurrences, specifically those in
*.c files. Btw, not everything in the /usr/include/* path is part of
the compiler. For instance, the network definitions are *not* part of
it, nor are the syscall interface or any filesystem code.

>> - Some type that does not conflict with compiler namespace to replace
>>   the variety of definitions for e.g. 32-bit unsigned integers we
>>   have now.
>
> As I said, I don't care about this, so do whatever you want.

Ok. But you are just one developer (as much as I respect that).

>> - Removal of anything prefixed with a double underscore from
>>   non-C-implementation files.
>
> ATM, much of the stuff in include/linux and include/asm-* is considered
> "C-implementation" because it is used from userspace. If you want to
> clean that up and start moving ABI files to include/kernel-abi or
> somesuch, feel free, but that's a lot of work.

Agreed, and I probably won't have much time for that in the coming five
weeks. So don't hold your breath. However, it would be a good thing if
someone (or preferably, more than one) would endeavour to do this.

>>> Personally, I don't care what you feel like requiring for purely
>>> in-kernel interfaces, but __{s,u}{8,16,32,64} must stay to avoid
>>> namespace collisions with glibc in the kernel include files as used
>>> by userspace.
>>
>> Aye, but as I have pointed out several times, these types should be
>> restricted to those files and *only* those files which eventually end
>> up in the compiler's includes. In every other place, they invite
>> exactly the trouble they are intended to avoid.
>
> Precisely.
>
> So if you want to make the millions of patches, go right ahead, be my
> guest. :-P

Well... Someone's got to do it, and "millions" seems to be a bit
exaggerated...

> Until somebody steps forward to clean up the huge mess, nothing will
> get done.

I think it's worth the efforts of more than just a single individual.
Especially since you seem to agree the current situation isn't exactly
ideal.

>> So in every place except those files which may actually cause a
>> namespace conflict or a bug because some newer version does not
>> support __foobar, or changed the semantics. Since using any __foobar
>> type implies relying on the compiler internals, which may change
>> without prior notice, it is ipso facto undesirable.
>
> Except the kernel wants to be optimized and work and use what features
> are available. The kernel uses __foobar stuff provided by the compiler
> because it has gccX.h files specifically designed to take compiler
> interfaces, provide backups when they don't exist, and use them (and
> their better checking) when they do.

I've checked those files, but the use of compiler specific tricks seems
a bit more widespread than that.

>>> This is kinda arguing semantics, but:
>>> A particular set of software (linux+libc+gcc), running in a
>>> particular translation environment (userspace) under particular
>>> control options (Signals, nice values, etc), that performs
>>> translation of programs for (emulating missing instructions), and
>>> supports execution of functions (syscalls) in, a particular
>>> execution environment (also userspace).
>>
>> Ok. And where exactly are linux and libc when compiling code for an
>> Atmel ATmega32 (40 pin DIL) using gcc?
>
> Where do you get Atmel ATmega32 from?

My local electronics shop. I brought it up just to make the point that
the kernel is *not* part of the compiler's implementation, since the
compiler will work happily without it.

> I _only_ care about what symbols Linux can use,

Ok. However, the gcc-crowd may see that from a completely different
perspective. For them the linux kernel is just one application and
linux (as a platform) just one platform they support.

> and as I've mentioned, when running under *Linux*, then it just so
> happens that *Linux* is part of my implementation, therefore the
> *Linux* sources, which by definition aren't used elsewhere, can assume
> they are part of said implementation.

As I've said, I don't care what you are running, the OS is *never* part
of the compiler. At most some interfaces are, in a fuzzy, roundabout
way. And even that is a questionable practice.

>> The 'set of software' does
>> *not* include any OS. Not Windows, not Linux, not MacOSX, since the
>> whole thing might be directed at a lowly microcontroller, which DOES
>> NOT HAVE ANY OPERATING SYSTEM WHATSOEVER.
>>
>> Nevertheless, gcc works fine.
>
> This is unrelated and off topic.

Just to make the bloody point GCC is not dependent on Linux in any way
and hence the kernel is *not* part of GCC.

> Heck, you've even consented above that
> Linux can use
>
>>> Without the kernel userspace wouldn't have anything, because anything
>>> syscall-related (which is basically everything) involves the kernel.
>>
>> Sure. The same goes for every other program. However, it would be
>> pretty stoopid to say the kernel is an integral part of (say) the
>> Gimp. More so, since the Gimp and GCC run on completely different
>> architectures as well.
>>
>> By the same token, linux is part of XFree86 despite the fact XFree86
>> does not require linux to run.
>
> But an XFree86 binary compiled on FreeBSD, or a GIMP binary compiled
> on FreeBSD, for the most part, will not run on Linux, because the
> compiler uses the _Linux_ environment to build the binary, including
> the _Linux_ headers and such.

No... A binary compiled for one platform will not run on another,
usually. This still does not imply the linux kernel is part of gcc.

> The built binary is nearly useless without Linux, but not vice-versa,
> hence even though the binary is not a derivative work of linux, it
> requires it to run.

Weird. How come i'm running XFree86 and the Gimp on MacOSX? There ain't
no linux in sight.

>>> Heck, the kernel and its ABI is _more_ a part of the implementation
>>> than glibc is! I can write an assembly program that doesn't link to
>>> or use libc, but without using syscalls I can do nothing whatsoever.
>>
>> I can write entire applications using gcc without even thinking of
>> using any 'syscall' or any other part of linux/bsd/whatever.
>> Still... it's gcc.
>
> Uhh, what exactly is your application going to do?

Monitoring water levels and sending out alarm messages (by SMS) when
the level gets either too low or too high. Furthermore it controls a
"stuw", a device for regulating the water level for which I do not know
the English term.

> So it wants to access memory, it faults to the kernel and gets stuff
> paged in.

What kernel? What pages? What OS? There is no OS, just a set of library
functions I developed (pretty much) myself.

> It wants to access
> a file, it does a syscall.

What files? I write to EEPROM directly. There is no filesystem. Hell,
there isn't even a vfprintf. In case you are wondering, it's an
application for the Atmel ATmega I mentioned.

> If it wants to allocate more memory, it calls
> into the kernel.

Allocate MORE MEMORY? The 4 kb available is full enough as it is.

> This is all platform specific, and part of the implementation.

And for that reason, not part of the implementation. The library (glibc
in your case) handles (or rather, should handle) the trickery involved.
But glibc isn't part of gcc.

>> <Wishful Thinking>
>> It would be nice if Linux became totally independent of any compiler,
>> or at least that coupling between them would be minimal and that the
>> amount of assembly needed would be minimal.
>
> If you feel like fixing the compiler to provide good enough
> interfaces, or fixing the kernel to abstract all of that out, then
> fine, but remember that the kernel has to deal _directly_ with the
> hardware and is generally dependent on direct MMU twiddling, which
> _can't_ be done from C :-P.

Agreed. That's why I said "minimal" instead of "absent". Personally I
think the compiler's interfaces are pretty good.

>> It would be nice if linux defined and documented its own platform
>> specific types somewhere in the arch directory, using a consistent
>> (across platforms) naming scheme and used those types consistently
>> throughout the kernel, drivers, daemons and other associated code.
>
> Got a patch?

Not yet.

>> </Wishful Thinking>
>>
>> <Nightmare>
>> Your scenario above. Never-ending streams of compatibility issues and
>> gcc drifting further and further from the ISO-C standard and more and
>> more developers depending on non-standard interfaces, linux growing
>> ever more dependent on support for features ABC and XYZ being
>> implemented consistently cross platform, so that if I want to use gcc
>> to compile for an AVR, i'm stuck with a shitload of linux issues, kept
>> "for backward compatibility".
>> </Nightmare>
>
> Linux has a bunch of gcc headers that configure it based on the
> compiler version. Change the compiler and we'll add another header,
> which, though messy, provides us some safety. It would be _nice_,
> however, if the compiler had a gcc/types.h file that just provided it
> all for us, so we don't need to hardcode it all based on the
> architecture specified. Directly defined __gcc_u32 types (or whatever
> the GCC people like) would be even better.

Why?

>>>> Nope. The syscall interface is employed by the library, no more,
>>>> no less. The C standard does not include *any* platform specific
>>>> stuff.
>>>
>>> Which is why it reserves __ for use by the implementation so it can
>>> play wherever it wants.
>>
>> The C-implementation, which still does not include the kernel. At
>> most a few header files, which are used as a basis for standard types
>> by the C implementation, but no more. Any double underscore in a .c
>> file is a blatant error. Most used in .h files are, too.
>
> So how do I get <linux/fb.h> to work? There aren't "just a few",
> there are __u32, etc in _everything_ with an ioctl or syscall
> interface, basically anything with an ABI.


>> Fine. I assume it does. But #include <linux/fb.h> does not make the
>> framebuffer (nor linux, for that matter) part of the
>> c-implementation. From the two files mentioned above, only stdlib.h
>> is.
>
> Ok, so how do I fix linux/fb.h to _not_ use __u32?

The way other libraries handle it. That ain't magic. The thing that
strikes me in the file you mention is that both versions (__u32 *and*
u32) are used in parts of the header that are (judging by
#ifdef __KERNEL__) internal to the kernel.

This implies that merely stripping the double underscore may suffice.
After all, that is code that will never be part of the
C-Implementation.

>>> I want it to get the correct types, I don't want it to clash with or
>>> require the libc types (my old sources might redefine some stdint.h
>>> names, and I don't want it to clash with my user-defined types).
>>
>> Redefining stdint types is (for this reason) a Bad Idea.
>
> So how do I use them in <linux/*.h>?

Either stdint.h becomes part of the kernel, which implies your
redefined versions in old source are due for some maintenance, or we
rely on the asm/types.h versions and use them consistently. In either
case, use of double underscores where none are required is not good.
Nor is redefining stuff that's defined by gcc, or worse, ISO.

>>>> Anything you like. 'kernel_' or simply 'k_' would be appropriate.
>>>> As long as you do not invade compiler namespace. It is separated
>>>> and uglyfied for a purpose.
>>>
>>> But the _entire_ non _ namespace is reserved for anything user
>>> programs want to do with it.
>>
>> The above prefix was an alternative to using a double underscore
>> prefix. Using *no* prefix should not conflict with the compiler,
>> excepting, of course, the types required by the standard.
>
> But these are also used by c programs.

Used, yes. Defined, no. The stuff the compiler headers define *without*
double underscores is the stuff that is exported for users to use. The
stuff that *has* double underscores isn't.