On Sat, Nov 25, 2000 at 11:50:20AM +0000, Russell King wrote:
> Rusty Russell writes:
> > What irritates about these monkey-see-monkey-do patches is that if I
> > initialize a variable to NULL, it's because my code actually relies on
> > it; I don't want that information eliminated.
>
> What information is lost? Unless you're working on a really strange
> machine which does not zero bss, the following means the same from the
> code's point of view:
>
> static int foo = 0;
> static int foo;
>
> Both are initialised to zero by the time the code sees them for the
> first time. Therefore there is no difference to the code in its reliance
> on whether foo is zero. foo will be zero in both cases.
>
> Also, any good programmer worth their skin should know this, and should
> realise it. Therefore, there is no information loss
What a strange reaction. If I write

    static int foo;

this means that foo is a variable, local to the present compilation unit,
whose initial value is irrelevant because it will be assigned to before use.
If I write

    static int foo = 0;

this means that the code depends on the initialization.
Indeed, it is customary to write

    int foo = 0; /* just for gcc */

when the initialization in fact is not necessary.
It is a bad programming habit to depend on this zero initialization.
Indeed, very often, when you have a program that does something
you need to change it so that it does that thing a number of times.
Well, put a for- or while-loop around it. But wait! The second time
through the loop certain variables need to be reinitialized. Which ones?
The ones that were initialized explicitly in your first program.
Make the program into a function in a larger one. Same story.
Saving a byte in the binary image is not very interesting.
Preserving information about the program is important.
I see that this message is cc'ed to Tigran, so let me address him as well.
Tigran, you like to destabilize Linux. I like to stabilize Linux.
If it is your intention to destabilize then you need not read the following.
But let us assume that you try to make a perfect system.
There is the issue of local and global correctness.
A piece of code is locally correct when its correctness can be seen
by just looking at those lines, or that function, or that source file.
A piece of code is globally correct when you need to read the entire kernel
source to convince yourself that all is well.
Often local correctness is obtained by local tests. After reading the entire
kernel source you conclude that these tests are superfluous because they
are satisfied in all cases. And you think it is an improvement to remove
the test. It almost never is. On a fast path, where every cycle counts, yes.
But it is not a good idea to sacrifice local correctness and save five
kernel image bytes, or speed up the mount system call by 0.001%.
Why not? Because you read the entire kernel source of today.
But not that of next week. Somewhere someone changes some code.
The test is gone and the kernel crashes instead of returning an error.
You even like to destabilize when there is no gain in size or speed at all.
It is bad coding practice to use casts. They tell the compiler not to check.
With functions returning (void *) the opposite is true. The compiler cannot
check now, but given a cast, it can. Thus, I wrote a few months ago
> If one just writes
> foo = kmalloc(n * sizeof(some_type), GFP_x);
> then neither the compiler nor the human eye can check
> easily that things are right, i.e. that foo really is
> a (some_type *). Therefore it is better to write
> foo = (some_type *) kmalloc(n * sizeof(some_type), GFP_x);
To my surprise you answered
: It is a small thing, Andries, but I still think otherwise than you.
: It is better for code to be smaller than to be slightly more fool-proof.
Please change your mind.
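
A user-space sketch of the cast argument above, with malloc() standing in for kmalloc() and a made-up struct sample as the element type (both are assumptions for illustration, not code from the patches under discussion). The cast puts the intended pointer type on the allocation line, so if the declaration of buf later drifted to another type, the compiler would have a mismatch to diagnose there:

```c
#include <stdlib.h>

/* Hypothetical element type, invented for this sketch. */
struct sample {
    int id;
    double weight;
};

/* Allocate room for n elements; returns NULL if the allocation fails. */
struct sample *alloc_samples(size_t n)
{
    struct sample *buf;

    /*
     * With the cast, redeclaring buf as e.g. "long *" would make this
     * assignment a pointer-type mismatch the compiler can report.
     * Without the cast, the void * result converts silently.
     */
    buf = (struct sample *) malloc(n * sizeof(struct sample));
    return buf;
}
```

The caller owns the returned memory and releases it with free().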
Andries
Andries Brouwer writes:
> What a strange reaction. If I write
>
> static int foo;
>
> this means that foo is a variable, local to the present compilation unit,
> whose initial value is irrelevant because it will be assigned to before use.
Wrong. The initial value is well-defined. Go and read any C standard you
choose. Any C standard you care to. You will find out something really
interesting. I can guarantee that you will find out that it will be
initialised to zero. Unconditionally. No question. Absolutely.
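
The guarantee referred to here can be sketched in a few lines: an object with static storage duration and no initializer starts out as zero, and for a struct that applies member by member (pointers start as null pointers). The type and variable names below are invented for the illustration:

```c
#include <stddef.h>

/* Hypothetical aggregate type, invented for this sketch. */
struct pair {
    int count;
    const char *name;
};

static int implicit_zero;          /* no initializer: zero by the language rules */
static int explicit_zero = 0;      /* same starting value, spelled out */
static struct pair implicit_pair;  /* members start as 0 and a null pointer */

/* Returns 1 if every object above starts out zero/null, otherwise 0. */
int statics_start_zeroed(void)
{
    return implicit_zero == 0
        && explicit_zero == 0
        && implicit_pair.count == 0
        && implicit_pair.name == NULL;
}
```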
> It is a bad programming habit to depend on this zero initialization.
Why? Again, it is WELL defined, and is WELL defined in any C standard.
> Indeed, very often, when you have a program that does something
> you need to change it so that it does that thing a number of times.
> Well, put a for- or while-loop around it. But wait! The second time
> through the loop certain variables need to be reinitialized. Which ones?
> The ones that were initialized explicitly in your first program.
> Make the program into a function in a larger one. Same story.
Your point here is as clear as mud.
> If it is your intention to destabilize then you need not read the following.
> But let us assume that you try to make a perfect system.
There is absolutely NO destabilisation going on here. Get a grip, read the
C standards, read the C startup code. Then come back with something more
relevant.
--
Russell King
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux
On Sat, Nov 25, 2000 at 09:07:08PM +0000, Russell King wrote:
> Andries Brouwer writes:
> > What a strange reaction. If I write
> >
> > static int foo;
> >
> > this means that foo is a variable, local to the present compilation unit,
> > whose initial value is irrelevant
>
> Wrong. The initial value is well-defined.
Oh, please - something is wrong with your reading comprehension.
Don't you understand the word "irrelevant"? It means that the
initial value does not matter. It does not mean undefined.
Please reread my letter and comment when you understand my point.
Andries Brouwer wrote:
>
> int foo = 0; /* just for gcc */
> when the initialization in fact is not necessary.
Only for non-static foo.
> It is a bad programming habit to depend on this zero initialization.
> Indeed, very often, when you have a program that does something
> you need to change it so that it does that thing a number of times.
> Well, put a for- or while-loop around it. But wait! The second time
> through the loop certain variables need to be reinitialized. Which ones?
> The ones that were initialized explicitly in your first program.
> Make the program into a function in a larger one. Same story.
Again, this only applies to non-static variables. For static ones, they're
initialised once only even when they go out of scope.
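
A small sketch of that distinction, with invented function names: a static declared inside a function is initialised once, before the program starts, and keeps its value across calls, whereas an automatic variable's initializer runs on every entry.

```c
/* Counts calls: the static starts at 0 and is never reset on re-entry. */
int next_ticket(void)
{
    static int issued;   /* implicit zero, set up once */
    return ++issued;
}

/* Same shape with an automatic variable: "= 0" executes on each call. */
int next_ticket_auto(void)
{
    int issued = 0;
    return ++issued;
}
```

Successive calls to next_ticket() yield 1, 2, 3 and so on, while next_ticket_auto() yields 1 every time.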
> Saving a byte in the binary image is not very interesting.
> Preserving information about the program is important.
No information is lost.
--
Debian GNU/Linux 2.2 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Hello Andries,
On Sat, 25 Nov 2000, Andries Brouwer wrote:
> What a strange reaction. If I write
>
> static int foo;
>
> this means that foo is a variable, local to the present compilation unit,
> whose initial value is irrelevant because it will be assigned to before use.
> If I write
>
> static int foo = 0;
>
> this means that the code depends on the initialization.
> Indeed, it is customary to write
>
> int foo = 0; /* just for gcc */
>
> when the initialization in fact is not necessary.
What I am suggesting (in fact, not me but common sense) is that if you
write:

    static int foo;

then you really mean "a variable is foo and since it is required to be
initialized to zero, I am quite free to _rely_ on this fact and will
possibly do so". It is true that information about whether you actually
rely on it or not is lost but surely such "loss" is worth being able to
run Linux rather than non-Linux (i.e. in the cases where it is a matter of
a "few bytes" that decides whether you _can_ run Linux or not at all, i.e.
presumably some small devices where you have to squeeze Linux with a given
set of drivers into finite room).
> Saving a byte in the binary image is not very interesting. Preserving
> information about the program is important.
Saving a single byte may not be. But first, some studies have shown that the
total is in the range of a megabyte. Second, developing the optimal mindset
(namely that described above around "static int foo;") is very interesting,
as it ensures that Linux remains optimal even as it and the number of people
working on it grow astronomically. You must have seen the source code of
various commercial flavours of UNIX and therefore understand why they are
such a failure: there is no one like Linus Torvalds behind them, who has so
much patience that he gratefully accepts all improvements, however small
they may seem. I hope that Linux will remain the cleanest system wrt
attention to detail. Yes, I understand that it requires an absolutely
_impossible_ amount of patience on the part of Linus Torvalds, but that is
indeed what he does, the impossible, and may God bless him and keep him.
>
> I see that this message is cc'ed to Tigran, so let me address him as well.
> Tigran, you like to destabilize Linux. I like to stabilize Linux.
>
Oh, Andries, that is insulting. Surely you do not really mean that. So, I
_will_ read the rest of your message. :)
> If it is your intention to destabilize then you need not read the following.
> But let us assume that you try to make a perfect system.
>
> There is the issue of local and global correctness.
> A piece of code is locally correct when its correctness can be seen
> by just looking at those lines, or that function, or that source file.
> A piece of code is globally correct when you need to read the entire kernel
> source to convince yourself that all is well.
>
> Often local correctness is obtained by local tests. After reading the entire
> kernel source you conclude that these tests are superfluous because they
> are satisfied in all cases. And you think it is an improvement to remove
> the test. It almost never is. On a fast path, where every cycle counts, yes.
> But it is not a good idea to sacrifice local correctness and save five
> kernel image bytes, or speed up the mount system call by 0.001%.
> Why not? Because you read the entire kernel source of today.
> But not that of next week. Somewhere someone changes some code.
> The test is gone and the kernel crashes instead of returning an error.
Your theory is very good, in theory, but is not so in practice. Namely, if
you cared to look in depth at the specific instances of the optimizations
I suggested which required what you call "global correctness checks" (I
like that terminology!) then you would have found out that either:
a) I have done enough investigations to show that such tests can not only
be removed now but nothing in the future should ever require them to be
added.
or
b) I have made a mistake, in which case, I would be happy to see you
correcting me. Failure to do so indicates that I was right.
In both cases, I did not intentionally sacrifice "local correctness" as
you are trying to present. I think I value local correctness as much as
you do.
> You even like to destabilize when there is no gain in size or speed at all.
> It is bad coding practice to use casts. They tell the compiler not to check.
> With functions returning (void *) the opposite is true. The compiler cannot
> check now, but given a cast, it can. Thus, I wrote a few months ago
>
> > If one just writes
> > foo = kmalloc(n * sizeof(some_type), GFP_x);
> > then neither the compiler nor the human eye can check
> > easily that things are right, i.e. that foo really is
> > a (some_type *). Therefore it is better to write
> > foo = (some_type *) kmalloc(n * sizeof(some_type), GFP_x);
>
> To my surprise you answered
>
> : It is a small thing, Andries, but I still think otherwise than you.
> : It is better for code to be smaller than to be slightly more fool-proof.
Again, in theory you sound quite right. In practice, the specific cases
where I proposed removing such typecasts were immediately preceded by the
declarations of the corresponding variables. I.e. it was _immediately_
obvious as to what type they are and those long casts were only making
code bigger, nothing else. You will certainly find quite a large number of
places where those casts are still there -- this is not because I haven't
seen them but because I didn't think them worth changing (probably for the very
reason you kindly explained to me, for this I thank you!)
>
> Please change your mind.
>
> Andries
I have changed my mind about one thing -- there is a common sense or
"sense of measure" about what should and what should not be cc'd to
linux-kernel and I certainly was neglecting such "sense of measure" if I
allowed a mail like yours to come into existence. Nevertheless, it is
gratefully noted and will be acted upon accordingly.
Regards,
Tigran
On Sun, Nov 26, 2000 at 09:11:18AM +1100, Herbert Xu wrote:
> No information is lost.
Do I explain things so badly? Let me try again.
The difference between

    static int a;

and

    static int a = 0;

is the " = 0". The compiler may well generate the same code,
but I am not talking about the compiler. I am talking about
the programmer. This " = 0" means (to me, the programmer)
that the correctness of my program depends on this initialization.
Its absence means (to me) that it does not matter what initial
value the variable has.

This is a useful distinction. It means that if the program

    static int a;

    int main() {
        /* do something */
    }

is used as part of a larger program, I can just rename main
and get

    static int a;

    int do_something() {
        ...
    }

But if the program

    static int a = 0;

    int main() {
        /* do something */
    }

is used as part of a larger program, it has to become

    static int a;

    int do_something() {
        a = 0;
        ...
    }

You see that I, in my own code, follow a certain convention
where presence or absence of assignments means something
about the code. If now you change "static int a = 0;"
into "static int a;" and justify that by saying that it
generates the same code, then I am unhappy, because now
if I turn main() into do_something() I either get a buggy
program, or otherwise I have to read the source of main()
again to see which variables need initialisation.
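
The hazard described in this paragraph can be made concrete with a small sketch. The names total, sum_once and sum_reusable are invented here; the first function plays the part of the old main(), which silently relied on the startup value, and the second is the same code after the dependency has been written out.

```c
static int total;   /* the summing code below relies on this starting at 0 */

/* Formerly main(): gives the right answer only on the first call. */
int sum_once(const int *v, int n)
{
    for (int i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* After the refactoring: the dependency on zero is made explicit. */
int sum_reusable(const int *v, int n)
{
    total = 0;   /* the assignment that "= 0" was a reminder of */
    for (int i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Calling sum_once() twice on the same input returns a doubled result the second time, because total still holds the first sum; sum_reusable() returns the same value on every call.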
In a program source there is information for the compiler
and information for the future me. Removing the " = 0"
is like removing comments. For the compiler the information
remains the same. For the programmer something is lost.
Andries
On Sat, 25 Nov 2000, Andries Brouwer wrote:
> On Sun, Nov 26, 2000 at 09:11:18AM +1100, Herbert Xu wrote:
>
> > No information is lost.
>
> Do I explain things so badly? Let me try again.
> The difference between
>
> static int a;
>
> and
>
> static int a = 0;
>
> is the " = 0". The compiler may well generate the same code,
It does not. That's the whole point: the (functionally redundant) =0 wastes
another sizeof(int) bytes in the kernel image.
> but I am not talking about the compiler. I am talking about
> the programmer. This " = 0" means (to me, the programmer)
> that the correctness of my program depends on this initialization.
If you want to document your code like this, put it in a comment. That's what
they are there for. Or, if coding a function which explicitly relies on a
variable being 0, have that function set the variable to zero.
> Its absence means (to me) that it does not matter what initial
> value the variable has.
Which is silly. The variable is explicitly defined to be zero anyway, whether
you put this in your code or not.
> This is a useful distinction. It means that if the program
>
> static int a;
>
> int main() {
> /* do something */
> }
>
> is used as part of a larger program, I can just rename main
> and get
>
> static int a;
>
> int do_something() {
> ...
> }
>
> But if the program
>
> static int a = 0;
>
> int main() {
> /* do something */
> }
>
> is used as part of a larger program, it has to become
>
> static int a;
>
> int do_something() {
> a = 0;
> ...
> }
Just put:
static int a; /* must be set to zero in foobar() */
> You see that I, in my own code, follow a certain convention
> where presence or absence of assignments means something
> about the code.
Unfortunately, this handy documentation shortcut of yours bloats the kernel
unnecessarily.
> If now you change "static int a = 0;"
> into "static int a;" and justify that by saying that it
> generates the same code,
It does NOT generate the same code - that's the point. It generates smaller but
functionally equivalent code. The first version zeroes a TWICE, in effect; this
is completely unnecessary, and just bloats the kernel.
> then I am unhappy, because now
> if I turn main() into do_something() I either get a buggy
> program, or otherwise I have to read the source of main()
> again to see which variables need initialisation.
Oh no! You mean you might actually have to look at the code you're changing?!
This is hardly a valid reason for bloating the kernel! If you put the "this
variable must be zero when foo() is called" in a comment, rather than as a C
statement, it is equally clear to you - but avoids bloating the kernel.
> In a program source there is information for the compiler
> and information for the future me. Removing the " = 0"
> is like removing comments. For the compiler the information
> remains the same. For the programmer something is lost.
So put that comment in a real comment, rather than a redundant statement!
James.
Andries Brouwer wrote:
> In a program source there is information for the compiler
> and information for the future me. Removing the " = 0"
> is like removing comments. For the compiler the information
> remains the same. For the programmer something is lost.
This is pretty much personal opinion :)
The C language is full of implicit as well as explicit features. You
are arguing that using an implicit feature robs the programmer of
information. For you maybe... For others, no information is lost AND
the code is more clean AND the kernel is smaller. It's just a matter of
knowing and internalizing "the rules" in your head.
Jeff
--
Jeff Garzik |
Building 1024 | The chief enemy of creativity is "good" sense
MandrakeSoft | -- Picasso
On Sat, Nov 25, 2000 at 11:46:24PM +0100, Andries Brouwer wrote:
>
> But if the program
>
> static int a = 0;
>
> int main() {
> /* do something */
> }
>
> is used as part of a larger program, it has to become
>
> static int a;
>
> int do_something() {
> a = 0;
> ...
> }
Only if the person doing the change follows this convention. If that happens
to be you, not a problem. But in a project like Linux, it's not very likely
to happen.
It's much better to put a comment above the definition.
--
Debian GNU/Linux 2.2 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sat, Nov 25, 2000 at 10:53:00PM +0000, James A Sutherland wrote:
> Which is silly. The variable is explicitly defined to be zero
> anyway, whether you put this in your code or not.
Why doesn't the compiler just leave out explicit zeros from the
'initial data' segment then? Seems like it ought to be tought to..
Tim.
*/
On Sat, Nov 25, 2000 at 10:27:15PM +0000, Tigran Aivazian wrote:
: Hello Andries,
Hi Tigran,
: ... I am quite free to _rely_ on this fact and will possibly do so.
Yes, you are. But some programmers have learned that it is a good
idea to code in a way that is informative to the programmer.
: > Tigran, you like to destabilize Linux.
:
: Oh, Andries, that is insulting. Surely you do not really mean that.
No insult intended.
It is just that if there is an abyss somewhere, I like to stay at least
a meter away from it. Someone else may think that one inch suffices.
I see you propose a lot of changes that yield a negligible advantage
and reduce stability a tiny little bit. That is pushing Linux in the
direction of this abyss. You notice that the view gets better, and I
get nervous.
You seem to have these strange ideas. I quoted you
: It is better for code to be smaller than to be slightly more fool-proof.
Here is a different one:
: I think that the check for inode->i_op == NULL in various vfs_XXX()
: functions is bogus, i.e. if it is NULL then it must be a bug in
: some filesystem's ->read_inode() method and therefore, instead of
: returning error to userspace we should immediately panic, since
: it is a kernel bug.
Does the kernel contain a bug? Panic! I don't think my alpha would
have gotten an uptime of 1198 days under that paradigm.
(I don't think you were serious, but still..)
[I am not so sure why i_op == NULL necessarily is a bug.
Sometimes a routine invents a dummy inode just because it is needed
in some calling convention, while nothing of this inode is used
except for example i_rdev. Maybe it does not occur today, in the
filesystems in the 2.4 kernel tree. But such checks: test i_op,
then test i_op->function, then call i_op->function() ensure
a local correctness. That is what I like.
Reading all filesystems in the kernel tree is what I don't like.
And there are many filesystems not in the kernel tree.]
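
A sketch of the "test i_op, then test i_op->function, then call" pattern, written against hypothetical stand-in types. These are not the kernel's real inode or inode_operations structures; node, node_ops and node_sync are names made up for the illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's real structures. */
struct node;
struct node_ops {
    int (*sync)(struct node *);
};
struct node {
    const struct node_ops *ops;
    int synced;
};

/* Locally correct: every pointer is tested right before it is used. */
int node_sync(struct node *n)
{
    if (n == NULL || n->ops == NULL || n->ops->sync == NULL)
        return -EINVAL;   /* report an error instead of crashing */
    return n->ops->sync(n);
}

/* A trivial operation table used to exercise the happy path. */
static int mark_synced(struct node *n)
{
    n->synced = 1;
    return 0;
}

const struct node_ops demo_ops = { mark_synced };
```

A dummy node whose ops pointer is NULL gets -EINVAL back, and nothing outside node_sync() has to be read to see that the call cannot dereference a null pointer.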
This is not to debate this particular case - it is Al's business.
This is just an example where you want to sacrifice local correctness
and be satisfied with global correctness.
: "sense of measure"
Yes, well formulated!
But this was a communication to linux-kernel, not an attack.
It was meant to say two things, namely
(i) Source code is a communication to programmers and to the compiler.
It is a bad idea to optimize the communication towards the compiler
when that is detrimental to the communication towards programmers.
And (ii) locally correct code is more stable than code that is only
globally correct.
For me these are truisms, but when Rusty complained about loss of
information lots of people did not seem to understand what could be meant.
I took you as my victim because you always seem to take the point
of view that the code must be perfect, never mind the programmers,
and that it is a good idea to save a few instructions, never mind
local correctness. (And also because your old remark quoted above
still required a reaction.)
No offense intended.
Andries
On Sat, Nov 25, 2000 at 06:02:51PM -0500, Jeff Garzik wrote:
> Andries Brouwer wrote:
> > In a program source there is information for the compiler
> > and information for the future me. Removing the " = 0"
> > is like removing comments. For the compiler the information
> > remains the same. For the programmer something is lost.
>
> This is pretty much personal opinion :)
>
> The C language is full of implicit as well as explicit features. You
> are arguing that using an implicit feature robs the programmer of
> information. For you maybe... For others, no information is lost AND
> the code is more clean AND the kernel is smaller. It's just a matter of
> knowing and internalizing "the rules" in your head.
Oh Jeff,
All these really good people, unable to capture a simple idea.
Let me try one more time.
There is information. The information is:
"this variable needs initialization"
Now you tell me to know simple rules. OK, I know them.
But what do they tell me about my variables a and b, where
a requires initialization and b does not require it?
One can write a comment, like

    int a; /* this variable needs initialization, fortunately
              it is already initialized at startup */
    int b; /* no initialization required */

But that is overdoing it, it uglifies the code.
One can leave the comment out, like

    int a, b;

But then next month, when you decide to move this into
some function

    int foo() {
        int a, b;
        ...

there is no indication that you need an additional

    a = 0;

You see?
There is real information here. Useful as a reminder.
Not necessary. The perfect programmer would see
immediately that the assignment is required, also
without the reminder. But not everybody is perfect
all of the time, and sometimes the code involved is
quite complicated. The tiny convention
"write an explicit initialization when initialization is needed"
is helpful. It is a form of program documentation.
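
A sketch of the move described above, using invented names. At file scope both variables start at zero whether or not the code needs it; once they become automatic variables their initial values are indeterminate, so the one that was being relied upon needs an explicit initializer and the other does not.

```c
/* File scope: both start at zero, needed or not. */
static int file_hits;      /* relied upon: counting starts from 0 */
static int file_scratch;   /* always assigned before it is read */

int count_file_scope(const int *v, int n)
{
    for (int i = 0; i < n; i++) {
        file_scratch = v[i];
        if (file_scratch > 0)
            file_hits++;
    }
    return file_hits;
}

/* Function scope: automatic variables start out indeterminate. */
int count_function_scope(const int *v, int n)
{
    int hits = 0;   /* required now */
    int scratch;    /* still assigned before use, so no initializer */

    for (int i = 0; i < n; i++) {
        scratch = v[i];
        if (scratch > 0)
            hits++;
    }
    return hits;
}
```

Nothing in the two file-scope declarations says which of them would need the "= 0" after the move; that is the information the convention is meant to carry.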
Andries
>>>>> "AB" == Andries Brouwer writes:
AB> No insult intended. It is just that if there is an abyss
AB> somewhere, I like to stay at least a meter away from it. Someone
AB> else may think that one inch suffices. I see you propose a lot
AB> of changes that yield a negligible advantage and reduce stability
AB> a tiny little bit. That is pushing Linux in the direction of this
AB> abyss. You notice that the view gets better, and I get nervous.
Can somebody stop this train load of bunk?
Uninitialized global variables always have an initial value of
zero. Static or otherwise. Period.
Anybody with more than a week's experience programming knows this.
It's idiomatic. Just as in English one says, "Go away!" knowing that
"You", the implied subject of the imperative sentence, will be
understood.
Andries, please devote your impressive energy to fixing _real_ bugs.
This kind of argument is best left until we're _really_ low on other
things to do.
On Sat, 25 Nov 2000, Tim Waugh wrote:
>
> On Sat, Nov 25, 2000 at 10:53:00PM +0000, James A Sutherland wrote:
>
> > Which is silly. The variable is explicitly defined to be zero
> > anyway, whether you put this in your code or not.
>
> Why doesn't the compiler just leave out explicit zeros from the
> 'initial data' segment then? Seems like it ought to be tought to..
Good idea; unfortunately, it's probably too kernel-specific, so gcc may not
want to include this change. Also, the kernel is gcc version-specific; even if
this feature were automated in gcc now, it could take some time before the
kernel could safely be built under that version. Better to optimise the source
code to avoid the problem, rather than change the compiler to kludge around it.
James.
> AB> of changes that yield a negligible advantage and reduce stability
> AB> a tiny little bit. That is pushing Linux in the direction of this
> AB> abyss. You notice that the view gets better, and I get nervous.
>
> Can somebody stop this train load of bunk?
>
> Uninitialized global variables always have an initial value of
> zero. Static or otherwise. Period.
That isn't what Andries is arguing about. Read harder. It's semantic differences
rather than code differences.
static int a=0;
says 'I thought about this. I want it to start at zero. I've written it this
way to remind of the fact'
Sure it generates the same code
On Sun, 26 Nov 2000 04:25:05 +0000 (GMT), Alan Cox wrote:
>> AB> of changes that yield a negligible advantage and reduce stability
>> AB> a tiny little bit. That is pushing Linux in the direction of this
>> AB> abyss. You notice that the view gets better, and I get nervous.
>>
>> Can somebody stop this train load of bunk?
>>
>> Uninitialized global variables always have an initial value of
>> zero. Static or otherwise. Period.
>
>That isn't what Andries is arguing about. Read harder. It's semantic differences
>rather than code differences.
>
> static int a=0;
>
>says 'I thought about this. I want it to start at zero. I've written it this
>way to remind of the fact'
>
>Sure it generates the same code
It also says "I do not know much about the details of the kernel C
environment. In particular I do not know that all static variables are
initialized to 0 in the kernel startup. I have not read setup.S."
john alvord
On Sun, 26 Nov 2000, John Alvord wrote:
> On Sun, 26 Nov 2000 04:25:05 +0000 (GMT), Alan Cox wrote:
>
> >> AB> of changes that yield a negligible advantage and reduce stability
> >> AB> a tiny little bit. That is pushing Linux in the direction of this
> >> AB> abyss. You notice that the view gets better, and I get nervous.
> >>
> >> Can somebody stop this train load of bunk?
> >>
> >> Uninitialized global variables always have an initial value of
> >> zero. Static or otherwise. Period.
> >
> >That isn't what Andries is arguing about. Read harder. It's semantic differences
> >rather than code differences.
> >
> > static int a=0;
> >
> >says 'I thought about this. I want it to start at zero. I've written it this
> >way to remind of the fact'
> >
> >Sure it generates the same code
>
> It also says "I do not know much about the details of the kernel C
> environment. In particular I do not know that all static variables are
> initialized to 0 in the kernel startup. I have not read setup.S."
Are you positive for modules too...
Regardless of the fact you have displayed, some of us prefer to clobber it
to ensure that it stays zero until access. Last thing you want is an
unstatic static when we go to spin a disk for data.
Just how warm and fuzzy do you feel if your block drivers do not ensure
this point?
Cheers,
Andre Hedrick
Linux ATA Development
Andries Brouwer wrote:
> On Sat, Nov 25, 2000 at 10:27:15PM +0000, Tigran Aivazian wrote:
I think it's a bad sign if people like the two of you start flaming
each other ...
On the issue of static int foo = 0; vs. static int foo; I'd agree
with Andries' view. It's a common enough idiom that it is useful to
convey the intentions of the programmer.
On "optimizing" changes: there are plenty of very ugly things you can
do to a C program to make source or object code smaller (e.g. use only
one-character identifiers for smaller code; re-use variables as much
as possible, maybe with casts for smaller stack footprint, etc.). We
usually avoid these too, so a few extra initializations in the source
shouldn't hurt.
On the .data segment size: if all the energy that went into this
thread had gone into implementing a gcc option to move all-zero
.data objects to .bss, the technical side of the problem would be
solved already ;-)
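
For readers who want to see the size effect being argued about, here is a small sketch with invented names. Later GCC releases do place zero-initialised objects in .bss by default on targets that have such a section, and offer -fno-zero-initialized-in-bss to turn that off; how the compiler in use at the time of this thread behaved is described in the messages above and is not re-verified here.

```c
#define TABLE_LEN 1024

static int implicit_table[TABLE_LEN];           /* no initializer */
static int explicit_table[TABLE_LEN] = { 0 };   /* explicit all-zero initializer */
static int nonzero_table[TABLE_LEN] = { 1 };    /* contents must be stored in the image */

/* Reads one element of each table so that all three are referenced. */
int first_elements(void)
{
    return implicit_table[0] + explicit_table[0] + nonzero_table[0];
}
```

Compiling this to an object file and inspecting it with the binutils size or nm tools shows which tables ended up in .data and which in .bss; repeating the build with -fno-zero-initialized-in-bss on a GCC that supports the flag should move explicit_table into .data. The run-time values are the same either way.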
> Does the kernel contain a bug? Panic! I don't think my alpha would
> have gotten an uptime of 1198 days under that paradigm.
> (I don't think you were serious, but still..)
Hmm, sometimes a panic _is_ the right answer, though. If a critical
subsystem just politely returns an error to user space and tries to
continue, it may take a while until somebody realizes that there's
something wrong at all ...
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, ICA, EPFL, CH [email protected] /
/_IN_N_032__Tel_+41_21_693_6621__Fax_+41_21_693_6610_____________________/
On Sat, 25 Nov 2000 21:10:19 -0800 (PST), Andre Hedrick wrote:
>On Sun, 26 Nov 2000, John Alvord wrote:
>> It also says "I do not know much about the details of the kernel C
>> environment. In particular I do not know that all static variables are
>> initialized to 0 in the kernel startup. I have not read setup.S."
>
>Are you positive for modules too...
Yes.
On Sun, 26 Nov 2000, Keith Owens wrote:
> >Are you positive for modules too...
>
> Yes.
I know this, I am being punchy.
Cheers,
Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development
Hi Andries!
> All these really good people, unable to capture a simple idea.
> Let me try one more time.
> There is information. The information is:
> "this variable needs initialization"
> Now you tell me to know simple rules. OK, I know them.
> But what do they tell me about my variables a and b, where
> a requires initialization and b does not require it?
Distinguishing between variables initialized to zero and those not requiring
initialization is a good idea, but honestly, how common are static variables
declared at the top level which don't require initialization?
Have a nice fortnight
--
Martin `MJ' Mares <[email protected]> <[email protected]> http://atrey.karlin.mff.cuni.cz/~mj/
"RAM = Rarely Adequate Memory"
Andries Brouwer writes:
> Oh, please - something is wrong with your reading comprehension.
> Don't you understand the word "irrelevant"? It means that the
> initial value does not matter. It does not mean undefined.
> Please reread my letter and comment when you understand my point.
So now you try personal insult to get your non-point across?
There is no more discussion to be had; this has rapidly descended into
yet another flaming match which I do not want to continue. Please
desist, and we'll all keep our opinions to ourselves on this matter, ok?
_____
|_____| ------------------------------------------------- ---+---+-
| | Russell King [email protected] --- ---
| | | | http://www.arm.linux.org.uk/personal/aboutme.html / / |
| +-+-+ --- -+-
/ | THE developer of ARM Linux |+| /|\
/ | | | --- |
+-+-+ ------------------------------------------------- /\\\ |
On Sat, 25 Nov 2000, Tim Waugh wrote:
> On Sat, Nov 25, 2000 at 10:53:00PM +0000, James A Sutherland wrote:
>
> > Which is silly. The variable is explicitly defined to be zero
> > anyway, whether you put this in your code or not.
>
> Why doesn't the compiler just leave out explicit zeros from the
> 'initial data' segment then? Seems like it ought to be tought to..
yes, taught to, _BUT_ never let this be a default option, please.
Because there are valid cases where a programmer thinks "this is in .data"
and that means this should be in .data. Think of binary patching an object
as one valid example (there may be others, I forgot).
Regards,
Tigran
On Sun, 26 Nov 2000, John Alvord wrote:
> It also says "I do not know much about the details of the kernel C
> environment. In particular I do not know that all static variables are
> initialized to 0 in the kernel startup. I have not read setup.S."
John, please stop insulting Andries, you would be _surprised_ to find out
how much he actually knows about a multitude of things.
As for Andries' point of loss of information, he has a point, _but_ James'
suggestion to put that extra info in the comment, imho, outweighs the
small disadvantages (code looks a bit uglier) which Andries pointed out to
counter it.
Regards,
Tigran.
On Sun, 26 Nov 2000, John Alvord wrote:
> It also says "I do not know much about the details of the kernel C
> environment. In particular I do not know that all static variables are
> initialized to 0 in the kernel startup. I have not read setup.S."
                                                          ~~~~~~~
Sorry, John, I _have_ to [give good example to others]. The above says
that _you_ my dear friend, do not know where the BSS clearing code is. It
is not in setup.S. It is not even in the same directory, where setup.S is.
It is in arch/i386/kernel/head.S, starting from line 120:
/*
* Clear BSS first so that there are no surprises...
*/
xorl %eax,%eax
movl $ SYMBOL_NAME(__bss_start),%edi
movl $ SYMBOL_NAME(_end),%ecx
subl %edi,%ecx
cld
rep
stosb
... speaking of which (putting asbestos on and hiding from Andries ;) can't
we optimize this code to store words at a time and not bytes.... ;)
Regards,
Tigran
John Alvord wrote:
> On Sun, 26 Nov 2000 04:25:05 +0000 (GMT), Alan Cox
> <[email protected]> wrote:
>
> >> AB> of changes that yield a negligible advantage and reduce stability
> >> AB> a tiny little bit. That is pushing Linux in the direction of this
> >> AB> abyss. You notice that the view gets better, and I get nervous.
> >>
> >> Can somebody stop this train load of bunk?
> >>
> >> Uninitialized global variables always have a initial value of
> >> zero. Static or otherwise. Period.
> >
> >That isn't what Andries is arguing about. Read harder. It's semantic differences
> >rather than code differences.
> >
> > static int a=0;
> >
> >says 'I thought about this. I want it to start at zero. I've written it this
> >way to remind of the fact'
> >
> >Sure it generates the same code
>
> It also says "I do not know much about the details of the kernel C
> environment. In particular I do not know that all static variables are
> initialized to 0 in the kernel startup. I have not read setup.S."
Nope. It doesn't say that. Maybe if you wrote the code. But if Andries
or I had written that line, it just says that when written the
programmer thought about the initial value, and that the initial value
matters on this variable.
It is a concise form of documentation. As Andries explained, this can
also be done with comments or with
static int a /* = 0 */;
However, I like the "=0" variant much better.
If you're worried about the inefficiency of the compiler, take it up
with the compiler guys. Or write an extra preprocessor step or
something like that.
Roger.
--
** [email protected] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots.
* There are also old, bald pilots.
On Sun, Nov 26, 2000 at 04:25:05AM +0000, Alan Cox wrote:
> static int a=0;
>
> says 'I thought about this. I want it to start at zero. I've written it this
> way to remind of the fact'
>
> Sure it generates the same code
I agree it would be best if gcc would generate the same code; unfortunately
this doesn't seem to be the case, which sounds like something to take up with
the gcc developers.
On Sun, Nov 26, 2000 at 10:52:05AM +0000, Tigran Aivazian wrote:
> that _you_ my dear friend, do not know where the BSS clearing code is. It
> is not in setup.S. It is not even in the same directory, where setup.S is.
> It is in arch/i386/kernel/head.S, starting from line 120:
On a related note, I seem to remember that back in the dark ages, the BSS
wasn't cleared. It said so somewhere in the Kernel Hackers Guide, I think.
Regards,
bert hubert
--
PowerDNS Versatile DNS Services
Trilab The Technology People
'SYN! .. SYN|ACK! .. ACK!' - the mating call of the internet
On Sun, Nov 26, 2000 at 10:37:07AM +0000, Tigran Aivazian wrote:
> On Sat, 25 Nov 2000, Tim Waugh wrote:
> > Why doesn't the compiler just leave out explicit zeros from the
> > 'initial data' segment then? Seems like it ought to be tought to..
>
> yes, taught to, _BUT_ never let this be a default option, please.
> Because there are valid cases where a programmer thinks "this is in .data"
That's what __attribute__ ((section (".data"))) is for.
> and that means this should be in .data. Think of binary patching an object
> as one valid example (there may be others, I forgot).
can you think of any valid examples that apply to the kernel ?
>>>>> "AC" == Alan Cox <[email protected]> writes:
AC> Sure it generates the same code
If you accept that code == .text, as do I, then there is no code
generated for either of the forms being argued.
Followup to: <[email protected]>
By author: Alan Cox <[email protected]>
In newsgroup: linux.dev.kernel
>
> That isn't what Andries is arguing about. Read harder. It's semantic differences
> rather than code differences.
>
> static int a=0;
>
> says 'I thought about this. I want it to start at zero. I've written it this
> way to remind of the fact'
>
> Sure it generates the same code
>
The problem is that it doesn't. One could argue this is a gcc bug or
rather missed optimization.
One can, of course, also write:
static int a /* = 0 */;
... to make it clear to human programmers without making gcc make bad
code.
-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
Hi!
> Sorry, John, I _have_ to [give good example to others]. The above says
> that _you_ my dear friend, do not know where the BSS clearing code is. It
> is not in setup.S. It is not even in the same directory, where setup.S is.
> It is in arch/i386/kernel/head.S, starting from line 120:
>
> /*
> * Clear BSS first so that there are no surprises...
> */
> xorl %eax,%eax
> movl $ SYMBOL_NAME(__bss_start),%edi
> movl $ SYMBOL_NAME(_end),%ecx
> subl %edi,%ecx
> cld
> rep
> stosb
>
> ... speaking of which (putting asbestos on and hiding from Andries ;) can't
> we optimize this code to store words at a time and not bytes.... ;)
There's a better way: put the bss clearing code at the beginning of the C
code and do it with memset. [x86-64 does it this way.] It is both more
obvious [no assembly] and faster [memset is optimized].
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
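
A minimal sketch of the memset approach described above, not the actual
x86-64 code. The function name clear_region is hypothetical; in a real
kernel the two pointers would come from linker-provided symbols such as
__bss_start and _end, while here they are ordinary pointers so the idea can
be tried in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero the half-open byte range [start, end) with a single memset() call.
 * This stands in for the "rep stosb" loop: the C library (or the kernel's
 * own memset) is free to store whole words at a time.
 */
void clear_region(unsigned char *start, unsigned char *end)
{
    assert(start <= end);
    memset(start, 0, (size_t)(end - start));
}
```

Called as clear_region(buf + 8, buf + 24), it zeroes bytes 8 through 23 and
leaves everything outside that range untouched.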
On Sat, Nov 25, 2000 at 11:55:11PM +0000, Tim Waugh wrote:
> On Sat, Nov 25, 2000 at 10:53:00PM +0000, James A Sutherland wrote:
>
> > Which is silly. The variable is explicitly defined to be zero
> > anyway, whether you put this in your code or not.
>
> Why doesn't the compiler just leave out explicit zeros from the
> 'initial data' segment then? Seems like it ought to be tought to..
Because sometimes it matters. For example, in kernel mode (and certainly for
embedded programs that I'm more familiar with), the kernel does go through and
zero out the so called BSS segment, so that normally uninitialized static
variables will follow the rules as laid out under the C standards (both C89 and
C99). I can imagine however, that the code that is executed before the BSS
area is zeroed out needs to be extra careful in terms of statics that it
references, and those must be hand initialized.
--
Michael Meissner, Red Hat, Inc.
PMB 198, 174 Littleton Road #3, Westford, Massachusetts 01886, USA
Work: [email protected] phone: +1 978-486-9304
Non-work: [email protected] fax: +1 978-692-4482
Andries Brouwer wrote:
>
> On Sun, Nov 26, 2000 at 09:11:18AM +1100, Herbert Xu wrote:
>
> > No information is lost.
>
> Do I explain things so badly? Let me try again.
> The difference between
>
> static int a;
>
> and
>
> static int a = 0;
>
> is the " = 0". The compiler may well generate the same code,
> but I am not talking about the compiler. I am talking about
> the programmer. This " = 0" means (to me, the programmer)
> that the correctness of my program depends on this initialization.
> Its absence means (to me) that it does not matter what initial
> value the variable has.
Seems to me few other people think that way, that's why it is so
hard for them to get. And that's why this style of coding isn't very
helpful either. It may be a real help for you, but not for others
who merely get confused or irritated at the small but easy to eliminate
micro-bloat.
There are certainly people so used to the implicit zeroing that they
think of "static int a;" as a zero initialization as explicit as
anything, because that's the way the language works. And they will
take just as much care if the "a-using" code is modified to run twice.
The "=0" part doesn't make it clearer for them if it was clear already.
Helge Hafting
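
The "run twice" case discussed above can be illustrated with a small
sketch. The names total, accumulate and accumulate_fresh are made up for
this example and do not come from any kernel code. The first call works
because static storage starts at zero; a second run only works if somebody
remembers to reset the variable, whichever way the declaration was spelled.

```c
#include <assert.h>
#include <stddef.h>

static int total;   /* = 0: the first call below relies on this */

/* Adds the n values in v to the running total and returns the total. */
int accumulate(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Wrapper for repeated use: restores the state the first run got for free. */
int accumulate_fresh(const int *v, size_t n)
{
    total = 0;
    return accumulate(v, n);
}
```

With v = {1, 2, 3}, two plain accumulate calls return 6 and then 12, while
accumulate_fresh returns 6 again.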
Andries Brouwer writes:
> Do I explain things so badly? Let me try again.
> The difference between
>
> static int a;
>
> and
>
> static int a = 0;
>
> is the " = 0". The compiler may well generate the same code,
> but I am not talking about the compiler. I am talking about
> the programmer. This " = 0" means (to me, the programmer)
> that the correctness of my program depends on this initialization.
> Its absence means (to me) that it does not matter what initial
> value the variable has.
It is too late to fix things now. It would have been good to
have the compiler put explicitly zeroed data in a segment that
isn't shared with non-zero or uninitialized data, so that the
uninitialized data could be set to 0xfff00fff to catch bugs.
It would take much effort over many years to make that work.
I'd rather see the compiler optimize for cache line use and
make use of small address offsets to load variables.
Albert D. Cahalan writes:
> It is too late to fix things now. It would have been good to
> have the compiler put explicitly zeroed data in a segment that
> isn't shared with non-zero or uninitialized data, so that the
> uninitialized data could be set to 0xfff00fff to catch bugs.
> It would take much effort over many years to make that work.
Oh dear, here's that misconception again.
static int a;
isn't a bug. It is not "uninitialised data". It is defined to be
zero. Setting the BSS of any C program to contain non-zero data will
break it. Fact. The only bug you'll find is the fact that you're
breaking the C standard.
There are only two places where you come across uninitialised data:
1. memory obtained from outside text, data, bss limit of the program
(ie, malloced memory)
2. if you use auto variables which may be allocated on the stack
All variables declared at top-level are initialised. No questions
asked. And it's not a bug to rely on such a fact.
_____
|_____| ------------------------------------------------- ---+---+-
| | Russell King [email protected] --- ---
| | | | http://www.arm.linux.org.uk/personal/aboutme.html / / |
| +-+-+ --- -+-
/ | THE developer of ARM Linux |+| /|\
/ | | | --- |
+-+-+ ------------------------------------------------- /\\\ |
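
The distinction drawn above can be sketched in standard C. The function
names are hypothetical. Objects with static storage duration start at zero
without an initializer, while heap memory does not, which is why the sketch
uses calloc() rather than malloc() when it needs zeroes. Automatic
variables are left out on purpose, since reading one before it is written
cannot be demonstrated safely.

```c
#include <assert.h>
#include <stdlib.h>

static int file_scope_counter;              /* static storage: starts at 0 */

/* Returns 1 if both static-storage objects still hold their initial 0. */
int statics_start_at_zero(void)
{
    static int function_scope_counter;      /* also static storage */
    return file_scope_counter == 0 && function_scope_counter == 0;
}

/*
 * Heap memory from malloc() has indeterminate contents; calloc() is the
 * standard call that returns zero-filled memory. The caller frees the
 * result. Returns NULL if the allocation fails.
 */
int *alloc_zeroed_ints(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(int));
}
```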
[email protected] (H. Peter Anvin) wrote on 26.11.00 in <[email protected]>:
> The problem is that it doesn't. One could argue this is a gcc bug or
> rather missed optimization.
>
> One can, of course, also write:
>
> static int a /* = 0 */;
>
> ... to make it clear to human programmers without making gcc make bad
> code.
This (or similar) has the added advantage of making it obvious that this
is documentation, and not a superfluous initialization.
Sure, if you (generic you) look at your own code, you may know what it
means if it's written a certain way. But if you look at other's code, or
others look at your code, that is not clear. It is clear with a comment.
MfG Kai
[Tigran Aivazian]
> _BUT_ never let this be a default option, please. Because there
> are valid cases where a programmer thinks "this is in .data" and that
> means this should be in .data.
If you are writing the sort of code that cares which section it ends up
in, you need to use __attribute__((section)). You probably will be
using things like __attribute__((aligned)) as well. Relying on compiler
behavior here is dangerous.
I agree though that an option is called for, either -fassume-bss-zero
or -fno-assume-bss-zero, not sure which should be the default.
Peter
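
A short sketch of the explicit-placement approach mentioned above, assuming
GCC or a compatible compiler and an ELF target where the section is named
".data". The variable and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Ask for .data explicitly instead of relying on where zeroes end up. */
int patch_slot __attribute__((section(".data"))) = 0;

/* Request 16-byte alignment explicitly rather than relying on defaults. */
unsigned char patch_area[16] __attribute__((aligned(16)));

/* Returns 1 if patch_area really starts on a 16-byte boundary. */
int patch_area_is_aligned(void)
{
    return ((uintptr_t)patch_area % 16) == 0;
}

/* Sanity check a caller could run once at startup. */
void check_patch_layout(void)
{
    assert(patch_slot == 0);
    assert(patch_area_is_aligned());
}
```

Which section an object actually landed in can be checked afterwards with a
tool such as objdump or nm on the object file.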
Russell King writes:
> Albert D. Cahalan writes:
>> It is too late to fix things now. It would have been good to
>> have the compiler put explicitly zeroed data in a segment that
>> isn't shared with non-zero or uninitialized data, so that the
>> uninitialized data could be set to 0xfff00fff to catch bugs.
>> It would take much effort over many years to make that work.
>
> Oh dear, here's that misconception again.
>
> static int a;
>
> isn't a bug.
Alone, it is not.
> It is not "uninitialised data". It is defined to be
> zero. Setting the BSS of any C program to contain non-zero data will
> break it. Fact. The only bug you'll find is the fact that you're
> breaking the C standard.
Oh, bullshit. We break the C standard left and right already.
This is the kernel, and the kernel can initialize BSS any damn
way it feels like initializing it. The kernel isn't ever going
to be standard C.
Choosing an initializer that tends to catch unintended reliance
on zeroed data would be good. Too bad it is too late to fix.
> All variables declared at top-level are initialised. No questions
> asked. And it's not a bug to rely on such a fact.
Go back and read the rest of this thread. Examples have been
provided (not by me) of such code leading to later mistakes.
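
The poisoning idea can be tried in user space with a sketch like the
following. It only illustrates the debugging technique on an ordinary
buffer and does not touch a real BSS. The names POISON_WORD, poison_words
and is_poisoned are hypothetical; the pattern 0xfff00fff is the one
suggested earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POISON_WORD 0xfff00fffu   /* pattern suggested in the thread */

/* Fill n 32-bit words with the poison pattern. */
void poison_words(uint32_t *words, size_t n)
{
    assert(words != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        words[i] = POISON_WORD;
}

/* Returns 1 if a word still holds the poison value, i.e. was never written. */
int is_poisoned(uint32_t word)
{
    return word == POISON_WORD;
}
```

Code that reads a word before writing it then sees an obviously wrong value
instead of a plausible zero.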
[Albert D. Cahalan]
> Choosing an initializer that tends to catch unintended reliance on
> zeroed data would be good. Too bad it is too late to fix.
Why would that be good? Why is it bad to accidentally rely on zeroed
data, if the data is in fact guaranteed to be zeroed? It's not like
this is going to change out from under us in a year. You said it
yourself: we can do whatever we want here. And I don't see why we
would ever want to do anything other than zero it.
> Go back and read the rest of this thread. Examples have been provided
> (not by me) of such code leading to later mistakes.
Oh please, how hard can it be to type
static int foo; // = 0
as opposed to
static int foo = 0;
If the two produced the same object code, the second would be better,
but they don't, so it isn't. Patch gcc, if you care enough (and feel
you can convince the gcc steering committee to care enough).
Peter
Albert D. Cahalan writes:
> Oh, bullshit. We break the C standard left and right already.
> This is the kernel, and the kernel can initialize BSS any damn
> way it feels like initializing it. The kernel isn't ever going
> to be standard C.
>
> Choosing an initializer that tends to catch unintended reliance
> on zeroed data would be good. Too bad it is too late to fix.
It's not me talking bullshit here, it's you. It is totally reasonable
to rely on:
static int foo;
to be zero. If it is not, that is a bug in the C startup code. No
two ways about it. If someone then says "I want to initialise the
BSS to some magic value to catch this reliance" then we are breaking
a lot of people's expectations. (Least Surprise theory)
To say again, relying on foo to be zero is not a bug.
If you set the BSS to something non-zero, we already know that a lot
will break. But it will break because someone has broken the BSS
initialisation code, not because it is relying on something that is
expected to be standard. By setting the BSS to something non-zero,
you're not telling anyone anything new. About the only response
will be "fix the BSS initialisation".
If you want to try this, then that is up to you. Don't let us
stop you. However, don't expect people to accept patches to
"fix" your self-created problem.
I look forward to your complaints about the disk subsystems,
keyboard, console, and so forth apparently being broken.
_____
|_____| ------------------------------------------------- ---+---+-
| | Russell King [email protected] --- ---
| | | | http://www.arm.linux.org.uk/personal/aboutme.html / / |
| +-+-+ --- -+-
/ | THE developer of ARM Linux |+| /|\
/ | | | --- |
+-+-+ ------------------------------------------------- /\\\ |