2001-04-12 12:37:12

by Adam J. Richter

Subject: List of all-zero .data variables in linux-2.4.3 available

For anyone who is interested, I have produced a list of all
of the .data variables that contain all zeroes and could be moved to
.bss within the kernel and all of the modules (all of the modules
that we build at Yggdrasil for x86, which is almost all). These
are global or static variables that have been declared

int foo = 0;

instead of

int foo; /* = 0 */

The result is that the .o files are bigger than they have
to be. The kernel memory image is not bigger, and gzip shrinks the
runs of zeroes down to almost nothing, so it does not have a huge effect
on bootable disks. Still, it would be nice to save the disk space of
the approximately 75 kilobytes of zeroes and perhaps squeeze in another
sector or two when building boot floppies.
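
As a minimal sketch of the two forms described above (the variable and function names are hypothetical and not taken from the kernel source), the following file defines one variable each way. Both start out as zero; the difference discussed in this thread is only where the compiler places them.

```c
/* Sketch only: these names are illustrative, not from the kernel. */

/* Explicit zero initializer. With the gcc behaviour described in this
 * thread, this puts sizeof(int) bytes of zeroes into .data in the .o file. */
int explicit_zero = 0;

/* No initializer. C still guarantees the value is 0 at program start,
 * and no bytes for it need to be stored in the .o file. */
int implicit_zero; /* = 0 */

/* Small accessor so a caller or test can observe both values. */
int sum_of_both(void)
{
    return explicit_zero + implicit_zero;
}
```

Running "nm" or "size" on the resulting object file is one way to see which section each symbol ended up in. The result depends on the compiler version and flags, since newer gcc releases may place explicitly zeroed variables in .bss on their own.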

I have also included a copy of the program that I wrote to
find these all-zero .data variables.
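
The posted program is not reproduced here. As a rough illustration only, the central test that any such tool needs is whether the bytes backing a symbol are all zero. The helper below is a hypothetical sketch of that check, not code from the posted program; locating each symbol's bytes inside the .data section of an object file is left out.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in buf[0..len) is zero,
 * 0 otherwise. An empty range counts as all-zero. */
int is_all_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0)
            return 0;   /* found a non-zero byte: must stay in .data */
    }
    return 1;           /* candidate for moving to .bss */
}
```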

The program and the output are FTPable from
ftp://ftp.yggdrasil.com/private/adam/linux/zerovars/. Files with no
all-zero .data variables are not included in the listing. If you maintain
any code in the kernel, you might want to look at the output to see
how your code stacks up.

Adam J. Richter __ ______________ 4880 Stevens Creek Blvd, Suite 104
[email protected] \ / San Jose, California 95129-1034
+1 408 261-6630 | g g d r a s i l United States of America
fax +1 408 261-6631 "Free Software For The Rest Of Us."


2001-04-12 14:42:35

by Johan Adolfsson

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

Shouldn't a compiler be able to deal with this instead?
(Just a thought.)
/Johan


2001-04-12 14:58:47

by Russell King

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

On Thu, Apr 12, 2001 at 04:44:48PM +0200, [email protected] wrote:
> Shouldn't a compiler be able to deal with this instead?
> (Just a thought.)

Search the lkml archives for discussion on this topic around Christmas.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2001-04-12 15:32:38

by Richard B. Johnson

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

On Thu, 12 Apr 2001 [email protected] wrote:

> Shouldn't a compiler be able to deal with this instead?
> (Just a thought.)
> /Johan

The compiler does deal with it. That's why you have a choice when
you write code.

The de facto standard has been that initialized data, regardless of
whether it's initialized with zero, goes into the ".data" area (segment).
Uninitialized data, which gets zeroed at run time, goes into
the ".bss" area.

If you declare a file-scope variable as:

int foo;

it goes into '.bss'.

If you declare it as:

int foo = 0;

it goes into '.data'.

Data that is in '.data' occupies space in the executable image. With
the kernel, it makes the kernel larger than necessary.

Data that is in '.bss' is represented in the file only by a size
field in the header. It tells the loader how much space to allocate
and zero. There is quite an obvious advantage to using '.bss' when
possible, rather than '.data'.

At one time, data that was declared as:

int foo;

may, or may not, have been initialized to zero. This was "implementation
defined". Therefore we were taught to always initialize these variables.

The C standard (ANSI C89 and later) requires that such variables be
initialized to zero, so you don't need to do this anymore. This allows
such variables to be put into space that is allocated at run time,
making executable files shorter.
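
As a minimal sketch of that guarantee (the type and variable names are hypothetical), objects with static storage duration and no initializer start out zeroed. This covers scalars, arrays, and structure members, including pointers, which start out as null pointers.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct counters {
    int hits;
    long bytes;
    void *owner;
};

static int scalar;              /* starts as 0 */
static char buffer[64];         /* every element starts as 0 */
static struct counters stats;   /* every member starts as 0 or NULL */

/* Returns 1 if all of the objects above still hold their initial
 * zero values, 0 otherwise. */
int statics_start_zeroed(void)
{
    size_t i;

    for (i = 0; i < sizeof buffer; i++) {
        if (buffer[i] != 0)
            return 0;
    }
    return scalar == 0 && stats.hits == 0 &&
           stats.bytes == 0 && stats.owner == NULL;
}
```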


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-04-12 19:02:12

by Adam J. Richter

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

[email protected] writes:
>Shouldn't a compiler be able to deal with this instead?

Yes. I sent some email to bug-gcc about this a couple of
months ago and even posted some (probably horribly incorrect) code
showing roughly the change I had in mind in the gcc source code
for the simple case of scalar variables. I was told that some code
to do this was put in and then removed from gcc a long time ago, and
nobody seemed interested in putting it back in. I would think that this
would be a basic optimization that I would expect the compiler to make,
just like deleting "if(0) {......}" code, but gcc does not currently
do that. If somebody would like to fix gcc and do the necessary
lobbying to get such a change integrated, that would be great. However,
until that actually happens, I hope the file that I posted to
ftp://ftp.yggdrasil.com/private/adam/linux/zerovars/ will be useful
to individual maintainers and in identifying the largest arrays of
zeroes that can be fixed in a few lines.


2001-04-12 19:15:42

by Ulrich Drepper

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

"Adam J. Richter" <[email protected]> writes:

> >Shouldn't a compiler be able to deal with this instead?
>
> Yes.

No. gcc must not do this. There are situations where you must place
a zero-initialized variable in .data. It is a programmer problem.

--
---------------. ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Red Hat `--' drepper at redhat.com `------------------------

2001-04-12 19:30:13

by Adam J. Richter

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

>> = Adam Richter
> = Ulrich Drepper

>> >Shouldn't a compiler be able to deal with this instead?
>>
>> Yes.

>No. gcc must not do this. There are situations where you must place
>a zero-initialized variable in .data. It is a programmer problem.

I am aware of a couple of cases where code relied on static
variables being allocated contiguously, but, in both cases, those
variables were either all zeros or all non-zeros, so my proposed
change would not break such code. Also, variables being allocated
contiguously is not an assumption supported by any standard that
I am aware of, and the very rare cases where code relies on this
should instead use an array (they've been of the same type in the
examples that I have come across). At the very least, it seems
to me that this should be a compiler optimization flag, preferably
defaulted to "on".
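
A minimal sketch of the array alternative mentioned above, with hypothetical names: rather than declaring several separate static variables and assuming they sit next to each other, an array gives contiguity that the language actually guarantees, whichever section the compiler chooses.

```c
#include <stddef.h>

#define NSLOTS 3

/* One array instead of three separate statics. Array elements are
 * contiguous by definition, so walking them with a pointer is valid. */
static int slots[NSLOTS];

/* Writes the same value into every slot by stepping a pointer across
 * the array, which would be unsafe across unrelated variables. */
void set_all_slots(int value)
{
    int *p;

    for (p = slots; p < slots + NSLOTS; p++)
        *p = value;
}

/* Reads one slot; the caller must pass an index below NSLOTS. */
int get_slot(size_t i)
{
    return slots[i];
}
```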

If you have some other scenario in mind, I'd appreciate an
example or a clear reference to some explanation, and I think others
on linux-kernel would probably appreciate that too. It is a topic
that comes up repeatedly on linux-kernel.


2001-04-12 19:43:04

by Ulrich Drepper

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

"Adam J. Richter" <[email protected]> writes:

> I am aware of a couple of cases where code relied on static
> variables being allocated contiguously, but, in both cases, those
> variables were either all zeros or all non-zeros, so my proposed
> change would not break such code.

Contiguous placement is not the only property defined by
initialization. There are many more. You cannot change this since it
will break quite a few programs and libraries in subtle ways that are
hard or impossible to identify. Simply educate programmers to not
initialize.

--
---------------. ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Red Hat `--' drepper at redhat.com `------------------------

2001-04-12 19:51:03

by Adam J. Richter

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

>> I am aware of a couple of cases where code relied on static
>> variables being allocated contiguously, but, in both cases, those
>> variables were either all zeros or all non-zeros, so my proposed
>> change would not break such code.

>Contiguous placement is not the only property defined by
>initialization. There are many more. You cannot change this since it
>will break quite a few programs and libraries in subtle ways that are
>hard or impossible to identify. Simply educate programmers to not
>initialize.

If it is so simple to "educate" programmers on this,
could you provide an example or some specifics, especially on why
this should not even be a compiler option? Surely that will save
you some iterations in this discussion.


2001-04-12 19:53:06

by Jeff Garzik

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

"Adam J. Richter" wrote:
> For anyone who is interested, I have produced a list of all
> of the .data variables that contain all zeroes and could be moved to
> .bss within the kernel and all of the modules (all of the modules
> that we build at Yggdrasil for x86, which is almost all). These
> are global or static variables that have been declared

Thanks, but Andrey Panin did you one better -- he produced a patch which
fixes up a good number of these. You should follow lkml more closely :)

--
Jeff Garzik | Sam: "Mind if I drive?"
Building 1024 | Max: "Not if you don't mind me clawing at the dash
MandrakeSoft | and shrieking like a cheerleader."

2001-04-12 20:05:14

by Adam J. Richter

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

>Thanks, but Andrey Panin did you one better -- he produced a patch which
>fixes up a good number of these. You should follow lkml more closely :)

I missed that patch and have been unable to find it on google/dejanews.
However, my point is to provide an exhaustive list with sizes (and the tool
for generating it), to make it easier to spot and prioritize ones that
may have been missed.

Anyhow, thanks for the tip. Perhaps I should run this program and
post results again on a subsequent kernel release (presumably
with Andrey's patch), although anyone else can run this program
just as easily.


2001-04-12 21:28:39

by H. Peter Anvin

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

Followup to: <[email protected]>
By author: Ulrich Drepper <[email protected]>
In newsgroup: linux.dev.kernel
>
> "Adam J. Richter" <[email protected]> writes:
>
> > >Shouldn't a compiler be able to deal with this instead?
> >
> > Yes.
>
> No. gcc must not do this. There are situations where you must place
> a zero-initialized variable in .data. It is a programmer problem.
>

And this cannot be decorated with __attribute__((section(".data")))
why?
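
A minimal sketch of that decoration, assuming gcc or a compatible compiler and an ELF target whose initialized-data section is named ".data" (the variable names are hypothetical). The attribute pins one variable to .data explicitly, so code that truly depends on that placement would not be affected by whatever the compiler does with other zero-initialized variables.

```c
/* GCC extension: request that this variable be placed in .data,
 * even though its initial value is zero. */
int pinned_zero __attribute__((section(".data"))) = 0;

/* No attribute and no initializer: placement is left to the compiler. */
int ordinary_zero;

/* Accessor so a caller or test can observe the pinned variable. */
int read_pinned(void)
{
    return pinned_zero;
}
```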

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt

2001-04-16 21:21:50

by Pavel Machek

Subject: Re: List of all-zero .data variables in linux-2.4.3 available

Hi!

> > I am aware of a couple of cases where code relied on static
> > variables being allocated contiguously, but, in both cases, those
> > variables were either all zeros or all non-zeros, so my proposed
> > change would not break such code.
>
> Contiguous placement is not the only property defined by
> initialization. There are many more. You cannot change this since it
> will break quite a few programs and libraries in subtle ways that are
> hard or impossible to identify. Simply educate programmers to not
> initialize.

Unless ANSI C specifies such behaviour, such code is buggy. And buggy
code should be fixed, not used as an argument against optimization.
[Of course, you can turn off that optimization for buggy code, if the
code is too ugly to fix.]

--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.