2002-10-18 11:05:27

by Samium Gromoff

Subject: 2.5 and lowmemory boxens

first: i`ve successfully ran 2.5.43 on a 386sx20/4M ram notebook.

the one problem was the ppp over serial not working, but i suspect
that it just needs to be recompiled with 2.5 headers (am i right?).

the other was, well, the fact that ultra-stripped 2.5.43
still used 200k more memory than 2.4.19, and thats despite it was
compiled with -Os instead of -O2.
actually it was 2000k free with 2.4 vs 1800k free with 2.5

i know Rik had plans of some ultra bloody embedded/lowmem
changes for such cases. i`d like to hear about things in the area :)

regards, Samium Gromoff, aka Serge Kosyrev


2002-10-18 11:29:16

by Andi Kleen

Subject: Re: 2.5 and lowmemory boxens

"Samium Gromoff" <[email protected]> writes:

> first: i`ve successfully ran 2.5.43 on a 386sx20/4M ram notebook.
>
> the one problem was the ppp over serial not working, but i suspect
> that it just needs to be recompiled with 2.5 headers (am i right?).
>
> the other was, well, the fact that ultra-stripped 2.5.43
> still used 200k more memory than 2.4.19, and thats despite it was
> compiled with -Os instead of -O2.
> actually it was 2000k free with 2.4 vs 1800k free with 2.5
>
> i know Rik had plans of some ultra bloody embedded/lowmem
> changes for such cases. i`d like to hear about things in the area :)

I would start with clamping down all the hash tables. A lot of code
does something like:

        mempages *= sizeof(struct list_head);
        for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
                ;

        do {
                unsigned long tmp;

                nr_hash = (1UL << order) * PAGE_SIZE /
                        sizeof(struct list_head);
                d_hash_mask = (nr_hash - 1);

                tmp = nr_hash;
                d_hash_shift = 0;
                while ((tmp >>= 1UL) != 0UL)
                        d_hash_shift++;

                dentry_hashtable = (struct list_head *)
                        __get_free_pages(GFP_ATOMIC, order);
        } while (dentry_hashtable == NULL && --order >= 0);

which usually results in too big hash tables.

Unfortunately this isn't a common function that can be tuned centrally
(it *really* should be). But you could grep for hash and change them
all to only use a single page or even less.
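
A minimal sketch of what such a central sizing helper might look like, written as plain userspace C so it can be compiled and tried on its own. The names clamp_hash_size and struct hash_size are hypothetical and do not exist in the kernel; unlike the loop above, this version rounds down so the table never exceeds the cap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: geometry of a power-of-two hash table. */
struct hash_size {
        unsigned long nr_buckets;       /* always a power of two */
        unsigned long mask;             /* nr_buckets - 1 */
        unsigned int shift;             /* log2(nr_buckets) */
};

/*
 * Hypothetical helper, not a kernel API: pick the largest power-of-two
 * bucket count whose table fits in min(wanted_bytes, max_bytes).
 */
static struct hash_size clamp_hash_size(unsigned long wanted_bytes,
                                        unsigned long max_bytes,
                                        size_t bucket_size)
{
        struct hash_size hs;
        unsigned long limit = wanted_bytes < max_bytes ? wanted_bytes : max_bytes;
        unsigned long max_entries;

        assert(bucket_size > 0 && limit >= bucket_size);
        max_entries = limit / bucket_size;

        hs.nr_buckets = 1;
        hs.shift = 0;
        /* Double while the doubled table would still fit. */
        while (hs.nr_buckets <= max_entries / 2) {
                hs.nr_buckets *= 2;
                hs.shift++;
        }
        hs.mask = hs.nr_buckets - 1;
        return hs;
}
```

Passing a single page as max_bytes (4096 on ia32) would give the "single page" clamp; a smaller cap gives "even less".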

More text size could be saved by going through the header files and
uninlining bigger functions.

When you have lots of daemons that hang around in select() or poll()
then it's possible to save 8K for each of them by applying a patch
that allocates data for small select on the stack.
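
The patch itself is not shown in this thread. As a rough userspace illustration of the general idea only (small requests use a fixed on-stack buffer, larger ones fall back to a heap allocation), assuming a hypothetical threshold STACK_BITMAP_BYTES and a hypothetical function count_bits_scratch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STACK_BITMAP_BYTES 256  /* hypothetical threshold, chosen arbitrarily */

/*
 * Copy a bitmap into scratch space and count its set bits.
 * Small bitmaps use the on-stack buffer; only large ones touch the heap.
 * Returns the bit count, or -1 if the heap allocation fails.
 * *used_heap reports which path was taken.
 */
static long count_bits_scratch(const unsigned char *src, size_t nbytes,
                               int *used_heap)
{
        unsigned char stack_buf[STACK_BITMAP_BYTES];
        unsigned char *buf = stack_buf;
        long count = 0;
        size_t i;

        *used_heap = 0;
        if (nbytes > sizeof(stack_buf)) {
                buf = malloc(nbytes);
                if (buf == NULL)
                        return -1;
                *used_heap = 1;
        }

        memcpy(buf, src, nbytes);
        for (i = 0; i < nbytes; i++) {
                unsigned char b = buf[i];

                /* Clear the lowest set bit until none remain. */
                for (; b != 0; b &= (unsigned char)(b - 1))
                        count++;
        }

        if (*used_heap)
                free(buf);
        return count;
}
```

A process that only ever waits on a handful of descriptors would always take the stack path and never hold a separate allocation while it sleeps.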

Also Linux by default doesn't use the area in 640k-1MB. If you know
the exact mappings there on your box or trust your e820 table then
you can change setup.c to use the free areas in there.

-Andi

2002-10-18 11:39:06

by jbradford

Subject: Re: 2.5 and lowmemory boxens

> first: i`ve successfully ran 2.5.43 on a 386sx20/4M ram notebook.

Cool, I thought my 486sx20/4M was a good achievement :-)

> the one problem was the ppp over serial not working, but i suspect
> that it just needs to be recompiled with 2.5 headers (am i right?).

I have found that 16450-based serial ports are unreliable under
2.5.x. Enabling interrupt un-masking didn't help, and I suspect that
it is just the generally more bloated kernel making the cache, (or in
the case of a 386, the pre-fetch unit :-) ), less efficient, and
causing data to be lost.

> the other was, well, the fact that ultra-stripped 2.5.43
> still used 200k more memory than 2.4.19, and thats despite it was
> compiled with -Os instead of -O2.
> actually it was 2000k free with 2.4 vs 1800k free with 2.5

Yes, I've noticed the same thing during my experiments with low-memory boxes.

> i know Rik had plans of some ultra bloody embedded/lowmem
> changes for such cases. i`d like to hear about things in the area :)

I am also very interested in it.

John.

2002-10-18 14:15:59

by Russell King

Subject: Re: 2.5 and lowmemory boxens

On Fri, Oct 18, 2002 at 12:54:15PM +0100, [email protected] wrote:
> > the one problem was the ppp over serial not working, but i suspect
> > that it just needs to be recompiled with 2.5 headers (am i right?).
>
> I have found that 16450-based serial ports are unreliable under
> 2.5.x. Enabling interrupt un-masking didn't help, and I suspect that
> it is just the generally more bloated kernel making the cache, (or in
> the case of a 386, the pre-fetch unit :-) ), less efficient, and
> causing data to be lost.

Well, finding the cause of this is going to be such a pain in the ass.
With the major IDE change after the serial code went in 2.5, there is
no one kernel I can say "could you try to see what effect that kernel
has" to narrow it down to whether it is really due to the new serial
or due to other changes elsewhere.

You seem to imply that you lose received characters when you get IDE
activity. It would be nice to find out how old serial + new IDE or
new serial + old IDE behave. (Such kernels do not exist.) Unfortunately,
neither is possible without lots of work, and there presently aren't
enough hours in the day to put together such kernels without co-operation
of other kernel developers.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-10-18 16:44:25

by Andrew Morton

Subject: Re: 2.5 and lowmemory boxens

Andi Kleen wrote:
>
> "Samium Gromoff" <[email protected]> writes:
>
> > first: i`ve successfully ran 2.5.43 on a 386sx20/4M ram notebook.
> >
> > the one problem was the ppp over serial not working, but i suspect
> > that it just needs to be recompiled with 2.5 headers (am i right?).
> >
> > the other was, well, the fact that ultra-stripped 2.5.43
> > still used 200k more memory than 2.4.19, and thats despite it was
> > compiled with -Os instead of -O2.
> > actually it was 2000k free with 2.4 vs 1800k free with 2.5
> >
> > i know Rik had plans of some ultra bloody embedded/lowmem
> > changes for such cases. i`d like to hear about things in the area :)
>
> I would start with clamping down all the hash tables.

Well here's some low-hanging fruit:

mnm:/usr/src/25> size kernel/pid.o
   text    data     bss     dec     hex filename
   1677    1088  131104  133869   20aed kernel/pid.o

(I have a trollpatch to fix this)

And the radix_tree_node mempool is 140k; I plan to do away with
that altogether.

timer.c and sched.c have significant NR_CPUS bloat problems on SMP.
Working on that.

2002-10-18 21:40:30

by Daniel Phillips

Subject: Re: 2.5 and lowmemory boxens

On Friday 18 October 2002 18:50, Andrew Morton wrote:
> > "Samium Gromoff" <[email protected]> writes:
> > > first: i`ve successfully ran 2.5.43 on a 386sx20/4M ram notebook.
>
> ...
>
> timer.c and sched.c have significant NR_CPUS bloat problems on SMP.
> Working on that.

Oooh, yes! So 2.6 will be just fine for my smp dsl router...

Seriously, we are getting closer to the day notebooks start shipping with
multi-core processors, and it's not beyond belief that a dsl router would
benefit from this as well. I.e., super-high processing power, but hardly
any memory/flash required. Xmeta, listening? What better geek trophy than
a 4-way notebook.

--
Daniel

2002-10-18 22:31:10

by Andrew Morton

Subject: Re: 2.5 and lowmemory boxens

Daniel Phillips wrote:
>
> On Friday 18 October 2002 18:50, Andrew Morton wrote:
> > > "Samium Gromoff" <[email protected]> writes:
> > > > first: i`ve successfully ran 2.5.43 on a 386sx20/4M ram notebook.
> >
> > ...
> >
> > timer.c and sched.c have significant NR_CPUS bloat problems on SMP.
> > Working on that.
>
> Oooh, yes! So 2.6 will be just fine for my smp dsl router...
>

Reducing NR_CPUS from 32 to 2 shrinks the ia32 kernel by 380 kilobytes.

Figure half a meg or more on 64-bit machines.

Not a huge amount. But not zero either.
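
To put the ia32 figure in per-CPU terms, a small arithmetic sketch follows. It assumes the saving scales linearly with NR_CPUS and that a kilobyte here means 1024 bytes, neither of which is stated above; per_cpu_cost is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Average static footprint per configured CPU slot, derived from the
 * bytes saved when NR_CPUS is reduced from from_cpus to to_cpus.
 * Assumes the footprint grows linearly with NR_CPUS.
 */
static unsigned long per_cpu_cost(unsigned long saved_bytes,
                                  unsigned int from_cpus,
                                  unsigned int to_cpus)
{
        assert(from_cpus > to_cpus);
        return saved_bytes / (from_cpus - to_cpus);
}
```

Calling per_cpu_cost(380UL * 1024, 32, 2) works out to a little under 13 kilobytes of static data per CPU slot, which is the amount a uniprocessor box pays for every slot it cannot use.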