It was recently pointed out to me that the default value for
max_threads (i.e. the maximum number of tasks per system) doesn't work
right on machines with lots of memory.
A quick examination of fork_init() shows that max_threads is supposed
to be limited so that its stack/task_struct allocations take no more
than half of physical memory. This calculation ignores the fact that
task_structs must be allocated from the normal pool and not the
highmem pool, which is a clear bug. On a machine with enough physical
memory it's possible for all of normal memory to be allocated to
task_structs, which tends to make the machine die.
fork_init() gets its knowledge of physical memory passed in from
start_kernel(), which sets it from num_physpages. This parameter is
also passed to several other init functions.
My question boils down to this: should we change start_kernel() to
exclude high memory from the physical memory size it passes to all the
init functions, or should we do it only for fork_init()? And what is
the best way to calculate this number? I don't see any simple way in
architecture-independent code to get the size of high memory versus
normal memory.
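For concreteness, the sort of change I have in mind in start_kernel()
is sketched below (hypothetical, not a patch; it assumes a counter
like totalhigh_pages, which is only maintained when CONFIG_HIGHMEM is
set and is zero otherwise, may be used from architecture-independent
code, and that is exactly the part I'm not sure about):

	/* Hypothetical sketch: subtract highmem before handing the
	 * page count to the init functions.  Assumes totalhigh_pages
	 * is visible here (it is zero without CONFIG_HIGHMEM). */
	unsigned long normal_pages = num_physpages - totalhigh_pages;

	fork_init(normal_pages);
	vfs_caches_init(normal_pages);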
What's the best approach here?
Thanks,
Dave McCracken
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
[email protected] T/L 678-3059
In article <72940000.1003868385@baldur>,
Dave McCracken <[email protected]> writes:
> What's the best approach here?
I would just limit it to a reasonable maximum value, e.g. 10000. If
someone needs more than 10000 threads/processes, he/she can set the
sysctls manually. The current scheduler would choke anyway if even a
small fraction of those 10000 threads were runnable.
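In fork_init() that could be as simple as this (a sketch only;
MAX_THREADS_DEFAULT is a made-up name for the cap, and the limit stays
runtime-tunable through the threads-max sysctl,
/proc/sys/kernel/threads-max):

	/* Sketch: clamp the computed default to a fixed sane value.
	 * MAX_THREADS_DEFAULT is a made-up name, not an existing
	 * macro. */
	#define MAX_THREADS_DEFAULT 10000

	if (max_threads > MAX_THREADS_DEFAULT)
		max_threads = MAX_THREADS_DEFAULT;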
-Andi
--On Tuesday, October 23, 2001 22:36:51 +0200 Andi Kleen <[email protected]> wrote:
> I would just limit it to a reasonable maximum value, e.g. 10000. If
> someone needs more than 10000 threads/processes, he/she can set the
> sysctls manually. The current scheduler would choke anyway if even a
> small fraction of those 10000 threads were runnable.
Yes, that would solve the max_threads problem. It should be fairly
simple to pick a reasonable number.

But my question is also about the other subsystems called from
start_kernel() that take the memory size as an argument:
vfs_caches_init() (which in turn calls dcache_init()), buffer_init(),
and page_cache_init(). I haven't dug down to the bottom of all these
functions, but I'm guessing they really want to base their
calculations on available normal memory, not high memory.
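For illustration, my understanding of the pattern in those functions
is roughly the following (a schematic sketch from memory, not the
actual source):

	/* Schematic only: size a hash table in proportion to the page
	 * count passed down from start_kernel(), then allocate it
	 * from the normal zone. */
	static struct page **hash_table;

	static void __init example_hash_init(unsigned long mempages)
	{
		unsigned int order;

		/* Roughly one pointer-sized bucket per page; on i386
		 * that is mempages * 4 bytes, i.e. mempages >> 10
		 * pages of table. */
		for (order = 0; (1UL << order) < (mempages >> 10); order++)
			;

		/* __get_free_pages() can only hand back normal-zone
		 * memory, so if mempages includes highmem the table
		 * is oversized relative to the zone it must come
		 * from. */
		hash_table = (struct page **)
			__get_free_pages(GFP_ATOMIC, order);
	}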
Dave McCracken
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
[email protected] T/L 678-3059
On Tue, 23 Oct 2001, Dave McCracken wrote:
> A quick examination of fork_init() shows that max_threads is supposed
> to be limited so that its stack/task_struct allocations take no more
> than half of physical memory. This calculation ignores the fact that
> task_structs must be allocated from the normal pool and not the
> highmem pool, which is a clear bug.
It also ignores the fact that tasks need things like page
tables, VMAs, etc... The total kernel memory demand of
the maximum number of tasks the kernel allows by default
is way higher than physical memory.
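Back-of-the-envelope, with illustrative i386 numbers (assumed, not
measured):

	task_struct + kernel stack        2 pages  =  8 KB
	page directory                    1 page   =  4 KB
	page tables (say 8 mapped
	4 MB regions)                     8 pages  = 32 KB
	VMAs, files, signals, etc.       ~1 page   =  4 KB
	                                 ------------------
	per task                                  ~ 48 KB

Tens of thousands of tasks at ~48 KB each is well over a gigabyte,
all of it pinned kernel memory.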
I submitted a patch a while ago to set the number way lower,
which was accepted by Alan and in the -ac kernels. A few months
later Linus followed and changed the limit in his kernels, too.
regards,
Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)
http://www.surriel.com/ http://distro.conectiva.com/
--On Tuesday, October 23, 2001 18:52:35 -0200 Rik van Riel
<[email protected]> wrote:
> I submitted a patch a while ago to set the number way lower,
> which was accepted by Alan and in the -ac kernels. A few months
> later Linus followed and changed the limit in his kernels, too.
Ok, that's what I get for reading the comment and not deciphering the
code... The actual calculation is mempages / (THREAD_SIZE/PAGE_SIZE) / 8,
where THREAD_SIZE is 2 pages on i386. If I read it right, this limits
the task structures to 1/8 of physical memory instead of half.
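To make that concrete, here's a paraphrased sketch of the limit (not
the verbatim source), with a worked i386 example:

	void __init fork_init(unsigned long mempages)
	{
		/*
		 * Each task costs THREAD_SIZE bytes (task_struct plus
		 * kernel stack; two pages on i386), so this caps the
		 * aggregate at mempages / 8 pages, i.e. 1/8 of memory.
		 *
		 * Worked example, i386 with 4 GB (mempages = 1048576):
		 *   max_threads = 1048576 / 2 / 8 = 65536 tasks
		 *   65536 tasks * 8 KB each = 512 MB if fully populated,
		 * all of which must come from the ~896 MB normal zone.
		 */
		max_threads = mempages / (THREAD_SIZE / PAGE_SIZE) / 8;
	}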
But there's still a problem. The value for mempages is all of physical
memory including highmem, so a machine with a sufficient amount of high
memory can set max_threads to a value way too high, given that most if not
all of the resources it's trying to limit have to come from normal memory
and not high memory.
Dave McCracken
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
[email protected] T/L 678-3059
On Tue, 23 Oct 2001, Dave McCracken wrote:
> --On Tuesday, October 23, 2001 18:52:35 -0200 Rik van Riel
> <[email protected]> wrote:
>
> > I submitted a patch a while ago to set the number way lower,
> > which was accepted by Alan and in the -ac kernels. A few months
> > later Linus followed and changed the limit in his kernels, too.
>
> Ok, that's what I get for reading the comment and not deciphering the
> code...
*sigh* So my updated comment got backed out again ;/
Linus, what do you have against correct documentation? ;)
> But there's still a problem. The value for mempages is all of physical
> memory including highmem, so a machine with a sufficient amount of high
> memory can set max_threads to a value way too high, given that most if not
> all of the resources it's trying to limit have to come from normal memory
> and not high memory.
Indeed, this needs to be fixed. A sane upper limit for
max_threads would be 10000; this also keeps in mind the
fact that we only have 32000 possible PIDs, some of which
could be taken by task groups, etc...
regards,
Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)
http://www.surriel.com/ http://distro.conectiva.com/
Rik wrote:
> ... A sane upper limit for
> max_threads would be 10000; this also keeps in mind the
> fact that we only have 32000 possible PIDs, some of which
> could be taken by task groups, etc...
? I thought the 2.4 kernel had switched to 32-bit pids long ago.
Where does the limit of 32000 possible PIDs come from?
- Dan
On Tue, 23 Oct 2001, Dan Kegel wrote:
> Rik wrote:
> > ... A sane upper limit for
> > max_threads would be 10000; this also keeps in mind the
> > fact that we only have 32000 possible PIDs, some of which
> > could be taken by task groups, etc...
>
> ? I thought the 2.4 kernel had switched to 32-bit pids long ago.
> Where does the limit of 32000 possible PIDs come from?
Please take a look at kernel/fork.c::get_pid() ...
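Paraphrased, it does something like this (a simplified sketch;
locking and the next_safe optimisation are left out):

	#define PID_MAX 0x8000	/* 32768, from include/linux/threads.h */

	static int last_pid;

	static int get_pid(unsigned long flags)
	{
		struct task_struct *p;

		if (flags & CLONE_PID)
			return current->pid;

	repeat:
		if (++last_pid >= PID_MAX)
			last_pid = 300;	/* wrap, skipping low pids
					 * reserved for init and the
					 * daemons */

		/* Don't reuse a value that is still live as a pid,
		 * process group, or session id. */
		for_each_task(p) {
			if (p->pid == last_pid || p->pgrp == last_pid ||
			    p->session == last_pid)
				goto repeat;
		}
		return last_pid;
	}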
Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)
http://www.surriel.com/ http://distro.conectiva.com/
Rik van Riel wrote:
>
> On Tue, 23 Oct 2001, Dan Kegel wrote:
> > Rik wrote:
> > > ... A sane upper limit for
> > > max_threads would be 10000; this also keeps in mind the
> > > fact that we only have 32000 possible PIDs, some of which
> > > could be taken by task groups, etc...
> >
> > ? I thought the 2.4 kernel had switched to 32-bit pids long ago.
> > Where does the limit of 32000 possible PIDs come from?
>
> Please take a look at kernel/fork.c::get_pid() ...
Yes, I see the limit is enforced there, but why do we need that limit?
There are probably a bunch of user-space programs that assume a pid
fits in five digits. Is that the main reason?
- Dan