2002-02-22 22:41:11

by Paul Larson

Subject: [PATCH] 2.4.18-rc2 Fix for get_pid hang

Marcelo,

This was made against 2.4.18-rc2 but applies cleanly against
2.4.18-rc4. This is a fix for a problem where if we run out of
available pids, get_pid will hang the system while it searches through
the tasks for an available pid forever.

Thanks,
Paul Larson

Attachments:
getpid.patch (1.55 kB)
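
For readers without the attachment, the hang described above can be illustrated with a small userspace sketch. This is not the attached patch and not kernel code: `struct task`, `id_in_use` and `alloc_pid_bounded` are hypothetical names, and the idea that each task pins its pid, pgrp, tgid and session ids is an assumption drawn from the replies in this thread. The point is only that a search which counts the candidates it has tried can report failure instead of looping forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task: each one pins up to four ids. */
struct task {
    int pid, pgrp, tgid, session;
};

/* Linear scan over all tasks: is `id` pinned by any of them? */
static bool id_in_use(const struct task *tasks, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].pid == id || tasks[i].pgrp == id ||
            tasks[i].tgid == id || tasks[i].session == id)
            return true;
    return false;
}

/*
 * Search the range [pid_min, pid_max) for a free id, starting just
 * after last_pid and wrapping around once. Each candidate is tried at
 * most once, so the loop always terminates. Returns -1 when every id
 * in the range is taken.
 */
static int alloc_pid_bounded(const struct task *tasks, size_t n,
                             int last_pid, int pid_min, int pid_max)
{
    assert(0 <= pid_min && pid_min < pid_max);

    int candidate = last_pid;
    if (candidate < pid_min - 1 || candidate >= pid_max)
        candidate = pid_min - 1;

    for (int tried = 0; tried < pid_max - pid_min; tried++) {
        if (++candidate >= pid_max)
            candidate = pid_min;        /* wrap around */
        if (!id_in_use(tasks, n, candidate))
            return candidate;
    }
    return -1;                          /* id space exhausted */
}
```

A caller would treat -1 as "no pid available" and fail the fork instead of retrying. The per-candidate scan over all tasks is still expensive; the sketch only bounds the number of candidates tried.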

2002-02-23 00:15:42

by Alan Cox

Subject: Re: [PATCH] 2.4.18-rc2 Fix for get_pid hang

> This was made against 2.4.18-rc2 but applies cleanly against
> 2.4.18-rc4. This is a fix for a problem where if we run out of
> available pids, get_pid will hang the system while it searches through
> the tasks for an available pid forever.

Wouldn't it be a much cleaner patch to limit the maximum number of processes
to less than the number of pids available. You seem to be fixing a non
problem by adding branches to the innards of a loop.


2002-02-23 00:27:04

by Rik van Riel

Subject: Re: [PATCH] 2.4.18-rc2 Fix for get_pid hang

On Sat, 23 Feb 2002, Alan Cox wrote:

> > This was made against 2.4.18-rc2 but applies cleanly against
> > 2.4.18-rc4. This is a fix for a problem where if we run out of
> > available pids, get_pid will hang the system while it searches through
> > the tasks for an available pid forever.
>
> Wouldn't it be a much cleaner patch to limit the maximum number of
> processes to less than the number of pids available. You seem to be
> fixing a non problem by adding branches to the innards of a loop.

The problem here is that thread and process groups share
the pid namespace, so the number of processes at which
you run out of pids varies between 10700 and 32000 ...

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/
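
The range quoted above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a 15-bit id space (32768 ids), 300 low ids that are skipped after wraparound, and a worst case in which every task pins three distinct ids (its own pid plus a process group id and a session id that no other task shares). These numbers and the function name `max_tasks` are assumptions for illustration, not figures taken from the patch.

```c
#include <assert.h>

/*
 * Rough upper bound on the number of tasks that fit before the id
 * space is exhausted, if every task pins `ids_per_task` distinct ids.
 */
static int max_tasks(int pid_max, int reserved, int ids_per_task)
{
    assert(ids_per_task >= 1 && reserved >= 0 && reserved < pid_max);
    return (pid_max - reserved) / ids_per_task;
}
```

Under those assumptions, `max_tasks(32768, 300, 1)` gives 32468 and `max_tasks(32768, 300, 3)` gives 10822, which is roughly in line with the 10700 to 32000 range mentioned above. It also suggests why a single fixed cap on the process count is hard to pick.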

2002-02-23 00:42:36

by William Lee Irwin III

Subject: Re: [PATCH] 2.4.18-rc2 Fix for get_pid hang

At some point in the past, Paul Larson wrote:
>> This was made against 2.4.18-rc2 but applies cleanly against
>> 2.4.18-rc4. This is a fix for a problem where if we run out of
>> available pids, get_pid will hang the system while it searches
>> through the tasks for an available pid forever.

On Sat, Feb 23, 2002 at 12:29:47AM +0000, Alan Cox wrote:
> Wouldn't it be a much cleaner patch to limit the maximum number of
> processes to less than the number of pids available. You seem to be
> fixing a non problem by adding branches to the innards of a loop.

I've seen this one before. It seems to kick in at 11K processes, where
one would normally expect it much higher... so I'm not sure a constant
upper bound on that counter suffices. Maybe clashes of pid's with pgrp's
and sessions and tgrps are what does that, maybe it's something else.

and of course:

#include <stdgeek.h> /* Any hope of a non-O(tasks) solution? */


Cheers,
Bill
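
One possible direction for the non-O(tasks) question is to track used ids in a bitmap, so that allocation cost depends on the size of the id space and not on the number of tasks. The following is only a userspace sketch under stated assumptions: `struct id_map`, `id_map_init`, `id_map_alloc`, `id_map_free` and `ID_MAX` are hypothetical names, nobody in this thread proposed this code, and a real version would also need a per-id reference count, since a process group or session id can outlive the task whose pid it was.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define ID_MAX    32768   /* assumed size of the id space */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

struct id_map {
    unsigned long bits[ID_MAX / WORD_BITS]; /* 1 bit per id, set = in use */
    int nr_free;                            /* ids still available */
    int last;                               /* most recently allocated id */
};

static void id_map_init(struct id_map *m)
{
    memset(m, 0, sizeof(*m));
    m->nr_free = ID_MAX;
    m->last = -1;
}

/* Return the next free id after m->last, or -1 if none is left. */
static int id_map_alloc(struct id_map *m)
{
    if (m->nr_free == 0)
        return -1;              /* exhausted: fail at once, no scan */

    int id = m->last;
    do {                        /* terminates because nr_free > 0 */
        id = (id + 1) % ID_MAX;
    } while (m->bits[id / WORD_BITS] & (1UL << (id % WORD_BITS)));

    m->bits[id / WORD_BITS] |= 1UL << (id % WORD_BITS);
    m->nr_free--;
    m->last = id;
    return id;
}

/* Release an id that was previously returned by id_map_alloc(). */
static void id_map_free(struct id_map *m, int id)
{
    assert(id >= 0 && id < ID_MAX);
    assert(m->bits[id / WORD_BITS] & (1UL << (id % WORD_BITS)));
    m->bits[id / WORD_BITS] &= ~(1UL << (id % WORD_BITS));
    m->nr_free++;
}
```

The free counter makes the exhausted case an immediate failure, and the scan touches at most `ID_MAX` bits no matter how many tasks exist. Skipping whole words that are fully set would speed up the scan further; that is left out to keep the sketch short.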

2002-02-25 13:53:45

by Paul Larson

Subject: Re: [PATCH] 2.4.18-rc2 Fix for get_pid hang

On Fri, 2002-02-22 at 18:29, Alan Cox wrote:
> > This was made against 2.4.18-rc2 but applies cleanly against
> > 2.4.18-rc4. This is a fix for a problem where if we run out of
> > available pids, get_pid will hang the system while it searches through
> > the tasks for an available pid forever.
>
> Wouldn't it be a much cleaner patch to limit the maximum number of processes
> to less than the number of pids available. You seem to be fixing a non
> problem by adding branches to the innards of a loop.
>
That was my original thought, but as Rik and WLI already pointed out, it
won't account for process groups, tgids, etc. This isn't a purely
theoretical problem either, as we have run up against it many times.

Thanks,
Paul Larson

2002-02-25 14:06:37

by Alan Cox

Subject: Re: [PATCH] 2.4.18-rc2 Fix for get_pid hang

> > to less than the number of pids available. You seem to be fixing a non
> > problem by adding branches to the innards of a loop.
> >
> That was my original thought, but as Rik and WLI already pointed out, it
> won't account for process groups, tgids, etc. This isn't a purely
> theoretical problem either, as we have run up against it many times.

Agreed - and I don't have any better solutions except 32bit pid_t - which
I suspect one day we are going to need. At least the O(1) scheduler now
makes it feasible 8)