2004-06-08 08:01:59

by Zoltan Menyhart

Subject: Re: Who owns those locks ?

Bjorn Helgaas wrote:
>
> There are a couple issues I was thinking of when
> I wrote "clean it up, pull the bits together...":
>
> 1) Tony Luck's question about what happens when
> "shr.u r30 = r13, 12" yields zero in the 32-bit
> lock value. I'm not the 2.6 maintainer, but I'd
> sure like to see some solution for this. It would
> be a nightmare to debug a system where one random
> task didn't release locks correctly. Since other
> arches use a trick like this, I'm hoping they've
> figured out something we can copy (I haven't looked).

Sure, I did not want to make an error like saying:
"640 K ought to be enough for everyone".

I'm afraid there is no perfect solution.

- We do not want to change the lock size to 64 bits, do we?
-- It would bring a couple of new alignment problems.

- You keep my code, it is correct for a memory size up to 16 Tbytes.

- You shift by PAGE_SHIFT rather than by 12 (with a page size of
16 Kbytes, PAGE_SHIFT = 14) => up to 64 Tbytes.
-- The lock values are less human readable.

- If you move to PAGE_SIZE = 64 K, you get human readable lock values
up to 256 Tbytes.

- You could store "PID | miraculous bit" (to avoid the PID = 0
problem).
-- Somewhat longer code.
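
The limits quoted in the list above follow from keeping only 32 bits of
the shifted task address. A minimal C sketch of that arithmetic, not the
actual IA64 lock code: the helper names lock_owner_tag() and
max_covered_bytes() are hypothetical, and the sketch assumes the lock
word simply holds the low 32 bits of (task address >> shift).

```c
#include <stdint.h>

/*
 * Hypothetical helper: the owner tag as it would land in a 32-bit lock
 * word, i.e. the low 32 bits of the task address shifted right.
 * shift = 12 mirrors "shr.u r30 = r13, 12" from the thread.
 */
static inline uint32_t lock_owner_tag(uint64_t task_addr, unsigned shift)
{
    return (uint32_t)(task_addr >> shift);
}

/*
 * Size of the address range the tag can cover before it wraps:
 * 32 tag bits plus the bits shifted out, so 2^(32 + shift) bytes.
 *   shift = 12 -> 2^44 = 16 Tbytes
 *   shift = 14 -> 2^46 = 64 Tbytes  (16 Kbyte pages)
 *   shift = 16 -> 2^48 = 256 Tbytes (64 Kbyte pages)
 */
static inline uint64_t max_covered_bytes(unsigned shift)
{
    return 1ULL << (32 + shift);
}
```

With a shift of 12, a task structure sitting exactly on a 16 Tbyte
boundary produces a tag of zero, which is indistinguishable from an
unlocked lock.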

I expect the mainstream IA64 kernel to move to PAGE_SIZE = 64 K by
the time there are machines with more than 16 Tbytes of memory
(the processors have only a very limited number of translation
lookaside buffer entries, and the ever growing application and
memory sizes result in a higher TLB miss rate unless the page size
increases).


Regards,

Zoltán


2004-06-08 15:06:06

by Bjorn Helgaas

Subject: Re: Who owns those locks ?

On Tuesday 08 June 2004 2:03 am, Zoltan Menyhart wrote:
> - You keep my code, it is correct for a memory size up to 16 Tbytes.

Many if not most large machines have sparse address spaces,
so you may have memory at an address that will cause a
problem even if the actual amount of memory is much smaller.

The main point is that I wouldn't want a time bomb that
will silently fail when somebody happens to boot on such
a machine. Whether that's avoided by a "miraculous" bit,
throwing away problem pages at boot-time, avoiding task
allocation at specific addresses, etc., is secondary.
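
Two of the options mentioned above can be sketched in C. This is only
an illustration under stated assumptions, not code from any kernel
tree: LOCK_HELD_BIT, pid_owner_tag(), owner_pid() and
task_addr_gives_zero_tag() are hypothetical names, and the sketch
assumes PIDs never use bit 31, so that bit is free to serve as the
"miraculous" bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 31 of the lock word is reserved as the marker bit,
 * so the value stored by a lock holder can never be zero. */
#define LOCK_HELD_BIT 0x80000000u

/* Hypothetical: tag a lock with the owner's PID plus the marker bit. */
static inline uint32_t pid_owner_tag(uint32_t pid)
{
    return pid | LOCK_HELD_BIT;
}

/* Hypothetical: recover the PID from a tagged lock word. */
static inline uint32_t owner_pid(uint32_t tag)
{
    return tag & ~LOCK_HELD_BIT;
}

/*
 * Hypothetical check for the address-based scheme: would this task
 * address, shifted and truncated to 32 bits, look like "unlocked"?
 * A boot-time or allocation-time filter could reject such addresses.
 */
static inline bool task_addr_gives_zero_tag(uint64_t task_addr,
                                            unsigned shift)
{
    return (uint32_t)(task_addr >> shift) == 0;
}
```

Either approach removes the silent failure: the marker bit makes a zero
tag impossible, while the address check keeps the existing tag format
and excludes the problem addresses up front.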