Since Linus opened the door to this the other day, I would like to
suggest it "officially":
Since glibc already uses a 64-bit dev_t on, as far as I know, all
Linux platforms, which means that userspace is already taking the
performance hit, *and* since, in case it isn't murderously obvious by
now, changing the kernel/userspace interface for this is painful as
hell, I would like to suggest that:
a) We use a 32+32 bit split for dev_t. Major zero, minor < 65536
would be reserved for compatibility with the old 16-bit dev_t; it
still leaves the zero value as the "no device" entry. We could still
use major 0, minor >= 65536 as anonymous devices, or we could
switch to using major 255, which has been reserved for expansion for
the past eight years.
b) In order to support NFSv2 and other filesystems which only support
a 32-bit dev_t, I suggest we stay within a (12,20)-bit range for as
long as that is practical. Note, however, that this only affects
using those filesystems for /dev, and I personally think it's not
too huge of a loss to say "well, if you use NFS for root, either
use NFSv3 or make /dev a tmpfs and extract a tarball from your
initrd."
All cases where we have to deal with a 32-bit dev_t on the wire or
on disk should use a 12+20 split.
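For illustration, the two splits above could be sketched in C roughly
as follows. This is a minimal sketch, not kernel or glibc code: the
names dev64_t, mkdev64(), major64(), minor64(), from_old16(),
is_old16(), pack_12_20() and unpack_12_20() are all hypothetical, and
putting the major in the high 32 bits is an assumption (it is what
makes "major zero, minor < 65536" coincide numerically with the old
16-bit values).

```c
#include <stdint.h>

/* Hypothetical 64-bit device number: major in the high 32 bits (assumed). */
typedef uint64_t dev64_t;

static inline dev64_t mkdev64(uint32_t major, uint32_t minor)
{
    return ((dev64_t)major << 32) | minor;
}

static inline uint32_t major64(dev64_t dev) { return (uint32_t)(dev >> 32); }
static inline uint32_t minor64(dev64_t dev) { return (uint32_t)dev; }

/* An old 16-bit dev_t is zero-extended, so it lands in major 0, minor < 65536. */
static inline dev64_t from_old16(uint16_t old)
{
    return mkdev64(0, old);
}

static inline int is_old16(dev64_t dev)
{
    return major64(dev) == 0 && minor64(dev) < 65536;
}

/*
 * Pack into a 32-bit wire/disk value using a 12+20 split.
 * Returns 0 on success, -1 if the major or minor does not fit.
 */
static inline int pack_12_20(dev64_t dev, uint32_t *out)
{
    uint32_t ma = major64(dev);
    uint32_t mi = minor64(dev);

    if (ma >= (1u << 12) || mi >= (1u << 20))
        return -1;
    *out = (ma << 20) | mi;
    return 0;
}

/* Expand a 12+20 wire/disk value back into the 32+32 form. */
static inline dev64_t unpack_12_20(uint32_t wire)
{
    return mkdev64(wire >> 20, wire & 0xfffffu);
}
```

Anything that stays within 12 bits of major and 20 bits of minor
round-trips through pack_12_20()/unpack_12_20(); anything larger is
rejected rather than silently truncated, which is one possible policy
and not something the proposal itself specifies.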
How does this sound?
-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64
On Thu, Mar 20, 2003 at 01:42:41PM -0800, H. Peter Anvin wrote:
> b) In order to support NFSv2 and other filesystems which only support
> a 32-bit dev_t, I suggest we stay within a (12,20)-bit range for as
Hmm, I guess that means dropping ext2/3 for / ;-(
Joel
--
"Nothing is wrong with California that a rise in the ocean level
wouldn't cure."
- Ross MacDonald
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: [email protected]
Phone: (650) 506-8127
Joel Becker wrote:
> On Thu, Mar 20, 2003 at 01:42:41PM -0800, H. Peter Anvin wrote:
>
>>b) In order to support NFSv2 and other filesystems which only support
>> a 32-bit dev_t, I suggest we stay within a (12,20)-bit range for as
>
>
> Hmm, I guess that means dropping ext2/3 for / ;-(
>
Last I checked, all traditional (inode-based) Unix filesystems,
including ext2/3, used block pointers for dev_t. There are plenty of
block pointers; 60 bytes worth.
-hpa
On Thu, Mar 20, 2003 at 01:42:41PM -0800, H. Peter Anvin wrote:
>
> a) We use a 32+32 bit split for dev_t. Major zero, minor < 65536
> would be reserved for compatibility with the old 16-bit dev_t; it
> still leaves the zero value as the "no device" entry. We could still
> use major 0, minor >= 65536 as anonymous devices, or we could
> switch to using major 255, which has been reserved for expansion for
> the past eight years.
Well, it seems that this is the most reasonable split, able to handle
everyone for a long time. I can live with it, if only to keep people
from Oracle quiet :)
thanks,
greg k-h
On Thu, Mar 20, 2003 at 03:06:31PM -0800, H. Peter Anvin wrote:
> Last I checked, all traditional (inode-based) Unix filesystems,
> including ext2/3 used block pointers for dev_t. There are plenty of
> block pointers; 60 bytes worth.
They do indeed. But ext2/3 touches that block pointer with
cpu_to_le32() and friends. At best it needs fixing, plus compatibility
work for already existing partitions.
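To illustrate the concern, here is a userspace sketch; it is not the
ext2/3 source. put_le32()/get_le32() are stand-ins for what
cpu_to_le32()/le32_to_cpu() achieve on disk, and struct
disk_inode_sketch, store_dev32() and load_dev32() are hypothetical
names. The point is only that a single little-endian 32-bit slot
cannot carry a 64-bit dev_t.

```c
#include <stdint.h>

#define SKETCH_N_BLOCKS 15   /* 15 slots * 4 bytes = the 60 bytes mentioned */

/* Hypothetical stand-in for the block pointer area of an on-disk inode. */
struct disk_inode_sketch {
    uint8_t i_block[SKETCH_N_BLOCKS][4];   /* each slot is little-endian */
};

/* Write a 32-bit value in little-endian byte order, whatever the host is. */
static void put_le32(uint8_t p[4], uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read a little-endian 32-bit value back into host order. */
static uint32_t get_le32(const uint8_t p[4])
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Device number kept in slot 0 only: the upper 32 bits are dropped. */
static void store_dev32(struct disk_inode_sketch *di, uint64_t dev)
{
    put_le32(di->i_block[0], (uint32_t)dev);
}

static uint64_t load_dev32(const struct disk_inode_sketch *di)
{
    return get_le32(di->i_block[0]);
}
```

With this layout a device number whose major lives above bit 31 comes
back with the major lost, so both the code and any existing
partitions would need a defined way to tell old and new encodings
apart.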
Joel
--
"There is shadow under this red rock.
(Come in under the shadow of this red rock)
And I will show you something different from either
Your shadow at morning striding behind you
Or your shadow at evening rising to meet you.
I will show you fear in a handful of dust."
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: [email protected]
Phone: (650) 506-8127
Joel Becker wrote:
> On Thu, Mar 20, 2003 at 03:06:31PM -0800, H. Peter Anvin wrote:
>
>>Last I checked, all traditional (inode-based) Unix filesystems,
>>including ext2/3 used block pointers for dev_t. There are plenty of
>>block pointers; 60 bytes worth.
>
> They do indeed. But ext2/3 touches that block pointer with
> cpu_to_le32() and friends. It needs fixing at best, and compatibility
> work for already existing partitions.
>
A few options:
a) Use an inode flag indicating a large dev_t. This is probably the
best option.
b) Use a sentinel value, e.g. 0xffffffff, to indicate that the major and
minor are in block pointers 1 and 2.
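Option (b) could look something like the sketch below. It is only an
illustration under stated assumptions: struct inode_sketch, struct
devnum, store_dev_sentinel() and load_dev_sentinel() are hypothetical
names, byte-order conversion is left out for brevity, and treating
any non-sentinel value in slot 0 as "major 0, minor = that value" is
an assumption about how old inodes would be read. Option (a) would
differ mainly in testing an inode flag bit instead of comparing
slot 0 against the sentinel.

```c
#include <stdint.h>

#define LARGE_DEV_SENTINEL 0xffffffffu   /* example value from option (b) */

/* Hypothetical in-memory view of the 15 block pointer slots. */
struct inode_sketch {
    uint32_t block[15];
};

struct devnum {
    uint32_t major;
    uint32_t minor;
};

/*
 * Store a device number. Values in the old 16-bit range keep the old
 * single-slot encoding; everything else uses the sentinel in slot 0
 * with major and minor in slots 1 and 2.
 */
static void store_dev_sentinel(struct inode_sketch *ino, struct devnum d)
{
    if (d.major == 0 && d.minor < 65536) {
        ino->block[0] = d.minor;
    } else {
        ino->block[0] = LARGE_DEV_SENTINEL;
        ino->block[1] = d.major;
        ino->block[2] = d.minor;
    }
}

/* Load a device number, accepting both the old and the sentinel form. */
static struct devnum load_dev_sentinel(const struct inode_sketch *ino)
{
    struct devnum d;

    if (ino->block[0] == LARGE_DEV_SENTINEL) {
        d.major = ino->block[1];
        d.minor = ino->block[2];
    } else {
        d.major = 0;
        d.minor = ino->block[0];
    }
    return d;
}
```

The appeal of the sentinel is that existing inodes need no
conversion; the cost is that 0xffffffff can no longer be a valid
single-slot device number.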
-hpa