LinuxLists.cc - Re: silent semantic changes with reiser4

2004-09-10 02:09:00

Subject: Re: silent semantic changes with reiser4

Jamie Lokier wrote:
[snip]
> When a simple "cd" into .tar.gz or .iso is implemented properly, it
> will have _no_ performance penalty after you have first looked in the
> file, so long as it remains in the on-disk cache. And, the filesystem
> will manage that cache intelligently.
>
> Imagine: for looking at source files and such, you probably won't
> bother untarring in future, and you won't bother keeping untarred
> source trees in your home directory for easy access to things you look
> at often. Why waste the space? You could install whole applications
> as a .tar and run them from within it, with no performance penalty.
>
> Similarly, the filesystem will be able to archive directories
> automatically that haven't been touched in a long time, with no
> visible change except increased storage space. "grep" will be a bit
> slower, but you'll have a useful search tool by then (using coherent
> indexes) which will be more useful than grep, and much faster.
>

[Again, pardon me for being 5000 messages behind.]

Anyhow, I recall reading an article about 'unified name spaces', and I'm
not referring to what's on namesys.com. It mentioned how putting device
nodes into the file system is a very powerful innovation of UNIX.
Having files and devices in the same name space simplifies tools and
makes the environment much more powerful, because you can connect data
sources and targets together more arbitrarily.

If I recall correctly, the article mentioned something about Plan9
taking things to a greater extreme than UNIX.

Going along with what you said, Jamie, if "containers" like tar files
could be accessed like directories without being first extracted, it
would increase the power and flexibility of the whole system. Windows
XP does something like this with the way it presents ZIP files as
directories, and although I'm sure the Windows way isn't nearly as
efficient and universal as how Linux would do it, I find the feature to
be INCREDIBLY useful.

Of course, we don't necessarily want codecs for every archive format
ever invented to be embedded in the kernel. Instead, we need to do
something like how you can mount an ISO on a loopback device, only more
transparently and more flexibly, and with the less performance-critical
codecs being implemented as daemons in userspace.

Also, there would be limitations. For instance, many such
pseudo-directories would be read-only, and some would be writable only
on certain file systems. Writing to a compressed archive, for example,
might be much easier to implement on Reiser4 than on something else:
the ext3 version might have to do a lot more copying, while the Reiser4
version could take advantage of Reiser-specific features like inserting
in the middle of a file.
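
To make the idea concrete, here is a rough userspace sketch of treating a
tar file as a read-only directory without extracting it. It is only an
illustration using Python's standard tarfile module; the function names
(list_archive_dir, read_archive_file, make_demo_archive) and the demo
contents are made up for this example and are not part of any proposed
kernel or Reiser4 interface.

```python
import io
import tarfile


def list_archive_dir(archive_bytes, subdir=""):
    """Return the immediate children of `subdir` inside a tar archive."""
    prefix = subdir.strip("/")
    children = set()
    # "r:*" lets tarfile detect gzip/bzip2/xz compression on its own.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            name = member.name.strip("/")
            if prefix:
                if not name.startswith(prefix + "/"):
                    continue
                name = name[len(prefix) + 1:]
            if name:
                # Keep only the first path component below `subdir`.
                children.add(name.split("/", 1)[0])
    return sorted(children)


def read_archive_file(archive_bytes, path):
    """Read one regular file out of the archive without unpacking the rest."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        extracted = tar.extractfile(path)
        if extracted is None:
            raise IsADirectoryError(path)
        return extracted.read()


def make_demo_archive():
    """Build a small in-memory .tar.gz (hypothetical contents for the demo)."""
    files = [
        ("src/main.c", b"int main(void) { return 0; }\n"),
        ("src/README", b"demo\n"),
        ("Makefile", b"all:\n"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    archive = make_demo_archive()
    print(list_archive_dir(archive))
    print(list_archive_dir(archive, "src"))
    print(read_archive_file(archive, "src/README"))
```

Note that this view is read-only, which matches the limitation mentioned
above: writing into a compressed archive would mean rewriting it.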

2004-09-10 05:23:04

by Hans Reiser

Subject: Re: silent semantic changes with reiser4

A friend asked me a question, and because he is very bright it reminded
me that I have not done a good job of reviewing the history of the
design's evolution.

He asked me, why not just access a filename's size as filename/size?

So, the original idea was to access metafiles as just files within a
directory, and it actually remains that way. However, I first decided to
make the standard unix metafiles less intrusive on the namespace. So
that led to calling it filename/..size and filename/..owner, etc. In
this scheme, the use of '..' was just a style convention for metafiles,
and not a requirement in any way.

This was actually pretty decent as a design, but then a user on the
mailing list suggested replacing the '..' prefix with a subdirectory
prefix. I forget who suggested this and what the prefix was exactly. So
we then created a '..metas' subdirectory, and this had the advantage
that one could ls it to see all the builtins supported by a given
plugin. This is not an important advantage, and I encourage others to
critique it.

So, instead of filename/size we have filename/..metas/size. The only
thing gained by this is that 'size' has a rarer name of '..metas/size'.
The use of the '..metas/' prefix is purely a non-mandated style
convention. File plugins that dislike it are free to violate the
convention. There is no deep semantic to it, just a cowardly aversion to
intruding on current namespace usage with something as common as 'size'.
It has the significant disadvantage of being a longer name than 'size'
or '..size'. I could be talked out of it.
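
As a rough illustration of the naming scheme described above, the
following userspace sketch resolves filename/..metas/size to the file's
size and lets the '..metas' directory be listed. It only imitates the
convention on top of os.stat; the names META_DIR, BUILTINS, read_path,
list_metas and demo are invented for this example and say nothing about
how the reiser4 plugins actually implement it.

```python
import os
import tempfile

# Style convention from the post above, not a mandated name.
META_DIR = "..metas"

# Hypothetical set of builtins a file plugin might expose.
BUILTINS = {
    "size": lambda st: st.st_size,
    "owner": lambda st: st.st_uid,
}


def read_path(path):
    """Resolve `file/..metas/<key>` to metadata, otherwise read the file."""
    head, key = os.path.split(path)
    base, marker = os.path.split(head)
    if marker == META_DIR and key in BUILTINS:
        return BUILTINS[key](os.stat(base))
    with open(path, "rb") as f:
        return f.read()


def list_metas():
    """What 'ls file/..metas' would show in this sketch."""
    return sorted(BUILTINS)


def demo():
    """Create a 5-byte file and read it both ways."""
    with tempfile.TemporaryDirectory() as d:
        shoe = os.path.join(d, "shoe")
        with open(shoe, "wb") as f:
            f.write(b"hello")
        size = read_path(os.path.join(shoe, META_DIR, "size"))
        body = read_path(shoe)
    return size, body


if __name__ == "__main__":
    print(list_metas())
    print(demo())
```

Because '..metas' is matched as an ordinary path component, a plugin in
this sketch could drop the prefix by changing one constant, which is the
sense in which it is only a convention.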

Hans

2004-09-10 06:33:35

by Peter Foldiak

Subject: Re: silent semantic changes with reiser4

On Fri, 2004-09-10 at 06:22, Hans Reiser wrote:
> He asked me, why not just access a filename's size as filename/size?

I now understand that you need a way to distinguish between something
like

shoe/size

and

shoe/.../size (or shoe/..size)

The first one is the size of the shoe, the second is the automatically
generated size of the file (object). You would get into trouble if you
did not allow the user to use shoe/size for shoe size.

Peter

2004-09-10 07:05:03

by Hans Reiser

Subject: Re: silent semantic changes with reiser4

Peter Foldiak wrote:

>On Fri, 2004-09-10 at 06:22, Hans Reiser wrote:
>
>
>>He asked me, why not just access a filename's size as filename/size?
>>
>>
>
>I now understand that you need a way to distinguish between something
>like
>
>shoe/size
>
>and
>
>shoe/.../size (or shoe/..size)
>
>The first one is the size of the shoe, the second is the automatically
>generated size of the file (object). You would get into trouble if you
>would not allow the user to use shoe/size for shoe size. Peter
>
exactly.

Of course, problem/shoe/size could refer to shoe size in centimeters of
a problem shoe or the size of the problem relating to a shoe in units of
reporters providing press coverage of it or....

So there are lots of opportunities for ambiguity in semantics....

Still, widely used builtins seem like they should be moderately evasive
of commonly used names.

2004-09-10 15:48:24

by Timothy Miller

Subject: Re: silent semantic changes with reiser4

Hans Reiser wrote:
> Peter Foldiak wrote:
>
>> On Fri, 2004-09-10 at 06:22, Hans Reiser wrote:
>>
>>
>>> He asked me, why not just access a filename's size as filename/size?
>>>
>>
>>
>> I now understand that you need a way to distinguish between something
>> like
>>
>> shoe/size
>>
>> and
>>
>> shoe/.../size (or shoe/..size)
>>
>> The first one is the size of the shoe, the second is the automatically
>> generated size of the file (object). You would get into trouble if you
>> would not allow the user to use shoe/size for shoe size. Peter
>>
> exactly.
>
> Of course, problem/shoe/size could refer to shoe size in centimeters of
> a problem shoe or the size of the problem relating to a shoe in units of
> reporters providing press coverage of it or....
>
> So there are lots of opportunities for ambiguity in semantics....
>
> Still, widely used builtins seem like they should be moderately evasive
> of commonly used names.

You know, if tools all need to be rewritten anyway to deal with the file
metadata "directory", then why not change the symbol that delimits the
metadata key?

Everyone likes ':', so we'd have "problem/shoe:size". (Don't bother to
complain about files which have : in them, because I already know it
sucks, but it's an example.)

See, unless you can come up with a way to seamlessly make old tools work
with the new semantics, then there's no reason not to make more than one
change to tools at the same time.

Also, if you take the ':' example literally, then the file system would
need to be able to figure out that a file whose literal name is
"C:\MYPR0NDIR\BODY_PARTS.JPG" isn't referring to metadata for a file
named "C". :)
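
A small sketch of the ambiguity, assuming a naive rule that splits a
name at its last ':' into an object and a metadata key. The function
split_meta and the DOS-style sample name are hypothetical and exist only
to show the problem.

```python
def split_meta(name, delimiter=":"):
    """Split `name` into (object, key) at the last delimiter.

    Returns (name, None) when the delimiter does not occur.
    """
    obj, sep, key = name.rpartition(delimiter)
    if not sep:
        return name, None
    return obj, key


if __name__ == "__main__":
    # The intended use: metadata key "size" of object "problem/shoe".
    print(split_meta("problem/shoe:size"))
    # A file whose literal name contains ':' is misread as metadata of "C".
    print(split_meta("C:\\DIR\\FILE.JPG"))
    # No delimiter, so this is an ordinary name.
    print(split_meta("problem/shoe"))
```

With a reserved delimiter, every literal name containing it needs some
escaping rule or it is misparsed, as the second call shows.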

2004-09-10 15:56:15

by Wayne Scott

Subject: Re: silent semantic changes with reiser4

From: Timothy Miller
> Everyone likes ':', so we'd have "problem/shoe:size". (Don't bother to
> complain about files which have : in them, because I already know it
> sucks, but it's an example.)

[[ I just joined this discussion, so pardon if this is already known.]]

One advantage of ':' is that portable programs already have to avoid
it because of NTFS alternate data streams:
http://www.diamondcs.com.au/index.php?page=archive&id=ntfs-streams

For example on an XP box with NTFS:

$ mkdir j
$ cd j
$ echo hi > foo:bar
$ ls -l
total 0
-rw-r--r-- 1 wscott Administ 0 Sep 10 10:45 foo
$ cat foo
$ cat foo:bar
hi
$ rm foo
$ cat foo:bar
cat: foo:bar: No such file or directory

-Wayne

2004-09-10 17:51:43

by Hans Reiser

Subject: Re: silent semantic changes with reiser4

Timothy Miller wrote:

>
>
> Hans Reiser wrote:
>
>> Peter Foldiak wrote:
>>
>>> On Fri, 2004-09-10 at 06:22, Hans Reiser wrote:
>>>
>>>
>>>> He asked me, why not just access a filename's size as filename/size?
>>>>
>>>
>>>
>>>
>>> I now understand that you need a way to distinguish between something
>>> like
>>>
>>> shoe/size
>>>
>>> and
>>>
>>> shoe/.../size (or shoe/..size)
>>>
>>> The first one is the size of the shoe, the second is the automatically
>>> generated size of the file (object). You would get into trouble if you
>>> would not allow the user to use shoe/size for shoe size. Peter
>>>
>> exactly.
>>
>> Of course, problem/shoe/size could refer to shoe size in centimeters
>> of a problem shoe or the size of the problem relating to a shoe in
>> units of reporters providing press coverage of it or....
>>
>> So there are lots of opportunities for ambiguity in semantics....
>>
>> Still, widely used builtins seem like they should be moderately
>> evasive of commonly used names.
>
>
> You know, if tools all need to be rewritten anyway to deal with the
> file metadata "directory", then why not change the symbol that
> delimits the metadata key?

because it is useful that it is only a style convention. Changing the
symbol makes it mandatory to distinguish metafiles from files.

>
> Everyone likes ':', so we'd have "problem/shoe:size". (Don't bother
> to complain about files which have : in them, because I already know
> it sucks, but it's an example.)
>
> See, unless you can come up with a way to seamlessly make old tools
> work with the new semantics, then there's no reason not to make more
> than one change to tools at the same time.
>
> Also, if you take the ':' example literally, then the file system
> would need to be able to figure out that a file whose literal name is
> "C:\MYPR0NDIR\BODY_PARTS.JPG" isn't referring to metadata for a file
> named "C". :)

2004-09-10 19:56:29

by uwe

Subject: Re: silent semantic changes with reiser4

Wayne Scott wrote:

>One advantage of ':' is that portable programs already have to avoid
>it because of NTFS alternate data streams:

This is not going to work, because ':' is already used in URLs, in Perl
man page names, and in maildir file names:

lynx http://www.site.tld:8080
ls /usr/share/man/man3/*::* ## perl stuff
http://cr.yp.to/proto/maildir.html

Regards, Uwe