2004-09-08 02:26:40

by Horst H. von Brand

Subject: Re: silent semantic changes with reiser4

David Lang said:
> so far the best answer that I've seen is a slight variant of what Hans is
> proposing for the 'file-as-a-directory'

> make the base file itself be a serialized version of all the streams and
> if you want the 'main' stream open file/. (or some similar variant)

Serialized how? cpio(1), tar(1), cat(1)ed together, ...? Ship it via FTP or
something, how do you unpack on a traditional filesystem? On a wacky
filesystem?

Note that this has the same performance implications as the targzfilesystem
concept, just the other way around: Instead of unpacking on the fly it is
packing on the fly. And paying a (possibly huge) cost for "simple, everyday
operations" in some cases that can't be distinguished from ones where there
is no such cost is very wrong to me.

> this doesn't address the hard-link issue,

It has to be solved!

> but it should handle the backup
> problems (your backup software just goes through the files and what it
> gets is suitable for backups).

But how do you recreate the original "files" later? You only get the
serialized version back. Even worse, if you unpack on top of the original
wacky file, you'd get a wacky file, where the main stream is the serialized
version of the original... Rinse and repeat. [Shudder]

> you will ask what serializer to use, and my answer is to let one of the
> streams tell you, and have the kernel make a call out to userspace to
> execute the appropriate program (note that this means that tar is not put
> into the kernel)

If I want to see it serialized via tar(1), and you via cpio(1), and
somebody else via "take the icon stream, discard everything else", how do
you handle that?

> in fact it may make sense to just open file/file to get at the 'main'
> stream of the file (there may be cases where the concept of a single main
> stream may not make sense)

What do you do then?

> so if this solves the tool/backup problem

It makes it much worse, AFAICS.

> then we can look and figure out
> if there's a reasonable way to solve the hard-link problem

Right. And if none is found, can we drop this madness?
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513


2004-09-08 06:53:21

by David Lang

Subject: Re: silent semantic changes with reiser4

On Tue, 7 Sep 2004, Horst von Brand wrote:

> David Lang said:
>> so far the best answer that I've seen is a slight variant of what Hans is
>> proposing for the 'file-as-a-directory'
>
>> make the base file itself be a serialized version of all the streams and
>> if you want the 'main' stream open file/. (or some similar variant)
>
> Serialized how? cpio(1), tar(1), cat(1)ed together, ...? Ship it via FTP or
> something, how do you unpack on a traditional filesystem? On a wacky
> filesystem?
>

as I said elsewhere in the original post, the owner of the file sets the
tool used to serialize it. if you want tar, set it to use tar; if you
want cpio, set that. in both cases it would be a call out to userspace.
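
A minimal sketch of that idea, for illustration only. It assumes a compound
file can be modeled as a mapping from stream name to bytes, and that a stream
named ".serializer" (a hypothetical name, not anything reiser4 defines) holds
the owner's choice. The proposal would call out to a userspace program; plain
Python functions stand in for that call here.

```python
import io
import tarfile

# Assumption: a compound file is modeled as {stream name: bytes}.
# The ".serializer" stream name is hypothetical, chosen for this sketch.

def serialize_tar(streams):
    """Pack every stream into one in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(streams):
            info = tarfile.TarInfo(name=name)
            info.size = len(streams[name])
            tar.addfile(info, io.BytesIO(streams[name]))
    return buf.getvalue()

def serialize_cat(streams):
    """Concatenate the data streams; cheap, but stream boundaries are lost."""
    return b"".join(streams[name] for name in sorted(streams)
                    if name != ".serializer")

# Stand-in for the userspace tools the owner could select.
SERIALIZERS = {"tar": serialize_tar, "cat": serialize_cat}

def read_base_file(streams):
    """Return what a plain read of the base file would yield."""
    tool = streams[".serializer"].decode().strip()
    return SERIALIZERS[tool](streams)
```

Note that every reader of the base file gets whatever format the owner
picked; nothing in this sketch lets a reader ask for a different one.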

> Note that this has the same performance implications as the targzfilesystem
> concept, just the other way around: Instead of unpacking on the fly it is
> packing on the fly. And paying a (possibly huge) cost for "simple, everyday
> operations" in some cases that can't be distinguished from ones where there
> is no such cost is very wrong to me.

this is a (huge) opportunity for the filesystem to provide an
optimization. if it is able to maintain a serialized version, then this
could turn into a simple read. no, it's not always the simple, obvious
thing, but very few things are.

the problem of things having unexpected side effects on an uncontrolled
server is already out there.

>> this doesn't address the hard-link issue,
>
> It has to be solved!

I never disputed that. in fact I pointed out that that issue remained,
with the idea (since disproved) that this suggestion would settle the
other issues sufficiently to let everyone concentrate on the link issue.

>
>> but it should handle the backup
>> problems (your backup software just goes through the files and what it
>> gets is suitable for backups).
>
> But how do you recreate the original "files" later? You only get the
> serialized version back. Even worse, if you unpack on top of the original
> wacky file, you'd get a wacky file, where the main stream is the serialized
> version of the original... Rinse and repeat. [Shudder]

how smart is the filesystem plugin? can the filesystem recognise compound
filetypes? if so, then it can tell what's happening and do the right thing
when recreating the file.
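
A rough sketch of what "recognising the compound filetype" on restore could
look like, assuming the serialized form is a tar archive and a compound file
is modeled as a mapping from stream name to bytes. The function names pack()
and restore() are made up for this example; this is not how any reiser4
plugin works.

```python
import io
import tarfile

def pack(streams):
    """Serialize {stream name: bytes} into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(streams.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def restore(data):
    """Recreate the streams from backed-up bytes.

    If the bytes parse as a tar archive, split them back into streams;
    otherwise treat them as an ordinary file with a single 'main' stream.
    """
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
            return {m.name: tar.extractfile(m).read()
                    for m in tar.getmembers() if m.isfile()}
    except tarfile.ReadError:
        return {"main": data}
```

With this, restore(pack(streams)) gives back the original streams instead of
a file whose main stream is the archive. The catch is that an ordinary file
that happens to be a tar archive would be split up as well, so detection by
content alone is ambiguous.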

>> you will ask what serializer to use, and my answer is to let one of the
>> streams tell you, and have the kernel make a call out to userspace to
>> execute the appropriate program (note that this means that tar is not put
>> into the kernel)
>
> If I want to see it serialized via tar(1), and you via cpio(1), and
> somebody else via "take the icon stream, discard everything else", how do
> you handle that?

the owner of the file decides (or whoever that owner has granted
permission to change it to a different system)

>> in fact it may make sense to just open file/file to get at the 'main'
>> stream of the file (there may be cases where the concept of a single main
>> stream may not make sense)
>
> What do you do then?

the backup doesn't really care. if you reference files by file/file then
you can't really say that file/foo is any more the 'main' stream than
file/bar. but if you call it file/. then you do have to define what
the main version is.

>> so if this solves the tool/backup problem
>
> It makes it much worse, AFAICS.



>> then we can look and figure out
>> if there's a reasonable way to solve the hard-link problem
>
> Right. And if none is found, can we drop this madness?

solutions are known; it's just not known whether any of them can be
implemented with suitable performance (locking the entire filesystem
whenever you go to do a dangerous action goes a _long_ way towards solving
the problems, but it's not acceptable due to performance issues)

the fundamental problem with hard links and directories (maintaining a
sane path) is the exact same problem that the *notify tools need solved so
that a program can set up notification on a directory tree that it's
interested in without tying up a large number of file handles. it's an odd
problem, but if it can be solved there are several doors that open up.

David Lang



--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan