2004-09-01 20:10:02

by Jamie Lokier

Subject: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

I'm going to explain why filesystem support for .tar.gz or other
"document container"-like formats is useful. This does _not_ mean tar
in the kernel (I know someone who can't read will think that if I
don't say it isn't); it does mean hooks in the kernel for maintaining
coherency between different views and filesystem support for cacheing.

The vision I'm going for here is:

1. OpenOffice and similar programs store compound documents in
standard file formats, such as .tar.gz, compressed XML and such.

Fs support can reduce CPU time handling these sorts of files, as
I explain below, while still working with the standard file formats.

With appropriate userspace support, programs can be written which
have access to the capabilities on all platforms, but reduced CPU
time occurs only on platforms with the fs support.

2. Real-time indexing and local search engine tools. This isn't
just things like local Google; it's also your MP3 player scanning
for titles & artists, your email program scanning for subject
lines to display the summary fast, your blog server caching built
pages, your programming environment scanning for tags, and your
file transfer program scanning for shared deltas to reduce bandwidth.

I won't explain how these work as it would make this mail too
long. It should be clear to anyone who thinks about it why the
coherency mechanism is essential for real-time, and a consistent
interface to container internals helps with performance.

Horst von Brand wrote:
> Jamie Lokier <[email protected]> said:
> > When a simple "cd" into .tar.gz or .iso is implemented properly, it
> > will have _no_ performance penalty after you have first looked in the
> > file, so long as it remains in the on-disk cache. And, the filesystem
> > will manage that cache intelligently.
>
> Nonsense. The .iso or .tar or whatever would have to be kept un-isoed or
> un-tarred in memory (or on disk cache) for this to be true, and that takes
> quite a long time. Each time you want to peek anew at linux/Makefile, the
> whole tarfile will have to be read and stored somewhere,

Wrong. "So long as it remains in the on-disk cache" means each time
you peek at linux/Makefile, the tarfile is _not_ read.

For a tarfile it's slow the first time, and again when it falls out of
the on-disk cache; otherwise, for component files you are using regularly
(even over a long time), it's as fast as reading a plain file.

You obviously know this, as you mentioned on-disk cache in the reply,
so I infer from the rest of your mail that what you're trying to say
is more about modifications than reading archives. That it would be
silly to keep working data in .tar.gz files, because working inside
them regularly would be slow.

Which means you must be assuming, incorrectly, that these .tar.gz
files are really kept up to date on disk with every component file
modification.

Which is silly. .tar.gz files are suitable for *transport* and
*archival*, not regular random access; it's almost rude of you to
suggest I didn't know that.

The proposal is that .tar.gz files (and others) are analysed on demand
and content cached on disk as it is read. Then subsequent reads will
be as fast as if you had unpacked the archives by hand, manually using
the tar command. This is obviously exactly the same as you do now,
with a small bit of added convenience.

The other part of the proposal is that when you modify a component,
the modifications are stored on disk in the same way as ordinary
files, using the regular high performance random access disk
structures. Nothing is done to recreate the archive at this point; I
think this is where you misunderstood and thus flamed.

_If_ after modifying components, you then read the .tar.gz as a file,
then (and only then) is it recreated, taking in the worst case the
same time as running the tar command.

The _only_ times when that occurs are precisely those times when you
would have run the tar command manually: because you only read the
.tar.gz file when you need the flat file for some purpose, such as
attaching it to an email, transferring by FTP or HTTP, or reading it
into a program that needs it in that format.
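
A minimal sketch of the lazy scheme described above, written as a toy
in-memory model rather than filesystem code. The struct container type
and the three function names are hypothetical, and the counter merely
stands in for actually running tar and gzip.

```c
#include <stddef.h>

/* Hypothetical in-memory model of one container; not a real kernel structure. */
struct container {
    int archive_stale;   /* nonzero if components changed since the last repack */
    unsigned repacks;    /* how many times the flat archive has been rebuilt */
};

/* Writing a component only touches the unpacked copy and marks the archive stale. */
static void component_write(struct container *c)
{
    c->archive_stale = 1;
}

/* Reading a component never triggers a repack. */
static void component_read(const struct container *c)
{
    (void) c;
}

/* Reading the flat archive repacks at most once per batch of component writes. */
static void archive_read(struct container *c)
{
    if (c->archive_stale) {
        c->repacks++;          /* stands in for running tar + gzip */
        c->archive_stale = 0;
    }
}
```

Any number of component writes between two archive reads costs one
repack, which is the same work a manual tar run would have done.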

If there is anything about that strategy that doesn't make sense, then
I suggest I have failed to explain it properly, and you're welcome to
demand a clearer explanation.

> the .tar format is optimized for compact storage, the on-disk format
> of a filesystem is optimized for fast access and modifiability.

Actually no, .tar is not compact at all. It's also not optimised for
random read access, but after an index is built it is very fast for that.
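
As a rough illustration of what building such an index involves, here is
a sketch that walks the 512-byte headers of an uncompressed tar image
held in memory and records where each member's data starts. The
struct tar_entry type and function names are made up for this example.
It assumes plain headers with a zero-padded octal size field, and
ignores the ustar prefix field, long-name extensions and checksums.

```c
#include <string.h>

#define TAR_BLOCK 512

struct tar_entry {
    char name[101];        /* header name field is 100 bytes, not always NUL-terminated */
    size_t data_offset;    /* offset of the member's data within the archive */
    size_t size;           /* member size in bytes */
};

/* Parse the octal size field stored at header offset 124 (12 bytes wide). */
static size_t tar_octal(const unsigned char *field, size_t len)
{
    size_t v = 0, i;

    for (i = 0; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        v = v * 8 + (size_t) (field[i] - '0');
    return v;
}

/* Walk the headers once; returns the number of entries recorded (at most max). */
static size_t tar_build_index(const unsigned char *buf, size_t buflen,
                              struct tar_entry *out, size_t max)
{
    size_t off = 0, n = 0;

    while (off + TAR_BLOCK <= buflen && n < max) {
        const unsigned char *h = buf + off;

        if (h[0] == 0)     /* a zero block marks the end of the archive */
            break;
        memcpy(out[n].name, h, 100);
        out[n].name[100] = '\0';
        out[n].size = tar_octal(h + 124, 12);
        out[n].data_offset = off + TAR_BLOCK;
        /* member data is padded up to a whole number of 512-byte blocks */
        off += TAR_BLOCK + (out[n].size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;
        n++;
    }
    return n;
}
```

Once the index exists, reading one member is a seek to data_offset and a
read of size bytes, with no need to scan the headers again.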

.tar.gz is compact. Although that is not especially fast for random
read access, you can build a "compression dictionary index" which
optimises random read access even in a .tar.gz, without ever unpacking
the whole thing.

Some formats like .iso, .zip and .jar are optimised for compact
storage _and_ fast random access. They come with an index, and don't
need one to be built and cached.

These are not filesystem-like formats, obviously, but they are the
formats you need to pack and unpack when exchanging data with other
people. That's the _only_ reason they're on your disk (virtually;
they may not really exist some of the time).

> Now go ahead and enlarge a file on your .iso/.tar a bit...it
> will take ages to rebuild the whole thing. There is a _reason_ why there
> are filesystems and archives, and they use different formats. If it weren't
> so, everybody and Aunt Tillie would just carry .ext3's around, and would
> wonder what the heck all this fuss is about.

If you enlarge a file in your .iso/.tar subdirectory a bit... nothing
happens. Why would a smart programmer do anything so silly as rebuild
the archive at that point?

_If_ you subsequently read the .iso/.tar _file_, then and only then
does it rebuild. Once, after lots of component writes. The only time
you would ever do that is if you are specifically reading the archive
file, which means you actually want to use the repacked file at that
point, for example to FTP it somewhere or use as an email attachment.

If the filesystem does not do that on demand, then you would have run
the tar command manually at that point, precisely because that's a
point where you need the repacked archive. So in the case where the
filesystem repacks the archive, it takes exactly the same time as you
would have taken anyway; it's just automatic instead of manual. (As a
handy side effect, the automatic method offers lower latency for
transmissions).

Now, why would we bother with all this?

I see three reasons: convenience, time efficiency, and storage efficiency.

Convenience is simply that it is handy to be able to look inside
archive files, in those situations where we _currently_ use them, without
having to manually untar when we need to, and without having to
remember to clean up old directories when we discover we aren't using
those often any more. This is _not_ an argument for using .tar.gz
files in place of ordinary directories! Convenience applies to doing
the things you do now.

Time efficiency has two angles. A simple one is that accessing
.tar.gz contents through any kind of filesystem interface, even pure
userspace, can be faster than unpacking whole files, simply because
there are ways to decode parts without unpacking it all.

However, the main time efficiency that I see comes from the increasing
number of applications where the "Open" and "Save" operations store
data in *.gz files (e.g. OpenOffice compressed XML documents), or
*.tar.gz files (some compound document formats), or other things like
that. (If you think about it, quite a lot of things are like that).

With these, every "Open" currently has to decompress and maybe unpack
an archive format. Every "Save" currently has to pack and then
compress. This is done so the user sees a single flat file containing
a complex document, but it is a waste of CPU time until the user
actually transports the flat file.

The lazy proposal, as described earlier in this mail, _removes_ these
decompression, unpacking, packing and compression CPU-intensive steps
when they are unnecessary. The experience of a single file containing
a complex document is maintained, but the CPU time is reduced in many
typical operations. "Open" gets faster after you first look at a
file, "Save" gets a lot faster for large documents, and the equivalent
of grep (or later, real-time local search engine) gets a lot faster
too. There is no operation where CPU time is overall increased.

This is what I've meant throughout this thread, when I say containers:
document files of the kind used to hold text, figures, etc. that are
typically transported as a unit, and edited as a unit, but nonetheless
at the moment they're stored in somewhat CPU intensive formats, for
compactness. That's fine for a 1 page letter, but think of the
OpenOffice 500-page book containing a large number of diagrams.

However, even simple programs that read & write compressed XML benefit.

The proposal allows that sort of thing to be handled more time
efficiently than it is today, and in a way that is very practical to use.

(It's unthinkable that OpenOffice and similar programs would have a
lot of code which stored data in a special way just for Linux, just
for these performance benefits which are otherwise user-invisible, but
it's thinkable that a general purpose userspace library which is
portable to all platforms could be written, which takes advantage of
the facility when it's available and does the equivalent of today's
"compress on save" when the filesystem facility isn't available).

Finally, storage efficiency comes from simply allowing the filesystem
and supporting tools to decide when it is best to store data in
unpacked, packed & compressed, both at the same time, or another
archival form. The filesystem has comparatively good knowledge of
which data to archive and when, but it can only maintain the illusion
if there's a mechanism to make archived forms and unpacked forms
coherent.

Now, I'm sure there is a way to implement this on top of a neat and
simple kernel feature involving weird bind mounts, leases, dnotifies
and FUSE. But those kernel offerings are quite a mess at the moment
and don't fit together in a way which can usefully create this effect.

Auto-mounting uservfs directories over file-as-directory, using
moveable bind mounts _nearly_ offers the kernel primitives we need to
build this in userspace and get all the efficiencies. But not quite.

(We could obviously do it all in userspace by putting _everything_ in
a userspace filesystem, but that would be silly as it would throw away
all of the performance of having a threaded filesystem in the kernel.
It might do as a proof of concept though).

-- Jamie


2004-09-01 23:14:01

by Linus Torvalds

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)



On Wed, 1 Sep 2004, Jamie Lokier wrote:
>
> I'm going to explain why filesystem support for .tar.gz or other
> "document container"-like formats is useful. This does _not_ mean tar
> in the kernel (I know someone who can't read will think that if I
> don't say it isn't); it does mean hooks in the kernel for maintaining
> coherency between different views and filesystem support for cacheing.

I think that's a valid thing, but there are some fundamental problems with
it if you expect it to work on a normal filesystem (ie something that
isn't fundamentally designed as a database).

For example, _what_ kind of coherency do you think is acceptable? Quite
frankly, using standard UNIX interfaces, absolute coherency just isn't an
option, because it's just not possible to try to atomically update a view
at the same time somebody else is writing to the "main file". "mmap()" is
the most obvious example of this, but even the _basic_ notion of multiple
"read" calls is not atomic without locking that is _way_ too expensive.

A "read()" on a file is not atomic even on the _plain_ file: if somebody
does a concurrent "write()", the reader may see a partial update. This
becomes a million times more confusing if the reader is seeing a
structured view of the file the writer is modifying.

Also, it's likely impossible to write() to the view-file, again unless you
expect all the underlying filesystems to be something really special.

So from a _practical_ standpoint, I suspect that the best you can really
do pretty cheaply (and which gets you 90% of what you probably want) is:

- open-close consistency: the "validity" of the cache is checked at
_open_ time, and no guarantees are given about the cache being
updated afterwards.
- read-only access to the cache (ie you can only read the view, not write
to it).

and quite frankly, I think you can do the above pretty much totally in
user space with a small library and a daemon (in fact, ignoring security
issues you probably don't even need the daemon). And if you can prototype
it like that, and people actually find it useful, I suspect kernel support
for better performance might be possible.
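
The open-time validity check in the first point could be sketched as
below, using mtime as the only coherency signal. The function names are
hypothetical. Equal timestamps are treated as stale here on the
assumption that one-second mtimes cannot order the two events; a real
implementation might prefer a finer-grained timestamp or kernel help.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: a cached view counts as valid at open time only if
 * it is strictly newer than the base file. Returns 1 if valid, 0 if it
 * must be regenerated.
 */
static int cache_is_valid(const struct stat *base, const struct stat *cache)
{
    return cache->st_mtime > base->st_mtime;
}

/* fstat() both descriptors and apply the check; returns -1 on error. */
static int cache_fd_is_valid(int base_fd, int cache_fd)
{
    struct stat base, cache;

    if (fstat(base_fd, &base) < 0 || fstat(cache_fd, &cache) < 0)
        return -1;
    return cache_is_valid(&base, &cache);
}
```

Nothing is promised after the open: a writer that modifies the base file
later is only noticed the next time the view is opened.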

Suggested interface:

int open_cached_view(int base_fd, char *type, char *subname);

where "type" would be the type of the view (ie "tar" for a tar-file view,
"idtag" for a mp3 ID tag, or NULL for "autodetect default view") and
"subname" would be the cache entry name (ie the tar-file filename, or the
tag type to open).

I bet you could write a small library to test this out for a few types.
See if it's useful to you. And only if it's useful (and would make a huge
performance difference) would it be worth putting in the kernel.

Implementation of the _user_space_ library would be something like this:

    #include <errno.h>
    #include <fcntl.h>
    #include <limits.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/stat.h>

    #define MAXNAME 1024
    int open_cached_view(int base_fd, char *type, char *subname)
    {
        struct stat st;
        char filename[PATH_MAX];
        char name[MAXNAME];
        int len, cachefd;

        if (fstat(base_fd, &st) < 0)
            return -1;
        /* Recover the pathname of the base file from its descriptor */
        sprintf(name, "/proc/self/fd/%d", base_fd);
        len = readlink(name, filename, sizeof(filename)-1);
        if (len < 0)
            return -1;
        filename[len] = 0;

        /* FIXME! Replace '/' with '#' in "type" and "subname" */
        len = snprintf(name, sizeof(name),
            "%04llx/%04llx/%s/%s/%s",
            (unsigned long long) st.st_dev,
            (unsigned long long) st.st_ino,
            type ? type : "default",
            subname,
            filename);
        if (len < 0 || (size_t) len >= sizeof(name)) {
            errno = ENAMETOOLONG;
            return -1;
        }
        cachefd = open(name, O_RDONLY);
        if (cachefd >= 0) {
            /* Check mtime here - maybe we could have kernel support */
            return cachefd;
        }
        if (errno != ENOENT)
            return -1;
        /*
         .. try to generate cache file here ..
         */
        return -1;
    }

see what I'm aiming at? You start out with a generic "attribute cache"
library that does some hacky things (like depending on "mtime" for
coherency) and then if that works out you can see if it's useful.
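
The FIXME about replacing '/' with '#' could be handled by a small
helper along these lines. The name flatten_component is invented for
this sketch, and it makes no attempt to deal with "." and ".." or with
collisions against names that already contain '#'.

```c
#include <stddef.h>

/*
 * Copy src into dst, replacing '/' with '#' so the result is usable as a
 * single path component. Returns 0 on success, -1 if dst is too small.
 */
static int flatten_component(char *dst, size_t dstlen, const char *src)
{
    size_t i;

    for (i = 0; src[i] != '\0'; i++) {
        if (i + 1 >= dstlen)   /* keep room for the terminating NUL */
            return -1;
        dst[i] = (src[i] == '/') ? '#' : src[i];
    }
    if (dstlen == 0)
        return -1;
    dst[i] = '\0';
    return 0;
}
```

Both "type" and "subname" would pass through it before being formatted
into the cache path.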

Linus

2004-09-02 10:56:49

by Alan

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Mer, 2004-09-01 at 21:50, Linus Torvalds wrote:
> and quite frankly, I think you can do the above pretty much totally in
> user space with a small library and a daemon (in fact, ignoring security
> issues you probably don't even need the daemon). And if you can prototype
> it like that, and people actually find it useful, I suspect kernel support
> for better performance might be possible.

Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
it since the late 1990's.


2004-09-02 14:09:02

by Horst H. von Brand

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Jamie Lokier <[email protected]> said:
> I'm going to explain why filesystem support for .tar.gz or other
> "document container"-like formats is useful.

Nobody disagrees there (I think); the disagreement is on whether the
usefulness is worth the hassle.

> This does _not_ mean tar
> in the kernel (I know someone who can't read will think that if I
> don't say it isn't); it does mean hooks in the kernel for maintaining
> coherency between different views and filesystem support for cacheing.

"Coherency" and "different views" implies atomic transactions, and being
able to tell that an access to the file requires updating random junk
about it. It requires being able to guess if it is worth updating now
(given that the file might be modified a dozen times more before the junk
being checked).

> The vision I'm going for here is:
>
> 1. OpenOffice and similar programs store compound documents in
> standard file formats, such as .tar.gz, compressed XML and such.

And they are doing fine AFAICS. Besides, they won't exactly jump on the
possibility of leaving behind all other OSes on which they run to become a
Linux-only format.

> Fs support can reduce CPU time handling these sorts of files, as
> I explain below, while still working with the standard file formats.

I don't buy this one. A tar.gz must be uncompressed and unpacked, and
whatever you could save is surely dwarfed by those costs.

> With appropriate userspace support, programs can be written which
> have access to the capabilities on all platforms, but reduced CPU
> time occurs only on platforms with the fs support.

Userspace support isn't there on any of the platforms right now, if ever it
will be a strange-Linux-installation thing for quite some time to come. Not
exactly attractive for application writers.

> 2. Real-time indexing and local search engine tools.

Sure! Gimme the CPU power and disk throughput for that, pretty please. [No,
I won't tell I have better use for those right now...]

> This isn't
> just things like local Google; it's also your MP3 player scanning
> for titles & artists, your email program scanning for subject
> lines to display the summary fast, your blog server caching built
> pages, your programming environment scanning for tags, and your
> file transfer program scanning for shared deltas to reduce bandwidth.

With no description on how this is supposed to work, this is pure science
fiction/wet dreams.

> I won't explain how these work as it would make this mail too
> long. It should be clear to anyone who thinks about it why the
> coherency mechanism is essential for real-time, and a consistent
> interface to container internals helps with performance.

Coherency is essential, but it isn't free. Far from it. The easiest way of
getting coherency is having _one_ authoritative source. That way you don't
need coherency, and don't pay for it. Anything in this class must by force
be just hints, to be recomputed at a moment's notice. I.e., let the
application who might use it check and recompute as needed.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2004-09-02 16:13:11

by Jamie Lokier

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Alan Cox wrote:
> On Mer, 2004-09-01 at 21:50, Linus Torvalds wrote:
> > and quite frankly, I think you can do the above pretty much totally in
> > user space with a small library and a daemon (in fact, ignoring security
> > issues you probably don't even need the daemon). And if you can prototype
> > it like that, and people actually find it useful, I suspect kernel support
> > for better performance might be possible.
>
> Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
> it since the late 1990's.

Firstly, if I have to do it from a Gnome program, about the only
program where looking in a tar file is visibly useful is Nautilus.
Ironically, clicking on a tar file in Nautilus doesn't work, despite
having a dependency on gnome-vfs2. :/

Secondly, no, Gnome and MC don't support entering a container file,
letting you make changes in it, and remembering those changes to
_lazily_ regenerate the container file when you need it linearized,
possibly months later or never, by some unrelated program.

Thirdly, you must be referring to the Gnome versions of Bash, Make,
GCC, coreutils and Perl which I haven't found. Perhaps we have a
different idea of what "supports this" means :)

uservfs, which is based on gnome-vfs and getting a bit rusty due to
disuse, does try to solve the last problem. Unfortunately it needs
further work to have a nicer interface, and the second problem is
still not solved by gnome-vfs.

-- Jamie

2004-09-02 17:34:04

by Jamie Lokier

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Horst von Brand wrote:
> > This does _not_ mean tar
> > in the kernel (I know someone who can't read will think that if I
> > don't say it isn't); it does mean hooks in the kernel for maintaining
> > coherency between different views and filesystem support for cacheing.
>
> "Coherency" and "different views" implies atomic transactions,

Wrong. They imply ordering.

> and being able to tell that an access to the file requires updating
> random junk about it.

Correct.

> It requires being able to guess if it is worth updating now
> (given that the file might be modified a dozen times more before the junk
> being checked).

Wrong. Lazy and deterministic are compatible.

>
> > The vision I'm going for here is:
> >
> > 1. OpenOffice and similar programs store compound documents in
> > standard file formats, such as .tar.gz, compressed XML and such.
>
> And they are doing fine AFAICS. Besides, they won't exactly jump on the
> possibility of leaving behind all other OSes on which they run to become a
> Linux-only format.

You replied before reading the rest of the mail again, didn't you?
The whole purpose of the idea is to work with formats which are _not_
Linux-only.

> > Fs support can reduce CPU time handling these sorts of files, as
> > I explain below, while still working with the standard file formats.
>
> I don't buy this one. A tar.gz must be uncompressed and unpacked, and
> whatever you could save is surely dwarfed by those costs.

Didn't they explain what "lazy" means in your computer science class?
Google for "lazy evaluation".

In this context it means _unnecessary_ decompressions and compressions,
i.e. repeated ones, or ones where you never really needed the
compressed form, are eliminated. Necessary ones are of course done.

Those are things which are currently wasting your CPU because apps
cannot eliminate them: to do so implies synchronisation between
different apps running at different times. Surprise surprise, that's
something filesystems and kernels are good for.

> > With appropriate userspace support, programs can be written which
> > have access to the capabilities on all platforms, but reduced CPU
> > time occurs only on platforms with the fs support.
>
> Userspace support isn't there on any of the platforms right now, if ever it
> will be a strange-Linux-installation thing for quite some time to come. Not
> exactly attractive for application writers.

Again, you have a strange vision of what userspace support means. I
say it means a portable and simple library for accessing components
inside a compressed container file in a standard format.

Something I believe OpenOffice et al. already have - so to say it
doesn't exist is both incorrect and missing the point. The point is
to change that library which already exists, so it uses the filesystem
facility when available. And to make the library better in standalone
form, so that other tools which manipulate container-like formats are
inclined to work with it as plugins, instead of creating their own
interface as cli tools.

_At no point is portability an issue_ - that library is already
portable to everything - and it's certainly not a strange-Linux thing
nor meant to become one.


> > 2. Real-time indexing and local search engine tools.
>
> Sure! Gimme the CPU power and disk throughput for that, pretty please. [No,
> I won't tell I have better use for those right now...]

You still haven't grasped the idea of an algorithm which is more
complex but reduces CPU time and disk throughput, have you?

*Today*, when I open Rhythmbox I have to wait 5 minutes while it wastes
my CPU power and disk throughput scanning all the files in my Music
directory.

The entire point of these indexing schemes is so that programs like
Rhythmbox will display the same data without having to scan the disk
for 5 minutes every time they start up. That's a near infinite
_saving_ of CPU power and time.

If you don't see that I'm just going to have to suggest you google for
"cache" and learn a bit about them.

> > This isn't
> > just things like local Google; it's also your MP3 player scanning
> > for titles & artists, your email program scanning for subject
> > lines to display the summary fast, your blog server caching built
> > pages, your programming environment scanning for tags, and your
> > file transfer program scanning for shared deltas to reduce bandwidth.
>
> With no description on how this is supposed to work, this is pure science
> fiction/wet dreams.

Sigh. If you must continue to hurl your blunt instruments around,
here is a description. I didn't want to write this because it is off
topic, and you insist on not understanding the basics of algorithmic
complexity and CPU usage, so you'll probably not understand this
either, but I'll give it a go.

1. Local Google (by which I mean a search engine on your local machine),
Real-time (by which I mean the results are always up to date):

Every file modified since last search is known to the query engine.
This is a reality: BeOS does it; WinFS is expected to do it.

We know that it's possible to update free text indexes with
small amounts of known changes quickly, at the time of a query.

Thus we have real-time local free text search engine, and other
features like searching inside files and for file names. The
point is the real-time nature of it: the results you get
correspond to exactly the contents of the filesystem at the time of
the query (writes which occur _during_ a query are obviously
not coherent with this, but writes which complete before the
query, even immediately before, appear in the results).

Note that file write timing need not be affected - all the work
can happen during queries (although it is best to have a
nightly re-index or whatever so that the delay at query times
is kept moderate).

2. MP3 player scanning artists & titles:

Easy. MP3 player does what it does, shows the extracted ID
tags from all .mp3 files in your home directory. They do this
already! The difference is that with fs coherency hooks, they
can _store_ that ID information, retrieve it later without
having to scan all the .mp3s again (see Rhythmbox earlier), and
keep their lists updated on the screen as soon as any .mp3 is
changed or even any new ones created anywhere.

Technically it's a simpler subset of real-time queries in 1.

3. Email program scanning for subject lines fast:

See Evolution; the only difference is stat() on a thousand
files won't be required, it'll be exposed through a standard
query instead of an Evolution-only cache method so other mail
programs may use it as well as shell commands, and you can
pretend you have mbox instead of maildir (mbox is a container
format...).

4. Blog server caching built pages:

It's a reality already, obviously. The difference is it'll
make sense to build the pages through an indexed query on files
instead of a database, e.g. one file per article, and
(independent of the first point) the process of building can
keep track of all the prerequisite files and scripts and
templates used to produce the page, and actually expect to
know, _with coherent accuracy_, if any of those prerequisites
is different the next time the page is needed.

In other words, complex script-generated web pages, cached
output, and with no overhead when serving each page to check
prerequisites (i.e. no 100 stat() calls), because we know the
filesystem would have told us if any prerequisite had changed
prior to the moment we begin serving the cached output.

You can do something close to this already with dnotify (if you
ignore that it doesn't tell you about new hard links, which is
a dnotify bug), although doing the query part is unrealistic
with dnotify and stat() prerequisites unless the directory
names are structured thoughtfully.

5. Programming environment scanning for tags:

By now this should be obvious. No need to run "exuberant
ctags" or whatever class-hierarchy-extraction and
documentation-extraction program every so often after making
changes, and yet your IDE's clickable list of tags and notes
stays up to date with file modifications in real time. I think
some IDEs do this already with moderate size trees, using
stat(), but it's not realistic when you have tens of thousands
of source files.

6. File transfer program scanning for shared deltas.

This is nothing more than searching all files for common
subsequences which match strong hashes which are expected to be
common during file transfer operations. E.g. the GPL header at
the start of many source files would have one such hash. A
hash of every whole file is another one, along with the name of
the file (it is a key which indicates a "likely to match the
corresponding hash" condition), as is a hash of every aligned
64k subsequence, or whatever is appropriate to reduce disk I/O.

Having an index of these can speed up some file transfer
protocols, in the manner of rsync but comparing among a large
group of files instead of just two. The point is that kind of
protocol can be more efficient than rsync (sometimes
dramatically so), but it's only a net gain to use that kind of
algorithm if you have a handy _and reliable_ index of likely
common subsequences and whole-file hashes, otherwise it uses
far too much disk I/O to check the group. The index needs
filesystem coherency support, otherwise it is not reliable
enough for this bandwidth optimisation.
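
To make item 6 a little more concrete, here is a toy sketch of an index
of aligned-block hashes and a lookup against it. All names are invented
for the example. FNV-1a is used only as a simple stand-in for the strong
hash the text calls for, and the linear search stands in for whatever
real index structure would be used.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, used here only as a simple stand-in for a strong hash. */
static uint64_t block_hash(const unsigned char *p, size_t len)
{
    uint64_t h = 14695981039346656037ULL;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hash each aligned block of data; returns the number of hashes written. */
static size_t index_blocks(const unsigned char *data, size_t len, size_t block,
                           uint64_t *out, size_t max)
{
    size_t n = 0, off;

    if (block == 0)
        return 0;
    for (off = 0; off + block <= len && n < max; off += block)
        out[n++] = block_hash(data + off, block);
    return n;
}

/* Count aligned blocks of data whose hash already appears in the index. */
static size_t count_known_blocks(const unsigned char *data, size_t len,
                                 size_t block,
                                 const uint64_t *index, size_t nindex)
{
    size_t hits = 0, off, i;

    if (block == 0)
        return 0;
    for (off = 0; off + block <= len; off += block) {
        uint64_t h = block_hash(data + off, block);

        for (i = 0; i < nindex; i++) {
            if (index[i] == h) {
                hits++;
                break;
            }
        }
    }
    return hits;
}
```

Every block counted as known is one that need not be sent over the wire,
provided the index can be trusted to match what is actually on disk.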

> > I won't explain how these work as it would make this mail too
> > long. It should be clear to anyone who thinks about it why the
> > coherency mechanism is essential for real-time, and a consistent
> > interface to container internals helps with performance.
>
> Coherency is essential, but it isn't free. Far from it. The easiest way of
> getting coherency is having _one_ authoritative source. That way you don't
> need coherency, and don't pay for it. Anything in this class must by force
> be just hints, to be recomputed at a moment's notice. I.e., let the
> application who might use it check and recompute as needed.

(a) Without some kind of minimal hook, there is no way for the application
to check without re-reading all the files it has used for a
computation, starting at the moment where it's about to potentially
recompute the result.

Sometimes you can get away with stat() on the files (e.g. Evolution
and fontconfig do that), but often you can't because that's not
trustworthy enough (security checks, protocol optimisations,
transparent "as if I didn't cache anything" semantics), and sometimes
stat() calls are far too slow anyway (anything that involves a lot of files).

(The trustworthiness problem is likely to be solved, for some
applications, by an xattr or such which is guaranteed to be deleted or
changed when a file is modified. That is really a very minimal hook,
but it is one).

(b) Even with all the misgivings of stat(), you can't realistically
update real time displays, e.g. the lists of artists & albums on the
screen in your MP3 player whenever you modify anything in your music
collection using another program. At best, you have to expect the
display to not notice the change for a while (minutes in the mp3
case), or until you click an "Explicit Refresh" button. Neither
should be necessary.


It's true that the simplest coherency is one authoritative source
which you have to check all the time (although even that doesn't work
for some things). But checking all the time does rule out, due to its
high algorithmic complexity, a lot of very interesting applications
and even some interesting simple unixish tools which you might like,
such as real-time "locate".

-- Jamie

2004-09-02 17:44:36

by Dave Kleikamp

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 2004-09-02 at 11:11, Jamie Lokier wrote:

> Firstly, if I have to do it from a Gnome program, about the only
> program where looking in a tar file is visibly useful is Nautilus.
> Ironically, clicking on a tar file in Nautilus doesn't work, despite
> having a dependency on gnome-vfs2. :/

This should be fixed in Nautilus, not the kernel.

> Secondly, no, Gnome and MC don't support entering a container file,
> letting you make changes in it, and remembering those changes to
> _lazily_ regenerate the container file when you need it linearized,
> possibly months later or never, by some unrelated program.

Why do this in a tar file? tar = "tape archive". It isn't designed to
be a file system. Sure, it's nice to have tools that make it easier to
access files in a tar file, but this isn't a job for the kernel.
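
For reference, the lazy regeneration described in the quoted text can be sketched entirely in user space. This is a hypothetical helper, not an existing tool: the class name and its attributes are assumptions for illustration. It detects changes with a stat() snapshot, which is exactly the kind of check the thread argues is not fully reliable.

```python
import os
import tarfile


class LazyContainer:
    """Hypothetical helper: the unpacked tree is the working copy, and the
    .tar is rebuilt only when a linear copy is requested and is out of date.
    tar_path must live outside tree_dir."""

    def __init__(self, tree_dir, tar_path):
        self.tree_dir = tree_dir
        self.tar_path = tar_path
        self.rebuilds = 0      # how many times the archive was regenerated
        self._last = None      # snapshot taken at the last rebuild

    def _snapshot(self):
        # Cheap change detector: relative path, size and mtime of every file.
        snap = []
        for root, _dirs, files in os.walk(self.tree_dir):
            for name in files:
                full = os.path.join(root, name)
                st = os.stat(full)
                rel = os.path.relpath(full, self.tree_dir)
                snap.append((rel, st.st_size, st.st_mtime_ns))
        return sorted(snap)

    def linearize(self):
        """Return the tar path, regenerating the archive only if needed."""
        snap = self._snapshot()
        if snap != self._last or not os.path.exists(self.tar_path):
            with tarfile.open(self.tar_path, "w") as tar:
                tar.add(self.tree_dir, arcname=".")
            self._last = snap
            self.rebuilds += 1
        return self.tar_path
```

Calling linearize() twice without touching the tree rebuilds the archive once. What this sketch cannot do is make the regeneration transparent to an unrelated program that simply opens the .tar file.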

> Thirdly, you must be referring to the Gnome versions of Bash, Make,
> GCC, coreutils and Perl which I haven't found. Perhaps we have a
> different idea of what "supports this" means :)

Please don't tell me that we have expectations to run make from within a
tar file. This is getting silly. tar does a pretty good job of
extracting files into real directories, and putting them back into an
archive. I don't see a need to teach the kernel how to deal with
compound files when user space can do it very easily.

Shaggy

2004-09-02 17:48:07

by Linus Torvalds

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)



On Thu, 2 Sep 2004, Alan Cox wrote:
>
> On Mer, 2004-09-01 at 21:50, Linus Torvalds wrote:
> > and quite frankly, I think you can do the above pretty much totally in
> > user space with a small library and a daemon (in fact, ignoring security
> > issues you probably don't even need the daemon). And if you can prototype
> > it like that, and people actually find it useful, I suspect kernel support
> > for better performance might be possible.
>
> Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
> it since the late 1990's.

And nobody has asked for kernel support that I know of.

So either "it just works" in user space, or people haven't figured out the
kernel could help them. Or decided it's not worth it, exactly because
they'd still have to support systems/filesystems that can't be converted.

Linus

2004-09-02 17:55:29

by Christoph Hellwig

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 10:46:32AM -0700, Linus Torvalds wrote:
> > On Mer, 2004-09-01 at 21:50, Linus Torvalds wrote:
> > > and quite frankly, I think you can do the above pretty much totally in
> > > user space with a small library and a daemon (in fact, ignoring security
> > > issues you probably don't even need the daemon). And if you can prototype
> > > it like that, and people actually find it useful, I suspect kernel support
> > > for better performance might be possible.
> >
> > Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
> > it since the late 1990's.
>
> And nobody has asked for kernel support that I know of.
>
> So either "it just works" in user space, or people haven't figured out the
> kernel could help them. Or decided it's not worth it, exactly because
> they'd still have to support systems/filesystems that can't be converted.

http://oss.oracle.com/projects/userfs/ has code that glues gnomevfs onto
a kernel filesystem. The code is horrible, but it shows that it can
be done.

2004-09-02 18:06:43

by Linus Torvalds

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)



On Thu, 2 Sep 2004, Christoph Hellwig wrote:
>
> http://oss.oracle.com/projects/userfs/ has code that glues gnomevfs onto
> a kernel filesystem. The code is horrible, but it shows that it can
> be done.

I do like the setup where the extended features are done as a "view" on
top of some other filesystem, so that you can choose to _either_ access
the raw (and supposedly stable, simply by virtue of simplicity) or the
"fancy" interface. Without having to reformat the disk to a filesystem you
don't trust, or you have other reasons you can't use (disk sharing with
other systems, whatever).

It doesn't have to be "user", btw, in the sense that a lot of the normal
code could be in kernel mode. Same way as Tux handling all the regular
static requests entirely in kernel mode, but having the ability for
calling down to apache..

Linus

2004-09-02 18:24:30

by Christer Weinigel

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Jamie Lokier <[email protected]> writes:

> 1. Local Google (by which I mean a search engine on your local machine),
> Real-time (by which I mean the results are always up to date):
>
> Every file modified since last search is known to the query engine.
> This is a reality: BeOS does it; WinFS is expected to do it.
>
> Thus we have real-time local free text search engine, and other
> features like searching inside files and for file names. The
> point is the real-time nature of it: the results you get
> correspond to exactly the contents of the filesystem at the time of
> the query (writes which occur _during_ a query are obviously
> not coherent with this, but writes which complete before the
> query, even immediately before, appear in the results).

Can be done with dnotify/inotify and a cache daemon keeping track of
mtime. Yes, this will need a kernel change to make sure mtime always
changes when the file changes, but it does not require anything else.

> 2. MP3 player scanning artists & titles:

Same.

> 3. Email program scanning for subject lines fast:

Same here.

> 4. Blog server caching built pages:
> 5. Programming environment scanning for tags:
> 6. File transfer program scanning for shared deltas.

And so on.

/Christer

--
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se

2004-09-02 19:52:41

by Alan

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Iau, 2004-09-02 at 18:46, Linus Torvalds wrote:
> > Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
> > it since the late 1990's.
>
> And nobody has asked for kernel support that I know of.

I asked our desktop people. They want something like inotify because
dnotify doesn't cut it. They have zero interest in the multiple streams
and hiding icons in streams type stuff.

Alan

2004-09-02 19:56:44

by Spam

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)






> On Thu, 2 Sep 2004, Alan Cox wrote:
>>
>> On Mer, 2004-09-01 at 21:50, Linus Torvalds wrote:
>> > and quite frankly, I think you can do the above pretty much totally in
>> > user space with a small library and a daemon (in fact, ignoring security
>> > issues you probably don't even need the daemon). And if you can prototype
>> > it like that, and people actually find it useful, I suspect kernel support
>> > for better performance might be possible.
>>
>> Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
>> it since the late 1990's.

> And nobody has asked for kernel support that I know of.

Actually, it doesn't matter to the users whether it is in the kernel or
not, as long as it works.

The problem is that I do not see Gnome and KDE ever getting
along to form one standard that everyone will use. Their libraries
are huge and memory-hogging, which many Linux users just do not
like. What if a user doesn't want KDE or Gnome? Would all files
created with either be broken?

I doubt that something like file streams and meta-data can
successfully be implemented purely in user-space and get the same
support (i.e. be used by many programs) if this change doesn't come
from the kernel. I just do not see it happening.

> So either "it just works" in user space, or people haven't figured out the
> kernel could help them. Or decided it's not worth it, exactly because
> they'd still have to support systems/filesystems that can't be converted.

> Linus

2004-09-02 21:04:25

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 10:46:05AM +0100, Alan Cox wrote:
> On Mer, 2004-09-01 at 21:50, Linus Torvalds wrote:
> > and quite frankly, I think you can do the above pretty much totally in
> > user space with a small library and a daemon (in fact, ignoring security
> > issues you probably don't even need the daemon). And if you can prototype
> > it like that, and people actually find it useful, I suspect kernel support
> > for better performance might be possible.
>
> Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
> it since the late 1990's.

Can it do this:

cd FC2-i386-disc1.iso
ls

or this:

cd /dev/sda1
ls
cd /dev/floppy
ls
cd /dev/cdrom
ls

?

--
Frank

2004-09-02 21:41:59

by Dave Kleikamp

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 2004-09-02 at 15:38, Frank van Maarseveen wrote:
> Can it do this:
>
> cd FC2-i386-disc1.iso
> ls
>
> or this:
>
> cd /dev/sda1
> ls
> cd /dev/floppy
> ls
> cd /dev/cdrom
> ls
>
> ?

We have the mount command for that. :^)
--
David Kleikamp
IBM Linux Technology Center

2004-09-02 21:50:30

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 04:36:34PM -0500, Dave Kleikamp wrote:
> On Thu, 2004-09-02 at 15:38, Frank van Maarseveen wrote:
> > Can it do this:
> >
> > cd FC2-i386-disc1.iso
> > ls
> >
> > or this:
> >
> > cd /dev/sda1
> > ls
> > cd /dev/floppy
> > ls
> > cd /dev/cdrom
> > ls
> >
> > ?
>
> We have the mount command for that. :^)

mount is nice for root, clumsy for user. And a rather complicated
way of accessing data the kernel has knowledge about in the first
place. For filesystem images, cd'ing into the file is the most
obvious concept for file-as-a-dir IMHO.

--
Frank

2004-09-02 21:53:33

by Jamie Lokier

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Christer Weinigel wrote:
> Can be done with dnotify/inotify and a cache daemon keeping track of
> mtime. Yes, this will need a kernel change to make sure mtime always
> changes when the file changes, but it does not require anything else.

- Can the daemon keep track of _every_ file on my disk like this?
That's more than a million files, and about 10^5 directories.
dnotify would require the daemon to open all the directories.
I'm not sure what inotify offers.

- What happens at reboot - I guess the daemon has to call stat()
on every file to verify its indexes? Have you any idea how long
it takes to call stat() on every file in my home directory?

- The ordering problem: I write to a file, then the program
returns. System is very busy compiling. 2 minutes later, I
execute a search query. The file I wrote two minutes ago doesn't
appear in the search results. What's wrong?

Due to scheduling, the daemon hasn't caught up yet. Ok, we can
accept that's just hard life. Sometimes it takes a while for
something I write to appear in search results.

But! That means I can't use these optimised queries as drop-in
replacements for calling grep and find, or for making Make-like
programs run faster (by eliminating parsing and stat() calls).
That's a shame, it would have been nice to have a mechanism that
could transparently optimise programs that do calculations....

Do you see what I'm getting at? There's building some nice GUI
and search engine like functionality, where changes made by one
program _eventually_ show up in another (i.e. not synchronously).

That's easy.

And then there's optimising things like grep, find, perl, gcc,
make, httpd, rsync, in a way that's semantically transparent, but
executes faster _as if_ they had recalculated everything they
need to every time. That's harder.
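
To make the reboot point concrete, here is a minimal sketch of what a user-space daemon would have to do to re-verify its index after missing notifications. The function names are illustrative only. Note that the work is proportional to the total number of files, not to the number of files that changed.

```python
import os


def scan(root):
    """Walk the tree and stat every file: one lstat() per file, every time."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            index[path] = (st.st_mtime_ns, st.st_size)
    return index


def diff(old, new):
    """Compare two scans and report what the daemon needs to re-index."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed
```

A daemon would persist the result of scan() before shutdown, run it again at startup, and feed diff() to its indexer. On a tree with a million files that is a million lstat() calls before the first query can be answered with confidence.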

> > 3. Email program scanning for subject lines fast:
>
> Same here.
>
> > 4. Blog server caching built pages:
> > 5. Programming environment scanning for tags:
> > 6. File transfer program scanning for shared deltas.
>
> And so on.

No, not 3, 4 or 6. For correct behaviour those require synchronous
query results. Think about 6, where one important cached query is
"what is the MD5 sum of this file", and another critical one, which
can only work through indexing, is "give me the name of any file whose
MD5 sum matches $A_SPECIFIC_MD5". Trusting the async results for
those kind of queries from your daemon would occasionally result in
data loss due to race conditions. So you wouldn't trust the async
results, and you fail to get those CPU-saving and bandwidth-saving
optimisations.
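
A small sketch of the two MD5 queries may help show where the race sits. The class and method names are assumptions for illustration, not part of any existing tool. The unverified lookup is the fast path that an asynchronous daemon could offer; the verified lookup re-reads each candidate, which gives up most of the saving and still leaves a window between the check and the use.

```python
import hashlib
import os


def md5_of(path):
    """Compute the MD5 of a file by reading it in blocks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()


class Md5Index:
    """Hypothetical index answering both queries from the text."""

    def __init__(self):
        self.by_path = {}   # "what is the MD5 sum of this file"
        self.by_sum = {}    # "name any file whose MD5 sum matches"

    def add(self, path):
        old = self.by_path.get(path)
        if old is not None:
            self.by_sum[old].discard(path)   # drop the outdated mapping
        digest = md5_of(path)
        self.by_path[path] = digest
        self.by_sum.setdefault(digest, set()).add(path)

    def lookup_unverified(self, digest):
        """Fast, but may name a file whose contents have since changed."""
        return sorted(self.by_sum.get(digest, ()))

    def lookup_verified(self, digest):
        """Re-reads every candidate; narrows the race but does not close it."""
        return [p for p in self.lookup_unverified(digest)
                if os.path.exists(p) and md5_of(p) == digest]
```

If a file is rewritten after add() and before the index is refreshed, lookup_unverified() still returns it for the old sum. A transfer program that trusted that answer would send the wrong data.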

-- Jamie

2004-09-02 22:02:17

by Bill Huey

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 07:46:05PM +0100, Alan Cox wrote:
> On Iau, 2004-09-02 at 18:46, Linus Torvalds wrote:
> > > Gnome already supports this in the gnome-vfs2 layer. "MC" has supported
> > > it since the late 1990's.
> >
> > And nobody has asked for kernel support that I know of.
>
> I asked our desktop people. They want something like inotify because
> dnotify doesn't cut it. They have zero interest in the multiple streams
> and hiding icons in streams type stuff.

It also depends on who you ask. I can't take a lot of the mainstream
X folks seriously since they are still using integer math as parameters
to half-broken drawing primitives and have barely discovered things like OpenGL.
Their attitude doesn't treat these things as first-class citizens in
whatever software system they create. They also haven't created a modern
and highly dynamic structured document system that's in wide use yet,
so this problem space hasn't really been pushed as hard as in other, much
more dynamic systems. And the advent of XML (basically a primitive and
flat model of what Hans is doing) for .NET-style systems is going to
push these systems into those areas in new and unique ways. (Actually
retro Smalltalk-ish.)

It seems that many of the original ideas about "why" GUI systems exist
have been lost to older commercial interests (Microsoft Win32) and that
has wiped out the fundamental classic computer science backing this from
history. This simple "MP3 metadata" stuff is a very superficial example
of how something like this is used.

The problems are fundamentally about data representation in a manner so
simple that its "expressive power" (Hans here) can extend itself to even
the dorkiest of shell scripts. To have that power immediately available
as network/local objects and to have their relationships clearly defined
is a very powerful manner to build software systems.

Unix folks tend to forget that since they either have never done this
kind of programming or never understood why this existed in the first
place. It's about a top-down methodology affecting the entire design of
the software system, not just Unix purity. If it can be integrated
smoothly into the system, then it should be, IMO.

The folks against this system forget how important the context that
all this work is set in really is... The mindset is fundamentally different and
I'm quite sick of hearing "It's not Unix" over and over again. And the
notion of Linux being marginalized to a minority OS over this stuff is
just plain crazy.

bill

2004-09-02 22:03:26

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 11:00:27PM +0100, Al Viro wrote:
>
> The hell it is.
>
> a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.

excuse me, but how does the kernel mount the root fs?

--
Frank

2004-09-02 22:03:07

by Al Viro

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 11:48:06PM +0200, Frank van Maarseveen wrote:
> mount is nice for root, clumsy for user. And a rather complicated
> way of accessing data the kernel has knowledge about in the first
> place. For filesystem images, cd'ing into the file is the most
> obvious concept for file-as-a-dir IMHO.

The hell it is.

a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
b) kernel has no way to guess which options to use
c) fs _type_ is a fundamental part of mount - device(s) (if any) involved
are arguments to be interpreted by that particular fs driver.
d) permissions required for that lovely operation (and questions like
whether we force nosuid/noexec, etc.) are nightmare to define.

Frankly, the longer that thread grows, the more obvious it becomes that
file-as-a-dir is a solution in search of problem. Desperate search, at
that.

2004-09-02 22:10:42

by Christoph Hellwig

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 12:02:42AM +0200, Frank van Maarseveen wrote:
> On Thu, Sep 02, 2004 at 11:00:27PM +0100, Al Viro wrote:
> >
> > The hell it is.
> >
> > a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
>
> excuse me, but how does the kernel mount the root fs?

trial and error. That's why you see all those ext3 mounted as ext2
problems, or $RANDOMFS as fat.

2004-09-02 22:28:09

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 11:17:22PM +0100, [email protected] wrote:
>
> What knowledge does the kernel have about fs type that could deal with the
> contents of given device? Details, please.

Try a "make tags;grep SUPER_MAGIC tags".
Or is it there for a different purpose?

--
Frank

2004-09-02 22:35:41

by Al Viro

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 12:26:50AM +0200, Frank van Maarseveen wrote:
> On Thu, Sep 02, 2004 at 11:17:22PM +0100, [email protected] wrote:
> >
> > What knowledge does the kernel have about fs type that could deal with the
> > contents of given device? Details, please.
>
> Try a "make tags;grep SUPER_MAGIC tags".
> Or is it there for a different purpose?

RTFS and you'll see. Individual fs generally knows how to check if it
would be immediately unhappy with given image (not all types do, BTW).
Exact form of checks depends on fs type; for crying out loud, there's
not even a promise that they are mutually exclusive!

Read the fucking source. Read through the code that "chooses" fs type
of root fs. Look at it. Then use whatever you have between your ears.
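
To illustrate why signature checks do not amount to knowledge of the fs type, here is a simplified user-space sketch in Python. It is not the kernel's code, and the real drivers perform more checks at mount time. The offsets used are the well-known ones: the ext2/ext3 magic 0xEF53 at byte 1080, the ISO 9660 identifier "CD001" at byte 32769, and the 0x55 0xAA boot-sector signature at byte 510. The probe() function itself is made up for this example.

```python
import struct


def probe(image):
    """Return every filesystem type whose signature check passes on `image`."""
    matches = []
    # ext2/ext3: superblock starts at byte 1024, with a 16-bit little-endian
    # magic 0xEF53 at offset 56 inside it. ext2 and ext3 share this magic.
    if len(image) >= 1024 + 58:
        (magic,) = struct.unpack_from("<H", image, 1024 + 56)
        if magic == 0xEF53:
            matches.append("ext2/ext3")
    # ISO 9660: volume descriptor at sector 16 (2048-byte sectors), with the
    # identifier "CD001" starting one byte into it.
    if image[32769:32774] == b"CD001":
        matches.append("iso9660")
    # FAT, like any PC boot sector, ends its first sector with 0x55 0xAA.
    if image[510:512] == b"\x55\xaa":
        matches.append("fat?")
    return matches
```

Nothing stops a single image from passing all three checks at once, and the ext2/ext3 check cannot tell the two apart. That is the "not even a promise that they are mutually exclusive" problem in miniature.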

2004-09-02 23:02:46

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 11:33:24PM +0100, Al Viro wrote:
>
> RTFS and you'll see. Individual fs generally knows how to check if it
> would be immediately unhappy with given image (not all types do, BTW).
> Exact form of checks depends on fs type; for crying out loud, there's
> not even a promise that they are mutually exclusive!

so?

A user can stick in a USB memory card with _any_ malformed fs data and
cause trouble via the automounter or user mounts. Yes, mount might do
some more checks but it sure won't do an fsck.

The user gets what he deserves when sticking crap in a USB port.

And that doesn't mean that the kernel should accept any fs image
when a user tries to cd into the file.

--
Frank

2004-09-02 23:06:56

by Al Viro

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 12:56:34AM +0200, Frank van Maarseveen wrote:
> And that doesn't mean that the kernel should accept any fs image
> when a user tries to cd into the file.

No, it shouldn't. It should say "it's not a directory, bugger off".
Which it does.

2004-09-02 23:23:44

by Valdis Klētnieks

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 02 Sep 2004 22:38:54 +0200, Frank van Maarseveen said:

> Can it do this:
>
> cd FC2-i386-disc1.iso
> ls

That one's at least theoretically doable, assuming that it really *IS* the
Fedora Core disk and an ISO9660 format...

> cd /dev/cdrom
> ls

And the CD in the drive at the moment is AC/DC "Back in Black". What
should this produce as output?



2004-09-02 23:35:17

by Alan

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Iau, 2004-09-02 at 22:47, Jamie Lokier wrote:
> - Can the daemon keep track of _every_ file on my disk like this?
> That's more than a million files, and about 10^5 directories.
> dnotify would require the daemon to open all the directories.
> I'm not sure what inotify offers.

This is currently a real issue for both desktop search and for virus
scanners. They want a "what changed and where" system wide (or at least
per namespace/mount).


2004-09-02 23:37:10

by Paul Jakma

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 2 Sep 2004, Jamie Lokier wrote:

> Firstly, if I have to do it from a Gnome program, about the only
> program where looking in a tar file is visibly useful is Nautilus.
> Ironically, clicking on a tar file in Nautilus doesn't work,
> despite having a dependency on gnome-vfs2. :/

Do you have file-roller installed?

I can open tar/zip/rar/etc.. files from anywhere in gnome2, eg in
Galeon I can click on a tar.gz URL (http or whatever) and have it
open it in file-roller, from where i can browse the files to my
hearts content.

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
Why isn't there a special name for the tops of your feet?
-- Lily Tomlin

2004-09-02 23:48:09

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 07:19:47PM -0400, Valdis Klētnieks wrote:
>
> > cd FC2-i386-disc1.iso
> > ls
>
> That one's at least theoretically doable, assuming that it really *IS* the
> Fedora Core disk and an ISO9660 format...

Impressive.

But when it comes to file-systems like ext[23] I think an in-kernel
solution might be preferable to get the exact semantics wrt locking,
atomicity, synchronization, coherency, whatever a kernel is good at
for filesystems.

> > cd /dev/cdrom
> > ls
>
> And the CD in the drive at the moment is AC/DC "Back in Black". What
> should this produce as output?

bash: cd: /dev/cdrom: Not a directory

When it doesn't match a (sub)set of known file system types I think.
Because that is the area the kernel has knowledge about. Nothing
else, no tarballs for example unless of course the kernel has "tarfs" :-)

--
Frank

2004-09-02 23:53:24

by Spam

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> On Thu, 2 Sep 2004, Jamie Lokier wrote:

>> Firstly, if I have to do it from a Gnome program, about the only
>> program where looking in a tar file is visibly useful is Nautilus.
>> Ironically, clicking on a tar file in Nautilus doesn't work,
>> despite having a dependency on gnome-vfs2. :/

> Do you have file-roller installed?

> I can open tar/zip/rar/etc.. files from anywhere in gnome2, eg in
> Galeon I can click on a tar.gz URL (http or whatever) and have it
> open it in file-roller, from where i can browse the files to my
> hearts content.

But can you actually do things with these files? Can you run
applications or edit files directly, or is there need for temporary
unzip first?

~S

> regards,

2004-09-02 23:54:46

by Alan

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Iau, 2004-09-02 at 22:56, Bill Huey wrote:
> It also depends on who you ask. I can't take a lot of the mainstream
> X folks seriously since they are still using integer math as parameters

The X folks know what they are doing. Modern X has a complete
compositing model. Modern X has a superb font handling system. Nobody
broke anything along the way. The new API's can be mixed with the old,
there are good fallbacks for old servers.

In fact they are so good at it that most people don't notice beyond the
fact their UI looks better than before.

That is how you do change *right*

> more dynamic systems. And the advent of XML (basically a primitive and
> flat model of what Hans is doing) for .NET-style systems is going to

I see you don't really get XML either. XML is just an encoding. It's
larger and prettier than ASN.1 and easier to hack about with perl. You
can do the same thing with lisp lists for that matter.

> have been lost to older commercial interests (Microsoft Win32) and that
> has wiped out the fundamental classic computer science backing this from
> history. This simple "MP3 metadata" stuff is a very superficial example
> of how something like this is used.

The trouble with computer science is that most of it sucks in the real
world. We don't write our OS's in Standard ML, we don't implement some
of the provably secure capability computing models. At the end of the
day they are neat, elegant and useless to real people.

> Unix folks tend to forget that since they either have never done this
> kind of programming or never understood why this existed in the first
> place. It's about a top-down methodology affecting the entire design of
> the software system, not just Unix purity. If it can be integrated
> smoothly into the system, then it should be, IMO.

The Unix world succeeded because Unix (at least in v7 days) was the
other way around to every other grungy OS on the planet. It had only
the things it needed. I've used some of the grungy crawly horrors that
were its rivals and there is a reason they don't exist any more.

I would sum up the essence of the unix kernel side as
- Does only what it must do
- "Makes the usual easy, makes the unusual possible"
- Has an API that is small enough for developers to learn
easily (an API so good every other OS promptly ripped it off)
People forget the worlds of SYS$QIO, RMS, FCB's and the like

It's worked remarkably well for a very very long time, and most of the
nasties have come from people trying to break that model or not
understanding it.

Alan

2004-09-03 00:02:16

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 01:43:02AM +0200, Spam wrote:
>
> Yes why not? If there was any filesystem drivers for the AudioCD
> format then it could.
>
> I had such a driver for Windows 9x which would display several
> folders and files for inserted AudioCD's:
>
> D: (cdrom)
> Stereo
> 22050
> Track01.wav
> Track02.wav
> ...
> 44100
> Track01.wav
> ...
...
>
> If you just want to do a cd file.iso then it may be a totally
> different thing. Either you would have an automount feature or a
> filesystem/vfs plugin that could load secondary modules to support
> this kind of thing.

Why is this so different, compared to your example? They are both
filesystem drivers.

--
Frank

2004-09-02 23:48:08

by Spam

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> On Thu, 02 Sep 2004 22:38:54 +0200, Frank van Maarseveen said:

>> Can it do this:
>>
>> cd FC2-i386-disc1.iso
>> ls

> That one's at least theoretically doable, assuming that it really *IS* the
> Fedora Core disk and an ISO9660 format...

>> cd /dev/cdrom
>> ls

> And the CD in the drive at the moment is AC/DC "Back in Black". What
> should this produce as output?

Yes why not? If there was any filesystem drivers for the AudioCD
format then it could.

I had such a driver for Windows 9x which would display several
folders and files for inserted AudioCD's:

D: (cdrom)
Stereo
22050
Track01.wav
Track02.wav
...
44100
Track01.wav
...
Mono
22050
Track01.wav
...
44100
Track01.wav
...

Normal AudioCD players would also work even though this driver was
installed. These files were also visible for legacy applications in
the command prompt (inside Windows).

I do not see why this would not be possible in Linux. Of course, it
would perhaps require a filesystem driver/module to be present when
you mount.

If you just want to do a cd file.iso then it may be a totally
different thing. Either you would have an automount feature or a
filesystem/vfs plugin that could load secondary modules to support
this kind of thing.

~S

2004-09-03 00:16:58

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Al Viro wrote:
| On Thu, Sep 02, 2004 at 11:48:06PM +0200, Frank van Maarseveen wrote:
|
|>mount is nice for root, clumsy for user. And a rather complicated
|>way of accessing data the kernel has knowledge about in the first
|>place. For filesystem images, cd'ing into the file is the most
|>obvious concept for file-as-a-dir IMHO.
|
|
| The hell it is.
|
| a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.

reiser4 kernel will contain knowledge of fs type contained in a file.

"file/..metas/type" might contain a mime type. Mime type might have to
be guessed, but at least if it's made by a local "mkisofs" then we're fine.

Indeed, that's not the only interface that's been discussed.
"file/..metas/is_isofs" might be consulted.


| b) kernel has no way to guess which options to use
| c) fs _type_ is a fundamental part of mount - device(s) (if any) involved
| are arguments to be interpreted by that particular fs driver.

Unless there are some severe security issues with "mount this iso as fat
and you get root access", this should work fine.

I see no reason why there can't be a global setting of the mount
commandline to use.

And it doesn't all have to be in the kernel. Only it'd be nice to have
some of it there because the kernel knows how to deal with an isofs,
even if it won't know what it looks like.

| d) permissions required for that lovely operation (and questions like
| whether we force nosuid/noexec, etc.) are nightmare to define.

They are quite simple, actually. Just set them globally -- some admins
would force nosetuid/noexec, some wouldn't. And the operation happens
transparently -- you need no "permissions" other than to read the
directory which would contain the mount.

| Frankly, the longer that thread grows, the more obvious it becomes that
| file-as-a-dir is a solution in search of problem. Desperate search, at
| that.

Actually, the longer this thread grows, the more obvious it is that when
there's a hot issue, everyone has an opinion, even if the same opinion
has already been expressed ten or twenty times.

File-as-a-dir has numerous advantages, but enough have been discussed.
Short list is image mounts, tarballs, streams, metas, and namespace
unification. Longer list and explanations can be found if you RTFA.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTe19XgHNmZLgCUhAQJdow/+Knsw1GgpauqDUcg8sKtxzgXZ18OxMQ3Q
By8sRrSTuKAzI5A3BtYIzndsj1veP+7wndG7nYPz8NS1fU2+xWSIhoGq/YMaQsu4
70uMLu448PFXZua4hZMk2w4mkULXbGyYHJ1Bf+2Z7QkQ/8W08hozC8QQynxMXIkX
SrcWCS5hK8Nh7Ol691sDpPqexH7F1GwUyoslNGj63U5r6ViLAawt2ZKDYdT7ZPo8
0a/pWUHoHMPbv/KwqZZxRr1/qncA9QYQo6JqQBPPCr+tWNJs/ei3nAKGi58iOt1M
DK1TEKd2lpbmwiK5pWDwGz+nwWmaFTAyfTEEEcP4gZedSJtRXaxyNh0jRl1iLATB
SCO5Eb4jkQs8hdjHqQcQ1q7XKFX9eSXWeDdrGrtWaYC/QYOHxT+ci3lnKBKCG99Y
YTqg3sNEZlV1N0jIcNvFSDEYbbX12v1Y6xbwvUx48+sMyUj3suT76niTRbwEydfO
MA9y+wE2k4wF+h+sJCbTjimCNFvvuFTTJuBCbQTpfY4eOYBAFalxnWmrpTEfVzka
4iAqAYygWObGDkFFy/rp1HEVZPIKM0NwGLOsRwJsgyUMOsccBrEc0bg8sgMECVfs
5qNIb27tLokh8NBR6RodAv2NZYKC+foM0T+PC5bZMFD/Q7f6yDklqK4C4RCIYaXj
xO9z1C6FPcM=
=/gmK
-----END PGP SIGNATURE-----

2004-09-03 00:16:56

by Spam

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> On Gwe, 2004-09-03 at 00:49, Spam wrote:
>> But can you actually do things with these files? Can you run
>> applications or edit files directly, or is there need for temporary
>> unzip first?

> You always need that for zip files. Firstly because executables are
> paged so you need an accessible random access copy of the bits. Secondly
> because data may be paged, and also for seek performance.

Yes, some archive types can't be partially unzipped either. But my
point is that it wouldn't be transparent to the application/user in
the same way.

~S

2004-09-03 00:20:29

by Paul Jakma

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, 3 Sep 2004, Spam wrote:

> Yes, some archive types can't be partially unzipped either. But my
> point is that it wouldn't be transparent to the application/user in
> the same way.

It doesn't matter whether it is transparent to the application. It can
be the application which implements the required level of
transparency.

The user doesn't care what provides the transparency or how it's
implemented.

> ~S

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
A committee is a group that keeps the minutes and loses hours.
-- Milton Berle

2004-09-03 00:22:04

by Linus Torvalds

Subject: Re: The argument for fs assistance in handling archives



On Thu, 2 Sep 2004, David Masover wrote:
>
> reiser4 kernel will contain knowledge of fs type contained in a file.

That's a disaster, btw.

There is no one "fs type" of a file. Files have at _least_ one type
(bytestream), but most have more. Which is why automatically doing the
right thing (in the sense you seem to want) in kernel space is simply not
possible.

Linus

2004-09-03 00:29:12

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Spam wrote:
[...]
| I doubt that something like file streams and meta-data can
| successfully be implemented purely in user-space and get the same
| support (ie be used by many programs) if this change doesn't come
| from the kernel. I just do not see it happen.

The issue is not "many programs". The issue is "all programs".

Even if the political issues were solved -- even if Linus said to us all
"Thou shalt use this library or suffer my wrath!" -- it'd have to be
everywhere. Bash. Perl. Make. Gcc. Vim.

And btw, if it was political, you'd get no sympathy here. ("Oh no,
everyone's using their own game engine! We need to put the doom3 engine
in the kernel, now!") No, _people_ solve political problems. Kernels
don't.

Kernel support automatically adds support for a lot of features without
patching a thing.

There should be an interface in the filesystem. And for certain things,
uservfs will be incredibly slower than reiser4 as that interface.
(Remember Linus' point about TUX and Apache.)

I'll say this again: most of it -- all but the bare interface stuff --
should be in userspace. In fact, let's all add this to our signature so
no one brings it up again. ("Oh no, they want to put tar support in the
kernel!" no, we don't.)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTe4BXgHNmZLgCUhAQKH1g//WXaL5xdLpX37TdgFhXidiRJGe/ojehj3
CZ7kseI9GkOBSxHt/yb5/xC6r+XT7JLlvJZybT7HLIRIxGp+WQHHBOD/xezWC+eX
OyRaOlkZ7o9HRQKhKNRIAwI4jgftpLhFUhePgibubS4UdxtzN2FWuULfKvMKIGHn
L4Zv4Dpje5ld7l7ce8jhfcURJ7AgAPwja3Tc7C38pmG+dSo2mj0I+YlCUED7mx3R
ZSv6WtdAUCZjnKv9hSQVruk3fjYZc4dLEGzGH1ZJsD1ZkH5wNmWds5gHGEvQrc4Z
9reNanTxy+0ECxndk2H/ukw5Wv011rJWubLy/CnaPakPrSvrsmmoEs8ZcVZavlg9
ABJX/NtyBVl/y8+6Eh6/BdAhQr30U+c/UZLNbOflmcPGPiJCiXfBuaX1OF+qffQ1
QQvAGPgO2R9egHJWqFhBaLHtBAmiXSRWUU4+4nPBYZ/X5dCmGuV46knQGHdqoAQc
l/qILh+spY09q9g118QbdnXBseiuVh/a+vf2GrbxbEMuWQu1kAI0DJbN0KKgUdtE
ZkmIqXYULO6QZsYk3L41ZyKyE7oFMUqbT0uxSQZUCjOcnuBpMn/PzwM0yMJeDUKx
295Yzq5lkqkGmHJHi7XGOOI5XVIPb++DWXXv9E6Bfgoj4TSZscfwM69PQRoSq0hu
iL9VOiyLRBU=
=zrvZ
-----END PGP SIGNATURE-----

2004-09-03 00:42:55

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Kleikamp wrote:
[...]
| Please don't tell me that we have expectations to run make from within a
| tar file. This is getting silly. tar does a pretty good job of
| extracting files into real directories, and putting them back into an
| archive. I don't see a need to teach the kernel how to deal with
| compound files when user space can do it very easily.

Suppose I've got a tar file with an index attached. Suppose it's
something like /usr/src/linux. Am I expected to extract all code for
all architectures, with all drivers, all docs, etc? Now, yes -- or I
have to figure out exactly which ones I need before I extract them
manually, one by one.

But with tar support for make (and so on), files can be extracted on
demand. It's possible to do this in userspace, with named pipes, but
that's much slower and insanely clumsy.
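The on-demand extraction described above can be sketched in userspace. This is a minimal illustration, not the reiser4 plugin interface: it assumes an uncompressed tar with no sparse members, and the helper names `build_index` and `read_on_demand` are invented for the example. The index is built once; afterwards a single member is fetched by seeking straight to its data.

```python
import tarfile

def build_index(archive):
    """Map member name -> (data offset, size) for an uncompressed tar."""
    index = {}
    # Mode "r:" opens for reading without compression and rejects compressed input.
    with tarfile.open(archive, "r:") as tar:
        for info in tar:
            if info.isfile():
                index[info.name] = (info.offset_data, info.size)
    return index

def read_on_demand(archive, index, name):
    """Fetch one member by seeking to it, without scanning the whole archive."""
    offset, size = index[name]
    with open(archive, "rb") as f:
        f.seek(offset)
        return f.read(size)
```

With a kernel-tree-sized archive, only the members a build actually touches would be read this way; the index itself still costs one full pass unless it is shipped with the archive.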

This has further implications -- imagine a desktop, binary distro
shipped with all files except the very most basic stuff as package
archives. They can all be extracted, on demand -- the first time I run
OpenOffice.org, it's installed. If there needs to be post-installation,
that's handled by the .deb plugin (or whatever).

I don't know offhand how big OOo is. I think it's something like this:
~ The installer is at most half (and maybe only a third) of the full
installation. That's a HUGE optimization!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTe5/HgHNmZLgCUhAQI5vQ/+NyU/tbW1Dyaf/OlDUEScH8jHghdcMPQQ
qcyBbzid9hMT0pm4fRX4CQJ/vm+VLhfvYzEmgRUCyNY3JybCKeS/EynRt/ybdblu
aB+hO8meFitBmAa7kYrj1UhWvoSvDSZgAwC9k50DYPuQO1kVZFjFYcPee1P54iwJ
UMn9RE01aeufCt1+jWFxhsEZKfNWvXDCaQtqa483A2AWWzklwF25ZW2kSfp6G+i0
g1jND8pPDkQcP8ujGTuDxEI8LsN62glNzVZ8MhPa65lZI1vO5Ll2dDL2QKgNwziK
MqtMMJD1d3HWa7QBHwMegJ0teR/hiqJ62SgQr3QpW4Xy9Ss0VUVH1HNuhxwPB2rl
YYomqw2yO/GGSDs5XuXm/cRM5E9d+nvu1V8bsrSa5LK/64Vlp6huLkLNvOZ3y6vK
38ELPBxbmIA3iWTgaYDPANX/vrpnA0K8JQU9M4LMveaHhxfEcDbH+iZHtpjsYqF3
allfHH2SEZRFlXGxKBNZsXTcrudAHjoyEOQ+UiI9QLCM83G4bFGr1WEGOEHmD0ry
hBETe8GkwuQK1CfxFm5obgFUmE4TwVRIWVD71EvoJFuBS+dlezO6GOZ5mDf91tSe
goPS2f/9XKwyYfOnEnnfXT17k5SYjTB9m0upi6q7dJpxvg1535E6N1nCqdLZzTcJ
mVkdhfFsUqI=
=YjJH
-----END PGP SIGNATURE-----

2004-09-03 00:51:51

by Al Viro

Subject: Re: The argument for fs assistance in handling archives

On Thu, Sep 02, 2004 at 07:08:21PM -0500, David Masover wrote:
> | a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
>
> reiser4 kernel will contain knowledge of fs type contained in a file.

Right. And when I decide to talk to delusional nutcases I will go to
talk.origins and find a creationist to chat with.

Time to extend the killfile...

*plonk*

2004-09-03 00:51:50

by Spam

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> On Fri, 3 Sep 2004, Spam wrote:

>> Yes, some archive types can't be partially unzipped either. But my
>> point is that it wouldn't be transparent to the application/user in
>> the same way.

> It doesn't matter whether it is transparent to the application. It can
> be the application which implements the required level of
> transparency.

> The user doesn't care what provides the transparency or how it's
> implemented.

Indeed. I hope I didn't say otherwise :). Just that I think it will
be very difficult to have this transparency in all apps. Just
thinking of "nano file.jpg/description.txt" or "ls
file.tar/untar/*.doc". Sure in some environments like Gnome it could
work, but it still doesn't for the rest of the flora of Linux
programs.



~S

> regards,

2004-09-03 01:12:44

by Linh Dang

Subject: Re: The argument for fs assistance in handling archives

Linus Torvalds <[email protected]> wrote:

>
>
> On Thu, 2 Sep 2004, Christoph Hellwig wrote:
>>
>> http://oss.oracle.com/projects/userfs/ has code that clues gnomevfs
>> onto a kernel filesystem. The code is horrible, but it shows that
>> it can be done.
>
> I do like the setup where the extended features are done as a "view"
> on top of some other filesystem, so that you can choose to _either_
> access the raw (and supposedly stable, simply by virtue of
> simplicity) or the "fancy" interface. Without having to reformat the
> disk to a filesystem you don't trust, or you have other reasons you
> can't use (disk sharing with other systems, whatever).

It'd be something similar to what clearcase does (not that I like
clearcase, I hate it with a passion for other reasons!)

On such a system, one would have multiple virtual views mounted (by
root) under:

/view/tar, /view/dpkg, /view/rpm, etc.

for every regular file /home/joe/blah.tar

the path /view/tar/home/joe/blah.tar/ is a directory where the members of
the archive are directly accessible.

Old tools continue to work as is. New tools can look at the virtual
views for virtual access.

Not sure how such a system would work with the dentry cache.
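The path mapping behind such a view can be sketched in a few lines. This is only an illustration under stated assumptions: archives are recognised by a `.tar` suffix (a real view would stat each path component instead), and `split_view_path` and `read_member` are hypothetical helper names, not part of any existing tool.

```python
import tarfile

VIEW_ROOT = "/view/tar"  # mount point taken from the message above

def split_view_path(view_path):
    """Split '/view/tar/<archive path>/<member>' into (archive, member)."""
    if not view_path.startswith(VIEW_ROOT + "/"):
        raise ValueError("not under " + VIEW_ROOT)
    rest = view_path[len(VIEW_ROOT):]   # e.g. /home/joe/blah.tar/docs/a.txt
    parts = rest.split("/")
    for i, part in enumerate(parts):
        if part.endswith(".tar"):       # assumption: suffix identifies the archive
            archive = "/".join(parts[:i + 1])
            member = "/".join(parts[i + 1:])
            return archive, member
    raise ValueError("no archive component in path")

def read_member(archive, member):
    """Return the bytes of one regular-file member of the archive."""
    with tarfile.open(archive) as tar:
        f = tar.extractfile(member)
        if f is None:                   # directories and special files have no data
            raise IsADirectoryError(member)
        return f.read()
```

For example, `split_view_path("/view/tar/home/joe/blah.tar/docs/a.txt")` yields the regular file `/home/joe/blah.tar` and the member `docs/a.txt`, so the raw file and the view stay separate names for the same data.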

--
Linh Dang

2004-09-03 01:12:21

by Spam

Subject: Re: The argument for fs assistance in handling archives




> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1

> [email protected] wrote:
> | On Thu, Sep 02, 2004 at 11:48:06PM +0200, Frank van Maarseveen wrote:
> |
> |>mount is nice for root, clumsy for user. And a rather complicated
> |>way of accessing data the kernel has knowledge about in the first
> |>place. For filesystem images, cd'ing into the file is the most
> |>obvious concept for file-as-a-dir IMHO.
> |
> |
> | The hell it is.
> |
> | a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.

> reiser4 kernel will contain knowledge of fs type contained in a file.

> "file/..metas/type" might contain a mime type. Mime type might have to
> be guessed, but at least if it's made by a local "mkisofs" then we're fine.

> Indeed, that's not the only interface that's been discussed.
> "file/..metas/is_isofs" might be consulted.

What you are talking about isn't the kernel or such, but plugins that
could extend the filesystem. Plugins could store information about
contents, encodings, formatting, filesystems, etc, as meta-info. If
you have a plugin that would allow you to traverse files as disk
images then it could read those meta-data. But before those plugins
exist then there is no such standard for info stored as meta-data and
the kernel wouldn't know anything about this to begin with.


> | b) kernel has no way to guess which options to use
> | c) fs _type_ is a fundamental part of mount - device(s) (if any) involved
> | are arguments to be interpreted by that particular fs driver.

> Unless there's some severe security issues with "mount this iso as fat
> and you get root access", this should work fine.

> I see no reason why there can't be a global setting of the mount
> commandline to use.

> And it doesn't all have to be in the kernel. Only it'd be nice to have
> some of it there because the kernel knows how to deal with an isofs,
> even if it won't know what it looks like.

> | d) permissions required for that lovely operation (and questions like
> | whether we force nosuid/noexec, etc.) are nightmare to define.

> They are quite simple, actually. Just set them globally -- some admins
> would force nosetuid/noexec, some wouldn't. And the operation happens
> transparently -- you need no "permissions" other than to read the
> directory which would contain the mount.

> | Frankly, the longer that thread grows, the more obvious it becomes that
> | file-as-a-dir is a solution in search of problem. Desperate search, at
> | that.

> Actually, the longer this thread grows, the more obvious it is how when
> there's a hot issue, everyone has an opinion, even if the same opinion
> has been expressed ten or twenty times already.

> File-as-a-dir has numerous advantages, but enough have been discussed.
> Short list is image mounts, tarballs, streams, metas, and namespace
> unification. Longer list and explanations can be found if you RTFA.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

> iQIVAwUBQTe19XgHNmZLgCUhAQJdow/+Knsw1GgpauqDUcg8sKtxzgXZ18OxMQ3Q
> By8sRrSTuKAzI5A3BtYIzndsj1veP+7wndG7nYPz8NS1fU2+xWSIhoGq/YMaQsu4
> 70uMLu448PFXZua4hZMk2w4mkULXbGyYHJ1Bf+2Z7QkQ/8W08hozC8QQynxMXIkX
> SrcWCS5hK8Nh7Ol691sDpPqexH7F1GwUyoslNGj63U5r6ViLAawt2ZKDYdT7ZPo8
> 0a/pWUHoHMPbv/KwqZZxRr1/qncA9QYQo6JqQBPPCr+tWNJs/ei3nAKGi58iOt1M
> DK1TEKd2lpbmwiK5pWDwGz+nwWmaFTAyfTEEEcP4gZedSJtRXaxyNh0jRl1iLATB
> SCO5Eb4jkQs8hdjHqQcQ1q7XKFX9eSXWeDdrGrtWaYC/QYOHxT+ci3lnKBKCG99Y
> YTqg3sNEZlV1N0jIcNvFSDEYbbX12v1Y6xbwvUx48+sMyUj3suT76niTRbwEydfO
> MA9y+wE2k4wF+h+sJCbTjimCNFvvuFTTJuBCbQTpfY4eOYBAFalxnWmrpTEfVzka
> 4iAqAYygWObGDkFFy/rp1HEVZPIKM0NwGLOsRwJsgyUMOsccBrEc0bg8sgMECVfs
> 5qNIb27tLokh8NBR6RodAv2NZYKC+foM0T+PC5bZMFD/Q7f6yDklqK4C4RCIYaXj
> xO9z1C6FPcM=
> =/gmK
> -----END PGP SIGNATURE-----

2004-09-03 01:32:06

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Spam wrote:
[...]
| Indeed, that's not the only interface that's been discussed.
| "file/..metas/is_isofs" might be consulted.
|
|
|> What you are talking about isn't the kernel or such, but plugins that

Plugins are kernel-space. As Linus points out, dealing with mime-types
in the kernel is uncool.

|> could extend the filesystem. Plugins could store information about
|> contents, encodings, formatting, filesystems, etc, as meta-info. If
|> you have a plugin that would allow you to traverse files as disk
|> images then it could read those meta-data. But before those plugins
|> exist then there is no such standard for info stored as meta-data and
|> the kernel wouldn't know anything about this to begin with.

So implement a plugin which knows how to talk to a userland program
which knows about metadata. The plugin controls access to file-type.

Maybe there ought to be a general-purpose userland plugin interface? So
that the only things left in the kernel are things that have to be there
for speed and/or sanity reasons? (Things like cryptocompress and
standard file/directory plugins.)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTfIs3gHNmZLgCUhAQJCqA//cPGI/TPfgtwovk5a8HdcY9TbjJELGtIb
CHcNa2LKGUB8uVvoL9M0Pe9ei4iVK2QOL1QjUAbIE0Dx7t18KR/5qEIfCHTSw0sJ
8u/r3aaJhFjFwMcVLQOZWJYuTeodgMwkV96GgViGoHoiqDiV7BzZgjd43qwiH8rW
FUTKlPn2VmijyTTbf5VfX4hvmsU/He+5W0t08/xe3vpCa+ihNFLJQAwGioo/wzFq
aHl9jBT1esFLxONd7OxQpgVl/2uHx+rSAY6F5RyBqL/Tpm1ZKlrMdAzmdDWcAJA9
KnOLN8ltcPmjP0eCzgCO/iq8yczcwcagfmbD+WcYOmQbTXMTjxktrZqRuLrUHU+7
tl8JISKch5epHfOnQ/RTMEgotlcQ0SCoE7K5lIUuyheMYRWoVDJSvy3okET6ZxQL
NP3PVHguQbu1Bo2X8LNsrnU0KFT7XrejpXzKalPWVbQxayEEWUwLXBdNIpgBoYwX
FZXQKEpS4MmB/8kEC9xuQ077PfAclcjcHSj4B7phbSVMioFy4/HojOLOLA8gjHyr
7p+wimfDpsuioDLlccwc1b+fLB2eJM5G+zfzh//uy7LdfFxuI+074FJYxebRI2eL
N/rpjT+S0S7L0/k71Rwp/5y4YgDgdQqU0wyGaBLRjP0qD7k2S41g+yrua6qArhzP
baR9dCu1zAQ=
=Nrye
-----END PGP SIGNATURE-----

2004-09-03 01:32:03

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
|
| On Thu, 2 Sep 2004, David Masover wrote:
|
|>reiser4 kernel will contain knowledge of fs type contained in a file.
|
|
| That's a disaster, btw.
|
| There is no one "fs type" of a file. Files have at _least_ one type
| (bytestream), but most have more. Which is why automatically doing the
| right thing (in the sense you seem to want) in kernel space is simply not
| possible.

Oops, my bad. The interface is kernel, the file type database is user.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTfH7XgHNmZLgCUhAQL0aw//SG/RDCy30xcVh36FVVAOG7GjP6UkHBeo
14/O3hSELxwm9r8z0YnGnbpoJj9atbOCu6VJhiumtjeyikdH5OaOrcMo7DQYp47V
holyEKvjd+YUkCBJSjFBmG279Ac7wuk2orFlG76gCkFjNk11W/FQsjEM3gi8yKa0
ZODFmh0XJa2p0mKRnApwynO1ma7HgyqNRKrjmxMUK2VHpOsbqjVe1+Gc2lk0E9gv
xc2PIdCgi3yDEpDpxSY8LEBXad7GTwa2VEID5G7C6Z0wOH38YyMEpoHQz03se9zO
aCnVg8LPdFkZveWXPiiyJrwDERCyKt/yrRxrVznWyNVaWmhoxXmw7TmQexvGiDD2
E6+XT8uWfdeMZRPhv284qcBegIDw5c1wHgz+GRso61T6x04hfJyY/onbF5lHz661
lfys4JRT0zmYbrG1b3GVgmfKDv7t3V6DkY+Pi1E5raeZr2F0idiZo+7uKh2NZAML
PLhb8LLr/lbHOSRG3Rq0YY/l+Q6wZXy770qF/51jp/c3UXXJyVusK+bCLrVZ3VPa
0Uf3KkuFLcoD4jDTNjt79QVJXIvEHmNOxp/g80nJUFM0WDj7u4a7pFpLCHZzFeBl
5uoS9c7dYjuV+/EkQ2S8YwryeyCHQNt381icb8HtbdlQOCVuK0v9OspA3IZmphcB
oD1RBCkVTYQ=
=l8Vz
-----END PGP SIGNATURE-----

2004-09-03 02:33:24

by Alan

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Gwe, 2004-09-03 at 00:49, Spam wrote:
> But can you actually do things with these files? Can you run
> applications or edit files directly, or is there need for temporary
> unzip first?

You always need that for zip files. Firstly because executables are
paged so you need an accessible random access copy of the bits. Secondly
because data may be paged, and also for seek performance.

2004-09-03 02:50:20

by Al Viro

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 12:11:33AM +0200, Frank van Maarseveen wrote:
> On Thu, Sep 02, 2004 at 11:06:40PM +0100, [email protected] wrote:
> > On Fri, Sep 03, 2004 at 12:02:42AM +0200, Frank van Maarseveen wrote:
> > > > a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
> > >
> > > excuse me, but how does the kernel mount the root fs?
> >
> > By trying all fs types it has registered in a more or less random (OK, defined
> > by order of fs type registration, which is kinda-sorta deterministic at
> > boot time) order. With no flags, unless you pass them explicitly in kernel
> > command line. Fs types list can also be set explicitly in the command line.
>
> Of course I know that: the point is, the kernel _has_ knowledge contrary
> to what you blatantly said.

What knowledge does the kernel have about fs type that could deal with the
contents of given device? Details, please.

2004-09-02 22:17:09

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 11:06:40PM +0100, [email protected] wrote:
> On Fri, Sep 03, 2004 at 12:02:42AM +0200, Frank van Maarseveen wrote:
> > > a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
> >
> > excuse me, but how does the kernel mount the root fs?
>
> By trying all fs types it has registered in a more or less random (OK, defined
> by order of fs type registration, which is kinda-sorta deterministic at
> boot time) order. With no flags, unless you pass them explicitly in kernel
> command line. Fs types list can also be set explicitly in the command line.

Of course I know that: the point is, the kernel _has_ knowledge contrary
to what you blatantly said.

--
Frank

2004-09-02 22:17:05

by Christoph Hellwig

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 12:03:54AM +0200, Christoph Hellwig wrote:
> On Fri, Sep 03, 2004 at 12:02:42AM +0200, Frank van Maarseveen wrote:
> > On Thu, Sep 02, 2004 at 11:00:27PM +0100, [email protected] wrote:
> > >
> > > The hell it is.
> > >
> > > a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
> >
> > excuse me, but how does the kernel mount the root fs?
>
> trial and error. That's why you see all those ext3 mounted as ext2
> problems, or $RANDOMFS as fat.

And btw, for an lkml discussion RTFS wouldn't hurt.

2004-09-03 03:04:30

by Al Viro

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 12:02:42AM +0200, Frank van Maarseveen wrote:
> On Thu, Sep 02, 2004 at 11:00:27PM +0100, [email protected] wrote:
> >
> > The hell it is.
> >
> > a) kernel has *NO* *FUCKING* *KNOWLEDGE* of fs type contained on a device.
>
> excuse me, but how does the kernel mount the root fs?

By trying all fs types it has registered in a more or less random (OK, defined
by order of fs type registration, which is kinda-sorta deterministic at
boot time) order. With no flags, unless you pass them explicitly in kernel
command line. Fs types list can also be set explicitly in the command line.

Next question?
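The probing order described above can be illustrated with a toy model. Nothing here is kernel code: the probe functions and the fake image layout are invented for the example, and `fs_list` only loosely mirrors an explicit type list on the kernel command line. The point it shows is that when checks are not mutually exclusive, registration order decides which driver wins.

```python
def probe_ext2(image):
    # Hypothetical check: this driver only looks at a two-byte magic.
    return image[:2] == b"EX"

def probe_ext3(image):
    # Hypothetical check: same magic, plus a journal flag.
    return image[:2] == b"EX" and image[2:3] == b"J"

def mount_root(image, registered, fs_list=None):
    """Return the name of the first fs type whose probe accepts the image.

    `registered` is a list of (name, probe) pairs in registration order.
    `fs_list`, if given, replaces that order with an explicit list of names.
    """
    probes = dict(registered)
    order = fs_list if fs_list is not None else [name for name, _ in registered]
    for name in order:
        probe = probes.get(name)
        if probe is not None and probe(image):
            return name
    return None

# A journalled image is acceptable to both probes in this model.
journalled = b"EXJ" + b"\0" * 16
registered = [("ext2", probe_ext2), ("ext3", probe_ext3)]
print(mount_root(journalled, registered))                    # first registered match
print(mount_root(journalled, registered, fs_list=["ext3"]))  # explicit list
```

With `ext2` registered first, the journalled image is claimed by the `ext2` probe, which is the "ext3 mounted as ext2" situation mentioned elsewhere in the thread.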

2004-09-02 20:25:22

by Linus Torvalds

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)



On Thu, 2 Sep 2004, Alan Cox wrote:
>
> I asked our desktop people. They want something like inotify because
> dnotify doesn't cut it.

Well, dnotify() really _is_ inotify(), since it does actually work on
inodes, not dentries.

I think what they are really complaining about is that dnotify() only
notifies the _directory_ when a file is changed, and they'd like it to
notify the file itself too. Which is a one-liner, really.

Does the following make sense? (Totally untested, use-at-your-own-risk,
I've-never-actually-used-dnotify-in-user-space, whatever).

Linus

===== fs/dnotify.c 1.17 vs edited =====
--- 1.17/fs/dnotify.c 2004-08-09 18:45:22 -07:00
+++ edited/fs/dnotify.c 2004-09-02 13:21:26 -07:00
@@ -160,6 +160,8 @@
 	if (!dir_notify_enable)
 		return;
 
+	__inode_dir_notify(dentry->d_inode, event);
+
 	spin_lock(&dentry->d_lock);
 	parent = dentry->d_parent;
 	if (parent->d_inode->i_dnotify_mask & event) {
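
The effect of the added line can be shown with a small userspace model. This is a sketch of the idea only, not of fs/dnotify.c: the `Inode` and `Dentry` classes are invented stand-ins, and the `notify_self` flag switches between the existing behaviour (notify the parent directory) and the proposed one (also notify the file's own inode). It says nothing about whether a watch can actually be registered on a regular file.

```python
DN_MODIFY = 0x2  # value of DN_MODIFY in Linux <fcntl.h>

class Inode:
    def __init__(self, name):
        self.name = name
        self.dnotify_mask = 0   # events some process asked to hear about
        self.delivered = []     # events actually delivered

    def notify(self, event):
        if self.dnotify_mask & event:
            self.delivered.append(event)

class Dentry:
    def __init__(self, inode, parent=None):
        self.inode = inode
        self.parent = parent

def dnotify_parent(dentry, event, notify_self=False):
    """Deliver `event`; notify_self=True mimics the proposed one-liner."""
    if notify_self:
        dentry.inode.notify(event)          # added: tell the file itself
    if dentry.parent is not None:
        dentry.parent.inode.notify(event)   # existing: tell the directory
```

In this model a watcher on the file sees nothing unless `notify_self` is set, while the directory watcher is notified either way.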

2004-09-03 03:50:10

by Chris Dukes

Subject: Re: The argument for fs assistance in handling archives

On Thu, Sep 02, 2004 at 08:28:20PM -0500, David Masover wrote:
>
> So implement a plugin which knows how to talk to a userland program
> which knows about metadata. The plugin controls access to file-type.
>
> Maybe there ought to be a general-purpose userland plugin interface? So
> that the only things left in the kernel are things that have to be there
> for speed and/or sanity reasons? (Things like cryptocompress and
> standard file/directory plugins.)

Ahem,
Wasn't this the goal of GNU HURD?

I really think you should ask them why they haven't delivered
something useful, then come back to this thread.

Thanks.
--
Chris Dukes
Warning: Do not use the reflow toaster oven to prepare foods after
it has been used for solder paste reflow.
http://www.stencilsunlimited.com/stencil_article_page5.htm

2004-09-03 04:37:46

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Dukes wrote:
| On Thu, Sep 02, 2004 at 08:28:20PM -0500, David Masover wrote:
|
|>So implement a plugin which knows how to talk to a userland program
|>which knows about metadata. The plugin controls access to file-type.
|>
|>Maybe there ought to be a general-purpose userland plugin interface? So
|>that the only things left in the kernel are things that have to be there
|>for speed and/or sanity reasons? (Things like cryptocompress and
|>standard file/directory plugins.)
|
|
| Ahem,
| Wasn't this the goal of GNU HURD?

The goal of GNU HURD was to take everything out of the kernel and make
it entirely daemons. That's a far cry from keeping a file-type database
(historically the realm of file managers) out of the kernel.

Most people hate putting things in the kernel that don't belong there.
Most of what's in the kernel is for speed and/or sanity -- filesystems
are there for speed, device drivers (and scheduling, and vm, and so on)
are there for sanity.

Correct me if I'm wrong on this.

| I really think you should ask them why they haven't delivered
| something useful, then come back to this thread.

Honestly? I think it's mostly got nothing to do with architecture. I
think it's mostly got to do with politics. Most people would rather
hack on Linux, which is already done, than try to develop HURD, which is
something new. Most people also enjoy working with Linus (or prefer
Linus to the FSF).

I do not like how Linux is monolithic. I do not like having to reboot
to upgrade the kernel, and I do not like having to run closed software
(the nvidia drivers) in the kernel (as in, full privileges, can crash
entire system, yadda yadda). But Linux is the best we have.

The HURD people have delivered at least something. I think there's even
a Debian/HURD distro. Whether it's useful probably has to do with
whether it's stable/fast, which isn't likely. You hear of Linux news
every day, you hear of HURD maybe once in a lifetime -- "Hey, HURD exists!"

But you are right -- I need to take a break from this thread. It's
becoming addictive, and it's hard keeping up with so many people who are
so much smarter than I am.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTf0/3gHNmZLgCUhAQKGrw//XMBsWBo7uqlocjdvSati4GkxsB+f2GPW
hI3JboYhWVMB7Ya+mS9VbrruxVn7uCCUwLwNDDnMncHTf09Em+WhzwJCQXZ14Y2r
AxZ8iaVR9Qv8hIHGw59uJYWvtTSHgSmWiFFVa4UB4D3NneSFr2vgXXDbU8LjhHQB
46qs8IM6s+RCb0L608uGml6AmyrxFvDgWz9p+++tWicLjQPVhBnLQQnqYy92we7x
qUwxCyGQywmdcSsJlkcjKzLQ/dCDwQRq7LXmSCx3qbSRr27D6zpE6l9He62ZuoBR
I9NmO12S6VgYifO5fnq6pxSu2+GuJVW469tm7EiXmKZ/rjfMJKp0cqg+mtGTU91m
gk7cZTBbe4id9+IQPJPJJ08IMH5XTI4DI6dXCkKIB0TZ8teFnU2+DsUdZ50ZXwiL
etFUjortzjPjEqZoKfvK+4n7HOMu1w1og/7aE6P5UNTbj5ViwLQB9KjwLBDyBMJr
dTmcLRyxzHtDZJMQuLwvU/SsdF86CdlR4spkEI6CitRMiiWvYghOeOkasOO8qxVj
ckzZzUCRHNF/PTPSAEiIaw2HyjVzf0Nru96Y80/hik3hRYO0XuTP/3K/twHoebkz
Fo1NLtGU3g631pDYcpqnkkAcQN+3fiyZ3VEiNoYYLhZpkQiEH8goRv5lof9XJPAC
krhwM8lNHoM=
=JJEe
-----END PGP SIGNATURE-----

2004-09-03 05:22:20

by Hans Reiser

Subject: Re: The argument for fs assistance in handling archives

Linus Torvalds wrote:

>On Thu, 2 Sep 2004, David Masover wrote:
>
>
>>reiser4 kernel will contain knowledge of fs type contained in a file.
>>
>>
Whoa! No it won't. It will allow you to create a metafile named type
if you choose. Or maybe a metadirectory named types, I don't really
care much about this stuff yet, we need to do other things to the
semantics more urgently than this. Typing requires a lot of cooperation
from user space apps. I don't really expect to have that much social
pull on the issue, but if someone chooses to design their app to use a
types metadirectory, I don't mind accommodating it. I apologize if
from my saying that I gave the impression that I think files should be
strongly typed.

I am quite well aware of the disadvantages of OS/400 hindering usability
with type information, as well as the advantages of having types available
for those that want to look at them.....

>
>That's a disaster, btw.
>
>
In every implementation so far it has been a net disaster, because OS
designers who type things have been willing to make users pay attention
to type when they don't want to. Sometimes trying to make too much out
of something makes it a mistake when being a bit more laid back would
make it ok. Me, I am so laid back, I prefer to work on other features
for a few years first.....;-)

>There is no one "fs type" of a file. Files have at _least_ one type
>(bytestream), but most have more. Which is why automatically doing the
>right thing (in the sense you seem to want) in kernel space is simply not
>possible.
>
> Linus
>
>
>
>

2004-09-03 06:16:48

by Hans Reiser

Subject: Re: The argument for fs assistance in handling archives

I don't think streams are a good thing, I think that all of the
different pieces of additional functionality necessary to emulate them
with files and directories are a good thing.

Keeping streams out of linux was one of the (less important) ideas
behind reiser4. Streams are a rigid hack. The toolkit that can emulate
them, is useful, and the perceived importance of that emulation will
fade as people start to use the toolkit for things that are much more
fun than streams.

I agree with most of the rest of what you say though David.

Hans


David Masover wrote:

> File-as-a-dir has numerous advantages, but enough have been discussed.
> Short list is image mounts, tarballs, streams, metas, and namespace
> unification. Longer list and explanations can be found if you RTFA.

2004-09-03 07:13:39

by Jan Harkes

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 07:19:47PM -0400, [email protected] wrote:
> On Thu, 02 Sep 2004 22:38:54 +0200, Frank van Maarseveen said:
> > cd /dev/cdrom
> > ls
>
> And the CD in the drive at the moment is AC/DC "Back in Black". What
> should this produce as output?

Hehe, cdfs already figured that one out. Of course you show the
individual tracks as .wav files.

Jan

2004-09-03 08:40:40

by Helge Hafting

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Frank van Maarseveen wrote:

>On Thu, Sep 02, 2004 at 11:33:24PM +0100, [email protected] wrote:
>
>
>>RTFS and you'll see. Individual fs generally knows how to check if it
>>would be immediately unhappy with given image (not all types do, BTW).
>>Exact form of checks depends on fs type; for crying out loud, there's
>>not even a promise that they are mutually exclusive!
>>
>>
>
>so?
>
>A user can stick a USB memory card with _any_ malformed fs data and
>make troubles via the automounter or user mounts. Yes, mount might do
>some more checks but it sure won't do an fsck.
>
>The user gets what he deserves when sticking crap in a USB port.
>
>And that doesn't mean that the kernel should accept any fs image
>when a user tries to cd into the file.
>
>
You don't need kernel support for cd'ing into fs images.
You need a shell (or GUI app) that:
1. Notices that the user tries to cd into a file, not a directory.
2. Attempts fs type detection and does a loop mount.
3. Gives an error message if it wasn't a supported fs image.

Helge Hafting
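
The three steps above can be sketched as follows. This is a minimal illustration, not an existing shell feature: it recognises only ISO 9660 and ext2/3/4 images by their on-disk magic numbers, stops short of the loop mount itself (which needs privileges), and the function names `detect_image_type` and `plan_cd` are invented for the example.

```python
import os
import struct

ISO9660_MAGIC_OFFSET = 32769   # sector 16 * 2048 bytes, plus 1 byte of descriptor type
EXT_MAGIC_OFFSET = 1024 + 56   # superblock starts at byte 1024; s_magic is 56 bytes in

def _read_at(f, offset, length):
    f.seek(offset)
    return f.read(length)

def detect_image_type(path):
    """Return 'iso9660', 'ext2/3/4', or None for an unrecognised file."""
    with open(path, "rb") as f:
        if _read_at(f, ISO9660_MAGIC_OFFSET, 5) == b"CD001":
            return "iso9660"
        raw = _read_at(f, EXT_MAGIC_OFFSET, 2)
        if len(raw) == 2 and struct.unpack("<H", raw)[0] == 0xEF53:
            return "ext2/3/4"
    return None

def plan_cd(path):
    """Decide what a shell could do when the user types `cd path`."""
    if os.path.isdir(path):
        return ("chdir", path)                      # ordinary directory
    fstype = detect_image_type(path) if os.path.isfile(path) else None
    if fstype is None:
        return ("error", "not a directory or a supported fs image")
    # A real shell would now hand the image to a privileged loop-mount helper.
    return ("loop-mount", fstype)
```

The sketch covers only the shell's own `cd`; a path such as `/tmp/backup.iso/etc/fstab` opened from inside another program would not pass through it.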

2004-09-03 08:57:31

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 10:43:44AM +0200, Helge Hafting wrote:
> >
> You don't need kernel support for cd'ing into fs images.
> You need a shell (or GUI app) that:
> 1. notices that user tries to CD into a file, not a directory
> 2. Attempts fs type detection and do a loop mount.
> 3. Give error message if it wasn't a supported fs image.

Ok, and right now I'm in vim typing this message and want
to ":new /tmp/backup.iso/etc/fstab"

Modifying 1000+ applications is not an option IMO. Putting it in
a preloaded library might be doable, except maybe for permission
problems, but this is incredibly clumsy and lacks the possibility
to implement sane caching behavior because it is process bound.

--
Frank

2004-09-03 09:03:08

by Frank van Maarseveen

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, Sep 03, 2004 at 10:50:18AM +0200, I wrote:
> and lacks the possibility
> to implement sane caching behavior because it is process bound.

sorry, this part is crap.

--
Frank

2004-09-03 12:56:51

by Dave Kleikamp

Subject: Re: The argument for fs assistance in handling archives

On Thu, 2004-09-02 at 19:25, David Masover wrote:
> Dave Kleikamp wrote:
> [...]
> | Please don't tell me that we have expectations to run make from within a
> | tar file. This is getting silly. tar does a pretty good job of
> | extracting files into real directories, and putting them back into an
> | archive. I don't see a need to teach the kernel how to deal with
> | compound files when user space can do it very easily.
>
> Suppose I've got a tar file with an index attached. Suppose it's
> something like /usr/src/linux. Am I expected to extract all code for
> all architectures, with all drivers, all docs, etc? Now, yes -- or I
> have to figure out exactly which ones I need before I extract them
> manually, one by one.

I don't think it's unreasonable to expect someone to either extract the
whole tar file, or identify what files they want from it. If you think
there is too much in the tar file, roll your own with only the files you
need.

> But with tar support for make (and so on), files can be extracted on
> demand. It's possible to do this in userspace, with named pipes, but
> that's much slower and insanely clumsy.

This doesn't justify bloating the kernel. untar the darn thing and user
space does fine.

> This has further implications -- imagine a desktop, binary distro
> shipped with all files except the very most basic stuff as package
> archives. They can all be extracted, on demand -- the first time I run
> OpenOffice.org, it's installed. If there needs to be post-installation,
> that's handled by the .deb plugin (or whatever).

Are you saying install it on demand the first time it's run? This
doesn't take any new kernel function.

Or are you saying that the files are never installed on the filesystem,
but always accessed from the package archives? In this case, why not
ship each package as a compressed iso, and have the system mount the iso
to run the app. I really don't see the point though, in that disk space
is very cheap these days.

> I don't know offhand how big OOo is. I think it's something like this:
> ~ The installer is at most half (and maybe only a third) of the full
> installation. That's a HUGE optimization!

Optimization? How much performance are you willing to give up to save a
little bit of disk space?

--
David Kleikamp
IBM Linux Technology Center

2004-09-03 13:09:39

by Dave Kleikamp

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 2004-09-02 at 19:39, Spam wrote:
>
>
> > On Fri, 3 Sep 2004, Spam wrote:
>
> >> Yes, some archive types can't be partially unzipped either. But my
> >> point is that it wouldn't be transparent to the application/user in
> >> the same way.
>
> > It doesnt matter whether it is transparent to the application. It can
> > be the application which implements the required level of
> > transparency.
>
> > User doesnt care what provides the transparency or how it's
> > implemented.
>
> Indeed. I hope I didn't say otherwise :). Just that I think it will
> be very difficult to have this transparency in all apps.

You're missing the point. We don't need transparency in all apps. You
can write an application to be as transparent as you want, but you don't
need every app to understand every file format.

> Just
> thinking of "nano file.jpg/description.txt" or "ls
> file.tar/untar/*.doc".

I don't do much image editing, but I'm sure there are applications that
let you edit the description in a text file. You can even create a
script that extracts it, runs nano, and puts it back into the jpeg.

This works for me:
tar -tf file.tar | grep '\.doc'

There are userland tools that deal with hundreds of file formats. Use
the tool you need, rather than try to have the kernel do everything.

> Sure in some environments like Gnome it could
> work, but it still doesn't for the rest of the flora of Linux
> programs.

Just choose the right program. tar groks tar files, not ls.

--
David Kleikamp
IBM Linux Technology Center

2004-09-03 13:19:48

by Spam

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> On Thu, 2004-09-02 at 19:39, Spam wrote:
>>
>>
>> > On Fri, 3 Sep 2004, Spam wrote:
>>
>> >> Yes, some archive types can't be partially unzipped either. But my
>> >> point is that it wouldn't be transparent to the application/user in
>> >> the same way.
>>
>> > It doesnt matter whether it is transparent to the application. It can
>> > be the application which implements the required level of
>> > transparency.
>>
>> > User doesnt care what provides the transparency or how it's
>> > implemented.
>>
>> Indeed. I hope I didn't say otherwise :). Just that I think it will
>> be very difficult to have this transparency in all apps.

> You're missing the point. We don't need transparency in all apps. You
> can write an application to be as transparent as you want, but you don't
> need every app to understand every file format.

No, but not every user "can write an application" either, or even
have the skills to apply patches. What I was talking about wasn't
just tar, which itself isn't the best example anyway, but the idea
that users can load plugins that will extend the functionality of
their filesystems. That idea seems to me to be _much_ better than
trying to teach every user how to write applications or patch
existing ones.

>> Just
>> thinking of "nano file.jpg/description.txt" or "ls
>> file.tar/untar/*.doc".

> I don't do much image editing, but I'm sure there are applications that
> let you edit the description in a text file. You can even create a
> script that extracts it, runs nano, and puts it back into the jpeg.

> This works for me:
> tar -tf file.tar | grep '\.doc'

> There are userland tools that deal with hundreds of file formats. Use
> the tool you need, rather than try to have the kernel do everything.

No, but if I wanted to have an encryption plugin active for some of
my files or directories then why should I not be able to? I still
want to edit, view and save my encrypted files.

Again, this was just an example of what could be done with plugins.
It is not said that every conceivable plugin will be written, nor
loaded per default. Even though plugins cannot today be dynamically
used, they will be eventually. Reiser4 is still very young.

Please separate your thoughts for specific plugins from those of the
idea to have plugins at all.

~S

>> Sure in some environments like Gnome it could
>> work, but it still doesn't for the rest of the flora of Linux
>> programs.

> Just choose the right program. tar groks tar files, not ls.


2004-09-03 13:36:28

by Dave Kleikamp

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, 2004-09-03 at 08:16, Spam wrote:
> > You're missing the point. We don't need transparency in all apps. You
> > can write an application to be as transparent as you want, but you don't
> > need every app to understand every file format.
>
> No, but not every user "can write an application" either, or even
> have the skills to apply patches. What I was talking about wasn't
> just tar, which itself isn't the best example anyway,

That was one of the examples you gave, that and .jpg. I believe they
are both ridiculous.

> but the idea
> that users can load plugins that will extend the functionality of
> their filesystems. That idea seems to me to be _much_ better than
> trying to teach every user how to write applications or patch
> existing ones.

If I understand Hans' plugins, they are not user-loadable, but rather a
statically built part of the kernel.

>
> No, but if I wanted to have an encryption plugin active for some of
> my files or directories then why should I not be able to? I still
> want to edit, view and save my encrypted files.

I would not argue against an encryption plugin.

> Again, this was just an example of what could be done with plugins.
> It is not said that every conceivable plugin will be written, nor
> loaded per default.

This I agree with.

> Even though plugins cannot today be dynamically
> used, they will be eventually. Reiser4 is still very young.

As kernel modules, this would make sense. I don't see just any
unprivileged user being able to add code into the file system, though.

> Please separate your thoughts for specific plugins from those of the
> idea to have plugins at all.

I'm not against reiser4 plugins. I don't think file system code should
care about the type of data in a file, and do any interpretation based
on it.

Shaggy
--
David Kleikamp
IBM Linux Technology Center

2004-09-03 13:59:02

by John Stoffel

Subject: Re: The argument for fs assistance in handling archives

>>>>> "David" == David Masover <[email protected]> writes:

David> File-as-a-dir has numerous advantages, but enough have been
David> discussed. Short list is image mounts, tarballs, streams,
David> metas, and namespace unification. Longer list and explanations
David> can be found if you RTFA.

And it has numerous disadvantages as well, since it doesn't have a
good set of semantics and syntax defined yet, nor does it explain
except by vigorous handwaving the performance and security impacts it
can have.

My personal feeling is that the mount(8) command should be the tool
used to extract and expose the internal namespace of files like this
and to then graft it onto the standard Unix namespace with gross Unix
semantics, but its own wacky internal semantics. This way, standard
tools don't care, but special tools which know how to handle it can do
what they want.


> mount -t tarfs /some/place/on/disk/foo.tar.gz /mnt/tar
> cp /var/tmp/img.gif .
> umount /mnt/tar

Oops! Someone did a rm /some/place/on/disk/foo.tar.gz between steps
one and two. Now what happens? Please define those semantics...
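
One candidate answer, at least on POSIX systems, is the usual unlink behaviour: removing the name does not destroy the data while something still holds the file open. The sketch below only demonstrates that behaviour with an ordinary open file handle standing in for the mount; it assumes nothing about how a hypothetical tarfs would be implemented, and the file names are taken from the example above.

```python
import os
import tarfile
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as d:
        member = os.path.join(d, "img.gif")
        with open(member, "wb") as f:
            f.write(b"GIF89a")
        archive = os.path.join(d, "foo.tar")
        with tarfile.open(archive, "w") as tar:
            tar.add(member, arcname="var/tmp/img.gif")

        handle = open(archive, "rb")    # stands in for "step one", the mount
        os.unlink(archive)              # someone runs rm in between
        name_gone = not os.path.exists(archive)

        # "Step two": the contents are still reachable through the handle.
        with tarfile.open(fileobj=handle, mode="r:") as tar:
            data = tar.extractfile("var/tmp/img.gif").read()
        handle.close()                  # last reference dropped, space is freed
        return name_gone, data
```

Under these semantics the mounted view keeps working until it is unmounted, after which the archive is gone for good. Whether that is the behaviour users want is exactly the policy question raised above.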

John

2004-09-03 14:06:22

by Spam

Subject: Re: The argument for fs assistance in handling archives




>>>>>> "David" == David Masover <[email protected]> writes:

David>> File-as-a-dir has numerous advantages, but enough have been
David>> discussed. Short list is image mounts, tarballs, streams,
David>> metas, and namespace unification. Longer list and explanations
David>> can be found if you RTFA.

> And it has numerous disadvantages as well, since it doesn't have a
> good set of semantics and syntax defined yet, nor does it explain
> except by vigorous handwaving the performance and security impacts it
> can have.

> My personal feeling is that the mount(8) command should be the tool
> used to extract and expose the internal namespace of files like this
> and to then graft it onto the standard Unix namespace with gross Unix
> semantics, but its own wacky internal semantics. This way, standard
> tools don't care, but special tools which know how to handle it can do
> what they want.


>> mount -t tarfs /some/place/on/disk/foo.tar.gz /mnt/tar
>> cp /var/tmp/img.gif .
>> umount /mnt/tar

> Oops! Someone did a rm /some/place/on/disk/foo.tar.gz between steps
> one and two. Now what happens? Please define those semantics...

Uhm, can you delete a file (loop) that is mounted?

> John




2004-09-03 15:36:46

by [email protected]

Subject: Re: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 02 Sep 2004 23:27:06 +0100, Alan Cox <[email protected]> wrote:
> On Iau, 2004-09-02 at 22:47, Jamie Lokier wrote:
> > - Can the daemon keep track of _every_ file on my disk like this?
> > That's more than a million files, and about 10^5 directories.
> > dnotify would require the daemon to open all the directories.
> > I'm not sure what inotify offers.
>
> This is currently a real issue for both desktop search and for virus
> scanners. They want a "what changed and where" system wide (or at least
> per namespace/mount).

In the database world everything is a transaction and the transactions
are logged. Reiser4 is fully atomic and logged. So to get the "what
changed and where" you just process the transaction logs from the
point of your last marked checkpoint. Hot backup db servers work this
way too by listening to the transaction log stream. You don't need a
daemon in this model.
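
A minimal sketch of the checkpoint idea, under loud assumptions: the record layout, field names and the in-memory list are invented for illustration and say nothing about Reiser4's actual journal format or any interface for reading it. The point is only that a consumer which remembers the last sequence number it processed can recover "what changed and where" in one pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    seq: int     # monotonically increasing transaction number (assumed)
    op: str      # e.g. "write", "unlink", "rename"
    path: str


def changes_since(log, checkpoint):
    """Return (changed paths in first-seen order, new checkpoint)."""
    changed = []
    last = checkpoint
    for rec in log:
        if rec.seq <= checkpoint:
            continue                 # already handled in an earlier pass
        if rec.path not in changed:
            changed.append(rec.path)
        last = max(last, rec.seq)
    return changed, last


log = [
    LogRecord(1, "write", "/home/a/mail/inbox"),
    LogRecord(2, "write", "/home/a/music/track.mp3"),
    LogRecord(3, "unlink", "/tmp/scratch"),
    LogRecord(4, "write", "/home/a/mail/inbox"),
]
paths, checkpoint = changes_since(log, checkpoint=2)
```

An indexer or virus scanner would persist the returned checkpoint and pass it back in on its next run.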

Jon Smirl
[email protected]

2004-09-03 16:45:06

by Pavel Machek

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Hi!

> > Thirdly, you must be referring to the Gnome versions of Bash, Make,
> > GCC, coreutils and Perl which I haven't found. Perhaps we have a
> > different idea of what "supports this" means :)
>
> Please don't tell me that we have expectations to run make from within a
> tar file. This is getting silly. tar does a pretty good job of
> extracting files into real directories, and putting them back into an
> archive. I don't see a need to teach the kernel how to deal with
> compound files when user space can do it very easily.

Actually it's not easy. The user has to manually extract it and manually
delete it when he's done. Not nice.
Pavel
--
When do you have heart between your knees?

2004-09-03 16:46:17

by Pavel Machek

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Hi!

> I bet you could write a small library to test this out for a few types.
> See if it's useful to you. And only if it's useful (and would make a huge
> performance difference) would it be worth putting in the kernel.

It seemed really useful in its uservfs incarnation. Unfortunately the
daemon was not multithreaded at that time, so it was not really useful
on multiuser systems :-(

> Implementation of the _user_space_ library would be something like this:
>
> #define MAXNAME 1024
> int open_cached_view(int base_fd, char *type, char *subname)

Well, you'd need more than a simple open. For caching tar (etc), you'd
need stat_cached_view and opendir_cached_view and ...

And this really works, only it's called mc_open(), mc_stat() etc.

Gnome actually uses a newer incarnation of mc_open etc, but they had to
introduce a rather ugly interface to make it asynchronous.
Pavel
--
When do you have heart between your knees?

2004-09-03 16:48:40

by Paul Jakma

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, 3 Sep 2004, Spam wrote:

> Indeed. I hope I didn't say otherwise :).

Sure.

> Just that I think it will
> be very difficult to have this transparency in all apps. Just
> thinking of "nano file.jpg/description.txt" or "ls
> file.tar/untar/*.doc". Sure in some environments like Gnome it could
> work, but it still doesn't for the rest of the flora of Linux
> programs.

"will it be transparent for all apps?", whether that's worth doing
depends on the technical implications. Thankfully we have Al and
Linus to make the judgement call on that ;)

Personally, if GNOME can provide transparency for GNOME users, I think
that's probably enough - unless there are literally no issues in adding
some kind of VFS support.

The nano / ls / tar user is likely a very different user to the GNOME
user. That user is also likely to appreciate the problems with
backups and such more.

Anyway, userspace transparency is sufficient for most classes of
users. The only reason to provide some kernel support is if it makes
sense ("but not all apps can use GNOME transparency" not being one of
those reasons).

regards,
--
Paul Jakma [email protected] [email protected] Key ID: 64A2FF6A
Fortune:
Beat your son every day; you may not know why, but he will.

2004-09-03 19:39:09

by Valdis Klētnieks

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Fri, 03 Sep 2004 03:13:35 EDT, Jan Harkes said:
> On Thu, Sep 02, 2004 at 07:19:47PM -0400, [email protected] wrote:
> > On Thu, 02 Sep 2004 22:38:54 +0200, Frank van Maarseveen said:
> > > cd /dev/cdrom
> > > ls
> >
> > And the CD in the drive at the moment is AC/DC "Back in Black". What
> > should this produce as output?
>
> Hehe, cdfs already figured that one out. Ofcourse you show the
> individual tracks as .wav files.

That's sidestepping the *real* issue - which is that you need reasonable
semantics even if you don't have cdfs or whatever special gee-whiz-bang driver
handy.

Consider an embedded system - it may have iso9660 support, and boot off a CD
like Knoppix, but not have cdfs. What do you do then?

(For bonus points, figure out the security issues involved in dealing with an
intentionally corrupted image on a CD......)



2004-09-03 23:46:43

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Paul Jakma wrote:
| On Fri, 3 Sep 2004, Spam wrote:
|
|> Indeed. I hope I didn't say otherwise :).
|
|
| Sure.
|
|> Just that I think it will
|> be very difficult to have this transparency in all apps. Just
|> thinking of "nano file.jpg/description.txt" or "ls
|> file.tar/untar/*.doc". Sure in some environments like Gnome it could
|> work, but it still doesn't for the rest of the flora of Linux
|> programs.
|
|
| "will it be transparent for all apps?", whether that's worth doing
| depends on the technical implications. Thankfully we have Al and Linus
| to make the judgement call on that ;)

So far, the technical implications are mainly "does it create a serious
issue in the distant future" or "does some exotic new feature that
doesn't exist yet cause problems" and not "it's currently broken".

Right now, I can edit a file's permissions, transparently, from any app,
and I haven't noticed any stability issues at all.

| Personally, I think that if GNOME can provide transparency for GNOME
| users, I think that's probably enough - unless there are literally no
| issues in adding some kind of VFS support.

Only it can't. Especially if it's a typical GNOME user, who gets used
to having their files encrypted in foo.tar.pgp, say. Works in abiword,
works in gedit, breaks in OpenOffice. Not good.

| The nano / ls /tar user is likely a very different user to the GNOME
| user. That user is also likely to appreciate the problems with backups
| and such more.

I'm a vim/ls/tar user. And I believe the problems with backups to be
very minor. There are other issues that I don't understand as
thoroughly, so I'll leave them to Al and Linus.

| Anyway, userspace transparency is sufficient for most classes of users.
| Only reason to provide some kernel support is if it makes sense ("but
| not all apps can use GNOME transparency" not being one of those reasons).

Can you justify why "not all apps can use GNOME transparency" is not a
valid reason? Would you still think so if GNOME transparency was the
only way to read isofs?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTkCL3gHNmZLgCUhAQJVSxAAiFd+e/Q0VjAvPaQL1IDwA56SP7LF2M1v
jnk5wehjvicPEfSn2HpWik7Pr55K2/7dUn6mEJO4N43x/zmfyiA2wXLibfWJtD3X
VgtFMgbOeKu+WobfdnvlBPAKYoZ+MYsC2jyDYwOHguJbR9epmFtHv0mMhmD60s+B
3JDLaYXwPjmjs7zezy2B/Xc3mgm060VgDZRVOnrYZLzHAi3dLvZLzGjORf5kBZtd
NtsiPUZw3oIqORBHqfWDWUUi4irfzklZkiuYp44RNFZs0z0xvPrUJ1klXiJpbQ9t
HTkz9qRe0sxjRYfLcJzWxfahaNN+2SXyR5FFqkpzfUfyNuoOUMZDtin+ZM83qyVc
J/zRCZvWQOaZUtgas9KCCYnbwgCFLcDEE0xLLDRMGAMKeHOwCImU55ChTC/0HTas
Q7Bd0df456dAv+ktbdMRaznWeNOpW5rJWXtxNeWnncTYKG0ogXct9mD0+BEIssaK
qaEYUousp6ZRvOSlCCNCF5hOvfEuZLuAZe2Q8zKH39B4Hg7BMnmIOw6VPLmlLKds
pGNuIKfKjyzkxcdHwqGp36e27wizyKtCbZ58Zd0x1wnf47gkm8kgkY38tb0ct7Xf
Ch+s3+zLsOtVnnipnlGhIPEtxYv78DiqhGUc+pdZisi5VlfA4PTXZAAjoICeXA/H
EqfVU6yYsu8=
=ggxZ
-----END PGP SIGNATURE-----

2004-09-03 23:55:29

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Helge Hafting wrote:
[...]
| You don't need kernel support for cd'ing into fs images.
| You need a shell (or GUI app) that:
| 1. notices that user tries to CD into a file, not a directory
| 2. Attempts fs type detection and do a loop mount.
| 3. Give error message if it wasn't a supported fs image.

You can argue this about FS plugins, too. You don't need kernel support
for cryptocompress, for example. Just have a fifo farm. Every file
that has compression turned on is actually a named pipe, the original
file is hidden (starts with a dot), and you have a script running to
each fifo which runs zcat.
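
For concreteness, one cell of such a fifo farm might look like the sketch below. It is an illustration only: the file names are invented, Python's gzip module stands in for zcat, and a single thread serves a single reader, whereas a real farm would need one long-running feeder per fifo.

```python
import gzip
import os
import tempfile
import threading


def serve_once(hidden_path, fifo_path):
    """Feed one reader of the fifo with the decompressed hidden file."""
    with gzip.open(hidden_path, "rb") as src:
        # Opening a fifo for writing blocks until a reader opens the other end.
        with open(fifo_path, "wb") as dst:
            dst.write(src.read())


def demo():
    with tempfile.TemporaryDirectory() as d:
        hidden = os.path.join(d, ".notes.txt.gz")   # the hidden original
        fifo = os.path.join(d, "notes.txt")         # what the user sees
        with gzip.open(hidden, "wb") as f:
            f.write(b"hello from the fifo farm\n")
        os.mkfifo(fifo)
        worker = threading.Thread(target=serve_once, args=(hidden, fifo))
        worker.start()
        with open(fifo, "rb") as reader:    # an unmodified app just reads
            data = reader.read()
        worker.join()
        looks_like_file = os.path.isfile(fifo)  # stat() reports a fifo
        return data, looks_like_file
```

Reading works, but the path cannot be seeked, stat() does not report a regular file, and nothing is cached between readers.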

The kernel support (read: interface) is just cleaner in some ways.

In the loop mount example, it isn't universal -- the user can't CD into
a file if they use the wrong shell, and two different shells might make
two different mounts. How many mounts before we run out of loop devices?

In the cryptocompress example, there's no caching or compress-on-flush
optimization, and it isn't entirely transparent to anything. It doesn't
look like a file, it looks like a fifo. ls will make it look yellow.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTkEWngHNmZLgCUhAQKmMhAAkECgB9hUCxM2vNn0NgzQmkuz028QsAEN
6B1ej6tCbVyT8whsVD2YA7rQH5UavUeUKjuFNJNwelXeEYqG8tbv4vcE1Z65U5tR
GUJkDxWwgjfx4JtE137VmWXcMWW/6FeIBbHuKFohDeBz2nBYmc3w/b4NjgT//JOG
0iUQGRDQcGxDuCYqMqCO8+m/LOSL9wbYDG+bXa4NxVAcghmxAhAJFiDpRDQHcYxX
CeUI7ZkFnNX/4A52YTAFyHQmeoQRX1SLxJsn0GH9mcf2Gsxs8AtoTQgtedDsz9I2
Ct6Dil6NfI94VfOx0EWa1y4I0p7UnFjAsye9zeElhumhocFvEcd5Lzn6SnCY/rQh
o4VukNKUtDE92noL62BZGgCc9ek3/YZivMMFfmQOrjKjfQxEMwSvyjKWIj9XPATT
FWkN7mZn7JSxluB1n1BtHTYvdieD65RGWF1aW1atJZ1r0z0xzy95LF/kqm9Rkigr
zv6+1j9ND79pBikYuq4ol916R9bW93TkriSYbENBP5dtMx4nXUYI6kRU53Y1UxZH
+mstouIpGr9MB7kHLLDYFFFV8upRJi2EQHA2sAtHOaMmjZUoJrbsIdA9n1r1LsKt
6Wjq8MofdyvZWXbzh9aGD4QkovNf6tw4UHAGzVf9Uh3n6mwh7yw4qyDC5h8dqMEe
P4wWkRawtHc=
=GC9p
-----END PGP SIGNATURE-----

2004-09-04 00:15:17

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Kleikamp wrote:
| On Thu, 2004-09-02 at 19:39, Spam wrote:
|
|>
|>
|>
|>>On Fri, 3 Sep 2004, Spam wrote:
|>
|>>> Yes, some archive types can't be partially unzipped either. But my
|>>> point is that it wouldn't be transparent to the application/user in
|>>> the same way.
|>
|>>It doesnt matter whether it is transparent to the application. It can
|>>be the application which implements the required level of
|>>transparency.
|>
|>>User doesnt care what provides the transparency or how it's
|>>implemented.
|>
|> Indeed. I hope I didn't say otherwise :). Just that I think it will
|> be very difficult to have this transparency in all apps.
|
|
| You're missing the point. We don't need transparency in all apps. You
| can write an application to be as transparent as you want, but you don't
| need every app to understand every file format.

You don't need every app to understand every filesystem, either. Come
to think of it, why would I want anything other than CD burning software
and file browsers to understand a CD drive?

Let's all just wait -- quietly -- for the powers that be to work out
whether it's feasible and sane to support it in all apps. Because
wouldn't that be better?

|> Just
|> thinking of "nano file.jpg/description.txt" or "ls
|> file.tar/untar/*.doc".
|
|
| I don't do much image editing, but I'm sure there are applications that
| let you edit the description in a text file. You can even create a
| script that extracts it, runs nano, and puts it back into the jpeg.

That is a PITA. Because you'll need to make more scripts. And still more.

Say you want to grep for a jpeg with a particular description. Say you
want to copy the description from one jpeg into another.

Maybe you could make a general-purpose command, like "jpeg_run", which
runs a command on the jpeg description, but it'd still be hackish, slow,
and more to type and remember. Consider that you might have no idea
what "jpeg_run" is called, but you can always do "ls file.jpeg/metas" to
find out how to edit it.
|
| This works for me:
| tar -tf file.tar | grep '\.doc'

And then you need to run "tar -tf" a minute later, this time looking for
*.xls. Maybe file.tar is actually a several gigabyte file.tar.bz2.
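
The cost being described is the repeated scan. A userspace approximation is to cache the member list and rescan only when the archive changes, as in the sketch below. The cache layout, the function names and the use of (mtime, size) as the validity check are assumptions made for this example; a per-process cache like this is also exactly what is not shared between tools, which is the gap a coherent filesystem-level cache would fill.

```python
import os
import tarfile

_index_cache = {}        # path -> ((mtime_ns, size), [member names])
stats = {"scans": 0}     # counts full archive scans, for illustration


def list_members(path):
    """Return member names, rescanning the archive only if it changed."""
    st = os.stat(path)
    key = (st.st_mtime_ns, st.st_size)
    cached = _index_cache.get(path)
    if cached is not None and cached[0] == key:
        return cached[1]
    # "r:*" lets tarfile detect .tar, .tar.gz and .tar.bz2 transparently.
    with tarfile.open(path, "r:*") as tar:
        names = tar.getnames()
    stats["scans"] += 1
    _index_cache[path] = (key, names)
    return names


def find_suffix(path, suffix):
    """Rough equivalent of: tar -tf path | grep 'suffix$'"""
    return [n for n in list_members(path) if n.endswith(suffix)]
```

With this, the second query a minute later (for *.xls) is answered from memory instead of decompressing the whole archive again.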

|> Sure in some environments like Gnome it could
|> work, but it still doesn't for the rest of the flora of Linux
|> programs.
|
|
| Just choose the right program. tar groks tar files, not ls.

tar groks tar. bzip2 groks bz2. gzip groks gz. "mount -o loop" groks
images. zip groks zip. rar groks rar. openoffice groks .sxw, unless
staroffice does. xmms can read id3 info from mp3s. gcc compiles C.

If the file is an object, it's easier. You can ask the file what it is,
and what you can do with it. And you can tell it to do certain standard
things. You can do all of this while reading almost no documentation on
what the file is. At the end of the day, it won't matter to you whether
a file is zip, rar, or tar -- it's an archive, and you can extract it by
copying files out of its /contents directory.
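
As a rough userspace analogy for "it's an archive, whatever the format", the sketch below hides the zip/tar distinction behind one function. The name contents() is invented here to echo the /contents directory idea and is not an existing API; the format checks use Python's standard zipfile and tarfile modules.

```python
import tarfile
import zipfile


def contents(path):
    """Return member names regardless of which archive format path is."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return sorted(zf.namelist())
    if tarfile.is_tarfile(path):
        # "r:*" also covers gzip- and bzip2-compressed tarballs.
        with tarfile.open(path, "r:*") as tar:
            return sorted(tar.getnames())
    raise ValueError("unsupported archive format: %s" % path)
```

The caller never has to know which tool "groks" the file, but every program would have to link against something like this, which is the objection to doing it purely in userspace.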

Your way (the traditional way) means you have to learn which program
is the right program and how to use it, and you'll have to remember it
constantly.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTkIq3gHNmZLgCUhAQJaNRAAnp/o0Wr+SJK5tkgYFCuc+CKIP+eALJo/
9wHnqK0nLdnji+vG0Czd9TUj1vWtoMrUichAwFoguMHHg3VeZGu61YwZoZ4idLNM
QbZ+CuQdUygNmyT0byGMFemP+cSbyvff1PRMy2BlSHKW3gUhvnQyggLGVKxpMRWf
VNnEHvkvJeA9PpEm6QGi1VRNp5bc0+Ocl4kO4CJk5ZYZ9D+BV6NwN/MZwqwlsu+Y
RZsYYEa6mLiCnU4rEo0tEAvvwMdC0e/9s9TQMcmJbT6JnybkWIRFMrTS4pmSabKg
uYGG9p1WrX8/V8WRNnaodlvx35gRPQj5S5SWDBoSWr999nmq31Y5RZ9QwbVj0d+U
yB4yNu0NpvFBJfwg8nVIKUa+bhPLCkdY5w+GnlEYGweSN20FYOaLiqiPtXBqJm8a
PP+8OCL35zy+1X7t/tq+JG/K91fYbECPR/qrAyHDXzNSuxdqidvBjfsEPBHOGwu6
RpsPoKyEvkzdXgauotWbVzWLt1ijGmYd/8Uk19OnmiloggViQhUWAJTIVGcvMNt8
Tnk8ESQMzGHbbVOIu3gB6DZD6P0IEEN4L6+gOZWoYqe2nvaYKvIL1My/i4BMU5Ml
cxDkzzeq58iVK07Has8lwgFp0U9iD1LrbIT5p0ZJn/52EXqlX4dJCYSSNaHzUqE8
hcLG4aDELEI=
=Wcyv
-----END PGP SIGNATURE-----

2004-09-04 00:26:17

by David Masover

Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Kleikamp wrote:
| On Thu, 2004-09-02 at 19:25, David Masover wrote:
|
|>Dave Kleikamp wrote:
|>[...]
|>| Please don't tell me that we have expectations to run make from within a
|>| tar file. This is getting silly. tar does a pretty good job of
|>| extracting files into real directories, and putting them back into an
|>| archive. I don't see a need to teach the kernel how to deal with
|>| compound files when user space can do it very easily.
|>
|>Suppose I've got a tar file with an index attached. Suppose it's
|>something like /usr/src/linux. Am I expected to extract all code for
|>all architectures, with all drivers, all docs, etc? Now, yes -- or I
|>have to figure out exactly which ones I need before I extract them
|>manually, one by one.
|
|
| I don't think it's unreasonable to expect someone to either extract the
| whole tar file, or identify what files they want from it. If you think
| there is too much in the tar file, roll your own with only the files you
| need.

That is not a solution, that is exactly the opposite of a solution.
"Roll my own" implies that I must download the tar file, extract the
whole thing, grab what I need, and then tar it up again. Then I extract
it again. Doh!

More seriously, suppose it's a format like zip, where I don't have to
decompress the whole file to get a listing. And suppose I want to
compile some little example from it, but the file itself is huge -- half
a gig, say, and I only need ten megabytes.

But the only way I'm going to find out _which_ ten megabytes I need is
to extract the Makefile, read it, then go find all the other little
Makefiles, not to mention the configure script, the header files, and so
on...

But suppose that make can _transparently_ extract _only_ the files I
need for this?
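
The zip case can be illustrated in userspace. Zip keeps a central directory, so listing is cheap and a single member can be decompressed without touching the rest. The sketch below builds a small archive in memory purely for demonstration; the member names are invented, and it shows the access pattern a transparent layer would perform on behalf of make, not an existing kernel or make feature.

```python
import io
import zipfile


def build_demo_zip():
    """Create a small in-memory archive with one bulky member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("example/Makefile", "all: hello\n")
        zf.writestr("example/hello.c", "int main(void) { return 0; }\n")
        zf.writestr("docs/huge-manual.txt", "x" * 100000)
    return buf


def zip_names(archive):
    """List names from the central directory without decompressing data."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


def read_member(archive, name):
    """Decompress a single member; the other members are left untouched."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name)


archive = build_demo_zip()
names = zip_names(archive)
makefile = read_member(archive, "example/Makefile")
```

Only the Makefile is decompressed here; the bulky documentation member is never read.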

|>But with tar support for make (and so on), files can be extracted on
|>demand. It's possible to do this in userspace, with named pipes, but
|>that's much slower and insanely clumsy.
|
|
| This doesn't justify bloating the kernel. untar the darn thing and user
| space does fine.

Does it really bloat the kernel? My kernel doesn't feel bloated, and
it's got reiser4 and much of what's needed for this.

Remember, saying "I can access foo.zip/contents" doesn't mean "zip is in
the kernel".

|>This has further implications -- imagine a desktop, binary distro
|>shipped with all files except the very most basic stuff as package
|>archives. They can all be extracted, on demand -- the first time I run
|>OpenOffice.org, it's installed. If there needs to be post-installation,
|>that's handled by the .deb plugin (or whatever).
|
|
| Are you saying install it on demand the first time it's run? This
| doesn't take any new kernel function.

It does for it to appear installed. And to only install the pieces that
are needed at that time.

| Or are you saying that the files are never installed on the filesystem,
| but always accessed from the package archives? In this case, why not

No, only that if I only wanted OpenOffice writer, why should I install
all of OpenOffice? Yes, it could be broken into smaller packages. What
I'm talking about is exactly the pieces of an installation that you need
- -- insanely fine granularity -- done automatically, on demand.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTkLiHgHNmZLgCUhAQK7+A//TATQvQ3U61VA/mdVqylnrkWC7tNOewwq
MqF+KKU3Cc/n54mOlGTpph2qpPJTzv1y4KlVgcDM/d0bn1cDPx41n//xEe9QXlqu
vOywb8g11HSlAhKmbl4APwCHFHr1HibHgYqM7PmeVSD+Xfy5gJvIW5Oc44f16+q9
agEXWAk0EgM0WCAKEQFxN56i8e7qHq28PPGzcpGcn08xmBmD9Ik71jjpLY88csYy
kjH32ExEy+uABq+Tglfr0EBZR4RDuqkxsei7cL3Rn58O8twJtn8UP3VcukTLciZw
jb7If3ekuO7BXYJbwB/foFEESFql68jNKH7c7+Bzeb5pREloreVine/2rRM1iekD
FUeTv78kn+6G/INl9XwUB2ER0KZOy8n2wZut35T5w94GtZgmdHpm+3mCOscS6BdG
JNx/HRGJJXfm0P/7tKbgZ/3wjQlFzbC5HcByn9Ocfm8qrNsoLxwtF/8aId/9ctD9
lEmDMUHYuVC51/m5ka+i/XUQeuzgbtY5QKoNsxWXYZfeBNQfMqfMOsVWP1wFMlpB
mPmf6w+4idp1aIYwgvPyQee1BZiXmkbncglcnY+J4y46AZ+tDEO5eqjJrEU9kA7v
NNVfCehCZ2IIo1TMHcQizsHQ53NEmZ5gwCYXvymGS1+6fyE0SqcP9EolVuUfnRN9
yNdulMJ0gDg=
=TeOM
-----END PGP SIGNATURE-----

2004-09-04 03:24:19

by Horst H. von Brand

Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Spam <[email protected]> said:
> Dave Kleikamp <[email protected]> said:
> >Spam <[email protected]> said:

[...]

> > You're missing the point. We don't need transparency in all apps. You
> > can write an application to be as transparent as you want, but you don't
> > need every app to understand every file format.

> No, but not every user "can write an application" either, or even
> have the skills to apply patches. What I was talking about wasn't
> just tar, which itself isn't the best example anyway, but the idea
> that users can load plugins that will extend the functionality of
> their filesystems. That idea seems to me to be _much_ better than
> trying to teach every user how to write applications or patch
> existing ones.

Why compare "write application or apply patches" to "load plugin"? It
would be logical to compare running applications with loading plugins (and
even so, loading plugins is presumably root-only).

[...]

> > There are userland tools that deal with hundreds of file formats. Use
> > the tool you need, rather than try to have the kernel do everything.

> No, but if I wanted to have an encryption plugin active for some of
> my files or directories then why should I not be able to? I still
> want to edit, view and save my encrypted files.

Use an editor that knows about encrypted files. Decrypt/edit/encrypt if no
other option (I'm sure emacs can be coerced to do that transparently ;-). I
for one would be _way_ more comfortable with my users doing that than them
loading random modules into the kernel. Besides, if one of my users doesn't
trust the system encryption programs, and prefers hers, she can be happy.

> Again, this was just an example of what could be done with plugins.
> It is not said that every conceivable plugin will be written, nor
> loaded per default. Even though plugins cannot today be dynamically
> used, they will be eventually. Reiser4 is still very young.

Modules of the kernel were supposed to have all those magic properties too,
until there were nasty races... and it was _seriously_ considered to take
them out. They stayed because they are root-only business, and (un)loading
is rare. FS plugins are kernel modules, AFAIU, and are subject to the same
problems.

> Please separate your thoughts for specific plugins from those of the
> idea to have plugins at all.

If you can't find concrete uses for specific plugins, they are the
proverbial solution searching for a problem.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2004-09-04 05:38:25

by David Masover

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Horst von Brand wrote:
[...]
| Use an editor that knows about encrypted files. Decrypt/edit/encrypt if no

You use emacs. I use vim. My brother uses gedit. My parents use
abiword. Perhaps I should patch them all? If that was so easy, why is
there cryptoloop/dm-crypt?

| is rare. FS plugins are kernel modules, AFAIU, and are subject to the same
| problems.

Actually, FS plugins currently cannot be modules. They are currently
called "plugins" because they share some concepts with browser plugins,
and it sounds great in marketing.

| If you can't find concrete uses for specific plugins, they are the
| proverbial solution searching for a problem.

Fine, let them be. They are just very well structured code. If you use
reiser4 and don't look in the source or the "metas" dir, they are
completely invisible to you. I used the betas for months, and "metas"
never burned me.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQTlUt3gHNmZLgCUhAQJVGhAAki5RLZckm5jiZnw7MXQLqURBvm+cz7wU
eFPfGyOE0SpFKJLFntWr0zrAgyK5ClgwqB7wsJEldWUfGBPdYpH5lroOQEHVEGBs
4X+ze/xyUOL6z3S07a85jNibYamDeoCDc5P0Vc6GWrdpsU7FQGXrHykNyglDxFJ1
MiYEQkB8NYDzQukl+7HPR3qPhQpAl5hx3XtmOcC5w0/88ATMqXg81DoVzPAPlsrL
IPu4ai7KjXRaY1sKo8SU4orj7iQHmmkiFJJg+QwVU9sO2GMBGpXZRSr3KcUL3ux5
nr+++ceVXyLADZaJRYp5LoTxL0KPJUKhaa9ABLmN2zQ5hT/v6AlQmKKD3s6ca02a
A8MQxy69hG50RVSeJm9yjRYQQBvATEXslCQXPXSAlLJGrPZ1FZgQdYyo2wNboD23
ep+JP2qTPdyTFFl2TOtoeR7fIsjg6DF5Bq0uh0maqC0UXXIo1GRO/OQGsNMfCN88
pevDg0GvE+bdeL8CEZYfDzu4aaUs+ltzZSPEKlXHCGFORL9iSuYhqdUCRPbcKYOy
uyhE7fgZoPYsOZLfChmXllEF69Cs5Vm5R0ymIgHqprfAjfqYf4ypbU0fukFDQ5dS
GGFTfxketTHdhYr7ATAfZg08ZMP819UnvcwISflLlDBwvpL4BDAWOrnDFnbHSugk
lPAxyPnfECM=
=h24Y
-----END PGP SIGNATURE-----

2004-09-04 06:22:12

by Hans Reiser

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

David Masover wrote:

> I used the betas for months, and "metas"
> never burned me.

metas is much less likely than what clearcase uses to be hit
accidentally (After all these years I forget what exactly clearcase
special cases, maybe it was "@@"). When people using clearcase suffer a
namespace collision, life goes on, no big deal, they structure a
filename slightly differently and so what? I mean, just how much do we
suffer from not being able to use '/' in filenames? Every once in a
while it is annoying, but not that much....

2004-09-04 11:07:47

by Spam

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> Spam <[email protected]> said:
>> Dave Kleikamp <[email protected]> said:
>> >Spam <[email protected]> said:

> [...]

>> > You're missing the point. We don't need transparency in all apps. You
>> > can write an application to be as transparent as you want, but you don't
>> > need every app to understand every file format.

>> No, but not every user "can write an application" either, or even
>> have the skills to apply patches. What I was talking about wasn't
>> just tar, which itself isn't the best example anyway, but the idea
>> that users can load plugins that will extend the functionality of
>> their filesystems. That idea seems to me to be _much_ better than
>> trying to teach every user how to write applications or patch
>> existing ones.

> Why compare "write application or apply patches" to "load plugin"? It
> would be logical to compare running applications with loading plugins (and
> even so, loading plugins is presumably root-only).

Doesn't matter if it is root-only. Ask the admin about the specific
plugin the user wants. Most other users can su to root and
do this on their own computers. Not everyone is using Linux in a
corporate world.

> [...]

>> > There are userland tools that deal with hundreds of file formats. Use
>> > the tool you need, rather than try to have the kernel do everything.

>> No, but if I wanted to have an encryption plugin active for some of
>> my files or directories then why should I not be able to? I still
>> want to edit, view and save my encrypted files.

> Use an editor that knows about encrypted files. Decrypt/edit/encrypt if no
> other option (I'm sure emacs can be coerced to do that transparently ;-). I
> for one would be _way_ more comfortable with my users doing that than them
> loading random modules into the kernel. Besides, if one of my users doesn't
> trust the system encryption programs, and prefers hers, she can be happy.

That just doesn't do it. I doubt there will be an option to save
encrypted Word and Excel files and be able to open them in Abiword,
StarOffice or OpenOffice unless the decryption/encryption is done on
a lower level.

Also, eventually there may be a userland interface for loading
certain modules without root access.

>> Again, this was just an example of what could be done with plugins.
>> It is not said that every conceivable plugin will be written, nor
>> loaded per default. Even though plugins cannot today be dynamically
>> used, they will be eventually. Reiser4 is still very young.

> Modules of the kernel were supposed to have all those magic properties too,
> until there were nasty races... and it was _seriously_ considered to take
> them out. They stayed because they are root-only business, and (un)loading
> is rare. FS plugins are kernel modules, AFAIU, and are subject to the same
> problems.

>> Please separate your thoughts for specific plugins from those of the
>> idea to have plugins at all.

> If you can't find concrete uses for specific plugins, they are the
> proverbial solution searching for a problem.

To me there are fine and good uses for plugins. Everyone seems to
think differently, though.

Sure many things can perhaps be done by linking and piping lots of
programs, just to access some data. But that certainly is far from
user friendly.

~S


2004-09-04 11:48:45

by Stephan von Krawczynski

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

On Fri, 03 Sep 2004 19:13:31 -0500
David Masover <[email protected]> wrote:

> | Just choose the right program. tar groks tar files, not ls.
>
> tar groks tar. bzip2 groks bz2. gzip groks gz. "mount -o loop" groks
> images. zip groks zip. rar groks rar. openoffice groks .sxw, unless
> staroffice does. xmms can read id3 info from mp3s. gcc compiles C.
>
> If the file is an object, it's easier. You can ask the file what it is,
> and what you can do with it. And you can tell it to do certain standard
> things. You can do all of this while reading almost no documentation on
> what the file is. At the end of the day, it won't matter to you whether
> a file is zip, rar, or tar -- it's an archive, and you can extract it by
> copying files out of its /contents directory.
>
> Your way (the traditional way) means you have to learn to use which
> program is the right program and how to use that program, and you'll
> have to remember it constantly.

Just a short input from someone listening quite a while to this thread:
I think your approach to the problem leads actually nowhere. The reason for
this is implicit in your own explanation. _Currently_ you talk of archives, but
as soon as you got that, what's next?
Your idea needs abstraction, then you see the problem more clearly:
An archive is only some sort of file type, just as (you already named) mp3 or
.sxw is another type. If you really want to do something _generally_ useful you
should think of a method to parse and use all kinds of filetypes and create an
interface for that.
And one thing is clear: as there are numerous different types you cannot pull
all that code inside the kernel. Obviously there should be a way some
application can install a hook, a helper, a plugin or whatever to provide
extended functionality on its special filetypes. If you don't want to use tar,
you don't need the plugin either.
If you want tar, you should (as a user) be able to install the "fs-plugin"
(just a name, do not shoot me for it) together with tar as an application. You
get the idea? Obviously this must be possible during runtime, everything else
is senseless.
So, please, if you go on investigating this whole stuff do it in a more general
way, because "tar inside kernel" is not really what your idea is all about, no?
And one last obvious thing: if one doesn't want this kind of file-handling, he
should not need it. It has to be an add-on functionality, not a replacement.

Regards,
Stephan

2004-09-04 18:27:26

by David Masover

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Stephan von Krawczynski wrote:
| On Fri, 03 Sep 2004 19:13:31 -0500
| David Masover <[email protected]> wrote:
|
|
|>| Just choose the right program. tar groks tar files, not ls.
|>
|>tar groks tar. bzip2 groks bz2. gzip groks gz. "mount -o loop" groks
|>images. zip groks zip. rar groks rar. openoffice groks .sxw, unless
|>staroffice does. xmms can read id3 info from mp3s. gcc compiles C.
|>
|>If the file is an object, it's easier. You can ask the file what it is,
|>and what you can do with it. And you can tell it to do certain standard
|>things. You can do all of this while reading almost no documentation on
|>what the file is. At the end of the day, it won't matter to you whether
|>a file is zip, rar, or tar -- it's an archive, and you can extract it by
|>copying files out of its /contents directory.
|>
|>Your way (the traditional way) means you have to learn to use which
|>program is the right program and how to use that program, and you'll
|>have to remember it constantly.
|
|
| Just a short input from someone listening quite a while to this thread:
| I think your approach to the problem leads actually nowhere. The reason for
| this is implicit in your own explanation. _Currently_ you talk of archives, but
| as soon as you got that, what's next?

Anything sane. Currently I talk of archives, because I need examples,
and archives have been a good example for a while of something that can
gain things from plugins which they cannot get anywhere else, which have
implications immediately after the archive plugin is written, without
requiring any application support.

| Your idea needs abstraction, then you see the problem more clearly:
| An archive is only some sort of file type, just as (you already named) mp3 or
| .sxw is another type. If you really want to do something _generally_ useful you
| should think of a method to parse and use all kinds of filetypes and create an
| interface for that.

Fine. It'd certainly be nice to use vim to edit an id3 tag, or a text
portion of a .sxw document (openoffice is HUGE compared to vim).

| And one thing is clear: as there are numerous different types you cannot pull
| all that code inside the kernel. Obviously there should be a way some

I never said it should all go in the kernel. I repeatedly said that
most of it should not -- all that goes in the kernel is the most generic
interface that is sane. Obviously an archive plugin is different than a
compression plugin, but once you've got the archive plugin, the rest
(individual format support) can go in userland.

| application can install a hook, a helper, a plugin or whatever to provide
| extended functionality on its special filetypes. If you don't want to use tar,
| you don't need the plugin either.

So you disable it at compile time. I don't want to use XFS, and I
certainly don't want to use ATI's video drivers. The latter I choose
not to download, and the former I choose not to activate.

| If you want tar, you should (as a user) be able to install the "fs-plugin"

If by "as a user" you mean "not root", I have to disagree. I think it'd
have security implications. It'd be great if I'm wrong.

| (just a name, do not shoot me for it) together with tar as an application. You
| get the idea? Obviously this must be possible during runtime, everything else
| is senseless.

I would love for this same attitude of "runtime or bust" to be more
common. I would love for there to be a generic way to write a userland
fs plugin, just as there's a generic way to write a userland filesystem
(lufs). Just understand that some plugins will be in userland, and some
will be in the kernel.

And, in fact, the part of the archive plugin that knows about
tar/zip/rar/whatever goes in userspace. But the part that does caching
probably goes in kernel space, even if it becomes a more generic
"caching plugin".

| So, please, if you go on investigating this whole stuff do it in a more general
| way, because "tar inside kernel" is not really what your idea is all about, no?

Absolutely not.

| And one last obvious thing: if one doesn't want this kind of file-handling, he
| should not need it. It has to be an add-on functionality, not a replacement.

For a while, yes. But I can imagine that sometime in the future, an app
would need this kind of functionality. And why not? Is it so much
worse for an app to insist on "mp3 metadata plugin" than to insist on
"libid3"?

In any case, look at the current implementation. I can do
echo 750 > file/metas/rwx
but I can also use chmod, which is currently much more efficient. I
think Hans wants to eventually create an efficient interface to access
lots of small files at once (or maybe this is done?) -- and his dream is
to see things like chmod replaced with a call to this interface. But at
best, chmod will become a library call sometime years from now. It will
never go away entirely.
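
A rough C sketch of the two routes just described, for comparison only.
set_mode_chmod() is the plain chmod(2) call. set_mode_metas() is a
hypothetical helper (the name is made up here) that does what the echo
above does: it writes the octal mode to <file>/metas/rwx, which is
assumed to exist only on a reiser4 mount that exposes the metas
directory. This illustrates the difference in cost; it is not reiser4 code.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Traditional route: a single system call. */
int set_mode_chmod(const char *path, mode_t mode)
{
    return chmod(path, mode);
}

/*
 * File-as-directory route (hypothetical helper): open, write, close.
 * Returns -1 on any filesystem that has no metas/rwx pseudo-file.
 */
int set_mode_metas(const char *path, mode_t mode)
{
    char name[4096], text[16];
    int fd, len;
    ssize_t n;

    len = snprintf(name, sizeof(name), "%s/metas/rwx", path);
    if (len < 0 || (size_t)len >= sizeof(name))
        return -1;
    fd = open(name, O_WRONLY);
    if (fd < 0)
        return -1;
    /* Same text that "echo 750" would send: octal digits plus newline. */
    len = snprintf(text, sizeof(text), "%o\n", (unsigned)(mode & 07777));
    n = write(fd, text, (size_t)len);
    close(fd);
    return n == len ? 0 : -1;
}
```

The second route needs an open, a write and a close plus string
formatting where chmod needs one call, which is one plausible reading
of "much more efficient" above.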
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQToI+3gHNmZLgCUhAQKEWg/9F0FZGxB2yXoo9l5UKI7Fvh19RlnpJvZE
D3DajNqxXZ5smYIba7QxATEwMLWNMwOdiJTDIcQ7jylbD3hTkp0uFQLXcS5FeWt5
yxA0oDYEHsW14aQov4pHGplHVvCHSkm6vJn/zU91QdJAZVYnH4na5NFuWsAIoUWi
5vigYTefavabFv0/RnyPCN6zFfn3hrjOnKqSELUXXleG/fV+XJVw3Nyg4DSe1hK+
xmXHFWkGC832tH052EH/CFlE17/K05QdpfBKYNzCFA5dvTtCs11OQZSggtGVipmn
cSycyDHtJraMdF78Qa8LZzjGiZbT/3befM6wrnekcwhjxCqok8CSbOvbVh2ENtiu
klNtHeQEEtm+4MhbEEkdiiuWaU7EHIFJFGYC6LJE0B4DAKDdi4TM9QbqxLW8HghQ
WUJfMh8H2FtqotjMsdDW0OCmSuzh0B4ZZ9EXvSMIttXBJ7T2+q442hm4PAFQWXSO
FPCAiUgGZ+ZCbyZI1/pjM+CGblMTCfCBOm8UI3MSjTTI7cVzWFrQHJSAUhCPIsxA
9RZlEvmg+ElbnGrJkpDggpToDFN5uHXj41r4CZBnXgjfUdA0jmHnWpzO29Smfu7i
cJPtN+LzR1Se+fz4IvYamWd1p/Mi3vQACPt4r3IGVKa27tYXzDw+CNa+CliHnczt
lAhISMnhoVo=
=7rV6
-----END PGP SIGNATURE-----

2004-09-05 07:27:51

by Stephen Rothwell

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, 2 Sep 2004 13:22:41 -0700 (PDT) Linus Torvalds <[email protected]> wrote:
>
> Well, dnotify() really _is_ inotify(), since it does actually work on
> inodes, not dentries.

The "d" stands for directory not dentry :-)

> I think what they are really complaining about is that dnotify() only
> notifies the _directory_ when a file is changed, and they'd like it to
> notify the file itself too. Which is a one-liner, really.

I don't think so, since this notify will only happen if the process has
registered for the notification and there is no way to register unless the
file is a directory ...

> Does the following make sense? (Totally untested, use-at-your-own-risk,
> I've-never-actually-used-dnotify-in-user-space, whatever).

I had intended to extend dnotify to do file notifies, but I think the
real killer is needing to keep the file open that you want to be
notified about when you want to be notified about lots of files ...

I think that is what inotify was trying to fix (but I haven't had a chance
to look at it recently). It reminds me of omirr that we had many years
ago - I wonder what happened to it?
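
For readers who have not used dnotify from user space, here is a
minimal sketch of the registration step being discussed, assuming Linux
with glibc. dnotify_watch() is a name invented for this example;
fcntl(F_NOTIFY) and the DN_* flags are the real interface. It shows both
points made above: the request is refused for anything that is not a
directory, and the descriptor has to stay open for the watch to live.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* F_NOTIFY and DN_* are Linux-specific */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Register for dnotify events on path.
 * Returns a descriptor that must stay open for as long as the watch
 * is wanted, or -1 with errno set. Events arrive as SIGIO by default.
 * For anything that is not a directory the kernel answers ENOTDIR.
 */
int dnotify_watch(const char *path)
{
    int fd, saved;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fcntl(fd, F_NOTIFY,
              DN_MODIFY | DN_CREATE | DN_DELETE | DN_MULTISHOT) < 0) {
        saved = errno;  /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Watching a thousand files this way would mean a thousand open
descriptors, even if the directory restriction were lifted.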

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/



2004-09-05 11:38:39

by Michelle Konzack

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

Am 2004-09-03 16:01:17, schrieb Spam:
> >>>>>> "David" == David Masover <[email protected]> writes:

> >> mount -t tarfs /some/place/on/disk/foo.tar.gz /mnt/tar
> >> cp /var/tmp/img.gif .
> >> umount /mnt/tar
>
> > Oops! Someone did a rm /some/place/on/disk/foo.tar.gz between steps
> > one and two. Now what happens? Please define those semantics...
>
> Uhm, can you delete a file (loop) that is mounted?

Not as $USER but root... :-/

> > John

Greetings
Michelle

--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)



2004-09-05 13:20:42

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Wed, Sep 01, 2004 at 01:50:56PM -0700, Linus Torvalds wrote:
> #define MAXNAME 1024
> int open_cached_view(int base_fd, char *type, char *subname)
> {
> struct stat st;
> char filename[PATH_MAX];
> char name[MAXNAME];
> int len, cachefd;
>
> if (fstat(base_fd, &st) < 0)
> return -1;
> sprintf(name, "/proc/self/fd/%d", base_fd);
> len = readlink(name, filename, sizeof(filename)-1);
> if (len < 0)
> return -1;
> filename[len] = 0;
>
> /* FIXME! Replace '/' with '#' in "type" and "subname" */
> len = snprintf(name, sizeof(name),
> "%04llx/%04llx/%s/%s/%s",
> (unsigned long long) st.st_dev,
> (unsigned long long) st.st_ino,
> type ? : "default",
> subname,
> filename);
> errno = ENAMETOOLONG;
> if (len >= sizeof(name))
> return -1;
> cachefd = open(name, O_RDONLY);
> if (cachefd >= 0) {
> /* Check mtime here - maybe we could have kernel support */
> return cachefd;
> }
> if (errno != ENOENT)
> return -1;
> /*
> .. try to generate cache file here ..
> */

That is about what I had in mind, except that one might use libmagic
instead of the type argument. The rest can be done by a corresponding
MIME plugin.
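
If the type were a MIME string such as "text/plain", the FIXME in the
quoted code becomes unavoidable, because the slash would add a
directory level to the cache path. Below is a small sketch of that one
step, under the assumption that '#' is an acceptable stand-in as the
FIXME suggests. cache_key() and flatten_component() are names invented
for this example, and the key omits the trailing file name that the
quoted snprintf() appends.

```c
#include <stdio.h>
#include <string.h>

/* Copy src into dst, turning every '/' into '#'. -1 if dst is too small. */
static int flatten_component(char *dst, size_t dstlen, const char *src)
{
    size_t i, n = strlen(src);

    if (n >= dstlen)
        return -1;
    for (i = 0; i <= n; i++)    /* <= n also copies the final NUL */
        dst[i] = (src[i] == '/') ? '#' : src[i];
    return 0;
}

/*
 * Build "<dev>/<ino>/<type>/<subname>" in the same hex layout as the
 * quoted code. A NULL type becomes "default". Returns 0 or -1.
 */
int cache_key(char *out, size_t outlen, unsigned long long dev,
              unsigned long long ino, const char *type, const char *subname)
{
    char t[256], s[256];
    int len;

    if (flatten_component(t, sizeof(t), type ? type : "default") < 0 ||
        flatten_component(s, sizeof(s), subname) < 0)
        return -1;
    len = snprintf(out, outlen, "%04llx/%04llx/%s/%s", dev, ino, t, s);
    return (len < 0 || (size_t)len >= outlen) ? -1 : 0;
}
```

With dev 0x803, inode 0x1a2b, type "text/plain" and subname "a/b" this
yields "0803/1a2b/text#plain/a#b".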

Tonnerre



2004-09-05 14:47:31

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

Salut,

On Thu, Sep 02, 2004 at 07:25:32PM -0500, David Masover wrote:
> This has further implications -- imagine a desktop, binary distro
> shipped with all files except the very most basic stuff as package
> archives. They can all be extracted, on demand -- the first time I run
> OpenOffice.org, it's installed. If there needs to be post-installation,
> that's handled by the .deb plugin (or whatever).

zsh using apt extensions does that.

Tonnerre



2004-09-06 07:53:20

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Thu, Sep 02, 2004 at 09:50:31PM +0200, Spam wrote:
> Their libraries are huge and memory hogging which so many Linux
> users just do not like.

This is rather a fud argument: both the gnome VFS code and the
KIOserver/KIOslave code aren't really large. You don't want to use
them for a busybox/tinylogin system, however.

> What if a user doesn't want KDE or Gnome? Would all files created
> with either be broken?

The files still work well, just that you can't access them over the
old fancy URL schemes.

> I doubt that something like file streams and meta-data can
> successfully be implemented purely in user-space and get the same
> support (ie be used by many programs) if this change doesn't come
> from the kernel. I just do not see it happen.

Actually, practical discordianism. If you develop a common API,
there'll always be people disagreeing.

GTK+ with all its features is just cool. Desktop warping is a really
nice thing. But there are people out there who don't want to use
it. They use QT, or even plain old Athena Widgets. So what? Will we be
implementing the X toolkits into the kernel?

In case of marketing it's up to the distributions to provide something
concise so everyone can use their programs through a coherent
namespace. (I.e. port all the apps they ship to gnome-vfs or kio).

Tonnerre



2004-09-06 07:57:40

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Thu, Sep 02, 2004 at 10:38:54PM +0200, Frank van Maarseveen wrote:
> Can it do this:
>
> cd FC2-i386-disc1.iso
> ls
>
> or this:
>
> cd /dev/sda1
> ls
> cd /dev/floppy
> ls
> cd /dev/cdrom
> ls
>
> ?

Actually I see some
small-security-hole-you-can-drive-a-big-yellow-truck-with-flashes-on-through
problem.

$ cat fs_header owner_root flags_with_suid evil_program > evil.iso
$ ls -l evil.iso/evil_program
-rwsr-xr-x 1 root root 24 2003-11-09 16:41 evil.iso/evil_program
$ evil.iso/evil_program
Mwahaha, I got root!
Copy-right violating IP... done.
Hijacking administrators wife... done.
Overwriting OS with Parachute 2010... done.

etc. pp.

Tonnerre



2004-09-06 08:07:12

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Fri, Sep 03, 2004 at 12:26:50AM +0200, Frank van Maarseveen wrote:
> Try a "make tags;grep SUPER_MAGIC tags".
> Or is it there for a different purpose?

Problem is:

There are cool superblock magics for reiserfs. And for ext[23]. And
even for good old Minix. Cool.

However, there are also ugly file systems, such as fat for
example. Fat has been defined as "something that can be read by a fat
driver", which can be pretty much anything. You can't really detect
whether it is fat. Even Microsoft only guesses whether some FS is fat or
not.

And don't say detect it via a partition table. There are a lot of
cases where you get absolutely no or wrong information out of it, and
there are also lots of cases where you plainly have none.

Tonnerre



2004-09-06 08:06:49

by Spam

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> Salut,

> On Thu, Sep 02, 2004 at 09:50:31PM +0200, Spam wrote:
>> Their libraries are huge and memory hogging which so many Linux
>> users just do not like.

> This is rather a fud argument: both the gnome VFS code and the
> KIOserver/KIOslave code aren't really large. You don't want to use
> them for a busybox/tinylogin system, however.

Then it is good. Just I see no programs other than Gnome or KDE apps
that are using them.

>> What if a user doesn't want KDE or Gnome? Would all files created
>> with either be broken?

> The files still work well, just that you can't access them over the
> old fancy URL schemes.

Not only that, but if you cp them with a non-KDE/Gnome app then you will
lose the meta-data and extra info too. That was my point.

>> I doubt that something like file streams and meta-data can
>> successfully be implemented purely in user-space and get the same
>> support (ie be used by many programs) if this change doesn't come
>> from the kernel. I just do not see it happen.

> Actually, practical discordianism. If you develop a common API,
> there'll always be people disagreeing.

> GTK+ with all its features is just cool. Desktop warping is a really
> nice thing. But there are people out there who don't want to use
> it. They use QT, or even plain old Athena Widgets. So what? Will we be
> implementing the X toolkits into the kernel?

This is certainly not what I said or wanted.

> In case of marketing it's up to the distributions to provide something
> concise so everyone can use their programs through a coherent
> namespace. (I.e. port all the apps they ship to gnome-vfs or kio).

Do you really believe this will happen? Good if it did. I do not
believe it. And I certainly do not see the thousands of man hours
needed to actually provide all the patches as a benefit.

~S

> Tonnerre

2004-09-06 08:11:32

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Mon, Sep 06, 2004 at 09:56:03AM +0200, Tonnerre wrote:
>
> $ cat fs_header owner_root flags_with_suid evil_program > evil.iso
> $ ls -l evil.iso/evil_program

It should of course be equivalent to a user mount: nodev nosuid etc.
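
From user space, whether a mount already has the nosuid property asked
for here can be checked with statvfs(3). A small sketch follows;
mounted_nosuid() is a name invented for this example, and it says
nothing about how a cd-into-image feature would set the flag in the
first place.

```c
#include <sys/statvfs.h>

/*
 * Returns 1 if the filesystem holding path ignores set-user-ID bits,
 * 0 if it honours them, -1 if statvfs() fails.
 */
int mounted_nosuid(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) < 0)
        return -1;
    return (vfs.f_flag & ST_NOSUID) ? 1 : 0;
}
```

With the image treated like a user mount, mounted_nosuid() on
evil.iso/evil_program would be expected to return 1, and the s bit shown
by ls would have no effect on exec.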

--
Frank

2004-09-06 08:42:03

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Mon, Sep 06, 2004 at 10:04:24AM +0200, Tonnerre wrote:
>
> Problem is:
>
> There are cool superblock magics for reiserfs. And for ext[23]. And
> even for good old Minix. Cool.
>
> However, there are also ugly file systems, such as fat for
> example. Fat has been defined as "something that can be read by a fat

This problem is not new. Kernel probes for filesystems in a particular
order for mounting the root fs. And mount understands the fstype=auto in
/etc/fstab. There is no perfect solution but it's sure possible to come
up with something acceptable and workable with not too much effort, for
a configurable subset of file-systems. This is not that much different
from an automounter/usermount mounting a USB storage device or cdrom:
ext3, ext2, udf, iso9660, vfat, read-only or not, just to name a few
things. This should work anyway.
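
A sketch of what such ordered probing looks like, reduced to two
formats with well-known magics. probe_fs() is a made-up name and this is
not the kernel's or mount's actual code. The offsets are the standard
ones: the ext2/ext3 superblock starts at byte 1024 and carries the
little-endian magic 0xEF53 at offset 56, and ISO 9660 has "CD001" one
byte into sector 16 (byte 32769 with 2048-byte sectors).

```c
#include <stddef.h>
#include <string.h>

/*
 * Guess a filesystem type from the first bytes of an image.
 * Checks run in a fixed order and the first match wins.
 * Returns a static name, or NULL when no known magic matches,
 * which is the expected outcome for FAT as discussed above.
 */
const char *probe_fs(const unsigned char *img, size_t len)
{
    /* ext2/ext3: s_magic 0xEF53, stored little-endian */
    if (len >= 1024 + 58 &&
        img[1024 + 56] == 0x53 && img[1024 + 57] == 0xEF)
        return "ext2/ext3";
    /* ISO 9660: standard identifier in the first volume descriptor */
    if (len >= 32769 + 5 && memcmp(img + 32769, "CD001", 5) == 0)
        return "iso9660";
    return NULL;
}
```

A "configurable subset" would then amount to choosing which checks are
in the list and in which order, with the magic-less formats tried last
or not at all.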

--
Frank

2004-09-06 08:57:51

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Mon, Sep 06, 2004 at 10:05:34AM +0200, Spam wrote:
> Then it is good. Just I see no programs other than Gnome or KDE apps
> that are using them.

Because KDE people hate Gnome people and vice versa, and because the
rest of the world just neglects the two races for political reasons.

Maybe the Freedesktop project should provide some convenient
specification/code to do it. Like they do for HAL and DBUS (Please
note that this is something interesting because it does clever things
on hardware without requiring to patch the kernel.)

> > In case of marketing it's up to the distributions to provide
> > something concise so everyone can use their programs through a
> > coherent namespace. (I.e. port all the apps they ship to gnome-vfs
> > or kio).
>
> Do you really believe this will happen?

If the distributors really want to be able to gain money, and if the
Free Unix community wants to gain a significant market share, this is
supposed to happen. It's the question of whether we can ignore our
childish concept wars, or if we're always going to stay at that low
level we're at now.

Actually, this can't be fixed by putting everything into the kernel.

Tonnerre



2004-09-06 09:03:08

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

Salut,

On Fri, Sep 03, 2004 at 04:01:17PM +0200, Spam wrote:
> >> mount -t tarfs /some/place/on/disk/foo.tar.gz /mnt/tar
> >> cp /var/tmp/img.gif .
> >> umount /mnt/tar
>
> > Oops! Someone did a rm /some/place/on/disk/foo.tar.gz between steps
> > one and two. Now what happens? Please define those semantics...
>
> Uhm, can you delete a file (loop) that is mounted?

"Text file busy"

But for files this isn't true.

Currently, we have another method assuring data coherency: we remove
the inode only when the last reference goes away.
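
The behaviour for plain files can be seen with a few lines of C. This
is a minimal sketch with an invented function name: it creates a file,
unlinks it while the descriptor is still open, and reads the data back
through that descriptor.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Create path, write msg, unlink the name, then read the data back
 * through the still-open descriptor. Returns bytes read, or -1.
 */
ssize_t read_after_unlink(const char *path, const char *msg,
                          char *buf, size_t buflen)
{
    ssize_t n = -1;
    size_t len = strlen(msg);
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, msg, len) == (ssize_t)len &&
        unlink(path) == 0 &&            /* the name is gone, the inode is not */
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, buflen);
    close(fd);                          /* last reference dropped here */
    return n;
}
```

A tarfs mount holding foo.tar.gz open would keep the data reachable in
the same way after an rm; only the name disappears.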

Tonnerre



2004-09-06 09:21:08

by Giovanni A. Orlando

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

Tonnerre wrote:

>Salut,
>
>On Mon, Sep 06, 2004 at 10:05:34AM +0200, Spam wrote:
>
>
>>Then it is good. Just I see no programs other than Gnome or KDE apps
>>that are using them.
>>
>>
>
>Because KDE people hate Gnome people and vice versa, and because the
>rest of the world just neglects the two races for political reasons.
>
>
Hi,

This is completely wrong. KDE people do not hate GNOME people,
nor vice versa.

Some time ago, I said at a conference that the GNOME people need to
jump on the KDE wagon, so the train will move in a single direction,
and people did not approve of this comment.

More than the license, QPL, there is a developer problem.

Developing in C (Gtk, GNOME) is a lot easier than developing in C++
(Qt, KDE).

But, for the GUI, the C++ approach is a lot superior ... a lot.

The problem is this. They started developing on Gtk to create GNOME,
in C, because of a license problem.

Now they say: why do I need to change?

But they need to change, because it is superior. Everyone may of
course do what they prefer.

>Maybe the Freedesktop project should provide some convenient
>specification/code to do it. Like they do for HAL and DBUS (Please
>note that this is something interesting because it does clever things
>on hardware without requiring to patch the kernel.)
>
>
>
I don't agree with FreeDesktop.org, because it is run by a Red Hat employee
and the code is generally built on Gtk2.

... They are not neutral.

Thanks,
Giovanni

>>>In case of marketing it's up to the distributions to provide
>>>something concise so everyone can use their programs through a
>>>coherent namespace. (I.e. port all the apps they ship to gnome-vfs
>>>or kio).
>>>
>>>
>>Do you really believe this will happen?
>>
>>
>
>If the distributors really want to be able to gain money, and if the
>Free Unix community wants to gain a significant market share, this is
>supposed to happen. It's the question of whether we can ignore our
>childish concept wars, or if we're always going to stay at that low
>level we're at now.
>
>Actually, this can't be fixed by putting everything into the kernel.
>
> Tonnerre
>
>


--
Check FT Websites ...
http://www.futuretg.com - ftp://ftp.futuretg.com
http://www.FTLinuxCourse.com
http://www.FTLinuxCourse.com/Certification
http://www.rpmparadaise.org
http://GNULinuxUtilities.com
http://www.YourPersonalOperatingSystem.com

--

2004-09-06 12:44:04

by Herbert Poetzl

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Mon, Sep 06, 2004 at 10:08:45AM +0200, Frank van Maarseveen wrote:
> On Mon, Sep 06, 2004 at 09:56:03AM +0200, Tonnerre wrote:
> >
> > $ cat fs_header owner_root flags_with_suid evil_program > evil.iso
> > $ ls -l evil.iso/evil_program
>
> It should of course be equivalent to a user mount: nodev nosuid etc.

hmm, sounds reasonable, but what if root accesses it?
(or somebody with the 'right' capability)

- it might be strange if even root is not able to
open device nodes or execute files from an archive

- it might lead to interesting situations if the
archive is opened by root, but accessed by a user
(thinking of caches and such)

best,
Herbert

> --
> Frank

2004-09-06 12:55:01

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Mon, Sep 06, 2004 at 02:43:57PM +0200, Herbert Poetzl wrote:
> hmm, sounds reasonable, but what if root accesses it?
> (or somebody with the 'right' capability)
>
> - it might be strange if even root is not able to
> open device nodes or execute files from an archive

Yes, but if the file is owned by or writable for non-root then
you've got a security problem. So, unless owned by root and not
writable (readable, executable?) for anyone else, "nodev" and
"nosuid" are mandatory.
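
A minimal sketch of that rule, with assumptions stated: the function name is hypothetical, only the "writable" part of the check is implemented (the "readable, executable?" question is left open, as above), and nothing here is an existing kernel or mount(8) interface.

```python
import stat


def mandatory_mount_flags(owner_uid: int, mode: int) -> set:
    """Return the mount flags that must be forced for an archive file.

    Hypothetical policy helper: the file is trusted only if it is owned
    by root and neither group nor others can write to it.
    """
    writable_by_others = bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
    trusted = owner_uid == 0 and not writable_by_others
    return set() if trusted else {"nodev", "nosuid"}


if __name__ == "__main__":
    print(mandatory_mount_flags(0, 0o644))
    print(sorted(mandatory_mount_flags(1000, 0o644)))
```

The check looks at the archive file itself, not at who is doing the access, which is why root entering a user-owned image would still get nodev and nosuid under this rule.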

>
> - it might lead to interesting situations if the
> archive is opened by root, but accessed by an user
> (thinking of caches and such)

See the above.
Alternatively, each process could have its own vfsmount (please don't
shoot me for suggesting this ;-)

--
Frank

2004-09-06 12:56:04

by Clemens Schwaighofer

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank van Maarseveen wrote:

|>
|>We have the mount command for that. :^)
|
|
| mount is nice for root, clumsy for user. And a rather complicated
| way of accessing data the kernel has knowledge about in the first
| place. For filesystem images, cd'ing into the file is the most
| obvious concept for file-as-a-dir IMHO.

that's why we have automount.

lg, clemens
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBPF3djBz/yQjBxz8RAmVhAKCODKNKOV9V0I59SUfQ1pp+qk88xwCg6+4p
Q+JcbO+W0+6cUfSEf3z/Iwk=
=IwZQ
-----END PGP SIGNATURE-----

2004-09-06 12:57:21

by Grzegorz Jaśkiewicz

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

On Mon, 06 Sep 2004 11:15:59 +0200, Dr. Giovanni A. Orlando
<[email protected]> wrote:
> Tonnerre wrote:
> >Because KDE people hate Gnome people and vice versa, and because the
> >rest of the world just neglects the two races for political reasons.
> This is completely wrong. KDE people do not hate GNOME people, nor
> vice versa.
>
> Some time ago I said at a conference that the GNOME people need to jump
> on the KDE wagon, so that the train moves in a single direction, and
> people did not approve of this comment.
>
> More than the license (QPL), there is a developer problem.

KDE is covered with LGPL/GPL licence, so is QT, so I don't see a problem here.


> Developing in C (Gtk, GNOME) is a lot easier than developing in C++
> (Qt, KDE).
Well, some people who know C++ well will say something completely different.
Since I started QT/KDE development, my life as a developer has changed.
I am developing everything in QT, as far as desktop stuff goes. And
it's much clearer and easier.
You just need to learn C++ well ;)

> But, for the GUI, the C++ approach is a lot better ... a lot.
>
> The problem is this. They started developing on Gtk to create GNOME, in C,
> because of a license problem.
> I don't agree with FreeDesktop.org, because it is run by a Red Hat employee
> and the code is generally built on Gtk2.
>
> ... They are not neutral.
Agreed.

--
GJ

2004-09-06 13:01:09

by Spam

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives




> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1

> Frank van Maarseveen wrote:

|>>
|>>We have the mount command for that. :^)
> |
> |
> | mount is nice for root, clumsy for user. And a rather complicated
> | way of accessing data the kernel has knowledge about in the first
> | place. For filesystem images, cd'ing into the file is the most
> | obvious concept for file-as-a-dir IMHO.

> thats why we have automount.

Which still needs to be set up in fstab, right?

> lg, clemens
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.5 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

> iD8DBQFBPF3djBz/yQjBxz8RAmVhAKCODKNKOV9V0I59SUfQ1pp+qk88xwCg6+4p
> Q+JcbO+W0+6cUfSEf3z/Iwk=
> =IwZQ
> -----END PGP SIGNATURE-----

2004-09-06 13:04:14

by Spam

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives




> On Mon, Sep 06, 2004 at 09:59:25PM +0900, Clemens Schwaighofer wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Spam wrote:
>>
>> | thats why we have automount.
>> |
>> |
>> |> Which still needs to be setup in fstab, right?
>>
>> no, in /etc/automount* actually. I have no fstab entry here and still I
>> can mount various samba shares, cdroms, etc ...

> but can you access images that way?

Probably, but only predefined ones.. IE, no random automount?

~S


2004-09-06 13:04:02

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

On Mon, Sep 06, 2004 at 09:59:25PM +0900, Clemens Schwaighofer wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Spam wrote:
>
> | thats why we have automount.
> |
> |
> |> Which still needs to be setup in fstab, right?
>
> no, in /etc/automount* actually. I have no fstab entry here and still I
> can mount various samba shares, cdroms, etc ...

but can you access images that way?

--
Frank

2004-09-06 13:04:02

by Clemens Schwaighofer

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Spam wrote:

| thats why we have automount.
|
|
|> Which still needs to be setup in fstab, right?

no, in /etc/automount* actually. I have no fstab entry here and still I
can mount various samba shares, cdroms, etc ...

lg, clemens

even my usbhd is automount :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBPF8tjBz/yQjBxz8RApB5AKDOjvCM2foy0lTXAy3IOVXXxdBcUACgxUyn
1WlAMXdnleNZOTOwvVRQCz0=
=LX+8
-----END PGP SIGNATURE-----

2004-09-06 13:20:42

by Clemens Schwaighofer

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank van Maarseveen wrote:
| On Mon, Sep 06, 2004 at 09:59:25PM +0900, Clemens Schwaighofer wrote:
|
|>-----BEGIN PGP SIGNED MESSAGE-----
|>Hash: SHA1
|>
|>Spam wrote:
|>
|>| thats why we have automount.
|>|
|>|
|>|> Which still needs to be setup in fstab, right?
|>
|>no, in /etc/automount* actually. I have no fstab entry here and still I
|>can mount various samba shares, cdroms, etc ...
|
|
| but can you access images that way?
|

in theory yes, but only predefined ones. But I don't see automount as
the tool for this; it is more for cdroms and other removable media.

I haven't had the need to mount an iso for quite some time, so I don't
know if I need root access for that, but I think yes, like for all mount
things.

lg, clemens
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBPGNgjBz/yQjBxz8RAsm7AJsEwa18ymMBWKf1rJ2PAwPOfCMIJgCglaNS
d8OqXmD0jsYezcE8u7BGrkM=
=ctEE
-----END PGP SIGNATURE-----

2004-09-06 13:47:49

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

Salut,

On Mon, Sep 06, 2004 at 09:53:49PM +0900, Clemens Schwaighofer wrote:
> thats why we have automount.

Plus the HAL automounting code which dynamically generates mountpoints
for HAL detected devices...

Tonnerre



2004-09-06 14:45:38

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Fri, Sep 03, 2004 at 01:43:02AM +0200, Spam wrote:
> Yes why not? If there was any filesystem drivers for the AudioCD
> format then it could.
>
> I had such a driver for Windows 9x which would display several
> folders and files for inserted AudioCD's:
>
> D: (cdrom)
> Stereo
> 22050
> Track01.wav
> Track02.wav
> ...
> 44100
> Track01.wav
> ...
> Mono
> 22050
> Track01.wav
> ...
> 44100
> Track01.wav
> ...

So you'd like the kernel to know about raw CD PCM and RIFF PCM format
and conversion? Great.. That's really Solarisish!

Tonnerre



2004-09-06 14:59:40

by Spam

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> Salut,

> On Fri, Sep 03, 2004 at 01:43:02AM +0200, Spam wrote:
>> Yes why not? If there was any filesystem drivers for the AudioCD
>> format then it could.
>>
>> I had such a driver for Windows 9x which would display several
>> folders and files for inserted AudioCD's:
>>
>> D: (cdrom)
>> Stereo
>> 22050
>> Track01.wav
>> Track02.wav
>> ...
>> 44100
>> Track01.wav
>> ...
>> Mono
>> 22050
>> Track01.wav
>> ...
>> 44100
>> Track01.wav
>> ...

> So you'd like the kernel to know about raw CD PCM and RIFF PCM format
> and conversion? Great.. That's really Solarisish!

If it could be done in userland, and still be accessible for most
applications then that is great.

If the user thinks that such a plugin is of value to him then I do
not see why he should not be able to use it. Again, these are examples
of stuff that could be done if there was a plugin/extensible
interface.

You make it sound like everything on the planet should be predefined
and included in the kernel. This is not what I was saying or wanted.

From what I understood from Hans, there will be a way to load
plugins without having to recompile reiser4 module (or the kernel).
This is what I'd like to see.

Some things/plugins may even be partially user-space.

~S



> Tonnerre


2004-09-06 15:08:58

by Tonnerre

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Salut,

On Fri, Sep 03, 2004 at 02:39:06AM +0200, Spam wrote:
> "nano file.jpg/description.txt"

Is that supposed to do steganography?

Tonnerre



2004-09-06 15:16:05

by Spam

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)




> Salut,

> On Fri, Sep 03, 2004 at 02:39:06AM +0200, Spam wrote:
>> "nano file.jpg/description.txt"

> Is that supposed to do steganography?

No, not at all. I already use similar things in Windows, which is why
I suggested it here. Having a separate file for descriptions is not
as convenient, especially if you have a few hundred JPEGs and want
to quickly view some properties of the one you have just looked at.

~S

> Tonnerre

2004-09-06 15:57:22

by Christer Weinigel

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

Jamie Lokier <[email protected]> writes:

> Christer Weinigel wrote:
> > Can be done with dnotify/inotify and a cache daemon keeping track of
> > mtime. Yes, this will need a kernel change to make sure mtime always
> > changed when the file changes, but it does not require anything else.
>
> - Can the daemon keep track of _every_ file on my disk like this?
> That's more than a million files, and about 10^5 directories.
> dnotify would require the daemon to open all the directories.
> I'm not sure what inotify offers.

I don't think dnotify/inotify handle subdirectories well yet, so I
suppose that they would have to be extended.

> - What happens at reboot - I guess the daemon has to call stat()
> on every file to verify its indexes? Have you any idea how long
> it takes to call stat() on every file in my home directory?

The daemon saves state before it shuts down and reloads the state
after a reboot. You have to make sure that it is started first and
stopped last during the boot process. How would a kernel plugin
handle things that happen before or after the plugin module has been
loaded? It's the same problem.

> - The ordering problem: I write to a file, then the program
> returns. System is very busy compiling. 2 minutes later, I
> execute a search query. The file I wrote two minutes ago doesn't
> appear in the search results. What's wrong?
>
> Due to scheduling, the daemon hasn't caught up yet. Ok, we can
> accept that's just hard life. Sometimes it takes a while for
> something I write to appear in search results.
>
> But! That means I can't use these optimised queries as drop-in
> replacements for calling grep and find, or for making Make-like
> programs run faster (by eliminating parsing and stat() calls).
> That's a shame, it would have been nice to have a mechanism that
> could transparently optimise programs that do calculations....

Sure you can. First of all, you can just wait for the daemon to
finish indexing any files that it has been notified about changes in.
This is no different from you having to wait for the kernel to finish
indexing the files. Or are you suggesting that the kernel should stop
all other processes until the indexing is done?
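
To make the "just wait for the daemon" idea concrete, here is a toy sketch. The Indexer class and its notify() method are invented stand-ins for a userspace daemon fed by dnotify/inotify events, not a real interface. The point is only that a query can block until every change the daemon has already been told about is indexed.

```python
import queue
import threading


class Indexer:
    """Toy word index fed by change notifications (hypothetical design)."""

    def __init__(self):
        self._pending = queue.Queue()
        self._index = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def notify(self, name, text):
        # Stand-in for a notification that file `name` now contains `text`.
        self._pending.put((name, text))

    def _worker(self):
        while True:
            name, text = self._pending.get()
            with self._lock:
                self._index[name] = set(text.split())
            self._pending.task_done()

    def search(self, word):
        # Barrier: block until every notified change has been indexed.
        self._pending.join()
        with self._lock:
            return sorted(n for n, words in self._index.items() if word in words)


if __name__ == "__main__":
    idx = Indexer()
    idx.notify("notes.txt", "reiser4 plugins and archives")
    print(idx.search("archives"))
```

This only gives synchronous results for changes that were notified before the query started; it says nothing about writes the daemon was never told about.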

> Do you see what I'm getting at? There's building some nice GUI
> and search engine like functionality, where changes made by one
> program _eventually_ show up in another (i.e. not synchronously).
>
> That's easy.
>
> And then there's optimising things like grep, find, perl, gcc,
> make, httpd, rsync, in a way that's semantically transparent, but
> executes faster _as if_ they had recalculated everything they
> need to every time. That's harder.

If you have a good notify, it's not harder.

> No, not 3, 4 or 6. For correct behaviour those require synchronous
> query results. Think about 6, where one important cached query is
> "what is the MD5 sum of this file", and another critical one, which
> can only work through indexing, is "give me the name of any file whose
> MD5 sum matches $A_SPECIFIC_MD5". Trusting the async results for
> those kind of queries from your daemon would occasionally result in
> data loss due to race conditions. So you wouldn't trust the async
> results, and you fail to get those CPU-saving and bandwidth-saving
> optimisations.

So how do you calculate the MD5 sum of a file that is in the process
of being modified? It's not possible to do that unless you block all
other access to that file and recalculate the MD5 sum after each
write. With a notifier that tells the daemon that it has stale data
and needs to reprocess the file, it's no different.
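
As a rough illustration of a cache that notices stale data, here is a sketch keyed on mtime and size. The Md5Cache class is hypothetical, and it is only as trustworthy as mtime itself, which is why the kernel change mentioned earlier (mtime always changing when the file changes) matters.

```python
import hashlib
import os


class Md5Cache:
    """Cache MD5 sums, recomputing when (mtime, size) changes.

    Illustrative only: a write that leaves both mtime and size
    unchanged would go unnoticed.
    """

    def __init__(self):
        self._entries = {}
        self.recomputed = 0   # how many times a digest was actually calculated

    def md5(self, path):
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]              # cached result still looks valid
        with open(path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        self.recomputed += 1
        self._entries[path] = (stamp, digest)
        return digest
```

A notifier would replace the stat() call on every lookup by dropping the cache entry when the change event arrives; the staleness check itself stays the same.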

/Christer

--
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se

2004-09-06 16:01:20

by Chris Dukes

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Mon, Sep 06, 2004 at 05:55:21PM +0200, Christer Weinigel wrote:

Could ya'll take this to slashdot for a bit just to improve the level
of commentary there?
http://slashdot.org/article.pl?sid=04/09/06/1235236

--
Chris Dukes
Warning: Do not use the reflow toaster oven to prepare foods after
it has been used for solder paste reflow.
http://www.stencilsunlimited.com/stencil_article_page5.htm

2004-09-07 21:51:53

by Bill Huey

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives (was: silent semantic changes with reiser4)

On Thu, Sep 02, 2004 at 11:48:10PM +0100, Alan Cox wrote:
> On Iau, 2004-09-02 at 22:56, Bill Huey wrote:
> > It also depends on who you ask. I can't take a lot of the mainstream
> > X folks serious since they are still using integer math as parameters
>
> The X folks know what they are doing. Modern X has a complete
> compositing model. Modern X has a superb font handling system. Nobody
> broke anything along the way. The new API's can be mixed with the old,
> there are good fallbacks for old servers.

Keith Packard knows what he's doing. The rest that don't follow his lead
on these issues are useless or detracting and redirecting resources away
from critical problems that have been solved under other, more mature,
operating systems, specifically Apple's OS X.

Notice that you didn't refute the claim that folks are still using
integer math to draw things. This is critical...

> In fact they are so good at it that most people don't notice beyond the
> fact their UI looks better than before.
>
> That is how you do change *right*

It's one way, but it's not enough of a top-level redesign, so you still
have a substandard rendering system. Notice that there wasn't a mention of
device independent rendering, WYSIWYG (what you see is what you get) in
their rasterization model, nor was there any mention of how a scalable
unified rasterization model is used in an application. OpenStep ruled during
its time and rendered and generated better PostScript than Macintosh
applications because of this unified rasterization model. Split development
and the lack of "connecting the dots" across systems is one of the reasons
why X has such poor app development. It takes way more time to do things
in this environment than in other systems.

What folks in X land are missing here is a notion of design versus piecewise
hacks to half emulate a critical function. The fault in this matter is
really spread out across multiple camps not doing their homework regarding
these matters.

Also, there's no mention of making OpenGL a first class citizen in this
system so that all compositing is hardware accelerated. What OS has this?
You guessed it: Apple's OS X has this, with full transparency and bitmap
backstore for region invalidating and redrawing. X currently has none of
this stuff. You could say that this is planned for the future, but there
has been no push for doing this in this aggressive a manner, which is a
leadership failure of that community as a whole.

Also, the notion of color space, not just RGB, but CMYK is a first class
citizen in Display Postscript. X is also missing this. These aren't
things that you can gradually add in. It's about looking at the system
from the top-down and solving these problems using other systems that.

X is at best piecewise and substandard to just about any modern rendering
system that I've used.

> > more dynamic systems. And the advent of XML (basically a primitive and
> > flat model of what Hans is doing) for .NET style systems are going to
>
> I see you don't really get XML either. XML is just an encoding. Its
> larger and prettier than ASN.1 and easier to hack about with perl. You
> can do the same thing with lisp lists for that matter.

I get it fine. What you don't get is something called Model View Controller,
in that .NET applications have the ability to take that serialized structure,
a part of their entire object store/serialization system, and transform
it into many views. The same widget data can be used not only to build HTML
for a web page, but also a standard component driven GUI in a native
windowing kit. This exploits polymorphism and is how OOP should have been
done in the first place. This use of it is very important and obviously
useful in this case.
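
A small sketch of the "one serialized structure, many views" idea, using Python's standard library rather than .NET. The XML layout, element names and both view functions are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical serialized widget data; not a real .NET format.
SERIALIZED = (
    '<form title="Login">'
    '<field name="user" label="User name"/>'
    '<field name="pw" label="Password"/>'
    '</form>'
)


def load_model(xml_text):
    """Deserialize the XML into a plain model, independent of any view."""
    root = ET.fromstring(xml_text)
    fields = [(f.get("name"), f.get("label")) for f in root.findall("field")]
    return {"title": root.get("title"), "fields": fields}


def html_view(model):
    """Render the model as an HTML form."""
    rows = "".join(
        '<label>{} <input name="{}"></label>'.format(escape(label), escape(name))
        for name, label in model["fields"]
    )
    return "<h1>{}</h1><form>{}</form>".format(escape(model["title"]), rows)


def text_view(model):
    """Render the same model as a plain-text dialog."""
    lines = [model["title"], "=" * len(model["title"])]
    lines += ["{}: [{}]".format(label, name) for name, label in model["fields"]]
    return "\n".join(lines)


if __name__ == "__main__":
    model = load_model(SERIALIZED)
    print(html_view(model))
    print(text_view(model))
```

Both views read the same model and neither knows about the other, which is the property being argued for; whether the backing store is an XML file or a filesystem object store is invisible to them.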

In MVC, all data has relationships and XML is the kind of DB format that these
systems use for object serialization. XML is the storage layer in these systems.
It's used for every object, remote or local.

This is why something like Reiser FS 4 with its persistent object store
capability may have a critical influence on how these systems are built.
These systems are returning nothing but structured storage in the form
of XML analogs.

Your Perl/Python, or whatever, example here doesn't cut it and doesn't explain
how XML based systems are being generically used in this sense. It's too
technically shallow for this context.

> > have been lost to older commericial interests (Microsoft Win32) and that
> > has wiped out the fundamental classic computer science backing this from
> > history. This simple "MP3 metadata" stuff is a very superficial example
> > of how something like this is used.
>
> The trouble with computer science is that most of it sucks in the real
> world. We don't write our OS's in Standard ML, we don't implement some
> of the provably secure capability computing models. At the end of the
> day they are neat, elegant and useless to real people.

Java and .NET are just those things that you have been talking about.
As these systems become more and more prevalent (they will), their
influence is just beginning. These systems have pretty much happened
overnight and can't be ignored by kernel folks.

> > Unix folks tend to forget that since they either have never done this
> > kind of programming or never understood why this existed in the first
> > place. It's about a top-down methodology effecting the entire design of
> > the software system, not just purity Unix. If it can be integrate
> > smoothly into the system, then it should IMO.
>
> The Unix world succeeded because Unix (at least in v7 days) was the
> other way around to every other grungy OS on the planet. It had only
> thing things it needed. I've used some of the grungy crawly horrors that
> were its rivals and there is a reason they don't exist any more.

Look, I agree with you for the most part about Unix, but these issues
have to be reexamined again after some period of time has elapsed.

I can only guess that some of these systems were, at a minimum, horribly
implemented, but I can't comment on their original intentions since
I've personally never used them.

> I would sum up the essence of the unix kernel side as
> - Does only what it must do
> - "Makes the usual easy makes the unusual possible"
> - Has an API that is small enough for developers to learn
> easily (an API so good every other OS promptly ripped it off)
> People forget the worlds of SYS$QIO, RMS, FCB's and the like
>
> Its worked remarkably well for a very very long time, and most of the
> nasties have come from people trying to break that model or not
> understanding it.

This is a difficult topic, but I don't see how folks can dismiss something
like Reiser FS semantics and the importance of systems like this for
future software design across all application boundaries.

There has to be a way of getting this stuff in and keeping the traditional
semantics in place.

I was thinking about having a dual mounted file system, one with traditional
Unix semantics with read-only permissions, but then having it also be
mounted in a special namespace directory structure (something like /proc)
that would reveal other streams for "stream aware" applications with
read/write permissions.

It might solve both problems in this case. Don't know.

I've been on vacation for the last couple of days at Burning Man, which
is why this response has been delayed. :)

bill

2004-09-08 09:52:16

by Helge Hafting

[permalink] [raw]
Subject: Re: The argument for fs assistance in handling archives

David Masover wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Helge Hafting wrote:
> [...]
> | You don't need kernel support for cd'ing into fs images.
> | You need a shell (or GUI app) that:
> | 1. notices that user tries to CD into a file, not a directory
> | 2. Attempts fs type detection and do a loop mount.
> | 3. Give error message if it wasn't a supported fs image.
>
> You can argue this about FS plugins, too. You don't need kernel support
> for cryptocompress, for example. Just have a fifo farm. Every file
> that has compression turned on is actually a named pipe, the original
> file is hidden (starts with a dot), and you have a script running to
> each fifo which runs zcat.

Eww. A shell/filemanager supporting fs images is one running program.
The fifo farm is one per file. Ouch.
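
The three shell-side steps quoted above could be sketched roughly as follows. This is an illustration under assumptions: the function names are made up, only two image types are recognised (by their well-known on-disk magic numbers), and the code returns the command it would run instead of calling mount.

```python
import os

ISO9660_MAGIC_OFFSET = 32769   # "CD001" sits one byte into sector 16 (16 * 2048 + 1)
EXT_MAGIC_OFFSET = 1080        # superblock at 1024, s_magic field at +56


def detect_fs_type(path):
    """Guess the filesystem type of an image file, or return None."""
    with open(path, "rb") as f:
        f.seek(ISO9660_MAGIC_OFFSET)
        if f.read(5) == b"CD001":
            return "iso9660"
        f.seek(EXT_MAGIC_OFFSET)
        if f.read(2) == b"\x53\xef":   # 0xEF53, little-endian (ext2/3/4 share it)
            return "ext2"
    return None


def plan_cd(path, mountpoint):
    """Return the command a shell could run for 'cd path' (hypothetical)."""
    if os.path.isdir(path):
        return ["cd", path]                       # step 1: ordinary directory
    fstype = detect_fs_type(path)                 # step 2: fs type detection
    if fstype is None:
        raise ValueError(path + ": not a supported fs image")   # step 3
    return ["mount", "-o", "loop", "-t", fstype, path, mountpoint]
```

Actually running the returned mount command still needs the usual privileges, which is the same root-versus-user question discussed elsewhere in the thread.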

>
> The kernel support (read: interface) is just cleaner in some ways.
>
> In the loop mount example, it isn't universal -- the user can't CD into
> a file if they use the wrong shell, and two different shells might make
> two different mounts. How many mounts before we run out of loop devices?

This limitation applies to the kernel interface too. But you can
configure for an arbitrary number of loop mounts anyway. The user will
have to mount using the correct tool - after that the door is opened
for anything.

>
> In the cryptocompress example, there's no caching or compress-on-flush
> optimization, and it isn't entirely transparent to anything. It doesn't
> look like a file, it looks like a fifo. ls will make it look yellow.


Even the kernel supported "cd into fs images" has problems. How should
the kernel know which files you want to support with mounting, and which
ones are the vast majority of plain files? A file manager doesn't
have to do this fully automatically; it can have a button/hotkey for
"loop mount this". So the user will have to tell it what to do, but in
a much easier way than typing "mount -o loop . . . ". The file manager
can also ask for encryption keys - something the kernel cannot do
because it doesn't know what user interface(s) is in use by the process
that attempts the "cd". There may be one interface (stdin, X), there may
be both (i.e. something started from an xterm can use both), or there may
be none (the web/ftp/file-server got a request for something inside an
encrypted fs image).

And even if the kernel can figure out the correct user interface, do we
want it to contain an X client for asking for keys? :-/

Helge Hafting