David Masover <[email protected]> said:
> Horst von Brand wrote:
> | Spam <[email protected]> said:
> |
> | [...]
> |
> |
> |> The problem with the userspace library is standardization. What
> |> would be needed is a userspace library that has an extensible plugin
> |> interface that is standardized. Otherwise we would need lots of
> |> different libraries, and I seriously doubt that 1) this will happen
> |> and 2) we get all Linux programs to be patched to use it.
> |
> |
> | What is the difference with a kernel implementation? Being in-kernel
> | will not make all the incompatible ways of doing this magically
> | vanish, nor will it give outstanding performance. Plus handling
> | and maintaining the in-kernel stuff is _much_ harder than userspace
> | libraries.
> First of all, only the interface has to be in the kernel. I haven't
> heard anyone suggest otherwise.
Right. But it is _another_ interface in the kernel, plus special userland
code supporting it.
> Second, there are quite a few things which I might want to do, which can
> be done with this interface and without patching programs,
Such as?
> but would
> require massive patches to userspace. There have been numerous examples.
Haven't seen any that made sense to me, sorry.
> There are some things which can't be solved without patching.
Maybe. Question is, is it worth it (kernel modifications + userland
support, or just userland support, or leave it alone). Sure, it might make
your particular application easier to write (at a cost for _all_ filesystem
hackers!), perhaps even a bit faster; but is _your_ particular convenience
worth the cost for _everybody_?
> Version
> control is one such thing.
bk, cvs, svn, rcs, ... are working just fine here, thank you so much. Used
to work on SunOS and Solaris, even SCO Unix (I used at least rcs and cvs
there). No Reiser4 in sight.
> But then there can be more generic patches
> -- as soon as the transaction API is done, you only have to patch apps
> to use that, and have a version control reiser4 plugin.
Again, _what_ version control exactly? Will the above packages be able to
make use of it (remember they all are cross-platform (at least cross-Unix),
and so quite unlikely to make use of a Reiser4 on Linux whackiness...)?
> | I'd go the other way around: Get userspace to agree on a common framework,
> | make it work in userspace; if (extensive, hopefully) experience shows that
> | a pure userspace solution has issues that can't be solved except by kernel
> | assistance, so be it.
> We already have such a framework -- it's called "VFS".
Right. It offers what applications need to build their own stuff. It is
minimalistic (well, sort of) and time-proven.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Horst von Brand wrote:
[...]
|>Second, there are quite a few things which I might want to do, which can
|>be done with this interface and without patching programs,
|
|
| Such as?
They've been mentioned.
|
|
|> but would
|>require massive patches to userspace. There have been numerous examples.
|
|
| Haven't seen any that made sense to me, sorry.
Sorry if they don't make sense to you, but I don't feel like discussing
them now. Either you get it or you don't, either you agree or you
don't. Read the archives.
|
|
|>There are some things which can't be solved without patching.
|
|
| Maybe. Question is, is it worth it (kernel modifications + userland
| support, or just userland support, or leave it alone). Sure, it might make
| your particular application easier to write (at a cost for _all_ filesystem
| hackers!), perhaps even a bit faster; but is _your_ particular convenience
| worth the cost for _everybody_?
There are far more userland developers than filesystem hackers. If
you're a filesystem hacker, it's easier to understand how much work it
is to do something in the filesystem/kernel, and harder to understand
how it works in userland, or for the actual user.
|
|
|> Version
|>control is one such thing.
|
|
| bk, cvs, svn, rcs, ... are working just fine here, thank you so much. Used
| to work on SunOS and Solaris, even SCO Unix (I used at least rcs and cvs
| there). No Reiser4 in sight.
Transparently?
It _works_ to do
zcat file.gz > /tmp/file
vim /tmp/file
gzip -c /tmp/file > file.gz
rm /tmp/file
You can even do that as a script, call it zvim. You could even do it as
a generic script, where "vim" is replaced with "$1". But is it as
elegant as transparently compressed files?
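For what it's worth, the generic wrapper described above (decompress to
a temporary file, run some program on it, recompress, clean up) could be
sketched in Python roughly like this. It is only an illustration: the
name edit_compressed is made up, and "command" stands in for the "$1" of
the shell version.

```python
import gzip
import os
import shutil
import subprocess
import tempfile


def edit_compressed(gz_path, command):
    """Decompress gz_path to a temp file, run command on it, recompress."""
    fd, tmp_path = tempfile.mkstemp(prefix="zedit-")
    try:
        # Step 1: zcat file.gz > /tmp/file
        with os.fdopen(fd, "wb") as dst, gzip.open(gz_path, "rb") as src:
            shutil.copyfileobj(src, dst)
        # Step 2: run the editor (or any other program) on the plain copy
        subprocess.run(list(command) + [tmp_path], check=True)
        # Step 3: gzip -c /tmp/file > file.gz
        with open(tmp_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    finally:
        # Step 4: rm /tmp/file, even if the command failed
        os.unlink(tmp_path)
```

Calling edit_compressed("file.gz", ["vim"]) would then behave like the
zvim script. Every program has to be started through the wrapper, which
is exactly the part that is not transparent.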
I'm not sure about the version control thing -- I don't think people
have hit every conceptual issue about it yet. But the point is:
Moving complexity from kernel space to user space (or the other way)
doesn't make it any less complex.
|> But then there can be more generic patches
|>-- as soon as the transaction API is done, you only have to patch apps
|>to use that, and have a version control reiser4 plugin.
|
|
| Again, _what_ version control exactly? Will the above packages be able to
| make use of it (remember they all are cross-platform (at least cross-Unix),
| and so quite unlikely to make use of a Reiser4 on Linux whackiness...)?
Probably. It wasn't specified what version control would be used --
that's currently just as abstract as what compression algorithm to use
to transparently compress files.
Maybe something new. Version control just doesn't look that complex
once you've got the interface and the data storage done. I do
incremental backups using rsync and hardlinks -- why would version
control be so much more complex than that?
Said backup system uses hardlinks, btw, which not all systems support.
And rsync is fairly broken on some platforms.
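To make the hardlink idea concrete, here is a rough Python sketch of a
snapshot step that hardlinks unchanged files to the previous snapshot
and copies only new or changed ones. It is similar in spirit to what
rsync's --link-dest option does, but it is not a description of any
particular backup setup; the function name and directory layout are
assumptions made for the example.

```python
import filecmp
import os
import shutil


def snapshot(source_dir, previous_snapshot, new_snapshot):
    """Create new_snapshot of source_dir, reusing unchanged files by hardlink."""
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        os.makedirs(os.path.join(new_snapshot, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(new_snapshot, rel, name)
            old = os.path.join(previous_snapshot, rel, name) if previous_snapshot else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the inode with the old snapshot
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Each snapshot directory then looks like a full copy, while unchanged
files cost no extra data blocks. It only works where the filesystem
supports hardlinks.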
|>| I'd go the other way around: Get userspace to agree on a common framework,
|>| make it work in userspace; if (extensive, hopefully) experience shows that
|>| a pure userspace solution has issues that can't be solved except by kernel
|>| assistance, so be it.
|
|
|>We already have such a framework -- it's called "VFS".
|
|
| Right. It offers what applications need to build their own stuff. It is
| minimalistic (well, sort of) and time-proven.
x86 assembly is minimalistic and time-proven.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iQIVAwUBQT1MGHgHNmZLgCUhAQLxoQ/+Mglz6tr0A4wQmAnP1y9q7WVqQQ8s3Cwg
b3VyOHaIIZSw3lod+bWDgE9LJJsQOxO4SThlWLyMEHipFJT0iHVIYjLSrC5e4DOb
Wl5E51yz3ZVx11mRTM7uEw+Ez6wOcaWIFLTvarDxzmDaxS6oh2ZpN2Ibnf8h/1lT
0YJnz6C964TkaA8jYIsQljIWkMMk7EGzP6UlrQOBbo4xMJNT+wIZMJWX5JAsULc2
EGzTO8dgZB0MbJ5STufS4h5tudny/lz4TO9iTv9CRGhusf2am7k7PVFPG4RcjXKU
iMgLTHKsXbDjncj39JZqwQtBZP8nuf72pMzbEe5iiYYIHGWu4Mm/JSxUmuJZ/YXX
yZIS6DKwSrKzDf/p1chvLVCaScxsaIWuetoe4ODFUoWMRUfCdxK+r7+6j9tKn+hn
LK5iYVAs3/xpU64jpBKvrKlRlISm8++GvGD+tdnZKAnetmaS0QHb2DrbwSbONvC1
4RvABUYC2IoVoDuAsueRDQxqTJjGPWn7DBvVUI5SCQZVlJPMavHmrv9qRVtNA4Y4
LBzfYK6aCprbmmX2Axs8FS4ptNGdwGpUhwuVKfSzlOgUS5gIaWlMKOGxWOTgDx7U
Q53mVhdZIinHH+/h4xBpcP3Q1fk8nTQ1gqfYVTseNlrMZvgFAax7E2VINiJWohis
KH6z9bFgkZA=
=aNM7
-----END PGP SIGNATURE-----
I just want to remind everyone that no one has been able to offer a good
way of handling attributes/streams/metafiles, other than reiser4, in which
CTRL-X CTRL-F filename within emacs to edit the stream/attribute/metafile
will work without modifying the emacs source. David Masover is right
that there are far more application writers than kernel hackers, and we
should make the kernel more complicated if it makes a few thousand apps
simpler.
On Mon, Sep 06, 2004 at 11:02:01PM -0700, Hans Reiser wrote:
> I just want to remind everyone that no one has been able to offer a good
> way of handling attributes/streams/metafiles, other than reiser4, in which
> CTRL-X CTRL-F filename within emacs to edit the stream/attribute/metafile
> will work without modifying the emacs source. David Masover is right
> that there are far more application writers than kernel hackers, and we
> should make the kernel more complicated if it makes a few thousand apps
> simpler.
This thread is getting a bit soft on technical details and/or value.
Hans, please post the results of fsstress (the bits from ext3 cvs) on
reiser4 on SMP machines of non-i386 architectures (e.g. ppc64, sparc64,
ia64) as well as the results of transferring reiser4 filesystems
between machines whose PAGE_SIZE, wordsize and endianness vary.
-- wli
David Masover <[email protected]> writes:
> |>Second, there are quite a few things which I might want to do, which can
> |>be done with this interface and without patching programs,
> | Such as?
> They've been mentioned.
> | Haven't seen any that made sense to me, sorry.
> Sorry if they don't make sense to you, but I don't feel like discussing
> them now. Either you get it or you don't, either you agree or you
> don't. Read the archives.
Great argument. Not. There has been so much shit thrown around here
that it's impossible to keep track of all the examples.
Could you please try to summarize a few of the arguments that you find
especially compelling? This thread has gotten very confused since
there are a bunch of different subjects all being intermixed here.
What are we discussing?
1. Do we want support for named streams?
I believe the answer is yes: since both NTFS and HFS (that's the
MacOS filesystem, isn't it?) support streams, we want Linux to
support them if possible.
Anyone disagreeing?
2. How do we want to expose named streams?
One suggestion is file-as-directory in some form.
Another suggestion made is to expose named streams somewhere under
/proc/self/fd.
Yet another suggestion is to use the openat(3) API from Solaris.
Some filesystems expose extra data in a special directory in the
same directory as the file, such as NetApp's .snapshot directories
or the extra directories that netatalk expects. This has the
advantage that it even works on filesystems without named-stream
support, but it has a lot of problems too.
Linux already has limited support for named streams via the xattr
interface, but it's not a good interface for people wanting to
have large files as named streams.
4. What belongs in the generic VFS, what belongs in Reiser4?
Some things reiser4 does, such as files-as-directories, need changes
to the VFS because they break assumptions that the VFS makes
(e.g. a deadlock or an oops when doing a hard link out of one).
Some other things reiser4 can do would be better if they were in
the VFS since other filesystems might want to support the same
functionality.
Or Linux may not support some of the things reiserfs does at all.
5. What belongs in the kernel, what belongs in userspace?
This is mostly what I have been trying to argue about.
So, to try to summarize my opinion regarding file-as-directory: I
believe it's fatally flawed because it breaks a lot of assumptions that
existing code makes. One example of an application that will break is
a web server that tries to filter out accesses to "bad" files:
files-as-directories suddenly means that parts of those files will be
accessible (and there are a _lot_ of CERT reports on just this kind of
problem with Windows web servers, due to access to named streams not
being restricted, or to ways of accessing files with non-canonical names
that also managed to bypass access restrictions).
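As an illustration of the kind of assumption that breaks, consider a
filter that only inspects the last path component of a request. The
sketch below is hypothetical in every detail (the suffix list, the
function names, and the "metas/contents" path are made up for the
example) and is not taken from any real web server.

```python
import posixpath

BLOCKED_SUFFIXES = (".conf", ".inc")  # hypothetical deny list


def naive_is_blocked(url_path):
    """Deny requests whose last path component looks like a protected file."""
    return posixpath.basename(url_path).endswith(BLOCKED_SUFFIXES)


def strict_is_blocked(url_path):
    """Deny requests if any path component looks like a protected file."""
    return any(part.endswith(BLOCKED_SUFFIXES) for part in url_path.split("/"))
```

With ordinary files, naive_is_blocked("/site/db.conf") is True and
nothing can be reached "below" db.conf. Once a file can also be entered
like a directory, a request such as "/site/db.conf/metas/contents" passes
the naive check, while the stricter per-component check still refuses it.
Existing filters were written under the first assumption.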
Files-as-directories also does not give us named streams on
directories. The suggestion to have dir/metas access the named
streams means that if someone already has a file named metas in a
directory that file will be lost. (Does anyone remember the
discussions about the linux kernel having a directory named "core" and
the problems this caused for some people?)
All this suggests to me that named streams must live in another
namespace than the normal one. To me, openat(3) seems like a good
choice for an API and it has the advantage that someone else, Solaris,
already has implemented it.
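For readers who have not used it: the core of openat() is that a name
is looked up relative to an already opened handle instead of the
current directory. Python exposes this through the dir_fd argument of
os.open on platforms that provide openat(). The sketch below shows only
that directory-relative lookup. On Solaris the handle would be the file
itself and the O_XATTR flag would select its attribute namespace; that
part is not attempted here, and the helper name is made up.

```python
import os


def open_relative(dir_path, name, flags=os.O_RDONLY, mode=0o600):
    """Open `name` relative to an opened directory, the way openat() does."""
    dir_fd = os.open(dir_path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # With dir_fd set, `name` is resolved against the directory handle
        # rather than against the process's current working directory.
        return os.open(name, flags, mode, dir_fd=dir_fd)
    finally:
        os.close(dir_fd)
```

The point for named streams is that the handle, not the path string,
selects the namespace, so stream names never have to share a namespace
with ordinary file names.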
Additionally, files-as-directories does not solve the problem of
"cp a b" losing named streams. There is currently no copyfile syscall
in the Linux kernel; "cp a b" essentially does "cat a >b". So unless
cp is modified we don't gain anything. If cp is modified to know
about named streams, it really does not matter if named streams are
accessed as file-as-directories, via openat(3) or via a shared library
with some other interface.
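A small sketch may make the "cp a b" point clearer. It models named
streams as files in a sidecar directory (the ".streams" name and all
helper names are invented for this example, loosely in the style of the
extra directories mentioned earlier) and compares a data-only copy with
a copy that knows about streams.

```python
import os
import shutil

SIDECAR = ".streams"  # hypothetical directory name, not an existing convention


def stream_dir(path):
    """Directory holding the named streams that belong to `path`."""
    parent, name = os.path.split(os.path.abspath(path))
    return os.path.join(parent, SIDECAR, name)


def write_stream(path, stream, data):
    os.makedirs(stream_dir(path), exist_ok=True)
    with open(os.path.join(stream_dir(path), stream), "wb") as f:
        f.write(data)


def list_streams(path):
    d = stream_dir(path)
    return sorted(os.listdir(d)) if os.path.isdir(d) else []


def plain_copy(src, dst):
    """What an unmodified cp does: copy the main data only."""
    shutil.copyfile(src, dst)


def stream_aware_copy(src, dst):
    """What a modified cp would have to do: copy data and every stream."""
    shutil.copyfile(src, dst)
    if os.path.isdir(stream_dir(src)):
        shutil.copytree(stream_dir(src), stream_dir(dst), dirs_exist_ok=True)
```

After plain_copy the destination has no streams at all. Only the
stream-aware variant preserves them, and it would look much the same
whichever interface is used to reach the streams.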
Regarding the kernel or userspace discussion: in my opinion anything
that can be done in userspace should be done in userspace. If the
performance sucks, or it has security problems, or it needs caching that
can't be solved in userspace, it can be moved to the kernel, but in that
case the smallest and cleanest API possible should be implemented.
If, for historical reasons, an API must be in the kernel, there is not
much we can do about it either. It'll have to stay there, but we can
avoid making the same mistakes again.
So, for all the examples of the kernel having plugins that
automatically let an application see a tar-file as a directory, I
really, really don't believe this belongs in the kernel. First of all,
this is the file-as-directory discussion again; I believe it is a
mistake to expose the contents as a directory on top of the file
because it breaks a lot of assumptions that unix programs make.
It's much better to expose the contents at another place in the
filesystem by doing a temporary mount of the file with the proper
filesystem. As Pavel Machek pointed out, this has the problem of who
cleans up the mount if the application crashes. One way to handle
this could be something like this:
mount -t tarfs -o loop bar.tar /tmp/bar-fabb50509
chdir /tmp/bar-fabb50509
umount -f /tmp/bar-fabb50509
This will require the ability to unmount busy filesystems (but I
believe Alexander Viro has already implemented the infrastructure
needed for this).
Or for files for which we don't have a real filesystem driver (or on
other systems where userspace mounts are not allowed), we could just
unpack the contents into /tmp. For cleanup we could let whatever cleans
up /tmp anyway handle it, or have a cache daemon that keeps track of
untarred directories and removes them after a while.
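The unpack-into-/tmp variant can be sketched in a few lines of Python.
This is a minimal sketch under some assumptions: the helper name
"unpacked" and the "bar-" prefix are made up, and the archive is assumed
to be trusted.

```python
import contextlib
import shutil
import tarfile
import tempfile


@contextlib.contextmanager
def unpacked(tar_path):
    """Yield a temporary directory holding the contents of tar_path."""
    workdir = tempfile.mkdtemp(prefix="bar-")
    try:
        with tarfile.open(tar_path) as archive:
            archive.extractall(workdir)  # trusted archives only
        yield workdir
    finally:
        # Runs on normal exit and on exceptions. A hard crash (e.g. SIGKILL)
        # still leaves the directory behind for whatever cleans up /tmp.
        shutil.rmtree(workdir, ignore_errors=True)
```

Used as "with unpacked('bar.tar') as d: ...", the application sees real
files under d, and the directory disappears when the block ends. The
crash case is the one that needs the /tmp cleaner or the cache daemon.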
Another way is to completely forget about presenting the contents of a
tar file as real files, and just use a shared library to get at the
contents (now we just have to convince everyone to use the shared
library). This would also be portable to other systems.
If we do this right, it could all be hidden in a shared library, and
if the system below it supports more advanced features, it can use it.
Regarding the "I want a realtime index of all files" request: I believe
that a notifier that can tell me when a file has been changed, plus a
userspace daemon, ought to handle most of the cases that have been
mentioned. The suggested problem of not getting an up-to-date query
response can be handled by just asking the daemon "are you done with
indexing yet?". The design of such a daemon and the support it needs
from the kernel can definitely be discussed. But putting the indexer
itself in the kernel sounds like a bad idea. Even adding an API to query
the indexer into the kernel sounds pointless: why do that instead of just
opening a Unix socket to the indexer and asking it directly?
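A toy version of such a daemon, just to show how little is needed on
the query side. Everything here is an assumption made for illustration:
the class, the "DONE?" and "SIZE" requests, and the idea that some
notifier calls notify() when a file changes. A socket pair stands in for
the Unix socket a real daemon would listen on.

```python
import socket
from collections import deque


class Indexer:
    """Toy index: maps a path to its size, fed by change notifications."""

    def __init__(self):
        self.pending = deque()  # changes reported but not yet indexed
        self.index = {}

    def notify(self, path, size):
        # Called by the (assumed) change notifier.
        self.pending.append((path, size))

    def process_all(self):
        while self.pending:
            path, size = self.pending.popleft()
            self.index[path] = size

    def answer(self, request):
        # Tiny hypothetical request protocol.
        if request == b"DONE?":
            return b"yes" if not self.pending else b"no"
        if request.startswith(b"SIZE "):
            path = request[5:].decode()
            return str(self.index.get(path, -1)).encode()
        return b"error"


def ask(indexer, request):
    """Send one request over a Unix socket pair and return the reply."""
    client, server = socket.socketpair()
    with client, server:
        client.sendall(request)
        server.sendall(indexer.answer(server.recv(1024)))
        return client.recv(1024)
```

A client that needs an up-to-date answer first asks "DONE?" and retries
until the reply is "yes"; no kernel API for queries is involved.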
/Christer
--
"Just how much can I get away with and still go to heaven?"
Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se
Christer Weinigel <[email protected]> wrote:
> Additionally, files-as-directories does not solve the problem of
> "cp a b" losing named streams. There is currently no copyfile syscall
> in the Linux kernel; "cp a b" essentially does "cat a >b". So unless
> cp is modified we don't gain anything. If cp is modified to know
> about named streams, it really does not matter if named streams are
> accessed as file-as-directories, via openat(3) or via a shared library
> with some other interface.
You cannot just 'modify cp'. cp is a programming interface standardized
in POSIX.1. You can of course add non-standard extensions to some cp
implementations, but it then seems hardly avoidable that you either have
to use cp in a non-standard manner regularly with Linux or risk losing
data.
This is even more severe with tar/pax. Just patching GNU tar for file
streams, as it was suggested earlier in this discussion, is still far
away from a real solution because it neither solves the issues with
the POSIX.1 pax standard nor those with other implementations of it.
Given these facts, it does not seem so clear to me that adding named
streams for Windows and Mac OS interoperability would be a win for Linux
in the end. The loss of interoperability with Unix/POSIX/today's Linux
might have much worse effects.
The current xattr extension is much less of a problem because it only
holds metadata, which is mostly not applicable to other environments
anyway.
Gunnar
--
http://omnibus.ruf.uni-freiburg.de/~gritter
On Tue, Sep 07, 2004 at 01:55:32PM +0200, Christer Weinigel wrote:
> David Masover <[email protected]> writes:
>
> > |>Second, there are quite a few things which I might want to do, which can
> > |>be done with this interface and without patching programs,
> > | Such as?
> > They've been mentioned.
>
> > | Haven't seen any that made sense to me, sorry.
> > Sorry if they don't make sense to you, but I don't feel like discussing
> > them now. Either you get it or you don't, either you agree or you
> > don't. Read the archives.
>
> Great argument. Not. There has been so much shit thrown around here
> that it's impossible to keep track of all the examples.
>
> Could you please try to summarize a few of the arguments that you find
> especially compelling? This thread has gotten very confused since
> there are a bunch of different subjects all being intermixed here.
>
> What are we discussing?
>
> 1. Do we want support for named streams?
>
> I believe the answer is yes: since both NTFS and HFS (that's the
> MacOS filesystem, isn't it?) support streams, we want Linux to
> support them if possible.
Well, yes, HFS has this. Is it advantageous? No,
it's kind of a legacy ...
> Anyone disagreeing?
Yes. Mac OS X allows you to use UFS instead of HFS+,
and UFS doesn't support the fancy/confusing streams.
I, for my part, do not like the idea of multiple
streams for one file. IMHO all features can be
provided by using directories instead, which does
not break any userspace tools _and_ sounds natural
to me ...
best,
Herbert
> David Masover <[email protected]> writes:
>> |>Second, there are quite a few things which I might want to do, which can
>> |>be done with this interface and without patching programs,
>> | Such as?
>> They've been mentioned.
>> | Haven't seen any that made sense to me, sorry.
>> Sorry if they don't make sense to you, but I don't feel like discussing
>> them now. Either you get it or you don't, either you agree or you
>> don't. Read the archives.
> Great argument. Not. There has been so much shit thrown around here
> that it's impossible to keep track of all the examples.
> Could you please try to summarize a few of the arguments that you find
> especially compelling? This thread has gotten very confused since
> there are a bunch of different subjects all being intermixed here.
Indeed. We are discussing changes to the heart of Linux. It is bound
to get a little heated :/
> What are we discussing?
> 1. Do we want support for named streams?
> I believe the answer is yes: since both NTFS and HFS (that's the
> MacOS filesystem, isn't it?) support streams, we want Linux to
> support them if possible.
> Anyone disagreeing?
No :)
> 2. How do we want to expose named streams?
> One suggestion is file-as-directory in some form.
> Another suggestion made is to expose named streams somewhere under
> /proc/self/fd.
> Yet another suggestion is to use the openat(3) API from Solaris.
> Some filesystems expose extra data in a special directory in the
> same directory as the file, such as NetApp's .snapshot directories
> or the extra directories that netatalk expects. This has the
> advantage that it even works on filesystems without named-stream
> support, but it has a lot of problems too.
> Linux already has limited support for named streams via the xattr
> interface, but it's not a good interface for people wanting to
> have large files as named streams.
> 4. What belongs in the generic VFS, what belongs in Reiser4?
> Some things reiser4 does, such as files-as-directories, need changes
> to the VFS because they break assumptions that the VFS makes
> (e.g. a deadlock or an oops when doing a hard link out of one).
> Some other things reiser4 can do would be better if they were in
> the VFS since other filesystems might want to support the same
> functionality.
> Or Linux may not support some of the things reiserfs does at all.
> 5. What belongs in the kernel, what belongs in userspace?
> This is mostly what I have been trying to argue about.
> So, to try to summarize my opinion regarding file-as-directory: I
> believe it's fatally flawed because it breaks a lot of assumptions that
> existing code makes. One example of an application that will break is
> a web server that tries to filter out accesses to "bad" files:
> files-as-directories suddenly means that parts of those files will be
> accessible (and there are a _lot_ of CERT reports on just this kind of
> problem with Windows web servers, due to access to named streams not
> being restricted, or to ways of accessing files with non-canonical names
> that also managed to bypass access restrictions).
But restrictions could be controlled; exactly how depends on
which solution for named streams is chosen. For example, there might
be separate permissions for named streams other than the main
stream.
> Files-as-directories also does not give us named streams on
> directories. The suggestion to have dir/metas access the named
> streams means that if someone already has a file named metas in a
> directory that file will be lost. (Does anyone remember the
> discussions about the linux kernel having a directory named "core" and
> the problems this caused for some people?)
Yes, reserved keywords need to be chosen very carefully if they are
used. Other suggestions here have been ..metas or ...
> All this suggests to me that named streams must live in another
> namespace than the normal one. To me, openat(3) seems like a good
> choice for an API and it has the advantage that someone else, Solaris,
> already has implemented it.
> Additionally, files-as-directories does not solve the problem of
> "cp a b" losing named streams. There is currently no copyfile syscall
> in the Linux kernel; "cp a b" essentially does "cat a >b". So unless
> cp is modified we don't gain anything. If cp is modified to know
> about named streams, it really does not matter if named streams are
> accessed as file-as-directories, via openat(3) or via a shared library
> with some other interface.
One suggestion is missing: providing a system call for copy.
That would also solve the problem. Named streams and metas would
then be handled correctly. It would also allow further changes to
filesystems without having to patch applications yet again.
A copy system call would also be largely beneficial for networked
filesystems (NFS, Samba, etc.), as data wouldn't have to be
transferred over the network and back.
> Regarding the kernel or userspace discussion: in my opinion anything
> that can be done in userspace should be done in userspace. If the
> performance sucks, or it has security problems, or it needs caching that
> can't be solved in userspace, it can be moved to the kernel, but in that
> case the smallest and cleanest API possible should be implemented.
Yes. I agree. I do not think most people arguing for plugins or
uses for named streams actually wanted to put everything in the
kernel, but rather to have a kernel interface against which user-level
modules, plugins and applications could work.
> If, for historical reasons, an API must be in the kernel, there is not
> much we can do about it either. It'll have to stay there, but we can
> avoid making the same mistakes again.
> So, for all the examples of the kernel having plugins that
> automatically let an application see a tar-file as a directory, I
> really, really don't believe this belongs in the kernel. First of all,
> this is the file-as-directory discussion again; I believe it is a
> mistake to expose the contents as a directory on top of the file
> because it breaks a lot of assumptions that unix programs make.
> It's much better to expose the contents at another place in the
> filesystem by doing a temporary mount of the file with the proper
> filesystem. As Pavel Machek pointed out, this has the problem of who
> cleans up the mount if the application crashes. One way to handle
> this could be something like this:
> mount -t tarfs -o loop bar.tar /tmp/bar-fabb50509
> chdir /tmp/bar-fabb50509
> umount -f /tmp/bar-fabb50509
> This will require the ability to unmount busy filesystems (but I
> believe Alexander Viro has already implemented the infrastructure
> needed for this).
> Or for files for which we don't have a real filesystem driver (or on
> other systems where userspace mounts are not allowed), we could just
> unpack the contents into /tmp. For cleanup we could let whatever cleans
> up /tmp anyway handle it, or have a cache daemon that keeps track of
> untarred directories and removes them after a while.
But now, as you said, we are not talking about named streams, but
about a specific type of plugin and its implications.
Can we make a plugin infrastructure that will let user-space plugins
be loaded for certain directories or files? If we can, then it
would present a much cleaner and easier way for the user to access
the data he wants. In this particular example it was a tar file.
> Another way is to completely forget about presenting the contents of a
> tar file as real files, and just use a shared library to get at the
> contents (now we just have to convince everyone to use the shared
> library). This would also be portable to other systems.
Yes, if it were possible. I honestly think it would be easier and
take fewer man-hours to create a run-time loadable user-space plugin
interface than to convince thousands of application developers to
use this shared library.
> If we do this right, it could all be hidden in a shared library, and
> if the system below it supports more advanced features, it can use it.
> Regarding the "I want a realtime index of all files". I believe that a
> notifier that can tell me when a file has been changed and a userspace
> daemon ought to handle most of the cases that have been mentioned.
We have FAM (File Alteration Monitor) that notifies applications of
changes to files they are interested in.
http://oss.sgi.com/projects/fam/
> The suggested problems of not getting an up to date query response can
> be handled by just asking the daemon "are you done with indexing yet".
> The design of such a daemon and the support it needs from the kernel
> can definitely be discussed. But to put the indexer itself in the
> kernel sounds like a bad idea. Even adding an API to query the
> indexer into the kernel sounds pointless, why do that instead of just
> opening a Unix socket to the indexer and asking it directly?
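A minimal sketch of the "ask the daemon directly" idea follows. The request words ("DONE?", "HAS <path>") and the Indexer class are invented for illustration; the thread does not define a protocol, and a real daemon would be fed by a change notifier rather than by direct calls.

```python
import socket


class Indexer:
    """Toy indexer: a queue of changed paths plus a set of indexed paths."""

    def __init__(self):
        self.pending = []
        self.indexed = set()

    def file_changed(self, path):
        # Would be called by the change notifier (FAM or similar).
        if path not in self.pending:
            self.pending.append(path)

    def index_one(self):
        # Index the oldest pending path; real content scanning is elided.
        if self.pending:
            self.indexed.add(self.pending.pop(0))

    def answer(self, request):
        # Hypothetical one-line protocol, not taken from any real tool.
        if request == "DONE?":
            return "no" if self.pending else "yes"
        if request.startswith("HAS "):
            return "yes" if request[4:] in self.indexed else "no"
        return "error"


def serve_one(conn, indexer):
    """Answer one request; assumes the request fits in a single recv()."""
    request = conn.recv(1024).decode().strip()
    conn.sendall((indexer.answer(request) + "\n").encode())
```

With socket.socketpair() standing in for a named Unix socket, a client sends "DONE?\n" on one end and reads "yes\n" or "no\n" back, which covers the "are you done with indexing yet" question without any new kernel API.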
I think that we do not need an indexer with Reiser4 anyway. It is
already a database that could be queried directly. There shouldn't
be any need to build a database on top of Reiser4 (like the one
updatedb builds) that holds the same information already existing in
Reiser4.
Now, it is just a matter of having an interface to query the Reiser4
db. One of the suggested ways was through meta-data. For example:
ls -l ..metas/keyword/.pdf
~S
> /Christer
Spam <[email protected]> writes:
> > Additionally, files-as-directories does not solve the problem of
> > "cp a b" losing named streams. There is currently no copyfile syscall
> > in the Linux kernel, "cp a b" essentially does "cat a >b". So unless
> > cp is modified we don't gain anything. If cp is modified to know
> > about named streams, it really does not matter if named streams are
> > accessed as file-as-directories, via openat(3) or via a shared library
> > with some other interface.
>
> One suggestion is missed. It is to provide system calls for copy.
> That would also solve the problem. Named streams and metas would
> then be handled correctly. It also allows further changes to
> filesystems without having to patch applications yet again.
But this still solves only part of the problem. A backup application
won't have any use for a copyfile syscall, it will need to be taught
about streams.
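To make the "cat a >b" point concrete, here is a sketch contrasting a data-only copy with one that was taught about extra per-file data. Linux has no named streams, so extended attributes stand in for them here; that substitution, and both function names, are assumptions for illustration only.

```python
import os
import shutil


def copy_data_only(src, dst):
    """What "cat a >b" amounts to: only the main byte stream is copied."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def copy_with_xattrs(src, dst):
    """Copy the data, then any extended attributes that can be read and set.

    Returns the list of attribute names that were copied.
    """
    copy_data_only(src, dst)
    copied = []
    if not hasattr(os, "listxattr"):
        return copied  # platform without xattr support in the os module
    try:
        names = os.listxattr(src)
    except OSError:
        return copied  # filesystem does not support xattrs
    for name in names:
        try:
            os.setxattr(dst, name, os.getxattr(src, name))
            copied.append(name)
        except OSError:
            pass  # e.g. privileged namespaces; skipped in this sketch
    return copied
```

Every tool that moves files around (cp, tar, a backup program) needs the second form in its own code, which is the objection raised above: a copyfile syscall helps cp, not an archiver.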
> A copy system call would also be very beneficial for networked
> filesystems (NFS, Samba, etc) as data wouldn't have to be
> transferred over the network and back.
Definitely.
> Can we make a plugin infrastructure that will let user-space plugins
> be loaded for certain directories or files? If we can, then it
> would present a much cleaner and easier way for the user to access
> data he wants. In this particular example it was a tar file.
In that case I'd argue that:
mount -t userfs -o driver=tarfs foo /tmp/foo
is a rather good kernel interface for plugins. userfs (or something
based on userfs) is the plugin API and tarfs is a plugin. :-)
To make this efficient, we'll have to allow non-root users to perform
the mount syscall (with the limitation that they can only mount on top
of directories they own and that the mounts have the nosuid and nodev
flags set).
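The restriction on non-root mounts can be written down as a small policy check. This is only a userspace sketch of the rule as stated; the function names are made up, and a real check would live in the kernel's mount path.

```python
import os

# Flags a non-root mount must carry under the proposed rule.
REQUIRED_FLAGS = frozenset({"nosuid", "nodev"})


def may_mount(uid, mountpoint_owner, flags):
    """Return True if uid may mount on a directory owned by mountpoint_owner."""
    if uid == 0:
        return True  # root is not restricted
    return mountpoint_owner == uid and REQUIRED_FLAGS.issubset(flags)


def may_mount_on(path, flags, uid=None):
    """Apply may_mount() to a real directory, defaulting to the calling user."""
    uid = os.getuid() if uid is None else uid
    return may_mount(uid, os.stat(path).st_uid, flags)
```

For example, may_mount(1000, 1000, {"nosuid", "nodev"}) is allowed, while dropping either flag or targeting a directory owned by someone else is refused.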
/Christer
--
"Just how much can I get away with and still go to heaven?"
Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se
Am Dienstag, 7. September 2004 14:30 schrieb Herbert Poetzl:
> > 1. Do we want support for named streams?
> >
> >    I believe the answer is yes, since both NTFS and HFS (that's the
> >    MacOS filesystem, isn't it?) support streams we want Linux to
> >    support this if possible.
>
> well, yes HFS has this, is it advantageous, no
> it's kind of heritage ...
HFS doesn't support _named_ streams. There are just two forks
per file.
Regards
Oliver
> Spam <[email protected]> writes:
>> > Additionally, files-as-directories does not solve the problem of
>> > "cp a b" losing named streams. There is currently no copyfile syscall
>> > in the Linux kernel, "cp a b" essentially does "cat a >b". So unless
>> > cp is modified we don't gain anything. If cp is modified to know
>> > about named streams, it really does not matter if named streams are
>> > accessed as file-as-directories, via openat(3) or via a shared library
>> > with some other interface.
>>
>> One suggestion is missed. It is to provide system calls for copy.
>> That would also solve the problem. Named streams and metas would
>> then be handled correctly. It also allows further changes to
>> filesystems without having to patch applications yet again.
> But this still solves only part of the problem. A backup application
> won't have any use for a copyfile syscall, it will need to be taught
> about streams.
Yes, but backup programs always needed to be taught about new
features. Be it new type of files, attributes or meta-data. I think
that teaching backup applications is far better than teaching every
application.
>> A copy system call would also be very beneficial for networked
>> filesystems (NFS, Samba, etc) as data wouldn't have to be
>> transferred over the network and back.
> Definitely.
>> Can we make a plugin infrastructure that will let user-space plugins
>> be loaded for certain directories or files? If we can, then it
>> would present a much cleaner and easier way for the user to access
>> data he wants. In this particular example it was a tar file.
> In that case I'd argue that:
> mount -t userfs -o driver=tarfs foo /tmp/foo
> is a rather good kernel interface for plugins. userfs (or something
> based on userfs) is the plugin API and tarfs is a plugin. :-)
> To make this efficient, we'll have to allow non-root users to perform
> the mount syscall (with the limitation that they can only mount on top
> of directories they own and that the mounts have the nosuid and nodev
> flags set).
Yes, this seems to be one solution. It isn't very dynamic in use
though. You can't use this directly from applications without
manually doing the mount.
This is only a solution to browsing contents of files. It doesn't
provide a solution for using meta-data streams or other things like
this.
~S
> /Christer
Spam <[email protected]> wrote:
> One suggestion is missed. It is to provide system calls for copy.
> That would also solve the problem.
No, it would not. If you read the POSIX.1 specification for cp
carefully <http://www.unix.org/version3/online.html>, you will
notice that the process for copying a regular file is carefully
standardized. A POSIX.1-conforming cp implementation would not
be allowed to copy additional streams, unless either additional
options are given or the type of the file being copied is other
than S_IFREG. And cp is just one example of a standardized file
handling program.
Gunnar
--
http://omnibus.ruf.uni-freiburg.de/~gritter
> Spam <[email protected]> wrote:
>> One suggestion is missed. It is to provide system calls for copy.
>> That would also solve the problem.
> No, it would not. If you read the POSIX.1 specification for cp
> carefully <http://www.unix.org/version3/online.html>, you will
> notice that the process for copying a regular file is carefully
> standardized. A POSIX.1-conforming cp implementation would not
> be allowed to copy additional streams, unless either additional
> options are given or the type of the file being copied is other
> than S_IFREG. And cp is just one example of a standardized file
> handling program.
It would solve the problem in Linux. However, it may not be POSIX.1
compatible. On the other hand I read that NTFS 5.0 is POSIX.1
compliant - and Windows uses a copy system call. NTFS also has streams
support, using a special character in the file names to select the
streams.
Surely there must be a solution in Linux that will allow things like
streams and meta-data (meta-streams) to be visible to the user.
~S
> Gunnar
Spam <[email protected]> wrote:
> > Spam <[email protected]> wrote:
> >> One suggestion is missed. It is to provide system calls for copy.
> >> That would also solve the problem.
> > No, it would not. If you read the POSIX.1 specification for cp
> > carefully <http://www.unix.org/version3/online.html>, you will
> > notice that the process for copying a regular file is carefully
> > standardized. A POSIX.1-conforming cp implementation would not
> > be allowed to copy additional streams, unless either additional
> > options are given or the type of the file being copied is other
> > than S_IFREG. And cp is just one example of a standardized file
> > handling program.
> It would solve the problem in Linux. However, it may not be POSIX.1
> compatible. On the other hand I read that NTFS 5.0 is POSIX.1
> compliant - and Windows uses copy system call.
You should obviously take a thorough read of at least the Base
Definitions, section 2, 'Conformance' of POSIX.1 before you further
comment on this issue.
Gunnar
Christer Weinigel wrote:
>Spam <[email protected]> writes:
>
>
>
>>>Additionally, files-as-directories does not solve the problem of
>>>"cp a b" losing named streams.
>>>
reiser4 does not support streams, it supports files that can do what
streams do. cp -r does not currently lose files.
Gunnar Ritter wrote:
>
>
>You cannot just 'modify cp'.
>
People who think that POSIX is the objective rather than the least
common denominator of OS design have had their head screwed on backwards
to better look at where their competitors used to be.
However, I agree that streams suck. That is why reiser4 just has files
and directories and not streams. Our files and directories just happen
to be able to do all that streams can do.
Hans
Christer Weinigel wrote:
>David Masover <[email protected]> writes:
>
>
>
>>|>Second, there are quite a few things which I might want to do, which can
>>|>be done with this interface and without patching programs,
>>| Such as?
>>They've been mentioned.
>>
>>
>
>
>
>>| Haven't seen any that made sense to me, sorry.
>>Sorry if they don't make sense to you, but I don't feel like discussing
>>them now. Either you get it or you don't, either you agree or you
>>don't. Read the archives.
>>
>>
>
>Great argument. Not. There has been so much shit thrown around here
>so that it's impossible to keep track of all examples.
>
>Could you please try summarize a few of the arguments that you find
>especially compelling? This thread has gotten very confused since
>there are a bunch of different subjects all being intermixed here.
>
>What are we discussing?
>
>1. Do we want support for named streams?
>
> I believe the answer is yes, since both NTFS and HFS (that's the
> MacOS filesystem, isn't it?) support streams we want Linux to
> support this if possible.
>
> Anyone disagreeing?
>
>
No, we want files and directories that can do what streams can do. This
means files that are also directories, plugins that aggregate the
contents of a directory, files that inherit stat data, maybe I forget
something ---- it is on my website.
Streams themselves are a bad idea because they are a filesystem inside
of a file which is double the overhead, and they fragment the namespace.
>4. What belongs in the generic VFS, what belongs in Reiser4?
>
> Some things reiser4 does, such as files-as-directories, need changes
> to the VFS because they break assumptions that the VFS makes
> (i.e. a deadlock or an oops when doing a hard link out of one).
>
> Some other things reiser4 can do would be better if they were in
> the VFS since other filesystems might want to support the same
> functionality.
>
>
It is always better to lead by example. These ideas are too new for the
other fs developers, they need 5 years to get used to them. Reiser4
should create something that works, and let others follow when they will.
> Or Linux may not support some of the things reiserfs does at all.
>
>5. What belongs in the kernel, what belongs in userspace?
>
>
This is the wrong question. The right question is, what belongs in a
unified namespace? Then having answered that, the extent to which it is
easier to have all aspects of name resolution together in one body of
code is an implementation detail that can change and evolve with time.
A rapidly evolving namespace is easier to modify if it is all one body
of code. Furthermore, the people doing the work should really be left
to decide such implementation details themselves because they are more
expert on their code than anyone else.
Hans
Hans Reiser <[email protected]> wrote:
> Gunnar Ritter wrote:
> >You cannot just 'modify cp'.
> >
> People who think that POSIX is the objective rather than the least
> common denominator of OS design
I am not opposed in principle to extensions to POSIX. My mailx
implementation 'nail' has e.g. perhaps more extensions than there are
commands and options in the POSIX standard for it.
POSIX is also not against extensions. In fact, POSIX development
generally works as follows: One vendor creates something as an extension,
other vendors follow to implement it, and later on it is discussed if it
is desirable to integrate the feature into the standard itself. It is
absolutely possible that Sun's openat() might be in POSIX.1-2010 one day,
for example. Useful extensions are thus welcome to POSIX.
This does not mean, however, that one should not distinguish clearly between
standard and extensions, and that extensions should be created at will
without carefully weighing pros and cons.
I did not say: POSIX forbids handling streams or directory/file mixes.
This would not even have been true. However, POSIX restricts the choice
of possible interfaces for them. One of those restrictions is that 'cp'
may not copy streams if used in strict accordance with POSIX. As you
acknowledged in your reply, POSIX is the least common denominator. Thus
'cp' implementations should not be modified in a way that violates it.
This means, in effect, that a strictly conforming POSIX application (i.e.
something like a shell script that uses no POSIX extensions or methods
which are not clearly defined in the standard) will very likely be unable
to copy streams, unless some other, conforming, method is found. Which is
a problem one should know about when discussing this.
> have had their head screwed on backwards
> to better look at where their competitors used to be.
But there are not only forwards and backwards directions. Sideways might
lead to nowhere.
Gunnar
Gunnar Ritter <[email protected]> writes:
> No, it would not. If you read the POSIX.1 specification for cp
> carefully <http://www.unix.org/version3/online.html>, you will
> notice that the process for copying a regular file is carefully
> standardized. A POSIX.1-conforming cp implementation would not
> be allowed to copy additional streams, unless either additional
> options are given or the type of the file being copied is other
> than S_IFREG. And cp is just one example of a standardized file
> handling program.
We can safely ignore POSIX when it is too broken. cp could very well
be modified to copy named streams except when the option --posix is
specified or the environment variable POSIXLY_CORRECT is set.
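As a sketch of that switch: the --posix option for cp is only the proposal made here, not an existing flag, and the function name is invented. POSIXLY_CORRECT is the environment variable GNU tools already consult.

```python
import os


def copies_streams(posix_option=False, environ=None):
    """Decide whether a modified cp would copy named streams.

    posix_option models the proposed (hypothetical) --posix flag.
    environ defaults to the real process environment.
    """
    environ = os.environ if environ is None else environ
    if posix_option:
        return False  # strict mode requested on the command line
    if "POSIXLY_CORRECT" in environ:
        return False  # strict mode requested via the environment
    return True  # default: copy the extra streams too
```

The variable only has to be present, whatever its value, which is how GNU tools treat it as well.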
/Christer
Hans Reiser <[email protected]> writes:
> Christer Weinigel wrote:
>
> >Spam <[email protected]> writes:
> >
> >
> >>> Additionally, files-as-directories does not solve the problem of
> >>> "cp a b" losing named streams.
> >>>
> reiser4 does not support streams, it supports files that can do what
> streams do. cp -r does not currently lose files.
I'm not talking about reiser4 only. I'm interested in semantics that
can be sanely applied to reiser4, NTFS, HFS, or any other file system
that supports XATTRS.
I realize that reiser4 is your baby and you're only interested in it,
but I'm much more interested in linux in general.
/Christer
Hans Reiser <[email protected]> writes:
> No, we want files and directories that can do what streams can do.
> This means files that are also directories, plugins that aggregate the
> contents of a directory, files that inherit stat data, maybe I forget
> something ---- it is on my website.
>
> Streams themselves are a bad idea because they are a filesystem inside
> of a file which is double the overhead, and they fragment the
> namespace.
There really isn't a difference between your files-as-directories
thing and named streams that can be accessed with openat. Both are
concepts that won't fit into the classical Unix entity "file".
As far as I can tell, windows named streams could be exposed via a
file-as-directory model. In that case what windows sees as
foo.txt:icon would be accessed as foo.txt/icon instead. The question
in my mind is if that is a desirable concept. I can see that it is
a very attractive concept because it's easy to explain to users, but
at the same time, a lot of things suggest that the files-as-
directories concept is a bad idea. I mentioned a few of the possible
problems in the grand-grand-parent of this message.
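The name mapping itself is trivial, as this sketch shows. It assumes plain relative names with no drive letter and at most one stream suffix; the function name is made up.

```python
def stream_to_path(name):
    """Map a Windows-style "file:stream" name to a file-as-directory path."""
    base, sep, stream = name.partition(":")
    if not sep or not stream:
        return base  # no stream part: the plain file itself
    return base + "/" + stream
```

So "foo.txt:icon" becomes "foo.txt/icon" and "foo.txt" stays "foo.txt". The hard part is not this translation but what existing programs do when foo.txt answers to both open() and directory lookups.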
Try to address those problems. It may be that your answer is "I don't
give a shit about existing backup applications", or "I don't care
about security in existing web-servers, they'll have to be taught
about reiser4", but in that case say so. Or show how it's not a
problem with existing applications.
> It is always better to lead by example. These ideas are too new for
> the other fs developers, they need 5 years to get used to them.
> Reiser4 should create something that works, and let others follow when
> they will.
Pretty please, give up the marketing bullshit. I already know that
you believe reiser4 is the best thing since sliced bread. You say that
"reiser4 should create something that works". So far you haven't
convinced me that reiser4 actually "works". It's a nice proof of
concept, but it definitely does not feel like something that I'd like
in the mainstream kernel until those problems are addressed.
Have you looked at the VFS deadlocks that Alexander Viro pointed at?
Does tar work with files-as-directories? Can I use tar to make a
backup of my reiser4 directory? What changes would be required to the
tar application or to the tar file format to make this work?
Can I do a cp -a /home/wingel to a NFS-mounted drive on some other
computer to make a backup of my home directory or will I lose data
that way? What changes would be required to cp to make this work?
Please remember that reiser4 does not exist in a vacuum. If you want
to make a fundamental change to unix concepts that have been with us
for three decades, it's really up to you to show that it doesn't break
too many existing applications.
> This is the wrong question. The right question is, what belongs in a
> unified namespace? Then having answered that, the extent to which it
> is easier to have all aspects of name resolution together in one body
> of code is an implementation detail that can change and evolve with
> time. A rapidly evolving namespace is easier to modify if it is all
> one body of code. Furthermore, the people doing the work should
> really be left to decide such implementation details themselves
> because they are more expert on their code than anyone else.
I still haven't seen a good answer to the namespace problem. You say
that files-as-directories is great, I'm definitely not convinced.
The VFS is the body of code that handles most aspects of name
resolution today, so it sounds as if reiser4 is supposed to be the new
super-VFS, and you want to keep it that way and do not want to
integrate with the current VFS. In that case reiser4 really does not
belong in the mainstream kernel since it's your pet research project.
/Christer
Christer Weinigel <[email protected]> wrote:
> Gunnar Ritter <[email protected]> writes:
> > No, it would not. If you read the POSIX.1 specification for cp
> > carefully <http://www.unix.org/version3/online.html>, you will
> > notice that the process for copying a regular file is carefully
> > standardized. A POSIX.1-conforming cp implementation would not
> > be allowed to copy additional streams, unless either additional
> > options are given or the type of the file being copied is other
> > than S_IFREG. And cp is just one example of a standardized file
> > handling program.
> We can safely ignore POSIX when it is too broken.
Excuse me, but there's really nothing broken here with POSIX and cp.
You're just making insulting remarks about a part of the specification
which currently serves GNU/Linux and other Unix-like environments very
well, and has done so for about twelve years now.
> cp could very well be modified to copy named streams except when
> the option --posix is specified
Hey, you didn't ever even have a look at POSIX Shell & Utilities, did
you? Then why are you making derogatory statements about it?
> or the environment variable POSIXLY_CORRECT is set.
Cool, data loss depending upon an environment variable which is even
currently used by many programs unaware of such results. This really
sounds like good engineering to me.
Gunnar
so far the best answer that I've seen is a slight variant of what Hans is
proposing for the 'file-as-a-directory'
make the base file itself be a serialized version of all the streams and
if you want the 'main' stream open file/. (or some similar variant)
this doesn't address the hard-link issue, but it should handle the backup
problems (your backup software just goes through the files and what it
gets is suitable for backups).
you will ask what serializer to use, and my answer is to let one of the
streams tell you, and have the kernel make a call out to userspace to
execute the appropriate program (note that this means that tar is not put
into the kernel)
in fact it may make sense to just open file/file to get at the 'main'
stream of the file (there may be cases where the concept of a single main
stream may not make sense)
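a rough sketch of the serialized view, using tar as the serializer. the choice of tar, the stream names ('main', 'author') and both function names are only assumptions for illustration; under the scheme above the serializer would be picked by one of the streams and run in userspace.

```python
import io
import tarfile


def serialize_streams(streams):
    """Pack a {name: bytes} mapping of streams into one tar-format blob."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        for name in sorted(streams):
            data = streams[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_stream(blob, name):
    """Pull a single stream back out of the serialized blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as archive:
        return archive.extractfile(name).read()
```

a backup tool that just reads the base file gets the whole blob and loses nothing, while an application has to ask for the 'main' stream explicitly, since reading the base file no longer returns the plain contents.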
so if this solves the tool/backup problem then we can look and figure out
if there's a reasonable way to solve the hard-link problem
David Lang
On Tue, 8 Sep 2004, Christer Weinigel wrote:
> Date: 08 Sep 2004 00:13:12 +0200
> From: Christer Weinigel <[email protected]>
> To: Hans Reiser <[email protected]>
> Cc: Christer Weinigel <[email protected]>,
> David Masover <[email protected]>,
> Horst von Brand <[email protected]>, Spam <[email protected]>,
> Tonnerre <[email protected]>, Linus Torvalds <[email protected]>,
> Pavel Machek <[email protected]>, Jamie Lokier <[email protected]>,
> Chris Wedgwood <[email protected]>, [email protected],
> Christoph Hellwig <[email protected]>, [email protected],
> [email protected], Alexander Lyamin aka FLX <[email protected]>,
> ReiserFS List <[email protected]>
> Subject: Re: silent semantic changes with reiser4
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
On Mon, Sep 06, 2004 at 11:28:06PM -0700, William Lee Irwin III wrote:
> This thread is getting a bit soft on technical details and/or value.
> Hans, please post the results of fsstress (the bits from ext3 cvs) on
> reiser4 on SMP machines of non-i386 architectures (e.g. ppc64, sparc64,
> ia64) as well as the results of transferring reiser4 filesystems
> between machines whose PAGE_SIZE, wordsize and endianness vary.
Okay, sounds like time to run these tests and post the results myself.
Hans, these things should be part of your standard QA. It would likely
make a better impression if there were some record of these kinds of
things having been successfully tested prior to your releases. For
future reference, Andrew, Christoph, myself, and others can provide
more detailed references to suites of stress tests and various kinds of
tests filesystems should pass before being considered stable, and we
(it's a relatively safe presumption that I speak for all of us when I
say this) would appreciate this kind of testing in the future.
-- wli
Gunnar Ritter <[email protected]> writes:
> Excuse me, but there's really nothing broken here with POSIX and cp.
> You're just making insulting remarks about a part of the specification
> which currently serves GNU/Linux and other Unix-like environments very
> well, and has done so for about twelve years now.
"Broken" in the sense "POSIX mandates something that users wouldn't
expect".
> > cp could very well be modified to copy named streams except when
> > the option --posix is specified
>
> Hey, you didn't ever even have a look at POSIX Shell & Utilities, did
> you? Then why are you making derogatory statements about it?
Those derogatory statements are really all in your mind.
> > or the environment variable POSIXLY_CORRECT is set.
>
> Cool, data loss depending upon an environment variable which is even
> currently used by many programs unaware of such results. This really
> sounds like good engineering to me.
How would you consider cp to cause "data loss" if it _besides_ copying
the normal stream _also_ copied any named streams or xattrs belonging
to the stream? How would it cause data loss if cp started using a
theoretical copyfile syscall? It may not be 100% according to POSIX,
but I'd definitely say that it does what the user expects.
Lots of GNU utilities already differ from POSIX mandated behaviour
because the authors of those utilities believe that the POSIX mandated
behaviour is confusing.
http://www.wlug.org.nz/POSIXLY_CORRECT
POSIXLY_CORRECT is an environment variable that some programs use
to follow strict POSIX standards behaviour, where that isn't the
default.
Probably the most well-known example of this is that POSIX states
that filesystem blocks are 512 bytes per block, so the GNU
fileutils such as df(1) and GNU tar(1) use 512 if the variable
POSIXLY_CORRECT is set, and 1024 bytes per block by default.
Many of the GNU tools comply with POSIX by default, except for
where the author thinks the POSIX standard is wrong or dumb. :) As
a result, some programs also check if a variable named
POSIX_ME_HARDER is set as an acceptable alias for
POSIXLY_CORRECT. See Democracy Triumphs in Disk Units.
/Christer
David Lang <[email protected]> writes:
> so far the best answer that I've seen is a slight varient of what Hans
> is proposing for the 'file-as-a-directory'
>
> make the base file itself be a serialized version of all the streams
> and if you want the 'main' stream open file/. (or some similar
> variant)
> in fact it may make sense to just open file/file to get at the 'main'
> stream of the file (there may be cases where the concept of a single
> main stream may not make sense)
So what happens if I have a text file foo.txt and add an author
attribute to it? When I read foo.txt the next time it's supposed to
give me a serialized version with both the contents of foo.txt _and_
the author attribute?
That would definitely confuse me.
Or did I misunderstand something?
/Christer
On Tue, 8 Sep 2004, Christer Weinigel wrote:
> Subject: Re: silent semantic changes with reiser4
>
> David Lang <[email protected]> writes:
>
>> so far the best answer that I've seen is a slight variant of what Hans
>> is proposing for the 'file-as-a-directory'
>>
>> make the base file itself be a serialized version of all the streams
>> and if you want the 'main' stream open file/. (or some similar
>> variant)
>
>> in fact it may make sense to just open file/file to get at the 'main'
>> stream of the file (there may be cases where the concept of a single
>> main stream may not make sense)
>
> So what happens if I have a text file foo.txt and add an author
> attribute to it? When I read foo.txt the next time it's supposed to
> give me a serialized version with both the contents of foo.txt _and_
> the author attribute?
>
> That would definitely confuse me.
>
> Or did I misunderstand something?
>
good point. under my scheme you would need to access foo.txt/foo.txt or
foo.txt/. instead of just foo.txt
I guess my way would work if there is a way to know that a file has been
extended (or if you just make it a habit of opening the file/file instead
of just file) but not for random additions of streams to otherwise normal
files.
Oh well, it seemed like an easy fix (and turned out to be too easy to be
practical).
David Lang
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
David Lang <[email protected]> wrote:
> so far the best answer that I've seen is a slight variant of what Hans is
> proposing for the 'file-as-a-directory'
>
> make the base file itself be a serialized version of all the streams and
> if you want the 'main' stream open file/. (or some similar variant)
As has been said previously, all such proposals except for those
with two leading slashes would directly violate POSIX.1-2004, Base
Definitions, 4.11 'Pathname Resolution'. In particular,
# A pathname that contains at least one non-slash character and that
# ends with one or more trailing slashes shall be resolved as if a
# single dot character ( '.' ) were appended to the pathname.
A regular file name with special semantics, in contrast, would not
violate POSIX, in particular if a stat() on it would not return one
of the standard S_IFXXX types. Inappropriate operations could just
fail with EINVAL on such files, and it would be unspecified by the
standard how they were handled by 'cp' or other standardized utilities.
Having a separate S_IFXXX type for streamed files would also easily
draw attention to portability issues. It might in fact be better
to have existing applications fail explicitly with them than to
make this as transparent as possible and hope for good luck.
Utilities like 'tar' would just say 'Unknown file type' or the like
and exclude such files from the archives they create. This would draw
the user's explicit attention to the data loss. He might then choose
to ignore the error, or to get a special version of 'tar' to handle it.
It would also be clear to programmers and users that such files are
special things they don't normally need. Regular Linux installations
might totally ignore them. If streamed files are only intended for
use with CIFS or other special cases, that might be a cleaner
solution for this issue than watering down the existing portable
file semantics.
Gunnar
Christer Weinigel <[email protected]> wrote:
> Gunnar Ritter <[email protected]> writes:
> > Excuse me, but there's really nothing broken here with POSIX and cp.
> > You're just making an insulting talk about a part of the specification
> > which currently serves GNU/Linux and other Unix-like environments very
> > well, and has done so for about twelve years now.
> "Broken" in the sense "POSIX mandates something that users wouldn't
> expect".
Only if one breaks it by making extensions in an inappropriate way.
This is not a fault of POSIX. POSIX usually allows a lot of sane ways
to introduce extensions. There are usually valid interoperability
arguments for behavior prescribed by POSIX. It is really not one of
those standards where you want to ignore every second word because
it is obviously nothing but committee nonsense.
> > > or the environment variable POSIXLY_CORRECT is set.
> > Cool, data loss depending upon an environment variable which is even
> > currently used by many programs unaware of such results. This really
> > sounds like good engineering to me.
> How would you consider cp to cause "data loss" if it _besides_ copying
> the normal stream _also_ copied any named streams or xattrs belonging
> to the stream?
You are reversing the argument. If additional streams are introduced
inappropriately by extending the semantics of S_IFREG files, POSIX
requires cp to lose the data. Your proposal would then make this loss
of additional stream data dependent on an environment variable that
is already in wide use. If it was set by accident, the data would be
lost.
Besides, copying xattrs is usually permitted (POSIX.1-2004, XCU cp):
# If the implementation provides additional or alternate access control
# mechanisms (see the Base Definitions volume of IEEE Std 1003.1-2001,
# Section 4.4, File Access Permissions), their effect on copies of files
# is implementation-defined.
It is also permitted to add other S_IFXXX types and then let cp act
in an implementation-defined manner on them (cf. my earlier message
<[email protected]>).
The 'standardized' data loss would only occur if the standardized type
of regular file, S_IFREG, was abused. This would really not be a fault
of POSIX.
> Lots of GNU utilities already differ from POSIX mandated behaviour
> because the authors of those utilities believe that the POSIX mandated
> behaviour is confusing.
Sure, but it is not the preferred method of adding features. In
addition, most of the existing POSIXLY_CORRECT influences are
nothing but cosmetic details in comparison to copying/not
copying stream data.
Gunnar
On Wed, 08 Sep 2004 01:14:25 +0200, Gunnar Ritter
<[email protected]> wrote:
> Having a separate S_IFXXX type for streamed files would also easily
> draw attention to portability issues. It might in fact be better
> to have existing applications fail explicitly with them than to
> make this as transparent as possible and hope for good luck.
>
> Utilities like 'tar' would just say 'Unknown file type' or the like
> and exclude such files from the archives they create. This would draw
> the user's explicit attention to the data loss. He might then choose
> to ignore the error, or to get a special version of 'tar' to handle it.
>
> It would also be clear to programmers and users that such files are
> special things they don't normally need. Regular Linux installations
> might totally ignore them. If streamed files are only intended for
> use with CIFS or other special cases, that might be a cleaner
> solution for this issue than watering down the existing portable
> file semantics.
You could actually use two S_IFXXX types - one for stream files which
should be archived by backup programs because they contain real
information, and a different one for stream files which should not be
archived because they are merely a different "view" of data from the main
file. This simplifies the archival problem because legacy programs would
complain that the file types aren't handled - alerting the user to
potential data loss, and updated programs would automatically back up only
the attributes that are required.
--
-Julian Blake Kongslie
<[email protected]>
On Tue, Sep 07, 2004 at 03:38:01PM -0700, William Lee Irwin III wrote:
> Okay, sounds like time to run these tests and post the results myself.
> Hans, these things should be part of your standard QA. It would likely
> make a better impression if there were some record of these kinds of
> things having been successfully tested prior to your releases. For
> future reference, Andrew, Christoph, myself, and others can provide
> more detailed references to suites of stress tests and various kinds of
> tests filesystems should pass before being considered stable, and we
> (it's a relatively safe presumption that I speak for all of us when I
> say this) would appreciate this kind of testing in the future.
Step 1, the tools are very broken. This level of nonfunctionality
of the reiser4 toolchain precludes any kind of exposure to the kind of
testing I've asked about. I would very strongly prefer not to have to
become a reiser4 implementor and furthermore fix numerous bugs just to
have the smallest bit of assurance that this thing won't generate bug
reports en masse once merged.
The following is from an UltraSPARC system (64-bit wordsize, big-endian,
8KB pagesize, 32-bit userspace, including 32-bit compiled reiser4 tools)
running 2.6.9-rc1-mm3:
# /usr/local/sbin/mkfs.reiser4 -f -b 4096 /dev/loop0
/usr/local/sioctl32(mkfs.reiser4:1949): Unknown cmd fd(3) cmd(40081272){00} arg(efffc810) on /dev/loop0
4boicnt/lm3k2f(sm.krfesi.sreeri4s e1r.40:.109
9C)o:p yUrnikgnhotw n( Cc)m d2 0f0d1(,3 )2 0c0m2d,( 420000831,2 7220)0{40 0b}y raHragn(se fRfefics8e1r0,) loinc e/ndseivn/gl ogoopv0e
ned by
reiser4progs/COPYING.
Block size 4096 will be used.
Linux 2.6.9-rc1-mm3 is detected.
Reiser4 is going to be created on /dev/loop0.
(Yes/No): Yes
ioctl32(mkfs.reiser4:1949): Unknown cmd fd(3) cmd(40081272){00} arg(efffc738) on /dev/loop0
i o c t l 3 2 ( m k f s . r e i s e r 4 : 1 9 4 9 ) : U n k n o w n c m d sC r/edaetvi/nlgo orpe0i0 8 1 2 7 2 ) { 0 0 } a r g ( e f f f c 7 3 8 )
er4 on /dev/loop0 ... Bus error
#
With the below patch, I get:
# /usr/local/sbin/mkfs.reiser4 -f -b 4096 /dev/loop0
/usr/local/sbin/mkfs.reiser4 1.0.0
Copyright (C) 2001, 2002, 2003, 2004 by Hans Reiser, licensing governed by
reiser4progs/COPYING.
Block size 4096 will be used.
Linux 2.6.9-rc1-mm3 is detected.
Reiser4 is going to be created on /dev/loop0.
(Yes/No): Yes
Creating reiser4 on /dev/loop0 ... Bus error
#
The S_ISBLK() check is so that other assumptions about the file being
a block device elsewhere won't be tripped up; it's not strictly
necessary, only retained so as not to expose unaudited code to a new
situation for which it's not prepared.
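
For reference, the size probe the patch switches to can be written
stand-alone roughly as follows. This is a sketch, not the libaal code:
fd_size_by_seek() and path_size() are names made up here, and it
assumes lseek(SEEK_END) reports the size of whatever the descriptor
refers to, which is what the patch relies on for block devices.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Size in bytes of the object behind `fd`, found by seeking to the
 * end. The previous file offset is restored before returning.
 * Returns (off_t)-1 if the descriptor cannot be seeked. */
static off_t fd_size_by_seek(int fd)
{
    off_t saved, size;

    saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t)-1)
        return (off_t)-1;
    size = lseek(fd, 0, SEEK_END);
    if (lseek(fd, saved, SEEK_SET) == (off_t)-1)
        return (off_t)-1;
    return size;
}

/* Convenience wrapper: open `path` read-only and probe its size. */
static off_t path_size(const char *path)
{
    off_t size;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return (off_t)-1;
    size = fd_size_by_seek(fd);
    close(fd);
    return size;
}
```

No ioctl is involved, so nothing depends on the 32-bit compatibility
layer translating BLKGETSIZE64.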
strace(1) before and after the patch is included here. The backtrace
from the core dump is:
#0 0x7005e464 in cde40_insert_units () from /usr/local/lib/libreiser4-1.0.so.0
(gdb) bt
#0 0x7005e464 in cde40_insert_units () from /usr/local/lib/libreiser4-1.0.so.0
#1 0x7004fa44 in node40_modify () from /usr/local/lib/libreiser4-1.0.so.0
#2 0x7003c608 in cb_node_insert () from /usr/local/lib/libreiser4-1.0.so.0
#3 0x7003c58c in reiser4_node_modify ()
from /usr/local/lib/libreiser4-1.0.so.0
#4 0x7003ee7c in reiser4_tree_modify ()
from /usr/local/lib/libreiser4-1.0.so.0
#5 0x7006bc10 in obj40_insert () from /usr/local/lib/libreiser4-1.0.so.0
#6 0x7006d9e8 in dir40_create () from /usr/local/lib/libreiser4-1.0.so.0
#7 0x70041610 in reiser4_object_create ()
from /usr/local/lib/libreiser4-1.0.so.0
#8 0x00012160 in main ()
-- wli
Index: libaal-1.0.0/src/file.c
===================================================================
--- libaal-1.0.0.orig/src/file.c 2004-01-08 06:49:40.000000000 -0800
+++ libaal-1.0.0/src/file.c 2004-09-07 19:36:49.593844072 -0700
@@ -193,59 +193,31 @@
return !aal_strncmp(file1, file2, aal_strlen(file1));
}
-#if defined(__linux__) && defined(_IOR) && !defined(BLKGETSIZE64)
-# define BLKGETSIZE64 _IOR(0x12, 114, uint64_t)
-#endif
-
/* Handler for "len" operation for use with file device. See bellow for
understanding where it is used. */
static count_t file_len(
aal_device_t *device) /* file device, lenght will be obtained from */
{
- uint64_t size;
- off_t max_off = 0;
+ off_t off, size;
+ struct stat stat_buf;
+ int fd;
- if (!device)
+ if (!device)
return INVAL_BLK;
-
-#ifdef BLKGETSIZE64
- if ((int)ioctl(*((int *)device->entity), BLKGETSIZE64, &size) >= (int)0) {
- uint32_t block_count;
-
- size = (size / 4096) * 4096 / device->blksize;
- block_count = size;
-
- if ((uint64_t)block_count != size) {
- aal_fatal("The partition size is too big.");
- return INVAL_BLK;
- }
-
- return (count_t)block_count;
- }
-
-#endif
-
-#ifdef BLKGETSIZE
- {
- unsigned long l_size;
-
- if (ioctl(*((int *)device->entity), BLKGETSIZE, &l_size) >= 0) {
- size = l_size;
- return (count_t)((size * 512 / 4096) * 4096 /
- device->blksize);
- }
- }
-
-#endif
-
- if ((max_off = lseek(*((int *)device->entity),
- 0, SEEK_END)) == (off_t)-1)
- {
- file_error(device);
+ fd = *(int *)device->entity;
+ if (fstat(fd, &stat_buf))
+ return INVAL_BLK;
+ if (!S_ISBLK(stat_buf.st_mode))
+ return INVAL_BLK;
+ off = lseek(fd, 0, SEEK_CUR);
+ size = lseek(fd, 0, SEEK_END);
+ if (lseek(fd, off, SEEK_SET) == (off_t)-1)
+ errno = 0;
+ if (size == (off_t)-1) {
+ errno = 0;
return INVAL_BLK;
}
-
- return (count_t)(max_off / device->blksize);
+ return (size & ~4096ULL)/device->blksize;
}
/* Initializing the file device operations. They are used when any operation of
On Tue, Sep 07, 2004 at 07:43:19PM -0700, William Lee Irwin III wrote:
> - return (count_t)(max_off / device->blksize);
> + return (size & ~4096ULL)/device->blksize;
Correcting this to return (size & ~4095ULL)/device->blksize does not
fix the coredumps.
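
For the record, the two masks differ as follows. ~4096ULL clears only
bit 12, while ~4095ULL clears the low 12 bits and so rounds down to a
multiple of 4096. A small stand-alone illustration (the function names
are made up for this example):

```c
#include <stdint.h>

/* Clears only bit 12 (value 4096). This is NOT a rounding operation. */
static uint64_t mask_single_bit(uint64_t size)
{
    return size & ~4096ULL;
}

/* Clears the low 12 bits: rounds down to a multiple of 4096. */
static uint64_t round_down_4k(uint64_t size)
{
    return size & ~4095ULL;
}
```

With size = 5000, the first form gives 904 and the second gives 4096;
with size = 4096 the first form gives 0. Sizes whose bit 12 happens to
be clear, such as 8192 or 10000, pass through the first form unchanged.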
-- wli
Gunnar Ritter <[email protected]> writes:
> Besides, copying xattrs is usually permitted (POSIX.1-2004, XCU cp):
>
> # If the implementation provides additional or alternate access control
> # mechanisms (see the Base Definitions volume of IEEE Std 1003.1-2001,
> # Section 4.4, File Access Permissions), their effect on copies of files
> # is implementation-defined.
In <[email protected]> you wrote:
>A POSIX.1-conforming cp implementation would not be allowed to copy
>additional streams, unless either additional options are given or the
>type of the file being copied is other than S_IFREG.
I read this as POSIX mandating that cp must not copy anything
but the file contents. That is what I called broken.
If we implement named streams as xattrs that can be accessed with
openat(..., O_XATTR), this means that cp is allowed to copy the xattrs
(well, named streams don't necessarily have to be "alternate access
control mechanisms", but they can use the same xattr namespace).
That's quite ok in that case.
/Christer
Salut,
On Tue, Sep 07, 2004 at 12:50:16AM -0500, David Masover wrote:
> Transparently?
>
> It _works_ to do
> zcat file.gz > /tmp/file
> vim /tmp/file
> gzip -c /tmp/file > file.gz
> rm /tmp/file
> You can even do that as a script, call it zvim. You could even do it as
> a generic script, where "vim" is replaced with "$1". But is it as
> elegant as transparently compressed files?
...or you get yourself a sane editor which supports
gzopen/gzread/gzwrite/gzclose. You can do that in userland, no kernel
implementation needed.
Tonnerre
Christer Weinigel <[email protected]> wrote:
> Gunnar Ritter <[email protected]> writes:
>
> > Besides, copying xattrs is usually permitted (POSIX.1-2004, XCU cp):
> >
> > # If the implementation provides additional or alternate access control
> > # mechanisms (see the Base Definitions volume of IEEE Std 1003.1-2001,
> > # Section 4.4, File Access Permissions), their effect on copies of files
> > # is implementation-defined.
>
> In <[email protected]> you wrote:
>
> >A POSIX.1-conforming cp implementation would not be allowed to copy
> >additional streams, unless either additional options are given or the
> >type of the file being copied is other than S_IFREG.
>
> I read this as POSIX mandating that cp must not copy anything
> but the file contents. That is what I called broken.
It would be really helpful if you read the specification before you
comment on it or try to interpret my wording further.
> If we implement named streams as xattrs and that can be accessed with
> openat(..., O_XATTR) this means that cp is allowed to copy the xattrs
No. Not if the file type of such beasts remains S_IFREG.
> (well, named streams don't necessarily have to be "alternate access
> control mechanisms", but they can use the same xattr namespace).
POSIX does not know anything about the 'xattr namespace'. It just
allows 'additional or alternate access control mechanisms'. Which
are, in turn, well-defined terms in the standard again. You can
read about them too <http://www.unix.org/version3/>.
Gunnar