Hi ALexander,
This patch combines your previous patch with 2 changes I have just
suggested. Both changes are obvious (and correct).
(correction to my previous message -- the immutable thing to return EPERM
is not just a good idea, it is imperative and Linux already does that --
so we can drop that check from if() unconditionally, which is what the
patch below does).
Tested under 2.4.0-test12-pre6.
Regards,
Tigran
--- linux/fs/open.c Thu Oct 26 16:11:21 2000
+++ work/fs/open.c Wed Dec 6 08:05:43 2000
@@ -102,7 +102,12 @@
goto out;
inode = nd.dentry->d_inode;
- error = -EACCES;
+ /* For directories it's -EISDIR, for other non-regulars - -EINVAL */
+ error = -EISDIR;
+ if (S_ISDIR(inode->i_mode))
+ goto dput_and_out;
+
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode))
goto dput_and_out;
@@ -110,12 +115,8 @@
if (error)
goto dput_and_out;
- error = -EROFS;
- if (IS_RDONLY(inode))
- goto dput_and_out;
-
error = -EPERM;
- if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
+ if (IS_APPEND(inode))
goto dput_and_out;
/*
@@ -163,7 +164,7 @@
goto out;
dentry = file->f_dentry;
inode = dentry->d_inode;
- error = -EACCES;
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
goto out_putf;
error = -EPERM;
On Wed, 6 Dec 2000, Tigran Aivazian wrote:
> (correction to my previous message -- the immutable thing to return
EPERM
~~~~~
I meant EACCES. And I think Linux does the right thing but FreeBSD is
broken. The rationale is in the definition of EACCES and EPERM, taken from
SuSv2 (and why would you let it "be damned" if you just relied on it
yourself in your own patch? Though you used an old/obsolete version called
POSIX).
Let's look at the definitions:
[EACCES]
Permission denied An attempt was made to access a file in a way
forbidden by its file access permissions.
[EPERM]
Operation not permitted An attempt was made to perform an
operation limited to processes with appropriate privileges or to
the owner of a file or other resource.
Now, in the case of truncating immutable files we are trying to violate
file access permissions _and_ there is no special privilege that we could
possess which would allow us to succeed in this case. So EPERM is not
right but EACCES is right. So, the current state of Linux is correct but
inefficient and the patch below is an optimization (+ Al Viro's fixes).
Regards,
Tigran
> is not just a good idea, it is imperative and Linux already does that --
> so we can drop that check from if() unconditionally, which is what the
> patch below does).
>
> Tested under 2.4.0-test12-pre6.
>
> Regards,
> Tigran
>
> --- linux/fs/open.c Thu Oct 26 16:11:21 2000
> +++ work/fs/open.c Wed Dec 6 08:05:43 2000
> @@ -102,7 +102,12 @@
> goto out;
> inode = nd.dentry->d_inode;
>
> - error = -EACCES;
> + /* For directories it's -EISDIR, for other non-regulars - -EINVAL */
> + error = -EISDIR;
> + if (S_ISDIR(inode->i_mode))
> + goto dput_and_out;
> +
> + error = -EINVAL;
> if (!S_ISREG(inode->i_mode))
> goto dput_and_out;
>
> @@ -110,12 +115,8 @@
> if (error)
> goto dput_and_out;
>
> - error = -EROFS;
> - if (IS_RDONLY(inode))
> - goto dput_and_out;
> -
> error = -EPERM;
> - if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
> + if (IS_APPEND(inode))
> goto dput_and_out;
>
> /*
> @@ -163,7 +164,7 @@
> goto out;
> dentry = file->f_dentry;
> inode = dentry->d_inode;
> - error = -EACCES;
> + error = -EINVAL;
> if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
> goto out_putf;
> error = -EPERM;
>
>
On Wed, 6 Dec 2000, Tigran Aivazian wrote:
>
> This patch combines your previous patch with 2 changes I have just
> suggested. Both changes are obvious (and correct).
Why remove the EROFS test?
Linus
On Wed, 6 Dec 2000, Linus Torvalds wrote:
>
>
> On Wed, 6 Dec 2000, Tigran Aivazian wrote:
> >
> > This patch combines your previous patch with 2 changes I have just
> > suggested. Both changes are obvious (and correct).
>
> Why remove the EROFS test?
Tigran has a point - permission() does
if ((mask & S_IWOTH) && IS_RDONLY(inode) &&
(S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
return -EROFS; /* Nobody gets write access to a read-only fs */
so do_sys_truncate() has no chance in hell to trigger its own check for
IS_RDONLY() - we have
if (!S_ISREG(inode->i_mode))
goto dput_and_out;
error = permission(inode,MAY_WRITE);
if (error)
goto dput_and_out;
error = -EROFS;
if (IS_RDONLY(inode))
goto dput_and_out;
there, so if it's not a regular file we die before the call of permission(),
if it is and fs is readonly - we get -EROFS from permission() and die
there. In either case we don't get to the IS_RDONLY() check...
On Wed, 6 Dec 2000, Alexander Viro wrote:
> On Wed, 6 Dec 2000, Linus Torvalds wrote:
> > Why remove the EROFS test?
>
> there, so if it's not a regular file we die before the call of permission(),
> if it is and fs is readonly - we get -EROFS from permission() and die
> there. In either case we don't get to the IS_RDONLY() check...
just to add that I removed the immutable test for almost the same reason:
a) we don't hit that test because permission takes care of it (for
regulars/dirs/symlinks but here only regulars are important)
b) the error returned by permission EACCES makes a lot more sense than
EPERM we were trying to return. I already quotes Single UNIX v2 to explain
why.
Now, here is the same patch, except:
a) tested under test12-pre7
b) I suggest we explain why we return EPERM for append-only files despite
the fact that SuS and common sense suggest EACCES.
Facts verified under FreeBSD 4.2
The rationale for being compatible with 4.4BSD on append-only but not on
immutable is -- for immutable we can do the test by means of permission()
fast but for append-only we would need an extra if() above permission so
let's just be BSD-compatible. Alternatively, one could ignore BSD
altogether and return EACCES in both. Or, one could ignore SuS altogether
and return EPERM for both immutable and append-only. It is a matter of
taste so... I chose something in the middle , perhaps non-intuitive but
optimized for speed and the size of code.
Regards,
Tigran
--- linux/fs/open.c Thu Oct 26 16:11:21 2000
+++ work/fs/open.c Thu Dec 7 09:06:50 2000
@@ -102,7 +102,12 @@
goto out;
inode = nd.dentry->d_inode;
- error = -EACCES;
+ /* For directories it's -EISDIR, for other non-regulars - -EINVAL */
+ error = -EISDIR;
+ if (S_ISDIR(inode->i_mode))
+ goto dput_and_out;
+
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode))
goto dput_and_out;
@@ -110,12 +115,9 @@
if (error)
goto dput_and_out;
- error = -EROFS;
- if (IS_RDONLY(inode))
- goto dput_and_out;
-
+ /* EPERM for 4.4BSD compat. EACCES would make more sense */
error = -EPERM;
- if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
+ if (IS_APPEND(inode))
goto dput_and_out;
/*
@@ -163,7 +165,7 @@
goto out;
dentry = file->f_dentry;
inode = dentry->d_inode;
- error = -EACCES;
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
goto out_putf;
error = -EPERM;
On Thu, 7 Dec 2000, Tigran Aivazian wrote:
> a) we don't hit that test because permission takes care of it (for
> regulars/dirs/symlinks but here only regulars are important)
omit what is in brackets but everything in email and the patch itself are
valid and tested. The detail in bracket above was unimportant since the
main text provided enough proof.
Regards,
Tigran
On Thu, 7 Dec 2000, Tigran Aivazian wrote:
> The rationale for being compatible with 4.4BSD on append-only but not on
> immutable is -- for immutable we can do the test by means of permission()
> fast but for append-only we would need an extra if() above permission so
> let's just be BSD-compatible. Alternatively, one could ignore BSD
> altogether and return EACCES in both. Or, one could ignore SuS altogether
> and return EPERM for both immutable and append-only. It is a matter of
> taste so... I chose something in the middle , perhaps non-intuitive but
> optimized for speed and the size of code.
Big, loud "yuck". The first rule of optimization: don't. I agree that
in the current code test for immutable is dead. However, the rationale
above is a BS. The path is nowhere near time-critical ones. Moreover,
it doesn't make the code simpler.
History: 4.4BSD treats immutable and append-only the same way it treats
removal from sticky-bit directories, etc. I.e. consistenly returns
-EPERM. Yes, on access(), open() for write, directory operations, yodda,
yodda.
So correct solution may very well be to change the return value of
permission(9). FWIW, MAY_TRUNCATE might be a good idea - notice that
knfsd already has something like that. It makes sense for directories,
BTW - having may_delete() drop the IS_APPEND() test and pass MAY_TRUNCATE
to permission() instead.
<looking at the truncate(2) manpage>
Oh, lovely - where the hell had the following come from?
% man truncate
...
EINVAL The pathname contains a character with the high-
order bit set.
...
Andries, would you mind removing that, erm, statement? I'm curious about
its genesis - AFAIK we had 8bit-clean namei for ages (quite possibly since
0.01).
On Thu, 7 Dec 2000, Alexander Viro wrote:
> So correct solution may very well be to change the return value of
> permission(9). FWIW, MAY_TRUNCATE might be a good idea - notice that
> knfsd already has something like that. It makes sense for directories,
> BTW - having may_delete() drop the IS_APPEND() test and pass MAY_TRUNCATE
> to permission() instead.
Alexander,
The above sounds like an excellent idea (MAY_TRUNCATE specifically!) but
please understand that I simply wanted to minimize the amount of
fundamental code changes. Yes, it would be much cleaner to have a MAY_XXX
which groups little specific tests together for every scenario, e.g. for
truncate().
So, I wasn't suggesting BS as the "best thing for all times" but merely as
the "best thing at the moment without changing too much and yet not
leaving dead code in there and not adding too much new code".
Regards,
Tigran
On Thu, 7 Dec 2000, Alexander Viro wrote:
> On Thu, 7 Dec 2000, Tigran Aivazian wrote:
>
> > The rationale for being compatible with 4.4BSD on append-only but not on
> > immutable is -- for immutable we can do the test by means of permission()
> > fast but for append-only we would need an extra if() above permission so
> > let's just be BSD-compatible. Alternatively, one could ignore BSD
> > altogether and return EACCES in both. Or, one could ignore SuS altogether
> > and return EPERM for both immutable and append-only. It is a matter of
> > taste so... I chose something in the middle , perhaps non-intuitive but
> > optimized for speed and the size of code.
>
> So correct solution may very well be to change the return value of
> permission(9). FWIW, MAY_TRUNCATE might be a good idea - notice that
> knfsd already has something like that. It makes sense for directories,
> BTW - having may_delete() drop the IS_APPEND() test and pass MAY_TRUNCATE
> to permission() instead.
>
Ok, so you agree with me (I ignore your comment on "BS" because it would
imply that your own suggestion is BS too and that is not nice) that one of
the three alternatives must be chosen. Which one? You have not specified
yet. So far you only said that a different implementation, i.e. a
different place to put the checks, is preferrable.
Which of these three you think is a good idea:
a) be 4.4BSD-compatible and return EPERM for both immutable and
append-only
b) be SuSv2-compatible and return EACCES for both immutable and
append-only
c) be half-compatible with BSD and half-compatible with SuS but with
minimum changes to current code
I understand and agree that c) is not a very good idea. Then, a) or b)?
Regards,
Tigran
On Thu, 7 Dec 2000, Tigran Aivazian wrote:
> yet. So far you only said that a different implementation, i.e. a
> different place to put the checks, is preferrable.
-EPERM returned by permission() if we ask for write access to immutable.
Al, currently walking through the /usr/share/man/man2 and swearing silently...
On Thu, Dec 07, 2000 at 08:23:59AM -0500, Alexander Viro wrote:
> <looking at the truncate(2) manpage>
> Oh, lovely - where the hell had the following come from?
> % man truncate
> ...
> EINVAL The pathname contains a character with the high-
> order bit set.
> ...
> Andries, would you mind removing that, erm, statement? I'm curious about
> its genesis - AFAIK we had 8bit-clean namei for ages (quite possibly since
> 0.01).
I removed it in man-pages-1.30.
Apparently you have an older version.
(1.31 has been released; expect 1.32 soon)
Andries
On Thu, Dec 07, 2000 at 10:24:31AM -0500, Alexander Viro wrote:
> Al, currently walking through the /usr/share/man/man2 and swearing silently...
Swearing? At the POSIX decisions or at the man page quality?
In the latter case, additions and corrections are very welcome.
Make sure that you have 1.31 installed.
Andries
On Thu, 7 Dec 2000, Andries Brouwer wrote:
> On Thu, Dec 07, 2000 at 10:24:31AM -0500, Alexander Viro wrote:
>
> > Al, currently walking through the /usr/share/man/man2 and swearing silently...
>
> Swearing? At the POSIX decisions or at the man page quality?
Mostly at the out-of-sync/not-all-errors-documented kind of places and amount
of fun involved in getting them in sync with the tree (and with each other,
for that matter). Oh, well...
> In the latter case, additions and corrections are very welcome.
> Make sure that you have 1.31 installed.
Grabbed it. BTW, if you still have 1.7, 1.10, 1.13 and 1.14... I've managed
to dig the rest out, but these seem to be gone (looks like a modified 1.10
is out there, but...)
I was thinking about putting the whole bunch under the CVS. If you have
the missing ones somewhere and could send an URL...
BTW, could we finally lose mpx(2)? Very few programs used it under v7 and
that experiment had been abandoned early in 80s, so it's not like we needed
it for porting. phys(2) is already gone (not from unimplemented(2), though),
so we have a precedent of removing such stuff.
> BTW, if you still have 1.7, 1.10, 1.13 and 1.14...
See ftp://ftp.cwi.nl/pub/aeb/manpages/ (will soon disappear again).
> BTW, could we finally lose mpx(2)?
Maybe we lost it - I find sys_mpx only in a comment in arch/arm/kernel/calls.S
Andries
On Fri, 8 Dec 2000 [email protected] wrote:
> > BTW, if you still have 1.7, 1.10, 1.13 and 1.14...
>
> See ftp://ftp.cwi.nl/pub/aeb/manpages/ (will soon disappear again).
Got them, thanks.
> > BTW, could we finally lose mpx(2)?
>
> Maybe we lost it - I find sys_mpx only in a comment in arch/arm/kernel/calls.S
Sure, but man2/mpx.2 is alive and well...
On Thu, Dec 07, 2000 at 09:59:18PM -0500, Alexander Viro wrote:
> On Fri, 8 Dec 2000 [email protected] wrote:
> > > BTW, could we finally lose mpx(2)?
> >
> > Maybe we lost it - I find sys_mpx only in a comment in arch/arm/kernel/calls.S
>
> Sure, but man2/mpx.2 is alive and well...
manpages-dev 1.31-1 of debian:
$ ll /usr/share/man/man2/mpx.2.gz
... /usr/share/man/man2/mpx.2.gz -> unimplemented.2.gz
$ man 2 mpx
NAME
afs_syscall, break, ftime, gtty, lock, mpx, phys, prof,
profil, stty, ulimit - unimplemented system calls
--
marko