Hi,

This is V2 of the patch. Posted V1 here.

https://lore.kernel.org/linux-fsdevel/[email protected]/

Right now we don't allow setting user.* xattrs on symlinks and special
files at all. Initially I thought that the real reason behind this
restriction was quota limitations, but from the last conversation it seemed
that the real reason is that permission bits on symlinks and special files
are special and different from regular files and directories, hence
this restriction is in place.

Given it probably is not a quota issue (I tested with xfs user quota
enabled and quota restrictions kicked in on symlink), I dropped the
idea of allowing user.* xattrs if the process has CAP_SYS_RESOURCE.

Instead this version of the patch allows reading/writing user.* xattrs
on symlinks and special files if the caller is the owner or privileged (has
CAP_FOWNER) w.r.t. the inode.

We need this for the virtiofs daemon. I also found one more user. Giuseppe
seems to set user.* xattrs on unprivileged fuse-overlay as well,
and he ran into a similar issue. So fuse-overlay should benefit from
this change as well.

Who wants to set user.* xattr on symlink/special files
-----------------------------------------------------

In virtiofs, the actual file server is the virtiofsd daemon running on the host.
There we have a mode where xattrs can be remapped to something else.
For example, security.selinux can be remapped to
user.virtiofsd.security.selinux on the host.

This remapping is useful when SELinux is enabled in the guest and virtiofs
is being used as rootfs. Guest and host SELinux policy might not match
and host policy might deny security.selinux xattr setting by guest
onto host. Or host might have SELinux disabled and in that case to
be able to set security.selinux xattr, virtiofsd will need to have
CAP_SYS_ADMIN (which we are trying to avoid). Being able to remap
guest security.selinux (or other xattrs) on host to something else
is also better from a security point of view.

But when we tried this, we noticed that SELinux relabeling in the guest
was failing on some symlinks. When I debugged a little more, I
found that "user.*" xattrs are not allowed on symlinks
or special files.

So if we allow the owner (or CAP_FOWNER) to set user.* xattrs, it will
allow virtiofsd to arbitrarily remap the guest's xattrs to something
else on the host. That solves this SELinux issue and allows
two SELinux policies (host and guest) to co-exist without
interfering with each other.
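
For illustration only, the remapping described above can be sketched as a pair of prefix helpers. This is not virtiofsd's code: the function names and the fixed "user.virtiofsd." prefix are assumptions taken from the example name in this mail, and a real daemon may use configurable rules instead.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed prefix, modeled on the example name in this mail. */
#define REMAP_PREFIX "user.virtiofsd."

/*
 * Map a guest xattr name to the name stored on the host.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int remap_guest_xattr(const char *guest_name, char *out, size_t out_len)
{
	int n = snprintf(out, out_len, "%s%s", REMAP_PREFIX, guest_name);

	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/*
 * Reverse mapping: return a pointer to the guest-visible name, or NULL
 * if the host name does not carry the prefix.
 */
const char *unmap_host_xattr(const char *host_name)
{
	size_t plen = strlen(REMAP_PREFIX);

	if (strncmp(host_name, REMAP_PREFIX, plen) != 0)
		return NULL;
	return host_name + plen;
}
```

With such a scheme every remapped name lands in the user.* namespace on the host, which is why the restriction on symlinks and special files gets in the way.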
Thanks
Vivek
Vivek Goyal (1):
xattr: Allow user.* xattr on symlink and special files
fs/xattr.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--
2.25.4
Currently user.* xattrs are not allowed on symlinks and special files.

man xattr and recent discussion suggested that the primary reason for this
restriction is that file permissions for symlinks and special files
are a little different from regular files and directories.

Symlinks are world readable/writable, and if user xattrs were
to be permitted, it would allow unprivileged users to dump a huge amount
of user.* xattrs on symlinks without any control.

For special files, permissions typically control the capability to read/write
from devices (and not necessarily from the filesystem). So if a user can
write to a device (/dev/null), it does not necessarily mean it should be allowed
to write a large number of user.* xattrs on the filesystem the device node is
residing in.

This patch proposes to relax the restrictions a bit and allow the file owner
or a privileged user (CAP_FOWNER) to read/write user.* xattrs
on symlinks and special files.

The virtiofs daemon needs to store user.* xattrs on all files
(including symlinks and special files), and currently that fails. This
patch should help.

Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
Signed-off-by: Vivek Goyal <[email protected]>
---
fs/xattr.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/xattr.c b/fs/xattr.c
index 5c8c5175b385..2f1855c8b620 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -120,12 +120,14 @@ xattr_permission(struct user_namespace *mnt_userns, struct inode *inode,
 	}
 
 	/*
-	 * In the user.* namespace, only regular files and directories can have
-	 * extended attributes. For sticky directories, only the owner and
-	 * privileged users can write attributes.
+	 * In the user.* namespace, for symlinks and special files, only
+	 * the owner and privileged users can read/write attributes.
+	 * For sticky directories, only the owner and privileged users can
+	 * write attributes.
 	 */
 	if (!strncmp(name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN)) {
-		if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
+		if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode) &&
+		    !inode_owner_or_capable(mnt_userns, inode))
 			return (mask & MAY_WRITE) ? -EPERM : -ENODATA;
 		if (S_ISDIR(inode->i_mode) && (inode->i_mode & S_ISVTX) &&
 		    (mask & MAY_WRITE) &&
--
2.25.4
On Thu, Jul 08, 2021 at 01:57:38PM -0400, Vivek Goyal wrote:
> Currently user.* xattr are not allowed on symlink and special files.
>
> man xattr and recent discussion suggested that primary reason for this
> restriction is how file permissions for symlinks and special files
> are little different from regular files and directories.
>
> For symlinks, they are world readable/writable and if user xattr were
> to be permitted, it will allow unpriviliged users to dump a huge amount
> of user.* xattrs on symlinks without any control.
>
> For special files, permissions typically control capability to read/write
> from devices (and not necessarily from filesystem). So if a user can
> write to device (/dev/null), does not necessarily mean it should be allowed
> to write large number of user.* xattrs on the filesystem device node is
> residing in.
>
> This patch proposes to relax the restrictions a bit and allow file owner
> or priviliged user (CAP_FOWNER), to be able to read/write user.* xattrs
> on symlink and special files.
>
> virtiofs daemon has a need to store user.* xatrrs on all the files
> (including symlinks and special files), and currently that fails. This
> patch should help.
>
> Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
> Signed-off-by: Vivek Goyal <[email protected]>
> ---
Seems reasonable and useful.
Acked-by: Christian Brauner <[email protected]>
One question: do all filesystems supporting xattrs deal with setting them
on symlinks/device files correctly?
On Fri, Jul 09, 2021 at 11:19:15AM +0200, Christian Brauner wrote:
> On Thu, Jul 08, 2021 at 01:57:38PM -0400, Vivek Goyal wrote:
> > Currently user.* xattr are not allowed on symlink and special files.
> >
> > man xattr and recent discussion suggested that primary reason for this
> > restriction is how file permissions for symlinks and special files
> > are little different from regular files and directories.
> >
> > For symlinks, they are world readable/writable and if user xattr were
> > to be permitted, it will allow unpriviliged users to dump a huge amount
> > of user.* xattrs on symlinks without any control.
> >
> > For special files, permissions typically control capability to read/write
> > from devices (and not necessarily from filesystem). So if a user can
> > write to device (/dev/null), does not necessarily mean it should be allowed
> > to write large number of user.* xattrs on the filesystem device node is
> > residing in.
> >
> > This patch proposes to relax the restrictions a bit and allow file owner
> > or priviliged user (CAP_FOWNER), to be able to read/write user.* xattrs
> > on symlink and special files.
> >
> > virtiofs daemon has a need to store user.* xatrrs on all the files
> > (including symlinks and special files), and currently that fails. This
> > patch should help.
> >
> > Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
> > Signed-off-by: Vivek Goyal <[email protected]>
> > ---
>
> Seems reasonable and useful.
> Acked-by: Christian Brauner <[email protected]>
>
> One question, do all filesystem supporting xattrs deal with setting them
> on symlinks/device files correctly?
Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
symlink and device node on ext4, xfs and btrfs and it works fine.
https://github.com/rhvgoyal/misc/blob/master/generic-programs/user-xattr-special-files.sh
I probably can add some more filesystems to test.
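
For readers who prefer C over the shell script, a minimal probe of the same idea might look like the sketch below. It is not the linked script: the function name, the probe file name and the user.foo attribute are chosen for illustration, and it only covers the symlink case. It relies on lsetxattr(2) and lgetxattr(2), which operate on the link itself rather than its target.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>
#include <unistd.h>

/*
 * Create a symlink under dir, try to set and read back user.foo on the
 * link itself, then clean up. Returns 0 on success or the errno of the
 * first failing step (EIO if the value read back does not match).
 */
int probe_user_xattr_on_symlink(const char *dir)
{
	char link_path[4096];
	char buf[16];
	const char value[] = "bar";
	int err = 0;

	snprintf(link_path, sizeof(link_path), "%s/xattr-probe-%ld",
		 dir, (long)getpid());
	unlink(link_path);
	/* The target does not need to exist for this probe. */
	if (symlink("probe-target", link_path) != 0)
		return errno;

	if (lsetxattr(link_path, "user.foo", value, strlen(value), 0) != 0) {
		err = errno;
	} else {
		ssize_t n = lgetxattr(link_path, "user.foo", buf, sizeof(buf));

		if (n < 0)
			err = errno;
		else if ((size_t)n != strlen(value) || memcmp(buf, value, (size_t)n) != 0)
			err = EIO;
	}

	unlink(link_path);
	return err;
}
```

The result depends on the kernel and on the filesystem backing dir, so the same call can return 0 on one mount and an error such as EPERM on another.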
Thanks
Vivek
On 7/9/2021 8:27 AM, Vivek Goyal wrote:
> On Fri, Jul 09, 2021 at 11:19:15AM +0200, Christian Brauner wrote:
>> On Thu, Jul 08, 2021 at 01:57:38PM -0400, Vivek Goyal wrote:
>>> Currently user.* xattr are not allowed on symlink and special files.
>>>
>>> man xattr and recent discussion suggested that primary reason for this
>>> restriction is how file permissions for symlinks and special files
>>> are little different from regular files and directories.
>>>
>>> For symlinks, they are world readable/writable and if user xattr were
>>> to be permitted, it will allow unpriviliged users to dump a huge amount
>>> of user.* xattrs on symlinks without any control.
>>>
>>> For special files, permissions typically control capability to read/write
>>> from devices (and not necessarily from filesystem). So if a user can
>>> write to device (/dev/null), does not necessarily mean it should be allowed
>>> to write large number of user.* xattrs on the filesystem device node is
>>> residing in.
>>>
>>> This patch proposes to relax the restrictions a bit and allow file owner
>>> or priviliged user (CAP_FOWNER), to be able to read/write user.* xattrs
>>> on symlink and special files.
>>>
>>> virtiofs daemon has a need to store user.* xatrrs on all the files
>>> (including symlinks and special files), and currently that fails. This
>>> patch should help.
>>>
>>> Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
>>> Signed-off-by: Vivek Goyal <[email protected]>
>>> ---
>> Seems reasonable and useful.
>> Acked-by: Christian Brauner <[email protected]>
>>
>> One question, do all filesystem supporting xattrs deal with setting them
>> on symlinks/device files correctly?
> Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
> symlink and device node on ext4, xfs and btrfs and it works fine.
How about nfs, tmpfs, overlayfs and/or some of the other less conventional
filesystems?
On 7/8/21 13:57, Vivek Goyal wrote:
> Hi,
>
> This is V2 of the patch. Posted V1 here.
>
> https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> Right now we don't allow setting user.* xattrs on symlinks and special
> files at all. Initially I thought that real reason behind this
> restriction is quota limitations but from last conversation it seemed
> that real reason is that permission bits on symlink and special files
> are special and different from regular files and directories, hence
> this restriction is in place.
>
> Given it probably is not a quota issue (I tested with xfs user quota
> enabled and quota restrictions kicked in on symlink), I dropped the
> idea of allowing user.* xattr if process has CAP_SYS_RESOURCE.
>
> Instead this version of patch allows reading/writing user.* xattr
> on symlink and special files if caller is owner or priviliged (has
> CAP_FOWNER) w.r.t inode.
>
> We need this for virtiofs daemon. I also found one more user. Giuseppe,
> seems to set user.* xattr attrs on unpriviliged fuse-overlay as well
> and he ran into similar issue. So fuse-overlay should benefit from
> this change as well.
>
> Who wants to set user.* xattr on symlink/special files
> -----------------------------------------------------
>
> In virtiofs, actual file server is virtiosd daemon running on host.
> There we have a mode where xattrs can be remapped to something else.
> For example security.selinux can be remapped to
> user.virtiofsd.securit.selinux on the host.
>
> This remapping is useful when SELinux is enabled in guest and virtiofs
> as being used as rootfs. Guest and host SELinux policy might not match
> and host policy might deny security.selinux xattr setting by guest
> onto host. Or host might have SELinux disabled and in that case to
> be able to set security.selinux xattr, virtiofsd will need to have
> CAP_SYS_ADMIN (which we are trying to avoid). Being able to remap
> guest security.selinux (or other xattrs) on host to something else
> is also better from security point of view.
>
> But when we try this, we noticed that SELinux relabeling in guest
> is failing on some symlinks. When I debugged a little more, I
> came to know that "user.*" xattrs are not allowed on symlinks
> or special files.
>
> So if we allow owner (or CAP_FOWNER) to set user.* xattr, it will
> allow virtiofs to arbitrarily remap guests's xattrs to something
> else on host and that solves this SELinux issue nicely and provides
> two SELinux policies (host and guest) to co-exist nicely without
> interfering with each other.
>
> Thanks
> Vivek
>
>
> Vivek Goyal (1):
> xattr: Allow user.* xattr on symlink and special files
>
> fs/xattr.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
I just wanted to point out that the work Giuseppe is doing is to support
NFS homedirs with container runtimes like rootless Podman.

Basically, fuse-overlayfs on top of an NFS homedir needs to be able to use
user xattrs to store the file permissions and ownership fields that are
presented to containers.

Currently NFS servers do not understand user namespaces, so a
client user attempting to chown to a different user is blocked on the
server, even though the user namespace on the client allows it.

fuse-overlay intercepts the chown from the container and writes out
the permissions and owner/group as user.* xattrs. All the
server sees is the user modifying the xattrs, not chowning the real UID
of the file.
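
To make the idea concrete, here is a small sketch of how ownership could be packed into an xattr value. It is purely illustrative: the attribute name, the "uid:gid:mode" layout and both helper names are assumptions for this example and are not taken from fuse-overlayfs.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical attribute name, chosen for this sketch only. */
#define OWNER_XATTR "user.example.owner"

/*
 * Encode "uid:gid:mode" (mode in octal) into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int encode_owner(unsigned uid, unsigned gid, unsigned mode,
		 char *buf, size_t len)
{
	int n = snprintf(buf, len, "%u:%u:%o", uid, gid, mode);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Decode the same layout. Returns 0 on success, -1 on malformed input. */
int decode_owner(const char *buf, unsigned *uid, unsigned *gid,
		 unsigned *mode)
{
	return sscanf(buf, "%u:%u:%o", uid, gid, mode) == 3 ? 0 : -1;
}
```

A filesystem layer could store the encoded string under OWNER_XATTR with lsetxattr(2) and report the decoded values to the container, while the real owner on the server never changes.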
On Fri, Jul 09, 2021 at 08:34:41AM -0700, Casey Schaufler wrote:
> On 7/9/2021 8:27 AM, Vivek Goyal wrote:
> > On Fri, Jul 09, 2021 at 11:19:15AM +0200, Christian Brauner wrote:
> >> On Thu, Jul 08, 2021 at 01:57:38PM -0400, Vivek Goyal wrote:
> >>> Currently user.* xattr are not allowed on symlink and special files.
> >>>
> >>> man xattr and recent discussion suggested that primary reason for this
> >>> restriction is how file permissions for symlinks and special files
> >>> are little different from regular files and directories.
> >>>
> >>> For symlinks, they are world readable/writable and if user xattr were
> >>> to be permitted, it will allow unpriviliged users to dump a huge amount
> >>> of user.* xattrs on symlinks without any control.
> >>>
> >>> For special files, permissions typically control capability to read/write
> >>> from devices (and not necessarily from filesystem). So if a user can
> >>> write to device (/dev/null), does not necessarily mean it should be allowed
> >>> to write large number of user.* xattrs on the filesystem device node is
> >>> residing in.
> >>>
> >>> This patch proposes to relax the restrictions a bit and allow file owner
> >>> or priviliged user (CAP_FOWNER), to be able to read/write user.* xattrs
> >>> on symlink and special files.
> >>>
> >>> virtiofs daemon has a need to store user.* xatrrs on all the files
> >>> (including symlinks and special files), and currently that fails. This
> >>> patch should help.
> >>>
> >>> Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
> >>> Signed-off-by: Vivek Goyal <[email protected]>
> >>> ---
> >> Seems reasonable and useful.
> >> Acked-by: Christian Brauner <[email protected]>
> >>
> >> One question, do all filesystem supporting xattrs deal with setting them
> >> on symlinks/device files correctly?
> > Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
> > symlink and device node on ext4, xfs and btrfs and it works fine.
>
> How about nfs, tmpfs, overlayfs and/or some of the other less conventional
> filesystems?
tmpfs does not support user.* xattrs at all on any kind of file.

overlayfs works fine. I updated my test too.

nfs seems to have some issues.

- I can set user.foo xattr on symlink and query it back using xattr name.

  getfattr -h -n user.foo foo-link.txt

  But when I try to dump all xattrs on this file, user.foo is being
  filtered out, it looks like. Not sure why.

- I can't set "user.foo" xattr on a device node on nfs and I get
  "Permission denied". I am assuming nfs server is returning this.

I am using knfsd with the following in /etc/exports.

/mnt/test/nfs-server 127.0.0.1(insecure,no_root_squash,rw,async)

Copying Bruce. He might have an idea.
Thanks
Vivek
On Fri, Jul 9, 2021 at 1:59 PM Vivek Goyal <[email protected]> wrote:
> nfs seems to have some issues.
I'm not sure what the expected behavior is for nfs. All I have for
now is some generic troubleshooting ideas, sorry:
> - I can set user.foo xattr on symlink and query it back using xattr name.
>
> getfattr -h -n user.foo foo-link.txt
>
> But when I try to dump all xattrs on this file, user.foo is being
> filtered out it looks like. Not sure why.
Logging into the server and seeing what's set there could help confirm
whether it's the client or server that's at fault. (Or watching the
traffic in wireshark; there are GET/SET/LISTXATTR ops that should be
easy to spot.)
> - I can't set "user.foo" xattr on a device node on nfs and I get
> "Permission denied". I am assuming nfs server is returning this.
Wireshark should tell you whether it's the server or client doing that.
The RFC is https://datatracker.ietf.org/doc/html/rfc8276, and I don't
see any explicit statement about what the server should do in the case
of symlinks or device nodes, but I do see "Any regular file or
directory may have a set of extended attributes", so that was clearly
the assumption. Also, NFS4ERR_WRONG_TYPE is listed as a possible
error return for the xattr ops. But on a quick skim I don't see any
explicit checks in the nfsd code, so I *think* it's just relying on
the vfs for any file type checks.
--b.
On Fri, Jul 09, 2021 at 08:34:41AM -0700, Casey Schaufler wrote:
> >> One question, do all filesystem supporting xattrs deal with setting them
> >> on symlinks/device files correctly?
> > Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
> > symlink and device node on ext4, xfs and btrfs and it works fine.
>
> How about nfs, tmpfs, overlayfs and/or some of the other less conventional
> filesystems?
As a suggestion, perhaps you could take your bash script and turn it
into an xfstests test so we can more easily test various file systems,
both now and once the commit is accepted, to look for regressions?
Cheers,
- Ted
On Fri, 9 Jul 2021 08:34:41 -0700
Casey Schaufler <[email protected]> wrote:
> On 7/9/2021 8:27 AM, Vivek Goyal wrote:
> > On Fri, Jul 09, 2021 at 11:19:15AM +0200, Christian Brauner wrote:
> >> On Thu, Jul 08, 2021 at 01:57:38PM -0400, Vivek Goyal wrote:
> >>> Currently user.* xattr are not allowed on symlink and special files.
> >>>
> >>> man xattr and recent discussion suggested that primary reason for this
> >>> restriction is how file permissions for symlinks and special files
> >>> are little different from regular files and directories.
> >>>
> >>> For symlinks, they are world readable/writable and if user xattr were
> >>> to be permitted, it will allow unpriviliged users to dump a huge amount
> >>> of user.* xattrs on symlinks without any control.
> >>>
> >>> For special files, permissions typically control capability to read/write
> >>> from devices (and not necessarily from filesystem). So if a user can
> >>> write to device (/dev/null), does not necessarily mean it should be allowed
> >>> to write large number of user.* xattrs on the filesystem device node is
> >>> residing in.
> >>>
> >>> This patch proposes to relax the restrictions a bit and allow file owner
> >>> or priviliged user (CAP_FOWNER), to be able to read/write user.* xattrs
> >>> on symlink and special files.
> >>>
> >>> virtiofs daemon has a need to store user.* xatrrs on all the files
> >>> (including symlinks and special files), and currently that fails. This
> >>> patch should help.
> >>>
> >>> Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
> >>> Signed-off-by: Vivek Goyal <[email protected]>
> >>> ---
> >> Seems reasonable and useful.
> >> Acked-by: Christian Brauner <[email protected]>
> >>
> >> One question, do all filesystem supporting xattrs deal with setting them
> >> on symlinks/device files correctly?
> > Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
> > symlink and device node on ext4, xfs and btrfs and it works fine.
>
> How about nfs, tmpfs, overlayfs and/or some of the other less conventional
> filesystems?
>
How about virtiofs then ? :-)
On Fri, Jul 09, 2021 at 04:10:16PM -0400, Bruce Fields wrote:
> On Fri, Jul 9, 2021 at 1:59 PM Vivek Goyal <[email protected]> wrote:
> > nfs seems to have some issues.
>
> I'm not sure what the expected behavior is for nfs. All I have for
> now is some generic troubleshooting ideas, sorry:
>
> > - I can set user.foo xattr on symlink and query it back using xattr name.
> >
> > getfattr -h -n user.foo foo-link.txt
> >
> > But when I try to dump all xattrs on this file, user.foo is being
> > filtered out it looks like. Not sure why.
>
> Logging into the server and seeing what's set there could help confirm
> whether it's the client or server that's at fault. (Or watching the
> traffic in wireshark; there are GET/SET/LISTXATTR ops that should be
> easy to spot.)
>
> > - I can't set "user.foo" xattr on a device node on nfs and I get
> > "Permission denied". I am assuming nfs server is returning this.
>
> Wireshark should tell you whether it's the server or client doing that.
>
> The RFC is https://datatracker.ietf.org/doc/html/rfc8276, and I don't
> see any explicit statement about what the server should do in the case
> of symlinks or device nodes, but I do see "Any regular file or
> directory may have a set of extended attributes", so that was clearly
> the assumption. Also, NFS4ERR_WRONG_TYPE is listed as a possible
> error return for the xattr ops. But on a quick skim I don't see any
> explicit checks in the nfsd code, so I *think* it's just relying on
> the vfs for any file type checks.
Hi Bruce,

Thanks for the response. I am just trying to set a user.foo xattr on
a device node on nfs.

setfattr -n "user.foo" -v "bar" /mnt/nfs/test-dev

and I get -EACCES.

I put in some printk() statements and EACCES is being returned from here.

nfs4_xattr_set_nfs4_user() {
	if (!nfs_access_get_cached(inode, current_cred(), &cache, true)) {
		if (!(cache.mask & NFS_ACCESS_XAWRITE)) {
			return -EACCES;
		}
	}
}

Value of cache.mask=0xd at the time of error.
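
To see why a mask of 0xd trips the check quoted above, the value can be decoded bit by bit. The sketch below is an illustration under assumptions: the SK_ACCESS_* names are made up here, and the bit values are what I understand the ACCESS bits to be from RFC 8276 and the Linux client headers, so they should be verified against your tree.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed access bit values; check them against RFC 8276 and your tree. */
#define SK_ACCESS_READ    0x0001
#define SK_ACCESS_LOOKUP  0x0002
#define SK_ACCESS_MODIFY  0x0004
#define SK_ACCESS_EXTEND  0x0008
#define SK_ACCESS_DELETE  0x0010
#define SK_ACCESS_EXECUTE 0x0020
#define SK_ACCESS_XAREAD  0x0040
#define SK_ACCESS_XAWRITE 0x0080
#define SK_ACCESS_XALIST  0x0100

/* Mirrors the test in the snippet above: is the xattr write bit set? */
bool mask_allows_xattr_write(unsigned mask)
{
	return (mask & SK_ACCESS_XAWRITE) != 0;
}

/* Print the names of all bits set in mask, one per line. */
void print_access_mask(unsigned mask)
{
	static const struct { unsigned bit; const char *name; } bits[] = {
		{ SK_ACCESS_READ, "READ" },       { SK_ACCESS_LOOKUP, "LOOKUP" },
		{ SK_ACCESS_MODIFY, "MODIFY" },   { SK_ACCESS_EXTEND, "EXTEND" },
		{ SK_ACCESS_DELETE, "DELETE" },   { SK_ACCESS_EXECUTE, "EXECUTE" },
		{ SK_ACCESS_XAREAD, "XAREAD" },   { SK_ACCESS_XAWRITE, "XAWRITE" },
		{ SK_ACCESS_XALIST, "XALIST" },
	};

	for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++)
		if (mask & bits[i].bit)
			printf("%s\n", bits[i].name);
}
```

Under these assumed values, 0xd is READ, MODIFY and EXTEND with none of the XA* bits set, which matches the -EACCES seen on the client.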
Thanks
Vivek
On Mon, Jul 12, 2021 at 10:02:47AM -0400, Vivek Goyal wrote:
> On Fri, Jul 09, 2021 at 04:10:16PM -0400, Bruce Fields wrote:
> > On Fri, Jul 9, 2021 at 1:59 PM Vivek Goyal <[email protected]> wrote:
> > > nfs seems to have some issues.
> >
> > I'm not sure what the expected behavior is for nfs. All I have for
> > now is some generic troubleshooting ideas, sorry:
> >
> > > - I can set user.foo xattr on symlink and query it back using xattr name.
> > >
> > > getfattr -h -n user.foo foo-link.txt
> > >
> > > But when I try to dump all xattrs on this file, user.foo is being
> > > filtered out it looks like. Not sure why.
> >
> > Logging into the server and seeing what's set there could help confirm
> > whether it's the client or server that's at fault. (Or watching the
> > traffic in wireshark; there are GET/SET/LISTXATTR ops that should be
> > easy to spot.)
> >
> > > - I can't set "user.foo" xattr on a device node on nfs and I get
> > > "Permission denied". I am assuming nfs server is returning this.
> >
> > Wireshark should tell you whether it's the server or client doing that.
> >
> > The RFC is https://datatracker.ietf.org/doc/html/rfc8276, and I don't
> > see any explicit statement about what the server should do in the case
> > of symlinks or device nodes, but I do see "Any regular file or
> > directory may have a set of extended attributes", so that was clearly
> > the assumption. Also, NFS4ERR_WRONG_TYPE is listed as a possible
> > error return for the xattr ops. But on a quick skim I don't see any
> > explicit checks in the nfsd code, so I *think* it's just relying on
> > the vfs for any file type checks.
>
> Hi Bruce,
>
> Thanks for the response. I am just trying to do set a user.foo xattr on
> a device node on nfs.
>
> setfattr -n "user.foo" -v "bar" /mnt/nfs/test-dev
>
> and I get -EACCESS.
>
> I put some printk() statements and EACCESS is being returned from here.
>
> nfs4_xattr_set_nfs4_user() {
> if (!nfs_access_get_cached(inode, current_cred(), &cache, true)) {
> if (!(cache.mask & NFS_ACCESS_XAWRITE)) {
> return -EACCES;
> }
> }
> }
>
> Value of cache.mask=0xd at the time of error.
Looks like 0xd is what the server returns to access on a device node
with mode bits rw- for the caller.
Commit c11d7fd1b317 "nfsd: take xattr bits into account for permission
checks" added the ACCESS_X* bits for regular files and directories but
not others.
But you don't want to determine permission from the mode bits anyway,
you want it to depend on the owner, so I guess we should be calling
xattr_permission somewhere if we want that behavior.
The RFC assumes user xattrs are for regular files and directories,
without, as far as I can tell, actually explicitly forbidding them on
other objects. We should also raise this with the working group if we
want to increase the chances that you'll get the behavior you want on
non-Linux servers.
The "User extended attributes" section of the xattr(7) man page will
need updating.
--b.
On Mon, Jul 12, 2021 at 11:41:06AM -0400, J. Bruce Fields wrote:
> On Mon, Jul 12, 2021 at 10:02:47AM -0400, Vivek Goyal wrote:
> > On Fri, Jul 09, 2021 at 04:10:16PM -0400, Bruce Fields wrote:
> > > On Fri, Jul 9, 2021 at 1:59 PM Vivek Goyal <[email protected]> wrote:
> > > > nfs seems to have some issues.
> > >
> > > I'm not sure what the expected behavior is for nfs. All I have for
> > > now is some generic troubleshooting ideas, sorry:
> > >
> > > > - I can set user.foo xattr on symlink and query it back using xattr name.
> > > >
> > > > getfattr -h -n user.foo foo-link.txt
> > > >
> > > > But when I try to dump all xattrs on this file, user.foo is being
> > > > filtered out it looks like. Not sure why.
> > >
> > > Logging into the server and seeing what's set there could help confirm
> > > whether it's the client or server that's at fault. (Or watching the
> > > traffic in wireshark; there are GET/SET/LISTXATTR ops that should be
> > > easy to spot.)
> > >
> > > > - I can't set "user.foo" xattr on a device node on nfs and I get
> > > > "Permission denied". I am assuming nfs server is returning this.
> > >
> > > Wireshark should tell you whether it's the server or client doing that.
> > >
> > > The RFC is https://datatracker.ietf.org/doc/html/rfc8276, and I don't
> > > see any explicit statement about what the server should do in the case
> > > of symlinks or device nodes, but I do see "Any regular file or
> > > directory may have a set of extended attributes", so that was clearly
> > > the assumption. Also, NFS4ERR_WRONG_TYPE is listed as a possible
> > > error return for the xattr ops. But on a quick skim I don't see any
> > > explicit checks in the nfsd code, so I *think* it's just relying on
> > > the vfs for any file type checks.
> >
> > Hi Bruce,
> >
> > > Thanks for the response. I am just trying to set a user.foo xattr on
> > > a device node on nfs.
> > >
> > > setfattr -n "user.foo" -v "bar" /mnt/nfs/test-dev
> > >
> > > and I get -EACCES.
> > >
> > > I put some printk() statements and EACCES is being returned from here.
> >
> > nfs4_xattr_set_nfs4_user() {
> > if (!nfs_access_get_cached(inode, current_cred(), &cache, true)) {
> > if (!(cache.mask & NFS_ACCESS_XAWRITE)) {
> > return -EACCES;
> > }
> > }
> > }
> >
> > Value of cache.mask=0xd at the time of error.
>
> Looks like 0xd is what the server returns to access on a device node
> with mode bits rw- for the caller.
>
> Commit c11d7fd1b317 "nfsd: take xattr bits into account for permission
> checks" added the ACCESS_X* bits for regular files and directories but
> not others.
>
> But you don't want to determine permission from the mode bits anyway,
> you want it to depend on the owner,
Thinking more about this part. The current implementation of my patch
effectively does both checks. It checks in xattr_permission() that you
are the owner or have CAP_FOWNER, and then goes on to call
inode_permission(). That means file mode bits also play a role: if the
caller does not have write permission on the file, setxattr() will be
denied.
If I don't call inode_permission() and just return 0 right away for the
file owner (for symlinks and special files), then being the owner is
enough to write user.* xattrs. And then even security modules will
not get a chance to block that operation. IOW, if you are the owner of
a symlink or special file, you can write as many user.* xattrs as you
like, and apart from quota it does not look like anything else can
block it. I am wondering if this approach is ok?
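
To make the two options concrete, here is a small userspace model of the
decision. It is only a sketch of the logic described above, not the
kernel code or the patch: the structs, the enum and the function name are
all hypothetical, and the -EPERM/-EACCES return values are assumptions
about how each failing check would be reported.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the caller's credentials and the inode. */
struct caller {
    unsigned int uid;
    bool cap_fowner;              /* caller has CAP_FOWNER over the inode */
};

struct node {
    unsigned int owner_uid;
    bool is_symlink_or_special;
    bool mode_grants_write;       /* mode bits give this caller write access */
};

enum policy {
    OWNER_AND_MODE,               /* owner check, then the mode-bit check */
    OWNER_ONLY,                   /* owner check alone for symlink/special */
};

/* Model of a user.* xattr write check; returns 0 or a negative errno. */
static int user_xattr_write_check(const struct caller *c,
                                  const struct node *n, enum policy p)
{
    if (n->is_symlink_or_special) {
        bool privileged = c->uid == n->owner_uid || c->cap_fowner;

        if (!privileged)
            return -EPERM;
        if (p == OWNER_ONLY)
            return 0;             /* skips the mode-bit check entirely */
    }
    /* Stands in for the mode-bit part of inode_permission(). */
    return n->mode_grants_write ? 0 : -EACCES;
}
```

With OWNER_AND_MODE, an owner whose mode bits deny write is still refused;
with OWNER_ONLY the same caller succeeds, which is the difference in
question.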
> so I guess we should be calling
> xattr_permission somewhere if we want that behavior.
>
> The RFC assumes user xattrs are for regular files and directories,
> without, as far as I can tell, actually explicitly forbidding them on
> other objects. We should also raise this with the working group if we
> want to increase the chances that you'll get the behavior you want on
> non-Linux servers.
Ok. I am hoping that once this patch merges in some form, I can follow
up with the relevant working group.
>
> The "User extended attributes" section of the xattr(7) man page will
> need updating.
Agreed. I will take care of that in a separate patch.
Right now, I am not sure whether being the owner should be the only
check, i.e. whether I should skip calling inode_permission() entirely.
Thanks
Vivek
>
> --b.
>
On Fri, Jul 09, 2021 at 04:36:33PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 09, 2021 at 08:34:41AM -0700, Casey Schaufler wrote:
> > >> One question, do all filesystem supporting xattrs deal with setting them
> > >> on symlinks/device files correctly?
> > > Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
> > > symlink and device node on ext4, xfs and btrfs and it works fine.
> >
> > How about nfs, tmpfs, overlayfs and/or some of the other less conventional
> > filesystems?
>
> As a suggestion, perhaps you could take your bash script and turn it
> into an xfstests test so we can more easily test various file systems,
> both now and once the commit is accepted, to look for regressions?
Sounds good. I see there is already an xattr test (generic/062) which
is broken after my patch. The current test expects that setting user.*
xattrs will fail on symlinks and special device files.
I will probably have to query the kernel version and modify the test so
that it expects failure before a certain version and success otherwise.
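
As a starting point for probing a given filesystem, the symlink case can
be exercised with a few lines of C using lsetxattr(2). This is only a
sketch and not an xfstests test: the function name and the temporary
link name are made up for illustration, and the expected errno on
unpatched kernels (-EPERM) is an assumption based on the behavior
discussed in this thread.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/xattr.h>
#include <unistd.h>

/*
 * Hypothetical probe: create a dangling symlink under dir, try to set
 * user.foo on the link itself, then remove the link.
 * Returns 0 if the xattr was set, or a negative errno otherwise.
 */
static int probe_user_xattr_on_symlink(const char *dir)
{
    char path[4096];
    int ret = 0;
    int n;

    n = snprintf(path, sizeof(path), "%s/xattr-probe-%ld",
                 dir, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof(path))
        return -ENAMETOOLONG;

    if (symlink("probe-target", path) != 0)
        return -errno;

    /* lsetxattr() operates on the link, like "setfattr -h". */
    if (lsetxattr(path, "user.foo", "bar", 3, 0) != 0)
        ret = -errno;

    unlink(path);
    return ret;
}
```

A result of 0 means the filesystem and kernel accepted the xattr on the
symlink; a negative value tells you which check rejected it.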
Thanks
Vivek
On Mon, Jul 12, 2021 at 01:47:59PM -0400, Vivek Goyal wrote:
> On Mon, Jul 12, 2021 at 11:41:06AM -0400, J. Bruce Fields wrote:
> > Looks like 0xd is what the server returns to access on a device node
> > with mode bits rw- for the caller.
> >
> > Commit c11d7fd1b317 "nfsd: take xattr bits into account for permission
> > checks" added the ACCESS_X* bits for regular files and directories but
> > not others.
> >
> > But you don't want to determine permission from the mode bits anyway,
> > you want it to depend on the owner,
>
> Thinking more about this part. Current implementation of my patch is
> effectively doing both the checks. It checks that you are owner or
> have CAP_FOWNER in xattr_permission() and then goes on to call
> inode_permission(). And that means file mode bits will also play a
> role. If caller does not have write permission on the file, it will
> be denied setxattr().
>
> If I don't call inode_permission(), and just return 0 right away for
> file owner (for symlinks and special files), then just being owner
> is enough to write user.* xattr. And then even security modules will
> not get a chance to block that operation. IOW, if you are owner of
> a symlink or special file, you can write as many user.* xattr as you
> like and except quota does not look like anything else can block
> it. I am wondering if this approach is ok?
Yeah, I'd expect security modules to get a say, and I wouldn't expect
mode bits on device nodes to be useful for deciding whether it makes
sense for xattrs to be readable or writeable.
But, I don't really know.
Do we have any other use cases besides this case of storing security
labels in user xattrs?
--b.
On Mon, Jul 12, 2021 at 03:31:39PM -0400, J. Bruce Fields wrote:
> On Mon, Jul 12, 2021 at 01:47:59PM -0400, Vivek Goyal wrote:
> > On Mon, Jul 12, 2021 at 11:41:06AM -0400, J. Bruce Fields wrote:
> > > Looks like 0xd is what the server returns to access on a device node
> > > with mode bits rw- for the caller.
> > >
> > > Commit c11d7fd1b317 "nfsd: take xattr bits into account for permission
> > > checks" added the ACCESS_X* bits for regular files and directories but
> > > not others.
> > >
> > > But you don't want to determine permission from the mode bits anyway,
> > > you want it to depend on the owner,
> >
> > Thinking more about this part. Current implementation of my patch is
> > effectively doing both the checks. It checks that you are owner or
> > have CAP_FOWNER in xattr_permission() and then goes on to call
> > inode_permission(). And that means file mode bits will also play a
> > role. If caller does not have write permission on the file, it will
> > be denied setxattr().
> >
> > If I don't call inode_permission(), and just return 0 right away for
> > file owner (for symlinks and special files), then just being owner
> > is enough to write user.* xattr. And then even security modules will
> > not get a chance to block that operation. IOW, if you are owner of
> > a symlink or special file, you can write as many user.* xattr as you
> > like and except quota does not look like anything else can block
> > it. I am wondering if this approach is ok?
>
> Yeah, I'd expect security modules to get a say, and I wouldn't expect
> mode bits on device nodes to be useful for deciding whether it makes
> sense for xattrs to be readable or writeable.
Actually, calling inode_permission() for symlinks probably should be
fine.
It's the device node which is problematic, because we started with the
assumption that mode bits there represent access rights for reading and
writing the device (and not the filesystem).
>
> But, I don't really know.
>
> Do we have any other use cases besides this case of storing security
> labels in user xattrs?
Storing security labels was one example. In the case of virtiofs, there
is a good chance that we will end up remapping all the guest xattrs and
prefixing them with "user.virtiofsd".
fuse-overlay is another use case. They are storing the real uid/gid in
user.* xattrs for files over NFS.
I think overlayfs can be another beneficiary in some form. There is now
support for unprivileged mounting of overlayfs from inside a user
namespace, and that uses "user.overlay" xattrs on upper files for
overlayfs-specific metadata. Device nodes are not copied up, but
they might have an issue with symlinks. Miklos will know more.
Thanks
Vivek
On 7/12/2021 10:47 AM, Vivek Goyal wrote:
> On Mon, Jul 12, 2021 at 11:41:06AM -0400, J. Bruce Fields wrote:
>> On Mon, Jul 12, 2021 at 10:02:47AM -0400, Vivek Goyal wrote:
>>> On Fri, Jul 09, 2021 at 04:10:16PM -0400, Bruce Fields wrote:
>>>> On Fri, Jul 9, 2021 at 1:59 PM Vivek Goyal <[email protected]> wrote:
>>>>> nfs seems to have some issues.
>>>> I'm not sure what the expected behavior is for nfs. All I have for
>>>> now is some generic troubleshooting ideas, sorry:
>>>>
>>>>> - I can set user.foo xattr on symlink and query it back using xattr name.
>>>>>
>>>>> getfattr -h -n user.foo foo-link.txt
>>>>>
>>>>> But when I try to dump all xattrs on this file, user.foo is being
>>>>> filtered out it looks like. Not sure why.
>>>> Logging into the server and seeing what's set there could help confirm
>>>> whether it's the client or server that's at fault. (Or watching the
>>>> traffic in wireshark; there are GET/SET/LISTXATTR ops that should be
>>>> easy to spot.)
>>>>
>>>>> - I can't set "user.foo" xattr on a device node on nfs and I get
>>>>> "Permission denied". I am assuming nfs server is returning this.
>>>> Wireshark should tell you whether it's the server or client doing that.
>>>>
>>>> The RFC is https://datatracker.ietf.org/doc/html/rfc8276, and I don't
>>>> see any explicit statement about what the server should do in the case
>>>> of symlinks or device nodes, but I do see "Any regular file or
>>>> directory may have a set of extended attributes", so that was clearly
>>>> the assumption. Also, NFS4ERR_WRONG_TYPE is listed as a possible
>>>> error return for the xattr ops. But on a quick skim I don't see any
>>>> explicit checks in the nfsd code, so I *think* it's just relying on
>>>> the vfs for any file type checks.
>>> Hi Bruce,
>>>
>>> Thanks for the response. I am just trying to set a user.foo xattr on
>>> a device node on nfs.
>>>
>>> setfattr -n "user.foo" -v "bar" /mnt/nfs/test-dev
>>>
>>> and I get -EACCES.
>>>
>>> I put some printk() statements and EACCES is being returned from here.
>>>
>>> nfs4_xattr_set_nfs4_user() {
>>> if (!nfs_access_get_cached(inode, current_cred(), &cache, true)) {
>>> if (!(cache.mask & NFS_ACCESS_XAWRITE)) {
>>> return -EACCES;
>>> }
>>> }
>>> }
>>>
>>> Value of cache.mask=0xd at the time of error.
>> Looks like 0xd is what the server returns to access on a device node
>> with mode bits rw- for the caller.
>>
>> Commit c11d7fd1b317 "nfsd: take xattr bits into account for permission
>> checks" added the ACCESS_X* bits for regular files and directories but
>> not others.
>>
>> But you don't want to determine permission from the mode bits anyway,
>> you want it to depend on the owner,
> Thinking more about this part. Current implementation of my patch is
> effectively doing both the checks. It checks that you are owner or
> have CAP_FOWNER in xattr_permission() and then goes on to call
> inode_permission(). And that means file mode bits will also play a
> role. If caller does not have write permission on the file, it will
> be denied setxattr().
>
> If I don't call inode_permission(), and just return 0 right away for
> file owner (for symlinks and special files), then just being owner
> is enough to write user.* xattr. And then even security modules will
> not get a chance to block that operation.
That isn't going to fly. SELinux and Smack don't rely on ownership
as a criterion for access. Being the owner of a symlink conveys no
special privilege. The LSM must be consulted to determine if the
module's policy allows the access.
> IOW, if you are owner of
> a symlink or special file, you can write as many user.* xattr as you
> like and except quota does not look like anything else can block
> it. I am wondering if this approach is ok?
>
>
>
>> so I guess we should be calling
>> xattr_permission somewhere if we want that behavior.
>> The RFC assumes user xattrs are for regular files and directories,
>> without, as far as I can tell, actually explicitly forbidding them on
>> other objects. We should also raise this with the working group if we
>> want to increase the chances that you'll get the behavior you want on
>> non-Linux servers.
> Ok. I am hoping once this patch merges in some form, then I can
> follow it up with relevant working group.
>
>> The "User extended attributes" section of the xattr(7) man page will
>> need updating.
> Agreed. I will take care of that in a separate patch.
>
> Right now, I am not too sure if being owner should be the only check
> and I should skip calling inode_permission() entirely or not.
>
> Thanks
> Vivek
>
>> --b.
>>
On 7/12/2021 5:49 AM, Greg Kurz wrote:
> On Fri, 9 Jul 2021 08:34:41 -0700
> Casey Schaufler <[email protected]> wrote:
>
>> On 7/9/2021 8:27 AM, Vivek Goyal wrote:
>>> On Fri, Jul 09, 2021 at 11:19:15AM +0200, Christian Brauner wrote:
>>>> On Thu, Jul 08, 2021 at 01:57:38PM -0400, Vivek Goyal wrote:
>>>>> Currently user.* xattr are not allowed on symlink and special files.
>>>>>
>>>>> man xattr and recent discussion suggested that primary reason for this
>>>>> restriction is how file permissions for symlinks and special files
>>>>> are little different from regular files and directories.
>>>>>
>>>>> For symlinks, they are world readable/writable and if user xattr were
>>>>> to be permitted, it will allow unprivileged users to dump a huge amount
>>>>> of user.* xattrs on symlinks without any control.
>>>>>
>>>>> For special files, permissions typically control capability to read/write
>>>>> from devices (and not necessarily from filesystem). So if a user can
>>>>> write to device (/dev/null), does not necessarily mean it should be allowed
>>>>> to write large number of user.* xattrs on the filesystem device node is
>>>>> residing in.
>>>>>
>>>>> This patch proposes to relax the restrictions a bit and allow file owner
>>>>> or privileged user (CAP_FOWNER), to be able to read/write user.* xattrs
>>>>> on symlink and special files.
>>>>>
>>>>> virtiofs daemon has a need to store user.* xattrs on all the files
>>>>> (including symlinks and special files), and currently that fails. This
>>>>> patch should help.
>>>>>
>>>>> Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
>>>>> Signed-off-by: Vivek Goyal <[email protected]>
>>>>> ---
>>>> Seems reasonable and useful.
>>>> Acked-by: Christian Brauner <[email protected]>
>>>>
>>>> One question, do all filesystem supporting xattrs deal with setting them
>>>> on symlinks/device files correctly?
>>> Wrote a simple bash script to do setfattr/getfattr user.foo xattr on
>>> symlink and device node on ext4, xfs and btrfs and it works fine.
>> How about nfs, tmpfs, overlayfs and/or some of the other less conventional
>> filesystems?
>>
> How about virtiofs then ? :-)
One of the "less conventional filesystems", surely.
On Tue, Jul 13, 2021 at 07:17:00AM -0700, Casey Schaufler wrote:
> On 7/12/2021 10:47 AM, Vivek Goyal wrote:
> > On Mon, Jul 12, 2021 at 11:41:06AM -0400, J. Bruce Fields wrote:
> >> On Mon, Jul 12, 2021 at 10:02:47AM -0400, Vivek Goyal wrote:
> >>> On Fri, Jul 09, 2021 at 04:10:16PM -0400, Bruce Fields wrote:
> >>>> On Fri, Jul 9, 2021 at 1:59 PM Vivek Goyal <[email protected]> wrote:
> >>>>> nfs seems to have some issues.
> >>>> I'm not sure what the expected behavior is for nfs. All I have for
> >>>> now is some generic troubleshooting ideas, sorry:
> >>>>
> >>>>> - I can set user.foo xattr on symlink and query it back using xattr name.
> >>>>>
> >>>>> getfattr -h -n user.foo foo-link.txt
> >>>>>
> >>>>> But when I try to dump all xattrs on this file, user.foo is being
> >>>>> filtered out it looks like. Not sure why.
> >>>> Logging into the server and seeing what's set there could help confirm
> >>>> whether it's the client or server that's at fault. (Or watching the
> >>>> traffic in wireshark; there are GET/SET/LISTXATTR ops that should be
> >>>> easy to spot.)
> >>>>
> >>>>> - I can't set "user.foo" xattr on a device node on nfs and I get
> >>>>> "Permission denied". I am assuming nfs server is returning this.
> >>>> Wireshark should tell you whether it's the server or client doing that.
> >>>>
> >>>> The RFC is https://datatracker.ietf.org/doc/html/rfc8276, and I don't
> >>>> see any explicit statement about what the server should do in the case
> >>>> of symlinks or device nodes, but I do see "Any regular file or
> >>>> directory may have a set of extended attributes", so that was clearly
> >>>> the assumption. Also, NFS4ERR_WRONG_TYPE is listed as a possible
> >>>> error return for the xattr ops. But on a quick skim I don't see any
> >>>> explicit checks in the nfsd code, so I *think* it's just relying on
> >>>> the vfs for any file type checks.
> >>> Hi Bruce,
> >>>
> >>> Thanks for the response. I am just trying to set a user.foo xattr on
> >>> a device node on nfs.
> >>>
> >>> setfattr -n "user.foo" -v "bar" /mnt/nfs/test-dev
> >>>
> >>> and I get -EACCES.
> >>>
> >>> I put some printk() statements and EACCES is being returned from here.
> >>>
> >>> nfs4_xattr_set_nfs4_user() {
> >>> if (!nfs_access_get_cached(inode, current_cred(), &cache, true)) {
> >>> if (!(cache.mask & NFS_ACCESS_XAWRITE)) {
> >>> return -EACCES;
> >>> }
> >>> }
> >>> }
> >>>
> >>> Value of cache.mask=0xd at the time of error.
> >> Looks like 0xd is what the server returns to access on a device node
> >> with mode bits rw- for the caller.
> >>
> >> Commit c11d7fd1b317 "nfsd: take xattr bits into account for permission
> >> checks" added the ACCESS_X* bits for regular files and directories but
> >> not others.
> >>
> >> But you don't want to determine permission from the mode bits anyway,
> >> you want it to depend on the owner,
> > Thinking more about this part. Current implementation of my patch is
> > effectively doing both the checks. It checks that you are owner or
> > have CAP_FOWNER in xattr_permission() and then goes on to call
> > inode_permission(). And that means file mode bits will also play a
> > role. If caller does not have write permission on the file, it will
> > be denied setxattr().
> >
> > If I don't call inode_permission(), and just return 0 right away for
> > file owner (for symlinks and special files), then just being owner
> > is enough to write user.* xattr. And then even security modules will
> > not get a chance to block that operation.
>
> That isn't going to fly. SELinux and Smack don't rely on ownership
> as a criteria for access. Being the owner of a symlink conveys no
> special privilege. The LSM must be consulted to determine if the
> module's policy allows the access.
Getting back to this thread after a while. Sorry, I got busy with other
things.
I noticed that if we skip calling inode_permission() for special files,
then we will skip calling security_inode_permission(), but we will
still call the security hooks for setxattr/getxattr/removexattr etc.:
security_inode_setxattr()
security_inode_getxattr()
security_inode_removexattr()
So LSMs will still get a chance to allow or disallow this operation.
And skipping security_inode_permission() kind of makes sense for
special files, since I am not writing to the device. So asking LSMs for
that permission does not make much sense.
Thanks
Vivek