2021-09-02 23:26:38

by Vivek Goyal

Subject: [PATCH v3 0/1] Relax restrictions on user.* xattr

Hi,

This is V3 of the patch. Previous versions were posted here.

v2:
https://lore.kernel.org/linux-fsdevel/[email protected]/
v1:
https://lore.kernel.org/linux-fsdevel/[email protected]/

Changes since v2
----------------
- Do not call inode_permission() for special files as file mode bits
on these files represent permissions to read/write from/to device
and not necessarily permission to read/write xattrs. In this case
now user.* extended xattrs can be read/written on special files
as long as caller is owner of file or has CAP_FOWNER.

- Fixed "man xattr". Will post a patch in the same thread a little
later. (J. Bruce Fields)

- Fixed xfstest 062. Changed it to run only on older kernels where
user extended xattrs are not allowed on symlinks/special files. Added
a new replacement test 648 which does exactly what 062 does, except
that it is meant to run on newer kernels where user extended xattrs
are allowed on symlinks and special files. Will post the patch in
the same thread. (Ted Ts'o)

Testing
-------
- Ran xfstest "./check -g auto" with and without patches and did not
notice any new failures.

- Tested setting "user.*" xattr with ext4/xfs/btrfs/overlay/nfs
filesystems and it works.

Description
===========

Right now we don't allow setting user.* xattrs on symlinks and special
files at all. Initially I thought that the real reason behind this
restriction was quota limitations, but from the last conversation it
seemed that the real reason is that permission bits on symlinks and
special files are special and differ from those on regular files and
directories, hence this restriction. (I tested with xfs user quota
enabled, and quota restrictions did kick in on a symlink.)

This version of the patch allows reading/writing user.* xattrs on symlinks
and special files if the caller is the owner or is privileged (has
CAP_FOWNER) w.r.t. the inode.
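
The rule described above can be modelled as a small decision function.
This is only a toy sketch of the policy as stated in this cover letter, not
the kernel code from fs/xattr.c; the function name and its yes/no arguments
are hypothetical, and sticky-bit directories are deliberately left out.

```shell
#!/bin/bash
# Toy model of the proposed rule; NOT the kernel implementation.
# All names here are hypothetical.
#   $1 file type: regular | directory | symlink | special
#   $2 "yes" if the caller owns the inode
#   $3 "yes" if the caller has CAP_FOWNER
#   $4 "yes" if the file mode bits grant the requested access
user_xattr_access_allowed() {
	local ftype=$1 is_owner=$2 has_cap_fowner=$3 mode_ok=$4

	case "$ftype" in
	symlink|special)
		# Mode bits are not consulted: on device nodes they describe
		# access to the device, not to its xattrs.
		[ "$is_owner" = yes ] || [ "$has_cap_fowner" = yes ]
		;;
	regular|directory)
		# Existing behaviour: follow the file mode bits.
		[ "$mode_ok" = yes ]
		;;
	*)
		return 1
		;;
	esac
}
```

For example, `user_xattr_access_allowed special no no yes` fails in this
model: permissive mode bits on a device node do not help a caller who is
neither the owner nor holds CAP_FOWNER.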

Who wants to set user.* xattr on symlink/special files
-----------------------------------------------------
I have primarily two users at this point in time.

- virtiofs daemon.

- fuse-overlay. Giuseppe seems to set user.* xattrs on unprivileged
fuse-overlay as well, and he ran into a similar issue. So fuse-overlay
should benefit from this change as well.

Why virtiofsd wants to set user.* xattr on symlink/special files
----------------------------------------------------------------
In virtiofs, the actual file server is the virtiofsd daemon running on
the host. There we have a mode where xattrs can be remapped to something
else. For example, security.selinux can be remapped to
user.virtiofsd.security.selinux on the host.
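
As a rough illustration of the name remapping described above, the sketch
below prepends a prefix on the way to the host and strips it on the way
back. It is not virtiofsd's implementation; the function names are
hypothetical, and the prefix is simply the one used in the example.

```shell
#!/bin/bash
# Prefix taken from the example in the text; a real setup would use
# whatever prefix the remapping rules are configured with.
HOST_PREFIX="user.virtiofsd."

# Map a guest-visible xattr name to the name stored on the host.
guest_to_host_xattr() {
	printf '%s%s\n' "$HOST_PREFIX" "$1"
}

# Reverse mapping; fails for host names outside the remapped prefix.
host_to_guest_xattr() {
	case "$1" in
	"$HOST_PREFIX"*) printf '%s\n' "${1#"$HOST_PREFIX"}" ;;
	*) return 1 ;;
	esac
}

guest_to_host_xattr security.selinux
```

Because the stored name lives in the user.* namespace, the host kernel
treats it as opaque data and never interprets it as a security label.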

This remapping is useful when SELinux is enabled in the guest and virtiofs
is being used as rootfs. Guest and host SELinux policy might not match,
and host policy might deny security.selinux xattr setting by the guest
onto the host. Or the host might have SELinux disabled, and in that case
virtiofsd would need CAP_SYS_ADMIN to be able to set the security.selinux
xattr (which we are trying to avoid). Being able to remap
guest security.selinux (or other xattrs) on the host to something else
is also better from a security point of view.

But when we tried this, we noticed that SELinux relabeling in the guest
was failing on some symlinks. When I debugged a little more, I
found that "user.*" xattrs are not allowed on symlinks
or special files.

So if we allow the owner (or CAP_FOWNER) to set user.* xattrs, it will
allow virtiofs to arbitrarily remap the guest's xattrs to something
else on the host. That solves this SELinux issue nicely and allows
two SELinux policies (host and guest) to co-exist without
interfering with each other.

Thanks
Vivek

Vivek Goyal (1):
xattr: Allow user.* xattr on symlink and special files

fs/xattr.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)

--
2.31.1


2021-09-03 00:30:12

by Vivek Goyal

Subject: [PATCH 2/1] man-pages: xattr.7: Update text for user extended xattr behavior change

I have proposed a patch to relax restrictions on user extended xattrs and
allow file owner (or CAP_FOWNER) to get/set user extended xattrs on symlink
and device files.

Signed-off-by: Vivek Goyal <[email protected]>
---
man7/xattr.7 | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

Index: man-pages/man7/xattr.7
===================================================================
--- man-pages.orig/man7/xattr.7 2021-09-01 13:46:16.165016463 -0400
+++ man-pages/man7/xattr.7 2021-09-01 16:31:51.038016463 -0400
@@ -129,8 +129,13 @@ a way not controllable by disk quotas fo
special files and directories.
.PP
For this reason,
-user extended attributes are allowed only for regular files and directories,
-and access to user extended attributes is restricted to the
+user extended attributes are allowed only for regular files and directories
+till kernel 5.14. In newer kernels (5.15 onwards), restrictions have been
+relaxed a bit and user extended attributes are also allowed on symlinks
+and special files as long as caller is either owner of the file or is
+privileged (CAP_FOWNER).
+
+Access to user extended attributes is restricted to the
owner and to users with appropriate capabilities for directories with the
sticky bit set (see the
.BR chmod (1)

2021-09-03 00:31:21

by Vivek Goyal

Subject: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels


xfstests: generic/062: Do not run on newer kernels

This test has been written with the assumption that setting user.* xattrs
will fail on symlinks and special files. When newer kernels support setting
user.* xattrs on symlinks and special files, this test starts failing.

I found it hard to change the test so that it works on both types of
kernels. The primary problem is the 062.out file, which hardcodes the
output, and the output will differ between old and new kernels.

So instead, do not run this test if kernel is new and is expected to
exhibit new behavior. Next patch will create a new test and run that
test on new kernel.

IOW, on old kernels run 062 and on new kernels run new test.

This is a proposed patch. Will need to be fixed if corresponding
kernel changes are merged upstream.

Signed-off-by: Vivek Goyal <[email protected]>
---
tests/generic/062 | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

Index: xfstests-dev/tests/generic/062
===================================================================
--- xfstests-dev.orig/tests/generic/062 2021-08-31 15:51:08.160307982 -0400
+++ xfstests-dev/tests/generic/062 2021-08-31 16:27:41.678307982 -0400
@@ -55,6 +55,26 @@ _require_attrs
_require_symlinks
_require_mknod

+user_xattr_allowed()
+{
+ local kernel_version kernel_patchlevel
+
+ kernel_version=`uname -r | awk -F. '{print $1}'`
+ kernel_patchlevel=`uname -r | awk -F. '{print $2}'`
+
+ # Kernel version 5.14 onwards allow user xattr on symlink/special files.
+ [ $kernel_version -lt 5 ] && return 1
+ [ $kernel_version -eq 5 ] && [ $kernel_patchlevel -lt 14 ] && return 1
+ return 0
+}
+
+
+# Kernel version 5.14 onwards allow user xattr on symlink/special files.
+# Do not run this test on newer kernels. Instead run the new test
+# which has been written with the assumption that user.* xattr
+# will succeed on symlink and special files.
+user_xattr_allowed && _notrun "Kernel allows user.* xattrs on symlinks and special files. Skipping this test. Run newer test instead."
+
rm -f $tmp.backup1 $tmp.backup2 $seqres.full

# real QA test starts here

2021-09-03 00:42:09

by Andreas Gruenbacher

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

Hi,

On Thu, Sep 2, 2021 at 5:22 PM Vivek Goyal <[email protected]> wrote:
> This is V3 of the patch. Previous versions were posted here.
>
> v2: https://lore.kernel.org/linux-fsdevel/[email protected]/
> v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> Changes since v2
> ----------------
> - Do not call inode_permission() for special files as file mode bits
> on these files represent permissions to read/write from/to device
> and not necessarily permission to read/write xattrs. In this case
> now user.* extended xattrs can be read/written on special files
> as long as caller is owner of file or has CAP_FOWNER.
>
> - Fixed "man xattr". Will post a patch in same thread little later. (J.
> Bruce Fields)
>
> - Fixed xfstest 062. Changed it to run only on older kernels where
> user extended xattrs are not allowed on symlinks/special files. Added
> a new replacement test 648 which does exactly what 062. Just that
> it is supposed to run on newer kernels where user extended xattrs
> are allowed on symlinks and special files. Will post patch in
> same thread (Ted Ts'o).
>
> Testing
> -------
> - Ran xfstest "./check -g auto" with and without patches and did not
> notice any new failures.
>
> - Tested setting "user.*" xattr with ext4/xfs/btrfs/overlay/nfs
> filesystems and it works.
>
> Description
> ===========
>
> Right now we don't allow setting user.* xattrs on symlinks and special
> files at all. Initially I thought that real reason behind this
> restriction is quota limitations but from last conversation it seemed
> that real reason is that permission bits on symlink and special files
> are special and different from regular files and directories, hence
> this restriction is in place. (I tested with xfs user quota enabled and
> quota restrictions kicked in on symlink).
>
> This version of patch allows reading/writing user.* xattr on symlink and
> special files if caller is owner or privileged (has CAP_FOWNER) w.r.t inode.

the idea behind user.* xattrs is that they behave similar to file
contents as far as permissions go. It follows from that that symlinks
and special files cannot have user.* xattrs. This has been the model
for many years now and applications may be expecting these semantics,
so we cannot simply change the behavior. So NACK from me.

> Who wants to set user.* xattr on symlink/special files
> -----------------------------------------------------
> I have primarily two users at this point of time.
>
> - virtiofs daemon.
>
> - fuse-overlay. Giuseppe seems to set user.* xattrs on unprivileged
> fuse-overlay as well and he ran into similar issue. So fuse-overlay
> should benefit from this change as well.
>
> Why virtiofsd wants to set user.* xattr on symlink/special files
> ----------------------------------------------------------------
> In virtiofs, actual file server is virtiofsd daemon running on host.
> There we have a mode where xattrs can be remapped to something else.
> For example security.selinux can be remapped to
> user.virtiofsd.security.selinux on the host.
>
> This remapping is useful when SELinux is enabled in guest and virtiofs
> is being used as rootfs. Guest and host SELinux policy might not match
> and host policy might deny security.selinux xattr setting by guest
> onto host. Or host might have SELinux disabled and in that case to
> be able to set security.selinux xattr, virtiofsd will need to have
> CAP_SYS_ADMIN (which we are trying to avoid). Being able to remap
> guest security.selinux (or other xattrs) on host to something else
> is also better from security point of view.
>
> But when we try this, we noticed that SELinux relabeling in guest
> is failing on some symlinks. When I debugged a little more, I
> came to know that "user.*" xattrs are not allowed on symlinks
> or special files.
>
> So if we allow owner (or CAP_FOWNER) to set user.* xattr, it will
> allow virtiofs to arbitrarily remap guest's xattrs to something
> else on host and that solves this SELinux issue nicely and provides
> two SELinux policies (host and guest) to co-exist nicely without
> interfering with each other.

The fact that user.* xattrs don't work in this remapping scenario
should have told you that you're doing things wrong; the user.*
namespace seriously was never meant to be abused in this way.

You may be able to get away with using trusted.* xattrs which support
roughly the kind of daemon use I think you're talking about here, but
I'm not sure selinux will be happy with labels that aren't fully under
its own control. I really wonder why this wasn't obvious enough.

Thanks,
Andreas

> Thanks
> Vivek
>
> Vivek Goyal (1):
> xattr: Allow user.* xattr on symlink and special files
>
> fs/xattr.c | 23 ++++++++++++++++++-----
> 1 file changed, 18 insertions(+), 5 deletions(-)
>
> --
> 2.31.1
>

2021-09-03 00:43:05

by Vivek Goyal

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

On Thu, Sep 02, 2021 at 07:52:41PM +0200, Andreas Gruenbacher wrote:
> Hi,
>
> On Thu, Sep 2, 2021 at 5:22 PM Vivek Goyal <[email protected]> wrote:
> > This is V3 of the patch. Previous versions were posted here.
> >
> > v2: https://lore.kernel.org/linux-fsdevel/[email protected]/
> > v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
> >
> > Changes since v2
> > ----------------
> > - Do not call inode_permission() for special files as file mode bits
> > on these files represent permissions to read/write from/to device
> > and not necessarily permission to read/write xattrs. In this case
> > now user.* extended xattrs can be read/written on special files
> > as long as caller is owner of file or has CAP_FOWNER.
> >
> > - Fixed "man xattr". Will post a patch in same thread little later. (J.
> > Bruce Fields)
> >
> > - Fixed xfstest 062. Changed it to run only on older kernels where
> > user extended xattrs are not allowed on symlinks/special files. Added
> > a new replacement test 648 which does exactly what 062. Just that
> > it is supposed to run on newer kernels where user extended xattrs
> > are allowed on symlinks and special files. Will post patch in
> > same thread (Ted Ts'o).
> >
> > Testing
> > -------
> > - Ran xfstest "./check -g auto" with and without patches and did not
> > notice any new failures.
> >
> > - Tested setting "user.*" xattr with ext4/xfs/btrfs/overlay/nfs
> > filesystems and it works.
> >
> > Description
> > ===========
> >
> > Right now we don't allow setting user.* xattrs on symlinks and special
> > files at all. Initially I thought that real reason behind this
> > restriction is quota limitations but from last conversation it seemed
> > that real reason is that permission bits on symlink and special files
> > are special and different from regular files and directories, hence
> > this restriction is in place. (I tested with xfs user quota enabled and
> > quota restrictions kicked in on symlink).
> >
> > This version of patch allows reading/writing user.* xattr on symlink and
> > special files if caller is owner or privileged (has CAP_FOWNER) w.r.t inode.
>
> the idea behind user.* xattrs is that they behave similar to file
> contents as far as permissions go. It follows from that that symlinks
> and special files cannot have user.* xattrs. This has been the model
> for many years now and applications may be expecting these semantics,
> so we cannot simply change the behavior. So NACK from me.

Directories with the sticky bit set break this general rule and don't
follow the permission bit model.

man xattr says:

*****************************************************************
and access to user extended attributes is restricted to the owner and
to users with appropriate capabilities for directories with the
sticky bit set
******************************************************************

So why not allow similar exceptions for symlinks and device files?

I can understand the concern about behavior changing suddenly and
applications being surprised. If that's the only concern, we could
think of making users opt in to this new behavior based on a kernel
CONFIG option, kernel command line parameter or something else.


>
> > Who wants to set user.* xattr on symlink/special files
> > -----------------------------------------------------
> > I have primarily two users at this point of time.
> >
> > - virtiofs daemon.
> >
> > - fuse-overlay. Giuseppe seems to set user.* xattrs on unprivileged
> > fuse-overlay as well and he ran into similar issue. So fuse-overlay
> > should benefit from this change as well.
> >
> > Why virtiofsd wants to set user.* xattr on symlink/special files
> > ----------------------------------------------------------------
> > In virtiofs, actual file server is virtiofsd daemon running on host.
> > There we have a mode where xattrs can be remapped to something else.
> > For example security.selinux can be remapped to
> > user.virtiofsd.security.selinux on the host.
> >
> > This remapping is useful when SELinux is enabled in guest and virtiofs
> > is being used as rootfs. Guest and host SELinux policy might not match
> > and host policy might deny security.selinux xattr setting by guest
> > onto host. Or host might have SELinux disabled and in that case to
> > be able to set security.selinux xattr, virtiofsd will need to have
> > CAP_SYS_ADMIN (which we are trying to avoid). Being able to remap
> > guest security.selinux (or other xattrs) on host to something else
> > is also better from security point of view.
> >
> > But when we try this, we noticed that SELinux relabeling in guest
> > is failing on some symlinks. When I debugged a little more, I
> > came to know that "user.*" xattrs are not allowed on symlinks
> > or special files.
> >
> > So if we allow owner (or CAP_FOWNER) to set user.* xattr, it will
> > allow virtiofs to arbitrarily remap guest's xattrs to something
> > else on host and that solves this SELinux issue nicely and provides
> > two SELinux policies (host and guest) to co-exist nicely without
> > interfering with each other.
>
> The fact that user.* xattrs don't work in this remapping scenario
> should have told you that you're doing things wrong; the user.*
> namespace seriously was never meant to be abused in this way.

The guest's security label is not parsed by the host kernel. The host
kernel will have its own security label and will take decisions based on
that. In that respect, making use of "user.*" xattrs seemed to make
a lot of sense, and we were wondering why user.* xattrs are limited to
regular files and directories only, and whether we can change that behavior.

>
> You may be able to get away with using trusted.* xattrs which support
> roughly the kind of daemon use I think you're talking about here, but
> I'm not sure selinux will be happy with labels that aren't fully under
> its own control. I really wonder why this wasn't obvious enough.

I guess trusted.* will do the same thing. But it requires CAP_SYS_ADMIN
in init_user_ns, and that rules out running virtiofsd unprivileged
or inside a user namespace. Avoiding CAP_SYS_ADMIN also reduces the risk
posed by virtiofsd to the host filesystem. That's why we
were trying to steer clear of the trusted.* xattr space.

Also, trusted.* xattr space does not work with NFS.

$ setfattr -n "trusted.virtiofs" -v "foo" test.txt
setfattr: test.txt: Operation not supported

We want to be able to run virtiofsd over an NFS-mounted dir too.

So it's not that we did not consider trusted.* xattrs. We ran
into the above issues.
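
For reference, the write requirements of the namespaces discussed in this
thread can be summarised in a small lookup. The wording paraphrases
xattr(7) as it stands without the proposed change; the function name is
hypothetical and the sketch is only a reading aid, not an authoritative
statement of kernel policy.

```shell
#!/bin/bash
# Summarise who may write an xattr, keyed on its namespace prefix.
# Paraphrased from xattr(7); the function name is hypothetical.
xattr_write_requirement() {
	case "$1" in
	user.*)     echo "file permission bits (regular files and directories only)" ;;
	trusted.*)  echo "CAP_SYS_ADMIN" ;;
	security.*) echo "decided by the security module; CAP_SYS_ADMIN if none is loaded" ;;
	system.*)   echo "defined by the kernel feature that uses the attribute" ;;
	*)          return 1 ;;
	esac
}

xattr_write_requirement trusted.virtiofs
```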

Thanks
Vivek

2021-09-03 04:57:09

by Dave Chinner

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Thu, Sep 02, 2021 at 11:47:31AM -0400, Vivek Goyal wrote:
>
> xfstests: generic/062: Do not run on newer kernels
>
> This test has been written with assumption that setting user.* xattrs will
> fail on symlink and special files. When newer kernels support setting
> user.* xattrs on symlink and special files, this test starts failing.
>
> Found it hard to change test in such a way that it works on both type of
> kernels. Primary problem is 062.out file which hardcodes the output and
> output will be different on old and new kernels.
>
> So instead, do not run this test if kernel is new and is expected to
> exhibit new behavior. Next patch will create a new test and run that
> test on new kernel.
>
> IOW, on old kernels run 062 and on new kernels run new test.
>
> This is a proposed patch. Will need to be fixed if corresponding
> kernel changes are merged upstream.
>
> Signed-off-by: Vivek Goyal <[email protected]>
> ---
> tests/generic/062 | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> Index: xfstests-dev/tests/generic/062
> ===================================================================
> --- xfstests-dev.orig/tests/generic/062 2021-08-31 15:51:08.160307982 -0400
> +++ xfstests-dev/tests/generic/062 2021-08-31 16:27:41.678307982 -0400
> @@ -55,6 +55,26 @@ _require_attrs
> _require_symlinks
> _require_mknod
>
> +user_xattr_allowed()
> +{
> + local kernel_version kernel_patchlevel
> +
> + kernel_version=`uname -r | awk -F. '{print $1}'`
> + kernel_patchlevel=`uname -r | awk -F. '{print $2}'`
> +
> + # Kernel version 5.14 onwards allow user xattr on symlink/special files.
> + [ $kernel_version -lt 5 ] && return 1
> + [ $kernel_patchlevel -lt 14 ] && return 1
> + return 0;
> +}

We don't do this because code changes get backported to random
kernels and so the kernel release is not a reliable indicator of
feature support.

Probing the functionality is the only way to reliably detect what a
kernel supports. That's what we do in all the _require_*()
functions, which is what this should all be wrapped in.
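
A probe along those lines might look roughly like the sketch below. It is
a standalone illustration, not an existing xfstests helper: the function
name and return-code convention are assumptions, and a real helper would
live next to the other _require_* functions and call _notrun itself.

```shell
#!/bin/bash
# Hypothetical probe, not an existing xfstests helper.
# Returns 0 if a user.* xattr can be set on a symlink in directory $1,
# 1 if the kernel or filesystem refuses, 2 if the probe cannot run.
probe_user_xattr_on_symlink() {
	local dir=$1 link rc

	command -v setfattr >/dev/null 2>&1 || return 2
	link=$(mktemp -u "$dir/xattr-probe.XXXXXX") || return 2
	ln -s /nonexistent "$link" || return 2

	# -h operates on the symlink itself instead of following it.
	if setfattr -h -n user.probe -v 1 "$link" 2>/dev/null; then
		rc=0
	else
		rc=1
	fi
	rm -f "$link"
	return $rc
}
```

Note that a return value of 1 does not distinguish "user xattrs refused on
symlinks" from "no user xattr support at all", so a test would still want
_require_attrs to pass before trusting the result.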

> +# Kernel version 5.14 onwards allow user xattr on symlink/special files.
> +# Do not run this test on newer kernels. Instead run the new test
> +# which has been written with the assumption that user.* xattr
> +# will succeed on symlink and special files.
> +user_xattr_allowed && _notrun "Kernel allows user.* xattrs on symlinks and special files. Skipping this test. Run newer test instead."

"run a newer test instead" is not a useful error message. Nor do you
need "skipping this test" - that's exactly what "notrun" means.

Cheers,

Dave.
--
Dave Chinner
[email protected]

2021-09-03 06:12:31

by Zorro Lang

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Thu, Sep 02, 2021 at 11:47:31AM -0400, Vivek Goyal wrote:
>
> xfstests: generic/062: Do not run on newer kernels
>
> This test has been written with assumption that setting user.* xattrs will
> fail on symlink and special files. When newer kernels support setting
> user.* xattrs on symlink and special files, this test starts failing.
>
> Found it hard to change test in such a way that it works on both type of
> kernels. Primary problem is 062.out file which hardcodes the output and
> output will be different on old and new kernels.
>
> So instead, do not run this test if kernel is new and is expected to
> exhibit new behavior. Next patch will create a new test and run that
> test on new kernel.
>
> IOW, on old kernels run 062 and on new kernels run new test.
>
> This is a proposed patch. Will need to be fixed if corresponding
> kernel changes are merged upstream.
>
> Signed-off-by: Vivek Goyal <[email protected]>
> ---
> tests/generic/062 | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> Index: xfstests-dev/tests/generic/062
> ===================================================================
> --- xfstests-dev.orig/tests/generic/062 2021-08-31 15:51:08.160307982 -0400
> +++ xfstests-dev/tests/generic/062 2021-08-31 16:27:41.678307982 -0400
> @@ -55,6 +55,26 @@ _require_attrs
> _require_symlinks
> _require_mknod
>
> +user_xattr_allowed()
> +{
> + local kernel_version kernel_patchlevel
> +
> + kernel_version=`uname -r | awk -F. '{print $1}'`
> + kernel_patchlevel=`uname -r | awk -F. '{print $2}'`
> +
> + # Kernel version 5.14 onwards allow user xattr on symlink/special files.
> + [ $kernel_version -lt 5 ] && return 1
> + [ $kernel_patchlevel -lt 14 ] && return 1
> + return 0;
> +}

I don't think this is a good way to decide whether to run a test. Many
downstream kernels backport upstream features. I can't say what the best
way to deal with this is; I can only offer two options:

1) Add new _require_* helpers to check whether the current kernel supports
setting xattrs on symlinks and special files, then let this case run only
in the supported (or unsupported) condition.

2) Use _link_out_file() to link the .out file to different golden images (refer to
generic/050 etc), according to the feature implementation.
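
The idea behind option 2 can be sketched as follows. This is a standalone
stand-in and does not reproduce the actual signature or behaviour of
_link_out_file(); the function name and the ".out.<variant>" file naming
are assumptions made for illustration only.

```shell
#!/bin/bash
# Hypothetical stand-in for the idea behind _link_out_file(): point
# <test>.out at one of several golden outputs. File names are made up.
#   $1 path of the test without suffix, e.g. tests/generic/062
#   $2 variant to use, e.g. "xattr_symlink" or "default"
link_golden_output() {
	local seqfull=$1 variant=$2

	[ -f "$seqfull.out.$variant" ] || return 1
	rm -f "$seqfull.out"
	# Relative link, so the tests directory can be moved as a whole.
	ln -s "$(basename "$seqfull").out.$variant" "$seqfull.out"
}
```

A test would pick the variant from a feature probe rather than from the
kernel version, which keeps a single test usable on both kinds of kernels.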

If anyone has a better method, feel free to talk :)

Thanks,
Zorro

> +
> +
> +# Kernel version 5.14 onwards allow user xattr on symlink/special files.
> +# Do not run this test on newer kernels. Instead run the new test
> +# which has been written with the assumption that user.* xattr
> +# will succeed on symlink and special files.
> +user_xattr_allowed && _notrun "Kernel allows user.* xattrs on symlinks and special files. Skipping this test. Run newer test instead."
> +
> rm -f $tmp.backup1 $tmp.backup2 $seqres.full
>
> # real QA test starts here
>

2021-09-03 06:41:20

by Andreas Gruenbacher

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Thu, Sep 2, 2021 at 5:47 PM Vivek Goyal <[email protected]> wrote:
> xfstests: generic/062: Do not run on newer kernels
>
> This test has been written with assumption that setting user.* xattrs will
> fail on symlink and special files. When newer kernels support setting
> user.* xattrs on symlink and special files, this test starts failing.

It's actually a good thing that this test case triggers for the kernel
change you're proposing; that change should never be merged. The
user.* namespace is meant for data with the same access permissions as
the file data, and it has been for many years. We may have
applications that assume the existing behavior. In addition, this
change would create backwards compatibility problems for things like
backups.

I'm not convinced that what you're actually proposing (mapping
security.selinux to a different attribute name) actually makes sense,
but that's a question for the selinux folks to decide. Mapping it to a
user.* attribute is definitely wrong though. The modified behavior
would affect anybody, not only users of selinux and/or virtiofs. If
mapping attribute names is actually the right approach, then you need
to look at trusted.* xattrs, which exist specifically for this kind of
purpose. You've noted that trusted.* xattrs aren't supported over nfs.
That's unfortunate, but not an acceptable excuse for messing up user.*
xattrs.

Thanks,
Andreas

2021-09-03 06:59:06

by Andreas Gruenbacher

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Fri, Sep 3, 2021 at 8:31 AM Andreas Gruenbacher <[email protected]> wrote:
> On Thu, Sep 2, 2021 at 5:47 PM Vivek Goyal <[email protected]> wrote:
> > xfstests: generic/062: Do not run on newer kernels
> >
> > This test has been written with assumption that setting user.* xattrs will
> > fail on symlink and special files. When newer kernels support setting
> > user.* xattrs on symlink and special files, this test starts failing.
>
> It's actually a good thing that this test case triggers for the kernel
> change you're proposing; that change should never be merged. The
> user.* namespace is meant for data with the same access permissions as
> the file data, and it has been for many years. We may have
> applications that assume the existing behavior. In addition, this
> change would create backwards compatibility problems for things like
> backups.
>
> I'm not convinced that what you're actually proposing (mapping
> security.selinux to a different attribute name) actually makes sense,
> but that's a question for the selinux folks to decide. Mapping it to a
> user.* attribute is definitely wrong though. The modified behavior
> would affect anybody, not only users of selinux and/or virtiofs. If
> mapping attribute names is actually the right approach, then you need
> to look at trusted.* xattrs, which exist specifically for this kind of
> purpose. You've noted that trusted.* xattrs aren't supported over nfs.
> That's unfortunate, but not an acceptable excuse for messing up user.*
> xattrs.

Another possibility would be to make selinux use a different
security.* attribute for this nested selinux case. That way, the
"host" selinux would retain some control over the labels the "guest"
uses.

Thanks,
Andreas

2021-09-03 14:47:13

by J. Bruce Fields

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

Well, we could also look at supporting trusted.* xattrs over NFS. I
don't know much about them, but it looks like it wouldn't be a lot of
work to specify, especially now that we've already got user xattrs?
We'd just write a new internet draft that refers to the existing
user.* xattr draft for most of the details.

--b.

On Fri, Sep 3, 2021 at 2:56 AM Andreas Gruenbacher <[email protected]> wrote:
>
> On Fri, Sep 3, 2021 at 8:31 AM Andreas Gruenbacher <[email protected]> wrote:
> > On Thu, Sep 2, 2021 at 5:47 PM Vivek Goyal <[email protected]> wrote:
> > > xfstests: generic/062: Do not run on newer kernels
> > >
> > > This test has been written with assumption that setting user.* xattrs will
> > > fail on symlink and special files. When newer kernels support setting
> > > user.* xattrs on symlink and special files, this test starts failing.
> >
> > It's actually a good thing that this test case triggers for the kernel
> > change you're proposing; that change should never be merged. The
> > user.* namespace is meant for data with the same access permissions as
> > the file data, and it has been for many years. We may have
> > applications that assume the existing behavior. In addition, this
> > change would create backwards compatibility problems for things like
> > backups.
> >
> > I'm not convinced that what you're actually proposing (mapping
> > security.selinux to a different attribute name) actually makes sense,
> > but that's a question for the selinux folks to decide. Mapping it to a
> > user.* attribute is definitely wrong though. The modified behavior
> > would affect anybody, not only users of selinux and/or virtiofs. If
> > mapping attribute names is actually the right approach, then you need
> > to look at trusted.* xattrs, which exist specifically for this kind of
> > purpose. You've noted that trusted.* xattrs aren't supported over nfs.
> > That's unfortunate, but not an acceptable excuse for messing up user.*
> > xattrs.
>
> Another possibility would be to make selinux use a different
> security.* attribute for this nested selinux case. That way, the
> "host" selinux would retain some control over the labels the "guest"
> uses.
>
> Thanks,
> Andreas
>

2021-09-03 15:46:00

by Vivek Goyal

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Fri, Sep 03, 2021 at 10:42:34AM -0400, Bruce Fields wrote:
> Well, we could also look at supporting trusted.* xattrs over NFS. I
> don't know much about them, but it looks like it wouldn't be a lot of
> work to specify, especially now that we've already got user xattrs?
> We'd just write a new internet draft that refers to the existing
> user.* xattr draft for most of the details.

It would be nice if we could support trusted.* xattrs on NFS.

Vivek

>
> --b.
>
> On Fri, Sep 3, 2021 at 2:56 AM Andreas Gruenbacher <[email protected]> wrote:
> >
> > On Fri, Sep 3, 2021 at 8:31 AM Andreas Gruenbacher <[email protected]> wrote:
> > > On Thu, Sep 2, 2021 at 5:47 PM Vivek Goyal <[email protected]> wrote:
> > > > xfstests: generic/062: Do not run on newer kernels
> > > >
> > > > This test has been written with assumption that setting user.* xattrs will
> > > > fail on symlink and special files. When newer kernels support setting
> > > > user.* xattrs on symlink and special files, this test starts failing.
> > >
> > > It's actually a good thing that this test case triggers for the kernel
> > > change you're proposing; that change should never be merged. The
> > > user.* namespace is meant for data with the same access permissions as
> > > the file data, and it has been for many years. We may have
> > > applications that assume the existing behavior. In addition, this
> > > change would create backwards compatibility problems for things like
> > > backups.
> > >
> > > I'm not convinced that what you're actually proposing (mapping
> > > security.selinux to a different attribute name) actually makes sense,
> > > but that's a question for the selinux folks to decide. Mapping it to a
> > > user.* attribute is definitely wrong though. The modified behavior
> > > would affect anybody, not only users of selinux and/or virtiofs. If
> > > mapping attribute names is actually the right approach, then you need
> > > to look at trusted.* xattrs, which exist specifically for this kind of
> > > purpose. You've noted that trusted.* xattrs aren't supported over nfs.
> > > That's unfortunate, but not an acceptable excuse for messing up user.*
> > > xattrs.
> >
> > Another possibility would be to make selinux use a different
> > security.* attribute for this nested selinux case. That way, the
> > "host" selinux would retain some control over the labels the "guest"
> > uses.
> >
> > Thanks,
> > Andreas
> >
>

2021-09-03 15:53:20

by J. Bruce Fields

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Fri, Sep 3, 2021 at 11:43 AM Vivek Goyal <[email protected]> wrote:
> On Fri, Sep 03, 2021 at 10:42:34AM -0400, Bruce Fields wrote:
> > Well, we could also look at supporting trusted.* xattrs over NFS. I
> > don't know much about them, but it looks like it wouldn't be a lot of
> > work to specify, especially now that we've already got user xattrs?
> > We'd just write a new internet draft that refers to the existing
> > user.* xattr draft for most of the details.
>
> Will be nice if we can support trusted.* xattrs on NFS.

Maybe I should start a separate thread for that. Who would need to be
on it to be sure we get this right?

--b.

2021-09-03 16:53:28

by Casey Schaufler

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On 9/3/2021 8:50 AM, Bruce Fields wrote:
> On Fri, Sep 3, 2021 at 11:43 AM Vivek Goyal <[email protected]> wrote:
>> On Fri, Sep 03, 2021 at 10:42:34AM -0400, Bruce Fields wrote:
>>> Well, we could also look at supporting trusted.* xattrs over NFS. I
>>> don't know much about them, but it looks like it wouldn't be a lot of
>>> work to specify, especially now that we've already got user xattrs?
>>> We'd just write a new internet draft that refers to the existing
>>> user.* xattr draft for most of the details.
>> Will be nice if we can support trusted.* xattrs on NFS.
> Maybe I should start a separate thread for that. Who would need to be
> on it to be sure we get this right?

I would like to be included. It would probably be a good idea to
include the LSM list, [email protected]. I'll leave
the networking and filesystem folks to speak for themselves.

>
> --b.
>

2021-09-03 17:40:50

by Vivek Goyal

Subject: Re: [PATCH 3/1] xfstests: generic/062: Do not run on newer kernels

On Fri, Sep 03, 2021 at 11:50:43AM -0400, Bruce Fields wrote:
> On Fri, Sep 3, 2021 at 11:43 AM Vivek Goyal <[email protected]> wrote:
> > On Fri, Sep 03, 2021 at 10:42:34AM -0400, Bruce Fields wrote:
> > > Well, we could also look at supporting trusted.* xattrs over NFS. I
> > > don't know much about them, but it looks like it wouldn't be a lot of
> > > work to specify, especially now that we've already got user xattrs?
> > > We'd just write a new internet draft that refers to the existing
> > > user.* xattr draft for most of the details.
> >
> > Will be nice if we can support trusted.* xattrs on NFS.
>
> Maybe I should start a separate thread for that. Who would need to be
> on it to be sure we get this right?

I would like to be on the cc list.

Vivek

2021-09-06 15:05:23

by Dr. David Alan Gilbert

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

* Andreas Gruenbacher ([email protected]) wrote:
> Hi,
>
> On Thu, Sep 2, 2021 at 5:22 PM Vivek Goyal <[email protected]> wrote:
> > This is V3 of the patch. Previous versions were posted here.
> >
> > v2: https://lore.kernel.org/linux-fsdevel/[email protected]/
> > v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
> >
> > Changes since v2
> > ----------------
> > - Do not call inode_permission() for special files as file mode bits
> > on these files represent permissions to read/write from/to device
> > and not necessarily permission to read/write xattrs. In this case
> > now user.* extended xattrs can be read/written on special files
> > as long as caller is owner of file or has CAP_FOWNER.
> >
> > - Fixed "man xattr". Will post a patch in same thread little later. (J.
> > Bruce Fields)
> >
> > - Fixed xfstest 062. Changed it to run only on older kernels where
> > user extended xattrs are not allowed on symlinks/special files. Added
> > a new replacement test 648 which does exactly what 062. Just that
> > it is supposed to run on newer kernels where user extended xattrs
> > are allowed on symlinks and special files. Will post patch in
> > same thread (Ted Ts'o).
> >
> > Testing
> > -------
> > - Ran xfstest "./check -g auto" with and without patches and did not
> > notice any new failures.
> >
> > - Tested setting "user.*" xattr with ext4/xfs/btrfs/overlay/nfs
> > filesystems and it works.
> >
> > Description
> > ===========
> >
> > Right now we don't allow setting user.* xattrs on symlinks and special
> > files at all. Initially I thought that real reason behind this
> > restriction is quota limitations but from last conversation it seemed
> > that real reason is that permission bits on symlink and special files
> > are special and different from regular files and directories, hence
> > this restriction is in place. (I tested with xfs user quota enabled and
> > quota restrictions kicked in on symlink).
> >
> > This version of patch allows reading/writing user.* xattr on symlink and
> > special files if caller is owner or priviliged (has CAP_FOWNER) w.r.t inode.
>
> the idea behind user.* xattrs is that they behave similar to file
> contents as far as permissions go. It follows from that that symlinks
> and special files cannot have user.* xattrs. This has been the model
> for many years now and applications may be expecting these semantics,
> so we cannot simply change the behavior. So NACK from me.
>
> > Who wants to set user.* xattr on symlink/special files
> > -----------------------------------------------------
> > I have primarily two users at this point of time.
> >
> > - virtiofs daemon.
> >
> > - fuse-overlay. Giuseppe, seems to set user.* xattr attrs on unpriviliged
> > fuse-overlay as well and he ran into similar issue. So fuse-overlay
> > should benefit from this change as well.
> >
> > Why virtiofsd wants to set user.* xattr on symlink/special files
> > ----------------------------------------------------------------
> > In virtiofs, actual file server is virtiofsd daemon running on host.
> > There we have a mode where xattrs can be remapped to something else.
> > For example security.selinux can be remapped to
> > user.virtiofsd.securit.selinux on the host.
> >
> > This remapping is useful when SELinux is enabled in guest and virtiofs
> > as being used as rootfs. Guest and host SELinux policy might not match
> > and host policy might deny security.selinux xattr setting by guest
> > onto host. Or host might have SELinux disabled and in that case to
> > be able to set security.selinux xattr, virtiofsd will need to have
> > CAP_SYS_ADMIN (which we are trying to avoid). Being able to remap
> > guest security.selinux (or other xattrs) on host to something else
> > is also better from security point of view.
> >
> > But when we try this, we noticed that SELinux relabeling in guest
> > is failing on some symlinks. When I debugged a little more, I
> > came to know that "user.*" xattrs are not allowed on symlinks
> > or special files.
> >
> > So if we allow owner (or CAP_FOWNER) to set user.* xattr, it will
> > allow virtiofs to arbitrarily remap guests's xattrs to something
> > else on host and that solves this SELinux issue nicely and provides
> > two SELinux policies (host and guest) to co-exist nicely without
> > interfering with each other.
>
> The fact that user.* xattrs don't work in this remapping scenario
> should have told you that you're doing things wrong; the user.*
> namespace seriously was never meant to be abused in this way.
>
> You may be able to get away with using trusted.* xattrs which support
> roughly the kind of daemon use I think you're talking about here, but
> I'm not sure selinux will be happy with labels that aren't fully under
> its own control. I really wonder why this wasn't obvious enough.

It was; however, in our use case it wasn't an issue in general, because
the selinux instance that was setting the labels was inside an untrusted
guest. As such, its labels on the host are themselves untrusted, and
hence user. made some sense to the host, until we found out about the
restrictions on user. the hard way.

The mapping code we have doesn't explicitly set user. - it's an
arbitrary remapper that can map to anything you like, trusted. whatever,
but user. feels (to us) like it's right for an untrusted guest.
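
A minimal sketch of what such an arbitrary prefix remapper could look like.
The rule format and the function names here are hypothetical and are not
virtiofsd's actual configuration syntax; only the "user.virtiofsd." prefix
is taken from this thread.

```python
# Hypothetical rule table: (guest prefix, host prefix). The target prefix
# can be anything, e.g. "user.virtiofsd." or "trusted.guest1.".
DEFAULT_RULES = [
    ("security.", "user.virtiofsd.security."),
]


def _swap_prefix(name, rules):
    """Replace the first matching source prefix with its target prefix."""
    for src, dst in rules:
        if name.startswith(src):
            return dst + name[len(src):]
    return name  # no rule matched: pass the name through unchanged


def to_host(name, rules=DEFAULT_RULES):
    """Map an xattr name as seen by the guest to the name stored on the host."""
    return _swap_prefix(name, rules)


def to_guest(name, rules=DEFAULT_RULES):
    """Map a host-side xattr name back to what the guest should see."""
    return _swap_prefix(name, [(dst, src) for src, dst in rules])
```

Pointing the same rule at a different target, for example
("security.", "trusted.guest1.security."), only changes the table and not
the code, which is the sense in which the mapper is arbitrary.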

IMHO the real problem here is that the user/trusted/system/security
'namespaces' are arbitrary hacks rather than a proper namespacing
mechanism that allows you to create new (nested) namespaces and associate
permissions with each one.

Each one carries with it some arbitrary baggage (trusted not working on
NFS, user. having the special rules on symlinks etc).
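
For reference, the special rule for user. can be modelled roughly as below.
This is a simplified sketch based on the user.* branch of xattr_permission()
in fs/xattr.c as it stood before this patch; the function name and the
string return values are illustrative, and the real code goes on to call
inode_permission() when this check passes.

```python
import stat


def user_xattr_check(mode, want_write, owner_or_capable=False):
    """Model the pre-patch user.* check; return an errno name or None."""
    # Anything that is not a regular file or a directory (symlinks,
    # device nodes, fifos, sockets) cannot carry user.* xattrs.
    if not (stat.S_ISREG(mode) or stat.S_ISDIR(mode)):
        return "EPERM" if want_write else "ENODATA"
    # On sticky directories only the owner (or a CAP_FOWNER holder) may write.
    if (stat.S_ISDIR(mode) and (mode & stat.S_ISVTX)
            and want_write and not owner_or_capable):
        return "EPERM"
    return None  # allowed so far; normal permission checks follow
```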

Then every fs or application that trips over these arbitrary limits adds
some hack to work around them in a different way to every other fs or
app that's doing the same thing (see 9p, overlayfs, fuse-overlayfs,
crosvm, etc., all of which do some level of renaming).

What we really need is a namespace where you can do anything you like,
but it's then limited by the security modules, so that I could allow
user.virtiofsd.guest1 to be able to set labels on symlinks for example.

Dave

> Thanks,
> Andreas
>
> > Thanks
> > Vivek
> >
> > Vivek Goyal (1):
> > xattr: Allow user.* xattr on symlink and special files
> >
> > fs/xattr.c | 23 ++++++++++++++++++-----
> > 1 file changed, 18 insertions(+), 5 deletions(-)
> >
> > --
> > 2.31.1
> >
>
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK

2021-09-06 15:39:16

by Miklos Szeredi

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

On Mon, 6 Sept 2021 at 16:39, Dr. David Alan Gilbert
<[email protected]> wrote:

> IMHO the real problem here is that the user/trusted/system/security
> 'namespaces' are arbitrary hacks rather than a proper namespacing
> mechanism that allows you to create new (nested) namespaces and associate
> permissions with each one.

Indeed.

This is what Eric Biederman suggested at some point for supporting
trusted xattrs within a user namespace:

| For trusted xattrs I think it makes sense in principle. The namespace
| would probably become something like "trusted<ns-root-uid>.".

Theory sounds simple enough. Anyone interested in looking at the details?
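
To make the proposed naming concrete, here is a small sketch of building
and parsing such names. The exact on-disk format is an assumption drawn
only from the "trusted<ns-root-uid>." wording quoted above; no kernel
implements these helpers, and their names are made up for illustration.

```python
import re

# Assumed format: "trusted" + decimal uid of the namespace's root user
# (as seen from init_user_ns) + "." + the usual attribute suffix.
_NS_TRUSTED = re.compile(r"trusted(\d+)\.(.+)")


def namespaced_trusted_name(ns_root_uid, suffix):
    """Build the on-disk name for a trusted xattr owned by a user namespace."""
    if ns_root_uid < 0:
        raise ValueError("uid must be non-negative")
    return f"trusted{ns_root_uid}.{suffix}"


def parse_namespaced_trusted(name):
    """Return (ns_root_uid, suffix), or None for any other xattr name."""
    m = _NS_TRUSTED.fullmatch(name)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)
```

A plain "trusted.foo" does not match, so attributes owned by the initial
namespace stay distinguishable from the namespaced ones.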

Thanks,
Miklos

2021-09-07 22:27:49

by Vivek Goyal

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

On Mon, Sep 06, 2021 at 04:56:44PM +0200, Miklos Szeredi wrote:
> On Mon, 6 Sept 2021 at 16:39, Dr. David Alan Gilbert
> <[email protected]> wrote:
>
> > IMHO the real problem here is that the user/trusted/system/security
> > 'namespaces' are arbitrary hacks rather than a proper namespacing
> > mechanism that allows you to create new (nested) namespaces and associate
> > permissions with each one.
>
> Indeed.
>
> This is what Eric Biederman suggested at some point for supporting
> trusted xattrs within a user namespace:
>
> | For trusted xattrs I think it makes sense in principle. The namespace
> | would probably become something like "trusted<ns-root-uid>.".
>
> Theory sounds simple enough. Anyone interested in looking at the details?

So this namespaced trusted.* xattr domain will basically avoid the need
to have CAP_SYS_ADMIN in init_user_ns, IIUC. I guess this is better
than giving CAP_SYS_ADMIN in init_user_ns.

Vivek

2021-09-08 07:42:44

by Miklos Szeredi

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

On Tue, 7 Sept 2021 at 23:40, Vivek Goyal <[email protected]> wrote:
>
> On Mon, Sep 06, 2021 at 04:56:44PM +0200, Miklos Szeredi wrote:
> > On Mon, 6 Sept 2021 at 16:39, Dr. David Alan Gilbert
> > <[email protected]> wrote:
> >
> > > IMHO the real problem here is that the user/trusted/system/security
> > > 'namespaces' are arbitrary hacks rather than a proper namespacing
> > > mechanism that allows you to create new (nested) namespaces and associate
> > > permissions with each one.
> >
> > Indeed.
> >
> > This is what Eric Biederman suggested at some point for supporting
> > trusted xattrs within a user namespace:
> >
> > | For trusted xattrs I think it makes sense in principle. The namespace
> > | would probably become something like "trusted<ns-root-uid>.".
> >
> > Theory sounds simple enough. Anyone interested in looking at the details?
>
> So this namespaced trusted.* xattr domain will basically avoid the need
> to have CAP_SYS_ADMIN in init_user_ns, IIUC. I guess this is better
> than giving CAP_SYS_ADMIN in init_user_ns.

That's the objective, yes. I think the trick is getting filesystems
to store yet another xattr type.

Thanks,
Miklos

2021-09-08 14:25:36

by Eric W. Biederman

Subject: Re: [PATCH v3 0/1] Relax restrictions on user.* xattr

Miklos Szeredi <[email protected]> writes:

> On Tue, 7 Sept 2021 at 23:40, Vivek Goyal <[email protected]> wrote:
>>
>> On Mon, Sep 06, 2021 at 04:56:44PM +0200, Miklos Szeredi wrote:
>> > On Mon, 6 Sept 2021 at 16:39, Dr. David Alan Gilbert
>> > <[email protected]> wrote:
>> >
>> > > IMHO the real problem here is that the user/trusted/system/security
>> > > 'namespaces' are arbitrary hacks rather than a proper namespacing
>> > > mechanism that allows you to create new (nested) namespaces and associate
>> > > permissions with each one.
>> >
>> > Indeed.
>> >
>> > This is what Eric Biederman suggested at some point for supporting
>> > trusted xattrs within a user namespace:
>> >
>> > | For trusted xattrs I think it makes sense in principle. The namespace
>> > | would probably become something like "trusted<ns-root-uid>.".
>> >
>> > Theory sounds simple enough. Anyone interested in looking at the details?
>>
>> So this namespaced trusted.* xattr domain will basically avoid the need
>> to have CAP_SYS_ADMIN in init_user_ns, IIUC. I guess this is better
>> than giving CAP_SYS_ADMIN in init_user_ns.
>
> That's the objective, yes. I think the trick is getting filesystems
> to store yet another xattr type.

Using the uid of the root user of a user namespace is probably the best
idea we have so far for identifying a user namespace in persistent
on-disk meta-data. We ran into a little trouble using that idea
for file capabilities.

The key problem was there are corner cases where some nested user
namespaces have the same root user id as their parent namespaces. This
has the potential to allow privilege escalation if the creator of the
user namespace does not have sufficient capabilities.
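
The corner case can be illustrated with a toy model. This is only a sketch:
each namespace here has a single hypothetical offset ("base") instead of
the multi-extent uid_map the kernel really uses, and the class and method
names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserNS:
    parent: Optional["UserNS"]  # None for the initial namespace
    base: int = 0               # uid N inside maps to uid base + N in the parent

    def root_kuid(self):
        """Resolve this namespace's uid 0 to a uid in the initial namespace."""
        uid, ns = 0, self
        while ns.parent is not None:
            uid += ns.base
            ns = ns.parent
        return uid


init_ns = UserNS(parent=None)
outer = UserNS(parent=init_ns, base=100000)
# A nested namespace that maps its own uid 0 onto the parent's uid 0.
nested = UserNS(parent=outer, base=0)
# A nested namespace whose root is an ordinary user of the parent.
other = UserNS(parent=outer, base=1000)
```

In this model outer and nested both resolve to root uid 100000, so any
on-disk tag derived from the root uid alone cannot tell them apart, while
other resolves to 101000 and stays distinct.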

The solution we adopted can be seen in db2e718a4798 ("capabilities:
require CAP_SETFCAP to map uid 0").

That solution basically disallows the creation of user namespaces
that could have problems. I think that to use trusted xattrs this way, the
code would need to treat CAP_SYS_ADMIN the same way it currently treats
CAP_SETFCAP.

Eric