Introduce a helper which makes it possible to detect fileystems whose
superblock is visible in multiple user namespace. This currently only
means proc and sys. Such filesystems usually have special semantics so their
behavior will not be changed with the introduction of fsid mappings.
Signed-off-by: Christian Brauner <[email protected]>
---
/* v2 */
unchanged
/* v3 */
unchanged
---
include/linux/fs.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3cd4fe6b845e..fdc8fb2d786b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3651,4 +3651,9 @@ static inline int inode_drain_writes(struct inode *inode)
return filemap_write_and_wait(inode->i_mapping);
}
+static inline bool is_userns_visible(unsigned long iflags)
+{
+ return (iflags & SB_I_USERNS_VISIBLE);
+}
+
#endif /* _LINUX_FS_H */
--
2.25.0
On Tue, Feb 18, 2020 at 03:33:55PM +0100, Christian Brauner wrote:
> Introduce a helper which makes it possible to detect fileystems whose
> superblock is visible in multiple user namespace. This currently only
> means proc and sys. Such filesystems usually have special semantics so their
> behavior will not be changed with the introduction of fsid mappings.
Hi,
I'm afraid I've got a bit of a hangup about the terminology here. I
*think* what you mean is that SB_I_USERNS_VISIBLE is an fs whose uids are
always translated per the id mappings, not fsid mappings. But when I see
the name it seems to imply that !SB_I_USERNS_VISIBLE filesystems can't
be seen by other namespaces at all.
Am I right in my first interpretation? If so, can we talk about the
naming?
> Signed-off-by: Christian Brauner <[email protected]>
> ---
> /* v2 */
> unchanged
>
> /* v3 */
> unchanged
> ---
> include/linux/fs.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 3cd4fe6b845e..fdc8fb2d786b 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3651,4 +3651,9 @@ static inline int inode_drain_writes(struct inode *inode)
> return filemap_write_and_wait(inode->i_mapping);
> }
>
> +static inline bool is_userns_visible(unsigned long iflags)
> +{
> + return (iflags & SB_I_USERNS_VISIBLE);
> +}
> +
> #endif /* _LINUX_FS_H */
> --
> 2.25.0
On Tue, Feb 18, 2020 at 08:42:33PM -0600, Serge Hallyn wrote:
> On Tue, Feb 18, 2020 at 03:33:55PM +0100, Christian Brauner wrote:
> > Introduce a helper which makes it possible to detect fileystems whose
> > superblock is visible in multiple user namespace. This currently only
> > means proc and sys. Such filesystems usually have special semantics so their
> > behavior will not be changed with the introduction of fsid mappings.
>
> Hi,
>
> I'm afraid I've got a bit of a hangup about the terminology here. I
> *think* what you mean is that SB_I_USERNS_VISIBLE is an fs whose uids are
> always translated per the id mappings, not fsid mappings. But when I see
Correct!
> the name it seems to imply that !SB_I_USERNS_VISIBLE filesystems can't
> be seen by other namespaces at all.
>
> Am I right in my first interpretation? If so, can we talk about the
> naming?
Yep, your first interpretation is right. What about: wants_idmaps()
On Wed, Feb 19, 2020 at 4:06 AM Christian Brauner
<[email protected]> wrote:
>
> On Tue, Feb 18, 2020 at 08:42:33PM -0600, Serge Hallyn wrote:
> > On Tue, Feb 18, 2020 at 03:33:55PM +0100, Christian Brauner wrote:
> > > Introduce a helper which makes it possible to detect fileystems whose
> > > superblock is visible in multiple user namespace. This currently only
> > > means proc and sys. Such filesystems usually have special semantics so their
> > > behavior will not be changed with the introduction of fsid mappings.
> >
> > Hi,
> >
> > I'm afraid I've got a bit of a hangup about the terminology here. I
> > *think* what you mean is that SB_I_USERNS_VISIBLE is an fs whose uids are
> > always translated per the id mappings, not fsid mappings. But when I see
>
> Correct!
>
> > the name it seems to imply that !SB_I_USERNS_VISIBLE filesystems can't
> > be seen by other namespaces at all.
> >
> > Am I right in my first interpretation? If so, can we talk about the
> > naming?
>
> Yep, your first interpretation is right. What about: wants_idmaps()
Maybe fsidmap_exempt()?
I still haven't convinced myself that any of the above is actually
correct behavior, especially when people do things like creating
setuid binaries.
On Wed, Feb 19, 2020 at 09:18:51AM -0800, Andy Lutomirski wrote:
> On Wed, Feb 19, 2020 at 4:06 AM Christian Brauner
> <[email protected]> wrote:
> >
> > On Tue, Feb 18, 2020 at 08:42:33PM -0600, Serge Hallyn wrote:
> > > On Tue, Feb 18, 2020 at 03:33:55PM +0100, Christian Brauner wrote:
> > > > Introduce a helper which makes it possible to detect fileystems whose
> > > > superblock is visible in multiple user namespace. This currently only
> > > > means proc and sys. Such filesystems usually have special semantics so their
> > > > behavior will not be changed with the introduction of fsid mappings.
> > >
> > > Hi,
> > >
> > > I'm afraid I've got a bit of a hangup about the terminology here. I
> > > *think* what you mean is that SB_I_USERNS_VISIBLE is an fs whose uids are
> > > always translated per the id mappings, not fsid mappings. But when I see
> >
> > Correct!
> >
> > > the name it seems to imply that !SB_I_USERNS_VISIBLE filesystems can't
> > > be seen by other namespaces at all.
> > >
> > > Am I right in my first interpretation? If so, can we talk about the
> > > naming?
> >
> > Yep, your first interpretation is right. What about: wants_idmaps()
>
> Maybe fsidmap_exempt()?
Yeah, and maybe SB_USERNS_FSID_EXEMPT ?
> I still haven't convinced myself that any of the above is actually
> correct behavior, especially when people do things like creating
> setuid binaries.
The only place that would be a problem is if the child userns has an
fsidmapping from X to 0 in the parent userns, right? Yeah I'm sure
many people would ignore all advice to the contrary and do this anyway,
but I would try hard to suggest that people use an intermediary userns
for storing filesystems for the "docker share" case. So the host fsid
range would start at say 200000. So a setuid binary would just be
setuid-200000.