2013-05-22 07:30:37

by Stanislav Kinsbursky

[permalink] [raw]
Subject: [RFC PATCH] fs: call_usermodehelper_root helper introduced

Usermode helper executes all binaries in global "init" root context. This
doesn't allow to call a binary from other root context (for example in a
container).
Currently, both containerized NFS client and NFS server requires an ability to
execute a binary in a container's root context. Root swap can be done in
"init" callback, passed by UMH caller.
But since we have 2 callers already (and more of them are expected to appear
in future) and because set_fs_root() in not exported, it looks reasonable to
add one more generic UMH helper to generic fs code.
Root path reference must be hold by the caller, since it will be put on UMH
thread exit.

Signed-off-by: Stanislav Kinsbursky <[email protected]>
---
fs/fs_struct.c | 28 ++++++++++++++++++++++++++++
include/linux/fs_struct.h | 4 ++++
2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index d8ac61d..cd1de8e 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -4,6 +4,7 @@
#include <linux/path.h>
#include <linux/slab.h>
#include <linux/fs_struct.h>
+#include <linux/kmod.h>
#include "internal.h"

/*
@@ -157,6 +158,33 @@ int current_umask(void)
}
EXPORT_SYMBOL(current_umask);

+static int umh_set_fs_root(struct subprocess_info *info, struct cred *new)
+{
+ set_fs_root(current->fs, info->data);
+ return 0;
+}
+
+/*
+ * Call a usermode helper with a specific fs root.
+ *
+ * The caller must hold extra reference to it otherwise, because it will be
+ * put on usermodehelper thread exit.
+ */
+int call_usermodehelper_root(char *path, char **argv, char **envp,
+ struct path *root, int wait)
+{
+ struct subprocess_info *info;
+ gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
+
+ info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
+ umh_set_fs_root, NULL, root);
+ if (info == NULL)
+ return -ENOMEM;
+
+ return call_usermodehelper_exec(info, wait);
+}
+EXPORT_SYMBOL(call_usermodehelper_root);
+
/* to be mentioned only in INIT_TASK */
struct fs_struct init_fs = {
.users = 1,
diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index 2b93a9a..cead51e 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -52,4 +52,8 @@ static inline void get_fs_root_and_pwd(struct fs_struct *fs, struct path *root,

extern bool current_chrooted(void);

+extern int
+call_usermodehelper_root(char *path, char **argv, char **envp,
+ struct path *root, int wait);
+
#endif /* _LINUX_FS_STRUCT_H */


2013-05-22 16:07:42

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On 05/22, Stanislav Kinsbursky wrote:
>
> +static int umh_set_fs_root(struct subprocess_info *info, struct cred *new)
> +{
> + set_fs_root(current->fs, info->data);
> + return 0;
> +}
> +
> +/*
> + * Call a usermode helper with a specific fs root.
> + *
> + * The caller must hold extra reference to it otherwise, because it will be
> + * put on usermodehelper thread exit.
> + */
> +int call_usermodehelper_root(char *path, char **argv, char **envp,
> + struct path *root, int wait)
> +{
> + struct subprocess_info *info;
> + gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
> +
> + info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
> + umh_set_fs_root, NULL, root);
> + if (info == NULL)
> + return -ENOMEM;
> +
> + return call_usermodehelper_exec(info, wait);
> +}
> +EXPORT_SYMBOL(call_usermodehelper_root);

Of course, I can't ack the placement (and yes, it was me who argued
that kmod.c is probably not the best place), but the patch looks fine.

I am not sure the new helper actually needs "int wait" but this matches
call_usermodehelper().

For what it's worth:

Reviewed-by: Oleg Nesterov <[email protected]>

2013-05-22 17:34:26

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

Stanislav Kinsbursky <[email protected]> writes:

> Usermode helper executes all binaries in global "init" root context. This
> doesn't allow to call a binary from other root context (for example in a
> container).
> Currently, both containerized NFS client and NFS server requires an ability to
> execute a binary in a container's root context. Root swap can be done in
> "init" callback, passed by UMH caller.
> But since we have 2 callers already (and more of them are expected to appear
> in future) and because set_fs_root() in not exported, it looks reasonable to
> add one more generic UMH helper to generic fs code.
> Root path reference must be hold by the caller, since it will be put on UMH
> thread exit.

Awesome. With this patch as an uprivilieged user I get to pick which
binary the kernel will execute. At least if nfs and nfsd ever runs in a
user namespace (something that looks like only matter of time).

I think this is a seriously bad idea.

Why can't we do this in userspace with setns as we do with the core dump
helper?

I am missing a lot of context here and capturing the context of a
process at time time we mount the filesystem and reconstituing it in
call user mode helper seems like something we could do.

This patch as it stands looks like it would compete for the honor of the
easiest kernel feature to exploit.

Eric

2013-05-22 18:36:20

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

[email protected] (Eric W. Biederman) writes:

> I am missing a lot of context here and capturing the context of a
> process at time time we mount the filesystem and reconstituing it in
> call user mode helper seems like something we could do.

If we want to do something like this the only sane thing I can see is to
have a per container version of kthread call it uthread. That the user
mode helper code would use to launch a new process.

Anything else and I expect we will be tearing our hair out for the rest
of our lives with weird corner cases or unexpected semantics.

At first glace I would exepct uthread to be per pid namespace in
implementation.

Eric

2013-05-22 19:24:09

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Wed, May 22, 2013 at 11:35:56AM -0700, Eric W. Biederman wrote:
> [email protected] (Eric W. Biederman) writes:
>
> > I am missing a lot of context here and capturing the context of a
> > process at time time we mount the filesystem and reconstituing it in
> > call user mode helper seems like something we could do.

So it's not enough just to ensure that the user namespace is set
correctly? (To the namespace of the mount process in the nfs case, or
of the process that starts nfsd in the nfsd case.)

> If we want to do something like this the only sane thing I can see is to
> have a per container version of kthread call it uthread. That the user
> mode helper code would use to launch a new process.
>
> Anything else and I expect we will be tearing our hair out for the rest
> of our lives with weird corner cases or unexpected semantics.

Could you give examples of weird corner cases or unexpected semantics?

--b.

> At first glace I would exepct uthread to be per pid namespace in
> implementation.
>
> Eric
>
>

2013-05-23 03:37:47

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

"J. Bruce Fields" <[email protected]> writes:

> On Wed, May 22, 2013 at 11:35:56AM -0700, Eric W. Biederman wrote:
>> [email protected] (Eric W. Biederman) writes:
>>
>> > I am missing a lot of context here and capturing the context of a
>> > process at time time we mount the filesystem and reconstituing it in
>> > call user mode helper seems like something we could do.
>
> So it's not enough just to ensure that the user namespace is set
> correctly? (To the namespace of the mount process in the nfs case, or
> of the process that starts nfsd in the nfsd case.)

No. By just setting the user namespace, you gain the ability to create
processes outside of your curremt rlimits, and cgroups, and pid
namespace, etc.

With the wrong set of namespaces you will talk to the wrong processes,
or utilize the wrong resources.

Outside of your current cgroup you gain the ability to use more
resources than you were limited to.

Having thought about it what I would propose is to fork a process from
the context of the mounting process (or possibly from the context of the
process that writes to the sysctl that sets the program to spawn) and
have that process be a daemon that becomes responsible for spawning user
mode helpers with context. Either that or whe need the user mode helper
userspace infrastructure to become namespace aware and call setns.

>> If we want to do something like this the only sane thing I can see is to
>> have a per container version of kthread call it uthread. That the user
>> mode helper code would use to launch a new process.
>>
>> Anything else and I expect we will be tearing our hair out for the rest
>> of our lives with weird corner cases or unexpected semantics.
>
> Could you give examples of weird corner cases or unexpected semantics?

I gave a couple and it is the classic problem with suid executables
where a lot of unexpected things are inherited from the environment, and
so things become a never ending race. We replaced daemonize in the
kernel with just forking processes from kthreadd for this very reason.
There was always something else that needed to be reset to make a
process a proper kernel thread.

Eric

2013-05-23 08:08:35

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

22.05.2013 21:33, Eric W. Biederman ?????:
> Stanislav Kinsbursky <[email protected]> writes:
>
>> Usermode helper executes all binaries in global "init" root context. This
>> doesn't allow to call a binary from other root context (for example in a
>> container).
>> Currently, both containerized NFS client and NFS server requires an ability to
>> execute a binary in a container's root context. Root swap can be done in
>> "init" callback, passed by UMH caller.
>> But since we have 2 callers already (and more of them are expected to appear
>> in future) and because set_fs_root() in not exported, it looks reasonable to
>> add one more generic UMH helper to generic fs code.
>> Root path reference must be hold by the caller, since it will be put on UMH
>> thread exit.
>
> Awesome. With this patch as an uprivilieged user I get to pick which
> binary the kernel will execute. At least if nfs and nfsd ever runs in a
> user namespace (something that looks like only matter of time).
>

Not really. Only by using a kernel module to call the UMH.
And an unprivileged can't load a module as far a I know.
I.e. NFSd, for example, will use unprivileged user's root to perform this call.

> I think this is a seriously bad idea.
>
> Why can't we do this in userspace with setns as we do with the core dump
> helper?
>

Could you, please, clarify, how setns can help here?

> I am missing a lot of context here and capturing the context of a
> process at time time we mount the filesystem and reconstituing it in
> call user mode helper seems like something we could do.
>
> This patch as it stands looks like it would compete for the honor of the
> easiest kernel feature to exploit.
>

Hmmm...
As far as I can see (maybe I'm missing something), there main security issue that could be here
is allowing of using any passed root to swap to.
What about using the current root instead of passed one? I.e. taking the root to swap to inside the UMH.
Does this keeps the isolation on the same level?

> Eric
>


--
Best regards,
Stanislav Kinsbursky

2013-05-23 08:11:55

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

22.05.2013 22:35, Eric W. Biederman ?????:
> [email protected] (Eric W. Biederman) writes:
>
>> I am missing a lot of context here and capturing the context of a
>> process at time time we mount the filesystem and reconstituing it in
>> call user mode helper seems like something we could do.
>
> If we want to do something like this the only sane thing I can see is to
> have a per container version of kthread call it uthread. That the user
> mode helper code would use to launch a new process.
>

The main point here, is that a container can have it's own root, different
to kthread's one (another mount or at least chroot result).

> Anything else and I expect we will be tearing our hair out for the rest
> of our lives with weird corner cases or unexpected semantics.
>
> At first glace I would exepct uthread to be per pid namespace in
> implementation.
>

Having a per-pidnamespace kernel thread would be really great.
But regrettably doesn't solve the root swapping problem.

> Eric
>
>


--
Best regards,
Stanislav Kinsbursky

2013-05-23 10:00:59

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

Stanislav Kinsbursky <[email protected]> writes:

> 22.05.2013 21:33, Eric W. Biederman пишет:
>> Stanislav Kinsbursky <[email protected]> writes:
>>
>>> Usermode helper executes all binaries in global "init" root context. This
>>> doesn't allow to call a binary from other root context (for example in a
>>> container).
>>> Currently, both containerized NFS client and NFS server requires an ability to
>>> execute a binary in a container's root context. Root swap can be done in
>>> "init" callback, passed by UMH caller.
>>> But since we have 2 callers already (and more of them are expected to appear
>>> in future) and because set_fs_root() in not exported, it looks reasonable to
>>> add one more generic UMH helper to generic fs code.
>>> Root path reference must be hold by the caller, since it will be put on UMH
>>> thread exit.
>>
>> Awesome. With this patch as an uprivilieged user I get to pick which
>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
>> user namespace (something that looks like only matter of time).
>>
>
> Not really. Only by using a kernel module to call the UMH.
> And an unprivileged can't load a module as far a I know.
> I.e. NFSd, for example, will use unprivileged user's root to perform this call.

To help me understand the context which instances of call user mode
helper are you expecting to use this facility?

>> I think this is a seriously bad idea.
>>
>> Why can't we do this in userspace with setns as we do with the core dump
>> helper?
>>
>
> Could you, please, clarify, how setns can help here?

setns can change the mount namespace, and chroot can change to root
directory in the specified mount namespace. Essentially you can enter
into a containers complete context (pid, mnt, root, etc) comming from
the outside.

For nfs and nfsd I believe everything is controlled by mounts. So the
alternate solution is to have a kernel thread for calling your user mode
helper processes that is created with plain old kernel_thread aka fork
when we perform the controlling nfs mount, and have that thread be the
parent process for any containerized user mode helper calls.

>> I am missing a lot of context here and capturing the context of a
>> process at time time we mount the filesystem and reconstituing it in
>> call user mode helper seems like something we could do.
>>
>> This patch as it stands looks like it would compete for the honor of the
>> easiest kernel feature to exploit.
>>
>
> Hmmm...
> As far as I can see (maybe I'm missing something), there main security issue that could be here
> is allowing of using any passed root to swap to.

Assuming that it is mounts that control this, and root in a user
namespace will be able to mount nfs. Then root in a user namespace will
be able to cause any program to run as the global root by simply
performing the appropriate set of mounts, and triggering the user mode
helper.

Instant gaining of all capabilities and privilieges on the system given
to a binary of your choice.

> What about using the current root instead of passed one? I.e. taking the root to swap to inside the UMH.
> Does this keeps the isolation on the same level?

Passing the necessary information to the user mode helper so the user
mode helper can perform the swaps should be fine. That essentially is
what I was suggesting when I mentioned setns above.

Eric

2013-05-23 10:36:15

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

23.05.2013 14:00, Eric W. Biederman пишет:
> Stanislav Kinsbursky <[email protected]> writes:
>
>> 22.05.2013 21:33, Eric W. Biederman пишет:
>>> Stanislav Kinsbursky <[email protected]> writes:
>>>
>>>> Usermode helper executes all binaries in global "init" root context. This
>>>> doesn't allow to call a binary from other root context (for example in a
>>>> container).
>>>> Currently, both containerized NFS client and NFS server requires an ability to
>>>> execute a binary in a container's root context. Root swap can be done in
>>>> "init" callback, passed by UMH caller.
>>>> But since we have 2 callers already (and more of them are expected to appear
>>>> in future) and because set_fs_root() in not exported, it looks reasonable to
>>>> add one more generic UMH helper to generic fs code.
>>>> Root path reference must be hold by the caller, since it will be put on UMH
>>>> thread exit.
>>>
>>> Awesome. With this patch as an uprivilieged user I get to pick which
>>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
>>> user namespace (something that looks like only matter of time).
>>>
>>
>> Not really. Only by using a kernel module to call the UMH.
>> And an unprivileged can't load a module as far a I know.
>> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
>
> To help me understand the context which instances of call user mode
> helper are you expecting to use this facility?
>

Ok. Here is how the NFSd uses UMH:
UMH is used on NFSd service to start user-space client tracker daemon
("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
persistent storage.

>>> I think this is a seriously bad idea.
>>>
>>> Why can't we do this in userspace with setns as we do with the core dump
>>> helper?
>>>
>>
>> Could you, please, clarify, how setns can help here?
>
> setns can change the mount namespace, and chroot can change to root
> directory in the specified mount namespace. Essentially you can enter
> into a containers complete context (pid, mnt, root, etc) comming from
> the outside.
>

So, you are actually suggesting to move the binary start from the kernel to user-space.
IOW, you are suggesting to do not using UMH at all.
Am I right?
I don't know the reasons, why it was done by using UMH and not in userspace.
Could you clarify this, Jeff?

> For nfs and nfsd I believe everything is controlled by mounts. So the
> alternate solution is to have a kernel thread for calling your user mode
> helper processes that is created with plain old kernel_thread aka fork
> when we perform the controlling nfs mount, and have that thread be the
> parent process for any containerized user mode helper calls.
>

Can't we just run into lack of credentials to execute the desired binary with this
kernel_thread() approach?

>>> I am missing a lot of context here and capturing the context of a
>>> process at time time we mount the filesystem and reconstituing it in
>>> call user mode helper seems like something we could do.
>>>
>>> This patch as it stands looks like it would compete for the honor of the
>>> easiest kernel feature to exploit.
>>>
>>
>> Hmmm...
>> As far as I can see (maybe I'm missing something), there main security issue that could be here
>> is allowing of using any passed root to swap to.
>
> Assuming that it is mounts that control this, and root in a user
> namespace will be able to mount nfs. Then root in a user namespace will
> be able to cause any program to run as the global root by simply
> performing the appropriate set of mounts, and triggering the user mode
> helper.
>

Why you say "any program" in this case?
There is no way to pass a program name into the kernel as far a I know.
In case of NFSd the binary path is hard-coded in the kernel module.

> Instant gaining of all capabilities and privilieges on the system given
> to a binary of your choice.
>
>> What about using the current root instead of passed one? I.e. taking the root to swap to inside the UMH.
>> Does this keeps the isolation on the same level?
>
> Passing the necessary information to the user mode helper so the user
> mode helper can perform the swaps should be fine. That essentially is
> what I was suggesting when I mentioned setns above.
>

Looks I'm missing something here.
If you are not against of using the UMH itself, then there must be a way to tell the
UMH where it should lookup the binary (UHM thread's root is used to lookup for the path in existent
implementation).
And that's what I'm trying to implement.

> Eric
>


--
Best regards,
Stanislav Kinsbursky

2013-05-23 11:31:24

by Jeff Layton

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Thu, 23 May 2013 14:35:53 +0400
Stanislav Kinsbursky <[email protected]> wrote:

> 23.05.2013 14:00, Eric W. Biederman пишет:
> > Stanislav Kinsbursky <[email protected]> writes:
> >
> >> 22.05.2013 21:33, Eric W. Biederman пишет:
> >>> Stanislav Kinsbursky <[email protected]> writes:
> >>>
> >>>> Usermode helper executes all binaries in global "init" root context. This
> >>>> doesn't allow to call a binary from other root context (for example in a
> >>>> container).
> >>>> Currently, both containerized NFS client and NFS server requires an ability to
> >>>> execute a binary in a container's root context. Root swap can be done in
> >>>> "init" callback, passed by UMH caller.
> >>>> But since we have 2 callers already (and more of them are expected to appear
> >>>> in future) and because set_fs_root() in not exported, it looks reasonable to
> >>>> add one more generic UMH helper to generic fs code.
> >>>> Root path reference must be hold by the caller, since it will be put on UMH
> >>>> thread exit.
> >>>
> >>> Awesome. With this patch as an uprivilieged user I get to pick which
> >>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
> >>> user namespace (something that looks like only matter of time).
> >>>
> >>
> >> Not really. Only by using a kernel module to call the UMH.
> >> And an unprivileged can't load a module as far a I know.
> >> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
> >
> > To help me understand the context which instances of call user mode
> > helper are you expecting to use this facility?
> >
>
> Ok. Here is how the NFSd uses UMH:
> UMH is used on NFSd service to start user-space client tracker daemon
> ("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
> persistent storage.
>
> >>> I think this is a seriously bad idea.
> >>>
> >>> Why can't we do this in userspace with setns as we do with the core dump
> >>> helper?
> >>>
> >>
> >> Could you, please, clarify, how setns can help here?
> >
> > setns can change the mount namespace, and chroot can change to root
> > directory in the specified mount namespace. Essentially you can enter
> > into a containers complete context (pid, mnt, root, etc) comming from
> > the outside.
> >
>
> So, you are actually suggesting to move the binary start from the kernel to user-space.
> IOW, you are suggesting to do not using UMH at all.
> Am I right?
> I don't know the reasons, why it was done by using UMH and not in userspace.
> Could you clarify this, Jeff?
>

nfsdcltrack is a "one-shot" program for managing and querying the nfsd
client tracking database. When knfsd needs to query or modify the
db, it uses the UMH infrastructure to call this program that does
what's requested and then exits.

So, I'm not sure I really understand your question. It wasn't done in
userspace since the whole purpose of this program is to handle upcalls
from the kernel.

--
Jeff Layton <[email protected]>

2013-05-23 11:38:44

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

23.05.2013 15:31, Jeff Layton пишет:
> On Thu, 23 May 2013 14:35:53 +0400
> Stanislav Kinsbursky <[email protected]> wrote:
>
>> 23.05.2013 14:00, Eric W. Biederman пишет:
>>> Stanislav Kinsbursky <[email protected]> writes:
>>>
>>>> 22.05.2013 21:33, Eric W. Biederman пишет:
>>>>> Stanislav Kinsbursky <[email protected]> writes:
>>>>>
>>>>>> Usermode helper executes all binaries in global "init" root context. This
>>>>>> doesn't allow to call a binary from other root context (for example in a
>>>>>> container).
>>>>>> Currently, both containerized NFS client and NFS server requires an ability to
>>>>>> execute a binary in a container's root context. Root swap can be done in
>>>>>> "init" callback, passed by UMH caller.
>>>>>> But since we have 2 callers already (and more of them are expected to appear
>>>>>> in future) and because set_fs_root() in not exported, it looks reasonable to
>>>>>> add one more generic UMH helper to generic fs code.
>>>>>> Root path reference must be hold by the caller, since it will be put on UMH
>>>>>> thread exit.
>>>>>
>>>>> Awesome. With this patch as an uprivilieged user I get to pick which
>>>>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
>>>>> user namespace (something that looks like only matter of time).
>>>>>
>>>>
>>>> Not really. Only by using a kernel module to call the UMH.
>>>> And an unprivileged can't load a module as far a I know.
>>>> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
>>>
>>> To help me understand the context which instances of call user mode
>>> helper are you expecting to use this facility?
>>>
>>
>> Ok. Here is how the NFSd uses UMH:
>> UMH is used on NFSd service to start user-space client tracker daemon
>> ("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
>> persistent storage.
>>
>>>>> I think this is a seriously bad idea.
>>>>>
>>>>> Why can't we do this in userspace with setns as we do with the core dump
>>>>> helper?
>>>>>
>>>>
>>>> Could you, please, clarify, how setns can help here?
>>>
>>> setns can change the mount namespace, and chroot can change to root
>>> directory in the specified mount namespace. Essentially you can enter
>>> into a containers complete context (pid, mnt, root, etc) comming from
>>> the outside.
>>>
>>
>> So, you are actually suggesting to move the binary start from the kernel to user-space.
>> IOW, you are suggesting to do not using UMH at all.
>> Am I right?
>> I don't know the reasons, why it was done by using UMH and not in userspace.
>> Could you clarify this, Jeff?
>>
>
> nfsdcltrack is a "one-shot" program for managing and querying the nfsd
> client tracking database. When knfsd needs to query or modify the
> db, it uses the UMH infrastructure to call this program that does
> what's requested and then exits.
>
> So, I'm not sure I really understand your question. It wasn't done in
> userspace since the whole purpose of this program is to handle upcalls
> from the kernel.
>

The question is what was the reason to start this binary from kernel by UMH?
I.e. why it can't be started by some user-space process before or after NFSd start?
I don't familiar with this client tracking facility and that's the only reason why I'm asking.

--
Best regards,
Stanislav Kinsbursky

2013-05-23 11:56:32

by Jeff Layton

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Thu, 23 May 2013 15:38:17 +0400
Stanislav Kinsbursky <[email protected]> wrote:

> 23.05.2013 15:31, Jeff Layton пишет:
> > On Thu, 23 May 2013 14:35:53 +0400
> > Stanislav Kinsbursky <[email protected]> wrote:
> >
> >> 23.05.2013 14:00, Eric W. Biederman пишет:
> >>> Stanislav Kinsbursky <[email protected]> writes:
> >>>
> >>>> 22.05.2013 21:33, Eric W. Biederman пишет:
> >>>>> Stanislav Kinsbursky <[email protected]> writes:
> >>>>>
> >>>>>> Usermode helper executes all binaries in global "init" root context. This
> >>>>>> doesn't allow to call a binary from other root context (for example in a
> >>>>>> container).
> >>>>>> Currently, both containerized NFS client and NFS server requires an ability to
> >>>>>> execute a binary in a container's root context. Root swap can be done in
> >>>>>> "init" callback, passed by UMH caller.
> >>>>>> But since we have 2 callers already (and more of them are expected to appear
> >>>>>> in future) and because set_fs_root() in not exported, it looks reasonable to
> >>>>>> add one more generic UMH helper to generic fs code.
> >>>>>> Root path reference must be hold by the caller, since it will be put on UMH
> >>>>>> thread exit.
> >>>>>
> >>>>> Awesome. With this patch as an uprivilieged user I get to pick which
> >>>>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
> >>>>> user namespace (something that looks like only matter of time).
> >>>>>
> >>>>
> >>>> Not really. Only by using a kernel module to call the UMH.
> >>>> And an unprivileged can't load a module as far a I know.
> >>>> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
> >>>
> >>> To help me understand the context which instances of call user mode
> >>> helper are you expecting to use this facility?
> >>>
> >>
> >> Ok. Here is how the NFSd uses UMH:
> >> UMH is used on NFSd service to start user-space client tracker daemon
> >> ("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
> >> persistent storage.
> >>
> >>>>> I think this is a seriously bad idea.
> >>>>>
> >>>>> Why can't we do this in userspace with setns as we do with the core dump
> >>>>> helper?
> >>>>>
> >>>>
> >>>> Could you, please, clarify, how setns can help here?
> >>>
> >>> setns can change the mount namespace, and chroot can change to root
> >>> directory in the specified mount namespace. Essentially you can enter
> >>> into a containers complete context (pid, mnt, root, etc) comming from
> >>> the outside.
> >>>
> >>
> >> So, you are actually suggesting to move the binary start from the kernel to user-space.
> >> IOW, you are suggesting to do not using UMH at all.
> >> Am I right?
> >> I don't know the reasons, why it was done by using UMH and not in userspace.
> >> Could you clarify this, Jeff?
> >>
> >
> > nfsdcltrack is a "one-shot" program for managing and querying the nfsd
> > client tracking database. When knfsd needs to query or modify the
> > db, it uses the UMH infrastructure to call this program that does
> > what's requested and then exits.
> >
> > So, I'm not sure I really understand your question. It wasn't done in
> > userspace since the whole purpose of this program is to handle upcalls
> > from the kernel.
> >
>
> The question is what was the reason to start this binary from kernel by UMH?

Manipulating and querying the client tracking database is an infrequent
event, so having a continuously running daemon is wasteful and means
that the admin has to ensure that it's running. A UMH upcall is much
simpler and generally "just works" if the program is present.

> I.e. why it can't be started by some user-space process before or after NFSd start?
> I don't familiar with this client tracking facility and that's the only reason why I'm asking.
>

This program is not a daemon that runs continuously. It's only called
when the kernel needs to manipulate the database. Are you asking
whether we could turn this into a continuously running daemon? If so
then the answer is "yes", but that's not really a good idea either.

In fact, we had that with the nfsdcld program, but no one liked it
(including me) for the reasons I detailed above.

--
Jeff Layton <[email protected]>

2013-05-23 11:59:06

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

23.05.2013 15:56, Jeff Layton пишет:
> On Thu, 23 May 2013 15:38:17 +0400
> Stanislav Kinsbursky <[email protected]> wrote:
>
>> 23.05.2013 15:31, Jeff Layton пишет:
>>> On Thu, 23 May 2013 14:35:53 +0400
>>> Stanislav Kinsbursky <[email protected]> wrote:
>>>
>>>> 23.05.2013 14:00, Eric W. Biederman пишет:
>>>>> Stanislav Kinsbursky <[email protected]> writes:
>>>>>
>>>>>> 22.05.2013 21:33, Eric W. Biederman пишет:
>>>>>>> Stanislav Kinsbursky <[email protected]> writes:
>>>>>>>
>>>>>>>> Usermode helper executes all binaries in global "init" root context. This
>>>>>>>> doesn't allow to call a binary from other root context (for example in a
>>>>>>>> container).
>>>>>>>> Currently, both containerized NFS client and NFS server requires an ability to
>>>>>>>> execute a binary in a container's root context. Root swap can be done in
>>>>>>>> "init" callback, passed by UMH caller.
>>>>>>>> But since we have 2 callers already (and more of them are expected to appear
>>>>>>>> in future) and because set_fs_root() in not exported, it looks reasonable to
>>>>>>>> add one more generic UMH helper to generic fs code.
>>>>>>>> Root path reference must be hold by the caller, since it will be put on UMH
>>>>>>>> thread exit.
>>>>>>>
>>>>>>> Awesome. With this patch as an uprivilieged user I get to pick which
>>>>>>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
>>>>>>> user namespace (something that looks like only matter of time).
>>>>>>>
>>>>>>
>>>>>> Not really. Only by using a kernel module to call the UMH.
>>>>>> And an unprivileged can't load a module as far a I know.
>>>>>> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
>>>>>
>>>>> To help me understand the context which instances of call user mode
>>>>> helper are you expecting to use this facility?
>>>>>
>>>>
>>>> Ok. Here is how the NFSd uses UMH:
>>>> UMH is used on NFSd service to start user-space client tracker daemon
>>>> ("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
>>>> persistent storage.
>>>>
>>>>>>> I think this is a seriously bad idea.
>>>>>>>
>>>>>>> Why can't we do this in userspace with setns as we do with the core dump
>>>>>>> helper?
>>>>>>>
>>>>>>
>>>>>> Could you, please, clarify, how setns can help here?
>>>>>
>>>>> setns can change the mount namespace, and chroot can change to root
>>>>> directory in the specified mount namespace. Essentially you can enter
>>>>> into a containers complete context (pid, mnt, root, etc) comming from
>>>>> the outside.
>>>>>
>>>>
>>>> So, you are actually suggesting to move the binary start from the kernel to user-space.
>>>> IOW, you are suggesting to do not using UMH at all.
>>>> Am I right?
>>>> I don't know the reasons, why it was done by using UMH and not in userspace.
>>>> Could you clarify this, Jeff?
>>>>
>>>
>>> nfsdcltrack is a "one-shot" program for managing and querying the nfsd
>>> client tracking database. When knfsd needs to query or modify the
>>> db, it uses the UMH infrastructure to call this program that does
>>> what's requested and then exits.
>>>
>>> So, I'm not sure I really understand your question. It wasn't done in
>>> userspace since the whole purpose of this program is to handle upcalls
>>> from the kernel.
>>>
>>
>> The question is what was the reason to start this binary from kernel by UMH?
>
> Manipulating and querying the client tracking database is an infrequent
> event, so having a continuously running daemon is wasteful and means
> that the admin has to ensure that it's running. A UMH upcall is much
> simpler and generally "just works" if the program is present.
>
>> I.e. why it can't be started by some user-space process before or after NFSd start?
>> I don't familiar with this client tracking facility and that's the only reason why I'm asking.
>>
>
> This program is not a daemon that runs continuously. It's only called
> when the kernel needs to manipulate the database. Are you asking
> whether we could turn this into a continuously running daemon? If so
> then the answer is "yes", but that's not really a good idea either.
>
> In fact, we had that with the nfsdcld program, but no one liked it
> (including me) for the reasons I detailed above.
>

No, I'm just asking to understand.
Eric was, actually, asking the same. I.e. how does NFSd uses UMH and why this can't be done in userspace?
Thanks you for your answer.

--
Best regards,
Stanislav Kinsbursky

2013-05-23 12:25:47

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On 23/05/13 14:58, Stanislav Kinsbursky wrote:
> 23.05.2013 15:56, Jeff Layton пишет:
>> On Thu, 23 May 2013 15:38:17 +0400
>> Stanislav Kinsbursky <[email protected]> wrote:
>>
>>> 23.05.2013 15:31, Jeff Layton пишет:
>>>> On Thu, 23 May 2013 14:35:53 +0400
>>>> Stanislav Kinsbursky <[email protected]> wrote:
>>>>
>>>>> 23.05.2013 14:00, Eric W. Biederman пишет:
>>>>>> Stanislav Kinsbursky <[email protected]> writes:
>>>>>>
>>>>>>> 22.05.2013 21:33, Eric W. Biederman пишет:
>>>>>>>> Stanislav Kinsbursky <[email protected]> writes:
>>>>>>>>
>>>>>>>>> Usermode helper executes all binaries in global "init" root context. This
>>>>>>>>> doesn't allow to call a binary from other root context (for example in a
>>>>>>>>> container).
>>>>>>>>> Currently, both containerized NFS client and NFS server requires an ability to
>>>>>>>>> execute a binary in a container's root context. Root swap can be done in
>>>>>>>>> "init" callback, passed by UMH caller.
>>>>>>>>> But since we have 2 callers already (and more of them are expected to appear
>>>>>>>>> in future) and because set_fs_root() in not exported, it looks reasonable to
>>>>>>>>> add one more generic UMH helper to generic fs code.
>>>>>>>>> Root path reference must be hold by the caller, since it will be put on UMH
>>>>>>>>> thread exit.
>>>>>>>>
>>>>>>>> Awesome. With this patch as an uprivilieged user I get to pick which
>>>>>>>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
>>>>>>>> user namespace (something that looks like only matter of time).
>>>>>>>>
>>>>>>>
>>>>>>> Not really. Only by using a kernel module to call the UMH.
>>>>>>> And an unprivileged can't load a module as far a I know.
>>>>>>> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
>>>>>>
>>>>>> To help me understand the context which instances of call user mode
>>>>>> helper are you expecting to use this facility?
>>>>>>
>>>>>
>>>>> Ok. Here is how the NFSd uses UMH:
>>>>> UMH is used on NFSd service to start user-space client tracker daemon
>>>>> ("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
>>>>> persistent storage.
>>>>>
>>>>>>>> I think this is a seriously bad idea.
>>>>>>>>
>>>>>>>> Why can't we do this in userspace with setns as we do with the core dump
>>>>>>>> helper?
>>>>>>>>
>>>>>>>
>>>>>>> Could you, please, clarify, how setns can help here?
>>>>>>
>>>>>> setns can change the mount namespace, and chroot can change to root
>>>>>> directory in the specified mount namespace. Essentially you can enter
>>>>>> into a containers complete context (pid, mnt, root, etc) comming from
>>>>>> the outside.
>>>>>>
>>>>>
>>>>> So, you are actually suggesting to move the binary start from the kernel to user-space.
>>>>> IOW, you are suggesting to do not using UMH at all.
>>>>> Am I right?
>>>>> I don't know the reasons, why it was done by using UMH and not in userspace.
>>>>> Could you clarify this, Jeff?
>>>>>
>>>>
>>>> nfsdcltrack is a "one-shot" program for managing and querying the nfsd
>>>> client tracking database. When knfsd needs to query or modify the
>>>> db, it uses the UMH infrastructure to call this program that does
>>>> what's requested and then exits.
>>>>
>>>> So, I'm not sure I really understand your question. It wasn't done in
>>>> userspace since the whole purpose of this program is to handle upcalls
>>>> from the kernel.
>>>>
>>>
>>> The question is what was the reason to start this binary from kernel by UMH?
>>
>> Manipulating and querying the client tracking database is an infrequent
>> event, so having a continuously running daemon is wasteful and means
>> that the admin has to ensure that it's running. A UMH upcall is much
>> simpler and generally "just works" if the program is present.
>>
>>> I.e. why it can't be started by some user-space process before or after NFSd start?
>>> I don't familiar with this client tracking facility and that's the only reason why I'm asking.
>>>
>>
>> This program is not a daemon that runs continuously. It's only called
>> when the kernel needs to manipulate the database. Are you asking
>> whether we could turn this into a continuously running daemon? If so
>> then the answer is "yes", but that's not really a good idea either.
>>
>> In fact, we had that with the nfsdcld program, but no one liked it
>> (including me) for the reasons I detailed above.
>>
>
> No, I'm just asking to understand.
> Eric was, actually, asking the same. I.e. how does NFSd uses UMH and why this can't be done in userspace?
> Thanks you for your answer.
>

I'm not familiar with nfsdcltrack but I would imagine it receives it's information from
Kernel as a command line parameters.

Would it not be the simplest approach to add a --chroot=/path/to/root optional
parameter to nfsdcltrack so it should access an alternate DB relative to
--chroot.

This would address Eric's concern of not executing user-privileged executable
from Kernel. I think

Just my $0.017
Boaz

2013-05-23 13:06:29

by Jeff Layton

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Thu, 23 May 2013 15:25:20 +0300
Boaz Harrosh <[email protected]> wrote:

> On 23/05/13 14:58, Stanislav Kinsbursky wrote:
> > 23.05.2013 15:56, Jeff Layton пишет:
> >> On Thu, 23 May 2013 15:38:17 +0400
> >> Stanislav Kinsbursky <[email protected]> wrote:
> >>
> >>> 23.05.2013 15:31, Jeff Layton пишет:
> >>>> On Thu, 23 May 2013 14:35:53 +0400
> >>>> Stanislav Kinsbursky <[email protected]> wrote:
> >>>>
> >>>>> 23.05.2013 14:00, Eric W. Biederman пишет:
> >>>>>> Stanislav Kinsbursky <[email protected]> writes:
> >>>>>>
> >>>>>>> 22.05.2013 21:33, Eric W. Biederman пишет:
> >>>>>>>> Stanislav Kinsbursky <[email protected]> writes:
> >>>>>>>>
> >>>>>>>>> Usermode helper executes all binaries in global "init" root context. This
> >>>>>>>>> doesn't allow to call a binary from other root context (for example in a
> >>>>>>>>> container).
> >>>>>>>>> Currently, both containerized NFS client and NFS server requires an ability to
> >>>>>>>>> execute a binary in a container's root context. Root swap can be done in
> >>>>>>>>> "init" callback, passed by UMH caller.
> >>>>>>>>> But since we have 2 callers already (and more of them are expected to appear
> >>>>>>>>> in future) and because set_fs_root() in not exported, it looks reasonable to
> >>>>>>>>> add one more generic UMH helper to generic fs code.
> >>>>>>>>> Root path reference must be hold by the caller, since it will be put on UMH
> >>>>>>>>> thread exit.
> >>>>>>>>
> >>>>>>>> Awesome. With this patch as an uprivilieged user I get to pick which
> >>>>>>>> binary the kernel will execute. At least if nfs and nfsd ever runs in a
> >>>>>>>> user namespace (something that looks like only matter of time).
> >>>>>>>>
> >>>>>>>
> >>>>>>> Not really. Only by using a kernel module to call the UMH.
> >>>>>>> And an unprivileged can't load a module as far a I know.
> >>>>>>> I.e. NFSd, for example, will use unprivileged user's root to perform this call.
> >>>>>>
> >>>>>> To help me understand the context which instances of call user mode
> >>>>>> helper are you expecting to use this facility?
> >>>>>>
> >>>>>
> >>>>> Ok. Here is how the NFSd uses UMH:
> >>>>> UMH is used on NFSd service to start user-space client tracker daemon
> >>>>> ("/sbin/nfsdcltarck"), which is used to store some per-client locks data on
> >>>>> persistent storage.
> >>>>>
> >>>>>>>> I think this is a seriously bad idea.
> >>>>>>>>
> >>>>>>>> Why can't we do this in userspace with setns as we do with the core dump
> >>>>>>>> helper?
> >>>>>>>>
> >>>>>>>
> >>>>>>> Could you, please, clarify, how setns can help here?
> >>>>>>
> >>>>>> setns can change the mount namespace, and chroot can change to root
> >>>>>> directory in the specified mount namespace. Essentially you can enter
> >>>>>> into a containers complete context (pid, mnt, root, etc) comming from
> >>>>>> the outside.
> >>>>>>
> >>>>>
> >>>>> So, you are actually suggesting to move the binary start from the kernel to user-space.
> >>>>> IOW, you are suggesting to do not using UMH at all.
> >>>>> Am I right?
> >>>>> I don't know the reasons, why it was done by using UMH and not in userspace.
> >>>>> Could you clarify this, Jeff?
> >>>>>
> >>>>
> >>>> nfsdcltrack is a "one-shot" program for managing and querying the nfsd
> >>>> client tracking database. When knfsd needs to query or modify the
> >>>> db, it uses the UMH infrastructure to call this program that does
> >>>> what's requested and then exits.
> >>>>
> >>>> So, I'm not sure I really understand your question. It wasn't done in
> >>>> userspace since the whole purpose of this program is to handle upcalls
> >>>> from the kernel.
> >>>>
> >>>
> >>> The question is what was the reason to start this binary from kernel by UMH?
> >>
> >> Manipulating and querying the client tracking database is an infrequent
> >> event, so having a continuously running daemon is wasteful and means
> >> that the admin has to ensure that it's running. A UMH upcall is much
> >> simpler and generally "just works" if the program is present.
> >>
> >>> I.e. why it can't be started by some user-space process before or after NFSd start?
> >>> I don't familiar with this client tracking facility and that's the only reason why I'm asking.
> >>>
> >>
> >> This program is not a daemon that runs continuously. It's only called
> >> when the kernel needs to manipulate the database. Are you asking
> >> whether we could turn this into a continuously running daemon? If so
> >> then the answer is "yes", but that's not really a good idea either.
> >>
> >> In fact, we had that with the nfsdcld program, but no one liked it
> >> (including me) for the reasons I detailed above.
> >>
> >
> > No, I'm just asking to understand.
> > Eric was, actually, asking the same. I.e. how does NFSd uses UMH and why this can't be done in userspace?
> > Thanks you for your answer.
> >
>
> I'm not familiar with nfsdcltrack but I would imagine it receives it's information from
> Kernel as a command line parameters.
>
> Would it not be the simplest approach to add a --chroot=/path/to/root optional
> parameter to nfsdcltrack so it should access an alternate DB relative to
> --chroot.
>
> This would address Eric's concern of not executing user-privileged executable
> from Kernel. I think
>
> Just my $0.017
> Boaz
>

I think that sounds reasonable. Is it always the case
that /path/to/root is reachable from the "primary" namespace? If not,
you may need to do something more exotic there.

Also, do you have to do anything like change the uid/gid to a different
user who is root within the container?

What might help most here is to lay out a particular scenario for how
you envision setting up knfsd in a container so we can ensure that it's
addressed properly by whatever solution you settle on.

--
Jeff Layton <[email protected]>

2013-05-23 19:06:38

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Wed, May 22, 2013 at 08:37:23PM -0700, Eric W. Biederman wrote:
> "J. Bruce Fields" <[email protected]> writes:
>
> > On Wed, May 22, 2013 at 11:35:56AM -0700, Eric W. Biederman wrote:
> >> [email protected] (Eric W. Biederman) writes:
> >>
> >> > I am missing a lot of context here and capturing the context of a
> >> > process at time time we mount the filesystem and reconstituing it in
> >> > call user mode helper seems like something we could do.
> >
> > So it's not enough just to ensure that the user namespace is set
> > correctly? (To the namespace of the mount process in the nfs case, or
> > of the process that starts nfsd in the nfsd case.)
>
> No. By just setting the user namespace, you gain the ability to create
> processes outside of your curremt rlimits, and cgroups, and pid
> namespace, etc.
>
> With the wrong set of namespaces you will talk to the wrong processes,
> or utilize the wrong resources.
>
> Outside of your current cgroup you gain the ability to use more
> resources than you were limited to.
>
> Having thought about it what I would propose is to fork a process from
> the context of the mounting process (or possibly from the context of the
> process that writes to the sysctl that sets the program to spawn) and
> have that process be a daemon that becomes responsible for spawning user
> mode helpers with context. Either that or whe need the user mode helper
> userspace infrastructure to become namespace aware and call setns.
>
> >> If we want to do something like this the only sane thing I can see is to
> >> have a per container version of kthread call it uthread. That the user
> >> mode helper code would use to launch a new process.
> >>
> >> Anything else and I expect we will be tearing our hair out for the rest
> >> of our lives with weird corner cases or unexpected semantics.
> >
> > Could you give examples of weird corner cases or unexpected semantics?
>
> I gave a couple and it is the classic problem with suid executables
> where a lot of unexpected things are inherited from the environment, and
> so things become a never ending race. We replaced daemonize in the
> kernel with just forking processes from kthreadd for this very reason.
> There was always something else that needed to be reset to make a
> process a proper kernel thread.

Got it, thanks for the explanation.

--b.

2013-05-23 19:56:24

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Thu, May 23, 2013 at 09:05:26AM -0400, Jeff Layton wrote:
> On Thu, 23 May 2013 15:25:20 +0300
> > I'm not familiar with nfsdcltrack but I would imagine it receives it's information from
> > Kernel as a command line parameters.
> >
> > Would it not be the simplest approach to add a --chroot=/path/to/root optional
> > parameter to nfsdcltrack so it should access an alternate DB relative to
> > --chroot.
> >
> > This would address Eric's concern of not executing user-privileged executable
> > from Kernel. I think
> >
> > Just my $0.017
> > Boaz
> >
>
> I think that sounds reasonable. Is it always the case
> that /path/to/root is reachable from the "primary" namespace?

I don't think we can assume that.

> If not, you may need to do something more exotic there.

We should be able to pass a file descriptor and then work relative to
that.

> Also, do you have to do anything like change the uid/gid to a different
> user who is root within the container?

Yeah, you may need to create files, for example, right?

> What might help most here is to lay out a particular scenario for how
> you envision setting up knfsd in a container so we can ensure that it's
> addressed properly by whatever solution you settle on.

It would seem cleaner to me the less userspace has to understand about
containers--ideally someone could run a general-purpose distro with its
nfs-utils in a container and have nfs and nfsd just work.

So I'd like to understand whether it is feasible to spawn helpers from a
thread that's descended from whoever started nfsd (or whatever the
proper ancestor is).

(And, what about the nfsd threads themselves? If we're going to allow
unprivileged users to start nfsd, then we probably want the nfsd threads
to inherit from the user somehow, don't we?)

As I understand it recent clients use request_key to do idmapping. I
don't understand that (or keyrings) well. How should they work? I
would have expected that you'd want a separate request-key for each
container rather than a single request-key working on behalf of all
containers.

--b.

2013-05-23 20:15:06

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

On Thu, May 23, 2013 at 03:55:47PM -0400, J. Bruce Fields wrote:
> On Thu, May 23, 2013 at 09:05:26AM -0400, Jeff Layton wrote:
> > What might help most here is to lay out a particular scenario for how
> > you envision setting up knfsd in a container so we can ensure that it's
> > addressed properly by whatever solution you settle on.

BTW the problem I have here is that the only case I've personally had
any interest in is using network and file namespaces to isolate nfsd's
to make them safe to migrate across nodes of a cluster.

So while the idea of making user namespaces and unprivileged knfsd and
the rest work is really interesting and I'm happy to think about it, I'm
not sure how feasible or useful it is.

I'd therefore actually prefer just to take something like Stanislav's
patch now and put off the problem of how to make it work correctly with
user namespaces until we actually turn that on. His patch fixes a real
bug that we have now, while user-namespaced-nfsd still sounds a bit
pie-in-the-sky to me.


But maybe I don't understand why Eric thinks nfsd in usernamespaces is
imminent. Or maybe I'm missing some security problem that Stanislav's
patch would introduce now without allowing nfsd to run in a user
namespace.

--b.

2013-05-23 21:33:15

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

"J. Bruce Fields" <[email protected]> writes:

> On Thu, May 23, 2013 at 03:55:47PM -0400, J. Bruce Fields wrote:
>> On Thu, May 23, 2013 at 09:05:26AM -0400, Jeff Layton wrote:
>> > What might help most here is to lay out a particular scenario for how
>> > you envision setting up knfsd in a container so we can ensure that it's
>> > addressed properly by whatever solution you settle on.
>
> BTW the problem I have here is that the only case I've personally had
> any interest in is using network and file namespaces to isolate nfsd's
> to make them safe to migrate across nodes of a cluster.
>
> So while the idea of making user namespaces and unprivileged knfsd and
> the rest work is really interesting and I'm happy to think about it, I'm
> not sure how feasible or useful it is.
>
> I'd therefore actually prefer just to take something like Stanislav's
> patch now and put off the problem of how to make it work correctly with
> user namespaces until we actually turn that on. His patch fixes a real
> bug that we have now, while user-namespaced-nfsd still sounds a bit
> pie-in-the-sky to me.
>
>
> But maybe I don't understand why Eric thinks nfsd in usernamespaces is
> imminent. Or maybe I'm missing some security problem that Stanislav's
> patch would introduce now without allowing nfsd to run in a user
> namespace.

The problem I saw is less pronounced but I think actually exists without
user namespaces as well. In particular if we let root in the container
start knfsd in a network and mount namespace. Then if that container
does not have certain capabilities like say CAP_MKNOD all of a sudden
we have a process running in the container with CAP_MKNOD.

I haven't had time to look at everything just yet but I don't think
solving this is particularly hard.

There are really two very simple solutions.
1) When we start knfsd in the container we create a kernel thread that
is a child of the userspace process that started knfsd. That kernel
thread can then be used to launch user mode helpers.

This uses the same code as is needed today to launch user mode
helpers with just a different parent thread.

2) We pass enough information for the user mode helper to figure out
what container this is all for. File descriptors or whatever.
Then the user mode helper outside the container using chdir and setns
sets up the environment that the user mode helper typically expects
and runs the process inside of the container.

Or we look at what the user mode helper is doing and realize we don't
need to run the user mode binary inside of the container. If all it
does is query /etc/passwd for username to uid mapping for example (for
user namespaces we will probably care but not without them).

I don't think any of this is hard to implement.

I think user namespaces are imminent because after my last pass through
the code the remaining work looked pretty trivial, and as soon as the
dust settles I expect user namespaces become the common way to run code
in containers, which should greatly increase the demand for this feature
in user namespaces.

Eric




2013-05-24 05:45:10

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

23.05.2013 23:55, J. Bruce Fields пишет:
> On Thu, May 23, 2013 at 09:05:26AM -0400, Jeff Layton wrote:
>> On Thu, 23 May 2013 15:25:20 +0300
>>> I'm not familiar with nfsdcltrack but I would imagine it receives it's information from
>>> Kernel as a command line parameters.
>>>
>>> Would it not be the simplest approach to add a --chroot=/path/to/root optional
>>> parameter to nfsdcltrack so it should access an alternate DB relative to
>>> --chroot.
>>>
>>> This would address Eric's concern of not executing user-privileged executable
>>> from Kernel. I think
>>>
>>> Just my $0.017
>>> Boaz
>>>
>>
>> I think that sounds reasonable. Is it always the case
>> that /path/to/root is reachable from the "primary" namespace?
>
> I don't think we can assume that.
>

Yes, we can't. For example in case of different mount namespaces.

>> If not, you may need to do something more exotic there.
>
> We should be able to pass a file descriptor and then work relative to
> that.
>

We can't do this either.
Moreover, passing a file descriptor is something, that solves (?) completely different problem.
Imagine the following:
1) We have a host, based on, say RHEL6, which nfs-utils has doesn't have "/sbin/nfsdcltrack" and all.
2) And we have a container in it, based on, say, Fedora-19, which nfs-utils has this binary.

In case of starting NFSd in Fedora CT, we won't be able to execute the desired binary without root swapping.
Because we won't be able to even lookup it in the host file system.

So, as I said previously, the main problem here is not how to modify the userspace binary, but how to lookup and execute the right (!) one.
And I don't see, how we can do this (simple enough) without root swap.


--
Best regards,
Stanislav Kinsbursky

2013-05-24 06:04:39

by Stanislav Kinsbursky

[permalink] [raw]
Subject: Re: [RFC PATCH] fs: call_usermodehelper_root helper introduced

24.05.2013 01:32, Eric W. Biederman ?????:
> "J. Bruce Fields" <[email protected]> writes:
>
>> On Thu, May 23, 2013 at 03:55:47PM -0400, J. Bruce Fields wrote:
>>> On Thu, May 23, 2013 at 09:05:26AM -0400, Jeff Layton wrote:
>>>> What might help most here is to lay out a particular scenario for how
>>>> you envision setting up knfsd in a container so we can ensure that it's
>>>> addressed properly by whatever solution you settle on.
>>
>> BTW the problem I have here is that the only case I've personally had
>> any interest in is using network and file namespaces to isolate nfsd's
>> to make them safe to migrate across nodes of a cluster.
>>
>> So while the idea of making user namespaces and unprivileged knfsd and
>> the rest work is really interesting and I'm happy to think about it, I'm
>> not sure how feasible or useful it is.
>>
>> I'd therefore actually prefer just to take something like Stanislav's
>> patch now and put off the problem of how to make it work correctly with
>> user namespaces until we actually turn that on. His patch fixes a real
>> bug that we have now, while user-namespaced-nfsd still sounds a bit
>> pie-in-the-sky to me.
>>
>>
>> But maybe I don't understand why Eric thinks nfsd in usernamespaces is
>> imminent. Or maybe I'm missing some security problem that Stanislav's
>> patch would introduce now without allowing nfsd to run in a user
>> namespace.
>
> The problem I saw is less pronounced but I think actually exists without
> user namespaces as well. In particular if we let root in the container
> start knfsd in a network and mount namespace. Then if that container
> does not have certain capabilities like say CAP_MKNOD all of a sudden
> we have a process running in the container with CAP_MKNOD.
>
> I haven't had time to look at everything just yet but I don't think
> solving this is particularly hard.
>
> There are really two very simple solutions.
> 1) When we start knfsd in the container we create a kernel thread that
> is a child of the userspace process that started knfsd. That kernel
> thread can then be used to launch user mode helpers.
>
> This uses the same code as is needed today to launch user mode
> helpers with just a different parent thread.
>

This might work for us.
but one of the reasons I'm worrying about is that the spawned kernel thread will
have the same capabilities as the process, calling for NFSd start.
And this capabilities can be insufficient for running "/sbin/nfsdcltrack" binary.

> 2) We pass enough information for the user mode helper to figure out
> what container this is all for. File descriptors or whatever.
> Then the user mode helper outside the container using chdir and setns
> sets up the environment that the user mode helper typically expects
> and runs the process inside of the container.
>

How does chdir and setns differs from doing set_fs_root() in UMH in terms of
security and isolation?

> Or we look at what the user mode helper is doing and realize we don't
> need to run the user mode binary inside of the container. If all it
> does is query /etc/passwd for username to uid mapping for example (for
> user namespaces we will probably care but not without them).
>
> I don't think any of this is hard to implement.
>
> I think user namespaces are imminent because after my last pass through
> the code the remaining work looked pretty trivial, and as soon as the
> dust settles I expect user namespaces become the common way to run code
> in containers, which should greatly increase the demand for this feature
> in user namespaces.
>
> Eric
>
>
>
>
>


--
Best regards,
Stanislav Kinsbursky