2011-02-01 18:17:25

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

What are we thinking? Any suggestions how to do what we need other than

global bounding such that pP' = gbset & (fI | pI)

Or an interface in which I can force things out of the bset and pI of
other tasks? Possibly the interface could be specific to the "khelper"
thread?

-Eric


2011-02-01 21:26:14

by Serge Hallyn

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

Quoting Eric Paris ([email protected]):
> What are we thinking? Any suggestions how to do what we need other than
>
> global bounding such that pP' = gbset & (fI | pI)

That should be sufficient for what you want.

I would however like to hear whether Andrew has had any other ideas
given the broader picture.

> Or an interface in which I can force things out of the bset and pI of
> other tasks? Possibly the interface could be specific to the "khelper"
> thread?

No no no no no :)

thanks,
-serge

2011-02-02 04:02:40

by Andrew G. Morgan

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

On Tue, Feb 1, 2011 at 1:26 PM, Serge E. Hallyn
<[email protected]> wrote:
> Quoting Eric Paris ([email protected]):
>> What are we thinking? ?Any suggestions how to do what we need other than
>>
>> global bounding such that ?pP' = gbset & (fI | pI)
>
> That should be sufficient for what you want.
>
> I would however like to hear whether Andrew has had any other ideas
> given the broader picture.
>

I think I now see what you are after.

You want some sort of transient TCB that can lock all of the doors you
care to lock and then run the whole system in a partially crippled
sandbox.

I have some concerns about how you know you have truly locked down the
system and question the viability of a VM that doesn't virtualize IO
too, but presumably you have some way to protect the storage of the
kernel binary and initrd that cannot be overcome, and protections from
DMA etc. being used by the guest to to overwrite kernel memory.

In this case, I would like to suggest what you need is a user
configurable state for kernel threads to launch helper programs - a
kernel side equivalent to Sergey's wrapper idea. I continue to
dislike the global bounding set idea, but I would support a base
credential set for this kthread launcher. I'd include pI, bset, and
securebits and uid as something your initrd could initialize away from
their default values for kernel launched helper binaries. I'd prefer
it if you allowed the regular capability convolution rules to apply
and propagate this bounding set for all kernel launched binaries, and
also add the relevant code to init to enforce your desired bounding
set for init parented processes.

This way you will both meet your current needs and also maintain
support for a capability managed 'raw' kernel experience with no
asynchronous capability manipulation system-wide.

Cheers

Andrew


>> Or an interface in which I can force things out of the bset and pI of
>> other tasks? ?Possibly the interface could be specific to the "khelper"
>> thread?
>
> No no no no no :)
>
> thanks,
> -serge
>

2011-02-08 02:56:29

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

On Tue, 2011-02-01 at 20:02 -0800, Andrew G. Morgan wrote:

> In this case, I would like to suggest what you need is a user
> configurable state for kernel threads to launch helper programs - a
> kernel side equivalent to Sergey's wrapper idea. I continue to
> dislike the global bounding set idea, but I would support a base
> credential set for this kthread launcher. I'd include pI, bset, and
> securebits and uid as something your initrd could initialize away from
> their default values for kernel launched helper binaries. I'd prefer
> it if you allowed the regular capability convolution rules to apply
> and propagate this bounding set for all kernel launched binaries, and
> also add the relevant code to init to enforce your desired bounding
> set for init parented processes.


> >> Or an interface in which I can force things out of the bset and pI of
> >> other tasks? Possibly the interface could be specific to the "khelper"
> >> thread?
> >
> > No no no no no :)

Below is what I'm working on. I've asked dhowells to review the creds
code, since commit_creds() does not take const. Maybe that's just an
oversight. Basically I've exposed two new sysctls.

/proc/sys/kernel/usermodehelper/bset
/proc/sys/kernel/usermodehelper/inheritable

You must have CAP_SYS_MODULE to change these (changes are &= ONLY).
When the kernel launches a usermodehelper it will do so with these as
the bset and pI. I haven't attempted securebits and uid (since I didn't
really need them I don't think) But will if anyone can think of a use
case.

Is this what you were thinking?

-Eric

commit 23dbb6813349509a463103a34b51b18182c2ca0f
Author: Eric Paris <[email protected]>
Date: Mon Feb 7 21:39:58 2011 -0500

limit kthreads usermode helper words stuff

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 6efd7a7..79bb98d 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -24,6 +24,7 @@
#include <linux/errno.h>
#include <linux/compiler.h>
#include <linux/workqueue.h>
+#include <linux/sysctl.h>

#define KMOD_PATH_LEN 256

@@ -109,6 +110,8 @@ call_usermodehelper(char *path, char **argv, char **envp, enum umh_wait wait)
NULL, NULL, NULL);
}

+extern struct ctl_table usermodehelper_table[];
+
extern void usermodehelper_init(void);

extern int usermodehelper_disable(void);
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 9cd0591..d38be14 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -25,6 +25,7 @@
#include <linux/kmod.h>
#include <linux/slab.h>
#include <linux/completion.h>
+#include <linux/cred.h>
#include <linux/file.h>
#include <linux/fdtable.h>
#include <linux/workqueue.h>
@@ -43,6 +44,12 @@ extern int max_threads;

static struct workqueue_struct *khelper_wq;

+#define CAP_BSET (void *)1
+#define CAP_PI (void *)2
+
+kernel_cap_t usermodehelper_bset = CAP_FULL_SET;
+kernel_cap_t usermodehelper_inheritable = CAP_FULL_SET;
+
#ifdef CONFIG_MODULES

/*
@@ -132,6 +139,8 @@ EXPORT_SYMBOL(__request_module);
static int ____call_usermodehelper(void *data)
{
struct subprocess_info *sub_info = data;
+ const struct cred *old;
+ struct cred *new;
int retval;

spin_lock_irq(&current->sighand->siglock);
@@ -153,10 +162,23 @@ static int ____call_usermodehelper(void *data)
goto fail;
}

+ retval = -ENOMEM;
+ new = prepare_kernel_cred(current);
+ if (!new)
+ goto fail;
+
+ new->cap_bset = usermodehelper_bset;
+ new->cap_inheritable = usermodehelper_inheritable;
+
+ old = get_cred(current_cred());
+ commit_creds(new);
+
retval = kernel_execve(sub_info->path,
(const char *const *)sub_info->argv,
(const char *const *)sub_info->envp);

+ commit_creds(old);
+
/* Exec failed? */
fail:
sub_info->retval = retval;
@@ -418,6 +440,79 @@ unlock:
}
EXPORT_SYMBOL(call_usermodehelper_exec);

+static int proc_cap_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ unsigned long cap_array[_KERNEL_CAPABILITY_U32S];
+ kernel_cap_t new_cap;
+ int err, i;
+
+ if (write && !capable(CAP_SYS_MODULE))
+ return -EPERM;
+
+ /*
+ * convert from the global kernel_cap_t to the ulong array to print to
+ * userspace if this is a read.
+ */
+ for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
+ if (table->data == CAP_BSET)
+ cap_array[i] = usermodehelper_bset.cap[i];
+ else if (table->data == CAP_PI)
+ cap_array[i] = usermodehelper_inheritable.cap[i];
+ else
+ BUG();
+ }
+
+ t = *table;
+ t.data = &cap_array;
+
+ /*
+ * actually read or write and array of ulongs from userspace. Remember
+ * these are least significant 32 bits first
+ */
+ err = proc_doulongvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+
+ /*
+ * convert from the sysctl array of ulongs to the kernel_cap_t
+ * internal representation
+ */
+ for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++)
+ new_cap.cap[i] = cap_array[i];
+
+ /*
+ * Drop everything not in the new_cap (but don't add things)
+ */
+ if (write) {
+ if (table->data == CAP_BSET)
+ usermodehelper_bset = cap_intersect(usermodehelper_bset, new_cap);
+ if (table->data == CAP_PI)
+ usermodehelper_inheritable = cap_intersect(usermodehelper_inheritable, new_cap);
+ }
+
+ return 0;
+}
+
+struct ctl_table usermodehelper_table[] = {
+ {
+ .procname = "bset",
+ .data = CAP_BSET,
+ .maxlen = _KERNEL_CAPABILITY_U32S * sizeof(unsigned long),
+ .mode = 0600,
+ .proc_handler = proc_cap_handler,
+ },
+ {
+ .procname = "inheritable",
+ .data = CAP_PI,
+ .maxlen = _KERNEL_CAPABILITY_U32S * sizeof(unsigned long),
+ .mode = 0600,
+ .proc_handler = proc_cap_handler,
+ },
+ { }
+};
+
void __init usermodehelper_init(void)
{
khelper_wq = create_singlethread_workqueue("khelper");
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 0f1bd83..099d2e2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -56,6 +56,7 @@
#include <linux/kprobes.h>
#include <linux/pipe_fs_i.h>
#include <linux/oom.h>
+#include <linux/kmod.h>

#include <asm/uaccess.h>
#include <asm/processor.h>
@@ -617,6 +618,11 @@ static struct ctl_table kern_table[] = {
.child = random_table,
},
{
+ .procname = "usermodehelper",
+ .mode = 0555,
+ .child = usermodehelper_table,
+ },
+ {
.procname = "overflowuid",
.data = &overflowuid,
.maxlen = sizeof(int),

2011-02-14 20:45:50

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

On Mon, Feb 7, 2011 at 9:55 PM, Eric Paris <[email protected]> wrote:
>
> Below is what I'm working on. ?I've asked dhowells to review the creds
> code, since commit_creds() does not take const. ?Maybe that's just an
> oversight. ?Basically I've exposed two new sysctls.
>
> /proc/sys/kernel/usermodehelper/bset
> /proc/sys/kernel/usermodehelper/inheritable
>
> You must have CAP_SYS_MODULE to change these (changes are &= ONLY).
> When the kernel launches a usermodehelper it will do so with these as
> the bset and pI. ?I haven't attempted securebits and uid (since I didn't
> really need them I don't think) ?But will if anyone can think of a use
> case.
>
> Is this what you were thinking?


Anything? Problems with this patch/approach?

-Eric

2011-02-14 21:25:05

by Serge Hallyn

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

Quoting Eric Paris ([email protected]):
> On Mon, Feb 7, 2011 at 9:55 PM, Eric Paris <[email protected]> wrote:
> >
> > Below is what I'm working on. ?I've asked dhowells to review the creds
> > code, since commit_creds() does not take const. ?Maybe that's just an
> > oversight. ?Basically I've exposed two new sysctls.
> >
> > /proc/sys/kernel/usermodehelper/bset
> > /proc/sys/kernel/usermodehelper/inheritable
> >
> > You must have CAP_SYS_MODULE to change these (changes are &= ONLY).
> > When the kernel launches a usermodehelper it will do so with these as
> > the bset and pI. ?I haven't attempted securebits and uid (since I didn't
> > really need them I don't think) ?But will if anyone can think of a use
> > case.
> >
> > Is this what you were thinking?
>
>
> Anything? Problems with this patch/approach?

Sorry, I've just not had a chance to take a close enough look. I'll
try to do so tonight.

-serge

2011-02-18 00:29:17

by Serge Hallyn

[permalink] [raw]
Subject: Re: [PATCH] System Wide Capability Bounding Set

Quoting Eric Paris ([email protected]):
> On Tue, 2011-02-01 at 20:02 -0800, Andrew G. Morgan wrote:
>
> > In this case, I would like to suggest what you need is a user
> > configurable state for kernel threads to launch helper programs - a
> > kernel side equivalent to Sergey's wrapper idea. I continue to
> > dislike the global bounding set idea, but I would support a base
> > credential set for this kthread launcher. I'd include pI, bset, and
> > securebits and uid as something your initrd could initialize away from
> > their default values for kernel launched helper binaries. I'd prefer
> > it if you allowed the regular capability convolution rules to apply
> > and propagate this bounding set for all kernel launched binaries, and
> > also add the relevant code to init to enforce your desired bounding
> > set for init parented processes.
>
>
> > >> Or an interface in which I can force things out of the bset and pI of
> > >> other tasks? Possibly the interface could be specific to the "khelper"
> > >> thread?
> > >
> > > No no no no no :)
>
> Below is what I'm working on. I've asked dhowells to review the creds
> code, since commit_creds() does not take const. Maybe that's just an
> oversight. Basically I've exposed two new sysctls.
>
> /proc/sys/kernel/usermodehelper/bset
> /proc/sys/kernel/usermodehelper/inheritable
>
> You must have CAP_SYS_MODULE to change these (changes are &= ONLY).
> When the kernel launches a usermodehelper it will do so with these as
> the bset and pI. I haven't attempted securebits and uid (since I didn't
> really need them I don't think) But will if anyone can think of a use
> case.
>
> Is this what you were thinking?

Sorry about the wait.

No objection from me.

If someone ends up wanting to do more generic security context tweaking
we can worry about that later.

thanks,
-serge

> -Eric
>
> commit 23dbb6813349509a463103a34b51b18182c2ca0f
> Author: Eric Paris <[email protected]>
> Date: Mon Feb 7 21:39:58 2011 -0500
>
> limit kthreads usermode helper words stuff
>
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 6efd7a7..79bb98d 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -24,6 +24,7 @@
> #include <linux/errno.h>
> #include <linux/compiler.h>
> #include <linux/workqueue.h>
> +#include <linux/sysctl.h>
>
> #define KMOD_PATH_LEN 256
>
> @@ -109,6 +110,8 @@ call_usermodehelper(char *path, char **argv, char **envp, enum umh_wait wait)
> NULL, NULL, NULL);
> }
>
> +extern struct ctl_table usermodehelper_table[];
> +
> extern void usermodehelper_init(void);
>
> extern int usermodehelper_disable(void);
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 9cd0591..d38be14 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -25,6 +25,7 @@
> #include <linux/kmod.h>
> #include <linux/slab.h>
> #include <linux/completion.h>
> +#include <linux/cred.h>
> #include <linux/file.h>
> #include <linux/fdtable.h>
> #include <linux/workqueue.h>
> @@ -43,6 +44,12 @@ extern int max_threads;
>
> static struct workqueue_struct *khelper_wq;
>
> +#define CAP_BSET (void *)1
> +#define CAP_PI (void *)2
> +
> +kernel_cap_t usermodehelper_bset = CAP_FULL_SET;
> +kernel_cap_t usermodehelper_inheritable = CAP_FULL_SET;
> +
> #ifdef CONFIG_MODULES
>
> /*
> @@ -132,6 +139,8 @@ EXPORT_SYMBOL(__request_module);
> static int ____call_usermodehelper(void *data)
> {
> struct subprocess_info *sub_info = data;
> + const struct cred *old;
> + struct cred *new;
> int retval;
>
> spin_lock_irq(&current->sighand->siglock);
> @@ -153,10 +162,23 @@ static int ____call_usermodehelper(void *data)
> goto fail;
> }
>
> + retval = -ENOMEM;
> + new = prepare_kernel_cred(current);
> + if (!new)
> + goto fail;
> +
> + new->cap_bset = usermodehelper_bset;
> + new->cap_inheritable = usermodehelper_inheritable;
> +
> + old = get_cred(current_cred());
> + commit_creds(new);
> +
> retval = kernel_execve(sub_info->path,
> (const char *const *)sub_info->argv,
> (const char *const *)sub_info->envp);
>
> + commit_creds(old);
> +
> /* Exec failed? */
> fail:
> sub_info->retval = retval;
> @@ -418,6 +440,79 @@ unlock:
> }
> EXPORT_SYMBOL(call_usermodehelper_exec);
>
> +static int proc_cap_handler(struct ctl_table *table, int write,
> + void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> + struct ctl_table t;
> + unsigned long cap_array[_KERNEL_CAPABILITY_U32S];
> + kernel_cap_t new_cap;
> + int err, i;
> +
> + if (write && !capable(CAP_SYS_MODULE))
> + return -EPERM;
> +
> + /*
> + * convert from the global kernel_cap_t to the ulong array to print to
> + * userspace if this is a read.
> + */
> + for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
> + if (table->data == CAP_BSET)
> + cap_array[i] = usermodehelper_bset.cap[i];
> + else if (table->data == CAP_PI)
> + cap_array[i] = usermodehelper_inheritable.cap[i];
> + else
> + BUG();
> + }
> +
> + t = *table;
> + t.data = &cap_array;
> +
> + /*
> + * actually read or write and array of ulongs from userspace. Remember
> + * these are least significant 32 bits first
> + */
> + err = proc_doulongvec_minmax(&t, write, buffer, lenp, ppos);
> + if (err < 0)
> + return err;
> +
> + /*
> + * convert from the sysctl array of ulongs to the kernel_cap_t
> + * internal representation
> + */
> + for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++)
> + new_cap.cap[i] = cap_array[i];
> +
> + /*
> + * Drop everything not in the new_cap (but don't add things)
> + */
> + if (write) {
> + if (table->data == CAP_BSET)
> + usermodehelper_bset = cap_intersect(usermodehelper_bset, new_cap);
> + if (table->data == CAP_PI)
> + usermodehelper_inheritable = cap_intersect(usermodehelper_inheritable, new_cap);
> + }
> +
> + return 0;
> +}
> +
> +struct ctl_table usermodehelper_table[] = {
> + {
> + .procname = "bset",
> + .data = CAP_BSET,
> + .maxlen = _KERNEL_CAPABILITY_U32S * sizeof(unsigned long),
> + .mode = 0600,
> + .proc_handler = proc_cap_handler,
> + },
> + {
> + .procname = "inheritable",
> + .data = CAP_PI,
> + .maxlen = _KERNEL_CAPABILITY_U32S * sizeof(unsigned long),
> + .mode = 0600,
> + .proc_handler = proc_cap_handler,
> + },
> + { }
> +};
> +
> void __init usermodehelper_init(void)
> {
> khelper_wq = create_singlethread_workqueue("khelper");
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 0f1bd83..099d2e2 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -56,6 +56,7 @@
> #include <linux/kprobes.h>
> #include <linux/pipe_fs_i.h>
> #include <linux/oom.h>
> +#include <linux/kmod.h>
>
> #include <asm/uaccess.h>
> #include <asm/processor.h>
> @@ -617,6 +618,11 @@ static struct ctl_table kern_table[] = {
> .child = random_table,
> },
> {
> + .procname = "usermodehelper",
> + .mode = 0555,
> + .child = usermodehelper_table,
> + },
> + {
> .procname = "overflowuid",
> .data = &overflowuid,
> .maxlen = sizeof(int),
>
>