2008-03-13 03:28:52

by Serge E. Hallyn

Subject: [RFC] cgroups: implement device whitelist lsm (v2)

Implement a cgroup using the LSM interface to enforce mknod and open
restrictions on device files.

This implements a simple device access whitelist. A whitelist entry
has 4 fields. 'type' is a (all), c (char), or b (block). 'all' means it
applies to all types, all major numbers, and all minor numbers. Major and
minor are obvious. Access is a composition of r (read), w (write), and
m (mknod).
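
As a rough user-space illustration (not part of the patch), the four-field
entry format could be parsed along these lines. The names wl_entry,
parse_entry and ANY are hypothetical; the ACC_* and DEV_* values are copied
from the devcg.h added by this patch, and '*' is assumed to stand for a
wildcard major or minor as in the patch's convert_majmin().

```c
#include <stdio.h>
#include <string.h>

/* Access bits and type codes, same values as the patch's devcg.h. */
#define ACC_MKNOD 1
#define ACC_READ  2
#define ACC_WRITE 4

#define DEV_BLOCK 1
#define DEV_CHAR  2
#define DEV_ALL   4

#define ANY (~0u)	/* hypothetical name for the '*' wildcard */

/* Hypothetical user-space mirror of one whitelist entry. */
struct wl_entry {
	unsigned major, minor;
	int type;
	int access;
};

/* Parse one major or minor field: a number or '*'. */
int parse_majmin(const char *s, unsigned *out)
{
	if (strcmp(s, "*") == 0) {
		*out = ANY;
		return 0;
	}
	return sscanf(s, "%u", out) == 1 ? 0 : -1;
}

/* Parse "<type> <maj> <min> <access>"; 0 on success, -1 on bad input. */
int parse_entry(const char *line, struct wl_entry *e)
{
	char type, maj[10], min[10], acc[4];
	const char *p;

	if (sscanf(line, " %c %9s %9s %3s", &type, maj, min, acc) != 4)
		return -1;

	switch (type) {
	case 'a': e->type = DEV_ALL; break;
	case 'c': e->type = DEV_CHAR; break;
	case 'b': e->type = DEV_BLOCK; break;
	default: return -1;
	}

	if (parse_majmin(maj, &e->major) || parse_majmin(min, &e->minor))
		return -1;

	/* Access is any combination of r, w and m, in any order. */
	e->access = 0;
	for (p = acc; *p; p++) {
		switch (*p) {
		case 'r': e->access |= ACC_READ; break;
		case 'w': e->access |= ACC_WRITE; break;
		case 'm': e->access |= ACC_MKNOD; break;
		default: return -1;
		}
	}
	return 0;
}
```

With this sketch, parse_entry("b 7 0 mrw", &e) fills in a block-device
entry for major 7, minor 0 with all three access bits set.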

The root devcgroup starts with rwm access to 'all'. A child devcgroup
gets a copy of its parent's whitelist. Admins can then add devices to and
remove devices from the whitelist. Once CAP_HOST_ADMIN is introduced, it
will also be needed to add entries or to remove entries from another
cgroup, though CAP_SYS_ADMIN alone will suffice to remove entries from
your own cgroup.

An entry is added by doing "echo <type> <maj> <min> <access> > devcg.allow",
for instance:

echo b 7 0 mrw > /cgroups/1/devcg.allow

An entry is removed by writing it likewise to devcg.deny. Since this is a
pure whitelist, not ACLs, you can only remove entries which exist in the
whitelist. You must explicitly

echo a 0 0 mrw > /cgroups/1/devcg.deny

to remove the "allow all" entry which is automatically inherited from
the root cgroup.
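
To make the allow/deny semantics concrete, here is a small user-space
sketch, loosely modelled on dev_whitelist_rm() and the permission hooks in
the patch below. It is an illustration under assumptions, not the kernel
code: the fixed-size array, the names whitelist, wl_allow, wl_deny and
wl_permits, and the swap-remove are all hypothetical. A deny clears the
named access bits from every matching entry and drops entries left with no
access; a check succeeds only if some remaining entry grants all requested
bits.

```c
#include <stddef.h>

#define ACC_MKNOD 1
#define ACC_READ  2
#define ACC_WRITE 4

#define DEV_BLOCK 1
#define DEV_CHAR  2
#define DEV_ALL   4

#define ANY (~0u)		/* wildcard major/minor */
#define MAX_ENTRIES 16		/* arbitrary limit for this sketch */

struct wl_entry {
	unsigned major, minor;
	int type;
	int access;
};

struct whitelist {
	struct wl_entry items[MAX_ENTRIES];
	size_t count;
};

/* Append an entry; like the prototype, do not merge with existing ones. */
int wl_allow(struct whitelist *wl, struct wl_entry e)
{
	if (wl->count == MAX_ENTRIES)
		return -1;
	wl->items[wl->count++] = e;
	return 0;
}

/* Strip d.access from matching entries; drop entries that become empty. */
void wl_deny(struct whitelist *wl, struct wl_entry d)
{
	size_t i = 0;

	while (i < wl->count) {
		struct wl_entry *w = &wl->items[i];
		int match = w->type == DEV_ALL ||
			(w->type == d.type &&
			 (w->major == ANY || w->major == d.major) &&
			 (w->minor == ANY || w->minor == d.minor));

		if (match) {
			w->access &= ~d.access;
			if (!w->access) {
				/* swap-remove, then re-check slot i */
				wl->items[i] = wl->items[--wl->count];
				continue;
			}
		}
		i++;
	}
}

/* Return 1 if some entry grants every bit in `want` for this device. */
int wl_permits(const struct whitelist *wl, int type,
	       unsigned major, unsigned minor, int want)
{
	size_t i;

	for (i = 0; i < wl->count; i++) {
		const struct wl_entry *w = &wl->items[i];

		if (w->type != DEV_ALL) {
			if (w->type != type)
				continue;
			if (w->major != ANY && w->major != major)
				continue;
			if (w->minor != ANY && w->minor != minor)
				continue;
		}
		if ((w->access & want) == want)
			return 1;
	}
	return 0;
}
```

Starting from the inherited "allow all" entry, a deny of "a 0 0 mrw"
empties the list in this model, after which only explicitly allowed
devices pass wl_permits().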

While composing this with the ns_cgroup may seem logical, it is not
the right thing to do, because updates to /cg/cg1/devcg.deny are
not reflected in /cg/cg1/cg2/devcg.allow.

A task may only be moved to another devcgroup if it is moving to
a direct descendant of its current devcgroup.
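
In code terms the hierarchy part of that rule is just a parent check. A
minimal sketch, assuming a hypothetical cg_node type with a parent pointer
(the real devcg_can_attach() in the patch also checks capabilities and the
target's task count, which this omits):

```c
#include <stddef.h>

/* Hypothetical stand-in for a devcgroup in the hierarchy. */
struct cg_node {
	const struct cg_node *parent;	/* NULL for the root */
};

/* Return 1 if a task in `cur` may move to `target`, 0 otherwise. */
int may_move(const struct cg_node *cur, const struct cg_node *target)
{
	return cur != NULL && target != NULL && target->parent == cur;
}
```

So a task in the root group may move into a child of the root, but not
straight into a grandchild, to a sibling, or back up to its parent.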

CAP_NS_OVERRIDE is defined as the capability needed to cross namespaces.
A task needs both CAP_NS_OVERRIDE and CAP_SYS_ADMIN to create a new
devcgroup, update a devcgroup's access, or move a task to a new
devcgroup.
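
Note that both capabilities are required, not either one. As a sketch,
treating the effective capability set as a 64-bit mask (an assumption made
for illustration; the kernel goes through capable() instead), with
CAP_NS_OVERRIDE at the value 34 proposed by this patch and a hypothetical
helper name:

```c
#include <stdint.h>

#define CAP_SYS_ADMIN	21
#define CAP_NS_OVERRIDE	34	/* value proposed by this patch */

/* Return 1 only if both capabilities are present in `effective`. */
int may_edit_devcg(uint64_t effective)
{
	uint64_t need = (UINT64_C(1) << CAP_SYS_ADMIN) |
			(UINT64_C(1) << CAP_NS_OVERRIDE);

	return (effective & need) == need;
}
```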

CONFIG_COMMONCAP is defined whenever security/commoncap.c should
be compiled, so that the decision of whether to show the option
for FILE_CAPABILITIES can be a bit cleaner.

Pavel, do you have any experience with what sorts of rules a typical
customer would get? I wasn't sure whether it was worth having some
sort of hash or tree for searching through a cgroup's rules since if
most containers have say 0-2 rules then the overhead will just be
higher.

Changelog:
Mar 12 2008: allow dev_cgroup lsm to be used when
SECURITY=n, and allow stacking with SELinux
and Smack. Don't work too hard in Kconfig
to prevent a warning when smack+devcg are
both compiled in, worry about that later.

Signed-off-by: Serge E. Hallyn <[email protected]>
---
include/linux/capability.h | 11 +-
include/linux/cgroup_subsys.h | 6 +
include/linux/devcg.h | 55 ++++++
include/linux/security.h | 15 ++
init/Kconfig | 7 +
kernel/Makefile | 1 +
kernel/dev_cgroup.c | 411 +++++++++++++++++++++++++++++++++++++++++
security/Kconfig | 8 +-
security/Makefile | 12 +-
security/dev_cgroup.c | 138 ++++++++++++++
security/smack/smack_lsm.c | 38 ++++
11 files changed, 692 insertions(+), 10 deletions(-)
create mode 100644 include/linux/devcg.h
create mode 100644 kernel/dev_cgroup.c
create mode 100644 security/dev_cgroup.c

diff --git a/include/linux/capability.h b/include/linux/capability.h
index eaab759..f8ecba1 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -333,7 +333,16 @@ typedef struct kernel_cap_struct {

#define CAP_MAC_ADMIN 33

-#define CAP_LAST_CAP CAP_MAC_ADMIN
+/* Allow acting on resources in another namespace. In particular:
+ * 1. when combined with CAP_MKNOD and dev_cgroup is enabled,
+ * allow creation of devices not in the device whitelist.
+ * 2. when combined with CAP_SYS_ADMIN and dev_cgroup is enabled,
+ * allow editing device cgroup whitelist
+ */
+
+#define CAP_NS_OVERRIDE 34
+
+#define CAP_LAST_CAP CAP_NS_OVERRIDE

#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)

diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 1ddebfc..01e8034 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -42,3 +42,9 @@ SUBSYS(mem_cgroup)
#endif

/* */
+
+#ifdef CONFIG_CGROUP_DEV
+SUBSYS(devcg)
+#endif
+
+/* */
diff --git a/include/linux/devcg.h b/include/linux/devcg.h
new file mode 100644
index 0000000..7d9f367
--- /dev/null
+++ b/include/linux/devcg.h
@@ -0,0 +1,55 @@
+#include <linux/module.h>
+#include <linux/cgroup.h>
+#include <linux/fs.h>
+#include <linux/list.h>
+#include <linux/security.h>
+
+#include <asm/uaccess.h>
+
+#define ACC_MKNOD 1
+#define ACC_READ 2
+#define ACC_WRITE 4
+
+#define DEV_BLOCK 1
+#define DEV_CHAR 2
+#define DEV_ALL 4 /* this represents all devices */
+
+/*
+ * whitelist locking rules:
+ * cgroup_lock() cannot be taken under cgroup->lock.
+ * cgroup->lock can be taken with or without cgroup_lock().
+ *
+ * modifications always require cgroup_lock
+ * modifications to a list which is visible require the
+ * cgroup->lock *and* cgroup_lock()
+ * walking the list requires cgroup->lock or cgroup_lock().
+ *
+ * reasoning: dev_whitelist_copy() needs to kmalloc, so needs
+ * a mutex, which the cgroup_lock() is. Since modifying
+ * a visible list requires both locks, either lock can be
+ * taken for walking the list. Since the wh->spinlock is taken
+ * for modifying a public-accessible list, the spinlock is
+ * sufficient for just walking the list.
+ */
+
+struct dev_whitelist_item {
+ u32 major, minor;
+ short type;
+ short access;
+ struct list_head list;
+};
+
+struct dev_cgroup {
+ struct cgroup_subsys_state css;
+ struct list_head whitelist;
+ spinlock_t lock;
+};
+
+static inline struct dev_cgroup *cgroup_to_devcg(
+ struct cgroup *cgroup)
+{
+ return container_of(cgroup_subsys_state(cgroup, devcg_subsys_id),
+ struct dev_cgroup, css);
+}
+
+extern struct cgroup_subsys devcg_subsys;
diff --git a/include/linux/security.h b/include/linux/security.h
index 2231526..a818d3f 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1735,6 +1735,13 @@ int security_secctx_to_secid(char *secdata, u32 seclen, u32 *secid);
void security_release_secctx(char *secdata, u32 seclen);

#else /* CONFIG_SECURITY */
+#ifdef CONFIG_CGROUP_DEV
+extern int devcgroup_inode_mknod(struct inode *dir, struct dentry *dentry,
+ int mode, dev_t dev);
+extern int devcgroup_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd);
+#endif
+
struct security_mnt_opts {
};

@@ -2011,7 +2018,11 @@ static inline int security_inode_mknod (struct inode *dir,
struct dentry *dentry,
int mode, dev_t dev)
{
+#ifdef CONFIG_CGROUP_DEV
+ return devcgroup_inode_mknod(dir, dentry, mode, dev);
+#else
return 0;
+#endif
}

static inline int security_inode_rename (struct inode *old_dir,
@@ -2036,7 +2047,11 @@ static inline int security_inode_follow_link (struct dentry *dentry,
static inline int security_inode_permission (struct inode *inode, int mask,
struct nameidata *nd)
{
+#ifdef CONFIG_CGROUP_DEV
+ return devcgroup_inode_permission(inode, mask, nd);
+#else
return 0;
+#endif
}

static inline int security_inode_setattr (struct dentry *dentry,
diff --git a/init/Kconfig b/init/Kconfig
index 009f2d8..05343a2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -298,6 +298,13 @@ config CGROUP_NS
for instance virtual servers and checkpoint/restart
jobs.

+config CGROUP_DEV
+ bool "Device controller for cgroups"
+ depends on CGROUPS && EXPERIMENTAL
+ help
+ Provides a cgroup implementing whitelists for devices which
+ a process in the cgroup can mknod or open.
+
config CPUSETS
bool "Cpuset support"
depends on SMP && CGROUPS
diff --git a/kernel/Makefile b/kernel/Makefile
index 9cc073e..74cd321 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -42,6 +42,7 @@ obj-$(CONFIG_CGROUPS) += cgroup.o
obj-$(CONFIG_CGROUP_DEBUG) += cgroup_debug.o
obj-$(CONFIG_CPUSETS) += cpuset.o
obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o
+obj-$(CONFIG_CGROUP_DEV) += dev_cgroup.o
obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
diff --git a/kernel/dev_cgroup.c b/kernel/dev_cgroup.c
new file mode 100644
index 0000000..f088824
--- /dev/null
+++ b/kernel/dev_cgroup.c
@@ -0,0 +1,411 @@
+/*
+ * dev_cgroup.c - device cgroup subsystem
+ *
+ * Copyright 2007 IBM Corp
+ */
+
+#include <linux/devcg.h>
+
+static int devcg_can_attach(struct cgroup_subsys *ss,
+ struct cgroup *new_cgroup, struct task_struct *task)
+{
+ struct cgroup *orig;
+
+ if (!capable(CAP_SYS_ADMIN) || !capable(CAP_NS_OVERRIDE))
+ return -EPERM;
+
+ if (current != task) {
+ if (!cgroup_is_descendant(new_cgroup))
+ return -EPERM;
+ }
+
+ if (atomic_read(&new_cgroup->count) != 0)
+ return -EPERM;
+
+ orig = task_cgroup(task, devcg_subsys_id);
+ if (orig && orig != new_cgroup->parent)
+ return -EPERM;
+
+ return 0;
+}
+
+/*
+ * called under cgroup_lock()
+ */
+int dev_whitelist_copy(struct list_head *dest, struct list_head *orig)
+{
+ struct dev_whitelist_item *wh, *tmp, *new;
+
+ list_for_each_entry(wh, orig, list) {
+ new = kmalloc(sizeof(*wh), GFP_KERNEL);
+ if (!new)
+ goto free_and_exit;
+ new->major = wh->major;
+ new->minor = wh->minor;
+ new->type = wh->type;
+ new->access = wh->access;
+ list_add_tail(&new->list, dest);
+ }
+
+ return 0;
+
+free_and_exit:
+ list_for_each_entry_safe(wh, tmp, dest, list) {
+ list_del(&wh->list);
+ kfree(wh);
+ }
+ return -ENOMEM;
+}
+
+/* Stupid prototype - don't bother combining existing entries */
+/*
+ * called under cgroup_lock()
+ * since the list is visible to other tasks, we need the spinlock also
+ */
+int dev_whitelist_add(struct dev_cgroup *dev_cgroup,
+ struct dev_whitelist_item *wh)
+{
+ struct dev_whitelist_item *whcopy;
+
+ whcopy = kmalloc(sizeof(*whcopy), GFP_KERNEL);
+ if (!whcopy)
+ return -ENOMEM;
+
+ memcpy(whcopy, wh, sizeof(*whcopy));
+ spin_lock(&dev_cgroup->lock);
+ list_add_tail(&whcopy->list, &dev_cgroup->whitelist);
+ spin_unlock(&dev_cgroup->lock);
+ return 0;
+}
+
+/*
+ * called under cgroup_lock()
+ * since the list is visible to other tasks, we need the spinlock also
+ */
+void dev_whitelist_rm(struct dev_cgroup *dev_cgroup,
+ struct dev_whitelist_item *wh)
+{
+ struct dev_whitelist_item *walk, *tmp;
+
+ spin_lock(&dev_cgroup->lock);
+ list_for_each_entry_safe(walk, tmp, &dev_cgroup->whitelist, list) {
+ if (walk->type == DEV_ALL)
+ goto remove;
+ if (walk->type != wh->type)
+ continue;
+ if (walk->major != ~0 && walk->major != wh->major)
+ continue;
+ if (walk->minor != ~0 && walk->minor != wh->minor)
+ continue;
+
+remove:
+ walk->access &= ~wh->access;
+ if (!walk->access) {
+ list_del(&walk->list);
+ kfree(walk);
+ }
+ }
+ spin_unlock(&dev_cgroup->lock);
+}
+
+/*
+ * Rules: you can only create a cgroup if
+ * 1. you are capable(CAP_SYS_ADMIN|CAP_NS_OVERRIDE)
+ * 2. the target cgroup is a descendant of your own cgroup
+ *
+ * Note: called from kernel/cgroup.c with cgroup_lock() held.
+ */
+static struct cgroup_subsys_state *devcg_create(struct cgroup_subsys *ss,
+ struct cgroup *cgroup)
+{
+ struct dev_cgroup *dev_cgroup, *parent_dev_cgroup;
+ struct cgroup *parent_cgroup;
+ int ret;
+
+ if (!capable(CAP_SYS_ADMIN) || !capable(CAP_NS_OVERRIDE))
+ return ERR_PTR(-EPERM);
+ if (!cgroup_is_descendant(cgroup))
+ return ERR_PTR(-EPERM);
+
+ dev_cgroup = kzalloc(sizeof(*dev_cgroup), GFP_KERNEL);
+ if (!dev_cgroup)
+ return ERR_PTR(-ENOMEM);
+ INIT_LIST_HEAD(&dev_cgroup->whitelist);
+ parent_cgroup = cgroup->parent;
+
+ if (parent_cgroup == NULL) {
+ struct dev_whitelist_item *wh;
+ wh = kmalloc(sizeof(*wh), GFP_KERNEL);
+ wh->minor = wh->major = ~0;
+ wh->type = DEV_ALL;
+ wh->access = ACC_MKNOD | ACC_READ | ACC_WRITE;
+ list_add(&wh->list, &dev_cgroup->whitelist);
+ } else {
+ parent_dev_cgroup = cgroup_to_devcg(parent_cgroup);
+ ret = dev_whitelist_copy(&dev_cgroup->whitelist,
+ &parent_dev_cgroup->whitelist);
+ if (ret) {
+ kfree(dev_cgroup);
+ return ERR_PTR(ret);
+ }
+ }
+
+ spin_lock_init(&dev_cgroup->lock);
+ return &dev_cgroup->css;
+}
+
+static void devcg_destroy(struct cgroup_subsys *ss,
+ struct cgroup *cgroup)
+{
+ struct dev_cgroup *dev_cgroup;
+ struct dev_whitelist_item *wh, *tmp;
+
+ dev_cgroup = cgroup_to_devcg(cgroup);
+ list_for_each_entry_safe(wh, tmp, &dev_cgroup->whitelist, list) {
+ list_del(&wh->list);
+ kfree(wh);
+ }
+ kfree(dev_cgroup);
+}
+
+#define DEVCG_ALLOW 1
+#define DEVCG_DENY 2
+
+void set_access(char *acc, short access)
+{
+ int idx = 0;
+ memset(acc, 0, 4);
+ if (access & ACC_READ)
+ acc[idx++] = 'r';
+ if (access & ACC_WRITE)
+ acc[idx++] = 'w';
+ if (access & ACC_MKNOD)
+ acc[idx++] = 'm';
+}
+
+char type_to_char(short type)
+{
+ if (type == DEV_ALL)
+ return 'a';
+ if (type == DEV_CHAR)
+ return 'c';
+ if (type == DEV_BLOCK)
+ return 'b';
+ return 'X';
+}
+
+static void set_majmin(char *str, int len, unsigned m)
+{
+ memset(str, 0, len);
+ if (m == ~0)
+ sprintf(str, "*");
+ else
+ snprintf(str, len, "%d", m);
+}
+
+char *print_whitelist(struct dev_cgroup *devcgroup, int *len)
+{
+ char *buf, *s, acc[4];
+ struct dev_whitelist_item *wh;
+ int ret;
+ int count = 0;
+ char maj[10], min[10];
+
+ buf = kmalloc(4096, GFP_KERNEL);
+ if (!buf)
+ return ERR_PTR(-ENOMEM);
+ s = buf;
+ *s = '\0';
+ *len = 0;
+
+ spin_lock(&devcgroup->lock);
+ list_for_each_entry(wh, &devcgroup->whitelist, list) {
+ set_access(acc, wh->access);
+ set_majmin(maj, 10, wh->major);
+ set_majmin(min, 10, wh->minor);
+ ret = snprintf(s, 4095-(s-buf), "%c %s %s %s\n",
+ type_to_char(wh->type), maj, min, acc);
+ if (s+ret >= buf+4095) {
+ kfree(buf);
+ buf = ERR_PTR(-ENOMEM);
+ break;
+ }
+ s += ret;
+ *len += ret;
+ count++;
+ }
+ spin_unlock(&devcgroup->lock);
+
+ return buf;
+}
+
+static ssize_t devcg_access_read(struct cgroup *cgroup,
+ struct cftype *cft, struct file *file,
+ char __user *userbuf, size_t nbytes, loff_t *ppos)
+{
+ struct dev_cgroup *devcgrp = cgroup_to_devcg(cgroup);
+ int filetype = cft->private;
+ char *buffer;
+ int len, retval;
+
+ if (filetype != DEVCG_ALLOW)
+ return -EINVAL;
+ buffer = print_whitelist(devcgrp, &len);
+ if (IS_ERR(buffer))
+ return PTR_ERR(buffer);
+
+ retval = simple_read_from_buffer(userbuf, nbytes, ppos, buffer, len);
+ kfree(buffer);
+ return retval;
+}
+
+static inline short convert_access(char *acc)
+{
+ short access = 0;
+
+ while (*acc) {
+ switch (*acc) {
+ case 'r':
+ case 'R': access |= ACC_READ; break;
+ case 'w':
+ case 'W': access |= ACC_WRITE; break;
+ case 'm':
+ case 'M': access |= ACC_MKNOD; break;
+ case '\n': break;
+ default:
+ return -EINVAL;
+ }
+ acc++;
+ }
+
+ return access;
+}
+
+static inline short convert_type(char intype)
+{
+ short type = 0;
+ switch (intype) {
+ case 'a': type = DEV_ALL; break;
+ case 'c': type = DEV_CHAR; break;
+ case 'b': type = DEV_BLOCK; break;
+ default: type = -EACCES; break;
+ }
+ return type;
+}
+
+static int convert_majmin(char *m, unsigned *u)
+{
+ if (m[0] == '*') {
+ *u = ~0;
+ return 0;
+ }
+ if (sscanf(m, "%u", u) != 1)
+ return -EINVAL;
+ return 0;
+}
+
+static ssize_t devcg_access_write(struct cgroup *cgroup, struct cftype *cft,
+ struct file *file, const char __user *userbuf,
+ size_t nbytes, loff_t *ppos)
+{
+ struct cgroup *cur_cgroup;
+ struct dev_cgroup *devcgrp, *cur_devcgroup;
+ int filetype = cft->private;
+ char *buffer, acc[4], maj[10], min[10];
+ int retval = 0;
+ int nitems;
+ char type;
+ struct dev_whitelist_item wh;
+
+ if (!capable(CAP_SYS_ADMIN) || !capable(CAP_NS_OVERRIDE))
+ return -EPERM;
+
+ devcgrp = cgroup_to_devcg(cgroup);
+ cur_cgroup = task_cgroup(current, devcg_subsys.subsys_id);
+ cur_devcgroup = cgroup_to_devcg(cur_cgroup);
+
+ buffer = kmalloc(nbytes+1, GFP_KERNEL);
+ if (!buffer)
+ return -ENOMEM;
+
+ if (copy_from_user(buffer, userbuf, nbytes)) {
+ retval = -EFAULT;
+ goto out1;
+ }
+ buffer[nbytes] = 0; /* nul-terminate */
+
+ cgroup_lock();
+ if (cgroup_is_removed(cgroup)) {
+ retval = -ENODEV;
+ goto out2;
+ }
+
+ memset(&wh, 0, sizeof(wh));
+ memset(acc, 0, 4);
+ nitems = sscanf(buffer, "%c %9s %9s %3s", &type, maj, min,
+ acc);
+ retval = -EINVAL;
+ if (nitems != 4)
+ goto out2;
+ wh.type = convert_type(type);
+ if (wh.type < 0)
+ goto out2;
+ wh.access = convert_access(acc);
+ if (convert_majmin(maj, &wh.major))
+ goto out2;
+ if (convert_majmin(min, &wh.minor))
+ goto out2;
+ if (wh.access < 0)
+ goto out2;
+ retval = 0;
+ switch (filetype) {
+ case DEVCG_ALLOW:
+ retval = dev_whitelist_add(devcgrp, &wh);
+ break;
+ case DEVCG_DENY:
+ dev_whitelist_rm(devcgrp, &wh);
+ break;
+ default:
+ retval = -EINVAL;
+ goto out2;
+ }
+
+ if (retval == 0)
+ retval = nbytes;
+
+out2:
+ cgroup_unlock();
+out1:
+ kfree(buffer);
+ return retval;
+}
+
+static struct cftype dev_cgroup_files[] = {
+ {
+ .name = "allow",
+ .read = devcg_access_read,
+ .write = devcg_access_write,
+ .private = DEVCG_ALLOW,
+ },
+ {
+ .name = "deny",
+ .write = devcg_access_write,
+ .private = DEVCG_DENY,
+ },
+};
+
+static int devcg_populate(struct cgroup_subsys *ss,
+ struct cgroup *cont)
+{
+ return cgroup_add_files(cont, ss, dev_cgroup_files,
+ ARRAY_SIZE(dev_cgroup_files));
+}
+
+struct cgroup_subsys devcg_subsys = {
+ .name = "devcg",
+ .can_attach = devcg_can_attach,
+ .create = devcg_create,
+ .destroy = devcg_destroy,
+ .populate = devcg_populate,
+ .subsys_id = devcg_subsys_id,
+};
diff --git a/security/Kconfig b/security/Kconfig
index 5dfc206..8082edc 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -75,15 +75,19 @@ config SECURITY_NETWORK_XFRM

config SECURITY_CAPABILITIES
bool "Default Linux Capabilities"
- depends on SECURITY
+ depends on SECURITY && !CGROUP_DEV
default y
help
This enables the "default" Linux capabilities functionality.
If you are unsure how to answer this question, answer Y.

+config COMMONCAP
+ bool
+ default !SECURITY || SECURITY_CAPABILITIES || SECURITY_ROOTPLUG || SECURITY_SMACK || CGROUP_DEV
+
config SECURITY_FILE_CAPABILITIES
bool "File POSIX Capabilities (EXPERIMENTAL)"
- depends on (SECURITY=n || SECURITY_CAPABILITIES!=n) && EXPERIMENTAL
+ depends on COMMONCAP && EXPERIMENTAL
default n
help
This enables filesystem capabilities, allowing you to give
diff --git a/security/Makefile b/security/Makefile
index 9e8b025..6093003 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -6,15 +6,13 @@ obj-$(CONFIG_KEYS) += keys/
subdir-$(CONFIG_SECURITY_SELINUX) += selinux
subdir-$(CONFIG_SECURITY_SMACK) += smack

-# if we don't select a security model, use the default capabilities
-ifneq ($(CONFIG_SECURITY),y)
-obj-y += commoncap.o
-endif
+obj-$(CONFIG_COMMONCAP) += commoncap.o

# Object file lists
obj-$(CONFIG_SECURITY) += security.o dummy.o inode.o
# Must precede capability.o in order to stack properly.
obj-$(CONFIG_SECURITY_SELINUX) += selinux/built-in.o
-obj-$(CONFIG_SECURITY_SMACK) += commoncap.o smack/built-in.o
-obj-$(CONFIG_SECURITY_CAPABILITIES) += commoncap.o capability.o
-obj-$(CONFIG_SECURITY_ROOTPLUG) += commoncap.o root_plug.o
+obj-$(CONFIG_SECURITY_SMACK) += smack/built-in.o
+obj-$(CONFIG_SECURITY_CAPABILITIES) += capability.o
+obj-$(CONFIG_SECURITY_ROOTPLUG) += root_plug.o
+obj-$(CONFIG_CGROUP_DEV) += dev_cgroup.o
diff --git a/security/dev_cgroup.c b/security/dev_cgroup.c
new file mode 100644
index 0000000..cef1527
--- /dev/null
+++ b/security/dev_cgroup.c
@@ -0,0 +1,138 @@
+/*
+ * LSM portion of the device cgroup subsystem.
+ *
+ * Copyright 2007 IBM Corp
+ */
+
+#include <linux/devcg.h>
+
+int devcgroup_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct cgroup *cgroup;
+ struct dev_cgroup *dev_cgroup;
+ struct dev_whitelist_item *wh;
+
+ dev_t device = inode->i_rdev;
+ if (!device)
+ return 0;
+ if (!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode))
+ return 0;
+ cgroup = task_cgroup(current, devcg_subsys.subsys_id);
+ dev_cgroup = cgroup_to_devcg(cgroup);
+ if (!dev_cgroup)
+ return 0;
+
+ spin_lock(&dev_cgroup->lock);
+ list_for_each_entry(wh, &dev_cgroup->whitelist, list) {
+ if (wh->type & DEV_ALL)
+ goto acc_check;
+ if ((wh->type & DEV_BLOCK) && !S_ISBLK(inode->i_mode))
+ continue;
+ if ((wh->type & DEV_CHAR) && !S_ISCHR(inode->i_mode))
+ continue;
+ if (wh->major != ~0 && wh->major != imajor(inode))
+ continue;
+ if (wh->minor != ~0 && wh->minor != iminor(inode))
+ continue;
+acc_check:
+ if ((mask & MAY_WRITE) && !(wh->access & ACC_WRITE))
+ continue;
+ if ((mask & MAY_READ) && !(wh->access & ACC_READ))
+ continue;
+ spin_unlock(&dev_cgroup->lock);
+ return 0;
+ }
+ spin_unlock(&dev_cgroup->lock);
+
+ return -EPERM;
+}
+
+int devcgroup_inode_mknod(struct inode *dir, struct dentry *dentry,
+ int mode, dev_t dev)
+{
+ struct cgroup *cgroup;
+ struct dev_cgroup *dev_cgroup;
+ struct dev_whitelist_item *wh;
+
+ cgroup = task_cgroup(current, devcg_subsys.subsys_id);
+ dev_cgroup = cgroup_to_devcg(cgroup);
+ if (!dev_cgroup)
+ return 0;
+
+ spin_lock(&dev_cgroup->lock);
+ list_for_each_entry(wh, &dev_cgroup->whitelist, list) {
+ if (wh->type & DEV_ALL)
+ goto acc_check;
+ if ((wh->type & DEV_BLOCK) && !S_ISBLK(mode))
+ continue;
+ if ((wh->type & DEV_CHAR) && !S_ISCHR(mode))
+ continue;
+ if (wh->major != ~0 && wh->major != MAJOR(dev))
+ continue;
+ if (wh->minor != ~0 && wh->minor != MINOR(dev))
+ continue;
+acc_check:
+ if (!(wh->access & ACC_MKNOD))
+ continue;
+ spin_unlock(&dev_cgroup->lock);
+ return 0;
+ }
+ spin_unlock(&dev_cgroup->lock);
+ return -EPERM;
+}
+
+#ifdef CONFIG_SECURITY
+static struct security_operations devcgroup_security_ops = {
+ .inode_mknod = devcgroup_inode_mknod,
+ .inode_permission = devcgroup_inode_permission,
+
+ .ptrace = cap_ptrace,
+ .capget = cap_capget,
+ .capset_check = cap_capset_check,
+ .capset_set = cap_capset_set,
+ .capable = cap_capable,
+ .settime = cap_settime,
+ .netlink_send = cap_netlink_send,
+ .netlink_recv = cap_netlink_recv,
+
+ .bprm_apply_creds = cap_bprm_apply_creds,
+ .bprm_set_security = cap_bprm_set_security,
+ .bprm_secureexec = cap_bprm_secureexec,
+
+ .inode_setxattr = cap_inode_setxattr,
+ .inode_removexattr = cap_inode_removexattr,
+ .inode_need_killpriv = cap_inode_need_killpriv,
+ .inode_killpriv = cap_inode_killpriv,
+
+ .task_kill = cap_task_kill,
+ .task_setscheduler = cap_task_setscheduler,
+ .task_setioprio = cap_task_setioprio,
+ .task_setnice = cap_task_setnice,
+ .task_post_setuid = cap_task_post_setuid,
+ .task_prctl = cap_task_prctl,
+ .task_reparent_to_init = cap_task_reparent_to_init,
+
+ .syslog = cap_syslog,
+
+ .vm_enough_memory = cap_vm_enough_memory,
+};
+
+#define MY_NAME "dev_cgroup"
+static int __init dev_cgroup_security_init(void)
+{
+ /* register ourselves with the security framework */
+ if (register_security(&devcgroup_security_ops)) {
+ if (mod_reg_security(MY_NAME, &devcgroup_security_ops)) {
+ printk(KERN_ERR "Failure registering dev_cgroup LSM\n");
+ return -EINVAL;
+ }
+ printk(KERN_INFO "dev_cgroup LSM initialized as secondary\n");
+ return 0;
+ }
+ printk(KERN_INFO "Device cgroup LSM initialized\n");
+ return 0;
+}
+
+security_initcall(dev_cgroup_security_init);
+#endif
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 20ec35c..08bf081 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -36,6 +36,18 @@
#define SOCKFS_MAGIC 0x534F434B
#define TMPFS_MAGIC 0x01021994

+#ifdef CONFIG_CGROUP_DEV
+/*
+ * A better approach is probably to switch to full secondary->
+ * support for smack, but for now just hook the devcgroup lsm
+ * hooks explicitly when compiled in
+ */
+int devcgroup_inode_mknod(struct inode *dir, struct dentry *dentry,
+ int mode, dev_t dev);
+int devcgroup_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd);
+#endif
+
/**
* smk_fetch - Fetch the smack label from a file.
* @ip: a pointer to the inode
@@ -523,6 +535,12 @@ static int smack_inode_rename(struct inode *old_inode,
static int smack_inode_permission(struct inode *inode, int mask,
struct nameidata *nd)
{
+#ifdef CONFIG_CGROUP_DEV
+ int err;
+ err = devcgroup_inode_permission(inode, mask, nd);
+ if (err)
+ return err;
+#endif
/*
* No permission to check. Existence test. Yup, it's there.
*/
@@ -1370,6 +1388,23 @@ static int smack_inode_setsecurity(struct inode *inode, const char *name,
return 0;
}

+#ifdef CONFIG_CGROUP_DEV
+/**
+ * smack_inode_mknod
+ * @dir contains the inode structure of parent of the new file.
+ * @dentry contains the dentry structure of the new file.
+ * @mode contains the mode of the new file.
+ * @dev contains the device number.
+ *
+ * Just calls the devcgroup hook to allow stacking.
+ */
+static int smack_inode_mknod(struct inode *dir, struct dentry *dentry,
+ int mode, dev_t dev)
+{
+ return devcgroup_inode_mknod(dir, dentry, mode, dev);
+}
+#endif
+
/**
* smack_socket_post_create - finish socket setup
* @sock: the socket
@@ -2460,6 +2495,9 @@ struct security_operations smack_ops = {
.inode_getsecurity = smack_inode_getsecurity,
.inode_setsecurity = smack_inode_setsecurity,
.inode_listsecurity = smack_inode_listsecurity,
+#ifdef CONFIG_CGROUP_DEV
+ .inode_mknod = smack_inode_mknod,
+#endif

.file_permission = smack_file_permission,
.file_alloc_security = smack_file_alloc_security,
--
1.5.1


2008-03-13 09:26:40

by James Morris

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Wed, 12 Mar 2008, Serge E. Hallyn wrote:

> +#ifdef CONFIG_SECURITY
> +static struct security_operations devcgroup_security_ops = {
> + .inode_mknod = devcgroup_inode_mknod,
> + .inode_permission = devcgroup_inode_permission,
> +
> + .ptrace = cap_ptrace,
> + .capget = cap_capget,
> + .capset_check = cap_capset_check,
> + .capset_set = cap_capset_set,
> + .capable = cap_capable,
> + .settime = cap_settime,
> + .netlink_send = cap_netlink_send,
> + .netlink_recv = cap_netlink_recv,
> +
> + .bprm_apply_creds = cap_bprm_apply_creds,
> + .bprm_set_security = cap_bprm_set_security,
> + .bprm_secureexec = cap_bprm_secureexec,
> +
> + .inode_setxattr = cap_inode_setxattr,
> + .inode_removexattr = cap_inode_removexattr,
> + .inode_need_killpriv = cap_inode_need_killpriv,
> + .inode_killpriv = cap_inode_killpriv,
> +
> + .task_kill = cap_task_kill,
> + .task_setscheduler = cap_task_setscheduler,
> + .task_setioprio = cap_task_setioprio,
> + .task_setnice = cap_task_setnice,
> + .task_post_setuid = cap_task_post_setuid,
> + .task_prctl = cap_task_prctl,
> + .task_reparent_to_init = cap_task_reparent_to_init,
> +
> + .syslog = cap_syslog,
> +
> + .vm_enough_memory = cap_vm_enough_memory,
> +};

For lower overall complexity, why not just extend the capability LSM to
include the devcgroup_ perms if CONFIG_CGROUP_DEV ?



- James
--
James Morris
<[email protected]>

2008-03-13 13:18:32

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting James Morris ([email protected]):
> On Wed, 12 Mar 2008, Serge E. Hallyn wrote:
>
> > +#ifdef CONFIG_SECURITY
> > +static struct security_operations devcgroup_security_ops = {
> > + .inode_mknod = devcgroup_inode_mknod,
> > + .inode_permission = devcgroup_inode_permission,
> > +
> > + .ptrace = cap_ptrace,
> > + .capget = cap_capget,
> > + .capset_check = cap_capset_check,
> > + .capset_set = cap_capset_set,
> > + .capable = cap_capable,
> > + .settime = cap_settime,
> > + .netlink_send = cap_netlink_send,
> > + .netlink_recv = cap_netlink_recv,
> > +
> > + .bprm_apply_creds = cap_bprm_apply_creds,
> > + .bprm_set_security = cap_bprm_set_security,
> > + .bprm_secureexec = cap_bprm_secureexec,
> > +
> > + .inode_setxattr = cap_inode_setxattr,
> > + .inode_removexattr = cap_inode_removexattr,
> > + .inode_need_killpriv = cap_inode_need_killpriv,
> > + .inode_killpriv = cap_inode_killpriv,
> > +
> > + .task_kill = cap_task_kill,
> > + .task_setscheduler = cap_task_setscheduler,
> > + .task_setioprio = cap_task_setioprio,
> > + .task_setnice = cap_task_setnice,
> > + .task_post_setuid = cap_task_post_setuid,
> > + .task_prctl = cap_task_prctl,
> > + .task_reparent_to_init = cap_task_reparent_to_init,
> > +
> > + .syslog = cap_syslog,
> > +
> > + .vm_enough_memory = cap_vm_enough_memory,
> > +};
>
> For lower overall complexity, why not just extend the capability LSM to
> include the devcgroup_ perms if CONFIG_CGROUP_DEV ?

That does make for a simpler implementation at this point, however if
any other such LSMs come along (as Casey seemed to think they would) the
end result could end up being horrible spaghetti code of dependencies
and interrelated configs.

But OTOH we went years with no such changes, so that's probably not a
particularly practical concern unless someone can cite specific upcoming
examples. So if no one objects I'll try that approach.

thanks,
-serge

2008-03-13 13:52:30

by James Morris

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Thu, 13 Mar 2008, Serge E. Hallyn wrote:

> That does make for a simpler implementation at this point, however if
> any other such LSMs come along (as Casey seemed to think they would) the
> end result could end up being horrible spaghetti code of dependencies
> and interrelated configs.

That can be addressed as the need arises. A basic tenet of kernel
development is to avoid speculative infrastructure.

> But OTOH we went years with no such changes, so that's probably not a
> particularly practical concern unless someone can cite specific upcoming
> examples. So if no one objects I'll try that approach.

--
James Morris
<[email protected]>

2008-03-13 14:38:21

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting James Morris ([email protected]):
> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>
> > That does make for a simpler implementation at this point, however if
> > any other such LSMs come along (as Casey seemed to think they would) the
> > end result could end up being horrible spaghetti code of dependencies
> > and interrelated configs.
>
> That can be addressed as the need arises. A basic tenet of kernel
> development is to avoid speculative infrastructure.

True, but while this change simplifies the code a bit, the semantics
seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
and:

SECURITY=n or
rootplug is enabled
capabilities is enabled
smack is enabled
selinux+capabilities is enabled

It will not be enforcing when
dummy is loaded
only selinux (and not capabilities) is loaded
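
Spelled out as a predicate, the two lists above look roughly like the
following. This is only a model of the cases as listed in this mail; the
sec_config struct, the lsm enum and devcg_enforcing() are hypothetical
names, not kernel interfaces.

```c
#include <stdbool.h>

enum lsm {
	LSM_NONE,
	LSM_DUMMY,
	LSM_ROOTPLUG,
	LSM_CAPABILITIES,
	LSM_SMACK,
	LSM_SELINUX
};

/* Hypothetical summary of the relevant configuration. */
struct sec_config {
	bool cgroup_dev;	/* CONFIG_CGROUP_DEV=y */
	bool security;		/* CONFIG_SECURITY=y */
	enum lsm primary;	/* registered LSM when security is set */
	bool stacked_caps;	/* capabilities stacked under SELinux */
};

/* Return true when devcg would be enforcing, per the lists above. */
bool devcg_enforcing(const struct sec_config *c)
{
	if (!c->cgroup_dev)
		return false;
	if (!c->security)
		return true;

	switch (c->primary) {
	case LSM_ROOTPLUG:
	case LSM_CAPABILITIES:
	case LSM_SMACK:
		return true;
	case LSM_SELINUX:
		return c->stacked_caps;
	default:
		return false;	/* dummy, or nothing registered */
	}
}
```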

If that's ok with people then I'm fine with it. I suppose it should be
explained in the CONFIG_CGROUP_DEV help section, which it isn't in this
version I'm about to send. Patch hitting the wire in a minute.

thanks,
-serge

> > But OTOH we went years with no such changes, so that's probably not a
> > particularly practical concern unless someone can cite specific upcoming
> > examples. So if no one objects I'll try that approach.
>
> --
> James Morris
> <[email protected]>

2008-03-13 22:28:12

by James Morris

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Thu, 13 Mar 2008, Serge E. Hallyn wrote:

> True, but while this change simplifies the code a bit, the semantics
> seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> and:
>
> SECURITY=n or
> rootplug is enabled
> capabilities is enabled
> smack is enabled
> selinux+capabilities is enabled

Well, this is how real systems are going to be deployed.

It becomes confusing, IMHO, if you have to change which secondary LSM you
stack with SELinux to enable a cgroup feature.

--
James Morris
<[email protected]>

2008-03-13 22:46:31

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting James Morris ([email protected]):
> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>
> > True, but while this change simplifies the code a bit, the semantics
> > seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> > and:
> >
> > SECURITY=n or
> > rootplug is enabled
> > capabilities is enabled
> > smack is enabled
> > selinux+capabilities is enabled
>
> Well, this is how real systems are going to be deployed.

Sorry, do you mean with capabilities?

> It becomes confusing, IMHO, if you have to change which secondary LSM you
> stack with SELinux to enable a cgroup feature.

So you're saying selinux without capabilities should still be able to
use dev_cgroup? (Just making sure I understand right)

-serge
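
For reference, the enforcement cases listed so far in the thread can be
written down as a small truth function. This is only a sketch under stated
assumptions: the enum lsm_choice and the function devcg_enforcing() are
invented for illustration and are not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical summary of the security configuration; these names are
 * illustrative only and do not come from the patch. */
enum lsm_choice {
	LSM_NONE,          /* CONFIG_SECURITY=n */
	LSM_DUMMY,         /* only the dummy module is loaded */
	LSM_ROOTPLUG,
	LSM_CAPABILITIES,
	LSM_SMACK,
	LSM_SELINUX_ONLY,  /* selinux without capabilities stacked */
	LSM_SELINUX_CAPS   /* selinux stacked with capabilities */
};

/* Returns true when the device whitelist would be enforced, following
 * the cases enumerated in the thread. */
bool devcg_enforcing(bool cgroup_dev_enabled, enum lsm_choice lsm)
{
	if (!cgroup_dev_enabled)
		return false;

	switch (lsm) {
	case LSM_DUMMY:
	case LSM_SELINUX_ONLY:
		return false;
	default:
		return true;
	}
}
```

Written this way, the two non-enforcing cases stand out as the exceptions,
which is the asymmetry being discussed.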

2008-03-13 23:49:36

by James Morris

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Thu, 13 Mar 2008, Serge E. Hallyn wrote:

> Quoting James Morris ([email protected]):
> > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> >
> > > True, but while this change simplifies the code a bit, the semantics
> > > seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> > > and:
> > >
> > > SECURITY=n or
> > > rootplug is enabled
> > > capabilities is enabled
> > > smack is enabled
> > > selinux+capabilities is enabled
> >
> > Well, this is how real systems are going to be deployed.
>
> Sorry, do you mean with capabilities?

Yes.

All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
imagine not enabling them on other kernels.

> > It becomes confusing, IMHO, if you have to change which secondary LSM you
> > stack with SELinux to enable a cgroup feature.
>
> So you're saying selinux without capabilities should still be able to
> use dev_cgroup? (Just making sure I understand right)

Nope, SELinux always stacks with capabilities, so having the cgroup hooks
in capabilities makes sense (rather than having us change the secondary
stacking LSM just to enable a feature).





--
James Morris
<[email protected]>

2008-03-14 01:41:33

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting James Morris ([email protected]):
> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>
> > Quoting James Morris ([email protected]):
> > > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> > >
> > > > True, but while this change simplifies the code a bit, the semantics
> > > > seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> > > > and:
> > > >
> > > > SECURITY=n or
> > > > rootplug is enabled
> > > > capabilities is enabled
> > > > smack is enabled
> > > > selinux+capabilities is enabled
> > >
> > > Well, this is how real systems are going to be deployed.
> >
> > Sorry, do you mean with capabilities?
>
> Yes.
>
> All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
> imagine not enabling them on other kernels.
>
> > > It becomes confusing, IMHO, if you have to change which secondary LSM you
> > > stack with SELinux to enable a cgroup feature.
> >
> > So you're saying selinux without capabilities should still be able to
> > use dev_cgroup? (Just making sure I understand right)
>
> Nope, SELinux always stacks with capabilities, so having the cgroup hooks
> in capabilities makes sense (rather than having us change the secondary
> stacking LSM just to enable a feature).

Oh, ok.

Will let the patch stand until Pavel and Greg comment then.

thanks,
-serge

2008-03-14 02:51:29

by Casey Schaufler

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)


--- James Morris <[email protected]> wrote:

> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>
> > Quoting James Morris ([email protected]):
> > > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> > >
> > > > True, but while this change simplifies the code a bit, the semantics
> > > > seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> > > > and:
> > > >
> > > > SECURITY=n or
> > > > rootplug is enabled
> > > > capabilities is enabled
> > > > smack is enabled
> > > > selinux+capabilities is enabled
> > >
> > > Well, this is how real systems are going to be deployed.
> >
> > Sorry, do you mean with capabilities?
>
> Yes.
>
> All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
> imagine not enabling them on other kernels.
>
> > > It becomes confusing, IMHO, if you have to change which secondary LSM you
> > > stack with SELinux to enable a cgroup feature.
> >
> > So you're saying selinux without capabilities should still be able to
> > use dev_cgroup? (Just making sure I understand right)
>
> Nope, SELinux always stacks with capabilities, so having the cgroup hooks
> in capabilities makes sense (rather than having us change the secondary
> stacking LSM just to enable a feature).

That's what I was getting at. When the next feature comes along
are we going to stuff it into capabilities, too? Maybe we'll
cram it into audit or CIPSO instead, but how long can this go on?
Eventually we need a mechanism that allows more or less general
mix-and-match, maybe with a few rules like "don't mix plaids and
stripes" to keep things sane or these lesser facilities have no
chance. Seems like we're still making LSM too hard to use.

Unless I take an aggressive approach to adding them to Smack.
Hmm, might be a way to make a buck or two that way.
(Only kidding)(Well, mostly only kidding)

And yes, I understand that there are still those about who
don't like LSM in any form, much less in a useful one.


Casey Schaufler
[email protected]

2008-03-14 05:00:28

by Greg KH

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Thu, Mar 13, 2008 at 08:41:21PM -0500, Serge E. Hallyn wrote:
> Quoting James Morris ([email protected]):
> > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> >
> > > Quoting James Morris ([email protected]):
> > > > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> > > >
> > > > > True, but while this change simplifies the code a bit, the semantics
> > > > > seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> > > > > and:
> > > > >
> > > > > SECURITY=n or
> > > > > rootplug is enabled
> > > > > capabilities is enabled
> > > > > smack is enabled
> > > > > selinux+capabilities is enabled
> > > >
> > > > Well, this is how real systems are going to be deployed.
> > >
> > > Sorry, do you mean with capabilities?
> >
> > Yes.
> >
> > All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
> > imagine not enabling them on other kernels.
> >
> > > > It becomes confusing, IMHO, if you have to change which secondary LSM you
> > > > stack with SELinux to enable a cgroup feature.
> > >
> > > So you're saying selinux without capabilities should still be able to
> > > use dev_cgroup? (Just making sure I understand right)
> >
> > Nope, SELinux always stacks with capabilities, so having the cgroup hooks
> > in capabilities makes sense (rather than having us change the secondary
> > stacking LSM just to enable a feature).
>
> Oh, ok.
>
> Will let the patch stand until Pavel and Greg comment then.

My main question was why was that file in the kernel/ directory?
Shouldn't that also be in the security/ directory?

And to be honest, I didn't really look at it at all other than the
diffstat to make sure you weren't messing with the kobj_map stuff
anymore :)

thanks,

greg k-h

2008-03-14 09:17:10

by Paul Menage

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Wed, Mar 12, 2008 at 8:27 PM, Serge E. Hallyn <[email protected]> wrote:
>
> While composing this with the ns_cgroup may seem logical, it is not
> the right thing to do, because updates to /cg/cg1/devcg.deny are
> not reflected in /cg/cg1/cg2/devcg.allow.

Maybe you should follow up the tree to ensure that all parent groups
have access to the device too? Or alternatively, cache the results of
this lookup whenever permissions for a device change?

>
> A task may only be moved to another devcgroup if it is moving to
> a direct descendent of its current devcgroup.

What's the rationale for that?

>
> CAP_NS_OVERRIDE is defined as the capability needed to cross namespaces.
> A task needs both CAP_NS_OVERRIDE and CAP_SYS_ADMIN to create a new
> devcgroup, update a devcgroup's access, or move a task to a new
> devcgroup.

But this isn't necessarily crossing namespaces. It could be used for
device control in the same namespace (e.g. allowing a job to access a
raw disk for its data storage rather than going through the
filesystem).

Paul
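
The first suggestion above (walking up the tree so that every parent group
must also allow the device) can be sketched as follows. This is a user-space
illustration under simplifying assumptions, not code from the patch: struct
devgroup, group_allows() and allowed_in_hierarchy() are hypothetical names,
and each group holds a single device number instead of a full whitelist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified group: it allows either everything or exactly
 * one (major, minor) pair. */
struct devgroup {
	const struct devgroup *parent;  /* NULL for the root group */
	bool allow_all;
	unsigned int major;
	unsigned int minor;
};

/* Check against this group's own (simplified) whitelist only. */
bool group_allows(const struct devgroup *g,
		  unsigned int major, unsigned int minor)
{
	return g->allow_all || (g->major == major && g->minor == minor);
}

/* Access is granted only if the group and every ancestor allow it. */
bool allowed_in_hierarchy(const struct devgroup *g,
			  unsigned int major, unsigned int minor)
{
	for (; g != NULL; g = g->parent)
		if (!group_allows(g, major, minor))
			return false;
	return true;
}
```

With such a walk, a child that still carries a stale "allow all" entry can no
longer exceed what its parent permits, at the cost of one check per level of
the hierarchy.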

2008-03-14 09:18:26

by Paul Menage

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Wed, Mar 12, 2008 at 8:27 PM, Serge E. Hallyn <[email protected]> wrote:
> Implement a cgroup using the LSM interface to enforce mknod and open
> on device files.
>
> This implements a simple device access whitelist. A whitelist entry
> has 4 fields. 'type' is a (all), c (char), or b (block). 'all' means it
> applies to all types, all major numbers, and all minor numbers. Major and
> minor are obvious. Access is a composition of r (read), w (write), and
> m (mknod).
>
> The root devcgroup starts with rwm to 'all'. A child devcg gets a copy
> of the parent. Admins can then add and remove devices to the whitelist.
> Once CAP_HOST_ADMIN is introduced it will be needed to add entries as
> well or remove entries from another cgroup, though just CAP_SYS_ADMIN
> will suffice to remove entries for your own group.
>
> An entry is added by doing "echo <type> <maj> <min> <access>" > devcg.allow,
> for instance:
>
> echo b 7 0 mrw > /cgroups/1/devcg.allow
>
> An entry is removed by doing likewise into devcg.deny. Since this is a
> pure whitelist, not acls, you can only remove entries which exist in the
> whitelist. You must explicitly
>
> echo a 0 0 mrw > /cgroups/1/devcg.deny
>
> to remove the "allow all" entry which is automatically inherited from
> the root cgroup.

In keeping with the naming convention for control groups, "devices"
would be better than "devcg".

Paul

2008-03-14 09:33:05

by Pavel Emelyanov

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Serge E. Hallyn wrote:
> Quoting James Morris ([email protected]):
>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>>
>>> Quoting James Morris ([email protected]):
>>>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>>>>
>>>>> True, but while this change simplifies the code a bit, the semantics
>>>>> seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
>>>>> and:
>>>>>
>>>>> SECURITY=n or
>>>>> rootplug is enabled
>>>>> capabilities is enabled
>>>>> smack is enabled
>>>>> selinux+capabilities is enabled
>>>> Well, this is how real systems are going to be deployed.
>>> Sorry, do you mean with capabilities?
>> Yes.
>>
>> All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
>> imagine not enabling them on other kernels.
>>
>>>> It becomes confusing, IMHO, if you have to change which secondary LSM you
>>>> stack with SELinux to enable a cgroup feature.
>>> So you're saying selinux without capabilities should still be able to
>>> use dev_cgroup? (Just making sure I understand right)
>> Nope, SELinux always stacks with capabilities, so having the cgroup hooks
>> in capabilities makes sense (rather than having us change the secondary
>> stacking LSM just to enable a feature).
>
> Oh, ok.
>
> Will let the patch stand until Pavel and Greg comment then.

Well, I saw your previous patch, that was implemented as just another
LSM module and I liked it except for the LSM dependency.

Since this version can happily work w/o LSM, I like it too :)

> thanks,
> -serge
>

2008-03-14 13:54:30

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Greg KH ([email protected]):
> On Thu, Mar 13, 2008 at 08:41:21PM -0500, Serge E. Hallyn wrote:
> > Quoting James Morris ([email protected]):
> > > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> > >
> > > > Quoting James Morris ([email protected]):
> > > > > On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> > > > >
> > > > > > True, but while this change simplifies the code a bit, the semantics
> > > > > > seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> > > > > > and:
> > > > > >
> > > > > > SECURITY=n or
> > > > > > rootplug is enabled
> > > > > > capabilities is enabled
> > > > > > smack is enabled
> > > > > > selinux+capabilities is enabled
> > > > >
> > > > > Well, this is how real systems are going to be deployed.
> > > >
> > > > Sorry, do you mean with capabilities?
> > >
> > > Yes.
> > >
> > > All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
> > > imagine not enabling them on other kernels.
> > >
> > > > > It becomes confusing, IMHO, if you have to change which secondary LSM you
> > > > > stack with SELinux to enable a cgroup feature.
> > > >
> > > > So you're saying selinux without capabilities should still be able to
> > > > use dev_cgroup? (Just making sure I understand right)
> > >
> > > Nope, SELinux always stacks with capabilities, so having the cgroup hooks
> > > in capabilities makes sense (rather than having us change the secondary
> > > stacking LSM just to enable a feature).
> >
> > Oh, ok.
> >
> > Will let the patch stand until Pavel and Greg comment then.
>
> My main question was why was that file in the kernel/ directory?
> Shouldn't that also be in the security/ directory?

I'm using cgroups to track the tasks which should have their device
permissions restricted. Right now cgroups are all under kernel/.

> And to be honest, I didn't really look at it at all other than the
> diffstat to make sure you weren't messing with the kobj_map stuff
> anymore :)
>
> thanks,
>
> greg k-h
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-03-14 13:58:29

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Pavel Emelyanov ([email protected]):
> Serge E. Hallyn wrote:
> > Quoting James Morris ([email protected]):
> >> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> >>
> >>> Quoting James Morris ([email protected]):
> >>>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> >>>>
> >>>>> True, but while this change simplifies the code a bit, the semantics
> >>>>> seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> >>>>> and:
> >>>>>
> >>>>> SECURITY=n or
> >>>>> rootplug is enabled
> >>>>> capabilities is enabled
> >>>>> smack is enabled
> >>>>> selinux+capabilities is enabled
> >>>> Well, this is how real systems are going to be deployed.
> >>> Sorry, do you mean with capabilities?
> >> Yes.
> >>
> >> All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
> >> imagine not enabling them on other kernels.
> >>
> >>>> It becomes confusing, IMHO, if you have to change which secondary LSM you
> >>>> stack with SELinux to enable a cgroup feature.
> >>> So you're saying selinux without capabilities should still be able to
> >>> use dev_cgroup? (Just making sure I understand right)
> >> Nope, SELinux always stacks with capabilities, so having the cgroup hooks
> >> in capabilities makes sense (rather than having us change the secondary
> >> stacking LSM just to enable a feature).
> >
> > Oh, ok.
> >
> > Will let the patch stand until Pavel and Greg comment then.
>
> Well, I saw your previous patch, that was implemented as just another
> LSM module and I liked it except for the LSM dependency.

James and Stephen agree with your LSM qualms. I suppose we could add
cgroups next to the lsm hooks. I suspect Paul Menage would complain
about that (Paul?), and I do think it's silly as they are security
questions, not group tracking questions, but if it's what people want
I can send out a new patch next week.

> Since this version can happily work w/o LSM, I like it too :)

In an earlier version I asked whether you had any experience with usual
# rules per container. Do you have an idea? Right now the whitelist is
a straight list we search through linearly. If # rules is generally
tiny then I'm inclined to keep it that way...

thanks,
-serge
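
To make the "straight list we search through linearly" concrete, here is a
minimal user-space sketch. The four fields follow the patch description
(type, major, minor, access), but the struct layout, the ACC_* constants and
whitelist_allows() are assumptions made for this example, not the patch's
actual code. Wildcard major or minor numbers are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical access bits for r (read), w (write) and m (mknod). */
#define ACC_READ  1u
#define ACC_WRITE 2u
#define ACC_MKNOD 4u

/* Hypothetical layout of one whitelist entry. */
struct wl_entry {
	char type;            /* 'a' (all), 'c' (char) or 'b' (block) */
	unsigned int major;
	unsigned int minor;
	unsigned int access;  /* bitmask of ACC_* */
};

/* Linear scan: true if a single entry covers every requested access bit. */
bool whitelist_allows(const struct wl_entry *wl, size_t n, char type,
		      unsigned int major, unsigned int minor,
		      unsigned int want)
{
	for (size_t i = 0; i < n; i++) {
		const struct wl_entry *e = &wl[i];

		/* An 'a' entry matches any device; others must match exactly. */
		if (e->type != 'a' &&
		    (e->type != type || e->major != major || e->minor != minor))
			continue;
		if ((e->access & want) == want)
			return true;
	}
	return false;
}
```

For a handful of entries per group, a scan like this touches very little
memory, which is the argument for keeping the list as it is.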

2008-03-14 14:00:42

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Paul Menage ([email protected]):
> On Wed, Mar 12, 2008 at 8:27 PM, Serge E. Hallyn <[email protected]> wrote:
> > Implement a cgroup using the LSM interface to enforce mknod and open
> > on device files.
> >
> > This implements a simple device access whitelist. A whitelist entry
> > has 4 fields. 'type' is a (all), c (char), or b (block). 'all' means it
> > applies to all types, all major numbers, and all minor numbers. Major and
> > minor are obvious. Access is a composition of r (read), w (write), and
> > m (mknod).
> >
> > The root devcgroup starts with rwm to 'all'. A child devcg gets a copy
> > of the parent. Admins can then add and remove devices to the whitelist.
> > Once CAP_HOST_ADMIN is introduced it will be needed to add entries as
> > well or remove entries from another cgroup, though just CAP_SYS_ADMIN
> > will suffice to remove entries for your own group.
> >
> > An entry is added by doing "echo <type> <maj> <min> <access>" > devcg.allow,
> > for instance:
> >
> > echo b 7 0 mrw > /cgroups/1/devcg.allow
> >
> > An entry is removed by doing likewise into devcg.deny. Since this is a
> > pure whitelist, not acls, you can only remove entries which exist in the
> > whitelist. You must explicitly
> >
> > echo a 0 0 mrw > /cgroups/1/devcg.deny
> >
> > to remove the "allow all" entry which is automatically inherited from
> > the root cgroup.
>
> In keeping with the naming convention for control groups, "devices"
> would be better than "devcg".

Noted, thanks.

-serge

2008-03-14 14:05:47

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Paul Menage ([email protected]):
> On Wed, Mar 12, 2008 at 8:27 PM, Serge E. Hallyn <[email protected]> wrote:
> >
> > While composing this with the ns_cgroup may seem logical, it is not
> > the right thing to do, because updates to /cg/cg1/devcg.deny are
> > not reflected in /cg/cg1/cg2/devcg.allow.
>
> Maybe you should follow up the tree to ensure that all parent groups
> have access to the device too? Or alternatively, cache the results of
> this lookup whenever permissions for a device change?

Yes, I considered that. Alternatively additions to a parent cgroup's
.deny could be propagated to all its descendents (but not additions to
the .allow).

I've noted this as something to add to the next version.

> > A task may only be moved to another devcgroup if it is moving to
> > a direct descendent of its current devcgroup.
>
> What's the rationale for that?

To prevent it escaping to laxer device permissions, which of course only
makes sense if we do what you recommend above :)

> > CAP_NS_OVERRIDE is defined as the capability needed to cross namespaces.
> > A task needs both CAP_NS_OVERRIDE and CAP_SYS_ADMIN to create a new
> > devcgroup, update a devcgroup's access, or move a task to a new
> > devcgroup.
>
> But this isn't necessarily crossing namespaces. It could be used for
> device control in the same namespace (e.g. allowing a job to access a
> raw disk for its data storage rather than going through the
> filesystem).

Yeah it should be renamed. I want to use the same cap which we would
use for user namespaces though. CAP_NS_CONT(ainer)? Even though there
really is no such thing as a 'container'. But that would tie together
any such privileges for cgroups and namespaces.

thanks,
-serge
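
The alternative mentioned above (propagating additions to a parent's .deny to
all descendants, but not additions to .allow) could look roughly like this.
It is a sketch only: struct devgroup, devgroup_deny() and devgroup_allow()
are hypothetical, and a single access bitmask stands in for the whole
whitelist.

```c
#include <stddef.h>

/* Hypothetical, much-simplified group: one bitmask replaces the list. */
struct devgroup {
	unsigned int access;            /* bits currently allowed */
	struct devgroup *first_child;   /* NULL if there are no children */
	struct devgroup *next_sibling;  /* NULL at the end of the sibling list */
};

/* A deny removes the bits from the group and from every descendant. */
void devgroup_deny(struct devgroup *g, unsigned int bits)
{
	g->access &= ~bits;
	for (struct devgroup *c = g->first_child; c != NULL; c = c->next_sibling)
		devgroup_deny(c, bits);
}

/* An allow only touches the group it was written to. */
void devgroup_allow(struct devgroup *g, unsigned int bits)
{
	g->access |= bits;
}
```

Compared with checking every ancestor at access time, this moves the cost to
the (rare) policy update and keeps the per-access check local to one group.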

2008-03-14 14:12:23

by Paul Menage

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Fri, Mar 14, 2008 at 6:58 AM, Serge E. Hallyn <[email protected]> wrote:
> James and Stephen agree with your LSM qualms. I suppose we could add
> cgroups next to the lsm hooks. I suspect Paul Menage would complain
> about that (Paul?),

Depends on what you mean by "add cgroups to the LSM hooks". Could you
expand on that?

Paul

2008-03-14 14:15:49

by Pavel Emelyanov

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Serge E. Hallyn wrote:
> Quoting Pavel Emelyanov ([email protected]):
>> Serge E. Hallyn wrote:
>>> Quoting James Morris ([email protected]):
>>>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>>>>
>>>>> Quoting James Morris ([email protected]):
>>>>>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
>>>>>>
>>>>>>> True, but while this change simplifies the code a bit, the semantics
>>>>>>> seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
>>>>>>> and:
>>>>>>>
>>>>>>> SECURITY=n or
>>>>>>> rootplug is enabled
>>>>>>> capabilities is enabled
>>>>>>> smack is enabled
>>>>>>> selinux+capabilities is enabled
>>>>>> Well, this is how real systems are going to be deployed.
>>>>> Sorry, do you mean with capabilities?
>>>> Yes.
>>>>
>>>> All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
>>>> imagine not enabling them on other kernels.
>>>>
>>>>>> It becomes confusing, IMHO, if you have to change which secondary LSM you
>>>>>> stack with SELinux to enable a cgroup feature.
>>>>> So you're saying selinux without capabilities should still be able to
>>>>> use dev_cgroup? (Just making sure I understand right)
>>>> Nope, SELinux always stacks with capabilities, so having the cgroup hooks
>>>> in capabilities makes sense (rather than having us change the secondary
>>>> stacking LSM just to enable a feature).
>>> Oh, ok.
>>>
>>> Will let the patch stand until Pavel and Greg comment then.
>> Well, I saw your previous patch, that was implemented as just another
>> LSM module and I liked it except for the LSM dependency.
>
> James and Stephen agree with your LSM qualms. I suppose we could add

Thanks!

> cgroups next to the lsm hooks. I suspect Paul Menage would complain
> about that (Paul?), and I do think it's silly as they are security
> questions, not group tracking questions, but if it's what people want
> I can send out a new patch next week.

The way I see this is: cgroups provide a common way to group tasks
and an API for general configuration - that's the controller "face",
and it's up to the controller to decide where he turns his "back",
IOW where the hooks are placed. For the memory controller - they are
injected directly into the mm code. For this controller, I think it
would be OK to use LSM or about-LSM hooks.

>> Since this version can happily work w/o LSM, I like it too :)
>
> In an earlier version I asked whether you had any experience with usual
> # rules per container. Do you have an idea? Right now the whitelist is
> a straight list we search through linearly. If # rules is generally
> tiny then I'm inclined to keep it that way...

The # of rules usually has a linear dependency on the number of containers
(each of them has to have access to /dev/null,zero,random at least), so
having 100 containers we will have to scan through a 300-entries list. I'd
vote for a hash table or a radix/binary/rb tree for that. Or any other way
for non-linear search you can provide :)

> thanks,
> -serge
>

2008-03-14 14:16:04

by Paul Menage

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Fri, Mar 14, 2008 at 7:05 AM, Serge E. Hallyn <[email protected]> wrote:
> > > A task may only be moved to another devcgroup if it is moving to
> > > a direct descendent of its current devcgroup.
> >
> > What's the rationale for that?
>
> To prevent it escaping to laxer device permissions, which of course only
> makes sense if we do what you recommend above :)
>

That makes it impossible for a root process to enter a child cgroup,
do something, and then go back to its own cgroup. Why aren't the
existing cgroup security semantics sufficient?

Paul

2008-03-14 14:35:47

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Paul Menage ([email protected]):
> On Fri, Mar 14, 2008 at 7:05 AM, Serge E. Hallyn <[email protected]> wrote:
> > > > A task may only be moved to another devcgroup if it is moving to
> > > > a direct descendent of its current devcgroup.
> > >
> > > What's the rationale for that?
> >
> > To prevent it escaping to laxer device permissions, which of course only
> > makes sense if we do what you recommend above :)
> >
>
> That makes it impossible for a root process to enter a child cgroup,
> do something, and then go back to its own cgroup.

Yes, but it can fire off a child in the child cgroup to do something,
and carry on in its own cgroup when the child finishes.

> Why aren't the
> existing cgroup security semantics sufficient?

Because the point of this is to provide some restrictions to otherwise
privileged users, and cgroups only provides dac-based permissions.

But that doesn't mean that I'm not doing too much. I could just add a
CAP_SYS_ADMIN or CAP_CONT_OVERRIDE+CAP_SYS_ADMIN check, and not restrict
which cgroups a task can move to. Does that sound good?

-serge
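
The two options being weighed here can be put side by side in a short sketch.
Everything below is illustrative: struct devgroup, is_descendant() and the
two may_move_*() helpers are hypothetical, and "direct descendent" is read as
"any group below the current one", which is an assumption about the intended
rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical group node; only the parent link matters here. */
struct devgroup {
	const struct devgroup *parent;  /* NULL for the root */
};

/* True if 'to' lies strictly below 'from' in the hierarchy. */
bool is_descendant(const struct devgroup *from, const struct devgroup *to)
{
	for (const struct devgroup *g = to->parent; g != NULL; g = g->parent)
		if (g == from)
			return true;
	return false;
}

/* Option 1: the rule from the patch description, descendants only. */
bool may_move_strict(const struct devgroup *from, const struct devgroup *to)
{
	return is_descendant(from, to);
}

/* Option 2: the relaxed alternative, a plain privilege check with no
 * restriction on the destination group. */
bool may_move_relaxed(bool has_required_caps)
{
	return has_required_caps;
}
```

Under the strict rule a task can never return to a parent group, which is the
limitation Paul points out; the relaxed rule trades that guarantee for a
capability check.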

2008-03-14 14:38:08

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Pavel Emelyanov ([email protected]):
> Serge E. Hallyn wrote:
> > Quoting Pavel Emelyanov ([email protected]):
> >> Serge E. Hallyn wrote:
> >>> Quoting James Morris ([email protected]):
> >>>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> >>>>
> >>>>> Quoting James Morris ([email protected]):
> >>>>>> On Thu, 13 Mar 2008, Serge E. Hallyn wrote:
> >>>>>>
> >>>>>>> True, but while this change simplifies the code a bit, the semantics
> >>>>>>> seem more muddled - devcg will be enforcing when CONFIG_CGROUP_DEV=y
> >>>>>>> and:
> >>>>>>>
> >>>>>>> SECURITY=n or
> >>>>>>> rootplug is enabled
> >>>>>>> capabilities is enabled
> >>>>>>> smack is enabled
> >>>>>>> selinux+capabilities is enabled
> >>>>>> Well, this is how real systems are going to be deployed.
> >>>>> Sorry, do you mean with capabilities?
> >>>> Yes.
> >>>>
> >>>> All Fedora, RHEL, CentOS etc. ship with SELinux+capabilities. I can't
> >>>> imagine not enabling them on other kernels.
> >>>>
> >>>>>> It becomes confusing, IMHO, if you have to change which secondary LSM you
> >>>>>> stack with SELinux to enable a cgroup feature.
> >>>>> So you're saying selinux without capabilities should still be able to
> >>>>> use dev_cgroup? (Just making sure I understand right)
> >>>> Nope, SELinux always stacks with capabilities, so having the cgroup hooks
> >>>> in capabilities makes sense (rather than having us change the secondary
> >>>> stacking LSM just to enable a feature).
> >>> Oh, ok.
> >>>
> >>> Will let the patch stand until Pavel and Greg comment then.
> >> Well, I saw your previous patch, that was implemented as just another
> >> LSM module and I liked it except for the LSM dependency.
> >
> > James and Stephen agree with your LSM qualms. I suppose we could add
>
> Thanks!
>
> > cgroups next to the lsm hooks. I suspect Paul Menage would complain
> > about that (Paul?), and I do think it's silly as they are security
> > questions, not group tracking questions, but if it's what people want
> > I can send out a new patch next week.
>
> The way I see this is: cgroups provide a common way to group tasks
> and an API for general configuration - that's the controller "face",
> and it's up to the controller to decide where he turns his "back",
> IOW where the hooks are placed. For the memory controller - they are
> injected directly into the mm code. For this controller, I think it
> would be OK to use LSM or about-LSM hooks.
>
> >> Since this version can happily work w/o LSM, I like it too :)
> >
> > In an earlier version I asked whether you had any experience with usual
> > # rules per container. Do you have an idea? Right now the whitelist is
> > a straight list we search through linearly. If # rules is generally
> > tiny then I'm inclined to keep it that way...
>
> The # of rules usually has a linear dependency on the number of containers
> (each of them has to have access to /dev/null,zero,random at least), so
> having 100 containers we will have to scan through a 300-entries list.

Oh no, the rules are stored per-container, so it sounds like you're
saying 3 entries per container?

> I'd
> vote for a hash table or a radix/binary/rb tree for that. Or any other way
> for non-linear search you can provide :)

I'm fine with that, but not for 3 rules :)

-serge

2008-03-14 14:42:22

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Paul Menage ([email protected]):
> On Fri, Mar 14, 2008 at 6:58 AM, Serge E. Hallyn <[email protected]> wrote:
> > James and Stephen agree with your LSM qualms. I suppose we could add
> > cgroups next to the lsm hooks. I suspect Paul Menage would complain
> > about that (Paul?),
>
> Depends on what you mean by "add cgroups to the LSM hooks". Could you
> expand on that?

cgroup hooks next to the lsm hooks. So in fs/namei.c where there are
security_inode_permission() hooks, there would also be
cgroup_inode_permission() hooks to let the devices cgroup mediate the
access. Well, in permission(), probably not in exec_permission_lite()
since that's probably not a device access :)

So far it looks like everyone likes that, so as long as you don't nack I
guess that'll be the way to go.

thanks,
-serge
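
A user-space mock of the "cgroup hook next to the lsm hook" idea might look
like the following. The hook names are the ones used in the message above,
but the signatures, struct fake_inode, the error codes and the order of the
two calls are all assumptions for illustration; the real kernel functions
take different arguments.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for an inode; each flag records what one layer would decide. */
struct fake_inode {
	bool is_device;
	bool devcg_allows;  /* verdict of the devices cgroup */
	bool lsm_allows;    /* verdict of whichever LSM is active */
};

/* Mock cgroup hook: only device nodes are mediated. */
int cgroup_inode_permission(const struct fake_inode *inode)
{
	if (inode->is_device && !inode->devcg_allows)
		return -EPERM;
	return 0;
}

/* Mock LSM hook. */
int security_inode_permission(const struct fake_inode *inode)
{
	return inode->lsm_allows ? 0 : -EACCES;
}

/* Both checks must pass; the cgroup check does not depend on any LSM. */
int permission(const struct fake_inode *inode)
{
	int err = cgroup_inode_permission(inode);

	if (err)
		return err;
	return security_inode_permission(inode);
}
```

The point of the arrangement is that the cgroup check runs whatever LSM is
configured, including none, which addresses the LSM dependency raised earlier
in the thread.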

2008-03-14 15:15:54

by Pavel Emelyanov

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

[snip]

>> My main question was why was that file in the kernel/ directory?
>> Shouldn't that also be in the security/ directory?
>
> I'm using cgroups to track the tasks which should have their device
> permissions restricted. Right now cgroups are all under kernel/.

No. Memory cgroup is under mm/ :)

>> And to be honest, I didn't really look at it at all other than the
>> diffstat to make sure you weren't messing with the kobj_map stuff
>> anymore :)
>>
>> thanks,
>>
>> greg k-h

2008-03-14 15:16:42

by Pavel Emelyanov

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

[snip]

>> The # of rules usually has a linear dependency on the number of containers
>> (each of them has to have access to /dev/null,zero,random at least), so
>> having 100 containers we will have to scan through a 300-entries list.
>
> Oh no, the rules are stored per-container, so it sounds like you're
> saying 3 entries per container?

Oops :) I've missed that part :(

>> I'd
>> vote for a hash table or a radix/binary/rb tree for that. Or any other way
>> for non-linear search you can provide :)
>
> I'm fine with that, but not for 3 rules :)

So am I :) Anyway, if someday this grows to tens of entries, turning
it into a more scalable lookup would be easy.

> -serge
>

2008-03-14 15:45:49

by Serge E. Hallyn

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Quoting Pavel Emelyanov ([email protected]):
> [snip]
>
> >> My main question was why was that file in the kernel/ directory?
> >> Shouldn't that also be in the security/ directory?
> >
> > I'm using cgroups to track the tasks which should have their device
> > permissions restricted. Right now cgroups are all under kernel/.
>
> No. Memory cgroup is under mm/ :)

Ah.

Guess it could all go under security/. Should it still go there even if
we make it not use lsm?

> >> And to be honest, I didn't really look at it at all other than the
> >> diffstat to make sure you weren't messing with the kobj_map stuff
> >> anymore :)
> >>
> >> thanks,
> >>
> >> greg k-h

2008-03-14 16:15:58

by Pavel Emelyanov

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

Serge E. Hallyn wrote:
> Quoting Pavel Emelyanov ([email protected]):
>> [snip]
>>
>>>> My main question was why was that file in the kernel/ directory?
>>>> Shouldn't that also be in the security/ directory?
>>> I'm using cgroups to track the tasks which should have their device
>>> permissions restricted. Right now cgroups are all under kernel/.
>> No. Memory cgroup is under mm/ :)
>
> Ah.
>
> Guess it could all go under security/. Should it still go there even if
> we make it not use lsm?

Sure it can - security/ is in obj-y regardless of whether
SECURITY itself is on or off :)

>>>> And to be honest, I didn't really look at it at all other than the
>>>> diffstat to make sure you weren't messing with the kobj_map stuff
>>>> anymore :)
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>

2008-03-14 16:59:51

by Stephen Smalley

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)


On Fri, 2008-03-14 at 10:45 -0500, Serge E. Hallyn wrote:
> Quoting Pavel Emelyanov ([email protected]):
> > [snip]
> >
> > >> My main question was why was that file in the kernel/ directory?
> > >> Shouldn't that also be in the security/ directory?
> > >
> > > I'm using cgroups to track the tasks which should have their device
> > > permissions restricted. Right now cgroups are all under kernel/.
> >
> > No. Memory cgroup is under mm/ :)
>
> Ah.
>
> Guess it could all go under security/. Should it still go there even if
> we make it not use lsm?

There is the precedent of the security/keys directory (security-related,
but not using LSM - aside from calling LSM hooks for access checks and
labeling of keys).

--
Stephen Smalley
National Security Agency

2008-03-16 00:57:57

by Paul Menage

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Fri, Mar 14, 2008 at 10:35 PM, Serge E. Hallyn <[email protected]> wrote:
>
> > Why aren't the
> > existing cgroup security semantics sufficient?
>
> Because the point of this is to provide some restrictions to otherwise
> privileged users, and cgroups only provides dac-based permissions.
>
> But that doesn't mean that I'm not doing too much. I could just add a
> CAP_SYS_ADMIN or CAP_CONT_OVERRIDE+CAP_SYS_ADMIN check, and not restrict
> which cgroups a task can move to. Does that sound good?

Sounds reasonable.

Paul

2008-03-16 00:59:20

by Paul Menage

Subject: Re: [RFC] cgroups: implement device whitelist lsm (v2)

On Fri, Mar 14, 2008 at 10:42 PM, Serge E. Hallyn <[email protected]> wrote:
>
> cgroup hooks next to the lsm hooks. So in fs/namei.c where there are
> security_inode_permission() hooks, there would also be
> cgroup_inode_permission() hooks to let the devices cgroup mediate the
> access. Well, in permission(), probably not in exec_permission_lite()
> since that's probably not a device access :)

This would just be a device cgroup-specific thing, right? Nothing to
do with the generic framework? If so, then that sounds fine (to me, at
least).

Paul
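
The hook placement proposed in the quoted text can be sketched in userspace as follows. This is only a mock under stated assumptions: struct mock_inode and every mock_* function are hypothetical stand-ins, not kernel code. They model a device-cgroup check being consulted right after the LSM check in the permission path, and only for device files.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; the flags exist only to drive the mock. */
struct mock_inode {
    bool is_device;  /* char or block device node */
    bool lsm_ok;     /* what the mock LSM hook would decide */
    bool devcg_ok;   /* what the mock device cgroup would decide */
};

/* Stand-in for the LSM hook (security_inode_permission() in the thread). */
int mock_security_inode_permission(const struct mock_inode *inode)
{
    return inode->lsm_ok ? 0 : -EACCES;
}

/*
 * Stand-in for the proposed cgroup_inode_permission() hook. It is specific
 * to the device cgroup, so anything that is not a device passes through.
 */
int mock_cgroup_inode_permission(const struct mock_inode *inode)
{
    if (!inode->is_device)
        return 0;
    return inode->devcg_ok ? 0 : -EPERM;
}

/* Stand-in for permission(): the LSM hook first, then the cgroup hook. */
int mock_permission(const struct mock_inode *inode)
{
    int err = mock_security_inode_permission(inode);

    if (err)
        return err;
    return mock_cgroup_inode_permission(inode);
}
```

Keeping the cgroup hook as a separate call next to the LSM hook means the device cgroup would not have to be registered as an LSM, which is the direction the thread is discussing. The error codes chosen here are arbitrary.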