2011-04-22 00:24:14

by Andi Kleen

[permalink] [raw]
Subject: Make RCU dcache work with CONFIG_SECURITY=y

We found that all .38+ kernels with CONFIG_SECURITY just enables -- but
not even any security module active -- are slower than .37. And also
they don't really scale on larger machines. CONFIG_SECURITY
is a quite common configuration, so this was seen multiple times.

The problem is that with CONFIG_SECURITY every directory permission
check will drop out of the RCU walk and redo a bunch of work
(and not scale of course), just in case the security module
cannot handle it.

This patchkit tries to address this. First it moves the check for
RCU walks into the low level security module, so for the
CONFIG_SECURITY=y selinux=0 at runtime case you always get full
performance. This is an independent patch.

Then it turned out that the two security modules who use the
inode_exec_permission hook that impacts dcache walking -- SMACK
and selinux -- already use RCU internally. So I added two
followon patches that make them not drop out of the RCU walk,
as long as they stay in their RCU "fast" path. For selinux
this means a cache hit only and no audit event. For smack
it means any check as long as auditing is disabled.

I didn't find good test suites for the security modules, so
there wasn't a lot of testing on this unfortunately
(the selinux one for LTP doesn't seem to work). Some close
review of these changes is needed.

On the other hand the VFS changes itself are very straight forward
and the 1/1 patch is very straight forward (and a win in itself)

The bottom line is with this patchkit a CONFIG_SECURITY=y
kernel has as good VFS performance as a kernel with CONFIG_SECURITY
disabled.

-Andi


2011-04-22 00:24:12

by Andi Kleen

[permalink] [raw]
Subject: [PATCH 1/3] SECURITY: Move exec_permission RCU checks into security modules

From: Andi Kleen <[email protected]>

Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.

Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y

Signed-off-by: Andi Kleen <[email protected]>
---
include/linux/security.h | 2 +-
security/capability.c | 2 +-
security/security.c | 6 ++----
security/selinux/hooks.c | 6 +++++-
security/smack/smack_lsm.c | 6 +++++-
5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index ca02f17..8ce59ef 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1456,7 +1456,7 @@ struct security_operations {
struct inode *new_dir, struct dentry *new_dentry);
int (*inode_readlink) (struct dentry *dentry);
int (*inode_follow_link) (struct dentry *dentry, struct nameidata *nd);
- int (*inode_permission) (struct inode *inode, int mask);
+ int (*inode_permission) (struct inode *inode, int mask, unsigned flags);
int (*inode_setattr) (struct dentry *dentry, struct iattr *attr);
int (*inode_getattr) (struct vfsmount *mnt, struct dentry *dentry);
int (*inode_setxattr) (struct dentry *dentry, const char *name,
diff --git a/security/capability.c b/security/capability.c
index 2984ea4..bbb5115 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -181,7 +181,7 @@ static int cap_inode_follow_link(struct dentry *dentry,
return 0;
}

-static int cap_inode_permission(struct inode *inode, int mask)
+static int cap_inode_permission(struct inode *inode, int mask, unsigned flags)
{
return 0;
}
diff --git a/security/security.c b/security/security.c
index 1011423..4ba6d4c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -518,16 +518,14 @@ int security_inode_permission(struct inode *inode, int mask)
{
if (unlikely(IS_PRIVATE(inode)))
return 0;
- return security_ops->inode_permission(inode, mask);
+ return security_ops->inode_permission(inode, mask, 0);
}

int security_inode_exec_permission(struct inode *inode, unsigned int flags)
{
if (unlikely(IS_PRIVATE(inode)))
return 0;
- if (flags)
- return -ECHILD;
- return security_ops->inode_permission(inode, MAY_EXEC);
+ return security_ops->inode_permission(inode, MAY_EXEC, flags);
}

int security_inode_setattr(struct dentry *dentry, struct iattr *attr)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index f9c3764..a73f4e4 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2635,7 +2635,7 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct nameidata *na
return dentry_has_perm(cred, NULL, dentry, FILE__READ);
}

-static int selinux_inode_permission(struct inode *inode, int mask)
+static int selinux_inode_permission(struct inode *inode, int mask, unsigned flags)
{
const struct cred *cred = current_cred();
struct common_audit_data ad;
@@ -2649,6 +2649,10 @@ static int selinux_inode_permission(struct inode *inode, int mask)
if (!mask)
return 0;

+ /* May be droppable after audit */
+ if (flags & IPERM_FLAG_RCU)
+ return -ECHILD;
+
COMMON_AUDIT_DATA_INIT(&ad, FS);
ad.u.fs.inode = inode;

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index c6f8fca..400a5d5 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -686,7 +686,7 @@ static int smack_inode_rename(struct inode *old_inode,
*
* Returns 0 if access is permitted, -EACCES otherwise
*/
-static int smack_inode_permission(struct inode *inode, int mask)
+static int smack_inode_permission(struct inode *inode, int mask, unsigned flags)
{
struct smk_audit_info ad;

@@ -696,6 +696,10 @@ static int smack_inode_permission(struct inode *inode, int mask)
*/
if (mask == 0)
return 0;
+
+ /* May be droppable after audit */
+ if (flags & IPERM_FLAG_RCU)
+ return -ECHILD;
smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_FS);
smk_ad_setfield_u_fs_inode(&ad, inode);
return smk_curacc(smk_of_inode(inode), mask, &ad);
--
1.7.4.2

2011-04-22 00:24:13

by Andi Kleen

[permalink] [raw]
Subject: [PATCH 3/3] SMACK: Make smack directory access check RCU safe

From: Andi Kleen <[email protected]>

SMACK already uses RCU internally, so except for auditing,
it's safe to not abort a RCU dcache walk.

Signed-off-by: Andi Kleen <[email protected]>
---
security/smack/smack.h | 14 ++++++++++++--
security/smack/smack_access.c | 26 ++++++++++++++++++++------
security/smack/smack_lsm.c | 5 +----
3 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/security/smack/smack.h b/security/smack/smack.h
index b449cfd..0cc17e3 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -198,8 +198,18 @@ struct inode_smack *new_inode_smack(char *);
* These functions are in smack_access.c
*/
int smk_access_entry(char *, char *, struct list_head *);
-int smk_access(char *, char *, int, struct smk_audit_info *);
-int smk_curacc(char *, u32, struct smk_audit_info *);
+int smk_access_flags(char *, char *, int, struct smk_audit_info *, int);
+static inline int smk_access(char *a, char *b, int c, struct smk_audit_info *d)
+{
+ return smk_access_flags(a, b, c, d, 0);
+}
+int smk_curacc_flags(char *, u32, struct smk_audit_info *, int vfs_flags);
+
+static inline int smk_curacc(char *a, u32 b, struct smk_audit_info *c)
+{
+ return smk_curacc_flags(a, b, c, 0);
+}
+
int smack_to_cipso(const char *, struct smack_cipso *);
void smack_from_cipso(u32, char *, char *);
char *smack_from_secid(const u32);
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 9294c5d..43b20f3 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -128,8 +128,8 @@ int smk_access_entry(char *subject_label, char *object_label,
* will be on the list, so checking the pointers may be a worthwhile
* optimization.
*/
-int smk_access(char *subject_label, char *object_label, int request,
- struct smk_audit_info *a)
+int smk_access_flags(char *subject_label, char *object_label, int request,
+ struct smk_audit_info *a, int vfs_flags)
{
int may = MAY_NOT;
int rc = 0;
@@ -194,8 +194,17 @@ int smk_access(char *subject_label, char *object_label, int request,
rc = -EACCES;
out_audit:
#ifdef CONFIG_AUDIT
- if (a)
+ if (a) {
+ /*
+ * If we're in a RCU walk try again without RCU
+ * for auditing. While in theory this may skip
+ * auditing when things change logically it is
+ * just as if the operation succeeded a bit later.
+ */
+ if (vfs_flags & IPERM_FLAG_RCU)
+ return -ECHILD;
smack_log(subject_label, object_label, request, rc, a);
+ }
#endif
return rc;
}
@@ -211,7 +220,8 @@ out_audit:
* non zero otherwise. It allows that current may have the capability
* to override the rules.
*/
-int smk_curacc(char *obj_label, u32 mode, struct smk_audit_info *a)
+int smk_curacc_flags(char *obj_label, u32 mode, struct smk_audit_info *a,
+ int vfs_flags)
{
struct task_smack *tsp = current_security();
char *sp = smk_of_task(tsp);
@@ -221,7 +231,7 @@ int smk_curacc(char *obj_label, u32 mode, struct smk_audit_info *a)
/*
* Check the global rule list
*/
- rc = smk_access(sp, obj_label, mode, NULL);
+ rc = smk_access_flags(sp, obj_label, mode, NULL, vfs_flags);
if (rc == 0) {
/*
* If there is an entry in the task's rule list
@@ -248,8 +258,12 @@ int smk_curacc(char *obj_label, u32 mode, struct smk_audit_info *a)

out_audit:
#ifdef CONFIG_AUDIT
- if (a)
+ if (a && rc != -ECHILD) {
+ /* Audit in non RCU mode */
+ if (vfs_flags & IPERM_FLAG_RCU)
+ return -ECHILD;
smack_log(sp, obj_label, mode, rc, a);
+ }
#endif
return rc;
}
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 400a5d5..366d250 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -697,12 +697,9 @@ static int smack_inode_permission(struct inode *inode, int mask, unsigned flags)
if (mask == 0)
return 0;

- /* May be droppable after audit */
- if (flags & IPERM_FLAG_RCU)
- return -ECHILD;
smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_FS);
smk_ad_setfield_u_fs_inode(&ad, inode);
- return smk_curacc(smk_of_inode(inode), mask, &ad);
+ return smk_curacc_flags(smk_of_inode(inode), mask, &ad, flags);
}

/**
--
1.7.4.2

2011-04-22 00:24:50

by Andi Kleen

[permalink] [raw]
Subject: [PATCH 2/3] SELINUX: Make selinux cache VFS RCU walks safe

From: Andi Kleen <[email protected]>

Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. selinux already uses RCU for its internal decision
cache, so this must be already RCU safe.

This patch makes the VFS RCU walk not retry in selinux if it
hit the cache, and only fallback on a cache miss.

I had to add some parameters to pass the state around, otherwise
the patch is quite simple.

Signed-off-by: Andi Kleen <[email protected]>
---
security/selinux/avc.c | 48 +++++++++++++++++++++++++++++++--------
security/selinux/hooks.c | 28 +++++++++++-----------
security/selinux/include/avc.h | 22 ++++++++++++-----
security/selinux/ss/services.c | 2 +-
4 files changed, 68 insertions(+), 32 deletions(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 9da6420..9163c5f 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -471,6 +471,7 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
* @avd: access vector decisions
* @result: result from avc_has_perm_noaudit
* @a: auxiliary audit data
+ * @flags: VFS walk flags
*
* Audit the granting or denial of permissions in accordance
* with the policy. This function is typically called by
@@ -481,9 +482,10 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
* be performed under a lock, to allow the lock to be released
* before calling the auditing code.
*/
-void avc_audit(u32 ssid, u32 tsid,
+int avc_audit(u32 ssid, u32 tsid,
u16 tclass, u32 requested,
- struct av_decision *avd, int result, struct common_audit_data *a)
+ struct av_decision *avd, int result, struct common_audit_data *a,
+ unsigned flags)
{
struct common_audit_data stack_data;
u32 denied, audited;
@@ -515,7 +517,18 @@ void avc_audit(u32 ssid, u32 tsid,
else
audited = requested & avd->auditallow;
if (!audited)
- return;
+ return 0;
+
+ /*
+ * When in a RCU walk do the audit on the RCU retry (until
+ * someone makes audit RCU safe)
+ * Note this may drop some audits when the situation changes during
+ * retry. However this is logically just as if the operation happened
+ * a little later.
+ */
+ if (flags & IPERM_FLAG_RCU)
+ return -ECHILD;
+
if (!a) {
a = &stack_data;
COMMON_AUDIT_DATA_INIT(a, NONE);
@@ -529,6 +542,7 @@ void avc_audit(u32 ssid, u32 tsid,
a->lsm_pre_audit = avc_audit_pre_callback;
a->lsm_post_audit = avc_audit_post_callback;
common_lsm_audit(a);
+ return 0;
}

/**
@@ -726,6 +740,7 @@ int avc_ss_reset(u32 seqno)
* @requested: requested permissions, interpreted based on @tclass
* @flags: AVC_STRICT or 0
* @avd: access vector decisions
+ * @vfsflags: VFS walk flags
*
* Check the AVC to determine whether the @requested permissions are granted
* for the SID pair (@ssid, @tsid), interpreting the permissions
@@ -741,7 +756,8 @@ int avc_ss_reset(u32 seqno)
int avc_has_perm_noaudit(u32 ssid, u32 tsid,
u16 tclass, u32 requested,
unsigned flags,
- struct av_decision *in_avd)
+ struct av_decision *in_avd,
+ unsigned vfsflags)
{
struct avc_node *node;
struct av_decision avd_entry, *avd;
@@ -756,6 +772,10 @@ int avc_has_perm_noaudit(u32 ssid, u32 tsid,
if (!node) {
rcu_read_unlock();

+ /* Try again later if RCU */
+ if (vfsflags & IPERM_FLAG_RCU)
+ return -ECHILD;
+
if (in_avd)
avd = in_avd;
else
@@ -793,6 +813,7 @@ int avc_has_perm_noaudit(u32 ssid, u32 tsid,
* @tclass: target security class
* @requested: requested permissions, interpreted based on @tclass
* @auditdata: auxiliary audit data
+ * @flags: VFS walk flags
*
* Check the AVC to determine whether the @requested permissions are granted
* for the SID pair (@ssid, @tsid), interpreting the permissions
@@ -802,14 +823,21 @@ int avc_has_perm_noaudit(u32 ssid, u32 tsid,
* permissions are granted, -%EACCES if any permissions are denied, or
* another -errno upon other errors.
*/
-int avc_has_perm(u32 ssid, u32 tsid, u16 tclass,
- u32 requested, struct common_audit_data *auditdata)
+int avc_has_perm_flags(u32 ssid, u32 tsid, u16 tclass,
+ u32 requested, struct common_audit_data *auditdata,
+ unsigned flags)
{
struct av_decision avd;
- int rc;
-
- rc = avc_has_perm_noaudit(ssid, tsid, tclass, requested, 0, &avd);
- avc_audit(ssid, tsid, tclass, requested, &avd, rc, auditdata);
+ int rc, rc2;
+
+ rc = avc_has_perm_noaudit(ssid, tsid, tclass, requested, 0, &avd,
+ flags);
+ if (rc == -ECHILD)
+ return rc;
+ rc2 = avc_audit(ssid, tsid, tclass, requested, &avd, rc, auditdata,
+ flags);
+ if (rc2 == -ECHILD)
+ return rc2;
return rc;
}

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a73f4e4..ea8c755 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1445,9 +1445,12 @@ static int task_has_capability(struct task_struct *tsk,
BUG();
}

- rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd);
- if (audit == SECURITY_CAP_AUDIT)
- avc_audit(sid, sid, sclass, av, &avd, rc, &ad);
+ rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd, 0);
+ if (audit == SECURITY_CAP_AUDIT) {
+ int rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad, 0);
+ if (rc2 == -ECHILD)
+ return rc2;
+ }
return rc;
}

@@ -1467,7 +1470,8 @@ static int task_has_system(struct task_struct *tsk,
static int inode_has_perm(const struct cred *cred,
struct inode *inode,
u32 perms,
- struct common_audit_data *adp)
+ struct common_audit_data *adp,
+ unsigned flags)
{
struct inode_security_struct *isec;
struct common_audit_data ad;
@@ -1487,7 +1491,7 @@ static int inode_has_perm(const struct cred *cred,
ad.u.fs.inode = inode;
}

- return avc_has_perm(sid, isec->sid, isec->sclass, perms, adp);
+ return avc_has_perm_flags(sid, isec->sid, isec->sclass, perms, adp, flags);
}

/* Same as inode_has_perm, but pass explicit audit data containing
@@ -1504,7 +1508,7 @@ static inline int dentry_has_perm(const struct cred *cred,
COMMON_AUDIT_DATA_INIT(&ad, FS);
ad.u.fs.path.mnt = mnt;
ad.u.fs.path.dentry = dentry;
- return inode_has_perm(cred, inode, av, &ad);
+ return inode_has_perm(cred, inode, av, &ad, 0);
}

/* Check whether a task can use an open file descriptor to
@@ -1540,7 +1544,7 @@ static int file_has_perm(const struct cred *cred,
/* av is zero if only checking access to the descriptor. */
rc = 0;
if (av)
- rc = inode_has_perm(cred, inode, av, &ad);
+ rc = inode_has_perm(cred, inode, av, &ad, 0);

out:
return rc;
@@ -2103,7 +2107,7 @@ static inline void flush_unauthorized_files(const struct cred *cred,
file = file_priv->file;
inode = file->f_path.dentry->d_inode;
if (inode_has_perm(cred, inode,
- FILE__READ | FILE__WRITE, NULL)) {
+ FILE__READ | FILE__WRITE, NULL, 0)) {
drop_tty = 1;
}
}
@@ -2649,10 +2653,6 @@ static int selinux_inode_permission(struct inode *inode, int mask, unsigned flag
if (!mask)
return 0;

- /* May be droppable after audit */
- if (flags & IPERM_FLAG_RCU)
- return -ECHILD;
-
COMMON_AUDIT_DATA_INIT(&ad, FS);
ad.u.fs.inode = inode;

@@ -2661,7 +2661,7 @@ static int selinux_inode_permission(struct inode *inode, int mask, unsigned flag

perms = file_mask_to_av(inode->i_mode, mask);

- return inode_has_perm(cred, inode, perms, &ad);
+ return inode_has_perm(cred, inode, perms, &ad, flags);
}

static int selinux_inode_setattr(struct dentry *dentry, struct iattr *iattr)
@@ -3209,7 +3209,7 @@ static int selinux_dentry_open(struct file *file, const struct cred *cred)
* new inode label or new policy.
* This check is not redundant - do not remove.
*/
- return inode_has_perm(cred, inode, open_file_to_av(file), NULL);
+ return inode_has_perm(cred, inode, open_file_to_av(file), NULL, 0);
}

/* task security operations */
diff --git a/security/selinux/include/avc.h b/security/selinux/include/avc.h
index 5615081..65d6e52 100644
--- a/security/selinux/include/avc.h
+++ b/security/selinux/include/avc.h
@@ -54,21 +54,29 @@ struct avc_cache_stats {

void __init avc_init(void);

-void avc_audit(u32 ssid, u32 tsid,
+int avc_audit(u32 ssid, u32 tsid,
u16 tclass, u32 requested,
struct av_decision *avd,
int result,
- struct common_audit_data *a);
+ struct common_audit_data *a, unsigned flags);

#define AVC_STRICT 1 /* Ignore permissive mode. */
int avc_has_perm_noaudit(u32 ssid, u32 tsid,
u16 tclass, u32 requested,
unsigned flags,
- struct av_decision *avd);
-
-int avc_has_perm(u32 ssid, u32 tsid,
- u16 tclass, u32 requested,
- struct common_audit_data *auditdata);
+ struct av_decision *avd, unsigned vfsflags);
+
+int avc_has_perm_flags(u32 ssid, u32 tsid,
+ u16 tclass, u32 requested,
+ struct common_audit_data *auditdata,
+ unsigned);
+
+static inline int avc_has_perm(u32 ssid, u32 tsid,
+ u16 tclass, u32 requested,
+ struct common_audit_data *auditdata)
+{
+ return avc_has_perm_flags(ssid, tsid, tclass, requested, auditdata, 0);
+}

u32 avc_policy_seqno(void);

diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 6ef4af4..4749202 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -2212,7 +2212,7 @@ out_unlock:
rc = avc_has_perm_noaudit(fromsid, mysids[i],
SECCLASS_PROCESS, /* kernel value */
PROCESS__TRANSITION, AVC_STRICT,
- NULL);
+ NULL, 0);
if (!rc)
mysids2[j++] = mysids[i];
cond_resched();
--
1.7.4.2

2011-04-22 00:45:20

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH 2/3] SELINUX: Make selinux cache VFS RCU walks safe

I'll take a close look over the weekend, but I'm pretty sure this is
even more strict than it needs to be. I looked at this a while ago
and the only RCU unsafe location I could find was in the generic LSM
'audit' code (nothing to do with the audit subsystem). That code can
do a d = d_find_alias(); dput(d). I don't think I realized the dput()
was not RCU safe at the time. We use it to come up with a name of a
dentry that might have caused the denial (although obviously not
necessarily the right name)

I could just drop that piece of functionality (and rely on the audit
subsystem for the info), but I think I'd rather do it your way. I
think I can push your flags a lot deeper than you have pushed them
(and remove them in some places you have included them). Let me look
over the next day or two....

-Eric

On Thu, Apr 21, 2011 at 8:23 PM, Andi Kleen <[email protected]> wrote:
> From: Andi Kleen <[email protected]>
>
> Now that the security modules can decide whether they support the
> dcache RCU walk or not it's possible to make selinux a bit more
> RCU friendly. selinux already uses RCU for its internal decision
> cache, so this must be already RCU safe.
>
> This patch makes the VFS RCU walk not retry in selinux if it
> hit the cache, and only fallback on a cache miss.
>
> I had to add some parameters to pass the state around, otherwise
> the patch is quite simple.
>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> ?security/selinux/avc.c ? ? ? ? | ? 48 +++++++++++++++++++++++++++++++--------
> ?security/selinux/hooks.c ? ? ? | ? 28 +++++++++++-----------
> ?security/selinux/include/avc.h | ? 22 ++++++++++++-----
> ?security/selinux/ss/services.c | ? ?2 +-
> ?4 files changed, 68 insertions(+), 32 deletions(-)
>
> diff --git a/security/selinux/avc.c b/security/selinux/avc.c
> index 9da6420..9163c5f 100644
> --- a/security/selinux/avc.c
> +++ b/security/selinux/avc.c
> @@ -471,6 +471,7 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
> ?* @avd: access vector decisions
> ?* @result: result from avc_has_perm_noaudit
> ?* @a: ?auxiliary audit data
> + * @flags: VFS walk flags
> ?*
> ?* Audit the granting or denial of permissions in accordance
> ?* with the policy. ?This function is typically called by
> @@ -481,9 +482,10 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
> ?* be performed under a lock, to allow the lock to be released
> ?* before calling the auditing code.
> ?*/
> -void avc_audit(u32 ssid, u32 tsid,
> +int avc_audit(u32 ssid, u32 tsid,
> ? ? ? ? ? ? ? u16 tclass, u32 requested,
> - ? ? ? ? ? ? ?struct av_decision *avd, int result, struct common_audit_data *a)
> + ? ? ? ? ? ? ?struct av_decision *avd, int result, struct common_audit_data *a,
> + ? ? ? ? ? ? ?unsigned flags)
> ?{
> ? ? ? ?struct common_audit_data stack_data;
> ? ? ? ?u32 denied, audited;
> @@ -515,7 +517,18 @@ void avc_audit(u32 ssid, u32 tsid,
> ? ? ? ?else
> ? ? ? ? ? ? ? ?audited = requested & avd->auditallow;
> ? ? ? ?if (!audited)
> - ? ? ? ? ? ? ? return;
> + ? ? ? ? ? ? ? return 0;
> +
> + ? ? ? /*
> + ? ? ? ?* When in a RCU walk do the audit on the RCU retry (until
> + ? ? ? ?* someone makes audit RCU safe)
> + ? ? ? ?* Note this may drop some audits when the situation changes during
> + ? ? ? ?* retry. However this is logically just as if the operation happened
> + ? ? ? ?* a little later.
> + ? ? ? ?*/
> + ? ? ? if (flags & IPERM_FLAG_RCU)
> + ? ? ? ? ? ? ? return -ECHILD;
> +
> ? ? ? ?if (!a) {
> ? ? ? ? ? ? ? ?a = &stack_data;
> ? ? ? ? ? ? ? ?COMMON_AUDIT_DATA_INIT(a, NONE);
> @@ -529,6 +542,7 @@ void avc_audit(u32 ssid, u32 tsid,
> ? ? ? ?a->lsm_pre_audit = avc_audit_pre_callback;
> ? ? ? ?a->lsm_post_audit = avc_audit_post_callback;
> ? ? ? ?common_lsm_audit(a);
> + ? ? ? return 0;
> ?}
>
> ?/**
> @@ -726,6 +740,7 @@ int avc_ss_reset(u32 seqno)
> ?* @requested: requested permissions, interpreted based on @tclass
> ?* @flags: ?AVC_STRICT or 0
> ?* @avd: access vector decisions
> + * @vfsflags: VFS walk flags
> ?*
> ?* Check the AVC to determine whether the @requested permissions are granted
> ?* for the SID pair (@ssid, @tsid), interpreting the permissions
> @@ -741,7 +756,8 @@ int avc_ss_reset(u32 seqno)
> ?int avc_has_perm_noaudit(u32 ssid, u32 tsid,
> ? ? ? ? ? ? ? ? ? ? ? ? u16 tclass, u32 requested,
> ? ? ? ? ? ? ? ? ? ? ? ? unsigned flags,
> - ? ? ? ? ? ? ? ? ? ? ? ?struct av_decision *in_avd)
> + ? ? ? ? ? ? ? ? ? ? ? ?struct av_decision *in_avd,
> + ? ? ? ? ? ? ? ? ? ? ? ?unsigned vfsflags)
> ?{
> ? ? ? ?struct avc_node *node;
> ? ? ? ?struct av_decision avd_entry, *avd;
> @@ -756,6 +772,10 @@ int avc_has_perm_noaudit(u32 ssid, u32 tsid,
> ? ? ? ?if (!node) {
> ? ? ? ? ? ? ? ?rcu_read_unlock();
>
> + ? ? ? ? ? ? ? /* Try again later if RCU */
> + ? ? ? ? ? ? ? if (vfsflags & IPERM_FLAG_RCU)
> + ? ? ? ? ? ? ? ? ? ? ? return -ECHILD;
> +
> ? ? ? ? ? ? ? ?if (in_avd)
> ? ? ? ? ? ? ? ? ? ? ? ?avd = in_avd;
> ? ? ? ? ? ? ? ?else
> @@ -793,6 +813,7 @@ int avc_has_perm_noaudit(u32 ssid, u32 tsid,
> ?* @tclass: target security class
> ?* @requested: requested permissions, interpreted based on @tclass
> ?* @auditdata: auxiliary audit data
> + * @flags: VFS walk flags
> ?*
> ?* Check the AVC to determine whether the @requested permissions are granted
> ?* for the SID pair (@ssid, @tsid), interpreting the permissions
> @@ -802,14 +823,21 @@ int avc_has_perm_noaudit(u32 ssid, u32 tsid,
> ?* permissions are granted, -%EACCES if any permissions are denied, or
> ?* another -errno upon other errors.
> ?*/
> -int avc_has_perm(u32 ssid, u32 tsid, u16 tclass,
> - ? ? ? ? ? ? ? ?u32 requested, struct common_audit_data *auditdata)
> +int avc_has_perm_flags(u32 ssid, u32 tsid, u16 tclass,
> + ? ? ? ? ? ? ? ? ? ? ?u32 requested, struct common_audit_data *auditdata,
> + ? ? ? ? ? ? ? ? ? ? ?unsigned flags)
> ?{
> ? ? ? ?struct av_decision avd;
> - ? ? ? int rc;
> -
> - ? ? ? rc = avc_has_perm_noaudit(ssid, tsid, tclass, requested, 0, &avd);
> - ? ? ? avc_audit(ssid, tsid, tclass, requested, &avd, rc, auditdata);
> + ? ? ? int rc, rc2;
> +
> + ? ? ? rc = avc_has_perm_noaudit(ssid, tsid, tclass, requested, 0, &avd,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? flags);
> + ? ? ? if (rc == -ECHILD)
> + ? ? ? ? ? ? ? return rc;
> + ? ? ? rc2 = avc_audit(ssid, tsid, tclass, requested, &avd, rc, auditdata,
> + ? ? ? ? ? ? ? ? ? ? ? flags);
> + ? ? ? if (rc2 == -ECHILD)
> + ? ? ? ? ? ? ? return rc2;
> ? ? ? ?return rc;
> ?}
>
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index a73f4e4..ea8c755 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -1445,9 +1445,12 @@ static int task_has_capability(struct task_struct *tsk,
> ? ? ? ? ? ? ? ?BUG();
> ? ? ? ?}
>
> - ? ? ? rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd);
> - ? ? ? if (audit == SECURITY_CAP_AUDIT)
> - ? ? ? ? ? ? ? avc_audit(sid, sid, sclass, av, &avd, rc, &ad);
> + ? ? ? rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd, 0);
> + ? ? ? if (audit == SECURITY_CAP_AUDIT) {
> + ? ? ? ? ? ? ? int rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad, 0);
> + ? ? ? ? ? ? ? if (rc2 == -ECHILD)
> + ? ? ? ? ? ? ? ? ? ? ? return rc2;
> + ? ? ? }
> ? ? ? ?return rc;
> ?}
>
> @@ -1467,7 +1470,8 @@ static int task_has_system(struct task_struct *tsk,
> ?static int inode_has_perm(const struct cred *cred,
> ? ? ? ? ? ? ? ? ? ? ? ? ?struct inode *inode,
> ? ? ? ? ? ? ? ? ? ? ? ? ?u32 perms,
> - ? ? ? ? ? ? ? ? ? ? ? ? struct common_audit_data *adp)
> + ? ? ? ? ? ? ? ? ? ? ? ? struct common_audit_data *adp,
> + ? ? ? ? ? ? ? ? ? ? ? ? unsigned flags)
> ?{
> ? ? ? ?struct inode_security_struct *isec;
> ? ? ? ?struct common_audit_data ad;
> @@ -1487,7 +1491,7 @@ static int inode_has_perm(const struct cred *cred,
> ? ? ? ? ? ? ? ?ad.u.fs.inode = inode;
> ? ? ? ?}
>
> - ? ? ? return avc_has_perm(sid, isec->sid, isec->sclass, perms, adp);
> + ? ? ? return avc_has_perm_flags(sid, isec->sid, isec->sclass, perms, adp, flags);
> ?}
>
> ?/* Same as inode_has_perm, but pass explicit audit data containing
> @@ -1504,7 +1508,7 @@ static inline int dentry_has_perm(const struct cred *cred,
> ? ? ? ?COMMON_AUDIT_DATA_INIT(&ad, FS);
> ? ? ? ?ad.u.fs.path.mnt = mnt;
> ? ? ? ?ad.u.fs.path.dentry = dentry;
> - ? ? ? return inode_has_perm(cred, inode, av, &ad);
> + ? ? ? return inode_has_perm(cred, inode, av, &ad, 0);
> ?}
>
> ?/* Check whether a task can use an open file descriptor to
> @@ -1540,7 +1544,7 @@ static int file_has_perm(const struct cred *cred,
> ? ? ? ?/* av is zero if only checking access to the descriptor. */
> ? ? ? ?rc = 0;
> ? ? ? ?if (av)
> - ? ? ? ? ? ? ? rc = inode_has_perm(cred, inode, av, &ad);
> + ? ? ? ? ? ? ? rc = inode_has_perm(cred, inode, av, &ad, 0);
>
> ?out:
> ? ? ? ?return rc;
> @@ -2103,7 +2107,7 @@ static inline void flush_unauthorized_files(const struct cred *cred,
> ? ? ? ? ? ? ? ? ? ? ? ?file = file_priv->file;
> ? ? ? ? ? ? ? ? ? ? ? ?inode = file->f_path.dentry->d_inode;
> ? ? ? ? ? ? ? ? ? ? ? ?if (inode_has_perm(cred, inode,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?FILE__READ | FILE__WRITE, NULL)) {
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?FILE__READ | FILE__WRITE, NULL, 0)) {
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?drop_tty = 1;
> ? ? ? ? ? ? ? ? ? ? ? ?}
> ? ? ? ? ? ? ? ?}
> @@ -2649,10 +2653,6 @@ static int selinux_inode_permission(struct inode *inode, int mask, unsigned flag
> ? ? ? ?if (!mask)
> ? ? ? ? ? ? ? ?return 0;
>
> - ? ? ? /* May be droppable after audit */
> - ? ? ? if (flags & IPERM_FLAG_RCU)
> - ? ? ? ? ? ? ? return -ECHILD;
> -
> ? ? ? ?COMMON_AUDIT_DATA_INIT(&ad, FS);
> ? ? ? ?ad.u.fs.inode = inode;
>
> @@ -2661,7 +2661,7 @@ static int selinux_inode_permission(struct inode *inode, int mask, unsigned flag
>
> ? ? ? ?perms = file_mask_to_av(inode->i_mode, mask);
>
> - ? ? ? return inode_has_perm(cred, inode, perms, &ad);
> + ? ? ? return inode_has_perm(cred, inode, perms, &ad, flags);
> ?}
>
> ?static int selinux_inode_setattr(struct dentry *dentry, struct iattr *iattr)
> @@ -3209,7 +3209,7 @@ static int selinux_dentry_open(struct file *file, const struct cred *cred)
> ? ? ? ? * new inode label or new policy.
> ? ? ? ? * This check is not redundant - do not remove.
> ? ? ? ? */
> - ? ? ? return inode_has_perm(cred, inode, open_file_to_av(file), NULL);
> + ? ? ? return inode_has_perm(cred, inode, open_file_to_av(file), NULL, 0);
> ?}
>
> ?/* task security operations */
> diff --git a/security/selinux/include/avc.h b/security/selinux/include/avc.h
> index 5615081..65d6e52 100644
> --- a/security/selinux/include/avc.h
> +++ b/security/selinux/include/avc.h
> @@ -54,21 +54,29 @@ struct avc_cache_stats {
>
> ?void __init avc_init(void);
>
> -void avc_audit(u32 ssid, u32 tsid,
> +int avc_audit(u32 ssid, u32 tsid,
> ? ? ? ? ? ? ? u16 tclass, u32 requested,
> ? ? ? ? ? ? ? struct av_decision *avd,
> ? ? ? ? ? ? ? int result,
> - ? ? ? ? ? ? ?struct common_audit_data *a);
> + ? ? ? ? ? ? struct common_audit_data *a, unsigned flags);
>
> ?#define AVC_STRICT 1 /* Ignore permissive mode. */
> ?int avc_has_perm_noaudit(u32 ssid, u32 tsid,
> ? ? ? ? ? ? ? ? ? ? ? ? u16 tclass, u32 requested,
> ? ? ? ? ? ? ? ? ? ? ? ? unsigned flags,
> - ? ? ? ? ? ? ? ? ? ? ? ?struct av_decision *avd);
> -
> -int avc_has_perm(u32 ssid, u32 tsid,
> - ? ? ? ? ? ? ? ?u16 tclass, u32 requested,
> - ? ? ? ? ? ? ? ?struct common_audit_data *auditdata);
> + ? ? ? ? ? ? ? ? ? ? ? ?struct av_decision *avd, unsigned vfsflags);
> +
> +int avc_has_perm_flags(u32 ssid, u32 tsid,
> + ? ? ? ? ? ? ? ? ? ? ?u16 tclass, u32 requested,
> + ? ? ? ? ? ? ? ? ? ? ?struct common_audit_data *auditdata,
> + ? ? ? ? ? ? ? ? ? ? ?unsigned);
> +
> +static inline int avc_has_perm(u32 ssid, u32 tsid,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?u16 tclass, u32 requested,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?struct common_audit_data *auditdata)
> +{
> + ? ? ? return avc_has_perm_flags(ssid, tsid, tclass, requested, auditdata, 0);
> +}
>
> ?u32 avc_policy_seqno(void);
>
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index 6ef4af4..4749202 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -2212,7 +2212,7 @@ out_unlock:
> ? ? ? ? ? ? ? ?rc = avc_has_perm_noaudit(fromsid, mysids[i],
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?SECCLASS_PROCESS, /* kernel value */
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?PROCESS__TRANSITION, AVC_STRICT,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? NULL);
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? NULL, 0);
> ? ? ? ? ? ? ? ?if (!rc)
> ? ? ? ? ? ? ? ? ? ? ? ?mysids2[j++] = mysids[i];
> ? ? ? ? ? ? ? ?cond_resched();
> --
> 1.7.4.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>

2011-04-22 00:46:55

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH 1/3] SECURITY: Move exec_permission RCU checks into security modules

On Thu, Apr 21, 2011 at 8:23 PM, Andi Kleen <[email protected]> wrote:
> From: Andi Kleen <[email protected]>
>
> Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
> is enabled, even though just the standard capability module is active.
> This is because security_inode_exec_permission unconditionally fails
> RCU walks.
>
> Move this decision to the low level security module. This requires
> passing the RCU flags down the security hook. This way at least
> the capability module and a few easy cases in selinux/smack work
> with RCU walks with CONFIG_SECURITY=y
>
> Signed-off-by: Andi Kleen <[email protected]>

Acked-by: Eric Paris <[email protected]>

> ---
> ?include/linux/security.h ? | ? ?2 +-
> ?security/capability.c ? ? ?| ? ?2 +-
> ?security/security.c ? ? ? ?| ? ?6 ++----
> ?security/selinux/hooks.c ? | ? ?6 +++++-
> ?security/smack/smack_lsm.c | ? ?6 +++++-
> ?5 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/security.h b/include/linux/security.h
> index ca02f17..8ce59ef 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -1456,7 +1456,7 @@ struct security_operations {
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *new_dir, struct dentry *new_dentry);
> ? ? ? ?int (*inode_readlink) (struct dentry *dentry);
> ? ? ? ?int (*inode_follow_link) (struct dentry *dentry, struct nameidata *nd);
> - ? ? ? int (*inode_permission) (struct inode *inode, int mask);
> + ? ? ? int (*inode_permission) (struct inode *inode, int mask, unsigned flags);
> ? ? ? ?int (*inode_setattr) ? ?(struct dentry *dentry, struct iattr *attr);
> ? ? ? ?int (*inode_getattr) (struct vfsmount *mnt, struct dentry *dentry);
> ? ? ? ?int (*inode_setxattr) (struct dentry *dentry, const char *name,
> diff --git a/security/capability.c b/security/capability.c
> index 2984ea4..bbb5115 100644
> --- a/security/capability.c
> +++ b/security/capability.c
> @@ -181,7 +181,7 @@ static int cap_inode_follow_link(struct dentry *dentry,
> ? ? ? ?return 0;
> ?}
>
> -static int cap_inode_permission(struct inode *inode, int mask)
> +static int cap_inode_permission(struct inode *inode, int mask, unsigned flags)
> ?{
> ? ? ? ?return 0;
> ?}
> diff --git a/security/security.c b/security/security.c
> index 1011423..4ba6d4c 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -518,16 +518,14 @@ int security_inode_permission(struct inode *inode, int mask)
> ?{
> ? ? ? ?if (unlikely(IS_PRIVATE(inode)))
> ? ? ? ? ? ? ? ?return 0;
> - ? ? ? return security_ops->inode_permission(inode, mask);
> + ? ? ? return security_ops->inode_permission(inode, mask, 0);
> ?}
>
> ?int security_inode_exec_permission(struct inode *inode, unsigned int flags)
> ?{
> ? ? ? ?if (unlikely(IS_PRIVATE(inode)))
> ? ? ? ? ? ? ? ?return 0;
> - ? ? ? if (flags)
> - ? ? ? ? ? ? ? return -ECHILD;
> - ? ? ? return security_ops->inode_permission(inode, MAY_EXEC);
> + ? ? ? return security_ops->inode_permission(inode, MAY_EXEC, flags);
> ?}
>
> ?int security_inode_setattr(struct dentry *dentry, struct iattr *attr)
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index f9c3764..a73f4e4 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -2635,7 +2635,7 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct nameidata *na
> ? ? ? ?return dentry_has_perm(cred, NULL, dentry, FILE__READ);
> ?}
>
> -static int selinux_inode_permission(struct inode *inode, int mask)
> +static int selinux_inode_permission(struct inode *inode, int mask, unsigned flags)
> ?{
> ? ? ? ?const struct cred *cred = current_cred();
> ? ? ? ?struct common_audit_data ad;
> @@ -2649,6 +2649,10 @@ static int selinux_inode_permission(struct inode *inode, int mask)
> ? ? ? ?if (!mask)
> ? ? ? ? ? ? ? ?return 0;
>
> + ? ? ? /* May be droppable after audit */
> + ? ? ? if (flags & IPERM_FLAG_RCU)
> + ? ? ? ? ? ? ? return -ECHILD;
> +
> ? ? ? ?COMMON_AUDIT_DATA_INIT(&ad, FS);
> ? ? ? ?ad.u.fs.inode = inode;
>
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index c6f8fca..400a5d5 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -686,7 +686,7 @@ static int smack_inode_rename(struct inode *old_inode,
> ?*
> ?* Returns 0 if access is permitted, -EACCES otherwise
> ?*/
> -static int smack_inode_permission(struct inode *inode, int mask)
> +static int smack_inode_permission(struct inode *inode, int mask, unsigned flags)
> ?{
> ? ? ? ?struct smk_audit_info ad;
>
> @@ -696,6 +696,10 @@ static int smack_inode_permission(struct inode *inode, int mask)
> ? ? ? ? */
> ? ? ? ?if (mask == 0)
> ? ? ? ? ? ? ? ?return 0;
> +
> + ? ? ? /* May be droppable after audit */
> + ? ? ? if (flags & IPERM_FLAG_RCU)
> + ? ? ? ? ? ? ? return -ECHILD;
> ? ? ? ?smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_FS);
> ? ? ? ?smk_ad_setfield_u_fs_inode(&ad, inode);
> ? ? ? ?return smk_curacc(smk_of_inode(inode), mask, &ad);
> --
> 1.7.4.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>

2011-04-22 01:40:53

by Shaohua Li

[permalink] [raw]
Subject: Re: Make RCU dcache work with CONFIG_SECURITY=y

On Fri, 2011-04-22 at 08:23 +0800, Andi Kleen wrote:
> We found that all .38+ kernels with CONFIG_SECURITY just enables -- but
> not even any security module active -- are slower than .37. And also
> they don't really scale on larger machines. CONFIG_SECURITY
> is a quite common configuration, so this was seen multiple times.
>
> The problem is that with CONFIG_SECURITY every directory permission
> check will drop out of the RCU walk and redo a bunch of work
> (and not scale of course), just in case the security module
> cannot handle it.
>
> This patchkit tries to address this. First it moves the check for
> RCU walks into the low level security module, so for the
> CONFIG_SECURITY=y selinux=0 at runtime case you always get full
> performance. This is an independent patch.
>
> Then it turned out that the two security modules who use the
> inode_exec_permission hook that impacts dcache walking -- SMACK
> and selinux -- already use RCU internally. So I added two
> followon patches that make them not drop out of the RCU walk,
> as long as they stay in their RCU "fast" path. For selinux
> this means a cache hit only and no audit event. For smack
> it means any check as long as auditing is disabled.
>
> I didn't find good test suites for the security modules, so
> there wasn't a lot of testing on this unfortunately
> (the selinux one for LTP doesn't seem to work). Some close
> review of these changes is needed.
>
> On the other hand the VFS changes itself are very straight forward
> and the 1/1 patch is very straight forward (and a win in itself)
>
> The bottom line is with this patchkit a CONFIG_SECURITY=y
> kernel has as good VFS performance as a kernel with CONFIG_SECURITY
> disabled.
Great! This fully recovers the dbench regression I reported back:
http://marc.info/?l=linux-fsdevel&m=129591574123544&w=2
and we get a better throughput than 2.6.37

Thanks,
Shaohua

2011-04-22 04:34:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/3] SECURITY: Move exec_permission RCU checks into security modules

On Thu, Apr 21, 2011 at 05:23:19PM -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
> is enabled, even though just the standard capability module is active.
> This is because security_inode_exec_permission unconditionally fails
> RCU walks.
>
> Move this decision to the low level security module. This requires
> passing the RCU flags down the security hook. This way at least
> the capability module and a few easy cases in selinux/smack work
> with RCU walks with CONFIG_SECURITY=y
>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> include/linux/security.h | 2 +-
> security/capability.c | 2 +-
> security/security.c | 6 ++----
> security/selinux/hooks.c | 6 +++++-
> security/smack/smack_lsm.c | 6 +++++-
> 5 files changed, 14 insertions(+), 8 deletions(-)

This seems to miss the hunk in fs/namei.c where the LSM hook is called.

2011-04-22 15:17:30

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2/3] SELINUX: Make selinux cache VFS RCU walks safe

On Thu, Apr 21, 2011 at 08:45:17PM -0400, Eric Paris wrote:
> I'll take a close look over the weekend, but I'm pretty sure this is
> even more strict than it needs to be. I looked at this a while ago
> and the only RCU unsafe location I could find was in the generic LSM
> 'audit' code (nothing to do with the audit subsystem). That code can
> do a d = d_find_alias(); dput(d). I don't think I realized the dput()
> was not RCU safe at the time. We use it to come up with a name of a
> dentry that might have caused the denial (although obviously not
> necessarily the right name)
>
> I could just drop that piece of functionality (and rely on the audit
> subsystem for the info), but I think I'd rather do it your way. I
> think I can push your flags a lot deeper than you have pushed them
> (and remove them in some places you have included them). Let me look
> over the next day or two....

Sounds good. I would prefer to do that as a follow on patch to make
this patch not even more complicated. Is that ok for you?

Also the same approach could be applied to SMACK then I guess.

-Andi

2011-04-22 15:26:06

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/3] SECURITY: Move exec_permission RCU checks into security modules

> This seems to miss the hunk in fs/namei.c where the LSM hook is called.

namei.c doesn't need changes because it already passes in the flags
The changed semantics only "start" in security/security.c

Thanks for review.

-Andi

--
[email protected] -- Speaking for myself only

2011-04-22 15:27:51

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/3] SECURITY: Move exec_permission RCU checks into security modules

On Fri, Apr 22, 2011 at 08:25:20AM -0700, Andi Kleen wrote:
> > This seems to miss the hunk in fs/namei.c where the LSM hook is called.
>
> namei.c doesn't need changes because it already passes in the flags
> The changed semantics only "start" in security/security.c

You're right, sorry.

2011-04-22 18:26:37

by Linus Torvalds

[permalink] [raw]
Subject: Re: Make RCU dcache work with CONFIG_SECURITY=y

On Thu, Apr 21, 2011 at 5:23 PM, Andi Kleen <[email protected]> wrote:
>
> I didn't find good test suites for the security modules, so
> there wasn't a lot of testing on this unfortunately
> (the selinux one for LTP doesn't seem to work). Some close
> review of these changes is needed.
>
> On the other hand the VFS changes itself are very straight forward
> and the 1/1 patch is very straight forward (and a win in itself)
>
> The bottom line is with this patchkit a CONFIG_SECURITY=y
> kernel has as good VFS performance as a kernel with CONFIG_SECURITY
> disabled.

Gaah. My immediate reaction to the patch-series was "This is great, I
was really hoping we could get all those annoying cases sorted out,
and I'll queue them for the next merge window".

Having then actually read through the patches a bit more, I then got
convinced that at least the first patch should probably be applied
right away and be marked for stable, since it looks pretty damn
obvious to me, and it might already on its own fix the performance
regression for some configurations (although realistically I guess few
enough people really do the "selinux=0" thing, so the big advantage is
making easier to backport the other patches later if we don't do them
now).

And now I'm vacillating about the two later patches too. They look
fine to me, but I really have _zero_ familiarity with selinux and
smack internals, so unlike the first patch, I can't go "that looks
like the obviously right thing, and it clearly catches all the RCU
cases".

The "we can't use all the nifty RCU pathwalk in the config that most
distros ship by default" is clearly a performance regression, and has
meant that it's not been really showing its real advantages for most
people. So in that sense, it's a regression fix and thus valid even
though we're pretty late in the -rc series.

But at the same time, it's also a bit scary.

Comments? I'd really like to see/hear feedback like "yeah, this looks
really obviously safe" vs "yeah, looks good, but I really don't feel
very comfortable with it" from the security people.

Linus

2011-04-22 21:16:56

by Andi Kleen

[permalink] [raw]
Subject: Re: Make RCU dcache work with CONFIG_SECURITY=y

On Fri, Apr 22, 2011 at 11:26:09AM -0700, Linus Torvalds wrote:
> On Thu, Apr 21, 2011 at 5:23 PM, Andi Kleen <[email protected]> wrote:
> >
> > I didn't find good test suites for the security modules, so
> > there wasn't a lot of testing on this unfortunately
> > (the selinux one for LTP doesn't seem to work). Some close
> > review of these changes is needed.
> >
> > On the other hand the VFS changes itself are very straight forward
> > and the 1/1 patch is very straight forward (and a win in itself)
> >
> > The bottom line is with this patchkit a CONFIG_SECURITY=y
> > kernel has as good VFS performance as a kernel with CONFIG_SECURITY
> > disabled.
>
> Gaah. My immediate reaction to the patch-series was "This is great, I
> was really hoping we could get all those annoying cases sorted out,
> and I'll queue them for the next merge window".
>
> Having then actually read through the patches a bit more, I then got
> convinced that at least the first patch should probably be applied
> right away and be marked for stable, since it looks pretty damn
> obvious to me, and it might already on its own fix the performance
> regression for some configurations (although realistically I guess few
> enough people really do the "selinux=0" thing, so the big advantage is
> making easier to backport the other patches later if we don't do them
> now).

Yes I agree. The first patch is (nearly) a no-brainer and already
has significant benefits. I would like to see it in .39.

> Comments? I'd really like to see/hear feedback like "yeah, this looks
> really obviously safe" vs "yeah, looks good, but I really don't feel
> very comfortable with it" from the security people.

Especially SMACK review is needed. Or maybe selinux only for now,
already got one ack for that.

(BTW I have some doubts on the locking in smack in general,
but that's a separate issue -- see other thread)

-Andi

--
[email protected] -- Speaking for myself only.

2011-04-22 21:17:50

by Eric Paris

[permalink] [raw]
Subject: Re: Make RCU dcache work with CONFIG_SECURITY=y

On Fri, 2011-04-22 at 11:26 -0700, Linus Torvalds wrote:
> On Thu, Apr 21, 2011 at 5:23 PM, Andi Kleen <[email protected]> wrote:
> >
> > I didn't find good test suites for the security modules, so
> > there wasn't a lot of testing on this unfortunately
> > (the selinux one for LTP doesn't seem to work). Some close
> > review of these changes is needed.
> >
> > On the other hand the VFS changes itself are very straight forward
> > and the 1/1 patch is very straight forward (and a win in itself)
> >
> > The bottom line is with this patchkit a CONFIG_SECURITY=y
> > kernel has as good VFS performance as a kernel with CONFIG_SECURITY
> > disabled.
>
> Gaah. My immediate reaction to the patch-series was "This is great, I
> was really hoping we could get all those annoying cases sorted out,
> and I'll queue them for the next merge window".
>
> Having then actually read through the patches a bit more, I then got
> convinced that at least the first patch should probably be applied
> right away and be marked for stable, since it looks pretty damn
> obvious to me, and it might already on its own fix the performance
> regression for some configurations (although realistically I guess few
> enough people really do the "selinux=0" thing, so the big advantage is
> making easier to backport the other patches later if we don't do them
> now).
>
> And now I'm vacillating about the two later patches too. They look
> fine to me, but I really have _zero_ familiarity with selinux and
> smack internals, so unlike the first patch, I can't go "that looks
> like the obviously right thing, and it clearly catches all the RCU
> cases".
>
> The "we can't use all the nifty RCU pathwalk in the config that most
> distros ship by default" is clearly a performance regression, and has
> meant that it's not been really showing its real advantages for most
> people. So in that sense, it's a regression fix and thus valid even
> though we're pretty late in the -rc series.
>
> But at the same time, it's also a bit scary.
>
> Comments? I'd really like to see/hear feedback like "yeah, this looks
> really obviously safe" vs "yeah, looks good, but I really don't feel
> very comfortable with it" from the security people.

>From an SELinux PoV (And I feel semi confident about SMACK) patch 1/3
can and should go in as soon as you want. 2/3 looked safe on first
glance, but I think I can make it smaller and better. I'll try to get a
version of 2/3 on list in the next couple of days. My first thought was
that it should probably go in via my SELinux tree next window......

-Eric

2011-04-22 21:32:55

by Casey Schaufler

[permalink] [raw]
Subject: Re: Make RCU dcache work with CONFIG_SECURITY=y

On 4/22/2011 2:16 PM, Andi Kleen wrote:
> On Fri, Apr 22, 2011 at 11:26:09AM -0700, Linus Torvalds wrote:
>> On Thu, Apr 21, 2011 at 5:23 PM, Andi Kleen <[email protected]> wrote:
>>> I didn't find good test suites for the security modules, so
>>> there wasn't a lot of testing on this unfortunately
>>> (the selinux one for LTP doesn't seem to work). Some close
>>> review of these changes is needed.
>>>
>>> On the other hand the VFS changes itself are very straight forward
>>> and the 1/1 patch is very straight forward (and a win in itself)
>>>
>>> The bottom line is with this patchkit a CONFIG_SECURITY=y
>>> kernel has as good VFS performance as a kernel with CONFIG_SECURITY
>>> disabled.
>> Gaah. My immediate reaction to the patch-series was "This is great, I
>> was really hoping we could get all those annoying cases sorted out,
>> and I'll queue them for the next merge window".
>>
>> Having then actually read through the patches a bit more, I then got
>> convinced that at least the first patch should probably be applied
>> right away and be marked for stable, since it looks pretty damn
>> obvious to me, and it might already on its own fix the performance
>> regression for some configurations (although realistically I guess few
>> enough people really do the "selinux=0" thing, so the big advantage is
>> making easier to backport the other patches later if we don't do them
>> now).
> Yes I agree. The first patch is (nearly) a no-brainer and already
> has significant benefits. I would like to see it in .39.
>
>> Comments? I'd really like to see/hear feedback like "yeah, this looks
>> really obviously safe" vs "yeah, looks good, but I really don't feel
>> very comfortable with it" from the security people.
> Especially SMACK review is needed.

I am happy to get all the help I can on this. I am not now
nor have I ever been especially comfortable with sophisticated
locking models. Where possible I have written code with minimal
locking requirements, but sometimes you just can't avoid it. I
have been fortunate in that several people have offered advice
in the past.

> Or maybe selinux only for now,
> already got one ack for that.
>
> (BTW I have some doubts on the locking in smack in general,
> but that's a separate issue -- see other thread)
>
> -Andi
>

2011-04-22 23:30:13

by Linus Torvalds

[permalink] [raw]
Subject: Re: Make RCU dcache work with CONFIG_SECURITY=y

On Fri, Apr 22, 2011 at 2:17 PM, Eric Paris <[email protected]> wrote:
> On Fri, 2011-04-22 at 11:26 -0700, Linus Torvalds wrote:
>>
>> Comments? I'd really like to see/hear feedback like "yeah, this looks
>> really obviously safe" vs "yeah, looks good, but I really don't feel
>> very comfortable with it" from the security people.
>
> From an SELinux PoV (And I feel semi confident about SMACK) patch 1/3
> can and should go in as soon as you want. ?2/3 looked safe on first
> glance, but I think I can make it smaller and better. ?I'll try to get a
> version of 2/3 on list in the next couple of days. ?My first thought was
> that it should probably go in via my SELinux tree next window......

Ok, given the pretty obvious nature of 1/3 and no objections, I
committed that one.

Let's see what the situation with 2/3 is. It would be good to not have
slower path lookup for all the selinux-using people, but whatever.

Linus