2014-01-09 15:19:50

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

While running stress tests on adding and deleting ftrace instances I
hit this bug:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
PGD 63681067 PUD 7ddbe067 PMD 0
Oops: 0000 [#1] PREEMPT
CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
Stack:
0000000000000081 ffff88006ee46000 0000000000000081 ffffffff812d8b45
ffff88006ee46000 0000000000000081 ffff880078375800 ffff880078375800
ffff88007ddb1c98 ffffffff812d358c ffff88007ddb1cb8 ffffffff811364f1
Call Trace:^M
[<ffffffff812d8b45>] ? selinux_inode_permission+0x5/0x160
[<ffffffff812d358c>] security_inode_permission+0x1c/0x30
[<ffffffff811364f1>] __inode_permission+0x41/0xa0
[<ffffffff81136568>] inode_permission+0x18/0x50
[<ffffffff811378b6>] link_path_walk+0x66/0x920
[<ffffffff810875a5>] ? __rcu_read_lock+0x5/0x20
[<ffffffff8113a9e6>] path_openat+0xa6/0x6c0
[<ffffffff8113b000>] ? path_openat+0x6c0/0x6c0
[<ffffffff810c4e69>] ? __trace_graph_entry+0x49/0xc0
[<ffffffff8112b196>] ? do_sys_open+0x146/0x240
[<ffffffff8113b043>] do_filp_open+0x43/0xa0
[<ffffffff8113b005>] ? do_filp_open+0x5/0xa0
[<ffffffff8112b196>] do_sys_open+0x146/0x240
[<ffffffff8112b2ae>] SyS_open+0x1e/0x20
[<ffffffff81948cd0>] system_call_fastpath+0x16/0x1b
Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
RIP [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP <ffff88007ddb1c48>
CR2: 0000000000000020
---[ end trace 9d800e5ac5059462 ]---


Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():
----
isec = inode->i_security;

rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
----

Note, the crash came from stressing the deletion and reading of debugfs
files. I was not able to recreate this via normal files. But I'm not
sure they are safe. It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct. What needs to happen
is the inode->i_security needs to be checked for NULL, and the
inode_security_struct needs to be freed after a synchronize_rcu(). A
simple check is added to handle the first case, and a call_rcu() was
added to handle the second.

Cc: [email protected]
Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 419491d..f2b2d41 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -234,6 +234,14 @@ static int inode_alloc_security(struct inode *inode)
return 0;
}

+static void inode_free_rcu(struct rcu_head *head)
+{
+ struct inode_security_struct *isec;
+
+ isec = container_of(head, struct inode_security_struct, rcu);
+ kmem_cache_free(sel_inode_cache, isec);
+}
+
static void inode_free_security(struct inode *inode)
{
struct inode_security_struct *isec = inode->i_security;
@@ -245,7 +253,7 @@ static void inode_free_security(struct inode *inode)
spin_unlock(&sbsec->isec_lock);

inode->i_security = NULL;
- kmem_cache_free(sel_inode_cache, isec);
+ call_rcu(&isec->rcu, inode_free_rcu);
}

static int file_alloc_security(struct file *file)
@@ -2781,6 +2789,10 @@ static int selinux_inode_permission(struct inode *inode, int mask)
sid = cred_sid(cred);
isec = inode->i_security;

+ /* inode is being destroyed */
+ if (!isec)
+ return 0;
+
rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
audited = avc_audit_required(perms, &avd, rc,
from_access ? FILE__AUDIT_ACCESS : 0,
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index b1dfe10..078e553 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -38,7 +38,10 @@ struct task_security_struct {

struct inode_security_struct {
struct inode *inode; /* back pointer to inode object */
- struct list_head list; /* list of inode_security_struct */
+ union {
+ struct list_head list; /* list of inode_security_struct */
+ struct rcu_head rcu; /* for freeing the inode_security_struct */
+ };
u32 task_sid; /* SID of creating task */
u32 sid; /* SID of this object */
u16 sclass; /* security class of this object */


2014-01-09 15:32:08

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

Didn't Al find this/something very similar. I really hate this
solution. Why should every LSM try to understand the intimate
lifetime rules of the parent subsystems? The real problem is that
inode_free_security() is being called while the inode is still in use.
While I agree with the assessment, I disagree with the solution. Let
me try to find where Al and Christoph talked about this....

On Thu, Jan 9, 2014 at 10:19 AM, Steven Rostedt <[email protected]> wrote:
> While running stress tests on adding and deleting ftrace instances I
> hit this bug:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
> IP: [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
> PGD 63681067 PUD 7ddbe067 PMD 0
> Oops: 0000 [#1] PREEMPT
> CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
> Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
> task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
> RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
> RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
> RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
> RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
> R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
> R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
> FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
> CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
> Stack:
> 0000000000000081 ffff88006ee46000 0000000000000081 ffffffff812d8b45
> ffff88006ee46000 0000000000000081 ffff880078375800 ffff880078375800
> ffff88007ddb1c98 ffffffff812d358c ffff88007ddb1cb8 ffffffff811364f1
> Call Trace:^M
> [<ffffffff812d8b45>] ? selinux_inode_permission+0x5/0x160
> [<ffffffff812d358c>] security_inode_permission+0x1c/0x30
> [<ffffffff811364f1>] __inode_permission+0x41/0xa0
> [<ffffffff81136568>] inode_permission+0x18/0x50
> [<ffffffff811378b6>] link_path_walk+0x66/0x920
> [<ffffffff810875a5>] ? __rcu_read_lock+0x5/0x20
> [<ffffffff8113a9e6>] path_openat+0xa6/0x6c0
> [<ffffffff8113b000>] ? path_openat+0x6c0/0x6c0
> [<ffffffff810c4e69>] ? __trace_graph_entry+0x49/0xc0
> [<ffffffff8112b196>] ? do_sys_open+0x146/0x240
> [<ffffffff8113b043>] do_filp_open+0x43/0xa0
> [<ffffffff8113b005>] ? do_filp_open+0x5/0xa0
> [<ffffffff8112b196>] do_sys_open+0x146/0x240
> [<ffffffff8112b2ae>] SyS_open+0x1e/0x20
> [<ffffffff81948cd0>] system_call_fastpath+0x16/0x1b
> Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
> RIP [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
> RSP <ffff88007ddb1c48>
> CR2: 0000000000000020
> ---[ end trace 9d800e5ac5059462 ]---
>
>
> Investigating, I found that the inode->i_security was NULL, and the
> dereference of it caused the oops.
>
> in selinux_inode_permission():
> ----
> isec = inode->i_security;
>
> rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
> ----
>
> Note, the crash came from stressing the deletion and reading of debugfs
> files. I was not able to recreate this via normal files. But I'm not
> sure they are safe. It may just be that the race window is much harder
> to hit.
>
> What seems to have happened (and what I have traced), is the file is
> being opened at the same time the file or directory is being deleted.
> As the dentry and inode locks are not held during the path walk, nor is
> the inodes ref counts being incremented, there is nothing saving these
> structures from being discarded except for an rcu_read_lock().
>
> The rcu_read_lock() protects against freeing of the inode, but it does
> not protect freeing of the inode_security_struct. What needs to happen
> is the inode->i_security needs to be checked for NULL, and the
> inode_security_struct needs to be freed after a synchronize_rcu(). A
> simple check is added to handle the first case, and a call_rcu() was
> added to handle the second.
>
> Cc: [email protected]
> Signed-off-by: Steven Rostedt <[email protected]>
>
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 419491d..f2b2d41 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -234,6 +234,14 @@ static int inode_alloc_security(struct inode *inode)
> return 0;
> }
>
> +static void inode_free_rcu(struct rcu_head *head)
> +{
> + struct inode_security_struct *isec;
> +
> + isec = container_of(head, struct inode_security_struct, rcu);
> + kmem_cache_free(sel_inode_cache, isec);
> +}
> +
> static void inode_free_security(struct inode *inode)
> {
> struct inode_security_struct *isec = inode->i_security;
> @@ -245,7 +253,7 @@ static void inode_free_security(struct inode *inode)
> spin_unlock(&sbsec->isec_lock);
>
> inode->i_security = NULL;
> - kmem_cache_free(sel_inode_cache, isec);
> + call_rcu(&isec->rcu, inode_free_rcu);
> }
>
> static int file_alloc_security(struct file *file)
> @@ -2781,6 +2789,10 @@ static int selinux_inode_permission(struct inode *inode, int mask)
> sid = cred_sid(cred);
> isec = inode->i_security;
>
> + /* inode is being destroyed */
> + if (!isec)
> + return 0;
> +
> rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
> audited = avc_audit_required(perms, &avd, rc,
> from_access ? FILE__AUDIT_ACCESS : 0,
> diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
> index b1dfe10..078e553 100644
> --- a/security/selinux/include/objsec.h
> +++ b/security/selinux/include/objsec.h
> @@ -38,7 +38,10 @@ struct task_security_struct {
>
> struct inode_security_struct {
> struct inode *inode; /* back pointer to inode object */
> - struct list_head list; /* list of inode_security_struct */
> + union {
> + struct list_head list; /* list of inode_security_struct */
> + struct rcu_head rcu; /* for freeing the inode_security_struct */
> + };
> u32 task_sid; /* SID of creating task */
> u32 sid; /* SID of this object */
> u16 sclass; /* security class of this object */
>

2014-01-09 15:51:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, 9 Jan 2014 10:31:55 -0500
Eric Paris <[email protected]> wrote:

> Didn't Al find this/something very similar. I really hate this

I'm not involved with the vfs, so I'm unaware of other solutions
presented. I just hit this now and solving bugs is where I get a chance
to learn about other aspects of the kernel. ;-)

> solution. Why should every LSM try to understand the intimate
> lifetime rules of the parent subsystems? The real problem is that
> inode_free_security() is being called while the inode is still in use.
> While I agree with the assessment, I disagree with the solution. Let
> me try to find where Al and Christoph talked about this....
>

The other obvious solution (but not as trivial to implement) is to call
the security_inode_free() and friends (probably __destroy_inode()
itself) after a synchronize_rcu().

Perhaps something like this?

-- Steve

diff --git a/fs/inode.c b/fs/inode.c
index 4bcdad3..a8f3b88 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -252,16 +252,17 @@ EXPORT_SYMBOL(__destroy_inode);
static void i_callback(struct rcu_head *head)
{
struct inode *inode = container_of(head, struct inode, i_rcu);
+ __destroy_inode(inode);
kmem_cache_free(inode_cachep, inode);
}

static void destroy_inode(struct inode *inode)
{
BUG_ON(!list_empty(&inode->i_lru));
- __destroy_inode(inode);
- if (inode->i_sb->s_op->destroy_inode)
+ if (inode->i_sb->s_op->destroy_inode) {
+ __destroy_inode(inode);
inode->i_sb->s_op->destroy_inode(inode);
- else
+ } else
call_rcu(&inode->i_rcu, i_callback);
}

2014-01-09 15:57:33

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, 2014-01-09 at 10:51 -0500, Steven Rostedt wrote:
> On Thu, 9 Jan 2014 10:31:55 -0500
> Eric Paris <[email protected]> wrote:
>
> > Didn't Al find this/something very similar. I really hate this
>
> I'm not involved with the vfs, so I'm unaware of other solutions
> presented. I just hit this now and solving bugs is where I get a chance
> to learn about other aspects of the kernel. ;-)
>
> > solution. Why should every LSM try to understand the intimate
> > lifetime rules of the parent subsystems? The real problem is that
> > inode_free_security() is being called while the inode is still in use.
> > While I agree with the assessment, I disagree with the solution. Let
> > me try to find where Al and Christoph talked about this....
> >
>
> The other obvious solution (but not as trivial to implement) is to call
> the security_inode_free() and friends (probably __destroy_inode()
> itself) after a synchronize_rcu().
>
> Perhaps something like this?

I can't for the life of me find that conversation! Maybe I'm just
making it all up... Usually I forget conversations, not remember ones
that didn't happen...

Assuming the VFS guys say that delaying __destroy_inode() is safe like
that, I like it better. It also means that this is fixed for all LSMs,
not just SELinux...

-Eric

>
> -- Steve
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 4bcdad3..a8f3b88 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -252,16 +252,17 @@ EXPORT_SYMBOL(__destroy_inode);
> static void i_callback(struct rcu_head *head)
> {
> struct inode *inode = container_of(head, struct inode, i_rcu);
> + __destroy_inode(inode);
> kmem_cache_free(inode_cachep, inode);
> }
>
> static void destroy_inode(struct inode *inode)
> {
> BUG_ON(!list_empty(&inode->i_lru));
> - __destroy_inode(inode);
> - if (inode->i_sb->s_op->destroy_inode)
> + if (inode->i_sb->s_op->destroy_inode) {
> + __destroy_inode(inode);
> inode->i_sb->s_op->destroy_inode(inode);
> - else
> + } else
> call_rcu(&inode->i_rcu, i_callback);
> }
>

2014-01-09 16:06:05

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

[adding lsm and selinux]

Am I just crazy, or was this bug discussed (and obviously not fixed)
some time ago?

VFS can still use inodes after security_inode_free_security() was
called...

On Thu, 2014-01-09 at 10:57 -0500, Eric Paris wrote:
> On Thu, 2014-01-09 at 10:51 -0500, Steven Rostedt wrote:
> > On Thu, 9 Jan 2014 10:31:55 -0500
> > Eric Paris <[email protected]> wrote:
> >
> > > Didn't Al find this/something very similar. I really hate this
> >
> > I'm not involved with the vfs, so I'm unaware of other solutions
> > presented. I just hit this now and solving bugs is where I get a chance
> > to learn about other aspects of the kernel. ;-)
> >
> > > solution. Why should every LSM try to understand the intimate
> > > lifetime rules of the parent subsystems? The real problem is that
> > > inode_free_security() is being called while the inode is still in use.
> > > While I agree with the assessment, I disagree with the solution. Let
> > > me try to find where Al and Christoph talked about this....
> > >
> >
> > The other obvious solution (but not as trivial to implement) is to call
> > the security_inode_free() and friends (probably __destroy_inode()
> > itself) after a synchronize_rcu().
> >
> > Perhaps something like this?
>
> I can't for the life of me find that conversation! Maybe I'm just
> making it all up... Usually I forget conversations, not remember ones
> that didn't happen...
>
> Assuming the VFS guys say that delaying __destroy_inode() is safe like
> that, I like it better. It also means that this is fixed for all LSMs,
> not just SELinux...
>
> -Eric
>
> >
> > -- Steve
> >
> > diff --git a/fs/inode.c b/fs/inode.c
> > index 4bcdad3..a8f3b88 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -252,16 +252,17 @@ EXPORT_SYMBOL(__destroy_inode);
> > static void i_callback(struct rcu_head *head)
> > {
> > struct inode *inode = container_of(head, struct inode, i_rcu);
> > + __destroy_inode(inode);
> > kmem_cache_free(inode_cachep, inode);
> > }
> >
> > static void destroy_inode(struct inode *inode)
> > {
> > BUG_ON(!list_empty(&inode->i_lru));
> > - __destroy_inode(inode);
> > - if (inode->i_sb->s_op->destroy_inode)
> > + if (inode->i_sb->s_op->destroy_inode) {
> > + __destroy_inode(inode);
> > inode->i_sb->s_op->destroy_inode(inode);
> > - else
> > + } else
> > call_rcu(&inode->i_rcu, i_callback);
> > }
> >
>

2014-01-09 16:12:12

by Stephen Smalley

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On 01/09/2014 11:05 AM, Eric Paris wrote:
> [adding lsm and selinux]
>
> Am I just crazy, or was this bug discussed (and obviously not fixed)
> some time ago?
>
> VFS can still use inodes after security_inode_free_security() was
> called...

I didn't know that was the case; originally when we added the hook it
was not possible. I have seen a Red Hat bugzilla report about it,
but no upstream discussion.

2014-01-09 16:23:04

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, 09 Jan 2014 11:10:05 -0500
Stephen Smalley <[email protected]> wrote:

> I didn't know that was the case; originally when we added the hook it
> was not possible. I have seen a Red Hat bugzilla report about it,
> but no upstream discussion.

Maybe not enough people use SELinux ;-)

Do you know what the Red Hat BZ # was?

-- Steve

2014-01-09 16:25:28

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

https://bugzilla.redhat.com/show_bug.cgi?id=829715

at least has some discussion...

On Thu, Jan 9, 2014 at 11:22 AM, Steven Rostedt <[email protected]> wrote:
> On Thu, 09 Jan 2014 11:10:05 -0500
> Stephen Smalley <[email protected]> wrote:
>
>> I didn't know that was the case; originally when we added the hook it
>> was not possible. I have seen a Red Hat bugzilla report about it,
>> but no upstream discussion.
>
> Maybe not enough people use SELinux ;-)
>
> Do you know what the Red Hat BZ # was?
>
> -- Steve
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2014-01-09 20:21:04

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, 2014-01-09 at 11:10 -0500, Stephen Smalley wrote:
> On 01/09/2014 11:05 AM, Eric Paris wrote:
> > [adding lsm and selinux]
> >
> > Am I just crazy, or was this bug discussed (and obviously not fixed)
> > some time ago?
> >
> > VFS can still use inodes after security_inode_free_security() was
> > called...
>
> I didn't know that was the case; originally when we added the hook it
> was not possible. I have seen a Red Hat bugzilla report about it,
> but no upstream discussion.

For those of us that don't have access to the RH bugzilla, can someone
please summarize the problem?

thanks,

Mimi


2014-01-09 20:24:33

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, Jan 9, 2014 at 3:20 PM, Mimi Zohar <[email protected]> wrote:

> For those of us that don't have access to the RH bugzilla, can someone
> please summarize the problem?

The upstream discussion (nothing really useful in the bug other than a
link to it) is here.

http://oss.sgi.com/archives/xfs/2013-11/msg00709.html

2014-01-09 22:13:18

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, Jan 09, 2014 at 10:31:55AM -0500, Eric Paris wrote:
> Didn't Al find this/something very similar. I really hate this
> solution. Why should every LSM try to understand the intimate
> lifetime rules of the parent subsystems? The real problem is that
> inode_free_security() is being called while the inode is still in use.
> While I agree with the assessment, I disagree with the solution. Let
> me try to find where Al and Christoph talked about this....

Because LSM has stuck its fingers into the guts of those filesystems,
obviously.

Just RCU-delay freeing the damn thing and treat NULL ->i_security in
->permission() (which can happen only with MAY_NOT_BLOCK in mask) as
"return -ECHILD and let the caller deal with that".

Modifying every ->destroy_inode() is obviously wrong - there's a lot more
filesystems than LSM buggers in the tree.

2014-01-09 22:17:20

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, Jan 09, 2014 at 10:51:14AM -0500, Steven Rostedt wrote:

> diff --git a/fs/inode.c b/fs/inode.c
> index 4bcdad3..a8f3b88 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -252,16 +252,17 @@ EXPORT_SYMBOL(__destroy_inode);
> static void i_callback(struct rcu_head *head)
> {
> struct inode *inode = container_of(head, struct inode, i_rcu);
> + __destroy_inode(inode);
> kmem_cache_free(inode_cachep, inode);
> }
>
> static void destroy_inode(struct inode *inode)
> {
> BUG_ON(!list_empty(&inode->i_lru));
> - __destroy_inode(inode);
> - if (inode->i_sb->s_op->destroy_inode)
> + if (inode->i_sb->s_op->destroy_inode) {
> + __destroy_inode(inode);
> inode->i_sb->s_op->destroy_inode(inode);
> - else
> + } else
> call_rcu(&inode->i_rcu, i_callback);
> }
>
No go - idiotify and friends grab mutexes from fsnotify_inode_delete().
Can't do that from rcu callbacks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2014-01-09 22:19:25

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, 2014-01-09 at 22:13 +0000, Al Viro wrote:
> On Thu, Jan 09, 2014 at 10:31:55AM -0500, Eric Paris wrote:
> > Didn't Al find this/something very similar. I really hate this
> > solution. Why should every LSM try to understand the intimate
> > lifetime rules of the parent subsystems? The real problem is that
> > inode_free_security() is being called while the inode is still in use.
> > While I agree with the assessment, I disagree with the solution. Let
> > me try to find where Al and Christoph talked about this....
>
> Because LSM has stuck its fingers into the guts of those filesystems,
> obviously.
>
> Just RCU-delay freeing the damn thing and treat NULL ->i_security in
> ->permission() (which can happen only with MAY_NOT_BLOCK in mask) as
> "return -ECHILD and let the caller deal with that".
>
> Modifying every ->destroy_inode() is obviously wrong - there's a lot more
> filesystems than LSM buggers in the tree.

We just want the same lifetime as the inode. Allocate the security blob
when the inode is allocated and free the security blob when the inode is
freed.

If we -rcu delay the free'ing we shouldn't need the NULL check, right?

-Eric

2014-01-09 22:26:08

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, Jan 09, 2014 at 05:18:09PM -0500, Eric Paris wrote:

> > Just RCU-delay freeing the damn thing and treat NULL ->i_security in
> > ->permission() (which can happen only with MAY_NOT_BLOCK in mask) as
> > "return -ECHILD and let the caller deal with that".
> >
> > Modifying every ->destroy_inode() is obviously wrong - there's a lot more
> > filesystems than LSM buggers in the tree.
>
> We just want the same lifetime as the inode. Allocate the security blob
> when the inode is allocated and free the security blob when the inode is
> freed.

Ultimate freeing of struct inode can easily happen outside of VFS - that's
what ->destroy_inode() is for. Moreover, filesystem might decide to do
very odd things to it, as long as it doesn't do so without RCU delay.
So no, there's no single place to do that.

2014-01-09 22:28:23

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, Jan 09, 2014 at 11:05:45AM -0500, Eric Paris wrote:
> [adding lsm and selinux]
>
> Am I just crazy, or was this bug discussed (and obviously not fixed)
> some time ago?
>
> VFS can still use inodes after security_inode_free_security() was
> called...

Unrelated bug.

> > Assuming the VFS guys say that delaying __destroy_inode() is safe like
> > that, I like it better. It also means that this is fixed for all LSMs,
> > not just SELinux...

Recall what your own code called from __destroy_inode() (fsnotify horrors)
is doing - you can't grab a mutex from RCU callback...

2014-01-09 22:45:35

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

On Thu, 2014-01-09 at 22:13 +0000, Al Viro wrote:
> On Thu, Jan 09, 2014 at 10:31:55AM -0500, Eric Paris wrote:
> > Didn't Al find this/something very similar. I really hate this
> > solution. Why should every LSM try to understand the intimate
> > lifetime rules of the parent subsystems? The real problem is that
> > inode_free_security() is being called while the inode is still in use.
> > While I agree with the assessment, I disagree with the solution. Let
> > me try to find where Al and Christoph talked about this....
>
> Because LSM has stuck its fingers into the guts of those filesystems,
> obviously.
>
> Just RCU-delay freeing the damn thing and treat NULL ->i_security in
> ->permission() (which can happen only with MAY_NOT_BLOCK in mask) as
> "return -ECHILD and let the caller deal with that".
>
> Modifying every ->destroy_inode() is obviously wrong - there's a lot more
> filesystems than LSM buggers in the tree.

I'll do it if I've got no other choice. But it seems crazy that the LSM
is guessing that kfree_rcu() is the right answer and will be the right
answer forever. But clearly even ease inode lifetime rules can't be
counted on. fa0d7e3de6d6fc5004ad9dea0dd6b286af8f03e9 broke what was
already a perfectly sane/true/reasonable assumption about inode
lifetimes. We put the 'free the security blob' with the 'free the
inode' call. The VFS moved the 'free the inode' call. Are they going
to do it again? Will they realize that the LSM now has such intricate
object lifetime knowledge built in?

I really think the LSM function needs to, somehow, be synchronous. I
can expose a generic struct i_security with an rcu_head as the only
member which all LSMs must implement as the first member of their blob.
The VFS can do a call_rcu() on that blob...

Like I said, I can do it all in security/ but it's just BEGGING for more
of this in the future...

-Eric