2012-08-21 19:04:06

by Miklos Szeredi

[permalink] [raw]
Subject: [PATCH 0/3] audit-tree fixes

Linus,

The audit subsystem maintainers (Al and Eric) are not responding to repeated
resends. Eric did ack them a while ago, but no response since then. So I'm
sending these directly to you.

Git tree is here:

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git audit-fixes

Thanks,
Miklos


---
Miklos Szeredi (3):
audit: don't free_chunk() after fsnotify_add_mark()
audit: fix refcounting in audit-tree
audit: clean up refcounting in audit-tree

---
kernel/audit_tree.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)


2012-08-21 19:04:08

by Miklos Szeredi

[permalink] [raw]
Subject: [PATCH 1/3] audit: don't free_chunk() after fsnotify_add_mark()

From: Miklos Szeredi <[email protected]>

Don't do free_chunk() after fsnotify_add_mark(). That one does a delayed unref
via the destroy list and this results in use-after-free.

Signed-off-by: Miklos Szeredi <[email protected]>
Acked-by: Eric Paris <[email protected]>
CC: [email protected]
---
kernel/audit_tree.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 3a5ca58..69a5851 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -259,7 +259,7 @@ static void untag_chunk(struct node *p)

fsnotify_duplicate_mark(&new->mark, entry);
if (fsnotify_add_mark(&new->mark, new->mark.group, new->mark.i.inode, NULL, 1)) {
- free_chunk(new);
+ fsnotify_put_mark(&new->mark);
goto Fallback;
}

@@ -322,7 +322,7 @@ static int create_chunk(struct inode *inode, struct audit_tree *tree)

entry = &chunk->mark;
if (fsnotify_add_mark(entry, audit_tree_group, inode, NULL, 0)) {
- free_chunk(chunk);
+ fsnotify_put_mark(entry);
return -ENOSPC;
}

@@ -396,7 +396,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
fsnotify_duplicate_mark(chunk_entry, old_entry);
if (fsnotify_add_mark(chunk_entry, chunk_entry->group, chunk_entry->i.inode, NULL, 1)) {
spin_unlock(&old_entry->lock);
- free_chunk(chunk);
+ fsnotify_put_mark(chunk_entry);
fsnotify_put_mark(old_entry);
return -ENOSPC;
}
--
1.7.7

2012-08-21 19:04:14

by Miklos Szeredi

[permalink] [raw]
Subject: [PATCH 2/3] audit: fix refcounting in audit-tree

From: Miklos Szeredi <[email protected]>

Refcounting of fsnotify_mark in audit tree is broken. E.g:

refcount
create_chunk
alloc_chunk 1
fsnotify_add_mark 2

untag_chunk
fsnotify_get_mark 3
fsnotify_destroy_mark
audit_tree_freeing_mark 2
fsnotify_put_mark 1
fsnotify_put_mark 0
via destroy_list
fsnotify_mark_destroy -1

This was reported by various people as triggering Oops when stopping auditd.

We could just remove the put_mark from audit_tree_freeing_mark() but that would
break freeing via inode destruction. So this patch simply omits a put_mark
after calling destroy_mark or adds a get_mark before.

The additional get_mark is necessary where there's no other put_mark after
fsnotify_destroy_mark() since it assumes that the caller is holding a reference
(or the inode is keeping the mark pinned, not the case here AFAICS).

Signed-off-by: Miklos Szeredi <[email protected]>
Reported-by: Valentin Avram <[email protected]>
Reported-by: Peter Moody <[email protected]>
Acked-by: Eric Paris <[email protected]>
CC: [email protected]
---
kernel/audit_tree.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 69a5851..2b2ffff 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -250,7 +250,6 @@ static void untag_chunk(struct node *p)
spin_unlock(&hash_lock);
spin_unlock(&entry->lock);
fsnotify_destroy_mark(entry);
- fsnotify_put_mark(entry);
goto out;
}

@@ -293,7 +292,6 @@ static void untag_chunk(struct node *p)
spin_unlock(&hash_lock);
spin_unlock(&entry->lock);
fsnotify_destroy_mark(entry);
- fsnotify_put_mark(entry);
goto out;

Fallback:
@@ -332,6 +330,7 @@ static int create_chunk(struct inode *inode, struct audit_tree *tree)
spin_unlock(&hash_lock);
chunk->dead = 1;
spin_unlock(&entry->lock);
+ fsnotify_get_mark(entry);
fsnotify_destroy_mark(entry);
fsnotify_put_mark(entry);
return 0;
@@ -412,6 +411,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
spin_unlock(&chunk_entry->lock);
spin_unlock(&old_entry->lock);

+ fsnotify_get_mark(chunk_entry);
fsnotify_destroy_mark(chunk_entry);

fsnotify_put_mark(chunk_entry);
@@ -445,7 +445,6 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
spin_unlock(&old_entry->lock);
fsnotify_destroy_mark(old_entry);
fsnotify_put_mark(old_entry); /* pair to fsnotify_find mark_entry */
- fsnotify_put_mark(old_entry); /* and kill it */
return 0;
}

--
1.7.7

2012-08-21 19:04:22

by Miklos Szeredi

[permalink] [raw]
Subject: [PATCH 3/3] audit: clean up refcounting in audit-tree

From: Miklos Szeredi <[email protected]>

Drop the initial reference by fsnotify_init_mark early instead of
audit_tree_freeing_mark() at destroy time.

In the cases we destroy the mark before we drop the initial reference we need to
get rid of the get_mark that balances the put_mark in audit_tree_freeing_mark().

Signed-off-by: Miklos Szeredi <[email protected]>
---
kernel/audit_tree.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index 2b2ffff..ed206fd 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -292,6 +292,7 @@ static void untag_chunk(struct node *p)
spin_unlock(&hash_lock);
spin_unlock(&entry->lock);
fsnotify_destroy_mark(entry);
+ fsnotify_put_mark(&new->mark); /* drop initial reference */
goto out;

Fallback:
@@ -330,7 +331,6 @@ static int create_chunk(struct inode *inode, struct audit_tree *tree)
spin_unlock(&hash_lock);
chunk->dead = 1;
spin_unlock(&entry->lock);
- fsnotify_get_mark(entry);
fsnotify_destroy_mark(entry);
fsnotify_put_mark(entry);
return 0;
@@ -346,6 +346,7 @@ static int create_chunk(struct inode *inode, struct audit_tree *tree)
insert_hash(chunk);
spin_unlock(&hash_lock);
spin_unlock(&entry->lock);
+ fsnotify_put_mark(entry); /* drop initial reference */
return 0;
}

@@ -411,7 +412,6 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
spin_unlock(&chunk_entry->lock);
spin_unlock(&old_entry->lock);

- fsnotify_get_mark(chunk_entry);
fsnotify_destroy_mark(chunk_entry);

fsnotify_put_mark(chunk_entry);
@@ -444,6 +444,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
spin_unlock(&chunk_entry->lock);
spin_unlock(&old_entry->lock);
fsnotify_destroy_mark(old_entry);
+ fsnotify_put_mark(chunk_entry); /* drop initial reference */
fsnotify_put_mark(old_entry); /* pair to fsnotify_find mark_entry */
return 0;
}
@@ -915,7 +916,12 @@ static void audit_tree_freeing_mark(struct fsnotify_mark *entry, struct fsnotify
struct audit_chunk *chunk = container_of(entry, struct audit_chunk, mark);

evict_chunk(chunk);
- fsnotify_put_mark(entry);
+
+ /*
+ * We are guaranteed to have at least one reference to the mark from
+ * either the inode or the caller of fsnotify_destroy_mark().
+ */
+ BUG_ON(atomic_read(&entry->refcnt) < 1);
}

static bool audit_tree_send_event(struct fsnotify_group *group, struct inode *inode,
--
1.7.7

2012-08-21 19:38:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 3/3] audit: clean up refcounting in audit-tree

On Tue, Aug 21, 2012 at 12:03 PM, Miklos Szeredi <[email protected]> wrote:
> + /*
> + * We are guaranteed to have at least one reference to the mark from
> + * either the inode or the caller of fsnotify_destroy_mark().
> + */
> + BUG_ON(atomic_read(&entry->refcnt) < 1);

I pulled, but *please* don't use BUG_ON() as some kind of "let's
assert some random crap" thing. We've literally had DoS security
issues due to code having BUG_ON()'s and killing the machine, and
BUG_ON() often makes things *worse* if it ends up happening in irq
context or with some critical lock held, and then the machine is just
dead with no logging and no messages left anywhere.

So before adding a BUG_ON(), you should ask yourself the following questions:

(a) is this something I need to even test?

There are lots of rules we have in the kernel. We don't add
BUG_ON() for each and every one of them. Is it such a critical data
structure that I really need to test for that condition that should
never happen?

(b) Is this data structure *so* central that I need to immediately
kill everything, or do I just want it logged?

If it's just a "I want people to know about it, but I don't
expect it to happen, I'm just adding a debug thing to make sure", then
WARN_ON_ONCE() is likely the right thing. It's *more* likely to get
reported, exactly because the machine is more likely to survive a
WARN_ON_ONCE().

(c) am I sure that none of the callers hold any central locks that
make the BUG_ON() be worse than the alternatives?

BUG_ON() is really drastic. Some machines will reboot on bugs. Others
will halt. And a even the common ones that are just set to kill the
particular process can effectively kill the whole machine due to locks
or preemption counts etc that never get released.

The kind of place that deserves a BUG_ON() is some really *central*
code where you have major issues, and there's just not anything you
can do to continue. If somebody passes kfree() a bad pointer, there's
just nothing kfree() can sanely do about it. If somebody does a
list_del() with list debugging enabled, and it notices that the list
pointer are crap, what are you going to do? You can't continue.

But some random data structure that has the wrong refcount? If you
*can* return with a warning (and ONCE, at that, so that not only does
it get logged, the log doesn't get spammed and useless because it gets
too big), that's likely what you should do.

And this is *doubly* true if it's a patch in the -rc series and you
added the code because you weren't sure you tested all possible random
cases. Don't potentially kill the machine because you weren't sure you
got all cases!

Linus