2005-05-17 16:25:56

by Valdis Klētnieks

[permalink] [raw]
Subject: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

It threw 5 of them in short succession. Different entry points into
avc_has_perm(). Here's the tracebacks:

[4295584.974000] Debug: sleeping function called from invalid context at mm/slab.c:2502
[4295584.974000] in_atomic():1, irqs_disabled():0
[4295584.974000] [<c01035a8>] dump_stack+0x15/0x17
[4295584.974000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
[4295584.974000] [<c02de4fa>] skb_clone+0x14/0x183
[4295584.974000] [<c02ef64a>] netlink_unicast+0x7d/0x171
[4295584.974000] [<c0130947>] audit_log_end_fast+0xf5/0x188
[4295584.974000] [<c01c3c56>] avc_audit+0x94d/0x958
[4295584.974000] [<c01c3ffb>] avc_has_perm+0x3b/0x48
[4295584.974000] [<c01c8022>] selinux_socket_unix_stream_connect+0x6f/0xa8
[4295584.974000] [<c032b740>] unix_stream_connect+0x228/0x482
[4295584.974000] [<c02dbab6>] sys_connect+0x6a/0x81
[4295584.974000] [<c02dc23b>] sys_socketcall+0x6f/0x166
[4295584.974000] [<c01026cf>] sysenter_past_esp+0x54/0x75

[4295592.398000] Debug: sleeping function called from invalid context at mm/slab.c:2502
[4295592.398000] in_atomic():1, irqs_disabled():0
[4295592.398000] [<c01035a8>] dump_stack+0x15/0x17
[4295592.398000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
[4295592.398000] [<c02de4fa>] skb_clone+0x14/0x183
[4295592.398000] [<c02ef64a>] netlink_unicast+0x7d/0x171
[4295592.398000] [<c0130947>] audit_log_end_fast+0xf5/0x188
[4295592.398000] [<c01c3c56>] avc_audit+0x94d/0x958
[4295592.398000] [<c01c3ffb>] avc_has_perm+0x3b/0x48
[4295592.398000] [<c01c8790>] ipc_has_perm+0x52/0x5a
[4295592.398000] [<c01b808c>] ipcperms+0x89/0x93
[4295592.398000] [<c01bba55>] do_shmat+0x28d/0x2a2
[4295592.398000] [<c0107bfd>] sys_ipc+0xe8/0x143
[4295592.398000] [<c01026cf>] sysenter_past_esp+0x54/0x75

[4295857.484000] Debug: sleeping function called from invalid context at mm/slab.c:2502
[4295857.484000] in_atomic():1, irqs_disabled():0
[4295857.484000] [<c01035a8>] dump_stack+0x15/0x17
[4295857.484000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
[4295857.484000] [<c02de4fa>] skb_clone+0x14/0x183
[4295857.484000] [<c02ef64a>] netlink_unicast+0x7d/0x171
[4295857.484000] [<c0130947>] audit_log_end_fast+0xf5/0x188
[4295857.484000] [<c01c3c56>] avc_audit+0x94d/0x958
[4295857.484000] [<c01c3ffb>] avc_has_perm+0x3b/0x48
[4295857.484000] [<c01c8790>] ipc_has_perm+0x52/0x5a
[4295857.484000] [<c01b808c>] ipcperms+0x89/0x93
[4295857.484000] [<c01bba55>] do_shmat+0x28d/0x2a2
[4295857.484000] [<c0107bfd>] sys_ipc+0xe8/0x143
[4295857.484000] [<c01026cf>] sysenter_past_esp+0x54/0x75

[4295859.266000] Debug: sleeping function called from invalid context at mm/slab.c:2502
[4295859.266000] in_atomic():1, irqs_disabled():0
[4295859.266000] [<c01035a8>] dump_stack+0x15/0x17
[4295859.266000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
[4295859.266000] [<c02de4fa>] skb_clone+0x14/0x183
[4295859.266000] [<c02ef64a>] netlink_unicast+0x7d/0x171
[4295859.266000] [<c0130947>] audit_log_end_fast+0xf5/0x188
[4295859.266000] [<c01c3c56>] avc_audit+0x94d/0x958
[4295859.266000] [<c01c3ffb>] avc_has_perm+0x3b/0x48
[4295859.266000] [<c01c8022>] selinux_socket_unix_stream_connect+0x6f/0xa8
[4295859.266000] [<c032b740>] unix_stream_connect+0x228/0x482
[4295859.266000] [<c02dbab6>] sys_connect+0x6a/0x81
[4295859.266000] [<c02dc23b>] sys_socketcall+0x6f/0x166
[4295859.266000] [<c0102729>] syscall_call+0x7/0xb

[4295873.575000] Debug: sleeping function called from invalid context at mm/slab.c:2502
[4295873.575000] in_atomic():1, irqs_disabled():0
[4295873.575000] [<c01035a8>] dump_stack+0x15/0x17
[4295873.576000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
[4295873.576000] [<c02de4fa>] skb_clone+0x14/0x183
[4295873.576000] [<c02ef64a>] netlink_unicast+0x7d/0x171
[4295873.576000] [<c0130947>] audit_log_end_fast+0xf5/0x188
[4295873.576000] [<c01c3c56>] avc_audit+0x94d/0x958
[4295873.576000] [<c01c3ffb>] avc_has_perm+0x3b/0x48
[4295873.576000] [<c01c8022>] selinux_socket_unix_stream_connect+0x6f/0xa8
[4295873.576000] [<c032b740>] unix_stream_connect+0x228/0x482
[4295873.576000] [<c02dbab6>] sys_connect+0x6a/0x81
[4295873.576000] [<c02dc23b>] sys_socketcall+0x6f/0x166
[4295873.576000] [<c0102729>] syscall_call+0x7/0xb


Attachments:
(No filename) (226.00 B)

2005-05-17 16:56:19

by Chris Wright

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

* [email protected] ([email protected]) wrote:
> It threw 5 of them in short succession. Different entry points into
> avc_has_perm(). Here's the tracebacks:

I'm guessing this is from my change to use single skb for audit buffer
instead of a temp buffer.

> [4295584.974000] Debug: sleeping function called from invalid context at mm/slab.c:2502
> [4295584.974000] in_atomic():1, irqs_disabled():0

This is gfp_any() flag that's used here, which I think is the problem.

> [4295584.974000] [<c01035a8>] dump_stack+0x15/0x17
> [4295584.974000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
> [4295584.974000] [<c02de4fa>] skb_clone+0x14/0x183
> [4295584.974000] [<c02ef64a>] netlink_unicast+0x7d/0x171

Here and up we are in netlink code which does netlink_trim to reduce the
skb data size before queing to socket. This does skb_clone with
gfp_any() flag. We aren't in softirq, so we get GFP_KERNEL flag set
even though in_atomic() is true (spin_lock held I'm assuming).

> [4295584.974000] [<c0130947>] audit_log_end_fast+0xf5/0x188
> [4295584.974000] [<c01c3c56>] avc_audit+0x94d/0x958
> [4295584.974000] [<c01c3ffb>] avc_has_perm+0x3b/0x48
> [4295584.974000] [<c01c8022>] selinux_socket_unix_stream_connect+0x6f/0xa8
> [4295584.974000] [<c032b740>] unix_stream_connect+0x228/0x482
> [4295584.974000] [<c02dbab6>] sys_connect+0x6a/0x81
> [4295584.974000] [<c02dc23b>] sys_socketcall+0x6f/0x166
> [4295584.974000] [<c01026cf>] sysenter_past_esp+0x54/0x75

2005-05-17 17:15:08

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Tue, 2005-05-17 at 09:55 -0700, Chris Wright wrote:
> Here and up we are in netlink code which does netlink_trim to reduce
> the skb data size before queing to socket. This does skb_clone with
> gfp_any() flag. We aren't in softirq, so we get GFP_KERNEL flag set
> even though in_atomic() is true (spin_lock held I'm assuming).

netlink_unicast() is only calling skb_clone() because we artificially
increased the refcount on the skb in question.

As I understand it, we do that in order to prevent the skb from being
lost if netlink_unicast() returns an error -- it normally frees the skb
before returning in that case. Am I alone in thinking that behaviour is
strange?

I'm really not fond of the refcount trick -- I suspect I'd be happier if
we were just to try to keep track of sk_rmem_alloc so we never hit the
condition in netlink_attachskb() which might cause it to fail.

--
dwmw2

2005-05-17 17:44:09

by Chris Wright

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

* David Woodhouse ([email protected]) wrote:
> On Tue, 2005-05-17 at 09:55 -0700, Chris Wright wrote:
> > Here and up we are in netlink code which does netlink_trim to reduce
> > the skb data size before queing to socket. This does skb_clone with
> > gfp_any() flag. We aren't in softirq, so we get GFP_KERNEL flag set
> > even though in_atomic() is true (spin_lock held I'm assuming).
>
> netlink_unicast() is only calling skb_clone() because we artificially
> increased the refcount on the skb in question.

It would call pskb_expand_head() otherwise, so we'd hit the same
warning.

> As I understand it, we do that in order to prevent the skb from being
> lost if netlink_unicast() returns an error -- it normally frees the skb
> before returning in that case. Am I alone in thinking that behaviour is
> strange?

I think the idea is build skb, hand off to netlink layer and never think
about it again. Fire and forget, what's not to like? ;-))

> I'm really not fond of the refcount trick -- I suspect I'd be happier if
> we were just to try to keep track of sk_rmem_alloc so we never hit the
> condition in netlink_attachskb() which might cause it to fail.

This has some issues w.r.t. truesize and socket buffer space. The trim
is done to keep accounting sane, so we'd either have to trim ourselves
or take into account the change in size. And ultimately, we'd still get
trimmed by netlink, so the GFP issue is still there. Ideally, gfp_any()
would really be _any_

thanks,
-chris

2005-05-17 18:01:41

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Tue, 17 May 2005 09:55:28 PDT, Chris Wright said:
> * [email protected] ([email protected]) wrote:
> > It threw 5 of them in short succession. Different entry points into
> > avc_has_perm(). Here's the tracebacks:
>
> I'm guessing this is from my change to use single skb for audit buffer
> instead of a temp buffer.
>
> > [4295584.974000] Debug: sleeping function called from invalid context at mm/slab.c:2502
> > [4295584.974000] in_atomic():1, irqs_disabled():0
>
> This is gfp_any() flag that's used here, which I think is the problem.

I'll be more than happy to test any patches...


Attachments:
(No filename) (226.00 B)

2005-05-18 08:30:52

by Herbert Xu

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

Guys, please CC netdev on issues like this.

On Tue, May 17, 2005 at 05:43:00PM +0000, Chris Wright wrote:
>
> This has some issues w.r.t. truesize and socket buffer space. The trim
> is done to keep accounting sane, so we'd either have to trim ourselves
> or take into account the change in size. And ultimately, we'd still get
> trimmed by netlink, so the GFP issue is still there. Ideally, gfp_any()
> would really be _any_

The trimming is completely optional. That is, if the allocation fails
nothing bad will happen. So the solution is to simply use GFP_ATOMIC.

Signed-off-by: Herbert Xu <[email protected]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Attachments:
(No filename) (850.00 B)
p (277.00 B)
Download all attachments

2005-05-18 17:02:21

by Chris Wright

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

* Herbert Xu ([email protected]) wrote:
> Guys, please CC netdev on issues like this.

Sorry Herbert, we hadn't yet concluded that it's not an issue that we
need to resolve within audit.

> On Tue, May 17, 2005 at 05:43:00PM +0000, Chris Wright wrote:
> >
> > This has some issues w.r.t. truesize and socket buffer space. The trim
> > is done to keep accounting sane, so we'd either have to trim ourselves
> > or take into account the change in size. And ultimately, we'd still get
> > trimmed by netlink, so the GFP issue is still there. Ideally, gfp_any()
> > would really be _any_
>
> The trimming is completely optional. That is, if the allocation fails
> nothing bad will happen. So the solution is to simply use GFP_ATOMIC.

Well, it does more pressure on atomic pool (for those cases that
GFP_KERNEL would have sufficed).

thanks,
-chris

2005-05-18 17:54:59

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Wed, 2005-05-18 at 10:00 -0700, Chris Wright wrote:
> * Herbert Xu ([email protected]) wrote:
> > Guys, please CC netdev on issues like this.
>
> Sorry Herbert, we hadn't yet concluded that it's not an issue that we
> need to resolve within audit.

I suspect that it _is_ an issue we can resolve entirely within audit
code. See the patch I posted half an hour or so ago to the linux-audit
list. If we agree on that approach, I'll do the equivalent for the git
tree either later this evening or tomorrow.

I've reverted your recent change to put audit messages directly into
skbs "in order to eliminate the extra copy", on the basis that it
blatantly wasn't having that effect anyway. Now we copy from the
audit_buffer into an optimally-sized skb which netlink_trim() won't have
to mangle. I've also removed the skb_get() immediately before
netlink_send() which always made me unhappy.

--
dwmw2

2005-05-18 21:30:53

by Herbert Xu

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Wed, May 18, 2005 at 06:52:35PM +0100, David Woodhouse wrote:
>
> I've reverted your recent change to put audit messages directly into
> skbs "in order to eliminate the extra copy", on the basis that it
> blatantly wasn't having that effect anyway. Now we copy from the
> audit_buffer into an optimally-sized skb which netlink_trim() won't have
> to mangle. I've also removed the skb_get() immediately before
> netlink_send() which always made me unhappy.

Even if the audit code is never going to call netlink_unicast with
spin locks held, we simply cannot assume that for all current and
future users of netlink_unicast.

As a consequence we can't use gfp_any() in netlink_unicast.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2005-05-19 11:35:46

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Tue, 2005-05-17 at 12:24 -0400, [email protected] wrote:
> [4295584.974000] Debug: sleeping function called from invalid context
> at mm/slab.c:2502
> [4295584.974000] in_atomic():1, irqs_disabled():0
> [4295584.974000] [<c01035a8>] dump_stack+0x15/0x17
> [4295584.974000] [<c013ba6d>] kmem_cache_alloc+0x1e/0x6a
> [4295584.974000] [<c02de4fa>] skb_clone+0x14/0x183
> [4295584.974000] [<c02ef64a>] netlink_unicast+0x7d/0x171
> [4295584.974000] [<c0130947>] audit_log_end_fast+0xf5/0x188
> [4295584.974000] [<c01c3c56>] avc_audit+0x94d/0x958

OK, we'll just let audit_log() assemble its own skb and queue it for a
separate kernel thread to feed up to auditd. We'll fix up the horrid
error handling mess we had before too, where we used to clone the skb
because we know netlink_unicast() would free it even on temporary errors
(like the SO_RCVBUF limit being reached).

This includes one of Steve's earlier patches which ensures that messages
in the skb are NUL-terminated. It should all appear some time soon in
http://www.kernel.org/git/gitweb.cgi?p=linux/kernel/git/dwmw2/audit-2.6.git;a=summary
or more perhaps more usefully (since Thomas allows us to distinguish
between local and merged commits) in
http://www.tglx.de/cgi-bin/gittracker/commit/tracker-linux/audit-2.6.git?project=tracker-linux%2Fdwmw2%2Faudit-2.6.git&pagelen=10&exclude=tracker-linux%2Ftorvalds%2Flinux-2.6.git&offset=0

Index: kernel/audit.c
===================================================================
--- e45ee43e7af31f847377e8bb3a0a61581732b653/kernel/audit.c (mode:100644)
+++ c1096ff7ae35b77bf8108c3a60b856551c50a9d7/kernel/audit.c (mode:100644)
@@ -46,6 +46,8 @@
#include <asm/types.h>
#include <linux/mm.h>
#include <linux/module.h>
+#include <linux/err.h>
+#include <linux/kthread.h>

#include <linux/audit.h>

@@ -77,7 +79,6 @@

/* Number of outstanding audit_buffers allowed. */
static int audit_backlog_limit = 64;
-static atomic_t audit_backlog = ATOMIC_INIT(0);

/* The identity of the user shutting down the audit system. */
uid_t audit_sig_uid = -1;
@@ -95,19 +96,17 @@
/* The netlink socket. */
static struct sock *audit_sock;

-/* There are two lists of audit buffers. The txlist contains audit
- * buffers that cannot be sent immediately to the netlink device because
- * we are in an irq context (these are sent later in a tasklet).
- *
- * The second list is a list of pre-allocated audit buffers (if more
+/* The audit_freelist is a list of pre-allocated audit buffers (if more
* than AUDIT_MAXFREE are in use, the audit buffer is freed instead of
* being placed on the freelist). */
-static DEFINE_SPINLOCK(audit_txlist_lock);
static DEFINE_SPINLOCK(audit_freelist_lock);
static int audit_freelist_count = 0;
-static LIST_HEAD(audit_txlist);
static LIST_HEAD(audit_freelist);

+static struct sk_buff_head audit_skb_queue;
+static struct task_struct *kauditd_task;
+static DECLARE_WAIT_QUEUE_HEAD(kauditd_wait);
+
/* There are three lists of rules -- one to search at task creation
* time, one to search at syscall entry time, and another to search at
* syscall exit time. */
@@ -151,9 +150,6 @@
struct audit_rule rule;
};

-static void audit_log_end_irq(struct audit_buffer *ab);
-static void audit_log_end_fast(struct audit_buffer *ab);
-
static void audit_panic(const char *message)
{
switch (audit_failure)
@@ -224,10 +220,8 @@

if (print) {
printk(KERN_WARNING
- "audit: audit_lost=%d audit_backlog=%d"
- " audit_rate_limit=%d audit_backlog_limit=%d\n",
+ "audit: audit_lost=%d audit_rate_limit=%d audit_backlog_limit=%d\n",
atomic_read(&audit_lost),
- atomic_read(&audit_backlog),
audit_rate_limit,
audit_backlog_limit);
audit_panic(message);
@@ -281,6 +275,38 @@
return old;
}

+int kauditd_thread(void *dummy)
+{
+ struct sk_buff *skb;
+
+ while (1) {
+ skb = skb_dequeue(&audit_skb_queue);
+ if (skb) {
+ if (audit_pid) {
+ int err = netlink_unicast(audit_sock, skb, audit_pid, 0);
+ if (err < 0) {
+ BUG_ON(err != -ECONNREFUSED); /* Shoudn't happen */
+ printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", audit_pid);
+ audit_pid = 0;
+ }
+ } else {
+ printk(KERN_ERR "%s\n", skb->data + NLMSG_SPACE(0));
+ kfree_skb(skb);
+ }
+ } else {
+ DECLARE_WAITQUEUE(wait, current);
+ set_current_state(TASK_INTERRUPTIBLE);
+ add_wait_queue(&kauditd_wait, &wait);
+
+ if (!skb_queue_len(&audit_skb_queue))
+ schedule();
+
+ __set_current_state(TASK_RUNNING);
+ remove_wait_queue(&kauditd_wait, &wait);
+ }
+ }
+}
+
void audit_send_reply(int pid, int seq, int type, int done, int multi,
void *payload, int size)
{
@@ -293,13 +319,16 @@

skb = alloc_skb(len, GFP_KERNEL);
if (!skb)
- goto nlmsg_failure;
+ return;

- nlh = NLMSG_PUT(skb, pid, seq, t, len - sizeof(*nlh));
+ nlh = NLMSG_PUT(skb, pid, seq, t, size);
nlh->nlmsg_flags = flags;
data = NLMSG_DATA(nlh);
memcpy(data, payload, size);
- netlink_unicast(audit_sock, skb, pid, MSG_DONTWAIT);
+
+ /* Ignore failure. It'll only happen if the sender goes away,
+ because our timeout is set to infinite. */
+ netlink_unicast(audit_sock, skb, pid, 0);
return;

nlmsg_failure: /* Used by NLMSG_PUT */
@@ -351,6 +380,15 @@
if (err)
return err;

+ /* As soon as there's any sign of userspace auditd, start kauditd to talk to it */
+ if (!kauditd_task)
+ kauditd_task = kthread_run(kauditd_thread, NULL, "kauditd");
+ if (IS_ERR(kauditd_task)) {
+ err = PTR_ERR(kauditd_task);
+ kauditd_task = NULL;
+ return err;
+ }
+
pid = NETLINK_CREDS(skb)->pid;
uid = NETLINK_CREDS(skb)->uid;
loginuid = NETLINK_CB(skb).loginuid;
@@ -365,7 +403,7 @@
status_set.rate_limit = audit_rate_limit;
status_set.backlog_limit = audit_backlog_limit;
status_set.lost = atomic_read(&audit_lost);
- status_set.backlog = atomic_read(&audit_backlog);
+ status_set.backlog = skb_queue_len(&audit_skb_queue);
audit_send_reply(NETLINK_CB(skb).pid, seq, AUDIT_GET, 0, 0,
&status_set, sizeof(status_set));
break;
@@ -471,44 +509,6 @@
up(&audit_netlink_sem);
}

-/* Grab skbuff from the audit_buffer and send to user space. */
-static inline int audit_log_drain(struct audit_buffer *ab)
-{
- struct sk_buff *skb = ab->skb;
-
- if (skb) {
- int retval = 0;
-
- if (audit_pid) {
- struct nlmsghdr *nlh = (struct nlmsghdr *)skb->data;
- nlh->nlmsg_len = skb->len - NLMSG_SPACE(0);
- skb_get(skb); /* because netlink_* frees */
- retval = netlink_unicast(audit_sock, skb, audit_pid,
- MSG_DONTWAIT);
- }
- if (retval == -EAGAIN &&
- (atomic_read(&audit_backlog)) < audit_backlog_limit) {
- audit_log_end_irq(ab);
- return 1;
- }
- if (retval < 0) {
- if (retval == -ECONNREFUSED) {
- printk(KERN_ERR
- "audit: *NO* daemon at audit_pid=%d\n",
- audit_pid);
- audit_pid = 0;
- } else
- audit_log_lost("netlink socket too busy");
- }
- if (!audit_pid) { /* No daemon */
- int offset = NLMSG_SPACE(0);
- int len = skb->len - offset;
- skb->data[offset + len] = '\0';
- printk(KERN_ERR "%s\n", skb->data + offset);
- }
- }
- return 0;
-}

/* Initialize audit support at boot time. */
static int __init audit_init(void)
@@ -519,6 +519,8 @@
if (!audit_sock)
audit_panic("cannot initialize netlink socket");

+ audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
+ skb_queue_head_init(&audit_skb_queue);
audit_initialized = 1;
audit_enabled = audit_default;
audit_log(NULL, AUDIT_KERNEL, "initialized");
@@ -549,7 +551,7 @@

if (ab->skb)
kfree_skb(ab->skb);
- atomic_dec(&audit_backlog);
+
spin_lock_irqsave(&audit_freelist_lock, flags);
if (++audit_freelist_count > AUDIT_MAXFREE)
kfree(ab);
@@ -579,13 +581,12 @@
if (!ab)
goto err;
}
- atomic_inc(&audit_backlog);

ab->skb = alloc_skb(AUDIT_BUFSIZ, gfp_mask);
if (!ab->skb)
goto err;

- ab->ctx = ctx;
+ ab->ctx = ctx;
nlh = (struct nlmsghdr *)skb_put(ab->skb, NLMSG_SPACE(0));
nlh->nlmsg_type = type;
nlh->nlmsg_flags = 0;
@@ -612,18 +613,6 @@
if (!audit_initialized)
return NULL;

- if (audit_backlog_limit
- && atomic_read(&audit_backlog) > audit_backlog_limit) {
- if (audit_rate_check())
- printk(KERN_WARNING
- "audit: audit_backlog=%d > "
- "audit_backlog_limit=%d\n",
- atomic_read(&audit_backlog),
- audit_backlog_limit);
- audit_log_lost("backlog limit exceeded");
- return NULL;
- }
-
ab = audit_buffer_alloc(ctx, GFP_ATOMIC, type);
if (!ab) {
audit_log_lost("out of memory in audit_log_start");
@@ -692,7 +681,8 @@
goto out;
len = vsnprintf(skb->tail, avail, fmt, args2);
}
- skb_put(skb, (len < avail) ? len : avail);
+ if (len > 0)
+ skb_put(skb, len);
out:
return;
}
@@ -710,20 +700,47 @@
va_end(args);
}

-void audit_log_hex(struct audit_buffer *ab, const unsigned char *buf, size_t len)
+/* This function will take the passed buf and convert it into a string of
+ * ascii hex digits. The new string is placed onto the skb. */
+void audit_log_hex(struct audit_buffer *ab, const unsigned char *buf,
+ size_t len)
{
- int i;
+ int i, avail, new_len;
+ unsigned char *ptr;
+ struct sk_buff *skb;
+ static const unsigned char *hex = "0123456789ABCDEF";
+
+ BUG_ON(!ab->skb);
+ skb = ab->skb;
+ avail = skb_tailroom(skb);
+ new_len = len<<1;
+ if (new_len >= avail) {
+ /* Round the buffer request up to the next multiple */
+ new_len = AUDIT_BUFSIZ*(((new_len-avail)/AUDIT_BUFSIZ) + 1);
+ avail = audit_expand(ab, new_len);
+ if (!avail)
+ return;
+ }

- for (i=0; i<len; i++)
- audit_log_format(ab, "%02x", buf[i]);
+ ptr = skb->tail;
+ for (i=0; i<len; i++) {
+ *ptr++ = hex[(buf[i] & 0xF0)>>4]; /* Upper nibble */
+ *ptr++ = hex[buf[i] & 0x0F]; /* Lower nibble */
+ }
+ *ptr = 0;
+ skb_put(skb, len << 1); /* new string is twice the old string */
}

+/* This code will escape a string that is passed to it if the string
+ * contains a control character, unprintable character, double quote mark,
+ * or a space. Unescaped strings will start and end with a double quote mark.
+ * Strings that are escaped are printed in hex (2 digits per char). */
void audit_log_untrustedstring(struct audit_buffer *ab, const char *string)
{
const unsigned char *p = string;

while (*p) {
- if (*p == '"' || *p == ' ' || *p < 0x20 || *p > 0x7f) {
+ if (*p == '"' || *p < 0x21 || *p > 0x7f) {
audit_log_hex(ab, string, strlen(string));
return;
}
@@ -732,97 +749,54 @@
audit_log_format(ab, "\"%s\"", string);
}

-
-/* This is a helper-function to print the d_path without using a static
- * buffer or allocating another buffer in addition to the one in
- * audit_buffer. */
+/* This is a helper-function to print the escaped d_path */
void audit_log_d_path(struct audit_buffer *ab, const char *prefix,
struct dentry *dentry, struct vfsmount *vfsmnt)
{
- char *p;
- struct sk_buff *skb = ab->skb;
- int len, avail;
+ char *p, *path;

if (prefix)
audit_log_format(ab, " %s", prefix);

- avail = skb_tailroom(skb);
- p = d_path(dentry, vfsmnt, skb->tail, avail);
- if (IS_ERR(p)) {
- /* FIXME: can we save some information here? */
- audit_log_format(ab, "<toolong>");
- } else {
- /* path isn't at start of buffer */
- len = ((char *)skb->tail + avail - 1) - p;
- memmove(skb->tail, p, len);
- skb_put(skb, len);
- }
-}
-
-/* Remove queued messages from the audit_txlist and send them to user space. */
-static void audit_tasklet_handler(unsigned long arg)
-{
- LIST_HEAD(list);
- struct audit_buffer *ab;
- unsigned long flags;
-
- spin_lock_irqsave(&audit_txlist_lock, flags);
- list_splice_init(&audit_txlist, &list);
- spin_unlock_irqrestore(&audit_txlist_lock, flags);
-
- while (!list_empty(&list)) {
- ab = list_entry(list.next, struct audit_buffer, list);
- list_del(&ab->list);
- audit_log_end_fast(ab);
+ /* We will allow 11 spaces for ' (deleted)' to be appended */
+ path = kmalloc(PATH_MAX+11, GFP_KERNEL);
+ if (!path) {
+ audit_log_format(ab, "<no memory>");
+ return;
}
+ p = d_path(dentry, vfsmnt, path, PATH_MAX+11);
+ if (IS_ERR(p)) { /* Should never happen since we send PATH_MAX */
+ /* FIXME: can we save some information here? */
+ audit_log_format(ab, "<too long>");
+ } else
+ audit_log_untrustedstring(ab, p);
+ kfree(path);
}

-static DECLARE_TASKLET(audit_tasklet, audit_tasklet_handler, 0);
-
/* The netlink_* functions cannot be called inside an irq context, so
* the audit buffer is places on a queue and a tasklet is scheduled to
* remove them from the queue outside the irq context. May be called in
* any context. */
-static void audit_log_end_irq(struct audit_buffer *ab)
-{
- unsigned long flags;
-
- if (!ab)
- return;
- spin_lock_irqsave(&audit_txlist_lock, flags);
- list_add_tail(&ab->list, &audit_txlist);
- spin_unlock_irqrestore(&audit_txlist_lock, flags);
-
- tasklet_schedule(&audit_tasklet);
-}
-
-/* Send the message in the audit buffer directly to user space. May not
- * be called in an irq context. */
-static void audit_log_end_fast(struct audit_buffer *ab)
+void audit_log_end(struct audit_buffer *ab)
{
- BUG_ON(in_irq());
if (!ab)
return;
if (!audit_rate_check()) {
audit_log_lost("rate limit exceeded");
} else {
- if (audit_log_drain(ab))
- return;
+ if (audit_pid) {
+ struct nlmsghdr *nlh = (struct nlmsghdr *)ab->skb->data;
+ nlh->nlmsg_len = ab->skb->len - NLMSG_SPACE(0);
+ skb_queue_tail(&audit_skb_queue, ab->skb);
+ ab->skb = NULL;
+ wake_up_interruptible(&kauditd_wait);
+ } else {
+ printk("%s\n", ab->skb->data + NLMSG_SPACE(0));
+ }
}
audit_buffer_free(ab);
}

-/* Send or queue the message in the audit buffer, depending on the
- * current context. (A convenience function that may be called in any
- * context.) */
-void audit_log_end(struct audit_buffer *ab)
-{
- if (in_irq())
- audit_log_end_irq(ab);
- else
- audit_log_end_fast(ab);
-}
-
/* Log an audit record. This is a convenience function that calls
* audit_log_start, audit_log_vformat, and audit_log_end. It may be
* called in any context. */


--
dwmw2

2005-05-19 18:45:52

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Thu, 19 May 2005 12:34:08 BST, David Woodhouse said:
> On Tue, 2005-05-17 at 12:24 -0400, [email protected] wrote:
> > [4295584.974000] Debug: sleeping function called from invalid context
> > at mm/slab.c:2502

> OK, we'll just let audit_log() assemble its own skb and queue it for a
> separate kernel thread to feed up to auditd. We'll fix up the horrid
> error handling mess we had before too, where we used to clone the skb
> because we know netlink_unicast() would free it even on temporary errors
> (like the SO_RCVBUF limit being reached).
>
> This includes one of Steve's earlier patches which ensures that messages
> in the skb are NUL-terminated. It should all appear some time soon in
> http://www.kernel.org/git/gitweb.cgi?p=linux/kernel/git/dwmw2/audit-2.6.git;a
=summary
> or more perhaps more usefully (since Thomas allows us to distinguish
> between local and merged commits) in
> http://www.tglx.de/cgi-bin/gittracker/commit/tracker-linux/audit-2.6.git?proj
ect=tracker-linux%2Fdwmw2%2Faudit-2.6.git&pagelen=10&exclude=tracker-linux%2Fto
rvalds%2Flinux-2.6.git&offset=0
>
> Index: kernel/audit.c
> ===================================================================
> --- e45ee43e7af31f847377e8bb3a0a61581732b653/kernel/audit.c (mode:100644)
> +++ c1096ff7ae35b77bf8108c3a60b856551c50a9d7/kernel/audit.c (mode:100644)

Patch applies with a few offsets against 12-rc4-mm2, and I'm not
seeing the messages any more...


Attachments:
(No filename) (226.00 B)

2005-05-20 14:30:37

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Thu, 19 May 2005 14:45:31 EDT, [email protected] said:

> > Index: kernel/audit.c
> > ===================================================================
> > --- e45ee43e7af31f847377e8bb3a0a61581732b653/kernel/audit.c (mode:100644)
> > +++ c1096ff7ae35b77bf8108c3a60b856551c50a9d7/kernel/audit.c (mode:100644)
>
> Patch applies with a few offsets against 12-rc4-mm2, and I'm not
> seeing the messages any more...

Looks like we either only swatted half the bug, or the patch moved it around.
Slightly different trace this time:

May 20 09:28:22 turing-police kernel: [4294758.508000] audit(1116595648.944:0): avc: denied { write } for name=X0 dev=tmpfs in
o=5682 scontext=system_u:system_r:initrc_t tcontext=system_u:object_r:initrc_tmp_t tclass=sock_file
May 20 09:28:22 turing-police kernel: [4294758.508000] Debug: sleeping function called from invalid context at mm/slab.c:2502
May 20 09:28:22 turing-police kernel: [4294758.508000] in_atomic():1, irqs_disabled():0
May 20 09:28:22 turing-police kernel: [4294758.508000] [dump_stack+21/23] dump_stack+0x15/0x17
May 20 09:28:22 turing-police kernel: [4294758.508000] [kmem_cache_alloc+30/106] kmem_cache_alloc+0x1e/0x6a
May 20 09:28:22 turing-police kernel: [4294758.508000] [audit_log_d_path+49/139] audit_log_d_path+0x31/0x8b
May 20 09:28:22 turing-police kernel: [4294758.508000] [avc_audit+1418/2392] avc_audit+0x58a/0x958
May 20 09:28:22 turing-police kernel: [4294758.508000] [avc_has_perm+59/72] avc_has_perm+0x3b/0x48
May 20 09:28:22 turing-police kernel: [4294758.508000] [selinux_socket_unix_stream_connect+111/168] selinux_socket_unix_stream_c
onnect+0x6f/0xa8
May 20 09:28:22 turing-police kernel: [4294758.508000] [unix_stream_connect+552/1154] unix_stream_connect+0x228/0x482
May 20 09:28:22 turing-police kernel: [4294758.508000] [sys_connect+106/129] sys_connect+0x6a/0x81
May 20 09:28:22 turing-police kernel: [4294758.508000] [sys_socketcall+111/358] sys_socketcall+0x6f/0x166
May 20 09:28:22 turing-police kernel: [4294758.508000] [sysenter_past_esp+84/117] sysenter_past_esp+0x54/0x75
May 20 09:28:22 turing-police kernel: [4294758.508000] audit(1116595648.944:0): avc: denied { connectto } for path="/tmp/.X11-
unix/X0" scontext=system_u:system_r:initrc_t tcontext=system_u:system_r:xdm_xserver_t tclass=unix_stream_socket
...
May 20 09:32:38 turing-police kernel: [4295068.589000] Debug: sleeping function called from invalid context at mm/slab.c:2502
May 20 09:32:38 turing-police kernel: [4295068.589000] in_atomic():1, irqs_disabled():0
May 20 09:32:38 turing-police kernel: [4295068.589000] [dump_stack+21/23] dump_stack+0x15/0x17
May 20 09:32:38 turing-police kernel: [4295068.589000] [kmem_cache_alloc+30/106] kmem_cache_alloc+0x1e/0x6a
May 20 09:32:38 turing-police kernel: [4295068.589000] [audit_log_d_path+49/139] audit_log_d_path+0x31/0x8b
May 20 09:32:38 turing-police kernel: [4295068.589000] [avc_audit+1418/2392] avc_audit+0x58a/0x958
May 20 09:32:38 turing-police kernel: [4295068.589000] [avc_has_perm+59/72] avc_has_perm+0x3b/0x48
May 20 09:32:38 turing-police kernel: [4295068.589000] [selinux_socket_unix_stream_connect+111/168] selinux_socket_unix_stream_c
onnect+0x6f/0xa8
May 20 09:32:38 turing-police kernel: [4295068.589000] [unix_stream_connect+552/1154] unix_stream_connect+0x228/0x482
May 20 09:32:38 turing-police kernel: [4295068.589000] [sys_connect+106/129] sys_connect+0x6a/0x81
May 20 09:32:38 turing-police kernel: [4295068.589000] [sys_socketcall+111/358] sys_socketcall+0x6f/0x166
May 20 09:32:38 turing-police kernel: [4295068.589000] [syscall_call+7/11] syscall_call+0x7/0xb
May 20 09:32:52 turing-police kernel: [4295082.517000] Debug: sleeping function called from invalid context at mm/slab.c:2502
May 20 09:32:52 turing-police kernel: [4295082.517000] in_atomic():1, irqs_disabled():0
May 20 09:32:52 turing-police kernel: [4295082.517000] [dump_stack+21/23] dump_stack+0x15/0x17
May 20 09:32:52 turing-police kernel: [4295082.517000] [kmem_cache_alloc+30/106] kmem_cache_alloc+0x1e/0x6a
May 20 09:32:52 turing-police kernel: [4295082.517000] [audit_log_d_path+49/139] audit_log_d_path+0x31/0x8b
May 20 09:32:52 turing-police kernel: [4295082.517000] [avc_audit+1418/2392] avc_audit+0x58a/0x958
May 20 09:32:52 turing-police kernel: [4295082.517000] [avc_has_perm+59/72] avc_has_perm+0x3b/0x48
May 20 09:32:52 turing-police kernel: [4295082.517000] [selinux_socket_unix_stream_connect+111/168] selinux_socket_unix_stream_c
onnect+0x6f/0xa8
May 20 09:32:52 turing-police kernel: [4295082.517000] [unix_stream_connect+552/1154] unix_stream_connect+0x228/0x482
May 20 09:32:52 turing-police kernel: [4295082.518000] [sys_connect+106/129] sys_connect+0x6a/0x81
May 20 09:32:52 turing-police kernel: [4295082.518000] [sys_socketcall+111/358] sys_socketcall+0x6f/0x166
May 20 09:32:52 turing-police kernel: [4295082.518000] [syscall_call+7/11] syscall_call+0x7/0xb


Attachments:
(No filename) (226.00 B)

2005-05-20 15:02:21

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 10:30 -0400, [email protected] wrote:
> Looks like we either only swatted half the bug, or the patch moved it
> around. Slightly different trace this time:

OK. Steve's audit_log_d_path() change, which I pulled in because it had
the side-effect of NUL-terminating the buffer, is now using GFP_KERNEL
where previously it was not.

We could make it use GFP_ATOMIC, but I suspect the better answer if at
all possible would be to make sure that avc_audit doesn't call it with
spinlocks held. Or maybe to make avc_audit() pass a gfp_mask to it, but
I don't like that much.

--
dwmw2

2005-05-20 15:19:47

by Stephen Smalley

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 15:59 +0100, David Woodhouse wrote:
> On Fri, 2005-05-20 at 10:30 -0400, [email protected] wrote:
> > Looks like we either only swatted half the bug, or the patch moved it
> > around. Slightly different trace this time:
>
> OK. Steve's audit_log_d_path() change, which I pulled in because it had
> the side-effect of NUL-terminating the buffer, is now using GFP_KERNEL
> where previously it was not.
>
> We could make it use GFP_ATOMIC, but I suspect the better answer if at
> all possible would be to make sure that avc_audit doesn't call it with
> spinlocks held. Or maybe to make avc_audit() pass a gfp_mask to it, but
> I don't like that much.

The lock is being held by the af_unix code (unix_state_wlock), not
avc_audit; the AVC is called under all kinds of circumstances (softirq,
hard irq, caller holding locks on relevant objects) for permission
checking and must never sleep.

One option might be to defer some of the AVC auditing to the audit
framework (e.g. save the vfsmount and dentry on the current audit
context and let audit_log_exit perform the audit_log_d_path).

--
Stephen Smalley
National Security Agency

2005-05-20 15:39:08

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 11:09 -0400, Stephen Smalley wrote:
> The lock is being held by the af_unix code (unix_state_wlock), not
> avc_audit; the AVC is called under all kinds of circumstances (softirq,
> hard irq, caller holding locks on relevant objects) for permission
> checking and must never sleep.
>
> One option might be to defer some of the AVC auditing to the audit
> framework (e.g. save the vfsmount and dentry on the current audit
> context and let audit_log_exit perform the audit_log_d_path).

Yeah, maybe. Assuming you pin them, it's easy enough to hang something
off the audit context's aux list which refers to them. I'm really not
that fond of the idea of allocating a whole PATH_MAX with GFP_ATOMIC.

--
dwmw2

2005-05-20 16:41:48

by Chris Wright

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

* David Woodhouse ([email protected]) wrote:
> On Fri, 2005-05-20 at 11:09 -0400, Stephen Smalley wrote:
> > The lock is being held by the af_unix code (unix_state_wlock), not
> > avc_audit; the AVC is called under all kinds of circumstances (softirq,
> > hard irq, caller holding locks on relevant objects) for permission
> > checking and must never sleep.
> >
> > One option might be to defer some of the AVC auditing to the audit
> > framework (e.g. save the vfsmount and dentry on the current audit
> > context and let audit_log_exit perform the audit_log_d_path).
>
> Yeah, maybe. Assuming you pin them, it's easy enough to hang something
> off the audit context's aux list which refers to them. I'm really not
> that fond of the idea of allocating a whole PATH_MAX with GFP_ATOMIC.

I agree, PATH_MAX atomic is greedy. But the calling convention could
be a bit awkward. Currently it's as got as audit_log_format. Tacking
vfsmount/dentry pair on might be tough to format the message with.
Got a good idea?

2005-05-20 16:49:58

by Stephen Smalley

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 16:36 +0100, David Woodhouse wrote:
> On Fri, 2005-05-20 at 11:09 -0400, Stephen Smalley wrote:
> > The lock is being held by the af_unix code (unix_state_wlock), not
> > avc_audit; the AVC is called under all kinds of circumstances (softirq,
> > hard irq, caller holding locks on relevant objects) for permission
> > checking and must never sleep.
> >
> > One option might be to defer some of the AVC auditing to the audit
> > framework (e.g. save the vfsmount and dentry on the current audit
> > context and let audit_log_exit perform the audit_log_d_path).
>
> Yeah, maybe. Assuming you pin them, it's easy enough to hang something
> off the audit context's aux list which refers to them. I'm really not
> that fond of the idea of allocating a whole PATH_MAX with GFP_ATOMIC.

Untested patch below, relative to 2.6.12-rc4-mm2 plus your socketcall
patch to avoid the obvious conflict there. Is this what you had in
mind?

include/linux/audit.h | 2 ++
kernel/auditsc.c | 38 ++++++++++++++++++++++++++++++++++++++
security/selinux/avc.c | 17 ++++++++---------
3 files changed, 48 insertions(+), 9 deletions(-)

--- linux-2.6.12-rc4-mm2/include/linux/audit.h.orig 2005-05-20 12:37:41.000000000 -0400
+++ linux-2.6.12-rc4-mm2/include/linux/audit.h 2005-05-20 12:38:15.000000000 -0400
@@ -238,6 +238,7 @@ extern uid_t audit_get_loginuid(struct a
extern int audit_ipc_perms(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode);
extern int audit_socketcall(int nargs, unsigned long *args);
extern int audit_sockaddr(int len, void *addr);
+extern int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt);
extern void audit_signal_info(int sig, struct task_struct *t);
#else
#define audit_alloc(t) ({ 0; })
@@ -253,6 +254,7 @@ extern void audit_signal_info(int sig, s
#define audit_ipc_perms(q,u,g,m) ({ 0; })
#define audit_socketcall(n,a) ({ 0; })
#define audit_sockaddr(len, addr) ({ 0; })
+#define audit_avc_path(dentry, mnt) ({ 0; })
#define audit_signal_info(s,t) do { ; } while (0)
#endif

--- linux-2.6.12-rc4-mm2/kernel/auditsc.c.orig 2005-05-20 12:37:19.000000000 -0400
+++ linux-2.6.12-rc4-mm2/kernel/auditsc.c 2005-05-20 12:43:05.000000000 -0400
@@ -34,6 +34,7 @@
#include <asm/types.h>
#include <linux/mm.h>
#include <linux/module.h>
+#include <linux/mount.h>
#include <linux/socket.h>
#include <linux/audit.h>
#include <linux/personality.h>
@@ -124,6 +125,11 @@ struct audit_aux_data_sockaddr {
char a[0];
};

+struct audit_aux_data_avc {
+ struct audit_aux_data d;
+ struct dentry *dentry;
+ struct vfsmount *mnt;
+};

/* The per-task audit context. */
struct audit_context {
@@ -553,6 +559,11 @@ static inline void audit_free_aux(struct
struct audit_aux_data *aux;

while ((aux = context->aux)) {
+ if (aux->type == AUDIT_AVC) {
+ struct audit_aux_data_avc *axi = (void *)aux;
+ dput(axi->dentry);
+ mntput(axi->mnt);
+ }
context->aux = aux->next;
kfree(aux);
}
@@ -728,6 +739,12 @@ static void audit_log_exit(struct audit_
} /* case AUDIT_SOCKADDR */
break;

+ case AUDIT_AVC: {
+ struct audit_aux_data_avc *axi = (void *)aux;
+ if (axi->dentry)
+ audit_log_d_path(ab, "path=", axi->dentry, axi->mnt);
+ } /* case AUDIT_AVC */
+ break;
}
audit_log_end(ab);

@@ -1128,6 +1145,27 @@ int audit_sockaddr(int len, void *a)
return 0;
}

+int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt)
+{
+ struct audit_aux_data_avc *ax;
+ struct audit_context *context = current->audit_context;
+
+ if (likely(!context))
+ return 0;
+
+ ax = kmalloc(sizeof(*ax), GFP_ATOMIC);
+ if (!ax)
+ return -ENOMEM;
+
+ ax->dentry = dget(dentry);
+ ax->mnt = mntget(mnt);
+
+ ax->d.type = AUDIT_AVC;
+ ax->d.next = context->aux;
+ context->aux = (void *)ax;
+ return 0;
+}
+
void audit_signal_info(int sig, struct task_struct *t)
{
extern pid_t audit_sig_pid;
--- linux-2.6.12-rc4-mm2/security/selinux/avc.c.orig 2005-05-20 12:37:41.000000000 -0400
+++ linux-2.6.12-rc4-mm2/security/selinux/avc.c 2005-05-20 12:39:06.000000000 -0400
@@ -566,13 +566,10 @@ void avc_audit(u32 ssid, u32 tsid,
case AVC_AUDIT_DATA_FS:
if (a->u.fs.dentry) {
struct dentry *dentry = a->u.fs.dentry;
- if (a->u.fs.mnt) {
- audit_log_d_path(ab, "path=", dentry,
- a->u.fs.mnt);
- } else {
- audit_log_format(ab, " name=%s",
- dentry->d_name.name);
- }
+ if (a->u.fs.mnt)
+ audit_avc_path(dentry, a->u.fs.mnt);
+ audit_log_format(ab, " name=%s",
+ dentry->d_name.name);
inode = dentry->d_inode;
} else if (a->u.fs.inode) {
struct dentry *dentry;
@@ -623,8 +620,10 @@ void avc_audit(u32 ssid, u32 tsid,
case AF_UNIX:
u = unix_sk(sk);
if (u->dentry) {
- audit_log_d_path(ab, "path=",
- u->dentry, u->mnt);
+ audit_avc_path(u->dentry, u->mnt);
+ audit_log_format(ab, " name=%s",
+ u->dentry->d_name.name);
+
break;
}
if (!u->addr)


--
Stephen Smalley
National Security Agency

2005-05-20 16:58:09

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 12:40 -0400, Stephen Smalley wrote:
> Untested patch below, relative to 2.6.12-rc4-mm2 plus your socketcall
> patch to avoid the obvious conflict there. Is this what you had in
> mind?

Yeah, basically. Although it would be better to introduce AUDIT_AVC_PATH
instead of using AUDIT_AVC for the type. If there's general agreement
this this is a sane answer, I'll stick it in the git tree. Can I see a
Signed-off-by line please?

--
dwmw2

2005-05-20 17:02:59

by Chris Wright

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

* Stephen Smalley ([email protected]) wrote:
> Untested patch below, relative to 2.6.12-rc4-mm2 plus your socketcall
> patch to avoid the obvious conflict there. Is this what you had in
> mind?
>
> include/linux/audit.h | 2 ++
> kernel/auditsc.c | 38 ++++++++++++++++++++++++++++++++++++++
> security/selinux/avc.c | 17 ++++++++---------
> 3 files changed, 48 insertions(+), 9 deletions(-)
>
> --- linux-2.6.12-rc4-mm2/include/linux/audit.h.orig 2005-05-20 12:37:41.000000000 -0400
> +++ linux-2.6.12-rc4-mm2/include/linux/audit.h 2005-05-20 12:38:15.000000000 -0400
> @@ -238,6 +238,7 @@ extern uid_t audit_get_loginuid(struct a
> extern int audit_ipc_perms(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode);
> extern int audit_socketcall(int nargs, unsigned long *args);
> extern int audit_sockaddr(int len, void *addr);
> +extern int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt);
> extern void audit_signal_info(int sig, struct task_struct *t);
> #else
> #define audit_alloc(t) ({ 0; })
> @@ -253,6 +254,7 @@ extern void audit_signal_info(int sig, s
> #define audit_ipc_perms(q,u,g,m) ({ 0; })
> #define audit_socketcall(n,a) ({ 0; })
> #define audit_sockaddr(len, addr) ({ 0; })
> +#define audit_avc_path(dentry, mnt) ({ 0; })
> #define audit_signal_info(s,t) do { ; } while (0)
> #endif
>
> --- linux-2.6.12-rc4-mm2/kernel/auditsc.c.orig 2005-05-20 12:37:19.000000000 -0400
> +++ linux-2.6.12-rc4-mm2/kernel/auditsc.c 2005-05-20 12:43:05.000000000 -0400
> @@ -34,6 +34,7 @@
> #include <asm/types.h>
> #include <linux/mm.h>
> #include <linux/module.h>
> +#include <linux/mount.h>
> #include <linux/socket.h>
> #include <linux/audit.h>
> #include <linux/personality.h>
> @@ -124,6 +125,11 @@ struct audit_aux_data_sockaddr {
> char a[0];
> };
>
> +struct audit_aux_data_avc {

I guess it's not really avc specific (although it's primary user).

> + struct audit_aux_data d;
> + struct dentry *dentry;
> + struct vfsmount *mnt;
> +};
>
> /* The per-task audit context. */
> struct audit_context {
> @@ -553,6 +559,11 @@ static inline void audit_free_aux(struct
> struct audit_aux_data *aux;
>
> while ((aux = context->aux)) {
> + if (aux->type == AUDIT_AVC) {
> + struct audit_aux_data_avc *axi = (void *)aux;
> + dput(axi->dentry);
> + mntput(axi->mnt);
> + }
> context->aux = aux->next;
> kfree(aux);
> }
> @@ -728,6 +739,12 @@ static void audit_log_exit(struct audit_
> } /* case AUDIT_SOCKADDR */
> break;
>
> + case AUDIT_AVC: {
> + struct audit_aux_data_avc *axi = (void *)aux;
> + if (axi->dentry)
> + audit_log_d_path(ab, "path=", axi->dentry, axi->mnt);
> + } /* case AUDIT_AVC */

Won't this change the order quite a bit? And how do you correlate path
vs. exe, etc.? Oh, I see, you're not using it for exe...

> + break;
> }
> audit_log_end(ab);
>
> @@ -1128,6 +1145,27 @@ int audit_sockaddr(int len, void *a)
> return 0;
> }
>
> +int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt)
> +{
> + struct audit_aux_data_avc *ax;
> + struct audit_context *context = current->audit_context;
> +
> + if (likely(!context))
> + return 0;
> +
> + ax = kmalloc(sizeof(*ax), GFP_ATOMIC);
> + if (!ax)
> + return -ENOMEM;
> +
> + ax->dentry = dget(dentry);
> + ax->mnt = mntget(mnt);
> +
> + ax->d.type = AUDIT_AVC;
> + ax->d.next = context->aux;
> + context->aux = (void *)ax;
> + return 0;
> +}
> +
> void audit_signal_info(int sig, struct task_struct *t)
> {
> extern pid_t audit_sig_pid;
> --- linux-2.6.12-rc4-mm2/security/selinux/avc.c.orig 2005-05-20 12:37:41.000000000 -0400
> +++ linux-2.6.12-rc4-mm2/security/selinux/avc.c 2005-05-20 12:39:06.000000000 -0400
> @@ -566,13 +566,10 @@ void avc_audit(u32 ssid, u32 tsid,
> case AVC_AUDIT_DATA_FS:
> if (a->u.fs.dentry) {
> struct dentry *dentry = a->u.fs.dentry;
> - if (a->u.fs.mnt) {
> - audit_log_d_path(ab, "path=", dentry,
> - a->u.fs.mnt);
> - } else {
> - audit_log_format(ab, " name=%s",
> - dentry->d_name.name);
> - }
> + if (a->u.fs.mnt)
> + audit_avc_path(dentry, a->u.fs.mnt);
> + audit_log_format(ab, " name=%s",
> + dentry->d_name.name);
> inode = dentry->d_inode;
> } else if (a->u.fs.inode) {
> struct dentry *dentry;
> @@ -623,8 +620,10 @@ void avc_audit(u32 ssid, u32 tsid,
> case AF_UNIX:
> u = unix_sk(sk);
> if (u->dentry) {
> - audit_log_d_path(ab, "path=",
> - u->dentry, u->mnt);
> + audit_avc_path(u->dentry, u->mnt);
> + audit_log_format(ab, " name=%s",
> + u->dentry->d_name.name);
> +
> break;
> }
> if (!u->addr)
>

2005-05-20 17:07:49

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 12:40 -0400, Stephen Smalley wrote:
> @@ -728,6 +739,12 @@ static void audit_log_exit(struct audit_
> } /* case AUDIT_SOCKADDR */
> break;
>
> + case AUDIT_AVC: {
> + struct audit_aux_data_avc *axi = (void *)aux;
> + if (axi->dentry)
> + audit_log_d_path(ab, "path=", axi->dentry, axi->mnt);
> + } /* case AUDIT_AVC */
> + break;
> }
> audit_log_end(ab);

It gets freed at this point too, not just in audit_free_aux(). So you
have to do the mntput and dput here too.

--
dwmw2

2005-05-20 17:07:58

by Stephen Smalley

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 17:55 +0100, David Woodhouse wrote:
> Yeah, basically. Although it would be better to introduce AUDIT_AVC_PATH
> instead of using AUDIT_AVC for the type. If there's general agreement
> this this is a sane answer, I'll stick it in the git tree. Can I see a
> Signed-off-by line please?

Patched kernel compiles, boots, and runs the selinux testsuite as
expected, with just the (last component) name= info in the avc message
and the path= info in the associated syscall audit message. I don't
mind introducing an AUDIT_AVC_PATH type if desired, but saw that there
was an AUDIT_AVC definition that wasn't being used yet. What do people
want? Would we end up adding separate types for each kind of auxiliary
audit data provided by the AVC, or just put them all into a single top-
level type with possibly a subtype to distinguish.

--
Stephen Smalley
National Security Agency

2005-05-20 17:15:07

by Stephen Smalley

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 10:01 -0700, Chris Wright wrote:
> > +struct audit_aux_data_avc {
>
> I guess it's not really avc specific (although it's primary user).

You mean generalize the struct for possible re-use by other audit
helpers but keep the type value and function distinct?
audit_aux_data_path? Analogous to struct path in namei.c.

> Won't this change the order quite a bit? And how do you correlate path
> vs. exe, etc.? Oh, I see, you're not using it for exe...

Could be an issue for syscalls that involve multiple files, e.g. rename.
We are at least still logging the last component name, device, and inode
number with the avc message, and only deferring logging of the full
pathname.

--
Stephen Smalley
National Security Agency

2005-05-20 17:16:17

by Stephen Smalley

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 18:05 +0100, David Woodhouse wrote:
> It gets freed at this point too, not just in audit_free_aux(). So you
> have to do the mntput and dput here too.

Ah, good point. Thanks for catching it. I'll update the patch once I
get clarification on the other issues (type value, general struct).

--
Stephen Smalley
National Security Agency

2005-05-20 17:19:40

by Chris Wright

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

* Stephen Smalley ([email protected]) wrote:
> On Fri, 2005-05-20 at 10:01 -0700, Chris Wright wrote:
> > > +struct audit_aux_data_avc {
> >
> > I guess it's not really avc specific (although it's primary user).
>
> You mean generalize the struct for possible re-use by other audit
> helpers but keep the type value and function distinct?
> audit_aux_data_path? Analogous to struct path in namei.c.

Yup.

> > Won't this change the order quite a bit? And how do you correlate path
> > vs. exe, etc.? Oh, I see, you're not using it for exe...
>
> Could be an issue for syscalls that involve multiple files, e.g. rename.
> We are at least still logging the last component name, device, and inode
> number with the avc message, and only deferring logging of the full
> pathname.

If it works for you, it's really only effecting your messages at this point.
I took David's idea to mean replace audit_log_d_path altogether, and all
paths logged on exit with aux data.

thanks,
-chris

2005-05-20 17:36:28

by Stephen Smalley

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 17:55 +0100, David Woodhouse wrote:
> Yeah, basically. Although it would be better to introduce AUDIT_AVC_PATH
> instead of using AUDIT_AVC for the type. If there's general agreement
> this this is a sane answer, I'll stick it in the git tree. Can I see a
> Signed-off-by line please?

This patch changes the SELinux AVC to defer logging of paths to the audit
framework upon syscall exit, by saving a reference to the (dentry,vfsmount)
pair in an auxiliary audit item on the current audit context for processing
by audit_log_exit.

Signed-off-by: Stephen Smalley <[email protected]>

---

include/linux/audit.h | 3 +++
kernel/auditsc.c | 39 +++++++++++++++++++++++++++++++++++++++
security/selinux/avc.c | 17 ++++++++---------
3 files changed, 50 insertions(+), 9 deletions(-)

--- linux-2.6.12-rc4-mm2/include/linux/audit.h.orig 2005-05-20 12:37:41.000000000 -0400
+++ linux-2.6.12-rc4-mm2/include/linux/audit.h 2005-05-20 13:17:21.000000000 -0400
@@ -75,6 +75,7 @@

#define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
#define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
+#define AUDIT_AVC_PATH 1402 /* dentry, vfsmount pair from avc */

#define AUDIT_KERNEL 2000 /* Asynchronous audit record. NOT A REQUEST. */

@@ -238,6 +239,7 @@ extern uid_t audit_get_loginuid(struct a
extern int audit_ipc_perms(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode);
extern int audit_socketcall(int nargs, unsigned long *args);
extern int audit_sockaddr(int len, void *addr);
+extern int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt);
extern void audit_signal_info(int sig, struct task_struct *t);
#else
#define audit_alloc(t) ({ 0; })
@@ -253,6 +255,7 @@ extern void audit_signal_info(int sig, s
#define audit_ipc_perms(q,u,g,m) ({ 0; })
#define audit_socketcall(n,a) ({ 0; })
#define audit_sockaddr(len, addr) ({ 0; })
+#define audit_avc_path(dentry, mnt) ({ 0; })
#define audit_signal_info(s,t) do { ; } while (0)
#endif

--- linux-2.6.12-rc4-mm2/kernel/auditsc.c.orig 2005-05-20 12:37:19.000000000 -0400
+++ linux-2.6.12-rc4-mm2/kernel/auditsc.c 2005-05-20 13:21:40.000000000 -0400
@@ -34,6 +34,7 @@
#include <asm/types.h>
#include <linux/mm.h>
#include <linux/module.h>
+#include <linux/mount.h>
#include <linux/socket.h>
#include <linux/audit.h>
#include <linux/personality.h>
@@ -124,6 +125,11 @@ struct audit_aux_data_sockaddr {
char a[0];
};

+struct audit_aux_data_path {
+ struct audit_aux_data d;
+ struct dentry *dentry;
+ struct vfsmount *mnt;
+};

/* The per-task audit context. */
struct audit_context {
@@ -553,6 +559,11 @@ static inline void audit_free_aux(struct
struct audit_aux_data *aux;

while ((aux = context->aux)) {
+ if (aux->type == AUDIT_AVC_PATH) {
+ struct audit_aux_data_path *axi = (void *)aux;
+ dput(axi->dentry);
+ mntput(axi->mnt);
+ }
context->aux = aux->next;
kfree(aux);
}
@@ -728,6 +739,13 @@ static void audit_log_exit(struct audit_
} /* case AUDIT_SOCKADDR */
break;

+ case AUDIT_AVC_PATH: {
+ struct audit_aux_data_path *axi = (void *)aux;
+ audit_log_d_path(ab, "path=", axi->dentry, axi->mnt);
+ dput(axi->dentry);
+ mntput(axi->mnt);
+ } /* case AUDIT_AVC_PATH */
+ break;
}
audit_log_end(ab);

@@ -1128,6 +1146,27 @@ int audit_sockaddr(int len, void *a)
return 0;
}

+int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt)
+{
+ struct audit_aux_data_path *ax;
+ struct audit_context *context = current->audit_context;
+
+ if (likely(!context))
+ return 0;
+
+ ax = kmalloc(sizeof(*ax), GFP_ATOMIC);
+ if (!ax)
+ return -ENOMEM;
+
+ ax->dentry = dget(dentry);
+ ax->mnt = mntget(mnt);
+
+ ax->d.type = AUDIT_AVC_PATH;
+ ax->d.next = context->aux;
+ context->aux = (void *)ax;
+ return 0;
+}
+
void audit_signal_info(int sig, struct task_struct *t)
{
extern pid_t audit_sig_pid;
--- linux-2.6.12-rc4-mm2/security/selinux/avc.c.orig 2005-05-20 12:37:41.000000000 -0400
+++ linux-2.6.12-rc4-mm2/security/selinux/avc.c 2005-05-20 12:39:06.000000000 -0400
@@ -566,13 +566,10 @@ void avc_audit(u32 ssid, u32 tsid,
case AVC_AUDIT_DATA_FS:
if (a->u.fs.dentry) {
struct dentry *dentry = a->u.fs.dentry;
- if (a->u.fs.mnt) {
- audit_log_d_path(ab, "path=", dentry,
- a->u.fs.mnt);
- } else {
- audit_log_format(ab, " name=%s",
- dentry->d_name.name);
- }
+ if (a->u.fs.mnt)
+ audit_avc_path(dentry, a->u.fs.mnt);
+ audit_log_format(ab, " name=%s",
+ dentry->d_name.name);
inode = dentry->d_inode;
} else if (a->u.fs.inode) {
struct dentry *dentry;
@@ -623,8 +620,10 @@ void avc_audit(u32 ssid, u32 tsid,
case AF_UNIX:
u = unix_sk(sk);
if (u->dentry) {
- audit_log_d_path(ab, "path=",
- u->dentry, u->mnt);
+ audit_avc_path(u->dentry, u->mnt);
+ audit_log_format(ab, " name=%s",
+ u->dentry->d_name.name);
+
break;
}
if (!u->addr)



--
Stephen Smalley
National Security Agency

2005-05-20 23:29:50

by David Woodhouse

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 - sleeping function called from invalid context at mm/slab.c:2502

On Fri, 2005-05-20 at 13:26 -0400, Stephen Smalley wrote:
>
> This patch changes the SELinux AVC to defer logging of paths to the audit
> framework upon syscall exit, by saving a reference to the (dentry,vfsmount)
> pair in an auxiliary audit item on the current audit context for processing
> by audit_log_exit.
>
> Signed-off-by: Stephen Smalley <[email protected]>

Applied; thanks.

--
dwmw2