2010-08-20 02:13:17

by Michael Neuling

[permalink] [raw]
Subject: [PATCH] audit: speedup for syscalls when auditing is disabled

We found that when auditing is disabled using "auditctl -D", that
there's still a significant overhead when doing syscalls. This overhead
is not present when a single never rule is inserted using "auditctl -a
task,never".

Using Anton's null syscall microbenchmark from
http://ozlabs.org/~anton/junkcode/null_syscall.c we currently have on a
powerpc machine:

# auditctl -D
No rules
# ./null_syscall
null_syscall: 739.03 cycles 100.00%
# auditctl -a task,never
# ./null_syscall
null_syscall: 204.63 cycles 100.00%

This doesn't seem right, as we'd hope that auditing would have the same
minimal impact when disabled via -D as when we have a single never rule.

The patch below creates a fast path when initialising a task. If the
rules list for tasks is empty (the disabled -D option), we mark auditing
as disabled for this task.

When this is applied, our null syscall benchmark improves in the
disabled case to match the single never rule case.

# auditctl -D
No rules
# ./null_syscall
null_syscall: 204.62 cycles 100.00%
# auditctl -a task,never
# ./null_syscall
null_syscall: 204.63 cycles 100.00%

Reported-by: Anton Blanchard <[email protected]>
Signed-off-by: Michael Neuling <[email protected]>
---
I'm not familiar with the auditing code/infrastructure so I may have
misunderstood something here

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 1b31c13..1cd6ec7 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -666,6 +666,11 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
enum audit_state state;

rcu_read_lock();
+ /* Fast path. If the list is empty, disable auditing */
+ if (list_empty(&audit_filter_list[AUDIT_FILTER_TASK])) {
+ rcu_read_unlock();
+ return AUDIT_DISABLED;
+ }
list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list) {
if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
if (state == AUDIT_RECORD_CONTEXT)


2010-08-23 17:56:31

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Fri, 2010-08-20 at 12:13 +1000, Michael Neuling wrote:
> We found that when auditing is disabled using "auditctl -D", that
> there's still a significant overhead when doing syscalls. This overhead
> is not present when a single never rule is inserted using "auditctl -a
> task,never".
>
> Using Anton's null syscall microbenchmark from
> http://ozlabs.org/~anton/junkcode/null_syscall.c we currently have on a
> powerpc machine:
>
> # auditctl -D
> No rules
> # ./null_syscall
> null_syscall: 739.03 cycles 100.00%
> # auditctl -a task,never
> # ./null_syscall
> null_syscall: 204.63 cycles 100.00%
>
> This doesn't seem right, as we'd hope that auditing would have the same
> minimal impact when disabled via -D as when we have a single never rule.
>
> The patch below creates a fast path when initialising a task. If the
> rules list for tasks is empty (the disabled -D option), we mark auditing
> as disabled for this task.
>
> When this is applied, our null syscall benchmark improves in the
> disabled case to match the single never rule case.
>
> # auditctl -D
> No rules
> # ./null_syscall
> null_syscall: 204.62 cycles 100.00%
> # auditctl -a task,never
> # ./null_syscall
> null_syscall: 204.63 cycles 100.00%
>
> Reported-by: Anton Blanchard <[email protected]>
> Signed-off-by: Michael Neuling <[email protected]>
> ---
> I'm not familiar with the auditing code/infrastructure so I may have
> misunderstood something here
>
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 1b31c13..1cd6ec7 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -666,6 +666,11 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
> enum audit_state state;
>
> rcu_read_lock();
> + /* Fast path. If the list is empty, disable auditing */
> + if (list_empty(&audit_filter_list[AUDIT_FILTER_TASK])) {
> + rcu_read_unlock();
> + return AUDIT_DISABLED;
> + }
> list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list) {
> if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
> if (state == AUDIT_RECORD_CONTEXT)

I don't think this works at all. I don't see how syscall audit'ing can
work. What if I have nothing in the AUDIT_FILTER_TASK list but I want
to audit all 'open(2)' syscalls? This patch is going to leave the task
in the DISABLED state and we won't ever be able to match on the syscall
rules.


I wonder if you could get much back, in terms of performance, by moving
the
context->dummy = !audit_n_rules;
line to the top and just returning if context->dummy == 1;

I'll play a bit, but I thought that was supposed to be a safe thing to
do....

2010-08-24 02:11:24

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

In message <[email protected]> you wrote:
> On Fri, 2010-08-20 at 12:13 +1000, Michael Neuling wrote:
> > We found that when auditing is disabled using "auditctl -D", that
> > there's still a significant overhead when doing syscalls. This overhead
> > is not present when a single never rule is inserted using "auditctl -a
> > task,never".
> >
> > Using Anton's null syscall microbenchmark from
> > http://ozlabs.org/~anton/junkcode/null_syscall.c we currently have on a
> > powerpc machine:
> >
> > # auditctl -D
> > No rules
> > # ./null_syscall
> > null_syscall: 739.03 cycles 100.00%
> > # auditctl -a task,never
> > # ./null_syscall
> > null_syscall: 204.63 cycles 100.00%
> >
> > This doesn't seem right, as we'd hope that auditing would have the same
> > minimal impact when disabled via -D as when we have a single never rule.
> >
> > The patch below creates a fast path when initialising a task. If the
> > rules list for tasks is empty (the disabled -D option), we mark auditing
> > as disabled for this task.
> >
> > When this is applied, our null syscall benchmark improves in the
> > disabled case to match the single never rule case.
> >
> > # auditctl -D
> > No rules
> > # ./null_syscall
> > null_syscall: 204.62 cycles 100.00%
> > # auditctl -a task,never
> > # ./null_syscall
> > null_syscall: 204.63 cycles 100.00%
> >
> > Reported-by: Anton Blanchard <[email protected]>
> > Signed-off-by: Michael Neuling <[email protected]>
> > ---
> > I'm not familiar with the auditing code/infrastructure so I may have
> > misunderstood something here
> >
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index 1b31c13..1cd6ec7 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -666,6 +666,11 @@ static enum audit_state audit_filter_task(struct task_
struct *tsk, char **key)
> > enum audit_state state;
> >
> > rcu_read_lock();
> > + /* Fast path. If the list is empty, disable auditing */
> > + if (list_empty(&audit_filter_list[AUDIT_FILTER_TASK])) {
> > + rcu_read_unlock();
> > + return AUDIT_DISABLED;
> > + }
> > list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list)
{
> > if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
> > if (state == AUDIT_RECORD_CONTEXT)
>
> I don't think this works at all. I don't see how syscall audit'ing can
> work. What if I have nothing in the AUDIT_FILTER_TASK list but I want
> to audit all 'open(2)' syscalls? This patch is going to leave the task
> in the DISABLED state and we won't ever be able to match on the syscall
> rules.

Sorry my bad. I'm not too familiar with the audit infrastructure.

On reflection, we might have a bug in audit_alloc though. Currently we
have this:

int audit_alloc(struct task_struct *tsk)
{
<snip>
state = audit_filter_task(tsk, &key);
if (likely(state == AUDIT_DISABLED))
return 0;

<snip>
set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return 0;
}

This gets called on fork. If we have "task,never" rule, we hit this
state == AUDIT_DISABLED path, return immediately and the tasks
TIF_SYSCALL_AUDIT flags doesn't get set. On powerpc, we check
TIF_SYSCALL_AUDIT in asm on syscall entry to fast path not calling the
syscall audit code.

This seems wrong to me as a "never" _task_ audit rule shouldn't effect
_syscall_ auditing? Is there some interaction between task and syscall
auditing that I'm missing?

> I wonder if you could get much back, in terms of performance, by moving
> the
> context->dummy = !audit_n_rules;
> line to the top and just returning if context->dummy == 1;

We get 668.09 cycles with this optimisation, so it comes down a bit, but
no where near if the auditing is disabled altogether.

Like I said above, powerpc has a fast path in asm on system call entry
to check the thread_info flags for if syscall auditing is disabled. If
it's disabled, we don't call the audit code, hence why it's very fast in
this case.

> I'll play a bit, but I thought that was supposed to be a safe thing to
> do....

Thanks!

Mikey

2010-08-24 02:21:22

by Anton Blanchard

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled


Hi Eric,

> I don't think this works at all. I don't see how syscall audit'ing can
> work. What if I have nothing in the AUDIT_FILTER_TASK list but I want
> to audit all 'open(2)' syscalls? This patch is going to leave the task
> in the DISABLED state and we won't ever be able to match on the syscall
> rules.

That's a good point. What if we went through and created an audit context
for each thread at the point where we add a rule to the audit subsystem?

That would make the common case where no one touches audit go fast. It's
only once you add a rule that you get the syscall entry/exit overhead of
audit.

Anton

2010-08-24 03:43:50

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Tue, 2010-08-24 at 12:11 +1000, Michael Neuling wrote:
> In message <[email protected]> you wrote:
> > On Fri, 2010-08-20 at 12:13 +1000, Michael Neuling wrote:
> > > We found that when auditing is disabled using "auditctl -D", that
> > > there's still a significant overhead when doing syscalls. This overhead
> > > is not present when a single never rule is inserted using "auditctl -a
> > > task,never".
> > >
> > > Using Anton's null syscall microbenchmark from
> > > http://ozlabs.org/~anton/junkcode/null_syscall.c we currently have on a
> > > powerpc machine:
> > >
> > > # auditctl -D
> > > No rules
> > > # ./null_syscall
> > > null_syscall: 739.03 cycles 100.00%
> > > # auditctl -a task,never
> > > # ./null_syscall
> > > null_syscall: 204.63 cycles 100.00%
> > >
> > > This doesn't seem right, as we'd hope that auditing would have the same
> > > minimal impact when disabled via -D as when we have a single never rule.
> > >
> > > The patch below creates a fast path when initialising a task. If the
> > > rules list for tasks is empty (the disabled -D option), we mark auditing
> > > as disabled for this task.
> > >
> > > When this is applied, our null syscall benchmark improves in the
> > > disabled case to match the single never rule case.
> > >
> > > # auditctl -D
> > > No rules
> > > # ./null_syscall
> > > null_syscall: 204.62 cycles 100.00%
> > > # auditctl -a task,never
> > > # ./null_syscall
> > > null_syscall: 204.63 cycles 100.00%
> > >
> > > Reported-by: Anton Blanchard <[email protected]>
> > > Signed-off-by: Michael Neuling <[email protected]>
> > > ---
> > > I'm not familiar with the auditing code/infrastructure so I may have
> > > misunderstood something here
> > >
> > > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > > index 1b31c13..1cd6ec7 100644
> > > --- a/kernel/auditsc.c
> > > +++ b/kernel/auditsc.c
> > > @@ -666,6 +666,11 @@ static enum audit_state audit_filter_task(struct task_
> struct *tsk, char **key)
> > > enum audit_state state;
> > >
> > > rcu_read_lock();
> > > + /* Fast path. If the list is empty, disable auditing */
> > > + if (list_empty(&audit_filter_list[AUDIT_FILTER_TASK])) {
> > > + rcu_read_unlock();
> > > + return AUDIT_DISABLED;
> > > + }
> > > list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list)
> {
> > > if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
> > > if (state == AUDIT_RECORD_CONTEXT)
> >
> > I don't think this works at all. I don't see how syscall audit'ing can
> > work. What if I have nothing in the AUDIT_FILTER_TASK list but I want
> > to audit all 'open(2)' syscalls? This patch is going to leave the task
> > in the DISABLED state and we won't ever be able to match on the syscall
> > rules.
>
> Sorry my bad. I'm not too familiar with the audit infrastructure.
>
> On reflection, we might have a bug in audit_alloc though. Currently we
> have this:
>
> int audit_alloc(struct task_struct *tsk)
> {
> <snip>
> state = audit_filter_task(tsk, &key);
> if (likely(state == AUDIT_DISABLED))
> return 0;
>
> <snip>
> set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> return 0;
> }
>
> This gets called on fork. If we have "task,never" rule, we hit this
> state == AUDIT_DISABLED path, return immediately and the tasks
> TIF_SYSCALL_AUDIT flags doesn't get set. On powerpc, we check
> TIF_SYSCALL_AUDIT in asm on syscall entry to fast path not calling the
> syscall audit code.

I'm guessing it actually bypasses audit if the flag is not set? So we
might have a bug, but i'd be surprised since I think we tested audit on
powerpc....

> This seems wrong to me as a "never" _task_ audit rule shouldn't effect
> _syscall_ auditing? Is there some interaction between task and syscall
> auditing that I'm missing?

There are 3 states for a given task, I don't remember the names off the
top of my head, so I'll guess with: on, off, build. 'Build' is the
state most processes usually live in. In this state we collect audit
information about the task during the whole syscall and then we might
(likely) throw that information away at syscall exit.

Some types of audit rule, which alter this state, can be checked at
either 'entry' or 'exit' (first rule wins) At syscall entry we only have
enough information (questionable if we even have enough information at
all but that's a different question) to filter based on the task. You
can create rules that will audit all tasks, or in your case will
explicitly disable auditing for all tasks.

Normally a process would be in the default 'build' state after syscall
entry, we will collect information about the syscall, and then we will
check syscall rules at exit. Once you explicitly say 'I do not want any
audit messages for this task' you are in 'off' instead of 'build.'

> > I wonder if you could get much back, in terms of performance, by moving
> > the
> > context->dummy = !audit_n_rules;
> > line to the top and just returning if context->dummy == 1;
>
> We get 668.09 cycles with this optimisation, so it comes down a bit, but
> no where near if the auditing is disabled altogether.

Clean that patch up and send it. Sounds like a win no matter what else
we do.

> Like I said above, powerpc has a fast path in asm on system call entry
> to check the thread_info flags for if syscall auditing is disabled. If
> it's disabled, we don't call the audit code, hence why it's very fast in
> this case.

Here's a new idea to think about with obvious tradeoffs. What do you
think about doing a little bit of assembly rejiggering?

Add a new spot in the assembly which will call a function which will
check if audit_n_rules > 0 and if so will set TIF_SYSCALL_AUDIT and if
not will clear TIF_SYSCALL_AUDIT? It might make things slightly worse
on systems which explictly disable audit and the flag would always be
clear on every task (like you did with the explicit rule) but I'm
guessing might be a win on systems with no rules which are wasting time
on the audit slow path.....

-Eric

2010-08-24 03:51:23

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Tue, 2010-08-24 at 12:16 +1000, Anton Blanchard wrote:
> Hi Eric,
>
> > I don't think this works at all. I don't see how syscall audit'ing can
> > work. What if I have nothing in the AUDIT_FILTER_TASK list but I want
> > to audit all 'open(2)' syscalls? This patch is going to leave the task
> > in the DISABLED state and we won't ever be able to match on the syscall
> > rules.
>
> That's a good point. What if we went through and created an audit context
> for each thread at the point where we add a rule to the audit subsystem?
>
> That would make the common case where no one touches audit go fast. It's
> only once you add a rule that you get the syscall entry/exit overhead of
> audit.
>
> Anton

It might be viable but I certainly couldn't say. I'm a bit scared of
allocating and attaching a struct audit_context to a running task as we
have no idea if that task is about to exit a syscall with an unfilled
out audit_context struct nor do I personally off the top of my head know
the task flag setting rules (especially when you want to set other
people's flags)

Maybe another approach would be to move audit_alloc() into the syscall
entry path rather than only at fork().

I think the best solution is going to be some better way of
setting/clearing TIF_SYSCALL_AUDIT. (notice the thing is never cleared,
ever, today)

-Eric

2010-08-24 05:56:13

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

> > On reflection, we might have a bug in audit_alloc though. Currently we
> > have this:
> >
> > int audit_alloc(struct task_struct *tsk)
> > {
> > <snip>
> > state = audit_filter_task(tsk, &key);
> > if (likely(state == AUDIT_DISABLED))
> > return 0;
> >
> > <snip>
> > set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> > return 0;
> > }
> >
> > This gets called on fork. If we have "task,never" rule, we hit this
> > state == AUDIT_DISABLED path, return immediately and the tasks
> > TIF_SYSCALL_AUDIT flags doesn't get set. On powerpc, we check
> > TIF_SYSCALL_AUDIT in asm on syscall entry to fast path not calling the
> > syscall audit code.
>
> I'm guessing it actually bypasses audit if the flag is not set?

Correct.

> So we might have a bug, but i'd be surprised since I think we tested
> audit on powerpc....

So on powerpc we have this in entry_64.S

#define _TIF_SYSCALL_AUDIT (1<<TIF_SYSCALL_AUDIT)
#define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP)

andi. r11,r10,_TIF_SYSCALL_T_OR_A
bne- syscall_dotrace

and there is similar code in x86 in entry_64.S

#ifdef CONFIG_AUDITSYSCALL
bt $TIF_SYSCALL_AUDIT,%edx
jc sysret_audit
#endif

So I think other archs are broken too.

I think the bug is in the generic code though. Shouldn't syscall
auditing be enabled independent of task fork/exec auditing?

> > This seems wrong to me as a "never" _task_ audit rule shouldn't effect
> > _syscall_ auditing? Is there some interaction between task and syscall
> > auditing that I'm missing?
>
> There are 3 states for a given task, I don't remember the names off the
> top of my head, so I'll guess with: on, off, build. 'Build' is the
> state most processes usually live in. In this state we collect audit
> information about the task during the whole syscall and then we might
> (likely) throw that information away at syscall exit.
>
> Some types of audit rule, which alter this state, can be checked at
> either 'entry' or 'exit' (first rule wins) At syscall entry we only have
> enough information (questionable if we even have enough information at
> all but that's a different question) to filter based on the task. You
> can create rules that will audit all tasks, or in your case will
> explicitly disable auditing for all tasks.

So does this mean that an "auditctl -a task,never" _should_ disable
syscall auding? I'm not 100% clear on this still.

> Normally a process would be in the default 'build' state after syscall
> entry, we will collect information about the syscall, and then we will
> check syscall rules at exit. Once you explicitly say 'I do not want any
> audit messages for this task' you are in 'off' instead of 'build.'

Ok

> > > I wonder if you could get much back, in terms of performance, by moving
> > > the
> > > context->dummy = !audit_n_rules;
> > > line to the top and just returning if context->dummy == 1;
> >
> > We get 668.09 cycles with this optimisation, so it comes down a bit, but
> > no where near if the auditing is disabled altogether.
>
> Clean that patch up and send it. Sounds like a win no matter what else
> we do.

Ok, sending in separate email.

> > Like I said above, powerpc has a fast path in asm on system call entry
> > to check the thread_info flags for if syscall auditing is disabled. If
> > it's disabled, we don't call the audit code, hence why it's very fast in
> > this case.
>
> Here's a new idea to think about with obvious tradeoffs. What do you
> think about doing a little bit of assembly rejiggering?
>
> Add a new spot in the assembly which will call a function which will
> check if audit_n_rules > 0 and if so will set TIF_SYSCALL_AUDIT and if
> not will clear TIF_SYSCALL_AUDIT? It might make things slightly worse
> on systems which explictly disable audit and the flag would always be
> clear on every task (like you did with the explicit rule) but I'm
> guessing might be a win on systems with no rules which are wasting time
> on the audit slow path.....

This sounds good to me except for one thing...

If we set TIF_SYSCALL_AUDIT when audit_n_rules > 0, it'll change the
functionality from what we have now in the "auditctl -a task,never"
case. Currently in this case, TIF_SYSCALL_AUDIT will not be set and
syscalls won't be audited. I think this is a bug (as discussed above),
but I wanted to point it out anyway.

Anyway, I'll take a stab at this in a bit.

Regards,
Mikey

2010-08-24 05:56:09

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

> > > I wonder if you could get much back, in terms of performance, by moving
> > > the
> > > context->dummy = !audit_n_rules;
> > > line to the top and just returning if context->dummy == 1;
> >
> > We get 668.09 cycles with this optimisation, so it comes down a bit, but
> > no where near if the auditing is disabled altogether.
>
> Clean that patch up and send it. Sounds like a win no matter what else
> we do.

ok...

audit: speedup audit_syscall_entry when there are zero rules

This creates a check at the start of audit_syscall_entry to see if there
are zero rules in the audit filter list. If there are zero rules return
immediately.

This buys about ~10% on a null syscall on powerpc.

Signed-off-by: Michael Neuling <[email protected]>

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 1b31c13..bc0872b 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1579,6 +1579,9 @@ void audit_syscall_entry(int arch, int major,
if (unlikely(!context))
return;

+ context->dummy = !audit_n_rules;
+ if (context->dummy == 1)
+ return;
/*
* This happens only on certain architectures that make system
* calls in kernel_thread via the entry.S interface, instead of
@@ -1628,7 +1631,6 @@ void audit_syscall_entry(int arch, int major,
context->argv[3] = a4;

state = context->state;
- context->dummy = !audit_n_rules;
if (!context->dummy && state == AUDIT_BUILD_CONTEXT) {
context->prio = 0;
state = audit_filter_syscall(tsk, context, &audit_filter_list[AUDIT_FILTER_ENTRY]);

2010-08-24 15:15:04

by Miloslav Trmac

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

----- "Eric Paris" <[email protected]> wrote:
> Add a new spot in the assembly which will call a function which will
> check if audit_n_rules > 0 and if so will set TIF_SYSCALL_AUDIT and if
> not will clear TIF_SYSCALL_AUDIT? It might make things slightly worse
> on systems which explictly disable audit and the flag would always be
> clear on every task (like you did with the explicit rule) but I'm
> guessing might be a win on systems with no rules which are wasting time
> on the audit slow path.....
Is "audit_n_rules" a specific enough trigger? Right now, even if there are no rules configured at all, audit_log_start() while processing a syscall will mark that syscall for auditing, and all collected information about the syscall will be logged at syscall exit.

Would the suggested change break this behavior?
Mirek

2010-08-24 15:17:57

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Tue, 2010-08-24 at 11:14 -0400, Miloslav Trmac wrote:
> ----- "Eric Paris" <[email protected]> wrote:
> > Add a new spot in the assembly which will call a function which will
> > check if audit_n_rules > 0 and if so will set TIF_SYSCALL_AUDIT and if
> > not will clear TIF_SYSCALL_AUDIT? It might make things slightly worse
> > on systems which explictly disable audit and the flag would always be
> > clear on every task (like you did with the explicit rule) but I'm
> > guessing might be a win on systems with no rules which are wasting time
> > on the audit slow path.....
> Is "audit_n_rules" a specific enough trigger? Right now, even if
> there are no rules configured at all, audit_log_start() while
> processing a syscall will mark that syscall for auditing, and all
> collected information about the syscall will be logged at syscall
> exit.
>
> Would the suggested change break this behavior?

That's correct, if there are 0 rules we still collect information while
knowing that it will get thrown away at the end (since obviously no rule
is going to switch the state from 'build' to 'on')

The only real loss is that a syscall in progress when a rule is loaded
would get audited today and wouldn't get audited with my new proposal.
I don't see that as an unacceptable race.....

-Eric

2010-08-24 20:06:58

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Tue, 2010-08-24 at 15:56 +1000, Michael Neuling wrote:
> > > On reflection, we might have a bug in audit_alloc though. Currently we
> > > have this:
> > >
> > > int audit_alloc(struct task_struct *tsk)
> > > {
> > > <snip>
> > > state = audit_filter_task(tsk, &key);
> > > if (likely(state == AUDIT_DISABLED))
> > > return 0;
> > >
> > > <snip>
> > > set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> > > return 0;
> > > }
> > >
> > > This gets called on fork. If we have "task,never" rule, we hit this
> > > state == AUDIT_DISABLED path, return immediately and the tasks
> > > TIF_SYSCALL_AUDIT flags doesn't get set. On powerpc, we check
> > > TIF_SYSCALL_AUDIT in asm on syscall entry to fast path not calling the
> > > syscall audit code.
> >
> > I'm guessing it actually bypasses audit if the flag is not set?
>
> Correct.
>
> > So we might have a bug, but i'd be surprised since I think we tested
> > audit on powerpc....
>
> So on powerpc we have this in entry_64.S
>
> #define _TIF_SYSCALL_AUDIT (1<<TIF_SYSCALL_AUDIT)
> #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP)
>
> andi. r11,r10,_TIF_SYSCALL_T_OR_A
> bne- syscall_dotrace
>
> and there is similar code in x86 in entry_64.S
>
> #ifdef CONFIG_AUDITSYSCALL
> bt $TIF_SYSCALL_AUDIT,%edx
> jc sysret_audit
> #endif
>
> So I think other archs are broken too.
>
> I think the bug is in the generic code though. Shouldn't syscall
> auditing be enabled independent of task fork/exec auditing?
>
> > > This seems wrong to me as a "never" _task_ audit rule shouldn't effect
> > > _syscall_ auditing? Is there some interaction between task and syscall
> > > auditing that I'm missing?
> >
> > There are 3 states for a given task, I don't remember the names off the
> > top of my head, so I'll guess with: on, off, build. 'Build' is the
> > state most processes usually live in. In this state we collect audit
> > information about the task during the whole syscall and then we might
> > (likely) throw that information away at syscall exit.
> >
> > Some types of audit rule, which alter this state, can be checked at
> > either 'entry' or 'exit' (first rule wins) At syscall entry we only have
> > enough information (questionable if we even have enough information at
> > all but that's a different question) to filter based on the task. You
> > can create rules that will audit all tasks, or in your case will
> > explicitly disable auditing for all tasks.
>
> So does this mean that an "auditctl -a task,never" _should_ disable
> syscall auding? I'm not 100% clear on this still.

Yes. We only have one kind of auditing. It audits what happens during
a syscall. We have lots of different types of rules that tell the
kernel if we should actually send that info to userspace or not. If we
hit a rule that says 'do not send anything to userspace' (like your
rule) the we should not send anything to userspace. It does not matter
if some later rule would have sent it. So given two rules like:

auditctl -a task,never
auditctl -a exit,always -S open

We hit the task,never rule first, so it wins. We won't alloc a context,
we wont set the thread flag, we won't go slowly. You can't get a task
out of this state. If that task,never rule was removed and you created
a task it would get set up as 'build' and would at syscall exit match
the exit,always rule and the collected information would get dumped.
exit,* rules are a LOT more flexible than task,* rules (as exit,* rules
can match on things collected by the syscall whereas task,* rules can
only match on what we know at the beginning, pid, selinux context, uid,
etc.....)

My suggestion was to more liberaly clear the thread audit flag so we get
the assembly fastpath. But this also requires that we set the flag when
appropriate.

-Eric

2010-08-25 03:11:30

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

In message <[email protected]> you wrote:
> On Tue, 2010-08-24 at 12:11 +1000, Michael Neuling wrote:
> > In message <[email protected]> you wrote:
> > > On Fri, 2010-08-20 at 12:13 +1000, Michael Neuling wrote:
> > > > We found that when auditing is disabled using "auditctl -D", that
> > > > there's still a significant overhead when doing syscalls. This overhea
d
> > > > is not present when a single never rule is inserted using "auditctl -a
> > > > task,never".
> > > >
> > > > Using Anton's null syscall microbenchmark from
> > > > http://ozlabs.org/~anton/junkcode/null_syscall.c we currently have on a
> > > > powerpc machine:
> > > >
> > > > # auditctl -D
> > > > No rules
> > > > # ./null_syscall
> > > > null_syscall: 739.03 cycles 100.00%
> > > > # auditctl -a task,never
> > > > # ./null_syscall
> > > > null_syscall: 204.63 cycles 100.00%
> > > >
> > > > This doesn't seem right, as we'd hope that auditing would have the same
> > > > minimal impact when disabled via -D as when we have a single never rule
.
> > > >
> > > > The patch below creates a fast path when initialising a task. If the
> > > > rules list for tasks is empty (the disabled -D option), we mark auditin
g
> > > > as disabled for this task.
> > > >
> > > > When this is applied, our null syscall benchmark improves in the
> > > > disabled case to match the single never rule case.
> > > >
> > > > # auditctl -D
> > > > No rules
> > > > # ./null_syscall
> > > > null_syscall: 204.62 cycles 100.00%
> > > > # auditctl -a task,never
> > > > # ./null_syscall
> > > > null_syscall: 204.63 cycles 100.00%
> > > >
> > > > Reported-by: Anton Blanchard <[email protected]>
> > > > Signed-off-by: Michael Neuling <[email protected]>
> > > > ---
> > > > I'm not familiar with the auditing code/infrastructure so I may have
> > > > misunderstood something here
> > > >
> > > > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > > > index 1b31c13..1cd6ec7 100644
> > > > --- a/kernel/auditsc.c
> > > > +++ b/kernel/auditsc.c
> > > > @@ -666,6 +666,11 @@ static enum audit_state audit_filter_task(struct t
ask_
> > struct *tsk, char **key)
> > > > enum audit_state state;
> > > >
> > > > rcu_read_lock();
> > > > + /* Fast path. If the list is empty, disable auditing */
> > > > + if (list_empty(&audit_filter_list[AUDIT_FILTER_TASK])) {
> > > > + rcu_read_unlock();
> > > > + return AUDIT_DISABLED;
> > > > + }
> > > > list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK
], list)
> > {
> > > > if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &stat
e)) {
> > > > if (state == AUDIT_RECORD_CONTEXT)
> > >
> > > I don't think this works at all. I don't see how syscall audit'ing can
> > > work. What if I have nothing in the AUDIT_FILTER_TASK list but I want
> > > to audit all 'open(2)' syscalls? This patch is going to leave the task
> > > in the DISABLED state and we won't ever be able to match on the syscall
> > > rules.
> >
> > Sorry my bad. I'm not too familiar with the audit infrastructure.
> >
> > On reflection, we might have a bug in audit_alloc though. Currently we
> > have this:
> >
> > int audit_alloc(struct task_struct *tsk)
> > {
> > <snip>
> > state = audit_filter_task(tsk, &key);
> > if (likely(state == AUDIT_DISABLED))
> > return 0;
> >
> > <snip>
> > set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> > return 0;
> > }
> >
> > This gets called on fork. If we have "task,never" rule, we hit this
> > state == AUDIT_DISABLED path, return immediately and the tasks
> > TIF_SYSCALL_AUDIT flags doesn't get set. On powerpc, we check
> > TIF_SYSCALL_AUDIT in asm on syscall entry to fast path not calling the
> > syscall audit code.
>
> I'm guessing it actually bypasses audit if the flag is not set? So we
> might have a bug, but i'd be surprised since I think we tested audit on
> powerpc....
>
> > This seems wrong to me as a "never" _task_ audit rule shouldn't effect
> > _syscall_ auditing? Is there some interaction between task and syscall
> > auditing that I'm missing?
>
> There are 3 states for a given task, I don't remember the names off the
> top of my head, so I'll guess with: on, off, build. 'Build' is the
> state most processes usually live in. In this state we collect audit
> information about the task during the whole syscall and then we might
> (likely) throw that information away at syscall exit.
>
> Some types of audit rule, which alter this state, can be checked at
> either 'entry' or 'exit' (first rule wins) At syscall entry we only have
> enough information (questionable if we even have enough information at
> all but that's a different question) to filter based on the task. You
> can create rules that will audit all tasks, or in your case will
> explicitly disable auditing for all tasks.
>
> Normally a process would be in the default 'build' state after syscall
> entry, we will collect information about the syscall, and then we will
> check syscall rules at exit. Once you explicitly say 'I do not want any
> audit messages for this task' you are in 'off' instead of 'build.'
>
> > > I wonder if you could get much back, in terms of performance, by moving
> > > the
> > > context->dummy = !audit_n_rules;
> > > line to the top and just returning if context->dummy == 1;
> >
> > We get 668.09 cycles with this optimisation, so it comes down a bit, but
> > no where near if the auditing is disabled altogether.
>
> Clean that patch up and send it. Sounds like a win no matter what else
> we do.
>
> > Like I said above, powerpc has a fast path in asm on system call entry
> > to check the thread_info flags for if syscall auditing is disabled. If
> > it's disabled, we don't call the audit code, hence why it's very fast in
> > this case.
>
> Here's a new idea to think about with obvious tradeoffs. What do you
> think about doing a little bit of assembly rejiggering?
>
> Add a new spot in the assembly which will call a function which will
> check if audit_n_rules > 0 and if so will set TIF_SYSCALL_AUDIT and if
> not will clear TIF_SYSCALL_AUDIT? It might make things slightly worse
> on systems which explictly disable audit and the flag would always be
> clear on every task (like you did with the explicit rule) but I'm
> guessing might be a win on systems with no rules which are wasting time
> on the audit slow path.....

BTW, do you think we can do this in audit_syscall_exit() too?

If I do, I get down to 387 cycles (739.03 vanilla, 668.09 with
audit_syscall_entry() optimisation, 204 best case) so about
another 50% perf improvement.

Patch was simply:

--- linux-next.orig/kernel/auditsc.c
+++ linux-next/kernel/auditsc.c
@@ -1681,7 +1683,7 @@ void audit_syscall_exit(int valid, long

context = audit_get_context(tsk, valid, return_code);

- if (likely(!context))
+ if (likely((!context) || (audit_n_rules == 0)))
return;

if (context->in_syscall && context->current_state == AUDIT_RECORD_CONTEXT)

2010-08-25 11:59:59

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Wed, 2010-08-25 at 13:11 +1000, Michael Neuling wrote:

> BTW, do you think we can do this in audit_syscall_exit() too?

No, I don't think that is safe, consider the case where we remove the
last rule while this task was inside a syscall. It may have information
stored which is supposed to get freed in the syscall exit.

We could probably drop the if (!context) statement altogether and then
before the audit_get_context() call we could add

if (audit_dummy_context())
return;

Which should be safe since that would imply there were no rules when we
entered the syscall. I'm impressed with how much difference these 2
simple patches can make!

Looks like audit_get_context() could use some cleanups too. What's the
point of setting the return code and crap like that when we know it's
not going anywhere. In any case, if you want to clean up this last idea
and send it I'll make sure it gets queued up for the next go round.

-Eric

> If I do, I get down to 387 cycles (739.03 vanilla, 668.09 with
> audit_syscall_entry() optimisation, 204 best case) so about
> another 50% perf improvement.
>
> Patch was simply:
>
> --- linux-next.orig/kernel/auditsc.c
> +++ linux-next/kernel/auditsc.c
> @@ -1681,7 +1683,7 @@ void audit_syscall_exit(int valid, long
>
> context = audit_get_context(tsk, valid, return_code);
>
> - if (likely(!context))
> + if (likely((!context) || (audit_n_rules == 0)))
> return;
>
> if (context->in_syscall && context->current_state == AUDIT_RECORD_CONTEXT)
>

2010-08-26 03:37:06

by Anton Blanchard

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled


Hi Eric,

Here's another approach Mikey and I were discussing. We allocate the
tsk->audit_context as before, but we avoid setting the TIF_SYSCALL_AUDIT until
the first rule gets added.

We could look at clearing the flag when the rules go back to zero, but this
simple patch covers the most common case I think.

Anton
---

Index: powerpc.git/kernel/auditfilter.c
===================================================================
--- powerpc.git.orig/kernel/auditfilter.c 2010-08-26 08:04:19.998892577 +1000
+++ powerpc.git/kernel/auditfilter.c 2010-08-26 08:04:30.290374256 +1000
@@ -859,6 +859,21 @@ out:
static u64 prio_low = ~0ULL/2;
static u64 prio_high = ~0ULL/2 - 1;

+#ifdef CONFIG_AUDITSYSCALL
+static void enable_syscall_auditing(void)
+{
+ unsigned long flags;
+ struct task_struct *g, *t;
+
+ read_lock_irqsave(&tasklist_lock, flags);
+ do_each_thread(g, t) {
+ if (t->audit_context)
+ set_tsk_thread_flag(t, TIF_SYSCALL_AUDIT);
+ } while_each_thread(g, t);
+ read_unlock_irqrestore(&tasklist_lock, flags);
+}
+#endif
+
/* Add rule to given filterlist if not a duplicate. */
static inline int audit_add_rule(struct audit_entry *entry)
{
@@ -922,9 +937,14 @@ static inline int audit_add_rule(struct
list_add_tail_rcu(&entry->list, list);
}
#ifdef CONFIG_AUDITSYSCALL
- if (!dont_count)
+ if (!dont_count) {
audit_n_rules++;

+ /* Did we add our first rule? */
+ if (audit_n_rules == 1)
+ enable_syscall_auditing();
+ }
+
if (!audit_match_signal(entry))
audit_signals++;
#endif
Index: powerpc.git/kernel/auditsc.c
===================================================================
--- powerpc.git.orig/kernel/auditsc.c 2010-08-26 08:04:19.998892577 +1000
+++ powerpc.git/kernel/auditsc.c 2010-08-26 08:04:30.390388654 +1000
@@ -886,7 +886,10 @@ int audit_alloc(struct task_struct *tsk)
context->filterkey = key;

tsk->audit_context = context;
- set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
+
+ /* We postpone setting the thread flag until we add the first rule */
+ if (audit_n_rules != 0)
+ set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return 0;
}

2010-08-27 17:49:30

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH] audit: speedup for syscalls when auditing is disabled

On Thu, 2010-08-26 at 13:34 +1000, Anton Blanchard wrote:
> Hi Eric,
>
> Here's another approach Mikey and I were discussing. We allocate the
> tsk->audit_context as before, but we avoid setting the TIF_SYSCALL_AUDIT until
> the first rule gets added.
>
> We could look at clearing the flag when the rules go back to zero, but this
> simple patch covers the most common case I think.

It just dawned on me where we are going to have problems. We have
things other than syscall filter rules that can cause us to want the
collected audit info. Namely SELinux (or other LSM) denials.

Crap.

So the change in audit_alloc() should probably be conditionalized on
more than just audit_n_rules(). Not exactly sure what that is though.

It might also make our syscall entry/exit speedups not as great of an
idea as I thought. I need to look for other audit users to see how
these things are oging to affect them :(

-Eric