2014-02-08 21:06:35

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

This toggles TIF_SYSCALL_AUDIT as needed when rules change instead
of leaving it set whenever rules might be set in the future. This
reduces syscall latency from >60ns to closer to 40ns on my laptop.

This code is a little bit tricky. It's not safe to turn
TIF_SYSCALL_AUDIT off during a syscall that is being audited. So,
with this change, the only thing that ever turns TIF_SYSCALL_AUDIT
off after a process has been created is __audit_syscall_exit.

This has the somewhat odd side-effect that, unless there's a
'task,never' rule, every task's first syscall will use the slow
path. This should be more or less unnoticable (what's another
20-40ns at task creation time between friends?).

There is a user-visible effect of this change: if there are no audit
rules, then events like AVC denials will not result in a syscall
audit record being written. I'm happy to accept this minor loss in
debuggability in exchange for a massive speedup of all system calls
on default distribution configurations.

A better solution would be to ditch the syscall entry hook entirely
and use the syscall_get_xyz calls as needed to fill in the context.
This could be done on top of this patch, but it would be more code.

Cc: Oleg Nesterov <[email protected]>
Cc: Steve Grubb <[email protected]>
Cc: Eric Paris <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---

Changes from v2 (actually v2.1):
- Use for_each_process_thread instead of do_each_thread :)
- Turn off TIF_SYSCALL_AUDIT lazily, thus hopefully avoiding all of
the nasty cases that Oleg noticed.

Changes from v1:
- For new tasks, set flags in a new audit_sync_flags callback instead of
in audit_alloc (thanks, Oleg).
- Rework locking.
- Use irqsave/irqrestore to avoid having to think about who else might have
taken spinlocks.

include/linux/audit.h | 8 ++++++--
kernel/auditfilter.c | 4 ++--
kernel/auditsc.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index aa865a9..ab00ffb 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -298,7 +298,8 @@ static inline void audit_mmap_fd(int fd, int flags)
__audit_mmap_fd(fd, flags);
}

-extern int audit_n_rules;
+extern void audit_inc_n_rules(void);
+extern void audit_dec_n_rules(void);
extern int audit_signals;
#else /* CONFIG_AUDITSYSCALL */
static inline int audit_alloc(struct task_struct *task)
@@ -404,7 +405,10 @@ static inline void audit_mmap_fd(int fd, int flags)
{ }
static inline void audit_ptrace(struct task_struct *t)
{ }
-#define audit_n_rules 0
+static inline void audit_inc_n_rules(void)
+{ }
+static inline void audit_dec_n_rules(void)
+{ }
#define audit_signals 0
#endif /* CONFIG_AUDITSYSCALL */

diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index 14a78cc..4c7054b 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -903,7 +903,7 @@ static inline int audit_add_rule(struct audit_entry *entry)
}
#ifdef CONFIG_AUDITSYSCALL
if (!dont_count)
- audit_n_rules++;
+ audit_inc_n_rules();

if (!audit_match_signal(entry))
audit_signals++;
@@ -955,7 +955,7 @@ static inline int audit_del_rule(struct audit_entry *entry)

#ifdef CONFIG_AUDITSYSCALL
if (!dont_count)
- audit_n_rules--;
+ audit_dec_n_rules();

if (!audit_match_signal(entry))
audit_signals--;
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 7aef2f4..d76947c 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -80,7 +80,7 @@
#define MAX_EXECVE_AUDIT_LEN 7500

/* number of audit rules */
-int audit_n_rules;
+static int audit_n_rules;

/* determines whether we collect data for signals sent */
int audit_signals;
@@ -911,6 +911,39 @@ static inline struct audit_context *audit_alloc_context(enum audit_state state)
return context;
}

+void audit_inc_n_rules()
+{
+ struct task_struct *p, *t;
+
+ read_lock(&tasklist_lock);
+ audit_n_rules++;
+ smp_wmb();
+ if (audit_n_rules == 1) {
+ /*
+ * We now have a rule; we need to hook syscall entry.
+ */
+ for_each_process_thread(p, t) {
+ if (t->audit_context)
+ set_tsk_thread_flag(t, TIF_SYSCALL_AUDIT);
+ }
+ }
+ read_unlock(&tasklist_lock);
+}
+
+void audit_dec_n_rules()
+{
+ read_lock(&tasklist_lock);
+ --audit_n_rules;
+ BUG_ON(audit_n_rules < 0);
+
+ /*
+ * If audit_n_rules == 0, then __audit_syscall_exit will clear
+ * TIF_SYSCALL_AUDIT.
+ */
+
+ read_unlock(&tasklist_lock);
+}
+
/**
* audit_alloc - allocate an audit context block for a task
* @tsk: task
@@ -938,11 +971,12 @@ int audit_alloc(struct task_struct *tsk)
if (!(context = audit_alloc_context(state))) {
kfree(key);
audit_log_lost("out of memory in audit_alloc");
+ clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return -ENOMEM;
}
context->filterkey = key;

- tsk->audit_context = context;
+ tsk->audit_context = context;
set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return 0;
}
@@ -1528,6 +1562,25 @@ void __audit_syscall_exit(int success, long return_code)
context->filterkey = NULL;
}
tsk->audit_context = context;
+
+ if (ACCESS_ONCE(audit_n_rules) == 0) {
+ /*
+ * Either this is the very first syscall by this process or
+ * audit_dec_n_rules recently set audit_n_rules to zero.
+ */
+ smp_rmb();
+
+ /* audit_inc_n_rules could increment audit_n_rules here... */
+
+ clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
+
+ smp_rmb();
+
+ if (ACCESS_ONCE(audit_n_rules) != 0) {
+ /* ... if so, set TIF_SYSCALL_AUDIT again. */
+ set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
+ }
+ }
}

static inline void handle_one(const struct inode *inode)
--
1.8.5.3


2014-02-10 16:57:33

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On 02/08, Andy Lutomirski wrote:
>
> +void audit_inc_n_rules()
> +{
> + struct task_struct *p, *t;
> +
> + read_lock(&tasklist_lock);
> + audit_n_rules++;
> + smp_wmb();
> + if (audit_n_rules == 1) {
> + /*
> + * We now have a rule; we need to hook syscall entry.
> + */
> + for_each_process_thread(p, t) {
> + if (t->audit_context)
> + set_tsk_thread_flag(t, TIF_SYSCALL_AUDIT);
> + }
> + }
> + read_unlock(&tasklist_lock);
> +}
> +
> +void audit_dec_n_rules()
> +{
> + read_lock(&tasklist_lock);
> + --audit_n_rules;
> + BUG_ON(audit_n_rules < 0);
> +
> + /*
> + * If audit_n_rules == 0, then __audit_syscall_exit will clear
> + * TIF_SYSCALL_AUDIT.
> + */
> +
> + read_unlock(&tasklist_lock);
> +}

To be honest, I do not understand why _dec_ takes tasklist_lock...
And why _inc_ increments audit_n_rules under tasklist.

> @@ -1528,6 +1562,25 @@ void __audit_syscall_exit(int success, long return_code)
> context->filterkey = NULL;
> }
> tsk->audit_context = context;
> +
> + if (ACCESS_ONCE(audit_n_rules) == 0) {
> + /*
> + * Either this is the very first syscall by this process or
> + * audit_dec_n_rules recently set audit_n_rules to zero.
> + */
> + smp_rmb();

rmb() looks wrong, we need mb() to serialize ACCESS_ONCE() and
clear_tsk_thread_flag().

But, otoh, I think we do not need any barrier at all, we can rely on
control dependency. See the recent 18c03c61444a21 "Documentation/
memory-barriers.txt: Prohibit speculative writes".

> + /* audit_inc_n_rules could increment audit_n_rules here... */
> +
> + clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> +
> + smp_rmb();

Again, I guess this should be mb() or smp_mb__after_clear_bit().


And I still think this needs more changes. Once again, I do not think
that, say, __audit_log_bprm_fcaps() should populate context->aux if
!TIF_SYSCALL_AUDIT, this list can grow indefinitely. Or __audit_signal_info()...

Perhaps __audit_syscall_exit() should also set context->dummy?

Oleg.

2014-02-10 17:29:48

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Mon, Feb 10, 2014 at 8:57 AM, Oleg Nesterov <[email protected]> wrote:
> On 02/08, Andy Lutomirski wrote:
>>
>> +void audit_inc_n_rules()
>> +{
>> + struct task_struct *p, *t;
>> +
>> + read_lock(&tasklist_lock);
>> + audit_n_rules++;
>> + smp_wmb();
>> + if (audit_n_rules == 1) {
>> + /*
>> + * We now have a rule; we need to hook syscall entry.
>> + */
>> + for_each_process_thread(p, t) {
>> + if (t->audit_context)
>> + set_tsk_thread_flag(t, TIF_SYSCALL_AUDIT);
>> + }
>> + }
>> + read_unlock(&tasklist_lock);
>> +}
>> +
>> +void audit_dec_n_rules()
>> +{
>> + read_lock(&tasklist_lock);
>> + --audit_n_rules;
>> + BUG_ON(audit_n_rules < 0);
>> +
>> + /*
>> + * If audit_n_rules == 0, then __audit_syscall_exit will clear
>> + * TIF_SYSCALL_AUDIT.
>> + */
>> +
>> + read_unlock(&tasklist_lock);
>> +}
>
> To be honest, I do not understand why _dec_ takes tasklist_lock...
> And why _inc_ increments audit_n_rules under tasklist.

Bah, incorrect leftover from last time.

>
>> @@ -1528,6 +1562,25 @@ void __audit_syscall_exit(int success, long return_code)
>> context->filterkey = NULL;
>> }
>> tsk->audit_context = context;
>> +
>> + if (ACCESS_ONCE(audit_n_rules) == 0) {
>> + /*
>> + * Either this is the very first syscall by this process or
>> + * audit_dec_n_rules recently set audit_n_rules to zero.
>> + */
>> + smp_rmb();
>
> rmb() looks wrong, we need mb() to serialize ACCESS_ONCE() and
> clear_tsk_thread_flag().

I clearly need to review the rules. I think you're right, though --
no barrier should be needed.

>
> But, otoh, I think we do not need any barrier at all, we can rely on
> control dependency. See the recent 18c03c61444a21 "Documentation/
> memory-barriers.txt: Prohibit speculative writes".
>
>> + /* audit_inc_n_rules could increment audit_n_rules here... */
>> +
>> + clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
>> +
>> + smp_rmb();
>
> Again, I guess this should be mb() or smp_mb__after_clear_bit().
>
>
> And I still think this needs more changes. Once again, I do not think
> that, say, __audit_log_bprm_fcaps() should populate context->aux if
> !TIF_SYSCALL_AUDIT, this list can grow indefinitely. Or __audit_signal_info()...
>
> Perhaps __audit_syscall_exit() should also set context->dummy?

That would work.

I'm still torn between trying to make it possible for things like
__audit_log_bprm_fcaps to start a syscall audit record in the middle
of a syscall or to just try to tighten up the current approach to the
point where it will work correctly.

Grr. Why is all this crap tied up with syscall auditing anyway? ISTM
it would have been a lot nicer if audit calls just immediately emitted
audit records, completely independently of the syscall machinery.

--Andy

2014-02-10 17:47:37

by Steve Grubb

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Monday, February 10, 2014 09:29:19 AM Andy Lutomirski wrote:
> Grr. Why is all this crap tied up with syscall auditing anyway? ISTM
> it would have been a lot nicer if audit calls just immediately emitted
> audit records, completely independently of the syscall machinery.

Because the majority of people needing audit need syscall records for it to
make any sense. The auxiliary records generally report on the object of the
syscall. We still require information about who was doing something, what they
were doing, and what the result was.

Even if you just get the AVC's, you still don't know what happened. If you get
a deny record, was it really denied? The system could have been in permissive
mode and the syscall succeeded. You only get the real decision when you have
syscall records.

-Steve

2014-02-10 18:06:24

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Mon, Feb 10, 2014 at 9:47 AM, Steve Grubb <[email protected]> wrote:
> On Monday, February 10, 2014 09:29:19 AM Andy Lutomirski wrote:
>> Grr. Why is all this crap tied up with syscall auditing anyway? ISTM
>> it would have been a lot nicer if audit calls just immediately emitted
>> audit records, completely independently of the syscall machinery.
>
> Because the majority of people needing audit need syscall records for it to
> make any sense. The auxiliary records generally report on the object of the
> syscall. We still require information about who was doing something, what they
> were doing, and what the result was.
>
> Even if you just get the AVC's, you still don't know what happened. If you get
> a deny record, was it really denied? The system could have been in permissive
> mode and the syscall succeeded. You only get the real decision when you have
> syscall records.
>

Fair enough.

I'll see if I can turn this into something more workable.

--Andy

2014-02-10 19:01:58

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Mon, Feb 10, 2014 at 9:29 AM, Andy Lutomirski <[email protected]> wrote:
> On Mon, Feb 10, 2014 at 8:57 AM, Oleg Nesterov <[email protected]> wrote:
>> On 02/08, Andy Lutomirski wrote:
>>>
>>> +void audit_inc_n_rules()
>>> +{
>>> + struct task_struct *p, *t;
>>> +
>>> + read_lock(&tasklist_lock);
>>> + audit_n_rules++;
>>> + smp_wmb();
>>> + if (audit_n_rules == 1) {
>>> + /*
>>> + * We now have a rule; we need to hook syscall entry.
>>> + */
>>> + for_each_process_thread(p, t) {
>>> + if (t->audit_context)
>>> + set_tsk_thread_flag(t, TIF_SYSCALL_AUDIT);
>>> + }
>>> + }
>>> + read_unlock(&tasklist_lock);
>>> +}
>>> +
>>> +void audit_dec_n_rules()
>>> +{
>>> + read_lock(&tasklist_lock);
>>> + --audit_n_rules;
>>> + BUG_ON(audit_n_rules < 0);
>>> +
>>> + /*
>>> + * If audit_n_rules == 0, then __audit_syscall_exit will clear
>>> + * TIF_SYSCALL_AUDIT.
>>> + */
>>> +
>>> + read_unlock(&tasklist_lock);
>>> +}
>>
>> To be honest, I do not understand why _dec_ takes tasklist_lock...
>> And why _inc_ increments audit_n_rules under tasklist.
>
> Bah, incorrect leftover from last time.
>
>>
>>> @@ -1528,6 +1562,25 @@ void __audit_syscall_exit(int success, long return_code)
>>> context->filterkey = NULL;
>>> }
>>> tsk->audit_context = context;
>>> +
>>> + if (ACCESS_ONCE(audit_n_rules) == 0) {
>>> + /*
>>> + * Either this is the very first syscall by this process or
>>> + * audit_dec_n_rules recently set audit_n_rules to zero.
>>> + */
>>> + smp_rmb();
>>
>> rmb() looks wrong, we need mb() to serialize ACCESS_ONCE() and
>> clear_tsk_thread_flag().
>
> I clearly need to review the rules. I think you're right, though --
> no barrier should be needed.
>
>>
>> But, otoh, I think we do not need any barrier at all, we can rely on
>> control dependency. See the recent 18c03c61444a21 "Documentation/
>> memory-barriers.txt: Prohibit speculative writes".
>>
>>> + /* audit_inc_n_rules could increment audit_n_rules here... */
>>> +
>>> + clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
>>> +
>>> + smp_rmb();
>>
>> Again, I guess this should be mb() or smp_mb__after_clear_bit().
>>
>>
>> And I still think this needs more changes. Once again, I do not think
>> that, say, __audit_log_bprm_fcaps() should populate context->aux if
>> !TIF_SYSCALL_AUDIT, this list can grow indefinitely. Or __audit_signal_info()...
>>
>> Perhaps __audit_syscall_exit() should also set context->dummy?
>
> That would work.
>
> I'm still torn between trying to make it possible for things like
> __audit_log_bprm_fcaps to start a syscall audit record in the middle
> of a syscall or to just try to tighten up the current approach to the
> point where it will work correctly.

This is worse than I thought. Things like signal auditing can enter
the audit system from outside of a syscall. I don't think there's
currently any way to tell whether you're in a syscall (when
TIF_SYSCALL_AUDIT is clear) so getting this to work right would
require arch help.

I'll ask what people on the Fedora list think about just changing the
default to -t task,never.

--Andy

2014-02-10 19:12:42

by Steve Grubb

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Monday, February 10, 2014 11:01:36 AM Andy Lutomirski wrote:
> >> And I still think this needs more changes. Once again, I do not think
> >> that, say, __audit_log_bprm_fcaps() should populate context->aux if
> >> !TIF_SYSCALL_AUDIT, this list can grow indefinitely. Or
> >> __audit_signal_info()...
> >>
> >> Perhaps __audit_syscall_exit() should also set context->dummy?
> >
> > That would work.
> >
> > I'm still torn between trying to make it possible for things like
> > __audit_log_bprm_fcaps to start a syscall audit record in the middle
> > of a syscall or to just try to tighten up the current approach to the
> > point where it will work correctly.
>
> This is worse than I thought. Things like signal auditing can enter
> the audit system from outside of a syscall. I don't think there's
> currently any way to tell whether you're in a syscall (when
> TIF_SYSCALL_AUDIT is clear) so getting this to work right would
> require arch help.
>
> I'll ask what people on the Fedora list think about just changing the
> default to -t task,never.

I can't recall ever seeing the task filter used in real life. But assuming you
wanted to audit no tasks, what is the difference between using that filter and
never setting audit_enable in the first place?

-Steve

2014-02-10 20:04:33

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Mon, Feb 10, 2014 at 11:12 AM, Steve Grubb <[email protected]> wrote:
> On Monday, February 10, 2014 11:01:36 AM Andy Lutomirski wrote:
>> >> And I still think this needs more changes. Once again, I do not think
>> >> that, say, __audit_log_bprm_fcaps() should populate context->aux if
>> >> !TIF_SYSCALL_AUDIT, this list can grow indefinitely. Or
>> >> __audit_signal_info()...
>> >>
>> >> Perhaps __audit_syscall_exit() should also set context->dummy?
>> >
>> > That would work.
>> >
>> > I'm still torn between trying to make it possible for things like
>> > __audit_log_bprm_fcaps to start a syscall audit record in the middle
>> > of a syscall or to just try to tighten up the current approach to the
>> > point where it will work correctly.
>>
>> This is worse than I thought. Things like signal auditing can enter
>> the audit system from outside of a syscall. I don't think there's
>> currently any way to tell whether you're in a syscall (when
>> TIF_SYSCALL_AUDIT is clear) so getting this to work right would
>> require arch help.
>>
>> I'll ask what people on the Fedora list think about just changing the
>> default to -t task,never.
>
> I can't recall ever seeing the task filter used in real life. But assuming you
> wanted to audit no tasks, what is the difference between using that filter and
> never setting audit_enable in the first place?

Two possible differences:

1. You can toggle it both ways at runtime. Setting audit_enabled is a
one-way street (at least as far TIF_SYSCALL_AUDIT is concerned).

2. Do AVC denial messages still get logged if audit_enable == 0? If
not, then audit_enable is a non-starter.

--Andy

2014-02-18 17:31:41

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Mon, 2014-02-10 at 11:01 -0800, Andy Lutomirski wrote:
> On Mon, Feb 10, 2014 at 9:29 AM, Andy Lutomirski <[email protected]> wrote:
> > On Mon, Feb 10, 2014 at 8:57 AM, Oleg Nesterov <[email protected]> wrote:
> >> On 02/08, Andy Lutomirski wrote:
> >>>
> >>> +void audit_inc_n_rules()
> >>> +{
> >>> + struct task_struct *p, *t;
> >>> +
> >>> + read_lock(&tasklist_lock);
> >>> + audit_n_rules++;
> >>> + smp_wmb();
> >>> + if (audit_n_rules == 1) {
> >>> + /*
> >>> + * We now have a rule; we need to hook syscall entry.
> >>> + */
> >>> + for_each_process_thread(p, t) {
> >>> + if (t->audit_context)
> >>> + set_tsk_thread_flag(t, TIF_SYSCALL_AUDIT);
> >>> + }
> >>> + }
> >>> + read_unlock(&tasklist_lock);
> >>> +}
> >>> +
> >>> +void audit_dec_n_rules()
> >>> +{
> >>> + read_lock(&tasklist_lock);
> >>> + --audit_n_rules;
> >>> + BUG_ON(audit_n_rules < 0);
> >>> +
> >>> + /*
> >>> + * If audit_n_rules == 0, then __audit_syscall_exit will clear
> >>> + * TIF_SYSCALL_AUDIT.
> >>> + */
> >>> +
> >>> + read_unlock(&tasklist_lock);
> >>> +}
> >>
> >> To be honest, I do not understand why _dec_ takes tasklist_lock...
> >> And why _inc_ increments audit_n_rules under tasklist.
> >
> > Bah, incorrect leftover from last time.
> >
> >>
> >>> @@ -1528,6 +1562,25 @@ void __audit_syscall_exit(int success, long return_code)
> >>> context->filterkey = NULL;
> >>> }
> >>> tsk->audit_context = context;
> >>> +
> >>> + if (ACCESS_ONCE(audit_n_rules) == 0) {
> >>> + /*
> >>> + * Either this is the very first syscall by this process or
> >>> + * audit_dec_n_rules recently set audit_n_rules to zero.
> >>> + */
> >>> + smp_rmb();
> >>
> >> rmb() looks wrong, we need mb() to serialize ACCESS_ONCE() and
> >> clear_tsk_thread_flag().
> >
> > I clearly need to review the rules. I think you're right, though --
> > no barrier should be needed.
> >
> >>
> >> But, otoh, I think we do not need any barrier at all, we can rely on
> >> control dependency. See the recent 18c03c61444a21 "Documentation/
> >> memory-barriers.txt: Prohibit speculative writes".
> >>
> >>> + /* audit_inc_n_rules could increment audit_n_rules here... */
> >>> +
> >>> + clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> >>> +
> >>> + smp_rmb();
> >>
> >> Again, I guess this should be mb() or smp_mb__after_clear_bit().
> >>
> >>
> >> And I still think this needs more changes. Once again, I do not think
> >> that, say, __audit_log_bprm_fcaps() should populate context->aux if
> >> !TIF_SYSCALL_AUDIT, this list can grow indefinitely. Or __audit_signal_info()...
> >>
> >> Perhaps __audit_syscall_exit() should also set context->dummy?
> >
> > That would work.
> >
> > I'm still torn between trying to make it possible for things like
> > __audit_log_bprm_fcaps to start a syscall audit record in the middle
> > of a syscall or to just try to tighten up the current approach to the
> > point where it will work correctly.

Personally, I'd say just hand the next syscall_entry(), don't try to get
that race closed that fast... "The first syscall after a rule is added
will be audited"

> This is worse than I thought. Things like signal auditing can enter
> the audit system from outside of a syscall.

Not sure what you mean here. The only place I know of that we do
something like that is signal delivery to auditd itself, which is a
horrible nasty ugly ungodly terrible eat-your-babies hack... We'd have
to special case that hack to not pay attention to TIF_SYSCALL_AUDIT or
audit_dummy_context(), but the rest might be fixable if we set/unset the
dummy spot...

> I don't think there's
> currently any way to tell whether you're in a syscall (when
> TIF_SYSCALL_AUDIT is clear) so getting this to work right would
> require arch help.

Don't understand why this is needed...

> I'll ask what people on the Fedora list think about just changing the
> default to -t task,never.
>
> --Andy

2014-02-18 17:32:13

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Mon, 2014-02-10 at 12:04 -0800, Andy Lutomirski wrote:
> On Mon, Feb 10, 2014 at 11:12 AM, Steve Grubb <[email protected]> wrote:

> 2. Do AVC denial messages still get logged if audit_enable == 0? If
> not, then audit_enable is a non-starter.

They go out printk/dmesg/syslog

2014-02-18 20:18:03

by Eric Paris

[permalink] [raw]
Subject: Re: [PATCH v3] audit: Turn off TIF_SYSCALL_AUDIT when there are no rules

On Sat, 2014-02-08 at 13:06 -0800, Andy Lutomirski wrote:
> This toggles TIF_SYSCALL_AUDIT as needed when rules change instead
> of leaving it set whenever rules might be set in the future. This
> reduces syscall latency from >60ns to closer to 40ns on my laptop.

Al also politely reminded me it might be wise to get some perf data
about where exactly we are spending out time. I don't know squat about
perf, but Linus always tells me to do:

perf record -g -e cycles:pp -F 25000 $YOURTEST
perf report -s symbol

(the "-s symbol" is so that you don't get separate data for the
different processes that are part of the kernel build - you'll just
want "general kernel data"), and on one of the kernel symbols just
select it and do "Zoom into kernel DSO". You should see something like
this: