2017-04-09 03:02:27

by Seth Forshee

[permalink] [raw]
Subject: audit regressions in 4.11

I've observed audit regressions in 4.11-rc when not using a userspace
audit daemon. The most obvious issue is that audit messages are not
appearing in dmesg anymore. If a sufficient number of audit messages are
generated the kernel will also start invoking the OOM killer.

It looks like previously, when there's no auditd in userspace kauditd
would call kauditd_hold_skb(), which prints the message using printk and
either frees the skb or queues it (with a limit on the number of queued
skb's by default).

Since 5b52330bbfe6 "audit: fix auditd/kernel connection state tracking"
when there's no auditd kauditd will instead use the retry queue, which
has no limit. But it will not process the retry queue when there's no
auditd, so messages pile up there indefinitely.

Thanks,
Seth


2017-04-09 13:14:19

by Paul Moore

[permalink] [raw]
Subject: Re: audit regressions in 4.11

On Sat, Apr 8, 2017 at 11:02 PM, Seth Forshee
<[email protected]> wrote:
> I've observed audit regressions in 4.11-rc when not using a userspace
> audit daemon. The most obvious issue is that audit messages are not
> appearing in dmesg anymore. If a sufficient number of audit messages are
> generated the kernel will also start invoking the OOM killer.
>
> It looks like previously, when there's no auditd in userspace kauditd
> would call kauditd_hold_skb(), which prints the message using printk and
> either frees the skb or queues it (with a limit on the number of queued
> skb's by default).
>
> Since 5b52330bbfe6 "audit: fix auditd/kernel connection state tracking"
> when there's no auditd kauditd will instead use the retry queue, which
> has no limit. But it will not process the retry queue when there's no
> auditd, so messages pile up there indefinitely.

Hi Seth,

Thanks for the report. Let me play with this and think on it for a
bit, but looking at the code again I think the issue is that we check
to see if auditd is connected at the top of the kauditd_thread() loop
and if it isn't we skip right to the main_queue label and bypass the
hold/retry queue processing which has the logic to ensure the retry
queue is managed correctly. My initial thinking is that the fix is to
check and see if auditd is connected in kauditd_retry_skb(), if it
isn't we skip the retry queue and call kauditd_hold_skb(), if auditd
is connected we add the record to the retry queue (what we currently
do).

--
paul moore
http://www.paul-moore.com

2017-04-09 14:40:20

by Seth Forshee

[permalink] [raw]
Subject: Re: audit regressions in 4.11

On Sun, Apr 09, 2017 at 09:14:03AM -0400, Paul Moore wrote:
> On Sat, Apr 8, 2017 at 11:02 PM, Seth Forshee
> <[email protected]> wrote:
> > I've observed audit regressions in 4.11-rc when not using a userspace
> > audit daemon. The most obvious issue is that audit messages are not
> > appearing in dmesg anymore. If a sufficient number of audit messages are
> > generated the kernel will also start invoking the OOM killer.
> >
> > It looks like previously, when there's no auditd in userspace kauditd
> > would call kauditd_hold_skb(), which prints the message using printk and
> > either frees the skb or queues it (with a limit on the number of queued
> > skb's by default).
> >
> > Since 5b52330bbfe6 "audit: fix auditd/kernel connection state tracking"
> > when there's no auditd kauditd will instead use the retry queue, which
> > has no limit. But it will not process the retry queue when there's no
> > auditd, so messages pile up there indefinitely.
>
> Hi Seth,
>
> Thanks for the report. Let me play with this and think on it for a
> bit, but looking at the code again I think the issue is that we check
> to see if auditd is connected at the top of the kauditd_thread() loop
> and if it isn't we skip right to the main_queue label and bypass the
> hold/retry queue processing which has the logic to ensure the retry
> queue is managed correctly. My initial thinking is that the fix is to
> check and see if auditd is connected in kauditd_retry_skb(), if it
> isn't we skip the retry queue and call kauditd_hold_skb(), if auditd
> is connected we add the record to the retry queue (what we currently
> do).

Yeah, my first thought was to make this change:

kauditd_send_queue(sk, portid, &audit_queue, 1,
kauditd_send_multicast_skb,
- kauditd_retry_skb);
+ sk ? kauditd_retry_skb : kauditd_hold_skb);

However some scenarios could result in unbounded queueing on the hold
queue as well, so I'm not sure if that's quite enough.

Thanks,
Seth

2017-04-09 15:43:41

by Paul Moore

[permalink] [raw]
Subject: Re: audit regressions in 4.11

On Sun, Apr 9, 2017 at 10:40 AM, Seth Forshee
<[email protected]> wrote:
> On Sun, Apr 09, 2017 at 09:14:03AM -0400, Paul Moore wrote:
>> On Sat, Apr 8, 2017 at 11:02 PM, Seth Forshee
>> <[email protected]> wrote:
>> > I've observed audit regressions in 4.11-rc when not using a userspace
>> > audit daemon. The most obvious issue is that audit messages are not
>> > appearing in dmesg anymore. If a sufficient number of audit messages are
>> > generated the kernel will also start invoking the OOM killer.
>> >
>> > It looks like previously, when there's no auditd in userspace kauditd
>> > would call kauditd_hold_skb(), which prints the message using printk and
>> > either frees the skb or queues it (with a limit on the number of queued
>> > skb's by default).
>> >
>> > Since 5b52330bbfe6 "audit: fix auditd/kernel connection state tracking"
>> > when there's no auditd kauditd will instead use the retry queue, which
>> > has no limit. But it will not process the retry queue when there's no
>> > auditd, so messages pile up there indefinitely.
>>
>> Hi Seth,
>>
>> Thanks for the report. Let me play with this and think on it for a
>> bit, but looking at the code again I think the issue is that we check
>> to see if auditd is connected at the top of the kauditd_thread() loop
>> and if it isn't we skip right to the main_queue label and bypass the
>> hold/retry queue processing which has the logic to ensure the retry
>> queue is managed correctly. My initial thinking is that the fix is to
>> check and see if auditd is connected in kauditd_retry_skb(), if it
>> isn't we skip the retry queue and call kauditd_hold_skb(), if auditd
>> is connected we add the record to the retry queue (what we currently
>> do).
>
> Yeah, my first thought was to make this change:
>
> kauditd_send_queue(sk, portid, &audit_queue, 1,
> kauditd_send_multicast_skb,
> - kauditd_retry_skb);
> + sk ? kauditd_retry_skb : kauditd_hold_skb);
>
> However some scenarios could result in unbounded queueing on the hold
> queue as well, so I'm not sure if that's quite enough.

At the moment I think I'd prefer to put the auditd check inside
kauditd_retry_skb() itself, but you've got the basic idea.

Keep in mind that kauditd_hold_skb() already has the logic inside
itself to prevent the hold queue from growing out of control.

--
paul moore
http://www.paul-moore.com