2019-09-25 16:25:10

by Dave Jones

[permalink] [raw]
Subject: ntp audit spew.

I have some hosts that are constantly spewing audit messages like so:

[46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
[46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
[48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset old=1850302393317 new=3190241577926
[48850.604008] audit: type=1333 audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
[49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset old=2453141035832 new=2372389610455
[49926.567273] audit: type=1333 audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476

This gets emitted every time ntp makes an adjustment, which is apparently very frequent on some hosts.


Audit isn't even enabled on these machines.

# auditctl -l
No rules

# auditctl -s
enabled 0
failure 1
pid 0
rate_limit 0
backlog_limit 64
lost 0
backlog 0
loginuid_immutable 0 unlocked

Asides from the log spew, why is this code doing _anything_ when audit
isn't enabled ?

Something like this:


diff --git a/kernel/audit.c b/kernel/audit.c
index da8dc0db5bd3..1291d826c024 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -2340,6 +2340,9 @@ void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
struct audit_buffer *ab;
va_list args;

+ if (audit_initialized != AUDIT_INITIALIZED)
+ return;
+
ab = audit_log_start(ctx, gfp_mask, type);
if (ab) {
va_start(args, fmt);


Might silence the spew, but I'm concerned that the amount of work that
audit is doing on an unconfigured machine might warrant further
investigation.

("turn off CONFIG_AUDIT" isn't an option unfortunately, as this is a
one-size-fits-all kernel that runs on some other hosts that /do/ have
audit configured)

Dave


2019-09-25 16:40:45

by Dave Jones

[permalink] [raw]
Subject: Re: ntp audit spew.

On Mon, Sep 23, 2019 at 12:14:14PM -0400, Paul Moore wrote:
> On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> >
> > I have some hosts that are constantly spewing audit messages like so:
> >
> > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset old=1850302393317 new=3190241577926
> > [48850.604008] audit: type=1333 audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> > [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset old=2453141035832 new=2372389610455
> > [49926.567273] audit: type=1333 audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
> >
> > This gets emitted every time ntp makes an adjustment, which is apparently very frequent on some hosts.
> >
> >
> > Audit isn't even enabled on these machines.
> >
> > # auditctl -l
> > No rules
>
> What happens when you run 'auditctl -a never,task'? That *should*
> silence those messages as the audit_ntp_log() function has the
> requisite audit_dummy_context() check.

They still get emitted.

> FWIW, this is the distro
> default for many (most? all?) distros; for example, check
> /etc/audit/audit.rules on a stock Fedora system.

As these machines aren't using audit, they aren't running auditd either.
Essentially: nothing enables audit, but the kernel side continues to log
ntp regardless (no other audit messages seem to do this).

Dave

2019-09-25 20:50:24

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
>
> I have some hosts that are constantly spewing audit messages like so:
>
> [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset old=1850302393317 new=3190241577926
> [48850.604008] audit: type=1333 audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset old=2453141035832 new=2372389610455
> [49926.567273] audit: type=1333 audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
>
> This gets emitted every time ntp makes an adjustment, which is apparently very frequent on some hosts.
>
>
> Audit isn't even enabled on these machines.
>
> # auditctl -l
> No rules

[NOTE: added linux-audit to the CC line]

There is an audit mailing list, please CC it when you have audit
concerns/questions/etc.

What happens when you run 'auditctl -a never,task'? That *should*
silence those messages as the audit_ntp_log() function has the
requisite audit_dummy_context() check. FWIW, this is the distro
default for many (most? all?) distros; for example, check
/etc/audit/audit.rules on a stock Fedora system. A more selective
configuration could simply exclude the TIME_ADJNTPVAL record (type
1333) from the records that the kernel emits.

We could also add a audit_enabled check at the top of
audit_ntp_log()/__audit_ntp_log(), but I imagine some of that depends
on the various security requirements (they can be bizzare and I can't
say I'm up to date on all those - Steve Grubb should be able to
comment on that).

--
paul moore
http://www.paul-moore.com

2019-09-25 23:22:43

by Dave Jones

[permalink] [raw]
Subject: Re: ntp audit spew.

On Mon, Sep 23, 2019 at 02:57:08PM -0400, Paul Moore wrote:
> On Mon, Sep 23, 2019 at 12:58 PM Dave Jones <[email protected]> wrote:
> > On Mon, Sep 23, 2019 at 12:14:14PM -0400, Paul Moore wrote:
> > > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> > > >
> > > > I have some hosts that are constantly spewing audit messages like so:
> > > >
> > > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > > > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > > > [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset old=1850302393317 new=3190241577926
> > > > [48850.604008] audit: type=1333 audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> > > > [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset old=2453141035832 new=2372389610455
> > > > [49926.567273] audit: type=1333 audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
> > > >
> > > > This gets emitted every time ntp makes an adjustment, which is apparently very frequent on some hosts.
> > > >
> > > >
> > > > Audit isn't even enabled on these machines.
> > > >
> > > > # auditctl -l
> > > > No rules
> > >
> > > What happens when you run 'auditctl -a never,task'? That *should*
> > > silence those messages as the audit_ntp_log() function has the
> > > requisite audit_dummy_context() check.
> >
> > They still get emitted.
> >
> > > FWIW, this is the distro
> > > default for many (most? all?) distros; for example, check
> > > /etc/audit/audit.rules on a stock Fedora system.
> >
> > As these machines aren't using audit, they aren't running auditd either.
> > Essentially: nothing enables audit, but the kernel side continues to log
> > ntp regardless (no other audit messages seem to do this).
>
> What does your kernel command line look like? Do you have "audit=1"
> somewhere in there?

nope.

ro root=LABEL=/ biosdevname=0 net.ifnames=0 fsck.repair=yes systemd.gpt_auto=0 pcie_pme=nomsi ipv6.autoconf=0 erst_disable crashkernel=128M console=tty0 console=ttyS1,57600 intel_iommu=tboot_noforce

Dave

2019-09-25 23:23:26

by Eric Paris

[permalink] [raw]
Subject: Re: ntp audit spew.

Is this the thing where systemd is listening on the multicast netlink
socket and causes everything to come out kmesg as well?

On Mon, 2019-09-23 at 15:49 -0400, Dave Jones wrote:
> On Mon, Sep 23, 2019 at 02:57:08PM -0400, Paul Moore wrote:
> > On Mon, Sep 23, 2019 at 12:58 PM Dave Jones <
> [email protected]> wrote:
> > > On Mon, Sep 23, 2019 at 12:14:14PM -0400, Paul Moore wrote:
> > > > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <
> [email protected]> wrote:
> > > > >
> > > > > I have some hosts that are constantly spewing audit
> messages like so:
> > > > >
> > > > > [46897.591182] audit: type=1333 audit(1569250288.663:220):
> op=offset old=2543677901372 new=2980866217213
> > > > > [46897.591184] audit: type=1333 audit(1569250288.663:221):
> op=freq old=-2443166611284 new=-2436281764244
> > > > > [48850.604005] audit: type=1333 audit(1569252241.675:222):
> op=offset old=1850302393317 new=3190241577926
> > > > > [48850.604008] audit: type=1333 audit(1569252241.675:223):
> op=freq old=-2436281764244 new=-2413071187316
> > > > > [49926.567270] audit: type=1333 audit(1569253317.638:224):
> op=offset old=2453141035832 new=2372389610455
> > > > > [49926.567273] audit: type=1333 audit(1569253317.638:225):
> op=freq old=-2413071187316 new=-2403561671476
> > > > >
> > > > > This gets emitted every time ntp makes an adjustment, which
> is apparently very frequent on some hosts.
> > > > >
> > > > >
> > > > > Audit isn't even enabled on these machines.
> > > > >
> > > > > # auditctl -l
> > > > > No rules
> > > >
> > > > What happens when you run 'auditctl -a never,task'? That
> *should*
> > > > silence those messages as the audit_ntp_log() function has
> the
> > > > requisite audit_dummy_context() check.
> > >
> > > They still get emitted.
> > >
> > > > FWIW, this is the distro
> > > > default for many (most? all?) distros; for example, check
> > > > /etc/audit/audit.rules on a stock Fedora system.
> > >
> > > As these machines aren't using audit, they aren't running auditd
> either.
> > > Essentially: nothing enables audit, but the kernel side
> continues to log
> > > ntp regardless (no other audit messages seem to do this).
> >
> > What does your kernel command line look like? Do you have
> "audit=1"
> > somewhere in there?
>
> nope.
>
> ro root=LABEL=/ biosdevname=0 net.ifnames=0 fsck.repair=yes
> systemd.gpt_auto=0 pcie_pme=nomsi ipv6.autoconf=0 erst_disable
> crashkernel=128M console=tty0 console=ttyS1,57600
> intel_iommu=tboot_noforce
>
> Dave
>

2019-09-26 00:40:57

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Mon, Sep 23, 2019 at 3:49 PM Eric Paris <[email protected]> wrote:
> Is this the thing where systemd is listening on the multicast netlink
> socket and causes everything to come out kmesg as well?

I don't think so, but I'm still a little confused as to why DaveJ is
seeing these records, so I'll go with a weak "maybe" ;)

> On Mon, 2019-09-23 at 15:49 -0400, Dave Jones wrote:
> > On Mon, Sep 23, 2019 at 02:57:08PM -0400, Paul Moore wrote:
> > > On Mon, Sep 23, 2019 at 12:58 PM Dave Jones <
> > [email protected]> wrote:
> > > > On Mon, Sep 23, 2019 at 12:14:14PM -0400, Paul Moore wrote:
> > > > > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <
> > [email protected]> wrote:
> > > > > >
> > > > > > I have some hosts that are constantly spewing audit
> > messages like so:
> > > > > >
> > > > > > [46897.591182] audit: type=1333 audit(1569250288.663:220):
> > op=offset old=2543677901372 new=2980866217213
> > > > > > [46897.591184] audit: type=1333 audit(1569250288.663:221):
> > op=freq old=-2443166611284 new=-2436281764244
> > > > > > [48850.604005] audit: type=1333 audit(1569252241.675:222):
> > op=offset old=1850302393317 new=3190241577926
> > > > > > [48850.604008] audit: type=1333 audit(1569252241.675:223):
> > op=freq old=-2436281764244 new=-2413071187316
> > > > > > [49926.567270] audit: type=1333 audit(1569253317.638:224):
> > op=offset old=2453141035832 new=2372389610455
> > > > > > [49926.567273] audit: type=1333 audit(1569253317.638:225):
> > op=freq old=-2413071187316 new=-2403561671476
> > > > > >
> > > > > > This gets emitted every time ntp makes an adjustment, which
> > is apparently very frequent on some hosts.
> > > > > >
> > > > > >
> > > > > > Audit isn't even enabled on these machines.
> > > > > >
> > > > > > # auditctl -l
> > > > > > No rules
> > > > >
> > > > > What happens when you run 'auditctl -a never,task'? That
> > *should*
> > > > > silence those messages as the audit_ntp_log() function has
> > the
> > > > > requisite audit_dummy_context() check.
> > > >
> > > > They still get emitted.
> > > >
> > > > > FWIW, this is the distro
> > > > > default for many (most? all?) distros; for example, check
> > > > > /etc/audit/audit.rules on a stock Fedora system.
> > > >
> > > > As these machines aren't using audit, they aren't running auditd
> > either.
> > > > Essentially: nothing enables audit, but the kernel side
> > continues to log
> > > > ntp regardless (no other audit messages seem to do this).
> > >
> > > What does your kernel command line look like? Do you have
> > "audit=1"
> > > somewhere in there?
> >
> > nope.
> >
> > ro root=LABEL=/ biosdevname=0 net.ifnames=0 fsck.repair=yes
> > systemd.gpt_auto=0 pcie_pme=nomsi ipv6.autoconf=0 erst_disable
> > crashkernel=128M console=tty0 console=ttyS1,57600
> > intel_iommu=tboot_noforce
> >
> > Dave
> >
>


--
paul moore
http://www.paul-moore.com

2019-09-26 00:42:27

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Mon, Sep 23, 2019 at 5:00 PM Richard Guy Briggs <[email protected]> wrote:
> On 2019-09-23 12:14, Paul Moore wrote:
> > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> > >
> > > I have some hosts that are constantly spewing audit messages like so:
> > >
> > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
>
> Odd. It appears these two above should have the same serial number and
> should be accompanied by a syscall record. It appears that it has no
> context to update to connect the two records. Is it possible it is not
> being called in a task context? If that were the case though, I'd
> expect audit_dummy_context() to return 1...

Yeah, I'm a little confused with these messages too. As you pointed
out, the different serial numbers imply that the audit_context is NULL
and if the audit_context is NULL I would have expected it to fail the
audit_dummy_context() check in audit_ntp_log(). I'm looking at this
with tired eyes at the moment, so I'm likely missing something, but I
just don't see it right now ...

What is even more confusing is that I don't see this issue on my test systems.

> Checking audit_enabled should not be necessary but might fix the
> problem, but still not explain why we're getting these records.

I'd like to understand why this is happening before we start changing the code.

--
paul moore
http://www.paul-moore.com

2019-09-26 02:37:34

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Mon, Sep 23, 2019 at 12:58 PM Dave Jones <[email protected]> wrote:
> On Mon, Sep 23, 2019 at 12:14:14PM -0400, Paul Moore wrote:
> > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> > >
> > > I have some hosts that are constantly spewing audit messages like so:
> > >
> > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > > [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset old=1850302393317 new=3190241577926
> > > [48850.604008] audit: type=1333 audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> > > [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset old=2453141035832 new=2372389610455
> > > [49926.567273] audit: type=1333 audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
> > >
> > > This gets emitted every time ntp makes an adjustment, which is apparently very frequent on some hosts.
> > >
> > >
> > > Audit isn't even enabled on these machines.
> > >
> > > # auditctl -l
> > > No rules
> >
> > What happens when you run 'auditctl -a never,task'? That *should*
> > silence those messages as the audit_ntp_log() function has the
> > requisite audit_dummy_context() check.
>
> They still get emitted.
>
> > FWIW, this is the distro
> > default for many (most? all?) distros; for example, check
> > /etc/audit/audit.rules on a stock Fedora system.
>
> As these machines aren't using audit, they aren't running auditd either.
> Essentially: nothing enables audit, but the kernel side continues to log
> ntp regardless (no other audit messages seem to do this).

What does your kernel command line look like? Do you have "audit=1"
somewhere in there?

--
paul moore
http://www.paul-moore.com

2019-09-26 05:02:41

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: ntp audit spew.

On 2019-09-23 12:14, Paul Moore wrote:
> On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> >
> > I have some hosts that are constantly spewing audit messages like so:
> >
> > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244

Odd. It appears these two above should have the same serial number and
should be accompanied by a syscall record. It appears that it has no
context to update to connect the two records. Is it possible it is not
being called in a task context? If that were the case though, I'd
expect audit_dummy_context() to return 1...

Checking audit_enabled should not be necessary but might fix the
problem, but still not explain why we're getting these records.

> > [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset old=1850302393317 new=3190241577926
> > [48850.604008] audit: type=1333 audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> > [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset old=2453141035832 new=2372389610455
> > [49926.567273] audit: type=1333 audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
> >
> > This gets emitted every time ntp makes an adjustment, which is apparently very frequent on some hosts.
> >
> >
> > Audit isn't even enabled on these machines.
> >
> > # auditctl -l
> > No rules
>
> [NOTE: added linux-audit to the CC line]
>
> There is an audit mailing list, please CC it when you have audit
> concerns/questions/etc.
>
> What happens when you run 'auditctl -a never,task'? That *should*
> silence those messages as the audit_ntp_log() function has the
> requisite audit_dummy_context() check. FWIW, this is the distro
> default for many (most? all?) distros; for example, check
> /etc/audit/audit.rules on a stock Fedora system. A more selective
> configuration could simply exclude the TIME_ADJNTPVAL record (type
> 1333) from the records that the kernel emits.
>
> We could also add a audit_enabled check at the top of
> audit_ntp_log()/__audit_ntp_log(), but I imagine some of that depends
> on the various security requirements (they can be bizzare and I can't
> say I'm up to date on all those - Steve Grubb should be able to
> comment on that).
>
> --
> paul moore
> http://www.paul-moore.com
>
> --
> Linux-audit mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-audit

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2019-09-26 08:09:25

by Steve Grubb

[permalink] [raw]
Subject: Re: ntp audit spew.

On Monday, September 23, 2019 12:14:14 PM EDT Paul Moore wrote:
> On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]>
wrote:
> > I have some hosts that are constantly spewing audit messages like so:
> >
> > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset
> > old=2543677901372 new=2980866217213 [46897.591184] audit: type=1333
> > audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset
> > old=1850302393317 new=3190241577926 [48850.604008] audit: type=1333
> > audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> > [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset
> > old=2453141035832 new=2372389610455 [49926.567273] audit: type=1333
> > audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
> >
> > This gets emitted every time ntp makes an adjustment, which is apparently
> > very frequent on some hosts.
> >
> >
> > Audit isn't even enabled on these machines.
> >
> > # auditctl -l
> > No rules
>
> [NOTE: added linux-audit to the CC line]
>
> There is an audit mailing list, please CC it when you have audit
> concerns/questions/etc.
>
> What happens when you run 'auditctl -a never,task'?

Actually, "-e 0" should turn it off. There is a general problem where systemd
turns on auditing just because it can. The above rule just makes audit
processes inauditable, but does not affect the kernel originating events.

-Steve

> That *should*
> silence those messages as the audit_ntp_log() function has the
> requisite audit_dummy_context() check. FWIW, this is the distro
> default for many (most? all?) distros; for example, check
> /etc/audit/audit.rules on a stock Fedora system. A more selective
> configuration could simply exclude the TIME_ADJNTPVAL record (type
> 1333) from the records that the kernel emits.
>
> We could also add a audit_enabled check at the top of
> audit_ntp_log()/__audit_ntp_log(), but I imagine some of that depends
> on the various security requirements (they can be bizzare and I can't
> say I'm up to date on all those - Steve Grubb should be able to
> comment on that).




2019-09-26 08:10:37

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: ntp audit spew.

On 2019-09-23 23:01, Paul Moore wrote:
> On Mon, Sep 23, 2019 at 5:00 PM Richard Guy Briggs <[email protected]> wrote:
> > On 2019-09-23 12:14, Paul Moore wrote:
> > > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> > > >
> > > > I have some hosts that are constantly spewing audit messages like so:
> > > >
> > > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > > > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> >
> > Odd. It appears these two above should have the same serial number and
> > should be accompanied by a syscall record. It appears that it has no
> > context to update to connect the two records. Is it possible it is not
> > being called in a task context? If that were the case though, I'd
> > expect audit_dummy_context() to return 1...
>
> Yeah, I'm a little confused with these messages too. As you pointed
> out, the different serial numbers imply that the audit_context is NULL
> and if the audit_context is NULL I would have expected it to fail the
> audit_dummy_context() check in audit_ntp_log(). I'm looking at this
> with tired eyes at the moment, so I'm likely missing something, but I
> just don't see it right now ...
>
> What is even more confusing is that I don't see this issue on my test systems.
>
> > Checking audit_enabled should not be necessary but might fix the
> > problem, but still not explain why we're getting these records.
>
> I'd like to understand why this is happening before we start changing the code.

Absolutely.

This looks like a similar issue to the AUDIT_NETFILTER_CFG issue where
we get a lone record unconnected to a syscall when one of the netfilter
table initialization (ipv4 filter) is linked into the kernel rather than
compiled as a module, so it is run in kernel context at boot rather than
in user context as a module load later. This is why I ask if it is
being run by a kernel thread rather than a user task, perhaps using a
syscall function call internally.

> paul moore

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2019-09-26 10:47:03

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Tue, Sep 24, 2019 at 9:19 AM Steve Grubb <[email protected]> wrote:
> On Monday, September 23, 2019 12:14:14 PM EDT Paul Moore wrote:
> > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]>
> wrote:
> > > I have some hosts that are constantly spewing audit messages like so:
> > >
> > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset
> > > old=2543677901372 new=2980866217213 [46897.591184] audit: type=1333
> > > audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > > [48850.604005] audit: type=1333 audit(1569252241.675:222): op=offset
> > > old=1850302393317 new=3190241577926 [48850.604008] audit: type=1333
> > > audit(1569252241.675:223): op=freq old=-2436281764244 new=-2413071187316
> > > [49926.567270] audit: type=1333 audit(1569253317.638:224): op=offset
> > > old=2453141035832 new=2372389610455 [49926.567273] audit: type=1333
> > > audit(1569253317.638:225): op=freq old=-2413071187316 new=-2403561671476
> > >
> > > This gets emitted every time ntp makes an adjustment, which is apparently
> > > very frequent on some hosts.
> > >
> > >
> > > Audit isn't even enabled on these machines.
> > >
> > > # auditctl -l
> > > No rules
> >
> > [NOTE: added linux-audit to the CC line]
> >
> > There is an audit mailing list, please CC it when you have audit
> > concerns/questions/etc.
> >
> > What happens when you run 'auditctl -a never,task'?
>
> Actually, "-e 0" should turn it off. There is a general problem where systemd
> turns on auditing just because it can. The above rule just makes audit
> processes inauditable, but does not affect the kernel originating events.

The 'auditctl -s' output was lost when I trimmed/replied to DaveJ's
original email (sorry), but it appears that audit_enabled is already
'0':

> # auditctl -s
> enabled 0
> failure 1
> pid 0
> rate_limit 0
> backlog_limit 64
> lost 0
> backlog 0
> loginuid_immutable 0 unlocked

Original post from DaveJ:

* https://lore.kernel.org/lkml/[email protected]

--
paul moore
http://www.paul-moore.com

2019-09-26 10:47:14

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Tue, Sep 24, 2019 at 9:50 AM Richard Guy Briggs <[email protected]> wrote:
> On 2019-09-23 23:01, Paul Moore wrote:
> > On Mon, Sep 23, 2019 at 5:00 PM Richard Guy Briggs <[email protected]> wrote:
> > > On 2019-09-23 12:14, Paul Moore wrote:
> > > > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> > > > >
> > > > > I have some hosts that are constantly spewing audit messages like so:
> > > > >
> > > > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > > > > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > >
> > > Odd. It appears these two above should have the same serial number and
> > > should be accompanied by a syscall record. It appears that it has no
> > > context to update to connect the two records. Is it possible it is not
> > > being called in a task context? If that were the case though, I'd
> > > expect audit_dummy_context() to return 1...
> >
> > Yeah, I'm a little confused with these messages too. As you pointed
> > out, the different serial numbers imply that the audit_context is NULL
> > and if the audit_context is NULL I would have expected it to fail the
> > audit_dummy_context() check in audit_ntp_log(). I'm looking at this
> > with tired eyes at the moment, so I'm likely missing something, but I
> > just don't see it right now ...
> >
> > What is even more confusing is that I don't see this issue on my test systems.
> >
> > > Checking audit_enabled should not be necessary but might fix the
> > > problem, but still not explain why we're getting these records.
> >
> > I'd like to understand why this is happening before we start changing the code.
>
> Absolutely.
>
> This looks like a similar issue to the AUDIT_NETFILTER_CFG issue where
> we get a lone record unconnected to a syscall when one of the netfilter
> table initialization (ipv4 filter) is linked into the kernel rather than
> compiled as a module, so it is run in kernel context at boot rather than
> in user context as a module load later. This is why I ask if it is
> being run by a kernel thread rather than a user task, perhaps using a
> syscall function call internally.

I don't see where in the code that could happen, but I agree that it
looks like it; maybe I'm just missing a code path somewhere.

Is anyone else seeing these records? Granted my audit test systems
are running chrony, not ntp, but the syscalls/behaviors should be
similar and I can't seem to recreate this.

--
paul moore
http://www.paul-moore.com

2019-09-26 15:52:43

by Paul Moore

[permalink] [raw]
Subject: Re: ntp audit spew.

On Tue, Sep 24, 2019 at 1:05 PM Paul Moore <[email protected]> wrote:
> On Tue, Sep 24, 2019 at 9:50 AM Richard Guy Briggs <[email protected]> wrote:
> > On 2019-09-23 23:01, Paul Moore wrote:
> > > On Mon, Sep 23, 2019 at 5:00 PM Richard Guy Briggs <[email protected]> wrote:
> > > > On 2019-09-23 12:14, Paul Moore wrote:
> > > > > On Mon, Sep 23, 2019 at 11:50 AM Dave Jones <[email protected]> wrote:
> > > > > >
> > > > > > I have some hosts that are constantly spewing audit messages like so:
> > > > > >
> > > > > > [46897.591182] audit: type=1333 audit(1569250288.663:220): op=offset old=2543677901372 new=2980866217213
> > > > > > [46897.591184] audit: type=1333 audit(1569250288.663:221): op=freq old=-2443166611284 new=-2436281764244
> > > >
> > > > Odd. It appears these two above should have the same serial number and
> > > > should be accompanied by a syscall record. It appears that it has no
> > > > context to update to connect the two records. Is it possible it is not
> > > > being called in a task context? If that were the case though, I'd
> > > > expect audit_dummy_context() to return 1...
> > >
> > > Yeah, I'm a little confused with these messages too. As you pointed
> > > out, the different serial numbers imply that the audit_context is NULL
> > > and if the audit_context is NULL I would have expected it to fail the
> > > audit_dummy_context() check in audit_ntp_log(). I'm looking at this
> > > with tired eyes at the moment, so I'm likely missing something, but I
> > > just don't see it right now ...
> > >
> > > What is even more confusing is that I don't see this issue on my test systems.
> > >
> > > > Checking audit_enabled should not be necessary but might fix the
> > > > problem, but still not explain why we're getting these records.
> > >
> > > I'd like to understand why this is happening before we start changing the code.
> >
> > Absolutely.
> >
> > This looks like a similar issue to the AUDIT_NETFILTER_CFG issue where
> > we get a lone record unconnected to a syscall when one of the netfilter
> > table initialization (ipv4 filter) is linked into the kernel rather than
> > compiled as a module, so it is run in kernel context at boot rather than
> > in user context as a module load later. This is why I ask if it is
> > being run by a kernel thread rather than a user task, perhaps using a
> > syscall function call internally.
>
> I don't see where in the code that could happen, but I agree that it
> looks like it; maybe I'm just missing a code path somewhere.
>
> Is anyone else seeing these records? Granted my audit test systems
> are running chrony, not ntp, but the syscalls/behaviors should be
> similar and I can't seem to recreate this.

Dave, can you provide any additional information on the systems where
you are seeing this? Kernel, userspace, distro, relevant configs,
etc.

--
paul moore
http://www.paul-moore.com

2019-10-31 16:41:42

by Chris Mason

[permalink] [raw]
Subject: [PATCH] audit: set context->dummy even when audit is off

Dave Jones reported that we're finding a considerable amount of dmesg
traffic from NTP time adjustments being reported through the audit
subsystem. His original post is here:

https://lore.kernel.org/lkml/[email protected]/

The confusing part is that we're seeing this on machines that don't have
audit on. The NTP code uses audit_dummy_context() to decide if it
should log things:

static inline void audit_ntp_log(const struct audit_ntp_data *ad)
{
if (!audit_dummy_context())
__audit_ntp_log(ad);
}

I confirmed with perf probes that:

context->dummy = 0
audit_n_rules = 0
audit_enabled = 0
audit_ever_enabled = 1 // seems to be from journald

The box boots, journald turns audit on, some time later our
configuration management runs around and turns audit off. This journald
feature is discussed here: https://github.com/systemd/systemd/issues/959

From what I can tell, audit_syscall_entry is responsible for setting
context->dummy, but we never get down to the test for audit_n_rules:

__audit_syscall_entry(int major, unsigned long a1, unsigned long a2,
unsigned long a3, unsigned long a4)
{
struct audit_context *context = audit_context();
enum audit_state state;

if (!audit_enabled || !context)
return;
^^^^^^^^^^^^^^^^^^ --- we bail here

[ ... ]

context->dummy = !audit_n_rules;

This leaves context->dummy at 0, which appears to be the original value
from kzalloc().

If you've gotten this far, you've read everything I know about the audit
code. With that said, my preference is to make a single source of truth for
decisions about logging. This commit changes __audit_syscall_entry() to
set context->dummy when audit is off.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
---
kernel/auditsc.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 4effe01ebbe2..a5c82d8f9c2b 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1631,8 +1631,19 @@ void __audit_syscall_entry(int major, unsigned long a1, unsigned long a2,
struct audit_context *context = audit_context();
enum audit_state state;

- if (!audit_enabled || !context)
+ if (!context)
+ return;
+
+ if (!audit_enabled) {
+ /*
+ * ntp clock adjustments and a few other places check for
+ * a dummy context without checking to see if audit
+ * is enabled. Make sure we set context->dummy when audit
+ * is off, otherwise they will try to log things.
+ */
+ context->dummy = 1;
return;
+ }

BUG_ON(context->in_syscall || context->name_count);

--
2.17.1

2019-10-31 23:31:43

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On Thu, Oct 31, 2019 at 12:40 PM Chris Mason <[email protected]> wrote:
> Dave Jones reported that we're finding a considerable amount of dmesg
> traffic from NTP time adjustments being reported through the audit
> subsystem. His original post is here:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> The confusing part is that we're seeing this on machines that don't have
> audit on. The NTP code uses audit_dummy_context() to decide if it
> should log things:
>
> static inline void audit_ntp_log(const struct audit_ntp_data *ad)
> {
> if (!audit_dummy_context())
> __audit_ntp_log(ad);
> }
>
> I confirmed with perf probes that:
>
> context->dummy = 0
> audit_n_rules = 0
> audit_enabled = 0
> audit_ever_enabled = 1 // seems to be from journald
>
> The box boots, journald turns audit on, some time later our
> configuration management runs around and turns audit off. This journald
> feature is discussed here: https://github.com/systemd/systemd/issues/959
>
> From what I can tell, audit_syscall_entry is responsible for setting
> context->dummy, but we never get down to the test for audit_n_rules:
>
> __audit_syscall_entry(int major, unsigned long a1, unsigned long a2,
> unsigned long a3, unsigned long a4)
> {
> struct audit_context *context = audit_context();
> enum audit_state state;
>
> if (!audit_enabled || !context)
> return;
> ^^^^^^^^^^^^^^^^^^ --- we bail here
>
> [ ... ]
>
> context->dummy = !audit_n_rules;
>
> This leaves context->dummy at 0, which appears to be the original value
> from kzalloc().
>
> If you've gotten this far, you've read everything I know about the audit
> code. With that said, my preference is to make a single source of truth for
> decisions about logging. This commit changes __audit_syscall_entry() to
> set context->dummy when audit is off.
>
> Reported-by: Dave Jones <[email protected]>
> Signed-off-by: Chris Mason <[email protected]>
> ---
> kernel/auditsc.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)

Hi Chris,

This is a rather hasty email as I'm at a conference right now, but I
wanted to convey that I'm not opposed to making sure that the NTP
records obey the audit configuration (that was the original intent
after all), I think it is just that we are all a little confused as to
why you are seeing the NTP records *and*only* the NTP records.

It's been a while, but I thought we suggested Dave try running
'auditctl -a never,task' to see if that would solve his problem and I
believe his answer was no, which confused me a bit as the
audit_filter_task() call in audit_alloc() should see that rule and
return a state of AUDIT_DISABLED which not only prevents audit_alloc()
from allocating an audit_context (and remember if the audit_context is
NULL then audit_dummy_context() returns true), but it also clears the
TIF_SYSCALL_AUDIT flag (which I'm guessing you also want).

Can you confirm the results of 'auditctl -a never,task' on your systems?

--
paul moore
http://www.paul-moore.com

2019-11-01 14:11:46

by Chris Mason

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On 31 Oct 2019, at 19:27, Paul Moore wrote:

> On Thu, Oct 31, 2019 at 12:40 PM Chris Mason <[email protected]> wrote:
> [ ... ]
> Hi Chris,
>
> This is a rather hasty email as I'm at a conference right now, but I
> wanted to convey that I'm not opposed to making sure that the NTP
> records obey the audit configuration (that was the original intent
> after all), I think it is just that we are all a little confused as to
> why you are seeing the NTP records *and*only* the NTP records.

This part is harder to nail down because there's a window during boot
where journald has enabled audit but chef hasn't yet run in and turned
it off, so we get a lot of logs early and then mostly ntp after that.

I feel like the answer is that most of the places that actually log
audit records are also checking audit_enabled. Poking a bit with
cscope, we're not using most of the places that rely only on
audit_dummy_context().

I grabbed the last week or so of audit: lines from netconsole, and most
of them (73%) were type 1130 from early in the boot. These are the ones
turned off when chef runs. Another big chunk were the one time audit
initialized message from boot, mostly reflecting our rollout of the new
kernel. After that it was ntp and couple of random things like
fanotify.

>
> It's been a while, but I thought we suggested Dave try running
> 'auditctl -a never,task' to see if that would solve his problem and I
> believe his answer was no, which confused me a bit as the
> audit_filter_task() call in audit_alloc() should see that rule and
> return a state of AUDIT_DISABLED which not only prevents audit_alloc()
> from allocating an audit_context (and remember if the audit_context is
> NULL then audit_dummy_context() returns true), but it also clears the
> TIF_SYSCALL_AUDIT flag (which I'm guessing you also want).
>

Thanks for the reminder on this part, I meant to test it. Yes, auditctl
-a never,task does stop the messages, even without my patch applied.

-chris

2019-11-01 14:18:08

by Steve Grubb

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On Friday, November 1, 2019 9:24:17 AM EDT Chris Mason wrote:
> On 31 Oct 2019, at 19:27, Paul Moore wrote:
> > On Thu, Oct 31, 2019 at 12:40 PM Chris Mason <[email protected]> wrote:
> > [ ... ]
> > Hi Chris,
> >
> > This is a rather hasty email as I'm at a conference right now, but I
> > wanted to convey that I'm not opposed to making sure that the NTP
> > records obey the audit configuration (that was the original intent
> > after all), I think it is just that we are all a little confused as to
> > why you are seeing the NTP records *and*only* the NTP records.
>
> This part is harder to nail down because there's a window during boot
> where journald has enabled audit but chef hasn't yet run in and turned
> it off, so we get a lot of logs early and then mostly ntp after that.

This is the root of the problem. Journald should never turn on audit since it
has no idea if auditd even has rules to load. What if the end user does not
want auditing? By blindly enabling audit without knowing if its wanted, it
causes a system performance hit even with no rules loaded. It would be best
if journald leaves audit alone. If it wants to listen on the multicast
socket, so be it. It should just listen and not try to alter the system.

Back to ntp, it sounds like the ntp record needs to check for audit_enabled
rather than the dummy context.

-Steve



2019-11-01 16:16:49

by Chris Mason

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On 1 Nov 2019, at 10:16, Steve Grubb wrote:

> On Friday, November 1, 2019 9:24:17 AM EDT Chris Mason wrote:
>> On 31 Oct 2019, at 19:27, Paul Moore wrote:
>>> On Thu, Oct 31, 2019 at 12:40 PM Chris Mason <[email protected]> wrote:
>>> [ ... ]
>>> Hi Chris,
>>>
>>> This is a rather hasty email as I'm at a conference right now, but I
>>> wanted to convey that I'm not opposed to making sure that the NTP
>>> records obey the audit configuration (that was the original intent
>>> after all), I think it is just that we are all a little confused as
>>> to
>>> why you are seeing the NTP records *and*only* the NTP records.
>>
>> This part is harder to nail down because there's a window during boot
>> where journald has enabled audit but chef hasn't yet run in and
>> turned
>> it off, so we get a lot of logs early and then mostly ntp after that.
>
> This is the root of the problem. Journald should never turn on audit
> since it
> has no idea if auditd even has rules to load. What if the end user
> does not
> want auditing? By blindly enabling audit without knowing if its
> wanted, it
> causes a system performance hit even with no rules loaded. It would be
> best
> if journald leaves audit alone. If it wants to listen on the multicast
> socket, so be it. It should just listen and not try to alter the
> system.
>
> Back to ntp, it sounds like the ntp record needs to check for
> audit_enabled
> rather than the dummy context.

I'm not against sprinkling more audit_enabled checks, but we'd have to
change audit_inode() and a bunch of the other callers of
audit_dummy_context() as well.

-chris

2019-11-05 00:18:43

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On Fri, Nov 1, 2019 at 9:24 AM Chris Mason <[email protected]> wrote:
> On 31 Oct 2019, at 19:27, Paul Moore wrote:
> > It's been a while, but I thought we suggested Dave try running
> > 'auditctl -a never,task' to see if that would solve his problem and I
> > believe his answer was no, which confused me a bit as the
> > audit_filter_task() call in audit_alloc() should see that rule and
> > return a state of AUDIT_DISABLED which not only prevents audit_alloc()
> > from allocating an audit_context (and remember if the audit_context is
> > NULL then audit_dummy_context() returns true), but it also clears the
> > TIF_SYSCALL_AUDIT flag (which I'm guessing you also want).
>
> Thanks for the reminder on this part, I meant to test it. Yes, auditctl
> -a never,task does stop the messages, even without my patch applied.

I'm glad to hear that worked, I was going to be *very* confused if you
came back and said you were still seeing NTP records.

I would suggest that regardless of what happens with audit_enabled you
likely want to keep this audit rule as part of your boot
configuration, not only does it squelch the audit records, but it
should improve performance as well (at the cost of no syscall
auditing). A number of Linux distros have this as their default at
boot.

--
paul moore
http://www.paul-moore.com

2019-11-05 00:42:21

by Chris Mason

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On 4 Nov 2019, at 19:15, Paul Moore wrote:

> On Fri, Nov 1, 2019 at 9:24 AM Chris Mason <[email protected]> wrote:
>> On 31 Oct 2019, at 19:27, Paul Moore wrote:
>>> It's been a while, but I thought we suggested Dave try running
>>> 'auditctl -a never,task' to see if that would solve his problem and
>>> I
>>> believe his answer was no, which confused me a bit as the
>>> audit_filter_task() call in audit_alloc() should see that rule and
>>> return a state of AUDIT_DISABLED which not only prevents
>>> audit_alloc()
>>> from allocating an audit_context (and remember if the audit_context
>>> is
>>> NULL then audit_dummy_context() returns true), but it also clears
>>> the
>>> TIF_SYSCALL_AUDIT flag (which I'm guessing you also want).
>>
>> Thanks for the reminder on this part, I meant to test it. Yes,
>> auditctl
>> -a never,task does stop the messages, even without my patch applied.
>
> I'm glad to hear that worked, I was going to be *very* confused if you
> came back and said you were still seeing NTP records.
>
> I would suggest that regardless of what happens with audit_enabled you
> likely want to keep this audit rule as part of your boot
> configuration, not only does it squelch the audit records, but it
> should improve performance as well (at the cost of no syscall
> auditing). A number of Linux distros have this as their default at
> boot.
>

Definitely, we'll be testing auditctl -a never,task internally. Before
we went down that path I wanted to fully understand what was going on,
but I think all the big questions have been answered at this point.

I'm happy to try variations on my patch, but if you want to include it,
please do remember that I've really only tested it with auditing off.

-chris

2019-11-05 00:46:33

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH] audit: set context->dummy even when audit is off

On Mon, Nov 4, 2019 at 7:39 PM Chris Mason <[email protected]> wrote:
> On 4 Nov 2019, at 19:15, Paul Moore wrote:
>
> > On Fri, Nov 1, 2019 at 9:24 AM Chris Mason <[email protected]> wrote:
> >> On 31 Oct 2019, at 19:27, Paul Moore wrote:
> >>> It's been a while, but I thought we suggested Dave try running
> >>> 'auditctl -a never,task' to see if that would solve his problem and
> >>> I
> >>> believe his answer was no, which confused me a bit as the
> >>> audit_filter_task() call in audit_alloc() should see that rule and
> >>> return a state of AUDIT_DISABLED which not only prevents
> >>> audit_alloc()
> >>> from allocating an audit_context (and remember if the audit_context
> >>> is
> >>> NULL then audit_dummy_context() returns true), but it also clears
> >>> the
> >>> TIF_SYSCALL_AUDIT flag (which I'm guessing you also want).
> >>
> >> Thanks for the reminder on this part, I meant to test it. Yes,
> >> auditctl
> >> -a never,task does stop the messages, even without my patch applied.
> >
> > I'm glad to hear that worked, I was going to be *very* confused if you
> > came back and said you were still seeing NTP records.
> >
> > I would suggest that regardless of what happens with audit_enabled you
> > likely want to keep this audit rule as part of your boot
> > configuration, not only does it squelch the audit records, but it
> > should improve performance as well (at the cost of no syscall
> > auditing). A number of Linux distros have this as their default at
> > boot.
> >
>
> Definitely, we'll be testing auditctl -a never,task internally. Before
> we went down that path I wanted to fully understand what was going on,
> but I think all the big questions have been answered at this point.

Yes, that is why we didn't do anything earlier with Dave's reports; we
couldn't reconcile the results with the code, and the lack of other
similar problem reports made me suspicious. As you said, we
understand things a bit better now.

> I'm happy to try variations on my patch, but if you want to include it,
> please do remember that I've really only tested it with auditing off.

Understood. FWIW, I'm not overly in love with the approach in the
patch you posted, but I haven't looked too seriously into alternatives
thus far. I expect with most everyone running with the "never" audit
rule installed this solves this just fine for the time being.

--
paul moore
http://www.paul-moore.com