On 2017-02-28 23:15, Steve Grubb wrote:
> On Tuesday, February 28, 2017 10:37:04 PM EST Richard Guy Briggs wrote:
> > Sorry, I forgot to include Cc: in this cover letter for context to the 4
> > alt patches.
> >
> > On 2017-02-28 22:15, Richard Guy Briggs wrote:
> > > The background to this is:
> > > https://github.com/linux-audit/audit-kernel/issues/8
> > >
> > > In short, audit SYSCALL records for *init_module were occasionally
> > > accompanied by hundreds to thousands of null PATH records.
> > >
> > > I chatted with Al Viro and Eric Paris about this Friday afternoon and
> > > they seemed to vaguely recall this issue and didn't have any solid
> > > recommendations as to what was the right thing to do (other than the
> > > same suggestion from both that I won't print here).
> > >
> > > It was reproducible on a number of vintages of distributions with
> > > default kernels, but triggering on very few of the many modules loaded
> > > at boot time. It was reproduced with fs-nfs4 and nfsv4 modules on
> > > tracefs, but there are reports of it also happening with debugfs. It
> > > was triggering only in __audit_inode_child with a parent that was not
> > > found in the task context's audit names_list.
> > >
> > > I have four potential solutions listed in my order of preference and I'd
> > > like to get some feedback about which one would be the most acceptable.
>
> 0.5 - Notice that we are in *init_module & delete_module and inhibit
> generation of any record type except SYSCALL and KERN_MODULE ? There are some
> classification routines for -F perms=wrxa that might be used to create a new
> class for loading/deleting modules that sets a flag that we use to suppress
> some record types.
Ok, I was partially able to do this.
If I try and catch it in audit_log_start() which is the common point for
all the record types to be able to limit to just SYSCALL and
KERN_MODULE, there will already be a linked list of hundreds to
thousands of audit_names and will still print a non-zero items count in
the SYSCALL record. This also sounds like a potentially lazy way to
deal with other record spam (like setuid BRPM_FCAPS).
If I catch it in __audit_inode_child in the same place as I caught the
filesystem type, it is effective for only the PATH record, which is all
that is a problem at the moment.
It touches nine arch-related files, which is a lot more disruptive than
I was hoping.
> > > 1 - In __audit_inode_child, return immedialy upon detecting TRACEFS and
> > >
> > > DEBUGFS (and potentially other filesystems identified, via s_magic).
>
> XFS creates them too. Who knows what else.
Why would this happen? I would assume it is a mounted filesystem. Do
you have a sample of the extra records?
This brings me back to the original reaction I had to your suggestion
which is: Are you certain there is never a circumstance where *_module
syscalls never involve a file? Say, the module itself on loading pulls
in other files from the mounted filesystem?
> -Steve
>
> > > 2 - In __audit_inode_child, return after not finding the parent in that
> > >
> > > task context's audit names_list.
> > >
> > > 3 - In __audit_inode_child, mark the parent and its child as "hidden"
> > >
> > > when the parent isn't found in that task context's audit names_list.
> > > This will still result in an "items=" count that does not match the
> > > number of accompanying PATH records for that SYSCALL record, which
> > > may upset userspace tools but would still indicate suppressed
> > > records.
> > >
> > > 4 - In __audit_inode_child, when the parent isn't found, store the
> > >
> > > child's dentry in the child's (new or not) audit_names structure
> > > (properly refcounted with dget) and store the parent's dentry in its
> > > newly created audit_names structure (via dget_parent), then if the
> > > name isn't available at PATH record generation time, use that stored
> > > value (with dentry_path_raw and released with dput)
> > >
> > > Is there another more elegant solution that I've missed that catches
> > > things before they get anywhere near audit_inode_child (called from
> > > tracefs' notifiers)?
> > >
> > > I'll thread onto this message tested patches for all four solutions.
- RGB
--
Richard Guy Briggs <[email protected]>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635
Tracefs or debugfs were causing hundreds to thousands of null PATH
records to be associated with the init_module and finit_module SYSCALL
records on a few modules when the following rule was in place for
startup:
-a always,exit -F arch=x86_64 -S init_module -F key=mod-load
In __audit_inode_child, return immedialy upon detecting module-related
syscalls.
See https://github.com/linux-audit/audit-kernel/issues/8
Test case: https://github.com/linux-audit/audit-testsuite/issues/42
Signed-off-by: Richard Guy Briggs <[email protected]>
---
kernel/auditsc.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 4db32e8..d7fe943 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1868,6 +1868,12 @@ void __audit_inode_child(struct inode *parent,
if (!context->in_syscall)
return;
+ switch (context->major) {
+ case __NR_init_module:
+ case __NR_delete_module:
+ case __NR_finit_module:
+ return;
+ }
if (inode)
handle_one(inode);
--
1.7.1
On Fri, Mar 3, 2017 at 4:14 PM, Richard Guy Briggs <[email protected]> wrote:
> On 2017-02-28 23:15, Steve Grubb wrote:
>> On Tuesday, February 28, 2017 10:37:04 PM EST Richard Guy Briggs wrote:
>> > Sorry, I forgot to include Cc: in this cover letter for context to the 4
>> > alt patches.
>> >
>> > On 2017-02-28 22:15, Richard Guy Briggs wrote:
>> > > The background to this is:
>> > > https://github.com/linux-audit/audit-kernel/issues/8
>> > >
>> > > In short, audit SYSCALL records for *init_module were occasionally
>> > > accompanied by hundreds to thousands of null PATH records.
>> > >
>> > > I chatted with Al Viro and Eric Paris about this Friday afternoon and
>> > > they seemed to vaguely recall this issue and didn't have any solid
>> > > recommendations as to what was the right thing to do (other than the
>> > > same suggestion from both that I won't print here).
>> > >
>> > > It was reproducible on a number of vintages of distributions with
>> > > default kernels, but triggering on very few of the many modules loaded
>> > > at boot time. It was reproduced with fs-nfs4 and nfsv4 modules on
>> > > tracefs, but there are reports of it also happening with debugfs. It
>> > > was triggering only in __audit_inode_child with a parent that was not
>> > > found in the task context's audit names_list.
>> > >
>> > > I have four potential solutions listed in my order of preference and I'd
>> > > like to get some feedback about which one would be the most acceptable.
>>
>> 0.5 - Notice that we are in *init_module & delete_module and inhibit
>> generation of any record type except SYSCALL and KERN_MODULE ? There are some
>> classification routines for -F perms=wrxa that might be used to create a new
>> class for loading/deleting modules that sets a flag that we use to suppress
>> some record types.
>
> Ok, I was partially able to do this.
>
> If I try and catch it in audit_log_start() which is the common point for
> all the record types to be able to limit to just SYSCALL and
> KERN_MODULE, there will already be a linked list of hundreds to
> thousands of audit_names and will still print a non-zero items count in
> the SYSCALL record. This also sounds like a potentially lazy way to
> deal with other record spam (like setuid BRPM_FCAPS).
>
> If I catch it in __audit_inode_child in the same place as I caught the
> filesystem type, it is effective for only the PATH record, which is all
> that is a problem at the moment.
>
> It touches nine arch-related files, which is a lot more disruptive than
> I was hoping.
Blocking PATH record on creation based on syscall *really* seems like
a bad/dangerous idea. If we want to block all these tracefs/debugfs
records, let's just block the fs. Although as of right now I'm not a
fan of blocking anything.
--
paul moore
http://www.paul-moore.com
+++ Richard Guy Briggs [06/03/17 16:49 -0500]:
>On 2017-03-03 19:22, Paul Moore wrote:
>> On Fri, Mar 3, 2017 at 4:14 PM, Richard Guy Briggs <[email protected]> wrote:
>> > On 2017-02-28 23:15, Steve Grubb wrote:
>> >> On Tuesday, February 28, 2017 10:37:04 PM EST Richard Guy Briggs wrote:
>> >> > Sorry, I forgot to include Cc: in this cover letter for context to the 4
>> >> > alt patches.
>> >> >
>> >> > On 2017-02-28 22:15, Richard Guy Briggs wrote:
>> >> > > The background to this is:
>> >> > > https://github.com/linux-audit/audit-kernel/issues/8
>> >> > >
>> >> > > In short, audit SYSCALL records for *init_module were occasionally
>> >> > > accompanied by hundreds to thousands of null PATH records.
>> >> > >
>> >> > > I chatted with Al Viro and Eric Paris about this Friday afternoon and
>> >> > > they seemed to vaguely recall this issue and didn't have any solid
>> >> > > recommendations as to what was the right thing to do (other than the
>> >> > > same suggestion from both that I won't print here).
>> >> > >
>> >> > > It was reproducible on a number of vintages of distributions with
>> >> > > default kernels, but triggering on very few of the many modules loaded
>> >> > > at boot time. It was reproduced with fs-nfs4 and nfsv4 modules on
>> >> > > tracefs, but there are reports of it also happening with debugfs. It
>> >> > > was triggering only in __audit_inode_child with a parent that was not
>> >> > > found in the task context's audit names_list.
>> >> > >
>> >> > > I have four potential solutions listed in my order of preference and I'd
>> >> > > like to get some feedback about which one would be the most acceptable.
>> >>
>> >> 0.5 - Notice that we are in *init_module & delete_module and inhibit
>> >> generation of any record type except SYSCALL and KERN_MODULE ? There are some
>> >> classification routines for -F perms=wrxa that might be used to create a new
>> >> class for loading/deleting modules that sets a flag that we use to suppress
>> >> some record types.
>> >
>> > Ok, I was partially able to do this.
>> >
>> > If I try and catch it in audit_log_start() which is the common point for
>> > all the record types to be able to limit to just SYSCALL and
>> > KERN_MODULE, there will already be a linked list of hundreds to
>> > thousands of audit_names and will still print a non-zero items count in
>> > the SYSCALL record. This also sounds like a potentially lazy way to
>> > deal with other record spam (like setuid BRPM_FCAPS).
>> >
>> > If I catch it in __audit_inode_child in the same place as I caught the
>> > filesystem type, it is effective for only the PATH record, which is all
>> > that is a problem at the moment.
>> >
>> > It touches nine arch-related files, which is a lot more disruptive than
>> > I was hoping.
>>
>> Blocking PATH record on creation based on syscall *really* seems like
>> a bad/dangerous idea. If we want to block all these tracefs/debugfs
>> records, let's just block the fs. Although as of right now I'm not a
>> fan of blocking anything.
>
>I agree. What makes me leery of this approach is if a kernel module in
>turn accesses directly other files, or bypasses the load_module call to
>load another module from a file and avoids logging.
AFAIK load_module is *the* entry point for module loading, it is where
all the setup occurs in order for a module to be properly set up and
registered in our internal data structures (e.g the global modules
list). If a module wants another module loaded, it can request for it
to be loaded via request_module(), which punts the request to modprobe
in userspace to load the module in question, but I'm not sure if
that's at all related to this null PATH record issue.
Jessica
On 2017-03-03 19:22, Paul Moore wrote:
> On Fri, Mar 3, 2017 at 4:14 PM, Richard Guy Briggs <[email protected]> wrote:
> > On 2017-02-28 23:15, Steve Grubb wrote:
> >> On Tuesday, February 28, 2017 10:37:04 PM EST Richard Guy Briggs wrote:
> >> > Sorry, I forgot to include Cc: in this cover letter for context to the 4
> >> > alt patches.
> >> >
> >> > On 2017-02-28 22:15, Richard Guy Briggs wrote:
> >> > > The background to this is:
> >> > > https://github.com/linux-audit/audit-kernel/issues/8
> >> > >
> >> > > In short, audit SYSCALL records for *init_module were occasionally
> >> > > accompanied by hundreds to thousands of null PATH records.
> >> > >
> >> > > I chatted with Al Viro and Eric Paris about this Friday afternoon and
> >> > > they seemed to vaguely recall this issue and didn't have any solid
> >> > > recommendations as to what was the right thing to do (other than the
> >> > > same suggestion from both that I won't print here).
> >> > >
> >> > > It was reproducible on a number of vintages of distributions with
> >> > > default kernels, but triggering on very few of the many modules loaded
> >> > > at boot time. It was reproduced with fs-nfs4 and nfsv4 modules on
> >> > > tracefs, but there are reports of it also happening with debugfs. It
> >> > > was triggering only in __audit_inode_child with a parent that was not
> >> > > found in the task context's audit names_list.
> >> > >
> >> > > I have four potential solutions listed in my order of preference and I'd
> >> > > like to get some feedback about which one would be the most acceptable.
> >>
> >> 0.5 - Notice that we are in *init_module & delete_module and inhibit
> >> generation of any record type except SYSCALL and KERN_MODULE ? There are some
> >> classification routines for -F perms=wrxa that might be used to create a new
> >> class for loading/deleting modules that sets a flag that we use to suppress
> >> some record types.
> >
> > Ok, I was partially able to do this.
> >
> > If I try and catch it in audit_log_start() which is the common point for
> > all the record types to be able to limit to just SYSCALL and
> > KERN_MODULE, there will already be a linked list of hundreds to
> > thousands of audit_names and will still print a non-zero items count in
> > the SYSCALL record. This also sounds like a potentially lazy way to
> > deal with other record spam (like setuid BRPM_FCAPS).
> >
> > If I catch it in __audit_inode_child in the same place as I caught the
> > filesystem type, it is effective for only the PATH record, which is all
> > that is a problem at the moment.
> >
> > It touches nine arch-related files, which is a lot more disruptive than
> > I was hoping.
>
> Blocking PATH record on creation based on syscall *really* seems like
> a bad/dangerous idea. If we want to block all these tracefs/debugfs
> records, let's just block the fs. Although as of right now I'm not a
> fan of blocking anything.
I agree. What makes me leery of this approach is if a kernel module in
turn accesses directly other files, or bypasses the load_module call to
load another module from a file and avoids logging.
> paul moore
- RGB
--
Richard Guy Briggs <[email protected]>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635
On 2017-03-06 17:30, Jessica Yu wrote:
> +++ Richard Guy Briggs [06/03/17 16:49 -0500]:
> >On 2017-03-03 19:22, Paul Moore wrote:
> >>On Fri, Mar 3, 2017 at 4:14 PM, Richard Guy Briggs <[email protected]> wrote:
> >>> On 2017-02-28 23:15, Steve Grubb wrote:
> >>>> On Tuesday, February 28, 2017 10:37:04 PM EST Richard Guy Briggs wrote:
> >>>> > Sorry, I forgot to include Cc: in this cover letter for context to the 4
> >>>> > alt patches.
> >>>> >
> >>>> > On 2017-02-28 22:15, Richard Guy Briggs wrote:
> >>>> > > The background to this is:
> >>>> > > https://github.com/linux-audit/audit-kernel/issues/8
> >>>> > >
> >>>> > > In short, audit SYSCALL records for *init_module were occasionally
> >>>> > > accompanied by hundreds to thousands of null PATH records.
> >>>> > >
> >>>> > > I chatted with Al Viro and Eric Paris about this Friday afternoon and
> >>>> > > they seemed to vaguely recall this issue and didn't have any solid
> >>>> > > recommendations as to what was the right thing to do (other than the
> >>>> > > same suggestion from both that I won't print here).
> >>>> > >
> >>>> > > It was reproducible on a number of vintages of distributions with
> >>>> > > default kernels, but triggering on very few of the many modules loaded
> >>>> > > at boot time. It was reproduced with fs-nfs4 and nfsv4 modules on
> >>>> > > tracefs, but there are reports of it also happening with debugfs. It
> >>>> > > was triggering only in __audit_inode_child with a parent that was not
> >>>> > > found in the task context's audit names_list.
> >>>> > >
> >>>> > > I have four potential solutions listed in my order of preference and I'd
> >>>> > > like to get some feedback about which one would be the most acceptable.
> >>>>
> >>>> 0.5 - Notice that we are in *init_module & delete_module and inhibit
> >>>> generation of any record type except SYSCALL and KERN_MODULE ? There are some
> >>>> classification routines for -F perms=wrxa that might be used to create a new
> >>>> class for loading/deleting modules that sets a flag that we use to suppress
> >>>> some record types.
> >>>
> >>> Ok, I was partially able to do this.
> >>>
> >>> If I try and catch it in audit_log_start() which is the common point for
> >>> all the record types to be able to limit to just SYSCALL and
> >>> KERN_MODULE, there will already be a linked list of hundreds to
> >>> thousands of audit_names and will still print a non-zero items count in
> >>> the SYSCALL record. This also sounds like a potentially lazy way to
> >>> deal with other record spam (like setuid BRPM_FCAPS).
> >>>
> >>> If I catch it in __audit_inode_child in the same place as I caught the
> >>> filesystem type, it is effective for only the PATH record, which is all
> >>> that is a problem at the moment.
> >>>
> >>> It touches nine arch-related files, which is a lot more disruptive than
> >>> I was hoping.
> >>
> >>Blocking PATH record on creation based on syscall *really* seems like
> >>a bad/dangerous idea. If we want to block all these tracefs/debugfs
> >>records, let's just block the fs. Although as of right now I'm not a
> >>fan of blocking anything.
> >
> >I agree. What makes me leery of this approach is if a kernel module in
> >turn accesses directly other files, or bypasses the load_module call to
> >load another module from a file and avoids logging.
>
> AFAIK load_module is *the* entry point for module loading, it is where
> all the setup occurs in order for a module to be properly set up and
> registered in our internal data structures (e.g the global modules
> list). If a module wants another module loaded, it can request for it
> to be loaded via request_module(), which punts the request to modprobe
> in userspace to load the module in question, but I'm not sure if
> that's at all related to this null PATH record issue.
Yes, there is a lot going on in that function and by far the easiest way
to be able to load another module, but I'm being a bit paranoid in
suggesting that a rogue module may try and skip some steps listed there
and roll its own, hence the desire to not disable all PATH auxilliary
records for *_module SYSCALL records, but only the filesystem types that
don't pose a threat.
> Jessica
- RGB
--
Richard Guy Briggs <[email protected]>
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635
On Friday, March 3, 2017 4:14:54 PM EST Richard Guy Briggs wrote:
> > > > 1 - In __audit_inode_child, return immedialy upon detecting TRACEFS
> > > > and
> > > >
> > > > DEBUGFS (and potentially other filesystems identified, via s_magic).
> >
> > XFS creates them too. Who knows what else.
>
> Why would this happen? I would assume it is a mounted filesystem. Do
> you have a sample of the extra records?
I can't find them right away. But I've seen them.
> This brings me back to the original reaction I had to your suggestion
> which is: Are you certain there is never a circumstance where *_module
> syscalls never involve a file? Say, the module itself on loading pulls
> in other files from the mounted filesystem?
We don't care about this. Audit events have to tell a story. They must have a
subject, action, and object. In this case its "somebody loaded a kernel module
X". Where X is the module name. Paths are irrelevant to the story and just
make it hard to understand the event.
-Steve
On Monday, March 6, 2017 4:49:21 PM EST Richard Guy Briggs wrote:
> > Blocking PATH record on creation based on syscall *really* seems like
> > a bad/dangerous idea. If we want to block all these tracefs/debugfs
> > records, let's just block the fs. Although as of right now I'm not a
> > fan of blocking anything.
>
> I agree. What makes me leery of this approach is if a kernel module in
> turn accesses directly other files, or bypasses the load_module call to
> load another module from a file and avoids logging.
In this case, we want a second event with that module name. We do not want any
PATH records.
-Steve