2020-07-24 09:17:29

by Thiébaud Weksteen

[permalink] [raw]
Subject: [PATCH] selinux: add tracepoint on denials

The audit data currently captures which process and which target
is responsible for a denial. There is no data on where exactly in the
process that call occurred. Debugging can be made easier by being able to
reconstruct the unified kernel and userland stack traces [1]. Add a
tracepoint on the SELinux denials which can then be used by userland
(i.e. perf).

Although this patch could manually be added by each OS developer to
trouble shoot a denial, adding it to the kernel streamlines the
developers workflow.

[1] https://source.android.com/devices/tech/debug/native_stack_dump

Signed-off-by: Thiébaud Weksteen <[email protected]>
Signed-off-by: Joel Fernandes <[email protected]>
---
MAINTAINERS | 1 +
include/trace/events/selinux.h | 35 ++++++++++++++++++++++++++++++++++
security/selinux/avc.c | 6 ++++++
3 files changed, 42 insertions(+)
create mode 100644 include/trace/events/selinux.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e64cdde81851..6b6cd5e13537 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15358,6 +15358,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git
F: Documentation/ABI/obsolete/sysfs-selinux-checkreqprot
F: Documentation/ABI/obsolete/sysfs-selinux-disable
F: Documentation/admin-guide/LSM/SELinux.rst
+F: include/trace/events/selinux.h
F: include/uapi/linux/selinux_netlink.h
F: scripts/selinux/
F: security/selinux/
diff --git a/include/trace/events/selinux.h b/include/trace/events/selinux.h
new file mode 100644
index 000000000000..e247187a8135
--- /dev/null
+++ b/include/trace/events/selinux.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM selinux
+
+#if !defined(_TRACE_SELINUX_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_SELINUX_H
+
+#include <linux/ktime.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(selinux_denied,
+
+ TP_PROTO(int cls, int av),
+
+ TP_ARGS(cls, av),
+
+ TP_STRUCT__entry(
+ __field(int, cls)
+ __field(int, av)
+ ),
+
+ TP_fast_assign(
+ __entry->cls = cls;
+ __entry->av = av;
+ ),
+
+ TP_printk("denied %d %d",
+ __entry->cls,
+ __entry->av)
+);
+
+#endif
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index d18cb32a242a..85d2e22ab656 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -31,6 +31,9 @@
#include "avc_ss.h"
#include "classmap.h"

+#define CREATE_TRACE_POINTS
+#include <trace/events/selinux.h>
+
#define AVC_CACHE_SLOTS 512
#define AVC_DEF_CACHE_THRESHOLD 512
#define AVC_CACHE_RECLAIM 16
@@ -672,6 +675,9 @@ static void avc_audit_pre_callback(struct audit_buffer *ab, void *a)
return;
}

+ if (sad->denied)
+ trace_selinux_denied(sad->tclass, av);
+
perms = secclass_map[sad->tclass-1].perms;

audit_log_format(ab, " {");
--
2.28.0.rc0.142.g3c755180ce-goog


2020-07-24 13:34:04

by Stephen Smalley

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Fri, Jul 24, 2020 at 5:15 AM Thiébaud Weksteen <[email protected]> wrote:
>
> The audit data currently captures which process and which target
> is responsible for a denial. There is no data on where exactly in the
> process that call occurred. Debugging can be made easier by being able to
> reconstruct the unified kernel and userland stack traces [1]. Add a
> tracepoint on the SELinux denials which can then be used by userland
> (i.e. perf).
>
> Although this patch could manually be added by each OS developer to
> trouble shoot a denial, adding it to the kernel streamlines the
> developers workflow.
>
> [1] https://source.android.com/devices/tech/debug/native_stack_dump
>
> Signed-off-by: Thiébaud Weksteen <[email protected]>
> Signed-off-by: Joel Fernandes <[email protected]>
> ---
> MAINTAINERS | 1 +
> include/trace/events/selinux.h | 35 ++++++++++++++++++++++++++++++++++
> security/selinux/avc.c | 6 ++++++
> 3 files changed, 42 insertions(+)
> create mode 100644 include/trace/events/selinux.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e64cdde81851..6b6cd5e13537 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -15358,6 +15358,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git
> F: Documentation/ABI/obsolete/sysfs-selinux-checkreqprot
> F: Documentation/ABI/obsolete/sysfs-selinux-disable
> F: Documentation/admin-guide/LSM/SELinux.rst
> +F: include/trace/events/selinux.h
> F: include/uapi/linux/selinux_netlink.h
> F: scripts/selinux/
> F: security/selinux/
> diff --git a/include/trace/events/selinux.h b/include/trace/events/selinux.h
> new file mode 100644
> index 000000000000..e247187a8135
> --- /dev/null
> +++ b/include/trace/events/selinux.h
> @@ -0,0 +1,35 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM selinux
> +
> +#if !defined(_TRACE_SELINUX_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_SELINUX_H
> +
> +#include <linux/ktime.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(selinux_denied,
> +
> + TP_PROTO(int cls, int av),
> +
> + TP_ARGS(cls, av),
> +
> + TP_STRUCT__entry(
> + __field(int, cls)
> + __field(int, av)
> + ),
> +
> + TP_fast_assign(
> + __entry->cls = cls;
> + __entry->av = av;
> + ),
> +
> + TP_printk("denied %d %d",
> + __entry->cls,
> + __entry->av)
> +);

I would think you would want to log av as %x for easier interpretation
especially when there are multiple permissions being checked at once
(which can happen). Also both cls and av would properly be unsigned
values. Only other question I have is whether it would be beneficial
to include other information here to help uniquely identify/correlate
the denial with the avc: message and whether any decoding of the
class, av, or other information could/should be done here versus in
some userland helper.

2020-07-24 13:53:27

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Fri, 24 Jul 2020 11:15:03 +0200
"Thiébaud Weksteen" <[email protected]> wrote:
> diff --git a/security/selinux/avc.c b/security/selinux/avc.c
> index d18cb32a242a..85d2e22ab656 100644
> --- a/security/selinux/avc.c
> +++ b/security/selinux/avc.c
> @@ -31,6 +31,9 @@
> #include "avc_ss.h"
> #include "classmap.h"
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/selinux.h>
> +
> #define AVC_CACHE_SLOTS 512
> #define AVC_DEF_CACHE_THRESHOLD 512
> #define AVC_CACHE_RECLAIM 16
> @@ -672,6 +675,9 @@ static void avc_audit_pre_callback(struct audit_buffer *ab, void *a)
> return;
> }
>
> + if (sad->denied)

First, I would like to deny sadness as well ;-)

Now, there is a way to add that branch within the "nop" area of the
trace event, and remove the conditional branch from the main code.

> + trace_selinux_denied(sad->tclass, av);
> +

Instead have this:

trace_selinux_denied(sad, av);

> perms = secclass_map[sad->tclass-1].perms;
>
> audit_log_format(ab, " {");

> --- /dev/null
> +++ b/include/trace/events/selinux.h
> @@ -0,0 +1,35 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM selinux
> +
> +#if !defined(_TRACE_SELINUX_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_SELINUX_H
> +
> +#include <linux/ktime.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(selinux_denied,

TRACE_EVENT_CONDITION(selinux_denied,

> +
> + TP_PROTO(int cls, int av),

TP_PROTO(struct selinux_audit_data sad, int av)

> +
> + TP_ARGS(cls, av),
> +

TP_CONDITION(sad->denied),

The above condition will be tested before calling the tracepoint. But
only if the trace event is enabled.

> + TP_STRUCT__entry(
> + __field(int, cls)
> + __field(int, av)
> + ),
> +
> + TP_fast_assign(
> + __entry->cls = cls;

__entry->cls = sad->tclass;

> + __entry->av = av;
> + ),
> +
> + TP_printk("denied %d %d",
> + __entry->cls,
> + __entry->av)
> +);
> +
> +#endif
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>

-- Steve

2020-07-24 13:57:52

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Fri, Jul 24, 2020 at 9:32 AM Stephen Smalley
<[email protected]> wrote:
> On Fri, Jul 24, 2020 at 5:15 AM Thiébaud Weksteen <[email protected]> wrote:
> > The audit data currently captures which process and which target
> > is responsible for a denial. There is no data on where exactly in the
> > process that call occurred. Debugging can be made easier by being able to
> > reconstruct the unified kernel and userland stack traces [1]. Add a
> > tracepoint on the SELinux denials which can then be used by userland
> > (i.e. perf).
> >
> > Although this patch could manually be added by each OS developer to
> > trouble shoot a denial, adding it to the kernel streamlines the
> > developers workflow.
> >
> > [1] https://source.android.com/devices/tech/debug/native_stack_dump
> >
> > Signed-off-by: Thiébaud Weksteen <[email protected]>
> > Signed-off-by: Joel Fernandes <[email protected]>
> > ---
> > MAINTAINERS | 1 +
> > include/trace/events/selinux.h | 35 ++++++++++++++++++++++++++++++++++
> > security/selinux/avc.c | 6 ++++++
> > 3 files changed, 42 insertions(+)
> > create mode 100644 include/trace/events/selinux.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index e64cdde81851..6b6cd5e13537 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -15358,6 +15358,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git
> > F: Documentation/ABI/obsolete/sysfs-selinux-checkreqprot
> > F: Documentation/ABI/obsolete/sysfs-selinux-disable
> > F: Documentation/admin-guide/LSM/SELinux.rst
> > +F: include/trace/events/selinux.h
> > F: include/uapi/linux/selinux_netlink.h
> > F: scripts/selinux/
> > F: security/selinux/
> > diff --git a/include/trace/events/selinux.h b/include/trace/events/selinux.h
> > new file mode 100644
> > index 000000000000..e247187a8135
> > --- /dev/null
> > +++ b/include/trace/events/selinux.h
> > @@ -0,0 +1,35 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM selinux
> > +
> > +#if !defined(_TRACE_SELINUX_H) || defined(TRACE_HEADER_MULTI_READ)
> > +#define _TRACE_SELINUX_H
> > +
> > +#include <linux/ktime.h>
> > +#include <linux/tracepoint.h>
> > +
> > +TRACE_EVENT(selinux_denied,
> > +
> > + TP_PROTO(int cls, int av),
> > +
> > + TP_ARGS(cls, av),
> > +
> > + TP_STRUCT__entry(
> > + __field(int, cls)
> > + __field(int, av)
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->cls = cls;
> > + __entry->av = av;
> > + ),
> > +
> > + TP_printk("denied %d %d",
> > + __entry->cls,
> > + __entry->av)
> > +);
>
> I would think you would want to log av as %x for easier interpretation
> especially when there are multiple permissions being checked at once
> (which can happen). Also both cls and av would properly be unsigned
> values. Only other question I have is whether it would be beneficial
> to include other information here to help uniquely identify/correlate
> the denial with the avc: message and whether any decoding of the
> class, av, or other information could/should be done here versus in
> some userland helper.

It does seem like at the very least it would be nice to see the av as
hex values instead of integers, e.g. "%x" in the TP_printk() call.
Considering this patch is about making dev's lives easier, I tend to
agree with Stephen questioning if you should go a step further and
convert both the class and av values into string representations.

--
paul moore
http://www.paul-moore.com

2020-07-28 12:51:27

by Thiébaud Weksteen

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

Thanks for the review! I'll send a new revision of the patch with the
%x formatter and using the TP_CONDITION macro.

On adding further information to the trace event, I would prefer
adding the strict minimum to be able to correlate the event with the
avc message. The reason is that tracevents have a fixed size (see
https://www.kernel.org/doc/Documentation/trace/events.txt). For
instance, we would need to decide on a maximum size for the string
representation of the list of permissions. This would also duplicate
the reporting done in the avc audit event. I'll simply add the pid as
part of the printk, which should be sufficient for the correlation.


On Fri, Jul 24, 2020 at 3:55 PM Paul Moore <[email protected]> wrote:
>
> On Fri, Jul 24, 2020 at 9:32 AM Stephen Smalley
> <[email protected]> wrote:
> > On Fri, Jul 24, 2020 at 5:15 AM Thiébaud Weksteen <[email protected]> wrote:
> > > The audit data currently captures which process and which target
> > > is responsible for a denial. There is no data on where exactly in the
> > > process that call occurred. Debugging can be made easier by being able to
> > > reconstruct the unified kernel and userland stack traces [1]. Add a
> > > tracepoint on the SELinux denials which can then be used by userland
> > > (i.e. perf).
> > >
> > > Although this patch could manually be added by each OS developer to
> > > trouble shoot a denial, adding it to the kernel streamlines the
> > > developers workflow.
> > >
> > > [1] https://source.android.com/devices/tech/debug/native_stack_dump
> > >
> > > Signed-off-by: Thiébaud Weksteen <[email protected]>
> > > Signed-off-by: Joel Fernandes <[email protected]>
> > > ---
> > > MAINTAINERS | 1 +
> > > include/trace/events/selinux.h | 35 ++++++++++++++++++++++++++++++++++
> > > security/selinux/avc.c | 6 ++++++
> > > 3 files changed, 42 insertions(+)
> > > create mode 100644 include/trace/events/selinux.h
> > >
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index e64cdde81851..6b6cd5e13537 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -15358,6 +15358,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git
> > > F: Documentation/ABI/obsolete/sysfs-selinux-checkreqprot
> > > F: Documentation/ABI/obsolete/sysfs-selinux-disable
> > > F: Documentation/admin-guide/LSM/SELinux.rst
> > > +F: include/trace/events/selinux.h
> > > F: include/uapi/linux/selinux_netlink.h
> > > F: scripts/selinux/
> > > F: security/selinux/
> > > diff --git a/include/trace/events/selinux.h b/include/trace/events/selinux.h
> > > new file mode 100644
> > > index 000000000000..e247187a8135
> > > --- /dev/null
> > > +++ b/include/trace/events/selinux.h
> > > @@ -0,0 +1,35 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +#undef TRACE_SYSTEM
> > > +#define TRACE_SYSTEM selinux
> > > +
> > > +#if !defined(_TRACE_SELINUX_H) || defined(TRACE_HEADER_MULTI_READ)
> > > +#define _TRACE_SELINUX_H
> > > +
> > > +#include <linux/ktime.h>
> > > +#include <linux/tracepoint.h>
> > > +
> > > +TRACE_EVENT(selinux_denied,
> > > +
> > > + TP_PROTO(int cls, int av),
> > > +
> > > + TP_ARGS(cls, av),
> > > +
> > > + TP_STRUCT__entry(
> > > + __field(int, cls)
> > > + __field(int, av)
> > > + ),
> > > +
> > > + TP_fast_assign(
> > > + __entry->cls = cls;
> > > + __entry->av = av;
> > > + ),
> > > +
> > > + TP_printk("denied %d %d",
> > > + __entry->cls,
> > > + __entry->av)
> > > +);
> >
> > I would think you would want to log av as %x for easier interpretation
> > especially when there are multiple permissions being checked at once
> > (which can happen). Also both cls and av would properly be unsigned
> > values. Only other question I have is whether it would be beneficial
> > to include other information here to help uniquely identify/correlate
> > the denial with the avc: message and whether any decoding of the
> > class, av, or other information could/should be done here versus in
> > some userland helper.
>
> It does seem like at the very least it would be nice to see the av as
> hex values instead of integers, e.g. "%x" in the TP_printk() call.
> Considering this patch is about making dev's lives easier, I tend to
> agree with Stephen questioning if you should go a step further and
> convert both the class and av values into string representations.
>
> --
> paul moore
> http://www.paul-moore.com

2020-07-28 13:08:36

by Stephen Smalley

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On 7/28/20 8:49 AM, Thiébaud Weksteen wrote:

> Thanks for the review! I'll send a new revision of the patch with the
> %x formatter and using the TP_CONDITION macro.
>
> On adding further information to the trace event, I would prefer
> adding the strict minimum to be able to correlate the event with the
> avc message. The reason is that tracevents have a fixed size (see
> https://www.kernel.org/doc/Documentation/trace/events.txt). For
> instance, we would need to decide on a maximum size for the string
> representation of the list of permissions. This would also duplicate
> the reporting done in the avc audit event. I'll simply add the pid as
> part of the printk, which should be sufficient for the correlation.

Ok, also please use unsigned int for the fields and %u for the cls value.

(btw top-posting is discouraged for mailing list discussions, see
http://vger.kernel.org/lkml/#s3-9)

2020-07-28 13:13:42

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, 28 Jul 2020 14:49:24 +0200
Thiébaud Weksteen <[email protected]> wrote:

> Thanks for the review! I'll send a new revision of the patch with the
> %x formatter and using the TP_CONDITION macro.
>
> On adding further information to the trace event, I would prefer
> adding the strict minimum to be able to correlate the event with the
> avc message. The reason is that tracevents have a fixed size (see
> https://www.kernel.org/doc/Documentation/trace/events.txt). For

Wait! What?

Where in that document does it say that trace events have a fixed size.
We have a lot of dynamically sized trace events.

> instance, we would need to decide on a maximum size for the string
> representation of the list of permissions. This would also duplicate
> the reporting done in the avc audit event. I'll simply add the pid as
> part of the printk, which should be sufficient for the correlation.
>

Please take a look at samples/trace_events/trace_events_sample.h

and read the example on __print_symbolic().

I think that's what you are looking for.

-- Steve

2020-07-28 13:21:37

by Thiébaud Weksteen

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, Jul 28, 2020 at 3:04 PM Stephen Smalley
<[email protected]> wrote:
> Ok, also please use unsigned int for the fields and %u for the cls value.

Will do in v3. Thanks.

2020-07-28 13:26:40

by Thiébaud Weksteen

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, Jul 28, 2020 at 3:12 PM Steven Rostedt <[email protected]> wrote:
> Where in that document does it say that trace events have a fixed size.
> We have a lot of dynamically sized trace events.

My mistake. From the "format" pseudo-file, I assumed the offset and
size were fixed.

> Please take a look at samples/trace_events/trace_events_sample.h
> and read the example on __print_symbolic().
> I think that's what you are looking for.

Ack, thanks for pointing these out. I still think that my other
argument (i.e. duplication of avc message) holds.

2020-07-28 15:13:46

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, Jul 28, 2020 at 8:49 AM Thiébaud Weksteen <[email protected]> wrote:
>
> Thanks for the review! I'll send a new revision of the patch with the
> %x formatter and using the TP_CONDITION macro.
>
> On adding further information to the trace event, I would prefer
> adding the strict minimum to be able to correlate the event with the
> avc message. The reason is that tracevents have a fixed size (see
> https://www.kernel.org/doc/Documentation/trace/events.txt). For
> instance, we would need to decide on a maximum size for the string
> representation of the list of permissions.

It sounds like this is no longer an issue, hopefully this changes your
thinking as I'm not sure how usable it would be in practice for users
not overly familiar with SELinux.

Perhaps it would be helpful if you provided an example of how one
would be expected to use this new tracepoint? That would help put
things in the proper perspective.

> This would also duplicate
> the reporting done in the avc audit event. I'll simply add the pid as
> part of the printk, which should be sufficient for the correlation.

Well, to be honest, the very nature of this tracepoint is duplicating
the AVC audit record with a focus on using perf to establish a full
backtrace at the expense of reduced information. At least that is how
it appears to me.

--
paul moore
http://www.paul-moore.com

2020-07-28 15:24:44

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Fri, Jul 24, 2020 at 5:15 AM Thiébaud Weksteen <[email protected]> wrote:
>
> The audit data currently captures which process and which target
> is responsible for a denial. There is no data on where exactly in the
> process that call occurred. Debugging can be made easier by being able to
> reconstruct the unified kernel and userland stack traces [1]. Add a
> tracepoint on the SELinux denials which can then be used by userland
> (i.e. perf).
>
> Although this patch could manually be added by each OS developer to
> trouble shoot a denial, adding it to the kernel streamlines the
> developers workflow.
>
> [1] https://source.android.com/devices/tech/debug/native_stack_dump
>
> Signed-off-by: Thiébaud Weksteen <[email protected]>
> Signed-off-by: Joel Fernandes <[email protected]>

While I am in support of the general idea, could you change my SOB to
something like Inspired-by?

This is really your patch, but I did demonstrate the idea in an
article where the intention was to apply a patch out of tree to do
stack dumps / tracing. SOB on the other hand is supposed to track the
flow of a patch (the people who the patch goes through) when it is
sent upstream.

Thanks,

- Joel

2020-07-28 16:04:22

by Thiébaud Weksteen

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, Jul 28, 2020 at 5:12 PM Paul Moore <[email protected]> wrote:
> Perhaps it would be helpful if you provided an example of how one
> would be expected to use this new tracepoint? That would help put
> things in the proper perspective.

The best example is the one I provided in the commit message, that is
using perf (or a perf equivalent), to hook onto that tracepoint.

> Well, to be honest, the very nature of this tracepoint is duplicating
> the AVC audit record with a focus on using perf to establish a full
> backtrace at the expense of reduced information. At least that is how
> it appears to me.

I see both methods as complementary. By default, the kernel itself can
do some reporting (i.e avc message) on which process triggered the
denial, what was the context, etc. This is useful even in production
and doesn't require any extra tooling.
The case for adding this tracepoint can be seen as advanced debugging.
That is, once an avc denial has been confirmed, a developer can use
this tracepoint to surface the userland stacktrace. It requires more
userland tools and symbols on the userland binaries.

2020-07-28 16:20:43

by Stephen Smalley

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On 7/28/20 12:02 PM, Thiébaud Weksteen wrote:

> On Tue, Jul 28, 2020 at 5:12 PM Paul Moore <[email protected]> wrote:
>> Perhaps it would be helpful if you provided an example of how one
>> would be expected to use this new tracepoint? That would help put
>> things in the proper perspective.
> The best example is the one I provided in the commit message, that is
> using perf (or a perf equivalent), to hook onto that tracepoint.
>
>> Well, to be honest, the very nature of this tracepoint is duplicating
>> the AVC audit record with a focus on using perf to establish a full
>> backtrace at the expense of reduced information. At least that is how
>> it appears to me.
> I see both methods as complementary. By default, the kernel itself can
> do some reporting (i.e avc message) on which process triggered the
> denial, what was the context, etc. This is useful even in production
> and doesn't require any extra tooling.
> The case for adding this tracepoint can be seen as advanced debugging.
> That is, once an avc denial has been confirmed, a developer can use
> this tracepoint to surface the userland stacktrace. It requires more
> userland tools and symbols on the userland binaries.

Providing an example of the tracepoint output in the patch description
would be helpful IMHO.

2020-07-28 16:21:49

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, Jul 28, 2020 at 12:02 PM Thiébaud Weksteen <[email protected]> wrote:
> On Tue, Jul 28, 2020 at 5:12 PM Paul Moore <[email protected]> wrote:
> > Perhaps it would be helpful if you provided an example of how one
> > would be expected to use this new tracepoint? That would help put
> > things in the proper perspective.
>
> The best example is the one I provided in the commit message, that is
> using perf (or a perf equivalent), to hook onto that tracepoint.

I probably wasn't as clear as I should have been. I think it would be
helpful if you demonstrated how one would take the SELinux data in the
perf event and translated that into something meaningful.

--
paul moore
http://www.paul-moore.com

2020-07-30 08:04:15

by peter enderborg

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On 7/28/20 6:02 PM, Thiébaud Weksteen wrote:
> On Tue, Jul 28, 2020 at 5:12 PM Paul Moore <[email protected]> wrote:
>> Perhaps it would be helpful if you provided an example of how one
>> would be expected to use this new tracepoint? That would help put
>> things in the proper perspective.
> The best example is the one I provided in the commit message, that is
> using perf (or a perf equivalent), to hook onto that tracepoint.
>
>> Well, to be honest, the very nature of this tracepoint is duplicating
>> the AVC audit record with a focus on using perf to establish a full
>> backtrace at the expense of reduced information. At least that is how
>> it appears to me.
> I see both methods as complementary. By default, the kernel itself can
> do some reporting (i.e avc message) on which process triggered the
> denial, what was the context, etc. This is useful even in production
> and doesn't require any extra tooling.
> The case for adding this tracepoint can be seen as advanced debugging.
> That is, once an avc denial has been confirmed, a developer can use
> this tracepoint to surface the userland stacktrace. It requires more
> userland tools and symbols on the userland binaries.

I think from development view you would like to have a better
way to trap this events in userspace. One idea that I have is
is to have more outcomes from a rule. We have today allow,
dontaudit, auditallow i think it would be good to have signal sent too.
"signal-xxx-allow" for some set of signals. SIGBUS, SIGSEGV, SIGABRT maybe.

That will be a good way to pickup the problem with a debugger or generate a
a core file.

I have also done some selinux trace functions. I think they collide with this set,
but I think I can rebase them upon yours and see if they give some more functionality.

I see this functionality very much needed in some form.


2020-07-30 14:32:10

by peter enderborg

[permalink] [raw]
Subject: [PATCH] RFC: selinux avc trace

I did manage to rebase it but this is about my approach.

Compared to Thiébaud Weksteen patch this adds:

1 Filtering. Types goes to trace so we can put up a filter for contexts or type etc.

2 It tries also to cover non denies.  And upon that you should be able to do coverage tools.
I think many systems have a lot more rules that what is needed, but there is good way
to find out what.  A other way us to make a stat page for the rules, but this way connect to
userspace and can be used for test cases.

This code need a lot more work, but it shows how the filter should work (extra info is not right)
and there are  memory leaks, extra debug info and nonsense variable etc.

From: Peter Enderborg <[email protected]>
Date: Thu, 30 Jul 2020 14:44:53 +0200
Subject: [PATCH] RFC: selinux avc trace

This is not done yet. But it shows a trace for selinux avc.
---
 include/trace/events/avc.h |  92 +++++++++++++++++++++++++++++
 security/selinux/avc.c     | 115 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 205 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/avc.h

diff --git a/include/trace/events/avc.h b/include/trace/events/avc.h
new file mode 100644
index 000000000000..28c1044e918b
--- /dev/null
+++ b/include/trace/events/avc.h
@@ -0,0 +1,92 @@
+/*
+ * License terms: GNU General Public License (GPL) version 2
+ *
+ * Author: Peter Enderborg <[email protected]>
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM avc
+
+#if !defined(_TRACE_AVC_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_AVC_H
+
+#include <linux/tracepoint.h>
+TRACE_EVENT(avc_data,
+        TP_PROTO(u32 requested,
+             u32 denied,
+             u32 audited,
+             int result,
+             const char *msg
+             ),
+
+        TP_ARGS(requested, denied, audited, result,msg),
+
+        TP_STRUCT__entry(
+             __field(u32, requested)
+             __field(u32, denied)
+             __field(u32, audited)
+             __field(int, result)
+             __array(char, msg, 255)
+                 ),
+
+        TP_fast_assign(
+               __entry->requested    = requested;
+               __entry->denied    = denied;
+               __entry->audited    = audited;
+               __entry->result    = result;
+               memcpy(__entry->msg, msg, 255);
+    ),
+
+        TP_printk("requested=0x%x denied=%d audited=%d result=%d msg=%s",
+              __entry->requested, __entry->denied, __entry->audited, __entry->result, __entry->msg
+              )
+);
+TRACE_EVENT(avc_req,
+        TP_PROTO(u32 requested,
+             u32 denied,
+             u32 audited,
+             int result,
+             const char *msg,
+             u32 ssid,
+             struct selinux_state *state
+             ),
+
+        TP_ARGS(requested, denied, audited, result,msg, ssid, state),
+
+        TP_STRUCT__entry(
+            __field(u32, requested)
+            __field(u32, denied)
+            __field(u32, audited)
+            __field(int, result)
+            __array(char, msg, 255)
+            __field(u32, ssid)
+            __field(struct selinux_state *, state)
+            __field(char*, scontext)
+            __field(u32, ssid_len)
+            __field(u32, ssid_res)
+                 ),
+
+        TP_fast_assign(
+            __entry->requested    = requested;
+            __entry->denied    = denied;
+            __entry->audited    = audited;
+             __entry->result    = result;
+            memcpy(__entry->msg, msg, 255);
+            __entry->ssid    = ssid;
+            __entry->state    = state;
+            __entry->ssid_res = security_sid_to_context(
+                           state,
+                           ssid,
+                           &__entry->scontext,
+                           &__entry->ssid_len);
+    ),
+
+        TP_printk("requested=0x%x denied=%d audited=%d result=%d msg=%s ssid=%d state=%p scontext=%s slen=%d s=%d",
+              __entry->requested, __entry->denied, __entry->audited, __entry->result, __entry->msg, __entry->ssid, __entry->state,__entry->scontext,__entry->ssid_len, __entry->ssid_res
+              )
+);
+
+#endif /* _TRACE_AVC_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index d18cb32a242a..d8cb9a23d669 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -30,6 +30,8 @@
 #include "avc.h"
 #include "avc_ss.h"
 #include "classmap.h"
+#define CREATE_TRACE_POINTS
+#include <trace/events/avc.h>
 
 #define AVC_CACHE_SLOTS            512
 #define AVC_DEF_CACHE_THRESHOLD        512
@@ -126,6 +128,41 @@ static inline int avc_hash(u32 ssid, u32 tsid, u16 tclass)
     return (ssid ^ (tsid<<2) ^ (tclass<<4)) & (AVC_CACHE_SLOTS - 1);
 }
 
+static int avc_dump_avs(char *ab, u16 tclass, u32 av)
+{
+    const char **perms;
+    int i, perm;
+    int rp;
+
+    if (av == 0) {
+        rp = sprintf(ab, " null");
+        return rp;
+    }
+
+    BUG_ON(!tclass || tclass >= ARRAY_SIZE(secclass_map));
+    perms = secclass_map[tclass-1].perms;
+
+    rp = sprintf(ab, " {");
+    i = 0;
+    perm = 1;
+    while (i < (sizeof(av) * 8)) {
+        if ((perm & av) && perms[i]) {
+            rp +=sprintf(ab+rp, " %s", perms[i]);
+            av &= ~perm;
+        }
+        i++;
+        perm <<= 1;
+    }
+
+    if (av)
+        rp += sprintf(ab+rp, " 0x%x", av);
+
+    rp +=sprintf(ab+rp, " }");
+    return rp;
+}
+
+
+
 /**
  * avc_init - Initialize the AVC.
  *
@@ -421,8 +458,12 @@ static inline int avc_xperms_audit(struct selinux_state *state,
 
     audited = avc_xperms_audit_required(
             requested, avd, xpd, perm, result, &denied);
-    if (likely(!audited))
+    if (likely(!audited)) {
+        trace_avc_data(requested, denied, audited, -1,"foo");
         return 0;
+    }
+
+    trace_avc_data(requested, denied, audited, -1,"bar");
     return slow_avc_audit(state, ssid, tsid, tclass, requested,
             audited, denied, result, ad);
 }
@@ -541,6 +582,34 @@ static inline struct avc_node *avc_search_node(struct selinux_avc *avc,
     return ret;
 }
 
+static int avc_dump_querys(struct selinux_state *state, char *ab, u32 ssid, u32 tsid, u16 tclass)
+{
+    int rc;
+    char *scontext;
+    u32 scontext_len;
+    int rp;
+
+    rc = security_sid_to_context(state,ssid, &scontext, &scontext_len);
+    if (rc)
+        rp = sprintf(ab, "ssid=%d", ssid);
+    else {
+        rp = sprintf(ab, "scontext=%s", scontext);
+        kfree(scontext);
+    }
+
+    rc = security_sid_to_context(state, tsid, &scontext, &scontext_len);
+    if (rc)
+        rp +=sprintf(ab+rp, " tsid=%d", tsid);
+    else {
+        rp +=sprintf(ab+rp, " tcontext=%s", scontext);
+        kfree(scontext);
+    }
+
+    BUG_ON(!tclass || tclass >= ARRAY_SIZE(secclass_map));
+    rp += sprintf(ab+rp, " tclass=%s", secclass_map[tclass-1].name);
+    return rp;
+}
+
 /**
  * avc_lookup - Look up an AVC entry.
  * @ssid: source security identifier
@@ -690,6 +759,7 @@ static void avc_audit_pre_callback(struct audit_buffer *ab, void *a)
         audit_log_format(ab, " 0x%x", av);
 
     audit_log_format(ab, " } for ");
+
 }
 
 /**
@@ -780,6 +850,7 @@ noinline int slow_avc_audit(struct selinux_state *state,
     a->selinux_audit_data = &sad;
 
     common_lsm_audit(a, avc_audit_pre_callback, avc_audit_post_callback);
+    pr_info("done lsm");
     return 0;
 }
 
@@ -1105,6 +1176,34 @@ int avc_has_extended_perms(struct selinux_state *state,
         return rc2;
     return rc;
 }
+static int avc_dump_extra_info_s(char *ab,
+        struct common_audit_data *ad)
+{
+    struct task_struct *tsk = current;
+    int rp;
+
+    if (tsk && tsk->pid) {
+        rp = sprintf(ab, " ppid=%d pcomm=", tsk->parent->pid);
+        rp += sprintf(ab+rp, tsk->parent->comm);
+
+        if (tsk->group_leader->pid != tsk->pid) {
+            rp +=sprintf(ab+rp, " pgid=%d pgcomm=",
+                    tsk->group_leader->pid);
+            rp += sprintf(ab+rp,
+                    tsk->group_leader->comm);
+        } else if (tsk->parent->group_leader->pid) {
+            rp += sprintf(ab+rp, " pgid=%d pgcomm=",
+                    tsk->parent->group_leader->pid);
+            rp += sprintf(ab+rp,
+                    tsk->parent->group_leader->comm);
+        }
+    } else {
+        rp = sprintf(ab," no task %p", tsk);
+    }
+    return rp;
+}
+
+
 
 /**
  * avc_has_perm_noaudit - Check permissions but perform no auditing.
@@ -1178,14 +1277,26 @@ int avc_has_perm(struct selinux_state *state, u32 ssid, u32 tsid, u16 tclass,
 {
     struct av_decision avd;
     int rc, rc2;
+    char *lb;
+    int x;
+
+    lb = kmalloc(1024, GFP_KERNEL);
 
     rc = avc_has_perm_noaudit(state, ssid, tsid, tclass, requested, 0,
                   &avd);
 
     rc2 = avc_audit(state, ssid, tsid, tclass, requested, &avd, rc,
             auditdata, 0);
-    if (rc2)
+    if (rc2)  {
+        kfree(lb);
         return rc2;
+    }
+
+    x = avc_dump_avs(lb, tclass, 42);
+    x += avc_dump_querys(state, lb+x,ssid, tsid, tclass);
+        if(1)    x += avc_dump_extra_info_s(lb+x, auditdata);
+    trace_avc_data(requested, rc, 0, rc2,lb);
+    kfree(lb);
     return rc;
 }
 
--
2.17.1


2020-07-30 14:53:18

by Stephen Smalley

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On Thu, Jul 30, 2020 at 10:29 AM peter enderborg
<[email protected]> wrote:
>
> I did manage to rebase it but this is about my approach.
>
> Compared to Thiébaud Weksteen patch this adds:
>
> 1 Filtering. Types goes to trace so we can put up a filter for contexts or type etc.
>
> 2 It tries also to cover non denies. And upon that you should be able to do coverage tools.
> I think many systems have a lot more rules that what is needed, but there is good way
> to find out what. A other way us to make a stat page for the rules, but this way connect to
> userspace and can be used for test cases.
>
> This code need a lot more work, but it shows how the filter should work (extra info is not right)
> and there are memory leaks, extra debug info and nonsense variable etc.

Perhaps the two of you could work together to come up with a common
tracepoint that addresses both needs.
On the one hand, we don't need/want to duplicate the avc message
itself; we just need enough to be able to correlate them.
With respect to non-denials, SELinux auditallow statements can be used
to generate avc: granted messages that can be used to support coverage
tools although you can easily flood the logs that way. One other
limitation of the other patch is that it doesn't support generating
trace information for denials silenced by dontaudit rules, which might
be challenging to debug especially on Android where you can't just run
semodule -DB to strip all dontaudits.

2020-07-30 15:07:28

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On Thu, 30 Jul 2020 16:29:12 +0200
peter enderborg <[email protected]> wrote:

> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM avc
> +
> +#if !defined(_TRACE_AVC_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_AVC_H
> +
> +#include <linux/tracepoint.h>
> +TRACE_EVENT(avc_data,
> +        TP_PROTO(u32 requested,
> +             u32 denied,
> +             u32 audited,
> +             int result,
> +             const char *msg
> +             ),
> +
> +        TP_ARGS(requested, denied, audited, result,msg),
> +
> +        TP_STRUCT__entry(
> +             __field(u32, requested)
> +             __field(u32, denied)
> +             __field(u32, audited)
> +             __field(int, result)
> +             __array(char, msg, 255)

You want to use __string() here, otherwise you are wasting a lot of
buffer space.

__string( msg, msg)

> +                 ),
> +
> +        TP_fast_assign(
> +               __entry->requested    = requested;
> +               __entry->denied    = denied;
> +               __entry->audited    = audited;
> +               __entry->result    = result;
> +               memcpy(__entry->msg, msg, 255);

Not to mention, the above is a bug. As the msg being passed in, is
highly unlikely to be 255 bytes. You just leaked all that memory after
the sting to user space.

Where you want here:

__assign_str( msg, msg );


-- Steve



> +    ),
> +
> +        TP_printk("requested=0x%x denied=%d audited=%d result=%d
> msg=%s",
> +              __entry->requested, __entry->denied, __entry->audited,
> __entry->result, __entry->msg
> +              )

2020-07-30 15:33:56

by peter enderborg

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On 7/30/20 5:04 PM, Steven Rostedt wrote:
> On Thu, 30 Jul 2020 16:29:12 +0200
> peter enderborg <[email protected]> wrote:
>
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM avc
>> +
>> +#if !defined(_TRACE_AVC_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_AVC_H
>> +
>> +#include <linux/tracepoint.h>
>> +TRACE_EVENT(avc_data,
>> +        TP_PROTO(u32 requested,
>> +             u32 denied,
>> +             u32 audited,
>> +             int result,
>> +             const char *msg
>> +             ),
>> +
>> +        TP_ARGS(requested, denied, audited, result,msg),
>> +
>> +        TP_STRUCT__entry(
>> +             __field(u32, requested)
>> +             __field(u32, denied)
>> +             __field(u32, audited)
>> +             __field(int, result)
>> +             __array(char, msg, 255)
> You want to use __string() here, otherwise you are wasting a lot of
> buffer space.
>
> __string( msg, msg)
It should be a full structure with a lot of sub strings.  But that make is even more relevant.
>
>> +                 ),
>> +
>> +        TP_fast_assign(
>> +               __entry->requested    = requested;
>> +               __entry->denied    = denied;
>> +               __entry->audited    = audited;
>> +               __entry->result    = result;
>> +               memcpy(__entry->msg, msg, 255);
> Not to mention, the above is a bug. As the msg being passed in, is
> highly unlikely to be 255 bytes. You just leaked all that memory after
> the sting to user space.
>
> Where you want here:
>
> __assign_str( msg, msg );

Directly in to the code. Was more in to get in to discussion on how complex we should have
the trace data. There is a lot of fields. Not all is always present. Is there any good way
to handle that? Like "something= somethingelse=42" or "something=nil somthingelse=42"


>
> -- Steve
>
>
>
>> +    ),
>> +
>> +        TP_printk("requested=0x%x denied=%d audited=%d result=%d
>> msg=%s",
>> +              __entry->requested, __entry->denied, __entry->audited,
>> __entry->result, __entry->msg
>> +              )


2020-07-30 15:48:39

by peter enderborg

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On 7/30/20 4:50 PM, Stephen Smalley wrote:
> On Thu, Jul 30, 2020 at 10:29 AM peter enderborg
> <[email protected]> wrote:
>> I did manage to rebase it but this is about my approach.
>>
>> Compared to Thiébaud Weksteen patch this adds:
>>
>> 1 Filtering. Types goes to trace so we can put up a filter for contexts or type etc.
>>
>> 2 It tries also to cover non denies. And upon that you should be able to do coverage tools.
>> I think many systems have a lot more rules that what is needed, but there is good way
>> to find out what. A other way us to make a stat page for the rules, but this way connect to
>> userspace and can be used for test cases.
>>
>> This code need a lot more work, but it shows how the filter should work (extra info is not right)
>> and there are memory leaks, extra debug info and nonsense variable etc.
> Perhaps the two of you could work together to come up with a common
> tracepoint that addresses both needs.

Sounds good to me.

> On the one hand, we don't need/want to duplicate the avc message
> itself; we just need enough to be able to correlate them.
> With respect to non-denials, SELinux auditallow statements can be used
> to generate avc: granted messages that can be used to support coverage
> tools although you can easily flood the logs that way. One other

That is one reason to use trace. I can be used for things that
generate a lot of data. Like memory allocations and scheduler etc,
and it is a developer tool so you should not have to worry about DOS etc.

Both netlink and android logging are too happy to throw away data for
developers to be happy.

> limitation of the other patch is that it doesn't support generating
> trace information for denials silenced by dontaudit rules, which might
> be challenging to debug especially on Android where you can't just run
> semodule -DB to strip all dontaudits.

I think that only work for rooted devices. Many application developers
run on commercial devices that are locked, but they do have access
to trace. But I have no idea if they (google) have intended the selinux traces to
available there.

2020-07-30 15:51:01

by Thiébaud Weksteen

[permalink] [raw]
Subject: Re: [PATCH] selinux: add tracepoint on denials

On Tue, Jul 28, 2020 at 6:20 PM Paul Moore <[email protected]> wrote:
> I probably wasn't as clear as I should have been. I think it would be
> helpful if you demonstrated how one would take the SELinux data in the
> perf event and translated that into something meaningful.

So the data itself is not that relevant. What is important is the
ability to hook the kernel at the right location, at the right time.
Here is an example on how this patch can be used on Android
(simpleperf is the Android equivalent of perf), running dmesg as the
shell user which is not permitted:
# simpleperf record -e selinux:selinux_denied -a -g --duration 10
# simpleperf report -g --full-callgraph
Cmdline: /system/bin/simpleperf record -e selinux:selinux_denied -a -g
--duration 10
Arch: arm64
Event: selinux:selinux_denied (type 2, config 493)
Samples: 1
Event count: 1

Children Self Command Pid Tid Shared Object
Symbol
100.00% 0.00% dmesg 3511 3511
/apex/com.android.runtime/lib64/bionic/libc.so __libc_init
|
-- __libc_init
|
-- main
toybox_main
toy_exec_which
dmesg_main
klogctl
el0_svc_naked
sys_syslog
do_syslog
security_syslog
selinux_syslog
avc_has_perm
slow_avc_audit
common_lsm_audit
avc_audit_pre_callback

You can see the combined user and kernel stacks which is useful to
understand where and why the denial happened.
The key point is that simpleperf is doing the heavy work (i.e names
resolution), while the kernel only shares the strict minimum for that
to happen.
This can be correlated with the pid of the avc denial message (I'm
assuming we are trouble shooting one specific denial).

It is also possible to manually use ftrace. For instance, after
enabling and triggering the denial:
bonito:/sys/kernel/debug/tracing # cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1 #P:8
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
dmesg-3624 [001] .... 13072.325358: selinux_denied: denied
pid=3624 tclass=4 audited=2

This can be correlated with the following avc denial:
[ 2180.183062] type=1400 audit(1596111144.026:27): avc: denied {
syslog_read } for comm="dmesg" scontext=u:r:shell:s0
tcontext=u:r:kernel:s0 tclass=system permissive=0
Here, there is limited value of having that tracepoint as we are only
duplicating the avc message content.

Nevertheless, the filtering part of Peter's patch would be useful to
be more precise on which denial we are targeting (I'll reply to the
other thread as well).
I hope this clarifies the usage. Thanks.

2020-07-30 16:02:52

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On Thu, 30 Jul 2020 17:31:17 +0200
peter enderborg <[email protected]> wrote:

> On 7/30/20 5:04 PM, Steven Rostedt wrote:
> > On Thu, 30 Jul 2020 16:29:12 +0200
> > peter enderborg <[email protected]> wrote:
> >
> >> +#undef TRACE_SYSTEM
> >> +#define TRACE_SYSTEM avc
> >> +
> >> +#if !defined(_TRACE_AVC_H) || defined(TRACE_HEADER_MULTI_READ)
> >> +#define _TRACE_AVC_H
> >> +
> >> +#include <linux/tracepoint.h>
> >> +TRACE_EVENT(avc_data,
> >> +        TP_PROTO(u32 requested,
> >> +             u32 denied,
> >> +             u32 audited,
> >> +             int result,
> >> +             const char *msg
> >> +             ),
> >> +
> >> +        TP_ARGS(requested, denied, audited, result,msg),
> >> +
> >> +        TP_STRUCT__entry(
> >> +             __field(u32, requested)
> >> +             __field(u32, denied)
> >> +             __field(u32, audited)
> >> +             __field(int, result)
> >> +             __array(char, msg, 255)
> > You want to use __string() here, otherwise you are wasting a lot of
> > buffer space.
> >
> > __string( msg, msg)

> It should be a full structure with a lot of sub strings.  But that make is even more relevant.

So one event instance can have a list of strings recorded?

> >
> >> +                 ),
> >> +
> >> +        TP_fast_assign(
> >> +               __entry->requested    = requested;
> >> +               __entry->denied    = denied;
> >> +               __entry->audited    = audited;
> >> +               __entry->result    = result;
> >> +               memcpy(__entry->msg, msg, 255);
> > Not to mention, the above is a bug. As the msg being passed in, is
> > highly unlikely to be 255 bytes. You just leaked all that memory after
> > the sting to user space.
> >
> > Where you want here:
> >
> > __assign_str( msg, msg );
>
> Directly in to the code. Was more in to get in to discussion on how complex we should have
> the trace data. There is a lot of fields. Not all is always present. Is there any good way
> to handle that? Like "something= somethingelse=42" or "something=nil somthingelse=42"

Can you show what you want to record and what you want to display? I'm
not totally understanding the request.

-- Steve

> >> +    ),
> >> +
> >> +        TP_printk("requested=0x%x denied=%d audited=%d result=%d
> >> msg=%s",
> >> +              __entry->requested, __entry->denied, __entry->audited,
> >> __entry->result, __entry->msg
> >> +              )
>

2020-07-30 17:07:08

by peter enderborg

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On 7/30/20 6:02 PM, Steven Rostedt wrote:
> On Thu, 30 Jul 2020 17:31:17 +0200
> peter enderborg <[email protected]> wrote:
>
>> On 7/30/20 5:04 PM, Steven Rostedt wrote:
>>> On Thu, 30 Jul 2020 16:29:12 +0200
>>> peter enderborg <[email protected]> wrote:
>>>
>>>> +#undef TRACE_SYSTEM
>>>> +#define TRACE_SYSTEM avc
>>>> +
>>>> +#if !defined(_TRACE_AVC_H) || defined(TRACE_HEADER_MULTI_READ)
>>>> +#define _TRACE_AVC_H
>>>> +
>>>> +#include <linux/tracepoint.h>
>>>> +TRACE_EVENT(avc_data,
>>>> +        TP_PROTO(u32 requested,
>>>> +             u32 denied,
>>>> +             u32 audited,
>>>> +             int result,
>>>> +             const char *msg
>>>> +             ),
>>>> +
>>>> +        TP_ARGS(requested, denied, audited, result,msg),
>>>> +
>>>> +        TP_STRUCT__entry(
>>>> +             __field(u32, requested)
>>>> +             __field(u32, denied)
>>>> +             __field(u32, audited)
>>>> +             __field(int, result)
>>>> +             __array(char, msg, 255)
>>> You want to use __string() here, otherwise you are wasting a lot of
>>> buffer space.
>>>
>>> __string( msg, msg)
>> It should be a full structure with a lot of sub strings.  But that make is even more relevant.
> So one event instance can have a list of strings recorded?

Yes, it is a list very similar to a normal trace. But it is more generic.

For example ino= is for filesystems that have inode, but for a
violation that send a signal that make no sense at all.  Network
addresses is in many cases not applicable. laddr= is only exist for
for IP.

So if you just print them it will look like:

avc:  denied  { find } for interface=vendor.qti.hardware.perf::IPerf sid=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 pid=9164 scontext=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 tcontext=u:object_r:vendor_hal_perf_hwservice:s0 tclass=hwservice_manager permissive=0
 avc:  denied  { execute } for  pid=13914 comm="ScionFrontendAp" path="/data/user_de/0/com.google.android.gms/app_chimera/m/00000002/oat/arm64/DynamiteLoader.odex" dev="sda77" ino=204967 scontext=u:r:platform_app:s0:c512,c768 tcontext=u:object_r:privapp_data_file:s0:c512,c768 tclass=file permissive=0 ppid=788 pcomm="main" pgid=13914 pgcomm="on.updatecenter"

It omit the fields that are not used. Some parts are common some are not. So a correct format specification for trace will be problematic if there is no "optional" field indicator.

>
>>>
>>>> +                 ),
>>>> +
>>>> +        TP_fast_assign(
>>>> +               __entry->requested    = requested;
>>>> +               __entry->denied    = denied;
>>>> +               __entry->audited    = audited;
>>>> +               __entry->result    = result;
>>>> +               memcpy(__entry->msg, msg, 255);
>>> Not to mention, the above is a bug. As the msg being passed in, is
>>> highly unlikely to be 255 bytes. You just leaked all that memory after
>>> the sting to user space.
>>>
>>> Where you want here:
>>>
>>> __assign_str( msg, msg );
>> Directly in to the code. Was more in to get in to discussion on how complex we should have
>> the trace data. There is a lot of fields. Not all is always present. Is there any good way
>> to handle that? Like "something= somethingelse=42" or "something=nil somthingelse=42"
> Can you show what you want to record and what you want to display? I'm
> not totally understanding the request.
>
> -- Steve
>
>>>> +    ),
>>>> +
>>>> +        TP_printk("requested=0x%x denied=%d audited=%d result=%d
>>>> msg=%s",
>>>> +              __entry->requested, __entry->denied, __entry->audited,
>>>> __entry->result, __entry->msg
>>>> +              )


2020-07-30 17:20:19

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On Thu, 30 Jul 2020 19:05:49 +0200
peter enderborg <[email protected]> wrote:

> >> It should be a full structure with a lot of sub strings.  But that make is even more relevant.
> > So one event instance can have a list of strings recorded?
>
> Yes, it is a list very similar to a normal trace. But it is more generic.
>
> For example ino= is for filesystems that have inode, but for a
> violation that send a signal that make no sense at all.  Network
> addresses is in many cases not applicable. laddr= is only exist for
> for IP.
>
> So if you just print them it will look like:
>
> avc:  denied  { find } for interface=vendor.qti.hardware.perf::IPerf sid=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 pid=9164 scontext=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 tcontext=u:object_r:vendor_hal_perf_hwservice:s0 tclass=hwservice_manager permissive=0
>  avc:  denied  { execute } for  pid=13914 comm="ScionFrontendAp" path="/data/user_de/0/com.google.android.gms/app_chimera/m/00000002/oat/arm64/DynamiteLoader.odex" dev="sda77" ino=204967 scontext=u:r:platform_app:s0:c512,c768 tcontext=u:object_r:privapp_data_file:s0:c512,c768 tclass=file permissive=0 ppid=788 pcomm="main" pgid=13914 pgcomm="on.updatecenter"
>
> It omit the fields that are not used. Some parts are common some are not. So a correct format specification for trace will be problematic if there is no "optional" field indicator.

That's all quite noisy. What is the object of these changes? What
exactly are you trying to trace and why?

-- Steve

2020-07-30 19:15:57

by peter enderborg

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On 7/30/20 7:16 PM, Steven Rostedt wrote:
> On Thu, 30 Jul 2020 19:05:49 +0200
> peter enderborg <[email protected]> wrote:
>
>>>> It should be a full structure with a lot of sub strings.  But that make is even more relevant.
>>> So one event instance can have a list of strings recorded?
>> Yes, it is a list very similar to a normal trace. But it is more generic.
>>
>> For example ino= is for filesystems that have inode, but for a
>> violation that send a signal that make no sense at all.  Network
>> addresses is in many cases not applicable. laddr= is only exist for
>> for IP.
>>
>> So if you just print them it will look like:
>>
>> avc:  denied  { find } for interface=vendor.qti.hardware.perf::IPerf sid=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 pid=9164 scontext=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 tcontext=u:object_r:vendor_hal_perf_hwservice:s0 tclass=hwservice_manager permissive=0
>>  avc:  denied  { execute } for  pid=13914 comm="ScionFrontendAp" path="/data/user_de/0/com.google.android.gms/app_chimera/m/00000002/oat/arm64/DynamiteLoader.odex" dev="sda77" ino=204967 scontext=u:r:platform_app:s0:c512,c768 tcontext=u:object_r:privapp_data_file:s0:c512,c768 tclass=file permissive=0 ppid=788 pcomm="main" pgid=13914 pgcomm="on.updatecenter"
>>
>> It omit the fields that are not used. Some parts are common some are not. So a correct format specification for trace will be problematic if there is no "optional" field indicator.
> That's all quite noisy. What is the object of these changes? What
> exactly are you trying to trace and why?

It is noisy, and it have to be. it covers a lot of different areas.  One common problem is
to debug userspace applications regarding violations. You get the violation from the logs
and try to figure out what you did to cause it. With a trace point you can do much better
when combine with other traces. Having a the userspace stack is a very good way,
unfortunately  it does not work on that many architectures within trace.

What exactly are you doing with any trace? You collect data to analyse what's
going on. This is not different. Selinux do a specific thing, but is has lots of parameters.


> -- Steve


2020-07-30 19:30:41

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On Thu, 30 Jul 2020 21:12:39 +0200
peter enderborg <[email protected]> wrote:

> >> avc:  denied  { find } for interface=vendor.qti.hardware.perf::IPerf sid=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 pid=9164 scontext=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 tcontext=u:object_r:vendor_hal_perf_hwservice:s0 tclass=hwservice_manager permissive=0
> >>  avc:  denied  { execute } for  pid=13914 comm="ScionFrontendAp" path="/data/user_de/0/com.google.android.gms/app_chimera/m/00000002/oat/arm64/DynamiteLoader.odex" dev="sda77" ino=204967 scontext=u:r:platform_app:s0:c512,c768 tcontext=u:object_r:privapp_data_file:s0:c512,c768 tclass=file permissive=0 ppid=788 pcomm="main" pgid=13914 pgcomm="on.updatecenter"
> >>
> >> It omit the fields that are not used. Some parts are common some are not. So a correct format specification for trace will be problematic if there is no "optional" field indicator.
> > That's all quite noisy. What is the object of these changes? What
> > exactly are you trying to trace and why?
>
> It is noisy, and it have to be. it covers a lot of different areas.  One common problem is
> to debug userspace applications regarding violations. You get the violation from the logs
> and try to figure out what you did to cause it. With a trace point you can do much better
> when combine with other traces. Having a the userspace stack is a very good way,
> unfortunately  it does not work on that many architectures within trace.
>
> What exactly are you doing with any trace? You collect data to analyse what's
> going on. This is not different. Selinux do a specific thing, but is has lots of parameters.

Have you thought of adding multiple trace events with if statements
around them to decode each specific type of event?

Note, you can have a generic event that gets enabled by all the other
events via the "reg" and "unreg" part of TRACE_EVENT_FN(). Say its
called trace_avc, make a dummy trace_avc() call hat doesn't even need
to be called anywhere, it just needs to exist to get to the other trace
events.

Then have:

if (trace_avc_enabled()) {
if (event1)
trace_avc_req_event1();
if (event2)
trace_avc_req_event2();
[..]
}

The reason for the trace_avc_enabled() is because that's a static
branch, which is a nop when not enabled. When enabled, it is a jump to
the out of band if condition block that has all the other trace events.

-- Steve

2020-07-30 19:53:20

by peter enderborg

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

On 7/30/20 9:29 PM, Steven Rostedt wrote:
> On Thu, 30 Jul 2020 21:12:39 +0200
> peter enderborg <[email protected]> wrote:
>
>>>> avc:  denied  { find } for interface=vendor.qti.hardware.perf::IPerf sid=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 pid=9164 scontext=u:r:permissioncontroller_app:s0:c230,c256,c512,c768 tcontext=u:object_r:vendor_hal_perf_hwservice:s0 tclass=hwservice_manager permissive=0
>>>>  avc:  denied  { execute } for  pid=13914 comm="ScionFrontendAp" path="/data/user_de/0/com.google.android.gms/app_chimera/m/00000002/oat/arm64/DynamiteLoader.odex" dev="sda77" ino=204967 scontext=u:r:platform_app:s0:c512,c768 tcontext=u:object_r:privapp_data_file:s0:c512,c768 tclass=file permissive=0 ppid=788 pcomm="main" pgid=13914 pgcomm="on.updatecenter"
>>>>
>>>> It omit the fields that are not used. Some parts are common some are not. So a correct format specification for trace will be problematic if there is no "optional" field indicator.
>>> That's all quite noisy. What is the object of these changes? What
>>> exactly are you trying to trace and why?
>> It is noisy, and it have to be. it covers a lot of different areas.  One common problem is
>> to debug userspace applications regarding violations. You get the violation from the logs
>> and try to figure out what you did to cause it. With a trace point you can do much better
>> when combine with other traces. Having a the userspace stack is a very good way,
>> unfortunately  it does not work on that many architectures within trace.
>>
>> What exactly are you doing with any trace? You collect data to analyse what's
>> going on. This is not different. Selinux do a specific thing, but is has lots of parameters.
> Have you thought of adding multiple trace events with if statements
> around them to decode each specific type of event?

Yes. And I think class is good split point. But I think it will require
a few layers, but a is mostly data driven so I think it might be hard
to do it compile time.  I think a hybrid might be possible,
but it then we need some ugly part with a other separator than =,
or some escape seq to separate.

sort of "generc1=X generic2=Y variable1^x variable2^y" or

"generc1=X generic2=Y leftover=[variable1=x variable2=y]"

If there was a formal parameter tree we could maybe do some
generated printer. I don't think there are one, maybe Paul Moore or Stephen Smalley
can verify that.

 

> Note, you can have a generic event that gets enabled by all the other
> events via the "reg" and "unreg" part of TRACE_EVENT_FN(). Say its
> called trace_avc, make a dummy trace_avc() call hat doesn't even need
> to be called anywhere, it just needs to exist to get to the other trace
> events.
>
> Then have:
>
> if (trace_avc_enabled()) {
> if (event1)
> trace_avc_req_event1();
> if (event2)
> trace_avc_req_event2();
> [..]
> }
>
> The reason for the trace_avc_enabled() is because that's a static
> branch, which is a nop when not enabled. When enabled, it is a jump to
> the out of band if condition block that has all the other trace events.
>
> -- Steve


2020-07-31 11:09:52

by Thiébaud Weksteen

[permalink] [raw]
Subject: Re: [PATCH] RFC: selinux avc trace

Thanks Peter, this looks like a great start.

> Perhaps the two of you could work together to come up with a common
tracepoint that addresses both needs.

Agreed.

> 1 Filtering. Types goes to trace so we can put up a filter for contexts or type etc.

That's right. I think this is the main reason why we would need
further attributes in the trace event.

> 2 It tries also to cover non denies. And upon that you should be able to do coverage tools.
> I think many systems have a lot more rules that what is needed, but there is good way
> to find out what. A other way us to make a stat page for the rules, but this way connect to
> userspace and can be used for test cases.

This is a great idea too.

>> On the one hand, we don't need/want to duplicate the avc message
>> itself; we just need enough to be able to correlate them.
>> With respect to non-denials, SELinux auditallow statements can be used
>> to generate avc: granted messages that can be used to support coverage
>> tools although you can easily flood the logs that way. One other
> That is one reason to use trace.

Yes, that's right. I don't have any concern about the flooding here.
As Peter mentioned, trace is specially designed for this purpose.

On the patch, few things to note:

> ---
> +#include <linux/tracepoint.h>
> +TRACE_EVENT(avc_data,
> + TP_PROTO(u32 requested,
> + u32 denied,
> + u32 audited,
> + int result,
> + const char *msg
> + ),

I would not store the raw msg from avc. As we discussed, it is useful
to be able to match against the values we are seeing in the avc denial
message but these attributes should be simple (as opposed to
composite) so the filtering can easily be setup (see section 5.1 in
https://www.kernel.org/doc/Documentation/trace/events.txt). It makes
more sense extracting scontext and tcontext (for instance) which
allows for a precise filtering setup.

Here, I would also pass down the "struct selinux_audit_data" to avoid
a large list of arguments.

> +TRACE_EVENT(avc_req,
> + TP_PROTO(u32 requested,
> + u32 denied,
> + u32 audited,
> + int result,
> + const char *msg,
> + u32 ssid,
> + struct selinux_state *state
> + ),

I don't see that event being used later on. What was the intention here?

> +static int avc_dump_querys(struct selinux_state *state, char *ab, u32 ssid, u32 tsid, u16 tclass)
> +{
> + int rc;
> + char *scontext;
> + u32 scontext_len;
> + int rp;
> +
> + rc = security_sid_to_context(state,ssid, &scontext, &scontext_len);
> + if (rc)
> + rp = sprintf(ab, "ssid=%d", ssid);
> + else {
> + rp = sprintf(ab, "scontext=%s", scontext);
> + kfree(scontext);
> + }
> +
> + rc = security_sid_to_context(state, tsid, &scontext, &scontext_len);
> + if (rc)
> + rp +=sprintf(ab+rp, " tsid=%d", tsid);
> + else {
> + rp +=sprintf(ab+rp, " tcontext=%s", scontext);
> + kfree(scontext);
> + }
> +
> + BUG_ON(!tclass || tclass >= ARRAY_SIZE(secclass_map));
> + rp += sprintf(ab+rp, " tclass=%s", secclass_map[tclass-1].name);
> + return rp;
> +}

As I mentioned before, this is literally repeating the avc audit
message. We are better off storing the exact fields we are interested
in, so that the filtering is precise.

Thanks