2022-09-02 10:29:27

by Marco Elver

[permalink] [raw]
Subject: [PATCH] perf: Allow restricted kernel breakpoints on user addresses

Allow the creation of restricted breakpoint perf events that also fire
in the kernel (!exclude_kernel), if:

1. No sample information is requested; samples may contain IPs,
registers, or other information that may disclose kernel addresses.

2. The breakpoint (viz. data watchpoint) is on a user address.

The rules constrain the allowable perf events such that no sensitive
kernel information can be disclosed.

Despite no explicit kernel information disclosure, the following
questions may need answers:

1. Is obtaining information that the kernel accessed a particular
user's known memory location revealing new information?
Given the kernel's user space ABI, there should be no "surprise
accesses" to user space memory in the first place.

2. Does causing breakpoints on user memory accesses by the kernel
potentially impact timing in a sensitive way?
Since hardware breakpoints trigger regardless of the state of
perf_event_attr::exclude_kernel, but are filtered in the perf
subsystem, this possibility already exists independent of the
proposed change.

Signed-off-by: Marco Elver <[email protected]>
---

Changelog
~~~~~~~~~

v1:
* Rebase.

RFC: https://lkml.kernel.org/r/[email protected]
---
include/linux/perf_event.h | 8 +-------
kernel/events/core.c | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a784e055002e..907b0e3f1318 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1367,13 +1367,7 @@ static inline int perf_is_paranoid(void)
return sysctl_perf_event_paranoid > -1;
}

-static inline int perf_allow_kernel(struct perf_event_attr *attr)
-{
- if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
- return -EACCES;
-
- return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
-}
+extern int perf_allow_kernel(struct perf_event_attr *attr);

static inline int perf_allow_cpu(struct perf_event_attr *attr)
{
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2621fd24ad26..75f5705b6892 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3176,6 +3176,12 @@ static int perf_event_modify_attr(struct perf_event *event,
return -EOPNOTSUPP;
}

+ if (!event->attr.exclude_kernel) {
+ err = perf_allow_kernel(attr);
+ if (err)
+ return err;
+ }
+
WARN_ON_ONCE(event->ctx->parent_ctx);

mutex_lock(&event->child_mutex);
@@ -12037,6 +12043,38 @@ perf_check_permission(struct perf_event_attr *attr, struct task_struct *task)
return is_capable || ptrace_may_access(task, ptrace_mode);
}

+/*
+ * Check if unprivileged users are allowed to set up breakpoints on user
+ * addresses that also count when the kernel accesses them.
+ */
+static bool perf_allow_kernel_breakpoint(struct perf_event_attr *attr)
+{
+ if (attr->type != PERF_TYPE_BREAKPOINT)
+ return false;
+
+ /*
+ * The sample may contain IPs, registers, or other information that may
+ * disclose kernel addresses or timing information. Disallow any kind of
+ * additional sample information.
+ */
+ if (attr->sample_type)
+ return false;
+
+ /*
+ * Only allow kernel breakpoints on user addresses.
+ */
+ return access_ok((void __user *)(unsigned long)attr->bp_addr, attr->bp_len);
+}
+
+int perf_allow_kernel(struct perf_event_attr *attr)
+{
+ if (sysctl_perf_event_paranoid > 1 && !perfmon_capable() &&
+ !perf_allow_kernel_breakpoint(attr))
+ return -EACCES;
+
+ return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
+}
+
/**
* sys_perf_event_open - open a performance event, associate it to a task/cpu
*
--
2.37.2.789.g6183377224-goog


2022-09-05 16:08:38

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH] perf: Allow restricted kernel breakpoints on user addresses

On Fri, 2 Sept 2022 at 12:01, Marco Elver <[email protected]> wrote:
>
> Allow the creation of restricted breakpoint perf events that also fire
> in the kernel (!exclude_kernel), if:
>
> 1. No sample information is requested; samples may contain IPs,
> registers, or other information that may disclose kernel addresses.
>
> 2. The breakpoint (viz. data watchpoint) is on a user address.
>
> The rules constrain the allowable perf events such that no sensitive
> kernel information can be disclosed.
>
> Despite no explicit kernel information disclosure, the following
> questions may need answers:
>
> 1. Is obtaining information that the kernel accessed a particular
> user's known memory location revealing new information?
> Given the kernel's user space ABI, there should be no "surprise
> accesses" to user space memory in the first place.
>
> 2. Does causing breakpoints on user memory accesses by the kernel
> potentially impact timing in a sensitive way?
> Since hardware breakpoints trigger regardless of the state of
> perf_event_attr::exclude_kernel, but are filtered in the perf
> subsystem, this possibility already exists independent of the
> proposed change.

I don't see how this gives userspace any new information.
As you noted userspace already should know what userspace addresses
kernel accesses. Additionally since the breakpoint fires anyway (just
filtered out), the fact of it firing should be easily recoverable from
the timing side-channel already. So:

Acked-by: Dmitry Vyukov <[email protected]>


> Signed-off-by: Marco Elver <[email protected]>
> ---
>
> Changelog
> ~~~~~~~~~
>
> v1:
> * Rebase.
>
> RFC: https://lkml.kernel.org/r/[email protected]
> ---
> include/linux/perf_event.h | 8 +-------
> kernel/events/core.c | 38 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 39 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index a784e055002e..907b0e3f1318 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1367,13 +1367,7 @@ static inline int perf_is_paranoid(void)
> return sysctl_perf_event_paranoid > -1;
> }
>
> -static inline int perf_allow_kernel(struct perf_event_attr *attr)
> -{
> - if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
> - return -EACCES;
> -
> - return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
> -}
> +extern int perf_allow_kernel(struct perf_event_attr *attr);
>
> static inline int perf_allow_cpu(struct perf_event_attr *attr)
> {
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2621fd24ad26..75f5705b6892 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3176,6 +3176,12 @@ static int perf_event_modify_attr(struct perf_event *event,
> return -EOPNOTSUPP;
> }
>
> + if (!event->attr.exclude_kernel) {
> + err = perf_allow_kernel(attr);
> + if (err)
> + return err;
> + }
> +
> WARN_ON_ONCE(event->ctx->parent_ctx);
>
> mutex_lock(&event->child_mutex);
> @@ -12037,6 +12043,38 @@ perf_check_permission(struct perf_event_attr *attr, struct task_struct *task)
> return is_capable || ptrace_may_access(task, ptrace_mode);
> }
>
> +/*
> + * Check if unprivileged users are allowed to set up breakpoints on user
> + * addresses that also count when the kernel accesses them.
> + */
> +static bool perf_allow_kernel_breakpoint(struct perf_event_attr *attr)
> +{
> + if (attr->type != PERF_TYPE_BREAKPOINT)
> + return false;
> +
> + /*
> + * The sample may contain IPs, registers, or other information that may
> + * disclose kernel addresses or timing information. Disallow any kind of
> + * additional sample information.
> + */
> + if (attr->sample_type)
> + return false;
> +
> + /*
> + * Only allow kernel breakpoints on user addresses.
> + */
> + return access_ok((void __user *)(unsigned long)attr->bp_addr, attr->bp_len);
> +}
> +
> +int perf_allow_kernel(struct perf_event_attr *attr)
> +{
> + if (sysctl_perf_event_paranoid > 1 && !perfmon_capable() &&
> + !perf_allow_kernel_breakpoint(attr))
> + return -EACCES;
> +
> + return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
> +}
> +
> /**
> * sys_perf_event_open - open a performance event, associate it to a task/cpu
> *
> --
> 2.37.2.789.g6183377224-goog
>

2022-09-06 20:49:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf: Allow restricted kernel breakpoints on user addresses

On Fri, Sep 02, 2022 at 12:00:57PM +0200, Marco Elver wrote:
> Allow the creation of restricted breakpoint perf events that also fire
> in the kernel (!exclude_kernel), if:
>
> 1. No sample information is requested; samples may contain IPs,
> registers, or other information that may disclose kernel addresses.
>
> 2. The breakpoint (viz. data watchpoint) is on a user address.
>
> The rules constrain the allowable perf events such that no sensitive
> kernel information can be disclosed.
>
> Despite no explicit kernel information disclosure, the following
> questions may need answers:
>
> 1. Is obtaining information that the kernel accessed a particular
> user's known memory location revealing new information?
> Given the kernel's user space ABI, there should be no "surprise
> accesses" to user space memory in the first place.
>
> 2. Does causing breakpoints on user memory accesses by the kernel
> potentially impact timing in a sensitive way?
> Since hardware breakpoints trigger regardless of the state of
> perf_event_attr::exclude_kernel, but are filtered in the perf
> subsystem, this possibility already exists independent of the
> proposed change.
>

Changelog forgot to tell us why you want this :-)

I don't see any immediate concerns, but it's late so who knows..

2022-09-07 08:17:48

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH] perf: Allow restricted kernel breakpoints on user addresses

On Tue, 6 Sept 2022 at 22:38, Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Sep 02, 2022 at 12:00:57PM +0200, Marco Elver wrote:
> > Allow the creation of restricted breakpoint perf events that also fire
> > in the kernel (!exclude_kernel), if:
> >
> > 1. No sample information is requested; samples may contain IPs,
> > registers, or other information that may disclose kernel addresses.
> >
> > 2. The breakpoint (viz. data watchpoint) is on a user address.
> >
> > The rules constrain the allowable perf events such that no sensitive
> > kernel information can be disclosed.
> >
> > Despite no explicit kernel information disclosure, the following
> > questions may need answers:
> >
> > 1. Is obtaining information that the kernel accessed a particular
> > user's known memory location revealing new information?
> > Given the kernel's user space ABI, there should be no "surprise
> > accesses" to user space memory in the first place.
> >
> > 2. Does causing breakpoints on user memory accesses by the kernel
> > potentially impact timing in a sensitive way?
> > Since hardware breakpoints trigger regardless of the state of
> > perf_event_attr::exclude_kernel, but are filtered in the perf
> > subsystem, this possibility already exists independent of the
> > proposed change.
> >
>
> Changelog forgot to tell us why you want this :-)

Oops.

> I don't see any immediate concerns, but it's late so who knows..

Similar to motivation as
https://lore.kernel.org/all/[email protected]/:
Low-overhead error detectors that rely on detecting memory access via
breakpoints/watchpoints. For example for race detection, but also
things like data flow tracking.

By allowing in-kernel breakpoints on user addresses, we can detect
bugs that involve kernel accesses (e.g. for race detector, racy
read/write vs. syscall somewhere; or tracking data flow through
kernel).

Shall I go and send v2 with some motivation?

Thanks,
-- Marco

2022-09-07 12:54:20

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf: Allow restricted kernel breakpoints on user addresses

On Fri, Sep 02, 2022 at 12:00:57PM +0200, Marco Elver wrote:

> +/*
> + * Check if unprivileged users are allowed to set up breakpoints on user
> + * addresses that also count when the kernel accesses them.
> + */
> +static bool perf_allow_kernel_breakpoint(struct perf_event_attr *attr)
> +{
> + if (attr->type != PERF_TYPE_BREAKPOINT)
> + return false;
> +
> + /*
> + * The sample may contain IPs, registers, or other information that may
> + * disclose kernel addresses or timing information. Disallow any kind of
> + * additional sample information.
> + */
> + if (attr->sample_type)
> + return false;

This feels a bit weird; should that perhaps be is_sampling_event()?

> +
> + /*
> + * Only allow kernel breakpoints on user addresses.
> + */
> + return access_ok((void __user *)(unsigned long)attr->bp_addr, attr->bp_len);
> +}
> +
> +int perf_allow_kernel(struct perf_event_attr *attr)
> +{
> + if (sysctl_perf_event_paranoid > 1 && !perfmon_capable() &&
> + !perf_allow_kernel_breakpoint(attr))

I'm on the fence about this; one the one hand it feels weird to have a
breakpoint exception here and not a pmu specific callback for instance;
OTOH, leaving security policy like that up to pmu drivers sounds like a
really bad idea too.

Keep it as is I suppose, just me thinking out loud or so.

> + return -EACCES;
> +
> + return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
> +}


2022-09-08 08:08:46

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH] perf: Allow restricted kernel breakpoints on user addresses

On Wed, 7 Sept 2022 at 14:39, Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Sep 02, 2022 at 12:00:57PM +0200, Marco Elver wrote:
>
> > +/*
> > + * Check if unprivileged users are allowed to set up breakpoints on user
> > + * addresses that also count when the kernel accesses them.
> > + */
> > +static bool perf_allow_kernel_breakpoint(struct perf_event_attr *attr)
> > +{
> > + if (attr->type != PERF_TYPE_BREAKPOINT)
> > + return false;
> > +
> > + /*
> > + * The sample may contain IPs, registers, or other information that may
> > + * disclose kernel addresses or timing information. Disallow any kind of
> > + * additional sample information.
> > + */
> > + if (attr->sample_type)
> > + return false;
>
> This feels a bit weird; should that perhaps be is_sampling_event()?

is_sampling_event() just checks for sample_period. In fact, we still
want to set sample_period to get overflow events. That in itself is
not dangerous.

What's problematic is if the samples contain additional information,
which can be specified in sample_type. For example if PERF_SAMPLE_IP
is set, it might leak kernel IPs, and that's bad. Since it's safest to
disallow any kind of extra information, we just check if sample_type
is zero.

> > +
> > + /*
> > + * Only allow kernel breakpoints on user addresses.
> > + */
> > + return access_ok((void __user *)(unsigned long)attr->bp_addr, attr->bp_len);
> > +}
> > +
> > +int perf_allow_kernel(struct perf_event_attr *attr)
> > +{
> > + if (sysctl_perf_event_paranoid > 1 && !perfmon_capable() &&
> > + !perf_allow_kernel_breakpoint(attr))
>
> I'm on the fence about this; one the one hand it feels weird to have a
> breakpoint exception here and not a pmu specific callback for instance;
> OTOH, leaving security policy like that up to pmu drivers sounds like a
> really bad idea too.
>
> Keep it as is I suppose, just me thinking out loud or so.

Ack. I also think this should stay in core, as it's also easier to audit.