2020-06-05 08:33:00

by Marco Elver

Subject: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

While we lack a compiler attribute to add to noinstr that would disable
KCOV, make the KCOV runtime functions return if the caller is in a
noinstr section, and mark them noinstr.

Declare write_comp_data() as __always_inline to ensure it is inlined,
which also reduces stack usage and removes one extra call from the
fast-path.

In future, our compilers may provide an attribute to implement
__no_sanitize_coverage, which can then be added to noinstr, and the
checks added in this patch can be guarded by an #ifdef checking if the
compiler has such an attribute or not.

Signed-off-by: Marco Elver <[email protected]>
---
Applies to -tip only currently, because of the use of instrumentation.h
markers.

v3:
* Remove objtool hack, and instead properly mark __sanitizer_cov
functions as noinstr.
* Add comment about .entry.text.

v2: https://lkml.kernel.org/r/[email protected]
* Rewrite based on Peter's and Andrey's feedback -- v1 worked because we
got lucky. Let's not rely on luck, as it will be difficult to ensure the
same conditions remain true in future.

v1: https://lkml.kernel.org/r/[email protected]

Note: There are a set of KCOV patches from Andrey in -next:
https://lkml.kernel.org/r/[email protected] --
Git cleanly merges this patch with those patches, and no merge conflict
is expected.
---
kernel/kcov.c | 59 +++++++++++++++++++++++++++++++++++++++------------
1 file changed, 45 insertions(+), 14 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index 8accc9722a81..84cdc30d478e 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -6,6 +6,7 @@
#include <linux/compiler.h>
#include <linux/errno.h>
#include <linux/export.h>
+#include <linux/instrumentation.h>
#include <linux/types.h>
#include <linux/file.h>
#include <linux/fs.h>
@@ -24,6 +25,7 @@
#include <linux/refcount.h>
#include <linux/log2.h>
#include <asm/setup.h>
+#include <asm/sections.h>

#define kcov_debug(fmt, ...) pr_debug("%s: " fmt, __func__, ##__VA_ARGS__)

@@ -172,20 +174,38 @@ static notrace unsigned long canonicalize_ip(unsigned long ip)
return ip;
}

+/* Return true if @ip is within a noinstr section. */
+static __always_inline bool within_noinstr_section(unsigned long ip)
+{
+ /*
+ * Note: .entry.text is also considered noinstr, but for now, since all
+ * .entry.text code lives in .S files, these are never instrumented.
+ */
+ return (unsigned long)__noinstr_text_start <= ip &&
+ ip < (unsigned long)__noinstr_text_end;
+}
+
/*
* Entry point from instrumented code.
* This is called once per basic-block/edge.
*/
-void notrace __sanitizer_cov_trace_pc(void)
+void noinstr __sanitizer_cov_trace_pc(void)
{
struct task_struct *t;
unsigned long *area;
- unsigned long ip = canonicalize_ip(_RET_IP_);
+ unsigned long ip;
unsigned long pos;

+ if (unlikely(within_noinstr_section(_RET_IP_)))
+ return;
+
+ instrumentation_begin();
+
t = current;
if (!check_kcov_mode(KCOV_MODE_TRACE_PC, t))
- return;
+ goto out;
+
+ ip = canonicalize_ip(_RET_IP_);

area = t->kcov_area;
/* The first 64-bit word is the number of subsequent PCs. */
@@ -194,19 +214,27 @@ void notrace __sanitizer_cov_trace_pc(void)
area[pos] = ip;
WRITE_ONCE(area[0], pos);
}
+
+out:
+ instrumentation_end();
}
EXPORT_SYMBOL(__sanitizer_cov_trace_pc);

#ifdef CONFIG_KCOV_ENABLE_COMPARISONS
-static void notrace write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
+static __always_inline void write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
{
struct task_struct *t;
u64 *area;
u64 count, start_index, end_pos, max_pos;

+ if (unlikely(within_noinstr_section(ip)))
+ return;
+
+ instrumentation_begin();
+
t = current;
if (!check_kcov_mode(KCOV_MODE_TRACE_CMP, t))
- return;
+ goto out;

ip = canonicalize_ip(ip);

@@ -229,61 +257,64 @@ static void notrace write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
area[start_index + 3] = ip;
WRITE_ONCE(area[0], count + 1);
}
+
+out:
+ instrumentation_end();
}

-void notrace __sanitizer_cov_trace_cmp1(u8 arg1, u8 arg2)
+void noinstr __sanitizer_cov_trace_cmp1(u8 arg1, u8 arg2)
{
write_comp_data(KCOV_CMP_SIZE(0), arg1, arg2, _RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_cmp1);

-void notrace __sanitizer_cov_trace_cmp2(u16 arg1, u16 arg2)
+void noinstr __sanitizer_cov_trace_cmp2(u16 arg1, u16 arg2)
{
write_comp_data(KCOV_CMP_SIZE(1), arg1, arg2, _RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_cmp2);

-void notrace __sanitizer_cov_trace_cmp4(u32 arg1, u32 arg2)
+void noinstr __sanitizer_cov_trace_cmp4(u32 arg1, u32 arg2)
{
write_comp_data(KCOV_CMP_SIZE(2), arg1, arg2, _RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_cmp4);

-void notrace __sanitizer_cov_trace_cmp8(u64 arg1, u64 arg2)
+void noinstr __sanitizer_cov_trace_cmp8(u64 arg1, u64 arg2)
{
write_comp_data(KCOV_CMP_SIZE(3), arg1, arg2, _RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_cmp8);

-void notrace __sanitizer_cov_trace_const_cmp1(u8 arg1, u8 arg2)
+void noinstr __sanitizer_cov_trace_const_cmp1(u8 arg1, u8 arg2)
{
write_comp_data(KCOV_CMP_SIZE(0) | KCOV_CMP_CONST, arg1, arg2,
_RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_const_cmp1);

-void notrace __sanitizer_cov_trace_const_cmp2(u16 arg1, u16 arg2)
+void noinstr __sanitizer_cov_trace_const_cmp2(u16 arg1, u16 arg2)
{
write_comp_data(KCOV_CMP_SIZE(1) | KCOV_CMP_CONST, arg1, arg2,
_RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_const_cmp2);

-void notrace __sanitizer_cov_trace_const_cmp4(u32 arg1, u32 arg2)
+void noinstr __sanitizer_cov_trace_const_cmp4(u32 arg1, u32 arg2)
{
write_comp_data(KCOV_CMP_SIZE(2) | KCOV_CMP_CONST, arg1, arg2,
_RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_const_cmp4);

-void notrace __sanitizer_cov_trace_const_cmp8(u64 arg1, u64 arg2)
+void noinstr __sanitizer_cov_trace_const_cmp8(u64 arg1, u64 arg2)
{
write_comp_data(KCOV_CMP_SIZE(3) | KCOV_CMP_CONST, arg1, arg2,
_RET_IP_);
}
EXPORT_SYMBOL(__sanitizer_cov_trace_const_cmp8);

-void notrace __sanitizer_cov_trace_switch(u64 val, u64 *cases)
+void noinstr __sanitizer_cov_trace_switch(u64 val, u64 *cases)
{
u64 i;
u64 count = cases[0];
--
2.27.0.278.ge193c7cf3a9-goog


2020-06-05 11:00:26

by Dmitry Vyukov

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <[email protected]> wrote:
>
> While we lack a compiler attribute to add to noinstr that would disable
> KCOV, make the KCOV runtime functions return if the caller is in a
> noinstr section, and mark them noinstr.
>
> Declare write_comp_data() as __always_inline to ensure it is inlined,
> which also reduces stack usage and removes one extra call from the
> fast-path.
>
> In future, our compilers may provide an attribute to implement
> __no_sanitize_coverage, which can then be added to noinstr, and the
> checks added in this patch can be guarded by an #ifdef checking if the
> compiler has such an attribute or not.

Adding noinstr attribute to instrumentation callbacks looks fine to me.

But I don't understand the within_noinstr_section part.
As the cover letter mentions, kcov callbacks don't do much, and we
already have them inserted and called. What is the benefit of bailing
out a bit earlier rather than letting them run to completion?
Is the only reason the potential faults on access to the vmalloc-ed
region? If so, I think the right approach is to eliminate the faults
(if that's possible). We don't want faults for other reasons: they
caused recursion on ARM, and these callbacks are inserted into lots of
sensitive code, so I am not sure checking only noinstr will resolve
all potential issues. E.g. we may get a deadlock if we fault from code
that holds some lock, or we could still get that recursion on ARM (I
don't think all of the page fault handling code is noinstr).
The fact that we started getting faults again (did we?) looks like a
regression related to remote KCOV.
Andrey, Mark, do you know if it's possible to pre-fault these areas?
The difference is that they run in the context of kernel threads. Maybe
we could do a kcov_fault_in_area() when we activate and remove KCOV on
an area? This way we get all faults in a very well-defined place (one
that is not noinstr and holds known locks).



> [...]

2020-06-05 12:08:54

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote:
> On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <[email protected]> wrote:
> >
> > [...]
>
> Adding noinstr attribute to instrumentation callbacks looks fine to me.
>
> But I don't understand the within_noinstr_section part.
> As the cover letter mentions, kcov callbacks don't do much and we
> already have it inserted and called. What is the benefit of bailing
> out a bit earlier rather than letting it run to completion?
> Is the only reason for potential faults on access to the vmalloc-ed
> region?

Vmalloc faults (on x86, the only arch that had them, IIRC) are gone as
of this merge window.

The reason I mentioned them is that it is important that they are
gone, that this patch hard-relies on them being gone, and that the
changelog didn't call that out.

There is one additional issue though; you can set a hardware breakpoint
on vmalloc space, and that would trigger #DB, and then we'd be dead if
we were already in #DB (IST recursion FTW).

And that is not something you can trivially fix, because you can set the
breakpoint before the allocation (or perhaps on a previous allocation).

That said, we already have this problem with task_struct (and
task_stack). IIRC Andy wants to fix the task_stack issue by making all
of noinstr run on the entry stack, but we're not there yet.

There are no good proposals for random allocations like task_struct or,
in your case, kcov_area.

> Andrey, Mark, do you know if it's possible to pre-fault these areas?

Under the assumption that vmalloc faults are still a thing:

You cannot pre-fault the remote area: kernel threads use the mm of the
previous user task, and there is no guarantee that that mm will have
taken the vmalloc fault.

2020-06-05 13:28:02

by Andrey Konovalov

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Fri, Jun 5, 2020 at 2:04 PM Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote:
> > On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <[email protected]> wrote:
> > >
> > > [...]
> >
> > Adding noinstr attribute to instrumentation callbacks looks fine to me.
> >
> > But I don't understand the within_noinstr_section part.
> > As the cover letter mentions, kcov callbacks don't do much and we
> > already have it inserted and called. What is the benefit of bailing
> > out a bit earlier rather than letting it run to completion?
> > Is the only reason for potential faults on access to the vmalloc-ed
> > region?
>
> Vmalloc faults (on x86, the only arch that had them IIRC) are gone, per
> this merge window.
>
> The reason I mentioned them is because it is important that they are
> gone, and that this hard relies on them being gone, and the patch didn't
> call that out.
>
> There is one additional issue though; you can set hardware breakpoint on
> vmalloc space, and that would trigger #DB and then we'd be dead when we
> were already in #DB (IST recursion FTW).
>
> And that is not something you can trivially fix, because you can set the
> breakpoint before the allocation (or perhaps on a previous allocation).
>
> That said; we already have this problem with task_struct (and
> task_stack). IIRC Andy wants to fix the task_stack issue by making all
> of noinstr run on the entry stack, but we're not there yet.
>
> There are no good proposals for random allocations like task_struct or
> in your case kcov_area.
>
> > Andrey, Mark, do you know if it's possible to pre-fault these areas?
>
> Under the assumption that vmalloc faults are still a thing:
>
> You cannot pre-fault the remote area thing, kernel threads use the mm of
> the previous user task, and there is no guarantee that mm will have had
> the vmalloc fault.

To clarify this part, as far as I understand it: even if we try to
prefault the whole remote area each time kcov_remote_start() is called,
then (leaving aside the performance impact) the kernel thread can be
rescheduled between kcov_remote_start() and kcov_remote_stop(), and it
might then be running with a different mm than the one that was used
when kcov_remote_start() happened.

2020-06-07 09:39:48

by Dmitry Vyukov

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Fri, Jun 5, 2020 at 3:25 PM 'Andrey Konovalov' via kasan-dev
<[email protected]> wrote:
> > On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote:
> > > On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <[email protected]> wrote:
> > > >
> > > > [...]
> > >
> > > Adding noinstr attribute to instrumentation callbacks looks fine to me.
> > >
> > > But I don't understand the within_noinstr_section part.
> > > As the cover letter mentions, kcov callbacks don't do much and we
> > > already have it inserted and called. What is the benefit of bailing
> > > out a bit earlier rather than letting it run to completion?
> > > Is the only reason for potential faults on access to the vmalloc-ed
> > > region?
> >
> > Vmalloc faults (on x86, the only arch that had them IIRC) are gone, per
> > this merge window.
> >
> > The reason I mentioned them is because it is important that they are
> > gone, and that this hard relies on them being gone, and the patch didn't
> > call that out.
> >
> > There is one additional issue though; you can set hardware breakpoint on
> > vmalloc space, and that would trigger #DB and then we'd be dead when we
> > were already in #DB (IST recursion FTW).
> >
> > And that is not something you can trivially fix, because you can set the
> > breakpoint before the allocation (or perhaps on a previous allocation).
> >
> > That said; we already have this problem with task_struct (and
> > task_stack). IIRC Andy wants to fix the task_stack issue by making all
> > of noinstr run on the entry stack, but we're not there yet.
> >
> > There are no good proposals for random allocations like task_struct or
> > in your case kcov_area.
> >
> > > Andrey, Mark, do you know if it's possible to pre-fault these areas?
> >
> > Under the assumption that vmalloc faults are still a thing:
> >
> > You cannot pre-fault the remote area thing, kernel threads use the mm of
> > the previous user task, and there is no guarantee that mm will have had
> > the vmalloc fault.
>
> To clarify this part AFAIU it, even if we try to prefault the whole
> remote area each time kcov_remote_start() is called, then (let alone
> the performance impact) the kernel thread can be rescheduled between
> kcov_remote_start() and kcov_remote_stop(), and then it might be
> running with a different mm than the one that was used when
> kcov_remote_start() happened.

Ugh, this is nasty. But this has also gone, which I am happy about :)

The reason I am looking at this is that, with coverage instrumentation,
__sanitizer_cov_trace_pc is the hottest function in the kernel, and we
are adding additional branches to it.

Can we touch at least some per-cpu data within noinstr code?
If yes, we could try to fold the new check into the existing
in_task()/in_serving_softirq() check.
If not, it would be useful to have a comment clarifying that the
within_noinstr_section check must happen before we touch anything
else.

I assume objtool can now also detect all violations. How bad is it now
without the within_noinstr_section check? I am assuming we mark noinstr
functions as not to be instrumented, but we are getting some stray
instrumentation from inlined functions or something, right? How many
violations are there? Is it fixable/unfixable? Marco, do you know the
list, or could you please collect the list of violations?

Is there any config that disables #DB? We could well disable it on
syzbot; I think we already disable some production hardening/debugging
configs which are not too useful for a testing setup.
E.g. we support RANDOMIZE_BASE, no problem, but if one disables it
(which we do), this becomes a no-op:

#ifdef CONFIG_RANDOMIZE_BASE
	ip -= kaslr_offset();
#endif
	return ip;
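
[Editorial sketch, not from the thread: the same fold-away effect can be
had without the #ifdef by using an IS_ENABLED-style compile-time
condition. Modeled in user space below; RANDOMIZE_BASE_ENABLED and the
kaslr_offset() stub are stand-ins, simplified from the kernel's
IS_ENABLED() macro in include/linux/kconfig.h.]

```c
#include <assert.h>

/* Stand-ins for the Kconfig symbol and the kernel's kaslr_offset(). */
#define RANDOMIZE_BASE_ENABLED 0
static unsigned long kaslr_offset(void) { return 0x1000000UL; }

static unsigned long canonicalize_ip(unsigned long ip)
{
	/* The condition is a compile-time constant, so the subtraction
	 * folds away entirely when the config is off, matching the
	 * behavior of the #ifdef version. */
	if (RANDOMIZE_BASE_ENABLED)
		ip -= kaslr_offset();
	return ip;
}
```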

2020-06-08 07:51:29

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Sun, 7 Jun 2020 at 11:37, Dmitry Vyukov <[email protected]> wrote:
>
> On Fri, Jun 5, 2020 at 3:25 PM 'Andrey Konovalov' via kasan-dev
> <[email protected]> wrote:
> > > On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote:
> > > > On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <[email protected]> wrote:
> > > > >
> > > > > [...]
> > > >
> > > > Adding noinstr attribute to instrumentation callbacks looks fine to me.
> > > >
> > > > But I don't understand the within_noinstr_section part.
> > > > As the cover letter mentions, kcov callbacks don't do much and we
> > > > already have it inserted and called. What is the benefit of bailing
> > > > out a bit earlier rather than letting it run to completion?
> > > > Is the only reason for potential faults on access to the vmalloc-ed
> > > > region?
> > >
> > > Vmalloc faults (on x86, the only arch that had them IIRC) are gone, per
> > > this merge window.
> > >
> > > The reason I mentioned them is because it is important that they are
> > > gone, and that this hard relies on them being gone, and the patch didn't
> > > call that out.
> > >
> > > There is one additional issue though; you can set hardware breakpoint on
> > > vmalloc space, and that would trigger #DB and then we'd be dead when we
> > > were already in #DB (IST recursion FTW).
> > >
> > > And that is not something you can trivially fix, because you can set the
> > > breakpoint before the allocation (or perhaps on a previous allocation).
> > >
> > > That said; we already have this problem with task_struct (and
> > > task_stack). IIRC Andy wants to fix the task_stack issue by making all
> > > of noinstr run on the entry stack, but we're not there yet.
> > >
> > > There are no good proposals for random allocations like task_struct or
> > > in your case kcov_area.
> > >
> > > > Andrey, Mark, do you know if it's possible to pre-fault these areas?
> > >
> > > Under the assumption that vmalloc faults are still a thing:
> > >
> > > You cannot pre-fault the remote area thing, kernel threads use the mm of
> > > the previous user task, and there is no guarantee that mm will have had
> > > the vmalloc fault.
> >
> > To clarify this part AFAIU it, even if we try to prefault the whole
> > remote area each time kcov_remote_start() is called, then (let alone
> > the performance impact) the kernel thread can be rescheduled between
> > kcov_remote_start() and kcov_remote_stop(), and then it might be
> > running with a different mm than the one that was used when
> > kcov_remote_start() happened.
>
> Ugh, this is nasty. But this has also gone, which I am happy about :)
>
> Why I am looking at this is because with coverage instrumentation
> __sanitizer_cov_trace_pc is the hottest function in the kernel and we
> are adding additional branches to it.
>
> Can we touch at least some per-cpu data within noinstr code?
> If yes, we could try to affect the existing
> in_task()/in_serving_softirq() check.
> If not, it would be useful to have a comment clarifying that
> within_noinstr_section check must happen before we touch anything
> else.

I don't think this will get us anywhere. If anything, this will require
introducing code outside KCOV, and I think that makes the whole
situation even worse. My guess is also that we can't even read per-CPU
data, but Peter would have to comment on this.

> I assume objtool can now also detect all violations. How bad is it now
> without within_noinstr_section check? I am assuming we marking noinstr
> functions to not be instrumented, but we are getting some stray
> instrumentation from inlined functions or something, right? How many
> are there? Is it fixable/unfixable? Marco, do you know the list, or
> could you please collect the list of violations?

It's everywhere. We cannot mark noinstr functions to not be
instrumented by KCOV/-fsanitize-coverage, because no compiler provides
an attribute to do so. GCC doesn't have
__attribute__((no_sanitize_coverage)) and Clang doesn't have
__attribute__((no_sanitize("coverage"))), and therefore we can't have
__no_sanitize_coverage.

My plan would be to now go and implement the attributes, at the very
least in Clang. Then what we can do is make within_noinstr_section a
no-op (just return false) if we have CONFIG_CC_HAS_NOSANITIZE_COVERAGE
or something.

Unfortunately, without this patch, we won't have a reliable kernel
with KCOV until we get compiler support.

The thing is that this slowdown is temporary if we add the attributes
to the compiler.
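
[Editorial sketch, not from the thread: roughly how the proposed guard
could look. CONFIG_CC_HAS_NOSANITIZE_COVERAGE is only the name floated
in this mail, not a merged Kconfig symbol, and the section bounds are
stubbed here; in the kernel they come from <asm/sections.h>.]

```c
#include <assert.h>
#include <stdbool.h>

/* Stub for the .noinstr.text section bounds. */
static const char noinstr_text[64];
static const char *const __noinstr_text_start = noinstr_text;
static const char *const __noinstr_text_end =
	noinstr_text + sizeof(noinstr_text);

static inline bool within_noinstr_section(unsigned long ip)
{
#ifdef CONFIG_CC_HAS_NOSANITIZE_COVERAGE
	/* The compiler already omits instrumentation in noinstr code,
	 * so the runtime range check compiles to nothing. */
	return false;
#else
	return (unsigned long)__noinstr_text_start <= ip &&
	       ip < (unsigned long)__noinstr_text_end;
#endif
}
```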



2020-06-08 08:00:08

by Dmitry Vyukov

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 8, 2020 at 9:49 AM Marco Elver <[email protected]> wrote:
> > On Fri, Jun 5, 2020 at 3:25 PM 'Andrey Konovalov' via kasan-dev
> > <[email protected]> wrote:
> > > > On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote:
> > > > > On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <[email protected]> wrote:
> > > > > >
> > > > > > [...]
> > > > >
> > > > > Adding noinstr attribute to instrumentation callbacks looks fine to me.
> > > > >
> > > > > But I don't understand the within_noinstr_section part.
> > > > > As the cover letter mentions, kcov callbacks don't do much and we
> > > > > already have it inserted and called. What is the benefit of bailing
> > > > > out a bit earlier rather than letting it run to completion?
> > > > > Is the only reason for potential faults on access to the vmalloc-ed
> > > > > region?
> > > >
> > > > Vmalloc faults (on x86, the only arch that had them IIRC) are gone, per
> > > > this merge window.
> > > >
> > > > The reason I mentioned them is because it is important that they are
> > > > gone, and that this hard relies on them being gone, and the patch didn't
> > > > call that out.
> > > >
> > > > There is one additional issue though; you can set hardware breakpoint on
> > > > vmalloc space, and that would trigger #DB and then we'd be dead when we
> > > > were already in #DB (IST recursion FTW).
> > > >
> > > > And that is not something you can trivially fix, because you can set the
> > > > breakpoint before the allocation (or perhaps on a previous allocation).
> > > >
> > > > That said; we already have this problem with task_struct (and
> > > > task_stack). IIRC Andy wants to fix the task_stack issue by making all
> > > > of noinstr run on the entry stack, but we're not there yet.
> > > >
> > > > There are no good proposals for random allocations like task_struct or
> > > > in your case kcov_area.
> > > >
> > > > > Andrey, Mark, do you know if it's possible to pre-fault these areas?
> > > >
> > > > Under the assumption that vmalloc faults are still a thing:
> > > >
> > > > You cannot pre-fault the remote area thing: kernel threads use the mm of
> > > > the previous user task, and there is no guarantee that mm will have had
> > > > the vmalloc fault.
> > >
> > > To clarify this part, as far as I understand it: even if we try to
> > > prefault the whole remote area each time kcov_remote_start() is called,
> > > then (let alone
> > > the performance impact) the kernel thread can be rescheduled between
> > > kcov_remote_start() and kcov_remote_stop(), and then it might be
> > > running with a different mm than the one that was used when
> > > kcov_remote_start() happened.
> >
> > Ugh, this is nasty. But this has also gone, which I am happy about :)
> >
> > Why I am looking at this is because with coverage instrumentation
> > __sanitizer_cov_trace_pc is the hottest function in the kernel and we
> > are adding additional branches to it.
> >
> > Can we touch at least some per-cpu data within noinstr code?
> > If yes, we could try to affect the existing
> > in_task()/in_serving_softirq() check.
> > If not, it would be useful to have a comment clarifying that
> > within_noinstr_section check must happen before we touch anything
> > else.
>
> I don't think this will get us anywhere. If anything this will require
> introducing code outside KCOV, and I think that makes the whole
> situation even worse. My guess is that we also can't read per-CPU
> data, but Peter would have to comment on this.
>
> > I assume objtool can now also detect all violations. How bad is it now
> > without the within_noinstr_section check? I am assuming we mark noinstr
> > functions as not to be instrumented, but we are getting some stray
> > instrumentation from inlined functions or something, right? How many
> > are there? Is it fixable/unfixable? Marco, do you know the list, or
> > could you please collect the list of violations?
>
> It's everywhere. We cannot mark noinstr functions to not be
> instrumented by KCOV/-fsanitize-coverage, because no compiler provides
> an attribute to do so. GCC doesn't have
> __attribute__((no_sanitize_coverage)) and Clang doesn't have
> __attribute__((no_sanitize("coverage"))), and therefore we can't have
> __no_sanitize_coverage.
>
> My plan would be to now go and implement the attributes, at the very
> least in Clang. Then what we can do is make within_noinstr_section a
> noop (just return false) if we have CONFIG_CC_HAS_NOSANITIZE_COVERAGE
> or something.
>
> Unfortunately, without this patch, we won't have a reliable kernel
> with KCOV until we get compiler support.
>
> The thing is that this slowdown is temporary if we add the attributes
> to the compiler.

As a crazy idea: is it possible to employ objtool (linker script?) to
rewrite all coverage calls to nops in the noinstr section? Or relocate
to nop function?
What we are trying to do is very static, it _should_ have been done
during build. We don't have means in existing _compilers_ to do this,
but maybe we could do it elsewhere during build?...


> > Is there any config that disables #DB? We could well disable it on
> > syzbot, I think we already disable some production hardening/debugging
> > configs, which are not too useful for a testing setup.
> > E.g. we support RANDOMIZE_BASE, no problem, but if one disables it
> > (which we do), that becomes no-op:
> >
> > #ifdef CONFIG_RANDOMIZE_BASE
> > ip -= kaslr_offset();
> > #endif
> > return ip;

2020-06-08 11:03:38

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:

> As a crazy idea: is it possible to employ objtool (linker script?) to
> rewrite all coverage calls to nops in the noinstr section? Or relocate
> to nop function?
> What we are trying to do is very static, it _should_ have been done
> during build. We don't have means in existing _compilers_ to do this,
> but maybe we could do it elsewhere during build?...

Let me try and figure out how to make objtool actually rewrite code.

2020-06-11 21:58:00

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 08, 2020 at 01:01:08PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:
>
> > As a crazy idea: is it possible to employ objtool (linker script?) to
> > rewrite all coverage calls to nops in the noinstr section? Or relocate
> > to nop function?
> > What we are trying to do is very static, it _should_ have been done
> > during build. We don't have means in existing _compilers_ to do this,
> > but maybe we could do it elsewhere during build?...
>
> Let me try and figure out how to make objtool actually rewrite code.

The below is quite horrific but seems to sorta work.


It turns this:

12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4

Into this:

12: 90 nop
13: 90 nop
13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
14: 90 nop
15: 90 nop
16: 90 nop


I'll have to dig around a little more to see if I can't get rid of the
relocation entirely. Also, I need to steal better arch_nop_insn() from
the kernel :-)

---
tools/objtool/arch.h | 2 ++
tools/objtool/arch/x86/decode.c | 24 ++++++++++++++++++++++
tools/objtool/check.c | 15 +++++++++++++-
tools/objtool/elf.c | 45 ++++++++++++++++++++++++++++++++++++++++-
tools/objtool/elf.h | 11 ++++++++--
5 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index eda15a5a285e..3c5967748abb 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -84,4 +84,6 @@ unsigned long arch_jump_destination(struct instruction *insn);

unsigned long arch_dest_rela_offset(int addend);

+const char *arch_nop_insn(int len);
+
#endif /* _ARCH_H */
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 4b504fc90bbb..b615c32e21db 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -565,3 +565,27 @@ void arch_initial_func_cfi_state(struct cfi_init_state *state)
state->regs[16].base = CFI_CFA;
state->regs[16].offset = -8;
}
+
+const char *arch_nop_insn(int len)
+{
+ static const char insn[16] = {
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ };
+
+ return insn;
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5fbb90a80d23..487b4dc3d122 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -765,6 +765,17 @@ static int add_call_destinations(struct objtool_file *file)
} else
insn->call_dest = rela->sym;

+ if (insn->sec->noinstr &&
+ !strncmp(insn->call_dest->name, "__sanitizer_cov_", 16)) {
+ if (rela)
+ elf_write_rela(file->elf, rela);
+
+ elf_write_insn(file->elf, insn->sec,
+ insn->offset, insn->len,
+ arch_nop_insn(insn->len));
+ insn->type = INSN_NOP;
+ }
+
/*
* Whatever stack impact regular CALLs have, should be undone
* by the RETURN of the called function.
@@ -2802,11 +2813,13 @@ int check(const char *_objname, bool orc)
if (ret < 0)
goto out;

+ }
+
+ if (file.elf->changed) {
ret = elf_write(file.elf);
if (ret < 0)
goto out;
}
-
out:
if (ret < 0) {
/*
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 84225679f96d..705582729374 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -525,6 +525,7 @@ static int read_relas(struct elf *elf)
return -1;
}

+ rela->idx = i;
rela->type = GELF_R_TYPE(rela->rela.r_info);
rela->addend = rela->rela.r_addend;
rela->offset = rela->rela.r_offset;
@@ -713,6 +714,8 @@ struct section *elf_create_section(struct elf *elf, const char *name,
elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
elf_hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));

+ elf->changed = true;
+
return sec;
}

@@ -779,7 +782,43 @@ int elf_rebuild_rela_section(struct section *sec)
return 0;
}

-int elf_write(const struct elf *elf)
+int elf_write_insn(struct elf *elf, struct section *sec,
+ unsigned long offset, unsigned int len,
+ const char *insn)
+{
+ Elf_Data *data = sec->data;
+
+ if (data->d_type != ELF_T_BYTE || data->d_off) {
+ printf("ponies\n");
+ return -1;
+ }
+
+ memcpy(sec->data->d_buf + offset, insn, len);
+
+ elf_flagdata(data, ELF_C_SET, ELF_F_DIRTY);
+
+ sec->changed = true;
+ elf->changed = true;
+
+ return 0;
+}
+
+int elf_write_rela(struct elf *elf, struct rela *rela)
+{
+ struct section *sec = rela->sec;
+
+ rela->rela.r_info = 0;
+ rela->rela.r_addend = 0;
+
+ gelf_update_rela(sec->data, rela->idx, &rela->rela);
+
+ sec->changed = true;
+ elf->changed = true;
+
+ return 0;
+}
+
+int elf_write(struct elf *elf)
{
struct section *sec;
Elf_Scn *s;
@@ -796,6 +835,8 @@ int elf_write(const struct elf *elf)
WARN_ELF("gelf_update_shdr");
return -1;
}
+
+ sec->changed = false;
}
}

@@ -808,6 +849,8 @@ int elf_write(const struct elf *elf)
return -1;
}

+ elf->changed = false;
+
return 0;
}

diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index f4fe1d6ea392..4a3fe4f455c5 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -64,9 +64,10 @@ struct rela {
GElf_Rela rela;
struct section *sec;
struct symbol *sym;
- unsigned int type;
unsigned long offset;
+ unsigned int type;
int addend;
+ int idx;
bool jump_table_start;
};

@@ -76,6 +77,7 @@ struct elf {
Elf *elf;
GElf_Ehdr ehdr;
int fd;
+ bool changed;
char *name;
struct list_head sections;
DECLARE_HASHTABLE(symbol_hash, ELF_HASH_BITS);
@@ -118,7 +120,7 @@ struct elf *elf_open_read(const char *name, int flags);
struct section *elf_create_section(struct elf *elf, const char *name, size_t entsize, int nr);
struct section *elf_create_rela_section(struct elf *elf, struct section *base);
void elf_add_rela(struct elf *elf, struct rela *rela);
-int elf_write(const struct elf *elf);
+int elf_write(struct elf *elf);
void elf_close(struct elf *elf);

struct section *find_section_by_name(const struct elf *elf, const char *name);
@@ -132,6 +134,11 @@ struct rela *find_rela_by_dest_range(const struct elf *elf, struct section *sec,
struct symbol *find_func_containing(struct section *sec, unsigned long offset);
int elf_rebuild_rela_section(struct section *sec);

+int elf_write_rela(struct elf *elf, struct rela *rela);
+int elf_write_insn(struct elf *elf, struct section *sec,
+ unsigned long offset, unsigned int len,
+ const char *insn);
+
#define for_each_sec(file, sec) \
list_for_each_entry(sec, &file->elf->sections, list)


2020-06-11 22:00:33

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Thu, Jun 11, 2020 at 11:55:38PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 08, 2020 at 01:01:08PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:
> >
> > > As a crazy idea: is it possible to employ objtool (linker script?) to
> > > rewrite all coverage calls to nops in the noinstr section? Or relocate
> > > to nop function?
> > > What we are trying to do is very static, it _should_ have been done
> > > during build. We don't have means in existing _compilers_ to do this,
> > > but maybe we could do it elsewhere during build?...
> >
> > Let me try and figure out how to make objtool actually rewrite code.
>
> The below is quite horrific but seems to sorta work.
>
>
> It turns this:
>
> 12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
> 13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4
>
> Into this:
>
> 12: 90 nop
> 13: 90 nop
> 13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
> 14: 90 nop
> 15: 90 nop
> 16: 90 nop
>
>
> I'll have to dig around a little more to see if I can't get rid of the
> relocation entirely. Also, I need to steal better arch_nop_insn() from
> the kernel :-)

Damn, paste buffer confusion, this is the right version.

---
tools/objtool/arch.h | 2 ++
tools/objtool/arch/x86/decode.c | 24 ++++++++++++++++++++++
tools/objtool/check.c | 17 +++++++++++++++-
tools/objtool/elf.c | 45 ++++++++++++++++++++++++++++++++++++++++-
tools/objtool/elf.h | 11 ++++++++--
5 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/tools/objtool/arch.h b/tools/objtool/arch.h
index eda15a5a285e..3c5967748abb 100644
--- a/tools/objtool/arch.h
+++ b/tools/objtool/arch.h
@@ -84,4 +84,6 @@ unsigned long arch_jump_destination(struct instruction *insn);

unsigned long arch_dest_rela_offset(int addend);

+const char *arch_nop_insn(int len);
+
#endif /* _ARCH_H */
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 4b504fc90bbb..b615c32e21db 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -565,3 +565,27 @@ void arch_initial_func_cfi_state(struct cfi_init_state *state)
state->regs[16].base = CFI_CFA;
state->regs[16].offset = -8;
}
+
+const char *arch_nop_insn(int len)
+{
+ static const char insn[16] = {
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ 0x90,
+ };
+
+ return insn;
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5fbb90a80d23..6f5e9c9465cb 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -765,6 +765,19 @@ static int add_call_destinations(struct objtool_file *file)
} else
insn->call_dest = rela->sym;

+ if (insn->sec->noinstr &&
+ !strncmp(insn->call_dest->name, "__sanitizer_cov_", 16)) {
+ if (rela) {
+ rela->type = R_X86_64_NONE;
+ elf_write_rela(file->elf, rela);
+ }
+
+ elf_write_insn(file->elf, insn->sec,
+ insn->offset, insn->len,
+ arch_nop_insn(insn->len));
+ insn->type = INSN_NOP;
+ }
+
/*
* Whatever stack impact regular CALLs have, should be undone
* by the RETURN of the called function.
@@ -2802,11 +2815,13 @@ int check(const char *_objname, bool orc)
if (ret < 0)
goto out;

+ }
+
+ if (file.elf->changed) {
ret = elf_write(file.elf);
if (ret < 0)
goto out;
}
-
out:
if (ret < 0) {
/*
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 84225679f96d..6f8021a5076a 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -525,6 +525,7 @@ static int read_relas(struct elf *elf)
return -1;
}

+ rela->idx = i;
rela->type = GELF_R_TYPE(rela->rela.r_info);
rela->addend = rela->rela.r_addend;
rela->offset = rela->rela.r_offset;
@@ -713,6 +714,8 @@ struct section *elf_create_section(struct elf *elf, const char *name,
elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
elf_hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));

+ elf->changed = true;
+
return sec;
}

@@ -779,7 +782,43 @@ int elf_rebuild_rela_section(struct section *sec)
return 0;
}

-int elf_write(const struct elf *elf)
+int elf_write_insn(struct elf *elf, struct section *sec,
+ unsigned long offset, unsigned int len,
+ const char *insn)
+{
+ Elf_Data *data = sec->data;
+
+ if (data->d_type != ELF_T_BYTE || data->d_off) {
+ printf("ponies\n");
+ return -1;
+ }
+
+ memcpy(sec->data->d_buf + offset, insn, len);
+
+ elf_flagdata(data, ELF_C_SET, ELF_F_DIRTY);
+
+ sec->changed = true;
+ elf->changed = true;
+
+ return 0;
+}
+
+int elf_write_rela(struct elf *elf, struct rela *rela)
+{
+ struct section *sec = rela->sec;
+
+ rela->rela.r_info = GELF_R_INFO(rela->sym->idx, rela->type);
+ rela->rela.r_addend = rela->addend;
+
+ gelf_update_rela(sec->data, rela->idx, &rela->rela);
+
+ sec->changed = true;
+ elf->changed = true;
+
+ return 0;
+}
+
+int elf_write(struct elf *elf)
{
struct section *sec;
Elf_Scn *s;
@@ -796,6 +835,8 @@ int elf_write(const struct elf *elf)
WARN_ELF("gelf_update_shdr");
return -1;
}
+
+ sec->changed = false;
}
}

@@ -808,6 +849,8 @@ int elf_write(const struct elf *elf)
return -1;
}

+ elf->changed = false;
+
return 0;
}

diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index f4fe1d6ea392..4a3fe4f455c5 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -64,9 +64,10 @@ struct rela {
GElf_Rela rela;
struct section *sec;
struct symbol *sym;
- unsigned int type;
unsigned long offset;
+ unsigned int type;
int addend;
+ int idx;
bool jump_table_start;
};

@@ -76,6 +77,7 @@ struct elf {
Elf *elf;
GElf_Ehdr ehdr;
int fd;
+ bool changed;
char *name;
struct list_head sections;
DECLARE_HASHTABLE(symbol_hash, ELF_HASH_BITS);
@@ -118,7 +120,7 @@ struct elf *elf_open_read(const char *name, int flags);
struct section *elf_create_section(struct elf *elf, const char *name, size_t entsize, int nr);
struct section *elf_create_rela_section(struct elf *elf, struct section *base);
void elf_add_rela(struct elf *elf, struct rela *rela);
-int elf_write(const struct elf *elf);
+int elf_write(struct elf *elf);
void elf_close(struct elf *elf);

struct section *find_section_by_name(const struct elf *elf, const char *name);
@@ -132,6 +134,11 @@ struct rela *find_rela_by_dest_range(const struct elf *elf, struct section *sec,
struct symbol *find_func_containing(struct section *sec, unsigned long offset);
int elf_rebuild_rela_section(struct section *sec);

+int elf_write_rela(struct elf *elf, struct rela *rela);
+int elf_write_insn(struct elf *elf, struct section *sec,
+ unsigned long offset, unsigned int len,
+ const char *insn);
+
#define for_each_sec(file, sec) \
list_for_each_entry(sec, &file->elf->sections, list)


2020-06-12 04:08:33

by Dmitry Vyukov

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Thu, Jun 11, 2020 at 11:55 PM Peter Zijlstra <[email protected]> wrote:
>
> On Mon, Jun 08, 2020 at 01:01:08PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:
> >
> > > As a crazy idea: is it possible to employ objtool (linker script?) to
> > > rewrite all coverage calls to nops in the noinstr section? Or relocate
> > > to nop function?
> > > What we are trying to do is very static, it _should_ have been done
> > > during build. We don't have means in existing _compilers_ to do this,
> > > but maybe we could do it elsewhere during build?...
> >
> > Let me try and figure out how to make objtool actually rewrite code.
>
> The below is quite horrific but seems to sorta work.
>
> It turns this:
>
> 12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
> 13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4
>
> Into this:
>
> 12: 90 nop
> 13: 90 nop
> 13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
> 14: 90 nop
> 15: 90 nop
> 16: 90 nop
>
>
> I'll have to dig around a little more to see if I can't get rid of the
> relocation entirely. Also, I need to steal better arch_nop_insn() from
> the kernel :-)

Wow! Cool!
Thanks for resolving this. I guess this can be used to wipe more
unwanted things in future :)

Marco double checked and his patch did not actually fix the existing
crash under KCSAN. The call itself was the problem or something;
returning early did not really help. This should hopefully fix it.
Marco, please double check.

Re better nop insn, I don't know how much work it is (or how much you
are striving for perfection :)). But from KCOV's point of view, I think
we can live with more or less any nop insn. The main thing was
removing overhead from all the other (non-noinstr) cases; I would assume
the noinstr cases where we use nops are very rare. I mean, don't spend
too much time on it, if it's not needed for something else.

Thanks again!


> [first version of the patch quoted in full - snipped]

2020-06-12 11:37:08

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Thu, Jun 11, 2020 at 11:58:12PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 11, 2020 at 11:55:38PM +0200, Peter Zijlstra wrote:
> > I'll have to dig around a little more to see if I can't get rid of the
> > relocation entirely. Also, I need to steal better arch_nop_insn() from
> > the kernel :-)

Oh, I just realized that recordmcount does exactly this same thing, so I
checked what it does to the relocation, and it turns out it does the
same thing I did. They change the relocation type to R_*_NONE too.

So I suppose that's all right then.

I suppose I ought to go look at the recordmcount-to-objtool patches to
see if there's anything to share there.

2020-06-12 11:53:28

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible



On Fri, 12 Jun 2020, Dmitry Vyukov wrote:

> On Thu, Jun 11, 2020 at 11:55 PM Peter Zijlstra <[email protected]> wrote:
> >
> > On Mon, Jun 08, 2020 at 01:01:08PM +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:
> > >
> > > > As a crazy idea: is it possible to employ objtool (linker script?) to
> > > > rewrite all coverage calls to nops in the noinstr section? Or relocate
> > > > to nop function?
> > > > What we are trying to do is very static, it _should_ have been done
> > > > during build. We don't have means in existing _compilers_ to do this,
> > > > but maybe we could do it elsewhere during build?...
> > >
> > > Let me try and figure out how to make objtool actually rewrite code.
> >
> > The below is quite horrific but seems to sorta work.
> >
> > It turns this:
> >
> > 12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
> > 13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4
> >
> > Into this:
> >
> > 12: 90 nop
> > 13: 90 nop
> > 13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
> > 14: 90 nop
> > 15: 90 nop
> > 16: 90 nop
> >
> >
> > I'll have to dig around a little more to see if I can't get rid of the
> > relocation entirely. Also, I need to steal better arch_nop_insn() from
> > the kernel :-)
>
> Wow! Cool!
> Thanks for resolving this. I guess this can be used to wipe more
> unwanted things in future :)
>
> Marco double checked and his patch did not actually fix the existing
> crash under KCSAN. The call itself was the problem or something,
> returning early did not really help. This should hopefully fix it.
> Marco, please double check.
>
> Re better nop insn, I don't know how much work it is (or how much you
> are striving for perfection :)). But from KCOV point of view, I think
> we can live with more or less any nop insn. The main thing was
> removing overhead from all other (not noinstr) cases, I would assume
> the noinstr cases where we use nops are very rare. I mean don't spend
> too much time on it, if it's not needed for something else.
>
> Thanks again!

This is great, thanks! To make noinstr not call into KCOV, this
definitely seems to do the job.

Though sadly it doesn't fix the problem I'm seeing. The problem occurs
when I compile using Clang, and enable either KASAN or KCSAN together
with KCOV. Actually, turning off KCOV also shows this... a stacktrace is
below.

The repro is this one: https://syzkaller.appspot.com/x/repro.c?x=1017ef06100000

I don't quite understand what's going on here. Maybe the inserted
instrumentation causes the compiler to spill more things onto the stack
and somehow blow that? The nops obviously won't help with that. :-/

I'll try to debug and understand this some more. Also this is of course
on top of:
https://lore.kernel.org/lkml/[email protected]/

But, again, for disabling KCOV instrumentation in noinstr, I believe
your patch does what we want. In future, when we get compiler support
for __no_sanitize_coverage, the logic you're adding to objtool can
probably stay but shouldn't be invoked if the compiler is doing its job.

Thanks,
-- Marco

------ >8 ------

traps: PANIC: double fault, error_code: 0x0
double fault: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 513 Comm: a.out Not tainted 5.7.0+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
RIP: 0010:check_preemption_disabled+0x60/0x120 lib/smp_processor_id.c:19
Code: 7f 74 27 90 90 90 90 90 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 0f 85 c6 00 00 00 89 d8 48 83 c4 10 5b 41 5c 41 5e 41 5f c3 <9c> 8f 04 24 f7 04 24 00 02 00 00 75 07 90 90 90 90 90 eb ca 65 4c
RSP: 0018:fffffe0000094ff8 EFLAGS: 00010046
RAX: 0000000080000000 RBX: 0000000000000003 RCX: ffffffffacc00ef7
RDX: 0000000000000000 RSI: ffffffffad29c4f2 RDI: ffffffffad21fe08
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffffffffad29c4f2 R15: ffffffffad21fe08
FS: 0000000001d26880(0000) GS:ffffa16e5fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffe0000094fe8 CR3: 00000008147bc002 CR4: 0000000000760ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<ENTRY_TRAMPOLINE>
__this_cpu_preempt_check+0x18/0x1a lib/smp_processor_id.c:65
fixup_bad_iret+0x2e/0xe0 arch/x86/kernel/traps.c:678
error_entry+0xd5/0xe0 arch/x86/entry/entry_64.S:937
RIP: 0010:native_irq_return_iret+0x0/0x2
Code: 5d 41 5c 5d 5b 41 5b 41 5a 41 59 41 58 58 59 5a 5e 5f 48 83 c4 08 eb 0b 66 66 2e 0f 1f 84 00 00 00 00 00 f6 44 24 20 04 75 02 <48> cf 57 0f 01 f8 66 90 0f 20 df 48 0f ba ef 3f 48 81 e7 ff e7 ff
RSP: 0018:fffffe00000951d8 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000100
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
RIP: 0033:0x3bfd19e0df38d197
Code: Bad RIP value.
RSP: 002b:00007ffd10c4c948 EFLAGS: 00000313 </ENTRY_TRAMPOLINE>
Modules linked in:
---[ end trace df1b33281490ebc3 ]---
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
RIP: 0010:check_preemption_disabled+0x60/0x120 lib/smp_processor_id.c:19
Code: 7f 74 27 90 90 90 90 90 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 0f 85 c6 00 00 00 89 d8 48 83 c4 10 5b 41 5c 41 5e 41 5f c3 <9c> 8f 04 24 f7 04 24 00 02 00 00 75 07 90 90 90 90 90 eb ca 65 4c
RSP: 0018:fffffe0000094ff8 EFLAGS: 00010046
RAX: 0000000080000000 RBX: 0000000000000003 RCX: ffffffffacc00ef7
RDX: 0000000000000000 RSI: ffffffffad29c4f2 RDI: ffffffffad21fe08
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffffffffad29c4f2 R15: ffffffffad21fe08
FS: 0000000001d26880(0000) GS:ffffa16e5fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffe0000094fe8 CR3: 00000008147bc002 CR4: 0000000000760ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554

2020-06-13 17:27:56

by Dmitry Vyukov

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Fri, Jun 12, 2020 at 1:49 PM Marco Elver <[email protected]> wrote:
> On Fri, 12 Jun 2020, Dmitry Vyukov wrote:
>
> > On Thu, Jun 11, 2020 at 11:55 PM Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Mon, Jun 08, 2020 at 01:01:08PM +0200, Peter Zijlstra wrote:
> > > > On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:
> > > >
> > > > > As a crazy idea: is it possible to employ objtool (linker script?) to
> > > > > rewrite all coverage calls to nops in the noinstr section? Or relocate
> > > > > to nop function?
> > > > > What we are trying to do is very static, it _should_ have been done
> > > > > during build. We don't have means in existing _compilers_ to do this,
> > > > > but maybe we could do it elsewhere during build?...
> > > >
> > > > Let me try and figure out how to make objtool actually rewrite code.
> > >
> > > The below is quite horrific but seems to sorta work.
> > >
> > > It turns this:
> > >
> > > 12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
> > > 13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4
> > >
> > > Into this:
> > >
> > > 12: 90 nop
> > > 13: 90 nop
> > > 13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
> > > 14: 90 nop
> > > 15: 90 nop
> > > 16: 90 nop
> > >
> > >
> > > I'll have to dig around a little more to see if I can't get rid of the
> > > relocation entirely. Also, I need to steal better arch_nop_insn() from
> > > the kernel :-)
> >
> > Wow! Cool!
> > Thanks for resolving this. I guess this can be used to wipe more
> > unwanted things in future :)
> >
> > Marco double checked and his patch did not actually fix the existing
> > crash under KCSAN. The call itself was the problem or something,
> > returning early did not really help. This should hopefully fix it.
> > Marco, please double check.
> >
> > Re better nop insn, I don't know how much work it is (or how much you
> > are striving for perfection :)). But from KCOV point of view, I think
> > we can live with more or less any nop insn. The main thing was
> > removing overhead from all other (not noinstr) cases, I would assume
> > the noinstr cases where we use nops are very rare. I mean don't spend
> > too much time on it, if it's not needed for something else.
> >
> > Thanks again!
>
> This is great, thanks! To make noinstr not call into KCOV, this
> definitely seems to do the job.
>
> Though sadly it doesn't fix the problem I'm seeing. The problem occurs
> when I compile using Clang, and enable either KASAN or KCSAN together
> with KCOV. Actually, turning off KCOV also shows this... a stacktrace is
> below.

I can't reproduce this after turning off KCOV. Just KASAN works for me.
Also the following helps (at least for my config):

diff --git a/lib/Makefile b/lib/Makefile
index b1c42c10073b9..8514519bc5bcb 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -17,6 +17,7 @@ KCOV_INSTRUMENT_list_debug.o := n
KCOV_INSTRUMENT_debugobjects.o := n
KCOV_INSTRUMENT_dynamic_debug.o := n
KCOV_INSTRUMENT_fault-inject.o := n
+KCOV_INSTRUMENT_smp_processor_id.o := n


Btw, do you use inline instrumentation for KASAN or outline?
I use inline KASAN, so maybe it's a function call that's the problem.
KCOV uses calls and KCSAN also uses calls.

And it's not that we are getting that "BUG:", right? Otherwise we
would see it in non-KCOV builds as well. So it must be something in
the very beginning of the function...




> The repro is this one: https://syzkaller.appspot.com/x/repro.c?x=1017ef06100000
>
> I don't quite understand what's going on here. Maybe the inserted
> instrumentation causes the compiler to spill more things onto the stack
> and somehow blow that? The nops obviously won't help with that. :-/
>
> I'll try to debug and understand this some more. Also this is of course
> on top of:
> https://lore.kernel.org/lkml/[email protected]/
>
> But, again, for disabling KCOV instrumentation in noinstr, I believe
> your patch does what we want. In future, when we get compiler support
> for __no_sanitize_coverage, the logic you're adding to objtool can
> probably stay but shouldn't be invoked if the compiler is doing its job.
>
> Thanks,
> -- Marco
>
> ------ >8 ------
>
> traps: PANIC: double fault, error_code: 0x0
> double fault: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 513 Comm: a.out Not tainted 5.7.0+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
> RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
> RIP: 0010:check_preemption_disabled+0x60/0x120 lib/smp_processor_id.c:19
> Code: 7f 74 27 90 90 90 90 90 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 0f 85 c6 00 00 00 89 d8 48 83 c4 10 5b 41 5c 41 5e 41 5f c3 <9c> 8f 04 24 f7 04 24 00 02 00 00 75 07 90 90 90 90 90 eb ca 65 4c
> RSP: 0018:fffffe0000094ff8 EFLAGS: 00010046
> RAX: 0000000080000000 RBX: 0000000000000003 RCX: ffffffffacc00ef7
> RDX: 0000000000000000 RSI: ffffffffad29c4f2 RDI: ffffffffad21fe08
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffffffffad29c4f2 R15: ffffffffad21fe08
> FS: 0000000001d26880(0000) GS:ffffa16e5fcc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: fffffe0000094fe8 CR3: 00000008147bc002 CR4: 0000000000760ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <ENTRY_TRAMPOLINE>
> __this_cpu_preempt_check+0x18/0x1a lib/smp_processor_id.c:65
> fixup_bad_iret+0x2e/0xe0 arch/x86/kernel/traps.c:678
> error_entry+0xd5/0xe0 arch/x86/entry/entry_64.S:937
> RIP: 0010:native_irq_return_iret+0x0/0x2
> Code: 5d 41 5c 5d 5b 41 5b 41 5a 41 59 41 58 58 59 5a 5e 5f 48 83 c4 08 eb 0b 66 66 2e 0f 1f 84 00 00 00 00 00 f6 44 24 20 04 75 02 <48> cf 57 0f 01 f8 66 90 0f 20 df 48 0f ba ef 3f 48 81 e7 ff e7 ff
> RSP: 0018:fffffe00000951d8 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000100
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> RIP: 0033:0x3bfd19e0df38d197
> Code: Bad RIP value.
> RSP: 002b:00007ffd10c4c948 EFLAGS: 00000313 </ENTRY_TRAMPOLINE>
> Modules linked in:
> ---[ end trace df1b33281490ebc3 ]---
> RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
> RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
> RIP: 0010:check_preemption_disabled+0x60/0x120 lib/smp_processor_id.c:19
> Code: 7f 74 27 90 90 90 90 90 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 0f 85 c6 00 00 00 89 d8 48 83 c4 10 5b 41 5c 41 5e 41 5f c3 <9c> 8f 04 24 f7 04 24 00 02 00 00 75 07 90 90 90 90 90 eb ca 65 4c
> RSP: 0018:fffffe0000094ff8 EFLAGS: 00010046
> RAX: 0000000080000000 RBX: 0000000000000003 RCX: ffffffffacc00ef7
> RDX: 0000000000000000 RSI: ffffffffad29c4f2 RDI: ffffffffad21fe08
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffffffffad29c4f2 R15: ffffffffad21fe08
> FS: 0000000001d26880(0000) GS:ffffa16e5fcc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: fffffe0000094fe8 CR3: 00000008147bc002 CR4: 0000000000760ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554

2020-06-15 07:57:45

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Sat, 13 Jun 2020 at 19:24, Dmitry Vyukov <[email protected]> wrote:
>
> On Fri, Jun 12, 2020 at 1:49 PM Marco Elver <[email protected]> wrote:
> > On Fri, 12 Jun 2020, Dmitry Vyukov wrote:
> >
> > > On Thu, Jun 11, 2020 at 11:55 PM Peter Zijlstra <[email protected]> wrote:
> > > >
> > > > On Mon, Jun 08, 2020 at 01:01:08PM +0200, Peter Zijlstra wrote:
> > > > > On Mon, Jun 08, 2020 at 09:57:39AM +0200, Dmitry Vyukov wrote:
> > > > >
> > > > > > As a crazy idea: is it possible to employ objtool (linker script?) to
> > > > > > rewrite all coverage calls to nops in the noinstr section? Or relocate
> > > > > > to nop function?
> > > > > > What we are trying to do is very static, it _should_ have been done
> > > > > > during build. We don't have means in existing _compilers_ to do this,
> > > > > > but maybe we could do it elsewhere during build?...
> > > > >
> > > > > Let me try and figure out how to make objtool actually rewrite code.
> > > >
> > > > The below is quite horrific but seems to sorta work.
> > > >
[...]
> > > >
> > > > I'll have to dig around a little more to see if I can't get rid of the
> > > > relocation entirely. Also, I need to steal better arch_nop_insn() from
> > > > the kernel :-)
> > >
> > > Wow! Cool!
> > > Thanks for resolving this. I guess this can be used to wipe more
> > > unwanted things in future :)
> > >
> > > Marco double checked and his patch did not actually fix the existing
> > > crash under KCSAN. The call itself was the problem or something,
> > > returning early did not really help. This should hopefully fix it.
> > > Marco, please double check.
> > >
> > > Re better nop insn, I don't know how much work it is (or how much you
> > > are striving for perfection :)). But from KCOV point of view, I think
> > > we can live with more or less any nop insn. The main thing was
> > > removing overhead from all other (not noinstr) cases, I would assume
> > > the noinstr cases where we use nops are very rare. I mean don't spend
> > > too much time on it, if it's not needed for something else.
> > >
> > > Thanks again!
> >
> > This is great, thanks! To make noinstr not call into KCOV, this
> > definitely seems to do the job.
> >
> > Though sadly it doesn't fix the problem I'm seeing. The problem occurs
> > when I compile using Clang, and enable either KASAN or KCSAN together
> > with KCOV. Actually, turning off KCOV also shows this... a stacktrace is
> > below.
>
> I can't reproduce this after turning off KCOV. Just KASAN works for me.
> Also the following helps (at least for my config):

Disabling KCOV for smp_processor_id now moves the crash elsewhere. In
the case of KASAN, into its 'memcpy' wrapper, called after
__this_cpu_read in fixup_bad_iret. This is making me suspicious,
because it shouldn't be called from noinstr functions.

For KCSAN the crash still happens in check_preemption_disabled, in the
inlined native_save_fl function (apparently on its 'pushf'). If I turn
fixup_bad_iret's __this_cpu_read into a raw_cpu_read (to bypass
check_preemption_disabled), no more crash with KCSAN.

> diff --git a/lib/Makefile b/lib/Makefile
> index b1c42c10073b9..8514519bc5bcb 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -17,6 +17,7 @@ KCOV_INSTRUMENT_list_debug.o := n
> KCOV_INSTRUMENT_debugobjects.o := n
> KCOV_INSTRUMENT_dynamic_debug.o := n
> KCOV_INSTRUMENT_fault-inject.o := n
> +KCOV_INSTRUMENT_smp_processor_id.o := n
>
>
> Btw, do you use inline instrumentation for KASAN or outline?
> I use inline KASAN, so maybe it's a function call that's the problem.
> KCOV uses calls and KCSAN also uses calls.

Using inline KASAN. The calls may only be contributing indirectly,
because at the point the BUG happens we're not in any runtime in the
case of the smp_processor_id crash (noinstr is supposed to add __no_kcsan,
__no_sanitize_address, etc.). My guess here is that the compiler ends up
generating either a call or increased stack usage which causes a crash
in a random place.

> And it's not that we are getting that "BUG:", right? Otherwise we
> would see it in non-KCOV builds as well. So it must be something in
> the very beginning of the function...

For reference, I've attached my config (rebased on 5.8-rc1, with the
SAN-vs-noinstr series on top:
https://lore.kernel.org/lkml/[email protected]/).

Right now I'm not liking that fixup_bad_iret calls into KASAN's
memcpy. But that doesn't explain the crash with KCSAN. Maybe we're
running out of stack space?

Thanks,
-- Marco


Attachments:
config (126.61 kB)

2020-06-15 14:34:53

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 15, 2020 at 09:53:06AM +0200, Marco Elver wrote:
>
> Disabling KCOV for smp_processor_id now moves the crash elsewhere. In
> the case of KASAN into its 'memcpy' wrapper, called after
> __this_cpu_read in fixup_bad_iret. This is making me suspicious,
> because it shouldn't be called from the noinstr functions.

With your .config, objtool complains about exactly that though:

vmlinux.o: warning: objtool: fixup_bad_iret()+0x8e: call to memcpy() leaves .noinstr.text section

The utterly gruesome thing below 'cures' that.

> For KCSAN the crash still happens in check_preemption_disabled, in the
> inlined native_save_fl function (apparently on its 'pushf'). If I turn
> fixup_bad_iret's __this_cpu_read into a raw_cpu_read (to bypass
> check_preemption_disabled), no more crash with KCSAN.

vmlinux.o: warning: objtool: debug_smp_processor_id()+0x0: call to __sanitizer_cov_trace_pc() leaves .noinstr.text section
vmlinux.o: warning: objtool: check_preemption_disabled()+0x1f: call to __sanitizer_cov_trace_pc() leaves .noinstr.text section
vmlinux.o: warning: objtool: __this_cpu_preempt_check()+0x4: call to __sanitizer_cov_trace_pc() leaves .noinstr.text section

That could be either of those I suppose, did you have the NOP patches
on? Let me try... those seem to placate objtool at least.

I do see a fair amount of __kasan_check*() crud though:

vmlinux.o: warning: objtool: rcu_nmi_exit()+0x44: call to __kasan_check_read() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x1c: call to __kasan_check_write() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_nmi_enter()+0x46: call to __kasan_check_read() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0x21: call to __kasan_check_write() leaves .noinstr.text section
vmlinux.o: warning: objtool: __rcu_is_watching()+0x1c: call to __kasan_check_read() leaves .noinstr.text section
vmlinux.o: warning: objtool: debug_locks_off()+0x1b: call to __kasan_check_write() leaves .noinstr.text section

That wasn't supposed to happen with the __no_sanitize patches on (which
I didn't forget). Aah, I think we've lost a bunch of patches.. /me goes
rummage.

This:

https://lkml.kernel.org/r/[email protected]

that cures the rcu part of that.

Let me go look at your KCSAN thing now...

---
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index af75109485c26..031a21fb5a741 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -675,6 +675,14 @@ struct bad_iret_stack {
struct pt_regs regs;
};

+void __always_inline __badcpy(void *dst, void *src, int nr)
+{
+ unsigned long *d = dst, *s = src;
+ nr /= sizeof(unsigned long);
+ while (nr--)
+ *(d++) = *(s++);
+}
+
asmlinkage __visible noinstr
struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
{
@@ -690,13 +698,13 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
(struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;

/* Copy the IRET target to the temporary storage. */
- memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+ __badcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);

/* Copy the remainder of the stack from the current stack. */
- memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+ __badcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));

/* Update the entry stack */
- memcpy(new_stack, &tmp, sizeof(tmp));
+ __badcpy(new_stack, &tmp, sizeof(tmp));

BUG_ON(!user_mode(&new_stack->regs));
return new_stack;

2020-06-15 14:40:03

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 15, 2020 at 04:29:50PM +0200, Peter Zijlstra wrote:
> Let me go look at your KCSAN thing now...

vmlinux.o: warning: objtool: idtentry_enter_cond_rcu()+0x16: call to is_idle_task() leaves .noinstr.text section

---
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b62e6aaf28f03..a7abc18a7d0ad 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1663,7 +1663,7 @@ extern struct task_struct *idle_task(int cpu);
*
* Return: 1 if @p is an idle task. 0 otherwise.
*/
-static inline bool is_idle_task(const struct task_struct *p)
+static __always_inline bool is_idle_task(const struct task_struct *p)
{
return !!(p->flags & PF_IDLE);
}

2020-06-15 14:56:06

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, 15 Jun 2020, Peter Zijlstra wrote:

> On Mon, Jun 15, 2020 at 09:53:06AM +0200, Marco Elver wrote:
> >
> > Disabling KCOV for smp_processor_id now moves the crash elsewhere. In
> > the case of KASAN into its 'memcpy' wrapper, called after
> > __this_cpu_read in fixup_bad_iret. This is making me suspicious,
> > because it shouldn't be called from the noinstr functions.
>
> With your .config, objtool complains about exactly that though:
>
> vmlinux.o: warning: objtool: fixup_bad_iret()+0x8e: call to memcpy() leaves .noinstr.text section
>
> The utterly gruesome thing below 'cures' that.

Is __memcpy() generally available? I think that bypasses KASAN and
whatever else.

> > For KCSAN the crash still happens in check_preemption_disabled, in the
> > inlined native_save_fl function (apparently on its 'pushf'). If I turn
> > fixup_bad_iret's __this_cpu_read into a raw_cpu_read (to bypass
> > check_preemption_disabled), no more crash with KCSAN.
>
> vmlinux.o: warning: objtool: debug_smp_processor_id()+0x0: call to __sanitizer_cov_trace_pc() leaves .noinstr.text section
> vmlinux.o: warning: objtool: check_preemption_disabled()+0x1f: call to __sanitizer_cov_trace_pc() leaves .noinstr.text section
> vmlinux.o: warning: objtool: __this_cpu_preempt_check()+0x4: call to __sanitizer_cov_trace_pc() leaves .noinstr.text section
>
> That could be either of those I suppose, did you have the NOP patches
> on? Let me try... those seem to placate objtool at least.
>
> I do see a fair amount of __kasan_check*() crud though:
>
> vmlinux.o: warning: objtool: rcu_nmi_exit()+0x44: call to __kasan_check_read() leaves .noinstr.text section
> vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x1c: call to __kasan_check_write() leaves .noinstr.text section
> vmlinux.o: warning: objtool: rcu_nmi_enter()+0x46: call to __kasan_check_read() leaves .noinstr.text section
> vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0x21: call to __kasan_check_write() leaves .noinstr.text section
> vmlinux.o: warning: objtool: __rcu_is_watching()+0x1c: call to __kasan_check_read() leaves .noinstr.text section
> vmlinux.o: warning: objtool: debug_locks_off()+0x1b: call to __kasan_check_write() leaves .noinstr.text section
>
> That wasn't supposed to happen with the __no_sanitize patches on (which
> I didn't forget). Aah, I think we've lost a bunch of patches.. /me goes
> rummage.
>
> This:
>
> https://lkml.kernel.org/r/[email protected]
>
> that cures the rcu part of that.
>
> Let me go look at your KCSAN thing now...

I tried to find the stack that is used by the crashing code -- which led
me to entry_stack? So I tried this:

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -370,7 +370,7 @@ struct x86_hw_tss {
#define IO_BITMAP_OFFSET_INVALID (__KERNEL_TSS_LIMIT + 1)

struct entry_stack {
- unsigned long words[64];
+ unsigned long words[128];
};

struct entry_stack_page {

No more crash. But that's probably not what we want. Just a datapoint.

> ---
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index af75109485c26..031a21fb5a741 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -675,6 +675,14 @@ struct bad_iret_stack {
> struct pt_regs regs;
> };
>
> +void __always_inline __badcpy(void *dst, void *src, int nr)
> +{
> + unsigned long *d = dst, *s = src;
> + nr /= sizeof(unsigned long);
> + while (nr--)
> + *(d++) = *(s++);
> +}
> +

If we can use __memcpy() here, that would probably solve that.

Thanks,
-- Marco

2020-06-15 14:59:50

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 15, 2020 at 09:53:06AM +0200, Marco Elver wrote:

> For KCSAN the crash still happens in check_preemption_disabled, in the
> inlined native_save_fl function (apparently on its 'pushf'). If I turn
> fixup_bad_iret's __this_cpu_read into a raw_cpu_read (to bypass
> check_preemption_disabled), no more crash with KCSAN.

Yeah, I can't see anything weird there with KCSAN + KCOV + NOP :-(

2020-06-15 15:06:41

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 15, 2020 at 04:53:36PM +0200, Marco Elver wrote:
> On Mon, 15 Jun 2020, Peter Zijlstra wrote:
>
> > On Mon, Jun 15, 2020 at 09:53:06AM +0200, Marco Elver wrote:
> > >
> > > Disabling KCOV for smp_processor_id now moves the crash elsewhere. In
> > > the case of KASAN into its 'memcpy' wrapper, called after
> > > __this_cpu_read in fixup_bad_iret. This is making me suspicious,
> > > because it shouldn't be called from the noinstr functions.
> >
> > With your .config, objtool complains about exactly that though:
> >
> > vmlinux.o: warning: objtool: fixup_bad_iret()+0x8e: call to memcpy() leaves .noinstr.text section
> >
> > The utterly gruesome thing below 'cures' that.
>
> Is __memcpy() generally available? I think that bypasses KASAN and
> whatever else.

Yes, I think so. x86_64 needs lib/memcpy_64.S in .noinstr.text then. For
i386 it's an __always_inline inline-asm thing.

2020-06-15 15:23:16

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 15, 2020 at 05:03:27PM +0200, Peter Zijlstra wrote:

> Yes, I think so. x86_64 needs lib/memcpy_64.S in .noinstr.text then. For
> i386 it's an __always_inline inline-asm thing.

Bah, I tried writing it without memcpy, but clang inserts memcpy anyway
:/

---
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index af75109485c26..d74fd6313a4ed 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -686,17 +686,17 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
* just below the IRET frame) and we want to pretend that the
* exception came from the IRET target.
*/
- struct bad_iret_stack tmp, *new_stack =
+ struct bad_iret_stack tmp = *s, *new_stack =
(struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
+ unsigned long *p = (unsigned long *)s->regs.sp;

- /* Copy the IRET target to the temporary storage. */
- memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+ tmp.regs.ip = p[0];
+ tmp.regs.cs = p[1];
+ tmp.regs.flags = p[2];
+ tmp.regs.sp = p[3];
+ tmp.regs.ss = p[4];

- /* Copy the remainder of the stack from the current stack. */
- memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
-
- /* Update the entry stack */
- memcpy(new_stack, &tmp, sizeof(tmp));
+ *new_stack = tmp;

BUG_ON(!user_mode(&new_stack->regs));
return new_stack;

2020-06-17 14:37:16

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Mon, Jun 15, 2020 at 05:20PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 05:03:27PM +0200, Peter Zijlstra wrote:
>
> > Yes, I think so. x86_64 needs lib/memcpy_64.S in .noinstr.text then. For
> > i386 it's an __always_inline inline-asm thing.
>
> Bah, I tried writing it without memcpy, but clang inserts memcpy anyway
> :/

Hmm, __builtin_memcpy() won't help either.

Turns out, Clang 11 got __builtin_memcpy_inline(): https://reviews.llvm.org/D73543

The below works, no more crash on either KASAN or KCSAN with Clang. We
can test if we have it with __has_feature(__builtin_memcpy_inline)
(although that's currently not working as expected, trying to fix :-/).

Would a memcpy_inline() be generally useful? It's not just Clang but
also GCC that isn't entirely upfront about which memcpy is inlined and
which isn't. If the compiler has __builtin_memcpy_inline(), we can use
it, otherwise the arch likely has to provide the implementation.

Thoughts?

Thanks,
-- Marco

------ >8 ------

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index af75109485c2..3e07beae2a75 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -690,13 +690,13 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
(struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;

/* Copy the IRET target to the temporary storage. */
- memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+ __builtin_memcpy_inline(&tmp.regs.ip, (void *)s->regs.sp, 5*8);

/* Copy the remainder of the stack from the current stack. */
- memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+ __builtin_memcpy_inline(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));

/* Update the entry stack */
- memcpy(new_stack, &tmp, sizeof(tmp));
+ __builtin_memcpy_inline(new_stack, &tmp, sizeof(tmp));

BUG_ON(!user_mode(&new_stack->regs));
return new_stack;

2020-06-17 14:52:43

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Wed, Jun 17, 2020 at 04:32:08PM +0200, Marco Elver wrote:
> On Mon, Jun 15, 2020 at 05:20PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 15, 2020 at 05:03:27PM +0200, Peter Zijlstra wrote:
> >
> > > Yes, I think so. x86_64 needs lib/memcpy_64.S in .noinstr.text then. For
> > > i386 it's an __always_inline inline-asm thing.
> >
> > Bah, I tried writing it without memcpy, but clang inserts memcpy anyway
> > :/
>
> Hmm, __builtin_memcpy() won't help either.
>
> Turns out, Clang 11 got __builtin_memcpy_inline(): https://reviews.llvm.org/D73543
>
> The below works, no more crash on either KASAN or KCSAN with Clang. We
> can test if we have it with __has_feature(__builtin_memcpy_inline)
> (although that's currently not working as expected, trying to fix :-/).
>
> Would a memcpy_inline() be generally useful? It's not just Clang but
> also GCC that isn't entirely upfront about which memcpy is inlined and
> which isn't. If the compiler has __builtin_memcpy_inline(), we can use
> it, otherwise the arch likely has to provide the implementation.
>
> Thoughts?

I had the below, except of course that yields another objtool
complaint, and I was still looking at that.

Does GCC (8, as per the new KASAN thing) have that
__builtin_memcpy_inline() ?

---
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index af75109485c26..a7d1570905727 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -690,13 +690,13 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
(struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;

/* Copy the IRET target to the temporary storage. */
- memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+ __memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);

/* Copy the remainder of the stack from the current stack. */
- memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+ __memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));

/* Update the entry stack */
- memcpy(new_stack, &tmp, sizeof(tmp));
+ __memcpy(new_stack, &tmp, sizeof(tmp));

BUG_ON(!user_mode(&new_stack->regs));
return new_stack;
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 56b243b14c3a2..bbcc05bcefadb 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -8,6 +8,8 @@
#include <asm/alternative-asm.h>
#include <asm/export.h>

+.pushsection .noinstr.text, "ax"
+
/*
* We build a jump to memcpy_orig by default which gets NOPped out on
* the majority of x86 CPUs which set REP_GOOD. In addition, CPUs which
@@ -184,6 +186,8 @@ SYM_FUNC_START_LOCAL(memcpy_orig)
retq
SYM_FUNC_END(memcpy_orig)

+.popsection
+
#ifndef CONFIG_UML

MCSAFE_TEST_CTL

2020-06-17 15:22:46

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Wed, Jun 17, 2020 at 04:49:49PM +0200, Peter Zijlstra wrote:

> I had the below, except of course that yields another objtool
> complaint, and I was still looking at that.

This cures it.

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5fbb90a80d239..fe0d6f1b28d7c 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2746,7 +2746,7 @@ int check(const char *_objname, bool orc)

INIT_LIST_HEAD(&file.insn_list);
hash_init(file.insn_hash);
- file.c_file = find_section_by_name(file.elf, ".comment");
+ file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");
file.ignore_unreachables = no_unreachable;
file.hints = false;


> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index af75109485c26..a7d1570905727 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -690,13 +690,13 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
> (struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
>
> /* Copy the IRET target to the temporary storage. */
> - memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
> + __memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
>
> /* Copy the remainder of the stack from the current stack. */
> - memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
> + __memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
>
> /* Update the entry stack */
> - memcpy(new_stack, &tmp, sizeof(tmp));
> + __memcpy(new_stack, &tmp, sizeof(tmp));
>
> BUG_ON(!user_mode(&new_stack->regs));
> return new_stack;
> diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
> index 56b243b14c3a2..bbcc05bcefadb 100644
> --- a/arch/x86/lib/memcpy_64.S
> +++ b/arch/x86/lib/memcpy_64.S
> @@ -8,6 +8,8 @@
> #include <asm/alternative-asm.h>
> #include <asm/export.h>
>
> +.pushsection .noinstr.text, "ax"
> +
> /*
> * We build a jump to memcpy_orig by default which gets NOPped out on
> * the majority of x86 CPUs which set REP_GOOD. In addition, CPUs which
> @@ -184,6 +186,8 @@ SYM_FUNC_START_LOCAL(memcpy_orig)
> retq
> SYM_FUNC_END(memcpy_orig)
>
> +.popsection
> +
> #ifndef CONFIG_UML
>
> MCSAFE_TEST_CTL

2020-06-17 15:25:03

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Wed, Jun 17, 2020 at 04:49PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 17, 2020 at 04:32:08PM +0200, Marco Elver wrote:
> > On Mon, Jun 15, 2020 at 05:20PM +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 15, 2020 at 05:03:27PM +0200, Peter Zijlstra wrote:
> > >
> > > > Yes, I think so. x86_64 needs lib/memcpy_64.S in .noinstr.text then. For
> > > > i386 it's an __always_inline inline-asm thing.
> > >
> > > Bah, I tried writing it without memcpy, but clang inserts memcpy anyway
> > > :/
> >
> > Hmm, __builtin_memcpy() won't help either.
> >
> > Turns out, Clang 11 got __builtin_memcpy_inline(): https://reviews.llvm.org/D73543
> >
> > The below works, no more crash on either KASAN or KCSAN with Clang. We
> > can test if we have it with __has_feature(__builtin_memcpy_inline)
> > (although that's currently not working as expected, trying to fix :-/).
> >
> > Would a memcpy_inline() be generally useful? It's not just Clang but
> > also GCC that isn't entirely upfront about which memcpy is inlined and
> > which isn't. If the compiler has __builtin_memcpy_inline(), we can use
> > it, otherwise the arch likely has to provide the implementation.
> >
> > Thoughts?
>
> I had the below, except of course that yields another objtool
> complaint, and I was still looking at that.
>
> Does GCC (8, as per the new KASAN thing) have that
> __builtin_memcpy_inline() ?

No, sadly it doesn't. Only Clang 11. :-/

But using a call to __memcpy() somehow breaks with Clang+KCSAN. Yet,
it's not the memcpy that BUGs, but once again check_preemption_disabled
(which is noinstr!). Just adding calls anywhere here seems to result in
unpredictable behaviour. Are we running out of stack space?

Thanks,
-- Marco

2020-06-17 15:57:50

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Wed, Jun 17, 2020 at 05:19:59PM +0200, Marco Elver wrote:

> > Does GCC (8, as per the new KASAN thing) have that
> > __builtin_memcpy_inline() ?
>
> No, sadly it doesn't. Only Clang 11. :-/
>
> But using a call to __memcpy() somehow breaks with Clang+KCSAN. Yet,
> it's not the memcpy that BUGs, but once again check_preemption_disabled
> (which is noinstr!). Just adding calls anywhere here seems to result in
> unpredictable behaviour. Are we running out of stack space?

Very likely, bad_iret is running on that entry_stack you found, and as
you found, it is puny.

Andy wanted to make it a full page a while ago, so I suppose the
question is do we do that now?

2020-06-17 16:41:52

by Peter Zijlstra

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Wed, Jun 17, 2020 at 05:55:17PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 17, 2020 at 05:19:59PM +0200, Marco Elver wrote:
>
> > > Does GCC (8, as per the new KASAN thing) have that
> > > __builtin_memcpy_inline() ?
> >
> > No, sadly it doesn't. Only Clang 11. :-/
> >
> > But using a call to __memcpy() somehow breaks with Clang+KCSAN. Yet,
> > it's not the memcpy that BUGs, but once again check_preemption_disabled
> > (which is noinstr!). Just adding calls anywhere here seems to result in
> > unpredictable behaviour. Are we running out of stack space?
>
> Very likely, bad_iret is running on that entry_stack you found, and as
> you found, it is puny.
>
> Andy wanted to make it a full page a while ago, so I suppose the
> question is do we do that now?

Andy suggested doing the full page; untested patches here:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/entry


2020-06-17 18:09:05

by Marco Elver

Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Wed, Jun 17, 2020 at 06:36PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 17, 2020 at 05:55:17PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 17, 2020 at 05:19:59PM +0200, Marco Elver wrote:
> >
> > > > Does GCC (8, as per the new KASAN thing) have that
> > > > __builtin_memcpy_inline() ?
> > >
> > > No, sadly it doesn't. Only Clang 11. :-/
> > >
> > > But using a call to __memcpy() somehow breaks with Clang+KCSAN. Yet,
> > > it's not the memcpy that BUGs, but once again check_preemption_disabled
> > > (which is noinstr!). Just adding calls anywhere here seems to result in
> > > unpredictable behaviour. Are we running out of stack space?
> >
> > Very likely, bad_iret is running on that entry_stack you found, and as
> > you found, it is puny.
> >
> > Andy wanted to make it a full page a while ago, so I suppose the
> > question is do we do that now?
>
> Andy suggested doing the full page; untested patches here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/entry

Yeah, that works, thanks! I think the stack increase alone fixes any
kind of crash due to the reproducer.

Also, my guess is this is not a hot function, right? One caveat to keep
in mind is that because it's not 'memcpy', the compiler will never
inline these memcpys (unlike before). Whether or not that actually makes
things faster or slower is anyone's guess though.

Thanks,
-- Marco