2024-06-11 13:33:01

by Aleksandr Nogikh

Subject: [PATCH] kcov: don't lose track of remote references during softirqs

In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
metadata of the current task into a per-CPU variable. However, the
kcov_mode_enabled(mode) check is not sufficient in the case of remote
KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
for remote KCOV objects.

If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
happens to get interrupted and kcov_remote_start() is called, it
ultimately leads to kcov_remote_stop() NOT restoring the original
KCOV reference. So when the task exits, all registered remote KCOV
handles remain active forever.

Fix it by introducing a special kcov_mode that is assigned to the
task that owns a KCOV remote object. It makes kcov_mode_enabled()
return true and yet does not trigger coverage collection in
__sanitizer_cov_trace_pc() and write_comp_data().

Signed-off-by: Aleksandr Nogikh <[email protected]>
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
---
include/linux/kcov.h | 2 ++
kernel/kcov.c | 1 +
2 files changed, 3 insertions(+)

diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index b851ba415e03..3b479a3d235a 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -21,6 +21,8 @@ enum kcov_mode {
KCOV_MODE_TRACE_PC = 2,
/* Collecting comparison operands mode. */
KCOV_MODE_TRACE_CMP = 3,
+ /* The process owns a KCOV remote reference. */
+ KCOV_MODE_REMOTE = 4,
};

#define KCOV_IN_CTXSW (1 << 30)
diff --git a/kernel/kcov.c b/kernel/kcov.c
index c3124f6d5536..5371d3f7b5c3 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
return -EINVAL;
kcov->mode = mode;
t->kcov = kcov;
+ WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
kcov->t = t;
kcov->remote = true;
kcov->remote_size = remote_arg->area_size;
--
2.45.2.505.gda0bf45e8d-goog
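
For readers who want to see the failure mode in isolation, here is a
minimal userspace sketch of the save/restore logic described in the
commit message. It is a model under stated assumptions, not the code in
kernel/kcov.c: struct task_model, struct percpu_model, remote_start(),
remote_stop() and owner_ref_survives() are hypothetical names, and the
bodies are simplified to the two fields that matter here. Only the enum
values and KCOV_IN_CTXSW are taken from the patch context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enum values and KCOV_IN_CTXSW as they appear in the patch context. */
enum kcov_mode {
	KCOV_MODE_DISABLED = 0,
	KCOV_MODE_INIT = 1,
	KCOV_MODE_TRACE_PC = 2,
	KCOV_MODE_TRACE_CMP = 3,
	KCOV_MODE_REMOTE = 4,
};
#define KCOV_IN_CTXSW (1 << 30)

/* Hypothetical stand-ins for the task and per-CPU state. */
struct task_model {
	unsigned int kcov_mode;
	void *kcov;
};

struct percpu_model {
	unsigned int saved_mode;
	void *saved_kcov;
};

/* Assumed shape of the check: any mode other than DISABLED counts. */
static bool mode_enabled(unsigned int mode)
{
	return (mode & ~KCOV_IN_CTXSW) != KCOV_MODE_DISABLED;
}

/* Model of remote start in softirq: save the old state only if enabled. */
static void remote_start(struct task_model *t, struct percpu_model *pcpu,
			 void *remote_kcov)
{
	if (mode_enabled(t->kcov_mode)) {
		pcpu->saved_mode = t->kcov_mode;
		pcpu->saved_kcov = t->kcov;
	}
	t->kcov = remote_kcov;
	t->kcov_mode = KCOV_MODE_TRACE_PC;
}

/* Model of remote stop: clear the task, then restore whatever was saved. */
static void remote_stop(struct task_model *t, struct percpu_model *pcpu)
{
	t->kcov = NULL;
	t->kcov_mode = KCOV_MODE_DISABLED;
	if (pcpu->saved_kcov) {
		t->kcov = pcpu->saved_kcov;
		t->kcov_mode = pcpu->saved_mode;
		pcpu->saved_kcov = NULL;
		pcpu->saved_mode = 0;
	}
}

/* Returns true if the owner still holds its reference after the softirq. */
static bool owner_ref_survives(unsigned int owner_mode)
{
	int owner_kcov, softirq_kcov;
	struct task_model t = { .kcov_mode = owner_mode, .kcov = &owner_kcov };
	struct percpu_model pcpu = { 0 };

	remote_start(&t, &pcpu, &softirq_kcov);
	remote_stop(&t, &pcpu);
	assert(pcpu.saved_kcov == NULL); /* the per-CPU slot is always drained */
	return t.kcov == &owner_kcov;
}
```

In this model, owner_ref_survives(KCOV_MODE_DISABLED) corresponds to the
behaviour before the patch and owner_ref_survives(KCOV_MODE_REMOTE) to
the behaviour after it.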



2024-06-11 13:37:40

by Dmitry Vyukov

Subject: Re: [PATCH] kcov: don't lose track of remote references during softirqs

On Tue, 11 Jun 2024 at 15:32, Aleksandr Nogikh <[email protected]> wrote:
>
> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().
>
> Signed-off-by: Aleksandr Nogikh <[email protected]>
> Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")

Reviewed-by: Dmitry Vyukov <[email protected]>

> ---
> include/linux/kcov.h | 2 ++
> kernel/kcov.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index b851ba415e03..3b479a3d235a 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -21,6 +21,8 @@ enum kcov_mode {
> KCOV_MODE_TRACE_PC = 2,
> /* Collecting comparison operands mode. */
> KCOV_MODE_TRACE_CMP = 3,
> + /* The process owns a KCOV remote reference. */
> + KCOV_MODE_REMOTE = 4,
> };
>
> #define KCOV_IN_CTXSW (1 << 30)
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..5371d3f7b5c3 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
> return -EINVAL;
> kcov->mode = mode;
> t->kcov = kcov;
> + WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
> kcov->t = t;
> kcov->remote = true;
> kcov->remote_size = remote_arg->area_size;
> --
> 2.45.2.505.gda0bf45e8d-goog
>

2024-06-11 19:03:01

by Andrew Morton

Subject: Re: [PATCH] kcov: don't lose track of remote references during softirqs

On Tue, 11 Jun 2024 15:32:29 +0200 Aleksandr Nogikh <[email protected]> wrote:

> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().

What are the userspace visible effects of this bug? I *think* it's
just an efficiency thing, but how significant? In other words, should
we backport this fix?



2024-06-12 10:11:50

by Aleksandr Nogikh

Subject: Re: [PATCH] kcov: don't lose track of remote references during softirqs

On Tue, Jun 11, 2024 at 8:51 PM Andrew Morton <[email protected]> wrote:
>
> On Tue, 11 Jun 2024 15:32:29 +0200 Aleksandr Nogikh <[email protected]> wrote:
>
> > In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> > metadata of the current task into a per-CPU variable. However, the
> > kcov_mode_enabled(mode) check is not sufficient in the case of remote
> > KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> > for remote KCOV objects.
> >
> > If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> > happens to get interrupted and kcov_remote_start() is called, it
> > ultimately leads to kcov_remote_stop() NOT restoring the original
> > KCOV reference. So when the task exits, all registered remote KCOV
> > handles remain active forever.
> >
> > Fix it by introducing a special kcov_mode that is assigned to the
> > task that owns a KCOV remote object. It makes kcov_mode_enabled()
> > return true and yet does not trigger coverage collection in
> > __sanitizer_cov_trace_pc() and write_comp_data().
>
> What are the userspace visible effects of this bug? I *think* it's
> just an efficiency thing, but how significant? In other words, should
> we backport this fix?
>

The most uncomfortable effect (at least for syzkaller) is that the bug
prevents the reuse of the same /sys/kernel/debug/kcov descriptor. If
we obtain it in the parent process and then e.g. drop some
capabilities and continuously fork to execute individual programs, at
some point current->kcov of the forked process is lost,
kcov_task_exit() takes no action, and all KCOV_REMOTE_ENABLE ioctl
calls from subsequent forks fail.

And, yes, the efficiency is also affected if we keep on losing remote
kcov objects.
a) kcov_remote_map keeps on growing forever.
b) (If I'm not mistaken), we're also not freeing the memory referenced
by kcov->area.

I think it would be nice to backport the fix to the stable trees.

--
Aleksandr
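
The descriptor-reuse failure described above can be sketched as a small
userspace model. This is an illustration under assumptions, not the
kernel implementation: struct kcov_model, struct task_model,
remote_enable(), task_exit() and second_fork_result() are hypothetical
names, and the checks inside remote_enable() are a simplified guess at
what the ioctl validates (descriptor initialised and unowned). The
lose_reference flag stands in for the softirq clobbering current->kcov.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { MODE_DISABLED = 0, MODE_INIT = 1, MODE_TRACE_PC = 2 };

struct task_model;

/* Hypothetical stand-in for the object behind the kcov descriptor. */
struct kcov_model {
	int mode;              /* MODE_INIT once the buffer is set up */
	struct task_model *t;  /* owning task, NULL when unowned */
};

struct task_model {
	struct kcov_model *kcov;
};

/* Model of KCOV_REMOTE_ENABLE: descriptor must be initialised and unowned. */
static int remote_enable(struct kcov_model *kcov, struct task_model *t)
{
	if (kcov->mode != MODE_INIT)
		return -EINVAL;
	if (kcov->t != NULL || t->kcov != NULL)
		return -EBUSY;
	kcov->mode = MODE_TRACE_PC;
	kcov->t = t;
	t->kcov = kcov;
	return 0;
}

/* Model of task exit: resets the descriptor only if the task still has it. */
static void task_exit(struct task_model *t)
{
	struct kcov_model *kcov = t->kcov;

	if (kcov == NULL)
		return; /* reference lost: nothing is cleaned up */
	kcov->mode = MODE_INIT;
	kcov->t = NULL;
	t->kcov = NULL;
}

/* Result of the enable call made by the second fork on the same descriptor. */
static int second_fork_result(int lose_reference)
{
	struct kcov_model kcov = { .mode = MODE_INIT, .t = NULL };
	struct task_model first = { NULL }, second = { NULL };
	int rc = remote_enable(&kcov, &first);

	assert(rc == 0); /* a fresh descriptor can always be enabled */
	if (lose_reference)
		first.kcov = NULL; /* what the softirq path does by accident */
	task_exit(&first);
	return remote_enable(&kcov, &second);
}
```

With the reference intact the second fork succeeds; once it has been
lost, the descriptor stays in its enabled state and the model rejects
every later enable attempt.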

2024-06-13 22:12:48

by Andrey Konovalov

Subject: Re: [PATCH] kcov: don't lose track of remote references during softirqs

On Tue, Jun 11, 2024 at 3:32 PM Aleksandr Nogikh <[email protected]> wrote:
>
> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().
>
> Signed-off-by: Aleksandr Nogikh <[email protected]>
> Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
> ---
> include/linux/kcov.h | 2 ++
> kernel/kcov.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index b851ba415e03..3b479a3d235a 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -21,6 +21,8 @@ enum kcov_mode {
> KCOV_MODE_TRACE_PC = 2,
> /* Collecting comparison operands mode. */
> KCOV_MODE_TRACE_CMP = 3,
> + /* The process owns a KCOV remote reference. */
> + KCOV_MODE_REMOTE = 4,
> };
>
> #define KCOV_IN_CTXSW (1 << 30)
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..5371d3f7b5c3 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
> return -EINVAL;
> kcov->mode = mode;
> t->kcov = kcov;
> + WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
> kcov->t = t;
> kcov->remote = true;
> kcov->remote_size = remote_arg->area_size;
> --
> 2.45.2.505.gda0bf45e8d-goog
>

Reviewed-by: Andrey Konovalov <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>

Thank you for fixing this!

2024-06-13 23:02:40

by Andrey Konovalov

Subject: Re: [PATCH] kcov: don't lose track of remote references during softirqs

On Tue, Jun 11, 2024 at 3:32 PM Aleksandr Nogikh <[email protected]> wrote:
>
> In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> metadata of the current task into a per-CPU variable. However, the
> kcov_mode_enabled(mode) check is not sufficient in the case of remote
> KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> for remote KCOV objects.
>
> If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> happens to get interrupted and kcov_remote_start() is called, it
> ultimately leads to kcov_remote_stop() NOT restoring the original
> KCOV reference. So when the task exits, all registered remote KCOV
> handles remain active forever.
>
> Fix it by introducing a special kcov_mode that is assigned to the
> task that owns a KCOV remote object. It makes kcov_mode_enabled()
> return true and yet does not trigger coverage collection in
> __sanitizer_cov_trace_pc() and write_comp_data().
>
> Signed-off-by: Aleksandr Nogikh <[email protected]>
> Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
> ---
> include/linux/kcov.h | 2 ++
> kernel/kcov.c | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> index b851ba415e03..3b479a3d235a 100644
> --- a/include/linux/kcov.h
> +++ b/include/linux/kcov.h
> @@ -21,6 +21,8 @@ enum kcov_mode {
> KCOV_MODE_TRACE_PC = 2,
> /* Collecting comparison operands mode. */
> KCOV_MODE_TRACE_CMP = 3,
> + /* The process owns a KCOV remote reference. */
> + KCOV_MODE_REMOTE = 4,
> };
>
> #define KCOV_IN_CTXSW (1 << 30)
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index c3124f6d5536..5371d3f7b5c3 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
> return -EINVAL;
> kcov->mode = mode;
> t->kcov = kcov;
> + WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);

Looking at this again, I don't think we need this WRITE_ONCE here, as
we have interrupts disabled. But if we do, perhaps it makes sense to
add a comment explaining why.

> kcov->t = t;
> kcov->remote = true;
> kcov->remote_size = remote_arg->area_size;
> --
> 2.45.2.505.gda0bf45e8d-goog
>
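
As background for the WRITE_ONCE question, the sketch below shows
roughly what such a macro does: it performs the store through a
volatile-qualified lvalue so the compiler emits exactly one store and
does not tear, merge or drop it. WRITE_ONCE_MODEL, struct task_model and
set_mode() are hypothetical names for this illustration; the kernel's
real macro carries additional type checks, and __typeof__ is a GNU C
extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel macro: a single volatile store. */
#define WRITE_ONCE_MODEL(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct task_model {
	unsigned int kcov_mode;
};

/* Store the mode with a volatile access and return the stored value. */
static unsigned int set_mode(struct task_model *t, unsigned int mode)
{
	assert(t != NULL);
	WRITE_ONCE_MODEL(t->kcov_mode, mode);
	return t->kcov_mode;
}
```

Whether the volatile access is needed at this call site depends on who
can read t->kcov_mode concurrently, which is the point raised in the
review comment.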

2024-06-14 17:43:18

by Aleksandr Nogikh

Subject: Re: [PATCH] kcov: don't lose track of remote references during softirqs

On Fri, Jun 14, 2024 at 1:02 AM Andrey Konovalov <[email protected]> wrote:
>
> On Tue, Jun 11, 2024 at 3:32 PM Aleksandr Nogikh <[email protected]> wrote:
> >
> > In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
> > metadata of the current task into a per-CPU variable. However, the
> > kcov_mode_enabled(mode) check is not sufficient in the case of remote
> > KCOV coverage: current->kcov_mode always remains KCOV_MODE_DISABLED
> > for remote KCOV objects.
> >
> > If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl
> > happens to get interrupted and kcov_remote_start() is called, it
> > ultimately leads to kcov_remote_stop() NOT restoring the original
> > KCOV reference. So when the task exits, all registered remote KCOV
> > handles remain active forever.
> >
> > Fix it by introducing a special kcov_mode that is assigned to the
> > task that owns a KCOV remote object. It makes kcov_mode_enabled()
> > return true and yet does not trigger coverage collection in
> > __sanitizer_cov_trace_pc() and write_comp_data().
> >
> > Signed-off-by: Aleksandr Nogikh <[email protected]>
> > Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
> > ---
> > include/linux/kcov.h | 2 ++
> > kernel/kcov.c | 1 +
> > 2 files changed, 3 insertions(+)
> >
> > diff --git a/include/linux/kcov.h b/include/linux/kcov.h
> > index b851ba415e03..3b479a3d235a 100644
> > --- a/include/linux/kcov.h
> > +++ b/include/linux/kcov.h
> > @@ -21,6 +21,8 @@ enum kcov_mode {
> > KCOV_MODE_TRACE_PC = 2,
> > /* Collecting comparison operands mode. */
> > KCOV_MODE_TRACE_CMP = 3,
> > + /* The process owns a KCOV remote reference. */
> > + KCOV_MODE_REMOTE = 4,
> > };
> >
> > #define KCOV_IN_CTXSW (1 << 30)
> > diff --git a/kernel/kcov.c b/kernel/kcov.c
> > index c3124f6d5536..5371d3f7b5c3 100644
> > --- a/kernel/kcov.c
> > +++ b/kernel/kcov.c
> > @@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
> > return -EINVAL;
> > kcov->mode = mode;
> > t->kcov = kcov;
> > + WRITE_ONCE(t->kcov_mode, KCOV_MODE_REMOTE);
>
> Looking at this again, I don't think we need this WRITE_ONCE here, as
> we have interrupts disabled. But if we do, perhaps it makes sense to
> add a comment explaining why.

Thank you!
I've sent a v2:
https://lore.kernel.org/all/[email protected]/

>
> > kcov->t = t;
> > kcov->remote = true;
> > kcov->remote_size = remote_arg->area_size;
> > --
> > 2.45.2.505.gda0bf45e8d-goog
> >
>