2024-01-09 16:43:47

by Artem Savkov

Subject: [PATCH bpf-next] selftests/bpf: fix potential premature unload in bpf_testmod

It is possible for bpf_kfunc_call_test_release() to be called from
bpf_map_free_deferred() when bpf_testmod is already unloaded and
prog_test_struct.cnt which it tries to decrease is no longer in memory.
This patch tries to fix the issue by waiting for all references to be
dropped in bpf_testmod_exit().

The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
synchronous grace periods urgently").

Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
---
tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 91907b321f913..63f0dbd016703 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -2,6 +2,7 @@
/* Copyright (c) 2020 Facebook */
#include <linux/btf.h>
#include <linux/btf_ids.h>
+#include <linux/delay.h>
#include <linux/error-injection.h>
#include <linux/init.h>
#include <linux/module.h>
@@ -544,6 +545,9 @@ static int bpf_testmod_init(void)

static void bpf_testmod_exit(void)
{
+	while (refcount_read(&prog_test_struct.cnt) > 1)
+		msleep(20);
+
 	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
}

--
2.43.0
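
Editor's note: the wait added to bpf_testmod_exit() above is a plain
polling loop on a reference count. The following is a minimal user-space
sketch of that shape, not the kernel code. It assumes a simulated clock in
place of msleep(), and every name in it (test_obj, deferred_put,
sim_msleep, wait_for_refs, demo_wait_ms) is hypothetical. A count of 1 is
taken to mean that only the module's own base reference remains, matching
the "> 1" test in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for prog_test_struct; cnt == 1 means only the
 * module's own base reference is left. */
struct test_obj {
	int cnt;
};

/* A pending release that fires once the simulated clock reaches due_ms,
 * standing in for the deferred map free that drops the last extra ref. */
struct deferred_put {
	struct test_obj *obj;
	int due_ms;
	bool done;
};

static int now_ms; /* simulated clock, in milliseconds */

/* Simulated msleep(): advance the clock and run the release if it is due. */
static void sim_msleep(struct deferred_put *d, int ms)
{
	now_ms += ms;
	if (!d->done && now_ms >= d->due_ms) {
		assert(d->obj->cnt > 1); /* the base reference is never dropped here */
		d->obj->cnt--;
		d->done = true;
	}
}

/* Same shape as the loop in the patch: poll until only the base
 * reference remains. Returns the simulated time spent waiting. */
static int wait_for_refs(struct test_obj *o, struct deferred_put *d)
{
	int start = now_ms;

	while (o->cnt > 1)
		sim_msleep(d, 20);
	return now_ms - start;
}

/* One extra reference is outstanding and is dropped 50 ms from now. */
static int demo_wait_ms(void)
{
	struct test_obj obj = { .cnt = 2 };
	struct deferred_put put = {
		.obj = &obj,
		.due_ms = now_ms + 50,
		.done = false,
	};

	return wait_for_refs(&obj, &put);
}
```

With a 20 ms poll interval, a release that lands 50 ms in is only noticed
on the third poll, so the simulated exit path waits 60 ms. The real loop
has the same granularity: teardown can lag the final release by up to one
sleep interval.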



2024-01-09 19:48:20

by Yonghong Song

Subject: Re: [PATCH bpf-next] selftests/bpf: fix potential premature unload in bpf_testmod


On 1/9/24 8:43 AM, Artem Savkov wrote:
> It is possible for bpf_kfunc_call_test_release() to be called from
> bpf_map_free_deferred() when bpf_testmod is already unloaded and
> prog_test_struct.cnt which it tries to decrease is no longer in memory.
> This patch tries to fix the issue by waiting for all references to be
> dropped in bpf_testmod_exit().
>
> The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> synchronous grace periods urgently").
>
> Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")

Please add your Signed-off-by tag.

I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
But the release function bpf_kfunc_call_test_release() is in the kernel.
The release func tries to access some data in bpf_testmod which might
have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
no bpf_testmod BTF reference is held, so bpf_testmod can be unloaded before
bpf_kfunc_call_test_release().
As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
is also in the kernel.

I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
But since this is only for tests, I guess we can live with that. With that,

Acked-by: Yonghong Song <[email protected]>

> ---
> tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 91907b321f913..63f0dbd016703 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -2,6 +2,7 @@
> /* Copyright (c) 2020 Facebook */
> #include <linux/btf.h>
> #include <linux/btf_ids.h>
> +#include <linux/delay.h>
> #include <linux/error-injection.h>
> #include <linux/init.h>
> #include <linux/module.h>
> @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
>
> static void bpf_testmod_exit(void)
> {
> + while (refcount_read(&prog_test_struct.cnt) > 1)
> + msleep(20);
> +
> return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> }
>
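
Editor's note: the root cause described in the reply above is an ordering
problem. The acquire side lives in the module, the release side lives in
the kernel, and nothing pins the module while a reference is outstanding.
The sketch below replays that ordering in a small user-space model. It is
an illustration under stated assumptions, not kernel code: module_model,
model_acquire, model_release, model_unload and replay are all hypothetical
names, and the blocking wait from the patch is modelled as an unload that
is refused and retried.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: data owned by a module, valid only while loaded. */
struct module_model {
	bool loaded;
	int cnt; /* stands in for prog_test_struct.cnt; 1 == base reference */
};

/* Acquire lives "in the module", so it can only run while it is loaded. */
static void model_acquire(struct module_model *m)
{
	assert(m->loaded);
	m->cnt++;
}

/* Release lives "in the kernel". It returns false at the point where the
 * real code would touch memory belonging to an unloaded module. */
static bool model_release(struct module_model *m)
{
	if (!m->loaded)
		return false;
	m->cnt--;
	return true;
}

/* Unload immediately (pre-patch behaviour), or refuse while extra
 * references remain (non-blocking stand-in for the polling loop). */
static bool model_unload(struct module_model *m, bool wait_for_refs)
{
	if (wait_for_refs && m->cnt > 1)
		return false; /* caller has to retry later */
	m->loaded = false;
	return true;
}

/* Replays acquire, unload attempt, deferred release, unload retry.
 * Returns true only if the release was safe and the unload completed. */
static bool replay(bool wait_for_refs)
{
	struct module_model m = { .loaded = true, .cnt = 1 };
	bool unloaded, safe;

	model_acquire(&m);
	unloaded = model_unload(&m, wait_for_refs);
	safe = model_release(&m);
	if (!unloaded)
		unloaded = model_unload(&m, wait_for_refs);
	return safe && unloaded;
}
```

In this model replay(false) reports the unsafe ordering, because the
release runs after the unload, while replay(true) completes cleanly because
the unload is held back until the extra reference is gone.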

2024-01-10 08:16:51

by Artem Savkov

Subject: Re: [PATCH bpf-next] selftests/bpf: fix potential premature unload in bpf_testmod

On Tue, Jan 09, 2024 at 11:40:38AM -0800, Yonghong Song wrote:
>
> On 1/9/24 8:43 AM, Artem Savkov wrote:
> > It is possible for bpf_kfunc_call_test_release() to be called from
> > bpf_map_free_deferred() when bpf_testmod is already unloaded and
> > prog_test_struct.cnt which it tries to decrease is no longer in memory.
> > This patch tries to fix the issue by waiting for all references to be
> > dropped in bpf_testmod_exit().
> >
> > The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> > but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> > synchronous grace periods urgently").
> >
> > Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
>
> Please add your Signed-off-by tag.

Thanks for noticing. Will resend with signed-off-by and your ack.

> I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
> is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
> But the release function bpf_kfunc_call_test_release() is in the kernel.
> The release func tries to access some data in bpf_testmod which might
> have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
> no bpf_testmod BTF reference is held, so bpf_testmod can be unloaded before
> bpf_kfunc_call_test_release().
> As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
> is also in the kernel.
>
> I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
> bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
> But since this is only for tests, I guess we can live with that. With that,

Correct. 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
also mentions why bpf_kfunc_call_test_release() is not in the module and
states that this is temporary. I'll add a comment in v2 so the wait can
be removed once the functions are re-united.

> Acked-by: Yonghong Song <[email protected]>
>
> > ---
> > tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > index 91907b321f913..63f0dbd016703 100644
> > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > @@ -2,6 +2,7 @@
> > /* Copyright (c) 2020 Facebook */
> > #include <linux/btf.h>
> > #include <linux/btf_ids.h>
> > +#include <linux/delay.h>
> > #include <linux/error-injection.h>
> > #include <linux/init.h>
> > #include <linux/module.h>
> > @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
> > static void bpf_testmod_exit(void)
> > {
> > + while (refcount_read(&prog_test_struct.cnt) > 1)
> > + msleep(20);
> > +
> > return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> > }
>

--
Regards,
Artem


2024-01-10 08:58:03

by Artem Savkov

Subject: [PATCH bpf-next v2] selftests/bpf: fix potential premature unload in bpf_testmod

It is possible for bpf_kfunc_call_test_release() to be called from
bpf_map_free_deferred() when bpf_testmod is already unloaded and
prog_test_struct.cnt which it tries to decrease is no longer in memory.
This patch tries to fix the issue by waiting for all references to be
dropped in bpf_testmod_exit().

The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
synchronous grace periods urgently").

Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
Signed-off-by: Artem Savkov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
---
tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 91907b321f913..e7c9e1c7fde04 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -2,6 +2,7 @@
/* Copyright (c) 2020 Facebook */
#include <linux/btf.h>
#include <linux/btf_ids.h>
+#include <linux/delay.h>
#include <linux/error-injection.h>
#include <linux/init.h>
#include <linux/module.h>
@@ -544,6 +545,14 @@ static int bpf_testmod_init(void)

static void bpf_testmod_exit(void)
{
+	/* Need to wait for all references to be dropped because
+	 * bpf_kfunc_call_test_release() which currently resides in kernel can
+	 * be called after bpf_testmod is unloaded. Once release function is
+	 * moved into the module this wait can be removed.
+	 */
+	while (refcount_read(&prog_test_struct.cnt) > 1)
+		msleep(20);
+
 	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
}

--
2.43.0


2024-01-10 12:50:15

by Jiri Olsa

Subject: Re: [PATCH bpf-next] selftests/bpf: fix potential premature unload in bpf_testmod

On Wed, Jan 10, 2024 at 09:14:51AM +0100, Artem Savkov wrote:
> On Tue, Jan 09, 2024 at 11:40:38AM -0800, Yonghong Song wrote:
> >
> > On 1/9/24 8:43 AM, Artem Savkov wrote:
> > > It is possible for bpf_kfunc_call_test_release() to be called from
> > > bpf_map_free_deferred() when bpf_testmod is already unloaded and
> > > prog_test_struct.cnt which it tries to decrease is no longer in memory.
> > > This patch tries to fix the issue by waiting for all references to be
> > > dropped in bpf_testmod_exit().
> > >
> > > The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> > > but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> > > synchronous grace periods urgently").
> > >
> > > Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> >
> > Please add your Signed-off-by tag.
>
> Thanks for noticing. Will resend with signed-off-by and your ack.
>
> > I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
> > is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
> > But the release function bpf_kfunc_call_test_release() is in the kernel.
> > The release func tries to access some data in bpf_testmod which might
> > have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
> > no bpf_testmod BTF reference is held, so bpf_testmod can be unloaded before
> > bpf_kfunc_call_test_release().
> > As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
> > is also in the kernel.
> >
> > I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
> > bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
> > But since this is only for tests, I guess we can live with that. With that,
>
> Correct. 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> also mentions why bpf_kfunc_call_test_release() is not in the module and
> states that this is temporary. I'll add a comment in v2 so the wait can
> be removed once the functions are re-united.

As far as I recall, it has to do with the fact that you can't have a
trusted pointer to a module's object, which is why those structs had to
stay in the kernel. I might be wrong, though.

jirka

>
> > Acked-by: Yonghong Song <[email protected]>
> >
> > > ---
> > > tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > index 91907b321f913..63f0dbd016703 100644
> > > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > @@ -2,6 +2,7 @@
> > > /* Copyright (c) 2020 Facebook */
> > > #include <linux/btf.h>
> > > #include <linux/btf_ids.h>
> > > +#include <linux/delay.h>
> > > #include <linux/error-injection.h>
> > > #include <linux/init.h>
> > > #include <linux/module.h>
> > > @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
> > > static void bpf_testmod_exit(void)
> > > {
> > > + while (refcount_read(&prog_test_struct.cnt) > 1)
> > > + msleep(20);
> > > +
> > > return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> > > }
> >
>
> --
> Regards,
> Artem
>

2024-01-16 15:50:43

by patchwork-bot+netdevbpf

Subject: Re: [PATCH bpf-next v2] selftests/bpf: fix potential premature unload in bpf_testmod

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <[email protected]>:

On Wed, 10 Jan 2024 09:57:37 +0100 you wrote:
> It is possible for bpf_kfunc_call_test_release() to be called from
> bpf_map_free_deferred() when bpf_testmod is already unloaded and
> prog_test_struct.cnt which it tries to decrease is no longer in memory.
> This patch tries to fix the issue by waiting for all references to be
> dropped in bpf_testmod_exit().
>
> The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> synchronous grace periods urgently").
>
> [...]

Here is the summary with links:
- [bpf-next,v2] selftests/bpf: fix potential premature unload in bpf_testmod
https://git.kernel.org/bpf/bpf-next/c/6ad61815babf

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2024-01-16 16:58:18

by Jiri Olsa

Subject: Re: [PATCH bpf-next v2] selftests/bpf: fix potential premature unload in bpf_testmod

On Wed, Jan 10, 2024 at 09:57:37AM +0100, Artem Savkov wrote:
> It is possible for bpf_kfunc_call_test_release() to be called from
> bpf_map_free_deferred() when bpf_testmod is already unloaded and
> prog_test_struct.cnt which it tries to decrease is no longer in memory.
> This patch tries to fix the issue by waiting for all references to be
> dropped in bpf_testmod_exit().
>
> The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> synchronous grace periods urgently").
>
> Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> Signed-off-by: Artem Savkov <[email protected]>
> Acked-by: Yonghong Song <[email protected]>

Acked-by: Jiri Olsa <[email protected]>

jirka

> ---
> tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 91907b321f913..e7c9e1c7fde04 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -2,6 +2,7 @@
> /* Copyright (c) 2020 Facebook */
> #include <linux/btf.h>
> #include <linux/btf_ids.h>
> +#include <linux/delay.h>
> #include <linux/error-injection.h>
> #include <linux/init.h>
> #include <linux/module.h>
> @@ -544,6 +545,14 @@ static int bpf_testmod_init(void)
>
> static void bpf_testmod_exit(void)
> {
> + /* Need to wait for all references to be dropped because
> + * bpf_kfunc_call_test_release() which currently resides in kernel can
> + * be called after bpf_testmod is unloaded. Once release function is
> + * moved into the module this wait can be removed.
> + */
> + while (refcount_read(&prog_test_struct.cnt) > 1)
> + msleep(20);
> +
> return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> }
>
> --
> 2.43.0
>