2022-03-09 08:49:48

by Peng Liu

Subject: [PATCH v2 0/3] kunit: fix a UAF bug and do some optimization

This series fixes a use-after-free (UAF) that occurs when running the
kfence test case test_gfpzero, which is very time-consuming. The UAF
can be easily triggered by setting CONFIG_KFENCE_NUM_OBJECTS = 65535.
The series also includes some optimizations for kunit tests.

v1->v2:
Change log is updated.

Peng Liu (3):
kunit: fix UAF when run kfence test case test_gfpzero
kunit: make kunit_test_timeout compatible with comment
kfence: test: try to avoid test_gfpzero trigger rcu_stall

lib/kunit/try-catch.c | 3 ++-
mm/kfence/kfence_test.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

--
2.18.0.huawei.25


2022-03-09 08:59:11

by Peng Liu

Subject: [PATCH v2 1/3] kunit: fix UAF when run kfence test case test_gfpzero

KUnit creates a new thread to run the actual test case, and the main
process waits for the test thread to complete until a timeout expires.
The variable "struct kunit test" is local to the function
kunit_try_catch_run and is also used by the test case thread. When
KUnit times out, kunit_try_catch_run frees "struct kunit test" while
the test case is still running, which triggers a UAF bug.

The problem has been observed on both a physical machine and a QEMU
platform when running the kfence KUnit tests. It can be triggered by
setting CONFIG_KFENCE_NUM_OBJECTS = 65535. With this setting, the
test case test_gfpzero takes hours and KUnit times out. The panic log
follows.

BUG: unable to handle page fault for address: ffffffff82d882e9

Call Trace:
kunit_log_append+0x58/0xd0
...
test_alloc.constprop.0.cold+0x6b/0x8a [kfence_test]
test_gfpzero.cold+0x61/0x8ab [kfence_test]
kunit_try_run_case+0x4c/0x70
kunit_generic_run_threadfn_adapter+0x11/0x20
kthread+0x166/0x190
ret_from_fork+0x22/0x30
Kernel panic - not syncing: Fatal exception
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014

To solve this problem, the test case thread should be stopped when
the KUnit framework times out. The stop signal is sent in
kunit_try_catch_run, and test_gfpzero handles it.

Signed-off-by: Peng Liu <[email protected]>
---
lib/kunit/try-catch.c | 1 +
mm/kfence/kfence_test.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
index be38a2c5ecc2..6b3d4db94077 100644
--- a/lib/kunit/try-catch.c
+++ b/lib/kunit/try-catch.c
@@ -78,6 +78,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
if (time_remaining == 0) {
kunit_err(test, "try timed out\n");
try_catch->try_result = -ETIMEDOUT;
+ kthread_stop(task_struct);
}

exit_code = try_catch->try_result;
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 50dbb815a2a8..caed6b4eba94 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -623,7 +623,7 @@ static void test_gfpzero(struct kunit *test)
break;
test_free(buf2);

- if (i == CONFIG_KFENCE_NUM_OBJECTS) {
+ if (kthread_should_stop() || (i == CONFIG_KFENCE_NUM_OBJECTS)) {
kunit_warn(test, "giving up ... cannot get same object back\n");
return;
}
--
2.18.0.huawei.25
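
The stop-on-timeout pattern described in this patch can be sketched in
userspace C with pthreads. This is only an illustration under
assumptions, not the kernel code: struct test_ctx, worker and
run_with_timeout are hypothetical names, an atomic flag stands in for
kthread_should_stop(), and setting that flag followed by pthread_join()
stands in for kthread_stop(), which sets the should-stop flag and waits
for the thread to exit.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the context shared with the test thread. */
struct test_ctx {
	atomic_bool should_stop;	/* plays the role of kthread_should_stop() */
	bool done;			/* protected by lock */
	long limit;			/* iterations the worker wants to run */
	long iterations;		/* written by the worker, read after join */
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

static void *worker(void *arg)
{
	struct test_ctx *ctx = arg;
	struct timespec nap = { 0, 1000000 };	/* 1 ms of "work" per loop */

	for (long i = 0; i < ctx->limit; i++) {
		/* Give up early when the waiter asked us to stop. */
		if (atomic_load(&ctx->should_stop))
			break;
		ctx->iterations = i + 1;
		nanosleep(&nap, NULL);
	}

	pthread_mutex_lock(&ctx->lock);
	ctx->done = true;
	pthread_cond_signal(&ctx->cond);
	pthread_mutex_unlock(&ctx->lock);
	return NULL;
}

/* Returns 0 if the worker finished in time, -ETIMEDOUT otherwise. */
static int run_with_timeout(long limit, int timeout_ms, long *iterations_out)
{
	struct test_ctx ctx = { .limit = limit };	/* lives on this stack frame */
	struct timespec deadline;
	pthread_t tid;
	int ret = 0;

	pthread_mutex_init(&ctx.lock, NULL);
	pthread_cond_init(&ctx.cond, NULL);
	if (pthread_create(&tid, NULL, worker, &ctx) != 0) {
		pthread_cond_destroy(&ctx.cond);
		pthread_mutex_destroy(&ctx.lock);
		return -EAGAIN;
	}

	/* Build an absolute deadline for pthread_cond_timedwait(). */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&ctx.lock);
	while (!ctx.done) {
		if (pthread_cond_timedwait(&ctx.cond, &ctx.lock, &deadline) == ETIMEDOUT) {
			if (!ctx.done)
				ret = -ETIMEDOUT;
			break;
		}
	}
	pthread_mutex_unlock(&ctx.lock);

	/*
	 * Without these two lines a timed-out caller would return while the
	 * worker still dereferences ctx: the use-after-free described above.
	 */
	atomic_store(&ctx.should_stop, true);
	pthread_join(tid, NULL);

	*iterations_out = ctx.iterations;
	pthread_cond_destroy(&ctx.cond);
	pthread_mutex_destroy(&ctx.lock);
	return ret;
}
```

With this sketch, a call such as run_with_timeout(1000000, 50, &n)
reports -ETIMEDOUT after the 50 ms deadline, but only returns once the
worker has seen the flag and exited, so ctx is never accessed after its
stack frame is gone. Build with -pthread.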

2022-03-09 09:14:24

by Peng Liu

Subject: [PATCH v2 3/3] kfence: test: try to avoid test_gfpzero trigger rcu_stall

When CONFIG_KFENCE_NUM_OBJECTS is set to a large number, the kfence
KUnit test case test_gfpzero consumes nearly all of a CPU's resources
and an RCU stall is reported, as shown in the following log taken
from a physical server.

rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002
softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019)
Task dump for CPU 68:
task:kunit_try_catch state:R running task
stack: 0 pid: 9728 ppid: 2 flags:0x0000020a
Call trace:
dump_backtrace+0x0/0x1e4
show_stack+0x20/0x2c
sched_show_task+0x148/0x170
...
rcu_sched_clock_irq+0x70/0x180
update_process_times+0x68/0xb0
tick_sched_handle+0x38/0x74
...
gic_handle_irq+0x78/0x2c0
el1_irq+0xb8/0x140
kfree+0xd8/0x53c
test_alloc+0x264/0x310 [kfence_test]
test_gfpzero+0xf4/0x840 [kfence_test]
kunit_try_run_case+0x48/0x20c
kunit_generic_run_threadfn_adapter+0x28/0x34
kthread+0x108/0x13c
ret_from_fork+0x10/0x18

To avoid the RCU stall and unacceptable latency, a scheduling point
is added to test_gfpzero.

Signed-off-by: Peng Liu <[email protected]>
---
mm/kfence/kfence_test.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index caed6b4eba94..1b50f70a4c0f 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -627,6 +627,7 @@ static void test_gfpzero(struct kunit *test)
kunit_warn(test, "giving up ... cannot get same object back\n");
return;
}
+ cond_resched();
}

for (i = 0; i < size; i++)
--
2.18.0.huawei.25

2022-03-09 11:16:41

by Marco Elver

Subject: Re: [PATCH v2 1/3] kunit: fix UAF when run kfence test case test_gfpzero

On Wed, 9 Mar 2022 at 09:19, 'Peng Liu' via kasan-dev
<[email protected]> wrote:
>
> KUnit creates a new thread to run the actual test case, and the main
> process waits for the test thread to complete until a timeout expires.
> The variable "struct kunit test" is local to the function
> kunit_try_catch_run and is also used by the test case thread. When
> KUnit times out, kunit_try_catch_run frees "struct kunit test" while
> the test case is still running, which triggers a UAF bug.
>
> The problem has been observed on both a physical machine and a QEMU
> platform when running the kfence KUnit tests. It can be triggered by
> setting CONFIG_KFENCE_NUM_OBJECTS = 65535. With this setting, the
> test case test_gfpzero takes hours and KUnit times out. The panic log
> follows.
>
> BUG: unable to handle page fault for address: ffffffff82d882e9
>
> Call Trace:
> kunit_log_append+0x58/0xd0
> ...
> test_alloc.constprop.0.cold+0x6b/0x8a [kfence_test]
> test_gfpzero.cold+0x61/0x8ab [kfence_test]
> kunit_try_run_case+0x4c/0x70
> kunit_generic_run_threadfn_adapter+0x11/0x20
> kthread+0x166/0x190
> ret_from_fork+0x22/0x30
> Kernel panic - not syncing: Fatal exception
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>
> To solve this problem, the test case thread should be stopped when
> the KUnit framework times out. The stop signal is sent in
> kunit_try_catch_run, and test_gfpzero handles it.
>
> Signed-off-by: Peng Liu <[email protected]>

Reviewed-by: Marco Elver <[email protected]>

Also Cc'ing more KUnit folks to double-check this is the right solution.

> ---
> lib/kunit/try-catch.c | 1 +
> mm/kfence/kfence_test.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> index be38a2c5ecc2..6b3d4db94077 100644
> --- a/lib/kunit/try-catch.c
> +++ b/lib/kunit/try-catch.c
> @@ -78,6 +78,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
> if (time_remaining == 0) {
> kunit_err(test, "try timed out\n");
> try_catch->try_result = -ETIMEDOUT;
> + kthread_stop(task_struct);
> }
>
> exit_code = try_catch->try_result;
> diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
> index 50dbb815a2a8..caed6b4eba94 100644
> --- a/mm/kfence/kfence_test.c
> +++ b/mm/kfence/kfence_test.c
> @@ -623,7 +623,7 @@ static void test_gfpzero(struct kunit *test)
> break;
> test_free(buf2);
>
> - if (i == CONFIG_KFENCE_NUM_OBJECTS) {
> + if (kthread_should_stop() || (i == CONFIG_KFENCE_NUM_OBJECTS)) {
> kunit_warn(test, "giving up ... cannot get same object back\n");
> return;
> }
> --
> 2.18.0.huawei.25

2022-03-10 08:41:29

by Brendan Higgins

Subject: Re: [PATCH v2 1/3] kunit: fix UAF when run kfence test case test_gfpzero

On Wed, Mar 9, 2022 at 3:19 AM 'Peng Liu' via KUnit Development
<[email protected]> wrote:
>
> KUnit creates a new thread to run the actual test case, and the main
> process waits for the test thread to complete until a timeout expires.
> The variable "struct kunit test" is local to the function
> kunit_try_catch_run and is also used by the test case thread. When
> KUnit times out, kunit_try_catch_run frees "struct kunit test" while
> the test case is still running, which triggers a UAF bug.
>
> The problem has been observed on both a physical machine and a QEMU
> platform when running the kfence KUnit tests. It can be triggered by
> setting CONFIG_KFENCE_NUM_OBJECTS = 65535. With this setting, the
> test case test_gfpzero takes hours and KUnit times out. The panic log
> follows.
>
> BUG: unable to handle page fault for address: ffffffff82d882e9
>
> Call Trace:
> kunit_log_append+0x58/0xd0
> ...
> test_alloc.constprop.0.cold+0x6b/0x8a [kfence_test]
> test_gfpzero.cold+0x61/0x8ab [kfence_test]
> kunit_try_run_case+0x4c/0x70
> kunit_generic_run_threadfn_adapter+0x11/0x20
> kthread+0x166/0x190
> ret_from_fork+0x22/0x30
> Kernel panic - not syncing: Fatal exception
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
>
> To solve this problem, the test case thread should be stopped when
> the KUnit framework times out. The stop signal is sent in
> kunit_try_catch_run, and test_gfpzero handles it.
>
> Signed-off-by: Peng Liu <[email protected]>

Thanks for taking care of this.

Reviewed-by: Brendan Higgins <[email protected]>

2022-03-11 23:17:36

by Brendan Higgins

Subject: Re: [PATCH v2 0/3] kunit: fix a UAF bug and do some optimization

On Wed, Mar 9, 2022 at 3:19 AM 'Peng Liu' via KUnit Development
<[email protected]> wrote:
>
> This series fixes a use-after-free (UAF) that occurs when running the
> kfence test case test_gfpzero, which is very time-consuming. The UAF
> can be easily triggered by setting CONFIG_KFENCE_NUM_OBJECTS = 65535.
> The series also includes some optimizations for kunit tests.

I was able to reproduce the error you described and can confirm that I
didn't see the UAF after applying your patches.

Tested-by: Brendan Higgins <[email protected]>