2021-04-21 10:52:53

by Marco Elver

[permalink] [raw]
Subject: [PATCH v2 0/3] kfence: optimize timer scheduling

We have observed that mostly-idle systems with KFENCE enabled wake up
otherwise idle CPUs, preventing such to enter a lower power state.
Debugging revealed that KFENCE spends too much active time in
toggle_allocation_gate().

While the first version of KFENCE was using all the right bits to be
scheduling optimal, and thus power efficient, by simply using
wait_event() + wake_up(), that code was unfortunately removed.

As KFENCE was exposed to various different configs and tests, the
scheduling optimal code slowly disappeared. First because of hung task
warnings, and finally because of deadlocks when an allocation is made by
timer code with debug objects enabled. Clearly, the "fixes" were not too
friendly for devices that want to be power efficient.

Therefore, let's try a little harder to fix the hung task and deadlock
problems that we have with wait_event() + wake_up(), while remaining as
scheduling friendly and power efficient as possible.

Crucially, we need to defer the wake_up() to an irq_work, avoiding any
potential for deadlock.

The result with this series is that on the devices where we observed a
power regression, power usage returns back to baseline levels.

Changelog
---------

v2:
* Replace kfence_timer_waiting with simpler waitqueue_active() check.

v1: https://lkml.kernel.org/r/[email protected]

Marco Elver (3):
kfence: await for allocation using wait_event
kfence: maximize allocation wait timeout duration
kfence: use power-efficient work queue to run delayed work

lib/Kconfig.kfence | 1 +
mm/kfence/core.c | 58 ++++++++++++++++++++++++++++++++--------------
2 files changed, 42 insertions(+), 17 deletions(-)

--
2.31.1.368.gbe11c130af-goog


2021-04-21 10:52:58

by Marco Elver

[permalink] [raw]
Subject: [PATCH v2 1/3] kfence: await for allocation using wait_event

On mostly-idle systems, we have observed that toggle_allocation_gate()
is a cause of frequent wake-ups, preventing an otherwise idle CPU to go
into a lower power state.

A late change in KFENCE's development, due to a potential deadlock [1],
required changing the scheduling-friendly wait_event_timeout() and
wake_up() to an open-coded wait-loop using schedule_timeout().
[1] https://lkml.kernel.org/r/[email protected]

To avoid unnecessary wake-ups, switch to using wait_event_timeout().

Unfortunately, we still cannot use a version with direct wake_up() in
__kfence_alloc() due to the same potential for deadlock as in [1].
Instead, add a level of indirection via an irq_work that is scheduled if
we determine that the kfence_timer requires a wake_up().

Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <[email protected]>
---
v2:
* Replace kfence_timer_waiting with simpler waitqueue_active() check.
---
lib/Kconfig.kfence | 1 +
mm/kfence/core.c | 45 +++++++++++++++++++++++++++++----------------
2 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/lib/Kconfig.kfence b/lib/Kconfig.kfence
index 78f50ccb3b45..e641add33947 100644
--- a/lib/Kconfig.kfence
+++ b/lib/Kconfig.kfence
@@ -7,6 +7,7 @@ menuconfig KFENCE
bool "KFENCE: low-overhead sampling-based memory safety error detector"
depends on HAVE_ARCH_KFENCE && (SLAB || SLUB)
select STACKTRACE
+ select IRQ_WORK
help
KFENCE is a low-overhead sampling-based detector of heap out-of-bounds
access, use-after-free, and invalid-free errors. KFENCE is designed
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 768dbd58170d..235d726f88bc 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -10,6 +10,7 @@
#include <linux/atomic.h>
#include <linux/bug.h>
#include <linux/debugfs.h>
+#include <linux/irq_work.h>
#include <linux/kcsan-checks.h>
#include <linux/kfence.h>
#include <linux/kmemleak.h>
@@ -587,6 +588,17 @@ late_initcall(kfence_debugfs_init);

/* === Allocation Gate Timer ================================================ */

+#ifdef CONFIG_KFENCE_STATIC_KEYS
+/* Wait queue to wake up allocation-gate timer task. */
+static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
+
+static void wake_up_kfence_timer(struct irq_work *work)
+{
+ wake_up(&allocation_wait);
+}
+static DEFINE_IRQ_WORK(wake_up_kfence_timer_work, wake_up_kfence_timer);
+#endif
+
/*
* Set up delayed work, which will enable and disable the static key. We need to
* use a work queue (rather than a simple timer), since enabling and disabling a
@@ -604,25 +616,13 @@ static void toggle_allocation_gate(struct work_struct *work)
if (!READ_ONCE(kfence_enabled))
return;

- /* Enable static key, and await allocation to happen. */
atomic_set(&kfence_allocation_gate, 0);
#ifdef CONFIG_KFENCE_STATIC_KEYS
+ /* Enable static key, and await allocation to happen. */
static_branch_enable(&kfence_allocation_key);
- /*
- * Await an allocation. Timeout after 1 second, in case the kernel stops
- * doing allocations, to avoid stalling this worker task for too long.
- */
- {
- unsigned long end_wait = jiffies + HZ;
-
- do {
- set_current_state(TASK_UNINTERRUPTIBLE);
- if (atomic_read(&kfence_allocation_gate) != 0)
- break;
- schedule_timeout(1);
- } while (time_before(jiffies, end_wait));
- __set_current_state(TASK_RUNNING);
- }
+
+ wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
+
/* Disable static key and reset timer. */
static_branch_disable(&kfence_allocation_key);
#endif
@@ -729,6 +729,19 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
*/
if (atomic_read(&kfence_allocation_gate) || atomic_inc_return(&kfence_allocation_gate) > 1)
return NULL;
+#ifdef CONFIG_KFENCE_STATIC_KEYS
+ /*
+ * waitqueue_active() is fully ordered after the update of
+ * kfence_allocation_gate per atomic_inc_return().
+ */
+ if (waitqueue_active(&allocation_wait)) {
+ /*
+ * Calling wake_up() here may deadlock when allocations happen
+ * from within timer code. Use an irq_work to defer it.
+ */
+ irq_work_queue(&wake_up_kfence_timer_work);
+ }
+#endif

if (!READ_ONCE(kfence_enabled))
return NULL;
--
2.31.1.368.gbe11c130af-goog

2021-04-21 10:53:39

by Marco Elver

[permalink] [raw]
Subject: [PATCH v2 3/3] kfence: use power-efficient work queue to run delayed work

Use the power-efficient work queue, to avoid the pathological case where
we keep pinning ourselves on the same possibly idle CPU on systems that
want to be power-efficient [1].
[1] https://lwn.net/Articles/731052/

Signed-off-by: Marco Elver <[email protected]>
---
mm/kfence/core.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 9742649f3f88..e18fbbd5d9b4 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -636,7 +636,8 @@ static void toggle_allocation_gate(struct work_struct *work)
/* Disable static key and reset timer. */
static_branch_disable(&kfence_allocation_key);
#endif
- schedule_delayed_work(&kfence_timer, msecs_to_jiffies(kfence_sample_interval));
+ queue_delayed_work(system_power_efficient_wq, &kfence_timer,
+ msecs_to_jiffies(kfence_sample_interval));
}
static DECLARE_DELAYED_WORK(kfence_timer, toggle_allocation_gate);

@@ -665,7 +666,7 @@ void __init kfence_init(void)
}

WRITE_ONCE(kfence_enabled, true);
- schedule_delayed_work(&kfence_timer, 0);
+ queue_delayed_work(system_power_efficient_wq, &kfence_timer, 0);
pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
CONFIG_KFENCE_NUM_OBJECTS, (void *)__kfence_pool,
(void *)(__kfence_pool + KFENCE_POOL_SIZE));
--
2.31.1.368.gbe11c130af-goog

2021-04-21 10:54:13

by Marco Elver

[permalink] [raw]
Subject: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

The allocation wait timeout was initially added because of warnings due
to CONFIG_DETECT_HUNG_TASK=y [1]. While the 1 sec timeout is sufficient
to resolve the warnings (given the hung task timeout must be 1 sec or
larger) it may cause unnecessary wake-ups if the system is idle.
[1] https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com

Fix it by computing the timeout duration in terms of the current
sysctl_hung_task_timeout_secs value.

Signed-off-by: Marco Elver <[email protected]>
---
mm/kfence/core.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 235d726f88bc..9742649f3f88 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -20,6 +20,7 @@
#include <linux/moduleparam.h>
#include <linux/random.h>
#include <linux/rcupdate.h>
+#include <linux/sched/sysctl.h>
#include <linux/seq_file.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
@@ -621,7 +622,16 @@ static void toggle_allocation_gate(struct work_struct *work)
/* Enable static key, and await allocation to happen. */
static_branch_enable(&kfence_allocation_key);

- wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
+ if (sysctl_hung_task_timeout_secs) {
+ /*
+ * During low activity with no allocations we might wait a
+ * while; let's avoid the hung task warning.
+ */
+ wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
+ sysctl_hung_task_timeout_secs * HZ / 2);
+ } else {
+ wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
+ }

/* Disable static key and reset timer. */
static_branch_disable(&kfence_allocation_key);
--
2.31.1.368.gbe11c130af-goog

2021-09-16 01:03:31

by Kefeng Wang

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration


On 2021/4/21 18:51, Marco Elver wrote:
> The allocation wait timeout was initially added because of warnings due
> to CONFIG_DETECT_HUNG_TASK=y [1]. While the 1 sec timeout is sufficient
> to resolve the warnings (given the hung task timeout must be 1 sec or
> larger) it may cause unnecessary wake-ups if the system is idle.
> [1] https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
>
> Fix it by computing the timeout duration in terms of the current
> sysctl_hung_task_timeout_secs value.
>
> Signed-off-by: Marco Elver <[email protected]>
> ---
> mm/kfence/core.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 235d726f88bc..9742649f3f88 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -20,6 +20,7 @@
> #include <linux/moduleparam.h>
> #include <linux/random.h>
> #include <linux/rcupdate.h>
> +#include <linux/sched/sysctl.h>
> #include <linux/seq_file.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> @@ -621,7 +622,16 @@ static void toggle_allocation_gate(struct work_struct *work)
> /* Enable static key, and await allocation to happen. */
> static_branch_enable(&kfence_allocation_key);
>
> - wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
> + if (sysctl_hung_task_timeout_secs) {
> + /*
> + * During low activity with no allocations we might wait a
> + * while; let's avoid the hung task warning.
> + */
> + wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
> + sysctl_hung_task_timeout_secs * HZ / 2);
> + } else {
> + wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
> + }
>
> /* Disable static key and reset timer. */
> static_branch_disable(&kfence_allocation_key);

2021-09-16 01:21:43

by Kefeng Wang

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

Hi Marco,

We found kfence_test will fails  on ARM64 with this patch with/without 
CONFIG_DETECT_HUNG_TASK,

Any thought ?


On 2021/9/16 9:02, Kefeng Wang wrote:
>
> On 2021/4/21 18:51, Marco Elver wrote:
>> The allocation wait timeout was initially added because of warnings due
>> to CONFIG_DETECT_HUNG_TASK=y [1]. While the 1 sec timeout is sufficient
>> to resolve the warnings (given the hung task timeout must be 1 sec or
>> larger) it may cause unnecessary wake-ups if the system is idle.
>> [1]
>> https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
>>
>> Fix it by computing the timeout duration in terms of the current
>> sysctl_hung_task_timeout_secs value.
>>
>> Signed-off-by: Marco Elver <[email protected]>
>> ---
>>   mm/kfence/core.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
>> index 235d726f88bc..9742649f3f88 100644
>> --- a/mm/kfence/core.c
>> +++ b/mm/kfence/core.c
>> @@ -20,6 +20,7 @@
>>   #include <linux/moduleparam.h>
>>   #include <linux/random.h>
>>   #include <linux/rcupdate.h>
>> +#include <linux/sched/sysctl.h>
>>   #include <linux/seq_file.h>
>>   #include <linux/slab.h>
>>   #include <linux/spinlock.h>
>> @@ -621,7 +622,16 @@ static void toggle_allocation_gate(struct
>> work_struct *work)
>>       /* Enable static key, and await allocation to happen. */
>>       static_branch_enable(&kfence_allocation_key);
>>   -    wait_event_timeout(allocation_wait,
>> atomic_read(&kfence_allocation_gate), HZ);
>> +    if (sysctl_hung_task_timeout_secs) {
>> +        /*
>> +         * During low activity with no allocations we might wait a
>> +         * while; let's avoid the hung task warning.
>> +         */
>> +        wait_event_timeout(allocation_wait,
>> atomic_read(&kfence_allocation_gate),
>> +                   sysctl_hung_task_timeout_secs * HZ / 2);
>> +    } else {
>> +        wait_event(allocation_wait,
>> atomic_read(&kfence_allocation_gate));
>> +    }
>>         /* Disable static key and reset timer. */
>>       static_branch_disable(&kfence_allocation_key);

2021-09-16 08:52:21

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <[email protected]> wrote:
> Hi Marco,
>
> We found kfence_test will fails on ARM64 with this patch with/without
> CONFIG_DETECT_HUNG_TASK,
>
> Any thought ?

Please share log and instructions to reproduce if possible. Also, if
possible, please share bisection log that led you to this patch.

I currently do not see how this patch would cause that, it only
increases the timeout duration.

I know that under QEMU TCG mode, there are occasionally timeouts in
the test simply due to QEMU being extremely slow or other weirdness.

2021-09-16 15:50:14

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

On Thu, 16 Sept 2021 at 17:45, David Laight <[email protected]> wrote:
>
> From: Kefeng Wang
> > Sent: 16 September 2021 02:21
> >
> > We found kfence_test will fails on ARM64 with this patch with/without
> > CONFIG_DETECT_HUNG_TASK,
> >
> > Any thought ?
> >
> ...
> > >> /* Enable static key, and await allocation to happen. */
> > >> static_branch_enable(&kfence_allocation_key);
> > >> - wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
> > >> + if (sysctl_hung_task_timeout_secs) {
> > >> + /*
> > >> + * During low activity with no allocations we might wait a
> > >> + * while; let's avoid the hung task warning.
> > >> + */
> > >> + wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
> > >> + sysctl_hung_task_timeout_secs * HZ / 2);
> > >> + } else {
> > >> + wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
> > >> + }
> > >> /* Disable static key and reset timer. */
> > >> static_branch_disable(&kfence_allocation_key);
>
> It has replaced a wait_event_timeout() with a wait_event().
>
> That probably isn't intended.
> Although I'd expect their to be some test for the wait being
> signalled or timing out.

It is intended -- there's a wake_up() for this. See the whole patch
series for explanation.

The whole reason we had the timeout was to avoid the hung task
warnings, but we can do better if there is no hung task warning
enabled.

2021-09-16 19:22:38

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

From: Kefeng Wang
> Sent: 16 September 2021 02:21
>
> We found kfence_test will fails  on ARM64 with this patch with/without
> CONFIG_DETECT_HUNG_TASK,
>
> Any thought ?
>
...
> >>       /* Enable static key, and await allocation to happen. */
> >>       static_branch_enable(&kfence_allocation_key);
> >>   -    wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
> >> +    if (sysctl_hung_task_timeout_secs) {
> >> +        /*
> >> +         * During low activity with no allocations we might wait a
> >> +         * while; let's avoid the hung task warning.
> >> +         */
> >> +        wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
> >> +                   sysctl_hung_task_timeout_secs * HZ / 2);
> >> +    } else {
> >> +        wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
> >> +    }
> >>         /* Disable static key and reset timer. */
> >>       static_branch_disable(&kfence_allocation_key);

It has replaced a wait_event_timeout() with a wait_event().

That probably isn't intended.
Although I'd expect their to be some test for the wait being
signalled or timing out.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2021-09-18 16:07:56

by Liu Shixin

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration


On 2021/9/16 16:49, Marco Elver wrote:
> On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <[email protected]> wrote:
>> Hi Marco,
>>
>> We found kfence_test will fails on ARM64 with this patch with/without
>> CONFIG_DETECT_HUNG_TASK,
>>
>> Any thought ?
> Please share log and instructions to reproduce if possible. Also, if
> possible, please share bisection log that led you to this patch.
>
> I currently do not see how this patch would cause that, it only
> increases the timeout duration.
>
> I know that under QEMU TCG mode, there are occasionally timeouts in
> the test simply due to QEMU being extremely slow or other weirdness.
>
> .
>
Hi Marco,

There are some of the results of the current test:
1. Using qemu-kvm on arm64 machine, all testcase can pass.
2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail.
3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass.

I add some printing to the kernel and get very strange results.
I add a new variable kfence_allocation_key_gate to track the
state of kfence_allocation_key. As shown in the following code, theoretically,
if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
enabled, so the value of variable error in kfence_alloc() should always be
zero. In fact, all the passed testcases fit this point. But as shown in the
following failed log, although kfence_allocation_key has been enabled, it's
still check failed here.

So I think static_key might be problematic in my qemu environment.
The change of timeout is not a problem but caused us to observe this problem.
I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled
in each loop, then the failed testcase disappears.

[ 3.463519] # Subtest: kfence
[ 3.463629] 1..25
[ 3.465548] # test_out_of_bounds_read: test_alloc: size=128, gfp=cc0, policy=left, cache=0
[ 3.561001] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 3.561934] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 3.665449] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 13.464796] --------------kfence_allocation_key check failed 13839286 times----------------
[ 13.467482] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 13.469166] # test_out_of_bounds_read: ASSERTION FAILED at mm/kfence/kfence_test.c:308
[ 13.469166] Expected false to be true, but is false
[ 13.469166]
[ 13.469166] failed to allocate from KFENCE
[ 13.473592] not ok 1 - test_out_of_bounds_read


diff --git a/include/linux/kfence.h b/include/linux/kfence.h
index 3fe6dd8a18c1..e72889606e82 100644
--- a/include/linux/kfence.h
+++ b/include/linux/kfence.h
@@ -25,6 +25,7 @@ extern char *__kfence_pool;
#ifdef CONFIG_KFENCE_STATIC_KEYS
#include <linux/static_key.h>
DECLARE_STATIC_KEY_FALSE(kfence_allocation_key);
+extern atomic_t kfence_allocation_key_gate;
#else
#include <linux/atomic.h>
extern atomic_t kfence_allocation_gate;
@@ -116,12 +117,20 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);
*/
static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
{
+ static int error;
#ifdef CONFIG_KFENCE_STATIC_KEYS
- if (static_branch_unlikely(&kfence_allocation_key))
+ if (static_branch_unlikely(&kfence_allocation_key)) {
#else
- if (unlikely(!atomic_read(&kfence_allocation_gate)))
+ if (unlikely(!atomic_read(&kfence_allocation_gate))) {
#endif
+ if (error) {
+ pr_info("--------------kfence_allocation_key check failed %d times----------------\n", error);
+ error = 0;
+ }
return __kfence_alloc(s, size, flags);
+ }
+ if (!atomic_read(&kfence_allocation_key_gate))
+ error++;
return NULL;
}
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 7a97db8bc8e7..637c2efa6133 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -100,6 +100,7 @@ static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. *
#ifdef CONFIG_KFENCE_STATIC_KEYS
/* The static key to set up a KFENCE allocation. */
DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);
+atomic_t kfence_allocation_key_gate = ATOMIC_INIT(1);
#endif

/* Gates the allocation, ensuring only one succeeds in a given period. */
@@ -624,7 +625,9 @@ static void toggle_allocation_gate(struct work_struct *work)
#ifdef CONFIG_KFENCE_STATIC_KEYS
/* Enable static key, and await allocation to happen. */
static_branch_enable(&kfence_allocation_key);
-
+ if (static_branch_unlikely(&kfence_allocation_key))
+ pr_info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n");
+ atomic_set(&kfence_allocation_key_gate, 0);
if (sysctl_hung_task_timeout_secs) {
/*
* During low activity with no allocations we might wait a
@@ -637,7 +640,10 @@ static void toggle_allocation_gate(struct work_struct *work)
}

/* Disable static key and reset timer. */
+ atomic_set(&kfence_allocation_key_gate, 1);
static_branch_disable(&kfence_allocation_key);
+ if (!static_branch_unlikely(&kfence_allocation_key))
+ pr_info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n");
#endif
queue_delayed_work(system_unbound_wq, &kfence_timer,
msecs_to_jiffies(kfence_sample_interval));

thanks,


2021-09-18 17:33:43

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

On Sat, 18 Sept 2021 at 10:07, Liu Shixin <[email protected]> wrote:
>
> On 2021/9/16 16:49, Marco Elver wrote:
> > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <[email protected]> wrote:
> >> Hi Marco,
> >>
> >> We found kfence_test will fails on ARM64 with this patch with/without
> >> CONFIG_DETECT_HUNG_TASK,
> >>
> >> Any thought ?
> > Please share log and instructions to reproduce if possible. Also, if
> > possible, please share bisection log that led you to this patch.
> >
> > I currently do not see how this patch would cause that, it only
> > increases the timeout duration.
> >
> > I know that under QEMU TCG mode, there are occasionally timeouts in
> > the test simply due to QEMU being extremely slow or other weirdness.
> >
> >
> Hi Marco,
>
> There are some of the results of the current test:
> 1. Using qemu-kvm on arm64 machine, all testcase can pass.
> 2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail.
> 3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass.
>
> I add some printing to the kernel and get very strange results.
> I add a new variable kfence_allocation_key_gate to track the
> state of kfence_allocation_key. As shown in the following code, theoretically,
> if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
> enabled, so the value of variable error in kfence_alloc() should always be
> zero. In fact, all the passed testcases fit this point. But as shown in the
> following failed log, although kfence_allocation_key has been enabled, it's
> still check failed here.
>
> So I think static_key might be problematic in my qemu environment.
> The change of timeout is not a problem but caused us to observe this problem.
> I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled
> in each loop, then the failed testcase disappears.

Nice analysis, thanks! What I gather is that static_keys/jump_labels
are somehow broken in QEMU.

This does remind me that I found a bug in QEMU that might be relevant:
https://bugs.launchpad.net/qemu/+bug/1920934
Looks like it was never fixed. :-/

The failures I encountered caused the kernel to crash, but never saw
the kfence test to fail due to that (never managed to get that far).
Though the bug I saw was on x86 TCG mode, and I never tried arm64. If
you can, try to build a QEMU with ASan and see if you also get the
same use-after-free bug.

Unless we observe the problem on a real machine, I think for now we
can conclude with fairly high confidence that QEMU TCG still has
issues and cannot be fully trusted here (see bug above).

Thanks,
-- Marco

2021-09-18 17:42:26

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration

On Sat, 18 Sept 2021 at 11:37, Marco Elver <[email protected]> wrote:
>
> On Sat, 18 Sept 2021 at 10:07, Liu Shixin <[email protected]> wrote:
> >
> > On 2021/9/16 16:49, Marco Elver wrote:
> > > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang <[email protected]> wrote:
> > >> Hi Marco,
> > >>
> > >> We found kfence_test will fails on ARM64 with this patch with/without
> > >> CONFIG_DETECT_HUNG_TASK,
> > >>
> > >> Any thought ?
> > > Please share log and instructions to reproduce if possible. Also, if
> > > possible, please share bisection log that led you to this patch.
> > >
> > > I currently do not see how this patch would cause that, it only
> > > increases the timeout duration.
> > >
> > > I know that under QEMU TCG mode, there are occasionally timeouts in
> > > the test simply due to QEMU being extremely slow or other weirdness.
> > >
> > >
> > Hi Marco,
> >
> > There are some of the results of the current test:
> > 1. Using qemu-kvm on arm64 machine, all testcase can pass.
> > 2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail.
> > 3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass.
> >
> > I add some printing to the kernel and get very strange results.
> > I add a new variable kfence_allocation_key_gate to track the
> > state of kfence_allocation_key. As shown in the following code, theoretically,
> > if kfence_allocation_key_gate is zero, then kfence_allocation_key must be
> > enabled, so the value of variable error in kfence_alloc() should always be
> > zero. In fact, all the passed testcases fit this point. But as shown in the
> > following failed log, although kfence_allocation_key has been enabled, it's
> > still check failed here.
> >
> > So I think static_key might be problematic in my qemu environment.
> > The change of timeout is not a problem but caused us to observe this problem.
> > I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled
> > in each loop, then the failed testcase disappears.
>
> Nice analysis, thanks! What I gather is that static_keys/jump_labels
> are somehow broken in QEMU.
>
> This does remind me that I found a bug in QEMU that might be relevant:
> https://bugs.launchpad.net/qemu/+bug/1920934
> Looks like it was never fixed. :-/
>
> The failures I encountered caused the kernel to crash, but never saw
> the kfence test to fail due to that (never managed to get that far).
> Though the bug I saw was on x86 TCG mode, and I never tried arm64. If

[ ... that is, I didn't try running QEMU-ASan in arm64 TCG mode ... of
course I use QEMU arm64 to test. ;-) ]

> you can, try to build a QEMU with ASan and see if you also get the
> same use-after-free bug.
>
> Unless we observe the problem on a real machine, I think for now we
> can conclude with fairly high confidence that QEMU TCG still has
> issues and cannot be fully trusted here (see bug above).
>
> Thanks,
> -- Marco