v2:
- Update patch 3 to count the objects checked, instead of just the gray
  ones, to determine when to do cond_resched(). This is more reliable.
There are 3 RCU-based object iteration loops in kmemleak_scan(). Because
of the need to hold the RCU read lock, we can't simply insert a
cond_resched() call into the loop as is done in other parts of the
function. As there can be millions of objects to be scanned, it takes a
while to iterate through all of them. The kmemleak functionality is
usually enabled in a debug kernel, which is much slower than a non-debug
kernel. With a sufficient number of kmemleak objects, the time to
iterate through all of them may exceed 22 seconds, causing a soft lockup:
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kmemleak:625]
This patch series makes changes to the 3 object iteration loops in
kmemleak_scan() to prevent them from causing soft lockups.
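The problematic pattern, in simplified form, is an RCU-protected list
walk with no rescheduling point (a rough sketch of the existing loops,
not the exact kmemleak code):

  rcu_read_lock();
  list_for_each_entry_rcu(object, &object_list, object_list) {
          raw_spin_lock_irq(&object->lock);
          /* ... check and/or update the object ... */
          raw_spin_unlock_irq(&object->lock);
          /*
           * No cond_resched() can be placed here while the RCU read
           * lock is held, so a very long list walk never reschedules.
           */
  }
  rcu_read_unlock();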
Waiman Long (3):
mm/kmemleak: Use _irq lock/unlock variants in kmemleak_scan/_clear()
mm/kmemleak: Skip unlikely objects in kmemleak_scan() without taking
lock
mm/kmemleak: Prevent soft lockup in first object iteration loop of
kmemleak_scan()
mm/kmemleak.c | 60 +++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 49 insertions(+), 11 deletions(-)
--
2.31.1
The first RCU-based object iteration loop has to modify the object
count, so we cannot skip taking the object lock.

One way to avoid a soft lockup is to insert an occasional cond_resched()
call into the loop. This cannot be done while holding the RCU read
lock, which protects objects from being freed. However, taking a
reference to the object will prevent it from being freed. We can then
safely do a cond_resched() call after every 64k objects. A rough sketch
of the resulting loop structure is shown below, before the actual diff.
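Simplified sketch of the post-patch first loop (the DEBUG block is
omitted; the real change is in the diff that follows):

  rcu_read_lock();
  list_for_each_entry_rcu(object, &object_list, object_list) {
          bool obj_pinned = false;

          loop1_cnt++;
          raw_spin_lock_irq(&object->lock);
          /* reset the reference count (whiten the object) */
          object->count = 0;
          if (color_gray(object) && get_object(object)) {
                  list_add_tail(&object->gray_list, &gray_list);
                  obj_pinned = true;      /* reference already held */
          }
          raw_spin_unlock_irq(&object->lock);

          /* Reschedule every 64k objects to avoid a soft lockup */
          if (!(loop1_cnt & 0xffff)) {
                  /*
                   * Take a reference so the object cannot be freed
                   * while the RCU read lock is dropped.
                   */
                  if (!obj_pinned && !get_object(object)) {
                          loop1_cnt--;    /* try the next object */
                          continue;
                  }

                  rcu_read_unlock();
                  cond_resched();
                  rcu_read_lock();

                  if (!obj_pinned)
                          put_object(object);
          }
  }
  rcu_read_unlock();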
Signed-off-by: Waiman Long <[email protected]>
---
mm/kmemleak.c | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7dd64139a7c7..abba063ae5ee 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
struct zone *zone;
int __maybe_unused i;
int new_leaks = 0;
+ int loop1_cnt = 0;
jiffies_last_scan = jiffies;
/* prepare the kmemleak_object's */
rcu_read_lock();
list_for_each_entry_rcu(object, &object_list, object_list) {
+ bool obj_pinned = false;
+
+ loop1_cnt++;
raw_spin_lock_irq(&object->lock);
#ifdef DEBUG
/*
@@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
#endif
/* reset the reference count (whiten the object) */
object->count = 0;
- if (color_gray(object) && get_object(object))
+ if (color_gray(object) && get_object(object)) {
list_add_tail(&object->gray_list, &gray_list);
+ obj_pinned = true;
+ }
raw_spin_unlock_irq(&object->lock);
+
+ /*
+ * Do a cond_resched() to avoid soft lockup every 64k objects.
+ * Make sure a reference has been taken so that the object
+ * won't go away without RCU read lock.
+ */
+ if (!(loop1_cnt & 0xffff)) {
+ if (!obj_pinned && !get_object(object)) {
+ /* Try the next object instead */
+ loop1_cnt--;
+ continue;
+ }
+
+ rcu_read_unlock();
+ cond_resched();
+ rcu_read_lock();
+
+ if (!obj_pinned)
+ put_object(object);
+ }
}
rcu_read_unlock();
--
2.31.1
There are 3 RCU-based object iteration loops in kmemleak_scan(). Because
of the need to hold the RCU read lock, we can't simply insert a
cond_resched() call into the loop as is done in other parts of the
function. As there can be millions of objects to be scanned, it takes a
while to iterate through all of them. The kmemleak functionality is
usually enabled in a debug kernel, which is much slower than a non-debug
kernel. With a sufficient number of kmemleak objects, the time to
iterate through all of them may exceed 22 seconds, causing a soft lockup:
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kmemleak:625]
In this particular bug report, the soft lockup happened in the 2nd
iteration loop.
In the 2nd and 3rd loops, most of the objects are checked and then
skipped under the object lock. Only a select few are modified. Those
objects certainly need lock protection. However, the lock/unlock
operation is slow, especially with the interrupt disabling and enabling
included.
We can actually do some basic checks like color_white() without taking
the lock and skip the object accordingly. Of course, this kind of check
is racy and may miss objects that are being modified concurrently. The
cost of missing an object, however, is just that it will be discovered
in the next scan instead. The advantage of doing so is that the
iteration can be done much faster, especially with LOCKDEP enabled in a
debug kernel. A simplified sketch of this lock-free pre-check is shown
below.
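Rough illustration of the pre-check added to the 2nd and 3rd loops (the
actual change is in the diff below):

  rcu_read_lock();
  list_for_each_entry_rcu(object, &object_list, object_list) {
          /*
           * Racy, lock-free pre-check. An object that is being
           * modified concurrently and is missed here is simply
           * picked up in the next scan.
           */
          if (!color_white(object))
                  continue;

          raw_spin_lock_irq(&object->lock);
          /* ... re-check and process the object under the lock ... */
          raw_spin_unlock_irq(&object->lock);
  }
  rcu_read_unlock();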
With a debug kernel running on a 2-socket 96-thread x86-64 system
(HZ=1000), the speedup of the 2nd and 3rd iteration loops with this
patch, measured on the first kmemleak_scan() call after bootup, is shown
in the table below.
                 Before patch                  After patch
  Loop #   # of objects  Elapsed time   # of objects  Elapsed time
  ------   ------------  ------------   ------------  ------------
     2       2,599,850       2.392s       2,596,364       0.266s
     3       2,600,176       2.171s       2,597,061       0.260s
This patch reduces the iteration time of these loops by about 88%. That
will greatly reduce the chance of a soft lockup happening in the 2nd or
3rd iteration loop.
Even though the first loop also runs a little bit faster, it can still
be problematic if there are many kmemleak objects. As the object count
has to be modified in every object, we cannot avoid taking the object
lock. So another way to prevent soft lockups will be needed there.
Signed-off-by: Waiman Long <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
mm/kmemleak.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index dad9219c972c..7dd64139a7c7 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1508,6 +1508,13 @@ static void kmemleak_scan(void)
*/
rcu_read_lock();
list_for_each_entry_rcu(object, &object_list, object_list) {
+ /*
+ * This is racy but we can save the overhead of lock/unlock
+ * calls. The missed objects, if any, should be caught in
+ * the next scan.
+ */
+ if (!color_white(object))
+ continue;
raw_spin_lock_irq(&object->lock);
if (color_white(object) && (object->flags & OBJECT_ALLOCATED)
&& update_checksum(object) && get_object(object)) {
@@ -1535,6 +1542,13 @@ static void kmemleak_scan(void)
*/
rcu_read_lock();
list_for_each_entry_rcu(object, &object_list, object_list) {
+ /*
+ * This is racy but we can save the overhead of lock/unlock
+ * calls. The missed objects, if any, should be caught in
+ * the next scan.
+ */
+ if (!color_white(object))
+ continue;
raw_spin_lock_irq(&object->lock);
if (unreferenced_object(object) &&
!(object->flags & OBJECT_REPORTED)) {
--
2.31.1
On Tue, Jun 14, 2022 at 06:03:59PM -0400, Waiman Long wrote:
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 7dd64139a7c7..abba063ae5ee 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
> struct zone *zone;
> int __maybe_unused i;
> int new_leaks = 0;
> + int loop1_cnt = 0;
>
> jiffies_last_scan = jiffies;
>
> /* prepare the kmemleak_object's */
> rcu_read_lock();
> list_for_each_entry_rcu(object, &object_list, object_list) {
> + bool obj_pinned = false;
> +
> + loop1_cnt++;
> raw_spin_lock_irq(&object->lock);
> #ifdef DEBUG
> /*
> @@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
> #endif
> /* reset the reference count (whiten the object) */
> object->count = 0;
> - if (color_gray(object) && get_object(object))
> + if (color_gray(object) && get_object(object)) {
> list_add_tail(&object->gray_list, &gray_list);
> + obj_pinned = true;
> + }
>
> raw_spin_unlock_irq(&object->lock);
> +
> + /*
> + * Do a cond_resched() to avoid soft lockup every 64k objects.
> + * Make sure a reference has been taken so that the object
> + * won't go away without RCU read lock.
> + */
> + if (!(loop1_cnt & 0xffff)) {
> + if (!obj_pinned && !get_object(object)) {
> + /* Try the next object instead */
> + loop1_cnt--;
> + continue;
> + }
With this trick we could probably get rid of rcu_read_lock() and take
the kmemleak_lock instead. But that's for another patch.
> +
> + rcu_read_unlock();
> + cond_resched();
> + rcu_read_lock();
cond_resched_rcu() to save a couple of lines?
> +
> + if (!obj_pinned)
> + put_object(object);
> + }
Reviewed-by: Catalin Marinas <[email protected]>
Thanks.
On 6/15/22 11:11, Catalin Marinas wrote:
> On Tue, Jun 14, 2022 at 06:03:59PM -0400, Waiman Long wrote:
>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>> index 7dd64139a7c7..abba063ae5ee 100644
>> --- a/mm/kmemleak.c
>> +++ b/mm/kmemleak.c
>> @@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
>> struct zone *zone;
>> int __maybe_unused i;
>> int new_leaks = 0;
>> + int loop1_cnt = 0;
>>
>> jiffies_last_scan = jiffies;
>>
>> /* prepare the kmemleak_object's */
>> rcu_read_lock();
>> list_for_each_entry_rcu(object, &object_list, object_list) {
>> + bool obj_pinned = false;
>> +
>> + loop1_cnt++;
>> raw_spin_lock_irq(&object->lock);
>> #ifdef DEBUG
>> /*
>> @@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
>> #endif
>> /* reset the reference count (whiten the object) */
>> object->count = 0;
>> - if (color_gray(object) && get_object(object))
>> + if (color_gray(object) && get_object(object)) {
>> list_add_tail(&object->gray_list, &gray_list);
>> + obj_pinned = true;
>> + }
>>
>> raw_spin_unlock_irq(&object->lock);
>> +
>> + /*
>> + * Do a cond_resched() to avoid soft lockup every 64k objects.
>> + * Make sure a reference has been taken so that the object
>> + * won't go away without RCU read lock.
>> + */
>> + if (!(loop1_cnt & 0xffff)) {
>> + if (!obj_pinned && !get_object(object)) {
>> + /* Try the next object instead */
>> + loop1_cnt--;
>> + continue;
>> + }
> With this trick we could probably get rid of rcu_read_lock() and take
> the kmemleak_lock instead. But that's for another patch.
rcu_read_lock() is cheap unless we use a preempt kernel.
>
>> +
>> + rcu_read_unlock();
>> + cond_resched();
>> + rcu_read_lock();
> cond_resched_rcu() to save a couple of lines?
I was not aware of such a helper function. Yes, I should have used that
instead; I will do so if I need to update this patch again. Thanks!
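For reference, cond_resched_rcu() in include/linux/sched.h is roughly
the following (a wrapper around the same unlock/resched/lock sequence;
the exact config guards may differ by kernel version):

  static inline void cond_resched_rcu(void)
  {
  #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
          rcu_read_unlock();
          cond_resched();
          rcu_read_lock();
  #endif
  }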
>
>> +
>> + if (!obj_pinned)
>> + put_object(object);
>> + }
> Reviewed-by: Catalin Marinas <[email protected]>
Thanks,
Longman