2023-08-08 19:50:23

by John Stultz

[permalink] [raw]
Subject: [RFC][PATCH 3/3] test-ww_mutex: Make sure we bail out instead of livelock

I've seen what appears to be livelocks in the stress_inorder_work()
function, and looking at the code it is clear we can have a case
where we continually retry acquiring the locks and never check to
see if we have passed the specified timeout.

This patch reworks that function so we always check the timeout
before iterating through the loop again.

I believe others may have hit this previously here:
https://lore.kernel.org/lkml/[email protected]/

Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Li Zhijian <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: [email protected]
Reported-by: Li Zhijian <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: John Stultz <[email protected]>
---
kernel/locking/test-ww_mutex.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index 358d66150426..78719e1ef1b1 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -465,17 +465,18 @@ static void stress_inorder_work(struct work_struct *work)
ww_mutex_unlock(&locks[order[n]]);

if (err == -EDEADLK) {
- ww_mutex_lock_slow(&locks[order[contended]], &ctx);
- goto retry;
+ if (!time_after(jiffies, stress->timeout)) {
+ ww_mutex_lock_slow(&locks[order[contended]], &ctx);
+ goto retry;
+ }
}

+ ww_acquire_fini(&ctx);
if (err) {
pr_err_once("stress (%s) failed with %d\n",
__func__, err);
break;
}
-
- ww_acquire_fini(&ctx);
} while (!time_after(jiffies, stress->timeout));

kfree(order);
--
2.41.0.640.ga95def55d0-goog