2023-09-22 09:04:40

by John Stultz

Subject: [PATCH 3/3] test-ww_mutex: Make sure we bail out instead of livelock

I've seen what appear to be livelocks in the stress_inorder_work()
function, and looking at the code it is clear we can have a case
where we continually retry acquiring the locks and never check to
see if we have passed the specified timeout.

This patch reworks that function so we always check the timeout
before iterating through the loop again.

I believe others may have hit this previously here:
https://lore.kernel.org/lkml/[email protected]/

Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: [email protected]
Reported-by: Li Zhijian <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: John Stultz <[email protected]>
---
kernel/locking/test-ww_mutex.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index 358d66150426..78719e1ef1b1 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -465,17 +465,18 @@ static void stress_inorder_work(struct work_struct *work)
 			ww_mutex_unlock(&locks[order[n]]);
 
 		if (err == -EDEADLK) {
-			ww_mutex_lock_slow(&locks[order[contended]], &ctx);
-			goto retry;
+			if (!time_after(jiffies, stress->timeout)) {
+				ww_mutex_lock_slow(&locks[order[contended]], &ctx);
+				goto retry;
+			}
 		}
 
+		ww_acquire_fini(&ctx);
 		if (err) {
 			pr_err_once("stress (%s) failed with %d\n",
 				    __func__, err);
 			break;
 		}
-
-		ww_acquire_fini(&ctx);
 	} while (!time_after(jiffies, stress->timeout));
 
 	kfree(order);
--
2.42.0.515.g380fc7ccd1-goog


Subject: [tip: locking/core] locking/ww_mutex/test: Make sure we bail out instead of livelock

The following commit has been merged into the locking/core branch of tip:

Commit-ID: cfa92b6d52071aaa8f27d21affdcb14e7448fbc1
Gitweb: https://git.kernel.org/tip/cfa92b6d52071aaa8f27d21affdcb14e7448fbc1
Author: John Stultz <[email protected]>
AuthorDate: Fri, 22 Sep 2023 04:36:01
Committer: Ingo Molnar <[email protected]>
CommitterDate: Fri, 22 Sep 2023 09:43:41 +02:00

locking/ww_mutex/test: Make sure we bail out instead of livelock

I've seen what appear to be livelocks in the stress_inorder_work()
function, and looking at the code it is clear we can have a case
where we continually retry acquiring the locks and never check to
see if we have passed the specified timeout.

This patch reworks that function so we always check the timeout
before iterating through the loop again.

I believe others may have hit this previously here:

https://lore.kernel.org/lkml/[email protected]/

Reported-by: Li Zhijian <[email protected]>
Signed-off-by: John Stultz <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/locking/test-ww_mutex.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index 358d661..78719e1 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -465,17 +465,18 @@ retry:
 			ww_mutex_unlock(&locks[order[n]]);
 
 		if (err == -EDEADLK) {
-			ww_mutex_lock_slow(&locks[order[contended]], &ctx);
-			goto retry;
+			if (!time_after(jiffies, stress->timeout)) {
+				ww_mutex_lock_slow(&locks[order[contended]], &ctx);
+				goto retry;
+			}
 		}
 
+		ww_acquire_fini(&ctx);
 		if (err) {
 			pr_err_once("stress (%s) failed with %d\n",
 				    __func__, err);
 			break;
 		}
-
-		ww_acquire_fini(&ctx);
 	} while (!time_after(jiffies, stress->timeout));
 
 	kfree(order);
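
For readers who want to try the control-flow change outside the kernel, here is a minimal userspace sketch of the patched loop shape. It is not the test module's code. fake_jiffies, fake_time_after(), try_lock_all() and stress_loop() are hypothetical stand-ins invented for illustration. The simulated lock attempt always returns -EDEADLK, which is the worst case for the old code, and each attempt advances the fake clock by one tick.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's jiffies counter. */
static unsigned long fake_jiffies;

/* Hypothetical stand-in for time_after(): true once a is later than b. */
static bool fake_time_after(unsigned long a, unsigned long b)
{
	/* Signed difference keeps the comparison safe across wraparound. */
	return (long)(b - a) < 0;
}

/* Hypothetical lock attempt: always backs off and costs one tick. */
static int try_lock_all(void)
{
	fake_jiffies++;
	return -EDEADLK;
}

/* Returns how many lock attempts were made before the loop gave up. */
static unsigned long stress_loop(unsigned long timeout)
{
	unsigned long attempts = 0;
	int err;

	fake_jiffies = 0;
	do {
retry:
		err = try_lock_all();
		attempts++;

		if (err == -EDEADLK) {
			/* Retry only while the deadline has not passed. */
			if (!fake_time_after(fake_jiffies, timeout))
				goto retry;
		}

		/* Past the deadline with an error still set: bail out. */
		if (err)
			break;
	} while (!fake_time_after(fake_jiffies, timeout));

	return attempts;
}
```

With a timeout of 5 ticks, stress_loop(5) returns 6: the sixth attempt moves the fake clock past the deadline, the retry is skipped, and the loop exits through the error path. Dropping the fake_time_after() check from the -EDEADLK branch reproduces the pre-patch shape, which never terminates under this always-contended stand-in.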