Subject: [PATCH v6] locking/rwbase: Mitigate indefinite writer starvation.

The rw_semaphore and rwlock_t locks are unfair to writers. Readers can
indefinitely acquire the lock unless the writer fully acquired the lock.
This can never happen if there is always a reader in the critical
section owning the lock.

Mel Gorman reported that since LTP-20220121 the dio_truncate test case
went from having 1 reader to having 16 reader and the number of readers
is sufficient to prevent the down_write ever succeeding while readers
exist. Eventually the test is killed after 30 minutes as a failure.

Mel proposed a timeout to limit how long a writer can be blocked until
the reader is forced into the slowpath.
Thomas argued that there is no added value by providing this timeout.
From PREEMPT_RT point of view, there are no critical rw_semaphore or
rwlock_t locks left where the reader must be prefer.

Mitigate indefinite writer starvation by forcing the READER into the
slowpath once the WRITER attempts to acquire the lock.

Reported-by: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/877cwbq4cq.ffs@tglx
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
- v5..v6: https://lore.kernel.org/all/[email protected]/
- dropped the timeout and forcing reader into the slowpath once a
writer is waiting.

- v4..v5 https://lore.kernel.org/[email protected]
- Reworded last paragraph of the commit message as Mel's suggestion
- RT/DL tasks are no longer excluded from the waiter timeout. There
is no reason why this should be done since no RT user relies on
rwsem (and would need this kind of behaviour). The critical user
from RT perspective replaced rwsem with RCU.
Avoiding special treatment avoids this kind of bug with RT
readers.
- Update comments accordingly.

kernel/locking/rwbase_rt.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index c201aadb93017..25ec0239477c2 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -72,15 +72,6 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
int ret;

raw_spin_lock_irq(&rtm->wait_lock);
- /*
- * Allow readers, as long as the writer has not completely
- * acquired the semaphore for write.
- */
- if (atomic_read(&rwb->readers) != WRITER_BIAS) {
- atomic_inc(&rwb->readers);
- raw_spin_unlock_irq(&rtm->wait_lock);
- return 0;
- }

/*
* Call into the slow lock path with the rtmutex->wait_lock
--
2.40.0



2023-03-22 09:37:30

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH v6] locking/rwbase: Mitigate indefinite writer starvation.

On Tue, Mar 21, 2023 at 05:11:40PM +0100, Sebastian Andrzej Siewior wrote:
> The rw_semaphore and rwlock_t locks are unfair to writers. Readers can
> indefinitely acquire the lock unless the writer fully acquired the lock.
> This can never happen if there is always a reader in the critical
> section owning the lock.
>
> Mel Gorman reported that since LTP-20220121 the dio_truncate test case
> went from having 1 reader to having 16 reader and the number of readers
> is sufficient to prevent the down_write ever succeeding while readers
> exist. Eventually the test is killed after 30 minutes as a failure.
>
> Mel proposed a timeout to limit how long a writer can be blocked until
> the reader is forced into the slowpath.
> Thomas argued that there is no added value by providing this timeout.
> From PREEMPT_RT point of view, there are no critical rw_semaphore or
> rwlock_t locks left where the reader must be prefer.
>

s/prefer/preferred/

> Mitigate indefinite writer starvation by forcing the READER into the
> slowpath once the WRITER attempts to acquire the lock.
>
> Reported-by: Mel Gorman <[email protected]>
> Link: https://lore.kernel.org/877cwbq4cq.ffs@tglx
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>

Acked-by: Mel Gorman <[email protected]>

Thanks Sebastian and Thomas!

--
Mel Gorman
SUSE Labs