2013-09-26 13:26:41

by Josef Bacik

[permalink] [raw]
Subject: [PATCH 1/2] rwsem: add rwsem_is_contended V3

Btrfs needs a simple way to know if it needs to let go of it's read lock on a
rwsem. Introduce rwsem_is_contended to check to see if there are any waiters on
this rwsem currently. This is just a hueristic, it is meant to be light and not
100% accurate and called by somebody already holding on to the rwsem in either
read or write. Thanks,

Signed-off-by: Josef Bacik <[email protected]>
---
V2->V3: fixed the comment and simplified the function as per Peter's
suggestions.

V1->V2: took everybodys suggestions and simplified it to just one function in
rwsem.h so it works for both the spinlock case and non-spinlock case.

include/linux/rwsem.h | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index 0616ffe..03f3b05 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -75,6 +75,17 @@ do { \
} while (0)

/*
+ * This is the same regardless of which rwsem implementation that is being used.
+ * It is just a heuristic meant to be called by somebody alreadying holding the
+ * rwsem to see if somebody from an incompatible type is wanting access to the
+ * lock.
+ */
+static inline int rwsem_is_contended(struct rw_semaphore *sem)
+{
+ return !list_empty(&sem->wait_list);
+}
+
+/*
* lock for reading
*/
extern void down_read(struct rw_semaphore *sem);
--
1.8.3.1


2013-09-26 13:26:48

by Josef Bacik

[permalink] [raw]
Subject: [PATCH 2/2] Btrfs: stop caching thread if extent_commit_sem is contended

We can starve out the transaction commit with a bunch of caching threads all
running at the same time. This is because we will only drop the
extent_commit_sem if we need_resched(), which isn't likely to happen since we
will be reading a lot from the disk so have already schedule()'ed plenty. Alex
observed that he could starve out a transaction commit for up to a minute with
32 caching threads all running at once. This will allow us to drop the
extent_commit_sem to allow the transaction commit to swap the commit_root out
and then all the cachers will start back up. Here is an explanation provided by
Igno

So, just to fill in what happens in this loop:

mutex_unlock(&caching_ctl->mutex);
cond_resched();
goto again;

where 'again:' takes caching_ctl->mutex and fs_info->extent_commit_sem
again:

again:
mutex_lock(&caching_ctl->mutex);
/* need to make sure the commit_root doesn't disappear */
down_read(&fs_info->extent_commit_sem);

So, if I'm reading the code correct, there can be a fair amount of
concurrency here: there may be multiple 'caching kthreads' per filesystem
active, while there's one fs_info->extent_commit_sem per filesystem
AFAICS.

So, what happens if there are a lot of CPUs all busy holding the
->extent_commit_sem rwsem read-locked and a writer arrives? They'd all
rush to try to release the fs_info->extent_commit_sem, and they'd block in
the down_read() because there's a writer waiting.

So there's a guarantee of forward progress. This should answer akpm's
concern I think.

Thanks,

Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
---
fs/btrfs/extent-tree.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index cfb3cf7..cc074c34 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -442,7 +442,8 @@ next:
if (ret)
break;

- if (need_resched()) {
+ if (need_resched() ||
+ rwsem_is_contended(&fs_info->extent_commit_sem)) {
caching_ctl->progress = last;
btrfs_release_path(path);
up_read(&fs_info->extent_commit_sem);
--
1.8.3.1

2013-09-26 14:16:13

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] rwsem: add rwsem_is_contended V3


* Josef Bacik <[email protected]> wrote:

> Btrfs needs a simple way to know if it needs to let go of it's read lock on a
> rwsem. Introduce rwsem_is_contended to check to see if there are any waiters on
> this rwsem currently. This is just a hueristic, it is meant to be light and not
> 100% accurate and called by somebody already holding on to the rwsem in either
> read or write. Thanks,
>
> Signed-off-by: Josef Bacik <[email protected]>

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo