2015-07-29 20:20:22

by Waiman Long

[permalink] [raw]
Subject: [PATCH] proc: change proc_subdir_lock to a rwlock

The proc_subdir_lock spinlock is used to allow only one task to
make change to the proc directory structure as well as looking up
information in it. However, the information lookup part can actually
be entered by more than one task as the pde_get() and pde_put()
reference count update calls in the critical sections are atomic
increment and decrement respectively and so are safe with concurrent
updates.

The x86 architecture has already used qrwlock which is fair and other
architectures like ARM are in the process of switching to qrwlock. So
unfairness shouldn't be a concern in that conversion.

This patch changed the proc_subdir_lock to a rwlock in order to enable
concurrent lookup. The following functions were modified to take a
write lock:
- proc_register()
- remove_proc_entry()
- remove_proc_subtree()

The following functions were modified to take a read lock:
- xlate_proc_name()
- proc_lookup_de()
- proc_readdir_de()

A parallel /proc filesystem search with the "find" command (1000
threads) was run on a 4-socket Haswell-EX box (144 threads). Before
the patch, the parallel search took about 39s. After the patch,
the parallel find took only 25s, a saving of about 14s.

Signed-off-by: Waiman Long <[email protected]>
---
fs/proc/generic.c | 44 ++++++++++++++++++++++----------------------
1 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index e5dee5c..ff3ffc7 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -26,7 +26,7 @@

#include "internal.h"

-static DEFINE_SPINLOCK(proc_subdir_lock);
+static DEFINE_RWLOCK(proc_subdir_lock);

static int proc_match(unsigned int len, const char *name, struct proc_dir_entry *de)
{
@@ -172,9 +172,9 @@ static int xlate_proc_name(const char *name, struct proc_dir_entry **ret,
{
int rv;

- spin_lock(&proc_subdir_lock);
+ read_lock(&proc_subdir_lock);
rv = __xlate_proc_name(name, ret, residual);
- spin_unlock(&proc_subdir_lock);
+ read_unlock(&proc_subdir_lock);
return rv;
}

@@ -231,11 +231,11 @@ struct dentry *proc_lookup_de(struct proc_dir_entry *de, struct inode *dir,
{
struct inode *inode;

- spin_lock(&proc_subdir_lock);
+ read_lock(&proc_subdir_lock);
de = pde_subdir_find(de, dentry->d_name.name, dentry->d_name.len);
if (de) {
pde_get(de);
- spin_unlock(&proc_subdir_lock);
+ read_unlock(&proc_subdir_lock);
inode = proc_get_inode(dir->i_sb, de);
if (!inode)
return ERR_PTR(-ENOMEM);
@@ -243,7 +243,7 @@ struct dentry *proc_lookup_de(struct proc_dir_entry *de, struct inode *dir,
d_add(dentry, inode);
return NULL;
}
- spin_unlock(&proc_subdir_lock);
+ read_unlock(&proc_subdir_lock);
return ERR_PTR(-ENOENT);
}

@@ -270,12 +270,12 @@ int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
if (!dir_emit_dots(file, ctx))
return 0;

- spin_lock(&proc_subdir_lock);
+ read_lock(&proc_subdir_lock);
de = pde_subdir_first(de);
i = ctx->pos - 2;
for (;;) {
if (!de) {
- spin_unlock(&proc_subdir_lock);
+ read_unlock(&proc_subdir_lock);
return 0;
}
if (!i)
@@ -287,19 +287,19 @@ int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
do {
struct proc_dir_entry *next;
pde_get(de);
- spin_unlock(&proc_subdir_lock);
+ read_unlock(&proc_subdir_lock);
if (!dir_emit(ctx, de->name, de->namelen,
de->low_ino, de->mode >> 12)) {
pde_put(de);
return 0;
}
- spin_lock(&proc_subdir_lock);
+ read_lock(&proc_subdir_lock);
ctx->pos++;
next = pde_subdir_next(de);
pde_put(de);
de = next;
} while (de);
- spin_unlock(&proc_subdir_lock);
+ read_unlock(&proc_subdir_lock);
return 1;
}

@@ -338,16 +338,16 @@ static int proc_register(struct proc_dir_entry * dir, struct proc_dir_entry * dp
if (ret)
return ret;

- spin_lock(&proc_subdir_lock);
+ write_lock(&proc_subdir_lock);
dp->parent = dir;
if (pde_subdir_insert(dir, dp) == false) {
WARN(1, "proc_dir_entry '%s/%s' already registered\n",
dir->name, dp->name);
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);
proc_free_inum(dp->low_ino);
return -EEXIST;
}
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);

return 0;
}
@@ -549,9 +549,9 @@ void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
const char *fn = name;
unsigned int len;

- spin_lock(&proc_subdir_lock);
+ write_lock(&proc_subdir_lock);
if (__xlate_proc_name(name, &parent, &fn) != 0) {
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);
return;
}
len = strlen(fn);
@@ -559,7 +559,7 @@ void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
de = pde_subdir_find(parent, fn, len);
if (de)
rb_erase(&de->subdir_node, &parent->subdir);
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);
if (!de) {
WARN(1, "name '%s'\n", name);
return;
@@ -583,16 +583,16 @@ int remove_proc_subtree(const char *name, struct proc_dir_entry *parent)
const char *fn = name;
unsigned int len;

- spin_lock(&proc_subdir_lock);
+ write_lock(&proc_subdir_lock);
if (__xlate_proc_name(name, &parent, &fn) != 0) {
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);
return -ENOENT;
}
len = strlen(fn);

root = pde_subdir_find(parent, fn, len);
if (!root) {
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);
return -ENOENT;
}
rb_erase(&root->subdir_node, &parent->subdir);
@@ -605,7 +605,7 @@ int remove_proc_subtree(const char *name, struct proc_dir_entry *parent)
de = next;
continue;
}
- spin_unlock(&proc_subdir_lock);
+ write_unlock(&proc_subdir_lock);

proc_entry_rundown(de);
next = de->parent;
@@ -616,7 +616,7 @@ int remove_proc_subtree(const char *name, struct proc_dir_entry *parent)
break;
pde_put(de);

- spin_lock(&proc_subdir_lock);
+ write_lock(&proc_subdir_lock);
de = next;
}
pde_put(root);
--
1.7.1


2015-07-29 22:21:19

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] proc: change proc_subdir_lock to a rwlock

Two quick questions.

- What motivates this work? Are you seeing lots of
parallel reads on proc?

- Why not rcu? Additions and removal of proc generic
files is very rare. Conversion to rcu for reads should
perform better and not take much more work.

Eric

2015-07-30 10:05:00

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: [PATCH] proc: change proc_subdir_lock to a rwlock

On Thu, Jul 30, 2015 at 1:21 AM, Eric W. Biederman
<[email protected]> wrote:
> Two quick questions.
>
> - What motivates this work? Are you seeing lots of
> parallel reads on proc?

Well, yes, benchmark looks like very synthetic.
But even gcc opens /proc/meminfo, so might as well
apply the change.

Patch looks like obviously correct to me.

> - Why not rcu? Additions and removal of proc generic
> files is very rare. Conversion to rcu for reads should
> perform better and not take much more work.

If you want proc locking to be understood by ~10 people
on the planet then yes, there is RCU. :-)

2015-07-31 02:17:03

by Waiman Long

[permalink] [raw]
Subject: Re: [PATCH] proc: change proc_subdir_lock to a rwlock

On 07/29/2015 06:21 PM, Eric W. Biederman wrote:
> Two quick questions.
>
> - What motivates this work? Are you seeing lots of
> parallel reads on proc?

The micro-benchmark that I used was artificial, but it was used to
reproduce an exit hanging problem that I saw in real application. In
fact, only allow one task to do a lookup seems too limiting to me.
> - Why not rcu? Additions and removal of proc generic
> files is very rare. Conversion to rcu for reads should
> perform better and not take much more work.

RCU is harder to verify its correctness, whereas rwlock is easier to use
and understand. If it is really a performance critical path where every
extra bit of performance counts, I will certainly think RCU may be the
right choice. However, in this particular case, I don't think using RCU
will give any noticeable performance gain compared with a rwlock.

Cheers,
Longman

2015-07-31 02:53:22

by Waiman Long

[permalink] [raw]
Subject: Re: [PATCH] proc: change proc_subdir_lock to a rwlock

On 07/30/2015 10:16 PM, Waiman Long wrote:
> On 07/29/2015 06:21 PM, Eric W. Biederman wrote:
>> Two quick questions.
>>
>> - What motivates this work? Are you seeing lots of
>> parallel reads on proc?
>
> The micro-benchmark that I used was artificial, but it was used to
> reproduce an exit hanging problem that I saw in real application. In
> fact, only allow one task to do a lookup seems too limiting to me.
>> - Why not rcu? Additions and removal of proc generic
>> files is very rare. Conversion to rcu for reads should
>> perform better and not take much more work.
>
> RCU is harder to verify its correctness, whereas rwlock is easier to
> use and understand. If it is really a performance critical path where
> every extra bit of performance counts, I will certainly think RCU may
> be the right choice. However, in this particular case, I don't think
> using RCU will give any noticeable performance gain compared with a
> rwlock.

One more thing, RCU is typically used with linked list. It is not easy
to use RCU with rbtree and may require major changes to the code.

Another alternative is to use seqlock + RCU, but it will still need more
code changes than rwlock.

Cheers,
Longman

2015-08-03 18:10:09

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] proc: change proc_subdir_lock to a rwlock

Waiman Long <[email protected]> writes:

> On 07/30/2015 10:16 PM, Waiman Long wrote:
>> On 07/29/2015 06:21 PM, Eric W. Biederman wrote:
>>> Two quick questions.
>>>
>>> - What motivates this work? Are you seeing lots of
>>> parallel reads on proc?
>>
>> The micro-benchmark that I used was artificial, but it was used to reproduce
>> an exit hanging problem that I saw in real application. In fact, only allow
>> one task to do a lookup seems too limiting to me.
>>> - Why not rcu? Additions and removal of proc generic
>>> files is very rare. Conversion to rcu for reads should
>>> perform better and not take much more work.
>>
>> RCU is harder to verify its correctness, whereas rwlock is easier to use and
>> understand. If it is really a performance critical path where every extra bit
>> of performance counts, I will certainly think RCU may be the right
>> choice. However, in this particular case, I don't think using RCU will give
>> any noticeable performance gain compared with a rwlock.
>
> One more thing, RCU is typically used with linked list. It is not easy to use
> RCU with rbtree and may require major changes to the code.
>
> Another alternative is to use seqlock + RCU, but it will still need more code
> changes than rwlock.

I had forgotten we had switch proc directories to rbtrees. So on that
note.

Acked-by: "Eric W. Biederman" <[email protected]>

Eric