2010-08-22 18:29:06

by Miquel van Smoorenburg

[permalink] [raw]
Subject: [PATCH 00/03] sunrpc: a few authentication cache patches

Thanks for including the configurable authcache hashtable size patches
in 2.6.36. Yet another patch I can drop from our local builds.

I do have a few small enhancements though. I think the first
two should go into 2.6.36. The last one probably needs some
discussion.

Mike.

[PATCH 01/03]: sunrpc: increase MAX_HASHTABLE_BITS to 14

The maximum size of the authcache is now set to 1024 (10 bits),
but on our server we need at least 4096 (12 bits). Increase
MAX_HASHTABLE_BITS to 14. This is a maximum of 16384 entries,
each containing a pointer (8 bytes on x86_64). This is
exactly the limit of kmalloc() (128K).

[PATCH 02/03]: sunrpc: make auth_hashtable_size param mode 0444

The auth_hashtable_size parameter shows up in
/sys/module/sunrpc/parameters with mode 0644. However the hashtables
are allocated when the module is loaded, and changing the parameter
afterwards doesn't do much. So make it read-only.

[PATCH 03/03] sunrpc: scale hashtable cache size with memory

Set the number of entries of the authcache to 4096 on servers
with 4G of memory or more. Because kmallocing more than a few K
is frowned upon, change the allocator from kmalloc to __get_free_pages.
Since the minimum allocation size of __get_free_pages is 1 page,
set the number of entries in the authcache to PAGE_SIZE / (entry_size)
on servers with < 4G of memory so that exactly one page is used.



2010-08-22 18:30:53

by Miquel van Smoorenburg

[permalink] [raw]
Subject: [PATCH 01/03]: sunrpc: increase MAX_HASHTABLE_BITS to 14

The maximum size of the authcache is now set to 1024 (10 bits),
but on our server we need at least 4096 (12 bits). Increase
MAX_HASHTABLE_BITS to 14. This is a maximum of 16384 entries,
each containing a pointer (8 bytes on x86_64). This is
exactly the limit of kmalloc() (128K).

Signed-off-by: Miquel van Smoorenburg <[email protected]>

diff -ruN linux-2.6.36-rc1.orig/net/sunrpc/auth.c linux-2.6.36-rc1/net/sunrpc/auth.c
--- linux-2.6.36-rc1.orig/net/sunrpc/auth.c 2010-08-16 02:41:37.000000000 +0200
+++ linux-2.6.36-rc1/net/sunrpc/auth.c 2010-08-22 17:02:27.896009116 +0200
@@ -38,7 +39,7 @@
static LIST_HEAD(cred_unused);
static unsigned long number_cred_unused;

-#define MAX_HASHTABLE_BITS (10)
+#define MAX_HASHTABLE_BITS (14)
static int param_set_hashtbl_sz(const char *val, const struct kernel_param *kp)
{
unsigned long num;

2010-08-22 18:31:32

by Miquel van Smoorenburg

[permalink] [raw]
Subject: [PATCH 02/03]: sunrpc: make auth_hashtable_size param mode 0444

The auth_hashtable_size parameter shows up in
/sys/module/sunrpc/parameters with mode 0644. However the hashtables
are allocated when the module is loaded, and changing the parameter
afterwards doesn't do much. So make it read-only.

Signed-off-by: Miquel van Smoorenburg <[email protected]>

diff -ruN linux-2.6.36-rc1.orig/net/sunrpc/auth.c linux-2.6.36-rc1/net/sunrpc/auth.c
--- linux-2.6.36-rc1.orig/net/sunrpc/auth.c 2010-08-16 02:41:37.000000000 +0200
+++ linux-2.6.36-rc1/net/sunrpc/auth.c 2010-08-22 17:02:27.896009116 +0200
@@ -76,7 +77,7 @@
.get = param_get_hashtbl_sz,
};

-module_param_named(auth_hashtable_size, auth_hashbits, hashtbl_sz, 0644);
+module_param_named(auth_hashtable_size, auth_hashbits, hashtbl_sz, 0444);
MODULE_PARM_DESC(auth_hashtable_size, "RPC credential cache hashtable size");

static u32

2010-08-22 18:40:01

by Miquel van Smoorenburg

[permalink] [raw]
Subject: [PATCH 03/03] sunrpc: scale hashtable cache size with memory

Set the number of entries of the authcache to 4096 on servers
with 4G of memory or more. Because kmallocing more than a few K
is frowned upon, change the allocator from kmalloc to __get_free_pages.
Since the minimum allocation size of __get_free_pages is 1 page,
set the number of entries in the authcache to PAGE_SIZE / (entry_size)
on servers with < 4G of memory so that exactly one page is used.

Signed-off-by: Miquel van Smoorenburg <[email protected]>

diff -ruN linux-2.6.36-rc1.orig/net/sunrpc/auth.c linux-2.6.36-rc1/net/sunrpc/auth.c
--- linux-2.6.36-rc1.orig/net/sunrpc/auth.c 2010-08-16 02:41:37.000000000 +0200
+++ linux-2.6.36-rc1/net/sunrpc/auth.c 2010-08-22 17:02:27.896009116 +0200
@@ -19,14 +19,15 @@
# define RPCDBG_FACILITY RPCDBG_AUTH
#endif

-#define RPC_CREDCACHE_DEFAULT_HASHBITS (4)
+#define RPC_CREDCACHE_LARGE_HASHBITS (12)
struct rpc_cred_cache {
struct hlist_head *hashtable;
unsigned int hashbits;
+ unsigned int order;
spinlock_t lock;
};

-static unsigned int auth_hashbits = RPC_CREDCACHE_DEFAULT_HASHBITS;
+static unsigned int auth_hashbits;

static DEFINE_SPINLOCK(rpc_authflavor_lock);
static const struct rpc_authops *auth_flavors[RPC_AUTH_MAXFLAVOR] = {
@@ -195,14 +196,14 @@
rpcauth_init_credcache(struct rpc_auth *auth)
{
struct rpc_cred_cache *new;
- unsigned int hashsize;

new = kmalloc(sizeof(*new), GFP_KERNEL);
if (!new)
goto out_nocache;
new->hashbits = auth_hashbits;
- hashsize = 1U << new->hashbits;
- new->hashtable = kcalloc(hashsize, sizeof(new->hashtable[0]), GFP_KERNEL);
+ new->order = get_order((1U << new->hashbits) * sizeof(new->hashtable[0]));
+ new->hashtable = (struct hlist_head *)
+ __get_free_pages(GFP_KERNEL|__GFP_ZERO, new->order);
if (!new->hashtable)
goto out_nohashtbl;
spin_lock_init(&new->lock);
@@ -274,7 +275,7 @@
if (cache) {
auth->au_credcache = NULL;
rpcauth_clear_credcache(cache);
- kfree(cache->hashtable);
+ free_pages((unsigned long)cache->hashtable, cache->order);
kfree(cache);
}
}
@@ -642,8 +643,20 @@

int __init rpcauth_init_module(void)
{
+ struct sysinfo si;
+ unsigned long gb;
int err;

+ if (auth_hashbits == 0) {
+ /* auto-size: < 4GB: whatever fits one page, >= 4GB: 12 bits */
+ si_meminfo(&si);
+ gb = (si.totalram - si.totalhigh + 4096) >> (30 - PAGE_SHIFT);
+ if (gb < 4)
+ auth_hashbits = ilog2(PAGE_SIZE / sizeof(struct hlist_head));
+ else
+ auth_hashbits = RPC_CREDCACHE_LARGE_HASHBITS;
+ }
+
err = rpc_init_authunix();
if (err < 0)
goto out1;

2010-09-07 19:52:41

by Miquel van Smoorenburg

[permalink] [raw]
Subject: Re: [PATCH 03/03] sunrpc: scale hashtable cache size with memory

On Tue, 2010-09-07 at 15:02 -0400, Trond Myklebust wrote:
> On Sun, 2010-08-22 at 20:31 +0200, Miquel van Smoorenburg wrote:
> > Set the number of entries of the authcache to 4096 on servers
> > with 4G of memory or more. Because kmallocing more than a few K
> > is frowned upon, change the allocator from kmalloc to __get_free_pages.
> > Since the minimum allocation size of __get_free_pages is 1 page,
> > set the number of entries in the authcache to PAGE_SIZE / (entry_size)
> > on servers with < 4G of memory so that exactly one page is used.
>
> I'm not really understanding why this is an improvement. kmalloc() will
> use pretty much the same mechanism when allocating a slab that is >
> PAGE_SIZE, so why should we duplicate that in the RPC layer?

Oh, I must have been reading old information then. Can't find it
anymore, but what I read was something like "if you need more than a few
pages, use __get_free_pages() instead of kmalloc". Probably out of date.

Anyway, I can change that if you like. So, what about the general idea
of having 16 hashtable entries on systems < 1 GB of (low!) memory, (say)
512 entries for systems with 1GB .. 4 GB and 4096 slots for systems with
>= 4 GB ? The sizes of these tables are dwarfed by the ones for
dentry/inode/IP/TCP anyways. Or we could just not bother and let people
use the module parameter.

The *real* problem we are papering over with this is a different one, by
the way. I have looked into it, but haven't had time to finish it.

The problem is that the hashtable chains are growing too large with
old/unused entries. We should find a way to limit the length of those
chains in an LRU way.

We can't however since the access cache in fs/nfs/dir.c holds a
reference to almost all authcache entries. The thing is, 99% of those
access cache entries will be stale anyway, but those are never cleaned
up until they are used.

And then ofcourse there is some sort of duplication of information
between the auth_unix and auth_generic caches, but I've forgotten the
details.

Thanks,

Mike.


2010-09-07 18:59:32

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 02/03]: sunrpc: make auth_hashtable_size param mode 0444

On Sun, 2010-08-22 at 20:31 +0200, Miquel van Smoorenburg wrote:
> The auth_hashtable_size parameter shows up in
> /sys/module/sunrpc/parameters with mode 0644. However the hashtables
> are allocated when the module is loaded, and changing the parameter
> afterwards doesn't do much. So make it read-only.

The above observation is only correct for AUTH_SYS. The reason for
making the mode 0644 was to allow users to change the value after
loading if they want to use RPCSEC_GSS.

Cheers
Trond

2010-09-07 20:33:20

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 03/03] sunrpc: scale hashtable cache size with memory

On Tue, 2010-09-07 at 21:52 +0200, Miquel van Smoorenburg wrote:
> On Tue, 2010-09-07 at 15:02 -0400, Trond Myklebust wrote:
> > On Sun, 2010-08-22 at 20:31 +0200, Miquel van Smoorenburg wrote:
> > > Set the number of entries of the authcache to 4096 on servers
> > > with 4G of memory or more. Because kmallocing more than a few K
> > > is frowned upon, change the allocator from kmalloc to __get_free_pages.
> > > Since the minimum allocation size of __get_free_pages is 1 page,
> > > set the number of entries in the authcache to PAGE_SIZE / (entry_size)
> > > on servers with < 4G of memory so that exactly one page is used.
> >
> > I'm not really understanding why this is an improvement. kmalloc() will
> > use pretty much the same mechanism when allocating a slab that is >
> > PAGE_SIZE, so why should we duplicate that in the RPC layer?
>
> Oh, I must have been reading old information then. Can't find it
> anymore, but what I read was something like "if you need more than a few
> pages, use __get_free_pages() instead of kmalloc". Probably out of date.
>
> Anyway, I can change that if you like. So, what about the general idea
> of having 16 hashtable entries on systems < 1 GB of (low!) memory, (say)
> 512 entries for systems with 1GB .. 4 GB and 4096 slots for systems with
> >= 4 GB ? The sizes of these tables are dwarfed by the ones for
> dentry/inode/IP/TCP anyways. Or we could just not bother and let people
> use the module parameter.

I'm not convinced there is a strong link between the amount of available
memory, and the number of different credentials a system needs to
support. Generally, systems that have lots of users will tend to have
lots of memory, but the reverse is not necessarily true...

> The *real* problem we are papering over with this is a different one, by
> the way. I have looked into it, but haven't had time to finish it.
>
> The problem is that the hashtable chains are growing too large with
> old/unused entries. We should find a way to limit the length of those
> chains in an LRU way.
>
> We can't however since the access cache in fs/nfs/dir.c holds a
> reference to almost all authcache entries. The thing is, 99% of those
> access cache entries will be stale anyway, but those are never cleaned
> up until they are used.

There is nothing stopping you from setting up a hard limit on the number
of access cache entries. Most of the code is already there in order to
support the access cache shrinker.

Ditto for the auth cache itself...

> And then ofcourse there is some sort of duplication of information
> between the auth_unix and auth_generic caches, but I've forgotten the
> details.

At some point, I'd like to get rid of both the auth_unix and
auth_generic caches and replace them with 'struct cred'. Unfortunately,
there is nothing in the current cred code that tries to merge creds with
identical information, which makes them hard to use for the access cache
and NFSv4 open state caches.

Cheers
Trond

2010-09-07 19:02:39

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 03/03] sunrpc: scale hashtable cache size with memory

On Sun, 2010-08-22 at 20:31 +0200, Miquel van Smoorenburg wrote:
> Set the number of entries of the authcache to 4096 on servers
> with 4G of memory or more. Because kmallocing more than a few K
> is frowned upon, change the allocator from kmalloc to __get_free_pages.
> Since the minimum allocation size of __get_free_pages is 1 page,
> set the number of entries in the authcache to PAGE_SIZE / (entry_size)
> on servers with < 4G of memory so that exactly one page is used.

I'm not really understanding why this is an improvement. kmalloc() will
use pretty much the same mechanism when allocating a slab that is >
PAGE_SIZE, so why should we duplicate that in the RPC layer?

Cheers
Trond

2010-09-07 19:25:06

by Miquel van Smoorenburg

[permalink] [raw]
Subject: Re: [PATCH 02/03]: sunrpc: make auth_hashtable_size param mode 0444

On Tue, 2010-09-07 at 14:58 -0400, Trond Myklebust wrote:
> On Sun, 2010-08-22 at 20:31 +0200, Miquel van Smoorenburg wrote:
> > The auth_hashtable_size parameter shows up in
> > /sys/module/sunrpc/parameters with mode 0644. However the hashtables
> > are allocated when the module is loaded, and changing the parameter
> > afterwards doesn't do much. So make it read-only.
>
> The above observation is only correct for AUTH_SYS. The reason for
> making the mode 0644 was to allow users to change the value after
> loading if they want to use RPCSEC_GSS.

Right, I can see that being useful. It's confusing though. It'd be more
clear if there was a seperate knob to tune just
rpcsec_auth_hashtable_size. Anyway, never mind then.

Mike.


2010-09-07 19:03:50

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 01/03]: sunrpc: increase MAX_HASHTABLE_BITS to 14

On Sun, 2010-08-22 at 20:30 +0200, Miquel van Smoorenburg wrote:
> The maximum size of the authcache is now set to 1024 (10 bits),
> but on our server we need at least 4096 (12 bits). Increase
> MAX_HASHTABLE_BITS to 14. This is a maximum of 16384 entries,
> each containing a pointer (8 bytes on x86_64). This is
> exactly the limit of kmalloc() (128K).
>
> Signed-off-by: Miquel van Smoorenburg <[email protected]>
>
> diff -ruN linux-2.6.36-rc1.orig/net/sunrpc/auth.c linux-2.6.36-rc1/net/sunrpc/auth.c
> --- linux-2.6.36-rc1.orig/net/sunrpc/auth.c 2010-08-16 02:41:37.000000000 +0200
> +++ linux-2.6.36-rc1/net/sunrpc/auth.c 2010-08-22 17:02:27.896009116 +0200
> @@ -38,7 +39,7 @@
> static LIST_HEAD(cred_unused);
> static unsigned long number_cred_unused;
>
> -#define MAX_HASHTABLE_BITS (10)
> +#define MAX_HASHTABLE_BITS (14)
> static int param_set_hashtbl_sz(const char *val, const struct kernel_param *kp)
> {
> unsigned long num;

Fair enough. Applied...

Trond