2011-07-21 17:49:05

by Steve Dickson

[permalink] [raw]
Subject: [PATCH] RDMA: Increasing RPCRDMA_MAX_DATA_SEGS

Our performance team has noticed that increasing
RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
increases throughput when using the RDMA transport.

Signed-off-by: Steve Dickson <[email protected]>
---
net/sunrpc/xprtrdma/xprt_rdma.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index cae761a..5d1cfe5 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -109,7 +109,7 @@ struct rpcrdma_ep {
*/

/* temporary static scatter/gather max */
-#define RPCRDMA_MAX_DATA_SEGS (8) /* max scatter/gather */
+#define RPCRDMA_MAX_DATA_SEGS (64) /* max scatter/gather */
#define RPCRDMA_MAX_SEGS (RPCRDMA_MAX_DATA_SEGS + 2) /* head+tail = 2 */
#define MAX_RPCRDMAHDR (\
/* max supported RPC/RDMA header */ \
--
1.7.6



2011-07-21 21:41:07

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] RDMA: Increasing RPCRDMA_MAX_DATA_SEGS

On Thu, Jul 21, 2011 at 01:49:02PM -0400, Steve Dickson wrote:
> Our performance team has noticed that increasing
> RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> increases throughput when using the RDMA transport.

The main risk that I can see is that we have these on the stack in two
places:

rpcrdma_register_fmr_external(struct rpcrdma_mr_seg *seg, ...
{
...
u64 physaddrs[RPCRDMA_MAX_DATA_SEGS];

rpcrdma_register_default_external(struct rpcrdma_mr_seg *seg, ...
{
...
struct ib_phys_buf ipb[RPCRDMA_MAX_DATA_SEGS];

Where struct ib_phys_buf is 16 bytes.

So that's 512 bytes in the first case, 1024 in the second. This is
called from rpciod--what are our rules about allocating memory from
rpciod?

--b.

>
> Signed-off-by: Steve Dickson <[email protected]>
> ---
> net/sunrpc/xprtrdma/xprt_rdma.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
> index cae761a..5d1cfe5 100644
> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
> @@ -109,7 +109,7 @@ struct rpcrdma_ep {
> */
>
> /* temporary static scatter/gather max */
> -#define RPCRDMA_MAX_DATA_SEGS (8) /* max scatter/gather */
> +#define RPCRDMA_MAX_DATA_SEGS (64) /* max scatter/gather */
> #define RPCRDMA_MAX_SEGS (RPCRDMA_MAX_DATA_SEGS + 2) /* head+tail = 2 */
> #define MAX_RPCRDMAHDR (\
> /* max supported RPC/RDMA header */ \
> --
> 1.7.6
>

2011-07-22 01:55:07

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] RDMA: Increasing RPCRDMA_MAX_DATA_SEGS

On Thu, Jul 21, 2011 at 09:42:04PM -0400, Trond Myklebust wrote:
> On Thu, 2011-07-21 at 17:41 -0400, J. Bruce Fields wrote:
> > On Thu, Jul 21, 2011 at 01:49:02PM -0400, Steve Dickson wrote:
> > > Our performance team has noticed that increasing
> > > RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> > > increases throughput when using the RDMA transport.
> >
> > The main risk that I can see is that we have these on the stack in two
> > places:
> >
> > rpcrdma_register_fmr_external(struct rpcrdma_mr_seg *seg, ...
> > {
> > ...
> > u64 physaddrs[RPCRDMA_MAX_DATA_SEGS];
> >
> > rpcrdma_register_default_external(struct rpcrdma_mr_seg *seg, ...
> > {
> > ...
> > struct ib_phys_buf ipb[RPCRDMA_MAX_DATA_SEGS];
> >
> > Where struct ib_phys_buf is 16 bytes.
> >
> > So that's 512 bytes in the first case, 1024 in the second. This is
> > called from rpciod--what are our rules about allocating memory from
> > rpciod?
>
> Is that allocated on the stack? We should always try to avoid 1024-byte
> allocations on the stack, since that eats up a full 1/8th (or 1/4 in the
> case of 4k stacks) of the total stack space.

Right, it's on the stack, so I was wondering what we should do
instead....

> If, OTOH, that memory is being allocated dynamically, then the rule is
> "don't let rpciod sleep".

OK, so, looking around, the buf_alloc methods might provide examples to
follow for dynamic allocation here?

--b.

2011-07-22 01:42:07

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH] RDMA: Increasing RPCRDMA_MAX_DATA_SEGS

On Thu, 2011-07-21 at 17:41 -0400, J. Bruce Fields wrote:
> On Thu, Jul 21, 2011 at 01:49:02PM -0400, Steve Dickson wrote:
> > Our performance team has noticed that increasing
> > RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> > increases throughput when using the RDMA transport.
>
> The main risk that I can see is that we have these on the stack in two
> places:
>
> rpcrdma_register_fmr_external(struct rpcrdma_mr_seg *seg, ...
> {
> ...
> u64 physaddrs[RPCRDMA_MAX_DATA_SEGS];
>
> rpcrdma_register_default_external(struct rpcrdma_mr_seg *seg, ...
> {
> ...
> struct ib_phys_buf ipb[RPCRDMA_MAX_DATA_SEGS];
>
> Where struct ib_phys_buf is 16 bytes.
>
> So that's 512 bytes in the first case, 1024 in the second. This is
> called from rpciod--what are our rules about allocating memory from
> rpciod?

Is that allocated on the stack? We should always try to avoid 1024-byte
allocations on the stack, since that eats up a full 1/8th (or 1/4 in the
case of 4k stacks) of the total stack space.

If, OTOH, that memory is being allocated dynamically, then the rule is
"don't let rpciod sleep".

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2011-07-25 15:18:28

by Steve Dickson

[permalink] [raw]
Subject: Re: [PATCH] RDMA: Increasing RPCRDMA_MAX_DATA_SEGS

Sorry for the delayed response... I took a day off.

On 07/22/2011 04:19 AM, Max Matveev wrote:
> On Thu, 21 Jul 2011 13:49:02 -0400, Steve Dickson wrote:
>
> steved> Our performance team has noticed that increasing
> steved> RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> steved> increases throughput when using the RDMA transport.
>
> Did they try new client with old server and vice versa?
> Both read and write?
I believe it was done on the server side, but I've cc-ed the
person who did the testing....

steved.

2011-07-22 08:19:08

by Max Matveev

[permalink] [raw]
Subject: Re: [PATCH] RDMA: Increasing RPCRDMA_MAX_DATA_SEGS

On Thu, 21 Jul 2011 13:49:02 -0400, Steve Dickson wrote:

steved> Our performance team has noticed that increasing
steved> RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
steved> increases throughput when using the RDMA transport.

Did they try new client with old server and vice versa?
Both read and write?

max