Our performance team has noticed that increasing
RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
increases throughput when using the RDMA transport.
Signed-off-by: Steve Dickson <[email protected]>
---
net/sunrpc/xprtrdma/xprt_rdma.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index cae761a..5d1cfe5 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -109,7 +109,7 @@ struct rpcrdma_ep {
 */

 /* temporary static scatter/gather max */
-#define RPCRDMA_MAX_DATA_SEGS (8) /* max scatter/gather */
+#define RPCRDMA_MAX_DATA_SEGS (64) /* max scatter/gather */
#define RPCRDMA_MAX_SEGS (RPCRDMA_MAX_DATA_SEGS + 2) /* head+tail = 2 */
#define MAX_RPCRDMAHDR (\
/* max supported RPC/RDMA header */ \
--
1.7.6
On Thu, Jul 21, 2011 at 01:49:02PM -0400, Steve Dickson wrote:
> Our performance team has noticed that increasing
> RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> increases throughput when using the RDMA transport.
The main risk that I can see is that we have these on the stack in two
places:
rpcrdma_register_fmr_external(struct rpcrdma_mr_seg *seg, ...
{
...
u64 physaddrs[RPCRDMA_MAX_DATA_SEGS];
rpcrdma_register_default_external(struct rpcrdma_mr_seg *seg, ...
{
...
struct ib_phys_buf ipb[RPCRDMA_MAX_DATA_SEGS];
Where struct ib_phys_buf is 16 bytes.
So that's 512 bytes in the first case, 1024 in the second. This is
called from rpciod--what are our rules about allocating memory from
rpciod?
--b.
>
> Signed-off-by: Steve Dickson <[email protected]>
> ---
> net/sunrpc/xprtrdma/xprt_rdma.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
> index cae761a..5d1cfe5 100644
> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
> @@ -109,7 +109,7 @@ struct rpcrdma_ep {
> */
>
> /* temporary static scatter/gather max */
> -#define RPCRDMA_MAX_DATA_SEGS (8) /* max scatter/gather */
> +#define RPCRDMA_MAX_DATA_SEGS (64) /* max scatter/gather */
> #define RPCRDMA_MAX_SEGS (RPCRDMA_MAX_DATA_SEGS + 2) /* head+tail = 2 */
> #define MAX_RPCRDMAHDR (\
> /* max supported RPC/RDMA header */ \
> --
> 1.7.6
>
On Thu, Jul 21, 2011 at 09:42:04PM -0400, Trond Myklebust wrote:
> On Thu, 2011-07-21 at 17:41 -0400, J. Bruce Fields wrote:
> > On Thu, Jul 21, 2011 at 01:49:02PM -0400, Steve Dickson wrote:
> > > Our performance team has noticed that increasing
> > > RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> > > increases throughput when using the RDMA transport.
> >
> The main risk that I can see is that we have these on the stack in two
> places:
> >
> > rpcrdma_register_fmr_external(struct rpcrdma_mr_seg *seg, ...
> > {
> > ...
> > u64 physaddrs[RPCRDMA_MAX_DATA_SEGS];
> >
> > rpcrdma_register_default_external(struct rpcrdma_mr_seg *seg, ...
> > {
> > ...
> > struct ib_phys_buf ipb[RPCRDMA_MAX_DATA_SEGS];
> >
> > Where struct ib_phys_buf is 16 bytes.
> >
> > So that's 512 bytes in the first case, 1024 in the second. This is
> > called from rpciod--what are our rules about allocating memory from
> > rpciod?
>
> Is that allocated on the stack? We should always try to avoid 1024-byte
> allocations on the stack, since that eats up a full 1/8th (or 1/4 in the
> case of 4k stacks) of the total stack space.
Right, it's on the stack, so I was wondering what we should do
instead....
> If, OTOH, that memory is being allocated dynamically, then the rule is
> "don't let rpciod sleep".
OK, so, looking around, the buf_alloc methods might provide examples to
follow for dynamic allocation here?
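For example (a very rough, untested sketch; the field names are invented,
and the req would have to be plumbed down into the registration
functions), the scratch arrays could be carved out of the per-request
memory that buf_alloc already sets up, so the hot path needs neither
stack space nor a sleeping allocation:

	/* sketch only: scratch space added to struct rpcrdma_req in
	 * xprt_rdma.h, allocated once with the request instead of
	 * living on the rpciod stack */
	u64			rl_physaddrs[RPCRDMA_MAX_DATA_SEGS];
	struct ib_phys_buf	rl_ipb[RPCRDMA_MAX_DATA_SEGS];

	/* ...then rpcrdma_register_fmr_external() would just point at it: */
	u64 *physaddrs = req->rl_physaddrs;	/* was: u64 physaddrs[RPCRDMA_MAX_DATA_SEGS]; */

That costs roughly 1.5KB more per preallocated request with the new
value. Failing that, a per-call kmalloc(..., GFP_NOWAIT) would at least
keep rpciod from sleeping, at the cost of having to handle allocation
failure in the registration path.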
--b.
On Thu, 2011-07-21 at 17:41 -0400, J. Bruce Fields wrote:
> On Thu, Jul 21, 2011 at 01:49:02PM -0400, Steve Dickson wrote:
> > Our performance team has noticed that increasing
> > RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> > increases throughput when using the RDMA transport.
>
> The main risk that I can see is that we have these on the stack in two
> places:
>
> rpcrdma_register_fmr_external(struct rpcrdma_mr_seg *seg, ...
> {
> ...
> u64 physaddrs[RPCRDMA_MAX_DATA_SEGS];
>
> rpcrdma_register_default_external(struct rpcrdma_mr_seg *seg, ...
> {
> ...
> struct ib_phys_buf ipb[RPCRDMA_MAX_DATA_SEGS];
>
> > Where struct ib_phys_buf is 16 bytes.
>
> So that's 512 bytes in the first case, 1024 in the second. This is
> called from rpciod--what are our rules about allocating memory from
> rpciod?
Is that allocated on the stack? We should always try to avoid 1024-byte
allocations on the stack, since that eats up a full 1/8th (or 1/4 in the
case of 4k stacks) of the total stack space.
If, OTOH, that memory is being allocated dynamically, then the rule is
"don't let rpciod sleep".
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
http://www.netapp.com
Sorry for the delayed response... I took a day off.
On 07/22/2011 04:19 AM, Max Matveev wrote:
> On Thu, 21 Jul 2011 13:49:02 -0400, Steve Dickson wrote:
>
> steved> Our performance team has noticed that increasing
> steved> RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
> steved> increases throughput when using the RDMA transport.
>
> Did they try a new client with an old server and vice versa?
> Both read and write?
I believe it was done on the server side, but I've cc'd the
person who did the testing...
steved.
On Thu, 21 Jul 2011 13:49:02 -0400, Steve Dickson wrote:
steved> Our performance team has noticed that increasing
steved> RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
steved> increases throughput when using the RDMA transport.
Did they try a new client with an old server and vice versa?
Both read and write?
max