2012-01-12 19:26:20

by Marc Aurele La France

Subject: RFC: NFS/RDMA, IPoIB MTU and [rw]size

Greetings.

I am currently in the process of moving a cluster I administer from
NFS/TCP to NFS/RDMA, and am running into a number of issues I'd like some
assistance with. Googling these doesn't help.

For background on what caused me to move to NFS/TCP in the first place,
please see the thread that starts at http://lkml.org/lkml/2010/8/23/204

The main reason I'm moving away from NFS/TCP is that something happened in
the later kernels that reduces its resilience. Specifically, the client
now permanently loses contact with the server whenever the latter fails to
allocate an RPC sk_buff due to memory fragmentation. Restarting the
server's nfsd threads fixes this problem, at least temporarily.
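
For what it's worth, "restarting" here need be nothing more elaborate than
bouncing the nfsd kernel threads on the server, along the lines of the
following (the thread count of 8 is only an example; use whatever the site
normally runs):

rpc.nfsd 0
rpc.nfsd 8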

I haven't nailed down when this started happening (somewhere since
2.6.38), nor am I inclined to do so. This new experience (for me) with
NFS/TCP has conclusively shown me that it is much more responsive with
smaller IPoIB MTU's. Thus I will instead be reducing that MTU from its
connected mode maximum of 65520, perhaps all the way down to datagram
mode's 2044, to completely factor out memory fragmentation effects. More
on that below.
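
For reference, the change itself is simple: lowering the MTU while staying in
connected mode takes only an "ip link" command, and dropping all the way to
datagram mode adds a mode switch. Something like the following (this assumes
the in-kernel ipoib driver and an interface named ib0; substitute the actual
interface names):

ip link set dev ib0 mtu 16384            # a smaller connected-mode MTU, or ...
echo datagram > /sys/class/net/ib0/mode  # ... switch modes entirely
ip link set dev ib0 mtu 2044             # and use the datagram-mode maximum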

In moving to NFS/RDMA and reducing the IPoIB MTU, I have seen the
following behaviours.

--

1) Random client-side BUG()'outs. In fact, these never finish producing a
complete stack trace. I've tracked this down to duplicate replies being
encountered by rpcrdma_reply_handler() in net/sunrpc/xprtrdma/rpc_rdma.c.
Frankly I don't see why rpcrdma_reply_handler() should BUG() out in that
case given TCP's behaviour in similar situations, documented requirements
for the use of BUG() & friends in the first place, and the fact that
rpcrdma_reply_handler() essentially "ignores" replies for which it cannot
find a corresponding request.

For the past few days now, I've been running the following on some of my
nodes with no ill effects. And yes, I do see the log message this
produces. This changes rpcrdma_reply_handler() to treat duplicate replies
in much the same way it treats replies for which it cannot find a request.

diff -adNpru linux-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c devel-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c
--- linux-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c 2011-12-21 14:00:46.000000000 -0700
+++ devel-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c 2011-12-29 07:25:59.000000000 -0700
@@ -776,7 +776,13 @@ repost:
                 " RPC request 0x%p xid 0x%08x\n",
                 __func__, rep, req, rqst, headerp->rm_xid);
 
-        BUG_ON(!req || req->rl_reply);
+        /* req cannot be NULL here */
+        if (req->rl_reply) {
+                spin_unlock(&xprt->transport_lock);
+                printk(KERN_NOTICE "RPC: %s: duplicate replies to request 0x%p: "
+                       "0x%p and 0x%p\n", __func__, req, req->rl_reply, rep);
+                goto repost;
+        }
 
         /* from here on, the reply is no longer an orphan */
         req->rl_reply = rep;

This would also apply, modulo patch fuzz, all the way back to 2.6.24.

--

2) Still client-side, I'm seeing a lot of these sequences ...

rpcrdma: connection to 10.0.6.1:20049 on mthca0, memreg 6 slots 32 ird 4
rpcrdma: connection to 10.0.6.1:20049 closed (-103)

103 is ECONNABORTED. memreg 6 is RPCRDMA_ALLPHYSICAL, so I'm assuming my
Mellanox adapters don't support the default RPCRDMA_FRMR (memreg 5). I've
traced these aborted connections to IB_CM_DREP_RECEIVED events being
received by cma_ib_handler() in drivers/infiniband/core/cma.c, but can go
no further given my limited understanding of what this code is supposed to
do. I am guessing, though, that these would presumably disappear when
switching back to datagram mode (cm == connected mode). These messages
don't appear to affect anything (the client simply reconnects and I've
seen no data corruption), but it would still be nice to know what's going
on here.

--

3) This one isn't related to NFS/RDMA per se, but to my attempts at reducing
the IPoIB MTU. Whenever I do so on the fly across the cluster, some, but not
all, IPoIB traffic simply times out. Even, in some cases, TCP connections
accept()'ed after the MTU reduction. Oddly, neither NFS/TCP nor NFS/RDMA
seem affected, but other things (MPI apps, torque, etc.) are, whether
started before or after the change. So, something, somewhere, remembers
the previous (larger) MTU (opensm?). It seems that the only way to clear
this "memory" is to reboot the entire cluster, something I'd rather avoid
if possible.

--

4) Lastly, I would like to ask for a better understanding of the
relationship, if any, between NFS/RDMA and the IPoIB MTU, and between
NFS/RDMA and [rw]size NFS mount parameters. What effect do these have on
NFS/RDMA? For [rw]size, I have found that specifying less than a page
(4K) results in data corruption.
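
For concreteness, the kind of mount I mean is along these lines (the server
name and export path are placeholders, 32768 is only an example size, and
20049 is the port showing up in the client logs above):

mount -t nfs -o rdma,port=20049,vers=3,rsize=32768,wsize=32768 server:/export /mnt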

--

Please CC me on any comments/flames about any of the above as I am not
subscribed to this list.

Thanks.

Marc.

+----------------------------------+----------------------------------+
| Marc Aurele La France | work: 1-780-492-9310 |
| Academic Information and | fax: 1-780-492-1729 |
| Communications Technologies | email: [email protected] |
| 352 General Services Building +----------------------------------+
| University of Alberta | |
| Edmonton, Alberta | Standard disclaimers apply |
| T6G 2H1 | |
| CANADA | |
+----------------------------------+----------------------------------+


2012-02-15 21:32:56

by Marc Aurele La France

Subject: Re: RFC: NFS/RDMA, IPoIB MTU and [rw]size

On Wed, 15 Feb 2012, Tom Tucker wrote:

> This looks correct to me.

... except that it doesn't work, and neither does your change at
http://git.openfabrics.org/git?p=~boomer/ofed_kernel/.git;a=commitdiff;h=217d68a9e4f8cb9c735e1098646f41fb36744ce9

> I assume these are v3 mounts?

Yes.

> BTW, when you say you're running NFS/TCP, are you running TCP over IPoIB?

Also yes.

In any case, I've switched back to NFS/TCP/IPoIB with a 2044 MTU (had to
reboot everything to get that done). So NFS/RDMA remains highly
experimental in my eyes. And it will remain so until the Linux kernel
community and the OpenFabrics community get their co-operation issues
resolved, if ever.

Marc.


2012-02-18 15:12:39

by Tom Tucker

Subject: Re: RFC: NFS/RDMA, IPoIB MTU and [rw]size

On 2/15/12 3:32 PM, Marc Aurele La France wrote:
> On Wed, 15 Feb 2012, Tom Tucker wrote:
>
>> This looks correct to me.
>
> ... except that it doesn't work, and neither does your change at
> http://git.openfabrics.org/git?p=~boomer/ofed_kernel/.git;a=commitdiff;h=217d68a9e4f8cb9c735e1098646f41fb36744ce9
>
>> I assume these are v3 mounts?
>
> Yes.
>

Ok, "good", because I have a patch that I had to make a change to get v4
to work.

>> BTW, when you say you're running NFS/TCP, are you running TCP over
>> IPoIB?
>
> Also yes.
> In any case, I've switched back to NFS/TCP/IPoIB with a 2044 MTU (had
> to reboot everything to get that done). So NFS/RDMA remains highly
> experimental in my eyes. And it will remain so until the Linux kernel
> community and the OpenFabrics community get their co-operation issues
> resolved, if ever.
>
> Marc.
>
>> On 1/12/12 1:17 PM, Marc Aurele La France wrote:
>>> Greetings.
>
>>> I am currently in the process of moving a cluster I administer from
>>> NFS/TCP to NFS/RDMA, and am running into a number of issues I'd like
>>> some
>>> assistance with. Googling these doesn't help.
>
>>> For background on what caused me to move to NFS/TCP in the first place,
>>> please see the thread that starts at http://lkml.org/lkml/2010/8/23/204
>
>>> The main reason I'm moving away from NFS/TCP is that something
>>> happened in
>>> the later kernels that reduces its resilience. Specifically, the
>>> client
>>> now permanently loses contact with the server whenever the latter
>>> fails to
>>> allocate an RPC sk_buff due to memory fragmentation. Restarting the
>>> server's nfsd's fixes this problem, at least temporarily.
>

I have seen this behavior 'forever'. Perhaps you weren't getting the
memory exhaustion on the older kernels?


>>> I haven't nailed down when this started happening (somewhere since
>>> 2.6.38), nor am I inclined to do so. This new experience (for me) with
>>> NFS/TCP has conclusively shown me that it is much more responsive with
>>> smaller IPoIB MTU's. Thus I will instead be reducing that MTU from its
>>> connected mode maximum of 65520, perhaps all the way down to datagram
>>> mode's 2044, to completely factor out memory fragmentation effects.
>>> More
>>> on that below.
>

Perhaps you are really comparing UD to CM?

>>> In moving to NFS/RDMA and reducing the IPoIB MTU, I have seen the
>>> following behaviours.
>
>>> --
>
>>> 1) Random client-side BUG()'outs. In fact, these never finish
>>> producing a
>>> complete stack trace. I've tracked this down to duplicate replies
>>> being
>>> encountered by rpcrdma_reply_handler() in
>>> net/sunrpc/xprtrdma/rpc_rdma.c.
>>> Frankly I don't see why rpcrdma_reply_handler() should BUG() out in
>>> that
>>> case given TCP's behaviour in similar situations, documented
>>> requirements
>>> for the use of BUG() & friends in the first place, and the fact that
>>> rpcrdma_reply_handler() essentially "ignores" replies for which it
>>> cannot
>>> find a corresponding request.
>
>>> For the past few days now, I've been running the following on some
>>> of my
>>> nodes with no ill effects. And yes, I do see the log message this
>>> produces. This changes rpcrdma_reply_handler() to treat duplicate
>>> replies
>>> in much the same way it treats replies for which it cannot find a
>>> request.
>
>>> diff -adNpru linux-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c devel-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c
>>> --- linux-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c 2011-12-21 14:00:46.000000000 -0700
>>> +++ devel-3.1.6/net/sunrpc/xprtrdma/rpc_rdma.c 2011-12-29 07:25:59.000000000 -0700
>>> @@ -776,7 +776,13 @@ repost:
>>>                  " RPC request 0x%p xid 0x%08x\n",
>>>                  __func__, rep, req, rqst, headerp->rm_xid);
>>>
>>> -        BUG_ON(!req || req->rl_reply);
>>> +        /* req cannot be NULL here */
>>> +        if (req->rl_reply) {
>>> +                spin_unlock(&xprt->transport_lock);
>>> +                printk(KERN_NOTICE "RPC: %s: duplicate replies to request 0x%p: "
>>> +                       "0x%p and 0x%p\n", __func__, req, req->rl_reply, rep);
>>> +                goto repost;
>>> +        }
>>>
>>>          /* from here on, the reply is no longer an orphan */
>>>          req->rl_reply = rep;
>
>>> This would also apply, modulo patch fuzz, all the way back to 2.6.24.
>
>>> --
>
>>> 2) Still client-side, I'm seeing a lot of these sequences ...
>
>>> rpcrdma: connection to 10.0.6.1:20049 on mthca0, memreg 6 slots 32
>>> ird 4
>>> rpcrdma: connection to 10.0.6.1:20049 closed (-103)
>
>>> 103 is ECONNABORTED. memreg 6 is RPCRDMA_ALLPHYSICAL, so I'm
>>> assuming my
>>> Mellanox adapters don't support the default RPCRDMA_FRMR (memreg
>>> 5). I've
>>> traced these aborted connections to IB_CM_DREP_RECEIVED events being
>>> received by cma_ib_handler() in drivers/infiniband/core/cma.c, but
>>> can go
>>> no further given my limited understanding of what this code is
>>> supposed to
>>> do. I am guessing though, that these would presumably disappear when
>>> switching back to datagram mode (cm == connected mode). These messages
>>> don't appear to affect anything (the client simply reconnects and I've
>>> seen no data corruption), but it would still be nice to know what's
>>> going
>>> on here.
>

You could try turning on debugging with "echo 511 >
/proc/sys/sunrpc/rpc_debug"; you will get lots of additional
information. Note, however, that the messages above are "normal". The
first one occurs when the mount happens, and the second when the
connection is closed, which happens after 5-6 minutes of inactivity on
the mount.
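
For completeness, the value is a bitmask (511 simply sets a broad range of the
debug flags), the extra output lands in the kernel log, and the same knob
turns it back off again:

echo 0 > /proc/sys/sunrpc/rpc_debug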


>>> --
>
>>> 3) isn't related to NFS/RDMA per se, but to my attempts at reducing the
>>> IPoIB MTU. Whenever I do so on the fly across the cluster, some but
>>> not
>>> all, IPoIB traffic simply times out. Even, in some cases, TCP
>>> connections
>>> accept()'ed after the MTU reduction. Oddly, neither NFS/TCP nor
>>> NFS/RDMA
>>> seem affected, but other things (MPI apps, torque, etc.) are, whether
>>> started before or after the change. So, something, somewhere,
>>> remembers
>>> the previous (larger) MTU (opensm?). It seems that the only way to
>>> clear
>>> this "memory" is to reboot the entire cluster, something I'd rather
>>> avoid
>>> if possible.
>
NFSRDMA doesn't run over IPoIB, so it doesn't care what you've set the
IPoIB MTU to be.

>>> --
>
>>> 4) Lastly, I would like to ask for a better understanding of the
>>> relationship, if any, between NFS/RDMA and the IPoIB MTU, and between
>>> NFS/RDMA and [rw]size NFS mount parameters. What effect do these
>>> have on
>>> NFS/RDMA? For [rw]size, I have found that specifying less than a
>>> page (4K) results in data corruption.
>

None.



2012-02-15 18:17:55

by Tom Tucker

Subject: Re: RFC: NFS/RDMA, IPoIB MTU and [rw]size

Hi Marc,

This looks correct to me. I assume these are v3 mounts?

BTW, when you say you're running NFS/TCP, are you running TCP over IPoIB?

Thanks,
Tom

