On Tue, Sep 16, 2008 at 06:34:39AM -0500, Tom Tucker wrote:
> This patch adds security related documentation to the nfs-rdma.txt file
> that describes the memory registration model, the potential security
> exploits, and compares these exploits to a similar threat when using TCP
> as the transport.
Thanks for doing this.
>
> Signed-off-by: Tom Tucker <[email protected]>
>
> ---
> Documentation/filesystems/nfs-rdma.txt | 66 ++++++++++++++++++++++++++++++++
> 1 files changed, 66 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/filesystems/nfs-rdma.txt b/Documentation/filesystems/nfs-rdma.txt
> index 44bd766..41f0fb2 100644
> --- a/Documentation/filesystems/nfs-rdma.txt
> +++ b/Documentation/filesystems/nfs-rdma.txt
> @@ -269,3 +269,69 @@ NFS/RDMA Setup
> the "proto" field for the given mount.
>
> Congratulations! You're using NFS/RDMA!
> +
> +Security
> +--------
> +
> + NFSRDMA exploits the RDMA capabilities of the IB and iWARP
> + transports to more efficiently exchange RPC data between the client
> + and the server. This section discusses the security implications of
> + the exchange of memory information on the wire when the wire may be
> + monitorable by an untrusted application. The identifier that
> + encapsulates this memory information is called an RKEY.
> +
> + A principal exploit is that a node listening on a mirror port of a
> + switch
There are probably always other ways to trick the switch into sending
an attacker some of the traffic. It might be simpler just to say "a
node on the local network".
> + could snoop RDMA packets containing RKEY and then forge a
> + packet with this RKEY to write and/or read the memory of the peer to
> + which the RKEY referred.
> +
> + The NFSRDMA protocol is defined such that a) only the server
> + initiates RDMA, and b) only the client's memory is exposed via
> + RKEY. This is why the server reads to fetch RPC data from the client
> + even though it would be more efficient for the client to write the
> + data to the server's memory. This design goal is not entirely
> + realized with iWARP, however, because the RKEY (called an STag on
> + iWARP) for the data sink of an RDMA_READ is actually placed on the
> + wire, and this RKEY has Remote Write permission. This means that the
> + server's memory is exposed by virtue of having placed the RKEY for
> + it's local memory on the wire in order to receive the result of the
s/it's/its/
> + RDMA_READ.
> +
> + By contrast, IB uses an opaque transaction ID# to associate the
> + READ_RPL with the READ_REQ and the data sink of a READ_REQ does not
> + require remote access. That said, the byzantine node in question
> + could forge a packet with this transaction ID and corrupt the target
> + memory, however, the scope of the exploit is bounded to the lifetime
> + of this single RDMA_READ request and to the memory mapped by the
> + data sink of the READ_REQ.
> +
> + The newer RDMA adapters (both iWARP and IB) support "Fast Memory
> + Registration". This capability allows memory to be quickly
> + registered (i.e. made available for remote access) and de-registered
> + by submitting WR on the SQ. These capabilities provide a mechanism
> + to reduce the exposure discussed above by limiting the scope of the
> + exploit. The idea is to create an RKEY that only maps the single RPC
> + and whose effective lifetime is only the exchange of this single
> + RPC. This is the default memory model that is employed by the server
> + when supported by the adapter and by the client when the
> + rdma_memreg_strategy is set to 6. Note that the client and server
> + may use different memory registration strategies, however,
> + performance is better when both the client and server use the
> + FastReg memory registration strategy.
> +
> + This approach has two benefits, a) it restricts the domain of the
> + exploit to the memory of a single RPC, and b) it limits the duration
> + of the exploit to the time it takes to satisfy the RDMA_READ.
> +
> + It is arguable that a one-shot STag/RKEY is no less secure than RPC
> + on the TCP transport. Consider that the exact same byzantine
> + application could more easily corrupt TCP RPC payload by simply
> + forging a packet with the correct TCP sequence number -- in fact
> + it's easier than the RDMA exploit because the RDMA exploit requires
> + that you correctly forge both the TCP packet and the RDMA
> + payload. In addition the duration of the TCP exploit is the lifetime
> + of the connection, not the lifetime of a single WR/RPC data transfer.
> +
> + So if you buy the argument above, RDMA on IB or iWARP using Fast Reg
> + is no less secure than TCP.
I'd leave out the first seven words of that last sentence on the grounds
that it's implicit....
This explanation is helpful, thanks. It would also be helpful if we
could boil down the advice to just a sentence or two for the busy admin.
Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
rdma on a network where you cannot trust every machine....
And better at some point might be to allow nfs-utils to automatically
check for that situation, and/or just to drop support for anything that
can't provide at least a tcp/auth_unix-like security model.
--b.
J. Bruce Fields wrote:
> On Thu, Sep 25, 2008 at 08:35:26AM -0500, Tom Tucker wrote:
>> J. Bruce Fields wrote:
>>> This explanation is helpful, thanks. It would also be helpful if we
>>> could boil down the advice to just a sentence or two for the busy admin.
>>> Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
>>> rdma on a network where you cannot trust every machine....
>>
>> Would it be better to say, "Do not use RDMA on a network where your
>> policy requires a security model stronger than tcp/auth_unix."
>
> I'm not worried about the case where the security provided is roughly
> equivalent to that provided by tcp/auth_unix.
>
> I'm worried about the non-"Fast Reg" case where I thought you were
> saying that the network could access memory other than that meant to
> hold rpc data.
>
Ok, so maybe we could state the security exposure of ALLPHYSICAL instead
of dwelling on the relative differences between the similar exposures of
tcp/auth_unix vs. fastreg?
I'd also like to get to fastreg as a default at some point.
Tom:
What's your perspective on the lifetime of bounce buffers, memory
windows, and the other strategies in client?
Tom
> --b.
On Mon, Sep 29, 2008 at 10:07:25PM -0500, Tom Tucker wrote:
> J. Bruce Fields wrote:
>> On Thu, Sep 25, 2008 at 08:35:26AM -0500, Tom Tucker wrote:
>>> J. Bruce Fields wrote:
>>>> This explanation is helpful, thanks. It would also be helpful if we
>>>> could boil down the advice to just a sentence or two for the busy admin.
>>>> Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
>>>> rdma on a network where you cannot trust every machine....
>>>
>>> Would it be better to say, "Do not use RDMA on a network where your
>>> policy requires a security model stronger than tcp/auth_unix."
>>
>> I'm not worried about the case where the security provided is roughly
>> equivalent to that provided by tcp/auth_unix.
>>
>> I'm worried about the non-"Fast Reg" case where I thought you were
>> saying that the network could access memory other than that meant to
>> hold rpc data.
>>
>
> Ok, so maybe we could state the security exposure of ALLPHYSICAL instead
> of dwelling on the relative differences between the similar exposures of
> tcp/auth_unix vs. fastreg?
Right. It's all interesting, so I wouldn't cut anything out--but you
could put off the details to the end; e.g., start with "in situation
<...>, nfs/rdma provides roughly the same security as nfs over tcp and
auth_unix (see below for more details)".
> I'd also like to get to fastreg as a default at some point.
OK, good.
> What's your perspective on the lifetime of bounce buffers, memory
> windows, and the other strategies in client?
I'm ignorant. Pointer to something else I should read?
I assume there are similar issues on the client?
--b.
J. Bruce Fields wrote:
> On Mon, Sep 29, 2008 at 10:07:25PM -0500, Tom Tucker wrote:
>> J. Bruce Fields wrote:
>>> On Thu, Sep 25, 2008 at 08:35:26AM -0500, Tom Tucker wrote:
>>>> J. Bruce Fields wrote:
>>>>> This explanation is helpful, thanks. It would also be helpful if we
>>>>> could boil down the advice to just a sentence or two for the busy admin.
>>>>> Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
>>>>> rdma on a network where you cannot trust every machine....
>>>> Would it be better to say, "Do not use RDMA on a network where your
>>>> policy requires a security model stronger than tcp/auth_unix."
>>> I'm not worried about the case where the security provided is roughly
>>> equivalent to that provided by tcp/auth_unix.
>>>
>>> I'm worried about the non-"Fast Reg" case where I thought you were
>>> saying that the network could access memory other than that meant to
>>> hold rpc data.
>>>
>> Ok, so maybe we could state the security exposure of ALLPHYSICAL instead
>> of dwelling on the relative differences between the similar exposures of
>> tcp/auth_unix vs. fastreg?
>
> Right. It's all interesting, so I wouldn't cut anything out--but you
> could put off the details to the end; e.g., start with "in situation
> <...>, nfs/rdma provides roughly the same security as nfs over tcp and
> auth_unix (see below for more details)".
Sounds reasonable.
>
>> I'd also like to get to fastreg as a default at some point.
>
> OK, good.
>
>> What's your perspective on the lifetime of bounce buffers, memory
>> windows, and the other strategies in client?
>
> I'm ignorant. Pointer to something else I should read?
>
> I assume there are similar issues on the client?
>
Sorry, that was a question for Tom Talpey. It's a client specific issue
since the server only implements ALLPHYSICAL and FAST REG.
> --b.
On Tue, Sep 30, 2008 at 01:55:45PM -0500, Tom Tucker wrote:
> J. Bruce Fields wrote:
>> On Mon, Sep 29, 2008 at 10:07:25PM -0500, Tom Tucker wrote:
>>> What's your perspective on the lifetime of bounce buffers, memory
>>> windows, and the other strategies in client?
>>
>> I'm ignorant. Pointer to something else I should read?
>>
>> I assume there are similar issues on the client?
>>
>
> Sorry, that was a question for Tom Talpey.
Phew!
> It's a client specific issue since the server only implements
> ALLPHYSICAL and FAST REG.
OK.
--b.
At 02:44 PM 9/30/2008, J. Bruce Fields wrote:
>On Mon, Sep 29, 2008 at 10:07:25PM -0500, Tom Tucker wrote:
>> What's your perspective on the lifetime of bounce buffers, memory
>> windows, and the other strategies in client?
>
>I'm ignorant. Pointer to something else I should read?
>
>I assume there are similar issues on the client?
Tom's asking about the client memory registration options and
whether they should all remain in the client code going forward.
There are different strategies in there, depending on what the
hardware supports, how well it supports it, and how you want
to run it all.
The multiple strategies stem from the early days when no two
adapters did quite the same thing. That said, at least one of
them is no longer useful - no adapters support windows today,
since the demise of Ammasso, although the kernel API to drive
them is still there in the OFA stack below.
I think it's possible that some of the other modes can collapse, but
not just now. There's a lot of older hardware out there, and newer
hardware may appear to use the old stuff too.
Another concern I have, frankly, is interoperability. If we collapse
the modes, then the temptation to assume both ends have such-
and-such a capability increases. If one side tries some RDMA operation
that is only supported properly by a certain adapter, it's hard to detect
that in a monoculture. Having the option to switch modes can help avert
this, by testing for it.
In any case, if the current code doesn't work, it's a bug. Certainly
bouncebuffers (non-RDMA mode) should work perfectly and I plan to
check it asap.
Tom.
Bruce/Tom:
Below is an updated Documentation patch. Please take a look and tell me
what you think.
I've made all the changes to the code per Bruce's suggestions plus added
a patch to display the mem. reg. strategy used at mount time.
Please tweak the doc patch as needed and then I'll repost the whole lot.
Thanks,
Tom
From: Tom Tucker <[email protected]>
Date: Tue, 30 Sep 2008 14:41:30 -0500
Subject: [PATCH 10/11] svcrdma: Documentation update for the FastReg
memory model
This patch adds security related documentation to the nfs-rdma.txt file
that describes the memory registration model, the potential security
exploits, and compares these exploits to a similar threat when using TCP
as the transport.
Signed-off-by: Tom Tucker <[email protected]>
---
Documentation/filesystems/nfs-rdma.txt | 84 ++++++++++++++++++++++++++++++++
1 files changed, 84 insertions(+), 0 deletions(-)
diff --git a/Documentation/filesystems/nfs-rdma.txt b/Documentation/filesystems/nfs-rdma.txt
index 44bd766..266a57b 100644
--- a/Documentation/filesystems/nfs-rdma.txt
+++ b/Documentation/filesystems/nfs-rdma.txt
@@ -269,3 +269,87 @@ NFS/RDMA Setup
the "proto" field for the given mount.
Congratulations! You're using NFS/RDMA!
+
+Security
+--------
+
+ NFSRDMA exploits the RDMA capabilities of the IB and iWARP
+ transports to more efficiently exchange RPC data between the client
+ and the server. This section discusses the security implications of
+ the exchange of memory information on the wire when the wire may be
+ monitorable by an untrusted application. The identifier that
+ encapsulates this memory information is called an RKEY.
+
+ A principal exploit is that a node on the local network could snoop
+ RDMA packets containing RKEY and then forge a packet with this RKEY
+ to write and/or read the memory of the peer to which the RKEY
+ referred.
+
+ If the underlying RDMA device is capable of Fast Memory
+ Registration, then NFSRDMA is no less secure than TCP with
+ auth_unix. However, if the device does not support Fast Memory
+ Registration, then such a node could write anywhere in the server's
+ memory using the method above. At mount time, the server sends a
+ string to the message log to indicate whether or not Fast Memory
+ Registration is being used. If Fast Memory Registration is being
+ used, the string
+
+ "svcrdma: Using Fast Memory Registration"
+
+ is logged, otherwise,
+
+ "svcrdma: Using a Global DMA MR"
+
+ will be logged.
+
+ The sections below provide additional information on this issue.
+
+ The NFSRDMA protocol is defined such that a) only the server
+ initiates RDMA, and b) only the client's memory is exposed via
+ RKEY. This is why the server reads to fetch RPC data from the client
+ even though it would be more efficient for the client to write the
+ data to the server's memory. This design goal is not entirely
+ realized with iWARP, however, because the RKEY (called an STag on
+ iWARP) for the data sink of an RDMA_READ is actually placed on the
+ wire, and this RKEY has Remote Write permission. This means that the
+ server's memory is exposed by virtue of having placed the RKEY for
+ its local memory on the wire in order to receive the result of the
+ RDMA_READ.
+
+ By contrast, IB uses an opaque transaction ID# to associate the
+ READ_RPL with the READ_REQ and the data sink of a READ_REQ does not
+ require remote access. That said, the byzantine node in question
+ could forge a packet with this transaction ID and corrupt the target
+ memory, however, the scope of the exploit is bounded to the lifetime
+ of this single RDMA_READ request and to the memory mapped by the
+ data sink of the READ_REQ.
+
+ The newer RDMA adapters (both iWARP and IB) support "Fast Memory
+ Registration". This capability allows memory to be quickly
+ registered (i.e. made available for remote access) and de-registered
+ by submitting WR on the SQ. These capabilities provide a mechanism
+ to reduce the exposure discussed above by limiting the scope of the
+ exploit. The idea is to create an RKEY that only maps the single RPC
+ and whose effective lifetime is only the exchange of this single
+ RPC. This is the default memory model that is employed by the server
+ when supported by the adapter and by the client when the
+ rdma_memreg_strategy is set to 6. Note that the client and server
+ may use different memory registration strategies, however,
+ performance is better when both the client and server use the
+ FastReg memory registration strategy.
+
+ This approach has two benefits, a) it restricts the domain of the
+ exploit to the memory of a single RPC, and b) it limits the duration
+ of the exploit to the time it takes to satisfy the RDMA_READ.
+
+ It is arguable that a one-shot STag/RKEY is no less secure than RPC
+ on the TCP transport. Consider that the exact same byzantine
+ application could more easily corrupt TCP RPC payload by simply
+ forging a packet with the correct TCP sequence number -- in fact
+ it's easier than the RDMA exploit because the RDMA exploit requires
+ that you correctly forge both the TCP packet and the RDMA
+ payload. In addition the duration of the TCP exploit is the lifetime
+ of the connection, not the lifetime of a single WR/RPC data transfer.
+
+ RDMA on IB or iWARP using Fast Reg is no less secure than TCP.
+
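To make the scope argument in the text above concrete, here is a toy model in Python. It is only a sketch under stated assumptions: ToyAdapter, register(), invalidate() and remote_write_ok() are hypothetical names invented for illustration and do not correspond to any real verbs API. It contrasts a registration that covers all of memory and is never invalidated (the Global DMA MR case) with one that covers a single RPC buffer and is invalidated when the RPC completes (the FastReg case).

```python
class ToyAdapter:
    """Hypothetical stand-in for an RDMA adapter's registration table."""

    def __init__(self, mem_size):
        self.mem_size = mem_size
        self.regions = {}      # rkey -> (start, length)
        self.next_rkey = 1

    def register(self, start, length):
        """Make [start, start + length) remotely accessible; return its RKEY."""
        rkey = self.next_rkey
        self.next_rkey += 1
        self.regions[rkey] = (start, length)
        return rkey

    def invalidate(self, rkey):
        """Withdraw remote access for this RKEY."""
        self.regions.pop(rkey, None)

    def remote_write_ok(self, rkey, addr, size=1):
        """Would a (possibly forged) remote write using this RKEY be accepted?"""
        region = self.regions.get(rkey)
        if region is None:
            return False
        start, length = region
        return start <= addr and addr + size <= start + length


adapter = ToyAdapter(mem_size=1 << 20)

# Global DMA MR: one RKEY for all memory, valid for the life of the connection.
global_key = adapter.register(0, adapter.mem_size)

# FastReg: one RKEY for a single RPC's buffer.
rpc_key = adapter.register(4096, 8192)

print(adapter.remote_write_ok(global_key, 500000))  # anywhere in memory
print(adapter.remote_write_ok(rpc_key, 500000))     # outside the RPC buffer

adapter.invalidate(rpc_key)                         # the RPC has completed
print(adapter.remote_write_ok(rpc_key, 4096))       # snooped key is now stale
```

In this model a snooped FastReg key is useful only inside the RPC buffer and only until the invalidate, which is the bounded-domain, bounded-duration property the text argues for.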
At 09:35 AM 9/25/2008, Tom Tucker wrote:
>> This explanation is helpful, thanks. It would also be helpful if we
>> could boil down the advice to just a sentence or two for the busy admin.
>> Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
>> rdma on a network where you cannot trust every machine....
>
>
>Would it be better to say, "Do not use RDMA on a network where your
>policy requires a security model stronger than tcp/auth_unix."
No! This would confuse integrity and privacy concerns (the root of the
RDMA attack you describe) with authentication. While it's true there are
different attacks with a different transport, they do not in any way
contravene the protections in the RPC and NFS layers.
In fact, I believe the text is unfairly portraying a vulnerability in iWARP
as residing in NFS/RDMA, which it isn't.
While many of today's adapters allow so-called "type 2" RKEYs, the
protocol does not encourage them, and their use introduces these
risks. The risks are avoidable. The IETF RFCs describe these in detail,
for both RDDP and NFS/RPC/RDMA.
Tom.
On Thu, Sep 25, 2008 at 08:35:26AM -0500, Tom Tucker wrote:
> J. Bruce Fields wrote:
>> This explanation is helpful, thanks. It would also be helpful if we
>> could boil down the advice to just a sentence or two for the busy admin.
>> Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
>> rdma on a network where you cannot trust every machine....
>
>
> Would it be better to say, "Do not use RDMA on a network where your
> policy requires a security model stronger than tcp/auth_unix."
I'm not worried about the case where the security provided is roughly
equivalent to that provided by tcp/auth_unix.
I'm worried about the non-"Fast Reg" case where I thought you were
saying that the network could access memory other than that meant to
hold rpc data.
--b.
J. Bruce Fields wrote:
> On Tue, Sep 16, 2008 at 06:34:39AM -0500, Tom Tucker wrote:
>> This patch adds security related documentation to the nfs-rdma.txt file
>> that describes the memory registration model, the potential security
>> exploits, and compares these exploits to a similar threat when using TCP
>> as the transport.
>
> Thanks for doing this.
>
>> Signed-off-by: Tom Tucker <[email protected]>
>>
>> ---
>> Documentation/filesystems/nfs-rdma.txt | 66 ++++++++++++++++++++++++++++++++
>> 1 files changed, 66 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/filesystems/nfs-rdma.txt b/Documentation/filesystems/nfs-rdma.txt
>> index 44bd766..41f0fb2 100644
>> --- a/Documentation/filesystems/nfs-rdma.txt
>> +++ b/Documentation/filesystems/nfs-rdma.txt
>> @@ -269,3 +269,69 @@ NFS/RDMA Setup
>> the "proto" field for the given mount.
>>
>> Congratulations! You're using NFS/RDMA!
>> +
>> +Security
>> +--------
>> +
>> + NFSRDMA exploits the RDMA capabilities of the IB and iWARP
>> + transports to more efficiently exchange RPC data between the client
>> + and the server. This section discusses the security implications of
>> + the exchange of memory information on the wire when the wire may be
>> + monitorable by an untrusted application. The identifier that
>> + encapsulates this memory information is called an RKEY.
>> +
>> + A principal exploit is that a node listening on a mirror port of a
>> + switch
>
> There are probably always other ways to trick the switch into sending
> an attacker some of the traffic. It might be simpler just to say "a
> node on the local network".
Ok.
>
>> + could snoop RDMA packets containing RKEY and then forge a
>> + packet with this RKEY to write and/or read the memory of the peer to
>> + which the RKEY referred.
>> +
>> + The NFSRDMA protocol is defined such that a) only the server
>> + initiates RDMA, and b) only the client's memory is exposed via
>> + RKEY. This is why the server reads to fetch RPC data from the client
>> + even though it would be more efficient for the client to write the
>> + data to the server's memory. This design goal is not entirely
>> + realized with iWARP, however, because the RKEY (called an STag on
>> + iWARP) for the data sink of an RDMA_READ is actually placed on the
>> + wire, and this RKEY has Remote Write permission. This means that the
>> + server's memory is exposed by virtue of having placed the RKEY for
>> + it's local memory on the wire in order to receive the result of the
>
> s/it's/its/
>
Yes, erf.
>> + RDMA_READ.
>> +
>> + By contrast, IB uses an opaque transaction ID# to associate the
>> + READ_RPL with the READ_REQ and the data sink of a READ_REQ does not
>> + require remote access. That said, the byzantine node in question
>> + could forge a packet with this transaction ID and corrupt the target
>> + memory, however, the scope of the exploit is bounded to the lifetime
>> + of this single RDMA_READ request and to the memory mapped by the
>> + data sink of the READ_REQ.
>> +
>> + The newer RDMA adapters (both iWARP and IB) support "Fast Memory
>> + Registration". This capability allows memory to be quickly
>> + registered (i.e. made available for remote access) and de-registered
>> + by submitting WR on the SQ. These capabilities provide a mechanism
>> + to reduce the exposure discussed above by limiting the scope of the
>> + exploit. The idea is to create an RKEY that only maps the single RPC
>> + and whose effective lifetime is only the exchange of this single
>> + RPC. This is the default memory model that is employed by the server
>> + when supported by the adapter and by the client when the
>> + rdma_memreg_strategy is set to 6. Note that the client and server
>> + may use different memory registration strategies, however,
>> + performance is better when both the client and server use the
>> + FastReg memory registration strategy.
>> +
>> + This approach has two benefits, a) it restricts the domain of the
>> + exploit to the memory of a single RPC, and b) it limits the duration
>> + of the exploit to the time it takes to satisfy the RDMA_READ.
>> +
>> + It is arguable that a one-shot STag/RKEY is no less secure than RPC
>> + on the TCP transport. Consider that the exact same byzantine
>> + application could more easily corrupt TCP RPC payload by simply
>> + forging a packet with the correct TCP sequence number -- in fact
>> + it's easier than the RDMA exploit because the RDMA exploit requires
>> + that you correctly forge both the TCP packet and the RDMA
>> + payload. In addition the duration of the TCP exploit is the lifetime
>> + of the connection, not the lifetime of a single WR/RPC data transfer.
>> +
>> + So if you buy the argument above, RDMA on IB or iWARP using Fast Reg
>> + is no less secure than TCP.
>
> I'd leave out the first seven words of that last sentence on the grounds
> that it's implicit....
Agreed.
>
> This explanation is helpful, thanks. It would also be helpful if we
> could boil down the advice to just a sentence or two for the busy admin.
> Something like: unless you have card XYZ and kernel 2.6.y, do *not* use
> rdma on a network where you cannot trust every machine....
Would it be better to say, "Do not use RDMA on a network where your
policy requires a security model stronger than tcp/auth_unix."
>
> And better at some point might be to allow nfs-utils to automatically
> check for that situation, and/or just to drop support for anything that
> can't provide at least a tcp/auth_unix-like security model.
>
> --b.
Thanks, I think this is much more helpful.
On Tue, Sep 30, 2008 at 03:17:21PM -0500, Tom Tucker wrote:
> +Security
> +--------
> +
> + NFSRDMA exploits the RDMA capabilities of the IB and iWARP
> + transports to more efficiently exchange RPC data between the client
> + and the server. This section discusses the security implications of
> + the exchange of memory information on the wire when the wire may be
> + monitorable by an untrusted application. The identifier that
> + encapsulates this memory information is called an RKEY.
> +
> + A principal exploit is that a node on the local network could snoop
> + RDMA packets containing RKEY and then forge a packet with this RKEY
> + to write and/or read the memory of the peer to which the RKEY
> + referred.
> +
> + If the underlying RDMA device is capable of Fast Memory
> + Registration, then NFSRDMA is no less secure than TCP with
> + auth_unix. However, if the device does not support Fast Memory
> + Registration, then such a node could write anywhere in the server's
> + memory using the method above. At mount time, the server sends a
The server doesn't really know about mounts, especially not at this
level, so I assume you mean either server start time or client connect
time?
> + string to the message log to indicate whether or not Fast Memory
> + Registration is being used. If Fast Memory Registration is being
> + used, the string
> +
> + "svcrdma: Using Fast Memory Registration"
> +
> + is logged, otherwise,
> +
> + "svcrdma: Using a Global DMA MR"
> +
> + will be logged.
It'd be nicer to have something that can be queried by a program--a file
in proc or nfsd, for example--without having to grep through log files.
(Or is it possible the drivers already export enough information under
sysfs someplace to figure this out with a simple script?)
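For reference, the log-scraping approach would look roughly like the Python sketch below. It assumes the two strings are logged exactly as quoted in the patch above; the function name registration_modes and the mode labels are hypothetical, and how the log text is obtained (dmesg, syslog file, etc.) is left to the caller.

```python
# Log strings quoted from the proposed documentation patch.
FAST_REG = "svcrdma: Using Fast Memory Registration"
GLOBAL_MR = "svcrdma: Using a Global DMA MR"


def registration_modes(log_text):
    """Return the registration mode named by each matching log line, in order."""
    modes = []
    for line in log_text.splitlines():
        if FAST_REG in line:
            modes.append("fastreg")
        elif GLOBAL_MR in line:
            modes.append("global-dma-mr")
    return modes


sample = (
    "kernel: svcrdma: Using Fast Memory Registration\n"
    "kernel: unrelated message\n"
    "kernel: svcrdma: Using a Global DMA MR\n"
)
print(registration_modes(sample))
```

Note that this only tells you which messages appeared, not which adapter or connection each one refers to, which is part of why a queryable interface would be nicer.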
Or maybe the non-fast registration stuff should be under a separate
configuration option entirely? Distros could eventually enable only
the safer configurations and people doing testing could build their own
kernels with the rest enabled.
My initial impulse is to be a bit scared of the non-fast-registration
case, but maybe I don't understand how this hardware is deployed.
--b.
> +
> + The sections below provide additional information on this issue.
> +
> + The NFSRDMA protocol is defined such that a) only the server
> + initiates RDMA, and b) only the client's memory is exposed via
> + RKEY. This is why the server reads to fetch RPC data from the client
> + even though it would be more efficient for the client to write the
> + data to the server's memory. This design goal is not entirely
> + realized with iWARP, however, because the RKEY (called an STag on
> + iWARP) for the data sink of an RDMA_READ is actually placed on the
> + wire, and this RKEY has Remote Write permission. This means that the
> + server's memory is exposed by virtue of having placed the RKEY for
> + its local memory on the wire in order to receive the result of the
> + RDMA_READ.
> +
> + By contrast, IB uses an opaque transaction ID# to associate the
> + READ_RPL with the READ_REQ and the data sink of a READ_REQ does not
> + require remote access. That said, the byzantine node in question
> + could forge a packet with this transaction ID and corrupt the target
> + memory, however, the scope of the exploit is bounded to the lifetime
> + of this single RDMA_READ request and to the memory mapped by the
> + data sink of the READ_REQ.
> +
> + The newer RDMA adapters (both iWARP and IB) support "Fast Memory
> + Registration". This capability allows memory to be quickly
> + registered (i.e. made available for remote access) and de-registered
> + by submitting WR on the SQ. These capabilities provide a mechanism
> + to reduce the exposure discussed above by limiting the scope of the
> + exploit. The idea is to create an RKEY that only maps the single RPC
> + and whose effective lifetime is only the exchange of this single
> + RPC. This is the default memory model that is employed by the server
> + when supported by the adapter and by the client when the
> + rdma_memreg_strategy is set to 6. Note that the client and server
> + may use different memory registration strategies, however,
> + performance is better when both the client and server use the
> + FastReg memory registration strategy.
> +
> + This approach has two benefits, a) it restricts the domain of the
> + exploit to the memory of a single RPC, and b) it limits the duration
> + of the exploit to the time it takes to satisfy the RDMA_READ.
> +
> + It is arguable that a one-shot STag/RKEY is no less secure than RPC
> + on the TCP transport. Consider that the exact same byzantine
> + application could more easily corrupt TCP RPC payload by simply
> + forging a packet with the correct TCP sequence number -- in fact
> + it's easier than the RDMA exploit because the RDMA exploit requires
> + that you correctly forge both the TCP packet and the RDMA
> + payload. In addition the duration of the TCP exploit is the lifetime
> + of the connection, not the lifetime of a single WR/RPC data transfer.
> +
> + RDMA on IB or iWARP using Fast Reg is no less secure than TCP.
> +
>
J. Bruce Fields wrote:
> Thanks, I think this is much more helpful.
>
> On Tue, Sep 30, 2008 at 03:17:21PM -0500, Tom Tucker wrote:
>> +Security
>> +--------
>> +
>> + NFSRDMA exploits the RDMA capabilities of the IB and iWARP
>> + transports to more efficiently exchange RPC data between the client
>> + and the server. This section discusses the security implications of
>> + the exchange of memory information on the wire when the wire may be
>> + monitorable by an untrusted application. The identifier that
>> + encapsulates this memory information is called an RKEY.
>> +
>> + A principal exploit is that a node on the local network could snoop
>> + RDMA packets containing RKEY and then forge a packet with this RKEY
>> + to write and/or read the memory of the peer to which the RKEY
>> + referred.
>> +
>> + If the underlying RDMA device is capable of Fast Memory
>> + Registration, then NFSRDMA is no less secure than TCP with
>> + auth_unix. However, if the device does not support Fast Memory
>> + Registration, then such a node could write anywhere in the server's
>> + memory using the method above. At mount time, the server sends a
>
> The server doesn't really know about mounts, especially not at this
> level, so I assume you mean either server start time or client connect
> time?
Right, client connect time, I'll fix. Thanks.
>
>> + string to the message log to indicate whether or not Fast Memory
>> + Registration is being used. If Fast Memory Registration is being
>> + used, the string
>> +
>> + "svcrdma: Using Fast Memory Registration"
>> +
>> + is logged, otherwise,
>> +
>> + "svcrdma: Using a Global DMA MR"
>> +
>> + will be logged.
>
> It'd be nicer to have something that can be queried by a program--a file
> in proc or nfsd, for example--without having to grep through log files.
> (Or is it possible the drivers already export enough information under
> sysfs someplace to figure this out with a simple script?)
Yes, it's gross. But I was trying to keep it simple for the first go-round and
since it is conceivable that you have two adapters, one that supports FRMR and
the other doesn't, you would need a proc file per adapter. All my systems have
both iWARP and IB adapters in them. So half my connections are DMA MR and the
other half FRMR.
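
In the meantime, the two log strings quoted above are enough for a small script to tell the modes apart. The following is only a sketch: the function name and the mode tags are made up for illustration, and it assumes the messages appear verbatim in whatever log text you feed it. It returns one tag per matching line, so a box with mixed adapters shows both modes.

```python
# The two message strings are taken from the proposed documentation text.
# Everything else here (function name, mode tags) is hypothetical.
FAST_REG_MSG = "svcrdma: Using Fast Memory Registration"
GLOBAL_MR_MSG = "svcrdma: Using a Global DMA MR"


def classify_svcrdma_lines(lines):
    """Return one tag per svcrdma registration message, in log order."""
    modes = []
    for line in lines:
        if FAST_REG_MSG in line:
            modes.append("fastreg")
        elif GLOBAL_MR_MSG in line:
            modes.append("global-dma-mr")
    return modes
```

Pass it any iterable of log lines, for example the captured output of dmesg split into lines. Where the messages actually end up depends on how logging is set up on the system.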
>
> Or maybe the non-fast registration stuff should be under a separate
> configuration option entirely? Distros could eventually enable only
> the safer configurations and people doing testing could build their own
> kernels with the rest enabled.
Perhaps, or maybe a module option that specifically disables DMA_MR. Also
note that with IB the DMA MR RKEY is not put on the wire, so I think I
need to qualify the warning somewhat.
>
> My initial impulse is to be a bit scared of the non-fast-registration
> case, but maybe I don't understand how this hardware is deployed.
>
In practice, I think the exposure is real, but somewhat academic.
Obviously as this sees wider adoption the likelihood that this could be
deployed on a network with untrusted hosts grows significantly. Today
I don't believe that's the case.
I would lean towards the module option and perhaps a Kconfig option that
allows you to tweak the default. I also think the policy should be transport
dependent. IOW, DMA MR is OK for IB, but verboten for iWARP.
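
To make that policy concrete, here is a rough sketch of the decision, written in Python only for brevity. All names are hypothetical, allow_dma_mr stands in for the module option, and refusing the connection when no acceptable mode exists is my assumption about the fallback, not something that is implemented.

```python
# Hypothetical sketch of the proposed policy; not the svcrdma interface.
FASTREG = "fastreg"
GLOBAL_DMA_MR = "global-dma-mr"


def pick_registration(transport, device_has_fastreg, allow_dma_mr=True):
    """Return the registration mode to use, or None to refuse.

    transport: "ib" or "iwarp"
    device_has_fastreg: adapter supports Fast Memory Registration
    allow_dma_mr: stand-in for the proposed module option
    """
    # Fast Memory Registration is always preferred when available.
    if device_has_fastreg:
        return FASTREG
    # A global DMA MR is tolerated only on IB, and only if the
    # administrator has not disabled it.
    if allow_dma_mr and transport == "ib":
        return GLOBAL_DMA_MR
    # Assumption: with no acceptable mode, the connection is refused.
    return None
```

So an iWARP adapter without Fast Reg gets None, while an IB adapter without Fast Reg gets the global DMA MR unless the option turns it off.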
Thanks for the feedback,
Tom
> --b.
>
>> +
>> + The sections below provide additional information on this issue.
>> +
>> + The NFSRDMA protocol is defined such that a) only the server
>> + initiates RDMA, and b) only the client's memory is exposed via
>> + RKEY. This is why the server reads to fetch RPC data from the client
>> + even though it would be more efficient for the client to write the
>> + data to the server's memory. This design goal is not entirely
>> + realized with iWARP, however, because the RKEY (called an STag on
>> + iWARP) for the data sink of an RDMA_READ is actually placed on the
>> + wire, and this RKEY has Remote Write permission. This means that the
>> + server's memory is exposed by virtue of having placed the RKEY for
>> + its local memory on the wire in order to receive the result of the
>> + RDMA_READ.
>> +
>> + By contrast, IB uses an opaque transaction ID# to associate the
>> + READ_RPL with the READ_REQ and the data sink of a READ_REQ does not
>> + require remote access. That said, the byzantine node in question
>> + could forge a packet with this transaction ID and corrupt the target
>> + memory; however, the scope of the exploit is bounded to the lifetime
>> + of this single RDMA_READ request and to the memory mapped by the
>> + data sink of the READ_REQ.
>> +
>> + The newer RDMA adapters (both iWARP and IB) support "Fast Memory
>> + Registration". This capability allows memory to be quickly
>> + registered (i.e. made available for remote access) and de-registered
>> + by submitting WR on the SQ. These capabilities provide a mechanism
>> + to reduce the exposure discussed above by limiting the scope of the
>> + exploit. The idea is to create an RKEY that only maps the single RPC
>> + and whose effective lifetime is only the exchange of this single
>> + RPC. This is the default memory model that is employed by the server
>> + when supported by the adapter and by the client when the
>> + rdma_memreg_strategy is set to 6. Note that the client and server
>> + may use different memory registration strategies; however,
>> + performance is better when both the client and server use the
>> + FastReg memory registration strategy.
>> +
>> + This approach has two benefits: a) it restricts the domain of the
>> + exploit to the memory of a single RPC, and b) it limits the duration
>> + of the exploit to the time it takes to satisfy the RDMA_READ.
>> +
>> + It is arguable that a one-shot STag/RKEY is no less secure than RPC
>> + on the TCP transport. Consider that the exact same byzantine
>> + application could more easily corrupt TCP RPC payload by simply
>> + forging a packet with the correct TCP sequence number -- in fact
>> + it's easier than the RDMA exploit because the RDMA exploit requires
>> + that you correctly forge both the TCP packet and the RDMA
>> + payload. In addition the duration of the TCP exploit is the lifetime
>> + of the connection, not the lifetime of a single WR/RPC data transfer.
>> +
>> + RDMA on IB or iWARP using Fast Reg is no less secure than TCP.
>> +
>>