v1 was here: https://lkml.org/lkml/2023/3/10/1162
since then:
- split original 1/3 into 1/5 and 5/5
- new patch 2/5
- reorder members of anon union
- always send cookie in network order
Eric Blake (5):
uapi nbd: improve doc links to userspace spec
block nbd: send handle in network order
uapi nbd: add cookie alias to handle
block nbd: use req.cookie instead of req.handle
docs nbd: userspace NBD now favors github over sourceforge
Documentation/admin-guide/blockdev/nbd.rst | 2 +-
drivers/block/nbd.c | 6 +++---
include/uapi/linux/nbd.h | 25 +++++++++++++++++-----
3 files changed, 24 insertions(+), 9 deletions(-)
base-commit: 8d3c682a5e3d9dfc2448ecbb22f4cd48359b9e21
--
2.39.2
The uapi <linux/nbd.h> header declares a 'char handle[8]' per request;
which is overloaded in English (are you referring to "handle" the
verb, such as handling a signal or writing a callback handler, or
"handle" the noun, the value used in a lookup table to correlate a
response back to the request). Many client-side NBD implementations
(both servers and clients) have instead used 'uint64_t cookie' or
similar, as it is easier to directly assign an integer than to futz
around with memcpy. In fact, upstream documentation is now
encouraging this shift in terminology:
https://lists.debian.org/nbd/2023/03/msg00031.html
Accomplish this by use of an anonymous union to provide the alias for
anyone getting the definition from the uapi; this does not break
existing clients, while exposing the nicer name for those who prefer
it. Note that block/nbd.c still uses the term handle (in fact, it
actually combines a 32-bit cookie and a 32-bit tag into the 64-bit
handle), but that internal usage is not changed the public uapi, since
no compliant NBD server has any reason to inspect or alter the 64
bits sent over the socket.
Signed-off-by: Eric Blake <[email protected]>
---
v2: swap order of anonymous union and add comments to list favored name
---
include/uapi/linux/nbd.h | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/nbd.h b/include/uapi/linux/nbd.h
index 8797387caaf7..80ce0ef43afd 100644
--- a/include/uapi/linux/nbd.h
+++ b/include/uapi/linux/nbd.h
@@ -12,7 +12,7 @@
* 2004/02/19 Paul Clements
* Removed PARANOIA, plus various cleanup and comments
* 2023 Copyright Red Hat
- * Link to userspace extensions.
+ * Link to userspace extensions, favor cookie over handle.
*/
#ifndef _UAPILINUX_NBD_H
@@ -81,7 +81,10 @@ enum {
struct nbd_request {
__be32 magic; /* NBD_REQUEST_MAGIC */
__be32 type; /* See NBD_CMD_* */
- char handle[8];
+ union {
+ __be64 cookie; /* Opaque identifier for request */
+ char handle[8]; /* older spelling of cookie */
+ };
__be64 from;
__be32 len;
} __attribute__((packed));
@@ -93,6 +96,9 @@ struct nbd_request {
struct nbd_reply {
__be32 magic; /* NBD_REPLY_MAGIC */
__be32 error; /* 0 = ok, else error */
- char handle[8]; /* handle you got from request */
+ union {
+ __be64 cookie; /* Opaque identifier from request */
+ char handle[8]; /* older spelling of cookie */
+ };
};
#endif /* _UAPILINUX_NBD_H */
--
2.39.2
The NBD spec says the client handle (or cookie) is opaque on the
server, and therefore it really doesn't matter what endianness we use;
to date, the use of memcpy() between u64 and a char[8] has exposed
native endianness when treating the handle as a 64-bit number.
However, since NBD protocol documents that everything else is in
network order, and tools like Wireshark will dump even the contents of
the handle as seen over the network, it's worth using a consistent
ordering regardless of the native endianness.
Plus, using a consistent endianness now allows an upcoming patch to
simplify this to directly use integer assignment instead of memcpy().
Signed-off-by: Eric Blake <[email protected]>
---
v2: new patch
---
drivers/block/nbd.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 592cfa8b765a..8a9487e79f1c 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -560,6 +560,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
unsigned long size = blk_rq_bytes(req);
struct bio *bio;
u64 handle;
+ __be64 tmp;
u32 type;
u32 nbd_cmd_flags = 0;
int sent = nsock->sent, skip = 0;
@@ -606,7 +607,8 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
request.len = htonl(size);
}
handle = nbd_cmd_handle(cmd);
- memcpy(request.handle, &handle, sizeof(handle));
+ tmp = cpu_to_be64(handle);
+ memcpy(request.handle, &tmp, sizeof(tmp));
trace_nbd_send_request(&request, nbd->index, blk_mq_rq_from_pdu(cmd));
@@ -618,7 +620,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
trace_nbd_header_sent(req, handle);
if (result < 0) {
if (was_interrupted(result)) {
- /* If we havne't sent anything we can just return BUSY,
+ /* If we haven't sent anything we can just return BUSY,
* however if we have sent something we need to make
* sure we only allow this req to be sent until we are
* completely done.
@@ -727,12 +729,14 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
int result;
struct nbd_cmd *cmd;
struct request *req = NULL;
+ __be64 tmp;
u64 handle;
u16 hwq;
u32 tag;
int ret = 0;
- memcpy(&handle, reply->handle, sizeof(handle));
+ memcpy(&tmp, reply->handle, sizeof(tmp));
+ handle = be64_to_cpu(tmp);
tag = nbd_handle_to_tag(handle);
hwq = blk_mq_unique_tag_to_hwq(tag);
if (hwq < nbd->tag_set.nr_hw_queues)
--
2.39.2
The uapi <linux/nbd.h> header intentionally documents only the NBD
server features that the kernel module will utilize as a client. But
while it already had one mention of skipped bits due to userspace
extensions, it did not actually direct the reader to the canonical
source to learn about those extensions.
While touching comments, fix an outdated reference that listed only
READ and WRITE as commands.
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
---
v2: Split change to sourceforge link to separate patch
---
include/uapi/linux/nbd.h | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/nbd.h b/include/uapi/linux/nbd.h
index 20d6cc91435d..8797387caaf7 100644
--- a/include/uapi/linux/nbd.h
+++ b/include/uapi/linux/nbd.h
@@ -11,6 +11,8 @@
* Cleanup PARANOIA usage & code.
* 2004/02/19 Paul Clements
* Removed PARANOIA, plus various cleanup and comments
+ * 2023 Copyright Red Hat
+ * Link to userspace extensions.
*/
#ifndef _UAPILINUX_NBD_H
@@ -30,12 +32,18 @@
#define NBD_SET_TIMEOUT _IO( 0xab, 9 )
#define NBD_SET_FLAGS _IO( 0xab, 10)
+/*
+ * See also https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
+ * for additional userspace extensions not yet utilized in the kernel module.
+ */
+
enum {
NBD_CMD_READ = 0,
NBD_CMD_WRITE = 1,
NBD_CMD_DISC = 2,
NBD_CMD_FLUSH = 3,
NBD_CMD_TRIM = 4
+ /* userspace defines additional extension commands */
};
/* values for flags field, these are server interaction specific. */
@@ -64,14 +72,15 @@ enum {
#define NBD_REQUEST_MAGIC 0x25609513
#define NBD_REPLY_MAGIC 0x67446698
/* Do *not* use magics: 0x12560953 0x96744668. */
+/* magic 0x668e33ef for structured reply not supported by kernel yet */
/*
* This is the packet used for communication between client and
* server. All data are in network byte order.
*/
struct nbd_request {
- __be32 magic;
- __be32 type; /* == READ || == WRITE */
+ __be32 magic; /* NBD_REQUEST_MAGIC */
+ __be32 type; /* See NBD_CMD_* */
char handle[8];
__be64 from;
__be32 len;
@@ -82,7 +91,7 @@ struct nbd_request {
* it has completed an I/O request (or an error occurs).
*/
struct nbd_reply {
- __be32 magic;
+ __be32 magic; /* NBD_REPLY_MAGIC */
__be32 error; /* 0 = ok, else error */
char handle[8]; /* handle you got from request */
};
--
2.39.2
A good optimizing compiler should not compile this any differently,
but it is nicer to work directly with integers instead of memcpy().
Signed-off-by: Eric Blake <[email protected]>
---
v2: Fix kernel test robot complaint about wrong endianness on loongarch:
https://lore.kernel.org/oe-kbuild-all/[email protected]/
---
drivers/block/nbd.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 8a9487e79f1c..94ae85400b46 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -560,7 +560,6 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
unsigned long size = blk_rq_bytes(req);
struct bio *bio;
u64 handle;
- __be64 tmp;
u32 type;
u32 nbd_cmd_flags = 0;
int sent = nsock->sent, skip = 0;
@@ -607,8 +606,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
request.len = htonl(size);
}
handle = nbd_cmd_handle(cmd);
- tmp = cpu_to_be64(handle);
- memcpy(request.handle, &tmp, sizeof(tmp));
+ request.cookie = cpu_to_be64(handle);
trace_nbd_send_request(&request, nbd->index, blk_mq_rq_from_pdu(cmd));
@@ -729,14 +727,12 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
int result;
struct nbd_cmd *cmd;
struct request *req = NULL;
- __be64 tmp;
u64 handle;
u16 hwq;
u32 tag;
int ret = 0;
- memcpy(&tmp, reply->handle, sizeof(tmp));
- handle = be64_to_cpu(tmp);
+ handle = be64_to_cpu(reply->cookie);
tag = nbd_handle_to_tag(handle);
hwq = blk_mq_unique_tag_to_hwq(tag);
if (hwq < nbd->tag_set.nr_hw_queues)
--
2.39.2
While the sourceforge site for userspace NBD still exists, the code
repository moved to github several years ago. Then with a recent
patch[1], the github landing page contains just as much information as
the sourceforge page, so we might as well point to a single location
that also provides the code.
[1] https://lists.debian.org/nbd/2023/03/msg00051.html
Signed-off-by: Eric Blake <[email protected]>
---
v2: split into its own patch
---
Documentation/admin-guide/blockdev/nbd.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/blockdev/nbd.rst b/Documentation/admin-guide/blockdev/nbd.rst
index d78dfe559dcf..faf2ac4b1509 100644
--- a/Documentation/admin-guide/blockdev/nbd.rst
+++ b/Documentation/admin-guide/blockdev/nbd.rst
@@ -14,7 +14,7 @@ to borrow disk space from another computer.
Unlike NFS, it is possible to put any filesystem on it, etc.
For more information, or to download the nbd-client and nbd-server
-tools, go to http://nbd.sf.net/.
+tools, go to https://github.com/NetworkBlockDevice/nbd.
The nbd kernel module need only be installed on the client
system, as the nbd-server is completely in userspace. In fact,
--
2.39.2
On Fri, Mar 17, 2023 at 03:27:46PM -0500, Eric Blake wrote:
> The NBD spec says the client handle (or cookie) is opaque on the
> server, and therefore it really doesn't matter what endianness we use;
> to date, the use of memcpy() between u64 and a char[8] has exposed
> native endianness when treating the handle as a 64-bit number.
No, memcpy() works fine for char[8], which doesn't break endianness.
> However, since NBD protocol documents that everything else is in
> network order, and tools like Wireshark will dump even the contents of
> the handle as seen over the network, it's worth using a consistent
> ordering regardless of the native endianness.
>
> Plus, using a consistent endianness now allows an upcoming patch to
> simplify this to directly use integer assignment instead of memcpy().
It isn't necessary, given ->handle is actually u64, which is handled by
nbd client only.
>
> Signed-off-by: Eric Blake <[email protected]>
>
> ---
> v2: new patch
> ---
> drivers/block/nbd.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 592cfa8b765a..8a9487e79f1c 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -560,6 +560,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> unsigned long size = blk_rq_bytes(req);
> struct bio *bio;
> u64 handle;
> + __be64 tmp;
> u32 type;
> u32 nbd_cmd_flags = 0;
> int sent = nsock->sent, skip = 0;
> @@ -606,7 +607,8 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> request.len = htonl(size);
> }
> handle = nbd_cmd_handle(cmd);
> - memcpy(request.handle, &handle, sizeof(handle));
> + tmp = cpu_to_be64(handle);
> + memcpy(request.handle, &tmp, sizeof(tmp));
This way copies handle two times, really not fun.
thanks,
Ming
On Tue, Mar 21, 2023 at 07:20:33AM +0800, Ming Lei wrote:
> On Fri, Mar 17, 2023 at 03:27:46PM -0500, Eric Blake wrote:
> > The NBD spec says the client handle (or cookie) is opaque on the
> > server, and therefore it really doesn't matter what endianness we use;
> > to date, the use of memcpy() between u64 and a char[8] has exposed
> > native endianness when treating the handle as a 64-bit number.
>
> No, memcpy() works fine for char[8], which doesn't break endianness.
I didn't say memcpy() breaks endianness, I said it preserves it. By
using memcpy(), you are exposing native endianness over the wire.
Thus, even though a server should not be making any decisions based on
the content of the handle (it is an opaque value handed back to the
client unchanged), the current kernel client code DOES leak through
information about whether the client is big- or little-endian; in
contrast to the NBD protocol saying that ALL data is
network-byte-order.
>
> > However, since NBD protocol documents that everything else is in
> > network order, and tools like Wireshark will dump even the contents of
> > the handle as seen over the network, it's worth using a consistent
> > ordering regardless of the native endianness.
> >
> > Plus, using a consistent endianness now allows an upcoming patch to
> > simplify this to directly use integer assignment instead of memcpy().
>
> It isn't necessary, given ->handle is actually u64, which is handled by
> nbd client only.
No, re-read the whole series. ->handle is actually char[8]. Later in
the series adds ->cookie as __be64 as an alias to ->handle, precisely
so that we are converting the u64 'handle' in kernel code into a
big-endian value on the wire, regardless of the host type, and making
it impossible for a server to inspect the wire data and learn the
kernel's endianness.
>
> >
> > Signed-off-by: Eric Blake <[email protected]>
> >
> > ---
> > v2: new patch
> > ---
> > drivers/block/nbd.c | 10 +++++++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index 592cfa8b765a..8a9487e79f1c 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -560,6 +560,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > unsigned long size = blk_rq_bytes(req);
> > struct bio *bio;
> > u64 handle;
> > + __be64 tmp;
> > u32 type;
> > u32 nbd_cmd_flags = 0;
> > int sent = nsock->sent, skip = 0;
> > @@ -606,7 +607,8 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > request.len = htonl(size);
> > }
> > handle = nbd_cmd_handle(cmd);
> > - memcpy(request.handle, &handle, sizeof(handle));
> > + tmp = cpu_to_be64(handle);
> > + memcpy(request.handle, &tmp, sizeof(tmp));
>
> This way copies handle two times, really not fun.
Indeed. And as mentioned in the commit message, it is temporary; the
second copy goes away later in the series once we can use direct
integer assignment.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
On Tue, Mar 21, 2023 at 08:59:00AM -0500, Eric Blake wrote:
> On Tue, Mar 21, 2023 at 07:20:33AM +0800, Ming Lei wrote:
> > On Fri, Mar 17, 2023 at 03:27:46PM -0500, Eric Blake wrote:
> > > The NBD spec says the client handle (or cookie) is opaque on the
> > > server, and therefore it really doesn't matter what endianness we use;
> > > to date, the use of memcpy() between u64 and a char[8] has exposed
> > > native endianness when treating the handle as a 64-bit number.
> >
> > No, memcpy() works fine for char[8], which doesn't break endianness.
>
> I didn't say memcpy() breaks endianness, I said it preserves it. By
> using memcpy(), you are exposing native endianness over the wire.
> Thus, even though a server should not be making any decisions based on
> the content of the handle (it is an opaque value handed back to the
> client unchanged), the current kernel client code DOES leak through
> information about whether the client is big- or little-endian;
How is the client cpu endianness leaked with handle defined as char[8]?
Suppose it is leaked, is it really one issue? Cause most of CPUs in
the world is little-endian.
> contrast to the NBD protocol saying that ALL data is
> network-byte-order.
That doesn't make sense for any data defined as char[] or byte which
needn't to be little or big endian.
>
> >
> > > However, since NBD protocol documents that everything else is in
> > > network order, and tools like Wireshark will dump even the contents of
> > > the handle as seen over the network, it's worth using a consistent
> > > ordering regardless of the native endianness.
> > >
> > > Plus, using a consistent endianness now allows an upcoming patch to
> > > simplify this to directly use integer assignment instead of memcpy().
> >
> > It isn't necessary, given ->handle is actually u64, which is handled by
> > nbd client only.
>
> No, re-read the whole series. ->handle is actually char[8]. Later in
> the series adds ->cookie as __be64 as an alias to ->handle, precisely
> so that we are converting the u64 'handle' in kernel code into a
> big-endian value on the wire, regardless of the host type, and making
> it impossible for a server to inspect the wire data and learn the
> kernel's endianness.
How does server learn the client cpu endianness in this way? Is it really
one issue?
>
> >
> > >
> > > Signed-off-by: Eric Blake <[email protected]>
> > >
> > > ---
> > > v2: new patch
> > > ---
> > > drivers/block/nbd.c | 10 +++++++---
> > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index 592cfa8b765a..8a9487e79f1c 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -560,6 +560,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > > unsigned long size = blk_rq_bytes(req);
> > > struct bio *bio;
> > > u64 handle;
> > > + __be64 tmp;
> > > u32 type;
> > > u32 nbd_cmd_flags = 0;
> > > int sent = nsock->sent, skip = 0;
> > > @@ -606,7 +607,8 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > > request.len = htonl(size);
> > > }
> > > handle = nbd_cmd_handle(cmd);
> > > - memcpy(request.handle, &handle, sizeof(handle));
> > > + tmp = cpu_to_be64(handle);
> > > + memcpy(request.handle, &tmp, sizeof(tmp));
> >
> > This way copies handle two times, really not fun.
>
> Indeed. And as mentioned in the commit message, it is temporary; the
> second copy goes away later in the series once we can use direct
> integer assignment.
Then please merge with following patch, given it is hard to review
temporary change.
thanks,
Ming
On Wed, Mar 22, 2023 at 08:47:41AM +0800, Ming Lei wrote:
> On Tue, Mar 21, 2023 at 08:59:00AM -0500, Eric Blake wrote:
> > On Tue, Mar 21, 2023 at 07:20:33AM +0800, Ming Lei wrote:
> > > On Fri, Mar 17, 2023 at 03:27:46PM -0500, Eric Blake wrote:
> > > > The NBD spec says the client handle (or cookie) is opaque on the
> > > > server, and therefore it really doesn't matter what endianness we use;
> > > > to date, the use of memcpy() between u64 and a char[8] has exposed
> > > > native endianness when treating the handle as a 64-bit number.
> > >
> > > No, memcpy() works fine for char[8], which doesn't break endianness.
> >
> > I didn't say memcpy() breaks endianness, I said it preserves it. By
> > using memcpy(), you are exposing native endianness over the wire.
> > Thus, even though a server should not be making any decisions based on
> > the content of the handle (it is an opaque value handed back to the
> > client unchanged), the current kernel client code DOES leak through
> > information about whether the client is big- or little-endian;
>
> How is the client cpu endianness leaked with handle defined as char[8]?
>
> Suppose it is leaked, is it really one issue? Cause most of CPUs in
> the world is little-endian.
>
> > contrast to the NBD protocol saying that ALL data is
> > network-byte-order.
>
> That doesn't make sense for any data defined as char[] or byte which
> needn't to be little or big endian.
The NBD spec defines it as a 64-bit opaque quantity - that does not
indicate whether it is a single integer or 8 characters, but because
it is opaque, we don't have to care. However, if we DO treat it as an
integer (and the kernel client code DOES do that: internally, it is
building up a u64 integer), it is wise to consider network endianness.
>
> >
> > >
> > > > However, since NBD protocol documents that everything else is in
> > > > network order, and tools like Wireshark will dump even the contents of
> > > > the handle as seen over the network, it's worth using a consistent
> > > > ordering regardless of the native endianness.
> > > >
> > > > Plus, using a consistent endianness now allows an upcoming patch to
> > > > simplify this to directly use integer assignment instead of memcpy().
> > >
> > > It isn't necessary, given ->handle is actually u64, which is handled by
> > > nbd client only.
> >
> > No, re-read the whole series. ->handle is actually char[8]. Later in
> > the series adds ->cookie as __be64 as an alias to ->handle, precisely
> > so that we are converting the u64 'handle' in kernel code into a
> > big-endian value on the wire, regardless of the host type, and making
> > it impossible for a server to inspect the wire data and learn the
> > kernel's endianness.
>
> How does server learn the client cpu endianness in this way? Is it really
> one issue?
Not a security issue, merely a consistency one. A server that
inspects the handles being sent by the client, and checks whether they
are sequential when treated as a big- or little-endian number, can
infer whether the client is little-endian. But there is nothing
useful it can do with that knowledge. Rather, the consistency factor
is that if you have a wireshark plugin reading network traffic, and
are trying to correlate it back to kernel traces, it is NICE if the
wireshark plugin can display the SAME u64 number as the kernel was
sticking into the field - and the way to do that is to have a fixed
endianness of the u64 value over the wire.
>
> >
> > >
> > > >
> > > > Signed-off-by: Eric Blake <[email protected]>
> > > >
> > > > ---
> > > > v2: new patch
> > > > ---
> > > > drivers/block/nbd.c | 10 +++++++---
> > > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > > index 592cfa8b765a..8a9487e79f1c 100644
> > > > --- a/drivers/block/nbd.c
> > > > +++ b/drivers/block/nbd.c
> > > > @@ -560,6 +560,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > > > unsigned long size = blk_rq_bytes(req);
> > > > struct bio *bio;
> > > > u64 handle;
> > > > + __be64 tmp;
> > > > u32 type;
> > > > u32 nbd_cmd_flags = 0;
> > > > int sent = nsock->sent, skip = 0;
> > > > @@ -606,7 +607,8 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > > > request.len = htonl(size);
> > > > }
> > > > handle = nbd_cmd_handle(cmd);
> > > > - memcpy(request.handle, &handle, sizeof(handle));
> > > > + tmp = cpu_to_be64(handle);
> > > > + memcpy(request.handle, &tmp, sizeof(tmp));
> > >
> > > This way copies handle two times, really not fun.
> >
> > Indeed. And as mentioned in the commit message, it is temporary; the
> > second copy goes away later in the series once we can use direct
> > integer assignment.
>
> Then please merge with following patch, given it is hard to review
> temporary change.
The underlying reason I split this patch out is that in v1 I got
complaints that I was not taking endianness into account. The patch
series DOES cause an observable change (namely, a little-endian client
now sends a value in big-endian order that it used to send in
little-endian order) - but the change is harmless. But if you want me
to squash this patch back with 4/5 in v3, I'm happy to do that.
Are there any other comments on this series that I should consider
before spending time putting out a v3?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
On Wed, Mar 22, 2023 at 07:29:21AM -0500, Eric Blake wrote:
> On Wed, Mar 22, 2023 at 08:47:41AM +0800, Ming Lei wrote:
> > On Tue, Mar 21, 2023 at 08:59:00AM -0500, Eric Blake wrote:
> > > On Tue, Mar 21, 2023 at 07:20:33AM +0800, Ming Lei wrote:
> > > > On Fri, Mar 17, 2023 at 03:27:46PM -0500, Eric Blake wrote:
> > > > > The NBD spec says the client handle (or cookie) is opaque on the
> > > > > server, and therefore it really doesn't matter what endianness we use;
> > > > > to date, the use of memcpy() between u64 and a char[8] has exposed
> > > > > native endianness when treating the handle as a 64-bit number.
> > > >
> > > > No, memcpy() works fine for char[8], which doesn't break endianness.
> > >
> > > I didn't say memcpy() breaks endianness, I said it preserves it. By
> > > using memcpy(), you are exposing native endianness over the wire.
> > > Thus, even though a server should not be making any decisions based on
> > > the content of the handle (it is an opaque value handed back to the
> > > client unchanged), the current kernel client code DOES leak through
> > > information about whether the client is big- or little-endian;
> >
> > How is the client cpu endianness leaked with handle defined as char[8]?
> >
> > Suppose it is leaked, is it really one issue? Cause most of CPUs in
> > the world is little-endian.
> >
> > > contrast to the NBD protocol saying that ALL data is
> > > network-byte-order.
> >
> > That doesn't make sense for any data defined as char[] or byte which
> > needn't to be little or big endian.
>
> The NBD spec defines it as a 64-bit opaque quantity - that does not
> indicate whether it is a single integer or 8 characters, but because
> it is opaque, we don't have to care. However, if we DO treat it as an
> integer (and the kernel client code DOES do that: internally, it is
> building up a u64 integer), it is wise to consider network endianness.
That depends on if it is reasonable to convert to int.
>
> >
> > >
> > > >
> > > > > However, since NBD protocol documents that everything else is in
> > > > > network order, and tools like Wireshark will dump even the contents of
> > > > > the handle as seen over the network, it's worth using a consistent
> > > > > ordering regardless of the native endianness.
> > > > >
> > > > > Plus, using a consistent endianness now allows an upcoming patch to
> > > > > simplify this to directly use integer assignment instead of memcpy().
> > > >
> > > > It isn't necessary, given ->handle is actually u64, which is handled by
> > > > nbd client only.
> > >
> > > No, re-read the whole series. ->handle is actually char[8]. Later in
> > > the series adds ->cookie as __be64 as an alias to ->handle, precisely
> > > so that we are converting the u64 'handle' in kernel code into a
> > > big-endian value on the wire, regardless of the host type, and making
> > > it impossible for a server to inspect the wire data and learn the
> > > kernel's endianness.
> >
> > How does server learn the client cpu endianness in this way? Is it really
> > one issue?
>
> Not a security issue, merely a consistency one. A server that
> inspects the handles being sent by the client, and checks whether they
> are sequential when treated as a big- or little-endian number, can
> infer whether the client is little-endian. But there is nothing
> useful it can do with that knowledge. Rather, the consistency factor
> is that if you have a wireshark plugin reading network traffic, and
> are trying to correlate it back to kernel traces, it is NICE if the
> wireshark plugin can display the SAME u64 number as the kernel was
> sticking into the field - and the way to do that is to have a fixed
> endianness of the u64 value over the wire.
OK, so the real motivation is only for aligning wireshark output with nbd
trace event. If yes, please add it in comment log.
BTW, the nbd trace event can be converted to any format by bcc or bpftrace
script, then you still can associate one with another.
>
> >
> > >
> > > >
> > > > >
> > > > > Signed-off-by: Eric Blake <[email protected]>
> > > > >
> > > > > ---
> > > > > v2: new patch
> > > > > ---
> > > > > drivers/block/nbd.c | 10 +++++++---
> > > > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > > > index 592cfa8b765a..8a9487e79f1c 100644
> > > > > --- a/drivers/block/nbd.c
> > > > > +++ b/drivers/block/nbd.c
> > > > > @@ -560,6 +560,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > > > > unsigned long size = blk_rq_bytes(req);
> > > > > struct bio *bio;
> > > > > u64 handle;
> > > > > + __be64 tmp;
> > > > > u32 type;
> > > > > u32 nbd_cmd_flags = 0;
> > > > > int sent = nsock->sent, skip = 0;
> > > > > @@ -606,7 +607,8 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> > > > > request.len = htonl(size);
> > > > > }
> > > > > handle = nbd_cmd_handle(cmd);
> > > > > - memcpy(request.handle, &handle, sizeof(handle));
> > > > > + tmp = cpu_to_be64(handle);
> > > > > + memcpy(request.handle, &tmp, sizeof(tmp));
> > > >
> > > > This way copies handle two times, really not fun.
> > >
> > > Indeed. And as mentioned in the commit message, it is temporary; the
> > > second copy goes away later in the series once we can use direct
> > > integer assignment.
> >
> > Then please merge with following patch, given it is hard to review
> > temporary change.
>
> The underlying reason I split this patch out is that in v1 I got
> complaints that I was not taking endianness into account. The patch
> series DOES cause an observable change (namely, a little-endian client
> now sends a value in big-endian order that it used to send in
> little-endian order) - but the change is harmless. But if you want me
> to squash this patch back with 4/5 in v3, I'm happy to do that.
>
> Are there any other comments on this series that I should consider
> before spending time putting out a v3?
I think 2~4 should be merged to single patch.
Thanks,
Ming
On Fri, Mar 17, 2023 at 03:27:44PM -0500, Eric Blake wrote:
> v1 was here: https://lkml.org/lkml/2023/3/10/1162
> since then:
> - split original 1/3 into 1/5 and 5/5
> - new patch 2/5
> - reorder members of anon union
> - always send cookie in network order
>
> Eric Blake (5):
> uapi nbd: improve doc links to userspace spec
> block nbd: send handle in network order
> uapi nbd: add cookie alias to handle
> block nbd: use req.cookie instead of req.handle
> docs nbd: userspace NBD now favors github over sourceforge
>
> Documentation/admin-guide/blockdev/nbd.rst | 2 +-
> drivers/block/nbd.c | 6 +++---
> include/uapi/linux/nbd.h | 25 +++++++++++++++++-----
> 3 files changed, 24 insertions(+), 9 deletions(-)
>
>
> base-commit: 8d3c682a5e3d9dfc2448ecbb22f4cd48359b9e21
Reviewed-by: Josef Bacik <[email protected]>
Thanks,
Josef