2019-06-21 17:46:36

by Suwan Kim

[permalink] [raw]
Subject: [PATCH 0/2] usbip: Implement SG support to vhci

There are bugs on vhci with usb 3.0 storage device. Originally, vhci
doesn't supported SG. So, USB storage driver on vhci divides SG list
into multiple URBs and it causes buffer overflow on the xhci of the
server. So we need to add SG support to vhci

In this patch series, vhci doesn't map and unmap URB for DMA to use
native SG list (urb->num_sgs). When vhci supports SG, it is useful
to use native SG list instead of mapped SG list because dma mapping
fnuction can adjust the number of SG list that is urb->num_mapped_sgs.

"usbip: Skip DMA mapping and unmapping for urb at vhci" is originally
submitted as separate patch, but I include it in this patch series
because it is needed to implement SG support of vhci.

In this patch, vhci basically support SG and it sends each SG list
entry to the stub driver. Then, the stub driver sees total length of
the buffer and allocates SG table and pages according to the total
buffer length calling sgl_alloc(). After the stub driver receives
completed URB, it again sends each SG list entry to the vhci.

If HCD of server doesn't support SG, the stub driver allocates
big buffer using kmalloc() instead of using sgl_alloc() which
allocates SG list and pages.

Alan fixed vhci bug with the USB 3.0 storage device by modifying
USB storage driver.
("usb-storage: Set virt_boundary_mask to avoid SG overflows")
But the fundamental solution of it is to add SG support to vhci.

This patch works well with the USB 3.0 storage devices without Alan's
patch, and we can revert Alan's patch if it causes some troubles.

Suwan Kim (2):
usbip: Skip DMA mapping and unmapping for urb at vhci
usbip: Implement SG support to vhci

drivers/usb/usbip/stub_rx.c | 32 +++++++++++++----
drivers/usb/usbip/stub_tx.c | 53 +++++++++++++++++++++++-----
drivers/usb/usbip/usbip_common.c | 60 +++++++++++++++++++++++++++-----
drivers/usb/usbip/usbip_common.h | 2 ++
drivers/usb/usbip/vhci_hcd.c | 23 +++++++++++-
drivers/usb/usbip/vhci_tx.c | 49 ++++++++++++++++++++------
6 files changed, 184 insertions(+), 35 deletions(-)

--
2.20.1


2019-06-21 17:46:50

by Suwan Kim

[permalink] [raw]
Subject: [PATCH 1/2] usbip: Skip DMA mapping and unmapping for urb at vhci

vhci doesn’t do dma for remote device. Actually, the real dma
operation is done by network card driver. So, vhci doesn’t use and
need dma address of transfer buffer of urb.

When vhci supports SG, it is useful to use native SG list instead
of mapped SG list because dma mapping fnuction can adjust the
number of SG list that is urb->num_mapped_sgs.

But hcd provides dma mapping and unmapping function by defualt.
Moreover, it causes unnecessary dma mapping and unmapping which
will be done again at the NIC driver and it wastes CPU cycles.
So, implement map_urb_for_dma and unmap_urb_for_dma function for
vhci in order to skip the dma mapping and unmapping procedure.

Signed-off-by: Suwan Kim <[email protected]>
---
drivers/usb/usbip/vhci_hcd.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index 667d9c0ec905..be87c8a63e24 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -1287,6 +1287,18 @@ static int vhci_free_streams(struct usb_hcd *hcd, struct usb_device *udev,
return 0;
}

+static int vhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
+ gfp_t mem_flags)
+{
+ dev_dbg(hcd->self.controller, "vhci does not map urb for dma\n");
+ return 0;
+}
+
+static void vhci_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
+{
+ dev_dbg(hcd->self.controller, "vhci does not unmap urb for dma\n");
+}
+
static const struct hc_driver vhci_hc_driver = {
.description = driver_name,
.product_desc = driver_desc,
@@ -1303,6 +1315,9 @@ static const struct hc_driver vhci_hc_driver = {

.get_frame_number = vhci_get_frame_number,

+ .map_urb_for_dma = vhci_map_urb_for_dma,
+ .unmap_urb_for_dma = vhci_unmap_urb_for_dma,
+
.hub_status_data = vhci_hub_status,
.hub_control = vhci_hub_control,
.bus_suspend = vhci_bus_suspend,
--
2.20.1

2019-06-21 17:48:31

by Suwan Kim

[permalink] [raw]
Subject: [PATCH 2/2] usbip: Implement SG support to vhci

There are bugs on vhci with usb 3.0 storage device. Originally, vhci
doesn't supported SG. So, USB storage driver on vhci divides SG list
into multiple URBs and it causes buffer overflow on the xhci of the
server. So we need to add SG support to vhci

In this patch, vhci basically support SG and it sends each SG list
entry to the stub driver. Then, the stub driver sees total length of
the buffer and allocates SG table and pages according to the total
buffer length calling sgl_alloc(). After the stub driver receives
completed URB, it again sends each SG list entry to the vhci.

If HCD of server doesn't support SG, the stub driver allocates
big buffer using kmalloc() instead of using sgl_alloc() which
allocates SG list and pages.

Alan fixed vhci bug with the USB 3.0 storage device by modifying
USB storage driver.
("usb-storage: Set virt_boundary_mask to avoid SG overflows")
But the fundamental solution of it is to add SG support to vhci.

This patch works well with the USB 3.0 storage devices without Alan's
patch, and we can revert Alan's patch if it causes some troubles.

Signed-off-by: Suwan Kim <[email protected]>
---
drivers/usb/usbip/stub_rx.c | 32 +++++++++++++----
drivers/usb/usbip/stub_tx.c | 53 +++++++++++++++++++++++-----
drivers/usb/usbip/usbip_common.c | 60 +++++++++++++++++++++++++++-----
drivers/usb/usbip/usbip_common.h | 2 ++
drivers/usb/usbip/vhci_hcd.c | 8 ++++-
drivers/usb/usbip/vhci_tx.c | 49 ++++++++++++++++++++------
6 files changed, 169 insertions(+), 35 deletions(-)

diff --git a/drivers/usb/usbip/stub_rx.c b/drivers/usb/usbip/stub_rx.c
index 97b09a42a10c..798854b9bbc8 100644
--- a/drivers/usb/usbip/stub_rx.c
+++ b/drivers/usb/usbip/stub_rx.c
@@ -7,6 +7,7 @@
#include <linux/kthread.h>
#include <linux/usb.h>
#include <linux/usb/hcd.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "stub.h"
@@ -446,7 +447,11 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
struct stub_priv *priv;
struct usbip_device *ud = &sdev->ud;
struct usb_device *udev = sdev->udev;
+ struct scatterlist *sgl;
+ int nents;
int pipe = get_pipe(sdev, pdu);
+ unsigned long long buf_len;
+ int use_sg;

if (pipe == -1)
return;
@@ -455,6 +460,10 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
if (!priv)
return;

+ buf_len = (unsigned long long)pdu->u.cmd_submit.transfer_buffer_length;
+ /* Check if the server's HCD supprot sg */
+ use_sg = pdu->u.cmd_submit.num_sgs && udev->bus->sg_tablesize;
+
/* setup a urb */
if (usb_pipeisoc(pipe))
priv->urb = usb_alloc_urb(pdu->u.cmd_submit.number_of_packets,
@@ -468,13 +477,22 @@ static void stub_recv_cmd_submit(struct stub_device *sdev,
}

/* allocate urb transfer buffer, if needed */
- if (pdu->u.cmd_submit.transfer_buffer_length > 0) {
- priv->urb->transfer_buffer =
- kzalloc(pdu->u.cmd_submit.transfer_buffer_length,
- GFP_KERNEL);
- if (!priv->urb->transfer_buffer) {
- usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
- return;
+ if (buf_len > 0) {
+ if (use_sg) {
+ sgl = sgl_alloc(buf_len, GFP_KERNEL, &nents);
+ if (!sgl) {
+ usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
+ return;
+ }
+ priv->urb->sg = sgl;
+ priv->urb->num_sgs = nents;
+ priv->urb->transfer_buffer = NULL;
+ } else {
+ priv->urb->transfer_buffer = kzalloc(buf_len, GFP_KERNEL);
+ if (!priv->urb->transfer_buffer) {
+ usbip_event_add(ud, SDEV_EVENT_ERROR_MALLOC);
+ return;
+ }
}
}

diff --git a/drivers/usb/usbip/stub_tx.c b/drivers/usb/usbip/stub_tx.c
index f0ec41a50cbc..ece129cd0b28 100644
--- a/drivers/usb/usbip/stub_tx.c
+++ b/drivers/usb/usbip/stub_tx.c
@@ -5,6 +5,7 @@

#include <linux/kthread.h>
#include <linux/socket.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "stub.h"
@@ -13,11 +14,21 @@ static void stub_free_priv_and_urb(struct stub_priv *priv)
{
struct urb *urb = priv->urb;

- kfree(urb->setup_packet);
- urb->setup_packet = NULL;
+ if (urb->setup_packet) {
+ kfree(urb->setup_packet);
+ urb->setup_packet = NULL;
+ }
+
+ if (urb->transfer_buffer) {
+ kfree(urb->transfer_buffer);
+ urb->transfer_buffer = NULL;
+ }

- kfree(urb->transfer_buffer);
- urb->transfer_buffer = NULL;
+ if (urb->num_sgs) {
+ sgl_free(urb->sg);
+ urb->sg = NULL;
+ urb->num_sgs = 0;
+ }

list_del(&priv->list);
kmem_cache_free(stub_priv_cache, priv);
@@ -161,13 +172,16 @@ static int stub_send_ret_submit(struct stub_device *sdev)
struct usbip_header pdu_header;
struct usbip_iso_packet_descriptor *iso_buffer = NULL;
struct kvec *iov = NULL;
+ struct scatterlist *sg;
int iovnum = 0;
+ int i;

txsize = 0;
memset(&pdu_header, 0, sizeof(pdu_header));
memset(&msg, 0, sizeof(msg));

- if (urb->actual_length > 0 && !urb->transfer_buffer) {
+ if (urb->actual_length > 0 && !urb->transfer_buffer &&
+ !urb->num_sgs) {
dev_err(&sdev->udev->dev,
"urb: actual_length %d transfer_buffer null\n",
urb->actual_length);
@@ -176,6 +190,8 @@ static int stub_send_ret_submit(struct stub_device *sdev)

if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS)
iovnum = 2 + urb->number_of_packets;
+ else if (usb_pipein(urb->pipe) && urb->actual_length > 0 && urb->num_sgs)
+ iovnum = 1 + urb->num_sgs;
else
iovnum = 2;

@@ -203,9 +219,30 @@ static int stub_send_ret_submit(struct stub_device *sdev)
if (usb_pipein(urb->pipe) &&
usb_pipetype(urb->pipe) != PIPE_ISOCHRONOUS &&
urb->actual_length > 0) {
- iov[iovnum].iov_base = urb->transfer_buffer;
- iov[iovnum].iov_len = urb->actual_length;
- iovnum++;
+ if (urb->num_sgs) {
+ unsigned int copy = urb->actual_length;
+ int size;
+
+ for_each_sg(urb->sg, sg, urb->num_sgs ,i) {
+ if (copy == 0)
+ break;
+
+ if (copy < sg->length)
+ size = copy;
+ else
+ size = sg->length;
+
+ iov[iovnum].iov_base = sg_virt(sg);
+ iov[iovnum].iov_len = size;
+
+ iovnum++;
+ copy -= size;
+ }
+ } else {
+ iov[iovnum].iov_base = urb->transfer_buffer;
+ iov[iovnum].iov_len = urb->actual_length;
+ iovnum++;
+ }
txsize += urb->actual_length;
} else if (usb_pipein(urb->pipe) &&
usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) {
diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c
index 45da3e01c7b0..56b2a1fbe0bf 100644
--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -365,6 +365,7 @@ static void usbip_pack_cmd_submit(struct usbip_header *pdu, struct urb *urb,
spdu->start_frame = urb->start_frame;
spdu->number_of_packets = urb->number_of_packets;
spdu->interval = urb->interval;
+ spdu->num_sgs = urb->num_sgs;
} else {
urb->transfer_flags = spdu->transfer_flags;
urb->transfer_buffer_length = spdu->transfer_buffer_length;
@@ -434,6 +435,7 @@ static void correct_endian_cmd_submit(struct usbip_header_cmd_submit *pdu,
{
if (send) {
pdu->transfer_flags = cpu_to_be32(pdu->transfer_flags);
+ pdu->num_sgs = cpu_to_be32(pdu->num_sgs);

cpu_to_be32s(&pdu->transfer_buffer_length);
cpu_to_be32s(&pdu->start_frame);
@@ -441,6 +443,7 @@ static void correct_endian_cmd_submit(struct usbip_header_cmd_submit *pdu,
cpu_to_be32s(&pdu->interval);
} else {
pdu->transfer_flags = be32_to_cpu(pdu->transfer_flags);
+ pdu->num_sgs = be32_to_cpu(pdu->num_sgs);

be32_to_cpus(&pdu->transfer_buffer_length);
be32_to_cpus(&pdu->start_frame);
@@ -680,8 +683,12 @@ EXPORT_SYMBOL_GPL(usbip_pad_iso);
/* some members of urb must be substituted before. */
int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
{
- int ret;
+ struct scatterlist *sg;
+ int ret = 0;
+ int recv;
int size;
+ int copy;
+ int i;

if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
/* the direction of urb must be OUT. */
@@ -712,14 +719,49 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
}
}

- ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size);
- if (ret != size) {
- dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
- if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
- usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
- } else {
- usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
- return -EPIPE;
+ if (urb->num_sgs) {
+ copy = size;
+ for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+ int recv_size;
+
+ if (copy < sg->length)
+ recv_size = copy;
+ else
+ recv_size = sg->length;
+
+ recv = usbip_recv(ud->tcp_socket, sg_virt(sg), recv_size);
+ if (recv != recv_size) {
+ dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+ if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
+ usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+ } else {
+ usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+ return -EPIPE;
+ }
+ }
+ copy -= recv;
+ ret += recv;
+ }
+
+ if (ret != size) {
+ dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+ if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
+ usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+ } else {
+ usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+ return -EPIPE;
+ }
+ }
+ } else {
+ ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size);
+ if (ret != size) {
+ dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret);
+ if (ud->side == USBIP_STUB || ud->side == USBIP_VUDC) {
+ usbip_event_add(ud, SDEV_EVENT_ERROR_TCP);
+ } else {
+ usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
+ return -EPIPE;
+ }
}
}

diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index bf8afe9b5883..b395946c4d03 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -143,6 +143,7 @@ struct usbip_header_basic {
* struct usbip_header_cmd_submit - USBIP_CMD_SUBMIT packet header
* @transfer_flags: URB flags
* @transfer_buffer_length: the data size for (in) or (out) transfer
+ * @num_sgs: the number of scatter gather list of URB
* @start_frame: initial frame for isochronous or interrupt transfers
* @number_of_packets: number of isochronous packets
* @interval: maximum time for the request on the server-side host controller
@@ -151,6 +152,7 @@ struct usbip_header_basic {
struct usbip_header_cmd_submit {
__u32 transfer_flags;
__s32 transfer_buffer_length;
+ __u32 num_sgs;

/* it is difficult for usbip to sync frames (reserved only?) */
__s32 start_frame;
diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index be87c8a63e24..cc93c1a87a3e 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -696,7 +696,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
}
vdev = &vhci_hcd->vdev[portnum-1];

- if (!urb->transfer_buffer && urb->transfer_buffer_length) {
+ if (!urb->transfer_buffer && !urb->num_sgs &&
+ urb->transfer_buffer_length) {
dev_dbg(dev, "Null URB transfer buffer\n");
return -EINVAL;
}
@@ -1142,6 +1143,11 @@ static int vhci_setup(struct usb_hcd *hcd)
hcd->speed = HCD_USB3;
hcd->self.root_hub->speed = USB_SPEED_SUPER;
}
+
+ /* support sg */
+ hcd->self.sg_tablesize = ~0;
+ hcd->self.no_sg_constraint = 1;
+
return 0;
}

diff --git a/drivers/usb/usbip/vhci_tx.c b/drivers/usb/usbip/vhci_tx.c
index 2fa26d0578d7..3472180f5af8 100644
--- a/drivers/usb/usbip/vhci_tx.c
+++ b/drivers/usb/usbip/vhci_tx.c
@@ -5,6 +5,7 @@

#include <linux/kthread.h>
#include <linux/slab.h>
+#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "vhci.h"
@@ -51,12 +52,13 @@ static struct vhci_priv *dequeue_from_priv_tx(struct vhci_device *vdev)
static int vhci_send_cmd_submit(struct vhci_device *vdev)
{
struct vhci_priv *priv = NULL;
-
+ struct scatterlist *sg;
struct msghdr msg;
- struct kvec iov[3];
+ struct kvec *iov;
size_t txsize;
-
size_t total_size = 0;
+ int iovnum;
+ int i;

while ((priv = dequeue_from_priv_tx(vdev)) != NULL) {
int ret;
@@ -72,18 +74,41 @@ static int vhci_send_cmd_submit(struct vhci_device *vdev)
usbip_dbg_vhci_tx("setup txdata urb seqnum %lu\n",
priv->seqnum);

+ if (urb->num_sgs && usb_pipeout(urb->pipe))
+ iovnum = 2 + urb->num_sgs;
+ else
+ iovnum = 3;
+
+ iov = kzalloc(iovnum * sizeof(*iov), GFP_KERNEL);
+ if (!iov) {
+ usbip_event_add(&vdev->ud,
+ SDEV_EVENT_ERROR_MALLOC);
+ return -ENOMEM;
+ }
+
/* 1. setup usbip_header */
setup_cmd_submit_pdu(&pdu_header, urb);
usbip_header_correct_endian(&pdu_header, 1);
+ iovnum = 0;

- iov[0].iov_base = &pdu_header;
- iov[0].iov_len = sizeof(pdu_header);
+ iov[iovnum].iov_base = &pdu_header;
+ iov[iovnum].iov_len = sizeof(pdu_header);
txsize += sizeof(pdu_header);
+ iovnum++;

/* 2. setup transfer buffer */
if (!usb_pipein(urb->pipe) && urb->transfer_buffer_length > 0) {
- iov[1].iov_base = urb->transfer_buffer;
- iov[1].iov_len = urb->transfer_buffer_length;
+ if (urb->num_sgs && !usb_endpoint_xfer_isoc(&urb->ep->desc)) {
+ for_each_sg(urb->sg, sg, urb->num_sgs ,i) {
+ iov[iovnum].iov_base = sg_virt(sg);
+ iov[iovnum].iov_len = sg->length;
+ iovnum++;
+ }
+ } else {
+ iov[iovnum].iov_base = urb->transfer_buffer;
+ iov[iovnum].iov_len = urb->transfer_buffer_length;
+ iovnum++;
+ }
txsize += urb->transfer_buffer_length;
}

@@ -93,25 +118,29 @@ static int vhci_send_cmd_submit(struct vhci_device *vdev)

iso_buffer = usbip_alloc_iso_desc_pdu(urb, &len);
if (!iso_buffer) {
+ kfree(iov);
usbip_event_add(&vdev->ud,
SDEV_EVENT_ERROR_MALLOC);
return -1;
}

- iov[2].iov_base = iso_buffer;
- iov[2].iov_len = len;
+ iov[iovnum].iov_base = iso_buffer;
+ iov[iovnum].iov_len = len;
+ iovnum++;
txsize += len;
}

- ret = kernel_sendmsg(vdev->ud.tcp_socket, &msg, iov, 3, txsize);
+ ret = kernel_sendmsg(vdev->ud.tcp_socket, &msg, iov, iovnum, txsize);
if (ret != txsize) {
pr_err("sendmsg failed!, ret=%d for %zd\n", ret,
txsize);
+ kfree(iov);
kfree(iso_buffer);
usbip_event_add(&vdev->ud, VDEV_EVENT_ERROR_TCP);
return -1;
}

+ kfree(iov);
kfree(iso_buffer);
usbip_dbg_vhci_tx("send txdata\n");

--
2.20.1

2019-06-21 20:05:47

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

On Sat, 22 Jun 2019, Suwan Kim wrote:

> There are bugs on vhci with usb 3.0 storage device. Originally, vhci
> doesn't supported SG. So, USB storage driver on vhci divides SG list
> into multiple URBs and it causes buffer overflow on the xhci of the
> server. So we need to add SG support to vhci

It doesn't cause buffer overflow. The problem was that a transfer got
terminated too early because the transfer length for one of the URBs
was not divisible by the maxpacket size.

> In this patch, vhci basically support SG and it sends each SG list
> entry to the stub driver. Then, the stub driver sees total length of
> the buffer and allocates SG table and pages according to the total
> buffer length calling sgl_alloc(). After the stub driver receives
> completed URB, it again sends each SG list entry to the vhci.
>
> If HCD of server doesn't support SG, the stub driver allocates
> big buffer using kmalloc() instead of using sgl_alloc() which
> allocates SG list and pages.

You might be better off not using kmalloc. It's easier for the kernel
to allocate a bunch of small buffers than a single big one. Then you
can create a separate URB for each buffer.

> Alan fixed vhci bug with the USB 3.0 storage device by modifying
> USB storage driver.
> ("usb-storage: Set virt_boundary_mask to avoid SG overflows")
> But the fundamental solution of it is to add SG support to vhci.
>
> This patch works well with the USB 3.0 storage devices without Alan's
> patch, and we can revert Alan's patch if it causes some troubles.

These last two paragraphs don't need to be in the patch description.

> Signed-off-by: Suwan Kim <[email protected]>
> ---

I'm not sufficiently familiar with the usbip drivers to review most of
this. However...

> diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
> index be87c8a63e24..cc93c1a87a3e 100644
> --- a/drivers/usb/usbip/vhci_hcd.c
> +++ b/drivers/usb/usbip/vhci_hcd.c
> @@ -696,7 +696,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
> }
> vdev = &vhci_hcd->vdev[portnum-1];
>
> - if (!urb->transfer_buffer && urb->transfer_buffer_length) {
> + if (!urb->transfer_buffer && !urb->num_sgs &&
> + urb->transfer_buffer_length) {
> dev_dbg(dev, "Null URB transfer buffer\n");
> return -EINVAL;
> }
> @@ -1142,6 +1143,11 @@ static int vhci_setup(struct usb_hcd *hcd)
> hcd->speed = HCD_USB3;
> hcd->self.root_hub->speed = USB_SPEED_SUPER;
> }
> +
> + /* support sg */
> + hcd->self.sg_tablesize = ~0;
> + hcd->self.no_sg_constraint = 1;

You probably shouldn't do this, for two reasons. First, sg_tablesize
of the server's HCD may be smaller than ~0. If the client's value is
larger than the server's, a transfer could be accepted on the client
but then fail on the server because the SG list was too big.

Also, you may want to restrict the size of SG transfers even further,
so that you don't have to allocate a tremendous amount of memory all at
once on the server. An SG transfer can be quite large. I don't know
what a reasonable limit would be -- 16 perhaps?

Alan Stern

2019-06-22 10:41:43

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

Hi Suwan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on usb/usb-testing]
[also build test WARNING on v5.2-rc5 next-20190621]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Suwan-Kim/usbip-Skip-DMA-mapping-and-unmapping-for-urb-at-vhci/20190622-130016
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
reproduce:
# apt-get install sparse
# sparse version: v0.6.1-rc1-7-g2b96cd8-dirty
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>


sparse warnings: (new ones prefixed by >>)

drivers/usb/usbip/usbip_common.c:419:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] command @@ got restrunsigned int [usertype] command @@
drivers/usb/usbip/usbip_common.c:419:33: sparse: expected unsigned int [usertype] command
drivers/usb/usbip/usbip_common.c:419:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:420:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] seqnum @@ got restrunsigned int [usertype] seqnum @@
drivers/usb/usbip/usbip_common.c:420:33: sparse: expected unsigned int [usertype] seqnum
drivers/usb/usbip/usbip_common.c:420:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:421:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] devid @@ got restrunsigned int [usertype] devid @@
drivers/usb/usbip/usbip_common.c:421:33: sparse: expected unsigned int [usertype] devid
drivers/usb/usbip/usbip_common.c:421:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:422:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] direction @@ got restrunsigned int [usertype] direction @@
drivers/usb/usbip/usbip_common.c:422:33: sparse: expected unsigned int [usertype] direction
drivers/usb/usbip/usbip_common.c:422:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:423:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] ep @@ got restrunsigned int [usertype] ep @@
drivers/usb/usbip/usbip_common.c:423:33: sparse: expected unsigned int [usertype] ep
drivers/usb/usbip/usbip_common.c:423:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:425:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:425:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:425:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:425:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:425:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:425:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:426:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:426:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:426:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:426:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:426:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:426:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:427:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:427:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:427:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:427:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:427:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:427:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:428:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:428:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:428:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:428:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:428:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:428:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:429:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:429:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:429:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:429:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:429:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:429:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:437:37: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] transfer_flags @@ got restrunsigned int [usertype] transfer_flags @@
drivers/usb/usbip/usbip_common.c:437:37: sparse: expected unsigned int [usertype] transfer_flags
drivers/usb/usbip/usbip_common.c:437:37: sparse: got restricted __be32 [usertype]
>> drivers/usb/usbip/usbip_common.c:438:30: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] num_sgs @@ got restrunsigned int [usertype] num_sgs @@
>> drivers/usb/usbip/usbip_common.c:438:30: sparse: expected unsigned int [usertype] num_sgs
drivers/usb/usbip/usbip_common.c:438:30: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:445:39: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:445:39: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:445:39: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:445:39: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:445:39: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:445:39: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:446:32: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:446:32: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:446:32: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:446:32: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:446:32: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:446:32: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:477:29: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] seqnum @@ got restrunsigned int [usertype] seqnum @@
drivers/usb/usbip/usbip_common.c:477:29: sparse: expected unsigned int [usertype] seqnum
drivers/usb/usbip/usbip_common.c:477:29: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:479:31: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:479:31: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:479:31: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:479:31: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:479:31: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:479:31: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:529:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] offset @@ got restrunsigned int [usertype] offset @@
drivers/usb/usbip/usbip_common.c:529:33: sparse: expected unsigned int [usertype] offset
drivers/usb/usbip/usbip_common.c:529:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:530:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] length @@ got restrunsigned int [usertype] length @@
drivers/usb/usbip/usbip_common.c:530:33: sparse: expected unsigned int [usertype] length
drivers/usb/usbip/usbip_common.c:530:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:531:33: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] status @@ got restrunsigned int [usertype] status @@
drivers/usb/usbip/usbip_common.c:531:33: sparse: expected unsigned int [usertype] status
drivers/usb/usbip/usbip_common.c:531:33: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:532:36: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] actual_length @@ got restrunsigned int [usertype] actual_length @@
drivers/usb/usbip/usbip_common.c:532:36: sparse: expected unsigned int [usertype] actual_length
drivers/usb/usbip/usbip_common.c:532:36: sparse: got restricted __be32 [usertype]
drivers/usb/usbip/usbip_common.c:534:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:534:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:534:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:534:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:534:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:534:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:535:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:535:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:535:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:535:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:535:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:535:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:536:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:536:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:536:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:536:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:536:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:536:35: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:537:38: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:537:38: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:537:38: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:537:38: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:537:38: sparse: sparse: cast to restricted __be32
drivers/usb/usbip/usbip_common.c:537:38: sparse: sparse: cast to restricted __be32

vim +438 drivers/usb/usbip/usbip_common.c

415
416 static void correct_endian_basic(struct usbip_header_basic *base, int send)
417 {
418 if (send) {
419 base->command = cpu_to_be32(base->command);
420 base->seqnum = cpu_to_be32(base->seqnum);
421 base->devid = cpu_to_be32(base->devid);
422 base->direction = cpu_to_be32(base->direction);
423 base->ep = cpu_to_be32(base->ep);
424 } else {
425 base->command = be32_to_cpu(base->command);
426 base->seqnum = be32_to_cpu(base->seqnum);
427 base->devid = be32_to_cpu(base->devid);
428 base->direction = be32_to_cpu(base->direction);
> 429 base->ep = be32_to_cpu(base->ep);
430 }
431 }
432
433 static void correct_endian_cmd_submit(struct usbip_header_cmd_submit *pdu,
434 int send)
435 {
436 if (send) {
437 pdu->transfer_flags = cpu_to_be32(pdu->transfer_flags);
> 438 pdu->num_sgs = cpu_to_be32(pdu->num_sgs);
439
440 cpu_to_be32s(&pdu->transfer_buffer_length);
441 cpu_to_be32s(&pdu->start_frame);
442 cpu_to_be32s(&pdu->number_of_packets);
443 cpu_to_be32s(&pdu->interval);
444 } else {
445 pdu->transfer_flags = be32_to_cpu(pdu->transfer_flags);
446 pdu->num_sgs = be32_to_cpu(pdu->num_sgs);
447
448 be32_to_cpus(&pdu->transfer_buffer_length);
449 be32_to_cpus(&pdu->start_frame);
450 be32_to_cpus(&pdu->number_of_packets);
451 be32_to_cpus(&pdu->interval);
452 }
453 }
454

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation

2019-06-24 14:59:52

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

On Fri, Jun 21, 2019 at 04:05:24PM -0400, Alan Stern wrote:
> On Sat, 22 Jun 2019, Suwan Kim wrote:
>
> > There are bugs on vhci with usb 3.0 storage device. Originally, vhci
> > doesn't supported SG. So, USB storage driver on vhci divides SG list
> > into multiple URBs and it causes buffer overflow on the xhci of the
> > server. So we need to add SG support to vhci
>
> It doesn't cause buffer overflow. The problem was that a transfer got
> terminated too early because the transfer length for one of the URBs
> was not divisible by the maxpacket size.

Oh.. I misunderstood the problem. I will rewrite the problem
situation.

> > In this patch, vhci basically support SG and it sends each SG list
> > entry to the stub driver. Then, the stub driver sees total length of
> > the buffer and allocates SG table and pages according to the total
> > buffer length calling sgl_alloc(). After the stub driver receives
> > completed URB, it again sends each SG list entry to the vhci.
> >
> > If HCD of server doesn't support SG, the stub driver allocates
> > big buffer using kmalloc() instead of using sgl_alloc() which
> > allocates SG list and pages.
>
> You might be better off not using kmalloc. It's easier for the kernel
> to allocate a bunch of small buffers than a single big one. Then you
> can create a separate URB for each buffer.

Ok. I will implement it as usb_sg_init() does and send v2 patch
including the logic of submitting separate URBs.

> > Alan fixed vhci bug with the USB 3.0 storage device by modifying
> > USB storage driver.
> > ("usb-storage: Set virt_boundary_mask to avoid SG overflows")
> > But the fundamental solution of it is to add SG support to vhci.
> >
> > This patch works well with the USB 3.0 storage devices without Alan's
> > patch, and we can revert Alan's patch if it causes some troubles.
>
> These last two paragraphs don't need to be in the patch description.

I will remove these paragraphs in v2 patch.

> > Signed-off-by: Suwan Kim <[email protected]>
> > ---
>
> I'm not sufficiently familiar with the usbip drivers to review most of
> this. However...
>
> > diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
> > index be87c8a63e24..cc93c1a87a3e 100644
> > --- a/drivers/usb/usbip/vhci_hcd.c
> > +++ b/drivers/usb/usbip/vhci_hcd.c
> > @@ -696,7 +696,8 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
> > }
> > vdev = &vhci_hcd->vdev[portnum-1];
> >
> > - if (!urb->transfer_buffer && urb->transfer_buffer_length) {
> > + if (!urb->transfer_buffer && !urb->num_sgs &&
> > + urb->transfer_buffer_length) {
> > dev_dbg(dev, "Null URB transfer buffer\n");
> > return -EINVAL;
> > }
> > @@ -1142,6 +1143,11 @@ static int vhci_setup(struct usb_hcd *hcd)
> > hcd->speed = HCD_USB3;
> > hcd->self.root_hub->speed = USB_SPEED_SUPER;
> > }
> > +
> > + /* support sg */
> > + hcd->self.sg_tablesize = ~0;
> > + hcd->self.no_sg_constraint = 1;
>
> You probably shouldn't do this, for two reasons. First, sg_tablesize
> of the server's HCD may be smaller than ~0. If the client's value is
> larger than the server's, a transfer could be accepted on the client
> but then fail on the server because the SG list was too big.
>
> Also, you may want to restrict the size of SG transfers even further,
> so that you don't have to allocate a tremendous amount of memory all at
> once on the server. An SG transfer can be quite large. I don't know
> what a reasonable limit would be -- 16 perhaps?

Is there any reason why you think that 16 is ok? Or Can I set this
value as the smallest value of all HC? I think that sg_tablesize
cannot be a variable value because vhci interacts with different
machines and all machines has different sg_tablesize value.

Regards

Suwan Kim

2019-06-24 20:39:19

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

On Mon, 24 Jun 2019, Suwan Kim wrote:

> > > + hcd->self.sg_tablesize = ~0;
> > > + hcd->self.no_sg_constraint = 1;
> >
> > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > of the server's HCD may be smaller than ~0. If the client's value is
> > larger than the server's, a transfer could be accepted on the client
> > but then fail on the server because the SG list was too big.

On the other hand, I don't know of any examples where an HCD has
sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
being the only one.

> > Also, you may want to restrict the size of SG transfers even further,
> > so that you don't have to allocate a tremendous amount of memory all at
> > once on the server. An SG transfer can be quite large. I don't know
> > what a reasonable limit would be -- 16 perhaps?
>
> Is there any reason why you think that 16 is ok? Or Can I set this
> value as the smallest value of all HC? I think that sg_tablesize
> cannot be a variable value because vhci interacts with different
> machines and all machines has different sg_tablesize value.

I didn't have any good reason for picking 16. Using the smallest value
of all the HCDs seems like a good idea.

Alan Stern

2019-06-29 00:12:23

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 1/2] usbip: Skip DMA mapping and unmapping for urb at vhci

Hi Suwan,

On 6/21/19 11:45 AM, Suwan Kim wrote:
> vhci doesn’t do dma for remote device. Actually, the real dma
> operation is done by network card driver. So, vhci doesn’t use and
> need dma address of transfer buffer of urb.
>
> When vhci supports SG, it is useful to use native SG list instead
> of mapped SG list because dma mapping fnuction can adjust the
> number of SG list that is urb->num_mapped_sgs.
>
> But hcd provides dma mapping and unmapping function by defualt.

Typo "defualt"

> Moreover, it causes unnecessary dma mapping and unmapping which
> will be done again at the NIC driver and it wastes CPU cycles.
> So, implement map_urb_for_dma and unmap_urb_for_dma function for
> vhci in order to skip the dma mapping and unmapping procedure.
>

How did you verify that unnecessary dma map/unmap are happening?
How many CPU cycles did you manage to reduce with this change?

thanks,
-- Shuah

2019-07-01 09:30:14

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] usbip: Skip DMA mapping and unmapping for urb at vhci

On Fri, Jun 28, 2019 at 06:11:54PM -0600, shuah wrote:
> Hi Suwan,
>
> On 6/21/19 11:45 AM, Suwan Kim wrote:
> > vhci doesn’t do dma for remote device. Actually, the real dma
> > operation is done by network card driver. So, vhci doesn’t use and
> > need dma address of transfer buffer of urb.
> >
> > When vhci supports SG, it is useful to use native SG list instead
> > of mapped SG list because dma mapping fnuction can adjust the
> > number of SG list that is urb->num_mapped_sgs.
> >
> > But hcd provides dma mapping and unmapping function by defualt.
>
> Typo "defualt"
>
> > Moreover, it causes unnecessary dma mapping and unmapping which
> > will be done again at the NIC driver and it wastes CPU cycles.
> > So, implement map_urb_for_dma and unmap_urb_for_dma function for
> > vhci in order to skip the dma mapping and unmapping procedure.
> >
>
> How did you verify that unnecessary dma map/unmap are happening?
> How many CPU cycles did you manage to reduce with this change?

Dma mapping/unmapping is not required for vhci because vhci passes
the virtual address of the buffer to the network stack without
passing the dma address of the buffer. Network stack receive the
virtual address of the buffer from vhci and later, network card
driver performs dma mapping for the buffer. So, as far as I know,
dma address of the buffer is not used for vhci and virtual address
is only used by vhci.

I used ftrace to measure a duration of usb_hcd_map_urb_for_dma().
As a result, usb_hcd_map_urb_for_dma() took a duration of about
0.14us out of about 10us which is the duration of usb_hcd_submit_urb().
However, this figure is the dma mapping measurement value for
physically contiguous buffers when vhci does not support SG, and
if vhci supports SG, more CPU cycles will be consumed for SG dma
mapping.

I think that the important point is dma mapping/unmapping is
unnecessary at vhci. So we can skip dma mapping/unmapping and save
the CPU cycles (even if it is small). This is an opportunity to
reduce the end-to-end latency of usbip and improve the performance.

Regards

Suwan Kim

2019-07-04 18:14:36

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

On Mon, Jun 24, 2019 at 01:24:15PM -0400, Alan Stern wrote:
> On Mon, 24 Jun 2019, Suwan Kim wrote:
>
> > > > + hcd->self.sg_tablesize = ~0;
> > > > + hcd->self.no_sg_constraint = 1;
> > >
> > > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > > of the server's HCD may be smaller than ~0. If the client's value is
> > > larger than the server's, a transfer could be accepted on the client
> > > but then fail on the server because the SG list was too big.
>
> On the other hand, I don't know of any examples where an HCD has
> sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
> being the only one.
>
> > > Also, you may want to restrict the size of SG transfers even further,
> > > so that you don't have to allocate a tremendous amount of memory all at
> > > once on the server. An SG transfer can be quite large. I don't know
> > > what a reasonable limit would be -- 16 perhaps?
> >
> > Is there any reason why you think that 16 is ok? Or Can I set this
> > value as the smallest value of all HC? I think that sg_tablesize
> > cannot be a variable value because vhci interacts with different
> > machines and all machines has different sg_tablesize value.
>
> I didn't have any good reason for picking 16. Using the smallest value
> of all the HCDs seems like a good idea.

I also have not seen an HCD with a value other than ~0 or 0 except for
whci which uses 2048, but is not 2048 the maximum value of sg_tablesize?
If so, ~0 is the minimum value of sg_tablesize that supports SG. Then
can vhci use ~0 if we don't consider memory pressure of the server?

If all of the HCDs supporting SG have ~0 as sg_tablesize value, I
think that whether we use an HCD locally or remotely, the degree of
memory pressure is same in both local and remote usage.

Regards

Suwan Kim

2019-07-05 01:43:28

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

On Fri, 5 Jul 2019, Suwan Kim wrote:

> On Mon, Jun 24, 2019 at 01:24:15PM -0400, Alan Stern wrote:
> > On Mon, 24 Jun 2019, Suwan Kim wrote:
> >
> > > > > + hcd->self.sg_tablesize = ~0;
> > > > > + hcd->self.no_sg_constraint = 1;
> > > >
> > > > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > > > of the server's HCD may be smaller than ~0. If the client's value is
> > > > larger than the server's, a transfer could be accepted on the client
> > > > but then fail on the server because the SG list was too big.
> >
> > On the other hand, I don't know of any examples where an HCD has
> > sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
> > being the only one.
> >
> > > > Also, you may want to restrict the size of SG transfers even further,
> > > > so that you don't have to allocate a tremendous amount of memory all at
> > > > once on the server. An SG transfer can be quite large. I don't know
> > > > what a reasonable limit would be -- 16 perhaps?
> > >
> > > Is there any reason why you think that 16 is ok? Or Can I set this
> > > value as the smallest value of all HC? I think that sg_tablesize
> > > cannot be a variable value because vhci interacts with different
> > > machines and all machines has different sg_tablesize value.
> >
> > I didn't have any good reason for picking 16. Using the smallest value
> > of all the HCDs seems like a good idea.
>
> I also have not seen an HCD with a value other than ~0 or 0 except for
> whci which uses 2048, but is not 2048 the maximum value of sg_tablesize?
> If so, ~0 is the minimum value of sg_tablesize that supports SG. Then
> can vhci use ~0 if we don't consider memory pressure of the server?
>
> If all of the HCDs supporting SG have ~0 as sg_tablesize value, I
> think that whether we use an HCD locally or remotely, the degree of
> memory pressure is same in both local and remote usage.

You have a lot of leeway. For example, there's no reason a single SG
transfer on the client has to correspond to a single SG transfer on the
host. In theory the client's vhci-hcd can break a large SG transfer up
into a bunch of smaller pieces and send them to the host one by one,
then reassemble the results to complete the original transfer. That
way the memory pressure on the host would be a lot smaller than on the
client.

Alan Stern

2019-07-05 09:08:27

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] usbip: Implement SG support to vhci

On Thu, Jul 04, 2019 at 09:41:04PM -0400, Alan Stern wrote:
> On Fri, 5 Jul 2019, Suwan Kim wrote:
>
> > On Mon, Jun 24, 2019 at 01:24:15PM -0400, Alan Stern wrote:
> > > On Mon, 24 Jun 2019, Suwan Kim wrote:
> > >
> > > > > > + hcd->self.sg_tablesize = ~0;
> > > > > > + hcd->self.no_sg_constraint = 1;
> > > > >
> > > > > You probably shouldn't do this, for two reasons. First, sg_tablesize
> > > > > of the server's HCD may be smaller than ~0. If the client's value is
> > > > > larger than the server's, a transfer could be accepted on the client
> > > > > but then fail on the server because the SG list was too big.
> > >
> > > On the other hand, I don't know of any examples where an HCD has
> > > sg_tablesize set to anything other than 0 or ~0. vhci-hcd might end up
> > > being the only one.
> > >
> > > > > Also, you may want to restrict the size of SG transfers even further,
> > > > > so that you don't have to allocate a tremendous amount of memory all at
> > > > > once on the server. An SG transfer can be quite large. I don't know
> > > > > what a reasonable limit would be -- 16 perhaps?
> > > >
> > > > Is there any reason why you think that 16 is ok? Or Can I set this
> > > > value as the smallest value of all HC? I think that sg_tablesize
> > > > cannot be a variable value because vhci interacts with different
> > > > machines and all machines has different sg_tablesize value.
> > >
> > > I didn't have any good reason for picking 16. Using the smallest value
> > > of all the HCDs seems like a good idea.
> >
> > I also have not seen an HCD with a value other than ~0 or 0 except for
> > whci which uses 2048, but is not 2048 the maximum value of sg_tablesize?
> > If so, ~0 is the minimum value of sg_tablesize that supports SG. Then
> > can vhci use ~0 if we don't consider memory pressure of the server?
> >
> > If all of the HCDs supporting SG have ~0 as sg_tablesize value, I
> > think that whether we use an HCD locally or remotely, the degree of
> > memory pressure is same in both local and remote usage.
>
> You have a lot of leeway. For example, there's no reason a single SG
> transfer on the client has to correspond to a single SG transfer on the
> host. In theory the client's vhci-hcd can break a large SG transfer up
> into a bunch of smaller pieces and send them to the host one by one,
> then reassemble the results to complete the original transfer. That
> way the memory pressure on the host would be a lot smaller than on the
> client.

Thank you for the feedback, Alan. I understood your comment. It
seems to be a good idea to use sg_tablesize to alleviate the memory
pressure of the host. But I think 16 is too small for USB 3.0 device
because USB 3.0 storage device in my machine usually uses more than
30 SG entries. So, I will set sg_tablesize to 32.

Regards

Suwan Kim