2019-04-01 14:17:39

by Suwan Kim

[permalink] [raw]
Subject: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

Patch "USB: HCD: support giveback of URB in tasklet context"[1]
introduced giveback of urb in tasklet context. [1] This patch was
applied to ehci but not xhci. [2] This patch significantly reduces
the hard irq time of xhci. Especially for uvc driver, the hard irq
including the uvc completion function runs quite long but applying
this patch reduces the hard irq time of xhci.

I have tested four SS devices to check if performance degradation
occurs when urb completion functions run in the tasklet context.

As a result of the test, all devices works well and shows very
similar performance with the upstream kernel. Moreover, usb ethernet
adapter show better performance than the upstream kernel about 5% for
RX and 2% for TX. Four SS devices is as follows.

SS devices for test

1. WD My Passport 2TB (external hard drive)
2. Sandisk Ultra Flair USB 3.0 32GB
3. Logitech Brio webcam
4. Iptime 1gigabit ethernet adapter (Mediatek RTL8153)

Test description

1. Mass storage (hard drive) performance test
- run below command 10 times and compute the average performance

dd if=/dev/sdN iflag=direct of=/dev/null bs=1G count=1

2. Mass storage (flash memory) performance test
- run below command 10 times and compute the average performance

dd if=/dev/sdN iflag=direct of=/dev/null bs=1G count=1

3. Webcam streaming performance test
- run simple capture program and get the average frame rate per second
- capture 1500 frames
- program link

https://github.com/asfaca/Webcam-performance-analyzing-tool

- video resolution : 4096 X 2160 (4K) at 30 or 24 fps
- device (Logitech Brio) spec url for the highest resolution and fps

https://support.logitech.com/en_gb/product/brio-stream/specs

4. USB Ethernet adapter performance test
- directly connect two linux machines with ethernet cable
- run pktgen of linux kernel and send 1500 bytes packets
- run vnstat to measure the network bandwidth for 180 seconds

Test machine

- CPU : Intel i5-7600 @ 3.5GHz

Test results

1. Mass storage (hard drive) performance test

WD My Passport 2TB (external hard drive)
--------------------------------------------------------------------
xhci without tasklet | xhci with tasklet
--------------------------------------------------------------------
103.667MB/s | 103.692MB/s
--------------------------------------------------------------------

2. Mass storage (flash memory) performance test

Sandisk Ultra Flair USB 3.0 32GB
--------------------------------------------------------------------
xhci without tasklet | xhci with tasklet
--------------------------------------------------------------------
129.727MB/s | 130.2MB/s
--------------------------------------------------------------------

3. Webcam streaming performance test

Logitech Brio webcam
--------------------------------------------------------------------
xhci without tasklet | xhci with tasklet
--------------------------------------------------------------------
26.4451 fps | 26.3949 fps
--------------------------------------------------------------------

4. USB Ethernet adapter performance test

Iptime 1gigabit ethernet adapter (Mediatek RTL8153)
--------------------------------------------------------------------
xhci without tasklet | xhci with tasklet
--------------------------------------------------------------------
RX 933.86 Mbit/s | 983.86 Mbit/s
--------------------------------------------------------------------
TX 830.18 Mbit/s | 882.75 Mbit/s
--------------------------------------------------------------------

[1], https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=94dfd7edfd5c9b605caf7b562de7a813d216e011
[2], https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=428aac8a81058e2303677a8fbf26670229e51d3a

Signed-off-by: Suwan Kim <[email protected]>

---
v2 change:
- Add SS device test results and rewrite the description
---
drivers/usb/host/xhci-ring.c | 2 --
drivers/usb/host/xhci.c | 2 +-
2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 40fa25c4d041..0ede5265e6e2 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -644,10 +644,8 @@ static void xhci_giveback_urb_in_irq(struct xhci_hcd *xhci,
}
xhci_urb_free_priv(urb_priv);
usb_hcd_unlink_urb_from_ep(hcd, urb);
- spin_unlock(&xhci->lock);
trace_xhci_urb_giveback(urb);
usb_hcd_giveback_urb(hcd, urb, status);
- spin_lock(&xhci->lock);
}

static void xhci_unmap_td_bounce_buffer(struct xhci_hcd *xhci,
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 7fa58c99f126..bb212b4669cf 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -5141,7 +5141,7 @@ static const struct hc_driver xhci_hc_driver = {
* generic hardware linkage
*/
.irq = xhci_irq,
- .flags = HCD_MEMORY | HCD_USB3 | HCD_SHARED,
+ .flags = HCD_MEMORY | HCD_USB3 | HCD_SHARED | HCD_BH,

/*
* basic lifecycle operations
--
2.20.1


2019-04-01 14:44:33

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

On Mon, 1 Apr 2019, Suwan Kim wrote:

> Patch "USB: HCD: support giveback of URB in tasklet context"[1]
> introduced giveback of urb in tasklet context. [1] This patch was
> applied to ehci but not xhci. [2] This patch significantly reduces
> the hard irq time of xhci. Especially for uvc driver, the hard irq
> including the uvc completion function runs quite long but applying
> this patch reduces the hard irq time of xhci.

Please read the kerneldoc for usb_submit_urb() and usb_kill_urb(), in
particular, the parts that describe how isochronous URBs are treated.
Can you guarantee that with this patch applied, xhci-hcd will continue
to work as the kerneldoc describes?

Alan Stern

2019-04-04 12:20:42

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

Hi Alan,

On Mon, Apr 01, 2019 at 10:43:24AM -0400, Alan Stern wrote:
> On Mon, 1 Apr 2019, Suwan Kim wrote:
>
> > Patch "USB: HCD: support giveback of URB in tasklet context"[1]
> > introduced giveback of urb in tasklet context. [1] This patch was
> > applied to ehci but not xhci. [2] This patch significantly reduces
> > the hard irq time of xhci. Especially for uvc driver, the hard irq
> > including the uvc completion function runs quite long but applying
> > this patch reduces the hard irq time of xhci.
>
> Please read the kerneldoc for usb_submit_urb() and usb_kill_urb(), in
> particular, the parts that describe how isochronous URBs are treated.
> Can you guarantee that with this patch applied, xhci-hcd will continue
> to work as the kerneldoc describes?
>

I read the description of usb_submit_urb() and usb_kill_urb() and i
think that xhci-hcd with which this patch is applied works as the
description of usb_submit_urb() and usb_kill_urb().

In the case of usb_submit_urb(), xhci spec 4.10.3.1 "Ring Overrun and
Underrun" describes the behavior of xhci when an isochronous ring is
empty due to the late URB submission in driver. (In this patch, empty
isochronous ring can happen due to tasklet scheduling delay in URB
complete function which calls the next usb_submit_urb())

According to the xhci spec, xHC deals with a late isochronous URB
according to the SIA(Start Isoch ASAP) flag of TRB and SIA flag is
set according to URB_ISO_ASAP flag in xhci_queue_isoc_tx().

If the SIA flag is set, xHC will schedule the late isochronous URB in
the next "Endpoint Service Interval Time" (next available frame) and
transmits ischronous URB in that frame.

If the SIA flag is cleared (URB_ISO_ASAP flag is cleared), xHC generates
"Missed Service Error" event and skips the late isochronous URB and
doen't transmit it. When the interrupt handler (xhci_irq) receives
"Missed Service Error" event, it returns the late isochronous URB to
the driver calling usb_hcd_giveback_urb() with -EXDEV error code in
usb_iso_packet_descriptor->status at skip_isoc_td(). So xhci behavior
about the late isochronous URB in spec and implementation is same
with the description of usb_submit_urb().

In the case of usb_kill_urb(), description says that it waits until
the URB complete function finishes and the important point is that
whether the USB complete function is called early or late doesn't
affect the behavior of usb_kill_urb() because __usb_hcd_giveback_urb()
wakes up usb_kill_urb() after calling URB complete function.
So, pending a URB complete function in tasklet doesn't affect the
behavior of xhci in usb_kill_urb().

Regards

Suwan Kim

2019-04-04 21:57:34

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

On Thu, 4 Apr 2019, Suwan Kim wrote:

> Hi Alan,
>
> On Mon, Apr 01, 2019 at 10:43:24AM -0400, Alan Stern wrote:
> > On Mon, 1 Apr 2019, Suwan Kim wrote:
> >
> > > Patch "USB: HCD: support giveback of URB in tasklet context"[1]
> > > introduced giveback of urb in tasklet context. [1] This patch was
> > > applied to ehci but not xhci. [2] This patch significantly reduces
> > > the hard irq time of xhci. Especially for uvc driver, the hard irq
> > > including the uvc completion function runs quite long but applying
> > > this patch reduces the hard irq time of xhci.
> >
> > Please read the kerneldoc for usb_submit_urb() and usb_kill_urb(), in
> > particular, the parts that describe how isochronous URBs are treated.
> > Can you guarantee that with this patch applied, xhci-hcd will continue
> > to work as the kerneldoc describes?
> >
>
> I read the description of usb_submit_urb() and usb_kill_urb() and i
> think that xhci-hcd with which this patch is applied works as the
> description of usb_submit_urb() and usb_kill_urb().
>
> In the case of usb_submit_urb(), xhci spec 4.10.3.1 "Ring Overrun and
> Underrun" describes the behavior of xhci when an isochronous ring is
> empty due to the late URB submission in driver. (In this patch, empty
> isochronous ring can happen due to tasklet scheduling delay in URB
> complete function which calls the next usb_submit_urb())
>
> According to the xhci spec, xHC deals with a late isochronous URB
> according to the SIA(Start Isoch ASAP) flag of TRB and SIA flag is
> set according to URB_ISO_ASAP flag in xhci_queue_isoc_tx().
>
> If the SIA flag is set, xHC will schedule the late isochronous URB in
> the next "Endpoint Service Interval Time" (next available frame) and
> transmits ischronous URB in that frame.
>
> If the SIA flag is cleared (URB_ISO_ASAP flag is cleared), xHC generates
> "Missed Service Error" event and skips the late isochronous URB and
> doen't transmit it. When the interrupt handler (xhci_irq) receives
> "Missed Service Error" event, it returns the late isochronous URB to
> the driver calling usb_hcd_giveback_urb() with -EXDEV error code in
> usb_iso_packet_descriptor->status at skip_isoc_td(). So xhci behavior
> about the late isochronous URB in spec and implementation is same
> with the description of usb_submit_urb().
>
> In the case of usb_kill_urb(), description says that it waits until
> the URB complete function finishes and the important point is that
> whether the USB complete function is called early or late doesn't
> affect the behavior of usb_kill_urb() because __usb_hcd_giveback_urb()
> wakes up usb_kill_urb() after calling URB complete function.
> So, pending a URB complete function in tasklet doesn't affect the
> behavior of xhci in usb_kill_urb().

Okay, good. I just wanted to make sure you were aware of the issues
and had checked that using tasklets wouldn't cause any problems.

Alan Stern

2019-04-08 14:28:01

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

On Thu, Apr 04, 2019 at 04:54:26PM -0400, Alan Stern wrote:
> On Thu, 4 Apr 2019, Suwan Kim wrote:
>
> > Hi Alan,
> >
> > On Mon, Apr 01, 2019 at 10:43:24AM -0400, Alan Stern wrote:
> > > On Mon, 1 Apr 2019, Suwan Kim wrote:
> > >
> > > > Patch "USB: HCD: support giveback of URB in tasklet context"[1]
> > > > introduced giveback of urb in tasklet context. [1] This patch was
> > > > applied to ehci but not xhci. [2] This patch significantly reduces
> > > > the hard irq time of xhci. Especially for uvc driver, the hard irq
> > > > including the uvc completion function runs quite long but applying
> > > > this patch reduces the hard irq time of xhci.
> > >
> > > Please read the kerneldoc for usb_submit_urb() and usb_kill_urb(), in
> > > particular, the parts that describe how isochronous URBs are treated.
> > > Can you guarantee that with this patch applied, xhci-hcd will continue
> > > to work as the kerneldoc describes?
> > >
> >
> > I read the description of usb_submit_urb() and usb_kill_urb() and i
> > think that xhci-hcd with which this patch is applied works as the
> > description of usb_submit_urb() and usb_kill_urb().
> >
> > In the case of usb_submit_urb(), xhci spec 4.10.3.1 "Ring Overrun and
> > Underrun" describes the behavior of xhci when an isochronous ring is
> > empty due to the late URB submission in driver. (In this patch, empty
> > isochronous ring can happen due to tasklet scheduling delay in URB
> > complete function which calls the next usb_submit_urb())
> >
> > According to the xhci spec, xHC deals with a late isochronous URB
> > according to the SIA(Start Isoch ASAP) flag of TRB and SIA flag is
> > set according to URB_ISO_ASAP flag in xhci_queue_isoc_tx().
> >
> > If the SIA flag is set, xHC will schedule the late isochronous URB in
> > the next "Endpoint Service Interval Time" (next available frame) and
> > transmits ischronous URB in that frame.
> >
> > If the SIA flag is cleared (URB_ISO_ASAP flag is cleared), xHC generates
> > "Missed Service Error" event and skips the late isochronous URB and
> > doen't transmit it. When the interrupt handler (xhci_irq) receives
> > "Missed Service Error" event, it returns the late isochronous URB to
> > the driver calling usb_hcd_giveback_urb() with -EXDEV error code in
> > usb_iso_packet_descriptor->status at skip_isoc_td(). So xhci behavior
> > about the late isochronous URB in spec and implementation is same
> > with the description of usb_submit_urb().
> >
> > In the case of usb_kill_urb(), description says that it waits until
> > the URB complete function finishes and the important point is that
> > whether the USB complete function is called early or late doesn't
> > affect the behavior of usb_kill_urb() because __usb_hcd_giveback_urb()
> > wakes up usb_kill_urb() after calling URB complete function.
> > So, pending a URB complete function in tasklet doesn't affect the
> > behavior of xhci in usb_kill_urb().
>
> Okay, good. I just wanted to make sure you were aware of the issues
> and had checked that using tasklets wouldn't cause any problems.

Yes, isochronous issue is very important point in this patch.
Thank you for your feedback Alan.


Mathias,

Could i receive your feedback for this patch?
Do i need more tests for SS devices or other types of tests?

Regards

Suwan Kim

2019-05-07 16:05:49

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

On Mon, Apr 01, 2019 at 11:16:11PM +0900, Suwan Kim wrote:
> Patch "USB: HCD: support giveback of URB in tasklet context"[1]
> introduced giveback of urb in tasklet context. [1] This patch was
> applied to ehci but not xhci. [2] This patch significantly reduces
> the hard irq time of xhci. Especially for uvc driver, the hard irq
> including the uvc completion function runs quite long but applying
> this patch reduces the hard irq time of xhci.
>
> I have tested four SS devices to check if performance degradation
> occurs when urb completion functions run in the tasklet context.
>
> As a result of the test, all devices works well and shows very
> similar performance with the upstream kernel. Moreover, usb ethernet
> adapter show better performance than the upstream kernel about 5% for
> RX and 2% for TX. Four SS devices is as follows.
>
> SS devices for test
>
> 1. WD My Passport 2TB (external hard drive)
> 2. Sandisk Ultra Flair USB 3.0 32GB
> 3. Logitech Brio webcam
> 4. Iptime 1gigabit ethernet adapter (Mediatek RTL8153)
>
> Test description
>
> 1. Mass storage (hard drive) performance test
> - run below command 10 times and compute the average performance
>
> dd if=/dev/sdN iflag=direct of=/dev/null bs=1G count=1
>
> 2. Mass storage (flash memory) performance test
> - run below command 10 times and compute the average performance
>
> dd if=/dev/sdN iflag=direct of=/dev/null bs=1G count=1
>
> 3. Webcam streaming performance test
> - run simple capture program and get the average frame rate per second
> - capture 1500 frames
> - program link
>
> https://github.com/asfaca/Webcam-performance-analyzing-tool
>
> - video resolution : 4096 X 2160 (4K) at 30 or 24 fps
> - device (Logitech Brio) spec url for the highest resolution and fps
>
> https://support.logitech.com/en_gb/product/brio-stream/specs
>
> 4. USB Ethernet adapter performance test
> - directly connect two linux machines with ethernet cable
> - run pktgen of linux kernel and send 1500 bytes packets
> - run vnstat to measure the network bandwidth for 180 seconds
>
> Test machine
>
> - CPU : Intel i5-7600 @ 3.5GHz
>
> Test results
>
> 1. Mass storage (hard drive) performance test
>
> WD My Passport 2TB (external hard drive)
> --------------------------------------------------------------------
> xhci without tasklet | xhci with tasklet
> --------------------------------------------------------------------
> 103.667MB/s | 103.692MB/s
> --------------------------------------------------------------------
>
> 2. Mass storage (flash memory) performance test
>
> Sandisk Ultra Flair USB 3.0 32GB
> --------------------------------------------------------------------
> xhci without tasklet | xhci with tasklet
> --------------------------------------------------------------------
> 129.727MB/s | 130.2MB/s
> --------------------------------------------------------------------
>
> 3. Webcam streaming performance test
>
> Logitech Brio webcam
> --------------------------------------------------------------------
> xhci without tasklet | xhci with tasklet
> --------------------------------------------------------------------
> 26.4451 fps | 26.3949 fps
> --------------------------------------------------------------------
>
> 4. USB Ethernet adapter performance test
>
> Iptime 1gigabit ethernet adapter (Mediatek RTL8153)
> --------------------------------------------------------------------
> xhci without tasklet | xhci with tasklet
> --------------------------------------------------------------------
> RX 933.86 Mbit/s | 983.86 Mbit/s
> --------------------------------------------------------------------
> TX 830.18 Mbit/s | 882.75 Mbit/s
> --------------------------------------------------------------------
>
> [1], https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=94dfd7edfd5c9b605caf7b562de7a813d216e011
> [2], https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=428aac8a81058e2303677a8fbf26670229e51d3a
>
> Signed-off-by: Suwan Kim <[email protected]>
>
> ---
> v2 change:
> - Add SS device test results and rewrite the description
> ---
> drivers/usb/host/xhci-ring.c | 2 --
> drivers/usb/host/xhci.c | 2 +-
> 2 files changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 40fa25c4d041..0ede5265e6e2 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -644,10 +644,8 @@ static void xhci_giveback_urb_in_irq(struct xhci_hcd *xhci,
> }
> xhci_urb_free_priv(urb_priv);
> usb_hcd_unlink_urb_from_ep(hcd, urb);
> - spin_unlock(&xhci->lock);
> trace_xhci_urb_giveback(urb);
> usb_hcd_giveback_urb(hcd, urb, status);
> - spin_lock(&xhci->lock);
> }
>
> static void xhci_unmap_td_bounce_buffer(struct xhci_hcd *xhci,
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 7fa58c99f126..bb212b4669cf 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -5141,7 +5141,7 @@ static const struct hc_driver xhci_hc_driver = {
> * generic hardware linkage
> */
> .irq = xhci_irq,
> - .flags = HCD_MEMORY | HCD_USB3 | HCD_SHARED,
> + .flags = HCD_MEMORY | HCD_USB3 | HCD_SHARED | HCD_BH,
>
> /*
> * basic lifecycle operations
> --
> 2.20.1

Hi,

I sent this patch a month ago but got no answer.
Is there any feedback for this?
Please let me know if there are any faults or it needs more tests.

Regards

Suwan Kim

2019-05-08 13:22:49

by Mathias Nyman

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

On 7.5.2019 19.02, Suwan Kim wrote:
> On Mon, Apr 01, 2019 at 11:16:11PM +0900, Suwan Kim wrote:
>> Patch "USB: HCD: support giveback of URB in tasklet context"[1]
>> introduced giveback of urb in tasklet context. [1] This patch was
>> applied to ehci but not xhci. [2] This patch significantly reduces
>> the hard irq time of xhci. Especially for uvc driver, the hard irq
>> including the uvc completion function runs quite long but applying
>> this patch reduces the hard irq time of xhci.
>>
>
> I sent this patch a month ago but got no answer.
> Is there any feedback for this?
> Please let me know if there are any faults or it needs more tests.
>

I'll add this to a internal tree first, and let it sit there for a few
weeks, trying to catch possible bugs this change could expose.

-Mathias

2019-05-08 17:44:02

by Suwan Kim

[permalink] [raw]
Subject: Re: [PATCH v2] usb: host: xhci: Support running urb giveback in tasklet context

On Wed, May 08, 2019 at 03:04:33PM +0300, Mathias Nyman wrote:
> On 7.5.2019 19.02, Suwan Kim wrote:
> > On Mon, Apr 01, 2019 at 11:16:11PM +0900, Suwan Kim wrote:
> > > Patch "USB: HCD: support giveback of URB in tasklet context"[1]
> > > introduced giveback of urb in tasklet context. [1] This patch was
> > > applied to ehci but not xhci. [2] This patch significantly reduces
> > > the hard irq time of xhci. Especially for uvc driver, the hard irq
> > > including the uvc completion function runs quite long but applying
> > > this patch reduces the hard irq time of xhci.
> > >
> >
> > I sent this patch a month ago but got no answer.
> > Is there any feedback for this?
> > Please let me know if there are any faults or it needs more tests.
> >
>
> I'll add this to a internal tree first, and let it sit there for a few
> weeks, trying to catch possible bugs this change could expose.

Thank you for spending your time for it! I also have been testing it
in my daily desktop and until now, there are no special issues.
If some bugs occur, i will report it to the mailing list.

Regards

Suwan Kim