Immediate data transfers (IDT) allow the HCD to copy small chunks of
data (up to 8 bytes) directly into its output transfer TRBs. This avoids
the somewhat expensive DMA mappings that are performed by default on
most URB submissions.
If an URB is suitable for IDT, the data is copied directly into the
"Data Buffer Pointer" region of the TRB and the IDT flag is set. Instead
of triggering memory accesses, the HC uses the data directly.
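As a rough user-space illustration of the copy described above (not the
driver code itself), the sketch below packs a small payload into the
64-bit pointer field of a simplified TRB and sets the IDT bit. The
struct layout and every sketch_* / SKETCH_* name are assumptions made
for illustration; only the bit position (1<<6) and the 8-byte limit are
taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TRB_IDT      (1u << 6) /* same bit as TRB_IDT in the patch */
#define SKETCH_IDT_MAX_SIZE 8         /* same value as TRB_IDT_MAX_SIZE */

/* Hypothetical, simplified TRB; the real one is four 32-bit words. */
struct sketch_trb {
	uint64_t buffer;  /* "Data Buffer Pointer" field, or immediate data */
	uint32_t length;
	uint32_t control;
};

/*
 * Copy len bytes into the pointer field and flag them as immediate data.
 * Returns 0 on success, -1 if the payload does not fit in the field.
 */
static int sketch_fill_idt(struct sketch_trb *trb, const void *data,
			   size_t len)
{
	if (len > SKETCH_IDT_MAX_SIZE)
		return -1;

	trb->buffer = 0;                 /* unused trailing bytes stay zero */
	memcpy(&trb->buffer, data, len); /* raw bytes, no address involved */
	trb->length = (uint32_t)len;
	trb->control |= SKETCH_TRB_IDT;
	return 0;
}
```

The sketch copies raw bytes in host byte order, the same way the
memcpy() calls in the patch do; it makes no attempt to model endianness
handling.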
The implementation could cover all kinds of output endpoints, yet
isochronous endpoints are bypassed as I was unable to find one that
matched IDT's constraints. As we try to bypass the default DMA mappings
on URB buffers, we'd need to find an isochronous device with an
urb->transfer_buffer_length <= 8 bytes.
The implementation takes into account that the 8-byte buffers provided
by the URB will never cross a 64KB boundary.
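The constraints above can be modelled outside the kernel as a plain
predicate. This is only a sketch: struct model_urb and the model_* /
MODEL_* names are hypothetical stand-ins for the URB and endpoint
descriptor fields that xhci_urb_suitable_for_idt() inspects in the diff
below.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_IDT_MAX_SIZE 8 /* same value as TRB_IDT_MAX_SIZE */

enum model_xfer_type { MODEL_CONTROL, MODEL_ISOC, MODEL_BULK, MODEL_INT };

/* Hypothetical flattened view of the fields the check depends on. */
struct model_urb {
	enum model_xfer_type type;
	bool dir_out;                    /* host-to-device transfer */
	uint16_t max_packet;             /* endpoint wMaxPacketSize */
	uint32_t transfer_buffer_length;
};

/* True when the transfer may be sent as immediate data. */
static bool model_suitable_for_idt(const struct model_urb *urb)
{
	return urb->type != MODEL_ISOC &&
	       urb->dir_out &&
	       urb->max_packet >= MODEL_IDT_MAX_SIZE &&
	       urb->transfer_buffer_length <= MODEL_IDT_MAX_SIZE;
}
```

For example, an 8-byte bulk OUT transfer on an endpoint with a 64-byte
wMaxPacketSize passes the check, while the same transfer on an
isochronous endpoint, in the IN direction, or with 9 bytes does not.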
Signed-off-by: Nicolas Saenz Julienne <[email protected]>
---
drivers/usb/host/xhci-ring.c | 12 ++++++++++++
drivers/usb/host/xhci.c | 16 ++++++++++++++++
drivers/usb/host/xhci.h | 17 +++++++++++++++++
3 files changed, 45 insertions(+)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 40fa25c4d041..997edc908a0d 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3272,6 +3272,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
field |= TRB_IOC;
more_trbs_coming = false;
td->last_trb = ring->enqueue;
+
+ if (xhci_urb_suitable_for_idt(urb)) {
+ memcpy(&send_addr, urb->transfer_buffer,
+ trb_buff_len);
+ field |= TRB_IDT;
+ }
}
/* Only set interrupt on short packet for IN endpoints */
@@ -3411,6 +3417,12 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
if (urb->transfer_buffer_length > 0) {
u32 length_field, remainder;
+ if (xhci_urb_suitable_for_idt(urb)) {
+ memcpy(&urb->transfer_dma, urb->transfer_buffer,
+ urb->transfer_buffer_length);
+ field |= TRB_IDT;
+ }
+
remainder = xhci_td_remainder(xhci, 0,
urb->transfer_buffer_length,
urb->transfer_buffer_length,
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 005e65922608..f04ad2290884 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1238,6 +1238,21 @@ EXPORT_SYMBOL_GPL(xhci_resume);
/*-------------------------------------------------------------------------*/
+/*
+ * Bypass the DMA mapping if URB is suitable for Immediate Transfer (IDT),
+ * we'll copy the actual data into the TRB address register. This is limited to
+ * transfers up to 8 bytes on output endpoints of any kind with wMaxPacketSize
+ * >= 8 bytes. If suitable for IDT only one Transfer TRB per TD is allowed.
+ */
+static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
+ gfp_t mem_flags)
+{
+ if (xhci_urb_suitable_for_idt(urb))
+ return 0;
+
+ return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
+}
+
/**
* xhci_get_endpoint_index - Used for passing endpoint bitmasks between the core and
* HCDs. Find the index for an endpoint given its descriptor. Use the return
@@ -5155,6 +5170,7 @@ static const struct hc_driver xhci_hc_driver = {
/*
* managing i/o requests and associated device resources
*/
+ .map_urb_for_dma = xhci_map_urb_for_dma,
.urb_enqueue = xhci_urb_enqueue,
.urb_dequeue = xhci_urb_dequeue,
.alloc_dev = xhci_alloc_dev,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 652dc36e3012..7dc6d2197641 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1295,6 +1295,8 @@ enum xhci_setup_dev {
#define TRB_IOC (1<<5)
/* The buffer pointer contains immediate data */
#define TRB_IDT (1<<6)
+/* TDs smaller than this might use IDT */
+#define TRB_IDT_MAX_SIZE 8
/* Block Event Interrupt */
#define TRB_BEI (1<<9)
@@ -2141,6 +2143,21 @@ static inline struct xhci_ring *xhci_urb_to_transfer_ring(struct xhci_hcd *xhci,
urb->stream_id);
}
+/*
+ * TODO: As per spec Isochronous IDT transmissions are supported. We bypass
+ * them anyway as we were unable to find a device that matches the
+ * constraints.
+ */
+static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
+{
+ if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
+ usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
+ urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
+ return true;
+
+ return false;
+}
+
static inline char *xhci_slot_state_string(u32 state)
{
switch (state) {
--
2.20.1
Nicolas Saenz Julienne <[email protected]> writes:
> Immediate data transfers (IDT) allow the HCD to copy small chunks of
> data (up to 8bytes) directly into its output transfer TRBs. This avoids
> the somewhat expensive DMA mappings that are performed by default on
> most URBs submissions.
>
> In the case an URB was suitable for IDT. The data is directly copied
> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
> set. Instead of triggering memory accesses the HC will use the data
> directly.
>
> The implementation could cover all kind of output endpoints. Yet
> Isochronous endpoints are bypassed as I was unable to find one that
> matched IDT's constraints. As we try to bypass the default DMA mappings
> on URB buffers we'd need to find a Isochronous device with an
> urb->transfer_buffer_length <= 8 bytes.
>
> The implementation takes into account that the 8 byte buffers provided
> by the URB will never cross a 64KB boundary.
>
> Signed-off-by: Nicolas Saenz Julienne <[email protected]>
This looks good to my eyes.
Reviewed-by: Felipe Balbi <[email protected]>
--
balbi
On Tue, 2019-02-19 at 17:29 +0100, Nicolas Saenz Julienne wrote:
> Immediate data transfers (IDT) allow the HCD to copy small chunks of
> data (up to 8bytes) directly into its output transfer TRBs. This avoids
> the somewhat expensive DMA mappings that are performed by default on
> most URBs submissions.
>
> In the case an URB was suitable for IDT. The data is directly copied
> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
> set. Instead of triggering memory accesses the HC will use the data
> directly.
>
> The implementation could cover all kind of output endpoints. Yet
> Isochronous endpoints are bypassed as I was unable to find one that
> matched IDT's constraints. As we try to bypass the default DMA mappings
> on URB buffers we'd need to find a Isochronous device with an
> urb->transfer_buffer_length <= 8 bytes.
>
> The implementation takes into account that the 8 byte buffers provided
> by the URB will never cross a 64KB boundary.
>
> Signed-off-by: Nicolas Saenz Julienne <[email protected]>
Friendly ping, any more comments on this? :)
On 15.3.2019 14.51, Nicolas Saenz Julienne wrote:
> On Tue, 2019-02-19 at 17:29 +0100, Nicolas Saenz Julienne wrote:
>> Immediate data transfers (IDT) allow the HCD to copy small chunks of
>> data (up to 8bytes) directly into its output transfer TRBs. This avoids
>> the somewhat expensive DMA mappings that are performed by default on
>> most URBs submissions.
>>
>> In the case an URB was suitable for IDT. The data is directly copied
>> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
>> set. Instead of triggering memory accesses the HC will use the data
>> directly.
>>
>> The implementation could cover all kind of output endpoints. Yet
>> Isochronous endpoints are bypassed as I was unable to find one that
>> matched IDT's constraints. As we try to bypass the default DMA mappings
>> on URB buffers we'd need to find a Isochronous device with an
>> urb->transfer_buffer_length <= 8 bytes.
>>
>> The implementation takes into account that the 8 byte buffers provided
>> by the URB will never cross a 64KB boundary.
>>
>> Signed-off-by: Nicolas Saenz Julienne <[email protected]>
>
>
> Friendly ping, any more comments on this? :)
>
Looks good, adding to queue, thanks
-Mathias