Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753129AbeADNft (ORCPT + 1 other); Thu, 4 Jan 2018 08:35:49 -0500 Received: from mga09.intel.com ([134.134.136.24]:19338 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752523AbeADNfs (ORCPT ); Thu, 4 Jan 2018 08:35:48 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.45,507,1508828400"; d="scan'208";a="7370617" Subject: Re: [PATCH v4 1/3] usb: host: Implement workaround for Erratum A-007463 To: yinbo.zhu@nxp.com, Felipe Balbi , Greg Kroah-Hartman , Mathias Nyman , "open list:DESIGNWARE USB3 DRD IP DRIVER" , open list Cc: xiaobo.xie@nxp.com, jerry.huang@nxp.com, ran.wang_1@nxp.com References: <20171219101618.45650-1-yinbo.zhu@nxp.com> From: Mathias Nyman Message-ID: Date: Thu, 4 Jan 2018 15:38:29 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: <20171219101618.45650-1-yinbo.zhu@nxp.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On 19.12.2017 12:16, yinbo.zhu@nxp.com wrote: > From: yinbo.zhu > > When a transaction error (defined in Section 4.10.2.3, "USB > Transaction Error" of the xHCI Specification) occurs on the > USB, the host controller reports this through a transfer > event with the completion code "USB Transaction Error". When > this happens, the endpoint is placed in the Halted state. In > response, software must issue a Reset Endpoint command to > transition the endpoint to the Stopped state. In order to > restart the transfer, the driver can perform either of the > following: > a) Ring the doorbell again, which restarts the transfer from > where it stopped, or > b) Issue a Set TR (Transfer Ring) Dequeue Pointer command > for the endpoint to start the transfer form a different > Transfer Ring pointer Consider the following > scenario: > 1. The xHCI driver prepares a control transfer read to one > of the device's control endpoints; > 2. During the IN data stage, a transaction error occurs on > the USB, causing a transfer event with the completion > code "USB Transaction Error"; > 3. The driver issues a Reset Endpoint command; > 4. The driver rings the doorbell of the control endpoint to > resume the transfer. In this scenario, the controller > may reverse the direction of the data stage from IN to OUT. The xhci driver should not ring the doorbell without first issuing a Set TR Dequeue Pointer command in the USB Transaction error case. code flow: handle_tx_event() case COMP_USB_TRANSACTION_ERROR: status = -EPROTO; process_ctrl_td() switch (trb_comp_code) default: if (!xhci_requires_manual_halt_cleanup() // FALSE break; /* else fall through */ case COMP_STALL_ERROR: finish_td() if (xhci_requires_manual_halt_cleanup()) // TRUE xhci_cleanup_halted_endpoint(...,EP_HARD_RESET) xhci_queue_reset_ep() xhci_cleanup_stalled_ring() xhci_find_new_dequeue_state() xhci_queue_new_dequeue_state() If you can see the driver ringing the doorbell before the Set TR Dequeue Pointer command then there is some race going on in the driver > Instead of sending an ACK to the endpoint to poll for read > data, it sends a Data Packet (DP) to the endpoint. It > fetches the data from the data stage Transfer Request > Block (TRB) that is being resumed, even though the data > buffer is setup to receive data and not transmit it. > NOTE > This issue occurs only if the transaction error happens > during an IN data stage. There is no issue if the transaction > error happens during an OUT data stage. Have you been able to trigger this issue? > > Impact: When this issue occurs, the device likely responds in > one of the following ways: > a) The device responds with a STALL because the data stage > has unexpectedly changed directions. The controller then > generates a Stall Error transfer event, to which software > must issue a Reset Endpoint command followed by a Set TR > Dequeue Pointer command pointing to a new Setup TRB to clear > the STALL condition. > b) The device does not respond to the inverted data stage and > the transaction times out. The controller generates another > USB Transaction Error transfer event, to which software > likely performs a USB Reset to the device because it is > unresponsive. It is not expected that any of these recovery > steps will cause instability in the system because this > recovery is part of a standard xHCI driver and could happen > regardless of the defect. Another possible system-level > impact is that the controller attempts to read from the > memory location pointed at by the Data Stage TRB or a Normal > TRB chained to it. associated with this TRB is intended to be > written by the controller, but the controller reads from it > instead. Normally, this does not cause a problem. However, if > the system has some type of memory protection where this > unexpected read is treated as a bus error, > a problem. However, if the system has some type of memory > it may cause the system to become unstable or to crash. > > Workaround: If a USB Transaction Error occurs during the IN > data phase of a control transfer, the driver must use the > Set TR Dequeue Pointer command to either restart the data > Phase or restart the entire control transfer from the > Setup phase. The Set TR Dequeue pointer command should already be the default way the xhci driver handles this now. > > Configs Affected: > LS1021-20-22A-R1.0, LS1021-20-22A-R2.007463 > > Signed-off-by: yinbo zhu > --- > @@ -1951,14 +1953,30 @@ static int finish_td(struct xhci_hcd *xhci, struct xhci_td *td, > if (trb_comp_code == COMP_STALL_ERROR || > xhci_requires_manual_halt_cleanup(xhci, ep_ctx, > trb_comp_code)) { > - /* Issue a reset endpoint command to clear the host side > - * halt, followed by a set dequeue command to move the > - * dequeue pointer past the TD. > - * The class driver clears the device side halt later. > + /*erratum A-007463: > + *After transaction error, controller switches control transfer > + *data stage from IN to OUT direction. > */ > - xhci_cleanup_halted_endpoint(xhci, slot_id, ep_index, > + remaining = EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)); > + if (remaining && xhci_requires_manual_halt_cleanup(xhci, ep_ctx, > + trb_comp_code) && > + (xhci->quirks & XHCI_REVERSE_IN_OUT)) { > + memset(&deq_state, 0, sizeof(deq_state)); > + xhci_find_new_dequeue_state(xhci, slot_id, > + ep_index, td->urb->stream_id, td, &deq_state); > + xhci_queue_new_dequeue_state(xhci, slot_id, ep_index, > + &deq_state); > + xhci_ring_cmd_db(xhci); Even if all these changes should be unnecessary the endpoint is still halted here, and we would need a reset endpoint command here as well, right? Has this new codepath been tested? > + } else { > + /* Issue a reset endpoint command to clear the host side > + * halt, followed by a set dequeue command to move the > + * dequeue pointer past the TD. > + * The class driver clears the device side halt later. > + */ > + xhci_cleanup_halted_endpoint(xhci, slot_id, ep_index, > ep_ring->stream_id, td, ep_trb, > EP_HARD_RESET); -Mathias