Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758626AbcLAIDb (ORCPT ); Thu, 1 Dec 2016 03:03:31 -0500 Received: from mail-yb0-f179.google.com ([209.85.213.179]:34787 "EHLO mail-yb0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754392AbcLAID3 (ORCPT ); Thu, 1 Dec 2016 03:03:29 -0500 MIME-Version: 1.0 In-Reply-To: <583FD4C6.7020306@linux.intel.com> References: <613dafc211127a4589306e91e231e151feb5ce80.1480496291.git.baolin.wang@linaro.org> <583FB8F0.6070503@linux.intel.com> <583FC4AF.3030502@linux.intel.com> <583FD4C6.7020306@linux.intel.com> From: Baolin Wang Date: Thu, 1 Dec 2016 16:03:27 +0800 Message-ID: Subject: Re: [RFC] usb: host: xhci: Remove the watchdog timer and use command timer to watch stop endpoint command To: Lu Baolu Cc: mathias.nyman@intel.com, Greg KH , USB , LKML , Mark Brown Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3661 Lines: 89 On 1 December 2016 at 15:44, Lu Baolu wrote: > Hi, > > On 12/01/2016 03:35 PM, Baolin Wang wrote: >> On 1 December 2016 at 14:35, Lu Baolu wrote: >>> Hi, >>> >>> On 12/01/2016 02:04 PM, Baolin Wang wrote: >>>> Hi Baolu, >>>> >>>> On 1 December 2016 at 13:45, Lu Baolu wrote: >>>>> Hi, >>>>> >>>>> On 11/30/2016 05:02 PM, Baolin Wang wrote: >>>>>> If the hardware never responds to the stop endpoint command, the >>>>>> URBs will never be completed, and we might hang the USB subsystem. >>>>>> The original watchdog timer is used to watch if one stop endpoint >>>>>> command is timeout, if timeout, then the watchdog timer will set >>>>>> XHCI_STATE_DYING, try to halt the xHCI host, and give back all >>>>>> pending URBs. >>>>>> >>>>>> But now we already have one command timer to control command timeout, >>>>>> thus we can also use the command timer to watch the stop endpoint >>>>>> command, instead of one duplicate watchdog timer which need to be >>>>>> removed. >>>>>> >>>>>> Meanwhile we don't need the 'stop_cmds_pending' flag to identy if >>>>>> this is the last stop endpoint command of one endpoint. Since we >>>>>> can make sure we only set one stop endpoint command for one endpoint >>>>>> by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove >>>>>> this flag. >>>>> I am afraid you can't do this. "stop_cmds_pending" was added >>>>> to fix the problem described in the comments that you want to >>>>> remove. But I didn't find any fix of this problem in your patch. >>>> Now we can not pending another stop endpoint command for the same one >>>> endpoint, since will check 'EP_HALT_PENDING' flag in >>>> xhci_urb_dequeue() function to avoid this. But after some >>>> investigation, I think I missed the stop endpoint command in >>>> xhci_stop_device() which did not check the 'EP_HALT_PENDING' flag, >>>> maybe need to add 'EP_HALT_PENDING' flag checking in >>>> xhci_stop_device() function. DId I miss something else? Thanks. >>> Consider below three threads running on different CPUs at the same time. >>> >>> Thread A: xhci_handle_cmd_stop_ep() --- in interrupt handler >>> Thread B: xhci_stop_endpoint_command_timeout() --- timer expired >>> Thread C: xhci_urb_dequeue --- called by usb core >>> >>> They are serialized by xhci->lock. Let's consider below sequence: >>> >>> Thread A: >>> - delete xhci->cmd_timer), but will fail due to Thread B. >>> - clear EP_HALT_PENDING bit >>> >>> Thread B: >>> - halt the host controller >>> >>> Thread C: >>> - set EP_HALT_PENDING bit >>> - enqueue another stop endpoint command >>> - add the timer back >> Ah, thanks for you comments. >> If thread B halted the host, then the thread C xhci_urb_dequeue() will >> check host state failed, then just return and can not set another stop >> endpoint command. > > Not so simple. Thread B will release xhci->lock before xhci_halt(). Yes. > >> But from your example, I think your meaning is we >> should not halt the host by thread B, since we have handled the stop >> endpoint command event. >> >> So I think we need to check the xhci command of this stop endpoint >> command in xhci_stop_endpoint_command_timeout(), if the xhci command >> of this stop endpoint command is not in the command list (which means >> the stop endpoint command event has been handled), then just return >> and do not halt the controller. Right? >> > > I'd like suggest you to put this change into a separated patch. > It's actually a different topic from the main purpose of this patch. OK, I will. Thanks for your comments. -- Baolin.wang Best Regards