Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757276AbcJNHDJ (ORCPT ); Fri, 14 Oct 2016 03:03:09 -0400 Received: from mail-yb0-f173.google.com ([209.85.213.173]:35728 "EHLO mail-yb0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755016AbcJNHDH (ORCPT ); Fri, 14 Oct 2016 03:03:07 -0400 MIME-Version: 1.0 In-Reply-To: <87shs0tltb.fsf@linux.intel.com> References: <87r37kviic.fsf@linux.intel.com> <87inswvg80.fsf@linux.intel.com> <87a8e8vaif.fsf@linux.intel.com> <877f9cv7no.fsf@linux.intel.com> <87y41struu.fsf@linux.intel.com> <87shs0tltb.fsf@linux.intel.com> From: Baolin Wang Date: Fri, 14 Oct 2016 15:03:06 +0800 Message-ID: Subject: Re: [PATCH v2] usb: dwc3: gadget: Wait for end transfer complete before free irq To: Felipe Balbi Cc: Greg KH , Mark Brown , USB , LKML Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 10680 Lines: 263 Hi Felipe, On 13 October 2016 at 21:34, Felipe Balbi wrote: > > > Hi, > > Baolin Wang writes: >>> Baolin Wang writes: >>>>>> Baolin Wang writes: >>>>>>>>>> I'm thinking this is a bug in configfs interface of Gadget API, not >>>>>>>>>> dwc3. The only reason for this to happen would be if we get to >>>>>>>>>> ->udc_stop() with endpoints still enabled. >>>>>>>>>> >>>>>>>>>> Can you check if that's the case? i.e. can you check if any endpoints >>>>>>>>>> are still enabled when we get here? >>>>>>>>> >>>>>>>>> No, it is not any endpoints are still enabled. Like I said in commit >>>>>>>>> message, we will start end transfer command when disable the endpoint, >>>>>>>>> if the end transfer command complete event comes after we have freed >>>>>>>>> the gadget irq, it will trigger the interrupt line all the time to >>>>>>>>> crash the system. >>>>>>>> >>>>>>>> I see what the problem is. Databook tells us we *must* set CMDIOC when >>>>>>>> issuing EndTransfer command and we should always wait for Command >>>>>>>> Complete IRQ. However, we took a shortcut and just delayed for 100us >>>>>>>> after issuing End Transfer. >>>>>>> >>>>>>> Yes, but 100us delay is not enough in some scenarios, like changing >>>>>>> function with configfs frequently, we will met problems. >>>>>> >>>>>> heh, 100us *is* enough. However we still get an IRQ because we requested >>>>>> for it. If you want to fix this, then you need to find a way to get rid >>>>>> of that 100us udelay() and add a proper wait_for_completion() to delay >>>>>> execution until command complete IRQ fires up. >>>>> >>>>> I haven't tested this yet, but it shows the idea (I think we might still >>>>> have a race with ep_queue if we're still waiting for End Transfer, but >>>> >>>> Yes, maybe we need check DWC3_EP_END_TRANSFER_PENDING flag when >>>> queuing one request. >>> >>> Yeah, I'll add that check later. I still need to make sure we would >>> still try to kick any transfers that might have been queued while >>> waiting for End Transfer Command Complete IRQ. >> >> OK. But I still worried if there are other races in some places which > > There are no other places where this needs to be sorted out. > >> is not easy to find out by testing. No introducing race condition >> would be one better solution, anyway I would like to test the patch >> you send out firstly. > > Right, but your patch was working around another problem, rather then > fixing it. > > Here's another version which should make sure everything remains working > as it should. > > 8<------------------------------------------------------------------------ > From 1f922a902a21d5cee32082be18bbfbf1d46137e4 Mon Sep 17 00:00:00 2001 > From: Felipe Balbi > Date: Thu, 13 Oct 2016 14:09:47 +0300 > Subject: [PATCH] usb: dwc3: gadget: wait for End Transfer to complete > > Instead of just delaying for 100us, we should > actually wait for End Transfer Command Complete > interrupt before moving on. Note that this should > only be done if we're dealing with one of the core > revisions that actually require the interrupt before > moving on. > > Reported-by: Baolin Wang > Signed-off-by: Felipe Balbi > --- > drivers/usb/dwc3/core.h | 10 +++++++++- > drivers/usb/dwc3/gadget.c | 31 ++++++++++++++++++++++++++++--- > 2 files changed, 37 insertions(+), 4 deletions(-) > > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h > index e878366ead00..cf495d932252 100644 > --- a/drivers/usb/dwc3/core.h > +++ b/drivers/usb/dwc3/core.h > @@ -26,6 +26,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -504,6 +505,7 @@ struct dwc3_event_buffer { > * @endpoint: usb endpoint > * @pending_list: list of pending requests for this endpoint > * @started_list: list of started requests on this endpoint > + * @wait_end_transfer: wait_queue_head_t for waiting on End Transfer complete > * @lock: spinlock for endpoint request queue traversal > * @regs: pointer to first endpoint register > * @trb_pool: array of transaction buffers > @@ -529,6 +531,8 @@ struct dwc3_ep { > struct list_head pending_list; > struct list_head started_list; > > + wait_queue_head_t wait_end_transfer; > + > spinlock_t lock; > void __iomem *regs; > > @@ -545,7 +549,8 @@ struct dwc3_ep { > #define DWC3_EP_BUSY (1 << 4) > #define DWC3_EP_PENDING_REQUEST (1 << 5) > #define DWC3_EP_MISSED_ISOC (1 << 6) > - > +#define DWC3_EP_END_TRANSFER_PENDING (1 << 7) > +#define DWC3_EP_KICK_POST_END_TRANSFER (1 << 8) > /* This last one is specific to EP0 */ > #define DWC3_EP0_DIR_IN (1 << 31) > > @@ -1047,6 +1052,9 @@ struct dwc3_event_depevt { > #define DEPEVT_TRANSFER_BUS_EXPIRY 2 > > u32 parameters:16; > + > +/* For Command Complete Events */ > +#define DEPEVT_PARAMETER_CMD(n) (((n) & (0xf << 24)) >> 24) > } __packed; > > /** > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 3ba05b12e49a..a3f81b5ba901 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -598,6 +598,8 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, > reg |= DWC3_DALEPENA_EP(dep->number); > dwc3_writel(dwc->regs, DWC3_DALEPENA, reg); > > + init_waitqueue_head(&dep->wait_end_transfer); > + > if (usb_endpoint_xfer_control(desc)) > return 0; > > @@ -1151,6 +1153,11 @@ static int __dwc3_gadget_ep_queue(struct dwc3_ep *dep, struct dwc3_request *req) > if (!dwc3_calc_trbs_left(dep)) > return 0; > > + if (dep->flags & DWC3_EP_END_TRANSFER_PENDING) { > + dep->flags &= DWC3_EP_KICK_POST_END_TRANSFER; > + return 0; > + } > + > ret = __dwc3_gadget_kick_transfer(dep, 0); > if (ret && ret != -EBUSY) > dwc3_trace(trace_dwc3_gadget, > @@ -2156,6 +2163,7 @@ static void dwc3_endpoint_interrupt(struct dwc3 *dwc, > { > struct dwc3_ep *dep; > u8 epnum = event->endpoint_number; > + u8 cmd; > > dep = dwc->eps[epnum]; > > @@ -2200,8 +2208,13 @@ static void dwc3_endpoint_interrupt(struct dwc3 *dwc, > return; > } > break; > - case DWC3_DEPEVT_RXTXFIFOEVT: > case DWC3_DEPEVT_EPCMDCMPLT: > + cmd = DEPEVT_PARAMETER_CMD(event->parameters); > + > + if (cmd == DWC3_DEPCMD_ENDTRANSFER) > + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; I think you missed wake_up(&dep->wait_end_transfer) function here. > + break; > + case DWC3_DEPEVT_RXTXFIFOEVT: > break; > } > } > @@ -2246,6 +2259,7 @@ static void dwc3_reset_gadget(struct dwc3 *dwc) > } > > static void dwc3_stop_active_transfer(struct dwc3 *dwc, u32 epnum, bool force) > +__releases(&dwc->lock) __acquires(&dwc->lock) > { > struct dwc3_ep *dep; > struct dwc3_gadget_ep_cmd_params params; > @@ -2295,11 +2309,22 @@ static void dwc3_stop_active_transfer(struct dwc3 *dwc, u32 epnum, bool force) > memset(¶ms, 0, sizeof(params)); > ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms); > WARN_ON_ONCE(ret); > + > + if (dwc3_is_usb31(dwc) || dwc->revision < DWC3_REVISION_310A) { > + dep->flags |= DWC3_EP_END_TRANSFER_PENDING; > + wait_event_lock_irq(dep->wait_end_transfer, > + !(dep->flags & DWC3_EP_END_TRANSFER_PENDING), > + dwc->lock); > + } I've tested your patch, unfortunately it can not work on my platform. Since we can not issue wait_event_lock_irq() function in atomic context, we will hold another 'cdev->lock' in composite_disconnect() function. The dump stack as below: [ 92.686810] c7 BUG: scheduling while atomic: init/1/0x00000002 [ 92.686825] c7 INFO: lockdep is turned off. [ 92.686833] c0 Modules linked in: mtty marlin2_fm mali_kbase(O) [ 92.686859] c0 Preemption disabled at:[] composite_disconnect+0x30/0x90 [ 92.686879] c7 [ 92.686891] c7 CPU: 7 PID: 1 Comm: init Tainted: G W O 4.4.6+ #20 [ 92.686898] c7 Hardware name: Spreadtrum SP9860g Board (DT) [ 92.686905] c7 Call trace: [ 92.689519] c7 [] dump_backtrace+0x0/0x170 [ 92.689528] c7 [] show_stack+0x20/0x28 [ 92.689537] c7 [] dump_stack+0xa8/0xe0 [ 92.689547] c7 [] __schedule_bug+0x7c/0xd4 [ 92.689559] c7 [] __schedule+0x724/0xab4 [ 92.689567] c7 [] schedule+0x48/0xa8 [ 92.689578] c7 [] dwc3_stop_active_transfer.constprop.27+0x98/0x158 [ 92.689587] c7 [] dwc3_remove_requests+0x3c/0xa0 [ 92.689595] c7 [] __dwc3_gadget_ep_disable+0x4c/0x12c [ 92.689603] c7 [] dwc3_gadget_ep_disable+0x6c/0x100 [ 92.689613] c7 [] mtp_function_disable+0xc0/0x100 [ 92.689624] c7 [] reset_config+0x4c/0x94 [ 92.689632] c7 [] composite_disconnect+0x44/0x90 [ 92.689641] c7 [] android_disconnect+0x60/0x70 [ 92.689649] c7 [] usb_gadget_remove_driver+0x6c/0xe0 [ 92.689657] c7 [] usb_gadget_unregister_driver+0x7c/0xc8 [ 92.689665] c7 [] gadget_dev_desc_UDC_store+0x8c/0x128 [ 92.689675] c7 [] configfs_write_file+0xd0/0x13c [ 92.689684] c7 [] vfs_write+0xb8/0x214 [ 92.689692] c7 [] SyS_write+0x54/0xb0 [ 92.689701] c7 [] el0_svc_naked+0x24/0x28 > + > dep->resource_index = 0; > dep->flags &= ~DWC3_EP_BUSY; > > - if (dwc3_is_usb31(dwc) || dwc->revision < DWC3_REVISION_310A) > - udelay(100); > + if (dep->flags & DWC3_EP_KICK_POST_END_TRANSFER) { > + dep->flags &= ~DWC3_EP_KICK_POST_END_TRANSFER; > + ret = __dwc3_gadget_kick_transfer(dep, 0); > + WARN_ON_ONCE(ret); > + } > } > > static void dwc3_stop_active_transfers(struct dwc3 *dwc) > -- > 2.10.0.440.g21f862b > > > -- > balbi -- Baolin.wang Best Regards