From: John Stultz
Date: Fri, 1 Feb 2019 16:36:18 -0800
Subject: Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1
To: Thinh Nguyen
Cc: Felipe Balbi, Zeng Tao, Jack Pham, Chen Yu, lkml, Linux USB List, Greg Kroah-Hartman
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 1, 2019 at 4:31 PM Thinh Nguyen wrote:
>
> Hi John,
>
> John Stultz wrote:
> > Hey all,
> >   Since the 5.0 merge window opened, I've been tripping on frequent
> > dwc3 crashes on reboot and suspend, an example of which I've added to
> > the bottom of this mail.
> >
> > I've dug in a little bit and sort of have a sense of what's going on.
> >
> > In ffs_epfile_io():
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_torvalds_linux.git_tree_drivers_usb_gadget_function_f-5Ffs.c-23n1065&d=DwIBaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=u9FYoxKtyhjrGFcyixFYqTjw1ZX0VsG2d8FCmzkTY-w&m=a8TU-itM8GBG_EARYf2yM-kVfCzmaPkKDNAUFQHTe3Q&s=BQiVAFiViSlxVg5_LemED0x_47FLVUD43M7R6h6T8qk&e=
> >
> > The completion done is set up on the stack:
> >   DECLARE_COMPLETION_ONSTACK(done);
> >
> > Then later we set up a request and queue it:
> >   req->context = &done;
> >   ...
> >   ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
> >
> > Then wait for it:
> >   if (unlikely(wait_for_completion_interruptible(&done))) {
> >       /*
> >        * To avoid race condition with ffs_epfile_io_complete,
> >        * dequeue the request first then check
> >        * status. usb_ep_dequeue API should guarantee no race
> >        * condition with req->complete callback.
> >        */
> >       usb_ep_dequeue(ep->ep, req);
> >       interrupted = ep->status < 0;
> >   }
> >
> > The problem is that we end up being interrupted, supposedly dequeue
> > the request, and exit.
> >
> > But then (or in parallel) the irq triggers and we try calling
> > complete() on the context pointer, which now points to random stack
> > space, which results in the panic.
> >
> > It seems like something is wrong with usb_ep_dequeue not really
> > stopping the irq from happening?
> >
> > If I revert all the changes to dwc3 back to 4.20, I don't see the issue.
> >
> > I'll do some bisection to try to narrow things down, but I wanted to
> > see if this was a known issue or if anyone had immediate ideas as to
> > what might be wrong.
> >
>
> I'm not sure if this is related, but can you try to test using Felipe's
> testing/next branch? There is a fix to a race condition when the gadget
> driver tries to dequeue requests.
>
> See if you run into this issue again.

I'll check that out! Thanks so much for the pointer!
-john
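
For anyone following along, here is a rough user-space sketch of the ordering described above. It is not kernel code and not the dwc3 or f_fs implementation: `struct completion`, `struct request`, `irq_fire()`, `dequeue_broken()`, `dequeue_sync()` and `interrupted_io()` are all hypothetical stand-ins made up for illustration, and the `live` flag only models "the stack frame still exists" so the sketch itself never dereferences freed stack memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an on-stack completion. */
struct completion {
    bool live;   /* models "the waiter's stack frame still exists" */
    bool done;   /* set by the completion callback */
};

/* Hypothetical stand-in for a queued USB request. */
struct request {
    struct completion *context;  /* like req->context = &done */
    bool pending;                /* controller still intends to complete it */
};

/*
 * Models the IRQ/completion path. Returns true if it would have touched
 * a completion whose frame is already gone (the crash in the real kernel).
 */
static bool irq_fire(struct request *req)
{
    assert(req->context != NULL);
    if (!req->pending)
        return false;            /* request was really cancelled */
    req->pending = false;
    if (!req->context->live)
        return true;             /* complete() on stale stack space */
    req->context->done = true;
    return false;
}

/* Dequeue that returns without cancelling anything (the suspected bug). */
static void dequeue_broken(struct request *req)
{
    (void)req;
}

/* Dequeue that guarantees the callback cannot run afterwards. */
static void dequeue_sync(struct request *req)
{
    req->pending = false;
}

/*
 * Models an interrupted wait: queue, dequeue, return, then a late IRQ.
 * Returns true if the late IRQ hit a dead completion.
 */
static bool interrupted_io(void (*dequeue)(struct request *))
{
    /* static so the model itself has no real dangling pointer */
    static struct completion done;
    struct request req;

    done.live = true;
    done.done = false;
    req.context = &done;   /* req->context = &done */
    req.pending = true;    /* the request is queued */

    dequeue(&req);         /* the wait was interrupted */
    done.live = false;     /* the function returns, frame is gone */

    return irq_fire(&req); /* IRQ arrives afterwards */
}
```

Calling `interrupted_io(dequeue_broken)` reports the stale access, while `interrupted_io(dequeue_sync)` does not, which is the guarantee the comment in ffs_epfile_io() expects from usb_ep_dequeue.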