From: John Stultz
Date: Fri, 1 Feb 2019 16:41:38 -0800
Subject: Re: Frequent dwc3 crashes on suspend or reboot since 5.0-rc1
To: Felipe Balbi, Zeng Tao, Jack Pham, Thinh Nguyen, Chen Yu
Cc: lkml, Linux USB List, Greg Kroah-Hartman

On Fri, Feb 1, 2019 at 4:18 PM John Stultz wrote:
>
> Hey all,
>   Since the 5.0 merge window opened, I've been tripping on frequent
> dwc3 crashes on reboot and suspend, which I've added an example to the
> bottom of this mail.
>
> I've dug in a little bit and sort of have a sense of whats going on.
>
> In ffs_epfile_io():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065
>
> The completion done is setup on the stack:
>   DECLARE_COMPLETION_ONSTACK(done);
>
> Then later we setup a request and queue it:
>   req->context = &done;
>   ...
>   ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
>
> Then wait for it:
>   if (unlikely(wait_for_completion_interruptible(&done))) {
>       /*
>        * To avoid race condition with ffs_epfile_io_complete,
>        * dequeue the request first then check
>        * status. usb_ep_dequeue API should guarantee no race
>        * condition with req->complete callback.
>        */
>       usb_ep_dequeue(ep->ep, req);
>       interrupted = ep->status < 0;
>   }
>
> The problem is, that we end up being interrupted, supposedly dequeue
> the request, and exit.
>
> But then (or in parallel) the irq triggers and we try calling
> complete() on the context pointer which points to now random stack
> space, which results in the panic.
>
> It seems like something is wrong with usb_ep_dequeue not really
> stopping the irq from happening?
>
> If I revert all the changes to dwc3 back to 4.20, I don't see the issue.
>
> I'll do some bisection to try to narrow things down, but I wanted to
> see if this was a known issue or if anyone had immediate ideas as to
> what might be wrong.

Bisecting the changes down, it seems like it's due to commit
fec9095bdef4e ("usb: dwc3: gadget: remove wait_end_transfer").

It doesn't happen every time, so I'll need to run some more testing,
but so far I've not been able to trigger it with the patches backed
out to that point.

thanks
-john
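
For anyone who wants the suspected sequence in one place, here is a
minimal userspace sketch of it. This is not the kernel code: every
sketch_* name, the 'alive' flag and simulate() are hypothetical and
exist only for illustration. It assumes, as suspected above, that the
dequeue can return before the request is really off the hardware. To
keep the model free of undefined behaviour, the flag stands in for the
waiter's stack frame having gone away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct completion. 'alive' models whether the
 * waiter's stack frame still exists; the real structure has no such flag. */
struct sketch_completion {
    bool alive;
    bool done;
};

/* Hypothetical stand-in for struct usb_request. */
struct sketch_request {
    struct sketch_completion *context;
    bool queued;    /* true while the simulated controller owns the request */
};

static int late_completions;    /* callbacks that found a dead frame */

/* Models the request's completion callback being run from the IRQ path. */
static void sketch_irq_complete(struct sketch_request *req)
{
    assert(req->context != NULL);
    if (!req->queued)
        return;                 /* request was really cancelled */
    req->queued = false;
    if (!req->context->alive) {
        late_completions++;     /* real code: complete() on stale stack */
        return;
    }
    req->context->done = true;
}

/* Models the dequeue call. When 'synchronous' is true the request is off
 * the hardware on return; otherwise cancellation has only been started. */
static void sketch_dequeue(struct sketch_request *req, bool synchronous)
{
    if (synchronous)
        req->queued = false;
}

/* Returns how many completions fired after the waiter had already left. */
int simulate(bool dequeue_is_synchronous)
{
    struct sketch_completion done = { .alive = true, .done = false };
    struct sketch_request req = { .context = &done, .queued = true };

    late_completions = 0;

    /* The interruptible wait is broken by a signal, so we dequeue. */
    sketch_dequeue(&req, dequeue_is_synchronous);
    done.alive = false;         /* the waiting function returns here */

    /* The interrupt arrives afterwards. */
    sketch_irq_complete(&req);
    return late_completions;
}
```

Calling simulate(false) models a dequeue that returns early and counts
one late completion; simulate(true) models a dequeue that has fully
cancelled the request by the time it returns and counts none.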