Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755325AbdCTOV3 (ORCPT ); Mon, 20 Mar 2017 10:21:29 -0400 Received: from mga07.intel.com ([134.134.136.100]:45031 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753081AbdCTOUc (ORCPT ); Mon, 20 Mar 2017 10:20:32 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.36,194,1486454400"; d="scan'208";a="946234420" Subject: Re: Dell Inspiron 5558/0VNM2T hangs at resume from suspend when USB 3 is enabled To: Diego Viola , mathias.nyman@intel.com, Roger References: Cc: Ulf Hansson , Greg KH , Wei WANG , "linux-kernel@vger.kernel.org" , Linux USB List , Alan Stern From: Mathias Nyman Message-ID: <58CFE56C.8050906@linux.intel.com> Date: Mon, 20 Mar 2017 16:21:32 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4030 Lines: 101 On 19.03.2017 23:29, Diego Viola wrote: > On Fri, Mar 17, 2017 at 5:18 PM, Diego Viola wrote: >> On Fri, Mar 17, 2017 at 1:57 PM, Diego Viola wrote: >>> On Fri, Mar 17, 2017 at 1:24 PM, Alan Stern wrote: >>>> On Fri, 17 Mar 2017, Diego Viola wrote: >>>> >>>>> Hi, >>>>> >>>>> Here's the log to the netconsole dmesg capture, I've used >>>>> xhci_hcd.dyndbg no_console_suspend=1 as the kernel parameters. >>>>> >>>>> I did the usual suspend/resume cycle with i3lock, it hung after the >>>>> third attempt when trying to resume from suspend. >>>>> >>>>> https://bugzilla.kernel.org/attachment.cgi?id=255309 >>>> >>>> I'm not an expert on xHCI. This should be CC'ed to the xhci-hcd >>>> maintainer. >>>> >>>> Alan Stern >>>> >>>>> >>>>> Please let me know if I should provide something else. >>>>> >>>>> Thanks, >>>>> Diego >>>>> >>>> >>> >>> I've forwarded my email to Mathias Nyman. >>> >>> Diego >> >> Still a problem with 4.11.0-rc2-ARCH+ >> From a quick glance it looks like rtsx_usb_ms probaly takes a mutex (&ucr->dev_mutex) and then issues a usb_bulk_msg() and waits for it to complete with mutex held. The usb message times out, usb core kills the urb but the URB probably never gets completed, and function never returns. Everyting using ucr->dev_mutex would block, for example the kthread, rtsx_usb_detect_ms_card that continuously tries to detect a ms card, takes and releases the same ucr->dev_mutex for each try. [ 614.026502] INFO: task kworker/u8:0:5 blocked for more than 120 seconds. [ 614.027865] Not tainted 4.10.3-1-ARCH #1 [ 614.029116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 614.030467] kworker/u8:0 D 0 5 2 0x00000000 [ 614.031812] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 614.033179] Call Trace: [ 614.034550] __schedule+0x22f/0x700 [ 614.035940] schedule+0x3d/0x90 [ 614.037334] schedule_preempt_disabled+0x15/0x20 [ 614.038680] __mutex_lock_slowpath+0x19b/0x2d0 [ 614.040067] ? flush_workqueue+0x204/0x580 [ 614.041456] mutex_lock+0x23/0x30 [ 614.042163] acpi_device_hotplug+0x43/0x3e7 [ 614.042882] acpi_hotplug_work_fn+0x1e/0x29 [ 614.043612] process_one_work+0x1e5/0x470 [ 614.044356] worker_thread+0x48/0x4e0 [ 614.045077] kthread+0x101/0x140 [ 614.045788] ? process_one_work+0x470/0x470 [ 614.046495] ? kthread_create_on_node+0x60/0x60 [ 614.047215] ret_from_fork+0x2c/0x40 [ 614.047950] INFO: task rtsx_usb_ms_1:235 blocked for more than 120 seconds. [ 614.048697] Not tainted 4.10.3-1-ARCH #1 [ 614.049465] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 614.050265] rtsx_usb_ms_1 D 0 235 2 0x00000000 [ 614.051064] Call Trace: [ 614.051841] __schedule+0x22f/0x700 [ 614.052626] schedule+0x3d/0x90 [ 614.053411] usb_kill_urb.part.4+0x6c/0xa0 [usbcore] [ 614.054198] ? wake_atomic_t_function+0x60/0x60 [ 614.055005] usb_kill_urb+0x21/0x30 [usbcore] [ 614.055819] usb_start_wait_urb+0xe5/0x170 [usbcore] [ 614.056652] usb_bulk_msg+0xbd/0x160 [usbcore] [ 614.057489] rtsx_usb_send_cmd+0x63/0x90 [rtsx_usb] [ 614.058306] rtsx_usb_read_register+0x6c/0xc0 [rtsx_usb] [ 614.059118] rtsx_usb_detect_ms_card+0x98/0x120 [rtsx_usb_ms] There is a lot going on in xhci during the last suspend befor this. URBs are canceled, devices reset and re-enumerated, timeout while reading descriptor, device firmware changed. It's possible we end up in a situation where xhci never givers back the URB. 4.11-rc2 has better xhci tracing, it shows each URB enqueue and dequeue and giveback. Could you try enabling xhci tracing before suspending (not the same as xhci verbose dynamic debug) It will generate a lot of data, so better to remove all extra USB devices. xhci tracing can be added with: mount -t debugfs none /sys/kernel/debug echo xhci-hcd >> /sys/kernel/debug/tracing/set_event and then send the output of cat /sys/kernel/debug/tracing/trace -Mathias