Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752867AbdCPTYX (ORCPT ); Thu, 16 Mar 2017 15:24:23 -0400 Received: from bh-25.webhostbox.net ([208.91.199.152]:57649 "EHLO bh-25.webhostbox.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752179AbdCPTYV (ORCPT ); Thu, 16 Mar 2017 15:24:21 -0400 From: Guenter Roeck To: Greg Kroah-Hartman Cc: Alan Stern , linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, Douglas Anderson , Brian Norris , Guenter Roeck Subject: [PATCH v3] usb: hub: Fix error loop seen after hub communication errors Date: Thu, 16 Mar 2017 12:24:13 -0700 Message-Id: <1489692253-20299-1-git-send-email-linux@roeck-us.net> X-Mailer: git-send-email 2.7.4 X-Authenticated_sender: guenter@roeck-us.net X-OutGoing-Spam-Status: No, score=-1.0 X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - bh-25.webhostbox.net X-AntiAbuse: Original Domain - vger.kernel.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - roeck-us.net X-Get-Message-Sender-Via: bh-25.webhostbox.net: authenticated_id: guenter@roeck-us.net X-Authenticated-Sender: bh-25.webhostbox.net: guenter@roeck-us.net X-Source: X-Source-Args: X-Source-Dir: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5508 Lines: 142 While stress testing a usb controller using a bind/unbind looop, the following error loop was observed. usb 7-1.2: new low-speed USB device number 3 using xhci-hcd usb 7-1.2: hub failed to enable device, error -108 usb 7-1-port2: cannot disable (err = -22) usb 7-1-port2: couldn't allocate usb_device usb 7-1-port2: cannot disable (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) ** 57 printk messages dropped ** hub 7-1:1.0: activate --> -22 ** 82 printk messages dropped ** hub 7-1:1.0: hub_ext_port_status failed (err = -22) This continues forever. After adding tracebacks into the code, the call sequence leading to this is found to be as follows. [] hub_activate+0x368/0x7b8 [] hub_resume+0x2c/0x3c [] usb_resume_interface.isra.6+0x128/0x158 [] usb_suspend_both+0x1e8/0x288 [] usb_runtime_suspend+0x3c/0x98 [] __rpm_callback+0x48/0x7c [] rpm_callback+0xa8/0xd4 [] rpm_suspend+0x84/0x758 [] rpm_idle+0x2c8/0x498 [] __pm_runtime_idle+0x60/0xac [] usb_autopm_put_interface+0x6c/0x7c [] hub_event+0x10ac/0x12ac [] process_one_work+0x390/0x6b8 [] worker_thread+0x480/0x610 [] kthread+0x164/0x178 [] ret_from_fork+0x10/0x40 kick_hub_wq() is called from hub_activate() even after failures to communicate with the hub. This results in an endless sequence of hub event -> hub activate -> wq trigger -> hub event -> ... Provide two solutions for the problem. - Only trigger the hub event queue if communication with the hub was successful. - After a suspend failure, only resume already suspended interfaces if the communication with the device is still possible. Each of the changes fixes the observed problem. Use both to improve robustness. Signed-off-by: Guenter Roeck --- v3: In hub.c, abort immediately if hub_port_status() returns an error. Since hub_port_status() already logs the error, don't log it again. In device,c, log the error return value from usb_suspend_device() if usb_get_status() failed as well. Don't check if the hub is still accessible if the error returned from hub_port_status() is -EBUSY. v2: Instead of not triggering the hub wq after an error to submit an urb, implement a more complex error detection and handling. Do it in two places. Marked as RFC to determine if one (or both) of those solutions are viable. drivers/usb/core/driver.c | 18 ++++++++++++++++++ drivers/usb/core/hub.c | 5 ++++- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c index cdee5130638b..7ebdf2a4e8fe 100644 --- a/drivers/usb/core/driver.c +++ b/drivers/usb/core/driver.c @@ -1331,6 +1331,24 @@ static int usb_suspend_both(struct usb_device *udev, pm_message_t msg) */ if (udev->parent && !PMSG_IS_AUTO(msg)) status = 0; + + /* + * If the device is inaccessible, don't try to resume + * suspended interfaces and just return the error. + */ + if (status && status != -EBUSY) { + int err; + u16 devstat; + + err = usb_get_status(udev, USB_RECIP_DEVICE, 0, + &devstat); + if (err) { + dev_err(&udev->dev, + "Failed to suspend device, error %d\n", + status); + goto done; + } + } } /* If the suspend failed, resume interfaces that did get suspended */ diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 5a420657f9f7..5ab7cd09e1a4 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -1066,6 +1066,9 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type) portstatus = portchange = 0; status = hub_port_status(hub, port1, &portstatus, &portchange); + if (status) + goto abort; + if (udev || (portstatus & USB_PORT_STAT_CONNECTION)) dev_dbg(&port_dev->dev, "status %04x change %04x\n", portstatus, portchange); @@ -1198,7 +1201,7 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type) /* Scan all ports that need attention */ kick_hub_wq(hub); - +abort: if (type == HUB_INIT2 || type == HUB_INIT3) { /* Allow autosuspend if it was suppressed */ disconnected: -- 2.7.4