Received: by 10.213.65.68 with SMTP id h4csp205987imn; Sat, 24 Mar 2018 19:05:31 -0700 (PDT) X-Google-Smtp-Source: AG47ELtUKtIkBuE31dlXDF97WVBgsWYA7iuoJ1DKf4YJ1HE44KJyDMD0QoX+VJI3yWis1+bNQGKw X-Received: by 10.101.99.26 with SMTP id g26mr18891751pgv.442.1521943531483; Sat, 24 Mar 2018 19:05:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521943531; cv=none; d=google.com; s=arc-20160816; b=liVf0ECQGL0+gWXoTZ0akQKT2k+9MyaPHXCxLA/4oA4Ui4+0l/CH7WQn1oV+EVVZm0 3mJMmXuUMXh89AfP3fGrrnZ/jkohLithgovfQ5LHCh3RoCQFWkffJEsxCM+1gDRCwiNJ 75ZKc81oC8yv4Z9rrYGv6qQI8Qqa6djSUh4dzPJvZXw8zPiyoh7GQrWHtP5Hp1r83QfQ qb/NaSJcT789jzjT7noi3sFmfdnqXCWGa8iO6sg1gGi+W2te0abfj3X+ybgCN/MwOqZM yJYShJGGKnbCf/odMc0jTK4q3FH/SlN8AYvdBElCdVjK+a9wA9k4sAyzIn28+TnXfOFT Ej2g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=GN+EWIa2dGFeSWapMIhXp4zTwS6C+9lnjEPCPSFRLq4=; b=qXdBjp3thdf8DwgzsvvMOGCliaofd4h/AvJoh9n6dbPSGyo6dRQklPHH7liGUBDkp+ rfS/Jw6xaWhDIe+1KVXb15t1Q5p6c+iHEkxRV94/olH8kS0RhWmGwd5ZEvkmTIJKPwOB AFgTTRXTbrOcLECCrsLLSNk6jIkTXLfZy4Pnp5JYyHeFNDtemXLQ5+uzhMJy/SFXWdGm 7GWgKY2op1hoyLTlYFctfg80WQPPfMLwMRnmLaWL84YpnsXsETxE9jDhADDV1ETABuBy o1a8JzpSipz+QlaMStZKqluRiem+EJvevWVCcYyeTH4ayejak8Yj+vLgqkggKwIULUD0 g1lw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Ch708tfU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d21-v6si10577357pll.557.2018.03.24.19.05.03; Sat, 24 Mar 2018 19:05:31 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Ch708tfU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753182AbeCYCDn (ORCPT + 99 others); Sat, 24 Mar 2018 22:03:43 -0400 Received: from mail-lf0-f65.google.com ([209.85.215.65]:39687 "EHLO mail-lf0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752939AbeCYCDl (ORCPT ); Sat, 24 Mar 2018 22:03:41 -0400 Received: by mail-lf0-f65.google.com with SMTP id p142-v6so23377864lfd.6; Sat, 24 Mar 2018 19:03:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=GN+EWIa2dGFeSWapMIhXp4zTwS6C+9lnjEPCPSFRLq4=; b=Ch708tfUMdiTiw6XIvUZrypEVhsXw+YUMpv9yGENKy2iG2Pnb1D+R9Kvr/Z6GBm13U pUxJiSZFNFGt/FxVxaQahM7YjNGl+EmWGeYg6haZETEwd0T79PP/793DG+sO1ztIqYs2 h1v0PM/7YFdad4V2/GHJyBLXTLaLWGwzc4QVVRrTQcxw/kCQJHeIAR/qdpX91SnaBbJa 4BhXMZtgRYyeMgAOVR9sf0/5Yq+5p2Cgk3Ec7vVFbAM1J527zYpIIJQyIOza4tR1gqVn QewqnW346VI6j9uaF4F4pOou5zz6tgAfb8n9f77O8kcpCPLNwBfsV3wE7vkbXhkT5z5S EN0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=GN+EWIa2dGFeSWapMIhXp4zTwS6C+9lnjEPCPSFRLq4=; b=jb+Nsg39gEhRAOJqpxwrIt+Qs3M2j8B62/Kwuus22D34ded4ajQPi9HRaKb0TAnLel Hn5i3LASs0fGvjQAnM2E6boRCH6Jwq6rG1LrJ23ZNFoYs4n+/4KSFYkudYMY7thFa3oS GDOlYN7phD0ms86bjtKbImsRjNh4HJZ5B+QnvIogkeGpMkLW9A6Pr6M+dT6sy2fmcA+g r2Kk38DTZIf4LrI0vWc1x9hf4H/i63AtkZ8CTIxuQypg9ov2z8oKsEsJYD0PQ3XNaM1r tm8s6ZvFvtG5NV1CcdODvil9qpIpfBk6mS3EAS+6wuzzyPB7ozusy11IDaBXb27gZCDi tIcA== X-Gm-Message-State: AElRT7GpoaotGNpHn5DHpcT2RQAezDgDH1hRVh18Z81+cUlxpNCpe9K4 Q2OEOy9+cv/QOl/JF5lCwJeVaZ1udiQnx2OOPQI= X-Received: by 2002:a19:59d1:: with SMTP id n200-v6mr22307436lfb.84.1521943419642; Sat, 24 Mar 2018 19:03:39 -0700 (PDT) MIME-Version: 1.0 Received: by 10.46.17.84 with HTTP; Sat, 24 Mar 2018 19:03:39 -0700 (PDT) In-Reply-To: References: <20180208035501.10711-1-jeffy.chen@rock-chips.com> From: Jonathan Liu Date: Sun, 25 Mar 2018 13:03:39 +1100 Message-ID: Subject: Re: [v6] usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks() To: Jeffy Chen Cc: linux-kernel , briannorris@chromium.org, stern@rowland.harvard.edu, mka@chromium.org, dianders@chromium.org, AMAN DEEP , stable@vger.kernel.org, Greg Kroah-Hartman , linux-usb@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On 25 March 2018 at 12:21, Jonathan Liu wrote: > Hi, > > On 8 February 2018 at 14:55, Jeffy Chen wrote: >> From: AMAN DEEP >> >> There is a race condition between finish_unlinks->finish_urb() function >> and usb_kill_urb() in ohci controller case. The finish_urb calls >> spin_unlock(&ohci->lock) before usb_hcd_giveback_urb() function call, >> then if during this time, usb_kill_urb is called for another endpoint, >> then new ed will be added to ed_rm_list at beginning for unlink, and >> ed_rm_list will point to newly added. >> >> When finish_urb() is completed in finish_unlinks() and ed->td_list >> becomes empty as in below code (in finish_unlinks() function): >> >> if (list_empty(&ed->td_list)) { >> *last = ed->ed_next; >> ed->ed_next = NULL; >> } else if (ohci->rh_state == OHCI_RH_RUNNING) { >> *last = ed->ed_next; >> ed->ed_next = NULL; >> ed_schedule(ohci, ed); >> } >> >> The *last = ed->ed_next will make ed_rm_list to point to ed->ed_next >> and previously added ed by usb_kill_urb will be left unreferenced by >> ed_rm_list. This causes usb_kill_urb() hang forever waiting for >> finish_unlink to remove added ed from ed_rm_list. >> >> The main reason for hang in this race condtion is addition and removal >> of ed from ed_rm_list in the beginning during usb_kill_urb and later >> last* is modified in finish_unlinks(). >> >> As suggested by Alan Stern, the solution for proper handling of >> ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing >> any URBs. Then at the end, we can add ed back to the list if necessary. >> >> This properly handle the updated ohci->ed_rm_list in usb_kill_urb(). >> >> Fixes:977dcfdc6031("USB:OHCI:don't lose track of EDs when a controller dies") >> Acked-by: Alan Stern >> CC: >> Signed-off-by: Aman Deep >> Signed-off-by: Jeffy Chen >> --- >> >> Changes in v6: >> This is a resend of Aman Deep's v5 patch [0], which solved the hang we >> hit [1]. (Thanks Aman :) >> >> The v5 has some format issues, so i slightly adjust the commit message. >> >> [0] https://www.spinics.net/lists/linux-usb/msg129010.html >> [1] https://bugs.chromium.org/p/chromium/issues/detail?id=803749 >> >> drivers/usb/host/ohci-q.c | 17 ++++++++++------- >> 1 file changed, 10 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/usb/host/ohci-q.c b/drivers/usb/host/ohci-q.c >> index b2ec8c399363..4ccb85a67bb3 100644 >> --- a/drivers/usb/host/ohci-q.c >> +++ b/drivers/usb/host/ohci-q.c >> @@ -1019,6 +1019,8 @@ static void finish_unlinks(struct ohci_hcd *ohci) >> * have modified this list. normally it's just prepending >> * entries (which we'd ignore), but paranoia won't hurt. >> */ >> + *last = ed->ed_next; >> + ed->ed_next = NULL; >> modified = 0; >> >> /* unlink urbs as requested, but rescan the list after >> @@ -1077,21 +1079,22 @@ static void finish_unlinks(struct ohci_hcd *ohci) >> goto rescan_this; >> >> /* >> - * If no TDs are queued, take ED off the ed_rm_list. >> + * If no TDs are queued, ED is now idle. >> * Otherwise, if the HC is running, reschedule. >> - * If not, leave it on the list for further dequeues. >> + * If the HC isn't running, add ED back to the >> + * start of the list for later processing. >> */ >> if (list_empty(&ed->td_list)) { >> - *last = ed->ed_next; >> - ed->ed_next = NULL; >> ed->state = ED_IDLE; >> list_del(&ed->in_use_list); >> } else if (ohci->rh_state == OHCI_RH_RUNNING) { >> - *last = ed->ed_next; >> - ed->ed_next = NULL; >> ed_schedule(ohci, ed); >> } else { >> - last = &ed->ed_next; >> + ed->ed_next = ohci->ed_rm_list; >> + ohci->ed_rm_list = ed; >> + /* Don't loop on the same ED */ >> + if (last == &ohci->ed_rm_list) >> + last = &ed->ed_next; >> } >> >> if (modified) > > I am experiencing a USB function call hang from userspace with OCHI > (full speed USB device) after updating from Linux 4.14.15 to 4.14.24 > and noticed this commit. > > Here is the Linux 4.14.24 kernel stack trace (extracted from SysRq+w > and amended with addr2line): > [] (__schedule) from [] (schedule+0x50/0xb4) > kernel/sched/core.c:2792 > [] (schedule) from [] > (usb_kill_urb.part.3+0x78/0xa8) include/asm-generic/preempt.h:59 > [] (usb_kill_urb.part.3) from [] > (usbdev_ioctl+0x1288/0x1cf0) drivers/usb/core/urb.c:690 > [] (usbdev_ioctl) from [] > (do_vfs_ioctl+0x9c/0x8ec) drivers/usb/core/devio.c:1835 > [] (do_vfs_ioctl) from [] (SyS_ioctl+0x34/0x5c) > fs/ioctl.c:47 > [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x54) > include/linux/file.h:39 > > Afterwards the kernel is unresponsive to disconnect/connect of the > full speed USB device but I can connect/disconnect a high speed USB > device to the same port and communicate to it without problem since it > uses EHCI (OHCI is companion controller). If I try to connect the full > speed USB device again it is still unresponsive. The userspace > application is still hanging after all this. > > Could this commit be causing the issue? Note that I have been running with OHCI_QUIRK_QEMU added to ochi->flags in ohci_init of drivers/usb/host/ohci-hcd.c for both 4.14.15 and 4.14.24 kernels due to a race condition that was later addressed by https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.14.23&id=85c3d26bd754160f6be90d8b078d70a1b35820e7, so io_watchdog_func of drivers/usb/host/ohci-hcd.c is not being run. Thanks. Regards, Jonathan