In-Reply-To: <20180208035501.10711-1-jeffy.chen@rock-chips.com>
References: <20180208035501.10711-1-jeffy.chen@rock-chips.com>
From: Jonathan Liu
Date: Sun, 25 Mar 2018 12:21:48 +1100
Subject: Re: [v6] usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks()
To: Jeffy Chen
Cc: linux-kernel, briannorris@chromium.org, stern@rowland.harvard.edu, mka@chromium.org, dianders@chromium.org, AMAN DEEP, stable@vger.kernel.org, Greg Kroah-Hartman, linux-usb@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 8 February 2018 at 14:55, Jeffy Chen wrote:
> From: AMAN DEEP
>
> There is a race condition between finish_unlinks->finish_urb() function
> and usb_kill_urb() in ohci controller case. The finish_urb calls
> spin_unlock(&ohci->lock) before usb_hcd_giveback_urb() function call,
> then if during this time, usb_kill_urb is called for another endpoint,
> then new ed will be added to ed_rm_list at beginning for unlink, and
> ed_rm_list will point to newly added.
>
> When finish_urb() is completed in finish_unlinks() and ed->td_list
> becomes empty as in below code (in finish_unlinks() function):
>
>         if (list_empty(&ed->td_list)) {
>                 *last = ed->ed_next;
>                 ed->ed_next = NULL;
>         } else if (ohci->rh_state == OHCI_RH_RUNNING) {
>                 *last = ed->ed_next;
>                 ed->ed_next = NULL;
>                 ed_schedule(ohci, ed);
>         }
>
> The *last = ed->ed_next will make ed_rm_list to point to ed->ed_next
> and previously added ed by usb_kill_urb will be left unreferenced by
> ed_rm_list. This causes usb_kill_urb() hang forever waiting for
> finish_unlink to remove added ed from ed_rm_list.
>
> The main reason for hang in this race condtion is addition and removal
> of ed from ed_rm_list in the beginning during usb_kill_urb and later
> last* is modified in finish_unlinks().
>
> As suggested by Alan Stern, the solution for proper handling of
> ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing
> any URBs. Then at the end, we can add ed back to the list if necessary.
>
> This properly handle the updated ohci->ed_rm_list in usb_kill_urb().
>
> Fixes:977dcfdc6031("USB:OHCI:don't lose track of EDs when a controller dies")
> Acked-by: Alan Stern
> CC:
> Signed-off-by: Aman Deep
> Signed-off-by: Jeffy Chen
> ---
>
> Changes in v6:
> This is a resend of Aman Deep's v5 patch [0], which solved the hang we
> hit [1]. (Thanks Aman :)
>
> The v5 has some format issues, so i slightly adjust the commit message.
>
> [0] https://www.spinics.net/lists/linux-usb/msg129010.html
> [1] https://bugs.chromium.org/p/chromium/issues/detail?id=803749
>
>  drivers/usb/host/ohci-q.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/usb/host/ohci-q.c b/drivers/usb/host/ohci-q.c
> index b2ec8c399363..4ccb85a67bb3 100644
> --- a/drivers/usb/host/ohci-q.c
> +++ b/drivers/usb/host/ohci-q.c
> @@ -1019,6 +1019,8 @@ static void finish_unlinks(struct ohci_hcd *ohci)
>  		 * have modified this list.  normally it's just prepending
>  		 * entries (which we'd ignore), but paranoia won't hurt.
>  		 */
> +		*last = ed->ed_next;
> +		ed->ed_next = NULL;
>  		modified = 0;
>
>  		/* unlink urbs as requested, but rescan the list after
> @@ -1077,21 +1079,22 @@ static void finish_unlinks(struct ohci_hcd *ohci)
>  			goto rescan_this;
>
>  		/*
> -		 * If no TDs are queued, take ED off the ed_rm_list.
> +		 * If no TDs are queued, ED is now idle.
>  		 * Otherwise, if the HC is running, reschedule.
> -		 * If not, leave it on the list for further dequeues.
> +		 * If the HC isn't running, add ED back to the
> +		 * start of the list for later processing.
>  		 */
>  		if (list_empty(&ed->td_list)) {
> -			*last = ed->ed_next;
> -			ed->ed_next = NULL;
>  			ed->state = ED_IDLE;
>  			list_del(&ed->in_use_list);
>  		} else if (ohci->rh_state == OHCI_RH_RUNNING) {
> -			*last = ed->ed_next;
> -			ed->ed_next = NULL;
>  			ed_schedule(ohci, ed);
>  		} else {
> -			last = &ed->ed_next;
> +			ed->ed_next = ohci->ed_rm_list;
> +			ohci->ed_rm_list = ed;
> +			/* Don't loop on the same ED */
> +			if (last == &ohci->ed_rm_list)
> +				last = &ed->ed_next;
>  		}
>
>  		if (modified)

I am experiencing a USB function call hang from userspace with OHCI
(full speed USB device) after updating from Linux 4.14.15 to 4.14.24
and noticed this commit.

Here is the Linux 4.14.24 kernel stack trace (extracted from SysRq+w and
amended with addr2line):

[] (__schedule) from [] (schedule+0x50/0xb4) kernel/sched/core.c:2792
[] (schedule) from [] (usb_kill_urb.part.3+0x78/0xa8) include/asm-generic/preempt.h:59
[] (usb_kill_urb.part.3) from [] (usbdev_ioctl+0x1288/0x1cf0) drivers/usb/core/urb.c:690
[] (usbdev_ioctl) from [] (do_vfs_ioctl+0x9c/0x8ec) drivers/usb/core/devio.c:1835
[] (do_vfs_ioctl) from [] (SyS_ioctl+0x34/0x5c) fs/ioctl.c:47
[] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x54) include/linux/file.h:39

Afterwards the kernel is unresponsive to disconnect/connect of the full
speed USB device, but I can connect/disconnect a high speed USB device
on the same port and communicate with it without problems, since it uses
EHCI (OHCI is the companion controller). If I try to connect the full
speed USB device again, it is still unresponsive. The userspace
application is still hanging after all this.

Could this commit be causing the issue?

Thanks.

Regards,
Jonathan