Received: by 2002:a05:7412:8521:b0:e2:908c:2ebd with SMTP id t33csp546575rdf; Fri, 3 Nov 2023 08:06:24 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHz7cBaRH+hUQv4OSBtQ+zArxhcX1jua3mUNZhdN4ksK4qffCG2OuwXyHoI1vBHHk1MZ2+/ X-Received: by 2002:a05:6a00:93aa:b0:6be:bf7:fda5 with SMTP id ka42-20020a056a0093aa00b006be0bf7fda5mr21564447pfb.12.1699023984334; Fri, 03 Nov 2023 08:06:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1699023984; cv=none; d=google.com; s=arc-20160816; b=fQ7bt8rZEeFfdTorjz6MavefDy4NSyWsk2OvVmXCbVUfYMYCgEr9y1W9I5wqdleX9f O3uYr+NaqPd4arxK+uppM5nvUoMWR3Jo0bGOKqNFBAYCm34PV9uKnExSmkTmk2QULr1S T7zsJ0I3BE5xwrqXaxELvllayzHCfPIWJPBSYyBkTwh0rtiWoK+yMfsNVd/aWU1ofFoh ZoKb+kI55c5oqBe7qne+HbwbxBPtDgBdk027iXzieru1lPsm23aKa4jfBgHdkCG5DoKM LQTJhN8VHFPjobs81DS4kV+v/r2hkQkzRyszrcjWiqWS21vZEWnTlXbWqHF3KkiO1/wn qcyA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=JK3b25HDYMngzkZZQfDOqwXLyQ4xfGpYvmYwE3sHJSc=; fh=ITwnhg9QG1y9WcOz9FDOiZ0/i5A9ptOY1SXeLd9bMwo=; b=W3skNdFfRjZJztyzffEtbPEsNo3uwvidKYusA18wlos2tLXY4pGxSOrrJqh3E8l2oJ JC3KodGyK1Aeob/fJ6eqJ8yrz0ExQyE4EWKdFe6FqMAxdrGGiDH/yxc5vKqjP5jgP/F/ 2ltP2vrhcFUb7vz/+8aVb6eQwqqcO5AEPiGP2ebD2lV0oNmynG/29o6tYBtI82Rx9kZu YnwHYzeUmYFGUZPWuvUJe79irxzABhwsQqBr9OLbZ9XFwTm8fuw3Zz78Y2HlmhMYmnyz tMm6BxUndYOUmxTCz4N5crW1fpt6ACGyNl+FBK5dJ+DiXBpjJeddgUnv8qBHbMsGSptm qN7Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:3 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=harvard.edu Return-Path: Received: from lipwig.vger.email (lipwig.vger.email. [2620:137:e000::3:3]) by mx.google.com with ESMTPS id m126-20020a632684000000b005b8f6936eb4si1592820pgm.773.2023.11.03.08.06.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Nov 2023 08:06:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:3 as permitted sender) client-ip=2620:137:e000::3:3; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:3 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=harvard.edu Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by lipwig.vger.email (Postfix) with ESMTP id 9A9CB8061CF7; Fri, 3 Nov 2023 08:05:34 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at lipwig.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232823AbjKCPEr (ORCPT + 99 others); Fri, 3 Nov 2023 11:04:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232831AbjKCPEp (ORCPT ); Fri, 3 Nov 2023 11:04:45 -0400 Received: from netrider.rowland.org (netrider.rowland.org [192.131.102.5]) by lindbergh.monkeyblade.net (Postfix) with SMTP id 63964D56 for ; Fri, 3 Nov 2023 08:04:38 -0700 (PDT) Received: (qmail 850690 invoked by uid 1000); 3 Nov 2023 11:04:36 -0400 Date: Fri, 3 Nov 2023 11:04:36 -0400 From: Alan Stern To: Muhammad Usama Anjum Cc: syzbot , gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org, syzkaller-bugs@googlegroups.com Subject: Re: [syzbot] [usb?] INFO: task hung in hub_port_init (3) Message-ID: References: <000000000000704d6305fdb75642@google.com> <88cc734c-2a88-4495-aa1e-f16294eb6cea@collabora.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <88cc734c-2a88-4495-aa1e-f16294eb6cea@collabora.com> X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Fri, 03 Nov 2023 08:05:34 -0700 (PDT) On Fri, Nov 03, 2023 at 07:03:20PM +0500, Muhammad Usama Anjum wrote: > On 6/10/23 12:25 AM, syzbot wrote: > > syzbot has found a reproducer for the following issue on: > > > > HEAD commit: 33f2b5785a2b Merge tag 'drm-fixes-2023-06-09' of git://ano.. > > git tree: upstream > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=1206f143280000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=3c980bfe8b399968 > > dashboard link: https://syzkaller.appspot.com/bug?extid=b6f11035e572f08bc20f > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1676f51b280000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=129632fd280000 > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/a817d99af39d/disk-33f2b578.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/8916e1d053fc/vmlinux-33f2b578.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/e53956f3cfd4/bzImage-33f2b578.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+b6f11035e572f08bc20f@syzkaller.appspotmail.com > > > > INFO: task kworker/0:2:901 blocked for more than 143 seconds. > > Not tainted 6.4.0-rc5-syzkaller-00178-g33f2b5785a2b #0 > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > task:kworker/0:2 state:D stack:26800 pid:901 ppid:2 flags:0x00004000 > > Workqueue: usb_hub_wq hub_event > > Call Trace: > > > > context_switch kernel/sched/core.c:5343 [inline] > > __schedule+0xc9a/0x5880 kernel/sched/core.c:6669 > > schedule+0xde/0x1a0 kernel/sched/core.c:6745 > > usb_kill_urb.part.0+0x19a/0x220 drivers/usb/core/urb.c:728 > > usb_kill_urb+0x83/0xa0 drivers/usb/core/urb.c:717 > > usb_start_wait_urb+0x24a/0x4b0 drivers/usb/core/message.c:64 > > usb_internal_control_msg drivers/usb/core/message.c:102 [inline] > > usb_control_msg+0x320/0x4a0 drivers/usb/core/message.c:153 > > hub_port_init+0x14f3/0x3900 drivers/usb/core/hub.c:4874 > > hub_port_connect drivers/usb/core/hub.c:5336 [inline] > > hub_port_connect_change drivers/usb/core/hub.c:5551 [inline] > > port_event drivers/usb/core/hub.c:5711 [inline] > > hub_event+0x2b89/0x4e40 drivers/usb/core/hub.c:5793 > > process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405 > > worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552 > > kthread+0x344/0x440 kernel/kthread.c:379 > > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 > > > > INFO: task syz-executor104:5004 blocked for more than 143 seconds. > This is being reproduced on linux-next. The USB IP is being fuzzed. I'd > modified the reproducer to try to understand the issue. The execution of > application creates kworkers (can be found in hub->events). One of the usb > hub kworker gets stuck because of wrong use_count. I don't know USB side of > logic. But a worker shouldn't go to sleep indefinitely. I doubt that the problem is a wrong value for use_count. More likely it's a bug in the usbip driver. > My debug logs are as following which can help an expert USB developer to > pin point the problem: > > [ 118.904272][ T9] usb_submit_urb urb->use_count: 0 > [ 118.904942][ T9] usb_hcd_submit_urb urb->use_count: 1 > [ 118.905715][ T9] usb_submit_urb urb->use_count: 1 > [ 118.906428][ T9] usb_start_wait_urb urb->use_count: 1 > [ 123.938978][ T9] usb_kill_urb use_count: 1 These don't mean much because they don't give the address of urb, so we don't know if the various lines all refer to the same URB or to different ones. > At this point, wait is being done on usb_kill_urb_queue as use_count isn't > zero and no event on usb_kill_urb_queue is received. Right. The usbip driver is supposed to terminate the URB in a timely fashion (because usb_kill_urb() calls usb_hcd_unlink_urb()), but it isn't doing so. When the URB completes, the event will occur. > The comment for usb_kill_urb is: > > * This routine may not be used in an interrupt context (such as a bottom > > * half or a completion handler), or when holding a spinlock, or in other > > * situations where the caller can't schedule(). > > But several locks are held by this kworker and sleeps indefinitely. No spinlocks are held, only mutexes. The difference is that a task is allowed to sleep while holding a mutex, but it's not allowed to sleep while holding a spinlock. If you want to fix this problem (and probably a bunch of other ones in syzbot's list of pending bugs), figure out what's wrong with the ->urb_dequeue() callback routine in the usbip driver and fix it. Alan Stern