Date: Wed, 20 Oct 2021 18:05:03 -0400
From: Alan Stern
To: Krzysztof Kozlowski
Cc: Felipe Balbi, Greg Kroah-Hartman, syzbot, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, syzkaller-bugs@googlegroups.com, Pavel Skripkin, Thierry Escande, Andrey Konovalov
Subject: Re: [syzbot] INFO: task hung in port100_probe
Message-ID: <20211020220503.GB1140001@rowland.harvard.edu>
References: <000000000000c644cd05c55ca652@google.com> <9e06e977-9a06-f411-ab76-7a44116e883b@canonical.com> <20210722144721.GA6592@rowland.harvard.edu>

On Wed, Oct 20, 2021 at 10:56:42PM +0200, Krzysztof Kozlowski wrote:
> Hi Alan, Felipe, Greg and others,
>
> This is an old issue reported by syzkaller for NFC port100 driver [1].
> There is something similar for pn533 [2].
>
> I was looking at it some time ago, took a break and now I am trying to
> fix it again. Without success.
>
> The issue is reproducible via USB gadget on QEMU, not on real HW. I
> looked and debugged the code and I think previously mentioned
> double-URB-submit is not the reason here. Or I miss how the USB works
> (which is quite probable...).
>
> 1. The port100 driver calls port100_send_cmd_sync() which eventually
> goes to port100_send_frame_async(). After it, it waits for "sync"
> completion.
>
> 2. In port100_send_frame_async(), driver indeed first submits "out_urb"
> which quite fast is being processed by dummy_hcd with "no ep configured"
> and -EPROTO.
>
> 3. Then (or sometimes before -EPROTO response from (2) above) the
> port100_send_frame_async() submits "in_urb" via
> port100_submit_urb_for_ack() and waits for its completion. Completion of
> "in_urb" (or the "ack") in port100_recv_ack() would schedule work to
> complete the (1) above - the sync completion.
>
> 4. Usually, when reproducer works fine (does not trigger issue), the
> dummy_timer() from gadget responds with the same "no ep configured for
> urb" for this "in_urb" (3). This completes "in_urb", which eventually
> completes (1) and probe finishes with error. Error is expected, because
> it's random junk-gadget...
>
> The syzkaller reproducer fails if >1 of threads are running these usb
> gadgets. When this happens, no "in_urb" completion happens. No this
> "ack" port100_recv_ack().
>
> I added some debugs and simply dummy_hcd dummy_timer() is woken up on
> enqueuing in_urb and then is looping crazy on a previous URB (some older
> URB, coming from before port100 driver probe started). The dummy_timer()
> loop never reaches the second "in_urb" to process it, I think.

Is there any way you can track down what's happening in that crazy
loop? That is, what driver was responsible for the previous URB?

We have seen this sort of thing before, where a driver submits an URB
for a gadget which has disconnected. The URB fails with -EPROTO status
but the URB's completion handler does an automatic resubmit.
That can lead to a very tight loop with dummy-hcd, and it could easily
prevent some other important processing from occurring. The simple
solution is to prevent the driver from resubmitting when the completion
status is -EPROTO.

Alan Stern

> The pn533 NFC driver has similar design, but I have now really doubts it
> is a NFC driver issue. Instead an issue in dummy gadget HCD is somehow
> triggered by the reproducer.
>
> Reproduction - just follow [1] or [2]. Eventually I slightly tweaked the
> code and put here:
> https://github.com/krzk/tools/tree/master/tests-var/nfc/port100_probe
> $ make
> $ sudo ./port100_probe
>
>
> [1] https://syzkaller.appspot.com/bug?extid=abd2e0dafb481b621869
> [2] https://syzkaller.appspot.com/bug?extid=1dc8b460d6d48d7ef9ca
>
>
> Best regards,
> Krzysztof
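
For illustration only, the resubmit loop described in the reply above, and
the suggested -EPROTO check, can be sketched as a small userspace
simulation. This is not kernel code and not taken from any driver in this
thread: struct fake_urb, fake_hcd_complete(), should_resubmit() and
run_urb() are hypothetical names invented for the sketch, and the "host
controller" here simply fails every URB with -EPROTO, as would happen once
the gadget has gone away. In a real driver the equivalent check would sit
in the URB completion handler, before any call to usb_submit_urb().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct urb; only what the sketch needs. */
struct fake_urb {
	int status;        /* completion status set by the fake HCD */
	int submit_count;  /* number of times the URB was submitted */
};

/* Fake host controller: the gadget is gone, so every URB fails. */
static void fake_hcd_complete(struct fake_urb *urb)
{
	urb->status = -EPROTO;
}

/*
 * Decision a completion handler makes about resubmitting.
 * With check_eproto set, an -EPROTO status stops the resubmit.
 */
static bool should_resubmit(const struct fake_urb *urb, bool check_eproto)
{
	if (check_eproto && urb->status == -EPROTO)
		return false;
	return true;
}

/*
 * Submit the URB, let it complete, and resubmit from the "completion
 * handler" until it declines or max_rounds is reached. The cap exists
 * only so the simulation terminates. Returns the submission count.
 */
int run_urb(bool check_eproto, int max_rounds)
{
	struct fake_urb urb = { .status = 0, .submit_count = 0 };

	assert(max_rounds > 0);

	for (int i = 0; i < max_rounds; i++) {
		urb.submit_count++;
		fake_hcd_complete(&urb);
		if (!should_resubmit(&urb, check_eproto))
			break;
	}
	return urb.submit_count;
}
```

Calling run_urb(false, 1000) uses up all 1000 rounds, which stands in for
the tight loop, while run_urb(true, 1000) stops after a single submission.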