From: Felipe Balbi
To: Thinh Nguyen, Wesley Cheng, Thinh Nguyen, "gregkh@linuxfoundation.org"
Cc: "linux-usb@vger.kernel.org", "linux-kernel@vger.kernel.org", "jackp@codeaurora.org"
Subject: Re: [PATCH v2] usb: dwc3: gadget: Avoid canceling current request for queuing error
References: <1620091264-418-1-git-send-email-wcheng@codeaurora.org>
 <5b46e4a1-93ef-2d17-048b-5b4ceba358ae@synopsys.com>
 <513e6c16-9586-c78e-881b-08e0a73c50a8@codeaurora.org>
 <7ef627cf-3f8f-8a52-52c4-ac67ab48b87d@codeaurora.org>
 <5c06dc0a-4274-b6f0-3844-bd8afa1a59f9@synopsys.com>
Date: Wed, 05 May 2021 15:59:27 +0300
Message-ID: <87zgx9gwuo.fsf@kernel.org>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha256; protocol="application/pgp-signature"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

--=-=-=
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable


Hi,

Thinh Nguyen writes:
>>>>>> allocate io_data (ffs)
>>>>>> --> usb_ep_queue()
>>>>>> --> __dwc3_gadget_kick_transfer()
>>>>>> --> dwc3_send_gadget_ep_cmd(EINVAL)
>>>>>> --> dwc3_gadget_ep_cleanup_cancelled_requests()
>>>>>> --> dwc3_gadget_giveback(ECONNRESET)
>>>>>> ffs completion callback
>>>>>> queue work item within io_data
>>>>>> --> usb_ep_queue returns EINVAL
>>>>>> ffs frees io_data
>>>>>> ...
>>>>>>
>>>>>> work scheduled
>>>>>> --> NULL pointer/memory fault as io_data is freed
>>>>
>>>> Hi Thinh,
>>>>
>>>>>
>>>>> sounds like a race issue.
>>>>>
>>>>
>>>> It'll always happen if usb_ep_queue() fails with an error. Sorry for not
>>>> clarifying, but the "..." represents executing in a different context
>>>> :). Anything above the "..." is in the same context.
>>>>>>
>>>>>>> BTW, what kinds of command and error do you see in your setup and for
>>>>>>> what type endpoint? I'm thinking of letting the function driver to
>>>>>>> dequeue the requests instead of letting dwc3 automatically
>>>>>>> ending/cancelling the queued requests. However, it's a bit tricky to do
>>>>>>> that if the error is -ETIMEDOUT since we're not sure if the controller
>>>>>>> had already cached the TRBs.
>>>>>>>
>>>>>>
>>>>>> Happens on bulk EPs so far, but I think it wouldn't matter as long as
>>>>>> its over the FFS interface. (and using async IO transfers)
>>>>>
>>>>> Do you know which command and error code?
>>>>> It's strange if
>>>>> UPDATE_TRANSFER command failed.
>>>>>
>>>>
>>>> Sorry for missing that part of the question. It is a no xfer resource
>>>> error on a start transfer command. So far this happens on low system
>>>> memory test cases, so there may be some sequences that were missed,
>>>> which led to this particular command error.
>>>>
>>>> Thanks
>>>> Wesley Cheng
>>
>> Hi Thinh,
>>
>>>
>>> No xfer resource usually means that the driver attempted to send
>>> START_TRANSFER without waiting for END_TRANSFER command to complete.
>>> This may be a dwc3 driver issue. Did you check this?
>>>
>>> Thanks,
>>> Thinh
>>>
>>>
>>
>> Yes, we know the reason why this happens, and its due to one of the
>> downstream changes we had that led to the scenario above. Although,
>> that has been fixed, I still believe the error path is a potential
>> scenario we'd still want to address.
>>
>> I think the returning success always on dwc3_gadget_ep_queue(), and
>> allowing the error in the completion handler/giveback at the function
>> driver level to do the cleanup is a feasible solution. Doesn't change
>> the flow of the DWC3 gadget, and so far all function drivers we've used
>> handle this in the correct manner.
>>
>> Thanks
>> Wesley Cheng
>
> Right. I think for now we should do that (return success always except
> for cases of disconnect or already in-flight etc). This helps keeping it

no, let's not lie to our users ;-)

> simple and avoid some pitfalls dealing with giving back the request.
> Currently we return the error status to dwc3_gadget_ep_queue if we
> failed to send a command that may not even related to the same request
> being queued.

I think the fix should be simple, but we're trying to patch it in the
wrong way. Can y'all comment on my suggestion on the other subthread?

-- 
balbi

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQFFBAEBCAAvFiEE9DumQ60WEZ09LIErzlfNM9wDzUgFAmCSlq8RHGJhbGJpQGtl
cm5lbC5vcmcACgkQzlfNM9wDzUiKywf+Kf6+pp3/TXCFlwOeZsJ9yrd9oTCv/wYu
N1Q02wXnbRRuDioIRMYYBcRrpC7KV7/I5bieDSEoZuNvMl0lZ0HJ3dYjMpKzX9gE
XsLDFFChrs13HCs8ETPNOtbMAAPF9ljnRvlMns4y4jLRntUwzRUxLxpc8acI1ufa
A3ss5cDbbmXig6SHOeyHysCOWAndGSN0zPjT2zrTKdmOBKjZkB5hhkE9ZMiMm0ng
mL24HmTtRk6sa544/+VQtbtwCT+COiLH/LLxQxsI/LPgcZSuQ+o7ow7w52pbeLHN
slo5zwkDTttCCFVYYmAY1DmTB5UEX0ctjFXL7uGvpRPvTAWYOX8zjQ==
=RbDY
-----END PGP SIGNATURE-----
--=-=-=--