Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752588AbdHOMCy (ORCPT ); Tue, 15 Aug 2017 08:02:54 -0400 Received: from mga14.intel.com ([192.55.52.115]:54655 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752552AbdHOMCw (ORCPT ); Tue, 15 Aug 2017 08:02:52 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.41,377,1498546800"; d="asc'?scan'208";a="139781867" From: Felipe Balbi To: Danilo Krummrich Cc: linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org, gregkh@linuxfoundation.org, k.opasiak@samsung.com Subject: Re: [PATCH] usb: gadget: udc: udc_stop before gadget unbind In-Reply-To: <0c918df8b95fba6ad6849cf9e67c59d6@dk-develop.de> References: <20170814163652.29108-1-danilokrummrich@dk-develop.de> <87efsd6tcn.fsf@linux.intel.com> <0c918df8b95fba6ad6849cf9e67c59d6@dk-develop.de> Date: Tue, 15 Aug 2017 15:02:40 +0300 Message-ID: <878til6ntr.fsf@linux.intel.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5831 Lines: 152 --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hi, Danilo Krummrich writes: > thanks for reviewing. np :-) > On 2017-08-15 12:03, Felipe Balbi wrote: >> Hi, >>=20 >> Danilo Krummrich writes: >>> udc_stop needs to be called before gadget driver unbind. Otherwise it >>> might happen that udc drivers still call into the gadget driver (e.g. >>> to reset gadget after OTG event). If this happens it is likely to get >>> panics from gadget driver dereferencing NULL ptr, as gadget's drvdata >>> is set to NULL on unbind. >>=20 >> seems like the problem here is with the OTG layer, not UDC core. >>=20 > I mentioned this just as example, it can happen whenever a UDC driver=20 > calls > the gadget driver (e.g. by calling usb_gadget_udc_reset() in ISR) after=20 > gadget > drivers unbind() was called already (e.g. by gadget configfs). > If this happens gadget drivers drvdata was already set to NULL by=20 > unbind() > and reset() could result into a NULL ptr exception. > Therefore my assumption was that it needs to be prevented that the=20 > gadget > driver is getting called after unbind. We have a known problem in the design of the gadget API that causes this races but we couldn't come up with a solution yet :-) Inverting these two calls is not the correct way to go about this :-) >>> Signed-off-by: Danilo Krummrich >>> --- >>> Actually there could still be a race: >>> (CPU1 code taken from dwc3 drivers dwc3_disconnect_gadget() as=20 >>> exsample) >>>=20 >>> CPU0 CPU1 >>> ---- ---- >>> usb_gadget_disconnect(udc->gadget); >>> udc->driver->disconnect(udc->gadget); >>> if (dwc->gadget_driver && dwc->gadget_driver->disconnect) >>> usb_gadget_udc_stop(udc); >>> udc->driver->unbind(udc->gadget); >>> dwc->gadget_driver->disconnect(&dwc->gadget); >>>=20 >>> UDC drivers typically set their gadget driver pointer to NULL in=20 >>> udc_stop >>> and check for it before calling into the gadget driver. To fix the=20 >>> issue >>> above every udc driver could apply a lock around this. >>>=20 >>> If you see the need for having this or another solutions I can provide >>> further patches. This patch could also just serve as a base for=20 >>> discussion >>> if someone knows a smarter solution. >>>=20 >>> I saw this problem causing a panic on hikey960 board and provided a=20 >>> quick >>> workaround for the same problem here: >>> https://android-review.googlesource.com/#/c/kernel/common/+/457476/ >>> (panic log in the commit message of the linked patch) >>> --- >>> drivers/usb/gadget/udc/core.c | 6 +++++- >>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>=20 >>> diff --git a/drivers/usb/gadget/udc/core.c=20 >>> b/drivers/usb/gadget/udc/core.c >>> index efce68e9a8e0..8155468afc0d 100644 >>> --- a/drivers/usb/gadget/udc/core.c >>> +++ b/drivers/usb/gadget/udc/core.c >>> @@ -1234,8 +1234,12 @@ static void usb_gadget_remove_driver(struct=20 >>> usb_udc *udc) >>>=20 >>> usb_gadget_disconnect(udc->gadget); >>> udc->driver->disconnect(udc->gadget); >>> - udc->driver->unbind(udc->gadget); >>> + /* udc_stop needs to be called before gadget driver unbind to=20 >>> prevent >>> + * udc driver calls into gadget driver after unbind which could=20 >>> cause >>> + * a nullptr exception. >>> + */ >>> usb_gadget_udc_stop(udc); >>> + udc->driver->unbind(udc->gadget); >>=20 >> This patch is incorrect, it will prevent us from giving back requests=20 >> to >> gadget driver properly. ->unbind() has to happen before ->udc_stop(). > > Do you mean after udc_stop the udc driver can not call the gadget driver > anymore? If not, I did not got your point, sorry for that. Can you=20 > please > help me out? Would the changed order raise another issue I'm not aware=20 > of? right, ->udc_stop() is supposed to completely teardown the USB controller, including disabling interrupts and all. The only thing it _can_ do from ->udc_stop() would be giving back any pending requests that were left (which would cause req->complete() to be called with an error status). But even that is unlikely in the case you mention since =2D>unbind() was already called. > If I understood you correctly, without this patch udc driver can not=20 > call > the gadget driver back as well, because this would result in a NULL ptr > dereference, as unbind() sets drvdata to NULL. > > In any case the race described in my original message can still happen, > regardless of the order of udc_stop and unbind. But with this patch the > needed locking could easily done within the udc driver only. Without,=20 > the > lock needs to be acquired before udc->driver->unbind(udc->gadget) and > released after usb_gadget_udc_stop(). Otherwise an ISR of the udc driver > trying to call into the gadget driver could do this after gadget driver > already unbound. right =2D-=20 balbi --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEElLzh7wn96CXwjh2IzL64meEamQYFAlmS4uAACgkQzL64meEa mQZiIA/+LZuD983NJq9y4HtiAQQ5ljovNnAnVJp/wcFnVOSbcx2pXWKmtQkI1ify HZ3ulYr2iCTcc6EeAnKjqDVxANFlQV6Yykr8gZkR1fB+hJL+WqNdZsBmiXQqxTak BrfuIbVIhJoLTb+3pBbC8CtXK740wB33d5lw69+ymtyJ9DdkzlOcXApaNf0oDziM 06NMZY0DUadznnCSv340T3p8hNeXYdZVoaCdvN+OjAYU52BlwOVgYuWiqpBYe1pM N4bGCUrAUuYGxe+mu0QbB+8dqw+5hHom8VOZDGx97smXxXZrM2CrVPJlYjHA2U6P Q2GZO5dax2daZMUYROVeWhj7OwlpDos1brbg64Wanv/wA0Ruh96xUiVoLBOk0Tla Mt0roBDkYIN9U28Hw6EscxAOQ6NY8+HKipA5EiQXkv1/7MqRdp9BlYUb7rgkfk/c 3IPKQvCrCsWhoG1KjOtAa8auc9gNA2nZIfKMYm7PorfUBrQQlUlNFj+km8CgbH6P ASy20X76YPwj5VF4UaLso/VldmITT6k/N5UPKCCDJJn/lSkEbXNGJb7tITamX9LJ v0LvcBy7ioJb5rjJAW2ydE1IPgNzPvshZAFKiWB6perMM0foBK/U4FvceYpmYqdb No5laOugnvB+TpNUMqUiQFOdw/CMKPfE0obZULLW59stCD9kljE= =Sj1w -----END PGP SIGNATURE----- --=-=-=--