Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp2733548pxv; Sun, 27 Jun 2021 07:11:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwi0gsKUpqJa7oEivEaSARIdTQmrrRg/CvOqElTVGYDnXdDKgWCjNuYbiaoyWSEcMMw/wVe X-Received: by 2002:aa7:c50d:: with SMTP id o13mr27016053edq.9.1624803111822; Sun, 27 Jun 2021 07:11:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1624803111; cv=none; d=google.com; s=arc-20160816; b=BIU3dYu/39Bl8ik8961itayYXBFwiLtzpoYCKbX232bjxVan8du8L0wkZXIYK8l5MH 79fKTxSKgrtUoinbuZ+eYzcrRBFREBvmovFqgcqa9Yd8eEkKkyHz5tca3oQoOBVtS75+ i/J4uAte7IgqoEMCz9lYKNMgcwS0nqptXOG825O8UrvcQY76xniOH5mbRr+MF3XbWDE7 LRkGzB+BWR9tvEv9hGYD/4MCh+fgGG8O/ZTL+zEO8qpecq0NDkKqYhrLy1meAgNqEPdv jLJkWAyWnnsRL7JNrsGDcDn+mxAQxjmbqW70VjKvBNa6vvtbTmkSQ48HgxArtm/6cQIK OpSg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=WNB3iHd0ZGp+Jh7gTDIsv56z9GO4jzTlkMe1P+i9nBw=; b=NseN4wvf3OoxGuF7dBJWjzXdzHtN4Kk3eI3NuA9QNgnQWngoIkQeme064DJ8wAUuua mMH6+MZllSoo51NgUmImAKj7P30elmX1tanZ/6UoPtNpmfOU9hZq3ktFrC3gwzsshMvs Xrb+09UjpphUQmM/FL2eOzXK7GicUHUAc93c6KC9E/XbxyzUOpMJL+JFHGnYLIJwhPc/ RBBuWIGpjV2wqNG1Xvp99kbl8hANi+vfphRx0H9cKGF5T6Shj7A5/bSeGS87yM+XQzaV MMF04fgSTUXb3fiAeodVH+Rp81Wm1lHIN322mUFA9TogX/C9XHwvu5PR1dgVDPfbKdzF SCiQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id gz24si11087633ejc.418.2021.06.27.07.11.27; Sun, 27 Jun 2021 07:11:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231130AbhF0OL3 (ORCPT + 99 others); Sun, 27 Jun 2021 10:11:29 -0400 Received: from netrider.rowland.org ([192.131.102.5]:51679 "HELO netrider.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S230386AbhF0OL2 (ORCPT ); Sun, 27 Jun 2021 10:11:28 -0400 Received: (qmail 625664 invoked by uid 1000); 27 Jun 2021 10:09:03 -0400 Date: Sun, 27 Jun 2021 10:09:03 -0400 From: Alan Stern To: linyyuan@codeaurora.org Cc: Felipe Balbi , Thinh Nguyen , Greg Kroah-Hartman , linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, Jack Pham Subject: Re: [PATCH] usb: dwc3: fix race of usb_gadget_driver operation Message-ID: <20210627140903.GB624763@rowland.harvard.edu> References: <20210625104415.8072-1-linyyuan@codeaurora.org> <20210625163707.GC574023@rowland.harvard.edu> <20210626150304.GA601624@rowland.harvard.edu> <1d1f06763c7cdeb67264128537c6a8f4@codeaurora.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1d1f06763c7cdeb67264128537c6a8f4@codeaurora.org> User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Jun 27, 2021 at 10:48:56AM +0800, linyyuan@codeaurora.org wrote: > On 2021-06-26 23:03, Alan Stern wrote: > > On Sat, Jun 26, 2021 at 09:16:25AM +0800, linyyuan@codeaurora.org wrote: > > > On 2021-06-26 00:37, Alan Stern wrote: > > > > Here and in the other places, you should test dwc->async_callbacks > > > > _before_ dropping the spinlock. Otherwise there is a race (the flag > > > > could be written at about the same time it is checked). > > > thanks for your comments, > > > > > > if you think there is race here, how to make sure gadget_driver > > > pointer is > > > safe, > > > this is closest place where we can confirm it is non-NULL by checking > > > async_callbacks ? > > > > I explained this twice already: We know that gadget_driver is not > > NULL because usb_gadget_remove_driver calls synchronize_irq before > > doing usb_gadget_udc_stop. > > > > Look at this timing diagram: > > > > CPU0 CPU1 > > ---- ---- > > IRQ happens for setup packet > > Handler sees async_callbacks > > is enabled > > Handler unlocks dwc->lock > > usb_gadget_remove_driver runs > > Disables async callbacks > > Calls synchronize_irq > > Handler calls dwc-> . waits for IRQ handler to > > gadget_driver->setup . return > > Handler locks dwc-lock . > > ... . > > Handler returns . > > . synchronize_irq returns > > Calls usb_gadget_udc_stop > > dwc->gadget_driver is > > set to NULL > > > > As you can see, dwc->gadget_driver is non-NULL when CPU0 uses it, > > even though async_callbacks gets cleared during the time when the > > lock is released. > thanks for your patient explanation, > but from this part, seem it is synchronize_irq() help to avoid NULL pointer > crash. That's right. > can you also explain how async_callbacks flag help here ? It doesn't help in the situation shown above, but it does help in other situations. Consider this timing diagram: CPU0 CPU1 ---- ---- usb_gadget_remove_driver runs Disables async callbacks Calls synchronize_irq synchronize_irq returns Calls udc_driver_unbind IRQ happens for disconnect Handler sees async_callbacks is disabled Handler returns Calls usb_gadget_udc_stop dwc->gadget_driver is set to NULL With the async_callbacks check, everything works okay. But now look at what would happen without the async_callbacks mechanism: CPU0 CPU1 ---- ---- usb_gadget_remove_driver runs Calls synchronize_irq synchronize_irq returns Calls udc_driver_unbind IRQ happens for disconnect Handler unlocks dwc->lock Calls dwc->gadget_driver->disconnect Gadget driver has already been unbound and is not prepared to handle a callback, so it crashes Calls usb_gadget_udc_stop dwc->gadget_driver is set to NULL Without the async_callbacks mechanism, the gadget driver can get a callback at the wrong time (after it has been unbound), which might cause it to crash. Alan Stern