Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753262AbdDKOMI (ORCPT ); Tue, 11 Apr 2017 10:12:08 -0400 Received: from iolanthe.rowland.org ([192.131.102.54]:60686 "HELO iolanthe.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1753441AbdDKOMD (ORCPT ); Tue, 11 Apr 2017 10:12:03 -0400 Date: Tue, 11 Apr 2017 10:12:01 -0400 (EDT) From: Alan Stern X-X-Sender: stern@iolanthe.rowland.org To: Felipe Balbi , Greg KH cc: Roger Quadros , , USB list , Kernel development list Subject: Re: [PATCH v3 1/3] usb: udc: allow adding and removing the same gadget device In-Reply-To: <87r30zcs95.fsf@linux.intel.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3457 Lines: 90 On Tue, 11 Apr 2017, Felipe Balbi wrote: > > Oddly enough, yes. But it doesn't explain why this code doesn't blow > > up every time it gets called, in its current form. > > Well, it does :-) > > dev_get_drvdata(_dev) -> NULL -> kfree(NULL) > > We're just leaking memory. I guess a patch like below would be best: > > diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c > index 3828c2ec8623..4dc04253da61 100644 > --- a/drivers/usb/gadget/udc/net2280.c > +++ b/drivers/usb/gadget/udc/net2280.c > @@ -3555,13 +3555,6 @@ static irqreturn_t net2280_irq(int irq, void *_dev) > > /*-------------------------------------------------------------------------*/ > > -static void gadget_release(struct device *_dev) > -{ > - struct net2280 *dev = dev_get_drvdata(_dev); > - > - kfree(dev); > -} > - > /* tear down the binding between this driver and the pci device */ > > static void net2280_remove(struct pci_dev *pdev) > @@ -3598,6 +3591,8 @@ static void net2280_remove(struct pci_dev *pdev) > device_remove_file(&pdev->dev, &dev_attr_registers); > > ep_info(dev, "unbind\n"); > + > + kfree(dev); > } > > /* wrap this driver around the specified device, but > @@ -3775,8 +3770,7 @@ static int net2280_probe(struct pci_dev *pdev, const struct pci_device_id *id) > if (retval) > goto done; > > - retval = usb_add_gadget_udc_release(&pdev->dev, &dev->gadget, > - gadget_release); > + retval = usb_add_gadget_udc(&pdev->dev, &dev->gadget); > if (retval) > goto done; > return 0; Maybe... But I can't shake the feeling that Greg KH would strongly disagree. Hasn't he said, many times in the past, that any dynamically allocated device structure _must_ have a real release routine? usb_udc_nop_release() doesn't qualify. The issue is outstanding references to gadget.dev. The driver core does not guarantee that all references will have been dropped by the time device_unregister() returns. So what if some other part of the kernel still has a reference to gadget.dev when we reach the end of net2280_remove() and deallocate the private structure? Unless you can be certain that there are no outstanding references when the gadget is unregistered, this approach isn't safe. (And even if you can be certain, how can you know that future changes to the kernel won't affect the situation?) > > And it doesn't help with the fact that net2280_remove() continues to > > access the private data structure after calling usb_del_gadget_udc(). > > Strictly speaking, that routine should do > > > > get_device(&dev->gadget.dev); > > > > at the start, with a corresponding put_device() at the end. > > > > There's another problem. Suppose a call to > > usb_add_gadget_udc_release() fails. At the end of that routine, the > > error pathway does put_device(&gadget->dev). This will invoke the > > release callback, deallocating the private data structure without > > giving the caller (i.e., the UDC driver) a chance to clean up. > > it won't deallocate anything :-) dev_set_drvdata() was never called, > we will endup with kfree(NULL) which is safe and just silently returns. But if you change gadget_release() to use the parent's drvdata field, like you proposed earlier, and fix up the refcounting in net2280_remove() so that the structure doesn't get deallocated too quickly, then this does become a real problem. And it can affect other UDC drivers, not just net2280. Alan Stern