Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755380Ab3IOKEj (ORCPT ); Sun, 15 Sep 2013 06:04:39 -0400 Received: from caramon.arm.linux.org.uk ([78.32.30.218]:54795 "EHLO caramon.arm.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752158Ab3IOKEi (ORCPT ); Sun, 15 Sep 2013 06:04:38 -0400 Date: Sat, 14 Sep 2013 12:20:56 +0100 From: Russell King - ARM Linux To: Markus Pargmann Cc: Greg Kroah-Hartman , kernel@pengutronix.de, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: Re: dev->of_node overwrite can cause device loading with different driver Message-ID: <20130914112056.GH12758@n2100.arm.linux.org.uk> References: <20130913155331.GC14456@pengutronix.de> <20130913171046.GA26207@kroah.com> <20130914071653.GA26512@pengutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130914071653.GA26512@pengutronix.de> User-Agent: Mutt/1.5.19 (2009-01-05) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5054 Lines: 123 On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > I think there are three options to solve this: > > 1. Break out of the driver list iteration loop as soon as a driver probe > function fails. This way there is no possibility for another driver > to be probed with a device struct that was changed by the first > driver. But it would remove the support to fall back to > another driver for a device. I am not aware of any device that is > supported by multiple drivers. What if the incorrect driver (the one which created this platform device) is the one which was matched first? > 2. We could restore the device struct completely or partially (only > of_node) after a probe failure. This would avoid driver probes with > unclean device structs, but introduces some overhead. I don't think it's about changing an existing device structure. The problem is more to do with this: - We have a platform device with an associated of_node. - The of_node is used to match it to its device driver. - This device driver spawns a new platform device, and copies the of_node to the new platform device (so that the _intended_ driver can get access to the DT properties) - The DD layer tries to match this new platform device with a driver, and _can_ match this new platform device against the device driver which created it due to the of_node matching. There's two solutions here: 1. get rid of this yucky "lets spawn a new device" stuff - if you actually work out what's going on with MUSB and its use of this platform device, it's _really_ horrid, and that's putting it mildly. Let's call the device being probed, devA, and briefly: new_dev = platform_device_alloc(); devA->platform_data = glue glue->dev = devA glue->musb = new_dev new_dev->parent = devA set_drvdata(devA, glue) platform_device_add(new_dev); musb->controller is the new device, callbacks into this layer do: glue = get_drvdata(musb->controller->parent) that's not too bad, because this is accessing devA's driver data from within its owning driver... until you find this: musb = glue_to_musb(glue) which is defined as: #define glue_to_musb(g) platform_get_drvdata(g->musb) glue->musb is the _child_ platform device, and we're accessing the child's driver data here. This seems to me to be a layering violation, and also rather racy when you consider that glue_to_musb() gets used from workqueue contexts (and I don't see a flush_workqueue() call in these stub drivers.) What if the new platform device gets unbound just before the workqueue fires? Another thing to note here is that platform_device_add_data() takes a copy of the data - in the case of omap2430, this is kzalloc'd but is pointlessly kept around until this driver is removed (via the devm_ mechanisms.) The last thing I don't like in these drivers is this: memset(musb_resources, 0x00, sizeof(*musb_resources) * ARRAY_SIZE(musb_resources)); musb_resources[0].name = pdev->resource[0].name; musb_resources[0].start = pdev->resource[0].start; musb_resources[0].end = pdev->resource[0].end; musb_resources[0].flags = pdev->resource[0].flags; musb_resources[1].name = pdev->resource[1].name; musb_resources[1].start = pdev->resource[1].start; musb_resources[1].end = pdev->resource[1].end; musb_resources[1].flags = pdev->resource[1].flags; musb_resources[2].name = pdev->resource[2].name; musb_resources[2].start = pdev->resource[2].start; musb_resources[2].end = pdev->resource[2].end; musb_resources[2].flags = pdev->resource[2].flags; ret = platform_device_add_resources(musb, musb_resources, ARRAY_SIZE(musb_resources)); A little knowledge of the driver model will reveal that the above is utterly pointless - and can be simplified to: ret = platform_device_add_resources(musb, pdev->resource, pdev->num_resources); as platform_device_add_resources() copies the passed resource structures via kmemdup() already. There's no reason for this device driver to make its own copy of the resources just to have them re-copied. It's also a lot safer in case fewer than three resources are supplied. It's also a lot less hastle if additional resources are added (like what's happened recently to these drivers.) 2. don't pass the of_node via the platform_device, but as part of the platform device's data. Is there any reason why musb isn't implemented as a library which these stub drivers can hook into? I'd much prefer (1), because it gets rid of the horrid dma_mask stuff in these drivers, turning it more into a "conventional" driver. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/