From: Martin Wilck <[email protected]>
Since b8b2c7d845d5, platform_drv_probe() is called for all platform
devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
platform_drv_probe() will return the error code from dev_pm_domain_attach().
This causes real_probe() to enter the "probe_failed" path and set
dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
success if both dev->bus->probe and drv->probe are missing.
This may cause a panic later. For example, inserting the tpm_tis
driver with parameter "force=1" (i.e. registering tpm_tis as a platform
driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
chip->cdev.owner = chip->pdev->driver->owner;
This patch fixes this by returning success in platform_drv_probe() if
"just" dev_pm_domain_attach() had failed. This restores the semantics
of platform_device_register_XXX() if the associated platform driver has
no "probe" function.
Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
callbacks are called unconditionally")
Signed-off-by: Martin Wilck <[email protected]>
---
drivers/base/platform.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 1dd6d3b..c994e76 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -513,10 +513,14 @@ static int platform_drv_probe(struct device *_dev)
return ret;
ret = dev_pm_domain_attach(_dev, true);
- if (ret != -EPROBE_DEFER && drv->probe) {
- ret = drv->probe(dev);
- if (ret)
- dev_pm_domain_detach(_dev, true);
+ if (ret != -EPROBE_DEFER) {
+ if (drv->probe) {
+ ret = drv->probe(dev);
+ if (ret)
+ dev_pm_domain_detach(_dev, true);
+ } else
+ /* don't fail if just dev_pm_domain_attach failed */
+ ret = 0;
}
if (drv->prevent_deferred_probe && ret == -EPROBE_DEFER) {
--
1.8.3.1
On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> From: Martin Wilck <[email protected]>
>
> Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> platform_drv_probe() will return the error code from dev_pm_domain_attach().
>
> This causes real_probe() to enter the "probe_failed" path and set
> dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> success if both dev->bus->probe and drv->probe are missing.
>
> This may cause a panic later. For example, inserting the tpm_tis
> driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
>
> chip->cdev.owner = chip->pdev->driver->owner;
Is this happening because tpm_tis is not creating the platform device
properly? ie it just calls platform_device_register_simple and then
force initializes it via tpm_tis_init, which expects to be called from
a probe function with an attached driver.
Instead we should setup a proper platform device with the default
IO range for x86 and let the driver core call tpm_tis_init via
tis_drv.probe.
Would changing things in this way fix the problem you've observed?
I have some patches to do this that are part of my OF enablement
series, but I can make something simpler that would deal with this
fairly quickly if you can test.
Jason
On Do, 2015-11-26 at 13:30 -0700, Jason Gunthorpe wrote:
> On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> > From: Martin Wilck <[email protected]>
> >
> > Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> > devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> > platform_drv_probe() will return the error code from dev_pm_domain_attach().
> >
> > This causes real_probe() to enter the "probe_failed" path and set
> > dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> > success if both dev->bus->probe and drv->probe are missing.
> >
> > This may cause a panic later. For example, inserting the tpm_tis
> > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> >
> > chip->cdev.owner = chip->pdev->driver->owner;
>
> Is this happening because tpm_tis is not creating the platform device
> properly? ie it just calls platform_device_register_simple and then
> force initializes it via tpm_tis_init, which expects to be called from
> a probe function with an attached driver.
>
> Instead we should setup a proper platform device with the default
> IO range for x86 and let the driver core call tpm_tis_init via
> tis_drv.probe.
>
> Would changing things in this way fix the problem you've observed?
I think so. Nonetheless, patch b8b2c7d845d5 introduced a change in the
way platform device registration behaves. The platform device code seems
to be prepared for cases where platform_driver->probe == NULL, so that
case should be handled gracefully. Otherwise, failure should occur
earlier, e.g. when platform_driver_register() is called with
platform_driver->probe == NULL. tpm_tis may not be the only driver that
uses platform_device_register_simple() in this way.
> I have some patches to do this that are part of my OF enablement
> series, but I can make something simpler that would deal with this
> fairly quickly if you can test.
Let's first wait what the platform guys say.
Martin
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
Hello Martin,
On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> From: Martin Wilck <[email protected]>
>
> Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> platform_drv_probe() will return the error code from dev_pm_domain_attach().
Correct, this is an unintended change of behaviour introduced in
b8b2c7d845d5.
> This causes real_probe() to enter the "probe_failed" path and set
> dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> success if both dev->bus->probe and drv->probe are missing.
>
> This may cause a panic later. For example, inserting the tpm_tis
> driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
>
> chip->cdev.owner = chip->pdev->driver->owner;
This sounds like a separate issue though. Looking at init_tis there is:
rc = platform_driver_register(&tis_drv);
if (rc < 0)
return rc;
pdev = platform_device_register_simple("tpm_tis", -1, NULL, 0);
if (IS_ERR(pdev)) {
rc = PTR_ERR(pdev);
goto err_dev;
}
rc = tpm_tis_init(&pdev->dev, &tis_default_info, NULL);
tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e. the return value
of platform_device_register_simple above) isn't bound. It is not allowed
to assume that the device is bound after the above function calls.
So I'd say drop the paragraph about tpm_tis and the change is fine.
> This patch fixes this by returning success in platform_drv_probe() if
> "just" dev_pm_domain_attach() had failed. This restores the semantics
> of platform_device_register_XXX() if the associated platform driver has
> no "probe" function.
>
> Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
> callbacks are called unconditionally")
>
I think line breaks in the Fixes: line are frowned on. Also usually
there is no empty line between Fixes: and S-o-b:.
> Signed-off-by: Martin Wilck <[email protected]>
> ---
> drivers/base/platform.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 1dd6d3b..c994e76 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -513,10 +513,14 @@ static int platform_drv_probe(struct device *_dev)
> return ret;
>
> ret = dev_pm_domain_attach(_dev, true);
> - if (ret != -EPROBE_DEFER && drv->probe) {
> - ret = drv->probe(dev);
> - if (ret)
> - dev_pm_domain_detach(_dev, true);
> + if (ret != -EPROBE_DEFER) {
> + if (drv->probe) {
> + ret = drv->probe(dev);
> + if (ret)
> + dev_pm_domain_detach(_dev, true);
> + } else
> + /* don't fail if just dev_pm_domain_attach failed */
> + ret = 0;
An else that has a } should also have a {, according to
checkpatch and Documentation/CodingStyle. You can write it
alternatively as:
if (ret != -EPROBE_DEFER) {
if (drv->probe)
ret = drv->probe(dev);
else
ret = 0;
if (ret)
dev_pm_domain_detach(_dev, true);
}
.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-K?nig |
Industrial Linux Solutions | http://www.pengutronix.de/ |
On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> From: Martin Wilck <[email protected]>
>
> Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> platform_drv_probe() will return the error code from dev_pm_domain_attach().
>
> This causes real_probe() to enter the "probe_failed" path and set
> dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> success if both dev->bus->probe and drv->probe are missing.
>
> This may cause a panic later. For example, inserting the tpm_tis
> driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
>
> chip->cdev.owner = chip->pdev->driver->owner;
>
> This patch fixes this by returning success in platform_drv_probe() if
> "just" dev_pm_domain_attach() had failed. This restores the semantics
> of platform_device_register_XXX() if the associated platform driver has
> no "probe" function.
>
> Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
> callbacks are called unconditionally")
>
> Signed-off-by: Martin Wilck <[email protected]>
Acked-by: Jarkko Sakkinen <[email protected]>
> ---
> drivers/base/platform.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 1dd6d3b..c994e76 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -513,10 +513,14 @@ static int platform_drv_probe(struct device *_dev)
> return ret;
>
> ret = dev_pm_domain_attach(_dev, true);
> - if (ret != -EPROBE_DEFER && drv->probe) {
> - ret = drv->probe(dev);
> - if (ret)
> - dev_pm_domain_detach(_dev, true);
> + if (ret != -EPROBE_DEFER) {
> + if (drv->probe) {
> + ret = drv->probe(dev);
> + if (ret)
> + dev_pm_domain_detach(_dev, true);
> + } else
> + /* don't fail if just dev_pm_domain_attach failed */
> + ret = 0;
> }
>
> if (drv->prevent_deferred_probe && ret == -EPROBE_DEFER) {
> --
> 1.8.3.1
>
>
On Thu, Nov 26, 2015 at 01:30:31PM -0700, Jason Gunthorpe wrote:
> On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> > From: Martin Wilck <[email protected]>
> >
> > Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> > devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> > platform_drv_probe() will return the error code from dev_pm_domain_attach().
> >
> > This causes real_probe() to enter the "probe_failed" path and set
> > dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> > success if both dev->bus->probe and drv->probe are missing.
> >
> > This may cause a panic later. For example, inserting the tpm_tis
> > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> >
> > chip->cdev.owner = chip->pdev->driver->owner;
>
> Is this happening because tpm_tis is not creating the platform device
> properly? ie it just calls platform_device_register_simple and then
> force initializes it via tpm_tis_init, which expects to be called from
> a probe function with an attached driver.
Agreed. We should have a probe callback.
> Instead we should setup a proper platform device with the default
> IO range for x86 and let the driver core call tpm_tis_init via
> tis_drv.probe.
>
> Would changing things in this way fix the problem you've observed?
>
> I have some patches to do this that are part of my OF enablement
> series, but I can make something simpler that would deal with this
> fairly quickly if you can test.
Does the patch set that you sent include the fix or not? I haven't yet
reviewed them properly.
> Jason
/Jarkko
On Sat, Nov 28, 2015 at 06:40:03PM +0200, Jarkko Sakkinen wrote:
> On Thu, Nov 26, 2015 at 01:30:31PM -0700, Jason Gunthorpe wrote:
> > On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> > > From: Martin Wilck <[email protected]>
> > >
> > > Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> > > devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> > > platform_drv_probe() will return the error code from dev_pm_domain_attach().
> > >
> > > This causes real_probe() to enter the "probe_failed" path and set
> > > dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> > > success if both dev->bus->probe and drv->probe are missing.
> > >
> > > This may cause a panic later. For example, inserting the tpm_tis
> > > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> > >
> > > chip->cdev.owner = chip->pdev->driver->owner;
> >
> > Is this happening because tpm_tis is not creating the platform device
> > properly? ie it just calls platform_device_register_simple and then
> > force initializes it via tpm_tis_init, which expects to be called from
> > a probe function with an attached driver.
>
> Agreed. We should have a probe callback.
>
> > Instead we should setup a proper platform device with the default
> > IO range for x86 and let the driver core call tpm_tis_init via
> > tis_drv.probe.
> >
> > Would changing things in this way fix the problem you've observed?
> >
> > I have some patches to do this that are part of my OF enablement
> > series, but I can make something simpler that would deal with this
> > fairly quickly if you can test.
>
> Does the patch set that you sent include the fix or not? I haven't yet
> reviewed them properly.
Another question: does you patch series include an alternative fix for
the probe bug or should I just pick Martins fix? As I sad previously I
was seriously lost with the race but now I understand what you and
Martin were saying (and feel utterly stupid + ashamed!). Now I'm just
thinking, which fix I should pick.
Anyway, I'll try to go through your code ASAP.
/Jarkko
On Sat, Nov 28, 2015 at 06:40:03PM +0200, Jarkko Sakkinen wrote:
> > I have some patches to do this that are part of my OF enablement
> > series, but I can make something simpler that would deal with this
> > fairly quickly if you can test.
>
> Does the patch set that you sent include the fix or not? I haven't yet
> reviewed them properly.
No fixing probe is another task. I can send some patches for that when
we are done with the IRQ stuff. That is something we should fix no
matter what..
BTW, please test my IRQ series, I forgot to mention I was unable to
test it properly here...
Jason
Hello Jarkko,
On Sat, Nov 28, 2015 at 06:34:47PM +0200, Jarkko Sakkinen wrote:
> On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> > From: Martin Wilck <[email protected]>
> >
> > Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> > devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> > platform_drv_probe() will return the error code from dev_pm_domain_attach().
> >
> > This causes real_probe() to enter the "probe_failed" path and set
> > dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> > success if both dev->bus->probe and drv->probe are missing.
> >
> > This may cause a panic later. For example, inserting the tpm_tis
> > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> >
> > chip->cdev.owner = chip->pdev->driver->owner;
> >
> > This patch fixes this by returning success in platform_drv_probe() if
> > "just" dev_pm_domain_attach() had failed. This restores the semantics
> > of platform_device_register_XXX() if the associated platform driver has
> > no "probe" function.
> >
> > Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
> > callbacks are called unconditionally")
> >
> > Signed-off-by: Martin Wilck <[email protected]>
>
> Acked-by: Jarkko Sakkinen <[email protected]>
While the patch is fine, the commit log is not. It blames b8b2c7d845d5
to be responsible for a panic, but in fact it only breaks the wrong
assumption of the tpm_tis driver.
So I'm not sure how to interpret your Ack, IMHO it should not make
gregkh pick up the patch as is.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-K?nig |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Hello Uwe,
thanks for your review.
> This may cause a panic later. For example, inserting the tpm_tis
> > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> >
> > chip->cdev.owner = chip->pdev->driver->owner;
>
> This sounds like a separate issue though. Looking at init_tis there is:
>
> rc = platform_driver_register(&tis_drv);
> if (rc < 0)
> return rc;
> pdev = platform_device_register_simple("tpm_tis", -1, NULL, 0);
> if (IS_ERR(pdev)) {
> rc = PTR_ERR(pdev);
> goto err_dev;
> }
> rc = tpm_tis_init(&pdev->dev, &tis_default_info, NULL);
>
> tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e. the return value
> of platform_device_register_simple above) isn't bound. It is not allowed
> to assume that the device is bound after the above function calls.
I agree that the TPM platform device code deserves improvement. Jason
wrote that he has already some patches available for that.
I lack the knowledge to judge whether or not tpm_is_init's assumption
was correct. But, maybe just by luck, this assumption used to be *true*
until patch b8b2c7d845d5. Driver and device were matched by name
("tpm_tis") by the platform driver probing code, and device and driver
were actually bound to each other after this sequence of calls.
> So I'd say drop the paragraph about tpm_tis and the change is fine.
I didn't mean to blame your patch. But a note about the panic might be
helpful just in case someone else runs into the same problem. The
connection between your patch and tpm_tis loading is far from obvious.
I mentioned the panic in order to clarify that this wasn't just a
theoretical issue.
Anyway, I'll resubmit with your style hints applied and will try to find
a wording for the commit message that we can agree upon.
Best Regards,
Martin
>
> > This patch fixes this by returning success in platform_drv_probe() if
> > "just" dev_pm_domain_attach() had failed. This restores the semantics
> > of platform_device_register_XXX() if the associated platform driver has
> > no "probe" function.
> >
> > Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
> > callbacks are called unconditionally")
> >
>
> I think line breaks in the Fixes: line are frowned on. Also usually
> there is no empty line between Fixes: and S-o-b:.
>
> > Signed-off-by: Martin Wilck <[email protected]>
> > ---
> > drivers/base/platform.c | 12 ++++++++----
> > 1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 1dd6d3b..c994e76 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -513,10 +513,14 @@ static int platform_drv_probe(struct device *_dev)
> > return ret;
> >
> > ret = dev_pm_domain_attach(_dev, true);
> > - if (ret != -EPROBE_DEFER && drv->probe) {
> > - ret = drv->probe(dev);
> > - if (ret)
> > - dev_pm_domain_detach(_dev, true);
> > + if (ret != -EPROBE_DEFER) {
> > + if (drv->probe) {
> > + ret = drv->probe(dev);
> > + if (ret)
> > + dev_pm_domain_detach(_dev, true);
> > + } else
> > + /* don't fail if just dev_pm_domain_attach failed */
> > + ret = 0;
>
> An else that has a } should also have a {, according to
> checkpatch and Documentation/CodingStyle. You can write it
> alternatively as:
>
> if (ret != -EPROBE_DEFER) {
> if (drv->probe)
> ret = drv->probe(dev);
> else
> ret = 0;
>
> if (ret)
> dev_pm_domain_detach(_dev, true);
> }
>
> .
>
> Best regards
> Uwe
>
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
From: Martin Wilck <[email protected]>
Since b8b2c7d845d5, platform_drv_probe() is called for all platform
devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
platform_drv_probe() will return the error code from dev_pm_domain_attach().
This causes real_probe() to enter the "probe_failed" path and set
dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
success if both dev->bus->probe and drv->probe were missing. As a result,
a device and driver could be "bound" together just by matching their names;
this doesn't work any more after b8b2c7d845d5.
This may cause problems later for certain usage of platform_driver_register()
and platform_device_register_simple(). I observed a panic while loading
the tpm_tis driver with parameter "force=1" (i.e. registering tpm_tis as
a platform driver), because tpm_tis_init's assumption that the device
returned by platform_device_register_simple() was bound didn't hold any more
(tpmm_chip_alloc() dereferences chip->pdev->driver, causing panic).
This patch restores the previous (4.3.0 and earlier) behavior of
platform_drv_probe() in the case when the associated platform driver has
no "probe" function.
v2: fixed style issues, rephrased commit message.
Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain callbacks are called unconditionally")
Signed-off-by: Martin Wilck <[email protected]>
---
drivers/base/platform.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 1dd6d3b..176b59f 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -513,10 +513,15 @@ static int platform_drv_probe(struct device *_dev)
return ret;
ret = dev_pm_domain_attach(_dev, true);
- if (ret != -EPROBE_DEFER && drv->probe) {
- ret = drv->probe(dev);
- if (ret)
- dev_pm_domain_detach(_dev, true);
+ if (ret != -EPROBE_DEFER) {
+ if (drv->probe) {
+ ret = drv->probe(dev);
+ if (ret)
+ dev_pm_domain_detach(_dev, true);
+ } else {
+ /* don't fail if just dev_pm_domain_attach failed */
+ ret = 0;
+ }
}
if (drv->prevent_deferred_probe && ret == -EPROBE_DEFER) {
--
1.8.3.1
On Sat, Nov 28, 2015 at 03:52:51PM -0700, Jason Gunthorpe wrote:
> On Sat, Nov 28, 2015 at 06:40:03PM +0200, Jarkko Sakkinen wrote:
> > > I have some patches to do this that are part of my OF enablement
> > > series, but I can make something simpler that would deal with this
> > > fairly quickly if you can test.
> >
> > Does the patch set that you sent include the fix or not? I haven't yet
> > reviewed them properly.
>
> No fixing probe is another task. I can send some patches for that when
> we are done with the IRQ stuff. That is something we should fix no
> matter what..
>
> BTW, please test my IRQ series, I forgot to mention I was unable to
> test it properly here...
Got you. I need to at least test insmod/rmod (maybe couple of times in a
cycle). Do you see any other code paths that could easily break because
of your patches?
> Jason
/Jarkko
Hi Uwe,
On Sun, Nov 29, 2015 at 10:54:11AM +0100, Uwe Kleine-K?nig wrote:
> Hello Jarkko,
>
> On Sat, Nov 28, 2015 at 06:34:47PM +0200, Jarkko Sakkinen wrote:
> > On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> > > From: Martin Wilck <[email protected]>
> > >
> > > Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> > > devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> > > platform_drv_probe() will return the error code from dev_pm_domain_attach().
> > >
> > > This causes real_probe() to enter the "probe_failed" path and set
> > > dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> > > success if both dev->bus->probe and drv->probe are missing.
> > >
> > > This may cause a panic later. For example, inserting the tpm_tis
> > > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> > >
> > > chip->cdev.owner = chip->pdev->driver->owner;
> > >
> > > This patch fixes this by returning success in platform_drv_probe() if
> > > "just" dev_pm_domain_attach() had failed. This restores the semantics
> > > of platform_device_register_XXX() if the associated platform driver has
> > > no "probe" function.
> > >
> > > Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
> > > callbacks are called unconditionally")
> > >
> > > Signed-off-by: Martin Wilck <[email protected]>
> >
> > Acked-by: Jarkko Sakkinen <[email protected]>
>
> While the patch is fine, the commit log is not. It blames b8b2c7d845d5
> to be responsible for a panic, but in fact it only breaks the wrong
> assumption of the tpm_tis driver.
>
> So I'm not sure how to interpret your Ack, IMHO it should not make
> gregkh pick up the patch as is.
Alright. I don't think you can speak about *wrong assumptions* if the
semantics allowed not to have it before. *Where* it should be fixed is
another question. I'd keep the Fixes tag in all cases.
Jason, you had the fix for this issue directly to tpm_tis driver that
you haven't yet posted, right? Just double-checking this.
> Best regards
> Uwe
>
> --
> Pengutronix e.K. | Uwe Kleine-K?nig |
> Industrial Linux Solutions | http://www.pengutronix.de/ |
/Jarkko
On Mon, Nov 30, 2015 at 02:56:31PM +0200, Jarkko Sakkinen wrote:
> Hi Uwe,
>
> On Sun, Nov 29, 2015 at 10:54:11AM +0100, Uwe Kleine-K?nig wrote:
> > Hello Jarkko,
> >
> > On Sat, Nov 28, 2015 at 06:34:47PM +0200, Jarkko Sakkinen wrote:
> > > On Thu, Nov 26, 2015 at 08:01:34PM +0100, [email protected] wrote:
> > > > From: Martin Wilck <[email protected]>
> > > >
> > > > Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> > > > devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> > > > platform_drv_probe() will return the error code from dev_pm_domain_attach().
> > > >
> > > > This causes real_probe() to enter the "probe_failed" path and set
> > > > dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> > > > success if both dev->bus->probe and drv->probe are missing.
> > > >
> > > > This may cause a panic later. For example, inserting the tpm_tis
> > > > driver with parameter "force=1" (i.e. registering tpm_tis as a platform
> > > > driver) will panic in tpmm_chip_alloc() because dev->driver is NULL:
> > > >
> > > > chip->cdev.owner = chip->pdev->driver->owner;
> > > >
> > > > This patch fixes this by returning success in platform_drv_probe() if
> > > > "just" dev_pm_domain_attach() had failed. This restores the semantics
> > > > of platform_device_register_XXX() if the associated platform driver has
> > > > no "probe" function.
> > > >
> > > > Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain
> > > > callbacks are called unconditionally")
> > > >
> > > > Signed-off-by: Martin Wilck <[email protected]>
> > >
> > > Acked-by: Jarkko Sakkinen <[email protected]>
> >
> > While the patch is fine, the commit log is not. It blames b8b2c7d845d5
> > to be responsible for a panic, but in fact it only breaks the wrong
> > assumption of the tpm_tis driver.
> >
> > So I'm not sure how to interpret your Ack, IMHO it should not make
> > gregkh pick up the patch as is.
>
> Alright. I don't think you can speak about *wrong assumptions* if the
> semantics allowed not to have it before. *Where* it should be fixed is
> another question. I'd keep the Fixes tag in all cases.
>
> Jason, you had the fix for this issue directly to tpm_tis driver that
> you haven't yet posted, right? Just double-checking this.
Uwe, please ignore this :) Saw your more in-depth comment about platform
driver creation. Thank you. I somehow have missed it before.
/Jarkko
Hello Uwe,
> This sounds like a separate issue though. Looking at init_tis there is:
>
> rc = platform_driver_register(&tis_drv);
> if (rc < 0)
> return rc;
> pdev = platform_device_register_simple("tpm_tis", -1, NULL, 0);
> if (IS_ERR(pdev)) {
> rc = PTR_ERR(pdev);
> goto err_dev;
> }
> rc = tpm_tis_init(&pdev->dev, &tis_default_info, NULL);
>
> tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e. the return value
> of platform_device_register_simple above) isn't bound. It is not allowed
> to assume that the device is bound after the above function calls.
Can you please explain again why you think that assumption is invalid?
As far as I understand the code, the assumption would be correct in
4.3.0 and earlier:
platform_driver_register() registers a platform driver with name
"tpm_tis". platform_device_register_simple() registers a device with the
same name. This will call platform_device_add()/device_add() and start
probing for a platform device. Platform bus probing in platform_match()
falls back to a simple match between driver and device name if all else
fails. That match succeeds for the "tpm_tis" driver. Thus
driver_probe_device() will be called, and in the absence of a
driver-specific probe routine, will succeed. Thus after
platform_device_register_simple() returns, device and driver will be
bound. This matches also actual behavior of the pre-4.4 code.
Please explain what I am overlooking. I am just trying to understand.
As far as tpm_tis is concerned, Jason's current patch set is going to
fix this for good anyway.
Regards
Martin
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
Hello Martin,
On Tue, Dec 01, 2015 at 11:41:53AM +0100, Wilck, Martin wrote:
> > This sounds like a separate issue though. Looking at init_tis there is:
> >
> > rc = platform_driver_register(&tis_drv);
> > if (rc < 0)
> > return rc;
> > pdev = platform_device_register_simple("tpm_tis", -1, NULL, 0);
> > if (IS_ERR(pdev)) {
> > rc = PTR_ERR(pdev);
> > goto err_dev;
> > }
> > rc = tpm_tis_init(&pdev->dev, &tis_default_info, NULL);
> >
> > tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e. the return value
> > of platform_device_register_simple above) isn't bound. It is not allowed
> > to assume that the device is bound after the above function calls.
>
> Can you please explain again why you think that assumption is invalid?
You can unbind a device from a driver via sysfs, you can also prevent
binding somehow I think, probing can fail for different reasons, probing
might wait for userspace interaction to load firmware which wasn't
scheduled yet. I'm sure there are still more things that break the
assumption.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-K?nig |
Industrial Linux Solutions | http://www.pengutronix.de/ |
> > > tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e. the return value
> > > of platform_device_register_simple above) isn't bound. It is not allowed
> > > to assume that the device is bound after the above function calls.
> >
> > Can you please explain again why you think that assumption is invalid?
>
> You can unbind a device from a driver via sysfs, you can also prevent
> binding somehow I think, probing can fail for different reasons, probing
> might wait for userspace interaction to load firmware which wasn't
> scheduled yet. I'm sure there are still more things that break the
> assumption.
Thanks. Out of these, "prevent binding somehow" would be the only
problem that applies to tpm_tis, as probing can't fail (no probe()
routine), there's no FW to load, and unbinding via sysfs would require
nearly impossible timing (not sure if it could be done with udev).
Anyway, the Right Thing to do is to create a probe() routine and that's
what Jason did.
Thanks again,
Martin
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
On Tue, Dec 01, 2015 at 04:19:25PM +0100, Wilck, Martin wrote:
> > > > tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e. the return value
> > > > of platform_device_register_simple above) isn't bound. It is not allowed
> > > > to assume that the device is bound after the above function calls.
> > >
> > > Can you please explain again why you think that assumption is invalid?
> >
> > You can unbind a device from a driver via sysfs, you can also prevent
> > binding somehow I think, probing can fail for different reasons, probing
> > might wait for userspace interaction to load firmware which wasn't
> > scheduled yet. I'm sure there are still more things that break the
> > assumption.
>
> Thanks. Out of these, "prevent binding somehow" would be the only
> problem that applies to tpm_tis, as probing can't fail (no probe()
> routine), there's no FW to load, and unbinding via sysfs would require
> nearly impossible timing (not sure if it could be done with udev).
>
> Anyway, the Right Thing to do is to create a probe() routine and that's
> what Jason did.
That fixes tpm_tis, but there are other ancient TPM drivers that use
the old, now broken way.
So, we still need to do something here. Either fixup b8b2c7d845d5 as
you have proposed, remove the now broken obsolete TPM drivers, or try
and fix them..
Jason
Am 1. Dezember 2015 09:25:37 PST, schrieb Jason Gunthorpe <[email protected]>:
>On Tue, Dec 01, 2015 at 04:19:25PM +0100, Wilck, Martin wrote:
>> > > > tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e.
>the return value
>> > > > of platform_device_register_simple above) isn't bound. It is
>not allowed
>> > > > to assume that the device is bound after the above function
>calls.
>> > >
>> > > Can you please explain again why you think that assumption is
>invalid?
>> >
>> > You can unbind a device from a driver via sysfs, you can also
>prevent
>> > binding somehow I think, probing can fail for different reasons,
>probing
>> > might wait for userspace interaction to load firmware which wasn't
>> > scheduled yet. I'm sure there are still more things that break the
>> > assumption.
>>
>> Thanks. Out of these, "prevent binding somehow" would be the only
>> problem that applies to tpm_tis, as probing can't fail (no probe()
>> routine), there's no FW to load, and unbinding via sysfs would
>require
>> nearly impossible timing (not sure if it could be done with udev).
>>
>> Anyway, the Right Thing to do is to create a probe() routine and
>that's
>> what Jason did.
>
>That fixes tpm_tis, but there are other ancient TPM drivers that use
>the old, now broken way.
>
>So, we still need to do something here. Either fixup b8b2c7d845d5 as
>you have proposed, remove the now broken obsolete TPM drivers, or try
>and fix them..
How broken are they and since when?
I thought multiple times about deprecating and finally removing the 1.1b stuff - tpm 1.2 is out for 10? years now? With an expected life span of a TPM of roughly 5years...
And also unfortunately the 1.1b legacy drivers usually get loaded first :( (atleast for slb9635)
Mark them as obsolete , default them to No and remove them by 4.10 if there are no objections?
Peter
--
Sent from my mobile
On Tue, Dec 01, 2015 at 10:26:17AM -0800, Peter Huewe wrote:
>
>
> Am 1. Dezember 2015 09:25:37 PST, schrieb Jason Gunthorpe <[email protected]>:
> >On Tue, Dec 01, 2015 at 04:19:25PM +0100, Wilck, Martin wrote:
> >> > > > tpm_tis_init calls tpmm_chip_alloc which barfs when pdev (i.e.
> >the return value
> >> > > > of platform_device_register_simple above) isn't bound. It is
> >not allowed
> >> > > > to assume that the device is bound after the above function
> >calls.
> >> > >
> >> > > Can you please explain again why you think that assumption is
> >invalid?
> >> >
> >> > You can unbind a device from a driver via sysfs, you can also
> >prevent
> >> > binding somehow I think, probing can fail for different reasons,
> >probing
> >> > might wait for userspace interaction to load firmware which wasn't
> >> > scheduled yet. I'm sure there are still more things that break the
> >> > assumption.
> >>
> >> Thanks. Out of these, "prevent binding somehow" would be the only
> >> problem that applies to tpm_tis, as probing can't fail (no probe()
> >> routine), there's no FW to load, and unbinding via sysfs would
> >require
> >> nearly impossible timing (not sure if it could be done with udev).
> >>
> >> Anyway, the Right Thing to do is to create a probe() routine and
> >that's
> >> what Jason did.
> >
> >That fixes tpm_tis, but there are other ancient TPM drivers that use
> >the old, now broken way.
> >
> >So, we still need to do something here. Either fixup b8b2c7d845d5 as
> >you have proposed, remove the now broken obsolete TPM drivers, or try
> >and fix them..
>
> How broken are they and since when?
oops the kernel broken, since 4.4-rc1 apparently, so not released yet.
> Mark them as obsolete , default them to No and remove them by 4.10
> if there are no objections?
Greg KH has been advising just to delete stuff right away. It is easy
to undelete something if someone comes around with hardware and is
willing to test
Jason
> > >That fixes tpm_tis, but there are other ancient TPM drivers that use
> > >the old, now broken way.
> > >
> > >So, we still need to do something here. Either fixup b8b2c7d845d5 as
> > >you have proposed, remove the now broken obsolete TPM drivers, or try
> > >and fix them..
> >
> > How broken are they and since when?
> oops the kernel broken, since 4.4-rc1 apparently, so not released yet.
damn, I was hoping MUCH longer :)
> > Mark them as obsolete , default them to No and remove them by 4.10
> > if there are no objections?
> Greg KH has been advising just to delete stuff right away. It is easy
> to undelete something if someone comes around with hardware and is
> willing to test
Can you point to that discussion or was it offline?
I mean for staging, yes there it is clear, but for stuff that is officially "upstream", I'm not 100% sure.
Thanks,
Peter
On Tue, Dec 01, 2015 at 07:54:59PM +0100, Peter Huewe wrote:
> Can you point to that discussion or was it offline?
Apparently it was brought up at the most recent kernel summit
http://www.spinics.net/lists/linux-rdma/msg29985.html
Jason
Hello,
On Mon, Nov 30, 2015 at 12:50:05PM +0100, [email protected] wrote:
> From: Martin Wilck <[email protected]>
>
> Since b8b2c7d845d5, platform_drv_probe() is called for all platform
> devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
> platform_drv_probe() will return the error code from dev_pm_domain_attach().
>
> This causes real_probe() to enter the "probe_failed" path and set
> dev->driver to NULL. Before b8b2c7d845d5, real_probe() would assume
> success if both dev->bus->probe and drv->probe were missing. As a result,
> a device and driver could be "bound" together just by matching their names;
> this doesn't work any more after b8b2c7d845d5.
>
> This may cause problems later for certain usage of platform_driver_register()
> and platform_device_register_simple(). I observed a panic while loading
> the tpm_tis driver with parameter "force=1" (i.e. registering tpm_tis as
> a platform driver), because tpm_tis_init's assumption that the device
> returned by platform_device_register_simple() was bound didn't hold any more
> (tpmm_chip_alloc() dereferences chip->pdev->driver, causing panic).
I'm a bit uncertain if I should be happy with this wording or not. While
b8b2c7d845d5 has a bug which made the tpm_tis driver provoke a panic and
which needs fixing, this doesn't mean that the tpm_tis driver was right.
So "This may cause problems later ..." isn't a correct justification
because even in the presence of the bug introduced by b8b2c7d845d5, the
tpm_tis driver must not assume that it's device is bound on return of
platform_device_register_simple.
It's just that the bug in b8b2c7d845d5 made a wrong assumption of the
tpm_tis driver obvious, and now both need fixing.
If this were my patch, I'd not talk about tpm_tis at all. b8b2c7d845d5
broke binding for platform drivers with no probe function in certain
situations. This is corrected here.
> This patch restores the previous (4.3.0 and earlier) behavior of
> platform_drv_probe() in the case when the associated platform driver has
> no "probe" function.
>
> v2: fixed style issues, rephrased commit message.
This must go after the triple-dash below to not be included in the
commit.
> Fixes: b8b2c7d845d5 ("base/platform: assert that dev_pm_domain callbacks are called unconditionally")
> Signed-off-by: Martin Wilck <[email protected]>
> ---
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-K?nig |
Industrial Linux Solutions | http://www.pengutronix.de/ |