Hi Linux-PM,
both in the I2C subsystem and also for Renesas drivers I maintain, I am
starting to get boilerplate patches doing some pm_runtime_put_* variant
because a failing pm_runtime_get is supposed to increase the ref
counters? Really? This feels wrong and unintuitive to me. I expect there
has been a discussion around it but I couldn't find it. I wonder why we
don't fix the code where the incremented refcount is expected for some
reason.
Can I have some pointers please?
Thanks,
Wolfram
On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> both in the I2C subsystem and also for Renesas drivers I maintain, I am
> starting to get boilerplate patches doing some pm_runtime_put_* variant
> because a failing pm_runtime_get is supposed to increase the ref
> counters? Really? This feels wrong and unintuitive to me.
Yeah, that is a well-known issue with PM. (I have had a Coccinelle
script for a long time, written when I realized there are a lot of
cases like this. Someone else discovered this recently, which was like
opening a can of worms.)
> I expect there
> has been a discussion around it but I couldn't find it.
Rafael explained this (again) recently. I can't find it quickly, unfortunately.
> I wonder why we
> don't fix the code where the incremented refcount is expected for some
> reason.
The main idea behind the API is that a lot of drivers do *not* check
error codes from runtime PM, so we need to keep the refcount balanced
in the case of

    pm_runtime_get(...);
    ...
    pm_runtime_put(...);
> Can I have some pointers please?
--
With Best Regards,
Andy Shevchenko
On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > because a failing pm_runtime_get is supposed to increase the ref
> > counters? Really? This feels wrong and unintuitive to me.
>
> Yeah, that is a well known issue with PM (I even have for a long time
> a coccinelle script, when I realized myself that there are a lot of
> cases like this, but someone else discovered this recently, like
> opening a can of worms).
>
> > I expect there
> > has been a discussion around it but I couldn't find it.
>
> Rafael explained (again) recently this. I can't find it quickly, unfortunately.
I _think_ it was this discussion, but maybe it's simply another
tentacle of the same octopus.
https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> > I wonder why we
> > don't fix the code where the incremented refcount is expected for some
> > reason.
>
> The main idea behind API that a lot of drivers do *not* check error
> codes from runtime PM, so, we need to keep balance in case of
>
> pm_runtime_get(...);
> ...
> pm_runtime_put(...);
>
> > Can I have some pointers please?
--
With Best Regards,
Andy Shevchenko
Hi Andy,
On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
<[email protected]> wrote:
> On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> <[email protected]> wrote:
> >
> > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > because a failing pm_runtime_get is supposed to increase the ref
> > > counters? Really? This feels wrong and unintuitive to me.
> >
> > Yeah, that is a well known issue with PM (I even have for a long time
> > a coccinelle script, when I realized myself that there are a lot of
> > cases like this, but someone else discovered this recently, like
> > opening a can of worms).
> >
> > > I expect there
> > > has been a discussion around it but I couldn't find it.
> >
> > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
>
> I _think_ this discussion, but may be it's simple another tentacle of
> the same octopus.
> https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
Thanks, hadn't read that one! (so I was still at -1 from
http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
pm_runtime_get_sync() failure?
> > > I wonder why we
> > > don't fix the code where the incremented refcount is expected for some
> > > reason.
> >
> > The main idea behind API that a lot of drivers do *not* check error
> > codes from runtime PM, so, we need to keep balance in case of
> >
> > pm_runtime_get(...);
> > ...
> > pm_runtime_put(...);
I've always[*] considered a pm_runtime_get_sync() failure to be fatal
(or: cannot happen), and that there's nothing that can be done to
recover. Hence I never checked the function's return value.
Was that wrong?
[*] at least on Renesas SoCs with Clock and/or Power Domains.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Sun, Jun 14, 2020 at 12:00 PM Geert Uytterhoeven
<[email protected]> wrote:
> On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > counters? Really? This feels wrong and unintuitive to me.
> > >
> > > Yeah, that is a well known issue with PM (I even have for a long time
> > > a coccinelle script, when I realized myself that there are a lot of
> > > cases like this, but someone else discovered this recently, like
> > > opening a can of worms).
> > >
> > > > I expect there
> > > > has been a discussion around it but I couldn't find it.
> > >
> > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> >
> > I _think_ this discussion, but may be it's simple another tentacle of
> > the same octopus.
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> Thanks, hadn't read that one! (so I was still at -1 from
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
>
> So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> pm_runtime_get_sync() failure?
My biggest worry here is all those copycats jumping on the bandwagon,
and sending untested[*] patches that end up calling the wrong function.
[*] Several of them turned out to introduce trivial compile warnings, so
I now consider all patches authored by the same person as untested.
Gr{oetje,eeting}s,
Geert
On Sun, Jun 14, 2020 at 1:05 PM Geert Uytterhoeven <[email protected]> wrote:
> On Sun, Jun 14, 2020 at 12:00 PM Geert Uytterhoeven
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> > <[email protected]> wrote:
> > > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > > <[email protected]> wrote:
> > > >
> > > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > > counters? Really? This feels wrong and unintuitive to me.
> > > >
> > > > Yeah, that is a well known issue with PM (I even have for a long time
> > > > a coccinelle script, when I realized myself that there are a lot of
> > > > cases like this, but someone else discovered this recently, like
> > > > opening a can of worms).
> > > >
> > > > > I expect there
> > > > > has been a discussion around it but I couldn't find it.
> > > >
> > > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> > >
> > > I _think_ this discussion, but may be it's simple another tentacle of
> > > the same octopus.
> > > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
> >
> > Thanks, hadn't read that one! (so I was still at -1 from
> > http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
> >
> > So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> > pm_runtime_get_sync() failure?
>
> My biggest worry here is all those copycats jumping on the bandwagon,
> and sending untested[*] patches that end up calling the wrong function.
>
> [*] Several of them turned out to introduce trivial compile warnings, so
> I now consider all patches authored by the same person as untested.
That's always a problem with janitor-like patches...
I once tried asking them to provide testing material, but...
- some maintainers just accept them without asking questions
- some maintainers even defend them, saying they are doing a good job
(and the LWN top-contributor statistics also motivate some janitors,
though I don't consider them the best metric)
- almost no contributor answered my queries, so I consider all such
patches untested regardless of the author's name (if the name
appears in more than a dozen patches, esp. in different subsystems)
- and yes, it's a trade-off; some of the patches are indeed useful.
--
With Best Regards,
Andy Shevchenko
On Sun, Jun 14, 2020 at 1:00 PM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Andy,
>
> On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > counters? Really? This feels wrong and unintuitive to me.
> > >
> > > Yeah, that is a well known issue with PM (I even have for a long time
> > > a coccinelle script, when I realized myself that there are a lot of
> > > cases like this, but someone else discovered this recently, like
> > > opening a can of worms).
> > >
> > > > I expect there
> > > > has been a discussion around it but I couldn't find it.
> > >
> > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> >
> > I _think_ this discussion, but may be it's simple another tentacle of
> > the same octopus.
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> Thanks, hadn't read that one! (so I was still at -1 from
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
This one seems to be the starting point:
https://lkml.org/lkml/2020/5/20/1100
> So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> pm_runtime_get_sync() failure?
Depends. If you are using autosuspend, then
pm_runtime_put_autosuspend() is probably the right one.
> > > > I wonder why we
> > > > don't fix the code where the incremented refcount is expected for some
> > > > reason.
> > >
> > > The main idea behind API that a lot of drivers do *not* check error
> > > codes from runtime PM, so, we need to keep balance in case of
> > >
> > > pm_runtime_get(...);
> > > ...
> > > pm_runtime_put(...);
>
> I've always[*] considered a pm_runtime_get_sync() failure to be fatal
> (or: cannot happen), and that there's nothing that can be done to
> recover. Hence I never checked the function's return value.
> Was that wrong?
>
> [*] at least on Renesas SoCs with Clock and/or Power Domains.
--
With Best Regards,
Andy Shevchenko
On Sun, Jun 14, 2020 at 11:08 AM Wolfram Sang <[email protected]> wrote:
>
> Hi Linux-PM,
>
> both in the I2C subsystem and also for Renesas drivers I maintain, I am
> starting to get boilerplate patches doing some pm_runtime_put_* variant
> because a failing pm_runtime_get is supposed to increase the ref
> counters? Really?
Yes. Really.
pm_runtime_get*() have been doing this forever, because the majority
of their users do something like
    pm_runtime_get*()
    ...
    pm_runtime_put*()
without checking the return values and they don't need to worry about
the refcounts, which wouldn't be possible otherwise.
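
To make the balancing argument concrete, here is a minimal userspace sketch, not kernel code. The struct mock_dev type and the mock_* helpers are hypothetical stand-ins invented for illustration. They only model the counter behavior described above, on the assumption that the get increments the usage counter whether or not the resume succeeds:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct device; not kernel code. */
struct mock_dev {
	int usage_count;   /* models the runtime PM usage counter */
	bool resume_fails; /* when true, the simulated resume returns an error */
};

/*
 * Models pm_runtime_get_sync(): the counter is bumped before the resume
 * attempt and stays bumped even if the resume fails.
 */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -EIO : 0;
}

/* Models pm_runtime_put(): drops the reference taken by the get. */
static void mock_put(struct mock_dev *dev)
{
	dev->usage_count--;
}

/*
 * The common driver pattern: get, do the work, put, return value ignored.
 * Returns the usage counter left behind so callers can inspect the balance.
 */
static int mock_unchecked_transfer(struct mock_dev *dev)
{
	(void)mock_get_sync(dev);
	/* ... hardware access would go here ... */
	mock_put(dev);
	return dev->usage_count;
}
```

In this model the unchecked get/put pair leaves the counter where it started in both the success and the failure case. That would not hold if a failing get skipped the increment, because the unconditional put would then underflow the counter.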
> This feels wrong and unintuitive to me. I expect there
> has been a discussion around it but I couldn't find it. I wonder why we
> don't fix the code where the incremented refcount is expected for some
> reason.
>
> Can I have some pointers please?
The behavior is actually documented in
Documentation/power/runtime_pm.rst and I'm working on kerneldoc
comments for runtime PM functions in general to make it a bit more
clear.
Cheers!
On Sun, Jun 14, 2020 at 12:00 PM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Andy,
>
> On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > counters? Really? This feels wrong and unintuitive to me.
> > >
> > > Yeah, that is a well known issue with PM (I even have for a long time
> > > a coccinelle script, when I realized myself that there are a lot of
> > > cases like this, but someone else discovered this recently, like
> > > opening a can of worms).
> > >
> > > > I expect there
> > > > has been a discussion around it but I couldn't find it.
> > >
> > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> >
> > I _think_ this discussion, but may be it's simple another tentacle of
> > the same octopus.
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> Thanks, hadn't read that one! (so I was still at -1 from
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
>
> So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> pm_runtime_get_sync() failure?
If you bail out immediately on errors, then yes, it is.
If you'd rather do something like

    ret = pm_runtime_get_sync(dev);
    if (ret < 0)
        goto fail;

    ... code depending on PM ...

  fail:
    pm_runtime_put_autosuspend(dev);

then it will still work correctly.
It actually doesn't matter which pm_runtime_put*() variant you call
after a pm_runtime_get_sync() failure, but the _noidle() variant is
the simplest one and it is sufficient.
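
The difference between bailing out without a put and sharing the put on the error path can be sketched in userspace as follows. This is an illustrative model only: struct mock_dev and all mock_* functions are hypothetical names, and mock_put() stands for any pm_runtime_put*() variant, since only the decrement is modeled here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct device; not kernel code. */
struct mock_dev {
	int usage_count;   /* models the runtime PM usage counter */
	bool resume_fails; /* when true, the simulated resume returns an error */
};

/* Models pm_runtime_get_sync(): the counter stays incremented on failure. */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -EIO : 0;
}

/* Models any pm_runtime_put*() variant: only the decrement matters here. */
static void mock_put(struct mock_dev *dev)
{
	dev->usage_count--;
}

/* Unbalanced: bails out on error without dropping the reference. */
static int mock_xfer_leaky(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0)
		return ret; /* usage_count is left incremented */
	/* ... code depending on PM ... */
	mock_put(dev);
	return 0;
}

/* Balanced: the error path falls through to the same put. */
static int mock_xfer_balanced(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0)
		goto fail;
	/* ... code depending on PM ... */
	ret = 0;
fail:
	mock_put(dev);
	return ret;
}
```

With a failing resume, the first function leaves the modeled counter one too high while the second restores it. This is the imbalance the boilerplate patches are trying to close.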
> > > > I wonder why we
> > > > don't fix the code where the incremented refcount is expected for some
> > > > reason.
> > >
> > > The main idea behind API that a lot of drivers do *not* check error
> > > codes from runtime PM, so, we need to keep balance in case of
> > >
> > > pm_runtime_get(...);
> > > ...
> > > pm_runtime_put(...);
>
> I've always[*] considered a pm_runtime_get_sync() failure to be fatal
> (or: cannot happen), and that there's nothing that can be done to
> recover. Hence I never checked the function's return value.
> Was that wrong?
No, it wasn't. It is the right thing to do in the majority of cases.
> [*] at least on Renesas SoCs with Clock and/or Power Domains.
Cheers!
Hi Geert and Rafael,
> > I've always[*] considered a pm_runtime_get_sync() failure to be fatal
> > (or: cannot happen), and that there's nothing that can be done to
> > recover. Hence I never checked the function's return value.
> > Was that wrong?
>
> No, it wasn't. It is the right thing to do in the majority of cases.
OK, if *not checking* the retval is the major use case, then I
understand why the refcount is incremented even on failure.
However, that probably means that for most patches I am getting, the
better fix would be to remove the error checking? (I assume most people
put the error check in there to be on the "safe side" without having a
real reason to do it.)
And thanks for adding more hints to the kerneldoc! I think this will
help the case a lot.
Kind regards,
Wolfram
> However, that probably means that for most patches I am getting, the
> better fix would be to remove the error checking? (I assume most people
> put the error check in there to be on the "safe side" without having a
> real argument to really do it.)
Kindly asking for more input here: would a better answer to all these
patches be to ask whether the error checking could be removed instead?