2020-06-14 09:10:20

by Wolfram Sang

Subject: RFC: a failing pm_runtime_get increases the refcnt?

Hi Linux-PM,

both in the I2C subsystem and also for Renesas drivers I maintain, I am
starting to get boilerplate patches doing some pm_runtime_put_* variant
because a failing pm_runtime_get is supposed to increase the ref
counters? Really? This feels wrong and unintuitive to me. I expect there
has been a discussion around it but I couldn't find it. I wonder why we
don't fix the code where the incremented refcount is expected for some
reason.

Can I have some pointers please?

Thanks,

Wolfram



2020-06-14 09:37:17

by Andy Shevchenko

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> both in the I2C subsystem and also for Renesas drivers I maintain, I am
> starting to get boilerplate patches doing some pm_runtime_put_* variant
> because a failing pm_runtime_get is supposed to increase the ref
> counters? Really? This feels wrong and unintuitive to me.

Yeah, that is a well-known issue with PM. (I have had a coccinelle
script for a long time, written when I realized that there are a lot of
cases like this, but someone else discovered this recently, which was
like opening a can of worms.)

> I expect there
> has been a discussion around it but I couldn't find it.

Rafael explained this (again) recently. Unfortunately, I can't find it quickly.

> I wonder why we
> don't fix the code where the incremented refcount is expected for some
> reason.

The main idea behind the API is that a lot of drivers do *not* check
error codes from runtime PM, so we need to keep the refcount balanced
in case of

pm_runtime_get(...);
...
pm_runtime_put(...);

> Can I have some pointers please?

--
With Best Regards,
Andy Shevchenko

2020-06-14 09:45:18

by Andy Shevchenko

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > because a failing pm_runtime_get is supposed to increase the ref
> > counters? Really? This feels wrong and unintuitive to me.
>
> Yeah, that is a well known issue with PM (I even have for a long time
> a coccinelle script, when I realized myself that there are a lot of
> cases like this, but someone else discovered this recently, like
> opening a can of worms).
>
> > I expect there
> > has been a discussion around it but I couldn't find it.
>
> Rafael explained (again) recently this. I can't find it quickly, unfortunately.

I _think_ it was this discussion, but maybe it's simply another tentacle
of the same octopus.
https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/

>
> > I wonder why we
> > don't fix the code where the incremented refcount is expected for some
> > reason.
>
> The main idea behind API that a lot of drivers do *not* check error
> codes from runtime PM, so, we need to keep balance in case of
>
> pm_runtime_get(...);
> ...
> pm_runtime_put(...);
>
> > Can I have some pointers please?
>
> --
> With Best Regards,
> Andy Shevchenko



--
With Best Regards,
Andy Shevchenko

2020-06-14 10:02:47

by Geert Uytterhoeven

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

Hi Andy,

On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
<[email protected]> wrote:
> On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> <[email protected]> wrote:
> >
> > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > because a failing pm_runtime_get is supposed to increase the ref
> > > counters? Really? This feels wrong and unintuitive to me.
> >
> > Yeah, that is a well known issue with PM (I even have for a long time
> > a coccinelle script, when I realized myself that there are a lot of
> > cases like this, but someone else discovered this recently, like
> > opening a can of worms).
> >
> > > I expect there
> > > has been a discussion around it but I couldn't find it.
> >
> > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
>
> I _think_ this discussion, but may be it's simple another tentacle of
> the same octopus.
> https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/

Thanks, hadn't read that one! (so I was still at -1 from
http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)

So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
pm_runtime_get_sync() failure?

> > > I wonder why we
> > > don't fix the code where the incremented refcount is expected for some
> > > reason.
> >
> > The main idea behind API that a lot of drivers do *not* check error
> > codes from runtime PM, so, we need to keep balance in case of
> >
> > pm_runtime_get(...);
> > ...
> > pm_runtime_put(...);

I've always[*] considered a pm_runtime_get_sync() failure to be fatal
(or: cannot happen), and that there's nothing that can be done to
recover. Hence I never checked the function's return value.
Was that wrong?

[*] at least on Renesas SoCs with Clock and/or Power Domains.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2020-06-14 10:07:04

by Geert Uytterhoeven

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 12:00 PM Geert Uytterhoeven
<[email protected]> wrote:
> On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > counters? Really? This feels wrong and unintuitive to me.
> > >
> > > Yeah, that is a well known issue with PM (I even have for a long time
> > > a coccinelle script, when I realized myself that there are a lot of
> > > cases like this, but someone else discovered this recently, like
> > > opening a can of worms).
> > >
> > > > I expect there
> > > > has been a discussion around it but I couldn't find it.
> > >
> > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> >
> > I _think_ this discussion, but may be it's simple another tentacle of
> > the same octopus.
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> Thanks, hadn't read that one! (so I was still at -1 from
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
>
> So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> pm_runtime_get_sync() failure?

My biggest worry here is all those copycats jumping on the bandwagon,
and sending untested[*] patches that end up calling the wrong function.

[*] Several of them turned out to introduce trivial compile warnings, so
I now consider all patches authored by the same person as untested.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2020-06-14 10:49:24

by Andy Shevchenko

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 1:05 PM Geert Uytterhoeven <[email protected]> wrote:
> On Sun, Jun 14, 2020 at 12:00 PM Geert Uytterhoeven
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> > <[email protected]> wrote:
> > > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > > <[email protected]> wrote:
> > > >
> > > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > > counters? Really? This feels wrong and unintuitive to me.
> > > >
> > > > Yeah, that is a well known issue with PM (I even have for a long time
> > > > a coccinelle script, when I realized myself that there are a lot of
> > > > cases like this, but someone else discovered this recently, like
> > > > opening a can of worms).
> > > >
> > > > > I expect there
> > > > > has been a discussion around it but I couldn't find it.
> > > >
> > > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> > >
> > > I _think_ this discussion, but may be it's simple another tentacle of
> > > the same octopus.
> > > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
> >
> > Thanks, hadn't read that one! (so I was still at -1 from
> > http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
> >
> > So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> > pm_runtime_get_sync() failure?
>
> My biggest worry here is all those copycats jumping on the bandwagon,
> and sending untested[*] patches that end up calling the wrong function.
>
> [*] Several of them turned out to introduce trivial compile warnings, so
> I now consider all patches authored by the same person as untested.

That's always a problem with janitor-like patches...
I once tried asking them to provide testing material, but...
- some maintainers just accept them without asking questions
- some maintainers even defend them, saying that they are doing a good job
(and LWN top contributor statistics also motivate some of the janitors,
though I don't consider them the best metric)
- almost no contributor answered my queries, so I consider all of them
untested regardless of the name (if the name appears in more than a
dozen patches, especially in different subsystems)
- and yes, it's a trade-off; some of the patches are indeed useful.


--
With Best Regards,
Andy Shevchenko

2020-06-14 12:44:39

by Andy Shevchenko

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 1:00 PM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Andy,
>
> On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > counters? Really? This feels wrong and unintuitive to me.
> > >
> > > Yeah, that is a well known issue with PM (I even have for a long time
> > > a coccinelle script, when I realized myself that there are a lot of
> > > cases like this, but someone else discovered this recently, like
> > > opening a can of worms).
> > >
> > > > I expect there
> > > > has been a discussion around it but I couldn't find it.
> > >
> > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> >
> > I _think_ this discussion, but may be it's simple another tentacle of
> > the same octopus.
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> Thanks, hadn't read that one! (so I was still at -1 from
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)

This one seems to be the starting point:

https://lkml.org/lkml/2020/5/20/1100

> So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> pm_runtime_get_sync() failure?

Depends. If you are using autosuspend, then put_autosuspend() probably
is the right one.

> > > > I wonder why we
> > > > don't fix the code where the incremented refcount is expected for some
> > > > reason.
> > >
> > > The main idea behind API that a lot of drivers do *not* check error
> > > codes from runtime PM, so, we need to keep balance in case of
> > >
> > > pm_runtime_get(...);
> > > ...
> > > pm_runtime_put(...);
>
> I've always[*] considered a pm_runtime_get_sync() failure to be fatal
> (or: cannot happen), and that there's nothing that can be done to
> recover. Hence I never checked the function's return value.
> Was that wrong?
>
> [*] at least on Renesas SoCs with Clock and/or Power Domains.
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds



--
With Best Regards,
Andy Shevchenko

2020-06-14 13:52:32

by Rafael J. Wysocki

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 11:08 AM Wolfram Sang <[email protected]> wrote:
>
> Hi Linux-PM,
>
> both in the I2C subsystem and also for Renesas drivers I maintain, I am
> starting to get boilerplate patches doing some pm_runtime_put_* variant
> because a failing pm_runtime_get is supposed to increase the ref
> counters? Really?

Yes. Really.

pm_runtime_get*() have been doing this forever, because the majority
of their users do something like

pm_runtime_get*()

...

pm_runtime_put*()

without checking the return values and they don't need to worry about
the refcounts, which wouldn't be possible otherwise.
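
As a rough illustration of that balance (a userspace model, not kernel
code): the sketch below assumes a hypothetical `mock_dev` with a usage
counter and hypothetical helpers `mock_get_sync()`, `mock_put()` and
`unchecked_get_put()`. None of these names exist in the kernel; they only
mimic the "increment even on failure" behavior described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a device; not a kernel structure. */
struct mock_dev {
	int usage_count;   /* models the runtime PM usage counter */
	bool resume_fails; /* when true, the mocked resume reports an error */
};

/*
 * Models the behavior discussed in this thread: the counter is
 * incremented first and stays incremented even if the resume fails.
 */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -EIO : 0;
}

/* Models a put: drops exactly one reference. */
static void mock_put(struct mock_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* The common driver pattern: get, work, put, return value ignored. */
static void unchecked_get_put(struct mock_dev *dev)
{
	(void)mock_get_sync(dev);
	/* ... access the hardware ... */
	mock_put(dev);
}
```

In this model the counter ends up back at its starting value whether or
not the mocked resume fails, which is the property callers that ignore
the return value rely on.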

> This feels wrong and unintuitive to me. I expect there
> has been a discussion around it but I couldn't find it. I wonder why we
> don't fix the code where the incremented refcount is expected for some
> reason.
>
> Can I have some pointers please?

The behavior is actually documented in
Documentation/power/runtime_pm.rst and I'm working on kerneldoc
comments for runtime PM functions in general to make it a bit more
clear.

Cheers!

2020-06-14 14:01:35

by Rafael J. Wysocki

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

On Sun, Jun 14, 2020 at 12:00 PM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Andy,
>
> On Sun, Jun 14, 2020 at 11:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Sun, Jun 14, 2020 at 12:34 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 12:10 PM Wolfram Sang <[email protected]> wrote:
> > > > both in the I2C subsystem and also for Renesas drivers I maintain, I am
> > > > starting to get boilerplate patches doing some pm_runtime_put_* variant
> > > > because a failing pm_runtime_get is supposed to increase the ref
> > > > counters? Really? This feels wrong and unintuitive to me.
> > >
> > > Yeah, that is a well known issue with PM (I even have for a long time
> > > a coccinelle script, when I realized myself that there are a lot of
> > > cases like this, but someone else discovered this recently, like
> > > opening a can of worms).
> > >
> > > > I expect there
> > > > has been a discussion around it but I couldn't find it.
> > >
> > > Rafael explained (again) recently this. I can't find it quickly, unfortunately.
> >
> > I _think_ this discussion, but may be it's simple another tentacle of
> > the same octopus.
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/[email protected]/
>
> Thanks, hadn't read that one! (so I was still at -1 from
> http://sweng.the-davies.net/Home/rustys-api-design-manifesto ;-)
>
> So "pm_runtime_put_noidle()" is the (definitive?) one to pair with a
> pm_runtime_get_sync() failure?

If you bail out immediately on errors, then yes, it is.

If you'd rather do something like

        ret = pm_runtime_get_sync(dev);
        if (ret < 0)
                goto fail;

        ... code depending on PM ...

fail:
        pm_runtime_put_autosuspend(dev);

then it will still work correctly.

It actually doesn't matter which pm_runtime_put*() variant you call
after a pm_runtime_get_sync() failure, but the _noidle() is the
simplest one and it is sufficient.
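
For the "bail out immediately" case, here is a small userspace model
(again, not kernel code) contrasting an early return that forgets the
put with one that drops the reference first. All names (`mock_dev`,
`mock_get_sync()`, `mock_put()`, `mock_put_noidle()`, `transfer_leaky()`,
`transfer_balanced()`) are hypothetical and only mimic the counting
behavior discussed here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a device; not a kernel structure. */
struct mock_dev {
	int usage_count;   /* models the runtime PM usage counter */
	bool resume_fails; /* when true, the mocked resume reports an error */
};

/* Increments the counter even when the mocked resume fails. */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -EIO : 0;
}

/* Models a regular put on the success path. */
static void mock_put(struct mock_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Models the simplest put variant: only drops the reference. */
static void mock_put_noidle(struct mock_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Checks the error but returns without a put: the counter leaks. */
static int transfer_leaky(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0)
		return ret; /* usage_count is left incremented */
	/* ... code depending on PM ... */
	mock_put(dev);
	return 0;
}

/* Checks the error and drops the reference before bailing out. */
static int transfer_balanced(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0) {
		mock_put_noidle(dev);
		return ret;
	}
	/* ... code depending on PM ... */
	mock_put(dev);
	return 0;
}
```

In this model, the leaky variant leaves the counter one too high after a
failed get, while the balanced variant returns it to its starting value.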

> > > > I wonder why we
> > > > don't fix the code where the incremented refcount is expected for some
> > > > reason.
> > >
> > > The main idea behind API that a lot of drivers do *not* check error
> > > codes from runtime PM, so, we need to keep balance in case of
> > >
> > > pm_runtime_get(...);
> > > ...
> > > pm_runtime_put(...);
>
> I've always[*] considered a pm_runtime_get_sync() failure to be fatal
> (or: cannot happen), and that there's nothing that can be done to
> recover. Hence I never checked the function's return value.
> Was that wrong?

No, it wasn't. It is the right thing to do in the majority of cases.

> [*] at least on Renesas SoCs with Clock and/or Power Domains.

Cheers!

2020-06-14 14:09:38

by Wolfram Sang

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?

Hi Geert and Rafael,

> > I've always[*] considered a pm_runtime_get_sync() failure to be fatal
> > (or: cannot happen), and that there's nothing that can be done to
> > recover. Hence I never checked the function's return value.
> > Was that wrong?
>
> No, it wasn't. It is the right thing to do in the majority of cases.

OK, if *not checking* the retval is the major use case, then I
understand that ref counting takes place.

However, that probably means that for most patches I am getting, the
better fix would be to remove the error checking? (I assume most people
put the error check in there to be on the "safe side" without having a
real reason for it.)

And thanks for putting more hints to kernel doc! I think this will help
the case a lot.

Kind regards,

Wolfram



2020-06-30 21:02:27

by Wolfram Sang

Subject: Re: RFC: a failing pm_runtime_get increases the refcnt?


> However, that probably means that for most patches I am getting, the
> better fix would be to remove the error checking? (I assume most people
> put the error check in there to be on the "safe side" without having a
> real argument to really do it.)

Kindly asking for more input here: would a better answer to all these
patches be to ask whether the error checking could be removed instead?

