2014-01-26 19:28:17

by Guenter Roeck

[permalink] [raw]
Subject: lm90 driver no longer working on PCs in 3.13

Hi,

the lm90 driver is no longer working on PCs with the 3.13 kernel ... or at least not without
special configuration.

This is what I get if I try to instantiate a device on it (max6695):

i2c 1-0018: Driver lm90 requests probe deferral
i2c i2c-1: new_device: Instantiated device max6695 at 0x18

The regulator core always returns -EPROBE_DEFER if the platform does not support devicetree
and if the regulator it is looking for does not exist. Since the driver now requires a mandatory
regulator (commit 3e0f964f2ad - hwmon: (lm90) Add power control), and the regulator it requests
does not exist on a PC, the result is not really surprising. I thought the regulator core would
realize that it has to return a dummy regulator, but apparently that is not the case, or I don't
know how to configure it.

Any idea what I might need to do to get it working ?

Thanks,
Guenter


2014-01-26 20:14:13

by Jean Delvare

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

Hi Guenter,

Adding Wei Ni to Cc, as he provided the commit which causes problem.

On Sun, 26 Jan 2014 11:28:16 -0800, Guenter Roeck wrote:
> Hi,
>
> the lm90 driver is no longer working on PCs with the 3.13 kernel ... or at least not without
> special configuration.
>
> This is what I get if I try to instantiate a device on it (max6695):
>
> i2c 1-0018: Driver lm90 requests probe deferral
> i2c i2c-1: new_device: Instantiated device max6695 at 0x18
>
> The regulator core always returns -EPROBE_DEFER if the platform does not support devicetree
> and if the regulator it is looking for does not exist. Since the driver now requires a mandatory
> regulator (commit 3e0f964f2ad - hwmon: (lm90) Add power control), and the regulator it requests
> does not exist on a PC, the result is not really surprising. I thought the regulator core would
> realize that it has to return a dummy regulator, but apparently that is not the case, or I don't
> know how to configure it.
>
> Any idea what I might need to do to get it working ?

Me, I really don't know. I seem to remember I tested Wei's patch set on
an emulated ADM1032 chip and it was working fine. So maybe it depends
on the kernel configuration, or something changed on the regulator side
meanwhile.

--
Jean Delvare

2014-01-26 20:42:21

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 09:13:57PM +0100, Jean Delvare wrote:

> Me, I really don't know. I seem to remember I tested Wei's patch set on
> an emulated ADM1032 chip and it was working fine. So maybe it depends
> on the kernel configuration, or something changed on the regulator side
> meanwhile.

The regulator API hasn't changed here, probably the regulator API was
enabled when previously it was disabled. If the platform doesn't have
regulator support either natively or via use of a firmware interface
like OF then enabling it is going to cause problems since no supplies
will be mapped in.


Attachments:
(No filename) (596.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 20:44:39

by Guenter Roeck

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

Hi Jean,

On 01/26/2014 12:13 PM, Jean Delvare wrote:
> Hi Guenter,
>
> Adding Wei Ni to Cc, as he provided the commit which causes problem.
>
> On Sun, 26 Jan 2014 11:28:16 -0800, Guenter Roeck wrote:
>> Hi,
>>
>> the lm90 driver is no longer working on PCs with the 3.13 kernel ... or at least not without
>> special configuration.
>>
>> This is what I get if I try to instantiate a device on it (max6695):
>>
>> i2c 1-0018: Driver lm90 requests probe deferral
>> i2c i2c-1: new_device: Instantiated device max6695 at 0x18
>>
>> The regulator core always returns -EPROBE_DEFER if the platform does not support devicetree
>> and if the regulator it is looking for does not exist. Since the driver now requires a mandatory
>> regulator (commit 3e0f964f2ad - hwmon: (lm90) Add power control), and the regulator it requests
>> does not exist on a PC, the result is not really surprising. I thought the regulator core would
>> realize that it has to return a dummy regulator, but apparently that is not the case, or I don't
>> know how to configure it.
>>
>> Any idea what I might need to do to get it working ?
>
> Me, I really don't know. I seem to remember I tested Wei's patch set on
> an emulated ADM1032 chip and it was working fine. So maybe it depends
> on the kernel configuration, or something changed on the regulator side
> meanwhile.
>

The regulator code changed with 3.13; the dummy regulator no longer exists,
and the functionality it provided is supposed to be handled automatically.
But that only really works on devicetree based systems and otherwise returns
-EPROBE_DEFER as mentioned above.

Maybe there is some configuration option, or maybe something needs to be
configured from user space. I found neither. In the first case, we should create
a dependency for the LM90 driver; in the latter case, we would have to make sure
that it is well documented (I'd grumble on that, though - it would result in
never ending trouble for us, having to repeatedly explain how this is now
supposed to work).

Another possible fix would be to have the regulator core return -ENODEV
instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
or even feasible.

Guenter

2014-01-26 20:49:48

by Jean Delvare

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, 26 Jan 2014 12:44:38 -0800, Guenter Roeck wrote:
> On 01/26/2014 12:13 PM, Jean Delvare wrote:
> The regulator code changed with 3.13; the dummy regulator no longer exists,
> and the functionality it provided is supposed to be handled automatically.
> But that only really works on devicetree based systems and otherwise returns
> -EPROBE_DEFER as mentioned above.
>
> Maybe there is some configuration option, or maybe something needs to be
> configured from user space. I found neither.

Neither would be acceptable to my eyes anyway. Things worked out of the
box before, they should keep working out of the box.

> In the first case, we should create
> a dependency for the LM90 driver; in the latter case, we would have to make sure
> that it is well documented (I'd grumble on that, though - it would result in
> never ending trouble for us, having to repeatedly explain how this is now
> supposed to work).
>
> Another possible fix would be to have the regulator core return -ENODEV
> instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
> or even feasible.

Well, either the regulator subsystem gets fixed (or provides a suitable
API for drivers like lm90 and we update the lm90 driver to use it), or
I'll just revert the problematic commit for now. This is a severe
regression, we just can't leave things that way.

--
Jean Delvare

2014-01-26 20:52:26

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 12:44:38PM -0800, Guenter Roeck wrote:
> On 01/26/2014 12:13 PM, Jean Delvare wrote:

> >Me, I really don't know. I seem to remember I tested Wei's patch set on
> >an emulated ADM1032 chip and it was working fine. So maybe it depends
> >on the kernel configuration, or something changed on the regulator side
> >meanwhile.

> The regulator code changed with 3.13; the dummy regulator no longer exists,
> and the functionality it provided is supposed to be handled automatically.
> But that only really works on devicetree based systems and otherwise returns
> -EPROBE_DEFER as mentioned above.

CONFIG_REGULATOR_DUMMY should never have been used in production, it was
a debug tool to help bringup but it broke things as often as it fixed
them particularly with init ordering which is why it generated a warning
when it was used.

The dummy driver is still there, if you're doing bringup you can hack it
in still or if you genuniely used it then specify that full constraints
are provided like the changelog says (and as I've previously said).

> Another possible fix would be to have the regulator core return -ENODEV
> instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
> or even feasible.

No, this would introduce breakage due to init ordering.


Attachments:
(No filename) (1.27 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 21:23:12

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 09:49:36PM +0100, Jean Delvare wrote:
> On Sun, 26 Jan 2014 12:44:38 -0800, Guenter Roeck wrote:

> > Maybe there is some configuration option, or maybe something needs to be
> > configured from user space. I found neither.

> Neither would be acceptable to my eyes anyway. Things worked out of the
> box before, they should keep working out of the box.

They only worked with a debug option turned on and generated warnings
every time they were used... that kernel config would've been actively
broken for devices that wanted to do anything at all interesting with
the regulators and would've been prone to issues with init ordering and
races in any cases where there are actually regulators.

> > Another possible fix would be to have the regulator core return -ENODEV
> > instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
> > or even feasible.

> Well, either the regulator subsystem gets fixed (or provides a suitable
> API for drivers like lm90 and we update the lm90 driver to use it), or
> I'll just revert the problematic commit for now. This is a severe
> regression, we just can't leave things that way.

It's not an issue in the driver, it's an issue in the combination of the
platform and the kernel config. If the regulator API is going to be
turned on for the platform then either the platform needs to configure
the supplies or it needs to tell the regulator core that it's safe to
start using dummy regulators.

What is this platform and why does it have the regulator API enabled in
the first place?


Attachments:
(No filename) (1.54 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 21:40:35

by Guenter Roeck

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On 01/26/2014 01:22 PM, Mark Brown wrote:
> On Sun, Jan 26, 2014 at 09:49:36PM +0100, Jean Delvare wrote:
>> On Sun, 26 Jan 2014 12:44:38 -0800, Guenter Roeck wrote:
>
>>> Maybe there is some configuration option, or maybe something needs to be
>>> configured from user space. I found neither.
>
>> Neither would be acceptable to my eyes anyway. Things worked out of the
>> box before, they should keep working out of the box.
>
> They only worked with a debug option turned on and generated warnings
> every time they were used... that kernel config would've been actively
> broken for devices that wanted to do anything at all interesting with
> the regulators and would've been prone to issues with init ordering and
> races in any cases where there are actually regulators.
>
>>> Another possible fix would be to have the regulator core return -ENODEV
>>> instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
>>> or even feasible.
>
>> Well, either the regulator subsystem gets fixed (or provides a suitable
>> API for drivers like lm90 and we update the lm90 driver to use it), or
>> I'll just revert the problematic commit for now. This is a severe
>> regression, we just can't leave things that way.
>
> It's not an issue in the driver, it's an issue in the combination of the
> platform and the kernel config. If the regulator API is going to be
> turned on for the platform then either the platform needs to configure
> the supplies or it needs to tell the regulator core that it's safe to
> start using dummy regulators.
>
> What is this platform and why does it have the regulator API enabled in
> the first place?
>

It is a PC running Ubuntu. you'd have to ask the distro maintainers why they
turn on regulators.

config-3.8.0-26-generic:CONFIG_REGULATOR=y
config-3.8.0-35-generic:CONFIG_REGULATOR=y
config-3.11.0-13-generic:CONFIG_REGULATOR=y

Guenter

2014-01-26 21:44:45

by Jean Delvare

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

Hi Mark,

On Sun, 26 Jan 2014 21:22:56 +0000, Mark Brown wrote:
> On Sun, Jan 26, 2014 at 09:49:36PM +0100, Jean Delvare wrote:
> > On Sun, 26 Jan 2014 12:44:38 -0800, Guenter Roeck wrote:
>
> > > Maybe there is some configuration option, or maybe something needs to be
> > > configured from user space. I found neither.
>
> > Neither would be acceptable to my eyes anyway. Things worked out of the
> > box before, they should keep working out of the box.
>
> They only worked with a debug option turned on and generated warnings
> every time they were used... that kernel config would've been actively
> broken for devices that wanted to do anything at all interesting with
> the regulators and would've been prone to issues with init ordering and
> races in any cases where there are actually regulators.

No, you don't get what I'm saying. For PC users, the lm90 did not
request a regulator and things worked because the kernel isn't supposed
to take care about things like that on PC machines. Now that the lm90
driver does request a regulator, it fails on PC machines because no
regulator is declared.

PC users never had to do anything fancy to get the lm90 driver to work.
Just enabling the driver and loading it, did the trick. And that's
exactly how things must be. Hence my claim.

To be clear: my "before" is not referring to "before the dummy
regulator was dropped". It is referring to "before commit 3e0f964f2ad -
hwmon: (lm90) Add power control". I don't know a thing about all this
regulator stuff.

> > (...)
> > Well, either the regulator subsystem gets fixed (or provides a suitable
> > API for drivers like lm90 and we update the lm90 driver to use it), or
> > I'll just revert the problematic commit for now. This is a severe
> > regression, we just can't leave things that way.
>
> It's not an issue in the driver, it's an issue in the combination of the
> platform and the kernel config. If the regulator API is going to be
> turned on for the platform then either the platform needs to configure
> the supplies or it needs to tell the regulator core that it's safe to
> start using dummy regulators.
>
> What is this platform and why does it have the regulator API enabled in
> the first place?

Regular PC. It has regulator API enabled, because it is possible to
enable it, I suppose, and developers try to test various combinations
of kernel config options. Guenter can tell more, I do not have it
enabled (which is why it did not fail for me, BTW.)

Don't tell me that it is expected that things will fail if
CONFIG_REGULATOR is enabled on a system which doesn't need it. It
doesn't make any sense. If kernels would fail as soon as any enabled
option wasn't actually needed, no system would boot out there.

If regulator is enabled but not needed, that's something the regulator
subsystem should figure out at run-time so that it can stub everything
out and things keep working.

--
Jean Delvare

2014-01-26 21:47:10

by Guenter Roeck

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On 01/26/2014 12:52 PM, Mark Brown wrote:
> On Sun, Jan 26, 2014 at 12:44:38PM -0800, Guenter Roeck wrote:
>> On 01/26/2014 12:13 PM, Jean Delvare wrote:
>
>>> Me, I really don't know. I seem to remember I tested Wei's patch set on
>>> an emulated ADM1032 chip and it was working fine. So maybe it depends
>>> on the kernel configuration, or something changed on the regulator side
>>> meanwhile.
>
>> The regulator code changed with 3.13; the dummy regulator no longer exists,
>> and the functionality it provided is supposed to be handled automatically.
>> But that only really works on devicetree based systems and otherwise returns
>> -EPROBE_DEFER as mentioned above.
>
> CONFIG_REGULATOR_DUMMY should never have been used in production, it was
> a debug tool to help bringup but it broke things as often as it fixed
> them particularly with init ordering which is why it generated a warning
> when it was used.
>
> The dummy driver is still there, if you're doing bringup you can hack it
> in still or if you genuniely used it then specify that full constraints
> are provided like the changelog says (and as I've previously said).
>
>> Another possible fix would be to have the regulator core return -ENODEV
>> instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
>> or even feasible.
>
> No, this would introduce breakage due to init ordering.
>

You have a solution for that in dt configurations. I don't think you have one for
non-dt systems - you simply assume that all regulators are there. For dt, you even
have a constraint to tell the kernel if regulator configurations are fully specified,
and you automatically return success if not and if a regulator does not exist.
So you know that there is a problem. For non-DT configurations you simply assume and
expect that regulators are all declared. I don't think that is a feasible approach
for non-DT systems.

Guenter

2014-01-26 21:50:41

by Guenter Roeck

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On 01/26/2014 12:49 PM, Jean Delvare wrote:
> On Sun, 26 Jan 2014 12:44:38 -0800, Guenter Roeck wrote:
>> On 01/26/2014 12:13 PM, Jean Delvare wrote:
>> The regulator code changed with 3.13; the dummy regulator no longer exists,
>> and the functionality it provided is supposed to be handled automatically.
>> But that only really works on devicetree based systems and otherwise returns
>> -EPROBE_DEFER as mentioned above.
>>
>> Maybe there is some configuration option, or maybe something needs to be
>> configured from user space. I found neither.
>
> Neither would be acceptable to my eyes anyway. Things worked out of the
> box before, they should keep working out of the box.
>
>> In the first case, we should create
>> a dependency for the LM90 driver; in the latter case, we would have to make sure
>> that it is well documented (I'd grumble on that, though - it would result in
>> never ending trouble for us, having to repeatedly explain how this is now
>> supposed to work).
>>
>> Another possible fix would be to have the regulator core return -ENODEV
>> instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
>> or even feasible.
>
> Well, either the regulator subsystem gets fixed (or provides a suitable
> API for drivers like lm90 and we update the lm90 driver to use it), or
> I'll just revert the problematic commit for now. This is a severe
> regression, we just can't leave things that way.
>

Maybe your configuration has CONFIG_REGULATORS disabled. Ubuntu has it enabled.
I don't know about others.

I agree, we may have to revert the patch. I don't think the regulator API works well
enough in non-dt systems to be able to use it in such systems. Mark's expectation
that regulator support must be disabled if regulators are not fully declared in non-dt
systems doesn't seem very useful nor really feasible.

Guenter

2014-01-26 21:54:07

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 01:40:30PM -0800, Guenter Roeck wrote:
> On 01/26/2014 01:22 PM, Mark Brown wrote:

> >What is this platform and why does it have the regulator API enabled in
> >the first place?

> It is a PC running Ubuntu. you'd have to ask the distro maintainers why they
> turn on regulators.

> config-3.8.0-26-generic:CONFIG_REGULATOR=y
> config-3.8.0-35-generic:CONFIG_REGULATOR=y
> config-3.11.0-13-generic:CONFIG_REGULATOR=y

Oh, magic. Can you please file a bug with or otherwise talk to them
asking them to turn that off (or contribute code to make it work)? Feel
free to add me as a CC when you do so.


Attachments:
(No filename) (624.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 22:02:16

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 01:47:09PM -0800, Guenter Roeck wrote:

Please fix your mailer to word wrap within 80 columns, I've reflowed for
legibility.

> You have a solution for that in dt configurations. I don't think you
> have one for non-dt systems - you simply assume that all regulators
> are there. For dt, you even have a constraint to tell the kernel if
> regulator configurations are fully specified, and you automatically
> return success if not and if a regulator does not exist. So you know
> that there is a problem. For non-DT configurations you simply assume
> and expect that regulators are all declared. I don't think that is a
> feasible approach for non-DT systems.

To repeat the mechanism that's there for other platforms is to either
flag that they've provided full constraints or provide the supply
mappings. This has always been the case, there's nothing new here - the
only change there's been in this is that in v3.13 we refactored how
systems using DT flag that they have full constraints.

I agree that it's not going to work terribly well if neither of these
things has been done, it never has done and that's why the API has stubs
when it's disabled - they allow us to avoid a hard dependency on it in
drivers so people only need to turn it on if they're actually using it.
I'm very surprised to see anyone doing that to be honest, if you're not
using it it's just going to waste cycles and RAM.


Attachments:
(No filename) (1.39 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 22:02:49

by Guenter Roeck

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On 01/26/2014 01:53 PM, Mark Brown wrote:
> On Sun, Jan 26, 2014 at 01:40:30PM -0800, Guenter Roeck wrote:
>> On 01/26/2014 01:22 PM, Mark Brown wrote:
>
>>> What is this platform and why does it have the regulator API enabled in
>>> the first place?
>
>> It is a PC running Ubuntu. you'd have to ask the distro maintainers why they
>> turn on regulators.
>
>> config-3.8.0-26-generic:CONFIG_REGULATOR=y
>> config-3.8.0-35-generic:CONFIG_REGULATOR=y
>> config-3.11.0-13-generic:CONFIG_REGULATOR=y
>
> Oh, magic. Can you please file a bug with or otherwise talk to them
> asking them to turn that off (or contribute code to make it work)? Feel
> free to add me as a CC when you do so.
>

That means they would have to stop supporting anything that _does_ need regulators,
which doesn't make much sense to me.

I think the bug is on your side, not theirs.

Guenter

2014-01-26 22:04:08

by Guenter Roeck

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On 01/26/2014 01:50 PM, Guenter Roeck wrote:
> On 01/26/2014 12:49 PM, Jean Delvare wrote:
>> On Sun, 26 Jan 2014 12:44:38 -0800, Guenter Roeck wrote:
>>> On 01/26/2014 12:13 PM, Jean Delvare wrote:
>>> The regulator code changed with 3.13; the dummy regulator no longer exists,
>>> and the functionality it provided is supposed to be handled automatically.
>>> But that only really works on devicetree based systems and otherwise returns
>>> -EPROBE_DEFER as mentioned above.
>>>
>>> Maybe there is some configuration option, or maybe something needs to be
>>> configured from user space. I found neither.
>>
>> Neither would be acceptable to my eyes anyway. Things worked out of the
>> box before, they should keep working out of the box.
>>
>>> In the first case, we should create
>>> a dependency for the LM90 driver; in the latter case, we would have to make sure
>>> that it is well documented (I'd grumble on that, though - it would result in
>>> never ending trouble for us, having to repeatedly explain how this is now
>>> supposed to work).
>>>
>>> Another possible fix would be to have the regulator core return -ENODEV
>>> instead of -EPROBE_DEFER on non-dt systems. No idea if this would be acceptable
>>> or even feasible.
>>
>> Well, either the regulator subsystem gets fixed (or provides a suitable
>> API for drivers like lm90 and we update the lm90 driver to use it), or
>> I'll just revert the problematic commit for now. This is a severe
>> regression, we just can't leave things that way.
>>
>
> Maybe your configuration has CONFIG_REGULATORS disabled. Ubuntu has it enabled.
> I don't know about others.
>
> I agree, we may have to revert the patch. I don't think the regulator API works well
> enough in non-dt systems to be able to use it in such systems. Mark's expectation
> that regulator support must be disabled if regulators are not fully declared in non-dt
> systems doesn't seem very useful nor really feasible.
>

I think I have a better idea: Surround the regulator code, or at least its error handling,
in the lm90 driver with
if (IS_ENABLED(CONFIG_OF)) {
}

Would that be ok ? If yes I'll submit a patch. I'll do the same in another driver
I am working on.

Guenter

2014-01-26 23:16:37

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 10:44:25PM +0100, Jean Delvare wrote:
> On Sun, 26 Jan 2014 21:22:56 +0000, Mark Brown wrote:

> > They only worked with a debug option turned on and generated warnings
> > every time they were used... that kernel config would've been actively
> > broken for devices that wanted to do anything at all interesting with
> > the regulators and would've been prone to issues with init ordering and
> > races in any cases where there are actually regulators.

> No, you don't get what I'm saying. For PC users, the lm90 did not

I understand perfectly well thank you very much.

> request a regulator and things worked because the kernel isn't supposed
> to take care about things like that on PC machines. Now that the lm90
> driver does request a regulator, it fails on PC machines because no
> regulator is declared.

If and only if the user has enabled the regulator API on a platform that
hasn't fully configured it; if the user has not enabled the regulator
API it'll stub itself out and they'll never see it.

> Don't tell me that it is expected that things will fail if
> CONFIG_REGULATOR is enabled on a system which doesn't need it. It
> doesn't make any sense. If kernels would fail as soon as any enabled
> option wasn't actually needed, no system would boot out there.

It's very easy to generate unbootable kernels by changing the kernel
config, I'd not immediately expect a randomly generated config to do
anything useful and things like FW_LOADER_USER_HELPER can be rather
miserable if you turn them on (that one produces enormous delays during
init which look awfully like hangs when you're watching your board
boot).

> If regulator is enabled but not needed, that's something the regulator
> subsystem should figure out at run-time so that it can stub everything
> out and things keep working.

To repeat this is something the platform needs to tell the API; the only
safe thing that API is able to do by itself is hope that something is
going to turn up later on and supply some data. Supplying stubs and
then trying to substitute the real thing in later isn't going to be
terribly clever, worst case we end up in the situation where a driver
talks to a device that has no power or partial power because we've hit
an init ordering race. Requiring platforms to flag in code when they
use regulators isn't something that seems like it's going to be deployed
smoothly and quickly, I'd be concerned that'd cause as much trouble as
it avoids - up until now the kernel config has been sufficient.

I think the 90% fix here (aside from Ubuntu disabling regulators in the
kernel config which would be sane too) is that ACPI based systems ought
to be flagging that they have full constraints as soon as they boot and
decide they have ACPI in the same way that DT ones do, we can be pretty
confident that a system using ACPI will have all supplies taken care of
transparently by the platform. This would deal with the PC situation at
any rate which is probably most of the users who care and aren't already
using DT.

The regulator API is in general conservative about what it does since
incorrect operation of regulators can cause lasting physical damage to
the system, in general doing nothing and generating errors will be
safer and we'd much rather deal with people running into that than with
damaged boards. The general approach is that it only does operations it
was specifically told by some outside source were OK. What it's doing
here is essentially saying that it doesn't know what's going on so it's
punting and hoping something tells it later on.


Attachments:
(No filename) (3.51 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 23:37:58

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 02:02:16PM -0800, Guenter Roeck wrote:
> On 01/26/2014 01:53 PM, Mark Brown wrote:

As previously mentioned please fix your mailer to word wrap at less than
80 columns.

> >Oh, magic. Can you please file a bug with or otherwise talk to them
> >asking them to turn that off (or contribute code to make it work)? Feel
> >free to add me as a CC when you do so.

> That means they would have to stop supporting anything that _does_
> need regulators, which doesn't make much sense to me.

> I think the bug is on your side, not theirs.

Regardless of how we deal with this with the code that Ubuntu actually
have in the kernel releases you highlighted has isn't intended to work
well with what they're doing. Like I said in another mail ACPI telling
the regulator core that everything can be stubbed would be better but
with what's there right now my immediate recommendation would be to turn
off the config option since we know having it on can cause problems. No
matter what we do in mainline it's not going to change already released
kernels and they can do this right now.

I'm open to surprise but I'm relatively confident that there aren't x86
systems running a mainline kernel without out of tree code that would be
affected (which would be required to set up supplies to do anything
useful). Anything using dummy regulators only will get equivalent
behaviour by turning off the API completely.


Attachments:
(No filename) (1.39 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-26 23:51:15

by Mark Brown

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:

> I think I have a better idea: Surround the regulator code, or at least
> its error handling, in the lm90 driver with

> if (IS_ENABLED(CONFIG_OF)) {
> }

> Would that be ok ? If yes I'll submit a patch. I'll do the same in
> another driver I am working on.

That's not going to have the desired effect in cases where DT is built
into the kernel but not in use on the current system (which is a
configuration that gets used) and will remove error handling for non-DT
systems that do have regulators set up. There's not the relationship
between this and DT that you seem think there is...

Besides, if we're going to do a bodge like that we should do it in the
core and not in individual callers.


Attachments:
(No filename) (764.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-27 04:16:37

by Guenter Roeck

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On 01/26/2014 03:51 PM, Mark Brown wrote:
> On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:
>
>> I think I have a better idea: Surround the regulator code, or at least
>> its error handling, in the lm90 driver with
>
>> if (IS_ENABLED(CONFIG_OF)) {
>> }
>
>> Would that be ok ? If yes I'll submit a patch. I'll do the same in
>> another driver I am working on.
>
> That's not going to have the desired effect in cases where DT is built
> into the kernel but not in use on the current system (which is a
> configuration that gets used) and will remove error handling for non-DT
> systems that do have regulators set up. There's not the relationship
> between this and DT that you seem think there is...
>
> Besides, if we're going to do a bodge like that we should do it in the
> core and not in individual callers.
>

Then it appears the only remedy at this time is to revert the patch.

Guenter

2014-01-27 10:10:44

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 12:15 AM, Mark Brown <[email protected]> wrote:
>> No, you don't get what I'm saying. For PC users, the lm90 did not
>
> I understand perfectly well thank you very much.
>
>> request a regulator and things worked because the kernel isn't supposed
>> to take care about things like that on PC machines. Now that the lm90
>> driver does request a regulator, it fails on PC machines because no
>> regulator is declared.
>
> If and only if the user has enabled the regulator API on a platform that
> hasn't fully configured it; if the user has not enabled the regulator
> API it'll stub itself out and they'll never see it.
>
>> Don't tell me that it is expected that things will fail if
>> CONFIG_REGULATOR is enabled on a system which doesn't need it. It
>> doesn't make any sense. If kernels would fail as soon as any enabled
>> option wasn't actually needed, no system would boot out there.
>
> It's very easy to generate unbootable kernels by changing the kernel
> config, I'd not immediately expect a randomly generated config to do
> anything useful and things like FW_LOADER_USER_HELPER can be rather
> miserable if you turn them on (that one produces enormous delays during
> init which look awfully like hangs when you're watching your board
> boot).

This is not just a randomly generated config where you disable critical
options. Just enabling a subsystem like the regulator subsystem shouldn't
give you an unbootable kernel. It's even needed, think multi-platform
kernels.
Typically, I'd expect allmodconfig/allyesconfig kernels to actually boot
(ignoring RAM size issues etc.).

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2014-01-27 10:34:57

by Mark Brown

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Sun, Jan 26, 2014 at 08:16:37PM -0800, Guenter Roeck wrote:
> On 01/26/2014 03:51 PM, Mark Brown wrote:
> >On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:

> >>I think I have a better idea: Surround the regulator code, or at least
> >>its error handling, in the lm90 driver with

> >> if (IS_ENABLED(CONFIG_OF)) {
> >> }

> >>Would that be ok ? If yes I'll submit a patch. I'll do the same in
> >>another driver I am working on.

> >That's not going to have the desired effect in cases where DT is built
> >into the kernel but not in use on the current system (which is a
> >configuration that gets used) and will remove error handling for non-DT
> >systems that do have regulators set up. There's not the relationship
> >between this and DT that you seem think there is...

> >Besides, if we're going to do a bodge like that we should do it in the
> >core and not in individual callers.

> Then it appears the only remedy at this time is to revert the patch.

Why - that seems like a bit of a leap? The first order problem with
what you're doing there is that conditionalising this on DT being built
into the kernel is the wrong conditional for several reasons, if you do
want to use an approach like this that at least needs to be changed.


Attachments:
(No filename) (1.23 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-27 13:19:20

by Mark Brown

[permalink] [raw]
Subject: Re: lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 11:10:41AM +0100, Geert Uytterhoeven wrote:
> On Mon, Jan 27, 2014 at 12:15 AM, Mark Brown <[email protected]> wrote:

> > It's very easy to generate unbootable kernels by changing the kernel
> > config, I'd not immediately expect a randomly generated config to do
> > anything useful and things like FW_LOADER_USER_HELPER can be rather
> > miserable if you turn them on (that one produces enormous delays during
> > init which look awfully like hangs when you're watching your board
> > boot).

> This is not just a randomly generated config where you disable critical
> options. Just enabling a subsystem like the regulator subsystem shouldn't
> give you an unbootable kernel. It's even needed, think multi-platform
> kernels.
> Typically, I'd expect allmodconfig/allyesconfig kernels to actually boot
> (ignoring RAM size issues etc.).

So, not booting is probably not what's going to happen for most systems.
You will in general at least get console output and mostly the realistic
issues for boot failures are netboot in embedded systems. I'm more than
aware of the idea of multiplatform systems, this is one of the reasons
why things like checking for DT being compiled in don't do the right
thing.

Don't get me wrong, I would prefer it if we had a way to transparently
make this stuff just work and we are getting there - that's what's
driving the recent changes with optional regulators for example - but
just ignoring errors causes other problems. For DT and (assuming the
patch I sent last night gets applied) ACPI systems I think we're now
pretty much there which should cover the majority of users. I also
think many architectures (especially older ones that don't see much
active hardware development) could probably do the same thing.

The issues that remain are around platforms which may use regulators but
need to have them come from board data in Linux. Unfortunately the
mechanism we have for that currently only registers the supply mappings
when the regulator is registered which means it's difficult to tell if
the mappings are going to appear later so the core is never terribly
sure if it knows when everything is set up unless the platform tells it.
Knowing that is the key bit of information that allows us to be more
helpful now we've got regulator_get_optional().

Fortunately outside of the ACPI/DT cases the platforms that are affected
are generally embedded ones where the problems are going to be developer
only and so it should be relatively easy to deal with them (for the most
part just adding a regulator_has_full_constraints() call on the affected
system if disabing the regulator API isn't desirable). Realistically it
is unlikely to be useful to enable on affected systems though.

The next step is to transition all the platforms in this class to a way
of setting up mappings that could be done purely in machine init (which
means they're in a similar position to DT and ACPI) but that's more work
than could sensibly be done in a bugfix context which is what people are
looking for here.


Attachments:
(No filename) (2.99 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-27 17:19:28

by Stephen Warren

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On 01/26/2014 04:51 PM, Mark Brown wrote:
> On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:
>
>> I think I have a better idea: Surround the regulator code, or at least
>> its error handling, in the lm90 driver with
>
>> if (IS_ENABLED(CONFIG_OF)) {
>> }
>
>> Would that be ok ? If yes I'll submit a patch. I'll do the same in
>> another driver I am working on.
>
> That's not going to have the desired effect in cases where DT is built
> into the kernel but not in use on the current system (which is a
> configuration that gets used) ...

The solution to that particular aspect of the problem is the following:

if (of_have_populated_dt()) {
...

which is a run-time rather than compile-time test.

2014-01-27 18:50:35

by Guenter Roeck

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 10:19:24AM -0700, Stephen Warren wrote:
> On 01/26/2014 04:51 PM, Mark Brown wrote:
> > On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:
> >
> >> I think I have a better idea: Surround the regulator code, or at least
> >> its error handling, in the lm90 driver with
> >
> >> if (IS_ENABLED(CONFIG_OF)) {
> >> }
> >
> >> Would that be ok ? If yes I'll submit a patch. I'll do the same in
> >> another driver I am working on.
> >
> > That's not going to have the desired effect in cases where DT is built
> > into the kernel but not in use on the current system (which is a
> > configuration that gets used) ...
>
> The solution to that particular aspect of the problem is the following:
>
> if (of_have_populated_dt()) {
> ...
>

Turns out that won't help either after Mark's patches to ACPI and
to the regulator core are applied. Right now I don't have a solution
that would work for all systems.

I'll leave it up to Jean to decide how to proceed.

Guenter

2014-01-27 22:05:12

by Jean Delvare

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, 27 Jan 2014 10:50:31 -0800, Guenter Roeck wrote:
> On Mon, Jan 27, 2014 at 10:19:24AM -0700, Stephen Warren wrote:
> > On 01/26/2014 04:51 PM, Mark Brown wrote:
> > > On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:
> > >
> > >> I think I have a better idea: Surround the regulator code, or at least
> > >> its error handling, in the lm90 driver with
> > >
> > >> if (IS_ENABLED(CONFIG_OF)) {
> > >> }
> > >
> > >> Would that be ok ? If yes I'll submit a patch. I'll do the same in
> > >> another driver I am working on.
> > >
> > > That's not going to have the desired effect in cases where DT is built
> > > into the kernel but not in use on the current system (which is a
> > > configuration that gets used) ...
> >
> > The solution to that particular aspect of the problem is the following:
> >
> > if (of_have_populated_dt()) {
> > ...
> >
>
> Turns out that won't help either after Mark's patches to ACPI and
> to the regulator core are applied. Right now I don't have a solution
> that would work for all systems.
>
> I'll leave it up to Jean to decide how to proceed.

I have no idea, really. I have seen multiple patches flying around,
each seems to have its own merits, but I simply don't know which is
going in the direction. I don't know a thing about regulators, OF, DT
etc. so I am really not the right person to make a decision about this.

All I can say is: either someone comes up with a patch set which
properly fixes the regression for all lm90 drivers users, or I will have
to revert commit 3e0f964f.

--
Jean Delvare

2014-01-27 22:37:04

by Mark Brown

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 10:50:31AM -0800, Guenter Roeck wrote:

> Turns out that won't help either after Mark's patches to ACPI and
> to the regulator core are applied. Right now I don't have a solution
> that would work for all systems.

Could you be more explicit in your logic here please? Given that those
patches should only affect non-DT platforms (modulo the fact that it's
compile time only which I guess you see is an issue) and that your bodge
consists of ignoring all errors for non-DT platforms I honestly can't
think what impact they would have on the bodge so I'm now even more
confused.


Attachments:
(No filename) (603.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-27 23:41:37

by Guenter Roeck

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 11:04:47PM +0100, Jean Delvare wrote:
> On Mon, 27 Jan 2014 10:50:31 -0800, Guenter Roeck wrote:
> > On Mon, Jan 27, 2014 at 10:19:24AM -0700, Stephen Warren wrote:
> > > On 01/26/2014 04:51 PM, Mark Brown wrote:
> > > > On Sun, Jan 26, 2014 at 02:04:06PM -0800, Guenter Roeck wrote:
> > > >
> > > >> I think I have a better idea: Surround the regulator code, or at least
> > > >> its error handling, in the lm90 driver with
> > > >
> > > >> if (IS_ENABLED(CONFIG_OF)) {
> > > >> }
> > > >
> > > >> Would that be ok ? If yes I'll submit a patch. I'll do the same in
> > > >> another driver I am working on.
> > > >
> > > > That's not going to have the desired effect in cases where DT is built
> > > > into the kernel but not in use on the current system (which is a
> > > > configuration that gets used) ...
> > >
> > > The solution to that particular aspect of the problem is the following:
> > >
> > > if (of_have_populated_dt()) {
> > > ...
> > >
> >
> > Turns out that won't help either after Mark's patches to ACPI and
> > to the regulator core are applied. Right now I don't have a solution
> > that would work for all systems.
> >
> > I'll leave it up to Jean to decide how to proceed.
>
> I have no idea, really. I have seen multiple patches flying around,
> each seems to have its own merits, but I simply don't know which is
> going in the direction. I don't know a thing about regulators, OF, DT
> etc. so I am really not the right person to make a decision about this.
>
> All I can say is: either someone comes up with a patch set which
> properly fixes the regression for all lm90 drivers users, or I will have
> to revert commit 3e0f964f.
>
I'll test Mark's two patches on my system and let you know the results.
After looking some more into it, those _may_ actually fix the problem at least
for systems supporting ACPI. No promise, though, and I am not sure if that
would be sufficient (it may still not work on non-ACPI PCs). But then who
am I to say ... Mark keeps repeating that I don't know what I am talking about,
after all, and maybe he has a point ;-).

Guenter

2014-01-27 23:58:31

by Mark Brown

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 03:41:35PM -0800, Guenter Roeck wrote:

> I'll test Mark's two patches on my system and let you know the results.
> After looking some more into it, those _may_ actually fix the problem at least
> for systems supporting ACPI. No promise, though, and I am not sure if that
> would be sufficient (it may still not work on non-ACPI PCs). But then who

They should fix ACPI systems, if they don't we need to fix them further
- that case should just work, it was an oversight not to have a patch
like the ACPI one in already. I'm not sure how many non-ACPI PCs still
exist, we're definitely talking about things over 10 years old (even
before PCs became ACPI only they started to have ACPI present which
should be sufficient).

Jean, I'm doing a writeup but it's probably not going to be done today.


Attachments:
(No filename) (820.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-28 02:33:45

by Guenter Roeck

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On 01/27/2014 03:58 PM, Mark Brown wrote:
> On Mon, Jan 27, 2014 at 03:41:35PM -0800, Guenter Roeck wrote:
>
>> I'll test Mark's two patches on my system and let you know the results.
>> After looking some more into it, those _may_ actually fix the problem at least
>> for systems supporting ACPI. No promise, though, and I am not sure if that
>> would be sufficient (it may still not work on non-ACPI PCs). But then who
>
> They should fix ACPI systems, if they don't we need to fix them further
> - that case should just work, it was an oversight not to have a patch
> like the ACPI one in already. I'm not sure how many non-ACPI PCs still
> exist, we're definitely talking about things over 10 years old (even
> before PCs became ACPI only they started to have ACPI present which
> should be sufficient).
>
> Jean, I'm doing a writeup but it's probably not going to be done today.
>

I confirmed that your two patches ("regulator: core: Correct default
return value for full constraints" and "ACPI: Flag use of ACPI and ACPI
idioms for power supplies to regulator API") together fix the problem
with the lm90 driver, at least for ACPI based systems.

Thanks,
Guenter

2014-01-28 10:25:09

by Mark Brown

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 06:33:44PM -0800, Guenter Roeck wrote:

> I confirmed that your two patches ("regulator: core: Correct default
> return value for full constraints" and "ACPI: Flag use of ACPI and ACPI
> idioms for power supplies to regulator API") together fix the problem
> with the lm90 driver, at least for ACPI based systems.

Great, thanks!


Attachments:
(No filename) (354.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-01-28 12:35:00

by Mark Brown

[permalink] [raw]
Subject: Re: [lm-sensors] lm90 driver no longer working on PCs in 3.13

On Mon, Jan 27, 2014 at 11:04:47PM +0100, Jean Delvare wrote:

> I have no idea, really. I have seen multiple patches flying around,
> each seems to have its own merits, but I simply don't know which is
> going in the direction. I don't know a thing about regulators, OF, DT
> etc. so I am really not the right person to make a decision about this.

So, regulators are in this context about providing power to the device.
Devices tend to work a lot better if they have power, we probably
shouldn't ignore errors in that because that's generally considered bad
form to ignore errors and the consequences of ignoring such errors are
typically bad if they are real. Managing the power explicitly in
drivers both allows greater power saving in systems that are power
sensitive and allows us to ensure that we power on supplies for devices
before we try to use the devices in systems where we have to do that
from Linux. A big part of deploying this in a system is telling the
regulator API which devices are supplied by which regulators.

Having said all that many platforms just don't care about power on this
level (we're talking leakage currents here) and have no runtime power
managment available to the operating system. These platforms probably
shouldn't enable the regulator API at all since it does nothing for them
so if the regulator API is disabled it provides stubs which look like
always on regulators.

If the platform tells the regulator API that it knows everything about
how devices on the platform are powered then the subsystem can safely
assume that any normal power supply that it doesn't explicitly know
about is there outside of Linux's control since otherwise the hardware
would be non-functional. One way that can happen is if the platform is
using DT, that does cover a lot of the platforms that use regulators but
that's not the only way - specifying full constraints does this too.
There was a bug here in the regulator core, I've committed a fix for it
now which will go to Linus soon.

The patch I posted "ACPI: Flag use of ACPI and ACPI idioms for power
supplies to regulator API" (which is applied now) makes systems using
ACPI do the same thing, covering essentially all desktop and server
systems which I believe is going to cover essentially all non-DT users
who might run into this outside of an embedded context. We could also
do the same for some architectures, and many are already covered by
virtue of using DT.

The problem cases that remain are with platforms that don't tell the
regulator core that they've handed over the full list of supply mappings
to the core. Unfortunately the regulator API predates deferred probe so
with platform data it uses a mechanism based on handing over supply
mappings when a regulator is registered which totally fails to work with
deferred probe unless you defer any missing mapping since a regulator
may be registered later providing the supply. It used to work OK prior
to deferred probe but that relied on init order hacks to do so which
aren't sustainable long term. The fix for this is to add a new
registration method that the board can call during early init separately
to registering devices, that hasn't been done yet. I will try to get
it done this release cycle but I wouldn't be comfortable rolling it out
as a bug fix.

> All I can say is: either someone comes up with a patch set which
> properly fixes the regression for all lm90 drivers users, or I will have
> to revert commit 3e0f964f.

If you're going to do something like that it would seem safer to just
ignore error handling in the driver instead. That would at least have a
chance of powering the device on when Linux needs to do so, obviously it
won't relaibly do the right thing in all cases but affected systems
would have had issues with the unmodified driver anyway as the power
wouldn't be enabled.

Personally I think that providing ACPI is updated the problems are
mostly mitigated and are on the same order as something that might
trigger a timeout in the userspace firmware loader; it's not good but
it's fairly easily discoverable and resolvable for the people who run
into it as they are likely to be developers rather than end users and
the number of affected users is likely to be small (you'd have to have
an affected device in a system that hasn't told the regulator core it
has full mappings and also have enabled the regulator API). It
therefore seems reasonable to leave the driver as it is while waiting
for an improved regulator registration method to be merged which I will
try to do as soon as possible.

Does that make sense?


Attachments:
(No filename) (4.50 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments