2019-08-01 00:59:23

by Saravana Kannan

Subject: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

Add device-links to track functional dependencies between devices
after they are created (but before they are probed) by looking at
their common DT bindings like clocks, interconnects, etc.

Having functional dependencies automatically added before the devices
are probed provides the following benefits:

- Optimizes device probe order and avoids the useless work of
attempting probes of devices that will not probe successfully
(because their suppliers aren't present or haven't probed yet).

For example, in a commonly available mobile SoC, registering just
one consumer device's driver at an initcall level earlier than the
supplier device's driver causes 11 failed probe attempts before the
consumer device probes successfully. This was with a kernel with all
the drivers statically compiled in. This problem gets a lot worse if
all the drivers are loaded as modules without direct symbol
dependencies.

- Supplier devices like clock providers, interconnect providers, etc
need to keep the resources they provide active and at a particular
state(s) during boot up even if their current set of consumers don't
request the resource to be active. This is because the rest of the
consumers might not have probed yet and turning off the resource
before all the consumers have probed could lead to a hang or
undesired user experience.

Some frameworks (Eg: regulator) handle this today by turning off
"unused" resources at late_initcall_sync and hoping all the devices
have probed by then. This is not a valid assumption for systems with
loadable modules. Other frameworks (Eg: clock) just don't handle
this due to the lack of a clear signal for when they can turn off
resources. This leads to downstream hacks to handle cases like this
that can easily be solved in the upstream kernel.

By linking devices before they are probed, we give suppliers a clear
count of the number of dependent consumers. Once all of the
consumers are active, the suppliers can turn off the unused
resources without making assumptions about the number of consumers.
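
The consumer counting described above can be sketched in plain userspace
C. This is only an illustrative model, not the code from this series:
struct supplier, link_consumer() and consumer_probed() are hypothetical
names, and the real driver core tracks this through device links and the
sync_state callback rather than simple counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a supplier; not a kernel structure. */
struct supplier {
	int linked_consumers;	/* consumers linked before any probe runs */
	int probed_consumers;	/* consumers that have probed successfully */
	bool boot_state_held;	/* true while boot-time resources must stay on */
};

void supplier_init(struct supplier *s)
{
	s->linked_consumers = 0;
	s->probed_consumers = 0;
	s->boot_state_held = true;
}

/* Called when a consumer is linked to the supplier, before it is probed. */
void link_consumer(struct supplier *s)
{
	s->linked_consumers++;
}

/*
 * Called when one linked consumer has probed successfully.
 * Returns true if this probe was the last one outstanding, i.e. the
 * supplier may now drop resources that no consumer asked for.
 */
bool consumer_probed(struct supplier *s)
{
	assert(s->probed_consumers < s->linked_consumers);
	s->probed_consumers++;

	if (s->boot_state_held &&
	    s->probed_consumers == s->linked_consumers) {
		s->boot_state_held = false;
		return true;
	}
	return false;
}
```

With three linked consumers, only the third call to consumer_probed()
returns true, so the supplier never has to guess at an initcall level
whether more consumers are still to come.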

By default we just add device-links to track "driver presence" (probe
succeeded) of the supplier device. If any other functionality provided
by device-links is needed, it is left to the consumer/supplier
devices to change the link when they probe.
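
To illustrate how a "driver presence" link avoids useless probe
attempts, here is a small userspace simulation. It is a sketch under
simplifying assumptions (one supplier per device, retries in
registration order), and struct device_model and
boot_and_count_failed_probes() are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical model of a device with at most one supplier. */
struct device_model {
	int supplier;	/* index of the supplier device, or -1 if none */
	bool probed;	/* true once the device has probed successfully */
};

/*
 * Retry probing in registration order until every device has probed.
 *
 * use_links == false: probe is called blindly and fails whenever the
 *                     supplier has not probed yet (deferred probe).
 * use_links == true:  the supplier link is checked first and probe is
 *                     not called until the supplier has probed.
 *
 * Returns the number of failed probe calls, or -1 if no progress can
 * be made (for example with a dependency cycle).
 */
int boot_and_count_failed_probes(struct device_model *devs, int n,
				 bool use_links)
{
	int failed = 0;
	int remaining = n;
	int i;

	for (i = 0; i < n; i++)
		devs[i].probed = false;

	while (remaining > 0) {
		int progress = 0;

		for (i = 0; i < n; i++) {
			int sup = devs[i].supplier;

			if (devs[i].probed)
				continue;

			if (sup >= 0 && !devs[sup].probed) {
				/* Without links this is a wasted probe call. */
				if (!use_links)
					failed++;
				continue;
			}

			devs[i].probed = true;
			remaining--;
			progress++;
		}

		if (!progress)
			return -1;
	}

	return failed;
}
```

For a chain where device 0 depends on device 1 and device 1 depends on
device 2, registering the devices in that order costs three failed probe
calls without links and none with links.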

v1 -> v2:
- Drop patch to speed up of_find_device_by_node()
- Drop depends-on property and use existing bindings

v2 -> v3:
- Refactor the code to have driver core initiate the linking of devs
- Have driver core link consumers to supplier before it's probed
- Add support for drivers to edit the device links before probing

v3 -> v4:
- Tested edit_links() on system with cyclic dependency. Works.
- Added some checks to make sure device link isn't attempted from
parent device node to child device node.
- Added way to pause/resume sync_state callbacks across
of_platform_populate().
- Recursively parse DT node to create device links from parent to
suppliers of parent and all child nodes.

v4 -> v5:
- Fixed copy-pasta bugs with linked list handling
- Walk up the phandle reference till I find an actual device (needed
for regulators to work)
- Added support for linking devices from regulator DT bindings
- Tested the whole series again to make sure cyclic dependencies are
broken with edit_links() and regulator links are created properly.

v5 -> v6:
- Split, squashed and reordered some of the patches.
- Refactored the device linking code to follow the same code pattern for
any property.

v6 -> v7:
- No functional changes.
- Renamed i to index
- Added comment to clarify not having to check property name for every
index
- Added "matched" variable to clarify code. No functional change.
- Added comments to include/linux/device.h for add_links()

v7 -> v8:
- Rebased on top of linux-next to handle device link changes in [1]

v8 -> v9:
- Fixed kbuild test bot reported errors (docs and const)

[1] - https://lore.kernel.org/lkml/2305283.AStDPdUUnE@kreacher/

-Saravana


Saravana Kannan (7):
driver core: Add support for linking devices during device addition
driver core: Add edit_links() callback for drivers
of/platform: Add functional dependency link from DT bindings
driver core: Add sync_state driver/bus callback
of/platform: Pause/resume sync state during init and
of_platform_populate()
of/platform: Create device links for all child-supplier depencencies
of/platform: Don't create device links for default busses

.../admin-guide/kernel-parameters.txt | 5 +
drivers/base/core.c | 168 ++++++++++++++++
drivers/base/dd.c | 29 +++
drivers/of/platform.c | 189 ++++++++++++++++++
include/linux/device.h | 60 ++++++
5 files changed, 451 insertions(+)

--
2.22.0.709.g102302147b-goog


2019-08-01 06:13:50

by Greg Kroah-Hartman

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
> Add device-links to track functional dependencies between devices
> after they are created (but before they are probed) by looking at
> their common DT bindings like clocks, interconnects, etc.
>
> Having functional dependencies automatically added before the devices
> are probed, provides the following benefits:
>
> - Optimizes device probe order and avoids the useless work of
> attempting probes of devices that will not probe successfully
> (because their suppliers aren't present or haven't probed yet).
>
> For example, in a commonly available mobile SoC, registering just
> one consumer device's driver at an initcall level earlier than the
> supplier device's driver causes 11 failed probe attempts before the
> consumer device probes successfully. This was with a kernel with all
> the drivers statically compiled in. This problem gets a lot worse if
> all the drivers are loaded as modules without direct symbol
> dependencies.
>
> - Supplier devices like clock providers, interconnect providers, etc
> need to keep the resources they provide active and at a particular
> state(s) during boot up even if their current set of consumers don't
> request the resource to be active. This is because the rest of the
> consumers might not have probed yet and turning off the resource
> before all the consumers have probed could lead to a hang or
> undesired user experience.
>
> Some frameworks (Eg: regulator) handle this today by turning off
> "unused" resources at late_initcall_sync and hoping all the devices
> have probed by then. This is not a valid assumption for systems with
> loadable modules. Other frameworks (Eg: clock) just don't handle
> this due to the lack of a clear signal for when they can turn off
> resources. This leads to downstream hacks to handle cases like this
> that can easily be solved in the upstream kernel.
>
> By linking devices before they are probed, we give suppliers a clear
> count of the number of dependent consumers. Once all of the
> consumers are active, the suppliers can turn off the unused
> resources without making assumptions about the number of consumers.
>
> By default we just add device-links to track "driver presence" (probe
> succeeded) of the supplier device. If any other functionality provided
> by device-links are needed, it is left to the consumer/supplier
> devices to change the link when they probe.

All now queued up in my driver-core-testing branch, and if 0-day is
happy with this, will move it to my "real" driver-core-next branch in a
day or so to get included in linux-next.

thanks for sticking with this!

greg k-h

2019-08-01 19:29:09

by Frank Rowand

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

Hi Greg,

On 7/31/19 11:12 PM, Greg Kroah-Hartman wrote:
> On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
>> Add device-links to track functional dependencies between devices
>> after they are created (but before they are probed) by looking at
>> their common DT bindings like clocks, interconnects, etc.
>>
>> Having functional dependencies automatically added before the devices
>> are probed, provides the following benefits:
>>
>> - Optimizes device probe order and avoids the useless work of
>> attempting probes of devices that will not probe successfully
>> (because their suppliers aren't present or haven't probed yet).
>>
>> For example, in a commonly available mobile SoC, registering just
>> one consumer device's driver at an initcall level earlier than the
>> supplier device's driver causes 11 failed probe attempts before the
>> consumer device probes successfully. This was with a kernel with all
>> the drivers statically compiled in. This problem gets a lot worse if
>> all the drivers are loaded as modules without direct symbol
>> dependencies.
>>
>> - Supplier devices like clock providers, interconnect providers, etc
>> need to keep the resources they provide active and at a particular
>> state(s) during boot up even if their current set of consumers don't
>> request the resource to be active. This is because the rest of the
>> consumers might not have probed yet and turning off the resource
>> before all the consumers have probed could lead to a hang or
>> undesired user experience.
>>
>> Some frameworks (Eg: regulator) handle this today by turning off
>> "unused" resources at late_initcall_sync and hoping all the devices
>> have probed by then. This is not a valid assumption for systems with
>> loadable modules. Other frameworks (Eg: clock) just don't handle
>> this due to the lack of a clear signal for when they can turn off
>> resources. This leads to downstream hacks to handle cases like this
>> that can easily be solved in the upstream kernel.
>>
>> By linking devices before they are probed, we give suppliers a clear
>> count of the number of dependent consumers. Once all of the
>> consumers are active, the suppliers can turn off the unused
>> resources without making assumptions about the number of consumers.
>>
>> By default we just add device-links to track "driver presence" (probe
>> succeeded) of the supplier device. If any other functionality provided
>> by device-links are needed, it is left to the consumer/supplier
>> devices to change the link when they probe.
>
> All now queued up in my driver-core-testing branch, and if 0-day is
> happy with this, will move it to my "real" driver-core-next branch in a
> day or so to get included in linux-next.

I have been slow in getting my review out.

This patch series is not yet ready for sending to Linus, so if putting
this in linux-next implies that it will be in your next pull request
to Linus, please do not put it in linux-next.

Thanks,

Frank

>
> thanks for sticking with this!
>
> greg k-h
>

2019-08-01 19:35:32

by Greg Kroah-Hartman

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Thu, Aug 01, 2019 at 12:28:13PM -0700, Frank Rowand wrote:
> Hi Greg,
>
> On 7/31/19 11:12 PM, Greg Kroah-Hartman wrote:
> > On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
> >> Add device-links to track functional dependencies between devices
> >> after they are created (but before they are probed) by looking at
> >> their common DT bindings like clocks, interconnects, etc.
> >>
> >> Having functional dependencies automatically added before the devices
> >> are probed, provides the following benefits:
> >>
> >> - Optimizes device probe order and avoids the useless work of
> >> attempting probes of devices that will not probe successfully
> >> (because their suppliers aren't present or haven't probed yet).
> >>
> >> For example, in a commonly available mobile SoC, registering just
> >> one consumer device's driver at an initcall level earlier than the
> >> supplier device's driver causes 11 failed probe attempts before the
> >> consumer device probes successfully. This was with a kernel with all
> >> the drivers statically compiled in. This problem gets a lot worse if
> >> all the drivers are loaded as modules without direct symbol
> >> dependencies.
> >>
> >> - Supplier devices like clock providers, interconnect providers, etc
> >> need to keep the resources they provide active and at a particular
> >> state(s) during boot up even if their current set of consumers don't
> >> request the resource to be active. This is because the rest of the
> >> consumers might not have probed yet and turning off the resource
> >> before all the consumers have probed could lead to a hang or
> >> undesired user experience.
> >>
> >> Some frameworks (Eg: regulator) handle this today by turning off
> >> "unused" resources at late_initcall_sync and hoping all the devices
> >> have probed by then. This is not a valid assumption for systems with
> >> loadable modules. Other frameworks (Eg: clock) just don't handle
> >> this due to the lack of a clear signal for when they can turn off
> >> resources. This leads to downstream hacks to handle cases like this
> >> that can easily be solved in the upstream kernel.
> >>
> >> By linking devices before they are probed, we give suppliers a clear
> >> count of the number of dependent consumers. Once all of the
> >> consumers are active, the suppliers can turn off the unused
> >> resources without making assumptions about the number of consumers.
> >>
> >> By default we just add device-links to track "driver presence" (probe
> >> succeeded) of the supplier device. If any other functionality provided
> >> by device-links are needed, it is left to the consumer/supplier
> >> devices to change the link when they probe.
> >
> > All now queued up in my driver-core-testing branch, and if 0-day is
> > happy with this, will move it to my "real" driver-core-next branch in a
> > day or so to get included in linux-next.
>
> I have been slow in getting my review out.
>
> This patch series is not yet ready for sending to Linus, so if putting
> this in linux-next implies that it will be in your next pull request
> to Linus, please do not put it in linux-next.

It means that it will be in my pull request for 5.4-rc1, many many
weeks away from now.

thanks,

greg k-h

2019-08-01 20:02:28

by Frank Rowand

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On 8/1/19 12:32 PM, Greg Kroah-Hartman wrote:
> On Thu, Aug 01, 2019 at 12:28:13PM -0700, Frank Rowand wrote:
>> Hi Greg,
>>
>> On 7/31/19 11:12 PM, Greg Kroah-Hartman wrote:
>>> On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
>>>> Add device-links to track functional dependencies between devices
>>>> after they are created (but before they are probed) by looking at
>>>> their common DT bindings like clocks, interconnects, etc.
>>>>
>>>> Having functional dependencies automatically added before the devices
>>>> are probed, provides the following benefits:
>>>>
>>>> - Optimizes device probe order and avoids the useless work of
>>>> attempting probes of devices that will not probe successfully
>>>> (because their suppliers aren't present or haven't probed yet).
>>>>
>>>> For example, in a commonly available mobile SoC, registering just
>>>> one consumer device's driver at an initcall level earlier than the
>>>> supplier device's driver causes 11 failed probe attempts before the
>>>> consumer device probes successfully. This was with a kernel with all
>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>> all the drivers are loaded as modules without direct symbol
>>>> dependencies.
>>>>
>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>> need to keep the resources they provide active and at a particular
>>>> state(s) during boot up even if their current set of consumers don't
>>>> request the resource to be active. This is because the rest of the
>>>> consumers might not have probed yet and turning off the resource
>>>> before all the consumers have probed could lead to a hang or
>>>> undesired user experience.
>>>>
>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>> have probed by then. This is not a valid assumption for systems with
>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>> this due to the lack of a clear signal for when they can turn off
>>>> resources. This leads to downstream hacks to handle cases like this
>>>> that can easily be solved in the upstream kernel.
>>>>
>>>> By linking devices before they are probed, we give suppliers a clear
>>>> count of the number of dependent consumers. Once all of the
>>>> consumers are active, the suppliers can turn off the unused
>>>> resources without making assumptions about the number of consumers.
>>>>
>>>> By default we just add device-links to track "driver presence" (probe
>>>> succeeded) of the supplier device. If any other functionality provided
>>>> by device-links are needed, it is left to the consumer/supplier
>>>> devices to change the link when they probe.
>>>
>>> All now queued up in my driver-core-testing branch, and if 0-day is
>>> happy with this, will move it to my "real" driver-core-next branch in a
>>> day or so to get included in linux-next.
>>
>> I have been slow in getting my review out.
>>
>> This patch series is not yet ready for sending to Linus, so if putting
>> this in linux-next implies that it will be in your next pull request
>> to Linus, please do not put it in linux-next.
>
> It means that it will be in my pull request for 5.4-rc1, many many
> weeks away from now.

If you are willing to revert the series before the pull request _if_ I
have significant review issues in the next couple of days, then I am happy
to see the patches get exposure in linux-next.

-Frank

>
> thanks,
>
> greg k-h
>

2019-08-02 09:15:05

by Greg Kroah-Hartman

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Thu, Aug 01, 2019 at 12:59:25PM -0700, Frank Rowand wrote:
> On 8/1/19 12:32 PM, Greg Kroah-Hartman wrote:
> > On Thu, Aug 01, 2019 at 12:28:13PM -0700, Frank Rowand wrote:
> >> Hi Greg,
> >>
> >> On 7/31/19 11:12 PM, Greg Kroah-Hartman wrote:
> >>> On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
> >>>> Add device-links to track functional dependencies between devices
> >>>> after they are created (but before they are probed) by looking at
> >>>> their common DT bindings like clocks, interconnects, etc.
> >>>>
> >>>> Having functional dependencies automatically added before the devices
> >>>> are probed, provides the following benefits:
> >>>>
> >>>> - Optimizes device probe order and avoids the useless work of
> >>>> attempting probes of devices that will not probe successfully
> >>>> (because their suppliers aren't present or haven't probed yet).
> >>>>
> >>>> For example, in a commonly available mobile SoC, registering just
> >>>> one consumer device's driver at an initcall level earlier than the
> >>>> supplier device's driver causes 11 failed probe attempts before the
> >>>> consumer device probes successfully. This was with a kernel with all
> >>>> the drivers statically compiled in. This problem gets a lot worse if
> >>>> all the drivers are loaded as modules without direct symbol
> >>>> dependencies.
> >>>>
> >>>> - Supplier devices like clock providers, interconnect providers, etc
> >>>> need to keep the resources they provide active and at a particular
> >>>> state(s) during boot up even if their current set of consumers don't
> >>>> request the resource to be active. This is because the rest of the
> >>>> consumers might not have probed yet and turning off the resource
> >>>> before all the consumers have probed could lead to a hang or
> >>>> undesired user experience.
> >>>>
> >>>> Some frameworks (Eg: regulator) handle this today by turning off
> >>>> "unused" resources at late_initcall_sync and hoping all the devices
> >>>> have probed by then. This is not a valid assumption for systems with
> >>>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>>> this due to the lack of a clear signal for when they can turn off
> >>>> resources. This leads to downstream hacks to handle cases like this
> >>>> that can easily be solved in the upstream kernel.
> >>>>
> >>>> By linking devices before they are probed, we give suppliers a clear
> >>>> count of the number of dependent consumers. Once all of the
> >>>> consumers are active, the suppliers can turn off the unused
> >>>> resources without making assumptions about the number of consumers.
> >>>>
> >>>> By default we just add device-links to track "driver presence" (probe
> >>>> succeeded) of the supplier device. If any other functionality provided
> >>>> by device-links are needed, it is left to the consumer/supplier
> >>>> devices to change the link when they probe.
> >>>
> >>> All now queued up in my driver-core-testing branch, and if 0-day is
> >>> happy with this, will move it to my "real" driver-core-next branch in a
> >>> day or so to get included in linux-next.
> >>
> >> I have been slow in getting my review out.
> >>
> >> This patch series is not yet ready for sending to Linus, so if putting
> >> this in linux-next implies that it will be in your next pull request
> >> to Linus, please do not put it in linux-next.
> >
> > It means that it will be in my pull request for 5.4-rc1, many many
> > > weeks away from now.
>
> If you are willing to revert the series before the pull request _if_ I
> have significant review issues in the next couple of days, then I am happy
> to see the patches get exposure in linux-next.

If you have significant review issues, yes, I will be glad to revert them.

thanks,

greg k-h

2019-08-08 02:17:08

by Frank Rowand

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

Hi Greg, Saravana,

On 8/1/19 11:37 PM, Greg Kroah-Hartman wrote:
> On Thu, Aug 01, 2019 at 12:59:25PM -0700, Frank Rowand wrote:
>> On 8/1/19 12:32 PM, Greg Kroah-Hartman wrote:
>>> On Thu, Aug 01, 2019 at 12:28:13PM -0700, Frank Rowand wrote:
>>>> Hi Greg,
>>>>
>>>> On 7/31/19 11:12 PM, Greg Kroah-Hartman wrote:
>>>>> On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
>>>>>> Add device-links to track functional dependencies between devices
>>>>>> after they are created (but before they are probed) by looking at
>>>>>> their common DT bindings like clocks, interconnects, etc.
>>>>>>
>>>>>> Having functional dependencies automatically added before the devices
>>>>>> are probed, provides the following benefits:
>>>>>>
>>>>>> - Optimizes device probe order and avoids the useless work of
>>>>>> attempting probes of devices that will not probe successfully
>>>>>> (because their suppliers aren't present or haven't probed yet).
>>>>>>
>>>>>> For example, in a commonly available mobile SoC, registering just
>>>>>> one consumer device's driver at an initcall level earlier than the
>>>>>> supplier device's driver causes 11 failed probe attempts before the
>>>>>> consumer device probes successfully. This was with a kernel with all
>>>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>>>> all the drivers are loaded as modules without direct symbol
>>>>>> dependencies.
>>>>>>
>>>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>>>> need to keep the resources they provide active and at a particular
>>>>>> state(s) during boot up even if their current set of consumers don't
>>>>>> request the resource to be active. This is because the rest of the
>>>>>> consumers might not have probed yet and turning off the resource
>>>>>> before all the consumers have probed could lead to a hang or
>>>>>> undesired user experience.
>>>>>>
>>>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>>>> have probed by then. This is not a valid assumption for systems with
>>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>>>> this due to the lack of a clear signal for when they can turn off
>>>>>> resources. This leads to downstream hacks to handle cases like this
>>>>>> that can easily be solved in the upstream kernel.
>>>>>>
>>>>>> By linking devices before they are probed, we give suppliers a clear
>>>>>> count of the number of dependent consumers. Once all of the
>>>>>> consumers are active, the suppliers can turn off the unused
>>>>>> resources without making assumptions about the number of consumers.
>>>>>>
>>>>>> By default we just add device-links to track "driver presence" (probe
>>>>>> succeeded) of the supplier device. If any other functionality provided
>>>>>> by device-links are needed, it is left to the consumer/supplier
>>>>>> devices to change the link when they probe.
>>>>>
>>>>> All now queued up in my driver-core-testing branch, and if 0-day is
>>>>> happy with this, will move it to my "real" driver-core-next branch in a
>>>>> day or so to get included in linux-next.
>>>>
>>>> I have been slow in getting my review out.
>>>>
>>>> This patch series is not yet ready for sending to Linus, so if putting
>>>> this in linux-next implies that it will be in your next pull request
>>>> to Linus, please do not put it in linux-next.
>>>
>>> It means that it will be in my pull request for 5.4-rc1, many many
>>> weeks away from now.
>>
>> If you are willing to revert the series before the pull request _if_ I
>> have significant review issues in the next couple of days, then I am happy
>> to see the patches get exposure in linux-next.
>
> If you have significant review issues, yes, I will be glad to revert them.

Just a heads up that I have sent review issues in reply to version 7 of this
patch series.

We'll see what the responses are to my review comments, but I am expecting
the changes are big enough to result in a new version (or a couple more
versions) of the patch series.

No rush to revert version 9 since your 5.4-rc1 pull request is still not
near, and I am glad for whatever exposure these patches are getting in
linux-next.

Thanks,

Frank

>
> thanks,
>
> greg k-h
>

2019-08-10 03:01:08

by Frank Rowand

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

Hi Saravana,

On 7/31/19 3:17 PM, Saravana Kannan wrote:
> Add device-links to track functional dependencies between devices
> after they are created (but before they are probed) by looking at
> their common DT bindings like clocks, interconnects, etc.
>
> Having functional dependencies automatically added before the devices
> are probed, provides the following benefits:
>
> - Optimizes device probe order and avoids the useless work of
> attempting probes of devices that will not probe successfully
> (because their suppliers aren't present or haven't probed yet).
>
> For example, in a commonly available mobile SoC, registering just
> one consumer device's driver at an initcall level earlier than the
> supplier device's driver causes 11 failed probe attempts before the
> consumer device probes successfully. This was with a kernel with all
> the drivers statically compiled in. This problem gets a lot worse if
> all the drivers are loaded as modules without direct symbol
> dependencies.
>
> - Supplier devices like clock providers, interconnect providers, etc
> need to keep the resources they provide active and at a particular
> state(s) during boot up even if their current set of consumers don't
> request the resource to be active. This is because the rest of the
> consumers might not have probed yet and turning off the resource
> before all the consumers have probed could lead to a hang or
> undesired user experience.
>
> Some frameworks (Eg: regulator) handle this today by turning off
> "unused" resources at late_initcall_sync and hoping all the devices
> have probed by then. This is not a valid assumption for systems with
> loadable modules. Other frameworks (Eg: clock) just don't handle
> this due to the lack of a clear signal for when they can turn off
> resources. This leads to downstream hacks to handle cases like this
> that can easily be solved in the upstream kernel.
>
> By linking devices before they are probed, we give suppliers a clear
> count of the number of dependent consumers. Once all of the
> consumers are active, the suppliers can turn off the unused
> resources without making assumptions about the number of consumers.
>
> By default we just add device-links to track "driver presence" (probe
> succeeded) of the supplier device. If any other functionality provided
> by device-links are needed, it is left to the consumer/supplier
> devices to change the link when they probe.
>
> v1 -> v2:
> - Drop patch to speed up of_find_device_by_node()
> - Drop depends-on property and use existing bindings
>
> v2 -> v3:
> - Refactor the code to have driver core initiate the linking of devs
> - Have driver core link consumers to supplier before it's probed
> - Add support for drivers to edit the device links before probing
>
> v3 -> v4:
> - Tested edit_links() on system with cyclic dependency. Works.
> - Added some checks to make sure device link isn't attempted from
> parent device node to child device node.
> - Added way to pause/resume sync_state callbacks across
> of_platform_populate().
> - Recursively parse DT node to create device links from parent to
> suppliers of parent and all child nodes.
>
> v4 -> v5:
> - Fixed copy-pasta bugs with linked list handling
> - Walk up the phandle reference till I find an actual device (needed
> for regulators to work)
> - Added support for linking devices from regulator DT bindings
> - Tested the whole series again to make sure cyclic dependencies are
> broken with edit_links() and regulator links are created properly.
>
> v5 -> v6:
> - Split, squashed and reordered some of the patches.
> - Refactored the device linking code to follow the same code pattern for
> any property.
>
> v6 -> v7:
> - No functional changes.
> - Renamed i to index
> - Added comment to clarify not having to check property name for every
> index
> - Added "matched" variable to clarify code. No functional change.
> - Added comments to include/linux/device.h for add_links()
>
> v7 -> v8:
> - Rebased on top of linux-next to handle device link changes in [1]
>


> v8 -> v9:
> - Fixed kbuild test bot reported errors (docs and const)

Some maintainers have strong opinions about whether change logs should be:

(1) only in patch 0
(2) only in the specific patches that are changed
(3) both in patch 0 and in the specific patches that are changed.

I can adapt to any of the three styles. But for style "(1)" please
list which specific patch has changed for each item in the change list.

-Frank


>
> [1] - https://lore.kernel.org/lkml/2305283.AStDPdUUnE@kreacher/
>
> -Saravana
>
>
> Saravana Kannan (7):
> driver core: Add support for linking devices during device addition
> driver core: Add edit_links() callback for drivers
> of/platform: Add functional dependency link from DT bindings
> driver core: Add sync_state driver/bus callback
> of/platform: Pause/resume sync state during init and
> of_platform_populate()
> of/platform: Create device links for all child-supplier depencencies
> of/platform: Don't create device links for default busses
>
> .../admin-guide/kernel-parameters.txt | 5 +
> drivers/base/core.c | 168 ++++++++++++++++
> drivers/base/dd.c | 29 +++
> drivers/of/platform.c | 189 ++++++++++++++++++
> include/linux/device.h | 60 ++++++
> 5 files changed, 451 insertions(+)
>

2019-08-10 05:02:22

by Saravana Kannan

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
>
> Hi Saravana,
>
> On 7/31/19 3:17 PM, Saravana Kannan wrote:
> > Add device-links to track functional dependencies between devices
> > after they are created (but before they are probed) by looking at
> > their common DT bindings like clocks, interconnects, etc.
> >
> > Having functional dependencies automatically added before the devices
> > are probed, provides the following benefits:
> >
> > - Optimizes device probe order and avoids the useless work of
> > attempting probes of devices that will not probe successfully
> > (because their suppliers aren't present or haven't probed yet).
> >
> > For example, in a commonly available mobile SoC, registering just
> > one consumer device's driver at an initcall level earlier than the
> > supplier device's driver causes 11 failed probe attempts before the
> > consumer device probes successfully. This was with a kernel with all
> > the drivers statically compiled in. This problem gets a lot worse if
> > all the drivers are loaded as modules without direct symbol
> > dependencies.
> >
> > - Supplier devices like clock providers, interconnect providers, etc
> > need to keep the resources they provide active and at a particular
> > state(s) during boot up even if their current set of consumers don't
> > request the resource to be active. This is because the rest of the
> > consumers might not have probed yet and turning off the resource
> > before all the consumers have probed could lead to a hang or
> > undesired user experience.
> >
> > Some frameworks (Eg: regulator) handle this today by turning off
> > "unused" resources at late_initcall_sync and hoping all the devices
> > have probed by then. This is not a valid assumption for systems with
> > loadable modules. Other frameworks (Eg: clock) just don't handle
> > this due to the lack of a clear signal for when they can turn off
> > resources. This leads to downstream hacks to handle cases like this
> > that can easily be solved in the upstream kernel.
> >
> > By linking devices before they are probed, we give suppliers a clear
> > count of the number of dependent consumers. Once all of the
> > consumers are active, the suppliers can turn off the unused
> > resources without making assumptions about the number of consumers.
> >
> > By default we just add device-links to track "driver presence" (probe
> > succeeded) of the supplier device. If any other functionality provided
> > by device-links are needed, it is left to the consumer/supplier
> > devices to change the link when they probe.
> >
> > v1 -> v2:
> > - Drop patch to speed up of_find_device_by_node()
> > - Drop depends-on property and use existing bindings
> >
> > v2 -> v3:
> > - Refactor the code to have driver core initiate the linking of devs
> > - Have driver core link consumers to supplier before it's probed
> > - Add support for drivers to edit the device links before probing
> >
> > v3 -> v4:
> > - Tested edit_links() on system with cyclic dependency. Works.
> > - Added some checks to make sure device link isn't attempted from
> > parent device node to child device node.
> > - Added way to pause/resume sync_state callbacks across
> > of_platform_populate().
> > - Recursively parse DT node to create device links from parent to
> > suppliers of parent and all child nodes.
> >
> > v4 -> v5:
> > - Fixed copy-pasta bugs with linked list handling
> > - Walk up the phandle reference till I find an actual device (needed
> > for regulators to work)
> > - Added support for linking devices from regulator DT bindings
> > - Tested the whole series again to make sure cyclic dependencies are
> > broken with edit_links() and regulator links are created properly.
> >
> > v5 -> v6:
> > - Split, squashed and reordered some of the patches.
> > - Refactored the device linking code to follow the same code pattern for
> > any property.
> >
> > v6 -> v7:
> > - No functional changes.
> > - Renamed i to index
> > - Added comment to clarify not having to check property name for every
> > index
> > - Added "matched" variable to clarify code. No functional change.
> > - Added comments to include/linux/device.h for add_links()
> >
> > v7 -> v8:
> > - Rebased on top of linux-next to handle device link changes in [1]
> >
>
>
> > v8 -> v9:
> > - Fixed kbuild test bot reported errors (docs and const)
>
> Some maintainers have strong opinions about whether change logs should be:
>
> (1) only in patch 0
> (2) only in the specific patches that are changed
> (3) both in patch 0 and in the specific patches that are changed.
>
> I can adapt to any of the three styles. But for style "(1)" please
> list which specific patch has changed for each item in the change list.
>

Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
with (1) for this series. Didn't realize there were options (2) and
(3). Since you started reviewing from v7, I'll do that in the future
updates? Also, I haven't forgotten your emails. Just tied up with
something else for a few days. I'll get to your emails next week.

Thanks,
Saravana

2019-08-10 05:21:42

by Frank Rowand

[permalink] [raw]
Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On 8/9/19 10:00 PM, Saravana Kannan wrote:
> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
>>
>> Hi Saravana,
>>
>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
>>> Add device-links to track functional dependencies between devices
>>> after they are created (but before they are probed) by looking at
>>> their common DT bindings like clocks, interconnects, etc.
>>>
>>> Having functional dependencies automatically added before the devices
>>> are probed, provides the following benefits:
>>>
>>> - Optimizes device probe order and avoids the useless work of
>>> attempting probes of devices that will not probe successfully
>>> (because their suppliers aren't present or haven't probed yet).
>>>
>>> For example, in a commonly available mobile SoC, registering just
>>> one consumer device's driver at an initcall level earlier than the
>>> supplier device's driver causes 11 failed probe attempts before the
>>> consumer device probes successfully. This was with a kernel with all
>>> the drivers statically compiled in. This problem gets a lot worse if
>>> all the drivers are loaded as modules without direct symbol
>>> dependencies.
>>>
>>> - Supplier devices like clock providers, interconnect providers, etc
>>> need to keep the resources they provide active and at a particular
>>> state(s) during boot up even if their current set of consumers don't
>>> request the resource to be active. This is because the rest of the
>>> consumers might not have probed yet and turning off the resource
>>> before all the consumers have probed could lead to a hang or
>>> undesired user experience.
>>>
>>> Some frameworks (Eg: regulator) handle this today by turning off
>>> "unused" resources at late_initcall_sync and hoping all the devices
>>> have probed by then. This is not a valid assumption for systems with
>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>> this due to the lack of a clear signal for when they can turn off
>>> resources. This leads to downstream hacks to handle cases like this
>>> that can easily be solved in the upstream kernel.
>>>
>>> By linking devices before they are probed, we give suppliers a clear
>>> count of the number of dependent consumers. Once all of the
>>> consumers are active, the suppliers can turn off the unused
>>> resources without making assumptions about the number of consumers.
>>>
>>> By default we just add device-links to track "driver presence" (probe
>>> succeeded) of the supplier device. If any other functionality provided
>>> by device-links are needed, it is left to the consumer/supplier
>>> devices to change the link when they probe.
>>>
>>> v1 -> v2:
>>> - Drop patch to speed up of_find_device_by_node()
>>> - Drop depends-on property and use existing bindings
>>>
>>> v2 -> v3:
>>> - Refactor the code to have driver core initiate the linking of devs
>>> - Have driver core link consumers to supplier before it's probed
>>> - Add support for drivers to edit the device links before probing
>>>
>>> v3 -> v4:
>>> - Tested edit_links() on system with cyclic dependency. Works.
>>> - Added some checks to make sure device link isn't attempted from
>>> parent device node to child device node.
>>> - Added way to pause/resume sync_state callbacks across
>>> of_platform_populate().
>>> - Recursively parse DT node to create device links from parent to
>>> suppliers of parent and all child nodes.
>>>
>>> v4 -> v5:
>>> - Fixed copy-pasta bugs with linked list handling
>>> - Walk up the phandle reference till I find an actual device (needed
>>> for regulators to work)
>>> - Added support for linking devices from regulator DT bindings
>>> - Tested the whole series again to make sure cyclic dependencies are
>>> broken with edit_links() and regulator links are created properly.
>>>
>>> v5 -> v6:
>>> - Split, squashed and reordered some of the patches.
>>> - Refactored the device linking code to follow the same code pattern for
>>> any property.
>>>
>>> v6 -> v7:
>>> - No functional changes.
>>> - Renamed i to index
>>> - Added comment to clarify not having to check property name for every
>>> index
>>> - Added "matched" variable to clarify code. No functional change.
>>> - Added comments to include/linux/device.h for add_links()
>>>
>>> v7 -> v8:
>>> - Rebased on top of linux-next to handle device link changes in [1]
>>>
>>
>>
>>> v8 -> v9:
>>> - Fixed kbuild test bot reported errors (docs and const)
>>
>> Some maintainers have strong opinions about whether change logs should be:
>>
>> (1) only in patch 0
>> (2) only in the specific patches that are changed
>> (3) both in patch 0 and in the specific patches that are changed.
>>
>> I can adapt to any of the three styles. But for style "(1)" please
>> list which specific patch has changed for each item in the change list.
>>
>
> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
> with (1) for this series. Didn't realize there were options (2) and
> (3). Since you started reviewing from v7, I'll do that in the future
> updates? Also, I haven't forgotten your emails. Just tied up with
> something else for a few days. I'll get to your emails next week.

Yes, starting with future updates is fine, no need to redo the v9
change logs.

No problem on the timing. I figured you were busy or away from the
internet.

-Frank

>
> Thanks,
> Saravana
>

2019-08-16 01:52:00

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
>
> On 8/9/19 10:00 PM, Saravana Kannan wrote:
> > On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
> >>
> >> Hi Saravana,
> >>
> >> On 7/31/19 3:17 PM, Saravana Kannan wrote:
> >>> Add device-links to track functional dependencies between devices
> >>> after they are created (but before they are probed) by looking at
> >>> their common DT bindings like clocks, interconnects, etc.
> >>>
> >>> Having functional dependencies automatically added before the devices
> >>> are probed, provides the following benefits:
> >>>
> >>> - Optimizes device probe order and avoids the useless work of
> >>> attempting probes of devices that will not probe successfully
> >>> (because their suppliers aren't present or haven't probed yet).
> >>>
> >>> For example, in a commonly available mobile SoC, registering just
> >>> one consumer device's driver at an initcall level earlier than the
> >>> supplier device's driver causes 11 failed probe attempts before the
> >>> consumer device probes successfully. This was with a kernel with all
> >>> the drivers statically compiled in. This problem gets a lot worse if
> >>> all the drivers are loaded as modules without direct symbol
> >>> dependencies.
> >>>
> >>> - Supplier devices like clock providers, interconnect providers, etc
> >>> need to keep the resources they provide active and at a particular
> >>> state(s) during boot up even if their current set of consumers don't
> >>> request the resource to be active. This is because the rest of the
> >>> consumers might not have probed yet and turning off the resource
> >>> before all the consumers have probed could lead to a hang or
> >>> undesired user experience.
> >>>
> >>> Some frameworks (Eg: regulator) handle this today by turning off
> >>> "unused" resources at late_initcall_sync and hoping all the devices
> >>> have probed by then. This is not a valid assumption for systems with
> >>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>> this due to the lack of a clear signal for when they can turn off
> >>> resources. This leads to downstream hacks to handle cases like this
> >>> that can easily be solved in the upstream kernel.
> >>>
> >>> By linking devices before they are probed, we give suppliers a clear
> >>> count of the number of dependent consumers. Once all of the
> >>> consumers are active, the suppliers can turn off the unused
> >>> resources without making assumptions about the number of consumers.
> >>>
> >>> By default we just add device-links to track "driver presence" (probe
> >>> succeeded) of the supplier device. If any other functionality provided
> >>> by device-links are needed, it is left to the consumer/supplier
> >>> devices to change the link when they probe.
> >>>
> >>> v1 -> v2:
> >>> - Drop patch to speed up of_find_device_by_node()
> >>> - Drop depends-on property and use existing bindings
> >>>
> >>> v2 -> v3:
> >>> - Refactor the code to have driver core initiate the linking of devs
> >>> - Have driver core link consumers to supplier before it's probed
> >>> - Add support for drivers to edit the device links before probing
> >>>
> >>> v3 -> v4:
> >>> - Tested edit_links() on system with cyclic dependency. Works.
> >>> - Added some checks to make sure device link isn't attempted from
> >>> parent device node to child device node.
> >>> - Added way to pause/resume sync_state callbacks across
> >>> of_platform_populate().
> >>> - Recursively parse DT node to create device links from parent to
> >>> suppliers of parent and all child nodes.
> >>>
> >>> v4 -> v5:
> >>> - Fixed copy-pasta bugs with linked list handling
> >>> - Walk up the phandle reference till I find an actual device (needed
> >>> for regulators to work)
> >>> - Added support for linking devices from regulator DT bindings
> >>> - Tested the whole series again to make sure cyclic dependencies are
> >>> broken with edit_links() and regulator links are created properly.
> >>>
> >>> v5 -> v6:
> >>> - Split, squashed and reordered some of the patches.
> >>> - Refactored the device linking code to follow the same code pattern for
> >>> any property.
> >>>
> >>> v6 -> v7:
> >>> - No functional changes.
> >>> - Renamed i to index
> >>> - Added comment to clarify not having to check property name for every
> >>> index
> >>> - Added "matched" variable to clarify code. No functional change.
> >>> - Added comments to include/linux/device.h for add_links()
> >>>
> >>> v7 -> v8:
> >>> - Rebased on top of linux-next to handle device link changes in [1]
> >>>
> >>
> >>
> >>> v8 -> v9:
> >>> - Fixed kbuild test bot reported errors (docs and const)
> >>
> >> Some maintainers have strong opinions about whether change logs should be:
> >>
> >> (1) only in patch 0
> >> (2) only in the specific patches that are changed
> >> (3) both in patch 0 and in the specific patches that are changed.
> >>
> >> I can adapt to any of the three styles. But for style "(1)" please
> >> list which specific patch has changed for each item in the change list.
> >>
> >
> > Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
> > with (1) for this series. Didn't realize there were options (2) and
> > (3). Since you started reviewing from v7, I'll do that in the future
> > updates? Also, I haven't forgotten your emails. Just tied up with
> > something else for a few days. I'll get to your emails next week.
>
> Yes, starting with future updates is fine, no need to redo the v9
> change logs.
>
> No problem on the timing. I figured you were busy or away from the
> internet.

I'm replying to your comments on the other 3 patches. Okay with a
majority of them. I'll wait for your reply to see where we settle for
some of the points before I send out any patches though.

For now I'm thinking of sending them as separate clean up patches so
that Greg doesn't have to deal with reverts in his "next" branch. We
can squash them later if we really need to rip out what's in there and
push it again.

-Saravana

2019-08-16 03:10:13

by Frank Rowand

[permalink] [raw]
Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

Hi Saravana,

On 8/15/19 6:50 PM, Saravana Kannan wrote:
> On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
>>
>> On 8/9/19 10:00 PM, Saravana Kannan wrote:
>>> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
>>>>
>>>> Hi Saravana,
>>>>
>>>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
>>>>> Add device-links to track functional dependencies between devices
>>>>> after they are created (but before they are probed) by looking at
>>>>> their common DT bindings like clocks, interconnects, etc.
>>>>>
>>>>> Having functional dependencies automatically added before the devices
>>>>> are probed, provides the following benefits:
>>>>>
>>>>> - Optimizes device probe order and avoids the useless work of
>>>>> attempting probes of devices that will not probe successfully
>>>>> (because their suppliers aren't present or haven't probed yet).
>>>>>
>>>>> For example, in a commonly available mobile SoC, registering just
>>>>> one consumer device's driver at an initcall level earlier than the
>>>>> supplier device's driver causes 11 failed probe attempts before the
>>>>> consumer device probes successfully. This was with a kernel with all
>>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>>> all the drivers are loaded as modules without direct symbol
>>>>> dependencies.
>>>>>
>>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>>> need to keep the resources they provide active and at a particular
>>>>> state(s) during boot up even if their current set of consumers don't
>>>>> request the resource to be active. This is because the rest of the
>>>>> consumers might not have probed yet and turning off the resource
>>>>> before all the consumers have probed could lead to a hang or
>>>>> undesired user experience.
>>>>>
>>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>>> have probed by then. This is not a valid assumption for systems with
>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>>> this due to the lack of a clear signal for when they can turn off
>>>>> resources. This leads to downstream hacks to handle cases like this
>>>>> that can easily be solved in the upstream kernel.
>>>>>
>>>>> By linking devices before they are probed, we give suppliers a clear
>>>>> count of the number of dependent consumers. Once all of the
>>>>> consumers are active, the suppliers can turn off the unused
>>>>> resources without making assumptions about the number of consumers.
>>>>>
>>>>> By default we just add device-links to track "driver presence" (probe
>>>>> succeeded) of the supplier device. If any other functionality provided
>>>>> by device-links are needed, it is left to the consumer/supplier
>>>>> devices to change the link when they probe.
>>>>>
>>>>> v1 -> v2:
>>>>> - Drop patch to speed up of_find_device_by_node()
>>>>> - Drop depends-on property and use existing bindings
>>>>>
>>>>> v2 -> v3:
>>>>> - Refactor the code to have driver core initiate the linking of devs
>>>>> - Have driver core link consumers to supplier before it's probed
>>>>> - Add support for drivers to edit the device links before probing
>>>>>
>>>>> v3 -> v4:
>>>>> - Tested edit_links() on system with cyclic dependency. Works.
>>>>> - Added some checks to make sure device link isn't attempted from
>>>>> parent device node to child device node.
>>>>> - Added way to pause/resume sync_state callbacks across
>>>>> of_platform_populate().
>>>>> - Recursively parse DT node to create device links from parent to
>>>>> suppliers of parent and all child nodes.
>>>>>
>>>>> v4 -> v5:
>>>>> - Fixed copy-pasta bugs with linked list handling
>>>>> - Walk up the phandle reference till I find an actual device (needed
>>>>> for regulators to work)
>>>>> - Added support for linking devices from regulator DT bindings
>>>>> - Tested the whole series again to make sure cyclic dependencies are
>>>>> broken with edit_links() and regulator links are created properly.
>>>>>
>>>>> v5 -> v6:
>>>>> - Split, squashed and reordered some of the patches.
>>>>> - Refactored the device linking code to follow the same code pattern for
>>>>> any property.
>>>>>
>>>>> v6 -> v7:
>>>>> - No functional changes.
>>>>> - Renamed i to index
>>>>> - Added comment to clarify not having to check property name for every
>>>>> index
>>>>> - Added "matched" variable to clarify code. No functional change.
>>>>> - Added comments to include/linux/device.h for add_links()
>>>>>
>>>>> v7 -> v8:
>>>>> - Rebased on top of linux-next to handle device link changes in [1]
>>>>>
>>>>
>>>>
>>>>> v8 -> v9:
>>>>> - Fixed kbuild test bot reported errors (docs and const)
>>>>
>>>> Some maintainers have strong opinions about whether change logs should be:
>>>>
>>>> (1) only in patch 0
>>>> (2) only in the specific patches that are changed
>>>> (3) both in patch 0 and in the specific patches that are changed.
>>>>
>>>> I can adapt to any of the three styles. But for style "(1)" please
>>>> list which specific patch has changed for each item in the change list.
>>>>
>>>
>>> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
>>> with (1) for this series. Didn't realize there were options (2) and
>>> (3). Since you started reviewing from v7, I'll do that in the future
>>> updates? Also, I haven't forgotten your emails. Just tied up with
>>> something else for a few days. I'll get to your emails next week.
>>
>> Yes, starting with future updates is fine, no need to redo the v9
>> change logs.
>>
>> No problem on the timing. I figured you were busy or away from the
>> internet.
>
> I'm replying to your comments on the other 3 patches. Okay with a
> majority of them. I'll wait for your reply to see where we settle for
> some of the points before I send out any patches though.
>
> For now I'm thinking of sending them as separate clean up patches so
> that Greg doesn't have to deal with reverts in his "next" branch. We
> can squash them later if we really need to rip out what's in there and
> push it again.
>
> -Saravana
>

Please do not do separate clean up patches. The series that Greg has is
not ready for acceptance and I am going to ask him to revert it as we
work through the needed changes.

I suspect there will be at least two more versions of the series. The
first is to get the patches I commented in good shape. Then I will
look at the patches later in the series to see how they fit into the
big picture.

In the end, there should be one coherent patch series that implements
the feature.

-Frank

2019-08-16 09:12:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Thu, Aug 15, 2019 at 08:09:19PM -0700, Frank Rowand wrote:
> Hi Saravana,
>
> On 8/15/19 6:50 PM, Saravana Kannan wrote:
> > On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
> >>
> >> On 8/9/19 10:00 PM, Saravana Kannan wrote:
> >>> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
> >>>>
> >>>> Hi Saravana,
> >>>>
> >>>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
> >>>>> Add device-links to track functional dependencies between devices
> >>>>> after they are created (but before they are probed) by looking at
> >>>>> their common DT bindings like clocks, interconnects, etc.
> >>>>>
> >>>>> Having functional dependencies automatically added before the devices
> >>>>> are probed, provides the following benefits:
> >>>>>
> >>>>> - Optimizes device probe order and avoids the useless work of
> >>>>> attempting probes of devices that will not probe successfully
> >>>>> (because their suppliers aren't present or haven't probed yet).
> >>>>>
> >>>>> For example, in a commonly available mobile SoC, registering just
> >>>>> one consumer device's driver at an initcall level earlier than the
> >>>>> supplier device's driver causes 11 failed probe attempts before the
> >>>>> consumer device probes successfully. This was with a kernel with all
> >>>>> the drivers statically compiled in. This problem gets a lot worse if
> >>>>> all the drivers are loaded as modules without direct symbol
> >>>>> dependencies.
> >>>>>
> >>>>> - Supplier devices like clock providers, interconnect providers, etc
> >>>>> need to keep the resources they provide active and at a particular
> >>>>> state(s) during boot up even if their current set of consumers don't
> >>>>> request the resource to be active. This is because the rest of the
> >>>>> consumers might not have probed yet and turning off the resource
> >>>>> before all the consumers have probed could lead to a hang or
> >>>>> undesired user experience.
> >>>>>
> >>>>> Some frameworks (Eg: regulator) handle this today by turning off
> >>>>> "unused" resources at late_initcall_sync and hoping all the devices
> >>>>> have probed by then. This is not a valid assumption for systems with
> >>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>>>> this due to the lack of a clear signal for when they can turn off
> >>>>> resources. This leads to downstream hacks to handle cases like this
> >>>>> that can easily be solved in the upstream kernel.
> >>>>>
> >>>>> By linking devices before they are probed, we give suppliers a clear
> >>>>> count of the number of dependent consumers. Once all of the
> >>>>> consumers are active, the suppliers can turn off the unused
> >>>>> resources without making assumptions about the number of consumers.
> >>>>>
> >>>>> By default we just add device-links to track "driver presence" (probe
> >>>>> succeeded) of the supplier device. If any other functionality provided
> >>>>> by device-links are needed, it is left to the consumer/supplier
> >>>>> devices to change the link when they probe.
> >>>>>
> >>>>> v1 -> v2:
> >>>>> - Drop patch to speed up of_find_device_by_node()
> >>>>> - Drop depends-on property and use existing bindings
> >>>>>
> >>>>> v2 -> v3:
> >>>>> - Refactor the code to have driver core initiate the linking of devs
> >>>>> - Have driver core link consumers to supplier before it's probed
> >>>>> - Add support for drivers to edit the device links before probing
> >>>>>
> >>>>> v3 -> v4:
> >>>>> - Tested edit_links() on system with cyclic dependency. Works.
> >>>>> - Added some checks to make sure device link isn't attempted from
> >>>>> parent device node to child device node.
> >>>>> - Added way to pause/resume sync_state callbacks across
> >>>>> of_platform_populate().
> >>>>> - Recursively parse DT node to create device links from parent to
> >>>>> suppliers of parent and all child nodes.
> >>>>>
> >>>>> v4 -> v5:
> >>>>> - Fixed copy-pasta bugs with linked list handling
> >>>>> - Walk up the phandle reference till I find an actual device (needed
> >>>>> for regulators to work)
> >>>>> - Added support for linking devices from regulator DT bindings
> >>>>> - Tested the whole series again to make sure cyclic dependencies are
> >>>>> broken with edit_links() and regulator links are created properly.
> >>>>>
> >>>>> v5 -> v6:
> >>>>> - Split, squashed and reordered some of the patches.
> >>>>> - Refactored the device linking code to follow the same code pattern for
> >>>>> any property.
> >>>>>
> >>>>> v6 -> v7:
> >>>>> - No functional changes.
> >>>>> - Renamed i to index
> >>>>> - Added comment to clarify not having to check property name for every
> >>>>> index
> >>>>> - Added "matched" variable to clarify code. No functional change.
> >>>>> - Added comments to include/linux/device.h for add_links()
> >>>>>
> >>>>> v7 -> v8:
> >>>>> - Rebased on top of linux-next to handle device link changes in [1]
> >>>>>
> >>>>
> >>>>
> >>>>> v8 -> v9:
> >>>>> - Fixed kbuild test bot reported errors (docs and const)
> >>>>
> >>>> Some maintainers have strong opinions about whether change logs should be:
> >>>>
> >>>> (1) only in patch 0
> >>>> (2) only in the specific patches that are changed
> >>>> (3) both in patch 0 and in the specific patches that are changed.
> >>>>
> >>>> I can adapt to any of the three styles. But for style "(1)" please
> >>>> list which specific patch has changed for each item in the change list.
> >>>>
> >>>
> >>> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
> >>> with (1) for this series. Didn't realize there were options (2) and
> >>> (3). Since you started reviewing from v7, I'll do that in the future
> >>> updates? Also, I haven't forgotten your emails. Just tied up with
> >>> something else for a few days. I'll get to your emails next week.
> >>
> >> Yes, starting with future updates is fine, no need to redo the v9
> >> change logs.
> >>
> >> No problem on the timing. I figured you were busy or away from the
> >> internet.
> >
> > I'm replying to your comments on the other 3 patches. Okay with a
> > majority of them. I'll wait for your reply to see where we settle for
> > some of the points before I send out any patches though.
> >
> > For now I'm thinking of sending them as separate clean up patches so
> > that Greg doesn't have to deal with reverts in his "next" branch. We
> > can squash them later if we really need to rip out what's in there and
> > push it again.
> >
> > -Saravana
> >
>
> Please do not do separate clean up patches. The series that Greg has is
> not ready for acceptance and I am going to ask him to revert it as we
> work through the needed changes.
>
> I suspect there will be at least two more versions of the series. The
> first is to get the patches I commented in good shape. Then I will
> look at the patches later in the series to see how they fit into the
> big picture.
>
> In the end, there should be one coherent patch series that implements
> the feature.

Incremental patches to fix up the comments and documentation are fine, no
need to respin the whole mess.

thanks,

greg k-h

2019-08-16 14:06:16

by Frank Rowand

[permalink] [raw]
Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

Hi Greg,

On 8/16/19 2:10 AM, Greg Kroah-Hartman wrote:
> On Thu, Aug 15, 2019 at 08:09:19PM -0700, Frank Rowand wrote:
>> Hi Saravana,
>>
>> On 8/15/19 6:50 PM, Saravana Kannan wrote:
>>> On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
>>>>
>>>> On 8/9/19 10:00 PM, Saravana Kannan wrote:
>>>>> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
>>>>>>
>>>>>> Hi Saravana,
>>>>>>
>>>>>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
>>>>>>> Add device-links to track functional dependencies between devices
>>>>>>> after they are created (but before they are probed) by looking at
>>>>>>> their common DT bindings like clocks, interconnects, etc.
>>>>>>>
>>>>>>> Having functional dependencies automatically added before the devices
>>>>>>> are probed, provides the following benefits:
>>>>>>>
>>>>>>> - Optimizes device probe order and avoids the useless work of
>>>>>>> attempting probes of devices that will not probe successfully
>>>>>>> (because their suppliers aren't present or haven't probed yet).
>>>>>>>
>>>>>>> For example, in a commonly available mobile SoC, registering just
>>>>>>> one consumer device's driver at an initcall level earlier than the
>>>>>>> supplier device's driver causes 11 failed probe attempts before the
>>>>>>> consumer device probes successfully. This was with a kernel with all
>>>>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>>>>> all the drivers are loaded as modules without direct symbol
>>>>>>> dependencies.
>>>>>>>
>>>>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>>>>> need to keep the resources they provide active and at a particular
>>>>>>> state(s) during boot up even if their current set of consumers don't
>>>>>>> request the resource to be active. This is because the rest of the
>>>>>>> consumers might not have probed yet and turning off the resource
>>>>>>> before all the consumers have probed could lead to a hang or
>>>>>>> undesired user experience.
>>>>>>>
>>>>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>>>>> have probed by then. This is not a valid assumption for systems with
>>>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>>>>> this due to the lack of a clear signal for when they can turn off
>>>>>>> resources. This leads to downstream hacks to handle cases like this
>>>>>>> that can easily be solved in the upstream kernel.
>>>>>>>
>>>>>>> By linking devices before they are probed, we give suppliers a clear
>>>>>>> count of the number of dependent consumers. Once all of the
>>>>>>> consumers are active, the suppliers can turn off the unused
>>>>>>> resources without making assumptions about the number of consumers.
>>>>>>>
>>>>>>> By default we just add device-links to track "driver presence" (probe
>>>>>>> succeeded) of the supplier device. If any other functionality provided
>>>>>>> by device-links are needed, it is left to the consumer/supplier
>>>>>>> devices to change the link when they probe.
>>>>>>>
>>>>>>> v1 -> v2:
>>>>>>> - Drop patch to speed up of_find_device_by_node()
>>>>>>> - Drop depends-on property and use existing bindings
>>>>>>>
>>>>>>> v2 -> v3:
>>>>>>> - Refactor the code to have driver core initiate the linking of devs
>>>>>>> - Have driver core link consumers to supplier before it's probed
>>>>>>> - Add support for drivers to edit the device links before probing
>>>>>>>
>>>>>>> v3 -> v4:
>>>>>>> - Tested edit_links() on system with cyclic dependency. Works.
>>>>>>> - Added some checks to make sure device link isn't attempted from
>>>>>>> parent device node to child device node.
>>>>>>> - Added way to pause/resume sync_state callbacks across
>>>>>>> of_platform_populate().
>>>>>>> - Recursively parse DT node to create device links from parent to
>>>>>>> suppliers of parent and all child nodes.
>>>>>>>
>>>>>>> v4 -> v5:
>>>>>>> - Fixed copy-pasta bugs with linked list handling
>>>>>>> - Walk up the phandle reference till I find an actual device (needed
>>>>>>> for regulators to work)
>>>>>>> - Added support for linking devices from regulator DT bindings
>>>>>>> - Tested the whole series again to make sure cyclic dependencies are
>>>>>>> broken with edit_links() and regulator links are created properly.
>>>>>>>
>>>>>>> v5 -> v6:
>>>>>>> - Split, squashed and reordered some of the patches.
>>>>>>> - Refactored the device linking code to follow the same code pattern for
>>>>>>> any property.
>>>>>>>
>>>>>>> v6 -> v7:
>>>>>>> - No functional changes.
>>>>>>> - Renamed i to index
>>>>>>> - Added comment to clarify not having to check property name for every
>>>>>>> index
>>>>>>> - Added "matched" variable to clarify code. No functional change.
>>>>>>> - Added comments to include/linux/device.h for add_links()
>>>>>>>
>>>>>>> v7 -> v8:
>>>>>>> - Rebased on top of linux-next to handle device link changes in [1]
>>>>>>>
>>>>>>
>>>>>>
>>>>>>> v8 -> v9:
>>>>>>> - Fixed kbuild test bot reported errors (docs and const)
>>>>>>
>>>>>> Some maintainers have strong opinions about whether change logs should be:
>>>>>>
>>>>>> (1) only in patch 0
>>>>>> (2) only in the specific patches that are changed
>>>>>> (3) both in patch 0 and in the specific patches that are changed.
>>>>>>
>>>>>> I can adapt to any of the three styles. But for style "(1)" please
>>>>>> list which specific patch has changed for each item in the change list.
>>>>>>
>>>>>
>>>>> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
>>>>> with (1) for this series. Didn't realize there were options (2) and
>>>>> (3). Since you started reviewing from v7, I'll do that in the future
>>>>> updates? Also, I haven't forgotten your emails. Just tied up with
>>>>> something else for a few days. I'll get to your emails next week.
>>>>
>>>> Yes, starting with future updates is fine, no need to redo the v9
>>>> change logs.
>>>>
>>>> No problem on the timing. I figured you were busy or away from the
>>>> internet.
>>>
>>> I'm replying to your comments on the other 3 patches. Okay with a
>>> majority of them. I'll wait for your reply to see where we settle for
>>> some of the points before I send out any patches though.
>>>
>>> For now I'm thinking of sending them as separate clean up patches so
>>> that Greg doesn't have to deal with reverts in his "next" branch. We
>>> can squash them later if we really need to rip out what's in there and
>>> push it again.
>>>
>>> -Saravana
>>>
>>
>> Please do not do separate clean up patches. The series that Greg has is
>> not ready for acceptance and I am going to ask him to revert it as we
>> work through the needed changes.
>>
>> I suspect there will be at least two more versions of the series. The
>> first is to get the patches I commented in good shape. Then I will
>> look at the patches later in the series to see how they fit into the
>> big picture.
>>
>> In the end, there should be one coherent patch series that implements
>> the feature.
>
> Incremental patches to fix up the comments and documentation is fine, no
> need to respin the whole mess.

The problem is that the whole thing is a "mess" at this point. I expect
the series to go through at least two or three more versions.

Please revert the series for now.

-Frank

>
> thanks,
>
> greg k-h
>

2019-08-16 15:25:09

by Greg Kroah-Hartman

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Fri, Aug 16, 2019 at 07:05:06AM -0700, Frank Rowand wrote:
> Hi Greg,
>
> On 8/16/19 2:10 AM, Greg Kroah-Hartman wrote:
> > On Thu, Aug 15, 2019 at 08:09:19PM -0700, Frank Rowand wrote:
> >> Hi Saravana,
> >>
> >> On 8/15/19 6:50 PM, Saravana Kannan wrote:
> >>> On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
> >>>>
> >>>> On 8/9/19 10:00 PM, Saravana Kannan wrote:
> >>>>> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
> >>>>>>
> >>>>>> Hi Saravana,
> >>>>>>
> >>>>>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
> >>>>>>> Add device-links to track functional dependencies between devices
> >>>>>>> after they are created (but before they are probed) by looking at
> >>>>>>> their common DT bindings like clocks, interconnects, etc.
> >>>>>>>
> >>>>>>> Having functional dependencies automatically added before the devices
> >>>>>>> are probed, provides the following benefits:
> >>>>>>>
> >>>>>>> - Optimizes device probe order and avoids the useless work of
> >>>>>>> attempting probes of devices that will not probe successfully
> >>>>>>> (because their suppliers aren't present or haven't probed yet).
> >>>>>>>
> >>>>>>> For example, in a commonly available mobile SoC, registering just
> >>>>>>> one consumer device's driver at an initcall level earlier than the
> >>>>>>> supplier device's driver causes 11 failed probe attempts before the
> >>>>>>> consumer device probes successfully. This was with a kernel with all
> >>>>>>> the drivers statically compiled in. This problem gets a lot worse if
> >>>>>>> all the drivers are loaded as modules without direct symbol
> >>>>>>> dependencies.
> >>>>>>>
> >>>>>>> - Supplier devices like clock providers, interconnect providers, etc
> >>>>>>> need to keep the resources they provide active and at a particular
> >>>>>>> state(s) during boot up even if their current set of consumers don't
> >>>>>>> request the resource to be active. This is because the rest of the
> >>>>>>> consumers might not have probed yet and turning off the resource
> >>>>>>> before all the consumers have probed could lead to a hang or
> >>>>>>> undesired user experience.
> >>>>>>>
> >>>>>>> Some frameworks (Eg: regulator) handle this today by turning off
> >>>>>>> "unused" resources at late_initcall_sync and hoping all the devices
> >>>>>>> have probed by then. This is not a valid assumption for systems with
> >>>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>>>>>> this due to the lack of a clear signal for when they can turn off
> >>>>>>> resources. This leads to downstream hacks to handle cases like this
> >>>>>>> that can easily be solved in the upstream kernel.
> >>>>>>>
> >>>>>>> By linking devices before they are probed, we give suppliers a clear
> >>>>>>> count of the number of dependent consumers. Once all of the
> >>>>>>> consumers are active, the suppliers can turn off the unused
> >>>>>>> resources without making assumptions about the number of consumers.
> >>>>>>>
> >>>>>>> By default we just add device-links to track "driver presence" (probe
> >>>>>>> succeeded) of the supplier device. If any other functionality provided
> >>>>>>> by device-links are needed, it is left to the consumer/supplier
> >>>>>>> devices to change the link when they probe.
> >>>>>>>
> >>>>>>> v1 -> v2:
> >>>>>>> - Drop patch to speed up of_find_device_by_node()
> >>>>>>> - Drop depends-on property and use existing bindings
> >>>>>>>
> >>>>>>> v2 -> v3:
> >>>>>>> - Refactor the code to have driver core initiate the linking of devs
> >>>>>>> - Have driver core link consumers to supplier before it's probed
> >>>>>>> - Add support for drivers to edit the device links before probing
> >>>>>>>
> >>>>>>> v3 -> v4:
> >>>>>>> - Tested edit_links() on system with cyclic dependency. Works.
> >>>>>>> - Added some checks to make sure device link isn't attempted from
> >>>>>>> parent device node to child device node.
> >>>>>>> - Added way to pause/resume sync_state callbacks across
> >>>>>>> of_platform_populate().
> >>>>>>> - Recursively parse DT node to create device links from parent to
> >>>>>>> suppliers of parent and all child nodes.
> >>>>>>>
> >>>>>>> v4 -> v5:
> >>>>>>> - Fixed copy-pasta bugs with linked list handling
> >>>>>>> - Walk up the phandle reference till I find an actual device (needed
> >>>>>>> for regulators to work)
> >>>>>>> - Added support for linking devices from regulator DT bindings
> >>>>>>> - Tested the whole series again to make sure cyclic dependencies are
> >>>>>>> broken with edit_links() and regulator links are created properly.
> >>>>>>>
> >>>>>>> v5 -> v6:
> >>>>>>> - Split, squashed and reordered some of the patches.
> >>>>>>> - Refactored the device linking code to follow the same code pattern for
> >>>>>>> any property.
> >>>>>>>
> >>>>>>> v6 -> v7:
> >>>>>>> - No functional changes.
> >>>>>>> - Renamed i to index
> >>>>>>> - Added comment to clarify not having to check property name for every
> >>>>>>> index
> >>>>>>> - Added "matched" variable to clarify code. No functional change.
> >>>>>>> - Added comments to include/linux/device.h for add_links()
> >>>>>>>
> >>>>>>> v7 -> v8:
> >>>>>>> - Rebased on top of linux-next to handle device link changes in [1]
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> v8 -> v9:
> >>>>>>> - Fixed kbuild test bot reported errors (docs and const)
> >>>>>>
> >>>>>> Some maintainers have strong opinions about whether change logs should be:
> >>>>>>
> >>>>>> (1) only in patch 0
> >>>>>> (2) only in the specific patches that are changed
> >>>>>> (3) both in patch 0 and in the specific patches that are changed.
> >>>>>>
> >>>>>> I can adapt to any of the three styles. But for style "(1)" please
> >>>>>> list which specific patch has changed for each item in the change list.
> >>>>>>
> >>>>>
> >>>>> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
> >>>>> with (1) for this series. Didn't realize there were options (2) and
> >>>>> (3). Since you started reviewing from v7, I'll do that in the future
> >>>>> updates? Also, I haven't forgotten your emails. Just tied up with
> >>>>> something else for a few days. I'll get to your emails next week.
> >>>>
> >>>> Yes, starting with future updates is fine, no need to redo the v9
> >>>> change logs.
> >>>>
> >>>> No problem on the timing. I figured you were busy or away from the
> >>>> internet.
> >>>
> >>> I'm replying to your comments on the other 3 patches. Okay with a
> >>> majority of them. I'll wait for your reply to see where we settle for
> >>> some of the points before I send out any patches though.
> >>>
> >>> For now I'm thinking of sending them as separate clean up patches so
> >>> that Greg doesn't have to deal with reverts in his "next" branch. We
> >>> can squash them later if we really need to rip out what's in there and
> >>> push it again.
> >>>
> >>> -Saravana
> >>>
> >>
> >> Please do not do separate clean up patches. The series that Greg has is
> >> not ready for acceptance and I am going to ask him to revert it as we
> >> work through the needed changes.
> >>
> >> I suspect there will be at least two more versions of the series. The
> >> first is to get the patches I commented in good shape. Then I will
> >> look at the patches later in the series to see how they fit into the
> >> big picture.
> >>
> >> In the end, there should be one coherent patch series that implements
> >> the feature.
> >
> > Incremental patches to fix up the comments and documentation is fine, no
> > need to respin the whole mess.
>
> The problem is that the whole thing is a "mess" at this point. I expect
> the series to go through at least two or three more versions.

I'm confused. All I see so far is objections about some documentation
in comments that can be cleaned up, and a disagreement about the name of
some things (naming is hard, tie goes to the submitter).

But no logic issues, right? Documentation and names can be fixed
anytime, the logic is all working properly, right?

What am I missing here?

thanks,

greg k-h

2019-08-16 20:54:05

by Frank Rowand

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On 8/16/19 8:23 AM, Greg Kroah-Hartman wrote:
> On Fri, Aug 16, 2019 at 07:05:06AM -0700, Frank Rowand wrote:
>> Hi Greg,
>>
>> On 8/16/19 2:10 AM, Greg Kroah-Hartman wrote:
>>> On Thu, Aug 15, 2019 at 08:09:19PM -0700, Frank Rowand wrote:
>>>> Hi Saravana,
>>>>
>>>> On 8/15/19 6:50 PM, Saravana Kannan wrote:
>>>>> On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
>>>>>>
>>>>>> On 8/9/19 10:00 PM, Saravana Kannan wrote:
>>>>>>> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Saravana,
>>>>>>>>
>>>>>>>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
>>>>>>>>> Add device-links to track functional dependencies between devices
>>>>>>>>> after they are created (but before they are probed) by looking at
>>>>>>>>> their common DT bindings like clocks, interconnects, etc.
>>>>>>>>>
>>>>>>>>> Having functional dependencies automatically added before the devices
>>>>>>>>> are probed, provides the following benefits:
>>>>>>>>>
>>>>>>>>> - Optimizes device probe order and avoids the useless work of
>>>>>>>>> attempting probes of devices that will not probe successfully
>>>>>>>>> (because their suppliers aren't present or haven't probed yet).
>>>>>>>>>
>>>>>>>>> For example, in a commonly available mobile SoC, registering just
>>>>>>>>> one consumer device's driver at an initcall level earlier than the
>>>>>>>>> supplier device's driver causes 11 failed probe attempts before the
>>>>>>>>> consumer device probes successfully. This was with a kernel with all
>>>>>>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>>>>>>> all the drivers are loaded as modules without direct symbol
>>>>>>>>> dependencies.
>>>>>>>>>
>>>>>>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>>>>>>> need to keep the resources they provide active and at a particular
>>>>>>>>> state(s) during boot up even if their current set of consumers don't
>>>>>>>>> request the resource to be active. This is because the rest of the
>>>>>>>>> consumers might not have probed yet and turning off the resource
>>>>>>>>> before all the consumers have probed could lead to a hang or
>>>>>>>>> undesired user experience.
>>>>>>>>>
>>>>>>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>>>>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>>>>>>> have probed by then. This is not a valid assumption for systems with
>>>>>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>>>>>>> this due to the lack of a clear signal for when they can turn off
>>>>>>>>> resources. This leads to downstream hacks to handle cases like this
>>>>>>>>> that can easily be solved in the upstream kernel.
>>>>>>>>>
>>>>>>>>> By linking devices before they are probed, we give suppliers a clear
>>>>>>>>> count of the number of dependent consumers. Once all of the
>>>>>>>>> consumers are active, the suppliers can turn off the unused
>>>>>>>>> resources without making assumptions about the number of consumers.
>>>>>>>>>
>>>>>>>>> By default we just add device-links to track "driver presence" (probe
>>>>>>>>> succeeded) of the supplier device. If any other functionality provided
>>>>>>>>> by device-links are needed, it is left to the consumer/supplier
>>>>>>>>> devices to change the link when they probe.
>>>>>>>>>
>>>>>>>>> v1 -> v2:
>>>>>>>>> - Drop patch to speed up of_find_device_by_node()
>>>>>>>>> - Drop depends-on property and use existing bindings
>>>>>>>>>
>>>>>>>>> v2 -> v3:
>>>>>>>>> - Refactor the code to have driver core initiate the linking of devs
>>>>>>>>> - Have driver core link consumers to supplier before it's probed
>>>>>>>>> - Add support for drivers to edit the device links before probing
>>>>>>>>>
>>>>>>>>> v3 -> v4:
>>>>>>>>> - Tested edit_links() on system with cyclic dependency. Works.
>>>>>>>>> - Added some checks to make sure device link isn't attempted from
>>>>>>>>> parent device node to child device node.
>>>>>>>>> - Added way to pause/resume sync_state callbacks across
>>>>>>>>> of_platform_populate().
>>>>>>>>> - Recursively parse DT node to create device links from parent to
>>>>>>>>> suppliers of parent and all child nodes.
>>>>>>>>>
>>>>>>>>> v4 -> v5:
>>>>>>>>> - Fixed copy-pasta bugs with linked list handling
>>>>>>>>> - Walk up the phandle reference till I find an actual device (needed
>>>>>>>>> for regulators to work)
>>>>>>>>> - Added support for linking devices from regulator DT bindings
>>>>>>>>> - Tested the whole series again to make sure cyclic dependencies are
>>>>>>>>> broken with edit_links() and regulator links are created properly.
>>>>>>>>>
>>>>>>>>> v5 -> v6:
>>>>>>>>> - Split, squashed and reordered some of the patches.
>>>>>>>>> - Refactored the device linking code to follow the same code pattern for
>>>>>>>>> any property.
>>>>>>>>>
>>>>>>>>> v6 -> v7:
>>>>>>>>> - No functional changes.
>>>>>>>>> - Renamed i to index
>>>>>>>>> - Added comment to clarify not having to check property name for every
>>>>>>>>> index
>>>>>>>>> - Added "matched" variable to clarify code. No functional change.
>>>>>>>>> - Added comments to include/linux/device.h for add_links()
>>>>>>>>>
>>>>>>>>> v7 -> v8:
>>>>>>>>> - Rebased on top of linux-next to handle device link changes in [1]
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> v8 -> v9:
>>>>>>>>> - Fixed kbuild test bot reported errors (docs and const)
>>>>>>>>
>>>>>>>> Some maintainers have strong opinions about whether change logs should be:
>>>>>>>>
>>>>>>>> (1) only in patch 0
>>>>>>>> (2) only in the specific patches that are changed
>>>>>>>> (3) both in patch 0 and in the specific patches that are changed.
>>>>>>>>
>>>>>>>> I can adapt to any of the three styles. But for style "(1)" please
>>>>>>>> list which specific patch has changed for each item in the change list.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
>>>>>>> with (1) for this series. Didn't realize there were options (2) and
>>>>>>> (3). Since you started reviewing from v7, I'll do that in the future
>>>>>>> updates? Also, I haven't forgotten your emails. Just tied up with
>>>>>>> something else for a few days. I'll get to your emails next week.
>>>>>>
>>>>>> Yes, starting with future updates is fine, no need to redo the v9
>>>>>> change logs.
>>>>>>
>>>>>> No problem on the timing. I figured you were busy or away from the
>>>>>> internet.
>>>>>
>>>>> I'm replying to your comments on the other 3 patches. Okay with a
>>>>> majority of them. I'll wait for your reply to see where we settle for
>>>>> some of the points before I send out any patches though.
>>>>>
>>>>> For now I'm thinking of sending them as separate clean up patches so
>>>>> that Greg doesn't have to deal with reverts in his "next" branch. We
>>>>> can squash them later if we really need to rip out what's in there and
>>>>> push it again.
>>>>>
>>>>> -Saravana
>>>>>
>>>>
>>>> Please do not do separate clean up patches. The series that Greg has is
>>>> not ready for acceptance and I am going to ask him to revert it as we
>>>> work through the needed changes.
>>>>
>>>> I suspect there will be at least two more versions of the series. The
>>>> first is to get the patches I commented in good shape. Then I will
>>>> look at the patches later in the series to see how they fit into the
>>>> big picture.
>>>>
>>>> In the end, there should be one coherent patch series that implements
>>>> the feature.
>>>
>>> Incremental patches to fix up the comments and documentation is fine, no
>>> need to respin the whole mess.
>>
>> The problem is that the whole thing is a "mess" at this point. I expect
>> the series to go through at least two or three more versions.
>
> I'm confused. All I see so far is objections about some documentation
> in comments that can be cleaned up, and a disagreement about the name of
> some things (naming is hard, tie goes to the submitter).

Yes, naming is hard. No, tie does not go to the submitter is the naming
makes the code difficult to understand.

Naming is one of the reasons why I have found this series so difficult
to understand.


> But no logic issues, right? Documentation and names can be fixed
> anytime, the logic is all working properly, right?

Yes, there are logic issues. I do not agree with all of the explanations
in the replies.

Without going into detail about all the issues, one key is that I
need to see an example of the edit_links() function, which Saravana
says he will provide. I don't want a bunch of ad hoc edit_links()
functions that each deal with cyclic dependencies in different ways.

There is also disagreement over whether the complexity of the
dev->has_edit_links field and driver_edit_links() is needed.

My biggest meta-issue is that this patch series is papering over the
real problem that prompted the patches. The real problem is that the
boot loader has enabled a power supply, but the power subsystem is
not aware that there is an active consumer. I have been hopeful that
this series can be implemented in a way that makes me comfortable
that it is _not_ just papering over the true problem. I still
retain that hope.
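
To make the scenario concrete, here is a toy user-space sketch in C. It is
not kernel code and not taken from the series; struct toy_supply and the
toy_*() helpers are hypothetical names invented for illustration. It
contrasts a cleanup pass that only looks at explicit requests with one that
also waits for every known consumer to probe.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, none is a kernel API. */
struct toy_supply {
	bool enabled;           /* true if the boot loader left the supply on */
	int requests;           /* consumers that have explicitly requested it */
	int unprobed_consumers; /* known consumers that have not probed yet */
};

/* A consumer finished probing; it may or may not request the supply. */
static void toy_consumer_probed(struct toy_supply *s, bool requests_supply)
{
	assert(s->unprobed_consumers > 0);
	s->unprobed_consumers--;
	if (requests_supply)
		s->requests++;
}

/* Cleanup that only looks at explicit requests (late_initcall_sync style). */
static void toy_cleanup_by_requests(struct toy_supply *s)
{
	if (s->requests == 0)
		s->enabled = false;
}

/* Cleanup that also waits for every known consumer to probe. */
static void toy_cleanup_by_links(struct toy_supply *s)
{
	if (s->unprobed_consumers == 0 && s->requests == 0)
		s->enabled = false;
}
```

With one consumer still unprobed and no requests yet, the request-only
cleanup switches the supply off, while the link-aware cleanup leaves it on
until that consumer has probed and had a chance to request it.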


>
> What am I missing here?
>
> thanks,
>
> greg k-h
>

-Frank

2019-08-16 20:55:06

by Frank Rowand

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On 8/16/19 1:52 PM, Frank Rowand wrote:
> On 8/16/19 8:23 AM, Greg Kroah-Hartman wrote:
>> On Fri, Aug 16, 2019 at 07:05:06AM -0700, Frank Rowand wrote:
>>> Hi Greg,
>>>
>>> On 8/16/19 2:10 AM, Greg Kroah-Hartman wrote:
>>>> On Thu, Aug 15, 2019 at 08:09:19PM -0700, Frank Rowand wrote:
>>>>> Hi Saravana,
>>>>>
>>>>> On 8/15/19 6:50 PM, Saravana Kannan wrote:
>>>>>> On Fri, Aug 9, 2019 at 10:20 PM Frank Rowand <[email protected]> wrote:
>>>>>>>
>>>>>>> On 8/9/19 10:00 PM, Saravana Kannan wrote:
>>>>>>>> On Fri, Aug 9, 2019 at 7:57 PM Frank Rowand <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Saravana,
>>>>>>>>>
>>>>>>>>> On 7/31/19 3:17 PM, Saravana Kannan wrote:
>>>>>>>>>> Add device-links to track functional dependencies between devices
>>>>>>>>>> after they are created (but before they are probed) by looking at
>>>>>>>>>> their common DT bindings like clocks, interconnects, etc.
>>>>>>>>>>
>>>>>>>>>> Having functional dependencies automatically added before the devices
>>>>>>>>>> are probed, provides the following benefits:
>>>>>>>>>>
>>>>>>>>>> - Optimizes device probe order and avoids the useless work of
>>>>>>>>>> attempting probes of devices that will not probe successfully
>>>>>>>>>> (because their suppliers aren't present or haven't probed yet).
>>>>>>>>>>
>>>>>>>>>> For example, in a commonly available mobile SoC, registering just
>>>>>>>>>> one consumer device's driver at an initcall level earlier than the
>>>>>>>>>> supplier device's driver causes 11 failed probe attempts before the
>>>>>>>>>> consumer device probes successfully. This was with a kernel with all
>>>>>>>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>>>>>>>> all the drivers are loaded as modules without direct symbol
>>>>>>>>>> dependencies.
>>>>>>>>>>
>>>>>>>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>>>>>>>> need to keep the resources they provide active and at a particular
>>>>>>>>>> state(s) during boot up even if their current set of consumers don't
>>>>>>>>>> request the resource to be active. This is because the rest of the
>>>>>>>>>> consumers might not have probed yet and turning off the resource
>>>>>>>>>> before all the consumers have probed could lead to a hang or
>>>>>>>>>> undesired user experience.
>>>>>>>>>>
>>>>>>>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>>>>>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>>>>>>>> have probed by then. This is not a valid assumption for systems with
>>>>>>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>>>>>>>> this due to the lack of a clear signal for when they can turn off
>>>>>>>>>> resources. This leads to downstream hacks to handle cases like this
>>>>>>>>>> that can easily be solved in the upstream kernel.
>>>>>>>>>>
>>>>>>>>>> By linking devices before they are probed, we give suppliers a clear
>>>>>>>>>> count of the number of dependent consumers. Once all of the
>>>>>>>>>> consumers are active, the suppliers can turn off the unused
>>>>>>>>>> resources without making assumptions about the number of consumers.
>>>>>>>>>>
>>>>>>>>>> By default we just add device-links to track "driver presence" (probe
>>>>>>>>>> succeeded) of the supplier device. If any other functionality provided
>>>>>>>>>> by device-links are needed, it is left to the consumer/supplier
>>>>>>>>>> devices to change the link when they probe.
>>>>>>>>>>
>>>>>>>>>> v1 -> v2:
>>>>>>>>>> - Drop patch to speed up of_find_device_by_node()
>>>>>>>>>> - Drop depends-on property and use existing bindings
>>>>>>>>>>
>>>>>>>>>> v2 -> v3:
>>>>>>>>>> - Refactor the code to have driver core initiate the linking of devs
>>>>>>>>>> - Have driver core link consumers to supplier before it's probed
>>>>>>>>>> - Add support for drivers to edit the device links before probing
>>>>>>>>>>
>>>>>>>>>> v3 -> v4:
>>>>>>>>>> - Tested edit_links() on system with cyclic dependency. Works.
>>>>>>>>>> - Added some checks to make sure device link isn't attempted from
>>>>>>>>>> parent device node to child device node.
>>>>>>>>>> - Added way to pause/resume sync_state callbacks across
>>>>>>>>>> of_platform_populate().
>>>>>>>>>> - Recursively parse DT node to create device links from parent to
>>>>>>>>>> suppliers of parent and all child nodes.
>>>>>>>>>>
>>>>>>>>>> v4 -> v5:
>>>>>>>>>> - Fixed copy-pasta bugs with linked list handling
>>>>>>>>>> - Walk up the phandle reference till I find an actual device (needed
>>>>>>>>>> for regulators to work)
>>>>>>>>>> - Added support for linking devices from regulator DT bindings
>>>>>>>>>> - Tested the whole series again to make sure cyclic dependencies are
>>>>>>>>>> broken with edit_links() and regulator links are created properly.
>>>>>>>>>>
>>>>>>>>>> v5 -> v6:
>>>>>>>>>> - Split, squashed and reordered some of the patches.
>>>>>>>>>> - Refactored the device linking code to follow the same code pattern for
>>>>>>>>>> any property.
>>>>>>>>>>
>>>>>>>>>> v6 -> v7:
>>>>>>>>>> - No functional changes.
>>>>>>>>>> - Renamed i to index
>>>>>>>>>> - Added comment to clarify not having to check property name for every
>>>>>>>>>> index
>>>>>>>>>> - Added "matched" variable to clarify code. No functional change.
>>>>>>>>>> - Added comments to include/linux/device.h for add_links()
>>>>>>>>>>
>>>>>>>>>> v7 -> v8:
>>>>>>>>>> - Rebased on top of linux-next to handle device link changes in [1]
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> v8 -> v9:
>>>>>>>>>> - Fixed kbuild test bot reported errors (docs and const)
>>>>>>>>>
>>>>>>>>> Some maintainers have strong opinions about whether change logs should be:
>>>>>>>>>
>>>>>>>>> (1) only in patch 0
>>>>>>>>> (2) only in the specific patches that are changed
>>>>>>>>> (3) both in patch 0 and in the specific patches that are changed.
>>>>>>>>>
>>>>>>>>> I can adapt to any of the three styles. But for style "(1)" please
>>>>>>>>> list which specific patch has changed for each item in the change list.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the context Frank. I'm okay with (1) or (2) but I'll stick
>>>>>>>> with (1) for this series. Didn't realize there were options (2) and
>>>>>>>> (3). Since you started reviewing from v7, I'll do that in the future
>>>>>>>> updates? Also, I haven't forgotten your emails. Just tied up with
>>>>>>>> something else for a few days. I'll get to your emails next week.
>>>>>>>
>>>>>>> Yes, starting with future updates is fine, no need to redo the v9
>>>>>>> change logs.
>>>>>>>
>>>>>>> No problem on the timing. I figured you were busy or away from the
>>>>>>> internet.
>>>>>>
>>>>>> I'm replying to your comments on the other 3 patches. Okay with a
>>>>>> majority of them. I'll wait for your reply to see where we settle for
>>>>>> some of the points before I send out any patches though.
>>>>>>
>>>>>> For now I'm thinking of sending them as separate clean up patches so
>>>>>> that Greg doesn't have to deal with reverts in his "next" branch. We
>>>>>> can squash them later if we really need to rip out what's in there and
>>>>>> push it again.
>>>>>>
>>>>>> -Saravana
>>>>>>
>>>>>
>>>>> Please do not do separate clean up patches. The series that Greg has is
>>>>> not ready for acceptance and I am going to ask him to revert it as we
>>>>> work through the needed changes.
>>>>>
>>>>> I suspect there will be at least two more versions of the series. The
>>>>> first is to get the patches I commented in good shape. Then I will
>>>>> look at the patches later in the series to see how they fit into the
>>>>> big picture.
>>>>>
>>>>> In the end, there should be one coherent patch series that implements
>>>>> the feature.
>>>>
>>>> Incremental patches to fix up the comments and documentation is fine, no
>>>> need to respin the whole mess.
>>>
>>> The problem is that the whole thing is a "mess" at this point. I expect
>>> the series to go through at least two or three more versions.
>>
>> I'm confused. All I see so far is objections about some documentation
>> in comments that can be cleaned up, and a disagreement about the name of
>> some things (naming is hard, tie goes to the submitter).
>
> Yes naming is hard. No,tie does not go to the submitter is the naming

^^ if

-Frank

> makes the code difficult to understand.
>
> Naming is one of the reasons why I have found this series so difficult
> to understand.
>
>
>> But no logic issues, right? Documentation and names can be fixed
>> anytime, the logic is all working properly, right?
>
> Yes, there are logic issues. I do not agree with all of the explanations
> in the replies.
>
> Without going into detail about all the issues, one key is that I
> need to see an example of the edit_links() function, which Saravana
> says he will provide. I don't want a bunch of ad hoc edit_links()
> functions that each deal with cyclic dependencies in different ways.
>
> There is also disagreement over whether the complexity of the
> dev->has_edit_links field and driver_edit_links() are needed.
>
> My biggest meta-issue is that this patch series is papering over the
> real problem that prompted the patches. The real problem is that the
> boot loader has enabled a power supply, but the power subsystem is
> not aware that there is an active consumer. I have been hopeful that
> this series can be implemented in a way that makes me comfortable
> that it is _not_ just papering over the true problem. I still
> retain that hope.
>
>
>>
>> What am I missing here?
>>
>> thanks,
>>
>> greg k-h
>>
>
> -Frank
>

2019-08-27 19:44:26

by Greg Kroah-Hartman

Subject: Re: [PATCH v9 0/7] Solve postboot supplier cleanup and optimize probe ordering

On Wed, Aug 07, 2019 at 07:13:26PM -0700, Frank Rowand wrote:
> Hi Greg, Saravana,
>
> On 8/1/19 11:37 PM, Greg Kroah-Hartman wrote:
> > On Thu, Aug 01, 2019 at 12:59:25PM -0700, Frank Rowand wrote:
> >> On 8/1/19 12:32 PM, Greg Kroah-Hartman wrote:
> >>> On Thu, Aug 01, 2019 at 12:28:13PM -0700, Frank Rowand wrote:
> >>>> Hi Greg,
> >>>>
> >>>> On 7/31/19 11:12 PM, Greg Kroah-Hartman wrote:
> >>>>> On Wed, Jul 31, 2019 at 03:17:13PM -0700, Saravana Kannan wrote:
> >>>>>> Add device-links to track functional dependencies between devices
> >>>>>> after they are created (but before they are probed) by looking at
> >>>>>> their common DT bindings like clocks, interconnects, etc.
> >>>>>>
> >>>>>> Having functional dependencies automatically added before the devices
> >>>>>> are probed, provides the following benefits:
> >>>>>>
> >>>>>> - Optimizes device probe order and avoids the useless work of
> >>>>>> attempting probes of devices that will not probe successfully
> >>>>>> (because their suppliers aren't present or haven't probed yet).
> >>>>>>
> >>>>>> For example, in a commonly available mobile SoC, registering just
> >>>>>> one consumer device's driver at an initcall level earlier than the
> >>>>>> supplier device's driver causes 11 failed probe attempts before the
> >>>>>> consumer device probes successfully. This was with a kernel with all
> >>>>>> the drivers statically compiled in. This problem gets a lot worse if
> >>>>>> all the drivers are loaded as modules without direct symbol
> >>>>>> dependencies.
> >>>>>>
> >>>>>> - Supplier devices like clock providers, interconnect providers, etc
> >>>>>> need to keep the resources they provide active and at a particular
> >>>>>> state(s) during boot up even if their current set of consumers don't
> >>>>>> request the resource to be active. This is because the rest of the
> >>>>>> consumers might not have probed yet and turning off the resource
> >>>>>> before all the consumers have probed could lead to a hang or
> >>>>>> undesired user experience.
> >>>>>>
> >>>>>> Some frameworks (Eg: regulator) handle this today by turning off
> >>>>>> "unused" resources at late_initcall_sync and hoping all the devices
> >>>>>> have probed by then. This is not a valid assumption for systems with
> >>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>>>>> this due to the lack of a clear signal for when they can turn off
> >>>>>> resources. This leads to downstream hacks to handle cases like this
> >>>>>> that can easily be solved in the upstream kernel.
> >>>>>>
> >>>>>> By linking devices before they are probed, we give suppliers a clear
> >>>>>> count of the number of dependent consumers. Once all of the
> >>>>>> consumers are active, the suppliers can turn off the unused
> >>>>>> resources without making assumptions about the number of consumers.
> >>>>>>
> >>>>>> By default we just add device-links to track "driver presence" (probe
> >>>>>> succeeded) of the supplier device. If any other functionality provided
> >>>>>> by device-links is needed, it is left to the consumer/supplier
> >>>>>> devices to change the link when they probe.
> >>>>>
> >>>>> All now queued up in my driver-core-testing branch, and if 0-day is
> >>>>> happy with this, will move it to my "real" driver-core-next branch in a
> >>>>> day or so to get included in linux-next.
> >>>>
> >>>> I have been slow in getting my review out.
> >>>>
> >>>> This patch series is not yet ready for sending to Linus, so if putting
> >>>> this in linux-next implies that it will be in your next pull request
> >>>> to Linus, please do not put it in linux-next.
> >>>
> >>> It means that it will be in my pull request for 5.4-rc1, many many
> >>> weeks away from now.
> >>
> >> If you are willing to revert the series before the pull request _if_ I
> >> have significant review issues in the next couple of days, then I am happy
> >> to see the patches get exposure in linux-next.
> >
> > If you have significant review issues, yes, I will be glad to revert them.
>
> Just a heads up that I have sent review issues in reply to version 7 of this
> patch series.
>
> We'll see what the responses are to my review comments, but I am expecting
> the changes are big enough to result in a new version (or a couple more
> versions) of the patch series.
>
> No rush to revert version 9 since your 5.4-rc1 pull request is still not
> near, and I am glad for whatever exposure these patches are getting in
> linux-next.

Based on the further comments on this series, and the in-person
discussion we had at ELC, I have now reverted these patches and the
follow-on fixes for this series from my tree, with the hope that an
updated patch set will be sent for review soon.

thanks,

greg k-h