LinuxLists.cc - Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

2024-02-23 18:40:10

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On 2/23/24 06:38, Guan-Yu Lin wrote:
> In systems with a main processor and a co-processor, asynchronous
> controller management can lead to conflicts. One example is the main
> processor attempting to suspend a device while the co-processor is
> actively using it. To address this, we introduce a new sysfs entry
> called "conditional_skip". This entry allows the system to selectively
> skip certain device power management state transitions. To use this
> feature, set the value in "conditional_skip" to indicate the type of
> state transition you want to avoid. Please review /Documentation/ABI/
> testing/sysfs-devices-power for more detailed information.

This looks like a poor way of dealing with a lack of adequate resource
tracking from Linux on behalf of the co-processor(s) and I really do not
understand how someone is supposed to use that in a way that works.

Cannot you use a HW maintained spinlock between your host processor and
the co-processor such that they can each claim exclusive access to the
hardware and you can busy-wait until one or the other is done using the
device? How is your partitioning between host processor owned blocks and
co-processor(s) owned blocks? Is it static or is it dynamic?
--
Florian
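
The hardware-spinlock idea raised above can be illustrated with a toy
model. The Python sketch below is not taken from the patch or from any
real driver: the HwSemaphore class, its claim-if-free semantics, and the
agent IDs are all assumptions made for illustration. On real hardware
the arbitration would be done by a semaphore register, and on the Linux
side probably through the kernel's hwspinlock framework rather than code
like this.

```python
import threading
import time


class HwSemaphore:
    """Toy model of a hardware semaphore register (hypothetical semantics)."""

    FREE = 0  # agent IDs must be non-zero

    def __init__(self):
        self._owner = self.FREE
        # Models the atomicity of a single register access on the bus.
        self._bus = threading.Lock()

    def try_claim(self, agent_id):
        """Claim the semaphore if it is free; return True if agent_id owns it."""
        with self._bus:
            if self._owner == self.FREE:
                self._owner = agent_id
            return self._owner == agent_id

    def release(self, agent_id):
        """Release the semaphore; only the current owner may do so."""
        with self._bus:
            if self._owner != agent_id:
                raise RuntimeError("release attempted by non-owner")
            self._owner = self.FREE


def with_device(sem, agent_id, work):
    """Busy-wait for exclusive access, run work(), then release."""
    while not sem.try_claim(agent_id):
        time.sleep(0)  # yield; real firmware would spin or back off
    try:
        return work()
    finally:
        sem.release(agent_id)
```

Each processor would call something like with_device() around its own
accesses. Note that this only serializes access; it does not by itself
tell the Linux PM core to leave the device powered, which is the gap the
rest of the thread argues about.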

2024-02-26 11:16:13

by Guan-Yu Lin

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
>
> On 2/23/24 06:38, Guan-Yu Lin wrote:
> > In systems with a main processor and a co-processor, asynchronous
> > controller management can lead to conflicts. One example is the main
> > processor attempting to suspend a device while the co-processor is
> > actively using it. To address this, we introduce a new sysfs entry
> > called "conditional_skip". This entry allows the system to selectively
> > skip certain device power management state transitions. To use this
> > feature, set the value in "conditional_skip" to indicate the type of
> > state transition you want to avoid. Please review /Documentation/ABI/
> > testing/sysfs-devices-power for more detailed information.
>
> This looks like a poor way of dealing with a lack of adequate resource
> tracking from Linux on behalf of the co-processor(s) and I really do not
> understand how someone is supposed to use that in a way that works.
>
> Cannot you use a HW maintained spinlock between your host processor and
> the co-processor such that they can each claim exclusive access to the
> hardware and you can busy-wait until one or the other is done using the
> device? How is your partitioning between host processor owned blocks and
> co-processor(s) owned blocks? Is it static or is it dynamic?
> --
> Florian
>

This patch enables devices to selectively participate in system power
transitions. This is crucial when multiple processors, managed by
different operating system kernels, share the same controller. One
processor shouldn't enforce the same power transition procedures on
the controller – another processor might be using it at that moment.
While a spinlock is necessary for synchronizing controller access, we
still need to add the flexibility to dynamically customize power
transition behavior for each device. And that's what this patch is
trying to do.
In our use case, the host processor and co-processor are managed by
separate operating system kernels. This arrangement is static.

2024-02-26 18:40:30

by Florian Fainelli

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On 2/26/24 02:28, Guan-Yu Lin wrote:
> On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
>>
>> On 2/23/24 06:38, Guan-Yu Lin wrote:
>>> In systems with a main processor and a co-processor, asynchronous
>>> controller management can lead to conflicts. One example is the main
>>> processor attempting to suspend a device while the co-processor is
>>> actively using it. To address this, we introduce a new sysfs entry
>>> called "conditional_skip". This entry allows the system to selectively
>>> skip certain device power management state transitions. To use this
>>> feature, set the value in "conditional_skip" to indicate the type of
>>> state transition you want to avoid. Please review /Documentation/ABI/
>>> testing/sysfs-devices-power for more detailed information.
>>
>> This looks like a poor way of dealing with a lack of adequate resource
>> tracking from Linux on behalf of the co-processor(s) and I really do not
>> understand how someone is supposed to use that in a way that works.
>>
>> Cannot you use a HW maintained spinlock between your host processor and
>> the co-processor such that they can each claim exclusive access to the
>> hardware and you can busy-wait until one or the other is done using the
>> device? How is your partitioning between host processor owned blocks and
>> co-processor(s) owned blocks? Is it static or is it dynamic?
>> --
>> Florian
>>
>
> This patch enables devices to selectively participate in system power
> transitions. This is crucial when multiple processors, managed by
> different operating system kernels, share the same controller. One
> processor shouldn't enforce the same power transition procedures on
> the controller – another processor might be using it at that moment.
> While a spinlock is necessary for synchronizing controller access, we
> still need to add the flexibility to dynamically customize power
> transition behavior for each device. And that's what this patch is
> trying to do.
> In our use case, the host processor and co-processor are managed by
> separate operating system kernels. This arrangement is static.

OK, so now the question is whether the peripheral is entirely visible to
Linux, or is it entirely owned by the co-processor, or is there a
combination of both and the usage of the said device driver is dynamic
between Linux and your co-processor?

A sysfs entry does not seem like the appropriate way to describe which
states need to be skipped and which ones can remain under control of
Linux, you would have to use your firmware's description for that (ACPI,
Device Tree, etc.) such that you have a more comprehensive solution that
can span a bigger scope.
--
Florian

2024-02-27 09:12:26

by Guan-Yu Lin

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Tue, Feb 27, 2024 at 2:40 AM Florian Fainelli wrote:
>
> On 2/26/24 02:28, Guan-Yu Lin wrote:
> > On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
> >>
> >> On 2/23/24 06:38, Guan-Yu Lin wrote:
> >>> In systems with a main processor and a co-processor, asynchronous
> >>> controller management can lead to conflicts. One example is the main
> >>> processor attempting to suspend a device while the co-processor is
> >>> actively using it. To address this, we introduce a new sysfs entry
> >>> called "conditional_skip". This entry allows the system to selectively
> >>> skip certain device power management state transitions. To use this
> >>> feature, set the value in "conditional_skip" to indicate the type of
> >>> state transition you want to avoid. Please review /Documentation/ABI/
> >>> testing/sysfs-devices-power for more detailed information.
> >>
> >> This looks like a poor way of dealing with a lack of adequate resource
> >> tracking from Linux on behalf of the co-processor(s) and I really do not
> >> understand how someone is supposed to use that in a way that works.
> >>
> >> Cannot you use a HW maintained spinlock between your host processor and
> >> the co-processor such that they can each claim exclusive access to the
> >> hardware and you can busy-wait until one or the other is done using the
> >> device? How is your partitioning between host processor owned blocks and
> >> co-processor(s) owned blocks? Is it static or is it dynamic?
> >> --
> >> Florian
> >>
> >
> > This patch enables devices to selectively participate in system power
> > transitions. This is crucial when multiple processors, managed by
> > different operating system kernels, share the same controller. One
> > processor shouldn't enforce the same power transition procedures on
> > the controller – another processor might be using it at that moment.
> > While a spinlock is necessary for synchronizing controller access, we
> > still need to add the flexibility to dynamically customize power
> > transition behavior for each device. And that's what this patch is
> > trying to do.
> > In our use case, the host processor and co-processor are managed by
> > separate operating system kernels. This arrangement is static.
>
> OK, so now the question is whether the peripheral is entirely visible to
> Linux, or is it entirely owned by the co-processor, or is there a
> combination of both and the usage of the said device driver is dynamic
> between Linux and your co-processor?
>
> A sysfs entry does not seem like the appropriate way to describe which
> states need to be skipped and which ones can remain under control of
> Linux, you would have to use your firmware's description for that (ACPI,
> Device Tree, etc.) such that you have a more comprehensive solution that
> can span a bigger scope.
> --
> Florian
>

We anticipate that control of the peripheral (e.g., controller) will
be shared between operating system kernels. Each kernel will need its
own driver for peripheral communication. To accommodate different
tasks, the operating system managing the peripheral can change
dynamically at runtime.

We dynamically select the operating system kernel controlling the
target peripheral based on the task at hand, which looks more like a
software behavior rather than hardware behavior to me. I agree that we
might need a firmware description for "whether another operating
system exists for this peripheral", but we also need to store the
information about "whether another operating system is actively using
this peripheral". To me, the latter one looks more like a sysfs entry
rather than a firmware description as it's not determined statically.

2024-02-27 09:15:51

by Greg Kroah-Hartman

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Tue, Feb 27, 2024 at 04:56:00PM +0800, Guan-Yu Lin wrote:
> On Tue, Feb 27, 2024 at 2:40 AM Florian Fainelli wrote:
> >
> > On 2/26/24 02:28, Guan-Yu Lin wrote:
> > > On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
> > >>
> > >> On 2/23/24 06:38, Guan-Yu Lin wrote:
> > >>> In systems with a main processor and a co-processor, asynchronous
> > >>> controller management can lead to conflicts. One example is the main
> > >>> processor attempting to suspend a device while the co-processor is
> > >>> actively using it. To address this, we introduce a new sysfs entry
> > >>> called "conditional_skip". This entry allows the system to selectively
> > >>> skip certain device power management state transitions. To use this
> > >>> feature, set the value in "conditional_skip" to indicate the type of
> > >>> state transition you want to avoid. Please review /Documentation/ABI/
> > >>> testing/sysfs-devices-power for more detailed information.
> > >>
> > >> This looks like a poor way of dealing with a lack of adequate resource
> > >> tracking from Linux on behalf of the co-processor(s) and I really do not
> > >> understand how someone is supposed to use that in a way that works.
> > >>
> > >> Cannot you use a HW maintained spinlock between your host processor and
> > >> the co-processor such that they can each claim exclusive access to the
> > >> hardware and you can busy-wait until one or the other is done using the
> > >> device? How is your partitioning between host processor owned blocks and
> > >> co-processor(s) owned blocks? Is it static or is it dynamic?
> > >> --
> > >> Florian
> > >>
> > >
> > > This patch enables devices to selectively participate in system power
> > > transitions. This is crucial when multiple processors, managed by
> > > different operating system kernels, share the same controller. One
> > > processor shouldn't enforce the same power transition procedures on
> > > the controller – another processor might be using it at that moment.
> > > While a spinlock is necessary for synchronizing controller access, we
> > > still need to add the flexibility to dynamically customize power
> > > transition behavior for each device. And that's what this patch is
> > > trying to do.
> > > In our use case, the host processor and co-processor are managed by
> > > separate operating system kernels. This arrangement is static.
> >
> > OK, so now the question is whether the peripheral is entirely visible to
> > Linux, or is it entirely owned by the co-processor, or is there a
> > combination of both and the usage of the said device driver is dynamic
> > between Linux and your co-processor?
> >
> > > A sysfs entry does not seem like the appropriate way to describe which
> > states need to be skipped and which ones can remain under control of
> > Linux, you would have to use your firmware's description for that (ACPI,
> > Device Tree, etc.) such that you have a more comprehensive solution that
> > can span a bigger scope.
> > --
> > Florian
> >
>
> We anticipate that control of the peripheral (e.g., controller) will
> be shared between operating system kernels. Each kernel will need its
> own driver for peripheral communication. To accommodate different
> tasks, the operating system managing the peripheral can change
> dynamically at runtime.

That sounds like a nightmare of control and handling, how are you going
to do any of that? Where is the code for that?

> We dynamically select the operating system kernel controlling the
> target peripheral based on the task at hand, which looks more like a
> software behavior rather than hardware behavior to me. I agree that we
> might need a firmware description for "whether another operating
> system exists for this peripheral", but we also need to store the
> information about "whether another operating system is actively using
> this peripheral". To me, the latter one looks more like a sysfs entry
> rather than a firmware description as it's not determined statically.

So you want to download different firmware to the device depending on
"something". What is going to control that "something"? Is that coming
from the kernel, or from userspace? If userspace, why is any of this an
issue and just load whatever firmware you decide at that point in time?
Why does the kernel care?

confused,

greg k-h

2024-02-27 17:57:26

by Florian Fainelli

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On 2/27/24 00:56, Guan-Yu Lin wrote:
> On Tue, Feb 27, 2024 at 2:40 AM Florian Fainelli wrote:
>>
>> On 2/26/24 02:28, Guan-Yu Lin wrote:
>>> On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
>>>>
>>>> On 2/23/24 06:38, Guan-Yu Lin wrote:
>>>>> In systems with a main processor and a co-processor, asynchronous
>>>>> controller management can lead to conflicts. One example is the main
>>>>> processor attempting to suspend a device while the co-processor is
>>>>> actively using it. To address this, we introduce a new sysfs entry
>>>>> called "conditional_skip". This entry allows the system to selectively
>>>>> skip certain device power management state transitions. To use this
>>>>> feature, set the value in "conditional_skip" to indicate the type of
>>>>> state transition you want to avoid. Please review /Documentation/ABI/
>>>>> testing/sysfs-devices-power for more detailed information.
>>>>
>>>> This looks like a poor way of dealing with a lack of adequate resource
>>>> tracking from Linux on behalf of the co-processor(s) and I really do not
>>>> understand how someone is supposed to use that in a way that works.
>>>>
>>>> Cannot you use a HW maintained spinlock between your host processor and
>>>> the co-processor such that they can each claim exclusive access to the
>>>> hardware and you can busy-wait until one or the other is done using the
>>>> device? How is your partitioning between host processor owned blocks and
>>>> co-processor(s) owned blocks? Is it static or is it dynamic?
>>>> --
>>>> Florian
>>>>
>>>
>>> This patch enables devices to selectively participate in system power
>>> transitions. This is crucial when multiple processors, managed by
>>> different operating system kernels, share the same controller. One
>>> processor shouldn't enforce the same power transition procedures on
>>> the controller – another processor might be using it at that moment.
>>> While a spinlock is necessary for synchronizing controller access, we
>>> still need to add the flexibility to dynamically customize power
>>> transition behavior for each device. And that's what this patch is
>>> trying to do.
>>> In our use case, the host processor and co-processor are managed by
>>> separate operating system kernels. This arrangement is static.
>>
>> OK, so now the question is whether the peripheral is entirely visible to
>> Linux, or is it entirely owned by the co-processor, or is there a
>> combination of both and the usage of the said device driver is dynamic
>> between Linux and your co-processor?
>>
>> A sysfs entry does not seem like the appropriate way to describe which
>> states need to be skipped and which ones can remain under control of
>> Linux, you would have to use your firmware's description for that (ACPI,
>> Device Tree, etc.) such that you have a more comprehensive solution that
>> can span a bigger scope.
>> --
>> Florian
>>
>
> We anticipate that control of the peripheral (e.g., controller) will
> be shared between operating system kernels. Each kernel will need its
> own driver for peripheral communication. To accommodate different
> tasks, the operating system managing the peripheral can change
> dynamically at runtime.

OK, that seems like this ought to be resolved at various layers other
than just user-space, starting possibly with an
overarching/reconciliation layer between the various operating systems?

>
> We dynamically select the operating system kernel controlling the
> target peripheral based on the task at hand, which looks more like a
> software behavior rather than hardware behavior to me. I agree that we
> might need a firmware description for "whether another operating
> system exists for this peripheral", but we also need to store the
> information about "whether another operating system is actively using
> this peripheral". To me, the latter one looks more like a sysfs entry
> rather than a firmware description as it's not determined statically.

I can understand why moving this sort of decision to user-space might
sound appealing, but it also seems like if the peripheral is going to be
"stolen" away from Linux, then maybe Linux should not be managing it at
all, e.g.: unbind the device from its driver, and then rebind it when
Linux needs to use it. You would have to write your drivers such that
they can skip the peripheral's initialization if you need to preserve
state from the previous agent after an ownership change for instance?

I do not think you are painting a full picture of your use case,
hopefully not intentionally but at first glance it sounds like you need
a combination of kernel-level changes to your drivers, and possibly more.

Seems like more details need to be provided about the overall intended
use cases such that people can guide you with a fuller picture of the
use cases.
--
Florian
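
The unbind/rebind approach suggested above uses the bind and unbind
files that the driver core exposes for each driver under
/sys/bus/<bus>/drivers/<driver>/. The Python sketch below shows the
mechanics only; the helper name set_driver_binding and the bus, driver,
and device names in the usage note are hypothetical, and the sysfs_root
parameter exists only so the helper can be exercised against a scratch
directory.

```python
from pathlib import Path


def set_driver_binding(bus, driver, device, bound, sysfs_root="/sys"):
    """Bind or unbind `device` from `driver` through the driver-core sysfs files.

    Writing a device name to .../drivers/<driver>/unbind detaches it;
    writing it to .../drivers/<driver>/bind reattaches it.
    On a real system this requires root privileges.
    """
    action = "bind" if bound else "unbind"
    control = Path(sysfs_root) / "bus" / bus / "drivers" / driver / action
    control.write_text(device)
    return control
```

For instance, set_driver_binding("platform", "example-ctrl",
"10000000.controller", bound=False) would hand the (made-up) controller
away from Linux before the other kernel takes over, and the same call
with bound=True would take it back. As noted above, the driver's probe
path would then need to cope with hardware that is already initialized.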

2024-02-29 09:08:24

by Guan-Yu Lin

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Wed, Feb 28, 2024 at 1:57 AM Florian Fainelli wrote:
>
> On 2/27/24 00:56, Guan-Yu Lin wrote:
> > On Tue, Feb 27, 2024 at 2:40 AM Florian Fainelli wrote:
> >>
> >> On 2/26/24 02:28, Guan-Yu Lin wrote:
> >>> On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
> >>>>
> >>>> On 2/23/24 06:38, Guan-Yu Lin wrote:
> >>>>> In systems with a main processor and a co-processor, asynchronous
> >>>>> controller management can lead to conflicts. One example is the main
> >>>>> processor attempting to suspend a device while the co-processor is
> >>>>> actively using it. To address this, we introduce a new sysfs entry
> >>>>> called "conditional_skip". This entry allows the system to selectively
> >>>>> skip certain device power management state transitions. To use this
> >>>>> feature, set the value in "conditional_skip" to indicate the type of
> >>>>> state transition you want to avoid. Please review /Documentation/ABI/
> >>>>> testing/sysfs-devices-power for more detailed information.
> >>>>
> >>>> This looks like a poor way of dealing with a lack of adequate resource
> >>>> tracking from Linux on behalf of the co-processor(s) and I really do not
> >>>> understand how someone is supposed to use that in a way that works.
> >>>>
> >>>> Cannot you use a HW maintained spinlock between your host processor and
> >>>> the co-processor such that they can each claim exclusive access to the
> >>>> hardware and you can busy-wait until one or the other is done using the
> >>>> device? How is your partitioning between host processor owned blocks and
> >>>> co-processor(s) owned blocks? Is it static or is it dynamic?
> >>>> --
> >>>> Florian
> >>>>
> >>>
> >>> This patch enables devices to selectively participate in system power
> >>> transitions. This is crucial when multiple processors, managed by
> >>> different operating system kernels, share the same controller. One
> >>> processor shouldn't enforce the same power transition procedures on
> >>> the controller – another processor might be using it at that moment.
> >>> While a spinlock is necessary for synchronizing controller access, we
> >>> still need to add the flexibility to dynamically customize power
> >>> transition behavior for each device. And that's what this patch is
> >>> trying to do.
> >>> In our use case, the host processor and co-processor are managed by
> >>> separate operating system kernels. This arrangement is static.
> >>
> >> OK, so now the question is whether the peripheral is entirely visible to
> >> Linux, or is it entirely owned by the co-processor, or is there a
> >> combination of both and the usage of the said device driver is dynamic
> >> between Linux and your co-processor?
> >>
> >> A sysfs entry does not seem like the appropriate way to describe which
> >> states need to be skipped and which ones can remain under control of
> >> Linux, you would have to use your firmware's description for that (ACPI,
> >> Device Tree, etc.) such that you have a more comprehensive solution that
> >> can span a bigger scope.
> >> --
> >> Florian
> >>
> >
> > We anticipate that control of the peripheral (e.g., controller) will
> > be shared between operating system kernels. Each kernel will need its
> > own driver for peripheral communication. To accommodate different
> > tasks, the operating system managing the peripheral can change
> > dynamically at runtime.
>
> OK, that seems like this ought to be resolved at various layers other
> than just user-space, starting possibly with an
> overarching/reconciliation layer between the various operating systems?
>

We achieve cooperation between operating system kernels by assigning
interrupts to the corresponding kernels, and only one kernel can write
commands to the peripheral.

> >
> > We dynamically select the operating system kernel controlling the
> > target peripheral based on the task at hand, which looks more like a
> > software behavior rather than hardware behavior to me. I agree that we
> > might need a firmware description for "whether another operating
> > system exists for this peripheral", but we also need to store the
> > information about "whether another operating system is actively using
> > this peripheral". To me, the latter one looks more like a sysfs entry
> > rather than a firmware description as it's not determined statically.
>
> I can understand why moving this sort of decision to user-space might
> sound appealing, but it also seems like if the peripheral is going to be
> "stolen" away from Linux, then maybe Linux should not be managing it at
> all, e.g.: unbind the device from its driver, and then rebind it when
> Linux needs to use it. You would have to write your drivers such that
> they can skip the peripheral's initialization if you need to preserve
> state from the previous agent after an ownership change for instance?
>
> I do not think you are painting a full picture of your use case,
> hopefully not intentionally but at first glance it sounds like you need
> a combination of kernel-level changes to your drivers, and possibly more.
>
> Seems like more details need to be provided about the overall intended
> use cases such that people can guide you with a fuller picture of the
> use cases.
> --
> Florian
>

Let me introduce the scenario of our real-world use case. The
peripheral (controller) can issue multiple interrupts, which are
handled respectively by two operating system kernels (Linux and a
non-Linux). In addition, only one kernel can issue commands to the
peripheral. Although we have successfully distributed control of this
peripheral between the kernels, Linux's system power management still
applies power transition rules to the entire peripheral without
awareness of the other kernel's activity. In other words, the Linux
kernel has partial responsibility for the peripheral's functionality,
but its power management decisions affect the entire peripheral. This
can potentially interfere with the non-Linux kernel's operations.

We want to introduce a mechanism that allows the Linux kernel to make
power transitions for the peripheral based on whether the other
operating system kernel is actively using it. To achieve this, we
propose this patch that adds a sysfs attribute, providing the Linux
kernel with the necessary information.

2024-02-29 10:28:50

by Guan-Yu Lin

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Tue, Feb 27, 2024 at 5:15 PM Greg KH wrote:
>
> On Tue, Feb 27, 2024 at 04:56:00PM +0800, Guan-Yu Lin wrote:
> > On Tue, Feb 27, 2024 at 2:40 AM Florian Fainelli wrote:
> > >
> > > On 2/26/24 02:28, Guan-Yu Lin wrote:
> > > > On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli wrote:
> > > >>
> > > >> On 2/23/24 06:38, Guan-Yu Lin wrote:
> > > >>> In systems with a main processor and a co-processor, asynchronous
> > > >>> controller management can lead to conflicts. One example is the main
> > > >>> processor attempting to suspend a device while the co-processor is
> > > >>> actively using it. To address this, we introduce a new sysfs entry
> > > >>> called "conditional_skip". This entry allows the system to selectively
> > > >>> skip certain device power management state transitions. To use this
> > > >>> feature, set the value in "conditional_skip" to indicate the type of
> > > >>> state transition you want to avoid. Please review /Documentation/ABI/
> > > >>> testing/sysfs-devices-power for more detailed information.
> > > >>
> > > >> This looks like a poor way of dealing with a lack of adequate resource
> > > >> tracking from Linux on behalf of the co-processor(s) and I really do not
> > > >> understand how someone is supposed to use that in a way that works.
> > > >>
> > > >> Cannot you use a HW maintained spinlock between your host processor and
> > > >> the co-processor such that they can each claim exclusive access to the
> > > >> hardware and you can busy-wait until one or the other is done using the
> > > >> device? How is your partitioning between host processor owned blocks and
> > > >> co-processor(s) owned blocks? Is it static or is it dynamic?
> > > >> --
> > > >> Florian
> > > >>
> > > >
> > > > This patch enables devices to selectively participate in system power
> > > > transitions. This is crucial when multiple processors, managed by
> > > > different operating system kernels, share the same controller. One
> > > > processor shouldn't enforce the same power transition procedures on
> > > > the controller – another processor might be using it at that moment.
> > > > While a spinlock is necessary for synchronizing controller access, we
> > > > still need to add the flexibility to dynamically customize power
> > > > transition behavior for each device. And that's what this patch is
> > > > trying to do.
> > > > In our use case, the host processor and co-processor are managed by
> > > > separate operating system kernels. This arrangement is static.
> > >
> > > OK, so now the question is whether the peripheral is entirely visible to
> > > Linux, or is it entirely owned by the co-processor, or is there a
> > > combination of both and the usage of the said device driver is dynamic
> > > between Linux and your co-processor?
> > >
> > > A sysfs entry does not seem like the appropriate way to describe which
> > > states need to be skipped and which ones can remain under control of
> > > Linux, you would have to use your firmware's description for that (ACPI,
> > > Device Tree, etc.) such that you have a more comprehensive solution that
> > > can span a bigger scope.
> > > --
> > > Florian
> > >
> >
> > We anticipate that control of the peripheral (e.g., controller) will
> > be shared between operating system kernels. Each kernel will need its
> > own driver for peripheral communication. To accommodate different
> > tasks, the operating system managing the peripheral can change
> > dynamically at runtime.
>
> That sounds like a nightmare of control and handling, how are you going
> to do any of that? Where is the code for that?
>

Since the peripheral can issue different types of interrupts, we plan
to assign the handling of those interrupts to separate operating
system kernels. Additionally, only one operating system kernel will
have the privilege to issue commands to the peripheral. We think that
this could resolve potential conflicts.

> > We dynamically select the operating system kernel controlling the
> > target peripheral based on the task at hand, which looks more like a
> > software behavior rather than hardware behavior to me. I agree that we
> > might need a firmware description for "whether another operating
> > system exists for this peripheral", but we also need to store the
> > information about "whether another operating system is actively using
> > this peripheral". To me, the latter one looks more like a sysfs entry
> > rather than a firmware description as it's not determined statically.
>
> So you want to download different firmware to the device depending on
> "something". What is going to control that "something"? Is that coming
> from the kernel, or from userspace? If userspace, why is any of this an
> issue and just load whatever firmware you decide at that point in time?
> Why does the kernel care?
>
> confused,
>
> greg k-h

In our design, no single firmware can fully control or communicate
with the peripheral. Different functions of the peripheral are
supported by different operating system kernels. Therefore, we should
keep both firmwares active to use the peripheral effectively.

2024-02-29 20:36:30

by Greg Kroah-Hartman

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Thu, Feb 29, 2024 at 05:08:00PM +0800, Guan-Yu Lin wrote:
> We want to introduce a mechanism that allows the Linux kernel to make
> power transitions for the peripheral based on whether the other
> operating system kernel is actively using it. To achieve this, we
> propose this patch that adds a sysfs attribute, providing the Linux
> kernel with the necessary information.

Don't create random user/kernel apis in sysfs for no good reason just
because it is "easy" :(

If the "other operating system is actively using it" isn't able to be
detected by Linux, then Linux shouldn't be able to change the PM state,
so this sounds like you need to fix your Linux driver to properly know
this information, just like any other device type (think about a sound
device that needs to know if it is being used or not, nothing different
here.)
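
A minimal userspace sketch of that idea, for illustration only: the
driver itself tracks whether the other OS is using the device, and its
suspend callback refuses the transition while it is. The struct, the
field names, and mock_ctrl_suspend() are hypothetical and are not part
of any posted driver. How the driver would learn the in-use state
(mailbox, shared register, etc.) is left open here. In the real kernel,
an error returned from a device's suspend callback aborts the whole
system suspend, which is different from silently skipping one device.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-device state a driver might keep. */
struct mock_shared_ctrl {
	bool in_use_by_other_os;	/* assumed to be updated by the driver */
	bool suspended;
};

/*
 * Models a driver's system-suspend callback: refuse to suspend while
 * the other OS is using the hardware, otherwise mark it suspended.
 */
static int mock_ctrl_suspend(struct mock_shared_ctrl *ctrl)
{
	if (ctrl->in_use_by_other_os)
		return -EBUSY;	/* device stays powered */

	ctrl->suspended = true;
	return 0;
}
```

With this shape, the policy lives in the driver that knows the
hardware, and no new user/kernel interface is needed.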

So please post your Linux driver and we can see what needs to be done
there to get this to work properly, odds are you are just missing
something. Have a pointer to the code anywhere?

Also, as you know, we can NOT add interfaces to the kernel without any
real user, so without a driver for your hardware, none of this is able
to go anywhere at all, sorry.

thanks,

greg k-h

2024-03-08 18:05:31

by Guan-Yu Lin

Subject: Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

On Fri, Mar 1, 2024 at 4:34 AM Greg KH wrote:
>
> On Thu, Feb 29, 2024 at 05:08:00PM +0800, Guan-Yu Lin wrote:
> > We want to introduce a mechanism that allows the Linux kernel to make
> > power transitions for the peripheral based on whether the other
> > operating system kernel is actively using it. To achieve this, we
> > propose this patch that adds a sysfs attribute, providing the Linux
> > kernel with the necessary information.
>
> Don't create random user/kernel apis in sysfs for no good reason just
> because it is "easy" :(
>

We initially considered using sysfs because it could provide a
universal interface regardless of which operating system kernel shares
the device with the Linux kernel. This would allow users to modify the
feature through simple sysfs interactions. However, the current method
of using information in sysfs doesn't seem to integrate well with the
existing system power management framework. Could we refine how sysfs
is used in system power management to enable cross-kernel
communication? Alternatively, should we avoid exposing the information
of whether a device is used by multiple operating systems to user
space?

> If the "other operating system is actively using it" isn't able to be
> detected by Linux, then Linux shouldn't be able to change the PM state,
> so this sounds like you need to fix your Linux driver to properly know
> this information, just like any other device type (think about a sound
> device that needs to know if it is being used or not, nothing different
> here.)
>

I think the `usage_count` field in `struct dev_pm_info` records
whether the device is being used. Could we leverage this information
in the design? We could modify the device tree to record which devices
are shared across operating system kernels. Then, we could
conditionally skip system power management steps for those devices if
they're actively in use. We'll need to carefully consider potential
corner cases and assess any impact on runtime power management. If
this proposal seems worthwhile, I can prepare a more detailed v4 for
discussion.
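
To make the proposal concrete, here is a rough userspace model of the
decision, not kernel code. struct mock_device, the shared_with_other_os
flag (standing in for a not-yet-defined device tree property), and both
helper functions are hypothetical names chosen for illustration.
usage_count models dev->power.usage_count, which is an atomic_t in the
kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant parts of struct device. */
struct mock_device {
	const char *name;
	int usage_count;		/* models dev->power.usage_count */
	bool shared_with_other_os;	/* assumed to come from the device tree */
};

/* Skip system PM only for shared devices that are actively in use. */
static bool should_skip_system_pm(const struct mock_device *dev)
{
	return dev->shared_with_other_os && dev->usage_count > 0;
}

/* Walk the device list and return how many devices were suspended. */
static int suspend_devices(const struct mock_device *devs, int n)
{
	int suspended = 0;

	for (int i = 0; i < n; i++) {
		if (should_skip_system_pm(&devs[i])) {
			printf("skip %s\n", devs[i].name);
			continue;
		}
		printf("suspend %s\n", devs[i].name);
		suspended++;
	}
	return suspended;
}
```

A shared device with a zero usage count would still be suspended, and
a non-shared device would always follow the normal path regardless of
its usage count.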

> So please post your Linux driver and we can see what needs to be done
> there to get this to work properly, odds are you are just missing
> something. Have a pointer to the code anywhere?
>
> Also, as you know, we can NOT add interfaces to the kernel without any
> real user, so without a driver for your hardware, none of this is able
> to go anywhere at all, sorry.
>

The Linux device driver we're using here is the upstream xhci driver.
We describe only a subset of the controller's interrupts in the device
tree. This prevents the Linux kernel from handling the interrupts that
are assigned to the other operating system kernel. Consequently, the
controller can function normally even with two operating system
kernels accessing it, as long as we completely disable system power
management functionality.

> thanks,
>
> greg k-h