2019-12-15 21:03:23

by Martin Blumenstingl

[permalink] [raw]
Subject: [PATCH 0/1] clk: Meson8/8b/8m2: fix the mali clock flags

While playing with devfreq support for the lima driver I experienced
sporadic (random) system lockups. It turned out that this was in
certain cases when changing the mali clock.

The Amlogic vendor GPU platform driver (which is responsible for
changing the clock frequency) uses the following pattern when updating
the mali clock rate:
- at initialization: initialize the two mali_0 and mali_1 clock trees
with a default setting and enable both clocks
- when changing the clock frequency:
-- set HHI_MALI_CLK_CNTL[31] to temporarily use the mali_1 clock output
-- update the mali_0 clock tree (set the mux, divider, etc.)
-- clear HHI_MALI_CLK_CNTL[31] to temporarily use the mali_0 clock
output again

With the common clock framework we can even do better:
by setting CLK_SET_RATE_PARENT for the mali_0 and mali_1 output gates
we can force the common clock framework to update the "inactive" clock
and then switch to it's output.

I only tested this patch for a limited time only (approx. 2 hours).
So far I couldn't reproduce the sporadic system lockups with it.
However, broader testing would be great so I would like this to be
applied for -next.


Martin Blumenstingl (1):
clk: meson: meson8b: make the CCF use the glitch-free "mali" mux

drivers/clk/meson/meson8b.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--
2.24.1


2019-12-24 03:37:32

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 0/1] clk: Meson8/8b/8m2: fix the mali clock flags

Quoting Jerome Brunet (2019-12-16 11:17:21)
>
> On Mon 16 Dec 2019 at 18:50, Stephen Boyd <[email protected]> wrote:
>
> > Quoting Jerome Brunet (2019-12-16 01:13:31)
> >>
> >> *updated last* which crucial to your use case.
> >>
> >> I just wonder if this crucial part something CCF guarantee and you can
> >> rely on it ... or if it might break in the future.
> >>
> >> Stephen, any thoughts on this ?
> >
> > We have problems with the order in which we call the set_rate clk_op.
> > Sometimes clk providers want us to call from leaf to root but instead we
> > call from root to leaf because of implementation reasons. Controlling
> > the order in which clk operations are done is an unsolved problem. But
> > yes, in the future I'd like to see us introduce the vaporware that is
> > coordinated clk rates that would allow clk providers to decide what this
> > order should be, instead of having to do this "root-to-leaf" update.
> > Doing so would help us with the clk dividers that have some parent
> > changing rate that causes the downstream device to be overclocked while
> > we change the parent before the divider.
> >
> > If there are more assumptions like this about how the CCF is implemented
> > then we'll have to be extra careful to not disturb the "normal" order of
> > operations when introducing something that allows clk providers to
> > modify it.
>
> I understand that CCR would, in theory, allow to define that sort of
> details. Still defining (and documenting) the default behavior would be
> nice.
>
> So the question is:
> * Can we rely set_rate() doing a root-to-leaf update until CCR comes
> around ?
> * If not, for use cases like the one described by Martin, I guess we
> are stuck with the notifier ? Or would you have something else to
> propose ?

I suppose we should just state that clk_set_rate() should do a
root-to-leaf update. It's not like anyone is interested in changing
this behavior. The notifier is not ideal. I've wanted to add a new
clk_op that would cover some amount of the notifier users by having a
'pre_set_rate' clk op that can mux the clk over to something safe or
setup a divider to something that is known to be safe and work. Then we
can avoid having to register for a notifier just to do something right
before the root-to-leaf update happens.

>
> >
> > Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
> > isn't called on that particular clk? I seem to recall that the flag only
> > matters when it's applied to the "leaf" or entry point into the CCF from
> > a consumer API.
>
> It did but not anymore
>
> > I've wanted to fix that but never gotten around to it.
>
> I fixed that already :P
> CLK_SET_RATE_GATE is a special case of clock protect. The clock is
> protecting itself so it is going down through the tree.
>

Ahaha ok. As you can see I'm trying to forget clock protect ;-)


>
> > The whole flag sort of irks me because I don't understand what consumers
> > are supposed to do when this flag is set on a clk. How do they discover
> > it?
>
> Actually (ATM) the consumer is not even aware of it. If a clock with
> CLK_SET_RATE_GATE is enabled, it will return the current rate to
> .round_rate() and .set_rate() ... as if it was fixed.

And then when the clk is disabled it will magically "unstick" and start
to accept the same rate request again?

>
> > They're supposed to "just know" and turn off the clk first and then
> > call clk_set_rate()?
>
> ATM, yes ... if CCF cannot switch to another "unlocked" subtree (the
> case here)
>
> > Why can't the framework do this all in the clk_set_rate() call?
>
> When there is multiple consumers the behavior would become a bit
> difficult to predict and drivers may have troubles anticipating that,
> maybe, the clock is locked.

Fun times!

2019-12-26 09:07:35

by Jerome Brunet

[permalink] [raw]
Subject: Re: [PATCH 0/1] clk: Meson8/8b/8m2: fix the mali clock flags


On Tue 24 Dec 2019 at 04:36, Stephen Boyd <[email protected]> wrote:

> Quoting Jerome Brunet (2019-12-16 11:17:21)
>>
>> On Mon 16 Dec 2019 at 18:50, Stephen Boyd <[email protected]> wrote:
>>
>> > Quoting Jerome Brunet (2019-12-16 01:13:31)
>> >>
>> >> *updated last* which crucial to your use case.
>> >>
>> >> I just wonder if this crucial part something CCF guarantee and you can
>> >> rely on it ... or if it might break in the future.
>> >>
>> >> Stephen, any thoughts on this ?
>> >
>> > We have problems with the order in which we call the set_rate clk_op.
>> > Sometimes clk providers want us to call from leaf to root but instead we
>> > call from root to leaf because of implementation reasons. Controlling
>> > the order in which clk operations are done is an unsolved problem. But
>> > yes, in the future I'd like to see us introduce the vaporware that is
>> > coordinated clk rates that would allow clk providers to decide what this
>> > order should be, instead of having to do this "root-to-leaf" update.
>> > Doing so would help us with the clk dividers that have some parent
>> > changing rate that causes the downstream device to be overclocked while
>> > we change the parent before the divider.
>> >
>> > If there are more assumptions like this about how the CCF is implemented
>> > then we'll have to be extra careful to not disturb the "normal" order of
>> > operations when introducing something that allows clk providers to
>> > modify it.
>>
>> I understand that CCR would, in theory, allow to define that sort of
>> details. Still defining (and documenting) the default behavior would be
>> nice.
>>
>> So the question is:
>> * Can we rely set_rate() doing a root-to-leaf update until CCR comes
>> around ?
>> * If not, for use cases like the one described by Martin, I guess we
>> are stuck with the notifier ? Or would you have something else to
>> propose ?
>
> I suppose we should just state that clk_set_rate() should do a
> root-to-leaf update. It's not like anyone is interested in changing
> this behavior. The notifier is not ideal. I've wanted to add a new
> clk_op that would cover some amount of the notifier users by having a
> 'pre_set_rate' clk op that can mux the clk over to something safe or
> setup a divider to something that is known to be safe and work. Then we
> can avoid having to register for a notifier just to do something right
> before the root-to-leaf update happens.
>

Martin,

It looks like a green light to me ;) Just add a detailed comment on the
mali top clock explaining things and it should be alright.

>>
>> >
>> > Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
>> > isn't called on that particular clk? I seem to recall that the flag only
>> > matters when it's applied to the "leaf" or entry point into the CCF from
>> > a consumer API.
>>
>> It did but not anymore
>>
>> > I've wanted to fix that but never gotten around to it.
>>
>> I fixed that already :P
>> CLK_SET_RATE_GATE is a special case of clock protect. The clock is
>> protecting itself so it is going down through the tree.
>>
>
> Ahaha ok. As you can see I'm trying to forget clock protect ;-)
>
>
>>
>> > The whole flag sort of irks me because I don't understand what consumers
>> > are supposed to do when this flag is set on a clk. How do they discover
>> > it?
>>
>> Actually (ATM) the consumer is not even aware of it. If a clock with
>> CLK_SET_RATE_GATE is enabled, it will return the current rate to
>> .round_rate() and .set_rate() ... as if it was fixed.
>
> And then when the clk is disabled it will magically "unstick" and start
> to accept the same rate request again?
>

Exactly

>>
>> > They're supposed to "just know" and turn off the clk first and then
>> > call clk_set_rate()?
>>
>> ATM, yes ... if CCF cannot switch to another "unlocked" subtree (the
>> case here)
>>
>> > Why can't the framework do this all in the clk_set_rate() call?
>>
>> When there is multiple consumers the behavior would become a bit
>> difficult to predict and drivers may have troubles anticipating that,
>> maybe, the clock is locked.
>
> Fun times!