On 16/02/2021 17:39, [email protected] wrote:
> From: Vincent Donnefort <[email protected]>
>
> Being called for each dequeue, util_est reduces the number of its updates
> by filtering out when the EWMA signal is different from the task util_avg
> by less than 1%. It is a problem for a sudden util_avg ramp-up. Due to the
> decay from a previous high util_avg, EWMA might now be close enough to
> the new util_avg. No update would then happen while it would leave
> ue.enqueued with an out-of-date value.
  (1) enqueued[x-1] < ewma[x-1]
  (2) diff(enqueued[x], ewma[x]) < 1024/100 && enqueued[x] < ewma[x] (*)
      with ewma[x-1] == ewma[x]

  (*) enqueued[x] must still be less than ewma[x] with the default
      UTIL_EST_FASTUP. Otherwise we would already 'goto done' (write the new
      util_est) via the previous if condition.
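
To make the two conditions concrete, here is a rough user-space sketch of
the filtering, not the actual kernel code. struct ue_model,
ue_model_update() and the check_enqueued flag are made up for
illustration, and the 1/4 EWMA weight is an assumption. The margin is
1024/100, which is 10 with integer division.

```c
#include <stdbool.h>
#include <stdlib.h>

/* 1% of the 1024 capacity scale; integer division gives 10. */
#define MARGIN (1024 / 100)

/* Hypothetical stand-in for the two util_est members. */
struct ue_model {
	long enqueued;
	long ewma;
};

/* True when |diff| is below the margin. */
static bool within_margin(long diff)
{
	return labs(diff) < MARGIN;
}

/*
 * Model of one dequeue-time update. Returns true when *stored is written.
 * check_enqueued == false: filter on the EWMA difference only.
 * check_enqueued == true:  also require the enqueued difference to be
 *                          small before skipping the write.
 */
static bool ue_model_update(struct ue_model *stored, long util_avg,
			    bool check_enqueued)
{
	struct ue_model ue = *stored;
	long ewma_diff, enqueued_diff;

	enqueued_diff = stored->enqueued - util_avg;
	ue.enqueued = util_avg;

	/* Fast ramp-up (UTIL_EST_FASTUP behaviour): EWMA jumps up at once. */
	if (ue.ewma < ue.enqueued) {
		ue.ewma = ue.enqueued;
		goto done;
	}

	ewma_diff = ue.enqueued - ue.ewma;
	if (within_margin(ewma_diff)) {
		if (check_enqueued && !within_margin(enqueued_diff))
			goto done;	/* refresh enqueued, keep ewma as is */
		return false;		/* skipped: stored->enqueued may be stale */
	}

	/* Assumed EWMA weight of 1/4 for the new sample. */
	ue.ewma = (ue.ewma * 4 + ewma_diff) / 4;
done:
	*stored = ue;
	return true;
}
```

Starting from enqueued == 100 and ewma == 400 (case (1)) and a new
util_avg of 395 (case (2)), the EWMA-only filter skips the write and
leaves enqueued at 100. With the extra enqueued check, enqueued is
refreshed to 395 while ewma stays at 400.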
>
> Taking into consideration the two util_est members, EWMA and enqueued for
> the filtering, ensures, for both, an up-to-date value.
>
> This is for now an issue only for the trace probe that might return the
> stale value. Functional-wise, it isn't (yet) a problem, as the value is
> always accessed through max(enqueued, ewma).
Yeah, I remember that the ue.enqueued plots looked weird in these
sections with stale ue.enqueued values.
> This problem has been observed using LISA's UtilConvergence:test_means on
> the sd845c board.
I ran the test a couple of times on my juno board and I never hit this
path (util_est_within_margin(last_ewma_diff) &&
!util_est_within_margin(last_enqueued_diff)) for a test task.
I can't see how this issue can be board-specific. Does it happen
reliably on sd845c, or does it only happen very occasionally?
I saw it a couple of times, but always with (non-test) tasks migrating
from one CPU to another.
> Signed-off-by: Vincent Donnefort <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
[...]