Changes to hrtimer mode (potentially made by __hrtimer_init_sleeper on
PREEMPT_RT) are not visible to hrtimer_start_range_ns, thus not
accounted for by hrtimer_start_expires call paths. In particular,
__wait_event_hrtimeout suffers from this problem as we have, for
example:
  fs/aio.c::read_events
    wait_event_interruptible_hrtimeout
      __wait_event_hrtimeout
        hrtimer_init_sleeper_on_stack <- this might "mode |= HRTIMER_MODE_HARD"
                                         on RT if task runs at RT/DL priority
        hrtimer_start_range_ns
          WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard)
          fires since the latter doesn't see the change of mode done by
          init_sleeper
Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires,
which is aware of the special RT/DL case, instead of hrtimer_start_range_ns.
Cc: Sebastian Andrzej Siewior <[email protected]>
Reported-by: Bruno Goncalves <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
---
This continues the discussion at
https://lore.kernel.org/lkml/[email protected]/
"[RT] WARNING at hrtimer_start_range_ns"
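
For readers following the call chain above, here is a minimal user-space
sketch of the mismatch. It is not kernel code: the model_* names, the flag
values and the rt_task parameter are made up for illustration. Only the
shape of the check (the XOR from the WARN_ON_ONCE) and the idea that the
sleeper-aware start path takes the HARD bit from the timer itself reflect
the description above.

```c
#include <stdbool.h>

/* Model-only flag values, chosen for this sketch (assumption). */
enum model_mode {
	MODEL_MODE_REL  = 0x01,
	MODEL_MODE_HARD = 0x08,
};

/* Hypothetical stand-in for the timer state that matters here. */
struct model_timer {
	bool is_hard;
};

/*
 * Models the init step: for an RT/DL task on RT the mode is silently
 * upgraded to HARD, and the result is recorded in the timer.
 */
static void model_init_sleeper(struct model_timer *t, unsigned int mode,
			       bool rt_task)
{
	if (rt_task)
		mode |= MODEL_MODE_HARD;
	t->is_hard = (mode & MODEL_MODE_HARD) != 0;
}

/*
 * Models the consistency check: returns true when the caller-supplied
 * mode and the timer state disagree about HARD, i.e. when the warning
 * would fire.
 */
static bool model_start_would_warn(const struct model_timer *t,
				   unsigned int mode)
{
	return (!(mode & MODEL_MODE_HARD)) ^ (!t->is_hard);
}

/*
 * Models the sleeper-aware start path: the HARD bit is taken from the
 * timer before the same check runs, so the two can no longer disagree.
 */
static bool model_sleeper_start_would_warn(const struct model_timer *t,
					   unsigned int mode)
{
	if (t->is_hard)
		mode |= MODEL_MODE_HARD;
	return model_start_would_warn(t, mode);
}
```

In this model, initialising with MODEL_MODE_REL and rt_task set, then
calling model_start_would_warn() with plain MODEL_MODE_REL reports a
mismatch, while model_sleeper_start_would_warn() with the same arguments
does not.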
---
include/linux/wait.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 851e07da2583..58cfbf81447c 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -544,10 +544,11 @@ do { \
 										\
 	hrtimer_init_sleeper_on_stack(&__t, CLOCK_MONOTONIC,			\
 				      HRTIMER_MODE_REL);			\
-	if ((timeout) != KTIME_MAX)						\
-		hrtimer_start_range_ns(&__t.timer, timeout,			\
-				       current->timer_slack_ns,			\
-				       HRTIMER_MODE_REL);			\
+	if ((timeout) != KTIME_MAX) {						\
+		hrtimer_set_expires_range_ns(&__t.timer, timeout,		\
+					     current->timer_slack_ns);		\
+		hrtimer_sleeper_start_expires(&__t, HRTIMER_MODE_REL);		\
+	}									\
 										\
 	__ret = ___wait_event(wq_head, condition, state, 0, 0,			\
 		if (!__t.task) {						\
--
2.36.1
On 27/06/22 11:50, Juri Lelli wrote:
> [...]
> Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires,
> which is aware of the special RT/DL case, instead of hrtimer_start_range_ns.
>
> Cc: Sebastian Andrzej Siewior <[email protected]>
> Reported-by: Bruno Goncalves <[email protected]>
> Signed-off-by: Juri Lelli <[email protected]>
Makes sense, that's now aligned with what e.g.
schedule_hrtimer_range_clock() does.
Reviewed-by: Valentin Schneider <[email protected]>
On 05/07/22 09:41, Valentin Schneider wrote:
>
> Makes sense, that's now aligned with what e.g.
> schedule_hrtimer_range_clock() does.
>
> Reviewed-by: Valentin Schneider <[email protected]>
Thanks!
Gentle ping to the others about this one.
Best,
Juri
On 6/27/22 11:50, Juri Lelli wrote:
> [...]
> Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires,
> which is aware of the special RT/DL case, instead of hrtimer_start_range_ns.
>
> Cc: Sebastian Andrzej Siewior <[email protected]>
> Reported-by: Bruno Goncalves <[email protected]>
> Signed-off-by: Juri Lelli <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
-- Daniel
The following commit has been merged into the timers/core branch of tip:
Commit-ID: cceeeb6a6d02e7b9a74ddd27a3225013b34174aa
Gitweb: https://git.kernel.org/tip/cceeeb6a6d02e7b9a74ddd27a3225013b34174aa
Author: Juri Lelli <[email protected]>
AuthorDate: Mon, 27 Jun 2022 11:50:51 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 28 Jul 2022 12:35:12 +02:00
wait: Fix __wait_event_hrtimeout for RT/DL tasks
Changes to hrtimer mode (potentially made by __hrtimer_init_sleeper on
PREEMPT_RT) are not visible to hrtimer_start_range_ns, thus not
accounted for by hrtimer_start_expires call paths. In particular,
__wait_event_hrtimeout suffers from this problem as we have, for
example:
  fs/aio.c::read_events
    wait_event_interruptible_hrtimeout
      __wait_event_hrtimeout
        hrtimer_init_sleeper_on_stack <- this might "mode |= HRTIMER_MODE_HARD"
                                         on RT if task runs at RT/DL priority
        hrtimer_start_range_ns
          WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard)
          fires since the latter doesn't see the change of mode done by
          init_sleeper
Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires,
which is aware of the special RT/DL case, instead of hrtimer_start_range_ns.
Reported-by: Bruno Goncalves <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/linux/wait.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 851e07d..58cfbf8 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -544,10 +544,11 @@ do { \
 										\
 	hrtimer_init_sleeper_on_stack(&__t, CLOCK_MONOTONIC,			\
 				      HRTIMER_MODE_REL);			\
-	if ((timeout) != KTIME_MAX)						\
-		hrtimer_start_range_ns(&__t.timer, timeout,			\
-				       current->timer_slack_ns,			\
-				       HRTIMER_MODE_REL);			\
+	if ((timeout) != KTIME_MAX) {						\
+		hrtimer_set_expires_range_ns(&__t.timer, timeout,		\
+					     current->timer_slack_ns);		\
+		hrtimer_sleeper_start_expires(&__t, HRTIMER_MODE_REL);		\
+	}									\
 										\
 	__ret = ___wait_event(wq_head, condition, state, 0, 0,			\
 		if (!__t.task) {						\