2014-07-09 06:55:50

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 0/7] hrtimer: drop active hrtimer checks after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

First patch adds a WARN_ON_ONCE() to __hrtimer_start_range_ns() to make sure
hrtimers are always enqueued from it. Next 6 patches update several parts of
kernel to drop calls to hrtimer_active() after starting a hrtimer.

Rebased over 3.16-rc4 and pushed here:
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git hrtimer/drop-hrtimer-active-calls

Cc: Darren Hart <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Peter Zijlstra <[email protected]>

Viresh Kumar (7):
hrtimer: Warn if hrtimer_start*() failed to enqueue hrtimer
hrtimer: don't check for active hrtimer after adding it
tick: don't check for active hrtimer after adding it
sched: don't check for active hrtimer after adding it
futex: don't check for active hrtimer after adding it
rtmutex: don't check for active hrtimer after adding it
net: don't check for active hrtimer after adding it

kernel/futex.c | 5 +----
kernel/hrtimer.c | 6 ++----
kernel/locking/rtmutex.c | 5 +----
kernel/sched/core.c | 20 +++++++++-----------
kernel/sched/deadline.c | 2 +-
kernel/time/tick-sched.c | 45 ++++++++++++++++++---------------------------
net/core/pktgen.c | 2 --
7 files changed, 32 insertions(+), 53 deletions(-)

--
2.0.0.rc2


2014-07-09 06:55:58

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 1/7] hrtimer: Warn if hrtimer_start*() failed to enqueue hrtimer

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in kernel we check if hrtimer is enqueued properly with
hrtimer_active(). This isn't required and can be dropped.

Before fixing that, lets make sure hrtimer is always enqueued properly by adding

WARN_ON_ONCE(!hrtimer_active(timer));

towards the end of __hrtimer_start_range_ns().

Suggested-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
---
kernel/hrtimer.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 3ab2899..cf40209 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1037,6 +1037,8 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,

unlock_hrtimer_base(timer, &flags);

+ /* Make sure timer is enqueued */
+ WARN_ON_ONCE(!hrtimer_active(timer));
return ret;
}
EXPORT_SYMBOL_GPL(__hrtimer_start_range_ns);
--
2.0.0.rc2

2014-07-09 06:56:07

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 2/7] hrtimer: don't check for active hrtimer after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

This patch updates hrtimer core to get this fixed.

Signed-off-by: Viresh Kumar <[email protected]>
---
kernel/hrtimer.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index cf40209..a76f962 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1555,8 +1555,6 @@ static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum hrtimer_mode mod
do {
set_current_state(TASK_INTERRUPTIBLE);
hrtimer_start_expires(&t->timer, mode);
- if (!hrtimer_active(&t->timer))
- t->task = NULL;

if (likely(t->task))
freezable_schedule();
@@ -1837,8 +1835,6 @@ schedule_hrtimeout_range_clock(ktime_t *expires, unsigned long delta,
hrtimer_init_sleeper(&t, current);

hrtimer_start_expires(&t.timer, mode);
- if (!hrtimer_active(&t.timer))
- t.task = NULL;

if (likely(t.task))
schedule();
--
2.0.0.rc2

2014-07-09 06:56:12

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 3/7] tick: don't check for active hrtimer after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

This is done in while loop at several places, which isn't required anymore.

This patch updates tick core to get this fixed. This also removes the confusing
comment present over usage of hrtimer_active().

- /* Check, if the timer was already in the past */

This comment says that hrtimer_active() would check if timer was already in the
past, but hrtimer_active() can't check it. It just checks if timer is in
INACTIVE state or not.

Signed-off-by: Viresh Kumar <[email protected]>
---
kernel/time/tick-sched.c | 45 ++++++++++++++++++---------------------------
1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..66ca5ab 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -658,9 +658,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
hrtimer_start(&ts->sched_timer, expires,
HRTIMER_MODE_ABS_PINNED);
- /* Check, if the timer was already in the past */
- if (hrtimer_active(&ts->sched_timer))
- goto out;
+ goto out;
} else if (!tick_program_event(expires, 0))
goto out;
/*
@@ -844,24 +842,25 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
hrtimer_cancel(&ts->sched_timer);
hrtimer_set_expires(&ts->sched_timer, ts->last_tick);

- while (1) {
- /* Forward the time to expire in the future */
- hrtimer_forward(&ts->sched_timer, now, tick_period);
+ /* Forward the time to expire in the future */
+ hrtimer_forward(&ts->sched_timer, now, tick_period);

- if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
- hrtimer_start_expires(&ts->sched_timer,
- HRTIMER_MODE_ABS_PINNED);
- /* Check, if the timer was already in the past */
- if (hrtimer_active(&ts->sched_timer))
- break;
- } else {
- if (!tick_program_event(
- hrtimer_get_expires(&ts->sched_timer), 0))
- break;
- }
+ if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
+ hrtimer_start_expires(&ts->sched_timer,
+ HRTIMER_MODE_ABS_PINNED);
+ return;
+ }
+
+ while (1) {
+ if (!tick_program_event(hrtimer_get_expires(&ts->sched_timer),
+ 0))
+ break;
/* Reread time and update jiffies */
now = ktime_get();
tick_do_update_jiffies64(now);
+
+ /* Forward the time to expire in the future */
+ hrtimer_forward(&ts->sched_timer, now, tick_period);
}
}

@@ -1104,7 +1103,6 @@ early_param("skew_tick", skew_tick);
void tick_setup_sched_timer(void)
{
struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
- ktime_t now = ktime_get();

/*
* Emulate tick processing via per-CPU hrtimers:
@@ -1123,15 +1121,8 @@ void tick_setup_sched_timer(void)
hrtimer_add_expires_ns(&ts->sched_timer, offset);
}

- for (;;) {
- hrtimer_forward(&ts->sched_timer, now, tick_period);
- hrtimer_start_expires(&ts->sched_timer,
- HRTIMER_MODE_ABS_PINNED);
- /* Check, if the timer was already in the past */
- if (hrtimer_active(&ts->sched_timer))
- break;
- now = ktime_get();
- }
+ hrtimer_forward(&ts->sched_timer, ktime_get(), tick_period);
+ hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);

#ifdef CONFIG_NO_HZ_COMMON
if (tick_nohz_enabled) {
--
2.0.0.rc2

2014-07-09 06:56:22

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 4/7] sched: don't check for active hrtimer after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

Also this is done in while loop at several places, which isn't required anymore.

This patch updates sched core to get this fixed.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
---
kernel/sched/core.c | 20 +++++++++-----------
kernel/sched/deadline.c | 2 +-
2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3bdf01b..4a6ef8d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -111,19 +111,17 @@ void start_bandwidth_timer(struct hrtimer *period_timer, ktime_t period)
unsigned long delta;
ktime_t soft, hard, now;

- for (;;) {
- if (hrtimer_active(period_timer))
- break;
+ if (hrtimer_active(period_timer))
+ return;

- now = hrtimer_cb_get_time(period_timer);
- hrtimer_forward(period_timer, now, period);
+ now = hrtimer_cb_get_time(period_timer);
+ hrtimer_forward(period_timer, now, period);

- soft = hrtimer_get_softexpires(period_timer);
- hard = hrtimer_get_expires(period_timer);
- delta = ktime_to_ns(ktime_sub(hard, soft));
- __hrtimer_start_range_ns(period_timer, soft, delta,
- HRTIMER_MODE_ABS_PINNED, 0);
- }
+ soft = hrtimer_get_softexpires(period_timer);
+ hard = hrtimer_get_expires(period_timer);
+ delta = ktime_to_ns(ktime_sub(hard, soft));
+ __hrtimer_start_range_ns(period_timer, soft, delta,
+ HRTIMER_MODE_ABS_PINNED, 0);
}

DEFINE_MUTEX(sched_domains_mutex);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fc4f98b1..eeb6786 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -484,7 +484,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
__hrtimer_start_range_ns(&dl_se->dl_timer, soft,
range, HRTIMER_MODE_ABS, 0);

- return hrtimer_active(&dl_se->dl_timer);
+ return 1;
}

/*
--
2.0.0.rc2

2014-07-09 06:56:26

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 5/7] futex: don't check for active hrtimer after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

This patch updates futex core to get this fixed.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Darren Hart <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
---
kernel/futex.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index b632b5f..4847efc 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2083,11 +2083,8 @@ static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q *q,
queue_me(q, hb);

/* Arm the timer */
- if (timeout) {
+ if (timeout)
hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
- if (!hrtimer_active(&timeout->timer))
- timeout->task = NULL;
- }

/*
* If we have been removed from the hash list, then another task
--
2.0.0.rc2

2014-07-09 06:56:35

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 6/7] rtmutex: don't check for active hrtimer after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

This patch updates rtmutex core to get this fixed.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
---
kernel/locking/rtmutex.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index fc60594..9ea8830 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -922,11 +922,8 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
set_current_state(state);

/* Setup the timer, when timeout != NULL */
- if (unlikely(timeout)) {
+ if (unlikely(timeout))
hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
- if (!hrtimer_active(&timeout->timer))
- timeout->task = NULL;
- }

ret = task_blocks_on_rt_mutex(lock, &waiter, current, detect_deadlock);

--
2.0.0.rc2

2014-07-09 06:56:42

by Viresh Kumar

[permalink] [raw]
Subject: [RFC 7/7] net: don't check for active hrtimer after adding it

hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
only special case is when the hrtimer was in past. If it is getting enqueued to
local CPUs's clock-base, we raise a softirq and exit, else we handle that on
next interrupt on remote CPU.

At several places in the kernel, we try to make sure if hrtimer was added
properly or not by calling hrtimer_active(), like:

hrtimer_start(timer, expires, mode);
if (hrtimer_active(timer)) {
/* Added successfully */
} else {
/* Was added in the past */
}

As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
So, there is no point calling hrtimer_active().

This patch updates net core to get this fixed.

Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Signed-off-by: Viresh Kumar <[email protected]>
---
net/core/pktgen.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index fc17a9d..f911acd 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2186,8 +2186,6 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
do {
set_current_state(TASK_INTERRUPTIBLE);
hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);
- if (!hrtimer_active(&t.timer))
- t.task = NULL;

if (likely(t.task))
schedule();
--
2.0.0.rc2

2014-07-09 10:33:04

by Chris Redpath

[permalink] [raw]
Subject: Re: [RFC 7/7] net: don't check for active hrtimer after adding it

Hi Viresh,

On 09/07/14 07:55, Viresh Kumar wrote:
> hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
> only special case is when the hrtimer was in past. If it is getting enqueued to
> local CPUs's clock-base, we raise a softirq and exit, else we handle that on
> next interrupt on remote CPU.
>
> At several places in the kernel, we try to make sure if hrtimer was added
> properly or not by calling hrtimer_active(), like:
>
> hrtimer_start(timer, expires, mode);
> if (hrtimer_active(timer)) {
> /* Added successfully */
> } else {
> /* Was added in the past */
> }
>
> As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
> So, there is no point calling hrtimer_active().
>
> This patch updates net core to get this fixed.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: [email protected]
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> net/core/pktgen.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index fc17a9d..f911acd 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -2186,8 +2186,6 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
> do {
> set_current_state(TASK_INTERRUPTIBLE);
> hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);
> - if (!hrtimer_active(&t.timer))
> - t.task = NULL;
>
> if (likely(t.task))
> schedule();

I think this if condition can also be removed. hrtimer_init_sleeper
copies the supplied task_struct * to the timer, which in this case is
'current'. The check is likely to be there in case of !active case you
removed.

>

--Chris

2014-07-09 10:34:17

by Chris Redpath

[permalink] [raw]
Subject: Re: [RFC 2/7] hrtimer: don't check for active hrtimer after adding it

Hi Viresh,

On 09/07/14 07:55, Viresh Kumar wrote:
> hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
> only special case is when the hrtimer was in past. If it is getting enqueued to
> local CPUs's clock-base, we raise a softirq and exit, else we handle that on
> next interrupt on remote CPU.
>
> At several places in the kernel, we try to make sure if hrtimer was added
> properly or not by calling hrtimer_active(), like:
>
> hrtimer_start(timer, expires, mode);
> if (hrtimer_active(timer)) {
> /* Added successfully */
> } else {
> /* Was added in the past */
> }
>
> As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
> So, there is no point calling hrtimer_active().
>
> This patch updates hrtimer core to get this fixed.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> kernel/hrtimer.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index cf40209..a76f962 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1555,8 +1555,6 @@ static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum hrtimer_mode mod
> do {
> set_current_state(TASK_INTERRUPTIBLE);
> hrtimer_start_expires(&t->timer, mode);
> - if (!hrtimer_active(&t->timer))
> - t->task = NULL;
>
> if (likely(t->task))
> freezable_schedule();
> @@ -1837,8 +1835,6 @@ schedule_hrtimeout_range_clock(ktime_t *expires, unsigned long delta,
> hrtimer_init_sleeper(&t, current);
>
> hrtimer_start_expires(&t.timer, mode);
> - if (!hrtimer_active(&t.timer))
> - t.task = NULL;
>
> if (likely(t.task))
> schedule();
>

Maybe safe to remove this if condition too.

2014-07-09 10:44:45

by Viresh Kumar

[permalink] [raw]
Subject: Re: [RFC 7/7] net: don't check for active hrtimer after adding it

Hi Chris,

On 9 July 2014 16:02, Chris Redpath <[email protected]> wrote:

>> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
>> index fc17a9d..f911acd 100644
>> --- a/net/core/pktgen.c
>> +++ b/net/core/pktgen.c
>> @@ -2186,8 +2186,6 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t
>> spin_until)
>> do {
>> set_current_state(TASK_INTERRUPTIBLE);
>> hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);
>> - if (!hrtimer_active(&t.timer))
>> - t.task = NULL;
>>
>> if (likely(t.task))
>> schedule();
>
>
> I think this if condition can also be removed. hrtimer_init_sleeper copies
> the supplied task_struct * to the timer, which in this case is 'current'.
> The check is likely to be there in case of !active case you removed.

Yeah, it looks like we can get rid of this. Also,

} while (t.task && pkt_dev->running && !signal_pending(current));

is present in the closing "}" of do-while loop and probably we
don't need to check t.task here as well.

And this review comment applies to patch 2/7 as well:
hrtimer: don't check for active hrtimer after adding it

I would still wait for somebody to prove us wrong :), and will resend
it next week only.

Thanks.

2014-07-09 10:48:22

by Chris Redpath

[permalink] [raw]
Subject: Re: [RFC 7/7] net: don't check for active hrtimer after adding it

On 09/07/14 11:44, Viresh Kumar wrote:
> Hi Chris,
>
> On 9 July 2014 16:02, Chris Redpath <[email protected]> wrote:
>
>>> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
>>> index fc17a9d..f911acd 100644
>>> --- a/net/core/pktgen.c
>>> +++ b/net/core/pktgen.c
>>> @@ -2186,8 +2186,6 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t
>>> spin_until)
>>> do {
>>> set_current_state(TASK_INTERRUPTIBLE);
>>> hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);
>>> - if (!hrtimer_active(&t.timer))
>>> - t.task = NULL;
>>>
>>> if (likely(t.task))
>>> schedule();
>>
>>
>> I think this if condition can also be removed. hrtimer_init_sleeper copies
>> the supplied task_struct * to the timer, which in this case is 'current'.
>> The check is likely to be there in case of !active case you removed.
>
> Yeah, it looks like we can get rid of this. Also,
>
> } while (t.task && pkt_dev->running && !signal_pending(current));
>
> is present in the closing "}" of do-while loop and probably we
> don't need to check t.task here as well.
>
> And this review comment applies to patch 2/7 as well:
> hrtimer: don't check for active hrtimer after adding it
>
> I would still wait for somebody to prove us wrong :), and will resend
> it next week only.
>
> Thanks.
>

Yeah, no worries. I just happened to read it and not knowing any of the
APIs had to look up what is going on.

BTW, I *will* get back to you about that broadcast stuff when I get back
to it myself. Other priorities at the moment again.

--Chris

2014-07-09 15:23:54

by Viresh Kumar

[permalink] [raw]
Subject: Re: [RFC 7/7] net: don't check for active hrtimer after adding it

On 9 July 2014 16:14, Viresh Kumar <[email protected]> wrote:
> Yeah, it looks like we can get rid of this. Also,
>
> } while (t.task && pkt_dev->running && !signal_pending(current));
>
> is present in the closing "}" of do-while loop and probably we
> don't need to check t.task here as well.

Actually No. t.task is modified from hrtimer handler and so this check
would stay:

Diff I have added to this patch:

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index f911acd..cc2694e 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2187,8 +2187,7 @@ static void spin(struct pktgen_dev *pkt_dev,
ktime_t spin_until)
set_current_state(TASK_INTERRUPTIBLE);
hrtimer_start_expires(&t.timer, HRTIMER_MODE_ABS);

- if (likely(t.task))
- schedule();
+ schedule();

hrtimer_cancel(&t.timer);
} while (t.task && pkt_dev->running &&
!signal_pending(current));

2014-07-09 15:25:11

by Viresh Kumar

[permalink] [raw]
Subject: Re: [RFC 2/7] hrtimer: don't check for active hrtimer after adding it

On 9 July 2014 16:04, Chris Redpath <[email protected]> wrote:
> On 09/07/14 07:55, Viresh Kumar wrote:
>> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
>> index cf40209..a76f962 100644
>> --- a/kernel/hrtimer.c
>> +++ b/kernel/hrtimer.c
>> @@ -1555,8 +1555,6 @@ static int __sched do_nanosleep(struct
>> hrtimer_sleeper *t, enum hrtimer_mode mod
>> do {
>> set_current_state(TASK_INTERRUPTIBLE);
>> hrtimer_start_expires(&t->timer, mode);
>> - if (!hrtimer_active(&t->timer))
>> - t->task = NULL;
>>
>> if (likely(t->task))
>> freezable_schedule();
>> @@ -1837,8 +1835,6 @@ schedule_hrtimeout_range_clock(ktime_t *expires,
>> unsigned long delta,
>> hrtimer_init_sleeper(&t, current);
>>
>> hrtimer_start_expires(&t.timer, mode);
>> - if (!hrtimer_active(&t.timer))
>> - t.task = NULL;
>>
>> if (likely(t.task))
>> schedule();
>>
>
> Maybe safe to remove this if condition too.


Included following diff to this patch:

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a76f962..ae7b5cf 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1556,8 +1556,7 @@ static int __sched do_nanosleep(struct
hrtimer_sleeper *t, enum hrtimer_mode mod
set_current_state(TASK_INTERRUPTIBLE);
hrtimer_start_expires(&t->timer, mode);

- if (likely(t->task))
- freezable_schedule();
+ freezable_schedule();

hrtimer_cancel(&t->timer);
mode = HRTIMER_MODE_ABS;
@@ -1836,8 +1835,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires,
unsigned long delta,

hrtimer_start_expires(&t.timer, mode);

- if (likely(t.task))
- schedule();
+ schedule();

hrtimer_cancel(&t.timer);
destroy_hrtimer_on_stack(&t.timer);



Latest changes are pushed to my branch in case someone is looking to
test them.

2014-07-09 21:31:04

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC 0/7] hrtimer: drop active hrtimer checks after adding it

On Wed, 9 Jul 2014, Viresh Kumar wrote:

So your patch series drops active hrtimer checks after adding it,
according to your subject line.

Quite useeul to drop something after adding it, right?

> hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
> only special case is when the hrtimer was in past. If it is getting enqueued to
> local CPUs's clock-base, we raise a softirq and exit, else we handle that on
> next interrupt on remote CPU.
>
> At several places in the kernel, we try to make sure if hrtimer was added
> properly or not by calling hrtimer_active(), like:
>
> hrtimer_start(timer, expires, mode);
> if (hrtimer_active(timer)) {
> /* Added successfully */
> } else {
> /* Was added in the past */
> }
>
> As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
> So, there is no point calling hrtimer_active().

Wrong as usual.

It's a common pattern that short timeouts are given which lead to
immediate expiry so the extra round through schedule is even more
pointless than the extra check.

Aside of that it's a long discussed issue that we really should tell
the caller right away that the timer was setup in the past and not
enqueued at all.

That requires to fixup a few call sites, but that'd far more valuable
than removing a few assumed to be pointless checks.

Thnaks,

tglx

2014-07-09 22:21:44

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [RFC 1/7] hrtimer: Warn if hrtimer_start*() failed to enqueue hrtimer

On Wed, Jul 09, 2014 at 12:25:33PM +0530, Viresh Kumar wrote:
> hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
> only special case is when the hrtimer was in past. If it is getting enqueued to
> local CPUs's clock-base, we raise a softirq and exit, else we handle that on
> next interrupt on remote CPU.
>
> At several places in kernel we check if hrtimer is enqueued properly with
> hrtimer_active(). This isn't required and can be dropped.
>
> Before fixing that, lets make sure hrtimer is always enqueued properly by adding
>
> WARN_ON_ONCE(!hrtimer_active(timer));
>
> towards the end of __hrtimer_start_range_ns().
>
> Suggested-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> kernel/hrtimer.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 3ab2899..cf40209 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1037,6 +1037,8 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
>
> unlock_hrtimer_base(timer, &flags);
>
> + /* Make sure timer is enqueued */
> + WARN_ON_ONCE(!hrtimer_active(timer));

Hmm, after reading Thomas reply, I think it's possible that the hrtimer expires
right after we unlock it and, if we are unlucky enough, before the hrtimer_active()
check.

In this case we might hit a false positive.

> return ret;
> }
> EXPORT_SYMBOL_GPL(__hrtimer_start_range_ns);
> --
> 2.0.0.rc2
>

2014-07-09 23:58:26

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC 1/7] hrtimer: Warn if hrtimer_start*() failed to enqueue hrtimer

On Thu, 10 Jul 2014, Frederic Weisbecker wrote:

> On Wed, Jul 09, 2014 at 12:25:33PM +0530, Viresh Kumar wrote:
> > hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
> > only special case is when the hrtimer was in past. If it is getting enqueued to
> > local CPUs's clock-base, we raise a softirq and exit, else we handle that on
> > next interrupt on remote CPU.
> >
> > At several places in kernel we check if hrtimer is enqueued properly with
> > hrtimer_active(). This isn't required and can be dropped.
> >
> > Before fixing that, lets make sure hrtimer is always enqueued properly by adding
> >
> > WARN_ON_ONCE(!hrtimer_active(timer));
> >
> > towards the end of __hrtimer_start_range_ns().
> >
> > Suggested-by: Frederic Weisbecker <[email protected]>
> > Signed-off-by: Viresh Kumar <[email protected]>
> > ---
> > kernel/hrtimer.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> > index 3ab2899..cf40209 100644
> > --- a/kernel/hrtimer.c
> > +++ b/kernel/hrtimer.c
> > @@ -1037,6 +1037,8 @@ int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
> >
> > unlock_hrtimer_base(timer, &flags);
> >
> > + /* Make sure timer is enqueued */
> > + WARN_ON_ONCE(!hrtimer_active(timer));
>
> Hmm, after reading Thomas reply, I think it's possible that the hrtimer expires
> right after we unlock it and, if we are unlucky enough, before the hrtimer_active()
> check.
>
> In this case we might hit a false positive.

Haha, I didn't even go down to this patch after reading 0/N. I knew
right there that it's going to be pointless shite.

But now that you point me to it, it's even worse. It's not only
pointless shite it's actively wrong and outright stupid for someone
who tries to "work" on this code for a couple of month now.

Viresh, I'm really tired of this. Stop touching code you do not
understand. I warned you more than once and now you really reached the
level of complete incompetence. Welcome to my killfile.

Thanks,

tglx

2014-07-10 01:35:04

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [RFC 0/7] hrtimer: drop active hrtimer checks after adding it

On Wed, Jul 09, 2014 at 11:30:41PM +0200, Thomas Gleixner wrote:
> On Wed, 9 Jul 2014, Viresh Kumar wrote:
>
> So your patch series drops active hrtimer checks after adding it,
> according to your subject line.
>
> Quite useeul to drop something after adding it, right?
>
> > hrtimer_start*() family never fails to enqueue a hrtimer to a clock-base. The
> > only special case is when the hrtimer was in past. If it is getting enqueued to
> > local CPUs's clock-base, we raise a softirq and exit, else we handle that on
> > next interrupt on remote CPU.
> >
> > At several places in the kernel, we try to make sure if hrtimer was added
> > properly or not by calling hrtimer_active(), like:
> >
> > hrtimer_start(timer, expires, mode);
> > if (hrtimer_active(timer)) {
> > /* Added successfully */
> > } else {
> > /* Was added in the past */
> > }
> >
> > As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
> > So, there is no point calling hrtimer_active().
>
> Wrong as usual.
>
> It's a common pattern that short timeouts are given which lead to
> immediate expiry so the extra round through schedule is even more
> pointless than the extra check.

It may be a common pattern but it's not obvious at all as is in the code
except for timers gurus.

It looks like error handling while it's actually an optimization.

Also what about this pattern when it's used in interrupt or interrupt-disabled code?
In this case the handler is not going to fire right away, unless it's enqueued
on another CPU for unpinned timers.

For example this code in tick_nohz_stop_sched_tick():

hrtimer_start(&ts->sched_timer, expires, HRTIMER_MODE_ABS_PINNED);
/* Check, if the timer was already in the past */
if (hrtimer_active(&ts->sched_timer))
goto out;

It's not clear what this is handling. Concurrent immediate callback expiration from another CPU?
But the timer is pinned local so it can't execute right away between hrtimer_start() and hrtimer_active()
check...

2014-07-14 04:41:42

by Viresh Kumar

[permalink] [raw]
Subject: Re: [RFC 0/7] hrtimer: drop active hrtimer checks after adding it

Hi Thomas,

On 10 July 2014 07:04, Frederic Weisbecker <[email protected]> wrote:
> On Wed, Jul 09, 2014 at 11:30:41PM +0200, Thomas Gleixner wrote:
>> On Wed, 9 Jul 2014, Viresh Kumar wrote:
>>
>> So your patch series drops active hrtimer checks after adding it,
>> according to your subject line.
>>
>> Quite useeul to drop something after adding it, right?

I meant "hrtimer" by "it". Will fix it in case this patchset is still required.

>> > As hrtimer_start*() never fails, hrtimer_active() is guaranteed to return '1'.
>> > So, there is no point calling hrtimer_active().
>>
>> Wrong as usual.

I cross-checked this with Frederic and Preeti before reaching out to
you, to make sure its not 'obviously stupid'. And still couldn't get it
right. :(

>> It's a common pattern that short timeouts are given which lead to
>> immediate expiry so the extra round through schedule is even more
>> pointless than the extra check.

Just wanted to confirm it again, you are talking about CPU being
interrupted by clockevent device's interrupt right after hrtimer_start*()
returns and before calling hrtimer_active()?

> It may be a common pattern but it's not obvious at all as is in the code
> except for timers gurus.
>
> It looks like error handling while it's actually an optimization.
>
> Also what about this pattern when it's used in interrupt or interrupt-disabled code?
> In this case the handler is not going to fire right away, unless it's enqueued
> on another CPU for unpinned timers.
>
> For example this code in tick_nohz_stop_sched_tick():
>
> hrtimer_start(&ts->sched_timer, expires, HRTIMER_MODE_ABS_PINNED);
> /* Check, if the timer was already in the past */
> if (hrtimer_active(&ts->sched_timer))
> goto out;
>
> It's not clear what this is handling. Concurrent immediate callback expiration from another CPU?
> But the timer is pinned local so it can't execute right away between hrtimer_start() and hrtimer_active()
> check...

Actually I was concerned about other cases as well.

- Timeouts

I do agree that an extra check is better than an extra round of schedule().
But this is already achieved without calling hrtimer_active(), isn't it?

All these timeout hrtimers have hrtimer_wakeup() as there handler (as
these are initialized with: hrtimer_init_sleeper()).

And on expiration hrtimer_wakeup() does this: t->task = NULL;

So would this extra call to hrtimer_active() make any difference?

- Process-context: sched changes

I am not sure if scheduler routines: start_bandwidth_timer() and
start_dl_timer() would get called *only* with interrupts disabled.

But, it doesn't look obvious that the optimization Thomas mentioned
earlier is relevant here as well. These might be added here for error
checking.

I might be wrong here as I don't have any understanding of this code
and so sorry in advance.


Note: My tree is monitored by kbuild-bot and these changes are under
testing for over a week now. And I haven't received any reports of the
WARN() firing in __hrtimer_start_range_ns().. Probably these short
timeouts aren't getting hit at all by bot's tests.

--
viresh