2021-05-05 11:23:57

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH 1/6] delayacct: Use sched_clock()

Like all scheduler statistics, use sched_clock() based time.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
kernel/delayacct.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -7,9 +7,9 @@
#include <linux/sched.h>
#include <linux/sched/task.h>
#include <linux/sched/cputime.h>
+#include <linux/sched/clock.h>
#include <linux/slab.h>
#include <linux/taskstats.h>
-#include <linux/time.h>
#include <linux/sysctl.h>
#include <linux/delayacct.h>
#include <linux/module.h>
@@ -42,10 +42,9 @@ void __delayacct_tsk_init(struct task_st
* Finish delay accounting for a statistic using its timestamps (@start),
* accumalator (@total) and @count
*/
-static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total,
- u32 *count)
+static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count)
{
- s64 ns = ktime_get_ns() - *start;
+ s64 ns = local_clock() - *start;
unsigned long flags;

if (ns > 0) {
@@ -58,7 +57,7 @@ static void delayacct_end(raw_spinlock_t

void __delayacct_blkio_start(void)
{
- current->delays->blkio_start = ktime_get_ns();
+ current->delays->blkio_start = local_clock();
}

/*
@@ -151,21 +150,20 @@ __u64 __delayacct_blkio_ticks(struct tas

void __delayacct_freepages_start(void)
{
- current->delays->freepages_start = ktime_get_ns();
+ current->delays->freepages_start = local_clock();
}

void __delayacct_freepages_end(void)
{
- delayacct_end(
- &current->delays->lock,
- &current->delays->freepages_start,
- &current->delays->freepages_delay,
- &current->delays->freepages_count);
+ delayacct_end(&current->delays->lock,
+ &current->delays->freepages_start,
+ &current->delays->freepages_delay,
+ &current->delays->freepages_count);
}

void __delayacct_thrashing_start(void)
{
- current->delays->thrashing_start = ktime_get_ns();
+ current->delays->thrashing_start = local_clock();
}

void __delayacct_thrashing_end(void)



2021-05-05 19:30:41

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH 1/6] delayacct: Use sched_clock()

On Wed, 2021-05-05 at 12:59 +0200, Peter Zijlstra wrote:
> Like all scheduler statistics, use sched_clock() based time.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

Looks like this works even on messed up systems because
the delayacct code is called from the same CPU at sleep
time and wakeup time, before a task is migrated.

Reviewed-by: Rik van Riel <[email protected]>

--
All Rights Reversed.


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2021-05-06 14:01:18

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/6] delayacct: Use sched_clock()

On Wed, May 05, 2021 at 12:59:41PM +0200, Peter Zijlstra wrote:
> @@ -42,10 +42,9 @@ void __delayacct_tsk_init(struct task_st
> * Finish delay accounting for a statistic using its timestamps (@start),
> * accumalator (@total) and @count
> */
> -static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total,
> - u32 *count)
> +static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count)
> {
> - s64 ns = ktime_get_ns() - *start;
> + s64 ns = local_clock() - *start;

I don't think this is safe. These time sections that have preemption
and migration enabled and so might span multiple CPUs. local_clock()
could end up behind *start, AFAICS.

2021-05-06 14:24:53

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 1/6] delayacct: Use sched_clock()

On Thu, May 06, 2021 at 09:59:11AM -0400, Johannes Weiner wrote:
> On Wed, May 05, 2021 at 12:59:41PM +0200, Peter Zijlstra wrote:
> > @@ -42,10 +42,9 @@ void __delayacct_tsk_init(struct task_st
> > * Finish delay accounting for a statistic using its timestamps (@start),
> > * accumalator (@total) and @count
> > */
> > -static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total,
> > - u32 *count)
> > +static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count)
> > {
> > - s64 ns = ktime_get_ns() - *start;
> > + s64 ns = local_clock() - *start;
>
> I don't think this is safe. These time sections that have preemption
> and migration enabled and so might span multiple CPUs. local_clock()
> could end up behind *start, AFAICS.

Only if you have really crummy hardware, and in that case the drift is
bounded by around 1 tick. Also, this function actually checks: ns > 0.

And if you do have crummy hardware like that, ktime_get_ns() is the very
last thing you want to call at any frequency because it'll be the HPET.

2021-05-06 18:59:42

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/6] delayacct: Use sched_clock()

On Thu, May 06, 2021 at 04:17:33PM +0200, Peter Zijlstra wrote:
> On Thu, May 06, 2021 at 09:59:11AM -0400, Johannes Weiner wrote:
> > On Wed, May 05, 2021 at 12:59:41PM +0200, Peter Zijlstra wrote:
> > > @@ -42,10 +42,9 @@ void __delayacct_tsk_init(struct task_st
> > > * Finish delay accounting for a statistic using its timestamps (@start),
> > > * accumalator (@total) and @count
> > > */
> > > -static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total,
> > > - u32 *count)
> > > +static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count)
> > > {
> > > - s64 ns = ktime_get_ns() - *start;
> > > + s64 ns = local_clock() - *start;
> >
> > I don't think this is safe. These time sections that have preemption
> > and migration enabled and so might span multiple CPUs. local_clock()
> > could end up behind *start, AFAICS.
>
> Only if you have really crummy hardware, and in that case the drift is
> bounded by around 1 tick. Also, this function actually checks: ns > 0.

Oh, I didn't realize it was that close. I just went off the dramatic
warnings on cpu_clock() :-) But yeah, that seems plenty accurate for
this purpose.

Acked-by: Johannes Weiner <[email protected]>

2021-05-07 16:45:25

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH 1/6] delayacct: Use sched_clock()

On Wed, May 05, 2021 at 12:59:41PM +0200, Peter Zijlstra wrote:
> Like all scheduler statistics, use sched_clock() based time.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---

Goind by your comment about preemption safety not being a concern
the patch looks good.

Acked-by: Balbir Singh <[email protected]>

Subject: [tip: sched/core] delayacct: Use sched_clock()

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 4b7a08a0b6e4e910a6feee438d76e426381df0cb
Gitweb: https://git.kernel.org/tip/4b7a08a0b6e4e910a6feee438d76e426381df0cb
Author: Peter Zijlstra <[email protected]>
AuthorDate: Tue, 04 May 2021 22:43:48 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 12 May 2021 11:43:23 +02:00

delayacct: Use sched_clock()

Like all scheduler statistics, use sched_clock() based time.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/delayacct.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 2772575..3fe7cd5 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -7,9 +7,9 @@
#include <linux/sched.h>
#include <linux/sched/task.h>
#include <linux/sched/cputime.h>
+#include <linux/sched/clock.h>
#include <linux/slab.h>
#include <linux/taskstats.h>
-#include <linux/time.h>
#include <linux/sysctl.h>
#include <linux/delayacct.h>
#include <linux/module.h>
@@ -42,10 +42,9 @@ void __delayacct_tsk_init(struct task_struct *tsk)
* Finish delay accounting for a statistic using its timestamps (@start),
* accumalator (@total) and @count
*/
-static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total,
- u32 *count)
+static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count)
{
- s64 ns = ktime_get_ns() - *start;
+ s64 ns = local_clock() - *start;
unsigned long flags;

if (ns > 0) {
@@ -58,7 +57,7 @@ static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total,

void __delayacct_blkio_start(void)
{
- current->delays->blkio_start = ktime_get_ns();
+ current->delays->blkio_start = local_clock();
}

/*
@@ -151,21 +150,20 @@ __u64 __delayacct_blkio_ticks(struct task_struct *tsk)

void __delayacct_freepages_start(void)
{
- current->delays->freepages_start = ktime_get_ns();
+ current->delays->freepages_start = local_clock();
}

void __delayacct_freepages_end(void)
{
- delayacct_end(
- &current->delays->lock,
- &current->delays->freepages_start,
- &current->delays->freepages_delay,
- &current->delays->freepages_count);
+ delayacct_end(&current->delays->lock,
+ &current->delays->freepages_start,
+ &current->delays->freepages_delay,
+ &current->delays->freepages_count);
}

void __delayacct_thrashing_start(void)
{
- current->delays->thrashing_start = ktime_get_ns();
+ current->delays->thrashing_start = local_clock();
}

void __delayacct_thrashing_end(void)

2021-05-12 10:44:08

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH 1/6] delayacct: Use sched_clock()

On Wed, May 05, 2021 at 12:59:41PM +0200, Peter Zijlstra wrote:
> Like all scheduler statistics, use sched_clock() based time.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

Acked-by: Mel Gorman <[email protected]>

--
Mel Gorman
SUSE Labs