2020-10-12 13:53:43

by Pingfan Liu

Subject: [PATCH] sched/cputime: correct account of irqtime

__do_softirq() may be interrupted by hardware interrupts. In this case,
irqtime_account_irq() will account the time slice as CPUTIME_SOFTIRQ by
mistake.

By passing irqtime_account_irq() an extra param about either hardirq or
softirq, irqtime_account_irq() can handle the above case.

Signed-off-by: Pingfan Liu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Allen Pais <[email protected]>
Cc: Romain Perier <[email protected]>
To: [email protected]
---
include/linux/hardirq.h | 4 ++--
include/linux/vtime.h | 12 ++++++------
kernel/sched/cputime.c | 4 ++--
kernel/softirq.c | 6 +++---
4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 754f67a..56e7bb5 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -32,7 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void)
*/
#define __irq_enter() \
do { \
- account_irq_enter_time(current); \
+ account_irq_enter_time(current, true); \
preempt_count_add(HARDIRQ_OFFSET); \
lockdep_hardirq_enter(); \
} while (0)
@@ -63,7 +63,7 @@ void irq_enter_rcu(void);
#define __irq_exit() \
do { \
lockdep_hardirq_exit(); \
- account_irq_exit_time(current); \
+ account_irq_exit_time(current, true); \
preempt_count_sub(HARDIRQ_OFFSET); \
} while (0)

diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 2cdeca0..294188ae1 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -98,21 +98,21 @@ static inline void vtime_flush(struct task_struct *tsk) { }


#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-extern void irqtime_account_irq(struct task_struct *tsk);
+extern void irqtime_account_irq(struct task_struct *tsk, bool hardirq);
#else
-static inline void irqtime_account_irq(struct task_struct *tsk) { }
+static inline void irqtime_account_irq(struct task_struct *tsk, bool hardirq) { }
#endif

-static inline void account_irq_enter_time(struct task_struct *tsk)
+static inline void account_irq_enter_time(struct task_struct *tsk, bool hardirq)
{
vtime_account_irq_enter(tsk);
- irqtime_account_irq(tsk);
+ irqtime_account_irq(tsk, hardirq);
}

-static inline void account_irq_exit_time(struct task_struct *tsk)
+static inline void account_irq_exit_time(struct task_struct *tsk, bool hardirq)
{
vtime_account_irq_exit(tsk);
- irqtime_account_irq(tsk);
+ irqtime_account_irq(tsk, hardirq);
}

#endif /* _LINUX_KERNEL_VTIME_H */
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5a55d23..166f1d7 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -47,7 +47,7 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
* Called before incrementing preempt_count on {soft,}irq_enter
* and before decrementing preempt_count on {soft,}irq_exit.
*/
-void irqtime_account_irq(struct task_struct *curr)
+void irqtime_account_irq(struct task_struct *curr, bool hardirq)
{
struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
s64 delta;
@@ -68,7 +68,7 @@ void irqtime_account_irq(struct task_struct *curr)
*/
if (hardirq_count())
irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
- else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+ else if (in_serving_softirq() && curr != this_cpu_ksoftirqd() && !hardirq)
irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
}
EXPORT_SYMBOL_GPL(irqtime_account_irq);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index bf88d7f6..da59ea39 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -270,7 +270,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
current->flags &= ~PF_MEMALLOC;

pending = local_softirq_pending();
- account_irq_enter_time(current);
+ account_irq_enter_time(current, false);

__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
in_hardirq = lockdep_softirq_start();
@@ -321,7 +321,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
}

lockdep_softirq_end(in_hardirq);
- account_irq_exit_time(current);
+ account_irq_exit_time(current, false);
__local_bh_enable(SOFTIRQ_OFFSET);
WARN_ON_ONCE(in_interrupt());
current_restore_flags(old_flags, PF_MEMALLOC);
@@ -417,7 +417,7 @@ static inline void __irq_exit_rcu(void)
#else
lockdep_assert_irqs_disabled();
#endif
- account_irq_exit_time(current);
+ account_irq_exit_time(current, true);
preempt_count_sub(HARDIRQ_OFFSET);
if (!in_interrupt() && local_softirq_pending())
invoke_softirq();
--
2.7.5


2020-10-13 11:25:32

by jun qian

Subject: Re: [PATCH] sched/cputime: correct account of irqtime

Pingfan Liu <[email protected]> wrote on Mon, Oct 12, 2020 at 9:54 PM:
>
> __do_softirq() may be interrupted by hardware interrupts. In this case,
> irqtime_account_irq() will account the time slice as CPUTIME_SOFTIRQ by
> mistake.
>
> By passing irqtime_account_irq() an extra param about either hardirq or
> softirq, irqtime_account_irq() can handle the above case.
>
> Signed-off-by: Pingfan Liu <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Vincent Guittot <[email protected]>
> Cc: Dietmar Eggemann <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ben Segall <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: "Paul E. McKenney" <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Allen Pais <[email protected]>
> Cc: Romain Perier <[email protected]>
> To: [email protected]
> ---
> include/linux/hardirq.h | 4 ++--
> include/linux/vtime.h | 12 ++++++------
> kernel/sched/cputime.c | 4 ++--
> kernel/softirq.c | 6 +++---
> 4 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index 754f67a..56e7bb5 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -32,7 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void)
> */
> #define __irq_enter() \
> do { \
> - account_irq_enter_time(current); \
> + account_irq_enter_time(current, true); \
> preempt_count_add(HARDIRQ_OFFSET); \
> lockdep_hardirq_enter(); \
> } while (0)
> @@ -63,7 +63,7 @@ void irq_enter_rcu(void);
> #define __irq_exit() \
> do { \
> lockdep_hardirq_exit(); \
> - account_irq_exit_time(current); \
> + account_irq_exit_time(current, true); \
> preempt_count_sub(HARDIRQ_OFFSET); \
> } while (0)
>
> diff --git a/include/linux/vtime.h b/include/linux/vtime.h
> index 2cdeca0..294188ae1 100644
> --- a/include/linux/vtime.h
> +++ b/include/linux/vtime.h
> @@ -98,21 +98,21 @@ static inline void vtime_flush(struct task_struct *tsk) { }
>
>
> #ifdef CONFIG_IRQ_TIME_ACCOUNTING
> -extern void irqtime_account_irq(struct task_struct *tsk);
> +extern void irqtime_account_irq(struct task_struct *tsk, bool hardirq);
> #else
> -static inline void irqtime_account_irq(struct task_struct *tsk) { }
> +static inline void irqtime_account_irq(struct task_struct *tsk, bool hardirq) { }
> #endif
>
> -static inline void account_irq_enter_time(struct task_struct *tsk)
> +static inline void account_irq_enter_time(struct task_struct *tsk, bool hardirq)
> {
> vtime_account_irq_enter(tsk);
> - irqtime_account_irq(tsk);
> + irqtime_account_irq(tsk, hardirq);
> }
>
> -static inline void account_irq_exit_time(struct task_struct *tsk)
> +static inline void account_irq_exit_time(struct task_struct *tsk, bool hardirq)
> {
> vtime_account_irq_exit(tsk);
> - irqtime_account_irq(tsk);
> + irqtime_account_irq(tsk, hardirq);
> }
>
> #endif /* _LINUX_KERNEL_VTIME_H */
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 5a55d23..166f1d7 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -47,7 +47,7 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
> * Called before incrementing preempt_count on {soft,}irq_enter
> * and before decrementing preempt_count on {soft,}irq_exit.
> */
> -void irqtime_account_irq(struct task_struct *curr)
> +void irqtime_account_irq(struct task_struct *curr, bool hardirq)
> {
> struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> s64 delta;
> @@ -68,7 +68,7 @@ void irqtime_account_irq(struct task_struct *curr)
> */
> if (hardirq_count())
> irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> + else if (in_serving_softirq() && curr != this_cpu_ksoftirqd() && !hardirq)
> irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> }

In my opinion, we don't need the hardirq flag. The check
if (hardirq_count())
already tells us where the delta time comes from.

Thanks

> EXPORT_SYMBOL_GPL(irqtime_account_irq);
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index bf88d7f6..da59ea39 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -270,7 +270,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> current->flags &= ~PF_MEMALLOC;
>
> pending = local_softirq_pending();
> - account_irq_enter_time(current);
> + account_irq_enter_time(current, false);
>
> __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> in_hardirq = lockdep_softirq_start();
> @@ -321,7 +321,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> }
>
> lockdep_softirq_end(in_hardirq);
> - account_irq_exit_time(current);
> + account_irq_exit_time(current, false);
> __local_bh_enable(SOFTIRQ_OFFSET);
> WARN_ON_ONCE(in_interrupt());
> current_restore_flags(old_flags, PF_MEMALLOC);
> @@ -417,7 +417,7 @@ static inline void __irq_exit_rcu(void)
> #else
> lockdep_assert_irqs_disabled();
> #endif
> - account_irq_exit_time(current);
> + account_irq_exit_time(current, true);
> preempt_count_sub(HARDIRQ_OFFSET);
> if (!in_interrupt() && local_softirq_pending())
> invoke_softirq();
> --
> 2.7.5
>

2020-10-13 16:07:20

by Pingfan Liu

Subject: Re: [PATCH] sched/cputime: correct account of irqtime

On Tue, Oct 13, 2020 at 11:10 AM jun qian <[email protected]> wrote:
>
> Pingfan Liu <[email protected]> wrote on Mon, Oct 12, 2020 at 9:54 PM:
> >
> > __do_softirq() may be interrupted by hardware interrupts. In this case,
> > irqtime_account_irq() will account the time slice as CPUTIME_SOFTIRQ by
> > mistake.
> >
> > By passing irqtime_account_irq() an extra param about either hardirq or
> > softirq, irqtime_account_irq() can handle the above case.
> >
> > Signed-off-by: Pingfan Liu <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Juri Lelli <[email protected]>
> > Cc: Vincent Guittot <[email protected]>
> > Cc: Dietmar Eggemann <[email protected]>
> > Cc: Steven Rostedt <[email protected]>
> > Cc: Ben Segall <[email protected]>
> > Cc: Mel Gorman <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Andy Lutomirski <[email protected]>
> > Cc: Will Deacon <[email protected]>
> > Cc: "Paul E. McKenney" <[email protected]>
> > Cc: Frederic Weisbecker <[email protected]>
> > Cc: Allen Pais <[email protected]>
> > Cc: Romain Perier <[email protected]>
> > To: [email protected]
> > ---
> > include/linux/hardirq.h | 4 ++--
> > include/linux/vtime.h | 12 ++++++------
> > kernel/sched/cputime.c | 4 ++--
> > kernel/softirq.c | 6 +++---
> > 4 files changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> > index 754f67a..56e7bb5 100644
> > --- a/include/linux/hardirq.h
> > +++ b/include/linux/hardirq.h
> > @@ -32,7 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void)
> > */
> > #define __irq_enter() \
> > do { \
> > - account_irq_enter_time(current); \
> > + account_irq_enter_time(current, true); \
> > preempt_count_add(HARDIRQ_OFFSET); \
> > lockdep_hardirq_enter(); \
> > } while (0)
> > @@ -63,7 +63,7 @@ void irq_enter_rcu(void);
> > #define __irq_exit() \
> > do { \
> > lockdep_hardirq_exit(); \
> > - account_irq_exit_time(current); \
> > + account_irq_exit_time(current, true); \
> > preempt_count_sub(HARDIRQ_OFFSET); \
> > } while (0)
> >
> > diff --git a/include/linux/vtime.h b/include/linux/vtime.h
> > index 2cdeca0..294188ae1 100644
> > --- a/include/linux/vtime.h
> > +++ b/include/linux/vtime.h
> > @@ -98,21 +98,21 @@ static inline void vtime_flush(struct task_struct *tsk) { }
> >
> >
> > #ifdef CONFIG_IRQ_TIME_ACCOUNTING
> > -extern void irqtime_account_irq(struct task_struct *tsk);
> > +extern void irqtime_account_irq(struct task_struct *tsk, bool hardirq);
> > #else
> > -static inline void irqtime_account_irq(struct task_struct *tsk) { }
> > +static inline void irqtime_account_irq(struct task_struct *tsk, bool hardirq) { }
> > #endif
> >
> > -static inline void account_irq_enter_time(struct task_struct *tsk)
> > +static inline void account_irq_enter_time(struct task_struct *tsk, bool hardirq)
> > {
> > vtime_account_irq_enter(tsk);
> > - irqtime_account_irq(tsk);
> > + irqtime_account_irq(tsk, hardirq);
> > }
> >
> > -static inline void account_irq_exit_time(struct task_struct *tsk)
> > +static inline void account_irq_exit_time(struct task_struct *tsk, bool hardirq)
> > {
> > vtime_account_irq_exit(tsk);
> > - irqtime_account_irq(tsk);
> > + irqtime_account_irq(tsk, hardirq);
> > }
> >
> > #endif /* _LINUX_KERNEL_VTIME_H */
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 5a55d23..166f1d7 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -47,7 +47,7 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
> > * Called before incrementing preempt_count on {soft,}irq_enter
> > * and before decrementing preempt_count on {soft,}irq_exit.
> > */
> > -void irqtime_account_irq(struct task_struct *curr)
> > +void irqtime_account_irq(struct task_struct *curr, bool hardirq)
> > {
> > struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
> > s64 delta;
> > @@ -68,7 +68,7 @@ void irqtime_account_irq(struct task_struct *curr)
> > */
> > if (hardirq_count())
> > irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> > - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> > + else if (in_serving_softirq() && curr != this_cpu_ksoftirqd() && !hardirq)
> > irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> > }
>
> In my opinion, we don't need the hardirq flag. The check
> if (hardirq_count())
> already tells us where the delta time comes from.

Consider the scenario in which a hardirq arrives immediately after
__do_softirq()->local_irq_enable(). The following code shows that
hardirq_count() cannot help, because account_irq_enter_time() runs
before preempt_count_add(HARDIRQ_OFFSET):

    #define __irq_enter() \
        do { \
            account_irq_enter_time(current); \
            preempt_count_add(HARDIRQ_OFFSET); \
            lockdep_hardirq_enter(); \
        } while (0)

Anything I missed?

Thanks,
Pingfan

2020-10-14 16:29:55

by Peter Zijlstra

Subject: Re: [PATCH] sched/cputime: correct account of irqtime

On Mon, Oct 12, 2020 at 09:50:44PM +0800, Pingfan Liu wrote:
> __do_softirq() may be interrupted by hardware interrupts. In this case,
> irqtime_account_irq() will account the time slice as CPUTIME_SOFTIRQ by
> mistake.
>
> By passing irqtime_account_irq() an extra param about either hardirq or
> softirq, irqtime_account_irq() can handle the above case.

I'm not sure I see the scenario in which it goes wrong.

irqtime_account_irq() is designed such that we're called with the old
preempt_count on enter and the new preempt_count on exit. This way we'll
accumulate the delta to the previous context.

> @@ -68,7 +68,7 @@ void irqtime_account_irq(struct task_struct *curr)
> */
> if (hardirq_count())
> irqtime_account_delta(irqtime, delta, CPUTIME_IRQ);
> - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
> + else if (in_serving_softirq() && curr != this_cpu_ksoftirqd() && !hardirq)
> irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
> }


2020-10-15 08:51:07

by Pingfan Liu

Subject: Re: [PATCH] sched/cputime: correct account of irqtime

On Wed, Oct 14, 2020 at 9:02 PM Peter Zijlstra <[email protected]> wrote:
>
> On Mon, Oct 12, 2020 at 09:50:44PM +0800, Pingfan Liu wrote:
> > __do_softirq() may be interrupted by hardware interrupts. In this case,
> > irqtime_account_irq() will account the time slice as CPUTIME_SOFTIRQ by
> > mistake.
> >
> > By passing irqtime_account_irq() an extra param about either hardirq or
> > softirq, irqtime_account_irq() can handle the above case.
>
> I'm not sure I see the scenario in which it goes wrong.
>
> irqtime_account_irq() is designed such that we're called with the old
> preempt_count on enter and the new preempt_count on exit. This way we'll
> > accumulate the delta to the previous context.
>
Oops! You are right: the time delta between a softirq and an
interrupting hardirq should be accounted to the softirq.

Thanks for your clear explanation.

Regards,
Pingfan