2020-09-09 09:11:34

by jun qian

Subject: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

From: jun qian <[email protected]>

When the pending softirqs are fetched, they are all processed in the
while loop. If the pending softirqs together need more than 2 msec to
process in this loop, or one of them runs for a long time, the original
code logic still processes all of them without waking up ksoftirqd,
which causes a relatively large scheduling delay on the corresponding
CPU, which we do not wish to see. This patch checks the total time
spent processing pending softirqs; if it exceeds 2 msec, wake up
ksoftirqd to avoid a large scheduling delay.

Signed-off-by: jun qian <[email protected]>
---
kernel/softirq.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 69 insertions(+), 14 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c4201b7f..1f696c8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -25,6 +25,7 @@
#include <linux/smpboot.h>
#include <linux/tick.h>
#include <linux/irq.h>
+#include <linux/sched/clock.h>

#define CREATE_TRACE_POINTS
#include <trace/events/irq.h>
@@ -199,18 +200,17 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)

/*
* We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
- * but break the loop if need_resched() is set or after 2 ms.
- * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
- * certain cases, such as stop_machine(), jiffies may cease to
- * increment and so we need the MAX_SOFTIRQ_RESTART limit as
- * well to make sure we eventually return from this method.
+ * but break the loop if need_resched() is set or after MAX_SOFTIRQ_TIME_NS
+ * ns. Within the loop, if the time spent processing softirqs exceeds
+ * MAX_SOFTIRQ_TIME_NS, we also break out of the loop to wake up
+ * ksoftirqd.
*
* These limits have been established via experimentation.
* The two things to balance is latency against fairness -
* we want to handle softirqs as soon as possible, but they
* should not be able to lock up the box.
*/
-#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
+#define MAX_SOFTIRQ_TIME_NS 2000000
#define MAX_SOFTIRQ_RESTART 10

#ifdef CONFIG_TRACE_IRQFLAGS
@@ -246,15 +246,20 @@ static inline void lockdep_softirq_end(bool in_hardirq)
static inline void lockdep_softirq_end(bool in_hardirq) { }
#endif

+DEFINE_PER_CPU(__u32, pending_new_flag);
+DEFINE_PER_CPU(__u32, pending_next_bit);
+#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
+
asmlinkage __visible void __softirq_entry __do_softirq(void)
{
- unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
+ u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS;
unsigned long old_flags = current->flags;
int max_restart = MAX_SOFTIRQ_RESTART;
struct softirq_action *h;
bool in_hardirq;
- __u32 pending;
- int softirq_bit;
+ __u32 pending, pending_left, pending_new;
+ int softirq_bit, next_bit;
+ unsigned long flags;

/*
* Mask out PF_MEMALLOC as the current task context is borrowed for the
@@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)

h = softirq_vec;

- while ((softirq_bit = ffs(pending))) {
- unsigned int vec_nr;
+ next_bit = per_cpu(pending_next_bit, smp_processor_id());
+ per_cpu(pending_new_flag, smp_processor_id()) = 0;
+
+ pending_left = pending &
+ (SOFTIRQ_PENDING_MASK << next_bit);
+ pending_new = pending &
+ (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
+
+ /*
+ * In order to be fair, we should process the pending bits starting
+ * from where the last round of processing left off.
+ */
+ while ((softirq_bit = ffs(pending_left)) ||
+ (softirq_bit = ffs(pending_new))) {
int prev_count;
+ unsigned int vec_nr = 0;

+ /*
+ * When the left-over pending bits have all been handled, we need
+ * to reset h to softirq_vec.
+ */
+ if (!ffs(pending_left)) {
+ if (per_cpu(pending_new_flag, smp_processor_id()) == 0) {
+ h = softirq_vec;
+ per_cpu(pending_new_flag, smp_processor_id()) = 1;
+ }
+ }
h += softirq_bit - 1;

vec_nr = h - softirq_vec;
@@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
preempt_count_set(prev_count);
}
h++;
- pending >>= softirq_bit;
+
+ if (ffs(pending_left))
+ pending_left >>= softirq_bit;
+ else
+ pending_new >>= softirq_bit;
+
+ /*
+ * The softirq actions have been running for too long, so we may
+ * need to wake up ksoftirqd.
+ */
+ if (need_resched() && sched_clock() > end) {
+ /*
+ * Ensure that the remaining pending bits will be
+ * handled.
+ */
+ local_irq_save(flags);
+ if (ffs(pending_left))
+ or_softirq_pending((pending_left << (vec_nr + 1)) |
+ pending_new);
+ else
+ or_softirq_pending(pending_new << (vec_nr + 1));
+ local_irq_restore(flags);
+ per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1;
+ break;
+ }
}

+ /* reset the pending_next_bit */
+ per_cpu(pending_next_bit, smp_processor_id()) = 0;
+
if (__this_cpu_read(ksoftirqd) == current)
rcu_softirq_qs();
local_irq_disable();

pending = local_softirq_pending();
if (pending) {
- if (time_before(jiffies, end) && !need_resched() &&
- --max_restart)
+ if (!need_resched() && --max_restart &&
+ sched_clock() <= end)
goto restart;

wakeup_softirqd();
--
1.8.3.1


2020-09-11 15:58:41

by Peter Zijlstra

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On Wed, Sep 09, 2020 at 05:09:31PM +0800, [email protected] wrote:
> From: jun qian <[email protected]>
>
> When the pending softirqs are fetched, they are all processed in the
> while loop. If the pending softirqs together need more than 2 msec to
> process in this loop, or one of them runs for a long time, the original
> code logic still processes all of them without waking up ksoftirqd,
> which causes a relatively large scheduling delay on the corresponding
> CPU, which we do not wish to see. This patch checks the total time
> spent processing pending softirqs; if it exceeds 2 msec, wake up
> ksoftirqd to avoid a large scheduling delay.

But what is all that unreadable gibberish with pending_new_{flag,bit}?

Random comments below..


> +#define MAX_SOFTIRQ_TIME_NS 2000000

2*NSEC_PER_MSEC
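
That is, something like (untested; NSEC_PER_MSEC comes from
linux/time64.h):

        #define MAX_SOFTIRQ_TIME_NS     (2 * NSEC_PER_MSEC)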


> +DEFINE_PER_CPU(__u32, pending_new_flag);
> +DEFINE_PER_CPU(__u32, pending_next_bit);

__u32 is for userspace ABI, this is not it, use u32

> +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
> +
> asmlinkage __visible void __softirq_entry __do_softirq(void)
> {
> - unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> + u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS;
> unsigned long old_flags = current->flags;
> int max_restart = MAX_SOFTIRQ_RESTART;
> struct softirq_action *h;
> bool in_hardirq;
> - __u32 pending;
> - int softirq_bit;
> + __u32 pending, pending_left, pending_new;
> + int softirq_bit, next_bit;
> + unsigned long flags;
>
> /*
> * Mask out PF_MEMALLOC as the current task context is borrowed for the
> @@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>
> h = softirq_vec;
>
> - while ((softirq_bit = ffs(pending))) {
> - unsigned int vec_nr;
> + next_bit = per_cpu(pending_next_bit, smp_processor_id());
> + per_cpu(pending_new_flag, smp_processor_id()) = 0;

__this_cpu_read() / __this_cpu_write()
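
That is, something like (untested):

        next_bit = __this_cpu_read(pending_next_bit);
        __this_cpu_write(pending_new_flag, 0);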

> +
> + pending_left = pending &
> + (SOFTIRQ_PENDING_MASK << next_bit);
> + pending_new = pending &
> + (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));

The second mask is the inverse of the first.
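
So the pair can be written as something like (untested):

        pending_left = pending &  (SOFTIRQ_PENDING_MASK << next_bit);
        pending_new  = pending & ~(SOFTIRQ_PENDING_MASK << next_bit);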

> + /*
> + * In order to be fair, we should process the pending bits starting
> + * from where the last round of processing left off.
> + */
> + while ((softirq_bit = ffs(pending_left)) ||
> + (softirq_bit = ffs(pending_new))) {
> int prev_count;
> + unsigned int vec_nr = 0;
>
> + /*
> + * When the left-over pending bits have all been handled, we need
> + * to reset h to softirq_vec.
> + */
> + if (!ffs(pending_left)) {
> + if (per_cpu(pending_new_flag, smp_processor_id()) == 0) {
> + h = softirq_vec;
> + per_cpu(pending_new_flag, smp_processor_id()) = 1;
> + }
> + }
> h += softirq_bit - 1;
>
> vec_nr = h - softirq_vec;
> @@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> preempt_count_set(prev_count);
> }
> h++;
> - pending >>= softirq_bit;
> +
> + if (ffs(pending_left))

This is the _third_ ffs(pending_left), those things are _expensive_ (on
some archs, see include/asm-generic/bitops/__ffs.h).
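
A plain non-zero test is enough here; something like (untested):

        if (pending_left)
                pending_left >>= softirq_bit;
        else
                pending_new >>= softirq_bit;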

> + pending_left >>= softirq_bit;
> + else
> + pending_new >>= softirq_bit;
> +
> + /*
> + * The softirq actions have been running for too long, so we may
> + * need to wake up ksoftirqd.
> + */
> + if (need_resched() && sched_clock() > end) {
> + /*
> + * Ensure that the remaining pending bits will be
> + * handled.
> + */
> + local_irq_save(flags);
> + if (ffs(pending_left))

*fourth*...

> + or_softirq_pending((pending_left << (vec_nr + 1)) |
> + pending_new);
> + else
> + or_softirq_pending(pending_new << (vec_nr + 1));
> + local_irq_restore(flags);
> + per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1;
> + break;
> + }
> }
>
> + /* reset the pending_next_bit */
> + per_cpu(pending_next_bit, smp_processor_id()) = 0;
> +
> if (__this_cpu_read(ksoftirqd) == current)
> rcu_softirq_qs();
> local_irq_disable();
>
> pending = local_softirq_pending();
> if (pending) {
> - if (time_before(jiffies, end) && !need_resched() &&
> - --max_restart)
> + if (!need_resched() && --max_restart &&
> + sched_clock() <= end)
> goto restart;
>
> wakeup_softirqd();

This really wants to be a number of separate patches; and I quickly lost
the plot in your code. Instead of cleaning things up, you're making an
even bigger mess of things.

That said, I _think_ I've managed to decode what you want. See the
completely untested patches attached.



Attachments:
peterz-softirq-fix-loop.patch (1.47 kB)
peterz-softirq-timo.patch (1.56 kB)
peterz-softirq-needs-break.patch (2.72 kB)
peterz-softirq-break-more.patch (1.13 kB)

2020-09-11 16:48:56

by Qais Yousef

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On 09/09/20 17:09, [email protected] wrote:
> From: jun qian <[email protected]>
>
> When the pending softirqs are fetched, they are all processed in the
> while loop. If the pending softirqs together need more than 2 msec to
> process in this loop, or one of them runs for a long time, the original
> code logic still processes all of them without waking up ksoftirqd,
> which causes a relatively large scheduling delay on the corresponding
> CPU, which we do not wish to see. This patch checks the total time
> spent processing pending softirqs; if it exceeds 2 msec, wake up
> ksoftirqd to avoid a large scheduling delay.
>
> Signed-off-by: jun qian <[email protected]>

In Android there's a patch that tries to avoid scheduling an RT task on a cpu
that is running softirqs. I wonder if this patch helps with this case.

https://android.googlesource.com/kernel/msm/+/5c3f54c34acf4d9ed01530288d4a98acff815d79%5E%21/#F0

John, Wei, is this something of interest to you?

IIUC this patch will make sure the total softirq duration is capped at 2ms
rather than each call being capped at 2ms.

I presume if there's a single handler that takes a lot of time then this won't
help. But in that case, one can argue there's a potential bug with that
handler.

Cheers

--
Qais Yousef

> ---
> kernel/softirq.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 69 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index c4201b7f..1f696c8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -25,6 +25,7 @@
> #include <linux/smpboot.h>
> #include <linux/tick.h>
> #include <linux/irq.h>
> +#include <linux/sched/clock.h>
>
> #define CREATE_TRACE_POINTS
> #include <trace/events/irq.h>
> @@ -199,18 +200,17 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
>
> /*
> * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
> - * but break the loop if need_resched() is set or after 2 ms.
> - * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
> - * certain cases, such as stop_machine(), jiffies may cease to
> - * increment and so we need the MAX_SOFTIRQ_RESTART limit as
> - * well to make sure we eventually return from this method.
> + * but break the loop if need_resched() is set or after MAX_SOFTIRQ_TIME_NS
> + * ns. Within the loop, if the time spent processing softirqs exceeds
> + * MAX_SOFTIRQ_TIME_NS, we also break out of the loop to wake up
> + * ksoftirqd.
> *
> * These limits have been established via experimentation.
> * The two things to balance is latency against fairness -
> * we want to handle softirqs as soon as possible, but they
> * should not be able to lock up the box.
> */
> -#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
> +#define MAX_SOFTIRQ_TIME_NS 2000000
> #define MAX_SOFTIRQ_RESTART 10
>
> #ifdef CONFIG_TRACE_IRQFLAGS
> @@ -246,15 +246,20 @@ static inline void lockdep_softirq_end(bool in_hardirq)
> static inline void lockdep_softirq_end(bool in_hardirq) { }
> #endif
>
> +DEFINE_PER_CPU(__u32, pending_new_flag);
> +DEFINE_PER_CPU(__u32, pending_next_bit);
> +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
> +
> asmlinkage __visible void __softirq_entry __do_softirq(void)
> {
> - unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> + u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS;
> unsigned long old_flags = current->flags;
> int max_restart = MAX_SOFTIRQ_RESTART;
> struct softirq_action *h;
> bool in_hardirq;
> - __u32 pending;
> - int softirq_bit;
> + __u32 pending, pending_left, pending_new;
> + int softirq_bit, next_bit;
> + unsigned long flags;
>
> /*
> * Mask out PF_MEMALLOC as the current task context is borrowed for the
> @@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>
> h = softirq_vec;
>
> - while ((softirq_bit = ffs(pending))) {
> - unsigned int vec_nr;
> + next_bit = per_cpu(pending_next_bit, smp_processor_id());
> + per_cpu(pending_new_flag, smp_processor_id()) = 0;
> +
> + pending_left = pending &
> + (SOFTIRQ_PENDING_MASK << next_bit);
> + pending_new = pending &
> + (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
> +
> + /*
> + * In order to be fair, we should process the pending bits starting
> + * from where the last round of processing left off.
> + */
> + while ((softirq_bit = ffs(pending_left)) ||
> + (softirq_bit = ffs(pending_new))) {
> int prev_count;
> + unsigned int vec_nr = 0;
>
> + /*
> + * When the left-over pending bits have all been handled, we need
> + * to reset h to softirq_vec.
> + */
> + if (!ffs(pending_left)) {
> + if (per_cpu(pending_new_flag, smp_processor_id()) == 0) {
> + h = softirq_vec;
> + per_cpu(pending_new_flag, smp_processor_id()) = 1;
> + }
> + }
> h += softirq_bit - 1;
>
> vec_nr = h - softirq_vec;
> @@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> preempt_count_set(prev_count);
> }
> h++;
> - pending >>= softirq_bit;
> +
> + if (ffs(pending_left))
> + pending_left >>= softirq_bit;
> + else
> + pending_new >>= softirq_bit;
> +
> + /*
> + * The softirq actions have been running for too long, so we may
> + * need to wake up ksoftirqd.
> + */
> + if (need_resched() && sched_clock() > end) {
> + /*
> + * Ensure that the remaining pending bits will be
> + * handled.
> + */
> + local_irq_save(flags);
> + if (ffs(pending_left))
> + or_softirq_pending((pending_left << (vec_nr + 1)) |
> + pending_new);
> + else
> + or_softirq_pending(pending_new << (vec_nr + 1));
> + local_irq_restore(flags);
> + per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1;
> + break;
> + }
> }
>
> + /* reset the pending_next_bit */
> + per_cpu(pending_next_bit, smp_processor_id()) = 0;
> +
> if (__this_cpu_read(ksoftirqd) == current)
> rcu_softirq_qs();
> local_irq_disable();
>
> pending = local_softirq_pending();
> if (pending) {
> - if (time_before(jiffies, end) && !need_resched() &&
> - --max_restart)
> + if (!need_resched() && --max_restart &&
> + sched_clock() <= end)
> goto restart;
>
> wakeup_softirqd();
> --
> 1.8.3.1
>

2020-09-11 18:30:07

by Peter Zijlstra

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On Fri, Sep 11, 2020 at 05:46:45PM +0100, Qais Yousef wrote:
> On 09/09/20 17:09, [email protected] wrote:
> > From: jun qian <[email protected]>
> >
> > When the pending softirqs are fetched, they are all processed in the
> > while loop. If the pending softirqs together need more than 2 msec to
> > process in this loop, or one of them runs for a long time, the original
> > code logic still processes all of them without waking up ksoftirqd,
> > which causes a relatively large scheduling delay on the corresponding
> > CPU, which we do not wish to see. This patch checks the total time
> > spent processing pending softirqs; if it exceeds 2 msec, wake up
> > ksoftirqd to avoid a large scheduling delay.
> >
> > Signed-off-by: jun qian <[email protected]>
>
> In Android there's a patch that tries to avoid scheduling an RT task on a cpu
> that is running softirqs. I wonder if this patch helps with this case.
>
> https://android.googlesource.com/kernel/msm/+/5c3f54c34acf4d9ed01530288d4a98acff815d79%5E%21/#F0
>
> John, Wei, is this something of interest to you?

Urgh.. that's pretty gross. I think the sane approach is indeed getting
softirqs to react to need_resched() better.

2020-09-12 07:18:54

by jun qian

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

<[email protected]> wrote on Fri, Sep 11, 2020 at 11:55 PM:
>
> On Wed, Sep 09, 2020 at 05:09:31PM +0800, [email protected] wrote:
> > From: jun qian <[email protected]>
> >
> > When the pending softirqs are fetched, they are all processed in the
> > while loop. If the pending softirqs together need more than 2 msec to
> > process in this loop, or one of them runs for a long time, the original
> > code logic still processes all of them without waking up ksoftirqd,
> > which causes a relatively large scheduling delay on the corresponding
> > CPU, which we do not wish to see. This patch checks the total time
> > spent processing pending softirqs; if it exceeds 2 msec, wake up
> > ksoftirqd to avoid a large scheduling delay.
>
> But what is all that unreadable gibberish with pending_new_{flag,bit}?
>
> Random comments below..
>
>
> > +#define MAX_SOFTIRQ_TIME_NS 2000000
>
> 2*NSEC_PER_MSEC
>
>
> > +DEFINE_PER_CPU(__u32, pending_new_flag);
> > +DEFINE_PER_CPU(__u32, pending_next_bit);
>
> __u32 is for userspace ABI, this is not it, use u32
>
> > +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1)
> > +
> > asmlinkage __visible void __softirq_entry __do_softirq(void)
> > {
> > - unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> > + u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS;
> > unsigned long old_flags = current->flags;
> > int max_restart = MAX_SOFTIRQ_RESTART;
> > struct softirq_action *h;
> > bool in_hardirq;
> > - __u32 pending;
> > - int softirq_bit;
> > + __u32 pending, pending_left, pending_new;
> > + int softirq_bit, next_bit;
> > + unsigned long flags;
> >
> > /*
> > * Mask out PF_MEMALLOC as the current task context is borrowed for the
> > @@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >
> > h = softirq_vec;
> >
> > - while ((softirq_bit = ffs(pending))) {
> > - unsigned int vec_nr;
> > + next_bit = per_cpu(pending_next_bit, smp_processor_id());
> > + per_cpu(pending_new_flag, smp_processor_id()) = 0;
>
> __this_cpu_read() / __this_cpu_write()
>
> > +
> > + pending_left = pending &
> > + (SOFTIRQ_PENDING_MASK << next_bit);
> > + pending_new = pending &
> > + (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit));
>
> The second mask is the inverse of the first.
>
> > + /*
> > + * In order to be fair, we should process the pending bits starting
> > + * from where the last round of processing left off.
> > + */
> > + while ((softirq_bit = ffs(pending_left)) ||
> > + (softirq_bit = ffs(pending_new))) {
> > int prev_count;
> > + unsigned int vec_nr = 0;
> >
> > + /*
> > + * When the left-over pending bits have all been handled, we need
> > + * to reset h to softirq_vec.
> > + */
> > + if (!ffs(pending_left)) {
> > + if (per_cpu(pending_new_flag, smp_processor_id()) == 0) {
> > + h = softirq_vec;
> > + per_cpu(pending_new_flag, smp_processor_id()) = 1;
> > + }
> > + }
> > h += softirq_bit - 1;
> >
> > vec_nr = h - softirq_vec;
> > @@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > preempt_count_set(prev_count);
> > }
> > h++;
> > - pending >>= softirq_bit;
> > +
> > + if (ffs(pending_left))
>
> This is the _third_ ffs(pending_left), those things are _expensive_ (on
> some archs, see include/asm-generic/bitops/__ffs.h).
>
> > + pending_left >>= softirq_bit;
> > + else
> > + pending_new >>= softirq_bit;
> > +
> > + /*
> > + * The softirq actions have been running for too long, so we may
> > + * need to wake up ksoftirqd.
> > + */
> > + if (need_resched() && sched_clock() > end) {
> > + /*
> > + * Ensure that the remaining pending bits will be
> > + * handled.
> > + */
> > + local_irq_save(flags);
> > + if (ffs(pending_left))
>
> *fourth*...
>
> > + or_softirq_pending((pending_left << (vec_nr + 1)) |
> > + pending_new);
> > + else
> > + or_softirq_pending(pending_new << (vec_nr + 1));
> > + local_irq_restore(flags);
> > + per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1;
> > + break;
> > + }
> > }
> >
> > + /* reset the pending_next_bit */
> > + per_cpu(pending_next_bit, smp_processor_id()) = 0;
> > +
> > if (__this_cpu_read(ksoftirqd) == current)
> > rcu_softirq_qs();
> > local_irq_disable();
> >
> > pending = local_softirq_pending();
> > if (pending) {
> > - if (time_before(jiffies, end) && !need_resched() &&
> > - --max_restart)
> > + if (!need_resched() && --max_restart &&
> > + sched_clock() <= end)
> > goto restart;
> >
> > wakeup_softirqd();
>
> This really wants to be a number of separate patches; and I quickly lost
> the plot in your code. Instead of cleaning things up, you're making an
> even bigger mess of things.
>
> That said, I _think_ I've managed to decode what you want. See the
> completely untested patches attached.
>
>

Thanks a lot for your suggestion. I will rewrite the patch following
your suggestion, using multiple patches to clarify the ideas.

2020-09-14 11:35:13

by Qais Yousef

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On 09/11/20 20:28, [email protected] wrote:
> On Fri, Sep 11, 2020 at 05:46:45PM +0100, Qais Yousef wrote:
> > On 09/09/20 17:09, [email protected] wrote:
> > > From: jun qian <[email protected]>
> > >
> > > When the pending softirqs are fetched, they are all processed in the
> > > while loop. If the pending softirqs together need more than 2 msec to
> > > process in this loop, or one of them runs for a long time, the original
> > > code logic still processes all of them without waking up ksoftirqd,
> > > which causes a relatively large scheduling delay on the corresponding
> > > CPU, which we do not wish to see. This patch checks the total time
> > > spent processing pending softirqs; if it exceeds 2 msec, wake up
> > > ksoftirqd to avoid a large scheduling delay.
> > >
> > > Signed-off-by: jun qian <[email protected]>
> >
> > In Android there's a patch that tries to avoid scheduling an RT task on a cpu
> > that is running softirqs. I wonder if this patch helps with this case.
> >
> > https://android.googlesource.com/kernel/msm/+/5c3f54c34acf4d9ed01530288d4a98acff815d79%5E%21/#F0
> >
> > John, Wei, is this something of interest to you?
>
> Urgh.. that's pretty gross. I think the sane approach is indeed getting
> softirqs to react to need_resched() better.

What does PREEMPT_RT do to deal with softirq delays?

I have tried playing with enabling threadirqs, which AFAIU should make softirqs
preemptible, right?

I realize this patch is still missing from mainline at least:

https://gitlab.com/kalilinux/packages/linux/blob/a17bad0db9da44cd73f594794a58cc5646393b13/debian/patches-rt/softirq-Add-preemptible-softirq.patch

Would this be a heavy-handed approach to make available for non-PREEMPT_RT
kernels?

I only worry about potential NET_RX throughput issues. That path, by the way,
is currently protected with preempt_disable() in mainline; see netif_rx_ni().
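
For reference, netif_rx_ni() in mainline looks roughly like this (quoting
from memory, tracepoints omitted, so treat it as a sketch):

        int netif_rx_ni(struct sk_buff *skb)
        {
                int err;

                preempt_disable();
                err = netif_rx_internal(skb);
                if (local_softirq_pending())
                        do_softirq();
                preempt_enable();

                return err;
        }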

I am guessing here, but I suspect this NET_RX softirq is one source of big
delays when network activity is high.

Thanks

--
Qais Yousef

2020-09-14 11:47:45

by Qais Yousef

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On 09/11/20 12:14, John Dias wrote:
> I agree that the rt-softirq interaction patches are a gross hack (and I
> wrote the original versions, so it's my grossness). The problem is that
> even a 1ms delay in executing those low-latency audio threads is enough to
> cause obvious glitching in audio under very real circumstances, which is
> not an acceptable user experience -- and some softirq handlers run for >1ms
> on their own. (And 1ms is a high upper bound, not a median target.)

AFAIK professional audio apps have the toughest limit, at sub-10ms. 120Hz
screens impose a stricter limit on the display pipeline too, which must
finish its frame in ~8ms.

1ms is too short and approaches the PREEMPT_RT realm.

Is it possible to expand on the use case here? What application requires this
constraint?

Thanks

--
Qais Yousef

2020-09-14 14:18:53

by Peter Zijlstra

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On Mon, Sep 14, 2020 at 12:27:35PM +0100, Qais Yousef wrote:
> What does PREEMPT_RT do to deal with softirq delays?

Makes the lot preemptible, you found the patch below.

> I have tried playing with enabling threadirqs, which AFAIU should make softirqs
> preemptible, right?

Not yet,..

> I realize this patch is still missing from mainline at least:
>
> https://gitlab.com/kalilinux/packages/linux/blob/a17bad0db9da44cd73f594794a58cc5646393b13/debian/patches-rt/softirq-Add-preemptible-softirq.patch
>
> Would this be a heavy-handed approach to make available for non-PREEMPT_RT
> kernels?

Not sure, I suspect it relies on migrate_disable(), which is
preempt_disable() on !RT and then we're back to square one.

> I only worry about potential NET_RX throughput issues. That path, by the way,
> is currently protected with preempt_disable() in mainline; see netif_rx_ni().

So preempt_disable() isn't necessarily a problem, you just want it to
terminate soonish after need_resched() becomes true. Also, I'm having a
wee problem getting from net_rx_action() to netif_rx_ni()
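
The shape you want in a long-running handler is something like this
(completely untested; more_work()/do_one_item() are made-up stand-ins
for whatever queue the handler actually drains, not real APIs):

        static void some_softirq_action(struct softirq_action *h)
        {
                while (more_work()) {
                        do_one_item();          /* one bounded unit of work */
                        if (need_resched())
                                break;          /* react to the scheduler */
                }
                if (more_work())                /* re-raise; ksoftirqd runs the rest */
                        raise_softirq(NET_RX_SOFTIRQ);
        }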

> I am guessing here, but I suspect this NET_RX softirq is one source of big
> delays when network activity is high.

Well, one approach is to more aggressively limit how long softirq
processing can run. Current measures are very soft in that regard.

2020-09-14 15:29:39

by Qais Yousef

Subject: Re: [PATCH V6 1/1] Softirq: avoid large sched delay from the pending softirqs

On 09/14/20 16:14, [email protected] wrote:
> On Mon, Sep 14, 2020 at 12:27:35PM +0100, Qais Yousef wrote:
> > What does PREEMPT_RT do to deal with softirq delays?
>
> Makes the lot preemptible, you found the patch below.
>
> > I have tried playing with enabling threadirqs, which AFAIU should make softirqs
> > preemptible, right?
>
> Not yet,..
>
> > I realize this patch is still missing from mainline at least:
> >
> > https://gitlab.com/kalilinux/packages/linux/blob/a17bad0db9da44cd73f594794a58cc5646393b13/debian/patches-rt/softirq-Add-preemptible-softirq.patch
> >
> > Would this be a heavy-handed approach to make available for non-PREEMPT_RT
> > kernels?
>
> Not sure, I suspect it relies on migrate_disable(), which is
> preempt_disable() on !RT and then we're back to square one.

I think it will depend on local_bh_disable(). I didn't dig into the patch
above, but I believe it's doing that for RT.

Or maybe there's another aspect I am not aware of that relies on
migrate_disable() too..

>
> > I only worry about potential NET_RX throughput issues. That path, by the way,
> > is currently protected with preempt_disable() in mainline; see netif_rx_ni().
>
> So preempt_disable() isn't necessarily a problem, you just want it to

Yes. But high network traffic will make this a busy softirq, and it won't
work with the patch above. I assume that patch will have to pick up this
fix too:

https://lore.kernel.org/netdev/[email protected]/

For the time being, it's just another potential path that could introduce
latencies.

I can't follow the whole thing either, but if 5G modems end up there, I can
see this being a big source of noise when the user is downloading a big file,
assuming 5G lives up to its reputation of 400+ Mbps in practice.

So there might be a tangible trade-off between better softirq latencies and
better network throughput.

> terminate soonish after need_resched() becomes true. Also, I'm having a
> wee problem getting from net_rx_action() to netif_rx_ni()

I can investigate this direction :)

>
> > I am guessing here, but I suspect this NET_RX softirq is one source of big
> > delays when network activity is high.
>
> Well, one approach is to more aggressively limit how long softirq
> processing can run. Current measures are very soft in that regard.

Which is this patch. Although IIUC it doesn't take into account a single
softirq exceeding the quota. But the need_resched() bit above should address
that.

Thanks

--
Qais Yousef