2022-12-22 22:59:30

by Jakub Kicinski

Subject: [PATCH 0/3] softirq: uncontroversial change

Catching up on LWN I ran across the article about softirq
changes, and then I noticed fresh patches in Peter's tree.
So it's probably wise for me to throw these out there.

My (can I say Meta's?) problem is the opposite of what the RT
sensitive people complain about. In the current scheme, once
ksoftirqd is woken, no network processing happens until it runs.

When networking is genuinely overloaded that's probably fair; the
problem is that we confuse latency tweaks with overload protection.
We have a need_resched() check in the loop condition, which is a
latency tweak. Most often we defer to ksoftirqd because we're trying
to be nice and let user space respond quickly, not because there is
an overload. But user space may not be nice, and may sit on the CPU
for 10ms+. Also, the softirq "work allowance" is 2ms, which is
uncomfortably close to the timer tick, but that's another story.
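
For context, the exit condition in __do_softirq() looks roughly like
this (condensed and paraphrased from a ~5.19 kernel, not part of this
series):

#define MAX_SOFTIRQ_TIME	msecs_to_jiffies(2)
#define MAX_SOFTIRQ_RESTART	10

asmlinkage __visible void __softirq_entry __do_softirq(void)
{
	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
	int max_restart = MAX_SOFTIRQ_RESTART;
	__u32 pending;

restart:
	/* ... run the handlers for the currently pending softirqs ... */

	pending = local_softirq_pending();
	if (pending) {
		/* bail out after 2ms, 10 restarts, or as soon as
		 * need_resched() fires, whichever comes first */
		if (time_before(jiffies, end) && !need_resched() &&
		    --max_restart)
			goto restart;

		/* everything still pending is punted to ksoftirqd */
		wakeup_softirqd();
	}
}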

We have a softirq latency tracker in our prod kernel which catches
8ms+ stalls of net Tx (packets queued to the NIC but no NAPI
cleanup within 8ms). With these patches applied on 5.19, a fully
loaded web machine sees a drop in stalls from 1.8/sec to 0.16/sec.
I also see a 50% drop in outgoing TCP retransmissions and a ~10%
drop in non-TLP incoming ones. This is not a network-heavy workload,
so most of the retransmissions are due to scheduling artifacts.

Network latency in a datacenter (around 10us) is a neat ~1000x
lower than the scheduling granularity.

These patches (patch 2 is "the meat") change what we recognize
as overload. Instead of just checking whether "ksoftirqd is woken",
they also cap how long we consider ourselves to be in overload, with
a time limit that differs depending on whether we yielded due to real
resource exhaustion or just hit that need_resched().
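
A minimal sketch of the concept, not the actual patch 2 - the helper
name, the per-CPU variable and the 100ms/2ms windows below are
illustrative assumptions of mine:

/*
 * Sketch only: remember until when this CPU should be treated as
 * overloaded, with a longer window when we yielded because the softirq
 * budget was exhausted than when we merely hit need_resched().
 */
static DEFINE_PER_CPU(unsigned long, overload_limit);	/* in jiffies */

static void note_softirq_yield(bool resource_exhaustion)
{
	unsigned long window = resource_exhaustion ?
			       msecs_to_jiffies(100) : msecs_to_jiffies(2);

	__this_cpu_write(overload_limit, jiffies + window);
}

static bool ksoftirqd_should_handle(unsigned long pending)
{
	struct task_struct *tsk = __this_cpu_read(ksoftirqd);

	if (pending & SOFTIRQ_NOW_MASK)
		return false;
	/* defer to ksoftirqd only while the overload window is open */
	return tsk && task_is_running(tsk) && !__kthread_should_park(tsk) &&
	       time_is_after_jiffies(__this_cpu_read(overload_limit));
}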

I hope the core concept is not entirely idiotic. It'd be great
if we could get this in, or fold an equivalent concept into ongoing
work from others, because thanks to various "scheduler improvements"
this problem gets worse every time we upgrade the production
kernel :(

Jakub Kicinski (3):
softirq: rename ksoftirqd_running() -> ksoftirqd_should_handle()
softirq: avoid spurious stalls due to need_resched()
softirq: don't yield if only expedited handlers are pending

kernel/softirq.c | 29 ++++++++++++++++++++++-------
1 file changed, 22 insertions(+), 7 deletions(-)

--
2.38.1


2022-12-22 23:00:58

by Jakub Kicinski

Subject: [PATCH 1/3] softirq: rename ksoftirqd_running() -> ksoftirqd_should_handle()

ksoftirqd_running() takes the high priority softirqs into
consideration, so ksoftirqd_should_handle() seems like
a better name.

Signed-off-by: Jakub Kicinski <[email protected]>
---
kernel/softirq.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..00b838d566c1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -86,7 +86,7 @@ static void wakeup_softirqd(void)
* unless we're doing some of the synchronous softirqs.
*/
#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
-static bool ksoftirqd_running(unsigned long pending)
+static bool ksoftirqd_should_handle(unsigned long pending)
{
struct task_struct *tsk = __this_cpu_read(ksoftirqd);

@@ -236,7 +236,7 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
goto out;

pending = local_softirq_pending();
- if (!pending || ksoftirqd_running(pending))
+ if (!pending || ksoftirqd_should_handle(pending))
goto out;

/*
@@ -432,7 +432,7 @@ static inline bool should_wake_ksoftirqd(void)

static inline void invoke_softirq(void)
{
- if (ksoftirqd_running(local_softirq_pending()))
+ if (ksoftirqd_should_handle(local_softirq_pending()))
return;

if (!force_irqthreads() || !__this_cpu_read(ksoftirqd)) {
@@ -468,7 +468,7 @@ asmlinkage __visible void do_softirq(void)

pending = local_softirq_pending();

- if (pending && !ksoftirqd_running(pending))
+ if (pending && !ksoftirqd_should_handle(pending))
do_softirq_own_stack();

local_irq_restore(flags);
--
2.38.1

2023-04-20 17:34:31

by Paolo Abeni

Subject: Re: [PATCH 0/3] softirq: uncontroversial change

Hi all,
On Thu, 2022-12-22 at 14:12 -0800, Jakub Kicinski wrote:
> Catching up on LWN I run across the article about softirq
> changes, and then I noticed fresh patches in Peter's tree.
> So probably wise for me to throw these out there.
>
> My (can I say Meta's?) problem is the opposite to what the RT
> sensitive people complain about. In the current scheme once
> ksoftirqd is woken no network processing happens until it runs.
>
> When networking gets overloaded - that's probably fair, the problem
> is that we confuse latency tweaks with overload protection. We have
> a needs_resched() in the loop condition (which is a latency tweak)
> Most often we defer to ksoftirqd because we're trying to be nice
> and let user space respond quickly, not because there is an
> overload. But the user space may not be nice, and sit on the CPU
> for 10ms+. Also the sirq's "work allowance" is 2ms, which is
> uncomfortably close to the timer tick, but that's another story.
>
> We have a sirq latency tracker in our prod kernel which catches
> 8ms+ stalls of net Tx (packets queued to the NIC but there is
> no NAPI cleanup within 8ms) and with these patches applied
> on 5.19 fully loaded web machine sees a drop in stalls from
> 1.8 stalls/sec to 0.16/sec. I also see a 50% drop in outgoing
> TCP retransmissions and ~10% drop in non-TLP incoming ones.
> This is not a network-heavy workload so most of the rtx are
> due to scheduling artifacts.
>
> The network latency in a datacenter is somewhere around neat
> 1000x lower than scheduling granularity (around 10us).
>
> These patches (patch 2 is "the meat") change what we recognize
> as overload. Instead of just checking if "ksoftirqd is woken"
> it also caps how long we consider ourselves to be in overload,
> a time limit which is different based on whether we yield due
> to real resource exhaustion vs just hitting that needs_resched().
>
> I hope the core concept is not entirely idiotic. It'd be great
> if we could get this in or fold an equivalent concept into ongoing
> work from others, because due to various "scheduler improvements"
> every time we upgrade the production kernel this problem is getting
> worse :(

Please allow me to revive this old thread.

My understanding is that we want to avoid adding more heuristics here,
preferring a consistent refactor.

I would like to propose a revert of:

4cd13c21b207 softirq: Let ksoftirqd do its job

and its follow-ups:

3c53776e29f8 Mark HI and TASKLET softirq synchronous
0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking

The problem originally addressed by 4cd13c21b207 can now be tackled
with the threaded napi, available since:

29863d41bb6e net: implement threaded-able napi poll loop support

Reverting the mentioned commit should address the latency issues
mentioned by Jakub - I verified it solves a somewhat related problem in
my setup - and reduces the layering of heuristics in this area.
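
For reference, drivers can already opt in to threaded NAPI
programmatically, and admins can toggle it at runtime via
/sys/class/net/<dev>/threaded. A minimal, illustrative sketch (the
function name and error handling are mine, not from any existing
driver):

#include <linux/netdevice.h>

/* Sketch: enable threaded NAPI for a device, e.g. at probe time. */
static int example_enable_threaded_napi(struct net_device *dev)
{
	int err;

	/* one kthread per NAPI instance; napi_schedule() then wakes the
	 * thread instead of raising NET_RX_SOFTIRQ on the local CPU */
	err = dev_set_threaded(dev, true);
	if (err)
		netdev_warn(dev, "failed to enable threaded NAPI: %d\n", err);

	return err;
}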

A refactor introducing uniform overload detection and proper resource
control would be better, but I admit it's beyond me, and in any case
it could still land afterwards.

Any opinion is more than welcome!

Thanks,

Paolo

2023-04-20 17:42:33

by Eric Dumazet

Subject: Re: [PATCH 0/3] softirq: uncontroversial change

On Thu, Apr 20, 2023 at 7:24 PM Paolo Abeni <[email protected]> wrote:
>
> Hi all,
> On Thu, 2022-12-22 at 14:12 -0800, Jakub Kicinski wrote:
> > Catching up on LWN I run across the article about softirq
> > changes, and then I noticed fresh patches in Peter's tree.
> > So probably wise for me to throw these out there.
> >
> > My (can I say Meta's?) problem is the opposite to what the RT
> > sensitive people complain about. In the current scheme once
> > ksoftirqd is woken no network processing happens until it runs.
> >
> > When networking gets overloaded - that's probably fair, the problem
> > is that we confuse latency tweaks with overload protection. We have
> > a needs_resched() in the loop condition (which is a latency tweak)
> > Most often we defer to ksoftirqd because we're trying to be nice
> > and let user space respond quickly, not because there is an
> > overload. But the user space may not be nice, and sit on the CPU
> > for 10ms+. Also the sirq's "work allowance" is 2ms, which is
> > uncomfortably close to the timer tick, but that's another story.
> >
> > We have a sirq latency tracker in our prod kernel which catches
> > 8ms+ stalls of net Tx (packets queued to the NIC but there is
> > no NAPI cleanup within 8ms) and with these patches applied
> > on 5.19 fully loaded web machine sees a drop in stalls from
> > 1.8 stalls/sec to 0.16/sec. I also see a 50% drop in outgoing
> > TCP retransmissions and ~10% drop in non-TLP incoming ones.
> > This is not a network-heavy workload so most of the rtx are
> > due to scheduling artifacts.
> >
> > The network latency in a datacenter is somewhere around neat
> > 1000x lower than scheduling granularity (around 10us).
> >
> > These patches (patch 2 is "the meat") change what we recognize
> > as overload. Instead of just checking if "ksoftirqd is woken"
> > it also caps how long we consider ourselves to be in overload,
> > a time limit which is different based on whether we yield due
> > to real resource exhaustion vs just hitting that needs_resched().
> >
> > I hope the core concept is not entirely idiotic. It'd be great
> > if we could get this in or fold an equivalent concept into ongoing
> > work from others, because due to various "scheduler improvements"
> > every time we upgrade the production kernel this problem is getting
> > worse :(
>
> Please allow me to revive this old thread.
>
> My understanding is that we want to avoid adding more heuristics here,
> preferring a consistent refactor.
>
> I would like to propose a revert of:
>
> 4cd13c21b207 softirq: Let ksoftirqd do its job
>
> the its follow-ups:
>
> 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking
>
> The problem originally addressed by 4cd13c21b207 can now be tackled
> with the threaded napi, available since:
>
> 29863d41bb6e net: implement threaded-able napi poll loop support
>
> Reverting the mentioned commit should address the latency issues
> mentioned by Jakub - I verified it solves a somewhat related problem in
> my setup - and reduces the layering of heuristics in this area.
>
> A refactor introducing uniform overload detection and proper resource
> control will be better, but I admit it's beyond me and anyway it could
> still land afterwards.
>
> Any opinion more then welcome!

Seems fine, but I think a few things need to be fixed first in
napi_threaded_poll() to enable some important features that are
currently only in net_rx_action().

2023-04-20 20:27:05

by Paolo Abeni

Subject: Re: [PATCH 0/3] softirq: uncontroversial change

On Thu, 2023-04-20 at 19:41 +0200, Eric Dumazet wrote:
> On Thu, Apr 20, 2023 at 7:24 PM Paolo Abeni <[email protected]> wrote:
> > I would like to propose a revert of:
> >
> > 4cd13c21b207 softirq: Let ksoftirqd do its job
> >
> > the its follow-ups:
> >
> > 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> > 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking
> >
> > The problem originally addressed by 4cd13c21b207 can now be tackled
> > with the threaded napi, available since:
> >
> > 29863d41bb6e net: implement threaded-able napi poll loop support
> >
> > Reverting the mentioned commit should address the latency issues
> > mentioned by Jakub - I verified it solves a somewhat related problem in
> > my setup - and reduces the layering of heuristics in this area.
> >
> > A refactor introducing uniform overload detection and proper resource
> > control will be better, but I admit it's beyond me and anyway it could
> > still land afterwards.
> >
> > Any opinion more then welcome!
>
> Seems fine, but I think few things need to be fixed first in
> napi_threaded_poll()
> to enable some important features that are currently in net_rx_action() only.

Thanks for the feedback.

I fear I'll miss some relevant bits. 

Off the top of my head I think of RPS and skb_defer_free. Both should
work even when threaded NAPI is enabled - with an additional softirq ;)
Do you think we should be able to handle both inside the napi thread?
Or do you refer to other features?
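
For the defer-free part, a rough and untested sketch of what handling
it inside the napi thread could look like - based on my recollection
of the current net/core/dev.c, so details are approximate:

/*
 * Sketch, not a tested patch: make the NAPI thread do the same
 * per-iteration housekeeping as net_rx_action(), here the deferred
 * skb free list.
 */
static int napi_threaded_poll(void *data)
{
	struct napi_struct *napi = data;
	struct softnet_data *sd;
	void *have;

	while (!napi_thread_wait(napi)) {
		for (;;) {
			bool repoll = false;

			local_bh_disable();
			sd = this_cpu_ptr(&softnet_data);

			have = netpoll_poll_lock(napi);
			__napi_poll(napi, &repoll);
			netpoll_poll_unlock(have);

			/* mirror net_rx_action(): free skbs that remote
			 * CPUs deferred to this CPU */
			skb_defer_free_flush(sd);
			local_bh_enable();

			if (!repoll)
				break;

			cond_resched();
		}
	}
	return 0;
}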

Thanks!

Paolo

2023-04-21 02:50:12

by Jason Xing

Subject: Re: [PATCH 0/3] softirq: uncontroversial change

On Fri, Apr 21, 2023 at 1:34 AM Paolo Abeni <[email protected]> wrote:
>
> Hi all,
> On Thu, 2022-12-22 at 14:12 -0800, Jakub Kicinski wrote:
> > Catching up on LWN I run across the article about softirq
> > changes, and then I noticed fresh patches in Peter's tree.
> > So probably wise for me to throw these out there.
> >
> > My (can I say Meta's?) problem is the opposite to what the RT
> > sensitive people complain about. In the current scheme once
> > ksoftirqd is woken no network processing happens until it runs.
> >
> > When networking gets overloaded - that's probably fair, the problem
> > is that we confuse latency tweaks with overload protection. We have
> > a needs_resched() in the loop condition (which is a latency tweak)
> > Most often we defer to ksoftirqd because we're trying to be nice
> > and let user space respond quickly, not because there is an
> > overload. But the user space may not be nice, and sit on the CPU
> > for 10ms+. Also the sirq's "work allowance" is 2ms, which is
> > uncomfortably close to the timer tick, but that's another story.
> >
> > We have a sirq latency tracker in our prod kernel which catches
> > 8ms+ stalls of net Tx (packets queued to the NIC but there is
> > no NAPI cleanup within 8ms) and with these patches applied
> > on 5.19 fully loaded web machine sees a drop in stalls from
> > 1.8 stalls/sec to 0.16/sec. I also see a 50% drop in outgoing
> > TCP retransmissions and ~10% drop in non-TLP incoming ones.
> > This is not a network-heavy workload so most of the rtx are
> > due to scheduling artifacts.
> >
> > The network latency in a datacenter is somewhere around neat
> > 1000x lower than scheduling granularity (around 10us).
> >
> > These patches (patch 2 is "the meat") change what we recognize
> > as overload. Instead of just checking if "ksoftirqd is woken"
> > it also caps how long we consider ourselves to be in overload,
> > a time limit which is different based on whether we yield due
> > to real resource exhaustion vs just hitting that needs_resched().
> >
> > I hope the core concept is not entirely idiotic. It'd be great
> > if we could get this in or fold an equivalent concept into ongoing
> > work from others, because due to various "scheduler improvements"
> > every time we upgrade the production kernel this problem is getting
> > worse :(
>
[...]
> Please allow me to revive this old thread.

Hi Paolo,

So good to hear this :)

>
> My understanding is that we want to avoid adding more heuristics here,
> preferring a consistent refactor.
>
> I would like to propose a revert of:
>
> 4cd13c21b207 softirq: Let ksoftirqd do its job
>
> the its follow-ups:
>
> 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking

In addition, here are some related patches mentioned in the above
commit 3c53776e29f8:
1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")

>
> The problem originally addressed by 4cd13c21b207 can now be tackled
> with the threaded napi, available since:
>
> 29863d41bb6e net: implement threaded-able napi poll loop support
>
> Reverting the mentioned commit should address the latency issues
> mentioned by Jakub - I verified it solves a somewhat related problem in
> my setup - and reduces the layering of heuristics in this area.

Sure, it does. I can also verify its usefulness under a real workload.
A few days ago I also sent a heuristic patch [1] that can bypass
ksoftirqd if the user chooses to mask some types of softirq - let the
user decide.

But I observed that if we mask some softirqs, or, equivalently,
completely revert commit 4cd13c21b207, the load goes higher and the
kernel itself may consume more time than before. This was tested
under a similar workload launched by our applications.

[1]: https://lore.kernel.org/all/[email protected]/

>
> A refactor introducing uniform overload detection and proper resource
> control will be better, but I admit it's beyond me and anyway it could
> still land afterwards.

+1

Thanks,
Jason
>
> Any opinion more then welcome!
>
> Thanks,
>
> Paolo
>

2023-04-21 09:43:13

by Paolo Abeni

Subject: Re: [PATCH 0/3] softirq: uncontroversial change

On Fri, 2023-04-21 at 10:48 +0800, Jason Xing wrote:
>
> > My understanding is that we want to avoid adding more heuristics here,
> > preferring a consistent refactor.
> >
> > I would like to propose a revert of:
> >
> > 4cd13c21b207 softirq: Let ksoftirqd do its job
> >
> > the its follow-ups:
> >
> > 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> > 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking
>
> More than this, I list some related patches mentioned in the above
> commit 3c53776e29f8:
> 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
> 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do
> not get to run")
> 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")

The first two changes replace plain timers with HR ones; they could
possibly be reverted, too, but it should not be a big deal either way.

I think instead we want to keep the third commit above, as it should be
useful when napi threaded is enabled.

Generally speaking I would keep the initial revert to the bare minimum.

> > The problem originally addressed by 4cd13c21b207 can now be tackled
> > with the threaded napi, available since:
> >
> > 29863d41bb6e net: implement threaded-able napi poll loop support
> >
> > Reverting the mentioned commit should address the latency issues
> > mentioned by Jakub - I verified it solves a somewhat related problem in
> > my setup - and reduces the layering of heuristics in this area.
>
> Sure, it is. I also can verify its usefulness in the real workload.
> Some days ago I also sent a heuristics patch [1] that can bypass the
> ksoftirqd if the user chooses to mask some type of softirq. Let the
> user decide it.
>
> But I observed that if we mask some softirqs, or we can say,
> completely revert the commit 4cd13c21b207, the load would go higher
> and the kernel itself may occupy/consume more time than before. They
> were tested under the similar workload launched by our applications.
>
> [1]: https://lore.kernel.org/all/[email protected]/

Thanks for the reference, I would have missed that patch otherwise.

My understanding is that adding more knobs here goes in the opposite
direction of what Thomas is suggesting, and IMHO the 'now mask' should
not be exposed to user-space.

>
Thanks for the feedback,

Paolo

2023-04-21 09:57:38

by Jason Xing

Subject: Re: [PATCH 0/3] softirq: uncontroversial change

On Fri, Apr 21, 2023 at 5:33 PM Paolo Abeni <[email protected]> wrote:
>
> On Fri, 2023-04-21 at 10:48 +0800, Jason Xing wrote:
> >
> > > My understanding is that we want to avoid adding more heuristics here,
> > > preferring a consistent refactor.
> > >
> > > I would like to propose a revert of:
> > >
> > > 4cd13c21b207 softirq: Let ksoftirqd do its job
> > >
> > > the its follow-ups:
> > >
> > > 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> > > 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking
> >
> > More than this, I list some related patches mentioned in the above
> > commit 3c53776e29f8:
> > 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
> > 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do
> > not get to run")
> > 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")
>
[...]
> The first 2 changes replace plain timers with HR ones, could possibly
> be reverted, too, but it should not be a big deal either way.
>
> I think instead we want to keep the third commit above, as it should be
> useful when napi threaded is enabled.
>
> Generally speaking I would keep the initial revert to the bare minimum.

I agree with you :)

>
> > > The problem originally addressed by 4cd13c21b207 can now be tackled
> > > with the threaded napi, available since:
> > >
> > > 29863d41bb6e net: implement threaded-able napi poll loop support
> > >
> > > Reverting the mentioned commit should address the latency issues
> > > mentioned by Jakub - I verified it solves a somewhat related problem in
> > > my setup - and reduces the layering of heuristics in this area.
> >
> > Sure, it is. I also can verify its usefulness in the real workload.
> > Some days ago I also sent a heuristics patch [1] that can bypass the
> > ksoftirqd if the user chooses to mask some type of softirq. Let the
> > user decide it.
> >
> > But I observed that if we mask some softirqs, or we can say,
> > completely revert the commit 4cd13c21b207, the load would go higher
> > and the kernel itself may occupy/consume more time than before. They
> > were tested under the similar workload launched by our applications.
> >
> > [1]: https://lore.kernel.org/all/[email protected]/
>
> Thanks for the reference, I would have missed that patch otherwise.
>
> My understanding is that adding more knobs here is in the opposite
> direction of what Thomas is suggesting, and IMHO the 'now mask' should
> not be exposed to user-space.

Could you please share the link to what Thomas is suggesting? I
missed it. At the beginning, I didn't have the guts to revert the
commit directly, so instead I wrote a compromise patch that, as you
said, is not that elegant. Anyway, the underlying idea is the same,
but reverting the whole commit may involve more work. I will spend
some time digging into this part.

More suggestions are also welcome :)

Thanks,
Jason

>
> >
> Thanks for the feedback,
>
> Paolo
>

2023-05-09 20:01:02

by tip-bot2 for Jacob Pan

Subject: [tip: irq/core] Revert "softirq: Let ksoftirqd do its job"

The following commit has been merged into the irq/core branch of tip:

Commit-ID: d15121be7485655129101f3960ae6add40204463
Gitweb: https://git.kernel.org/tip/d15121be7485655129101f3960ae6add40204463
Author: Paolo Abeni <[email protected]>
AuthorDate: Mon, 08 May 2023 08:17:44 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 09 May 2023 21:50:27 +02:00

Revert "softirq: Let ksoftirqd do its job"

This reverts the following commits:

4cd13c21b207 ("softirq: Let ksoftirqd do its job")
3c53776e29f8 ("Mark HI and TASKLET softirq synchronous")
1342d8080f61 ("softirq: Don't skip softirq execution when softirq thread is parking")

in a single change to avoid known bad intermediate states introduced by a
patch series reverting them individually.

Due to the mentioned commit, when the ksoftirqd threads take charge of
softirq processing, the system can experience high latencies.

In the past a few workarounds have been implemented for specific
side-effects of the initial ksoftirqd enforcement commit:

commit 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
commit 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
commit 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")
commit 3c53776e29f8 ("Mark HI and TASKLET softirq synchronous")

But the latency problem still exists in real-life workloads, see the link
below.

The reverted commit intended to solve a live-lock scenario that can now be
addressed with the NAPI threaded mode, introduced with commit 29863d41bb6e
("net: implement threaded-able napi poll loop support"), which is nowadays
in a pretty stable status.

While a complete solution to put softirq processing under nice resource
control would be preferable, that has proven to be a very hard task. In
the short term, remove the main pain point, and also simplify a bit the
current softirq implementation.

Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Jason Xing <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Sebastian Andrzej Siewior <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/netdev/[email protected]
Link: https://lore.kernel.org/r/57e66b364f1b6f09c9bc0316742c3b14f4ce83bd.1683526542.git.pabeni@redhat.com
---
kernel/softirq.c | 22 ++--------------------
1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 1b72551..807b34c 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -80,21 +80,6 @@ static void wakeup_softirqd(void)
wake_up_process(tsk);
}

-/*
- * If ksoftirqd is scheduled, we do not want to process pending softirqs
- * right now. Let ksoftirqd handle this at its own rate, to get fairness,
- * unless we're doing some of the synchronous softirqs.
- */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
-static bool ksoftirqd_running(unsigned long pending)
-{
- struct task_struct *tsk = __this_cpu_read(ksoftirqd);
-
- if (pending & SOFTIRQ_NOW_MASK)
- return false;
- return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
-}
-
#ifdef CONFIG_TRACE_IRQFLAGS
DEFINE_PER_CPU(int, hardirqs_enabled);
DEFINE_PER_CPU(int, hardirq_context);
@@ -236,7 +221,7 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
goto out;

pending = local_softirq_pending();
- if (!pending || ksoftirqd_running(pending))
+ if (!pending)
goto out;

/*
@@ -432,9 +417,6 @@ static inline bool should_wake_ksoftirqd(void)

static inline void invoke_softirq(void)
{
- if (ksoftirqd_running(local_softirq_pending()))
- return;
-
if (!force_irqthreads() || !__this_cpu_read(ksoftirqd)) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
/*
@@ -468,7 +450,7 @@ asmlinkage __visible void do_softirq(void)

pending = local_softirq_pending();

- if (pending && !ksoftirqd_running(pending))
+ if (pending)
do_softirq_own_stack();

local_irq_restore(flags);