MIME-Version: 1.0
In-Reply-To: <CANn89iJzekwx_Hs0t0O==+gwAfqMyVHBg=gemayZZJXb4bYJdQ@mail.gmail.com>
References: <20180109133623.10711-1-dima@arista.com> <20180109133623.10711-2-dima@arista.com>
 <CANn89iK3M97MN0Pf3nXb+UAqqhUWOdSthHRBTYCwP75Ax_hO8Q@mail.gmail.com>
 <1515620880.3350.44.camel@arista.com> <CA+55aFyKKt4_5RT9RT8ZH-W26hC8=AvRYf8YxBm98dGSWwFs8g@mail.gmail.com>
 <20180111032232.GA11633@lerouge> <CA+55aFx_3zwQJ0YbDCL4YxpWEWhcEZfJnn42LzWBWDi3h1VdGA@mail.gmail.com>
 <20180111044456.GC11633@lerouge> <1515681091.3039.21.camel@arista.com>
 <CANn89i+mVmzrZ14Kttt=J0wsDOMHhm8CHiMRLQwEZXMxiVpftg@mail.gmail.com>
 <20180111163204.GE6176@hirez.programming.kicks-ass.net> <CA+55aFwc3CP-sKOyVvaLab3azmr3LnPfADnGJXDcxYz9dT75=A@mail.gmail.com>
 <CANn89i+ZTLtA5ZLRAbCgM_Cx-2xiwRbDXM4x=-QiM78r5ptcqg@mail.gmail.com>
 <CA+55aFyZPzkjwkLXWWXp3KUfLD7MUtGxSu1Q6vc0O5i9Ea6ZKw@mail.gmail.com> <CANn89iJzekwx_Hs0t0O==+gwAfqMyVHBg=gemayZZJXb4bYJdQ@mail.gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 11 Jan 2018 12:03:23 -0800
Message-ID: <CA+55aFx+1tFpnLBXjZKoYsMMVPakeP8nycyfMpF7agUXz_kGkQ@mail.gmail.com>
Subject: Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context
To: Eric Dumazet <edumazet@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
        Dmitry Safonov <dima@arista.com>,
        Frederic Weisbecker <frederic@kernel.org>,
        LKML <linux-kernel@vger.kernel.org>,
        Dmitry Safonov <0x7f454c46@gmail.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        David Miller <davem@davemloft.net>,
        Frederic Weisbecker <fweisbec@gmail.com>,
        Hannes Frederic Sowa <hannes@stressinduktion.org>,
        Ingo Molnar <mingo@kernel.org>,
        "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>,
        Paolo Abeni <pabeni@redhat.com>,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        Radu Rendec <rrendec@arista.com>,
        Rik van Riel <riel@redhat.com>,
        Stanislaw Gruszka <sgruszka@redhat.com>,
        Thomas Gleixner <tglx@linutronix.de>,
        Wanpeng Li <wanpeng.li@hotmail.com>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org

On Thu, Jan 11, 2018 at 11:48 AM, Eric Dumazet <edumazet@google.com> wrote:
> That was the purpose on the last patch : As soon as ksoftirqd is scheduled
> (by some kind of jitter in the 99,000 pps workload, or antagonist wakeup),
> we then switch to a mode where process scheduler can make decisions
> based on threads prios and cpu usage.

Yeah, but that really screws up everybody else.

It really is a soft *interrupt*. That was what it was designed for.
The thread handling is not primary, it's literally a fallback to avoid
complete starvation.

The fact that networking has now - for several years - tried to make
it some kind of thread and get fairness with user threads is all
entirely antithetical to what softirq was designed for.

> Then, as soon as the load was able to finish in its quantum the
> pending irqs, we re-enter the mode
> where softirq are immediately serviced.

Except that's not at all how the code works.

As I pointed out, the softirq thread can be scheduled away, but the
"softirq_running()" will still return true - and the networking code
has now screwed up all the *other* softirqs too!
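
To make that concrete, here is a toy user-space model of the behaviour
described above. It is a sketch, not kernel code: every name in it
(toy_cpu, toy_softirq_running(), toy_irq_exit(), the TOY_* constants)
is made up for illustration, and it assumes that the "running" check
only asks whether the softirq thread is awake, not whether it is
actually on the CPU.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not kernel symbols. */
enum toy_task_state { TOY_SLEEPING, TOY_RUNNABLE, TOY_ON_CPU };

/* Two example softirq vectors, one bit each. */
enum { TOY_NET_RX = 1u << 0, TOY_TASKLET = 1u << 1 };

struct toy_cpu {
    enum toy_task_state ksoftirqd; /* state of the per-CPU softirq thread */
    unsigned int pending;          /* bitmask of raised softirq vectors */
};

/*
 * Assumption: "running" means "woken up and not asleep". A thread that
 * has been preempted (TOY_RUNNABLE) still counts as running.
 */
static bool toy_softirq_running(const struct toy_cpu *cpu)
{
    return cpu->ksoftirqd != TOY_SLEEPING;
}

/*
 * Models the end of a hard interrupt. Returns the mask of vectors that
 * were handled inline; 0 means everything was left for the thread.
 */
static unsigned int toy_irq_exit(struct toy_cpu *cpu)
{
    unsigned int handled;

    if (toy_softirq_running(cpu))
        return 0;               /* defer *all* vectors, not just networking */

    handled = cpu->pending;
    cpu->pending = 0;
    return handled;
}
```

In this model, once a networking burst has woken the thread, a
TOY_TASKLET raised afterwards stays pending for as long as the thread
sits in TOY_RUNNABLE, even though it has nothing to do with networking.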

I really suspect that what networking wants is more like the
workqueues. Or at least more isolation between different softirq
users, but that's fairly fundamentally hard, given how softirqs are
designed.

My dvb-fixing patch was an *extremely* stupid version of that "more
isolation".  But it really is a complete hack, saying that tasklets
are special and shouldn't trigger ksoftirqd. They're not really all
that special, but it at least isolated the USB/DVB usage from the
"networking wants softirqd" problem.

                    Linus