From: Linus Torvalds
Date: Thu, 11 Jan 2018 12:03:23 -0800
Subject: Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context
To: Eric Dumazet
Cc: Peter Zijlstra, Dmitry Safonov, Frederic Weisbecker, LKML,
    Dmitry Safonov <0x7f454c46@gmail.com>, Andrew Morton, David Miller,
    Frederic Weisbecker, Hannes Frederic Sowa, Ingo Molnar,
    "Levin, Alexander (Sasha Levin)", Paolo Abeni, "Paul E. McKenney",
    Radu Rendec, Rik van Riel, Stanislaw Gruszka, Thomas Gleixner,
    Wanpeng Li
References: <20180109133623.10711-1-dima@arista.com>
    <20180109133623.10711-2-dima@arista.com>
    <1515620880.3350.44.camel@arista.com>
    <20180111032232.GA11633@lerouge>
    <20180111044456.GC11633@lerouge>
    <1515681091.3039.21.camel@arista.com>
    <20180111163204.GE6176@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 11, 2018 at 11:48 AM, Eric Dumazet wrote:
> That was the purpose of the last patch: As soon as ksoftirqd is scheduled
> (by some kind of jitter in the 99,000 pps workload, or antagonist wakeup),
> we then switch to a mode where process scheduler can make decisions
> based on threads prios and cpu usage.

Yeah, but that really screws up everybody else.

It really is a soft *interrupt*. That was what it was designed for.
The thread handling is not primary, it's literally a fallback to avoid
complete starvation. The fact that networking has now - for several
years - tried to make it some kind of thread and get fairness with
user threads is all entirely antithetical to what softirq was designed
for.

> Then, as soon as the load was able to finish in its quantum the
> pending irqs, we re-enter the mode
> where softirqs are immediately serviced.

Except that's not at all how the code works.

As I pointed out, the softirq thread can be scheduled away, but the
"softirq_running()" will still return true - and the networking code
has now screwed up all the *other* softirqs too!

I really suspect that what networking wants is more like the
workqueues. Or at least more isolation between different softirq
users, but that's fairly fundamentally hard, given how softirqs are
designed.

My dvb-fixing patch was an *extremely* stupid version of that "more
isolation". But it really is a complete hack, saying that tasklets are
special and shouldn't trigger ksoftirqd. They're not really all that
special, but it at least isolated the USB/DVB usage from the
"networking wants softirqd" problem.

               Linus
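
For readers following the thread, here is a toy user-space model of the
gating problem described above. It is not kernel code. The softirq
vector names follow the kernel's enum, but struct cpu_model,
MODEL_NOW_MASK, defer_global(), defer_with_exemption() and
model_self_check() are hypothetical names invented for this sketch, and
treating HI_SOFTIRQ and TASKLET_SOFTIRQ as "the tasklets" is an
assumption about what the exemption would cover. The first function
models one global gate: once ksoftirqd is runnable, every pending
softirq waits for the thread, even if the thread has been scheduled
away. The second models the "tasklets are special" hack.

```c
#include <assert.h>
#include <stdbool.h>

/* Softirq vector numbers, in the order of the kernel's enum. */
enum {
    HI_SOFTIRQ, TIMER_SOFTIRQ, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ,
    BLOCK_SOFTIRQ, IRQ_POLL_SOFTIRQ, TASKLET_SOFTIRQ, SCHED_SOFTIRQ,
    HRTIMER_SOFTIRQ, RCU_SOFTIRQ, NR_SOFTIRQS
};

/* Hypothetical per-CPU model state. "Runnable" means the thread has
 * been woken; it says nothing about whether it is on the CPU now. */
struct cpu_model {
    bool ksoftirqd_runnable;
};

/* Hypothetical mask: vectors that never wait for ksoftirqd.
 * Assumption: the two tasklet vectors. */
#define MODEL_NOW_MASK ((1u << HI_SOFTIRQ) | (1u << TASKLET_SOFTIRQ))

/* One global gate: if ksoftirqd is runnable, defer everything,
 * regardless of which vectors are pending. */
bool defer_global(const struct cpu_model *cpu, unsigned int pending)
{
    (void)pending; /* the pending mask plays no part in the decision */
    return cpu->ksoftirqd_runnable;
}

/* Same gate, except that a pending tasklet vector forces immediate
 * processing in interrupt context. */
bool defer_with_exemption(const struct cpu_model *cpu, unsigned int pending)
{
    if (pending & MODEL_NOW_MASK)
        return false;
    return cpu->ksoftirqd_runnable;
}

/* Usage: networking load has woken ksoftirqd, then a tasklet fires. */
void model_self_check(void)
{
    struct cpu_model cpu = { .ksoftirqd_runnable = true };
    unsigned int tasklet = 1u << TASKLET_SOFTIRQ;
    unsigned int net_rx = 1u << NET_RX_SOFTIRQ;

    assert(defer_global(&cpu, tasklet));          /* tasklet is stuck */
    assert(!defer_with_exemption(&cpu, tasklet)); /* tasklet runs now */
    assert(defer_with_exemption(&cpu, net_rx));   /* net rx still waits */
}
```

The exemption only carves out one class of users. Every other vector
still shares the single gate, which is why the mail calls it a hack
rather than real isolation.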