Date: Fri, 12 Jan 2018 15:56:06 +0100
From: Frederic Weisbecker
To: Paolo Abeni
Cc: LKML, Levin Alexander, Peter Zijlstra, Linus Torvalds,
	Hannes Frederic Sowa, "Paul E. McKenney", Wanpeng Li,
	Dmitry Safonov, Thomas Gleixner, Eric Dumazet, Radu Rendec,
	Ingo Molnar, Stanislaw Gruszka, Rik van Riel, Andrew Morton,
	David Miller
Subject: Re: [RFC PATCH 2/2] softirq: Per vector thread deferment

On Fri, Jan 12, 2018 at 10:07:25AM +0100, Paolo Abeni wrote:
> On Fri, 2018-01-12 at 06:35 +0100, Frederic Weisbecker wrote:
> > Some softirq vectors can be more CPU hungry than others. Especially
> > networking may sometimes deal with packet storm and need more CPU than
> > IRQ tail can offer without inducing scheduler latencies. In this case
> > the current code defers to ksoftirqd that behaves nicer. Now this nice
> > behaviour can be bad for other IRQ vectors that usually need quick
> > processing.
> > 
> > To solve this we only defer to threading the vectors that outreached the
> > time limit on IRQ tail processing and leave the others inline on real
> > Soft-IRQs service. This is achieved using workqueues with
> > per-CPU/per-vector worklets.
> > 
> > Note ksoftirqd is not removed as it is still needed for threaded IRQs
> > mode.
> > 
> > Suggested-by: Linus Torvalds
> > Signed-off-by: Frederic Weisbecker
> > Cc: Dmitry Safonov
> > Cc: Eric Dumazet
> > Cc: Linus Torvalds
> > Cc: Peter Zijlstra
> > Cc: Andrew Morton
> > Cc: David Miller
> > Cc: Hannes Frederic Sowa
> > Cc: Ingo Molnar
> > Cc: Levin Alexander
> > Cc: Paolo Abeni
> > Cc: Paul E. McKenney
> > Cc: Radu Rendec
> > Cc: Rik van Riel
> > Cc: Stanislaw Gruszka
> > Cc: Thomas Gleixner
> > Cc: Wanpeng Li
> > ---
> >  kernel/softirq.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 87 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index fa267f7..0c817ec6 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -74,6 +74,13 @@ struct softirq_stat {
> >  
> >  static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
> >  
> > +struct vector_work {
> > +	int vec;
> > +	struct work_struct work;
> > +};
> > +
> > +static DEFINE_PER_CPU(struct vector_work[NR_SOFTIRQS], vector_work_cpu);
> > +
> >  /*
> >   * we cannot loop indefinitely here to avoid userspace starvation,
> >   * but we also don't want to introduce a worst case 1/HZ latency
> > @@ -251,6 +258,70 @@ static inline bool lockdep_softirq_start(void) { return false; }
> >  static inline void lockdep_softirq_end(bool in_hardirq) { }
> >  #endif
> >  
> > +static void vector_work_func(struct work_struct *work)
> > +{
> > +	struct vector_work *vector_work;
> > +	u32 pending;
> > +	int vec;
> > +
> > +	vector_work = container_of(work, struct vector_work, work);
> > +	vec = vector_work->vec;
> > +
> > +	local_irq_disable();
> > +	pending = local_softirq_pending();
> > +	account_irq_enter_time(current);
> > +	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> > +	lockdep_softirq_enter();
> > +	set_softirq_pending(pending & ~(1 << vec));
> > +	local_irq_enable();
> > +
> > +	if (pending & (1 << vec)) {
> > +		struct softirq_action *sa = &softirq_vec[vec];
> > +
> > +		kstat_incr_softirqs_this_cpu(vec);
> > +		trace_softirq_entry(vec);
> > +		sa->action(sa);
> > +		trace_softirq_exit(vec);
> > +	}
> > +
> > +	local_irq_disable();
> > +
> > +	pending = local_softirq_pending();
> > +	if (pending & (1 << vec))
> > +		schedule_work_on(smp_processor_id(), work);
> 
> If we check for the overrun condition here, as done in the
> __do_softirq() main loop, we could avoid ksoftirqd completely and
> probably have less code duplication.

Yes, that could be possible indeed. I guess having workqueues serializing
vector works is not much different from what ksoftirqd does. I can try that.
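Untested, but I imagine the work function would then look roughly like the
sketch below (assuming we can simply reuse the MAX_SOFTIRQ_TIME /
MAX_SOFTIRQ_RESTART limits here, or whatever per-vector limit the overrun
accounting from patch 1/2 ends up providing):

static void vector_work_func(struct work_struct *work)
{
	struct vector_work *vector_work = container_of(work, struct vector_work, work);
	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
	int max_restart = MAX_SOFTIRQ_RESTART;
	int vec = vector_work->vec;
	u32 pending;

	local_irq_disable();
	account_irq_enter_time(current);
	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
	lockdep_softirq_enter();

	for (;;) {
		/* Clear our vector's bit and run its action, as in the patch */
		pending = local_softirq_pending();
		set_softirq_pending(pending & ~(1 << vec));
		local_irq_enable();

		if (pending & (1 << vec)) {
			struct softirq_action *sa = &softirq_vec[vec];

			kstat_incr_softirqs_this_cpu(vec);
			trace_softirq_entry(vec);
			sa->action(sa);
			trace_softirq_exit(vec);
		}

		local_irq_disable();

		pending = local_softirq_pending();
		if (!(pending & (1 << vec)))
			break;

		/* Same overrun checks as the __do_softirq() main loop */
		if (time_before(jiffies, end) && !need_resched() && --max_restart)
			continue;

		/* Overrun: requeue ourselves instead of waking ksoftirqd */
		schedule_work_on(smp_processor_id(), work);
		break;
	}

	lockdep_softirq_exit();
	account_irq_exit_time(current);
	__local_bh_enable(SOFTIRQ_OFFSET);
	local_irq_enable();
}

The workqueue would still serialize the per-CPU vector works, so we shouldn't
lose the fairness that ksoftirqd gives us today.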
> > +
> > +	lockdep_softirq_exit();
> > +	account_irq_exit_time(current);
> > +	__local_bh_enable(SOFTIRQ_OFFSET);
> > +	local_irq_enable();
> > +}
> > +
> > +static int do_softirq_overrun(u32 overrun, u32 pending)
> > +{
> > +	struct softirq_action *h = softirq_vec;
> > +	int softirq_bit;
> > +
> > +	if (!overrun)
> > +		return pending;
> > +
> > +	overrun &= pending;
> > +	pending &= ~overrun;
> > +
> > +	while ((softirq_bit = ffs(overrun))) {
> > +		struct vector_work *work;
> > +		unsigned int vec_nr;
> > +
> > +		h += softirq_bit - 1;
> > +		vec_nr = h - softirq_vec;
> > +		work = this_cpu_ptr(&vector_work_cpu[vec_nr]);
> > +		schedule_work_on(smp_processor_id(), &work->work);
> > +		h++;
> > +		overrun >>= softirq_bit;
> > +	}
> > +
> > +	return pending;
> > +}
> > +
> >  asmlinkage __visible void __softirq_entry __do_softirq(void)
> >  {
> >  	struct softirq_stat *sstat = this_cpu_ptr(&softirq_stat_cpu);
> > @@ -321,10 +392,13 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >  
> >  	pending = local_softirq_pending();
> >  	if (pending) {
> > -		if (overrun || need_resched())
> > +		if (need_resched()) {
> >  			wakeup_softirqd();
> > -		else
> > -			goto restart;
> > +		} else {
> > +			pending = do_softirq_overrun(overrun, pending);
> > +			if (pending)
> > +				goto restart;
> > +		}
> >  	}
> >  
> >  	lockdep_softirq_end(in_hardirq);
> 
> This way the 'overrun' branch is not triggered if we (also) need
> resched, should we test for overrun first ?

Yes, they could get similar treatment: if need_resched() we schedule
everything that is still pending, i.e. do_softirq_overrun(pending, pending),
otherwise we take the other branch and still do a goto restart.

In fact it can even be simplified this way:

	if (need_resched())
		overrun = pending;

	pending = do_softirq_overrun(overrun, pending);
	if (pending)
		goto restart;

Thanks.
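PS: for context, the tail of __do_softirq() would then read roughly as below
(untested sketch, same names as in the patch, do_softirq_overrun() unchanged):

	pending = local_softirq_pending();
	if (pending) {
		/*
		 * Under resched pressure, defer every still-pending vector
		 * to its per-vector work; otherwise defer only the vectors
		 * that overran and keep serving the rest inline.
		 */
		if (need_resched())
			overrun = pending;

		pending = do_softirq_overrun(overrun, pending);
		if (pending)
			goto restart;
	}

	lockdep_softirq_end(in_hardirq);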