Date: Thu, 8 Feb 2018 18:44:52 +0100
From: Sebastian Andrzej Siewior
To: Frederic Weisbecker
Cc: LKML, Levin Alexander, Peter Zijlstra, Mauro Carvalho Chehab,
	Linus Torvalds, Hannes Frederic Sowa, "Paul E. McKenney",
	Wanpeng Li, Dmitry Safonov, Thomas Gleixner, Andrew Morton,
	Paolo Abeni, Radu Rendec, Ingo Molnar, Stanislaw Gruszka,
	Rik van Riel, Eric Dumazet, David Miller
Subject: Re: [RFC PATCH 2/4] softirq: Per vector deferment to workqueue
Message-ID: <20180208174450.qjvjy752jf4ngt2g@breakpoint.cc>
References: <1516376774-24076-1-git-send-email-frederic@kernel.org>
	<1516376774-24076-3-git-send-email-frederic@kernel.org>
In-Reply-To: <1516376774-24076-3-git-send-email-frederic@kernel.org>

On 2018-01-19 16:46:12 [+0100], Frederic Weisbecker wrote:
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index c8c6841..becb1d9 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -62,6 +62,19 @@ const char * const softirq_to_name[NR_SOFTIRQS] = {
…
> +static void vector_work_func(struct work_struct *work)
> +{
> +	struct vector *vector = container_of(work, struct vector, work);
> +	struct softirq *softirq = this_cpu_ptr(&softirq_cpu);
> +	int vec_nr = vector->nr;
> +	int vec_bit = BIT(vec_nr);
> +	u32 pending;
> +
> +	local_irq_disable();
> +	pending = local_softirq_pending();
> +	account_irq_enter_time(current);
> +	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> +	lockdep_softirq_enter();
> +	set_softirq_pending(pending & ~vec_bit);
> +	local_irq_enable();
> +
> +	if (pending & vec_bit) {
> +		struct softirq_action *sa = &softirq_vec[vec_nr];
> +
> +		kstat_incr_softirqs_this_cpu(vec_nr);
> +		softirq->work_running = 1;
> +		trace_softirq_entry(vec_nr);
> +		sa->action(sa);

You invoke the softirq handler while BH is disabled (not wrong, I just
state the obvious). That means the scheduler can't preempt/interrupt
the workqueue/BH handler while it is invoked, so it has to wait until
it completes. In do_softirq_workqueue() you schedule multiple workqueue
items (one for each softirq vector), which is unnecessary because they
can't preempt one another and are invoked in the order they were
enqueued. So it would be enough to enqueue a single item because it is
serialized after all. One work_struct per CPU with a
cond_resched_rcu_qs() while switching from one vector to the next
should accomplish what you have here now (I am not sure that
cond_resched after each vector is needed); a rough sketch of this
follows below. But…

> +		trace_softirq_exit(vec_nr);
> +		softirq->work_running = 0;
> +	}
> +
> +	local_irq_disable();
> +
> +	pending = local_softirq_pending();
> +	if (pending & vec_bit)
> +		schedule_work_on(smp_processor_id(), &vector->work);

… on a system that is using system_wq a lot, it might introduce a
certain latency until your softirq worker gets its turn. The workqueue
core will spawn new workers if the current worker schedules out, but
until that happens you have to wait. I am not sure whether this is
intended or whether it might be a problem; I think you could argue
either way, depending on what you currently consider more important.

Further, schedule_work_on(x, …) does not guarantee that the work item
is invoked on CPU x: it tries, but if CPU x goes down due to CPU
hotplug, the work item will be moved to a random CPU. For that reason
we have work_on_cpu_safe(), but you don't want to use that / flush that
workqueue while in here.

May I instead suggest to stick to ksoftirqd? So you run in softirq
context (after return from IRQ) and if it takes too long, you offload
the vector to ksoftirqd instead. You may want to play with the metric
on which you decide when to switch to ksoftirqd / how you account how
long a vector runs; a sketch of that follows below as well.

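Completely untested sketch of the single-work-item idea. The function
name and the ->work member in struct softirq are made up here, the
->pending_work_mask bookkeeping is left out, everything else reuses
what the patch already adds:

static void softirq_work_func(struct work_struct *work)
{
	struct softirq *softirq = this_cpu_ptr(&softirq_cpu);
	u32 pending;

	local_irq_disable();
	/* Claim every pending vector at once. */
	pending = local_softirq_pending();
	set_softirq_pending(0);
	local_irq_enable();

	while (pending) {
		int vec_nr = __ffs(pending);
		struct softirq_action *sa = &softirq_vec[vec_nr];

		pending &= ~BIT(vec_nr);

		local_irq_disable();
		account_irq_enter_time(current);
		__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
		lockdep_softirq_enter();
		local_irq_enable();

		kstat_incr_softirqs_this_cpu(vec_nr);
		trace_softirq_entry(vec_nr);
		sa->action(sa);
		trace_softirq_exit(vec_nr);

		local_irq_disable();
		lockdep_softirq_exit();
		account_irq_exit_time(current);
		__local_bh_enable(SOFTIRQ_OFFSET);
		local_irq_enable();

		/* BH is on again, give the scheduler and RCU a chance. */
		cond_resched_rcu_qs();
	}

	/* Raised in the meantime: take another pass. */
	if (local_softirq_pending())
		schedule_work_on(smp_processor_id(), &softirq->work);
}

The requeue at the end would replace your per-vector
schedule_work_on() calls.
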
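And an equally untested sketch of the ksoftirqd variant.
softirq_handle_vectors() and the one-millisecond budget are invented
for illustration; it is assumed to be called like __do_softirq() (BH
disabled, interrupts enabled, living in kernel/softirq.c so it can see
wakeup_softirqd()):

#define SOFTIRQ_BUDGET_NS	NSEC_PER_MSEC

static void softirq_handle_vectors(u32 pending)
{
	u64 start = sched_clock();

	while (pending) {
		int vec_nr = __ffs(pending);
		struct softirq_action *sa = &softirq_vec[vec_nr];

		pending &= ~BIT(vec_nr);

		kstat_incr_softirqs_this_cpu(vec_nr);
		trace_softirq_entry(vec_nr);
		sa->action(sa);
		trace_softirq_exit(vec_nr);

		/* Over budget: defer the remainder to ksoftirqd. */
		if (pending && sched_clock() - start > SOFTIRQ_BUDGET_NS) {
			local_irq_disable();
			or_softirq_pending(pending);
			local_irq_enable();
			wakeup_softirqd();
			break;
		}
	}
}

That way the common case stays in softirq context and only an
expensive vector ends up in a thread which the scheduler can place and
throttle like any other task.
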
> +	else
> +		softirq->pending_work_mask &= ~vec_bit;
> +
> +	lockdep_softirq_exit();
> +	account_irq_exit_time(current);
> +	__local_bh_enable(SOFTIRQ_OFFSET);
> +	local_irq_enable();
> +}

Sebastian