Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp171282pxk; Wed, 9 Sep 2020 02:11:34 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxbWjH0saOpygSpcqNSs2/n7O9kChZxZYVw1xKDBchiuhKO6l3bRRDnx5Q9kWgVdM1K/yrO X-Received: by 2002:aa7:c693:: with SMTP id n19mr3186502edq.101.1599642694491; Wed, 09 Sep 2020 02:11:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599642694; cv=none; d=google.com; s=arc-20160816; b=L+zhYzLW263o+jrYzKLP4hSsaqV8i2DD0pP55sjHbTXFEg2XkEQo9Ria3XNgq+6Qhy TMVYzx7X0OrnyPW7V/FSgK0zZbjneF5AkhvZoQ9RuHOVSmsKtLiQnEOX2u7GhGxUAZGk 0LiCu9CwCcPN7RffnIv3R8B0LqZaFqAsQIBq+ZEXKllBkkWNyuMY2QSWYF5aq2DDIhRK 3m9SxgCfWzVoyj56fKB1TGpJzy9dDK6kdnV+8EbS/DA2+fhAb9nmW9WBkAMkaw2qr/0p fDyfje65Ij6iB1GNwtTuTPXnnRhrqwLQGueD7p7mNUbsRsMPz9DCB6sm2Py3Vdpmjsdu AjTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=Lp4vuPsvFxDFpD3RPxxQA9GDt79f5fSUFEakGXQllkE=; b=rgM/4KT94VuJrbWehIBiHDfZb9S2Uo8xUqFjroIxpchpT060702DPfajolgMQ6Hbdy wf7gPqv1k7LQ5Bk++ruIUyaX3cbSG++tR1cohHdWYHRmEEep5LXpPfowPtD8UI4qCZVO KZfljypT+3p6du+qU4YRAXo9myO7LHWbOR7gvYKlIrTmcU4CFqggZyM+bAtWyqBW2tRC 2+QO6NHgV59zpYrmS4nku+z0659STeLxOSr6cFc5PsQnymPrMUW5lVoegZTv8grl9V9h fzud5Mp5Lt1f09iBvgOh25blgEOLiUU+CB8cKr1nz/qLAgze0ZicSH1zyFlj8InhuiVG IjrQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=X8bZqPNp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e23si1134440ejl.643.2020.09.09.02.11.11; Wed, 09 Sep 2020 02:11:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=X8bZqPNp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729773AbgIIJJt (ORCPT + 99 others); Wed, 9 Sep 2020 05:09:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727085AbgIIJJq (ORCPT ); Wed, 9 Sep 2020 05:09:46 -0400 Received: from mail-oi1-x241.google.com (mail-oi1-x241.google.com [IPv6:2607:f8b0:4864:20::241]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2716CC061573 for ; Wed, 9 Sep 2020 02:09:46 -0700 (PDT) Received: by mail-oi1-x241.google.com with SMTP id i17so1630063oig.10 for ; Wed, 09 Sep 2020 02:09:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=Lp4vuPsvFxDFpD3RPxxQA9GDt79f5fSUFEakGXQllkE=; b=X8bZqPNpcH6+/UUTOlrjCUxpBIHRpxYnhOVxM9oVc76gpNUIHtMobQJxnLGp2RHGb4 Ag+2MVXMcEdEA5t+cj1rLZk0Loc4qT1G9mve+/e3J57GPskWXfTbKzlUAspPm1JYQSV/ 8xdRiRdynXSrUFPoqwxTCuuPf2R5OVVoYvbt1xSfcQ4urDIBFVnF9H6kmQnrjb3bZOyO 2y1f9ogfYDKHVe1Fg/AckWNNvnU8JFinD76QEsugAs6t7Qavjw8fFb6SnfZVdzwKTI9l /CTKR4aq+vjinFMOaI1/hFpbqqSEsAgOOlBvbHYTYgit2N1FeQnGeWkpHsw10i0o12vY 2PXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=Lp4vuPsvFxDFpD3RPxxQA9GDt79f5fSUFEakGXQllkE=; b=G6s03++dpaWIIbTa2JXIaDukJGQL0UioOdG3mfOgu5B2zGACicJTWU6uIbB/lpmBGe SaHnu9iTX9DoBJAqCd+PZrCji74KOXSp1U8XWaHEI1SBs4+ummkbSyUGaVpFFWP1MTfn KLGXnnNiqXao00LlfMjwD2nMyAHl4VhfuvH8r3IzFqKtWDLmhei2bPfEhKiUT34eDnHJ xG0O3gJFkcwbDJNrBMDzprHmARr66CdWpzW6jUb+VVIcxcwcX78mD6Ux45WxRMBYR5Hk PN61XdSzHe9Mdc91gViJTMo9dJ/C51d2+ddAF3trZao2kM0V5YnwoQpAirWEPUfZXOGG bWDg== X-Gm-Message-State: AOAM5331i9jwk+GWCMgalyDgnANOdUJb4oRCEpi+86c0RxOv0vESdz2x bd+Lp0fLeJ0ICOauNvcfNrc= X-Received: by 2002:aca:e144:: with SMTP id y65mr24383oig.101.1599642584322; Wed, 09 Sep 2020 02:09:44 -0700 (PDT) Received: from localhost.localdomain ([50.236.19.102]) by smtp.gmail.com with ESMTPSA id 187sm342993oie.42.2020.09.09.02.09.37 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Sep 2020 02:09:43 -0700 (PDT) From: qianjun.kernel@gmail.com To: tglx@linutronix.de, peterz@infradead.org, will@kernel.org, luto@kernel.org, linux-kernel@vger.kernel.org Cc: laoar.shao@gmail.com, urezki@gmail.com, jun qian Subject: [PATCH V6 1/1] Softirq:avoid large sched delay from the pending softirqs Date: Wed, 9 Sep 2020 17:09:31 +0800 Message-Id: <20200909090931.8836-1-qianjun.kernel@gmail.com> X-Mailer: git-send-email 2.20.1 (Apple Git-117) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: jun qian When get the pending softirqs, it need to process all the pending softirqs in the while loop. If the processing time of each pending softirq is need more than 2 msec in this loop, or one of the softirq will running a long time, according to the original code logic, it will process all the pending softirqs without wakeuping ksoftirqd, which will cause a relatively large scheduling delay on the corresponding CPU, which we do not wish to see. The patch will check the total time to process pending softirq, if the time exceeds 2 ms we need to wakeup the ksofirqd to aviod large sched delay. Signed-off-by: jun qian --- kernel/softirq.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 69 insertions(+), 14 deletions(-) diff --git a/kernel/softirq.c b/kernel/softirq.c index c4201b7f..1f696c8 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -25,6 +25,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -199,18 +200,17 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt) /* * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times, - * but break the loop if need_resched() is set or after 2 ms. - * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in - * certain cases, such as stop_machine(), jiffies may cease to - * increment and so we need the MAX_SOFTIRQ_RESTART limit as - * well to make sure we eventually return from this method. + * but break the loop if need_resched() is set or after MAX_SOFTIRQ_TIME_NS + * ns. In the loop, if the processing time of the softirq has exceeded + * MAX_SOFTIRQ_TIME_NS ns, we also need to break the loop to wakeup the + * ksofirqd. * * These limits have been established via experimentation. * The two things to balance is latency against fairness - * we want to handle softirqs as soon as possible, but they * should not be able to lock up the box. */ -#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2) +#define MAX_SOFTIRQ_TIME_NS 2000000 #define MAX_SOFTIRQ_RESTART 10 #ifdef CONFIG_TRACE_IRQFLAGS @@ -246,15 +246,20 @@ static inline void lockdep_softirq_end(bool in_hardirq) static inline void lockdep_softirq_end(bool in_hardirq) { } #endif +DEFINE_PER_CPU(__u32, pending_new_flag); +DEFINE_PER_CPU(__u32, pending_next_bit); +#define SOFTIRQ_PENDING_MASK ((1UL << NR_SOFTIRQS) - 1) + asmlinkage __visible void __softirq_entry __do_softirq(void) { - unsigned long end = jiffies + MAX_SOFTIRQ_TIME; + u64 end = sched_clock() + MAX_SOFTIRQ_TIME_NS; unsigned long old_flags = current->flags; int max_restart = MAX_SOFTIRQ_RESTART; struct softirq_action *h; bool in_hardirq; - __u32 pending; - int softirq_bit; + __u32 pending, pending_left, pending_new; + int softirq_bit, next_bit; + unsigned long flags; /* * Mask out PF_MEMALLOC as the current task context is borrowed for the @@ -277,10 +282,33 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) h = softirq_vec; - while ((softirq_bit = ffs(pending))) { - unsigned int vec_nr; + next_bit = per_cpu(pending_next_bit, smp_processor_id()); + per_cpu(pending_new_flag, smp_processor_id()) = 0; + + pending_left = pending & + (SOFTIRQ_PENDING_MASK << next_bit); + pending_new = pending & + (SOFTIRQ_PENDING_MASK >> (NR_SOFTIRQS - next_bit)); + + /* + * In order to be fair, we shold process the pengding bits by the + * last processing order. + */ + while ((softirq_bit = ffs(pending_left)) || + (softirq_bit = ffs(pending_new))) { int prev_count; + unsigned int vec_nr = 0; + /* + * when the left pengding bits have been handled, we should + * to reset the h to softirq_vec. + */ + if (!ffs(pending_left)) { + if (per_cpu(pending_new_flag, smp_processor_id()) == 0) { + h = softirq_vec; + per_cpu(pending_new_flag, smp_processor_id()) = 1; + } + } h += softirq_bit - 1; vec_nr = h - softirq_vec; @@ -298,17 +326,44 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) preempt_count_set(prev_count); } h++; - pending >>= softirq_bit; + + if (ffs(pending_left)) + pending_left >>= softirq_bit; + else + pending_new >>= softirq_bit; + + /* + * the softirq's action has been run too much time, + * so it may need to wakeup the ksoftirqd + */ + if (need_resched() && sched_clock() > end) { + /* + * Ensure that the remaining pending bits will be + * handled. + */ + local_irq_save(flags); + if (ffs(pending_left)) + or_softirq_pending((pending_left << (vec_nr + 1)) | + pending_new); + else + or_softirq_pending(pending_new << (vec_nr + 1)); + local_irq_restore(flags); + per_cpu(pending_next_bit, smp_processor_id()) = vec_nr + 1; + break; + } } + /* reset the pending_next_bit */ + per_cpu(pending_next_bit, smp_processor_id()) = 0; + if (__this_cpu_read(ksoftirqd) == current) rcu_softirq_qs(); local_irq_disable(); pending = local_softirq_pending(); if (pending) { - if (time_before(jiffies, end) && !need_resched() && - --max_restart) + if (!need_resched() && --max_restart && + sched_clock() <= end) goto restart; wakeup_softirqd(); -- 1.8.3.1