Received: by 2002:a25:ca44:0:0:0:0:0 with SMTP id a65csp324714ybg; Tue, 28 Jul 2020 07:04:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy1sUD/GjzGUQyassjnP80M2Ju0WDjF2OklxaTPZJzkYRx0kkURWlWKVaANcqCjEzYrRTqi X-Received: by 2002:a05:6402:337:: with SMTP id q23mr26773321edw.63.1595945081194; Tue, 28 Jul 2020 07:04:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595945081; cv=none; d=google.com; s=arc-20160816; b=NcBloUdpqQmc9jg7XdsXBWxcgjDVnr3SPW4zLxI1yZBBu4X8t6VVqXzkI0410+sLPQ BiSXnapqMggzIXr+zAdENrFH/fWTFIaWnCzuTzcXJYBYsHJXeVeP/ZBYwrANGkJwal5d r1MD0AzmKZ1X559OjsvQtJKUQN+iF/lrp09+ALKH59cwtdY1jMculBoK4Ils09UYKcWn JUuaXUI+2cjraugoBUpcHvzoAGSgtCsR5QktUlsxcCqWcm9Z1RAqghJSR/k4XCMasohq YP9mCwb+OpjekKhuLbvU4cG1UV8K8/xeJnij3CuxNcPtc7aYX3MWkuq2m5gODqQu7XTF Cehg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=jL69nX+B0Hx8u7JYf58E+XYay/YU8aCHKDCqftm6Jow=; b=MHf9iPa3vyJxUQlpuB07eDufwZUyB1e0CEVkOSCTPqIyGbgYZb9QDDqBmeIOwerZEx ieTajxURgKtIoyeynh1EsjCketLSc3i+DleLcXO8nbcP3ZFd4aojsZl4qn6m9eihRWA4 XI6kJW5nKX/dn9uFt3feXJPQe4rQBBNf7JlZ18rnSNMZowTLiTAez+NYlGiNeSai2Dko 56D0h2Z14JADCXIJEUC6uc0oQB1sC5G9StDuWkRP9oiQe/SMLRqrCw8mZBAr1lN4yAJK HoWgJXFBDFFmfQQhXdJlmiJg8CTuYXnKjPRqLGssyXzfZkjRjAtJ92G2ooldeIUqJCxY HVGw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=k6wjXpFR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i23si7926122ejj.300.2020.07.28.07.04.19; Tue, 28 Jul 2020 07:04:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=k6wjXpFR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730290AbgG1OC6 (ORCPT + 99 others); Tue, 28 Jul 2020 10:02:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730155AbgG1OC5 (ORCPT ); Tue, 28 Jul 2020 10:02:57 -0400 Received: from mail-oo1-xc44.google.com (mail-oo1-xc44.google.com [IPv6:2607:f8b0:4864:20::c44]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85987C061794 for ; Tue, 28 Jul 2020 07:02:57 -0700 (PDT) Received: by mail-oo1-xc44.google.com with SMTP id n24so1266595ooc.3 for ; Tue, 28 Jul 2020 07:02:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=jL69nX+B0Hx8u7JYf58E+XYay/YU8aCHKDCqftm6Jow=; b=k6wjXpFRtv5x15wt6DJdM0L4rO3whmGRX2Q8SaF9R2IG8WNwjhtcJVE3hk/WdhZJJ0 7P7dWRayCH5GJ1KKywwpc6ussSFtA4ZLH9yKZnb3afhgBylKUHa7dH5fMkPvZ983d+RR WmtO5vyDWgRNuxyYeoLVyD4mSi++/pz6bG6qQXRY74CMSQSBLlvjfQacETc/zBbCyC3T qXvpvcEUzddgPHE5tXX6UPMGo1OCcOBM/Nf5TikK8CeGE1mfzxCOYkR5vjkdvxuX4lfu DcQTkfoLS2fmblyYlYtBVxo8Y6hWczqtEuthfFxdoBK3WcxZvIHQySZ9Yr4rvCiFJyCM L+3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=jL69nX+B0Hx8u7JYf58E+XYay/YU8aCHKDCqftm6Jow=; b=HmbnonYI1IeK0pXgSPdLl4PkeAaeJ+KMu/o5OttRn7xNK5S/8Co5ZOvE5hLa3cgdmN zbY5m8pPqOvUEvow9sZL2Yx7gd8D5ausA7MwJskgcXBXqhbQ6DMPrV0OvMikQ6G30l08 Uw3NquletrHHgUbQvhPqzp2NCcbtmtiUnuH4TsjnbhUp2Jf2SOAybEb5dgICzd0yojN3 8rR35/rJLR5WH6+UnZqiZlV9WgaLNUK2h1vvoVIwSlsW1AOu3EbDzUg8fT62j5uICjfA iKzScuXzCPeQYDIklJesUddIjVgMceKfbphrEb9lCue/2n+JBP/F+rfSF0nQF/hGozdx IYTw== X-Gm-Message-State: AOAM532vyTCBdcAu3dJdPFvMeD6gocuPAqhzeO+Jjft3bH7hgWLseykX HukWbQILV6gNixZMoDDNznObrfqGDkpil9Szdds= X-Received: by 2002:a4a:d4d9:: with SMTP id r25mr25151088oos.51.1595944976680; Tue, 28 Jul 2020 07:02:56 -0700 (PDT) MIME-Version: 1.0 References: <1595601083-10183-1-git-send-email-qianjun.kernel@gmail.com> <87sgddaru7.fsf@nanos.tec.linutronix.de> In-Reply-To: <87sgddaru7.fsf@nanos.tec.linutronix.de> From: jun qian Date: Tue, 28 Jul 2020 22:02:45 +0800 Message-ID: Subject: Re: [PATCH V4] Softirq:avoid large sched delay from the pending softirqs To: Thomas Gleixner Cc: peterz@infradead.org, will@kernel.org, luto@kernel.org, linux-kernel@vger.kernel.org, Yafang Shao , Uladzislau Rezki Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 27, 2020 at 11:41 PM Thomas Gleixner wrote: > > Qian, > > qianjun.kernel@gmail.com writes: > > /* > > * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times, > > * but break the loop if need_resched() is set or after 2 ms. > > - * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in > > - * certain cases, such as stop_machine(), jiffies may cease to > > - * increment and so we need the MAX_SOFTIRQ_RESTART limit as > > - * well to make sure we eventually return from this method. > > + * In the loop, if the processing time of the softirq has exceeded 2 > > + * milliseconds, we also need to break the loop to wakeup the > > ksofirqd. > > You are removing the MAX_SOFTIRQ_RESTART limit explanation and I rather > have MAX_SOFTIRQ_TIME_NS there than '2 milliseconds' in case the value > gets adjusted later on. Also while sched_clock() is granular on many > systems it still can be jiffies based and then the above problem > persists. > > > @@ -299,6 +298,19 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) > > } > > h++; > > pending >>= softirq_bit; > > + > > + /* > > + * the softirq's action has been running for too much time > > + * so it may need to wakeup the ksoftirqd > > + */ > > + if (need_resched() && sched_clock() > end) { > > + /* > > + * Ensure that the remaining pending bits are > > + * handled. > > + */ > > + or_softirq_pending(pending << (vec_nr + 1)); > > To or the value interrupts need to be disabled because otherwise you can > lose a bit when an interrupt happens in the middle of the RMW operation > and raises a softirq which is not in @pending and not in the per CPU > local softirq pending storage. > I can't understand the problem described above, could you explain in detail. thanks. > There is another problem. Assume bit 0 and 1 are pending when the > processing starts. Now it breaks out after bit 0 has been handled and > stores back bit 1 as pending. Before ksoftirqd runs bit 0 gets raised > again. ksoftirqd runs and handles bit 0, which takes more than the > timeout. As a result the bit 0 processing can starve all other softirqs. > May I use a percpu val to record the order of processing softirq, by the order val, when it is in ksoftirqd we can process the pending softirq in order. In the scenario you described above, before ksoftirqd runs, bit 0 gets raised again, ksoftirqd runs and handles bit 1 by the percpu val. just like a ring. Looking forward to your suggestions Thanks > So this needs more thought. > > Thanks, > > tglx