Date: Mon, 9 Jan 2023 11:12:41 -0800
From: Jakub Kicinski
To: Eric Dumazet, Peter Zijlstra
Cc: tglx@linutronix.de, jstultz@google.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] softirq: don't yield if only expedited handlers are pending
Message-ID: <20230109111241.6ed3a64a@kernel.org>
References: <20221222221244.1290833-1-kuba@kernel.org> <20221222221244.1290833-4-kuba@kernel.org>

On Mon, 9 Jan 2023 11:16:45 +0100 Eric Dumazet wrote:
> > On Thu, Dec 22, 2022 at 02:12:44PM -0800, Jakub Kicinski wrote:
> > > In networking we try to keep Tx packet queues small, so we limit
> > > how many bytes a socket may packetize and queue up. Tx completions
> > > (from NAPI) notify the sockets when packets have left the system
> > > (NIC Tx completion) and the socket schedules a tasklet to queue
> > > the next batch of frames.
> > >
> > > This leads to a situation where we go thru the softirq loop twice.
> > > First round we have pending = NET (from the NIC IRQ/NAPI), and
> > > the second iteration has pending = TASKLET (the socket tasklet).
> >
> > So to me that sounds like you want to fix the network code to not do
> > this then. Why can't the NAPI thing directly queue the next batch; why
> > do you have to do a softirq roundtrip like this?
>
> I think Jakub refers to tcp_wfree() code, which can be called from
> arbitrary contexts, including non NAPI ones, and with the socket
> locked (by this thread or another) or not locked at all
> (say if skb is freed from a TX completion handler or a qdisc drop)

Yes, fwiw.

> > > On two web workloads I looked at this condition accounts for 10%
> > > and 23% of all ksoftirqd wake ups respectively. We run NAPI
> > > which wakes some process up, we hit need_resched() and wake up
> > > ksoftirqd just to run the TSQ (TCP small queues) tasklet.
> > >
> > > Tweak the need_resched() condition to be ignored if all pending
> > > softIRQs are "non-deferred". The tasklet would run relatively
> > > soon, anyway, but once ksoftirqd is woken we're risking stalls.
> > >
> > > I did not see any negative impact on the latency in an RR test
> > > on a loaded machine with this change applied.
> >
> > Ignoring need_resched() will get you in trouble with RT people real
> > fast.

Ah, you're right :/ Is it good enough if we throw || force_irqthreads()
into the condition?

Otherwise we can just postpone this optimization, the overload
time horizon / limit patch is much more important.
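
To make the condition under discussion concrete, here is a rough userspace
model of the restart decision. It is only a sketch, not the kernel code and
not the posted patch: the bit numbers, EXPEDITED_MASK, only_expedited() and
keep_looping() are hypothetical names made up for illustration, and treating
HI and TASKLET as the "non-deferred" set is an assumption. The force_threaded
flag stands in for force_irqthreads(), under one possible reading of the
suggestion above (need_resched() is always honored when IRQs are
force-threaded).

```c
#include <stdbool.h>

/* Hypothetical softirq bit numbers, for this model only. */
enum { HI_BIT = 0, NET_RX_BIT = 3, TASKLET_BIT = 6 };

/* Assumption: HI and TASKLET form the "expedited" (non-deferred) set. */
#define EXPEDITED_MASK ((1u << HI_BIT) | (1u << TASKLET_BIT))

/* True when something is pending and all of it is in the expedited set. */
static bool only_expedited(unsigned int pending)
{
	return pending != 0 && (pending & ~EXPEDITED_MASK) == 0;
}

/*
 * Model of the restart decision at the end of a softirq pass:
 * return true to run another pass inline, false to hand the
 * remaining work off (the ksoftirqd wake up in the real loop).
 */
static bool keep_looping(unsigned int pending, bool need_resched,
			 bool force_threaded, bool budget_left)
{
	if (!pending || !budget_left)
		return false;
	if (!need_resched)
		return true;
	/*
	 * need_resched is set: ignore it only when the pending work is
	 * expedited-only, and never when IRQs are force-threaded.
	 */
	return only_expedited(pending) && !force_threaded;
}
```

In this model the second pass from the scenario above (pending = TASKLET
with need_resched set) keeps running inline, while the same state with
force_threaded set, or with NET still pending, yields as before.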