X-Mailing-List: linux-kernel@vger.kernel.org
From: Eric Dumazet
Date: Wed, 6 Mar 2024 13:51:42 +0100
Subject: Re: Network performance regression in Linux kernel 6.6 for small socket size test cases
To: Boon Ang
Cc: Abdul Anshad Azeez, davem@davemloft.net, kuba@kernel.org,
 pabeni@redhat.com, corbet@lwn.net, dsahern@kernel.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org, John Savanyo,
 Peter Jonasson, Rajender M

On Wed, Mar 6, 2024 at 1:43 PM Boon Ang wrote:
>
> Hello Eric,
>
> The choice of socket buffer size is something that an application can
> decide, and there may be reasons to keep to smaller sizes. While high
> bandwidth transfers obviously should use larger sizes, a change that
> regresses the performance of an existing configuration is a regression.
> Is there any way to modify your change so that it keeps the benefits
> while avoiding the degradation for small socket sizes?
>

The kernel limits the amount of memory used by the receive queue.

The problem is that for XXX bytes of payload (what the user application wants),
the metadata overhead is not fixed.

Kernel structures change over time, and packets are not always full
from the remote peer (that we can not control).

1000 bytes of payload might fit in 2KB, or 2MB depending on how the
bytes are spread over multiple skbs.

This issue has been there forever, the kernel can not put in stone any rule:

XXXX bytes of payload  --->  YYYY bytes of kernel memory to hold XXXX
bytes of payload.

It is time that applications setting tiny SO_RCVBUF values get what they want:
Poor TCP performance.

Thanks.

> Thanks
> Boon
>
> On Wed, Feb 28, 2024 at 12:48 AM Eric Dumazet wrote:
>>
>> On Wed, Feb 28, 2024 at 7:43 AM Abdul Anshad Azeez
>> wrote:
>> >
>> > During performance regression workload execution of the Linux
>> > kernel we observed up to 30% performance decrease in a specific networking
>> > workload on the 6.6 kernel compared to 6.5 (details below). The regression is
>> > reproducible in both Linux VMs running on ESXi and bare metal Linux.
>> >
>> > Workload details:
>> >
>> > Benchmark - Netperf TCP_STREAM
>> > Socket buffer size - 8K
>> > Message size - 256B
>> > MTU - 1500B
>> > Socket option - TCP_NODELAY
>> > # of STREAMs - 32
>> > Direction - Uni-Directional Receive
>> > Duration - 60 Seconds
>> > NIC - Mellanox Technologies ConnectX-6 Dx EN 100G
>> > Server Config - Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz & 512G Memory
>> >
>> > Bisect between 6.5 and 6.6 kernel concluded that this regression originated
>> > from the below commit:
>> >
>> > commit - dfa2f0483360d4d6f2324405464c9f281156bd87 (tcp: get rid of
>> > sysctl_tcp_adv_win_scale)
>> > Author - Eric Dumazet
>> > Link -
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dfa2f0483360d4d6f2324405464c9f281156bd87
>> >
>> > Performance data for (Linux VM on ESXi):
>> > Test case - TCP_STREAM_RECV Throughput in Gbps
>> > (for different socket buffer sizes and with constant message size - 256B):
>> >
>> > Socket buffer size - [LK6.5 vs LK6.6]
>> > 8K - [8.4 vs 5.9 Gbps]
>> > 16K - [13.4 vs 10.6 Gbps]
>> > 32K - [19.1 vs 16.3 Gbps]
>> > 64K - [19.6 vs 19.7 Gbps]
>> > Autotune - [19.7 vs 19.6 Gbps]
>> >
>> > From the above performance data, we can infer that:
>> > * Regression is specific to lower fixed socket buffer sizes (8K, 16K & 32K).
>> > * Increasing the socket buffer size gradually decreases the throughput impact.
>> > * Performance is equal for higher fixed socket size (64K) and Autotune socket
>> > tests.
>> >
>> > We would like to know if there are any opportunities for optimization in
>> > the test cases with small socket sizes.
>> >
>>
>> Sure, I would suggest not setting small SO_RCVBUF values in 2024,
>> or you get what you ask for (going back to old TCP performance of year 2010)
>>
>> Back in 2018, we set tcp_rmem[1] to 131072 for a good reason.
>>
>> commit a337531b942bd8a03e7052444d7e36972aac2d92
>> Author: Yuchung Cheng
>> Date:   Thu Sep 27 11:21:19 2018 -0700
>>
>>     tcp: up initial rmem to 128KB and SYN rwin to around 64KB
>>
>>
>> I can not enforce a minimum in SO_RCVBUF (other than the small one added in
>> commit eea86af6b1e18d6fa8dc959e3ddc0100f27aff9f ("net: sock: adapt
>> SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF"))
>> otherwise many test programs will break, expecting to set a low value.
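
The reply at the top of this message says that the same payload can cost
anywhere from 2KB to 2MB of kernel memory depending on how it is spread over
skbs. The sketch below is a toy model of that arithmetic, not the kernel's
accounting. It assumes every skb is charged one fixed-size buffer regardless
of how little payload it carries, and the 2048-byte charge is a hypothetical
figure chosen only because it reproduces the 2KB and 2MB numbers. The function
name and its parameters are made up for illustration.

```python
# Toy model of receive-queue memory accounting. This is NOT the kernel's
# algorithm: it assumes each skb is charged one fixed-size buffer, and the
# 2048-byte figure is a hypothetical value picked to match the 2KB/2MB example.
ASSUMED_BYTES_CHARGED_PER_SKB = 2048


def queue_memory(payload_bytes: int, payload_per_skb: int,
                 charged_per_skb: int = ASSUMED_BYTES_CHARGED_PER_SKB) -> int:
    """Return the bytes charged to the receive queue under the toy model."""
    if payload_bytes < 0 or payload_per_skb <= 0 or charged_per_skb <= 0:
        raise ValueError("payload_bytes must be >= 0, the other sizes > 0")
    # Ceiling division: a partly filled skb still costs a whole buffer.
    skb_count = -(-payload_bytes // payload_per_skb)
    return skb_count * charged_per_skb


if __name__ == "__main__":
    # 1000 bytes delivered in a single packet: one buffer.
    print(queue_memory(1000, 1000))  # 2048
    # The same 1000 bytes delivered one byte per packet: 1000 buffers.
    print(queue_memory(1000, 1))     # 2048000
```

Under these assumptions, a fixed memory budget holds far less payload when the
peer sends small packets, which is one way to read the statement that no fixed
"XXXX bytes of payload ---> YYYY bytes of kernel memory" rule can be promised.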