From: Joe Damato
To: x86@kernel.org, Alexander Viro, Borislav Petkov, Dave Hansen,
    David Ahern, "David S. Miller", Eric Dumazet, Hideaki YOSHIFUJI,
    "H. Peter Anvin", Ingo Molnar, Jakub Kicinski,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org, Paolo Abeni,
    Thomas Gleixner
Cc: Joe Damato
Subject: [RFC,net-next,x86 v2 0/8] Nontemporal copies in sendmsg path
Date: Sun, 12 Jun 2022 01:57:49 -0700
Message-Id: <1655024280-23827-1-git-send-email-jdamato@fastly.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

Greetings:

Welcome to RFC v2.

This is my first series that touches more than 1 subsystem; hope I got the
various subject lines and to/cc-lists correct.

Based on the feedback on RFC v1 [1], I've made a few changes:

- Removed the indirect calls.
- Simplified the code a bit by pushing logic down to a wrapper around
  copyin.
- Added support for the 'MSG_NTCOPY' flag to udp, udp-lite, tcp, and unix.

I think this series is much closer to a v1 that can be submitted for
consideration, but wanted to test the waters with an RFC first :)

This new set of code allows applications to request non-temporal copies on
individual calls to sendmsg for several socket types, not just unix.

The result is that:

1. Users don't need to specify no cache copy for the entire interface as
   they had been doing previously with ethtool. There is more fine grained
   control of which sendmsgs are non-temporal. I think it makes sense for
   this to be application specific (vs interface-wide) since applications
   will have a better idea of which copy is appropriate.

2. Previously, the ethtool bit for enabling no-cache-copy only seems to
   have affected TCP sockets, IIUC. This series supports UDP, UDP-Lite,
   TCP, and Unix. This means the behavior and accessibility of
   non-temporal copies is normalized a bit more than it had been
   previously.

The performance results on my AMD Zen2 test system are identical to the
previous RFC, so I've included those results below.

As you'll see below, NT copies in the unix write path have a large
measurable impact on certain application architectures and CPUs.

Initial benchmarks are extremely encouraging. I wrote a simple C program to
benchmark this patchset. The program:

- Creates a unix socket pair
- Forks a child process
- The parent process writes to the unix socket using MSG_NTCOPY, or not,
  depending on the command line flags
- The child process uses splice to move the data from the unix socket to a
  pipe buffer, followed by a second splice call to move the data from the
  pipe buffer to a file descriptor opened on /dev/null.
- taskset is used when launching the benchmark to ensure the parent and
  child run on appropriate CPUs for various scenarios

The source of the test program is available for examination [2] and results
for three benchmarks I ran are provided below.

Test system: AMD EPYC 7662 64-Core Processor,
             64 cores / 128 threads,
             512 KB L2 per core shared by sibling CPUs,
             16 MB L3 per NUMA zone,
             AMD specific settings: NPS=1 and L3 as NUMA enabled

Test: 1048576 byte object,
      100,000 iterations,
      512 KB pipe buffer size,
      512 KB unix socket send buffer size

Sample command lines for running the tests are provided below. Note that
the command line shows how to run a "normal" copy benchmark. To run the
benchmark in MSG_NTCOPY mode, change command line argument 3 from 0 to 1.

Test pinned to CPUs 1 and 2 which do *not* share an L2 cache, but do share
an L3.

Command line for "normal" copy:
% time taskset -ac 1,2 ./unix-nt-bench 1048576 100000 0 524288 524288

Mode              real time (sec.)    throughput (Mb/s)
"Normal" copy     10.630              78,928
MSG_NTCOPY        7.429               112,935

Same test as above, but pinned to CPUs 1 and 65 which share an L2 (512 KB)
and L3 cache (16 MB).

Command line for "normal" copy:
% time taskset -ac 1,65 ./unix-nt-bench 1048576 100000 0 524288 524288

Mode              real time (sec.)    throughput (Mb/s)
"Normal" copy     12.532              66,941
MSG_NTCOPY        9.445               88,826

Same test as above, pinned to CPUs 1 and 65, but with 128 KB unix send
buffer and pipe buffer sizes (to avoid spilling L2).

Command line for "normal" copy:
% time taskset -ac 1,65 ./unix-nt-bench 1048576 100000 0 131072 131072

Mode              real time (sec.)    throughput (Mb/s)
"Normal" copy     12.451              67,377
MSG_NTCOPY        9.451               88,768

Thanks,
Joe

[1]: https://patchwork.kernel.org/project/netdevbpf/cover/1652241268-46732-1-git-send-email-jdamato@fastly.com/
[2]: https://gist.githubusercontent.com/jdamato-fsly/03a2f0cd4e71ebe0fef97f7f2980d9e5/raw/19cfd3aca59109ebf5b03871d952ea1360f3e982/unix-nt-copy-bench.c

Joe Damato (8):
  arch, x86, uaccess: Add nontemporal copy functions
  iov_iter: Introduce iter_copy_type
  iov_iter: add copyin_iovec helper
  net: Add MSG_NTCOPY sendmsg flag
  net: unix: Support MSG_NTCOPY
  net: ip: Support MSG_NTCOPY
  net: udplite: Support MSG_NTCOPY
  net: tcp: Support MSG_NTCOPY

 arch/x86/include/asm/uaccess_64.h |  6 ++++++
 include/linux/socket.h            |  9 +++++++++
 include/linux/uaccess.h           |  6 ++++++
 include/linux/uio.h               | 17 +++++++++++++++++
 include/net/sock.h                |  2 +-
 include/net/udplite.h             |  1 +
 lib/iov_iter.c                    | 25 ++++++++++++++++++++-----
 net/ipv4/ip_output.c              |  1 +
 net/ipv4/tcp.c                    |  2 ++
 net/unix/af_unix.c                |  4 ++++
 10 files changed, 67 insertions(+), 6 deletions(-)

--
2.7.4
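
For readers who want the shape of the benchmark without opening [2], here
is a minimal sketch of the structure described in the cover letter: a unix
socket pair, a forked child, a parent that writes objects with send(), and
a child that drains the socket with two splice calls (socket to pipe, pipe
to /dev/null). This is not the author's program. The names run_bench and
drain_with_splice are hypothetical, buffer-size tuning and timing are left
out, and MSG_NTCOPY is deliberately not defined here: it is the flag
proposed by this series, it is absent from mainline headers, and its value
is not given in this mail. The caller passes whatever send flags the
running kernel supports (0 for a "normal" copy).

```c
#define _GNU_SOURCE		/* for splice() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: socket -> pipe -> /dev/null, using two splice calls. */
static int drain_with_splice(int sock)
{
	int pipefd[2];
	int devnull = open("/dev/null", O_WRONLY);

	if (devnull < 0 || pipe(pipefd) < 0)
		return 1;

	for (;;) {
		/* First splice: move data from the unix socket into the pipe. */
		ssize_t n = splice(sock, NULL, pipefd[1], NULL, 65536,
				   SPLICE_F_MOVE);
		if (n < 0)
			return 1;
		if (n == 0)	/* writer closed its end */
			break;

		/* Second splice: move that data from the pipe to /dev/null. */
		while (n > 0) {
			ssize_t m = splice(pipefd[0], NULL, devnull, NULL,
					   (size_t)n, SPLICE_F_MOVE);
			if (m <= 0)
				return 1;
			n -= m;
		}
	}
	return 0;
}

/*
 * Hypothetical helper: send `iterations` objects of `obj_size` bytes over a
 * unix stream socket pair. `send_flags` is where a kernel carrying this
 * series would take MSG_NTCOPY (assumption: value not defined here).
 * Returns the number of bytes sent, or -1 on error.
 */
long long run_bench(size_t obj_size, int iterations, int send_flags)
{
	int sv[2], status;
	long long total = 0;
	char *buf;
	pid_t pid;

	assert(obj_size > 0 && iterations > 0);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	buf = calloc(1, obj_size);
	if (!buf)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		close(sv[0]);	/* child keeps only the read side */
		_exit(drain_with_splice(sv[1]));
	}
	close(sv[1]);		/* parent keeps only the write side */

	for (int i = 0; i < iterations && total >= 0; i++) {
		size_t off = 0;

		while (off < obj_size) {
			ssize_t n = send(sv[0], buf + off, obj_size - off,
					 send_flags | MSG_NOSIGNAL);
			if (n < 0) {
				total = -1;
				break;
			}
			off += (size_t)n;
		}
		if (total >= 0)
			total += (long long)obj_size;
	}

	close(sv[0]);		/* EOF for the child's splice loop */
	free(buf);
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
	    WEXITSTATUS(status) != 0)
		return -1;
	return total;
}
```

Calling run_bench(1048576, 100000, 0) corresponds roughly to the "normal"
copy configuration above. CPU pinning is still done from outside with
taskset, and the wall-clock time from `time` gives the figure to compare
between flag settings.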