From: David Howells
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley
	Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United
	Kingdom. Registered in England and Wales under Company Registration
	No. 3798903
To: Linus Torvalds
Cc: dhowells@redhat.com, Borislav Petkov, kernel test robot,
	oe-lkp@lists.linux.dev, lkp@intel.com, linux-kernel@vger.kernel.org,
	Christian Brauner, Alexander Viro, Jens Axboe, Christoph Hellwig,
	Christian Brauner, Matthew Wilcox, David Laight, ying.huang@intel.com,
	feng.tang@intel.com, fengwei.yin@intel.com
Subject: Re: [linus:master] [iov_iter] c9eec08bac: vm-scalability.throughput -16.9% regression
Date: Thu, 16 Nov 2023 21:09:05 +0000
Message-ID: <282731.1700168945@warthog.procyon.org.uk>
References: <202311061616.cd495695-oliver.sang@intel.com>
	<3865842.1700061614@warthog.procyon.org.uk>
	<4097023.1700084620@warthog.procyon.org.uk>
	<42895.1700089191@warthog.procyon.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

Linus Torvalds wrote:

> You could try building the kernel without mitigations (or booting with them
> off, which isn't quite as good) to verify.

Okay, I disabled RETPOLINE, which seems like it should be the important one.

With inlined memcpy:

	iov_kunit_benchmark_bvec: avg 3160 uS, stddev 17 uS
	iov_kunit_benchmark_bvec_split: avg 3380 uS, stddev 29 uS
	iov_kunit_benchmark_kvec: avg 2940 uS, stddev 978 uS
	iov_kunit_benchmark_xarray: avg 3599 uS, stddev 8 uS
	iov_kunit_benchmark_xarray_to_bvec: avg 3964 uS, stddev 16 uS

Directly calling __memcpy():

	iov_kunit_benchmark_bvec: avg 9947 uS, stddev 61 uS
	iov_kunit_benchmark_bvec_split: avg 9790 uS, stddev 13 uS
	iov_kunit_benchmark_kvec: avg 9565 uS, stddev 758 uS
	iov_kunit_benchmark_xarray: avg 10498 uS, stddev 24 uS
	iov_kunit_benchmark_xarray_to_bvec: avg 10459 uS, stddev 188 uS

I created a duplicate of __memcpy() (called __movsb_memcpy) without the
"alternative" statement and made it call that:

	iov_kunit_benchmark_bvec: avg 3177 uS, stddev 7 uS
	iov_kunit_benchmark_bvec_split: avg 3393 uS, stddev 10 uS
	iov_kunit_benchmark_kvec: avg 2813 uS, stddev 385 uS
	iov_kunit_benchmark_xarray: avg 3651 uS, stddev 7 uS
	iov_kunit_benchmark_xarray_to_bvec: avg 3946 uS, stddev 8 uS

And then I made it call memcpy_orig() directly:

	iov_kunit_benchmark_bvec: avg 9942 uS, stddev 17 uS
	iov_kunit_benchmark_bvec_split: avg 9802 uS, stddev 29 uS
	iov_kunit_benchmark_kvec: avg 9547 uS, stddev 598 uS
	iov_kunit_benchmark_xarray: avg 10486 uS, stddev 13 uS
	iov_kunit_benchmark_xarray_to_bvec: avg 10438 uS, stddev 12 uS

(See attached patch)

David
---
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 0ae2e1712e2e..df1ebbe345e2 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -43,7 +43,7 @@ EXPORT_SYMBOL(__memcpy)
 SYM_FUNC_ALIAS_MEMFUNC(memcpy, __memcpy)
 EXPORT_SYMBOL(memcpy)
 
-SYM_FUNC_START_LOCAL(memcpy_orig)
+SYM_TYPED_FUNC_START(memcpy_orig)
 	movq %rdi, %rax
 
 	cmpq $0x20, %rdx
@@ -169,4 +169,12 @@ SYM_FUNC_START_LOCAL(memcpy_orig)
 .Lend:
 	RET
 SYM_FUNC_END(memcpy_orig)
+EXPORT_SYMBOL(memcpy_orig)
 
+SYM_TYPED_FUNC_START(__movsb_memcpy)
+	movq %rdi, %rax
+	movq %rdx, %rcx
+	rep movsb
+	RET
+SYM_FUNC_END(__movsb_memcpy)
+EXPORT_SYMBOL(__movsb_memcpy)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index de7d11cf4c63..620cd6356a5b 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -58,11 +58,18 @@ size_t copy_from_user_iter(void __user *iter_from, size_t progress,
 	return res;
 }
 
+extern void *__movsb_memcpy(void *, const void *, size_t);
+extern void *memcpy_orig(void *, const void *, size_t);
+
 static __always_inline
 size_t memcpy_to_iter(void *iter_to, size_t progress,
 		      size_t len, void *from, void *priv2)
 {
-	memcpy(iter_to, from + progress, len);
+#if 0
+	__movsb_memcpy(iter_to, from + progress, len);
+#else
+	memcpy_orig(iter_to, from + progress, len);
+#endif
 	return 0;
 }
 
@@ -70,7 +77,11 @@ static __always_inline
 size_t memcpy_from_iter(void *iter_from, size_t progress,
 			size_t len, void *to, void *priv2)
 {
-	memcpy(to + progress, iter_from, len);
+#if 0
+	__movsb_memcpy(to + progress, iter_from, len);
+#else
+	memcpy_orig(to + progress, iter_from, len);
+#endif
 	return 0;
 }
 
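
For anyone who wants to poke at the "rep movsb" variant outside the kernel, the following is a minimal userspace sketch of what __movsb_memcpy in the patch does: load the count into RCX, copy with a single "rep movsb", and return the original destination. The names movsb_memcpy_sketch and movsb_memcpy_selftest are hypothetical and not part of the patch. The sketch assumes GCC or Clang inline asm on x86-64 and falls back to a byte loop elsewhere. It only illustrates the copy itself and is not expected to reproduce the kunit timings above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of the patch's __movsb_memcpy: copy len
 * bytes with one "rep movsb" and return the destination pointer.
 * Assumption: GCC/Clang extended asm on x86-64 with the direction flag
 * clear, as the System V ABI guarantees on function entry.  Other targets
 * use a plain byte loop instead.
 */
static void *movsb_memcpy_sketch(void *dst, const void *src, size_t len)
{
	void *ret = dst;

	/* A non-empty copy needs valid pointers. */
	assert(len == 0 || (dst != NULL && src != NULL));

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	/* RDI = destination, RSI = source, RCX = byte count. */
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(len)
			 :
			 : "memory");
#else
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
#endif
	return ret;
}

/* Copy a small buffer; return 1 if the result matches the source. */
static int movsb_memcpy_selftest(void)
{
	const char src[] = "iov_iter benchmark payload";
	char dst[sizeof(src)];
	void *ret;

	memset(dst, 0, sizeof(dst));
	ret = movsb_memcpy_sketch(dst, src, sizeof(src));
	return ret == dst && memcmp(dst, src, sizeof(src)) == 0;
}
```

Calling movsb_memcpy_selftest() from a test harness or a small main() should return 1 when the copy matches its source. Timing this against the C library's memcpy() would need a separate harness, and the results depend on the CPU.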