Subject: Re: [PATCH] x86: only use ERMS for user copies for larger sizes
From: Andy Lutomirski
Date: Fri, 23 Nov 2018 11:39:29 -0700
To: Linus Torvalds
Cc: David.Laight@aculab.com, Andrew Lutomirski, dvlasenk@redhat.com,
    Jens Axboe, Ingo Molnar, Thomas Gleixner, Ingo Molnar, bp@alien8.de,
    Peter Anvin, the arch/x86 maintainers, Andrew Morton, Peter Zijlstra,
    brgerst@gmail.com, Linux List Kernel Mailing, pabeni@redhat.com
X-Mailing-List: linux-kernel@vger.kernel.org

> On Nov 23, 2018, at 10:42 AM, Linus Torvalds wrote:
>
> On Fri, Nov 23, 2018 at 8:36 AM Linus Torvalds
> wrote:
>>
>> Let me write a generic routine
>> in lib/iomap_copy.c (which already does
>> the "user specifies chunk size" cases), and hook it up for x86.
>
> Something like this?
>
> ENTIRELY UNTESTED! It might not compile. Seriously. And if it does
> compile, it might not work.
>
> And this doesn't actually do the memset_io() function at all, just the
> memcpy ones.
>
> Finally, it's worth noting that on x86, we have this:
>
>   /*
>    * override generic version in lib/iomap_copy.c
>    */
>    ENTRY(__iowrite32_copy)
>         movl %edx,%ecx
>         rep movsd
>         ret
>    ENDPROC(__iowrite32_copy)
>
> because back in 2006, we did this:
>
>     [PATCH] Add faster __iowrite32_copy routine for x86_64
>
>     This assembly version is measurably faster than the generic version in
>     lib/iomap_copy.c.
>
> which actually implies that "rep movsd" is faster than doing
> __raw_writel() by hand.
>
> So it is possible that this should all be arch-specific code rather
> than that butt-ugly "generic" code I wrote in this patch.
>
> End result: I'm not really all that happy about this patch, but it's
> perhaps worth testing, and it's definitely worth discussing. Because
> our current memcpy_{to,from}io() is truly broken garbage.
>

What is memcpy_toio() even supposed to do? I’m guessing it’s defined as
something like “copy this data to IO space using at most long-sized
writes, all aligned, and writing each byte exactly once, in order.”
That sounds... dubiously useful. I could see a function that writes to
aligned memory in specified-sized chunks. And I can see a use for a
function to just write it in whatever size chunks the architecture
thinks is fastest, and *that* should probably use MOVDIR64B.

Or is there some subtlety I’m missing?
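
To make the first kind of function concrete (aligned writes in a fixed, caller-known chunk size), here is a rough user-space sketch for the 32-bit case. It is only an illustration, not the patch under discussion and not a kernel API: the name copy32_sketch is made up, and the plain volatile store is an assumed stand-in for __raw_writel(), which real MMIO code would have to use instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): copy `count` 32-bit
 * words from `src` to `dst`, touching each destination word exactly
 * once, in ascending address order.
 *
 * Assumption: the volatile store below stands in for __raw_writel().
 * A volatile access is typically compiled to a single 32-bit store,
 * but that is a property of the compiler and target, not a guarantee
 * of the C standard.
 */
static void copy32_sketch(volatile uint32_t *dst, const void *src,
                          size_t count)
{
    const unsigned char *s = src;

    /* The destination must be 4-byte aligned for a single store. */
    assert(((uintptr_t)dst & 3u) == 0);

    for (size_t i = 0; i < count; i++) {
        uint32_t v;

        /* The source may be unaligned, so read it via memcpy(). */
        memcpy(&v, s + 4 * i, sizeof(v));

        /* One store of exactly the requested width per chunk. */
        dst[i] = v;
    }
}
```

The second kind of function (whatever chunk size the architecture thinks is fastest) would have no such per-store contract, which is why it could be free to use wider instructions.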