From: David Laight
To: 'Thomas Gleixner', Ingo Molnar
CC: 'Rahul Lakkireddy', "x86@kernel.org", "linux-kernel@vger.kernel.org",
 "netdev@vger.kernel.org", "mingo@redhat.com", "hpa@zytor.com",
 "davem@davemloft.net", "akpm@linux-foundation.org",
 "torvalds@linux-foundation.org", "ganeshgr@chelsio.com",
 "nirranjan@chelsio.com", "indranil@chelsio.com", "Andy Lutomirski",
 Peter Zijlstra, Fenghua Yu, Eric Biggers
Subject: RE: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
Date: Tue, 20 Mar 2018 09:59:40 +0000
Message-ID: <43d86d051123403496311bb70babadd5@AcuMS.aculab.com>
References: <7f0ddb3678814c7bab180714437795e0@AcuMS.aculab.com>
 <7f8d811e79284a78a763f4852984eb3f@AcuMS.aculab.com>
 <20180320082651.jmxvvii2xvmpyr2s@gmail.com>
 <20180320090802.qw4tqjmhy6yfd6sf@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Thomas Gleixner
> Sent: 20 March 2018 09:41
> On Tue, 20 Mar 2018, Ingo Molnar wrote:
> > * Thomas Gleixner wrote:
...
> > > And if we go down that road then we want a AVX based memcpy()
> > > implementation which is runtime conditional on the feature bit(s) and
> > > length dependent. Just slapping a readqq() at it and use it in a loop does
> > > not make any sense.
> >
> > Yeah, so generic memcpy() replacement is only feasible I think if the most
> > optimistic implementation is actually correct:
> >
> >  - if no preempt disable()/enable() is required
> >
> >  - if direct access to the AVX[2] registers does not disturb legacy FPU state in
> >    any fashion
> >
> >  - if direct access to the AVX[2] registers cannot raise weird exceptions or have
> >    weird behavior if the FPU control word is modified to non-standard values by
> >    untrusted user-space
> >
> > If we have to touch the FPU tag or control words then it's probably only good for
> > a specialized API.
>
> I did not mean to have a general memcpy replacement. Rather something like
> magic_memcpy() which falls back to memcpy when AVX is not usable or the
> length does not justify the AVX stuff at all.

There is probably no point for memcpy().
Where it would make a big difference is memcpy_fromio() for PCIe devices
(where longer TLPs make a big difference).
But any such code belongs in the memcpy_fromio() implementation, not in
every driver.
The implementation of memcpy_toio() is nowhere near as critical.

It might be that the code would need to fall back to 64-bit accesses
if the AVX(2) registers can't currently be accessed - maybe some
obscure state....

However memcpy_to/fromio() are both horrid at the moment because
they result in byte copies!

	David
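
To make the fallback idea above concrete, here is a rough user-space sketch of the dispatch only. It is not the kernel's memcpy_fromio() and it does no real AVX or MMIO access: sketch_copy_fromio(), sketch_read64(), SKETCH_WIDE_MIN_LEN and the wide_usable flag are all hypothetical names made up for illustration, and the "wide" branch is merely emulated with four 64-bit reads where a 256-bit load would go. The point is the shape: the length and a runtime "wide registers usable" check are tested once, and everything else drops to 64-bit reads, with byte reads only for the unaligned head and the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-off: shorter copies skip the wide path entirely. */
#define SKETCH_WIDE_MIN_LEN 64

/* Reports which path handled the bulk of the copy (for illustration). */
enum sketch_path { SKETCH_PATH_QWORD, SKETCH_PATH_WIDE };

/* Read one 8-byte-aligned 64-bit word from the simulated device window. */
static uint64_t sketch_read64(const volatile unsigned char *s)
{
	return *(const volatile uint64_t *)s;
}

/*
 * Hypothetical sketch, not a kernel API. 'wide_usable' stands in for
 * "the AVX registers can be used right now"; how that would really be
 * decided is exactly the open question in this thread.
 */
static enum sketch_path sketch_copy_fromio(void *dst, const volatile void *src,
					    size_t len, int wide_usable)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;
	enum sketch_path path = SKETCH_PATH_QWORD;
	uint64_t v[4];
	int i;

	assert(dst != NULL && src != NULL);

	/* Byte reads only until the source is 8-byte aligned. */
	while (len && ((uintptr_t)s & 7)) {
		*d++ = *s++;
		len--;
	}

	if (wide_usable && len >= SKETCH_WIDE_MIN_LEN) {
		path = SKETCH_PATH_WIDE;
		/* Placeholder for a 256-bit load: emulated by 4 x 64-bit reads. */
		while (len >= 32) {
			for (i = 0; i < 4; i++)
				v[i] = sketch_read64(s + 8 * i);
			memcpy(d, v, 32);
			s += 32;
			d += 32;
			len -= 32;
		}
	}

	/* Fallback (and remainder of the wide path): 64-bit reads. */
	while (len >= 8) {
		v[0] = sketch_read64(s);
		memcpy(d, v, 8);
		s += 8;
		d += 8;
		len -= 8;
	}

	/* Tail: at most 7 byte reads. */
	while (len) {
		*d++ = *s++;
		len--;
	}

	return path;
}
```

A caller would pass the result of whatever runtime check applies as wide_usable; with it clear, or with a short length, the copy still proceeds in 64-bit units rather than bytes.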