From: David Laight
To: 'Rahul Lakkireddy', "x86@kernel.org", "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org"
CC: "tglx@linutronix.de", "mingo@redhat.com", "hpa@zytor.com", "davem@davemloft.net", "akpm@linux-foundation.org", "torvalds@linux-foundation.org", "ganeshgr@chelsio.com", "nirranjan@chelsio.com", "indranil@chelsio.com"
Subject: RE: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
Date: Mon, 19 Mar 2018 14:53:17 +0000
Message-ID: <7f0ddb3678814c7bab180714437795e0@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Rahul Lakkireddy
> Sent: 19 March 2018 14:21
>
> This series of patches add support for 256-bit IO read and write.
> The APIs are readqq and writeqq (quad quadword - 4 x 64), that read
> and write 256-bits at a time from IO, respectively.

Why not use the AVX-512 registers to get 512-bit accesses?

> Patch 1 adds u256 type and adds necessary non-atomic accessors. Also
> adds byteorder conversion APIs.
>
> Patch 2 adds 256-bit read and write to x86 via VMOVDQU AVX CPU
> instructions.
>
> Patch 3 updates cxgb4 driver to use the readqq API to speed up
> reading on-chip memory 256-bits at a time.

Calling kernel_fpu_begin() is likely to be slow.
I doubt you want to do it every time around a loop of accesses.

In principle it ought to be possible to get access to one or two
(eg) AVX registers by saving them to stack and telling the fpu
save code where you've put them.
The IPI fp save code could then copy the saved values over
the current values if asked to save the fp state for a process.
This should be reasonably cheap - especially if there isn't an
fp save IPI.

OTOH, for x86, if the code always runs in process context (eg from a
system call) then, since the ABI defines all the AVX(2) registers as
caller-saved, it is only necessary to ensure that the current FPU
registers belong to the current process once.
The registers can be set to zero by an 'invalidate' instruction on
system call entry (hope this is done) and after use.

	David
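
To illustrate the point above about not calling kernel_fpu_begin() every
time around a loop, here is a minimal user-space sketch. It is not code from
the patch series. fake_fpu_begin(), fake_fpu_end(), fake_readqq(),
u256_sketch and the fpu_sections counter are all hypothetical stand-ins,
there only so the shape of the two loops compiles and can be compared
outside the kernel. In a real driver the calls would be kernel_fpu_begin(),
kernel_fpu_end() and the proposed readqq().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 256-bit value: four 64-bit words (not the patch's u256). */
typedef struct { uint64_t w[4]; } u256_sketch;

/* Counts how many begin/end sections were entered (illustration only). */
static unsigned long fpu_sections;

/* Stand-ins for kernel_fpu_begin()/kernel_fpu_end(); they only count. */
static void fake_fpu_begin(void) { fpu_sections++; }
static void fake_fpu_end(void) { }

/* Stand-in for the proposed readqq(): a plain 256-bit copy. */
static u256_sketch fake_readqq(const u256_sketch *src) { return *src; }

/* One begin/end pair per access: the pattern the mail warns about. */
static void copy_per_access(u256_sketch *dst, const u256_sketch *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        fake_fpu_begin();
        dst[i] = fake_readqq(&src[i]);
        fake_fpu_end();
    }
}

/* One begin/end pair around the whole loop. */
static void copy_hoisted(u256_sketch *dst, const u256_sketch *src, size_t n)
{
    fake_fpu_begin();
    for (size_t i = 0; i < n; i++)
        dst[i] = fake_readqq(&src[i]);
    fake_fpu_end();
}
```

For n accesses the first loop enters n sections and the second enters one.
The catch in real kernel code is that the region between kernel_fpu_begin()
and kernel_fpu_end() runs with preemption disabled, so a long transfer would
probably have to be split into bounded chunks rather than wrapped in a
single section.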