From: Alexander Duyck
Date: Tue, 20 Mar 2018 07:42:15 -0700
Subject: Re: [RFC PATCH 2/3] x86/io: implement 256-bit IO read and write
To: Rahul Lakkireddy
Cc: Thomas Gleixner, "x86@kernel.org", "linux-kernel@vger.kernel.org",
    "netdev@vger.kernel.org", "mingo@redhat.com", "hpa@zytor.com",
    "davem@davemloft.net", "akpm@linux-foundation.org",
    "torvalds@linux-foundation.org", Ganesh GR, Nirranjan Kirubaharan,
    Indranil Choudhury
In-Reply-To: <20180320133206.GB25574@chelsio.com>
References: <6ec3e7e0c70e85a804933f27bb4275d5363c044b.1521469118.git.rahul.lakkireddy@chelsio.com>
    <20180320133206.GB25574@chelsio.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 20, 2018 at 6:32 AM, Rahul Lakkireddy wrote:
> On Monday, March 03/19/18, 2018 at 20:13:10 +0530, Thomas Gleixner wrote:
>> On Mon, 19 Mar 2018, Rahul Lakkireddy wrote:
>>
>> > Use VMOVDQU AVX CPU instruction when available to do 256-bit
>> > IO read and write.
>>
>> That's not what the patch does. See below.
>>
>> > Signed-off-by: Rahul Lakkireddy
>> > Signed-off-by: Ganesh Goudar
>>
>> That Signed-off-by chain is wrong....
>>
>> > +#ifdef CONFIG_AS_AVX
>> > +#include
>> > +
>> > +static inline u256 __readqq(const volatile void __iomem *addr)
>> > +{
>> > +	u256 ret;
>> > +
>> > +	kernel_fpu_begin();
>> > +	asm volatile("vmovdqu %0, %%ymm0" :
>> > +		     : "m" (*(volatile u256 __force *)addr));
>> > +	asm volatile("vmovdqu %%ymm0, %0" : "=m" (ret));
>> > +	kernel_fpu_end();
>> > +	return ret;
>>
>> You _cannot_ assume that the instruction is available just because
>> CONFIG_AS_AVX is set. The availability is determined by the runtime
>> evaluated CPU feature flags, i.e. X86_FEATURE_AVX.
>>
>
> Ok. Will add boot_cpu_has(X86_FEATURE_AVX) check as well.
>
>> Aside of that I very much doubt that this is faster than 4 consecutive
>> 64bit reads/writes as you have the full overhead of
>> kernel_fpu_begin()/end() for each access.
>>
>> You did not provide any numbers for this so its even harder to
>> determine.
>>
>
> Sorry about that. Here are the numbers with and without this series.
>
> When reading up to 2 GB on-chip memory via MMIO, the time taken:
>
>         Without Series          With Series
>         (64-bit read)           (256-bit read)
>
>         52 seconds              26 seconds
>
> As can be seen, we see good improvement with doing 256-bits at a
> time.

Instead of framing this as an enhanced version of the read/write ops
why not look at replacing or extending something like the
memcpy_fromio or memcpy_toio operations? It would probably be more
comparable to what you are doing if you are wanting to move large
chunks of memory from one region to another, and it should translate
into something like AVX instructions once the CPU optimizations kick
in for a memcpy.

- Alex
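
To make the suggestion a bit more concrete, here is a rough userspace
sketch of the two shapes being compared: one access per 64-bit word
versus a single bulk copy call. It is not kernel code. The names
copy_by_u64_reads() and copy_bulk() are made up for illustration, plain
memory stands in for the MMIO region, and memcpy() stands in for what
memcpy_fromio() would do in a driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a driver loop that issues one 64-bit read
 * per access (the readq()-style pattern). The volatile qualifier keeps
 * the compiler from merging or dropping the individual loads.
 * Both buffers must be 8-byte aligned and len a multiple of 8.
 */
static void copy_by_u64_reads(void *dst, const volatile void *src, size_t len)
{
	const volatile uint64_t *s = src;
	uint64_t *d = dst;
	size_t i;

	assert(len % sizeof(uint64_t) == 0);
	for (i = 0; i < len / sizeof(uint64_t); i++)
		d[i] = s[i];
}

/*
 * Hypothetical stand-in for a bulk helper in the spirit of
 * memcpy_fromio(): the caller states the whole transfer once and the
 * copy routine is free to pick the widest moves it supports. Casting
 * away volatile is only acceptable here because src is ordinary memory
 * in this sketch, not a real device mapping.
 */
static void copy_bulk(void *dst, const volatile void *src, size_t len)
{
	memcpy(dst, (const void *)src, len);
}
```

Both helpers produce the same bytes in dst; the difference is only in
who chooses the access width. In a driver the bulk form would be a call
like memcpy_fromio(dst, src, len) on the __iomem pointer, and any AVX
use would then live inside that one helper rather than in every caller.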