Date: Tue, 20 Mar 2018 11:54:27 +0100
From: Ingo Molnar
To: Thomas Gleixner
Cc: David Laight, Rahul Lakkireddy, x86@kernel.org,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org, mingo@redhat.com,
    hpa@zytor.com, davem@davemloft.net, akpm@linux-foundation.org,
    torvalds@linux-foundation.org, ganeshgr@chelsio.com,
    nirranjan@chelsio.com, indranil@chelsio.com, Andy Lutomirski,
    Peter Zijlstra, Fenghua Yu, Eric Biggers
Subject: Re: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
Message-ID: <20180320105427.bm4od7cpessbraag@gmail.com>
References: <7f0ddb3678814c7bab180714437795e0@AcuMS.aculab.com>
    <7f8d811e79284a78a763f4852984eb3f@AcuMS.aculab.com>
    <20180320082651.jmxvvii2xvmpyr2s@gmail.com>
    <20180320090802.qw4tqjmhy6yfd6sf@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org


* Thomas Gleixner wrote:

> On Tue, 20 Mar 2018, Ingo Molnar wrote:
> > * Thomas Gleixner wrote:
> >
> > > > So I do think we could do more in this area to improve driver performance, if the
> > > > code is correct and if there's actual benchmarks that are showing real benefits.
> > >
> > > If it's about hotpath performance I'm all for it, but the use case here is
> > > a debug facility...
> > >
> > > And if we go down that road then we want a AVX based memcpy()
> > > implementation which is runtime conditional on the feature bit(s) and
> > > length dependent. Just slapping a readqq() at it and use it in a loop does
> > > not make any sense.
> >
> > Yeah, so generic memcpy() replacement is only feasible I think if the most
> > optimistic implementation is actually correct:
> >
> >  - if no preempt disable()/enable() is required
> >
> >  - if direct access to the AVX[2] registers does not disturb legacy FPU state in
> >    any fashion
> >
> >  - if direct access to the AVX[2] registers cannot raise weird exceptions or have
> >    weird behavior if the FPU control word is modified to non-standard values by
> >    untrusted user-space
> >
> > If we have to touch the FPU tag or control words then it's probably only good for
> > a specialized API.
>
> I did not mean to have a general memcpy replacement. Rather something like
> magic_memcpy() which falls back to memcpy when AVX is not usable or the
> length does not justify the AVX stuff at all.

OK, fair enough.

Note that a generic version might still be worth trying out, if and only if it's
safe to access those vector registers directly: modern x86 CPUs will do their
non-constant memcpy()s via the common memcpy_erms() function - which could in
theory be an easy common point to be (cpufeatures-) patched to an AVX2 variant, if
size (and alignment, perhaps) is a multiple of 32 bytes or so.

Assuming it's correct with arbitrary user-space FPU state and if it results in any
measurable speedups, which might not be the case: ERMS is supposed to be very
fast.

So even if it's possible (which it might not be), it could end up being slower
than the ERMS version.

Thanks,

	Ingo
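
For illustration only, here is a rough user-space sketch of the magic_memcpy() idea from the thread: take an AVX2 path only when the CPU reports the feature and the length is large enough and a multiple of 32 bytes, and fall back to plain memcpy() otherwise. This is not kernel code and not anybody's proposed patch. The threshold MAGIC_MEMCPY_MIN_LEN, the helper copy_avx2() and the return-value convention are all made up for the example, and the runtime check uses the GCC/Clang builtin rather than the kernel's cpufeatures patching. An in-kernel version would additionally have to answer the FPU state questions raised above, which this sketch does not attempt.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Only build the AVX2 path on x86 with GCC or Clang; elsewhere always fall back. */
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX2_PATH 1
#else
#define HAVE_AVX2_PATH 0
#endif

/* Hypothetical cut-off: below this the AVX2 path is assumed not to pay off. */
#define MAGIC_MEMCPY_MIN_LEN 256

#if HAVE_AVX2_PATH
/* Copy len bytes with unaligned 256-bit loads/stores; len must be a multiple of 32. */
__attribute__((target("avx2")))
static void copy_avx2(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	size_t i;

	for (i = 0; i < len; i += 32) {
		__m256i v = _mm256_loadu_si256((const __m256i *)(s + i));
		_mm256_storeu_si256((__m256i *)(d + i), v);
	}
}
#endif

/*
 * Hypothetical magic_memcpy(): returns 1 if the AVX2 path was used,
 * 0 if it fell back to memcpy(). Buffers must not overlap.
 */
static int magic_memcpy(void *dst, const void *src, size_t len)
{
#if HAVE_AVX2_PATH
	if (len >= MAGIC_MEMCPY_MIN_LEN && (len % 32) == 0 &&
	    __builtin_cpu_supports("avx2")) {
		copy_avx2(dst, src, len);
		return 1;
	}
#endif
	memcpy(dst, src, len);
	return 0;
}
```

Whether the wide path is actually faster than what memcpy() already does (ERMS on recent CPUs) would have to be measured; the sketch only shows the dispatch shape, not a performance claim.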