From: Andy Lutomirski
Date: Tue, 20 Mar 2018 14:57:05 +0000
Subject: Re: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
To: Ingo Molnar
Cc: Thomas Gleixner, David Laight, Rahul Lakkireddy, "x86@kernel.org",
    "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org",
    "mingo@redhat.com", "hpa@zytor.com", "davem@davemloft.net",
    "akpm@linux-foundation.org", "torvalds@linux-foundation.org",
    "ganeshgr@chelsio.com", "nirranjan@chelsio.com",
    "indranil@chelsio.com", Andy Lutomirski, Peter Zijlstra,
    Fenghua Yu, Eric Biggers, Rik van Riel

On Tue, Mar 20, 2018 at 8:26 AM, Ingo Molnar wrote:
>
> * Thomas Gleixner wrote:
>
>> > Useful also for code that needs AVX-like registers to do things like CRCs.
>>
>> x86/crypto/ has a lot of AVX optimized code.
>
> Yeah, that's true, but the crypto code is processing fundamentally bigger blocks
> of data, which amortizes the cost of using kernel_fpu_begin()/_end().
>
> kernel_fpu_begin()/_end() is a pretty heavy operation because it does a full FPU
> save/restore via the XSAVE[S] and XRSTOR[S] instructions, which can easily copy a
> thousand bytes around! So kernel_fpu_begin()/_end() is probably a non-starter for
> something small, like a single 256-bit or 512-bit word access.
>
> But there's actually a new thing in modern kernels: we got rid of (most of) lazy
> save/restore FPU code, our new x86 FPU model is very "direct" with no FPU faults
> taken normally.
>
> So assuming the target driver will only load on modern FPUs I *think* it should
> actually be possible to do something like (pseudocode):
>
>     vmovdqa %ymm0, 40(%rsp)
>     vmovdqa %ymm1, 80(%rsp)
>
>     ...
>     # use ymm0 and ymm1
>     ...
>
>     vmovdqa 80(%rsp), %ymm1
>     vmovdqa 40(%rsp), %ymm0
>

I think this kinda sorta works, but only kinda sorta:

 - I'm a bit worried about newer CPUs where, say, a 256-bit vector
   operation will implicitly clear the high 768 bits of the regs.
   (IIRC that's how it works for the new VEX stuff.)

 - This code will cause XINUSE to be set, which is user-visible and
   may have performance and correctness effects.  I think the kernel
   may already have some bug where it occasionally sets XINUSE on its
   own, and this caused problems for rr in the past.  This might not
   be a total showstopper, but it's odd.

I'd rather see us finally finish the work that Rik started to rework
this differently.  I'd like kernel_fpu_begin() to look like:

if (test_thread_flag(TIF_NEED_FPU_RESTORE)) {
  return; // we're already okay.  maybe we need to check in_interrupt() or something, though?
} else {
  XSAVES/XSAVEOPT/XSAVE;
  set_thread_flag(TIF_NEED_FPU_RESTORE);
}

and kernel_fpu_end() does nothing at all.

We take the full performance hit for a *single* kernel_fpu_begin() on
an otherwise short syscall or interrupt, but there's no additional
cost for more of them or for long-enough-running things that we
schedule in the middle.

As I remember, the main hangup was that this interacts a bit oddly
with PKRU, but that's manageable.

--Andy
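
The cost argument in the proposal above (one save per kernel entry,
however many begin/end pairs follow) can be sketched as a small
userspace model.  This is not kernel code: struct sim_thread, the
sim_* function names and the two counters are hypothetical, and the
restore on return to user mode is an assumption inferred from the
flag name TIF_NEED_FPU_RESTORE, not something the mail spells out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-thread state; not a kernel structure. */
struct sim_thread {
    bool need_fpu_restore;  /* models TIF_NEED_FPU_RESTORE */
    unsigned xsave_count;   /* simulated XSAVES/XSAVEOPT/XSAVE executions */
    unsigned xrstor_count;  /* simulated restores on return to user mode */
};

/* Save user FPU state only if it has not been saved since kernel entry. */
static void sim_kernel_fpu_begin(struct sim_thread *t)
{
    if (t->need_fpu_restore)
        return;             /* already saved, nothing more to pay */
    t->xsave_count++;       /* stands in for the expensive save */
    t->need_fpu_restore = true;
}

/* In the proposed scheme, ending a kernel FPU section costs nothing. */
static void sim_kernel_fpu_end(struct sim_thread *t)
{
    (void)t;
}

/* Assumed exit path: restore once, and only if a save happened. */
static void sim_return_to_user(struct sim_thread *t)
{
    if (!t->need_fpu_restore)
        return;
    t->xrstor_count++;
    t->need_fpu_restore = false;
    /* A restore is only ever triggered by an earlier save. */
    assert(t->xrstor_count <= t->xsave_count);
}
```

In this model, three back-to-back begin/end pairs within one simulated
syscall perform a single save, and the following return to user mode
performs a single restore.  The in_interrupt() question and the PKRU
interaction raised in the mail are deliberately left out.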