Subject: Re: [RFC PATCH 10/10] x86/fpu: defer FPU state load until return to userspace
From: Andy Lutomirski
Date: Wed, 12 Sep 2018 08:47:19 -0700
To: Sebastian Andrzej Siewior
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Andy Lutomirski, Paolo Bonzini, Radim Krčmář, kvm@vger.kernel.org, "Jason A. Donenfeld", Rik van Riel
Message-Id: <650FC457-7E4C-473A-9E5F-EAFC74F6444B@amacapital.net>
In-Reply-To: <20180912133353.20595-11-bigeasy@linutronix.de>
References: <20180912133353.20595-1-bigeasy@linutronix.de> <20180912133353.20595-11-bigeasy@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Sep 12, 2018, at 6:33 AM, Sebastian Andrzej Siewior wrote:
>
> From: Rik van Riel
>
> Defer loading of FPU state until return to userspace.
> This gives the kernel the potential to skip loading FPU state for
> tasks that stay in kernel mode, or for tasks that end up with repeated
> invocations of kernel_fpu_begin.
>
> It also increases the chances that a task's FPU state will remain
> valid in the FPU registers until it is scheduled back in, allowing
> us to skip restoring that task's FPU state altogether.
>
>
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -101,14 +101,14 @@ void __kernel_fpu_begin(void)
>
>      kernel_fpu_disable();
>
> -    if (fpu->initialized) {
> +    __cpu_invalidate_fpregs_state();
> +
> +    if (!test_and_set_thread_flag(TIF_LOAD_FPU)) {

Since the already-TIF_LOAD_FPU path is supposed to be fast here, use
test_thread_flag() instead. test_and_set operations do unconditional RMW
operations and are always full barriers, so they’re slow.

Also, on top of this patch, there should be lots of cleanups available. In
particular, all the fpu state accessors could probably be reworked to take
TIF_LOAD_FPU into account, which would simplify the callers and maybe even
the mess of variables tracking whether the state is in regs.
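
To illustrate the point about checking the flag before doing the RMW, here is a minimal userspace sketch in C11 atomics. It is not kernel code and not the patch itself: FLAG_LOAD_FPU, thread_flags, flag_test(), flag_test_and_set() and begin_needs_save() are hypothetical stand-ins chosen for this example, and the "test first, then test-and-set" shape is only one possible reading of the suggestion above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for TIF_LOAD_FPU; not the kernel's definition. */
#define FLAG_LOAD_FPU (1u << 0)

/* Hypothetical stand-in for the current task's thread flags. */
static atomic_uint thread_flags;

/* Plain load, no read-modify-write: similar in spirit to test_thread_flag(). */
static bool flag_test(unsigned int mask)
{
    return (atomic_load_explicit(&thread_flags, memory_order_relaxed) & mask) != 0;
}

/* Unconditional atomic RMW: similar in spirit to test_and_set_thread_flag(). */
static bool flag_test_and_set(unsigned int mask)
{
    return (atomic_fetch_or(&thread_flags, mask) & mask) != 0;
}

/*
 * Returns true when the flag was clear, i.e. the caller still has to do
 * the slow work (in the patch's terms, saving the live register state).
 * When the flag is already set, only the cheap load is executed.
 */
static bool begin_needs_save(void)
{
    if (flag_test(FLAG_LOAD_FPU))
        return false;                          /* fast path: no RMW */
    return !flag_test_and_set(FLAG_LOAD_FPU);  /* slow path: set the flag */
}

/* Usage: the first call takes the slow path, later calls take the fast one. */
static void self_check(void)
{
    atomic_store(&thread_flags, 0);
    assert(begin_needs_save());
    assert(!begin_needs_save());
}
```

The slow path still uses the atomic test-and-set, so the sketch stays correct even if something else sets the flag between the load and the RMW. Whether the kernel could get away with a plain set there depends on who else may touch the flag, which this example does not try to answer.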