Date: Wed, 19 Sep 2018 19:05:16 +0200
From: Sebastian Andrzej Siewior
To: Andy Lutomirski
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Andy Lutomirski,
 Paolo Bonzini, Radim Krčmář, kvm@vger.kernel.org,
 "Jason A. Donenfeld", Rik van Riel
Subject: Re: [RFC PATCH 10/10] x86/fpu: defer FPU state load until return to userspace
Message-ID: <20180919170515.ptqmmpsxrdjsi64j@linutronix.de>
References: <20180912133353.20595-1-bigeasy@linutronix.de>
 <20180912133353.20595-11-bigeasy@linutronix.de>
 <650FC457-7E4C-473A-9E5F-EAFC74F6444B@amacapital.net>
In-Reply-To: <650FC457-7E4C-473A-9E5F-EAFC74F6444B@amacapital.net>

On 2018-09-12 08:47:19 [-0700], Andy Lutomirski wrote:
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -101,14 +101,14 @@ void __kernel_fpu_begin(void)
> >
> >  	kernel_fpu_disable();
> >
> > -	if (fpu->initialized) {
> > +	__cpu_invalidate_fpregs_state();
> > +
> > +	if (!test_and_set_thread_flag(TIF_LOAD_FPU)) {
>
> Since the already-TIF_LOAD_FPU path is supposed to be fast here, use
> test_thread_flag() instead. test_and_set operations do unconditional RMW
> operations and are always full barriers, so they’re slow.

Okay.

> Also, on top of this patch, there should be lots of cleanups available.
> In particular, all the fpu state accessors could probably be reworked to
> take TIF_LOAD_FPU into account, which would simplify the callers and
> maybe even the mess of variables tracking whether the state is in regs.

Are you referring to the fpu->initialized check, or something else?

Sebastian
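
For reference, the "test first, set only when needed" pattern suggested
in the review can be sketched in userspace with C11 atomics. This is
only an analogy under stated assumptions, not the kernel code: flags,
FLAG_LOAD_FPU, test_flag(), test_and_set_flag() and begin_needs_save()
are hypothetical names standing in for the thread flag word and the
thread-flag helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thread's flag word; not kernel code. */
#define FLAG_LOAD_FPU (1UL << 0)

static atomic_ulong flags;

/* Plain read: does not write the cache line and implies no full barrier. */
static bool test_flag(unsigned long mask)
{
	return atomic_load_explicit(&flags, memory_order_relaxed) & mask;
}

/* Atomic read-modify-write: always writes and is fully ordered (seq_cst). */
static bool test_and_set_flag(unsigned long mask)
{
	return atomic_fetch_or(&flags, mask) & mask;
}

/*
 * Returns true if the caller has to save the register state, i.e. the
 * flag was clear on entry. The common already-set case costs one load.
 */
static bool begin_needs_save(void)
{
	if (test_flag(FLAG_LOAD_FPU))
		return false;	/* fast path: flag already set */

	return !test_and_set_flag(FLAG_LOAD_FPU);
}
```

With the flag initially clear, the first call to begin_needs_save()
returns true and sets the flag; every later call returns false after a
single relaxed load, without touching the RMW path.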