Date: Tue, 18 Oct 2016 09:58:08 +0200
From: Ingo Molnar
To: riel@redhat.com
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, torvalds@linux-foundation.org,
    luto@kernel.org, dave.hansen@intel.linux.com, tglx@linutronix.de,
    hpa@zytor.com
Subject: Re: [PATCH RFC 0/3] x86/fpu: defer FPU state loading until return to userspace
Message-ID: <20161018075808.GA21544@gmail.com>
In-Reply-To: <1476734984-13839-1-git-send-email-riel@redhat.com>

* riel@redhat.com wrote:

> These patches defer FPU state loading until return to userspace.
>
> This has the advantage of not clobbering the FPU state of one task
> with that of another when the other task only stays in kernel mode.
>
> It also allows us to skip the FPU restore in kernel_fpu_end(), which
> helps tasks that do multiple invocations of kernel_fpu_begin/end
> without returning to userspace, for example KVM VCPU tasks.
>
> We could also skip the restore of the KVM VCPU guest FPU state at
> guest entry time, if it is still valid, but I have not implemented
> that yet.
>
> The code that loads FPU context directly into registers from user
> space memory, or saves directly to user space memory, is wrapped in
> a retry loop that ensures the FPU state is correctly set up at the
> start and verifies that it is still valid at the end.
>
> I have stress tested these patches with various FPU test programs,
> and things seem to survive.
>
> However, I have not found any good test suites that mix FPU use and
> signal handlers, so close scrutiny of these patches would be
> appreciated.

BTW., for the next version it would be nice to also have a benchmark
that shows the advantages (and proves that the change does not cause
measurable overhead elsewhere). Either an FPU-aware extension to
'perf bench sched' or a separate 'perf bench fpu' suite would do.

Thanks,

	Ingo
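
For illustration, the retry pattern the cover letter describes has
roughly the following shape. This is a minimal userspace sketch, and
every helper name in it is a hypothetical stand-in, not the actual
patch code; only the shape of the loop reflects the description above.

	/*
	 * Sketch of the retry loop described in the cover letter.
	 * The helpers are hypothetical stand-ins for the real kernel
	 * primitives; the pattern is: set up the FPU state, do the
	 * copy (which may fault and sleep), then check that the
	 * registers were not lost in the meantime, and retry if so.
	 */
	#include <stdbool.h>
	#include <string.h>

	/* Hypothetical flag: "this task's FPU state is in the registers". */
	static bool fpregs_owned;

	/* Hypothetical: make sure the registers hold our FPU state. */
	static void fpu_state_setup(void)
	{
		fpregs_owned = true;
	}

	/*
	 * Hypothetical: did we keep the registers for the whole copy?
	 * In the kernel, being scheduled out (e.g. after a page fault
	 * slept) would clear this.
	 */
	static bool fpu_state_valid(void)
	{
		return fpregs_owned;
	}

	/*
	 * Load FPU context straight from (simulated) user memory into
	 * the register image; if the registers were lost mid-copy,
	 * start over.
	 */
	void restore_fpu_from_user(void *regs, const void *ubuf, size_t len)
	{
		do {
			fpu_state_setup();		/* correct at the start */
			memcpy(regs, ubuf, len);	/* may fault and sleep */
		} while (!fpu_state_valid());		/* still valid at the end? */
	}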
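
And a standalone sketch of the kind of FPU-aware context-switch
benchmark suggested above; the shape is an assumption, in the spirit
of 'perf bench sched pipe', not an actual perf patch. Both sides of a
pipe ping-pong touch the FPU between messages, so every context
switch has live FPU state to save and restore, which is exactly the
path these patches change.

	/*
	 * Pipe ping-pong between parent and child, with FPU work on
	 * both sides so each context switch carries live FPU state.
	 */
	#include <stdio.h>
	#include <time.h>
	#include <unistd.h>

	#define LOOPS 100000L

	static double spin_fpu(double x)
	{
		for (int i = 0; i < 64; i++)	/* keep FPU state live */
			x = x * 1.0000001 + 1e-9;
		return x;
	}

	int main(void)
	{
		int ab[2], ba[2];
		char c = 0;
		double x = 1.0;

		if (pipe(ab) || pipe(ba))
			return 1;

		if (fork() == 0) {		/* child: echo with FPU work */
			for (long i = 0; i < LOOPS; i++) {
				read(ab[0], &c, 1);
				x = spin_fpu(x);
				write(ba[1], &c, 1);
			}
			return 0;
		}

		struct timespec t0, t1;
		clock_gettime(CLOCK_MONOTONIC, &t0);
		for (long i = 0; i < LOOPS; i++) {	/* parent: ping */
			x = spin_fpu(x);
			write(ab[1], &c, 1);
			read(ba[0], &c, 1);
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);

		double ns = (t1.tv_sec - t0.tv_sec) * 1e9 +
			    (t1.tv_nsec - t0.tv_nsec);
		printf("%.0f ns per round trip (x=%f)\n", ns / LOOPS, x);
		return 0;
	}

Pinning both processes to one CPU (e.g. with taskset) makes the
measurement a pure context-switch cost, which is where a deferred
FPU load should show up against the unpatched kernel.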