Date: Tue, 24 Feb 2015 02:14:10 +0000 (GMT)
From: "Maciej W. Rozycki"
To: Andy Lutomirski
cc: Rik van Riel, Borislav Petkov, Ingo Molnar, Oleg Nesterov, X86 ML,
    "linux-kernel@vger.kernel.org", Linus Torvalds
Subject: Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

On Mon, 23 Feb 2015, Andy Lutomirski wrote:

> >> After a context switch, the instructions from the old task are no
> >> longer in the pipeline.
> >
> > I'd say it's implementation-specific.  As I mentioned the i486 aborted
> > any transcendental x87 instruction in progress upon taking an exception or
> > interrupt.  That was a model like you refer to, but as I also mentioned it
> > had its shortcomings.
>
> IRET is serializing, according to the docs (I think) and according
> to the Intel engineers I asked (I'm absolutely certain about this
> part).  So FPU ops are entirely done at the end of a normal context
> switch.

No question about the serialising property of IRET, it has been like this
since the original Pentium implementation.  Do you have an architecture
specification reference to back up your claim though as far as the FPU is
concerned?  I'm asking because I am genuinely curious.

The x87 case is so special, there isn't anything there really that is
externally observable or should be affected by IRET or any other
synchronisation barriers apart from WAIT (or a waiting x87 instruction)
that has been there for this purpose since forever.  And it would defeat
some documented benefits of running the FP pipeline in parallel.  And
certainly such synchronisation didn't happen in the old days.

> We also always save the FPU context on every context switch away from
> a task that used the FPU, even in lazy mode.  This is because we might
> switch the task back in on a different CPU, and we don't want to use
> an IPI to move the FPU context.

That's an interesting case too, although not necessarily related.  If you
say that we always save the FP context eagerly for the purpose of process
migration, then sure, that invalidates any benefit we'd have from letting
the x87 proceed.  However I can see different ways to address this case
avoiding the need of eager FP context saving or an IPI:

1. We could bind any currently suspended process with an unsaved FP
   context to the CPU it last executed on.

2. We could mark such a process for migration next time and let it
   execute on the CPU that holds its FP context once more, and then save
   the FP context eagerly on the way out.

In some cases a lazily retained FP context would be preempted by another
process before the process in question would resume anyway.  In this case
any temporary binding to a CPU could be given up.

> Given that we're only talking about old CPUs here, I sincerely doubt
> that there's any relevant case in which an fxsave can usefully wait
> for a long-running transcendental op to finish while we continue doing
> useful work.  *Especially* since there will almost certainly be
> several more mfences or atomic ops before the end of the context
> switch, even if we're lucky enough to complete the context switching
> using sysret.

I am not sure what you mean by FXSAVE usefully waiting for an op, please
elaborate.  At the point you've reached FXSAVE and an earlier x87
instruction hasn't completed, you've already lost.  The pipeline will be
stalled until the x87 instruction has completed and it can be hundreds of
cycles.  My point therefore has been about not executing FXSAVE for the
old task until absolutely necessary, which with lazy FP context switching
would be at the next x87 (or SSE) instruction reached by the new task.

Likewise I don't see why MFENCE or an atomic operation should affect the
execution of say FSINCOS.  Whether the results of FSINCOS arrive before or
after MFENCE, etc. is not externally observable.

And I'm not sure if this all affects old CPUs only -- I don't know how
much x87 software is out there, but after all these years I'd expect quite
a lot.  Sure, lots of this can be recompiled to use SSE instead, but not
all, and even where it is feasible, that's an extra burden for people,
beyond say a routine hardware or Linux distribution or for that matter
lone kernel upgrade.  Therefore I think we need to be careful not to
pessimise things for a subset of people too much and ideally at all.

And to be clear, I am not against removing lazy FP context switching per
se.  I am just emphasizing that we need to be careful with that and be
absolutely sure that it does not cause excessive harm.

I still wonder why Intel hasn't addressed some issues around this stuff --
is it that there are not enough people using proper IEEE 754 arithmetic on
x86 hardware to attract the interest of hardware architecture maintainers?
After all the same issue applies to enabled IEEE 754 exceptions, a #MF/#XM
exception isn't going to take any less than a #NM fault.

Or maybe I'm just missing something?

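For illustration only, the two alternatives to eager saving listed above
(binding a task to the CPU that holds its unsaved FP context, or letting it
run there once more and saving on the way out) could be modelled roughly as
below.  This is a user-space sketch, not kernel code: struct task, fp_owner,
fp_use(), switch_out() and pick_cpu() are all hypothetical names, and
fp_save() merely stands in for FXSAVE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical task descriptor, not the kernel's task_struct. */
struct task {
	int  last_cpu;        /* CPU the task last ran on */
	bool migrate_pending; /* option 2: save FP state on next switch-out */
	int  fp_saves;        /* number of simulated FXSAVEs for this task */
};

/* Task whose FP context each CPU's registers currently hold, if any. */
static struct task *fp_owner[NR_CPUS];

/* True if the task's FP context lives only in its last CPU's registers. */
bool fp_unsaved(const struct task *t)
{
	return fp_owner[t->last_cpu] == t;
}

/* Stand-in for FXSAVE: spill the current owner's context to memory. */
void fp_save(int cpu)
{
	struct task *t = fp_owner[cpu];

	if (t) {
		t->fp_saves++;
		fp_owner[cpu] = NULL;
	}
}

/* Task t, running on cpu, reaches an FP instruction (the #NM case). */
void fp_use(struct task *t, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	t->last_cpu = cpu;
	if (fp_owner[cpu] != t) {
		fp_save(cpu);      /* evict the previous lazy owner, if any */
		fp_owner[cpu] = t;
	}
}

/* Switch away from t: lazy unless a migration has been requested. */
void switch_out(struct task *t, int cpu)
{
	t->last_cpu = cpu;
	if (t->migrate_pending) {
		if (fp_owner[cpu] == t)
			fp_save(cpu);  /* option 2: save eagerly on the way out */
		t->migrate_pending = false;
	}
}

/*
 * Placement: a task with an unsaved FP context stays on its last CPU
 * (option 1) and is marked so that it can migrate next time (option 2).
 */
int pick_cpu(struct task *t, int wanted_cpu)
{
	if (fp_unsaved(t) && wanted_cpu != t->last_cpu) {
		t->migrate_pending = true;
		return t->last_cpu;
	}
	return wanted_cpu;
}
```

In this model a task whose lazily retained context has been evicted by
another FP user is no longer bound to its last CPU, which corresponds to the
case where the temporary binding can be given up.
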
  Maciej