Date: Thu, 14 Apr 2016 22:14:22 -0400
From: Konrad Rzeszutek Wilk
To: "Luis R. Rodriguez"
Cc: George Dunlap, Matt Fleming, jeffm@suse.com,
    Linux Kernel Mailing List, Jim Fehlig, Jan Beulich,
    "H. Peter Anvin", Daniel Kiper, the arch/x86 maintainers,
    Takashi Iwai, Vojtěch Pavlík, Gary Lin, xen-devel,
    Jeffrey Cheung, Charles Arndol, Julien Grall, Stefano Stabellini,
    joeyli, Borislav Petkov, Boris Ostrovsky, Juergen Gross,
    Andrew Cooper, Michael Chang, Andy Lutomirski, David Vrabel,
    Linus Torvalds, Roger Pau Monné
Subject: Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry
Message-ID: <20160415021422.GB6956@localhost.localdomain>
In-Reply-To: <20160414211201.GS1990@wotan.suse.de>
References: <20160406024027.GX1990@wotan.suse.de>
    <20160407185148.GL1990@wotan.suse.de>
    <20160413195257.GB1990@wotan.suse.de>
    <570F68AB.2040400@citrix.com>
    <20160414194408.GP1990@wotan.suse.de>
    <20160414203847.GB21657@localhost.localdomain>
    <20160414211201.GS1990@wotan.suse.de>

On Thu, Apr 14, 2016 at 11:12:01PM +0200, Luis R.
Rodriguez wrote:
> On Thu, Apr 14, 2016 at 04:38:47PM -0400, Konrad Rzeszutek Wilk wrote:
> > > This has nothing to do with dominance or anything nefarious, I'm asking
> > > simply for a full engineering evaluation of all possibilities, with
> > > the long term in mind. Not for now, but for hardware assumptions which
> > > are sensible 5 years from now.
> >
> > There are two different things in my mind about this conversation:
> >
> >  1). semantics of low-level code wrapped around pvops. On baremetal
> >     it is easy - just look at Intel and AMD SDM.
> >     And this is exactly what running in HVM or HVMLite mode will do -
> >     all those low-level operations will have the same exact semantic
> >     as baremetal.
>
> Today Linux is KVM stupid for early boot code. I've pointed this out

-EPARSE?

> before, but again, there has been no reason found to need this. Perhaps
> for HVMLite we won't need this...

Are you talking about kvmtools? Which BTW are similar to how HVMLite
would expose the platform.

>
> >     There is no hope for the pv_ops to fix that.
>
> Actually I beg to differ. See my patches and ongoing work.

I meant in terms of semantics. As in I cannot see some of those pv-ops
to have the same semantics as baremetal. For example set_pte is simple
on x86 (movq $, ). While on Xen PV it is a potential batching hypercall
with lookup in a P2M table, then perhaps a sidelong look at the M2P,
then maybe the M2P override.

>
> >     And I am pretty sure the HVMLite in 5 years will have no
> >     trouble in this as it will be running in VMX mode (HVM).
>
> HVMLite may still use PV drivers for some things, its not super
> obvious to me that low level semantics will not be needed yet.

PV drivers are very different from low-level semantics. And it will
have to use them. Maybe it is easier to think of this in terms
of kvmtool - it is pretty much how this would work - but instead
of VirtIO drivers you would be using the Xen PV drivers (though one
could also use VirtIO ones if you wanted).
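To make the set_pte contrast above concrete, here is a toy C model. It is a sketch only: every name in it (toy_native_set_pte, toy_pv_set_pte, toy_p2m, the batch size of 4) is made up for illustration, and none of it is the actual Linux or Xen code. It leaves out the M2P and M2P override steps entirely, and a counter stands in for the hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are real Linux or Xen interfaces. */
typedef uint64_t pte_t;

#define TOY_NR_PAGES   8   /* number of frames in the toy guest */
#define TOY_BATCH_MAX  4   /* queued updates before a forced flush */
#define TOY_PFN_SHIFT  12
#define TOY_FLAGS_MASK ((UINT64_C(1) << TOY_PFN_SHIFT) - 1)

/* Bare metal (and HVM/HVMLite): a PTE update is a single store. */
static void toy_native_set_pte(pte_t *ptep, pte_t val)
{
    *ptep = val;
}

/* Toy P2M: guest pseudo-physical frame number -> machine frame number. */
static const uint64_t toy_p2m[TOY_NR_PAGES] = { 42, 7, 19, 3, 88, 61, 5, 30 };

struct toy_update {
    pte_t *ptep;
    pte_t val;
};

static struct toy_update toy_queue[TOY_BATCH_MAX];
static size_t toy_queued;
static unsigned toy_hypercalls;   /* count of simulated hypercalls */

/* Stand-in for one batched hypercall that applies every queued update. */
static void toy_pv_flush(void)
{
    if (toy_queued == 0)
        return;
    for (size_t i = 0; i < toy_queued; i++)
        *toy_queue[i].ptep = toy_queue[i].val;
    toy_queued = 0;
    toy_hypercalls++;
}

/* PV: translate the PFN to an MFN through the P2M, then queue the update. */
static void toy_pv_set_pte(pte_t *ptep, pte_t val)
{
    uint64_t pfn = val >> TOY_PFN_SHIFT;

    assert(pfn < TOY_NR_PAGES);
    pte_t mval = (toy_p2m[pfn] << TOY_PFN_SHIFT) | (val & TOY_FLAGS_MASK);

    if (toy_queued == TOY_BATCH_MAX)
        toy_pv_flush();           /* queue is full: pay for a hypercall */
    toy_queue[toy_queued++] = (struct toy_update){ ptep, mval };
}
```

In this model the native path is visible to the caller immediately, while the PV path is not visible until toy_pv_flush() runs, and the value that lands in the page table differs from the one the caller passed in. That is the sense in which the two do not share semantics.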
>
> >  2). Boot entry.
> >
> >     The semantics on Linux are well known - they are documented in
> >     Documentation/x86/boot.txt.
> >
> >     HVMLite Linux guests have to somehow provide that.
> >
> >     And how it is done seems to be tied around:
> >
> >     a) Use existing boot paths - which means making some
> >        extra stub code to call in those existing boot paths
> >        (for example Xen could bundle with an GRUB2-alike
> >        code to be run when booting Linux using that boot-path).
> >
> >        Or EFI (for a ton more code). Granted not all OSes
> >        support those, so not very OS agnostic.
>
> What other OSes do is something to consider but if they don't
> do it because they are slacking in one domain should by no means
> be a reason to not evaluate the long term possible gains.
> Specially if we have reasons to believe more architectures will
> consider it and standardize on it.
>
> It'd be silly not to take this a bit more seriously.

Complexity vs simplicity.

>
> >        Hard part - if the bootparams change then have to
> >        rev up the code in there. May be out of sync
> >        with Linux bootparams.
>
> If we are going to ultimately standardize on EFI boot for new
> hardware it'd be rather silly to extend the boot params further.

Whoa there... Have you spoken to hpa,tglrx about this?

>
> >     b) Add another simpler boot entry point which has to copy
> >        "some" strings from its format in bootparams.
> >
> >
> >     So this part of the discussion does not fall in the
> >     hardware assumptions. Intel SDM or AMD mention nothing about
> >     boot loaders or how to boot an OS - that is all in realms
> >     of how software talks to software.
>
> Right -- so one question to ask here is what other uses are there
> for this outside of say HVMLite. You mentioned Multiboot so far.
>
> >  3). And there is the discussion on man-power to make this
> >     happen.
>
> Sure.
>
> >  4). Lastly which one is simpler and involves less code so
> >     that there is a less chance of bitrot.
>
> Indeed.
>
> You also forgot the tie-in between dead-code and semantics but

Wait, I just spoke about CPU semantics?! Which semantics are you
talking about?

> that clearly is not on your mind. But I'd say this is a good
> summary.

I put 'dead code' in the same realm as device drivers work. And they
seem to always have some issue or another. Or maybe I am just getting
unlucky and getting copied on those bugs.

>
>   Luis