Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964880Ab2JCPtF (ORCPT ); Wed, 3 Oct 2012 11:49:05 -0400 Received: from smtp.eu.citrix.com ([62.200.22.115]:8312 "EHLO SMTP.EU.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932427Ab2JCPtD (ORCPT ); Wed, 3 Oct 2012 11:49:03 -0400 X-IronPort-AV: E=Sophos;i="4.80,528,1344211200"; d="scan'208";a="14920034" Date: Wed, 3 Oct 2012 16:48:00 +0100 From: Stefano Stabellini X-X-Sender: sstabellini@kaball.uk.xensource.com To: Ian Campbell CC: Konrad Rzeszutek Wilk , Stefano Stabellini , "xen-devel@lists.xensource.com" , "linux-kernel@vger.kernel.org" Subject: Re: [Xen-devel] [PATCH] xen: point xen_start_info to a dummy struct for PV on HVM guests In-Reply-To: <1349276431.650.156.camel@zakaz.uk.xensource.com> Message-ID: References: <1349272182.650.150.camel@zakaz.uk.xensource.com> <1349272482.650.151.camel@zakaz.uk.xensource.com> <20121003141116.GA10633@phenom.dumpdata.com> <1349276431.650.156.camel@zakaz.uk.xensource.com> User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3377 Lines: 73 On Wed, 3 Oct 2012, Ian Campbell wrote: > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote: > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote: > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote: > > > > On Wed, 3 Oct 2012, Ian Campbell wrote: > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote: > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so > > > > > > xen_start_info is just NULL for them. > > > > > > That is problem because other parts of the code expect xen_start_info to > > > > > > point to something valid, for example xen_initial_domain() is defined as > > > > > > follow: > > > > > > > > > > > > #define xen_initial_domain() (xen_domain() && \ > > > > > > xen_start_info->flags & SIF_INITDOMAIN) > > > > > > > > > > But anyone who calls this before xen_start_info is setup is going to get > > > > > a bogus result, specifically in this case they will think they are domU > > > > > when in reality they are dom0 -- wouldn't it be better to fix those > > > > > callsites? > > > > > > > > That cannot be the case because setting up xen_start_info is the very > > > > first thing that is done, before even calling to C. > > > > > > On PV, yes, but you are trying to fix PVHVM here, no? > > > > > > Otherwise if this is always set before calling into C then what is the > > > purpose of this patch? > > > > to fix this - as PVHVM has it set to NULL and we end up de-referencing > > the xen_start_info and crashing. As so:: > > > > Right, so returning to my original point: The caller here is calling > xen_initial_domain() *before* start info is setup. This is bogus and is > your actual bug, all this patch does is hide that real issue. That is because xen_start_info wasn't setup at all for PV on HVM guests. The real reason is that PV on HVM guests don't have one, but that is another matter. Until we get rid of all the references to xen_start_info outside of PV specific code, we should just assume that there is one, and that is already setup. One day not too far from now, we might refactor the code to never reference xen_start_info directly, but I don't think that now is the time for that. Also consider that this is the same thing we do on ARM. > With this "fix" the caller of xen_initial_domain shown in this trace now > gets a rubbish result based on the content of a dummy shared info > instead of the real answer from that actual shared info. That is not true. The caller gets a zero result, that is completely appropriate in this case, given that a PV on HVM guest doesn't have a start_info. > The right fix is to fix the caller to not call xen_initial_domain() > until after the shared info has been setup. Maybe that means moving > shinfo setup earlier, or maybe it means deferring this call until later > in the PVHVM case. I don't think so, we should be able to call xen_initial_domain() at any point in the code. The best course of action is taking this fix now (making PVHVM x86 guests behave the same way as ARM guests) and refactor all the callers to xen_start_info later. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/