PV on HVM guests don't have a start_info page mapped by Xen, so
xen_start_info is just NULL for them.
That is problem because other parts of the code expect xen_start_info to
point to something valid, for example xen_initial_domain() is defined as
follow:
#define xen_initial_domain() (xen_domain() && \
xen_start_info->flags & SIF_INITDOMAIN)
Allocate a dummy start_info struct and point xen_start_info to it, as we
do on ARM.
This is not going to change things for PV guests because
xen_start_info is set by arch/x86/xen/xen-head.S:startup_xen.
Signed-off-by: Stefano Stabellini <[email protected]>
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index bf788d3..5f242cb 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -96,7 +96,8 @@ EXPORT_SYMBOL(machine_to_phys_mapping);
unsigned long machine_to_phys_nr;
EXPORT_SYMBOL(machine_to_phys_nr);
-struct start_info *xen_start_info;
+static struct start_info _xen_start_info;
+struct start_info *xen_start_info = &_xen_start_info;
EXPORT_SYMBOL_GPL(xen_start_info);
struct shared_info xen_dummy_shared_info;
On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> PV on HVM guests don't have a start_info page mapped by Xen, so
> xen_start_info is just NULL for them.
> That is problem because other parts of the code expect xen_start_info to
> point to something valid, for example xen_initial_domain() is defined as
> follow:
>
> #define xen_initial_domain() (xen_domain() && \
> xen_start_info->flags & SIF_INITDOMAIN)
But anyone who calls this before xen_start_info is setup is going to get
a bogus result, specifically in this case they will think they are domU
when in reality they are dom0 -- wouldn't it be better to fix those
callsites?
Perhaps turn this into a static inline with a BUG_ON(!xen_start_info) to
make catching these cases easier?
>
>
> Allocate a dummy start_info struct and point xen_start_info to it, as we
> do on ARM.
> This is not going to change things for PV guests because
> xen_start_info is set by arch/x86/xen/xen-head.S:startup_xen.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index bf788d3..5f242cb 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -96,7 +96,8 @@ EXPORT_SYMBOL(machine_to_phys_mapping);
> unsigned long machine_to_phys_nr;
> EXPORT_SYMBOL(machine_to_phys_nr);
>
> -struct start_info *xen_start_info;
> +static struct start_info _xen_start_info;
> +struct start_info *xen_start_info = &_xen_start_info;
> EXPORT_SYMBOL_GPL(xen_start_info);
>
> struct shared_info xen_dummy_shared_info;
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> http://lists.xen.org/xen-devel
On Wed, 3 Oct 2012, Ian Campbell wrote:
> On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > PV on HVM guests don't have a start_info page mapped by Xen, so
> > xen_start_info is just NULL for them.
> > That is problem because other parts of the code expect xen_start_info to
> > point to something valid, for example xen_initial_domain() is defined as
> > follow:
> >
> > #define xen_initial_domain() (xen_domain() && \
> > xen_start_info->flags & SIF_INITDOMAIN)
>
> But anyone who calls this before xen_start_info is setup is going to get
> a bogus result, specifically in this case they will think they are domU
> when in reality they are dom0 -- wouldn't it be better to fix those
> callsites?
That cannot be the case because setting up xen_start_info is the very
first thing that is done, before even calling to C.
> Perhaps turn this into a static inline with a BUG_ON(!xen_start_info) to
> make catching these cases easier?
On Wed, Oct 03, 2012 at 02:37:53PM +0100, Stefano Stabellini wrote:
> PV on HVM guests don't have a start_info page mapped by Xen, so
> xen_start_info is just NULL for them.
> That is problem because other parts of the code expect xen_start_info to
> point to something valid, for example xen_initial_domain() is defined as
> follow:
>
> #define xen_initial_domain() (xen_domain() && \
> xen_start_info->flags & SIF_INITDOMAIN)
>
.. introduced by commit 4c071ee5268f7234c3d084b6093bebccc28cdcba
("arm: initial Xen support)
>
> Allocate a dummy start_info struct and point xen_start_info to it, as we
> do on ARM.
> This is not going to change things for PV guests because
> xen_start_info is set by arch/x86/xen/xen-head.S:startup_xen.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
Reported-by: Konrad Rzeszutek Wilk <[email protected]>
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index bf788d3..5f242cb 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -96,7 +96,8 @@ EXPORT_SYMBOL(machine_to_phys_mapping);
> unsigned long machine_to_phys_nr;
> EXPORT_SYMBOL(machine_to_phys_nr);
>
> -struct start_info *xen_start_info;
> +static struct start_info _xen_start_info;
And lets change that to
'xen_dummy_start_info' to keep in sync with the other dummy one.
And also add a commnt:
/*
* Since 'xen_initial_domain' dereferences the xen_start_info we need
* a dummy structure filled with zeros (for PVHVM guests which initialize
* this late). For PV guests we do not have to worry about this as the first
* few instructions (startup_xen) set it properly.
*/
> +struct start_info *xen_start_info = &_xen_start_info;
> EXPORT_SYMBOL_GPL(xen_start_info);
>
> struct shared_info xen_dummy_shared_info;
On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> On Wed, 3 Oct 2012, Ian Campbell wrote:
> > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > xen_start_info is just NULL for them.
> > > That is problem because other parts of the code expect xen_start_info to
> > > point to something valid, for example xen_initial_domain() is defined as
> > > follow:
> > >
> > > #define xen_initial_domain() (xen_domain() && \
> > > xen_start_info->flags & SIF_INITDOMAIN)
> >
> > But anyone who calls this before xen_start_info is setup is going to get
> > a bogus result, specifically in this case they will think they are domU
> > when in reality they are dom0 -- wouldn't it be better to fix those
> > callsites?
>
> That cannot be the case because setting up xen_start_info is the very
> first thing that is done, before even calling to C.
On PV, yes, but you are trying to fix PVHVM here, no?
Otherwise if this is always set before calling into C then what is the
purpose of this patch?
>
>
> > Perhaps turn this into a static inline with a BUG_ON(!xen_start_info) to
> > make catching these cases easier?
>
On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > xen_start_info is just NULL for them.
> > > > That is problem because other parts of the code expect xen_start_info to
> > > > point to something valid, for example xen_initial_domain() is defined as
> > > > follow:
> > > >
> > > > #define xen_initial_domain() (xen_domain() && \
> > > > xen_start_info->flags & SIF_INITDOMAIN)
> > >
> > > But anyone who calls this before xen_start_info is setup is going to get
> > > a bogus result, specifically in this case they will think they are domU
> > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > callsites?
> >
> > That cannot be the case because setting up xen_start_info is the very
> > first thing that is done, before even calling to C.
>
> On PV, yes, but you are trying to fix PVHVM here, no?
>
> Otherwise if this is always set before calling into C then what is the
> purpose of this patch?
to fix this - as PVHVM has it set to NULL and we end up de-referencing
the xen_start_info and crashing. As so::
Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.6.0upstream-04121-g0313983 ([email protected]) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) #1 SMP Tue Oct 2 16:31:21 EDT 2012
[ 0.000000] Command line: initrd=initramf.gz console=ttyS0,115200 test=net nofb earlyprintk=serial,ttyS0,115200 BOOT_IMAGE=vmlinuz
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007fffffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
[ 0.000000] bootconsole [earlyser0] enabled
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI 2.4 present.
[ 0.000000] Hypervisor detected: Xen HVM
[ 0.000000] Xen version 4.1.
[ 0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
[ 0.000000] Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
[ 0.000000] You might have to change the root device
[ 0.000000] from /dev/hd[a-d] to /dev/xvd[a-d]
[ 0.000000] in your root= kernel command line option
[ 0.000000] No AGP bridge found
[ 0.000000] e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
[ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[ 0.000000] found SMP MP-table at [mem 0x000fbc90-0x000fbc9f] mapped at [ffff8800000fbc90]
[ 0.000000] init_memory_mapping: [mem 0x00000000-0x7fffffff]
[ 0.000000] RAMDISK: [mem 0x7abeb000-0x7ffdefff]
[ 0.000000] ACPI: RSDP 00000000000ea020 00024 (v02 Xen)
[ 0.000000] ACPI: XSDT 00000000fc00f2b0 00034 (v01 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: FACP 00000000fc00f0d0 000F4 (v04 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: DSDT 00000000fc003440 0BC09 (v02 Xen HVM 00000000 INTL 20100528)
[ 0.000000] ACPI: FACS 00000000fc003400 00040
[ 0.000000] ACPI: APIC 00000000fc00f1d0 000D8 (v02 Xen HVM 00000000 HVML 00000000)
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000007fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x00000000-0x7fffffff]
[ 0.000000] NODE_DATA [mem 0x7fffc000-0x7fffffff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x00010000-0x00ffffff]
[ 0.000000] DMA32 [mem 0x01000000-0xffffffff]
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x00010000-0x0009dfff]
[ 0.000000] node 0: [mem 0x00100000-0x7fffffff]
[ 0.000000] ACPI: PM-Timer IO Port: 0xb008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x04] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x06] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x08] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x0a] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0c] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x0e] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x10] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x12] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x14] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x16] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x18] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x1a] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x1c] disabled)
[ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-47
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 low level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 low level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 low level)
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] smpboot: Allowing 15 CPUs, 13 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: 000000000009e000 - 00000000000a0000
[ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
[ 0.000000] PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
[ 0.000000] e820: [mem 0x80000000-0xfbffffff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on Xen HVM
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:15 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 28 pages/cpu @ffff88007a800000 s84352 r8192 d22144 u131072
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 517000
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: initrd=initramf.gz console=ttyS0,115200 test=net nofb earlyprintk=serial,ttyS0,115200 BOOT_IMAGE=vmlinuz
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] __ex_table already sorted, skipping sort
[ 0.000000] Checking aperture...
[ 0.000000] No AGP bridge found
[ 0.000000] Memory: 1967336k/2097152k available (6368k kernel code, 456k absent, 129360k reserved, 4525k data, 752k init)
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=15.
[ 0.000000] NR_IRQS:33024 nr_irqs:1208 16
[ 0.000000] Xen HVM callback vector for event delivery is enabled
[ 0.000000] Console: colour VGA+ 80x25
[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 0.000000] IP: [<ffffffff813ab3be>] xen_cons_init+0x1e/0x60
[ 0.000000] PGD 0
[ 0.000000] Oops: 0000 [#1] SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU 0
[ 0.000000] Pid: 0, comm: swapper/0 Not tainted 3.6.0upstream-04121-g0313983 #1 Xen HVM domU
[ 0.000000] RIP: 0010:[<ffffffff813ab3be>] [<ffffffff813ab3be>] xen_cons_init+0x1e/0x60
[ 0.000000] RSP: 0000:ffffffff81a01ef8 EFLAGS: 00010202
[ 0.000000] RAX: 0000000000000000 RBX: ffffffff81b3be60 RCX: 0000000000000002
[ 0.000000] RDX: ffffffff81a59c40 RSI: ffffffff81a59b01 RDI: ffffffff81ba7e81
[ 0.000000] RBP: ffffffff81a01ef8 R08: 00000000000003fd R09: 0000000000000020
[ 0.000000] R10: 0000000000000000 R11: 000000000000000d R12: ffffffff81b008e0
[ 0.000000] R13: ffffffff81b092e0 R14: 0000000000000000 R15: 0000000000026bf0
[ 0.000000] FS: 0000000000000000(0000) GS:ffff88007a800000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.000000] CR2: 0000000000000030 CR3: 0000000001a0b000 CR4: 00000000000006b0
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.000000] Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
[ 0.000000] Stack:
[ 0.000000] ffffffff81a01f18 ffffffff81aeb9fb ffffffff81b008e0 ffffffffffffffff
[ 0.000000] ffffffff81a01f68 ffffffff81abac39 ffffffff81aba80d 0000000000026bf0
[ 0.000000] ffffffff81a01f58 ffffffff81b092e0 0000000001000000 0000000001c72000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff81aeb9fb>] console_init+0x19/0x2a
[ 0.000000] [<ffffffff81abac39>] start_kernel+0x24a/0x3a3
[ 0.000000] [<ffffffff81aba80d>] ? kernel_init+0x1e8/0x1e8
[ 0.000000] [<ffffffff81aba356>] x86_64_start_reservations+0x131/0x136
[ 0.000000] [<ffffffff81aba45e>] x86_64_start_kernel+0x103/0x112
[ 0.000000] Code: 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 8b 0d 5a 2e 7c 00 55 31 c0 48 89 e5 85 c9 74 37 48 8b 05 51 2e 7c 00 48 c7 c2 40 9c a5 81 <f6> 40 30 02 75 15 83 f9 02 74 27 e8 52 fc ff ff 85 c0 78 15 48
[ 0.000000] RIP [<ffffffff813ab3be>] xen_cons_init+0x1e/0x60
[ 0.000000] RSP <ffffffff81a01ef8>
[ 0.000000] CR2: 0000000000000030
[ 0.000000] ---[ end trace 5cb378039a20e088 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
>
> >
> >
> > > Perhaps turn this into a static inline with a BUG_ON(!xen_start_info) to
> > > make catching these cases easier?
> >
>
On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > xen_start_info is just NULL for them.
> > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > follow:
> > > > >
> > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > >
> > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > a bogus result, specifically in this case they will think they are domU
> > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > callsites?
> > >
> > > That cannot be the case because setting up xen_start_info is the very
> > > first thing that is done, before even calling to C.
> >
> > On PV, yes, but you are trying to fix PVHVM here, no?
> >
> > Otherwise if this is always set before calling into C then what is the
> > purpose of this patch?
>
> to fix this - as PVHVM has it set to NULL and we end up de-referencing
> the xen_start_info and crashing. As so::
>
Right, so returning to my original point: The caller here is calling
xen_initial_domain() *before* start info is setup. This is bogus and is
your actual bug, all this patch does is hide that real issue.
With this "fix" the caller of xen_initial_domain shown in this trace now
gets a rubbish result based on the content of a dummy shared info
instead of the real answer from that actual shared info.
The right fix is to fix the caller to not call xen_initial_domain()
until after the shared info has been setup. Maybe that means moving
shinfo setup earlier, or maybe it means deferring this call until later
in the PVHVM case.
>
> Decompressing Linux... Parsing ELF... done.
> Booting the kernel.
> [ 0.000000] Initializing cgroup subsys cpuset
> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Linux version 3.6.0upstream-04121-g0313983 ([email protected]) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) #1 SMP Tue Oct 2 16:31:21 EDT 2012
> [ 0.000000] Command line: initrd=initramf.gz console=ttyS0,115200 test=net nofb earlyprintk=serial,ttyS0,115200 BOOT_IMAGE=vmlinuz
> [ 0.000000] e820: BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007fffffff] usable
> [ 0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
> [ 0.000000] bootconsole [earlyser0] enabled
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] DMI 2.4 present.
> [ 0.000000] Hypervisor detected: Xen HVM
> [ 0.000000] Xen version 4.1.
> [ 0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
> [ 0.000000] Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
> [ 0.000000] You might have to change the root device
> [ 0.000000] from /dev/hd[a-d] to /dev/xvd[a-d]
> [ 0.000000] in your root= kernel command line option
> [ 0.000000] No AGP bridge found
> [ 0.000000] e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
> [ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
> [ 0.000000] found SMP MP-table at [mem 0x000fbc90-0x000fbc9f] mapped at [ffff8800000fbc90]
> [ 0.000000] init_memory_mapping: [mem 0x00000000-0x7fffffff]
> [ 0.000000] RAMDISK: [mem 0x7abeb000-0x7ffdefff]
> [ 0.000000] ACPI: RSDP 00000000000ea020 00024 (v02 Xen)
> [ 0.000000] ACPI: XSDT 00000000fc00f2b0 00034 (v01 Xen HVM 00000000 HVML 00000000)
> [ 0.000000] ACPI: FACP 00000000fc00f0d0 000F4 (v04 Xen HVM 00000000 HVML 00000000)
> [ 0.000000] ACPI: DSDT 00000000fc003440 0BC09 (v02 Xen HVM 00000000 INTL 20100528)
> [ 0.000000] ACPI: FACS 00000000fc003400 00040
> [ 0.000000] ACPI: APIC 00000000fc00f1d0 000D8 (v02 Xen HVM 00000000 HVML 00000000)
> [ 0.000000] No NUMA configuration found
> [ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000007fffffff]
> [ 0.000000] Initmem setup node 0 [mem 0x00000000-0x7fffffff]
> [ 0.000000] NODE_DATA [mem 0x7fffc000-0x7fffffff]
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x00010000-0x00ffffff]
> [ 0.000000] DMA32 [mem 0x01000000-0xffffffff]
> [ 0.000000] Normal empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x00010000-0x0009dfff]
> [ 0.000000] node 0: [mem 0x00100000-0x7fffffff]
> [ 0.000000] ACPI: PM-Timer IO Port: 0xb008
> [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x04] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x06] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x08] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x0a] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0c] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x0e] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x10] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x12] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x14] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x16] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x18] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x1a] disabled)
> [ 0.000000] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x1c] disabled)
> [ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
> [ 0.000000] IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-47
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 low level)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 low level)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 low level)
> [ 0.000000] Using ACPI (MADT) for SMP configuration information
> [ 0.000000] smpboot: Allowing 15 CPUs, 13 hotplug CPUs
> [ 0.000000] PM: Registered nosave memory: 000000000009e000 - 00000000000a0000
> [ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
> [ 0.000000] PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
> [ 0.000000] e820: [mem 0x80000000-0xfbffffff] available for PCI devices
> [ 0.000000] Booting paravirtualized kernel on Xen HVM
> [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:15 nr_node_ids:1
> [ 0.000000] PERCPU: Embedded 28 pages/cpu @ffff88007a800000 s84352 r8192 d22144 u131072
> [ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 517000
> [ 0.000000] Policy zone: DMA32
> [ 0.000000] Kernel command line: initrd=initramf.gz console=ttyS0,115200 test=net nofb earlyprintk=serial,ttyS0,115200 BOOT_IMAGE=vmlinuz
> [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
> [ 0.000000] __ex_table already sorted, skipping sort
> [ 0.000000] Checking aperture...
> [ 0.000000] No AGP bridge found
> [ 0.000000] Memory: 1967336k/2097152k available (6368k kernel code, 456k absent, 129360k reserved, 4525k data, 752k init)
> [ 0.000000] Hierarchical RCU implementation.
> [ 0.000000] RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=15.
> [ 0.000000] NR_IRQS:33024 nr_irqs:1208 16
> [ 0.000000] Xen HVM callback vector for event delivery is enabled
> [ 0.000000] Console: colour VGA+ 80x25
> [ 0.000000] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
> [ 0.000000] IP: [<ffffffff813ab3be>] xen_cons_init+0x1e/0x60
> [ 0.000000] PGD 0
> [ 0.000000] Oops: 0000 [#1] SMP
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU 0
> [ 0.000000] Pid: 0, comm: swapper/0 Not tainted 3.6.0upstream-04121-g0313983 #1 Xen HVM domU
> [ 0.000000] RIP: 0010:[<ffffffff813ab3be>] [<ffffffff813ab3be>] xen_cons_init+0x1e/0x60
> [ 0.000000] RSP: 0000:ffffffff81a01ef8 EFLAGS: 00010202
> [ 0.000000] RAX: 0000000000000000 RBX: ffffffff81b3be60 RCX: 0000000000000002
> [ 0.000000] RDX: ffffffff81a59c40 RSI: ffffffff81a59b01 RDI: ffffffff81ba7e81
> [ 0.000000] RBP: ffffffff81a01ef8 R08: 00000000000003fd R09: 0000000000000020
> [ 0.000000] R10: 0000000000000000 R11: 000000000000000d R12: ffffffff81b008e0
> [ 0.000000] R13: ffffffff81b092e0 R14: 0000000000000000 R15: 0000000000026bf0
> [ 0.000000] FS: 0000000000000000(0000) GS:ffff88007a800000(0000) knlGS:0000000000000000
> [ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 0.000000] CR2: 0000000000000030 CR3: 0000000001a0b000 CR4: 00000000000006b0
> [ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 0.000000] Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> [ 0.000000] Stack:
> [ 0.000000] ffffffff81a01f18 ffffffff81aeb9fb ffffffff81b008e0 ffffffffffffffff
> [ 0.000000] ffffffff81a01f68 ffffffff81abac39 ffffffff81aba80d 0000000000026bf0
> [ 0.000000] ffffffff81a01f58 ffffffff81b092e0 0000000001000000 0000000001c72000
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff81aeb9fb>] console_init+0x19/0x2a
> [ 0.000000] [<ffffffff81abac39>] start_kernel+0x24a/0x3a3
> [ 0.000000] [<ffffffff81aba80d>] ? kernel_init+0x1e8/0x1e8
> [ 0.000000] [<ffffffff81aba356>] x86_64_start_reservations+0x131/0x136
> [ 0.000000] [<ffffffff81aba45e>] x86_64_start_kernel+0x103/0x112
> [ 0.000000] Code: 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 8b 0d 5a 2e 7c 00 55 31 c0 48 89 e5 85 c9 74 37 48 8b 05 51 2e 7c 00 48 c7 c2 40 9c a5 81 <f6> 40 30 02 75 15 83 f9 02 74 27 e8 52 fc ff ff 85 c0 78 15 48
> [ 0.000000] RIP [<ffffffff813ab3be>] xen_cons_init+0x1e/0x60
> [ 0.000000] RSP <ffffffff81a01ef8>
> [ 0.000000] CR2: 0000000000000030
> [ 0.000000] ---[ end trace 5cb378039a20e088 ]---
> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> >
> > >
> > >
> > > > Perhaps turn this into a static inline with a BUG_ON(!xen_start_info) to
> > > > make catching these cases easier?
> > >
> >
On Wed, 3 Oct 2012, Ian Campbell wrote:
> On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > xen_start_info is just NULL for them.
> > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > follow:
> > > > > >
> > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > >
> > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > a bogus result, specifically in this case they will think they are domU
> > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > callsites?
> > > >
> > > > That cannot be the case because setting up xen_start_info is the very
> > > > first thing that is done, before even calling to C.
> > >
> > > On PV, yes, but you are trying to fix PVHVM here, no?
> > >
> > > Otherwise if this is always set before calling into C then what is the
> > > purpose of this patch?
> >
> > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > the xen_start_info and crashing. As so::
> >
>
> Right, so returning to my original point: The caller here is calling
> xen_initial_domain() *before* start info is setup. This is bogus and is
> your actual bug, all this patch does is hide that real issue.
That is because xen_start_info wasn't setup at all for PV on HVM guests.
The real reason is that PV on HVM guests don't have one, but that is
another matter. Until we get rid of all the references to xen_start_info
outside of PV specific code, we should just assume that there is one,
and that is already setup.
One day not too far from now, we might refactor the code to never
reference xen_start_info directly, but I don't think that now is the
time for that. Also consider that this is the same thing we do on ARM.
> With this "fix" the caller of xen_initial_domain shown in this trace now
> gets a rubbish result based on the content of a dummy shared info
> instead of the real answer from that actual shared info.
That is not true. The caller gets a zero result, that is completely
appropriate in this case, given that a PV on HVM guest doesn't have a
start_info.
> The right fix is to fix the caller to not call xen_initial_domain()
> until after the shared info has been setup. Maybe that means moving
> shinfo setup earlier, or maybe it means deferring this call until later
> in the PVHVM case.
I don't think so, we should be able to call xen_initial_domain() at any
point in the code.
The best course of action is taking this fix now (making PVHVM x86
guests behave the same way as ARM guests) and refactor all the callers to
xen_start_info later.
On Wed, 3 Oct 2012, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 03, 2012 at 02:37:53PM +0100, Stefano Stabellini wrote:
> > PV on HVM guests don't have a start_info page mapped by Xen, so
> > xen_start_info is just NULL for them.
> > That is problem because other parts of the code expect xen_start_info to
> > point to something valid, for example xen_initial_domain() is defined as
> > follow:
> >
> > #define xen_initial_domain() (xen_domain() && \
> > xen_start_info->flags & SIF_INITDOMAIN)
> >
>
> .. introduced by commit 4c071ee5268f7234c3d084b6093bebccc28cdcba
> ("arm: initial Xen support)
>
> >
> > Allocate a dummy start_info struct and point xen_start_info to it, as we
> > do on ARM.
> > This is not going to change things for PV guests because
> > xen_start_info is set by arch/x86/xen/xen-head.S:startup_xen.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
>
> Reported-by: Konrad Rzeszutek Wilk <[email protected]>
> >
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index bf788d3..5f242cb 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -96,7 +96,8 @@ EXPORT_SYMBOL(machine_to_phys_mapping);
> > unsigned long machine_to_phys_nr;
> > EXPORT_SYMBOL(machine_to_phys_nr);
> >
> > -struct start_info *xen_start_info;
> > +static struct start_info _xen_start_info;
>
> And lets change that to
> 'xen_dummy_start_info' to keep in sync with the other dummy one.
>
> And also add a commnt:
>
> /*
> * Since 'xen_initial_domain' dereferences the xen_start_info we need
> * a dummy structure filled with zeros (for PVHVM guests which initialize
> * this late). For PV guests we do not have to worry about this as the first
> * few instructions (startup_xen) set it properly.
> */
I agree with all the changes, I'll repost soon
> > +struct start_info *xen_start_info = &_xen_start_info;
> > EXPORT_SYMBOL_GPL(xen_start_info);
> >
> > struct shared_info xen_dummy_shared_info;
>
On Wed, 2012-10-03 at 16:48 +0100, Stefano Stabellini wrote:
> On Wed, 3 Oct 2012, Ian Campbell wrote:
> > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > > xen_start_info is just NULL for them.
> > > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > > follow:
> > > > > > >
> > > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > > >
> > > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > > a bogus result, specifically in this case they will think they are domU
> > > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > > callsites?
> > > > >
> > > > > That cannot be the case because setting up xen_start_info is the very
> > > > > first thing that is done, before even calling to C.
> > > >
> > > > On PV, yes, but you are trying to fix PVHVM here, no?
> > > >
> > > > Otherwise if this is always set before calling into C then what is the
> > > > purpose of this patch?
> > >
> > > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > > the xen_start_info and crashing. As so::
> > >
> >
> > Right, so returning to my original point: The caller here is calling
> > xen_initial_domain() *before* start info is setup. This is bogus and is
> > your actual bug, all this patch does is hide that real issue.
>
> That is because xen_start_info wasn't setup at all for PV on HVM guests.
>
> The real reason is that PV on HVM guests don't have one, but that is
> another matter. Until we get rid of all the references to xen_start_info
> outside of PV specific code, we should just assume that there is one,
> and that is already setup.
>
> One day not too far from now, we might refactor the code to never
> reference xen_start_info directly, but I don't think that now is the
> time for that. Also consider that this is the same thing we do on ARM.
We actual fill in the dummy start info with valid information on ARM
though, we don't just leave it full of zeroes.
If we do start out with start_info pointing to an uninitialised
start_info on ARM too then I would argue that this is also a mistake. We
should leave the NULL pointer in place until we setup the content of the
dummy start info -- exactly because the resulting crash indicates to us
that someone has accessed the si before we've initialised it.
> > With this "fix" the caller of xen_initial_domain shown in this trace now
> > gets a rubbish result based on the content of a dummy shared info
> > instead of the real answer from that actual shared info.
>
> That is not true. The caller gets a zero result, that is completely
> appropriate in this case, given that a PV on HVM guest doesn't have a
> start_info.
It's just a side effect of Linux zeroing its bss though and zero
happening to be the right answer for a PVHVM guest in this case.
Is it true that zero is an appropriate result for all uses of fields in
start_info on PVHVM?
> > The right fix is to fix the caller to not call xen_initial_domain()
> > until after the shared info has been setup. Maybe that means moving
> > shinfo setup earlier, or maybe it means deferring this call until later
> > in the PVHVM case.
>
> I don't think so, we should be able to call xen_initial_domain() at any
> point in the code.
>
> The best course of action is taking this fix now (making PVHVM x86
> guests behave the same way as ARM guests) and refactor all the callers to
> xen_start_info later.
On Wed, 3 Oct 2012, Ian Campbell wrote:
> On Wed, 2012-10-03 at 16:48 +0100, Stefano Stabellini wrote:
> > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > > > xen_start_info is just NULL for them.
> > > > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > > > follow:
> > > > > > > >
> > > > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > > > >
> > > > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > > > a bogus result, specifically in this case they will think they are domU
> > > > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > > > callsites?
> > > > > >
> > > > > > That cannot be the case because setting up xen_start_info is the very
> > > > > > first thing that is done, before even calling to C.
> > > > >
> > > > > On PV, yes, but you are trying to fix PVHVM here, no?
> > > > >
> > > > > Otherwise if this is always set before calling into C then what is the
> > > > > purpose of this patch?
> > > >
> > > > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > > > the xen_start_info and crashing. As so::
> > > >
> > >
> > > Right, so returning to my original point: The caller here is calling
> > > xen_initial_domain() *before* start info is setup. This is bogus and is
> > > your actual bug, all this patch does is hide that real issue.
> >
> > That is because xen_start_info wasn't setup at all for PV on HVM guests.
> >
> > The real reason is that PV on HVM guests don't have one, but that is
> > another matter. Until we get rid of all the references to xen_start_info
> > outside of PV specific code, we should just assume that there is one,
> > and that is already setup.
> >
> > One day not too far from now, we might refactor the code to never
> > reference xen_start_info directly, but I don't think that now is the
> > time for that. Also consider that this is the same thing we do on ARM.
>
> We actual fill in the dummy start info with valid information on ARM
> though, we don't just leave it full of zeroes.
>
> If we do start out with start_info pointing to an uninitialised
> start_info on ARM too then I would argue that this is also a mistake.
Yes, we do point xen_start_info to an uninitialised start_info on ARM
too (I don't think is a mistake). Then when and if we have more
information we write them to start_info.
> We
> should leave the NULL pointer in place until we setup the content of the
> dummy start info -- exactly because the resulting crash indicates to us
> that someone has accessed the si before we've initialised it.
I don't think so. It is initialized to zero, that is the right thing to
do.
> > > With this "fix" the caller of xen_initial_domain shown in this trace now
> > > gets a rubbish result based on the content of a dummy shared info
> > > instead of the real answer from that actual shared info.
> >
> > That is not true. The caller gets a zero result, that is completely
> > appropriate in this case, given that a PV on HVM guest doesn't have a
> > start_info.
>
> It's just a side effect of Linux zeroing its bss though and zero
> happening to be the right answer for a PVHVM guest in this case.
well, I would call that "by design" ;-)
> Is it true that zero is an appropriate result for all uses of fields in
> start_info on PVHVM?
I think so. In fact, if we wanted to, we could have the dummy struct
initialized to something different, but I don't think that we should.
On Wed, 2012-10-03 at 17:05 +0100, Stefano Stabellini wrote:
> On Wed, 3 Oct 2012, Ian Campbell wrote:
> > On Wed, 2012-10-03 at 16:48 +0100, Stefano Stabellini wrote:
> > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > > > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > > > > xen_start_info is just NULL for them.
> > > > > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > > > > follow:
> > > > > > > > >
> > > > > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > > > > >
> > > > > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > > > > a bogus result, specifically in this case they will think they are domU
> > > > > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > > > > callsites?
> > > > > > >
> > > > > > > That cannot be the case because setting up xen_start_info is the very
> > > > > > > first thing that is done, before even calling to C.
> > > > > >
> > > > > > On PV, yes, but you are trying to fix PVHVM here, no?
> > > > > >
> > > > > > Otherwise if this is always set before calling into C then what is the
> > > > > > purpose of this patch?
> > > > >
> > > > > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > > > > the xen_start_info and crashing. As so::
> > > > >
> > > >
> > > > Right, so returning to my original point: The caller here is calling
> > > > xen_initial_domain() *before* start info is setup. This is bogus and is
> > > > your actual bug, all this patch does is hide that real issue.
> > >
> > > That is because xen_start_info wasn't setup at all for PV on HVM guests.
> > >
> > > The real reason is that PV on HVM guests don't have one, but that is
> > > another matter. Until we get rid of all the references to xen_start_info
> > > outside of PV specific code, we should just assume that there is one,
> > > and that is already setup.
> > >
> > > One day not too far from now, we might refactor the code to never
> > > reference xen_start_info directly, but I don't think that now is the
> > > time for that. Also consider that this is the same thing we do on ARM.
> >
> > We actual fill in the dummy start info with valid information on ARM
> > though, we don't just leave it full of zeroes.
> >
> > If we do start out with start_info pointing to an uninitialised
> > start_info on ARM too then I would argue that this is also a mistake.
>
> Yes, we do point xen_start_info to an uninitialised start_info on ARM
> too (I don't think is a mistake). Then when and if we have more
> information we write them to start_info.
So callers of xen_initial_domain in dom0 before xen_guest_init is called
get the wrong result?
That sounds like a mistake to me.
> > We
> > should leave the NULL pointer in place until we setup the content of the
> > dummy start info -- exactly because the resulting crash indicates to us
> > that someone has accessed the si before we've initialised it.
>
> I don't think so. It is initialized to zero, that is the right thing to
> do.
Except it isn't in the dom0 case...
> > > > With this "fix" the caller of xen_initial_domain shown in this trace now
> > > > gets a rubbish result based on the content of a dummy shared info
> > > > instead of the real answer from that actual shared info.
> > >
> > > That is not true. The caller gets a zero result, that is completely
> > > appropriate in this case, given that a PV on HVM guest doesn't have a
> > > start_info.
> >
> > It's just a side effect of Linux zeroing its bss though and zero
> > happening to be the right answer for a PVHVM guest in this case.
>
> well, I would call that "by design" ;-)
Well, in that case it should be documented not just implicit!
> > Is it true that zero is an appropriate result for all uses of fields in
> > start_info on PVHVM?
>
> I think so. In fact, if we wanted to, we could have the dummy struct
> initialized to something different, but I don't think that we should.
On Wed, 2012-10-03 at 17:13 +0100, Ian Campbell wrote:
> On Wed, 2012-10-03 at 17:05 +0100, Stefano Stabellini wrote:
> > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > On Wed, 2012-10-03 at 16:48 +0100, Stefano Stabellini wrote:
> > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > > > > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > > > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > > > > > xen_start_info is just NULL for them.
> > > > > > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > > > > > follow:
> > > > > > > > > >
> > > > > > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > > > > > >
> > > > > > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > > > > > a bogus result, specifically in this case they will think they are domU
> > > > > > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > > > > > callsites?
> > > > > > > >
> > > > > > > > That cannot be the case because setting up xen_start_info is the very
> > > > > > > > first thing that is done, before even calling to C.
> > > > > > >
> > > > > > > On PV, yes, but you are trying to fix PVHVM here, no?
> > > > > > >
> > > > > > > Otherwise if this is always set before calling into C then what is the
> > > > > > > purpose of this patch?
> > > > > >
> > > > > > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > > > > > the xen_start_info and crashing. As so::
> > > > > >
> > > > >
> > > > > Right, so returning to my original point: The caller here is calling
> > > > > xen_initial_domain() *before* start info is setup. This is bogus and is
> > > > > your actual bug, all this patch does is hide that real issue.
> > > >
> > > > That is because xen_start_info wasn't setup at all for PV on HVM guests.
> > > >
> > > > The real reason is that PV on HVM guests don't have one, but that is
> > > > another matter. Until we get rid of all the references to xen_start_info
> > > > outside of PV specific code, we should just assume that there is one,
> > > > and that is already setup.
> > > >
> > > > One day not too far from now, we might refactor the code to never
> > > > reference xen_start_info directly, but I don't think that now is the
> > > > time for that. Also consider that this is the same thing we do on ARM.
> > >
> > > We actual fill in the dummy start info with valid information on ARM
> > > though, we don't just leave it full of zeroes.
> > >
> > > If we do start out with start_info pointing to an uninitialised
> > > start_info on ARM too then I would argue that this is also a mistake.
> >
> > Yes, we do point xen_start_info to an uninitialised start_info on ARM
> > too (I don't think is a mistake). Then when and if we have more
> > information we write them to start_info.
>
> So callers of xen_initial_domain in dom0 before xen_guest_init is called
> get the wrong result?
>
> That sounds like a mistake to me.
How about (modulo my not having looked up the actual names of the
constants etc):
#define xen_initial_domain() (xen_domain() && arch_is_initial_domain())
on x86:
int arch_is_initial_domain(void)
{
/* The initial domain is always PV and
* therefore start_info is always set for it.
*/
return start_info && start_info->flags & SIF_INITDOMAIN;
}
on ARM:
int arch_is_initial_domain(void)
{
static is_initial = -1;
if (is_initial == -1)
is_initial = HVM_param_get(HVMPARAM_DOM0)
return is_initial;
}
On Wed, 3 Oct 2012, Ian Campbell wrote:
> On Wed, 2012-10-03 at 17:13 +0100, Ian Campbell wrote:
> > On Wed, 2012-10-03 at 17:05 +0100, Stefano Stabellini wrote:
> > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > On Wed, 2012-10-03 at 16:48 +0100, Stefano Stabellini wrote:
> > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > > > > > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > > > > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > > > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > > > > > > xen_start_info is just NULL for them.
> > > > > > > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > > > > > > follow:
> > > > > > > > > > >
> > > > > > > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > > > > > > >
> > > > > > > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > > > > > > a bogus result, specifically in this case they will think they are domU
> > > > > > > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > > > > > > callsites?
> > > > > > > > >
> > > > > > > > > That cannot be the case because setting up xen_start_info is the very
> > > > > > > > > first thing that is done, before even calling to C.
> > > > > > > >
> > > > > > > > On PV, yes, but you are trying to fix PVHVM here, no?
> > > > > > > >
> > > > > > > > Otherwise if this is always set before calling into C then what is the
> > > > > > > > purpose of this patch?
> > > > > > >
> > > > > > > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > > > > > > the xen_start_info and crashing. As so::
> > > > > > >
> > > > > >
> > > > > > Right, so returning to my original point: The caller here is calling
> > > > > > xen_initial_domain() *before* start info is setup. This is bogus and is
> > > > > > your actual bug, all this patch does is hide that real issue.
> > > > >
> > > > > That is because xen_start_info wasn't setup at all for PV on HVM guests.
> > > > >
> > > > > The real reason is that PV on HVM guests don't have one, but that is
> > > > > another matter. Until we get rid of all the references to xen_start_info
> > > > > outside of PV specific code, we should just assume that there is one,
> > > > > and that is already setup.
> > > > >
> > > > > One day not too far from now, we might refactor the code to never
> > > > > reference xen_start_info directly, but I don't think that now is the
> > > > > time for that. Also consider that this is the same thing we do on ARM.
> > > >
> > > > We actual fill in the dummy start info with valid information on ARM
> > > > though, we don't just leave it full of zeroes.
> > > >
> > > > If we do start out with start_info pointing to an uninitialised
> > > > start_info on ARM too then I would argue that this is also a mistake.
> > >
> > > Yes, we do point xen_start_info to an uninitialised start_info on ARM
> > > too (I don't think is a mistake). Then when and if we have more
> > > information we write them to start_info.
> >
> > So callers of xen_initial_domain in dom0 before xen_guest_init is called
> > get the wrong result?
> >
> > That sounds like a mistake to me.
>
> How about (modulo my not having looked up the actual names of the
> constants etc):
>
> #define xen_initial_domain() (xen_domain() && arch_is_initial_domain())
>
> on x86:
> int arch_is_initial_domain(void)
> {
> /* The initial domain is always PV and
> * therefore start_info is always set for it.
> */
> return start_info && start_info->flags & SIF_INITDOMAIN;
> }
> on ARM:
> int arch_is_initial_domain(void)
> {
> static is_initial = -1;
> if (is_initial == -1)
> is_initial = HVM_param_get(HVMPARAM_DOM0)
> return is_initial;
> }
What about the appended patch as a fix for the moment, then we can do
the refactoring that you are suggesting, that is very similar to what I
had in mind when I said that we shouldn't access xen_start_info
directly.
---
xen/xen_initial_domain: check that xen_start_info is initialized
Since commit commit 4c071ee5268f7234c3d084b6093bebccc28cdcba ("arm:
initial Xen support") PV on HVM guests can be xen_initial_domain.
However PV on HVM guests might have an unitialized xen_start_info, so
check before accessing its fields.
Signed-off-by: Stefano Stabellini <[email protected]>
Reported-by: Konrad Rzeszutek Wilk <[email protected]>
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 9a39ca5..e7101bb 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -28,7 +28,7 @@ extern enum xen_domain_type xen_domain_type;
#include <asm/xen/hypervisor.h>
#define xen_initial_domain() (xen_domain() && \
- xen_start_info->flags & SIF_INITDOMAIN)
+ xen_start_info && xen_start_info->flags & SIF_INITDOMAIN)
#else /* !CONFIG_XEN_DOM0 */
#define xen_initial_domain() (0)
#endif /* CONFIG_XEN_DOM0 */
Since commit commit 4c071ee5268f7234c3d084b6093bebccc28cdcba ("arm:
initial Xen support") PV on HVM guests can be xen_initial_domain.
However PV on HVM guests might have an unitialized xen_start_info, so
check before accessing its fields.
Signed-off-by: Stefano Stabellini <[email protected]>
Acked-by: Ian Campbell <[email protected]>
Reported-by: Konrad Rzeszutek Wilk <[email protected]>
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 9a39ca5..e7101bb 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -28,7 +28,7 @@ extern enum xen_domain_type xen_domain_type;
#include <asm/xen/hypervisor.h>
#define xen_initial_domain() (xen_domain() && \
- xen_start_info->flags & SIF_INITDOMAIN)
+ xen_start_info && xen_start_info->flags & SIF_INITDOMAIN)
#else /* !CONFIG_XEN_DOM0 */
#define xen_initial_domain() (0)
#endif /* CONFIG_XEN_DOM0 */
>>> On 03.10.12 at 19:08, Stefano Stabellini <[email protected]>
wrote:
> Since commit commit 4c071ee5268f7234c3d084b6093bebccc28cdcba ("arm:
> initial Xen support") PV on HVM guests can be xen_initial_domain.
> However PV on HVM guests might have an unitialized xen_start_info, so
> check before accessing its fields.
>
> Signed-off-by: Stefano Stabellini <[email protected]>
> Acked-by: Ian Campbell <[email protected]>
> Reported-by: Konrad Rzeszutek Wilk <[email protected]>
>
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index 9a39ca5..e7101bb 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -28,7 +28,7 @@ extern enum xen_domain_type xen_domain_type;
> #include <asm/xen/hypervisor.h>
>
> #define xen_initial_domain() (xen_domain() && \
> - xen_start_info->flags & SIF_INITDOMAIN)
> + xen_start_info && xen_start_info->flags & SIF_INITDOMAIN)
> #else /* !CONFIG_XEN_DOM0 */
> #define xen_initial_domain() (0)
> #endif /* CONFIG_XEN_DOM0 */
Didn't your other patch statically initialize it?
Jan
On Thu, 4 Oct 2012, Jan Beulich wrote:
> >>> On 03.10.12 at 19:08, Stefano Stabellini <[email protected]>
> wrote:
> > Since commit commit 4c071ee5268f7234c3d084b6093bebccc28cdcba ("arm:
> > initial Xen support") PV on HVM guests can be xen_initial_domain.
> > However PV on HVM guests might have an unitialized xen_start_info, so
> > check before accessing its fields.
> >
> > Signed-off-by: Stefano Stabellini <[email protected]>
> > Acked-by: Ian Campbell <[email protected]>
> > Reported-by: Konrad Rzeszutek Wilk <[email protected]>
> >
> > diff --git a/include/xen/xen.h b/include/xen/xen.h
> > index 9a39ca5..e7101bb 100644
> > --- a/include/xen/xen.h
> > +++ b/include/xen/xen.h
> > @@ -28,7 +28,7 @@ extern enum xen_domain_type xen_domain_type;
> > #include <asm/xen/hypervisor.h>
> >
> > #define xen_initial_domain() (xen_domain() && \
> > - xen_start_info->flags & SIF_INITDOMAIN)
> > + xen_start_info && xen_start_info->flags & SIF_INITDOMAIN)
> > #else /* !CONFIG_XEN_DOM0 */
> > #define xen_initial_domain() (0)
> > #endif /* CONFIG_XEN_DOM0 */
>
> Didn't your other patch statically initialize it?
Yes. Even though both patches can safely coexist, I wrote this one as an
alternative solution.