2013-04-16 20:11:30

by Konrad Rzeszutek Wilk

Subject: [PATCH] fixes for v3.10 in the CPU hotplug patch (v1).

The first three patches fix outstanding issues with v3.9 (and earlier)
kernels where the simple sequence of:

echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online

would embarrassingly not work. As such they also have the [email protected]
on them.

[PATCH 1/9] xen/smp: Fix leakage of timer interrupt line for every
[PATCH 2/9] xen/smp/spinlock: Fix leakage of the spinlock interrupt
[PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ

The rest are just cleanups and some coalescing of the PV and PVHVM
paths.

arch/x86/xen/enlighten.c | 5 ++++-
arch/x86/xen/smp.c | 21 ++++++++++++++-------
arch/x86/xen/spinlock.c | 25 +++++++++++++++++++++++++
arch/x86/xen/time.c | 13 ++++++++++---
drivers/xen/events.c | 20 +++++++++++++++++++-
5 files changed, 72 insertions(+), 12 deletions(-)

Konrad Rzeszutek Wilk (9):
xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.
xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline
xen/time: Fix kasprintf splat when allocating timer%d IRQ line.
xen/events: Check that IRQ value passed in is valid.
xen/time: Add default value of -1 for IRQ and check for that.
xen/spinlock: Check against default value of -1 for IRQ line.
xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM
xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.
xen/smp: Unify some of the PV and PVHVM offline CPU paths


2013-04-16 20:09:20

by Konrad Rzeszutek Wilk

Subject: [PATCH 2/9] xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline

While we don't use the spinlock interrupt line (see commit
f10cd522c5fbfec9ae3cc01967868c9c2401ed23 -
"xen: disable PV spinlocks on HVM" - for details) we should still
do the proper init / deinit sequence. We did not do that correctly:
on CPU init for a PVHVM guest we would allocate a new interrupt
line, but failed to deallocate the old interrupt line.

This resulted in leakage of an irq_desc, but more importantly in this
splat as we online an offlined CPU:

genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
Call Trace:
[<ffffffff811156de>] __setup_irq+0x23e/0x4a0
[<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
[<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff810cad35>] ? update_max_interval+0x15/0x40
[<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
[<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
[<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
[<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff8109402b>] __cpu_notify+0x1b/0x30
[<ffffffff8166834a>] _cpu_up+0xa0/0x14b
[<ffffffff816684ce>] cpu_up+0xd9/0xec
[<ffffffff8165f754>] store_online+0x94/0xd0
[<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
[<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
[<ffffffff811a2864>] vfs_write+0xb4/0x130
[<ffffffff811a302a>] sys_write+0x5a/0xa0
[<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
cpu 1 spinlock event irq -16
smpboot: Booting Node 0 Processor 1 APIC 0x2

And if one looks at /proc/interrupts right after
offlining CPU1:

70: 0 0 xen-percpu-ipi spinlock0
71: 0 0 xen-percpu-ipi spinlock1
77: 0 0 xen-percpu-ipi spinlock2

There is the oddity of 'spinlock1' still being present.

CC: [email protected]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/smp.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index f80e69c..22c800a 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu)
unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
+ xen_uninit_lock_cpu(cpu);
xen_teardown_timer(cpu);
native_cpu_die(cpu);
}
--
1.8.1.4

2013-04-16 20:09:25

by Konrad Rzeszutek Wilk

Subject: [PATCH 7/9] xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM

See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
(xen: disable PV spinlocks on HVM) for details.

But we did not disable it everywhere - which means that when
we boot as PVHVM we end up allocating a per-CPU IRQ line for
spinlocks. This fixes that.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/spinlock.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 47ae032..8b54603 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -367,6 +367,13 @@ void __cpuinit xen_init_lock_cpu(int cpu)
WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
cpu, per_cpu(lock_kicker_irq, cpu));

+ /*
+ * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
+ * (xen: disable PV spinlocks on HVM)
+ */
+ if (xen_hvm_domain())
+ return;
+
name = kasprintf(GFP_KERNEL, "spinlock%d", cpu);
irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR,
cpu,
@@ -385,12 +392,26 @@ void __cpuinit xen_init_lock_cpu(int cpu)

void xen_uninit_lock_cpu(int cpu)
{
+ /*
+ * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
+ * (xen: disable PV spinlocks on HVM)
+ */
+ if (xen_hvm_domain())
+ return;
+
unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL);
per_cpu(lock_kicker_irq, cpu) = -1;
}

void __init xen_init_spinlocks(void)
{
+ /*
+ * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
+ * (xen: disable PV spinlocks on HVM)
+ */
+ if (xen_hvm_domain())
+ return;
+
BUILD_BUG_ON(sizeof(struct xen_spinlock) > sizeof(arch_spinlock_t));

pv_lock_ops.spin_is_locked = xen_spin_is_locked;
--
1.8.1.4

2013-04-16 20:09:39

by Konrad Rzeszutek Wilk

Subject: [PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ line.

When we online the CPU, we get this splat:

smpboot: Booting Node 0 Processor 1 APIC 0x2
installing Xen timer for CPU 1
BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
Call Trace:
[<ffffffff810c1fea>] __might_sleep+0xda/0x100
[<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff813036eb>] kvasprintf+0x5b/0x90
[<ffffffff81303758>] kasprintf+0x38/0x40
[<ffffffff81044510>] xen_setup_timer+0x30/0xb0
[<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
[<ffffffff81666d0a>] start_secondary+0x19c/0x1a8

The solution is to use kasprintf in the CPU hotplug path
that onlines the CPU. That is, do it in xen_hvm_cpu_notify,
and remove the call in xen_hvm_setup_cpu_clockevents.

Unfortunately removing the latter outright is not a good idea, as the
bootup path does not use xen_hvm_cpu_notify, so we would end up never
allocating timer%d interrupt lines when booting. As such, add the
check for atomic() to continue.

CC: [email protected]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/enlighten.c | 5 ++++-
arch/x86/xen/time.c | 6 +++++-
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 47d3243..ddbd54a 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1641,8 +1641,11 @@ static int __cpuinit xen_hvm_cpu_notify(struct notifier_block *self,
switch (action) {
case CPU_UP_PREPARE:
xen_vcpu_setup(cpu);
- if (xen_have_vector_callback)
+ if (xen_have_vector_callback) {
xen_init_lock_cpu(cpu);
+ if (xen_feature(XENFEAT_hvm_safe_pvclock))
+ xen_setup_timer(cpu);
+ }
break;
default:
break;
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 0296a95..054cc01 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -497,7 +497,11 @@ static void xen_hvm_setup_cpu_clockevents(void)
{
int cpu = smp_processor_id();
xen_setup_runstate_info(cpu);
- xen_setup_timer(cpu);
+ /*
+ * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence
+ * doing it xen_hvm_cpu_notify (which gets called by smp_init during
+ * early bootup and also during CPU hotplug events).
+ */
xen_setup_cpu_clockevents();
}

--
1.8.1.4

2013-04-16 20:09:38

by Konrad Rzeszutek Wilk

Subject: [PATCH 5/9] xen/time: Add default value of -1 for IRQ and check for that.

If the timer interrupt has been de-initialized, or is just now being
initialized, the interrupt line should hold the default value of -1.
Check for that, and WARN us if something is odd.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/time.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 054cc01..3d88bfd 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -377,7 +377,7 @@ static const struct clock_event_device xen_vcpuop_clockevent = {

static const struct clock_event_device *xen_clockevent =
&xen_timerop_clockevent;
-static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events);
+static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events) = { .irq = -1 };

static irqreturn_t xen_timer_interrupt(int irq, void *dev_id)
{
@@ -401,6 +401,9 @@ void xen_setup_timer(int cpu)
struct clock_event_device *evt;
int irq;

+ evt = &per_cpu(xen_clock_events, cpu);
+ WARN(evt->irq >= 0, "IRQ%d for CPU%d is already allocated\n", evt->irq, cpu);
+
printk(KERN_INFO "installing Xen timer for CPU %d\n", cpu);

name = kasprintf(GFP_KERNEL, "timer%d", cpu);
@@ -413,7 +416,6 @@ void xen_setup_timer(int cpu)
IRQF_FORCE_RESUME,
name, NULL);

- evt = &per_cpu(xen_clock_events, cpu);
memcpy(evt, xen_clockevent, sizeof(*evt));

evt->cpumask = cpumask_of(cpu);
@@ -426,6 +428,7 @@ void xen_teardown_timer(int cpu)
BUG_ON(cpu == 0);
evt = &per_cpu(xen_clock_events, cpu);
unbind_from_irqhandler(evt->irq, NULL);
+ evt->irq = -1;
}

void xen_setup_cpu_clockevents(void)
--
1.8.1.4

2013-04-16 20:10:22

by Konrad Rzeszutek Wilk

Subject: [PATCH 9/9] xen/smp: Unify some of the PV and PVHVM offline CPU paths

The "xen_cpu_die" and "xen_hvm_cpu_die" functions are very similar.
Let's coalesce them.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/smp.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 415694c..0d466d7 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -428,7 +428,7 @@ static int xen_cpu_disable(void)

static void xen_cpu_die(unsigned int cpu)
{
- while (HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL)) {
+ while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL)) {
current->state = TASK_UNINTERRUPTIBLE;
schedule_timeout(HZ/10);
}
@@ -436,7 +436,8 @@ static void xen_cpu_die(unsigned int cpu)
unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
- unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
+ if (!xen_hvm_domain())
+ unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
xen_uninit_lock_cpu(cpu);
xen_teardown_timer(cpu);
}
@@ -667,14 +668,7 @@ static int __cpuinit xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle)

static void xen_hvm_cpu_die(unsigned int cpu)
{
- unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu), NULL);
- unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL);
- unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
- unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
- if (!xen_hvm_domain())
- unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
- xen_uninit_lock_cpu(cpu);
- xen_teardown_timer(cpu);
+ xen_cpu_die(cpu);
native_cpu_die(cpu);
}

--
1.8.1.4

2013-04-16 20:10:20

by Konrad Rzeszutek Wilk

Subject: [PATCH 6/9] xen/spinlock: Check against default value of -1 for IRQ line.

The default (uninitialized) value of the IRQ line is -1.
Check whether we have already allocated a spinlock interrupt line
and somebody is trying to do it again. Also set it back to -1
when we offline the CPU.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/spinlock.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index f7a080e..47ae032 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -364,6 +364,9 @@ void __cpuinit xen_init_lock_cpu(int cpu)
int irq;
const char *name;

+ WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
+ cpu, per_cpu(lock_kicker_irq, cpu));
+
name = kasprintf(GFP_KERNEL, "spinlock%d", cpu);
irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR,
cpu,
@@ -383,6 +386,7 @@ void __cpuinit xen_init_lock_cpu(int cpu)
void xen_uninit_lock_cpu(int cpu)
{
unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL);
+ per_cpu(lock_kicker_irq, cpu) = -1;
}

void __init xen_init_spinlocks(void)
--
1.8.1.4

2013-04-16 20:10:18

by Konrad Rzeszutek Wilk

Subject: [PATCH 8/9] xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.

There is no need to use the PV version of the IRQ_WORKER mechanism,
as under PVHVM we are using the native version, which uses the
SMP API.

They just sit around unused:

69: 0 0 xen-percpu-ipi irqwork0
83: 0 0 xen-percpu-ipi irqwork1

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/xen/smp.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 22c800a..415694c 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu)
goto fail;
per_cpu(xen_callfuncsingle_irq, cpu) = rc;

+ /*
+ * The IRQ worker on PVHVM goes through the native path and uses the
+ * IPI mechanism.
+ */
+ if (xen_hvm_domain())
+ return 0;
+
callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu);
rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR,
cpu,
@@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu)
if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0)
unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu),
NULL);
+ if (xen_hvm_domain())
+ return rc;
+
if (per_cpu(xen_irq_work, cpu) >= 0)
unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);

@@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu)
unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
- unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
+ if (!xen_hvm_domain())
+ unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
xen_uninit_lock_cpu(cpu);
xen_teardown_timer(cpu);
native_cpu_die(cpu);
--
1.8.1.4

2013-04-16 20:09:19

by Konrad Rzeszutek Wilk

Subject: [PATCH 1/9] xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.

In the PVHVM CPU online/offline path we would leak the timer%d IRQ
line every time we do an offline event. The online path
(xen_hvm_setup_cpu_clockevents via x86_cpuinit.setup_percpu_clockev)
would allocate a new interrupt line for timer%d.

But we would still use the old interrupt line leading to:

kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
invalid opcode: 0000 [#1] SMP
RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
.. snip..
<IRQ>
[<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
[<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
[<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
[<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
[<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
[<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
[<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
<EOI>
[<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
[<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8

There is also the oddity (timer1) in /proc/interrupts after
offlining CPU1:

64: 1121 0 xen-percpu-virq timer0
78: 0 0 xen-percpu-virq timer1
84: 0 2483 xen-percpu-virq timer2

This patch fixes it.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
CC: [email protected]
---
arch/x86/xen/smp.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 09ea61d..f80e69c 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu)
unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
+ xen_teardown_timer(cpu);
native_cpu_die(cpu);
}

--
1.8.1.4

2013-04-16 20:11:09

by Konrad Rzeszutek Wilk

Subject: [PATCH 4/9] xen/events: Check that IRQ value passed in is valid.

We naively assume that the IRQ value passed in is correct.
If it is not, then any dereference of the 'info'
structure will result in a crash - so we might as well guard ourselves
and sprinkle copious amounts of WARN_ON.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
drivers/xen/events.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index bb65f75..94daed1 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -515,6 +515,9 @@ static void xen_free_irq(unsigned irq)
{
struct irq_info *info = irq_get_handler_data(irq);

+ if (WARN_ON(!info))
+ return;
+
list_del(&info->list);

irq_set_handler_data(irq, NULL);
@@ -1003,6 +1006,9 @@ static void unbind_from_irq(unsigned int irq)
int evtchn = evtchn_from_irq(irq);
struct irq_info *info = irq_get_handler_data(irq);

+ if (WARN_ON(!info))
+ return;
+
mutex_lock(&irq_mapping_update_lock);

if (info->refcnt > 0) {
@@ -1130,6 +1136,10 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi,

void unbind_from_irqhandler(unsigned int irq, void *dev_id)
{
+ struct irq_info *info = irq_get_handler_data(irq);
+
+ if (WARN_ON(!info))
+ return;
free_irq(irq, dev_id);
unbind_from_irq(irq);
}
@@ -1441,6 +1451,9 @@ void rebind_evtchn_irq(int evtchn, int irq)
{
struct irq_info *info = info_for_irq(irq);

+ if (WARN_ON(!info))
+ return;
+
/* Make sure the irq is masked, since the new event channel
will also be masked. */
disable_irq(irq);
@@ -1714,7 +1727,12 @@ void xen_poll_irq(int irq)
int xen_test_irq_shared(int irq)
{
struct irq_info *info = info_for_irq(irq);
- struct physdev_irq_status_query irq_status = { .irq = info->u.pirq.pirq };
+ struct physdev_irq_status_query irq_status;
+
+ if (WARN_ON(!info))
+ return -ENOENT;
+
+ irq_status.irq = info->u.pirq.pirq;

if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
return 0;
--
1.8.1.4

2013-04-26 16:06:25

by Stefano Stabellini

Subject: Re: [PATCH 2/9] xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> While we don't use the spinlock interrupt line (see for details
> commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 -
> xen: disable PV spinlocks on HVM) - we should still do the proper
> init / deinit sequence. We did not do that correctly and for the
> CPU init for PVHVM guest we would allocate an interrupt line - but
> failed to deallocate the old interrupt line.
>
> This resulted in leakage of an irq_desc but more importantly this splat
> as we online an offlined CPU:
>
> genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
> Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
> Call Trace:
> [<ffffffff811156de>] __setup_irq+0x23e/0x4a0
> [<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
> [<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
> [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
> [<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
> [<ffffffff81303758>] ? kasprintf+0x38/0x40
> [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
> [<ffffffff810cad35>] ? update_max_interval+0x15/0x40
> [<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
> [<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
> [<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
> [<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
> [<ffffffff8109402b>] __cpu_notify+0x1b/0x30
> [<ffffffff8166834a>] _cpu_up+0xa0/0x14b
> [<ffffffff816684ce>] cpu_up+0xd9/0xec
> [<ffffffff8165f754>] store_online+0x94/0xd0
> [<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
> [<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
> [<ffffffff811a2864>] vfs_write+0xb4/0x130
> [<ffffffff811a302a>] sys_write+0x5a/0xa0
> [<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
> cpu 1 spinlock event irq -16
> smpboot: Booting Node 0 Processor 1 APIC 0x2
>
> And if one looks at the /proc/interrupts right after
> offlining (CPU1):
>
> 70: 0 0 xen-percpu-ipi spinlock0
> 71: 0 0 xen-percpu-ipi spinlock1
> 77: 0 0 xen-percpu-ipi spinlock2
>
> There is the oddity of the 'spinlock1' still being present.
>
> CC: [email protected]
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Acked-by: Stefano Stabellini <[email protected]>


> arch/x86/xen/smp.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index f80e69c..22c800a 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu)
> unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
> unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
> unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> + xen_uninit_lock_cpu(cpu);
> xen_teardown_timer(cpu);
> native_cpu_die(cpu);
> }
> --
> 1.8.1.4
>

2013-04-26 16:06:39

by Stefano Stabellini

Subject: Re: [PATCH 1/9] xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> In the PVHVM path when we do CPU online/offline path we would
> leak the timer%d IRQ line everytime we do a offline event. The
> online path (xen_hvm_setup_cpu_clockevents via
> x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt
> line for the timer%d.
>
> But we would still use the old interrupt line leading to:
>
> kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
> invalid opcode: 0000 [#1] SMP
> RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
> .. snip..
> <IRQ>
> [<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
> [<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
> [<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
> [<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
> [<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
> [<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
> [<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
> <EOI>
> [<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
> [<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8
>
> There is also the oddity (timer1) in the /proc/interrupts after
> offlining CPU1:
>
> 64: 1121 0 xen-percpu-virq timer0
> 78: 0 0 xen-percpu-virq timer1
> 84: 0 2483 xen-percpu-virq timer2
>
> This patch fixes it.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> CC: [email protected]

Acked-by: Stefano Stabellini <[email protected]>


> arch/x86/xen/smp.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 09ea61d..f80e69c 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu)
> unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
> unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
> unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> + xen_teardown_timer(cpu);
> native_cpu_die(cpu);
> }
>
> --
> 1.8.1.4
>

2013-04-26 16:11:43

by Stefano Stabellini

Subject: Re: [PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ line.

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> When we online the CPU, we get this splat:
>
> smpboot: Booting Node 0 Processor 1 APIC 0x2
> installing Xen timer for CPU 1
> BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
> in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
> Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
> Call Trace:
> [<ffffffff810c1fea>] __might_sleep+0xda/0x100
> [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
> [<ffffffff81303758>] ? kasprintf+0x38/0x40
> [<ffffffff813036eb>] kvasprintf+0x5b/0x90
> [<ffffffff81303758>] kasprintf+0x38/0x40
> [<ffffffff81044510>] xen_setup_timer+0x30/0xb0
> [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
> [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
>
> The solution to that is use kasprintf in the CPU hotplug path
> that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
> and remove the call to in xen_hvm_setup_cpu_clockevents.
>
> Unfortunatly the later is not a good idea as the bootup path
> does not use xen_hvm_cpu_notify so we would end up never allocating
> timer%d interrupt lines when booting. As such add the check for
> atomic() to continue.

This last is not reflected in the code.

Also, is it actually OK to move xen_setup_timer out of
xen_hvm_setup_cpu_clockevents?

xen_setup_cpu_clockevents registers xen_clock_events as clocksource and
xen_clock_events is setup by xen_setup_timer so we need to make sure
that the call order remains the same.


> CC: [email protected]
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> arch/x86/xen/enlighten.c | 5 ++++-
> arch/x86/xen/time.c | 6 +++++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 47d3243..ddbd54a 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1641,8 +1641,11 @@ static int __cpuinit xen_hvm_cpu_notify(struct notifier_block *self,
> switch (action) {
> case CPU_UP_PREPARE:
> xen_vcpu_setup(cpu);
> - if (xen_have_vector_callback)
> + if (xen_have_vector_callback) {
> xen_init_lock_cpu(cpu);
> + if (xen_feature(XENFEAT_hvm_safe_pvclock))
> + xen_setup_timer(cpu);
> + }
> break;
> default:
> break;
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index 0296a95..054cc01 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> @@ -497,7 +497,11 @@ static void xen_hvm_setup_cpu_clockevents(void)
> {
> int cpu = smp_processor_id();
> xen_setup_runstate_info(cpu);
> - xen_setup_timer(cpu);
> + /*
> + * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence
> + * doing it xen_hvm_cpu_notify (which gets called by smp_init during
> + * early bootup and also during CPU hotplug events).
> + */
> xen_setup_cpu_clockevents();
> }
>
> --
> 1.8.1.4
>

2013-04-26 16:12:38

by Stefano Stabellini

Subject: Re: [PATCH 4/9] xen/events: Check that IRQ value passed in is valid.

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> We naively assume that the IRQ value passed in is correct.
> If it is not, then any dereference operation for the 'info'
> structure will result in crash - so might as well guard ourselves
> and sprinkle copious amounts of WARN_ON.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Acked-by: Stefano Stabellini <[email protected]>


> drivers/xen/events.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index bb65f75..94daed1 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -515,6 +515,9 @@ static void xen_free_irq(unsigned irq)
> {
> struct irq_info *info = irq_get_handler_data(irq);
>
> + if (WARN_ON(!info))
> + return;
> +
> list_del(&info->list);
>
> irq_set_handler_data(irq, NULL);
> @@ -1003,6 +1006,9 @@ static void unbind_from_irq(unsigned int irq)
> int evtchn = evtchn_from_irq(irq);
> struct irq_info *info = irq_get_handler_data(irq);
>
> + if (WARN_ON(!info))
> + return;
> +
> mutex_lock(&irq_mapping_update_lock);
>
> if (info->refcnt > 0) {
> @@ -1130,6 +1136,10 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi,
>
> void unbind_from_irqhandler(unsigned int irq, void *dev_id)
> {
> + struct irq_info *info = irq_get_handler_data(irq);
> +
> + if (WARN_ON(!info))
> + return;
> free_irq(irq, dev_id);
> unbind_from_irq(irq);
> }
> @@ -1441,6 +1451,9 @@ void rebind_evtchn_irq(int evtchn, int irq)
> {
> struct irq_info *info = info_for_irq(irq);
>
> + if (WARN_ON(!info))
> + return;
> +
> /* Make sure the irq is masked, since the new event channel
> will also be masked. */
> disable_irq(irq);
> @@ -1714,7 +1727,12 @@ void xen_poll_irq(int irq)
> int xen_test_irq_shared(int irq)
> {
> struct irq_info *info = info_for_irq(irq);
> - struct physdev_irq_status_query irq_status = { .irq = info->u.pirq.pirq };
> + struct physdev_irq_status_query irq_status;
> +
> + if (WARN_ON(!info))
> + return -ENOENT;
> +
> + irq_status.irq = info->u.pirq.pirq;
>
> if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
> return 0;
> --
> 1.8.1.4
>

2013-04-26 16:15:16

by Stefano Stabellini

Subject: Re: [PATCH 5/9] xen/time: Add default value of -1 for IRQ and check for that.

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> If the timer interrupt has been de-init or is just now being
> initialized, the default value of -1 should be preset as
> interrupt line. Check for that and if something is odd
> WARN us.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Acked-by: Stefano Stabellini <[email protected]>


> arch/x86/xen/time.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index 054cc01..3d88bfd 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> @@ -377,7 +377,7 @@ static const struct clock_event_device xen_vcpuop_clockevent = {
>
> static const struct clock_event_device *xen_clockevent =
> &xen_timerop_clockevent;
> -static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events);
> +static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events) = { .irq = -1 };
>
> static irqreturn_t xen_timer_interrupt(int irq, void *dev_id)
> {
> @@ -401,6 +401,9 @@ void xen_setup_timer(int cpu)
> struct clock_event_device *evt;
> int irq;
>
> + evt = &per_cpu(xen_clock_events, cpu);
> + WARN(evt->irq >= 0, "IRQ%d for CPU%d is already allocated\n", evt->irq, cpu);
> +
> printk(KERN_INFO "installing Xen timer for CPU %d\n", cpu);
>
> name = kasprintf(GFP_KERNEL, "timer%d", cpu);
> @@ -413,7 +416,6 @@ void xen_setup_timer(int cpu)
> IRQF_FORCE_RESUME,
> name, NULL);
>
> - evt = &per_cpu(xen_clock_events, cpu);
> memcpy(evt, xen_clockevent, sizeof(*evt));
>
> evt->cpumask = cpumask_of(cpu);
> @@ -426,6 +428,7 @@ void xen_teardown_timer(int cpu)
> BUG_ON(cpu == 0);
> evt = &per_cpu(xen_clock_events, cpu);
> unbind_from_irqhandler(evt->irq, NULL);
> + evt->irq = -1;
> }
>
> void xen_setup_cpu_clockevents(void)
> --
> 1.8.1.4
>

2013-04-26 16:18:09

by Stefano Stabellini

Subject: Re: [PATCH 6/9] xen/spinlock: Check against default value of -1 for IRQ line.

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> The default (uninitialized) value of the IRQ line is -1.
> Check if we already have allocated an spinlock interrupt line
> and if somebody is trying to do it again. Also set it to -1
> when we offline the CPU.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> arch/x86/xen/spinlock.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> index f7a080e..47ae032 100644
> --- a/arch/x86/xen/spinlock.c
> +++ b/arch/x86/xen/spinlock.c
> @@ -364,6 +364,9 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> int irq;
> const char *name;
>
> + WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
shouldn't this be >= ^


> + cpu, per_cpu(lock_kicker_irq, cpu));
>
> name = kasprintf(GFP_KERNEL, "spinlock%d", cpu);
> irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR,
> cpu,
> @@ -383,6 +386,7 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> void xen_uninit_lock_cpu(int cpu)
> {
> unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL);
> + per_cpu(lock_kicker_irq, cpu) = -1;
> }
>
> void __init xen_init_spinlocks(void)
> --
> 1.8.1.4
>

2013-04-26 16:20:30

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH 7/9] xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> (xen: disable PV spinlocks on HVM) for details.
>
> But we did not disable it everywhere - which means that when
> we boot as PVHVM we end up allocating a per-CPU IRQ line for
> the spinlock. This fixes that.
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Is there any point in calling xen_init_lock_cpu in PVHVM guests?
At that point we might as well remove the call from xen_hvm_cpu_notify.


> arch/x86/xen/spinlock.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> index 47ae032..8b54603 100644
> --- a/arch/x86/xen/spinlock.c
> +++ b/arch/x86/xen/spinlock.c
> @@ -367,6 +367,13 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
> cpu, per_cpu(lock_kicker_irq, cpu));
>
> + /*
> + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> + * (xen: disable PV spinlocks on HVM)
> + */
> + if (xen_hvm_domain())
> + return;
> +
> name = kasprintf(GFP_KERNEL, "spinlock%d", cpu);
> irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR,
> cpu,
> @@ -385,12 +392,26 @@ void __cpuinit xen_init_lock_cpu(int cpu)
>
> void xen_uninit_lock_cpu(int cpu)
> {
> + /*
> + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> + * (xen: disable PV spinlocks on HVM)
> + */
> + if (xen_hvm_domain())
> + return;
> +
> unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL);
> per_cpu(lock_kicker_irq, cpu) = -1;
> }
>
> void __init xen_init_spinlocks(void)
> {
> + /*
> + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> + * (xen: disable PV spinlocks on HVM)
> + */
> + if (xen_hvm_domain())
> + return;
> +
> BUILD_BUG_ON(sizeof(struct xen_spinlock) > sizeof(arch_spinlock_t));
>
> pv_lock_ops.spin_is_locked = xen_spin_is_locked;
> --
> 1.8.1.4
>

2013-04-26 16:27:28

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH 8/9] xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.

On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> There is no need to use the PV version of the IRQ_WORKER mechanism
> as under PVHVM we are using the native version. The native
> version is using the SMP API.
>
> They just sit around unused:
>
> 69: 0 0 xen-percpu-ipi irqwork0
> 83: 0 0 xen-percpu-ipi irqwork1
>
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

Might be worth trying to make it work instead?
Is it just because we don't set the apic->send_IPI_* functions to the
xen specific version on PVHVM?


> arch/x86/xen/smp.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 22c800a..415694c 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu)
> goto fail;
> per_cpu(xen_callfuncsingle_irq, cpu) = rc;
>
> + /*
> + * The IRQ worker on PVHVM goes through the native path and uses the
> + * IPI mechanism.
> + */
> + if (xen_hvm_domain())
> + return 0;
> +
> callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu);
> rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR,
> cpu,
> @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu)
> if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0)
> unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu),
> NULL);
> + if (xen_hvm_domain())
> + return rc;
> +
> if (per_cpu(xen_irq_work, cpu) >= 0)
> unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
>
> @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu)
> unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL);
> unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
> unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
> - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> + if (!xen_hvm_domain())
> + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> xen_uninit_lock_cpu(cpu);
> xen_teardown_timer(cpu);
> native_cpu_die(cpu);
> --
> 1.8.1.4
>

2013-04-29 18:34:16

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 8/9] xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.

On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote:
> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > There is no need to use the PV version of the IRQ_WORKER mechanism
> > as under PVHVM we are using the native version. The native
> > version is using the SMP API.
> >
> > They just sit around unused:
> >
> > 69: 0 0 xen-percpu-ipi irqwork0
> > 83: 0 0 xen-percpu-ipi irqwork1
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
>
> Might be worth trying to make it work instead?
> Is it just because we don't set the apic->send_IPI_* functions to the
> xen specific version on PVHVM?
>

Right. We use the baremetal mechanism to do it. And it works fine.

>
> > arch/x86/xen/smp.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> > index 22c800a..415694c 100644
> > --- a/arch/x86/xen/smp.c
> > +++ b/arch/x86/xen/smp.c
> > @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu)
> > goto fail;
> > per_cpu(xen_callfuncsingle_irq, cpu) = rc;
> >
> > + /*
> > + * The IRQ worker on PVHVM goes through the native path and uses the
> > + * IPI mechanism.
> > + */
> > + if (xen_hvm_domain())
> > + return 0;
> > +
> > callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu);
> > rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR,
> > cpu,
> > @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu)
> > if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0)
> > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu),
> > NULL);
> > + if (xen_hvm_domain())
> > + return rc;
> > +
> > if (per_cpu(xen_irq_work, cpu) >= 0)
> > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> >
> > @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu)
> > unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL);
> > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
> > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
> > - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> > + if (!xen_hvm_domain())
> > + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> > xen_uninit_lock_cpu(cpu);
> > xen_teardown_timer(cpu);
> > native_cpu_die(cpu);
> > --
> > 1.8.1.4
> >

2013-04-29 18:35:02

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 7/9] xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM

On Fri, Apr 26, 2013 at 05:20:23PM +0100, Stefano Stabellini wrote:
> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> > (xen: disable PV spinlocks on HVM) for details.
> >
> > But we did not disable it everywhere - which means that when
> > we boot as PVHVM we end up allocating a per-CPU IRQ line for
> > the spinlock. This fixes that.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
>
> Is there any point in calling xen_init_lock_cpu in PVHVM guests?

I was thinking about it, but I still have hope that I will be able to
take Jeremy's patches for paravirt locking and redo them a bit.

> At that point we might as well remove the call from xen_hvm_cpu_notify.
>
>
> > arch/x86/xen/spinlock.c | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> > index 47ae032..8b54603 100644
> > --- a/arch/x86/xen/spinlock.c
> > +++ b/arch/x86/xen/spinlock.c
> > @@ -367,6 +367,13 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> > WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
> > cpu, per_cpu(lock_kicker_irq, cpu));
> >
> > + /*
> > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> > + * (xen: disable PV spinlocks on HVM)
> > + */
> > + if (xen_hvm_domain())
> > + return;
> > +
> > name = kasprintf(GFP_KERNEL, "spinlock%d", cpu);
> > irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR,
> > cpu,
> > @@ -385,12 +392,26 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> >
> > void xen_uninit_lock_cpu(int cpu)
> > {
> > + /*
> > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> > + * (xen: disable PV spinlocks on HVM)
> > + */
> > + if (xen_hvm_domain())
> > + return;
> > +
> > unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL);
> > per_cpu(lock_kicker_irq, cpu) = -1;
> > }
> >
> > void __init xen_init_spinlocks(void)
> > {
> > + /*
> > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23
> > + * (xen: disable PV spinlocks on HVM)
> > + */
> > + if (xen_hvm_domain())
> > + return;
> > +
> > BUILD_BUG_ON(sizeof(struct xen_spinlock) > sizeof(arch_spinlock_t));
> >
> > pv_lock_ops.spin_is_locked = xen_spin_is_locked;
> > --
> > 1.8.1.4
> >

2013-04-29 18:35:30

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 6/9] xen/spinlock: Check against default value of -1 for IRQ line.

On Fri, Apr 26, 2013 at 05:18:01PM +0100, Stefano Stabellini wrote:
> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > The default (uninitialized) value of the IRQ line is -1.
> > Check if we have already allocated a spinlock interrupt line
> > and if somebody is trying to do it again. Also set it to -1
> > when we offline the CPU.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> > ---
> > arch/x86/xen/spinlock.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> > index f7a080e..47ae032 100644
> > --- a/arch/x86/xen/spinlock.c
> > +++ b/arch/x86/xen/spinlock.c
> > @@ -364,6 +364,9 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> > int irq;
> > const char *name;
> >
> > + WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",
> shouldn't this be >= ^
>

Yes. Thanks for catching.
>
> > + cpu, per_cpu(lock_kicker_irq, cpu));
> >
> > name = kasprintf(GFP_KERNEL, "spinlock%d", cpu);
> > irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR,
> > cpu,
> > @@ -383,6 +386,7 @@ void __cpuinit xen_init_lock_cpu(int cpu)
> > void xen_uninit_lock_cpu(int cpu)
> > {
> > unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL);
> > + per_cpu(lock_kicker_irq, cpu) = -1;
> > }
> >
> > void __init xen_init_spinlocks(void)
> > --
> > 1.8.1.4
> >

2013-04-29 18:36:38

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ line.

On Fri, Apr 26, 2013 at 05:11:35PM +0100, Stefano Stabellini wrote:
> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > When we online the CPU, we get this splat:
> >
> > smpboot: Booting Node 0 Processor 1 APIC 0x2
> > installing Xen timer for CPU 1
> > BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
> > in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
> > Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
> > Call Trace:
> > [<ffffffff810c1fea>] __might_sleep+0xda/0x100
> > [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
> > [<ffffffff81303758>] ? kasprintf+0x38/0x40
> > [<ffffffff813036eb>] kvasprintf+0x5b/0x90
> > [<ffffffff81303758>] kasprintf+0x38/0x40
> > [<ffffffff81044510>] xen_setup_timer+0x30/0xb0
> > [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
> > [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
> >
> > The solution to that is use kasprintf in the CPU hotplug path
> > that 'online's the CPU. That is, do it in xen_hvm_cpu_notify,
> > and remove the call in xen_hvm_setup_cpu_clockevents.
> >
> > Unfortunately, the latter is not a good idea, as the bootup path
> > does not use xen_hvm_cpu_notify, so we would end up never allocating
> > timer%d interrupt lines when booting. As such, add the check for
> > in_atomic() to continue.
>
> This last is not reflected in the code.

I found out that it was not needed.
>
> Also, is it actually OK to move xen_setup_timer out of
> xen_hvm_setup_cpu_clockevents?

Yes. It ends up being called earlier - in the notifier.
>
> xen_setup_cpu_clockevents registers xen_clock_events as clocksource and
> xen_clock_events is setup by xen_setup_timer so we need to make sure
> that the call order remains the same.

The order is still the same.
>
>
> > CC: [email protected]
> > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> > ---
> > arch/x86/xen/enlighten.c | 5 ++++-
> > arch/x86/xen/time.c | 6 +++++-
> > 2 files changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 47d3243..ddbd54a 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -1641,8 +1641,11 @@ static int __cpuinit xen_hvm_cpu_notify(struct notifier_block *self,
> > switch (action) {
> > case CPU_UP_PREPARE:
> > xen_vcpu_setup(cpu);
> > - if (xen_have_vector_callback)
> > + if (xen_have_vector_callback) {
> > xen_init_lock_cpu(cpu);
> > + if (xen_feature(XENFEAT_hvm_safe_pvclock))
> > + xen_setup_timer(cpu);
> > + }
> > break;
> > default:
> > break;
> > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> > index 0296a95..054cc01 100644
> > --- a/arch/x86/xen/time.c
> > +++ b/arch/x86/xen/time.c
> > @@ -497,7 +497,11 @@ static void xen_hvm_setup_cpu_clockevents(void)
> > {
> > int cpu = smp_processor_id();
> > xen_setup_runstate_info(cpu);
> > - xen_setup_timer(cpu);
> > + /*
> > + * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence
> > + * doing it xen_hvm_cpu_notify (which gets called by smp_init during
> > + * early bootup and also during CPU hotplug events).
> > + */
> > xen_setup_cpu_clockevents();
> > }
> >
> > --
> > 1.8.1.4
> >

2013-05-01 13:25:29

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH 8/9] xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.

On Mon, 29 Apr 2013, Konrad Rzeszutek Wilk wrote:
> On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote:
> > On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > > There is no need to use the PV version of the IRQ_WORKER mechanism
> > > as under PVHVM we are using the native version. The native
> > > version is using the SMP API.
> > >
> > > They just sit around unused:
> > >
> > > 69: 0 0 xen-percpu-ipi irqwork0
> > > 83: 0 0 xen-percpu-ipi irqwork1
> > >
> > > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> >
> > Might be worth trying to make it work instead?
> > Is it just because we don't set the apic->send_IPI_* functions to the
> > xen specific version on PVHVM?
> >
>
> Right. We use the baremetal mechanism to do it. And it works fine.

OK, it works fine, but won't it generate many more trap-and-emulate
cycles?


> > > arch/x86/xen/smp.c | 13 ++++++++++++-
> > > 1 file changed, 12 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> > > index 22c800a..415694c 100644
> > > --- a/arch/x86/xen/smp.c
> > > +++ b/arch/x86/xen/smp.c
> > > @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu)
> > > goto fail;
> > > per_cpu(xen_callfuncsingle_irq, cpu) = rc;
> > >
> > > + /*
> > > + * The IRQ worker on PVHVM goes through the native path and uses the
> > > + * IPI mechanism.
> > > + */
> > > + if (xen_hvm_domain())
> > > + return 0;
> > > +
> > > callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu);
> > > rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR,
> > > cpu,
> > > @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu)
> > > if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0)
> > > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu),
> > > NULL);
> > > + if (xen_hvm_domain())
> > > + return rc;
> > > +
> > > if (per_cpu(xen_irq_work, cpu) >= 0)
> > > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> > >
> > > @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu)
> > > unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL);
> > > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL);
> > > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL);
> > > - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> > > + if (!xen_hvm_domain())
> > > + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL);
> > > xen_uninit_lock_cpu(cpu);
> > > xen_teardown_timer(cpu);
> > > native_cpu_die(cpu);
> > > --
> > > 1.8.1.4
> > >
>

2013-05-01 14:58:12

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 8/9] xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.

On Wed, May 01, 2013 at 02:25:16PM +0100, Stefano Stabellini wrote:
> On Mon, 29 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote:
> > > On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > > > There is no need to use the PV version of the IRQ_WORKER mechanism
> > > > as under PVHVM we are using the native version. The native
> > > > version is using the SMP API.
> > > >
> > > > They just sit around unused:
> > > >
> > > > 69: 0 0 xen-percpu-ipi irqwork0
> > > > 83: 0 0 xen-percpu-ipi irqwork1
> > > >
> > > > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> > >
> > > Might be worth trying to make it work instead?
> > > Is it just because we don't set the apic->send_IPI_* functions to the
> > > xen specific version on PVHVM?
> > >
> >
> > Right. We use the baremetal mechanism to do it. And it works fine.
>
> OK, it works fine, but won't it generate many more trap-and-emulate
> cycles?

No idea. We can certainly make use of the PV IPI mechanism for the
IRQ_WORKER-type mechanism, but I would have to play with xentrace to get a
good handle on what is involved (and how the posted-interrupt thing
affects this).

Right now that is something I can't do (buried in bugs).

2013-05-01 15:07:42

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH 8/9] xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.

On Wed, 1 May 2013, Konrad Rzeszutek Wilk wrote:
> On Wed, May 01, 2013 at 02:25:16PM +0100, Stefano Stabellini wrote:
> > On Mon, 29 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote:
> > > > On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:
> > > > > There is no need to use the PV version of the IRQ_WORKER mechanism
> > > > > as under PVHVM we are using the native version. The native
> > > > > version is using the SMP API.
> > > > >
> > > > > They just sit around unused:
> > > > >
> > > > > 69: 0 0 xen-percpu-ipi irqwork0
> > > > > 83: 0 0 xen-percpu-ipi irqwork1
> > > > >
> > > > > Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> > > >
> > > > Might be worth trying to make it work instead?
> > > > Is it just because we don't set the apic->send_IPI_* functions to the
> > > > xen specific version on PVHVM?
> > > >
> > >
> > > Right. We use the baremetal mechanism to do it. And it works fine.
> >
> > OK, it works fine, but won't it generate many more trap-and-emulate
> > cycles?
>
> No idea. We can certainly make use of the PV IPI mechanism for the
> IRQ_WORKER-type mechanism, but I would have to play with xentrace to get a
> good handle on what is involved (and how the posted-interrupt thing
> affects this).
>
> Right now that is something I can't do (buried in bugs).

OK