2010-12-01 18:22:17

by Stefano Stabellini

[permalink] [raw]
Subject: [PATCH 0/5] Xen PV on HVM fixes

Hi all,
this patch series is a collection of fixes for Xen PV on HVM.

The first two patches fix two pirq remapping problems with MSIs, they
have been sent to the list few times already for comments.
The three following patches are new and fix save/restore bugs.

Stefano Stabellini (5):
xen: use PHYSDEVOP_get_free_pirq to implement find_unbound_pirq
xen: fix MSI setup and teardown for PV on HVM guests
xen: resume the pv console for hvm guests too
xen: fix save/restore for PV on HVM guests with pirq remapping
xen: unplug the emulated devices at resume time

arch/x86/pci/xen.c | 27 +++++++---
arch/x86/xen/platform-pci-unplug.c | 2 +-
arch/x86/xen/suspend.c | 1 +
arch/x86/xen/xen-ops.h | 2 +-
drivers/xen/events.c | 105 +++++++++++++++++++++++++-----------
drivers/xen/manage.c | 1 +
include/xen/events.h | 7 ++-
include/xen/interface/physdev.h | 10 ++++
8 files changed, 114 insertions(+), 41 deletions(-)

A branch with these fixes based on 2.6.37-rc4 is available here:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git 2.6.37-rc4-pvhvm-fixes

Cheers,

Stefano


2010-12-01 18:23:53

by Stefano Stabellini

[permalink] [raw]
Subject: [PATCH 1/5] xen: use PHYSDEVOP_get_free_pirq to implement find_unbound_pirq

From: Stefano Stabellini <[email protected]>

Use the new hypercall PHYSDEVOP_get_free_pirq to ask Xen to allocate a
pirq. Remove the unsupported PHYSDEVOP_get_nr_pirqs hypercall to get the
amount of pirq available.

Signed-off-by: Stefano Stabellini <[email protected]>
---
drivers/xen/events.c | 45 +++++++++++++++++---------------------
include/xen/interface/physdev.h | 10 ++++++++
2 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 2811bb9..7ab43c3 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -105,7 +105,6 @@ struct irq_info

static struct irq_info *irq_info;
static int *pirq_to_irq;
-static int nr_pirqs;

static int *evtchn_to_irq;
struct cpu_evtchn_s {
@@ -385,12 +384,17 @@ static int get_nr_hw_irqs(void)
return ret;
}

-/* callers of this function should make sure that PHYSDEVOP_get_nr_pirqs
- * succeeded otherwise nr_pirqs won't hold the right value */
-static int find_unbound_pirq(void)
+static int find_unbound_pirq(int type)
{
- int i;
- for (i = nr_pirqs-1; i >= 0; i--) {
+ int rc, i;
+ struct physdev_get_free_pirq op_get_free_pirq;
+ op_get_free_pirq.type = type;
+
+ rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_free_pirq, &op_get_free_pirq);
+ if (!rc)
+ return op_get_free_pirq.pirq;
+
+ for (i = 0; i < nr_irqs; i++) {
if (pirq_to_irq[i] < 0)
return i;
}
@@ -611,10 +615,10 @@ int xen_map_pirq_gsi(unsigned pirq, unsigned gsi, int shareable, char *name)

spin_lock(&irq_mapping_update_lock);

- if ((pirq > nr_pirqs) || (gsi > nr_irqs)) {
+ if ((pirq > nr_irqs) || (gsi > nr_irqs)) {
printk(KERN_WARNING "xen_map_pirq_gsi: %s %s is incorrect!\n",
- pirq > nr_pirqs ? "nr_pirqs" :"",
- gsi > nr_irqs ? "nr_irqs" : "");
+ pirq > nr_irqs ? "pirq" :"",
+ gsi > nr_irqs ? "gsi" : "");
goto out;
}

@@ -672,7 +676,7 @@ void xen_allocate_pirq_msi(char *name, int *irq, int *pirq)
if (*irq == -1)
goto out;

- *pirq = find_unbound_pirq();
+ *pirq = find_unbound_pirq(MAP_PIRQ_TYPE_MSI);
if (*pirq == -1)
goto out;

@@ -1506,26 +1510,17 @@ void xen_callback_vector(void) {}

void __init xen_init_IRQ(void)
{
- int i, rc;
- struct physdev_nr_pirqs op_nr_pirqs;
+ int i;

cpu_evtchn_mask_p = kcalloc(nr_cpu_ids, sizeof(struct cpu_evtchn_s),
GFP_KERNEL);
irq_info = kcalloc(nr_irqs, sizeof(*irq_info), GFP_KERNEL);

- rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_nr_pirqs, &op_nr_pirqs);
- if (rc < 0) {
- nr_pirqs = nr_irqs;
- if (rc != -ENOSYS)
- printk(KERN_WARNING "PHYSDEVOP_get_nr_pirqs returned rc=%d\n", rc);
- } else {
- if (xen_pv_domain() && !xen_initial_domain())
- nr_pirqs = max((int)op_nr_pirqs.nr_pirqs, nr_irqs);
- else
- nr_pirqs = op_nr_pirqs.nr_pirqs;
- }
- pirq_to_irq = kcalloc(nr_pirqs, sizeof(*pirq_to_irq), GFP_KERNEL);
- for (i = 0; i < nr_pirqs; i++)
+ /* We are using nr_irqs as the maximum number of pirq available but
+ * that number is actually chosen by Xen and we don't know exactly
+ * what it is. Be careful choosing high pirq numbers. */
+ pirq_to_irq = kcalloc(nr_irqs, sizeof(*pirq_to_irq), GFP_KERNEL);
+ for (i = 0; i < nr_irqs; i++)
pirq_to_irq[i] = -1;

evtchn_to_irq = kcalloc(NR_EVENT_CHANNELS, sizeof(*evtchn_to_irq),
diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
index 2b2c66c..534cac8 100644
--- a/include/xen/interface/physdev.h
+++ b/include/xen/interface/physdev.h
@@ -188,6 +188,16 @@ struct physdev_nr_pirqs {
uint32_t nr_pirqs;
};

+/* type is MAP_PIRQ_TYPE_GSI or MAP_PIRQ_TYPE_MSI
+ * the hypercall returns a free pirq */
+#define PHYSDEVOP_get_free_pirq 23
+struct physdev_get_free_pirq {
+ /* IN */
+ int type;
+ /* OUT */
+ uint32_t pirq;
+};
+
/*
* Notify that some PIRQ-bound event channels have been unmasked.
* ** This command is obsolete since interface version 0x00030202 and is **
--
1.5.6.5

2010-12-01 18:23:58

by Stefano Stabellini

[permalink] [raw]
Subject: [PATCH 5/5] xen: unplug the emulated devices at resume time

From: Stefano Stabellini <[email protected]>

Early after being resumed we need to unplug again the emulated devices.

Signed-off-by: Stefano Stabellini <[email protected]>
---
arch/x86/xen/platform-pci-unplug.c | 2 +-
arch/x86/xen/suspend.c | 1 +
arch/x86/xen/xen-ops.h | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/platform-pci-unplug.c b/arch/x86/xen/platform-pci-unplug.c
index 0f45638..25c52f9 100644
--- a/arch/x86/xen/platform-pci-unplug.c
+++ b/arch/x86/xen/platform-pci-unplug.c
@@ -68,7 +68,7 @@ static int __init check_platform_magic(void)
return 0;
}

-void __init xen_unplug_emulated_devices(void)
+void xen_unplug_emulated_devices(void)
{
int r;

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 1d789d5..9bbd63a 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -31,6 +31,7 @@ void xen_hvm_post_suspend(int suspend_cancelled)
int cpu;
xen_hvm_init_shared_info();
xen_callback_vector();
+ xen_unplug_emulated_devices();
if (xen_feature(XENFEAT_hvm_safe_pvclock)) {
for_each_online_cpu(cpu) {
xen_setup_runstate_info(cpu);
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 6404474..9d41bf9 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -43,7 +43,7 @@ void xen_vcpu_restore(void);

void xen_callback_vector(void);
void xen_hvm_init_shared_info(void);
-void __init xen_unplug_emulated_devices(void);
+void xen_unplug_emulated_devices(void);

void __init xen_build_dynamic_phys_to_machine(void);

--
1.5.6.5

2010-12-01 18:23:55

by Stefano Stabellini

[permalink] [raw]
Subject: [PATCH 3/5] xen: resume the pv console for hvm guests too

From: Stefano Stabellini <[email protected]>

Signed-off-by: Stefano Stabellini <[email protected]>
---
drivers/xen/manage.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index ef9c7db..db8c4c4 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -49,6 +49,7 @@ static int xen_hvm_suspend(void *data)

if (!*cancelled) {
xen_irq_resume();
+ xen_console_resume();
xen_timer_resume();
}

--
1.5.6.5

2010-12-01 18:24:17

by Stefano Stabellini

[permalink] [raw]
Subject: [PATCH 4/5] xen: fix save/restore for PV on HVM guests with pirq remapping

From: Stefano Stabellini <[email protected]>

Re-map and re-bind all the pirqs at resume time.

Signed-off-by: Stefano Stabellini <[email protected]>
---
drivers/xen/events.c | 38 ++++++++++++++++++++++++++++++++++++++
1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index f78945c..f34288a 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -1293,6 +1293,42 @@ static int retrigger_dynirq(unsigned int irq)
return ret;
}

+static void restore_cpu_pirqs(void)
+{
+ int pirq, rc, irq, gsi;
+ struct physdev_map_pirq map_irq;
+
+ for (pirq = 0; pirq < nr_irqs; pirq++) {
+ irq = pirq_to_irq[pirq];
+ if (irq == -1)
+ continue;
+
+ /* save/restore of PT devices doesn't work, so at this point the
+ * only devices present are GSI based emulated devices */
+ gsi = gsi_from_irq(irq);
+ if (!gsi)
+ continue;
+
+ map_irq.domid = DOMID_SELF;
+ map_irq.type = MAP_PIRQ_TYPE_GSI;
+ map_irq.index = gsi;
+ map_irq.pirq = pirq;
+
+ rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
+ if (rc) {
+ printk(KERN_WARNING "xen map irq failed gsi=%d irq=%d pirq=%d rc=%d\n",
+ gsi, irq, pirq, rc);
+ irq_info[irq] = mk_unbound_info();
+ pirq_to_irq[pirq] = -1;
+ continue;
+ }
+
+ printk(KERN_DEBUG "xen: --> irq=%d, pirq=%d\n", irq, map_irq.pirq);
+
+ startup_pirq(irq);
+ }
+}
+
static void restore_cpu_virqs(unsigned int cpu)
{
struct evtchn_bind_virq bind_virq;
@@ -1436,6 +1472,8 @@ void xen_irq_resume(void)

unmask_evtchn(evtchn);
}
+
+ restore_cpu_pirqs();
}

static struct irq_chip xen_dynamic_chip __read_mostly = {
--
1.5.6.5

2010-12-01 18:24:51

by Stefano Stabellini

[permalink] [raw]
Subject: [PATCH 2/5] xen: fix MSI setup and teardown for PV on HVM guests

From: Stefano Stabellini <[email protected]>

When remapping MSIs into pirqs for PV on HVM guests, qemu is responsible
for doing the actual mapping and unmapping.
We only give qemu the desired pirq number when we ask to do the mapping
the first time, after that we should be reading back the pirq number
from qemu every time we want to re-enable the MSI.

Signed-off-by: Stefano Stabellini <[email protected]>
---
arch/x86/pci/xen.c | 27 ++++++++++++++++++++-------
drivers/xen/events.c | 24 +++++++++++++++++-------
include/xen/events.h | 7 ++++++-
3 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index d7b5109..25cd4a0 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -70,6 +70,9 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, u32 gsi,
struct xen_pci_frontend_ops *xen_pci_frontend;
EXPORT_SYMBOL_GPL(xen_pci_frontend);

+#define XEN_PIRQ_MSI_DATA (MSI_DATA_TRIGGER_EDGE | \
+ MSI_DATA_LEVEL_ASSERT | (3 << 8) | MSI_DATA_VECTOR(0))
+
static void xen_msi_compose_msg(struct pci_dev *pdev, unsigned int pirq,
struct msi_msg *msg)
{
@@ -83,12 +86,7 @@ static void xen_msi_compose_msg(struct pci_dev *pdev, unsigned int pirq,
MSI_ADDR_REDIRECTION_CPU |
MSI_ADDR_DEST_ID(pirq);

- msg->data =
- MSI_DATA_TRIGGER_EDGE |
- MSI_DATA_LEVEL_ASSERT |
- /* delivery mode reserved */
- (3 << 8) |
- MSI_DATA_VECTOR(0);
+ msg->data = XEN_PIRQ_MSI_DATA;
}

static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
@@ -98,8 +96,23 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
struct msi_msg msg;

list_for_each_entry(msidesc, &dev->msi_list, list) {
+ __read_msi_msg(msidesc, &msg);
+ pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
+ ((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
+ if (xen_irq_from_pirq(pirq) >= 0 && msg.data == XEN_PIRQ_MSI_DATA) {
+ xen_allocate_pirq_msi((type == PCI_CAP_ID_MSIX) ?
+ "msi-x" : "msi", &irq, &pirq, XEN_ALLOC_IRQ);
+ if (irq < 0)
+ goto error;
+ ret = set_irq_msi(irq, msidesc);
+ if (ret < 0)
+ goto error_while;
+ printk(KERN_DEBUG "xen: msi already setup: msi --> irq=%d"
+ " pirq=%d\n", irq, pirq);
+ return 0;
+ }
xen_allocate_pirq_msi((type == PCI_CAP_ID_MSIX) ?
- "msi-x" : "msi", &irq, &pirq);
+ "msi-x" : "msi", &irq, &pirq, (XEN_ALLOC_IRQ | XEN_ALLOC_PIRQ));
if (irq < 0 || pirq < 0)
goto error;
printk(KERN_DEBUG "xen: msi --> irq=%d, pirq=%d\n", irq, pirq);
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 7ab43c3..f78945c 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -668,17 +668,21 @@ out:
#include <linux/msi.h>
#include "../pci/msi.h"

-void xen_allocate_pirq_msi(char *name, int *irq, int *pirq)
+void xen_allocate_pirq_msi(char *name, int *irq, int *pirq, int alloc)
{
spin_lock(&irq_mapping_update_lock);

- *irq = find_unbound_irq();
- if (*irq == -1)
- goto out;
+ if (alloc & XEN_ALLOC_IRQ) {
+ *irq = find_unbound_irq();
+ if (*irq == -1)
+ goto out;
+ }

- *pirq = find_unbound_pirq(MAP_PIRQ_TYPE_MSI);
- if (*pirq == -1)
- goto out;
+ if (alloc & XEN_ALLOC_PIRQ) {
+ *pirq = find_unbound_pirq(MAP_PIRQ_TYPE_MSI);
+ if (*pirq == -1)
+ goto out;
+ }

set_irq_chip_and_handler_name(*irq, &xen_pirq_chip,
handle_level_irq, name);
@@ -766,6 +770,7 @@ int xen_destroy_irq(int irq)
printk(KERN_WARNING "unmap irq failed %d\n", rc);
goto out;
}
+ pirq_to_irq[info->u.pirq.pirq] = -1;
}
irq_info[irq] = mk_unbound_info();

@@ -786,6 +791,11 @@ int xen_gsi_from_irq(unsigned irq)
return gsi_from_irq(irq);
}

+int xen_irq_from_pirq(unsigned pirq)
+{
+ return pirq_to_irq[pirq];
+}
+
int bind_evtchn_to_irq(unsigned int evtchn)
{
int irq;
diff --git a/include/xen/events.h b/include/xen/events.h
index 646dd17..00f53dd 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -76,7 +76,9 @@ int xen_map_pirq_gsi(unsigned pirq, unsigned gsi, int shareable, char *name);

#ifdef CONFIG_PCI_MSI
/* Allocate an irq and a pirq to be used with MSIs. */
-void xen_allocate_pirq_msi(char *name, int *irq, int *pirq);
+#define XEN_ALLOC_PIRQ (1 << 0)
+#define XEN_ALLOC_IRQ (1 << 1)
+void xen_allocate_pirq_msi(char *name, int *irq, int *pirq, int alloc_mask);
int xen_create_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int type);
#endif

@@ -89,4 +91,7 @@ int xen_vector_from_irq(unsigned pirq);
/* Return gsi allocated to pirq */
int xen_gsi_from_irq(unsigned pirq);

+/* Return irq from pirq */
+int xen_irq_from_pirq(unsigned pirq);
+
#endif /* _XEN_EVENTS_H */
--
1.5.6.5

2010-12-01 18:47:31

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 0/5] Xen PV on HVM fixes

On 12/01/2010 10:22 AM, Stefano Stabellini wrote:
> Hi all,
> this patch series is a collection of fixes for Xen PV on HVM.
>
> The first two patches fix two pirq remapping problems with MSIs, they
> have been sent to the list few times already for comments.
> The three following patches are new and fix save/restore bugs.

Are you targeting these at 3.6.37-rc? If so, each changelog commit
should explicitly say what bug is being fixed, and what the impact of
the bug is. The last 3 are OK, but it isn't clear to me whether the
first two are strictly bugfixes or not.

J

2010-12-02 14:42:37

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 0/5] Xen PV on HVM fixes

On Wed, 1 Dec 2010, Jeremy Fitzhardinge wrote:
> On 12/01/2010 10:22 AM, Stefano Stabellini wrote:
> > Hi all,
> > this patch series is a collection of fixes for Xen PV on HVM.
> >
> > The first two patches fix two pirq remapping problems with MSIs, they
> > have been sent to the list few times already for comments.
> > The three following patches are new and fix save/restore bugs.
>
> Are you targeting these at 3.6.37-rc?

Yes, because the bugs are pretty serious.


> If so, each changelog commit
> should explicitly say what bug is being fixed, and what the impact of
> the bug is. The last 3 are OK, but it isn't clear to me whether the
> first two are strictly bugfixes or not.
>

The first patch fixes find_unbound_pirq that otherwise would be based on
PHYSDEVOP_get_nr_pirqs that is actually not supported by Xen,
therefore it would fail. So find_unbound_pirq would return a number
starting from nr_irqs that might very well be out of range in Xen.

The symptom of this bug is that when you passthrough an MSI capable pci
device to a PV on HVM guest, Linux would fail to enable MSIs on the
device.


The second patch fixes a bug in xen_hvm_setup_msi_irqs that manifests
itself when trying to enable the same MSI for the second time: qemu is
responsible for remapping MSIs into pirqs and considers the old MSI to
pirq mapping still valid at this point but xen_hvm_setup_msi_irqs
would try to assign a new pirq anyway.

A simple way to reproduce this bug is to assign an MSI capable network
card to a PV on HVM guest, if the user brings down the corresponding
ethernet interface and up again, Linux would fail to enable MSIs on the
device.



I have updated the first two commit messages with these info, the branch
is still the same:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git 2.6.37-rc4-pvhvm-fixes