2024-06-07 07:51:45

by Jiqian Chen

[permalink] [raw]
Subject: [RFC KERNEL PATCH v8 0/2] Support device passthrough when dom0 is PVH on Xen

Hi All,
This is v8 series to support passthrough on Xen when dom0 is PVH.
v7->v8 change:
* patch#1: This is the patch#1 of v6, because it is reverted from the staging branch due to the API changes on Xen side.
Add pci_device_state_reset_type_t to distinguish the reset types.
* patch#2: is the patch#1 of v7. Use CONFIG_XEN_ACPI instead of CONFIG_ACPI to wrap codes.
* patch#3: is the patch#2 of v7. In function privcmd_ioctl_gsi_from_dev, return -EINVAL when not confige CONFIG_XEN_ACPI.
use PCI_BUS_NUM PCI_SLOT PCI_FUNC instead of open coding.


Best regards,
Jiqian Chen



v6->v7 change:
* the first patch of v6 was already merged into branch linux_next.
* patch#1: is the patch#2 of v6. move the implementation of function xen_acpi_get_gsi_info to
file drivers/xen/acpi.c, that modification is more convenient for the subsequent
patch to obtain gsi.
* patch#2: is the patch#3 of v6. add a new parameter "gsi" to struct pcistub_device and set
gsi when pcistub initialize device. Then when userspace wants to get gsi by passing
sbdf, we can return that gsi.


v5->v6 change:
* patch#3: change to add a new syscall to translate irq to gsi, instead adding a new gsi sysfs.


v4->v5 changes:
* patch#1: Add Reviewed-by Stefano
* patch#2: Add Reviewed-by Stefano
* patch#3: No changes


v3->v4 changes:
* patch#1: change the comment of PHYSDEVOP_pci_device_state_reset; use a new function
pcistub_reset_device_state to wrap __pci_reset_function_locked and xen_reset_device_state,
and call pcistub_reset_device_state in pci_stub.c
* patch#2: remove map_pirq from xen_pvh_passthrough_gsi


v2->v3 changes:
* patch#1: add condition to limit do xen_reset_device_state for no-pv domain in pcistub_init_device.
* patch#2: Abandoning previous implementations that call unmask_irq. To setup gsi and map pirq for
passthrough device in pcistub_init_device.
* patch#3: Abandoning previous implementations that adds new syscall to get gsi from irq. To add a new
sysfs for gsi, then userspace can get gsi number from sysfs.


Below is the description of v2 cover letter:
This series of patches are the v2 of the implementation of passthrough when dom0 is PVH on Xen.
We sent the v1 to upstream before, but the v1 had so many problems and we got lots of suggestions.
I will introduce all issues that these patches try to fix and the differences between v1 and v2.

Issues we encountered:
1. pci_stub failed to write bar for a passthrough device.
Problem: when we run \u201csudo xl pci-assignable-add <sbdf>\u201d to assign a device, pci_stub will
call \u201cpcistub_init_device() -> pci_restore_state() -> pci_restore_config_space() ->
pci_restore_config_space_range() -> pci_restore_config_dword() -> pci_write_config_dword(), the pci
config write will trigger an io interrupt to bar_write() in the xen, but the bar->enabled was set before,
the write is not allowed now, and then when bar->Qemu config the passthrough device in xen_pt_realize(),
it gets invalid bar values.

Reason: the reason is that we don't tell vPCI that the device has been reset, so the current cached state
in pdev->vpci is all out of date and is different from the real device state.

Solution: to solve this problem, the first patch of kernel(xen/pci: Add xen_reset_device_state
function) and the fist patch of xen(xen/vpci: Clear all vpci status of device) add a new hypercall to
reset the state stored in vPCI when the state of real device has changed.
Thank Roger for the suggestion of this v2, and it is different from v1
(https://lore.kernel.org/xen-devel/[email protected]/), v1 simply allow domU to
write pci bar, it does not comply with the design principles of vPCI.

2. failed to do PHYSDEVOP_map_pirq when dom0 is PVH
Problem: HVM domU will do PHYSDEVOP_map_pirq for a passthrough device by using gsi. See
xen_pt_realize->xc_physdev_map_pirq and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
will call into Xen, but in hvm_physdev_op(), PHYSDEVOP_map_pirq is not allowed.

Reason: In hvm_physdev_op(), the variable "currd" is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it
will fail at has_pirq check.

Solution: I think we may need to allow PHYSDEVOP_map_pirq when "currd" is dom0 (at present dom0 is PVH).
The second patch of xen(x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0) allow PVH dom0 do
PHYSDEVOP_map_pirq. This v2 patch is better than v1, v1 simply remove the has_pirq check
(xen https://lore.kernel.org/xen-devel/[email protected]/).

3. the gsi of a passthrough device doesn't be unmasked
3.1 failed to check the permission of pirq
3.2 the gsi of passthrough device was not registered in PVH dom0

Problem:
3.1 callback function pci_add_dm_done() will be called when qemu config a passthrough device for domU.
This function will call xc_domain_irq_permission()-> pirq_access_permitted() to check if the gsi has
corresponding mappings in dom0. But it didn\u2019t, so failed. See
XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH dom0 and it return irq is 0.
3.2 it's possible for a gsi (iow: vIO-APIC pin) to never get registered on PVH dom0, because the
devices of PVH are using MSI(-X) interrupts. However, the IO-APIC pin must be configured for it to be
able to be mapped into a domU.

Reason: After searching codes, I find "map_pirq" and "register_gsi" will be done in function
vioapic_write_redirent->vioapic_hwdom_map_gsi when the gsi(aka ioapic's pin) is unmasked in PVH dom0.
So the two problems can be concluded to that the gsi of a passthrough device doesn't be unmasked.

Solution: to solve these problems, the second patch of kernel(xen/pvh: Unmask irq for passthrough device
in PVH dom0) call the unmask_irq() when we assign a device to be passthrough. So that passthrough devices
can have the mapping of gsi on PVH dom0 and gsi can be registered. This v2 patch is different from the
v1( kernel https://lore.kernel.org/xen-devel/[email protected]/,
kernel https://lore.kernel.org/xen-devel/[email protected]/ and
xen https://lore.kernel.org/xen-devel/[email protected]/),
v1 performed "map_pirq" and "register_gsi" on all pci devices on PVH dom0, which is unnecessary and may
cause multiple registration.

4. failed to map pirq for gsi
Problem: qemu will call xc_physdev_map_pirq() to map a passthrough device\u2019s gsi to pirq in function
xen_pt_realize(). But failed.

Reason: According to the implement of xc_physdev_map_pirq(), it needs gsi instead of irq, but qemu pass
irq to it and treat irq as gsi, it is got from file /sys/bus/pci/devices/xxxx:xx:xx.x/irq in function
xen_host_pci_device_get(). But actually the gsi number is not equal with irq. On PVH dom0, when it
allocates irq for a gsi in function acpi_register_gsi_ioapic(), allocation is dynamic, and follow the
principle of applying first, distributing first. And if you debug the kernel codes(see
function __irq_alloc_descs), you will find the irq number is allocated from small to large by order, but
the applying gsi number is not, gsi 38 may come before gsi 28, that causes gsi 38 get a smaller irq number
than gsi 28, and then gsi != irq.

Solution: we can record the relation between gsi and irq, then when userspace(qemu) want to use gsi, we
can do a translation. The third patch of kernel(xen/privcmd: Add new syscall to get gsi from irq) records
all the relations in acpi_register_gsi_xen_pvh() when dom0 initialize pci devices, and provide a syscall
for userspace to get the gsi from irq. The third patch of xen(tools: Add new function to get gsi from irq)
add a new function xc_physdev_gsi_from_irq() to call the new syscall added on kernel side.
And then userspace can use that function to get gsi. Then xc_physdev_map_pirq() will success. This v2 patch
is the same as v1( kernel https://lore.kernel.org/xen-devel/[email protected]/ and
xen https://lore.kernel.org/xen-devel/[email protected]/)

About the v2 patch of qemu, just change an included head file, other are similar to the v1 (
qemu https://lore.kernel.org/xen-devel/[email protected]/), just call
xc_physdev_gsi_from_irq() to get gsi from irq.

Jiqian Chen (3):
xen/pci: Add xen_reset_device_function_state
xen/pvh: Setup gsi for passthrough device
xen/privcmd: Add new syscall to get gsi from dev

arch/x86/xen/enlighten_pvh.c | 23 +++++++++
drivers/acpi/pci_irq.c | 2 +-
drivers/xen/acpi.c | 50 +++++++++++++++++++
drivers/xen/pci.c | 25 ++++++++++
drivers/xen/privcmd.c | 28 +++++++++++
drivers/xen/xen-pciback/pci_stub.c | 77 +++++++++++++++++++++++++++---
include/linux/acpi.h | 1 +
include/uapi/xen/privcmd.h | 7 +++
include/xen/acpi.h | 19 ++++++++
include/xen/interface/physdev.h | 7 +++
include/xen/pci.h | 6 +++
11 files changed, 238 insertions(+), 7 deletions(-)

--
2.34.1



2024-06-07 07:52:35

by Jiqian Chen

[permalink] [raw]
Subject: [RFC KERNEL PATCH v8 3/3] xen/privcmd: Add new syscall to get gsi from dev

In PVH dom0, it uses the linux local interrupt mechanism,
when it allocs irq for a gsi, it is dynamic, and follow
the principle of applying first, distributing first. And
the irq number is alloced from small to large, but the
applying gsi number is not, may gsi 38 comes before gsi 28,
it causes the irq number is not equal with the gsi number.
And when passthrough a device, QEMU will use device's gsi
number to do pirq mapping, but the gsi number is got from
file /sys/bus/pci/devices/<sbdf>/irq, irq!= gsi, so it will
fail when mapping.
And in current linux codes, there is no method to get gsi
for userspace.

For above purpose, record gsi of pcistub devices when init
pcistub and add a new syscall into privcmd to let userspace
can get gsi when they have a need.

Signed-off-by: Huang Rui <[email protected]>
Signed-off-by: Jiqian Chen <[email protected]>
---
RFC: it need review and need to wait for previous patch of this series to be merged.
---
drivers/xen/privcmd.c | 28 ++++++++++++++++++++++
drivers/xen/xen-pciback/pci_stub.c | 38 +++++++++++++++++++++++++++---
include/uapi/xen/privcmd.h | 7 ++++++
include/xen/acpi.h | 9 +++++++
4 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 67dfa4778864..5809b3168f25 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -45,6 +45,9 @@
#include <xen/page.h>
#include <xen/xen-ops.h>
#include <xen/balloon.h>
+#ifdef CONFIG_XEN_ACPI
+#include <xen/acpi.h>
+#endif

#include "privcmd.h"

@@ -842,6 +845,27 @@ static long privcmd_ioctl_mmap_resource(struct file *file,
return rc;
}

+static long privcmd_ioctl_gsi_from_dev(struct file *file, void __user *udata)
+{
+#ifdef CONFIG_XEN_ACPI
+ struct privcmd_gsi_from_dev kdata;
+
+ if (copy_from_user(&kdata, udata, sizeof(kdata)))
+ return -EFAULT;
+
+ kdata.gsi = pcistub_get_gsi_from_sbdf(kdata.sbdf);
+ if (kdata.gsi == -1)
+ return -EINVAL;
+
+ if (copy_to_user(udata, &kdata, sizeof(kdata)))
+ return -EFAULT;
+
+ return 0;
+#else
+ return -EINVAL;
+#endif
+}
+
#ifdef CONFIG_XEN_PRIVCMD_EVENTFD
/* Irqfd support */
static struct workqueue_struct *irqfd_cleanup_wq;
@@ -1529,6 +1553,10 @@ static long privcmd_ioctl(struct file *file,
ret = privcmd_ioctl_ioeventfd(file, udata);
break;

+ case IOCTL_PRIVCMD_GSI_FROM_DEV:
+ ret = privcmd_ioctl_gsi_from_dev(file, udata);
+ break;
+
default:
break;
}
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 6b22e45188f5..9d791d7a8098 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -56,6 +56,9 @@ struct pcistub_device {

struct pci_dev *dev;
struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
+#ifdef CONFIG_XEN_ACPI
+ int gsi;
+#endif
};

/* Access to pcistub_devices & seized_devices lists and the initialize_devices
@@ -88,6 +91,9 @@ static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)

kref_init(&psdev->kref);
spin_lock_init(&psdev->lock);
+#ifdef CONFIG_XEN_ACPI
+ psdev->gsi = -1;
+#endif

return psdev;
}
@@ -220,6 +226,25 @@ static struct pci_dev *pcistub_device_get_pci_dev(struct xen_pcibk_device *pdev,
return pci_dev;
}

+#ifdef CONFIG_XEN_ACPI
+int pcistub_get_gsi_from_sbdf(unsigned int sbdf)
+{
+ struct pcistub_device *psdev;
+ int domain = (sbdf >> 16) & 0xffff;
+ int bus = PCI_BUS_NUM(sbdf);
+ int slot = PCI_SLOT(sbdf);
+ int func = PCI_FUNC(sbdf);
+
+ psdev = pcistub_device_find(domain, bus, slot, func);
+
+ if (!psdev)
+ return -1;
+
+ return psdev->gsi;
+}
+EXPORT_SYMBOL_GPL(pcistub_get_gsi_from_sbdf);
+#endif
+
struct pci_dev *pcistub_get_pci_dev_by_slot(struct xen_pcibk_device *pdev,
int domain, int bus,
int slot, int func)
@@ -367,14 +392,20 @@ static int pcistub_match(struct pci_dev *dev)
return found;
}

-static int pcistub_init_device(struct pci_dev *dev)
+static int pcistub_init_device(struct pcistub_device *psdev)
{
struct xen_pcibk_dev_data *dev_data;
+ struct pci_dev *dev;
#ifdef CONFIG_XEN_ACPI
int gsi, trigger, polarity;
#endif
int err = 0;

+ if (!psdev)
+ return -EINVAL;
+
+ dev = psdev->dev;
+
dev_dbg(&dev->dev, "initializing...\n");

/* The PCI backend is not intended to be a module (or to work with
@@ -448,6 +479,7 @@ static int pcistub_init_device(struct pci_dev *dev)
dev_err(&dev->dev, "Fail to get gsi info!\n");
goto config_release;
}
+ psdev->gsi = gsi;

if (xen_initial_domain() && xen_pvh_domain()) {
err = xen_pvh_setup_gsi(gsi, trigger, polarity);
@@ -495,7 +527,7 @@ static int __init pcistub_init_devices_late(void)

spin_unlock_irqrestore(&pcistub_devices_lock, flags);

- err = pcistub_init_device(psdev->dev);
+ err = pcistub_init_device(psdev);
if (err) {
dev_err(&psdev->dev->dev,
"error %d initializing device\n", err);
@@ -565,7 +597,7 @@ static int pcistub_seize(struct pci_dev *dev,
spin_unlock_irqrestore(&pcistub_devices_lock, flags);

/* don't want irqs disabled when calling pcistub_init_device */
- err = pcistub_init_device(psdev->dev);
+ err = pcistub_init_device(psdev);

spin_lock_irqsave(&pcistub_devices_lock, flags);

diff --git a/include/uapi/xen/privcmd.h b/include/uapi/xen/privcmd.h
index 8b8c5d1420fe..220e7670a113 100644
--- a/include/uapi/xen/privcmd.h
+++ b/include/uapi/xen/privcmd.h
@@ -126,6 +126,11 @@ struct privcmd_ioeventfd {
__u8 pad[2];
};

+struct privcmd_gsi_from_dev {
+ __u32 sbdf;
+ int gsi;
+};
+
/*
* @cmd: IOCTL_PRIVCMD_HYPERCALL
* @arg: &privcmd_hypercall_t
@@ -157,5 +162,7 @@ struct privcmd_ioeventfd {
_IOW('P', 8, struct privcmd_irqfd)
#define IOCTL_PRIVCMD_IOEVENTFD \
_IOW('P', 9, struct privcmd_ioeventfd)
+#define IOCTL_PRIVCMD_GSI_FROM_DEV \
+ _IOC(_IOC_NONE, 'P', 10, sizeof(struct privcmd_gsi_from_dev))

#endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index 9b50027113f3..d6315fd559a9 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -83,4 +83,13 @@ int xen_acpi_get_gsi_info(struct pci_dev *dev,
int *gsi_out,
int *trigger_out,
int *polarity_out);
+
+#ifdef CONFIG_XEN_PCI_STUB
+int pcistub_get_gsi_from_sbdf(unsigned int sbdf);
+#else
+static inline int pcistub_get_gsi_from_sbdf(unsigned int sbdf)
+{
+ return -1;
+}
+#endif
#endif /* _XEN_ACPI_H */
--
2.34.1


2024-06-07 07:53:28

by Jiqian Chen

[permalink] [raw]
Subject: [RFC KERNEL PATCH v8 1/3] xen/pci: Add xen_reset_device_function_state

When device on dom0 side has been reset, the vpci on Xen side
won't get notification, so that the cached state in vpci is
all out of date with the real device state.
To solve that problem, add a new function to clear all vpci
device state when device is reset on dom0 side.

And call that function in pcistub_init_device. Because when
using "pci-assignable-add" to assign a passthrough device in
Xen, it will reset passthrough device and the vpci state will
out of date, and then device will fail to restore bar state.

Signed-off-by: Huang Rui <[email protected]>
Signed-off-by: Jiqian Chen <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
---
RFC: it need to wait for the corresponding first patch on xen side to be merged.
---
drivers/xen/pci.c | 25 +++++++++++++++++++++++++
drivers/xen/xen-pciback/pci_stub.c | 18 +++++++++++++++---
include/xen/interface/physdev.h | 7 +++++++
include/xen/pci.h | 6 ++++++
4 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
index 72d4e3f193af..57093e395982 100644
--- a/drivers/xen/pci.c
+++ b/drivers/xen/pci.c
@@ -177,6 +177,31 @@ static int xen_remove_device(struct device *dev)
return r;
}

+enum pci_device_state_reset_type {
+ DEVICE_RESET_FLR,
+ DEVICE_RESET_COLD,
+ DEVICE_RESET_WARM,
+ DEVICE_RESET_HOT,
+};
+
+struct pci_device_state_reset {
+ struct physdev_pci_device dev;
+ enum pci_device_state_reset_type reset_type;
+};
+
+int xen_reset_device_function_state(const struct pci_dev *dev)
+{
+ struct pci_device_state_reset device = {
+ .dev.seg = pci_domain_nr(dev->bus),
+ .dev.bus = dev->bus->number,
+ .dev.devfn = dev->devfn,
+ .reset_type = DEVICE_RESET_FLR,
+ };
+
+ return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, &device);
+}
+EXPORT_SYMBOL_GPL(xen_reset_device_function_state);
+
static int xen_pci_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index e34b623e4b41..73062e531c34 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -89,6 +89,16 @@ static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
return psdev;
}

+static int pcistub_reset_device_state(struct pci_dev *dev)
+{
+ __pci_reset_function_locked(dev);
+
+ if (!xen_pv_domain())
+ return xen_reset_device_function_state(dev);
+ else
+ return 0;
+}
+
/* Don't call this directly as it's called by pcistub_device_put */
static void pcistub_device_release(struct kref *kref)
{
@@ -107,7 +117,7 @@ static void pcistub_device_release(struct kref *kref)
/* Call the reset function which does not take lock as this
* is called from "unbind" which takes a device_lock mutex.
*/
- __pci_reset_function_locked(dev);
+ pcistub_reset_device_state(dev);
if (dev_data &&
pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
dev_info(&dev->dev, "Could not reload PCI state\n");
@@ -284,7 +294,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
* (so it's ready for the next domain)
*/
device_lock_assert(&dev->dev);
- __pci_reset_function_locked(dev);
+ pcistub_reset_device_state(dev);

dev_data = pci_get_drvdata(dev);
ret = pci_load_saved_state(dev, dev_data->pci_saved_state);
@@ -420,7 +430,9 @@ static int pcistub_init_device(struct pci_dev *dev)
dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
else {
dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
- __pci_reset_function_locked(dev);
+ err = pcistub_reset_device_state(dev);
+ if (err)
+ goto config_release;
pci_restore_state(dev);
}
/* Now disable the device (this also ensures some private device
diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
index a237af867873..b50646c993dd 100644
--- a/include/xen/interface/physdev.h
+++ b/include/xen/interface/physdev.h
@@ -256,6 +256,13 @@ struct physdev_pci_device_add {
*/
#define PHYSDEVOP_prepare_msix 30
#define PHYSDEVOP_release_msix 31
+/*
+ * Notify the hypervisor that a PCI device has been reset, so that any
+ * internally cached state is regenerated. Should be called after any
+ * device reset performed by the hardware domain.
+ */
+#define PHYSDEVOP_pci_device_state_reset 32
+
struct physdev_pci_device {
/* IN */
uint16_t seg;
diff --git a/include/xen/pci.h b/include/xen/pci.h
index b8337cf85fd1..7941809ab729 100644
--- a/include/xen/pci.h
+++ b/include/xen/pci.h
@@ -4,10 +4,16 @@
#define __XEN_PCI_H__

#if defined(CONFIG_XEN_DOM0)
+int xen_reset_device_function_state(const struct pci_dev *dev);
int xen_find_device_domain_owner(struct pci_dev *dev);
int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain);
int xen_unregister_device_domain_owner(struct pci_dev *dev);
#else
+static inline int xen_reset_device_function_state(const struct pci_dev *dev)
+{
+ return -1;
+}
+
static inline int xen_find_device_domain_owner(struct pci_dev *dev)
{
return -1;
--
2.34.1


2024-06-07 07:53:38

by Jiqian Chen

[permalink] [raw]
Subject: [RFC KERNEL PATCH v8 2/3] xen/pvh: Setup gsi for passthrough device

In PVH dom0, the gsis don't get registered, but the gsi of
a passthrough device must be configured for it to be able to be
mapped into a domU.

When assign a device to passthrough, proactively setup the gsi
of the device during that process.

Signed-off-by: Huang Rui <[email protected]>
Signed-off-by: Jiqian Chen <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
---
RFC: it need to wait for the corresponding third patch on xen side to be merged.
---
arch/x86/xen/enlighten_pvh.c | 23 ++++++++++++++
drivers/acpi/pci_irq.c | 2 +-
drivers/xen/acpi.c | 50 ++++++++++++++++++++++++++++++
drivers/xen/xen-pciback/pci_stub.c | 21 +++++++++++++
include/linux/acpi.h | 1 +
include/xen/acpi.h | 10 ++++++
6 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 27a2a02ef8fb..6caadf9c00ab 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -4,6 +4,7 @@
#include <linux/mm.h>

#include <xen/hvc-console.h>
+#include <xen/acpi.h>

#include <asm/bootparam.h>
#include <asm/io_apic.h>
@@ -27,6 +28,28 @@
bool __ro_after_init xen_pvh;
EXPORT_SYMBOL_GPL(xen_pvh);

+#ifdef CONFIG_XEN_DOM0
+int xen_pvh_setup_gsi(int gsi, int trigger, int polarity)
+{
+ int ret;
+ struct physdev_setup_gsi setup_gsi;
+
+ setup_gsi.gsi = gsi;
+ setup_gsi.triggering = (trigger == ACPI_EDGE_SENSITIVE ? 0 : 1);
+ setup_gsi.polarity = (polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
+
+ ret = HYPERVISOR_physdev_op(PHYSDEVOP_setup_gsi, &setup_gsi);
+ if (ret == -EEXIST) {
+ xen_raw_printk("Already setup the GSI :%d\n", gsi);
+ ret = 0;
+ } else if (ret)
+ xen_raw_printk("Fail to setup GSI (%d)!\n", gsi);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(xen_pvh_setup_gsi);
+#endif
+
void __init xen_pvh_init(struct boot_params *boot_params)
{
u32 msr;
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index ff30ceca2203..630fe0a34bc6 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -288,7 +288,7 @@ static int acpi_reroute_boot_interrupt(struct pci_dev *dev,
}
#endif /* CONFIG_X86_IO_APIC */

-static struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
+struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin)
{
struct acpi_prt_entry *entry = NULL;
struct pci_dev *bridge;
diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c
index 6893c79fd2a1..9e2096524fbc 100644
--- a/drivers/xen/acpi.c
+++ b/drivers/xen/acpi.c
@@ -30,6 +30,7 @@
* IN THE SOFTWARE.
*/

+#include <linux/pci.h>
#include <xen/acpi.h>
#include <xen/interface/platform.h>
#include <asm/xen/hypercall.h>
@@ -75,3 +76,52 @@ int xen_acpi_notify_hypervisor_extended_sleep(u8 sleep_state,
return xen_acpi_notify_hypervisor_state(sleep_state, val_a,
val_b, true);
}
+
+struct acpi_prt_entry {
+ struct acpi_pci_id id;
+ u8 pin;
+ acpi_handle link;
+ u32 index;
+};
+
+int xen_acpi_get_gsi_info(struct pci_dev *dev,
+ int *gsi_out,
+ int *trigger_out,
+ int *polarity_out)
+{
+ int gsi;
+ u8 pin;
+ struct acpi_prt_entry *entry;
+ int trigger = ACPI_LEVEL_SENSITIVE;
+ int polarity = acpi_irq_model == ACPI_IRQ_MODEL_GIC ?
+ ACPI_ACTIVE_HIGH : ACPI_ACTIVE_LOW;
+
+ if (!dev || !gsi_out || !trigger_out || !polarity_out)
+ return -EINVAL;
+
+ pin = dev->pin;
+ if (!pin)
+ return -EINVAL;
+
+ entry = acpi_pci_irq_lookup(dev, pin);
+ if (entry) {
+ if (entry->link)
+ gsi = acpi_pci_link_allocate_irq(entry->link,
+ entry->index,
+ &trigger, &polarity,
+ NULL);
+ else
+ gsi = entry->index;
+ } else
+ gsi = -1;
+
+ if (gsi < 0)
+ return -EINVAL;
+
+ *gsi_out = gsi;
+ *trigger_out = trigger;
+ *polarity_out = polarity;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(xen_acpi_get_gsi_info);
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 73062e531c34..6b22e45188f5 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -21,6 +21,9 @@
#include <xen/events.h>
#include <xen/pci.h>
#include <xen/xen.h>
+#ifdef CONFIG_XEN_ACPI
+#include <xen/acpi.h>
+#endif
#include <asm/xen/hypervisor.h>
#include <xen/interface/physdev.h>
#include "pciback.h"
@@ -367,6 +370,9 @@ static int pcistub_match(struct pci_dev *dev)
static int pcistub_init_device(struct pci_dev *dev)
{
struct xen_pcibk_dev_data *dev_data;
+#ifdef CONFIG_XEN_ACPI
+ int gsi, trigger, polarity;
+#endif
int err = 0;

dev_dbg(&dev->dev, "initializing...\n");
@@ -435,6 +441,21 @@ static int pcistub_init_device(struct pci_dev *dev)
goto config_release;
pci_restore_state(dev);
}
+
+#ifdef CONFIG_XEN_ACPI
+ err = xen_acpi_get_gsi_info(dev, &gsi, &trigger, &polarity);
+ if (err) {
+ dev_err(&dev->dev, "Fail to get gsi info!\n");
+ goto config_release;
+ }
+
+ if (xen_initial_domain() && xen_pvh_domain()) {
+ err = xen_pvh_setup_gsi(gsi, trigger, polarity);
+ if (err)
+ goto config_release;
+ }
+#endif
+
/* Now disable the device (this also ensures some private device
* data is setup before we export)
*/
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 34829f2c517a..f8690b02bba4 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -361,6 +361,7 @@ void acpi_unregister_gsi (u32 gsi);

struct pci_dev;

+struct acpi_prt_entry *acpi_pci_irq_lookup(struct pci_dev *dev, int pin);
int acpi_pci_irq_enable (struct pci_dev *dev);
void acpi_penalize_isa_irq(int irq, int active);
bool acpi_isa_irq_available(int irq);
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index b1e11863144d..9b50027113f3 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -67,10 +67,20 @@ static inline void xen_acpi_sleep_register(void)
acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
}
}
+int xen_pvh_setup_gsi(int gsi, int trigger, int polarity);
#else
static inline void xen_acpi_sleep_register(void)
{
}
+
+static inline int xen_pvh_setup_gsi(int gsi, int trigger, int polarity)
+{
+ return -1;
+}
#endif

+int xen_acpi_get_gsi_info(struct pci_dev *dev,
+ int *gsi_out,
+ int *trigger_out,
+ int *polarity_out);
#endif /* _XEN_ACPI_H */
--
2.34.1