Subject: [PATCH 0/2] xen-paciback: Fixes for PCI passthrough of AMD GPU.

I've got this patch series from Dwayne Litzenberger, as a fix for AMD GPU
passthrough. I don't have this hardware myself, so I can't test this, but he
claims it works. I'm not sure if allowing guest to enable/disable expansion ROM
is safe thing to do, but the comment in rom_write for me suggests it is but
wasn't considered necessary. Other than this uncertainty, the patches looks
good for me.

Originally posted here:
https://github.com/marmarek/qubes-linux-kernel/pulls/1

Cc: Dwayne Litzenberger <[email protected]>

Dwayne Litzenberger (2):
xen-pciback: Fix error return in bar_write() and rom_write()
xen-pciback: Allow enabling/disabling expansion ROM

drivers/xen/xen-pciback/conf_space_header.c | 34 ++++++++++++++--------
1 file changed, 23 insertions(+), 11 deletions(-)

base-commit: 2e6e902d185027f8e3cb8b7305238f7e35d6a436
--
git-series 0.9.1


Subject: [PATCH 1/2] xen-pciback: Fix error return in bar_write() and rom_write()

From: Dwayne Litzenberger <[email protected]>

Signed-off-by: Dwayne Litzenberger <[email protected]>
---
drivers/xen/xen-pciback/conf_space_header.c | 24 ++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/xen-pciback/conf_space_header.c b/drivers/xen/xen-pciback/conf_space_header.c
index 10ae24b..697d0a8 100644
--- a/drivers/xen/xen-pciback/conf_space_header.c
+++ b/drivers/xen/xen-pciback/conf_space_header.c
@@ -135,6 +135,7 @@ static int command_write(struct pci_dev *dev, int offset, u16 value, void *data)

static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
{
+ int err = 0;
struct pci_bar_info *bar = data;

if (unlikely(!bar)) {
@@ -150,17 +151,22 @@ static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
bar->which = 1;
else {
u32 tmpval;
- pci_read_config_dword(dev, offset, &tmpval);
+ err = pci_read_config_dword(dev, offset, &tmpval);
+ if (err)
+ goto out;
if (tmpval != bar->val && value == bar->val) {
/* Allow restoration of bar value. */
- pci_write_config_dword(dev, offset, bar->val);
+ err = pci_write_config_dword(dev, offset, bar->val);
+ if (err)
+ goto out;
}
bar->which = 0;
}

/* Do we need to support enabling/disabling the rom address here? */

- return 0;
+out:
+ return err;
}

/* For the BARs, only allow writes which write ~0 or
@@ -169,6 +175,7 @@ static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
*/
static int bar_write(struct pci_dev *dev, int offset, u32 value, void *data)
{
+ int err = 0;
struct pci_bar_info *bar = data;
unsigned int pos = (offset - PCI_BASE_ADDRESS_0) / 4;
const struct resource *res = dev->resource;
@@ -193,15 +200,20 @@ static int bar_write(struct pci_dev *dev, int offset, u32 value, void *data)
bar->which = 1;
else {
u32 tmpval;
- pci_read_config_dword(dev, offset, &tmpval);
+ err = pci_read_config_dword(dev, offset, &tmpval);
+ if (err)
+ goto out;
if (tmpval != bar->val && value == bar->val) {
/* Allow restoration of bar value. */
- pci_write_config_dword(dev, offset, bar->val);
+ err = pci_write_config_dword(dev, offset, bar->val);
+ if (err)
+ goto out;
}
bar->which = 0;
}

- return 0;
+out:
+ return err;
}

static int bar_read(struct pci_dev *dev, int offset, u32 * value, void *data)
--
git-series 0.9.1

Subject: [PATCH 2/2] xen-pciback: Allow enabling/disabling expansion ROM

From: Dwayne Litzenberger <[email protected]>

Newer AMD GPUs store their initialization routines as bytecode on the
ROM. This fixes the following initialization error inside the VM when
doing PCI passthrough:

radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[drm:radeon_get_bios [radeon]] *ERROR* Unable to locate a BIOS ROM
radeon 0000:00:05.0: Fatal error during GPU init

Signed-off-by: Dwayne Litzenberger <[email protected]>
---
drivers/xen/xen-pciback/conf_space_header.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/xen/xen-pciback/conf_space_header.c b/drivers/xen/xen-pciback/conf_space_header.c
index 697d0a8..bc145d3 100644
--- a/drivers/xen/xen-pciback/conf_space_header.c
+++ b/drivers/xen/xen-pciback/conf_space_header.c
@@ -150,21 +150,21 @@ static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
if ((value | ~PCI_ROM_ADDRESS_MASK) == ~0U)
bar->which = 1;
else {
- u32 tmpval;
- err = pci_read_config_dword(dev, offset, &tmpval);
+ u32 newval = bar->val;
+
+ /* Allow enabling/disabling rom, if present */
+ if (newval & PCI_ROM_ADDRESS_MASK) {
+ newval &= ~PCI_ROM_ADDRESS_ENABLE;
+ newval |= value & PCI_ROM_ADDRESS_ENABLE;
+ }
+
+ err = pci_write_config_dword(dev, offset, newval);
if (err)
goto out;
- if (tmpval != bar->val && value == bar->val) {
- /* Allow restoration of bar value. */
- err = pci_write_config_dword(dev, offset, bar->val);
- if (err)
- goto out;
- }
+ bar->val = newval;
bar->which = 0;
}

- /* Do we need to support enabling/disabling the rom address here? */
-
out:
return err;
}
--
git-series 0.9.1

2018-12-03 10:56:35

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 1/2] xen-pciback: Fix error return in bar_write() and rom_write()

>>> On 02.12.18 at 18:47, <[email protected]> wrote:
> From: Dwayne Litzenberger <[email protected]>
>
> Signed-off-by: Dwayne Litzenberger <[email protected]>

At least in the kernel world I think your own SOB is expected here.

Also the description would better be non-empty, explaining under
what conditions failure was observed (and wrongly ignored), or
whether instead the change is solely "just in case".

Some stylistic adjustments would also seem on order, but since
I'm not entirely certain about the kernel policy in this regard I'll
omit respective remarks.

Jan



2018-12-03 11:02:28

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 2/2] xen-pciback: Allow enabling/disabling expansion ROM

>>> On 02.12.18 at 18:47, <[email protected]> wrote:
> From: Dwayne Litzenberger <[email protected]>
>
> Newer AMD GPUs store their initialization routines as bytecode on the
> ROM. This fixes the following initialization error inside the VM when
> doing PCI passthrough:
>
> radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
> radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
> [drm:radeon_get_bios [radeon]] *ERROR* Unable to locate a BIOS ROM
> radeon 0000:00:05.0: Fatal error during GPU init

Isn't it that qemu is supposed to surface the ROM image to guests,
making it unnecessary to allow guests control over the physical
enable bit? Also why would allowing to alter the bit depend on
whether the address portion of the value is non-zero?

Jan



2018-12-03 12:37:36

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 1/2] xen-pciback: Fix error return in bar_write() and rom_write()

On Sun, Dec 02, 2018 at 06:47:32PM +0100, Marek Marczykowski-G?recki wrote:
> From: Dwayne Litzenberger <[email protected]>

I think this requires some description. At least a note that the
function is not altered, just errors from pci reads/writes are no
longer ignored.

> Signed-off-by: Dwayne Litzenberger <[email protected]>
> ---
> drivers/xen/xen-pciback/conf_space_header.c | 24 ++++++++++++++++------
> 1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/xen/xen-pciback/conf_space_header.c b/drivers/xen/xen-pciback/conf_space_header.c
> index 10ae24b..697d0a8 100644
> --- a/drivers/xen/xen-pciback/conf_space_header.c
> +++ b/drivers/xen/xen-pciback/conf_space_header.c
> @@ -135,6 +135,7 @@ static int command_write(struct pci_dev *dev, int offset, u16 value, void *data)
>
> static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
> {
> + int err = 0;
> struct pci_bar_info *bar = data;
>
> if (unlikely(!bar)) {
> @@ -150,17 +151,22 @@ static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
> bar->which = 1;
> else {
> u32 tmpval;
> - pci_read_config_dword(dev, offset, &tmpval);
> + err = pci_read_config_dword(dev, offset, &tmpval);
> + if (err)
> + goto out;

I don't think you need the out label, you could just return err.
Adding the label doesn't help in any way IMO, just makes the function
one line longer for no reason.

Same for the out label added to bar_write.

Thanks, Roger.

2018-12-03 12:54:47

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 2/2] xen-pciback: Allow enabling/disabling expansion ROM

On Sun, Dec 02, 2018 at 06:47:33PM +0100, Marek Marczykowski-G?recki wrote:
> From: Dwayne Litzenberger <[email protected]>
>
> Newer AMD GPUs store their initialization routines as bytecode on the
> ROM. This fixes the following initialization error inside the VM when
> doing PCI passthrough:
>
> radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
> radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
> [drm:radeon_get_bios [radeon]] *ERROR* Unable to locate a BIOS ROM
> radeon 0000:00:05.0: Fatal error during GPU init
>
> Signed-off-by: Dwayne Litzenberger <[email protected]>
> ---
> drivers/xen/xen-pciback/conf_space_header.c | 20 ++++++++++----------
> 1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/xen/xen-pciback/conf_space_header.c b/drivers/xen/xen-pciback/conf_space_header.c
> index 697d0a8..bc145d3 100644
> --- a/drivers/xen/xen-pciback/conf_space_header.c
> +++ b/drivers/xen/xen-pciback/conf_space_header.c
> @@ -150,21 +150,21 @@ static int rom_write(struct pci_dev *dev, int offset, u32 value, void *data)
> if ((value | ~PCI_ROM_ADDRESS_MASK) == ~0U)
> bar->which = 1;
> else {
> - u32 tmpval;
> - err = pci_read_config_dword(dev, offset, &tmpval);
> + u32 newval = bar->val;

Naming this newval is quite confusing IMO, since at this point it's
actually the old value.

> +
> + /* Allow enabling/disabling rom, if present */
> + if (newval & PCI_ROM_ADDRESS_MASK) {
> + newval &= ~PCI_ROM_ADDRESS_ENABLE;
> + newval |= value & PCI_ROM_ADDRESS_ENABLE;
> + }
> + err = pci_write_config_dword(dev, offset, newval);
> if (err)
> goto out;
> - if (tmpval != bar->val && value == bar->val) {
> - /* Allow restoration of bar value. */
> - err = pci_write_config_dword(dev, offset, bar->val);
> - if (err)
> - goto out;
> - }
> + bar->val = newval;

I'm not sure there's much value in storing this, the only difference
is the setting of the enable bit, which is ignored anyway.

Thanks, Roger.

Subject: Re: [Xen-devel] [PATCH 2/2] xen-pciback: Allow enabling/disabling expansion ROM

On Mon, Dec 03, 2018 at 04:01:07AM -0700, Jan Beulich wrote:
> >>> On 02.12.18 at 18:47, <[email protected]> wrote:
> > From: Dwayne Litzenberger <[email protected]>
> >
> > Newer AMD GPUs store their initialization routines as bytecode on the
> > ROM. This fixes the following initialization error inside the VM when
> > doing PCI passthrough:
> >
> > radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
> > radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
> > [drm:radeon_get_bios [radeon]] *ERROR* Unable to locate a BIOS ROM
> > radeon 0000:00:05.0: Fatal error during GPU init
>
> Isn't it that qemu is supposed to surface the ROM image to guests,
> making it unnecessary to allow guests control over the physical
> enable bit?

Unless that qemu is in stubdomain, where it use pciback to access
everything about the device...

> Also why would allowing to alter the bit depend on
> whether the address portion of the value is non-zero?

That's a good question. According to commit message I think it should be
a ROM presence check instead. If needed at this point at all - is this
function even called if there is no ROM?

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Attachments:
(No filename) (1.40 kB)
signature.asc (499.00 B)
Download all attachments

2018-12-03 15:05:44

by Jan Beulich

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 2/2] xen-pciback: Allow enabling/disabling expansion ROM

>>> On 03.12.18 at 15:47, <[email protected]> wrote:
> On Mon, Dec 03, 2018 at 04:01:07AM -0700, Jan Beulich wrote:
>> >>> On 02.12.18 at 18:47, <[email protected]> wrote:
>> > From: Dwayne Litzenberger <[email protected]>
>> >
>> > Newer AMD GPUs store their initialization routines as bytecode on the
>> > ROM. This fixes the following initialization error inside the VM when
>> > doing PCI passthrough:
>> >
>> > radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
>> > radeon 0000:00:05.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
>> > [drm:radeon_get_bios [radeon]] *ERROR* Unable to locate a BIOS ROM
>> > radeon 0000:00:05.0: Fatal error during GPU init
>>
>> Isn't it that qemu is supposed to surface the ROM image to guests,
>> making it unnecessary to allow guests control over the physical
>> enable bit?
>
> Unless that qemu is in stubdomain, where it use pciback to access
> everything about the device...

Would be quite helpful to explain this in the description.

>> Also why would allowing to alter the bit depend on
>> whether the address portion of the value is non-zero?
>
> That's a good question. According to commit message I think it should be
> a ROM presence check instead. If needed at this point at all - is this
> function even called if there is no ROM?

I suppose this was a question to Dwayne, not me.

Jan