2023-04-06 11:15:10

by David Laight

[permalink] [raw]
Subject: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

The change in bab65e48cb064 breaks pci_enable_msix_range().
The intent is to optimise the sanity checks, but it is
somewhat overenthusiastic.

The interface allows you to ask for a lot of vectors and
returns the number that were allocated.
However, after the change, you can't request a vector
that is higher than the largest the hardware supports.
Which makes that rather pointless.

So code like:
for (i = 0; i < 16; i++)
msix_tbl[i].entry = i;
nvec = pci_enable_msix_range(dev, msix_tbl, 1, 16);
Now returns -22 if the hardware only supports 8 interrupts.

Previously it returned 8.

I can fix my driver, but I suspect that any code that relies
on a smaller number of vectors being returned is now broken.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2023-04-06 15:20:57

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

[+cc linux-pci, regressions]

On Thu, Apr 06, 2023 at 11:05:14AM +0000, David Laight wrote:
> The change in bab65e48cb064 breaks pci_enable_msix_range().
> The intent is to optimise the sanity checks, but it is
> somewhat overenthusiastic.
>
> The interface allows you to ask for a lot of vectors and
> returns the number that were allocated.
> However, after the change, you can't request a vector
> that is higher than the largest the hardware supports.
> Which makes that rather pointless.
>
> So code like:
> for (i = 0; i < 16; i++)
> msix_tbl[i].entry = i;
> nvec = pci_enable_msix_range(dev, msix_tbl, 1, 16);
> Now returns -22 if the hardware only supports 8 interrupts.
>
> Previously it returned 8.
>
> I can fix my driver, but I suspect that any code that relies
> on a smaller number of vectors being returned is now broken.

Thanks for the report! bab65e48cb06 ("PCI/MSI: Sanitize MSI-X
checks") appeared in v6.2-rc1, so this is a recent regression and it
would be good to fix it for v6.3.

bab65e48cb06 only touches drivers/pci/msi/msi.c, but since it didn't
go through the PCI tree, I'll let Thomas handle any revert (or better,
an improvement to pci_msix_validate_entries()) since he wrote and
applied the original.

Bjorn

2023-04-06 15:37:18

by David Laight

[permalink] [raw]
Subject: RE: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

From: Bjorn Helgaas
> Sent: 06 April 2023 16:08
>
> [+cc linux-pci, regressions]
>
> On Thu, Apr 06, 2023 at 11:05:14AM +0000, David Laight wrote:
> > The change in bab65e48cb064 breaks pci_enable_msix_range().
> > The intent is to optimise the sanity checks, but it is
> > somewhat overenthusiastic.
> >
> > The interface allows you to ask for a lot of vectors and
> > returns the number that were allocated.
> > However, after the change, you can't request a vector
> > that is higher than the largest the hardware supports.
> > Which makes that rather pointless.
> >
> > So code like:
> > for (i = 0; i < 16; i++)
> > msix_tbl[i].entry = i;
> > nvec = pci_enable_msix_range(dev, msix_tbl, 1, 16);
> > Now returns -22 if the hardware only supports 8 interrupts.
> >
> > Previously it returned 8.
> >
> > I can fix my driver, but I suspect that any code that relies
> > on a smaller number of vectors being returned is now broken.
>
> Thanks for the report! bab65e48cb06 ("PCI/MSI: Sanitize MSI-X
> checks") appeared in v6.2-rc1, so this is a recent regression and it
> would be good to fix it for v6.3.

I do try to test every release at around rc3.

> bab65e48cb06 only touches drivers/pci/msi/msi.c, but since it didn't
> go through the PCI tree, I'll let Thomas handle any revert (or better,
> an improvement to pci_msix_validate_entries()) since he wrote and
> applied the original.

Looking it:

static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
int nvec, int hwsize)
{
bool nogap;
int i, j;

if (!entries)
return true;

nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);

for (i = 0; i < nvec; i++) {
/* Entry within hardware limit? */
if (entries[i].entry >= hwsize)
return false;

/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
return false;
}
/* Check for unsupported gaps */
if (nogap && entries[i].entry != i)
return false;
}
return true;
}

It probably needs to return an updated 'nvec'.
The gap/duplicate check is also a bit horrid, why not:
if (nogap) {
if (entries[i].entry != i)
return false;
continue;
}

if (!i || entries[i].entry > entries[i - 1].entry)
continue;

horrid, expensive loop...

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-04-06 19:40:55

by Linus Torvalds

[permalink] [raw]
Subject: Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

On Thu, Apr 6, 2023 at 4:05 AM David Laight <[email protected]> wrote:
>
> So code like:
> for (i = 0; i < 16; i++)
> msix_tbl[i].entry = i;
> nvec = pci_enable_msix_range(dev, msix_tbl, 1, 16);
> Now returns -22 if the hardware only supports 8 interrupts.
>
> Previously it returned 8.

Does just moving the pci_msix_validate_entries() down to below the
hwsize update code fix it?

IOW, something like this attached patch?

ENTIRELY UNTESTED! This may be seriously broken for some reason, but
it does seem like the current code makes no sense (that "Keep the IRQ
virtual hackery working" comment seems to not possibly be true since
the MSIX nvec has effectively been checked against hwsize by the
pci_msix_validate_entries() code before).

I don't know this code. Thomas?

Linus


Attachments:
patch.diff (847.00 B)

2023-04-06 19:47:44

by Thomas Gleixner

[permalink] [raw]
Subject: Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

On Thu, Apr 06 2023 at 10:07, Bjorn Helgaas wrote:
> On Thu, Apr 06, 2023 at 11:05:14AM +0000, David Laight wrote:
> Thanks for the report! bab65e48cb06 ("PCI/MSI: Sanitize MSI-X
> checks") appeared in v6.2-rc1, so this is a recent regression and it
> would be good to fix it for v6.3.
>
> bab65e48cb06 only touches drivers/pci/msi/msi.c, but since it didn't
> go through the PCI tree, I'll let Thomas handle any revert (or better,
> an improvement to pci_msix_validate_entries()) since he wrote and
> applied the original.

Right. The fix is trivial as the hardware size check in this validation
function is pointless.

The point is that for a range allocation with and entries array, _all_
entries up to max_vec must be correct independent of the actual hardware
size.

So the fix is simply removing the hardware size check from the
validation function.

The hardware size checking happens afterwards anyway.

Thanks,

tglx
---
drivers/pci/msi/msi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -750,8 +750,7 @@ static int msix_capability_init(struct p
return ret;
}

-static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
- int nvec, int hwsize)
+static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries, int nvev)
{
bool nogap;
int i, j;
@@ -762,10 +761,6 @@ static bool pci_msix_validate_entries(st
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);

for (i = 0; i < nvec; i++) {
- /* Entry within hardware limit? */
- if (entries[i].entry >= hwsize)
- return false;
-
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
@@ -805,7 +800,7 @@ int __pci_enable_msix_range(struct pci_d
if (hwsize < 0)
return hwsize;

- if (!pci_msix_validate_entries(dev, entries, nvec, hwsize))
+ if (!pci_msix_validate_entries(dev, entries, nvec))
return -EINVAL;

if (hwsize < nvec) {

2023-04-06 21:12:08

by Thomas Gleixner

[permalink] [raw]
Subject: Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

On Thu, Apr 06 2023 at 12:35, Linus Torvalds wrote:
>> Previously it returned 8.
>
> Does just moving the pci_msix_validate_entries() down to below the
> hwsize update code fix it?
>
> IOW, something like this attached patch?
>
> ENTIRELY UNTESTED! This may be seriously broken for some reason, but
> it does seem like the current code makes no sense (that "Keep the IRQ
> virtual hackery working" comment seems to not possibly be true since
> the MSIX nvec has effectively been checked against hwsize by the
> pci_msix_validate_entries() code before).

Yes, that works too. But I rather remove the hwsize check from the
validation function as I explained in my earlier reply to Bjorn in that
thread.

Thanks,

tglx

2023-04-07 12:47:56

by David Laight

[permalink] [raw]
Subject: RE: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

From: Linus Torvalds
> Sent: 06 April 2023 20:36
>
> On Thu, Apr 6, 2023 at 4:05 AM David Laight <[email protected]> wrote:
> >
> > So code like:
> > for (i = 0; i < 16; i++)
> > msix_tbl[i].entry = i;
> > nvec = pci_enable_msix_range(dev, msix_tbl, 1, 16);
> > Now returns -22 if the hardware only supports 8 interrupts.
> >
> > Previously it returned 8.
>
> Does just moving the pci_msix_validate_entries() down to below the
> hwsize update code fix it?

I think it depends on why the driver is asking for more
interrupts than the hardware supports.
Potentially the driver could do:
for (i = 0; i < 8; i++)
msix_tbl[i].entry = 2 * i;
if the hardware supports 8 interrupts perhaps it
should return 4?

In my case both the MSI-X logic on the fpga and the driver
support 16 interrupts, but PCIe config space reports 8.
The high numbered interrupts aren't used very often and get
multiplexed onto the highest 'real' interrupt.
(The fpga build makes it pretty much impossible to tie
together the config space value and the size of the MSI-X
table.)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-04-07 19:28:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

On Fri, Apr 7, 2023 at 5:25 AM David Laight <[email protected]> wrote:
>
> I think it depends on why the driver is asking for more
> interrupts than the hardware supports.
> Potentially the driver could do:
> for (i = 0; i < 8; i++)
> msix_tbl[i].entry = 2 * i;
> if the hardware supports 8 interrupts perhaps it
> should return 4?

Hmm.

Something like this might get that case right too. Again - entirely
untested, but looks superficially sane to me.

It just returns how many of the msix_entry[] entries can be used (or
negative for error). So then the (only) caller can just say "is that
still enough for what we required". Seems reasonable, and would get
that non-contiguous case right, I think.

Thomas?

I'm going to drop this and assume I'll get a pull from you with what
you consider the proper fix, but since I was looking at this anyway, I
decided I might as well send out this RFC patch.

Linus


Attachments:
patch.diff (1.87 kB)

2023-04-07 21:51:26

by Thomas Gleixner

[permalink] [raw]
Subject: Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks

Linus!

On Fri, Apr 07 2023 at 12:26, Linus Torvalds wrote:
> On Fri, Apr 7, 2023 at 5:25 AM David Laight <[email protected]> wrote:
>>
>> I think it depends on why the driver is asking for more
>> interrupts than the hardware supports.
>> Potentially the driver could do:
>> for (i = 0; i < 8; i++)
>> msix_tbl[i].entry = 2 * i;
>> if the hardware supports 8 interrupts perhaps it
>> should return 4?
>
> Hmm.
>
> Something like this might get that case right too. Again - entirely
> untested, but looks superficially sane to me.
>
> It just returns how many of the msix_entry[] entries can be used (or
> negative for error). So then the (only) caller can just say "is that
> still enough for what we required". Seems reasonable, and would get
> that non-contiguous case right, I think.

Probably, but the point is that a driver should not hand in invalid data
in the first place. Even if the hardware does not provide enough space
for the requested maximum range then the handed in entries array should
not contain any invalid data, right?

Ergo the hwsize check in that function is bogus. No idea what I was
thinking there.

So the simple and IMO correct solution is to drop that hwsize check from
the validation function and validate up to the

The point is that for a range allocation with and entries array, _all_
entries up to max_vec must be correct independent of the actual hardware
size.

So the fix is simply removing the hardware size check from the
validation function. The hardware size checking happens afterwards
anyway.

Let me write up a proper changelog for that.

Thanks,

tglx
---
drivers/pci/msi/msi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -750,8 +750,7 @@ static int msix_capability_init(struct p
return ret;
}

-static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
- int nvec, int hwsize)
+static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries, int nvev)
{
bool nogap;
int i, j;
@@ -762,10 +761,6 @@ static bool pci_msix_validate_entries(st
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);

for (i = 0; i < nvec; i++) {
- /* Entry within hardware limit? */
- if (entries[i].entry >= hwsize)
- return false;
-
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
@@ -805,7 +800,7 @@ int __pci_enable_msix_range(struct pci_d
if (hwsize < 0)
return hwsize;

- if (!pci_msix_validate_entries(dev, entries, nvec, hwsize))
+ if (!pci_msix_validate_entries(dev, entries, nvec))
return -EINVAL;

if (hwsize < nvec) {

2023-04-10 19:17:09

by Thomas Gleixner

[permalink] [raw]
Subject: [PATCH] PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()

pci_msix_validate_entries() validates the entries array which is handed in
by the caller for a MSI-X interrupt allocation. Aside of consistency
failures it also detects a failure when the size of the MSI-X hardware table
in the device is smaller than the size of the entries array.

That's wrong for the case of range allocations where the caller provides
the minimum and the maximum number of vectors to allocate, when the
hardware size is greater or equal to the mininum, but smaller than the
maximum.

Remove the hardware size check completely from that function and just
ensure that the entires array up to the maximum size is consistent.

The limitation and range checking versus the hardware size happens
independently of that afterwards anyway because the entries array is
optional.

Fixes: 4644d22eb673 ("PCI/MSI: Validate MSI-X contiguous restriction early")
Reported-by: David Laight <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---

David, can you please confirm that this fixes your issue?

---
drivers/pci/msi/msi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -750,8 +750,7 @@ static int msix_capability_init(struct p
return ret;
}

-static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
- int nvec, int hwsize)
+static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries, int nvev)
{
bool nogap;
int i, j;
@@ -762,10 +761,6 @@ static bool pci_msix_validate_entries(st
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);

for (i = 0; i < nvec; i++) {
- /* Entry within hardware limit? */
- if (entries[i].entry >= hwsize)
- return false;
-
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
@@ -805,7 +800,7 @@ int __pci_enable_msix_range(struct pci_d
if (hwsize < 0)
return hwsize;

- if (!pci_msix_validate_entries(dev, entries, nvec, hwsize))
+ if (!pci_msix_validate_entries(dev, entries, nvec))
return -EINVAL;

if (hwsize < nvec) {

Subject: [tip: irq/urgent] PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID: 84d9651e13fb9820041d19262a55906851524c0f
Gitweb: https://git.kernel.org/tip/84d9651e13fb9820041d19262a55906851524c0f
Author: Thomas Gleixner <[email protected]>
AuthorDate: Mon, 10 Apr 2023 21:14:45 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Sat, 15 Apr 2023 23:17:32 +02:00

PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()

pci_msix_validate_entries() validates the entries array which is handed in
by the caller for a MSI-X interrupt allocation. Aside of consistency
failures it also detects a failure when the size of the MSI-X hardware table
in the device is smaller than the size of the entries array.

That's wrong for the case of range allocations where the caller provides
the minimum and the maximum number of vectors to allocate, when the
hardware size is greater or equal than the mininum, but smaller than the
maximum.

Remove the hardware size check completely from that function and just
ensure that the entires array up to the maximum size is consistent.

The limitation and range checking versus the hardware size happens
independently of that afterwards anyway because the entries array is
optional.

Fixes: 4644d22eb673 ("PCI/MSI: Validate MSI-X contiguous restriction early")
Reported-by: David Laight <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/87v8i3sg62.ffs@tglx
---
drivers/pci/msi/msi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 1f71662..24899d9 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -750,8 +750,7 @@ out_disable:
return ret;
}

-static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
- int nvec, int hwsize)
+static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries, int nvev)
{
bool nogap;
int i, j;
@@ -762,10 +761,6 @@ static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *en
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);

for (i = 0; i < nvec; i++) {
- /* Entry within hardware limit? */
- if (entries[i].entry >= hwsize)
- return false;
-
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
@@ -805,7 +800,7 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
if (hwsize < 0)
return hwsize;

- if (!pci_msix_validate_entries(dev, entries, nvec, hwsize))
+ if (!pci_msix_validate_entries(dev, entries, nvec))
return -EINVAL;

if (hwsize < nvec) {

Subject: [tip: irq/urgent] PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID: e3c026be4d3ca046799fde55ccbae9d0f059fb93
Gitweb: https://git.kernel.org/tip/e3c026be4d3ca046799fde55ccbae9d0f059fb93
Author: Thomas Gleixner <[email protected]>
AuthorDate: Mon, 10 Apr 2023 21:14:45 +02:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Sun, 16 Apr 2023 14:11:51 +02:00

PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()

pci_msix_validate_entries() validates the entries array which is handed in
by the caller for a MSI-X interrupt allocation. Aside of consistency
failures it also detects a failure when the size of the MSI-X hardware table
in the device is smaller than the size of the entries array.

That's wrong for the case of range allocations where the caller provides
the minimum and the maximum number of vectors to allocate, when the
hardware size is greater or equal than the mininum, but smaller than the
maximum.

Remove the hardware size check completely from that function and just
ensure that the entires array up to the maximum size is consistent.

The limitation and range checking versus the hardware size happens
independently of that afterwards anyway because the entries array is
optional.

Fixes: 4644d22eb673 ("PCI/MSI: Validate MSI-X contiguous restriction early")
Reported-by: David Laight <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/87v8i3sg62.ffs@tglx

---
drivers/pci/msi/msi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 1f71662..ef1d885 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -750,8 +750,7 @@ out_disable:
return ret;
}

-static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
- int nvec, int hwsize)
+static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries, int nvec)
{
bool nogap;
int i, j;
@@ -762,10 +761,6 @@ static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *en
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);

for (i = 0; i < nvec; i++) {
- /* Entry within hardware limit? */
- if (entries[i].entry >= hwsize)
- return false;
-
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
@@ -805,7 +800,7 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
if (hwsize < 0)
return hwsize;

- if (!pci_msix_validate_entries(dev, entries, nvec, hwsize))
+ if (!pci_msix_validate_entries(dev, entries, nvec))
return -EINVAL;

if (hwsize < nvec) {