According to the documentation, "pcie_ports=native", linux should use
native AER and DPC services. While that is true for the _OSC method
parsing, this is not the only place that is checked. Should the HEST
table list PCIe ports as firmware-first, linux will not use native
services.
This happens because aer_acpi_firmware_first() doesn't take
'pcie_ports' into account. This is wrong. DPC uses the same logic when
it decides whether to load or not, so fixing this also fixes DPC not
loading.
Signed-off-by: Alexandru Gagniuc <[email protected]>
---
drivers/pci/pcie/aer.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Changes since v1:
- Re-tested with latest and greatest (v4.18-rc1) -- works great
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e88386af28..98ced0f7c850 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -291,7 +291,7 @@ static void aer_set_firmware_first(struct pci_dev *pci_dev)
rc = apei_hest_parse(aer_hest_parse, &info);
- if (rc)
+ if (rc || pcie_ports_native)
pci_dev->__aer_firmware_first = 0;
else
pci_dev->__aer_firmware_first = info.firmware_first;
@@ -327,6 +327,9 @@ bool aer_acpi_firmware_first(void)
apei_hest_parse(aer_hest_parse, &info);
aer_firmware_first = info.firmware_first;
parsed = true;
+ if (pcie_ports_native)
+ aer_firmware_first = 0;
+
}
return aer_firmware_first;
}
--
2.14.3
[+cc Borislav, linux-acpi, since this involves APEI/HEST]
On Tue, Jun 19, 2018 at 02:58:20PM -0500, Alexandru Gagniuc wrote:
> According to the documentation, "pcie_ports=native", linux should use
> native AER and DPC services. While that is true for the _OSC method
> parsing, this is not the only place that is checked. Should the HEST
> table list PCIe ports as firmware-first, linux will not use native
> services.
Nothing in ACPI-land looks at pcie_ports_native. How should ACPI
things work in the "pcie_ports=native" case? I guess we still have to
expect to receive error records from the firmware, because it may
certainly send us non-PCI errors (machine checks, etc) and maybe even
some PCI errors (even if the Linux AER driver claims AER interrupts,
we don't know what notification mechanisms the firmware may be using).
I guess best-case, we'll get ACPI error records for all non-PCI
things, and the Linux AER driver will see all the AER errors.
Worst-case, I don't really know what to expect. Duplicate reporting
of AER errors via firmware and Linux AER driver? Some kind of
confusion about who acknowledges and clears them?
Out of curiosity, what is your use case for "pcie_ports=native"?
Presumably there's something that works better when using it, and
things work even *better* with this patch?
I know people do use it, because I often see it mentioned in forums
and bug reports, but I really don't expect it to work very well
because we're ignoring the usage model the firmware is designed
around. My unproven suspicion is that most uses are in the black
magic category of "there's a bug here, and we don't know how to fix
it, but pcie_ports=native makes it work better".
Obviously I would much rather find and fix those bugs so people
wouldn't have to stumble over the problem in the first place.
> This happens because aer_acpi_firmware_first() doesn't take
> 'pcie_ports' into account. This is wrong. DPC uses the same logic when
> it decides whether to load or not, so fixing this also fixes DPC not
> loading.
>
> Signed-off-by: Alexandru Gagniuc <[email protected]>
> ---
> drivers/pci/pcie/aer.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> Changes since v1:
> - Re-tested with latest and greatest (v4.18-rc1) -- works great
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a2e88386af28..98ced0f7c850 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -291,7 +291,7 @@ static void aer_set_firmware_first(struct pci_dev *pci_dev)
>
> rc = apei_hest_parse(aer_hest_parse, &info);
>
> - if (rc)
> + if (rc || pcie_ports_native)
> pci_dev->__aer_firmware_first = 0;
> else
> pci_dev->__aer_firmware_first = info.firmware_first;
> @@ -327,6 +327,9 @@ bool aer_acpi_firmware_first(void)
> apei_hest_parse(aer_hest_parse, &info);
> aer_firmware_first = info.firmware_first;
> parsed = true;
> + if (pcie_ports_native)
> + aer_firmware_first = 0;
> +
> }
> return aer_firmware_first;
> }
> --
> 2.14.3
>
On 06/30/2018 04:31 PM, Bjorn Helgaas wrote:
> [+cc Borislav, linux-acpi, since this involves APEI/HEST]
Borislav is not the relevant maintainer here, since we're not contingent
on APEI handling. I think Keith has a lot more experience with this part
of the kernel.
> On Tue, Jun 19, 2018 at 02:58:20PM -0500, Alexandru Gagniuc wrote:
>> According to the documentation, "pcie_ports=native", linux should use
>> native AER and DPC services. While that is true for the _OSC method
>> parsing, this is not the only place that is checked. Should the HEST
>> table list PCIe ports as firmware-first, linux will not use native
>> services.
>
> Nothing in ACPI-land looks at pcie_ports_native. How should ACPI
> things work in the "pcie_ports=native" case? I guess we still have to
> expect to receive error records from the firmware, because it may
> certainly send us non-PCI errors (machine checks, etc) and maybe even
> some PCI errors (even if the Linux AER driver claims AER interrupts,
> we don't know what notification mechanisms the firmware may be using).
I think ACPI land shouldn't care about this. We care about it from the
PCIe stand point at the interface with ACPI. FW might see a delta in the
sense that we request control of some features via _OSC, which we
otherwise would not do without pcie_ports=native.
> I guess best-case, we'll get ACPI error records for all non-PCI
> things, and the Linux AER driver will see all the AER errors.
It might affect FW's ability to catch errors, but that's dependent on
the root port implementation.
> Worst-case, I don't really know what to expect. Duplicate reporting
> of AER errors via firmware and Linux AER driver? Some kind of
> confusion about who acknowledges and clears them?
Once user enters pcie_ports=native, all bets are off: you broke the
contract you have with the FW -- whether or not you have this patch.
> Out of curiosity, what is your use case for "pcie_ports=native"?
> Presumably there's something that works better when using it, and
> things work even *better* with this patch?
Corectness. It bothers me that actual behavior does not match the
documentation:
native Use native PCIe services associated with PCIe ports
unconditionally.
> I know people do use it, because I often see it mentioned in forums
> and bug reports, but I really don't expect it to work very well
> because we're ignoring the usage model the firmware is designed
> around. My unproven suspicion is that most uses are in the black
> magic category of "there's a bug here, and we don't know how to fix
> it, but pcie_ports=native makes it work better".
There exist cases that firmware didn't consider. I would not call them
"firmware bugs", but there are cases where the user understands the
platform better than firmware.
Example: on certain PCIe switches, a hardware PCIe error may bring the
switch downstream ports into a state where they stop notifying hotplug
events. Depending on the platform, firmware may or may not fix this
condition, but "pcie_ports=native" enables DPC. DPC contains the error
without the switch downstream port entering the weird error state in the
first place.
All bets are off at this point.
> Obviously I would much rather find and fix those bugs so people
> wouldn't have to stumble over the problem in the first place.
Using native services when firmware asks us not to is a crapshoot every
time. I don't condone the use of this feature, but if we do get a
pcie_ports=native request, we should at least honor it.
>> This happens because aer_acpi_firmware_first() doesn't take
>> 'pcie_ports' into account. This is wrong. DPC uses the same logic when
>> it decides whether to load or not, so fixing this also fixes DPC not
>> loading.
>>
>> Signed-off-by: Alexandru Gagniuc <[email protected]>
>> ---
>> drivers/pci/pcie/aer.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> Changes since v1:
>> - Re-tested with latest and greatest (v4.18-rc1) -- works great
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index a2e88386af28..98ced0f7c850 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -291,7 +291,7 @@ static void aer_set_firmware_first(struct pci_dev *pci_dev)
>>
>> rc = apei_hest_parse(aer_hest_parse, &info);
>>
>> - if (rc)
>> + if (rc || pcie_ports_native)
>> pci_dev->__aer_firmware_first = 0;
>> else
>> pci_dev->__aer_firmware_first = info.firmware_first;
>> @@ -327,6 +327,9 @@ bool aer_acpi_firmware_first(void)
>> apei_hest_parse(aer_hest_parse, &info);
>> aer_firmware_first = info.firmware_first;
>> parsed = true;
>> + if (pcie_ports_native)
>> + aer_firmware_first = 0;
>> +
>> }
>> return aer_firmware_first;
>> }
>> --
>> 2.14.3
>>
On Sat, Jun 30, 2018 at 11:39:00PM -0500, Alex G wrote:
> On 06/30/2018 04:31 PM, Bjorn Helgaas wrote:
> > [+cc Borislav, linux-acpi, since this involves APEI/HEST]
>
> Borislav is not the relevant maintainer here, since we're not contingent on
> APEI handling. I think Keith has a lot more experience with this part of the
> kernel.
Thanks for adding Keith.
> > On Tue, Jun 19, 2018 at 02:58:20PM -0500, Alexandru Gagniuc wrote:
> > > According to the documentation, "pcie_ports=native", linux should use
> > > native AER and DPC services. While that is true for the _OSC method
> > > parsing, this is not the only place that is checked. Should the HEST
> > > table list PCIe ports as firmware-first, linux will not use native
> > > services.
> >
> > Nothing in ACPI-land looks at pcie_ports_native. How should ACPI
> > things work in the "pcie_ports=native" case? I guess we still have to
> > expect to receive error records from the firmware, because it may
> > certainly send us non-PCI errors (machine checks, etc) and maybe even
> > some PCI errors (even if the Linux AER driver claims AER interrupts,
> > we don't know what notification mechanisms the firmware may be using).
>
> I think ACPI land shouldn't care about this. We care about it from the PCIe
> stand point at the interface with ACPI. FW might see a delta in the sense
> that we request control of some features via _OSC, which we otherwise would
> not do without pcie_ports=native.
>
> > I guess best-case, we'll get ACPI error records for all non-PCI
> > things, and the Linux AER driver will see all the AER errors.
>
> It might affect FW's ability to catch errors, but that's dependent on the
> root port implementation.
>
> > Worst-case, I don't really know what to expect. Duplicate reporting
> > of AER errors via firmware and Linux AER driver? Some kind of
> > confusion about who acknowledges and clears them?
>
> Once user enters pcie_ports=native, all bets are off: you broke the contract
> you have with the FW -- whether or not you have this patch.
>
> > Out of curiosity, what is your use case for "pcie_ports=native"?
> > Presumably there's something that works better when using it, and
> > things work even *better* with this patch?
>
> Corectness. It bothers me that actual behavior does not match the
> documentation:
>
> native Use native PCIe services associated with PCIe ports
> unconditionally.
>
>
> > I know people do use it, because I often see it mentioned in forums
> > and bug reports, but I really don't expect it to work very well
> > because we're ignoring the usage model the firmware is designed
> > around. My unproven suspicion is that most uses are in the black
> > magic category of "there's a bug here, and we don't know how to fix
> > it, but pcie_ports=native makes it work better".
>
> There exist cases that firmware didn't consider. I would not call them
> "firmware bugs", but there are cases where the user understands the platform
> better than firmware.
> Example: on certain PCIe switches, a hardware PCIe error may bring the
> switch downstream ports into a state where they stop notifying hotplug
> events. Depending on the platform, firmware may or may not fix this
> condition, but "pcie_ports=native" enables DPC. DPC contains the error
> without the switch downstream port entering the weird error state in the
> first place.
>
> All bets are off at this point.
If a user needs "pcie_ports=native", I claim that's a user experience
problem, and the underlying cause is a hardware, firmware, or OS
defect.
I have no doubt the situation you describe is real, but this doesn't
make any progress toward resolving the user experience problem. In
fact, it propagates the folklore that using "pcie_ports=native" is an
appropriate final solution. It's fine as a temporary workaround while
we figure out a better solution, but we need some mechanism for
analyzing the problem and eventually removing the need to use
"pcie_ports=native".
I have a minor comment on the patch, but I think it makes sense. This
might be a good time to resurrect Prarit's "taint-on-pci-parameters"
patch. If somebody uses "pcie_ports=native", I think it makes sense
to taint the kernel both because (1) we broke the contract with the
firmware and we don't really know what to expect, and (2) it's an
opportunity to encourage the user to raise a bug report.
Bjorn
On Tue, Jun 19, 2018 at 02:58:20PM -0500, Alexandru Gagniuc wrote:
> According to the documentation, "pcie_ports=native", linux should use
> native AER and DPC services. While that is true for the _OSC method
> parsing, this is not the only place that is checked. Should the HEST
> table list PCIe ports as firmware-first, linux will not use native
> services.
>
> This happens because aer_acpi_firmware_first() doesn't take
> 'pcie_ports' into account. This is wrong. DPC uses the same logic when
> it decides whether to load or not, so fixing this also fixes DPC not
> loading.
>
> Signed-off-by: Alexandru Gagniuc <[email protected]>
> ---
> drivers/pci/pcie/aer.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> Changes since v1:
> - Re-tested with latest and greatest (v4.18-rc1) -- works great
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a2e88386af28..98ced0f7c850 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -291,7 +291,7 @@ static void aer_set_firmware_first(struct pci_dev *pci_dev)
>
> rc = apei_hest_parse(aer_hest_parse, &info);
>
> - if (rc)
> + if (rc || pcie_ports_native)
> pci_dev->__aer_firmware_first = 0;
> else
> pci_dev->__aer_firmware_first = info.firmware_first;
> @@ -327,6 +327,9 @@ bool aer_acpi_firmware_first(void)
> apei_hest_parse(aer_hest_parse, &info);
> aer_firmware_first = info.firmware_first;
> parsed = true;
> + if (pcie_ports_native)
> + aer_firmware_first = 0;
Trivial comment for both of these hunks: if we test for
pcie_ports_native *first*, we won't have to run apei_hest_parse().
It's fairly obvious that we ignore the result of apei_hest_parse()
anyway, but I think it's a courtesy to the reader if we don't run it
at all, so we don't have to worry about side effects.
> +
> }
> return aer_firmware_first;
> }
> --
> 2.14.3
>
On 07/02/2018 08:16 AM, Bjorn Helgaas wrote:
> On Sat, Jun 30, 2018 at 11:39:00PM -0500, Alex G wrote:
>> On 06/30/2018 04:31 PM, Bjorn Helgaas wrote:
>>> [+cc Borislav, linux-acpi, since this involves APEI/HEST]
>>
>> Borislav is not the relevant maintainer here, since we're not contingent on
>> APEI handling. I think Keith has a lot more experience with this part of the
>> kernel.
>
> Thanks for adding Keith.
>
>>> On Tue, Jun 19, 2018 at 02:58:20PM -0500, Alexandru Gagniuc wrote:
>>>> According to the documentation, "pcie_ports=native", linux should use
>>>> native AER and DPC services. While that is true for the _OSC method
>>>> parsing, this is not the only place that is checked. Should the HEST
>>>> table list PCIe ports as firmware-first, linux will not use native
>>>> services.
>>>
>>> Nothing in ACPI-land looks at pcie_ports_native. How should ACPI
>>> things work in the "pcie_ports=native" case? I guess we still have to
>>> expect to receive error records from the firmware, because it may
>>> certainly send us non-PCI errors (machine checks, etc) and maybe even
>>> some PCI errors (even if the Linux AER driver claims AER interrupts,
>>> we don't know what notification mechanisms the firmware may be using).
>>
>> I think ACPI land shouldn't care about this. We care about it from the PCIe
>> stand point at the interface with ACPI. FW might see a delta in the sense
>> that we request control of some features via _OSC, which we otherwise would
>> not do without pcie_ports=native.
>>
>>> I guess best-case, we'll get ACPI error records for all non-PCI
>>> things, and the Linux AER driver will see all the AER errors.
>>
>> It might affect FW's ability to catch errors, but that's dependent on the
>> root port implementation.
>>
>>> Worst-case, I don't really know what to expect. Duplicate reporting
>>> of AER errors via firmware and Linux AER driver? Some kind of
>>> confusion about who acknowledges and clears them?
>>
>> Once user enters pcie_ports=native, all bets are off: you broke the contract
>> you have with the FW -- whether or not you have this patch.
>>
>>> Out of curiosity, what is your use case for "pcie_ports=native"?
>>> Presumably there's something that works better when using it, and
>>> things work even *better* with this patch?
>>
>> Corectness. It bothers me that actual behavior does not match the
>> documentation:
>>
>> native Use native PCIe services associated with PCIe ports
>> unconditionally.
>>
>>
>>> I know people do use it, because I often see it mentioned in forums
>>> and bug reports, but I really don't expect it to work very well
>>> because we're ignoring the usage model the firmware is designed
>>> around. My unproven suspicion is that most uses are in the black
>>> magic category of "there's a bug here, and we don't know how to fix
>>> it, but pcie_ports=native makes it work better".
>>
>> There exist cases that firmware didn't consider. I would not call them
>> "firmware bugs", but there are cases where the user understands the platform
>> better than firmware.
>> Example: on certain PCIe switches, a hardware PCIe error may bring the
>> switch downstream ports into a state where they stop notifying hotplug
>> events. Depending on the platform, firmware may or may not fix this
>> condition, but "pcie_ports=native" enables DPC. DPC contains the error
>> without the switch downstream port entering the weird error state in the
>> first place.
>>
>> All bets are off at this point.
>
> If a user needs "pcie_ports=native", I claim that's a user experience
> problem, and the underlying cause is a hardware, firmware, or OS
> defect.
>
> I have no doubt the situation you describe is real, but this doesn't
> make any progress toward resolving the user experience problem. In
> fact, it propagates the folklore that using "pcie_ports=native" is an
> appropriate final solution. It's fine as a temporary workaround while
> we figure out a better solution, but we need some mechanism for
> analyzing the problem and eventually removing the need to use
> "pcie_ports=native".
Speaking of user experience, I'd argue that it's a horrible experience
for the kernel to _not_ do what it is asked.
I'm going to go fix the little comment about the patch. I had the same
dilemma when I wrote it, but didn't find it too noteworthy. It makes
more sense now that you mentioned it.
Alex
> I have a minor comment on the patch, but I think it makes sense. This
> might be a good time to resurrect Prarit's "taint-on-pci-parameters"
> patch. If somebody uses "pcie_ports=native", I think it makes sense
> to taint the kernel both because (1) we broke the contract with the
> firmware and we don't really know what to expect, and (2) it's an
> opportunity to encourage the user to raise a bug report.
>
> Bjorn
>
According to the documentation, "pcie_ports=native", linux should use
native AER and DPC services. While that is true for the _OSC method
parsing, this is not the only place that is checked. Should the HEST
table list PCIe ports as firmware-first, linux will not use native
services.
This happens because aer_acpi_firmware_first() doesn't take
'pcie_ports' into account. This is wrong. DPC uses the same logic when
it decides whether to load or not, so fixing this also fixes DPC not
loading.
Signed-off-by: Alexandru Gagniuc <[email protected]>
---
drivers/pci/pcie/aer.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e88386af28..db2c01056dc7 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -283,13 +283,14 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
static void aer_set_firmware_first(struct pci_dev *pci_dev)
{
- int rc;
+ int rc = 0;
struct aer_hest_parse_info info = {
.pci_dev = pci_dev,
.firmware_first = 0,
};
- rc = apei_hest_parse(aer_hest_parse, &info);
+ if (!pcie_ports_native)
+ rc = apei_hest_parse(aer_hest_parse, &info);
if (rc)
pci_dev->__aer_firmware_first = 0;
@@ -324,7 +325,9 @@ bool aer_acpi_firmware_first(void)
};
if (!parsed) {
- apei_hest_parse(aer_hest_parse, &info);
+ if (!pcie_ports_native)
+ apei_hest_parse(aer_hest_parse, &info);
+
aer_firmware_first = info.firmware_first;
parsed = true;
}
--
2.14.3
On Mon, Jul 02, 2018 at 11:16:01AM -0500, Alexandru Gagniuc wrote:
> According to the documentation, "pcie_ports=native", linux should use
> native AER and DPC services. While that is true for the _OSC method
> parsing, this is not the only place that is checked. Should the HEST
> table list PCIe ports as firmware-first, linux will not use native
> services.
>
> This happens because aer_acpi_firmware_first() doesn't take
> 'pcie_ports' into account. This is wrong. DPC uses the same logic when
> it decides whether to load or not, so fixing this also fixes DPC not
> loading.
>
> Signed-off-by: Alexandru Gagniuc <[email protected]>
> ---
> drivers/pci/pcie/aer.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a2e88386af28..db2c01056dc7 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -283,13 +283,14 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
>
> static void aer_set_firmware_first(struct pci_dev *pci_dev)
> {
> - int rc;
> + int rc = 0;
> struct aer_hest_parse_info info = {
> .pci_dev = pci_dev,
> .firmware_first = 0,
> };
>
> - rc = apei_hest_parse(aer_hest_parse, &info);
> + if (!pcie_ports_native)
> + rc = apei_hest_parse(aer_hest_parse, &info);
>
> if (rc)
> pci_dev->__aer_firmware_first = 0;
> @@ -324,7 +325,9 @@ bool aer_acpi_firmware_first(void)
> };
>
> if (!parsed) {
> - apei_hest_parse(aer_hest_parse, &info);
> + if (!pcie_ports_native)
> + apei_hest_parse(aer_hest_parse, &info);
> +
> aer_firmware_first = info.firmware_first;
> parsed = true;
> }
I was thinking something along the lines of the patch below, so we
don't have to work through the settings of "rc" and "info". But maybe
I'm missing something subtle?
One subtle thing that I didn't look into is the
pcie_aer_get_firmware_first() stub for the non-CONFIG_ACPI_APEI case.
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index b24f2d252180..5ccbd7635f33 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -342,6 +342,9 @@ int pcie_aer_get_firmware_first(struct pci_dev *dev)
if (!pci_is_pcie(dev))
return 0;
+ if (pcie_ports_native)
+ return 0;
+
if (!dev->__aer_firmware_first_valid)
aer_set_firmware_first(dev);
return dev->__aer_firmware_first;
@@ -362,6 +365,9 @@ bool aer_acpi_firmware_first(void)
.firmware_first = 0,
};
+ if (pcie_ports_native)
+ return false;
+
if (!parsed) {
apei_hest_parse(aer_hest_parse, &info);
aer_firmware_first = info.firmware_first;
On 07/03/2018 11:38 AM, Bjorn Helgaas wrote:
> On Mon, Jul 02, 2018 at 11:16:01AM -0500, Alexandru Gagniuc wrote:
>> According to the documentation, "pcie_ports=native", linux should use
>> native AER and DPC services. While that is true for the _OSC method
>> parsing, this is not the only place that is checked. Should the HEST
>> table list PCIe ports as firmware-first, linux will not use native
>> services.
>>
>> This happens because aer_acpi_firmware_first() doesn't take
>> 'pcie_ports' into account. This is wrong. DPC uses the same logic when
>> it decides whether to load or not, so fixing this also fixes DPC not
>> loading.
>>
>> Signed-off-by: Alexandru Gagniuc <[email protected]>
>> ---
>> drivers/pci/pcie/aer.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index a2e88386af28..db2c01056dc7 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -283,13 +283,14 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data)
>>
>> static void aer_set_firmware_first(struct pci_dev *pci_dev)
>> {
>> - int rc;
>> + int rc = 0;
>> struct aer_hest_parse_info info = {
>> .pci_dev = pci_dev,
>> .firmware_first = 0,
>> };
>>
>> - rc = apei_hest_parse(aer_hest_parse, &info);
>> + if (!pcie_ports_native)
>> + rc = apei_hest_parse(aer_hest_parse, &info);
>>
>> if (rc)
>> pci_dev->__aer_firmware_first = 0;
>> @@ -324,7 +325,9 @@ bool aer_acpi_firmware_first(void)
>> };
>>
>> if (!parsed) {
>> - apei_hest_parse(aer_hest_parse, &info);
>> + if (!pcie_ports_native)
>> + apei_hest_parse(aer_hest_parse, &info);
>> +
>> aer_firmware_first = info.firmware_first;
>> parsed = true;
>> }
>
> I was thinking something along the lines of the patch below, so we
> don't have to work through the settings of "rc" and "info". But maybe
> I'm missing something subtle?
>
> One subtle thing that I didn't look into is the
> pcie_aer_get_firmware_first() stub for the non-CONFIG_ACPI_APEI case.
>
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index b24f2d252180..5ccbd7635f33 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -342,6 +342,9 @@ int pcie_aer_get_firmware_first(struct pci_dev *dev)
> if (!pci_is_pcie(dev))
> return 0;
>
> + if (pcie_ports_native)
> + return 0;
> +
Here we're not setting dev->__aer_firmware_first_valid. Assuming we only
check for it via pcie_aer_get_firmware_first(), we should be fine. THis
field is used in portdrv.h, but its use seems safe. I think this
approach can work.
> if (!dev->__aer_firmware_first_valid)
> aer_set_firmware_first(dev);
> return dev->__aer_firmware_first;
> @@ -362,6 +365,9 @@ bool aer_acpi_firmware_first(void)
> .firmware_first = 0,
> };
>
> + if (pcie_ports_native)
> + return false;
> +
This makes better sense.
> if (!parsed) {
> apei_hest_parse(aer_hest_parse, &info);
> aer_firmware_first = info.firmware_first;
>
According to the documentation, "pcie_ports=native", linux should use
native AER and DPC services. While that is true for the _OSC method
parsing, this is not the only place that is checked. Should the HEST
table list PCIe ports as firmware-first, linux will not use native
services.
This happens because aer_acpi_firmware_first() doesn't take
'pcie_ports' into account. This is wrong. DPC uses the same logic when
it decides whether to load or not, so fixing this also fixes DPC not
loading.
Signed-off-by: Alexandru Gagniuc <[email protected]>
---
drivers/pci/pcie/aer.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e88386af28..63a66a3b2fb5 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -303,6 +303,9 @@ int pcie_aer_get_firmware_first(struct pci_dev *dev)
if (!pci_is_pcie(dev))
return 0;
+ if (pcie_ports_native)
+ return 0;
+
if (!dev->__aer_firmware_first_valid)
aer_set_firmware_first(dev);
return dev->__aer_firmware_first;
@@ -323,6 +326,9 @@ bool aer_acpi_firmware_first(void)
.firmware_first = 0,
};
+ if (pcie_ports_native)
+ return 0;
+
if (!parsed) {
apei_hest_parse(aer_hest_parse, &info);
aer_firmware_first = info.firmware_first;
--
2.14.3
On Tue, Jul 03, 2018 at 06:27:43PM -0500, Alexandru Gagniuc wrote:
> According to the documentation, "pcie_ports=native", linux should use
> native AER and DPC services. While that is true for the _OSC method
> parsing, this is not the only place that is checked. Should the HEST
> table list PCIe ports as firmware-first, linux will not use native
> services.
>
> This happens because aer_acpi_firmware_first() doesn't take
> 'pcie_ports' into account. This is wrong. DPC uses the same logic when
> it decides whether to load or not, so fixing this also fixes DPC not
> loading.
>
> Signed-off-by: Alexandru Gagniuc <[email protected]>
Applied to pci/aer for v4.19, thanks!
> ---
> drivers/pci/pcie/aer.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a2e88386af28..63a66a3b2fb5 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -303,6 +303,9 @@ int pcie_aer_get_firmware_first(struct pci_dev *dev)
> if (!pci_is_pcie(dev))
> return 0;
>
> + if (pcie_ports_native)
> + return 0;
> +
> if (!dev->__aer_firmware_first_valid)
> aer_set_firmware_first(dev);
> return dev->__aer_firmware_first;
> @@ -323,6 +326,9 @@ bool aer_acpi_firmware_first(void)
> .firmware_first = 0,
> };
>
> + if (pcie_ports_native)
> + return 0;
> +
> if (!parsed) {
> apei_hest_parse(aer_hest_parse, &info);
> aer_firmware_first = info.firmware_first;
> --
> 2.14.3
>