2018-07-18 19:46:33

by Bjorn Helgaas

Subject: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL

This is a v3 of Oza's patches [1]. It's available at [2] if you prefer
git.

v3 changes:
  - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
    from pcie_do_fatal_recovery(). Moved to first in series to avoid a
    window where ERR_FATAL recovery only clears ERR_NONFATAL bits. Visible
    only inside the PCI core.
  - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
    things based on dev->error_state, use this only for ERR_NONFATAL bits.
    I didn't change the name because it's used by many drivers.
  - Rename pci_cleanup_aer_error_device_status() to
    pci_aer_clear_device_status(), make it void, and make it visible only
    inside the PCI core.
  - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
    it a stub function. Possibly pcie_portdrv_err_handler could be removed
    completely?

[1] https://lkml.kernel.org/r/[email protected]
[2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer

---

Bjorn Helgaas (1):
  PCI/AER: Clear only ERR_FATAL status bits during fatal recovery

Oza Pawandeep (6):
  PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
  PCI/AER: Factor out ERR_NONFATAL status bit clearing
  PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
  PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
  PCI/AER: Clear device status bits during ERR_COR handling
  PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset


 drivers/pci/pci.h              |  5 ++++
 drivers/pci/pcie/aer.c         | 47 +++++++++++++++++++++++++++-------------
 drivers/pci/pcie/err.c         | 15 +++++--------
 drivers/pci/pcie/portdrv_pci.c | 25 ---------------------
 4 files changed, 43 insertions(+), 49 deletions(-)


2018-07-18 19:46:45

by Bjorn Helgaas

Subject: [PATCH v3 1/7] PCI/AER: Clear only ERR_FATAL status bits during fatal recovery

From: Bjorn Helgaas <[email protected]>

During recovery from fatal errors, we previously called
pci_cleanup_aer_uncorrect_error_status(), which cleared *all* uncorrectable
error status bits (both ERR_FATAL and ERR_NONFATAL).

Instead, call a new pci_aer_clear_fatal_status() that clears only the
ERR_FATAL bits (as indicated by the PCI_ERR_UNCOR_SEVER register).

Based-on-patch-by: Oza Pawandeep <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
---
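Note (not part of the commit): the masking can be seen in isolation with a
small userspace model. This is only a sketch. It assumes
PCI_ERR_UNCOR_STATUS is write-1-to-clear and that a set bit in
PCI_ERR_UNCOR_SEVER marks the corresponding error as fatal. struct fake_aer
and the fake_* helpers are hypothetical stand-ins for the config space
accessors, not kernel APIs.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device's AER registers (not a kernel type). */
struct fake_aer {
	uint32_t uncor_status;	/* models PCI_ERR_UNCOR_STATUS (write-1-to-clear) */
	uint32_t uncor_sever;	/* models PCI_ERR_UNCOR_SEVER (1 = fatal) */
};

/* Writing 1 to a status bit clears it; writing 0 leaves it alone. */
static void fake_write_status(struct fake_aer *aer, uint32_t val)
{
	aer->uncor_status &= ~val;
}

/* Same masking as pci_aer_clear_fatal_status(): touch only fatal bits. */
static void fake_clear_fatal_status(struct fake_aer *aer)
{
	uint32_t status = aer->uncor_status & aer->uncor_sever;

	if (status)
		fake_write_status(aer, status);
}
```

With status 0x00041010 and severity 0x00040010, only bit 12 (0x00001000),
which is non-fatal in this configuration, is still pending afterwards.
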
drivers/pci/pci.h | 4 ++++
drivers/pci/pcie/aer.c | 17 +++++++++++++++++
drivers/pci/pcie/err.c | 2 +-
3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a07f3f..12fd2ac95843 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -452,4 +452,8 @@ static inline int devm_of_pci_get_host_bridge_resources(struct device *dev,
}
#endif

+#ifdef CONFIG_PCIEAER
+void pci_aer_clear_fatal_status(struct pci_dev *dev);
+#endif
+
#endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e88386af28..5b4a84e3d360 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -374,6 +374,23 @@ int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_cleanup_aer_uncorrect_error_status);

+void pci_aer_clear_fatal_status(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, sev;
+
+	pos = dev->aer_cap;
+	if (!pos)
+		return;
+
+	/* Clear status bits for ERR_FATAL errors only */
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &sev);
+	status &= sev;
+	if (status)
+		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
 int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
 {
 	int pos;
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f7ce0cb0b0b7..0539518f9861 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -316,7 +316,7 @@ void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 		 * do error recovery on all subordinates of the bridge instead
 		 * of the bridge and clear the error status of the bridge.
 		 */
-		pci_cleanup_aer_uncorrect_error_status(dev);
+		pci_aer_clear_fatal_status(dev);
 	}

 	if (result == PCI_ERS_RESULT_RECOVERED) {


2018-07-18 19:46:58

by Bjorn Helgaas

Subject: [PATCH v3 2/7] PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery

From: Oza Pawandeep <[email protected]>

pci_cleanup_aer_uncorrect_error_status() is called by driver .slot_reset()
methods when handling ERR_NONFATAL errors. Previously this cleared *all*
the bits, including ERR_FATAL bits.

Since we're only handling ERR_NONFATAL errors, clear only the ERR_NONFATAL
error status bits.

Signed-off-by: Oza Pawandeep <[email protected]>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
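Note (not part of the commit): a worked example of the two masks may help.
The sketch below assumes a set bit in PCI_ERR_UNCOR_SEVER means "fatal";
fatal_bits() and nonfatal_bits() are hypothetical helper names used only to
show that the fatal and non-fatal paths write back disjoint sets of bits
that together cover everything pending.

```c
#include <stdint.h>

/* Bits that fatal recovery would write back (severity bit set). */
static uint32_t fatal_bits(uint32_t status, uint32_t sev)
{
	return status & sev;
}

/* Bits that non-fatal recovery writes back (severity bit clear). */
static uint32_t nonfatal_bits(uint32_t status, uint32_t sev)
{
	return status & ~sev;
}
```

For status 0x00041010 and severity 0x00040010, nonfatal_bits() yields
0x00001000 and fatal_bits() yields 0x00040010, so neither path clears the
other's bits.
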
drivers/pci/pcie/aer.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 5b4a84e3d360..6f0f131b5e6a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -360,13 +360,16 @@ EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 {
 	int pos;
-	u32 status;
+	u32 status, sev;

 	pos = dev->aer_cap;
 	if (!pos)
 		return -EIO;

+	/* Clear status bits for ERR_NONFATAL errors only */
 	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &sev);
+	status &= ~sev;
 	if (status)
 		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);



2018-07-18 19:47:21

by Bjorn Helgaas

Subject: [PATCH v3 3/7] PCI/AER: Factor out ERR_NONFATAL status bit clearing

From: Oza Pawandeep <[email protected]>

aer_error_resume() clears all ERR_NONFATAL error status bits. This is
exactly what pci_cleanup_aer_uncorrect_error_status() does, so use that
instead of duplicating the code.

Signed-off-by: Oza Pawandeep <[email protected]>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pcie/aer.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 6f0f131b5e6a..b8972fe85043 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1356,20 +1356,13 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
*/
 static void aer_error_resume(struct pci_dev *dev)
 {
-	int pos;
-	u32 status, mask;
 	u16 reg16;

 	/* Clean up Root device status */
 	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
 	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, reg16);

-	/* Clean AER Root Error Status */
-	pos = dev->aer_cap;
-	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
-	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
-	status &= ~mask; /* Clear corresponding nonfatal bits */
-	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+	pci_cleanup_aer_uncorrect_error_status(dev);
 }

 static struct pcie_port_service_driver aerdriver = {


2018-07-18 19:49:13

by Bjorn Helgaas

Subject: [PATCH v3 6/7] PCI/AER: Clear device status bits during ERR_COR handling

From: Oza Pawandeep <[email protected]>

In case of correctable error, the Correctable Error Detected bit in the
Device Status register is set. Clear it after handling the error.

Signed-off-by: Oza Pawandeep <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pcie/aer.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index dc67f52b002f..2accfd7a4c9d 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -827,6 +827,7 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
 		if (pos)
 			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
 					info->status);
+		pci_aer_clear_device_status(dev);
 	} else if (info->severity == AER_NONFATAL)
 		pcie_do_nonfatal_recovery(dev);
 	else if (info->severity == AER_FATAL)


2018-07-18 19:49:59

by Bjorn Helgaas

Subject: [PATCH v3 4/7] PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path

From: Oza Pawandeep <[email protected]>

broadcast_error_message() is only used for ERR_NONFATAL events, when the
state is always pci_channel_io_normal, so remove the unused alternate path.

Signed-off-by: Oza Pawandeep <[email protected]>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pcie/err.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 0539518f9861..638eda5c1d79 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -259,15 +259,10 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 		/*
 		 * If the error is reported by an end point, we think this
 		 * error is related to the upstream link of the end point.
+		 * The error is non fatal so the bus is ok; just invoke
+		 * the callback for the function that logged the error.
 		 */
-		if (state == pci_channel_io_normal)
-			/*
-			 * the error is non fatal so the bus is ok, just invoke
-			 * the callback for the function that logged the error.
-			 */
-			cb(dev, &result_data);
-		else
-			pci_walk_bus(dev->bus, cb, &result_data);
+		cb(dev, &result_data);
 	}

 	return result_data.result;


2018-07-18 19:50:07

by Bjorn Helgaas

Subject: [PATCH v3 5/7] PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL

From: Oza Pawandeep <[email protected]>

Clear the device status bits while handling both ERR_FATAL and ERR_NONFATAL
cases.

Signed-off-by: Oza Pawandeep <[email protected]>
[bhelgaas: rename to pci_aer_clear_device_status(), declare internal to PCI
core instead of exposing it everywhere]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
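Note (not part of the commit): the read-then-write-back idiom works because
the error bits in the Device Status register are write-1-to-clear. The
sketch below is a userspace model under that assumption. struct fake_dev,
the fake_* helpers and the DEMO_* macros are hypothetical; the DEMO_* values
follow the Device Status bit layout as I understand it and are local copies,
not the kernel's definitions.

```c
#include <stdint.h>

/* Illustrative local copies of the Device Status bit positions. */
#define DEMO_DEVSTA_CED		0x0001	/* Correctable Error Detected */
#define DEMO_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
#define DEMO_DEVSTA_FED		0x0004	/* Fatal Error Detected */
#define DEMO_DEVSTA_URD		0x0008	/* Unsupported Request Detected */
#define DEMO_DEVSTA_TRPND	0x0020	/* Transactions Pending (read-only) */

#define DEMO_DEVSTA_RW1C	(DEMO_DEVSTA_CED | DEMO_DEVSTA_NFED | \
				 DEMO_DEVSTA_FED | DEMO_DEVSTA_URD)

/* Hypothetical device model holding only the Device Status value. */
struct fake_dev {
	uint16_t devsta;	/* models PCI_EXP_DEVSTA */
};

static uint16_t fake_read_devsta(const struct fake_dev *dev)
{
	return dev->devsta;
}

/* Only the four error bits react to writes; writing 1 clears the bit. */
static void fake_write_devsta(struct fake_dev *dev, uint16_t val)
{
	dev->devsta &= ~(val & DEMO_DEVSTA_RW1C);
}

/* Same read-then-write-back pattern as pci_aer_clear_device_status(). */
static void fake_clear_device_status(struct fake_dev *dev)
{
	uint16_t sta = fake_read_devsta(dev);

	fake_write_devsta(dev, sta);
}
```

Writing back exactly what was read clears every error bit that was set at
the time of the read and leaves read-only bits such as Transactions Pending
untouched.
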
drivers/pci/pci.h | 1 +
drivers/pci/pcie/aer.c | 15 +++++++++------
drivers/pci/pcie/err.c | 2 ++
3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 12fd2ac95843..fc4978df7caf 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -454,6 +454,7 @@ static inline int devm_of_pci_get_host_bridge_resources(struct device *dev,

#ifdef CONFIG_PCIEAER
void pci_aer_clear_fatal_status(struct pci_dev *dev);
+void pci_aer_clear_device_status(struct pci_dev *dev);
#endif

#endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index b8972fe85043..dc67f52b002f 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -357,6 +357,14 @@ int pci_disable_pcie_error_reporting(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);

+void pci_aer_clear_device_status(struct pci_dev *dev)
+{
+	u16 sta;
+
+	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
+	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
+}
+
 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 {
 	int pos;
@@ -1356,12 +1364,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
  */
 static void aer_error_resume(struct pci_dev *dev)
 {
-	u16 reg16;
-
-	/* Clean up Root device status */
-	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
-	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, reg16);
-
+	pci_aer_clear_device_status(dev);
 	pci_cleanup_aer_uncorrect_error_status(dev);
 }

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 638eda5c1d79..fdbcc555860d 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -252,6 +252,7 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 			dev->error_state = state;
 		pci_walk_bus(dev->subordinate, cb, &result_data);
 		if (cb == report_resume) {
+			pci_aer_clear_device_status(dev);
 			pci_cleanup_aer_uncorrect_error_status(dev);
 			dev->error_state = pci_channel_io_normal;
 		}
@@ -312,6 +313,7 @@ void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 		 * of the bridge and clear the error status of the bridge.
 		 */
 		pci_aer_clear_fatal_status(dev);
+		pci_aer_clear_device_status(dev);
 	}

 	if (result == PCI_ERS_RESULT_RECOVERED) {


2018-07-18 19:51:08

by Bjorn Helgaas

Subject: [PATCH v3 7/7] PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset

From: Oza Pawandeep <[email protected]>

The pci_error_handlers.slot_reset() callback is only used for non-bridge
devices (see broadcast_error_message()). Since portdrv only binds to
bridges, we don't need pcie_portdrv_slot_reset(), so remove it.

Signed-off-by: Oza Pawandeep <[email protected]>
[bhelgaas: changelog, remove pcie_portdrv_slot_reset() completely]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pcie/portdrv_pci.c | 25 -------------------------
1 file changed, 25 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 973f1b80a038..b78840f54a9b 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -42,17 +42,6 @@ __setup("pcie_ports=", pcie_port_setup);

 /* global data */

-static int pcie_portdrv_restore_config(struct pci_dev *dev)
-{
-	int retval;
-
-	retval = pci_enable_device(dev);
-	if (retval)
-		return retval;
-	pci_set_master(dev);
-	return 0;
-}
-
 #ifdef CONFIG_PM
 static int pcie_port_runtime_suspend(struct device *dev)
 {
@@ -160,19 +149,6 @@ static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
 	return PCI_ERS_RESULT_RECOVERED;
 }

-static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
-{
-	/* If fatal, restore cfg space for possible link reset at upstream */
-	if (dev->error_state == pci_channel_io_frozen) {
-		dev->state_saved = true;
-		pci_restore_state(dev);
-		pcie_portdrv_restore_config(dev);
-		pci_enable_pcie_error_reporting(dev);
-	}
-
-	return PCI_ERS_RESULT_RECOVERED;
-}
-
 static int resume_iter(struct device *device, void *data)
 {
 	struct pcie_device *pcie_device;
@@ -208,7 +184,6 @@ static const struct pci_device_id port_pci_ids[] = { {
 static const struct pci_error_handlers pcie_portdrv_err_handler = {
 	.error_detected = pcie_portdrv_error_detected,
 	.mmio_enabled = pcie_portdrv_mmio_enabled,
-	.slot_reset = pcie_portdrv_slot_reset,
 	.resume = pcie_portdrv_err_resume,
 };



2018-07-19 03:55:06

by Oza Pawandeep

Subject: Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL

On 2018-07-19 01:14, Bjorn Helgaas wrote:
> This is a v3 of Oza's patches [1]. It's available at [2] if you prefer
> git.
> [...]

Looks good to me.
Thanks for the corrections.
There are some x86 compilation errors; do you want me to fix them and
push a v4?

Regards,
Oza.


2018-07-19 15:58:12

by Oza Pawandeep

Subject: Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL

On 2018-07-19 01:14, Bjorn Helgaas wrote:
> This is a v3 of Oza's patches [1]. It's available at [2] if you prefer
> git.
> [...]


Hi Bjorn,

I am planning some follow-up work after this series.

You wrote:

> 1) I don't think the driver slot_reset callbacks should be responsible
> for clearing these AER status bits. Can we clear them somewhere in
> the pcie_do_nonfatal_recovery() path and remove these calls from the
> drivers?

Oza: We can do the following. In broadcast_error_message(), the

    if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {

branch should call

    pci_walk_bus(dev->subordinate,
                 pci_cleanup_aer_uncorrect_error_status, NULL);

Then we update all the drivers to remove their calls to
pci_cleanup_aer_uncorrect_error_status().
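
Note: pci_walk_bus() takes a callback of type
int (*)(struct pci_dev *, void *), while
pci_cleanup_aer_uncorrect_error_status() takes only the device, so the
suggestion above would probably need a small adapter. The following is a
userspace sketch of that idea, not kernel code: the struct definitions and
the flat walker are simplified stand-ins (the real pci_walk_bus() also
descends into subordinate buses), and clear_nonfatal_status_cb() is a
hypothetical name.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only what the sketch needs. */
struct pci_dev {
	unsigned int uncor_status;	/* pretend non-fatal status bits */
	struct pci_dev *next;		/* next device on the same bus */
};

struct pci_bus {
	struct pci_dev *devices;
};

/* Same shape as the helper in this series: takes only the device. */
static int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
{
	dev->uncor_status = 0;
	return 0;
}

/* Simplified, flat walker with the callback type pci_walk_bus() expects. */
static void pci_walk_bus(struct pci_bus *top,
			 int (*cb)(struct pci_dev *, void *), void *userdata)
{
	struct pci_dev *dev;

	for (dev = top->devices; dev; dev = dev->next)
		if (cb(dev, userdata))
			break;	/* a non-zero return stops the walk */
}

/* Hypothetical adapter: drops the unused userdata argument. */
static int clear_nonfatal_status_cb(struct pci_dev *dev, void *userdata)
{
	(void)userdata;
	pci_cleanup_aer_uncorrect_error_status(dev);
	return 0;	/* keep walking even if one device reports an error */
}
```

The adapter always returns 0 so that one device without an AER capability
(which makes the helper return -EIO) does not stop the walk for the rest of
the bus.
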


> 2) In principle, we should only read PCI_ERR_UNCOR_STATUS *once* per
> device when handling an error. We currently read it three times:
>
>   aer_isr
>     aer_isr_one_error
>       find_source_device
>         find_device_iter
>           is_error_source
>             read PCI_ERR_UNCOR_STATUS              # 1

Oza: This is the first legitimate read.

>       aer_process_err_devices
>         get_device_error_info(e_info->dev[i])
>           read PCI_ERR_UNCOR_STATUS                # 2

Oza: This read appears to be used to check whether the link is healthy,
so its purpose looks different to me.

>         handle_error_source
>           pcie_do_nonfatal_recovery
>             ...
>               report_slot_reset
>                 driver->err_handler->slot_reset
>                   pci_cleanup_aer_uncorrect_error_status
>                     read PCI_ERR_UNCOR_STATUS      # 3

Oza: pci_cleanup_aer_uncorrect_error_status() is generic and able to
clear status. For example, if we do
pci_walk_bus(dev->subordinate, pci_cleanup_aer_uncorrect_error_status,
NULL) as I suggested above, then we have to read the status registers
there.


3) We need to get rid of pci_channel_io_frozen permanently.
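
Note on point 2: one possible way to read the status only once is to keep
the value from the first read and clear from that snapshot later. The
sketch below is a userspace model of that idea only, assuming
write-1-to-clear status bits; struct fake_dev, struct fake_err_info and the
fake_* helpers are hypothetical and do not correspond to existing kernel
functions.

```c
#include <stdint.h>

/* Hypothetical device model that counts status register reads. */
struct fake_dev {
	uint32_t uncor_status;	/* models PCI_ERR_UNCOR_STATUS (write-1-to-clear) */
	uint32_t uncor_sever;	/* models PCI_ERR_UNCOR_SEVER (1 = fatal) */
	unsigned int status_reads;
};

/* Hypothetical per-error record, loosely modelled on struct aer_err_info. */
struct fake_err_info {
	uint32_t status;	/* snapshot taken when the source is identified */
	uint32_t sev;
};

static uint32_t fake_read_status(struct fake_dev *dev)
{
	dev->status_reads++;
	return dev->uncor_status;
}

/* Step 1: identify the source and keep the value that was read. */
static int fake_is_error_source(struct fake_dev *dev,
				struct fake_err_info *info)
{
	info->status = fake_read_status(dev);
	info->sev = dev->uncor_sever;
	return info->status != 0;
}

/* Step 2: clear non-fatal bits from the snapshot, without re-reading. */
static void fake_clear_nonfatal(struct fake_dev *dev,
				const struct fake_err_info *info)
{
	dev->uncor_status &= ~(info->status & ~info->sev);
}
```

A side effect of this approach is that errors logged after the snapshot are
not cleared by the later write, so they would still be pending for the next
pass.
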

Regards,
Oza.


2018-07-19 23:01:08

by Bjorn Helgaas

Subject: Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL

On Thu, Jul 19, 2018 at 09:23:47AM +0530, [email protected] wrote:
> On 2018-07-19 01:14, Bjorn Helgaas wrote:
> > This is a v3 of Oza's patches [1]. It's available at [2] if you prefer
> > git.
> > [...]
>
> Looks good to me.
> Thanks for the corrections.
> There are some x86 compilation errors; do you want me to fix them and
> push a v4?

I fixed those already. I moved these all to the pci/aer branch for
v4.19. I'll merge them into "next" soon. Thanks!

Bjorn