On Mon, 13 May 2019 22:19:56 +0800
Changbin Du <[email protected]> wrote:
> This converts the plain text documentation to reStructuredText format and
> adds it to the Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> Cc: Mauro Carvalho Chehab <[email protected]>
Reviewed-by: Mauro Carvalho Chehab <[email protected]>
> ---
> Documentation/PCI/index.rst | 1 +
> .../{pcieaer-howto.txt => pcieaer-howto.rst} | 156 +++++++++++-------
> 2 files changed, 101 insertions(+), 56 deletions(-)
> rename Documentation/PCI/{pcieaer-howto.txt => pcieaer-howto.rst} (72%)
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index 92e62d0fc9e6..f54b65b1ca5f 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -14,3 +14,4 @@ Linux PCI Bus Subsystem
> msi-howto
> acpi-info
> pci-error-recovery
> + pcieaer-howto
> diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.rst
> similarity index 72%
> rename from Documentation/PCI/pcieaer-howto.txt
> rename to Documentation/PCI/pcieaer-howto.rst
> index 48ce7903e3c6..18bdefaafd1a 100644
> --- a/Documentation/PCI/pcieaer-howto.txt
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -1,21 +1,29 @@
> - The PCI Express Advanced Error Reporting Driver Guide HOWTO
> - T. Long Nguyen <[email protected]>
> - Yanmin Zhang <[email protected]>
> - 07/29/2006
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> +===========================================================
> +The PCI Express Advanced Error Reporting Driver Guide HOWTO
> +===========================================================
>
> -1. Overview
> +:Authors: - T. Long Nguyen <[email protected]>
> + - Yanmin Zhang <[email protected]>
>
> -1.1 About this guide
> +:Copyright: |copy| 2006 Intel Corporation
> +
> +Overview
> +========
> +
> +About this guide
> +----------------
>
> This guide describes the basics of the PCI Express Advanced Error
> Reporting (AER) driver and provides information on how to use it, as
> well as how to enable the drivers of endpoint devices to conform with
> PCI Express AER driver.
>
> -1.2 Copyright (C) Intel Corporation 2006.
>
> -1.3 What is the PCI Express AER Driver?
> +What is the PCI Express AER Driver?
> +-----------------------------------
>
> PCI Express error signaling can occur on the PCI Express link itself
> or on behalf of transactions initiated on the link. PCI Express
> @@ -30,17 +38,19 @@ The PCI Express AER driver provides the infrastructure to support PCI
> Express Advanced Error Reporting capability. The PCI Express AER
> driver provides three basic functions:
>
> -- Gathers the comprehensive error information if errors occurred.
> -- Reports error to the users.
> -- Performs error recovery actions.
> + - Gathers the comprehensive error information if errors occurred.
> + - Reports error to the users.
> + - Performs error recovery actions.
>
> AER driver only attaches root ports which support PCI-Express AER
> capability.
>
>
> -2. User Guide
> +User Guide
> +==========
>
> -2.1 Include the PCI Express AER Root Driver into the Linux Kernel
> +Include the PCI Express AER Root Driver into the Linux Kernel
> +-------------------------------------------------------------
>
> The PCI Express AER Root driver is a Root Port service driver attached
> to the PCI Express Port Bus driver. If a user wants to use it, the driver
> @@ -48,7 +58,8 @@ has to be compiled. Option CONFIG_PCIEAER supports this capability. It
> depends on CONFIG_PCIEPORTBUS, so pls. set CONFIG_PCIEPORTBUS=y and
> CONFIG_PCIEAER = y.
>
> -2.2 Load PCI Express AER Root Driver
> +Load PCI Express AER Root Driver
> +--------------------------------
>
> Some systems have AER support in firmware. Enabling Linux AER support at
> the same time the firmware handles AER may result in unpredictable
> @@ -56,30 +67,34 @@ behavior. Therefore, Linux does not handle AER events unless the firmware
> grants AER control to the OS via the ACPI _OSC method. See the PCI FW 3.0
> Specification for details regarding _OSC usage.
>
> -2.3 AER error output
> +AER error output
> +----------------
>
> When a PCIe AER error is captured, an error message will be output to
> console. If it's a correctable error, it is output as a warning.
> Otherwise, it is printed as an error. So users could choose different
> log level to filter out correctable error messages.
>
> -Below shows an example:
> -0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
> -0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
> -0000:50:00.0: [20] Unsupported Request (First)
> -0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
> +Below shows an example::
> +
> + 0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
> + 0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
> + 0000:50:00.0: [20] Unsupported Request (First)
> + 0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
>
> In the example, 'Requester ID' means the ID of the device who sends
> the error message to root port. Pls. refer to pci express specs for
> other fields.
>
> -2.4 AER Statistics / Counters
> +AER Statistics / Counters
> +-------------------------
>
> When PCIe AER errors are captured, the counters / statistics are also exposed
> in the form of sysfs attributes which are documented at
> Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
>
> -3. Developer Guide
> +Developer Guide
> +===============
>
> To enable AER aware support requires a software driver to configure
> the AER capability structure within its device and to provide callbacks.
> @@ -120,7 +135,8 @@ hierarchy and links. These errors do not include any device specific
> errors because device specific errors will still get sent directly to
> the device driver.
>
> -3.1 Configure the AER capability structure
> +Configure the AER capability structure
> +--------------------------------------
>
> AER aware drivers of PCI Express component need change the device
> control registers to enable AER. They also could change AER registers,
> @@ -128,9 +144,11 @@ including mask and severity registers. Helper function
> pci_enable_pcie_error_reporting could be used to enable AER. See
> section 3.3.
>
> -3.2. Provide callbacks
> +Provide callbacks
> +-----------------
>
> -3.2.1 callback reset_link to reset pci express link
> +callback reset_link to reset pci express link
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> This callback is used to reset the pci express physical link when a
> fatal error happens. The root port aer service driver provides a
> @@ -140,13 +158,15 @@ upstream ports should provide their own reset_link functions.
>
> In struct pcie_port_service_driver, a new pointer, reset_link, is
> added.
> +::
>
> -pci_ers_result_t (*reset_link) (struct pci_dev *dev);
> + pci_ers_result_t (*reset_link) (struct pci_dev *dev);
>
> Section 3.2.2.2 provides more detailed info on when to call
> reset_link.
>
> -3.2.2 PCI error-recovery callbacks
> +PCI error-recovery callbacks
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The PCI Express AER Root driver uses error callbacks to coordinate
> with downstream device drivers associated with a hierarchy in question
> @@ -161,7 +181,8 @@ definitions of the callbacks.
>
> Below sections specify when to call the error callback functions.
>
> -3.2.2.1 Correctable errors
> +Correctable errors
> +~~~~~~~~~~~~~~~~~~
>
> Correctable errors pose no impacts on the functionality of
> the interface. The PCI Express protocol can recover without any
> @@ -169,13 +190,16 @@ software intervention or any loss of data. These errors do not
> require any recovery actions. The AER driver clears the device's
> correctable error status register accordingly and logs these errors.
>
> -3.2.2.2 Non-correctable (non-fatal and fatal) errors
> +Non-correctable (non-fatal and fatal) errors
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> If an error message indicates a non-fatal error, performing link reset
> at upstream is not required. The AER driver calls error_detected(dev,
> pci_channel_io_normal) to all drivers associated within a hierarchy in
> -question. for example,
> -EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort.
> +question. For example::
> +
> + EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort
> +
> If Upstream port A captures an AER error, the hierarchy consists of
> Downstream port B and EndPoint.
>
> @@ -199,53 +223,72 @@ function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
> reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
> to mmio_enabled.
>
> -3.3 helper functions
> +helper functions
> +----------------
> +::
> +
> + int pci_enable_pcie_error_reporting(struct pci_dev *dev);
>
> -3.3.1 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
> pci_enable_pcie_error_reporting enables the device to send error
> messages to root port when an error is detected. Note that devices
> don't enable the error reporting by default, so device drivers need
> call this function to enable it.
>
> -3.3.2 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> +::
> +
> + int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> +
> pci_disable_pcie_error_reporting disables the device to send error
> messages to root port when an error is detected.
>
> -3.3.3 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
> +::
> +
> + int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
> +
> pci_cleanup_aer_uncorrect_error_status cleanups the uncorrectable
> error status register.
>
> -3.4 Frequent Asked Questions
> +Frequent Asked Questions
> +------------------------
>
> -Q: What happens if a PCI Express device driver does not provide an
> -error recovery handler (pci_driver->err_handler is equal to NULL)?
> +Q:
> + What happens if a PCI Express device driver does not provide an
> + error recovery handler (pci_driver->err_handler is equal to NULL)?
>
> -A: The devices attached with the driver won't be recovered. If the
> -error is fatal, kernel will print out warning messages. Please refer
> -to section 3 for more information.
> +A:
> + The devices attached with the driver won't be recovered. If the
> + error is fatal, kernel will print out warning messages. Please refer
> + to section 3 for more information.
>
> -Q: What happens if an upstream port service driver does not provide
> -callback reset_link?
> +Q:
> + What happens if an upstream port service driver does not provide
> + callback reset_link?
>
> -A: Fatal error recovery will fail if the errors are reported by the
> -upstream ports who are attached by the service driver.
> +A:
> + Fatal error recovery will fail if the errors are reported by the
> + upstream ports who are attached by the service driver.
>
> -Q: How does this infrastructure deal with driver that is not PCI
> -Express aware?
> +Q:
> + How does this infrastructure deal with driver that is not PCI
> + Express aware?
>
> -A: This infrastructure calls the error callback functions of the
> -driver when an error happens. But if the driver is not aware of
> -PCI Express, the device might not report its own errors to root
> -port.
> +A:
> + This infrastructure calls the error callback functions of the
> + driver when an error happens. But if the driver is not aware of
> + PCI Express, the device might not report its own errors to root
> + port.
>
> -Q: What modifications will that driver need to make it compatible
> -with the PCI Express AER Root driver?
> +Q:
> + What modifications will that driver need to make it compatible
> + with the PCI Express AER Root driver?
>
> -A: It could call the helper functions to enable AER in devices and
> -cleanup uncorrectable status register. Pls. refer to section 3.3.
> +A:
> + It could call the helper functions to enable AER in devices and
> + cleanup uncorrectable status register. Pls. refer to section 3.3.
>
>
> -4. Software error injection
> +Software error injection
> +========================
>
> Debugging PCIe AER error recovery code is quite difficult because it
> is hard to trigger real hardware errors. Software based error
> @@ -261,6 +304,7 @@ After reboot with new kernel or insert the module, a device file named
>
> Then, you need a user space tool named aer-inject, which can be gotten
> from:
> +
> https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/
>
> More information about aer-inject can be found in the document comes
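
As a side note on the "AER error output" example quoted above: the
status/mask line can be decoded by hand, since each set bit in the status
word names one error. Below is a minimal user-space sketch of that decoding,
not anything from the patch or the kernel. The function name
decode_status_line and the regular expression are hypothetical, and the bit
names are my reading of the Uncorrectable Error Status register layout in
the PCIe spec, so they only apply to lines reported with an Uncorrected
severity.

```python
import re

# Bit positions in the AER Uncorrectable Error Status register.
# Assumption: names follow the PCIe spec; the kernel may print shorter forms.
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}

# Matches the "error status/mask=XXXXXXXX/XXXXXXXX" part of a log line.
STATUS_RE = re.compile(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")


def decode_status_line(line):
    """Return (bit, name, masked) for every status bit set in the line."""
    match = STATUS_RE.search(line)
    if match is None:
        return []
    status = int(match.group(1), 16)
    mask = int(match.group(2), 16)
    return [
        (bit, UNCORRECTABLE_BITS.get(bit, "unknown"), bool((mask >> bit) & 1))
        for bit in range(32)
        if (status >> bit) & 1
    ]


if __name__ == "__main__":
    sample = "0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000"
    for bit, name, masked in decode_status_line(sample):
        print(f"[{bit}] {name} (masked={masked})")
```

For the sample line in the patch, status 00100000 has only bit 20 set, which
agrees with the "[20] Unsupported Request" line in the quoted output.
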
Thanks,
Mauro