2018-06-29 21:17:20

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v1 0/2] PCI: Add patch submission and ACPI FW/OS info

Add hints about my patch idiosyncrasies and and ACPI firmware/OS interface
usage. I posted the latter 18 months ago or so, but never followed up on
it.

---

Bjorn Helgaas (2):
PCI: Document patch submission hints
PCI: Document ACPI description of PCI host bridges


Documentation/PCI/00-INDEX | 4 +
Documentation/PCI/acpi-info.txt | 183 ++++++++++++++++++++++++++++++
Documentation/PCI/submitting-patches.txt | 153 +++++++++++++++++++++++++
3 files changed, 340 insertions(+)
create mode 100644 Documentation/PCI/acpi-info.txt
create mode 100644 Documentation/PCI/submitting-patches.txt


2018-06-29 21:17:24

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v1 1/2] PCI: Document patch submission hints

From: Bjorn Helgaas <[email protected]>

Add hints about how to write PCI patches and changelogs.

Signed-off-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/00-INDEX | 2
Documentation/PCI/submitting-patches.txt | 153 ++++++++++++++++++++++++++++++
2 files changed, 155 insertions(+)
create mode 100644 Documentation/PCI/submitting-patches.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index 00c9a90b6f38..0f1d1de087f1 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -12,6 +12,8 @@ pci.txt
- info on the PCI subsystem for device driver authors
pcieaer-howto.txt
- the PCI Express Advanced Error Reporting Driver Guide HOWTO
+submitting-patches.txt
+ - hints about how to submit PCI patches
endpoint/pci-endpoint.txt
- guide to add endpoint controller driver and endpoint function driver.
endpoint/pci-endpoint-cfs.txt
diff --git a/Documentation/PCI/submitting-patches.txt b/Documentation/PCI/submitting-patches.txt
new file mode 100644
index 000000000000..d6a694182446
--- /dev/null
+++ b/Documentation/PCI/submitting-patches.txt
@@ -0,0 +1,153 @@
+Start with Documentation/process/submitting-patches.rst for general
+guidance.
+
+These are things I look at when reviewing patches. Most of the technical
+things are obvious or covered elsewhere. Some things here are cosmetic
+personal preferences, but consistency makes maintenance easier. I silently
+fix up most of the trivial things, so don't get too hung up on the details.
+
+Write the patch:
+
+ - Make each patch small but complete by itself. Don't worry about making
+ patches *too* small; it's trivial for me to squash patches together if
+ necessary.
+
+ - Make sure the kernel builds after every patch. You can't introduce a
+ problem in one patch and fix it in a subsequent patch. If one of the
+ autobuilders finds a build issue, I'll revert the patch unless you send
+ a fix.
+
+ - Please do send whitespace and coding style fixes, but don't mix them
+ with other substantive changes. It's easier for people to backport
+ fixes if whitespace changes are at the end of a series.
+
+ - Wrap code and comments to fit in 80 columns. Exception: I prefer
+ printk strings to be in one piece for searchability, so don't split
+ quoted strings to make them fit in 80 columns.
+
+ - Follow the existing style for code, names, and indentation. When
+ you're finished, the file should look like it was all written by one
+ person.
+
+Write the changelog title (first line of the changelog):
+
+ - Follow the existing convention Run "git log --oneline <file>" and make
+ your subject line match previous changes in format, capitalization, and
+ sentence structure. For example, native host bridge driver patch
+ titles look like this:
+
+ PCI: vmd: Remove IRQ affinity so we can allocate more IRQs
+ PCI: mediatek: Add MSI support for MT2712 and MT7622
+ PCI: rockchip: Remove IRQ domain if probe fails
+
+ - Write a complete sentence, starting with a capitalized verb.
+
+ - Include specific details, e.g., write "Add XYZ controller support"
+ instead of "add support for new generation controller".
+
+ - Do not include a trailing period in the title.
+
+Write the changelog:
+
+ - Make the changelog readable without the title. The changelog is not a
+ continuation of the title, so it should make sense by itself. Always
+ include a changelog, even if it is the same as the title.
+
+ - Explain the change (not just "Fix a problem"), but do it as concisely
+ as possible. Include function names, references to sections of the
+ spec, URLs for bug reports, etc. This makes reviewing and future
+ maintenance easier.
+
+ - Capitalize initialisms ("PCI", "IRQ", "ID", "MSI", etc) in all English
+ text, including title, changelog, and comments. These are usually
+ written in lower-case in the C code, but please follow normal English
+ conventions in text.
+
+ - Include "()" after function names and "[]" after array names as a
+ visual clue that these refer to something in the code.
+
+ - Include dmesg output and stack trace when relevant. Prune details that
+ aren't relevant, e.g., you can usually remove timestamps and function
+ addresses. The objective is to concisely illustrate the issue and make
+ it discoverable by search engines.
+
+ - Use spaces (not tabs) in the changelog because "git log" indents the
+ changelog and things aligned with tabs won't stay aligned.
+
+ - Wrap changelogs to fit in 80 columns when shown by "git show", which
+ adds 4 spaces. I use "textwidth=75" in vim.
+
+ - Order tags as suggested by Ingo [1] (extended):
+
+ Fixes:
+ Link:
+ Reported-by:
+ Tested-by:
+ Signed-off-by: (author)
+ Signed-off-by: (chain)
+ Reviewed-by:
+ Acked-by:
+ Cc: [email protected] # 3.4+
+ Cc: (other)
+
+ - Include a "Fixes:" reference when you're fixing a previous commit and
+ copy the author of the previous commit. This helps people figure out
+ where a change needs to be backported.
+
+ - Include specific commit references when possible, e.g., 'e77f847df54c
+ ("PCI: rockchip: Add Rockchip PCIe controller support")'. I use this
+ alias to generate them:
+
+ alias gsr='git --no-pager show -s --abbrev-commit --abbrev=12 --pretty=format:"%h (\"%s\")%n"'
+
+ - Include bugzilla URLs if available (kernel.org bugzilla preferred),
+ e.g.,
+
+ Link: https://bugzilla.redhat.com/show_bug.cgi?id=1409201
+
+ - Include problem report URLs. Use kernel.org URLs, e.g.,
+ http://lkml.kernel.org/r/<Message-ID>, because they don't depend on
+ other mirror sites:
+
+ Link: http://lkml.kernel.org/r/[email protected]
+
+ - Include specific references to the spec when possible, e.g., "PCIe
+ r3.1, sec 7.8.2". If you're talking about something mentioned in the
+ spec, use the same name and capitalization as the spec.
+
+ - Include a "Cc: [email protected]" tag if you want one, and
+ indicate which kernels need it.
+
+Post the patch:
+
+ - Use scripts/get_maintainer.pl to find the maintainers of files you're
+ changing, and copy the maintainers and authors of recent or related
+ changes.
+
+ - Always copy [email protected] and [email protected].
+ I don't apply patches that haven't appeared on the linux-pci mailing
+ list, even if you send them to me directly. This is partly to make
+ sure everyone has a chance to review it and partly because I use the
+ Patchwork tracker [2], which only tracks things on the linux-pci list.
+
+ - If you send more than one patch and they're related, always include a
+ "[0/n]" cover letter. This makes it easy for me to reply to the cover
+ letter saying "I applied this series." I use "stg -e -v v1 --to=...
+ patch1..patchN".
+
+ - If you post a new version, please make sure it includes "[v2]" or
+ similar in the subject line. If it's a series, I want a new version
+ of the entire series. I don't want updates of individual patches
+ within the series -- that's too hard for me to keep track of. It's
+ perfectly fine if some patches in a v2 series are the same as in the
+ initial posting.
+
+ - If you want the patch in the current release, include a cover letter
+ and tell me that. Otherwise, I assume all patches are intended for the
+ next merge window.
+
+ - If you're really gung-ho, you can go to Patchwork [2] and mark your
+ superseded patches as "Superseded" so I don't have to do that myself.
+
+[1] http://lkml.kernel.org/r/[email protected]
+[2] https://patchwork.ozlabs.org/project/linux-pci/list/


2018-06-29 21:18:23

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v1 2/2] PCI: Document ACPI description of PCI host bridges

From: Bjorn Helgaas <[email protected]>

Add a writeup about how PCI host bridges should be described in ACPI
using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.

Signed-off-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/00-INDEX | 2
Documentation/PCI/acpi-info.txt | 183 +++++++++++++++++++++++++++++++++++++++
2 files changed, 185 insertions(+)
create mode 100644 Documentation/PCI/acpi-info.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index 0f1d1de087f1..fc6af2957e55 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -1,5 +1,7 @@
00-INDEX
- this file
+acpi-info.txt
+ - info on how PCI host bridges are represented in ACPI
MSI-HOWTO.txt
- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
PCIEBUS-HOWTO.txt
diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
new file mode 100644
index 000000000000..9b8e7b560b50
--- /dev/null
+++ b/Documentation/PCI/acpi-info.txt
@@ -0,0 +1,183 @@
+ ACPI considerations for PCI host bridges
+
+The general rule is that the ACPI namespace should describe everything the
+OS might use unless there's another way for the OS to find it [1, 2].
+
+For example, there's no standard hardware mechanism for enumerating PCI
+host bridges, so ACPI must describe each host bridge, the method for
+accessing PCI config space below it, the address space windows the bridge
+forwards to PCI, and the routing of legacy INTx interrupts.
+
+PCI devices *below* the host bridge generally do not need to be described
+via ACPI because the OS can discover them via the standard PCI enumeration
+mechanism, which uses config accesses to discover and identify the device
+and read and size its BARs.
+
+ACPI resource description is done via _CRS objects of devices in the ACPI
+namespace [2].   The _CRS is like a generalized PCI BAR: the OS can read
+_CRS and figure out what resource is being consumed even if it doesn't have
+a driver for the device [3].  That's important because it means an old OS
+can work correctly even on a system with new devices unknown to the OS.
+The new devices might not do anything, but the OS can at least make sure no
+resources conflict with them.
+
+Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
+reserving address space! The static tables are for things the OS needs to
+know early in boot, before it can parse the ACPI namespace. If a new table
+is defined, an old OS needs to operate correctly even though it ignores the
+table. _CRS allows that because it is generic and understood by the old
+OS; a static table does not.
+
+If the OS is expected to manage a non-discoverable device described via
+ACPI, that device will have a specific _HID/_CID that tells the OS what
+driver to bind to it, and the _CRS tells the OS and the driver where the
+device's registers are.
+
+PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
+describe all the address space they consume.  This includes all the windows
+they forward down to the PCI bus, as well as bridge registers that are not
+forwarded to PCI.  The bridge registers include things like secondary/
+subordinate bus registers that determine the bus range below the bridge,
+window registers that describe the apertures, etc.  These are all
+device-specific, non-architected things, so the only way a PNP0A03/PNP0A08
+driver can manage them is via _PRS/_CRS/_SRS, which contain the
+device-specific details.  The bridge registers also include ECAM space,
+since it is consumed by the bridge.
+
+ACPI defines a Consumer/Producer bit to distinguish the bridge registers
+("Consumer") from the bridge apertures ("Producer") [4, 5], but early
+BIOSes didn't use that bit correctly. The result is that the current ACPI
+spec defines Consumer/Producer only for the Extended Address Space
+descriptors; the bit should be ignored in the older QWord/DWord/Word
+Address Space descriptors. Consequently, OSes have to assume all
+QWord/DWord/Word descriptors are windows.
+
+Prior to the addition of Extended Address Space descriptors, the failure of
+Consumer/Producer meant there was no way to describe bridge registers in
+the PNP0A03/PNP0A08 device itself. The workaround was to describe the
+bridge registers (including ECAM space) in PNP0C02 catch-all devices [6].
+With the exception of ECAM, the bridge register space is device-specific
+anyway, so the generic PNP0A03/PNP0A08 driver (pci_root.c) has no need to
+know about it.  
+
+New architectures should be able to use "Consumer" Extended Address Space
+descriptors in the PNP0A03 device for bridge registers, including ECAM,
+although a strict interpretation of [6] might prohibit this. Old x86 and
+ia64 kernels assume all address space descriptors, including "Consumer"
+Extended Address Space ones, are windows, so it would not be safe to
+describe bridge registers this way on those architectures.
+
+PNP0C02 "motherboard" devices are basically a catch-all.  There's no
+programming model for them other than "don't use these resources for
+anything else."  So a PNP0C02 _CRS should claim any address space that is
+(1) not claimed by _CRS under any other device object in the ACPI namespace
+and (2) should not be assigned by the OS to something else.
+
+The PCIe spec requires the Enhanced Configuration Access Method (ECAM)
+unless there's a standard firmware interface for config access, e.g., the
+ia64 SAL interface [7]. A host bridge consumes ECAM memory address space
+and converts memory accesses into PCI configuration accesses. The spec
+defines the ECAM address space layout and functionality; only the base of
+the address space is device-specific. An ACPI OS learns the base address
+from either the static MCFG table or a _CBA method in the PNP0A03 device.
+
+The MCFG table must describe the ECAM space of non-hot pluggable host
+bridges [8]. Since MCFG is a static table and can't be updated by hotplug,
+a _CBA method in the PNP0A03 device describes the ECAM space of a
+hot-pluggable host bridge [9]. Note that for both MCFG and _CBA, the base
+address always corresponds to bus 0, even if the bus range below the bridge
+(which is reported via _CRS) doesn't start at 0.
+
+
+[1] ACPI 6.2, sec 6.1:
+ For any device that is on a non-enumerable type of bus (for example, an
+ ISA bus), OSPM enumerates the devices' identifier(s) and the ACPI
+ system firmware must supply an _HID object ... for each device to
+ enable OSPM to do that.
+
+[2] ACPI 6.2, sec 3.7:
+ The OS enumerates motherboard devices simply by reading through the
+ ACPI Namespace looking for devices with hardware IDs.
+
+ Each device enumerated by ACPI includes ACPI-defined objects in the
+ ACPI Namespace that report the hardware resources the device could
+ occupy [_PRS], an object that reports the resources that are currently
+ used by the device [_CRS], and objects for configuring those resources
+ [_SRS]. The information is used by the Plug and Play OS (OSPM) to
+ configure the devices.
+
+[3] ACPI 6.2, sec 6.2:
+ OSPM uses device configuration objects to configure hardware resources
+ for devices enumerated via ACPI. Device configuration objects provide
+ information about current and possible resource requirements, the
+ relationship between shared resources, and methods for configuring
+ hardware resources.
+
+ When OSPM enumerates a device, it calls _PRS to determine the resource
+ requirements of the device. It may also call _CRS to find the current
+ resource settings for the device. Using this information, the Plug and
+ Play system determines what resources the device should consume and
+ sets those resources by calling the device’s _SRS control method.
+
+ In ACPI, devices can consume resources (for example, legacy keyboards),
+ provide resources (for example, a proprietary PCI bridge), or do both.
+ Unless otherwise specified, resources for a device are assumed to be
+ taken from the nearest matching resource above the device in the device
+ hierarchy.
+
+[4] ACPI 6.2, sec 6.4.3.5.1, 2, 3, 4:
+ QWord/DWord/Word Address Space Descriptor (.1, .2, .3)
+ General Flags: Bit [0] Ignored
+
+ Extended Address Space Descriptor (.4)
+ General Flags: Bit [0] Consumer/Producer:
+ 1–This device consumes this resource
+ 0–This device produces and consumes this resource
+
+[5] ACPI 6.2, sec 19.6.43:
+ ResourceUsage specifies whether the Memory range is consumed by
+ this device (ResourceConsumer) or passed on to child devices
+ (ResourceProducer). If nothing is specified, then
+ ResourceConsumer is assumed.
+
+[6] PCI Firmware 3.2, sec 4.1.2:
+ If the operating system does not natively comprehend reserving the
+ MMCFG region, the MMCFG region must be reserved by firmware. The
+ address range reported in the MCFG table or by _CBA method (see Section
+ 4.1.3) must be reserved by declaring a motherboard resource. For most
+ systems, the motherboard resource would appear at the root of the ACPI
+ namespace (under \_SB) in a node with a _HID of EISAID (PNP0C02), and
+ the resources in this case should not be claimed in the root PCI bus’s
+ _CRS. The resources can optionally be returned in Int15 E820 or
+ EFIGetMemoryMap as reserved memory but must always be reported through
+ ACPI as a motherboard resource.
+
+[7] PCI Express 4.0, sec 7.2.2:
+ For systems that are PC-compatible, or that do not implement a
+ processor-architecture-specific firmware interface standard that allows
+ access to the Configuration Space, the ECAM is required as defined in
+ this section.
+
+[8] PCI Firmware 3.2, sec 4.1.2:
+ The MCFG table is an ACPI table that is used to communicate the base
+ addresses corresponding to the non-hot removable PCI Segment Groups
+ range within a PCI Segment Group available to the operating system at
+ boot. This is required for the PC-compatible systems.
+
+ The MCFG table is only used to communicate the base addresses
+ corresponding to the PCI Segment Groups available to the system at
+ boot.
+
+[9] PCI Firmware 3.2, sec 4.1.3:
+ The _CBA (Memory mapped Configuration Base Address) control method is
+ an optional ACPI object that returns the 64-bit memory mapped
+ configuration base address for the hot plug capable host bridge. The
+ base address returned by _CBA is processor-relative address. The _CBA
+ control method evaluates to an Integer.
+
+ This control method appears under a host bridge object. When the _CBA
+ method appears under an active host bridge object, the operating system
+ evaluates this structure to identify the memory mapped configuration
+ base address corresponding to the PCI Segment Group for the bus number
+ range specified in _CRS method. An ACPI name space object that contains
+ the _CBA method must also contain a corresponding _SEG method.


2018-06-30 00:50:34

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] PCI: Document patch submission hints

On Fri, Jun 29, 2018 at 03:27:39PM -0500, Bjorn Helgaas wrote:
> + - Wrap changelogs to fit in 80 columns when shown by "git show", which
> + adds 4 spaces. I use "textwidth=75" in vim.

I guess the ideal width is subjective, I usually wrap at 72 chars because
then you've got 4 blanks on either side when viewed with "git log", which
I find neater than maxing out the horizontal width.

In some cases I deliberately wrap at less than 72 chars or allow 73 chars
if it avoids vastly unequal widths in a paragraph. Often when I later
doublecheck what you've committed, I find that you've rewrapped everything
to 75 chars and the result doesn't look as neat as I wanted it to be.
Not a big deal, but thought I'd mention it now that you're codifying this
"rule" somewhat more formally.

Lukas

2018-07-01 17:47:32

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] PCI: Document patch submission hints

On Fri, Jun 29, 2018 at 03:27:39PM -0500, Bjorn Helgaas wrote:
> --- /dev/null
> +++ b/Documentation/PCI/submitting-patches.txt
> @@ -0,0 +1,153 @@
> +Start with Documentation/process/submitting-patches.rst for general
> +guidance.
> +
> +These are things I look at when reviewing patches.

For an uninitiated reader who doesn't know that you're currently the
(sole) maintainer of the PCI subsystem, this sentence might look odd.
Who's "I"? What happens if you onboard co-maintainers, are you
going to change this to "we"?


> + - Wrap code and comments to fit in 80 columns. Exception: I prefer
> + printk strings to be in one piece for searchability, so don't split
> + quoted strings to make them fit in 80 columns.

This is a duplication of Documentation/process/coding-style.rst, section 2.


> + - Follow the existing convention Run "git log --oneline <file>" and make
> + your subject line match previous changes in format, capitalization, and
> + sentence structure. For example, native host bridge driver patch
> + titles look like this:
> +
> + PCI: vmd: Remove IRQ affinity so we can allocate more IRQs
> + PCI: mediatek: Add MSI support for MT2712 and MT7622
> + PCI: rockchip: Remove IRQ domain if probe fails

A quick "git log --oneline --no-merges drivers/pci" shows that the prefixes
in use aren't consistent at all: Sometimes a slash is used to separate
"PCI" from the subpart touched by the patch, sometimes a colon, e.g.
"PCI/AER: " versus "PCI: shpchp: ". Your own patches aren't consistent
in that respect. Sometimes, only "PCI: " is given as prefix, even though
the commit only touches a subpart such as "sysfs", so could easily specify
more precisely what it's touching.

If you value consistency, it would be good to codify the preferred form
right here.


> + - Include specific details, e.g., write "Add XYZ controller support"
> + instead of "add support for new generation controller".

Why not simply "Support XYZ controller"? One word less, more succinct.


> + - Always copy [email protected] and [email protected].

I'd drop linux-kernel here. The volume on that list is already like
drinking from a firehose, I doubt it adds much value to cc it.

Thanks,

Lukas

2018-07-04 09:43:55

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v1 2/2] PCI: Document ACPI description of PCI host bridges

On Fri, Jun 29, 2018 at 10:27 PM, Bjorn Helgaas <[email protected]> wrote:
> From: Bjorn Helgaas <[email protected]>
>
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/00-INDEX | 2
> Documentation/PCI/acpi-info.txt | 183 +++++++++++++++++++++++++++++++++++++++
> 2 files changed, 185 insertions(+)
> create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 0f1d1de087f1..fc6af2957e55 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
> 00-INDEX
> - this file
> +acpi-info.txt
> + - info on how PCI host bridges are represented in ACPI
> MSI-HOWTO.txt
> - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
> PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 000000000000..9b8e7b560b50
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,183 @@
> + ACPI considerations for PCI host bridges
> +
> +The general rule is that the ACPI namespace should describe everything the
> +OS might use unless there's another way for the OS to find it [1, 2].
> +
> +For example, there's no standard hardware mechanism for enumerating PCI
> +host bridges, so ACPI must describe each host bridge, the method for
> +accessing PCI config space below it, the address space windows the bridge
> +forwards to PCI, and the routing of legacy INTx interrupts.
> +
> +PCI devices *below* the host bridge generally do not need to be described
> +via ACPI because the OS can discover them via the standard PCI enumeration
> +mechanism, which uses config accesses to discover and identify the device
> +and read and size its BARs.

While they can be discovered without ACPI, power management or hotplug
may depend on it.

Also things like _PRT come to mind here.

> +
> +ACPI resource description is done via _CRS objects of devices in the ACPI
> +namespace [2]. The _CRS is like a generalized PCI BAR: the OS can read
> +_CRS and figure out what resource is being consumed even if it doesn't have
> +a driver for the device [3]. That's important because it means an old OS
> +can work correctly even on a system with new devices unknown to the OS.
> +The new devices might not do anything, but the OS can at least make sure no
> +resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space! The static tables are for things the OS needs to
> +know early in boot, before it can parse the ACPI namespace. If a new table
> +is defined, an old OS needs to operate correctly even though it ignores the
> +table. _CRS allows that because it is generic and understood by the old
> +OS; a static table does not.
> +
> +If the OS is expected to manage a non-discoverable device described via
> +ACPI, that device will have a specific _HID/_CID that tells the OS what
> +driver to bind to it, and the _CRS tells the OS and the driver where the
> +device's registers are.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices. Their _CRS should
> +describe all the address space they consume. This includes all the windows
> +they forward down to the PCI bus, as well as bridge registers that are not

I believe you mean registers of the host bridge itself here, but it is
somewhat unclear if that applies to bridges below it too.

> +forwarded to PCI. The bridge registers include things like secondary/
> +subordinate bus registers that determine the bus range below the bridge,
> +window registers that describe the apertures, etc. These are all
> +device-specific, non-architected things, so the only way a PNP0A03/PNP0A08
> +driver can manage them is via _PRS/_CRS/_SRS, which contain the
> +device-specific details. The bridge registers also include ECAM space,
> +since it is consumed by the bridge.
> +
> +ACPI defines a Consumer/Producer bit to distinguish the bridge registers
> +("Consumer") from the bridge apertures ("Producer") [4, 5], but early
> +BIOSes didn't use that bit correctly. The result is that the current ACPI
> +spec defines Consumer/Producer only for the Extended Address Space
> +descriptors; the bit should be ignored in the older QWord/DWord/Word
> +Address Space descriptors. Consequently, OSes have to assume all
> +QWord/DWord/Word descriptors are windows.
> +
> +Prior to the addition of Extended Address Space descriptors, the failure of
> +Consumer/Producer meant there was no way to describe bridge registers in
> +the PNP0A03/PNP0A08 device itself. The workaround was to describe the
> +bridge registers (including ECAM space) in PNP0C02 catch-all devices [6].
> +With the exception of ECAM, the bridge register space is device-specific
> +anyway, so the generic PNP0A03/PNP0A08 driver (pci_root.c) has no need to
> +know about it.
> +
> +New architectures should be able to use "Consumer" Extended Address Space
> +descriptors in the PNP0A03 device for bridge registers, including ECAM,
> +although a strict interpretation of [6] might prohibit this. Old x86 and
> +ia64 kernels assume all address space descriptors, including "Consumer"
> +Extended Address Space ones, are windows, so it would not be safe to
> +describe bridge registers this way on those architectures.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all. There's no
> +programming model for them other than "don't use these resources for
> +anything else." So a PNP0C02 _CRS should claim any address space that is
> +(1) not claimed by _CRS under any other device object in the ACPI namespace
> +and (2) should not be assigned by the OS to something else.
> +
> +The PCIe spec requires the Enhanced Configuration Access Method (ECAM)
> +unless there's a standard firmware interface for config access, e.g., the
> +ia64 SAL interface [7]. A host bridge consumes ECAM memory address space
> +and converts memory accesses into PCI configuration accesses. The spec
> +defines the ECAM address space layout and functionality; only the base of
> +the address space is device-specific. An ACPI OS learns the base address
> +from either the static MCFG table or a _CBA method in the PNP0A03 device.
> +
> +The MCFG table must describe the ECAM space of non-hot pluggable host
> +bridges [8]. Since MCFG is a static table and can't be updated by hotplug,
> +a _CBA method in the PNP0A03 device describes the ECAM space of a
> +hot-pluggable host bridge [9]. Note that for both MCFG and _CBA, the base
> +address always corresponds to bus 0, even if the bus range below the bridge
> +(which is reported via _CRS) doesn't start at 0.
> +
> +
> +[1] ACPI 6.2, sec 6.1:
> + For any device that is on a non-enumerable type of bus (for example, an
> + ISA bus), OSPM enumerates the devices' identifier(s) and the ACPI
> + system firmware must supply an _HID object ... for each device to
> + enable OSPM to do that.
> +
> +[2] ACPI 6.2, sec 3.7:
> + The OS enumerates motherboard devices simply by reading through the
> + ACPI Namespace looking for devices with hardware IDs.
> +
> + Each device enumerated by ACPI includes ACPI-defined objects in the
> + ACPI Namespace that report the hardware resources the device could
> + occupy [_PRS], an object that reports the resources that are currently
> + used by the device [_CRS], and objects for configuring those resources
> + [_SRS]. The information is used by the Plug and Play OS (OSPM) to
> + configure the devices.
> +
> +[3] ACPI 6.2, sec 6.2:
> + OSPM uses device configuration objects to configure hardware resources
> + for devices enumerated via ACPI. Device configuration objects provide
> + information about current and possible resource requirements, the
> + relationship between shared resources, and methods for configuring
> + hardware resources.
> +
> + When OSPM enumerates a device, it calls _PRS to determine the resource
> + requirements of the device. It may also call _CRS to find the current
> + resource settings for the device. Using this information, the Plug and
> + Play system determines what resources the device should consume and
> + sets those resources by calling the device’s _SRS control method.
> +
> + In ACPI, devices can consume resources (for example, legacy keyboards),
> + provide resources (for example, a proprietary PCI bridge), or do both.
> + Unless otherwise specified, resources for a device are assumed to be
> + taken from the nearest matching resource above the device in the device
> + hierarchy.
> +
> +[4] ACPI 6.2, sec 6.4.3.5.1, 2, 3, 4:
> + QWord/DWord/Word Address Space Descriptor (.1, .2, .3)
> + General Flags: Bit [0] Ignored
> +
> + Extended Address Space Descriptor (.4)
> + General Flags: Bit [0] Consumer/Producer:
> + 1–This device consumes this resource
> + 0–This device produces and consumes this resource
> +
> +[5] ACPI 6.2, sec 19.6.43:
> + ResourceUsage specifies whether the Memory range is consumed by
> + this device (ResourceConsumer) or passed on to child devices
> + (ResourceProducer). If nothing is specified, then
> + ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.2, sec 4.1.2:
> + If the operating system does not natively comprehend reserving the
> + MMCFG region, the MMCFG region must be reserved by firmware. The
> + address range reported in the MCFG table or by _CBA method (see Section
> + 4.1.3) must be reserved by declaring a motherboard resource. For most
> + systems, the motherboard resource would appear at the root of the ACPI
> + namespace (under \_SB) in a node with a _HID of EISAID (PNP0C02), and
> + the resources in this case should not be claimed in the root PCI bus’s
> + _CRS. The resources can optionally be returned in Int15 E820 or
> + EFIGetMemoryMap as reserved memory but must always be reported through
> + ACPI as a motherboard resource.
> +
> +[7] PCI Express 4.0, sec 7.2.2:
> + For systems that are PC-compatible, or that do not implement a
> + processor-architecture-specific firmware interface standard that allows
> + access to the Configuration Space, the ECAM is required as defined in
> + this section.
> +
> +[8] PCI Firmware 3.2, sec 4.1.2:
> + The MCFG table is an ACPI table that is used to communicate the base
> + addresses corresponding to the non-hot removable PCI Segment Groups
> + range within a PCI Segment Group available to the operating system at
> + boot. This is required for the PC-compatible systems.
> +
> + The MCFG table is only used to communicate the base addresses
> + corresponding to the PCI Segment Groups available to the system at
> + boot.
> +
> +[9] PCI Firmware 3.2, sec 4.1.3:
> + The _CBA (Memory mapped Configuration Base Address) control method is
> + an optional ACPI object that returns the 64-bit memory mapped
> + configuration base address for the hot plug capable host bridge. The
> + base address returned by _CBA is processor-relative address. The _CBA
> + control method evaluates to an Integer.
> +
> + This control method appears under a host bridge object. When the _CBA
> + method appears under an active host bridge object, the operating system
> + evaluates this structure to identify the memory mapped configuration
> + base address corresponding to the PCI Segment Group for the bus number
> + range specified in _CRS method. An ACPI name space object that contains
> + the _CBA method must also contain a corresponding _SEG method.

Apart from the minor comments above, it looks all good. Thanks for
taking care of documenting this!

Reviewed-by: Rafael J. Wysocki <[email protected]>

2018-07-12 16:01:30

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] PCI: Document patch submission hints

On Sun, Jul 01, 2018 at 07:45:08PM +0200, Lukas Wunner wrote:
> On Fri, Jun 29, 2018 at 03:27:39PM -0500, Bjorn Helgaas wrote:
> > --- /dev/null
> > +++ b/Documentation/PCI/submitting-patches.txt
> > @@ -0,0 +1,153 @@
> > +Start with Documentation/process/submitting-patches.rst for general
> > +guidance.
> > +
> > +These are things I look at when reviewing patches.
>
> For an uninitiated reader who doesn't know that you're currently the
> (sole) maintainer of the PCI subsystem, this sentence might look odd.
> Who's "I"? What happens if you onboard co-maintainers, are you
> going to change this to "we"?

You're absolutely right. That was appropriate for email (where I
originally posted this) but certainly not for the tree.

> > + - Follow the existing convention Run "git log --oneline <file>" and make
> > + your subject line match previous changes in format, capitalization, and
> > + sentence structure. For example, native host bridge driver patch
> > + titles look like this:
> > +
> > + PCI: vmd: Remove IRQ affinity so we can allocate more IRQs
> > + PCI: mediatek: Add MSI support for MT2712 and MT7622
> > + PCI: rockchip: Remove IRQ domain if probe fails
>
> A quick "git log --oneline --no-merges drivers/pci" shows that the prefixes
> in use aren't consistent at all: Sometimes a slash is used to separate
> "PCI" from the subpart touched by the patch, sometimes a colon, e.g.
> "PCI/AER: " versus "PCI: shpchp: ". Your own patches aren't consistent
> in that respect. Sometimes, only "PCI: " is given as prefix, even though
> the commit only touches a subpart such as "sysfs", so could easily specify
> more precisely what it's touching.
>
> If you value consistency, it would be good to codify the preferred form
> right here.

I was trying to make the point that whenever we're doing *anything* in
a shared project like Linux, it's better to look at existing practice
and follow it than it is to slavishly follow rules.

So I purposely didn't enumerate all the cases. I think if you look at
logs of individual files, they're pretty consistent. I generally use
"PCI/XXX" for things in the core (mostly capabilities like MSI, AER,
DPC, etc) and "PCI: xxx:" for drivers (shpchp, pciehp, etc). IIRC, I
adopted this from previous practice.

Of course there are things I don't apply that are different. And
Rafael does most of the PM stuff, so I try to follow *his* convention
for that.

> > + - Always copy [email protected] and [email protected].
>
> I'd drop linux-kernel here. The volume on that list is already like
> drinking from a firehose, I doubt it adds much value to cc it.

It's useful as an archive and for people not subscribed to linux-pci
who happen to see the occasional thing they want to respond to.
Nobody expects to be able to follow everything on LKML.
get_maintainer.pl reports LKML for everything, and I'm not trying to
innovate by being different.

But on reflection, I think the overall value of this writeup is
minimal. It's a lot of repetition of things already documented
elsewhere and most of it boils down to "pay attention to existing
practice and don't do things differently unless you're innovating and
adding value." That *should* be obvious, and if it's not, I doubt
that adding one more thing to read is going to make it more obvious.

Bjorn

2018-07-27 21:02:54

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v1 2/2] PCI: Document ACPI description of PCI host bridges

On Wed, Jul 04, 2018 at 11:40:52AM +0200, Rafael J. Wysocki wrote:
> On Fri, Jun 29, 2018 at 10:27 PM, Bjorn Helgaas <[email protected]> wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> >
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > Documentation/PCI/00-INDEX | 2
> > Documentation/PCI/acpi-info.txt | 183 +++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 185 insertions(+)
> > create mode 100644 Documentation/PCI/acpi-info.txt
> >
> > diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> > index 0f1d1de087f1..fc6af2957e55 100644
> > --- a/Documentation/PCI/00-INDEX
> > +++ b/Documentation/PCI/00-INDEX
> > @@ -1,5 +1,7 @@
> > 00-INDEX
> > - this file
> > +acpi-info.txt
> > + - info on how PCI host bridges are represented in ACPI
> > MSI-HOWTO.txt
> > - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
> > PCIEBUS-HOWTO.txt
> > diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> > new file mode 100644
> > index 000000000000..9b8e7b560b50
> > --- /dev/null
> > +++ b/Documentation/PCI/acpi-info.txt
> > @@ -0,0 +1,183 @@
> > + ACPI considerations for PCI host bridges
> > +
> > +The general rule is that the ACPI namespace should describe everything the
> > +OS might use unless there's another way for the OS to find it [1, 2].
> > +
> > +For example, there's no standard hardware mechanism for enumerating PCI
> > +host bridges, so ACPI must describe each host bridge, the method for
> > +accessing PCI config space below it, the address space windows the bridge
> > +forwards to PCI, and the routing of legacy INTx interrupts.
> > +
> > +PCI devices *below* the host bridge generally do not need to be described
> > +via ACPI because the OS can discover them via the standard PCI enumeration
> > +mechanism, which uses config accesses to discover and identify the device
> > +and read and size its BARs.
>
> While they can be discovered without ACPI, power management or hotplug
> may depend on it.
>
> Also things like _PRT come to mind here.

Good point. I added some text for these items (see below).

> > +ACPI resource description is done via _CRS objects of devices in the ACPI
> > +namespace [2]. The _CRS is like a generalized PCI BAR: the OS can read
> > +_CRS and figure out what resource is being consumed even if it doesn't have
> > +a driver for the device [3]. That's important because it means an old OS
> > +can work correctly even on a system with new devices unknown to the OS.
> > +The new devices might not do anything, but the OS can at least make sure no
> > +resources conflict with them.
> > +
> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space! The static tables are for things the OS needs to
> > +know early in boot, before it can parse the ACPI namespace. If a new table
> > +is defined, an old OS needs to operate correctly even though it ignores the
> > +table. _CRS allows that because it is generic and understood by the old
> > +OS; a static table does not.
> > +
> > +If the OS is expected to manage a non-discoverable device described via
> > +ACPI, that device will have a specific _HID/_CID that tells the OS what
> > +driver to bind to it, and the _CRS tells the OS and the driver where the
> > +device's registers are.
> > +
> > +PCI host bridges are PNP0A03 or PNP0A08 devices. Their _CRS should
> > +describe all the address space they consume. This includes all the windows
> > +they forward down to the PCI bus, as well as bridge registers that are not
>
> I believe you mean registers of the host bridge itself here, but it is
> somewhat unclear if that applies to bridges below it too.

Right, I do mean host bridge registers for all the "bridge"
occurrences in this paragraph. I changed all these references to
"host bridge".

PCI-to-PCI bridges below the host bridge are just like any other PCI
device, of course. Their config registers are accessed via the normal
config accessors and their MMIO space (if any) would be included in
the host bridge's apertures.

> > +forwarded to PCI. The bridge registers include things like secondary/
> > +subordinate bus registers that determine the bus range below the bridge,
> > +window registers that describe the apertures, etc. These are all
> > +device-specific, non-architected things, so the only way a PNP0A03/PNP0A08
> > +driver can manage them is via _PRS/_CRS/_SRS, which contain the
> > +device-specific details. The bridge registers also include ECAM space,
> > +since it is consumed by the bridge.

I made the following changes (hand-edited to remove trivial changes):

diff -u b/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
--- b/Documentation/PCI/acpi-info.txt
+++ b/Documentation/PCI/acpi-info.txt
@@ -1,17 +1,21 @@

For example, there's no standard hardware mechanism for enumerating PCI
-host bridges, so ACPI must describe each host bridge, the method for
-accessing PCI config space below it, the address space windows the bridge
-forwards to PCI, and the routing of legacy INTx interrupts.
-
-PCI devices *below* the host bridge generally do not need to be described
-via ACPI because the OS can discover them via the standard PCI enumeration
-mechanism, which uses config accesses to discover and identify the device
-and read and size its BARs.
+host bridges, so the ACPI namespace must describe each host bridge, the
+method for accessing PCI config space below it, the address space windows
+the host bridge forwards to PCI (using _CRS), and the routing of legacy
+INTx interrupts (using _PRT).
+
+PCI devices, which are below the host bridge, generally do not need to be
+described via ACPI. The OS can discover them via the standard PCI
+enumeration mechanism, using config accesses to discover and identify
+devices and read and size their BARs. However, ACPI may describe PCI
+devices if it provides power management or hotplug functionality for them
+or if the device has INTx interrupts connected by platform interrupt
+controllers and a _PRT is needed to describe those connections.

@@ -35,14 +39,14 @@

PCI host bridges are PNP0A03 or PNP0A08 devices. ?Their _CRS should
describe all the address space they consume. ?This includes all the windows
-they forward down to the PCI bus, as well as bridge registers that are not
-forwarded to PCI. ?The bridge registers include things like secondary/
-subordinate bus registers that determine the bus range below the bridge,
-window registers that describe the apertures, etc. ?These are all
-device-specific, non-architected things, so the only way a PNP0A03/PNP0A08
-driver can manage them is via _PRS/_CRS/_SRS, which contain the
-device-specific details. ?The bridge registers also include ECAM space,
-since it is consumed by the bridge.
+they forward down to the PCI bus, as well as registers of the host bridge
+itself that are not forwarded to PCI. ?The host bridge registers include
+things like secondary/subordinate bus registers that determine the bus
+range below the bridge, window registers that describe the apertures, etc.
+These are all device-specific, non-architected things, so the only way a
+PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which contain
+the device-specific details. ?The host bridge registers also include ECAM
+space, since it is consumed by the host bridge.

2018-07-30 14:33:43

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] PCI: Document patch submission hints

On Thu, Jul 12, 2018 at 10:59:46AM -0500, Bjorn Helgaas wrote:
> But on reflection, I think the overall value of this writeup is
> minimal. It's a lot of repetition of things already documented
> elsewhere and most of it boils down to "pay attention to existing
> practice and don't do things differently unless you're innovating and
> adding value." That *should* be obvious, and if it's not, I doubt
> that adding one more thing to read is going to make it more obvious.

So my opinion is that your writeup does contain valid points that are
worth documenting: For an open source project, a top priority is to
attract and retain contributors who improve the bus factor, who keep
the code base alive and maintained, thereby avoiding bit rot.

Knowledge diffusion, including documentation of best practices and
conventions, goes a long way towards that goal. Your writeup was
mainly from a maintainer perspective: "consistency makes maintenance
easier". But consistency is also valuable from a contributor
perspective: It makes it easier to dive into a code base and find
your way around, and that includes changelogs in the git history.

There are important bits of knowledge in the writeup, if those can
be distilled, the result would very much be valuable to have in the tree.

Example:

> I generally use
> "PCI/XXX" for things in the core (mostly capabilities like MSI, AER,
> DPC, etc) and "PCI: xxx:" for drivers (shpchp, pciehp, etc).

That was in fact unknown to me.

If you find it difficult to put yourself in the shoes of a contributor,
I could try to rework the document and distill the points I find important.

Thanks,

Lukas

2018-07-30 21:57:51

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v1 1/2] PCI: Document patch submission hints

On Mon, Jul 30, 2018 at 04:31:36PM +0200, Lukas Wunner wrote:
> On Thu, Jul 12, 2018 at 10:59:46AM -0500, Bjorn Helgaas wrote:
> > But on reflection, I think the overall value of this writeup is
> > minimal. It's a lot of repetition of things already documented
> > elsewhere and most of it boils down to "pay attention to existing
> > practice and don't do things differently unless you're innovating and
> > adding value." That *should* be obvious, and if it's not, I doubt
> > that adding one more thing to read is going to make it more obvious.
>
> So my opinion is that your writeup does contain valid points that are
> worth documenting: For an open source project, a top priority is to
> attract and retain contributors who improve the bus factor, who keep
> the code base alive and maintained, thereby avoiding bit rot.
>
> Knowledge diffusion, including documentation of best practices and
> conventions, goes a long way towards that goal. Your writeup was
> mainly from a maintainer perspective: "consistency makes maintenance
> easier". But consistency is also valuable from a contributor
> perspective: It makes it easier to dive into a code base and find
> your way around, and that includes changelogs in the git history.
>
> There are important bits of knowledge in the writeup, if those can
> be distilled, the result would very much be valuable to have in the tree.
>
> Example:
>
> > I generally use
> > "PCI/XXX" for things in the core (mostly capabilities like MSI, AER,
> > DPC, etc) and "PCI: xxx:" for drivers (shpchp, pciehp, etc).
>
> That was in fact unknown to me.
>
> If you find it difficult to put yourself in the shoes of a contributor,
> I could try to rework the document and distill the points I find important.

I definitely want to make it easy and attractive for people to
contribute to Linux in general. I'd be glad for any hints. You're
right, it's hard for me to put myself in the shoes of a contributor,
at least a new one.

I guess I'm hesitant because a lot of the things I included there seem
like they border on the obsessive, and I don't want to ask
contributors to make trivial changes to fit in with my personal
quirks. Since they're my quirks, I'd rather just silently fix them up
as I apply patches.

If there were a wider consensus on some things and they could be
folded into Documentation/process/ somehow, that would be ideal, but
I'm not sure there would be enough of a consensus, and I don't really
want to start a big discussion about it. But maybe there are a few
things that wouldn't be controversial, I dunno.

Bjorn