2019-04-23 16:31:37

by Changbin Du

Subject: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

Hi Corbet and All,
The kernel now uses Sphinx to generate intelligent and beautiful documentation
from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 docs to
reST format in this series.

In this version I combined the ACPI and PCI docs and added the new x86 docs conversion.

The hierarchy of the ACPI docs is based on Corbet's suggestion:
https://lkml.org/lkml/2019/4/3/1047
I made some adjustments according to the content, and the files are finally placed as follows:
Documentation/firmware-guide/acpi/
├── acpi-lid.rst
├── aml-debugger.rst
├── apei
│   ├── einj.rst
│   └── output_format.rst
├── debug.rst
├── dsd
│   ├── data-node-references.rst
│   └── graph.rst
├── DSD-properties-rules.rst
├── enumeration.rst
├── gpio-properties.rst
├── i2c-muxes.rst
├── lpit.rst
├── method-customizing.rst
├── method-tracing.rst
├── namespace.rst
├── osi.rst
└── video_extension.rst
Documentation/driver-api/acpi/
├── linuxized-acpica.rst
└── scan_handlers.rst
Documentation/admin-guide/acpi/
├── cppc_sysfs.rst
├── dsdt-override.rst
├── initrd_table_override.rst
└── ssdt-overlays.rst

The PCI docs are all placed in the driver API guide.
The x86 docs are all placed in the architecture-specific documentation.

To preview the result, please visit the URL below:
http://www.bytemem.com:8080/kernel-doc/index.html

Thank you!


Changbin Du (63):
Documentation: add Linux ACPI to Sphinx TOC tree
Documentation: ACPI: move namespace.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move enumeration.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move osi.txt to firmware-guide/acpi and convert
to reST
Documentation: ACPI: move linuxized-acpica.txt to driver-api/acpi and
convert to reST
Documentation: ACPI: move scan_handlers.txt to driver-api/acpi and
convert to reST
Documentation: ACPI: move DSD-properties-rules.txt to
firmware-guide/acpi and convert to reST
Documentation: ACPI: move gpio-properties.txt to firmware-guide/acpi
and convert to reST
Documentation: ACPI: move method-customizing.txt to
firmware-guide/acpi and convert to reST
Documentation: ACPI: move initrd_table_override.txt to
admin-guide/acpi and convert to reST
Documentation: ACPI: move dsdt-override.txt to admin-guide/acpi and
convert to reST
Documentation: ACPI: move i2c-muxes.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move acpi-lid.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move dsd/graph.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move dsd/data-node-references.txt to
firmware-guide/acpi and convert to reST
Documentation: ACPI: move debug.txt to firmware-guide/acpi and convert
to reST
Documentation: ACPI: move method-tracing.txt to firmware-guide/acpi
and convert to reST
Documentation: ACPI: move aml-debugger.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move apei/output_format.txt to
firmware-guide/acpi and convert to reST
Documentation: ACPI: move apei/einj.txt to firmware-guide/acpi and
convert to reST
Documentation: ACPI: move cppc_sysfs.txt to admin-guide/acpi and
convert to reST
Documentation: ACPI: move lpit.txt to firmware-guide/acpi and convert
to reST
Documentation: ACPI: move ssdt-overlays.txt to admin-guide/acpi and
convert to reST
Documentation: ACPI: move video_extension.txt to firmware-guide/acpi
and convert to reST
Documentation: add Linux PCI to Sphinx TOC tree
Documentation: PCI: convert pci.txt to reST
Documentation: PCI: convert PCIEBUS-HOWTO.txt to reST
Documentation: PCI: convert pci-iov-howto.txt to reST
Documentation: PCI: convert MSI-HOWTO.txt to reST
Documentation: PCI: convert acpi-info.txt to reST
Documentation: PCI: convert pci-error-recovery.txt to reST
Documentation: PCI: convert pcieaer-howto.txt to reST
Documentation: PCI: convert endpoint/pci-endpoint.txt to reST
Documentation: PCI: convert endpoint/pci-endpoint-cfs.txt to reST
Documentation: PCI: convert endpoint/pci-test-function.txt to reST
Documentation: PCI: convert endpoint/pci-test-howto.txt to reST
Documentation: add Linux x86 docs to Sphinx TOC tree
Documentation: x86: convert boot.txt to reST
Documentation: x86: convert topology.txt to reST
Documentation: x86: convert exception-tables.txt to reST
Documentation: x86: convert kernel-stacks to reST
Documentation: x86: convert entry_64.txt to reST
Documentation: x86: convert earlyprintk.txt to reST
Documentation: x86: convert zero-page.txt to reST
Documentation: x86: convert tlb.txt to reST
Documentation: x86: convert mtrr.txt to reST
Documentation: x86: convert pat.txt to reST
Documentation: x86: convert protection-keys.txt to reST
Documentation: x86: convert intel_mpx.txt to reST
Documentation: x86: convert amd-memory-encryption.txt to reST
Documentation: x86: convert pti.txt to reST
Documentation: x86: convert microcode.txt to reST
Documentation: x86: convert resctrl_ui.txt to reST
Documentation: x86: convert orc-unwinder.txt to reST
Documentation: x86: convert usb-legacy-support.txt to reST
Documentation: x86: convert i386/IO-APIC.txt to reST
Documentation: x86: convert x86_64/boot-options.txt to reST
Documentation: x86: convert x86_64/uefi.txt to reST
Documentation: x86: convert x86_64/mm.txt to reST
Documentation: x86: convert x86_64/5level-paging.txt to reST
Documentation: x86: convert x86_64/fake-numa-for-cpusets to reST
Documentation: x86: convert x86_64/cpu-hotplug-spec to reST
Documentation: x86: convert x86_64/machinecheck to reST

.../PCI/{MSI-HOWTO.txt => MSI-HOWTO.rst} | 83 +-
.../{PCIEBUS-HOWTO.txt => PCIEBUS-HOWTO.rst} | 140 +-
.../PCI/{acpi-info.txt => acpi-info.rst} | 11 +-
Documentation/PCI/endpoint/index.rst | 13 +
...-endpoint-cfs.txt => pci-endpoint-cfs.rst} | 99 +-
.../{pci-endpoint.txt => pci-endpoint.rst} | 95 +-
...est-function.txt => pci-test-function.rst} | 32 +-
...{pci-test-howto.txt => pci-test-howto.rst} | 81 +-
Documentation/PCI/index.rst | 18 +
...or-recovery.txt => pci-error-recovery.rst} | 178 +--
.../{pci-iov-howto.txt => pci-iov-howto.rst} | 161 ++-
Documentation/PCI/{pci.txt => pci.rst} | 267 ++--
.../{pcieaer-howto.txt => pcieaer-howto.rst} | 110 +-
Documentation/acpi/aml-debugger.txt | 66 -
Documentation/acpi/apei/output_format.txt | 147 --
Documentation/acpi/i2c-muxes.txt | 58 -
Documentation/acpi/initrd_table_override.txt | 111 --
Documentation/acpi/method-customizing.txt | 73 -
Documentation/acpi/method-tracing.txt | 192 ---
Documentation/acpi/ssdt-overlays.txt | 172 ---
.../acpi/cppc_sysfs.rst} | 71 +-
.../acpi/dsdt-override.rst} | 8 +-
Documentation/admin-guide/acpi/index.rst | 14 +
.../acpi/initrd_table_override.rst | 120 ++
.../admin-guide/acpi/ssdt-overlays.rst | 180 +++
Documentation/admin-guide/index.rst | 1 +
Documentation/driver-api/acpi/index.rst | 9 +
.../acpi/linuxized-acpica.rst} | 115 +-
.../acpi/scan_handlers.rst} | 24 +-
Documentation/driver-api/index.rst | 1 +
.../acpi/DSD-properties-rules.rst} | 21 +-
.../acpi/acpi-lid.rst} | 48 +-
.../firmware-guide/acpi/aml-debugger.rst | 75 +
.../acpi/apei/einj.rst} | 98 +-
.../acpi/apei/output_format.rst | 150 ++
.../acpi/debug.rst} | 31 +-
.../acpi/dsd/data-node-references.rst} | 28 +-
.../acpi/dsd/graph.rst} | 157 +--
.../acpi/enumeration.rst} | 135 +-
.../acpi/gpio-properties.rst} | 78 +-
.../firmware-guide/acpi/i2c-muxes.rst | 61 +
Documentation/firmware-guide/acpi/index.rst | 26 +
.../lpit.txt => firmware-guide/acpi/lpit.rst} | 18 +-
.../acpi/method-customizing.rst | 82 ++
.../firmware-guide/acpi/method-tracing.rst | 225 +++
.../acpi/namespace.rst} | 310 +++--
.../osi.txt => firmware-guide/acpi/osi.rst} | 15 +-
.../acpi/video_extension.rst} | 63 +-
Documentation/firmware-guide/index.rst | 13 +
Documentation/index.rst | 12 +
...cryption.txt => amd-memory-encryption.rst} | 13 +-
Documentation/x86/boot.rst | 1205 +++++++++++++++++
Documentation/x86/boot.txt | 1130 ----------------
Documentation/x86/earlyprintk.rst | 148 ++
Documentation/x86/earlyprintk.txt | 141 --
.../x86/{entry_64.txt => entry_64.rst} | 12 +-
...eption-tables.txt => exception-tables.rst} | 231 ++--
.../x86/i386/{IO-APIC.txt => IO-APIC.rst} | 26 +-
Documentation/x86/i386/index.rst | 10 +
Documentation/x86/index.rst | 30 +
.../x86/{intel_mpx.txt => intel_mpx.rst} | 120 +-
.../x86/{kernel-stacks => kernel-stacks.rst} | 20 +-
.../x86/{microcode.txt => microcode.rst} | 62 +-
Documentation/x86/mtrr.rst | 350 +++++
Documentation/x86/mtrr.txt | 329 -----
.../{orc-unwinder.txt => orc-unwinder.rst} | 27 +-
Documentation/x86/pat.rst | 235 ++++
Documentation/x86/pat.txt | 230 ----
...rotection-keys.txt => protection-keys.rst} | 33 +-
Documentation/x86/{pti.txt => pti.rst} | 19 +-
.../x86/{resctrl_ui.txt => resctrl_ui.rst} | 913 +++++++------
Documentation/x86/{tlb.txt => tlb.rst} | 30 +-
Documentation/x86/topology.rst | 228 ++++
Documentation/x86/topology.txt | 217 ---
...acy-support.txt => usb-legacy-support.rst} | 8 +-
.../{5level-paging.txt => 5level-paging.rst} | 16 +-
Documentation/x86/x86_64/boot-options.rst | 327 +++++
Documentation/x86/x86_64/boot-options.txt | 278 ----
...{cpu-hotplug-spec => cpu-hotplug-spec.rst} | 5 +-
...-for-cpusets => fake-numa-for-cpusets.rst} | 18 +-
Documentation/x86/x86_64/index.rst | 16 +
.../x86_64/{machinecheck => machinecheck.rst} | 11 +-
Documentation/x86/x86_64/mm.rst | 161 +++
Documentation/x86/x86_64/mm.txt | 153 ---
.../x86/x86_64/{uefi.txt => uefi.rst} | 30 +-
Documentation/x86/zero-page.rst | 47 +
Documentation/x86/zero-page.txt | 40 -
MAINTAINERS | 4 +-
88 files changed, 6041 insertions(+), 5128 deletions(-)
rename Documentation/PCI/{MSI-HOWTO.txt => MSI-HOWTO.rst} (88%)
rename Documentation/PCI/{PCIEBUS-HOWTO.txt => PCIEBUS-HOWTO.rst} (70%)
rename Documentation/PCI/{acpi-info.txt => acpi-info.rst} (97%)
create mode 100644 Documentation/PCI/endpoint/index.rst
rename Documentation/PCI/endpoint/{pci-endpoint-cfs.txt => pci-endpoint-cfs.rst} (64%)
rename Documentation/PCI/endpoint/{pci-endpoint.txt => pci-endpoint.rst} (82%)
rename Documentation/PCI/endpoint/{pci-test-function.txt => pci-test-function.rst} (84%)
rename Documentation/PCI/endpoint/{pci-test-howto.txt => pci-test-howto.rst} (78%)
create mode 100644 Documentation/PCI/index.rst
rename Documentation/PCI/{pci-error-recovery.txt => pci-error-recovery.rst} (80%)
rename Documentation/PCI/{pci-iov-howto.txt => pci-iov-howto.rst} (63%)
rename Documentation/PCI/{pci.txt => pci.rst} (78%)
rename Documentation/PCI/{pcieaer-howto.txt => pcieaer-howto.rst} (81%)
delete mode 100644 Documentation/acpi/aml-debugger.txt
delete mode 100644 Documentation/acpi/apei/output_format.txt
delete mode 100644 Documentation/acpi/i2c-muxes.txt
delete mode 100644 Documentation/acpi/initrd_table_override.txt
delete mode 100644 Documentation/acpi/method-customizing.txt
delete mode 100644 Documentation/acpi/method-tracing.txt
delete mode 100644 Documentation/acpi/ssdt-overlays.txt
rename Documentation/{acpi/cppc_sysfs.txt => admin-guide/acpi/cppc_sysfs.rst} (51%)
rename Documentation/{acpi/dsdt-override.txt => admin-guide/acpi/dsdt-override.rst} (56%)
create mode 100644 Documentation/admin-guide/acpi/index.rst
create mode 100644 Documentation/admin-guide/acpi/initrd_table_override.rst
create mode 100644 Documentation/admin-guide/acpi/ssdt-overlays.rst
create mode 100644 Documentation/driver-api/acpi/index.rst
rename Documentation/{acpi/linuxized-acpica.txt => driver-api/acpi/linuxized-acpica.rst} (78%)
rename Documentation/{acpi/scan_handlers.txt => driver-api/acpi/scan_handlers.rst} (90%)
rename Documentation/{acpi/DSD-properties-rules.txt => firmware-guide/acpi/DSD-properties-rules.rst} (88%)
rename Documentation/{acpi/acpi-lid.txt => firmware-guide/acpi/acpi-lid.rst} (77%)
create mode 100644 Documentation/firmware-guide/acpi/aml-debugger.rst
rename Documentation/{acpi/apei/einj.txt => firmware-guide/acpi/apei/einj.rst} (67%)
create mode 100644 Documentation/firmware-guide/acpi/apei/output_format.rst
rename Documentation/{acpi/debug.txt => firmware-guide/acpi/debug.rst} (91%)
rename Documentation/{acpi/dsd/data-node-references.txt => firmware-guide/acpi/dsd/data-node-references.rst} (79%)
rename Documentation/{acpi/dsd/graph.txt => firmware-guide/acpi/dsd/graph.rst} (56%)
rename Documentation/{acpi/enumeration.txt => firmware-guide/acpi/enumeration.rst} (87%)
rename Documentation/{acpi/gpio-properties.txt => firmware-guide/acpi/gpio-properties.rst} (81%)
create mode 100644 Documentation/firmware-guide/acpi/i2c-muxes.rst
create mode 100644 Documentation/firmware-guide/acpi/index.rst
rename Documentation/{acpi/lpit.txt => firmware-guide/acpi/lpit.rst} (68%)
create mode 100644 Documentation/firmware-guide/acpi/method-customizing.rst
create mode 100644 Documentation/firmware-guide/acpi/method-tracing.rst
rename Documentation/{acpi/namespace.txt => firmware-guide/acpi/namespace.rst} (54%)
rename Documentation/{acpi/osi.txt => firmware-guide/acpi/osi.rst} (97%)
rename Documentation/{acpi/video_extension.txt => firmware-guide/acpi/video_extension.rst} (79%)
create mode 100644 Documentation/firmware-guide/index.rst
rename Documentation/x86/{amd-memory-encryption.txt => amd-memory-encryption.rst} (94%)
create mode 100644 Documentation/x86/boot.rst
delete mode 100644 Documentation/x86/boot.txt
create mode 100644 Documentation/x86/earlyprintk.rst
delete mode 100644 Documentation/x86/earlyprintk.txt
rename Documentation/x86/{entry_64.txt => entry_64.rst} (95%)
rename Documentation/x86/{exception-tables.txt => exception-tables.rst} (67%)
rename Documentation/x86/i386/{IO-APIC.txt => IO-APIC.rst} (93%)
create mode 100644 Documentation/x86/i386/index.rst
create mode 100644 Documentation/x86/index.rst
rename Documentation/x86/{intel_mpx.txt => intel_mpx.rst} (75%)
rename Documentation/x86/{kernel-stacks => kernel-stacks.rst} (92%)
rename Documentation/x86/{microcode.txt => microcode.rst} (81%)
create mode 100644 Documentation/x86/mtrr.rst
delete mode 100644 Documentation/x86/mtrr.txt
rename Documentation/x86/{orc-unwinder.txt => orc-unwinder.rst} (93%)
create mode 100644 Documentation/x86/pat.rst
delete mode 100644 Documentation/x86/pat.txt
rename Documentation/x86/{protection-keys.txt => protection-keys.rst} (83%)
rename Documentation/x86/{pti.txt => pti.rst} (95%)
rename Documentation/x86/{resctrl_ui.txt => resctrl_ui.rst} (68%)
rename Documentation/x86/{tlb.txt => tlb.rst} (81%)
create mode 100644 Documentation/x86/topology.rst
delete mode 100644 Documentation/x86/topology.txt
rename Documentation/x86/{usb-legacy-support.txt => usb-legacy-support.rst} (92%)
rename Documentation/x86/x86_64/{5level-paging.txt => 5level-paging.rst} (91%)
create mode 100644 Documentation/x86/x86_64/boot-options.rst
delete mode 100644 Documentation/x86/x86_64/boot-options.txt
rename Documentation/x86/x86_64/{cpu-hotplug-spec => cpu-hotplug-spec.rst} (88%)
rename Documentation/x86/x86_64/{fake-numa-for-cpusets => fake-numa-for-cpusets.rst} (90%)
create mode 100644 Documentation/x86/x86_64/index.rst
rename Documentation/x86/x86_64/{machinecheck => machinecheck.rst} (92%)
create mode 100644 Documentation/x86/x86_64/mm.rst
delete mode 100644 Documentation/x86/x86_64/mm.txt
rename Documentation/x86/x86_64/{uefi.txt => uefi.rst} (79%)
create mode 100644 Documentation/x86/zero-page.rst
delete mode 100644 Documentation/x86/zero-page.txt

--
2.20.1


2019-04-23 16:31:54

by Changbin Du

Subject: [PATCH v4 01/63] Documentation: add Linux ACPI to Sphinx TOC tree

Add the index.rst files below for the ACPI subsystem. More docs will be added later.
o admin-guide/acpi/index.rst
o driver-api/acpi/index.rst
o firmware-guide/acpi/index.rst
o firmware-guide/index.rst
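
As a reviewer's aid, the toctree wiring introduced by these index.rst files can be checked with a small script. The following is a minimal Python sketch and not part of this patch: the helper names `toctree_entries` and `unlisted_docs` are hypothetical, and it assumes plain one-entry-per-line toctrees (no "Title <doc>" entries, no glob options).

```python
from pathlib import Path


def toctree_entries(index_text):
    """Return the document names listed in the toctree directives of an index.rst."""
    entries = []
    in_toctree = False
    for line in index_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".. toctree::"):
            in_toctree = True
            continue
        if not in_toctree or not stripped:
            continue  # outside a toctree, or a blank line inside one
        if not line[0].isspace():
            in_toctree = False  # a dedented line ends the directive body
            continue
        if stripped.startswith(":"):
            continue  # directive option such as :maxdepth:
        entries.append(stripped)
    return entries


def unlisted_docs(directory):
    """Return documents in `directory` that its index.rst toctree does not list.

    Assumption: a subdirectory is referenced as "<subdir>/index", as in this series.
    """
    directory = Path(directory)
    listed = set(toctree_entries((directory / "index.rst").read_text()))
    present = {p.stem for p in directory.glob("*.rst") if p.name != "index.rst"}
    present |= {f"{p.parent.name}/index" for p in directory.glob("*/index.rst")}
    return sorted(present - listed)
```

For example, calling `unlisted_docs("Documentation/firmware-guide/acpi")` in a kernel tree would name the documents in that directory that its index.rst does not reference yet.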

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/admin-guide/acpi/index.rst | 10 ++++++++++
Documentation/admin-guide/index.rst | 1 +
Documentation/driver-api/acpi/index.rst | 7 +++++++
Documentation/driver-api/index.rst | 1 +
Documentation/firmware-guide/acpi/index.rst | 9 +++++++++
Documentation/firmware-guide/index.rst | 13 +++++++++++++
Documentation/index.rst | 10 ++++++++++
7 files changed, 51 insertions(+)
create mode 100644 Documentation/admin-guide/acpi/index.rst
create mode 100644 Documentation/driver-api/acpi/index.rst
create mode 100644 Documentation/firmware-guide/acpi/index.rst
create mode 100644 Documentation/firmware-guide/index.rst

diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
new file mode 100644
index 000000000000..3e041206089d
--- /dev/null
+++ b/Documentation/admin-guide/acpi/index.rst
@@ -0,0 +1,10 @@
+============
+ACPI Support
+============
+
+Here we document in detail how to interact with various mechanisms in
+the Linux ACPI support.
+
+.. toctree::
+ :maxdepth: 1
+
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 0a491676685e..5b8286fdd91b 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -77,6 +77,7 @@ configure specific aspects of kernel behavior to your liking.
LSM/index
mm/index
perf-security
+ acpi/index

.. only:: subproject and html

diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
new file mode 100644
index 000000000000..898b0c60671a
--- /dev/null
+++ b/Documentation/driver-api/acpi/index.rst
@@ -0,0 +1,7 @@
+============
+ACPI Support
+============
+
+.. toctree::
+ :maxdepth: 2
+
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index c0b600ed9961..aa87075c7846 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -56,6 +56,7 @@ available subsections can be seen below.
slimbus
soundwire/index
fpga/index
+ acpi/index

.. only:: subproject and html

diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
new file mode 100644
index 000000000000..0ec7d072ba22
--- /dev/null
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -0,0 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+ACPI Support
+============
+
+.. toctree::
+ :maxdepth: 1
+
diff --git a/Documentation/firmware-guide/index.rst b/Documentation/firmware-guide/index.rst
new file mode 100644
index 000000000000..5355784ca0a2
--- /dev/null
+++ b/Documentation/firmware-guide/index.rst
@@ -0,0 +1,13 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+The Linux kernel firmware guide
+===============================
+
+This section describes the ACPI subsystem in Linux from firmware perspective.
+
+.. toctree::
+ :maxdepth: 1
+
+ acpi/index
+
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 80a421cb935e..fdfa85c56a50 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -35,6 +35,16 @@ trying to get it to work optimally on a given system.

admin-guide/index

+Firmware-related documentation
+------------------------------
+The following holds information on the kernel's expectations regarding the
+platform firmwares.
+
+.. toctree::
+ :maxdepth: 2
+
+ firmware-guide/index
+
Application-developer documentation
-----------------------------------

--
2.20.1

2019-04-23 16:32:08

by Changbin Du

Subject: [PATCH v4 02/63] Documentation: ACPI: move namespace.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.
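
One recurring step in this kind of conversion is turning numbered plain-text section titles such as "1. ACPI Definition Blocks" into reST headings with an underline. The following is a minimal Python sketch of that single step, given for illustration only. It is not the method used to produce this patch; the function name `convert_headings` is hypothetical, and it assumes section titles start at column 0 while numbered list items are indented.

```python
import re

# A section title: a number, a dot and the title text, starting at column 0.
HEADING = re.compile(r"^\d+\.\s+(?P<title>\S.*)$")


def convert_headings(text, underline="="):
    """Replace 'N. Title' lines with 'Title' followed by a matching underline."""
    out = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            title = match.group("title").rstrip()
            out.append(title)
            out.append(underline * len(title))
        else:
            out.append(line)  # indented list items and body text pass through
    return "\n".join(out)
```

The result still needs manual review, for instance to add the overline used for the document title and to mark diagrams as literal blocks.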

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/firmware-guide/acpi/index.rst | 1 +
.../acpi/namespace.rst} | 310 +++++++++---------
2 files changed, 161 insertions(+), 150 deletions(-)
rename Documentation/{acpi/namespace.txt => firmware-guide/acpi/namespace.rst} (54%)

diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 0ec7d072ba22..210ad8acd6df 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -7,3 +7,4 @@ ACPI Support
.. toctree::
:maxdepth: 1

+ namespace
diff --git a/Documentation/acpi/namespace.txt b/Documentation/firmware-guide/acpi/namespace.rst
similarity index 54%
rename from Documentation/acpi/namespace.txt
rename to Documentation/firmware-guide/acpi/namespace.rst
index 1860cb3865c6..443f0e5d0617 100644
--- a/Documentation/acpi/namespace.txt
+++ b/Documentation/firmware-guide/acpi/namespace.rst
@@ -1,85 +1,88 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===================================================
ACPI Device Tree - Representation of ACPI Namespace
+===================================================
+
+:Copyright: |copy| 2013, Intel Corporation
+
+:Author: Lv Zheng <[email protected]>
+
+:Abstract: The Linux ACPI subsystem converts ACPI namespace objects into a Linux
+ device tree under the /sys/devices/LNXSYSTEM:00 and updates it upon
+ receiving ACPI hotplug notification events. For each device object
+ in this hierarchy there is a corresponding symbolic link in the
+ /sys/bus/acpi/devices.
+ This document illustrates the structure of the ACPI device tree.
+
+:Credit: Thanks for the help from Zhang Rui <[email protected]> and
+ Rafael J.Wysocki <[email protected]>.
+
+
+ACPI Definition Blocks
+======================
+
+The ACPI firmware sets up RSDP (Root System Description Pointer) in the
+system memory address space pointing to the XSDT (Extended System
+Description Table). The XSDT always points to the FADT (Fixed ACPI
+Description Table) using its first entry, the data within the FADT
+includes various fixed-length entries that describe fixed ACPI features
+of the hardware. The FADT contains a pointer to the DSDT
+(Differentiated System Descripition Table). The XSDT also contains
+entries pointing to possibly multiple SSDTs (Secondary System
+Description Table).
+
+The DSDT and SSDT data is organized in data structures called definition
+blocks that contain definitions of various objects, including ACPI
+control methods, encoded in AML (ACPI Machine Language). The data block
+of the DSDT along with the contents of SSDTs represents a hierarchical
+data structure called the ACPI namespace whose topology reflects the
+structure of the underlying hardware platform.
+
+The relationships between ACPI System Definition Tables described above
+are illustrated in the following diagram::
+
+ +---------+ +-------+ +--------+ +------------------------+
+ | RSDP | +->| XSDT | +->| FADT | | +-------------------+ |
+ +---------+ | +-------+ | +--------+ +-|->| DSDT | |
+ | Pointer | | | Entry |-+ | ...... | | | +-------------------+ |
+ +---------+ | +-------+ | X_DSDT |--+ | | Definition Blocks | |
+ | Pointer |-+ | ..... | | ...... | | +-------------------+ |
+ +---------+ +-------+ +--------+ | +-------------------+ |
+ | Entry |------------------|->| SSDT | |
+ +- - - -+ | +-------------------| |
+ | Entry | - - - - - - - -+ | | Definition Blocks | |
+ +- - - -+ | | +-------------------+ |
+ | | +- - - - - - - - - -+ |
+ +-|->| SSDT | |
+ | +-------------------+ |
+ | | Definition Blocks | |
+ | +- - - - - - - - - -+ |
+ +------------------------+
+ |
+ OSPM Loading |
+ \|/
+ +----------------+
+ | ACPI Namespace |
+ +----------------+
+
+ Figure 1. ACPI Definition Blocks
+
+.. note:: RSDP can also contain a pointer to the RSDT (Root System
+ Description Table). Platforms provide RSDT to enable
+ compatibility with ACPI 1.0 operating systems. The OS is expected
+ to use XSDT, if present.
+
+
+Example ACPI Namespace
+======================
+
+All definition blocks are loaded into a single namespace. The namespace
+is a hierarchy of objects identified by names and paths.
+The following naming conventions apply to object names in the ACPI
+namespace:

-Copyright (C) 2013, Intel Corporation
-Author: Lv Zheng <[email protected]>
-
-
-Abstract:
-
-The Linux ACPI subsystem converts ACPI namespace objects into a Linux
-device tree under the /sys/devices/LNXSYSTEM:00 and updates it upon
-receiving ACPI hotplug notification events. For each device object in this
-hierarchy there is a corresponding symbolic link in the
-/sys/bus/acpi/devices.
-This document illustrates the structure of the ACPI device tree.
-
-
-Credit:
-
-Thanks for the help from Zhang Rui <[email protected]> and Rafael J.
-Wysocki <[email protected]>.
-
-
-1. ACPI Definition Blocks
-
- The ACPI firmware sets up RSDP (Root System Description Pointer) in the
- system memory address space pointing to the XSDT (Extended System
- Description Table). The XSDT always points to the FADT (Fixed ACPI
- Description Table) using its first entry, the data within the FADT
- includes various fixed-length entries that describe fixed ACPI features
- of the hardware. The FADT contains a pointer to the DSDT
- (Differentiated System Descripition Table). The XSDT also contains
- entries pointing to possibly multiple SSDTs (Secondary System
- Description Table).
-
- The DSDT and SSDT data is organized in data structures called definition
- blocks that contain definitions of various objects, including ACPI
- control methods, encoded in AML (ACPI Machine Language). The data block
- of the DSDT along with the contents of SSDTs represents a hierarchical
- data structure called the ACPI namespace whose topology reflects the
- structure of the underlying hardware platform.
-
- The relationships between ACPI System Definition Tables described above
- are illustrated in the following diagram.
-
- +---------+ +-------+ +--------+ +------------------------+
- | RSDP | +->| XSDT | +->| FADT | | +-------------------+ |
- +---------+ | +-------+ | +--------+ +-|->| DSDT | |
- | Pointer | | | Entry |-+ | ...... | | | +-------------------+ |
- +---------+ | +-------+ | X_DSDT |--+ | | Definition Blocks | |
- | Pointer |-+ | ..... | | ...... | | +-------------------+ |
- +---------+ +-------+ +--------+ | +-------------------+ |
- | Entry |------------------|->| SSDT | |
- +- - - -+ | +-------------------| |
- | Entry | - - - - - - - -+ | | Definition Blocks | |
- +- - - -+ | | +-------------------+ |
- | | +- - - - - - - - - -+ |
- +-|->| SSDT | |
- | +-------------------+ |
- | | Definition Blocks | |
- | +- - - - - - - - - -+ |
- +------------------------+
- |
- OSPM Loading |
- \|/
- +----------------+
- | ACPI Namespace |
- +----------------+
-
- Figure 1. ACPI Definition Blocks
-
- NOTE: RSDP can also contain a pointer to the RSDT (Root System
- Description Table). Platforms provide RSDT to enable
- compatibility with ACPI 1.0 operating systems. The OS is expected
- to use XSDT, if present.
-
-
-2. Example ACPI Namespace
-
- All definition blocks are loaded into a single namespace. The namespace
- is a hierarchy of objects identified by names and paths.
- The following naming conventions apply to object names in the ACPI
- namespace:
1. All names are 32 bits long.
2. The first byte of a name must be one of 'A' - 'Z', '_'.
3. Each of the remaining bytes of a name must be one of 'A' - 'Z', '0'
@@ -91,7 +94,7 @@ Wysocki <[email protected]>.
(i.e. names prepended with '^' are relative to the parent of the
current namespace node).

- The figure below shows an example ACPI namespace.
+The figure below shows an example ACPI namespace::

+------+
| \ | Root
@@ -184,19 +187,20 @@ Wysocki <[email protected]>.
Figure 2. Example ACPI Namespace


-3. Linux ACPI Device Objects
+Linux ACPI Device Objects
+=========================

- The Linux kernel's core ACPI subsystem creates struct acpi_device
- objects for ACPI namespace objects representing devices, power resources
- processors, thermal zones. Those objects are exported to user space via
- sysfs as directories in the subtree under /sys/devices/LNXSYSTM:00. The
- format of their names is <bus_id:instance>, where 'bus_id' refers to the
- ACPI namespace representation of the given object and 'instance' is used
- for distinguishing different object of the same 'bus_id' (it is
- two-digit decimal representation of an unsigned integer).
+The Linux kernel's core ACPI subsystem creates struct acpi_device
+objects for ACPI namespace objects representing devices, power resources
+processors, thermal zones. Those objects are exported to user space via
+sysfs as directories in the subtree under /sys/devices/LNXSYSTM:00. The
+format of their names is <bus_id:instance>, where 'bus_id' refers to the
+ACPI namespace representation of the given object and 'instance' is used
+for distinguishing different object of the same 'bus_id' (it is
+two-digit decimal representation of an unsigned integer).

- The value of 'bus_id' depends on the type of the object whose name it is
- part of as listed in the table below.
+The value of 'bus_id' depends on the type of the object whose name it is
+part of as listed in the table below::

+---+-----------------+-------+----------+
| | Object/Feature | Table | bus_id |
@@ -226,10 +230,11 @@ Wysocki <[email protected]>.

Table 1. ACPI Namespace Objects Mapping

- The following rules apply when creating struct acpi_device objects on
- the basis of the contents of ACPI System Description Tables (as
- indicated by the letter in the first column and the notation in the
- second column of the table above):
+The following rules apply when creating struct acpi_device objects on
+the basis of the contents of ACPI System Description Tables (as
+indicated by the letter in the first column and the notation in the
+second column of the table above):
+
N:
The object's source is an ACPI namespace node (as indicated by the
named object's type in the second column). In that case the object's
@@ -249,13 +254,14 @@ Wysocki <[email protected]>.
struct acpi_device object with LNXVIDEO 'bus_id' will be created for
it.

- The third column of the above table indicates which ACPI System
- Description Tables contain information used for the creation of the
- struct acpi_device objects represented by the given row (xSDT means DSDT
- or SSDT).
+The third column of the above table indicates which ACPI System
+Description Tables contain information used for the creation of the
+struct acpi_device objects represented by the given row (xSDT means DSDT
+or SSDT).
+
+The forth column of the above table indicates the 'bus_id' generation
+rule of the struct acpi_device object:

- The forth column of the above table indicates the 'bus_id' generation
- rule of the struct acpi_device object:
_HID:
_HID in the last column of the table means that the object's bus_id
is derived from the _HID/_CID identification objects present under
@@ -275,45 +281,47 @@ Wysocki <[email protected]>.
object's bus_id.


-4. Linux ACPI Physical Device Glue
-
- ACPI device (i.e. struct acpi_device) objects may be linked to other
- objects in the Linux' device hierarchy that represent "physical" devices
- (for example, devices on the PCI bus). If that happens, it means that
- the ACPI device object is a "companion" of a device otherwise
- represented in a different way and is used (1) to provide configuration
- information on that device which cannot be obtained by other means and
- (2) to do specific things to the device with the help of its ACPI
- control methods. One ACPI device object may be linked this way to
- multiple "physical" devices.
-
- If an ACPI device object is linked to a "physical" device, its sysfs
- directory contains the "physical_node" symbolic link to the sysfs
- directory of the target device object. In turn, the target device's
- sysfs directory will then contain the "firmware_node" symbolic link to
- the sysfs directory of the companion ACPI device object.
- The linking mechanism relies on device identification provided by the
- ACPI namespace. For example, if there's an ACPI namespace object
- representing a PCI device (i.e. a device object under an ACPI namespace
- object representing a PCI bridge) whose _ADR returns 0x00020000 and the
- bus number of the parent PCI bridge is 0, the sysfs directory
- representing the struct acpi_device object created for that ACPI
- namespace object will contain the 'physical_node' symbolic link to the
- /sys/devices/pci0000:00/0000:00:02:0/ sysfs directory of the
- corresponding PCI device.
-
- The linking mechanism is generally bus-specific. The core of its
- implementation is located in the drivers/acpi/glue.c file, but there are
- complementary parts depending on the bus types in question located
- elsewhere. For example, the PCI-specific part of it is located in
- drivers/pci/pci-acpi.c.
-
-
-5. Example Linux ACPI Device Tree
-
- The sysfs hierarchy of struct acpi_device objects corresponding to the
- example ACPI namespace illustrated in Figure 2 with the addition of
- fixed PWR_BUTTON/SLP_BUTTON devices is shown below.
+Linux ACPI Physical Device Glue
+===============================
+
+ACPI device (i.e. struct acpi_device) objects may be linked to other
+objects in the Linux device hierarchy that represent "physical" devices
+(for example, devices on the PCI bus). If that happens, it means that
+the ACPI device object is a "companion" of a device otherwise
+represented in a different way and is used (1) to provide configuration
+information on that device which cannot be obtained by other means and
+(2) to do specific things to the device with the help of its ACPI
+control methods. One ACPI device object may be linked this way to
+multiple "physical" devices.
+
+If an ACPI device object is linked to a "physical" device, its sysfs
+directory contains the "physical_node" symbolic link to the sysfs
+directory of the target device object. In turn, the target device's
+sysfs directory will then contain the "firmware_node" symbolic link to
+the sysfs directory of the companion ACPI device object.
+The linking mechanism relies on device identification provided by the
+ACPI namespace. For example, if there's an ACPI namespace object
+representing a PCI device (i.e. a device object under an ACPI namespace
+object representing a PCI bridge) whose _ADR returns 0x00020000 and the
+bus number of the parent PCI bridge is 0, the sysfs directory
+representing the struct acpi_device object created for that ACPI
+namespace object will contain the 'physical_node' symbolic link to the
+/sys/devices/pci0000:00/0000:00:02.0/ sysfs directory of the
+corresponding PCI device.
+
+The linking mechanism is generally bus-specific. The core of its
+implementation is located in the drivers/acpi/glue.c file, but there are
+complementary parts depending on the bus types in question located
+elsewhere. For example, the PCI-specific part of it is located in
+drivers/pci/pci-acpi.c.
+
+
+Example Linux ACPI Device Tree
+==============================
+
+The sysfs hierarchy of struct acpi_device objects corresponding to the
+example ACPI namespace illustrated in Figure 2 with the addition of
+fixed PWR_BUTTON/SLP_BUTTON devices is shown below::

+--------------+---+-----------------+
| LNXSYSTEM:00 | \ | acpi:LNXSYSTEM: |
@@ -377,12 +385,14 @@ Wysocki <[email protected]>.

Figure 3. Example Linux ACPI Device Tree

- NOTE: Each node is represented as "object/path/modalias", where:
- 1. 'object' is the name of the object's directory in sysfs.
- 2. 'path' is the ACPI namespace path of the corresponding
- ACPI namespace object, as returned by the object's 'path'
- sysfs attribute.
- 3. 'modalias' is the value of the object's 'modalias' sysfs
- attribute (as described earlier in this document).
- NOTE: N/A indicates the device object does not have the 'path' or the
- 'modalias' attribute.
+.. note:: Each node is represented as "object/path/modalias", where:
+
+ 1. 'object' is the name of the object's directory in sysfs.
+ 2. 'path' is the ACPI namespace path of the corresponding
+ ACPI namespace object, as returned by the object's 'path'
+ sysfs attribute.
+ 3. 'modalias' is the value of the object's 'modalias' sysfs
+ attribute (as described earlier in this document).
+
+.. note:: N/A indicates the device object does not have the 'path' or the
+ 'modalias' attribute.
--
2.20.1
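
A note on the _ADR example in the "Linux ACPI Physical Device Glue" section of this patch: for a device on a PCI bus, _ADR carries the device number in its high 16 bits and the function number in its low 16 bits, and PCI device names in sysfs take the form domain:bus:device.function. The following is only a user-space sketch of that mapping; the helper names are made up for illustration and are not kernel functions.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * For a device on a PCI bus, _ADR packs the device number into the high
 * 16 bits and the function number into the low 16 bits.
 */
static unsigned int adr_to_pci_device(uint32_t adr)
{
	return (adr >> 16) & 0xffff;
}

static unsigned int adr_to_pci_function(uint32_t adr)
{
	return adr & 0xffff;
}

/*
 * Format the sysfs device name "domain:bus:device.function".
 * The domain and bus numbers come from the parent bridge, not from _ADR.
 * (Hypothetical helper, for illustration only.)
 */
static int pci_sysfs_name(char *buf, size_t len, unsigned int domain,
			  unsigned int bus, uint32_t adr)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%x", domain, bus,
			adr_to_pci_device(adr), adr_to_pci_function(adr));
}
```

With domain 0, bus 0 and _ADR 0x00020000, pci_sysfs_name() yields the name of device 2, function 0 under /sys/devices/pci0000:00/.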

2019-04-23 16:32:11

by Changbin Du

Subject: [PATCH v4 03/63] Documentation: ACPI: move enumeration.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/enumeration.rst} | 135 ++++++++++--------
Documentation/firmware-guide/acpi/index.rst | 1 +
2 files changed, 74 insertions(+), 62 deletions(-)
rename Documentation/{acpi/enumeration.txt => firmware-guide/acpi/enumeration.rst} (87%)

diff --git a/Documentation/acpi/enumeration.txt b/Documentation/firmware-guide/acpi/enumeration.rst
similarity index 87%
rename from Documentation/acpi/enumeration.txt
rename to Documentation/firmware-guide/acpi/enumeration.rst
index 7bcf9c3d9fbe..ce755e963714 100644
--- a/Documentation/acpi/enumeration.txt
+++ b/Documentation/firmware-guide/acpi/enumeration.rst
@@ -1,5 +1,9 @@
-ACPI based device enumeration
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
+ACPI Based Device Enumeration
+=============================
+
ACPI 5 introduced a set of new resources (UartTSerialBus, I2cSerialBus,
SpiSerialBus, GpioIo and GpioInt) which can be used in enumerating slave
devices behind serial bus controllers.
@@ -11,12 +15,12 @@ that are accessed through memory-mapped registers.
In order to support this and re-use the existing drivers as much as
possible we decided to do following:

- o Devices that have no bus connector resource are represented as
- platform devices.
+ - Devices that have no bus connector resource are represented as
+ platform devices.

- o Devices behind real busses where there is a connector resource
- are represented as struct spi_device or struct i2c_device
- (standard UARTs are not busses so there is no struct uart_device).
+ - Devices behind real busses where there is a connector resource
+ are represented as struct spi_device or struct i2c_device
+ (standard UARTs are not busses so there is no struct uart_device).

As both ACPI and Device Tree represent a tree of devices (and their
resources) this implementation follows the Device Tree way as much as
@@ -31,7 +35,8 @@ enumerated from ACPI namespace. This handle can be used to extract other
device-specific configuration. There is an example of this below.

Platform bus support
-~~~~~~~~~~~~~~~~~~~~
+====================
+
Since we are using platform devices to represent devices that are not
connected to any physical bus we only need to implement a platform driver
for the device and add supported ACPI IDs. If this same IP-block is used on
@@ -39,7 +44,7 @@ some other non-ACPI platform, the driver might work out of the box or needs
some minor changes.

Adding ACPI support for an existing driver should be pretty
-straightforward. Here is the simplest example:
+straightforward. Here is the simplest example::

#ifdef CONFIG_ACPI
static const struct acpi_device_id mydrv_acpi_match[] = {
@@ -61,12 +66,13 @@ configuring GPIOs it can get its ACPI handle and extract this information
from ACPI tables.

DMA support
-~~~~~~~~~~~
+===========
+
DMA controllers enumerated via ACPI should be registered in the system to
provide generic access to their resources. For example, a driver that would
like to be accessible to slave devices via generic API call
dma_request_slave_channel() must register itself at the end of the probe
-function like this:
+function like this::

err = devm_acpi_dma_controller_register(dev, xlate_func, dw);
/* Handle the error if it's not a case of !CONFIG_ACPI */
@@ -74,7 +80,7 @@ function like this:
and implement custom xlate function if needed (usually acpi_dma_simple_xlate()
is enough) which converts the FixedDMA resource provided by struct
acpi_dma_spec into the corresponding DMA channel. A piece of code for that case
-could look like:
+could look like::

#ifdef CONFIG_ACPI
struct filter_args {
@@ -114,7 +120,7 @@ provided by struct acpi_dma.
Clients must call dma_request_slave_channel() with the string parameter that
corresponds to a specific FixedDMA resource. By default "tx" means the first
entry of the FixedDMA resource array, "rx" means the second entry. The table
-below shows a layout:
+below shows a layout::

Device (I2C0)
{
@@ -138,12 +144,13 @@ acpi_dma_request_slave_chan_by_index() directly and therefore choose the
specific FixedDMA resource by its index.

SPI serial bus support
-~~~~~~~~~~~~~~~~~~~~~~
+======================
+
Slave devices behind SPI bus have SpiSerialBus resource attached to them.
This is extracted automatically by the SPI core and the slave devices are
enumerated once spi_register_master() is called by the bus driver.

-Here is what the ACPI namespace for a SPI slave might look like:
+Here is what the ACPI namespace for a SPI slave might look like::

Device (EEP0)
{
@@ -163,7 +170,7 @@ Here is what the ACPI namespace for a SPI slave might look like:

The SPI device drivers only need to add ACPI IDs in a similar way than with
the platform device drivers. Below is an example where we add ACPI support
-to at25 SPI eeprom driver (this is meant for the above ACPI snippet):
+to at25 SPI eeprom driver (this is meant for the above ACPI snippet)::

#ifdef CONFIG_ACPI
static const struct acpi_device_id at25_acpi_match[] = {
@@ -182,7 +189,7 @@ to at25 SPI eeprom driver (this is meant for the above ACPI snippet):

Note that this driver actually needs more information like page size of the
eeprom etc. but at the time writing this there is no standard way of
-passing those. One idea is to return this in _DSM method like:
+passing those. One idea is to return this in _DSM method like::

Device (EEP0)
{
@@ -202,7 +209,7 @@ passing those. One idea is to return this in _DSM method like:
}

Then the at25 SPI driver can get this configuration by calling _DSM on its
-ACPI handle like:
+ACPI handle like::

struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
struct acpi_object_list input;
@@ -220,14 +227,15 @@ ACPI handle like:
kfree(output.pointer);

I2C serial bus support
-~~~~~~~~~~~~~~~~~~~~~~
+======================
+
The slaves behind I2C bus controller only need to add the ACPI IDs like
with the platform and SPI drivers. The I2C core automatically enumerates
any slave devices behind the controller device once the adapter is
registered.

Below is an example of how to add ACPI support to the existing mpu3050
-input driver:
+input driver::

#ifdef CONFIG_ACPI
static const struct acpi_device_id mpu3050_acpi_match[] = {
@@ -251,56 +259,57 @@ input driver:
};

GPIO support
-~~~~~~~~~~~~
+============
+
ACPI 5 introduced two new resources to describe GPIO connections: GpioIo
and GpioInt. These resources can be used to pass GPIO numbers used by
the device to the driver. ACPI 5.1 extended this with _DSD (Device
Specific Data) which made it possible to name the GPIOs among other things.

-For example:
+For example::

-Device (DEV)
-{
- Method (_CRS, 0, NotSerialized)
+ Device (DEV)
{
- Name (SBUF, ResourceTemplate()
+ Method (_CRS, 0, NotSerialized)
{
- ...
- // Used to power on/off the device
- GpioIo (Exclusive, PullDefault, 0x0000, 0x0000,
- IoRestrictionOutputOnly, "\\_SB.PCI0.GPI0",
- 0x00, ResourceConsumer,,)
+ Name (SBUF, ResourceTemplate()
{
- // Pin List
- 0x0055
- }
+ ...
+ // Used to power on/off the device
+ GpioIo (Exclusive, PullDefault, 0x0000, 0x0000,
+ IoRestrictionOutputOnly, "\\_SB.PCI0.GPI0",
+ 0x00, ResourceConsumer,,)
+ {
+ // Pin List
+ 0x0055
+ }
+
+ // Interrupt for the device
+ GpioInt (Edge, ActiveHigh, ExclusiveAndWake, PullNone,
+ 0x0000, "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer,,)
+ {
+ // Pin list
+ 0x0058
+ }
+
+ ...

- // Interrupt for the device
- GpioInt (Edge, ActiveHigh, ExclusiveAndWake, PullNone,
- 0x0000, "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer,,)
- {
- // Pin list
- 0x0058
}

- ...
-
+ Return (SBUF)
}

- Return (SBUF)
- }
-
- // ACPI 5.1 _DSD used for naming the GPIOs
- Name (_DSD, Package ()
- {
- ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
- Package ()
+ // ACPI 5.1 _DSD used for naming the GPIOs
+ Name (_DSD, Package ()
{
- Package () {"power-gpios", Package() {^DEV, 0, 0, 0 }},
- Package () {"irq-gpios", Package() {^DEV, 1, 0, 0 }},
- }
- })
- ...
+ ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+ Package ()
+ {
+ Package () {"power-gpios", Package() {^DEV, 0, 0, 0 }},
+ Package () {"irq-gpios", Package() {^DEV, 1, 0, 0 }},
+ }
+ })
+ ...

These GPIO numbers are controller relative and path "\\_SB.PCI0.GPI0"
specifies the path to the controller. In order to use these GPIOs in Linux
@@ -310,7 +319,7 @@ There is a standard GPIO API for that and is documented in
Documentation/gpio/.

In the above example we can get the corresponding two GPIO descriptors with
-a code like this:
+a code like this::

#include <linux/gpio/consumer.h>
...
@@ -334,21 +343,22 @@ See Documentation/acpi/gpio-properties.txt for more information about the
_DSD binding related to GPIOs.

MFD devices
-~~~~~~~~~~~
+===========
+
The MFD devices register their children as platform devices. For the child
devices there needs to be an ACPI handle that they can use to reference
parts of the ACPI namespace that relate to them. In the Linux MFD subsystem
we provide two ways:

- o The children share the parent ACPI handle.
- o The MFD cell can specify the ACPI id of the device.
+ - The children share the parent ACPI handle.
+ - The MFD cell can specify the ACPI id of the device.

For the first case, the MFD drivers do not need to do anything. The
resulting child platform device will have its ACPI_COMPANION() set to point
to the parent device.

If the ACPI namespace has a device that we can match using an ACPI id or ACPI
-adr, the cell should be set like:
+adr, the cell should be set like::

static struct mfd_cell_acpi_match my_subdevice_cell_acpi_match = {
.pnpid = "XYZ0001",
@@ -366,7 +376,8 @@ the MFD device and if found, that ACPI companion device is bound to the
resulting child platform device.

Device Tree namespace link device ID
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+====================================
+
The Device Tree protocol uses device identification based on the "compatible"
property whose value is a string or an array of strings recognized as device
identifiers by drivers and the driver core. The set of all those strings may be
@@ -423,4 +434,4 @@ the _DSD of the device object itself or the _DSD of its ancestor in the
Otherwise, the _DSD itself is regarded as invalid and therefore the "compatible"
property returned by it is meaningless.

-Refer to DSD-properties-rules.txt for more information.
+Refer to :doc:`DSD-properties-rules` for more information.
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 210ad8acd6df..99677c73f1fb 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -8,3 +8,4 @@ ACPI Support
:maxdepth: 1

namespace
+ enumeration
--
2.20.1
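
A note on the DMA section of this document: it states that, by default, "tx" refers to the first entry of the FixedDMA resource array and "rx" to the second. That default can be sketched in plain C as below. The function name is hypothetical and this is not the kernel implementation, only an illustration of the name-to-index rule.

```c
#include <string.h>

/*
 * Map a slave channel name to an index into the FixedDMA resource array,
 * following the default described in the document.
 * Returns -1 for names that have no default mapping.
 * (Hypothetical helper, for illustration only.)
 */
static int fixed_dma_index_by_name(const char *name)
{
	if (strcmp(name, "tx") == 0)
		return 0;	/* first FixedDMA entry */
	if (strcmp(name, "rx") == 0)
		return 1;	/* second FixedDMA entry */
	return -1;
}
```

A driver whose resources do not follow this layout would instead pick the FixedDMA entry by index, as the document describes for acpi_dma_request_slave_chan_by_index().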

2019-04-23 16:32:22

by Changbin Du

Subject: [PATCH v4 04/63] Documentation: ACPI: move osi.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/firmware-guide/acpi/index.rst | 1 +
.../{acpi/osi.txt => firmware-guide/acpi/osi.rst} | 15 +++++++++------
2 files changed, 10 insertions(+), 6 deletions(-)
rename Documentation/{acpi/osi.txt => firmware-guide/acpi/osi.rst} (97%)

diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 99677c73f1fb..868bd25a3398 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -9,3 +9,4 @@ ACPI Support

namespace
enumeration
+ osi
diff --git a/Documentation/acpi/osi.txt b/Documentation/firmware-guide/acpi/osi.rst
similarity index 97%
rename from Documentation/acpi/osi.txt
rename to Documentation/firmware-guide/acpi/osi.rst
index 50cde0ceb9b0..29e9ef79ebc0 100644
--- a/Documentation/acpi/osi.txt
+++ b/Documentation/firmware-guide/acpi/osi.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
ACPI _OSI and _REV methods
---------------------------
+==========================

An ACPI BIOS can use the "Operating System Interfaces" method (_OSI)
to find out what the operating system supports. Eg. If BIOS
@@ -14,7 +17,7 @@ This document explains how and why the BIOS and Linux should use these methods.
It also explains how and why they are widely misused.

How to use _OSI
----------------
+===============

Linux runs on two groups of machines -- those that are tested by the OEM
to be compatible with Linux, and those that were never tested with Linux,
@@ -62,7 +65,7 @@ the string when that support is added to the kernel.
That was easy. Read on, to find out how to do it wrong.

Before _OSI, there was _OS
---------------------------
+==========================

ACPI 1.0 specified "_OS" as an
"object that evaluates to a string that identifies the operating system."
@@ -96,7 +99,7 @@ That is the *only* viable strategy, as that is what modern Windows does,
and so doing otherwise could steer the BIOS down an untested path.

_OSI is born, and immediately misused
---------------------------------------
+=====================================

With _OSI, the *BIOS* provides the string describing an interface,
and asks the OS: "YES/NO, are you compatible with this interface?"
@@ -144,7 +147,7 @@ catastrophic failure resulting from the BIOS taking paths that
were never validated under *any* OS.

Do not use _REV
----------------
+===============

Since _OSI("Linux") went away, some BIOS writers used _REV
to support Linux and Windows differences in the same BIOS.
@@ -164,7 +167,7 @@ from mid-2015 onward. The ACPI specification will also be updated
to reflect that _REV is deprecated, and always returns 2.

Apple Mac and _OSI("Darwin")
-----------------------------
+============================

On Apple's Mac platforms, the ACPI BIOS invokes _OSI("Darwin")
to determine if the machine is running Apple OSX.
--
2.20.1
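
A note on the _OSI model described in this document: the BIOS supplies an interface string and the OS answers yes or no. A minimal sketch of the OS side is a lookup in a table of supported strings. The table contents and the function name below are assumptions chosen for illustration; they are not the list the kernel actually uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list of interface strings this OS claims to support. */
static const char *const supported_interfaces[] = {
	"Windows 2009",
	"Windows 2012",
};

/*
 * Answer an _OSI-style query: true if the interface string is in the
 * table, false otherwise. (Hypothetical helper, for illustration only.)
 */
static bool osi_query(const char *interface)
{
	size_t i;
	size_t n = sizeof(supported_interfaces) / sizeof(supported_interfaces[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(supported_interfaces[i], interface) == 0)
			return true;
	}
	return false;
}
```

"Linux" is deliberately absent from the table, reflecting the document's statement that _OSI("Linux") went away.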

2019-04-23 16:32:41

by Changbin Du

Subject: [PATCH v4 05/63] Documentation: ACPI: move linuxized-acpica.txt to driver-api/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/driver-api/acpi/index.rst | 1 +
.../acpi/linuxized-acpica.rst} | 115 ++++++++++--------
2 files changed, 66 insertions(+), 50 deletions(-)
rename Documentation/{acpi/linuxized-acpica.txt => driver-api/acpi/linuxized-acpica.rst} (78%)

diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
index 898b0c60671a..12649947b19b 100644
--- a/Documentation/driver-api/acpi/index.rst
+++ b/Documentation/driver-api/acpi/index.rst
@@ -5,3 +5,4 @@ ACPI Support
.. toctree::
:maxdepth: 2

+ linuxized-acpica
diff --git a/Documentation/acpi/linuxized-acpica.txt b/Documentation/driver-api/acpi/linuxized-acpica.rst
similarity index 78%
rename from Documentation/acpi/linuxized-acpica.txt
rename to Documentation/driver-api/acpi/linuxized-acpica.rst
index 3ad7b0dfb083..f8aaea668e41 100644
--- a/Documentation/acpi/linuxized-acpica.txt
+++ b/Documentation/driver-api/acpi/linuxized-acpica.rst
@@ -1,31 +1,35 @@
-Linuxized ACPICA - Introduction to ACPICA Release Automation
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

-Copyright (C) 2013-2016, Intel Corporation
-Author: Lv Zheng <[email protected]>
+============================================================
+Linuxized ACPICA - Introduction to ACPICA Release Automation
+============================================================

+:Copyright: |copy| 2013-2016, Intel Corporation

-Abstract:
+:Author: Lv Zheng <[email protected]>

-This document describes the ACPICA project and the relationship between
-ACPICA and Linux. It also describes how ACPICA code in drivers/acpi/acpica,
-include/acpi and tools/power/acpi is automatically updated to follow the
-upstream.
+:Abstract: This document describes the ACPICA project and the relationship
+ between ACPICA and Linux. It also describes how ACPICA code in
+ drivers/acpi/acpica, include/acpi and tools/power/acpi is
+ automatically updated to follow the upstream.


-1. ACPICA Project
+ACPICA Project
+==============

- The ACPI Component Architecture (ACPICA) project provides an operating
- system (OS)-independent reference implementation of the Advanced
- Configuration and Power Interface Specification (ACPI). It has been
- adapted by various host OSes. By directly integrating ACPICA, Linux can
- also benefit from the application experiences of ACPICA from other host
- OSes.
+The ACPI Component Architecture (ACPICA) project provides an operating
+system (OS)-independent reference implementation of the Advanced
+Configuration and Power Interface Specification (ACPI). It has been
+adapted by various host OSes. By directly integrating ACPICA, Linux can
+also benefit from the application experiences of ACPICA from other host
+OSes.

- The homepage of ACPICA project is: http://www.acpica.org, it is maintained and
- supported by Intel Corporation.
+The homepage of the ACPICA project is http://www.acpica.org. It is
+maintained and supported by Intel Corporation.

- The following figure depicts the Linux ACPI subsystem where the ACPICA
- adaptation is included:
+The following figure depicts the Linux ACPI subsystem where the ACPICA
+adaptation is included::

+---------------------------------------------------------+
| |
@@ -71,21 +75,27 @@ upstream.

Figure 1. Linux ACPI Software Components

- NOTE:
+.. note::
A. OS Service Layer - Provided by Linux to offer OS dependent
implementation of the predefined ACPICA interfaces (acpi_os_*).
+ ::
+
include/acpi/acpiosxf.h
drivers/acpi/osl.c
include/acpi/platform
include/asm/acenv.h
B. ACPICA Functionality - Released from ACPICA code base to offer
OS independent implementation of the ACPICA interfaces (acpi_*).
+ ::
+
drivers/acpi/acpica
include/acpi/ac*.h
tools/power/acpi
C. Linux/ACPI Functionality - Providing Linux specific ACPI
functionality to the other Linux kernel subsystems and user space
programs.
+ ::
+
drivers/acpi
include/linux/acpi.h
include/linux/acpi*.h
@@ -95,24 +105,27 @@ upstream.
ACPI subsystem to offer architecture specific implementation of the
ACPI interfaces. They are Linux specific components and are out of
the scope of this document.
+ ::
+
include/asm/acpi.h
include/asm/acpi*.h
arch/*/acpi

-2. ACPICA Release
+ACPICA Release
+==============

- The ACPICA project maintains its code base at the following repository URL:
- https://github.com/acpica/acpica.git. As a rule, a release is made every
- month.
+The ACPICA project maintains its code base at the following repository URL:
+https://github.com/acpica/acpica.git. As a rule, a release is made every
+month.

- As the coding style adopted by the ACPICA project is not acceptable by
- Linux, there is a release process to convert the ACPICA git commits into
- Linux patches. The patches generated by this process are referred to as
- "linuxized ACPICA patches". The release process is carried out on a local
- copy the ACPICA git repository. Each commit in the monthly release is
- converted into a linuxized ACPICA patch. Together, they form the monthly
- ACPICA release patchset for the Linux ACPI community. This process is
- illustrated in the following figure:
+As the coding style adopted by the ACPICA project is not acceptable by
+Linux, there is a release process to convert the ACPICA git commits into
+Linux patches. The patches generated by this process are referred to as
+"linuxized ACPICA patches". The release process is carried out on a local
+copy of the ACPICA git repository. Each commit in the monthly release is
+converted into a linuxized ACPICA patch. Together, they form the monthly
+ACPICA release patchset for the Linux ACPI community. This process is
+illustrated in the following figure::

+-----------------------------+
| acpica / master (-) commits |
@@ -153,7 +166,7 @@ upstream.

Figure 2. ACPICA -> Linux Upstream Process

- NOTE:
+.. note::
A. Linuxize Utilities - Provided by the ACPICA repository, including a
utility located in source/tools/acpisrc folder and a number of
scripts located in generate/linux folder.
@@ -170,19 +183,20 @@ upstream.
following kernel configuration options:
CONFIG_ACPI/CONFIG_ACPI_DEBUG/CONFIG_ACPI_DEBUGGER

-3. ACPICA Divergences
+ACPICA Divergences
+==================

- Ideally, all of the ACPICA commits should be converted into Linux patches
- automatically without manual modifications, the "linux / master" tree should
- contain the ACPICA code that exactly corresponds to the ACPICA code
- contained in "new linuxized acpica" tree and it should be possible to run
- the release process fully automatically.
+Ideally, all of the ACPICA commits should be converted into Linux patches
+automatically without manual modifications, the "linux / master" tree should
+contain the ACPICA code that exactly corresponds to the ACPICA code
+contained in "new linuxized acpica" tree and it should be possible to run
+the release process fully automatically.

- As a matter of fact, however, there are source code differences between
- the ACPICA code in Linux and the upstream ACPICA code, referred to as
- "ACPICA Divergences".
+As a matter of fact, however, there are source code differences between
+the ACPICA code in Linux and the upstream ACPICA code, referred to as
+"ACPICA Divergences".

- The various sources of ACPICA divergences include:
+The various sources of ACPICA divergences include:
1. Legacy divergences - Before the current ACPICA release process was
established, there already had been divergences between Linux and
ACPICA. Over the past several years those divergences have been greatly
@@ -213,11 +227,12 @@ upstream.
rebased on the ACPICA side in order to offer better solutions, new ACPICA
divergences are generated.

-4. ACPICA Development
+ACPICA Development
+==================

- This paragraph guides Linux developers to use the ACPICA upstream release
- utilities to obtain Linux patches corresponding to upstream ACPICA commits
- before they become available from the ACPICA release process.
+This paragraph guides Linux developers to use the ACPICA upstream release
+utilities to obtain Linux patches corresponding to upstream ACPICA commits
+before they become available from the ACPICA release process.

1. Cherry-pick an ACPICA commit

@@ -225,7 +240,7 @@ upstream.
you want to cherry pick must be committed into the local repository.

Then the gen-patch.sh command can help to cherry-pick an ACPICA commit
- from the ACPICA local repository:
+ from the ACPICA local repository::

$ git clone https://github.com/acpica/acpica
$ cd acpica
@@ -240,7 +255,7 @@ upstream.
changes that haven't been applied to Linux yet.

You can generate the ACPICA release series yourself and rebase your code on
- top of the generated ACPICA release patches:
+ top of the generated ACPICA release patches::

$ git clone https://github.com/acpica/acpica
$ cd acpica
@@ -254,7 +269,7 @@ upstream.
3. Inspect the current divergences

If you have local copies of both Linux and upstream ACPICA, you can generate
- a diff file indicating the state of the current divergences:
+ a diff file indicating the state of the current divergences::

# git clone https://github.com/acpica/acpica
# git clone http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
--
2.20.1

2019-04-23 16:32:49

by Changbin Du

Subject: [PATCH v4 06/63] Documentation: ACPI: move scan_handlers.txt to driver-api/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/driver-api/acpi/index.rst | 1 +
.../acpi/scan_handlers.rst} | 24 ++++++++++++-------
2 files changed, 16 insertions(+), 9 deletions(-)
rename Documentation/{acpi/scan_handlers.txt => driver-api/acpi/scan_handlers.rst} (90%)

diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
index 12649947b19b..ace0008e54c2 100644
--- a/Documentation/driver-api/acpi/index.rst
+++ b/Documentation/driver-api/acpi/index.rst
@@ -6,3 +6,4 @@ ACPI Support
:maxdepth: 2

linuxized-acpica
+ scan_handlers
diff --git a/Documentation/acpi/scan_handlers.txt b/Documentation/driver-api/acpi/scan_handlers.rst
similarity index 90%
rename from Documentation/acpi/scan_handlers.txt
rename to Documentation/driver-api/acpi/scan_handlers.rst
index 3246ccf15992..7a197b3a33fc 100644
--- a/Documentation/acpi/scan_handlers.txt
+++ b/Documentation/driver-api/acpi/scan_handlers.rst
@@ -1,7 +1,13 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+==================
ACPI Scan Handlers
+==================
+
+:Copyright: |copy| 2012, Intel Corporation

-Copyright (C) 2012, Intel Corporation
-Author: Rafael J. Wysocki <[email protected]>
+:Author: Rafael J. Wysocki <[email protected]>

During system initialization and ACPI-based device hot-add, the ACPI namespace
is scanned in search of device objects that generally represent various pieces
@@ -30,14 +36,14 @@ to configure that link so that the kernel can use it.
Those additional configuration tasks usually depend on the type of the hardware
component represented by the given device node which can be determined on the
basis of the device node's hardware ID (HID). They are performed by objects
-called ACPI scan handlers represented by the following structure:
+called ACPI scan handlers represented by the following structure::

-struct acpi_scan_handler {
- const struct acpi_device_id *ids;
- struct list_head list_node;
- int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
- void (*detach)(struct acpi_device *dev);
-};
+ struct acpi_scan_handler {
+ const struct acpi_device_id *ids;
+ struct list_head list_node;
+ int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
+ void (*detach)(struct acpi_device *dev);
+ };

where ids is the list of IDs of device nodes the given handler is supposed to
take care of, list_node is the hook to the global list of ACPI scan handlers
--
2.20.1
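
A note on the struct acpi_scan_handler shown in this document: the ids table selects which device nodes a handler takes care of, and attach() is called for a matching node. The user-space sketch below illustrates that matching step only. The types are simplified stand-ins (list_node is omitted), the empty-string table terminator, the "XYZ0001" ID, the return-value convention and every function name are assumptions made for the example, not the kernel's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct acpi_device_id {
	const char *id;		/* an empty string terminates the table */
};

struct acpi_device {
	const char *hid;	/* hardware ID of the device node */
	int attached;		/* set by the demo attach() callback */
};

struct acpi_scan_handler {
	const struct acpi_device_id *ids;
	int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
	void (*detach)(struct acpi_device *dev);
};

/* Return the first table entry matching the device's hardware ID, or NULL. */
static const struct acpi_device_id *
match_id(const struct acpi_device_id *ids, const struct acpi_device *dev)
{
	for (; ids->id[0] != '\0'; ids++) {
		if (strcmp(ids->id, dev->hid) == 0)
			return ids;
	}
	return NULL;
}

/*
 * Call attach() of the first handler whose ID table matches the device.
 * Returns attach()'s result, or 0 if no handler matched.
 */
static int scan_attach(struct acpi_scan_handler *const *handlers, size_t n,
		       struct acpi_device *dev)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct acpi_device_id *id = match_id(handlers[i]->ids, dev);

		if (id)
			return handlers[i]->attach(dev, id);
	}
	return 0;
}

/* Demo handler for a made-up hardware ID. */
static int demo_attach(struct acpi_device *dev, const struct acpi_device_id *id)
{
	(void)id;
	dev->attached = 1;
	return 1;
}

static const struct acpi_device_id demo_ids[] = { { "XYZ0001" }, { "" } };

static struct acpi_scan_handler demo_handler = {
	.ids = demo_ids,
	.attach = demo_attach,
};
```

Passing an array containing &demo_handler to scan_attach() with a device whose hid is "XYZ0001" invokes demo_attach(); a device with any other hid is left untouched.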

2019-04-23 16:33:09

by Changbin Du

Subject: [PATCH v4 07/63] Documentation: ACPI: move DSD-properties-rules.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/DSD-properties-rules.rst} | 21 +++++++++++--------
Documentation/firmware-guide/acpi/index.rst | 1 +
2 files changed, 13 insertions(+), 9 deletions(-)
rename Documentation/{acpi/DSD-properties-rules.txt => firmware-guide/acpi/DSD-properties-rules.rst} (88%)

diff --git a/Documentation/acpi/DSD-properties-rules.txt b/Documentation/firmware-guide/acpi/DSD-properties-rules.rst
similarity index 88%
rename from Documentation/acpi/DSD-properties-rules.txt
rename to Documentation/firmware-guide/acpi/DSD-properties-rules.rst
index 3e4862bdad98..4306f29b6103 100644
--- a/Documentation/acpi/DSD-properties-rules.txt
+++ b/Documentation/firmware-guide/acpi/DSD-properties-rules.rst
@@ -1,8 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
_DSD Device Properties Usage Rules
-----------------------------------
+==================================

Properties, Property Sets and Property Subsets
-----------------------------------------------
+==============================================

The _DSD (Device Specific Data) configuration object, introduced in ACPI 5.1,
allows any type of device configuration data to be provided via the ACPI
@@ -18,7 +21,7 @@ specific type) associated with it.

In the ACPI _DSD context it is an element of the sub-package following the
generic Device Properties UUID in the _DSD return package as specified in the
-Device Properties UUID definition document [1].
+Device Properties UUID definition document [1]_.

It also may be regarded as the definition of a key and the associated data type
that can be returned by _DSD in the Device Properties UUID sub-package for a
@@ -33,14 +36,14 @@ Property subsets are nested collections of properties. Each of them is
associated with an additional key (name) allowing the subset to be referred
to as a whole (and to be treated as a separate entity). The canonical
representation of property subsets is via the mechanism specified in the
-Hierarchical Properties Extension UUID definition document [2].
+Hierarchical Properties Extension UUID definition document [2]_.

Property sets may be hierarchical. That is, a property set may contain
multiple property subsets that each may contain property subsets of its
own and so on.

General Validity Rule for Property Sets
----------------------------------------
+=======================================

Valid property sets must follow the guidance given by the Device Properties UUID
definition document [1].
@@ -73,7 +76,7 @@ suitable for the ACPI environment and consequently they cannot belong to a valid
property set.

Property Sets and Device Tree Bindings
---------------------------------------
+======================================

It often is useful to make _DSD return property sets that follow Device Tree
bindings.
@@ -91,7 +94,7 @@ expected to automatically work in the ACPI environment regardless of their
contents.

References
-----------
+==========

-[1] http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
-[2] http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf
+.. [1] http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
+.. [2] http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 868bd25a3398..0e05b843521c 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -10,3 +10,4 @@ ACPI Support
namespace
enumeration
osi
+ DSD-properties-rules
--
2.20.1

2019-04-23 16:33:20

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 09/63] Documentation: ACPI: move method-customizing.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/method-customizing.txt | 73 -----------------
Documentation/firmware-guide/acpi/index.rst | 3 +-
.../acpi/method-customizing.rst | 82 +++++++++++++++++++
3 files changed, 84 insertions(+), 74 deletions(-)
delete mode 100644 Documentation/acpi/method-customizing.txt
create mode 100644 Documentation/firmware-guide/acpi/method-customizing.rst

diff --git a/Documentation/acpi/method-customizing.txt b/Documentation/acpi/method-customizing.txt
deleted file mode 100644
index 7235da975f23..000000000000
--- a/Documentation/acpi/method-customizing.txt
+++ /dev/null
@@ -1,73 +0,0 @@
-Linux ACPI Custom Control Method How To
-=======================================
-
-Written by Zhang Rui <[email protected]>
-
-
-Linux supports customizing ACPI control methods at runtime.
-
-Users can use this to
-1. override an existing method which may not work correctly,
- or just for debugging purposes.
-2. insert a completely new method in order to create a missing
- method such as _OFF, _ON, _STA, _INI, etc.
-For these cases, it is far simpler to dynamically install a single
-control method rather than override the entire DSDT, because kernel
-rebuild/reboot is not needed and test result can be got in minutes.
-
-Note: Only ACPI METHOD can be overridden, any other object types like
- "Device", "OperationRegion", are not recognized. Methods
- declared inside scope operators are also not supported.
-Note: The same ACPI control method can be overridden for many times,
- and it's always the latest one that used by Linux/kernel.
-Note: To get the ACPI debug object output (Store (AAAA, Debug)),
- please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
-
-1. override an existing method
- a) get the ACPI table via ACPI sysfs I/F. e.g. to get the DSDT,
- just run "cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.dat"
- b) disassemble the table by running "iasl -d dsdt.dat".
- c) rewrite the ASL code of the method and save it in a new file,
- d) package the new file (psr.asl) to an ACPI table format.
- Here is an example of a customized \_SB._AC._PSR method,
-
- DefinitionBlock ("", "SSDT", 1, "", "", 0x20080715)
- {
- Method (\_SB_.AC._PSR, 0, NotSerialized)
- {
- Store ("In AC _PSR", Debug)
- Return (ACON)
- }
- }
- Note that the full pathname of the method in ACPI namespace
- should be used.
- e) assemble the file to generate the AML code of the method.
- e.g. "iasl -vw 6084 psr.asl" (psr.aml is generated as a result)
- If parameter "-vw 6084" is not supported by your iASL compiler,
- please try a newer version.
- f) mount debugfs by "mount -t debugfs none /sys/kernel/debug"
- g) override the old method via the debugfs by running
- "cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method"
-
-2. insert a new method
- This is easier than overriding an existing method.
- We just need to create the ASL code of the method we want to
- insert and then follow the step c) ~ g) in section 1.
-
-3. undo your changes
- The "undo" operation is not supported for a new inserted method
- right now, i.e. we can not remove a method currently.
- For an overridden method, in order to undo your changes, please
- save a copy of the method original ASL code in step c) section 1,
- and redo step c) ~ g) to override the method with the original one.
-
-
-Note: We can use a kernel with multiple custom ACPI method running,
- But each individual write to debugfs can implement a SINGLE
- method override. i.e. if we want to insert/override multiple
- ACPI methods, we need to redo step c) ~ g) for multiple times.
-
-Note: Be aware that root can mis-use this driver to modify arbitrary
- memory and gain additional rights, if root's privileges got
- restricted (for example if root is not allowed to load additional
- modules after boot).
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 61d67763851b..d1d069b26bbc 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -10,5 +10,6 @@ ACPI Support
namespace
enumeration
osi
+ method-customizing
DSD-properties-rules
- gpio-properties
+ gpio-properties
\ No newline at end of file
diff --git a/Documentation/firmware-guide/acpi/method-customizing.rst b/Documentation/firmware-guide/acpi/method-customizing.rst
new file mode 100644
index 000000000000..32eb1cdc1549
--- /dev/null
+++ b/Documentation/firmware-guide/acpi/method-customizing.rst
@@ -0,0 +1,82 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================
+Linux ACPI Custom Control Method How To
+=======================================
+
+:Author: Zhang Rui <[email protected]>
+
+
+Linux supports customizing ACPI control methods at runtime.
+
+Users can use this to:
+
+1. override an existing method which may not work correctly,
+ or just for debugging purposes.
+2. insert a completely new method in order to create a missing
+ method such as _OFF, _ON, _STA, _INI, etc.
+
+For these cases, it is far simpler to dynamically install a single
+control method rather than override the entire DSDT, because kernel
+rebuild/reboot is not needed and test results can be obtained in minutes.
+
+.. note:: Only ACPI METHOD can be overridden, any other object types like
+ "Device", "OperationRegion", are not recognized. Methods
+ declared inside scope operators are also not supported.
+.. note:: The same ACPI control method can be overridden many times,
+ and it is always the latest one that is used by the Linux kernel.
+.. note:: To get the ACPI debug object output (Store (AAAA, Debug)),
+ please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
+
+1. override an existing method
+==============================
+a) get the ACPI table via ACPI sysfs I/F. e.g. to get the DSDT,
+ just run "cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.dat"
+b) disassemble the table by running "iasl -d dsdt.dat".
+c) rewrite the ASL code of the method and save it in a new file,
+d) package the new file (psr.asl) to an ACPI table format.
+ Here is an example of a customized \_SB_.AC._PSR method::
+
+ DefinitionBlock ("", "SSDT", 1, "", "", 0x20080715)
+ {
+ Method (\_SB_.AC._PSR, 0, NotSerialized)
+ {
+ Store ("In AC _PSR", Debug)
+ Return (ACON)
+ }
+ }
+
+ Note that the full pathname of the method in ACPI namespace
+ should be used.
+e) assemble the file to generate the AML code of the method.
+ e.g. "iasl -vw 6084 psr.asl" (psr.aml is generated as a result)
+ If parameter "-vw 6084" is not supported by your iASL compiler,
+ please try a newer version.
+f) mount debugfs by "mount -t debugfs none /sys/kernel/debug"
+g) override the old method via the debugfs by running
+ "cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method"
+
+2. insert a new method
+======================
+This is easier than overriding an existing method.
+We just need to create the ASL code of the method we want to
+insert and then follow steps c) ~ g) in section 1.
+
+3. undo your changes
+====================
+The "undo" operation is not supported for a newly inserted method
+right now, i.e. a method cannot currently be removed.
+For an overridden method, in order to undo your changes, please
+save a copy of the method's original ASL code in step c) of section 1,
+and redo steps c) ~ g) to override the method with the original one.
+
+
+.. note:: We can use a kernel with multiple custom ACPI methods running,
+ but each individual write to debugfs can implement a SINGLE
+ method override, i.e. if we want to insert/override multiple
+ ACPI methods, we need to redo steps c) ~ g) multiple times.
+
+.. note:: Be aware that root can misuse this driver to modify arbitrary
+ memory and gain additional rights, if root's privileges are
+ restricted (for example if root is not allowed to load additional
+ modules after boot).
--
2.20.1
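
Note for readers of the how-to converted above: the command sequence in
steps a), b), e), f) and g) can be collected into one place. The Python
sketch below only builds the command strings quoted in the document; it
does not run anything. The function name override_commands() and its
parameters are hypothetical, and it assumes the rewritten ASL file sits in
the same working directory as the table dump (the document uses /tmp).

```python
from pathlib import PurePosixPath

# Paths quoted in method-customizing.rst.
SYSFS_TABLES = PurePosixPath("/sys/firmware/acpi/tables")
CUSTOM_METHOD = PurePosixPath("/sys/kernel/debug/acpi/custom_method")


def override_commands(asl_file, workdir="/tmp", table="DSDT"):
    """Return the shell commands for steps a), b), e), f) and g), in order.

    Steps c) and d), rewriting the method by hand and wrapping it in a
    DefinitionBlock, happen between the second and third command.
    This helper is illustrative only; it is not part of any kernel tool.
    """
    work = PurePosixPath(workdir)
    dump = work / f"{table.lower()}.dat"   # e.g. /tmp/dsdt.dat
    asl = work / asl_file                  # e.g. /tmp/psr.asl
    aml = asl.with_suffix(".aml")          # iasl output, e.g. /tmp/psr.aml
    return [
        f"cat {SYSFS_TABLES / table} > {dump}",     # a) fetch the table
        f"iasl -d {dump}",                          # b) disassemble it
        f"iasl -vw 6084 {asl}",                     # e) assemble the method
        "mount -t debugfs none /sys/kernel/debug",  # f) mount debugfs
        f"cat {aml} > {CUSTOM_METHOD}",             # g) install the method
    ]


if __name__ == "__main__":
    for command in override_commands("psr.asl"):
        print(command)
```

Keeping the commands as data rather than executing them makes the order
easy to review before anything is written to debugfs as root.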

2019-04-23 16:34:20

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 14/63] Documentation: ACPI: move dsd/graph.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/dsd/graph.rst} | 157 +++++++++---------
Documentation/firmware-guide/acpi/index.rst | 1 +
2 files changed, 81 insertions(+), 77 deletions(-)
rename Documentation/{acpi/dsd/graph.txt => firmware-guide/acpi/dsd/graph.rst} (56%)

diff --git a/Documentation/acpi/dsd/graph.txt b/Documentation/firmware-guide/acpi/dsd/graph.rst
similarity index 56%
rename from Documentation/acpi/dsd/graph.txt
rename to Documentation/firmware-guide/acpi/dsd/graph.rst
index b9ce910781dc..e0baed35b037 100644
--- a/Documentation/acpi/dsd/graph.txt
+++ b/Documentation/firmware-guide/acpi/dsd/graph.rst
@@ -1,8 +1,11 @@
-Graphs
+.. SPDX-License-Identifier: GPL-2.0

+======
+Graphs
+======

_DSD
-----
+====

_DSD (Device Specific Data) [7] is a predefined ACPI device
configuration object that can be used to convey information on
@@ -30,7 +33,7 @@ hierarchical data extension array on each depth.


Ports and endpoints
--------------------
+===================

The port and endpoint concepts are very similar to those in Devicetree
[3]. A port represents an interface in a device, and an endpoint
@@ -38,9 +41,9 @@ represents a connection to that interface.

All port nodes are located under the device's "_DSD" node in the hierarchical
data extension tree. The data extension related to each port node must begin
-with "port" and must be followed by the "@" character and the number of the port
-as its key. The target object it refers to should be called "PRTX", where "X" is
-the number of the port. An example of such a package would be:
+with "port" and must be followed by the "@" character and the number of the
+port as its key. The target object it refers to should be called "PRTX", where
+"X" is the number of the port. An example of such a package would be::

Package() { "port@4", PRT4 }

@@ -49,7 +52,7 @@ data extension key of the endpoint nodes must begin with
"endpoint" and must be followed by the "@" character and the number of the
endpoint. The object it refers to should be called "EPXY", where "X" is the
number of the port and "Y" is the number of the endpoint. An example of such a
-package would be:
+package would be::

Package() { "endpoint@0", EP40 }

@@ -62,85 +65,85 @@ of that port shall be zero. Similarly, if a port may only have a single
endpoint, the number of that endpoint shall be zero.

The endpoint reference uses property extension with "remote-endpoint" property
-name followed by a reference in the same package. Such references consist of the
+name followed by a reference in the same package. Such references consist of
the remote device reference, the first package entry of the port data extension
reference under the device and finally the first package entry of the endpoint
-data extension reference under the port. Individual references thus appear as:
+data extension reference under the port. Individual references thus appear as::

Package() { device, "port@X", "endpoint@Y" }

-In the above example, "X" is the number of the port and "Y" is the number of the
-endpoint.
+In the above example, "X" is the number of the port and "Y" is the number of
+the endpoint.

The references to endpoints must be always done both ways, to the
remote endpoint and back from the referred remote endpoint node.

-A simple example of this is show below:
+A simple example of this is shown below::

Scope (\_SB.PCI0.I2C2)
{
- Device (CAM0)
- {
- Name (_DSD, Package () {
- ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
- Package () {
- Package () { "compatible", Package () { "nokia,smia" } },
- },
- ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
- Package () {
- Package () { "port@0", PRT0 },
- }
- })
- Name (PRT0, Package() {
- ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
- Package () {
- Package () { "reg", 0 },
- },
- ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
- Package () {
- Package () { "endpoint@0", EP00 },
- }
- })
- Name (EP00, Package() {
- ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
- Package () {
- Package () { "reg", 0 },
- Package () { "remote-endpoint", Package() { \_SB.PCI0.ISP, "port@4", "endpoint@0" } },
- }
- })
- }
+ Device (CAM0)
+ {
+ Name (_DSD, Package () {
+ ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+ Package () {
+ Package () { "compatible", Package () { "nokia,smia" } },
+ },
+ ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
+ Package () {
+ Package () { "port@0", PRT0 },
+ }
+ })
+ Name (PRT0, Package() {
+ ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+ Package () {
+ Package () { "reg", 0 },
+ },
+ ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
+ Package () {
+ Package () { "endpoint@0", EP00 },
+ }
+ })
+ Name (EP00, Package() {
+ ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+ Package () {
+ Package () { "reg", 0 },
+ Package () { "remote-endpoint", Package() { \_SB.PCI0.ISP, "port@4", "endpoint@0" } },
+ }
+ })
+ }
}

Scope (\_SB.PCI0)
{
- Device (ISP)
- {
- Name (_DSD, Package () {
- ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
- Package () {
- Package () { "port@4", PRT4 },
- }
- })
-
- Name (PRT4, Package() {
- ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
- Package () {
- Package () { "reg", 4 }, /* CSI-2 port number */
- },
- ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
- Package () {
- Package () { "endpoint@0", EP40 },
- }
- })
-
- Name (EP40, Package() {
- ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
- Package () {
- Package () { "reg", 0 },
- Package () { "remote-endpoint", Package () { \_SB.PCI0.I2C2.CAM0, "port@0", "endpoint@0" } },
- }
- })
- }
+ Device (ISP)
+ {
+ Name (_DSD, Package () {
+ ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
+ Package () {
+ Package () { "port@4", PRT4 },
+ }
+ })
+
+ Name (PRT4, Package() {
+ ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+ Package () {
+ Package () { "reg", 4 }, /* CSI-2 port number */
+ },
+ ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
+ Package () {
+ Package () { "endpoint@0", EP40 },
+ }
+ })
+
+ Name (EP40, Package() {
+ ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+ Package () {
+ Package () { "reg", 0 },
+ Package () { "remote-endpoint", Package () { \_SB.PCI0.I2C2.CAM0, "port@0", "endpoint@0" } },
+ }
+ })
+ }
}

Here, the port 0 of the "CAM0" device is connected to the port 4 of
@@ -148,27 +151,27 @@ the "ISP" device and vice versa.


References
-----------
+==========

[1] _DSD (Device Specific Data) Implementation Guide.
- <URL:http://www.uefi.org/sites/default/files/resources/_DSD-implementation-guide-toplevel-1_1.htm>,
+ http://www.uefi.org/sites/default/files/resources/_DSD-implementation-guide-toplevel-1_1.htm,
referenced 2016-10-03.

-[2] Devicetree. <URL:http://www.devicetree.org>, referenced 2016-10-03.
+[2] Devicetree. http://www.devicetree.org, referenced 2016-10-03.

[3] Documentation/devicetree/bindings/graph.txt

[4] Device Properties UUID For _DSD.
- <URL:http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf>,
+ http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf,
referenced 2016-10-04.

[5] Hierarchical Data Extension UUID For _DSD.
- <URL:http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf>,
+ http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf,
referenced 2016-10-04.

[6] Advanced Configuration and Power Interface Specification.
- <URL:http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf>,
+ http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf,
referenced 2016-10-04.

[7] _DSD Device Properties Usage Rules.
- Documentation/acpi/DSD-properties-rules.txt
+ :doc:`../DSD-properties-rules`
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index bedcb0b242a2..f81cfbcb6878 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -8,6 +8,7 @@ ACPI Support
:maxdepth: 1

namespace
+ dsd/graph
enumeration
osi
method-customizing
--
2.20.1
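
Note on the naming rules in graph.rst above: a port key "port@X" refers to
an object called "PRTX", an endpoint key "endpoint@Y" under port X refers
to "EPXY", and a remote-endpoint reference lists the device, the port key
and the endpoint key. The Python sketch below renders those three package
forms. The function names are hypothetical, and limiting port and endpoint
numbers to a single decimal digit is an assumption made here so that the
object names stay four characters long; the document does not say how
larger numbers are encoded.

```python
def _digit(value, what):
    """Validate a port/endpoint number (assumed here to be 0..9)."""
    if not 0 <= value <= 9:
        raise ValueError(f"{what} number {value} does not fit this sketch")
    return str(value)


def port_entry(port):
    """Hierarchical data extension entry for a port node."""
    p = _digit(port, "port")
    return f'Package() {{ "port@{p}", PRT{p} }}'


def endpoint_entry(port, endpoint):
    """Hierarchical data extension entry for an endpoint under a port."""
    p = _digit(port, "port")
    e = _digit(endpoint, "endpoint")
    return f'Package() {{ "endpoint@{e}", EP{p}{e} }}'


def remote_endpoint(device, port, endpoint):
    """Reference to an endpoint of another device."""
    p = _digit(port, "port")
    e = _digit(endpoint, "endpoint")
    return f'Package() {{ {device}, "port@{p}", "endpoint@{e}" }}'


if __name__ == "__main__":
    print(port_entry(4))
    print(endpoint_entry(4, 0))
    print(remote_endpoint(r"\_SB.PCI0.ISP", 4, 0))
```

Because references must be made both ways, a generator built on this would
call remote_endpoint() once for each side of a connection.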

2019-04-23 16:34:31

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 15/63] Documentation: ACPI: move dsd/data-node-references.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/dsd/data-node-references.rst} | 28 +++++++++++--------
Documentation/firmware-guide/acpi/index.rst | 1 +
2 files changed, 17 insertions(+), 12 deletions(-)
rename Documentation/{acpi/dsd/data-node-references.txt => firmware-guide/acpi/dsd/data-node-references.rst} (79%)

diff --git a/Documentation/acpi/dsd/data-node-references.txt b/Documentation/firmware-guide/acpi/dsd/data-node-references.rst
similarity index 79%
rename from Documentation/acpi/dsd/data-node-references.txt
rename to Documentation/firmware-guide/acpi/dsd/data-node-references.rst
index c3871565c8cf..79c5368eaecf 100644
--- a/Documentation/acpi/dsd/data-node-references.txt
+++ b/Documentation/firmware-guide/acpi/dsd/data-node-references.rst
@@ -1,9 +1,12 @@
-Copyright (C) 2018 Intel Corporation
-Author: Sakari Ailus <[email protected]>
-
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

+===================================
Referencing hierarchical data nodes
------------------------------------
+===================================
+
+:Copyright: |copy| 2018 Intel Corporation
+:Author: Sakari Ailus <[email protected]>

ACPI in general allows referring to device objects in the tree only.
Hierarchical data extension nodes may not be referred to directly, hence this
@@ -28,13 +31,14 @@ extension key.


Example
--------
+=======

- In the ASL snippet below, the "reference" _DSD property [2] contains a
- device object reference to DEV0 and under that device object, a
- hierarchical data extension key "node@1" referring to the NOD1 object
- and lastly, a hierarchical data extension key "anothernode" referring to
- the ANOD object which is also the final target node of the reference.
+In the ASL snippet below, the "reference" _DSD property [2] contains a
+device object reference to DEV0 and under that device object, a
+hierarchical data extension key "node@1" referring to the NOD1 object
+and lastly, a hierarchical data extension key "anothernode" referring to
+the ANOD object which is also the final target node of the reference.
+::

Device (DEV0)
{
@@ -75,10 +79,10 @@ Example
})
}

-Please also see a graph example in graph.txt .
+Please also see a graph example in :doc:`graph`.

References
-----------
+==========

[1] Hierarchical Data Extension UUID For _DSD.
<URL:http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf>,
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index f81cfbcb6878..6d4e0df4f063 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -9,6 +9,7 @@ ACPI Support

namespace
dsd/graph
+ dsd/data-node-references
enumeration
osi
method-customizing
--
2.20.1

2019-04-23 16:34:31

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 08/63] Documentation: ACPI: move gpio-properties.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/gpio-properties.rst} | 78 +++++++++++--------
Documentation/firmware-guide/acpi/index.rst | 1 +
MAINTAINERS | 2 +-
3 files changed, 46 insertions(+), 35 deletions(-)
rename Documentation/{acpi/gpio-properties.txt => firmware-guide/acpi/gpio-properties.rst} (81%)

diff --git a/Documentation/acpi/gpio-properties.txt b/Documentation/firmware-guide/acpi/gpio-properties.rst
similarity index 81%
rename from Documentation/acpi/gpio-properties.txt
rename to Documentation/firmware-guide/acpi/gpio-properties.rst
index 88c65cb5bf0a..89c636963544 100644
--- a/Documentation/acpi/gpio-properties.txt
+++ b/Documentation/firmware-guide/acpi/gpio-properties.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
_DSD Device Properties Related to GPIO
---------------------------------------
+======================================

With the release of ACPI 5.1, the _DSD configuration object finally
allows names to be given to GPIOs (and other things as well) returned
@@ -8,7 +11,7 @@ the corresponding GPIO, which is pretty error prone (it depends on
the _CRS output ordering, for example).

With _DSD we can now query GPIOs using a name instead of an integer
-index, like the ASL example below shows:
+index, like the ASL example below shows::

// Bluetooth device with reset and shutdown GPIOs
Device (BTH)
@@ -34,15 +37,19 @@ index, like the ASL example below shows:
})
}

-The format of the supported GPIO property is:
+The format of the supported GPIO property is::

Package () { "name", Package () { ref, index, pin, active_low }}

- ref - The device that has _CRS containing GpioIo()/GpioInt() resources,
- typically this is the device itself (BTH in our case).
- index - Index of the GpioIo()/GpioInt() resource in _CRS starting from zero.
- pin - Pin in the GpioIo()/GpioInt() resource. Typically this is zero.
- active_low - If 1 the GPIO is marked as active_low.
+ref
+ The device that has _CRS containing GpioIo()/GpioInt() resources,
+ typically this is the device itself (BTH in our case).
+index
+ Index of the GpioIo()/GpioInt() resource in _CRS starting from zero.
+pin
+ Pin in the GpioIo()/GpioInt() resource. Typically this is zero.
+active_low
+ If 1 the GPIO is marked as active_low.

Since ACPI GpioIo() resource does not have a field saying whether it is
active low or high, the "active_low" argument can be used here. Setting
@@ -55,7 +62,7 @@ It is possible to leave holes in the array of GPIOs. This is useful in
cases like with SPI host controllers where some chip selects may be
implemented as GPIOs and some as native signals. For example a SPI host
controller can have chip selects 0 and 2 implemented as GPIOs and 1 as
-native:
+native::

Package () {
"cs-gpios",
@@ -67,7 +74,7 @@ native:
}

Other supported properties
---------------------------
+==========================

Following Device Tree compatible device properties are also supported by
_DSD device properties for GPIO controllers:
@@ -78,7 +85,7 @@ _DSD device properties for GPIO controllers:
- input
- line-name

-Example:
+Example::

Name (_DSD, Package () {
// _DSD Hierarchical Properties Extension UUID
@@ -100,7 +107,7 @@ Example:

- gpio-line-names

-Example:
+Example::

Package () {
"gpio-line-names",
@@ -114,7 +121,7 @@ See Documentation/devicetree/bindings/gpio/gpio.txt for more information
about these properties.

ACPI GPIO Mappings Provided by Drivers
---------------------------------------
+======================================

There are systems in which the ACPI tables do not contain _DSD but provide _CRS
with GpioIo()/GpioInt() resources and device drivers still need to work with
@@ -139,16 +146,16 @@ line in that resource starting from zero, and the active-low flag for that line,
respectively, in analogy with the _DSD GPIO property format specified above.

For the example Bluetooth device discussed previously the data structures in
-question would look like this:
+question would look like this::

-static const struct acpi_gpio_params reset_gpio = { 1, 1, false };
-static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false };
+ static const struct acpi_gpio_params reset_gpio = { 1, 1, false };
+ static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false };

-static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = {
- { "reset-gpios", &reset_gpio, 1 },
- { "shutdown-gpios", &shutdown_gpio, 1 },
- { },
-};
+ static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = {
+ { "reset-gpios", &reset_gpio, 1 },
+ { "shutdown-gpios", &shutdown_gpio, 1 },
+ { },
+ };

Next, the mapping table needs to be passed as the second argument to
acpi_dev_add_driver_gpios() that will register it with the ACPI device object
@@ -158,12 +165,12 @@ calling acpi_dev_remove_driver_gpios() on the ACPI device object where that
table was previously registered.

Using the _CRS fallback
------------------------
+=======================

If a device does not have _DSD or the driver does not create ACPI GPIO
mapping, the Linux GPIO framework refuses to return any GPIOs. This is
because the driver does not know what it actually gets. For example if we
-have a device like below:
+have a device like below::

Device (BTH)
{
@@ -177,7 +184,7 @@ have a device like below:
})
}

-The driver might expect to get the right GPIO when it does:
+The driver might expect to get the right GPIO when it does::

desc = gpiod_get(dev, "reset", GPIOD_OUT_LOW);

@@ -193,22 +200,25 @@ the ACPI GPIO mapping tables are hardly linked to ACPI ID and certain
objects, as listed in the above chapter, of the device in question.

Getting GPIO descriptor
------------------------
+=======================
+
+There are two main approaches to get a GPIO resource from ACPI::

-There are two main approaches to get GPIO resource from ACPI:
- desc = gpiod_get(dev, connection_id, flags);
- desc = gpiod_get_index(dev, connection_id, index, flags);
+ desc = gpiod_get(dev, connection_id, flags);
+ desc = gpiod_get_index(dev, connection_id, index, flags);

We may consider two different cases here, i.e. when connection ID is
provided and otherwise.

-Case 1:
- desc = gpiod_get(dev, "non-null-connection-id", flags);
- desc = gpiod_get_index(dev, "non-null-connection-id", index, flags);
+Case 1::
+
+ desc = gpiod_get(dev, "non-null-connection-id", flags);
+ desc = gpiod_get_index(dev, "non-null-connection-id", index, flags);
+
+Case 2::

-Case 2:
- desc = gpiod_get(dev, NULL, flags);
- desc = gpiod_get_index(dev, NULL, index, flags);
+ desc = gpiod_get(dev, NULL, flags);
+ desc = gpiod_get_index(dev, NULL, index, flags);

Case 1 assumes that corresponding ACPI device description must have
defined device properties and will prevent to getting any GPIO resources
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 0e05b843521c..61d67763851b 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -11,3 +11,4 @@ ACPI Support
enumeration
osi
DSD-properties-rules
+ gpio-properties
diff --git a/MAINTAINERS b/MAINTAINERS
index 09f43f1bdd15..87f930bf32ad 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6593,7 +6593,7 @@ M: Andy Shevchenko <[email protected]>
L: [email protected]
L: [email protected]
S: Maintained
-F: Documentation/acpi/gpio-properties.txt
+F: Documentation/firmware-guide/acpi/gpio-properties.rst
F: drivers/gpio/gpiolib-acpi.c

GPIO IR Transmitter
--
2.20.1
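
Note on the GPIO property format in gpio-properties.rst above: the four
fields (ref, index, pin, active_low) map directly onto a small record. The
Python sketch below renders one property in the documented
Package () { "name", Package () { ref, index, pin, active_low }} form. The
class GpioRef and its to_asl() method are hypothetical, and the plain "BTH"
spelling of the device reference is an assumption; how the reference is
written in real ASL depends on the scope it appears in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpioRef:
    """One GPIO reference as described in gpio-properties.rst."""

    ref: str          # device whose _CRS holds the GpioIo()/GpioInt() resource
    index: int        # index of that resource in _CRS, starting from zero
    pin: int          # pin within the resource, typically zero
    active_low: bool  # True if the GPIO is marked as active low

    def to_asl(self, name):
        """Render the property as an ASL package string."""
        return (
            f'Package () {{ "{name}", Package () {{ '
            f"{self.ref}, {self.index}, {self.pin}, {int(self.active_low)} }}}}"
        )


if __name__ == "__main__":
    # Same index/pin/active-low values as the reset_gpio mapping in the document.
    reset = GpioRef(ref="BTH", index=1, pin=1, active_low=False)
    print(reset.to_asl("reset-gpios"))
```

The field order matches struct acpi_gpio_params only for the last three
members, since the driver-provided mapping has no device reference.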

2019-04-23 16:34:42

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 16/63] Documentation: ACPI: move debug.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
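
Note, not part of the commit: debug.rst describes debug_layer and
debug_level as bit masks built by OR-ing the values of the components and
levels of interest. The Python sketch below shows that arithmetic for a
few of the values quoted in the document and formats the result as boot
parameters. The table and function names are hypothetical, and only a
subset of the masks is listed; the full set is in the headers the document
cites.

```python
# A few mask values as quoted in debug.rst (not the complete tables).
DEBUG_LAYER = {
    "ACPI_UTILITIES": 0x00000001,
    "ACPI_HARDWARE": 0x00000002,
    "ACPI_PROCESSOR_COMPONENT": 0x20000000,
}
DEBUG_LEVEL = {
    "ACPI_LV_INIT": 0x00000001,
    "ACPI_LV_DEBUG_OBJECT": 0x00000002,
    "ACPI_LV_EVENTS": 0x80000000,
}


def build_mask(table, names):
    """OR together the mask values of the named entries."""
    mask = 0
    for name in names:
        mask |= table[name]
    return mask


def boot_params(layers, levels):
    """Format the two masks as kernel command-line parameters."""
    layer = build_mask(DEBUG_LAYER, layers)
    level = build_mask(DEBUG_LEVEL, levels)
    return f"acpi.debug_layer={layer:#x} acpi.debug_level={level:#x}"


if __name__ == "__main__":
    print(boot_params(["ACPI_HARDWARE"], ["ACPI_LV_INIT", "ACPI_LV_DEBUG_OBJECT"]))
```

The same hexadecimal values can be written to the debug_layer and
debug_level files under /sys/module/acpi/parameters/ after boot.
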
.../acpi/debug.rst} | 31 ++++++++++---------
Documentation/firmware-guide/acpi/index.rst | 3 +-
2 files changed, 19 insertions(+), 15 deletions(-)
rename Documentation/{acpi/debug.txt => firmware-guide/acpi/debug.rst} (91%)

diff --git a/Documentation/acpi/debug.txt b/Documentation/firmware-guide/acpi/debug.rst
similarity index 91%
rename from Documentation/acpi/debug.txt
rename to Documentation/firmware-guide/acpi/debug.rst
index 65bf47c46b6d..1a152dd1d765 100644
--- a/Documentation/acpi/debug.txt
+++ b/Documentation/firmware-guide/acpi/debug.rst
@@ -1,18 +1,21 @@
- ACPI Debug Output
+.. SPDX-License-Identifier: GPL-2.0

+=================
+ACPI Debug Output
+=================

The ACPI CA, the Linux ACPI core, and some ACPI drivers can generate debug
output. This document describes how to use this facility.

Compile-time configuration
---------------------------
+==========================

ACPI debug output is globally enabled by CONFIG_ACPI_DEBUG. If this config
option is turned off, the debug messages are not even built into the
kernel.

Boot- and run-time configuration
---------------------------------
+================================

When CONFIG_ACPI_DEBUG=y, you can select the component and level of messages
you're interested in. At boot-time, use the acpi.debug_layer and
@@ -21,7 +24,7 @@ debug_layer and debug_level files in /sys/module/acpi/parameters/ to control
the debug messages.

debug_layer (component)
------------------------
+=======================

The "debug_layer" is a mask that selects components of interest, e.g., a
specific driver or part of the ACPI interpreter. To build the debug_layer
@@ -33,7 +36,7 @@ to /sys/module/acpi/parameters/debug_layer.

The possible components are defined in include/acpi/acoutput.h and
include/acpi/acpi_drivers.h. Reading /sys/module/acpi/parameters/debug_layer
-shows the supported mask values, currently these:
+shows the supported mask values, currently these::

ACPI_UTILITIES 0x00000001
ACPI_HARDWARE 0x00000002
@@ -65,7 +68,7 @@ shows the supported mask values, currently these:
ACPI_PROCESSOR_COMPONENT 0x20000000

debug_level
------------
+===========

The "debug_level" is a mask that selects different types of messages, e.g.,
those related to initialization, method execution, informational messages, etc.
@@ -81,7 +84,7 @@ to /sys/module/acpi/parameters/debug_level.

The possible levels are defined in include/acpi/acoutput.h. Reading
/sys/module/acpi/parameters/debug_level shows the supported mask values,
-currently these:
+currently these::

ACPI_LV_INIT 0x00000001
ACPI_LV_DEBUG_OBJECT 0x00000002
@@ -113,9 +116,9 @@ currently these:
ACPI_LV_EVENTS 0x80000000

Examples
---------
+========

-For example, drivers/acpi/bus.c contains this:
+For example, drivers/acpi/bus.c contains this::

#define _COMPONENT ACPI_BUS_COMPONENT
...
@@ -127,22 +130,22 @@ statement uses ACPI_DB_INFO, which is macro based on the ACPI_LV_INFO
definition.)

Enable all AML "Debug" output (stores to the Debug object while interpreting
-AML) during boot:
+AML) during boot::

acpi.debug_layer=0xffffffff acpi.debug_level=0x2

-Enable PCI and PCI interrupt routing debug messages:
+Enable PCI and PCI interrupt routing debug messages::

acpi.debug_layer=0x400000 acpi.debug_level=0x4

-Enable all ACPI hardware-related messages:
+Enable all ACPI hardware-related messages::

acpi.debug_layer=0x2 acpi.debug_level=0xffffffff

-Enable all ACPI_DB_INFO messages after boot:
+Enable all ACPI_DB_INFO messages after boot::

# echo 0x4 > /sys/module/acpi/parameters/debug_level

-Show all valid component values:
+Show all valid component values::

# cat /sys/module/acpi/parameters/debug_layer
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 6d4e0df4f063..a45fea11f998 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -14,6 +14,7 @@ ACPI Support
osi
method-customizing
DSD-properties-rules
+ debug
gpio-properties
i2c-muxes
- acpi-lid
\ No newline at end of file
+ acpi-lid
--
2.20.1
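
Note on debug.rst above: debug_layer and debug_level are bit masks, so
several components can be selected at once by OR-ing their values. A
minimal shell sketch, assuming the 0x00400000 layer value and the 0x4 level
value taken from the document's own examples; the variable names are
illustrative and not part of any kernel interface:

```shell
#!/bin/sh
# Component bits to combine (values as quoted in debug.rst).
ACPI_HARDWARE=0x00000002   # from the component table
PCI_LAYER=0x00400000       # value used by the PCI example (assumed name)

# OR the bits together and format the result as an 8-digit hex mask.
layer=$(printf '0x%08x' $(( $ACPI_HARDWARE | $PCI_LAYER )))

# Print the resulting boot parameters; nothing is written to sysfs here.
echo "acpi.debug_layer=$layer acpi.debug_level=0x4"
```

The same value could instead be written to
/sys/module/acpi/parameters/debug_layer at run time, as the document
describes.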

2019-04-23 16:34:44

by Changbin Du

Subject: [PATCH v4 10/63] Documentation: ACPI: move initrd_table_override.txt to admin-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/initrd_table_override.txt | 111 ----------------
Documentation/admin-guide/acpi/index.rst | 1 +
.../acpi/initrd_table_override.rst | 120 ++++++++++++++++++
3 files changed, 121 insertions(+), 111 deletions(-)
delete mode 100644 Documentation/acpi/initrd_table_override.txt
create mode 100644 Documentation/admin-guide/acpi/initrd_table_override.rst

diff --git a/Documentation/acpi/initrd_table_override.txt b/Documentation/acpi/initrd_table_override.txt
deleted file mode 100644
index 30437a6db373..000000000000
--- a/Documentation/acpi/initrd_table_override.txt
+++ /dev/null
@@ -1,111 +0,0 @@
-Upgrading ACPI tables via initrd
-================================
-
-1) Introduction (What is this about)
-2) What is this for
-3) How does it work
-4) References (Where to retrieve userspace tools)
-
-1) What is this about
----------------------
-
-If the ACPI_TABLE_UPGRADE compile option is true, it is possible to
-upgrade the ACPI execution environment that is defined by the ACPI tables
-via upgrading the ACPI tables provided by the BIOS with an instrumented,
-modified, more recent version one, or installing brand new ACPI tables.
-
-When building initrd with kernel in a single image, option
-ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
-feature to work.
-
-For a full list of ACPI tables that can be upgraded/installed, take a look
-at the char *table_sigs[MAX_ACPI_SIGNATURE]; definition in
-drivers/acpi/tables.c.
-All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should
-be overridable, except:
- - ACPI_SIG_RSDP (has a signature of 6 bytes)
- - ACPI_SIG_FACS (does not have an ordinary ACPI table header)
-Both could get implemented as well.
-
-
-2) What is this for
--------------------
-
-Complain to your platform/BIOS vendor if you find a bug which is so severe
-that a workaround is not accepted in the Linux kernel. And this facility
-allows you to upgrade the buggy tables before your platform/BIOS vendor
-releases an upgraded BIOS binary.
-
-This facility can be used by platform/BIOS vendors to provide a Linux
-compatible environment without modifying the underlying platform firmware.
-
-This facility also provides a powerful feature to easily debug and test
-ACPI BIOS table compatibility with the Linux kernel by modifying old
-platform provided ACPI tables or inserting new ACPI tables.
-
-It can and should be enabled in any kernel because there is no functional
-change with not instrumented initrds.
-
-
-3) How does it work
--------------------
-
-# Extract the machine's ACPI tables:
-cd /tmp
-acpidump >acpidump
-acpixtract -a acpidump
-# Disassemble, modify and recompile them:
-iasl -d *.dat
-# For example add this statement into a _PRT (PCI Routing Table) function
-# of the DSDT:
-Store("HELLO WORLD", debug)
-# And increase the OEM Revision. For example, before modification:
-DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000000)
-# After modification:
-DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000001)
-iasl -sa dsdt.dsl
-# Add the raw ACPI tables to an uncompressed cpio archive.
-# They must be put into a /kernel/firmware/acpi directory inside the cpio
-# archive. Note that if the table put here matches a platform table
-# (similar Table Signature, and similar OEMID, and similar OEM Table ID)
-# with a more recent OEM Revision, the platform table will be upgraded by
-# this table. If the table put here doesn't match a platform table
-# (dissimilar Table Signature, or dissimilar OEMID, or dissimilar OEM Table
-# ID), this table will be appended.
-mkdir -p kernel/firmware/acpi
-cp dsdt.aml kernel/firmware/acpi
-# A maximum of "NR_ACPI_INITRD_TABLES (64)" tables are currently allowed
-# (see osl.c):
-iasl -sa facp.dsl
-iasl -sa ssdt1.dsl
-cp facp.aml kernel/firmware/acpi
-cp ssdt1.aml kernel/firmware/acpi
-# The uncompressed cpio archive must be the first. Other, typically
-# compressed cpio archives, must be concatenated on top of the uncompressed
-# one. Following command creates the uncompressed cpio archive and
-# concatenates the original initrd on top:
-find kernel | cpio -H newc --create > /boot/instrumented_initrd
-cat /boot/initrd >>/boot/instrumented_initrd
-# reboot with increased acpi debug level, e.g. boot params:
-acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
-# and check your syslog:
-[ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
-[ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD"
-
-iasl is able to disassemble and recompile quite a lot different,
-also static ACPI tables.
-
-
-4) Where to retrieve userspace tools
-------------------------------------
-
-iasl and acpixtract are part of Intel's ACPICA project:
-http://acpica.org/
-and should be packaged by distributions (for example in the acpica package
-on SUSE).
-
-acpidump can be found in Len Browns pmtools:
-ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump
-This tool is also part of the acpica package on SUSE.
-Alternatively, used ACPI tables can be retrieved via sysfs in latest kernels:
-/sys/firmware/acpi/tables
diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
index 3e041206089d..09e4e81e4fb7 100644
--- a/Documentation/admin-guide/acpi/index.rst
+++ b/Documentation/admin-guide/acpi/index.rst
@@ -8,3 +8,4 @@ the Linux ACPI support.
.. toctree::
:maxdepth: 1

+ initrd_table_override
diff --git a/Documentation/admin-guide/acpi/initrd_table_override.rst b/Documentation/admin-guide/acpi/initrd_table_override.rst
new file mode 100644
index 000000000000..0787b2b91ded
--- /dev/null
+++ b/Documentation/admin-guide/acpi/initrd_table_override.rst
@@ -0,0 +1,120 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================================
+Upgrading ACPI tables via initrd
+================================
+
+1) Introduction (What is this about)
+2) What is this for
+3) How does it work
+4) References (Where to retrieve userspace tools)
+
+1) What is this about
+=====================
+
+If the ACPI_TABLE_UPGRADE compile option is true, it is possible to
+upgrade the ACPI execution environment that is defined by the ACPI tables
+by replacing the tables provided by the BIOS with an instrumented,
+modified, more recent version, or by installing brand new ACPI tables.
+
+When building initrd with kernel in a single image, option
+ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
+feature to work.
+
+For a full list of ACPI tables that can be upgraded/installed, take a look
+at the ``char *table_sigs[MAX_ACPI_SIGNATURE];`` definition in
+drivers/acpi/tables.c.
+
+All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should
+be overridable, except:
+
+ - ACPI_SIG_RSDP (has a signature of 6 bytes)
+ - ACPI_SIG_FACS (does not have an ordinary ACPI table header)
+
+Both could get implemented as well.
+
+
+2) What is this for
+===================
+
+Complain to your platform/BIOS vendor if you find a bug which is so severe
+that a workaround is not accepted in the Linux kernel. In the meantime, this
+facility allows you to upgrade the buggy tables before your platform/BIOS
+vendor releases an upgraded BIOS binary.
+
+This facility can be used by platform/BIOS vendors to provide a Linux
+compatible environment without modifying the underlying platform firmware.
+
+This facility also provides a powerful feature to easily debug and test
+ACPI BIOS table compatibility with the Linux kernel by modifying old
+platform provided ACPI tables or inserting new ACPI tables.
+
+It can and should be enabled in any kernel because there is no functional
+change with non-instrumented initrds.
+
+
+3) How does it work
+===================
+::
+
+ # Extract the machine's ACPI tables:
+ cd /tmp
+ acpidump >acpidump
+ acpixtract -a acpidump
+ # Disassemble, modify and recompile them:
+ iasl -d *.dat
+ # For example add this statement into a _PRT (PCI Routing Table) function
+ # of the DSDT:
+ Store("HELLO WORLD", debug)
+ # And increase the OEM Revision. For example, before modification:
+ DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000000)
+ # After modification:
+ DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000001)
+ iasl -sa dsdt.dsl
+ # Add the raw ACPI tables to an uncompressed cpio archive.
+ # They must be put into a /kernel/firmware/acpi directory inside the cpio
+ # archive. Note that if the table put here matches a platform table
+ # (similar Table Signature, and similar OEMID, and similar OEM Table ID)
+ # with a more recent OEM Revision, the platform table will be upgraded by
+ # this table. If the table put here doesn't match a platform table
+ # (dissimilar Table Signature, or dissimilar OEMID, or dissimilar OEM Table
+ # ID), this table will be appended.
+ mkdir -p kernel/firmware/acpi
+ cp dsdt.aml kernel/firmware/acpi
+ # A maximum of "NR_ACPI_INITRD_TABLES (64)" tables are currently allowed
+ # (see osl.c):
+ iasl -sa facp.dsl
+ iasl -sa ssdt1.dsl
+ cp facp.aml kernel/firmware/acpi
+ cp ssdt1.aml kernel/firmware/acpi
+ # The uncompressed cpio archive must be the first. Other, typically
+ # compressed cpio archives, must be concatenated on top of the uncompressed
+ # one. Following command creates the uncompressed cpio archive and
+ # concatenates the original initrd on top:
+ find kernel | cpio -H newc --create > /boot/instrumented_initrd
+ cat /boot/initrd >>/boot/instrumented_initrd
+ # reboot with increased acpi debug level, e.g. boot params:
+ acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
+ # and check your syslog:
+ [ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
+ [ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD"
+
+iasl is able to disassemble and recompile many different ACPI tables,
+including static ones.
+
+
+4) Where to retrieve userspace tools
+====================================
+
+iasl and acpixtract are part of Intel's ACPICA project:
+http://acpica.org/
+
+and should be packaged by distributions (for example in the acpica package
+on SUSE).
+
+acpidump can be found in Len Brown's pmtools:
+ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump
+
+This tool is also part of the acpica package on SUSE.
+Alternatively, the ACPI tables in use can be retrieved via sysfs in recent kernels:
+/sys/firmware/acpi/tables
--
2.20.1
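
Note on section 3 of initrd_table_override.rst above: the staging step can
be wrapped in a small helper that also enforces the documented limit of 64
tables. This is only a sketch; stage_acpi_tables and MAX_TABLES are
hypothetical names, and the limit is copied from the
"NR_ACPI_INITRD_TABLES (64)" remark in the document:

```shell
#!/bin/sh
# Limit quoted in the document as NR_ACPI_INITRD_TABLES (64).
MAX_TABLES=64

# Hypothetical helper: copy the given *.aml files into
# <root>/kernel/firmware/acpi, the layout the document requires
# inside the uncompressed cpio archive.
# $1 = staging root, remaining arguments = table files.
stage_acpi_tables() {
    root=$1
    shift
    if [ "$#" -gt "$MAX_TABLES" ]; then
        echo "too many tables: $# > $MAX_TABLES" >&2
        return 1
    fi
    mkdir -p "$root/kernel/firmware/acpi" || return 1
    for aml in "$@"; do
        cp "$aml" "$root/kernel/firmware/acpi/" || return 1
    done
}

# Afterwards the archive is built as in the document, from inside the
# staging root:
#   find kernel | cpio -H newc --create > /boot/instrumented_initrd
#   cat /boot/initrd >> /boot/instrumented_initrd
```

Staging into a separate root keeps the working directory holding the
disassembled tables apart from what ends up in the archive.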

2019-04-23 16:34:51

by Changbin Du

Subject: [PATCH v4 17/63] Documentation: ACPI: move method-tracing.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/method-tracing.txt | 192 ---------------
Documentation/firmware-guide/acpi/index.rst | 1 +
.../firmware-guide/acpi/method-tracing.rst | 225 ++++++++++++++++++
3 files changed, 226 insertions(+), 192 deletions(-)
delete mode 100644 Documentation/acpi/method-tracing.txt
create mode 100644 Documentation/firmware-guide/acpi/method-tracing.rst

diff --git a/Documentation/acpi/method-tracing.txt b/Documentation/acpi/method-tracing.txt
deleted file mode 100644
index 0aba14c8f459..000000000000
--- a/Documentation/acpi/method-tracing.txt
+++ /dev/null
@@ -1,192 +0,0 @@
-ACPICA Trace Facility
-
-Copyright (C) 2015, Intel Corporation
-Author: Lv Zheng <[email protected]>
-
-
-Abstract:
-
-This document describes the functions and the interfaces of the method
-tracing facility.
-
-1. Functionalities and usage examples:
-
- ACPICA provides method tracing capability. And two functions are
- currently implemented using this capability.
-
- A. Log reducer
- ACPICA subsystem provides debugging outputs when CONFIG_ACPI_DEBUG is
- enabled. The debugging messages which are deployed via
- ACPI_DEBUG_PRINT() macro can be reduced at 2 levels - per-component
- level (known as debug layer, configured via
- /sys/module/acpi/parameters/debug_layer) and per-type level (known as
- debug level, configured via /sys/module/acpi/parameters/debug_level).
-
- But when the particular layer/level is applied to the control method
- evaluations, the quantity of the debugging outputs may still be too
- large to be put into the kernel log buffer. The idea thus is worked out
- to only enable the particular debug layer/level (normally more detailed)
- logs when the control method evaluation is started, and disable the
- detailed logging when the control method evaluation is stopped.
-
- The following command examples illustrate the usage of the "log reducer"
- functionality:
- a. Filter out the debug layer/level matched logs when control methods
- are being evaluated:
- # cd /sys/module/acpi/parameters
- # echo "0xXXXXXXXX" > trace_debug_layer
- # echo "0xYYYYYYYY" > trace_debug_level
- # echo "enable" > trace_state
- b. Filter out the debug layer/level matched logs when the specified
- control method is being evaluated:
- # cd /sys/module/acpi/parameters
- # echo "0xXXXXXXXX" > trace_debug_layer
- # echo "0xYYYYYYYY" > trace_debug_level
- # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
- # echo "method" > /sys/module/acpi/parameters/trace_state
- c. Filter out the debug layer/level matched logs when the specified
- control method is being evaluated for the first time:
- # cd /sys/module/acpi/parameters
- # echo "0xXXXXXXXX" > trace_debug_layer
- # echo "0xYYYYYYYY" > trace_debug_level
- # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
- # echo "method-once" > /sys/module/acpi/parameters/trace_state
- Where:
- 0xXXXXXXXX/0xYYYYYYYY: Refer to Documentation/acpi/debug.txt for
- possible debug layer/level masking values.
- \PPPP.AAAA.TTTT.HHHH: Full path of a control method that can be found
- in the ACPI namespace. It needn't be an entry
- of a control method evaluation.
-
- B. AML tracer
-
- There are special log entries added by the method tracing facility at
- the "trace points" the AML interpreter starts/stops to execute a control
- method, or an AML opcode. Note that the format of the log entries are
- subject to change:
- [ 0.186427] exdebug-0398 ex_trace_point : Method Begin [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
- [ 0.186630] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905c88:If] execution.
- [ 0.186820] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:LEqual] execution.
- [ 0.187010] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905a20:-NamePath-] execution.
- [ 0.187214] exdebug-0398 ex_trace_point : Opcode End [0xf5905a20:-NamePath-] execution.
- [ 0.187407] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
- [ 0.187594] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
- [ 0.187789] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:LEqual] execution.
- [ 0.187980] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:Return] execution.
- [ 0.188146] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
- [ 0.188334] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
- [ 0.188524] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:Return] execution.
- [ 0.188712] exdebug-0398 ex_trace_point : Opcode End [0xf5905c88:If] execution.
- [ 0.188903] exdebug-0398 ex_trace_point : Method End [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
-
- Developers can utilize these special log entries to track the AML
- interpretion, thus can aid issue debugging and performance tuning. Note
- that, as the "AML tracer" logs are implemented via ACPI_DEBUG_PRINT()
- macro, CONFIG_ACPI_DEBUG is also required to be enabled for enabling
- "AML tracer" logs.
-
- The following command examples illustrate the usage of the "AML tracer"
- functionality:
- a. Filter out the method start/stop "AML tracer" logs when control
- methods are being evaluated:
- # cd /sys/module/acpi/parameters
- # echo "0x80" > trace_debug_layer
- # echo "0x10" > trace_debug_level
- # echo "enable" > trace_state
- b. Filter out the method start/stop "AML tracer" when the specified
- control method is being evaluated:
- # cd /sys/module/acpi/parameters
- # echo "0x80" > trace_debug_layer
- # echo "0x10" > trace_debug_level
- # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
- # echo "method" > trace_state
- c. Filter out the method start/stop "AML tracer" logs when the specified
- control method is being evaluated for the first time:
- # cd /sys/module/acpi/parameters
- # echo "0x80" > trace_debug_layer
- # echo "0x10" > trace_debug_level
- # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
- # echo "method-once" > trace_state
- d. Filter out the method/opcode start/stop "AML tracer" when the
- specified control method is being evaluated:
- # cd /sys/module/acpi/parameters
- # echo "0x80" > trace_debug_layer
- # echo "0x10" > trace_debug_level
- # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
- # echo "opcode" > trace_state
- e. Filter out the method/opcode start/stop "AML tracer" when the
- specified control method is being evaluated for the first time:
- # cd /sys/module/acpi/parameters
- # echo "0x80" > trace_debug_layer
- # echo "0x10" > trace_debug_level
- # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
- # echo "opcode-opcode" > trace_state
-
- Note that all above method tracing facility related module parameters can
- be used as the boot parameters, for example:
- acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 \
- acpi.trace_method_name=\_SB.LID0._LID acpi.trace_state=opcode-once
-
-2. Interface descriptions:
-
- All method tracing functions can be configured via ACPI module
- parameters that are accessible at /sys/module/acpi/parameters/:
-
- trace_method_name
- The full path of the AML method that the user wants to trace.
- Note that the full path shouldn't contain the trailing "_"s in its
- name segments but may contain "\" to form an absolute path.
-
- trace_debug_layer
- The temporary debug_layer used when the tracing feature is enabled.
- Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
- used to match all "AML tracer" logs.
-
- trace_debug_level
- The temporary debug_level used when the tracing feature is enabled.
- Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
- debug_level used to match all "AML tracer" logs.
-
- trace_state
- The status of the tracing feature.
- Users can enable/disable this debug tracing feature by executing
- the following command:
- # echo string > /sys/module/acpi/parameters/trace_state
- Where "string" should be one of the following:
- "disable"
- Disable the method tracing feature.
- "enable"
- Enable the method tracing feature.
- ACPICA debugging messages matching
- "trace_debug_layer/trace_debug_level" during any method
- execution will be logged.
- "method"
- Enable the method tracing feature.
- ACPICA debugging messages matching
- "trace_debug_layer/trace_debug_level" during method execution
- of "trace_method_name" will be logged.
- "method-once"
- Enable the method tracing feature.
- ACPICA debugging messages matching
- "trace_debug_layer/trace_debug_level" during method execution
- of "trace_method_name" will be logged only once.
- "opcode"
- Enable the method tracing feature.
- ACPICA debugging messages matching
- "trace_debug_layer/trace_debug_level" during method/opcode
- execution of "trace_method_name" will be logged.
- "opcode-once"
- Enable the method tracing feature.
- ACPICA debugging messages matching
- "trace_debug_layer/trace_debug_level" during method/opcode
- execution of "trace_method_name" will be logged only once.
- Note that, the difference between the "enable" and other feature
- enabling options are:
- 1. When "enable" is specified, since
- "trace_debug_layer/trace_debug_level" shall apply to all control
- method evaluations, after configuring "trace_state" to "enable",
- "trace_method_name" will be reset to NULL.
- 2. When "method/opcode" is specified, if
- "trace_method_name" is NULL when "trace_state" is configured to
- these options, the "trace_debug_layer/trace_debug_level" will
- apply to all control method evaluations.
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index a45fea11f998..287a7cbd82ac 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -13,6 +13,7 @@ ACPI Support
enumeration
osi
method-customizing
+ method-tracing
DSD-properties-rules
debug
gpio-properties
diff --git a/Documentation/firmware-guide/acpi/method-tracing.rst b/Documentation/firmware-guide/acpi/method-tracing.rst
new file mode 100644
index 000000000000..7a997ba168d7
--- /dev/null
+++ b/Documentation/firmware-guide/acpi/method-tracing.rst
@@ -0,0 +1,225 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+=====================
+ACPICA Trace Facility
+=====================
+
+:Copyright: |copy| 2015, Intel Corporation
+:Author: Lv Zheng <[email protected]>
+
+
+:Abstract: This document describes the functions and the interfaces of the
+ method tracing facility.
+
+1. Functionalities and usage examples
+=====================================
+
+ACPICA provides a method tracing capability, and two functions are
+currently implemented using this capability.
+
+Log reducer
+--------------
+
+ACPICA subsystem provides debugging outputs when CONFIG_ACPI_DEBUG is
+enabled. The debugging messages which are deployed via
+ACPI_DEBUG_PRINT() macro can be reduced at 2 levels - per-component
+level (known as debug layer, configured via
+/sys/module/acpi/parameters/debug_layer) and per-type level (known as
+debug level, configured via /sys/module/acpi/parameters/debug_level).
+
+But when a particular layer/level is applied to control method
+evaluations, the quantity of debugging output may still be too
+large to fit into the kernel log buffer. The idea is therefore
+to enable the particular debug layer/level (normally more detailed)
+logs only when the control method evaluation is started, and to disable
+the detailed logging when the control method evaluation is stopped.
+
+The following command examples illustrate the usage of the "log reducer"
+functionality:
+
+a. Filter out the debug layer/level matched logs when control methods
+ are being evaluated::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0xXXXXXXXX" > trace_debug_layer
+ # echo "0xYYYYYYYY" > trace_debug_level
+ # echo "enable" > trace_state
+
+b. Filter out the debug layer/level matched logs when the specified
+ control method is being evaluated::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0xXXXXXXXX" > trace_debug_layer
+ # echo "0xYYYYYYYY" > trace_debug_level
+ # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
+ # echo "method" > /sys/module/acpi/parameters/trace_state
+
+c. Filter out the debug layer/level matched logs when the specified
+ control method is being evaluated for the first time::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0xXXXXXXXX" > trace_debug_layer
+ # echo "0xYYYYYYYY" > trace_debug_level
+ # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
+ # echo "method-once" > /sys/module/acpi/parameters/trace_state
+
+Where:
+ 0xXXXXXXXX/0xYYYYYYYY
+ Refer to Documentation/firmware-guide/acpi/debug.rst for possible debug
+ layer/level masking values.
+ \PPPP.AAAA.TTTT.HHHH
+ Full path of a control method that can be found in the ACPI namespace.
+ It needn't be an entry of a control method evaluation.
+
+AML tracer
+-------------
+
+There are special log entries added by the method tracing facility at
+the "trace points" where the AML interpreter starts or stops executing a
+control method or an AML opcode. Note that the format of the log entries
+is subject to change::
+
+ [ 0.186427] exdebug-0398 ex_trace_point : Method Begin [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
+ [ 0.186630] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905c88:If] execution.
+ [ 0.186820] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:LEqual] execution.
+ [ 0.187010] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905a20:-NamePath-] execution.
+ [ 0.187214] exdebug-0398 ex_trace_point : Opcode End [0xf5905a20:-NamePath-] execution.
+ [ 0.187407] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
+ [ 0.187594] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
+ [ 0.187789] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:LEqual] execution.
+ [ 0.187980] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:Return] execution.
+ [ 0.188146] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
+ [ 0.188334] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
+ [ 0.188524] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:Return] execution.
+ [ 0.188712] exdebug-0398 ex_trace_point : Opcode End [0xf5905c88:If] execution.
+ [ 0.188903] exdebug-0398 ex_trace_point : Method End [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
+
+Developers can use these special log entries to track the AML
+interpretation, which helps with debugging and performance tuning. Note
+that, because the "AML tracer" logs are implemented via the
+ACPI_DEBUG_PRINT() macro, CONFIG_ACPI_DEBUG must also be enabled to get
+"AML tracer" logs.
+
+The following command examples illustrate the usage of the "AML tracer"
+functionality:
+
+a. Filter out the method start/stop "AML tracer" logs when control
+ methods are being evaluated::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0x80" > trace_debug_layer
+ # echo "0x10" > trace_debug_level
+ # echo "enable" > trace_state
+
+b. Filter out the method start/stop "AML tracer" when the specified
+ control method is being evaluated::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0x80" > trace_debug_layer
+ # echo "0x10" > trace_debug_level
+ # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
+ # echo "method" > trace_state
+
+c. Filter out the method start/stop "AML tracer" logs when the specified
+ control method is being evaluated for the first time::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0x80" > trace_debug_layer
+ # echo "0x10" > trace_debug_level
+ # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
+ # echo "method-once" > trace_state
+
+d. Filter out the method/opcode start/stop "AML tracer" when the
+ specified control method is being evaluated::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0x80" > trace_debug_layer
+ # echo "0x10" > trace_debug_level
+ # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
+ # echo "opcode" > trace_state
+
+e. Filter out the method/opcode start/stop "AML tracer" when the
+ specified control method is being evaluated for the first time::
+
+ # cd /sys/module/acpi/parameters
+ # echo "0x80" > trace_debug_layer
+ # echo "0x10" > trace_debug_level
+ # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
+ # echo "opcode-once" > trace_state
+
+Note that all above method tracing facility related module parameters can
+be used as the boot parameters, for example::
+
+ acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 \
+ acpi.trace_method_name=\_SB.LID0._LID acpi.trace_state=opcode-once
+
+2. Interface descriptions
+=========================
+
+All method tracing functions can be configured via ACPI module
+parameters that are accessible at /sys/module/acpi/parameters/:
+
+trace_method_name
+  The full path of the AML method that the user wants to trace.
+  Note that the full path shouldn't contain the trailing "_"s in its
+  name segments but may contain "\" to form an absolute path.
+
+trace_debug_layer
+  The temporary debug_layer used when the tracing feature is enabled.
+  Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
+  used to match all "AML tracer" logs.
+
+trace_debug_level
+  The temporary debug_level used when the tracing feature is enabled.
+  Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
+  debug_level used to match all "AML tracer" logs.
+
+trace_state
+  The status of the tracing feature.
+  Users can enable/disable this debug tracing feature by executing
+  the following command::
+
+    # echo string > /sys/module/acpi/parameters/trace_state
+
+Where "string" should be one of the following:
+
+"disable"
+ Disable the method tracing feature.
+"enable"
+ Enable the method tracing feature.
+ ACPICA debugging messages matching
+ "trace_debug_layer/trace_debug_level" during any method
+ execution will be logged.
+"method"
+ Enable the method tracing feature.
+ ACPICA debugging messages matching
+ "trace_debug_layer/trace_debug_level" during method execution
+ of "trace_method_name" will be logged.
+"method-once"
+ Enable the method tracing feature.
+ ACPICA debugging messages matching
+ "trace_debug_layer/trace_debug_level" during method execution
+ of "trace_method_name" will be logged only once.
+"opcode"
+ Enable the method tracing feature.
+ ACPICA debugging messages matching
+ "trace_debug_layer/trace_debug_level" during method/opcode
+ execution of "trace_method_name" will be logged.
+"opcode-once"
+ Enable the method tracing feature.
+ ACPICA debugging messages matching
+ "trace_debug_layer/trace_debug_level" during method/opcode
+ execution of "trace_method_name" will be logged only once.
+
+Note that the differences between "enable" and the other feature
+enabling options are:
+
+1. When "enable" is specified, since
+ "trace_debug_layer/trace_debug_level" shall apply to all control
+ method evaluations, after configuring "trace_state" to "enable",
+ "trace_method_name" will be reset to NULL.
+2. When "method/opcode" is specified, if
+ "trace_method_name" is NULL when "trace_state" is configured to
+ these options, the "trace_debug_layer/trace_debug_level" will
+ apply to all control method evaluations.
--
2.20.1
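
The trace_state values documented in the patch above could be driven from a
script roughly as follows. This is only a sketch: the helper name
set_trace_state and its "path" parameter are illustrative assumptions, not part
of the kernel or ACPICA; only the sysfs path and the six state strings come
from the documentation itself. Writing to the real file requires root.

```python
from pathlib import Path

# The six strings listed for "trace_state" in the documentation.
VALID_TRACE_STATES = (
    "disable",
    "enable",
    "method",
    "method-once",
    "opcode",
    "opcode-once",
)

# Path taken from the documented "echo string > ..." command.
TRACE_STATE_PATH = Path("/sys/module/acpi/parameters/trace_state")


def set_trace_state(state, path=TRACE_STATE_PATH):
    """Write one of the documented trace_state strings to `path`.

    `path` is a parameter (an assumption of this sketch) so the helper
    can be exercised against an ordinary file instead of sysfs.
    """
    if state not in VALID_TRACE_STATES:
        raise ValueError(f"unsupported trace_state: {state!r}")
    Path(path).write_text(state + "\n")
    return state
```

Rejecting unknown strings before the write keeps a typo from reaching the
kernel parameter at all.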

2019-04-23 16:34:56

by Changbin Du

Subject: [PATCH v4 11/63] Documentation: ACPI: move dsdt-override.txt to admin-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/dsdt-override.rst} | 8 +++++++-
Documentation/admin-guide/acpi/index.rst | 1 +
2 files changed, 8 insertions(+), 1 deletion(-)
rename Documentation/{acpi/dsdt-override.txt => admin-guide/acpi/dsdt-override.rst} (56%)

diff --git a/Documentation/acpi/dsdt-override.txt b/Documentation/admin-guide/acpi/dsdt-override.rst
similarity index 56%
rename from Documentation/acpi/dsdt-override.txt
rename to Documentation/admin-guide/acpi/dsdt-override.rst
index 784841caa6e6..50bd7f194bf4 100644
--- a/Documentation/acpi/dsdt-override.txt
+++ b/Documentation/admin-guide/acpi/dsdt-override.rst
@@ -1,6 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Overriding DSDT
+===============
+
Linux supports a method of overriding the BIOS DSDT:

-CONFIG_ACPI_CUSTOM_DSDT builds the image into the kernel.
+CONFIG_ACPI_CUSTOM_DSDT - builds the image into the kernel.

When to use this method is described in detail on the
Linux/ACPI home page:
diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
index 09e4e81e4fb7..d68e9914c5ff 100644
--- a/Documentation/admin-guide/acpi/index.rst
+++ b/Documentation/admin-guide/acpi/index.rst
@@ -9,3 +9,4 @@ the Linux ACPI support.
:maxdepth: 1

initrd_table_override
+ dsdt-override
--
2.20.1

2019-04-23 16:34:59

by Changbin Du

Subject: [PATCH v4 18/63] Documentation: ACPI: move aml-debugger.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/aml-debugger.txt | 66 ----------------
.../firmware-guide/acpi/aml-debugger.rst | 75 +++++++++++++++++++
Documentation/firmware-guide/acpi/index.rst | 1 +
3 files changed, 76 insertions(+), 66 deletions(-)
delete mode 100644 Documentation/acpi/aml-debugger.txt
create mode 100644 Documentation/firmware-guide/acpi/aml-debugger.rst

diff --git a/Documentation/acpi/aml-debugger.txt b/Documentation/acpi/aml-debugger.txt
deleted file mode 100644
index 75ebeb64ab29..000000000000
--- a/Documentation/acpi/aml-debugger.txt
+++ /dev/null
@@ -1,66 +0,0 @@
-The AML Debugger
-
-Copyright (C) 2016, Intel Corporation
-Author: Lv Zheng <[email protected]>
-
-
-This document describes the usage of the AML debugger embedded in the Linux
-kernel.
-
-1. Build the debugger
-
- The following kernel configuration items are required to enable the AML
- debugger interface from the Linux kernel:
-
- CONFIG_ACPI_DEBUGGER=y
- CONFIG_ACPI_DEBUGGER_USER=m
-
- The userspace utilities can be built from the kernel source tree using
- the following commands:
-
- $ cd tools
- $ make acpi
-
- The resultant userspace tool binary is then located at:
-
- tools/power/acpi/acpidbg
-
- It can be installed to system directories by running "make install" (as a
- sufficiently privileged user).
-
-2. Start the userspace debugger interface
-
- After booting the kernel with the debugger built-in, the debugger can be
- started by using the following commands:
-
- # mount -t debugfs none /sys/kernel/debug
- # modprobe acpi_dbg
- # tools/power/acpi/acpidbg
-
- That spawns the interactive AML debugger environment where you can execute
- debugger commands.
-
- The commands are documented in the "ACPICA Overview and Programmer Reference"
- that can be downloaded from
-
- https://acpica.org/documentation
-
- The detailed debugger commands reference is located in Chapter 12 "ACPICA
- Debugger Reference". The "help" command can be used for a quick reference.
-
-3. Stop the userspace debugger interface
-
- The interactive debugger interface can be closed by pressing Ctrl+C or using
- the "quit" or "exit" commands. When finished, unload the module with:
-
- # rmmod acpi_dbg
-
- The module unloading may fail if there is an acpidbg instance running.
-
-4. Run the debugger in a script
-
- It may be useful to run the AML debugger in a test script. "acpidbg" supports
- this in a special "batch" mode. For example, the following command outputs
- the entire ACPI namespace:
-
- # acpidbg -b "namespace"
diff --git a/Documentation/firmware-guide/acpi/aml-debugger.rst b/Documentation/firmware-guide/acpi/aml-debugger.rst
new file mode 100644
index 000000000000..a889d43bc6c5
--- /dev/null
+++ b/Documentation/firmware-guide/acpi/aml-debugger.rst
@@ -0,0 +1,75 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+================
+The AML Debugger
+================
+
+:Copyright: |copy| 2016, Intel Corporation
+:Author: Lv Zheng <[email protected]>
+
+
+This document describes the usage of the AML debugger embedded in the Linux
+kernel.
+
+1. Build the debugger
+=====================
+
+The following kernel configuration items are required to enable the AML
+debugger interface from the Linux kernel::
+
+ CONFIG_ACPI_DEBUGGER=y
+ CONFIG_ACPI_DEBUGGER_USER=m
+
+The userspace utilities can be built from the kernel source tree using
+the following commands::
+
+ $ cd tools
+ $ make acpi
+
+The resultant userspace tool binary is then located at::
+
+ tools/power/acpi/acpidbg
+
+It can be installed to system directories by running "make install" (as a
+sufficiently privileged user).
+
+2. Start the userspace debugger interface
+=========================================
+
+After booting the kernel with the debugger built-in, the debugger can be
+started by using the following commands::
+
+ # mount -t debugfs none /sys/kernel/debug
+ # modprobe acpi_dbg
+ # tools/power/acpi/acpidbg
+
+That spawns the interactive AML debugger environment where you can execute
+debugger commands.
+
+The commands are documented in the "ACPICA Overview and Programmer Reference"
+that can be downloaded from
+
+https://acpica.org/documentation
+
+The detailed debugger commands reference is located in Chapter 12 "ACPICA
+Debugger Reference". The "help" command can be used for a quick reference.
+
+3. Stop the userspace debugger interface
+========================================
+
+The interactive debugger interface can be closed by pressing Ctrl+C or using
+the "quit" or "exit" commands. When finished, unload the module with::
+
+ # rmmod acpi_dbg
+
+The module unloading may fail if there is an acpidbg instance running.
+
+4. Run the debugger in a script
+===============================
+
+It may be useful to run the AML debugger in a test script. "acpidbg" supports
+this in a special "batch" mode. For example, the following command outputs
+the entire ACPI namespace::
+
+ # acpidbg -b "namespace"
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 287a7cbd82ac..e9f253d54897 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -16,6 +16,7 @@ ACPI Support
method-tracing
DSD-properties-rules
debug
+ aml-debugger
gpio-properties
i2c-muxes
acpi-lid
--
2.20.1

2019-04-23 16:35:00

by Changbin Du

Subject: [PATCH v4 19/63] Documentation: ACPI: move apei/output_format.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/apei/output_format.txt | 147 -----------------
.../acpi/apei/output_format.rst | 150 ++++++++++++++++++
Documentation/firmware-guide/acpi/index.rst | 1 +
3 files changed, 151 insertions(+), 147 deletions(-)
delete mode 100644 Documentation/acpi/apei/output_format.txt
create mode 100644 Documentation/firmware-guide/acpi/apei/output_format.rst

diff --git a/Documentation/acpi/apei/output_format.txt b/Documentation/acpi/apei/output_format.txt
deleted file mode 100644
index 0c49c197c47a..000000000000
--- a/Documentation/acpi/apei/output_format.txt
+++ /dev/null
@@ -1,147 +0,0 @@
- APEI output format
- ~~~~~~~~~~~~~~~~~~
-
-APEI uses printk as hardware error reporting interface, the output
-format is as follow.
-
-<error record> :=
-APEI generic hardware error status
-severity: <integer>, <severity string>
-section: <integer>, severity: <integer>, <severity string>
-flags: <integer>
-<section flags strings>
-fru_id: <uuid string>
-fru_text: <string>
-section_type: <section type string>
-<section data>
-
-<severity string>* := recoverable | fatal | corrected | info
-
-<section flags strings># :=
-[primary][, containment warning][, reset][, threshold exceeded]\
-[, resource not accessible][, latent error]
-
-<section type string> := generic processor error | memory error | \
-PCIe error | unknown, <uuid string>
-
-<section data> :=
-<generic processor section data> | <memory section data> | \
-<pcie section data> | <null>
-
-<generic processor section data> :=
-[processor_type: <integer>, <proc type string>]
-[processor_isa: <integer>, <proc isa string>]
-[error_type: <integer>
-<proc error type strings>]
-[operation: <integer>, <proc operation string>]
-[flags: <integer>
-<proc flags strings>]
-[level: <integer>]
-[version_info: <integer>]
-[processor_id: <integer>]
-[target_address: <integer>]
-[requestor_id: <integer>]
-[responder_id: <integer>]
-[IP: <integer>]
-
-<proc type string>* := IA32/X64 | IA64
-
-<proc isa string>* := IA32 | IA64 | X64
-
-<processor error type strings># :=
-[cache error][, TLB error][, bus error][, micro-architectural error]
-
-<proc operation string>* := unknown or generic | data read | data write | \
-instruction execution
-
-<proc flags strings># :=
-[restartable][, precise IP][, overflow][, corrected]
-
-<memory section data> :=
-[error_status: <integer>]
-[physical_address: <integer>]
-[physical_address_mask: <integer>]
-[node: <integer>]
-[card: <integer>]
-[module: <integer>]
-[bank: <integer>]
-[device: <integer>]
-[row: <integer>]
-[column: <integer>]
-[bit_position: <integer>]
-[requestor_id: <integer>]
-[responder_id: <integer>]
-[target_id: <integer>]
-[error_type: <integer>, <mem error type string>]
-
-<mem error type string>* :=
-unknown | no error | single-bit ECC | multi-bit ECC | \
-single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \
-target abort | parity error | watchdog timeout | invalid address | \
-mirror Broken | memory sparing | scrub corrected error | \
-scrub uncorrected error
-
-<pcie section data> :=
-[port_type: <integer>, <pcie port type string>]
-[version: <integer>.<integer>]
-[command: <integer>, status: <integer>]
-[device_id: <integer>:<integer>:<integer>.<integer>
-slot: <integer>
-secondary_bus: <integer>
-vendor_id: <integer>, device_id: <integer>
-class_code: <integer>]
-[serial number: <integer>, <integer>]
-[bridge: secondary_status: <integer>, control: <integer>]
-[aer_status: <integer>, aer_mask: <integer>
-<aer status string>
-[aer_uncor_severity: <integer>]
-aer_layer=<aer layer string>, aer_agent=<aer agent string>
-aer_tlp_header: <integer> <integer> <integer> <integer>]
-
-<pcie port type string>* := PCIe end point | legacy PCI end point | \
-unknown | unknown | root port | upstream switch port | \
-downstream switch port | PCIe to PCI/PCI-X bridge | \
-PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \
-root complex event collector
-
-if section severity is fatal or recoverable
-<aer status string># :=
-unknown | unknown | unknown | unknown | Data Link Protocol | \
-unknown | unknown | unknown | unknown | unknown | unknown | unknown | \
-Poisoned TLP | Flow Control Protocol | Completion Timeout | \
-Completer Abort | Unexpected Completion | Receiver Overflow | \
-Malformed TLP | ECRC | Unsupported Request
-else
-<aer status string># :=
-Receiver Error | unknown | unknown | unknown | unknown | unknown | \
-Bad TLP | Bad DLLP | RELAY_NUM Rollover | unknown | unknown | unknown | \
-Replay Timer Timeout | Advisory Non-Fatal
-fi
-
-<aer layer string> :=
-Physical Layer | Data Link Layer | Transaction Layer
-
-<aer agent string> :=
-Receiver ID | Requester ID | Completer ID | Transmitter ID
-
-Where, [] designate corresponding content is optional
-
-All <field string> description with * has the following format:
-
-field: <integer>, <field string>
-
-Where value of <integer> should be the position of "string" in <field
-string> description. Otherwise, <field string> will be "unknown".
-
-All <field strings> description with # has the following format:
-
-field: <integer>
-<field strings>
-
-Where each string in <fields strings> corresponding to one set bit of
-<integer>. The bit position is the position of "string" in <field
-strings> description.
-
-For more detailed explanation of every field, please refer to UEFI
-specification version 2.3 or later, section Appendix N: Common
-Platform Error Record.
diff --git a/Documentation/firmware-guide/acpi/apei/output_format.rst b/Documentation/firmware-guide/acpi/apei/output_format.rst
new file mode 100644
index 000000000000..c2e7ebddb529
--- /dev/null
+++ b/Documentation/firmware-guide/acpi/apei/output_format.rst
@@ -0,0 +1,150 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================
+APEI output format
+==================
+
+APEI uses printk as its hardware error reporting interface; the output
+format is as follows::
+
+ <error record> :=
+ APEI generic hardware error status
+ severity: <integer>, <severity string>
+ section: <integer>, severity: <integer>, <severity string>
+ flags: <integer>
+ <section flags strings>
+ fru_id: <uuid string>
+ fru_text: <string>
+ section_type: <section type string>
+ <section data>
+
+ <severity string>* := recoverable | fatal | corrected | info
+
+ <section flags strings># :=
+ [primary][, containment warning][, reset][, threshold exceeded]\
+ [, resource not accessible][, latent error]
+
+ <section type string> := generic processor error | memory error | \
+ PCIe error | unknown, <uuid string>
+
+ <section data> :=
+ <generic processor section data> | <memory section data> | \
+ <pcie section data> | <null>
+
+ <generic processor section data> :=
+ [processor_type: <integer>, <proc type string>]
+ [processor_isa: <integer>, <proc isa string>]
+ [error_type: <integer>
+ <proc error type strings>]
+ [operation: <integer>, <proc operation string>]
+ [flags: <integer>
+ <proc flags strings>]
+ [level: <integer>]
+ [version_info: <integer>]
+ [processor_id: <integer>]
+ [target_address: <integer>]
+ [requestor_id: <integer>]
+ [responder_id: <integer>]
+ [IP: <integer>]
+
+ <proc type string>* := IA32/X64 | IA64
+
+ <proc isa string>* := IA32 | IA64 | X64
+
+ <processor error type strings># :=
+ [cache error][, TLB error][, bus error][, micro-architectural error]
+
+ <proc operation string>* := unknown or generic | data read | data write | \
+ instruction execution
+
+ <proc flags strings># :=
+ [restartable][, precise IP][, overflow][, corrected]
+
+ <memory section data> :=
+ [error_status: <integer>]
+ [physical_address: <integer>]
+ [physical_address_mask: <integer>]
+ [node: <integer>]
+ [card: <integer>]
+ [module: <integer>]
+ [bank: <integer>]
+ [device: <integer>]
+ [row: <integer>]
+ [column: <integer>]
+ [bit_position: <integer>]
+ [requestor_id: <integer>]
+ [responder_id: <integer>]
+ [target_id: <integer>]
+ [error_type: <integer>, <mem error type string>]
+
+ <mem error type string>* :=
+ unknown | no error | single-bit ECC | multi-bit ECC | \
+ single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \
+ target abort | parity error | watchdog timeout | invalid address | \
+ mirror Broken | memory sparing | scrub corrected error | \
+ scrub uncorrected error
+
+ <pcie section data> :=
+ [port_type: <integer>, <pcie port type string>]
+ [version: <integer>.<integer>]
+ [command: <integer>, status: <integer>]
+ [device_id: <integer>:<integer>:<integer>.<integer>
+ slot: <integer>
+ secondary_bus: <integer>
+ vendor_id: <integer>, device_id: <integer>
+ class_code: <integer>]
+ [serial number: <integer>, <integer>]
+ [bridge: secondary_status: <integer>, control: <integer>]
+ [aer_status: <integer>, aer_mask: <integer>
+ <aer status string>
+ [aer_uncor_severity: <integer>]
+ aer_layer=<aer layer string>, aer_agent=<aer agent string>
+ aer_tlp_header: <integer> <integer> <integer> <integer>]
+
+ <pcie port type string>* := PCIe end point | legacy PCI end point | \
+ unknown | unknown | root port | upstream switch port | \
+ downstream switch port | PCIe to PCI/PCI-X bridge | \
+ PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \
+ root complex event collector
+
+ if section severity is fatal or recoverable
+ <aer status string># :=
+ unknown | unknown | unknown | unknown | Data Link Protocol | \
+ unknown | unknown | unknown | unknown | unknown | unknown | unknown | \
+ Poisoned TLP | Flow Control Protocol | Completion Timeout | \
+ Completer Abort | Unexpected Completion | Receiver Overflow | \
+ Malformed TLP | ECRC | Unsupported Request
+ else
+ <aer status string># :=
+ Receiver Error | unknown | unknown | unknown | unknown | unknown | \
+ Bad TLP | Bad DLLP | RELAY_NUM Rollover | unknown | unknown | unknown | \
+ Replay Timer Timeout | Advisory Non-Fatal
+ fi
+
+ <aer layer string> :=
+ Physical Layer | Data Link Layer | Transaction Layer
+
+ <aer agent string> :=
+ Receiver ID | Requester ID | Completer ID | Transmitter ID
+
+Here, [] indicates that the enclosed content is optional.
+
+Every <field string> description marked with * has the following format::
+
+ field: <integer>, <field string>
+
+Here the value of <integer> is the position of "string" in the <field
+string> description. Otherwise, <field string> will be "unknown".
+
+Every <field strings> description marked with # has the following format::
+
+ field: <integer>
+ <field strings>
+
+Here each string in <field strings> corresponds to one set bit of
+<integer>. The bit position is the position of "string" in the <field
+strings> description.
+
+For a more detailed explanation of every field, refer to the UEFI
+specification, version 2.3 or later, Appendix N: Common
+Platform Error Record.
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index e9f253d54897..869badba6d7a 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -17,6 +17,7 @@ ACPI Support
DSD-properties-rules
debug
aml-debugger
+ apei/output_format
gpio-properties
i2c-muxes
acpi-lid
--
2.20.1
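
The two decoding rules described at the end of output_format.rst (the "*" rule
for a single string and the "#" rule for one string per set bit) can be
sketched as below. The function names are illustrative, and the sketch assumes
positions are counted from zero, which the document does not state explicitly.
The string tables are copied from the grammar in the patch.

```python
# Strings copied from the grammar in output_format.rst, in listed order.
SEVERITIES = ("recoverable", "fatal", "corrected", "info")
SECTION_FLAGS = (
    "primary",
    "containment warning",
    "reset",
    "threshold exceeded",
    "resource not accessible",
    "latent error",
)


def decode_enum(value, strings):
    """'*' rule: <integer> is the position of the string, else "unknown".

    Assumption of this sketch: positions start at 0.
    """
    if 0 <= value < len(strings):
        return strings[value]
    return "unknown"


def decode_bits(value, strings):
    """'#' rule: each set bit of <integer> selects the string at that bit
    position (bit 0 selects the first string; an assumption, as above)."""
    return [name for bit, name in enumerate(strings) if value & (1 << bit)]
```

For example, decode_bits(0b101, SECTION_FLAGS) selects the first and third
section flag strings, and an out-of-range severity falls back to "unknown".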

2019-04-23 16:35:08

by Changbin Du

Subject: [PATCH v4 20/63] Documentation: ACPI: move apei/einj.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/apei/einj.rst} | 98 ++++++++++---------
Documentation/firmware-guide/acpi/index.rst | 1 +
2 files changed, 53 insertions(+), 46 deletions(-)
rename Documentation/{acpi/apei/einj.txt => firmware-guide/acpi/apei/einj.rst} (67%)

diff --git a/Documentation/acpi/apei/einj.txt b/Documentation/firmware-guide/acpi/apei/einj.rst
similarity index 67%
rename from Documentation/acpi/apei/einj.txt
rename to Documentation/firmware-guide/acpi/apei/einj.rst
index e550c8b98139..d85e2667155c 100644
--- a/Documentation/acpi/apei/einj.txt
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -1,13 +1,16 @@
- APEI Error INJection
- ~~~~~~~~~~~~~~~~~~~~
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+APEI Error INJection
+====================

EINJ provides a hardware error injection mechanism. It is very useful
for debugging and testing APEI and RAS features in general.

You need to check whether your BIOS supports EINJ first. For that, look
-for early boot messages similar to this one:
+for early boot messages similar to this one::

-ACPI: EINJ 0x000000007370A000 000150 (v01 INTEL 00000001 INTL 00000001)
+ ACPI: EINJ 0x000000007370A000 000150 (v01 INTEL 00000001 INTL 00000001)

which shows that the BIOS is exposing an EINJ table - it is the
mechanism through which the injection is done.
@@ -23,11 +26,11 @@ order to see the APEI,EINJ,... functionality supported and exposed by
the BIOS menu.

To use EINJ, make sure the following are options enabled in your kernel
-configuration:
+configuration::

-CONFIG_DEBUG_FS
-CONFIG_ACPI_APEI
-CONFIG_ACPI_APEI_EINJ
+ CONFIG_DEBUG_FS
+ CONFIG_ACPI_APEI
+ CONFIG_ACPI_APEI_EINJ

The EINJ user interface is in <debugfs mount point>/apei/einj.

@@ -35,22 +38,22 @@ The following files belong to it:

- available_error_type

- This file shows which error types are supported:
-
- Error Type Value Error Description
- ================ =================
- 0x00000001 Processor Correctable
- 0x00000002 Processor Uncorrectable non-fatal
- 0x00000004 Processor Uncorrectable fatal
- 0x00000008 Memory Correctable
- 0x00000010 Memory Uncorrectable non-fatal
- 0x00000020 Memory Uncorrectable fatal
- 0x00000040 PCI Express Correctable
- 0x00000080 PCI Express Uncorrectable fatal
- 0x00000100 PCI Express Uncorrectable non-fatal
- 0x00000200 Platform Correctable
- 0x00000400 Platform Uncorrectable non-fatal
- 0x00000800 Platform Uncorrectable fatal
+ This file shows which error types are supported::
+
+ Error Type Value Error Description
+ ================ =================
+ 0x00000001 Processor Correctable
+ 0x00000002 Processor Uncorrectable non-fatal
+ 0x00000004 Processor Uncorrectable fatal
+ 0x00000008 Memory Correctable
+ 0x00000010 Memory Uncorrectable non-fatal
+ 0x00000020 Memory Uncorrectable fatal
+ 0x00000040 PCI Express Correctable
+ 0x00000080 PCI Express Uncorrectable fatal
+ 0x00000100 PCI Express Uncorrectable non-fatal
+ 0x00000200 Platform Correctable
+ 0x00000400 Platform Uncorrectable non-fatal
+ 0x00000800 Platform Uncorrectable fatal

The format of the file contents are as above, except present are only
the available error types.
@@ -73,9 +76,12 @@ The following files belong to it:
injection. Value is a bitmask as specified in ACPI5.0 spec for the
SET_ERROR_TYPE_WITH_ADDRESS data structure:

- Bit 0 - Processor APIC field valid (see param3 below).
- Bit 1 - Memory address and mask valid (param1 and param2).
- Bit 2 - PCIe (seg,bus,dev,fn) valid (see param4 below).
+ Bit 0
+ Processor APIC field valid (see param3 below).
+ Bit 1
+ Memory address and mask valid (param1 and param2).
+ Bit 2
+ PCIe (seg,bus,dev,fn) valid (see param4 below).

If set to zero, legacy behavior is mimicked where the type of
injection specifies just one bit set, and param1 is multiplexed.
@@ -121,7 +127,7 @@ BIOS versions based on the ACPI 5.0 specification have more control over
the target of the injection. For processor-related errors (type 0x1, 0x2
and 0x4), you can set flags to 0x3 (param3 for bit 0, and param1 and
param2 for bit 1) so that you have more information added to the error
-signature being injected. The actual data passed is this:
+signature being injected. The actual data passed is this::

memory_address = param1;
memory_address_range = param2;
@@ -131,7 +137,7 @@ signature being injected. The actual data passed is this:
For memory errors (type 0x8, 0x10 and 0x20) the address is set using
param1 with a mask in param2 (0x0 is equivalent to all ones). For PCI
express errors (type 0x40, 0x80 and 0x100) the segment, bus, device and
-function are specified using param1:
+function are specified using param1::

31 24 23 16 15 11 10 8 7 0
+-------------------------------------------------+
@@ -152,26 +158,26 @@ documentation for details (and expect changes to this API if vendors
creativity in using this feature expands beyond our expectations).


-An error injection example:
+An error injection example::

-# cd /sys/kernel/debug/apei/einj
-# cat available_error_type # See which errors can be injected
-0x00000002 Processor Uncorrectable non-fatal
-0x00000008 Memory Correctable
-0x00000010 Memory Uncorrectable non-fatal
-# echo 0x12345000 > param1 # Set memory address for injection
-# echo $((-1 << 12)) > param2 # Mask 0xfffffffffffff000 - anywhere in this page
-# echo 0x8 > error_type # Choose correctable memory error
-# echo 1 > error_inject # Inject now
+ # cd /sys/kernel/debug/apei/einj
+ # cat available_error_type # See which errors can be injected
+ 0x00000002 Processor Uncorrectable non-fatal
+ 0x00000008 Memory Correctable
+ 0x00000010 Memory Uncorrectable non-fatal
+ # echo 0x12345000 > param1 # Set memory address for injection
+ # echo $((-1 << 12)) > param2 # Mask 0xfffffffffffff000 - anywhere in this page
+ # echo 0x8 > error_type # Choose correctable memory error
+ # echo 1 > error_inject # Inject now

-You should see something like this in dmesg:
+You should see something like this in dmesg::

-[22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
-[22715.834759] EDAC sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
-[22715.834759] EDAC sbridge MC3: TSC 0
-[22715.834759] EDAC sbridge MC3: ADDR 12345000 EDAC sbridge MC3: MISC 144780c86
-[22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
-[22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
+ [22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
+ [22715.834759] EDAC sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
+ [22715.834759] EDAC sbridge MC3: TSC 0
+ [22715.834759] EDAC sbridge MC3: ADDR 12345000 EDAC sbridge MC3: MISC 144780c86
+ [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
+ [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)

For more information about EINJ, please refer to ACPI specification
version 4.0, section 17.5 and ACPI 5.0, section 18.6.
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 869badba6d7a..fca854f017d8 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -18,6 +18,7 @@ ACPI Support
debug
aml-debugger
apei/output_format
+ apei/einj
gpio-properties
i2c-muxes
acpi-lid
--
2.20.1
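
The parameter arithmetic in einj.rst can be checked with a short sketch. The
helper names below are illustrative only. The flag bits and the mask come from
the text of the patch; the PCIe field order (segment, bus, device, function,
then a low reserved byte) is an assumption read off the bit-range header
"31 24 23 16 15 11 10 8 7 0", because the field row of that diagram is not
visible in this excerpt.

```python
U64_MASK = 0xFFFFFFFFFFFFFFFF

# Bits of the "flags" file, as listed in the documentation.
FLAG_APIC_VALID = 1 << 0   # processor APIC field valid
FLAG_MEM_VALID = 1 << 1    # memory address and mask valid
FLAG_PCIE_VALID = 1 << 2   # PCIe (seg,bus,dev,fn) valid


def address_mask(shift):
    """Unsigned 64-bit equivalent of the shell expression $((-1 << shift))."""
    return (-1 << shift) & U64_MASK


def pcie_sbdf(segment, bus, device, function):
    """Pack a PCIe target into one 32-bit value.

    Assumed layout: segment in bits 31-24, bus in 23-16, device in 15-11,
    function in 10-8, bits 7-0 left as zero.
    """
    fields = (
        ("segment", segment, 8),
        ("bus", bus, 8),
        ("device", device, 5),
        ("function", function, 3),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} out of range: {value}")
    return (segment << 24) | (bus << 16) | (device << 11) | (function << 8)
```

address_mask(12) gives the 0xfffffffffffff000 mask used in the injection
example, and FLAG_APIC_VALID | FLAG_MEM_VALID is the 0x3 flags value mentioned
for processor-related errors.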

2019-04-23 16:35:09

by Changbin Du

Subject: [PATCH v4 12/63] Documentation: ACPI: move i2c-muxes.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/i2c-muxes.txt | 58 ------------------
.../firmware-guide/acpi/i2c-muxes.rst | 61 +++++++++++++++++++
Documentation/firmware-guide/acpi/index.rst | 3 +-
3 files changed, 63 insertions(+), 59 deletions(-)
delete mode 100644 Documentation/acpi/i2c-muxes.txt
create mode 100644 Documentation/firmware-guide/acpi/i2c-muxes.rst

diff --git a/Documentation/acpi/i2c-muxes.txt b/Documentation/acpi/i2c-muxes.txt
deleted file mode 100644
index 9fcc4f0b885e..000000000000
--- a/Documentation/acpi/i2c-muxes.txt
+++ /dev/null
@@ -1,58 +0,0 @@
-ACPI I2C Muxes
---------------
-
-Describing an I2C device hierarchy that includes I2C muxes requires an ACPI
-Device () scope per mux channel.
-
-Consider this topology:
-
-+------+ +------+
-| SMB1 |-->| MUX0 |--CH00--> i2c client A (0x50)
-| | | 0x70 |--CH01--> i2c client B (0x50)
-+------+ +------+
-
-which corresponds to the following ASL:
-
-Device (SMB1)
-{
- Name (_HID, ...)
- Device (MUX0)
- {
- Name (_HID, ...)
- Name (_CRS, ResourceTemplate () {
- I2cSerialBus (0x70, ControllerInitiated, I2C_SPEED,
- AddressingMode7Bit, "^SMB1", 0x00,
- ResourceConsumer,,)
- }
-
- Device (CH00)
- {
- Name (_ADR, 0)
-
- Device (CLIA)
- {
- Name (_HID, ...)
- Name (_CRS, ResourceTemplate () {
- I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
- AddressingMode7Bit, "^CH00", 0x00,
- ResourceConsumer,,)
- }
- }
- }
-
- Device (CH01)
- {
- Name (_ADR, 1)
-
- Device (CLIB)
- {
- Name (_HID, ...)
- Name (_CRS, ResourceTemplate () {
- I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
- AddressingMode7Bit, "^CH01", 0x00,
- ResourceConsumer,,)
- }
- }
- }
- }
-}
diff --git a/Documentation/firmware-guide/acpi/i2c-muxes.rst b/Documentation/firmware-guide/acpi/i2c-muxes.rst
new file mode 100644
index 000000000000..3a8997ccd7c4
--- /dev/null
+++ b/Documentation/firmware-guide/acpi/i2c-muxes.rst
@@ -0,0 +1,61 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+ACPI I2C Muxes
+==============
+
+Describing an I2C device hierarchy that includes I2C muxes requires an ACPI
+Device () scope per mux channel.
+
+Consider this topology::
+
+ +------+ +------+
+ | SMB1 |-->| MUX0 |--CH00--> i2c client A (0x50)
+ | | | 0x70 |--CH01--> i2c client B (0x50)
+ +------+ +------+
+
+which corresponds to the following ASL::
+
+ Device (SMB1)
+ {
+ Name (_HID, ...)
+ Device (MUX0)
+ {
+ Name (_HID, ...)
+ Name (_CRS, ResourceTemplate () {
+ I2cSerialBus (0x70, ControllerInitiated, I2C_SPEED,
+ AddressingMode7Bit, "^SMB1", 0x00,
+ ResourceConsumer,,)
+ }
+
+ Device (CH00)
+ {
+ Name (_ADR, 0)
+
+ Device (CLIA)
+ {
+ Name (_HID, ...)
+ Name (_CRS, ResourceTemplate () {
+ I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
+ AddressingMode7Bit, "^CH00", 0x00,
+ ResourceConsumer,,)
+ }
+ }
+ }
+
+ Device (CH01)
+ {
+ Name (_ADR, 1)
+
+ Device (CLIB)
+ {
+ Name (_HID, ...)
+ Name (_CRS, ResourceTemplate () {
+ I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
+ AddressingMode7Bit, "^CH01", 0x00,
+ ResourceConsumer,,)
+ }
+ }
+ }
+ }
+ }
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index d1d069b26bbc..1c89888f6ee8 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -12,4 +12,5 @@ ACPI Support
osi
method-customizing
DSD-properties-rules
- gpio-properties
\ No newline at end of file
+ gpio-properties
+ i2c-muxes
--
2.20.1

2019-04-23 16:35:18

by Changbin Du

Subject: [PATCH v4 22/63] Documentation: ACPI: move lpit.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/firmware-guide/acpi/index.rst | 1 +
.../lpit.txt => firmware-guide/acpi/lpit.rst} | 18 +++++++++++++-----
2 files changed, 14 insertions(+), 5 deletions(-)
rename Documentation/{acpi/lpit.txt => firmware-guide/acpi/lpit.rst} (68%)

diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index fca854f017d8..0e60f4b7129a 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -22,3 +22,4 @@ ACPI Support
gpio-properties
i2c-muxes
acpi-lid
+ lpit
diff --git a/Documentation/acpi/lpit.txt b/Documentation/firmware-guide/acpi/lpit.rst
similarity index 68%
rename from Documentation/acpi/lpit.txt
rename to Documentation/firmware-guide/acpi/lpit.rst
index b426398d2e97..aca928fab027 100644
--- a/Documentation/acpi/lpit.txt
+++ b/Documentation/firmware-guide/acpi/lpit.rst
@@ -1,3 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+Low Power Idle Table (LPIT)
+===========================
+
To enumerate platform Low Power Idle states, Intel platforms are using
“Low Power Idle Table” (LPIT). More details about this table can be
downloaded from:
@@ -8,13 +14,15 @@ Residencies for each low power state can be read via FFH

On platforms supporting S0ix sleep states, there can be two types of
residencies:
-- CPU PKG C10 (Read via FFH interface)
-- Platform Controller Hub (PCH) SLP_S0 (Read via memory mapped interface)
+
+ - CPU PKG C10 (Read via FFH interface)
+ - Platform Controller Hub (PCH) SLP_S0 (Read via memory mapped interface)

The following attributes are added dynamically to the cpuidle
-sysfs attribute group:
- /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
- /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
+sysfs attribute group::
+
+ /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
+ /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us

The "low_power_idle_cpu_residency_us" attribute shows time spent
by the CPU package in PKG C10
--
2.20.1
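
For anyone who wants to poke at the two attributes listed in lpit.rst, here is
a minimal sketch. The sysfs paths are taken from the document; the function
name and the optional base-directory argument are assumptions made for
illustration, and the attributes only exist on platforms that expose LPIT.

```shell
# Sketch: print the LPIT residency counters, or say that they are missing.
# show_lpit_residency and its optional argument are hypothetical helpers.
show_lpit_residency() {
    base="${1:-/sys/devices/system/cpu/cpuidle}"
    for attr in low_power_idle_cpu_residency_us \
                low_power_idle_system_residency_us; do
        if [ -r "$base/$attr" ]; then
            # Values are reported in microseconds, per the attribute names.
            printf '%s: %s us\n' "$attr" "$(cat "$base/$attr")"
        else
            printf '%s: not available\n' "$attr"
        fi
    done
}
```

Calling show_lpit_residency with no argument reads the real sysfs directory;
passing a directory makes it easy to try against saved copies of the files.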

2019-04-23 16:35:19

by Changbin Du

Subject: [PATCH v4 13/63] Documentation: ACPI: move acpi-lid.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/acpi-lid.rst} | 48 ++++++++++++-------
Documentation/firmware-guide/acpi/index.rst | 1 +
2 files changed, 33 insertions(+), 16 deletions(-)
rename Documentation/{acpi/acpi-lid.txt => firmware-guide/acpi/acpi-lid.rst} (77%)

diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/firmware-guide/acpi/acpi-lid.rst
similarity index 77%
rename from Documentation/acpi/acpi-lid.txt
rename to Documentation/firmware-guide/acpi/acpi-lid.rst
index effe7af3a5af..1d19e15a6945 100644
--- a/Documentation/acpi/acpi-lid.txt
+++ b/Documentation/firmware-guide/acpi/acpi-lid.rst
@@ -1,25 +1,29 @@
-Special Usage Model of the ACPI Control Method Lid Device
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

-Copyright (C) 2016, Intel Corporation
-Author: Lv Zheng <[email protected]>
+=========================================================
+Special Usage Model of the ACPI Control Method Lid Device
+=========================================================

+:Copyright: |copy| 2016, Intel Corporation

-Abstract:
+:Author: Lv Zheng <[email protected]>

-Platforms containing lids convey lid state (open/close) to OSPMs using a
-control method lid device. To implement this, the AML tables issue
-Notify(lid_device, 0x80) to notify the OSPMs whenever the lid state has
-changed. The _LID control method for the lid device must be implemented to
-report the "current" state of the lid as either "opened" or "closed".
+:Abstract: Platforms containing lids convey lid state (open/close) to OSPMs
+ using a control method lid device. To implement this, the AML tables issue
+ Notify(lid_device, 0x80) to notify the OSPMs whenever the lid state has
+ changed. The _LID control method for the lid device must be implemented to
+ report the "current" state of the lid as either "opened" or "closed".

-For most platforms, both the _LID method and the lid notifications are
-reliable. However, there are exceptions. In order to work with these
-exceptional buggy platforms, special restrictions and expections should be
-taken into account. This document describes the restrictions and the
-expections of the Linux ACPI lid device driver.
+ For most platforms, both the _LID method and the lid notifications are
+ reliable. However, there are exceptions. In order to work with these
+ exceptional buggy platforms, special restrictions and expections should be
+ taken into account. This document describes the restrictions and the
+ expections of the Linux ACPI lid device driver.


1. Restrictions of the returning value of the _LID control method
+=================================================================

The _LID control method is described to return the "current" lid state.
However the word of "current" has ambiguity, some buggy AML tables return
@@ -31,6 +35,7 @@ with cached value, the initial returning value is likely not reliable.
There are platforms always retun "closed" as initial lid state.

2. Restrictions of the lid state change notifications
+=====================================================

There are buggy AML tables never notifying when the lid device state is
changed to "opened". Thus the "opened" notification is not guaranteed. But
@@ -40,17 +45,21 @@ trigger some system power saving operations on Windows. Since it is fully
tested, it is reliable from all AML tables.

3. Expections for the userspace users of the ACPI lid device driver
+===================================================================

The ACPI button driver exports the lid state to the userspace via the
-following file:
+following file::
+
/proc/acpi/button/lid/LID0/state
+
This file actually calls the _LID control method described above. And given
the previous explanation, it is not reliable enough on some platforms. So
it is advised for the userspace program to not to solely rely on this file
to determine the actual lid state.

The ACPI button driver emits the following input event to the userspace:
- SW_LID
+ * SW_LID
+
The ACPI lid device driver is implemented to try to deliver the platform
triggered events to the userspace. However, given the fact that the buggy
firmware cannot make sure "opened"/"closed" events are paired, the ACPI
@@ -59,20 +68,25 @@ button driver uses the following 3 modes in order not to trigger issues.
If the userspace hasn't been prepared to ignore the unreliable "opened"
events and the unreliable initial state notification, Linux users can use
the following kernel parameters to handle the possible issues:
+
A. button.lid_init_state=method:
When this option is specified, the ACPI button driver reports the
initial lid state using the returning value of the _LID control method
and whether the "opened"/"closed" events are paired fully relies on the
firmware implementation.
+
This option can be used to fix some platforms where the returning value
of the _LID control method is reliable but the initial lid state
notification is missing.
+
This option is the default behavior during the period the userspace
isn't ready to handle the buggy AML tables.
+
B. button.lid_init_state=open:
When this option is specified, the ACPI button driver always reports the
initial lid state as "opened" and whether the "opened"/"closed" events
are paired fully relies on the firmware implementation.
+
This may fix some platforms where the returning value of the _LID
control method is not reliable and the initial lid state notification is
missing.
@@ -80,6 +94,7 @@ B. button.lid_init_state=open:
If the userspace has been prepared to ignore the unreliable "opened" events
and the unreliable initial state notification, Linux users should always
use the following kernel parameter:
+
C. button.lid_init_state=ignore:
When this option is specified, the ACPI button driver never reports the
initial lid state and there is a compensation mechanism implemented to
@@ -89,6 +104,7 @@ C. button.lid_init_state=ignore:
notifications can be delivered to the userspace when the lid is actually
opens given that some AML tables do not send "opened" notifications
reliably.
+
In this mode, if everything is correctly implemented by the platform
firmware, the old userspace programs should still work. Otherwise, the
new userspace programs are required to work with the ACPI button driver.
diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 1c89888f6ee8..bedcb0b242a2 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -14,3 +14,4 @@ ACPI Support
DSD-properties-rules
gpio-properties
i2c-muxes
+ acpi-lid
\ No newline at end of file
--
2.20.1
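
The three button.lid_init_state modes described in acpi-lid.rst are selected on
the kernel command line. A small sketch for checking which one a given command
line asks for follows; the function name is hypothetical, and reading
/proc/cmdline as the default input is an assumption about where the caller
gets the command line from.

```shell
# Sketch: report the button.lid_init_state mode found in a kernel command line.
# Runs in a subshell so that "set -f" (no globbing) does not leak to the caller.
lid_init_state() (
    set -f
    # $1: command line text; falls back to the running kernel's command line.
    cmdline="${1:-$(cat /proc/cmdline)}"
    mode=unset
    for word in $cmdline; do
        case "$word" in
            button.lid_init_state=*) mode="${word#button.lid_init_state=}" ;;
        esac
    done
    case "$mode" in
        method|open|ignore|unset) echo "$mode" ;;   # the modes named in the document
        *) echo "unknown:$mode" ;;
    esac
)
```

The sketch prints "unset" rather than guessing a default when the parameter is
absent, because the effective default depends on the kernel in use.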

2019-04-23 16:35:24

by Changbin Du

Subject: [PATCH v4 23/63] Documentation: ACPI: move ssdt-overlays.txt to admin-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/acpi/ssdt-overlays.txt | 172 -----------------
Documentation/admin-guide/acpi/index.rst | 1 +
.../admin-guide/acpi/ssdt-overlays.rst | 180 ++++++++++++++++++
3 files changed, 181 insertions(+), 172 deletions(-)
delete mode 100644 Documentation/acpi/ssdt-overlays.txt
create mode 100644 Documentation/admin-guide/acpi/ssdt-overlays.rst

diff --git a/Documentation/acpi/ssdt-overlays.txt b/Documentation/acpi/ssdt-overlays.txt
deleted file mode 100644
index 5ae13f161ea2..000000000000
--- a/Documentation/acpi/ssdt-overlays.txt
+++ /dev/null
@@ -1,172 +0,0 @@
-
-In order to support ACPI open-ended hardware configurations (e.g. development
-boards) we need a way to augment the ACPI configuration provided by the firmware
-image. A common example is connecting sensors on I2C / SPI buses on development
-boards.
-
-Although this can be accomplished by creating a kernel platform driver or
-recompiling the firmware image with updated ACPI tables, neither is practical:
-the former proliferates board specific kernel code while the latter requires
-access to firmware tools which are often not publicly available.
-
-Because ACPI supports external references in AML code a more practical
-way to augment firmware ACPI configuration is by dynamically loading
-user defined SSDT tables that contain the board specific information.
-
-For example, to enumerate a Bosch BMA222E accelerometer on the I2C bus of the
-Minnowboard MAX development board exposed via the LSE connector [1], the
-following ASL code can be used:
-
-DefinitionBlock ("minnowmax.aml", "SSDT", 1, "Vendor", "Accel", 0x00000003)
-{
- External (\_SB.I2C6, DeviceObj)
-
- Scope (\_SB.I2C6)
- {
- Device (STAC)
- {
- Name (_ADR, Zero)
- Name (_HID, "BMA222E")
-
- Method (_CRS, 0, Serialized)
- {
- Name (RBUF, ResourceTemplate ()
- {
- I2cSerialBus (0x0018, ControllerInitiated, 0x00061A80,
- AddressingMode7Bit, "\\_SB.I2C6", 0x00,
- ResourceConsumer, ,)
- GpioInt (Edge, ActiveHigh, Exclusive, PullDown, 0x0000,
- "\\_SB.GPO2", 0x00, ResourceConsumer, , )
- { // Pin list
- 0
- }
- })
- Return (RBUF)
- }
- }
- }
-}
-
-which can then be compiled to AML binary format:
-
-$ iasl minnowmax.asl
-
-Intel ACPI Component Architecture
-ASL Optimizing Compiler version 20140214-64 [Mar 29 2014]
-Copyright (c) 2000 - 2014 Intel Corporation
-
-ASL Input: minnomax.asl - 30 lines, 614 bytes, 7 keywords
-AML Output: minnowmax.aml - 165 bytes, 6 named objects, 1 executable opcodes
-
-[1] http://wiki.minnowboard.org/MinnowBoard_MAX#Low_Speed_Expansion_Connector_.28Top.29
-
-The resulting AML code can then be loaded by the kernel using one of the methods
-below.
-
-== Loading ACPI SSDTs from initrd ==
-
-This option allows loading of user defined SSDTs from initrd and it is useful
-when the system does not support EFI or when there is not enough EFI storage.
-
-It works in a similar way with initrd based ACPI tables override/upgrade: SSDT
-aml code must be placed in the first, uncompressed, initrd under the
-"kernel/firmware/acpi" path. Multiple files can be used and this will translate
-in loading multiple tables. Only SSDT and OEM tables are allowed. See
-initrd_table_override.txt for more details.
-
-Here is an example:
-
-# Add the raw ACPI tables to an uncompressed cpio archive.
-# They must be put into a /kernel/firmware/acpi directory inside the
-# cpio archive.
-# The uncompressed cpio archive must be the first.
-# Other, typically compressed cpio archives, must be
-# concatenated on top of the uncompressed one.
-mkdir -p kernel/firmware/acpi
-cp ssdt.aml kernel/firmware/acpi
-
-# Create the uncompressed cpio archive and concatenate the original initrd
-# on top:
-find kernel | cpio -H newc --create > /boot/instrumented_initrd
-cat /boot/initrd >>/boot/instrumented_initrd
-
-== Loading ACPI SSDTs from EFI variables ==
-
-This is the preferred method, when EFI is supported on the platform, because it
-allows a persistent, OS independent way of storing the user defined SSDTs. There
-is also work underway to implement EFI support for loading user defined SSDTs
-and using this method will make it easier to convert to the EFI loading
-mechanism when that will arrive.
-
-In order to load SSDTs from an EFI variable the efivar_ssdt kernel command line
-parameter can be used. The argument for the option is the variable name to
-use. If there are multiple variables with the same name but with different
-vendor GUIDs, all of them will be loaded.
-
-In order to store the AML code in an EFI variable the efivarfs filesystem can be
-used. It is enabled and mounted by default in /sys/firmware/efi/efivars in all
-recent distribution.
-
-Creating a new file in /sys/firmware/efi/efivars will automatically create a new
-EFI variable. Updating a file in /sys/firmware/efi/efivars will update the EFI
-variable. Please note that the file name needs to be specially formatted as
-"Name-GUID" and that the first 4 bytes in the file (little-endian format)
-represent the attributes of the EFI variable (see EFI_VARIABLE_MASK in
-include/linux/efi.h). Writing to the file must also be done with one write
-operation.
-
-For example, you can use the following bash script to create/update an EFI
-variable with the content from a given file:
-
-#!/bin/sh -e
-
-while ! [ -z "$1" ]; do
- case "$1" in
- "-f") filename="$2"; shift;;
- "-g") guid="$2"; shift;;
- *) name="$1";;
- esac
- shift
-done
-
-usage()
-{
- echo "Syntax: ${0##*/} -f filename [ -g guid ] name"
- exit 1
-}
-
-[ -n "$name" -a -f "$filename" ] || usage
-
-EFIVARFS="/sys/firmware/efi/efivars"
-
-[ -d "$EFIVARFS" ] || exit 2
-
-if stat -tf $EFIVARFS | grep -q -v de5e81e4; then
- mount -t efivarfs none $EFIVARFS
-fi
-
-# try to pick up an existing GUID
-[ -n "$guid" ] || guid=$(find "$EFIVARFS" -name "$name-*" | head -n1 | cut -f2- -d-)
-
-# use a randomly generated GUID
-[ -n "$guid" ] || guid="$(cat /proc/sys/kernel/random/uuid)"
-
-# efivarfs expects all of the data in one write
-tmp=$(mktemp)
-/bin/echo -ne "\007\000\000\000" | cat - $filename > $tmp
-dd if=$tmp of="$EFIVARFS/$name-$guid" bs=$(stat -c %s $tmp)
-rm $tmp
-
-== Loading ACPI SSDTs from configfs ==
-
-This option allows loading of user defined SSDTs from userspace via the configfs
-interface. The CONFIG_ACPI_CONFIGFS option must be select and configfs must be
-mounted. In the following examples, we assume that configfs has been mounted in
-/config.
-
-New tables can be loading by creating new directories in /config/acpi/table/ and
-writing the SSDT aml code in the aml attribute:
-
-cd /config/acpi/table
-mkdir my_ssdt
-cat ~/ssdt.aml > my_ssdt/aml
diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
index 9049a7b9f065..4d13eeea1eca 100644
--- a/Documentation/admin-guide/acpi/index.rst
+++ b/Documentation/admin-guide/acpi/index.rst
@@ -10,4 +10,5 @@ the Linux ACPI support.

initrd_table_override
dsdt-override
+ ssdt-overlays
cppc_sysfs
diff --git a/Documentation/admin-guide/acpi/ssdt-overlays.rst b/Documentation/admin-guide/acpi/ssdt-overlays.rst
new file mode 100644
index 000000000000..da37455f96c9
--- /dev/null
+++ b/Documentation/admin-guide/acpi/ssdt-overlays.rst
@@ -0,0 +1,180 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+SSDT Overlays
+=============
+
+In order to support ACPI open-ended hardware configurations (e.g. development
+boards) we need a way to augment the ACPI configuration provided by the firmware
+image. A common example is connecting sensors on I2C / SPI buses on development
+boards.
+
+Although this can be accomplished by creating a kernel platform driver or
+recompiling the firmware image with updated ACPI tables, neither is practical:
+the former proliferates board specific kernel code while the latter requires
+access to firmware tools which are often not publicly available.
+
+Because ACPI supports external references in AML code a more practical
+way to augment firmware ACPI configuration is by dynamically loading
+user defined SSDT tables that contain the board specific information.
+
+For example, to enumerate a Bosch BMA222E accelerometer on the I2C bus of the
+Minnowboard MAX development board exposed via the LSE connector [1], the
+following ASL code can be used::
+
+ DefinitionBlock ("minnowmax.aml", "SSDT", 1, "Vendor", "Accel", 0x00000003)
+ {
+ External (\_SB.I2C6, DeviceObj)
+
+ Scope (\_SB.I2C6)
+ {
+ Device (STAC)
+ {
+ Name (_ADR, Zero)
+ Name (_HID, "BMA222E")
+
+ Method (_CRS, 0, Serialized)
+ {
+ Name (RBUF, ResourceTemplate ()
+ {
+ I2cSerialBus (0x0018, ControllerInitiated, 0x00061A80,
+ AddressingMode7Bit, "\\_SB.I2C6", 0x00,
+ ResourceConsumer, ,)
+ GpioInt (Edge, ActiveHigh, Exclusive, PullDown, 0x0000,
+ "\\_SB.GPO2", 0x00, ResourceConsumer, , )
+ { // Pin list
+ 0
+ }
+ })
+ Return (RBUF)
+ }
+ }
+ }
+ }
+
+which can then be compiled to AML binary format::
+
+ $ iasl minnowmax.asl
+
+ Intel ACPI Component Architecture
+ ASL Optimizing Compiler version 20140214-64 [Mar 29 2014]
+ Copyright (c) 2000 - 2014 Intel Corporation
+
+ ASL Input: minnowmax.asl - 30 lines, 614 bytes, 7 keywords
+ AML Output: minnowmax.aml - 165 bytes, 6 named objects, 1 executable opcodes
+
+[1] http://wiki.minnowboard.org/MinnowBoard_MAX#Low_Speed_Expansion_Connector_.28Top.29
+
+The resulting AML code can then be loaded by the kernel using one of the methods
+below.
+
+Loading ACPI SSDTs from initrd
+==============================
+
+This option allows loading of user defined SSDTs from initrd and it is useful
+when the system does not support EFI or when there is not enough EFI storage.
+
+It works in a similar way with initrd based ACPI tables override/upgrade: SSDT
+aml code must be placed in the first, uncompressed, initrd under the
+"kernel/firmware/acpi" path. Multiple files can be used and this will translate
+in loading multiple tables. Only SSDT and OEM tables are allowed. See
+initrd_table_override.txt for more details.
+
+Here is an example::
+
+ # Add the raw ACPI tables to an uncompressed cpio archive.
+ # They must be put into a /kernel/firmware/acpi directory inside the
+ # cpio archive.
+ # The uncompressed cpio archive must be the first.
+ # Other, typically compressed cpio archives, must be
+ # concatenated on top of the uncompressed one.
+ mkdir -p kernel/firmware/acpi
+ cp ssdt.aml kernel/firmware/acpi
+
+ # Create the uncompressed cpio archive and concatenate the original initrd
+ # on top:
+ find kernel | cpio -H newc --create > /boot/instrumented_initrd
+ cat /boot/initrd >>/boot/instrumented_initrd
+
+Loading ACPI SSDTs from EFI variables
+=====================================
+
+This is the preferred method, when EFI is supported on the platform, because it
+allows a persistent, OS independent way of storing the user defined SSDTs. There
+is also work underway to implement EFI support for loading user defined SSDTs
+and using this method will make it easier to convert to the EFI loading
+mechanism when that will arrive.
+
+In order to load SSDTs from an EFI variable the efivar_ssdt kernel command line
+parameter can be used. The argument for the option is the variable name to
+use. If there are multiple variables with the same name but with different
+vendor GUIDs, all of them will be loaded.
+
+In order to store the AML code in an EFI variable the efivarfs filesystem can be
+used. It is enabled and mounted by default in /sys/firmware/efi/efivars in all
+recent distributions.
+
+Creating a new file in /sys/firmware/efi/efivars will automatically create a new
+EFI variable. Updating a file in /sys/firmware/efi/efivars will update the EFI
+variable. Please note that the file name needs to be specially formatted as
+"Name-GUID" and that the first 4 bytes in the file (little-endian format)
+represent the attributes of the EFI variable (see EFI_VARIABLE_MASK in
+include/linux/efi.h). Writing to the file must also be done with one write
+operation.
+
+For example, you can use the following bash script to create/update an EFI
+variable with the content from a given file::
+
+ #!/bin/sh -e
+
+ while ! [ -z "$1" ]; do
+ case "$1" in
+ "-f") filename="$2"; shift;;
+ "-g") guid="$2"; shift;;
+ *) name="$1";;
+ esac
+ shift
+ done
+
+ usage()
+ {
+ echo "Syntax: ${0##*/} -f filename [ -g guid ] name"
+ exit 1
+ }
+
+ [ -n "$name" -a -f "$filename" ] || usage
+
+ EFIVARFS="/sys/firmware/efi/efivars"
+
+ [ -d "$EFIVARFS" ] || exit 2
+
+ if stat -tf $EFIVARFS | grep -q -v de5e81e4; then
+ mount -t efivarfs none $EFIVARFS
+ fi
+
+ # try to pick up an existing GUID
+ [ -n "$guid" ] || guid=$(find "$EFIVARFS" -name "$name-*" | head -n1 | cut -f2- -d-)
+
+ # use a randomly generated GUID
+ [ -n "$guid" ] || guid="$(cat /proc/sys/kernel/random/uuid)"
+
+ # efivarfs expects all of the data in one write
+ tmp=$(mktemp)
+ /bin/echo -ne "\007\000\000\000" | cat - $filename > $tmp
+ dd if=$tmp of="$EFIVARFS/$name-$guid" bs=$(stat -c %s $tmp)
+ rm $tmp
+
+Loading ACPI SSDTs from configfs
+================================
+
+This option allows loading of user defined SSDTs from userspace via the configfs
+interface. The CONFIG_ACPI_CONFIGFS option must be selected and configfs must be
+mounted. In the following examples, we assume that configfs has been mounted in
+/config.
+
+New tables can be loaded by creating new directories in /config/acpi/table/ and
+writing the SSDT aml code in the aml attribute::
+
+ cd /config/acpi/table
+ mkdir my_ssdt
+ cat ~/ssdt.aml > my_ssdt/aml
--
2.20.1
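
The part of the efivarfs script in ssdt-overlays.rst that tends to trip people
up is the payload layout: four attribute bytes first, then the AML, all in one
file so it can be written in a single operation. The sketch below isolates just
that step so it can be checked without touching efivarfs. The function name is
hypothetical; the attribute bytes are the same ones the script in the document
writes.

```shell
# Sketch: build the efivarfs payload (4 attribute bytes + AML) in a regular file.
# make_efivar_payload is a hypothetical helper, not part of any existing tool.
make_efivar_payload() {
    # $1: AML input file, $2: output file
    # \007\000\000\000 is the little-endian attribute word used in the document.
    { printf '\007\000\000\000'; cat "$1"; } > "$2"
}
```

The resulting file can be inspected with od before it is copied to
/sys/firmware/efi/efivars with a single write, as the document's dd invocation
does.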

2019-04-23 16:35:33

by Changbin Du

Subject: [PATCH v4 24/63] Documentation: ACPI: move video_extension.txt to firmware-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/firmware-guide/acpi/index.rst | 1 +
.../acpi/video_extension.rst} | 63 ++++++++++---------
2 files changed, 36 insertions(+), 28 deletions(-)
rename Documentation/{acpi/video_extension.txt => firmware-guide/acpi/video_extension.rst} (79%)

diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
index 0e60f4b7129a..ae609eec4679 100644
--- a/Documentation/firmware-guide/acpi/index.rst
+++ b/Documentation/firmware-guide/acpi/index.rst
@@ -23,3 +23,4 @@ ACPI Support
i2c-muxes
acpi-lid
lpit
+ video_extension
diff --git a/Documentation/acpi/video_extension.txt b/Documentation/firmware-guide/acpi/video_extension.rst
similarity index 79%
rename from Documentation/acpi/video_extension.txt
rename to Documentation/firmware-guide/acpi/video_extension.rst
index 79bf6a4921be..06f7e3230b6e 100644
--- a/Documentation/acpi/video_extension.txt
+++ b/Documentation/firmware-guide/acpi/video_extension.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
ACPI video extensions
-~~~~~~~~~~~~~~~~~~~~~
+=====================

This driver implement the ACPI Extensions For Display Adapters for
integrated graphics devices on motherboard, as specified in ACPI 2.0
@@ -8,9 +11,10 @@ defining the video POST device, retrieving EDID information or to
setup a video output, etc. Note that this is an ref. implementation
only. It may or may not work for your integrated video device.

-The ACPI video driver does 3 things regarding backlight control:
+The ACPI video driver does 3 things regarding backlight control.

-1 Export a sysfs interface for user space to control backlight level
+1. Export a sysfs interface for user space to control backlight level
+=====================================================================

If the ACPI table has a video device, and acpi_backlight=vendor kernel
command line is not present, the driver will register a backlight device
@@ -32,26 +36,26 @@ type: firmware

Note that ACPI video backlight driver will always use index for
brightness, actual_brightness and max_brightness. So if we have
-the following _BCL package:
+the following _BCL package::

-Method (_BCL, 0, NotSerialized)
-{
- Return (Package (0x0C)
+ Method (_BCL, 0, NotSerialized)
{
- 0x64,
- 0x32,
- 0x0A,
- 0x14,
- 0x1E,
- 0x28,
- 0x32,
- 0x3C,
- 0x46,
- 0x50,
- 0x5A,
- 0x64
- })
-}
+ Return (Package (0x0C)
+ {
+ 0x64,
+ 0x32,
+ 0x0A,
+ 0x14,
+ 0x1E,
+ 0x28,
+ 0x32,
+ 0x3C,
+ 0x46,
+ 0x50,
+ 0x5A,
+ 0x64
+ })
+ }

The first two levels are for when laptop are on AC or on battery and are
not used by Linux currently. The remaining 10 levels are supported levels
@@ -62,13 +66,15 @@ as a "brightness level" indicator. Thus from the user space perspective
the range of available brightness levels is from 0 to 9 (max_brightness)
inclusive.

-2 Notify user space about hotkey event
+2. Notify user space about hotkey event
+=======================================

There are generally two cases for hotkey event reporting:
+
i) For some laptops, when user presses the hotkey, a scancode will be
generated and sent to user space through the input device created by
the keyboard driver as a key type input event, with proper remap, the
- following key code will appear to user space:
+ following key code will appear to user space::

EV_KEY, KEY_BRIGHTNESSUP
EV_KEY, KEY_BRIGHTNESSDOWN
@@ -82,7 +88,7 @@ ii) For some laptops, the press of the hotkey will not generate the
about the event. The event value is defined in the ACPI spec. ACPI
video driver will generate an key type input event according to the
notify value it received and send the event to user space through the
- input device it created:
+ input device it created::

event keycode
0x86 KEY_BRIGHTNESSUP
@@ -94,13 +100,14 @@ so this would lead to the same effect as case i) now.
Once user space tool receives this event, it can modify the backlight
level through the sysfs interface.

-3 Change backlight level in the kernel
+3. Change backlight level in the kernel
+=======================================

This works for machines covered by case ii) in Section 2. Once the driver
received a notification, it will set the backlight level accordingly. This does
not affect the sending of event to user space, they are always sent to user
space regardless of whether or not the video module controls the backlight level
directly. This behaviour can be controlled through the brightness_switch_enabled
-module parameter as documented in admin-guide/kernel-parameters.rst. It is recommended to
-disable this behaviour once a GUI environment starts up and wants to have full
-control of the backlight level.
+module parameter as documented in admin-guide/kernel-parameters.rst. It is
+recommended to disable this behaviour once a GUI environment starts up and
+wants to have full control of the backlight level.
--
2.20.1
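
The index-versus-level distinction in video_extension.rst is easy to get wrong,
so here is a small worked sketch using the example _BCL package from the
document. It skips the first two entries (the AC and battery levels) and maps a
sysfs brightness index onto the remaining ten levels. The function name is
hypothetical and the package values are hard-coded from the example, not read
from firmware.

```shell
# Sketch: map a backlight index (0..9) to the level in the example _BCL package.
# bcl_level_for_index is a hypothetical helper for illustration only.
bcl_level_for_index() {
    # $1: brightness index as seen by user space
    set -- "$1" 0x64 0x32 0x0A 0x14 0x1E 0x28 0x32 0x3C 0x46 0x50 0x5A 0x64
    idx=$1
    shift 3     # drop the index argument and the two AC/battery entries
    # Reject anything outside the ten supported levels.
    [ "$idx" -ge 0 ] 2>/dev/null && [ "$idx" -lt "$#" ] || return 1
    shift "$idx"
    echo $(( $1 ))   # print the selected level in decimal
}
```

With the example package, index 0 selects 0x0A and index 9 (max_brightness)
selects 0x64, which matches the 0 to 9 range the document describes.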

2019-04-23 16:35:43

by Changbin Du

Subject: [PATCH v4 26/63] Documentation: PCI: convert pci.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/index.rst | 2 +
Documentation/PCI/{pci.txt => pci.rst} | 267 +++++++++++++------------
2 files changed, 140 insertions(+), 129 deletions(-)
rename Documentation/PCI/{pci.txt => pci.rst} (78%)

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index c2f8728d11cf..7babf43709b0 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -7,3 +7,5 @@ Linux PCI Bus Subsystem
.. toctree::
:maxdepth: 2
:numbered:
+
+ pci
diff --git a/Documentation/PCI/pci.txt b/Documentation/PCI/pci.rst
similarity index 78%
rename from Documentation/PCI/pci.txt
rename to Documentation/PCI/pci.rst
index badb26ac33dc..29ddd2e9177a 100644
--- a/Documentation/PCI/pci.txt
+++ b/Documentation/PCI/pci.rst
@@ -1,10 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0

- How To Write Linux PCI Drivers
+==============================
+How To Write Linux PCI Drivers
+==============================

- by Martin Mares <[email protected]> on 07-Feb-2000
- updated by Grant Grundler <[email protected]> on 23-Dec-2006
+:Authors: - Martin Mares <[email protected]>
+ - Grant Grundler <[email protected]>

-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The world of PCI is vast and full of (mostly unpleasant) surprises.
Since each CPU architecture implements different chip-sets and PCI devices
have different requirements (erm, "features"), the result is the PCI support
@@ -26,8 +28,8 @@ Please send questions/comments/patches about Linux PCI API to the



-0. Structure of PCI drivers
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Structure of PCI drivers
+========================
PCI drivers "discover" PCI devices in a system via pci_register_driver().
Actually, it's the other way around. When the PCI generic code discovers
a new device, the driver with a matching "description" will be notified.
@@ -42,24 +44,25 @@ pointers and thus dictates the high level structure of a driver.
Once the driver knows about a PCI device and takes ownership, the
driver generally needs to perform the following initialization:

- Enable the device
- Request MMIO/IOP resources
- Set the DMA mask size (for both coherent and streaming DMA)
- Allocate and initialize shared control data (pci_allocate_coherent())
- Access device configuration space (if needed)
- Register IRQ handler (request_irq())
- Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
- Enable DMA/processing engines
+ - Enable the device
+ - Request MMIO/IOP resources
+ - Set the DMA mask size (for both coherent and streaming DMA)
+ - Allocate and initialize shared control data (pci_allocate_coherent())
+ - Access device configuration space (if needed)
+ - Register IRQ handler (request_irq())
+ - Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
+ - Enable DMA/processing engines

When done using the device, and perhaps the module needs to be unloaded,
the driver needs to take the follow steps:
- Disable the device from generating IRQs
- Release the IRQ (free_irq())
- Stop all DMA activity
- Release DMA buffers (both streaming and coherent)
- Unregister from other subsystems (e.g. scsi or netdev)
- Release MMIO/IOP resources
- Disable the device
+
+ - Disable the device from generating IRQs
+ - Release the IRQ (free_irq())
+ - Stop all DMA activity
+ - Release DMA buffers (both streaming and coherent)
+ - Unregister from other subsystems (e.g. scsi or netdev)
+ - Release MMIO/IOP resources
+ - Disable the device

Most of these topics are covered in the following sections.
For the rest look at LDD3 or <linux/pci.h> .
@@ -70,13 +73,12 @@ completely empty or just returning an appropriate error codes to avoid
lots of ifdefs in the drivers.


-
-1. pci_register_driver() call
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+pci_register_driver() call
+==========================

PCI device drivers call pci_register_driver() during their
initialization with a pointer to a structure describing the driver
-(struct pci_driver):
+(struct pci_driver)::

field name Description
---------- ------------------------------------------------------
@@ -125,7 +127,7 @@ initialization with a pointer to a structure describing the driver
The ID table is an array of struct pci_device_id entries ending with an
all-zero entry. Definitions with static const are generally preferred.

-Each entry consists of:
+Each entry consists of::

vendor,device Vendor and device ID to match (or PCI_ANY_ID)

@@ -160,9 +162,10 @@ echo "vendor device subvendor subdevice class class_mask driver_data" > \
All fields are passed in as hexadecimal values (no leading 0x).
The vendor and device fields are mandatory, the others are optional. Users
need pass only as many optional fields as necessary:
- o subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
- o class and classmask fields default to 0
- o driver_data defaults to 0UL.
+
+ - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
+ - class and classmask fields default to 0
+ - driver_data defaults to 0UL.

Note that driver_data must match the value used by any of the pci_device_id
entries defined in the driver. This makes the driver_data field mandatory
@@ -175,29 +178,30 @@ When the driver exits, it just calls pci_unregister_driver() and the PCI layer
automatically calls the remove hook for all devices handled by the driver.


-1.1 "Attributes" for driver functions/data
+"Attributes" for driver functions/data
+--------------------------------------

Please mark the initialization and cleanup functions where appropriate
-(the corresponding macros are defined in <linux/init.h>):
+(the corresponding macros are defined in <linux/init.h>)::

__init Initialization code. Thrown away after the driver
initializes.
__exit Exit code. Ignored for non-modular drivers.

Tips on when/where to use the above attributes:
- o The module_init()/module_exit() functions (and all
+ - The module_init()/module_exit() functions (and all
initialization functions called _only_ from these)
should be marked __init/__exit.

- o Do not mark the struct pci_driver.
+ - Do not mark the struct pci_driver.

- o Do NOT mark a function if you are not sure which mark to use.
+ - Do NOT mark a function if you are not sure which mark to use.
Better to not mark the function than mark the function wrong.



-2. How to find PCI devices manually
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+How to find PCI devices manually
+================================

PCI drivers should have a really good reason for not using the
pci_register_driver() interface to search for PCI devices.
@@ -207,17 +211,17 @@ E.g. combined serial/parallel port/floppy controller.

A manual search may be performed using the following constructs:

-Searching by vendor and device ID:
+Searching by vendor and device ID::

struct pci_dev *dev = NULL;
while (dev = pci_get_device(VENDOR_ID, DEVICE_ID, dev))
configure_device(dev);

-Searching by class ID (iterate in a similar way):
+Searching by class ID (iterate in a similar way)::

pci_get_class(CLASS_ID, dev)

-Searching by both vendor/device and subsystem vendor/device ID:
+Searching by both vendor/device and subsystem vendor/device ID::

pci_get_subsys(VENDOR_ID,DEVICE_ID, SUBSYS_VENDOR_ID, SUBSYS_DEVICE_ID, dev).

@@ -231,20 +235,20 @@ decrement the reference count on these devices by calling pci_dev_put().



-3. Device Initialization Steps
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Device Initialization Steps
+===========================

As noted in the introduction, most PCI drivers need the following steps
for device initialization:

- Enable the device
- Request MMIO/IOP resources
- Set the DMA mask size (for both coherent and streaming DMA)
- Allocate and initialize shared control data (pci_allocate_coherent())
- Access device configuration space (if needed)
- Register IRQ handler (request_irq())
- Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
- Enable DMA/processing engines.
+ - Enable the device
+ - Request MMIO/IOP resources
+ - Set the DMA mask size (for both coherent and streaming DMA)
+ - Allocate and initialize shared control data (pci_allocate_coherent())
+ - Access device configuration space (if needed)
+ - Register IRQ handler (request_irq())
+ - Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
+ - Enable DMA/processing engines.

The driver can access PCI config space registers at any time.
(Well, almost. When running BIST, config space can go away...but
@@ -252,17 +256,18 @@ that will just result in a PCI Bus Master Abort and config reads
will return garbage).


-3.1 Enable the PCI device
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Enable the PCI device
+---------------------
Before touching any device registers, the driver needs to enable
the PCI device by calling pci_enable_device(). This will:
- o wake up the device if it was in suspended state,
- o allocate I/O and memory regions of the device (if BIOS did not),
- o allocate an IRQ (if BIOS did not).

-NOTE: pci_enable_device() can fail! Check the return value.
+ - wake up the device if it was in suspended state,
+ - allocate I/O and memory regions of the device (if BIOS did not),
+ - allocate an IRQ (if BIOS did not).
+
+.. note:: pci_enable_device() can fail! Check the return value.

-[ OS BUG: we don't check resource allocations before enabling those
+.. warning:: OS BUG: we don't check resource allocations before enabling those
resources. The sequence would make more sense if we called
pci_request_resources() before calling pci_enable_device().
Currently, the device drivers can't detect the bug when when two
@@ -271,7 +276,7 @@ NOTE: pci_enable_device() can fail! Check the return value.

This has been discussed before but not changed as of 2.6.19:
http://lkml.org/lkml/2006/3/2/194
-]
+

pci_set_master() will enable DMA by setting the bus master bit
in the PCI_COMMAND register. It also fixes the latency timer value if
@@ -288,8 +293,8 @@ pci_try_set_mwi() to have the system do its best effort at enabling
Mem-Wr-Inval.


-3.2 Request MMIO/IOP resources
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Request MMIO/IOP resources
+--------------------------
Memory (MMIO), and I/O port addresses should NOT be read directly
from the PCI device config space. Use the values in the pci_dev structure
as the PCI "bus address" might have been remapped to a "host physical"
@@ -304,9 +309,9 @@ Conversely, drivers should call pci_release_region() AFTER
calling pci_disable_device().
The idea is to prevent two devices colliding on the same address range.

-[ See OS BUG comment above. Currently (2.6.19), The driver can only
+.. tip:: See OS BUG comment above. Currently (2.6.19), the driver can only
determine MMIO and IO Port resource availability _after_ calling
- pci_enable_device(). ]
+ pci_enable_device().

Generic flavors of pci_request_region() are request_mem_region()
(for MMIO ranges) and request_region() (for IO Port ranges).
@@ -316,12 +321,12 @@ BARs.
Also see pci_request_selected_regions() below.


-3.3 Set the DMA mask size
-~~~~~~~~~~~~~~~~~~~~~~~~~
-[ If anything below doesn't make sense, please refer to
+Set the DMA mask size
+---------------------
+.. note:: If anything below doesn't make sense, please refer to
Documentation/DMA-API.txt. This section is just a reminder that
drivers need to indicate DMA capabilities of the device and is not
- an authoritative source for DMA interfaces. ]
+ an authoritative source for DMA interfaces.

While all drivers should explicitly indicate the DMA capability
(e.g. 32 or 64 bit) of the PCI bus master, devices with more than
@@ -342,23 +347,23 @@ Many 64-bit "PCI" devices (before PCI-X) and some PCI-X devices are
("consistent") data.


-3.4 Setup shared control data
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Setup shared control data
+-------------------------
Once the DMA masks are set, the driver can allocate "consistent" (a.k.a. shared)
memory. See Documentation/DMA-API.txt for a full description of
the DMA APIs. This section is just a reminder that it needs to be done
before enabling DMA on the device.


-3.5 Initialize device registers
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Initialize device registers
+---------------------------
Some drivers will need specific "capability" fields programmed
or other "vendor specific" register initialized or reset.
E.g. clearing pending interrupts.


-3.6 Register IRQ handler
-~~~~~~~~~~~~~~~~~~~~~~~~
+Register IRQ handler
+--------------------
While calling request_irq() is the last step described here,
this is often just another intermediate step to initialize a device.
This step can often be deferred until the device is opened for use.
@@ -396,6 +401,7 @@ and msix_enabled flags in the pci_dev structure after calling
pci_alloc_irq_vectors.

There are (at least) two really good reasons for using MSI:
+
1) MSI is an exclusive interrupt vector by definition.
This means the interrupt handler doesn't have to verify
its device caused the interrupt.
@@ -411,23 +417,23 @@ of MSI/MSI-X usage.



-4. PCI device shutdown
-~~~~~~~~~~~~~~~~~~~~~~~
+PCI device shutdown
+===================

When a PCI device driver is being unloaded, most of the following
steps need to be performed:

- Disable the device from generating IRQs
- Release the IRQ (free_irq())
- Stop all DMA activity
- Release DMA buffers (both streaming and consistent)
- Unregister from other subsystems (e.g. scsi or netdev)
- Disable device from responding to MMIO/IO Port addresses
- Release MMIO/IO Port resource(s)
+ - Disable the device from generating IRQs
+ - Release the IRQ (free_irq())
+ - Stop all DMA activity
+ - Release DMA buffers (both streaming and consistent)
+ - Unregister from other subsystems (e.g. scsi or netdev)
+ - Disable device from responding to MMIO/IO Port addresses
+ - Release MMIO/IO Port resource(s)


-4.1 Stop IRQs on the device
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Stop IRQs on the device
+-----------------------
How to do this is chip/device specific. If it's not done, it opens
the possibility of a "screaming interrupt" if (and only if)
the IRQ is shared with another device.
@@ -446,16 +452,16 @@ MSI and MSI-X are defined to be exclusive interrupts and thus
are not susceptible to the "screaming interrupt" problem.


-4.2 Release the IRQ
-~~~~~~~~~~~~~~~~~~~
+Release the IRQ
+---------------
Once the device is quiesced (no more IRQs), one can call free_irq().
This function will return control once any pending IRQs are handled,
"unhook" the drivers IRQ handler from that IRQ, and finally release
the IRQ if no one else is using it.


-4.3 Stop all DMA activity
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Stop all DMA activity
+---------------------
It's extremely important to stop all DMA operations BEFORE attempting
to deallocate DMA control data. Failure to do so can result in memory
corruption, hangs, and on some chip-sets a hard crash.
@@ -467,8 +473,8 @@ While this step sounds obvious and trivial, several "mature" drivers
didn't get this step right in the past.


-4.4 Release DMA buffers
-~~~~~~~~~~~~~~~~~~~~~~~
+Release DMA buffers
+-------------------
Once DMA is stopped, clean up streaming DMA first.
I.e. unmap data buffers and return buffers to "upstream"
owners if there is one.
@@ -478,8 +484,8 @@ Then clean up "consistent" buffers which contain the control data.
See Documentation/DMA-API.txt for details on unmapping interfaces.


-4.5 Unregister from other subsystems
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Unregister from other subsystems
+--------------------------------
Most low level PCI device drivers support some other subsystem
like USB, ALSA, SCSI, NetDev, Infiniband, etc. Make sure your
driver isn't losing resources from that other subsystem.
@@ -487,31 +493,31 @@ If this happens, typically the symptom is an Oops (panic) when
the subsystem attempts to call into a driver that has been unloaded.


-4.6 Disable Device from responding to MMIO/IO Port addresses
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Disable Device from responding to MMIO/IO Port addresses
+--------------------------------------------------------
io_unmap() MMIO or IO Port resources and then call pci_disable_device().
This is the symmetric opposite of pci_enable_device().
Do not access device registers after calling pci_disable_device().


-4.7 Release MMIO/IO Port Resource(s)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Release MMIO/IO Port Resource(s)
+--------------------------------
Call pci_release_region() to mark the MMIO or IO Port range as available.
Failure to do so usually results in the inability to reload the driver.



-5. How to access PCI config space
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+How to access PCI config space
+==============================

-You can use pci_(read|write)_config_(byte|word|dword) to access the config
-space of a device represented by struct pci_dev *. All these functions return 0
-when successful or an error code (PCIBIOS_...) which can be translated to a text
-string by pcibios_strerror. Most drivers expect that accesses to valid PCI
+You can use `pci_(read|write)_config_(byte|word|dword)` to access the config
+space of a device represented by `struct pci_dev *`. All these functions return
+0 when successful or an error code (`PCIBIOS_...`) which can be translated to a
+text string by pcibios_strerror. Most drivers expect that accesses to valid PCI
devices don't fail.

If you don't have a struct pci_dev available, you can call
-pci_bus_(read|write)_config_(byte|word|dword) to access a given device
+`pci_bus_(read|write)_config_(byte|word|dword)` to access a given device
and function on that bus.

If you access fields in the standard portion of the config header, please
@@ -522,28 +528,29 @@ pci_find_capability() for the particular capability and it will find the
corresponding register block for you.


+Other interesting functions
+===========================

-6. Other interesting functions
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+::

-pci_get_domain_bus_and_slot() Find pci_dev corresponding to given domain,
- bus and slot and number. If the device is
- found, its reference count is increased.
-pci_set_power_state() Set PCI Power Management state (0=D0 ... 3=D3)
-pci_find_capability() Find specified capability in device's capability
+ pci_get_domain_bus_and_slot() Find pci_dev corresponding to given domain,
+ bus and slot and number. If the device is
+ found, its reference count is increased.
+ pci_set_power_state() Set PCI Power Management state (0=D0 ... 3=D3)
+ pci_find_capability() Find specified capability in device's capability
list.
-pci_resource_start() Returns bus start address for a given PCI region
-pci_resource_end() Returns bus end address for a given PCI region
-pci_resource_len() Returns the byte length of a PCI region
-pci_set_drvdata() Set private driver data pointer for a pci_dev
-pci_get_drvdata() Return private driver data pointer for a pci_dev
-pci_set_mwi() Enable Memory-Write-Invalidate transactions.
-pci_clear_mwi() Disable Memory-Write-Invalidate transactions.
+ pci_resource_start() Returns bus start address for a given PCI region
+ pci_resource_end() Returns bus end address for a given PCI region
+ pci_resource_len() Returns the byte length of a PCI region
+ pci_set_drvdata() Set private driver data pointer for a pci_dev
+ pci_get_drvdata() Return private driver data pointer for a pci_dev
+ pci_set_mwi() Enable Memory-Write-Invalidate transactions.
+ pci_clear_mwi() Disable Memory-Write-Invalidate transactions.



-7. Miscellaneous hints
-~~~~~~~~~~~~~~~~~~~~~~
+Miscellaneous hints
+===================

When displaying PCI device names to the user (for example when a driver wants
to tell the user what card has it found), please use pci_name(pci_dev).
@@ -560,8 +567,8 @@ to be handled by platform and generic code, not individual drivers.



-8. Vendor and device identifications
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Vendor and device identifications
+=================================

Do not add new device or vendor IDs to include/linux/pci_ids.h unless they
are shared across multiple drivers. You can add private definitions in
@@ -576,18 +583,20 @@ and https://github.com/pciutils/pciids.



-9. Obsolete functions
-~~~~~~~~~~~~~~~~~~~~~
+Obsolete functions
+==================

There are several functions which you might come across when trying to
port an old driver to the new PCI interface. They are no longer present
in the kernel as they aren't compatible with hotplug or PCI domains or
having sane locking.

-pci_find_device() Superseded by pci_get_device()
-pci_find_subsys() Superseded by pci_get_subsys()
-pci_find_slot() Superseded by pci_get_domain_bus_and_slot()
-pci_get_slot() Superseded by pci_get_domain_bus_and_slot()
+::
+
+ pci_find_device() Superseded by pci_get_device()
+ pci_find_subsys() Superseded by pci_get_subsys()
+ pci_find_slot() Superseded by pci_get_domain_bus_and_slot()
+ pci_get_slot() Superseded by pci_get_domain_bus_and_slot()


The alternative is the traditional PCI device driver that walks PCI
@@ -595,8 +604,8 @@ device lists. This is still possible but discouraged.



-10. MMIO Space and "Write Posting"
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+MMIO Space and "Write Posting"
+==============================

Converting a driver from using I/O Port space to using MMIO space
often requires some additional changes. Specifically, "write posting"
@@ -609,14 +618,14 @@ the CPU before the transaction has reached its destination.

Thus, timing sensitive code should add readl() where the CPU is
expected to wait before doing other work. The classic "bit banging"
-sequence works fine for I/O Port space:
+sequence works fine for I/O Port space::

for (i = 8; --i; val >>= 1) {
outb(val & 1, ioport_reg); /* write bit */
udelay(10);
}

-The same sequence for MMIO space should be:
+The same sequence for MMIO space should be::

for (i = 8; --i; val >>= 1) {
writeb(val & 1, mmio_reg); /* write bit */
--
2.20.1
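
A note on the ID table rules quoted in pci.rst above (an array of struct pci_device_id entries ending with an all-zero entry, PCI_ANY_ID as a wildcard, class compared under class_mask): the matching can be illustrated with a small userspace sketch. Everything prefixed mock_ or MOCK_ is a hypothetical stand-in written for this note, not the kernel's definition, and the field name class_code is used here in place of the kernel's class field. The real matching lives in the PCI core and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PCI_ANY_ID 0xffffffffu /* wildcard, mirrors PCI_ANY_ID (FFFFFFFF) */

/* Hypothetical userspace mirror of one ID table entry. */
struct mock_pci_device_id {
	uint32_t vendor, device;       /* vendor and device ID, or wildcard */
	uint32_t subvendor, subdevice; /* subsystem IDs, or wildcard */
	uint32_t class_code;           /* class value to match */
	uint32_t class_mask;           /* which class bits are compared */
	unsigned long driver_data;     /* private data for the driver */
};

/* Hypothetical userspace mirror of the identity fields of a device. */
struct mock_pci_dev {
	uint32_t vendor, device;
	uint32_t subsystem_vendor, subsystem_device;
	uint32_t class_code;
};

/* An entry matches when every field is a wildcard or equal, and the
 * class bits selected by class_mask agree. */
static int mock_match_one(const struct mock_pci_device_id *id,
			  const struct mock_pci_dev *dev)
{
	return (id->vendor == MOCK_PCI_ANY_ID || id->vendor == dev->vendor) &&
	       (id->device == MOCK_PCI_ANY_ID || id->device == dev->device) &&
	       (id->subvendor == MOCK_PCI_ANY_ID ||
		id->subvendor == dev->subsystem_vendor) &&
	       (id->subdevice == MOCK_PCI_ANY_ID ||
		id->subdevice == dev->subsystem_device) &&
	       !((id->class_code ^ dev->class_code) & id->class_mask);
}

/* Walk the table until the all-zero terminator; return the first
 * matching entry or NULL. */
static const struct mock_pci_device_id *
mock_match_id(const struct mock_pci_device_id *ids,
	      const struct mock_pci_dev *dev)
{
	assert(ids != NULL && dev != NULL);
	for (; ids->vendor || ids->subvendor || ids->class_mask; ids++)
		if (mock_match_one(ids, dev))
			return ids;
	return NULL;
}
```

With class_mask left at 0 the class comparison always succeeds, which is why the documented defaults for dynamic IDs (class and class_mask both 0) match a device of any class.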

2019-04-23 16:36:02

by Changbin Du

Subject: [PATCH v4 28/63] Documentation: PCI: convert pci-iov-howto.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/index.rst | 1 +
.../{pci-iov-howto.txt => pci-iov-howto.rst} | 161 ++++++++++--------
2 files changed, 94 insertions(+), 68 deletions(-)
rename Documentation/PCI/{pci-iov-howto.txt => pci-iov-howto.rst} (63%)

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index 452723318405..e1c19962a7f8 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -10,3 +10,4 @@ Linux PCI Bus Subsystem

pci
PCIEBUS-HOWTO
+ pci-iov-howto
diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.rst
similarity index 63%
rename from Documentation/PCI/pci-iov-howto.txt
rename to Documentation/PCI/pci-iov-howto.rst
index d2a84151e99c..b9fd003206f1 100644
--- a/Documentation/PCI/pci-iov-howto.txt
+++ b/Documentation/PCI/pci-iov-howto.rst
@@ -1,14 +1,19 @@
- PCI Express I/O Virtualization Howto
- Copyright (C) 2009 Intel Corporation
- Yu Zhao <[email protected]>
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

- Update: November 2012
- -- sysfs-based SRIOV enable-/disable-ment
- Donald Dutile <[email protected]>
+====================================
+PCI Express I/O Virtualization Howto
+====================================

-1. Overview
+:Copyright: |copy| 2009 Intel Corporation
+:Authors: - Yu Zhao <[email protected]>
+ - Donald Dutile <[email protected]>

-1.1 What is SR-IOV
+Overview
+========
+
+What is SR-IOV
+--------------

Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
capability which makes one physical device appear as multiple virtual
@@ -23,9 +28,11 @@ Memory Space, which is used to map its register set. VF device driver
operates on the register set so it can be functional and appear as a
real existing PCI device.

-2. User Guide
+User Guide
+==========

-2.1 How can I enable SR-IOV capability
+How can I enable SR-IOV capability
+----------------------------------

Multiple methods are available for SR-IOV enablement.
In the first method, the device driver (PF driver) will control the
@@ -43,105 +50,123 @@ checks, e.g., check numvfs == 0 if enabling VFs, ensure
numvfs <= totalvfs.
The second method is the recommended method for new/future VF devices.

-2.2 How can I use the Virtual Functions
+How can I use the Virtual Functions
+-----------------------------------

The VF is treated as hot-plugged PCI devices in the kernel, so they
should be able to work in the same way as real PCI devices. The VF
requires device driver that is same as a normal PCI device's.

-3. Developer Guide
+Developer Guide
+===============

-3.1 SR-IOV API
+SR-IOV API
+----------

To enable SR-IOV capability:
-(a) For the first method, in the driver:
+
+(a) For the first method, in the driver::
+
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
- 'nr_virtfn' is number of VFs to be enabled.
-(b) For the second method, from sysfs:
+
+'nr_virtfn' is the number of VFs to be enabled.
+
+(b) For the second method, from sysfs::
+
echo 'nr_virtfn' > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_numvfs

To disable SR-IOV capability:
-(a) For the first method, in the driver:
+
+(a) For the first method, in the driver::
+
void pci_disable_sriov(struct pci_dev *dev);
-(b) For the second method, from sysfs:
+
+(b) For the second method, from sysfs::
+
echo 0 > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_numvfs

To enable auto probing VFs by a compatible driver on the host, run
command below before enabling SR-IOV capabilities. This is the
default behavior.
+::
+
echo 1 > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe

To disable auto probing VFs by a compatible driver on the host, run
command below before enabling SR-IOV capabilities. Updating this
entry will not affect VFs which are already probed.
+::
+
echo 0 > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe

-3.2 Usage example
+Usage example
+-------------

Following piece of code illustrates the usage of the SR-IOV API.
+::

-static int dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
-{
- pci_enable_sriov(dev, NR_VIRTFN);
+ static int dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
+ {
+ pci_enable_sriov(dev, NR_VIRTFN);

- ...
-
- return 0;
-}
+ ...

-static void dev_remove(struct pci_dev *dev)
-{
- pci_disable_sriov(dev);
+ return 0;
+ }

- ...
-}
+ static void dev_remove(struct pci_dev *dev)
+ {
+ pci_disable_sriov(dev);

-static int dev_suspend(struct pci_dev *dev, pm_message_t state)
-{
- ...
+ ...
+ }

- return 0;
-}
+ static int dev_suspend(struct pci_dev *dev, pm_message_t state)
+ {
+ ...

-static int dev_resume(struct pci_dev *dev)
-{
- ...
+ return 0;
+ }

- return 0;
-}
+ static int dev_resume(struct pci_dev *dev)
+ {
+ ...

-static void dev_shutdown(struct pci_dev *dev)
-{
- ...
-}
+ return 0;
+ }

-static int dev_sriov_configure(struct pci_dev *dev, int numvfs)
-{
- if (numvfs > 0) {
- ...
- pci_enable_sriov(dev, numvfs);
+ static void dev_shutdown(struct pci_dev *dev)
+ {
...
- return numvfs;
}
- if (numvfs == 0) {
- ....
- pci_disable_sriov(dev);
- ...
- return 0;
+
+ static int dev_sriov_configure(struct pci_dev *dev, int numvfs)
+ {
+ if (numvfs > 0) {
+ ...
+ pci_enable_sriov(dev, numvfs);
+ ...
+ return numvfs;
+ }
+ if (numvfs == 0) {
+ ....
+ pci_disable_sriov(dev);
+ ...
+ return 0;
+ }
}
-}
-
-static struct pci_driver dev_driver = {
- .name = "SR-IOV Physical Function driver",
- .id_table = dev_id_table,
- .probe = dev_probe,
- .remove = dev_remove,
- .suspend = dev_suspend,
- .resume = dev_resume,
- .shutdown = dev_shutdown,
- .sriov_configure = dev_sriov_configure,
-};
+
+ static struct pci_driver dev_driver = {
+ .name = "SR-IOV Physical Function driver",
+ .id_table = dev_id_table,
+ .probe = dev_probe,
+ .remove = dev_remove,
+ .suspend = dev_suspend,
+ .resume = dev_resume,
+ .shutdown = dev_shutdown,
+ .sriov_configure = dev_sriov_configure,
+ };
--
2.20.1
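
A note on the usage example in pci-iov-howto.rst above: the dev_sriov_configure() skeleton does not check the result of pci_enable_sriov() and has no return path for a negative numvfs. Below is a hedged userspace sketch of a variant that propagates failures. The mock_ functions and struct mock_pf_dev are hypothetical stand-ins so that the logic compiles outside the kernel. The sketch assumes, as the int return type quoted in the howto suggests, that pci_enable_sriov() reports failure with a negative error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the PF's struct pci_dev. */
struct mock_pf_dev {
	int total_vfs;   /* VFs the device can provide */
	int enabled_vfs; /* VFs currently enabled */
};

/* Stand-in for pci_enable_sriov(): 0 on success, negative errno on failure. */
static int mock_enable_sriov(struct mock_pf_dev *dev, int nr_virtfn)
{
	if (nr_virtfn > dev->total_vfs)
		return -EINVAL;
	dev->enabled_vfs = nr_virtfn;
	return 0;
}

/* Stand-in for pci_disable_sriov(). */
static void mock_disable_sriov(struct mock_pf_dev *dev)
{
	dev->enabled_vfs = 0;
}

/* Returns the number of VFs enabled, 0 after disabling, or a negative
 * error code when the request cannot be honoured. */
static int dev_sriov_configure(struct mock_pf_dev *dev, int numvfs)
{
	int err;

	assert(dev != NULL);
	if (numvfs < 0)
		return -EINVAL;
	if (numvfs == 0) {
		mock_disable_sriov(dev);
		return 0;
	}
	err = mock_enable_sriov(dev, numvfs);
	if (err)
		return err; /* pass the failure back to the caller */
	return numvfs;
}
```

Returning the error rather than numvfs lets the sysfs writer see that the request failed, so it does not assume the VFs exist.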

2019-04-23 16:36:23

by Changbin Du

Subject: [PATCH v4 21/63] Documentation: ACPI: move cppc_sysfs.txt to admin-guide/acpi and convert to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../acpi/cppc_sysfs.rst} | 71 ++++++++++---------
Documentation/admin-guide/acpi/index.rst | 1 +
2 files changed, 40 insertions(+), 32 deletions(-)
rename Documentation/{acpi/cppc_sysfs.txt => admin-guide/acpi/cppc_sysfs.rst} (51%)

diff --git a/Documentation/acpi/cppc_sysfs.txt b/Documentation/admin-guide/acpi/cppc_sysfs.rst
similarity index 51%
rename from Documentation/acpi/cppc_sysfs.txt
rename to Documentation/admin-guide/acpi/cppc_sysfs.rst
index f20fb445135d..a4b99afbe331 100644
--- a/Documentation/acpi/cppc_sysfs.txt
+++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst
@@ -1,5 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0

- Collaborative Processor Performance Control (CPPC)
+==================================================
+Collaborative Processor Performance Control (CPPC)
+==================================================
+
+CPPC
+====

CPPC defined in the ACPI spec describes a mechanism for the OS to manage the
performance of a logical processor on a contigious and abstract performance
@@ -10,31 +16,28 @@ For more details on CPPC please refer to the ACPI specification at:

http://uefi.org/specifications

-Some of the CPPC registers are exposed via sysfs under:
-
-/sys/devices/system/cpu/cpuX/acpi_cppc/
-
-for each cpu X
+Some of the CPPC registers are exposed via sysfs under::

---------------------------------------------------------------------------------
+ /sys/devices/system/cpu/cpuX/acpi_cppc/

-$ ls -lR /sys/devices/system/cpu/cpu0/acpi_cppc/
-/sys/devices/system/cpu/cpu0/acpi_cppc/:
-total 0
--r--r--r-- 1 root root 65536 Mar 5 19:38 feedback_ctrs
--r--r--r-- 1 root root 65536 Mar 5 19:38 highest_perf
--r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_freq
--r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_nonlinear_perf
--r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_perf
--r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_freq
--r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_perf
--r--r--r-- 1 root root 65536 Mar 5 19:38 reference_perf
--r--r--r-- 1 root root 65536 Mar 5 19:38 wraparound_time
+for each cpu X::

---------------------------------------------------------------------------------
+ $ ls -lR /sys/devices/system/cpu/cpu0/acpi_cppc/
+ /sys/devices/system/cpu/cpu0/acpi_cppc/:
+ total 0
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 feedback_ctrs
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 highest_perf
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_freq
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_nonlinear_perf
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_perf
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_freq
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_perf
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 reference_perf
+ -r--r--r-- 1 root root 65536 Mar 5 19:38 wraparound_time

* highest_perf : Highest performance of this processor (abstract scale).
-* nominal_perf : Highest sustained performance of this processor (abstract scale).
+* nominal_perf : Highest sustained performance of this processor
+ (abstract scale).
* lowest_nonlinear_perf : Lowest performance of this processor with nonlinear
power savings (abstract scale).
* lowest_perf : Lowest performance of this processor (abstract scale).
@@ -48,22 +51,26 @@ total 0
* feedback_ctrs : Includes both Reference and delivered performance counter.
Reference counter ticks up proportional to processor's reference performance.
Delivered counter ticks up proportional to processor's delivered performance.
-* wraparound_time: Minimum time for the feedback counters to wraparound (seconds).
+* wraparound_time: Minimum time for the feedback counters to wrap around
+ (seconds).
* reference_perf : Performance level at which reference performance counter
accumulates (abstract scale).

---------------------------------------------------------------------------------

- Computing Average Delivered Performance
+Computing Average Delivered Performance
+=======================================
+
+The steps below compute the average delivered performance from two
+snapshots of the feedback counters, taken at times T1 and T2.
+
+ T1: Read feedback_ctrs as fbc_t1
+ Wait or run some workload

-Below describes the steps to compute the average performance delivered by taking
-two different snapshots of feedback counters at time T1 and T2.
+ T2: Read feedback_ctrs as fbc_t2

-T1: Read feedback_ctrs as fbc_t1
- Wait or run some workload
-T2: Read feedback_ctrs as fbc_t2
+::

-delivered_counter_delta = fbc_t2[del] - fbc_t1[del]
-reference_counter_delta = fbc_t2[ref] - fbc_t1[ref]
+ delivered_counter_delta = fbc_t2[del] - fbc_t1[del]
+ reference_counter_delta = fbc_t2[ref] - fbc_t1[ref]

-delivered_perf = (refernce_perf x delivered_counter_delta) / reference_counter_delta
+ delivered_perf = (reference_perf x delivered_counter_delta) / reference_counter_delta
diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
index d68e9914c5ff..9049a7b9f065 100644
--- a/Documentation/admin-guide/acpi/index.rst
+++ b/Documentation/admin-guide/acpi/index.rst
@@ -10,3 +10,4 @@ the Linux ACPI support.

initrd_table_override
dsdt-override
+ cppc_sysfs
--
2.20.1
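
A note on the formula at the end of cppc_sysfs.rst above: the delivered-performance calculation can be sketched in C as follows. The struct and function names are hypothetical. The sketch assumes the two feedback_ctrs snapshots have already been parsed into separate reference and delivered counter values, and it makes no attempt to handle counters that wrapped between the snapshots (see wraparound_time).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One parsed reading of feedback_ctrs (hypothetical layout). */
struct cppc_fb_snapshot {
	uint64_t reference; /* reference performance counter */
	uint64_t delivered; /* delivered performance counter */
};

/*
 * delivered_perf =
 *     (reference_perf x delivered_counter_delta) / reference_counter_delta
 *
 * Returns 0 when no reference ticks elapsed between the snapshots; that
 * choice is made for this sketch only, to avoid dividing by zero. The
 * multiplication can overflow for very large deltas; a real tool would
 * need to guard against that.
 */
static uint64_t cppc_delivered_perf(uint64_t reference_perf,
				    const struct cppc_fb_snapshot *t1,
				    const struct cppc_fb_snapshot *t2)
{
	uint64_t delivered_delta, reference_delta;

	assert(t1 != NULL && t2 != NULL);
	delivered_delta = t2->delivered - t1->delivered;
	reference_delta = t2->reference - t1->reference;
	if (reference_delta == 0)
		return 0;
	return (reference_perf * delivered_delta) / reference_delta;
}
```

For example, with reference_perf = 100, a reference delta of 2000 and a delivered delta of 3000, the result is 100 x 3000 / 2000 = 150 on the same abstract scale as the other *_perf values.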

2019-04-23 16:36:32

by Changbin Du

Subject: [PATCH v4 30/63] Documentation: PCI: convert acpi-info.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/{acpi-info.txt => acpi-info.rst} | 11 ++++++++---
Documentation/PCI/index.rst | 1 +
2 files changed, 9 insertions(+), 3 deletions(-)
rename Documentation/PCI/{acpi-info.txt => acpi-info.rst} (97%)

diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.rst
similarity index 97%
rename from Documentation/PCI/acpi-info.txt
rename to Documentation/PCI/acpi-info.rst
index 3ffa3b03970e..f7dabb7ca255 100644
--- a/Documentation/PCI/acpi-info.txt
+++ b/Documentation/PCI/acpi-info.rst
@@ -1,4 +1,8 @@
- ACPI considerations for PCI host bridges
+.. SPDX-License-Identifier: GPL-2.0
+
+========================================
+ACPI considerations for PCI host bridges
+========================================

The general rule is that the ACPI namespace should describe everything the
OS might use unless there's another way for the OS to find it [1, 2].
@@ -135,8 +139,9 @@ address always corresponds to bus 0, even if the bus range below the bridge

Extended Address Space Descriptor (.4)
General Flags: Bit [0] Consumer/Producer:
- 1–This device consumes this resource
- 0–This device produces and consumes this resource
+
+ * 1 – This device consumes this resource
+ * 0 – This device produces and consumes this resource

[5] ACPI 6.2, sec 19.6.43:
ResourceUsage specifies whether the Memory range is consumed by
diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index 1b25bcc1edca..c877a369481d 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -12,3 +12,4 @@ Linux PCI Bus Subsystem
PCIEBUS-HOWTO
pci-iov-howto
MSI-HOWTO
+ acpi-info
--
2.20.1

2019-04-23 16:36:38

by Changbin Du

Subject: [PATCH v4 31/63] Documentation: PCI: convert pci-error-recovery.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/index.rst | 1 +
...or-recovery.txt => pci-error-recovery.rst} | 178 +++++++++---------
MAINTAINERS | 2 +-
3 files changed, 94 insertions(+), 87 deletions(-)
rename Documentation/PCI/{pci-error-recovery.txt => pci-error-recovery.rst} (80%)

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index c877a369481d..5ee4dba07116 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -13,3 +13,4 @@ Linux PCI Bus Subsystem
pci-iov-howto
MSI-HOWTO
acpi-info
+ pci-error-recovery
diff --git a/Documentation/PCI/pci-error-recovery.txt b/Documentation/PCI/pci-error-recovery.rst
similarity index 80%
rename from Documentation/PCI/pci-error-recovery.txt
rename to Documentation/PCI/pci-error-recovery.rst
index 0b6bb3ef449e..533ec4035bf5 100644
--- a/Documentation/PCI/pci-error-recovery.txt
+++ b/Documentation/PCI/pci-error-recovery.rst
@@ -1,12 +1,13 @@
+.. SPDX-License-Identifier: GPL-2.0

- PCI Error Recovery
- ------------------
- February 2, 2006
+==================
+PCI Error Recovery
+==================

- Current document maintainer:
- Linas Vepstas <[email protected]>
- updated by Richard Lary <[email protected]>
- and Mike Mason <[email protected]> on 27-Jul-2009
+
+:Authors: - Linas Vepstas <[email protected]>
+ - Richard Lary <[email protected]>
+ - Mike Mason <[email protected]>


Many PCI bus controllers are able to detect a variety of hardware
@@ -63,7 +64,8 @@ mechanisms for dealing with SCSI bus errors and SCSI bus resets.


Detailed Design
----------------
+===============
+
Design and implementation details below, based on a chain of
public email discussions with Ben Herrenschmidt, circa 5 April 2005.

@@ -73,30 +75,33 @@ pci_driver. A driver that fails to provide the structure is "non-aware",
and the actual recovery steps taken are platform dependent. The
arch/powerpc implementation will simulate a PCI hotplug remove/add.

-This structure has the form:
-struct pci_error_handlers
-{
- int (*error_detected)(struct pci_dev *dev, enum pci_channel_state);
- int (*mmio_enabled)(struct pci_dev *dev);
- int (*slot_reset)(struct pci_dev *dev);
- void (*resume)(struct pci_dev *dev);
-};
-
-The possible channel states are:
-enum pci_channel_state {
- pci_channel_io_normal, /* I/O channel is in normal state */
- pci_channel_io_frozen, /* I/O to channel is blocked */
- pci_channel_io_perm_failure, /* PCI card is dead */
-};
-
-Possible return values are:
-enum pci_ers_result {
- PCI_ERS_RESULT_NONE, /* no result/none/not supported in device driver */
- PCI_ERS_RESULT_CAN_RECOVER, /* Device driver can recover without slot reset */
- PCI_ERS_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */
- PCI_ERS_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */
- PCI_ERS_RESULT_RECOVERED, /* Device driver is fully recovered and operational */
-};
+This structure has the form::
+
+ struct pci_error_handlers
+ {
+ int (*error_detected)(struct pci_dev *dev, enum pci_channel_state);
+ int (*mmio_enabled)(struct pci_dev *dev);
+ int (*slot_reset)(struct pci_dev *dev);
+ void (*resume)(struct pci_dev *dev);
+ };
+
+The possible channel states are::
+
+ enum pci_channel_state {
+ pci_channel_io_normal, /* I/O channel is in normal state */
+ pci_channel_io_frozen, /* I/O to channel is blocked */
+ pci_channel_io_perm_failure, /* PCI card is dead */
+ };
+
+Possible return values are::
+
+ enum pci_ers_result {
+ PCI_ERS_RESULT_NONE, /* no result/none/not supported in device driver */
+ PCI_ERS_RESULT_CAN_RECOVER, /* Device driver can recover without slot reset */
+ PCI_ERS_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */
+ PCI_ERS_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */
+ PCI_ERS_RESULT_RECOVERED, /* Device driver is fully recovered and operational */
+ };

A driver does not have to implement all of these callbacks; however,
if it implements any, it must implement error_detected(). If a callback
@@ -134,16 +139,17 @@ shouldn't do any new IOs. Called in task context. This is sort of a

All drivers participating in this system must implement this call.
The driver must return one of the following result codes:
- - PCI_ERS_RESULT_CAN_RECOVER:
- Driver returns this if it thinks it might be able to recover
- the HW by just banging IOs or if it wants to be given
- a chance to extract some diagnostic information (see
- mmio_enable, below).
- - PCI_ERS_RESULT_NEED_RESET:
- Driver returns this if it can't recover without a
- slot reset.
- - PCI_ERS_RESULT_DISCONNECT:
- Driver returns this if it doesn't want to recover at all.
+
+ - PCI_ERS_RESULT_CAN_RECOVER:
+ Driver returns this if it thinks it might be able to recover
+ the HW by just banging IOs or if it wants to be given
+ a chance to extract some diagnostic information (see
+ mmio_enable, below).
+ - PCI_ERS_RESULT_NEED_RESET:
+ Driver returns this if it can't recover without a
+ slot reset.
+ - PCI_ERS_RESULT_DISCONNECT:
+ Driver returns this if it doesn't want to recover at all.

The next step taken will depend on the result codes returned by the
drivers.
@@ -177,7 +183,7 @@ is STEP 6 (Permanent Failure).
>>> get the device working again.

STEP 2: MMIO Enabled
--------------------
+--------------------
The platform re-enables MMIO to the device (but typically not the
DMA), and then calls the mmio_enabled() callback on all affected
device drivers.
@@ -203,23 +209,23 @@ instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
>>> into one of the next states, that is, link reset or slot reset.

The driver should return one of the following result codes:
- - PCI_ERS_RESULT_RECOVERED
- Driver returns this if it thinks the device is fully
- functional and thinks it is ready to start
- normal driver operations again. There is no
- guarantee that the driver will actually be
- allowed to proceed, as another driver on the
- same segment might have failed and thus triggered a
- slot reset on platforms that support it.
-
- - PCI_ERS_RESULT_NEED_RESET
- Driver returns this if it thinks the device is not
- recoverable in its current state and it needs a slot
- reset to proceed.
-
- - PCI_ERS_RESULT_DISCONNECT
- Same as above. Total failure, no recovery even after
- reset driver dead. (To be defined more precisely)
+ - PCI_ERS_RESULT_RECOVERED
+ Driver returns this if it thinks the device is fully
+ functional and thinks it is ready to start
+ normal driver operations again. There is no
+ guarantee that the driver will actually be
+ allowed to proceed, as another driver on the
+ same segment might have failed and thus triggered a
+ slot reset on platforms that support it.
+
+ - PCI_ERS_RESULT_NEED_RESET
+ Driver returns this if it thinks the device is not
+ recoverable in its current state and it needs a slot
+ reset to proceed.
+
+ - PCI_ERS_RESULT_DISCONNECT
+ Same as above. Total failure, no recovery even after
+ reset driver dead. (To be defined more precisely)

The next step taken depends on the results returned by the drivers.
If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform
@@ -293,24 +299,24 @@ device will be considered "dead" in this case.
Drivers for multi-function cards will need to coordinate among
themselves as to which driver instance will perform any "one-shot"
or global device initialization. For example, the Symbios sym53cxx2
-driver performs device init only from PCI function 0:
+driver performs device init only from PCI function 0::

-+ if (PCI_FUNC(pdev->devfn) == 0)
-+ sym_reset_scsi_bus(np, 0);
+ + if (PCI_FUNC(pdev->devfn) == 0)
+ + sym_reset_scsi_bus(np, 0);

- Result codes:
- - PCI_ERS_RESULT_DISCONNECT
- Same as above.
+Result codes:
+ - PCI_ERS_RESULT_DISCONNECT
+ Same as above.

Drivers for PCI Express cards that require a fundamental reset must
set the needs_freset bit in the pci_dev structure in their probe function.
For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
-PCI card types:
+PCI card types::

-+ /* Set EEH reset type to fundamental if required by hba */
-+ if (IS_QLA24XX(ha) || IS_QLA25XX(ha) || IS_QLA81XX(ha))
-+ pdev->needs_freset = 1;
-+
+ + /* Set EEH reset type to fundamental if required by hba */
+ + if (IS_QLA24XX(ha) || IS_QLA25XX(ha) || IS_QLA81XX(ha))
+ + pdev->needs_freset = 1;
+ +

Platform proceeds either to STEP 5 (Resume Operations) or STEP 6 (Permanent
Failure).
@@ -370,23 +376,23 @@ The current policy is to turn this into a platform policy.
That is, the recovery API only requires that:

- There is no guarantee that interrupt delivery can proceed from any
-device on the segment starting from the error detection and until the
-slot_reset callback is called, at which point interrupts are expected
-to be fully operational.
+ device on the segment starting from the error detection and until the
+ slot_reset callback is called, at which point interrupts are expected
+ to be fully operational.

- There is no guarantee that interrupt delivery is stopped, that is,
-a driver that gets an interrupt after detecting an error, or that detects
-an error within the interrupt handler such that it prevents proper
-ack'ing of the interrupt (and thus removal of the source) should just
-return IRQ_NOTHANDLED. It's up to the platform to deal with that
-condition, typically by masking the IRQ source during the duration of
-the error handling. It is expected that the platform "knows" which
-interrupts are routed to error-management capable slots and can deal
-with temporarily disabling that IRQ number during error processing (this
-isn't terribly complex). That means some IRQ latency for other devices
-sharing the interrupt, but there is simply no other way. High end
-platforms aren't supposed to share interrupts between many devices
-anyway :)
+ a driver that gets an interrupt after detecting an error, or that detects
+ an error within the interrupt handler such that it prevents proper
+ ack'ing of the interrupt (and thus removal of the source) should just
+ return IRQ_NOTHANDLED. It's up to the platform to deal with that
+ condition, typically by masking the IRQ source during the duration of
+ the error handling. It is expected that the platform "knows" which
+ interrupts are routed to error-management capable slots and can deal
+ with temporarily disabling that IRQ number during error processing (this
+ isn't terribly complex). That means some IRQ latency for other devices
+ sharing the interrupt, but there is simply no other way. High end
+ platforms aren't supposed to share interrupts between many devices
+ anyway :)

>>> Implementation details for the powerpc platform are discussed in
>>> the file Documentation/powerpc/eeh-pci-error-recovery.txt
diff --git a/MAINTAINERS b/MAINTAINERS
index 87f930bf32ad..403178958b05 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11965,7 +11965,7 @@ M: Sam Bobroff <[email protected]>
M: Oliver O'Halloran <[email protected]>
L: [email protected]
S: Supported
-F: Documentation/PCI/pci-error-recovery.txt
+F: Documentation/PCI/pci-error-recovery.rst
F: drivers/pci/pcie/aer.c
F: drivers/pci/pcie/dpc.c
F: drivers/pci/pcie/err.c
--
2.20.1
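
As a reading aid for the pci-error-recovery text converted above, here is a
minimal userspace sketch of how a driver's error_detected() callback might
map the channel state it is handed to one of the documented result codes.
The two enums are copied from the document as quoted in the patch (the
kernel's real definitions may differ in detail). demo_error_detected(),
demo_self_check() and the particular state-to-result policy are assumptions
made for illustration only; they are not taken from the kernel or from any
specific driver.

```c
#include <assert.h>

/* Copied from the document text; not the kernel's exact definitions. */
enum pci_channel_state {
	pci_channel_io_normal,		/* I/O channel is in normal state */
	pci_channel_io_frozen,		/* I/O to channel is blocked */
	pci_channel_io_perm_failure,	/* PCI card is dead */
};

enum pci_ers_result {
	PCI_ERS_RESULT_NONE,		/* no result/none/not supported */
	PCI_ERS_RESULT_CAN_RECOVER,	/* can recover without slot reset */
	PCI_ERS_RESULT_NEED_RESET,	/* wants slot to be reset */
	PCI_ERS_RESULT_DISCONNECT,	/* completely failed, unrecoverable */
	PCI_ERS_RESULT_RECOVERED,	/* fully recovered and operational */
};

/*
 * Hypothetical policy (an assumption, not the kernel's):
 *  - channel still usable  -> try to recover without a reset
 *  - channel frozen        -> ask the platform for a slot reset
 *  - permanent failure     -> give up on the device
 */
enum pci_ers_result demo_error_detected(enum pci_channel_state state)
{
	switch (state) {
	case pci_channel_io_normal:
		return PCI_ERS_RESULT_CAN_RECOVER;
	case pci_channel_io_frozen:
		return PCI_ERS_RESULT_NEED_RESET;
	case pci_channel_io_perm_failure:
		return PCI_ERS_RESULT_DISCONNECT;
	}
	/* Unknown state: report that nothing is supported. */
	return PCI_ERS_RESULT_NONE;
}

/* Usage example: exercise each channel state once. */
void demo_self_check(void)
{
	assert(demo_error_detected(pci_channel_io_normal) ==
	       PCI_ERS_RESULT_CAN_RECOVER);
	assert(demo_error_detected(pci_channel_io_frozen) ==
	       PCI_ERS_RESULT_NEED_RESET);
	assert(demo_error_detected(pci_channel_io_perm_failure) ==
	       PCI_ERS_RESULT_DISCONNECT);
}
```

A real driver would also quiesce its I/O before returning, as the document
requires; the sketch only shows the choice of return value.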

2019-04-23 16:36:51

by Changbin Du

Subject: [PATCH v4 25/63] Documentation: add Linux PCI to Sphinx TOC tree

Add an index.rst for the PCI subsystem. More docs will be added later.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/index.rst | 9 +++++++++
Documentation/index.rst | 1 +
2 files changed, 10 insertions(+)
create mode 100644 Documentation/PCI/index.rst

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
new file mode 100644
index 000000000000..c2f8728d11cf
--- /dev/null
+++ b/Documentation/PCI/index.rst
@@ -0,0 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+Linux PCI Bus Subsystem
+=======================
+
+.. toctree::
+ :maxdepth: 2
+ :numbered:
diff --git a/Documentation/index.rst b/Documentation/index.rst
index fdfa85c56a50..d80138284e0f 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -100,6 +100,7 @@ needed).
filesystems/index
vm/index
bpf/index
+ PCI/index
misc-devices/index

Architecture-specific documentation
--
2.20.1

2019-04-23 16:36:52

by Changbin Du

Subject: [PATCH v4 32/63] Documentation: PCI: convert pcieaer-howto.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/index.rst | 1 +
.../{pcieaer-howto.txt => pcieaer-howto.rst} | 110 ++++++++++++------
2 files changed, 74 insertions(+), 37 deletions(-)
rename Documentation/PCI/{pcieaer-howto.txt => pcieaer-howto.rst} (81%)

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index 5ee4dba07116..86c76c22810b 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -14,3 +14,4 @@ Linux PCI Bus Subsystem
MSI-HOWTO
acpi-info
pci-error-recovery
+ pcieaer-howto
diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.rst
similarity index 81%
rename from Documentation/PCI/pcieaer-howto.txt
rename to Documentation/PCI/pcieaer-howto.rst
index 48ce7903e3c6..67f77ff76865 100644
--- a/Documentation/PCI/pcieaer-howto.txt
+++ b/Documentation/PCI/pcieaer-howto.rst
@@ -1,21 +1,29 @@
- The PCI Express Advanced Error Reporting Driver Guide HOWTO
- T. Long Nguyen <[email protected]>
- Yanmin Zhang <[email protected]>
- 07/29/2006
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

+===========================================================
+The PCI Express Advanced Error Reporting Driver Guide HOWTO
+===========================================================

-1. Overview
+:Authors: - T. Long Nguyen <[email protected]>
+ - Yanmin Zhang <[email protected]>

-1.1 About this guide
+:Copyright: |copy| 2006 Intel Corporation
+
+Overview
+===========
+
+About this guide
+----------------

This guide describes the basics of the PCI Express Advanced Error
Reporting (AER) driver and provides information on how to use it, as
well as how to enable the drivers of endpoint devices to conform with
PCI Express AER driver.

-1.2 Copyright (C) Intel Corporation 2006.

-1.3 What is the PCI Express AER Driver?
+What is the PCI Express AER Driver?
+-----------------------------------

PCI Express error signaling can occur on the PCI Express link itself
or on behalf of transactions initiated on the link. PCI Express
@@ -30,17 +38,19 @@ The PCI Express AER driver provides the infrastructure to support PCI
Express Advanced Error Reporting capability. The PCI Express AER
driver provides three basic functions:

-- Gathers the comprehensive error information if errors occurred.
-- Reports error to the users.
-- Performs error recovery actions.
+ - Gathers the comprehensive error information if errors occurred.
+ - Reports error to the users.
+ - Performs error recovery actions.

AER driver only attaches root ports which support PCI-Express AER
capability.


-2. User Guide
+User Guide
+==========

-2.1 Include the PCI Express AER Root Driver into the Linux Kernel
+Include the PCI Express AER Root Driver into the Linux Kernel
+-------------------------------------------------------------

The PCI Express AER Root driver is a Root Port service driver attached
to the PCI Express Port Bus driver. If a user wants to use it, the driver
@@ -48,7 +58,8 @@ has to be compiled. Option CONFIG_PCIEAER supports this capability. It
depends on CONFIG_PCIEPORTBUS, so pls. set CONFIG_PCIEPORTBUS=y and
CONFIG_PCIEAER = y.

-2.2 Load PCI Express AER Root Driver
+Load PCI Express AER Root Driver
+--------------------------------

Some systems have AER support in firmware. Enabling Linux AER support at
the same time the firmware handles AER may result in unpredictable
@@ -56,30 +67,34 @@ behavior. Therefore, Linux does not handle AER events unless the firmware
grants AER control to the OS via the ACPI _OSC method. See the PCI FW 3.0
Specification for details regarding _OSC usage.

-2.3 AER error output
+AER error output
+----------------

When a PCIe AER error is captured, an error message will be output to
console. If it's a correctable error, it is output as a warning.
Otherwise, it is printed as an error. So users could choose different
log level to filter out correctable error messages.

-Below shows an example:
-0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
-0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
-0000:50:00.0: [20] Unsupported Request (First)
-0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
+Below shows an example::
+
+ 0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
+ 0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
+ 0000:50:00.0: [20] Unsupported Request (First)
+ 0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100

In the example, 'Requester ID' means the ID of the device who sends
the error message to root port. Pls. refer to pci express specs for
other fields.

-2.4 AER Statistics / Counters
+AER Statistics / Counters
+-------------------------

When PCIe AER errors are captured, the counters / statistics are also exposed
in the form of sysfs attributes which are documented at
Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats

-3. Developer Guide
+Developer Guide
+===============

To enable AER aware support requires a software driver to configure
the AER capability structure within its device and to provide callbacks.
@@ -120,7 +135,8 @@ hierarchy and links. These errors do not include any device specific
errors because device specific errors will still get sent directly to
the device driver.

-3.1 Configure the AER capability structure
+Configure the AER capability structure
+--------------------------------------

AER aware drivers of PCI Express component need change the device
control registers to enable AER. They also could change AER registers,
@@ -128,9 +144,11 @@ including mask and severity registers. Helper function
pci_enable_pcie_error_reporting could be used to enable AER. See
section 3.3.

-3.2. Provide callbacks
+Provide callbacks
+-----------------

-3.2.1 callback reset_link to reset pci express link
+callback reset_link to reset pci express link
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This callback is used to reset the pci express physical link when a
fatal error happens. The root port aer service driver provides a
@@ -140,13 +158,15 @@ upstream ports should provide their own reset_link functions.

In struct pcie_port_service_driver, a new pointer, reset_link, is
added.
+::

-pci_ers_result_t (*reset_link) (struct pci_dev *dev);
+ pci_ers_result_t (*reset_link) (struct pci_dev *dev);

Section 3.2.2.2 provides more detailed info on when to call
reset_link.

-3.2.2 PCI error-recovery callbacks
+PCI error-recovery callbacks
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PCI Express AER Root driver uses error callbacks to coordinate
with downstream device drivers associated with a hierarchy in question
@@ -161,7 +181,8 @@ definitions of the callbacks.

Below sections specify when to call the error callback functions.

-3.2.2.1 Correctable errors
+Correctable errors
+~~~~~~~~~~~~~~~~~~

Correctable errors pose no impacts on the functionality of
the interface. The PCI Express protocol can recover without any
@@ -169,13 +190,16 @@ software intervention or any loss of data. These errors do not
require any recovery actions. The AER driver clears the device's
correctable error status register accordingly and logs these errors.

-3.2.2.2 Non-correctable (non-fatal and fatal) errors
+Non-correctable (non-fatal and fatal) errors
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If an error message indicates a non-fatal error, performing link reset
at upstream is not required. The AER driver calls error_detected(dev,
pci_channel_io_normal) to all drivers associated within a hierarchy in
-question. for example,
-EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort.
+question. for example::
+
+ EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort
+
If Upstream port A captures an AER error, the hierarchy consists of
Downstream port B and EndPoint.

@@ -199,23 +223,33 @@ function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
to mmio_enabled.

-3.3 helper functions
+helper functions
+----------------
+::
+
+ int pci_enable_pcie_error_reporting(struct pci_dev *dev);

-3.3.1 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
pci_enable_pcie_error_reporting enables the device to send error
messages to root port when an error is detected. Note that devices
don't enable the error reporting by default, so device drivers need
call this function to enable it.

-3.3.2 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+::
+
+ int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+
pci_disable_pcie_error_reporting disables the device to send error
messages to root port when an error is detected.

-3.3.3 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+::
+
+ int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+
pci_cleanup_aer_uncorrect_error_status cleanups the uncorrectable
error status register.

-3.4 Frequent Asked Questions
+Frequent Asked Questions
+------------------------

Q: What happens if a PCI Express device driver does not provide an
error recovery handler (pci_driver->err_handler is equal to NULL)?
@@ -245,7 +279,8 @@ A: It could call the helper functions to enable AER in devices and
cleanup uncorrectable status register. Pls. refer to section 3.3.


-4. Software error injection
+Software error injection
+========================

Debugging PCIe AER error recovery code is quite difficult because it
is hard to trigger real hardware errors. Software based error
@@ -261,6 +296,7 @@ After reboot with new kernel or insert the module, a device file named

Then, you need a user space tool named aer-inject, which can be gotten
from:
+
https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/

More information about aer-inject can be found in the document comes
--
2.20.1
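
The "AER error output" section of the converted pcieaer-howto shows a log
line with "error status/mask=00100000/00000000" followed by
"[20] Unsupported Request". The userspace sketch below shows how the two
relate: it pulls the status and mask words out of such a line and lists
which bits are set. The function names aer_parse_status_mask() and
aer_set_bits() are hypothetical helpers written for this note, not kernel
APIs, and the parser assumes the exact field layout of the sample line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse the "error status/mask=XXXXXXXX/XXXXXXXX" part of an AER log line.
 * Returns 0 on success, -1 if the marker or either hex field is missing.
 */
int aer_parse_status_mask(const char *line, unsigned int *status,
			  unsigned int *mask)
{
	static const char marker[] = "error status/mask=";
	const char *p;

	assert(line && status && mask);

	p = strstr(line, marker);
	if (!p)
		return -1;
	if (sscanf(p + strlen(marker), "%8x/%8x", status, mask) != 2)
		return -1;
	return 0;
}

/*
 * Store the index of every set bit of 'status' in 'bits' (room for 32
 * entries) and return how many were stored.
 */
int aer_set_bits(unsigned int status, int bits[32])
{
	int n = 0;

	for (int i = 0; i < 32; i++)
		if (status & (1u << i))
			bits[n++] = i;
	return n;
}
```

For the sample line, the status word 0x00100000 has only bit 20 set, which
matches the "[20]" prefix printed on the following log line.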

2019-04-23 16:37:06

by Changbin Du

Subject: [PATCH v4 27/63] Documentation: PCI: convert PCIEBUS-HOWTO.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
.../{PCIEBUS-HOWTO.txt => PCIEBUS-HOWTO.rst} | 140 ++++++++++--------
Documentation/PCI/index.rst | 1 +
2 files changed, 82 insertions(+), 59 deletions(-)
rename Documentation/PCI/{PCIEBUS-HOWTO.txt => PCIEBUS-HOWTO.rst} (70%)

diff --git a/Documentation/PCI/PCIEBUS-HOWTO.txt b/Documentation/PCI/PCIEBUS-HOWTO.rst
similarity index 70%
rename from Documentation/PCI/PCIEBUS-HOWTO.txt
rename to Documentation/PCI/PCIEBUS-HOWTO.rst
index 15f0bb3b5045..f882ff62c51f 100644
--- a/Documentation/PCI/PCIEBUS-HOWTO.txt
+++ b/Documentation/PCI/PCIEBUS-HOWTO.rst
@@ -1,16 +1,23 @@
- The PCI Express Port Bus Driver Guide HOWTO
- Tom L Nguyen [email protected]
- 11/03/2004
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

-1. About this guide
+===========================================
+The PCI Express Port Bus Driver Guide HOWTO
+===========================================
+
+:Author: Tom L Nguyen [email protected] 11/03/2004
+:Copyright: |copy| 2004 Intel Corporation
+
+About this guide
+================

This guide describes the basics of the PCI Express Port Bus driver
and provides information on how to enable the service drivers to
register/unregister with the PCI Express Port Bus Driver.

-2. Copyright 2004 Intel Corporation

-3. What is the PCI Express Port Bus Driver
+What is the PCI Express Port Bus Driver
+=======================================

A PCI Express Port is a logical PCI-PCI Bridge structure. There
are two types of PCI Express Port: the Root Port and the Switch
@@ -30,7 +37,8 @@ support (AER), and virtual channel support (VC). These services may
be handled by a single complex driver or be individually distributed
and handled by corresponding service drivers.

-4. Why use the PCI Express Port Bus Driver?
+Why use the PCI Express Port Bus Driver?
+========================================

In existing Linux kernels, the Linux Device Driver Model allows a
physical device to be handled by only a single driver. The PCI
@@ -51,28 +59,31 @@ PCI Express Ports and distributes all provided service requests
to the corresponding service drivers as required. Some key
advantages of using the PCI Express Port Bus driver are listed below:

- - Allow multiple service drivers to run simultaneously on
- a PCI-PCI Bridge Port device.
+ - Allow multiple service drivers to run simultaneously on
+ a PCI-PCI Bridge Port device.

- - Allow service drivers implemented in an independent
- staged approach.
+ - Allow service drivers implemented in an independent
+ staged approach.

- - Allow one service driver to run on multiple PCI-PCI Bridge
- Port devices.
+ - Allow one service driver to run on multiple PCI-PCI Bridge
+ Port devices.

- - Manage and distribute resources of a PCI-PCI Bridge Port
- device to requested service drivers.
+ - Manage and distribute resources of a PCI-PCI Bridge Port
+ device to requested service drivers.

-5. Configuring the PCI Express Port Bus Driver vs. Service Drivers
+Configuring the PCI Express Port Bus Driver vs. Service Drivers
+===============================================================

-5.1 Including the PCI Express Port Bus Driver Support into the Kernel
+Including the PCI Express Port Bus Driver Support into the Kernel
+-----------------------------------------------------------------

Including the PCI Express Port Bus driver depends on whether the PCI
Express support is included in the kernel config. The kernel will
automatically include the PCI Express Port Bus driver as a kernel
driver when the PCI Express support is enabled in the kernel.

-5.2 Enabling Service Driver Support
+Enabling Service Driver Support
+-------------------------------

PCI device drivers are implemented based on Linux Device Driver Model.
All service drivers are PCI device drivers. As discussed above, it is
@@ -89,9 +100,11 @@ header file /include/linux/pcieport_if.h, before calling these APIs.
Failure to do so will result an identity mismatch, which prevents
the PCI Express Port Bus driver from loading a service driver.

-5.2.1 pcie_port_service_register
+pcie_port_service_register
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+::

-int pcie_port_service_register(struct pcie_port_service_driver *new)
+ int pcie_port_service_register(struct pcie_port_service_driver *new)

This API replaces the Linux Driver Model's pci_register_driver API. A
service driver should always calls pcie_port_service_register at
@@ -99,69 +112,76 @@ module init. Note that after service driver being loaded, calls
such as pci_enable_device(dev) and pci_set_master(dev) are no longer
necessary since these calls are executed by the PCI Port Bus driver.

-5.2.2 pcie_port_service_unregister
+pcie_port_service_unregister
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+::

-void pcie_port_service_unregister(struct pcie_port_service_driver *new)
+ void pcie_port_service_unregister(struct pcie_port_service_driver *new)

pcie_port_service_unregister replaces the Linux Driver Model's
pci_unregister_driver. It's always called by service driver when a
module exits.

-5.2.3 Sample Code
+Sample Code
+~~~~~~~~~~~

Below is sample service driver code to initialize the port service
driver data structure.
+::

-static struct pcie_port_service_id service_id[] = { {
- .vendor = PCI_ANY_ID,
- .device = PCI_ANY_ID,
- .port_type = PCIE_RC_PORT,
- .service_type = PCIE_PORT_SERVICE_AER,
- }, { /* end: all zeroes */ }
-};
+ static struct pcie_port_service_id service_id[] = { {
+ .vendor = PCI_ANY_ID,
+ .device = PCI_ANY_ID,
+ .port_type = PCIE_RC_PORT,
+ .service_type = PCIE_PORT_SERVICE_AER,
+ }, { /* end: all zeroes */ }
+ };

-static struct pcie_port_service_driver root_aerdrv = {
- .name = (char *)device_name,
- .id_table = &service_id[0],
+ static struct pcie_port_service_driver root_aerdrv = {
+ .name = (char *)device_name,
+ .id_table = &service_id[0],

- .probe = aerdrv_load,
- .remove = aerdrv_unload,
+ .probe = aerdrv_load,
+ .remove = aerdrv_unload,

- .suspend = aerdrv_suspend,
- .resume = aerdrv_resume,
-};
+ .suspend = aerdrv_suspend,
+ .resume = aerdrv_resume,
+ };

Below is a sample code for registering/unregistering a service
driver.
+::

-static int __init aerdrv_service_init(void)
-{
- int retval = 0;
+ static int __init aerdrv_service_init(void)
+ {
+ int retval = 0;

- retval = pcie_port_service_register(&root_aerdrv);
- if (!retval) {
- /*
- * FIX ME
- */
- }
- return retval;
-}
+ retval = pcie_port_service_register(&root_aerdrv);
+ if (!retval) {
+ /*
+ * FIX ME
+ */
+ }
+ return retval;
+ }

-static void __exit aerdrv_service_exit(void)
-{
- pcie_port_service_unregister(&root_aerdrv);
-}
+ static void __exit aerdrv_service_exit(void)
+ {
+ pcie_port_service_unregister(&root_aerdrv);
+ }

-module_init(aerdrv_service_init);
-module_exit(aerdrv_service_exit);
+ module_init(aerdrv_service_init);
+ module_exit(aerdrv_service_exit);

-6. Possible Resource Conflicts
+Possible Resource Conflicts
+===========================

Since all service drivers of a PCI-PCI Bridge Port device are
allowed to run simultaneously, below lists a few of possible resource
conflicts with proposed solutions.

-6.1 MSI and MSI-X Vector Resource
+MSI and MSI-X Vector Resource
+-----------------------------

Once MSI or MSI-X interrupts are enabled on a device, it stays in this
mode until they are disabled again. Since service drivers of the same
@@ -179,7 +199,8 @@ driver. Service drivers should use (struct pcie_device*)dev->irq to
call request_irq/free_irq. In addition, the interrupt mode is stored
in the field interrupt_mode of struct pcie_device.

-6.3 PCI Memory/IO Mapped Regions
+PCI Memory/IO Mapped Regions
+----------------------------

Service drivers for PCI Express Power Management (PME), Advanced
Error Reporting (AER), Hot-Plug (HP) and Virtual Channel (VC) access
@@ -188,7 +209,8 @@ registers accessed are independent of each other. This patch assumes
that all service drivers will be well behaved and not overwrite
other service driver's configuration settings.

-6.4 PCI Config Registers
+PCI Config Registers
+--------------------

Each service driver runs its PCI config operations on its own
capability structure except the PCI Express capability structure, in
diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index 7babf43709b0..452723318405 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -9,3 +9,4 @@ Linux PCI Bus Subsystem
:numbered:

pci
+ PCIEBUS-HOWTO
--
2.20.1

2019-04-23 16:37:13

by Changbin Du

Subject: [PATCH v4 35/63] Documentation: PCI: convert endpoint/pci-test-function.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/endpoint/index.rst | 1 +
...est-function.txt => pci-test-function.rst} | 32 +++++++++++--------
2 files changed, 20 insertions(+), 13 deletions(-)
rename Documentation/PCI/endpoint/{pci-test-function.txt => pci-test-function.rst} (84%)

diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
index 3951de9f923c..b680a3fc4fec 100644
--- a/Documentation/PCI/endpoint/index.rst
+++ b/Documentation/PCI/endpoint/index.rst
@@ -9,3 +9,4 @@ PCI Endpoint Framework

pci-endpoint
pci-endpoint-cfs
+ pci-test-function
diff --git a/Documentation/PCI/endpoint/pci-test-function.txt b/Documentation/PCI/endpoint/pci-test-function.rst
similarity index 84%
rename from Documentation/PCI/endpoint/pci-test-function.txt
rename to Documentation/PCI/endpoint/pci-test-function.rst
index 5916f1f592bb..ba02cddcec37 100644
--- a/Documentation/PCI/endpoint/pci-test-function.txt
+++ b/Documentation/PCI/endpoint/pci-test-function.rst
@@ -1,5 +1,10 @@
- PCI TEST
- Kishon Vijay Abraham I <[email protected]>
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+PCI Test Function
+=================
+
+:Author: Kishon Vijay Abraham I <[email protected]>

Traditionally PCI RC has always been validated by using standard
PCI cards like ethernet PCI cards or USB PCI cards or SATA PCI cards.
@@ -23,30 +28,31 @@ The PCI endpoint test device has the following registers:
8) PCI_ENDPOINT_TEST_IRQ_TYPE
9) PCI_ENDPOINT_TEST_IRQ_NUMBER

-*) PCI_ENDPOINT_TEST_MAGIC
+* PCI_ENDPOINT_TEST_MAGIC

This register will be used to test BAR0. A known pattern will be written
and read back from MAGIC register to verify BAR0.

-*) PCI_ENDPOINT_TEST_COMMAND:
+* PCI_ENDPOINT_TEST_COMMAND:

This register will be used by the host driver to indicate the function
that the endpoint device must perform.

-Bitfield Description:
+Bitfield Description::
+
Bit 0 : raise legacy IRQ
Bit 1 : raise MSI IRQ
Bit 2 : raise MSI-X IRQ
Bit 3 : read command (read data from RC buffer)
Bit 4 : write command (write data to RC buffer)
- Bit 5 : copy command (copy data from one RC buffer to another
- RC buffer)
+ Bit 5 : copy command (copy data from one RC buffer to another RC buffer)

-*) PCI_ENDPOINT_TEST_STATUS
+* PCI_ENDPOINT_TEST_STATUS

This register reflects the status of the PCI endpoint device.

-Bitfield Description:
+Bitfield Description::
+
Bit 0 : read success
Bit 1 : read fail
Bit 2 : write success
@@ -57,17 +63,17 @@ Bitfield Description:
Bit 7 : source address is invalid
Bit 8 : destination address is invalid

-*) PCI_ENDPOINT_TEST_SRC_ADDR
+* PCI_ENDPOINT_TEST_SRC_ADDR

This register contains the source address (RC buffer address) for the
COPY/READ command.

-*) PCI_ENDPOINT_TEST_DST_ADDR
+* PCI_ENDPOINT_TEST_DST_ADDR

This register contains the destination address (RC buffer address) for
the COPY/WRITE command.

-*) PCI_ENDPOINT_TEST_IRQ_TYPE
+* PCI_ENDPOINT_TEST_IRQ_TYPE

This register contains the interrupt type (Legacy/MSI) triggered
for the READ/WRITE/COPY and raise IRQ (Legacy/MSI) commands.
@@ -77,7 +83,7 @@ Possible types:
- MSI : 1
- MSI-X : 2

-*) PCI_ENDPOINT_TEST_IRQ_NUMBER
+* PCI_ENDPOINT_TEST_IRQ_NUMBER

This register contains the triggered ID interrupt.

--
2.20.1
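
The COMMAND and STATUS bitfields described in the patch above can be written down as C constants. This is only a sketch: the identifier names are hypothetical, they are not claimed to match the macros in the actual pci-epf-test or pci_endpoint_test drivers, and only the status bits visible in the quoted hunks are included (bits 3 to 6 are elided there). Treating "read fail" and the two "address is invalid" bits as error conditions is an assumption drawn from their names.

```c
#include <assert.h>
#include <stdint.h>

/* COMMAND register bits, as listed in the bitfield description. */
enum epf_test_command {
	CMD_RAISE_LEGACY_IRQ = 1u << 0,
	CMD_RAISE_MSI_IRQ    = 1u << 1,
	CMD_RAISE_MSIX_IRQ   = 1u << 2,
	CMD_READ             = 1u << 3, /* read data from RC buffer */
	CMD_WRITE            = 1u << 4, /* write data to RC buffer */
	CMD_COPY             = 1u << 5, /* copy between two RC buffers */
};

/* STATUS register bits shown in the quoted hunks (bits 3-6 omitted). */
enum epf_test_status {
	STATUS_READ_SUCCESS     = 1u << 0,
	STATUS_READ_FAIL        = 1u << 1,
	STATUS_WRITE_SUCCESS    = 1u << 2,
	STATUS_SRC_ADDR_INVALID = 1u << 7,
	STATUS_DST_ADDR_INVALID = 1u << 8,
};

/* Assumption: any of these bits being set means the command failed. */
static int status_has_error(uint32_t status)
{
	return (status & (STATUS_READ_FAIL |
			  STATUS_SRC_ADDR_INVALID |
			  STATUS_DST_ADDR_INVALID)) != 0;
}
```

A host driver following this layout would, for example, write `CMD_READ | CMD_RAISE_MSI_IRQ` (0x0a) to request a read, and then inspect the STATUS register.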

2019-04-23 16:37:24

by Changbin Du

Subject: [PATCH v4 36/63] Documentation: PCI: convert endpoint/pci-test-howto.txt to reST

Convert the plain text documentation to reStructuredText format and
add it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/endpoint/index.rst | 1 +
...{pci-test-howto.txt => pci-test-howto.rst} | 81 +++++++++++++------
2 files changed, 56 insertions(+), 26 deletions(-)
rename Documentation/PCI/endpoint/{pci-test-howto.txt => pci-test-howto.rst} (78%)

diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
index b680a3fc4fec..d114ea74b444 100644
--- a/Documentation/PCI/endpoint/index.rst
+++ b/Documentation/PCI/endpoint/index.rst
@@ -10,3 +10,4 @@ PCI Endpoint Framework
pci-endpoint
pci-endpoint-cfs
pci-test-function
+ pci-test-howto
diff --git a/Documentation/PCI/endpoint/pci-test-howto.txt b/Documentation/PCI/endpoint/pci-test-howto.rst
similarity index 78%
rename from Documentation/PCI/endpoint/pci-test-howto.txt
rename to Documentation/PCI/endpoint/pci-test-howto.rst
index 040479f437a5..909f770a07d6 100644
--- a/Documentation/PCI/endpoint/pci-test-howto.txt
+++ b/Documentation/PCI/endpoint/pci-test-howto.rst
@@ -1,38 +1,51 @@
- PCI TEST USERGUIDE
- Kishon Vijay Abraham I <[email protected]>
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+PCI Test User Guide
+===================
+
+:Author: Kishon Vijay Abraham I <[email protected]>

This document is a guide to help users use pci-epf-test function driver
and pci_endpoint_test host driver for testing PCI. The list of steps to
be followed in the host side and EP side is given below.

-1. Endpoint Device
+Endpoint Device
+===============

-1.1 Endpoint Controller Devices
+Endpoint Controller Devices
+---------------------------

-To find the list of endpoint controller devices in the system:
+To find the list of endpoint controller devices in the system::

# ls /sys/class/pci_epc/
51000000.pcie_ep

-If PCI_ENDPOINT_CONFIGFS is enabled
+If PCI_ENDPOINT_CONFIGFS is enabled::
+
# ls /sys/kernel/config/pci_ep/controllers
51000000.pcie_ep

-1.2 Endpoint Function Drivers

-To find the list of endpoint function drivers in the system:
+Endpoint Function Drivers
+-------------------------
+
+To find the list of endpoint function drivers in the system::

# ls /sys/bus/pci-epf/drivers
pci_epf_test

-If PCI_ENDPOINT_CONFIGFS is enabled
+If PCI_ENDPOINT_CONFIGFS is enabled::
+
# ls /sys/kernel/config/pci_ep/functions
pci_epf_test

-1.3 Creating pci-epf-test Device
+
+Creating pci-epf-test Device
+----------------------------

PCI endpoint function device can be created using the configfs. To create
-pci-epf-test device, the following commands can be used
+pci-epf-test device, the following commands can be used::

# mount -t configfs none /sys/kernel/config
# cd /sys/kernel/config/pci_ep/
@@ -42,7 +55,7 @@ The "mkdir func1" above creates the pci-epf-test function device that will
be probed by pci_epf_test driver.

The PCI endpoint framework populates the directory with the following
-configurable fields.
+configurable fields::

# ls functions/pci_epf_test/func1
baseclass_code interrupt_pin progif_code subsys_id
@@ -51,67 +64,83 @@ configurable fields.

The PCI endpoint function driver populates these entries with default values
when the device is bound to the driver. The pci-epf-test driver populates
-vendorid with 0xffff and interrupt_pin with 0x0001
+vendorid with 0xffff and interrupt_pin with 0x0001::

# cat functions/pci_epf_test/func1/vendorid
0xffff
# cat functions/pci_epf_test/func1/interrupt_pin
0x0001

-1.4 Configuring pci-epf-test Device
+
+Configuring pci-epf-test Device
+-------------------------------

The user can configure the pci-epf-test device using configfs entry. In order
to change the vendorid and the number of MSI interrupts used by the function
-device, the following commands can be used.
+device, the following commands can be used::

# echo 0x104c > functions/pci_epf_test/func1/vendorid
# echo 0xb500 > functions/pci_epf_test/func1/deviceid
# echo 16 > functions/pci_epf_test/func1/msi_interrupts
# echo 8 > functions/pci_epf_test/func1/msix_interrupts

-1.5 Binding pci-epf-test Device to EP Controller
+
+Binding pci-epf-test Device to EP Controller
+--------------------------------------------

In order for the endpoint function device to be useful, it has to be bound to
a PCI endpoint controller driver. Use the configfs to bind the function
-device to one of the controller driver present in the system.
+device to one of the controller driver present in the system::

# ln -s functions/pci_epf_test/func1 controllers/51000000.pcie_ep/

Once the above step is completed, the PCI endpoint is ready to establish a link
with the host.

-1.6 Start the Link
+
+Start the Link
+--------------

In order for the endpoint device to establish a link with the host, the _start_
-field should be populated with '1'.
+field should be populated with '1'::

# echo 1 > controllers/51000000.pcie_ep/start

-2. RootComplex Device

-2.1 lspci Output
+RootComplex Device
+==================
+
+lspci Output
+------------

-Note that the devices listed here correspond to the value populated in 1.4 above
+Note that the devices listed here correspond to the values populated in the
+"Configuring pci-epf-test Device" section above::

00:00.0 PCI bridge: Texas Instruments Device 8888 (rev 01)
01:00.0 Unassigned class [ff00]: Texas Instruments Device b500

-2.2 Using Endpoint Test function Device
+
+Using Endpoint Test function Device
+-----------------------------------

pcitest.sh added in tools/pci/ can be used to run all the default PCI endpoint
-tests. To compile this tool the following commands should be used:
+tests. To compile this tool the following commands should be used::

# cd <kernel-dir>
# make -C tools/pci

-or if you desire to compile and install in your system:
+or if you desire to compile and install in your system::

# cd <kernel-dir>
# make -C tools/pci install

The tool and script will be located in <rootfs>/usr/bin/

-2.2.1 pcitest.sh Output
+
+pcitest.sh Output
+~~~~~~~~~~~~~~~~~
+::
+
# pcitest.sh
BAR tests

--
2.20.1

2019-04-23 16:37:27

by Changbin Du

Subject: [PATCH v4 29/63] Documentation: PCI: convert MSI-HOWTO.txt to reST

Convert the plain text documentation to reStructuredText format and
add it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>

---
v2:
o drop numbering.
o simplify author list
---
.../PCI/{MSI-HOWTO.txt => MSI-HOWTO.rst} | 83 +++++++++++--------
Documentation/PCI/index.rst | 1 +
2 files changed, 50 insertions(+), 34 deletions(-)
rename Documentation/PCI/{MSI-HOWTO.txt => MSI-HOWTO.rst} (88%)

diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.rst
similarity index 88%
rename from Documentation/PCI/MSI-HOWTO.txt
rename to Documentation/PCI/MSI-HOWTO.rst
index 618e13d5e276..18cc3700489b 100644
--- a/Documentation/PCI/MSI-HOWTO.txt
+++ b/Documentation/PCI/MSI-HOWTO.rst
@@ -1,13 +1,14 @@
- The MSI Driver Guide HOWTO
- Tom L Nguyen [email protected]
- 10/03/2003
- Revised Feb 12, 2004 by Martine Silbermann
- email: [email protected]
- Revised Jun 25, 2004 by Tom L Nguyen
- Revised Jul 9, 2008 by Matthew Wilcox <[email protected]>
- Copyright 2003, 2008 Intel Corporation
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>

-1. About this guide
+==========================
+The MSI Driver Guide HOWTO
+==========================
+
+:Authors: Tom L Nguyen; Martine Silbermann; Matthew Wilcox
+
+About this guide
+================

This guide describes the basics of Message Signaled Interrupts (MSIs),
the advantages of using MSI over traditional interrupt mechanisms, how
@@ -15,7 +16,8 @@ to change your driver to use MSI or MSI-X and some basic diagnostics to
try if a device doesn't support MSIs.


-2. What are MSIs?
+What are MSIs?
+==============

A Message Signaled Interrupt is a write from the device to a special
address which causes an interrupt to be received by the CPU.
@@ -29,7 +31,8 @@ Devices may support both MSI and MSI-X, but only one can be enabled at
a time.


-3. Why use MSIs?
+Why use MSIs?
+=============

There are three reasons why using MSIs can give an advantage over
traditional pin-based interrupts.
@@ -61,14 +64,16 @@ Other possible designs include giving one interrupt to each packet queue
in a network card or each port in a storage controller.


-4. How to use MSIs
+How to use MSIs
+===============

PCI devices are initialised to use pin-based interrupts. The device
driver has to set up the device to use MSI or MSI-X. Not all machines
support MSIs correctly, and for those machines, the APIs described below
will simply fail and the device will continue to use pin-based interrupts.

-4.1 Include kernel support for MSIs
+Include kernel support for MSIs
+-------------------------------

To support MSI or MSI-X, the kernel must be built with the CONFIG_PCI_MSI
option enabled. This option is only available on some architectures,
@@ -76,14 +81,15 @@ and it may depend on some other options also being set. For example,
on x86, you must also enable X86_UP_APIC or SMP in order to see the
CONFIG_PCI_MSI option.

-4.2 Using MSI
+Using MSI
+---------

Most of the hard work is done for the driver in the PCI layer. The driver
simply has to request that the PCI layer set up the MSI capability for this
device.

To automatically use MSI or MSI-X interrupt vectors, use the following
-function:
+function::

int pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
unsigned int max_vecs, unsigned int flags);
@@ -101,12 +107,12 @@ any possible kind of interrupt. If the PCI_IRQ_AFFINITY flag is set,
pci_alloc_irq_vectors() will spread the interrupts around the available CPUs.

To get the Linux IRQ numbers passed to request_irq() and free_irq() and the
-vectors, use the following function:
+vectors, use the following function::

int pci_irq_vector(struct pci_dev *dev, unsigned int nr);

Any allocated resources should be freed before removing the device using
-the following function:
+the following function::

void pci_free_irq_vectors(struct pci_dev *dev);

@@ -126,7 +132,7 @@ The typical usage of MSI or MSI-X interrupts is to allocate as many vectors
as possible, likely up to the limit supported by the device. If nvec is
larger than the number supported by the device it will automatically be
capped to the supported limit, so there is no need to query the number of
-vectors supported beforehand:
+vectors supported beforehand::

nvec = pci_alloc_irq_vectors(pdev, 1, nvec, PCI_IRQ_ALL_TYPES)
if (nvec < 0)
@@ -135,7 +141,7 @@ vectors supported beforehand:
If a driver is unable or unwilling to deal with a variable number of MSI
interrupts it can request a particular number of interrupts by passing that
number to pci_alloc_irq_vectors() function as both 'min_vecs' and
-'max_vecs' parameters:
+'max_vecs' parameters::

ret = pci_alloc_irq_vectors(pdev, nvec, nvec, PCI_IRQ_ALL_TYPES);
if (ret < 0)
@@ -143,23 +149,24 @@ number to pci_alloc_irq_vectors() function as both 'min_vecs' and

The most notorious example of the request type described above is enabling
the single MSI mode for a device. It could be done by passing two 1s as
-'min_vecs' and 'max_vecs':
+'min_vecs' and 'max_vecs'::

ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
if (ret < 0)
goto out_err;

Some devices might not support using legacy line interrupts, in which case
-the driver can specify that only MSI or MSI-X is acceptable:
+the driver can specify that only MSI or MSI-X is acceptable::

nvec = pci_alloc_irq_vectors(pdev, 1, nvec, PCI_IRQ_MSI | PCI_IRQ_MSIX);
if (nvec < 0)
goto out_err;

-4.3 Legacy APIs
+Legacy APIs
+-----------

The following old APIs to enable and disable MSI or MSI-X interrupts should
-not be used in new code:
+not be used in new code::

pci_enable_msi() /* deprecated */
pci_disable_msi() /* deprecated */
@@ -174,9 +181,11 @@ number of vectors. If you have a legitimate special use case for the count
of vectors we might have to revisit that decision and add a
pci_nr_irq_vectors() helper that handles MSI and MSI-X transparently.

-4.4 Considerations when using MSIs
+Considerations when using MSIs
+------------------------------

-4.4.1 Spinlocks
+Spinlocks
+~~~~~~~~~

Most device drivers have a per-device spinlock which is taken in the
interrupt handler. With pin-based interrupts or a single MSI, it is not
@@ -188,7 +197,8 @@ acquire the spinlock. Such deadlocks can be avoided by using
spin_lock_irqsave() or spin_lock_irq() which disable local interrupts
and acquire the lock (see Documentation/kernel-hacking/locking.rst).

-4.5 How to tell whether MSI/MSI-X is enabled on a device
+How to tell whether MSI/MSI-X is enabled on a device
+----------------------------------------------------

Using 'lspci -v' (as root) may show some devices with "MSI", "Message
Signalled Interrupts" or "MSI-X" capabilities. Each of these capabilities
@@ -196,7 +206,8 @@ has an 'Enable' flag which is followed with either "+" (enabled)
or "-" (disabled).


-5. MSI quirks
+MSI quirks
+==========

Several PCI chipsets or devices are known not to support MSIs.
The PCI stack provides three ways to disable MSIs:
@@ -205,7 +216,8 @@ The PCI stack provides three ways to disable MSIs:
2. on all devices behind a specific bridge
3. on a single device

-5.1. Disabling MSIs globally
+Disabling MSIs globally
+-----------------------

Some host chipsets simply don't support MSIs properly. If we're
lucky, the manufacturer knows this and has indicated it in the ACPI
@@ -219,7 +231,8 @@ on the kernel command line to disable MSIs on all devices. It would be
in your best interests to report the problem to [email protected]
including a full 'lspci -v' so we can add the quirks to the kernel.

-5.2. Disabling MSIs below a bridge
+Disabling MSIs below a bridge
+-----------------------------

Some PCI bridges are not able to route MSIs between busses properly.
In this case, MSIs must be disabled on all devices behind the bridge.
@@ -230,7 +243,7 @@ as the nVidia nForce and Serverworks HT2000). As with host chipsets,
Linux mostly knows about them and automatically enables MSIs if it can.
If you have a bridge unknown to Linux, you can enable
MSIs in configuration space using whatever method you know works, then
-enable MSIs on that bridge by doing:
+enable MSIs on that bridge by doing::

echo 1 > /sys/bus/pci/devices/$bridge/msi_bus

@@ -244,7 +257,8 @@ below this bridge.
Again, please notify [email protected] of any bridges that need
special handling.

-5.3. Disabling MSIs on a single device
+Disabling MSIs on a single device
+---------------------------------

Some devices are known to have faulty MSI implementations. Usually this
is handled in the individual device driver, but occasionally it's necessary
@@ -252,7 +266,8 @@ to handle this with a quirk. Some drivers have an option to disable use
of MSI. While this is a convenient workaround for the driver author,
it is not good practice, and should not be emulated.

-5.4. Finding why MSIs are disabled on a device
+Finding why MSIs are disabled on a device
+-----------------------------------------

From the above three sections, you can see that there are many reasons
why MSIs may not be enabled for a given device. Your first step should
@@ -260,8 +275,8 @@ be to examine your dmesg carefully to determine whether MSIs are enabled
for your machine. You should also check your .config to be sure you
have enabled CONFIG_PCI_MSI.

-Then, 'lspci -t' gives the list of bridges above a device. Reading
-/sys/bus/pci/devices/*/msi_bus will tell you whether MSIs are enabled (1)
+Then, 'lspci -t' gives the list of bridges above a device. Reading
+`/sys/bus/pci/devices/*/msi_bus` will tell you whether MSIs are enabled (1)
or disabled (0). If 0 is found in any of the msi_bus files belonging
to bridges between the PCI root and the device, MSIs are disabled.

diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index e1c19962a7f8..1b25bcc1edca 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -11,3 +11,4 @@ Linux PCI Bus Subsystem
pci
PCIEBUS-HOWTO
pci-iov-howto
+ MSI-HOWTO
--
2.20.1
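
The last section of the converted MSI-HOWTO states that MSIs are disabled for a device if any msi_bus file belonging to a bridge between the PCI root and the device reads 0. A minimal sketch of that rule follows. It assumes the caller has already read each bridge's msi_bus value into an array; the function name and the array representation are hypothetical, not a kernel or pciutils API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * msi_bus[] holds the value read from each bridge's msi_bus file,
 * ordered from the PCI root down to the bridge directly above the device.
 * Returns 1 if no bridge on the path disables MSIs, 0 otherwise.
 */
static int msi_path_enabled(const int *msi_bus, size_t n_bridges)
{
	for (size_t i = 0; i < n_bridges; i++) {
		if (msi_bus[i] == 0)
			return 0; /* one disabled bridge is enough */
	}
	return 1;
}
```

A result of 1 only means that no bridge blocks MSIs. The global and per-device cases described in the document still have to be checked separately.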

2019-04-23 16:37:35

by Changbin Du

Subject: [PATCH v4 37/63] Documentation: add Linux x86 docs to Sphinx TOC tree

Add an index.rst for x86 support. More docs will be added later.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/index.rst | 1 +
Documentation/x86/index.rst | 9 +++++++++
2 files changed, 10 insertions(+)
create mode 100644 Documentation/x86/index.rst

diff --git a/Documentation/index.rst b/Documentation/index.rst
index d80138284e0f..f185c8040fa9 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -112,6 +112,7 @@ implementation.
.. toctree::
:maxdepth: 2

+ x86/index
sh/index

Filesystem Documentation
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
new file mode 100644
index 000000000000..7612d3142b2a
--- /dev/null
+++ b/Documentation/x86/index.rst
@@ -0,0 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Linux x86 Support
+=================
+
+.. toctree::
+ :maxdepth: 2
+ :numbered:
--
2.20.1

2019-04-23 16:37:51

by Changbin Du

Subject: [PATCH v4 38/63] Documentation: x86: convert boot.txt to reST

Convert the plain text documentation to reStructuredText format and
add it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/boot.rst | 1205 +++++++++++++++++++++++++++++++++++
Documentation/x86/boot.txt | 1130 --------------------------------
Documentation/x86/index.rst | 2 +
3 files changed, 1207 insertions(+), 1130 deletions(-)
create mode 100644 Documentation/x86/boot.rst
delete mode 100644 Documentation/x86/boot.txt
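
The converted document in the diff below gives a few arithmetic rules for the real-mode header: the "HdrS" magic at offset 0x202, the (major << 8)+minor version encoding, the setup_sects fallback, and the bounds check on kernel_version. A minimal C sketch of those rules follows. The helper names are hypothetical, and applying the "0 means 4" fallback inside the kernel_version check is an assumption, not something the text states.

```c
#include <assert.h>
#include <stdint.h>

#define HDRS_MAGIC 0x53726448u /* "HdrS" read as a little-endian 32-bit value */

/* Read a little-endian 32-bit value, e.g. the "header" field at offset 0x202. */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Encode a protocol version as (major << 8) + minor, e.g. 2.04 -> 0x0204. */
static uint16_t boot_proto_version(unsigned major, unsigned minor)
{
	return (uint16_t)((major << 8) + minor);
}

/* For backwards compatibility, a setup_sects value of 0 means 4. */
static unsigned effective_setup_sects(uint8_t setup_sects)
{
	return setup_sects ? setup_sects : 4;
}

/*
 * kernel_version is a pointer less 0x200. It is usable only when nonzero
 * and below 0x200 * setup_sects (fallback applied here by assumption).
 */
static int kernel_version_valid(uint16_t kernel_version, uint8_t setup_sects)
{
	return kernel_version != 0 &&
	       kernel_version < 0x200u * effective_setup_sects(setup_sects);
}

/* File offset of the version string: the stored value plus 0x200. */
static uint32_t kernel_version_offset(uint16_t kernel_version)
{
	return (uint32_t)kernel_version + 0x200u;
}
```

With the document's own example, a kernel_version of 0x1c00 is accepted for setup_sects = 15, rejected for setup_sects = 14, and places the string at file offset 0x1e00.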

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
new file mode 100644
index 000000000000..9f55e832bc47
--- /dev/null
+++ b/Documentation/x86/boot.rst
@@ -0,0 +1,1205 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+The Linux/x86 Boot Protocol
+===========================
+
+On the x86 platform, the Linux kernel uses a rather complicated boot
+convention. This has evolved partially due to historical aspects, as
+well as the desire in the early days to have the kernel itself be a
+bootable image, the complicated PC memory model and due to changed
+expectations in the PC industry caused by the effective demise of
+real-mode DOS as a mainstream operating system.
+
+Currently, the following versions of the Linux/x86 boot protocol exist.
+
+Old kernels:
+ zImage/Image support only. Some very early kernels
+ may not even support a command line.
+
+Protocol 2.00:
+ (Kernel 1.3.73) Added bzImage and initrd support, as
+ well as a formalized way to communicate between the
+ boot loader and the kernel. setup.S made relocatable,
+ although the traditional setup area still assumed writable.
+
+Protocol 2.01:
+ (Kernel 1.3.76) Added a heap overrun warning.
+
+Protocol 2.02:
+ (Kernel 2.4.0-test3-pre3) New command line protocol.
+ Lower the conventional memory ceiling. No overwrite
+ of the traditional setup area, thus making booting
+ safe for systems which use the EBDA from SMM or 32-bit
+ BIOS entry points. zImage deprecated but still supported.
+
+Protocol 2.03:
+ (Kernel 2.4.18-pre1) Explicitly makes the highest possible
+ initrd address available to the bootloader.
+
+Protocol 2.04:
+ (Kernel 2.6.14) Extend the syssize field to four bytes.
+
+Protocol 2.05:
+ (Kernel 2.6.20) Make protected mode kernel relocatable.
+ Introduce relocatable_kernel and kernel_alignment fields.
+
+Protocol 2.06:
+ (Kernel 2.6.22) Added a field that contains the size of
+ the boot command line.
+
+Protocol 2.07:
+ (Kernel 2.6.24) Added paravirtualised boot protocol.
+ Introduced hardware_subarch and hardware_subarch_data
+ and KEEP_SEGMENTS flag in load_flags.
+
+Protocol 2.08:
+ (Kernel 2.6.26) Added crc32 checksum and ELF format
+ payload. Introduced payload_offset and payload_length
+ fields to aid in locating the payload.
+
+Protocol 2.09:
+ (Kernel 2.6.26) Added a field of 64-bit physical
+ pointer to single linked list of struct setup_data.
+
+Protocol 2.10:
+ (Kernel 2.6.31) Added a protocol for relaxed alignment
+ beyond the kernel_alignment field, plus new init_size and
+ pref_address fields. Added extended boot loader IDs.
+
+Protocol 2.11:
+ (Kernel 3.6) Added a field for offset of EFI handover
+ protocol entry point.
+
+Protocol 2.12:
+ (Kernel 3.8) Added the xloadflags field and extension fields
+ to struct boot_params for loading bzImage and ramdisk
+ above 4G in 64bit.
+
+MEMORY LAYOUT
+=============
+
+The traditional memory map for the kernel loader, used for Image or
+zImage kernels, typically looks like::
+
+ | |
+ 0A0000 +------------------------+
+ | Reserved for BIOS | Do not use. Reserved for BIOS EBDA.
+ 09A000 +------------------------+
+ | Command line |
+ | Stack/heap | For use by the kernel real-mode code.
+ 098000 +------------------------+
+ | Kernel setup | The kernel real-mode code.
+ 090200 +------------------------+
+ | Kernel boot sector | The kernel legacy boot sector.
+ 090000 +------------------------+
+ | Protected-mode kernel | The bulk of the kernel image.
+ 010000 +------------------------+
+ | Boot loader | <- Boot sector entry point 0000:7C00
+ 001000 +------------------------+
+ | Reserved for MBR/BIOS |
+ 000800 +------------------------+
+ | Typically used by MBR |
+ 000600 +------------------------+
+ | BIOS use only |
+ 000000 +------------------------+
+
+
+When using bzImage, the protected-mode kernel was relocated to
+0x100000 ("high memory"), and the kernel real-mode block (boot sector,
+setup, and stack/heap) was made relocatable to any address between
+0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
+2.01 the 0x90000+ memory range is still used internally by the kernel;
+the 2.02 protocol resolves that problem.
+
+It is desirable to keep the "memory ceiling" -- the highest point in
+low memory touched by the boot loader -- as low as possible, since
+some newer BIOSes have begun to allocate some rather large amounts of
+memory, called the Extended BIOS Data Area, near the top of low
+memory. The boot loader should use the "INT 12h" BIOS call to verify
+how much low memory is available.
+
+Unfortunately, if INT 12h reports that the amount of memory is too
+low, there is usually nothing the boot loader can do but to report an
+error to the user. The boot loader should therefore be designed to
+take up as little space in low memory as it reasonably can. For
+zImage or old bzImage kernels, which need data written into the
+0x90000 segment, the boot loader should make sure not to use memory
+above the 0x9A000 point; too many BIOSes will break above that point.
+
+For a modern bzImage kernel with boot protocol version >= 2.02, a
+memory layout like the following is suggested::
+
+ ~ ~
+ | Protected-mode kernel |
+ 100000 +------------------------+
+ | I/O memory hole |
+ 0A0000 +------------------------+
+ | Reserved for BIOS | Leave as much as possible unused
+ ~ ~
+ | Command line | (Can also be below the X+10000 mark)
+ X+10000 +------------------------+
+ | Stack/heap | For use by the kernel real-mode code.
+ X+08000 +------------------------+
+ | Kernel setup | The kernel real-mode code.
+ | Kernel boot sector | The kernel legacy boot sector.
+ X +------------------------+
+ | Boot loader | <- Boot sector entry point 0000:7C00
+ 001000 +------------------------+
+ | Reserved for MBR/BIOS |
+ 000800 +------------------------+
+ | Typically used by MBR |
+ 000600 +------------------------+
+ | BIOS use only |
+ 000000 +------------------------+
+
+... where the address X is as low as the design of the boot loader
+permits.
+
+
+THE REAL-MODE KERNEL HEADER
+===========================
+
+In the following text, and anywhere in the kernel boot sequence, "a
+sector" refers to 512 bytes. It is independent of the actual sector
+size of the underlying medium.
+
+The first step in loading a Linux kernel should be to load the
+real-mode code (boot sector and setup code) and then examine the
+following header at offset 0x01f1. The real-mode code can total up to
+32K, although the boot loader may choose to load only the first two
+sectors (1K) and then examine the bootup sector size.
+
+The header looks like::
+
+ Offset Proto Name Meaning
+ /Size
+
+ 01F1/1 ALL(1 setup_sects The size of the setup in sectors
+ 01F2/2 ALL root_flags If set, the root is mounted readonly
+ 01F4/4 2.04+(2 syssize The size of the 32-bit code in 16-byte paras
+ 01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only
+ 01FA/2 ALL vid_mode Video mode control
+ 01FC/2 ALL root_dev Default root device number
+ 01FE/2 ALL boot_flag 0xAA55 magic number
+ 0200/2 2.00+ jump Jump instruction
+ 0202/4 2.00+ header Magic signature "HdrS"
+ 0206/2 2.00+ version Boot protocol version supported
+ 0208/4 2.00+ realmode_swtch Boot loader hook (see below)
+ 020C/2 2.00+ start_sys_seg The load-low segment (0x1000) (obsolete)
+ 020E/2 2.00+ kernel_version Pointer to kernel version string
+ 0210/1 2.00+ type_of_loader Boot loader identifier
+ 0211/1 2.00+ loadflags Boot protocol option flags
+ 0212/2 2.00+ setup_move_size Move to high memory size (used with hooks)
+ 0214/4 2.00+ code32_start Boot loader hook (see below)
+ 0218/4 2.00+ ramdisk_image initrd load address (set by boot loader)
+ 021C/4 2.00+ ramdisk_size initrd size (set by boot loader)
+ 0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only
+ 0224/2 2.01+ heap_end_ptr Free memory after setup end
+ 0226/1 2.02+(3 ext_loader_ver Extended boot loader version
+ 0227/1 2.02+(3 ext_loader_type Extended boot loader ID
+ 0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line
+ 022C/4 2.03+ initrd_addr_max Highest legal initrd address
+ 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
+ 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
+ 0235/1 2.10+ min_alignment Minimum alignment, as a power of two
+ 0236/2 2.12+ xloadflags Boot protocol option flags
+ 0238/4 2.06+ cmdline_size Maximum size of the kernel command line
+ 023C/4 2.07+ hardware_subarch Hardware subarchitecture
+ 0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
+ 0248/4 2.08+ payload_offset Offset of kernel payload
+ 024C/4 2.08+ payload_length Length of kernel payload
+ 0250/8 2.09+ setup_data 64-bit physical pointer to linked list
+ of struct setup_data
+ 0258/8 2.10+ pref_address Preferred loading address
+ 0260/4 2.10+ init_size Linear memory required during initialization
+ 0264/4 2.11+ handover_offset Offset of handover entry point
+
+(1) For backwards compatibility, if the setup_sects field contains 0, the
+ real value is 4.
+
+(2) For boot protocol prior to 2.04, the upper two bytes of the syssize
+ field are unusable, which means the size of a bzImage kernel
+ cannot be determined.
+
+(3) Ignored, but safe to set, for boot protocols 2.02-2.09.
+
+If the "HdrS" (0x53726448) magic number is not found at offset 0x202,
+the boot protocol version is "old". Loading an old kernel, the
+following parameters should be assumed::
+
+ Image type = zImage
+ initrd not supported
+ Real-mode kernel must be located at 0x90000.
+
+Otherwise, the "version" field contains the protocol version,
+e.g. protocol version 2.01 will contain 0x0201 in this field. When
+setting fields in the header, you must make sure only to set fields
+supported by the protocol version in use.
+
+
+DETAILS OF HEADER FIELDS
+========================
+
+Some fields carry information from the kernel to the bootloader
+("read"), some are expected to be filled out by the bootloader
+("write"), and some are expected to be read and modified by the
+bootloader ("modify").
+
+All general purpose boot loaders should write the fields marked
+(obligatory). Boot loaders that want to load the kernel at a
+nonstandard address should fill in the fields marked (reloc); other
+boot loaders can ignore those fields.
+
+The byte order of all fields is little-endian (this is x86, after all).
+::
+
+ Field name: setup_sects
+ Type: read
+ Offset/size: 0x1f1/1
+ Protocol: ALL
+
+The size of the setup code in 512-byte sectors. If this field is
+0, the real value is 4. The real-mode code consists of the boot
+sector (always one 512-byte sector) plus the setup code.
+::
+
+ Field name: root_flags
+ Type: modify (optional)
+ Offset/size: 0x1f2/2
+ Protocol: ALL
+
+If this field is nonzero, the root defaults to readonly. The use of
+this field is deprecated; use the "ro" or "rw" options on the
+command line instead.
+::
+
+ Field name: syssize
+ Type: read
+ Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
+ Protocol: 2.04+
+
+The size of the protected-mode code in units of 16-byte paragraphs.
+For protocol versions older than 2.04 this field is only two bytes
+wide, and therefore cannot be trusted for the size of a kernel if
+the LOAD_HIGH flag is set.
+::
+
+ Field name: ram_size
+ Type: kernel internal
+ Offset/size: 0x1f8/2
+ Protocol: ALL
+
+This field is obsolete.
+::
+
+ Field name: vid_mode
+ Type: modify (obligatory)
+ Offset/size: 0x1fa/2
+
+Please see the section on SPECIAL COMMAND LINE OPTIONS.
+::
+
+ Field name: root_dev
+ Type: modify (optional)
+ Offset/size: 0x1fc/2
+ Protocol: ALL
+
+The default root device number. The use of this field is
+deprecated; use the "root=" option on the command line instead.
+::
+
+ Field name: boot_flag
+ Type: read
+ Offset/size: 0x1fe/2
+ Protocol: ALL
+
+Contains 0xAA55. This is the closest thing old Linux kernels have
+to a magic number.
+::
+
+ Field name: jump
+ Type: read
+ Offset/size: 0x200/2
+ Protocol: 2.00+
+
+Contains an x86 jump instruction, 0xEB followed by a signed offset
+relative to byte 0x202. This can be used to determine the size of
+the header.
+::
+
+ Field name: header
+ Type: read
+ Offset/size: 0x202/4
+ Protocol: 2.00+
+
+Contains the magic number "HdrS" (0x53726448).
+::
+
+ Field name: version
+ Type: read
+ Offset/size: 0x206/2
+ Protocol: 2.00+
+
+Contains the boot protocol version, in (major << 8)+minor format,
+e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
+10.17.
+::
+
+ Field name: realmode_swtch
+ Type: modify (optional)
+ Offset/size: 0x208/4
+ Protocol: 2.00+
+
+Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
+::
+
+ Field name: start_sys_seg
+ Type: read
+ Offset/size: 0x20c/2
+ Protocol: 2.00+
+
+The load low segment (0x1000). Obsolete.
+::
+
+ Field name: kernel_version
+ Type: read
+ Offset/size: 0x20e/2
+ Protocol: 2.00+
+
+If set to a nonzero value, contains a pointer to a NUL-terminated
+human-readable kernel version number string, less 0x200. This can
+be used to display the kernel version to the user. This value
+should be less than (0x200*setup_sects).
+
+For example, if this value is set to 0x1c00, the kernel version
+number string can be found at offset 0x1e00 in the kernel file.
+This is a valid value if and only if the "setup_sects" field
+contains the value 15 or higher, as::
+
+ 0x1c00 < 15*0x200 (= 0x1e00) but
+ 0x1c00 >= 14*0x200 (= 0x1c00)
+
+ 0x1c00 >> 9 = 14, so the minimum value for setup_sects is 15.
+
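The bounds check above can be sketched in C. This is a minimal illustration, not code from the kernel tree: the function name is hypothetical, and treating a setup_sects value of 0 as 4 is carried over from the setup_sects field description as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the file offset of the kernel version
 * string, or 0 if the header does not provide a usable one.
 */
static size_t kernel_version_offset(uint8_t setup_sects, uint16_t kernel_version)
{
    /* Assumption: setup_sects == 0 means 4, as for the setup_sects field. */
    unsigned sects = setup_sects ? setup_sects : 4u;

    if (kernel_version == 0)
        return 0;                       /* no version string present */
    if (kernel_version >= 0x200u * sects)
        return 0;                       /* pointer lies outside the setup code */

    size_t off = (size_t)kernel_version + 0x200u;
    /* The string must start inside boot sector + setup code. */
    assert(off < 0x200u * (sects + 1u));
    return off;
}
```

With setup_sects = 15 and kernel_version = 0x1c00 this yields 0x1e00, matching the example; with setup_sects = 14 the value is rejected.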
+::
+
+ Field name: type_of_loader
+ Type: write (obligatory)
+ Offset/size: 0x210/1
+ Protocol: 2.00+
+
+If your boot loader has an assigned id (see table below), enter
+0xTV here, where T is an identifier for the boot loader and V is
+a version number. Otherwise, enter 0xFF here.
+
+For boot loader IDs above T = 0xD, write T = 0xE to this field and
+write the extended ID minus 0x10 to the ext_loader_type field.
+Similarly, the ext_loader_ver field can be used to provide more than
+four bits for the bootloader version.
+
+For example, for T = 0x15, V = 0x234, write::
+
+ type_of_loader <- 0xE4
+ ext_loader_type <- 0x05
+ ext_loader_ver <- 0x23
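The split between type_of_loader, ext_loader_type and ext_loader_ver can be sketched as below. The struct and function names are hypothetical and only serve the illustration; the arithmetic follows the rules stated above and in the ext_loader_ver and ext_loader_type field descriptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three header bytes involved. */
struct loader_id {
    uint8_t type_of_loader;
    uint8_t ext_loader_type;
    uint8_t ext_loader_ver;
};

/* Encode a boot loader ID (T) and version (V) into the header bytes. */
static struct loader_id encode_loader_id(unsigned type, unsigned version)
{
    struct loader_id id = { 0, 0, 0 };

    /* 0xE and 0xF are not real IDs; extended IDs start at 0x10. */
    assert(type <= 0xD || (type >= 0x10 && type <= 0x10F));
    assert(version <= 0xFFF);

    if (type > 0xD) {
        id.type_of_loader = 0xE0;
        id.ext_loader_type = (uint8_t)(type - 0x10);
    } else {
        id.type_of_loader = (uint8_t)(type << 4);
    }
    id.type_of_loader |= (uint8_t)(version & 0x0F);
    id.ext_loader_ver = (uint8_t)(version >> 4);
    return id;
}

/* Recover the full loader ID from the header bytes. */
static unsigned decode_loader_type(const struct loader_id *id)
{
    unsigned t = id->type_of_loader >> 4;
    return t == 0xE ? id->ext_loader_type + 0x10u : t;
}

/* Recover the full version number from the header bytes. */
static unsigned decode_loader_version(const struct loader_id *id)
{
    return (id->type_of_loader & 0x0Fu) + ((unsigned)id->ext_loader_ver << 4);
}
```

Encoding T = 0x15, V = 0x234 with this sketch gives the three byte values listed in the example above.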
+
+Assigned boot loader ids (hexadecimal)::
+
+ 0 LILO (0x00 reserved for pre-2.00 bootloader)
+ 1 Loadlin
+ 2 bootsect-loader (0x20, all other values reserved)
+ 3 Syslinux
+ 4 Etherboot/gPXE/iPXE
+ 5 ELILO
+ 7 GRUB
+ 8 U-Boot
+ 9 Xen
+ A Gujin
+ B Qemu
+ C Arcturus Networks uCbootloader
+ D kexec-tools
+ E Extended (see ext_loader_type)
+ F Special (0xFF = undefined)
+ 10 Reserved
+ 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de>
+ 12 OVMF UEFI virtualization stack
+
+Please contact <[email protected]> if you need a bootloader ID value assigned.
+::
+
+ Field name: loadflags
+ Type: modify (obligatory)
+ Offset/size: 0x211/1
+ Protocol: 2.00+
+
+This field is a bitmask.
+::
+
+ Bit 0 (read): LOADED_HIGH
+ - If 0, the protected-mode code is loaded at 0x10000.
+ - If 1, the protected-mode code is loaded at 0x100000.
+
+ Bit 1 (kernel internal): KASLR_FLAG
+ - Used internally by the compressed kernel to communicate
+ KASLR status to kernel proper.
+ If 1, KASLR enabled.
+ If 0, KASLR disabled.
+
+ Bit 5 (write): QUIET_FLAG
+ - If 0, print early messages.
+ - If 1, suppress early messages.
+ This requests the kernel (decompressor and early
+ kernel) not to write early messages that require
+ accessing the display hardware directly.
+
+ Bit 6 (write): KEEP_SEGMENTS
+ Protocol: 2.07+
+ - If 0, reload the segment registers in the 32bit entry point.
+ - If 1, do not reload the segment registers in the 32bit entry point.
+ Assume that %cs %ds %ss %es are all set to flat segments with
+ a base of 0 (or the equivalent for their environment).
+
+ Bit 7 (write): CAN_USE_HEAP
+ Set this bit to 1 to indicate that the value entered in the
+ heap_end_ptr is valid. If this field is clear, some setup code
+ functionality will be disabled.
+
+::
+
+ Field name: setup_move_size
+ Type: modify (obligatory)
+ Offset/size: 0x212/2
+ Protocol: 2.00-2.01
+
+When using protocol 2.00 or 2.01, if the real mode kernel is not
+loaded at 0x90000, it gets moved there later in the loading
+sequence. Fill in this field if you want additional data (such as
+the kernel command line) moved in addition to the real-mode kernel
+itself.
+
+The unit is bytes starting with the beginning of the boot sector.
+
+This field can be ignored when the protocol is 2.02 or higher, or
+if the real-mode code is loaded at 0x90000.
+::
+
+ Field name: code32_start
+ Type: modify (optional, reloc)
+ Offset/size: 0x214/4
+ Protocol: 2.00+
+
+The address to jump to in protected mode. This defaults to the load
+address of the kernel, and can be used by the boot loader to
+determine the proper load address.
+
+This field can be modified for two purposes:
+
+ 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
+
+ 2. if a bootloader which does not install a hook loads a
+ relocatable kernel at a nonstandard address it will have to modify
+ this field to point to the load address.
+
+::
+
+ Field name: ramdisk_image
+ Type: write (obligatory)
+ Offset/size: 0x218/4
+ Protocol: 2.00+
+
+The 32-bit linear address of the initial ramdisk or ramfs. Leave at
+zero if there is no initial ramdisk/ramfs.
+::
+
+ Field name: ramdisk_size
+ Type: write (obligatory)
+ Offset/size: 0x21c/4
+ Protocol: 2.00+
+
+Size of the initial ramdisk or ramfs. Leave at zero if there is no
+initial ramdisk/ramfs.
+::
+
+ Field name: bootsect_kludge
+ Type: kernel internal
+ Offset/size: 0x220/4
+ Protocol: 2.00+
+
+This field is obsolete.
+::
+
+ Field name: heap_end_ptr
+ Type: write (obligatory)
+ Offset/size: 0x224/2
+ Protocol: 2.01+
+
+Set this field to the offset (from the beginning of the real-mode
+code) of the end of the setup stack/heap, minus 0x0200.
+::
+
+ Field name: ext_loader_ver
+ Type: write (optional)
+ Offset/size: 0x226/1
+ Protocol: 2.02+
+
+This field is used as an extension of the version number in the
+type_of_loader field. The total version number is considered to be
+(type_of_loader & 0x0f) + (ext_loader_ver << 4).
+
+The use of this field is boot loader specific. If not written, it
+is zero.
+
+Kernels prior to 2.6.31 did not recognize this field, but it is safe
+to write for protocol version 2.02 or higher.
+::
+
+ Field name: ext_loader_type
+ Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0)
+ Offset/size: 0x227/1
+ Protocol: 2.02+
+
+This field is used as an extension of the type number in
+type_of_loader field. If the type in type_of_loader is 0xE, then
+the actual type is (ext_loader_type + 0x10).
+
+This field is ignored if the type in type_of_loader is not 0xE.
+
+Kernels prior to 2.6.31 did not recognize this field, but it is safe
+to write for protocol version 2.02 or higher.
+::
+
+ Field name: cmd_line_ptr
+ Type: write (obligatory)
+ Offset/size: 0x228/4
+ Protocol: 2.02+
+
+Set this field to the linear address of the kernel command line.
+The kernel command line can be located anywhere between the end of
+the setup heap and 0xA0000; it does not have to be located in the
+same 64K segment as the real-mode code itself.
+
+Fill in this field even if your boot loader does not support a
+command line, in which case you can point this to an empty string
+(or better yet, to the string "auto".) If this field is left at
+zero, the kernel will assume that your boot loader does not support
+the 2.02+ protocol.
+::
+
+ Field name: initrd_addr_max
+ Type: read
+ Offset/size: 0x22c/4
+ Protocol: 2.03+
+
+The maximum address that may be occupied by the initial
+ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
+field is not present, and the maximum address is 0x37FFFFFF. (This
+address is defined as the address of the highest safe byte, so if
+your ramdisk is exactly 131072 bytes long and this field is
+0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
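
Because initrd_addr_max names the highest safe byte rather than a one-past-the-end limit, the highest permissible start address is off by one from a plain subtraction. A minimal sketch, with a hypothetical function name, is shown here; real boot loaders may additionally round the result down to a page boundary, which this sketch does not do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: highest start address at which a ramdisk of
 * `size` bytes still ends at or below initrd_addr_max (inclusive).
 */
static uint32_t initrd_highest_start(uint32_t initrd_addr_max, uint32_t size)
{
    /* The ramdisk must be non-empty and must fit below the limit. */
    assert(size != 0 && size - 1 <= initrd_addr_max);
    return initrd_addr_max - size + 1;
}
```

For the 131072-byte example above, the sketch returns 0x37FE0000.
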
+::
+
+ Field name: kernel_alignment
+ Type: read/modify (reloc)
+ Offset/size: 0x230/4
+ Protocol: 2.05+ (read), 2.10+ (modify)
+
+Alignment unit required by the kernel (if relocatable_kernel is
+true.) A relocatable kernel that is loaded at an alignment
+incompatible with the value in this field will be realigned during
+kernel initialization.
+
+Starting with protocol version 2.10, this reflects the kernel
+alignment preferred for optimal performance; it is possible for the
+loader to modify this field to permit a lesser alignment. See the
+min_alignment and pref_address fields below.
+::
+
+ Field name: relocatable_kernel
+ Type: read (reloc)
+ Offset/size: 0x234/1
+ Protocol: 2.05+
+
+If this field is nonzero, the protected-mode part of the kernel can
+be loaded at any address that satisfies the kernel_alignment field.
+After loading, the boot loader must set the code32_start field to
+point to the loaded code, or to a boot loader hook.
+::
+
+ Field name: min_alignment
+ Type: read (reloc)
+ Offset/size: 0x235/1
+ Protocol: 2.10+
+
+This field, if nonzero, indicates as a power of two the minimum
+alignment required, as opposed to preferred, by the kernel to boot.
+If a boot loader makes use of this field, it should update the
+kernel_alignment field with the alignment unit desired; typically::
+
+ kernel_alignment = 1 << min_alignment
+
+There may be a considerable performance cost with an excessively
+misaligned kernel. Therefore, a loader should typically try each
+power-of-two alignment from kernel_alignment down to this alignment.
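
The search from kernel_alignment down to the minimum alignment might look like the following sketch. The function name, the notion of a single free region [region_start, region_end) and the size parameter are assumptions made for illustration; the text above only prescribes the order in which alignments are tried.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: return the largest power-of-two alignment, from
 * kernel_alignment down to 1 << min_alignment, at which `size` bytes
 * fit into the free region, or 0 if no alignment works. The loader
 * would then store the result in the kernel_alignment field.
 */
static uint32_t pick_alignment(uint32_t kernel_alignment, uint8_t min_alignment,
                               uint64_t region_start, uint64_t region_end,
                               uint64_t size)
{
    assert(kernel_alignment != 0 &&
           (kernel_alignment & (kernel_alignment - 1)) == 0);
    assert(min_alignment < 32);

    /* min_alignment == 0 means the field is unused: try only the default. */
    uint32_t lowest = min_alignment ? (UINT32_C(1) << min_alignment)
                                    : kernel_alignment;

    for (uint32_t align = kernel_alignment; align >= lowest; align >>= 1) {
        uint64_t base = align_up(region_start, align);
        if (base <= region_end && size <= region_end - base)
            return align;
    }
    return 0;
}
```

Trying the preferred alignment first keeps the performance cost mentioned above as small as the available memory allows.
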
+::
+
+ Field name: xloadflags
+ Type: read
+ Offset/size: 0x236/2
+ Protocol: 2.12+
+
+This field is a bitmask.
+::
+
+ Bit 0 (read): XLF_KERNEL_64
+ - If 1, this kernel has the legacy 64-bit entry point at 0x200.
+
+ Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G
+ - If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.
+
+ Bit 2 (read): XLF_EFI_HANDOVER_32
+ - If 1, the kernel supports the 32-bit EFI handoff entry point
+ given at handover_offset.
+
+ Bit 3 (read): XLF_EFI_HANDOVER_64
+ - If 1, the kernel supports the 64-bit EFI handoff entry point
+ given at handover_offset + 0x200.
+
+ Bit 4 (read): XLF_EFI_KEXEC
+ - If 1, the kernel supports kexec EFI boot with EFI runtime support.
+
+::
+
+ Field name: cmdline_size
+ Type: read
+ Offset/size: 0x238/4
+ Protocol: 2.06+
+
+The maximum size of the command line without the terminating
+zero. This means that the command line can contain at most
+cmdline_size characters. With protocol version 2.05 and earlier, the
+maximum size was 255.
+::
+
+ Field name: hardware_subarch
+ Type: write (optional, defaults to x86/PC)
+ Offset/size: 0x23c/4
+ Protocol: 2.07+
+
+In a paravirtualized environment, low-level architectural operations
+such as interrupt handling, page table handling, and
+accessing process control registers need to be done differently.
+
+This field allows the bootloader to inform the kernel that it is
+running in one of those environments.
+::
+
+ 0x00000000 The default x86/PC environment
+ 0x00000001 lguest
+ 0x00000002 Xen
+ 0x00000003 Moorestown MID
+ 0x00000004 CE4100 TV Platform
+
+::
+
+ Field name: hardware_subarch_data
+ Type: write (subarch-dependent)
+ Offset/size: 0x240/8
+ Protocol: 2.07+
+
+A pointer to data that is specific to the hardware subarch.
+This field is currently unused for the default x86/PC environment;
+do not modify it.
+::
+
+ Field name: payload_offset
+ Type: read
+ Offset/size: 0x248/4
+ Protocol: 2.08+
+
+If non-zero then this field contains the offset from the beginning
+of the protected-mode code to the payload.
+
+The payload may be compressed. The format of both the compressed and
+uncompressed data should be determined using the standard magic
+numbers. The currently supported compression formats are gzip
+(magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
+(magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
+02 21). The uncompressed payload is currently always ELF (magic
+number 7F 45 4C 46).
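
Identifying the payload format from the magic numbers listed above can be sketched as follows. The enum and function names are hypothetical; the byte values are the ones given in the paragraph above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of the payload formats named above. */
enum payload_format {
    PAYLOAD_UNKNOWN,
    PAYLOAD_GZIP,
    PAYLOAD_BZIP2,
    PAYLOAD_LZMA,
    PAYLOAD_XZ,
    PAYLOAD_LZ4,
    PAYLOAD_ELF
};

/* Classify a payload by its leading magic bytes. */
static enum payload_format identify_payload(const uint8_t *p, size_t len)
{
    assert(p != NULL || len == 0);

    /* Uncompressed payload: ELF magic 7F 45 4C 46. */
    if (len >= 4 && memcmp(p, "\x7f" "ELF", 4) == 0)
        return PAYLOAD_ELF;
    if (len < 2)
        return PAYLOAD_UNKNOWN;

    if (p[0] == 0x1f && (p[1] == 0x8b || p[1] == 0x9e))
        return PAYLOAD_GZIP;
    if (p[0] == 0x42 && p[1] == 0x5a)
        return PAYLOAD_BZIP2;
    if (p[0] == 0x5d && p[1] == 0x00)
        return PAYLOAD_LZMA;
    if (p[0] == 0xfd && p[1] == 0x37)
        return PAYLOAD_XZ;
    if (p[0] == 0x02 && p[1] == 0x21)
        return PAYLOAD_LZ4;
    return PAYLOAD_UNKNOWN;
}
```

A caller would pass a pointer to the protected-mode code plus payload_offset, with payload_length as the length.
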
+::
+
+ Field name: payload_length
+ Type: read
+ Offset/size: 0x24c/4
+ Protocol: 2.08+
+
+The length of the payload.
+::
+
+ Field name: setup_data
+ Type: write (special)
+ Offset/size: 0x250/8
+ Protocol: 2.09+
+
+The 64-bit physical pointer to a NULL-terminated singly linked list of
+struct setup_data. This is used to define a more extensible boot
+parameters passing mechanism. The definition of struct setup_data is
+as follows::
+
+ struct setup_data {
+ u64 next;
+ u32 type;
+ u32 len;
+ u8 data[0];
+ };
+
+In this structure, next is a 64-bit physical pointer to the next node of
+the linked list (the next field of the last node is 0); type identifies
+the contents of data; len is the length of the data field; and data
+holds the real payload.
+
+This list may be modified at a number of points during the bootup
+process. Therefore, when modifying this list one should always make
+sure to consider the case where the linked list already contains
+entries.
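
Appending a node without discarding entries that are already present can be sketched as below. The function name is hypothetical, the struct uses a C99 flexible array member in place of data[0], and the sketch assumes that physical addresses can be dereferenced directly (an identity-mapped or unpaged environment), which will not hold everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct setup_data {
    uint64_t next;   /* physical address of the next node, 0 at the end */
    uint32_t type;   /* identifies the contents of data */
    uint32_t len;    /* length of data in bytes */
    uint8_t data[];  /* payload */
};

/*
 * Hypothetical helper: append `node` to the list whose head pointer
 * (the setup_data header field) is stored at `head`.
 * Assumption: physical addresses equal usable pointers here.
 */
static void setup_data_append(uint64_t *head, struct setup_data *node)
{
    assert(head != NULL && node != NULL);

    node->next = 0;

    /* Walk to the link that currently terminates the list. */
    uint64_t *link = head;
    while (*link != 0) {
        struct setup_data *cur = (struct setup_data *)(uintptr_t)*link;
        link = &cur->next;
    }
    *link = (uint64_t)(uintptr_t)node;
}
```

Walking to the tail instead of overwriting the head is what keeps entries added by an earlier boot stage intact.
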
+::
+
+ Field name: pref_address
+ Type: read (reloc)
+ Offset/size: 0x258/8
+ Protocol: 2.10+
+
+This field, if nonzero, represents a preferred load address for the
+kernel. A relocating bootloader should attempt to load at this
+address if possible.
+
+A non-relocatable kernel will unconditionally move itself to this
+address and run there.
+::
+
+ Field name: init_size
+ Type: read
+ Offset/size: 0x260/4
+
+This field indicates the amount of linear contiguous memory starting
+at the kernel runtime start address that the kernel needs before it
+is capable of examining its memory map. This is not the same thing
+as the total amount of memory the kernel needs to boot, but it can
+be used by a relocating boot loader to help select a safe load
+address for the kernel.
+
+The kernel runtime start address is determined by the following algorithm::
+
+ if (relocatable_kernel)
+ runtime_start = align_up(load_address, kernel_alignment)
+ else
+ runtime_start = pref_address
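
The algorithm above can be written out as compilable C. This is a sketch under the assumption that kernel_alignment is a power of two; the function names are illustrative and not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical helper implementing the runtime start address rule. */
static uint64_t kernel_runtime_start(int relocatable_kernel,
                                     uint64_t load_address,
                                     uint32_t kernel_alignment,
                                     uint64_t pref_address)
{
    if (relocatable_kernel)
        return align_up(load_address, kernel_alignment);
    return pref_address;
}
```

A relocating boot loader could check that init_size bytes starting at the returned address are free before committing to a load address.
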
+
+::
+
+ Field name: handover_offset
+ Type: read
+ Offset/size: 0x264/4
+
+This field is the offset from the beginning of the kernel image to
+the EFI handover protocol entry point. Boot loaders using the EFI
+handover protocol to boot the kernel should jump to this offset.
+
+See EFI HANDOVER PROTOCOL below for more details.
+
+
+THE IMAGE CHECKSUM
+==================
+
+From boot protocol version 2.08 onwards the CRC-32 is calculated over
+the entire file using the characteristic polynomial 0x04C11DB7 and an
+initial remainder of 0xffffffff. The checksum is appended to the
+file; therefore the CRC of the file up to the limit specified in the
+syssize field of the header is always 0.
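
The zero-remainder property can be demonstrated with a short sketch. It assumes the common bit-reflected implementation of this polynomial (0xEDB88320 is 0x04C11DB7 with its bits reversed), no final inversion, and a checksum stored in little-endian byte order; these details are assumptions of the sketch, not statements taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32: reflected polynomial 0xEDB88320, initial remainder
 * 0xffffffff, no final XOR (assumptions of this sketch).
 */
static uint32_t boot_crc32(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    uint32_t crc = 0xffffffffu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Store the checksum of buf[0..len) in the four bytes that follow it. */
static void append_crc32(uint8_t *buf, size_t len)
{
    uint32_t crc = boot_crc32(buf, len);
    for (int i = 0; i < 4; i++)
        buf[len + i] = (uint8_t)(crc >> (8 * i));   /* little-endian */
}
```

Running boot_crc32() over the data plus the four appended bytes returns 0, so a boot loader can validate an image without locating the checksum separately.
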
+
+
+THE KERNEL COMMAND LINE
+=======================
+
+The kernel command line has become an important way for the boot
+loader to communicate with the kernel. Some of its options are also
+relevant to the boot loader itself; see "special command line options"
+below.
+
+The kernel command line is a null-terminated string. The maximum
+length can be retrieved from the field cmdline_size. Before protocol
+version 2.06, the maximum was 255 characters. A string that is too
+long will be automatically truncated by the kernel.
+
+If the boot protocol version is 2.02 or later, the address of the
+kernel command line is given by the header field cmd_line_ptr (see
+above.) This address can be anywhere between the end of the setup
+heap and 0xA0000.
+
+If the protocol version is *not* 2.02 or higher, the kernel
+command line is entered using the following protocol:
+
+ - At offset 0x0020 (word), "cmd_line_magic", enter the magic
+ number 0xA33F.
+
+ - At offset 0x0022 (word), "cmd_line_offset", enter the offset
+ of the kernel command line (relative to the start of the
+ real-mode kernel).
+
+ - The kernel command line *must* be within the memory region
+ covered by setup_move_size, so you may need to adjust this
+ field.
+
+
+MEMORY LAYOUT OF THE REAL-MODE CODE
+===================================
+
+The real-mode code requires a stack/heap to be set up, as well as
+memory allocated for the kernel command line. This needs to be done
+in the real-mode accessible memory in the bottom megabyte.
+
+It should be noted that modern machines often have a sizable Extended
+BIOS Data Area (EBDA). As a result, it is advisable to use as little
+of the low megabyte as possible.
+
+Unfortunately, under the following circumstances the 0x90000 memory
+segment has to be used:
+
+ - When loading a zImage kernel ((loadflags & 0x01) == 0).
+ - When loading a 2.01 or earlier boot protocol kernel.
+
+ For the 2.00 and 2.01 boot protocols, the real-mode code
+ can be loaded at another address, but it is internally
+ relocated to 0x90000. For the "old" protocol, the
+ real-mode code must be loaded at 0x90000.
+
+When loading at 0x90000, avoid using memory above 0x9a000.
+
+For boot protocol 2.02 or higher, the command line does not have to be
+located in the same 64K segment as the real-mode setup code; it is
+thus permitted to give the stack/heap the full 64K segment and locate
+the command line above it.
+
+The kernel command line should not be located below the real-mode
+code, nor should it be located in high memory.
+
+
+SAMPLE BOOT CONFIGURATION
+=========================
+
+As a sample configuration, assume the following layout of the real
+mode segment.
+
+When loading below 0x90000, use the entire segment::
+
+ 0x0000-0x7fff Real mode kernel
+ 0x8000-0xdfff Stack and heap
+ 0xe000-0xffff Kernel command line
+
+When loading at 0x90000 OR the protocol version is 2.01 or earlier::
+
+ 0x0000-0x7fff Real mode kernel
+ 0x8000-0x97ff Stack and heap
+ 0x9800-0x9fff Kernel command line
+
+Such a boot loader should enter the following fields in the header::
+
+ unsigned long base_ptr; /* base address for real-mode segment */
+
+ if ( setup_sects == 0 ) {
+ setup_sects = 4;
+ }
+
+ if ( protocol >= 0x0200 ) {
+ type_of_loader = <type code>;
+ if ( loading_initrd ) {
+ ramdisk_image = <initrd_address>;
+ ramdisk_size = <initrd_size>;
+ }
+
+ if ( protocol >= 0x0202 && loadflags & 0x01 )
+ heap_end = 0xe000;
+ else
+ heap_end = 0x9800;
+
+ if ( protocol >= 0x0201 ) {
+ heap_end_ptr = heap_end - 0x200;
+ loadflags |= 0x80; /* CAN_USE_HEAP */
+ }
+
+ if ( protocol >= 0x0202 ) {
+ cmd_line_ptr = base_ptr + heap_end;
+ strcpy(cmd_line_ptr, cmdline);
+ } else {
+ cmd_line_magic = 0xA33F;
+ cmd_line_offset = heap_end;
+ setup_move_size = heap_end + strlen(cmdline)+1;
+ strcpy(base_ptr+cmd_line_offset, cmdline);
+ }
+ } else {
+ /* Very old kernel */
+
+ heap_end = 0x9800;
+
+ cmd_line_magic = 0xA33F;
+ cmd_line_offset = heap_end;
+
+ /* A very old kernel MUST have its real-mode code
+ loaded at 0x90000 */
+
+ if ( base_ptr != 0x90000 ) {
+ /* Copy the real-mode kernel */
+ memcpy(0x90000, base_ptr, (setup_sects+1)*512);
+ base_ptr = 0x90000; /* Relocated */
+ }
+
+ strcpy(0x90000+cmd_line_offset, cmdline);
+
+ /* It is recommended to clear memory up to the 32K mark */
+ memset(0x90000 + (setup_sects+1)*512, 0,
+ (64-(setup_sects+1))*512);
+ }
+
+
+LOADING THE REST OF THE KERNEL
+==============================
+
+The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512
+in the kernel file (again, if setup_sects == 0 the real value is 4.)
+It should be loaded at address 0x10000 for Image/zImage kernels and
+0x100000 for bzImage kernels.
+
+The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01
+bit (LOADED_HIGH) in the loadflags field is set::
+
+ is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01);
+ load_address = is_bzImage ? 0x100000 : 0x10000;
+
+Note that Image/zImage kernels can be up to 512K in size, and thus use
+the entire 0x10000-0x90000 range of memory. This means it is pretty
+much a requirement for these kernels to load the real-mode part at
+0x90000. bzImage kernels allow much more flexibility.
+
+
+SPECIAL COMMAND LINE OPTIONS
+============================
+
+If the command line provided by the boot loader is entered by the
+user, the user may expect the following command line options to work.
+They should normally not be deleted from the kernel command line even
+though not all of them are actually meaningful to the kernel. Boot
+loader authors who need additional command line options for the boot
+loader itself should get them registered in
+Documentation/admin-guide/kernel-parameters.rst to make sure they will not
+conflict with actual kernel options now or in the future.
+
+ vga=<mode>
+ <mode> here is either an integer (in C notation, either
+ decimal, octal, or hexadecimal) or one of the strings
+ "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask"
+ (meaning 0xFFFD). This value should be entered into the
+ vid_mode field, as it is used by the kernel before the command
+ line is parsed.
+
+ mem=<size>
+ <size> is an integer in C notation optionally followed by
+ (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
+ << 30, << 40, << 50 or << 60). This specifies the end of
+ memory to the kernel. This affects the possible placement of
+ an initrd, since an initrd should be placed near the end of
+ memory. Note that this is an option to *both* the kernel and
+ the bootloader!
+
+ initrd=<file>
+ An initrd should be loaded. The meaning of <file> is
+ obviously bootloader-dependent, and some boot loaders
+ (e.g. LILO) do not have such a command.
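
A boot loader that needs to honour mem=<size> when placing an initrd has to parse the value itself. The sketch below is one way to do that with the standard strtoull() function; the function name and the error convention are hypothetical, and it makes no claim to match the kernel's own parser in every corner case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the <size> part of mem=<size>.
 * Returns 0 and stores the byte count in *out, or -1 on a parse error.
 */
static int parse_mem_size(const char *s, uint64_t *out)
{
    assert(s != NULL && out != NULL);

    char *end;
    errno = 0;
    /* Base 0 accepts decimal, octal (0...) and hexadecimal (0x...). */
    unsigned long long value = strtoull(s, &end, 0);
    if (end == s || errno == ERANGE)
        return -1;

    unsigned shift = 0;
    switch (*end) {
    case 'k': case 'K': shift = 10; break;
    case 'm': case 'M': shift = 20; break;
    case 'g': case 'G': shift = 30; break;
    case 't': case 'T': shift = 40; break;
    case 'p': case 'P': shift = 50; break;
    case 'e': case 'E': shift = 60; break;
    case '\0': break;
    default: return -1;
    }
    if (shift != 0)
        end++;
    if (*end != '\0')
        return -1;                      /* trailing garbage */
    if (value > (UINT64_MAX >> shift))
        return -1;                      /* would overflow 64 bits */

    *out = (uint64_t)value << shift;
    return 0;
}
```

Note that a hexadecimal value followed by E is ambiguous, because E is itself a hexadecimal digit; this sketch lets strtoull() consume it as part of the number.
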
+
+In addition, some boot loaders add the following options to the
+user-specified command line:
+
+ BOOT_IMAGE=<file>
+ The boot image which was loaded. Again, the meaning of <file>
+ is obviously bootloader-dependent.
+
+ auto
+ The kernel was booted without explicit user intervention.
+
+If these options are added by the boot loader, it is highly
+recommended that they are located *first*, before the user-specified
+or configuration-specified command line. Otherwise, "init=/bin/sh"
+gets confused by the "auto" option.
+
+
+RUNNING THE KERNEL
+==================
+
+The kernel is started by jumping to the kernel entry point, which is
+located at *segment* offset 0x20 from the start of the real mode
+kernel. This means that if you loaded your real-mode kernel code at
+0x90000, the kernel entry point is 9020:0000.
+
+At entry, ds = es = ss should point to the start of the real-mode
+kernel code (0x9000 if the code is loaded at 0x90000), sp should be
+set up properly, normally pointing to the top of the heap, and
+interrupts should be disabled. Furthermore, to guard against bugs in
+the kernel, it is recommended that the boot loader sets fs = gs = ds =
+es = ss.
+
+In our example from above, we would do::
+
+ /* Note: in the case of the "old" kernel protocol, base_ptr must
+ be == 0x90000 at this point; see the previous sample code */
+
+ seg = base_ptr >> 4;
+
+ cli(); /* Enter with interrupts disabled! */
+
+ /* Set up the real-mode kernel stack */
+ _SS = seg;
+ _SP = heap_end;
+
+ _DS = _ES = _FS = _GS = seg;
+ jmp_far(seg+0x20, 0); /* Run the kernel */
+
+If your boot sector accesses a floppy drive, it is recommended to
+switch off the floppy motor before running the kernel, since the
+kernel boot leaves interrupts off and thus the motor will not be
+switched off, especially if the loaded kernel has the floppy driver as
+a demand-loaded module!
+
+
+ADVANCED BOOT LOADER HOOKS
+==========================
+
+If the boot loader runs in a particularly hostile environment (such as
+LOADLIN, which runs under DOS) it may be impossible to follow the
+standard memory location requirements. Such a boot loader may use the
+following hooks that, if set, are invoked by the kernel at the
+appropriate time. The use of these hooks should probably be
+considered an absolutely last resort!
+
+IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
+%edi across invocation.
+
+ realmode_swtch:
+ A 16-bit real mode far subroutine invoked immediately before
+ entering protected mode. The default routine disables NMI, so
+ your routine should probably do so, too.
+
+ code32_start:
+ A 32-bit flat-mode routine *jumped* to immediately after the
+ transition to protected mode, but before the kernel is
+ uncompressed. No segments, except CS, are guaranteed to be
+ set up (current kernels do, but older ones do not); you should
+ set them up to BOOT_DS (0x18) yourself.
+
+ After completing your hook, you should jump to the address
+ that was in this field before your boot loader overwrote it
+ (relocated, if appropriate.)
+
+
+32-bit BOOT PROTOCOL
+====================
+
+For machines with firmware other than a legacy BIOS, such as EFI or
+LinuxBIOS, and for kexec, the 16-bit real-mode setup code in the kernel,
+which relies on the legacy BIOS, cannot be used, so a 32-bit boot
+protocol needs to be defined.
+
+In the 32-bit boot protocol, the first step in loading a Linux kernel
+should be to set up the boot parameters (struct boot_params,
+traditionally known as the "zero page"). The memory for struct boot_params
+should be allocated and initialized to all zero. Then the setup header,
+starting at offset 0x01f1 of the kernel image, should be loaded into struct
+boot_params and examined. The end of the setup header can be calculated as
+follows::
+
+ 0x0202 + byte value at offset 0x0201
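
Loading the setup header into a zeroed struct boot_params can be sketched as follows. The function names are hypothetical; the sketch treats the byte at 0x0201 as unsigned and assumes a 4096-byte zero page in which the setup header sits at the same offset, 0x01f1, as in the kernel image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZERO_PAGE_SIZE   4096u   /* assumed size of struct boot_params */
#define SETUP_HDR_OFFSET 0x01f1u /* setup header offset in image and zero page */

/* End of the setup header: 0x0202 + byte value at offset 0x0201. */
static size_t setup_header_end(const uint8_t *image, size_t image_len)
{
    assert(image != NULL && image_len > 0x0201);
    return 0x0202 + (size_t)image[0x0201];
}

/*
 * Hypothetical helper: zero the zero page, then copy the setup header
 * from the kernel image into it. Returns the number of bytes copied.
 */
static size_t copy_setup_header(uint8_t *zero_page,
                                const uint8_t *image, size_t image_len)
{
    size_t end = setup_header_end(image, image_len);

    assert(zero_page != NULL);
    assert(end <= image_len && end <= ZERO_PAGE_SIZE);

    memset(zero_page, 0, ZERO_PAGE_SIZE);
    memcpy(zero_page + SETUP_HDR_OFFSET, image + SETUP_HDR_OFFSET,
           end - SETUP_HDR_OFFSET);
    return end - SETUP_HDR_OFFSET;
}
```

Copying only up to the computed end leaves the fields that follow the header zeroed, as the text above requires.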
+
+In addition to reading, modifying, and writing the setup header in struct
+boot_params as in the 16-bit boot protocol, the boot loader should
+also fill in the additional fields of struct boot_params as
+described in zero-page.txt.
+
+After setting up the struct boot_params, the boot loader can load the
+32/64-bit kernel in the same way as in the 16-bit boot protocol.
+
+In the 32-bit boot protocol, the kernel is started by jumping to the
+32-bit kernel entry point, which is the start address of the loaded
+32/64-bit kernel.
+
+At entry, the CPU must be in 32-bit protected mode with paging
+disabled; a GDT must be loaded with the descriptors for selectors
+__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
+segments; __BOOT_CS must have execute/read permission, and __BOOT_DS
+must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
+must be __BOOT_DS; interrupts must be disabled; %esi must hold the base
+address of the struct boot_params; %ebp, %edi and %ebx must be zero.
+
+64-bit BOOT PROTOCOL
+====================
+
+For machines with 64-bit CPUs and a 64-bit kernel, a 64-bit bootloader
+can be used, which requires a 64-bit boot protocol.
+
+In the 64-bit boot protocol, the first step in loading a Linux kernel
+should be to set up the boot parameters (struct boot_params,
+traditionally known as the "zero page"). The memory for struct boot_params
+can be allocated anywhere (even above 4G) and must be initialized to all
+zero. Then the setup header, starting at offset 0x01f1 of the kernel image,
+should be loaded into struct boot_params and examined. The end of the setup
+header can be calculated as follows::
+
+ 0x0202 + byte value at offset 0x0201
+
+In addition to reading, modifying, and writing the setup header in struct
+boot_params as in the 16-bit boot protocol, the boot loader should
+also fill in the additional fields of struct boot_params as described
+in zero-page.txt.
+
+After setting up the struct boot_params, the boot loader can load the
+64-bit kernel in the same way as in the 16-bit boot protocol, except
+that the kernel can be loaded above 4G.
+
+In the 64-bit boot protocol, the kernel is started by jumping to the
+64-bit kernel entry point, which is the start address of the loaded
+64-bit kernel plus 0x200.
+
+At entry, the CPU must be in 64-bit mode with paging enabled.
+The range of setup_header.init_size bytes starting at the load address
+of the kernel, the zero page, and the command line buffer must be
+identity-mapped; a GDT must be loaded with the descriptors for selectors
+__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
+segments; __BOOT_CS must have execute/read permission, and __BOOT_DS
+must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
+must be __BOOT_DS; interrupts must be disabled; %rsi must hold the base
+address of the struct boot_params.
+
+EFI HANDOVER PROTOCOL
+=====================
+
+This protocol allows boot loaders to defer initialisation to the EFI
+boot stub. The boot loader is required to load the kernel/initrd(s)
+from the boot media and jump to the EFI handover protocol entry point
+which is hdr->handover_offset bytes from the beginning of
+startup_{32,64}.
+
+The function prototype for the handover entry point looks like this::
+
+ efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
+
+'handle' is the EFI image handle passed to the boot loader by the EFI
+firmware, 'table' is the EFI system table - these are the first two
+arguments of the "handoff state" as described in section 2.3 of the
+UEFI specification. 'bp' is the boot loader-allocated boot params.
+
+The boot loader *must* fill out the following fields in bp::
+
+ - hdr.code32_start
+ - hdr.cmd_line_ptr
+ - hdr.ramdisk_image (if applicable)
+ - hdr.ramdisk_size (if applicable)
+
+All other fields should be zero.
diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
deleted file mode 100644
index f4c2a97bfdbd..000000000000
--- a/Documentation/x86/boot.txt
+++ /dev/null
@@ -1,1130 +0,0 @@
- THE LINUX/x86 BOOT PROTOCOL
- ---------------------------
-
-On the x86 platform, the Linux kernel uses a rather complicated boot
-convention. This has evolved partially due to historical aspects, as
-well as the desire in the early days to have the kernel itself be a
-bootable image, the complicated PC memory model and due to changed
-expectations in the PC industry caused by the effective demise of
-real-mode DOS as a mainstream operating system.
-
-Currently, the following versions of the Linux/x86 boot protocol exist.
-
-Old kernels: zImage/Image support only. Some very early kernels
- may not even support a command line.
-
-Protocol 2.00: (Kernel 1.3.73) Added bzImage and initrd support, as
- well as a formalized way to communicate between the
- boot loader and the kernel. setup.S made relocatable,
- although the traditional setup area still assumed
- writable.
-
-Protocol 2.01: (Kernel 1.3.76) Added a heap overrun warning.
-
-Protocol 2.02: (Kernel 2.4.0-test3-pre3) New command line protocol.
- Lower the conventional memory ceiling. No overwrite
- of the traditional setup area, thus making booting
- safe for systems which use the EBDA from SMM or 32-bit
- BIOS entry points. zImage deprecated but still
- supported.
-
-Protocol 2.03: (Kernel 2.4.18-pre1) Explicitly makes the highest possible
- initrd address available to the bootloader.
-
-Protocol 2.04: (Kernel 2.6.14) Extend the syssize field to four bytes.
-
-Protocol 2.05: (Kernel 2.6.20) Make protected mode kernel relocatable.
- Introduce relocatable_kernel and kernel_alignment fields.
-
-Protocol 2.06: (Kernel 2.6.22) Added a field that contains the size of
- the boot command line.
-
-Protocol 2.07: (Kernel 2.6.24) Added paravirtualised boot protocol.
- Introduced hardware_subarch and hardware_subarch_data
- and KEEP_SEGMENTS flag in load_flags.
-
-Protocol 2.08: (Kernel 2.6.26) Added crc32 checksum and ELF format
- payload. Introduced payload_offset and payload_length
- fields to aid in locating the payload.
-
-Protocol 2.09: (Kernel 2.6.26) Added a field of 64-bit physical
- pointer to single linked list of struct setup_data.
-
-Protocol 2.10: (Kernel 2.6.31) Added a protocol for relaxed alignment
- beyond the kernel_alignment added, new init_size and
- pref_address fields. Added extended boot loader IDs.
-
-Protocol 2.11: (Kernel 3.6) Added a field for offset of EFI handover
- protocol entry point.
-
-Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
- to struct boot_params for loading bzImage and ramdisk
- above 4G in 64bit.
-
-**** MEMORY LAYOUT
-
-The traditional memory map for the kernel loader, used for Image or
-zImage kernels, typically looks like:
-
-        |                        |
-0A0000  +------------------------+
-        |  Reserved for BIOS     |  Do not use.  Reserved for BIOS EBDA.
-09A000  +------------------------+
-        |  Command line          |
-        |  Stack/heap            |  For use by the kernel real-mode code.
-098000  +------------------------+
-        |  Kernel setup          |  The kernel real-mode code.
-090200  +------------------------+
-        |  Kernel boot sector    |  The kernel legacy boot sector.
-090000  +------------------------+
-        |  Protected-mode kernel |  The bulk of the kernel image.
-010000  +------------------------+
-        |  Boot loader           |  <- Boot sector entry point 0000:7C00
-001000  +------------------------+
-        |  Reserved for MBR/BIOS |
-000800  +------------------------+
-        |  Typically used by MBR |
-000600  +------------------------+
-        |  BIOS use only         |
-000000  +------------------------+
-
-
-When using bzImage, the protected-mode kernel was relocated to
-0x100000 ("high memory"), and the kernel real-mode block (boot sector,
-setup, and stack/heap) was made relocatable to any address between
-0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
-2.01 the 0x90000+ memory range is still used internally by the kernel;
-the 2.02 protocol resolves that problem.
-
-It is desirable to keep the "memory ceiling" -- the highest point in
-low memory touched by the boot loader -- as low as possible, since
-some newer BIOSes have begun to allocate some rather large amounts of
-memory, called the Extended BIOS Data Area, near the top of low
-memory. The boot loader should use the "INT 12h" BIOS call to verify
-how much low memory is available.
-
-Unfortunately, if INT 12h reports that the amount of memory is too
-low, there is usually nothing the boot loader can do but to report an
-error to the user. The boot loader should therefore be designed to
-take up as little space in low memory as it reasonably can. For
-zImage or old bzImage kernels, which need data written into the
-0x90000 segment, the boot loader should make sure not to use memory
-above the 0x9A000 point; too many BIOSes will break above that point.
-
-For a modern bzImage kernel with boot protocol version >= 2.02, a
-memory layout like the following is suggested:
-
-         ~                        ~
-         |  Protected-mode kernel |
-100000   +------------------------+
-         |  I/O memory hole       |
-0A0000   +------------------------+
-         |  Reserved for BIOS     |  Leave as much as possible unused
-         ~                        ~
-         |  Command line          |  (Can also be below the X+10000 mark)
-X+10000  +------------------------+
-         |  Stack/heap            |  For use by the kernel real-mode code.
-X+08000  +------------------------+
-         |  Kernel setup          |  The kernel real-mode code.
-         |  Kernel boot sector    |  The kernel legacy boot sector.
-X        +------------------------+
-         |  Boot loader           |  <- Boot sector entry point 0000:7C00
-001000   +------------------------+
-         |  Reserved for MBR/BIOS |
-000800   +------------------------+
-         |  Typically used by MBR |
-000600   +------------------------+
-         |  BIOS use only         |
-000000   +------------------------+
-
-... where the address X is as low as the design of the boot loader
-permits.
-
-
-**** THE REAL-MODE KERNEL HEADER
-
-In the following text, and anywhere in the kernel boot sequence, "a
-sector" refers to 512 bytes. It is independent of the actual sector
-size of the underlying medium.
-
-The first step in loading a Linux kernel should be to load the
-real-mode code (boot sector and setup code) and then examine the
-following header at offset 0x01f1. The real-mode code can total up to
-32K, although the boot loader may choose to load only the first two
-sectors (1K) and then examine the bootup sector size.
-
-The header looks like:
-
-Offset  Proto    Name                   Meaning
-/Size
-
-01F1/1  ALL(1    setup_sects            The size of the setup in sectors
-01F2/2  ALL      root_flags             If set, the root is mounted readonly
-01F4/4  2.04+(2  syssize                The size of the 32-bit code in 16-byte paras
-01F8/2  ALL      ram_size               DO NOT USE - for bootsect.S use only
-01FA/2  ALL      vid_mode               Video mode control
-01FC/2  ALL      root_dev               Default root device number
-01FE/2  ALL      boot_flag              0xAA55 magic number
-0200/2  2.00+    jump                   Jump instruction
-0202/4  2.00+    header                 Magic signature "HdrS"
-0206/2  2.00+    version                Boot protocol version supported
-0208/4  2.00+    realmode_swtch         Boot loader hook (see below)
-020C/2  2.00+    start_sys_seg          The load-low segment (0x1000) (obsolete)
-020E/2  2.00+    kernel_version         Pointer to kernel version string
-0210/1  2.00+    type_of_loader         Boot loader identifier
-0211/1  2.00+    loadflags              Boot protocol option flags
-0212/2  2.00+    setup_move_size        Move to high memory size (used with hooks)
-0214/4  2.00+    code32_start           Boot loader hook (see below)
-0218/4  2.00+    ramdisk_image          initrd load address (set by boot loader)
-021C/4  2.00+    ramdisk_size           initrd size (set by boot loader)
-0220/4  2.00+    bootsect_kludge        DO NOT USE - for bootsect.S use only
-0224/2  2.01+    heap_end_ptr           Free memory after setup end
-0226/1  2.02+(3  ext_loader_ver         Extended boot loader version
-0227/1  2.02+(3  ext_loader_type        Extended boot loader ID
-0228/4  2.02+    cmd_line_ptr           32-bit pointer to the kernel command line
-022C/4  2.03+    initrd_addr_max        Highest legal initrd address
-0230/4  2.05+    kernel_alignment       Physical addr alignment required for kernel
-0234/1  2.05+    relocatable_kernel     Whether kernel is relocatable or not
-0235/1  2.10+    min_alignment          Minimum alignment, as a power of two
-0236/2  2.12+    xloadflags             Boot protocol option flags
-0238/4  2.06+    cmdline_size           Maximum size of the kernel command line
-023C/4  2.07+    hardware_subarch       Hardware subarchitecture
-0240/8  2.07+    hardware_subarch_data  Subarchitecture-specific data
-0248/4  2.08+    payload_offset         Offset of kernel payload
-024C/4  2.08+    payload_length         Length of kernel payload
-0250/8  2.09+    setup_data             64-bit physical pointer to linked list
-                                        of struct setup_data
-0258/8  2.10+    pref_address           Preferred loading address
-0260/4  2.10+    init_size              Linear memory required during initialization
-0264/4  2.11+    handover_offset        Offset of handover entry point
-
-(1) For backwards compatibility, if the setup_sects field contains 0, the
- real value is 4.
-
-(2) For boot protocol prior to 2.04, the upper two bytes of the syssize
- field are unusable, which means the size of a bzImage kernel
- cannot be determined.
-
-(3) Ignored, but safe to set, for boot protocols 2.02-2.09.
-
-If the "HdrS" (0x53726448) magic number is not found at offset 0x202,
-the boot protocol version is "old". Loading an old kernel, the
-following parameters should be assumed:
-
- Image type = zImage
- initrd not supported
- Real-mode kernel must be located at 0x90000.
-
-Otherwise, the "version" field contains the protocol version,
-e.g. protocol version 2.01 will contain 0x0201 in this field. When
-setting fields in the header, you must make sure only to set fields
-supported by the protocol version in use.
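The signature and version checks described above can be sketched in C as follows. This is only an illustration: the helper names (read_le16, read_le32, boot_protocol_version) and the convention of returning 0 for an "old" kernel are assumptions of this sketch, not part of the boot protocol; the offsets and the magic value come from the header table.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets and magic value taken from the header table; names are hypothetical. */
#define HDR_MAGIC_OFFSET   0x202
#define HDR_VERSION_OFFSET 0x206
#define HDR_MAGIC          0x53726448u  /* "HdrS" read as a little-endian u32 */

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the boot protocol version in (major << 8) + minor form,
 * e.g. 0x0201 for 2.01. Return 0 if the buffer is too short or the
 * "HdrS" signature is absent, i.e. the protocol is "old".
 */
uint16_t boot_protocol_version(const uint8_t *image, size_t len)
{
    if (len < HDR_VERSION_OFFSET + 2)
        return 0;
    if (read_le32(image + HDR_MAGIC_OFFSET) != HDR_MAGIC)
        return 0;
    return read_le16(image + HDR_VERSION_OFFSET);
}
```

A loader would call this on the first two sectors of the kernel file and compare the result against constants such as 0x0202 before touching any version-dependent field.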
-
-
-**** DETAILS OF HEADER FIELDS
-
-Some fields carry information from the kernel to the bootloader
-("read"), some are expected to be filled out by the bootloader
-("write"), and some are expected to be read and modified by the
-bootloader ("modify").
-
-All general purpose boot loaders should write the fields marked
-(obligatory). Boot loaders that want to load the kernel at a
-nonstandard address should fill in the fields marked (reloc); other
-boot loaders can ignore those fields.
-
-The byte order of all fields is little-endian (this is x86, after all.)
-
-Field name: setup_sects
-Type: read
-Offset/size: 0x1f1/1
-Protocol: ALL
-
- The size of the setup code in 512-byte sectors. If this field is
- 0, the real value is 4. The real-mode code consists of the boot
- sector (always one 512-byte sector) plus the setup code.
-
-Field name: root_flags
-Type: modify (optional)
-Offset/size: 0x1f2/2
-Protocol: ALL
-
- If this field is nonzero, the root defaults to readonly. The use of
- this field is deprecated; use the "ro" or "rw" options on the
- command line instead.
-
-Field name: syssize
-Type: read
-Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
-Protocol: 2.04+
-
- The size of the protected-mode code in units of 16-byte paragraphs.
- For protocol versions older than 2.04 this field is only two bytes
- wide, and therefore cannot be trusted for the size of a kernel if
- the LOADED_HIGH flag is set.
-
-Field name: ram_size
-Type: kernel internal
-Offset/size: 0x1f8/2
-Protocol: ALL
-
- This field is obsolete.
-
-Field name: vid_mode
-Type: modify (obligatory)
-Offset/size: 0x1fa/2
-
- Please see the section on SPECIAL COMMAND LINE OPTIONS.
-
-Field name: root_dev
-Type: modify (optional)
-Offset/size: 0x1fc/2
-Protocol: ALL
-
- The default root device number. The use of this field is
- deprecated; use the "root=" option on the command line instead.
-
-Field name: boot_flag
-Type: read
-Offset/size: 0x1fe/2
-Protocol: ALL
-
- Contains 0xAA55. This is the closest thing old Linux kernels have
- to a magic number.
-
-Field name: jump
-Type: read
-Offset/size: 0x200/2
-Protocol: 2.00+
-
- Contains an x86 jump instruction, 0xEB followed by a signed offset
- relative to byte 0x202. This can be used to determine the size of
- the header.
-
-Field name: header
-Type: read
-Offset/size: 0x202/4
-Protocol: 2.00+
-
- Contains the magic number "HdrS" (0x53726448).
-
-Field name: version
-Type: read
-Offset/size: 0x206/2
-Protocol: 2.00+
-
- Contains the boot protocol version, in (major << 8)+minor format,
- e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
- 10.17.
-
-Field name: realmode_swtch
-Type: modify (optional)
-Offset/size: 0x208/4
-Protocol: 2.00+
-
- Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
-
-Field name: start_sys_seg
-Type: read
-Offset/size: 0x20c/2
-Protocol: 2.00+
-
- The load low segment (0x1000). Obsolete.
-
-Field name: kernel_version
-Type: read
-Offset/size: 0x20e/2
-Protocol: 2.00+
-
- If set to a nonzero value, contains a pointer to a NUL-terminated
- human-readable kernel version number string, less 0x200. This can
- be used to display the kernel version to the user. This value
- should be less than (0x200*setup_sects).
-
- For example, if this value is set to 0x1c00, the kernel version
- number string can be found at offset 0x1e00 in the kernel file.
- This is a valid value if and only if the "setup_sects" field
- contains the value 15 or higher, as:
-
- 0x1c00 < 15*0x200 (= 0x1e00) but
- 0x1c00 >= 14*0x200 (= 0x1c00)
-
- 0x1c00 >> 9 = 14, so the minimum value for setup_sects is 15.
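The validity rule for kernel_version can be written as a small C check. This is a sketch under the rules stated above (the pointer is stored less 0x200, must be below 0x200*setup_sects, and setup_sects == 0 means 4); the function names are hypothetical.

```c
#include <stdint.h>

/* A setup_sects value of 0 stands for 4, for backwards compatibility. */
static unsigned effective_setup_sects(uint8_t setup_sects)
{
    return setup_sects ? setup_sects : 4;
}

/*
 * Return the offset of the version string within the kernel file,
 * or 0 if the field is unset or points outside the setup code.
 */
uint32_t kernel_version_offset(uint16_t kernel_version, uint8_t setup_sects)
{
    if (kernel_version == 0)
        return 0;
    if (kernel_version >= 0x200u * effective_setup_sects(setup_sects))
        return 0;
    return (uint32_t)kernel_version + 0x200;
}
```

With the numbers from the example, kernel_version_offset(0x1c00, 15) yields 0x1e00, while the same pointer with setup_sects == 14 is rejected.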
-
-Field name: type_of_loader
-Type: write (obligatory)
-Offset/size: 0x210/1
-Protocol: 2.00+
-
- If your boot loader has an assigned id (see table below), enter
- 0xTV here, where T is an identifier for the boot loader and V is
- a version number. Otherwise, enter 0xFF here.
-
- For boot loader IDs above T = 0xD, write T = 0xE to this field and
- write the extended ID minus 0x10 to the ext_loader_type field.
- Similarly, the ext_loader_ver field can be used to provide more than
- four bits for the bootloader version.
-
- For example, for T = 0x15, V = 0x234, write:
-
- type_of_loader <- 0xE4
- ext_loader_type <- 0x05
- ext_loader_ver <- 0x23
-
- Assigned boot loader ids (hexadecimal):
-
- 0 LILO (0x00 reserved for pre-2.00 bootloader)
- 1 Loadlin
- 2 bootsect-loader (0x20, all other values reserved)
- 3 Syslinux
- 4 Etherboot/gPXE/iPXE
- 5 ELILO
- 7 GRUB
- 8 U-Boot
- 9 Xen
- A Gujin
- B Qemu
- C Arcturus Networks uCbootloader
- D kexec-tools
- E Extended (see ext_loader_type)
- F Special (0xFF = undefined)
- 10 Reserved
- 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de>
- 12 OVMF UEFI virtualization stack
-
- Please contact <[email protected]> if you need a bootloader ID
- value assigned.
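The split of a boot loader ID and version across type_of_loader, ext_loader_type and ext_loader_ver can be sketched as below. The struct and function names are hypothetical; the arithmetic follows the T = 0x15, V = 0x234 example, and the sketch assumes IDs 0xE and 0xF are never passed in directly because they are reserved for "extended" and "special".

```c
#include <stdint.h>

/* Hypothetical container for the three header fields involved. */
struct loader_id {
    uint8_t type_of_loader;
    uint8_t ext_loader_type;
    uint8_t ext_loader_ver;
};

/*
 * Encode boot loader ID `type` and version `version`.
 * IDs of 0x10 and above use the extended form: 0xE in the high nibble
 * of type_of_loader and (type - 0x10) in ext_loader_type.
 * The low four version bits go into type_of_loader, the next eight
 * into ext_loader_ver.
 */
struct loader_id encode_loader_id(unsigned type, unsigned version)
{
    struct loader_id id;

    if (type >= 0x10) {
        id.type_of_loader = (uint8_t)(0xE0 | (version & 0x0f));
        id.ext_loader_type = (uint8_t)(type - 0x10);
    } else {
        id.type_of_loader = (uint8_t)((type << 4) | (version & 0x0f));
        id.ext_loader_type = 0;
    }
    id.ext_loader_ver = (uint8_t)((version >> 4) & 0xff);
    return id;
}
```

For T = 0x15 and V = 0x234 this produces 0xE4, 0x05 and 0x23, matching the example above.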
-
-Field name: loadflags
-Type: modify (obligatory)
-Offset/size: 0x211/1
-Protocol: 2.00+
-
- This field is a bitmask.
-
- Bit 0 (read): LOADED_HIGH
- - If 0, the protected-mode code is loaded at 0x10000.
- - If 1, the protected-mode code is loaded at 0x100000.
-
- Bit 1 (kernel internal): KASLR_FLAG
- - Used internally by the compressed kernel to communicate
- KASLR status to kernel proper.
- If 1, KASLR enabled.
- If 0, KASLR disabled.
-
- Bit 5 (write): QUIET_FLAG
- - If 0, print early messages.
- - If 1, suppress early messages.
- This requests that the kernel (decompressor and early
- kernel) not write early messages that require
- accessing the display hardware directly.
-
- Bit 6 (write): KEEP_SEGMENTS
- Protocol: 2.07+
- - If 0, reload the segment registers in the 32bit entry point.
- - If 1, do not reload the segment registers in the 32bit entry point.
- Assume that %cs %ds %ss %es are all set to flat segments with
- a base of 0 (or the equivalent for their environment).
-
- Bit 7 (write): CAN_USE_HEAP
- Set this bit to 1 to indicate that the value entered in the
- heap_end_ptr is valid. If this field is clear, some setup code
- functionality will be disabled.
-
-Field name: setup_move_size
-Type: modify (obligatory)
-Offset/size: 0x212/2
-Protocol: 2.00-2.01
-
- When using protocol 2.00 or 2.01, if the real mode kernel is not
- loaded at 0x90000, it gets moved there later in the loading
- sequence. Fill in this field if you want additional data (such as
- the kernel command line) moved in addition to the real-mode kernel
- itself.
-
- The unit is bytes starting with the beginning of the boot sector.
-
- This field can be ignored when the protocol is 2.02 or higher, or
- if the real-mode code is loaded at 0x90000.
-
-Field name: code32_start
-Type: modify (optional, reloc)
-Offset/size: 0x214/4
-Protocol: 2.00+
-
- The address to jump to in protected mode. This defaults to the load
- address of the kernel, and can be used by the boot loader to
- determine the proper load address.
-
- This field can be modified for two purposes:
-
- 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
-
- 2. if a bootloader which does not install a hook loads a
- relocatable kernel at a nonstandard address it will have to modify
- this field to point to the load address.
-
-Field name: ramdisk_image
-Type: write (obligatory)
-Offset/size: 0x218/4
-Protocol: 2.00+
-
- The 32-bit linear address of the initial ramdisk or ramfs. Leave at
- zero if there is no initial ramdisk/ramfs.
-
-Field name: ramdisk_size
-Type: write (obligatory)
-Offset/size: 0x21c/4
-Protocol: 2.00+
-
- Size of the initial ramdisk or ramfs. Leave at zero if there is no
- initial ramdisk/ramfs.
-
-Field name: bootsect_kludge
-Type: kernel internal
-Offset/size: 0x220/4
-Protocol: 2.00+
-
- This field is obsolete.
-
-Field name: heap_end_ptr
-Type: write (obligatory)
-Offset/size: 0x224/2
-Protocol: 2.01+
-
- Set this field to the offset (from the beginning of the real-mode
- code) of the end of the setup stack/heap, minus 0x0200.
-
-Field name: ext_loader_ver
-Type: write (optional)
-Offset/size: 0x226/1
-Protocol: 2.02+
-
- This field is used as an extension of the version number in the
- type_of_loader field. The total version number is considered to be
- (type_of_loader & 0x0f) + (ext_loader_ver << 4).
-
- The use of this field is boot loader specific. If not written, it
- is zero.
-
- Kernels prior to 2.6.31 did not recognize this field, but it is safe
- to write for protocol version 2.02 or higher.
-
-Field name: ext_loader_type
-Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0)
-Offset/size: 0x227/1
-Protocol: 2.02+
-
- This field is used as an extension of the type number in
- type_of_loader field. If the type in type_of_loader is 0xE, then
- the actual type is (ext_loader_type + 0x10).
-
- This field is ignored if the type in type_of_loader is not 0xE.
-
- Kernels prior to 2.6.31 did not recognize this field, but it is safe
- to write for protocol version 2.02 or higher.
-
-Field name: cmd_line_ptr
-Type: write (obligatory)
-Offset/size: 0x228/4
-Protocol: 2.02+
-
- Set this field to the linear address of the kernel command line.
- The kernel command line can be located anywhere between the end of
- the setup heap and 0xA0000; it does not have to be located in the
- same 64K segment as the real-mode code itself.
-
- Fill in this field even if your boot loader does not support a
- command line, in which case you can point this to an empty string
- (or better yet, to the string "auto".) If this field is left at
- zero, the kernel will assume that your boot loader does not support
- the 2.02+ protocol.
-
-Field name: initrd_addr_max
-Type: read
-Offset/size: 0x22c/4
-Protocol: 2.03+
-
- The maximum address that may be occupied by the initial
- ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
- field is not present, and the maximum address is 0x37FFFFFF. (This
- address is defined as the address of the highest safe byte, so if
- your ramdisk is exactly 131072 bytes long and this field is
- 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
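Because initrd_addr_max names the highest safe byte rather than one past it, the highest possible start address is max - (size - 1). A minimal C sketch, with a hypothetical function name and 0 used as the "does not fit" result:

```c
#include <stdint.h>

/*
 * Return the highest start address at which an initrd of `size` bytes
 * still ends at or below initrd_addr_max, or 0 if it cannot fit.
 * For protocols 2.02 and earlier the caller passes 0x37FFFFFF.
 */
uint32_t initrd_highest_start(uint32_t initrd_addr_max, uint32_t size)
{
    if (size == 0 || size - 1 > initrd_addr_max)
        return 0;
    return initrd_addr_max - (size - 1);
}
```

A real loader would typically also round the result down to its own preferred alignment; that step is omitted here.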
-
-Field name: kernel_alignment
-Type: read/modify (reloc)
-Offset/size: 0x230/4
-Protocol: 2.05+ (read), 2.10+ (modify)
-
- Alignment unit required by the kernel (if relocatable_kernel is
- true.) A relocatable kernel that is loaded at an alignment
- incompatible with the value in this field will be realigned during
- kernel initialization.
-
- Starting with protocol version 2.10, this reflects the kernel
- alignment preferred for optimal performance; it is possible for the
- loader to modify this field to permit a lesser alignment. See the
- min_alignment and pref_address fields below.
-
-Field name: relocatable_kernel
-Type: read (reloc)
-Offset/size: 0x234/1
-Protocol: 2.05+
-
- If this field is nonzero, the protected-mode part of the kernel can
- be loaded at any address that satisfies the kernel_alignment field.
- After loading, the boot loader must set the code32_start field to
- point to the loaded code, or to a boot loader hook.
-
-Field name: min_alignment
-Type: read (reloc)
-Offset/size: 0x235/1
-Protocol: 2.10+
-
- This field, if nonzero, indicates as a power of two the minimum
- alignment required, as opposed to preferred, by the kernel to boot.
- If a boot loader makes use of this field, it should update the
- kernel_alignment field with the alignment unit desired; typically:
-
- kernel_alignment = 1 << min_alignment
-
- There may be a considerable performance cost with an excessively
- misaligned kernel. Therefore, a loader should typically try each
- power-of-two alignment from kernel_alignment down to this alignment.
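The "try each power-of-two alignment" strategy might look like the following C sketch. Everything about the free-memory model is an assumption made for illustration: the loader is imagined to have one candidate region [region_start, region_end), the function name is hypothetical, kernel_alignment is assumed to be a power of two, and 0 is returned when nothing fits.

```c
#include <stdint.h>

/*
 * Try alignments from kernel_alignment down to (1 << min_alignment)
 * and return the first aligned address inside the region that leaves
 * room for `size` bytes. *chosen_alignment receives the alignment
 * used, which the loader would then write back to kernel_alignment.
 * If min_alignment is 0, only kernel_alignment itself is tried.
 */
uint64_t pick_load_address(uint64_t region_start, uint64_t region_end,
                           uint64_t size, uint32_t kernel_alignment,
                           uint8_t min_alignment, uint32_t *chosen_alignment)
{
    uint64_t min_align, align;

    if (min_alignment > 31)
        return 0;
    min_align = min_alignment ? (1ull << min_alignment) : kernel_alignment;

    for (align = kernel_alignment; align != 0 && align >= min_align; align >>= 1) {
        uint64_t addr = (region_start + align - 1) & ~(align - 1);

        if (addr >= region_start && addr <= region_end &&
            size <= region_end - addr) {
            *chosen_alignment = (uint32_t)align;
            return addr;
        }
    }
    return 0;
}
```

Starting from the preferred alignment and halving keeps the kernel as well aligned as the available memory allows, which is the point of the performance warning above.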
-
-Field name: xloadflags
-Type: read
-Offset/size: 0x236/2
-Protocol: 2.12+
-
- This field is a bitmask.
-
- Bit 0 (read): XLF_KERNEL_64
- - If 1, this kernel has the legacy 64-bit entry point at 0x200.
-
- Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G
- - If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.
-
- Bit 2 (read): XLF_EFI_HANDOVER_32
- - If 1, the kernel supports the 32-bit EFI handoff entry point
- given at handover_offset.
-
- Bit 3 (read): XLF_EFI_HANDOVER_64
- - If 1, the kernel supports the 64-bit EFI handoff entry point
- given at handover_offset + 0x200.
-
- Bit 4 (read): XLF_EFI_KEXEC
- - If 1, the kernel supports kexec EFI boot with EFI runtime support.
-
-Field name: cmdline_size
-Type: read
-Offset/size: 0x238/4
-Protocol: 2.06+
-
- The maximum size of the command line without the terminating
- zero. This means that the command line can contain at most
- cmdline_size characters. With protocol version 2.05 and earlier, the
- maximum size was 255.
-
-Field name: hardware_subarch
-Type: write (optional, defaults to x86/PC)
-Offset/size: 0x23c/4
-Protocol: 2.07+
-
- In a paravirtualized environment the hardware low level architectural
- pieces such as interrupt handling, page table handling, and
- accessing process control registers need to be done differently.
-
- This field allows the bootloader to inform the kernel we are in
- one of those environments.
-
- 0x00000000 The default x86/PC environment
- 0x00000001 lguest
- 0x00000002 Xen
- 0x00000003 Moorestown MID
- 0x00000004 CE4100 TV Platform
-
-Field name: hardware_subarch_data
-Type: write (subarch-dependent)
-Offset/size: 0x240/8
-Protocol: 2.07+
-
- A pointer to data that is specific to the hardware subarch.
- This field is currently unused for the default x86/PC environment;
- do not modify it.
-
-Field name: payload_offset
-Type: read
-Offset/size: 0x248/4
-Protocol: 2.08+
-
- If non-zero then this field contains the offset from the beginning
- of the protected-mode code to the payload.
-
- The payload may be compressed. The format of both the compressed and
- uncompressed data should be determined using the standard magic
- numbers. The currently supported compression formats are gzip
- (magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
- (magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
- 02 21). The uncompressed payload is currently always ELF (magic
- number 7F 45 4C 46).
-
-Field name: payload_length
-Type: read
-Offset/size: 0x24c/4
-Protocol: 2.08+
-
- The length of the payload.
-
-Field name: setup_data
-Type: write (special)
-Offset/size: 0x250/8
-Protocol: 2.09+
-
- The 64-bit physical pointer to a NULL-terminated singly linked list of
- struct setup_data. This is used to define a more extensible boot
- parameters passing mechanism. The definition of struct setup_data is
- as follows:
-
- struct setup_data {
- u64 next;
- u32 type;
- u32 len;
- u8 data[0];
- };
-
- Here, next is a 64-bit physical pointer to the next node of the
- linked list (the next field of the last node is 0); type identifies
- the contents of data; len is the length of the data field; and data
- holds the real payload.
-
- This list may be modified at a number of points during the bootup
- process. Therefore, when modifying this list one should always make
- sure to consider the case where the linked list already contains
- entries.
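Appending to a list that may already hold entries can be sketched in C as below. The sketch assumes physical addresses equal the pointers the code can dereference (an identity mapping, which holds in a typical boot loader but is an assumption here), uses the C99 fixed-width and flexible-array spellings of the struct shown above, and the function name setup_data_append is hypothetical.

```c
#include <stdint.h>

/* Same layout as the struct above, spelled with C99 types. */
struct setup_data {
    uint64_t next;
    uint32_t type;
    uint32_t len;
    uint8_t data[];
};

/*
 * Append `node` to the list whose head pointer is stored in *head
 * (the setup_data header field). Existing entries are preserved by
 * walking to the last node instead of overwriting the head.
 * Assumes physical addresses are directly dereferenceable.
 */
void setup_data_append(uint64_t *head, struct setup_data *node)
{
    node->next = 0;
    while (*head != 0)
        head = &((struct setup_data *)(uintptr_t)*head)->next;
    *head = (uint64_t)(uintptr_t)node;
}
```

Walking to the tail rather than replacing the head is what keeps entries added by an earlier boot stage intact.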
-
-Field name: pref_address
-Type: read (reloc)
-Offset/size: 0x258/8
-Protocol: 2.10+
-
- This field, if nonzero, represents a preferred load address for the
- kernel. A relocating bootloader should attempt to load at this
- address if possible.
-
- A non-relocatable kernel will unconditionally move itself to run
- at this address.
-
-Field name: init_size
-Type: read
-Offset/size: 0x260/4
-
- This field indicates the amount of linear contiguous memory starting
- at the kernel runtime start address that the kernel needs before it
- is capable of examining its memory map. This is not the same thing
- as the total amount of memory the kernel needs to boot, but it can
- be used by a relocating boot loader to help select a safe load
- address for the kernel.
-
- The kernel runtime start address is determined by the following algorithm:
-
- if (relocatable_kernel)
- runtime_start = align_up(load_address, kernel_alignment)
- else
- runtime_start = pref_address
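The algorithm above translates almost directly into C. In this sketch align_up and kernel_runtime_start are hypothetical names, and kernel_alignment is assumed to be a nonzero power of two.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Compute the kernel runtime start address from header fields.
 * The kernel then needs init_size bytes of contiguous memory
 * starting at the returned address.
 */
uint64_t kernel_runtime_start(int relocatable_kernel, uint64_t load_address,
                              uint32_t kernel_alignment, uint64_t pref_address)
{
    if (relocatable_kernel)
        return align_up(load_address, kernel_alignment);
    return pref_address;
}
```

A relocating loader can use the result together with init_size to confirm that the range from runtime_start to runtime_start + init_size does not collide with the initrd or the command line.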
-
-Field name: handover_offset
-Type: read
-Offset/size: 0x264/4
-
- This field is the offset from the beginning of the kernel image to
- the EFI handover protocol entry point. Boot loaders using the EFI
- handover protocol to boot the kernel should jump to this offset.
-
- See EFI HANDOVER PROTOCOL below for more details.
-
-
-**** THE IMAGE CHECKSUM
-
-From boot protocol version 2.08 onwards the CRC-32 is calculated over
-the entire file using the characteristic polynomial 0x04C11DB7 and an
-initial remainder of 0xffffffff. The checksum is appended to the
-file; therefore the CRC of the file up to the limit specified in the
-syssize field of the header is always 0.
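The "CRC of the whole file is 0" property can be demonstrated with a small bitwise CRC-32 in C. Two details here are assumptions not stated in the text above: the bytes are processed least-significant bit first (so the polynomial 0x04C11DB7 appears in its bit-reversed form 0xEDB88320, as in the common CRC-32), and the checksum is stored little-endian with no final inversion. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, LSB first; 0xEDB88320 is 0x04C11DB7 bit-reversed. */
uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1)));
    }
    return crc;
}

/* Store the running CRC of image[0..payload_len) in the next 4 bytes. */
void append_crc_le(uint8_t *image, size_t payload_len)
{
    uint32_t crc = crc32_update(0xffffffffu, image, payload_len);

    for (int i = 0; i < 4; i++)
        image[payload_len + i] = (uint8_t)(crc >> (8 * i));
}

/* Nonzero if the CRC over the image including its checksum is 0. */
int image_crc_ok(const uint8_t *image, size_t len)
{
    return crc32_update(0xffffffffu, image, len) == 0;
}
```

Feeding the four stored checksum bytes back through the same register cancels it to zero, which is why a verifier only needs to run the CRC over the full length and compare against 0.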
-
-
-**** THE KERNEL COMMAND LINE
-
-The kernel command line has become an important way for the boot
-loader to communicate with the kernel. Some of its options are also
-relevant to the boot loader itself, see "special command line options"
-below.
-
-The kernel command line is a null-terminated string. The maximum
-length can be retrieved from the field cmdline_size. Before protocol
-version 2.06, the maximum was 255 characters. A string that is too
-long will be automatically truncated by the kernel.
-
-If the boot protocol version is 2.02 or later, the address of the
-kernel command line is given by the header field cmd_line_ptr (see
-above.) This address can be anywhere between the end of the setup
-heap and 0xA0000.
-
-If the protocol version is *not* 2.02 or higher, the kernel
-command line is entered using the following protocol:
-
- At offset 0x0020 (word), "cmd_line_magic", enter the magic
- number 0xA33F.
-
- At offset 0x0022 (word), "cmd_line_offset", enter the offset
- of the kernel command line (relative to the start of the
- real-mode kernel).
-
- The kernel command line *must* be within the memory region
- covered by setup_move_size, so you may need to adjust this
- field.
-
-
-**** MEMORY LAYOUT OF THE REAL-MODE CODE
-
-The real-mode code requires a stack/heap to be set up, as well as
-memory allocated for the kernel command line. This needs to be done
-in the real-mode accessible memory in the bottom megabyte.
-
-It should be noted that modern machines often have a sizable Extended
-BIOS Data Area (EBDA). As a result, it is advisable to use as little
-of the low megabyte as possible.
-
-Unfortunately, under the following circumstances the 0x90000 memory
-segment has to be used:
-
- - When loading a zImage kernel ((loadflags & 0x01) == 0).
- - When loading a 2.01 or earlier boot protocol kernel.
-
- -> For the 2.00 and 2.01 boot protocols, the real-mode code
- can be loaded at another address, but it is internally
- relocated to 0x90000. For the "old" protocol, the
- real-mode code must be loaded at 0x90000.
-
-When loading at 0x90000, avoid using memory above 0x9a000.
-
-For boot protocol 2.02 or higher, the command line does not have to be
-located in the same 64K segment as the real-mode setup code; it is
-thus permitted to give the stack/heap the full 64K segment and locate
-the command line above it.
-
-The kernel command line should not be located below the real-mode
-code, nor should it be located in high memory.
-
-
-**** SAMPLE BOOT CONFIGURATION
-
-As a sample configuration, assume the following layout of the real
-mode segment:
-
- When loading below 0x90000, use the entire segment:
-
- 0x0000-0x7fff Real mode kernel
- 0x8000-0xdfff Stack and heap
- 0xe000-0xffff Kernel command line
-
- When loading at 0x90000 OR the protocol version is 2.01 or earlier:
-
- 0x0000-0x7fff Real mode kernel
- 0x8000-0x97ff Stack and heap
- 0x9800-0x9fff Kernel command line
-
-Such a boot loader should enter the following fields in the header:
-
-  unsigned long base_ptr; /* base address for real-mode segment */
-
-  if ( setup_sects == 0 ) {
-    setup_sects = 4;
-  }
-
-  if ( protocol >= 0x0200 ) {
-    type_of_loader = <type code>;
-    if ( loading_initrd ) {
-      ramdisk_image = <initrd_address>;
-      ramdisk_size = <initrd_size>;
-    }
-
-    if ( protocol >= 0x0202 && loadflags & 0x01 )
-      heap_end = 0xe000;
-    else
-      heap_end = 0x9800;
-
-    if ( protocol >= 0x0201 ) {
-      heap_end_ptr = heap_end - 0x200;
-      loadflags |= 0x80; /* CAN_USE_HEAP */
-    }
-
-    if ( protocol >= 0x0202 ) {
-      cmd_line_ptr = base_ptr + heap_end;
-      strcpy(cmd_line_ptr, cmdline);
-    } else {
-      cmd_line_magic = 0xA33F;
-      cmd_line_offset = heap_end;
-      setup_move_size = heap_end + strlen(cmdline)+1;
-      strcpy(base_ptr+cmd_line_offset, cmdline);
-    }
-  } else {
-    /* Very old kernel */
-
-    heap_end = 0x9800;
-
-    cmd_line_magic = 0xA33F;
-    cmd_line_offset = heap_end;
-
-    /* A very old kernel MUST have its real-mode code
-       loaded at 0x90000 */
-
-    if ( base_ptr != 0x90000 ) {
-      /* Copy the real-mode kernel */
-      memcpy(0x90000, base_ptr, (setup_sects+1)*512);
-      base_ptr = 0x90000; /* Relocated */
-    }
-
-    strcpy(0x90000+cmd_line_offset, cmdline);
-
-    /* It is recommended to clear memory up to the 32K mark */
-    memset(0x90000 + (setup_sects+1)*512, 0,
-           (64-(setup_sects+1))*512);
-  }
-
-
-**** LOADING THE REST OF THE KERNEL
-
-The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512
-in the kernel file (again, if setup_sects == 0 the real value is 4.)
-It should be loaded at address 0x10000 for Image/zImage kernels and
-0x100000 for bzImage kernels.
-
-The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01
-bit (LOADED_HIGH) in the loadflags field is set:
-
- is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01);
- load_address = is_bzImage ? 0x100000 : 0x10000;
-
-Note that Image/zImage kernels can be up to 512K in size, and thus use
-the entire 0x10000-0x90000 range of memory. This means it is pretty
-much a requirement for these kernels to load the real-mode part at
-0x90000. bzImage kernels allow much more flexibility.
-
-
-**** SPECIAL COMMAND LINE OPTIONS
-
-If the command line provided by the boot loader is entered by the
-user, the user may expect the following command line options to work.
-They should normally not be deleted from the kernel command line even
-though not all of them are actually meaningful to the kernel. Boot
-loader authors who need additional command line options for the boot
-loader itself should get them registered in
-Documentation/admin-guide/kernel-parameters.rst to make sure they will not
-conflict with actual kernel options now or in the future.
-
- vga=<mode>
- <mode> here is either an integer (in C notation, either
- decimal, octal, or hexadecimal) or one of the strings
- "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask"
- (meaning 0xFFFD). This value should be entered into the
- vid_mode field, as it is used by the kernel before the command
- line is parsed.
-
- mem=<size>
- <size> is an integer in C notation optionally followed by
- (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
- << 30, << 40, << 50 or << 60). This specifies the end of
- memory to the kernel. This affects the possible placement of
- an initrd, since an initrd should be placed near end of
- memory. Note that this is an option to *both* the kernel and
- the bootloader!
-
- initrd=<file>
- An initrd should be loaded. The meaning of <file> is
- obviously bootloader-dependent, and some boot loaders
- (e.g. LILO) do not have such a command.
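A boot loader that honours mem=<size> needs to parse the same size syntax as the kernel. The C sketch below follows the description above (C-notation integer, optional case-insensitive K/M/G/T/P/E suffix); the function name is hypothetical, and overflow and trailing-garbage handling are deliberately left out.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the <size> part of "mem=<size>": an integer in C notation
 * (decimal, octal or hexadecimal) followed by an optional suffix.
 * Returns the size in bytes; overflow is not detected.
 */
uint64_t parse_mem_size(const char *s)
{
    char *end;
    uint64_t value = strtoull(s, &end, 0);  /* base 0 accepts 0x.. and 0.. */
    unsigned shift = 0;

    switch (*end) {
    case 'K': case 'k': shift = 10; break;
    case 'M': case 'm': shift = 20; break;
    case 'G': case 'g': shift = 30; break;
    case 'T': case 't': shift = 40; break;
    case 'P': case 'p': shift = 50; break;
    case 'E': case 'e': shift = 60; break;
    default: break;
    }
    return value << shift;
}
```

The loader can then keep the initrd below the returned limit while still passing the original option through to the kernel unchanged.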
-
-In addition, some boot loaders add the following options to the
-user-specified command line:
-
- BOOT_IMAGE=<file>
- The boot image which was loaded. Again, the meaning of <file>
- is obviously bootloader-dependent.
-
- auto
- The kernel was booted without explicit user intervention.
-
-If these options are added by the boot loader, it is highly
-recommended that they are located *first*, before the user-specified
-or configuration-specified command line. Otherwise, "init=/bin/sh"
-gets confused by the "auto" option.
-
-
-**** RUNNING THE KERNEL
-
-The kernel is started by jumping to the kernel entry point, which is
-located at *segment* offset 0x20 from the start of the real mode
-kernel. This means that if you loaded your real-mode kernel code at
-0x90000, the kernel entry point is 9020:0000.
-
-At entry, ds = es = ss should point to the start of the real-mode
-kernel code (0x9000 if the code is loaded at 0x90000), sp should be
-set up properly, normally pointing to the top of the heap, and
-interrupts should be disabled. Furthermore, to guard against bugs in
-the kernel, it is recommended that the boot loader sets fs = gs = ds =
-es = ss.
-
-In our example from above, we would do:
-
- /* Note: in the case of the "old" kernel protocol, base_ptr must
- be == 0x90000 at this point; see the previous sample code */
-
- seg = base_ptr >> 4;
-
- cli(); /* Enter with interrupts disabled! */
-
- /* Set up the real-mode kernel stack */
- _SS = seg;
- _SP = heap_end;
-
- _DS = _ES = _FS = _GS = seg;
- jmp_far(seg+0x20, 0); /* Run the kernel */
-
-If your boot sector accesses a floppy drive, it is recommended to
-switch off the floppy motor before running the kernel, since the
-kernel boot leaves interrupts off and thus the motor will not be
-switched off, especially if the loaded kernel has the floppy driver as
-a demand-loaded module!
-
-
-**** ADVANCED BOOT LOADER HOOKS
-
-If the boot loader runs in a particularly hostile environment (such as
-LOADLIN, which runs under DOS) it may be impossible to follow the
-standard memory location requirements. Such a boot loader may use the
-following hooks that, if set, are invoked by the kernel at the
-appropriate time. The use of these hooks should probably be
-considered an absolutely last resort!
-
-IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
-%edi across invocation.
-
- realmode_swtch:
- A 16-bit real mode far subroutine invoked immediately before
- entering protected mode. The default routine disables NMI, so
- your routine should probably do so, too.
-
- code32_start:
- A 32-bit flat-mode routine *jumped* to immediately after the
- transition to protected mode, but before the kernel is
- uncompressed. No segments, except CS, are guaranteed to be
- set up (current kernels do, but older ones do not); you should
- set them up to BOOT_DS (0x18) yourself.
-
- After completing your hook, you should jump to the address
- that was in this field before your boot loader overwrote it
- (relocated, if appropriate.)
-
-
-**** 32-bit BOOT PROTOCOL
-
-For machine with some new BIOS other than legacy BIOS, such as EFI,
-LinuxBIOS, etc, and kexec, the 16-bit real mode setup code in kernel
-based on legacy BIOS can not be used, so a 32-bit boot protocol needs
-to be defined.
-
-In 32-bit boot protocol, the first step in loading a Linux kernel
-should be to setup the boot parameters (struct boot_params,
-traditionally known as "zero page"). The memory for struct boot_params
-should be allocated and initialized to all zero. Then the setup header
-from offset 0x01f1 of kernel image on should be loaded into struct
-boot_params and examined. The end of setup header can be calculated as
-follow:
-
- 0x0202 + byte value at offset 0x0201
-
-In addition to read/modify/write the setup header of the struct
-boot_params as that of 16-bit boot protocol, the boot loader should
-also fill the additional fields of the struct boot_params as that
-described in zero-page.txt.
-
-After setting up the struct boot_params, the boot loader can load the
-32/64-bit kernel in the same way as that of 16-bit boot protocol.
-
-In 32-bit boot protocol, the kernel is started by jumping to the
-32-bit kernel entry point, which is the start address of loaded
-32/64-bit kernel.
-
-At entry, the CPU must be in 32-bit protected mode with paging
-disabled; a GDT must be loaded with the descriptors for selectors
-__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
-segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
-must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
-must be __BOOT_DS; interrupt must be disabled; %esi must hold the base
-address of the struct boot_params; %ebp, %edi and %ebx must be zero.
-
-**** 64-bit BOOT PROTOCOL
-
-For machine with 64bit cpus and 64bit kernel, we could use 64bit bootloader
-and we need a 64-bit boot protocol.
-
-In 64-bit boot protocol, the first step in loading a Linux kernel
-should be to setup the boot parameters (struct boot_params,
-traditionally known as "zero page"). The memory for struct boot_params
-could be allocated anywhere (even above 4G) and initialized to all zero.
-Then, the setup header at offset 0x01f1 of kernel image on should be
-loaded into struct boot_params and examined. The end of setup header
-can be calculated as follows:
-
- 0x0202 + byte value at offset 0x0201
-
-In addition to read/modify/write the setup header of the struct
-boot_params as that of 16-bit boot protocol, the boot loader should
-also fill the additional fields of the struct boot_params as described
-in zero-page.txt.
-
-After setting up the struct boot_params, the boot loader can load
-64-bit kernel in the same way as that of 16-bit boot protocol, but
-kernel could be loaded above 4G.
-
-In 64-bit boot protocol, the kernel is started by jumping to the
-64-bit kernel entry point, which is the start address of loaded
-64-bit kernel plus 0x200.
-
-At entry, the CPU must be in 64-bit mode with paging enabled.
-The range with setup_header.init_size from start address of loaded
-kernel and zero page and command line buffer get ident mapping;
-a GDT must be loaded with the descriptors for selectors
-__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
-segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
-must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
-must be __BOOT_DS; interrupt must be disabled; %rsi must hold the base
-address of the struct boot_params.
-
-**** EFI HANDOVER PROTOCOL
-
-This protocol allows boot loaders to defer initialisation to the EFI
-boot stub. The boot loader is required to load the kernel/initrd(s)
-from the boot media and jump to the EFI handover protocol entry point
-which is hdr->handover_offset bytes from the beginning of
-startup_{32,64}.
-
-The function prototype for the handover entry point looks like this,
-
- efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
-
-'handle' is the EFI image handle passed to the boot loader by the EFI
-firmware, 'table' is the EFI system table - these are the first two
-arguments of the "handoff state" as described in section 2.3 of the
-UEFI specification. 'bp' is the boot loader-allocated boot params.
-
-The boot loader *must* fill out the following fields in bp,
-
- o hdr.code32_start
- o hdr.cmd_line_ptr
- o hdr.ramdisk_image (if applicable)
- o hdr.ramdisk_size (if applicable)
-
-All other fields should be zero.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 7612d3142b2a..8f08caf4fbbb 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -7,3 +7,5 @@ Linux x86 Support
.. toctree::
:maxdepth: 2
:numbered:
+
+ boot
--
2.20.1
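
The boot protocol text touched by this patch gives the end of the setup header as "0x0202 + byte value at offset 0x0201". A minimal sketch of that arithmetic in C follows. It is an illustration only, not kernel or boot loader code: the function name setup_header_end(), the macro names, and the buffer-length check are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets quoted from the boot protocol text; the macro names are made up. */
#define SETUP_HEADER_START   0x01f1u  /* setup header begins here in the image */
#define SETUP_END_BYTE_OFF   0x0201u  /* byte that is added to 0x0202 */
#define SETUP_END_BASE       0x0202u

/*
 * Return the offset of the end of the setup header inside a kernel image:
 * 0x0202 + the byte value stored at offset 0x0201.
 * Returns 0 if the buffer is too short to contain that byte
 * (this check is an addition for the example, not part of the protocol).
 */
static size_t setup_header_end(const uint8_t *image, size_t len)
{
	assert(image != NULL);

	if (len <= SETUP_END_BYTE_OFF)
		return 0;

	return SETUP_END_BASE + image[SETUP_END_BYTE_OFF];
}
```

A loader following the 32-bit or 64-bit protocol would copy the bytes from SETUP_HEADER_START up to the returned offset into its zeroed struct boot_params before examining them.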

2019-04-23 16:38:09

by Changbin Du

Subject: [PATCH v4 40/63] Documentation: x86: convert exception-tables.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
...eption-tables.txt => exception-tables.rst} | 231 ++++++++++--------
Documentation/x86/index.rst | 1 +
2 files changed, 126 insertions(+), 106 deletions(-)
rename Documentation/x86/{exception-tables.txt => exception-tables.rst} (67%)

diff --git a/Documentation/x86/exception-tables.txt b/Documentation/x86/exception-tables.rst
similarity index 67%
rename from Documentation/x86/exception-tables.txt
rename to Documentation/x86/exception-tables.rst
index e396bcd8d830..2ffb096c8b58 100644
--- a/Documentation/x86/exception-tables.txt
+++ b/Documentation/x86/exception-tables.rst
@@ -1,5 +1,10 @@
- Kernel level exception handling in Linux
- Commentary by Joerg Pommnitz <[email protected]>
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Kernel level exception handling
+===============================
+
+Commentary by Joerg Pommnitz <[email protected]>

When a process runs in kernel mode, it often has to access user
mode memory whose address has been passed by an untrusted program.
@@ -25,9 +30,9 @@ How does this work?

Whenever the kernel tries to access an address that is currently not
accessible, the CPU generates a page fault exception and calls the
-page fault handler
+page fault handler::

-void do_page_fault(struct pt_regs *regs, unsigned long error_code)
+ void do_page_fault(struct pt_regs *regs, unsigned long error_code)

in arch/x86/mm/fault.c. The parameters on the stack are set up by
the low level assembly glue in arch/x86/kernel/entry_32.S. The parameter
@@ -57,73 +62,74 @@ as an example. The definition is somewhat hard to follow, so let's peek at
the code generated by the preprocessor and the compiler. I selected
the get_user call in drivers/char/sysrq.c for a detailed examination.

-The original code in sysrq.c line 587:
+The original code in sysrq.c line 587::
+
get_user(c, buf);

-The preprocessor output (edited to become somewhat readable):
-
-(
- {
- long __gu_err = - 14 , __gu_val = 0;
- const __typeof__(*( ( buf ) )) *__gu_addr = ((buf));
- if (((((0 + current_set[0])->tss.segment) == 0x18 ) ||
- (((sizeof(*(buf))) <= 0xC0000000UL) &&
- ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf)))))))
- do {
- __gu_err = 0;
- switch ((sizeof(*(buf)))) {
- case 1:
- __asm__ __volatile__(
- "1: mov" "b" " %2,%" "b" "1\n"
- "2:\n"
- ".section .fixup,\"ax\"\n"
- "3: movl %3,%0\n"
- " xor" "b" " %" "b" "1,%" "b" "1\n"
- " jmp 2b\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b,3b\n"
- ".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *)
- ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ;
- break;
- case 2:
- __asm__ __volatile__(
- "1: mov" "w" " %2,%" "w" "1\n"
- "2:\n"
- ".section .fixup,\"ax\"\n"
- "3: movl %3,%0\n"
- " xor" "w" " %" "w" "1,%" "w" "1\n"
- " jmp 2b\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n"
- " .long 1b,3b\n"
- ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
- ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err ));
- break;
- case 4:
- __asm__ __volatile__(
- "1: mov" "l" " %2,%" "" "1\n"
- "2:\n"
- ".section .fixup,\"ax\"\n"
- "3: movl %3,%0\n"
- " xor" "l" " %" "" "1,%" "" "1\n"
- " jmp 2b\n"
- ".section __ex_table,\"a\"\n"
- " .align 4\n" " .long 1b,3b\n"
- ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
- ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err));
- break;
- default:
- (__gu_val) = __get_user_bad();
- }
- } while (0) ;
- ((c)) = (__typeof__(*((buf))))__gu_val;
- __gu_err;
- }
-);
+The preprocessor output (edited to become somewhat readable)::
+
+ (
+ {
+ long __gu_err = - 14 , __gu_val = 0;
+ const __typeof__(*( ( buf ) )) *__gu_addr = ((buf));
+ if (((((0 + current_set[0])->tss.segment) == 0x18 ) ||
+ (((sizeof(*(buf))) <= 0xC0000000UL) &&
+ ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf)))))))
+ do {
+ __gu_err = 0;
+ switch ((sizeof(*(buf)))) {
+ case 1:
+ __asm__ __volatile__(
+ "1: mov" "b" " %2,%" "b" "1\n"
+ "2:\n"
+ ".section .fixup,\"ax\"\n"
+ "3: movl %3,%0\n"
+ " xor" "b" " %" "b" "1,%" "b" "1\n"
+ " jmp 2b\n"
+ ".section __ex_table,\"a\"\n"
+ " .align 4\n"
+ " .long 1b,3b\n"
+ ".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *)
+ ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ;
+ break;
+ case 2:
+ __asm__ __volatile__(
+ "1: mov" "w" " %2,%" "w" "1\n"
+ "2:\n"
+ ".section .fixup,\"ax\"\n"
+ "3: movl %3,%0\n"
+ " xor" "w" " %" "w" "1,%" "w" "1\n"
+ " jmp 2b\n"
+ ".section __ex_table,\"a\"\n"
+ " .align 4\n"
+ " .long 1b,3b\n"
+ ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
+ ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err ));
+ break;
+ case 4:
+ __asm__ __volatile__(
+ "1: mov" "l" " %2,%" "" "1\n"
+ "2:\n"
+ ".section .fixup,\"ax\"\n"
+ "3: movl %3,%0\n"
+ " xor" "l" " %" "" "1,%" "" "1\n"
+ " jmp 2b\n"
+ ".section __ex_table,\"a\"\n"
+ " .align 4\n" " .long 1b,3b\n"
+ ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
+ ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err));
+ break;
+ default:
+ (__gu_val) = __get_user_bad();
+ }
+ } while (0) ;
+ ((c)) = (__typeof__(*((buf))))__gu_val;
+ __gu_err;
+ }
+ );

WOW! Black GCC/assembly magic. This is impossible to follow, so let's
-see what code gcc generates:
+see what code gcc generates::

> xorl %edx,%edx
> movl current_set,%eax
@@ -154,7 +160,7 @@ understand. Can we? The actual user access is quite obvious. Thanks
to the unified address space we can just access the address in user
memory. But what does the .section stuff do?????

-To understand this we have to look at the final kernel:
+To understand this we have to look at the final kernel::

> objdump --section-headers vmlinux
>
@@ -181,7 +187,7 @@ To understand this we have to look at the final kernel:

There are obviously 2 non standard ELF sections in the generated object
file. But first we want to find out what happened to our code in the
-final kernel executable:
+final kernel executable::

> objdump --disassemble --section=.text vmlinux
>
@@ -199,7 +205,7 @@ final kernel executable:
The whole user memory access is reduced to 10 x86 machine instructions.
The instructions bracketed in the .section directives are no longer
in the normal execution path. They are located in a different section
-of the executable file:
+of the executable file::

> objdump --disassemble --section=.fixup vmlinux
>
@@ -207,14 +213,15 @@ of the executable file:
> c0199ffa <.fixup+10ba> xorb %dl,%dl
> c0199ffc <.fixup+10bc> jmp c017e7a7 <do_con_write+e3>

-And finally:
+And finally::
+
> objdump --full-contents --section=__ex_table vmlinux
>
> c01aa7c4 93c017c0 e09f19c0 97c017c0 99c017c0 ................
> c01aa7d4 f6c217c0 e99f19c0 a5e717c0 f59f19c0 ................
> c01aa7e4 080a18c0 01a019c0 0a0a18c0 04a019c0 ................

-or in human readable byte order:
+or in human readable byte order::

> c01aa7c4 c017c093 c0199fe0 c017c097 c017c099 ................
> c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................
@@ -222,18 +229,22 @@ or in human readable byte order:
this is the interesting part!
> c01aa7e4 c0180a08 c019a001 c0180a0a c019a004 ................

-What happened? The assembly directives
+What happened? The assembly directives::

-.section .fixup,"ax"
-.section __ex_table,"a"
+ .section .fixup,"ax"
+ .section __ex_table,"a"

told the assembler to move the following code to the specified
-sections in the ELF object file. So the instructions
-3: movl $-14,%eax
- xorb %dl,%dl
- jmp 2b
-ended up in the .fixup section of the object file and the addresses
+sections in the ELF object file. So the instructions::
+
+ 3: movl $-14,%eax
+ xorb %dl,%dl
+ jmp 2b
+
+ended up in the .fixup section of the object file and the addresses::
+
.long 1b,3b
+
ended up in the __ex_table section of the object file. 1b and 3b
are local labels. The local label 1b (1b stands for next label 1
backward) is the address of the instruction that might fault, i.e.
@@ -246,35 +257,39 @@ the fault, in our case the actual value is c0199ff5:
the original assembly code: > 3: movl $-14,%eax
and linked in vmlinux : > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax

-The assembly code
+The assembly code::
+
> .section __ex_table,"a"
> .align 4
> .long 1b,3b

-becomes the value pair
+becomes the value pair::
+
> c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................
^this is ^this is
1b 3b
+
c017e7a5,c0199ff5 in the exception table of the kernel.

So, what actually happens if a fault from kernel mode with no suitable
vma occurs?

-1.) access to invalid address:
- > c017e7a5 <do_con_write+e1> movb (%ebx),%dl
-2.) MMU generates exception
-3.) CPU calls do_page_fault
-4.) do page fault calls search_exception_table (regs->eip == c017e7a5);
-5.) search_exception_table looks up the address c017e7a5 in the
- exception table (i.e. the contents of the ELF section __ex_table)
- and returns the address of the associated fault handle code c0199ff5.
-6.) do_page_fault modifies its own return address to point to the fault
- handle code and returns.
-7.) execution continues in the fault handling code.
-8.) 8a) EAX becomes -EFAULT (== -14)
- 8b) DL becomes zero (the value we "read" from user space)
- 8c) execution continues at local label 2 (address of the
- instruction immediately after the faulting user access).
+#. access to invalid address::
+
+ > c017e7a5 <do_con_write+e1> movb (%ebx),%dl
+#. MMU generates exception
+#. CPU calls do_page_fault
+#. do_page_fault calls search_exception_table (regs->eip == c017e7a5);
+#. search_exception_table looks up the address c017e7a5 in the
+ exception table (i.e. the contents of the ELF section __ex_table)
+ and returns the address of the associated fault handle code c0199ff5.
+#. do_page_fault modifies its own return address to point to the fault
+ handle code and returns.
+#. execution continues in the fault handling code.
+#. a) EAX becomes -EFAULT (== -14)
+ b) DL becomes zero (the value we "read" from user space)
+ c) execution continues at local label 2 (address of the
+ instruction immediately after the faulting user access).

The steps 8a to 8c in a certain way emulate the faulting instruction.

@@ -295,14 +310,15 @@ Things changed when 64-bit support was added to x86 Linux. Rather than
double the size of the exception table by expanding the two entries
from 32-bits to 64 bits, a clever trick was used to store addresses
as relative offsets from the table itself. The assembly code changed
-from:
- .long 1b,3b
-to:
- .long (from) - .
- .long (to) - .
+from::
+
+ .long 1b,3b
+ to:
+ .long (from) - .
+ .long (to) - .

and the C-code that uses these values converts back to absolute addresses
-like this:
+like this::

ex_insn_addr(const struct exception_table_entry *x)
{
@@ -313,15 +329,18 @@ In v4.6 the exception table entry was expanded with a new field "handler".
This is also 32-bits wide and contains a third relative function
pointer which points to one of:

-1) int ex_handler_default(const struct exception_table_entry *fixup)
+1) ``int ex_handler_default(const struct exception_table_entry *fixup)``
This is legacy case that just jumps to the fixup code
-2) int ex_handler_fault(const struct exception_table_entry *fixup)
+
+2) ``int ex_handler_fault(const struct exception_table_entry *fixup)``
This case provides the fault number of the trap that occurred at
entry->insn. It is used to distinguish page faults from machine
check.
-3) int ex_handler_ext(const struct exception_table_entry *fixup)
+
+3) ``int ex_handler_ext(const struct exception_table_entry *fixup)``
This case is used for uaccess_err ... we need to set a flag
in the task structure. Before the handler functions existed this
case was handled by adding a large offset to the fixup to tag
it as special.
+
More functions can easily be added.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 2033791e53bc..c0bfd0bd6000 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -10,3 +10,4 @@ Linux x86 Support

boot
topology
+ exception-tables
--
2.20.1
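
The converted document describes exception table entries that store addresses as 32-bit offsets relative to the table itself (".long (from) - ." and ".long (to) - ."), and a C helper ex_insn_addr() that converts them back. The user-space sketch below mimics that encoding so the round trip can be checked. It is an assumption-laden illustration, not the kernel's code: struct ex_entry, ex_fixup_addr() as written here, and ex_entry_set() are stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's table entry: two 32-bit relative offsets. */
struct ex_entry {
	int32_t insn;   /* (from) - address of this field */
	int32_t fixup;  /* (to)   - address of this field */
};

/* Absolute address of the instruction that may fault. */
static uintptr_t ex_insn_addr(const struct ex_entry *x)
{
	assert(x != NULL);
	return (uintptr_t)((intptr_t)&x->insn + x->insn);
}

/* Absolute address of the fixup code. */
static uintptr_t ex_fixup_addr(const struct ex_entry *x)
{
	assert(x != NULL);
	return (uintptr_t)((intptr_t)&x->fixup + x->fixup);
}

/*
 * Hypothetical encoder: does in C what ".long (from) - ." and
 * ".long (to) - ." do in the assembler. Both targets must lie
 * within +/- 2 GiB of the entry for the offsets to fit in 32 bits.
 */
static void ex_entry_set(struct ex_entry *x, uintptr_t from, uintptr_t to)
{
	assert(x != NULL);
	x->insn = (int32_t)((intptr_t)from - (intptr_t)&x->insn);
	x->fixup = (int32_t)((intptr_t)to - (intptr_t)&x->fixup);
}
```

Because each offset is relative to its own field rather than to the start of the entry, the two fields of one entry encode different offsets even when they point at the same target.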

2019-04-23 16:38:10

by Changbin Du

Subject: [PATCH v4 33/63] Documentation: PCI: convert endpoint/pci-endpoint.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/endpoint/index.rst | 10 ++
.../{pci-endpoint.txt => pci-endpoint.rst} | 95 +++++++++++--------
Documentation/PCI/index.rst | 1 +
3 files changed, 68 insertions(+), 38 deletions(-)
create mode 100644 Documentation/PCI/endpoint/index.rst
rename Documentation/PCI/endpoint/{pci-endpoint.txt => pci-endpoint.rst} (82%)

diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
new file mode 100644
index 000000000000..0db4f2fcd7f0
--- /dev/null
+++ b/Documentation/PCI/endpoint/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+PCI Endpoint Framework
+======================
+
+.. toctree::
+ :maxdepth: 2
+
+ pci-endpoint
diff --git a/Documentation/PCI/endpoint/pci-endpoint.txt b/Documentation/PCI/endpoint/pci-endpoint.rst
similarity index 82%
rename from Documentation/PCI/endpoint/pci-endpoint.txt
rename to Documentation/PCI/endpoint/pci-endpoint.rst
index e86a96b66a6a..6674ce5425bf 100644
--- a/Documentation/PCI/endpoint/pci-endpoint.txt
+++ b/Documentation/PCI/endpoint/pci-endpoint.rst
@@ -1,11 +1,17 @@
- PCI ENDPOINT FRAMEWORK
- Kishon Vijay Abraham I <[email protected]>
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+PCI Endpoint Framework
+======================
+
+:Author: Kishon Vijay Abraham I <[email protected]>

This document is a guide to use the PCI Endpoint Framework in order to create
endpoint controller driver, endpoint function driver, and using configfs
interface to bind the function driver to the controller driver.

-1. Introduction
+Introduction
+============

Linux has a comprehensive PCI subsystem to support PCI controllers that
operates in Root Complex mode. The subsystem has capability to scan PCI bus,
@@ -19,24 +25,27 @@ add endpoint mode support in Linux. This will help to run Linux in an
EP system which can have a wide variety of use cases from testing or
validation, co-processor accelerator, etc.

-2. PCI Endpoint Core
+PCI Endpoint Core
+=================

The PCI Endpoint Core layer comprises 3 components: the Endpoint Controller
library, the Endpoint Function library, and the configfs layer to bind the
endpoint function with the endpoint controller.

-2.1 PCI Endpoint Controller(EPC) Library
+PCI Endpoint Controller(EPC) Library
+------------------------------------

The EPC library provides APIs to be used by the controller that can operate
in endpoint mode. It also provides APIs to be used by function driver/library
in order to implement a particular endpoint function.

-2.1.1 APIs for the PCI controller Driver
+APIs for the PCI controller Driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section lists the APIs that the PCI Endpoint core provides to be used
by the PCI controller driver.

-*) devm_pci_epc_create()/pci_epc_create()
+* devm_pci_epc_create()/pci_epc_create()

The PCI controller driver should implement the following ops:
* write_header: ops to populate configuration space header
@@ -51,110 +60,116 @@ by the PCI controller driver.
The PCI controller driver can then create a new EPC device by invoking
devm_pci_epc_create()/pci_epc_create().

-*) devm_pci_epc_destroy()/pci_epc_destroy()
+* devm_pci_epc_destroy()/pci_epc_destroy()

The PCI controller driver can destroy the EPC device created by either
devm_pci_epc_create() or pci_epc_create() using devm_pci_epc_destroy() or
pci_epc_destroy().

-*) pci_epc_linkup()
+* pci_epc_linkup()

In order to notify all the function devices that the EPC device to which
they are linked has established a link with the host, the PCI controller
driver should invoke pci_epc_linkup().

-*) pci_epc_mem_init()
+* pci_epc_mem_init()

Initialize the pci_epc_mem structure used for allocating EPC addr space.

-*) pci_epc_mem_exit()
+* pci_epc_mem_exit()

Cleanup the pci_epc_mem structure allocated during pci_epc_mem_init().

-2.1.2 APIs for the PCI Endpoint Function Driver
+
+APIs for the PCI Endpoint Function Driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section lists the APIs that the PCI Endpoint core provides to be used
by the PCI endpoint function driver.

-*) pci_epc_write_header()
+* pci_epc_write_header()

The PCI endpoint function driver should use pci_epc_write_header() to
write the standard configuration header to the endpoint controller.

-*) pci_epc_set_bar()
+* pci_epc_set_bar()

The PCI endpoint function driver should use pci_epc_set_bar() to configure
the Base Address Register in order for the host to assign PCI addr space.
Register space of the function driver is usually configured
using this API.

-*) pci_epc_clear_bar()
+* pci_epc_clear_bar()

The PCI endpoint function driver should use pci_epc_clear_bar() to reset
the BAR.

-*) pci_epc_raise_irq()
+* pci_epc_raise_irq()

The PCI endpoint function driver should use pci_epc_raise_irq() to raise
Legacy Interrupt, MSI or MSI-X Interrupt.

-*) pci_epc_mem_alloc_addr()
+* pci_epc_mem_alloc_addr()

The PCI endpoint function driver should use pci_epc_mem_alloc_addr(), to
allocate memory address from EPC addr space which is required to access
RC's buffer

-*) pci_epc_mem_free_addr()
+* pci_epc_mem_free_addr()

The PCI endpoint function driver should use pci_epc_mem_free_addr() to
free the memory space allocated using pci_epc_mem_alloc_addr().

-2.1.3 Other APIs
+Other APIs
+~~~~~~~~~~

There are other APIs provided by the EPC library. These are used for binding
the EPF device with EPC device. pci-ep-cfs.c can be used as reference for
using these APIs.

-*) pci_epc_get()
+* pci_epc_get()

Get a reference to the PCI endpoint controller based on the device name of
the controller.

-*) pci_epc_put()
+* pci_epc_put()

Release the reference to the PCI endpoint controller obtained using
pci_epc_get()

-*) pci_epc_add_epf()
+* pci_epc_add_epf()

Add a PCI endpoint function to a PCI endpoint controller. A PCIe device
can have up to 8 functions according to the specification.

-*) pci_epc_remove_epf()
+* pci_epc_remove_epf()

Remove the PCI endpoint function from PCI endpoint controller.

-*) pci_epc_start()
+* pci_epc_start()

The PCI endpoint function driver should invoke pci_epc_start() once it
has configured the endpoint function and wants to start the PCI link.

-*) pci_epc_stop()
+* pci_epc_stop()

The PCI endpoint function driver should invoke pci_epc_stop() to stop
the PCI LINK.

-2.2 PCI Endpoint Function(EPF) Library
+
+PCI Endpoint Function(EPF) Library
+----------------------------------

The EPF library provides APIs to be used by the function driver and the EPC
library to provide endpoint mode functionality.

-2.2.1 APIs for the PCI Endpoint Function Driver
+APIs for the PCI Endpoint Function Driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section lists the APIs that the PCI Endpoint core provides to be used
by the PCI endpoint function driver.

-*) pci_epf_register_driver()
+* pci_epf_register_driver()

The PCI Endpoint Function driver should implement the following ops:
* bind: ops to perform when a EPC device has been bound to EPF device
@@ -166,50 +181,54 @@ by the PCI endpoint function driver.
The PCI Function driver can then register the PCI EPF driver by using
pci_epf_register_driver().

-*) pci_epf_unregister_driver()
+* pci_epf_unregister_driver()

The PCI Function driver can unregister the PCI EPF driver by using
pci_epf_unregister_driver().

-*) pci_epf_alloc_space()
+* pci_epf_alloc_space()

The PCI Function driver can allocate space for a particular BAR using
pci_epf_alloc_space().

-*) pci_epf_free_space()
+* pci_epf_free_space()

The PCI Function driver can free the allocated space
(using pci_epf_alloc_space) by invoking pci_epf_free_space().

-2.2.2 APIs for the PCI Endpoint Controller Library
+APIs for the PCI Endpoint Controller Library
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
This section lists the APIs that the PCI Endpoint core provides to be used
by the PCI endpoint controller library.

-*) pci_epf_linkup()
+* pci_epf_linkup()

The PCI endpoint controller library invokes pci_epf_linkup() when the
EPC device has established the connection to the host.

-2.2.2 Other APIs
+Other APIs
+~~~~~~~~~~
+
There are other APIs provided by the EPF library. These are used to notify
the function driver when the EPF device is bound to the EPC device.
pci-ep-cfs.c can be used as reference for using these APIs.

-*) pci_epf_create()
+* pci_epf_create()

Create a new PCI EPF device by passing the name of the PCI EPF device.
This name will be used to bind the the EPF device to a EPF driver.

-*) pci_epf_destroy()
+* pci_epf_destroy()

Destroy the created PCI EPF device.

-*) pci_epf_bind()
+* pci_epf_bind()

pci_epf_bind() should be invoked when the EPF device has been bound to
a EPC device.

-*) pci_epf_unbind()
+* pci_epf_unbind()

pci_epf_unbind() should be invoked when the binding between EPC device
and EPF device is lost.
diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
index 86c76c22810b..c8ea2e626c20 100644
--- a/Documentation/PCI/index.rst
+++ b/Documentation/PCI/index.rst
@@ -15,3 +15,4 @@ Linux PCI Bus Subsystem
acpi-info
pci-error-recovery
pcieaer-howto
+ endpoint/index
--
2.20.1

2019-04-23 16:38:19

by Changbin Du

Subject: [PATCH v4 34/63] Documentation: PCI: convert endpoint/pci-endpoint-cfs.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
Documentation/PCI/endpoint/index.rst | 1 +
...-endpoint-cfs.txt => pci-endpoint-cfs.rst} | 99 +++++++++++--------
2 files changed, 57 insertions(+), 43 deletions(-)
rename Documentation/PCI/endpoint/{pci-endpoint-cfs.txt => pci-endpoint-cfs.rst} (64%)

diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
index 0db4f2fcd7f0..3951de9f923c 100644
--- a/Documentation/PCI/endpoint/index.rst
+++ b/Documentation/PCI/endpoint/index.rst
@@ -8,3 +8,4 @@ PCI Endpoint Framework
:maxdepth: 2

pci-endpoint
+ pci-endpoint-cfs
diff --git a/Documentation/PCI/endpoint/pci-endpoint-cfs.txt b/Documentation/PCI/endpoint/pci-endpoint-cfs.rst
similarity index 64%
rename from Documentation/PCI/endpoint/pci-endpoint-cfs.txt
rename to Documentation/PCI/endpoint/pci-endpoint-cfs.rst
index d740f29960a4..b6d39cdec56e 100644
--- a/Documentation/PCI/endpoint/pci-endpoint-cfs.txt
+++ b/Documentation/PCI/endpoint/pci-endpoint-cfs.rst
@@ -1,41 +1,51 @@
- CONFIGURING PCI ENDPOINT USING CONFIGFS
- Kishon Vijay Abraham I <[email protected]>
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================
+Configuring PCI Endpoint Using CONFIGFS
+=======================================
+
+:Author: Kishon Vijay Abraham I <[email protected]>

The PCI Endpoint Core exposes configfs entry (pci_ep) to configure the
PCI endpoint function and to bind the endpoint function
with the endpoint controller. (For introducing other mechanisms to
configure the PCI Endpoint Function refer to [1]).

-*) Mounting configfs
+Mounting configfs
+=================

The PCI Endpoint Core layer creates pci_ep directory in the mounted configfs
-directory. configfs can be mounted using the following command.
+directory. configfs can be mounted using the following command::

mount -t configfs none /sys/kernel/config

-*) Directory Structure
+Directory Structure
+===================

The pci_ep configfs has two directories at its root: controllers and
functions. Every EPC device present in the system will have an entry in
the *controllers* directory and and every EPF driver present in the system
will have an entry in the *functions* directory.
+::

-/sys/kernel/config/pci_ep/
- .. controllers/
- .. functions/
+ /sys/kernel/config/pci_ep/
+ .. controllers/
+ .. functions/

-*) Creating EPF Device
+Creating EPF Device
+===================

Every registered EPF driver will be listed in controllers directory. The
entries corresponding to EPF driver will be created by the EPF core.
+::

-/sys/kernel/config/pci_ep/functions/
- .. <EPF Driver1>/
- ... <EPF Device 11>/
- ... <EPF Device 21>/
- .. <EPF Driver2>/
- ... <EPF Device 12>/
- ... <EPF Device 22>/
+ /sys/kernel/config/pci_ep/functions/
+ .. <EPF Driver1>/
+ ... <EPF Device 11>/
+ ... <EPF Device 21>/
+ .. <EPF Driver2>/
+ ... <EPF Device 12>/
+ ... <EPF Device 22>/

In order to create a <EPF device> of the type probed by <EPF Driver>, the
user has to create a directory inside <EPF DriverN>.
@@ -44,34 +54,37 @@ Every <EPF device> directory consists of the following entries that can be
used to configure the standard configuration header of the endpoint function.
(These entries are created by the framework when any new <EPF Device> is
created)
-
- .. <EPF Driver1>/
- ... <EPF Device 11>/
- ... vendorid
- ... deviceid
- ... revid
- ... progif_code
- ... subclass_code
- ... baseclass_code
- ... cache_line_size
- ... subsys_vendor_id
- ... subsys_id
- ... interrupt_pin
-
-*) EPC Device
+::
+
+ .. <EPF Driver1>/
+ ... <EPF Device 11>/
+ ... vendorid
+ ... deviceid
+ ... revid
+ ... progif_code
+ ... subclass_code
+ ... baseclass_code
+ ... cache_line_size
+ ... subsys_vendor_id
+ ... subsys_id
+ ... interrupt_pin
+
+EPC Device
+==========

Every registered EPC device will be listed in controllers directory. The
entries corresponding to EPC device will be created by the EPC core.
-
-/sys/kernel/config/pci_ep/controllers/
- .. <EPC Device1>/
- ... <Symlink EPF Device11>/
- ... <Symlink EPF Device12>/
- ... start
- .. <EPC Device2>/
- ... <Symlink EPF Device21>/
- ... <Symlink EPF Device22>/
- ... start
+::
+
+ /sys/kernel/config/pci_ep/controllers/
+ .. <EPC Device1>/
+ ... <Symlink EPF Device11>/
+ ... <Symlink EPF Device12>/
+ ... start
+ .. <EPC Device2>/
+ ... <Symlink EPF Device21>/
+ ... <Symlink EPF Device22>/
+ ... start

The <EPC Device> directory will have a list of symbolic links to
<EPF Device>. These symbolic links should be created by the user to
@@ -81,7 +94,7 @@ The <EPC Device> directory will also have a *start* field. Once
"1" is written to this field, the endpoint device will be ready to
establish the link with the host. This is usually done after
all the EPF devices are created and linked with the EPC device.
-
+::

| controllers/
| <Directory: EPC name>/
@@ -102,4 +115,4 @@ all the EPF devices are created and linked with the EPC device.
| interrupt_pin
| function

-[1] -> Documentation/PCI/endpoint/pci-endpoint.txt
+[1] :doc:`pci-endpoint`
--
2.20.1
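
As a rough illustration of the configfs flow this document describes (create an
EPF device directory, link it into an EPC device, then write "1" to start), here
is a minimal Python sketch. The function name and the driver/device names used
in the usage note are hypothetical, and the root path is a parameter so the
sketch can be tried against a scratch directory instead of a real
/sys/kernel/config/pci_ep.

```python
import os


def setup_endpoint(root, epf_driver, epf_device, epc_device, vendorid=None):
    """Create an EPF device, bind it to an EPC device and start the link.

    `root` stands in for /sys/kernel/config/pci_ep. All names passed in are
    illustrative assumptions, not values taken from the patch.
    """
    # Creating a directory inside <EPF Driver> creates the <EPF Device>.
    func_dir = os.path.join(root, "functions", epf_driver, epf_device)
    os.mkdir(func_dir)

    # Optionally fill in one of the configuration header entries.
    if vendorid is not None:
        with open(os.path.join(func_dir, "vendorid"), "w") as f:
            f.write(f"{vendorid:#x}\n")

    # The user links the EPF device into the <EPC Device> directory.
    ctrl_dir = os.path.join(root, "controllers", epc_device)
    link = os.path.join(ctrl_dir, epf_device)
    os.symlink(func_dir, link)

    # Writing "1" to start makes the endpoint ready to establish the link.
    with open(os.path.join(ctrl_dir, "start"), "w") as f:
        f.write("1\n")
    return func_dir, link
```

On a real system the call might look like
setup_endpoint("/sys/kernel/config/pci_ep", "pci_epf_test", "func1", "epc0"),
where all three names are placeholders for whatever the running kernel exposes.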

2019-04-23 16:38:20

by Changbin Du

Subject: [PATCH v4 41/63] Documentation: x86: convert kernel-stacks to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
.../x86/{kernel-stacks => kernel-stacks.rst} | 20 ++++++++++++-------
2 files changed, 14 insertions(+), 7 deletions(-)
rename Documentation/x86/{kernel-stacks => kernel-stacks.rst} (92%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index c0bfd0bd6000..489f4f4179c4 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -11,3 +11,4 @@ Linux x86 Support
boot
topology
exception-tables
+ kernel-stacks
diff --git a/Documentation/x86/kernel-stacks b/Documentation/x86/kernel-stacks.rst
similarity index 92%
rename from Documentation/x86/kernel-stacks
rename to Documentation/x86/kernel-stacks.rst
index 9a0aa4d3a866..3e6bf5940db0 100644
--- a/Documentation/x86/kernel-stacks
+++ b/Documentation/x86/kernel-stacks.rst
@@ -1,5 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+Kernel Stacks
+=============
+
Kernel stacks on x86-64 bit
----------------------------
+===========================

Most of the text from Keith Owens, hacked by AK

@@ -57,7 +63,7 @@ IST events with the same code to be nested. However in most cases, the
stack size allocated to an IST assumes no nesting for the same code.
If that assumption is ever broken then the stacks will become corrupt.

-The currently assigned IST stacks are :-
+The currently assigned IST stacks are:

* DOUBLEFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE).

@@ -98,7 +104,7 @@ For more details see the Intel IA32 or AMD AMD64 architecture manuals.


Printing backtraces on x86
---------------------------
+==========================

The question about the '?' preceding function names in an x86 stacktrace
keeps popping up, here's an indepth explanation. It helps if the reader
@@ -108,7 +114,7 @@ arch/x86/kernel/dumpstack.c.
Adapted from Ingo's mail, Message-ID: <[email protected]>:

We always scan the full kernel stack for return addresses stored on
-the kernel stack(s) [*], from stack top to stack bottom, and print out
+the kernel stack(s) [1]_, from stack top to stack bottom, and print out
anything that 'looks like' a kernel text address.

If it fits into the frame pointer chain, we print it without a question
@@ -136,6 +142,6 @@ that look like kernel text addresses, so if debug information is wrong,
we still print out the real call chain as well - just with more question
marks than ideal.

-[*] For things like IRQ and IST stacks, we also scan those stacks, in
- the right order, and try to cross from one stack into another
- reconstructing the call chain. This works most of the time.
+.. [1] For things like IRQ and IST stacks, we also scan those stacks, in
+ the right order, and try to cross from one stack into another
+ reconstructing the call chain. This works most of the time.
--
2.20.1

2019-04-23 16:38:29

by Changbin Du

Subject: [PATCH v4 42/63] Documentation: x86: convert entry_64.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/{entry_64.txt => entry_64.rst} | 12 +++++++++---
Documentation/x86/index.rst | 1 +
2 files changed, 10 insertions(+), 3 deletions(-)
rename Documentation/x86/{entry_64.txt => entry_64.rst} (95%)

diff --git a/Documentation/x86/entry_64.txt b/Documentation/x86/entry_64.rst
similarity index 95%
rename from Documentation/x86/entry_64.txt
rename to Documentation/x86/entry_64.rst
index c1df8eba9dfd..a48b3f6ebbe8 100644
--- a/Documentation/x86/entry_64.txt
+++ b/Documentation/x86/entry_64.rst
@@ -1,3 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+Kernel Entries
+==============
+
This file documents some of the kernel entries in
arch/x86/entry/entry_64.S. A lot of this explanation is adapted from
an email from Ingo Molnar:
@@ -59,7 +65,7 @@ Now, there's a secondary complication: there's a cheap way to test
which mode the CPU is in and an expensive way.

The cheap way is to pick this info off the entry frame on the kernel
-stack, from the CS of the ptregs area of the kernel stack:
+stack, from the CS of the ptregs area of the kernel stack::

xorl %ebx,%ebx
testl $3,CS+8(%rsp)
@@ -67,7 +73,7 @@ stack, from the CS of the ptregs area of the kernel stack:
SWAPGS

The expensive (paranoid) way is to read back the MSR_GS_BASE value
-(which is what SWAPGS modifies):
+(which is what SWAPGS modifies)::

movl $1,%ebx
movl $MSR_GS_BASE,%ecx
@@ -76,7 +82,7 @@ The expensive (paranoid) way is to read back the MSR_GS_BASE value
js 1f /* negative -> in kernel */
SWAPGS
xorl %ebx,%ebx
-1: ret
+ 1: ret

If we are at an interrupt or user-trap/gate-alike boundary then we can
use the faster check: the stack will be a reliable indicator of
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 489f4f4179c4..8a666c5abc85 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -12,3 +12,4 @@ Linux x86 Support
topology
exception-tables
kernel-stacks
+ entry_64
--
2.20.1
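
The "negative -> in kernel" comment in the paranoid path relies on kernel
addresses having their top bit set. A minimal Python sketch of that sign check,
assuming the 64-bit value has already been read from MSR_GS_BASE. The function
name and the sample addresses in the usage note are illustrative and not taken
from the kernel source.

```python
def gs_base_is_kernel(gs_base):
    """Return True if a 64-bit GS base value looks like a kernel address.

    Only the upper 32 bits are inspected, and the value counts as
    "negative" when bit 31 of that upper half (bit 63 overall) is set.
    """
    high = (gs_base >> 32) & 0xFFFFFFFF
    return bool(high & 0x80000000)
```

For example, gs_base_is_kernel(0xFFFF888000000000) is True, while
gs_base_is_kernel(0x00007F1234560000) is False, which is the case where the
sketch would correspond to executing SWAPGS.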

2019-04-23 16:38:47

by Changbin Du

Subject: [PATCH v4 44/63] Documentation: x86: convert zero-page.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
Documentation/x86/zero-page.rst | 47 +++++++++++++++++++++++++++++++++
Documentation/x86/zero-page.txt | 40 ----------------------------
3 files changed, 48 insertions(+), 40 deletions(-)
create mode 100644 Documentation/x86/zero-page.rst
delete mode 100644 Documentation/x86/zero-page.txt

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 7b8388ebd43d..9a0b5f38ef6b 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -14,3 +14,4 @@ Linux x86 Support
kernel-stacks
entry_64
earlyprintk
+ zero-page
diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst
new file mode 100644
index 000000000000..deedbc84454d
--- /dev/null
+++ b/Documentation/x86/zero-page.rst
@@ -0,0 +1,47 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========
+Zero Page
+=========
+
+The additional fields in struct boot_params as a part of 32-bit boot
+protocol of kernel. These should be filled by bootloader or 16-bit
+real-mode setup code of the kernel. References/settings to it mainly
+are in::
+
+ arch/x86/include/uapi/asm/bootparam.h
+
+::
+
+ Offset Proto Name Meaning
+ /Size
+
+ 000/040 ALL screen_info Text mode or frame buffer information
+ (struct screen_info)
+ 040/014 ALL apm_bios_info APM BIOS information (struct apm_bios_info)
+ 058/008 ALL tboot_addr Physical address of tboot shared page
+ 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
+ (struct ist_info)
+ 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
+ 090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
+ 0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
+ OBSOLETE!!
+ 0B0/010 ALL olpc_ofw_header OLPC's OpenFirmware CIF and friends
+ 0C0/004 ALL ext_ramdisk_image ramdisk_image high 32bits
+ 0C4/004 ALL ext_ramdisk_size ramdisk_size high 32bits
+ 0C8/004 ALL ext_cmd_line_ptr cmd_line_ptr high 32bits
+ 140/080 ALL edid_info Video mode setup (struct edid_info)
+ 1C0/020 ALL efi_info EFI 32 information (struct efi_info)
+ 1E0/004 ALL alt_mem_k Alternative mem check, in KB
+ 1E4/004 ALL scratch Scratch field for the kernel setup code
+ 1E8/001 ALL e820_entries Number of entries in e820_table (below)
+ 1E9/001 ALL eddbuf_entries Number of entries in eddbuf (below)
+ 1EA/001 ALL edd_mbr_sig_buf_entries Number of entries in edd_mbr_sig_buffer
+ (below)
+ 1EB/001 ALL kbd_status Numlock is enabled
+ 1EC/001 ALL secure_boot Secure boot is enabled in the firmware
+ 1EF/001 ALL sentinel Used to detect broken bootloaders
+ 290/040 ALL edd_mbr_sig_buffer EDD MBR signatures
+ 2D0/A00 ALL e820_table E820 memory map table
+ (array of struct e820_entry)
+ D00/1EC ALL eddbuf EDD data (array of struct edd_info)
diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
deleted file mode 100644
index 68aed077f7b6..000000000000
--- a/Documentation/x86/zero-page.txt
+++ /dev/null
@@ -1,40 +0,0 @@
-The additional fields in struct boot_params as a part of 32-bit boot
-protocol of kernel. These should be filled by bootloader or 16-bit
-real-mode setup code of the kernel. References/settings to it mainly
-are in:
-
- arch/x86/include/uapi/asm/bootparam.h
-
-
-Offset Proto Name Meaning
-/Size
-
-000/040 ALL screen_info Text mode or frame buffer information
- (struct screen_info)
-040/014 ALL apm_bios_info APM BIOS information (struct apm_bios_info)
-058/008 ALL tboot_addr Physical address of tboot shared page
-060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
- (struct ist_info)
-080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
-090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
-0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
- OBSOLETE!!
-0B0/010 ALL olpc_ofw_header OLPC's OpenFirmware CIF and friends
-0C0/004 ALL ext_ramdisk_image ramdisk_image high 32bits
-0C4/004 ALL ext_ramdisk_size ramdisk_size high 32bits
-0C8/004 ALL ext_cmd_line_ptr cmd_line_ptr high 32bits
-140/080 ALL edid_info Video mode setup (struct edid_info)
-1C0/020 ALL efi_info EFI 32 information (struct efi_info)
-1E0/004 ALL alt_mem_k Alternative mem check, in KB
-1E4/004 ALL scratch Scratch field for the kernel setup code
-1E8/001 ALL e820_entries Number of entries in e820_table (below)
-1E9/001 ALL eddbuf_entries Number of entries in eddbuf (below)
-1EA/001 ALL edd_mbr_sig_buf_entries Number of entries in edd_mbr_sig_buffer
- (below)
-1EB/001 ALL kbd_status Numlock is enabled
-1EC/001 ALL secure_boot Secure boot is enabled in the firmware
-1EF/001 ALL sentinel Used to detect broken bootloaders
-290/040 ALL edd_mbr_sig_buffer EDD MBR signatures
-2D0/A00 ALL e820_table E820 memory map table
- (array of struct e820_entry)
-D00/1EC ALL eddbuf EDD data (array of struct edd_info)
--
2.20.1
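
The "Offset/Size" column of the zero-page table holds two hexadecimal numbers.
A small Python sketch of how those fields can be decoded and checked for
overlaps follows. The helper names are hypothetical, and only a few rows from
the table above are used as sample data.

```python
def parse_field(spec):
    """Split an 'Offset/Size' entry such as '2D0/A00' into two integers."""
    offset_hex, size_hex = spec.split("/")
    return int(offset_hex, 16), int(size_hex, 16)


def find_overlaps(rows):
    """Return pairs of field names whose byte ranges overlap.

    `rows` is a list of (offset_size_spec, name) tuples.
    """
    spans = sorted((parse_field(spec) + (name,)) for spec, name in rows)
    overlaps = []
    for (off_a, size_a, name_a), (off_b, _, name_b) in zip(spans, spans[1:]):
        if off_a + size_a > off_b:
            overlaps.append((name_a, name_b))
    return overlaps


# A few rows copied from the table in this patch.
SAMPLE_ROWS = [
    ("000/040", "screen_info"),
    ("040/014", "apm_bios_info"),
    ("058/008", "tboot_addr"),
    ("2D0/A00", "e820_table"),
    ("D00/1EC", "eddbuf"),
]
```

With these rows, e820_table spans 0x2D0 up to (but not including) 0xCD0, so it
ends before eddbuf begins at 0xD00, and find_overlaps(SAMPLE_ROWS) returns an
empty list.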

2019-04-23 16:38:58

by Changbin Du

Subject: [PATCH v4 45/63] Documentation: x86: convert tlb.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
Documentation/x86/{tlb.txt => tlb.rst} | 30 ++++++++++++++++----------
2 files changed, 20 insertions(+), 11 deletions(-)
rename Documentation/x86/{tlb.txt => tlb.rst} (81%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 9a0b5f38ef6b..fd54b859db9b 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -15,3 +15,4 @@ Linux x86 Support
entry_64
earlyprintk
zero-page
+ tlb
diff --git a/Documentation/x86/tlb.txt b/Documentation/x86/tlb.rst
similarity index 81%
rename from Documentation/x86/tlb.txt
rename to Documentation/x86/tlb.rst
index 6a0607b99ed8..82ec58ae63a8 100644
--- a/Documentation/x86/tlb.txt
+++ b/Documentation/x86/tlb.rst
@@ -1,5 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======
+The TLB
+=======
+
When the kernel unmaps or modified the attributes of a range of
memory, it has two choices:
+
1. Flush the entire TLB with a two-instruction sequence. This is
a quick operation, but it causes collateral damage: TLB entries
from areas other than the one we are trying to flush will be
@@ -10,6 +17,7 @@ memory, it has two choices:
damage to other TLB entries.

Which method to do depends on a few things:
+
1. The size of the flush being performed. A flush of the entire
address space is obviously better performed by flushing the
entire TLB than doing 2^48/PAGE_SIZE individual flushes.
@@ -33,7 +41,7 @@ well. There is essentially no "right" point to choose.
You may be doing too many individual invalidations if you see the
invlpg instruction (or instructions _near_ it) show up high in
profiles. If you believe that individual invalidations being
-called too often, you can lower the tunable:
+called too often, you can lower the tunable::

/sys/kernel/debug/x86/tlb_single_page_flush_ceiling

@@ -43,7 +51,7 @@ Setting it to 1 is a very conservative setting and it should
never need to be 0 under normal circumstances.

Despite the fact that a single individual flush on x86 is
-guaranteed to flush a full 2MB [1], hugetlbfs always uses the full
+guaranteed to flush a full 2MB [1]_, hugetlbfs always uses the full
flushes. THP is treated exactly the same as normal memory.

You might see invlpg inside of flush_tlb_mm_range() show up in
@@ -54,15 +62,15 @@ Essentially, you are balancing the cycles you spend doing invlpg
with the cycles that you spend refilling the TLB later.

You can measure how expensive TLB refills are by using
-performance counters and 'perf stat', like this:
+performance counters and 'perf stat', like this::

-perf stat -e
- cpu/event=0x8,umask=0x84,name=dtlb_load_misses_walk_duration/,
- cpu/event=0x8,umask=0x82,name=dtlb_load_misses_walk_completed/,
- cpu/event=0x49,umask=0x4,name=dtlb_store_misses_walk_duration/,
- cpu/event=0x49,umask=0x2,name=dtlb_store_misses_walk_completed/,
- cpu/event=0x85,umask=0x4,name=itlb_misses_walk_duration/,
- cpu/event=0x85,umask=0x2,name=itlb_misses_walk_completed/
+ perf stat -e
+ cpu/event=0x8,umask=0x84,name=dtlb_load_misses_walk_duration/,
+ cpu/event=0x8,umask=0x82,name=dtlb_load_misses_walk_completed/,
+ cpu/event=0x49,umask=0x4,name=dtlb_store_misses_walk_duration/,
+ cpu/event=0x49,umask=0x2,name=dtlb_store_misses_walk_completed/,
+ cpu/event=0x85,umask=0x4,name=itlb_misses_walk_duration/,
+ cpu/event=0x85,umask=0x2,name=itlb_misses_walk_completed/

That works on an IvyBridge-era CPU (i5-3320M). Different CPUs
may have differently-named counters, but they should at least
@@ -70,6 +78,6 @@ be there in some form. You can use pmu-tools 'ocperf list'
(https://github.com/andikleen/pmu-tools) to find the right
counters for a given CPU.

-1. A footnote in Intel's SDM "4.10.4.2 Recommended Invalidation"
+.. [1] A footnote in Intel's SDM "4.10.4.2 Recommended Invalidation"
says: "One execution of INVLPG is sufficient even for a page
with size greater than 4 KBytes."
--
2.20.1
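
Two pieces of arithmetic in the TLB document can be made concrete with a short
Python sketch: the number of individual flushes needed for a whole address
space (2^48/PAGE_SIZE), and an average refill cost derived from a
walk_duration/walk_completed counter pair. The function names are hypothetical,
a 4 KiB page size is assumed, and treating walk_duration as a cycle count is an
assumption about the counters rather than something the document states.

```python
PAGE_SIZE = 4096  # assumed 4 KiB base pages


def individual_flushes(address_bits=48, page_size=PAGE_SIZE):
    """Number of per-page invalidations needed to cover the address space."""
    return (1 << address_bits) // page_size


def avg_walk_cost(walk_duration, walk_completed):
    """Average walk_duration units per completed page walk.

    Returns 0.0 when no walks completed, to avoid dividing by zero.
    """
    if walk_completed == 0:
        return 0.0
    return walk_duration / walk_completed
```

With the defaults, individual_flushes() is 2**36, which illustrates why a full
flush is the better choice for an entire address space.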

2019-04-23 16:39:09

by Changbin Du

Subject: [PATCH v4 47/63] Documentation: x86: convert pat.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
Documentation/x86/pat.rst | 235 ++++++++++++++++++++++++++++++++++++
Documentation/x86/pat.txt | 230 -----------------------------------
3 files changed, 236 insertions(+), 230 deletions(-)
create mode 100644 Documentation/x86/pat.rst
delete mode 100644 Documentation/x86/pat.txt

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index d805962a7238..e06b5c0ea883 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -17,3 +17,4 @@ Linux x86 Support
zero-page
tlb
mtrr
+ pat
diff --git a/Documentation/x86/pat.rst b/Documentation/x86/pat.rst
new file mode 100644
index 000000000000..bf09cab2e0bf
--- /dev/null
+++ b/Documentation/x86/pat.rst
@@ -0,0 +1,235 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+PAT (Page Attribute Table)
+==========================
+
+x86 Page Attribute Table (PAT) allows for setting the memory attribute at the
+page level granularity. PAT is complementary to the MTRR settings which allows
+for setting of memory types over physical address ranges. However, PAT is
+more flexible than MTRR due to its capability to set attributes at page level
+and also due to the fact that there are no hardware limitations on number of
+such attribute settings allowed. Added flexibility comes with guidelines for
+not having memory type aliasing for the same physical memory with multiple
+virtual addresses.
+
+PAT allows for different types of memory attributes. The most commonly used
+ones that will be supported at this time are Write-back, Uncached,
+Write-combined, Write-through and Uncached Minus.
+
+
+PAT APIs
+========
+
+There are many different APIs in the kernel that allow setting of memory
+attributes at the page level. In order to avoid aliasing, these interfaces
+should be used thoughtfully. Below is a table of interfaces available,
+their intended usage and their memory attribute relationships. Internally,
+these APIs use a reserve_memtype()/free_memtype() interface on the physical
+address range to avoid any aliasing.
+::
+
+ -------------------------------------------------------------------
+ API | RAM | ACPI,... | Reserved/Holes |
+ -----------------------|----------|------------|------------------|
+ | | | |
+ ioremap | -- | UC- | UC- |
+ | | | |
+ ioremap_cache | -- | WB | WB |
+ | | | |
+ ioremap_uc | -- | UC | UC |
+ | | | |
+ ioremap_nocache | -- | UC- | UC- |
+ | | | |
+ ioremap_wc | -- | -- | WC |
+ | | | |
+ ioremap_wt | -- | -- | WT |
+ | | | |
+ set_memory_uc | UC- | -- | -- |
+ set_memory_wb | | | |
+ | | | |
+ set_memory_wc | WC | -- | -- |
+ set_memory_wb | | | |
+ | | | |
+ set_memory_wt | WT | -- | -- |
+ set_memory_wb | | | |
+ | | | |
+ pci sysfs resource | -- | -- | UC- |
+ | | | |
+ pci sysfs resource_wc | -- | -- | WC |
+ is IORESOURCE_PREFETCH | | | |
+ | | | |
+ pci proc | -- | -- | UC- |
+ !PCIIOC_WRITE_COMBINE | | | |
+ | | | |
+ pci proc | -- | -- | WC |
+ PCIIOC_WRITE_COMBINE | | | |
+ | | | |
+ /dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
+ read-write | | | |
+ | | | |
+ /dev/mem | -- | UC- | UC- |
+ mmap SYNC flag | | | |
+ | | | |
+ /dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
+ mmap !SYNC flag | |(from exist-| (from exist- |
+ and | | ing alias)| ing alias) |
+ any alias to this area | | | |
+ | | | |
+ /dev/mem | -- | WB | WB |
+ mmap !SYNC flag | | | |
+ no alias to this area | | | |
+ and | | | |
+ MTRR says WB | | | |
+ | | | |
+ /dev/mem | -- | -- | UC- |
+ mmap !SYNC flag | | | |
+ no alias to this area | | | |
+ and | | | |
+ MTRR says !WB | | | |
+ | | | |
+ -------------------------------------------------------------------
+
+Advanced APIs for drivers
+=========================
+
+A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
+vmf_insert_pfn.
+
+Drivers wanting to export some pages to userspace do it by using mmap
+interface and a combination of:
+
+ 1) pgprot_noncached()
+ 2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()
+
+With PAT support, a new API pgprot_writecombine is being added. So, drivers can
+continue to use the above sequence, with either pgprot_noncached() or
+pgprot_writecombine() in step 1, followed by step 2.
+
+In addition, step 2 internally tracks the region as UC or WC in memtype
+list in order to ensure no conflicting mapping.
+
+Note that this set of APIs only works with IO (non RAM) regions. If driver
+wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
+as step 0 above and also track the usage of those pages and use set_memory_wb()
+before the page is freed to free pool.
+
+MTRR effects on PAT / non-PAT systems
+=====================================
+
+The following table provides the effects of using write-combining MTRRs when
+using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally
+mtrr_add() usage will be phased out in favor of arch_phys_wc_add() which will
+be a no-op on PAT enabled systems. The region over which an arch_phys_wc_add()
+is made, should already have been ioremapped with WC attributes or PAT entries,
+this can be done by using ioremap_wc() / set_memory_wc(). Devices which
+combine areas of IO memory desired to remain uncacheable with areas where
+write-combining is desirable should consider use of ioremap_uc() followed by
+set_memory_wc() to white-list effective write-combined areas. Such use is
+nevertheless discouraged as the effective memory type is considered
+implementation defined, yet this strategy can be used as last resort on devices
+with size-constrained regions where MTRR write-combining would
+otherwise not be effective.
+::
+
+ ----------------------------------------------------------------------
+ MTRR Non-PAT PAT Linux ioremap value Effective memory type
+ ----------------------------------------------------------------------
+ Non-PAT | PAT
+ PAT
+ |PCD
+ ||PWT
+ |||
+ WC 000 WB _PAGE_CACHE_MODE_WB WC | WC
+ WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC
+ WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | UC
+ WC 011 UC _PAGE_CACHE_MODE_UC UC | UC
+ ----------------------------------------------------------------------
+
+(*) denotes implementation defined and is discouraged
+
+.. note:: -- in the above table mean "Not suggested usage for the API". Some
+ of the --'s are strictly enforced by the kernel. Some others are not really
+ enforced today, but may be enforced in future.
+
+For ioremap and pci access through /sys or /proc - The actual type returned
+can be more restrictive, in case of any existing aliasing for that address.
+For example: If there is an existing uncached mapping, a new ioremap_wc can
+return uncached mapping in place of write-combine requested.
+
+set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs, where driver
+will first make a region uc, wc or wt and switch it back to wb after use.
+
+Over time writes to /proc/mtrr will be deprecated in favor of using PAT based
+interfaces. Users writing to /proc/mtrr are suggested to use above interfaces.
+
+Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
+types.
+
+Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges.
+
+
+PAT debugging
+=============
+
+With CONFIG_DEBUG_FS enabled, PAT memtype list can be examined by::
+
+ # mount -t debugfs debugfs /sys/kernel/debug
+ # cat /sys/kernel/debug/x86/pat_memtype_list
+ PAT memtype list:
+ uncached-minus @ 0x7fadf000-0x7fae0000
+ uncached-minus @ 0x7fb19000-0x7fb1a000
+ uncached-minus @ 0x7fb1a000-0x7fb1b000
+ uncached-minus @ 0x7fb1b000-0x7fb1c000
+ uncached-minus @ 0x7fb1c000-0x7fb1d000
+ uncached-minus @ 0x7fb1d000-0x7fb1e000
+ uncached-minus @ 0x7fb1e000-0x7fb25000
+ uncached-minus @ 0x7fb25000-0x7fb26000
+ uncached-minus @ 0x7fb26000-0x7fb27000
+ uncached-minus @ 0x7fb27000-0x7fb28000
+ uncached-minus @ 0x7fb28000-0x7fb2e000
+ uncached-minus @ 0x7fb2e000-0x7fb2f000
+ uncached-minus @ 0x7fb2f000-0x7fb30000
+ uncached-minus @ 0x7fb31000-0x7fb32000
+ uncached-minus @ 0x80000000-0x90000000
+
+This list shows physical address ranges and various PAT settings used to
+access those physical address ranges.
+
+Another, more verbose way of getting PAT related debug messages is with
+"debugpat" boot parameter. With this parameter, various debug messages are
+printed to dmesg log.
+
+PAT Initialization
+==================
+
+The following table describes how PAT is initialized under various
+configurations. The PAT MSR must be updated by Linux in order to support WC
+and WT attributes. Otherwise, the PAT MSR has the value programmed in it
+by the firmware. Note, Xen enables WC attribute in the PAT MSR for guests.
+::
+
+ MTRR PAT Call Sequence PAT State PAT MSR
+ =========================================================
+ E E MTRR -> PAT init Enabled OS
+ E D MTRR -> PAT init Disabled -
+ D E MTRR -> PAT disable Disabled BIOS
+ D D MTRR -> PAT disable Disabled -
+ - np/E PAT -> PAT disable Disabled BIOS
+ - np/D PAT -> PAT disable Disabled -
+ E !P/E MTRR -> PAT init Disabled BIOS
+ D !P/E MTRR -> PAT disable Disabled BIOS
+ !M !P/E MTRR stub -> PAT disable Disabled BIOS
+
+ Legend
+ ------------------------------------------------
+ E Feature enabled in CPU
+ D Feature disabled/unsupported in CPU
+ np "nopat" boot option specified
+ !P CONFIG_X86_PAT option unset
+ !M CONFIG_MTRR option unset
+ Enabled PAT state set to enabled
+ Disabled PAT state set to disabled
+ OS PAT initializes PAT MSR with OS setting
+ BIOS PAT keeps PAT MSR with BIOS setting
+
diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
deleted file mode 100644
index 481d8d8536ac..000000000000
--- a/Documentation/x86/pat.txt
+++ /dev/null
@@ -1,230 +0,0 @@
-
-PAT (Page Attribute Table)
-
-x86 Page Attribute Table (PAT) allows for setting the memory attribute at the
-page level granularity. PAT is complementary to the MTRR settings which allows
-for setting of memory types over physical address ranges. However, PAT is
-more flexible than MTRR due to its capability to set attributes at page level
-and also due to the fact that there are no hardware limitations on number of
-such attribute settings allowed. Added flexibility comes with guidelines for
-not having memory type aliasing for the same physical memory with multiple
-virtual addresses.
-
-PAT allows for different types of memory attributes. The most commonly used
-ones that will be supported at this time are Write-back, Uncached,
-Write-combined, Write-through and Uncached Minus.
-
-
-PAT APIs
---------
-
-There are many different APIs in the kernel that allows setting of memory
-attributes at the page level. In order to avoid aliasing, these interfaces
-should be used thoughtfully. Below is a table of interfaces available,
-their intended usage and their memory attribute relationships. Internally,
-these APIs use a reserve_memtype()/free_memtype() interface on the physical
-address range to avoid any aliasing.
-
-
--------------------------------------------------------------------
-API | RAM | ACPI,... | Reserved/Holes |
------------------------|----------|------------|------------------|
- | | | |
-ioremap | -- | UC- | UC- |
- | | | |
-ioremap_cache | -- | WB | WB |
- | | | |
-ioremap_uc | -- | UC | UC |
- | | | |
-ioremap_nocache | -- | UC- | UC- |
- | | | |
-ioremap_wc | -- | -- | WC |
- | | | |
-ioremap_wt | -- | -- | WT |
- | | | |
-set_memory_uc | UC- | -- | -- |
- set_memory_wb | | | |
- | | | |
-set_memory_wc | WC | -- | -- |
- set_memory_wb | | | |
- | | | |
-set_memory_wt | WT | -- | -- |
- set_memory_wb | | | |
- | | | |
-pci sysfs resource | -- | -- | UC- |
- | | | |
-pci sysfs resource_wc | -- | -- | WC |
- is IORESOURCE_PREFETCH| | | |
- | | | |
-pci proc | -- | -- | UC- |
- !PCIIOC_WRITE_COMBINE | | | |
- | | | |
-pci proc | -- | -- | WC |
- PCIIOC_WRITE_COMBINE | | | |
- | | | |
-/dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
- read-write | | | |
- | | | |
-/dev/mem | -- | UC- | UC- |
- mmap SYNC flag | | | |
- | | | |
-/dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
- mmap !SYNC flag | |(from exist-| (from exist- |
- and | | ing alias)| ing alias) |
- any alias to this area| | | |
- | | | |
-/dev/mem | -- | WB | WB |
- mmap !SYNC flag | | | |
- no alias to this area | | | |
- and | | | |
- MTRR says WB | | | |
- | | | |
-/dev/mem | -- | -- | UC- |
- mmap !SYNC flag | | | |
- no alias to this area | | | |
- and | | | |
- MTRR says !WB | | | |
- | | | |
--------------------------------------------------------------------
-
-Advanced APIs for drivers
--------------------------
-A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
-vmf_insert_pfn
-
-Drivers wanting to export some pages to userspace do it by using mmap
-interface and a combination of
-1) pgprot_noncached()
-2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()
-
-With PAT support, a new API pgprot_writecombine is being added. So, drivers can
-continue to use the above sequence, with either pgprot_noncached() or
-pgprot_writecombine() in step 1, followed by step 2.
-
-In addition, step 2 internally tracks the region as UC or WC in memtype
-list in order to ensure no conflicting mapping.
-
-Note that this set of APIs only works with IO (non RAM) regions. If driver
-wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
-as step 0 above and also track the usage of those pages and use set_memory_wb()
-before the page is freed to free pool.
-
-MTRR effects on PAT / non-PAT systems
--------------------------------------
-
-The following table provides the effects of using write-combining MTRRs when
-using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally
-mtrr_add() usage will be phased out in favor of arch_phys_wc_add() which will
-be a no-op on PAT enabled systems. The region over which a arch_phys_wc_add()
-is made, should already have been ioremapped with WC attributes or PAT entries,
-this can be done by using ioremap_wc() / set_memory_wc(). Devices which
-combine areas of IO memory desired to remain uncacheable with areas where
-write-combining is desirable should consider use of ioremap_uc() followed by
-set_memory_wc() to white-list effective write-combined areas. Such use is
-nevertheless discouraged as the effective memory type is considered
-implementation defined, yet this strategy can be used as last resort on devices
-with size-constrained regions where otherwise MTRR write-combining would
-otherwise not be effective.
-
-----------------------------------------------------------------------
-MTRR Non-PAT PAT Linux ioremap value Effective memory type
-----------------------------------------------------------------------
- Non-PAT | PAT
- PAT
- |PCD
- ||PWT
- |||
-WC 000 WB _PAGE_CACHE_MODE_WB WC | WC
-WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC
-WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | UC
-WC 011 UC _PAGE_CACHE_MODE_UC UC | UC
-----------------------------------------------------------------------
-
-(*) denotes implementation defined and is discouraged
-
-Notes:
-
--- in the above table mean "Not suggested usage for the API". Some of the --'s
-are strictly enforced by the kernel. Some others are not really enforced
-today, but may be enforced in future.
-
-For ioremap and pci access through /sys or /proc - The actual type returned
-can be more restrictive, in case of any existing aliasing for that address.
-For example: If there is an existing uncached mapping, a new ioremap_wc can
-return uncached mapping in place of write-combine requested.
-
-set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs, where driver
-will first make a region uc, wc or wt and switch it back to wb after use.
-
-Over time writes to /proc/mtrr will be deprecated in favor of using PAT based
-interfaces. Users writing to /proc/mtrr are suggested to use above interfaces.
-
-Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
-types.
-
-Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges.
-
-
-PAT debugging
--------------
-
-With CONFIG_DEBUG_FS enabled, PAT memtype list can be examined by
-
-# mount -t debugfs debugfs /sys/kernel/debug
-# cat /sys/kernel/debug/x86/pat_memtype_list
-PAT memtype list:
-uncached-minus @ 0x7fadf000-0x7fae0000
-uncached-minus @ 0x7fb19000-0x7fb1a000
-uncached-minus @ 0x7fb1a000-0x7fb1b000
-uncached-minus @ 0x7fb1b000-0x7fb1c000
-uncached-minus @ 0x7fb1c000-0x7fb1d000
-uncached-minus @ 0x7fb1d000-0x7fb1e000
-uncached-minus @ 0x7fb1e000-0x7fb25000
-uncached-minus @ 0x7fb25000-0x7fb26000
-uncached-minus @ 0x7fb26000-0x7fb27000
-uncached-minus @ 0x7fb27000-0x7fb28000
-uncached-minus @ 0x7fb28000-0x7fb2e000
-uncached-minus @ 0x7fb2e000-0x7fb2f000
-uncached-minus @ 0x7fb2f000-0x7fb30000
-uncached-minus @ 0x7fb31000-0x7fb32000
-uncached-minus @ 0x80000000-0x90000000
-
-This list shows physical address ranges and various PAT settings used to
-access those physical address ranges.
-
-Another, more verbose way of getting PAT related debug messages is with
-"debugpat" boot parameter. With this parameter, various debug messages are
-printed to dmesg log.
-
-PAT Initialization
-------------------
-
-The following table describes how PAT is initialized under various
-configurations. The PAT MSR must be updated by Linux in order to support WC
-and WT attributes. Otherwise, the PAT MSR has the value programmed in it
-by the firmware. Note, Xen enables WC attribute in the PAT MSR for guests.
-
- MTRR PAT   Call Sequence         PAT State  PAT MSR
- =========================================================
- E    E     MTRR -> PAT init      Enabled    OS
- E    D     MTRR -> PAT init      Disabled    -
- D    E     MTRR -> PAT disable   Disabled   BIOS
- D    D     MTRR -> PAT disable   Disabled    -
- -    np/E  PAT  -> PAT disable   Disabled   BIOS
- -    np/D  PAT  -> PAT disable   Disabled    -
- E    !P/E  MTRR -> PAT init      Disabled   BIOS
- D    !P/E  MTRR -> PAT disable   Disabled   BIOS
- !M   !P/E  MTRR stub -> PAT disable   Disabled   BIOS
-
- Legend
- ------------------------------------------------
- E         Feature enabled in CPU
- D         Feature disabled/unsupported in CPU
- np        "nopat" boot option specified
- !P        CONFIG_X86_PAT option unset
- !M        CONFIG_MTRR option unset
- Enabled   PAT state set to enabled
- Disabled  PAT state set to disabled
- OS        PAT initializes PAT MSR with OS setting
- BIOS      PAT keeps PAT MSR with BIOS setting
-
--
2.20.1

2019-04-23 16:39:16

by Changbin Du

Subject: [PATCH v4 48/63] Documentation: x86: convert protection-keys.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
...rotection-keys.txt => protection-keys.rst} | 33 ++++++++++++-------
2 files changed, 22 insertions(+), 12 deletions(-)
rename Documentation/x86/{protection-keys.txt => protection-keys.rst} (83%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index e06b5c0ea883..576628b121cc 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -18,3 +18,4 @@ Linux x86 Support
tlb
mtrr
pat
+ protection-keys
diff --git a/Documentation/x86/protection-keys.txt b/Documentation/x86/protection-keys.rst
similarity index 83%
rename from Documentation/x86/protection-keys.txt
rename to Documentation/x86/protection-keys.rst
index ecb0d2dadfb7..49d9833af871 100644
--- a/Documentation/x86/protection-keys.txt
+++ b/Documentation/x86/protection-keys.rst
@@ -1,3 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+Memory Protection Keys
+======================
+
Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
which is found on Intel's Skylake "Scalable Processor" Server CPUs.
It will be available in future non-server parts.
@@ -23,9 +29,10 @@ even though there is theoretically space in the PAE PTEs. These
permissions are enforced on data access only and have no effect on
instruction fetches.

-=========================== Syscalls ===========================
+Syscalls
+========

-There are 3 system calls which directly interact with pkeys:
+There are 3 system calls which directly interact with pkeys::

int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
int pkey_free(int pkey);
@@ -37,6 +44,7 @@ pkey_alloc(). An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key. In this example WRPKRU is wrapped by a C function
called pkey_set().
+::

int real_prot = PROT_READ|PROT_WRITE;
pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
@@ -45,43 +53,44 @@ called pkey_set().
... application runs here

Now, if the application needs to update the data at 'ptr', it can
-gain access, do the update, then remove its write access:
+gain access, do the update, then remove its write access::

pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
*ptr = foo; // assign something
pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

Now when it frees the memory, it will also free the pkey since it
-is no longer in use:
+is no longer in use::

munmap(ptr, PAGE_SIZE);
pkey_free(pkey);

-(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
- An example implementation can be found in
- tools/testing/selftests/x86/protection_keys.c)
+.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
+ An example implementation can be found in
+ tools/testing/selftests/x86/protection_keys.c.

-=========================== Behavior ===========================
+Behavior
+========

The kernel attempts to make protection keys consistent with the
-behavior of a plain mprotect(). For instance if you do this:
+behavior of a plain mprotect(). For instance if you do this::

mprotect(ptr, size, PROT_NONE);
something(ptr);

-you can expect the same effects with protection keys when doing this:
+you can expect the same effects with protection keys when doing this::

pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
something(ptr);

That should be true whether something() is a direct access to 'ptr'
-like:
+like::

*ptr = foo;

or when the kernel does the access on the application's behalf like
-with a read():
+with a read()::

read(fd, ptr, 1);

--
2.20.1
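
A note for readers of the pkey_set() examples in the patch above: the following is a minimal sketch, not the selftest's implementation, of how a pkey_set()-style wrapper could compute the new PKRU value before writing it back with WRPKRU. It assumes the x86 PKRU layout of two bits per key (access-disable in the lower bit, write-disable in the upper bit) and the uapi values PKEY_DISABLE_ACCESS = 0x1 and PKEY_DISABLE_WRITE = 0x2. The helper name pkru_with_rights() is hypothetical, and the RDPKRU/WRPKRU instructions are deliberately left out so the arithmetic can be checked on any machine.

```c
#include <assert.h>
#include <stdint.h>

/* Values as in the Linux uapi headers; repeated here (guarded) so the
 * sketch builds without them. */
#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS 0x1u
#endif
#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE  0x2u
#endif

#define PKRU_BITS_PER_PKEY  2
#define NR_PKEYS            16

/* Hypothetical helper: return 'pkru' with the two bits that belong to
 * 'pkey' replaced by 'rights'. The bits of all other keys are kept. */
static uint32_t pkru_with_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);
	assert((rights & ~(uint32_t)(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)) == 0);

	unsigned int shift = (unsigned int)pkey * PKRU_BITS_PER_PKEY;
	uint32_t mask = (uint32_t)(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE) << shift;

	return (pkru & ~mask) | (rights << shift);
}
```

Under these assumptions, the pkey_set(pkey, 0) step in the example corresponds to clearing both bits for that key, and pkey_set(pkey, PKEY_DISABLE_WRITE) to setting only its write-disable bit.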

2019-04-23 16:39:16

by Changbin Du

Subject: [PATCH v4 39/63] Documentation: x86: convert topology.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
Documentation/x86/topology.rst | 228 +++++++++++++++++++++++++++++++++
Documentation/x86/topology.txt | 217 -------------------------------
3 files changed, 229 insertions(+), 217 deletions(-)
create mode 100644 Documentation/x86/topology.rst
delete mode 100644 Documentation/x86/topology.txt

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 8f08caf4fbbb..2033791e53bc 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -9,3 +9,4 @@ Linux x86 Support
:numbered:

boot
+ topology
diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
new file mode 100644
index 000000000000..1df5f56f4882
--- /dev/null
+++ b/Documentation/x86/topology.rst
@@ -0,0 +1,228 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+x86 Topology
+============
+
+This documents and clarifies the main aspects of x86 topology modelling and
+representation in the kernel. Update/change when doing changes to the
+respective code.
+
+The architecture-agnostic topology definitions are in
+Documentation/cputopology.txt. This file holds x86-specific
+differences/specialities which do not necessarily apply to the generic
+definitions. Thus, the way to read up on Linux topology on x86 is to start
+with the generic one and look at this one in parallel for the x86 specifics.
+
+Needless to say, code should use the generic functions - this file is *only*
+here to *document* the inner workings of x86 topology.
+
+Started by Thomas Gleixner <[email protected]> and Borislav Petkov <[email protected]>.
+
+The main aim of the topology facilities is to present adequate interfaces to
+code which needs to know/query/use the structure of the running system wrt
+threads, cores, packages, etc.
+
+The kernel does not care about the concept of physical sockets because a
+socket has no relevance to software. It's an electromechanical component. In
+the past a socket always contained a single package (see below), but with the
+advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
+there might still be references to sockets in the code, but they are of
+historical nature and should be cleaned up.
+
+The topology of a system is described in the units of:
+
+ - packages
+ - cores
+ - threads
+
+Package
+=======
+
+Packages contain a number of cores plus shared resources, e.g. DRAM
+controller, shared caches etc.
+
+AMD nomenclature for package is 'Node'.
+
+Package-related topology information in the kernel:
+
+ - cpuinfo_x86.x86_max_cores:
+
+ The number of cores in a package. This information is retrieved via CPUID.
+
+ - cpuinfo_x86.phys_proc_id:
+
+ The physical ID of the package. This information is retrieved via CPUID
+ and deduced from the APIC IDs of the cores in the package.
+
+ - cpuinfo_x86.logical_id:
+
+ The logical ID of the package. As we do not trust BIOSes to enumerate the
+ packages in a consistent way, we introduced the concept of logical package
+ ID so we can sanely calculate the number of maximum possible packages in
+ the system and have the packages enumerated linearly.
+
+ - topology_max_packages():
+
+ The maximum possible number of packages in the system. Helpful for per
+ package facilities to preallocate per package information.
+
+ - cpu_llc_id:
+
+ A per-CPU variable containing:
+
+ - On Intel, the first APIC ID of the list of CPUs sharing the Last Level
+ Cache.
+
+ - On AMD, the Node ID or Core Complex ID containing the Last Level
+ Cache. In general, it is a number identifying an LLC uniquely on the
+ system.
+
+Cores
+=====
+
+A core consists of 1 or more threads. It does not matter whether the threads
+are SMT- or CMT-type threads.
+
+AMD's nomenclature for a CMT core is "Compute Unit". The kernel always uses
+"core".
+
+Core-related topology information in the kernel:
+
+ - smp_num_siblings:
+
+ The number of threads in a core. The number of threads in a package can be
+ calculated by::
+
+ threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
+
+
+Threads
+=======
+
+A thread is a single scheduling unit. It's the equivalent to a logical Linux
+CPU.
+
+AMD's nomenclature for CMT threads is "Compute Unit Core". The kernel always
+uses "thread".
+
+Thread-related topology information in the kernel:
+
+ - topology_core_cpumask():
+
+ The cpumask contains all online threads in the package to which a thread
+ belongs.
+
+ The number of online threads is also printed in /proc/cpuinfo "siblings."
+
+ - topology_sibling_cpumask():
+
+ The cpumask contains all online threads in the core to which a thread
+ belongs.
+
+ - topology_logical_package_id():
+
+ The logical package ID to which a thread belongs.
+
+ - topology_physical_package_id():
+
+ The physical package ID to which a thread belongs.
+
+  - topology_core_id():
+
+ The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
+ "core_id."
+
+
+
+System topology examples
+========================
+
+.. note:: The alternative Linux CPU enumeration depends on how the BIOS
+ enumerates the threads. Many BIOSes enumerate all threads 0 first and
+ then all threads 1. That has the "advantage" that the logical Linux CPU
+ numbers of threads 0 stay the same whether threads are enabled or not.
+ That's merely an implementation detail and has no practical impact.
+
+1) Single Package, Single Core
+::
+
+    [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+
+2) Single Package, Dual Core
+
+   a) One thread per core
+      ::
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+
+   b) Two threads per core
+      ::
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 1
+                    -> [core 1] -> [thread 0] -> Linux CPU 2
+                                -> [thread 1] -> Linux CPU 3
+
+      Alternative enumeration::
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 2
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+                                -> [thread 1] -> Linux CPU 3
+
+      AMD nomenclature for CMT systems::
+
+        [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+                                     -> [Compute Unit Core 1] -> Linux CPU 1
+                 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+                                     -> [Compute Unit Core 1] -> Linux CPU 3
+
+3) Dual Package, Dual Core
+
+   a) One thread per core
+      ::
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+
+        [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+                    -> [core 1] -> [thread 0] -> Linux CPU 3
+
+   b) Two threads per core
+      ::
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 1
+                    -> [core 1] -> [thread 0] -> Linux CPU 2
+                                -> [thread 1] -> Linux CPU 3
+
+        [package 1] -> [core 0] -> [thread 0] -> Linux CPU 4
+                                -> [thread 1] -> Linux CPU 5
+                    -> [core 1] -> [thread 0] -> Linux CPU 6
+                                -> [thread 1] -> Linux CPU 7
+
+      Alternative enumeration::
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 4
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+                                -> [thread 1] -> Linux CPU 5
+
+        [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+                                -> [thread 1] -> Linux CPU 6
+                    -> [core 1] -> [thread 0] -> Linux CPU 3
+                                -> [thread 1] -> Linux CPU 7
+
+      AMD nomenclature for CMT systems::
+
+        [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+                                     -> [Compute Unit Core 1] -> Linux CPU 1
+                 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+                                     -> [Compute Unit Core 1] -> Linux CPU 3
+
+        [node 1] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 4
+                                     -> [Compute Unit Core 1] -> Linux CPU 5
+                 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 6
+                                     -> [Compute Unit Core 1] -> Linux CPU 7
diff --git a/Documentation/x86/topology.txt b/Documentation/x86/topology.txt
deleted file mode 100644
index 2953e3ec9a02..000000000000
--- a/Documentation/x86/topology.txt
+++ /dev/null
@@ -1,217 +0,0 @@
-x86 Topology
-============
-
-This documents and clarifies the main aspects of x86 topology modelling and
-representation in the kernel. Update/change when doing changes to the
-respective code.
-
-The architecture-agnostic topology definitions are in
-Documentation/cputopology.txt. This file holds x86-specific
-differences/specialities which must not necessarily apply to the generic
-definitions. Thus, the way to read up on Linux topology on x86 is to start
-with the generic one and look at this one in parallel for the x86 specifics.
-
-Needless to say, code should use the generic functions - this file is *only*
-here to *document* the inner workings of x86 topology.
-
-Started by Thomas Gleixner <[email protected]> and Borislav Petkov <[email protected]>.
-
-The main aim of the topology facilities is to present adequate interfaces to
-code which needs to know/query/use the structure of the running system wrt
-threads, cores, packages, etc.
-
-The kernel does not care about the concept of physical sockets because a
-socket has no relevance to software. It's an electromechanical component. In
-the past a socket always contained a single package (see below), but with the
-advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
-there might be still references to sockets in the code, but they are of
-historical nature and should be cleaned up.
-
-The topology of a system is described in the units of:
-
- - packages
- - cores
- - threads
-
-* Package:
-
- Packages contain a number of cores plus shared resources, e.g. DRAM
- controller, shared caches etc.
-
- AMD nomenclature for package is 'Node'.
-
- Package-related topology information in the kernel:
-
- - cpuinfo_x86.x86_max_cores:
-
- The number of cores in a package. This information is retrieved via CPUID.
-
- - cpuinfo_x86.phys_proc_id:
-
- The physical ID of the package. This information is retrieved via CPUID
- and deduced from the APIC IDs of the cores in the package.
-
- - cpuinfo_x86.logical_id:
-
- The logical ID of the package. As we do not trust BIOSes to enumerate the
- packages in a consistent way, we introduced the concept of logical package
- ID so we can sanely calculate the number of maximum possible packages in
- the system and have the packages enumerated linearly.
-
- - topology_max_packages():
-
- The maximum possible number of packages in the system. Helpful for per
- package facilities to preallocate per package information.
-
- - cpu_llc_id:
-
- A per-CPU variable containing:
- - On Intel, the first APIC ID of the list of CPUs sharing the Last Level
- Cache
-
- - On AMD, the Node ID or Core Complex ID containing the Last Level
- Cache. In general, it is a number identifying an LLC uniquely on the
- system.
-
-* Cores:
-
- A core consists of 1 or more threads. It does not matter whether the threads
- are SMT- or CMT-type threads.
-
- AMDs nomenclature for a CMT core is "Compute Unit". The kernel always uses
- "core".
-
- Core-related topology information in the kernel:
-
- - smp_num_siblings:
-
- The number of threads in a core. The number of threads in a package can be
- calculated by:
-
- threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
-
-
-* Threads:
-
- A thread is a single scheduling unit. It's the equivalent to a logical Linux
- CPU.
-
- AMDs nomenclature for CMT threads is "Compute Unit Core". The kernel always
- uses "thread".
-
- Thread-related topology information in the kernel:
-
- - topology_core_cpumask():
-
- The cpumask contains all online threads in the package to which a thread
- belongs.
-
- The number of online threads is also printed in /proc/cpuinfo "siblings."
-
- - topology_sibling_cpumask():
-
- The cpumask contains all online threads in the core to which a thread
- belongs.
-
- - topology_logical_package_id():
-
- The logical package ID to which a thread belongs.
-
- - topology_physical_package_id():
-
- The physical package ID to which a thread belongs.
-
- - topology_core_id();
-
- The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
- "core_id."
-
-
-
-System topology examples
-
-Note:
-
-The alternative Linux CPU enumeration depends on how the BIOS enumerates the
-threads. Many BIOSes enumerate all threads 0 first and then all threads 1.
-That has the "advantage" that the logical Linux CPU numbers of threads 0 stay
-the same whether threads are enabled or not. That's merely an implementation
-detail and has no practical impact.
-
-1) Single Package, Single Core
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
-
-2) Single Package, Dual Core
-
- a) One thread per core
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
- -> [core 1] -> [thread 0] -> Linux CPU 1
-
- b) Two threads per core
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
- -> [thread 1] -> Linux CPU 1
- -> [core 1] -> [thread 0] -> Linux CPU 2
- -> [thread 1] -> Linux CPU 3
-
- Alternative enumeration:
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
- -> [thread 1] -> Linux CPU 2
- -> [core 1] -> [thread 0] -> Linux CPU 1
- -> [thread 1] -> Linux CPU 3
-
- AMD nomenclature for CMT systems:
-
- [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
- -> [Compute Unit Core 1] -> Linux CPU 1
- -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
- -> [Compute Unit Core 1] -> Linux CPU 3
-
-4) Dual Package, Dual Core
-
- a) One thread per core
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
- -> [core 1] -> [thread 0] -> Linux CPU 1
-
- [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
- -> [core 1] -> [thread 0] -> Linux CPU 3
-
- b) Two threads per core
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
- -> [thread 1] -> Linux CPU 1
- -> [core 1] -> [thread 0] -> Linux CPU 2
- -> [thread 1] -> Linux CPU 3
-
- [package 1] -> [core 0] -> [thread 0] -> Linux CPU 4
- -> [thread 1] -> Linux CPU 5
- -> [core 1] -> [thread 0] -> Linux CPU 6
- -> [thread 1] -> Linux CPU 7
-
- Alternative enumeration:
-
- [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
- -> [thread 1] -> Linux CPU 4
- -> [core 1] -> [thread 0] -> Linux CPU 1
- -> [thread 1] -> Linux CPU 5
-
- [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
- -> [thread 1] -> Linux CPU 6
- -> [core 1] -> [thread 0] -> Linux CPU 3
- -> [thread 1] -> Linux CPU 7
-
- AMD nomenclature for CMT systems:
-
- [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
- -> [Compute Unit Core 1] -> Linux CPU 1
- -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
- -> [Compute Unit Core 1] -> Linux CPU 3
-
- [node 1] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 4
- -> [Compute Unit Core 1] -> Linux CPU 5
- -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 6
- -> [Compute Unit Core 1] -> Linux CPU 7
--
2.20.1
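
A note on the topology examples in the patch above: the two CPU numbering schemes shown there can be written as small formulas. The sketch below assumes a fully symmetric system (every package has the same number of cores, every core the same number of threads); struct topo and the function names are hypothetical and are not kernel interfaces. Real firmware enumeration need not follow either formula exactly.

```c
#include <assert.h>

/* Hypothetical description of a symmetric system; not a kernel structure. */
struct topo {
	int packages;
	int cores_per_package;  /* cf. cpuinfo_x86.x86_max_cores */
	int threads_per_core;   /* cf. smp_num_siblings */
};

/* threads_per_package = x86_max_cores * smp_num_siblings */
static int threads_per_package(const struct topo *t)
{
	return t->cores_per_package * t->threads_per_core;
}

/* First numbering in the examples: the threads of a core are adjacent. */
static int cpu_sequential(const struct topo *t, int pkg, int core, int thread)
{
	assert(pkg >= 0 && pkg < t->packages);
	assert(core >= 0 && core < t->cores_per_package);
	assert(thread >= 0 && thread < t->threads_per_core);

	return (pkg * t->cores_per_package + core) * t->threads_per_core + thread;
}

/* "Alternative enumeration": all threads 0 first, then all threads 1. */
static int cpu_threads_first(const struct topo *t, int pkg, int core, int thread)
{
	assert(pkg >= 0 && pkg < t->packages);
	assert(core >= 0 && core < t->cores_per_package);
	assert(thread >= 0 && thread < t->threads_per_core);

	int total_cores = t->packages * t->cores_per_package;

	return thread * total_cores + pkg * t->cores_per_package + core;
}
```

For the dual-package, dual-core, two-thread case, cpu_threads_first() places thread 1 of package 0, core 0 at Linux CPU 4, which matches the alternative enumeration diagram, while the numbers of all threads 0 are the same as they would be with one thread per core.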

2019-04-23 16:39:22

by Changbin Du

Subject: [PATCH v4 49/63] Documentation: x86: convert intel_mpx.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
.../x86/{intel_mpx.txt => intel_mpx.rst} | 120 ++++++++++--------
2 files changed, 65 insertions(+), 56 deletions(-)
rename Documentation/x86/{intel_mpx.txt => intel_mpx.rst} (75%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 576628b121cc..20091d3e5d97 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -19,3 +19,4 @@ Linux x86 Support
mtrr
pat
protection-keys
+ intel_mpx
diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.rst
similarity index 75%
rename from Documentation/x86/intel_mpx.txt
rename to Documentation/x86/intel_mpx.rst
index 85d0549ad846..387a640941a6 100644
--- a/Documentation/x86/intel_mpx.txt
+++ b/Documentation/x86/intel_mpx.rst
@@ -1,5 +1,11 @@
-1. Intel(R) MPX Overview
-========================
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================================
+Intel(R) Memory Protection Extensions (MPX)
+===========================================
+
+Intel(R) MPX Overview
+=====================

Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
introduced into Intel Architecture. Intel MPX provides hardware features
@@ -7,7 +13,7 @@ that can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions are
usurped at runtime due to buffer overflow or underflow.

-You can tell if your CPU supports MPX by looking in /proc/cpuinfo:
+You can tell if your CPU supports MPX by looking in /proc/cpuinfo::

cat /proc/cpuinfo | grep ' mpx '

@@ -21,8 +27,8 @@ can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


-2. How to get the advantage of MPX
-==================================
+How to get the advantage of MPX
+===============================

For MPX to work, changes are required in the kernel, binutils and compiler.
No source changes are required for applications, just a recompile.
@@ -84,14 +90,15 @@ Kernel MPX Code:
is unmapped.


-3. How does MPX kernel code work
-================================
+How does MPX kernel code work
+=============================

Handling #BR faults caused by MPX
---------------------------------

When MPX is enabled, there are 2 new situations that can generate
#BR faults.
+
* new bounds tables (BT) need to be allocated to save bounds.
* bounds violation caused by MPX instructions.

@@ -124,37 +131,37 @@ the kernel. It can theoretically be done completely from userspace. Here
are a few ways this could be done. We don't think any of them are practical
in the real-world, but here they are.

-Q: Can virtual space simply be reserved for the bounds tables so that we
- never have to allocate them?
-A: MPX-enabled application will possibly create a lot of bounds tables in
- process address space to save bounds information. These tables can take
- up huge swaths of memory (as much as 80% of the memory on the system)
- even if we clean them up aggressively. In the worst-case scenario, the
- tables can be 4x the size of the data structure being tracked. IOW, a
- 1-page structure can require 4 bounds-table pages. An X-GB virtual
- area needs 4*X GB of virtual space, plus 2GB for the bounds directory.
- If we were to preallocate them for the 128TB of user virtual address
- space, we would need to reserve 512TB+2GB, which is larger than the
- entire virtual address space today. This means they can not be reserved
- ahead of time. Also, a single process's pre-populated bounds directory
- consumes 2GB of virtual *AND* physical memory. IOW, it's completely
- infeasible to prepopulate bounds directories.
-
-Q: Can we preallocate bounds table space at the same time memory is
- allocated which might contain pointers that might eventually need
- bounds tables?
-A: This would work if we could hook the site of each and every memory
- allocation syscall. This can be done for small, constrained applications.
- But, it isn't practical at a larger scale since a given app has no
- way of controlling how all the parts of the app might allocate memory
- (think libraries). The kernel is really the only place to intercept
- these calls.
-
-Q: Could a bounds fault be handed to userspace and the tables allocated
- there in a signal handler instead of in the kernel?
-A: mmap() is not on the list of safe async handler functions and even
- if mmap() would work it still requires locking or nasty tricks to
- keep track of the allocation state there.
+:Q: Can virtual space simply be reserved for the bounds tables so that we
+ never have to allocate them?
+:A: MPX-enabled application will possibly create a lot of bounds tables in
+ process address space to save bounds information. These tables can take
+ up huge swaths of memory (as much as 80% of the memory on the system)
+ even if we clean them up aggressively. In the worst-case scenario, the
+ tables can be 4x the size of the data structure being tracked. IOW, a
+ 1-page structure can require 4 bounds-table pages. An X-GB virtual
+ area needs 4*X GB of virtual space, plus 2GB for the bounds directory.
+ If we were to preallocate them for the 128TB of user virtual address
+ space, we would need to reserve 512TB+2GB, which is larger than the
+ entire virtual address space today. This means they can not be reserved
+ ahead of time. Also, a single process's pre-populated bounds directory
+ consumes 2GB of virtual *AND* physical memory. IOW, it's completely
+ infeasible to prepopulate bounds directories.
+
+:Q: Can we preallocate bounds table space at the same time memory is
+ allocated which might contain pointers that might eventually need
+ bounds tables?
+:A: This would work if we could hook the site of each and every memory
+ allocation syscall. This can be done for small, constrained applications.
+ But, it isn't practical at a larger scale since a given app has no
+ way of controlling how all the parts of the app might allocate memory
+ (think libraries). The kernel is really the only place to intercept
+ these calls.
+
+:Q: Could a bounds fault be handed to userspace and the tables allocated
+ there in a signal handler instead of in the kernel?
+:A: mmap() is not on the list of safe async handler functions and even
+ if mmap() would work it still requires locking or nasty tricks to
+ keep track of the allocation state there.

Having ruled out all of the userspace-only approaches for managing
bounds tables that we could think of, we create them on demand in
@@ -167,20 +174,20 @@ If a #BR is generated due to a bounds violation caused by MPX.
We need to decode MPX instructions to get violation address and
set this address into extended struct siginfo.

-The _sigfault field of struct siginfo is extended as follow:
-
-87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
-88 struct {
-89 void __user *_addr; /* faulting insn/memory ref. */
-90 #ifdef __ARCH_SI_TRAPNO
-91 int _trapno; /* TRAP # which caused the signal */
-92 #endif
-93 short _addr_lsb; /* LSB of the reported address */
-94 struct {
-95 void __user *_lower;
-96 void __user *_upper;
-97 } _addr_bnd;
-98 } _sigfault;
+The _sigfault field of struct siginfo is extended as follows::
+
+    /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+    struct {
+        void __user *_addr;   /* faulting insn/memory ref. */
+    #ifdef __ARCH_SI_TRAPNO
+        int _trapno;          /* TRAP # which caused the signal */
+    #endif
+        short _addr_lsb;      /* LSB of the reported address */
+        struct {
+            void __user *_lower;
+            void __user *_upper;
+        } _addr_bnd;
+    } _sigfault;

The '_addr' field refers to the violation address, and the new '_addr_bnd'
field refers to the upper/lower bounds when a #BR is caused.
@@ -209,9 +216,10 @@ Adding new prctl commands

Two new prctl commands are added to enable and disable MPX bounds tables
management in kernel.
+::

-155 #define PR_MPX_ENABLE_MANAGEMENT 43
-156 #define PR_MPX_DISABLE_MANAGEMENT 44
+    #define PR_MPX_ENABLE_MANAGEMENT    43
+    #define PR_MPX_DISABLE_MANAGEMENT   44

Runtime library in userspace is responsible for allocation of bounds
directory. So the kernel has to use the XSAVE instruction to get the base
@@ -223,8 +231,8 @@ into struct mm_struct to be used in future during PR_MPX_ENABLE_MANAGEMENT
command execution.


-4. Special rules
-================
+Special rules
+=============

1) If userspace is requesting help from the kernel to do the management
of bounds tables, it may not create or modify entries in the bounds directory.
--
2.20.1
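
A note on the first Q/A in the patch above: the 512TB+2GB figure follows directly from the worst-case 4x ratio and the 2GB bounds directory. The sketch below only reproduces that arithmetic; the function name is hypothetical. The comparison against 256TB in the usage note is an assumption on my part (a 48-bit virtual address space, as with 4-level paging), since the text only says "the entire virtual address space today".

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)
#define TiB (UINT64_C(1) << 40)

/* Worst case from the text: bounds tables take 4x the tracked virtual
 * area, plus 2GB for the bounds directory. */
static uint64_t mpx_worst_case_reservation(uint64_t tracked_bytes)
{
	/* Guard against overflow of the 64-bit result. */
	assert(tracked_bytes <= (UINT64_MAX - 2 * GiB) / 4);

	return 4 * tracked_bytes + 2 * GiB;
}
```

For the 128TB of user virtual address space, mpx_worst_case_reservation(128 * TiB) gives 512TB + 2GB, which is more than the 256TB that a 48-bit address space can hold in total.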

2019-04-23 16:39:35

by Changbin Du

Subject: [PATCH v4 51/63] Documentation: x86: convert pti.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
Documentation/x86/{pti.txt => pti.rst} | 19 ++++++++++++++-----
2 files changed, 15 insertions(+), 5 deletions(-)
rename Documentation/x86/{pti.txt => pti.rst} (95%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index a0426ab156bd..1c675cef14d7 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,3 +21,4 @@ Linux x86 Support
protection-keys
intel_mpx
amd-memory-encryption
+ pti
diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.rst
similarity index 95%
rename from Documentation/x86/pti.txt
rename to Documentation/x86/pti.rst
index 5cd58439ad2d..44b98f99ca8a 100644
--- a/Documentation/x86/pti.txt
+++ b/Documentation/x86/pti.rst
@@ -1,9 +1,15 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+Page Table Isolation (PTI)
+==========================
+
Overview
========

-Page Table Isolation (pti, previously known as KAISER[1]) is a
+Page Table Isolation (pti, previously known as KAISER [1]_) is a
countermeasure against attacks on the shared user/kernel address
-space such as the "Meltdown" approach[2].
+space such as the "Meltdown" approach [2]_.

To mitigate this class of attacks, we create an independent set of
page tables for use only when running userspace applications. When
@@ -60,6 +66,7 @@ Protection against side-channel attacks is important. But,
this protection comes at a cost:

1. Increased Memory Use
+
a. Each process now needs an order-1 PGD instead of order-0.
(Consumes an additional 4k per process).
b. The 'cpu_entry_area' structure must be 2MB in size and 2MB
@@ -68,6 +75,7 @@ this protection comes at a cost:
is decompressed, but no space in the kernel image itself.

2. Runtime Cost
+
a. CR3 manipulation to switch between the page table copies
must be done at interrupt, syscall, and exception entry
and exit (it can be skipped when the kernel is interrupted,
@@ -142,8 +150,9 @@ ideally doing all of these in parallel:
interrupted, including nested NMIs. Using "-c" boosts the rate of
NMIs, and using two -c with separate counters encourages nested NMIs
and less deterministic behavior.
+ ::

- while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done
+ while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done

4. Launch a KVM virtual machine.
5. Run 32-bit binaries on systems supporting the SYSCALL instruction.
@@ -182,5 +191,5 @@ that are worth noting here.
tended to be TLB invalidation issues. Usually invalidating
the wrong PCID, or otherwise missing an invalidation.

-1. https://gruss.cc/files/kaiser.pdf
-2. https://meltdownattack.com/meltdown.pdf
+.. [1] https://gruss.cc/files/kaiser.pdf
+.. [2] https://meltdownattack.com/meltdown.pdf
--
2.20.1

2019-04-23 16:39:36

by Changbin Du

Subject: [PATCH v4 52/63] Documentation: x86: convert microcode.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
.../x86/{microcode.txt => microcode.rst} | 62 ++++++++++---------
2 files changed, 35 insertions(+), 28 deletions(-)
rename Documentation/x86/{microcode.txt => microcode.rst} (81%)
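
The "Builtin microcode" section of the converted document says the build
system must be able to find every file named in CONFIG_EXTRA_FIRMWARE under
CONFIG_EXTRA_FIRMWARE_DIR. A minimal sketch of that check follows, assuming
the option value is a space-separated list of relative paths as in the
document's example. The helper name missing_firmware() is hypothetical and is
not part of the kernel build system.

```python
import os
import tempfile


def missing_firmware(extra_firmware, firmware_dir):
    """Return the CONFIG_EXTRA_FIRMWARE entries not found under firmware_dir.

    extra_firmware is the raw option value, e.g.
    "intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin".
    """
    entries = extra_firmware.split()
    return [
        entry
        for entry in entries
        if not os.path.isfile(os.path.join(firmware_dir, entry))
    ]


if __name__ == "__main__":
    # Demo against a throwaway directory instead of the real /lib/firmware.
    with tempfile.TemporaryDirectory() as fw_dir:
        os.makedirs(os.path.join(fw_dir, "intel-ucode"))
        open(os.path.join(fw_dir, "intel-ucode", "06-3a-09"), "wb").close()
        value = "intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin"
        for entry in missing_firmware(value, fw_dir):
            print("missing:", entry)
```

On a real system the second argument would be the CONFIG_EXTRA_FIRMWARE_DIR
value, "/lib/firmware" in the document's example.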

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 1c675cef14d7..2fcd10f29b87 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -22,3 +22,4 @@ Linux x86 Support
intel_mpx
amd-memory-encryption
pti
+ microcode
diff --git a/Documentation/x86/microcode.txt b/Documentation/x86/microcode.rst
similarity index 81%
rename from Documentation/x86/microcode.txt
rename to Documentation/x86/microcode.rst
index 79fdb4a8148a..a320d37982ed 100644
--- a/Documentation/x86/microcode.txt
+++ b/Documentation/x86/microcode.rst
@@ -1,7 +1,11 @@
- The Linux Microcode Loader
+.. SPDX-License-Identifier: GPL-2.0

-Authors: Fenghua Yu <[email protected]>
- Borislav Petkov <[email protected]>
+==========================
+The Linux Microcode Loader
+==========================
+
+:Authors: - Fenghua Yu <[email protected]>
+ - Borislav Petkov <[email protected]>

The kernel has a x86 microcode loading facility which is supposed to
provide microcode loading methods in the OS. Potential use cases are
@@ -10,8 +14,8 @@ and updating the microcode on long-running systems without rebooting.

The loader supports three loading methods:

-1. Early load microcode
-=======================
+Early load microcode
+====================

The kernel can update microcode very early during boot. Loading
microcode early can fix CPU issues before they are observed during
@@ -26,8 +30,10 @@ loader parses the combined initrd image during boot.

The microcode files in cpio name space are:

-on Intel: kernel/x86/microcode/GenuineIntel.bin
-on AMD : kernel/x86/microcode/AuthenticAMD.bin
+on Intel:
+ kernel/x86/microcode/GenuineIntel.bin
+on AMD :
+ kernel/x86/microcode/AuthenticAMD.bin

During BSP (BootStrapping Processor) boot (pre-SMP), the kernel
scans the microcode file in the initrd. If microcode matching the
@@ -42,8 +48,8 @@ Here's a crude example how to prepare an initrd with microcode (this is
normally done automatically by the distribution, when recreating the
initrd, so you don't really have to do it yourself. It is documented
here for future reference only).
+::

----
#!/bin/bash

if [ -z "$1" ]; then
@@ -76,15 +82,15 @@ here for future reference only).
cat ucode.cpio $INITRD.orig > $INITRD

rm -rf $TMPDIR
----
+

The system needs to have the microcode packages installed into
/lib/firmware or you need to fixup the paths above if yours are
somewhere else and/or you've downloaded them directly from the processor
vendor's site.

-2. Late loading
-===============
+Late loading
+============

There are two legacy user space interfaces to load microcode, either through
/dev/cpu/microcode or through /sys/devices/system/cpu/microcode/reload file
@@ -94,9 +100,9 @@ The /dev/cpu/microcode method is deprecated because it needs a special
userspace tool for that.

The easier method is simply installing the microcode packages your distro
-supplies and running:
+supplies and running::

-# echo 1 > /sys/devices/system/cpu/microcode/reload
+ # echo 1 > /sys/devices/system/cpu/microcode/reload

as root.

@@ -104,29 +110,29 @@ The loading mechanism looks for microcode blobs in
/lib/firmware/{intel-ucode,amd-ucode}. The default distro installation
packages already put them there.

-3. Builtin microcode
-====================
+Builtin microcode
+=================

The loader supports also loading of a builtin microcode supplied through
the regular builtin firmware method CONFIG_EXTRA_FIRMWARE. Only 64-bit is
currently supported.

-Here's an example:
+Here's an example::

-CONFIG_EXTRA_FIRMWARE="intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin"
-CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
+ CONFIG_EXTRA_FIRMWARE="intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin"
+ CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"

-This basically means, you have the following tree structure locally:
+This basically means, you have the following tree structure locally::

-/lib/firmware/
-|-- amd-ucode
-...
-| |-- microcode_amd_fam15h.bin
-...
-|-- intel-ucode
-...
-| |-- 06-3a-09
-...
+ /lib/firmware/
+ |-- amd-ucode
+ ...
+ | |-- microcode_amd_fam15h.bin
+ ...
+ |-- intel-ucode
+ ...
+ | |-- 06-3a-09
+ ...

so that the build system can find those files and integrate them into
the final kernel image. The early loader finds them and applies them.
--
2.20.1

2019-04-23 16:39:52

by Changbin Du

Subject: [PATCH v4 55/63] Documentation: x86: convert usb-legacy-support.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
.../{usb-legacy-support.txt => usb-legacy-support.rst} | 8 ++++++--
2 files changed, 7 insertions(+), 2 deletions(-)
rename Documentation/x86/{usb-legacy-support.txt => usb-legacy-support.rst} (92%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index c41c17906b6d..526f7a008b8e 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -25,3 +25,4 @@ Linux x86 Support
pti
microcode
resctrl_ui
+ usb-legacy-support
diff --git a/Documentation/x86/usb-legacy-support.txt b/Documentation/x86/usb-legacy-support.rst
similarity index 92%
rename from Documentation/x86/usb-legacy-support.txt
rename to Documentation/x86/usb-legacy-support.rst
index 1894cdfc69d9..19abead7f1a8 100644
--- a/Documentation/x86/usb-legacy-support.txt
+++ b/Documentation/x86/usb-legacy-support.rst
@@ -1,7 +1,11 @@
+
+.. SPDX-License-Identifier: GPL-2.0
+
+==================
USB Legacy support
-~~~~~~~~~~~~~~~~~~
+==================

-Vojtech Pavlik <[email protected]>, January 2004
+:Author: Vojtech Pavlik <[email protected]>, January 2004


Also known as "USB Keyboard" or "USB Mouse support" in the BIOS Setup is a
--
2.20.1

2019-04-23 16:40:05

by Changbin Du

Subject: [PATCH v4 43/63] Documentation: x86: convert earlyprintk.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/earlyprintk.rst | 148 ++++++++++++++++++++++++++++++
Documentation/x86/earlyprintk.txt | 141 ----------------------------
Documentation/x86/index.rst | 1 +
3 files changed, 149 insertions(+), 141 deletions(-)
create mode 100644 Documentation/x86/earlyprintk.rst
delete mode 100644 Documentation/x86/earlyprintk.txt
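
The hardware requirement a) in the converted document asks the reader to look
for a 'Debug port' capability in the lspci -vvv output. A minimal sketch of
that lookup follows, assuming the usual lspci layout where each device starts
on an unindented line and its capabilities are indented below it. The function
name debug_port_devices() and the second device in SAMPLE are hypothetical and
only serve as illustration.

```python
import re

# An unindented line starting with a PCI address (optionally with a domain)
# opens a new device block, e.g. "00:1d.7 USB Controller: ...".
DEVICE_RE = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s"
)


def debug_port_devices(lspci_output):
    """Return PCI addresses whose capability list mentions 'Debug port'."""
    devices = []
    current = None
    for line in lspci_output.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = match.group(1)
        elif current and "Debug port" in line and current not in devices:
            devices.append(current)
    return devices


# Trimmed from the lspci output quoted in the document; the SATA entry is
# a made-up device without a debug port.
SAMPLE = """\
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03) (prog-if 20 [EHCI])
\tSubsystem: Lenovo ThinkPad T61
\tCapabilities: [50] Power Management version 2
\tCapabilities: [58] Debug port: BAR=1 offset=00a0
\tKernel driver in use: ehci_hcd
00:1f.2 SATA controller: hypothetical device without a debug port
\tCapabilities: [80] MSI: Enable+ Count=1/4 Maskable- 64bit-
"""

if __name__ == "__main__":
    for address in debug_port_devices(SAMPLE):
        print("debug port capability on", address)
```

To use it on a live system, the output of "lspci -vvv" (run as root so that
capabilities are listed) would be passed in place of SAMPLE.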

diff --git a/Documentation/x86/earlyprintk.rst b/Documentation/x86/earlyprintk.rst
new file mode 100644
index 000000000000..519402451f9c
--- /dev/null
+++ b/Documentation/x86/earlyprintk.rst
@@ -0,0 +1,148 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Early Printk
+============
+
+Mini-HOWTO for using the earlyprintk=dbgp boot option with a
+USB2 Debug port key and a debug cable, on x86 systems.
+
+You need two computers, the 'USB debug key' special gadget and
+two USB cables, connected like this::
+
+ [host/target] <-------> [USB debug key] <-------> [client/console]
+
+There are a number of specific hardware requirements
+====================================================
+
+ a) Host/target system needs to have USB debug port capability.
+
+ You can check this capability by looking at a 'Debug port' bit in
+ the lspci -vvv output::
+
+ # lspci -vvv
+ ...
+ 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03) (prog-if 20 [EHCI])
+ Subsystem: Lenovo ThinkPad T61
+ Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
+ Latency: 0
+ Interrupt: pin D routed to IRQ 19
+ Region 0: Memory at fe227000 (32-bit, non-prefetchable) [size=1K]
+ Capabilities: [50] Power Management version 2
+ Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
+ Status: D0 PME-Enable- DSel=0 DScale=0 PME+
+ Capabilities: [58] Debug port: BAR=1 offset=00a0
+ ^^^^^^^^^^^ <==================== [ HERE ]
+ Kernel driver in use: ehci_hcd
+ Kernel modules: ehci-hcd
+ ...
+
+ If your system does not list a debug port capability then you probably
+ won't be able to use the USB debug key.
+
+ b) You also need a NetChip USB debug cable/key:
+
+ http://www.plxtech.com/products/NET2000/NET20DC/default.asp
+
+ This is a small blue plastic connector with two USB connections;
+ it draws power from its USB connections.
+
+ c) You need a second client/console system with a high speed USB 2.0 port.
+
+ d) The NetChip device must be plugged directly into the physical
+ debug port on the "host/target" system. You cannot use a USB hub in
+ between the physical debug port and the "host/target" system.
+
+ The EHCI debug controller is bound to a specific physical USB
+ port and the NetChip device will only work as an early printk
+ device in this port. The EHCI host controllers are electrically
+ wired such that the EHCI debug controller is hooked up to the
+ first physical port and there is no way to change this via software.
+ You can find the physical port through experimentation by trying
+ each physical port on the system and rebooting. Or you can try
+ and use lsusb or look at the kernel info messages emitted by the
+ usb stack when you plug a usb device into various ports on the
+ "host/target" system.
+
+ Some hardware vendors do not expose the usb debug port with a
+ physical connector and if you find such a device send a complaint
+ to the hardware vendor, because there is no reason not to wire
+ this port into one of the physically accessible ports.
+
+ e) It is also important to note, that many versions of the NetChip
+ device require the "client/console" system to be plugged into the
+ right hand side of the device (with the product logo facing up and
+ readable left to right). The reason is that the 5 volt
+ power supply is taken from only one side of the device and it
+ must be the side that does not get rebooted.
+
+Software requirements
+=====================
+
+ a) On the host/target system:
+
+ You need to enable the following kernel config option::
+
+ CONFIG_EARLY_PRINTK_DBGP=y
+
+ And you need to add the boot command line: "earlyprintk=dbgp".
+
+ .. note:: If you are using Grub, append it to the 'kernel' line in
+ /etc/grub.conf. If you are using Grub2 on a BIOS firmware system,
+ append it to the 'linux' line in /boot/grub2/grub.cfg. If you are
+ using Grub2 on an EFI firmware system, append it to the 'linux'
+ or 'linuxefi' line in /boot/grub2/grub.cfg or
+ /boot/efi/EFI/<distro>/grub.cfg.
+
+ On systems with more than one EHCI debug controller you must
+ specify the correct EHCI debug controller number. The ordering
+ comes from the PCI bus enumeration of the EHCI controllers. The
+ default with no number argument is "0" or the first EHCI debug
+ controller. To use the second EHCI debug controller, you would
+ use the command line: "earlyprintk=dbgp1"
+
+ NOTE: normally earlyprintk console gets turned off once the
+ regular console is alive - use "earlyprintk=dbgp,keep" to keep
+ this channel open beyond early bootup. This can be useful for
+ debugging crashes under Xorg, etc.
+
+ b) On the client/console system:
+
+ You should enable the following kernel config option::
+
+ CONFIG_USB_SERIAL_DEBUG=y
+
+ On the next bootup with the modified kernel you should
+ get one or more /dev/ttyUSBx devices.
+
+ Now this channel of kernel messages is ready to be used: start
+ your favorite terminal emulator (minicom, etc.) and set
+ it up to use /dev/ttyUSB0 - or use a raw 'cat /dev/ttyUSBx' to
+ see the raw output.
+
+ c) On Nvidia Southbridge based systems: the kernel will try to probe
+ and find out which port has a debug device connected.
+
+Testing that it works fine
+==========================
+
+ You can test the output by using earlyprintk=dbgp,keep and provoking
+ kernel messages on the host/target system. You can provoke a harmless
+ kernel message by for example doing::
+
+ echo h > /proc/sysrq-trigger
+
+ On the host/target system you should see this help line in "dmesg" output::
+
+ SysRq : HELP : loglevel(0-9) reBoot Crashdump terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount show-blocked-tasks(W) dump-ftrace-buffer(Z)
+
+ On the client/console system do::
+
+ cat /dev/ttyUSB0
+
+ And you should see the help line above displayed shortly after you've
+ provoked it on the host system.
+
+If it does not work then please ask about it on the [email protected]
+mailing list or contact the x86 maintainers.
diff --git a/Documentation/x86/earlyprintk.txt b/Documentation/x86/earlyprintk.txt
deleted file mode 100644
index 46933e06c972..000000000000
--- a/Documentation/x86/earlyprintk.txt
+++ /dev/null
@@ -1,141 +0,0 @@
-
-Mini-HOWTO for using the earlyprintk=dbgp boot option with a
-USB2 Debug port key and a debug cable, on x86 systems.
-
-You need two computers, the 'USB debug key' special gadget and
-and two USB cables, connected like this:
-
- [host/target] <-------> [USB debug key] <-------> [client/console]
-
-1. There are a number of specific hardware requirements:
-
- a.) Host/target system needs to have USB debug port capability.
-
- You can check this capability by looking at a 'Debug port' bit in
- the lspci -vvv output:
-
- # lspci -vvv
- ...
- 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03) (prog-if 20 [EHCI])
- Subsystem: Lenovo ThinkPad T61
- Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
- Latency: 0
- Interrupt: pin D routed to IRQ 19
- Region 0: Memory at fe227000 (32-bit, non-prefetchable) [size=1K]
- Capabilities: [50] Power Management version 2
- Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
- Status: D0 PME-Enable- DSel=0 DScale=0 PME+
- Capabilities: [58] Debug port: BAR=1 offset=00a0
- ^^^^^^^^^^^ <==================== [ HERE ]
- Kernel driver in use: ehci_hcd
- Kernel modules: ehci-hcd
- ...
-
-( If your system does not list a debug port capability then you probably
- won't be able to use the USB debug key. )
-
- b.) You also need a NetChip USB debug cable/key:
-
- http://www.plxtech.com/products/NET2000/NET20DC/default.asp
-
- This is a small blue plastic connector with two USB connections;
- it draws power from its USB connections.
-
- c.) You need a second client/console system with a high speed USB 2.0
- port.
-
- d.) The NetChip device must be plugged directly into the physical
- debug port on the "host/target" system. You cannot use a USB hub in
- between the physical debug port and the "host/target" system.
-
- The EHCI debug controller is bound to a specific physical USB
- port and the NetChip device will only work as an early printk
- device in this port. The EHCI host controllers are electrically
- wired such that the EHCI debug controller is hooked up to the
- first physical port and there is no way to change this via software.
- You can find the physical port through experimentation by trying
- each physical port on the system and rebooting. Or you can try
- and use lsusb or look at the kernel info messages emitted by the
- usb stack when you plug a usb device into various ports on the
- "host/target" system.
-
- Some hardware vendors do not expose the usb debug port with a
- physical connector and if you find such a device send a complaint
- to the hardware vendor, because there is no reason not to wire
- this port into one of the physically accessible ports.
-
- e.) It is also important to note, that many versions of the NetChip
- device require the "client/console" system to be plugged into the
- right hand side of the device (with the product logo facing up and
- readable left to right). The reason being is that the 5 volt
- power supply is taken from only one side of the device and it
- must be the side that does not get rebooted.
-
-2. Software requirements:
-
- a.) On the host/target system:
-
- You need to enable the following kernel config option:
-
- CONFIG_EARLY_PRINTK_DBGP=y
-
- And you need to add the boot command line: "earlyprintk=dbgp".
-
- (If you are using Grub, append it to the 'kernel' line in
- /etc/grub.conf. If you are using Grub2 on a BIOS firmware system,
- append it to the 'linux' line in /boot/grub2/grub.cfg. If you are
- using Grub2 on an EFI firmware system, append it to the 'linux'
- or 'linuxefi' line in /boot/grub2/grub.cfg or
- /boot/efi/EFI/<distro>/grub.cfg.)
-
- On systems with more than one EHCI debug controller you must
- specify the correct EHCI debug controller number. The ordering
- comes from the PCI bus enumeration of the EHCI controllers. The
- default with no number argument is "0" or the first EHCI debug
- controller. To use the second EHCI debug controller, you would
- use the command line: "earlyprintk=dbgp1"
-
- NOTE: normally earlyprintk console gets turned off once the
- regular console is alive - use "earlyprintk=dbgp,keep" to keep
- this channel open beyond early bootup. This can be useful for
- debugging crashes under Xorg, etc.
-
- b.) On the client/console system:
-
- You should enable the following kernel config option:
-
- CONFIG_USB_SERIAL_DEBUG=y
-
- On the next bootup with the modified kernel you should
- get a /dev/ttyUSBx device(s).
-
- Now this channel of kernel messages is ready to be used: start
- your favorite terminal emulator (minicom, etc.) and set
- it up to use /dev/ttyUSB0 - or use a raw 'cat /dev/ttyUSBx' to
- see the raw output.
-
- c.) On Nvidia Southbridge based systems: the kernel will try to probe
- and find out which port has a debug device connected.
-
-3. Testing that it works fine:
-
- You can test the output by using earlyprintk=dbgp,keep and provoking
- kernel messages on the host/target system. You can provoke a harmless
- kernel message by for example doing:
-
- echo h > /proc/sysrq-trigger
-
- On the host/target system you should see this help line in "dmesg" output:
-
- SysRq : HELP : loglevel(0-9) reBoot Crashdump terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount show-blocked-tasks(W) dump-ftrace-buffer(Z)
-
- On the client/console system do:
-
- cat /dev/ttyUSB0
-
- And you should see the help line above displayed shortly after you've
- provoked it on the host system.
-
-If it does not work then please ask about it on the [email protected]
-mailing list or contact the x86 maintainers.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 8a666c5abc85..7b8388ebd43d 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -13,3 +13,4 @@ Linux x86 Support
exception-tables
kernel-stacks
entry_64
+ earlyprintk
--
2.20.1

2019-04-23 16:40:06

by Changbin Du

Subject: [PATCH v4 58/63] Documentation: x86: convert x86_64/uefi.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/x86_64/index.rst | 1 +
.../x86/x86_64/{uefi.txt => uefi.rst} | 30 ++++++++++++++-----
2 files changed, 24 insertions(+), 7 deletions(-)
rename Documentation/x86/x86_64/{uefi.txt => uefi.rst} (79%)

diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index a8cf7713cac9..ddfa1f9d4193 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -8,3 +8,4 @@ x86_64 Support
:maxdepth: 2

boot-options
+ uefi
diff --git a/Documentation/x86/x86_64/uefi.txt b/Documentation/x86/x86_64/uefi.rst
similarity index 79%
rename from Documentation/x86/x86_64/uefi.txt
rename to Documentation/x86/x86_64/uefi.rst
index a5e2b4fdb170..88c3ba32546f 100644
--- a/Documentation/x86/x86_64/uefi.txt
+++ b/Documentation/x86/x86_64/uefi.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
General note on [U]EFI x86_64 support
--------------------------------------
+=====================================

The nomenclature EFI and UEFI are used interchangeably in this document.

@@ -14,29 +17,42 @@ with EFI firmware and specifications are listed below.

3. x86_64 platform with EFI/UEFI firmware.

-Mechanics:
+Mechanics
---------
-- Build the kernel with the following configuration.
+
+- Build the kernel with the following configuration::
+
CONFIG_FB_EFI=y
CONFIG_FRAMEBUFFER_CONSOLE=y
+
If EFI runtime services are expected, the following configuration should
- be selected.
+ be selected::
+
CONFIG_EFI=y
CONFIG_EFI_VARS=y or m # optional
+
- Create a VFAT partition on the disk
- Copy the following to the VFAT partition:
+
elilo bootloader with x86_64 support, elilo configuration file,
kernel image built in first step and corresponding
initrd. Instructions on building elilo and its dependencies
can be found in the elilo sourceforge project.
+
- Boot to EFI shell and invoke elilo choosing the kernel image built
in first step.
- If some or all EFI runtime services don't work, you can try following
kernel command line parameters to turn off some or all EFI runtime
services.
- noefi turn off all EFI runtime services
- reboot_type=k turn off EFI reboot runtime service
+
+ noefi
+ turn off all EFI runtime services
+ reboot_type=k
+ turn off EFI reboot runtime service
+
- If the EFI memory map has additional entries not in the E820 map,
you can include those entries in the kernels memory map of available
physical RAM by using the following kernel command line parameter.
- add_efi_memmap include EFI memory map of available physical RAM
+
+ add_efi_memmap
+ include EFI memory map of available physical RAM
--
2.20.1

2019-04-23 16:40:12

by Changbin Du

Subject: [PATCH v4 59/63] Documentation: x86: convert x86_64/mm.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/x86_64/index.rst | 1 +
Documentation/x86/x86_64/mm.rst | 161 +++++++++++++++++++++++++++++
Documentation/x86/x86_64/mm.txt | 153 ---------------------------
3 files changed, 162 insertions(+), 153 deletions(-)
create mode 100644 Documentation/x86/x86_64/mm.rst
delete mode 100644 Documentation/x86/x86_64/mm.txt
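
The note at the top of mm.rst explains that the "Offset" column is the
distance of a start address from the top of the 64-bit address space, for
example 0xffffe90000000000 == -23 TB. A minimal sketch of that arithmetic
follows, assuming binary units (1 TB = 2**40 bytes) as the tables do. The
function names are hypothetical and chosen only for this illustration.

```python
# Binary units, largest first, matching the PB/TB/GB/MB/KB notation of the tables.
UNITS = [
    ("PB", 1 << 50),
    ("TB", 1 << 40),
    ("GB", 1 << 30),
    ("MB", 1 << 20),
    ("KB", 1 << 10),
]


def offset_from_top(addr):
    """Signed distance in bytes between addr and the top of the address space (2**64)."""
    return addr - (1 << 64)


def format_offset(addr):
    """Render the distance from the top in the largest unit that fits."""
    distance = offset_from_top(addr)  # negative for any 64-bit address
    for name, size in UNITS:
        if -distance >= size:
            return f"{distance / size:g} {name}"
    return f"{distance} bytes"


if __name__ == "__main__":
    # Start addresses taken from the 4-level and 5-level tables.
    for addr in (0xFFFFE90000000000, 0xFFFF888000000000, 0xFF11000000000000):
        print(f"{addr:016x} -> {format_offset(addr)}")
```

The sketch picks the unit from the magnitude alone, so it reproduces the
tables' switch from PB to TB to GB and MB as addresses approach the top.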

diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index ddfa1f9d4193..4b65d29ef459 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -9,3 +9,4 @@ x86_64 Support

boot-options
uefi
+ mm
diff --git a/Documentation/x86/x86_64/mm.rst b/Documentation/x86/x86_64/mm.rst
new file mode 100644
index 000000000000..4a29441c9e25
--- /dev/null
+++ b/Documentation/x86/x86_64/mm.rst
@@ -0,0 +1,161 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Memory Management
+=================
+
+Complete virtual memory map with 4-level page tables
+====================================================
+
+.. note::
+
+ - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down
+ from the top of the 64-bit address space. It's easier to understand the layout
+ when seen both in absolute addresses and in distance-from-top notation.
+
+ For example 0xffffe90000000000 == -23 TB, it's 23 TB lower than the top of the
+ 64-bit address space (ffffffffffffffff).
+
+ Note that as we get closer to the top of the address space, the notation changes
+ from TB to GB and then MB/KB.
+
+ - "16M TB" might look weird at first sight, but it's an easier to visualize size
+ notation than "16 EB", which few will recognize at first sight as 16 exabytes.
+ It also shows nicely how incredibly large the 64-bit address space is.
+
+::
+
+ ========================================================================================================================
+ Start addr | Offset | End addr | Size | VM area description
+ ========================================================================================================================
+ | | | |
+ 0000000000000000 | 0 | 00007fffffffffff | 128 TB | user-space virtual memory, different per mm
+ __________________|____________|__________________|_________|___________________________________________________________
+ | | | |
+ 0000800000000000 | +128 TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
+ | | | | virtual memory addresses up to the -128 TB
+ | | | | starting offset of kernel mappings.
+ __________________|____________|__________________|_________|___________________________________________________________
+ |
+ | Kernel-space virtual memory, shared between all processes:
+ ____________________________________________________________|___________________________________________________________
+ | | | |
+ ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor
+ ffff880000000000 | -120 TB | ffff887fffffffff | 0.5 TB | LDT remap for PTI
+ ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)
+ ffffc88000000000 | -55.5 TB | ffffc8ffffffffff | 0.5 TB | ... unused hole
+ ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base)
+ ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole
+ ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base)
+ ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... unused hole
+ ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory
+ __________________|____________|__________________|_________|____________________________________________________________
+ |
+ | Identical layout to the 56-bit one from here on:
+ ____________________________________________________________|____________________________________________________________
+ | | | |
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
+ ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
+ ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
+ ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
+ ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
+ ffffffff80000000 |-2048 MB | | |
+ ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
+ ffffffffff000000 | -16 MB | | |
+ FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
+ ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
+ ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
+ __________________|____________|__________________|_________|___________________________________________________________
+
+
+Complete virtual memory map with 5-level page tables
+====================================================
+
+.. note::
+
+ - With 56-bit addresses, user-space memory gets expanded by a factor of 512x,
+ from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PB starting
+ offset and many of the regions expand to support the much larger physical
+ memory supported.
+
+::
+
+ ========================================================================================================================
+ Start addr | Offset | End addr | Size | VM area description
+ ========================================================================================================================
+ | | | |
+ 0000000000000000 | 0 | 00ffffffffffffff | 64 PB | user-space virtual memory, different per mm
+ __________________|____________|__________________|_________|___________________________________________________________
+ | | | |
+ 0100000000000000 | +64 PB | feffffffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
+ | | | | virtual memory addresses up to the -64 PB
+ | | | | starting offset of kernel mappings.
+ __________________|____________|__________________|_________|___________________________________________________________
+ |
+ | Kernel-space virtual memory, shared between all processes:
+ ____________________________________________________________|___________________________________________________________
+ | | | |
+ ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor
+ ff10000000000000 | -60 PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
+ ff11000000000000 | -59.75 PB | ff90ffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base)
+ ff91000000000000 | -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
+ ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
+ ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... unused hole
+ ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base)
+ ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole
+ ffdf000000000000 | -8.25 PB | fffffbffffffffff | ~8 PB | KASAN shadow memory
+ __________________|____________|__________________|_________|____________________________________________________________
+ |
+ | Identical layout to the 47-bit one from here on:
+ ____________________________________________________________|____________________________________________________________
+ | | | |
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
+ ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
+ ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
+ ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
+ ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
+ ffffffff80000000 |-2048 MB | | |
+ ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
+ ffffffffff000000 | -16 MB | | |
+ FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
+ ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
+ ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
+ __________________|____________|__________________|_________|___________________________________________________________
+
+Architecture defines a 64-bit virtual address. Implementations can support
+less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
+through to the most-significant implemented bit are sign extended.
+This causes a hole between user space and kernel addresses if you interpret them
+as unsigned.
+
+The direct mapping covers all memory in the system up to the highest
+memory address (this means in some cases it can also include PCI memory
+holes).
+
+vmalloc space is lazily synchronized into the different PML4/PML5 pages of
+the processes using the page fault handler, with init_top_pgt as
+reference.
+
+We map EFI runtime services in the 'efi_pgd' PGD in a 64 GB large virtual
+memory window (this size is arbitrary, it can be raised later if needed).
+The mappings are not part of any other kernel PGD and are only available
+during EFI runtime calls.
+
+Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
+physical memory, vmalloc/ioremap space and virtual memory map are randomized.
+Their order is preserved but their base will be offset early at boot time.
+
+Be very careful vs. KASLR when changing anything here. The KASLR address
+range must not overlap with anything except the KASAN shadow area, which is
+correct as KASAN disables KASLR.
+
+For both 4- and 5-level layouts, the STACKLEAK_POISON value lies in the last 2MB
+hole: ffffffffffff4111
diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
deleted file mode 100644
index 804f9426ed17..000000000000
--- a/Documentation/x86/x86_64/mm.txt
+++ /dev/null
@@ -1,153 +0,0 @@
-====================================================
-Complete virtual memory map with 4-level page tables
-====================================================
-
-Notes:
-
- - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down
- from the top of the 64-bit address space. It's easier to understand the layout
- when seen both in absolute addresses and in distance-from-top notation.
-
- For example 0xffffe90000000000 == -23 TB, it's 23 TB lower than the top of the
- 64-bit address space (ffffffffffffffff).
-
- Note that as we get closer to the top of the address space, the notation changes
- from TB to GB and then MB/KB.
-
- - "16M TB" might look weird at first sight, but it's an easier to visualize size
- notation than "16 EB", which few will recognize at first sight as 16 exabytes.
- It also shows it nicely how incredibly large 64-bit address space is.
-
-========================================================================================================================
- Start addr | Offset | End addr | Size | VM area description
-========================================================================================================================
- | | | |
- 0000000000000000 | 0 | 00007fffffffffff | 128 TB | user-space virtual memory, different per mm
-__________________|____________|__________________|_________|___________________________________________________________
- | | | |
- 0000800000000000 | +128 TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
- | | | | virtual memory addresses up to the -128 TB
- | | | | starting offset of kernel mappings.
-__________________|____________|__________________|_________|___________________________________________________________
- |
- | Kernel-space virtual memory, shared between all processes:
-____________________________________________________________|___________________________________________________________
- | | | |
- ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor
- ffff880000000000 | -120 TB | ffff887fffffffff | 0.5 TB | LDT remap for PTI
- ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)
- ffffc88000000000 | -55.5 TB | ffffc8ffffffffff | 0.5 TB | ... unused hole
- ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base)
- ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole
- ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base)
- ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... unused hole
- ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory
-__________________|____________|__________________|_________|____________________________________________________________
- |
- | Identical layout to the 56-bit one from here on:
-____________________________________________________________|____________________________________________________________
- | | | |
- fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
- | | | | vaddr_end for KASLR
- fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
- fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
- ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
- ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
- ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
- ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
- ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
- ffffffff80000000 |-2048 MB | | |
- ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
- ffffffffff000000 | -16 MB | | |
- FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
- ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
- ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
-__________________|____________|__________________|_________|___________________________________________________________
-
-
-====================================================
-Complete virtual memory map with 5-level page tables
-====================================================
-
-Notes:
-
- - With 56-bit addresses, user-space memory gets expanded by a factor of 512x,
- from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PT starting
- offset and many of the regions expand to support the much larger physical
- memory supported.
-
-========================================================================================================================
- Start addr | Offset | End addr | Size | VM area description
-========================================================================================================================
- | | | |
- 0000000000000000 | 0 | 00ffffffffffffff | 64 PB | user-space virtual memory, different per mm
-__________________|____________|__________________|_________|___________________________________________________________
- | | | |
- 0000800000000000 | +64 PB | ffff7fffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
- | | | | virtual memory addresses up to the -64 PB
- | | | | starting offset of kernel mappings.
-__________________|____________|__________________|_________|___________________________________________________________
- |
- | Kernel-space virtual memory, shared between all processes:
-____________________________________________________________|___________________________________________________________
- | | | |
- ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor
- ff10000000000000 | -60 PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
- ff11000000000000 | -59.75 PB | ff90ffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base)
- ff91000000000000 | -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
- ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
- ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... unused hole
- ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base)
- ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole
- ffdf000000000000 | -8.25 PB | fffffdffffffffff | ~8 PB | KASAN shadow memory
-__________________|____________|__________________|_________|____________________________________________________________
- |
- | Identical layout to the 47-bit one from here on:
-____________________________________________________________|____________________________________________________________
- | | | |
- fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
- | | | | vaddr_end for KASLR
- fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
- fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
- ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
- ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
- ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
- ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
- ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
- ffffffff80000000 |-2048 MB | | |
- ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
- ffffffffff000000 | -16 MB | | |
- FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
- ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
- ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
-__________________|____________|__________________|_________|___________________________________________________________
-
-Architecture defines a 64-bit virtual address. Implementations can support
-less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
-through to the most-significant implemented bit are sign extended.
-This causes hole between user space and kernel addresses if you interpret them
-as unsigned.
-
-The direct mapping covers all memory in the system up to the highest
-memory address (this means in some cases it can also include PCI memory
-holes).
-
-vmalloc space is lazily synchronized into the different PML4/PML5 pages of
-the processes using the page fault handler, with init_top_pgt as
-reference.
-
-We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual
-memory window (this size is arbitrary, it can be raised later if needed).
-The mappings are not part of any other kernel PGD and are only available
-during EFI runtime calls.
-
-Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
-physical memory, vmalloc/ioremap space and virtual memory map are randomized.
-Their order is preserved but their base will be offset early at boot time.
-
-Be very careful vs. KASLR when changing anything here. The KASLR address
-range must not overlap with anything except the KASAN shadow area, which is
-correct as KASAN disables KASLR.
-
-For both 4- and 5-level layouts, the STACKLEAK_POISON value in the last 2MB
-hole: ffffffffffff4111
--
2.20.1
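
A side note on the memory map converted in this patch: the "Offset" column
and the sign-extension rule described after the tables can be checked with a
few lines of C. This is only an illustrative sketch. offset_from_top() and
is_canonical() are hypothetical helper names, not kernel functions, and
va_bits is assumed to be 48 for 4-level and 57 for 5-level paging, as the
document states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Distance of a virtual address below the top of the 64-bit address space,
 * in bytes. This is the magnitude of the negative "Offset" column,
 * e.g. "-23 TB" for 0xffffe90000000000.
 */
uint64_t offset_from_top(uint64_t vaddr)
{
	return (uint64_t)0 - vaddr;	/* unsigned wrap-around: 2^64 - vaddr */
}

/*
 * An address is canonical when bits 63 down to (va_bits - 1) are all equal,
 * i.e. the value is the sign extension of its low va_bits bits.
 * va_bits is an assumption of this sketch: 48 or 57 on current hardware.
 */
bool is_canonical(uint64_t vaddr, unsigned int va_bits)
{
	assert(va_bits >= 2 && va_bits <= 64);

	uint64_t upper = vaddr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return upper == 0 || upper == all_ones;
}
```

With these helpers, offset_from_top(0xffffe90000000000) is 23 << 40 bytes,
which matches the "-23 TB" example in the notes, and 0x0000800000000000 is
rejected for va_bits = 48 because it is the first address of the
non-canonical hole.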

2019-04-23 16:40:22

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 61/63] Documentation: x86: convert x86_64/fake-numa-for-cpusets to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
...a-for-cpusets => fake-numa-for-cpusets.rst} | 18 +++++++++++++-----
Documentation/x86/x86_64/index.rst | 1 +
2 files changed, 14 insertions(+), 5 deletions(-)
rename Documentation/x86/x86_64/{fake-numa-for-cpusets => fake-numa-for-cpusets.rst} (90%)

diff --git a/Documentation/x86/x86_64/fake-numa-for-cpusets b/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
similarity index 90%
rename from Documentation/x86/x86_64/fake-numa-for-cpusets
rename to Documentation/x86/x86_64/fake-numa-for-cpusets.rst
index 4b09f18831f8..3c23f45d38f6 100644
--- a/Documentation/x86/x86_64/fake-numa-for-cpusets
+++ b/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
@@ -1,5 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+Fake NUMA For CPUSets
+=====================
+
+:Author: David Rientjes <[email protected]>
+
Using numa=fake and CPUSets for Resource Management
-Written by David Rientjes <[email protected]>

This document describes how the numa=fake x86_64 command-line option can be used
in conjunction with cpusets for coarse memory management. Using this feature,
@@ -20,7 +27,7 @@ you become more familiar with using this combination for resource control,
you'll determine a better setup to minimize the number of nodes you have to deal
with.

-A machine may be split as follows with "numa=fake=4*512," as reported by dmesg:
+A machine may be split as follows with "numa=fake=4*512," as reported by dmesg::

Faking node 0 at 0000000000000000-0000000020000000 (512MB)
Faking node 1 at 0000000020000000-0000000040000000 (512MB)
@@ -34,7 +41,7 @@ A machine may be split as follows with "numa=fake=4*512," as reported by dmesg:

Now following the instructions for mounting the cpusets filesystem from
Documentation/cgroup-v1/cpusets.txt, you can assign fake nodes (i.e. contiguous memory
-address spaces) to individual cpusets:
+address spaces) to individual cpusets::

[root@xroads /]# mkdir exampleset
[root@xroads /]# mount -t cpuset none exampleset
@@ -47,7 +54,7 @@ Now this cpuset, 'ddset', will only allowed access to fake nodes 0 and 1 for
memory allocations (1G).

You can now assign tasks to these cpusets to limit the memory resources
-available to them according to the fake nodes assigned as mems:
+available to them according to the fake nodes assigned as mems::

[root@xroads /exampleset/ddset]# echo $$ > tasks
[root@xroads /exampleset/ddset]# dd if=/dev/zero of=tmp bs=1024 count=1G
@@ -56,7 +63,8 @@ available to them according to the fake nodes assigned as mems:
Notice the difference between the system memory usage as reported by
/proc/meminfo between the restricted cpuset case above and the unrestricted
case (i.e. running the same 'dd' command without assigning it to a fake NUMA
-cpuset):
+cpuset)::
+
Unrestricted Restricted
MemTotal: 3091900 kB 3091900 kB
MemFree: 42113 kB 1513236 kB
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index 7b8c82151358..e2a324cde671 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -11,3 +11,4 @@ x86_64 Support
uefi
mm
5level-paging
+ fake-numa-for-cpusets
--
2.20.1

2019-04-23 16:40:23

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 46/63] Documentation: x86: convert mtrr.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
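Note for reviewers, not part of the commit: the ASCII interface described in
the document takes a "base=... size=... type=..." string. The C sketch below
shows how such a request could be built from the values the X server reports.
format_mtrr_request() is a hypothetical helper, not an existing API, and it
assumes the framebuffer size is given in kilobytes as in the
"videoram: 4096k" sample.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the ASCII request that would be written to /proc/mtrr, e.g.
 * "base=0xf8000000 size=0x400000 type=write-combining".
 * size_kb is the size in kilobytes (assumption of this sketch).
 * Returns the string length, or -1 if the result does not fit in buf.
 */
int format_mtrr_request(char *buf, size_t len, uint64_t base,
			uint64_t size_kb, const char *type)
{
	int n;

	assert(buf != NULL && type != NULL && strlen(type) > 0);

	n = snprintf(buf, len, "base=0x%" PRIx64 " size=0x%" PRIx64 " type=%s",
		     base, size_kb * 1024, type);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Called with base 0xf8000000, 4096 KB and "write-combining", this produces
the same string as the echo example in the document. Writing it to
/proc/mtrr is left to the caller.
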
Documentation/x86/index.rst | 1 +
Documentation/x86/mtrr.rst | 350 ++++++++++++++++++++++++++++++++++++
Documentation/x86/mtrr.txt | 329 ---------------------------------
3 files changed, 351 insertions(+), 329 deletions(-)
create mode 100644 Documentation/x86/mtrr.rst
delete mode 100644 Documentation/x86/mtrr.txt

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index fd54b859db9b..d805962a7238 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -16,3 +16,4 @@ Linux x86 Support
earlyprintk
zero-page
tlb
+ mtrr
diff --git a/Documentation/x86/mtrr.rst b/Documentation/x86/mtrr.rst
new file mode 100644
index 000000000000..72da61022861
--- /dev/null
+++ b/Documentation/x86/mtrr.rst
@@ -0,0 +1,350 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+MTRR (Memory Type Range Register) control
+=========================================
+
+:Authors: - Richard Gooch <[email protected]> - 3 Jun 1999
+ - Luis R. Rodriguez <[email protected]> - April 9, 2015
+
+
+Phasing out MTRR use
+====================
+
+MTRR use is replaced on modern x86 hardware with PAT. Direct MTRR use by
+drivers on Linux is now completely phased out; device drivers should use
+arch_phys_wc_add() in combination with ioremap_wc() to make MTRR effective on
+non-PAT systems while a no-op but equally effective on PAT enabled systems.
+
+Even if Linux does not use MTRRs directly, some x86 platform firmware may still
+set up MTRRs early before booting the OS. They do this as some platform
+firmware may still have implemented access to MTRRs which would be controlled
+and handled by the platform firmware directly. An example of platform use of
+MTRRs is through the use of SMI handlers, one case could be for fan control,
+the platform code would need uncachable access to some of its fan control
+registers. Such platform access does not need any Operating System MTRR code in
+place other than mtrr_type_lookup() to ensure any OS specific mapping requests
+are aligned with platform MTRR setup. If MTRRs are only set up by the platform
+firmware code though and the OS does not make any specific MTRR mapping
+requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID.
+
+For details refer to :doc:`pat`.
+
+On Intel P6 family processors (Pentium Pro, Pentium II and later)
+the Memory Type Range Registers (MTRRs) may be used to control
+processor access to memory ranges. This is most useful when you have
+a video (VGA) card on a PCI or AGP bus. Enabling write-combining
+allows bus write transfers to be combined into a larger transfer
+before bursting over the PCI/AGP bus. This can increase performance
+of image write operations 2.5 times or more.
+
+The Cyrix 6x86, 6x86MX and M II processors have Address Range
+Registers (ARRs) which provide a similar functionality to MTRRs. For
+these, the ARRs are used to emulate the MTRRs.
+
+The AMD K6-2 (stepping 8 and above) and K6-3 processors have two
+MTRRs. These are supported. The AMD Athlon family provide 8 Intel
+style MTRRs.
+
+The Centaur C6 (WinChip) has 8 MCRs, allowing write-combining. These
+are supported.
+
+The VIA Cyrix III and VIA C3 CPUs offer 8 Intel style MTRRs.
+
+The CONFIG_MTRR option creates a /proc/mtrr file which may be used
+to manipulate your MTRRs. Typically the X server should use
+this. This should have a reasonably generic interface so that
+similar control registers on other processors can be easily
+supported.
+
+There are two interfaces to /proc/mtrr: one is an ASCII interface
+which allows you to read and write. The other is an ioctl()
+interface. The ASCII interface is meant for administration. The
+ioctl() interface is meant for C programs (i.e. the X server). The
+interfaces are described below, with sample commands and C code.
+
+Reading MTRRs from the shell::
+
+ % cat /proc/mtrr
+ reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
+ reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
+
+Creating MTRRs from the C-shell::
+
+ # echo "base=0xf8000000 size=0x400000 type=write-combining" >! /proc/mtrr
+
+or if you use bash::
+
+ # echo "base=0xf8000000 size=0x400000 type=write-combining" >| /proc/mtrr
+
+And the result thereof::
+
+ % cat /proc/mtrr
+ reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
+ reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
+ reg02: base=0xf8000000 (3968MB), size= 4MB: write-combining, count=1
+
+This is for video RAM at base address 0xf8000000 and size 4 megabytes. To
+find out your base address, you need to look at the output of your X
+server, which tells you where the linear framebuffer address is. A
+typical line that you may get is::
+
+ (--) S3: PCI: 968 rev 0, Linear FB @ 0xf8000000
+
+Note that you should only use the value from the X server, as it may
+move the framebuffer base address, so the only value you can trust is
+that reported by the X server.
+
+To find out the size of your framebuffer (what, you don't actually
+know?), the following line will tell you::
+
+ (--) S3: videoram: 4096k
+
+That's 4 megabytes, which is 0x400000 bytes (in hexadecimal).
+A patch is being written for XFree86 which will make this automatic:
+in other words the X server will manipulate /proc/mtrr using the
+ioctl() interface, so users won't have to do anything. If you use a
+commercial X server, lobby your vendor to add support for MTRRs.
+
+
+Creating overlapping MTRRs
+==========================
+::
+
+ % echo "base=0xfb000000 size=0x1000000 type=write-combining" >/proc/mtrr
+ % echo "base=0xfb000000 size=0x1000 type=uncachable" >/proc/mtrr
+
+And the results::
+
+ % cat /proc/mtrr
+ reg00: base=0x00000000 ( 0MB), size= 64MB: write-back, count=1
+ reg01: base=0xfb000000 (4016MB), size= 16MB: write-combining, count=1
+ reg02: base=0xfb000000 (4016MB), size= 4kB: uncachable, count=1
+
+Some cards (especially Voodoo Graphics boards) need this 4 kB area
+excluded from the beginning of the region because it is used for
+registers.
+
+NOTE: You can only create a type=uncachable region if the first
+region that you created is type=write-combining.
+
+
+Removing MTRRs from the C-shell
+===============================
+::
+
+ % echo "disable=2" >! /proc/mtrr
+
+or using bash::
+
+ % echo "disable=2" >| /proc/mtrr
+
+
+Reading MTRRs from a C program using ioctl()'s
+==============================================
+::
+
+ /* mtrr-show.c
+
+ Source file for mtrr-show (example program to show MTRRs using ioctl()'s)
+
+ Copyright (C) 1997-1998 Richard Gooch
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+ Richard Gooch may be reached by email at [email protected]
+ The postal address is:
+ Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
+ */
+
+ /*
+ This program will use an ioctl() on /proc/mtrr to show the current MTRR
+ settings. This is an alternative to reading /proc/mtrr.
+
+
+ Written by Richard Gooch 17-DEC-1997
+
+ Last updated by Richard Gooch 2-MAY-1998
+
+
+ */
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>
+ #include <sys/ioctl.h>
+ #include <errno.h>
+ #include <asm/mtrr.h>
+
+ #define TRUE 1
+ #define FALSE 0
+ #define ERRSTRING strerror (errno)
+
+ static char *mtrr_strings[MTRR_NUM_TYPES] =
+ {
+ "uncachable", /* 0 */
+ "write-combining", /* 1 */
+ "?", /* 2 */
+ "?", /* 3 */
+ "write-through", /* 4 */
+ "write-protect", /* 5 */
+ "write-back", /* 6 */
+ };
+
+ int main ()
+ {
+ int fd;
+ struct mtrr_gentry gentry;
+
+ if ( ( fd = open ("/proc/mtrr", O_RDONLY, 0) ) == -1 )
+ {
+ if (errno == ENOENT)
+ {
+ fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
+ stderr);
+ exit (1);
+ }
+ fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
+ exit (2);
+ }
+ for (gentry.regnum = 0; ioctl (fd, MTRRIOC_GET_ENTRY, &gentry) == 0;
+ ++gentry.regnum)
+ {
+ if (gentry.size < 1)
+ {
+ fprintf (stderr, "Register: %u disabled\n", gentry.regnum);
+ continue;
+ }
+ fprintf (stderr, "Register: %u base: 0x%lx size: 0x%lx type: %s\n",
+ gentry.regnum, gentry.base, gentry.size,
+ mtrr_strings[gentry.type]);
+ }
+ if (errno == EINVAL) exit (0);
+ fprintf (stderr, "Error doing ioctl(2) on /proc/mtrr\t%s\n", ERRSTRING);
+ exit (3);
+ } /* End Function main */
+
+
+Creating MTRRs from a C programme using ioctl()'s
+=================================================
+::
+
+ /* mtrr-add.c
+
+ Source file for mtrr-add (example programme to add an MTRR using ioctl())
+
+ Copyright (C) 1997-1998 Richard Gooch
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+ Richard Gooch may be reached by email at [email protected]
+ The postal address is:
+ Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
+ */
+
+ /*
+ This programme will use an ioctl() on /proc/mtrr to add an entry. The first
+ available mtrr is used. This is an alternative to writing /proc/mtrr.
+
+
+ Written by Richard Gooch 17-DEC-1997
+
+ Last updated by Richard Gooch 2-MAY-1998
+
+
+ */
+ #include <stdio.h>
+ #include <string.h>
+ #include <stdlib.h>
+ #include <unistd.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>
+ #include <sys/ioctl.h>
+ #include <errno.h>
+ #include <asm/mtrr.h>
+
+ #define TRUE 1
+ #define FALSE 0
+ #define ERRSTRING strerror (errno)
+
+ static char *mtrr_strings[MTRR_NUM_TYPES] =
+ {
+ "uncachable", /* 0 */
+ "write-combining", /* 1 */
+ "?", /* 2 */
+ "?", /* 3 */
+ "write-through", /* 4 */
+ "write-protect", /* 5 */
+ "write-back", /* 6 */
+ };
+
+ int main (int argc, char **argv)
+ {
+ int fd;
+ struct mtrr_sentry sentry;
+
+ if (argc != 4)
+ {
+ fprintf (stderr, "Usage:\tmtrr-add base size type\n");
+ exit (1);
+ }
+ sentry.base = strtoul (argv[1], NULL, 0);
+ sentry.size = strtoul (argv[2], NULL, 0);
+ for (sentry.type = 0; sentry.type < MTRR_NUM_TYPES; ++sentry.type)
+ {
+ if (strcmp (argv[3], mtrr_strings[sentry.type]) == 0) break;
+ }
+ if (sentry.type >= MTRR_NUM_TYPES)
+ {
+ fprintf (stderr, "Illegal type: \"%s\"\n", argv[3]);
+ exit (2);
+ }
+ if ( ( fd = open ("/proc/mtrr", O_WRONLY, 0) ) == -1 )
+ {
+ if (errno == ENOENT)
+ {
+ fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
+ stderr);
+ exit (3);
+ }
+ fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
+ exit (4);
+ }
+ if (ioctl (fd, MTRRIOC_ADD_ENTRY, &sentry) == -1)
+ {
+ fprintf (stderr, "Error doing ioctl(2) on /proc/mtrr\t%s\n", ERRSTRING);
+ exit (5);
+ }
+ fprintf (stderr, "Sleeping for 5 seconds so you can see the new entry\n");
+ sleep (5);
+ close (fd);
+ fputs ("I've just closed /proc/mtrr so now the new entry should be gone\n",
+ stderr);
+ } /* End Function main */
diff --git a/Documentation/x86/mtrr.txt b/Documentation/x86/mtrr.txt
deleted file mode 100644
index dc3e703913ac..000000000000
--- a/Documentation/x86/mtrr.txt
+++ /dev/null
@@ -1,329 +0,0 @@
-MTRR (Memory Type Range Register) control
-
-Richard Gooch <[email protected]> - 3 Jun 1999
-Luis R. Rodriguez <[email protected]> - April 9, 2015
-
-===============================================================================
-Phasing out MTRR use
-
-MTRR use is replaced on modern x86 hardware with PAT. Direct MTRR use by
-drivers on Linux is now completely phased out, device drivers should use
-arch_phys_wc_add() in combination with ioremap_wc() to make MTRR effective on
-non-PAT systems while a no-op but equally effective on PAT enabled systems.
-
-Even if Linux does not use MTRRs directly, some x86 platform firmware may still
-set up MTRRs early before booting the OS. They do this as some platform
-firmware may still have implemented access to MTRRs which would be controlled
-and handled by the platform firmware directly. An example of platform use of
-MTRRs is through the use of SMI handlers, one case could be for fan control,
-the platform code would need uncachable access to some of its fan control
-registers. Such platform access does not need any Operating System MTRR code in
-place other than mtrr_type_lookup() to ensure any OS specific mapping requests
-are aligned with platform MTRR setup. If MTRRs are only set up by the platform
-firmware code though and the OS does not make any specific MTRR mapping
-requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID.
-
-For details refer to Documentation/x86/pat.txt.
-
-===============================================================================
-
- On Intel P6 family processors (Pentium Pro, Pentium II and later)
- the Memory Type Range Registers (MTRRs) may be used to control
- processor access to memory ranges. This is most useful when you have
- a video (VGA) card on a PCI or AGP bus. Enabling write-combining
- allows bus write transfers to be combined into a larger transfer
- before bursting over the PCI/AGP bus. This can increase performance
- of image write operations 2.5 times or more.
-
- The Cyrix 6x86, 6x86MX and M II processors have Address Range
- Registers (ARRs) which provide a similar functionality to MTRRs. For
- these, the ARRs are used to emulate the MTRRs.
-
- The AMD K6-2 (stepping 8 and above) and K6-3 processors have two
- MTRRs. These are supported. The AMD Athlon family provide 8 Intel
- style MTRRs.
-
- The Centaur C6 (WinChip) has 8 MCRs, allowing write-combining. These
- are supported.
-
- The VIA Cyrix III and VIA C3 CPUs offer 8 Intel style MTRRs.
-
- The CONFIG_MTRR option creates a /proc/mtrr file which may be used
- to manipulate your MTRRs. Typically the X server should use
- this. This should have a reasonably generic interface so that
- similar control registers on other processors can be easily
- supported.
-
-
-There are two interfaces to /proc/mtrr: one is an ASCII interface
-which allows you to read and write. The other is an ioctl()
-interface. The ASCII interface is meant for administration. The
-ioctl() interface is meant for C programs (i.e. the X server). The
-interfaces are described below, with sample commands and C code.
-
-===============================================================================
-Reading MTRRs from the shell:
-
-% cat /proc/mtrr
-reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
-reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
-===============================================================================
-Creating MTRRs from the C-shell:
-# echo "base=0xf8000000 size=0x400000 type=write-combining" >! /proc/mtrr
-or if you use bash:
-# echo "base=0xf8000000 size=0x400000 type=write-combining" >| /proc/mtrr
-
-And the result thereof:
-% cat /proc/mtrr
-reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
-reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
-reg02: base=0xf8000000 (3968MB), size= 4MB: write-combining, count=1
-
-This is for video RAM at base address 0xf8000000 and size 4 megabytes. To
-find out your base address, you need to look at the output of your X
-server, which tells you where the linear framebuffer address is. A
-typical line that you may get is:
-
-(--) S3: PCI: 968 rev 0, Linear FB @ 0xf8000000
-
-Note that you should only use the value from the X server, as it may
-move the framebuffer base address, so the only value you can trust is
-that reported by the X server.
-
-To find out the size of your framebuffer (what, you don't actually
-know?), the following line will tell you:
-
-(--) S3: videoram: 4096k
-
-That's 4 megabytes, which is 0x400000 bytes (in hexadecimal).
-A patch is being written for XFree86 which will make this automatic:
-in other words the X server will manipulate /proc/mtrr using the
-ioctl() interface, so users won't have to do anything. If you use a
-commercial X server, lobby your vendor to add support for MTRRs.
-===============================================================================
-Creating overlapping MTRRs:
-
-%echo "base=0xfb000000 size=0x1000000 type=write-combining" >/proc/mtrr
-%echo "base=0xfb000000 size=0x1000 type=uncachable" >/proc/mtrr
-
-And the results: cat /proc/mtrr
-reg00: base=0x00000000 ( 0MB), size= 64MB: write-back, count=1
-reg01: base=0xfb000000 (4016MB), size= 16MB: write-combining, count=1
-reg02: base=0xfb000000 (4016MB), size= 4kB: uncachable, count=1
-
-Some cards (especially Voodoo Graphics boards) need this 4 kB area
-excluded from the beginning of the region because it is used for
-registers.
-
-NOTE: You can only create type=uncachable region, if the first
-region that you created is type=write-combining.
-===============================================================================
-Removing MTRRs from the C-shell:
-% echo "disable=2" >! /proc/mtrr
-or using bash:
-% echo "disable=2" >| /proc/mtrr
-===============================================================================
-Reading MTRRs from a C program using ioctl()'s:
-
-/* mtrr-show.c
-
- Source file for mtrr-show (example program to show MTRRs using ioctl()'s)
-
- Copyright (C) 1997-1998 Richard Gooch
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program; if not, write to the Free Software
- Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-
- Richard Gooch may be reached by email at [email protected]
- The postal address is:
- Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
-*/
-
-/*
- This program will use an ioctl() on /proc/mtrr to show the current MTRR
- settings. This is an alternative to reading /proc/mtrr.
-
-
- Written by Richard Gooch 17-DEC-1997
-
- Last updated by Richard Gooch 2-MAY-1998
-
-
-*/
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <sys/ioctl.h>
-#include <errno.h>
-#include <asm/mtrr.h>
-
-#define TRUE 1
-#define FALSE 0
-#define ERRSTRING strerror (errno)
-
-static char *mtrr_strings[MTRR_NUM_TYPES] =
-{
- "uncachable", /* 0 */
- "write-combining", /* 1 */
- "?", /* 2 */
- "?", /* 3 */
- "write-through", /* 4 */
- "write-protect", /* 5 */
- "write-back", /* 6 */
-};
-
-int main ()
-{
- int fd;
- struct mtrr_gentry gentry;
-
- if ( ( fd = open ("/proc/mtrr", O_RDONLY, 0) ) == -1 )
- {
- if (errno == ENOENT)
- {
- fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
- stderr);
- exit (1);
- }
- fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
- exit (2);
- }
- for (gentry.regnum = 0; ioctl (fd, MTRRIOC_GET_ENTRY, &gentry) == 0;
- ++gentry.regnum)
- {
- if (gentry.size < 1)
- {
- fprintf (stderr, "Register: %u disabled\n", gentry.regnum);
- continue;
- }
- fprintf (stderr, "Register: %u base: 0x%lx size: 0x%lx type: %s\n",
- gentry.regnum, gentry.base, gentry.size,
- mtrr_strings[gentry.type]);
- }
- if (errno == EINVAL) exit (0);
- fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING);
- exit (3);
-} /* End Function main */
-===============================================================================
-Creating MTRRs from a C programme using ioctl()'s:
-
-/* mtrr-add.c
-
- Source file for mtrr-add (example programme to add an MTRRs using ioctl())
-
- Copyright (C) 1997-1998 Richard Gooch
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program; if not, write to the Free Software
- Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-
- Richard Gooch may be reached by email at [email protected]
- The postal address is:
- Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
-*/
-
-/*
- This programme will use an ioctl() on /proc/mtrr to add an entry. The first
- available mtrr is used. This is an alternative to writing /proc/mtrr.
-
-
- Written by Richard Gooch 17-DEC-1997
-
- Last updated by Richard Gooch 2-MAY-1998
-
-
-*/
-#include <stdio.h>
-#include <string.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <sys/ioctl.h>
-#include <errno.h>
-#include <asm/mtrr.h>
-
-#define TRUE 1
-#define FALSE 0
-#define ERRSTRING strerror (errno)
-
-static char *mtrr_strings[MTRR_NUM_TYPES] =
-{
- "uncachable", /* 0 */
- "write-combining", /* 1 */
- "?", /* 2 */
- "?", /* 3 */
- "write-through", /* 4 */
- "write-protect", /* 5 */
- "write-back", /* 6 */
-};
-
-int main (int argc, char **argv)
-{
- int fd;
- struct mtrr_sentry sentry;
-
- if (argc != 4)
- {
- fprintf (stderr, "Usage:\tmtrr-add base size type\n");
- exit (1);
- }
- sentry.base = strtoul (argv[1], NULL, 0);
- sentry.size = strtoul (argv[2], NULL, 0);
- for (sentry.type = 0; sentry.type < MTRR_NUM_TYPES; ++sentry.type)
- {
- if (strcmp (argv[3], mtrr_strings[sentry.type]) == 0) break;
- }
- if (sentry.type >= MTRR_NUM_TYPES)
- {
- fprintf (stderr, "Illegal type: \"%s\"\n", argv[3]);
- exit (2);
- }
- if ( ( fd = open ("/proc/mtrr", O_WRONLY, 0) ) == -1 )
- {
- if (errno == ENOENT)
- {
- fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
- stderr);
- exit (3);
- }
- fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
- exit (4);
- }
- if (ioctl (fd, MTRRIOC_ADD_ENTRY, &sentry) == -1)
- {
- fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING);
- exit (5);
- }
- fprintf (stderr, "Sleeping for 5 seconds so you can see the new entry\n");
- sleep (5);
- close (fd);
- fputs ("I've just closed /proc/mtrr so now the new entry should be gone\n",
- stderr);
-} /* End Function main */
-===============================================================================
--
2.20.1
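
For reference, the shell procedure in the MTRR text above (take the framebuffer base and the video RAM size reported by the X server, then write one line to /proc/mtrr) can be sketched as below. This is only an illustration: `mtrr_line` is a hypothetical helper name, not part of the kernel or of any existing tool, and the sketch only prints the line instead of writing it, since the write needs root and changes caching behaviour.

```shell
#!/bin/sh
# Build the text line accepted by the ASCII /proc/mtrr interface.
# mtrr_line is a hypothetical helper used only for illustration.
#   $1: base address as reported by the X server (e.g. 0xf8000000)
#   $2: region size in kilobytes (e.g. 4096 for "videoram: 4096k")
#   $3: memory type (e.g. write-combining)
mtrr_line() {
    # The interface expects the size in bytes, written in hexadecimal.
    printf 'base=%s size=0x%x type=%s\n' "$1" "$(( $2 * 1024 ))" "$3"
}

# Print the line only. To apply it, redirect the output as root:
#   mtrr_line 0xf8000000 4096 write-combining > /proc/mtrr
mtrr_line 0xf8000000 4096 write-combining
```

With the values from the example in the text (base 0xf8000000, 4096k of video RAM), the size field comes out as 0x400000, matching the hand-written command.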

2019-04-23 16:40:30

by Changbin Du

Subject: [PATCH v4 62/63] Documentation: x86: convert x86_64/cpu-hotplug-spec to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../x86/x86_64/{cpu-hotplug-spec => cpu-hotplug-spec.rst} | 5 ++++-
Documentation/x86/x86_64/index.rst | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)
rename Documentation/x86/x86_64/{cpu-hotplug-spec => cpu-hotplug-spec.rst} (88%)

diff --git a/Documentation/x86/x86_64/cpu-hotplug-spec b/Documentation/x86/x86_64/cpu-hotplug-spec.rst
similarity index 88%
rename from Documentation/x86/x86_64/cpu-hotplug-spec
rename to Documentation/x86/x86_64/cpu-hotplug-spec.rst
index 3c23e0587db3..8d1c91f0c880 100644
--- a/Documentation/x86/x86_64/cpu-hotplug-spec
+++ b/Documentation/x86/x86_64/cpu-hotplug-spec.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================
Firmware support for CPU hotplug under Linux/x86-64
----------------------------------------------------
+===================================================

Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to
know in advance of boot time the maximum number of CPUs that could be plugged
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index e2a324cde671..c04b6eab3c76 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -12,3 +12,4 @@ x86_64 Support
mm
5level-paging
fake-numa-for-cpusets
+ cpu-hotplug-spec
--
2.20.1

2019-04-23 16:40:32

by Changbin Du

Subject: [PATCH v4 63/63] Documentation: x86: convert x86_64/machinecheck to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/x86_64/index.rst | 1 +
.../x86/x86_64/{machinecheck => machinecheck.rst} | 11 ++++++-----
2 files changed, 7 insertions(+), 5 deletions(-)
rename Documentation/x86/x86_64/{machinecheck => machinecheck.rst} (92%)

diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index c04b6eab3c76..d6eaaa5a35fc 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -13,3 +13,4 @@ x86_64 Support
5level-paging
fake-numa-for-cpusets
cpu-hotplug-spec
+ machinecheck
diff --git a/Documentation/x86/x86_64/machinecheck b/Documentation/x86/x86_64/machinecheck.rst
similarity index 92%
rename from Documentation/x86/x86_64/machinecheck
rename to Documentation/x86/x86_64/machinecheck.rst
index d0648a74fceb..8e9d2d529a8d 100644
--- a/Documentation/x86/x86_64/machinecheck
+++ b/Documentation/x86/x86_64/machinecheck.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0

-Configurable sysfs parameters for the x86-64 machine check code.
+===============================================================
+Configurable sysfs parameters for the x86-64 machine check code
+===============================================================

Machine checks report internal hardware error conditions detected
by the CPU. Uncorrected errors typically cause a machine check
@@ -16,14 +19,12 @@ log then mcelog should run to collect and decode machine check entries
from /dev/mcelog. Normally mcelog should be run regularly from a cronjob.

Each CPU has a directory in /sys/devices/system/machinecheck/machinecheckN
-(N = CPU number)
+(N = CPU number).

The directory contains some configurable entries:

-Entries:
-
bankNctl
-(N bank number)
+ (N bank number)
64bit Hex bitmask enabling/disabling specific subevents for bank N
When a bit in the bitmask is zero then the respective
subevent will not be reported.
--
2.20.1
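
As an aside on the bankNctl entry touched by this patch: since a zero bit in the mask means the corresponding subevent is not reported, disabling one subevent amounts to clearing one bit. A minimal sketch follows, assuming the mask fits in the shell's signed 64-bit arithmetic (a mask with bit 63 set may not be handled portably). `clear_bank_bit` is a hypothetical helper name, and the sketch only prints the new mask rather than writing it to sysfs.

```shell
#!/bin/sh
# Compute a bankNctl mask with one subevent bit cleared.
# clear_bank_bit is a hypothetical helper used only for illustration.
#   $1: current mask in hex (e.g. 0xff)
#   $2: bit number of the subevent to stop reporting (0-based)
clear_bank_bit() {
    # A zero bit means the respective subevent will not be reported.
    printf '0x%x\n' "$(( $1 & ~(1 << $2) ))"
}

# Print the new mask only. To apply it, write the value as root to
#   /sys/devices/system/machinecheck/machinecheckN/bankNctl
# with both N placeholders replaced by real CPU and bank numbers.
clear_bank_bit 0xff 3
```

For a mask of 0xff, clearing bit 3 yields 0xf7.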

2019-04-23 16:40:58

by Changbin Du

Subject: [PATCH v4 54/63] Documentation: x86: convert orc-unwinder.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
.../{orc-unwinder.txt => orc-unwinder.rst} | 27 ++++++++++---------
2 files changed, 16 insertions(+), 12 deletions(-)
rename Documentation/x86/{orc-unwinder.txt => orc-unwinder.rst} (93%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 4e9fa2b046df..c41c17906b6d 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -14,6 +14,7 @@ Linux x86 Support
kernel-stacks
entry_64
earlyprintk
+ orc-unwinder
zero-page
tlb
mtrr
diff --git a/Documentation/x86/orc-unwinder.txt b/Documentation/x86/orc-unwinder.rst
similarity index 93%
rename from Documentation/x86/orc-unwinder.txt
rename to Documentation/x86/orc-unwinder.rst
index cd4b29be29af..d811576c1f3e 100644
--- a/Documentation/x86/orc-unwinder.txt
+++ b/Documentation/x86/orc-unwinder.rst
@@ -1,8 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
ORC unwinder
============

Overview
---------
+========

The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder. The difference is that the
@@ -23,12 +26,12 @@ correlate instruction addresses with their stack states at run time.


ORC vs frame pointers
----------------------
+=====================

With frame pointers enabled, GCC adds instrumentation code to every
function in the kernel. The kernel's .text size increases by about
3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
-Gorman [1] have shown a slowdown of 5-10% for some workloads.
+Gorman [1]_ have shown a slowdown of 5-10% for some workloads.

In contrast, the ORC unwinder has no effect on text size or runtime
performance, because the debuginfo is out of band. So if you disable
@@ -55,7 +58,7 @@ depending on the kernel config.


ORC vs DWARF
-------------
+============

ORC debuginfo's advantage over DWARF itself is that it's much simpler.
It gets rid of the complex DWARF CFI state machine and also gets rid of
@@ -65,7 +68,7 @@ mission critical oops code.

The simpler debuginfo format also enables the unwinder to be much faster
than DWARF, which is important for perf and lockdep. In a basic
-performance test by Jiri Slaby [2], the ORC unwinder was about 20x
+performance test by Jiri Slaby [2]_, the ORC unwinder was about 20x
faster than an out-of-tree DWARF unwinder. (Note: That measurement was
taken before some performance tweaks were added, which doubled
performance, so the speedup over DWARF may be closer to 40x.)
@@ -85,7 +88,7 @@ still be able to control the format, e.g. no complex state machines.


ORC unwind table generation
----------------------------
+===========================

The ORC data is generated by objtool. With the existing compile-time
stack metadata validation feature, objtool already follows all code
@@ -133,7 +136,7 @@ objtool follows GCC code quite well.


Unwinder implementation details
--------------------------------
+===============================

Objtool generates the ORC data by integrating with the compile-time
stack metadata validation feature, which is described in detail in
@@ -154,7 +157,7 @@ subset of the table needs to be searched.


Etymology
----------
+=========

Orcs, fearsome creatures of medieval folklore, are the Dwarves' natural
enemies. Similarly, the ORC unwinder was created in opposition to the
@@ -162,7 +165,7 @@ complexity and slowness of DWARF.

"Although Orcs rarely consider multiple solutions to a problem, they do
excel at getting things done because they are creatures of action, not
-thought." [3] Similarly, unlike the esoteric DWARF unwinder, the
+thought." [3]_ Similarly, unlike the esoteric DWARF unwinder, the
veracious ORC unwinder wastes no time or siloconic effort decoding
variable-length zero-extended unsigned-integer byte-coded
state-machine-based debug information entries.
@@ -174,6 +177,6 @@ brutal, unyielding efficiency.
ORC stands for Oops Rewind Capability.


-[1] https://lkml.kernel.org/r/[email protected]
-[2] https://lkml.kernel.org/r/[email protected]
-[3] http://dustin.wikidot.com/half-orcs-and-orcs
+.. [1] https://lkml.kernel.org/r/[email protected]
+.. [2] https://lkml.kernel.org/r/[email protected]
+.. [3] http://dustin.wikidot.com/half-orcs-and-orcs
--
2.20.1

2019-04-23 16:41:05

by Changbin Du

Subject: [PATCH v4 53/63] Documentation: x86: convert resctrl_ui.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
.../x86/{resctrl_ui.txt => resctrl_ui.rst} | 913 ++++++++++--------
2 files changed, 490 insertions(+), 424 deletions(-)
rename Documentation/x86/{resctrl_ui.txt => resctrl_ui.rst} (68%)

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 2fcd10f29b87..4e9fa2b046df 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -23,3 +23,4 @@ Linux x86 Support
amd-memory-encryption
pti
microcode
+ resctrl_ui
diff --git a/Documentation/x86/resctrl_ui.txt b/Documentation/x86/resctrl_ui.rst
similarity index 68%
rename from Documentation/x86/resctrl_ui.txt
rename to Documentation/x86/resctrl_ui.rst
index c1f95b59e14d..81aaa271d5ea 100644
--- a/Documentation/x86/resctrl_ui.txt
+++ b/Documentation/x86/resctrl_ui.rst
@@ -1,33 +1,39 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===========================================
User Interface for Resource Control feature
+===========================================

-Intel refers to this feature as Intel Resource Director Technology(Intel(R) RDT).
-AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
+:Copyright: |copy| 2016 Intel Corporation
+:Authors: - Fenghua Yu <[email protected]>
+ - Tony Luck <[email protected]>
+ - Vikas Shivappa <[email protected]>

-Copyright (C) 2016 Intel Corporation

-Fenghua Yu <[email protected]>
-Tony Luck <[email protected]>
-Vikas Shivappa <[email protected]>
+Intel refers to this feature as Intel Resource Director Technology(Intel(R) RDT).
+AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).

This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
-flag bits:
-RDT (Resource Director Technology) Allocation - "rdt_a"
-CAT (Cache Allocation Technology) - "cat_l3", "cat_l2"
-CDP (Code and Data Prioritization ) - "cdp_l3", "cdp_l2"
-CQM (Cache QoS Monitoring) - "cqm_llc", "cqm_occup_llc"
-MBM (Memory Bandwidth Monitoring) - "cqm_mbm_total", "cqm_mbm_local"
-MBA (Memory Bandwidth Allocation) - "mba"
+flag bits::
+
+ RDT (Resource Director Technology) Allocation - "rdt_a"
+ CAT (Cache Allocation Technology) - "cat_l3", "cat_l2"
+ CDP (Code and Data Prioritization ) - "cdp_l3", "cdp_l2"
+ CQM (Cache QoS Monitoring) - "cqm_llc", "cqm_occup_llc"
+ MBM (Memory Bandwidth Monitoring) - "cqm_mbm_total", "cqm_mbm_local"
+ MBA (Memory Bandwidth Allocation) - "mba"

-To use the feature mount the file system:
+To use the feature mount the file system::

# mount -t resctrl resctrl [-o cdp[,cdpl2][,mba_MBps]] /sys/fs/resctrl

mount options are:

-"cdp": Enable code/data prioritization in L3 cache allocations.
-"cdpl2": Enable code/data prioritization in L2 cache allocations.
-"mba_MBps": Enable the MBA Software Controller(mba_sc) to specify MBA
- bandwidth in MBps
+* "cdp": Enable code/data prioritization in L3 cache allocations.
+* "cdpl2": Enable code/data prioritization in L2 cache allocations.
+* "mba_MBps": Enable the MBA Software Controller(mba_sc) to specify MBA
+ bandwidth in MBps

L2 and L3 CDP are controlled seperately.

@@ -44,7 +50,7 @@ For more details on the behavior of the interface during monitoring
and allocation, see the "Resource alloc and monitor groups" section.

Info directory
---------------
+==============

The 'info' directory contains information about the enabled
resources. Each resource has its own subdirectory. The subdirectory
@@ -56,77 +62,93 @@ allocation:
Cache resource(L3/L2) subdirectory contains the following files
related to allocation:

-"num_closids": The number of CLOSIDs which are valid for this
- resource. The kernel uses the smallest number of
- CLOSIDs of all enabled resources as limit.
-
-"cbm_mask": The bitmask which is valid for this resource.
- This mask is equivalent to 100%.
-
-"min_cbm_bits": The minimum number of consecutive bits which
- must be set when writing a mask.
-
-"shareable_bits": Bitmask of shareable resource with other executing
- entities (e.g. I/O). User can use this when
- setting up exclusive cache partitions. Note that
- some platforms support devices that have their
- own settings for cache use which can over-ride
- these bits.
-"bit_usage": Annotated capacity bitmasks showing how all
- instances of the resource are used. The legend is:
- "0" - Corresponding region is unused. When the system's
+"num_closids":
+ The number of CLOSIDs which are valid for this
+ resource. The kernel uses the smallest number of
+ CLOSIDs of all enabled resources as limit.
+"cbm_mask":
+ The bitmask which is valid for this resource.
+ This mask is equivalent to 100%.
+"min_cbm_bits":
+ The minimum number of consecutive bits which
+ must be set when writing a mask.
+
+"shareable_bits":
+ Bitmask of shareable resource with other executing
+ entities (e.g. I/O). User can use this when
+ setting up exclusive cache partitions. Note that
+ some platforms support devices that have their
+ own settings for cache use which can over-ride
+ these bits.
+"bit_usage":
+ Annotated capacity bitmasks showing how all
+ instances of the resource are used. The legend is:
+
+ "0":
+ Corresponding region is unused. When the system's
resources have been allocated and a "0" is found
in "bit_usage" it is a sign that resources are
wasted.
- "H" - Corresponding region is used by hardware only
+
+ "H":
+ Corresponding region is used by hardware only
but available for software use. If a resource
has bits set in "shareable_bits" but not all
of these bits appear in the resource groups'
schematas then the bits appearing in
"shareable_bits" but no resource group will
be marked as "H".
- "X" - Corresponding region is available for sharing and
+ "X":
+ Corresponding region is available for sharing and
used by hardware and software. These are the
bits that appear in "shareable_bits" as
well as a resource group's allocation.
- "S" - Corresponding region is used by software
+ "S":
+ Corresponding region is used by software
and available for sharing.
- "E" - Corresponding region is used exclusively by
+ "E":
+ Corresponding region is used exclusively by
one resource group. No sharing allowed.
- "P" - Corresponding region is pseudo-locked. No
+ "P":
+ Corresponding region is pseudo-locked. No
sharing allowed.

Memory bandwitdh(MB) subdirectory contains the following files
with respect to allocation:

-"min_bandwidth": The minimum memory bandwidth percentage which
- user can request.
+"min_bandwidth":
+ The minimum memory bandwidth percentage which
+ user can request.

-"bandwidth_gran": The granularity in which the memory bandwidth
- percentage is allocated. The allocated
- b/w percentage is rounded off to the next
- control step available on the hardware. The
- available bandwidth control steps are:
- min_bandwidth + N * bandwidth_gran.
+"bandwidth_gran":
+ The granularity in which the memory bandwidth
+ percentage is allocated. The allocated
+ b/w percentage is rounded off to the next
+ control step available on the hardware. The
+ available bandwidth control steps are:
+ min_bandwidth + N * bandwidth_gran.

-"delay_linear": Indicates if the delay scale is linear or
- non-linear. This field is purely informational
- only.
+"delay_linear":
+ Indicates if the delay scale is linear or
+ non-linear. This field is purely informational
+ only.

If RDT monitoring is available there will be an "L3_MON" directory
with the following files:

-"num_rmids": The number of RMIDs available. This is the
- upper bound for how many "CTRL_MON" + "MON"
- groups can be created.
+"num_rmids":
+ The number of RMIDs available. This is the
+ upper bound for how many "CTRL_MON" + "MON"
+ groups can be created.

-"mon_features": Lists the monitoring events if
- monitoring is enabled for the resource.
+"mon_features":
+ Lists the monitoring events if
+ monitoring is enabled for the resource.

"max_threshold_occupancy":
- Read/write file provides the largest value (in
- bytes) at which a previously used LLC_occupancy
- counter can be considered for re-use.
+ Read/write file provides the largest value (in
+ bytes) at which a previously used LLC_occupancy
+ counter can be considered for re-use.

Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
@@ -134,6 +156,7 @@ via the file system (making new directories or writing to any of the
control files). If the command was successful, it will read as "ok".
If the command failed, it will provide more information that can be
conveyed in the error returns from file operations. E.g.
+::

# echo L3:0=f7 > schemata
bash: echo: write error: Invalid argument
@@ -141,7 +164,7 @@ conveyed in the error returns from file operations. E.g.
mask f7 has non-consecutive 1-bits

Resource alloc and monitor groups
----------------------------------
+=================================

Resource groups are represented as directories in the resctrl file
system. The default group is the root directory which, immediately
@@ -226,6 +249,7 @@ When monitoring is enabled all MON groups will also contain:

Resource allocation rules
-------------------------
+
When a task is running the following rules define which resources are
available to it:

@@ -252,7 +276,7 @@ Resource monitoring rules


Notes on cache occupancy monitoring and control
------------------------------------------------
+===============================================
When moving a task from one group to another you should remember that
this only affects *new* cache allocations by the task. E.g. you may have
a task in a monitor group showing 3 MB of cache occupancy. If you move
@@ -321,7 +345,7 @@ of the capacity of the cache. You could partition the cache into four
equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000.

Memory bandwidth Allocation and monitoring
-------------------------------------------
+==========================================

For Memory bandwidth resource, by default the user controls the resource
by indicating the percentage of total memory bandwidth.
@@ -369,7 +393,7 @@ In order to mitigate this and make the interface more user friendly,
resctrl added support for specifying the bandwidth in MBps as well. The
kernel underneath would use a software feedback mechanism or a "Software
Controller(mba_sc)" which reads the actual bandwidth using MBM counters
-and adjust the memowy bandwidth percentages to ensure
+and adjust the memory bandwidth percentages to ensure::

"actual bandwidth < user specified bandwidth".

@@ -380,14 +404,14 @@ sections.

L3 schemata file details (code and data prioritization disabled)
----------------------------------------------------------------
-With CDP disabled the L3 schemata format is:
+With CDP disabled the L3 schemata format is::

L3:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

L3 schemata file details (CDP enabled via mount option to resctrl)
------------------------------------------------------------------
When CDP is enabled L3 control is split into two separate resources
-so you can specify independent masks for code and data like this:
+so you can specify independent masks for code and data like this::

L3data:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
L3code:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
@@ -395,7 +419,7 @@ so you can specify independent masks for code and data like this:
L2 schemata file details
------------------------
L2 cache does not support code and data prioritization, so the
-schemata format is always:
+schemata format is always::

L2:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

@@ -403,6 +427,7 @@ Memory bandwidth Allocation (default mode)
------------------------------------------

Memory b/w domain is L3 cache.
+::

MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...

@@ -410,6 +435,7 @@ Memory bandwidth Allocation specified in MBps
---------------------------------------------

Memory bandwidth domain is L3 cache.
+::

MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...

@@ -418,17 +444,18 @@ Reading/writing the schemata file
Reading the schemata file will show the state of all resources
on all domains. When writing you only need to specify those values
which you wish to change. E.g.
+::

-# cat schemata
-L3DATA:0=fffff;1=fffff;2=fffff;3=fffff
-L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
-# echo "L3DATA:2=3c0;" > schemata
-# cat schemata
-L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
-L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+ # cat schemata
+ L3DATA:0=fffff;1=fffff;2=fffff;3=fffff
+ L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+ # echo "L3DATA:2=3c0;" > schemata
+ # cat schemata
+ L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
+ L3CODE:0=fffff;1=fffff;2=fffff;3=fffff

Cache Pseudo-Locking
---------------------
+====================
CAT enables a user to specify the amount of cache space that an
application can fill. Cache pseudo-locking builds on the fact that a
CPU can still read and write data pre-allocated outside its current
@@ -442,6 +469,7 @@ a region of memory with reduced average read latency.
The creation of a cache pseudo-locked region is triggered by a request
from the user to do so that is accompanied by a schemata of the region
to be pseudo-locked. The cache pseudo-locked region is created as follows:
+
- Create a CAT allocation CLOSNEW with a CBM matching the schemata
from the user of the cache region that will contain the pseudo-locked
memory. This region must not overlap with any current CAT allocation/CLOS
@@ -480,6 +508,7 @@ initial mmap() handling, there is no enforcement afterwards and the
application self needs to ensure it remains affine to the correct cores.

Pseudo-locking is accomplished in two stages:
+
1) During the first stage the system administrator allocates a portion
of cache that should be dedicated to pseudo-locking. At this time an
equivalent portion of memory is allocated, loaded into allocated
@@ -506,7 +535,7 @@ by user space in order to obtain access to the pseudo-locked memory region.
An example of cache pseudo-locked region creation and usage can be found below.

Cache Pseudo-Locking Debugging Interface
----------------------------------------
+----------------------------------------
The pseudo-locking debugging interface is enabled by default (if
CONFIG_DEBUG_FS is enabled) and can be found in /sys/kernel/debug/resctrl.

@@ -514,6 +543,7 @@ There is no explicit way for the kernel to test if a provided memory
location is present in the cache. The pseudo-locking debugging interface uses
the tracing infrastructure to provide two ways to measure cache residency of
the pseudo-locked region:
+
1) Memory access latency using the pseudo_lock_mem_latency tracepoint. Data
from these measurements are best visualized using a hist trigger (see
example below). In this test the pseudo-locked region is traversed at
@@ -529,87 +559,97 @@ it in debugfs as /sys/kernel/debug/resctrl/<newdir>. A single
write-only file, pseudo_lock_measure, is present in this directory. The
measurement of the pseudo-locked region depends on the number written to this
debugfs file:
-1 - writing "1" to the pseudo_lock_measure file will trigger the latency
+
+1:
+ writing "1" to the pseudo_lock_measure file will trigger the latency
measurement captured in the pseudo_lock_mem_latency tracepoint. See
example below.
-2 - writing "2" to the pseudo_lock_measure file will trigger the L2 cache
+2:
+ writing "2" to the pseudo_lock_measure file will trigger the L2 cache
residency (cache hits and misses) measurement captured in the
pseudo_lock_l2 tracepoint. See example below.
-3 - writing "3" to the pseudo_lock_measure file will trigger the L3 cache
+3:
+ writing "3" to the pseudo_lock_measure file will trigger the L3 cache
residency (cache hits and misses) measurement captured in the
pseudo_lock_l3 tracepoint.

All measurements are recorded with the tracing infrastructure. This requires
the relevant tracepoints to be enabled before the measurement is triggered.

-Example of latency debugging interface:
+Example of latency debugging interface
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this example a pseudo-locked region named "newlock" was created. Here is
how we can measure the latency in cycles of reading from this region and
visualize this data with a histogram that is available if CONFIG_HIST_TRIGGERS
-is set:
-# :> /sys/kernel/debug/tracing/trace
-# echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
-# echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
-# echo 1 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
-# echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
-# cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist
-
-# event histogram
-#
-# trigger info: hist:keys=latency:vals=hitcount:sort=hitcount:size=2048 [active]
-#
-
-{ latency: 456 } hitcount: 1
-{ latency: 50 } hitcount: 83
-{ latency: 36 } hitcount: 96
-{ latency: 44 } hitcount: 174
-{ latency: 48 } hitcount: 195
-{ latency: 46 } hitcount: 262
-{ latency: 42 } hitcount: 693
-{ latency: 40 } hitcount: 3204
-{ latency: 38 } hitcount: 3484
-
-Totals:
- Hits: 8192
- Entries: 9
- Dropped: 0
-
-Example of cache hits/misses debugging:
+is set::
+
+ # :> /sys/kernel/debug/tracing/trace
+ # echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
+ # echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
+ # echo 1 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
+ # echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
+ # cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist
+
+ # event histogram
+ #
+ # trigger info: hist:keys=latency:vals=hitcount:sort=hitcount:size=2048 [active]
+ #
+
+ { latency: 456 } hitcount: 1
+ { latency: 50 } hitcount: 83
+ { latency: 36 } hitcount: 96
+ { latency: 44 } hitcount: 174
+ { latency: 48 } hitcount: 195
+ { latency: 46 } hitcount: 262
+ { latency: 42 } hitcount: 693
+ { latency: 40 } hitcount: 3204
+ { latency: 38 } hitcount: 3484
+
+ Totals:
+ Hits: 8192
+ Entries: 9
+ Dropped: 0
+
+Example of cache hits/misses debugging
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this example a pseudo-locked region named "newlock" was created on the L2
cache of a platform. Here is how we can obtain details of the cache hits
and misses using the platform's precision counters.
+::

-# :> /sys/kernel/debug/tracing/trace
-# echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
-# echo 2 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
-# echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
-# cat /sys/kernel/debug/tracing/trace
+ # :> /sys/kernel/debug/tracing/trace
+ # echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
+ # echo 2 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
+ # echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
+ # cat /sys/kernel/debug/tracing/trace

-# tracer: nop
-#
-# _-----=> irqs-off
-# / _----=> need-resched
-# | / _---=> hardirq/softirq
-# || / _--=> preempt-depth
-# ||| / delay
-# TASK-PID CPU# |||| TIMESTAMP FUNCTION
-# | | | |||| | |
- pseudo_lock_mea-1672 [002] .... 3132.860500: pseudo_lock_l2: hits=4097 miss=0
+ # tracer: nop
+ #
+ # _-----=> irqs-off
+ # / _----=> need-resched
+ # | / _---=> hardirq/softirq
+ # || / _--=> preempt-depth
+ # ||| / delay
+ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
+ # | | | |||| | |
+ pseudo_lock_mea-1672 [002] .... 3132.860500: pseudo_lock_l2: hits=4097 miss=0


-Examples for RDT allocation usage:
+Examples for RDT allocation usage
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+1) Example 1

-Example 1
----------
On a two socket machine (one L3 cache per socket) with just four bits
for cache bit masks, minimum b/w of 10% with a memory bandwidth
-granularity of 10%
+granularity of 10%.
+::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
-# mkdir p0 p1
-# echo "L3:0=3;1=c\nMB:0=50;1=50" > /sys/fs/resctrl/p0/schemata
-# echo "L3:0=3;1=3\nMB:0=50;1=50" > /sys/fs/resctrl/p1/schemata
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl
+ # mkdir p0 p1
+ # echo "L3:0=3;1=c\nMB:0=50;1=50" > /sys/fs/resctrl/p0/schemata
+ # echo "L3:0=3;1=3\nMB:0=50;1=50" > /sys/fs/resctrl/p1/schemata

The default resource group is unmodified, so we have access to all parts
of all caches (its schemata file reads "L3:0=f;1=f").
@@ -628,100 +668,106 @@ the b/w accordingly.

If the MBA is specified in MB(megabytes) then user can enter the max b/w in MB
rather than the percentage values.
+::

-# echo "L3:0=3;1=c\nMB:0=1024;1=500" > /sys/fs/resctrl/p0/schemata
-# echo "L3:0=3;1=3\nMB:0=1024;1=500" > /sys/fs/resctrl/p1/schemata
+ # echo "L3:0=3;1=c\nMB:0=1024;1=500" > /sys/fs/resctrl/p0/schemata
+ # echo "L3:0=3;1=3\nMB:0=1024;1=500" > /sys/fs/resctrl/p1/schemata

In the above example the tasks in "p1" and "p0" on socket 0 would use a max b/w
of 1024MB where as on socket 1 they would use 500MB.

-Example 2
----------
+2) Example 2
+
Again two sockets, but this time with a more realistic 20-bit mask.

Two real time tasks pid=1234 running on processor 0 and pid=5678 running on
processor 1 on socket 0 on a 2-socket and dual core machine. To avoid noisy
neighbors, each of the two real-time tasks exclusively occupies one quarter
of L3 cache on socket 0.
+::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl

First we reset the schemata for the default group so that the "upper"
50% of the L3 cache on socket 0 and 50% of memory b/w cannot be used by
-ordinary tasks:
+ordinary tasks::

-# echo "L3:0=3ff;1=fffff\nMB:0=50;1=100" > schemata
+ # echo "L3:0=3ff;1=fffff\nMB:0=50;1=100" > schemata

Next we make a resource group for our first real time task and give
it access to the "top" 25% of the cache on socket 0.
+::

-# mkdir p0
-# echo "L3:0=f8000;1=fffff" > p0/schemata
+ # mkdir p0
+ # echo "L3:0=f8000;1=fffff" > p0/schemata

Finally we move our first real time task into this resource group. We
also use taskset(1) to ensure the task always runs on a dedicated CPU
on socket 0. Most uses of resource groups will also constrain which
processors tasks run on.
+::

-# echo 1234 > p0/tasks
-# taskset -cp 1 1234
+ # echo 1234 > p0/tasks
+ # taskset -cp 1 1234

-Ditto for the second real time task (with the remaining 25% of cache):
+Ditto for the second real time task (with the remaining 25% of cache)::

-# mkdir p1
-# echo "L3:0=7c00;1=fffff" > p1/schemata
-# echo 5678 > p1/tasks
-# taskset -cp 2 5678
+ # mkdir p1
+ # echo "L3:0=7c00;1=fffff" > p1/schemata
+ # echo 5678 > p1/tasks
+ # taskset -cp 2 5678

For the same 2 socket system with memory b/w resource and CAT L3 the
schemata would look like(Assume min_bandwidth 10 and bandwidth_gran is
10):

-For our first real time task this would request 20% memory b/w on socket
-0.
+For our first real time task this would request 20% memory b/w on socket 0.
+::

-# echo -e "L3:0=f8000;1=fffff\nMB:0=20;1=100" > p0/schemata
+ # echo -e "L3:0=f8000;1=fffff\nMB:0=20;1=100" > p0/schemata

For our second real time task this would request an other 20% memory b/w
on socket 0.
+::

-# echo -e "L3:0=f8000;1=fffff\nMB:0=20;1=100" > p0/schemata
+ # echo -e "L3:0=7c00;1=fffff\nMB:0=20;1=100" > p1/schemata

-Example 3
----------
+3) Example 3

A single socket system which has real-time tasks running on core 4-7 and
non real-time workload assigned to core 0-3. The real-time tasks share text
and data, so a per task association is not required and due to interaction
with the kernel it's desired that the kernel on these cores shares L3 with
the tasks.
+::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl

First we reset the schemata for the default group so that the "upper"
50% of the L3 cache on socket 0, and 50% of memory bandwidth on socket 0
-cannot be used by ordinary tasks:
+cannot be used by ordinary tasks::

-# echo "L3:0=3ff\nMB:0=50" > schemata
+ # echo "L3:0=3ff\nMB:0=50" > schemata

Next we make a resource group for our real time cores and give it access
to the "top" 50% of the cache on socket 0 and 50% of memory bandwidth on
socket 0.
+::

-# mkdir p0
-# echo "L3:0=ffc00\nMB:0=50" > p0/schemata
+ # mkdir p0
+ # echo "L3:0=ffc00\nMB:0=50" > p0/schemata

Finally we move core 4-7 over to the new group and make sure that the
kernel and the tasks running there get 50% of the cache. They should
also get 50% of memory bandwidth assuming that the cores 4-7 are SMT
siblings and only the real time threads are scheduled on the cores 4-7.
+::

-# echo F0 > p0/cpus
+ # echo F0 > p0/cpus

-Example 4
----------
+4) Example 4

The resource groups in previous examples were all in the default "shareable"
mode allowing sharing of their cache allocations. If one resource group
@@ -732,157 +778,168 @@ In this example a new exclusive resource group will be created on a L2 CAT
system with two L2 cache instances that can be configured with an 8-bit
capacity bitmask. The new exclusive resource group will be configured to use
25% of each cache instance.
+::

-# mount -t resctrl resctrl /sys/fs/resctrl/
-# cd /sys/fs/resctrl
+ # mount -t resctrl resctrl /sys/fs/resctrl/
+ # cd /sys/fs/resctrl

First, we observe that the default group is configured to allocate to all L2
-cache:
+cache::

-# cat schemata
-L2:0=ff;1=ff
+ # cat schemata
+ L2:0=ff;1=ff

We could attempt to create the new resource group at this point, but it will
-fail because of the overlap with the schemata of the default group:
-# mkdir p0
-# echo 'L2:0=0x3;1=0x3' > p0/schemata
-# cat p0/mode
-shareable
-# echo exclusive > p0/mode
--sh: echo: write error: Invalid argument
-# cat info/last_cmd_status
-schemata overlaps
+fail because of the overlap with the schemata of the default group::
+
+ # mkdir p0
+ # echo 'L2:0=0x3;1=0x3' > p0/schemata
+ # cat p0/mode
+ shareable
+ # echo exclusive > p0/mode
+ -sh: echo: write error: Invalid argument
+ # cat info/last_cmd_status
+ schemata overlaps

To ensure that there is no overlap with another resource group the default
resource group's schemata has to change, making it possible for the new
resource group to become exclusive.
-# echo 'L2:0=0xfc;1=0xfc' > schemata
-# echo exclusive > p0/mode
-# grep . p0/*
-p0/cpus:0
-p0/mode:exclusive
-p0/schemata:L2:0=03;1=03
-p0/size:L2:0=262144;1=262144
+::
+
+ # echo 'L2:0=0xfc;1=0xfc' > schemata
+ # echo exclusive > p0/mode
+ # grep . p0/*
+ p0/cpus:0
+ p0/mode:exclusive
+ p0/schemata:L2:0=03;1=03
+ p0/size:L2:0=262144;1=262144

A new resource group will on creation not overlap with an exclusive resource
-group:
-# mkdir p1
-# grep . p1/*
-p1/cpus:0
-p1/mode:shareable
-p1/schemata:L2:0=fc;1=fc
-p1/size:L2:0=786432;1=786432
-
-The bit_usage will reflect how the cache is used:
-# cat info/L2/bit_usage
-0=SSSSSSEE;1=SSSSSSEE
-
-A resource group cannot be forced to overlap with an exclusive resource group:
-# echo 'L2:0=0x1;1=0x1' > p1/schemata
--sh: echo: write error: Invalid argument
-# cat info/last_cmd_status
-overlaps with exclusive group
+group::
+
+ # mkdir p1
+ # grep . p1/*
+ p1/cpus:0
+ p1/mode:shareable
+ p1/schemata:L2:0=fc;1=fc
+ p1/size:L2:0=786432;1=786432
+
+The bit_usage will reflect how the cache is used::
+
+ # cat info/L2/bit_usage
+ 0=SSSSSSEE;1=SSSSSSEE
+
+A resource group cannot be forced to overlap with an exclusive resource group::
+
+ # echo 'L2:0=0x1;1=0x1' > p1/schemata
+ -sh: echo: write error: Invalid argument
+ # cat info/last_cmd_status
+ overlaps with exclusive group

Example of Cache Pseudo-Locking
--------------------------------
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Lock portion of L2 cache from cache id 1 using CBM 0x3. Pseudo-locked
region is exposed at /dev/pseudo_lock/newlock that can be provided to
application for argument to mmap().
+::

-# mount -t resctrl resctrl /sys/fs/resctrl/
-# cd /sys/fs/resctrl
+ # mount -t resctrl resctrl /sys/fs/resctrl/
+ # cd /sys/fs/resctrl

Ensure that there are bits available that can be pseudo-locked, since only
unused bits can be pseudo-locked the bits to be pseudo-locked needs to be
-removed from the default resource group's schemata:
-# cat info/L2/bit_usage
-0=SSSSSSSS;1=SSSSSSSS
-# echo 'L2:1=0xfc' > schemata
-# cat info/L2/bit_usage
-0=SSSSSSSS;1=SSSSSS00
+removed from the default resource group's schemata::
+
+ # cat info/L2/bit_usage
+ 0=SSSSSSSS;1=SSSSSSSS
+ # echo 'L2:1=0xfc' > schemata
+ # cat info/L2/bit_usage
+ 0=SSSSSSSS;1=SSSSSS00

Create a new resource group that will be associated with the pseudo-locked
region, indicate that it will be used for a pseudo-locked region, and
-configure the requested pseudo-locked region capacity bitmask:
+configure the requested pseudo-locked region capacity bitmask::

-# mkdir newlock
-# echo pseudo-locksetup > newlock/mode
-# echo 'L2:1=0x3' > newlock/schemata
+ # mkdir newlock
+ # echo pseudo-locksetup > newlock/mode
+ # echo 'L2:1=0x3' > newlock/schemata

On success the resource group's mode will change to pseudo-locked, the
bit_usage will reflect the pseudo-locked region, and the character device
-exposing the pseudo-locked region will exist:
-
-# cat newlock/mode
-pseudo-locked
-# cat info/L2/bit_usage
-0=SSSSSSSS;1=SSSSSSPP
-# ls -l /dev/pseudo_lock/newlock
-crw------- 1 root root 243, 0 Apr 3 05:01 /dev/pseudo_lock/newlock
-
-/*
- * Example code to access one page of pseudo-locked cache region
- * from user space.
- */
-#define _GNU_SOURCE
-#include <fcntl.h>
-#include <sched.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <sys/mman.h>
-
-/*
- * It is required that the application runs with affinity to only
- * cores associated with the pseudo-locked region. Here the cpu
- * is hardcoded for convenience of example.
- */
-static int cpuid = 2;
-
-int main(int argc, char *argv[])
-{
- cpu_set_t cpuset;
- long page_size;
- void *mapping;
- int dev_fd;
- int ret;
-
- page_size = sysconf(_SC_PAGESIZE);
-
- CPU_ZERO(&cpuset);
- CPU_SET(cpuid, &cpuset);
- ret = sched_setaffinity(0, sizeof(cpuset), &cpuset);
- if (ret < 0) {
- perror("sched_setaffinity");
- exit(EXIT_FAILURE);
- }
-
- dev_fd = open("/dev/pseudo_lock/newlock", O_RDWR);
- if (dev_fd < 0) {
- perror("open");
- exit(EXIT_FAILURE);
- }
-
- mapping = mmap(0, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
- dev_fd, 0);
- if (mapping == MAP_FAILED) {
- perror("mmap");
- close(dev_fd);
- exit(EXIT_FAILURE);
- }
-
- /* Application interacts with pseudo-locked memory @mapping */
-
- ret = munmap(mapping, page_size);
- if (ret < 0) {
- perror("munmap");
- close(dev_fd);
- exit(EXIT_FAILURE);
- }
-
- close(dev_fd);
- exit(EXIT_SUCCESS);
-}
+exposing the pseudo-locked region will exist::
+
+ # cat newlock/mode
+ pseudo-locked
+ # cat info/L2/bit_usage
+ 0=SSSSSSSS;1=SSSSSSPP
+ # ls -l /dev/pseudo_lock/newlock
+ crw------- 1 root root 243, 0 Apr 3 05:01 /dev/pseudo_lock/newlock
+
+::
+
+ /*
+ * Example code to access one page of pseudo-locked cache region
+ * from user space.
+ */
+ #define _GNU_SOURCE
+ #include <fcntl.h>
+ #include <sched.h>
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <unistd.h>
+ #include <sys/mman.h>
+
+ /*
+ * It is required that the application runs with affinity to only
+ * cores associated with the pseudo-locked region. Here the cpu
+ * is hardcoded for convenience of example.
+ */
+ static int cpuid = 2;
+
+ int main(int argc, char *argv[])
+ {
+ cpu_set_t cpuset;
+ long page_size;
+ void *mapping;
+ int dev_fd;
+ int ret;
+
+ page_size = sysconf(_SC_PAGESIZE);
+
+ CPU_ZERO(&cpuset);
+ CPU_SET(cpuid, &cpuset);
+ ret = sched_setaffinity(0, sizeof(cpuset), &cpuset);
+ if (ret < 0) {
+ perror("sched_setaffinity");
+ exit(EXIT_FAILURE);
+ }
+
+ dev_fd = open("/dev/pseudo_lock/newlock", O_RDWR);
+ if (dev_fd < 0) {
+ perror("open");
+ exit(EXIT_FAILURE);
+ }
+
+ mapping = mmap(0, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+ dev_fd, 0);
+ if (mapping == MAP_FAILED) {
+ perror("mmap");
+ close(dev_fd);
+ exit(EXIT_FAILURE);
+ }
+
+ /* Application interacts with pseudo-locked memory @mapping */
+
+ ret = munmap(mapping, page_size);
+ if (ret < 0) {
+ perror("munmap");
+ close(dev_fd);
+ exit(EXIT_FAILURE);
+ }
+
+ close(dev_fd);
+ exit(EXIT_SUCCESS);
+ }

Locking between applications
----------------------------
@@ -921,86 +978,86 @@ Read lock:
B) If success read the directory structure.
C) funlock

-Example with bash:
-
-# Atomically read directory structure
-$ flock -s /sys/fs/resctrl/ find /sys/fs/resctrl
-
-# Read directory contents and create new subdirectory
-
-$ cat create-dir.sh
-find /sys/fs/resctrl/ > output.txt
-mask = function-of(output.txt)
-mkdir /sys/fs/resctrl/newres/
-echo mask > /sys/fs/resctrl/newres/schemata
-
-$ flock /sys/fs/resctrl/ ./create-dir.sh
-
-Example with C:
-
-/*
- * Example code do take advisory locks
- * before accessing resctrl filesystem
- */
-#include <sys/file.h>
-#include <stdlib.h>
-
-void resctrl_take_shared_lock(int fd)
-{
- int ret;
-
- /* take shared lock on resctrl filesystem */
- ret = flock(fd, LOCK_SH);
- if (ret) {
- perror("flock");
- exit(-1);
- }
-}
-
-void resctrl_take_exclusive_lock(int fd)
-{
- int ret;
-
- /* release lock on resctrl filesystem */
- ret = flock(fd, LOCK_EX);
- if (ret) {
- perror("flock");
- exit(-1);
- }
-}
-
-void resctrl_release_lock(int fd)
-{
- int ret;
-
- /* take shared lock on resctrl filesystem */
- ret = flock(fd, LOCK_UN);
- if (ret) {
- perror("flock");
- exit(-1);
- }
-}
-
-void main(void)
-{
- int fd, ret;
-
- fd = open("/sys/fs/resctrl", O_DIRECTORY);
- if (fd == -1) {
- perror("open");
- exit(-1);
- }
- resctrl_take_shared_lock(fd);
- /* code to read directory contents */
- resctrl_release_lock(fd);
-
- resctrl_take_exclusive_lock(fd);
- /* code to read and write directory contents */
- resctrl_release_lock(fd);
-}
-
-Examples for RDT Monitoring along with allocation usage:
-
+Example with bash::
+
+ # Atomically read directory structure
+ $ flock -s /sys/fs/resctrl/ find /sys/fs/resctrl
+
+ # Read directory contents and create new subdirectory
+
+ $ cat create-dir.sh
+ find /sys/fs/resctrl/ > output.txt
+ mask = function-of(output.txt)
+ mkdir /sys/fs/resctrl/newres/
+ echo mask > /sys/fs/resctrl/newres/schemata
+
+ $ flock /sys/fs/resctrl/ ./create-dir.sh
+
+Example with C::
+
+ /*
+ * Example code to take advisory locks
+ * before accessing resctrl filesystem
+ */
+ #include <sys/file.h>
+ #include <stdlib.h>
+
+ void resctrl_take_shared_lock(int fd)
+ {
+ int ret;
+
+ /* take shared lock on resctrl filesystem */
+ ret = flock(fd, LOCK_SH);
+ if (ret) {
+ perror("flock");
+ exit(-1);
+ }
+ }
+
+ void resctrl_take_exclusive_lock(int fd)
+ {
+ int ret;
+
+ /* take exclusive lock on resctrl filesystem */
+ ret = flock(fd, LOCK_EX);
+ if (ret) {
+ perror("flock");
+ exit(-1);
+ }
+ }
+
+ void resctrl_release_lock(int fd)
+ {
+ int ret;
+
+ /* release lock on resctrl filesystem */
+ ret = flock(fd, LOCK_UN);
+ if (ret) {
+ perror("flock");
+ exit(-1);
+ }
+ }
+
+ void main(void)
+ {
+ int fd, ret;
+
+ fd = open("/sys/fs/resctrl", O_DIRECTORY);
+ if (fd == -1) {
+ perror("open");
+ exit(-1);
+ }
+ resctrl_take_shared_lock(fd);
+ /* code to read directory contents */
+ resctrl_release_lock(fd);
+
+ resctrl_take_exclusive_lock(fd);
+ /* code to read and write directory contents */
+ resctrl_release_lock(fd);
+ }
+
+Examples for RDT Monitoring along with allocation usage
+=======================================================
Reading monitored data
----------------------
Reading an event file (for ex: mon_data/mon_L3_00/llc_occupancy) would
@@ -1009,17 +1066,17 @@ group or CTRL_MON group.


Example 1 (Monitor CTRL_MON group and subset of tasks in CTRL_MON group)
----------
+------------------------------------------------------------------------
On a two socket machine (one L3 cache per socket) with just four bits
-for cache bit masks
+for cache bit masks::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
-# mkdir p0 p1
-# echo "L3:0=3;1=c" > /sys/fs/resctrl/p0/schemata
-# echo "L3:0=3;1=3" > /sys/fs/resctrl/p1/schemata
-# echo 5678 > p1/tasks
-# echo 5679 > p1/tasks
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl
+ # mkdir p0 p1
+ # echo "L3:0=3;1=c" > /sys/fs/resctrl/p0/schemata
+ # echo "L3:0=3;1=3" > /sys/fs/resctrl/p1/schemata
+ # echo 5678 > p1/tasks
+ # echo 5679 > p1/tasks

The default resource group is unmodified, so we have access to all parts
of all caches (its schemata file reads "L3:0=f;1=f").
@@ -1029,47 +1086,51 @@ Tasks that are under the control of group "p0" may only allocate from the
Tasks in group "p1" use the "lower" 50% of cache on both sockets.

Create monitor groups and assign a subset of tasks to each monitor group.
+::

-# cd /sys/fs/resctrl/p1/mon_groups
-# mkdir m11 m12
-# echo 5678 > m11/tasks
-# echo 5679 > m12/tasks
+ # cd /sys/fs/resctrl/p1/mon_groups
+ # mkdir m11 m12
+ # echo 5678 > m11/tasks
+ # echo 5679 > m12/tasks

fetch data (data shown in bytes)
+::

-# cat m11/mon_data/mon_L3_00/llc_occupancy
-16234000
-# cat m11/mon_data/mon_L3_01/llc_occupancy
-14789000
-# cat m12/mon_data/mon_L3_00/llc_occupancy
-16789000
+ # cat m11/mon_data/mon_L3_00/llc_occupancy
+ 16234000
+ # cat m11/mon_data/mon_L3_01/llc_occupancy
+ 14789000
+ # cat m12/mon_data/mon_L3_00/llc_occupancy
+ 16789000

The parent ctrl_mon group shows the aggregated data.
+::

-# cat /sys/fs/resctrl/p1/mon_data/mon_l3_00/llc_occupancy
-31234000
+ # cat /sys/fs/resctrl/p1/mon_data/mon_l3_00/llc_occupancy
+ 31234000

Example 2 (Monitor a task from its creation)
----------
-On a two socket machine (one L3 cache per socket)
+--------------------------------------------
+On a two socket machine (one L3 cache per socket)::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
-# mkdir p0 p1
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl
+ # mkdir p0 p1

An RMID is allocated to the group once its created and hence the <cmd>
below is monitored from its creation.
+::

-# echo $$ > /sys/fs/resctrl/p1/tasks
-# <cmd>
+ # echo $$ > /sys/fs/resctrl/p1/tasks
+ # <cmd>

-Fetch the data
+Fetch the data::

-# cat /sys/fs/resctrl/p1/mon_data/mon_l3_00/llc_occupancy
-31789000
+ # cat /sys/fs/resctrl/p1/mon_data/mon_l3_00/llc_occupancy
+ 31789000

Example 3 (Monitor without CAT support or before creating CAT groups)
----------
+---------------------------------------------------------------------

Assume a system like HSW has only CQM and no CAT support. In this case
the resctrl will still mount but cannot create CTRL_MON directories.
@@ -1078,27 +1139,29 @@ able to monitor all tasks including kernel threads.

This can also be used to profile jobs cache size footprint before being
able to allocate them to different allocation groups.
+::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
-# mkdir mon_groups/m01
-# mkdir mon_groups/m02
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl
+ # mkdir mon_groups/m01
+ # mkdir mon_groups/m02

-# echo 3478 > /sys/fs/resctrl/mon_groups/m01/tasks
-# echo 2467 > /sys/fs/resctrl/mon_groups/m02/tasks
+ # echo 3478 > /sys/fs/resctrl/mon_groups/m01/tasks
+ # echo 2467 > /sys/fs/resctrl/mon_groups/m02/tasks

Monitor the groups separately and also get per domain data. From the
below its apparent that the tasks are mostly doing work on
domain(socket) 0.
+::

-# cat /sys/fs/resctrl/mon_groups/m01/mon_L3_00/llc_occupancy
-31234000
-# cat /sys/fs/resctrl/mon_groups/m01/mon_L3_01/llc_occupancy
-34555
-# cat /sys/fs/resctrl/mon_groups/m02/mon_L3_00/llc_occupancy
-31234000
-# cat /sys/fs/resctrl/mon_groups/m02/mon_L3_01/llc_occupancy
-32789
+ # cat /sys/fs/resctrl/mon_groups/m01/mon_L3_00/llc_occupancy
+ 31234000
+ # cat /sys/fs/resctrl/mon_groups/m01/mon_L3_01/llc_occupancy
+ 34555
+ # cat /sys/fs/resctrl/mon_groups/m02/mon_L3_00/llc_occupancy
+ 31234000
+ # cat /sys/fs/resctrl/mon_groups/m02/mon_L3_01/llc_occupancy
+ 32789


Example 4 (Monitor real time tasks)
@@ -1107,15 +1170,17 @@ Example 4 (Monitor real time tasks)
A single socket system which has real time tasks running on cores 4-7
and non real time tasks on other cpus. We want to monitor the cache
occupancy of the real time threads on these cores.
+::

-# mount -t resctrl resctrl /sys/fs/resctrl
-# cd /sys/fs/resctrl
-# mkdir p1
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl
+ # mkdir p1

-Move the cpus 4-7 over to p1
-# echo f0 > p1/cpus
+Move the cpus 4-7 over to p1::
+
+ # echo f0 > p1/cpus

-View the llc occupancy snapshot
+View the llc occupancy snapshot::

-# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
-11234000
+ # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
+ 11234000
--
2.20.1
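
Reader's note on the capacity bitmasks used in the resctrl examples above: the percentages quoted in the text ("upper" 50%, "top" 25%, 25% of each cache instance) follow from counting the set bits in each mask and dividing by the mask width. The sketch below is only an illustration of that arithmetic; the helper name cbm_percent and its two arguments are assumptions made here and are not part of the patch or of the resctrl interface.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): report what share of a cache
# a capacity bitmask (CBM) covers.
#   $1 = CBM in hex without a 0x prefix, as written to a schemata file
#   $2 = width of the mask in bits (e.g. 20 in Example 2, 8 in Example 4)
cbm_percent() {
    mask=$((0x$1))
    width=$2
    set_bits=0
    # Count the set bits one at a time.
    while [ "$mask" -gt 0 ]; do
        set_bits=$((set_bits + (mask & 1)))
        mask=$((mask >> 1))
    done
    echo $((set_bits * 100 / width))
}

cbm_percent 3ff 20     # default group in Example 2, prints 50
cbm_percent f8000 20   # first real-time group in Example 2, prints 25
cbm_percent 3 8        # exclusive group in Example 4, prints 25
```

The helper only counts bits; it does not check that the set bits are contiguous, which some platforms require of a valid mask.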

2019-04-23 16:41:11

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

On Tue, Apr 23, 2019 at 6:30 PM Changbin Du <[email protected]> wrote:
>
> Hi Corbet and All,
> The kernel now uses Sphinx to generate intelligent and beautiful documentation
> from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 docs to
> reST format in this series.
>
> In this version I combined ACPI and PCI docs, and added new x86 docs conversion.

I'm not sure if combining all three into one big patch series has been
a good idea, honestly.

It would have been easier to review and handle otherwise.

For one, I'd like to handle the ACPI part of it myself if Jon doesn't mind that.

Thanks,
Rafael

2019-04-23 16:41:11

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 56/63] Documentation: x86: convert i386/IO-APIC.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../x86/i386/{IO-APIC.txt => IO-APIC.rst} | 26 ++++++++++++-------
Documentation/x86/i386/index.rst | 10 +++++++
Documentation/x86/index.rst | 1 +
3 files changed, 27 insertions(+), 10 deletions(-)
rename Documentation/x86/i386/{IO-APIC.txt => IO-APIC.rst} (93%)
create mode 100644 Documentation/x86/i386/index.rst

diff --git a/Documentation/x86/i386/IO-APIC.txt b/Documentation/x86/i386/IO-APIC.rst
similarity index 93%
rename from Documentation/x86/i386/IO-APIC.txt
rename to Documentation/x86/i386/IO-APIC.rst
index 15f5baf7e1b6..aec98f742763 100644
--- a/Documentation/x86/i386/IO-APIC.txt
+++ b/Documentation/x86/i386/IO-APIC.rst
@@ -1,3 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======
+IO-APIC
+=======
+
+:Author: Ingo Molnar <[email protected]>
+
Most (all) Intel-MP compliant SMP boards have the so-called 'IO-APIC',
which is an enhanced interrupt controller. It enables us to route
hardware interrupts to multiple CPUs, or to CPU groups. Without an
@@ -13,7 +21,7 @@ usually worked around by the kernel. If your MP-compliant SMP board does
not boot Linux, then consult the linux-smp mailing list archives first.

If your box boots fine with enabled IO-APIC IRQs, then your
-/proc/interrupts will look like this one:
+/proc/interrupts will look like this one::

---------------------------->
hell:~> cat /proc/interrupts
@@ -37,14 +45,14 @@ none of those IRQ sources is performance-critical.
In the unlikely case that your board does not create a working mp-table,
you can use the pirq= boot parameter to 'hand-construct' IRQ entries. This
is non-trivial though and cannot be automated. One sample /etc/lilo.conf
-entry:
+entry::

append="pirq=15,11,10"

The actual numbers depend on your system, on your PCI cards and on their
PCI slot position. Usually PCI slots are 'daisy chained' before they are
connected to the PCI chipset IRQ routing facility (the incoming PIRQ1-4
-lines):
+lines)::

,-. ,-. ,-. ,-. ,-.
PIRQ4 ----| |-. ,-| |-. ,-| |-. ,-| |--------| |
@@ -56,7 +64,7 @@ lines):
PIRQ1 ----| |- `----| |- `----| |- `----| |--------| |
`-' `-' `-' `-' `-'

-Every PCI card emits a PCI IRQ, which can be INTA, INTB, INTC or INTD:
+Every PCI card emits a PCI IRQ, which can be INTA, INTB, INTC or INTD::

,-.
INTD--| |
@@ -78,19 +86,19 @@ to have non shared interrupts). Slot5 should be used for videocards, they
do not use interrupts normally, thus they are not daisy chained either.

so if you have your SCSI card (IRQ11) in Slot1, Tulip card (IRQ9) in
-Slot2, then you'll have to specify this pirq= line:
+Slot2, then you'll have to specify this pirq= line::

append="pirq=11,9"

the following script tries to figure out such a default pirq= line from
-your PCI configuration:
+your PCI configuration::

echo -n pirq=; echo `scanpci | grep T_L | cut -c56-` | sed 's/ /,/g'

note that this script won't work if you have skipped a few slots or if your
board does not do default daisy-chaining. (or the IO-APIC has the PIRQ pins
connected in some strange way). E.g. if in the above case you have your SCSI
-card (IRQ11) in Slot3, and have Slot1 empty:
+card (IRQ11) in Slot3, and have Slot1 empty::

append="pirq=0,9,11"

@@ -105,7 +113,7 @@ won't function properly (e.g. if it's inserted as a module).
If you have 2 PCI buses, then you can use up to 8 pirq values, although such
boards tend to have a good configuration.

-Be prepared that it might happen that you need some strange pirq line:
+Be prepared that it might happen that you need some strange pirq line::

append="pirq=0,0,0,0,0,0,9,11"

@@ -115,5 +123,3 @@ Good luck and mail to [email protected] or
[email protected] if you have any problems that are not covered
by this document.

--- mingo
-
diff --git a/Documentation/x86/i386/index.rst b/Documentation/x86/i386/index.rst
new file mode 100644
index 000000000000..8747cf5bbd49
--- /dev/null
+++ b/Documentation/x86/i386/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+i386 Support
+============
+
+.. toctree::
+ :maxdepth: 2
+
+ IO-APIC
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 526f7a008b8e..19323c5b89ce 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -26,3 +26,4 @@ Linux x86 Support
microcode
resctrl_ui
usb-legacy-support
+ i386/index
--
2.20.1
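
Reader's note on the pirq= lines in the IO-APIC document above: each value is the IRQ of the card in the corresponding PCI slot, in slot order, with 0 standing in for an empty slot. A minimal sketch of assembling such a line by hand follows; the function name pirq_line is an assumption made for illustration and does not exist in the kernel tree or in the patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): join per-slot IRQ numbers,
# given in slot order, into a lilo.conf append line. Use 0 for an empty slot.
pirq_line() {
    line=
    for irq in "$@"; do
        # Put a comma before every value except the first.
        line="${line:+$line,}$irq"
    done
    echo "append=\"pirq=$line\""
}

pirq_line 11 9      # SCSI in Slot1, Tulip in Slot2: append="pirq=11,9"
pirq_line 0 9 11    # Slot1 empty, SCSI in Slot3:    append="pirq=0,9,11"
```

As the document itself warns, the right values depend on the board's daisy-chaining, so a line built this way is a starting point rather than a guaranteed configuration.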

2019-04-23 16:41:17

by Changbin Du

[permalink] [raw]
Subject: [PATCH v4 57/63] Documentation: x86: convert x86_64/boot-options.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
Documentation/x86/index.rst | 1 +
Documentation/x86/x86_64/boot-options.rst | 327 ++++++++++++++++++++++
Documentation/x86/x86_64/boot-options.txt | 278 ------------------
Documentation/x86/x86_64/index.rst | 10 +
4 files changed, 338 insertions(+), 278 deletions(-)
create mode 100644 Documentation/x86/x86_64/boot-options.rst
delete mode 100644 Documentation/x86/x86_64/boot-options.txt
create mode 100644 Documentation/x86/x86_64/index.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 19323c5b89ce..e7becb146c30 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -27,3 +27,4 @@ Linux x86 Support
resctrl_ui
usb-legacy-support
i386/index
+ x86_64/index
diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
new file mode 100644
index 000000000000..44aa8b878b16
--- /dev/null
+++ b/Documentation/x86/x86_64/boot-options.rst
@@ -0,0 +1,327 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+AMD64 Specific Boot Options
+===========================
+
+There are many others (usually documented in driver documentation), but
+only the AMD64 specific ones are listed here.
+
+Machine check
+=============
+Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables.
+
+ mce=off
+ Disable machine check
+ mce=no_cmci
+ Disable CMCI(Corrected Machine Check Interrupt) that
+ Intel processor supports. Usually this disablement is
+ not recommended, but it might be handy if your hardware
+ is misbehaving.
+ Note that you'll get more problems without CMCI than with
+ due to the shared banks, i.e. you might get duplicated
+ error logs.
+ mce=dont_log_ce
+ Don't make logs for corrected errors. All events reported
+ as corrected are silently cleared by OS.
+ This option will be useful if you have no interest in any
+ of corrected errors.
+ mce=ignore_ce
+ Disable features for corrected errors, e.g. polling timer
+ and CMCI. All events reported as corrected are not cleared
+ by the OS and remain in its error banks.
+ Usually this disablement is not recommended, however if
+ there is an agent checking/clearing corrected errors
+ (e.g. BIOS or hardware monitoring applications), conflicting
+ with OS's error handling, and you cannot deactivate the agent,
+ then this option will be a help.
+ mce=no_lmce
+ Do not opt-in to Local MCE delivery. Use legacy method
+ to broadcast MCEs.
+ mce=bootlog
+ Enable logging of machine checks left over from booting.
+ Disabled by default on AMD Fam10h and older because some BIOS
+ leave bogus ones.
+ If your BIOS doesn't do that it's a good idea to enable though
+ to make sure you log even machine check events that result
+ in a reboot. On Intel systems it is enabled by default.
+ mce=nobootlog
+ Disable boot machine check logging.
+ mce=tolerancelevel[,monarchtimeout] (number,number)
+ tolerance levels:
+ 0: always panic on uncorrected errors, log corrected errors
+ 1: panic or SIGBUS on uncorrected errors, log corrected errors
+ 2: SIGBUS or log uncorrected errors, log corrected errors
+ 3: never panic or SIGBUS, log all errors (for testing only)
+ Default is 1
+ Can also be set using sysfs, which is preferable.
+ monarchtimeout:
+ Sets the time in us to wait for other CPUs on machine checks. 0
+ to disable.
+ mce=bios_cmci_threshold
+ Don't overwrite the bios-set CMCI threshold. This boot option
+ prevents Linux from overwriting the CMCI threshold set by the
+ bios. Without this option, Linux always sets the CMCI
+ threshold to 1. Enabling this may make memory predictive failure
+ analysis less effective if the bios sets thresholds for memory
+ errors since we will not see details for all errors.
+ mce=recovery
+ Force-enable recoverable machine check code paths
+
+ nomce (for compatibility with i386)
+ same as mce=off
+
+ Everything else is in sysfs now.
+
+APICs
+=====
+
+ apic
+ Use IO-APIC. Default
+
+ noapic
+ Don't use the IO-APIC.
+
+ disableapic
+ Don't use the local APIC
+
+ nolapic
+ Don't use the local APIC (alias for i386 compatibility)
+
+ pirq=...
+ See Documentation/x86/i386/IO-APIC.txt
+
+ noapictimer
+ Don't set up the APIC timer
+
+ no_timer_check
+ Don't check the IO-APIC timer. This can work around
+ problems with incorrect timer initialization on some boards.
+
+ apicpmtimer
+ Do APIC timer calibration using the pmtimer. Implies
+ apicmaintimer. Useful when your PIT timer is totally
+ broken.
+
+Timing
+======
+
+ notsc
+ Deprecated, use tsc=unstable instead.
+
+ nohpet
+ Don't use the HPET timer.
+
+Idle loop
+=========
+
+ idle=poll
+ Don't do power saving in the idle loop using HLT, but poll for rescheduling
+ event. This will make the CPUs eat a lot more power, but may be useful
+ to get slightly better performance in multiprocessor benchmarks. It also
+ makes some profiling using performance counters more accurate.
+ Please note that on systems with MONITOR/MWAIT support (like Intel EM64T
+ CPUs) this option has no performance advantage over the normal idle loop.
+ It may also interact badly with hyperthreading.
+
+Rebooting
+=========
+
+ reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
+ * bios - Use the CPU reboot vector for warm reset
+ * warm - Don't set the cold reboot flag
+ * cold - Set the cold reboot flag
+ * triple - Force a triple fault (init)
+ * kbd - Use the keyboard controller. cold reset (default)
+ * acpi - Use the ACPI RESET_REG in the FADT. If ACPI is not configured or
+ the ACPI reset does not work, the reboot path attempts the reset
+ using the keyboard controller.
+ * efi - Use efi reset_system runtime service. If EFI is not configured or
+ the EFI reset does not work, the reboot path attempts the reset using
+ the keyboard controller.
+
+ Using warm reset will be much faster especially on big memory
+ systems because the BIOS will not go through the memory check.
+ Disadvantage is that not all hardware will be completely reinitialized
+ on reboot so there may be boot problems on some systems.
+
+ reboot=force
+ Don't stop other CPUs on reboot. This can make reboot more reliable
+ in some cases.
+
+Non Executable Mappings
+=======================
+
+ noexec=on|off
+ * on - Enable(default)
+ * off - Disable
+
+NUMA
+====
+
+ numa=off
+ Only set up a single NUMA node spanning all memory.
+
+ numa=noacpi
+ Don't parse the SRAT table for NUMA setup
+
+ numa=fake=<size>[MG]
+ If given as a memory unit, fills all system RAM with nodes of
+ size interleaved over physical nodes.
+
+ numa=fake=<N>
+ If given as an integer, fills all system RAM with N fake nodes
+ interleaved over physical nodes.
+
+ numa=fake=<N>U
+ If given as an integer followed by 'U', it will divide each
+ physical node into N emulated nodes.
+
+ACPI
+====
+
+ acpi=off
+ Don't enable ACPI
+ acpi=ht
+ Use ACPI boot table parsing, but don't enable ACPI interpreter
+ acpi=force
+ Force ACPI on (currently not needed)
+ acpi=strict
+ Disable out of spec ACPI workarounds.
+ acpi_sci={edge,level,high,low}
+ Set up ACPI SCI interrupt.
+ acpi=noirq
+ Don't route interrupts
+ acpi=nocmcff
+ Disable firmware first mode for corrected errors. This
+ disables parsing the HEST CMC error source to check if
+ firmware has set the FF flag. This may result in
+ duplicate corrected error reports.
+
+PCI
+===
+
+ pci=off
+ Don't use PCI
+ pci=conf1
+ Use conf1 access.
+ pci=conf2
+ Use conf2 access.
+ pci=rom
+ Assign ROMs.
+ pci=assign-busses
+ Assign busses
+ pci=irqmask=MASK
+ Set PCI interrupt mask to MASK
+ pci=lastbus=NUMBER
+ Scan up to NUMBER busses, no matter what the mptable says.
+ pci=noacpi
+ Don't use ACPI to set up PCI interrupt routing.
+
+IOMMU (input/output memory management unit)
+===========================================
+Multiple x86-64 PCI-DMA mapping implementations exist, for example:
+
+ 1. <lib/dma-direct.c>: use no hardware/software IOMMU at all
+ (e.g. because you have < 3 GB memory).
+ Kernel boot message: "PCI-DMA: Disabling IOMMU"
+
+ 2. <arch/x86/kernel/amd_gart_64.c>: AMD GART based hardware IOMMU.
+ Kernel boot message: "PCI-DMA: using GART IOMMU"
+
+ 3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used
+ e.g. if there is no hardware IOMMU in the system and it is needed because
+ you have >3GB memory or told the kernel to use it (iommu=soft).
+ Kernel boot message: "PCI-DMA: Using software bounce buffering
+ for IO (SWIOTLB)"
+
+ 4. <arch/x86_64/pci-calgary.c> : IBM Calgary hardware IOMMU. Used in IBM
+ pSeries and xSeries servers. This hardware IOMMU supports DMA address
+ mapping with memory protection, etc.
+ Kernel boot message: "PCI-DMA: Using Calgary IOMMU"
+
+::
+
+ iommu=[<size>][,noagp][,off][,force][,noforce]
+ [,memaper[=<order>]][,merge][,fullflush][,nomerge]
+ [,noaperture][,calgary]
+
+General iommu options:
+
+ off
+ Don't initialize and use any kind of IOMMU.
+ noforce
+ Don't force hardware IOMMU usage when it is not needed. (default).
+ force
+ Force the use of the hardware IOMMU even when it is
+ not actually needed (e.g. because < 3 GB memory).
+ soft
+ Use software bounce buffering (SWIOTLB) (default for
+ Intel machines). This can be used to prevent the usage
+ of an available hardware IOMMU.
+
+iommu options only relevant to the AMD GART hardware IOMMU:
+
+ <size>
+ Set the size of the remapping area in bytes.
+ allowed
+ Overwrite iommu off workarounds for specific chipsets.
+ fullflush
+ Flush IOMMU on each allocation (default).
+ nofullflush
+ Don't use IOMMU fullflush.
+ memaper[=<order>]
+ Allocate an own aperture over RAM with size 32MB<<order.
+ (default: order=1, i.e. 64MB)
+ merge
+ Do scatter-gather (SG) merging. Implies "force" (experimental).
+ nomerge
+ Don't do scatter-gather (SG) merging.
+ noaperture
+ Ask the IOMMU not to touch the aperture for AGP.
+ noagp
+ Don't initialize the AGP driver and use full aperture.
+ panic
+ Always panic when IOMMU overflows.
+ calgary
+ Use the Calgary IOMMU if it is available
+
+iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
+implementation:
+
+ swiotlb=<pages>[,force]
+ <pages>
+ Prereserve that many 128K pages for the software IO bounce buffering.
+ force
+ Force all IO through the software TLB.
+
+Settings for the IBM Calgary hardware IOMMU currently found in IBM
+pSeries and xSeries machines
+
+ calgary=[64k,128k,256k,512k,1M,2M,4M,8M]
+ Set the size of each PCI slot's translation table when using the
+ Calgary IOMMU. This is the size of the translation table itself
+ in main memory. The smallest table, 64k, covers an IO space of
+ 32MB; the largest, 8MB table, can cover an IO space of 4GB.
+ Normally the kernel will make the right choice by itself.
+ calgary=[translate_empty_slots]
+ Enable translation even on slots that have no devices attached to
+ them, in case a device will be hotplugged in the future.
+ calgary=[disable=<PCI bus number>]
+ Disable translation on a given PHB. For
+ example, the built-in graphics adapter resides on the first bridge
+ (PCI bus number 0); if translation (isolation) is enabled on this
+ bridge, X servers that access the hardware directly from user
+ space might stop working. Use this option if you have devices that
+ are accessed from userspace directly on some PCI host bridge.
+ panic
+ Always panic when IOMMU overflows
+
+
+Miscellaneous
+=============
+
+ nogbpages
+ Do not use GB pages for kernel direct mappings.
+ gbpages
+ Use GB pages for kernel direct mappings.
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
deleted file mode 100644
index abc53886655e..000000000000
--- a/Documentation/x86/x86_64/boot-options.txt
+++ /dev/null
@@ -1,278 +0,0 @@
-AMD64 specific boot options
-
-There are many others (usually documented in driver documentation), but
-only the AMD64 specific ones are listed here.
-
-Machine check
-
- Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables.
-
- mce=off
- Disable machine check
- mce=no_cmci
- Disable CMCI(Corrected Machine Check Interrupt) that
- Intel processor supports. Usually this disablement is
- not recommended, but it might be handy if your hardware
- is misbehaving.
- Note that you'll get more problems without CMCI than with
- due to the shared banks, i.e. you might get duplicated
- error logs.
- mce=dont_log_ce
- Don't make logs for corrected errors. All events reported
- as corrected are silently cleared by OS.
- This option will be useful if you have no interest in any
- of corrected errors.
- mce=ignore_ce
- Disable features for corrected errors, e.g. polling timer
- and CMCI. All events reported as corrected are not cleared
- by OS and remained in its error banks.
- Usually this disablement is not recommended, however if
- there is an agent checking/clearing corrected errors
- (e.g. BIOS or hardware monitoring applications), conflicting
- with OS's error handling, and you cannot deactivate the agent,
- then this option will be a help.
- mce=no_lmce
- Do not opt-in to Local MCE delivery. Use legacy method
- to broadcast MCEs.
- mce=bootlog
- Enable logging of machine checks left over from booting.
- Disabled by default on AMD Fam10h and older because some BIOS
- leave bogus ones.
- If your BIOS doesn't do that it's a good idea to enable though
- to make sure you log even machine check events that result
- in a reboot. On Intel systems it is enabled by default.
- mce=nobootlog
- Disable boot machine check logging.
- mce=tolerancelevel[,monarchtimeout] (number,number)
- tolerance levels:
- 0: always panic on uncorrected errors, log corrected errors
- 1: panic or SIGBUS on uncorrected errors, log corrected errors
- 2: SIGBUS or log uncorrected errors, log corrected errors
- 3: never panic or SIGBUS, log all errors (for testing only)
- Default is 1
- Can be also set using sysfs which is preferable.
- monarchtimeout:
- Sets the time in us to wait for other CPUs on machine checks. 0
- to disable.
- mce=bios_cmci_threshold
- Don't overwrite the bios-set CMCI threshold. This boot option
- prevents Linux from overwriting the CMCI threshold set by the
- bios. Without this option, Linux always sets the CMCI
- threshold to 1. Enabling this may make memory predictive failure
- analysis less effective if the bios sets thresholds for memory
- errors since we will not see details for all errors.
- mce=recovery
- Force-enable recoverable machine check code paths
-
- nomce (for compatibility with i386): same as mce=off
-
- Everything else is in sysfs now.
-
-APICs
-
- apic Use IO-APIC. Default
-
- noapic Don't use the IO-APIC.
-
- disableapic Don't use the local APIC
-
- nolapic Don't use the local APIC (alias for i386 compatibility)
-
- pirq=... See Documentation/x86/i386/IO-APIC.txt
-
- noapictimer Don't set up the APIC timer
-
- no_timer_check Don't check the IO-APIC timer. This can work around
- problems with incorrect timer initialization on some boards.
- apicpmtimer
- Do APIC timer calibration using the pmtimer. Implies
- apicmaintimer. Useful when your PIT timer is totally
- broken.
-
-Timing
-
- notsc
- Deprecated, use tsc=unstable instead.
-
- nohpet
- Don't use the HPET timer.
-
-Idle loop
-
- idle=poll
- Don't do power saving in the idle loop using HLT, but poll for rescheduling
- event. This will make the CPUs eat a lot more power, but may be useful
- to get slightly better performance in multiprocessor benchmarks. It also
- makes some profiling using performance counters more accurate.
- Please note that on systems with MONITOR/MWAIT support (like Intel EM64T
- CPUs) this option has no performance advantage over the normal idle loop.
- It may also interact badly with hyperthreading.
-
-Rebooting
-
- reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
- bios Use the CPU reboot vector for warm reset
- warm Don't set the cold reboot flag
- cold Set the cold reboot flag
- triple Force a triple fault (init)
- kbd Use the keyboard controller. cold reset (default)
- acpi Use the ACPI RESET_REG in the FADT. If ACPI is not configured or the
- ACPI reset does not work, the reboot path attempts the reset using
- the keyboard controller.
- efi Use efi reset_system runtime service. If EFI is not configured or the
- EFI reset does not work, the reboot path attempts the reset using
- the keyboard controller.
-
- Using warm reset will be much faster especially on big memory
- systems because the BIOS will not go through the memory check.
- Disadvantage is that not all hardware will be completely reinitialized
- on reboot so there may be boot problems on some systems.
-
- reboot=force
-
- Don't stop other CPUs on reboot. This can make reboot more reliable
- in some cases.
-
-Non Executable Mappings
-
- noexec=on|off
-
- on Enable(default)
- off Disable
-
-NUMA
-
- numa=off Only set up a single NUMA node spanning all memory.
-
- numa=noacpi Don't parse the SRAT table for NUMA setup
-
- numa=fake=<size>[MG]
- If given as a memory unit, fills all system RAM with nodes of
- size interleaved over physical nodes.
-
- numa=fake=<N>
- If given as an integer, fills all system RAM with N fake nodes
- interleaved over physical nodes.
-
- numa=fake=<N>U
- If given as an integer followed by 'U', it will divide each
- physical node into N emulated nodes.
-
-ACPI
-
- acpi=off Don't enable ACPI
- acpi=ht Use ACPI boot table parsing, but don't enable ACPI
- interpreter
- acpi=force Force ACPI on (currently not needed)
-
- acpi=strict Disable out of spec ACPI workarounds.
-
- acpi_sci={edge,level,high,low} Set up ACPI SCI interrupt.
-
- acpi=noirq Don't route interrupts
-
- acpi=nocmcff Disable firmware first mode for corrected errors. This
- disables parsing the HEST CMC error source to check if
- firmware has set the FF flag. This may result in
- duplicate corrected error reports.
-
-PCI
-
- pci=off Don't use PCI
- pci=conf1 Use conf1 access.
- pci=conf2 Use conf2 access.
- pci=rom Assign ROMs.
- pci=assign-busses Assign busses
- pci=irqmask=MASK Set PCI interrupt mask to MASK
- pci=lastbus=NUMBER Scan up to NUMBER busses, no matter what the mptable says.
- pci=noacpi Don't use ACPI to set up PCI interrupt routing.
-
-IOMMU (input/output memory management unit)
-
- Multiple x86-64 PCI-DMA mapping implementations exist, for example:
-
- 1. <lib/dma-direct.c>: use no hardware/software IOMMU at all
- (e.g. because you have < 3 GB memory).
- Kernel boot message: "PCI-DMA: Disabling IOMMU"
-
- 2. <arch/x86/kernel/amd_gart_64.c>: AMD GART based hardware IOMMU.
- Kernel boot message: "PCI-DMA: using GART IOMMU"
-
- 3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used
- e.g. if there is no hardware IOMMU in the system and it is need because
- you have >3GB memory or told the kernel to us it (iommu=soft))
- Kernel boot message: "PCI-DMA: Using software bounce buffering
- for IO (SWIOTLB)"
-
- 4. <arch/x86_64/pci-calgary.c> : IBM Calgary hardware IOMMU. Used in IBM
- pSeries and xSeries servers. This hardware IOMMU supports DMA address
- mapping with memory protection, etc.
- Kernel boot message: "PCI-DMA: Using Calgary IOMMU"
-
- iommu=[<size>][,noagp][,off][,force][,noforce]
- [,memaper[=<order>]][,merge][,fullflush][,nomerge]
- [,noaperture][,calgary]
-
- General iommu options:
- off Don't initialize and use any kind of IOMMU.
- noforce Don't force hardware IOMMU usage when it is not needed.
- (default).
- force Force the use of the hardware IOMMU even when it is
- not actually needed (e.g. because < 3 GB memory).
- soft Use software bounce buffering (SWIOTLB) (default for
- Intel machines). This can be used to prevent the usage
- of an available hardware IOMMU.
-
- iommu options only relevant to the AMD GART hardware IOMMU:
- <size> Set the size of the remapping area in bytes.
- allowed Overwrite iommu off workarounds for specific chipsets.
- fullflush Flush IOMMU on each allocation (default).
- nofullflush Don't use IOMMU fullflush.
- memaper[=<order>] Allocate an own aperture over RAM with size 32MB<<order.
- (default: order=1, i.e. 64MB)
- merge Do scatter-gather (SG) merging. Implies "force"
- (experimental).
- nomerge Don't do scatter-gather (SG) merging.
- noaperture Ask the IOMMU not to touch the aperture for AGP.
- noagp Don't initialize the AGP driver and use full aperture.
- panic Always panic when IOMMU overflows.
- calgary Use the Calgary IOMMU if it is available
-
- iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
- implementation:
- swiotlb=<pages>[,force]
- <pages> Prereserve that many 128K pages for the software IO
- bounce buffering.
- force Force all IO through the software TLB.
-
- Settings for the IBM Calgary hardware IOMMU currently found in IBM
- pSeries and xSeries machines:
-
- calgary=[64k,128k,256k,512k,1M,2M,4M,8M]
- calgary=[translate_empty_slots]
- calgary=[disable=<PCI bus number>]
- panic Always panic when IOMMU overflows
-
- 64k,...,8M - Set the size of each PCI slot's translation table
- when using the Calgary IOMMU. This is the size of the translation
- table itself in main memory. The smallest table, 64k, covers an IO
- space of 32MB; the largest, 8MB table, can cover an IO space of
- 4GB. Normally the kernel will make the right choice by itself.
-
- translate_empty_slots - Enable translation even on slots that have
- no devices attached to them, in case a device will be hotplugged
- in the future.
-
- disable=<PCI bus number> - Disable translation on a given PHB. For
- example, the built-in graphics adapter resides on the first bridge
- (PCI bus number 0); if translation (isolation) is enabled on this
- bridge, X servers that access the hardware directly from user
- space might stop working. Use this option if you have devices that
- are accessed from userspace directly on some PCI host bridge.
-
-Miscellaneous
-
- nogbpages
- Do not use GB pages for kernel direct mappings.
- gbpages
- Use GB pages for kernel direct mappings.
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
new file mode 100644
index 000000000000..a8cf7713cac9
--- /dev/null
+++ b/Documentation/x86/x86_64/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+x86_64 Support
+==============
+
+.. toctree::
+ :maxdepth: 2
+
+ boot-options
--
2.20.1
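
As a rough sanity check of the sizes quoted in the converted boot-options.rst (the Calgary table sizes and the GART `memaper` aperture), the arithmetic can be sketched in Python. This is only an illustration: the 8-byte entry size and 4 KiB page size are assumptions chosen because they reproduce the documented figures (a 64k table covering 32MB, an 8M table covering 4GB), and the helper names are hypothetical, not kernel symbols.

```python
# Assumptions (not stated in the patch): each Calgary translation-table
# entry is 8 bytes wide and maps one 4 KiB page of IO space.
ENTRY_BYTES = 8
PAGE_BYTES = 4 * 1024
MIB = 1024 * 1024


def calgary_io_space(table_bytes):
    """Return the IO space (in bytes) covered by a table of table_bytes."""
    return (table_bytes // ENTRY_BYTES) * PAGE_BYTES


def gart_aperture(order=1):
    """Return the aperture size (in bytes) for memaper=<order>: 32MB << order."""
    return (32 * MIB) << order


if __name__ == "__main__":
    # Smallest and largest table sizes accepted by calgary=...
    print(calgary_io_space(64 * 1024) // MIB, "MiB")
    print(calgary_io_space(8 * MIB) // MIB, "MiB")
    # Default memaper order documented as 1
    print(gart_aperture() // MIB, "MiB")
```

Under these assumptions the ratio between IO space and table size is a constant 512, which matches both endpoints given in the document.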

2019-04-23 16:41:21

by Changbin Du

Subject: [PATCH v4 50/63] Documentation: x86: convert amd-memory-encryption.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
...ory-encryption.txt => amd-memory-encryption.rst} | 13 ++++++++++---
Documentation/x86/index.rst | 1 +
2 files changed, 11 insertions(+), 3 deletions(-)
rename Documentation/x86/{amd-memory-encryption.txt => amd-memory-encryption.rst} (94%)

diff --git a/Documentation/x86/amd-memory-encryption.txt b/Documentation/x86/amd-memory-encryption.rst
similarity index 94%
rename from Documentation/x86/amd-memory-encryption.txt
rename to Documentation/x86/amd-memory-encryption.rst
index afc41f544dab..c48d452d0718 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.rst
@@ -1,3 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+AMD Memory Encryption
+=====================
+
Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV) are
features found on AMD processors.

@@ -34,7 +40,7 @@ is operating in 64-bit or 32-bit PAE mode, in all other modes the SEV hardware
forces the memory encryption bit to 1.

Support for SME and SEV can be determined through the CPUID instruction. The
-CPUID function 0x8000001f reports information related to SME:
+CPUID function 0x8000001f reports information related to SME::

0x8000001f[eax]:
Bit[0] indicates support for SME
@@ -48,14 +54,14 @@ CPUID function 0x8000001f reports information related to SME:
addresses)

If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be used to
-determine if SME is enabled and/or to enable memory encryption:
+determine if SME is enabled and/or to enable memory encryption::

0xc0010010:
Bit[23] 0 = memory encryption features are disabled
1 = memory encryption features are enabled

If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if
-SEV is active:
+SEV is active::

0xc0010131:
Bit[0] 0 = memory encryption is not active
@@ -68,6 +74,7 @@ requirements for the system. If this bit is not set upon Linux startup then
Linux itself will not set it and memory encryption will not be possible.

The state of SME in the Linux kernel can be documented as follows:
+
- Supported:
The CPU supports SME (determined through CPUID instruction).

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 20091d3e5d97..a0426ab156bd 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -20,3 +20,4 @@ Linux x86 Support
pat
protection-keys
intel_mpx
+ amd-memory-encryption
--
2.20.1
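
The bit checks described in the hunks above (CPUID 0x8000001f eax Bit[0], MSR 0xc0010010 Bit[23], MSR 0xc0010131 Bit[0]) can be illustrated with a small Python sketch. It only decodes register values that have already been read by some other means; the function names and the sample values are hypothetical and are not part of the kernel or of this patch.

```python
# Bit positions taken from the document text quoted in the patch.
CPUID_SME_BIT = 0        # CPUID 0x8000001f[eax] Bit[0]: SME supported
SYSCFG_MEM_ENC_BIT = 23  # MSR 0xc0010010 Bit[23]: memory encryption enabled
SEV_ACTIVE_BIT = 0       # MSR 0xc0010131 Bit[0]: memory encryption active


def bit_set(value, bit):
    """Return True if the given bit is set in value."""
    return (value >> bit) & 1 == 1


def sme_state(cpuid_eax, syscfg):
    """Classify SME from a CPUID eax value and an MSR_K8_SYSCFG value."""
    if not bit_set(cpuid_eax, CPUID_SME_BIT):
        return "unsupported"
    if not bit_set(syscfg, SYSCFG_MEM_ENC_BIT):
        return "supported"
    return "enabled"


def sev_active(sev_msr):
    """Return True if MSR_AMD64_SEV reports SEV as active."""
    return bit_set(sev_msr, SEV_ACTIVE_BIT)


if __name__ == "__main__":
    # Hypothetical sample register values, for illustration only.
    print(sme_state(0x1, 1 << 23))
    print(sev_active(0x0))
```

Whether the kernel actually uses the encryption bit in its page tables is a separate question that this sketch does not try to answer.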

2019-04-23 16:41:56

by Changbin Du

Subject: [PATCH v4 60/63] Documentation: x86: convert x86_64/5level-paging.txt to reST

This converts the plain text documentation to reStructuredText format and
adds it to the Sphinx TOC tree. No essential content change.

Signed-off-by: Changbin Du <[email protected]>
---
.../{5level-paging.txt => 5level-paging.rst} | 16 +++++++++++-----
Documentation/x86/x86_64/index.rst | 1 +
2 files changed, 12 insertions(+), 5 deletions(-)
rename Documentation/x86/x86_64/{5level-paging.txt => 5level-paging.rst} (91%)

diff --git a/Documentation/x86/x86_64/5level-paging.txt b/Documentation/x86/x86_64/5level-paging.rst
similarity index 91%
rename from Documentation/x86/x86_64/5level-paging.txt
rename to Documentation/x86/x86_64/5level-paging.rst
index 2432a5ef86d9..ab88a4514163 100644
--- a/Documentation/x86/x86_64/5level-paging.txt
+++ b/Documentation/x86/x86_64/5level-paging.rst
@@ -1,5 +1,11 @@
-== Overview ==
+.. SPDX-License-Identifier: GPL-2.0

+==============
+5-level paging
+==============
+
+Overview
+========
Original x86-64 was limited by 4-level paing to 256 TiB of virtual address
space and 64 TiB of physical address space. We are already bumping into
this limit: some vendors offers servers with 64 TiB of memory today.
@@ -16,16 +22,17 @@ QEMU 2.9 and later support 5-level paging.
Virtual memory layout for 5-level paging is described in
Documentation/x86/x86_64/mm.txt

-== Enabling 5-level paging ==

+Enabling 5-level paging
+=======================
CONFIG_X86_5LEVEL=y enables the feature.

Kernel with CONFIG_X86_5LEVEL=y still able to boot on 4-level hardware.
In this case additional page table level -- p4d -- will be folded at
runtime.

-== User-space and large virtual address space ==
-
+User-space and large virtual address space
+==========================================
On x86, 5-level paging enables 56-bit userspace virtual address space.
Not all user space is ready to handle wide addresses. It's known that
at least some JIT compilers use higher bits in pointers to encode their
@@ -58,4 +65,3 @@ One important case we need to handle here is interaction with MPX.
MPX (without MAWA extension) cannot handle addresses above 47-bit, so we
need to make sure that MPX cannot be enabled we already have VMA above
the boundary and forbid creating such VMAs once MPX is enabled.
-
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index 4b65d29ef459..7b8c82151358 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -10,3 +10,4 @@ x86_64 Support
boot-options
uefi
mm
+ 5level-paging
--
2.20.1
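
The address-space limits mentioned in the 5level-paging document follow directly from the address widths. A minimal Python sketch of that arithmetic is below; the bit widths (48 and 46 for 4-level paging, 56 for the 5-level userspace range) are inferred from the sizes quoted in the text, and the helper name is hypothetical.

```python
TIB = 1 << 40
PIB = 1 << 50


def address_space_bytes(bits):
    """Return the number of bytes addressable with the given address width."""
    return 1 << bits


if __name__ == "__main__":
    # 4-level paging limits quoted in the document
    print(address_space_bytes(48) // TIB, "TiB virtual")
    print(address_space_bytes(46) // TIB, "TiB physical")
    # 56-bit userspace virtual address space with 5-level paging
    print(address_space_bytes(56) // PIB, "PiB userspace virtual")
```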

2019-04-23 17:38:19

by Bjorn Helgaas

Subject: Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

On Tue, Apr 23, 2019 at 06:39:47PM +0200, Rafael J. Wysocki wrote:
> On Tue, Apr 23, 2019 at 6:30 PM Changbin Du <[email protected]> wrote:
> > Hi Corbet and All,
> > The kernel now uses Sphinx to generate intelligent and beautiful
> > documentation from reStructuredText files. I converted all of the Linux
> > ACPI/PCI/X86 docs to reST format in this series.
> >
> > In this version I combined ACPI and PCI docs, and added new x86 docs
> > conversion.
>
> I'm not sure if combining all three into one big patch series has been
> a good idea, honestly.

Yeah, if you post this again, I would find it easier to deal with if
linux-pci only got the PCI-related things. 63 patches is a little too
much for one series.

Bjorn

2019-04-23 20:40:23

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 02/63] Documentation: ACPI: move namespace.txt to firmware-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:31 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> adds it to the Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/firmware-guide/acpi/index.rst | 1 +
> .../acpi/namespace.rst} | 310 +++++++++---------
> 2 files changed, 161 insertions(+), 150 deletions(-)
> rename Documentation/{acpi/namespace.txt => firmware-guide/acpi/namespace.rst} (54%)
>
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 0ec7d072ba22..210ad8acd6df 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -7,3 +7,4 @@ ACPI Support
> .. toctree::
> :maxdepth: 1
>
> + namespace
> diff --git a/Documentation/acpi/namespace.txt b/Documentation/firmware-guide/acpi/namespace.rst
> similarity index 54%
> rename from Documentation/acpi/namespace.txt
> rename to Documentation/firmware-guide/acpi/namespace.rst
> index 1860cb3865c6..443f0e5d0617 100644
> --- a/Documentation/acpi/namespace.txt
> +++ b/Documentation/firmware-guide/acpi/namespace.rst
> @@ -1,85 +1,88 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
> +
> +===================================================
> ACPI Device Tree - Representation of ACPI Namespace
> +===================================================
> +
> +:Copyright: |copy| 2013, Intel Corporation
> +
> +:Author: Lv Zheng <[email protected]>
> +
> +:Abstract: The Linux ACPI subsystem converts ACPI namespace objects into a Linux
> + device tree under the /sys/devices/LNXSYSTEM:00 and updates it upon
> + receiving ACPI hotplug notification events. For each device object
> + in this hierarchy there is a corresponding symbolic link in the
> + /sys/bus/acpi/devices.
> + This document illustrates the structure of the ACPI device tree.

Well, this is a matter of preference. I would add Abstract as a chapter,
as this would make it part of the top index, which can be useful.

In any case:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> +
> +:Credit: Thanks for the help from Zhang Rui <[email protected]> and
> + Rafael J.Wysocki <[email protected]>.
> +
> +
> +ACPI Definition Blocks
> +======================
> +
> +The ACPI firmware sets up RSDP (Root System Description Pointer) in the
> +system memory address space pointing to the XSDT (Extended System
> +Description Table). The XSDT always points to the FADT (Fixed ACPI
> +Description Table) using its first entry, the data within the FADT
> +includes various fixed-length entries that describe fixed ACPI features
> +of the hardware. The FADT contains a pointer to the DSDT
> +(Differentiated System Descripition Table). The XSDT also contains
> +entries pointing to possibly multiple SSDTs (Secondary System
> +Description Table).
> +
> +The DSDT and SSDT data is organized in data structures called definition
> +blocks that contain definitions of various objects, including ACPI
> +control methods, encoded in AML (ACPI Machine Language). The data block
> +of the DSDT along with the contents of SSDTs represents a hierarchical
> +data structure called the ACPI namespace whose topology reflects the
> +structure of the underlying hardware platform.
> +
> +The relationships between ACPI System Definition Tables described above
> +are illustrated in the following diagram::
> +
> + +---------+ +-------+ +--------+ +------------------------+
> + | RSDP | +->| XSDT | +->| FADT | | +-------------------+ |
> + +---------+ | +-------+ | +--------+ +-|->| DSDT | |
> + | Pointer | | | Entry |-+ | ...... | | | +-------------------+ |
> + +---------+ | +-------+ | X_DSDT |--+ | | Definition Blocks | |
> + | Pointer |-+ | ..... | | ...... | | +-------------------+ |
> + +---------+ +-------+ +--------+ | +-------------------+ |
> + | Entry |------------------|->| SSDT | |
> + +- - - -+ | +-------------------| |
> + | Entry | - - - - - - - -+ | | Definition Blocks | |
> + +- - - -+ | | +-------------------+ |
> + | | +- - - - - - - - - -+ |
> + +-|->| SSDT | |
> + | +-------------------+ |
> + | | Definition Blocks | |
> + | +- - - - - - - - - -+ |
> + +------------------------+
> + |
> + OSPM Loading |
> + \|/
> + +----------------+
> + | ACPI Namespace |
> + +----------------+
> +
> + Figure 1. ACPI Definition Blocks
> +
> +.. note:: RSDP can also contain a pointer to the RSDT (Root System
> + Description Table). Platforms provide RSDT to enable
> + compatibility with ACPI 1.0 operating systems. The OS is expected
> + to use XSDT, if present.
> +
> +
> +Example ACPI Namespace
> +======================
> +
> +All definition blocks are loaded into a single namespace. The namespace
> +is a hierarchy of objects identified by names and paths.
> +The following naming conventions apply to object names in the ACPI
> +namespace:
>
> -Copyright (C) 2013, Intel Corporation
> -Author: Lv Zheng <[email protected]>
> -
> -
> -Abstract:
> -
> -The Linux ACPI subsystem converts ACPI namespace objects into a Linux
> -device tree under the /sys/devices/LNXSYSTEM:00 and updates it upon
> -receiving ACPI hotplug notification events. For each device object in this
> -hierarchy there is a corresponding symbolic link in the
> -/sys/bus/acpi/devices.
> -This document illustrates the structure of the ACPI device tree.
> -
> -
> -Credit:
> -
> -Thanks for the help from Zhang Rui <[email protected]> and Rafael J.
> -Wysocki <[email protected]>.
> -
> -
> -1. ACPI Definition Blocks
> -
> - The ACPI firmware sets up RSDP (Root System Description Pointer) in the
> - system memory address space pointing to the XSDT (Extended System
> - Description Table). The XSDT always points to the FADT (Fixed ACPI
> - Description Table) using its first entry, the data within the FADT
> - includes various fixed-length entries that describe fixed ACPI features
> - of the hardware. The FADT contains a pointer to the DSDT
> - (Differentiated System Descripition Table). The XSDT also contains
> - entries pointing to possibly multiple SSDTs (Secondary System
> - Description Table).
> -
> - The DSDT and SSDT data is organized in data structures called definition
> - blocks that contain definitions of various objects, including ACPI
> - control methods, encoded in AML (ACPI Machine Language). The data block
> - of the DSDT along with the contents of SSDTs represents a hierarchical
> - data structure called the ACPI namespace whose topology reflects the
> - structure of the underlying hardware platform.
> -
> - The relationships between ACPI System Definition Tables described above
> - are illustrated in the following diagram.
> -
> - +---------+ +-------+ +--------+ +------------------------+
> - | RSDP | +->| XSDT | +->| FADT | | +-------------------+ |
> - +---------+ | +-------+ | +--------+ +-|->| DSDT | |
> - | Pointer | | | Entry |-+ | ...... | | | +-------------------+ |
> - +---------+ | +-------+ | X_DSDT |--+ | | Definition Blocks | |
> - | Pointer |-+ | ..... | | ...... | | +-------------------+ |
> - +---------+ +-------+ +--------+ | +-------------------+ |
> - | Entry |------------------|->| SSDT | |
> - +- - - -+ | +-------------------| |
> - | Entry | - - - - - - - -+ | | Definition Blocks | |
> - +- - - -+ | | +-------------------+ |
> - | | +- - - - - - - - - -+ |
> - +-|->| SSDT | |
> - | +-------------------+ |
> - | | Definition Blocks | |
> - | +- - - - - - - - - -+ |
> - +------------------------+
> - |
> - OSPM Loading |
> - \|/
> - +----------------+
> - | ACPI Namespace |
> - +----------------+
> -
> - Figure 1. ACPI Definition Blocks
> -
> - NOTE: RSDP can also contain a pointer to the RSDT (Root System
> - Description Table). Platforms provide RSDT to enable
> - compatibility with ACPI 1.0 operating systems. The OS is expected
> - to use XSDT, if present.
> -
> -
> -2. Example ACPI Namespace
> -
> - All definition blocks are loaded into a single namespace. The namespace
> - is a hierarchy of objects identified by names and paths.
> - The following naming conventions apply to object names in the ACPI
> - namespace:
> 1. All names are 32 bits long.
> 2. The first byte of a name must be one of 'A' - 'Z', '_'.
> 3. Each of the remaining bytes of a name must be one of 'A' - 'Z', '0'
> @@ -91,7 +94,7 @@ Wysocki <[email protected]>.
> (i.e. names prepended with '^' are relative to the parent of the
> current namespace node).
>
> - The figure below shows an example ACPI namespace.
> +The figure below shows an example ACPI namespace::
>
> +------+
> | \ | Root
> @@ -184,19 +187,20 @@ Wysocki <[email protected]>.
> Figure 2. Example ACPI Namespace
>
>
> -3. Linux ACPI Device Objects
> +Linux ACPI Device Objects
> +=========================
>
> - The Linux kernel's core ACPI subsystem creates struct acpi_device
> - objects for ACPI namespace objects representing devices, power resources
> - processors, thermal zones. Those objects are exported to user space via
> - sysfs as directories in the subtree under /sys/devices/LNXSYSTM:00. The
> - format of their names is <bus_id:instance>, where 'bus_id' refers to the
> - ACPI namespace representation of the given object and 'instance' is used
> - for distinguishing different object of the same 'bus_id' (it is
> - two-digit decimal representation of an unsigned integer).
> +The Linux kernel's core ACPI subsystem creates struct acpi_device
> +objects for ACPI namespace objects representing devices, power resources
> +processors, thermal zones. Those objects are exported to user space via
> +sysfs as directories in the subtree under /sys/devices/LNXSYSTM:00. The
> +format of their names is <bus_id:instance>, where 'bus_id' refers to the
> +ACPI namespace representation of the given object and 'instance' is used
> +for distinguishing different object of the same 'bus_id' (it is
> +two-digit decimal representation of an unsigned integer).
>
> - The value of 'bus_id' depends on the type of the object whose name it is
> - part of as listed in the table below.
> +The value of 'bus_id' depends on the type of the object whose name it is
> +part of as listed in the table below::
>
> +---+-----------------+-------+----------+
> | | Object/Feature | Table | bus_id |
> @@ -226,10 +230,11 @@ Wysocki <[email protected]>.
>
> Table 1. ACPI Namespace Objects Mapping
>
> - The following rules apply when creating struct acpi_device objects on
> - the basis of the contents of ACPI System Description Tables (as
> - indicated by the letter in the first column and the notation in the
> - second column of the table above):
> +The following rules apply when creating struct acpi_device objects on
> +the basis of the contents of ACPI System Description Tables (as
> +indicated by the letter in the first column and the notation in the
> +second column of the table above):
> +
> N:
> The object's source is an ACPI namespace node (as indicated by the
> named object's type in the second column). In that case the object's
> @@ -249,13 +254,14 @@ Wysocki <[email protected]>.
> struct acpi_device object with LNXVIDEO 'bus_id' will be created for
> it.
>
> - The third column of the above table indicates which ACPI System
> - Description Tables contain information used for the creation of the
> - struct acpi_device objects represented by the given row (xSDT means DSDT
> - or SSDT).
> +The third column of the above table indicates which ACPI System
> +Description Tables contain information used for the creation of the
> +struct acpi_device objects represented by the given row (xSDT means DSDT
> +or SSDT).
> +
> +The forth column of the above table indicates the 'bus_id' generation
> +rule of the struct acpi_device object:
>
> - The forth column of the above table indicates the 'bus_id' generation
> - rule of the struct acpi_device object:
> _HID:
> _HID in the last column of the table means that the object's bus_id
> is derived from the _HID/_CID identification objects present under
> @@ -275,45 +281,47 @@ Wysocki <[email protected]>.
> object's bus_id.
>
>
> -4. Linux ACPI Physical Device Glue
> -
> - ACPI device (i.e. struct acpi_device) objects may be linked to other
> - objects in the Linux' device hierarchy that represent "physical" devices
> - (for example, devices on the PCI bus). If that happens, it means that
> - the ACPI device object is a "companion" of a device otherwise
> - represented in a different way and is used (1) to provide configuration
> - information on that device which cannot be obtained by other means and
> - (2) to do specific things to the device with the help of its ACPI
> - control methods. One ACPI device object may be linked this way to
> - multiple "physical" devices.
> -
> - If an ACPI device object is linked to a "physical" device, its sysfs
> - directory contains the "physical_node" symbolic link to the sysfs
> - directory of the target device object. In turn, the target device's
> - sysfs directory will then contain the "firmware_node" symbolic link to
> - the sysfs directory of the companion ACPI device object.
> - The linking mechanism relies on device identification provided by the
> - ACPI namespace. For example, if there's an ACPI namespace object
> - representing a PCI device (i.e. a device object under an ACPI namespace
> - object representing a PCI bridge) whose _ADR returns 0x00020000 and the
> - bus number of the parent PCI bridge is 0, the sysfs directory
> - representing the struct acpi_device object created for that ACPI
> - namespace object will contain the 'physical_node' symbolic link to the
> - /sys/devices/pci0000:00/0000:00:02:0/ sysfs directory of the
> - corresponding PCI device.
> -
> - The linking mechanism is generally bus-specific. The core of its
> - implementation is located in the drivers/acpi/glue.c file, but there are
> - complementary parts depending on the bus types in question located
> - elsewhere. For example, the PCI-specific part of it is located in
> - drivers/pci/pci-acpi.c.
> -
> -
> -5. Example Linux ACPI Device Tree
> -
> - The sysfs hierarchy of struct acpi_device objects corresponding to the
> - example ACPI namespace illustrated in Figure 2 with the addition of
> - fixed PWR_BUTTON/SLP_BUTTON devices is shown below.
> +Linux ACPI Physical Device Glue
> +===============================
> +
> +ACPI device (i.e. struct acpi_device) objects may be linked to other
> +objects in the Linux' device hierarchy that represent "physical" devices
> +(for example, devices on the PCI bus). If that happens, it means that
> +the ACPI device object is a "companion" of a device otherwise
> +represented in a different way and is used (1) to provide configuration
> +information on that device which cannot be obtained by other means and
> +(2) to do specific things to the device with the help of its ACPI
> +control methods. One ACPI device object may be linked this way to
> +multiple "physical" devices.
> +
> +If an ACPI device object is linked to a "physical" device, its sysfs
> +directory contains the "physical_node" symbolic link to the sysfs
> +directory of the target device object. In turn, the target device's
> +sysfs directory will then contain the "firmware_node" symbolic link to
> +the sysfs directory of the companion ACPI device object.
> +The linking mechanism relies on device identification provided by the
> +ACPI namespace. For example, if there's an ACPI namespace object
> +representing a PCI device (i.e. a device object under an ACPI namespace
> +object representing a PCI bridge) whose _ADR returns 0x00020000 and the
> +bus number of the parent PCI bridge is 0, the sysfs directory
> +representing the struct acpi_device object created for that ACPI
> +namespace object will contain the 'physical_node' symbolic link to the
> +/sys/devices/pci0000:00/0000:00:02:0/ sysfs directory of the
> +corresponding PCI device.
> +
> +The linking mechanism is generally bus-specific. The core of its
> +implementation is located in the drivers/acpi/glue.c file, but there are
> +complementary parts depending on the bus types in question located
> +elsewhere. For example, the PCI-specific part of it is located in
> +drivers/pci/pci-acpi.c.
> +
> +
> +Example Linux ACPI Device Tree
> +=================================
> +
> +The sysfs hierarchy of struct acpi_device objects corresponding to the
> +example ACPI namespace illustrated in Figure 2 with the addition of
> +fixed PWR_BUTTON/SLP_BUTTON devices is shown below::
>
> +--------------+---+-----------------+
> | LNXSYSTEM:00 | \ | acpi:LNXSYSTEM: |
> @@ -377,12 +385,14 @@ Wysocki <[email protected]>.
>
> Figure 3. Example Linux ACPI Device Tree
>
> - NOTE: Each node is represented as "object/path/modalias", where:
> - 1. 'object' is the name of the object's directory in sysfs.
> - 2. 'path' is the ACPI namespace path of the corresponding
> - ACPI namespace object, as returned by the object's 'path'
> - sysfs attribute.
> - 3. 'modalias' is the value of the object's 'modalias' sysfs
> - attribute (as described earlier in this document).
> - NOTE: N/A indicates the device object does not have the 'path' or the
> - 'modalias' attribute.
> +.. note:: Each node is represented as "object/path/modalias", where:
> +
> + 1. 'object' is the name of the object's directory in sysfs.
> + 2. 'path' is the ACPI namespace path of the corresponding
> + ACPI namespace object, as returned by the object's 'path'
> + sysfs attribute.
> + 3. 'modalias' is the value of the object's 'modalias' sysfs
> + attribute (as described earlier in this document).
> +
> +.. note:: N/A indicates the device object does not have the 'path' or the
> + 'modalias' attribute.
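
A side note on the "physical_node" example quoted above: for PCI devices
the _ADR value packs the device number in the high word and the function
number in the low word, and sysfs separates the function with a dot
(0000:00:02.0). The sketch below is only an illustration of that mapping,
not kernel code; the helper name pci_sysfs_name and its domain/bus
defaults are made up for this example.

```python
def pci_sysfs_name(adr, domain=0, bus=0):
    """Format the sysfs directory name for a PCI device from its ACPI _ADR.

    Hypothetical helper for illustration: the high 16 bits of _ADR hold
    the PCI device number, the low 16 bits hold the function number.
    """
    device = (adr >> 16) & 0xFFFF
    function = adr & 0xFFFF
    # sysfs uses <domain>:<bus>:<device>.<function>, all in hexadecimal
    return f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    # The value from the quoted text: device 2, function 0 on bus 0
    print(pci_sysfs_name(0x00020000))
```

With the _ADR from the quoted paragraph and a parent bridge on bus 0, the
name produced here is the last path component under /sys/devices/pci0000:00/.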



Thanks,
Mauro

2019-04-23 20:41:17

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 01/63] Documentation: add Linux ACPI to Sphinx TOC tree

On Wed, 24 Apr 2019 00:28:30 +0800
Changbin Du <[email protected]> wrote:

> Add below index.rst files for ACPI subsystem. More docs will be added later.
> o admin-guide/acpi/index.rst
> o driver-api/acpi/index.rst
> o firmware-guide/index.rst

Nice! You split it by usage.

Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/admin-guide/acpi/index.rst | 10 ++++++++++
> Documentation/admin-guide/index.rst | 1 +
> Documentation/driver-api/acpi/index.rst | 7 +++++++
> Documentation/driver-api/index.rst | 1 +
> Documentation/firmware-guide/acpi/index.rst | 9 +++++++++
> Documentation/firmware-guide/index.rst | 13 +++++++++++++
> Documentation/index.rst | 10 ++++++++++
> 7 files changed, 51 insertions(+)
> create mode 100644 Documentation/admin-guide/acpi/index.rst
> create mode 100644 Documentation/driver-api/acpi/index.rst
> create mode 100644 Documentation/firmware-guide/acpi/index.rst
> create mode 100644 Documentation/firmware-guide/index.rst
>
> diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> new file mode 100644
> index 000000000000..3e041206089d
> --- /dev/null
> +++ b/Documentation/admin-guide/acpi/index.rst
> @@ -0,0 +1,10 @@
> +============
> +ACPI Support
> +============
> +
> +Here we document in detail how to interact with various mechanisms in
> +the Linux ACPI support.
> +
> +.. toctree::
> + :maxdepth: 1
> +
> diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> index 0a491676685e..5b8286fdd91b 100644
> --- a/Documentation/admin-guide/index.rst
> +++ b/Documentation/admin-guide/index.rst
> @@ -77,6 +77,7 @@ configure specific aspects of kernel behavior to your liking.
> LSM/index
> mm/index
> perf-security
> + acpi/index
>
> .. only:: subproject and html
>
> diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
> new file mode 100644
> index 000000000000..898b0c60671a
> --- /dev/null
> +++ b/Documentation/driver-api/acpi/index.rst
> @@ -0,0 +1,7 @@
> +============
> +ACPI Support
> +============
> +
> +.. toctree::
> + :maxdepth: 2
> +
> diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
> index c0b600ed9961..aa87075c7846 100644
> --- a/Documentation/driver-api/index.rst
> +++ b/Documentation/driver-api/index.rst
> @@ -56,6 +56,7 @@ available subsections can be seen below.
> slimbus
> soundwire/index
> fpga/index
> + acpi/index
>
> .. only:: subproject and html
>
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> new file mode 100644
> index 000000000000..0ec7d072ba22
> --- /dev/null
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -0,0 +1,9 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +============
> +ACPI Support
> +============
> +
> +.. toctree::
> + :maxdepth: 1
> +
> diff --git a/Documentation/firmware-guide/index.rst b/Documentation/firmware-guide/index.rst
> new file mode 100644
> index 000000000000..5355784ca0a2
> --- /dev/null
> +++ b/Documentation/firmware-guide/index.rst
> @@ -0,0 +1,13 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===============================
> +The Linux kernel firmware guide
> +===============================
> +
> +This section describes the ACPI subsystem in Linux from firmware perspective.
> +
> +.. toctree::
> + :maxdepth: 1
> +
> + acpi/index
> +
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index 80a421cb935e..fdfa85c56a50 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -35,6 +35,16 @@ trying to get it to work optimally on a given system.
>
> admin-guide/index
>
> +Firmware-related documentation
> +------------------------------
> +The following holds information on the kernel's expectations regarding the
> +platform firmwares.
> +
> +.. toctree::
> + :maxdepth: 2
> +
> + firmware-guide/index
> +
> Application-developer documentation
> -----------------------------------
>



Thanks,
Mauro

2019-04-23 20:43:41

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 03/63] Documentation: ACPI: move enumeration.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:32 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.

Just looking at the conversion itself, it looks good to me.

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/enumeration.rst} | 135 ++++++++++--------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 2 files changed, 74 insertions(+), 62 deletions(-)
> rename Documentation/{acpi/enumeration.txt => firmware-guide/acpi/enumeration.rst} (87%)
>
> diff --git a/Documentation/acpi/enumeration.txt b/Documentation/firmware-guide/acpi/enumeration.rst
> similarity index 87%
> rename from Documentation/acpi/enumeration.txt
> rename to Documentation/firmware-guide/acpi/enumeration.rst
> index 7bcf9c3d9fbe..ce755e963714 100644
> --- a/Documentation/acpi/enumeration.txt
> +++ b/Documentation/firmware-guide/acpi/enumeration.rst
> @@ -1,5 +1,9 @@
> -ACPI based device enumeration
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=============================
> +ACPI Based Device Enumeration
> +=============================
> +
> ACPI 5 introduced a set of new resources (UartTSerialBus, I2cSerialBus,
> SpiSerialBus, GpioIo and GpioInt) which can be used in enumerating slave
> devices behind serial bus controllers.
> @@ -11,12 +15,12 @@ that are accessed through memory-mapped registers.
> In order to support this and re-use the existing drivers as much as
> possible we decided to do following:
>
> - o Devices that have no bus connector resource are represented as
> - platform devices.
> + - Devices that have no bus connector resource are represented as
> + platform devices.
>
> - o Devices behind real busses where there is a connector resource
> - are represented as struct spi_device or struct i2c_device
> - (standard UARTs are not busses so there is no struct uart_device).
> + - Devices behind real busses where there is a connector resource
> + are represented as struct spi_device or struct i2c_device
> + (standard UARTs are not busses so there is no struct uart_device).
>
> As both ACPI and Device Tree represent a tree of devices (and their
> resources) this implementation follows the Device Tree way as much as
> @@ -31,7 +35,8 @@ enumerated from ACPI namespace. This handle can be used to extract other
> device-specific configuration. There is an example of this below.
>
> Platform bus support
> -~~~~~~~~~~~~~~~~~~~~
> +====================
> +
> Since we are using platform devices to represent devices that are not
> connected to any physical bus we only need to implement a platform driver
> for the device and add supported ACPI IDs. If this same IP-block is used on
> @@ -39,7 +44,7 @@ some other non-ACPI platform, the driver might work out of the box or needs
> some minor changes.
>
> Adding ACPI support for an existing driver should be pretty
> -straightforward. Here is the simplest example:
> +straightforward. Here is the simplest example::
>
> #ifdef CONFIG_ACPI
> static const struct acpi_device_id mydrv_acpi_match[] = {
> @@ -61,12 +66,13 @@ configuring GPIOs it can get its ACPI handle and extract this information
> from ACPI tables.
>
> DMA support
> -~~~~~~~~~~~
> +===========
> +
> DMA controllers enumerated via ACPI should be registered in the system to
> provide generic access to their resources. For example, a driver that would
> like to be accessible to slave devices via generic API call
> dma_request_slave_channel() must register itself at the end of the probe
> -function like this:
> +function like this::
>
> err = devm_acpi_dma_controller_register(dev, xlate_func, dw);
> /* Handle the error if it's not a case of !CONFIG_ACPI */
> @@ -74,7 +80,7 @@ function like this:
> and implement custom xlate function if needed (usually acpi_dma_simple_xlate()
> is enough) which converts the FixedDMA resource provided by struct
> acpi_dma_spec into the corresponding DMA channel. A piece of code for that case
> -could look like:
> +could look like::
>
> #ifdef CONFIG_ACPI
> struct filter_args {
> @@ -114,7 +120,7 @@ provided by struct acpi_dma.
> Clients must call dma_request_slave_channel() with the string parameter that
> corresponds to a specific FixedDMA resource. By default "tx" means the first
> entry of the FixedDMA resource array, "rx" means the second entry. The table
> -below shows a layout:
> +below shows a layout::
>
> Device (I2C0)
> {
> @@ -138,12 +144,13 @@ acpi_dma_request_slave_chan_by_index() directly and therefore choose the
> specific FixedDMA resource by its index.
>
> SPI serial bus support
> -~~~~~~~~~~~~~~~~~~~~~~
> +======================
> +
> Slave devices behind SPI bus have SpiSerialBus resource attached to them.
> This is extracted automatically by the SPI core and the slave devices are
> enumerated once spi_register_master() is called by the bus driver.
>
> -Here is what the ACPI namespace for a SPI slave might look like:
> +Here is what the ACPI namespace for a SPI slave might look like::
>
> Device (EEP0)
> {
> @@ -163,7 +170,7 @@ Here is what the ACPI namespace for a SPI slave might look like:
>
> The SPI device drivers only need to add ACPI IDs in a similar way than with
> the platform device drivers. Below is an example where we add ACPI support
> -to at25 SPI eeprom driver (this is meant for the above ACPI snippet):
> +to at25 SPI eeprom driver (this is meant for the above ACPI snippet)::
>
> #ifdef CONFIG_ACPI
> static const struct acpi_device_id at25_acpi_match[] = {
> @@ -182,7 +189,7 @@ to at25 SPI eeprom driver (this is meant for the above ACPI snippet):
>
> Note that this driver actually needs more information like page size of the
> eeprom etc. but at the time writing this there is no standard way of
> -passing those. One idea is to return this in _DSM method like:
> +passing those. One idea is to return this in _DSM method like::
>
> Device (EEP0)
> {
> @@ -202,7 +209,7 @@ passing those. One idea is to return this in _DSM method like:
> }
>
> Then the at25 SPI driver can get this configuration by calling _DSM on its
> -ACPI handle like:
> +ACPI handle like::
>
> struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
> struct acpi_object_list input;
> @@ -220,14 +227,15 @@ ACPI handle like:
> kfree(output.pointer);
>
> I2C serial bus support
> -~~~~~~~~~~~~~~~~~~~~~~
> +======================
> +
> The slaves behind I2C bus controller only need to add the ACPI IDs like
> with the platform and SPI drivers. The I2C core automatically enumerates
> any slave devices behind the controller device once the adapter is
> registered.
>
> Below is an example of how to add ACPI support to the existing mpu3050
> -input driver:
> +input driver::
>
> #ifdef CONFIG_ACPI
> static const struct acpi_device_id mpu3050_acpi_match[] = {
> @@ -251,56 +259,57 @@ input driver:
> };
>
> GPIO support
> -~~~~~~~~~~~~
> +============
> +
> ACPI 5 introduced two new resources to describe GPIO connections: GpioIo
> and GpioInt. These resources can be used to pass GPIO numbers used by
> the device to the driver. ACPI 5.1 extended this with _DSD (Device
> Specific Data) which made it possible to name the GPIOs among other things.
>
> -For example:
> +For example::
>
> -Device (DEV)
> -{
> - Method (_CRS, 0, NotSerialized)
> + Device (DEV)
> {
> - Name (SBUF, ResourceTemplate()
> + Method (_CRS, 0, NotSerialized)
> {
> - ...
> - // Used to power on/off the device
> - GpioIo (Exclusive, PullDefault, 0x0000, 0x0000,
> - IoRestrictionOutputOnly, "\\_SB.PCI0.GPI0",
> - 0x00, ResourceConsumer,,)
> + Name (SBUF, ResourceTemplate()
> {
> - // Pin List
> - 0x0055
> - }
> + ...
> + // Used to power on/off the device
> + GpioIo (Exclusive, PullDefault, 0x0000, 0x0000,
> + IoRestrictionOutputOnly, "\\_SB.PCI0.GPI0",
> + 0x00, ResourceConsumer,,)
> + {
> + // Pin List
> + 0x0055
> + }
> +
> + // Interrupt for the device
> + GpioInt (Edge, ActiveHigh, ExclusiveAndWake, PullNone,
> + 0x0000, "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer,,)
> + {
> + // Pin list
> + 0x0058
> + }
> +
> + ...
>
> - // Interrupt for the device
> - GpioInt (Edge, ActiveHigh, ExclusiveAndWake, PullNone,
> - 0x0000, "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer,,)
> - {
> - // Pin list
> - 0x0058
> }
>
> - ...
> -
> + Return (SBUF)
> }
>
> - Return (SBUF)
> - }
> -
> - // ACPI 5.1 _DSD used for naming the GPIOs
> - Name (_DSD, Package ()
> - {
> - ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> - Package ()
> + // ACPI 5.1 _DSD used for naming the GPIOs
> + Name (_DSD, Package ()
> {
> - Package () {"power-gpios", Package() {^DEV, 0, 0, 0 }},
> - Package () {"irq-gpios", Package() {^DEV, 1, 0, 0 }},
> - }
> - })
> - ...
> + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> + Package ()
> + {
> + Package () {"power-gpios", Package() {^DEV, 0, 0, 0 }},
> + Package () {"irq-gpios", Package() {^DEV, 1, 0, 0 }},
> + }
> + })
> + ...
>
> These GPIO numbers are controller relative and path "\\_SB.PCI0.GPI0"
> specifies the path to the controller. In order to use these GPIOs in Linux
> @@ -310,7 +319,7 @@ There is a standard GPIO API for that and is documented in
> Documentation/gpio/.
>
> In the above example we can get the corresponding two GPIO descriptors with
> -a code like this:
> +a code like this::
>
> #include <linux/gpio/consumer.h>
> ...
> @@ -334,21 +343,22 @@ See Documentation/acpi/gpio-properties.txt for more information about the
> _DSD binding related to GPIOs.
>
> MFD devices
> -~~~~~~~~~~~
> +===========
> +
> The MFD devices register their children as platform devices. For the child
> devices there needs to be an ACPI handle that they can use to reference
> parts of the ACPI namespace that relate to them. In the Linux MFD subsystem
> we provide two ways:
>
> - o The children share the parent ACPI handle.
> - o The MFD cell can specify the ACPI id of the device.
> + - The children share the parent ACPI handle.
> + - The MFD cell can specify the ACPI id of the device.
>
> For the first case, the MFD drivers do not need to do anything. The
> resulting child platform device will have its ACPI_COMPANION() set to point
> to the parent device.
>
> If the ACPI namespace has a device that we can match using an ACPI id or ACPI
> -adr, the cell should be set like:
> +adr, the cell should be set like::
>
> static struct mfd_cell_acpi_match my_subdevice_cell_acpi_match = {
> .pnpid = "XYZ0001",
> @@ -366,7 +376,8 @@ the MFD device and if found, that ACPI companion device is bound to the
> resulting child platform device.
>
> Device Tree namespace link device ID
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +====================================
> +
> The Device Tree protocol uses device identification based on the "compatible"
> property whose value is a string or an array of strings recognized as device
> identifiers by drivers and the driver core. The set of all those strings may be
> @@ -423,4 +434,4 @@ the _DSD of the device object itself or the _DSD of its ancestor in the
> Otherwise, the _DSD itself is regarded as invalid and therefore the "compatible"
> property returned by it is meaningless.
>
> -Refer to DSD-properties-rules.txt for more information.
> +Refer to :doc:`DSD-properties-rules` for more information.
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 210ad8acd6df..99677c73f1fb 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -8,3 +8,4 @@ ACPI Support
> :maxdepth: 1
>
> namespace
> + enumeration



Thanks,
Mauro

2019-04-23 20:47:26

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 04/63] Documentation: ACPI: move osi.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:33 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/firmware-guide/acpi/index.rst | 1 +
> .../{acpi/osi.txt => firmware-guide/acpi/osi.rst} | 15 +++++++++------
> 2 files changed, 10 insertions(+), 6 deletions(-)
> rename Documentation/{acpi/osi.txt => firmware-guide/acpi/osi.rst} (97%)
>
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 99677c73f1fb..868bd25a3398 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -9,3 +9,4 @@ ACPI Support
>
> namespace
> enumeration
> + osi
> diff --git a/Documentation/acpi/osi.txt b/Documentation/firmware-guide/acpi/osi.rst
> similarity index 97%
> rename from Documentation/acpi/osi.txt
> rename to Documentation/firmware-guide/acpi/osi.rst
> index 50cde0ceb9b0..29e9ef79ebc0 100644
> --- a/Documentation/acpi/osi.txt
> +++ b/Documentation/firmware-guide/acpi/osi.rst
> @@ -1,5 +1,8 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==========================
> ACPI _OSI and _REV methods
> ---------------------------
> +==========================

You could probably do just the above, but changing the title
markup in the other files has the advantage of using the
same standard across all ACPI files.

Either way, just looking at the conversion itself:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>
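
For anyone who wants to check which title markup a set of converted files
actually uses, here is a rough sketch. It is not part of the patch or of
any kernel tooling; the function names and the set of adornment characters
are assumptions made for this example, and it only looks at simple
underlined titles.

```python
import re

# A line made of one repeated punctuation character (assumed set of
# adornment characters; reST allows more than these).
ADORNMENT = re.compile(r"^([=\-~^#*+])\1+$")


def title_adornments(rst_text):
    """Return (title, adornment_char) pairs for underlined section titles."""
    lines = rst_text.splitlines()
    found = []
    for title, underline in zip(lines, lines[1:]):
        title = title.rstrip()
        underline = underline.rstrip()
        if not title.strip() or ADORNMENT.match(title.strip()):
            continue  # blank line, or an overline rather than a title
        # reST requires the underline to be at least as long as the title
        if ADORNMENT.match(underline) and len(underline) >= len(title):
            found.append((title.strip(), underline[0]))
    return found


def adornment_styles(rst_text):
    """Return the set of adornment characters used in one document."""
    return {char for _, char in title_adornments(rst_text)}


if __name__ == "__main__":
    heading = "How to use _OSI"
    sample = heading + "\n" + "-" * len(heading) + "\n\nSome text.\n"
    print(title_adornments(sample))
```

Running adornment_styles() over each file and comparing the results gives
a quick view of whether all files follow the same convention.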


>
> An ACPI BIOS can use the "Operating System Interfaces" method (_OSI)
> to find out what the operating system supports. Eg. If BIOS
> @@ -14,7 +17,7 @@ This document explains how and why the BIOS and Linux should use these methods.
> It also explains how and why they are widely misused.
>
> How to use _OSI
> ----------------
> +===============
>
> Linux runs on two groups of machines -- those that are tested by the OEM
> to be compatible with Linux, and those that were never tested with Linux,
> @@ -62,7 +65,7 @@ the string when that support is added to the kernel.
> That was easy. Read on, to find out how to do it wrong.
>
> Before _OSI, there was _OS
> ---------------------------
> +==========================
>
> ACPI 1.0 specified "_OS" as an
> "object that evaluates to a string that identifies the operating system."
> @@ -96,7 +99,7 @@ That is the *only* viable strategy, as that is what modern Windows does,
> and so doing otherwise could steer the BIOS down an untested path.
>
> _OSI is born, and immediately misused
> ---------------------------------------
> +=====================================
>
> With _OSI, the *BIOS* provides the string describing an interface,
> and asks the OS: "YES/NO, are you compatible with this interface?"
> @@ -144,7 +147,7 @@ catastrophic failure resulting from the BIOS taking paths that
> were never validated under *any* OS.
>
> Do not use _REV
> ----------------
> +===============
>
> Since _OSI("Linux") went away, some BIOS writers used _REV
> to support Linux and Windows differences in the same BIOS.
> @@ -164,7 +167,7 @@ from mid-2015 onward. The ACPI specification will also be updated
> to reflect that _REV is deprecated, and always returns 2.
>
> Apple Mac and _OSI("Darwin")
> -----------------------------
> +============================
>
> On Apple's Mac platforms, the ACPI BIOS invokes _OSI("Darwin")
> to determine if the machine is running Apple OSX.



Thanks,
Mauro

2019-04-23 20:51:47

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 05/63] Documentation: ACPI: move linuxized-acpica.txt to driver-api/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:34 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/driver-api/acpi/index.rst | 1 +
> .../acpi/linuxized-acpica.rst} | 115 ++++++++++--------
> 2 files changed, 66 insertions(+), 50 deletions(-)
> rename Documentation/{acpi/linuxized-acpica.txt => driver-api/acpi/linuxized-acpica.rst} (78%)
>
> diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
> index 898b0c60671a..12649947b19b 100644
> --- a/Documentation/driver-api/acpi/index.rst
> +++ b/Documentation/driver-api/acpi/index.rst
> @@ -5,3 +5,4 @@ ACPI Support
> .. toctree::
> :maxdepth: 2
>
> + linuxized-acpica
> diff --git a/Documentation/acpi/linuxized-acpica.txt b/Documentation/driver-api/acpi/linuxized-acpica.rst
> similarity index 78%
> rename from Documentation/acpi/linuxized-acpica.txt
> rename to Documentation/driver-api/acpi/linuxized-acpica.rst
> index 3ad7b0dfb083..f8aaea668e41 100644
> --- a/Documentation/acpi/linuxized-acpica.txt
> +++ b/Documentation/driver-api/acpi/linuxized-acpica.rst
> @@ -1,31 +1,35 @@
> -Linuxized ACPICA - Introduction to ACPICA Release Automation
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> -Copyright (C) 2013-2016, Intel Corporation
> -Author: Lv Zheng <[email protected]>
> +============================================================
> +Linuxized ACPICA - Introduction to ACPICA Release Automation
> +============================================================
>
> +:Copyright: |copy| 2013-2016, Intel Corporation
>
> -Abstract:
> +:Author: Lv Zheng <[email protected]>
>
> -This document describes the ACPICA project and the relationship between
> -ACPICA and Linux. It also describes how ACPICA code in drivers/acpi/acpica,
> -include/acpi and tools/power/acpi is automatically updated to follow the
> -upstream.
> +:Abstract: This document describes the ACPICA project and the relationship
> + between ACPICA and Linux. It also describes how ACPICA code in
> + drivers/acpi/acpica, include/acpi and tools/power/acpi is
> + automatically updated to follow the upstream.
>

Same comment as on patch 02: I would keep the abstracts as a chapter,
so that they are visible in the index, as this may help readers
to get a quick overview of the document's contents.

I'm sure other ACPI documents also have abstracts. So, please consider
this comment also for the other docs.
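
Just to illustrate what I mean, here is a rough sketch for this file. The
"Abstract" heading name and its level are only my assumption; pick whatever
fits the rest of the series:

    ============================================================
    Linuxized ACPICA - Introduction to ACPICA Release Automation
    ============================================================

    :Copyright: |copy| 2013-2016, Intel Corporation

    :Author: Lv Zheng <[email protected]>

    Abstract
    ========

    This document describes the ACPICA project and the relationship between
    ACPICA and Linux. It also describes how ACPICA code in drivers/acpi/acpica,
    include/acpi and tools/power/acpi is automatically updated to follow the
    upstream.

    ACPICA Project
    ==============

With a real section title, the abstract should show up in the generated
table of contents instead of being rendered as one more field next to
:Copyright: and :Author:.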

Anyway, this is just a suggestion. I'm also fine with the above.
Either way, for the conversion itself:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

>
> -1. ACPICA Project
> +ACPICA Project
> +==============
>
> - The ACPI Component Architecture (ACPICA) project provides an operating
> - system (OS)-independent reference implementation of the Advanced
> - Configuration and Power Interface Specification (ACPI). It has been
> - adapted by various host OSes. By directly integrating ACPICA, Linux can
> - also benefit from the application experiences of ACPICA from other host
> - OSes.
> +The ACPI Component Architecture (ACPICA) project provides an operating
> +system (OS)-independent reference implementation of the Advanced
> +Configuration and Power Interface Specification (ACPI). It has been
> +adapted by various host OSes. By directly integrating ACPICA, Linux can
> +also benefit from the application experiences of ACPICA from other host
> +OSes.
>
> - The homepage of ACPICA project is: http://www.acpica.org, it is maintained and
> - supported by Intel Corporation.
> +The homepage of ACPICA project is: http://www.acpica.org, it is maintained and
> +supported by Intel Corporation.
>
> - The following figure depicts the Linux ACPI subsystem where the ACPICA
> - adaptation is included:
> +The following figure depicts the Linux ACPI subsystem where the ACPICA
> +adaptation is included::
>
> +---------------------------------------------------------+
> | |
> @@ -71,21 +75,27 @@ upstream.
>
> Figure 1. Linux ACPI Software Components
>
> - NOTE:
> +.. note::
> A. OS Service Layer - Provided by Linux to offer OS dependent
> implementation of the predefined ACPICA interfaces (acpi_os_*).
> + ::
> +
> include/acpi/acpiosxf.h
> drivers/acpi/osl.c
> include/acpi/platform
> include/asm/acenv.h
> B. ACPICA Functionality - Released from ACPICA code base to offer
> OS independent implementation of the ACPICA interfaces (acpi_*).
> + ::
> +
> drivers/acpi/acpica
> include/acpi/ac*.h
> tools/power/acpi
> C. Linux/ACPI Functionality - Providing Linux specific ACPI
> functionality to the other Linux kernel subsystems and user space
> programs.
> + ::
> +
> drivers/acpi
> include/linux/acpi.h
> include/linux/acpi*.h
> @@ -95,24 +105,27 @@ upstream.
> ACPI subsystem to offer architecture specific implementation of the
> ACPI interfaces. They are Linux specific components and are out of
> the scope of this document.
> + ::
> +
> include/asm/acpi.h
> include/asm/acpi*.h
> arch/*/acpi
>
> -2. ACPICA Release
> +ACPICA Release
> +==============
>
> - The ACPICA project maintains its code base at the following repository URL:
> - https://github.com/acpica/acpica.git. As a rule, a release is made every
> - month.
> +The ACPICA project maintains its code base at the following repository URL:
> +https://github.com/acpica/acpica.git. As a rule, a release is made every
> +month.
>
> - As the coding style adopted by the ACPICA project is not acceptable by
> - Linux, there is a release process to convert the ACPICA git commits into
> - Linux patches. The patches generated by this process are referred to as
> - "linuxized ACPICA patches". The release process is carried out on a local
> - copy the ACPICA git repository. Each commit in the monthly release is
> - converted into a linuxized ACPICA patch. Together, they form the monthly
> - ACPICA release patchset for the Linux ACPI community. This process is
> - illustrated in the following figure:
> +As the coding style adopted by the ACPICA project is not acceptable by
> +Linux, there is a release process to convert the ACPICA git commits into
> +Linux patches. The patches generated by this process are referred to as
> +"linuxized ACPICA patches". The release process is carried out on a local
> +copy the ACPICA git repository. Each commit in the monthly release is
> +converted into a linuxized ACPICA patch. Together, they form the monthly
> +ACPICA release patchset for the Linux ACPI community. This process is
> +illustrated in the following figure::
>
> +-----------------------------+
> | acpica / master (-) commits |
> @@ -153,7 +166,7 @@ upstream.
>
> Figure 2. ACPICA -> Linux Upstream Process
>
> - NOTE:
> +.. note::
> A. Linuxize Utilities - Provided by the ACPICA repository, including a
> utility located in source/tools/acpisrc folder and a number of
> scripts located in generate/linux folder.
> @@ -170,19 +183,20 @@ upstream.
> following kernel configuration options:
> CONFIG_ACPI/CONFIG_ACPI_DEBUG/CONFIG_ACPI_DEBUGGER
>
> -3. ACPICA Divergences
> +ACPICA Divergences
> +==================
>
> - Ideally, all of the ACPICA commits should be converted into Linux patches
> - automatically without manual modifications, the "linux / master" tree should
> - contain the ACPICA code that exactly corresponds to the ACPICA code
> - contained in "new linuxized acpica" tree and it should be possible to run
> - the release process fully automatically.
> +Ideally, all of the ACPICA commits should be converted into Linux patches
> +automatically without manual modifications, the "linux / master" tree should
> +contain the ACPICA code that exactly corresponds to the ACPICA code
> +contained in "new linuxized acpica" tree and it should be possible to run
> +the release process fully automatically.
>
> - As a matter of fact, however, there are source code differences between
> - the ACPICA code in Linux and the upstream ACPICA code, referred to as
> - "ACPICA Divergences".
> +As a matter of fact, however, there are source code differences between
> +the ACPICA code in Linux and the upstream ACPICA code, referred to as
> +"ACPICA Divergences".
>
> - The various sources of ACPICA divergences include:
> +The various sources of ACPICA divergences include:
> 1. Legacy divergences - Before the current ACPICA release process was
> established, there already had been divergences between Linux and
> ACPICA. Over the past several years those divergences have been greatly
> @@ -213,11 +227,12 @@ upstream.
> rebased on the ACPICA side in order to offer better solutions, new ACPICA
> divergences are generated.
>
> -4. ACPICA Development
> +ACPICA Development
> +==================
>
> - This paragraph guides Linux developers to use the ACPICA upstream release
> - utilities to obtain Linux patches corresponding to upstream ACPICA commits
> - before they become available from the ACPICA release process.
> +This paragraph guides Linux developers to use the ACPICA upstream release
> +utilities to obtain Linux patches corresponding to upstream ACPICA commits
> +before they become available from the ACPICA release process.
>
> 1. Cherry-pick an ACPICA commit
>
> @@ -225,7 +240,7 @@ upstream.
> you want to cherry pick must be committed into the local repository.
>
> Then the gen-patch.sh command can help to cherry-pick an ACPICA commit
> - from the ACPICA local repository:
> + from the ACPICA local repository::
>
> $ git clone https://github.com/acpica/acpica
> $ cd acpica
> @@ -240,7 +255,7 @@ upstream.
> changes that haven't been applied to Linux yet.
>
> You can generate the ACPICA release series yourself and rebase your code on
> - top of the generated ACPICA release patches:
> + top of the generated ACPICA release patches::
>
> $ git clone https://github.com/acpica/acpica
> $ cd acpica
> @@ -254,7 +269,7 @@ upstream.
> 3. Inspect the current divergences
>
> If you have local copies of both Linux and upstream ACPICA, you can generate
> - a diff file indicating the state of the current divergences:
> + a diff file indicating the state of the current divergences::
>
> # git clone https://github.com/acpica/acpica
> # git clone http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git



Thanks,
Mauro

2019-04-23 20:52:45

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 06/63] Documentation: ACPI: move scan_handlers.txt to driver-api/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:35 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

For the conversion itself:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/driver-api/acpi/index.rst | 1 +
> .../acpi/scan_handlers.rst} | 24 ++++++++++++-------
> 2 files changed, 16 insertions(+), 9 deletions(-)
> rename Documentation/{acpi/scan_handlers.txt => driver-api/acpi/scan_handlers.rst} (90%)
>
> diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
> index 12649947b19b..ace0008e54c2 100644
> --- a/Documentation/driver-api/acpi/index.rst
> +++ b/Documentation/driver-api/acpi/index.rst
> @@ -6,3 +6,4 @@ ACPI Support
> :maxdepth: 2
>
> linuxized-acpica
> + scan_handlers
> diff --git a/Documentation/acpi/scan_handlers.txt b/Documentation/driver-api/acpi/scan_handlers.rst
> similarity index 90%
> rename from Documentation/acpi/scan_handlers.txt
> rename to Documentation/driver-api/acpi/scan_handlers.rst
> index 3246ccf15992..7a197b3a33fc 100644
> --- a/Documentation/acpi/scan_handlers.txt
> +++ b/Documentation/driver-api/acpi/scan_handlers.rst
> @@ -1,7 +1,13 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
> +
> +==================
> ACPI Scan Handlers
> +==================
> +
> +:Copyright: |copy| 2012, Intel Corporation
>
> -Copyright (C) 2012, Intel Corporation
> -Author: Rafael J. Wysocki <[email protected]>
> +:Author: Rafael J. Wysocki <[email protected]>
>
> During system initialization and ACPI-based device hot-add, the ACPI namespace
> is scanned in search of device objects that generally represent various pieces
> @@ -30,14 +36,14 @@ to configure that link so that the kernel can use it.
> Those additional configuration tasks usually depend on the type of the hardware
> component represented by the given device node which can be determined on the
> basis of the device node's hardware ID (HID). They are performed by objects
> -called ACPI scan handlers represented by the following structure:
> +called ACPI scan handlers represented by the following structure::
>
> -struct acpi_scan_handler {
> - const struct acpi_device_id *ids;
> - struct list_head list_node;
> - int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
> - void (*detach)(struct acpi_device *dev);
> -};
> + struct acpi_scan_handler {
> + const struct acpi_device_id *ids;
> + struct list_head list_node;
> + int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
> + void (*detach)(struct acpi_device *dev);
> + };
>
> where ids is the list of IDs of device nodes the given handler is supposed to
> take care of, list_node is the hook to the global list of ACPI scan handlers



Thanks,
Mauro

2019-04-23 20:54:03

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 07/63] Documentation: ACPI: move DSD-properties-rules.txt to firmware-guide/acpi and covert to reST

Em Wed, 24 Apr 2019 00:28:36 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

For the conversion itself:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> .../acpi/DSD-properties-rules.rst} | 21 +++++++++++--------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 2 files changed, 13 insertions(+), 9 deletions(-)
> rename Documentation/{acpi/DSD-properties-rules.txt => firmware-guide/acpi/DSD-properties-rules.rst} (88%)
>
> diff --git a/Documentation/acpi/DSD-properties-rules.txt b/Documentation/firmware-guide/acpi/DSD-properties-rules.rst
> similarity index 88%
> rename from Documentation/acpi/DSD-properties-rules.txt
> rename to Documentation/firmware-guide/acpi/DSD-properties-rules.rst
> index 3e4862bdad98..4306f29b6103 100644
> --- a/Documentation/acpi/DSD-properties-rules.txt
> +++ b/Documentation/firmware-guide/acpi/DSD-properties-rules.rst
> @@ -1,8 +1,11 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==================================
> _DSD Device Properties Usage Rules
> -----------------------------------
> +==================================
>
> Properties, Property Sets and Property Subsets
> -----------------------------------------------
> +==============================================
>
> The _DSD (Device Specific Data) configuration object, introduced in ACPI 5.1,
> allows any type of device configuration data to be provided via the ACPI
> @@ -18,7 +21,7 @@ specific type) associated with it.
>
> In the ACPI _DSD context it is an element of the sub-package following the
> generic Device Properties UUID in the _DSD return package as specified in the
> -Device Properties UUID definition document [1].
> +Device Properties UUID definition document [1]_.
>
> It also may be regarded as the definition of a key and the associated data type
> that can be returned by _DSD in the Device Properties UUID sub-package for a
> @@ -33,14 +36,14 @@ Property subsets are nested collections of properties. Each of them is
> associated with an additional key (name) allowing the subset to be referred
> to as a whole (and to be treated as a separate entity). The canonical
> representation of property subsets is via the mechanism specified in the
> -Hierarchical Properties Extension UUID definition document [2].
> +Hierarchical Properties Extension UUID definition document [2]_.
>
> Property sets may be hierarchical. That is, a property set may contain
> multiple property subsets that each may contain property subsets of its
> own and so on.
>
> General Validity Rule for Property Sets
> ----------------------------------------
> +=======================================
>
> Valid property sets must follow the guidance given by the Device Properties UUID
> definition document [1].
> @@ -73,7 +76,7 @@ suitable for the ACPI environment and consequently they cannot belong to a valid
> property set.
>
> Property Sets and Device Tree Bindings
> ---------------------------------------
> +======================================
>
> It often is useful to make _DSD return property sets that follow Device Tree
> bindings.
> @@ -91,7 +94,7 @@ expected to automatically work in the ACPI environment regardless of their
> contents.
>
> References
> -----------
> +==========
>
> -[1] http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
> -[2] http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf
> +.. [1] http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
> +.. [2] http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 868bd25a3398..0e05b843521c 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -10,3 +10,4 @@ ACPI Support
> namespace
> enumeration
> osi
> + DSD-properties-rules



Thanks,
Mauro

2019-04-23 20:56:43

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 08/63] Documentation: ACPI: move gpio-properties.txt to firmware-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:37 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/gpio-properties.rst} | 78 +++++++++++--------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> MAINTAINERS | 2 +-
> 3 files changed, 46 insertions(+), 35 deletions(-)
> rename Documentation/{acpi/gpio-properties.txt => firmware-guide/acpi/gpio-properties.rst} (81%)
>
> diff --git a/Documentation/acpi/gpio-properties.txt b/Documentation/firmware-guide/acpi/gpio-properties.rst
> similarity index 81%
> rename from Documentation/acpi/gpio-properties.txt
> rename to Documentation/firmware-guide/acpi/gpio-properties.rst
> index 88c65cb5bf0a..89c636963544 100644
> --- a/Documentation/acpi/gpio-properties.txt
> +++ b/Documentation/firmware-guide/acpi/gpio-properties.rst
> @@ -1,5 +1,8 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================================
> _DSD Device Properties Related to GPIO
> ---------------------------------------
> +======================================
>
> With the release of ACPI 5.1, the _DSD configuration object finally
> allows names to be given to GPIOs (and other things as well) returned
> @@ -8,7 +11,7 @@ the corresponding GPIO, which is pretty error prone (it depends on
> the _CRS output ordering, for example).
>
> With _DSD we can now query GPIOs using a name instead of an integer
> -index, like the ASL example below shows:
> +index, like the ASL example below shows::
>
> // Bluetooth device with reset and shutdown GPIOs
> Device (BTH)
> @@ -34,15 +37,19 @@ index, like the ASL example below shows:
> })
> }
>
> -The format of the supported GPIO property is:
> +The format of the supported GPIO property is::
>
> Package () { "name", Package () { ref, index, pin, active_low }}
>
> - ref - The device that has _CRS containing GpioIo()/GpioInt() resources,
> - typically this is the device itself (BTH in our case).
> - index - Index of the GpioIo()/GpioInt() resource in _CRS starting from zero.
> - pin - Pin in the GpioIo()/GpioInt() resource. Typically this is zero.
> - active_low - If 1 the GPIO is marked as active_low.
> +ref
> + The device that has _CRS containing GpioIo()/GpioInt() resources,
> + typically this is the device itself (BTH in our case).
> +index
> + Index of the GpioIo()/GpioInt() resource in _CRS starting from zero.
> +pin
> + Pin in the GpioIo()/GpioInt() resource. Typically this is zero.
> +active_low
> + If 1 the GPIO is marked as active_low.
>
> Since ACPI GpioIo() resource does not have a field saying whether it is
> active low or high, the "active_low" argument can be used here. Setting
> @@ -55,7 +62,7 @@ It is possible to leave holes in the array of GPIOs. This is useful in
> cases like with SPI host controllers where some chip selects may be
> implemented as GPIOs and some as native signals. For example a SPI host
> controller can have chip selects 0 and 2 implemented as GPIOs and 1 as
> -native:
> +native::
>
> Package () {
> "cs-gpios",
> @@ -67,7 +74,7 @@ native:
> }
>
> Other supported properties
> ---------------------------
> +==========================
>
> Following Device Tree compatible device properties are also supported by
> _DSD device properties for GPIO controllers:
> @@ -78,7 +85,7 @@ _DSD device properties for GPIO controllers:
> - input
> - line-name
>
> -Example:
> +Example::
>
> Name (_DSD, Package () {
> // _DSD Hierarchical Properties Extension UUID
> @@ -100,7 +107,7 @@ Example:
>
> - gpio-line-names
>
> -Example:
> +Example::
>
> Package () {
> "gpio-line-names",
> @@ -114,7 +121,7 @@ See Documentation/devicetree/bindings/gpio/gpio.txt for more information
> about these properties.
>
> ACPI GPIO Mappings Provided by Drivers
> ---------------------------------------
> +======================================
>
> There are systems in which the ACPI tables do not contain _DSD but provide _CRS
> with GpioIo()/GpioInt() resources and device drivers still need to work with
> @@ -139,16 +146,16 @@ line in that resource starting from zero, and the active-low flag for that line,
> respectively, in analogy with the _DSD GPIO property format specified above.
>
> For the example Bluetooth device discussed previously the data structures in
> -question would look like this:
> +question would look like this::
>
> -static const struct acpi_gpio_params reset_gpio = { 1, 1, false };
> -static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false };
> + static const struct acpi_gpio_params reset_gpio = { 1, 1, false };
> + static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false };
>
> -static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = {
> - { "reset-gpios", &reset_gpio, 1 },
> - { "shutdown-gpios", &shutdown_gpio, 1 },
> - { },
> -};
> + static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = {
> + { "reset-gpios", &reset_gpio, 1 },
> + { "shutdown-gpios", &shutdown_gpio, 1 },
> + { },
> + };
>
> Next, the mapping table needs to be passed as the second argument to
> acpi_dev_add_driver_gpios() that will register it with the ACPI device object
> @@ -158,12 +165,12 @@ calling acpi_dev_remove_driver_gpios() on the ACPI device object where that
> table was previously registered.
>
> Using the _CRS fallback
> ------------------------
> +=======================
>
> If a device does not have _DSD or the driver does not create ACPI GPIO
> mapping, the Linux GPIO framework refuses to return any GPIOs. This is
> because the driver does not know what it actually gets. For example if we
> -have a device like below:
> +have a device like below::
>
> Device (BTH)
> {
> @@ -177,7 +184,7 @@ have a device like below:
> })
> }
>
> -The driver might expect to get the right GPIO when it does:
> +The driver might expect to get the right GPIO when it does::

Hmm... there is a small typo here:

": :" -> "::"

For the conversion itself, after correcting the above typo:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>



>
> desc = gpiod_get(dev, "reset", GPIOD_OUT_LOW);
>
> @@ -193,22 +200,25 @@ the ACPI GPIO mapping tables are hardly linked to ACPI ID and certain
> objects, as listed in the above chapter, of the device in question.
>
> Getting GPIO descriptor
> ------------------------
> +=======================
> +
> +There are two main approaches to get GPIO resource from ACPI::
>
> -There are two main approaches to get GPIO resource from ACPI:
> - desc = gpiod_get(dev, connection_id, flags);
> - desc = gpiod_get_index(dev, connection_id, index, flags);
> + desc = gpiod_get(dev, connection_id, flags);
> + desc = gpiod_get_index(dev, connection_id, index, flags);
>
> We may consider two different cases here, i.e. when connection ID is
> provided and otherwise.
>
> -Case 1:
> - desc = gpiod_get(dev, "non-null-connection-id", flags);
> - desc = gpiod_get_index(dev, "non-null-connection-id", index, flags);
> +Case 1::
> +
> + desc = gpiod_get(dev, "non-null-connection-id", flags);
> + desc = gpiod_get_index(dev, "non-null-connection-id", index, flags);
> +
> +Case 2::
>
> -Case 2:
> - desc = gpiod_get(dev, NULL, flags);
> - desc = gpiod_get_index(dev, NULL, index, flags);
> + desc = gpiod_get(dev, NULL, flags);
> + desc = gpiod_get_index(dev, NULL, index, flags);
>
> Case 1 assumes that corresponding ACPI device description must have
> defined device properties and will prevent to getting any GPIO resources
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 0e05b843521c..61d67763851b 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -11,3 +11,4 @@ ACPI Support
> enumeration
> osi
> DSD-properties-rules
> + gpio-properties
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 09f43f1bdd15..87f930bf32ad 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6593,7 +6593,7 @@ M: Andy Shevchenko <[email protected]>
> L: [email protected]
> L: [email protected]
> S: Maintained
> -F: Documentation/acpi/gpio-properties.txt
> +F: Documentation/firmware-guide/acpi/gpio-properties.rst
> F: drivers/gpio/gpiolib-acpi.c
>
> GPIO IR Transmitter



Thanks,
Mauro

2019-04-23 21:04:45

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 09/63] Documentation: ACPI: move method-customizing.txt to firmware-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:38 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/acpi/method-customizing.txt | 73 -----------------
> Documentation/firmware-guide/acpi/index.rst | 3 +-
> .../acpi/method-customizing.rst | 82 +++++++++++++++++++
> 3 files changed, 84 insertions(+), 74 deletions(-)
> delete mode 100644 Documentation/acpi/method-customizing.txt
> create mode 100644 Documentation/firmware-guide/acpi/method-customizing.rst
>
> diff --git a/Documentation/acpi/method-customizing.txt b/Documentation/acpi/method-customizing.txt
> deleted file mode 100644
> index 7235da975f23..000000000000
> --- a/Documentation/acpi/method-customizing.txt
> +++ /dev/null
> @@ -1,73 +0,0 @@
> -Linux ACPI Custom Control Method How To
> -=======================================
> -
> -Written by Zhang Rui <[email protected]>
> -
> -
> -Linux supports customizing ACPI control methods at runtime.
> -
> -Users can use this to
> -1. override an existing method which may not work correctly,
> - or just for debugging purposes.
> -2. insert a completely new method in order to create a missing
> - method such as _OFF, _ON, _STA, _INI, etc.
> -For these cases, it is far simpler to dynamically install a single
> -control method rather than override the entire DSDT, because kernel
> -rebuild/reboot is not needed and test result can be got in minutes.
> -
> -Note: Only ACPI METHOD can be overridden, any other object types like
> - "Device", "OperationRegion", are not recognized. Methods
> - declared inside scope operators are also not supported.
> -Note: The same ACPI control method can be overridden for many times,
> - and it's always the latest one that used by Linux/kernel.
> -Note: To get the ACPI debug object output (Store (AAAA, Debug)),
> - please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
> -
> -1. override an existing method
> - a) get the ACPI table via ACPI sysfs I/F. e.g. to get the DSDT,
> - just run "cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.dat"
> - b) disassemble the table by running "iasl -d dsdt.dat".
> - c) rewrite the ASL code of the method and save it in a new file,
> - d) package the new file (psr.asl) to an ACPI table format.
> - Here is an example of a customized \_SB._AC._PSR method,
> -
> - DefinitionBlock ("", "SSDT", 1, "", "", 0x20080715)
> - {
> - Method (\_SB_.AC._PSR, 0, NotSerialized)
> - {
> - Store ("In AC _PSR", Debug)
> - Return (ACON)
> - }
> - }
> - Note that the full pathname of the method in ACPI namespace
> - should be used.
> - e) assemble the file to generate the AML code of the method.
> - e.g. "iasl -vw 6084 psr.asl" (psr.aml is generated as a result)
> - If parameter "-vw 6084" is not supported by your iASL compiler,
> - please try a newer version.
> - f) mount debugfs by "mount -t debugfs none /sys/kernel/debug"
> - g) override the old method via the debugfs by running
> - "cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method"
> -
> -2. insert a new method
> - This is easier than overriding an existing method.
> - We just need to create the ASL code of the method we want to
> - insert and then follow the step c) ~ g) in section 1.
> -
> -3. undo your changes
> - The "undo" operation is not supported for a new inserted method
> - right now, i.e. we can not remove a method currently.
> - For an overridden method, in order to undo your changes, please
> - save a copy of the method original ASL code in step c) section 1,
> - and redo step c) ~ g) to override the method with the original one.
> -
> -
> -Note: We can use a kernel with multiple custom ACPI method running,
> - But each individual write to debugfs can implement a SINGLE
> - method override. i.e. if we want to insert/override multiple
> - ACPI methods, we need to redo step c) ~ g) for multiple times.
> -
> -Note: Be aware that root can mis-use this driver to modify arbitrary
> - memory and gain additional rights, if root's privileges got
> - restricted (for example if root is not allowed to load additional
> - modules after boot).
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 61d67763851b..d1d069b26bbc 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -10,5 +10,6 @@ ACPI Support
> namespace
> enumeration
> osi
> + method-customizing
> DSD-properties-rules
> - gpio-properties
> + gpio-properties
> \ No newline at end of file
> diff --git a/Documentation/firmware-guide/acpi/method-customizing.rst b/Documentation/firmware-guide/acpi/method-customizing.rst
> new file mode 100644
> index 000000000000..32eb1cdc1549
> --- /dev/null
> +++ b/Documentation/firmware-guide/acpi/method-customizing.rst
> @@ -0,0 +1,82 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================================
> +Linux ACPI Custom Control Method How To
> +=======================================
> +
> +:Author: Zhang Rui <[email protected]>
> +
> +
> +Linux supports customizing ACPI control methods at runtime.
> +
> +Users can use this to:
> +
> +1. override an existing method which may not work correctly,
> + or just for debugging purposes.
> +2. insert a completely new method in order to create a missing
> + method such as _OFF, _ON, _STA, _INI, etc.
> +
> +For these cases, it is far simpler to dynamically install a single
> +control method rather than override the entire DSDT, because kernel
> +rebuild/reboot is not needed and test result can be got in minutes.
> +
> +.. note:: Only ACPI METHOD can be overridden, any other object types like
> + "Device", "OperationRegion", are not recognized. Methods
> + declared inside scope operators are also not supported.
> +.. note:: The same ACPI control method can be overridden for many times,
> + and it's always the latest one that used by Linux/kernel.
> +.. note:: To get the ACPI debug object output (Store (AAAA, Debug)),
> + please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".

Hmm... this may work (not sure whether Sphinx would warn), but it
looks bad in text mode. I would instead code it with something
like:

.. note::

   - Only ACPI METHOD can be overridden, any other object types like
     "Device", "OperationRegion", are not recognized. Methods
     declared inside scope operators are also not supported.

   - The same ACPI control method can be overridden for many times,
     and it's always the latest one that used by Linux/kernel.

   - To get the ACPI debug object output (Store (AAAA, Debug)),
     please run::

       echo 1 > /sys/module/acpi/parameters/aml_debug_output

as this would look better in both text and html formats.

> +
> +1. override an existing method
> +==============================
> +a) get the ACPI table via ACPI sysfs I/F. e.g. to get the DSDT,
> + just run "cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.dat"
> +b) disassemble the table by running "iasl -d dsdt.dat".
> +c) rewrite the ASL code of the method and save it in a new file,
> +d) package the new file (psr.asl) to an ACPI table format.
> + Here is an example of a customized \_SB._AC._PSR method::
> +
> + DefinitionBlock ("", "SSDT", 1, "", "", 0x20080715)
> + {
> + Method (\_SB_.AC._PSR, 0, NotSerialized)
> + {
> + Store ("In AC _PSR", Debug)
> + Return (ACON)
> + }
> + }
> +
> + Note that the full pathname of the method in ACPI namespace
> + should be used.
> +e) assemble the file to generate the AML code of the method.
> + e.g. "iasl -vw 6084 psr.asl" (psr.aml is generated as a result)
> + If parameter "-vw 6084" is not supported by your iASL compiler,
> + please try a newer version.

I would use ``iasl -vw 6084 psr.asl`` and ``-vw 6084``.

> +f) mount debugfs by "mount -t debugfs none /sys/kernel/debug"

I would do:

f) mount debugfs by running::

     mount -t debugfs none /sys/kernel/debug

This makes for a better html document. I believe the target audience
here is sysadmins, and doing the above makes it easier for them to cut
and paste commands.

> +g) override the old method via the debugfs by running
> + "cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method"

Same applies here: I would also place the "cat" command in a literal
block.
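
A rough sketch of what I mean, reusing the command from the patch as-is
(the exact indentation is only a suggestion):

```rst
g) override the old method via the debugfs by running::

     cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method
```

This way the command ends up on a line of its own in the html output,
without the surrounding quotes.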

> +
> +2. insert a new method
> +======================
> +This is easier than overriding an existing method.
> +We just need to create the ASL code of the method we want to
> +insert and then follow the step c) ~ g) in section 1.
> +
> +3. undo your changes
> +====================
> +The "undo" operation is not supported for a new inserted method
> +right now, i.e. we can not remove a method currently.
> +For an overridden method, in order to undo your changes, please
> +save a copy of the method original ASL code in step c) section 1,
> +and redo step c) ~ g) to override the method with the original one.
> +
> +
> +.. note:: We can use a kernel with multiple custom ACPI method running,
> + But each individual write to debugfs can implement a SINGLE
> + method override. i.e. if we want to insert/override multiple
> + ACPI methods, we need to redo step c) ~ g) for multiple times.
> +
> +.. note:: Be aware that root can mis-use this driver to modify arbitrary
> + memory and gain additional rights, if root's privileges got
> + restricted (for example if root is not allowed to load additional
> + modules after boot).

Same comment as above: IMHO, having a single note block with the two
notes would be better.
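
Something along these lines, as a sketch only (the text is taken from the
two notes in the patch; the layout is just my suggestion):

```rst
.. note::

   - We can use a kernel with multiple custom ACPI method running,
     But each individual write to debugfs can implement a SINGLE
     method override. i.e. if we want to insert/override multiple
     ACPI methods, we need to redo step c) ~ g) for multiple times.

   - Be aware that root can mis-use this driver to modify arbitrary
     memory and gain additional rights, if root's privileges got
     restricted (for example if root is not allowed to load additional
     modules after boot).
```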

Thanks,
Mauro

2019-04-23 21:08:57

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 10/63] Documentation: ACPI: move initrd_table_override.txt to admin-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:39 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/acpi/initrd_table_override.txt | 111 ----------------
> Documentation/admin-guide/acpi/index.rst | 1 +
> .../acpi/initrd_table_override.rst | 120 ++++++++++++++++++
> 3 files changed, 121 insertions(+), 111 deletions(-)
> delete mode 100644 Documentation/acpi/initrd_table_override.txt
> create mode 100644 Documentation/admin-guide/acpi/initrd_table_override.rst
>
> diff --git a/Documentation/acpi/initrd_table_override.txt b/Documentation/acpi/initrd_table_override.txt
> deleted file mode 100644
> index 30437a6db373..000000000000
> --- a/Documentation/acpi/initrd_table_override.txt
> +++ /dev/null
> @@ -1,111 +0,0 @@
> -Upgrading ACPI tables via initrd
> -================================
> -
> -1) Introduction (What is this about)
> -2) What is this for
> -3) How does it work
> -4) References (Where to retrieve userspace tools)
> -
> -1) What is this about
> ----------------------
> -
> -If the ACPI_TABLE_UPGRADE compile option is true, it is possible to
> -upgrade the ACPI execution environment that is defined by the ACPI tables
> -via upgrading the ACPI tables provided by the BIOS with an instrumented,
> -modified, more recent version one, or installing brand new ACPI tables.
> -
> -When building initrd with kernel in a single image, option
> -ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
> -feature to work.
> -
> -For a full list of ACPI tables that can be upgraded/installed, take a look
> -at the char *table_sigs[MAX_ACPI_SIGNATURE]; definition in
> -drivers/acpi/tables.c.
> -All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should
> -be overridable, except:
> - - ACPI_SIG_RSDP (has a signature of 6 bytes)
> - - ACPI_SIG_FACS (does not have an ordinary ACPI table header)
> -Both could get implemented as well.
> -
> -
> -2) What is this for
> --------------------
> -
> -Complain to your platform/BIOS vendor if you find a bug which is so severe
> -that a workaround is not accepted in the Linux kernel. And this facility
> -allows you to upgrade the buggy tables before your platform/BIOS vendor
> -releases an upgraded BIOS binary.
> -
> -This facility can be used by platform/BIOS vendors to provide a Linux
> -compatible environment without modifying the underlying platform firmware.
> -
> -This facility also provides a powerful feature to easily debug and test
> -ACPI BIOS table compatibility with the Linux kernel by modifying old
> -platform provided ACPI tables or inserting new ACPI tables.
> -
> -It can and should be enabled in any kernel because there is no functional
> -change with not instrumented initrds.
> -
> -
> -3) How does it work
> --------------------
> -
> -# Extract the machine's ACPI tables:
> -cd /tmp
> -acpidump >acpidump
> -acpixtract -a acpidump
> -# Disassemble, modify and recompile them:
> -iasl -d *.dat
> -# For example add this statement into a _PRT (PCI Routing Table) function
> -# of the DSDT:
> -Store("HELLO WORLD", debug)
> -# And increase the OEM Revision. For example, before modification:
> -DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000000)
> -# After modification:
> -DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000001)
> -iasl -sa dsdt.dsl
> -# Add the raw ACPI tables to an uncompressed cpio archive.
> -# They must be put into a /kernel/firmware/acpi directory inside the cpio
> -# archive. Note that if the table put here matches a platform table
> -# (similar Table Signature, and similar OEMID, and similar OEM Table ID)
> -# with a more recent OEM Revision, the platform table will be upgraded by
> -# this table. If the table put here doesn't match a platform table
> -# (dissimilar Table Signature, or dissimilar OEMID, or dissimilar OEM Table
> -# ID), this table will be appended.
> -mkdir -p kernel/firmware/acpi
> -cp dsdt.aml kernel/firmware/acpi
> -# A maximum of "NR_ACPI_INITRD_TABLES (64)" tables are currently allowed
> -# (see osl.c):
> -iasl -sa facp.dsl
> -iasl -sa ssdt1.dsl
> -cp facp.aml kernel/firmware/acpi
> -cp ssdt1.aml kernel/firmware/acpi
> -# The uncompressed cpio archive must be the first. Other, typically
> -# compressed cpio archives, must be concatenated on top of the uncompressed
> -# one. Following command creates the uncompressed cpio archive and
> -# concatenates the original initrd on top:
> -find kernel | cpio -H newc --create > /boot/instrumented_initrd
> -cat /boot/initrd >>/boot/instrumented_initrd
> -# reboot with increased acpi debug level, e.g. boot params:
> -acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
> -# and check your syslog:
> -[ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> -[ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD"
> -
> -iasl is able to disassemble and recompile quite a lot different,
> -also static ACPI tables.
> -
> -
> -4) Where to retrieve userspace tools
> -------------------------------------
> -
> -iasl and acpixtract are part of Intel's ACPICA project:
> -http://acpica.org/
> -and should be packaged by distributions (for example in the acpica package
> -on SUSE).
> -
> -acpidump can be found in Len Browns pmtools:
> -ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump
> -This tool is also part of the acpica package on SUSE.
> -Alternatively, used ACPI tables can be retrieved via sysfs in latest kernels:
> -/sys/firmware/acpi/tables
> diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> index 3e041206089d..09e4e81e4fb7 100644
> --- a/Documentation/admin-guide/acpi/index.rst
> +++ b/Documentation/admin-guide/acpi/index.rst
> @@ -8,3 +8,4 @@ the Linux ACPI support.
> .. toctree::
> :maxdepth: 1
>
> + initrd_table_override
> diff --git a/Documentation/admin-guide/acpi/initrd_table_override.rst b/Documentation/admin-guide/acpi/initrd_table_override.rst
> new file mode 100644
> index 000000000000..0787b2b91ded
> --- /dev/null
> +++ b/Documentation/admin-guide/acpi/initrd_table_override.rst
> @@ -0,0 +1,120 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +================================
> +Upgrading ACPI tables via initrd
> +================================
> +
> +1) Introduction (What is this about)
> +2) What is this for
> +3) How does it work
> +4) References (Where to retrieve userspace tools)

Hmm... I did the same in my conversion, but IMO the best would be to
hide (or remove, if the ACPI maintainers agree) the table of contents,
since it may drift out of sync with the body when people add new
sections and forget to update it.

So, if ACPI maintainers insist on keeping it, I would code this as:

.. Contents

   1) Introduction (What is this about)
   2) What is this for
   3) How does it work
   4) References (Where to retrieve userspace tools)

as this will make it invisible in the html/pdf/epub output.

Anyway, with or without the above change:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> +
> +1) What is this about
> +=====================
> +
> +If the ACPI_TABLE_UPGRADE compile option is true, it is possible to
> +upgrade the ACPI execution environment that is defined by the ACPI tables
> +via upgrading the ACPI tables provided by the BIOS with an instrumented,
> +modified, more recent version one, or installing brand new ACPI tables.
> +
> +When building initrd with kernel in a single image, option
> +ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
> +feature to work.
> +
> +For a full list of ACPI tables that can be upgraded/installed, take a look
> +at the char `*table_sigs[MAX_ACPI_SIGNATURE];` definition in
> +drivers/acpi/tables.c.
> +
> +All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should
> +be overridable, except:
> +
> + - ACPI_SIG_RSDP (has a signature of 6 bytes)
> + - ACPI_SIG_FACS (does not have an ordinary ACPI table header)
> +
> +Both could get implemented as well.
> +
> +
> +2) What is this for
> +===================
> +
> +Complain to your platform/BIOS vendor if you find a bug which is so severe
> +that a workaround is not accepted in the Linux kernel. And this facility
> +allows you to upgrade the buggy tables before your platform/BIOS vendor
> +releases an upgraded BIOS binary.
> +
> +This facility can be used by platform/BIOS vendors to provide a Linux
> +compatible environment without modifying the underlying platform firmware.
> +
> +This facility also provides a powerful feature to easily debug and test
> +ACPI BIOS table compatibility with the Linux kernel by modifying old
> +platform provided ACPI tables or inserting new ACPI tables.
> +
> +It can and should be enabled in any kernel because there is no functional
> +change with not instrumented initrds.
> +
> +
> +3) How does it work
> +===================
> +::
> +
> + # Extract the machine's ACPI tables:
> + cd /tmp
> + acpidump >acpidump
> + acpixtract -a acpidump
> + # Disassemble, modify and recompile them:
> + iasl -d *.dat
> + # For example add this statement into a _PRT (PCI Routing Table) function
> + # of the DSDT:
> + Store("HELLO WORLD", debug)
> + # And increase the OEM Revision. For example, before modification:
> + DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000000)
> + # After modification:
> + DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000001)
> + iasl -sa dsdt.dsl
> + # Add the raw ACPI tables to an uncompressed cpio archive.
> + # They must be put into a /kernel/firmware/acpi directory inside the cpio
> + # archive. Note that if the table put here matches a platform table
> + # (similar Table Signature, and similar OEMID, and similar OEM Table ID)
> + # with a more recent OEM Revision, the platform table will be upgraded by
> + # this table. If the table put here doesn't match a platform table
> + # (dissimilar Table Signature, or dissimilar OEMID, or dissimilar OEM Table
> + # ID), this table will be appended.
> + mkdir -p kernel/firmware/acpi
> + cp dsdt.aml kernel/firmware/acpi
> + # A maximum of "NR_ACPI_INITRD_TABLES (64)" tables are currently allowed
> + # (see osl.c):
> + iasl -sa facp.dsl
> + iasl -sa ssdt1.dsl
> + cp facp.aml kernel/firmware/acpi
> + cp ssdt1.aml kernel/firmware/acpi
> + # The uncompressed cpio archive must be the first. Other, typically
> + # compressed cpio archives, must be concatenated on top of the uncompressed
> + # one. Following command creates the uncompressed cpio archive and
> + # concatenates the original initrd on top:
> + find kernel | cpio -H newc --create > /boot/instrumented_initrd
> + cat /boot/initrd >>/boot/instrumented_initrd
> + # reboot with increased acpi debug level, e.g. boot params:
> + acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
> + # and check your syslog:
> + [ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> + [ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD"
> +
> +iasl is able to disassemble and recompile quite a lot different,
> +also static ACPI tables.
> +
> +
> +4) Where to retrieve userspace tools
> +====================================
> +
> +iasl and acpixtract are part of Intel's ACPICA project:
> +http://acpica.org/
> +
> +and should be packaged by distributions (for example in the acpica package
> +on SUSE).
> +
> +acpidump can be found in Len Browns pmtools:
> +ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump
> +
> +This tool is also part of the acpica package on SUSE.
> +Alternatively, used ACPI tables can be retrieved via sysfs in latest kernels:
> +/sys/firmware/acpi/tables



Thanks,
Mauro

2019-04-23 21:10:11

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 11/63] Documentation: ACPI: move dsdt-override.txt to admin-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:40 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> .../acpi/dsdt-override.rst} | 8 +++++++-
> Documentation/admin-guide/acpi/index.rst | 1 +
> 2 files changed, 8 insertions(+), 1 deletion(-)
> rename Documentation/{acpi/dsdt-override.txt => admin-guide/acpi/dsdt-override.rst} (56%)
>
> diff --git a/Documentation/acpi/dsdt-override.txt b/Documentation/admin-guide/acpi/dsdt-override.rst
> similarity index 56%
> rename from Documentation/acpi/dsdt-override.txt
> rename to Documentation/admin-guide/acpi/dsdt-override.rst
> index 784841caa6e6..50bd7f194bf4 100644
> --- a/Documentation/acpi/dsdt-override.txt
> +++ b/Documentation/admin-guide/acpi/dsdt-override.rst
> @@ -1,6 +1,12 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===============
> +Overriding DSDT
> +===============
> +
> Linux supports a method of overriding the BIOS DSDT:
>
> -CONFIG_ACPI_CUSTOM_DSDT builds the image into the kernel.
> +CONFIG_ACPI_CUSTOM_DSDT - builds the image into the kernel.
>
> When to use this method is described in detail on the
> Linux/ACPI home page:
> diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> index 09e4e81e4fb7..d68e9914c5ff 100644
> --- a/Documentation/admin-guide/acpi/index.rst
> +++ b/Documentation/admin-guide/acpi/index.rst
> @@ -9,3 +9,4 @@ the Linux ACPI support.
> :maxdepth: 1
>
> initrd_table_override
> + dsdt-override



Thanks,
Mauro

2019-04-23 21:11:21

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 12/63] Documentation: ACPI: move i2c-muxes.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:41 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

For the conversion itself:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/acpi/i2c-muxes.txt | 58 ------------------
> .../firmware-guide/acpi/i2c-muxes.rst | 61 +++++++++++++++++++
> Documentation/firmware-guide/acpi/index.rst | 3 +-
> 3 files changed, 63 insertions(+), 59 deletions(-)
> delete mode 100644 Documentation/acpi/i2c-muxes.txt
> create mode 100644 Documentation/firmware-guide/acpi/i2c-muxes.rst
>
> diff --git a/Documentation/acpi/i2c-muxes.txt b/Documentation/acpi/i2c-muxes.txt
> deleted file mode 100644
> index 9fcc4f0b885e..000000000000
> --- a/Documentation/acpi/i2c-muxes.txt
> +++ /dev/null
> @@ -1,58 +0,0 @@
> -ACPI I2C Muxes
> ---------------
> -
> -Describing an I2C device hierarchy that includes I2C muxes requires an ACPI
> -Device () scope per mux channel.
> -
> -Consider this topology:
> -
> -+------+ +------+
> -| SMB1 |-->| MUX0 |--CH00--> i2c client A (0x50)
> -| | | 0x70 |--CH01--> i2c client B (0x50)
> -+------+ +------+
> -
> -which corresponds to the following ASL:
> -
> -Device (SMB1)
> -{
> - Name (_HID, ...)
> - Device (MUX0)
> - {
> - Name (_HID, ...)
> - Name (_CRS, ResourceTemplate () {
> - I2cSerialBus (0x70, ControllerInitiated, I2C_SPEED,
> - AddressingMode7Bit, "^SMB1", 0x00,
> - ResourceConsumer,,)
> - }
> -
> - Device (CH00)
> - {
> - Name (_ADR, 0)
> -
> - Device (CLIA)
> - {
> - Name (_HID, ...)
> - Name (_CRS, ResourceTemplate () {
> - I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
> - AddressingMode7Bit, "^CH00", 0x00,
> - ResourceConsumer,,)
> - }
> - }
> - }
> -
> - Device (CH01)
> - {
> - Name (_ADR, 1)
> -
> - Device (CLIB)
> - {
> - Name (_HID, ...)
> - Name (_CRS, ResourceTemplate () {
> - I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
> - AddressingMode7Bit, "^CH01", 0x00,
> - ResourceConsumer,,)
> - }
> - }
> - }
> - }
> -}
> diff --git a/Documentation/firmware-guide/acpi/i2c-muxes.rst b/Documentation/firmware-guide/acpi/i2c-muxes.rst
> new file mode 100644
> index 000000000000..3a8997ccd7c4
> --- /dev/null
> +++ b/Documentation/firmware-guide/acpi/i2c-muxes.rst
> @@ -0,0 +1,61 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==============
> +ACPI I2C Muxes
> +==============
> +
> +Describing an I2C device hierarchy that includes I2C muxes requires an ACPI
> +Device () scope per mux channel.
> +
> +Consider this topology::
> +
> + +------+ +------+
> + | SMB1 |-->| MUX0 |--CH00--> i2c client A (0x50)
> + | | | 0x70 |--CH01--> i2c client B (0x50)
> + +------+ +------+
> +
> +which corresponds to the following ASL::
> +
> + Device (SMB1)
> + {
> + Name (_HID, ...)
> + Device (MUX0)
> + {
> + Name (_HID, ...)
> + Name (_CRS, ResourceTemplate () {
> + I2cSerialBus (0x70, ControllerInitiated, I2C_SPEED,
> + AddressingMode7Bit, "^SMB1", 0x00,
> + ResourceConsumer,,)
> + }
> +
> + Device (CH00)
> + {
> + Name (_ADR, 0)
> +
> + Device (CLIA)
> + {
> + Name (_HID, ...)
> + Name (_CRS, ResourceTemplate () {
> + I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
> + AddressingMode7Bit, "^CH00", 0x00,
> + ResourceConsumer,,)
> + }
> + }
> + }
> +
> + Device (CH01)
> + {
> + Name (_ADR, 1)
> +
> + Device (CLIB)
> + {
> + Name (_HID, ...)
> + Name (_CRS, ResourceTemplate () {
> + I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
> + AddressingMode7Bit, "^CH01", 0x00,
> + ResourceConsumer,,)
> + }
> + }
> + }
> + }
> + }
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index d1d069b26bbc..1c89888f6ee8 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -12,4 +12,5 @@ ACPI Support
> osi
> method-customizing
> DSD-properties-rules
> - gpio-properties
> \ No newline at end of file
> + gpio-properties
> + i2c-muxes



Thanks,
Mauro

2019-04-23 21:13:58

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 13/63] Documentation: ACPI: move acpi-lid.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:42 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/acpi-lid.rst} | 48 ++++++++++++-------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 2 files changed, 33 insertions(+), 16 deletions(-)
> rename Documentation/{acpi/acpi-lid.txt => firmware-guide/acpi/acpi-lid.rst} (77%)
>
> diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/firmware-guide/acpi/acpi-lid.rst
> similarity index 77%
> rename from Documentation/acpi/acpi-lid.txt
> rename to Documentation/firmware-guide/acpi/acpi-lid.rst
> index effe7af3a5af..1d19e15a6945 100644
> --- a/Documentation/acpi/acpi-lid.txt
> +++ b/Documentation/firmware-guide/acpi/acpi-lid.rst
> @@ -1,25 +1,29 @@
> -Special Usage Model of the ACPI Control Method Lid Device
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> -Copyright (C) 2016, Intel Corporation
> -Author: Lv Zheng <[email protected]>
> +=========================================================
> +Special Usage Model of the ACPI Control Method Lid Device
> +=========================================================
>
> +:Copyright: |copy| 2016, Intel Corporation
>
> -Abstract:
> +:Author: Lv Zheng <[email protected]>
>
> -Platforms containing lids convey lid state (open/close) to OSPMs using a
> -control method lid device. To implement this, the AML tables issue
> -Notify(lid_device, 0x80) to notify the OSPMs whenever the lid state has
> -changed. The _LID control method for the lid device must be implemented to
> -report the "current" state of the lid as either "opened" or "closed".
> +:Abstract: Platforms containing lids convey lid state (open/close) to OSPMs
> + using a control method lid device. To implement this, the AML tables issue
> + Notify(lid_device, 0x80) to notify the OSPMs whenever the lid state has
> + changed. The _LID control method for the lid device must be implemented to
> + report the "current" state of the lid as either "opened" or "closed".

Same comment for the abstract.
>
> -For most platforms, both the _LID method and the lid notifications are
> -reliable. However, there are exceptions. In order to work with these
> -exceptional buggy platforms, special restrictions and expections should be
> -taken into account. This document describes the restrictions and the
> -expections of the Linux ACPI lid device driver.
> + For most platforms, both the _LID method and the lid notifications are
> + reliable. However, there are exceptions. In order to work with these
> + exceptional buggy platforms, special restrictions and expections should be
> + taken into account. This document describes the restrictions and the
> + expections of the Linux ACPI lid device driver.
>
>
> 1. Restrictions of the returning value of the _LID control method
> +=================================================================
>
> The _LID control method is described to return the "current" lid state.
> However the word of "current" has ambiguity, some buggy AML tables return
> @@ -31,6 +35,7 @@ with cached value, the initial returning value is likely not reliable.
> There are platforms always retun "closed" as initial lid state.
>
> 2. Restrictions of the lid state change notifications
> +=====================================================
>
> There are buggy AML tables never notifying when the lid device state is
> changed to "opened". Thus the "opened" notification is not guaranteed. But
> @@ -40,17 +45,21 @@ trigger some system power saving operations on Windows. Since it is fully
> tested, it is reliable from all AML tables.
>
> 3. Expections for the userspace users of the ACPI lid device driver
> +===================================================================
>
> The ACPI button driver exports the lid state to the userspace via the
> -following file:
> +following file::
> +
> /proc/acpi/button/lid/LID0/state
> +
> This file actually calls the _LID control method described above. And given
> the previous explanation, it is not reliable enough on some platforms. So
> it is advised for the userspace program to not to solely rely on this file
> to determine the actual lid state.
>
> The ACPI button driver emits the following input event to the userspace:
> - SW_LID
> + * SW_LID
> +
> The ACPI lid device driver is implemented to try to deliver the platform
> triggered events to the userspace. However, given the fact that the buggy
> firmware cannot make sure "opened"/"closed" events are paired, the ACPI
> @@ -59,20 +68,25 @@ button driver uses the following 3 modes in order not to trigger issues.
> If the userspace hasn't been prepared to ignore the unreliable "opened"
> events and the unreliable initial state notification, Linux users can use
> the following kernel parameters to handle the possible issues:
> +
> A. button.lid_init_state=method:

Just for the sake of better visuals in the html output, I would mark up
those button.* options as:

A. ``button.lid_init_state=method``:

Anyway, with or without such change:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> When this option is specified, the ACPI button driver reports the
> initial lid state using the returning value of the _LID control method
> and whether the "opened"/"closed" events are paired fully relies on the
> firmware implementation.
> +
> This option can be used to fix some platforms where the returning value
> of the _LID control method is reliable but the initial lid state
> notification is missing.
> +
> This option is the default behavior during the period the userspace
> isn't ready to handle the buggy AML tables.
> +
> B. button.lid_init_state=open:
> When this option is specified, the ACPI button driver always reports the
> initial lid state as "opened" and whether the "opened"/"closed" events
> are paired fully relies on the firmware implementation.
> +
> This may fix some platforms where the returning value of the _LID
> control method is not reliable and the initial lid state notification is
> missing.
> @@ -80,6 +94,7 @@ B. button.lid_init_state=open:
> If the userspace has been prepared to ignore the unreliable "opened" events
> and the unreliable initial state notification, Linux users should always
> use the following kernel parameter:
> +
> C. button.lid_init_state=ignore:
> When this option is specified, the ACPI button driver never reports the
> initial lid state and there is a compensation mechanism implemented to
> @@ -89,6 +104,7 @@ C. button.lid_init_state=ignore:
> notifications can be delivered to the userspace when the lid is actually
> opens given that some AML tables do not send "opened" notifications
> reliably.
> +
> In this mode, if everything is correctly implemented by the platform
> firmware, the old userspace programs should still work. Otherwise, the
> new userspace programs are required to work with the ACPI button driver.
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 1c89888f6ee8..bedcb0b242a2 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -14,3 +14,4 @@ ACPI Support
> DSD-properties-rules
> gpio-properties
> i2c-muxes
> + acpi-lid
> \ No newline at end of file



Thanks,
Mauro

2019-04-23 21:16:11

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 14/63] Documentation: ACPI: move dsd/graph.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:43 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> .../acpi/dsd/graph.rst} | 157 +++++++++---------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 2 files changed, 81 insertions(+), 77 deletions(-)
> rename Documentation/{acpi/dsd/graph.txt => firmware-guide/acpi/dsd/graph.rst} (56%)
>
> diff --git a/Documentation/acpi/dsd/graph.txt b/Documentation/firmware-guide/acpi/dsd/graph.rst
> similarity index 56%
> rename from Documentation/acpi/dsd/graph.txt
> rename to Documentation/firmware-guide/acpi/dsd/graph.rst
> index b9ce910781dc..e0baed35b037 100644
> --- a/Documentation/acpi/dsd/graph.txt
> +++ b/Documentation/firmware-guide/acpi/dsd/graph.rst
> @@ -1,8 +1,11 @@
> -Graphs
> +.. SPDX-License-Identifier: GPL-2.0
>
> +======
> +Graphs
> +======
>
> _DSD
> -----
> +====
>
> _DSD (Device Specific Data) [7] is a predefined ACPI device
> configuration object that can be used to convey information on
> @@ -30,7 +33,7 @@ hierarchical data extension array on each depth.
>
>
> Ports and endpoints
> --------------------
> +===================
>
> The port and endpoint concepts are very similar to those in Devicetree
> [3]. A port represents an interface in a device, and an endpoint
> @@ -38,9 +41,9 @@ represents a connection to that interface.
>
> All port nodes are located under the device's "_DSD" node in the hierarchical
> data extension tree. The data extension related to each port node must begin
> -with "port" and must be followed by the "@" character and the number of the port
> -as its key. The target object it refers to should be called "PRTX", where "X" is
> -the number of the port. An example of such a package would be:
> +with "port" and must be followed by the "@" character and the number of the
> +port as its key. The target object it refers to should be called "PRTX", where
> +"X" is the number of the port. An example of such a package would be::
>
> Package() { "port@4", PRT4 }
>
> @@ -49,7 +52,7 @@ data extension key of the endpoint nodes must begin with
> "endpoint" and must be followed by the "@" character and the number of the
> endpoint. The object it refers to should be called "EPXY", where "X" is the
> number of the port and "Y" is the number of the endpoint. An example of such a
> -package would be:
> +package would be::
>
> Package() { "endpoint@0", EP40 }
>
> @@ -62,85 +65,85 @@ of that port shall be zero. Similarly, if a port may only have a single
> endpoint, the number of that endpoint shall be zero.
>
> The endpoint reference uses property extension with "remote-endpoint" property
> -name followed by a reference in the same package. Such references consist of the
> +name followed by a reference in the same package. Such references consist of
> the remote device reference, the first package entry of the port data extension
> reference under the device and finally the first package entry of the endpoint
> -data extension reference under the port. Individual references thus appear as:
> +data extension reference under the port. Individual references thus appear as::
>
> Package() { device, "port@X", "endpoint@Y" }
>
> -In the above example, "X" is the number of the port and "Y" is the number of the
> -endpoint.
> +In the above example, "X" is the number of the port and "Y" is the number of
> +the endpoint.
>
> The references to endpoints must be always done both ways, to the
> remote endpoint and back from the referred remote endpoint node.
>
> -A simple example of this is show below:
> +A simple example of this is show below::
>
> Scope (\_SB.PCI0.I2C2)
> {
> - Device (CAM0)
> - {
> - Name (_DSD, Package () {
> - ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> - Package () {
> - Package () { "compatible", Package () { "nokia,smia" } },
> - },
> - ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> - Package () {
> - Package () { "port@0", PRT0 },
> - }
> - })
> - Name (PRT0, Package() {
> - ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> - Package () {
> - Package () { "reg", 0 },
> - },
> - ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> - Package () {
> - Package () { "endpoint@0", EP00 },
> - }
> - })
> - Name (EP00, Package() {
> - ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> - Package () {
> - Package () { "reg", 0 },
> - Package () { "remote-endpoint", Package() { \_SB.PCI0.ISP, "port@4", "endpoint@0" } },
> - }
> - })
> - }
> + Device (CAM0)
> + {
> + Name (_DSD, Package () {
> + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> + Package () {
> + Package () { "compatible", Package () { "nokia,smia" } },
> + },
> + ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> + Package () {
> + Package () { "port@0", PRT0 },
> + }
> + })
> + Name (PRT0, Package() {
> + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> + Package () {
> + Package () { "reg", 0 },
> + },
> + ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> + Package () {
> + Package () { "endpoint@0", EP00 },
> + }
> + })
> + Name (EP00, Package() {
> + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> + Package () {
> + Package () { "reg", 0 },
> + Package () { "remote-endpoint", Package() { \_SB.PCI0.ISP, "port@4", "endpoint@0" } },
> + }
> + })
> + }
> }
>
> Scope (\_SB.PCI0)
> {
> - Device (ISP)
> - {
> - Name (_DSD, Package () {
> - ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> - Package () {
> - Package () { "port@4", PRT4 },
> - }
> - })
> -
> - Name (PRT4, Package() {
> - ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> - Package () {
> - Package () { "reg", 4 }, /* CSI-2 port number */
> - },
> - ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> - Package () {
> - Package () { "endpoint@0", EP40 },
> - }
> - })
> -
> - Name (EP40, Package() {
> - ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> - Package () {
> - Package () { "reg", 0 },
> - Package () { "remote-endpoint", Package () { \_SB.PCI0.I2C2.CAM0, "port@0", "endpoint@0" } },
> - }
> - })
> - }
> + Device (ISP)
> + {
> + Name (_DSD, Package () {
> + ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> + Package () {
> + Package () { "port@4", PRT4 },
> + }
> + })
> +
> + Name (PRT4, Package() {
> + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> + Package () {
> + Package () { "reg", 4 }, /* CSI-2 port number */
> + },
> + ToUUID("dbb8e3e6-5886-4ba6-8795-1319f52a966b"),
> + Package () {
> + Package () { "endpoint@0", EP40 },
> + }
> + })
> +
> + Name (EP40, Package() {
> + ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
> + Package () {
> + Package () { "reg", 0 },
> + Package () { "remote-endpoint", Package () { \_SB.PCI0.I2C2.CAM0, "port@0", "endpoint@0" } },
> + }
> + })
> + }
> }
>
> Here, the port 0 of the "CAM0" device is connected to the port 4 of
> @@ -148,27 +151,27 @@ the "ISP" device and vice versa.
>
>
> References
> -----------
> +==========
>
> [1] _DSD (Device Specific Data) Implementation Guide.
> - <URL:http://www.uefi.org/sites/default/files/resources/_DSD-implementation-guide-toplevel-1_1.htm>,
> + http://www.uefi.org/sites/default/files/resources/_DSD-implementation-guide-toplevel-1_1.htm,
> referenced 2016-10-03.
>
> -[2] Devicetree. <URL:http://www.devicetree.org>, referenced 2016-10-03.
> +[2] Devicetree. http://www.devicetree.org, referenced 2016-10-03.
>
> [3] Documentation/devicetree/bindings/graph.txt
>
> [4] Device Properties UUID For _DSD.
> - <URL:http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf>,
> + http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf,
> referenced 2016-10-04.
>
> [5] Hierarchical Data Extension UUID For _DSD.
> - <URL:http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf>,
> + http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf,
> referenced 2016-10-04.
>
> [6] Advanced Configuration and Power Interface Specification.
> - <URL:http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf>,
> + http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf,
> referenced 2016-10-04.
>
> [7] _DSD Device Properties Usage Rules.
> - Documentation/acpi/DSD-properties-rules.txt
> + :doc:`../DSD-properties-rules`
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index bedcb0b242a2..f81cfbcb6878 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -8,6 +8,7 @@ ACPI Support
> :maxdepth: 1
>
> namespace
> + dsd/graph
> enumeration
> osi
> method-customizing
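
For anyone checking the naming rules in the quoted graph.rst, here is a minimal Python sketch (not part of the patch). It derives the "PRTX" and "EPXY" object names from the "port@X" and "endpoint@Y" keys and verifies that every remote-endpoint reference is mirrored from the other side. The helper names and the single-digit restriction are assumptions made for illustration; the link table only mirrors the CAM0/ISP example above.

```python
import re


def port_object_name(port_key: str) -> str:
    """Map a key such as "port@4" to the object name "PRT4"."""
    match = re.fullmatch(r"port@(\d)", port_key)
    if match is None:
        # Assumption: one digit, so the name fits a 4-character ACPI name.
        raise ValueError(f"not a single-digit port key: {port_key!r}")
    return "PRT" + match.group(1)


def endpoint_object_name(port_key: str, endpoint_key: str) -> str:
    """Map ("port@4", "endpoint@0") to the object name "EP40"."""
    port = re.fullmatch(r"port@(\d)", port_key)
    endpoint = re.fullmatch(r"endpoint@(\d)", endpoint_key)
    if port is None or endpoint is None:
        raise ValueError(f"unexpected keys: {port_key!r}, {endpoint_key!r}")
    return "EP" + port.group(1) + endpoint.group(1)


# (device, port key, endpoint key) -> the remote endpoint it refers to.
LINKS = {
    (r"\_SB.PCI0.I2C2.CAM0", "port@0", "endpoint@0"):
        (r"\_SB.PCI0.ISP", "port@4", "endpoint@0"),
    (r"\_SB.PCI0.ISP", "port@4", "endpoint@0"):
        (r"\_SB.PCI0.I2C2.CAM0", "port@0", "endpoint@0"),
}


def one_way_links(links):
    """Return the endpoints whose remote endpoint does not point back."""
    return [src for src, dst in links.items() if links.get(dst) != src]


if __name__ == "__main__":
    print(port_object_name("port@4"))
    print(endpoint_object_name("port@4", "endpoint@0"))
    print(one_way_links(LINKS))
```

An empty list from one_way_links() means the references are done both ways, as the document requires.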



Thanks,
Mauro

2019-04-23 21:19:07

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 15/63] Documentation: ACPI: move dsd/data-node-references.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:44 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/dsd/data-node-references.rst} | 28 +++++++++++--------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 2 files changed, 17 insertions(+), 12 deletions(-)
> rename Documentation/{acpi/dsd/data-node-references.txt => firmware-guide/acpi/dsd/data-node-references.rst} (79%)
>
> diff --git a/Documentation/acpi/dsd/data-node-references.txt b/Documentation/firmware-guide/acpi/dsd/data-node-references.rst
> similarity index 79%
> rename from Documentation/acpi/dsd/data-node-references.txt
> rename to Documentation/firmware-guide/acpi/dsd/data-node-references.rst
> index c3871565c8cf..79c5368eaecf 100644
> --- a/Documentation/acpi/dsd/data-node-references.txt
> +++ b/Documentation/firmware-guide/acpi/dsd/data-node-references.rst
> @@ -1,9 +1,12 @@
> -Copyright (C) 2018 Intel Corporation
> -Author: Sakari Ailus <[email protected]>
> -
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> +===================================
> Referencing hierarchical data nodes
> ------------------------------------
> +===================================
> +
> +:Copyright: |copy| 2018 Intel Corporation
> +:Author: Sakari Ailus <[email protected]>
>
> ACPI in general allows referring to device objects in the tree only.
> Hierarchical data extension nodes may not be referred to directly, hence this
> @@ -28,13 +31,14 @@ extension key.
>
>
> Example
> --------
> +=======
>
> - In the ASL snippet below, the "reference" _DSD property [2] contains a
> - device object reference to DEV0 and under that device object, a
> - hierarchical data extension key "node@1" referring to the NOD1 object
> - and lastly, a hierarchical data extension key "anothernode" referring to
> - the ANOD object which is also the final target node of the reference.
> +In the ASL snippet below, the "reference" _DSD property [2] contains a
> +device object reference to DEV0 and under that device object, a
> +hierarchical data extension key "node@1" referring to the NOD1 object
> +and lastly, a hierarchical data extension key "anothernode" referring to
> +the ANOD object which is also the final target node of the reference.
> +::
>
> Device (DEV0)
> {
> @@ -75,10 +79,10 @@ Example
> })
> }
>
> -Please also see a graph example in graph.txt .
> +Please also see a graph example in :doc:`graph`.
>
> References
> -----------
> +==========
>
> [1] Hierarchical Data Extension UUID For _DSD.
> <URL:http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf>,

Hmm... in the previous patch, you replaced <URL:some_url> with some_url,
which makes sense. Please do the same here and in the other patches of
this series that describe URLs in a similar way.
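
Since the same change is needed in several files, a small script may save some typing. The sketch below is only an illustration: the function name and the regular expression are my own, and it assumes the wrapped URLs contain neither whitespace nor ">".

```python
import re

# Matches the legacy "<URL:...>" wrapper and captures the bare URL.
URL_WRAPPER = re.compile(r"<URL:([^>\s]+)>")


def unwrap_urls(text: str) -> str:
    """Replace every "<URL:some_url>" with "some_url"."""
    return URL_WRAPPER.sub(r"\1", text)


if __name__ == "__main__":
    line = "[2] Devicetree. <URL:http://www.devicetree.org>, referenced 2016-10-03."
    print(unwrap_urls(line))
```

Lines without the wrapper pass through unchanged, so it should be safe to run over a whole file before reviewing the resulting diff.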

> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index f81cfbcb6878..6d4e0df4f063 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -9,6 +9,7 @@ ACPI Support
>
> namespace
> dsd/graph
> + dsd/data-node-references
> enumeration
> osi
> method-customizing



Thanks,
Mauro

2019-04-23 21:22:28

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 16/63] Documentation: ACPI: move debug.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:45 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/debug.rst} | 31 ++++++++++---------
> Documentation/firmware-guide/acpi/index.rst | 3 +-
> 2 files changed, 19 insertions(+), 15 deletions(-)
> rename Documentation/{acpi/debug.txt => firmware-guide/acpi/debug.rst} (91%)
>
> diff --git a/Documentation/acpi/debug.txt b/Documentation/firmware-guide/acpi/debug.rst
> similarity index 91%
> rename from Documentation/acpi/debug.txt
> rename to Documentation/firmware-guide/acpi/debug.rst
> index 65bf47c46b6d..1a152dd1d765 100644
> --- a/Documentation/acpi/debug.txt
> +++ b/Documentation/firmware-guide/acpi/debug.rst
> @@ -1,18 +1,21 @@
> - ACPI Debug Output
> +.. SPDX-License-Identifier: GPL-2.0
>
> +=================
> +ACPI Debug Output
> +=================
>
> The ACPI CA, the Linux ACPI core, and some ACPI drivers can generate debug
> output. This document describes how to use this facility.
>
> Compile-time configuration
> ---------------------------
> +==========================
>
> ACPI debug output is globally enabled by CONFIG_ACPI_DEBUG. If this config
> option is turned off, the debug messages are not even built into the
> kernel.
>
> Boot- and run-time configuration
> ---------------------------------
> +================================
>
> When CONFIG_ACPI_DEBUG=y, you can select the component and level of messages
> you're interested in. At boot-time, use the acpi.debug_layer and
> @@ -21,7 +24,7 @@ debug_layer and debug_level files in /sys/module/acpi/parameters/ to control
> the debug messages.
>
> debug_layer (component)
> ------------------------
> +=======================
>
> The "debug_layer" is a mask that selects components of interest, e.g., a
> specific driver or part of the ACPI interpreter. To build the debug_layer
> @@ -33,7 +36,7 @@ to /sys/module/acpi/parameters/debug_layer.
>
> The possible components are defined in include/acpi/acoutput.h and
> include/acpi/acpi_drivers.h. Reading /sys/module/acpi/parameters/debug_layer
> -shows the supported mask values, currently these:
> +shows the supported mask values, currently these::
>
> ACPI_UTILITIES 0x00000001
> ACPI_HARDWARE 0x00000002
> @@ -65,7 +68,7 @@ shows the supported mask values, currently these:
> ACPI_PROCESSOR_COMPONENT 0x20000000

This is one way of doing it. The other one, which would likely render
better, would be to use tables, e.g.:

============== ==========
ACPI_UTILITIES 0x00000001
ACPI_HARDWARE  0x00000002
============== ==========

Of course, if you use tables here, you need to be consistent across
similar cases inside the document.

While both work, I prefer using tables in such cases.
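
If you go the table route, the column borders can be generated rather than typed by hand. This is a minimal sketch under my own assumptions (a two-column reST simple table, hypothetical function name), using only the two rows shown above:

```python
def simple_table(rows):
    """Render (name, value) pairs as a two-column reST simple table."""
    name_width = max(len(name) for name, _ in rows)
    value_width = max(len(value) for _, value in rows)
    # Border lines mark the column boundaries; one space separates columns.
    border = "=" * name_width + " " + "=" * value_width
    lines = [border]
    for name, value in rows:
        lines.append(name.ljust(name_width) + " " + value)
    lines.append(border)
    return "\n".join(lines)


if __name__ == "__main__":
    print(simple_table([
        ("ACPI_UTILITIES", "0x00000001"),
        ("ACPI_HARDWARE", "0x00000002"),
    ]))
```

The padding matters: in a simple table each cell has to start under its own border, so shorter names are left-justified to the width of the longest one.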

Either way:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

>
> debug_level
> ------------
> +===========
>
> The "debug_level" is a mask that selects different types of messages, e.g.,
> those related to initialization, method execution, informational messages, etc.
> @@ -81,7 +84,7 @@ to /sys/module/acpi/parameters/debug_level.
>
> The possible levels are defined in include/acpi/acoutput.h. Reading
> /sys/module/acpi/parameters/debug_level shows the supported mask values,
> -currently these:
> +currently these::
>
> ACPI_LV_INIT 0x00000001
> ACPI_LV_DEBUG_OBJECT 0x00000002
> @@ -113,9 +116,9 @@ currently these:
> ACPI_LV_EVENTS 0x80000000
>
> Examples
> ---------
> +========
>
> -For example, drivers/acpi/bus.c contains this:
> +For example, drivers/acpi/bus.c contains this::
>
> #define _COMPONENT ACPI_BUS_COMPONENT
> ...
> @@ -127,22 +130,22 @@ statement uses ACPI_DB_INFO, which is macro based on the ACPI_LV_INFO
> definition.)
>
> Enable all AML "Debug" output (stores to the Debug object while interpreting
> -AML) during boot:
> +AML) during boot::
>
> acpi.debug_layer=0xffffffff acpi.debug_level=0x2
>
> -Enable PCI and PCI interrupt routing debug messages:
> +Enable PCI and PCI interrupt routing debug messages::
>
> acpi.debug_layer=0x400000 acpi.debug_level=0x4
>
> -Enable all ACPI hardware-related messages:
> +Enable all ACPI hardware-related messages::
>
> acpi.debug_layer=0x2 acpi.debug_level=0xffffffff
>
> -Enable all ACPI_DB_INFO messages after boot:
> +Enable all ACPI_DB_INFO messages after boot::
>
> # echo 0x4 > /sys/module/acpi/parameters/debug_level
>
> -Show all valid component values:
> +Show all valid component values::
>
> # cat /sys/module/acpi/parameters/debug_layer
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 6d4e0df4f063..a45fea11f998 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -14,6 +14,7 @@ ACPI Support
> osi
> method-customizing
> DSD-properties-rules
> + debug
> gpio-properties
> i2c-muxes
> - acpi-lid
> \ No newline at end of file
> + acpi-lid



Thanks,
Mauro

2019-04-24 14:28:31

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 17/63] Documentation: ACPI: move method-tracing.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:46 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/acpi/method-tracing.txt | 192 ---------------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> .../firmware-guide/acpi/method-tracing.rst | 225 ++++++++++++++++++
> 3 files changed, 226 insertions(+), 192 deletions(-)
> delete mode 100644 Documentation/acpi/method-tracing.txt
> create mode 100644 Documentation/firmware-guide/acpi/method-tracing.rst
>
> diff --git a/Documentation/acpi/method-tracing.txt b/Documentation/acpi/method-tracing.txt
> deleted file mode 100644
> index 0aba14c8f459..000000000000
> --- a/Documentation/acpi/method-tracing.txt
> +++ /dev/null
> @@ -1,192 +0,0 @@
> -ACPICA Trace Facility
> -
> -Copyright (C) 2015, Intel Corporation
> -Author: Lv Zheng <[email protected]>
> -
> -
> -Abstract:
> -
> -This document describes the functions and the interfaces of the method
> -tracing facility.
> -
> -1. Functionalities and usage examples:
> -
> - ACPICA provides method tracing capability. And two functions are
> - currently implemented using this capability.
> -
> - A. Log reducer
> - ACPICA subsystem provides debugging outputs when CONFIG_ACPI_DEBUG is
> - enabled. The debugging messages which are deployed via
> - ACPI_DEBUG_PRINT() macro can be reduced at 2 levels - per-component
> - level (known as debug layer, configured via
> - /sys/module/acpi/parameters/debug_layer) and per-type level (known as
> - debug level, configured via /sys/module/acpi/parameters/debug_level).
> -
> - But when the particular layer/level is applied to the control method
> - evaluations, the quantity of the debugging outputs may still be too
> - large to be put into the kernel log buffer. The idea thus is worked out
> - to only enable the particular debug layer/level (normally more detailed)
> - logs when the control method evaluation is started, and disable the
> - detailed logging when the control method evaluation is stopped.
> -
> - The following command examples illustrate the usage of the "log reducer"
> - functionality:
> - a. Filter out the debug layer/level matched logs when control methods
> - are being evaluated:
> - # cd /sys/module/acpi/parameters
> - # echo "0xXXXXXXXX" > trace_debug_layer
> - # echo "0xYYYYYYYY" > trace_debug_level
> - # echo "enable" > trace_state
> - b. Filter out the debug layer/level matched logs when the specified
> - control method is being evaluated:
> - # cd /sys/module/acpi/parameters
> - # echo "0xXXXXXXXX" > trace_debug_layer
> - # echo "0xYYYYYYYY" > trace_debug_level
> - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> - # echo "method" > /sys/module/acpi/parameters/trace_state
> - c. Filter out the debug layer/level matched logs when the specified
> - control method is being evaluated for the first time:
> - # cd /sys/module/acpi/parameters
> - # echo "0xXXXXXXXX" > trace_debug_layer
> - # echo "0xYYYYYYYY" > trace_debug_level
> - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> - # echo "method-once" > /sys/module/acpi/parameters/trace_state
> - Where:
> - 0xXXXXXXXX/0xYYYYYYYY: Refer to Documentation/acpi/debug.txt for
> - possible debug layer/level masking values.
> - \PPPP.AAAA.TTTT.HHHH: Full path of a control method that can be found
> - in the ACPI namespace. It needn't be an entry
> - of a control method evaluation.
> -
> - B. AML tracer
> -
> - There are special log entries added by the method tracing facility at
> - the "trace points" the AML interpreter starts/stops to execute a control
> - method, or an AML opcode. Note that the format of the log entries are
> - subject to change:
> - [ 0.186427] exdebug-0398 ex_trace_point : Method Begin [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> - [ 0.186630] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905c88:If] execution.
> - [ 0.186820] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:LEqual] execution.
> - [ 0.187010] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905a20:-NamePath-] execution.
> - [ 0.187214] exdebug-0398 ex_trace_point : Opcode End [0xf5905a20:-NamePath-] execution.
> - [ 0.187407] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> - [ 0.187594] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> - [ 0.187789] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:LEqual] execution.
> - [ 0.187980] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:Return] execution.
> - [ 0.188146] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> - [ 0.188334] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> - [ 0.188524] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:Return] execution.
> - [ 0.188712] exdebug-0398 ex_trace_point : Opcode End [0xf5905c88:If] execution.
> - [ 0.188903] exdebug-0398 ex_trace_point : Method End [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> -
> - Developers can utilize these special log entries to track the AML
> - interpretion, thus can aid issue debugging and performance tuning. Note
> - that, as the "AML tracer" logs are implemented via ACPI_DEBUG_PRINT()
> - macro, CONFIG_ACPI_DEBUG is also required to be enabled for enabling
> - "AML tracer" logs.
> -
> - The following command examples illustrate the usage of the "AML tracer"
> - functionality:
> - a. Filter out the method start/stop "AML tracer" logs when control
> - methods are being evaluated:
> - # cd /sys/module/acpi/parameters
> - # echo "0x80" > trace_debug_layer
> - # echo "0x10" > trace_debug_level
> - # echo "enable" > trace_state
> - b. Filter out the method start/stop "AML tracer" when the specified
> - control method is being evaluated:
> - # cd /sys/module/acpi/parameters
> - # echo "0x80" > trace_debug_layer
> - # echo "0x10" > trace_debug_level
> - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> - # echo "method" > trace_state
> - c. Filter out the method start/stop "AML tracer" logs when the specified
> - control method is being evaluated for the first time:
> - # cd /sys/module/acpi/parameters
> - # echo "0x80" > trace_debug_layer
> - # echo "0x10" > trace_debug_level
> - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> - # echo "method-once" > trace_state
> - d. Filter out the method/opcode start/stop "AML tracer" when the
> - specified control method is being evaluated:
> - # cd /sys/module/acpi/parameters
> - # echo "0x80" > trace_debug_layer
> - # echo "0x10" > trace_debug_level
> - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> - # echo "opcode" > trace_state
> - e. Filter out the method/opcode start/stop "AML tracer" when the
> - specified control method is being evaluated for the first time:
> - # cd /sys/module/acpi/parameters
> - # echo "0x80" > trace_debug_layer
> - # echo "0x10" > trace_debug_level
> - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> - # echo "opcode-opcode" > trace_state
> -
> - Note that all above method tracing facility related module parameters can
> - be used as the boot parameters, for example:
> - acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 \
> - acpi.trace_method_name=\_SB.LID0._LID acpi.trace_state=opcode-once
> -
> -2. Interface descriptions:
> -
> - All method tracing functions can be configured via ACPI module
> - parameters that are accessible at /sys/module/acpi/parameters/:
> -
> - trace_method_name
> - The full path of the AML method that the user wants to trace.
> - Note that the full path shouldn't contain the trailing "_"s in its
> - name segments but may contain "\" to form an absolute path.
> -
> - trace_debug_layer
> - The temporary debug_layer used when the tracing feature is enabled.
> - Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
> - used to match all "AML tracer" logs.
> -
> - trace_debug_level
> - The temporary debug_level used when the tracing feature is enabled.
> - Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
> - debug_level used to match all "AML tracer" logs.
> -
> - trace_state
> - The status of the tracing feature.
> - Users can enable/disable this debug tracing feature by executing
> - the following command:
> - # echo string > /sys/module/acpi/parameters/trace_state
> - Where "string" should be one of the following:
> - "disable"
> - Disable the method tracing feature.
> - "enable"
> - Enable the method tracing feature.
> - ACPICA debugging messages matching
> - "trace_debug_layer/trace_debug_level" during any method
> - execution will be logged.
> - "method"
> - Enable the method tracing feature.
> - ACPICA debugging messages matching
> - "trace_debug_layer/trace_debug_level" during method execution
> - of "trace_method_name" will be logged.
> - "method-once"
> - Enable the method tracing feature.
> - ACPICA debugging messages matching
> - "trace_debug_layer/trace_debug_level" during method execution
> - of "trace_method_name" will be logged only once.
> - "opcode"
> - Enable the method tracing feature.
> - ACPICA debugging messages matching
> - "trace_debug_layer/trace_debug_level" during method/opcode
> - execution of "trace_method_name" will be logged.
> - "opcode-once"
> - Enable the method tracing feature.
> - ACPICA debugging messages matching
> - "trace_debug_layer/trace_debug_level" during method/opcode
> - execution of "trace_method_name" will be logged only once.
> - Note that, the difference between the "enable" and other feature
> - enabling options are:
> - 1. When "enable" is specified, since
> - "trace_debug_layer/trace_debug_level" shall apply to all control
> - method evaluations, after configuring "trace_state" to "enable",
> - "trace_method_name" will be reset to NULL.
> - 2. When "method/opcode" is specified, if
> - "trace_method_name" is NULL when "trace_state" is configured to
> - these options, the "trace_debug_layer/trace_debug_level" will
> - apply to all control method evaluations.
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index a45fea11f998..287a7cbd82ac 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -13,6 +13,7 @@ ACPI Support
> enumeration
> osi
> method-customizing
> + method-tracing
> DSD-properties-rules
> debug
> gpio-properties
> diff --git a/Documentation/firmware-guide/acpi/method-tracing.rst b/Documentation/firmware-guide/acpi/method-tracing.rst
> new file mode 100644
> index 000000000000..7a997ba168d7
> --- /dev/null
> +++ b/Documentation/firmware-guide/acpi/method-tracing.rst
> @@ -0,0 +1,225 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
> +
> +=====================
> +ACPICA Trace Facility
> +=====================
> +
> +:Copyright: |copy| 2015, Intel Corporation
> +:Author: Lv Zheng <[email protected]>
> +
> +
> +:Abstract: This document describes the functions and the interfaces of the
> + method tracing facility.

Same comment as on other patches.

> +
> +1. Functionalities and usage examples
> +=====================================
> +
> +ACPICA provides method tracing capability. And two functions are
> +currently implemented using this capability.
> +
> +Log reducer
> +--------------
> +
> +ACPICA subsystem provides debugging outputs when CONFIG_ACPI_DEBUG is
> +enabled. The debugging messages which are deployed via
> +ACPI_DEBUG_PRINT() macro can be reduced at 2 levels - per-component
> +level (known as debug layer, configured via
> +/sys/module/acpi/parameters/debug_layer) and per-type level (known as
> +debug level, configured via /sys/module/acpi/parameters/debug_level).
> +
> +But when the particular layer/level is applied to the control method
> +evaluations, the quantity of the debugging outputs may still be too
> +large to be put into the kernel log buffer. The idea thus is worked out
> +to only enable the particular debug layer/level (normally more detailed)
> +logs when the control method evaluation is started, and disable the
> +detailed logging when the control method evaluation is stopped.
> +
> +The following command examples illustrate the usage of the "log reducer"
> +functionality:
> +
> +a. Filter out the debug layer/level matched logs when control methods
> + are being evaluated::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0xXXXXXXXX" > trace_debug_layer
> + # echo "0xYYYYYYYY" > trace_debug_level
> + # echo "enable" > trace_state
> +
> +b. Filter out the debug layer/level matched logs when the specified
> + control method is being evaluated::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0xXXXXXXXX" > trace_debug_layer
> + # echo "0xYYYYYYYY" > trace_debug_level
> + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> + # echo "method" > /sys/module/acpi/parameters/trace_state
> +
> +c. Filter out the debug layer/level matched logs when the specified
> + control method is being evaluated for the first time::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0xXXXXXXXX" > trace_debug_layer
> + # echo "0xYYYYYYYY" > trace_debug_level
> + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> + # echo "method-once" > /sys/module/acpi/parameters/trace_state
> +
> +Where:
> + 0xXXXXXXXX/0xYYYYYYYY
> + Refer to Documentation/acpi/debug.txt for possible debug layer/level
> + masking values.
> + \PPPP.AAAA.TTTT.HHHH
> + Full path of a control method that can be found in the ACPI namespace.
> + It needn't be an entry of a control method evaluation.
> +
> +AML tracer
> +-------------

The underline markup is longer than the title line. You should have seen
a Sphinx warning here.
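
A quick way to catch these before sending is to compare each title with its underline. The sketch below is my own illustration, not an existing kernel script; the function name and the set of adornment characters are assumptions. Whether Sphinx actually warns may depend on the direction of the mismatch (as far as I know, the docutils "Title underline too short" message concerns underlines shorter than the title), so the sketch flags any difference in length.

```python
import re

# A line made of one repeated adornment character, e.g. "-----" or "=====".
ADORNMENT = re.compile(r"^([=~^#*+-])\1+$")


def mismatched_underlines(text):
    """Return (title, title_length, underline_length) for each mismatch."""
    lines = [line.rstrip() for line in text.splitlines()]
    problems = []
    for index in range(1, len(lines)):
        title, underline = lines[index - 1], lines[index]
        if not title or ADORNMENT.match(title):
            continue  # blank line or an overline, not a title
        if ADORNMENT.match(underline) and len(underline) != len(title):
            problems.append((title, len(title), len(underline)))
    return problems


if __name__ == "__main__":
    sample = "AML tracer\n" + "-" * 13 + "\n\nGraphs\n======\n"
    for title, want, got in mismatched_underlines(sample):
        print(f"{title!r}: title is {want} characters, underline is {got}")
```

Titles whose underline matches in length, such as "Graphs" in the sample, are not reported.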

> +
> +There are special log entries added by the method tracing facility at
> +the "trace points" the AML interpreter starts/stops to execute a control
> +method, or an AML opcode. Note that the format of the log entries are
> +subject to change::
> +
> + [ 0.186427] exdebug-0398 ex_trace_point : Method Begin [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> + [ 0.186630] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905c88:If] execution.
> + [ 0.186820] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:LEqual] execution.
> + [ 0.187010] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905a20:-NamePath-] execution.
> + [ 0.187214] exdebug-0398 ex_trace_point : Opcode End [0xf5905a20:-NamePath-] execution.
> + [ 0.187407] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> + [ 0.187594] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> + [ 0.187789] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:LEqual] execution.
> + [ 0.187980] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:Return] execution.
> + [ 0.188146] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> + [ 0.188334] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> + [ 0.188524] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:Return] execution.
> + [ 0.188712] exdebug-0398 ex_trace_point : Opcode End [0xf5905c88:If] execution.
> + [ 0.188903] exdebug-0398 ex_trace_point : Method End [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> +
> +Developers can utilize these special log entries to track the AML
> +interpretion, thus can aid issue debugging and performance tuning. Note
> +that, as the "AML tracer" logs are implemented via ACPI_DEBUG_PRINT()
> +macro, CONFIG_ACPI_DEBUG is also required to be enabled for enabling
> +"AML tracer" logs.
> +
> +The following command examples illustrate the usage of the "AML tracer"
> +functionality:
> +
> +a. Filter out the method start/stop "AML tracer" logs when control
> + methods are being evaluated::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0x80" > trace_debug_layer
> + # echo "0x10" > trace_debug_level
> + # echo "enable" > trace_state
> +
> +b. Filter out the method start/stop "AML tracer" when the specified
> + control method is being evaluated::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0x80" > trace_debug_layer
> + # echo "0x10" > trace_debug_level
> + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> + # echo "method" > trace_state
> +
> +c. Filter out the method start/stop "AML tracer" logs when the specified
> + control method is being evaluated for the first time::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0x80" > trace_debug_layer
> + # echo "0x10" > trace_debug_level
> + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> + # echo "method-once" > trace_state
> +
> +d. Filter out the method/opcode start/stop "AML tracer" when the
> + specified control method is being evaluated::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0x80" > trace_debug_layer
> + # echo "0x10" > trace_debug_level
> + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> + # echo "opcode" > trace_state
> +
> +e. Filter out the method/opcode start/stop "AML tracer" when the
> + specified control method is being evaluated for the first time::
> +
> + # cd /sys/module/acpi/parameters
> + # echo "0x80" > trace_debug_layer
> + # echo "0x10" > trace_debug_level
> + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> + # echo "opcode-once" > trace_state
> +
> +Note that all above method tracing facility related module parameters can
> +be used as the boot parameters, for example::
> +
> + acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 \
> + acpi.trace_method_name=\_SB.LID0._LID acpi.trace_state=opcode-once
> +
> +2. Interface descriptions
> +=========================
> +
> +All method tracing functions can be configured via ACPI module
> +parameters that are accessible at /sys/module/acpi/parameters/:
> +
> +trace_method_name
> +The full path of the AML method that the user wants to trace.
> +Note that the full path shouldn't contain the trailing "_"s in its
> +name segments but may contain "\" to form an absolute path.
> +


> +trace_debug_layer
> +The temporary debug_layer used when the tracing feature is enabled.
> +Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
> +used to match all "AML tracer" logs.
> +
> +trace_debug_level
> +The temporary debug_level used when the tracing feature is enabled.
> +Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
> +debug_level used to match all "AML tracer" logs.
> +
> +trace_state
> +The status of the tracing feature.
> +Users can enable/disable this debug tracing feature by executing
> +the following command::

For the above, please indent the descriptions, so that the sysfs node
names are rendered in bold. Also, separate paragraphs with a blank line, e.g.:

trace_method_name
    The full path of the AML method that the user wants to trace.

    Note that the full path shouldn't contain the trailing "_"s in its
    name segments but may contain "\" to form an absolute path.

trace_debug_layer
    The temporary debug_layer used when the tracing feature is enabled.

    Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
    used to match all "AML tracer" logs.

trace_debug_level
    The temporary debug_level used when the tracing feature is enabled.

    Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
    debug_level used to match all "AML tracer" logs.

trace_state
    The status of the tracing feature.

    Users can enable/disable this debug tracing feature by executing
    the following command::

After making those changes:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>


> +
> + # echo string > /sys/module/acpi/parameters/trace_state
> +
> +Where "string" should be one of the following:
> +
> +"disable"
> + Disable the method tracing feature.
> +"enable"
> + Enable the method tracing feature.
> + ACPICA debugging messages matching
> + "trace_debug_layer/trace_debug_level" during any method
> + execution will be logged.
> +"method"
> + Enable the method tracing feature.
> + ACPICA debugging messages matching
> + "trace_debug_layer/trace_debug_level" during method execution
> + of "trace_method_name" will be logged.
> +"method-once"
> + Enable the method tracing feature.
> + ACPICA debugging messages matching
> + "trace_debug_layer/trace_debug_level" during method execution
> + of "trace_method_name" will be logged only once.
> +"opcode"
> + Enable the method tracing feature.
> + ACPICA debugging messages matching
> + "trace_debug_layer/trace_debug_level" during method/opcode
> + execution of "trace_method_name" will be logged.
> +"opcode-once"
> + Enable the method tracing feature.
> + ACPICA debugging messages matching
> + "trace_debug_layer/trace_debug_level" during method/opcode
> + execution of "trace_method_name" will be logged only once.
> +
> +Note that, the difference between the "enable" and other feature
> +enabling options are:
> +
> +1. When "enable" is specified, since
> + "trace_debug_layer/trace_debug_level" shall apply to all control
> + method evaluations, after configuring "trace_state" to "enable",
> + "trace_method_name" will be reset to NULL.
> +2. When "method/opcode" is specified, if
> + "trace_method_name" is NULL when "trace_state" is configured to
> + these options, the "trace_debug_layer/trace_debug_level" will
> + apply to all control method evaluations.
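
Since "trace_state" only accepts a fixed set of strings, a typo in the
state string is easy to make. As a minimal sketch (not part of the patch),
a small wrapper could check the string against the values listed above
before writing it. The function names and the PARAMS override are
hypothetical; only the sysfs path and the six state strings come from the
document, and writing to the real node needs root:

```shell
#!/bin/sh
# Hypothetical helper: accept only the trace_state strings listed in the
# interface description quoted above.
is_valid_trace_state() {
    case "$1" in
        disable|enable|method|method-once|opcode|opcode-once) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: write the state only if it is one of the documented
# values. PARAMS is an assumption added for testing; when it is unset, the
# documented sysfs directory is used.
set_trace_state() {
    params=${PARAMS:-/sys/module/acpi/parameters}
    if ! is_valid_trace_state "$1"; then
        echo "invalid trace_state: $1" >&2
        return 1
    fi
    printf '%s\n' "$1" > "$params/trace_state"
}
```

With this, "set_trace_state opcode-once" would write the string, while an
unknown string is rejected before it reaches sysfs.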



Thanks,
Mauro

2019-04-24 14:31:33

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 19/63] Documentation: ACPI: move apei/output_format.txt to firmware-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:48 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

For the conversion changes:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/acpi/apei/output_format.txt | 147 -----------------
> .../acpi/apei/output_format.rst | 150 ++++++++++++++++++
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 3 files changed, 151 insertions(+), 147 deletions(-)
> delete mode 100644 Documentation/acpi/apei/output_format.txt
> create mode 100644 Documentation/firmware-guide/acpi/apei/output_format.rst
>
> diff --git a/Documentation/acpi/apei/output_format.txt b/Documentation/acpi/apei/output_format.txt
> deleted file mode 100644
> index 0c49c197c47a..000000000000
> --- a/Documentation/acpi/apei/output_format.txt
> +++ /dev/null
> @@ -1,147 +0,0 @@
> - APEI output format
> - ~~~~~~~~~~~~~~~~~~
> -
> -APEI uses printk as hardware error reporting interface, the output
> -format is as follow.
> -
> -<error record> :=
> -APEI generic hardware error status
> -severity: <integer>, <severity string>
> -section: <integer>, severity: <integer>, <severity string>
> -flags: <integer>
> -<section flags strings>
> -fru_id: <uuid string>
> -fru_text: <string>
> -section_type: <section type string>
> -<section data>
> -
> -<severity string>* := recoverable | fatal | corrected | info
> -
> -<section flags strings># :=
> -[primary][, containment warning][, reset][, threshold exceeded]\
> -[, resource not accessible][, latent error]
> -
> -<section type string> := generic processor error | memory error | \
> -PCIe error | unknown, <uuid string>
> -
> -<section data> :=
> -<generic processor section data> | <memory section data> | \
> -<pcie section data> | <null>
> -
> -<generic processor section data> :=
> -[processor_type: <integer>, <proc type string>]
> -[processor_isa: <integer>, <proc isa string>]
> -[error_type: <integer>
> -<proc error type strings>]
> -[operation: <integer>, <proc operation string>]
> -[flags: <integer>
> -<proc flags strings>]
> -[level: <integer>]
> -[version_info: <integer>]
> -[processor_id: <integer>]
> -[target_address: <integer>]
> -[requestor_id: <integer>]
> -[responder_id: <integer>]
> -[IP: <integer>]
> -
> -<proc type string>* := IA32/X64 | IA64
> -
> -<proc isa string>* := IA32 | IA64 | X64
> -
> -<processor error type strings># :=
> -[cache error][, TLB error][, bus error][, micro-architectural error]
> -
> -<proc operation string>* := unknown or generic | data read | data write | \
> -instruction execution
> -
> -<proc flags strings># :=
> -[restartable][, precise IP][, overflow][, corrected]
> -
> -<memory section data> :=
> -[error_status: <integer>]
> -[physical_address: <integer>]
> -[physical_address_mask: <integer>]
> -[node: <integer>]
> -[card: <integer>]
> -[module: <integer>]
> -[bank: <integer>]
> -[device: <integer>]
> -[row: <integer>]
> -[column: <integer>]
> -[bit_position: <integer>]
> -[requestor_id: <integer>]
> -[responder_id: <integer>]
> -[target_id: <integer>]
> -[error_type: <integer>, <mem error type string>]
> -
> -<mem error type string>* :=
> -unknown | no error | single-bit ECC | multi-bit ECC | \
> -single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \
> -target abort | parity error | watchdog timeout | invalid address | \
> -mirror Broken | memory sparing | scrub corrected error | \
> -scrub uncorrected error
> -
> -<pcie section data> :=
> -[port_type: <integer>, <pcie port type string>]
> -[version: <integer>.<integer>]
> -[command: <integer>, status: <integer>]
> -[device_id: <integer>:<integer>:<integer>.<integer>
> -slot: <integer>
> -secondary_bus: <integer>
> -vendor_id: <integer>, device_id: <integer>
> -class_code: <integer>]
> -[serial number: <integer>, <integer>]
> -[bridge: secondary_status: <integer>, control: <integer>]
> -[aer_status: <integer>, aer_mask: <integer>
> -<aer status string>
> -[aer_uncor_severity: <integer>]
> -aer_layer=<aer layer string>, aer_agent=<aer agent string>
> -aer_tlp_header: <integer> <integer> <integer> <integer>]
> -
> -<pcie port type string>* := PCIe end point | legacy PCI end point | \
> -unknown | unknown | root port | upstream switch port | \
> -downstream switch port | PCIe to PCI/PCI-X bridge | \
> -PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \
> -root complex event collector
> -
> -if section severity is fatal or recoverable
> -<aer status string># :=
> -unknown | unknown | unknown | unknown | Data Link Protocol | \
> -unknown | unknown | unknown | unknown | unknown | unknown | unknown | \
> -Poisoned TLP | Flow Control Protocol | Completion Timeout | \
> -Completer Abort | Unexpected Completion | Receiver Overflow | \
> -Malformed TLP | ECRC | Unsupported Request
> -else
> -<aer status string># :=
> -Receiver Error | unknown | unknown | unknown | unknown | unknown | \
> -Bad TLP | Bad DLLP | RELAY_NUM Rollover | unknown | unknown | unknown | \
> -Replay Timer Timeout | Advisory Non-Fatal
> -fi
> -
> -<aer layer string> :=
> -Physical Layer | Data Link Layer | Transaction Layer
> -
> -<aer agent string> :=
> -Receiver ID | Requester ID | Completer ID | Transmitter ID
> -
> -Where, [] designate corresponding content is optional
> -
> -All <field string> description with * has the following format:
> -
> -field: <integer>, <field string>
> -
> -Where value of <integer> should be the position of "string" in <field
> -string> description. Otherwise, <field string> will be "unknown".
> -
> -All <field strings> description with # has the following format:
> -
> -field: <integer>
> -<field strings>
> -
> -Where each string in <fields strings> corresponding to one set bit of
> -<integer>. The bit position is the position of "string" in <field
> -strings> description.
> -
> -For more detailed explanation of every field, please refer to UEFI
> -specification version 2.3 or later, section Appendix N: Common
> -Platform Error Record.
> diff --git a/Documentation/firmware-guide/acpi/apei/output_format.rst b/Documentation/firmware-guide/acpi/apei/output_format.rst
> new file mode 100644
> index 000000000000..c2e7ebddb529
> --- /dev/null
> +++ b/Documentation/firmware-guide/acpi/apei/output_format.rst
> @@ -0,0 +1,150 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==================
> +APEI output format
> +==================
> +
> +APEI uses printk as hardware error reporting interface, the output
> +format is as follow::
> +
> + <error record> :=
> + APEI generic hardware error status
> + severity: <integer>, <severity string>
> + section: <integer>, severity: <integer>, <severity string>
> + flags: <integer>
> + <section flags strings>
> + fru_id: <uuid string>
> + fru_text: <string>
> + section_type: <section type string>
> + <section data>
> +
> + <severity string>* := recoverable | fatal | corrected | info
> +
> + <section flags strings># :=
> + [primary][, containment warning][, reset][, threshold exceeded]\
> + [, resource not accessible][, latent error]
> +
> + <section type string> := generic processor error | memory error | \
> + PCIe error | unknown, <uuid string>
> +
> + <section data> :=
> + <generic processor section data> | <memory section data> | \
> + <pcie section data> | <null>
> +
> + <generic processor section data> :=
> + [processor_type: <integer>, <proc type string>]
> + [processor_isa: <integer>, <proc isa string>]
> + [error_type: <integer>
> + <proc error type strings>]
> + [operation: <integer>, <proc operation string>]
> + [flags: <integer>
> + <proc flags strings>]
> + [level: <integer>]
> + [version_info: <integer>]
> + [processor_id: <integer>]
> + [target_address: <integer>]
> + [requestor_id: <integer>]
> + [responder_id: <integer>]
> + [IP: <integer>]
> +
> + <proc type string>* := IA32/X64 | IA64
> +
> + <proc isa string>* := IA32 | IA64 | X64
> +
> + <processor error type strings># :=
> + [cache error][, TLB error][, bus error][, micro-architectural error]
> +
> + <proc operation string>* := unknown or generic | data read | data write | \
> + instruction execution
> +
> + <proc flags strings># :=
> + [restartable][, precise IP][, overflow][, corrected]
> +
> + <memory section data> :=
> + [error_status: <integer>]
> + [physical_address: <integer>]
> + [physical_address_mask: <integer>]
> + [node: <integer>]
> + [card: <integer>]
> + [module: <integer>]
> + [bank: <integer>]
> + [device: <integer>]
> + [row: <integer>]
> + [column: <integer>]
> + [bit_position: <integer>]
> + [requestor_id: <integer>]
> + [responder_id: <integer>]
> + [target_id: <integer>]
> + [error_type: <integer>, <mem error type string>]
> +
> + <mem error type string>* :=
> + unknown | no error | single-bit ECC | multi-bit ECC | \
> + single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \
> + target abort | parity error | watchdog timeout | invalid address | \
> + mirror Broken | memory sparing | scrub corrected error | \
> + scrub uncorrected error
> +
> + <pcie section data> :=
> + [port_type: <integer>, <pcie port type string>]
> + [version: <integer>.<integer>]
> + [command: <integer>, status: <integer>]
> + [device_id: <integer>:<integer>:<integer>.<integer>
> + slot: <integer>
> + secondary_bus: <integer>
> + vendor_id: <integer>, device_id: <integer>
> + class_code: <integer>]
> + [serial number: <integer>, <integer>]
> + [bridge: secondary_status: <integer>, control: <integer>]
> + [aer_status: <integer>, aer_mask: <integer>
> + <aer status string>
> + [aer_uncor_severity: <integer>]
> + aer_layer=<aer layer string>, aer_agent=<aer agent string>
> + aer_tlp_header: <integer> <integer> <integer> <integer>]
> +
> + <pcie port type string>* := PCIe end point | legacy PCI end point | \
> + unknown | unknown | root port | upstream switch port | \
> + downstream switch port | PCIe to PCI/PCI-X bridge | \
> + PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \
> + root complex event collector
> +
> + if section severity is fatal or recoverable
> + <aer status string># :=
> + unknown | unknown | unknown | unknown | Data Link Protocol | \
> + unknown | unknown | unknown | unknown | unknown | unknown | unknown | \
> + Poisoned TLP | Flow Control Protocol | Completion Timeout | \
> + Completer Abort | Unexpected Completion | Receiver Overflow | \
> + Malformed TLP | ECRC | Unsupported Request
> + else
> + <aer status string># :=
> + Receiver Error | unknown | unknown | unknown | unknown | unknown | \
> + Bad TLP | Bad DLLP | RELAY_NUM Rollover | unknown | unknown | unknown | \
> + Replay Timer Timeout | Advisory Non-Fatal
> + fi
> +
> + <aer layer string> :=
> + Physical Layer | Data Link Layer | Transaction Layer
> +
> + <aer agent string> :=
> + Receiver ID | Requester ID | Completer ID | Transmitter ID
> +
> +Where, [] designate corresponding content is optional
> +
> +All <field string> description with * has the following format::
> +
> + field: <integer>, <field string>
> +
> +Where value of <integer> should be the position of "string" in <field
> +string> description. Otherwise, <field string> will be "unknown".
> +
> +All <field strings> description with # has the following format::
> +
> + field: <integer>
> + <field strings>
> +
> +Where each string in <fields strings> corresponding to one set bit of
> +<integer>. The bit position is the position of "string" in <field
> +strings> description.
> +
> +For more detailed explanation of every field, please refer to UEFI
> +specification version 2.3 or later, section Appendix N: Common
> +Platform Error Record.
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index e9f253d54897..869badba6d7a 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -17,6 +17,7 @@ ACPI Support
> DSD-properties-rules
> debug
> aml-debugger
> + apei/output_format
> gpio-properties
> i2c-muxes
> acpi-lid
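
The "*" and "#" rules at the end of output_format.rst can be illustrated
with a small decoder. This is only a sketch of how I read those two rules,
not code from the kernel: the function names are made up, and it assumes
positions are counted from zero (first string is position 0, respectively
bit 0):

```shell
#!/bin/sh
# "*" fields: <integer> is the position of the string in the description;
# any other value is reported as "unknown".
# Usage: lookup_field INDEX NAME0 NAME1 ...
lookup_field() {
    index=$1; shift
    pos=0
    for name in "$@"; do
        if [ "$pos" -eq "$index" ]; then
            printf '%s\n' "$name"
            return 0
        fi
        pos=$((pos + 1))
    done
    printf 'unknown\n'
}

# "#" fields: each string corresponds to one set bit of <integer>; the bit
# position is the position of the string in the description.
# Usage: decode_flags VALUE NAME0 NAME1 ...
decode_flags() {
    value=$1; shift
    out=""
    bit=0
    for name in "$@"; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out, }$name"
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "$out"
}
```

For example, under the zero-based assumption, "decode_flags 5 primary
'containment warning' reset" yields "primary, reset", and
"lookup_field 1 recoverable fatal corrected info" yields "fatal".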



Thanks,
Mauro

2019-04-24 14:52:00

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 21/63] Documentation: ACPI: move cppc_sysfs.txt to admin-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:50 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/cppc_sysfs.rst} | 71 ++++++++++---------
> Documentation/admin-guide/acpi/index.rst | 1 +
> 2 files changed, 40 insertions(+), 32 deletions(-)
> rename Documentation/{acpi/cppc_sysfs.txt => admin-guide/acpi/cppc_sysfs.rst} (51%)
>
> diff --git a/Documentation/acpi/cppc_sysfs.txt b/Documentation/admin-guide/acpi/cppc_sysfs.rst
> similarity index 51%
> rename from Documentation/acpi/cppc_sysfs.txt
> rename to Documentation/admin-guide/acpi/cppc_sysfs.rst
> index f20fb445135d..a4b99afbe331 100644
> --- a/Documentation/acpi/cppc_sysfs.txt
> +++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst
> @@ -1,5 +1,11 @@
> +.. SPDX-License-Identifier: GPL-2.0
>
> - Collaborative Processor Performance Control (CPPC)
> +==================================================
> +Collaborative Processor Performance Control (CPPC)
> +==================================================
> +
> +CPPC
> +====
>
> CPPC defined in the ACPI spec describes a mechanism for the OS to manage the
> performance of a logical processor on a contigious and abstract performance
> @@ -10,31 +16,28 @@ For more details on CPPC please refer to the ACPI specification at:
>
> http://uefi.org/specifications
>
> -Some of the CPPC registers are exposed via sysfs under:
> -
> -/sys/devices/system/cpu/cpuX/acpi_cppc/
> -


> -for each cpu X

Hmm... removed by mistake?

> +Some of the CPPC registers are exposed via sysfs under::
>
> ---------------------------------------------------------------------------------
> + /sys/devices/system/cpu/cpuX/acpi_cppc/

Did you parse this with Sphinx? It doesn't look like a valid ReST construct
to me, because:

1) I've seen some versions of Sphinx abort with severe errors when
there's no blank line after the horizontal bar markup;

2) It will very likely ignore the "::" (I didn't test it myself), as you're
not indenting the horizontal bar. The end of indentation will mean the end
of an (empty) literal block.

So, I would stick with:


Some of the CPPC registers are exposed via sysfs under:

/sys/devices/system/cpu/cpuX/acpi_cppc/

---------------------------------------------------------------------------------

for each cpu X::


or:

Some of the CPPC registers are exposed via sysfs under:

/sys/devices/system/cpu/cpuX/acpi_cppc/

for each cpu X

--------------------------------------------------------------------------------

::

(which is closer to the original author's intent)

The same applies to the other similar changes in this document.

>
> -$ ls -lR /sys/devices/system/cpu/cpu0/acpi_cppc/
> -/sys/devices/system/cpu/cpu0/acpi_cppc/:
> -total 0
> --r--r--r-- 1 root root 65536 Mar 5 19:38 feedback_ctrs
> --r--r--r-- 1 root root 65536 Mar 5 19:38 highest_perf
> --r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_freq
> --r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_nonlinear_perf
> --r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_perf
> --r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_freq
> --r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_perf
> --r--r--r-- 1 root root 65536 Mar 5 19:38 reference_perf
> --r--r--r-- 1 root root 65536 Mar 5 19:38 wraparound_time
> +for each cpu X::
>
> ---------------------------------------------------------------------------------
> + $ ls -lR /sys/devices/system/cpu/cpu0/acpi_cppc/
> + /sys/devices/system/cpu/cpu0/acpi_cppc/:
> + total 0
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 feedback_ctrs
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 highest_perf
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_freq
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_nonlinear_perf
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_perf
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_freq
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_perf
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 reference_perf
> + -r--r--r-- 1 root root 65536 Mar 5 19:38 wraparound_time
>
> * highest_perf : Highest performance of this processor (abstract scale).
> -* nominal_perf : Highest sustained performance of this processor (abstract scale).
> +* nominal_perf : Highest sustained performance of this processor
> + (abstract scale).
> * lowest_nonlinear_perf : Lowest performance of this processor with nonlinear
> power savings (abstract scale).
> * lowest_perf : Lowest performance of this processor (abstract scale).
> @@ -48,22 +51,26 @@ total 0
> * feedback_ctrs : Includes both Reference and delivered performance counter.
> Reference counter ticks up proportional to processor's reference performance.
> Delivered counter ticks up proportional to processor's delivered performance.
> -* wraparound_time: Minimum time for the feedback counters to wraparound (seconds).
> +* wraparound_time: Minimum time for the feedback counters to wraparound
> + (seconds).
> * reference_perf : Performance level at which reference performance counter
> accumulates (abstract scale).
>
> ---------------------------------------------------------------------------------
>
> - Computing Average Delivered Performance
> +Computing Average Delivered Performance
> +=======================================
> +
> +Below describes the steps to compute the average performance delivered by
> +taking two different snapshots of feedback counters at time T1 and T2.
> +
> + T1: Read feedback_ctrs as fbc_t1
> + Wait or run some workload
>
> -Below describes the steps to compute the average performance delivered by taking
> -two different snapshots of feedback counters at time T1 and T2.
> + T2: Read feedback_ctrs as fbc_t2
>
> -T1: Read feedback_ctrs as fbc_t1
> - Wait or run some workload
> -T2: Read feedback_ctrs as fbc_t2
> +::
>
> -delivered_counter_delta = fbc_t2[del] - fbc_t1[del]
> -reference_counter_delta = fbc_t2[ref] - fbc_t1[ref]
> + delivered_counter_delta = fbc_t2[del] - fbc_t1[del]
> + reference_counter_delta = fbc_t2[ref] - fbc_t1[ref]
>
> -delivered_perf = (refernce_perf x delivered_counter_delta) / reference_counter_delta
> + delivered_perf = (refernce_perf x delivered_counter_delta) / reference_counter_delta
> diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> index d68e9914c5ff..9049a7b9f065 100644
> --- a/Documentation/admin-guide/acpi/index.rst
> +++ b/Documentation/admin-guide/acpi/index.rst
> @@ -10,3 +10,4 @@ the Linux ACPI support.
>
> initrd_table_override
> dsdt-override
> + cppc_sysfs
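
The "Computing Average Delivered Performance" steps could be sketched in
shell as below. This is an illustration only: the function names are
hypothetical, and it assumes that feedback_ctrs reads as a single line of
the form "ref:<n> del:<n>", which the document does not state. It also
uses integer arithmetic and ignores counter wraparound:

```shell
#!/bin/sh
# Extract one counter ("ref" or "del") from a snapshot string.
# Assumption: a snapshot looks like "ref:<n> del:<n>".
counter() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p"
}

# delivered_perf REFERENCE_PERF SNAPSHOT_T1 SNAPSHOT_T2
# Implements:
#   delivered_perf = (reference_perf x delivered_counter_delta)
#                    / reference_counter_delta
delivered_perf() {
    ref_perf=$1
    d_delta=$(( $(counter del "$3") - $(counter del "$2") ))
    r_delta=$(( $(counter ref "$3") - $(counter ref "$2") ))
    if [ "$r_delta" -eq 0 ]; then
        echo "reference counter did not advance" >&2
        return 1
    fi
    echo $(( ref_perf * d_delta / r_delta ))
}

# Possible use on a real system (paths as in the document, cpu0 chosen
# arbitrarily):
#   dir=/sys/devices/system/cpu/cpu0/acpi_cppc
#   t1=$(cat "$dir/feedback_ctrs"); sleep 1; t2=$(cat "$dir/feedback_ctrs")
#   delivered_perf "$(cat "$dir/reference_perf")" "$t1" "$t2"
```

The result is on the same abstract scale as reference_perf.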



Thanks,
Mauro

2019-04-24 14:57:08

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 23/63] Documentation: ACPI: move ssdt-overlays.txt to admin-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:52 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/acpi/ssdt-overlays.txt | 172 -----------------
> Documentation/admin-guide/acpi/index.rst | 1 +
> .../admin-guide/acpi/ssdt-overlays.rst | 180 ++++++++++++++++++
> 3 files changed, 181 insertions(+), 172 deletions(-)
> delete mode 100644 Documentation/acpi/ssdt-overlays.txt
> create mode 100644 Documentation/admin-guide/acpi/ssdt-overlays.rst
>
> diff --git a/Documentation/acpi/ssdt-overlays.txt b/Documentation/acpi/ssdt-overlays.txt
> deleted file mode 100644
> index 5ae13f161ea2..000000000000
> --- a/Documentation/acpi/ssdt-overlays.txt
> +++ /dev/null
> @@ -1,172 +0,0 @@
> -
> -In order to support ACPI open-ended hardware configurations (e.g. development
> -boards) we need a way to augment the ACPI configuration provided by the firmware
> -image. A common example is connecting sensors on I2C / SPI buses on development
> -boards.
> -
> -Although this can be accomplished by creating a kernel platform driver or
> -recompiling the firmware image with updated ACPI tables, neither is practical:
> -the former proliferates board specific kernel code while the latter requires
> -access to firmware tools which are often not publicly available.
> -
> -Because ACPI supports external references in AML code a more practical
> -way to augment firmware ACPI configuration is by dynamically loading
> -user defined SSDT tables that contain the board specific information.
> -
> -For example, to enumerate a Bosch BMA222E accelerometer on the I2C bus of the
> -Minnowboard MAX development board exposed via the LSE connector [1], the
> -following ASL code can be used:
> -
> -DefinitionBlock ("minnowmax.aml", "SSDT", 1, "Vendor", "Accel", 0x00000003)
> -{
> - External (\_SB.I2C6, DeviceObj)
> -
> - Scope (\_SB.I2C6)
> - {
> - Device (STAC)
> - {
> - Name (_ADR, Zero)
> - Name (_HID, "BMA222E")
> -
> - Method (_CRS, 0, Serialized)
> - {
> - Name (RBUF, ResourceTemplate ()
> - {
> - I2cSerialBus (0x0018, ControllerInitiated, 0x00061A80,
> - AddressingMode7Bit, "\\_SB.I2C6", 0x00,
> - ResourceConsumer, ,)
> - GpioInt (Edge, ActiveHigh, Exclusive, PullDown, 0x0000,
> - "\\_SB.GPO2", 0x00, ResourceConsumer, , )
> - { // Pin list
> - 0
> - }
> - })
> - Return (RBUF)
> - }
> - }
> - }
> -}
> -
> -which can then be compiled to AML binary format:
> -
> -$ iasl minnowmax.asl
> -
> -Intel ACPI Component Architecture
> -ASL Optimizing Compiler version 20140214-64 [Mar 29 2014]
> -Copyright (c) 2000 - 2014 Intel Corporation
> -
> -ASL Input: minnomax.asl - 30 lines, 614 bytes, 7 keywords
> -AML Output: minnowmax.aml - 165 bytes, 6 named objects, 1 executable opcodes
> -
> -[1] http://wiki.minnowboard.org/MinnowBoard_MAX#Low_Speed_Expansion_Connector_.28Top.29
> -
> -The resulting AML code can then be loaded by the kernel using one of the methods
> -below.
> -
> -== Loading ACPI SSDTs from initrd ==
> -
> -This option allows loading of user defined SSDTs from initrd and it is useful
> -when the system does not support EFI or when there is not enough EFI storage.
> -
> -It works in a similar way with initrd based ACPI tables override/upgrade: SSDT
> -aml code must be placed in the first, uncompressed, initrd under the
> -"kernel/firmware/acpi" path. Multiple files can be used and this will translate
> -in loading multiple tables. Only SSDT and OEM tables are allowed. See
> -initrd_table_override.txt for more details.
> -
> -Here is an example:
> -
> -# Add the raw ACPI tables to an uncompressed cpio archive.
> -# They must be put into a /kernel/firmware/acpi directory inside the
> -# cpio archive.
> -# The uncompressed cpio archive must be the first.
> -# Other, typically compressed cpio archives, must be
> -# concatenated on top of the uncompressed one.
> -mkdir -p kernel/firmware/acpi
> -cp ssdt.aml kernel/firmware/acpi
> -
> -# Create the uncompressed cpio archive and concatenate the original initrd
> -# on top:
> -find kernel | cpio -H newc --create > /boot/instrumented_initrd
> -cat /boot/initrd >>/boot/instrumented_initrd
> -
> -== Loading ACPI SSDTs from EFI variables ==
> -
> -This is the preferred method, when EFI is supported on the platform, because it
> -allows a persistent, OS independent way of storing the user defined SSDTs. There
> -is also work underway to implement EFI support for loading user defined SSDTs
> -and using this method will make it easier to convert to the EFI loading
> -mechanism when that will arrive.
> -
> -In order to load SSDTs from an EFI variable the efivar_ssdt kernel command line
> -parameter can be used. The argument for the option is the variable name to
> -use. If there are multiple variables with the same name but with different
> -vendor GUIDs, all of them will be loaded.
> -
> -In order to store the AML code in an EFI variable the efivarfs filesystem can be
> -used. It is enabled and mounted by default in /sys/firmware/efi/efivars in all
> -recent distribution.
> -
> -Creating a new file in /sys/firmware/efi/efivars will automatically create a new
> -EFI variable. Updating a file in /sys/firmware/efi/efivars will update the EFI
> -variable. Please note that the file name needs to be specially formatted as
> -"Name-GUID" and that the first 4 bytes in the file (little-endian format)
> -represent the attributes of the EFI variable (see EFI_VARIABLE_MASK in
> -include/linux/efi.h). Writing to the file must also be done with one write
> -operation.
> -
> -For example, you can use the following bash script to create/update an EFI
> -variable with the content from a given file:
> -
> -#!/bin/sh -e
> -
> -while ! [ -z "$1" ]; do
> - case "$1" in
> - "-f") filename="$2"; shift;;
> - "-g") guid="$2"; shift;;
> - *) name="$1";;
> - esac
> - shift
> -done
> -
> -usage()
> -{
> - echo "Syntax: ${0##*/} -f filename [ -g guid ] name"
> - exit 1
> -}
> -
> -[ -n "$name" -a -f "$filename" ] || usage
> -
> -EFIVARFS="/sys/firmware/efi/efivars"
> -
> -[ -d "$EFIVARFS" ] || exit 2
> -
> -if stat -tf $EFIVARFS | grep -q -v de5e81e4; then
> - mount -t efivarfs none $EFIVARFS
> -fi
> -
> -# try to pick up an existing GUID
> -[ -n "$guid" ] || guid=$(find "$EFIVARFS" -name "$name-*" | head -n1 | cut -f2- -d-)
> -
> -# use a randomly generated GUID
> -[ -n "$guid" ] || guid="$(cat /proc/sys/kernel/random/uuid)"
> -
> -# efivarfs expects all of the data in one write
> -tmp=$(mktemp)
> -/bin/echo -ne "\007\000\000\000" | cat - $filename > $tmp
> -dd if=$tmp of="$EFIVARFS/$name-$guid" bs=$(stat -c %s $tmp)
> -rm $tmp
> -
> -== Loading ACPI SSDTs from configfs ==
> -
> -This option allows loading of user defined SSDTs from userspace via the configfs
> -interface. The CONFIG_ACPI_CONFIGFS option must be select and configfs must be
> -mounted. In the following examples, we assume that configfs has been mounted in
> -/config.
> -
> -New tables can be loading by creating new directories in /config/acpi/table/ and
> -writing the SSDT aml code in the aml attribute:
> -
> -cd /config/acpi/table
> -mkdir my_ssdt
> -cat ~/ssdt.aml > my_ssdt/aml
> diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> index 9049a7b9f065..4d13eeea1eca 100644
> --- a/Documentation/admin-guide/acpi/index.rst
> +++ b/Documentation/admin-guide/acpi/index.rst
> @@ -10,4 +10,5 @@ the Linux ACPI support.
>
> initrd_table_override
> dsdt-override
> + ssdt-overlays
> cppc_sysfs
> diff --git a/Documentation/admin-guide/acpi/ssdt-overlays.rst b/Documentation/admin-guide/acpi/ssdt-overlays.rst
> new file mode 100644
> index 000000000000..da37455f96c9
> --- /dev/null
> +++ b/Documentation/admin-guide/acpi/ssdt-overlays.rst
> @@ -0,0 +1,180 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=============
> +SSDT Overlays
> +=============
> +
> +In order to support ACPI open-ended hardware configurations (e.g. development
> +boards) we need a way to augment the ACPI configuration provided by the firmware
> +image. A common example is connecting sensors on I2C / SPI buses on development
> +boards.
> +
> +Although this can be accomplished by creating a kernel platform driver or
> +recompiling the firmware image with updated ACPI tables, neither is practical:
> +the former proliferates board specific kernel code while the latter requires
> +access to firmware tools which are often not publicly available.
> +
> +Because ACPI supports external references in AML code a more practical
> +way to augment firmware ACPI configuration is by dynamically loading
> +user defined SSDT tables that contain the board specific information.
> +
> +For example, to enumerate a Bosch BMA222E accelerometer on the I2C bus of the
> +Minnowboard MAX development board exposed via the LSE connector [1], the
> +following ASL code can be used::
> +
> + DefinitionBlock ("minnowmax.aml", "SSDT", 1, "Vendor", "Accel", 0x00000003)
> + {
> + External (\_SB.I2C6, DeviceObj)
> +
> + Scope (\_SB.I2C6)
> + {
> + Device (STAC)
> + {
> + Name (_ADR, Zero)
> + Name (_HID, "BMA222E")
> +
> + Method (_CRS, 0, Serialized)
> + {
> + Name (RBUF, ResourceTemplate ()
> + {
> + I2cSerialBus (0x0018, ControllerInitiated, 0x00061A80,
> + AddressingMode7Bit, "\\_SB.I2C6", 0x00,
> + ResourceConsumer, ,)
> + GpioInt (Edge, ActiveHigh, Exclusive, PullDown, 0x0000,
> + "\\_SB.GPO2", 0x00, ResourceConsumer, , )
> + { // Pin list
> + 0
> + }
> + })
> + Return (RBUF)
> + }
> + }
> + }
> + }
> +
> +which can then be compiled to AML binary format::
> +
> + $ iasl minnowmax.asl
> +
> + Intel ACPI Component Architecture
> + ASL Optimizing Compiler version 20140214-64 [Mar 29 2014]
> + Copyright (c) 2000 - 2014 Intel Corporation
> +
> + ASL Input: minnomax.asl - 30 lines, 614 bytes, 7 keywords
> + AML Output: minnowmax.aml - 165 bytes, 6 named objects, 1 executable opcodes
> +
> +[1] http://wiki.minnowboard.org/MinnowBoard_MAX#Low_Speed_Expansion_Connector_.28Top.29
> +
> +The resulting AML code can then be loaded by the kernel using one of the methods
> +below.
> +
> +Loading ACPI SSDTs from initrd
> +==============================
> +
> +This option allows loading of user defined SSDTs from initrd and it is useful
> +when the system does not support EFI or when there is not enough EFI storage.
> +
> +It works in a similar way with initrd based ACPI tables override/upgrade: SSDT
> +aml code must be placed in the first, uncompressed, initrd under the
> +"kernel/firmware/acpi" path. Multiple files can be used and this will translate
> +in loading multiple tables. Only SSDT and OEM tables are allowed. See
> +initrd_table_override.txt for more details.
> +
> +Here is an example::
> +
> + # Add the raw ACPI tables to an uncompressed cpio archive.
> + # They must be put into a /kernel/firmware/acpi directory inside the
> + # cpio archive.
> + # The uncompressed cpio archive must be the first.
> + # Other, typically compressed cpio archives, must be
> + # concatenated on top of the uncompressed one.
> + mkdir -p kernel/firmware/acpi
> + cp ssdt.aml kernel/firmware/acpi
> +
> + # Create the uncompressed cpio archive and concatenate the original initrd
> + # on top:
> + find kernel | cpio -H newc --create > /boot/instrumented_initrd
> + cat /boot/initrd >>/boot/instrumented_initrd
> +
> +Loading ACPI SSDTs from EFI variables
> +=====================================
> +
> +This is the preferred method, when EFI is supported on the platform, because it
> +allows a persistent, OS independent way of storing the user defined SSDTs. There
> +is also work underway to implement EFI support for loading user defined SSDTs
> +and using this method will make it easier to convert to the EFI loading
> +mechanism when that will arrive.
> +
> +In order to load SSDTs from an EFI variable the efivar_ssdt kernel command line
> +parameter can be used. The argument for the option is the variable name to
> +use. If there are multiple variables with the same name but with different
> +vendor GUIDs, all of them will be loaded.
> +
> +In order to store the AML code in an EFI variable the efivarfs filesystem can be
> +used. It is enabled and mounted by default in /sys/firmware/efi/efivars in all
> +recent distribution.
> +
> +Creating a new file in /sys/firmware/efi/efivars will automatically create a new
> +EFI variable. Updating a file in /sys/firmware/efi/efivars will update the EFI
> +variable. Please note that the file name needs to be specially formatted as
> +"Name-GUID" and that the first 4 bytes in the file (little-endian format)
> +represent the attributes of the EFI variable (see EFI_VARIABLE_MASK in
> +include/linux/efi.h). Writing to the file must also be done with one write
> +operation.
> +
> +For example, you can use the following bash script to create/update an EFI
> +variable with the content from a given file::
> +
> + #!/bin/sh -e
> +
> + while ! [ -z "$1" ]; do
> + case "$1" in
> + "-f") filename="$2"; shift;;
> + "-g") guid="$2"; shift;;
> + *) name="$1";;
> + esac
> + shift
> + done
> +
> + usage()
> + {
> + echo "Syntax: ${0##*/} -f filename [ -g guid ] name"
> + exit 1
> + }
> +
> + [ -n "$name" -a -f "$filename" ] || usage
> +
> + EFIVARFS="/sys/firmware/efi/efivars"
> +
> + [ -d "$EFIVARFS" ] || exit 2
> +
> + if stat -tf $EFIVARFS | grep -q -v de5e81e4; then
> + mount -t efivarfs none $EFIVARFS
> + fi
> +
> + # try to pick up an existing GUID
> + [ -n "$guid" ] || guid=$(find "$EFIVARFS" -name "$name-*" | head -n1 | cut -f2- -d-)
> +
> + # use a randomly generated GUID
> + [ -n "$guid" ] || guid="$(cat /proc/sys/kernel/random/uuid)"
> +
> + # efivarfs expects all of the data in one write
> + tmp=$(mktemp)
> + /bin/echo -ne "\007\000\000\000" | cat - $filename > $tmp
> + dd if=$tmp of="$EFIVARFS/$name-$guid" bs=$(stat -c %s $tmp)
> + rm $tmp
> +
> +Loading ACPI SSDTs from configfs
> +================================
> +
> +This option allows loading of user defined SSDTs from userspace via the configfs
> +interface. The CONFIG_ACPI_CONFIGFS option must be select and configfs must be
> +mounted. In the following examples, we assume that configfs has been mounted in
> +/config.
> +
> +New tables can be loading by creating new directories in /config/acpi/table/ and
> +writing the SSDT aml code in the aml attribute::
> +
> + cd /config/acpi/table
> + mkdir my_ssdt
> + cat ~/ssdt.aml > my_ssdt/aml

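As a side note for readers of the EFI variable section above, the file layout that efivarfs expects can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the variable name and the GUID are made up here, and the attribute value 0x7 is simply the one the quoted shell script writes as "\007\000\000\000".

```python
import os
import struct
import uuid

# Same attribute value the quoted shell script writes ("\007\000\000\000").
EFI_VAR_ATTRIBUTES = 0x7


def efivar_file_name(name, guid):
    """Return the "Name-GUID" file name described in the document."""
    # uuid.UUID validates the GUID text and raises ValueError if it is malformed.
    return f"{name}-{uuid.UUID(guid)}"


def efivar_payload(aml, attributes=EFI_VAR_ATTRIBUTES):
    """Prefix the AML bytes with the 4-byte little-endian attributes."""
    return struct.pack("<I", attributes) + aml


def write_efivar(efivarfs_dir, name, guid, aml):
    """Write attributes and data with a single write() call."""
    path = os.path.join(efivarfs_dir, efivar_file_name(name, guid))
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, efivar_payload(aml))
    finally:
        os.close(fd)
    return path
```

On a real system the directory would be /sys/firmware/efi/efivars and the AML bytes would come from the compiled SSDT; pointing the helper at a scratch directory is a harmless way to inspect the resulting file.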


Thanks,
Mauro

2019-04-24 14:58:47

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 24/63] Documentation: ACPI: move video_extension.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:53 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/firmware-guide/acpi/index.rst | 1 +
> .../acpi/video_extension.rst} | 63 ++++++++++---------
> 2 files changed, 36 insertions(+), 28 deletions(-)
> rename Documentation/{acpi/video_extension.txt => firmware-guide/acpi/video_extension.rst} (79%)
>
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 0e60f4b7129a..ae609eec4679 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -23,3 +23,4 @@ ACPI Support
> i2c-muxes
> acpi-lid
> lpit
> + video_extension
> diff --git a/Documentation/acpi/video_extension.txt b/Documentation/firmware-guide/acpi/video_extension.rst
> similarity index 79%
> rename from Documentation/acpi/video_extension.txt
> rename to Documentation/firmware-guide/acpi/video_extension.rst
> index 79bf6a4921be..06f7e3230b6e 100644
> --- a/Documentation/acpi/video_extension.txt
> +++ b/Documentation/firmware-guide/acpi/video_extension.rst
> @@ -1,5 +1,8 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=====================
> ACPI video extensions
> -~~~~~~~~~~~~~~~~~~~~~
> +=====================
>
> This driver implement the ACPI Extensions For Display Adapters for
> integrated graphics devices on motherboard, as specified in ACPI 2.0
> @@ -8,9 +11,10 @@ defining the video POST device, retrieving EDID information or to
> setup a video output, etc. Note that this is an ref. implementation
> only. It may or may not work for your integrated video device.
>
> -The ACPI video driver does 3 things regarding backlight control:
> +The ACPI video driver does 3 things regarding backlight control.
>
> -1 Export a sysfs interface for user space to control backlight level
> +1. Export a sysfs interface for user space to control backlight level
> +=====================================================================
>
> If the ACPI table has a video device, and acpi_backlight=vendor kernel
> command line is not present, the driver will register a backlight device

Hmm... you didn't touch on this part of the document:

And what ACPI video driver does is:
actual_brightness: on read, control method _BQC will be evaluated to
get the brightness level the firmware thinks it is at;
bl_power: not implemented, will set the current brightness instead;
brightness: on write, control method _BCM will run to set the requested
brightness level;
max_brightness: Derived from the _BCL package(see below);
type: firmware

It should be converted as well. My suggestion here is:

And what ACPI video driver does is:

actual_brightness:
  on read, control method _BQC will be evaluated to
  get the brightness level the firmware thinks it is at;
bl_power:
  not implemented, will set the current brightness instead;
brightness:
  on write, control method _BCM will run to set the requested
  brightness level;
max_brightness:
  Derived from the _BCL package(see below);
type:
  firmware
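
To make the link between max_brightness and the _BCL package concrete, here is a small Python sketch. It uses the package quoted in the next hunk of this patch and assumes that a sysfs index simply selects the corresponding entry, in package order, once the two leading AC/battery entries are dropped. The function names are illustrative only and are not part of the driver.

```python
# The _BCL package quoted in the next hunk of this patch.
BCL_PACKAGE = [0x64, 0x32, 0x0A, 0x14, 0x1E, 0x28,
               0x32, 0x3C, 0x46, 0x50, 0x5A, 0x64]


def supported_levels(bcl):
    """Drop the two leading AC/battery entries, which Linux does not use."""
    return bcl[2:]


def max_brightness(bcl):
    """Highest index user space may write to the brightness attribute."""
    return len(supported_levels(bcl)) - 1


def level_for_index(bcl, index):
    """Translate a sysfs brightness index into a _BCL level.

    Assumption: index i selects the i-th remaining entry in package order.
    """
    levels = supported_levels(bcl)
    if not 0 <= index < len(levels):
        raise ValueError(f"index {index} outside 0..{len(levels) - 1}")
    return levels[index]
```

With the package above this gives a max_brightness of 9, which agrees with the 0 to 9 range stated in the document.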

> @@ -32,26 +36,26 @@ type: firmware
>
> Note that ACPI video backlight driver will always use index for
> brightness, actual_brightness and max_brightness. So if we have
> -the following _BCL package:
> +the following _BCL package::
>
> -Method (_BCL, 0, NotSerialized)
> -{
> - Return (Package (0x0C)
> + Method (_BCL, 0, NotSerialized)
> {
> - 0x64,
> - 0x32,
> - 0x0A,
> - 0x14,
> - 0x1E,
> - 0x28,
> - 0x32,
> - 0x3C,
> - 0x46,
> - 0x50,
> - 0x5A,
> - 0x64
> - })
> -}
> + Return (Package (0x0C)
> + {
> + 0x64,
> + 0x32,
> + 0x0A,
> + 0x14,
> + 0x1E,
> + 0x28,
> + 0x32,
> + 0x3C,
> + 0x46,
> + 0x50,
> + 0x5A,
> + 0x64
> + })
> + }
>
> The first two levels are for when laptop are on AC or on battery and are
> not used by Linux currently. The remaining 10 levels are supported levels
> @@ -62,13 +66,15 @@ as a "brightness level" indicator. Thus from the user space perspective
> the range of available brightness levels is from 0 to 9 (max_brightness)
> inclusive.
>
> -2 Notify user space about hotkey event
> +2. Notify user space about hotkey event
> +=======================================
>
> There are generally two cases for hotkey event reporting:
> +
> i) For some laptops, when user presses the hotkey, a scancode will be
> generated and sent to user space through the input device created by
> the keyboard driver as a key type input event, with proper remap, the
> - following key code will appear to user space:
> + following key code will appear to user space::
>
> EV_KEY, KEY_BRIGHTNESSUP
> EV_KEY, KEY_BRIGHTNESSDOWN
> @@ -82,7 +88,7 @@ ii) For some laptops, the press of the hotkey will not generate the
> about the event. The event value is defined in the ACPI spec. ACPI
> video driver will generate an key type input event according to the
> notify value it received and send the event to user space through the
> - input device it created:
> + input device it created::
>
> event keycode
> 0x86 KEY_BRIGHTNESSUP

Perhaps formatting this as a table would work better:

input device it created:

===== ===================
event keycode
===== ===================
0x86 KEY_BRIGHTNESSUP
0x87 KEY_BRIGHTNESSDOWN
etc.
===== ===================


> @@ -94,13 +100,14 @@ so this would lead to the same effect as case i) now.
> Once user space tool receives this event, it can modify the backlight
> level through the sysfs interface.
>
> -3 Change backlight level in the kernel
> +3. Change backlight level in the kernel
> +=======================================
>
> This works for machines covered by case ii) in Section 2. Once the driver
> received a notification, it will set the backlight level accordingly. This does
> not affect the sending of event to user space, they are always sent to user
> space regardless of whether or not the video module controls the backlight level
> directly. This behaviour can be controlled through the brightness_switch_enabled
> -module parameter as documented in admin-guide/kernel-parameters.rst. It is recommended to
> -disable this behaviour once a GUI environment starts up and wants to have full
> -control of the backlight level.
> +module parameter as documented in admin-guide/kernel-parameters.rst. It is
> +recommended to disable this behaviour once a GUI environment starts up and
> +wants to have full control of the backlight level.



Thanks,
Mauro

2019-04-24 15:03:06

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 20/63] Documentation: ACPI: move apei/einj.txt to firmware-guide/acpi and convert to reST

On Wed, 24 Apr 2019 00:28:49 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> .../acpi/apei/einj.rst} | 98 ++++++++++---------
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 2 files changed, 53 insertions(+), 46 deletions(-)
> rename Documentation/{acpi/apei/einj.txt => firmware-guide/acpi/apei/einj.rst} (67%)
>
> diff --git a/Documentation/acpi/apei/einj.txt b/Documentation/firmware-guide/acpi/apei/einj.rst
> similarity index 67%
> rename from Documentation/acpi/apei/einj.txt
> rename to Documentation/firmware-guide/acpi/apei/einj.rst
> index e550c8b98139..d85e2667155c 100644
> --- a/Documentation/acpi/apei/einj.txt
> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
> @@ -1,13 +1,16 @@
> - APEI Error INJection
> - ~~~~~~~~~~~~~~~~~~~~
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================
> +APEI Error INJection
> +====================
>
> EINJ provides a hardware error injection mechanism. It is very useful
> for debugging and testing APEI and RAS features in general.
>
> You need to check whether your BIOS supports EINJ first. For that, look
> -for early boot messages similar to this one:
> +for early boot messages similar to this one::
>
> -ACPI: EINJ 0x000000007370A000 000150 (v01 INTEL 00000001 INTL 00000001)
> + ACPI: EINJ 0x000000007370A000 000150 (v01 INTEL 00000001 INTL 00000001)
>
> which shows that the BIOS is exposing an EINJ table - it is the
> mechanism through which the injection is done.
> @@ -23,11 +26,11 @@ order to see the APEI,EINJ,... functionality supported and exposed by
> the BIOS menu.
>
> To use EINJ, make sure the following are options enabled in your kernel
> -configuration:
> +configuration::
>
> -CONFIG_DEBUG_FS
> -CONFIG_ACPI_APEI
> -CONFIG_ACPI_APEI_EINJ
> + CONFIG_DEBUG_FS
> + CONFIG_ACPI_APEI
> + CONFIG_ACPI_APEI_EINJ
>
> The EINJ user interface is in <debugfs mount point>/apei/einj.
>
> @@ -35,22 +38,22 @@ The following files belong to it:
>
> - available_error_type
>
> - This file shows which error types are supported:
> -
> - Error Type Value Error Description
> - ================ =================
> - 0x00000001 Processor Correctable
> - 0x00000002 Processor Uncorrectable non-fatal
> - 0x00000004 Processor Uncorrectable fatal
> - 0x00000008 Memory Correctable
> - 0x00000010 Memory Uncorrectable non-fatal
> - 0x00000020 Memory Uncorrectable fatal
> - 0x00000040 PCI Express Correctable
> - 0x00000080 PCI Express Uncorrectable fatal
> - 0x00000100 PCI Express Uncorrectable non-fatal
> - 0x00000200 Platform Correctable
> - 0x00000400 Platform Uncorrectable non-fatal
> - 0x00000800 Platform Uncorrectable fatal
> + This file shows which error types are supported::
> +
> + Error Type Value Error Description
> + ================ =================
> + 0x00000001 Processor Correctable
> + 0x00000002 Processor Uncorrectable non-fatal
> + 0x00000004 Processor Uncorrectable fatal
> + 0x00000008 Memory Correctable
> + 0x00000010 Memory Uncorrectable non-fatal
> + 0x00000020 Memory Uncorrectable fatal
> + 0x00000040 PCI Express Correctable
> + 0x00000080 PCI Express Uncorrectable fatal
> + 0x00000100 PCI Express Uncorrectable non-fatal
> + 0x00000200 Platform Correctable
> + 0x00000400 Platform Uncorrectable non-fatal
> + 0x00000800 Platform Uncorrectable fatal

This is a table and not a literal block.

The best way to preserve the author's intent here is to adjust the table
markup so that it becomes parseable, e.g.:

This file shows which error types are supported:

================ ===================================
Error Type Value Error Description
================ ===================================
0x00000001       Processor Correctable
0x00000002       Processor Uncorrectable non-fatal
0x00000004       Processor Uncorrectable fatal
0x00000008       Memory Correctable
0x00000010       Memory Uncorrectable non-fatal
0x00000020       Memory Uncorrectable fatal
0x00000040       PCI Express Correctable
0x00000080       PCI Express Uncorrectable fatal
0x00000100       PCI Express Uncorrectable non-fatal
0x00000200       Platform Correctable
0x00000400       Platform Uncorrectable non-fatal
0x00000800       Platform Uncorrectable fatal
================ ===================================

After such change:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>
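
As an aside, the two-column layout is also what makes the file easy to consume from a script. The sketch below is only an illustration: it assumes each line of available_error_type holds a hexadecimal value followed by whitespace and a description, as in the injection example later in this patch, and the function name is made up here.

```python
def parse_available_error_types(text):
    """Map each advertised error type value to its description."""
    types = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # First field is the hex value, the rest of the line is the description.
        value, description = line.split(None, 1)
        types[int(value, 16)] = description
    return types


# Sample contents taken from the injection example in the document.
SAMPLE = (
    "0x00000002\tProcessor Uncorrectable non-fatal\n"
    "0x00000008\tMemory Correctable\n"
    "0x00000010\tMemory Uncorrectable non-fatal\n"
)

if __name__ == "__main__":
    for value, description in parse_available_error_types(SAMPLE).items():
        print(f"{value:#010x} {description}")
```

On a real machine the text would be read from the available_error_type file under <debugfs mount point>/apei/einj instead of the sample string.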

>
> The format of the file contents are as above, except present are only
> the available error types.
> @@ -73,9 +76,12 @@ The following files belong to it:
> injection. Value is a bitmask as specified in ACPI5.0 spec for the
> SET_ERROR_TYPE_WITH_ADDRESS data structure:
>
> - Bit 0 - Processor APIC field valid (see param3 below).
> - Bit 1 - Memory address and mask valid (param1 and param2).
> - Bit 2 - PCIe (seg,bus,dev,fn) valid (see param4 below).
> + Bit 0
> + Processor APIC field valid (see param3 below).
> + Bit 1
> + Memory address and mask valid (param1 and param2).
> + Bit 2
> + PCIe (seg,bus,dev,fn) valid (see param4 below).
>
> If set to zero, legacy behavior is mimicked where the type of
> injection specifies just one bit set, and param1 is multiplexed.
> @@ -121,7 +127,7 @@ BIOS versions based on the ACPI 5.0 specification have more control over
> the target of the injection. For processor-related errors (type 0x1, 0x2
> and 0x4), you can set flags to 0x3 (param3 for bit 0, and param1 and
> param2 for bit 1) so that you have more information added to the error
> -signature being injected. The actual data passed is this:
> +signature being injected. The actual data passed is this::
>
> memory_address = param1;
> memory_address_range = param2;
> @@ -131,7 +137,7 @@ signature being injected. The actual data passed is this:
> For memory errors (type 0x8, 0x10 and 0x20) the address is set using
> param1 with a mask in param2 (0x0 is equivalent to all ones). For PCI
> express errors (type 0x40, 0x80 and 0x100) the segment, bus, device and
> -function are specified using param1:
> +function are specified using param1::
>
> 31 24 23 16 15 11 10 8 7 0
> +-------------------------------------------------+
> @@ -152,26 +158,26 @@ documentation for details (and expect changes to this API if vendors
> creativity in using this feature expands beyond our expectations).
>
>
> -An error injection example:
> +An error injection example::
>
> -# cd /sys/kernel/debug/apei/einj
> -# cat available_error_type # See which errors can be injected
> -0x00000002 Processor Uncorrectable non-fatal
> -0x00000008 Memory Correctable
> -0x00000010 Memory Uncorrectable non-fatal
> -# echo 0x12345000 > param1 # Set memory address for injection
> -# echo $((-1 << 12)) > param2 # Mask 0xfffffffffffff000 - anywhere in this page
> -# echo 0x8 > error_type # Choose correctable memory error
> -# echo 1 > error_inject # Inject now
> + # cd /sys/kernel/debug/apei/einj
> + # cat available_error_type # See which errors can be injected
> + 0x00000002 Processor Uncorrectable non-fatal
> + 0x00000008 Memory Correctable
> + 0x00000010 Memory Uncorrectable non-fatal
> + # echo 0x12345000 > param1 # Set memory address for injection
> + # echo $((-1 << 12)) > param2 # Mask 0xfffffffffffff000 - anywhere in this page
> + # echo 0x8 > error_type # Choose correctable memory error
> + # echo 1 > error_inject # Inject now
>
> -You should see something like this in dmesg:
> +You should see something like this in dmesg::
>
> -[22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
> -[22715.834759] EDAC sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
> -[22715.834759] EDAC sbridge MC3: TSC 0
> -[22715.834759] EDAC sbridge MC3: ADDR 12345000 EDAC sbridge MC3: MISC 144780c86
> -[22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
> -[22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
> + [22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
> + [22715.834759] EDAC sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
> + [22715.834759] EDAC sbridge MC3: TSC 0
> + [22715.834759] EDAC sbridge MC3: ADDR 12345000 EDAC sbridge MC3: MISC 144780c86
> + [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
> + [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
>
> For more information about EINJ, please refer to ACPI specification
> version 4.0, section 17.5 and ACPI 5.0, section 18.6.
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 869badba6d7a..fca854f017d8 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -18,6 +18,7 @@ ACPI Support
> debug
> aml-debugger
> apei/output_format
> + apei/einj
> gpio-properties
> i2c-muxes
> acpi-lid

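For readers following the injection example in this patch, the flags bits and the param2 page mask can be worked through in Python. This is a sketch under stated assumptions: the constant and function names are invented for illustration, while the bit meanings and the expected values are the ones given in the quoted document.

```python
# Bit meanings as listed in the document for the flags bitmask.
FLAG_APIC_VALID = 1 << 0    # Bit 0: processor APIC field valid (param3)
FLAG_MEMORY_VALID = 1 << 1  # Bit 1: memory address and mask valid (param1, param2)
FLAG_PCIE_VALID = 1 << 2    # Bit 2: PCIe (seg,bus,dev,fn) valid (param4)

U64_MASK = (1 << 64) - 1


def page_mask(shift=12):
    """64-bit equivalent of the shell expression $((-1 << shift))."""
    return (-1 << shift) & U64_MASK


def processor_error_flags():
    """Flags value the document suggests for processor-related errors."""
    return FLAG_APIC_VALID | FLAG_MEMORY_VALID


if __name__ == "__main__":
    print(f"param2 = {page_mask():#x}")
    print(f"flags  = {processor_error_flags():#x}")
```

The mask matches the 0xfffffffffffff000 comment in the shell example, and the flags value matches the 0x3 mentioned for processor errors.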


Thanks,
Mauro

2019-04-24 15:06:46

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 25/63] Documentation: add Linux PCI to Sphinx TOC tree

On Wed, 24 Apr 2019 00:28:54 +0800
Changbin Du <[email protected]> wrote:

> Add a index.rst for PCI subsystem. More docs will be added later.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/index.rst | 9 +++++++++

In a past discussion on the docs ML, we agreed to use lowercase for new
stuff. My suggestion here would be to use lowercase for "pci".

Also, there's already a pci directory under driver-api, added by this
commit:

commit fcc78f9c22474d60c65d522e50ea07006ec1b9fc
Author: Logan Gunthorpe <[email protected]>
Date: Thu Oct 4 15:27:39 2018 -0600

docs-rst: Add a new directory for PCI documentation

I would just add a new section at Documentation/driver-api/pci/index.rst
with something like:

Legacy PCI documentation
========================

.. note::

   The files here were written a long time ago and need some serious
   work. Use their contents with caution.

.. toctree::
   :maxdepth: 1

   <files converted from Documentation/PCI>

And add those documents from Documentation/PCI into it.

> Documentation/index.rst | 1 +
> 2 files changed, 10 insertions(+)
> create mode 100644 Documentation/PCI/index.rst
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> new file mode 100644
> index 000000000000..c2f8728d11cf
> --- /dev/null
> +++ b/Documentation/PCI/index.rst
> @@ -0,0 +1,9 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================
> +Linux PCI Bus Subsystem
> +=======================
> +
> +.. toctree::
> + :maxdepth: 2
> + :numbered:
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index fdfa85c56a50..d80138284e0f 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -100,6 +100,7 @@ needed).
> filesystems/index
> vm/index
> bpf/index
> + PCI/index
> misc-devices/index
>
> Architecture-specific documentation



Thanks,
Mauro

2019-04-24 15:22:42

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 26/63] Documentation: PCI: convert pci.txt to reST

On Wed, 24 Apr 2019 00:28:55 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/index.rst | 2 +
> Documentation/PCI/{pci.txt => pci.rst} | 267 +++++++++++++------------
> 2 files changed, 140 insertions(+), 129 deletions(-)
> rename Documentation/PCI/{pci.txt => pci.rst} (78%)
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index c2f8728d11cf..7babf43709b0 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -7,3 +7,5 @@ Linux PCI Bus Subsystem
> .. toctree::
> :maxdepth: 2
> :numbered:
> +
> + pci

See my comments on patch 25/63. They apply to all PCI stuff,
so I won't keep repeating them. Anyway, the final decision with
regard to file naming belongs to the docs maintainer and to
the PCI maintainer.

> diff --git a/Documentation/PCI/pci.txt b/Documentation/PCI/pci.rst
> similarity index 78%
> rename from Documentation/PCI/pci.txt
> rename to Documentation/PCI/pci.rst
> index badb26ac33dc..29ddd2e9177a 100644
> --- a/Documentation/PCI/pci.txt
> +++ b/Documentation/PCI/pci.rst

I would either rename this file or Documentation/driver-api/pci/pci.rst.

Even if the decision is to keep those in different directories, it
sounds like a very bad idea to me to keep two files with different
content and identical names in different directories that belong to
the same subsystem.

@PCI maintainers:

The PCI SUBSYSTEM section of the MAINTAINERS file is missing
an entry for Documentation/driver-api/pci/.

> @@ -1,10 +1,12 @@
> +.. SPDX-License-Identifier: GPL-2.0
>
> - How To Write Linux PCI Drivers
> +==============================
> +How To Write Linux PCI Drivers
> +==============================
>
> - by Martin Mares <[email protected]> on 07-Feb-2000
> - updated by Grant Grundler <[email protected]> on 23-Dec-2006
> +:Authors: - Martin Mares <[email protected]>
> + - Grant Grundler <[email protected]>
>
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> The world of PCI is vast and full of (mostly unpleasant) surprises.
> Since each CPU architecture implements different chip-sets and PCI devices
> have different requirements (erm, "features"), the result is the PCI support
> @@ -26,8 +28,8 @@ Please send questions/comments/patches about Linux PCI API to the
>
>
>
> -0. Structure of PCI drivers
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Structure of PCI drivers
> +========================
> PCI drivers "discover" PCI devices in a system via pci_register_driver().
> Actually, it's the other way around. When the PCI generic code discovers
> a new device, the driver with a matching "description" will be notified.
> @@ -42,24 +44,25 @@ pointers and thus dictates the high level structure of a driver.
> Once the driver knows about a PCI device and takes ownership, the
> driver generally needs to perform the following initialization:
>
> - Enable the device
> - Request MMIO/IOP resources
> - Set the DMA mask size (for both coherent and streaming DMA)
> - Allocate and initialize shared control data (pci_allocate_coherent())
> - Access device configuration space (if needed)
> - Register IRQ handler (request_irq())
> - Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
> - Enable DMA/processing engines
> + - Enable the device
> + - Request MMIO/IOP resources
> + - Set the DMA mask size (for both coherent and streaming DMA)
> + - Allocate and initialize shared control data (pci_allocate_coherent())
> + - Access device configuration space (if needed)
> + - Register IRQ handler (request_irq())
> + - Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
> + - Enable DMA/processing engines
>
> When done using the device, and perhaps the module needs to be unloaded,
> the driver needs to take the follow steps:
> - Disable the device from generating IRQs
> - Release the IRQ (free_irq())
> - Stop all DMA activity
> - Release DMA buffers (both streaming and coherent)
> - Unregister from other subsystems (e.g. scsi or netdev)
> - Release MMIO/IOP resources
> - Disable the device
> +
> + - Disable the device from generating IRQs
> + - Release the IRQ (free_irq())
> + - Stop all DMA activity
> + - Release DMA buffers (both streaming and coherent)
> + - Unregister from other subsystems (e.g. scsi or netdev)
> + - Release MMIO/IOP resources
> + - Disable the device
>
> Most of these topics are covered in the following sections.
> For the rest look at LDD3 or <linux/pci.h> .
> @@ -70,13 +73,12 @@ completely empty or just returning an appropriate error codes to avoid
> lots of ifdefs in the drivers.
>
>
> -
> -1. pci_register_driver() call
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +pci_register_driver() call
> +==========================
>
> PCI device drivers call pci_register_driver() during their
> initialization with a pointer to a structure describing the driver
> -(struct pci_driver):
> +(struct pci_driver)::
>
> field name Description
> ---------- ------------------------------------------------------
> @@ -125,7 +127,7 @@ initialization with a pointer to a structure describing the driver
> The ID table is an array of struct pci_device_id entries ending with an
> all-zero entry. Definitions with static const are generally preferred.

It would be better to format this as a table, e.g.:

=============== =======================================================
field name      Description
=============== =======================================================
id_table        Pointer to table of device ID's the driver is
                interested in. Most drivers should export this
                table using MODULE_DEVICE_TABLE(pci,...).

probe           This probing function gets called (during execution
                of pci_register_driver() for already existing
                devices or later if a new device gets inserted) for
                all PCI devices which match the ID table and are not
                "owned" by the other drivers yet. This function gets
                passed a `struct pci_dev *` for each device whose
                entry in the ID table matches the device. The probe
                function returns zero when the driver chooses to
                take "ownership" of the device or an error code
                (negative number) otherwise.
                The probe function always gets called from process
                context, so it can sleep.

remove          The remove() function gets called whenever a device
                being handled by this driver is removed (either during
                deregistration of the driver or when it's manually
                pulled out of a hot-pluggable slot).
                The remove function always gets called from process
                context, so it can sleep.

suspend         Put device into low power state.
suspend_late    Put device into low power state.

resume_early    Wake device from low power state.
resume          Wake device from low power state.

                (Please see Documentation/power/pci.rst for
                descriptions of PCI Power Management and the
                related functions.)

shutdown        Hook into reboot_notifier_list (kernel/sys.c).
                Intended to stop any idling DMA operations.
                Useful for enabling wake-on-lan (NIC) or changing
                the power state of a device before reboot.
                e.g. drivers/net/e100.c.

err_handler     See Documentation/PCI/pci-error-recovery.rst
=============== =======================================================


>
> -Each entry consists of:
> +Each entry consists of::
>
> vendor,device Vendor and device ID to match (or PCI_ANY_ID)

Same here:

Each entry consists of:

==================== =======================================================
vendor, device       Vendor and device ID to match (or PCI_ANY_ID)

subvendor, subdevice Subsystem vendor and device ID to match (or PCI_ANY_ID)

class                Device class, subclass, and "interface" to match.
                     See Appendix D of the PCI Local Bus Spec or
                     include/linux/pci_ids.h for a full list of classes.
                     Most drivers do not need to specify class/class_mask
                     as vendor/device is normally sufficient.

class_mask           Limit which sub-fields of the class field are compared.
                     See drivers/scsi/sym53c8xx_2/ for example of usage.

driver_data          Data private to the driver.
                     Most drivers don't need to use driver_data field.
                     Best practice is to use driver_data as an index
                     into a static list of equivalent device types,
                     instead of using it as a pointer.
==================== =======================================================
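
As a side note, the way these fields combine is easier to see in code. Below
is a rough userspace sketch of the matching rule the table describes. It is
not the kernel's implementation: `struct mock_id`, `struct mock_dev`,
`MOCK_ANY_ID` and the function names are made up for illustration, and
`dev_class` stands in for the `class` field. The one point it tries to show
is that class_mask selects which bits of the class value are compared, so a
mask of 0 makes the class field match anything.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_ANY_ID 0xffffffffu	/* stands in for PCI_ANY_ID */

/* Hypothetical, simplified stand-in for struct pci_device_id. */
struct mock_id {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t dev_class, class_mask;
};

/* Hypothetical stand-in for the identifying fields of a device. */
struct mock_dev {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t dev_class;
};

/* A wildcard entry matches any value; otherwise the values must be equal. */
static bool field_matches(uint32_t want, uint32_t have)
{
	return want == MOCK_ANY_ID || want == have;
}

/* Only the class bits selected by class_mask take part in the comparison. */
static bool mock_match_one(const struct mock_id *id, const struct mock_dev *dev)
{
	return field_matches(id->vendor, dev->vendor) &&
	       field_matches(id->device, dev->device) &&
	       field_matches(id->subvendor, dev->subvendor) &&
	       field_matches(id->subdevice, dev->subdevice) &&
	       !((id->dev_class ^ dev->dev_class) & id->class_mask);
}

/* Walk a table that ends with an all-zero entry; return the first match. */
static const struct mock_id *mock_match_table(const struct mock_id *ids,
					      const struct mock_dev *dev)
{
	for (; ids->vendor || ids->subvendor || ids->class_mask; ids++)
		if (mock_match_one(ids, dev))
			return ids;
	return NULL;
}
```

With a table holding one vendor/device entry and one class-only entry
(class 0x020000, mask 0xffff00), a device with that exact vendor/device pair
hits the first entry, any other network controller hits the second, and a
display controller matches neither.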


>
> @@ -160,9 +162,10 @@ echo "vendor device subvendor subdevice class class_mask driver_data" > \
> All fields are passed in as hexadecimal values (no leading 0x).
> The vendor and device fields are mandatory, the others are optional. Users
> need pass only as many optional fields as necessary:
> - o subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
> - o class and classmask fields default to 0
> - o driver_data defaults to 0UL.
> +
> + - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
> + - class and classmask fields default to 0
> + - driver_data defaults to 0UL.
>
> Note that driver_data must match the value used by any of the pci_device_id
> entries defined in the driver. This makes the driver_data field mandatory
> @@ -175,29 +178,30 @@ When the driver exits, it just calls pci_unregister_driver() and the PCI layer
> automatically calls the remove hook for all devices handled by the driver.
>
>
> -1.1 "Attributes" for driver functions/data
> +"Attributes" for driver functions/data
> +--------------------------------------
>
> Please mark the initialization and cleanup functions where appropriate
> -(the corresponding macros are defined in <linux/init.h>):
> +(the corresponding macros are defined in <linux/init.h>)::
>
> __init Initialization code. Thrown away after the driver
> initializes.
> __exit Exit code. Ignored for non-modular drivers.

Same here:

Please mark the initialization and cleanup functions where appropriate
(the corresponding macros are defined in <linux/init.h>):

=============== =================================================
__init          Initialization code. Thrown away after the driver
                initializes.
__exit          Exit code. Ignored for non-modular drivers.
=============== =================================================


>
> Tips on when/where to use the above attributes:
> - o The module_init()/module_exit() functions (and all
> + - The module_init()/module_exit() functions (and all
> initialization functions called _only_ from these)
> should be marked __init/__exit.
>
> - o Do not mark the struct pci_driver.
> + - Do not mark the struct pci_driver.
>
> - o Do NOT mark a function if you are not sure which mark to use.
> + - Do NOT mark a function if you are not sure which mark to use.
> Better to not mark the function than mark the function wrong.
>
>
>
> -2. How to find PCI devices manually
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +How to find PCI devices manually
> +================================
>
> PCI drivers should have a really good reason for not using the
> pci_register_driver() interface to search for PCI devices.
> @@ -207,17 +211,17 @@ E.g. combined serial/parallel port/floppy controller.
>
> A manual search may be performed using the following constructs:
>
> -Searching by vendor and device ID:
> +Searching by vendor and device ID::
>
> struct pci_dev *dev = NULL;
> while (dev = pci_get_device(VENDOR_ID, DEVICE_ID, dev))
> configure_device(dev);
>
> -Searching by class ID (iterate in a similar way):
> +Searching by class ID (iterate in a similar way)::
>
> pci_get_class(CLASS_ID, dev)
>
> -Searching by both vendor/device and subsystem vendor/device ID:
> +Searching by both vendor/device and subsystem vendor/device ID::
>
> pci_get_subsys(VENDOR_ID,DEVICE_ID, SUBSYS_VENDOR_ID, SUBSYS_DEVICE_ID, dev).
>
> @@ -231,20 +235,20 @@ decrement the reference count on these devices by calling pci_dev_put().
>
>
>
> -3. Device Initialization Steps
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Device Initialization Steps
> +===========================
>
> As noted in the introduction, most PCI drivers need the following steps
> for device initialization:
>
> - Enable the device
> - Request MMIO/IOP resources
> - Set the DMA mask size (for both coherent and streaming DMA)
> - Allocate and initialize shared control data (pci_allocate_coherent())
> - Access device configuration space (if needed)
> - Register IRQ handler (request_irq())
> - Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
> - Enable DMA/processing engines.
> + - Enable the device
> + - Request MMIO/IOP resources
> + - Set the DMA mask size (for both coherent and streaming DMA)
> + - Allocate and initialize shared control data (pci_allocate_coherent())
> + - Access device configuration space (if needed)
> + - Register IRQ handler (request_irq())
> + - Initialize non-PCI (i.e. LAN/SCSI/etc parts of the chip)
> + - Enable DMA/processing engines.
>
> The driver can access PCI config space registers at any time.
> (Well, almost. When running BIST, config space can go away...but
> @@ -252,17 +256,18 @@ that will just result in a PCI Bus Master Abort and config reads
> will return garbage).
>
>
> -3.1 Enable the PCI device
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> +Enable the PCI device
> +---------------------
> Before touching any device registers, the driver needs to enable
> the PCI device by calling pci_enable_device(). This will:
> - o wake up the device if it was in suspended state,
> - o allocate I/O and memory regions of the device (if BIOS did not),
> - o allocate an IRQ (if BIOS did not).
>
> -NOTE: pci_enable_device() can fail! Check the return value.
> + - wake up the device if it was in suspended state,
> + - allocate I/O and memory regions of the device (if BIOS did not),
> + - allocate an IRQ (if BIOS did not).
> +
> +.. note:: pci_enable_device() can fail! Check the return value.
>
> -[ OS BUG: we don't check resource allocations before enabling those
> +.. warning:: OS BUG: we don't check resource allocations before enabling those
> resources. The sequence would make more sense if we called
> pci_request_resources() before calling pci_enable_device().
> Currently, the device drivers can't detect the bug when when two
> @@ -271,7 +276,7 @@ NOTE: pci_enable_device() can fail! Check the return value.
>
> This has been discussed before but not changed as of 2.6.19:
> http://lkml.org/lkml/2006/3/2/194
> -]
> +
>
> pci_set_master() will enable DMA by setting the bus master bit
> in the PCI_COMMAND register. It also fixes the latency timer value if
> @@ -288,8 +293,8 @@ pci_try_set_mwi() to have the system do its best effort at enabling
> Mem-Wr-Inval.
>
>
> -3.2 Request MMIO/IOP resources
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Request MMIO/IOP resources
> +--------------------------
> Memory (MMIO), and I/O port addresses should NOT be read directly
> from the PCI device config space. Use the values in the pci_dev structure
> as the PCI "bus address" might have been remapped to a "host physical"
> @@ -304,9 +309,9 @@ Conversely, drivers should call pci_release_region() AFTER
> calling pci_disable_device().
> The idea is to prevent two devices colliding on the same address range.
>
> -[ See OS BUG comment above. Currently (2.6.19), The driver can only
> +.. tip:: See OS BUG comment above. Currently (2.6.19), The driver can only
> determine MMIO and IO Port resource availability _after_ calling
> - pci_enable_device(). ]
> + pci_enable_device().

Hmm... indentation seems to be wrong here

>
> Generic flavors of pci_request_region() are request_mem_region()
> (for MMIO ranges) and request_region() (for IO Port ranges).
> @@ -316,12 +321,12 @@ BARs.
> Also see pci_request_selected_regions() below.
>
>
> -3.3 Set the DMA mask size
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> -[ If anything below doesn't make sense, please refer to
> +Set the DMA mask size
> +---------------------
> +.. note:: If anything below doesn't make sense, please refer to
> Documentation/DMA-API.txt. This section is just a reminder that
> drivers need to indicate DMA capabilities of the device and is not
> - an authoritative source for DMA interfaces. ]
> + an authoritative source for DMA interfaces.

and here.

To be frank, when handling note/tip/warning/..., I prefer to indent
it like:


.. note::

   If anything below doesn't make sense, please refer to
   Documentation/DMA-API.txt. This section is just a reminder that
   drivers need to indicate DMA capabilities of the device and is not
   an authoritative source for DMA interfaces.

This makes it more visible when reading the file as plain text.


>
> While all drivers should explicitly indicate the DMA capability
> (e.g. 32 or 64 bit) of the PCI bus master, devices with more than
> @@ -342,23 +347,23 @@ Many 64-bit "PCI" devices (before PCI-X) and some PCI-X devices are
> ("consistent") data.
>
>
> -3.4 Setup shared control data
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Setup shared control data
> +-------------------------
> Once the DMA masks are set, the driver can allocate "consistent" (a.k.a. shared)
> memory. See Documentation/DMA-API.txt for a full description of
> the DMA APIs. This section is just a reminder that it needs to be done
> before enabling DMA on the device.
>
>
> -3.5 Initialize device registers
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Initialize device registers
> +---------------------------
> Some drivers will need specific "capability" fields programmed
> or other "vendor specific" register initialized or reset.
> E.g. clearing pending interrupts.
>
>
> -3.6 Register IRQ handler
> -~~~~~~~~~~~~~~~~~~~~~~~~
> +Register IRQ handler
> +--------------------
> While calling request_irq() is the last step described here,
> this is often just another intermediate step to initialize a device.
> This step can often be deferred until the device is opened for use.
> @@ -396,6 +401,7 @@ and msix_enabled flags in the pci_dev structure after calling
> pci_alloc_irq_vectors.
>
> There are (at least) two really good reasons for using MSI:
> +
> 1) MSI is an exclusive interrupt vector by definition.
> This means the interrupt handler doesn't have to verify
> its device caused the interrupt.
> @@ -411,23 +417,23 @@ of MSI/MSI-X usage.
>
>
>
> -4. PCI device shutdown
> -~~~~~~~~~~~~~~~~~~~~~~~
> +PCI device shutdown
> +===================
>
> When a PCI device driver is being unloaded, most of the following
> steps need to be performed:
>
> - Disable the device from generating IRQs
> - Release the IRQ (free_irq())
> - Stop all DMA activity
> - Release DMA buffers (both streaming and consistent)
> - Unregister from other subsystems (e.g. scsi or netdev)
> - Disable device from responding to MMIO/IO Port addresses
> - Release MMIO/IO Port resource(s)
> + - Disable the device from generating IRQs
> + - Release the IRQ (free_irq())
> + - Stop all DMA activity
> + - Release DMA buffers (both streaming and consistent)
> + - Unregister from other subsystems (e.g. scsi or netdev)
> + - Disable device from responding to MMIO/IO Port addresses
> + - Release MMIO/IO Port resource(s)
>
>
> -4.1 Stop IRQs on the device
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Stop IRQs on the device
> +-----------------------
> How to do this is chip/device specific. If it's not done, it opens
> the possibility of a "screaming interrupt" if (and only if)
> the IRQ is shared with another device.
> @@ -446,16 +452,16 @@ MSI and MSI-X are defined to be exclusive interrupts and thus
> are not susceptible to the "screaming interrupt" problem.
>
>
> -4.2 Release the IRQ
> -~~~~~~~~~~~~~~~~~~~
> +Release the IRQ
> +---------------
> Once the device is quiesced (no more IRQs), one can call free_irq().
> This function will return control once any pending IRQs are handled,
> "unhook" the drivers IRQ handler from that IRQ, and finally release
> the IRQ if no one else is using it.
>
>
> -4.3 Stop all DMA activity
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> +Stop all DMA activity
> +---------------------
> It's extremely important to stop all DMA operations BEFORE attempting
> to deallocate DMA control data. Failure to do so can result in memory
> corruption, hangs, and on some chip-sets a hard crash.
> @@ -467,8 +473,8 @@ While this step sounds obvious and trivial, several "mature" drivers
> didn't get this step right in the past.
>
>
> -4.4 Release DMA buffers
> -~~~~~~~~~~~~~~~~~~~~~~~
> +Release DMA buffers
> +-------------------
> Once DMA is stopped, clean up streaming DMA first.
> I.e. unmap data buffers and return buffers to "upstream"
> owners if there is one.
> @@ -478,8 +484,8 @@ Then clean up "consistent" buffers which contain the control data.
> See Documentation/DMA-API.txt for details on unmapping interfaces.
>
>
> -4.5 Unregister from other subsystems
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Unregister from other subsystems
> +--------------------------------
> Most low level PCI device drivers support some other subsystem
> like USB, ALSA, SCSI, NetDev, Infiniband, etc. Make sure your
> driver isn't losing resources from that other subsystem.
> @@ -487,31 +493,31 @@ If this happens, typically the symptom is an Oops (panic) when
> the subsystem attempts to call into a driver that has been unloaded.
>
>
> -4.6 Disable Device from responding to MMIO/IO Port addresses
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Disable Device from responding to MMIO/IO Port addresses
> +--------------------------------------------------------
> io_unmap() MMIO or IO Port resources and then call pci_disable_device().
> This is the symmetric opposite of pci_enable_device().
> Do not access device registers after calling pci_disable_device().
>
>
> -4.7 Release MMIO/IO Port Resource(s)
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Release MMIO/IO Port Resource(s)
> +--------------------------------
> Call pci_release_region() to mark the MMIO or IO Port range as available.
> Failure to do so usually results in the inability to reload the driver.
>
>
>
> -5. How to access PCI config space
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +How to access PCI config space
> +==============================
>
> -You can use pci_(read|write)_config_(byte|word|dword) to access the config
> -space of a device represented by struct pci_dev *. All these functions return 0
> -when successful or an error code (PCIBIOS_...) which can be translated to a text
> -string by pcibios_strerror. Most drivers expect that accesses to valid PCI
> +You can use `pci_(read|write)_config_(byte|word|dword)` to access the config
> +space of a device represented by `struct pci_dev *`. All these functions return
> +0 when successful or an error code (`PCIBIOS_...`) which can be translated to a
> +text string by pcibios_strerror. Most drivers expect that accesses to valid PCI
> devices don't fail.
>
> If you don't have a struct pci_dev available, you can call
> -pci_bus_(read|write)_config_(byte|word|dword) to access a given device
> +`pci_bus_(read|write)_config_(byte|word|dword)` to access a given device
> and function on that bus.
>
> If you access fields in the standard portion of the config header, please
> @@ -522,28 +528,29 @@ pci_find_capability() for the particular capability and it will find the
> corresponding register block for you.
>
>
> +Other interesting functions
> +===========================
>
> -6. Other interesting functions
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +::
>
> -pci_get_domain_bus_and_slot() Find pci_dev corresponding to given domain,
> - bus and slot and number. If the device is
> - found, its reference count is increased.
> -pci_set_power_state() Set PCI Power Management state (0=D0 ... 3=D3)
> -pci_find_capability() Find specified capability in device's capability
> + pci_get_domain_bus_and_slot() Find pci_dev corresponding to given domain,
> + bus and slot and number. If the device is
> + found, its reference count is increased.
> + pci_set_power_state() Set PCI Power Management state (0=D0 ... 3=D3)
> + pci_find_capability() Find specified capability in device's capability
> list.
> -pci_resource_start() Returns bus start address for a given PCI region
> -pci_resource_end() Returns bus end address for a given PCI region
> -pci_resource_len() Returns the byte length of a PCI region
> -pci_set_drvdata() Set private driver data pointer for a pci_dev
> -pci_get_drvdata() Return private driver data pointer for a pci_dev
> -pci_set_mwi() Enable Memory-Write-Invalidate transactions.
> -pci_clear_mwi() Disable Memory-Write-Invalidate transactions.
> + pci_resource_start() Returns bus start address for a given PCI region
> + pci_resource_end() Returns bus end address for a given PCI region
> + pci_resource_len() Returns the byte length of a PCI region
> + pci_set_drvdata() Set private driver data pointer for a pci_dev
> + pci_get_drvdata() Return private driver data pointer for a pci_dev
> + pci_set_mwi() Enable Memory-Write-Invalidate transactions.
> + pci_clear_mwi() Disable Memory-Write-Invalidate transactions.

Better to use a table here:

=============================== ================================================
pci_get_domain_bus_and_slot()   Find pci_dev corresponding to given domain,
                                bus and slot and number. If the device is
                                found, its reference count is increased.
pci_set_power_state()           Set PCI Power Management state (0=D0 ... 3=D3)
pci_find_capability()           Find specified capability in device's capability
                                list.
pci_resource_start()            Returns bus start address for a given PCI region
pci_resource_end()              Returns bus end address for a given PCI region
pci_resource_len()              Returns the byte length of a PCI region
pci_set_drvdata()               Set private driver data pointer for a pci_dev
pci_get_drvdata()               Return private driver data pointer for a pci_dev
pci_set_mwi()                   Enable Memory-Write-Invalidate transactions.
pci_clear_mwi()                 Disable Memory-Write-Invalidate transactions.
=============================== ================================================
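
A small aside on how the start/end/len entries relate to each other, since
the table lists them separately. The sketch below is a userspace
illustration only, assuming the usual convention that the end address is
inclusive and that an unset region (start and end both 0) reports length 0.
`struct mock_region` and `mock_region_len()` are hypothetical names, not
kernel API.

```c
#include <stdint.h>

/* Hypothetical stand-in for one PCI region; 'end' is inclusive. */
struct mock_region {
	uint64_t start;
	uint64_t end;
};

/* Byte length of a region, derived from its start and inclusive end. */
static uint64_t mock_region_len(const struct mock_region *r)
{
	/* Assumption: an unset region (start == end == 0) has no length. */
	if (r->start == 0 && r->end == r->start)
		return 0;
	return r->end - r->start + 1;
}
```

So a region spanning 0xfebc0000 through 0xfebdffff is 0x20000 bytes long,
not 0x1ffff.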


>
>
>
> -7. Miscellaneous hints
> -~~~~~~~~~~~~~~~~~~~~~~
> +Miscellaneous hints
> +===================
>
> When displaying PCI device names to the user (for example when a driver wants
> to tell the user what card has it found), please use pci_name(pci_dev).
> @@ -560,8 +567,8 @@ to be handled by platform and generic code, not individual drivers.
>
>
>
> -8. Vendor and device identifications
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Vendor and device identifications
> +=================================
>
> Do not add new device or vendor IDs to include/linux/pci_ids.h unless they
> are shared across multiple drivers. You can add private definitions in
> @@ -576,18 +583,20 @@ and https://github.com/pciutils/pciids.
>
>
>
> -9. Obsolete functions
> -~~~~~~~~~~~~~~~~~~~~~
> +Obsolete functions
> +==================
>
> There are several functions which you might come across when trying to
> port an old driver to the new PCI interface. They are no longer present
> in the kernel as they aren't compatible with hotplug or PCI domains or
> having sane locking.
>
> -pci_find_device() Superseded by pci_get_device()
> -pci_find_subsys() Superseded by pci_get_subsys()
> -pci_find_slot() Superseded by pci_get_domain_bus_and_slot()
> -pci_get_slot() Superseded by pci_get_domain_bus_and_slot()
> +::
> +
> + pci_find_device() Superseded by pci_get_device()
> + pci_find_subsys() Superseded by pci_get_subsys()
> + pci_find_slot() Superseded by pci_get_domain_bus_and_slot()
> + pci_get_slot() Superseded by pci_get_domain_bus_and_slot()

A table works better here:

======================= ===========================================
pci_find_device()       Superseded by pci_get_device()
pci_find_subsys()       Superseded by pci_get_subsys()
pci_find_slot()         Superseded by pci_get_domain_bus_and_slot()
pci_get_slot()          Superseded by pci_get_domain_bus_and_slot()
======================= ===========================================


>
>
> The alternative is the traditional PCI device driver that walks PCI
> @@ -595,8 +604,8 @@ device lists. This is still possible but discouraged.
>
>
>
> -10. MMIO Space and "Write Posting"
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +MMIO Space and "Write Posting"
> +==============================
>
> Converting a driver from using I/O Port space to using MMIO space
> often requires some additional changes. Specifically, "write posting"
> @@ -609,14 +618,14 @@ the CPU before the transaction has reached its destination.
>
> Thus, timing sensitive code should add readl() where the CPU is
> expected to wait before doing other work. The classic "bit banging"
> -sequence works fine for I/O Port space:
> +sequence works fine for I/O Port space::
>
> for (i = 8; --i; val >>= 1) {
> outb(val & 1, ioport_reg); /* write bit */
> udelay(10);
> }
>
> -The same sequence for MMIO space should be:
> +The same sequence for MMIO space should be::
>
> for (i = 8; --i; val >>= 1) {
> writeb(val & 1, mmio_reg); /* write bit */



Thanks,
Mauro

2019-04-24 15:25:25

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 27/63] Documentation: PCI: convert PCIEBUS-HOWTO.txt to reST

On Wed, 24 Apr 2019 00:28:56 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> .../{PCIEBUS-HOWTO.txt => PCIEBUS-HOWTO.rst} | 140 ++++++++++--------
> Documentation/PCI/index.rst | 1 +
> 2 files changed, 82 insertions(+), 59 deletions(-)
> rename Documentation/PCI/{PCIEBUS-HOWTO.txt => PCIEBUS-HOWTO.rst} (70%)

Names in lowercase after rename, please.

For the changes themselves to the txt file:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

>
> diff --git a/Documentation/PCI/PCIEBUS-HOWTO.txt b/Documentation/PCI/PCIEBUS-HOWTO.rst
> similarity index 70%
> rename from Documentation/PCI/PCIEBUS-HOWTO.txt
> rename to Documentation/PCI/PCIEBUS-HOWTO.rst
> index 15f0bb3b5045..f882ff62c51f 100644
> --- a/Documentation/PCI/PCIEBUS-HOWTO.txt
> +++ b/Documentation/PCI/PCIEBUS-HOWTO.rst
> @@ -1,16 +1,23 @@
> - The PCI Express Port Bus Driver Guide HOWTO
> - Tom L Nguyen [email protected]
> - 11/03/2004
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> -1. About this guide
> +===========================================
> +The PCI Express Port Bus Driver Guide HOWTO
> +===========================================
> +
> +:Author: Tom L Nguyen [email protected] 11/03/2004
> +:Copyright: |copy| 2004 Intel Corporation
> +
> +About this guide
> +================
>
> This guide describes the basics of the PCI Express Port Bus driver
> and provides information on how to enable the service drivers to
> register/unregister with the PCI Express Port Bus Driver.
>
> -2. Copyright 2004 Intel Corporation
>
> -3. What is the PCI Express Port Bus Driver
> +What is the PCI Express Port Bus Driver
> +=======================================
>
> A PCI Express Port is a logical PCI-PCI Bridge structure. There
> are two types of PCI Express Port: the Root Port and the Switch
> @@ -30,7 +37,8 @@ support (AER), and virtual channel support (VC). These services may
> be handled by a single complex driver or be individually distributed
> and handled by corresponding service drivers.
>
> -4. Why use the PCI Express Port Bus Driver?
> +Why use the PCI Express Port Bus Driver?
> +========================================
>
> In existing Linux kernels, the Linux Device Driver Model allows a
> physical device to be handled by only a single driver. The PCI
> @@ -51,28 +59,31 @@ PCI Express Ports and distributes all provided service requests
> to the corresponding service drivers as required. Some key
> advantages of using the PCI Express Port Bus driver are listed below:
>
> - - Allow multiple service drivers to run simultaneously on
> - a PCI-PCI Bridge Port device.
> + - Allow multiple service drivers to run simultaneously on
> + a PCI-PCI Bridge Port device.
>
> - - Allow service drivers implemented in an independent
> - staged approach.
> + - Allow service drivers implemented in an independent
> + staged approach.
>
> - - Allow one service driver to run on multiple PCI-PCI Bridge
> - Port devices.
> + - Allow one service driver to run on multiple PCI-PCI Bridge
> + Port devices.
>
> - - Manage and distribute resources of a PCI-PCI Bridge Port
> - device to requested service drivers.
> + - Manage and distribute resources of a PCI-PCI Bridge Port
> + device to requested service drivers.
>
> -5. Configuring the PCI Express Port Bus Driver vs. Service Drivers
> +Configuring the PCI Express Port Bus Driver vs. Service Drivers
> +===============================================================
>
> -5.1 Including the PCI Express Port Bus Driver Support into the Kernel
> +Including the PCI Express Port Bus Driver Support into the Kernel
> +-----------------------------------------------------------------
>
> Including the PCI Express Port Bus driver depends on whether the PCI
> Express support is included in the kernel config. The kernel will
> automatically include the PCI Express Port Bus driver as a kernel
> driver when the PCI Express support is enabled in the kernel.
>
> -5.2 Enabling Service Driver Support
> +Enabling Service Driver Support
> +-------------------------------
>
> PCI device drivers are implemented based on Linux Device Driver Model.
> All service drivers are PCI device drivers. As discussed above, it is
> @@ -89,9 +100,11 @@ header file /include/linux/pcieport_if.h, before calling these APIs.
> Failure to do so will result an identity mismatch, which prevents
> the PCI Express Port Bus driver from loading a service driver.
>
> -5.2.1 pcie_port_service_register
> +pcie_port_service_register
> +~~~~~~~~~~~~~~~~~~~~~~~~~~
> +::
>
> -int pcie_port_service_register(struct pcie_port_service_driver *new)
> + int pcie_port_service_register(struct pcie_port_service_driver *new)
>
> This API replaces the Linux Driver Model's pci_register_driver API. A
> service driver should always calls pcie_port_service_register at
> @@ -99,69 +112,76 @@ module init. Note that after service driver being loaded, calls
> such as pci_enable_device(dev) and pci_set_master(dev) are no longer
> necessary since these calls are executed by the PCI Port Bus driver.
>
> -5.2.2 pcie_port_service_unregister
> +pcie_port_service_unregister
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +::
>
> -void pcie_port_service_unregister(struct pcie_port_service_driver *new)
> + void pcie_port_service_unregister(struct pcie_port_service_driver *new)
>
> pcie_port_service_unregister replaces the Linux Driver Model's
> pci_unregister_driver. It's always called by service driver when a
> module exits.
>
> -5.2.3 Sample Code
> +Sample Code
> +~~~~~~~~~~~
>
> Below is sample service driver code to initialize the port service
> driver data structure.
> +::
>
> -static struct pcie_port_service_id service_id[] = { {
> - .vendor = PCI_ANY_ID,
> - .device = PCI_ANY_ID,
> - .port_type = PCIE_RC_PORT,
> - .service_type = PCIE_PORT_SERVICE_AER,
> - }, { /* end: all zeroes */ }
> -};
> + static struct pcie_port_service_id service_id[] = { {
> + .vendor = PCI_ANY_ID,
> + .device = PCI_ANY_ID,
> + .port_type = PCIE_RC_PORT,
> + .service_type = PCIE_PORT_SERVICE_AER,
> + }, { /* end: all zeroes */ }
> + };
>
> -static struct pcie_port_service_driver root_aerdrv = {
> - .name = (char *)device_name,
> - .id_table = &service_id[0],
> + static struct pcie_port_service_driver root_aerdrv = {
> + .name = (char *)device_name,
> + .id_table = &service_id[0],
>
> - .probe = aerdrv_load,
> - .remove = aerdrv_unload,
> + .probe = aerdrv_load,
> + .remove = aerdrv_unload,
>
> - .suspend = aerdrv_suspend,
> - .resume = aerdrv_resume,
> -};
> + .suspend = aerdrv_suspend,
> + .resume = aerdrv_resume,
> + };
>
> Below is a sample code for registering/unregistering a service
> driver.
> +::
>
> -static int __init aerdrv_service_init(void)
> -{
> - int retval = 0;
> + static int __init aerdrv_service_init(void)
> + {
> + int retval = 0;
>
> - retval = pcie_port_service_register(&root_aerdrv);
> - if (!retval) {
> - /*
> - * FIX ME
> - */
> - }
> - return retval;
> -}
> + retval = pcie_port_service_register(&root_aerdrv);
> + if (!retval) {
> + /*
> + * FIX ME
> + */
> + }
> + return retval;
> + }
>
> -static void __exit aerdrv_service_exit(void)
> -{
> - pcie_port_service_unregister(&root_aerdrv);
> -}
> + static void __exit aerdrv_service_exit(void)
> + {
> + pcie_port_service_unregister(&root_aerdrv);
> + }
>
> -module_init(aerdrv_service_init);
> -module_exit(aerdrv_service_exit);
> + module_init(aerdrv_service_init);
> + module_exit(aerdrv_service_exit);
>
> -6. Possible Resource Conflicts
> +Possible Resource Conflicts
> +===========================
>
> Since all service drivers of a PCI-PCI Bridge Port device are
> allowed to run simultaneously, below lists a few of possible resource
> conflicts with proposed solutions.
>
> -6.1 MSI and MSI-X Vector Resource
> +MSI and MSI-X Vector Resource
> +-----------------------------
>
> Once MSI or MSI-X interrupts are enabled on a device, it stays in this
> mode until they are disabled again. Since service drivers of the same
> @@ -179,7 +199,8 @@ driver. Service drivers should use (struct pcie_device*)dev->irq to
> call request_irq/free_irq. In addition, the interrupt mode is stored
> in the field interrupt_mode of struct pcie_device.
>
> -6.3 PCI Memory/IO Mapped Regions
> +PCI Memory/IO Mapped Regions
> +----------------------------
>
> Service drivers for PCI Express Power Management (PME), Advanced
> Error Reporting (AER), Hot-Plug (HP) and Virtual Channel (VC) access
> @@ -188,7 +209,8 @@ registers accessed are independent of each other. This patch assumes
> that all service drivers will be well behaved and not overwrite
> other service driver's configuration settings.
>
> -6.4 PCI Config Registers
> +PCI Config Registers
> +--------------------
>
> Each service driver runs its PCI config operations on its own
> capability structure except the PCI Express capability structure, in
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index 7babf43709b0..452723318405 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -9,3 +9,4 @@ Linux PCI Bus Subsystem
> :numbered:
>
> pci
> + PCIEBUS-HOWTO



Thanks,
Mauro

2019-04-24 15:26:43

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 28/63] Documentation: PCI: convert pci-iov-howto.txt to reST

On Wed, 24 Apr 2019 00:28:57 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/index.rst | 1 +
> .../{pci-iov-howto.txt => pci-iov-howto.rst} | 161 ++++++++++--------
> 2 files changed, 94 insertions(+), 68 deletions(-)
> rename Documentation/PCI/{pci-iov-howto.txt => pci-iov-howto.rst} (63%)

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index 452723318405..e1c19962a7f8 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -10,3 +10,4 @@ Linux PCI Bus Subsystem
>
> pci
> PCIEBUS-HOWTO
> + pci-iov-howto
> diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.rst
> similarity index 63%
> rename from Documentation/PCI/pci-iov-howto.txt
> rename to Documentation/PCI/pci-iov-howto.rst
> index d2a84151e99c..b9fd003206f1 100644
> --- a/Documentation/PCI/pci-iov-howto.txt
> +++ b/Documentation/PCI/pci-iov-howto.rst
> @@ -1,14 +1,19 @@
> - PCI Express I/O Virtualization Howto
> - Copyright (C) 2009 Intel Corporation
> - Yu Zhao <[email protected]>
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> - Update: November 2012
> - -- sysfs-based SRIOV enable-/disable-ment
> - Donald Dutile <[email protected]>
> +====================================
> +PCI Express I/O Virtualization Howto
> +====================================
>
> -1. Overview
> +:Copyright: |copy| 2009 Intel Corporation
> +:Authors: - Yu Zhao <[email protected]>
> + - Donald Dutile <[email protected]>
>
> -1.1 What is SR-IOV
> +Overview
> +========
> +
> +What is SR-IOV
> +--------------
>
> Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
> capability which makes one physical device appear as multiple virtual
> @@ -23,9 +28,11 @@ Memory Space, which is used to map its register set. VF device driver
> operates on the register set so it can be functional and appear as a
> real existing PCI device.
>
> -2. User Guide
> +User Guide
> +==========
>
> -2.1 How can I enable SR-IOV capability
> +How can I enable SR-IOV capability
> +----------------------------------
>
> Multiple methods are available for SR-IOV enablement.
> In the first method, the device driver (PF driver) will control the
> @@ -43,105 +50,123 @@ checks, e.g., check numvfs == 0 if enabling VFs, ensure
> numvfs <= totalvfs.
> The second method is the recommended method for new/future VF devices.
>
> -2.2 How can I use the Virtual Functions
> +How can I use the Virtual Functions
> +-----------------------------------
>
> The VF is treated as hot-plugged PCI devices in the kernel, so they
> should be able to work in the same way as real PCI devices. The VF
> requires device driver that is same as a normal PCI device's.
>
> -3. Developer Guide
> +Developer Guide
> +===============
>
> -3.1 SR-IOV API
> +SR-IOV API
> +----------
>
> To enable SR-IOV capability:
> -(a) For the first method, in the driver:
> +
> +(a) For the first method, in the driver::
> +
> int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
> - 'nr_virtfn' is number of VFs to be enabled.
> -(b) For the second method, from sysfs:
> +
> +'nr_virtfn' is number of VFs to be enabled.
> +
> +(b) For the second method, from sysfs::
> +
> echo 'nr_virtfn' > \
> /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_numvfs
>
> To disable SR-IOV capability:
> -(a) For the first method, in the driver:
> +
> +(a) For the first method, in the driver::
> +
> void pci_disable_sriov(struct pci_dev *dev);
> -(b) For the second method, from sysfs:
> +
> +(b) For the second method, from sysfs::
> +
> echo 0 > \
> /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_numvfs
>
> To enable auto probing VFs by a compatible driver on the host, run
> command below before enabling SR-IOV capabilities. This is the
> default behavior.
> +::
> +
> echo 1 > \
> /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe
>
> To disable auto probing VFs by a compatible driver on the host, run
> command below before enabling SR-IOV capabilities. Updating this
> entry will not affect VFs which are already probed.
> +::
> +
> echo 0 > \
> /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe
>
> -3.2 Usage example
> +Usage example
> +-------------
>
> Following piece of code illustrates the usage of the SR-IOV API.
> +::
>
> -static int dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
> -{
> - pci_enable_sriov(dev, NR_VIRTFN);
> + static int dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
> + {
> + pci_enable_sriov(dev, NR_VIRTFN);
>
> - ...
> -
> - return 0;
> -}
> + ...
>
> -static void dev_remove(struct pci_dev *dev)
> -{
> - pci_disable_sriov(dev);
> + return 0;
> + }
>
> - ...
> -}
> + static void dev_remove(struct pci_dev *dev)
> + {
> + pci_disable_sriov(dev);
>
> -static int dev_suspend(struct pci_dev *dev, pm_message_t state)
> -{
> - ...
> + ...
> + }
>
> - return 0;
> -}
> + static int dev_suspend(struct pci_dev *dev, pm_message_t state)
> + {
> + ...
>
> -static int dev_resume(struct pci_dev *dev)
> -{
> - ...
> + return 0;
> + }
>
> - return 0;
> -}
> + static int dev_resume(struct pci_dev *dev)
> + {
> + ...
>
> -static void dev_shutdown(struct pci_dev *dev)
> -{
> - ...
> -}
> + return 0;
> + }
>
> -static int dev_sriov_configure(struct pci_dev *dev, int numvfs)
> -{
> - if (numvfs > 0) {
> - ...
> - pci_enable_sriov(dev, numvfs);
> + static void dev_shutdown(struct pci_dev *dev)
> + {
> ...
> - return numvfs;
> }
> - if (numvfs == 0) {
> - ....
> - pci_disable_sriov(dev);
> - ...
> - return 0;
> +
> + static int dev_sriov_configure(struct pci_dev *dev, int numvfs)
> + {
> + if (numvfs > 0) {
> + ...
> + pci_enable_sriov(dev, numvfs);
> + ...
> + return numvfs;
> + }
> + if (numvfs == 0) {
> + ....
> + pci_disable_sriov(dev);
> + ...
> + return 0;
> + }
> }
> -}
> -
> -static struct pci_driver dev_driver = {
> - .name = "SR-IOV Physical Function driver",
> - .id_table = dev_id_table,
> - .probe = dev_probe,
> - .remove = dev_remove,
> - .suspend = dev_suspend,
> - .resume = dev_resume,
> - .shutdown = dev_shutdown,
> - .sriov_configure = dev_sriov_configure,
> -};
> +
> + static struct pci_driver dev_driver = {
> + .name = "SR-IOV Physical Function driver",
> + .id_table = dev_id_table,
> + .probe = dev_probe,
> + .remove = dev_remove,
> + .suspend = dev_suspend,
> + .resume = dev_resume,
> + .shutdown = dev_shutdown,
> + .sriov_configure = dev_sriov_configure,
> + };
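
As a side note, the sanity checks the document describes for the sysfs
path (numvfs must be 0 before enabling VFs, and numvfs <= totalvfs) can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: `check_numvfs_request()` is a hypothetical helper, and the
errno values it returns are assumptions, not the kernel's implementation.

```c
#include <errno.h>

/*
 * Userspace model of the checks described for writes to sriov_numvfs.
 * Hypothetical helper for illustration; not a kernel function.
 *
 * current_vfs: number of VFs currently enabled on the PF
 * total_vfs:   maximum number of VFs the device supports (totalvfs)
 * requested:   value written to sriov_numvfs
 *
 * Returns 0 if the request could be handed to the PF driver's
 * sriov_configure() callback, or a negative errno-style value.
 */
int check_numvfs_request(int current_vfs, int total_vfs, int requested)
{
	/* numvfs must stay within 0..totalvfs */
	if (requested < 0 || requested > total_vfs)
		return -ERANGE;

	/* Writing 0 disables the VFs; nothing more to check here */
	if (requested == 0)
		return 0;

	/* Enabling requires that no VFs are currently enabled */
	if (current_vfs != 0)
		return -EBUSY;

	return 0;
}
```

With this model, going from 4 VFs to 2 means writing 0 first and then 2,
which matches the "check numvfs == 0 if enabling VFs" wording above.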



Thanks,
Mauro

2019-04-24 15:32:36

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 29/63] Documentation: PCI: convert MSI-HOWTO.txt to reST

On Wed, 24 Apr 2019 00:28:58 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
>
> ---
> v2:
> o drop numbering.
> o simplify author list
> ---
> .../PCI/{MSI-HOWTO.txt => MSI-HOWTO.rst} | 83 +++++++++++--------
> Documentation/PCI/index.rst | 1 +
> 2 files changed, 50 insertions(+), 34 deletions(-)
> rename Documentation/PCI/{MSI-HOWTO.txt => MSI-HOWTO.rst} (88%)

Please use lowercase names for the renamed files.

>
> diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.rst
> similarity index 88%
> rename from Documentation/PCI/MSI-HOWTO.txt
> rename to Documentation/PCI/MSI-HOWTO.rst
> index 618e13d5e276..18cc3700489b 100644
> --- a/Documentation/PCI/MSI-HOWTO.txt
> +++ b/Documentation/PCI/MSI-HOWTO.rst
> @@ -1,13 +1,14 @@
> - The MSI Driver Guide HOWTO
> - Tom L Nguyen [email protected]
> - 10/03/2003
> - Revised Feb 12, 2004 by Martine Silbermann
> - email: [email protected]
> - Revised Jun 25, 2004 by Tom L Nguyen
> - Revised Jul 9, 2008 by Matthew Wilcox <[email protected]>
> - Copyright 2003, 2008 Intel Corporation
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> -1. About this guide
> +==========================
> +The MSI Driver Guide HOWTO
> +==========================
> +
> +:Authors: Tom L Nguyen; Martine Silbermann; Matthew Wilcox

I'm not so sure about this, as you removed the authors' email addresses.

It also seems you forgot to keep:

Copyright 2003, 2008 Intel Corporation

After re-adding the missing copyright:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> +
> +About this guide
> +================
>
> This guide describes the basics of Message Signaled Interrupts (MSIs),
> the advantages of using MSI over traditional interrupt mechanisms, how
> @@ -15,7 +16,8 @@ to change your driver to use MSI or MSI-X and some basic diagnostics to
> try if a device doesn't support MSIs.
>
>
> -2. What are MSIs?
> +What are MSIs?
> +==============
>
> A Message Signaled Interrupt is a write from the device to a special
> address which causes an interrupt to be received by the CPU.
> @@ -29,7 +31,8 @@ Devices may support both MSI and MSI-X, but only one can be enabled at
> a time.
>
>
> -3. Why use MSIs?
> +Why use MSIs?
> +=============
>
> There are three reasons why using MSIs can give an advantage over
> traditional pin-based interrupts.
> @@ -61,14 +64,16 @@ Other possible designs include giving one interrupt to each packet queue
> in a network card or each port in a storage controller.
>
>
> -4. How to use MSIs
> +How to use MSIs
> +===============
>
> PCI devices are initialised to use pin-based interrupts. The device
> driver has to set up the device to use MSI or MSI-X. Not all machines
> support MSIs correctly, and for those machines, the APIs described below
> will simply fail and the device will continue to use pin-based interrupts.
>
> -4.1 Include kernel support for MSIs
> +Include kernel support for MSIs
> +-------------------------------
>
> To support MSI or MSI-X, the kernel must be built with the CONFIG_PCI_MSI
> option enabled. This option is only available on some architectures,
> @@ -76,14 +81,15 @@ and it may depend on some other options also being set. For example,
> on x86, you must also enable X86_UP_APIC or SMP in order to see the
> CONFIG_PCI_MSI option.
>
> -4.2 Using MSI
> +Using MSI
> +---------
>
> Most of the hard work is done for the driver in the PCI layer. The driver
> simply has to request that the PCI layer set up the MSI capability for this
> device.
>
> To automatically use MSI or MSI-X interrupt vectors, use the following
> -function:
> +function::
>
> int pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
> unsigned int max_vecs, unsigned int flags);
> @@ -101,12 +107,12 @@ any possible kind of interrupt. If the PCI_IRQ_AFFINITY flag is set,
> pci_alloc_irq_vectors() will spread the interrupts around the available CPUs.
>
> To get the Linux IRQ numbers passed to request_irq() and free_irq() and the
> -vectors, use the following function:
> +vectors, use the following function::
>
> int pci_irq_vector(struct pci_dev *dev, unsigned int nr);
>
> Any allocated resources should be freed before removing the device using
> -the following function:
> +the following function::
>
> void pci_free_irq_vectors(struct pci_dev *dev);
>
> @@ -126,7 +132,7 @@ The typical usage of MSI or MSI-X interrupts is to allocate as many vectors
> as possible, likely up to the limit supported by the device. If nvec is
> larger than the number supported by the device it will automatically be
> capped to the supported limit, so there is no need to query the number of
> -vectors supported beforehand:
> +vectors supported beforehand::
>
> nvec = pci_alloc_irq_vectors(pdev, 1, nvec, PCI_IRQ_ALL_TYPES)
> if (nvec < 0)
> @@ -135,7 +141,7 @@ vectors supported beforehand:
> If a driver is unable or unwilling to deal with a variable number of MSI
> interrupts it can request a particular number of interrupts by passing that
> number to pci_alloc_irq_vectors() function as both 'min_vecs' and
> -'max_vecs' parameters:
> +'max_vecs' parameters::
>
> ret = pci_alloc_irq_vectors(pdev, nvec, nvec, PCI_IRQ_ALL_TYPES);
> if (ret < 0)
> @@ -143,23 +149,24 @@ number to pci_alloc_irq_vectors() function as both 'min_vecs' and
>
> The most notorious example of the request type described above is enabling
> the single MSI mode for a device. It could be done by passing two 1s as
> -'min_vecs' and 'max_vecs':
> +'min_vecs' and 'max_vecs'::
>
> ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
> if (ret < 0)
> goto out_err;
>
> Some devices might not support using legacy line interrupts, in which case
> -the driver can specify that only MSI or MSI-X is acceptable:
> +the driver can specify that only MSI or MSI-X is acceptable::
>
> nvec = pci_alloc_irq_vectors(pdev, 1, nvec, PCI_IRQ_MSI | PCI_IRQ_MSIX);
> if (nvec < 0)
> goto out_err;
>
> -4.3 Legacy APIs
> +Legacy APIs
> +-----------
>
> The following old APIs to enable and disable MSI or MSI-X interrupts should
> -not be used in new code:
> +not be used in new code::
>
> pci_enable_msi() /* deprecated */
> pci_disable_msi() /* deprecated */
> @@ -174,9 +181,11 @@ number of vectors. If you have a legitimate special use case for the count
> of vectors we might have to revisit that decision and add a
> pci_nr_irq_vectors() helper that handles MSI and MSI-X transparently.
>
> -4.4 Considerations when using MSIs
> +Considerations when using MSIs
> +------------------------------
>
> -4.4.1 Spinlocks
> +Spinlocks
> +~~~~~~~~~
>
> Most device drivers have a per-device spinlock which is taken in the
> interrupt handler. With pin-based interrupts or a single MSI, it is not
> @@ -188,7 +197,8 @@ acquire the spinlock. Such deadlocks can be avoided by using
> spin_lock_irqsave() or spin_lock_irq() which disable local interrupts
> and acquire the lock (see Documentation/kernel-hacking/locking.rst).
>
> -4.5 How to tell whether MSI/MSI-X is enabled on a device
> +How to tell whether MSI/MSI-X is enabled on a device
> +----------------------------------------------------
>
> Using 'lspci -v' (as root) may show some devices with "MSI", "Message
> Signalled Interrupts" or "MSI-X" capabilities. Each of these capabilities
> @@ -196,7 +206,8 @@ has an 'Enable' flag which is followed with either "+" (enabled)
> or "-" (disabled).
>
>
> -5. MSI quirks
> +MSI quirks
> +==========
>
> Several PCI chipsets or devices are known not to support MSIs.
> The PCI stack provides three ways to disable MSIs:
> @@ -205,7 +216,8 @@ The PCI stack provides three ways to disable MSIs:
> 2. on all devices behind a specific bridge
> 3. on a single device
>
> -5.1. Disabling MSIs globally
> +Disabling MSIs globally
> +-----------------------
>
> Some host chipsets simply don't support MSIs properly. If we're
> lucky, the manufacturer knows this and has indicated it in the ACPI
> @@ -219,7 +231,8 @@ on the kernel command line to disable MSIs on all devices. It would be
> in your best interests to report the problem to [email protected]
> including a full 'lspci -v' so we can add the quirks to the kernel.
>
> -5.2. Disabling MSIs below a bridge
> +Disabling MSIs below a bridge
> +-----------------------------
>
> Some PCI bridges are not able to route MSIs between busses properly.
> In this case, MSIs must be disabled on all devices behind the bridge.
> @@ -230,7 +243,7 @@ as the nVidia nForce and Serverworks HT2000). As with host chipsets,
> Linux mostly knows about them and automatically enables MSIs if it can.
> If you have a bridge unknown to Linux, you can enable
> MSIs in configuration space using whatever method you know works, then
> -enable MSIs on that bridge by doing:
> +enable MSIs on that bridge by doing::
>
> echo 1 > /sys/bus/pci/devices/$bridge/msi_bus
>
> @@ -244,7 +257,8 @@ below this bridge.
> Again, please notify [email protected] of any bridges that need
> special handling.
>
> -5.3. Disabling MSIs on a single device
> +Disabling MSIs on a single device
> +---------------------------------
>
> Some devices are known to have faulty MSI implementations. Usually this
> is handled in the individual device driver, but occasionally it's necessary
> @@ -252,7 +266,8 @@ to handle this with a quirk. Some drivers have an option to disable use
> of MSI. While this is a convenient workaround for the driver author,
> it is not good practice, and should not be emulated.
>
> -5.4. Finding why MSIs are disabled on a device
> +Finding why MSIs are disabled on a device
> +-----------------------------------------
>
> From the above three sections, you can see that there are many reasons
> why MSIs may not be enabled for a given device. Your first step should
> @@ -260,8 +275,8 @@ be to examine your dmesg carefully to determine whether MSIs are enabled
> for your machine. You should also check your .config to be sure you
> have enabled CONFIG_PCI_MSI.
>
> -Then, 'lspci -t' gives the list of bridges above a device. Reading
> -/sys/bus/pci/devices/*/msi_bus will tell you whether MSIs are enabled (1)
> +Then, 'lspci -t' gives the list of bridges above a device. Reading
> +`/sys/bus/pci/devices/*/msi_bus` will tell you whether MSIs are enabled (1)
> or disabled (0). If 0 is found in any of the msi_bus files belonging
> to bridges between the PCI root and the device, MSIs are disabled.
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index e1c19962a7f8..1b25bcc1edca 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -11,3 +11,4 @@ Linux PCI Bus Subsystem
> pci
> PCIEBUS-HOWTO
> pci-iov-howto
> + MSI-HOWTO
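
As an aside, the msi_bus rule in the last hunk (a 0 in any bridge between
the PCI root and the device means MSIs are disabled) can be sketched as a
small userspace check. This is a minimal illustration only:
`msi_allowed_on_path()` is a hypothetical helper, and the caller is assumed
to have already read the msi_bus values of the bridges on the path.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * msi_bus_values[i] holds the value read from the msi_bus file of the
 * i-th bridge between the PCI root and the device
 * (1 = MSIs enabled, 0 = MSIs disabled).
 * Hypothetical helper for illustration; not a kernel function.
 */
bool msi_allowed_on_path(const int *msi_bus_values, size_t n_bridges)
{
	for (size_t i = 0; i < n_bridges; i++) {
		/* A single disabled bridge disables MSIs for everything below it */
		if (msi_bus_values[i] == 0)
			return false;
	}
	return true;
}
```

The bridges on the path would come from 'lspci -t', as the document says.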



Thanks,
Mauro

2019-04-24 15:47:40

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v4 31/63] Documentation: PCI: convert pci-error-recovery.txt to reST

On Wed, 24 Apr 2019 00:29:00 +0800
Changbin Du <[email protected]> wrote:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/index.rst | 1 +
> ...or-recovery.txt => pci-error-recovery.rst} | 178 +++++++++---------
> MAINTAINERS | 2 +-
> 3 files changed, 94 insertions(+), 87 deletions(-)
> rename Documentation/PCI/{pci-error-recovery.txt => pci-error-recovery.rst} (80%)
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index c877a369481d..5ee4dba07116 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -13,3 +13,4 @@ Linux PCI Bus Subsystem
> pci-iov-howto
> MSI-HOWTO
> acpi-info
> + pci-error-recovery
> diff --git a/Documentation/PCI/pci-error-recovery.txt b/Documentation/PCI/pci-error-recovery.rst
> similarity index 80%
> rename from Documentation/PCI/pci-error-recovery.txt
> rename to Documentation/PCI/pci-error-recovery.rst
> index 0b6bb3ef449e..533ec4035bf5 100644
> --- a/Documentation/PCI/pci-error-recovery.txt
> +++ b/Documentation/PCI/pci-error-recovery.rst
> @@ -1,12 +1,13 @@
> +.. SPDX-License-Identifier: GPL-2.0
>
> - PCI Error Recovery
> - ------------------
> - February 2, 2006
> +==================
> +PCI Error Recovery
> +==================
>
> - Current document maintainer:
> - Linas Vepstas <[email protected]>
> - updated by Richard Lary <[email protected]>
> - and Mike Mason <[email protected]> on 27-Jul-2009

Just wondering: wouldn't it be good to preserve the date here?

> +
> +:Authors: - Linas Vepstas <[email protected]>
> + - Richard Lary <[email protected]>
> + - Mike Mason <[email protected]>
>
>
> Many PCI bus controllers are able to detect a variety of hardware
> @@ -63,7 +64,8 @@ mechanisms for dealing with SCSI bus errors and SCSI bus resets.
>
>
> Detailed Design
> ----------------
> +===============
> +
> Design and implementation details below, based on a chain of
> public email discussions with Ben Herrenschmidt, circa 5 April 2005.
>
> @@ -73,30 +75,33 @@ pci_driver. A driver that fails to provide the structure is "non-aware",
> and the actual recovery steps taken are platform dependent. The
> arch/powerpc implementation will simulate a PCI hotplug remove/add.
>
> -This structure has the form:
> -struct pci_error_handlers
> -{
> - int (*error_detected)(struct pci_dev *dev, enum pci_channel_state);
> - int (*mmio_enabled)(struct pci_dev *dev);
> - int (*slot_reset)(struct pci_dev *dev);
> - void (*resume)(struct pci_dev *dev);
> -};
> -
> -The possible channel states are:
> -enum pci_channel_state {
> - pci_channel_io_normal, /* I/O channel is in normal state */
> - pci_channel_io_frozen, /* I/O to channel is blocked */
> - pci_channel_io_perm_failure, /* PCI card is dead */
> -};
> -
> -Possible return values are:
> -enum pci_ers_result {
> - PCI_ERS_RESULT_NONE, /* no result/none/not supported in device driver */
> - PCI_ERS_RESULT_CAN_RECOVER, /* Device driver can recover without slot reset */
> - PCI_ERS_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */
> - PCI_ERS_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */
> - PCI_ERS_RESULT_RECOVERED, /* Device driver is fully recovered and operational */
> -};
> +This structure has the form::
> +
> + struct pci_error_handlers
> + {
> + int (*error_detected)(struct pci_dev *dev, enum pci_channel_state);
> + int (*mmio_enabled)(struct pci_dev *dev);
> + int (*slot_reset)(struct pci_dev *dev);
> + void (*resume)(struct pci_dev *dev);
> + };
> +
> +The possible channel states are::
> +
> + enum pci_channel_state {
> + pci_channel_io_normal, /* I/O channel is in normal state */
> + pci_channel_io_frozen, /* I/O to channel is blocked */
> + pci_channel_io_perm_failure, /* PCI card is dead */
> + };
> +
> +Possible return values are::
> +
> + enum pci_ers_result {
> + PCI_ERS_RESULT_NONE, /* no result/none/not supported in device driver */
> + PCI_ERS_RESULT_CAN_RECOVER, /* Device driver can recover without slot reset */
> + PCI_ERS_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */
> + PCI_ERS_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */
> + PCI_ERS_RESULT_RECOVERED, /* Device driver is fully recovered and operational */
> + };
>
> A driver does not have to implement all of these callbacks; however,
> if it implements any, it must implement error_detected(). If a callback
> @@ -134,16 +139,17 @@ shouldn't do any new IOs. Called in task context. This is sort of a
>
> All drivers participating in this system must implement this call.
> The driver must return one of the following result codes:
> - - PCI_ERS_RESULT_CAN_RECOVER:
> - Driver returns this if it thinks it might be able to recover
> - the HW by just banging IOs or if it wants to be given
> - a chance to extract some diagnostic information (see
> - mmio_enable, below).
> - - PCI_ERS_RESULT_NEED_RESET:
> - Driver returns this if it can't recover without a
> - slot reset.
> - - PCI_ERS_RESULT_DISCONNECT:
> - Driver returns this if it doesn't want to recover at all.
> +
> + - PCI_ERS_RESULT_CAN_RECOVER:
> + Driver returns this if it thinks it might be able to recover
> + the HW by just banging IOs or if it wants to be given
> + a chance to extract some diagnostic information (see
> + mmio_enable, below).
> + - PCI_ERS_RESULT_NEED_RESET:
> + Driver returns this if it can't recover without a
> + slot reset.
> + - PCI_ERS_RESULT_DISCONNECT:
> + Driver returns this if it doesn't want to recover at all.

This would look better in both the text and HTML output if you formatted it as:

  - PCI_ERS_RESULT_CAN_RECOVER:
      Driver returns this if it thinks it might be able to recover
      the HW by just banging IOs or if it wants to be given
      a chance to extract some diagnostic information (see
      mmio_enable, below).
  - PCI_ERS_RESULT_NEED_RESET:
      Driver returns this if it can't recover without a
      slot reset.
  - PCI_ERS_RESULT_DISCONNECT:
      Driver returns this if it doesn't want to recover at all.


>
> The next step taken will depend on the result codes returned by the
> drivers.
> @@ -177,7 +183,7 @@ is STEP 6 (Permanent Failure).
> >>> get the device working again.
>
> STEP 2: MMIO Enabled
> --------------------
> +--------------------
> The platform re-enables MMIO to the device (but typically not the
> DMA), and then calls the mmio_enabled() callback on all affected
> device drivers.
> @@ -203,23 +209,23 @@ instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
> >>> into one of the next states, that is, link reset or slot reset.


It looks like you forgot to reformat the note marked with ">>>", e.g.
something like:

.. note::

   The current powerpc implementation assumes that a device driver will
   *not* schedule or semaphore in this routine; the current powerpc
   implementation uses one kernel thread to notify all devices;
   thus, if one device sleeps/schedules, all devices are affected.
   Doing better requires complex multi-threaded logic in the error
   recovery implementation (e.g. waiting for all notification threads
   to "join" before proceeding with recovery.) This seems excessively
   complex and not worth implementing.

   The current powerpc implementation doesn't much care if the device
   attempts I/O at this point, or not. I/O's will fail, returning
   a value of 0xff on read, and writes will be dropped. If more than
   EEH_MAX_FAILS I/O's are attempted to a frozen adapter, EEH
   assumes that the device driver has gone into an infinite loop
   and prints an error to syslog. A reboot is then required to
   get the device working again.


>
> The driver should return one of the following result codes:
> - - PCI_ERS_RESULT_RECOVERED
> - Driver returns this if it thinks the device is fully
> - functional and thinks it is ready to start
> - normal driver operations again. There is no
> - guarantee that the driver will actually be
> - allowed to proceed, as another driver on the
> - same segment might have failed and thus triggered a
> - slot reset on platforms that support it.
> -
> - - PCI_ERS_RESULT_NEED_RESET
> - Driver returns this if it thinks the device is not
> - recoverable in its current state and it needs a slot
> - reset to proceed.
> -
> - - PCI_ERS_RESULT_DISCONNECT
> - Same as above. Total failure, no recovery even after
> - reset driver dead. (To be defined more precisely)
> + - PCI_ERS_RESULT_RECOVERED
> + Driver returns this if it thinks the device is fully
> + functional and thinks it is ready to start
> + normal driver operations again. There is no
> + guarantee that the driver will actually be
> + allowed to proceed, as another driver on the
> + same segment might have failed and thus triggered a
> + slot reset on platforms that support it.
> +
> + - PCI_ERS_RESULT_NEED_RESET
> + Driver returns this if it thinks the device is not
> + recoverable in its current state and it needs a slot
> + reset to proceed.
> +
> + - PCI_ERS_RESULT_DISCONNECT
> + Same as above. Total failure, no recovery even after
> + reset driver dead. (To be defined more precisely)

Same as above: this would look much better if you formatted it as:

  - PCI_ERS_RESULT_RECOVERED
      Driver returns this if it thinks the device is fully
      functional and thinks it is ready to start
      normal driver operations again. There is no
      guarantee that the driver will actually be
      allowed to proceed, as another driver on the
      same segment might have failed and thus triggered a
      slot reset on platforms that support it.

  - PCI_ERS_RESULT_NEED_RESET
      Driver returns this if it thinks the device is not
      recoverable in its current state and it needs a slot
      reset to proceed.

  - PCI_ERS_RESULT_DISCONNECT
      Same as above. Total failure, no recovery even after
      reset driver dead. (To be defined more precisely)

> The next step taken depends on the results returned by the drivers.
> If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform
> @@ -293,24 +299,24 @@ device will be considered "dead" in this case.
> Drivers for multi-function cards will need to coordinate among
> themselves as to which driver instance will perform any "one-shot"
> or global device initialization. For example, the Symbios sym53cxx2
> -driver performs device init only from PCI function 0:
> +driver performs device init only from PCI function 0::
>
> -+ if (PCI_FUNC(pdev->devfn) == 0)
> -+ sym_reset_scsi_bus(np, 0);
> + + if (PCI_FUNC(pdev->devfn) == 0)
> + + sym_reset_scsi_bus(np, 0);
>
> - Result codes:
> - - PCI_ERS_RESULT_DISCONNECT
> - Same as above.
> +Result codes:
> + - PCI_ERS_RESULT_DISCONNECT
> + Same as above.
>
> Drivers for PCI Express cards that require a fundamental reset must
> set the needs_freset bit in the pci_dev structure in their probe function.
> For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
> -PCI card types:
> +PCI card types::
>
> -+ /* Set EEH reset type to fundamental if required by hba */
> -+ if (IS_QLA24XX(ha) || IS_QLA25XX(ha) || IS_QLA81XX(ha))
> -+ pdev->needs_freset = 1;
> -+
> + + /* Set EEH reset type to fundamental if required by hba */
> + + if (IS_QLA24XX(ha) || IS_QLA25XX(ha) || IS_QLA81XX(ha))
> + + pdev->needs_freset = 1;
> + +
>
> Platform proceeds either to STEP 5 (Resume Operations) or STEP 6 (Permanent
> Failure).

Again, you forgot to convert one note there, just after the above
lines:

.. note::

   The current powerpc implementation does not try a power-cycle
   reset if the driver returned PCI_ERS_RESULT_DISCONNECT.
   However, it probably should.


> @@ -370,23 +376,23 @@ The current policy is to turn this into a platform policy.
> That is, the recovery API only requires that:
>
> - There is no guarantee that interrupt delivery can proceed from any
> -device on the segment starting from the error detection and until the
> -slot_reset callback is called, at which point interrupts are expected
> -to be fully operational.
> + device on the segment starting from the error detection and until the
> + slot_reset callback is called, at which point interrupts are expected
> + to be fully operational.
>
> - There is no guarantee that interrupt delivery is stopped, that is,
> -a driver that gets an interrupt after detecting an error, or that detects
> -an error within the interrupt handler such that it prevents proper
> -ack'ing of the interrupt (and thus removal of the source) should just
> -return IRQ_NOTHANDLED. It's up to the platform to deal with that
> -condition, typically by masking the IRQ source during the duration of
> -the error handling. It is expected that the platform "knows" which
> -interrupts are routed to error-management capable slots and can deal
> -with temporarily disabling that IRQ number during error processing (this
> -isn't terribly complex). That means some IRQ latency for other devices
> -sharing the interrupt, but there is simply no other way. High end
> -platforms aren't supposed to share interrupts between many devices
> -anyway :)
> + a driver that gets an interrupt after detecting an error, or that detects
> + an error within the interrupt handler such that it prevents proper
> + ack'ing of the interrupt (and thus removal of the source) should just
> + return IRQ_NOTHANDLED. It's up to the platform to deal with that
> + condition, typically by masking the IRQ source during the duration of
> + the error handling. It is expected that the platform "knows" which
> + interrupts are routed to error-management capable slots and can deal
> + with temporarily disabling that IRQ number during error processing (this
> + isn't terribly complex). That means some IRQ latency for other devices
> + sharing the interrupt, but there is simply no other way. High end
> + platforms aren't supposed to share interrupts between many devices
> + anyway :)
>
> >>> Implementation details for the powerpc platform are discussed in
> >>> the file Documentation/powerpc/eeh-pci-error-recovery.txt

Another note to be converted:

.. note::

   Implementation details for the powerpc platform are discussed in
   the file Documentation/powerpc/eeh-pci-error-recovery.txt

   As of this writing, there is a growing list of device drivers with
   patches implementing error recovery. Not all of these patches are in
   mainline yet. These may be used as "examples"::

      drivers/scsi/ipr
      drivers/scsi/sym53c8xx_2
      ...
      drivers/net/qlge



> diff --git a/MAINTAINERS b/MAINTAINERS
> index 87f930bf32ad..403178958b05 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11965,7 +11965,7 @@ M: Sam Bobroff <[email protected]>
> M: Oliver O'Halloran <[email protected]>
> L: [email protected]
> S: Supported
> -F: Documentation/PCI/pci-error-recovery.txt
> +F: Documentation/PCI/pci-error-recovery.rst
> F: drivers/pci/pcie/aer.c
> F: drivers/pci/pcie/dpc.c
> F: drivers/pci/pcie/err.c
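
For reference, the error_detected() result codes discussed above can be
illustrated with a small standalone sketch. It shows one plausible mapping
from channel state to result code; it is not taken from the patch or from
any particular driver. The enums below are local stand-ins that mirror the
names quoted in the document (their numeric values are assumptions and do
not match the kernel headers), and `example_error_detected()` is a
hypothetical name.

```c
/* Local stand-ins mirroring the enums quoted in the document.
 * The numeric values are illustrative only. */
enum pci_channel_state {
	pci_channel_io_normal,        /* I/O channel is in normal state */
	pci_channel_io_frozen,        /* I/O to channel is blocked */
	pci_channel_io_perm_failure,  /* PCI card is dead */
};

enum pci_ers_result {
	PCI_ERS_RESULT_NONE,
	PCI_ERS_RESULT_CAN_RECOVER,
	PCI_ERS_RESULT_NEED_RESET,
	PCI_ERS_RESULT_DISCONNECT,
	PCI_ERS_RESULT_RECOVERED,
};

/*
 * Hypothetical error_detected()-style decision, reduced to the
 * channel state so it can run in userspace.
 */
enum pci_ers_result example_error_detected(enum pci_channel_state state)
{
	switch (state) {
	case pci_channel_io_perm_failure:
		/* Device is gone: do not attempt recovery */
		return PCI_ERS_RESULT_DISCONNECT;
	case pci_channel_io_frozen:
		/* I/O is blocked: ask the platform for a slot reset */
		return PCI_ERS_RESULT_NEED_RESET;
	case pci_channel_io_normal:
	default:
		/* I/O still works: try to recover without a reset */
		return PCI_ERS_RESULT_CAN_RECOVER;
	}
}
```

A real driver would also stop issuing new I/O before returning, as the
document requires.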



Thanks,
Mauro

2019-04-24 15:47:51

by Changbin Du

Subject: Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

On Tue, Apr 23, 2019 at 12:36:44PM -0500, Bjorn Helgaas wrote:
> On Tue, Apr 23, 2019 at 06:39:47PM +0200, Rafael J. Wysocki wrote:
> > On Tue, Apr 23, 2019 at 6:30 PM Changbin Du <[email protected]> wrote:
> > > Hi Corbet and All,
> > > The kernel now uses Sphinx to generate intelligent and beautiful
> > > documentation from reStructuredText files. I converted all of the Linux
> > > ACPI/PCI/X86 docs to reST format in this series.
> > >
> > > In this version I combined ACPI and PCI docs, and added new x86 docs
> > > conversion.
> >
> > I'm not sure if combining all three into one big patch series has been
> > a good idea, honestly.
>
> Yeah, if you post this again, I would find it easier to deal with if
> linux-pci only got the PCI-related things. 63 patches is a little too
> much for one series.
>
Sure, I will resend them as separate series.

> Bjorn

--
Cheers,
Changbin Du

2019-04-24 15:50:51

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 32/63] Documentation: PCI: convert pcieaer-howto.txt to reST

Em Wed, 24 Apr 2019 00:29:01 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/index.rst | 1 +
> .../{pcieaer-howto.txt => pcieaer-howto.rst} | 110 ++++++++++++------
> 2 files changed, 74 insertions(+), 37 deletions(-)
> rename Documentation/PCI/{pcieaer-howto.txt => pcieaer-howto.rst} (81%)
>
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index 5ee4dba07116..86c76c22810b 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -14,3 +14,4 @@ Linux PCI Bus Subsystem
> MSI-HOWTO
> acpi-info
> pci-error-recovery
> + pcieaer-howto
> diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.rst
> similarity index 81%
> rename from Documentation/PCI/pcieaer-howto.txt
> rename to Documentation/PCI/pcieaer-howto.rst
> index 48ce7903e3c6..67f77ff76865 100644
> --- a/Documentation/PCI/pcieaer-howto.txt
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -1,21 +1,29 @@
> - The PCI Express Advanced Error Reporting Driver Guide HOWTO
> - T. Long Nguyen <[email protected]>
> - Yanmin Zhang <[email protected]>
> - 07/29/2006
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
>
> +===========================================================
> +The PCI Express Advanced Error Reporting Driver Guide HOWTO
> +===========================================================
>
> -1. Overview
> +:Authors: - T. Long Nguyen <[email protected]>
> + - Yanmin Zhang <[email protected]>
>
> -1.1 About this guide
> +:Copyright: |copy| 2006 Intel Corporation
> +
> +Overview
> +===========
> +
> +About this guide
> +----------------
>
> This guide describes the basics of the PCI Express Advanced Error
> Reporting (AER) driver and provides information on how to use it, as
> well as how to enable the drivers of endpoint devices to conform with
> PCI Express AER driver.
>
> -1.2 Copyright (C) Intel Corporation 2006.
>
> -1.3 What is the PCI Express AER Driver?
> +What is the PCI Express AER Driver?
> +-----------------------------------
>
> PCI Express error signaling can occur on the PCI Express link itself
> or on behalf of transactions initiated on the link. PCI Express
> @@ -30,17 +38,19 @@ The PCI Express AER driver provides the infrastructure to support PCI
> Express Advanced Error Reporting capability. The PCI Express AER
> driver provides three basic functions:
>
> -- Gathers the comprehensive error information if errors occurred.
> -- Reports error to the users.
> -- Performs error recovery actions.
> + - Gathers the comprehensive error information if errors occurred.
> + - Reports error to the users.
> + - Performs error recovery actions.
>
> AER driver only attaches root ports which support PCI-Express AER
> capability.
>
>
> -2. User Guide
> +User Guide
> +==========
>
> -2.1 Include the PCI Express AER Root Driver into the Linux Kernel
> +Include the PCI Express AER Root Driver into the Linux Kernel
> +-------------------------------------------------------------
>
> The PCI Express AER Root driver is a Root Port service driver attached
> to the PCI Express Port Bus driver. If a user wants to use it, the driver
> @@ -48,7 +58,8 @@ has to be compiled. Option CONFIG_PCIEAER supports this capability. It
> depends on CONFIG_PCIEPORTBUS, so pls. set CONFIG_PCIEPORTBUS=y and
> CONFIG_PCIEAER = y.
>
> -2.2 Load PCI Express AER Root Driver
> +Load PCI Express AER Root Driver
> +--------------------------------
>
> Some systems have AER support in firmware. Enabling Linux AER support at
> the same time the firmware handles AER may result in unpredictable
> @@ -56,30 +67,34 @@ behavior. Therefore, Linux does not handle AER events unless the firmware
> grants AER control to the OS via the ACPI _OSC method. See the PCI FW 3.0
> Specification for details regarding _OSC usage.
>
> -2.3 AER error output
> +AER error output
> +----------------
>
> When a PCIe AER error is captured, an error message will be output to
> console. If it's a correctable error, it is output as a warning.
> Otherwise, it is printed as an error. So users could choose different
> log level to filter out correctable error messages.
>
> -Below shows an example:
> -0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
> -0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
> -0000:50:00.0: [20] Unsupported Request (First)
> -0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
> +Below shows an example::
> +
> + 0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
> + 0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
> + 0000:50:00.0: [20] Unsupported Request (First)
> + 0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
>
> In the example, 'Requester ID' means the ID of the device who sends
> the error message to root port. Pls. refer to pci express specs for
> other fields.
>
> -2.4 AER Statistics / Counters
> +AER Statistics / Counters
> +-------------------------
>
> When PCIe AER errors are captured, the counters / statistics are also exposed
> in the form of sysfs attributes which are documented at
> Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
>
> -3. Developer Guide
> +Developer Guide
> +===============
>
> To enable AER aware support requires a software driver to configure
> the AER capability structure within its device and to provide callbacks.
> @@ -120,7 +135,8 @@ hierarchy and links. These errors do not include any device specific
> errors because device specific errors will still get sent directly to
> the device driver.
>
> -3.1 Configure the AER capability structure
> +Configure the AER capability structure
> +--------------------------------------
>
> AER aware drivers of PCI Express component need change the device
> control registers to enable AER. They also could change AER registers,
> @@ -128,9 +144,11 @@ including mask and severity registers. Helper function
> pci_enable_pcie_error_reporting could be used to enable AER. See
> section 3.3.
>
> -3.2. Provide callbacks
> +Provide callbacks
> +-----------------
>
> -3.2.1 callback reset_link to reset pci express link
> +callback reset_link to reset pci express link
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> This callback is used to reset the pci express physical link when a
> fatal error happens. The root port aer service driver provides a
> @@ -140,13 +158,15 @@ upstream ports should provide their own reset_link functions.
>
> In struct pcie_port_service_driver, a new pointer, reset_link, is
> added.
> +::
>
> -pci_ers_result_t (*reset_link) (struct pci_dev *dev);
> + pci_ers_result_t (*reset_link) (struct pci_dev *dev);
>
> Section 3.2.2.2 provides more detailed info on when to call
> reset_link.
>
> -3.2.2 PCI error-recovery callbacks
> +PCI error-recovery callbacks
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The PCI Express AER Root driver uses error callbacks to coordinate
> with downstream device drivers associated with a hierarchy in question
> @@ -161,7 +181,8 @@ definitions of the callbacks.
>
> Below sections specify when to call the error callback functions.
>
> -3.2.2.1 Correctable errors
> +Correctable errors
> +~~~~~~~~~~~~~~~~~~
>
> Correctable errors pose no impacts on the functionality of
> the interface. The PCI Express protocol can recover without any
> @@ -169,13 +190,16 @@ software intervention or any loss of data. These errors do not
> require any recovery actions. The AER driver clears the device's
> correctable error status register accordingly and logs these errors.
>
> -3.2.2.2 Non-correctable (non-fatal and fatal) errors
> +Non-correctable (non-fatal and fatal) errors
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> If an error message indicates a non-fatal error, performing link reset
> at upstream is not required. The AER driver calls error_detected(dev,
> pci_channel_io_normal) to all drivers associated within a hierarchy in
> -question. for example,
> -EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort.
> +question. for example::
> +
> + EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort
> +
> If Upstream port A captures an AER error, the hierarchy consists of
> Downstream port B and EndPoint.
>
> @@ -199,23 +223,33 @@ function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
> reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
> to mmio_enabled.
>
> -3.3 helper functions
> +helper functions
> +----------------
> +::
> +
> + int pci_enable_pcie_error_reporting(struct pci_dev *dev);
>
> -3.3.1 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
> pci_enable_pcie_error_reporting enables the device to send error
> messages to root port when an error is detected. Note that devices
> don't enable the error reporting by default, so device drivers need
> call this function to enable it.
>
> -3.3.2 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> +::
> +
> + int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> +
> pci_disable_pcie_error_reporting disables the device to send error
> messages to root port when an error is detected.
>
> -3.3.3 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
> +::
> +
> + int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);`
> +
> pci_cleanup_aer_uncorrect_error_status cleanups the uncorrectable
> error status register.
>
> -3.4 Frequent Asked Questions
> +Frequent Asked Questions
> +------------------------
>
> Q: What happens if a PCI Express device driver does not provide an
> error recovery handler (pci_driver->err_handler is equal to NULL)?

I strongly suspect that you also need to touch the Q/A; otherwise
you'll get bad output, Sphinx warnings, or both. Something like:

Q:
    What happens if a PCI Express device driver does not provide an
    error recovery handler (pci_driver->err_handler is equal to NULL)?

A:
    The devices attached with the driver won't be recovered. If the
    error is fatal, kernel will print out warning messages. Please refer
    to section 3 for more information.

Q:
    What happens if an upstream port service driver does not provide
    callback reset_link?

A:
    Fatal error recovery will fail if the errors are reported by the
    upstream ports who are attached by the service driver.

Q:
    How does this infrastructure deal with driver that is not PCI
    Express aware?

A:
    This infrastructure calls the error callback functions of the
    driver when an error happens. But if the driver is not aware of
    PCI Express, the device might not report its own errors to root
    port.

Q:
    What modifications will that driver need to make it compatible
    with the PCI Express AER Root driver?

A:
    It could call the helper functions to enable AER in devices and
    cleanup uncorrectable status register. Pls. refer to section 3.3.
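
If you would rather not reflow the entries by hand, the conversion above
can be scripted. What follows is only a rough sketch, not something from
the patch: it assumes every entry starts with "Q:" or "A:" in column 0
and that continuation lines follow until the next blank line. The name
reflow_qa and the four-space indent are my own choices for illustration.

```python
import re


def reflow_qa(text: str, indent: str = "    ") -> str:
    """Turn 'Q: ...' / 'A: ...' paragraphs into reST definition-list items."""
    out = []
    in_entry = False
    for line in text.splitlines():
        match = re.match(r"^([QA]):\s*(.*)$", line)
        if match:
            # A new question or answer: the bare label becomes the term.
            if out and out[-1] != "":
                out.append("")
            out.append(match.group(1) + ":")
            if match.group(2):
                out.append(indent + match.group(2))
            in_entry = True
        elif line.strip() == "":
            # A blank line ends the current entry.
            out.append("")
            in_entry = False
        elif in_entry:
            # Continuation of the current entry: indent it under the term.
            out.append(indent + line.strip())
        else:
            # Anything outside a Q/A entry is passed through untouched.
            out.append(line)
    return "\n".join(out)
```

Feeding it the FAQ part of the file and checking the result with a docs
build should be enough to see whether the warnings go away; text outside
the Q/A entries is left unchanged.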


> @@ -245,7 +279,8 @@ A: It could call the helper functions to enable AER in devices and
> cleanup uncorrectable status register. Pls. refer to section 3.3.
>
>
> -4. Software error injection
> +Software error injection
> +========================
>
> Debugging PCIe AER error recovery code is quite difficult because it
> is hard to trigger real hardware errors. Software based error
> @@ -261,6 +296,7 @@ After reboot with new kernel or insert the module, a device file named
>
> Then, you need a user space tool named aer-inject, which can be gotten
> from:
> +
> https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/
>
> More information about aer-inject can be found in the document comes



Thanks,
Mauro

2019-04-24 16:11:16

by Changbin Du

Subject: Re: [PATCH v4 02/63] Documentation: ACPI: move namespace.txt to firmware-guide/acpi and convert to reST

On Tue, Apr 23, 2019 at 05:38:40PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:31 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > add it to Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/firmware-guide/acpi/index.rst | 1 +
> > .../acpi/namespace.rst} | 310 +++++++++---------
> > 2 files changed, 161 insertions(+), 150 deletions(-)
> > rename Documentation/{acpi/namespace.txt => firmware-guide/acpi/namespace.rst} (54%)
> >
> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index 0ec7d072ba22..210ad8acd6df 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -7,3 +7,4 @@ ACPI Support
> > .. toctree::
> > :maxdepth: 1
> >
> > + namespace
> > diff --git a/Documentation/acpi/namespace.txt b/Documentation/firmware-guide/acpi/namespace.rst
> > similarity index 54%
> > rename from Documentation/acpi/namespace.txt
> > rename to Documentation/firmware-guide/acpi/namespace.rst
> > index 1860cb3865c6..443f0e5d0617 100644
> > --- a/Documentation/acpi/namespace.txt
> > +++ b/Documentation/firmware-guide/acpi/namespace.rst
> > @@ -1,85 +1,88 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. include:: <isonum.txt>
> > +
> > +===================================================
> > ACPI Device Tree - Representation of ACPI Namespace
> > +===================================================
> > +
> > +:Copyright: |copy| 2013, Intel Corporation
> > +
> > +:Author: Lv Zheng <[email protected]>
> > +
> > +:Abstract: The Linux ACPI subsystem converts ACPI namespace objects into a Linux
> > + device tree under the /sys/devices/LNXSYSTEM:00 and updates it upon
> > + receiving ACPI hotplug notification events. For each device object
> > + in this hierarchy there is a corresponding symbolic link in the
> > + /sys/bus/acpi/devices.
> > + This document illustrates the structure of the ACPI device tree.
>
> Well, this is a matter of preference. I would add Abstract as a chapter,
> as this would make it part of the top index, which can be useful.
>
Now it becomes a chapter. Thanks.

> In any case:
>
> Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
> > +
> > +:Credit: Thanks for the help from Zhang Rui <[email protected]> and
> > + Rafael J.Wysocki <[email protected]>.
> > +
> > +
> > +ACPI Definition Blocks
> > +======================
> > +
> > +The ACPI firmware sets up RSDP (Root System Description Pointer) in the
> > +system memory address space pointing to the XSDT (Extended System
> > +Description Table). The XSDT always points to the FADT (Fixed ACPI
> > +Description Table) using its first entry, the data within the FADT
> > +includes various fixed-length entries that describe fixed ACPI features
> > +of the hardware. The FADT contains a pointer to the DSDT
> > +(Differentiated System Descripition Table). The XSDT also contains
> > +entries pointing to possibly multiple SSDTs (Secondary System
> > +Description Table).
> > +
> > +The DSDT and SSDT data is organized in data structures called definition
> > +blocks that contain definitions of various objects, including ACPI
> > +control methods, encoded in AML (ACPI Machine Language). The data block
> > +of the DSDT along with the contents of SSDTs represents a hierarchical
> > +data structure called the ACPI namespace whose topology reflects the
> > +structure of the underlying hardware platform.
> > +
> > +The relationships between ACPI System Definition Tables described above
> > +are illustrated in the following diagram::
> > +
> > + +---------+ +-------+ +--------+ +------------------------+
> > + | RSDP | +->| XSDT | +->| FADT | | +-------------------+ |
> > + +---------+ | +-------+ | +--------+ +-|->| DSDT | |
> > + | Pointer | | | Entry |-+ | ...... | | | +-------------------+ |
> > + +---------+ | +-------+ | X_DSDT |--+ | | Definition Blocks | |
> > + | Pointer |-+ | ..... | | ...... | | +-------------------+ |
> > + +---------+ +-------+ +--------+ | +-------------------+ |
> > + | Entry |------------------|->| SSDT | |
> > + +- - - -+ | +-------------------| |
> > + | Entry | - - - - - - - -+ | | Definition Blocks | |
> > + +- - - -+ | | +-------------------+ |
> > + | | +- - - - - - - - - -+ |
> > + +-|->| SSDT | |
> > + | +-------------------+ |
> > + | | Definition Blocks | |
> > + | +- - - - - - - - - -+ |
> > + +------------------------+
> > + |
> > + OSPM Loading |
> > + \|/
> > + +----------------+
> > + | ACPI Namespace |
> > + +----------------+
> > +
> > + Figure 1. ACPI Definition Blocks
> > +
> > +.. note:: RSDP can also contain a pointer to the RSDT (Root System
> > + Description Table). Platforms provide RSDT to enable
> > + compatibility with ACPI 1.0 operating systems. The OS is expected
> > + to use XSDT, if present.
> > +
> > +
> > +Example ACPI Namespace
> > +======================
> > +
> > +All definition blocks are loaded into a single namespace. The namespace
> > +is a hierarchy of objects identified by names and paths.
> > +The following naming conventions apply to object names in the ACPI
> > +namespace:
> >
> > -Copyright (C) 2013, Intel Corporation
> > -Author: Lv Zheng <[email protected]>
> > -
> > -
> > -Abstract:
> > -
> > -The Linux ACPI subsystem converts ACPI namespace objects into a Linux
> > -device tree under the /sys/devices/LNXSYSTEM:00 and updates it upon
> > -receiving ACPI hotplug notification events. For each device object in this
> > -hierarchy there is a corresponding symbolic link in the
> > -/sys/bus/acpi/devices.
> > -This document illustrates the structure of the ACPI device tree.
> > -
> > -
> > -Credit:
> > -
> > -Thanks for the help from Zhang Rui <[email protected]> and Rafael J.
> > -Wysocki <[email protected]>.
> > -
> > -
> > -1. ACPI Definition Blocks
> > -
> > - The ACPI firmware sets up RSDP (Root System Description Pointer) in the
> > - system memory address space pointing to the XSDT (Extended System
> > - Description Table). The XSDT always points to the FADT (Fixed ACPI
> > - Description Table) using its first entry, the data within the FADT
> > - includes various fixed-length entries that describe fixed ACPI features
> > - of the hardware. The FADT contains a pointer to the DSDT
> > - (Differentiated System Descripition Table). The XSDT also contains
> > - entries pointing to possibly multiple SSDTs (Secondary System
> > - Description Table).
> > -
> > - The DSDT and SSDT data is organized in data structures called definition
> > - blocks that contain definitions of various objects, including ACPI
> > - control methods, encoded in AML (ACPI Machine Language). The data block
> > - of the DSDT along with the contents of SSDTs represents a hierarchical
> > - data structure called the ACPI namespace whose topology reflects the
> > - structure of the underlying hardware platform.
> > -
> > - The relationships between ACPI System Definition Tables described above
> > - are illustrated in the following diagram.
> > -
> > - +---------+ +-------+ +--------+ +------------------------+
> > - | RSDP | +->| XSDT | +->| FADT | | +-------------------+ |
> > - +---------+ | +-------+ | +--------+ +-|->| DSDT | |
> > - | Pointer | | | Entry |-+ | ...... | | | +-------------------+ |
> > - +---------+ | +-------+ | X_DSDT |--+ | | Definition Blocks | |
> > - | Pointer |-+ | ..... | | ...... | | +-------------------+ |
> > - +---------+ +-------+ +--------+ | +-------------------+ |
> > - | Entry |------------------|->| SSDT | |
> > - +- - - -+ | +-------------------| |
> > - | Entry | - - - - - - - -+ | | Definition Blocks | |
> > - +- - - -+ | | +-------------------+ |
> > - | | +- - - - - - - - - -+ |
> > - +-|->| SSDT | |
> > - | +-------------------+ |
> > - | | Definition Blocks | |
> > - | +- - - - - - - - - -+ |
> > - +------------------------+
> > - |
> > - OSPM Loading |
> > - \|/
> > - +----------------+
> > - | ACPI Namespace |
> > - +----------------+
> > -
> > - Figure 1. ACPI Definition Blocks
> > -
> > - NOTE: RSDP can also contain a pointer to the RSDT (Root System
> > - Description Table). Platforms provide RSDT to enable
> > - compatibility with ACPI 1.0 operating systems. The OS is expected
> > - to use XSDT, if present.
> > -
> > -
> > -2. Example ACPI Namespace
> > -
> > - All definition blocks are loaded into a single namespace. The namespace
> > - is a hierarchy of objects identified by names and paths.
> > - The following naming conventions apply to object names in the ACPI
> > - namespace:
> > 1. All names are 32 bits long.
> > 2. The first byte of a name must be one of 'A' - 'Z', '_'.
> > 3. Each of the remaining bytes of a name must be one of 'A' - 'Z', '0'
> > @@ -91,7 +94,7 @@ Wysocki <[email protected]>.
> > (i.e. names prepended with '^' are relative to the parent of the
> > current namespace node).
> >
> > - The figure below shows an example ACPI namespace.
> > +The figure below shows an example ACPI namespace::
> >
> > +------+
> > | \ | Root
> > @@ -184,19 +187,20 @@ Wysocki <[email protected]>.
> > Figure 2. Example ACPI Namespace
> >
> >
> > -3. Linux ACPI Device Objects
> > +Linux ACPI Device Objects
> > +=========================
> >
> > - The Linux kernel's core ACPI subsystem creates struct acpi_device
> > - objects for ACPI namespace objects representing devices, power resources
> > - processors, thermal zones. Those objects are exported to user space via
> > - sysfs as directories in the subtree under /sys/devices/LNXSYSTM:00. The
> > - format of their names is <bus_id:instance>, where 'bus_id' refers to the
> > - ACPI namespace representation of the given object and 'instance' is used
> > - for distinguishing different object of the same 'bus_id' (it is
> > - two-digit decimal representation of an unsigned integer).
> > +The Linux kernel's core ACPI subsystem creates struct acpi_device
> > +objects for ACPI namespace objects representing devices, power resources
> > +processors, thermal zones. Those objects are exported to user space via
> > +sysfs as directories in the subtree under /sys/devices/LNXSYSTM:00. The
> > +format of their names is <bus_id:instance>, where 'bus_id' refers to the
> > +ACPI namespace representation of the given object and 'instance' is used
> > +for distinguishing different object of the same 'bus_id' (it is
> > +two-digit decimal representation of an unsigned integer).
> >
> > - The value of 'bus_id' depends on the type of the object whose name it is
> > - part of as listed in the table below.
> > +The value of 'bus_id' depends on the type of the object whose name it is
> > +part of as listed in the table below::
> >
> > +---+-----------------+-------+----------+
> > | | Object/Feature | Table | bus_id |
> > @@ -226,10 +230,11 @@ Wysocki <[email protected]>.
> >
> > Table 1. ACPI Namespace Objects Mapping
> >
> > - The following rules apply when creating struct acpi_device objects on
> > - the basis of the contents of ACPI System Description Tables (as
> > - indicated by the letter in the first column and the notation in the
> > - second column of the table above):
> > +The following rules apply when creating struct acpi_device objects on
> > +the basis of the contents of ACPI System Description Tables (as
> > +indicated by the letter in the first column and the notation in the
> > +second column of the table above):
> > +
> > N:
> > The object's source is an ACPI namespace node (as indicated by the
> > named object's type in the second column). In that case the object's
> > @@ -249,13 +254,14 @@ Wysocki <[email protected]>.
> > struct acpi_device object with LNXVIDEO 'bus_id' will be created for
> > it.
> >
> > - The third column of the above table indicates which ACPI System
> > - Description Tables contain information used for the creation of the
> > - struct acpi_device objects represented by the given row (xSDT means DSDT
> > - or SSDT).
> > +The third column of the above table indicates which ACPI System
> > +Description Tables contain information used for the creation of the
> > +struct acpi_device objects represented by the given row (xSDT means DSDT
> > +or SSDT).
> > +
> > +The forth column of the above table indicates the 'bus_id' generation
> > +rule of the struct acpi_device object:
> >
> > - The forth column of the above table indicates the 'bus_id' generation
> > - rule of the struct acpi_device object:
> > _HID:
> > _HID in the last column of the table means that the object's bus_id
> > is derived from the _HID/_CID identification objects present under
> > @@ -275,45 +281,47 @@ Wysocki <[email protected]>.
> > object's bus_id.
> >
> >
> > -4. Linux ACPI Physical Device Glue
> > -
> > - ACPI device (i.e. struct acpi_device) objects may be linked to other
> > - objects in the Linux' device hierarchy that represent "physical" devices
> > - (for example, devices on the PCI bus). If that happens, it means that
> > - the ACPI device object is a "companion" of a device otherwise
> > - represented in a different way and is used (1) to provide configuration
> > - information on that device which cannot be obtained by other means and
> > - (2) to do specific things to the device with the help of its ACPI
> > - control methods. One ACPI device object may be linked this way to
> > - multiple "physical" devices.
> > -
> > - If an ACPI device object is linked to a "physical" device, its sysfs
> > - directory contains the "physical_node" symbolic link to the sysfs
> > - directory of the target device object. In turn, the target device's
> > - sysfs directory will then contain the "firmware_node" symbolic link to
> > - the sysfs directory of the companion ACPI device object.
> > - The linking mechanism relies on device identification provided by the
> > - ACPI namespace. For example, if there's an ACPI namespace object
> > - representing a PCI device (i.e. a device object under an ACPI namespace
> > - object representing a PCI bridge) whose _ADR returns 0x00020000 and the
> > - bus number of the parent PCI bridge is 0, the sysfs directory
> > - representing the struct acpi_device object created for that ACPI
> > - namespace object will contain the 'physical_node' symbolic link to the
> > - /sys/devices/pci0000:00/0000:00:02:0/ sysfs directory of the
> > - corresponding PCI device.
> > -
> > - The linking mechanism is generally bus-specific. The core of its
> > - implementation is located in the drivers/acpi/glue.c file, but there are
> > - complementary parts depending on the bus types in question located
> > - elsewhere. For example, the PCI-specific part of it is located in
> > - drivers/pci/pci-acpi.c.
> > -
> > -
> > -5. Example Linux ACPI Device Tree
> > -
> > - The sysfs hierarchy of struct acpi_device objects corresponding to the
> > - example ACPI namespace illustrated in Figure 2 with the addition of
> > - fixed PWR_BUTTON/SLP_BUTTON devices is shown below.
> > +Linux ACPI Physical Device Glue
> > +===============================
> > +
> > +ACPI device (i.e. struct acpi_device) objects may be linked to other
> > +objects in the Linux' device hierarchy that represent "physical" devices
> > +(for example, devices on the PCI bus). If that happens, it means that
> > +the ACPI device object is a "companion" of a device otherwise
> > +represented in a different way and is used (1) to provide configuration
> > +information on that device which cannot be obtained by other means and
> > +(2) to do specific things to the device with the help of its ACPI
> > +control methods. One ACPI device object may be linked this way to
> > +multiple "physical" devices.
> > +
> > +If an ACPI device object is linked to a "physical" device, its sysfs
> > +directory contains the "physical_node" symbolic link to the sysfs
> > +directory of the target device object. In turn, the target device's
> > +sysfs directory will then contain the "firmware_node" symbolic link to
> > +the sysfs directory of the companion ACPI device object.
> > +The linking mechanism relies on device identification provided by the
> > +ACPI namespace. For example, if there's an ACPI namespace object
> > +representing a PCI device (i.e. a device object under an ACPI namespace
> > +object representing a PCI bridge) whose _ADR returns 0x00020000 and the
> > +bus number of the parent PCI bridge is 0, the sysfs directory
> > +representing the struct acpi_device object created for that ACPI
> > +namespace object will contain the 'physical_node' symbolic link to the
> > +/sys/devices/pci0000:00/0000:00:02:0/ sysfs directory of the
> > +corresponding PCI device.
> > +
> > +The linking mechanism is generally bus-specific. The core of its
> > +implementation is located in the drivers/acpi/glue.c file, but there are
> > +complementary parts depending on the bus types in question located
> > +elsewhere. For example, the PCI-specific part of it is located in
> > +drivers/pci/pci-acpi.c.
> > +
> > +
> > +Example Linux ACPI Device Tree
> > +=================================
> > +
> > +The sysfs hierarchy of struct acpi_device objects corresponding to the
> > +example ACPI namespace illustrated in Figure 2 with the addition of
> > +fixed PWR_BUTTON/SLP_BUTTON devices is shown below::
> >
> > +--------------+---+-----------------+
> > | LNXSYSTEM:00 | \ | acpi:LNXSYSTEM: |
> > @@ -377,12 +385,14 @@ Wysocki <[email protected]>.
> >
> > Figure 3. Example Linux ACPI Device Tree
> >
> > - NOTE: Each node is represented as "object/path/modalias", where:
> > - 1. 'object' is the name of the object's directory in sysfs.
> > - 2. 'path' is the ACPI namespace path of the corresponding
> > - ACPI namespace object, as returned by the object's 'path'
> > - sysfs attribute.
> > - 3. 'modalias' is the value of the object's 'modalias' sysfs
> > - attribute (as described earlier in this document).
> > - NOTE: N/A indicates the device object does not have the 'path' or the
> > - 'modalias' attribute.
> > +.. note:: Each node is represented as "object/path/modalias", where:
> > +
> > + 1. 'object' is the name of the object's directory in sysfs.
> > + 2. 'path' is the ACPI namespace path of the corresponding
> > + ACPI namespace object, as returned by the object's 'path'
> > + sysfs attribute.
> > + 3. 'modalias' is the value of the object's 'modalias' sysfs
> > + attribute (as described earlier in this document).
> > +
> > +.. note:: N/A indicates the device object does not have the 'path' or the
> > + 'modalias' attribute.
>
>
>
> Thanks,
> Mauro
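
A side note on the _ADR example in the "Linux ACPI Physical Device Glue"
section quoted above: for PCI, the high word of _ADR is the device number
and the low word is the function number, so 0x00020000 means device 2,
function 0. The sketch below shows that decoding. It is only an
illustration under stated assumptions: the function name pci_sysfs_path
is made up, and it assumes the device sits directly on the root bus of
the given segment. Note that PCI sysfs names put a dot before the
function number ("02.0"), whereas the quoted text writes "02:0".

```python
def pci_sysfs_path(adr: int, segment: int = 0, bus: int = 0) -> str:
    """Build the sysfs directory of a root-bus PCI device from its _ADR."""
    device = (adr >> 16) & 0xFFFF   # high word: PCI device number
    function = adr & 0xFFFF         # low word: PCI function number
    root = f"pci{segment:04x}:{bus:02x}"
    name = f"{segment:04x}:{bus:02x}:{device:02x}.{function:d}"
    return f"/sys/devices/{root}/{name}/"


print(pci_sysfs_path(0x00020000))
```

For devices behind a bridge the real path has more components, so treat
this as a way to read the example rather than a general lookup.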

--
Cheers,
Changbin Du

2019-04-24 16:18:48

by Changbin Du

Subject: Re: [PATCH v4 05/63] Documentation: ACPI: move linuxized-acpica.txt to driver-api/acpi and convert to reST

On Tue, Apr 23, 2019 at 05:50:30PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:34 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > adds it to the Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/driver-api/acpi/index.rst | 1 +
> > .../acpi/linuxized-acpica.rst} | 115 ++++++++++--------
> > 2 files changed, 66 insertions(+), 50 deletions(-)
> > rename Documentation/{acpi/linuxized-acpica.txt => driver-api/acpi/linuxized-acpica.rst} (78%)
> >
> > diff --git a/Documentation/driver-api/acpi/index.rst b/Documentation/driver-api/acpi/index.rst
> > index 898b0c60671a..12649947b19b 100644
> > --- a/Documentation/driver-api/acpi/index.rst
> > +++ b/Documentation/driver-api/acpi/index.rst
> > @@ -5,3 +5,4 @@ ACPI Support
> > .. toctree::
> > :maxdepth: 2
> >
> > + linuxized-acpica
> > diff --git a/Documentation/acpi/linuxized-acpica.txt b/Documentation/driver-api/acpi/linuxized-acpica.rst
> > similarity index 78%
> > rename from Documentation/acpi/linuxized-acpica.txt
> > rename to Documentation/driver-api/acpi/linuxized-acpica.rst
> > index 3ad7b0dfb083..f8aaea668e41 100644
> > --- a/Documentation/acpi/linuxized-acpica.txt
> > +++ b/Documentation/driver-api/acpi/linuxized-acpica.rst
> > @@ -1,31 +1,35 @@
> > -Linuxized ACPICA - Introduction to ACPICA Release Automation
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. include:: <isonum.txt>
> >
> > -Copyright (C) 2013-2016, Intel Corporation
> > -Author: Lv Zheng <[email protected]>
> > +============================================================
> > +Linuxized ACPICA - Introduction to ACPICA Release Automation
> > +============================================================
> >
> > +:Copyright: |copy| 2013-2016, Intel Corporation
> >
> > -Abstract:
> > +:Author: Lv Zheng <[email protected]>
> >
> > -This document describes the ACPICA project and the relationship between
> > -ACPICA and Linux. It also describes how ACPICA code in drivers/acpi/acpica,
> > -include/acpi and tools/power/acpi is automatically updated to follow the
> > -upstream.
> > +:Abstract: This document describes the ACPICA project and the relationship
> > + between ACPICA and Linux. It also describes how ACPICA code in
> > + drivers/acpi/acpica, include/acpi and tools/power/acpi is
> > + automatically updated to follow the upstream.
> >
>
> Same comment as on patch 02: I would keep the abstracts as a chapter,
> in order to make them visible at the index, as this may help readers
> to quickly look at the document's contents.
>
ok, done.

> I'm sure other ACPI documents also have abstracts. So, please consider
> this comment also for the other docs.
>
For short descriptions, I'd keep the field as it is. For long ones, I will make the abstract a chapter.
Thanks.

> Anyway, this is just a suggestion. I'm also fine with the above.
> Either way, for the conversion itself:
>
> Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
> >
> > -1. ACPICA Project
> > +ACPICA Project
> > +==============
> >
> > - The ACPI Component Architecture (ACPICA) project provides an operating
> > - system (OS)-independent reference implementation of the Advanced
> > - Configuration and Power Interface Specification (ACPI). It has been
> > - adapted by various host OSes. By directly integrating ACPICA, Linux can
> > - also benefit from the application experiences of ACPICA from other host
> > - OSes.
> > +The ACPI Component Architecture (ACPICA) project provides an operating
> > +system (OS)-independent reference implementation of the Advanced
> > +Configuration and Power Interface Specification (ACPI). It has been
> > +adapted by various host OSes. By directly integrating ACPICA, Linux can
> > +also benefit from the application experiences of ACPICA from other host
> > +OSes.
> >
> > - The homepage of ACPICA project is: http://www.acpica.org, it is maintained and
> > - supported by Intel Corporation.
> > +The homepage of ACPICA project is: http://www.acpica.org, it is maintained and
> > +supported by Intel Corporation.
> >
> > - The following figure depicts the Linux ACPI subsystem where the ACPICA
> > - adaptation is included:
> > +The following figure depicts the Linux ACPI subsystem where the ACPICA
> > +adaptation is included::
> >
> > +---------------------------------------------------------+
> > | |
> > @@ -71,21 +75,27 @@ upstream.
> >
> > Figure 1. Linux ACPI Software Components
> >
> > - NOTE:
> > +.. note::
> > A. OS Service Layer - Provided by Linux to offer OS dependent
> > implementation of the predefined ACPICA interfaces (acpi_os_*).
> > + ::
> > +
> > include/acpi/acpiosxf.h
> > drivers/acpi/osl.c
> > include/acpi/platform
> > include/asm/acenv.h
> > B. ACPICA Functionality - Released from ACPICA code base to offer
> > OS independent implementation of the ACPICA interfaces (acpi_*).
> > + ::
> > +
> > drivers/acpi/acpica
> > include/acpi/ac*.h
> > tools/power/acpi
> > C. Linux/ACPI Functionality - Providing Linux specific ACPI
> > functionality to the other Linux kernel subsystems and user space
> > programs.
> > + ::
> > +
> > drivers/acpi
> > include/linux/acpi.h
> > include/linux/acpi*.h
> > @@ -95,24 +105,27 @@ upstream.
> > ACPI subsystem to offer architecture specific implementation of the
> > ACPI interfaces. They are Linux specific components and are out of
> > the scope of this document.
> > + ::
> > +
> > include/asm/acpi.h
> > include/asm/acpi*.h
> > arch/*/acpi
> >
> > -2. ACPICA Release
> > +ACPICA Release
> > +==============
> >
> > - The ACPICA project maintains its code base at the following repository URL:
> > - https://github.com/acpica/acpica.git. As a rule, a release is made every
> > - month.
> > +The ACPICA project maintains its code base at the following repository URL:
> > +https://github.com/acpica/acpica.git. As a rule, a release is made every
> > +month.
> >
> > - As the coding style adopted by the ACPICA project is not acceptable by
> > - Linux, there is a release process to convert the ACPICA git commits into
> > - Linux patches. The patches generated by this process are referred to as
> > - "linuxized ACPICA patches". The release process is carried out on a local
> > - copy the ACPICA git repository. Each commit in the monthly release is
> > - converted into a linuxized ACPICA patch. Together, they form the monthly
> > - ACPICA release patchset for the Linux ACPI community. This process is
> > - illustrated in the following figure:
> > +As the coding style adopted by the ACPICA project is not acceptable by
> > +Linux, there is a release process to convert the ACPICA git commits into
> > +Linux patches. The patches generated by this process are referred to as
> > +"linuxized ACPICA patches". The release process is carried out on a local
> > +copy the ACPICA git repository. Each commit in the monthly release is
> > +converted into a linuxized ACPICA patch. Together, they form the monthly
> > +ACPICA release patchset for the Linux ACPI community. This process is
> > +illustrated in the following figure::
> >
> > +-----------------------------+
> > | acpica / master (-) commits |
> > @@ -153,7 +166,7 @@ upstream.
> >
> > Figure 2. ACPICA -> Linux Upstream Process
> >
> > - NOTE:
> > +.. note::
> > A. Linuxize Utilities - Provided by the ACPICA repository, including a
> > utility located in source/tools/acpisrc folder and a number of
> > scripts located in generate/linux folder.
> > @@ -170,19 +183,20 @@ upstream.
> > following kernel configuration options:
> > CONFIG_ACPI/CONFIG_ACPI_DEBUG/CONFIG_ACPI_DEBUGGER
> >
> > -3. ACPICA Divergences
> > +ACPICA Divergences
> > +==================
> >
> > - Ideally, all of the ACPICA commits should be converted into Linux patches
> > - automatically without manual modifications, the "linux / master" tree should
> > - contain the ACPICA code that exactly corresponds to the ACPICA code
> > - contained in "new linuxized acpica" tree and it should be possible to run
> > - the release process fully automatically.
> > +Ideally, all of the ACPICA commits should be converted into Linux patches
> > +automatically without manual modifications, the "linux / master" tree should
> > +contain the ACPICA code that exactly corresponds to the ACPICA code
> > +contained in "new linuxized acpica" tree and it should be possible to run
> > +the release process fully automatically.
> >
> > - As a matter of fact, however, there are source code differences between
> > - the ACPICA code in Linux and the upstream ACPICA code, referred to as
> > - "ACPICA Divergences".
> > +As a matter of fact, however, there are source code differences between
> > +the ACPICA code in Linux and the upstream ACPICA code, referred to as
> > +"ACPICA Divergences".
> >
> > - The various sources of ACPICA divergences include:
> > +The various sources of ACPICA divergences include:
> > 1. Legacy divergences - Before the current ACPICA release process was
> > established, there already had been divergences between Linux and
> > ACPICA. Over the past several years those divergences have been greatly
> > @@ -213,11 +227,12 @@ upstream.
> > rebased on the ACPICA side in order to offer better solutions, new ACPICA
> > divergences are generated.
> >
> > -4. ACPICA Development
> > +ACPICA Development
> > +==================
> >
> > - This paragraph guides Linux developers to use the ACPICA upstream release
> > - utilities to obtain Linux patches corresponding to upstream ACPICA commits
> > - before they become available from the ACPICA release process.
> > +This paragraph guides Linux developers to use the ACPICA upstream release
> > +utilities to obtain Linux patches corresponding to upstream ACPICA commits
> > +before they become available from the ACPICA release process.
> >
> > 1. Cherry-pick an ACPICA commit
> >
> > @@ -225,7 +240,7 @@ upstream.
> > you want to cherry pick must be committed into the local repository.
> >
> > Then the gen-patch.sh command can help to cherry-pick an ACPICA commit
> > - from the ACPICA local repository:
> > + from the ACPICA local repository::
> >
> > $ git clone https://github.com/acpica/acpica
> > $ cd acpica
> > @@ -240,7 +255,7 @@ upstream.
> > changes that haven't been applied to Linux yet.
> >
> > You can generate the ACPICA release series yourself and rebase your code on
> > - top of the generated ACPICA release patches:
> > + top of the generated ACPICA release patches::
> >
> > $ git clone https://github.com/acpica/acpica
> > $ cd acpica
> > @@ -254,7 +269,7 @@ upstream.
> > 3. Inspect the current divergences
> >
> > If you have local copies of both Linux and upstream ACPICA, you can generate
> > - a diff file indicating the state of the current divergences:
> > + a diff file indicating the state of the current divergences::
> >
> > # git clone https://github.com/acpica/acpica
> > # git clone http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 16:19:57

by Jonathan Corbet

Subject: Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

On Wed, 24 Apr 2019 00:28:29 +0800
Changbin Du <[email protected]> wrote:

> The kernel now uses Sphinx to generate intelligent and beautiful documentation
> from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 docs to
> reST format in this series.
>
> In this version I combined ACPI and PCI docs, and added new x86 docs conversion.

As mentioned by others, this is a lot of stuff; I would really rather see
each of those groups as separate patch sets.

If you can do a reasonably quick turnaround with Mauro's suggestions
addressed and tags applied, we should be able to get at least some of this
into 5.2. Thanks, Mauro, for looking at all of this stuff!

Thanks,

jon

2019-04-24 16:34:58

by Changbin Du

Subject: Re: [PATCH v4 10/63] Documentation: ACPI: move initrd_table_override.txt to admin-guide/acpi and convert to reST

On Tue, Apr 23, 2019 at 06:07:34PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:39 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > adds it to the Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/acpi/initrd_table_override.txt | 111 ----------------
> > Documentation/admin-guide/acpi/index.rst | 1 +
> > .../acpi/initrd_table_override.rst | 120 ++++++++++++++++++
> > 3 files changed, 121 insertions(+), 111 deletions(-)
> > delete mode 100644 Documentation/acpi/initrd_table_override.txt
> > create mode 100644 Documentation/admin-guide/acpi/initrd_table_override.rst
> >
> > diff --git a/Documentation/acpi/initrd_table_override.txt b/Documentation/acpi/initrd_table_override.txt
> > deleted file mode 100644
> > index 30437a6db373..000000000000
> > --- a/Documentation/acpi/initrd_table_override.txt
> > +++ /dev/null
> > @@ -1,111 +0,0 @@
> > -Upgrading ACPI tables via initrd
> > -================================
> > -
> > -1) Introduction (What is this about)
> > -2) What is this for
> > -3) How does it work
> > -4) References (Where to retrieve userspace tools)
> > -
> > -1) What is this about
> > ----------------------
> > -
> > -If the ACPI_TABLE_UPGRADE compile option is true, it is possible to
> > -upgrade the ACPI execution environment that is defined by the ACPI tables
> > -via upgrading the ACPI tables provided by the BIOS with an instrumented,
> > -modified, more recent version one, or installing brand new ACPI tables.
> > -
> > -When building initrd with kernel in a single image, option
> > -ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
> > -feature to work.
> > -
> > -For a full list of ACPI tables that can be upgraded/installed, take a look
> > -at the char *table_sigs[MAX_ACPI_SIGNATURE]; definition in
> > -drivers/acpi/tables.c.
> > -All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should
> > -be overridable, except:
> > - - ACPI_SIG_RSDP (has a signature of 6 bytes)
> > - - ACPI_SIG_FACS (does not have an ordinary ACPI table header)
> > -Both could get implemented as well.
> > -
> > -
> > -2) What is this for
> > --------------------
> > -
> > -Complain to your platform/BIOS vendor if you find a bug which is so severe
> > -that a workaround is not accepted in the Linux kernel. And this facility
> > -allows you to upgrade the buggy tables before your platform/BIOS vendor
> > -releases an upgraded BIOS binary.
> > -
> > -This facility can be used by platform/BIOS vendors to provide a Linux
> > -compatible environment without modifying the underlying platform firmware.
> > -
> > -This facility also provides a powerful feature to easily debug and test
> > -ACPI BIOS table compatibility with the Linux kernel by modifying old
> > -platform provided ACPI tables or inserting new ACPI tables.
> > -
> > -It can and should be enabled in any kernel because there is no functional
> > -change with not instrumented initrds.
> > -
> > -
> > -3) How does it work
> > --------------------
> > -
> > -# Extract the machine's ACPI tables:
> > -cd /tmp
> > -acpidump >acpidump
> > -acpixtract -a acpidump
> > -# Disassemble, modify and recompile them:
> > -iasl -d *.dat
> > -# For example add this statement into a _PRT (PCI Routing Table) function
> > -# of the DSDT:
> > -Store("HELLO WORLD", debug)
> > -# And increase the OEM Revision. For example, before modification:
> > -DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000000)
> > -# After modification:
> > -DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000001)
> > -iasl -sa dsdt.dsl
> > -# Add the raw ACPI tables to an uncompressed cpio archive.
> > -# They must be put into a /kernel/firmware/acpi directory inside the cpio
> > -# archive. Note that if the table put here matches a platform table
> > -# (similar Table Signature, and similar OEMID, and similar OEM Table ID)
> > -# with a more recent OEM Revision, the platform table will be upgraded by
> > -# this table. If the table put here doesn't match a platform table
> > -# (dissimilar Table Signature, or dissimilar OEMID, or dissimilar OEM Table
> > -# ID), this table will be appended.
> > -mkdir -p kernel/firmware/acpi
> > -cp dsdt.aml kernel/firmware/acpi
> > -# A maximum of "NR_ACPI_INITRD_TABLES (64)" tables are currently allowed
> > -# (see osl.c):
> > -iasl -sa facp.dsl
> > -iasl -sa ssdt1.dsl
> > -cp facp.aml kernel/firmware/acpi
> > -cp ssdt1.aml kernel/firmware/acpi
> > -# The uncompressed cpio archive must be the first. Other, typically
> > -# compressed cpio archives, must be concatenated on top of the uncompressed
> > -# one. Following command creates the uncompressed cpio archive and
> > -# concatenates the original initrd on top:
> > -find kernel | cpio -H newc --create > /boot/instrumented_initrd
> > -cat /boot/initrd >>/boot/instrumented_initrd
> > -# reboot with increased acpi debug level, e.g. boot params:
> > -acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
> > -# and check your syslog:
> > -[ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> > -[ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD"
> > -
> > -iasl is able to disassemble and recompile quite a lot different,
> > -also static ACPI tables.
> > -
> > -
> > -4) Where to retrieve userspace tools
> > -------------------------------------
> > -
> > -iasl and acpixtract are part of Intel's ACPICA project:
> > -http://acpica.org/
> > -and should be packaged by distributions (for example in the acpica package
> > -on SUSE).
> > -
> > -acpidump can be found in Len Browns pmtools:
> > -ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump
> > -This tool is also part of the acpica package on SUSE.
> > -Alternatively, used ACPI tables can be retrieved via sysfs in latest kernels:
> > -/sys/firmware/acpi/tables
> > diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> > index 3e041206089d..09e4e81e4fb7 100644
> > --- a/Documentation/admin-guide/acpi/index.rst
> > +++ b/Documentation/admin-guide/acpi/index.rst
> > @@ -8,3 +8,4 @@ the Linux ACPI support.
> > .. toctree::
> > :maxdepth: 1
> >
> > + initrd_table_override
> > diff --git a/Documentation/admin-guide/acpi/initrd_table_override.rst b/Documentation/admin-guide/acpi/initrd_table_override.rst
> > new file mode 100644
> > index 000000000000..0787b2b91ded
> > --- /dev/null
> > +++ b/Documentation/admin-guide/acpi/initrd_table_override.rst
> > @@ -0,0 +1,120 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +================================
> > +Upgrading ACPI tables via initrd
> > +================================
> > +
> > +1) Introduction (What is this about)
> > +2) What is this for
> > +3) How does it work
> > +4) References (Where to retrieve userspace tools)
>
> Hmm... I did the same on my conversion, but IMO, the best would be to
> hide (or remove, if ACPI maintainers agree) the contents, as this may
> conflict with the body as people may add new stuff and forget to
> update it.
>
> So, if ACPI maintainers insist on keeping it, I would code this as:
>
> .. Contents
>
> 1) Introduction (What is this about)
> 2) What is this for
> 3) How does it work
> 4) References (Where to retrieve userspace tools)
>
> as this will make this invisible on html/pdf/epub output.
>
I just removed it. If anyone wants it back, please comment. Thanks.
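
For reference, the drift Mauro describes is already visible in this file: the hand-written list says "1) Introduction (What is this about)" while the section itself is titled "1) What is this about". A minimal sketch of a check for that, assuming sections are underlined with '=' as in this patch (both helper names are hypothetical, not part of any kernel tooling):

```python
from typing import List


def section_titles(rst_text: str) -> List[str]:
    """Return the titles that are underlined with '=' in a reST document."""
    lines = rst_text.splitlines()
    titles = []
    for title, underline in zip(lines, lines[1:]):
        title = title.strip()
        underline = underline.strip()
        is_rule = bool(underline) and set(underline) == {"="}
        # A title is non-empty text (not itself a rule) followed by a
        # '=' rule that is at least as long as the title.
        if title and set(title) != {"="} and is_rule and len(underline) >= len(title):
            titles.append(title)
    return titles


def stale_entries(contents: List[str], rst_text: str) -> List[str]:
    """Return the contents entries that match no section title in the body."""
    titles = set(section_titles(rst_text))
    return [entry for entry in contents if entry not in titles]
```

Since Sphinx builds its own table of contents from the section titles, dropping the manual list avoids this kind of mismatch altogether.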

> Anyway, with or without the above change:
>
> Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
> > +
> > +1) What is this about
> > +=====================
> > +
> > +If the ACPI_TABLE_UPGRADE compile option is true, it is possible to
> > +upgrade the ACPI execution environment that is defined by the ACPI tables
> > +via upgrading the ACPI tables provided by the BIOS with an instrumented,
> > +modified, more recent version one, or installing brand new ACPI tables.
> > +
> > +When building initrd with kernel in a single image, option
> > +ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD should also be true for this
> > +feature to work.
> > +
> > +For a full list of ACPI tables that can be upgraded/installed, take a look
> > +at the char `*table_sigs[MAX_ACPI_SIGNATURE];` definition in
> > +drivers/acpi/tables.c.
> > +
> > +All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should
> > +be overridable, except:
> > +
> > + - ACPI_SIG_RSDP (has a signature of 6 bytes)
> > + - ACPI_SIG_FACS (does not have an ordinary ACPI table header)
> > +
> > +Both could get implemented as well.
> > +
> > +
> > +2) What is this for
> > +===================
> > +
> > +Complain to your platform/BIOS vendor if you find a bug which is so severe
> > +that a workaround is not accepted in the Linux kernel. And this facility
> > +allows you to upgrade the buggy tables before your platform/BIOS vendor
> > +releases an upgraded BIOS binary.
> > +
> > +This facility can be used by platform/BIOS vendors to provide a Linux
> > +compatible environment without modifying the underlying platform firmware.
> > +
> > +This facility also provides a powerful feature to easily debug and test
> > +ACPI BIOS table compatibility with the Linux kernel by modifying old
> > +platform provided ACPI tables or inserting new ACPI tables.
> > +
> > +It can and should be enabled in any kernel because there is no functional
> > +change with not instrumented initrds.
> > +
> > +
> > +3) How does it work
> > +===================
> > +::
> > +
> > + # Extract the machine's ACPI tables:
> > + cd /tmp
> > + acpidump >acpidump
> > + acpixtract -a acpidump
> > + # Disassemble, modify and recompile them:
> > + iasl -d *.dat
> > + # For example add this statement into a _PRT (PCI Routing Table) function
> > + # of the DSDT:
> > + Store("HELLO WORLD", debug)
> > + # And increase the OEM Revision. For example, before modification:
> > + DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000000)
> > + # After modification:
> > + DefinitionBlock ("DSDT.aml", "DSDT", 2, "INTEL ", "TEMPLATE", 0x00000001)
> > + iasl -sa dsdt.dsl
> > + # Add the raw ACPI tables to an uncompressed cpio archive.
> > + # They must be put into a /kernel/firmware/acpi directory inside the cpio
> > + # archive. Note that if the table put here matches a platform table
> > + # (similar Table Signature, and similar OEMID, and similar OEM Table ID)
> > + # with a more recent OEM Revision, the platform table will be upgraded by
> > + # this table. If the table put here doesn't match a platform table
> > + # (dissimilar Table Signature, or dissimilar OEMID, or dissimilar OEM Table
> > + # ID), this table will be appended.
> > + mkdir -p kernel/firmware/acpi
> > + cp dsdt.aml kernel/firmware/acpi
> > + # A maximum of "NR_ACPI_INITRD_TABLES (64)" tables are currently allowed
> > + # (see osl.c):
> > + iasl -sa facp.dsl
> > + iasl -sa ssdt1.dsl
> > + cp facp.aml kernel/firmware/acpi
> > + cp ssdt1.aml kernel/firmware/acpi
> > + # The uncompressed cpio archive must be the first. Other, typically
> > + # compressed cpio archives, must be concatenated on top of the uncompressed
> > + # one. Following command creates the uncompressed cpio archive and
> > + # concatenates the original initrd on top:
> > + find kernel | cpio -H newc --create > /boot/instrumented_initrd
> > + cat /boot/initrd >>/boot/instrumented_initrd
> > + # reboot with increased acpi debug level, e.g. boot params:
> > + acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF
> > + # and check your syslog:
> > + [ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> > + [ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD"
> > +
> > +iasl is able to disassemble and recompile quite a lot different,
> > +also static ACPI tables.
> > +
> > +
> > +4) Where to retrieve userspace tools
> > +====================================
> > +
> > +iasl and acpixtract are part of Intel's ACPICA project:
> > +http://acpica.org/
> > +
> > +and should be packaged by distributions (for example in the acpica package
> > +on SUSE).
> > +
> > +acpidump can be found in Len Browns pmtools:
> > +ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump
> > +
> > +This tool is also part of the acpica package on SUSE.
> > +Alternatively, used ACPI tables can be retrieved via sysfs in latest kernels:
> > +/sys/firmware/acpi/tables
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 16:45:49

by Changbin Du

Subject: Re: [PATCH v4 15/63] Documentation: ACPI: move dsd/data-node-references.txt to firmware-guide/acpi and convert to reST

On Tue, Apr 23, 2019 at 06:17:48PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:44 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > adds it to the Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > .../acpi/dsd/data-node-references.rst} | 28 +++++++++++--------
> > Documentation/firmware-guide/acpi/index.rst | 1 +
> > 2 files changed, 17 insertions(+), 12 deletions(-)
> > rename Documentation/{acpi/dsd/data-node-references.txt => firmware-guide/acpi/dsd/data-node-references.rst} (79%)
> >
> > diff --git a/Documentation/acpi/dsd/data-node-references.txt b/Documentation/firmware-guide/acpi/dsd/data-node-references.rst
> > similarity index 79%
> > rename from Documentation/acpi/dsd/data-node-references.txt
> > rename to Documentation/firmware-guide/acpi/dsd/data-node-references.rst
> > index c3871565c8cf..79c5368eaecf 100644
> > --- a/Documentation/acpi/dsd/data-node-references.txt
> > +++ b/Documentation/firmware-guide/acpi/dsd/data-node-references.rst
> > @@ -1,9 +1,12 @@
> > -Copyright (C) 2018 Intel Corporation
> > -Author: Sakari Ailus <[email protected]>
> > -
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. include:: <isonum.txt>
> >
> > +===================================
> > Referencing hierarchical data nodes
> > ------------------------------------
> > +===================================
> > +
> > +:Copyright: |copy| 2018 Intel Corporation
> > +:Author: Sakari Ailus <[email protected]>
> >
> > ACPI in general allows referring to device objects in the tree only.
> > Hierarchical data extension nodes may not be referred to directly, hence this
> > @@ -28,13 +31,14 @@ extension key.
> >
> >
> > Example
> > --------
> > +=======
> >
> > - In the ASL snippet below, the "reference" _DSD property [2] contains a
> > - device object reference to DEV0 and under that device object, a
> > - hierarchical data extension key "node@1" referring to the NOD1 object
> > - and lastly, a hierarchical data extension key "anothernode" referring to
> > - the ANOD object which is also the final target node of the reference.
> > +In the ASL snippet below, the "reference" _DSD property [2] contains a
> > +device object reference to DEV0 and under that device object, a
> > +hierarchical data extension key "node@1" referring to the NOD1 object
> > +and lastly, a hierarchical data extension key "anothernode" referring to
> > +the ANOD object which is also the final target node of the reference.
> > +::
> >
> > Device (DEV0)
> > {
> > @@ -75,10 +79,10 @@ Example
> > })
> > }
> >
> > -Please also see a graph example in graph.txt .
> > +Please also see a graph example in :doc:`graph`.
> >
> > References
> > -----------
> > +==========
> >
> > [1] Hierarchical Data Extension UUID For _DSD.
> > <URL:http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf>,
>
> Hmm... on the previous patch, you replaced <URL:some_url> by some_url,
> which makes sense. Please do the same here and in the other patches of
> this series that describe URLs in a similar way.
>
Done, thanks.
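
For anyone making the same change elsewhere, the wrapper can be stripped mechanically. A minimal sketch, assuming the wrapper always has the form <URL:some_url> on a single line (the function name is hypothetical):

```python
import re

# Matches the old plain-text convention "<URL:http://...>" and captures
# the bare URL between the prefix and the closing bracket.
_URL_WRAPPER = re.compile(r"<URL:\s*([^>\s]+)\s*>")


def strip_url_wrappers(text: str) -> str:
    """Replace every <URL:some_url> with some_url, leaving other text alone."""
    return _URL_WRAPPER.sub(r"\1", text)


if __name__ == "__main__":
    import sys

    # Filter stdin to stdout so the result can be reviewed before committing.
    sys.stdout.write(strip_url_wrappers(sys.stdin.read()))
```

A URL that was wrapped across two lines would not match and would still need a manual fix.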

> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index f81cfbcb6878..6d4e0df4f063 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -9,6 +9,7 @@ ACPI Support
> >
> > namespace
> > dsd/graph
> > + dsd/data-node-references
> > enumeration
> > osi
> > method-customizing
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 16:55:59

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

Em Wed, 24 Apr 2019 10:18:46 -0600
Jonathan Corbet <[email protected]> escreveu:

> On Wed, 24 Apr 2019 00:28:29 +0800
> Changbin Du <[email protected]> wrote:
>
> > The kernel now uses Sphinx to generate intelligent and beautiful documentation
> > from reStructuredText files. I converted all of the Linux ACPI/PCI/X86 docs to
> > reST format in this series.
> >
> > In this version I combined ACPI and PCI docs, and added new x86 docs conversion.
>
> As mentioned by others, this is a lot of stuff; I would really rather see
> each of those groups as separate patch sets.

Yeah, makes sense to me, either to split into separate patchsets or to
group the changes per sub-dir (or both).

> If you can do a reasonably quick turnaround with Mauro's suggestions
> addressed and tags applied, we should be able to get at least some of this
> into 5.2. Thanks, Mauro, for looking at all of this stuff!

Anytime! Just to be clear, I'm still reviewing it... I'm at patch 35/63
now. So, expect more comments from my side.


Thanks,
Mauro

2019-04-24 17:06:19

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 37/63] Documentation: add Linux x86 docs to Sphinx TOC tree

Em Wed, 24 Apr 2019 00:29:06 +0800
Changbin Du <[email protected]> escreveu:

> Add an index.rst for x86 support. More docs will be added later.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/index.rst | 1 +
> Documentation/x86/index.rst | 9 +++++++++
> 2 files changed, 10 insertions(+)
> create mode 100644 Documentation/x86/index.rst
>
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index d80138284e0f..f185c8040fa9 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -112,6 +112,7 @@ implementation.
> .. toctree::
> :maxdepth: 2
>
> + x86/index
> sh/index
>
> Filesystem Documentation
> diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> new file mode 100644
> index 000000000000..7612d3142b2a
> --- /dev/null
> +++ b/Documentation/x86/index.rst
> @@ -0,0 +1,9 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=================
> +Linux x86 Support
> +=================
> +
> +.. toctree::
> + :maxdepth: 2
> + :numbered:

Looks ok to me:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

Just one reminder: after merging both this and my conversions, we may
need to review the architecture titles to make them consistent.

In my conversion patches, I'm using:

==================
$arch Architecture
==================

Probably also not the best title. Anyway, this can easily be fixed
with a follow up patch once we get everything merged.
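
Whatever wording we settle on, the titles could be generated rather than typed, so the overline and underline always match the text. A rough sketch, assuming the "$arch Architecture" pattern above (the helper name is made up for illustration):

```python
def rst_title(text: str, rule: str = "=") -> str:
    """Return a reST document title with an overline and underline of matching length."""
    bar = rule * len(text)
    return f"{bar}\n{text}\n{bar}"


if __name__ == "__main__":
    # The architecture names here are only examples.
    for arch in ("x86", "sh"):
        print(rst_title(f"{arch} Architecture"))
        print()
```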

Thanks,
Mauro

2019-04-24 17:13:55

by Changbin Du

Subject: Re: [PATCH v4 20/63] Documentation: ACPI: move apei/einj.txt to firmware-guide/acpi and convert to reST

On Wed, Apr 24, 2019 at 11:33:49AM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:49 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > adds it to the Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > .../acpi/apei/einj.rst} | 98 ++++++++++---------
> > Documentation/firmware-guide/acpi/index.rst | 1 +
> > 2 files changed, 53 insertions(+), 46 deletions(-)
> > rename Documentation/{acpi/apei/einj.txt => firmware-guide/acpi/apei/einj.rst} (67%)
> >
> > diff --git a/Documentation/acpi/apei/einj.txt b/Documentation/firmware-guide/acpi/apei/einj.rst
> > similarity index 67%
> > rename from Documentation/acpi/apei/einj.txt
> > rename to Documentation/firmware-guide/acpi/apei/einj.rst
> > index e550c8b98139..d85e2667155c 100644
> > --- a/Documentation/acpi/apei/einj.txt
> > +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
> > @@ -1,13 +1,16 @@
> > - APEI Error INJection
> > - ~~~~~~~~~~~~~~~~~~~~
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +====================
> > +APEI Error INJection
> > +====================
> >
> > EINJ provides a hardware error injection mechanism. It is very useful
> > for debugging and testing APEI and RAS features in general.
> >
> > You need to check whether your BIOS supports EINJ first. For that, look
> > -for early boot messages similar to this one:
> > +for early boot messages similar to this one::
> >
> > -ACPI: EINJ 0x000000007370A000 000150 (v01 INTEL 00000001 INTL 00000001)
> > + ACPI: EINJ 0x000000007370A000 000150 (v01 INTEL 00000001 INTL 00000001)
> >
> > which shows that the BIOS is exposing an EINJ table - it is the
> > mechanism through which the injection is done.
> > @@ -23,11 +26,11 @@ order to see the APEI,EINJ,... functionality supported and exposed by
> > the BIOS menu.
> >
> > To use EINJ, make sure the following are options enabled in your kernel
> > -configuration:
> > +configuration::
> >
> > -CONFIG_DEBUG_FS
> > -CONFIG_ACPI_APEI
> > -CONFIG_ACPI_APEI_EINJ
> > + CONFIG_DEBUG_FS
> > + CONFIG_ACPI_APEI
> > + CONFIG_ACPI_APEI_EINJ
> >
> > The EINJ user interface is in <debugfs mount point>/apei/einj.
> >
> > @@ -35,22 +38,22 @@ The following files belong to it:
> >
> > - available_error_type
> >
> > - This file shows which error types are supported:
> > -
> > - Error Type Value Error Description
> > - ================ =================
> > - 0x00000001 Processor Correctable
> > - 0x00000002 Processor Uncorrectable non-fatal
> > - 0x00000004 Processor Uncorrectable fatal
> > - 0x00000008 Memory Correctable
> > - 0x00000010 Memory Uncorrectable non-fatal
> > - 0x00000020 Memory Uncorrectable fatal
> > - 0x00000040 PCI Express Correctable
> > - 0x00000080 PCI Express Uncorrectable fatal
> > - 0x00000100 PCI Express Uncorrectable non-fatal
> > - 0x00000200 Platform Correctable
> > - 0x00000400 Platform Uncorrectable non-fatal
> > - 0x00000800 Platform Uncorrectable fatal
> > + This file shows which error types are supported::
> > +
> > + Error Type Value Error Description
> > + ================ =================
> > + 0x00000001 Processor Correctable
> > + 0x00000002 Processor Uncorrectable non-fatal
> > + 0x00000004 Processor Uncorrectable fatal
> > + 0x00000008 Memory Correctable
> > + 0x00000010 Memory Uncorrectable non-fatal
> > + 0x00000020 Memory Uncorrectable fatal
> > + 0x00000040 PCI Express Correctable
> > + 0x00000080 PCI Express Uncorrectable fatal
> > + 0x00000100 PCI Express Uncorrectable non-fatal
> > + 0x00000200 Platform Correctable
> > + 0x00000400 Platform Uncorrectable non-fatal
> > + 0x00000800 Platform Uncorrectable fatal
>
> This is a table and not a literal block.
>
> The best way to preserve the author's intent here is to just adjust the
> table markup so that it is parseable, e.g.:
>
> This file shows which error types are supported:
>
> ================ ===================================
> Error Type Value Error Description
> ================ ===================================
> 0x00000001       Processor Correctable
> 0x00000002       Processor Uncorrectable non-fatal
> 0x00000004       Processor Uncorrectable fatal
> 0x00000008       Memory Correctable
> 0x00000010       Memory Uncorrectable non-fatal
> 0x00000020       Memory Uncorrectable fatal
> 0x00000040       PCI Express Correctable
> 0x00000080       PCI Express Uncorrectable fatal
> 0x00000100       PCI Express Uncorrectable non-fatal
> 0x00000200       Platform Correctable
> 0x00000400       Platform Uncorrectable non-fatal
> 0x00000800       Platform Uncorrectable fatal
> ================ ===================================
>
Done, thanks.

> After such change:
>
> Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
> >
> > The format of the file contents are as above, except present are only
> > the available error types.
> > @@ -73,9 +76,12 @@ The following files belong to it:
> > injection. Value is a bitmask as specified in ACPI5.0 spec for the
> > SET_ERROR_TYPE_WITH_ADDRESS data structure:
> >
> > - Bit 0 - Processor APIC field valid (see param3 below).
> > - Bit 1 - Memory address and mask valid (param1 and param2).
> > - Bit 2 - PCIe (seg,bus,dev,fn) valid (see param4 below).
> > + Bit 0
> > + Processor APIC field valid (see param3 below).
> > + Bit 1
> > + Memory address and mask valid (param1 and param2).
> > + Bit 2
> > + PCIe (seg,bus,dev,fn) valid (see param4 below).
> >
> > If set to zero, legacy behavior is mimicked where the type of
> > injection specifies just one bit set, and param1 is multiplexed.
> > @@ -121,7 +127,7 @@ BIOS versions based on the ACPI 5.0 specification have more control over
> > the target of the injection. For processor-related errors (type 0x1, 0x2
> > and 0x4), you can set flags to 0x3 (param3 for bit 0, and param1 and
> > param2 for bit 1) so that you have more information added to the error
> > -signature being injected. The actual data passed is this:
> > +signature being injected. The actual data passed is this::
> >
> > memory_address = param1;
> > memory_address_range = param2;
> > @@ -131,7 +137,7 @@ signature being injected. The actual data passed is this:
> > For memory errors (type 0x8, 0x10 and 0x20) the address is set using
> > param1 with a mask in param2 (0x0 is equivalent to all ones). For PCI
> > express errors (type 0x40, 0x80 and 0x100) the segment, bus, device and
> > -function are specified using param1:
> > +function are specified using param1::
> >
> > 31 24 23 16 15 11 10 8 7 0
> > +-------------------------------------------------+
> > @@ -152,26 +158,26 @@ documentation for details (and expect changes to this API if vendors
> > creativity in using this feature expands beyond our expectations).
> >
> >
> > -An error injection example:
> > +An error injection example::
> >
> > -# cd /sys/kernel/debug/apei/einj
> > -# cat available_error_type # See which errors can be injected
> > -0x00000002 Processor Uncorrectable non-fatal
> > -0x00000008 Memory Correctable
> > -0x00000010 Memory Uncorrectable non-fatal
> > -# echo 0x12345000 > param1 # Set memory address for injection
> > -# echo $((-1 << 12)) > param2 # Mask 0xfffffffffffff000 - anywhere in this page
> > -# echo 0x8 > error_type # Choose correctable memory error
> > -# echo 1 > error_inject # Inject now
> > + # cd /sys/kernel/debug/apei/einj
> > + # cat available_error_type # See which errors can be injected
> > + 0x00000002 Processor Uncorrectable non-fatal
> > + 0x00000008 Memory Correctable
> > + 0x00000010 Memory Uncorrectable non-fatal
> > + # echo 0x12345000 > param1 # Set memory address for injection
> > + # echo $((-1 << 12)) > param2 # Mask 0xfffffffffffff000 - anywhere in this page
> > + # echo 0x8 > error_type # Choose correctable memory error
> > + # echo 1 > error_inject # Inject now
> >
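As a side illustration of the param2 mask in the example above, here is a minimal Python sketch. It assumes 64-bit arithmetic, and `page_mask` is a hypothetical helper name, not part of the EINJ interface. It only shows that `-1 << 12` corresponds to the mask quoted in the comment, and how such a mask selects "anywhere in this page".

```python
def page_mask(shift: int = 12, width: int = 64) -> int:
    """Return (-1 << shift) as an unsigned value truncated to `width` bits."""
    return (-1 << shift) & ((1 << width) - 1)


mask = page_mask()
print(hex(mask))

# Two addresses fall in the same 4 KiB page if they agree on all masked bits.
target = 0x12345000
candidate = 0x12345ABC
print((candidate & mask) == (target & mask))
```

The helper name and the sample addresses are illustrative only; the actual matching is done by the platform firmware.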
> > -You should see something like this in dmesg:
> > +You should see something like this in dmesg::
> >
> > -[22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
> > -[22715.834759] EDAC sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
> > -[22715.834759] EDAC sbridge MC3: TSC 0
> > -[22715.834759] EDAC sbridge MC3: ADDR 12345000 EDAC sbridge MC3: MISC 144780c86
> > -[22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
> > -[22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
> > + [22715.830801] EDAC sbridge MC3: HANDLING MCE MEMORY ERROR
> > + [22715.834759] EDAC sbridge MC3: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
> > + [22715.834759] EDAC sbridge MC3: TSC 0
> > + [22715.834759] EDAC sbridge MC3: ADDR 12345000 EDAC sbridge MC3: MISC 144780c86
> > + [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
> > + [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
> >
> > For more information about EINJ, please refer to ACPI specification
> > version 4.0, section 17.5 and ACPI 5.0, section 18.6.
> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index 869badba6d7a..fca854f017d8 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -18,6 +18,7 @@ ACPI Support
> > debug
> > aml-debugger
> > apei/output_format
> > + apei/einj
> > gpio-properties
> > i2c-muxes
> > acpi-lid
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 17:44:02

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 38/63] Documentation: x86: convert boot.txt to reST

Em Wed, 24 Apr 2019 00:29:07 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/x86/boot.rst | 1205 +++++++++++++++++++++++++++++++++++
> Documentation/x86/boot.txt | 1130 --------------------------------
> Documentation/x86/index.rst | 2 +
> 3 files changed, 1207 insertions(+), 1130 deletions(-)
> create mode 100644 Documentation/x86/boot.rst
> delete mode 100644 Documentation/x86/boot.txt
>
> diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
> new file mode 100644
> index 000000000000..9f55e832bc47
> --- /dev/null
> +++ b/Documentation/x86/boot.rst
> @@ -0,0 +1,1205 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========================
> +The Linux/x86 Boot Protocol
> +===========================
> +
> +On the x86 platform, the Linux kernel uses a rather complicated boot
> +convention. This has evolved partially due to historical aspects, as
> +well as the desire in the early days to have the kernel itself be a
> +bootable image, the complicated PC memory model and due to changed
> +expectations in the PC industry caused by the effective demise of
> +real-mode DOS as a mainstream operating system.
> +
> +Currently, the following versions of the Linux/x86 boot protocol exist.
> +
> +Old kernels:
> + zImage/Image support only. Some very early kernels
> + may not even support a command line.
> +
> +Protocol 2.00:
> + (Kernel 1.3.73) Added bzImage and initrd support, as
> + well as a formalized way to communicate between the
> + boot loader and the kernel. setup.S made relocatable,
> + although the traditional setup area still assumed writable.
> +
> +Protocol 2.01:
> + (Kernel 1.3.76) Added a heap overrun warning.
> +
> +Protocol 2.02:
> + (Kernel 2.4.0-test3-pre3) New command line protocol.
> + Lower the conventional memory ceiling. No overwrite
> + of the traditional setup area, thus making booting
> + safe for systems which use the EBDA from SMM or 32-bit
> + BIOS entry points. zImage deprecated but still supported.
> +
> +Protocol 2.03:
> + (Kernel 2.4.18-pre1) Explicitly makes the highest possible
> + initrd address available to the bootloader.
> +
> +Protocol 2.04:
> + (Kernel 2.6.14) Extend the syssize field to four bytes.
> +
> +Protocol 2.05:
> + (Kernel 2.6.20) Make protected mode kernel relocatable.
> + Introduce relocatable_kernel and kernel_alignment fields.
> +
> +Protocol 2.06:
> + (Kernel 2.6.22) Added a field that contains the size of
> + the boot command line.
> +
> +Protocol 2.07:
> + (Kernel 2.6.24) Added paravirtualised boot protocol.
> + Introduced hardware_subarch and hardware_subarch_data
> + and KEEP_SEGMENTS flag in load_flags.
> +
> +Protocol 2.08:
> + (Kernel 2.6.26) Added crc32 checksum and ELF format
> + payload. Introduced payload_offset and payload_length
> + fields to aid in locating the payload.
> +
> +Protocol 2.09:
> + (Kernel 2.6.26) Added a field of 64-bit physical
> + pointer to single linked list of struct setup_data.
> +
> +Protocol 2.10:
> + (Kernel 2.6.31) Added a protocol for relaxed alignment
> + beyond the kernel_alignment added, new init_size and
> + pref_address fields. Added extended boot loader IDs.
> +
> +Protocol 2.11:
> + (Kernel 3.6) Added a field for offset of EFI handover
> + protocol entry point.
> +
> +Protocol 2.12:
> + (Kernel 3.8) Added the xloadflags field and extension fields
> + to struct boot_params for loading bzImage and ramdisk
> + above 4G in 64bit.

This is a side note, but you should really try to avoid replacing too
many lines, as it makes review a lot harder for no good reason.

For example, this is the way I would convert this changelog table:


@@ -10,6 +11,7 @@ real-mode DOS as a mainstream operating system.

Currently, the following versions of the Linux/x86 boot protocol exist.

+=============== ===============================================================
Old kernels: zImage/Image support only. Some very early kernels
may not even support a command line.

@@ -64,33 +66,35 @@ Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
Protocol 2.13: (Kernel 3.14) Support 32- and 64-bit flags being set in
xloadflags to support booting a 64-bit kernel from 32-bit
EFI
+=============== ===============================================================


This is simple enough, preserves the original author's intent and
makes it a lot easier for reviewers to check what you changed.

> +
> +MEMORY LAYOUT
> +=============
> +
> +The traditional memory map for the kernel loader, used for Image or
> +zImage kernels, typically looks like::
> +
> + | |
> + 0A0000 +------------------------+
> + | Reserved for BIOS | Do not use. Reserved for BIOS EBDA.
> + 09A000 +------------------------+
> + | Command line |
> + | Stack/heap | For use by the kernel real-mode code.
> + 098000 +------------------------+
> + | Kernel setup | The kernel real-mode code.
> + 090200 +------------------------+
> + | Kernel boot sector | The kernel legacy boot sector.
> + 090000 +------------------------+
> + | Protected-mode kernel | The bulk of the kernel image.
> + 010000 +------------------------+
> + | Boot loader | <- Boot sector entry point 0000:7C00
> + 001000 +------------------------+
> + | Reserved for MBR/BIOS |
> + 000800 +------------------------+
> + | Typically used by MBR |
> + 000600 +------------------------+
> + | BIOS use only |
> + 000000 +------------------------+
> +
> +

I might be wrong, but it seems that you broke the above ASCII
artwork.

> +When using bzImage, the protected-mode kernel was relocated to
> +0x100000 ("high memory"), and the kernel real-mode block (boot sector,
> +setup, and stack/heap) was made relocatable to any address between
> +0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
> +2.01 the 0x90000+ memory range is still used internally by the kernel;
> +the 2.02 protocol resolves that problem.
> +
> +It is desirable to keep the "memory ceiling" -- the highest point in
> +low memory touched by the boot loader -- as low as possible, since
> +some newer BIOSes have begun to allocate some rather large amounts of
> +memory, called the Extended BIOS Data Area, near the top of low
> +memory. The boot loader should use the "INT 12h" BIOS call to verify
> +how much low memory is available.
> +
> +Unfortunately, if INT 12h reports that the amount of memory is too
> +low, there is usually nothing the boot loader can do but to report an
> +error to the user. The boot loader should therefore be designed to
> +take up as little space in low memory as it reasonably can. For
> +zImage or old bzImage kernels, which need data written into the
> +0x90000 segment, the boot loader should make sure not to use memory
> +above the 0x9A000 point; too many BIOSes will break above that point.
> +
> +For a modern bzImage kernel with boot protocol version >= 2.02, a
> +memory layout like the following is suggested::
> +
> + ~ ~
> + | Protected-mode kernel |
> + 100000 +------------------------+
> + | I/O memory hole |
> + 0A0000 +------------------------+
> + | Reserved for BIOS | Leave as much as possible unused
> + ~ ~
> + | Command line | (Can also be below the X+10000 mark)
> + X+10000 +------------------------+
> + | Stack/heap | For use by the kernel real-mode code.
> + X+08000 +------------------------+
> + | Kernel setup | The kernel real-mode code.
> + | Kernel boot sector | The kernel legacy boot sector.
> + X +------------------------+
> + | Boot loader | <- Boot sector entry point 0000:7C00
> + 001000 +------------------------+
> + | Reserved for MBR/BIOS |
> + 000800 +------------------------+
> + | Typically used by MBR |
> + 000600 +------------------------+
> + | BIOS use only |
> + 000000 +------------------------+


Same here: it looks to me as though you mistakenly replaced some tabs
with spaces.

> +
> +... where the address X is as low as the design of the boot loader
> +permits.

That seems to be the legend of the artwork. I would indent it so
that it is shown inside the artwork.

> +
> +
> +THE REAL-MODE KERNEL HEADER
> +===========================
> +
> +In the following text, and anywhere in the kernel boot sequence, "a
> +sector" refers to 512 bytes. It is independent of the actual sector
> +size of the underlying medium.
> +
> +The first step in loading a Linux kernel should be to load the
> +real-mode code (boot sector and setup code) and then examine the
> +following header at offset 0x01f1. The real-mode code can total up to
> +32K, although the boot loader may choose to load only the first two
> +sectors (1K) and then examine the bootup sector size.
> +
> +The header looks like::
> +
> + Offset Proto Name Meaning
> + /Size
> +
> + 01F1/1 ALL(1 setup_sects The size of the setup in sectors
> + 01F2/2 ALL root_flags If set, the root is mounted readonly
> + 01F4/4 2.04+(2 syssize The size of the 32-bit code in 16-byte paras
> + 01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only
> + 01FA/2 ALL vid_mode Video mode control
> + 01FC/2 ALL root_dev Default root device number
> + 01FE/2 ALL boot_flag 0xAA55 magic number
> + 0200/2 2.00+ jump Jump instruction
> + 0202/4 2.00+ header Magic signature "HdrS"
> + 0206/2 2.00+ version Boot protocol version supported
> + 0208/4 2.00+ realmode_swtch Boot loader hook (see below)
> + 020C/2 2.00+ start_sys_seg The load-low segment (0x1000) (obsolete)
> + 020E/2 2.00+ kernel_version Pointer to kernel version string
> + 0210/1 2.00+ type_of_loader Boot loader identifier
> + 0211/1 2.00+ loadflags Boot protocol option flags
> + 0212/2 2.00+ setup_move_size Move to high memory size (used with hooks)
> + 0214/4 2.00+ code32_start Boot loader hook (see below)
> + 0218/4 2.00+ ramdisk_image initrd load address (set by boot loader)
> + 021C/4 2.00+ ramdisk_size initrd size (set by boot loader)
> + 0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only
> + 0224/2 2.01+ heap_end_ptr Free memory after setup end
> + 0226/1 2.02+(3 ext_loader_ver Extended boot loader version
> + 0227/1 2.02+(3 ext_loader_type Extended boot loader ID
> + 0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line
> + 022C/4 2.03+ initrd_addr_max Highest legal initrd address
> + 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
> + 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
> + 0235/1 2.10+ min_alignment Minimum alignment, as a power of two
> + 0236/2 2.12+ xloadflags Boot protocol option flags
> + 0238/4 2.06+ cmdline_size Maximum size of the kernel command line
> + 023C/4 2.07+ hardware_subarch Hardware subarchitecture
> + 0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
> + 0248/4 2.08+ payload_offset Offset of kernel payload
> + 024C/4 2.08+ payload_length Length of kernel payload
> + 0250/8 2.09+ setup_data 64-bit physical pointer to linked list
> + of struct setup_data
> + 0258/8 2.10+ pref_address Preferred loading address
> + 0260/4 2.10+ init_size Linear memory required during initialization
> + 0264/4 2.11+ handover_offset Offset of handover entry point

This is a table. Please use table markup and fix the wrong indentation
there, as that makes it a lot easier to read in HTML, EPUB and PDF formats.

E.g. something like:

====== ======== ===================== ========================================
Offset Proto    Name                  Meaning
/Size

01F1/1 ALL(1)   setup_sects           The size of the setup in sectors
01F2/2 ALL      root_flags            If set, the root is mounted readonly
01F4/4 2.04+(2) syssize               The size of the 32-bit code in 16-byte
                                      paras
01F8/2 ALL      ram_size              DO NOT USE - for bootsect.S use only
01FA/2 ALL      vid_mode              Video mode control
01FC/2 ALL      root_dev              Default root device number
01FE/2 ALL      boot_flag             0xAA55 magic number
0200/2 2.00+    jump                  Jump instruction
0202/4 2.00+    header                Magic signature "HdrS"
0206/2 2.00+    version               Boot protocol version supported
0208/4 2.00+    realmode_swtch        Boot loader hook (see below)
020C/2 2.00+    start_sys_seg         The load-low segment (0x1000) (obsolete)
020E/2 2.00+    kernel_version        Pointer to kernel version string
0210/1 2.00+    type_of_loader        Boot loader identifier
0211/1 2.00+    loadflags             Boot protocol option flags
0212/2 2.00+    setup_move_size       Move to high memory size
                                      (used with hooks)
0214/4 2.00+    code32_start          Boot loader hook (see below)
0218/4 2.00+    ramdisk_image         initrd load address (set by boot loader)
021C/4 2.00+    ramdisk_size          initrd size (set by boot loader)
0220/4 2.00+    bootsect_kludge       DO NOT USE - for bootsect.S use only
0224/2 2.01+    heap_end_ptr          Free memory after setup end
0226/1 2.02+(3) ext_loader_ver        Extended boot loader version
0227/1 2.02+(3) ext_loader_type       Extended boot loader ID
0228/4 2.02+    cmd_line_ptr          32-bit pointer to the kernel command line
022C/4 2.03+    initrd_addr_max       Highest legal initrd address
0230/4 2.05+    kernel_alignment      Physical addr alignment required for
                                      kernel
0234/1 2.05+    relocatable_kernel    Whether kernel is relocatable or not
0235/1 2.10+    min_alignment         Minimum alignment, as a power of two
0236/2 2.12+    xloadflags            Boot protocol option flags
0238/4 2.06+    cmdline_size          Maximum size of the kernel command line
023C/4 2.07+    hardware_subarch      Hardware subarchitecture
0240/8 2.07+    hardware_subarch_data Subarchitecture-specific data
0248/4 2.08+    payload_offset        Offset of kernel payload
024C/4 2.08+    payload_length        Length of kernel payload
0250/8 2.09+    setup_data            64-bit physical pointer to linked list
                                      of struct setup_data
0258/8 2.10+    pref_address          Preferred loading address
0260/4 2.10+    init_size             Linear memory required during
                                      initialization
0264/4 2.11+    handover_offset       Offset of handover entry point
====== ======== ===================== ========================================


> +
> +(1) For backwards compatibility, if the setup_sects field contains 0, the
> + real value is 4.
> +
> +(2) For boot protocol prior to 2.04, the upper two bytes of the syssize
> + field are unusable, which means the size of a bzImage kernel
> + cannot be determined.
> +
> +(3) Ignored, but safe to set, for boot protocols 2.02-2.09.

Btw, (1), (2) and (3) here look like footnotes. Perhaps you could use
reST footnote markup, if that is OK with the x86 maintainers.

> +
> +If the "HdrS" (0x53726448) magic number is not found at offset 0x202,
> +the boot protocol version is "old". Loading an old kernel, the
> +following parameters should be assumed::
> +
> + Image type = zImage
> + initrd not supported
> + Real-mode kernel must be located at 0x90000.
> +
> +Otherwise, the "version" field contains the protocol version,
> +e.g. protocol version 2.01 will contain 0x0201 in this field. When
> +setting fields in the header, you must make sure only to set fields
> +supported by the protocol version in use.
> +
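To make the check described above concrete, here is a minimal Python sketch of how a loader-side tool might test for the "HdrS" magic at offset 0x202 and read the raw "version" field at 0x206. The function name `read_protocol_version` and the synthetic buffer are assumptions for illustration; this is not how any particular boot loader implements it.

```python
import struct

HDRS_MAGIC = 0x53726448  # "HdrS" read as a little-endian 32-bit value


def read_protocol_version(image: bytes):
    """Return the raw 'version' field, or None if the HdrS magic is absent.

    None stands for the "old" protocol case (or a buffer too short to
    contain the header fields at all).
    """
    if len(image) < 0x208:
        return None
    (magic,) = struct.unpack_from("<I", image, 0x202)
    if magic != HDRS_MAGIC:
        return None
    (version,) = struct.unpack_from("<H", image, 0x206)
    return version


# Synthetic header for demonstration, not a real kernel image.
fake = bytearray(0x208)
fake[0x202:0x206] = b"HdrS"
struct.pack_into("<H", fake, 0x206, 0x0201)
print(hex(read_protocol_version(bytes(fake))))
```

All fields are read as little-endian, matching the byte order stated later in the document.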
> +
> +DETAILS OF HEADER FIELDS
> +========================
> +
> +For each field, some are information from the kernel to the bootloader
> +("read"), some are expected to be filled out by the bootloader
> +("write"), and some are expected to be read and modified by the
> +bootloader ("modify").
> +
> +All general purpose boot loaders should write the fields marked
> +(obligatory). Boot loaders who want to load the kernel at a
> +nonstandard address should fill in the fields marked (reloc); other
> +boot loaders can ignore those fields.
> +
> +The byte order of all fields is littleendian (this is x86, after all.)
> +::
> +
> + Field name: setup_sects
> + Type: read
> + Offset/size: 0x1f1/1
> + Protocol: ALL

Marking this as a literal block looks plain wrong to me. I suspect that
you could use this syntax instead:

:Field name: setup_sects
:Type: read
:Offset/size: 0x1f1/1
:Protocol: ALL

Or:

Field name: setup_sects
-----------------------

Type:
read
Offset/size:
0x1f1/1
Protocol:
ALL

Or (my favorite):

Field name: setup_sects
-----------------------

:Type: read
:Offset/size: 0x1f1/1
:Protocol: ALL

It is more compact in text and will produce much better
HTML/PDF output. It will also (IMHO) make it a lot easier for
people to read as text and to search for a specific field.

Of course, whatever we do here should be applied to all similar
structs inside this file.

> +
> +The size of the setup code in 512-byte sectors. If this field is
> +0, the real value is 4. The real-mode code consists of the boot
> +sector (always one 512-byte sector) plus the setup code.
> +::
> +
> + Field name: root_flags
> + Type: modify (optional)
> + Offset/size: 0x1f2/2
> + Protocol: ALL
> +
> +If this field is nonzero, the root defaults to readonly. The use of
> +this field is deprecated; use the "ro" or "rw" options on the
> +command line instead.
> +::
> +
> + Field name: syssize
> + Type: read
> + Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
> + Protocol: 2.04+
> +
> +The size of the protected-mode code in units of 16-byte paragraphs.
> +For protocol versions older than 2.04 this field is only two bytes
> +wide, and therefore cannot be trusted for the size of a kernel if
> +the LOAD_HIGH flag is set.
> +::
> +
> + Field name: ram_size
> + Type: kernel internal
> + Offset/size: 0x1f8/2
> + Protocol: ALL
> +
> +This field is obsolete.
> +::
> +
> + Field name: vid_mode
> + Type: modify (obligatory)
> + Offset/size: 0x1fa/2
> +
> +Please see the section on SPECIAL COMMAND LINE OPTIONS.
> +::
> +
> + Field name: root_dev
> + Type: modify (optional)
> + Offset/size: 0x1fc/2
> + Protocol: ALL
> +
> +The default root device device number. The use of this field is
> +deprecated, use the "root=" option on the command line instead.
> +::
> +
> + Field name: boot_flag
> + Type: read
> + Offset/size: 0x1fe/2
> + Protocol: ALL
> +
> +Contains 0xAA55. This is the closest thing old Linux kernels have
> +to a magic number.
> +::
> +
> + Field name: jump
> + Type: read
> + Offset/size: 0x200/2
> + Protocol: 2.00+
> +
> +Contains an x86 jump instruction, 0xEB followed by a signed offset
> +relative to byte 0x202. This can be used to determine the size of
> +the header.
> +::
> +
> + Field name: header
> + Type: read
> + Offset/size: 0x202/4
> + Protocol: 2.00+
> +
> +Contains the magic number "HdrS" (0x53726448).
> +::
> +
> + Field name: version
> + Type: read
> + Offset/size: 0x206/2
> + Protocol: 2.00+
> +
> +Contains the boot protocol version, in (major << 8)+minor format,
> +e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
> +10.17.
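
The (major << 8)+minor encoding above can be sketched in a few lines of Python. The helper names `split_version` and `format_version` are hypothetical and only illustrate the arithmetic.

```python
def split_version(raw: int):
    """Split a raw 16-bit 'version' field into (major, minor)."""
    return raw >> 8, raw & 0xFF


def format_version(raw: int) -> str:
    """Render the field the way the protocol versions are written, e.g. 2.04."""
    major, minor = split_version(raw)
    return f"{major}.{minor:02d}"


for raw in (0x0204, 0x0A11):
    print(hex(raw), "->", format_version(raw))
```

The two inputs are the examples given in the text: 0x0204 for version 2.04 and 0x0a11 for the hypothetical version 10.17.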
> +::
> +
> + Field name: realmode_swtch
> + Type: modify (optional)
> + Offset/size: 0x208/4
> + Protocol: 2.00+
> +
> +Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> +::
> +
> + Field name: start_sys_seg
> + Type: read
> + Offset/size: 0x20c/2
> + Protocol: 2.00+
> +
> +The load low segment (0x1000). Obsolete.
> +::
> +
> + Field name: kernel_version
> + Type: read
> + Offset/size: 0x20e/2
> + Protocol: 2.00+
> +
> +If set to a nonzero value, contains a pointer to a NUL-terminated
> +human-readable kernel version number string, less 0x200. This can
> +be used to display the kernel version to the user. This value
> +should be less than (0x200*setup_sects).
> +
> +For example, if this value is set to 0x1c00, the kernel version
> +number string can be found at offset 0x1e00 in the kernel file.
> +This is a valid value if and only if the "setup_sects" field
> +contains the value 15 or higher, as::
> +
> + 0x1c00 < 15*0x200 (= 0x1e00) but
> + 0x1c00 >= 14*0x200 (= 0x1c00)
> +
> + 0x1c00 >> 9 = 14, so the minimum value for setup_secs is 15.
> +
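The kernel_version arithmetic above can be sketched as follows. This is a minimal Python illustration; the function name `kernel_version_file_offset` is an assumption, and the "0 means 4" handling of setup_sects is taken from the setup_sects description earlier in the file.

```python
def kernel_version_file_offset(kernel_version: int, setup_sects: int):
    """Return the file offset of the version string, or None if unusable.

    The field stores the pointer "less 0x200", and is only valid when it
    is nonzero and below 0x200 * setup_sects.
    """
    if kernel_version == 0:
        return None
    if setup_sects == 0:
        setup_sects = 4  # backwards-compatibility rule for setup_sects
    if kernel_version >= 0x200 * setup_sects:
        return None
    return kernel_version + 0x200


print(kernel_version_file_offset(0x1C00, 15))
print(kernel_version_file_offset(0x1C00, 14))
```

With setup_sects = 15 the string is found at 0x1e00; with 14 the value is rejected, which matches the worked example in the text.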
> +::
> +
> + Field name: type_of_loader
> + Type: write (obligatory)
> + Offset/size: 0x210/1
> + Protocol: 2.00+
> +
> +If your boot loader has an assigned id (see table below), enter
> +0xTV here, where T is an identifier for the boot loader and V is
> +a version number. Otherwise, enter 0xFF here.
> +
> +For boot loader IDs above T = 0xD, write T = 0xE to this field and
> +write the extended ID minus 0x10 to the ext_loader_type field.
> +Similarly, the ext_loader_ver field can be used to provide more than
> +four bits for the bootloader version.
> +
> +For example, for T = 0x15, V = 0x234, write::
> +
> + type_of_loader <- 0xE4
> + ext_loader_type <- 0x05
> + ext_loader_ver <- 0x23
> +
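The T/V split described above can be written down as a short Python sketch. The function name `encode_loader_id` and the range checks are assumptions made for illustration; only the encoding rule itself comes from the text.

```python
def encode_loader_id(loader_type: int, loader_version: int):
    """Return (type_of_loader, ext_loader_type, ext_loader_ver).

    IDs up to 0xD go directly in the high nibble of type_of_loader;
    IDs from 0x10 upwards use T = 0xE plus ext_loader_type = ID - 0x10.
    The low 4 version bits go in type_of_loader, the rest in ext_loader_ver.
    """
    if loader_type in (0xE, 0xF) or not 0 <= loader_type <= 0x10F:
        raise ValueError("loader ID is reserved or out of range")
    if not 0 <= loader_version <= 0xFFF:
        raise ValueError("version does not fit in 12 bits")

    if loader_type <= 0xD:
        type_nibble, ext_type = loader_type, 0
    else:
        type_nibble, ext_type = 0xE, loader_type - 0x10

    type_of_loader = (type_nibble << 4) | (loader_version & 0xF)
    ext_loader_ver = loader_version >> 4
    return type_of_loader, ext_type, ext_loader_ver


print([hex(x) for x in encode_loader_id(0x15, 0x234)])
```

For T = 0x15, V = 0x234 this yields 0xE4, 0x05 and 0x23, the same three values as in the example.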
> +Assigned boot loader ids (hexadecimal)::
> +
> + 0 LILO (0x00 reserved for pre-2.00 bootloader)
> + 1 Loadlin
> + 2 bootsect-loader (0x20, all other values reserved)
> + 3 Syslinux
> + 4 Etherboot/gPXE/iPXE
> + 5 ELILO
> + 7 GRUB
> + 8 U-Boot
> + 9 Xen
> + A Gujin
> + B Qemu
> + C Arcturus Networks uCbootloader
> + D kexec-tools
> + E Extended (see ext_loader_type)
> + F Special (0xFF = undefined)
> + 10 Reserved
> + 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de>
> + 12 OVMF UEFI virtualization stack

Clearly there's something wrong with the last 3 lines, as they don't
follow the expected indentation.

Anyway, IMO the best option would be to use a table instead:

== =======================================
0  LILO
   (0x00 reserved for pre-2.00 bootloader)
1  Loadlin
2  bootsect-loader
   (0x20, all other values reserved)
3  Syslinux
4  Etherboot/gPXE/iPXE
5  ELILO
7  GRUB
8  U-Boot
9  Xen
A  Gujin
B  Qemu
C  Arcturus Networks uCbootloader
D  kexec-tools
E  Extended
   (see ext_loader_type)
F  Special
   (0xFF = undefined)
10 Reserved
11 Minimal Linux Bootloader
   <http://sebastian-plotz.blogspot.de>
12 OVMF UEFI virtualization stack
== =======================================



> +
> +Please contact <[email protected]> if you need a bootloader ID value assigned.
> +::
> +
> + Field name: loadflags
> + Type: modify (obligatory)
> + Offset/size: 0x211/1
> + Protocol: 2.00+
> +
> +This field is a bitmask.
> +::
> +
> + Bit 0 (read): LOADED_HIGH
> + - If 0, the protected-mode code is loaded at 0x10000.
> + - If 1, the protected-mode code is loaded at 0x100000.
> +
> + Bit 1 (kernel internal): KASLR_FLAG
> + - Used internally by the compressed kernel to communicate
> + KASLR status to kernel proper.
> + If 1, KASLR enabled.
> + If 0, KASLR disabled.

You need to either add blank lines or add a "- " before the
two "If" lines above.

> +
> + Bit 5 (write): QUIET_FLAG
> + - If 0, print early messages.
> + - If 1, suppress early messages.
> + This requests to the kernel (decompressor and early
> + kernel) to not write early messages that require
> + accessing the display hardware directly.
> +
> + Bit 6 (write): KEEP_SEGMENTS
> + Protocol: 2.07+
> + - If 0, reload the segment registers in the 32bit entry point.
> + - If 1, do not reload the segment registers in the 32bit entry point.
> + Assume that %cs %ds %ss %es are all set to flat segments with
> + a base of 0 (or the equivalent for their environment).
> +
> + Bit 7 (write): CAN_USE_HEAP
> + Set this bit to 1 to indicate that the value entered in the
> + heap_end_ptr is valid. If this field is clear, some setup code
> + functionality will be disabled.
> +
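For readers who want to see the loadflags bitmask spelled out, here is a minimal Python sketch that maps a raw byte to the flag names listed above. The table `LOADFLAGS` and the function `decode_loadflags` are hypothetical names used only for this illustration.

```python
# Bit positions as listed in the loadflags description above.
LOADFLAGS = {
    0: "LOADED_HIGH",
    1: "KASLR_FLAG",
    5: "QUIET_FLAG",
    6: "KEEP_SEGMENTS",
    7: "CAN_USE_HEAP",
}


def decode_loadflags(value: int):
    """Return the names of the flags set in a raw loadflags byte."""
    return [name for bit, name in LOADFLAGS.items() if value & (1 << bit)]


# Example: protected-mode code loaded high and heap_end_ptr valid.
print(decode_loadflags(0x81))
```

Bits 2 to 4 are not described in the quoted text, so the sketch simply ignores them.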
> +::
> +
> + Field name: setup_move_size
> + Type: modify (obligatory)
> + Offset/size: 0x212/2
> + Protocol: 2.00-2.01
> +
> +When using protocol 2.00 or 2.01, if the real mode kernel is not
> +loaded at 0x90000, it gets moved there later in the loading
> +sequence. Fill in this field if you want additional data (such as
> +the kernel command line) moved in addition to the real-mode kernel
> +itself.
> +
> +The unit is bytes starting with the beginning of the boot sector.
> +
> +This field can be ignored when the protocol is 2.02 or higher, or
> +if the real-mode code is loaded at 0x90000.
> +::
> +
> + Field name: code32_start
> + Type: modify (optional, reloc)
> + Offset/size: 0x214/4
> + Protocol: 2.00+
> +
> +The address to jump to in protected mode. This defaults to the load
> +address of the kernel, and can be used by the boot loader to
> +determine the proper load address.
> +
> +This field can be modified for two purposes:
> +
> + 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> +
> + 2. if a bootloader which does not install a hook loads a
> + relocatable kernel at a nonstandard address it will have to modify
> + this field to point to the load address.
> +
> +::
> +
> + Field name: ramdisk_image
> + Type: write (obligatory)
> + Offset/size: 0x218/4
> + Protocol: 2.00+
> +
> +The 32-bit linear address of the initial ramdisk or ramfs. Leave at
> +zero if there is no initial ramdisk/ramfs.
> +::
> +
> + Field name: ramdisk_size
> + Type: write (obligatory)
> + Offset/size: 0x21c/4
> + Protocol: 2.00+
> +
> +Size of the initial ramdisk or ramfs. Leave at zero if there is no
> +initial ramdisk/ramfs.
> +::
> +
> + Field name: bootsect_kludge
> + Type: kernel internal
> + Offset/size: 0x220/4
> + Protocol: 2.00+
> +
> +This field is obsolete.
> +::
> +
> + Field name: heap_end_ptr
> + Type: write (obligatory)
> + Offset/size: 0x224/2
> + Protocol: 2.01+
> +
> +Set this field to the offset (from the beginning of the real-mode
> +code) of the end of the setup stack/heap, minus 0x0200.
> +::
> +
> + Field name: ext_loader_ver
> + Type: write (optional)
> + Offset/size: 0x226/1
> + Protocol: 2.02+
> +
> +This field is used as an extension of the version number in the
> +type_of_loader field. The total version number is considered to be
> +(type_of_loader & 0x0f) + (ext_loader_ver << 4).
> +
> +The use of this field is boot loader specific. If not written, it
> +is zero.
> +
> +Kernels prior to 2.6.31 did not recognize this field, but it is safe
> +to write for protocol version 2.02 or higher.
> +::
> +
> + Field name: ext_loader_type
> + Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0)
> + Offset/size: 0x227/1
> + Protocol: 2.02+
> +
> +This field is used as an extension of the type number in
> +type_of_loader field. If the type in type_of_loader is 0xE, then
> +the actual type is (ext_loader_type + 0x10).
> +
> +This field is ignored if the type in type_of_loader is not 0xE.
> +
> +Kernels prior to 2.6.31 did not recognize this field, but it is safe
> +to write for protocol version 2.02 or higher.
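
For illustration, the two extension fields combine with type_of_loader as sketched below. This assumes the type lives in the high nibble and the version in the low nibble of type_of_loader, as the 0xf0/0x0f masks in the text imply; the function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: effective boot loader type ID. */
static unsigned int loader_type(uint8_t type_of_loader, uint8_t ext_loader_type)
{
    unsigned int type = type_of_loader >> 4;

    /* 0xE means "Extended": the real ID is ext_loader_type + 0x10. */
    if (type == 0xE)
        return (unsigned int)ext_loader_type + 0x10;
    return type;
}

/* Hypothetical helper: total boot loader version number. */
static unsigned int loader_version(uint8_t type_of_loader, uint8_t ext_loader_ver)
{
    return (type_of_loader & 0x0fu) + ((unsigned int)ext_loader_ver << 4);
}
```

With type_of_loader = 0xE1 and ext_loader_type = 0x01 this yields type 0x11, which is the "Minimal Linux Bootloader" entry in the ID table.
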
> +::
> +
> + Field name: cmd_line_ptr
> + Type: write (obligatory)
> + Offset/size: 0x228/4
> + Protocol: 2.02+
> +
> +Set this field to the linear address of the kernel command line.
> +The kernel command line can be located anywhere between the end of
> +the setup heap and 0xA0000; it does not have to be located in the
> +same 64K segment as the real-mode code itself.
> +
> +Fill in this field even if your boot loader does not support a
> +command line, in which case you can point this to an empty string
> +(or better yet, to the string "auto".) If this field is left at
> +zero, the kernel will assume that your boot loader does not support
> +the 2.02+ protocol.
> +::
> +
> + Field name: initrd_addr_max
> + Type: read
> + Offset/size: 0x22c/4
> + Protocol: 2.03+
> +
> +The maximum address that may be occupied by the initial
> +ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
> +field is not present, and the maximum address is 0x37FFFFFF. (This
> +address is defined as the address of the highest safe byte, so if
> +your ramdisk is exactly 131072 bytes long and this field is
> +0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
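
The arithmetic in the parenthetical can be written out as below. This is only a sketch: the function name is hypothetical, and a real loader would probably also round the result down to whatever alignment it wants, which the text does not specify.

```c
#include <stdint.h>

/*
 * Hypothetical helper: highest start address at which a ramdisk of
 * `size` bytes still ends at or below initrd_addr_max (the address of
 * the highest safe byte). Returns 0 if the ramdisk is empty or does
 * not fit at all.
 */
static uint32_t initrd_highest_start(uint32_t initrd_addr_max, uint32_t size)
{
    if (size == 0 || size - 1 > initrd_addr_max)
        return 0;
    return initrd_addr_max - (size - 1);
}
```
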
> +::
> +
> + Field name: kernel_alignment
> + Type: read/modify (reloc)
> + Offset/size: 0x230/4
> + Protocol: 2.05+ (read), 2.10+ (modify)
> +
> +Alignment unit required by the kernel (if relocatable_kernel is
> +true.) A relocatable kernel that is loaded at an alignment
> +incompatible with the value in this field will be realigned during
> +kernel initialization.
> +
> +Starting with protocol version 2.10, this reflects the kernel
> +alignment preferred for optimal performance; it is possible for the
> +loader to modify this field to permit a lesser alignment. See the
> +min_alignment and pref_address fields below.
> +::
> +
> + Field name: relocatable_kernel
> + Type: read (reloc)
> + Offset/size: 0x234/1
> + Protocol: 2.05+
> +
> +If this field is nonzero, the protected-mode part of the kernel can
> +be loaded at any address that satisfies the kernel_alignment field.
> +After loading, the boot loader must set the code32_start field to
> +point to the loaded code, or to a boot loader hook.
> +::
> +
> + Field name: min_alignment
> + Type: read (reloc)
> + Offset/size: 0x235/1
> + Protocol: 2.10+
> +
> +This field, if nonzero, indicates as a power of two the minimum
> +alignment required, as opposed to preferred, by the kernel to boot.
> +If a boot loader makes use of this field, it should update the
> +kernel_alignment field with the alignment unit desired; typically::
> +
> + kernel_alignment = 1 << min_alignment
> +
> +There may be a considerable performance cost with an excessively
> +misaligned kernel. Therefore, a loader should typically try each
> +power-of-two alignment from kernel_alignment down to this alignment.
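
A possible shape for that search is sketched below. The callback type and the example callback are hypothetical stand-ins for the loader's own memory allocator, and kernel_alignment is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback: can the loader place the kernel at this alignment? */
typedef bool (*can_load_fn)(uint32_t alignment, void *ctx);

/*
 * Try each power-of-two alignment from kernel_alignment down to
 * (1 << min_alignment). Returns the alignment to store back into
 * kernel_alignment, or 0 if no candidate works. If min_alignment is 0,
 * only the preferred alignment is tried.
 */
static uint32_t choose_kernel_alignment(uint32_t kernel_alignment,
                                        uint8_t min_alignment,
                                        can_load_fn can_load, void *ctx)
{
    uint32_t lowest = (min_alignment != 0 && min_alignment < 32)
                    ? (uint32_t)1 << min_alignment
                    : kernel_alignment;

    for (uint32_t align = kernel_alignment;
         align != 0 && align >= lowest; align >>= 1) {
        if (can_load(align, ctx))
            return align;
    }
    return 0;
}

/* Example callback for illustration: accept alignments up to *ctx. */
static bool fits_at_most(uint32_t alignment, void *ctx)
{
    return alignment <= *(uint32_t *)ctx;
}
```

Starting from the preferred value keeps the performance cost as small as the available memory allows.
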
> +::
> +
> + Field name: xloadflags
> + Type: read
> + Offset/size: 0x236/2
> + Protocol: 2.12+
> +
> +This field is a bitmask.
> +::
> +
> + Bit 0 (read): XLF_KERNEL_64
> + - If 1, this kernel has the legacy 64-bit entry point at 0x200.
> +
> + Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G
> + - If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.

Please indent it the same way as Bit 0.

> +
> + Bit 2 (read): XLF_EFI_HANDOVER_32
> + - If 1, the kernel supports the 32-bit EFI handoff entry point
> + given at handover_offset.
> +
> + Bit 3 (read): XLF_EFI_HANDOVER_64
> + - If 1, the kernel supports the 64-bit EFI handoff entry point
> + given at handover_offset + 0x200.
> +
> + Bit 4 (read): XLF_EFI_KEXEC
> + - If 1, the kernel supports kexec EFI boot with EFI runtime support.
> +
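
To make the two handover bits concrete, here is a small sketch that picks the EFI handover entry offset from xloadflags. The macro names follow the text; the helper and its -1 "not supported" return value are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define XLF_KERNEL_64              (1u << 0)
#define XLF_CAN_BE_LOADED_ABOVE_4G (1u << 1)
#define XLF_EFI_HANDOVER_32        (1u << 2)
#define XLF_EFI_HANDOVER_64        (1u << 3)
#define XLF_EFI_KEXEC              (1u << 4)

/*
 * Hypothetical helper: offset of the EFI handover entry point, or -1
 * if the kernel does not advertise the requested flavour.
 */
static int64_t efi_handover_entry_offset(uint16_t xloadflags,
                                         uint32_t handover_offset,
                                         bool want_64bit)
{
    if (want_64bit) {
        if (!(xloadflags & XLF_EFI_HANDOVER_64))
            return -1;
        return (int64_t)handover_offset + 0x200;
    }
    if (!(xloadflags & XLF_EFI_HANDOVER_32))
        return -1;
    return (int64_t)handover_offset;
}
```
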
> +::
> +
> + Field name: cmdline_size
> + Type: read
> + Offset/size: 0x238/4
> + Protocol: 2.06+
> +
> +The maximum size of the command line without the terminating
> +zero. This means that the command line can contain at most
> +cmdline_size characters. With protocol version 2.05 and earlier, the
> +maximum size was 255.
> +::
> +
> + Field name: hardware_subarch
> + Type: write (optional, defaults to x86/PC)
> + Offset/size: 0x23c/4
> + Protocol: 2.07+
> +
> +In a paravirtualized environment the hardware low level architectural
> +pieces such as interrupt handling, page table handling, and
> +accessing process control registers need to be done differently.
> +
> +This field allows the bootloader to inform the kernel that we are in
> +one of those environments.
> +::
> +
> + 0x00000000 The default x86/PC environment
> + 0x00000001 lguest
> + 0x00000002 Xen
> + 0x00000003 Moorestown MID
> + 0x00000004 CE4100 TV Platform

This is already a table. Just add the markups for it, instead of using '::'

e.g.:

+ ========== ==============================
0x00000000 The default x86/PC environment
0x00000001 lguest
0x00000002 Xen
0x00000003 Moorestown MID
0x00000004 CE4100 TV Platform
+ ========== ==============================


> +
> +::
> +
> + Field name: hardware_subarch_data
> + Type: write (subarch-dependent)
> + Offset/size: 0x240/8
> + Protocol: 2.07+
> +
> +A pointer to data that is specific to the hardware subarch.
> +This field is currently unused for the default x86/PC environment;
> +do not modify it.
> +::
> +
> + Field name: payload_offset
> + Type: read
> + Offset/size: 0x248/4
> + Protocol: 2.08+
> +
> +If non-zero then this field contains the offset from the beginning
> +of the protected-mode code to the payload.
> +
> +The payload may be compressed. The format of both the compressed and
> +uncompressed data should be determined using the standard magic
> +numbers. The currently supported compression formats are gzip
> +(magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
> +(magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
> +02 21). The uncompressed payload is currently always ELF (magic
> +number 7F 45 4C 46).
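
The magic numbers listed here translate into a simple classifier. The sketch below is only an illustration; the enum and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum payload_format {
    PAYLOAD_UNKNOWN,
    PAYLOAD_GZIP,
    PAYLOAD_BZIP2,
    PAYLOAD_LZMA,
    PAYLOAD_XZ,
    PAYLOAD_LZ4,
    PAYLOAD_ELF
};

/* Hypothetical helper: classify a payload by its leading magic bytes. */
static enum payload_format identify_payload(const uint8_t *p, size_t len)
{
    if (len >= 4 && p[0] == 0x7F && p[1] == 0x45 &&
        p[2] == 0x4C && p[3] == 0x46)
        return PAYLOAD_ELF;
    if (len < 2)
        return PAYLOAD_UNKNOWN;
    if (p[0] == 0x1F && (p[1] == 0x8B || p[1] == 0x9E))
        return PAYLOAD_GZIP;
    if (p[0] == 0x42 && p[1] == 0x5A)
        return PAYLOAD_BZIP2;
    if (p[0] == 0x5D && p[1] == 0x00)
        return PAYLOAD_LZMA;
    if (p[0] == 0xFD && p[1] == 0x37)
        return PAYLOAD_XZ;
    if (p[0] == 0x02 && p[1] == 0x21)
        return PAYLOAD_LZ4;
    return PAYLOAD_UNKNOWN;
}
```
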
> +::
> +
> + Field name: payload_length
> + Type: read
> + Offset/size: 0x24c/4
> + Protocol: 2.08+
> +
> +The length of the payload.
> +::
> +
> + Field name: setup_data
> + Type: write (special)
> + Offset/size: 0x250/8
> + Protocol: 2.09+
> +
> +The 64-bit physical pointer to a NULL-terminated singly linked list of
> +struct setup_data. This is used to define a more extensible boot
> +parameters passing mechanism. The definition of struct setup_data is
> +as follows::
> +
> +  struct setup_data {
> +      u64 next;
> +      u32 type;
> +      u32 len;
> +      u8 data[0];
> +  };
> +
> +Here, next is a 64-bit physical pointer to the next node of the
> +linked list (the next field of the last node is 0); type identifies
> +the contents of data; len is the length of the data field; and data
> +holds the real payload.
> +
> +This list may be modified at a number of points during the bootup
> +process. Therefore, when modifying this list one should always make
> +sure to consider the case where the linked list already contains
> +entries.
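
Since the list may already contain entries, appending is the safe operation. A minimal sketch follows; it assumes memory is identity mapped so that a physical address can be used directly as a pointer, and the helper names are hypothetical.

```c
#include <stdint.h>

struct setup_data {
    uint64_t next;   /* physical address of the next node, 0 at the end */
    uint32_t type;
    uint32_t len;
    uint8_t  data[]; /* payload of len bytes */
};

/* Assumption: identity mapping, so physical address == pointer value. */
static struct setup_data *phys_to_ptr(uint64_t phys)
{
    return (struct setup_data *)(uintptr_t)phys;
}

/*
 * Hypothetical helper: append `node` to the list whose head pointer
 * (the setup_data header field) is *head, keeping existing entries.
 */
static void setup_data_append(uint64_t *head, struct setup_data *node)
{
    uint64_t *link = head;

    node->next = 0;
    while (*link != 0)
        link = &phys_to_ptr(*link)->next;
    *link = (uint64_t)(uintptr_t)node;
}
```

Walking to the tail rather than overwriting the head is what keeps entries added earlier in the boot process intact.
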
> +::
> +
> + Field name: pref_address
> + Type: read (reloc)
> + Offset/size: 0x258/8
> + Protocol: 2.10+
> +
> +This field, if nonzero, represents a preferred load address for the
> +kernel. A relocating bootloader should attempt to load at this
> +address if possible.
> +
> +A non-relocatable kernel will unconditionally move itself to run at
> +this address.
> +::
> +
> + Field name: init_size
> + Type: read
> + Offset/size: 0x260/4
> +
> +This field indicates the amount of linear contiguous memory starting
> +at the kernel runtime start address that the kernel needs before it
> +is capable of examining its memory map. This is not the same thing
> +as the total amount of memory the kernel needs to boot, but it can
> +be used by a relocating boot loader to help select a safe load
> +address for the kernel.
> +
> +The kernel runtime start address is determined by the following algorithm::
> +
> +  if (relocatable_kernel)
> +      runtime_start = align_up(load_address, kernel_alignment)
> +  else
> +      runtime_start = pref_address
> +
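
The pseudocode above could be made concrete as follows. align_up() is not defined in the text; the version here assumes the alignment is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to a multiple of alignment (assumed power of two). */
static uint64_t align_up(uint64_t addr, uint64_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* Kernel runtime start address, following the algorithm in the text. */
static uint64_t kernel_runtime_start(bool relocatable_kernel,
                                     uint64_t load_address,
                                     uint32_t kernel_alignment,
                                     uint64_t pref_address)
{
    if (relocatable_kernel)
        return align_up(load_address, kernel_alignment);
    return pref_address;
}
```

A relocating loader would then check that init_size bytes starting at the returned address are free before settling on a load address.
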
> +::
> +
> + Field name: handover_offset
> + Type: read
> + Offset/size: 0x264/4
> +
> +This field is the offset from the beginning of the kernel image to
> +the EFI handover protocol entry point. Boot loaders using the EFI
> +handover protocol to boot the kernel should jump to this offset.
> +
> +See EFI HANDOVER PROTOCOL below for more details.
> +
> +
> +THE IMAGE CHECKSUM
> +==================
> +
> +From boot protocol version 2.08 onwards the CRC-32 is calculated over
> +the entire file using the characteristic polynomial 0x04C11DB7 and an
> +initial remainder of 0xffffffff. The checksum is appended to the
> +file; therefore the CRC of the file up to the limit specified in the
> +syssize field of the header is always 0.
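
To see why the CRC over the whole file comes out as 0, consider the sketch below. It assumes the usual bit-reflected CRC-32 variant (0xEDB88320 is 0x04C11DB7 reflected) with no final inversion, and that the remainder is appended in little-endian order; the text only states the polynomial and the initial remainder, so treat these details as assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the polynomial 0x04C11DB7. */
#define CRC32_POLY_REFLECTED 0xEDB88320u

/* Feed len bytes into a running CRC remainder, one bit at a time. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (CRC32_POLY_REFLECTED & (0u - (crc & 1u)));
    }
    return crc;
}

/* Remainder over a buffer, starting from the initial value 0xffffffff. */
static uint32_t image_crc(const uint8_t *buf, size_t len)
{
    return crc32_update(0xffffffffu, buf, len);
}

/* Append the remainder of buf[0..len) at buf[len..len+4), little endian. */
static void crc32_append(uint8_t *buf, size_t len)
{
    uint32_t crc = image_crc(buf, len);

    for (int i = 0; i < 4; i++)
        buf[len + i] = (uint8_t)(crc >> (8 * i));
}
```

Feeding the four appended bytes back through the same update cancels the remainder one byte at a time, so image_crc() over data plus checksum returns 0.
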
> +
> +
> +THE KERNEL COMMAND LINE
> +=======================
> +
> +The kernel command line has become an important way for the boot
> +loader to communicate with the kernel. Some of its options are also
> +relevant to the boot loader itself, see "special command line options"
> +below.
> +
> +The kernel command line is a null-terminated string. The maximum
> +length can be retrieved from the field cmdline_size. Before protocol
> +version 2.06, the maximum was 255 characters. A string that is too
> +long will be automatically truncated by the kernel.
> +
> +If the boot protocol version is 2.02 or later, the address of the
> +kernel command line is given by the header field cmd_line_ptr (see
> +above.) This address can be anywhere between the end of the setup
> +heap and 0xA0000.
> +
> +If the protocol version is *not* 2.02 or higher, the kernel
> +command line is entered using the following protocol:
> +
> + - At offset 0x0020 (word), "cmd_line_magic", enter the magic
> + number 0xA33F.
> +
> + - At offset 0x0022 (word), "cmd_line_offset", enter the offset
> + of the kernel command line (relative to the start of the
> + real-mode kernel).
> +
> + - The kernel command line *must* be within the memory region
> + covered by setup_move_size, so you may need to adjust this
> + field.
> +
> +
> +MEMORY LAYOUT OF THE REAL-MODE CODE
> +===================================
> +
> +The real-mode code requires a stack/heap to be set up, as well as
> +memory allocated for the kernel command line. This needs to be done
> +in the real-mode accessible memory in the bottom megabyte.
> +
> +It should be noted that modern machines often have a sizable Extended
> +BIOS Data Area (EBDA). As a result, it is advisable to use as little
> +of the low megabyte as possible.
> +
> +Unfortunately, under the following circumstances the 0x90000 memory
> +segment has to be used:
> +
> + - When loading a zImage kernel ((loadflags & 0x01) == 0).
> + - When loading a 2.01 or earlier boot protocol kernel.
> +
> + For the 2.00 and 2.01 boot protocols, the real-mode code
> + can be loaded at another address, but it is internally
> + relocated to 0x90000. For the "old" protocol, the
> + real-mode code must be loaded at 0x90000.
> +
> +When loading at 0x90000, avoid using memory above 0x9a000.
> +
> +For boot protocol 2.02 or higher, the command line does not have to be
> +located in the same 64K segment as the real-mode setup code; it is
> +thus permitted to give the stack/heap the full 64K segment and locate
> +the command line above it.
> +
> +The kernel command line should not be located below the real-mode
> +code, nor should it be located in high memory.
> +
> +
> +SAMPLE BOOT CONFIGURATION
> +=========================
> +
> +As a sample configuration, assume the following layout of the real
> +mode segment.
> +


> +When loading below 0x90000, use the entire segment::
> +
> + 0x0000-0x7fff Real mode kernel
> + 0x8000-0xdfff Stack and heap
> + 0xe000-0xffff Kernel command line
> +
> +When loading at 0x90000 OR the protocol version is 2.01 or earlier::
> +
> + 0x0000-0x7fff Real mode kernel
> + 0x8000-0x97ff Stack and heap
> + 0x9800-0x9fff Kernel command line

Again, tables. Just do:

When loading below 0x90000, use the entire segment:

+ ============= ===================
0x0000-0x7fff Real mode kernel
0x8000-0xdfff Stack and heap
0xe000-0xffff Kernel command line
+ ============= ===================

When loading at 0x90000 OR the protocol version is 2.01 or earlier:

+ ============= ===================
0x0000-0x7fff Real mode kernel
0x8000-0x97ff Stack and heap
0x9800-0x9fff Kernel command line
+ ============= ===================



> +
> +Such a boot loader should enter the following fields in the header::
> +
> +  unsigned long base_ptr; /* base address for real-mode segment */
> +
> +  if ( setup_sects == 0 ) {
> +      setup_sects = 4;
> +  }
> +
> +  if ( protocol >= 0x0200 ) {
> +      type_of_loader = <type code>;
> +      if ( loading_initrd ) {
> +          ramdisk_image = <initrd_address>;
> +          ramdisk_size = <initrd_size>;
> +      }
> +
> +      if ( protocol >= 0x0202 && loadflags & 0x01 )
> +          heap_end = 0xe000;
> +      else
> +          heap_end = 0x9800;
> +
> +      if ( protocol >= 0x0201 ) {
> +          heap_end_ptr = heap_end - 0x200;
> +          loadflags |= 0x80; /* CAN_USE_HEAP */
> +      }
> +
> +      if ( protocol >= 0x0202 ) {
> +          cmd_line_ptr = base_ptr + heap_end;
> +          strcpy(cmd_line_ptr, cmdline);
> +      } else {
> +          cmd_line_magic = 0xA33F;
> +          cmd_line_offset = heap_end;
> +          setup_move_size = heap_end + strlen(cmdline)+1;
> +          strcpy(base_ptr+cmd_line_offset, cmdline);
> +      }
> +  } else {
> +      /* Very old kernel */
> +
> +      heap_end = 0x9800;
> +
> +      cmd_line_magic = 0xA33F;
> +      cmd_line_offset = heap_end;
> +
> +      /* A very old kernel MUST have its real-mode code
> +         loaded at 0x90000 */
> +
> +      if ( base_ptr != 0x90000 ) {
> +          /* Copy the real-mode kernel */
> +          memcpy(0x90000, base_ptr, (setup_sects+1)*512);
> +          base_ptr = 0x90000; /* Relocated */
> +      }
> +
> +      strcpy(0x90000+cmd_line_offset, cmdline);
> +
> +      /* It is recommended to clear memory up to the 32K mark */
> +      memset(0x90000 + (setup_sects+1)*512, 0,
> +             (64-(setup_sects+1))*512);
> +  }
> +
> +
> +LOADING THE REST OF THE KERNEL
> +==============================
> +
> +The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512
> +in the kernel file (again, if setup_sects == 0 the real value is 4.)
> +It should be loaded at address 0x10000 for Image/zImage kernels and
> +0x100000 for bzImage kernels.
> +
> +The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01
> +bit (LOADED_HIGH) in the loadflags field is set::
> +
> + is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01);
> + load_address = is_bzImage ? 0x100000 : 0x10000;
> +
> +Note that Image/zImage kernels can be up to 512K in size, and thus use
> +the entire 0x10000-0x90000 range of memory. This means it is pretty
> +much a requirement for these kernels to load the real-mode part at
> +0x90000. bzImage kernels allow much more flexibility.
> +
> +
> +SPECIAL COMMAND LINE OPTIONS
> +============================
> +
> +If the command line provided by the boot loader is entered by the
> +user, the user may expect the following command line options to work.
> +They should normally not be deleted from the kernel command line even
> +though not all of them are actually meaningful to the kernel. Boot
> +loader authors who need additional command line options for the boot
> +loader itself should get them registered in
> +Documentation/admin-guide/kernel-parameters.rst to make sure they will not
> +conflict with actual kernel options now or in the future.
> +
> + vga=<mode>
> + <mode> here is either an integer (in C notation, either
> + decimal, octal, or hexadecimal) or one of the strings
> + "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask"
> + (meaning 0xFFFD). This value should be entered into the
> + vid_mode field, as it is used by the kernel before the command
> + line is parsed.
> +
> + mem=<size>
> + <size> is an integer in C notation optionally followed by
> + (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
> + << 30, << 40, << 50 or << 60). This specifies the end of
> + memory to the kernel. This affects the possible placement of
> + an initrd, since an initrd should be placed near end of
> + memory. Note that this is an option to *both* the kernel and
> + the bootloader!
> +
> + initrd=<file>
> + An initrd should be loaded. The meaning of <file> is
> + obviously bootloader-dependent, and some boot loaders
> + (e.g. LILO) do not have such a command.
> +
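
Because mem= is an option to the boot loader as well, a loader has to parse the size itself. One way to do so is sketched below; the function name is hypothetical, overflow is not checked, and 0 doubles as the "no number found" result.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the <size> part of mem=<size>. The number
 * is in C notation (decimal, octal or hexadecimal) and may be followed
 * by a case-insensitive K, M, G, T, P or E suffix.
 */
static uint64_t parse_mem_size(const char *s)
{
    char *end;
    unsigned long long value = strtoull(s, &end, 0);
    unsigned int shift = 0;

    if (end == s)
        return 0; /* no digits at all */

    switch (toupper((unsigned char)*end)) {
    case 'K': shift = 10; break;
    case 'M': shift = 20; break;
    case 'G': shift = 30; break;
    case 'T': shift = 40; break;
    case 'P': shift = 50; break;
    case 'E': shift = 60; break;
    default:  break; /* no suffix: value is in bytes */
    }
    return (uint64_t)value << shift;
}
```
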
> +In addition, some boot loaders add the following options to the
> +user-specified command line:
> +
> + BOOT_IMAGE=<file>
> + The boot image which was loaded. Again, the meaning of <file>
> + is obviously bootloader-dependent.
> +
> + auto
> + The kernel was booted without explicit user intervention.
> +
> +If these options are added by the boot loader, it is highly
> +recommended that they are located *first*, before the user-specified
> +or configuration-specified command line. Otherwise, "init=/bin/sh"
> +gets confused by the "auto" option.
> +
> +
> +RUNNING THE KERNEL
> +==================
> +
> +The kernel is started by jumping to the kernel entry point, which is
> +located at *segment* offset 0x20 from the start of the real mode
> +kernel. This means that if you loaded your real-mode kernel code at
> +0x90000, the kernel entry point is 9020:0000.
> +
> +At entry, ds = es = ss should point to the start of the real-mode
> +kernel code (0x9000 if the code is loaded at 0x90000), sp should be
> +set up properly, normally pointing to the top of the heap, and
> +interrupts should be disabled. Furthermore, to guard against bugs in
> +the kernel, it is recommended that the boot loader sets fs = gs = ds =
> +es = ss.
> +
> +In our example from above, we would do::
> +
> + /* Note: in the case of the "old" kernel protocol, base_ptr must
> + be == 0x90000 at this point; see the previous sample code */
> +
> + seg = base_ptr >> 4;
> +
> + cli(); /* Enter with interrupts disabled! */
> +
> + /* Set up the real-mode kernel stack */
> + _SS = seg;
> + _SP = heap_end;
> +
> + _DS = _ES = _FS = _GS = seg;
> + jmp_far(seg+0x20, 0); /* Run the kernel */
> +
> +If your boot sector accesses a floppy drive, it is recommended to
> +switch off the floppy motor before running the kernel, since the
> +kernel boot leaves interrupts off and thus the motor will not be
> +switched off, especially if the loaded kernel has the floppy driver as
> +a demand-loaded module!
> +
> +
> +ADVANCED BOOT LOADER HOOKS
> +==========================
> +
> +If the boot loader runs in a particularly hostile environment (such as
> +LOADLIN, which runs under DOS) it may be impossible to follow the
> +standard memory location requirements. Such a boot loader may use the
> +following hooks that, if set, are invoked by the kernel at the
> +appropriate time. The use of these hooks should probably be
> +considered an absolutely last resort!
> +
> +IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
> +%edi across invocation.
> +
> + realmode_swtch:
> + A 16-bit real mode far subroutine invoked immediately before
> + entering protected mode. The default routine disables NMI, so
> + your routine should probably do so, too.
> +
> + code32_start:
> + A 32-bit flat-mode routine *jumped* to immediately after the
> + transition to protected mode, but before the kernel is
> + uncompressed. No segments, except CS, are guaranteed to be
> + set up (current kernels do, but older ones do not); you should
> + set them up to BOOT_DS (0x18) yourself.
> +
> + After completing your hook, you should jump to the address
> + that was in this field before your boot loader overwrote it
> + (relocated, if appropriate.)
> +
> +
> +32-bit BOOT PROTOCOL
> +====================
> +
> +For machines with firmware other than a legacy BIOS, such as EFI or
> +LinuxBIOS, and for kexec, the 16-bit real mode setup code in the
> +kernel, which relies on the legacy BIOS, cannot be used, so a 32-bit
> +boot protocol needs to be defined.
> +
> +In 32-bit boot protocol, the first step in loading a Linux kernel
> +should be to set up the boot parameters (struct boot_params,
> +traditionally known as "zero page"). The memory for struct boot_params
> +should be allocated and initialized to all zero. Then the setup header,
> +starting at offset 0x01f1 of the kernel image, should be loaded into
> +struct boot_params and examined. The end of the setup header can be
> +calculated as follows::
> +
> + 0x0202 + byte value at offset 0x0201
> +
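
As a sketch of that step: the code below computes the end of the setup header and copies it into a zeroed buffer standing in for struct boot_params. It assumes the header sits at the same offset (0x01f1) in boot_params as in the kernel image; the function names and the flat byte-array view are choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SETUP_HEADER_START 0x01f1u

/* End of the setup header: 0x0202 + byte value at offset 0x0201. */
static size_t setup_header_end(const uint8_t *image)
{
    return 0x0202u + image[0x0201];
}

/*
 * Hypothetical helper: zero the boot_params buffer and copy the setup
 * header into it. Returns the number of bytes copied, 0 on error.
 */
static size_t copy_setup_header(uint8_t *boot_params, size_t bp_size,
                                const uint8_t *image, size_t image_size)
{
    size_t end;

    if (image_size < 0x0202u)
        return 0;
    end = setup_header_end(image);
    if (end > image_size || end > bp_size)
        return 0;

    memset(boot_params, 0, bp_size);
    memcpy(boot_params + SETUP_HEADER_START, image + SETUP_HEADER_START,
           end - SETUP_HEADER_START);
    return end - SETUP_HEADER_START;
}
```
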
> +In addition to reading, modifying and writing the setup header of the
> +struct boot_params as in the 16-bit boot protocol, the boot loader
> +should also fill in the additional fields of the struct boot_params as
> +described in zero-page.txt.
> +
> +After setting up the struct boot_params, the boot loader can load the
> +32/64-bit kernel in the same way as that of 16-bit boot protocol.
> +
> +In 32-bit boot protocol, the kernel is started by jumping to the
> +32-bit kernel entry point, which is the start address of loaded
> +32/64-bit kernel.
> +
> +At entry, the CPU must be in 32-bit protected mode with paging
> +disabled; a GDT must be loaded with the descriptors for selectors
> +__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> +segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
> +must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> +must be __BOOT_DS; interrupts must be disabled; %esi must hold the base
> +address of the struct boot_params; %ebp, %edi and %ebx must be zero.
> +
> +64-bit BOOT PROTOCOL
> +====================
> +
> +For machines with 64-bit CPUs and a 64-bit kernel, a 64-bit bootloader
> +can be used, and that requires a 64-bit boot protocol.
> +
> +In 64-bit boot protocol, the first step in loading a Linux kernel
> +should be to set up the boot parameters (struct boot_params,
> +traditionally known as "zero page"). The memory for struct boot_params
> +can be allocated anywhere (even above 4G) and should be initialized to
> +all zero. Then the setup header, starting at offset 0x01f1 of the
> +kernel image, should be loaded into struct boot_params and examined.
> +The end of the setup header can be calculated as follows::
> +
> + 0x0202 + byte value at offset 0x0201
> +
> +In addition to reading, modifying and writing the setup header of the
> +struct boot_params as in the 16-bit boot protocol, the boot loader
> +should also fill in the additional fields of the struct boot_params as
> +described in zero-page.txt.
> +
> +After setting up the struct boot_params, the boot loader can load
> +64-bit kernel in the same way as that of 16-bit boot protocol, but
> +kernel could be loaded above 4G.
> +
> +In 64-bit boot protocol, the kernel is started by jumping to the
> +64-bit kernel entry point, which is the start address of loaded
> +64-bit kernel plus 0x200.
> +
> +At entry, the CPU must be in 64-bit mode with paging enabled.
> +The range of setup_header.init_size bytes from the start address of the
> +loaded kernel, the zero page and the command line buffer must be
> +identity mapped;
> +a GDT must be loaded with the descriptors for selectors
> +__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> +segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
> +must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> +must be __BOOT_DS; interrupts must be disabled; %rsi must hold the base
> +address of the struct boot_params.
> +
> +EFI HANDOVER PROTOCOL
> +=====================
> +
> +This protocol allows boot loaders to defer initialisation to the EFI
> +boot stub. The boot loader is required to load the kernel/initrd(s)
> +from the boot media and jump to the EFI handover protocol entry point
> +which is hdr->handover_offset bytes from the beginning of
> +startup_{32,64}.
> +
> +The function prototype for the handover entry point looks like this::
> +
> + efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
> +
> +'handle' is the EFI image handle passed to the boot loader by the EFI
> +firmware, 'table' is the EFI system table - these are the first two
> +arguments of the "handoff state" as described in section 2.3 of the
> +UEFI specification. 'bp' is the boot loader-allocated boot params.
> +
> +The boot loader *must* fill out the following fields in bp::
> +
> + - hdr.code32_start
> + - hdr.cmd_line_ptr
> + - hdr.ramdisk_image (if applicable)
> + - hdr.ramdisk_size (if applicable)
> +
> +All other fields should be zero.
> diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
> deleted file mode 100644
> index f4c2a97bfdbd..000000000000
> --- a/Documentation/x86/boot.txt
> +++ /dev/null
> @@ -1,1130 +0,0 @@
> - THE LINUX/x86 BOOT PROTOCOL
> - ---------------------------
> -
> -On the x86 platform, the Linux kernel uses a rather complicated boot
> -convention. This has evolved partially due to historical aspects, as
> -well as the desire in the early days to have the kernel itself be a
> -bootable image, the complicated PC memory model and due to changed
> -expectations in the PC industry caused by the effective demise of
> -real-mode DOS as a mainstream operating system.
> -
> -Currently, the following versions of the Linux/x86 boot protocol exist.
> -
> -Old kernels: zImage/Image support only. Some very early kernels
> - may not even support a command line.
> -
> -Protocol 2.00: (Kernel 1.3.73) Added bzImage and initrd support, as
> - well as a formalized way to communicate between the
> - boot loader and the kernel. setup.S made relocatable,
> - although the traditional setup area still assumed
> - writable.
> -
> -Protocol 2.01: (Kernel 1.3.76) Added a heap overrun warning.
> -
> -Protocol 2.02: (Kernel 2.4.0-test3-pre3) New command line protocol.
> - Lower the conventional memory ceiling. No overwrite
> - of the traditional setup area, thus making booting
> - safe for systems which use the EBDA from SMM or 32-bit
> - BIOS entry points. zImage deprecated but still
> - supported.
> -
> -Protocol 2.03: (Kernel 2.4.18-pre1) Explicitly makes the highest possible
> - initrd address available to the bootloader.
> -
> -Protocol 2.04: (Kernel 2.6.14) Extend the syssize field to four bytes.
> -
> -Protocol 2.05: (Kernel 2.6.20) Make protected mode kernel relocatable.
> - Introduce relocatable_kernel and kernel_alignment fields.
> -
> -Protocol 2.06: (Kernel 2.6.22) Added a field that contains the size of
> - the boot command line.
> -
> -Protocol 2.07: (Kernel 2.6.24) Added paravirtualised boot protocol.
> - Introduced hardware_subarch and hardware_subarch_data
> - and KEEP_SEGMENTS flag in load_flags.
> -
> -Protocol 2.08: (Kernel 2.6.26) Added crc32 checksum and ELF format
> - payload. Introduced payload_offset and payload_length
> - fields to aid in locating the payload.
> -
> -Protocol 2.09: (Kernel 2.6.26) Added a field of 64-bit physical
> - pointer to single linked list of struct setup_data.
> -
> -Protocol 2.10: (Kernel 2.6.31) Added a protocol for relaxed alignment
> - beyond the kernel_alignment added, new init_size and
> - pref_address fields. Added extended boot loader IDs.
> -
> -Protocol 2.11: (Kernel 3.6) Added a field for offset of EFI handover
> - protocol entry point.
> -
> -Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
> - to struct boot_params for loading bzImage and ramdisk
> - above 4G in 64bit.
> -
> -**** MEMORY LAYOUT
> -
> -The traditional memory map for the kernel loader, used for Image or
> -zImage kernels, typically looks like:
> -
> - | |
> -0A0000 +------------------------+
> - | Reserved for BIOS | Do not use. Reserved for BIOS EBDA.
> -09A000 +------------------------+
> - | Command line |
> - | Stack/heap | For use by the kernel real-mode code.
> -098000 +------------------------+
> - | Kernel setup | The kernel real-mode code.
> -090200 +------------------------+
> - | Kernel boot sector | The kernel legacy boot sector.
> -090000 +------------------------+
> - | Protected-mode kernel | The bulk of the kernel image.
> -010000 +------------------------+
> - | Boot loader | <- Boot sector entry point 0000:7C00
> -001000 +------------------------+
> - | Reserved for MBR/BIOS |
> -000800 +------------------------+
> - | Typically used by MBR |
> -000600 +------------------------+
> - | BIOS use only |
> -000000 +------------------------+
> -
> -
> -When using bzImage, the protected-mode kernel was relocated to
> -0x100000 ("high memory"), and the kernel real-mode block (boot sector,
> -setup, and stack/heap) was made relocatable to any address between
> -0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
> -2.01 the 0x90000+ memory range is still used internally by the kernel;
> -the 2.02 protocol resolves that problem.
> -
> -It is desirable to keep the "memory ceiling" -- the highest point in
> -low memory touched by the boot loader -- as low as possible, since
> -some newer BIOSes have begun to allocate some rather large amounts of
> -memory, called the Extended BIOS Data Area, near the top of low
> -memory. The boot loader should use the "INT 12h" BIOS call to verify
> -how much low memory is available.
> -
> -Unfortunately, if INT 12h reports that the amount of memory is too
> -low, there is usually nothing the boot loader can do but to report an
> -error to the user. The boot loader should therefore be designed to
> -take up as little space in low memory as it reasonably can. For
> -zImage or old bzImage kernels, which need data written into the
> -0x90000 segment, the boot loader should make sure not to use memory
> -above the 0x9A000 point; too many BIOSes will break above that point.
> -
> -For a modern bzImage kernel with boot protocol version >= 2.02, a
> -memory layout like the following is suggested:
> -
> - ~ ~
> - | Protected-mode kernel |
> -100000 +------------------------+
> - | I/O memory hole |
> -0A0000 +------------------------+
> - | Reserved for BIOS | Leave as much as possible unused
> - ~ ~
> - | Command line | (Can also be below the X+10000 mark)
> -X+10000 +------------------------+
> - | Stack/heap | For use by the kernel real-mode code.
> -X+08000 +------------------------+
> - | Kernel setup | The kernel real-mode code.
> - | Kernel boot sector | The kernel legacy boot sector.
> -X +------------------------+
> - | Boot loader | <- Boot sector entry point 0000:7C00
> -001000 +------------------------+
> - | Reserved for MBR/BIOS |
> -000800 +------------------------+
> - | Typically used by MBR |
> -000600 +------------------------+
> - | BIOS use only |
> -000000 +------------------------+
> -
> -... where the address X is as low as the design of the boot loader
> -permits.
> -
> -
> -**** THE REAL-MODE KERNEL HEADER
> -
> -In the following text, and anywhere in the kernel boot sequence, "a
> -sector" refers to 512 bytes. It is independent of the actual sector
> -size of the underlying medium.
> -
> -The first step in loading a Linux kernel should be to load the
> -real-mode code (boot sector and setup code) and then examine the
> -following header at offset 0x01f1. The real-mode code can total up to
> -32K, although the boot loader may choose to load only the first two
> -sectors (1K) and then examine the bootup sector size.
> -
> -The header looks like:
> -
> -Offset Proto Name Meaning
> -/Size
> -
> -01F1/1 ALL(1 setup_sects The size of the setup in sectors
> -01F2/2 ALL root_flags If set, the root is mounted readonly
> -01F4/4 2.04+(2 syssize The size of the 32-bit code in 16-byte paras
> -01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only
> -01FA/2 ALL vid_mode Video mode control
> -01FC/2 ALL root_dev Default root device number
> -01FE/2 ALL boot_flag 0xAA55 magic number
> -0200/2 2.00+ jump Jump instruction
> -0202/4 2.00+ header Magic signature "HdrS"
> -0206/2 2.00+ version Boot protocol version supported
> -0208/4 2.00+ realmode_swtch Boot loader hook (see below)
> -020C/2 2.00+ start_sys_seg The load-low segment (0x1000) (obsolete)
> -020E/2 2.00+ kernel_version Pointer to kernel version string
> -0210/1 2.00+ type_of_loader Boot loader identifier
> -0211/1 2.00+ loadflags Boot protocol option flags
> -0212/2 2.00+ setup_move_size Move to high memory size (used with hooks)
> -0214/4 2.00+ code32_start Boot loader hook (see below)
> -0218/4 2.00+ ramdisk_image initrd load address (set by boot loader)
> -021C/4 2.00+ ramdisk_size initrd size (set by boot loader)
> -0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only
> -0224/2 2.01+ heap_end_ptr Free memory after setup end
> -0226/1 2.02+(3 ext_loader_ver Extended boot loader version
> -0227/1 2.02+(3 ext_loader_type Extended boot loader ID
> -0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line
> -022C/4 2.03+ initrd_addr_max Highest legal initrd address
> -0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
> -0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
> -0235/1 2.10+ min_alignment Minimum alignment, as a power of two
> -0236/2 2.12+ xloadflags Boot protocol option flags
> -0238/4 2.06+ cmdline_size Maximum size of the kernel command line
> -023C/4 2.07+ hardware_subarch Hardware subarchitecture
> -0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
> -0248/4 2.08+ payload_offset Offset of kernel payload
> -024C/4 2.08+ payload_length Length of kernel payload
> -0250/8 2.09+ setup_data 64-bit physical pointer to linked list
> - of struct setup_data
> -0258/8 2.10+ pref_address Preferred loading address
> -0260/4 2.10+ init_size Linear memory required during initialization
> -0264/4 2.11+ handover_offset Offset of handover entry point
> -
> -(1) For backwards compatibility, if the setup_sects field contains 0, the
> - real value is 4.
> -
> -(2) For boot protocol prior to 2.04, the upper two bytes of the syssize
> - field are unusable, which means the size of a bzImage kernel
> - cannot be determined.
> -
> -(3) Ignored, but safe to set, for boot protocols 2.02-2.09.
> -
> -If the "HdrS" (0x53726448) magic number is not found at offset 0x202,
> -the boot protocol version is "old". Loading an old kernel, the
> -following parameters should be assumed:
> -
> - Image type = zImage
> - initrd not supported
> - Real-mode kernel must be located at 0x90000.
> -
> -Otherwise, the "version" field contains the protocol version,
> -e.g. protocol version 2.01 will contain 0x0201 in this field. When
> -setting fields in the header, you must make sure only to set fields
> -supported by the protocol version in use.
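The detection rules quoted above can be sketched in C roughly as follows. This is only an illustration, assuming the first two sectors (1K) of the image are already in a buffer; struct boot_probe, rd16() and probe_header() are made-up names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of probing the real-mode header (illustrative type). */
struct boot_probe {
    int      is_old;       /* 1 if "HdrS" is absent ("old" protocol) */
    uint16_t version;      /* (major << 8) + minor, 0 if old */
    uint8_t  setup_sects;  /* real value, after the 0 -> 4 rule */
    uint32_t pm_offset;    /* file offset of the protected-mode code */
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Probe the first two sectors (1K) of a kernel image. */
static struct boot_probe probe_header(const uint8_t *img, size_t len)
{
    struct boot_probe r = { 1, 0, 0, 0 };

    assert(img != NULL && len >= 1024);

    /* A setup_sects of 0 means 4, for backwards compatibility. */
    r.setup_sects = img[0x1f1] ? img[0x1f1] : 4;
    r.pm_offset = ((uint32_t)r.setup_sects + 1) * 512;

    /* "HdrS" at offset 0x202 marks protocol 2.00 or later. */
    if (img[0x202] == 'H' && img[0x203] == 'd' &&
        img[0x204] == 'r' && img[0x205] == 'S') {
        r.is_old = 0;
        r.version = rd16(img + 0x206);
    }
    return r;
}
```

A loader would then compare r.version against 0x0202, 0x0204 and so on before touching any field introduced by a later protocol.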
> -
> -
> -**** DETAILS OF HEADER FIELDS
> -
> -Some fields carry information from the kernel to the bootloader
> -("read"), some are expected to be filled out by the bootloader
> -("write"), and some are expected to be read and modified by the
> -bootloader ("modify").
> -
> -All general purpose boot loaders should write the fields marked
> -(obligatory). Boot loaders that want to load the kernel at a
> -nonstandard address should fill in the fields marked (reloc); other
> -boot loaders can ignore those fields.
> -
> -The byte order of all fields is little-endian (this is x86, after all.)
> -
> -Field name: setup_sects
> -Type: read
> -Offset/size: 0x1f1/1
> -Protocol: ALL
> -
> - The size of the setup code in 512-byte sectors. If this field is
> - 0, the real value is 4. The real-mode code consists of the boot
> - sector (always one 512-byte sector) plus the setup code.
> -
> -Field name: root_flags
> -Type: modify (optional)
> -Offset/size: 0x1f2/2
> -Protocol: ALL
> -
> - If this field is nonzero, the root defaults to readonly. The use of
> - this field is deprecated; use the "ro" or "rw" options on the
> - command line instead.
> -
> -Field name: syssize
> -Type: read
> -Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
> -Protocol: 2.04+
> -
> - The size of the protected-mode code in units of 16-byte paragraphs.
> - For protocol versions older than 2.04 this field is only two bytes
> - wide, and therefore cannot be trusted for the size of a kernel if
> - the LOADED_HIGH flag is set.
> -
> -Field name: ram_size
> -Type: kernel internal
> -Offset/size: 0x1f8/2
> -Protocol: ALL
> -
> - This field is obsolete.
> -
> -Field name: vid_mode
> -Type: modify (obligatory)
> -Offset/size: 0x1fa/2
> -Protocol: ALL
> -
> - Please see the section on SPECIAL COMMAND LINE OPTIONS.
> -
> -Field name: root_dev
> -Type: modify (optional)
> -Offset/size: 0x1fc/2
> -Protocol: ALL
> -
> - The default root device number. The use of this field is
> - deprecated; use the "root=" option on the command line instead.
> -
> -Field name: boot_flag
> -Type: read
> -Offset/size: 0x1fe/2
> -Protocol: ALL
> -
> - Contains 0xAA55. This is the closest thing old Linux kernels have
> - to a magic number.
> -
> -Field name: jump
> -Type: read
> -Offset/size: 0x200/2
> -Protocol: 2.00+
> -
> - Contains an x86 jump instruction, 0xEB followed by a signed offset
> - relative to byte 0x202. This can be used to determine the size of
> - the header.
> -
> -Field name: header
> -Type: read
> -Offset/size: 0x202/4
> -Protocol: 2.00+
> -
> - Contains the magic number "HdrS" (0x53726448).
> -
> -Field name: version
> -Type: read
> -Offset/size: 0x206/2
> -Protocol: 2.00+
> -
> - Contains the boot protocol version, in (major << 8)+minor format,
> - e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
> - 10.17.
> -
> -Field name: realmode_swtch
> -Type: modify (optional)
> -Offset/size: 0x208/4
> -Protocol: 2.00+
> -
> - Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> -
> -Field name: start_sys_seg
> -Type: read
> -Offset/size: 0x20c/2
> -Protocol: 2.00+
> -
> - The load low segment (0x1000). Obsolete.
> -
> -Field name: kernel_version
> -Type: read
> -Offset/size: 0x20e/2
> -Protocol: 2.00+
> -
> - If set to a nonzero value, contains a pointer to a NUL-terminated
> - human-readable kernel version number string, less 0x200. This can
> - be used to display the kernel version to the user. This value
> - should be less than (0x200*setup_sects).
> -
> - For example, if this value is set to 0x1c00, the kernel version
> - number string can be found at offset 0x1e00 in the kernel file.
> - This is a valid value if and only if the "setup_sects" field
> - contains the value 15 or higher, as:
> -
> - 0x1c00 < 15*0x200 (= 0x1e00) but
> - 0x1c00 >= 14*0x200 (= 0x1c00)
> -
> - 0x1c00 >> 9 = 14, so the minimum value for setup_sects is 15.
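The range check in this example could be written along these lines. It is a sketch only; version_string_offset() is a hypothetical helper, and it treats the "should be less than" rule as a hard validity test, which is an assumption on my side.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the file offset of the version string, or 0 if the
 * kernel_version field is unset or out of range. setup_sects must
 * already have had the "0 means 4" rule applied.
 */
static uint32_t version_string_offset(uint16_t kernel_version,
                                      uint8_t setup_sects)
{
    assert(setup_sects != 0);

    if (kernel_version == 0)
        return 0;                       /* no version string */
    if (kernel_version >= 0x200u * setup_sects)
        return 0;                       /* points past the setup code */

    /* The field stores the pointer less 0x200. */
    return (uint32_t)kernel_version + 0x200;
}
```

With kernel_version = 0x1c00 this yields 0x1e00 for setup_sects = 15 and rejects setup_sects = 14, matching the arithmetic above.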
> -
> -Field name: type_of_loader
> -Type: write (obligatory)
> -Offset/size: 0x210/1
> -Protocol: 2.00+
> -
> - If your boot loader has an assigned id (see table below), enter
> - 0xTV here, where T is an identifier for the boot loader and V is
> - a version number. Otherwise, enter 0xFF here.
> -
> - For boot loader IDs above T = 0xD, write T = 0xE to this field and
> - write the extended ID minus 0x10 to the ext_loader_type field.
> - Similarly, the ext_loader_ver field can be used to provide more than
> - four bits for the bootloader version.
> -
> - For example, for T = 0x15, V = 0x234, write:
> -
> - type_of_loader <- 0xE4
> - ext_loader_type <- 0x05
> - ext_loader_ver <- 0x23
> -
> - Assigned boot loader ids (hexadecimal):
> -
> - 0 LILO (0x00 reserved for pre-2.00 bootloader)
> - 1 Loadlin
> - 2 bootsect-loader (0x20, all other values reserved)
> - 3 Syslinux
> - 4 Etherboot/gPXE/iPXE
> - 5 ELILO
> - 7 GRUB
> - 8 U-Boot
> - 9 Xen
> - A Gujin
> - B Qemu
> - C Arcturus Networks uCbootloader
> - D kexec-tools
> - E Extended (see ext_loader_type)
> - F Special (0xFF = undefined)
> - 10 Reserved
> - 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de>
> - 12 OVMF UEFI virtualization stack
> -
> - Please contact <[email protected]> if you need a bootloader ID
> - value assigned.
> -
> -Field name: loadflags
> -Type: modify (obligatory)
> -Offset/size: 0x211/1
> -Protocol: 2.00+
> -
> - This field is a bitmask.
> -
> - Bit 0 (read): LOADED_HIGH
> - - If 0, the protected-mode code is loaded at 0x10000.
> - - If 1, the protected-mode code is loaded at 0x100000.
> -
> - Bit 1 (kernel internal): KASLR_FLAG
> - - Used internally by the compressed kernel to communicate
> - KASLR status to kernel proper.
> - If 1, KASLR enabled.
> - If 0, KASLR disabled.
> -
> - Bit 5 (write): QUIET_FLAG
> - - If 0, print early messages.
> - - If 1, suppress early messages.
> - This requests the kernel (decompressor and early
> - kernel) not to write early messages that require
> - accessing the display hardware directly.
> -
> - Bit 6 (write): KEEP_SEGMENTS
> - Protocol: 2.07+
> - - If 0, reload the segment registers in the 32bit entry point.
> - - If 1, do not reload the segment registers in the 32bit entry point.
> - Assume that %cs %ds %ss %es are all set to flat segments with
> - a base of 0 (or the equivalent for their environment).
> -
> - Bit 7 (write): CAN_USE_HEAP
> - Set this bit to 1 to indicate that the value entered in the
> - heap_end_ptr is valid. If this field is clear, some setup code
> - functionality will be disabled.
> -
> -Field name: setup_move_size
> -Type: modify (obligatory)
> -Offset/size: 0x212/2
> -Protocol: 2.00-2.01
> -
> - When using protocol 2.00 or 2.01, if the real mode kernel is not
> - loaded at 0x90000, it gets moved there later in the loading
> - sequence. Fill in this field if you want additional data (such as
> - the kernel command line) moved in addition to the real-mode kernel
> - itself.
> -
> - The unit is bytes starting with the beginning of the boot sector.
> -
> - This field can be ignored when the protocol is 2.02 or higher, or
> - if the real-mode code is loaded at 0x90000.
> -
> -Field name: code32_start
> -Type: modify (optional, reloc)
> -Offset/size: 0x214/4
> -Protocol: 2.00+
> -
> - The address to jump to in protected mode. This defaults to the load
> - address of the kernel, and can be used by the boot loader to
> - determine the proper load address.
> -
> - This field can be modified for two purposes:
> -
> - 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> -
> - 2. if a bootloader which does not install a hook loads a
> - relocatable kernel at a nonstandard address it will have to modify
> - this field to point to the load address.
> -
> -Field name: ramdisk_image
> -Type: write (obligatory)
> -Offset/size: 0x218/4
> -Protocol: 2.00+
> -
> - The 32-bit linear address of the initial ramdisk or ramfs. Leave at
> - zero if there is no initial ramdisk/ramfs.
> -
> -Field name: ramdisk_size
> -Type: write (obligatory)
> -Offset/size: 0x21c/4
> -Protocol: 2.00+
> -
> - Size of the initial ramdisk or ramfs. Leave at zero if there is no
> - initial ramdisk/ramfs.
> -
> -Field name: bootsect_kludge
> -Type: kernel internal
> -Offset/size: 0x220/4
> -Protocol: 2.00+
> -
> - This field is obsolete.
> -
> -Field name: heap_end_ptr
> -Type: write (obligatory)
> -Offset/size: 0x224/2
> -Protocol: 2.01+
> -
> - Set this field to the offset (from the beginning of the real-mode
> - code) of the end of the setup stack/heap, minus 0x0200.
> -
> -Field name: ext_loader_ver
> -Type: write (optional)
> -Offset/size: 0x226/1
> -Protocol: 2.02+
> -
> - This field is used as an extension of the version number in the
> - type_of_loader field. The total version number is considered to be
> - (type_of_loader & 0x0f) + (ext_loader_ver << 4).
> -
> - The use of this field is boot loader specific. If not written, it
> - is zero.
> -
> - Kernels prior to 2.6.31 did not recognize this field, but it is safe
> - to write for protocol version 2.02 or higher.
> -
> -Field name: ext_loader_type
> -Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0)
> -Offset/size: 0x227/1
> -Protocol: 2.02+
> -
> - This field is used as an extension of the type number in
> - type_of_loader field. If the type in type_of_loader is 0xE, then
> - the actual type is (ext_loader_type + 0x10).
> -
> - This field is ignored if the type in type_of_loader is not 0xE.
> -
> - Kernels prior to 2.6.31 did not recognize this field, but it is safe
> - to write for protocol version 2.02 or higher.
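To make the split between type_of_loader, ext_loader_type and ext_loader_ver concrete, here is a small sketch in C. The struct and function names are invented for illustration, and the sketch assumes a protocol 2.02+ kernel so that the two extension fields may be written.

```c
#include <assert.h>
#include <stdint.h>

/* The three header bytes involved (illustrative grouping). */
struct loader_id_fields {
    uint8_t type_of_loader;
    uint8_t ext_loader_type;
    uint8_t ext_loader_ver;
};

/* Encode a boot loader ID and version into the header fields. */
static struct loader_id_fields encode_loader_id(unsigned id, unsigned ver)
{
    struct loader_id_fields f = { 0, 0, 0 };

    /* 0xE and 0xF are not real IDs: E means "extended", F "special". */
    assert(id <= 0x10 + 0xff && id != 0xe && id != 0xf && ver <= 0xfff);

    if (id >= 0x10) {
        f.type_of_loader = (uint8_t)(0xe0 | (ver & 0x0f));
        f.ext_loader_type = (uint8_t)(id - 0x10);
    } else {
        f.type_of_loader = (uint8_t)((id << 4) | (ver & 0x0f));
    }
    f.ext_loader_ver = (uint8_t)(ver >> 4);
    return f;
}

/* Recover the full loader ID from the fields. */
static unsigned decode_loader_id(struct loader_id_fields f)
{
    unsigned t = f.type_of_loader >> 4;

    return t == 0xe ? f.ext_loader_type + 0x10u : t;
}

/* Recover the full version number from the fields. */
static unsigned decode_loader_ver(struct loader_id_fields f)
{
    return (f.type_of_loader & 0x0fu) + ((unsigned)f.ext_loader_ver << 4);
}
```

For T = 0x15, V = 0x234 this produces 0xE4, 0x05 and 0x23, the same values as the worked example in the type_of_loader description.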
> -
> -Field name: cmd_line_ptr
> -Type: write (obligatory)
> -Offset/size: 0x228/4
> -Protocol: 2.02+
> -
> - Set this field to the linear address of the kernel command line.
> - The kernel command line can be located anywhere between the end of
> - the setup heap and 0xA0000; it does not have to be located in the
> - same 64K segment as the real-mode code itself.
> -
> - Fill in this field even if your boot loader does not support a
> - command line, in which case you can point this to an empty string
> - (or better yet, to the string "auto".) If this field is left at
> - zero, the kernel will assume that your boot loader does not support
> - the 2.02+ protocol.
> -
> -Field name: initrd_addr_max
> -Type: read
> -Offset/size: 0x22c/4
> -Protocol: 2.03+
> -
> - The maximum address that may be occupied by the initial
> - ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
> - field is not present, and the maximum address is 0x37FFFFFF. (This
> - address is defined as the address of the highest safe byte, so if
> - your ramdisk is exactly 131072 bytes long and this field is
> - 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
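Because initrd_addr_max names the highest safe byte rather than a limit one past it, the start address needs an off-by-one adjustment. A minimal sketch, with highest_initrd_start() as a hypothetical helper that ignores any page-alignment a real loader may also want:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Highest address at which an initrd of the given size may start so
 * that its last byte still lies at or below initrd_addr_max.
 */
static uint32_t highest_initrd_start(uint32_t initrd_addr_max, uint32_t size)
{
    assert(size > 0 && size - 1 <= initrd_addr_max);

    return initrd_addr_max - (size - 1);
}
```

For the 131072-byte ramdisk in the text and a limit of 0x37FFFFFF, the result is 0x37FE0000.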
> -
> -Field name: kernel_alignment
> -Type: read/modify (reloc)
> -Offset/size: 0x230/4
> -Protocol: 2.05+ (read), 2.10+ (modify)
> -
> - Alignment unit required by the kernel (if relocatable_kernel is
> - true.) A relocatable kernel that is loaded at an alignment
> - incompatible with the value in this field will be realigned during
> - kernel initialization.
> -
> - Starting with protocol version 2.10, this reflects the kernel
> - alignment preferred for optimal performance; it is possible for the
> - loader to modify this field to permit a lesser alignment. See the
> - min_alignment and pref_address fields below.
> -
> -Field name: relocatable_kernel
> -Type: read (reloc)
> -Offset/size: 0x234/1
> -Protocol: 2.05+
> -
> - If this field is nonzero, the protected-mode part of the kernel can
> - be loaded at any address that satisfies the kernel_alignment field.
> - After loading, the boot loader must set the code32_start field to
> - point to the loaded code, or to a boot loader hook.
> -
> -Field name: min_alignment
> -Type: read (reloc)
> -Offset/size: 0x235/1
> -Protocol: 2.10+
> -
> - This field, if nonzero, indicates as a power of two the minimum
> - alignment required, as opposed to preferred, by the kernel to boot.
> - If a boot loader makes use of this field, it should update the
> - kernel_alignment field with the alignment unit desired; typically:
> -
> - kernel_alignment = 1 << min_alignment
> -
> - There may be a considerable performance cost with an excessively
> - misaligned kernel. Therefore, a loader should typically try each
> - power-of-two alignment from kernel_alignment down to this alignment.
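One way to read "try each power-of-two alignment from kernel_alignment down to this alignment" is the loop below. It is a sketch under assumptions: the loader has a single free region [lo, hi), min_alignment is nonzero, and pick_load_address() is an invented name. The caller would write the chosen alignment back into kernel_alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Find a load address inside [lo, hi) for an image of the given size,
 * preferring the largest alignment that fits. Returns 0 if nothing
 * fits; on success *chosen_alignment holds the alignment used.
 */
static uint64_t pick_load_address(uint64_t lo, uint64_t hi, uint64_t size,
                                  uint32_t kernel_alignment,
                                  uint8_t min_alignment,
                                  uint32_t *chosen_alignment)
{
    uint64_t align, min_align;

    assert(chosen_alignment != NULL);
    assert(min_alignment != 0 && min_alignment < 32);
    assert(kernel_alignment != 0 &&
           (kernel_alignment & (kernel_alignment - 1)) == 0);

    min_align = 1ull << min_alignment;

    for (align = kernel_alignment; align >= min_align; align >>= 1) {
        uint64_t addr = (lo + align - 1) & ~(align - 1);  /* align up */

        if (addr >= lo && addr <= hi && size <= hi - addr) {
            *chosen_alignment = (uint32_t)align;
            return addr;
        }
    }
    return 0;
}
```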
> -
> -Field name: xloadflags
> -Type: read
> -Offset/size: 0x236/2
> -Protocol: 2.12+
> -
> - This field is a bitmask.
> -
> - Bit 0 (read): XLF_KERNEL_64
> - - If 1, this kernel has the legacy 64-bit entry point at 0x200.
> -
> - Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G
> - - If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.
> -
> - Bit 2 (read): XLF_EFI_HANDOVER_32
> - - If 1, the kernel supports the 32-bit EFI handoff entry point
> - given at handover_offset.
> -
> - Bit 3 (read): XLF_EFI_HANDOVER_64
> - - If 1, the kernel supports the 64-bit EFI handoff entry point
> - given at handover_offset + 0x200.
> -
> - Bit 4 (read): XLF_EFI_KEXEC
> - - If 1, the kernel supports kexec EFI boot with EFI runtime support.
> -
> -Field name: cmdline_size
> -Type: read
> -Offset/size: 0x238/4
> -Protocol: 2.06+
> -
> - The maximum size of the command line without the terminating
> - zero. This means that the command line can contain at most
> - cmdline_size characters. With protocol version 2.05 and earlier, the
> - maximum size was 255.
> -
> -Field name: hardware_subarch
> -Type: write (optional, defaults to x86/PC)
> -Offset/size: 0x23c/4
> -Protocol: 2.07+
> -
> - In a paravirtualized environment the hardware low level architectural
> - pieces such as interrupt handling, page table handling, and
> - accessing process control registers need to be done differently.
> -
> - This field allows the bootloader to inform the kernel we are in one
> - of those environments.
> -
> - 0x00000000 The default x86/PC environment
> - 0x00000001 lguest
> - 0x00000002 Xen
> - 0x00000003 Moorestown MID
> - 0x00000004 CE4100 TV Platform
> -
> -Field name: hardware_subarch_data
> -Type: write (subarch-dependent)
> -Offset/size: 0x240/8
> -Protocol: 2.07+
> -
> - A pointer to data that is specific to the hardware subarch.
> - This field is currently unused for the default x86/PC environment;
> - do not modify.
> -
> -Field name: payload_offset
> -Type: read
> -Offset/size: 0x248/4
> -Protocol: 2.08+
> -
> - If non-zero then this field contains the offset from the beginning
> - of the protected-mode code to the payload.
> -
> - The payload may be compressed. The format of both the compressed and
> - uncompressed data should be determined using the standard magic
> - numbers. The currently supported compression formats are gzip
> - (magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
> - (magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
> - 02 21). The uncompressed payload is currently always ELF (magic
> - number 7F 45 4C 46).
> -
> -Field name: payload_length
> -Type: read
> -Offset/size: 0x24c/4
> -Protocol: 2.08+
> -
> - The length of the payload.
> -
> -Field name: setup_data
> -Type: write (special)
> -Offset/size: 0x250/8
> -Protocol: 2.09+
> -
> - The 64-bit physical pointer to a NULL-terminated singly linked list of
> - struct setup_data. This is used to define a more extensible boot
> - parameter passing mechanism. The definition of struct setup_data is
> - as follows:
> -
> - struct setup_data {
> - u64 next;
> - u32 type;
> - u32 len;
> - u8 data[0];
> - };
> -
> - Here, next is a 64-bit physical pointer to the next node of the
> - linked list (the next field of the last node is 0); type is used
> - to identify the contents of data; len is the length of the data
> - field; and data holds the real payload.
> -
> - This list may be modified at a number of points during the bootup
> - process. Therefore, when modifying this list one should always make
> - sure to consider the case where the linked list already contains
> - entries.
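The point about existing entries can be illustrated with an append routine that walks to the tail instead of overwriting the head. This sketch assumes physical addresses are directly usable as pointers (identity mapping), which is an assumption of the example rather than something the text states; phys_to_ptr() and setup_data_append() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct setup_data {
    uint64_t next;   /* physical address of the next node, 0 at the end */
    uint32_t type;
    uint32_t len;
    uint8_t  data[]; /* payload, len bytes */
};

/* Assumption: memory is identity-mapped, so phys == virtual. */
static struct setup_data *phys_to_ptr(uint64_t phys)
{
    return (struct setup_data *)(uintptr_t)phys;
}

/*
 * Append node to the list whose head pointer is *head (the setup_data
 * header field). Entries already on the list are preserved.
 */
static void setup_data_append(uint64_t *head, struct setup_data *node)
{
    assert(head != NULL && node != NULL);

    node->next = 0;
    while (*head != 0)
        head = &phys_to_ptr(*head)->next;
    *head = (uint64_t)(uintptr_t)node;
}
```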
> -
> -Field name: pref_address
> -Type: read (reloc)
> -Offset/size: 0x258/8
> -Protocol: 2.10+
> -
> - This field, if nonzero, represents a preferred load address for the
> - kernel. A relocating bootloader should attempt to load at this
> - address if possible.
> -
> - A non-relocatable kernel will unconditionally move itself to run
> - at this address.
> -
> -Field name: init_size
> -Type: read
> -Offset/size: 0x260/4
> -Protocol: 2.10+
> -
> - This field indicates the amount of linear contiguous memory starting
> - at the kernel runtime start address that the kernel needs before it
> - is capable of examining its memory map. This is not the same thing
> - as the total amount of memory the kernel needs to boot, but it can
> - be used by a relocating boot loader to help select a safe load
> - address for the kernel.
> -
> - The kernel runtime start address is determined by the following algorithm:
> -
> - if (relocatable_kernel)
> - runtime_start = align_up(load_address, kernel_alignment)
> - else
> - runtime_start = pref_address
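In C, the algorithm above plus the init_size check a relocating loader might perform could look like this. It is a sketch only; runtime_start() and init_region_fits() are illustrative names, and kernel_alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the kernel runtime start address as described above. */
static uint64_t runtime_start(int relocatable_kernel, uint64_t load_address,
                              uint32_t kernel_alignment,
                              uint64_t pref_address)
{
    uint64_t a = kernel_alignment;

    if (!relocatable_kernel)
        return pref_address;

    assert(a != 0 && (a & (a - 1)) == 0);
    return (load_address + a - 1) & ~(a - 1);   /* align_up */
}

/* Check that init_size bytes from start fit below region_end. */
static int init_region_fits(uint64_t start, uint32_t init_size,
                            uint64_t region_end)
{
    return start <= region_end && init_size <= region_end - start;
}
```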
> -
> -Field name: handover_offset
> -Type: read
> -Offset/size: 0x264/4
> -Protocol: 2.11+
> -
> - This field is the offset from the beginning of the kernel image to
> - the EFI handover protocol entry point. Boot loaders using the EFI
> - handover protocol to boot the kernel should jump to this offset.
> -
> - See EFI HANDOVER PROTOCOL below for more details.
> -
> -
> -**** THE IMAGE CHECKSUM
> -
> -From boot protocol version 2.08 onwards the CRC-32 is calculated over
> -the entire file using the characteristic polynomial 0x04C11DB7 and an
> -initial remainder of 0xffffffff. The checksum is appended to the
> -file; therefore the CRC of the file up to the limit specified in the
> -syssize field of the header is always 0.
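The "CRC is always 0" property can be checked with a few lines of C. This sketch assumes the common bit-reflected (LSB-first) form of the 0x04C11DB7 polynomial, no final inversion, and a checksum stored little-endian; those details are my assumptions and are not spelled out in the text. The function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 0xEDB88320 is 0x04C11DB7 with its bits reversed (LSB-first form). */
#define CRC32_POLY_REFLECTED 0xEDB88320u

/* Bitwise CRC-32 update; pass 0xffffffff as the initial remainder. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (CRC32_POLY_REFLECTED & (0u - (crc & 1u)));
    }
    return crc;
}

/* Returns 1 if the image, including its trailing checksum, sums to 0. */
static int image_crc_ok(const uint8_t *image, size_t len)
{
    return len >= 4 && crc32_update(0xffffffffu, image, len) == 0;
}
```

Appending the raw remainder in little-endian order is what makes the remainder over the whole file come out as zero under these assumptions.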
> -
> -
> -**** THE KERNEL COMMAND LINE
> -
> -The kernel command line has become an important way for the boot
> -loader to communicate with the kernel. Some of its options are also
> -relevant to the boot loader itself; see "special command line options"
> -below.
> -
> -The kernel command line is a null-terminated string. The maximum
> -length can be retrieved from the field cmdline_size. Before protocol
> -version 2.06, the maximum was 255 characters. A string that is too
> -long will be automatically truncated by the kernel.
> -
> -If the boot protocol version is 2.02 or later, the address of the
> -kernel command line is given by the header field cmd_line_ptr (see
> -above.) This address can be anywhere between the end of the setup
> -heap and 0xA0000.
> -
> -If the protocol version is *not* 2.02 or higher, the kernel
> -command line is entered using the following protocol:
> -
> - At offset 0x0020 (word), "cmd_line_magic", enter the magic
> - number 0xA33F.
> -
> - At offset 0x0022 (word), "cmd_line_offset", enter the offset
> - of the kernel command line (relative to the start of the
> - real-mode kernel).
> -
> - The kernel command line *must* be within the memory region
> - covered by setup_move_size, so you may need to adjust this
> - field.
> -
> -
> -**** MEMORY LAYOUT OF THE REAL-MODE CODE
> -
> -The real-mode code requires a stack/heap to be set up, as well as
> -memory allocated for the kernel command line. This needs to be done
> -in the real-mode accessible memory in the bottom megabyte.
> -
> -It should be noted that modern machines often have a sizable Extended
> -BIOS Data Area (EBDA). As a result, it is advisable to use as little
> -of the low megabyte as possible.
> -
> -Unfortunately, under the following circumstances the 0x90000 memory
> -segment has to be used:
> -
> - - When loading a zImage kernel ((loadflags & 0x01) == 0).
> - - When loading a 2.01 or earlier boot protocol kernel.
> -
> - -> For the 2.00 and 2.01 boot protocols, the real-mode code
> - can be loaded at another address, but it is internally
> - relocated to 0x90000. For the "old" protocol, the
> - real-mode code must be loaded at 0x90000.
> -
> -When loading at 0x90000, avoid using memory above 0x9a000.
> -
> -For boot protocol 2.02 or higher, the command line does not have to be
> -located in the same 64K segment as the real-mode setup code; it is
> -thus permitted to give the stack/heap the full 64K segment and locate
> -the command line above it.
> -
> -The kernel command line should not be located below the real-mode
> -code, nor should it be located in high memory.
> -
> -
> -**** SAMPLE BOOT CONFIGURATION
> -
> -As a sample configuration, assume the following layout of the real
> -mode segment:
> -
> - When loading below 0x90000, use the entire segment:
> -
> - 0x0000-0x7fff Real mode kernel
> - 0x8000-0xdfff Stack and heap
> - 0xe000-0xffff Kernel command line
> -
> - When loading at 0x90000 OR the protocol version is 2.01 or earlier:
> -
> - 0x0000-0x7fff Real mode kernel
> - 0x8000-0x97ff Stack and heap
> - 0x9800-0x9fff Kernel command line
> -
> -Such a boot loader should enter the following fields in the header:
> -
> - unsigned long base_ptr; /* base address for real-mode segment */
> -
> - if ( setup_sects == 0 ) {
> - setup_sects = 4;
> - }
> -
> - if ( protocol >= 0x0200 ) {
> - type_of_loader = <type code>;
> - if ( loading_initrd ) {
> - ramdisk_image = <initrd_address>;
> - ramdisk_size = <initrd_size>;
> - }
> -
> - if ( protocol >= 0x0202 && loadflags & 0x01 )
> - heap_end = 0xe000;
> - else
> - heap_end = 0x9800;
> -
> - if ( protocol >= 0x0201 ) {
> - heap_end_ptr = heap_end - 0x200;
> - loadflags |= 0x80; /* CAN_USE_HEAP */
> - }
> -
> - if ( protocol >= 0x0202 ) {
> - cmd_line_ptr = base_ptr + heap_end;
> - strcpy(cmd_line_ptr, cmdline);
> - } else {
> - cmd_line_magic = 0xA33F;
> - cmd_line_offset = heap_end;
> - setup_move_size = heap_end + strlen(cmdline)+1;
> - strcpy(base_ptr+cmd_line_offset, cmdline);
> - }
> - } else {
> - /* Very old kernel */
> -
> - heap_end = 0x9800;
> -
> - cmd_line_magic = 0xA33F;
> - cmd_line_offset = heap_end;
> -
> - /* A very old kernel MUST have its real-mode code
> - loaded at 0x90000 */
> -
> - if ( base_ptr != 0x90000 ) {
> - /* Copy the real-mode kernel */
> - memcpy(0x90000, base_ptr, (setup_sects+1)*512);
> - base_ptr = 0x90000; /* Relocated */
> - }
> -
> - strcpy(0x90000+cmd_line_offset, cmdline);
> -
> - /* It is recommended to clear memory up to the 32K mark */
> - memset(0x90000 + (setup_sects+1)*512, 0,
> - (64-(setup_sects+1))*512);
> - }
> -
> -
> -**** LOADING THE REST OF THE KERNEL
> -
> -The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512
> -in the kernel file (again, if setup_sects == 0 the real value is 4.)
> -It should be loaded at address 0x10000 for Image/zImage kernels and
> -0x100000 for bzImage kernels.
> -
> -The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01
> -bit (LOADED_HIGH) in the loadflags field is set:
> -
> - is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01);
> - load_address = is_bzImage ? 0x100000 : 0x10000;
> -
> -Note that Image/zImage kernels can be up to 512K in size, and thus use
> -the entire 0x10000-0x90000 range of memory. This means it is pretty
> -much a requirement for these kernels to load the real-mode part at
> -0x90000. bzImage kernels allow much more flexibility.
> -
> -
> -**** SPECIAL COMMAND LINE OPTIONS
> -
> -If the command line provided by the boot loader is entered by the
> -user, the user may expect the following command line options to work.
> -They should normally not be deleted from the kernel command line even
> -though not all of them are actually meaningful to the kernel. Boot
> -loader authors who need additional command line options for the boot
> -loader itself should get them registered in
> -Documentation/admin-guide/kernel-parameters.rst to make sure they will not
> -conflict with actual kernel options now or in the future.
> -
> - vga=<mode>
> - <mode> here is either an integer (in C notation, either
> - decimal, octal, or hexadecimal) or one of the strings
> - "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask"
> - (meaning 0xFFFD). This value should be entered into the
> - vid_mode field, as it is used by the kernel before the command
> - line is parsed.
> -
> - mem=<size>
> - <size> is an integer in C notation optionally followed by
> - (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
> - << 30, << 40, << 50 or << 60). This specifies the end of
> - memory to the kernel. This affects the possible placement of
> - an initrd, since an initrd should be placed near end of
> - memory. Note that this is an option to *both* the kernel and
> - the bootloader!
> -
> - initrd=<file>
> - An initrd should be loaded. The meaning of <file> is
> - obviously bootloader-dependent, and some boot loaders
> - (e.g. LILO) do not have such a command.
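Since mem= is meaningful to the boot loader as well, the loader has to parse the size itself. A minimal sketch of that parse, taking the text after "mem=" as input; parse_mem_size() is a hypothetical helper, it does not check for overflow, and it is not the kernel's own parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<size>" as used by mem=<size>: an integer in C notation,
 * optionally followed by K, M, G, T, P or E (case insensitive).
 * Returns 0 on success and stores the byte count in *out.
 */
static int parse_mem_size(const char *s, uint64_t *out)
{
    static const char suffixes[] = "KMGTPE";
    char *end;
    unsigned long long v;

    assert(s != NULL && out != NULL);

    v = strtoull(s, &end, 0);   /* base 0: decimal, octal or hex */
    if (end == s)
        return -1;              /* no digits at all */

    if (*end != '\0') {
        const char *p = strchr(suffixes, toupper((unsigned char)*end));

        if (p == NULL || end[1] != '\0')
            return -1;          /* unknown suffix or trailing junk */
        v <<= 10 * (unsigned)(p - suffixes + 1);
    }
    *out = v;
    return 0;
}
```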
> -
> -In addition, some boot loaders add the following options to the
> -user-specified command line:
> -
> - BOOT_IMAGE=<file>
> - The boot image which was loaded. Again, the meaning of <file>
> - is obviously bootloader-dependent.
> -
> - auto
> - The kernel was booted without explicit user intervention.
> -
> -If these options are added by the boot loader, it is highly
> -recommended that they are located *first*, before the user-specified
> -or configuration-specified command line. Otherwise, "init=/bin/sh"
> -gets confused by the "auto" option.
> -
> -
> -**** RUNNING THE KERNEL
> -
> -The kernel is started by jumping to the kernel entry point, which is
> -located at *segment* offset 0x20 from the start of the real mode
> -kernel. This means that if you loaded your real-mode kernel code at
> -0x90000, the kernel entry point is 9020:0000.
> -
> -At entry, ds = es = ss should point to the start of the real-mode
> -kernel code (0x9000 if the code is loaded at 0x90000), sp should be
> -set up properly, normally pointing to the top of the heap, and
> -interrupts should be disabled. Furthermore, to guard against bugs in
> -the kernel, it is recommended that the boot loader sets fs = gs = ds =
> -es = ss.
> -
> -In our example from above, we would do:
> -
> - /* Note: in the case of the "old" kernel protocol, base_ptr must
> - be == 0x90000 at this point; see the previous sample code */
> -
> - seg = base_ptr >> 4;
> -
> - cli(); /* Enter with interrupts disabled! */
> -
> - /* Set up the real-mode kernel stack */
> - _SS = seg;
> - _SP = heap_end;
> -
> - _DS = _ES = _FS = _GS = seg;
> - jmp_far(seg+0x20, 0); /* Run the kernel */
> -
> -If your boot sector accesses a floppy drive, it is recommended to
> -switch off the floppy motor before running the kernel, since the
> -kernel boot leaves interrupts off and thus the motor will not be
> -switched off, especially if the loaded kernel has the floppy driver as
> -a demand-loaded module!
> -
> -
> -**** ADVANCED BOOT LOADER HOOKS
> -
> -If the boot loader runs in a particularly hostile environment (such as
> -LOADLIN, which runs under DOS) it may be impossible to follow the
> -standard memory location requirements. Such a boot loader may use the
> -following hooks that, if set, are invoked by the kernel at the
> -appropriate time. The use of these hooks should probably be
> -considered an absolutely last resort!
> -
> -IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
> -%edi across invocation.
> -
> - realmode_swtch:
> - A 16-bit real mode far subroutine invoked immediately before
> - entering protected mode. The default routine disables NMI, so
> - your routine should probably do so, too.
> -
> - code32_start:
> - A 32-bit flat-mode routine *jumped* to immediately after the
> - transition to protected mode, but before the kernel is
> - uncompressed. No segments, except CS, are guaranteed to be
> - set up (current kernels do, but older ones do not); you should
> - set them up to BOOT_DS (0x18) yourself.
> -
> - After completing your hook, you should jump to the address
> - that was in this field before your boot loader overwrote it
> - (relocated, if appropriate.)
> -
> -
> -**** 32-bit BOOT PROTOCOL
> -
> -For machine with some new BIOS other than legacy BIOS, such as EFI,
> -LinuxBIOS, etc, and kexec, the 16-bit real mode setup code in kernel
> -based on legacy BIOS can not be used, so a 32-bit boot protocol needs
> -to be defined.
> -
> -In 32-bit boot protocol, the first step in loading a Linux kernel
> -should be to setup the boot parameters (struct boot_params,
> -traditionally known as "zero page"). The memory for struct boot_params
> -should be allocated and initialized to all zero. Then the setup header
> -from offset 0x01f1 of kernel image on should be loaded into struct
> -boot_params and examined. The end of setup header can be calculated as
> -follow:
> -
> - 0x0202 + byte value at offset 0x0201
> -
> -In addition to read/modify/write the setup header of the struct
> -boot_params as that of 16-bit boot protocol, the boot loader should
> -also fill the additional fields of the struct boot_params as that
> -described in zero-page.txt.
> -
> -After setting up the struct boot_params, the boot loader can load the
> -32/64-bit kernel in the same way as that of 16-bit boot protocol.
> -
> -In 32-bit boot protocol, the kernel is started by jumping to the
> -32-bit kernel entry point, which is the start address of loaded
> -32/64-bit kernel.
> -
> -At entry, the CPU must be in 32-bit protected mode with paging
> -disabled; a GDT must be loaded with the descriptors for selectors
> -__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> -segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
> -must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> -must be __BOOT_DS; interrupt must be disabled; %esi must hold the base
> -address of the struct boot_params; %ebp, %edi and %ebx must be zero.
> -
> -**** 64-bit BOOT PROTOCOL
> -
> -For machine with 64bit cpus and 64bit kernel, we could use 64bit bootloader
> -and we need a 64-bit boot protocol.
> -
> -In 64-bit boot protocol, the first step in loading a Linux kernel
> -should be to setup the boot parameters (struct boot_params,
> -traditionally known as "zero page"). The memory for struct boot_params
> -could be allocated anywhere (even above 4G) and initialized to all zero.
> -Then, the setup header at offset 0x01f1 of kernel image on should be
> -loaded into struct boot_params and examined. The end of setup header
> -can be calculated as follows:
> -
> - 0x0202 + byte value at offset 0x0201
> -
> -In addition to read/modify/write the setup header of the struct
> -boot_params as that of 16-bit boot protocol, the boot loader should
> -also fill the additional fields of the struct boot_params as described
> -in zero-page.txt.
> -
> -After setting up the struct boot_params, the boot loader can load
> -64-bit kernel in the same way as that of 16-bit boot protocol, but
> -kernel could be loaded above 4G.
> -
> -In 64-bit boot protocol, the kernel is started by jumping to the
> -64-bit kernel entry point, which is the start address of loaded
> -64-bit kernel plus 0x200.
> -
> -At entry, the CPU must be in 64-bit mode with paging enabled.
> -The range with setup_header.init_size from start address of loaded
> -kernel and zero page and command line buffer get ident mapping;
> -a GDT must be loaded with the descriptors for selectors
> -__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> -segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
> -must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> -must be __BOOT_DS; interrupt must be disabled; %rsi must hold the base
> -address of the struct boot_params.
> -
> -**** EFI HANDOVER PROTOCOL
> -
> -This protocol allows boot loaders to defer initialisation to the EFI
> -boot stub. The boot loader is required to load the kernel/initrd(s)
> -from the boot media and jump to the EFI handover protocol entry point
> -which is hdr->handover_offset bytes from the beginning of
> -startup_{32,64}.
> -
> -The function prototype for the handover entry point looks like this,
> -
> - efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
> -
> -'handle' is the EFI image handle passed to the boot loader by the EFI
> -firmware, 'table' is the EFI system table - these are the first two
> -arguments of the "handoff state" as described in section 2.3 of the
> -UEFI specification. 'bp' is the boot loader-allocated boot params.
> -
> -The boot loader *must* fill out the following fields in bp,
> -
> - o hdr.code32_start
> - o hdr.cmd_line_ptr
> - o hdr.ramdisk_image (if applicable)
> - o hdr.ramdisk_size (if applicable)
> -
> -All other fields should be zero.
> diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> index 7612d3142b2a..8f08caf4fbbb 100644
> --- a/Documentation/x86/index.rst
> +++ b/Documentation/x86/index.rst
> @@ -7,3 +7,5 @@ Linux x86 Support
> .. toctree::
> :maxdepth: 2
> :numbered:
> +
> + boot



Thanks,
Mauro

2019-04-24 17:47:26

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 39/63] Documentation: x86: convert topology.txt to reST

Em Wed, 24 Apr 2019 00:29:08 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> ---
> Documentation/x86/index.rst | 1 +
> Documentation/x86/topology.rst | 228 +++++++++++++++++++++++++++++++++
> Documentation/x86/topology.txt | 217 -------------------------------
> 3 files changed, 229 insertions(+), 217 deletions(-)
> create mode 100644 Documentation/x86/topology.rst
> delete mode 100644 Documentation/x86/topology.txt

Why? Please preserve as much as possible from the original file...
it is really hard to see what you're doing. Most of those x86
files are already almost in ReST format (like this one). There's
absolutely **no reason** to make changes so radical that the file
drops below the 50% similarity threshold git uses to recognize
it as a change to the same file!

I'll give this one a quick review, but it is really hard to be
sure that nothing is missing when the similarity is this low.

>
> diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> index 8f08caf4fbbb..2033791e53bc 100644
> --- a/Documentation/x86/index.rst
> +++ b/Documentation/x86/index.rst
> @@ -9,3 +9,4 @@ Linux x86 Support
> :numbered:
>
> boot
> + topology
> diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
> new file mode 100644
> index 000000000000..1df5f56f4882
> --- /dev/null
> +++ b/Documentation/x86/topology.rst
> @@ -0,0 +1,228 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +============
> +x86 Topology
> +============
> +
> +This documents and clarifies the main aspects of x86 topology modelling and
> +representation in the kernel. Update/change when doing changes to the
> +respective code.
> +
> +The architecture-agnostic topology definitions are in
> +Documentation/cputopology.txt. This file holds x86-specific
> +differences/specialities which must not necessarily apply to the generic
> +definitions. Thus, the way to read up on Linux topology on x86 is to start
> +with the generic one and look at this one in parallel for the x86 specifics.
> +
> +Needless to say, code should use the generic functions - this file is *only*
> +here to *document* the inner workings of x86 topology.
> +
> +Started by Thomas Gleixner <[email protected]> and Borislav Petkov <[email protected]>.
> +
> +The main aim of the topology facilities is to present adequate interfaces to
> +code which needs to know/query/use the structure of the running system wrt
> +threads, cores, packages, etc.
> +
> +The kernel does not care about the concept of physical sockets because a
> +socket has no relevance to software. It's an electromechanical component. In
> +the past a socket always contained a single package (see below), but with the
> +advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
> +there might be still references to sockets in the code, but they are of
> +historical nature and should be cleaned up.
> +
> +The topology of a system is described in the units of:
> +
> + - packages
> + - cores
> + - threads
> +
> +Package
> +=======
> +
> +Packages contain a number of cores plus shared resources, e.g. DRAM
> +controller, shared caches etc.
> +
> +AMD nomenclature for package is 'Node'.
> +
> +Package-related topology information in the kernel:
> +
> + - cpuinfo_x86.x86_max_cores:
> +
> + The number of cores in a package. This information is retrieved via CPUID.
> +
> + - cpuinfo_x86.phys_proc_id:
> +
> + The physical ID of the package. This information is retrieved via CPUID
> + and deduced from the APIC IDs of the cores in the package.
> +
> + - cpuinfo_x86.logical_id:
> +
> + The logical ID of the package. As we do not trust BIOSes to enumerate the
> + packages in a consistent way, we introduced the concept of logical package
> + ID so we can sanely calculate the number of maximum possible packages in
> + the system and have the packages enumerated linearly.
> +
> + - topology_max_packages():
> +
> + The maximum possible number of packages in the system. Helpful for per
> + package facilities to preallocate per package information.
> +
> + - cpu_llc_id:
> +
> + A per-CPU variable containing:
> +
> + - On Intel, the first APIC ID of the list of CPUs sharing the Last Level
> + Cache.
> +
> + - On AMD, the Node ID or Core Complex ID containing the Last Level
> + Cache. In general, it is a number identifying an LLC uniquely on the
> + system.
> +
> +Cores
> +=====
> +
> +A core consists of 1 or more threads. It does not matter whether the threads
> +are SMT- or CMT-type threads.
> +
> +AMDs nomenclature for a CMT core is "Compute Unit". The kernel always uses
> +"core".
> +
> +Core-related topology information in the kernel:
> +
> + - smp_num_siblings:
> +
> + The number of threads in a core. The number of threads in a package can be
> + calculated by::
> +
> + threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
> +
> +
> +Threads
> +=======
> +
> +A thread is a single scheduling unit. It's the equivalent to a logical Linux
> +CPU.
> +
> +AMDs nomenclature for CMT threads is "Compute Unit Core". The kernel always
> +uses "thread".
> +
> +Thread-related topology information in the kernel:
> +
> + - topology_core_cpumask():
> +
> + The cpumask contains all online threads in the package to which a thread
> + belongs.
> +
> + The number of online threads is also printed in /proc/cpuinfo "siblings."
> +
> + - topology_sibling_cpumask():
> +
> + The cpumask contains all online threads in the core to which a thread
> + belongs.
> +
> + - topology_logical_package_id():
> +
> + The logical package ID to which a thread belongs.
> +
> + - topology_physical_package_id():
> +
> + The physical package ID to which a thread belongs.
> +
> + - topology_core_id();
> +
> + The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
> + "core_id."
> +
> +
> +
> +System topology examples
> +========================
> +
> +.. note:: The alternative Linux CPU enumeration depends on how the BIOS
> + enumerates the threads. Many BIOSes enumerate all threads 0 first and
> + then all threads 1. That has the "advantage" that the logical Linux CPU
> + numbers of threads 0 stay the same whether threads are enabled or not.
> + That's merely an implementation detail and has no practical impact.
> +
> +1) Single Package, Single Core
> +::

I would just place the :: at the end of the line above. The same
applies to similar cases in this file.

> +
> + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
> +
> +2) Single Package, Dual Core
> +
> + a) One thread per core
> + ::
> +
> + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
> + -> [core 1] -> [thread 0] -> Linux CPU 1

Something got broken here.

> +
> + b) Two threads per core
> + ::
> +
> + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
> + -> [thread 1] -> Linux CPU 1
> + -> [core 1] -> [thread 0] -> Linux CPU 2
> + -> [thread 1] -> Linux CPU 3

And here... This one, for example, should instead be:

    [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
                            -> [thread 1] -> Linux CPU 1
                -> [core 1] -> [thread 0] -> Linux CPU 2
                            -> [thread 1] -> Linux CPU 3

Clearly something is messing with tabs in your
x86 conversion.

I'll stop my review here, as it seems pointless to continue
while there is so much broken whitespace in your
conversion.
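
To illustrate the kind of breakage that tab handling can cause, here is a
minimal sketch. It assumes the original diagram was indented with tabs at
8-column tab stops; the two sample lines and the helper name
arrow_columns() are made up for the illustration and are not taken from
the patch. Expanding tabs to their tab stops keeps the arrows in the same
column, while replacing each tab with a fixed number of spaces does not.

```python
# Hypothetical tab-indented diagram lines (assumed, not copied from the patch).
lines = [
    "\t[package 0] -> [core 0] -> [thread 0] -> Linux CPU 0",
    "\t\t\t\t-> [thread 1] -> Linux CPU 1",
]


def arrow_columns(rows):
    """Return the column of the arrow preceding each '[thread N]' entry."""
    return [row.index("-> [thread") for row in rows]


# Correct conversion: expand each tab to the next 8-column tab stop.
tab_stops = [line.expandtabs(8) for line in lines]

# Naive conversion: replace every tab with four spaces.
naive = [line.replace("\t", "    ") for line in lines]

aligned_cols = arrow_columns(tab_stops)
broken_cols = arrow_columns(naive)

print("expandtabs(8):", aligned_cols)
print("naive replace:", broken_cols)
```

With tab-stop expansion both arrows land in the same column; with the
naive replacement they drift apart, which is what broken ASCII artwork
looks like in a diff.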

Thanks,
Mauro

2019-04-24 17:50:14

by Changbin Du

Subject: Re: [PATCH v4 24/63] Documentation: ACPI: move video_extension.txt to firmware-guide/acpi and convert to reST

On Wed, Apr 24, 2019 at 11:56:47AM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:53 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > add it to Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/firmware-guide/acpi/index.rst | 1 +
> > .../acpi/video_extension.rst} | 63 ++++++++++---------
> > 2 files changed, 36 insertions(+), 28 deletions(-)
> > rename Documentation/{acpi/video_extension.txt => firmware-guide/acpi/video_extension.rst} (79%)
> >
> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index 0e60f4b7129a..ae609eec4679 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -23,3 +23,4 @@ ACPI Support
> > i2c-muxes
> > acpi-lid
> > lpit
> > + video_extension
> > diff --git a/Documentation/acpi/video_extension.txt b/Documentation/firmware-guide/acpi/video_extension.rst
> > similarity index 79%
> > rename from Documentation/acpi/video_extension.txt
> > rename to Documentation/firmware-guide/acpi/video_extension.rst
> > index 79bf6a4921be..06f7e3230b6e 100644
> > --- a/Documentation/acpi/video_extension.txt
> > +++ b/Documentation/firmware-guide/acpi/video_extension.rst
> > @@ -1,5 +1,8 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +=====================
> > ACPI video extensions
> > -~~~~~~~~~~~~~~~~~~~~~
> > +=====================
> >
> > This driver implement the ACPI Extensions For Display Adapters for
> > integrated graphics devices on motherboard, as specified in ACPI 2.0
> > @@ -8,9 +11,10 @@ defining the video POST device, retrieving EDID information or to
> > setup a video output, etc. Note that this is an ref. implementation
> > only. It may or may not work for your integrated video device.
> >
> > -The ACPI video driver does 3 things regarding backlight control:
> > +The ACPI video driver does 3 things regarding backlight control.
> >
> > -1 Export a sysfs interface for user space to control backlight level
> > +1. Export a sysfs interface for user space to control backlight level
> > +=====================================================================
> >
> > If the ACPI table has a video device, and acpi_backlight=vendor kernel
> > command line is not present, the driver will register a backlight device
>
> Hmm... you didn't touch on this part of the document:
>
> And what ACPI video driver does is:
> actual_brightness: on read, control method _BQC will be evaluated to
> get the brightness level the firmware thinks it is at;
> bl_power: not implemented, will set the current brightness instead;
> brightness: on write, control method _BCM will run to set the requested
> brightness level;
> max_brightness: Derived from the _BCL package(see below);
> type: firmware
>
> You should touch it. My suggestion here is:
>
> And what ACPI video driver does is:
>
> actual_brightness:
> on read, control method _BQC will be evaluated to
> get the brightness level the firmware thinks it is at;
> bl_power:
> not implemented, will set the current brightness instead;
> brightness:
> on write, control method _BCM will run to set the requested
> brightness level;
> max_brightness:
> Derived from the _BCL package(see below);
> type:
> firmware
>
Thanks, done.

> > @@ -32,26 +36,26 @@ type: firmware
> >
> > Note that ACPI video backlight driver will always use index for
> > brightness, actual_brightness and max_brightness. So if we have
> > -the following _BCL package:
> > +the following _BCL package::
> >
> > -Method (_BCL, 0, NotSerialized)
> > -{
> > - Return (Package (0x0C)
> > + Method (_BCL, 0, NotSerialized)
> > {
> > - 0x64,
> > - 0x32,
> > - 0x0A,
> > - 0x14,
> > - 0x1E,
> > - 0x28,
> > - 0x32,
> > - 0x3C,
> > - 0x46,
> > - 0x50,
> > - 0x5A,
> > - 0x64
> > - })
> > -}
> > + Return (Package (0x0C)
> > + {
> > + 0x64,
> > + 0x32,
> > + 0x0A,
> > + 0x14,
> > + 0x1E,
> > + 0x28,
> > + 0x32,
> > + 0x3C,
> > + 0x46,
> > + 0x50,
> > + 0x5A,
> > + 0x64
> > + })
> > + }
> >
> > The first two levels are for when laptop are on AC or on battery and are
> > not used by Linux currently. The remaining 10 levels are supported levels
> > @@ -62,13 +66,15 @@ as a "brightness level" indicator. Thus from the user space perspective
> > the range of available brightness levels is from 0 to 9 (max_brightness)
> > inclusive.
> >
> > -2 Notify user space about hotkey event
> > +2. Notify user space about hotkey event
> > +=======================================
> >
> > There are generally two cases for hotkey event reporting:
> > +
> > i) For some laptops, when user presses the hotkey, a scancode will be
> > generated and sent to user space through the input device created by
> > the keyboard driver as a key type input event, with proper remap, the
> > - following key code will appear to user space:
> > + following key code will appear to user space::
> >
> > EV_KEY, KEY_BRIGHTNESSUP
> > EV_KEY, KEY_BRIGHTNESSDOWN
> > @@ -82,7 +88,7 @@ ii) For some laptops, the press of the hotkey will not generate the
> > about the event. The event value is defined in the ACPI spec. ACPI
> > video driver will generate an key type input event according to the
> > notify value it received and send the event to user space through the
> > - input device it created:
> > + input device it created::
> >
> > event keycode
> > 0x86 KEY_BRIGHTNESSUP
>
> Perhaps making this as a table would work better:
>
> input device it created:
>
> ===== ===================
> event keycode
> ===== ===================
> 0x86 KEY_BRIGHTNESSUP
> 0x87 KEY_BRIGHTNESSDOWN
> etc.
> ===== ===================
>
>
Done.

> > @@ -94,13 +100,14 @@ so this would lead to the same effect as case i) now.
> > Once user space tool receives this event, it can modify the backlight
> > level through the sysfs interface.
> >
> > -3 Change backlight level in the kernel
> > +3. Change backlight level in the kernel
> > +=======================================
> >
> > This works for machines covered by case ii) in Section 2. Once the driver
> > received a notification, it will set the backlight level accordingly. This does
> > not affect the sending of event to user space, they are always sent to user
> > space regardless of whether or not the video module controls the backlight level
> > directly. This behaviour can be controlled through the brightness_switch_enabled
> > -module parameter as documented in admin-guide/kernel-parameters.rst. It is recommended to
> > -disable this behaviour once a GUI environment starts up and wants to have full
> > -control of the backlight level.
> > +module parameter as documented in admin-guide/kernel-parameters.rst. It is
> > +recommended to disable this behaviour once a GUI environment starts up and
> > +wants to have full control of the backlight level.
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 17:51:42

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 00/63] Include linux ACPI/PCI/X86 docs into Sphinx TOC tree

Em Wed, 24 Apr 2019 23:46:18 +0800
Changbin Du <[email protected]> escreveu:

> On Tue, Apr 23, 2019 at 12:36:44PM -0500, Bjorn Helgaas wrote:
> > On Tue, Apr 23, 2019 at 06:39:47PM +0200, Rafael J. Wysocki wrote:
> > > On Tue, Apr 23, 2019 at 6:30 PM Changbin Du <[email protected]> wrote:
> > > > Hi Corbet and All,
> > > > The kernel now uses Sphinx to generate intelligent and beautiful
> > > > documentation from reStructuredText files. I converted all of the Linux
> > > > ACPI/PCI/X86 docs to reST format in this series.
> > > >
> > > > In this version I combined ACPI and PCI docs, and added new x86 docs
> > > > conversion.
> > >
> > > I'm not sure if combining all three into one big patch series has been
> > > a good idea, honestly.
> >
> > Yeah, if you post this again, I would find it easier to deal with if
> > linux-pci only got the PCI-related things. 63 patches is a little too
> > much for one series.
> >
> Sure, I will resend them separately.

I reviewed up to patch 39. Too many of the x86 files seem to have
been mangled by some tab->whitespace conversion, which caused
very big diffs and lots of broken ASCII artwork.

Please ensure that the diffs contain only the minimal changes
required for the files to be properly formatted as ReST.

Ah, perhaps next time you could format the patches with a lower
rename similarity threshold (using, for example, the parameter -M10).
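
For reference, a small sketch of how the <n> in -M<n> is read, following
the rule in the git-diff documentation: with a trailing '%' it is a
percentage, otherwise the digits are read as a fraction with a decimal
point in front. The function name rename_threshold_percent() is made up
for this illustration; it is not part of git.

```python
from fractions import Fraction


def rename_threshold_percent(arg: str) -> Fraction:
    """Interpret the <n> of git's -M<n> option as a percentage."""
    if not arg.startswith("-M"):
        raise ValueError("expected an option of the form -M<n>")
    value = arg[2:]
    if value == "":
        # A bare -M uses git's documented default similarity index.
        return Fraction(50)
    if value.endswith("%"):
        # Explicit percentage, e.g. -M90% means 90%.
        return Fraction(value[:-1])
    if not value.isdigit():
        raise ValueError(f"unsupported threshold: {value!r}")
    # No '%': read the digits as 0.<digits>, so -M5 is 0.5 and -M05 is 0.05.
    return Fraction(int(value), 10 ** len(value)) * 100


for option in ("-M", "-M10", "-M5", "-M05", "-M90%"):
    print(option, "->", f"{float(rename_threshold_percent(option)):g}%")
```

So -M10 asks git to treat a delete/add pair as a rename once roughly 10%
of the file is unchanged, instead of the default 50%. In practice that
would mean passing -M10 to git format-patch when regenerating the series.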

Regards,
Mauro

>
> > Bjorn
>



Thanks,
Mauro

2019-04-24 17:55:50

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 09/63] Documentation: ACPI: move method-customizing.txt to firmware-guide/acpi and convert to reST

Em Thu, 25 Apr 2019 00:28:52 +0800
Changbin Du <[email protected]> escreveu:

> On Tue, Apr 23, 2019 at 06:03:16PM -0300, Mauro Carvalho Chehab wrote:
> > Em Wed, 24 Apr 2019 00:28:38 +0800
> > Changbin Du <[email protected]> escreveu:
> >

> > > +.. note:: Only ACPI METHOD can be overridden, any other object types like
> > > + "Device", "OperationRegion", are not recognized. Methods
> > > + declared inside scope operators are also not supported.
> > > +.. note:: The same ACPI control method can be overridden for many times,
> > > + and it's always the latest one that used by Linux/kernel.
> > > +.. note:: To get the ACPI debug object output (Store (AAAA, Debug)),
> > > + please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
> >
> > Hmm... this may work (not sure if Sphinx would warn or not), but it
> > is visually bad on text mode. I would code it, instead, with something
> > like:
> >
> > .. note::
> >
> > - Only ACPI METHOD can be overridden, any other object types like
> > "Device", "OperationRegion", are not recognized. Methods
> > declared inside scope operators are also not supported.
> >
> > - The same ACPI control method can be overridden for many times,
> > and it's always the latest one that used by Linux/kernel.
> >
> > - To get the ACPI debug object output (Store (AAAA, Debug)),
> > please run::
> >
> > echo 1 > /sys/module/acpi/parameters/aml_debug_output
> >
> > As this would make it visually better on both text and html formats.
> >
> No warnings given.

Interesting. I'm now wondering whether it did the right thing or produced
some weird output... Maybe the answer depends on the Sphinx version in
use.

> Your suggested style is better so applied it. Thanks!

Thank you!

Thanks,
Mauro

2019-04-24 19:19:05

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 18/63] Documentation: ACPI: move aml-debugger.txt to firmware-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:47 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

For the conversion changes:

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/acpi/aml-debugger.txt | 66 ----------------
> .../firmware-guide/acpi/aml-debugger.rst | 75 +++++++++++++++++++
> Documentation/firmware-guide/acpi/index.rst | 1 +
> 3 files changed, 76 insertions(+), 66 deletions(-)
> delete mode 100644 Documentation/acpi/aml-debugger.txt
> create mode 100644 Documentation/firmware-guide/acpi/aml-debugger.rst
>
> diff --git a/Documentation/acpi/aml-debugger.txt b/Documentation/acpi/aml-debugger.txt
> deleted file mode 100644
> index 75ebeb64ab29..000000000000
> --- a/Documentation/acpi/aml-debugger.txt
> +++ /dev/null
> @@ -1,66 +0,0 @@
> -The AML Debugger
> -
> -Copyright (C) 2016, Intel Corporation
> -Author: Lv Zheng <[email protected]>
> -
> -
> -This document describes the usage of the AML debugger embedded in the Linux
> -kernel.
> -
> -1. Build the debugger
> -
> - The following kernel configuration items are required to enable the AML
> - debugger interface from the Linux kernel:
> -
> - CONFIG_ACPI_DEBUGGER=y
> - CONFIG_ACPI_DEBUGGER_USER=m
> -
> - The userspace utilities can be built from the kernel source tree using
> - the following commands:
> -
> - $ cd tools
> - $ make acpi
> -
> - The resultant userspace tool binary is then located at:
> -
> - tools/power/acpi/acpidbg
> -
> - It can be installed to system directories by running "make install" (as a
> - sufficiently privileged user).
> -
> -2. Start the userspace debugger interface
> -
> - After booting the kernel with the debugger built-in, the debugger can be
> - started by using the following commands:
> -
> - # mount -t debugfs none /sys/kernel/debug
> - # modprobe acpi_dbg
> - # tools/power/acpi/acpidbg
> -
> - That spawns the interactive AML debugger environment where you can execute
> - debugger commands.
> -
> - The commands are documented in the "ACPICA Overview and Programmer Reference"
> - that can be downloaded from
> -
> - https://acpica.org/documentation
> -
> - The detailed debugger commands reference is located in Chapter 12 "ACPICA
> - Debugger Reference". The "help" command can be used for a quick reference.
> -
> -3. Stop the userspace debugger interface
> -
> - The interactive debugger interface can be closed by pressing Ctrl+C or using
> - the "quit" or "exit" commands. When finished, unload the module with:
> -
> - # rmmod acpi_dbg
> -
> - The module unloading may fail if there is an acpidbg instance running.
> -
> -4. Run the debugger in a script
> -
> - It may be useful to run the AML debugger in a test script. "acpidbg" supports
> - this in a special "batch" mode. For example, the following command outputs
> - the entire ACPI namespace:
> -
> - # acpidbg -b "namespace"
> diff --git a/Documentation/firmware-guide/acpi/aml-debugger.rst b/Documentation/firmware-guide/acpi/aml-debugger.rst
> new file mode 100644
> index 000000000000..a889d43bc6c5
> --- /dev/null
> +++ b/Documentation/firmware-guide/acpi/aml-debugger.rst
> @@ -0,0 +1,75 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. include:: <isonum.txt>
> +
> +================
> +The AML Debugger
> +================
> +
> +:Copyright: |copy| 2016, Intel Corporation
> +:Author: Lv Zheng <[email protected]>
> +
> +
> +This document describes the usage of the AML debugger embedded in the Linux
> +kernel.
> +
> +1. Build the debugger
> +=====================
> +
> +The following kernel configuration items are required to enable the AML
> +debugger interface from the Linux kernel::
> +
> + CONFIG_ACPI_DEBUGGER=y
> + CONFIG_ACPI_DEBUGGER_USER=m
> +
> +The userspace utilities can be built from the kernel source tree using
> +the following commands::
> +
> + $ cd tools
> + $ make acpi
> +
> +The resultant userspace tool binary is then located at::
> +
> + tools/power/acpi/acpidbg
> +
> +It can be installed to system directories by running "make install" (as a
> +sufficiently privileged user).
> +
> +2. Start the userspace debugger interface
> +=========================================
> +
> +After booting the kernel with the debugger built-in, the debugger can be
> +started by using the following commands::
> +
> + # mount -t debugfs none /sys/kernel/debug
> + # modprobe acpi_dbg
> + # tools/power/acpi/acpidbg
> +
> +That spawns the interactive AML debugger environment where you can execute
> +debugger commands.
> +
> +The commands are documented in the "ACPICA Overview and Programmer Reference"
> +that can be downloaded from
> +
> +https://acpica.org/documentation
> +
> +The detailed debugger commands reference is located in Chapter 12 "ACPICA
> +Debugger Reference". The "help" command can be used for a quick reference.
> +
> +3. Stop the userspace debugger interface
> +========================================
> +
> +The interactive debugger interface can be closed by pressing Ctrl+C or using
> +the "quit" or "exit" commands. When finished, unload the module with::
> +
> + # rmmod acpi_dbg
> +
> +The module unloading may fail if there is an acpidbg instance running.
> +
> +4. Run the debugger in a script
> +===============================
> +
> +It may be useful to run the AML debugger in a test script. "acpidbg" supports
> +this in a special "batch" mode. For example, the following command outputs
> +the entire ACPI namespace::
> +
> + # acpidbg -b "namespace"
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index 287a7cbd82ac..e9f253d54897 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -16,6 +16,7 @@ ACPI Support
> method-tracing
> DSD-properties-rules
> debug
> + aml-debugger
> gpio-properties
> i2c-muxes
> acpi-lid



Thanks,
Mauro

2019-04-24 20:56:05

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 22/63] Documentation: ACPI: move lpit.txt to firmware-guide/acpi and convert to reST

Em Wed, 24 Apr 2019 00:28:51 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/firmware-guide/acpi/index.rst | 1 +
> .../lpit.txt => firmware-guide/acpi/lpit.rst} | 18 +++++++++++++-----
> 2 files changed, 14 insertions(+), 5 deletions(-)
> rename Documentation/{acpi/lpit.txt => firmware-guide/acpi/lpit.rst} (68%)
>
> diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> index fca854f017d8..0e60f4b7129a 100644
> --- a/Documentation/firmware-guide/acpi/index.rst
> +++ b/Documentation/firmware-guide/acpi/index.rst
> @@ -22,3 +22,4 @@ ACPI Support
> gpio-properties
> i2c-muxes
> acpi-lid
> + lpit
> diff --git a/Documentation/acpi/lpit.txt b/Documentation/firmware-guide/acpi/lpit.rst
> similarity index 68%
> rename from Documentation/acpi/lpit.txt
> rename to Documentation/firmware-guide/acpi/lpit.rst
> index b426398d2e97..aca928fab027 100644
> --- a/Documentation/acpi/lpit.txt
> +++ b/Documentation/firmware-guide/acpi/lpit.rst
> @@ -1,3 +1,9 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========================
> +Low Power Idle Table (LPIT)
> +===========================
> +
> To enumerate platform Low Power Idle states, Intel platforms are using
> “Low Power Idle Table” (LPIT). More details about this table can be
> downloaded from:
> @@ -8,13 +14,15 @@ Residencies for each low power state can be read via FFH
>
> On platforms supporting S0ix sleep states, there can be two types of
> residencies:
> -- CPU PKG C10 (Read via FFH interface)
> -- Platform Controller Hub (PCH) SLP_S0 (Read via memory mapped interface)
> +
> + - CPU PKG C10 (Read via FFH interface)
> + - Platform Controller Hub (PCH) SLP_S0 (Read via memory mapped interface)
>
> The following attributes are added dynamically to the cpuidle
> -sysfs attribute group:
> - /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
> - /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
> +sysfs attribute group::
> +
> + /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
> + /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
>
> The "low_power_idle_cpu_residency_us" attribute shows time spent
> by the CPU package in PKG C10



Thanks,
Mauro

2019-04-24 21:27:34

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 30/63] Documentation: PCI: convert acpi-info.txt to reST

Em Wed, 24 Apr 2019 00:28:59 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> adds it to the Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/{acpi-info.txt => acpi-info.rst} | 11 ++++++++---
> Documentation/PCI/index.rst | 1 +
> 2 files changed, 9 insertions(+), 3 deletions(-)
> rename Documentation/PCI/{acpi-info.txt => acpi-info.rst} (97%)
>
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.rst
> similarity index 97%
> rename from Documentation/PCI/acpi-info.txt
> rename to Documentation/PCI/acpi-info.rst
> index 3ffa3b03970e..f7dabb7ca255 100644
> --- a/Documentation/PCI/acpi-info.txt
> +++ b/Documentation/PCI/acpi-info.rst
> @@ -1,4 +1,8 @@
> - ACPI considerations for PCI host bridges
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +========================================
> +ACPI considerations for PCI host bridges
> +========================================
>
> The general rule is that the ACPI namespace should describe everything the
> OS might use unless there's another way for the OS to find it [1, 2].
> @@ -135,8 +139,9 @@ address always corresponds to bus 0, even if the bus range below the bridge
>
> Extended Address Space Descriptor (.4)
> General Flags: Bit [0] Consumer/Producer:
> - 1–This device consumes this resource
> - 0–This device produces and consumes this resource
> +
> + * 1 – This device consumes this resource
> + * 0 – This device produces and consumes this resource

Hmm... I think you would need to add some extra blank lines before
the above, e.g., something like:

[4] ACPI 6.2, sec 6.4.3.5.1, 2, 3, 4:
QWord/DWord/Word Address Space Descriptor (.1, .2, .3)
General Flags: Bit [0] Ignored

Extended Address Space Descriptor (.4)
General Flags: Bit [0] Consumer/Producer:

* 1 – This device consumes this resource
* 0 – This device produces and consumes this resource

>
> [5] ACPI 6.2, sec 19.6.43:
> ResourceUsage specifies whether the Memory range is consumed by
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index 1b25bcc1edca..c877a369481d 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -12,3 +12,4 @@ Linux PCI Bus Subsystem
> PCIEBUS-HOWTO
> pci-iov-howto
> MSI-HOWTO
> + acpi-info



Thanks,
Mauro

2019-04-24 21:35:04

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 33/63] Documentation: PCI: convert endpoint/pci-endpoint.txt to reST

Em Wed, 24 Apr 2019 00:29:02 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> adds it to the Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/endpoint/index.rst | 10 ++
> .../{pci-endpoint.txt => pci-endpoint.rst} | 95 +++++++++++--------
> Documentation/PCI/index.rst | 1 +
> 3 files changed, 68 insertions(+), 38 deletions(-)
> create mode 100644 Documentation/PCI/endpoint/index.rst
> rename Documentation/PCI/endpoint/{pci-endpoint.txt => pci-endpoint.rst} (82%)
>
> diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
> new file mode 100644
> index 000000000000..0db4f2fcd7f0
> --- /dev/null
> +++ b/Documentation/PCI/endpoint/index.rst
> @@ -0,0 +1,10 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================
> +PCI Endpoint Framework
> +======================
> +
> +.. toctree::
> + :maxdepth: 2
> +
> + pci-endpoint
> diff --git a/Documentation/PCI/endpoint/pci-endpoint.txt b/Documentation/PCI/endpoint/pci-endpoint.rst
> similarity index 82%
> rename from Documentation/PCI/endpoint/pci-endpoint.txt
> rename to Documentation/PCI/endpoint/pci-endpoint.rst
> index e86a96b66a6a..6674ce5425bf 100644
> --- a/Documentation/PCI/endpoint/pci-endpoint.txt
> +++ b/Documentation/PCI/endpoint/pci-endpoint.rst
> @@ -1,11 +1,17 @@
> - PCI ENDPOINT FRAMEWORK
> - Kishon Vijay Abraham I <[email protected]>
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================
> +PCI Endpoint Framework
> +======================

Hmm... considering that you decided to create an index file for the
endpoint, with the same title, I would just remove this from here.

> +
> +:Author: Kishon Vijay Abraham I <[email protected]>
>
> This document is a guide to use the PCI Endpoint Framework in order to create
> endpoint controller driver, endpoint function driver, and using configfs
> interface to bind the function driver to the controller driver.
>
> -1. Introduction
> +Introduction
> +============
>
> Linux has a comprehensive PCI subsystem to support PCI controllers that
> operates in Root Complex mode. The subsystem has capability to scan PCI bus,
> @@ -19,24 +25,27 @@ add endpoint mode support in Linux. This will help to run Linux in an
> EP system which can have a wide variety of use cases from testing or
> validation, co-processor accelerator, etc.
>
> -2. PCI Endpoint Core
> +PCI Endpoint Core
> +=================
>
> The PCI Endpoint Core layer comprises 3 components: the Endpoint Controller
> library, the Endpoint Function library, and the configfs layer to bind the
> endpoint function with the endpoint controller.
>
> -2.1 PCI Endpoint Controller(EPC) Library
> +PCI Endpoint Controller(EPC) Library
> +------------------------------------
>
> The EPC library provides APIs to be used by the controller that can operate
> in endpoint mode. It also provides APIs to be used by function driver/library
> in order to implement a particular endpoint function.
>
> -2.1.1 APIs for the PCI controller Driver
> +APIs for the PCI controller Driver
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> This section lists the APIs that the PCI Endpoint core provides to be used
> by the PCI controller driver.
>
> -*) devm_pci_epc_create()/pci_epc_create()
> +* devm_pci_epc_create()/pci_epc_create()

I would instead promote this to a sub-level heading, e.g., something like:

devm_pci_epc_create()/pci_epc_create()
......................................

(if you do that, you'll need to also promote some similar function
documentation within this doc)
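
For instance, a rough sketch of how two of the other entries might look once
promoted (the "." underline is only one possible choice; any adornment
character not already used for a heading level in this file would do):

```rst
devm_pci_epc_destroy()/pci_epc_destroy()
........................................

The PCI controller driver can destroy the EPC device created by either
devm_pci_epc_create() or pci_epc_create() using devm_pci_epc_destroy() or
pci_epc_destroy().

pci_epc_linkup()
................

In order to notify all the function devices that the EPC device to which
they are linked has established a link with the host, the PCI controller
driver should invoke pci_epc_linkup().
```

The underline has to be at least as long as the title text, and the body
paragraphs would no longer need to be indented under a bullet.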


>
> The PCI controller driver should implement the following ops:
> * write_header: ops to populate configuration space header

Better to add a blank line between those two lines. There's no sense
(IMHO) in using a bold font for the first line here.
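
A minimal sketch of the spacing I mean, using the two lines quoted above
(the indentation is illustrative). Without the blank line, and with the
list indented relative to the sentence, reST tends to parse the sentence
as a definition-list term, which is most likely where the bold font
comes from:

```rst
The PCI controller driver should implement the following ops:

* write_header: ops to populate configuration space header
```

With the blank line in place, the sentence should render as a normal
paragraph followed by a plain bullet list.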

> @@ -51,110 +60,116 @@ by the PCI controller driver.
> The PCI controller driver can then create a new EPC device by invoking
> devm_pci_epc_create()/pci_epc_create().
>
> -*) devm_pci_epc_destroy()/pci_epc_destroy()
> +* devm_pci_epc_destroy()/pci_epc_destroy()
>
> The PCI controller driver can destroy the EPC device created by either
> devm_pci_epc_create() or pci_epc_create() using devm_pci_epc_destroy() or
> pci_epc_destroy().
>
> -*) pci_epc_linkup()
> +* pci_epc_linkup()
>
> In order to notify all the function devices that the EPC device to which
> they are linked has established a link with the host, the PCI controller
> driver should invoke pci_epc_linkup().
>
> -*) pci_epc_mem_init()
> +* pci_epc_mem_init()
>
> Initialize the pci_epc_mem structure used for allocating EPC addr space.
>
> -*) pci_epc_mem_exit()
> +* pci_epc_mem_exit()
>
> Cleanup the pci_epc_mem structure allocated during pci_epc_mem_init().
>
> -2.1.2 APIs for the PCI Endpoint Function Driver
> +
> +APIs for the PCI Endpoint Function Driver
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> This section lists the APIs that the PCI Endpoint core provides to be used
> by the PCI endpoint function driver.
>
> -*) pci_epc_write_header()
> +* pci_epc_write_header()
>
> The PCI endpoint function driver should use pci_epc_write_header() to
> write the standard configuration header to the endpoint controller.
>
> -*) pci_epc_set_bar()
> +* pci_epc_set_bar()
>
> The PCI endpoint function driver should use pci_epc_set_bar() to configure
> the Base Address Register in order for the host to assign PCI addr space.
> Register space of the function driver is usually configured
> using this API.
>
> -*) pci_epc_clear_bar()
> +* pci_epc_clear_bar()
>
> The PCI endpoint function driver should use pci_epc_clear_bar() to reset
> the BAR.
>
> -*) pci_epc_raise_irq()
> +* pci_epc_raise_irq()
>
> The PCI endpoint function driver should use pci_epc_raise_irq() to raise
> Legacy Interrupt, MSI or MSI-X Interrupt.
>
> -*) pci_epc_mem_alloc_addr()
> +* pci_epc_mem_alloc_addr()
>
> The PCI endpoint function driver should use pci_epc_mem_alloc_addr(), to
> allocate memory address from EPC addr space which is required to access
> RC's buffer
>
> -*) pci_epc_mem_free_addr()
> +* pci_epc_mem_free_addr()
>
> The PCI endpoint function driver should use pci_epc_mem_free_addr() to
> free the memory space allocated using pci_epc_mem_alloc_addr().
>
> -2.1.3 Other APIs
> +Other APIs
> +~~~~~~~~~~
>
> There are other APIs provided by the EPC library. These are used for binding
> the EPF device with EPC device. pci-ep-cfs.c can be used as reference for
> using these APIs.
>
> -*) pci_epc_get()
> +* pci_epc_get()
>
> Get a reference to the PCI endpoint controller based on the device name of
> the controller.
>
> -*) pci_epc_put()
> +* pci_epc_put()
>
> Release the reference to the PCI endpoint controller obtained using
> pci_epc_get()
>
> -*) pci_epc_add_epf()
> +* pci_epc_add_epf()
>
> Add a PCI endpoint function to a PCI endpoint controller. A PCIe device
> can have up to 8 functions according to the specification.
>
> -*) pci_epc_remove_epf()
> +* pci_epc_remove_epf()
>
> Remove the PCI endpoint function from PCI endpoint controller.
>
> -*) pci_epc_start()
> +* pci_epc_start()
>
> The PCI endpoint function driver should invoke pci_epc_start() once it
> has configured the endpoint function and wants to start the PCI link.
>
> -*) pci_epc_stop()
> +* pci_epc_stop()
>
> The PCI endpoint function driver should invoke pci_epc_stop() to stop
> the PCI LINK.
>
> -2.2 PCI Endpoint Function(EPF) Library
> +
> +PCI Endpoint Function(EPF) Library
> +----------------------------------
>
> The EPF library provides APIs to be used by the function driver and the EPC
> library to provide endpoint mode functionality.
>
> -2.2.1 APIs for the PCI Endpoint Function Driver
> +APIs for the PCI Endpoint Function Driver
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> This section lists the APIs that the PCI Endpoint core provides to be used
> by the PCI endpoint function driver.
>
> -*) pci_epf_register_driver()
> +* pci_epf_register_driver()
>
> The PCI Endpoint Function driver should implement the following ops:
> * bind: ops to perform when a EPC device has been bound to EPF device
> @@ -166,50 +181,54 @@ by the PCI endpoint function driver.
> The PCI Function driver can then register the PCI EPF driver by using
> pci_epf_register_driver().
>
> -*) pci_epf_unregister_driver()
> +* pci_epf_unregister_driver()
>
> The PCI Function driver can unregister the PCI EPF driver by using
> pci_epf_unregister_driver().
>
> -*) pci_epf_alloc_space()
> +* pci_epf_alloc_space()
>
> The PCI Function driver can allocate space for a particular BAR using
> pci_epf_alloc_space().
>
> -*) pci_epf_free_space()
> +* pci_epf_free_space()
>
> The PCI Function driver can free the allocated space
> (using pci_epf_alloc_space) by invoking pci_epf_free_space().
>
> -2.2.2 APIs for the PCI Endpoint Controller Library
> +APIs for the PCI Endpoint Controller Library
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> This section lists the APIs that the PCI Endpoint core provides to be used
> by the PCI endpoint controller library.
>
> -*) pci_epf_linkup()
> +* pci_epf_linkup()
>
> The PCI endpoint controller library invokes pci_epf_linkup() when the
> EPC device has established the connection to the host.
>
> -2.2.2 Other APIs
> +Other APIs
> +~~~~~~~~~~
> +
> There are other APIs provided by the EPF library. These are used to notify
> the function driver when the EPF device is bound to the EPC device.
> pci-ep-cfs.c can be used as reference for using these APIs.
>
> -*) pci_epf_create()
> +* pci_epf_create()
>
> Create a new PCI EPF device by passing the name of the PCI EPF device.
> This name will be used to bind the the EPF device to a EPF driver.
>
> -*) pci_epf_destroy()
> +* pci_epf_destroy()
>
> Destroy the created PCI EPF device.
>
> -*) pci_epf_bind()
> +* pci_epf_bind()
>
> pci_epf_bind() should be invoked when the EPF device has been bound to
> a EPC device.
>
> -*) pci_epf_unbind()
> +* pci_epf_unbind()
>
> pci_epf_unbind() should be invoked when the binding between EPC device
> and EPF device is lost.
> diff --git a/Documentation/PCI/index.rst b/Documentation/PCI/index.rst
> index 86c76c22810b..c8ea2e626c20 100644
> --- a/Documentation/PCI/index.rst
> +++ b/Documentation/PCI/index.rst
> @@ -15,3 +15,4 @@ Linux PCI Bus Subsystem
> acpi-info
> pci-error-recovery
> pcieaer-howto
> + endpoint/index



Thanks,
Mauro

2019-04-24 21:57:16

by Changbin Du

Subject: Re: [PATCH v4 08/63] Documentation: ACPI: move gpio-properties.txt to firmware-guide/acpi and convert to reST

On Tue, Apr 23, 2019 at 05:55:15PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:37 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > adds it to the Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > .../acpi/gpio-properties.rst} | 78 +++++++++++--------
> > Documentation/firmware-guide/acpi/index.rst | 1 +
> > MAINTAINERS | 2 +-
> > 3 files changed, 46 insertions(+), 35 deletions(-)
> > rename Documentation/{acpi/gpio-properties.txt => firmware-guide/acpi/gpio-properties.rst} (81%)
> >
> > diff --git a/Documentation/acpi/gpio-properties.txt b/Documentation/firmware-guide/acpi/gpio-properties.rst
> > similarity index 81%
> > rename from Documentation/acpi/gpio-properties.txt
> > rename to Documentation/firmware-guide/acpi/gpio-properties.rst
> > index 88c65cb5bf0a..89c636963544 100644
> > --- a/Documentation/acpi/gpio-properties.txt
> > +++ b/Documentation/firmware-guide/acpi/gpio-properties.rst
> > @@ -1,5 +1,8 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +======================================
> > _DSD Device Properties Related to GPIO
> > ---------------------------------------
> > +======================================
> >
> > With the release of ACPI 5.1, the _DSD configuration object finally
> > allows names to be given to GPIOs (and other things as well) returned
> > @@ -8,7 +11,7 @@ the corresponding GPIO, which is pretty error prone (it depends on
> > the _CRS output ordering, for example).
> >
> > With _DSD we can now query GPIOs using a name instead of an integer
> > -index, like the ASL example below shows:
> > +index, like the ASL example below shows::
> >
> > // Bluetooth device with reset and shutdown GPIOs
> > Device (BTH)
> > @@ -34,15 +37,19 @@ index, like the ASL example below shows:
> > })
> > }
> >
> > -The format of the supported GPIO property is:
> > +The format of the supported GPIO property is::
> >
> > Package () { "name", Package () { ref, index, pin, active_low }}
> >
> > - ref - The device that has _CRS containing GpioIo()/GpioInt() resources,
> > - typically this is the device itself (BTH in our case).
> > - index - Index of the GpioIo()/GpioInt() resource in _CRS starting from zero.
> > - pin - Pin in the GpioIo()/GpioInt() resource. Typically this is zero.
> > - active_low - If 1 the GPIO is marked as active_low.
> > +ref
> > + The device that has _CRS containing GpioIo()/GpioInt() resources,
> > + typically this is the device itself (BTH in our case).
> > +index
> > + Index of the GpioIo()/GpioInt() resource in _CRS starting from zero.
> > +pin
> > + Pin in the GpioIo()/GpioInt() resource. Typically this is zero.
> > +active_low
> > + If 1 the GPIO is marked as active_low.
> >
> > Since ACPI GpioIo() resource does not have a field saying whether it is
> > active low or high, the "active_low" argument can be used here. Setting
> > @@ -55,7 +62,7 @@ It is possible to leave holes in the array of GPIOs. This is useful in
> > cases like with SPI host controllers where some chip selects may be
> > implemented as GPIOs and some as native signals. For example a SPI host
> > controller can have chip selects 0 and 2 implemented as GPIOs and 1 as
> > -native:
> > +native::
> >
> > Package () {
> > "cs-gpios",
> > @@ -67,7 +74,7 @@ native:
> > }
> >
> > Other supported properties
> > ---------------------------
> > +==========================
> >
> > Following Device Tree compatible device properties are also supported by
> > _DSD device properties for GPIO controllers:
> > @@ -78,7 +85,7 @@ _DSD device properties for GPIO controllers:
> > - input
> > - line-name
> >
> > -Example:
> > +Example::
> >
> > Name (_DSD, Package () {
> > // _DSD Hierarchical Properties Extension UUID
> > @@ -100,7 +107,7 @@ Example:
> >
> > - gpio-line-names
> >
> > -Example:
> > +Example::
> >
> > Package () {
> > "gpio-line-names",
> > @@ -114,7 +121,7 @@ See Documentation/devicetree/bindings/gpio/gpio.txt for more information
> > about these properties.
> >
> > ACPI GPIO Mappings Provided by Drivers
> > ---------------------------------------
> > +======================================
> >
> > There are systems in which the ACPI tables do not contain _DSD but provide _CRS
> > with GpioIo()/GpioInt() resources and device drivers still need to work with
> > @@ -139,16 +146,16 @@ line in that resource starting from zero, and the active-low flag for that line,
> > respectively, in analogy with the _DSD GPIO property format specified above.
> >
> > For the example Bluetooth device discussed previously the data structures in
> > -question would look like this:
> > +question would look like this::
> >
> > -static const struct acpi_gpio_params reset_gpio = { 1, 1, false };
> > -static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false };
> > + static const struct acpi_gpio_params reset_gpio = { 1, 1, false };
> > + static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false };
> >
> > -static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = {
> > - { "reset-gpios", &reset_gpio, 1 },
> > - { "shutdown-gpios", &shutdown_gpio, 1 },
> > - { },
> > -};
> > + static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = {
> > + { "reset-gpios", &reset_gpio, 1 },
> > + { "shutdown-gpios", &shutdown_gpio, 1 },
> > + { },
> > + };
> >
> > Next, the mapping table needs to be passed as the second argument to
> > acpi_dev_add_driver_gpios() that will register it with the ACPI device object
> > @@ -158,12 +165,12 @@ calling acpi_dev_remove_driver_gpios() on the ACPI device object where that
> > table was previously registered.
> >
> > Using the _CRS fallback
> > ------------------------
> > +=======================
> >
> > If a device does not have _DSD or the driver does not create ACPI GPIO
> > mapping, the Linux GPIO framework refuses to return any GPIOs. This is
> > because the driver does not know what it actually gets. For example if we
> > -have a device like below:
> > +have a device like below::
> >
> > Device (BTH)
> > {
> > @@ -177,7 +184,7 @@ have a device like below:
> > })
> > }
> >
> > -The driver might expect to get the right GPIO when it does:
> > +The driver might expect to get the right GPIO when it does::
>
> Hmm... there is a small typo here:
>
> ": :" -> "::"
>
Good catch! Thanks.
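
With that fixed, the line and the literal block it introduces should read
roughly as follows (the indentation of the code line is illustrative):

```rst
The driver might expect to get the right GPIO when it does::

  desc = gpiod_get(dev, "reset", GPIOD_OUT_LOW);
```

The "::" has to be written without a space in between for reST to treat
the indented text that follows as a literal block.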

> For the conversion itself, after correcting the above typo:
>
> Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
>
>
> >
> > desc = gpiod_get(dev, "reset", GPIOD_OUT_LOW);
> >
> > @@ -193,22 +200,25 @@ the ACPI GPIO mapping tables are hardly linked to ACPI ID and certain
> > objects, as listed in the above chapter, of the device in question.
> >
> > Getting GPIO descriptor
> > ------------------------
> > +=======================
> > +
> > +There are two main approaches to get GPIO resource from ACPI::
> >
> > -There are two main approaches to get GPIO resource from ACPI:
> > - desc = gpiod_get(dev, connection_id, flags);
> > - desc = gpiod_get_index(dev, connection_id, index, flags);
> > + desc = gpiod_get(dev, connection_id, flags);
> > + desc = gpiod_get_index(dev, connection_id, index, flags);
> >
> > We may consider two different cases here, i.e. when connection ID is
> > provided and otherwise.
> >
> > -Case 1:
> > - desc = gpiod_get(dev, "non-null-connection-id", flags);
> > - desc = gpiod_get_index(dev, "non-null-connection-id", index, flags);
> > +Case 1::
> > +
> > + desc = gpiod_get(dev, "non-null-connection-id", flags);
> > + desc = gpiod_get_index(dev, "non-null-connection-id", index, flags);
> > +
> > +Case 2::
> >
> > -Case 2:
> > - desc = gpiod_get(dev, NULL, flags);
> > - desc = gpiod_get_index(dev, NULL, index, flags);
> > + desc = gpiod_get(dev, NULL, flags);
> > + desc = gpiod_get_index(dev, NULL, index, flags);
> >
> > Case 1 assumes that corresponding ACPI device description must have
> > defined device properties and will prevent to getting any GPIO resources
> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index 0e05b843521c..61d67763851b 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -11,3 +11,4 @@ ACPI Support
> > enumeration
> > osi
> > DSD-properties-rules
> > + gpio-properties
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 09f43f1bdd15..87f930bf32ad 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6593,7 +6593,7 @@ M: Andy Shevchenko <[email protected]>
> > L: [email protected]
> > L: [email protected]
> > S: Maintained
> > -F: Documentation/acpi/gpio-properties.txt
> > +F: Documentation/firmware-guide/acpi/gpio-properties.rst
> > F: drivers/gpio/gpiolib-acpi.c
> >
> > GPIO IR Transmitter
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 22:04:36

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 34/63] Documentation: PCI: convert endpoint/pci-endpoint-cfs.txt to reST

Em Wed, 24 Apr 2019 00:29:03 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> adds it to the Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/PCI/endpoint/index.rst | 1 +
> ...-endpoint-cfs.txt => pci-endpoint-cfs.rst} | 99 +++++++++++--------
> 2 files changed, 57 insertions(+), 43 deletions(-)
> rename Documentation/PCI/endpoint/{pci-endpoint-cfs.txt => pci-endpoint-cfs.rst} (64%)
>
> diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
> index 0db4f2fcd7f0..3951de9f923c 100644
> --- a/Documentation/PCI/endpoint/index.rst
> +++ b/Documentation/PCI/endpoint/index.rst
> @@ -8,3 +8,4 @@ PCI Endpoint Framework
> :maxdepth: 2
>
> pci-endpoint
> + pci-endpoint-cfs
> diff --git a/Documentation/PCI/endpoint/pci-endpoint-cfs.txt b/Documentation/PCI/endpoint/pci-endpoint-cfs.rst
> similarity index 64%
> rename from Documentation/PCI/endpoint/pci-endpoint-cfs.txt
> rename to Documentation/PCI/endpoint/pci-endpoint-cfs.rst
> index d740f29960a4..b6d39cdec56e 100644
> --- a/Documentation/PCI/endpoint/pci-endpoint-cfs.txt
> +++ b/Documentation/PCI/endpoint/pci-endpoint-cfs.rst
> @@ -1,41 +1,51 @@
> - CONFIGURING PCI ENDPOINT USING CONFIGFS
> - Kishon Vijay Abraham I <[email protected]>
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================================
> +Configuring PCI Endpoint Using CONFIGFS
> +=======================================
> +
> +:Author: Kishon Vijay Abraham I <[email protected]>
>
> The PCI Endpoint Core exposes configfs entry (pci_ep) to configure the
> PCI endpoint function and to bind the endpoint function
> with the endpoint controller. (For introducing other mechanisms to
> configure the PCI Endpoint Function refer to [1]).
>
> -*) Mounting configfs
> +Mounting configfs
> +=================
>
> The PCI Endpoint Core layer creates pci_ep directory in the mounted configfs
> -directory. configfs can be mounted using the following command.
> +directory. configfs can be mounted using the following command::
>
> mount -t configfs none /sys/kernel/config
>
> -*) Directory Structure
> +Directory Structure
> +===================
>
> The pci_ep configfs has two directories at its root: controllers and
> functions. Every EPC device present in the system will have an entry in
> the *controllers* directory and and every EPF driver present in the system
> will have an entry in the *functions* directory.
> +::
>
> -/sys/kernel/config/pci_ep/
> - .. controllers/
> - .. functions/
> + /sys/kernel/config/pci_ep/
> + .. controllers/
> + .. functions/
>
> -*) Creating EPF Device
> +Creating EPF Device
> +===================
>
> Every registered EPF driver will be listed in controllers directory. The
> entries corresponding to EPF driver will be created by the EPF core.
> +::
>
> -/sys/kernel/config/pci_ep/functions/
> - .. <EPF Driver1>/
> - ... <EPF Device 11>/
> - ... <EPF Device 21>/
> - .. <EPF Driver2>/
> - ... <EPF Device 12>/
> - ... <EPF Device 22>/
> + /sys/kernel/config/pci_ep/functions/
> + .. <EPF Driver1>/
> + ... <EPF Device 11>/
> + ... <EPF Device 21>/
> + .. <EPF Driver2>/
> + ... <EPF Device 12>/
> + ... <EPF Device 22>/
>
> In order to create a <EPF device> of the type probed by <EPF Driver>, the
> user has to create a directory inside <EPF DriverN>.
> @@ -44,34 +54,37 @@ Every <EPF device> directory consists of the following entries that can be
> used to configure the standard configuration header of the endpoint function.
> (These entries are created by the framework when any new <EPF Device> is
> created)
> -
> - .. <EPF Driver1>/
> - ... <EPF Device 11>/
> - ... vendorid
> - ... deviceid
> - ... revid
> - ... progif_code
> - ... subclass_code
> - ... baseclass_code
> - ... cache_line_size
> - ... subsys_vendor_id
> - ... subsys_id
> - ... interrupt_pin
> -
> -*) EPC Device
> +::
> +
> + .. <EPF Driver1>/
> + ... <EPF Device 11>/
> + ... vendorid
> + ... deviceid
> + ... revid
> + ... progif_code
> + ... subclass_code
> + ... baseclass_code
> + ... cache_line_size
> + ... subsys_vendor_id
> + ... subsys_id
> + ... interrupt_pin
> +
> +EPC Device
> +==========
>
> Every registered EPC device will be listed in controllers directory. The
> entries corresponding to EPC device will be created by the EPC core.
> -
> -/sys/kernel/config/pci_ep/controllers/
> - .. <EPC Device1>/
> - ... <Symlink EPF Device11>/
> - ... <Symlink EPF Device12>/
> - ... start
> - .. <EPC Device2>/
> - ... <Symlink EPF Device21>/
> - ... <Symlink EPF Device22>/
> - ... start
> +::
> +
> + /sys/kernel/config/pci_ep/controllers/
> + .. <EPC Device1>/
> + ... <Symlink EPF Device11>/
> + ... <Symlink EPF Device12>/
> + ... start
> + .. <EPC Device2>/
> + ... <Symlink EPF Device21>/
> + ... <Symlink EPF Device22>/
> + ... start
>
> The <EPC Device> directory will have a list of symbolic links to
> <EPF Device>. These symbolic links should be created by the user to
> @@ -81,7 +94,7 @@ The <EPC Device> directory will also have a *start* field. Once
> "1" is written to this field, the endpoint device will be ready to
> establish the link with the host. This is usually done after
> all the EPF devices are created and linked with the EPC device.
> -
> +::
>
> | controllers/
> | <Directory: EPC name>/
> @@ -102,4 +115,4 @@ all the EPF devices are created and linked with the EPC device.
> | interrupt_pin
> | function
>
> -[1] -> Documentation/PCI/endpoint/pci-endpoint.txt
> +[1] :doc:`pci-endpoint`

Thanks,
Mauro

2019-04-24 22:07:12

by Changbin Du

Subject: Re: [PATCH v4 09/63] Documentation: ACPI: move method-customizing.txt to firmware-guide/acpi and convert to reST

On Tue, Apr 23, 2019 at 06:03:16PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:38 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > adds it to the Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/acpi/method-customizing.txt | 73 -----------------
> > Documentation/firmware-guide/acpi/index.rst | 3 +-
> > .../acpi/method-customizing.rst | 82 +++++++++++++++++++
> > 3 files changed, 84 insertions(+), 74 deletions(-)
> > delete mode 100644 Documentation/acpi/method-customizing.txt
> > create mode 100644 Documentation/firmware-guide/acpi/method-customizing.rst
> >
> > diff --git a/Documentation/acpi/method-customizing.txt b/Documentation/acpi/method-customizing.txt
> > deleted file mode 100644
> > index 7235da975f23..000000000000
> > --- a/Documentation/acpi/method-customizing.txt
> > +++ /dev/null
> > @@ -1,73 +0,0 @@
> > -Linux ACPI Custom Control Method How To
> > -=======================================
> > -
> > -Written by Zhang Rui <[email protected]>
> > -
> > -
> > -Linux supports customizing ACPI control methods at runtime.
> > -
> > -Users can use this to
> > -1. override an existing method which may not work correctly,
> > - or just for debugging purposes.
> > -2. insert a completely new method in order to create a missing
> > - method such as _OFF, _ON, _STA, _INI, etc.
> > -For these cases, it is far simpler to dynamically install a single
> > -control method rather than override the entire DSDT, because kernel
> > -rebuild/reboot is not needed and test result can be got in minutes.
> > -
> > -Note: Only ACPI METHOD can be overridden, any other object types like
> > - "Device", "OperationRegion", are not recognized. Methods
> > - declared inside scope operators are also not supported.
> > -Note: The same ACPI control method can be overridden for many times,
> > - and it's always the latest one that used by Linux/kernel.
> > -Note: To get the ACPI debug object output (Store (AAAA, Debug)),
> > - please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
> > -
> > -1. override an existing method
> > - a) get the ACPI table via ACPI sysfs I/F. e.g. to get the DSDT,
> > - just run "cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.dat"
> > - b) disassemble the table by running "iasl -d dsdt.dat".
> > - c) rewrite the ASL code of the method and save it in a new file,
> > - d) package the new file (psr.asl) to an ACPI table format.
> > - Here is an example of a customized \_SB._AC._PSR method,
> > -
> > - DefinitionBlock ("", "SSDT", 1, "", "", 0x20080715)
> > - {
> > - Method (\_SB_.AC._PSR, 0, NotSerialized)
> > - {
> > - Store ("In AC _PSR", Debug)
> > - Return (ACON)
> > - }
> > - }
> > - Note that the full pathname of the method in ACPI namespace
> > - should be used.
> > - e) assemble the file to generate the AML code of the method.
> > - e.g. "iasl -vw 6084 psr.asl" (psr.aml is generated as a result)
> > - If parameter "-vw 6084" is not supported by your iASL compiler,
> > - please try a newer version.
> > - f) mount debugfs by "mount -t debugfs none /sys/kernel/debug"
> > - g) override the old method via the debugfs by running
> > - "cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method"
> > -
> > -2. insert a new method
> > - This is easier than overriding an existing method.
> > - We just need to create the ASL code of the method we want to
> > - insert and then follow the step c) ~ g) in section 1.
> > -
> > -3. undo your changes
> > - The "undo" operation is not supported for a new inserted method
> > - right now, i.e. we can not remove a method currently.
> > - For an overridden method, in order to undo your changes, please
> > - save a copy of the method original ASL code in step c) section 1,
> > - and redo step c) ~ g) to override the method with the original one.
> > -
> > -
> > -Note: We can use a kernel with multiple custom ACPI method running,
> > - But each individual write to debugfs can implement a SINGLE
> > - method override. i.e. if we want to insert/override multiple
> > - ACPI methods, we need to redo step c) ~ g) for multiple times.
> > -
> > -Note: Be aware that root can mis-use this driver to modify arbitrary
> > - memory and gain additional rights, if root's privileges got
> > - restricted (for example if root is not allowed to load additional
> > - modules after boot).
> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index 61d67763851b..d1d069b26bbc 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -10,5 +10,6 @@ ACPI Support
> > namespace
> > enumeration
> > osi
> > + method-customizing
> > DSD-properties-rules
> > - gpio-properties
> > + gpio-properties
> > \ No newline at end of file
> > diff --git a/Documentation/firmware-guide/acpi/method-customizing.rst b/Documentation/firmware-guide/acpi/method-customizing.rst
> > new file mode 100644
> > index 000000000000..32eb1cdc1549
> > --- /dev/null
> > +++ b/Documentation/firmware-guide/acpi/method-customizing.rst
> > @@ -0,0 +1,82 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +=======================================
> > +Linux ACPI Custom Control Method How To
> > +=======================================
> > +
> > +:Author: Zhang Rui <[email protected]>
> > +
> > +
> > +Linux supports customizing ACPI control methods at runtime.
> > +
> > +Users can use this to:
> > +
> > +1. override an existing method which may not work correctly,
> > + or just for debugging purposes.
> > +2. insert a completely new method in order to create a missing
> > + method such as _OFF, _ON, _STA, _INI, etc.
> > +
> > +For these cases, it is far simpler to dynamically install a single
> > +control method rather than override the entire DSDT, because kernel
> > +rebuild/reboot is not needed and test result can be got in minutes.
> > +
> > +.. note:: Only ACPI METHOD can be overridden, any other object types like
> > + "Device", "OperationRegion", are not recognized. Methods
> > + declared inside scope operators are also not supported.
> > +.. note:: The same ACPI control method can be overridden for many times,
> > + and it's always the latest one that used by Linux/kernel.
> > +.. note:: To get the ACPI debug object output (Store (AAAA, Debug)),
> > + please run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
>
> Hmm... this may work (not sure if Sphinx would warn or not), but it
> is visually bad on text mode. I would code it, instead, with something
> like:
>
> .. note::
>
> - Only ACPI METHOD can be overridden, any other object types like
> "Device", "OperationRegion", are not recognized. Methods
> declared inside scope operators are also not supported.
>
> - The same ACPI control method can be overridden for many times,
> and it's always the latest one that used by Linux/kernel.
>
> - To get the ACPI debug object output (Store (AAAA, Debug)),
> please run::
>
> echo 1 > /sys/module/acpi/parameters/aml_debug_output
>
> As this would make it visually better on both text and html formats.
>
No warnings were given. Your suggested style is better, so I applied it. Thanks!

> > +
> > +1. override an existing method
> > +==============================
> > +a) get the ACPI table via ACPI sysfs I/F. e.g. to get the DSDT,
> > + just run "cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt.dat"
> > +b) disassemble the table by running "iasl -d dsdt.dat".
> > +c) rewrite the ASL code of the method and save it in a new file,
> > +d) package the new file (psr.asl) to an ACPI table format.
> > + Here is an example of a customized \_SB._AC._PSR method::
> > +
> > + DefinitionBlock ("", "SSDT", 1, "", "", 0x20080715)
> > + {
> > + Method (\_SB_.AC._PSR, 0, NotSerialized)
> > + {
> > + Store ("In AC _PSR", Debug)
> > + Return (ACON)
> > + }
> > + }
> > +
> > + Note that the full pathname of the method in ACPI namespace
> > + should be used.
> > +e) assemble the file to generate the AML code of the method.
> > + e.g. "iasl -vw 6084 psr.asl" (psr.aml is generated as a result)
> > + If parameter "-vw 6084" is not supported by your iASL compiler,
> > + please try a newer version.
>
> I would use ``iasl -vw 6084 psr.asl`` and ``-vw 6084``.
>
> > +f) mount debugfs by "mount -t debugfs none /sys/kernel/debug"
>
> I would do:
>
> f) mount debugfs by running::
>
> mount -t debugfs none /sys/kernel/debug
>
> As it makes a better html document. I believe that the focus here is
> sysadmins. Doing the above makes it easier for them to cut and paste
> commands.
>
> > +g) override the old method via the debugfs by running
> > + "cat /tmp/psr.aml > /sys/kernel/debug/acpi/custom_method"
>
> Same applies here: I would also place the "cat" command on a literal
> block.
>
> > +
> > +2. insert a new method
> > +======================
> > +This is easier than overriding an existing method.
> > +We just need to create the ASL code of the method we want to
> > +insert and then follow the step c) ~ g) in section 1.
> > +
> > +3. undo your changes
> > +====================
> > +The "undo" operation is not supported for a new inserted method
> > +right now, i.e. we can not remove a method currently.
> > +For an overridden method, in order to undo your changes, please
> > +save a copy of the method original ASL code in step c) section 1,
> > +and redo step c) ~ g) to override the method with the original one.
> > +
> > +
> > +.. note:: We can use a kernel with multiple custom ACPI method running,
> > + But each individual write to debugfs can implement a SINGLE
> > + method override. i.e. if we want to insert/override multiple
> > + ACPI methods, we need to redo step c) ~ g) for multiple times.
> > +
> > +.. note:: Be aware that root can mis-use this driver to modify arbitrary
> > + memory and gain additional rights, if root's privileges got
> > + restricted (for example if root is not allowed to load additional
> > + modules after boot).
>
> Same comment as above: IMHO, having a single note block with the two
> notes would be better.
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 22:16:05

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 35/63] Documentation: PCI: convert endpoint/pci-test-function.txt to reST

Em Wed, 24 Apr 2019 00:29:04 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> Documentation/PCI/endpoint/index.rst | 1 +
> ...est-function.txt => pci-test-function.rst} | 32 +++++++++++--------
> 2 files changed, 20 insertions(+), 13 deletions(-)
> rename Documentation/PCI/endpoint/{pci-test-function.txt => pci-test-function.rst} (84%)
>
> diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
> index 3951de9f923c..b680a3fc4fec 100644
> --- a/Documentation/PCI/endpoint/index.rst
> +++ b/Documentation/PCI/endpoint/index.rst
> @@ -9,3 +9,4 @@ PCI Endpoint Framework
>
> pci-endpoint
> pci-endpoint-cfs
> + pci-test-function
> diff --git a/Documentation/PCI/endpoint/pci-test-function.txt b/Documentation/PCI/endpoint/pci-test-function.rst
> similarity index 84%
> rename from Documentation/PCI/endpoint/pci-test-function.txt
> rename to Documentation/PCI/endpoint/pci-test-function.rst
> index 5916f1f592bb..ba02cddcec37 100644
> --- a/Documentation/PCI/endpoint/pci-test-function.txt
> +++ b/Documentation/PCI/endpoint/pci-test-function.rst
> @@ -1,5 +1,10 @@
> - PCI TEST
> - Kishon Vijay Abraham I <[email protected]>
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=================
> +PCI Test Function
> +=================
> +
> +:Author: Kishon Vijay Abraham I <[email protected]>
>
> Traditionally PCI RC has always been validated by using standard
> PCI cards like ethernet PCI cards or USB PCI cards or SATA PCI cards.
> @@ -23,30 +28,31 @@ The PCI endpoint test device has the following registers:
> 8) PCI_ENDPOINT_TEST_IRQ_TYPE
> 9) PCI_ENDPOINT_TEST_IRQ_NUMBER
>
> -*) PCI_ENDPOINT_TEST_MAGIC
> +* PCI_ENDPOINT_TEST_MAGIC

Same comment as on a previous patch. I suspect that the author's intention
for everything under Documentation/PCI/endpoint/ (or perhaps this is due
to the markup language he uses) is to have:

*) foo

as a chapter, i.e. the right conversion would instead be:

PCI_ENDPOINT_TEST_MAGIC
=======================

(The same applies to the other similar markup here and in other files under
the endpoint/ directory.)

>
> This register will be used to test BAR0. A known pattern will be written
> and read back from MAGIC register to verify BAR0.
>
> -*) PCI_ENDPOINT_TEST_COMMAND:
> +* PCI_ENDPOINT_TEST_COMMAND:
>
> This register will be used by the host driver to indicate the function
> that the endpoint device must perform.
>
> -Bitfield Description:
> +Bitfield Description::
> +
> Bit 0 : raise legacy IRQ
> Bit 1 : raise MSI IRQ
> Bit 2 : raise MSI-X IRQ
> Bit 3 : read command (read data from RC buffer)
> Bit 4 : write command (write data to RC buffer)
> - Bit 5 : copy command (copy data from one RC buffer to another
> - RC buffer)
> + Bit 5 : copy command (copy data from one RC buffer to another RC buffer)

Please use a table instead:

Bitfield Description:

===== =======================================================
Bit 0 raise legacy IRQ
Bit 1 raise MSI IRQ
Bit 2 raise MSI-X IRQ
Bit 3 read command (read data from RC buffer)
Bit 4 write command (write data to RC buffer)
Bit 5 copy command (copy data from one RC buffer to another
      RC buffer)
===== =======================================================
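
As an illustration only (this is not part of the patch), the bit assignments in the table above could be modelled on the host side as a flag type. This is a minimal Python sketch: the names `EndpointCommand` and `describe_command` are hypothetical, and only the bit positions are taken from the documentation.

```python
from enum import IntFlag


class EndpointCommand(IntFlag):
    """Bits of PCI_ENDPOINT_TEST_COMMAND as listed in the table.

    The member names are hypothetical; only the bit positions come
    from the documentation.
    """
    RAISE_LEGACY_IRQ = 1 << 0
    RAISE_MSI_IRQ = 1 << 1
    RAISE_MSIX_IRQ = 1 << 2
    READ = 1 << 3
    WRITE = 1 << 4
    COPY = 1 << 5


def describe_command(value):
    """Return the names of the command bits set in a raw register value."""
    return [flag.name for flag in EndpointCommand if value & flag]
```

For example, `describe_command(0x0A)` reports bits 1 and 3, i.e. the MSI IRQ and read commands.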



>
> -*) PCI_ENDPOINT_TEST_STATUS
> +* PCI_ENDPOINT_TEST_STATUS
>
> This register reflects the status of the PCI endpoint device.
>
> -Bitfield Description:
> +Bitfield Description::
> +
> Bit 0 : read success
> Bit 1 : read fail
> Bit 2 : write success
> @@ -57,17 +63,17 @@ Bitfield Description:
> Bit 7 : source address is invalid
> Bit 8 : destination address is invalid

Same here:

Bitfield Description:

===== ==============================
Bit 0 read success
Bit 1 read fail
Bit 2 write success
Bit 3 write fail
Bit 4 copy success
Bit 5 copy fail
Bit 6 IRQ raised
Bit 7 source address is invalid
Bit 8 destination address is invalid
===== ==============================


>
> -*) PCI_ENDPOINT_TEST_SRC_ADDR
> +* PCI_ENDPOINT_TEST_SRC_ADDR
>
> This register contains the source address (RC buffer address) for the
> COPY/READ command.
>
> -*) PCI_ENDPOINT_TEST_DST_ADDR
> +* PCI_ENDPOINT_TEST_DST_ADDR
>
> This register contains the destination address (RC buffer address) for
> the COPY/WRITE command.
>
> -*) PCI_ENDPOINT_TEST_IRQ_TYPE
> +* PCI_ENDPOINT_TEST_IRQ_TYPE
>
> This register contains the interrupt type (Legacy/MSI) triggered
> for the READ/WRITE/COPY and raise IRQ (Legacy/MSI) commands.
> @@ -77,7 +83,7 @@ Possible types:

You need a blank line before - MSI, in order to not use a bold font for
"Possible types:".

> - MSI : 1
> - MSI-X : 2
>
> -*) PCI_ENDPOINT_TEST_IRQ_NUMBER
> +* PCI_ENDPOINT_TEST_IRQ_NUMBER
>
> This register contains the triggered ID interrupt.
>

Same here: you need a blank line in this text:

Admissible values:
+
- Legacy : 0
- MSI : [1 .. 32]
- MSI-X : [1 .. 2048]


In order to avoid using bold font for "Admissible values".
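
As a side note, and purely as an illustration, the admissible values listed above amount to a per-type range check. A minimal Python sketch follows; the names `IRQ_NUMBER_RANGES` and `irq_number_is_admissible` are hypothetical, and the string keys are used here only to label the three interrupt types, not as the register encoding.

```python
# Admissible PCI_ENDPOINT_TEST_IRQ_NUMBER values per interrupt type,
# as listed in the document: Legacy 0, MSI 1..32, MSI-X 1..2048.
# The dictionary keys are illustrative labels, not register values.
IRQ_NUMBER_RANGES = {
    "legacy": (0, 0),
    "msi": (1, 32),
    "msi-x": (1, 2048),
}


def irq_number_is_admissible(irq_type, irq_number):
    """Return True if irq_number lies in the documented range for irq_type."""
    low, high = IRQ_NUMBER_RANGES[irq_type]
    return low <= irq_number <= high
```

With this, `irq_number_is_admissible("msi", 33)` is rejected, while the same number is accepted for MSI-X.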


Thanks,
Mauro

2019-04-24 22:39:00

by Changbin Du

Subject: Re: [PATCH v4 17/63] Documentation: ACPI: move method-tracing.txt to firmware-guide/acpi and convert to rsST

On Wed, Apr 24, 2019 at 11:26:38AM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:46 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > add it to Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/acpi/method-tracing.txt | 192 ---------------
> > Documentation/firmware-guide/acpi/index.rst | 1 +
> > .../firmware-guide/acpi/method-tracing.rst | 225 ++++++++++++++++++
> > 3 files changed, 226 insertions(+), 192 deletions(-)
> > delete mode 100644 Documentation/acpi/method-tracing.txt
> > create mode 100644 Documentation/firmware-guide/acpi/method-tracing.rst
> >
> > diff --git a/Documentation/acpi/method-tracing.txt b/Documentation/acpi/method-tracing.txt
> > deleted file mode 100644
> > index 0aba14c8f459..000000000000
> > --- a/Documentation/acpi/method-tracing.txt
> > +++ /dev/null
> > @@ -1,192 +0,0 @@
> > -ACPICA Trace Facility
> > -
> > -Copyright (C) 2015, Intel Corporation
> > -Author: Lv Zheng <[email protected]>
> > -
> > -
> > -Abstract:
> > -
> > -This document describes the functions and the interfaces of the method
> > -tracing facility.
> > -
> > -1. Functionalities and usage examples:
> > -
> > - ACPICA provides method tracing capability. And two functions are
> > - currently implemented using this capability.
> > -
> > - A. Log reducer
> > - ACPICA subsystem provides debugging outputs when CONFIG_ACPI_DEBUG is
> > - enabled. The debugging messages which are deployed via
> > - ACPI_DEBUG_PRINT() macro can be reduced at 2 levels - per-component
> > - level (known as debug layer, configured via
> > - /sys/module/acpi/parameters/debug_layer) and per-type level (known as
> > - debug level, configured via /sys/module/acpi/parameters/debug_level).
> > -
> > - But when the particular layer/level is applied to the control method
> > - evaluations, the quantity of the debugging outputs may still be too
> > - large to be put into the kernel log buffer. The idea thus is worked out
> > - to only enable the particular debug layer/level (normally more detailed)
> > - logs when the control method evaluation is started, and disable the
> > - detailed logging when the control method evaluation is stopped.
> > -
> > - The following command examples illustrate the usage of the "log reducer"
> > - functionality:
> > - a. Filter out the debug layer/level matched logs when control methods
> > - are being evaluated:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0xXXXXXXXX" > trace_debug_layer
> > - # echo "0xYYYYYYYY" > trace_debug_level
> > - # echo "enable" > trace_state
> > - b. Filter out the debug layer/level matched logs when the specified
> > - control method is being evaluated:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0xXXXXXXXX" > trace_debug_layer
> > - # echo "0xYYYYYYYY" > trace_debug_level
> > - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > - # echo "method" > /sys/module/acpi/parameters/trace_state
> > - c. Filter out the debug layer/level matched logs when the specified
> > - control method is being evaluated for the first time:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0xXXXXXXXX" > trace_debug_layer
> > - # echo "0xYYYYYYYY" > trace_debug_level
> > - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > - # echo "method-once" > /sys/module/acpi/parameters/trace_state
> > - Where:
> > - 0xXXXXXXXX/0xYYYYYYYY: Refer to Documentation/acpi/debug.txt for
> > - possible debug layer/level masking values.
> > - \PPPP.AAAA.TTTT.HHHH: Full path of a control method that can be found
> > - in the ACPI namespace. It needn't be an entry
> > - of a control method evaluation.
> > -
> > - B. AML tracer
> > -
> > - There are special log entries added by the method tracing facility at
> > - the "trace points" the AML interpreter starts/stops to execute a control
> > - method, or an AML opcode. Note that the format of the log entries are
> > - subject to change:
> > - [ 0.186427] exdebug-0398 ex_trace_point : Method Begin [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> > - [ 0.186630] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905c88:If] execution.
> > - [ 0.186820] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:LEqual] execution.
> > - [ 0.187010] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905a20:-NamePath-] execution.
> > - [ 0.187214] exdebug-0398 ex_trace_point : Opcode End [0xf5905a20:-NamePath-] execution.
> > - [ 0.187407] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> > - [ 0.187594] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> > - [ 0.187789] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:LEqual] execution.
> > - [ 0.187980] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:Return] execution.
> > - [ 0.188146] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> > - [ 0.188334] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> > - [ 0.188524] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:Return] execution.
> > - [ 0.188712] exdebug-0398 ex_trace_point : Opcode End [0xf5905c88:If] execution.
> > - [ 0.188903] exdebug-0398 ex_trace_point : Method End [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> > -
> > - Developers can utilize these special log entries to track the AML
> > - interpretion, thus can aid issue debugging and performance tuning. Note
> > - that, as the "AML tracer" logs are implemented via ACPI_DEBUG_PRINT()
> > - macro, CONFIG_ACPI_DEBUG is also required to be enabled for enabling
> > - "AML tracer" logs.
> > -
> > - The following command examples illustrate the usage of the "AML tracer"
> > - functionality:
> > - a. Filter out the method start/stop "AML tracer" logs when control
> > - methods are being evaluated:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0x80" > trace_debug_layer
> > - # echo "0x10" > trace_debug_level
> > - # echo "enable" > trace_state
> > - b. Filter out the method start/stop "AML tracer" when the specified
> > - control method is being evaluated:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0x80" > trace_debug_layer
> > - # echo "0x10" > trace_debug_level
> > - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > - # echo "method" > trace_state
> > - c. Filter out the method start/stop "AML tracer" logs when the specified
> > - control method is being evaluated for the first time:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0x80" > trace_debug_layer
> > - # echo "0x10" > trace_debug_level
> > - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > - # echo "method-once" > trace_state
> > - d. Filter out the method/opcode start/stop "AML tracer" when the
> > - specified control method is being evaluated:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0x80" > trace_debug_layer
> > - # echo "0x10" > trace_debug_level
> > - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > - # echo "opcode" > trace_state
> > - e. Filter out the method/opcode start/stop "AML tracer" when the
> > - specified control method is being evaluated for the first time:
> > - # cd /sys/module/acpi/parameters
> > - # echo "0x80" > trace_debug_layer
> > - # echo "0x10" > trace_debug_level
> > - # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > - # echo "opcode-opcode" > trace_state
> > -
> > - Note that all above method tracing facility related module parameters can
> > - be used as the boot parameters, for example:
> > - acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 \
> > - acpi.trace_method_name=\_SB.LID0._LID acpi.trace_state=opcode-once
> > -
> > -2. Interface descriptions:
> > -
> > - All method tracing functions can be configured via ACPI module
> > - parameters that are accessible at /sys/module/acpi/parameters/:
> > -
> > - trace_method_name
> > - The full path of the AML method that the user wants to trace.
> > - Note that the full path shouldn't contain the trailing "_"s in its
> > - name segments but may contain "\" to form an absolute path.
> > -
> > - trace_debug_layer
> > - The temporary debug_layer used when the tracing feature is enabled.
> > - Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
> > - used to match all "AML tracer" logs.
> > -
> > - trace_debug_level
> > - The temporary debug_level used when the tracing feature is enabled.
> > - Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
> > - debug_level used to match all "AML tracer" logs.
> > -
> > - trace_state
> > - The status of the tracing feature.
> > - Users can enable/disable this debug tracing feature by executing
> > - the following command:
> > - # echo string > /sys/module/acpi/parameters/trace_state
> > - Where "string" should be one of the following:
> > - "disable"
> > - Disable the method tracing feature.
> > - "enable"
> > - Enable the method tracing feature.
> > - ACPICA debugging messages matching
> > - "trace_debug_layer/trace_debug_level" during any method
> > - execution will be logged.
> > - "method"
> > - Enable the method tracing feature.
> > - ACPICA debugging messages matching
> > - "trace_debug_layer/trace_debug_level" during method execution
> > - of "trace_method_name" will be logged.
> > - "method-once"
> > - Enable the method tracing feature.
> > - ACPICA debugging messages matching
> > - "trace_debug_layer/trace_debug_level" during method execution
> > - of "trace_method_name" will be logged only once.
> > - "opcode"
> > - Enable the method tracing feature.
> > - ACPICA debugging messages matching
> > - "trace_debug_layer/trace_debug_level" during method/opcode
> > - execution of "trace_method_name" will be logged.
> > - "opcode-once"
> > - Enable the method tracing feature.
> > - ACPICA debugging messages matching
> > - "trace_debug_layer/trace_debug_level" during method/opcode
> > - execution of "trace_method_name" will be logged only once.
> > - Note that, the difference between the "enable" and other feature
> > - enabling options are:
> > - 1. When "enable" is specified, since
> > - "trace_debug_layer/trace_debug_level" shall apply to all control
> > - method evaluations, after configuring "trace_state" to "enable",
> > - "trace_method_name" will be reset to NULL.
> > - 2. When "method/opcode" is specified, if
> > - "trace_method_name" is NULL when "trace_state" is configured to
> > - these options, the "trace_debug_layer/trace_debug_level" will
> > - apply to all control method evaluations.
> > diff --git a/Documentation/firmware-guide/acpi/index.rst b/Documentation/firmware-guide/acpi/index.rst
> > index a45fea11f998..287a7cbd82ac 100644
> > --- a/Documentation/firmware-guide/acpi/index.rst
> > +++ b/Documentation/firmware-guide/acpi/index.rst
> > @@ -13,6 +13,7 @@ ACPI Support
> > enumeration
> > osi
> > method-customizing
> > + method-tracing
> > DSD-properties-rules
> > debug
> > gpio-properties
> > diff --git a/Documentation/firmware-guide/acpi/method-tracing.rst b/Documentation/firmware-guide/acpi/method-tracing.rst
> > new file mode 100644
> > index 000000000000..7a997ba168d7
> > --- /dev/null
> > +++ b/Documentation/firmware-guide/acpi/method-tracing.rst
> > @@ -0,0 +1,225 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. include:: <isonum.txt>
> > +
> > +=====================
> > +ACPICA Trace Facility
> > +=====================
> > +
> > +:Copyright: |copy| 2015, Intel Corporation
> > +:Author: Lv Zheng <[email protected]>
> > +
> > +
> > +:Abstract: This document describes the functions and the interfaces of the
> > + method tracing facility.
>
> Same comment as on other patches.
>
Fixed, thanks.

> > +
> > +1. Functionalities and usage examples
> > +=====================================
> > +
> > +ACPICA provides method tracing capability. And two functions are
> > +currently implemented using this capability.
> > +
> > +Log reducer
> > +--------------
> > +
> > +ACPICA subsystem provides debugging outputs when CONFIG_ACPI_DEBUG is
> > +enabled. The debugging messages which are deployed via
> > +ACPI_DEBUG_PRINT() macro can be reduced at 2 levels - per-component
> > +level (known as debug layer, configured via
> > +/sys/module/acpi/parameters/debug_layer) and per-type level (known as
> > +debug level, configured via /sys/module/acpi/parameters/debug_level).
> > +
> > +But when the particular layer/level is applied to the control method
> > +evaluations, the quantity of the debugging outputs may still be too
> > +large to be put into the kernel log buffer. The idea thus is worked out
> > +to only enable the particular debug layer/level (normally more detailed)
> > +logs when the control method evaluation is started, and disable the
> > +detailed logging when the control method evaluation is stopped.
> > +
> > +The following command examples illustrate the usage of the "log reducer"
> > +functionality:
> > +
> > +a. Filter out the debug layer/level matched logs when control methods
> > + are being evaluated::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0xXXXXXXXX" > trace_debug_layer
> > + # echo "0xYYYYYYYY" > trace_debug_level
> > + # echo "enable" > trace_state
> > +
> > +b. Filter out the debug layer/level matched logs when the specified
> > + control method is being evaluated::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0xXXXXXXXX" > trace_debug_layer
> > + # echo "0xYYYYYYYY" > trace_debug_level
> > + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > + # echo "method" > /sys/module/acpi/parameters/trace_state
> > +
> > +c. Filter out the debug layer/level matched logs when the specified
> > + control method is being evaluated for the first time::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0xXXXXXXXX" > trace_debug_layer
> > + # echo "0xYYYYYYYY" > trace_debug_level
> > + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > + # echo "method-once" > /sys/module/acpi/parameters/trace_state
> > +
> > +Where:
> > + 0xXXXXXXXX/0xYYYYYYYY
> > + Refer to Documentation/acpi/debug.txt for possible debug layer/level
> > + masking values.
> > + \PPPP.AAAA.TTTT.HHHH
> > + Full path of a control method that can be found in the ACPI namespace.
> > + It needn't be an entry of a control method evaluation.
> > +
> > +AML tracer
> > +-------------
>
> The markup is bigger than the line. You should have seen a Sphinx
> warning here.
>
> > +
> > +There are special log entries added by the method tracing facility at
> > +the "trace points" the AML interpreter starts/stops to execute a control
> > +method, or an AML opcode. Note that the format of the log entries are
> > +subject to change::
> > +
> > + [ 0.186427] exdebug-0398 ex_trace_point : Method Begin [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> > + [ 0.186630] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905c88:If] execution.
> > + [ 0.186820] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:LEqual] execution.
> > + [ 0.187010] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905a20:-NamePath-] execution.
> > + [ 0.187214] exdebug-0398 ex_trace_point : Opcode End [0xf5905a20:-NamePath-] execution.
> > + [ 0.187407] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> > + [ 0.187594] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> > + [ 0.187789] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:LEqual] execution.
> > + [ 0.187980] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905cc0:Return] execution.
> > + [ 0.188146] exdebug-0398 ex_trace_point : Opcode Begin [0xf5905f60:One] execution.
> > + [ 0.188334] exdebug-0398 ex_trace_point : Opcode End [0xf5905f60:One] execution.
> > + [ 0.188524] exdebug-0398 ex_trace_point : Opcode End [0xf5905cc0:Return] execution.
> > + [ 0.188712] exdebug-0398 ex_trace_point : Opcode End [0xf5905c88:If] execution.
> > + [ 0.188903] exdebug-0398 ex_trace_point : Method End [0xf58394d8:\_SB.PCI0.LPCB.ECOK] execution.
> > +
> > +Developers can utilize these special log entries to track the AML
> > +interpretion, thus can aid issue debugging and performance tuning. Note
> > +that, as the "AML tracer" logs are implemented via ACPI_DEBUG_PRINT()
> > +macro, CONFIG_ACPI_DEBUG is also required to be enabled for enabling
> > +"AML tracer" logs.
> > +
> > +The following command examples illustrate the usage of the "AML tracer"
> > +functionality:
> > +
> > +a. Filter out the method start/stop "AML tracer" logs when control
> > + methods are being evaluated::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0x80" > trace_debug_layer
> > + # echo "0x10" > trace_debug_level
> > + # echo "enable" > trace_state
> > +
> > +b. Filter out the method start/stop "AML tracer" when the specified
> > + control method is being evaluated::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0x80" > trace_debug_layer
> > + # echo "0x10" > trace_debug_level
> > + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > + # echo "method" > trace_state
> > +
> > +c. Filter out the method start/stop "AML tracer" logs when the specified
> > + control method is being evaluated for the first time::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0x80" > trace_debug_layer
> > + # echo "0x10" > trace_debug_level
> > + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > + # echo "method-once" > trace_state
> > +
> > +d. Filter out the method/opcode start/stop "AML tracer" when the
> > + specified control method is being evaluated::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0x80" > trace_debug_layer
> > + # echo "0x10" > trace_debug_level
> > + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > + # echo "opcode" > trace_state
> > +
> > +e. Filter out the method/opcode start/stop "AML tracer" when the
> > + specified control method is being evaluated for the first time::
> > +
> > + # cd /sys/module/acpi/parameters
> > + # echo "0x80" > trace_debug_layer
> > + # echo "0x10" > trace_debug_level
> > + # echo "\PPPP.AAAA.TTTT.HHHH" > trace_method_name
> > + # echo "opcode-opcode" > trace_state
> > +
> > +Note that all above method tracing facility related module parameters can
> > +be used as the boot parameters, for example::
> > +
> > + acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 \
> > + acpi.trace_method_name=\_SB.LID0._LID acpi.trace_state=opcode-once
> > +
> > +2. Interface descriptions
> > +=========================
> > +
> > +All method tracing functions can be configured via ACPI module
> > +parameters that are accessible at /sys/module/acpi/parameters/:
> > +
> > +trace_method_name
> > +The full path of the AML method that the user wants to trace.
> > +Note that the full path shouldn't contain the trailing "_"s in its
> > +name segments but may contain "\" to form an absolute path.
> > +
>
>
> > +trace_debug_layer
> > +The temporary debug_layer used when the tracing feature is enabled.
> > +Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
> > +used to match all "AML tracer" logs.
> > +
> > +trace_debug_level
> > +The temporary debug_level used when the tracing feature is enabled.
> > +Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
> > +debug_level used to match all "AML tracer" logs.
> > +
> > +trace_state
> > +The status of the tracing feature.
> > +Users can enable/disable this debug tracing feature by executing
> > +the following command::
>
> For the above, please indent, in order to properly change the
> sysfs node font to bold. Also, mark paragraphs with a \n, e. g:
>
> trace_method_name
> The full path of the AML method that the user wants to trace.
>
> Note that the full path shouldn't contain the trailing "_"s in its
> name segments but may contain "\" to form an absolute path.
>
> trace_debug_layer
> The temporary debug_layer used when the tracing feature is enabled.
>
> Using ACPI_EXECUTER (0x80) by default, which is the debug_layer
> used to match all "AML tracer" logs.
>
> trace_debug_level
> The temporary debug_level used when the tracing feature is enabled.
>
> Using ACPI_LV_TRACE_POINT (0x10) by default, which is the
> debug_level used to match all "AML tracer" logs.
>
> trace_state
> The status of the tracing feature.
>
> Users can enable/disable this debug tracing feature by executing
> the following command::
>
Done, thanks.

> After doing such changes:
>
> Reviewed-by: Mauro Carvalho Chehab <[email protected]>
>
>
> > +
> > + # echo string > /sys/module/acpi/parameters/trace_state
> > +
> > +Where "string" should be one of the following:
> > +
> > +"disable"
> > + Disable the method tracing feature.
> > +"enable"
> > + Enable the method tracing feature.
> > + ACPICA debugging messages matching
> > + "trace_debug_layer/trace_debug_level" during any method
> > + execution will be logged.
> > +"method"
> > + Enable the method tracing feature.
> > + ACPICA debugging messages matching
> > + "trace_debug_layer/trace_debug_level" during method execution
> > + of "trace_method_name" will be logged.
> > +"method-once"
> > + Enable the method tracing feature.
> > + ACPICA debugging messages matching
> > + "trace_debug_layer/trace_debug_level" during method execution
> > + of "trace_method_name" will be logged only once.
> > +"opcode"
> > + Enable the method tracing feature.
> > + ACPICA debugging messages matching
> > + "trace_debug_layer/trace_debug_level" during method/opcode
> > + execution of "trace_method_name" will be logged.
> > +"opcode-once"
> > + Enable the method tracing feature.
> > + ACPICA debugging messages matching
> > + "trace_debug_layer/trace_debug_level" during method/opcode
> > + execution of "trace_method_name" will be logged only once.
> > +
> > +Note that, the difference between the "enable" and other feature
> > +enabling options are:
> > +
> > +1. When "enable" is specified, since
> > + "trace_debug_layer/trace_debug_level" shall apply to all control
> > + method evaluations, after configuring "trace_state" to "enable",
> > + "trace_method_name" will be reset to NULL.
> > +2. When "method/opcode" is specified, if
> > + "trace_method_name" is NULL when "trace_state" is configured to
> > + these options, the "trace_debug_layer/trace_debug_level" will
> > + apply to all control method evaluations.
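
For reference, the sysfs sequence described above can be wrapped in a small
helper. This is only a minimal sketch, not part of the patch: the function
name configure_method_trace and its arguments are made up for illustration,
and it assumes the parameters accept the same text that "echo" would write.

```python
from pathlib import Path

# The six values listed for trace_state in the document.
TRACE_STATES = ("disable", "enable", "method", "method-once",
                "opcode", "opcode-once")


def configure_method_trace(state, method_name=None, layer=0x80, level=0x10,
                           params_dir="/sys/module/acpi/parameters"):
    """Write the tracing parameters in the order used by the examples.

    Hypothetical helper: only the parameter file names and the accepted
    trace_state strings come from the document.
    """
    if state not in TRACE_STATES:
        raise ValueError(f"unknown trace_state: {state!r}")
    params = Path(params_dir)
    # Defaults match ACPI_EXECUTER (0x80) and ACPI_LV_TRACE_POINT (0x10).
    (params / "trace_debug_layer").write_text(f"{layer:#x}\n")
    (params / "trace_debug_level").write_text(f"{level:#x}\n")
    if method_name is not None:
        (params / "trace_method_name").write_text(method_name + "\n")
    # trace_state goes last so the other settings are already in place.
    (params / "trace_state").write_text(state + "\n")
```

Calling configure_method_trace("opcode-once", r"\_SB.LID0._LID") as root would
mirror the boot parameter example; passing a different params_dir lets the
helper be tried against a scratch directory first.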
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-24 22:39:34

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 36/63] Documentation: PCI: convert endpoint/pci-test-howto.txt to reST

Em Wed, 24 Apr 2019 00:29:05 +0800
Changbin Du <[email protected]> escreveu:

> This converts the plain text documentation to reStructuredText format and
> add it to Sphinx TOC tree. No essential content change.
>
> Signed-off-by: Changbin Du <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> ---
> Documentation/PCI/endpoint/index.rst | 1 +
> ...{pci-test-howto.txt => pci-test-howto.rst} | 81 +++++++++++++------
> 2 files changed, 56 insertions(+), 26 deletions(-)
> rename Documentation/PCI/endpoint/{pci-test-howto.txt => pci-test-howto.rst} (78%)
>
> diff --git a/Documentation/PCI/endpoint/index.rst b/Documentation/PCI/endpoint/index.rst
> index b680a3fc4fec..d114ea74b444 100644
> --- a/Documentation/PCI/endpoint/index.rst
> +++ b/Documentation/PCI/endpoint/index.rst
> @@ -10,3 +10,4 @@ PCI Endpoint Framework
> pci-endpoint
> pci-endpoint-cfs
> pci-test-function
> + pci-test-howto
> diff --git a/Documentation/PCI/endpoint/pci-test-howto.txt b/Documentation/PCI/endpoint/pci-test-howto.rst
> similarity index 78%
> rename from Documentation/PCI/endpoint/pci-test-howto.txt
> rename to Documentation/PCI/endpoint/pci-test-howto.rst
> index 040479f437a5..909f770a07d6 100644
> --- a/Documentation/PCI/endpoint/pci-test-howto.txt
> +++ b/Documentation/PCI/endpoint/pci-test-howto.rst
> @@ -1,38 +1,51 @@
> - PCI TEST USERGUIDE
> - Kishon Vijay Abraham I <[email protected]>
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===================
> +PCI Test User Guide
> +===================
> +
> +:Author: Kishon Vijay Abraham I <[email protected]>
>
> This document is a guide to help users use pci-epf-test function driver
> and pci_endpoint_test host driver for testing PCI. The list of steps to
> be followed in the host side and EP side is given below.
>
> -1. Endpoint Device
> +Endpoint Device
> +===============
>
> -1.1 Endpoint Controller Devices
> +Endpoint Controller Devices
> +---------------------------
>
> -To find the list of endpoint controller devices in the system:
> +To find the list of endpoint controller devices in the system::
>
> # ls /sys/class/pci_epc/
> 51000000.pcie_ep
>
> -If PCI_ENDPOINT_CONFIGFS is enabled
> +If PCI_ENDPOINT_CONFIGFS is enabled::
> +
> # ls /sys/kernel/config/pci_ep/controllers
> 51000000.pcie_ep
>
> -1.2 Endpoint Function Drivers
>
> -To find the list of endpoint function drivers in the system:
> +Endpoint Function Drivers
> +-------------------------
> +
> +To find the list of endpoint function drivers in the system::
>
> # ls /sys/bus/pci-epf/drivers
> pci_epf_test
>
> -If PCI_ENDPOINT_CONFIGFS is enabled
> +If PCI_ENDPOINT_CONFIGFS is enabled::
> +
> # ls /sys/kernel/config/pci_ep/functions
> pci_epf_test
>
> -1.3 Creating pci-epf-test Device
> +
> +Creating pci-epf-test Device
> +----------------------------
>
> PCI endpoint function device can be created using the configfs. To create
> -pci-epf-test device, the following commands can be used
> +pci-epf-test device, the following commands can be used::
>
> # mount -t configfs none /sys/kernel/config
> # cd /sys/kernel/config/pci_ep/
> @@ -42,7 +55,7 @@ The "mkdir func1" above creates the pci-epf-test function device that will
> be probed by pci_epf_test driver.
>
> The PCI endpoint framework populates the directory with the following
> -configurable fields.
> +configurable fields::
>
> # ls functions/pci_epf_test/func1
> baseclass_code interrupt_pin progif_code subsys_id
> @@ -51,67 +64,83 @@ configurable fields.
>
> The PCI endpoint function driver populates these entries with default values
> when the device is bound to the driver. The pci-epf-test driver populates
> -vendorid with 0xffff and interrupt_pin with 0x0001
> +vendorid with 0xffff and interrupt_pin with 0x0001::
>
> # cat functions/pci_epf_test/func1/vendorid
> 0xffff
> # cat functions/pci_epf_test/func1/interrupt_pin
> 0x0001
>
> -1.4 Configuring pci-epf-test Device
> +
> +Configuring pci-epf-test Device
> +-------------------------------
>
> The user can configure the pci-epf-test device using configfs entry. In order
> to change the vendorid and the number of MSI interrupts used by the function
> -device, the following commands can be used.
> +device, the following commands can be used::
>
> # echo 0x104c > functions/pci_epf_test/func1/vendorid
> # echo 0xb500 > functions/pci_epf_test/func1/deviceid
> # echo 16 > functions/pci_epf_test/func1/msi_interrupts
> # echo 8 > functions/pci_epf_test/func1/msix_interrupts
>
> -1.5 Binding pci-epf-test Device to EP Controller
> +
> +Binding pci-epf-test Device to EP Controller
> +--------------------------------------------
>
> In order for the endpoint function device to be useful, it has to be bound to
> a PCI endpoint controller driver. Use the configfs to bind the function
> -device to one of the controller driver present in the system.
> +device to one of the controller driver present in the system::
>
> # ln -s functions/pci_epf_test/func1 controllers/51000000.pcie_ep/
>
> Once the above step is completed, the PCI endpoint is ready to establish a link
> with the host.
>
> -1.6 Start the Link
> +
> +Start the Link
> +--------------
>
> In order for the endpoint device to establish a link with the host, the _start_
> -field should be populated with '1'.
> +field should be populated with '1'::
>
> # echo 1 > controllers/51000000.pcie_ep/start
>
> -2. RootComplex Device
>
> -2.1 lspci Output
> +RootComplex Device
> +==================
> +
> +lspci Output
> +------------
>
> -Note that the devices listed here correspond to the value populated in 1.4 above
> +Note that the devices listed here correspond to the value populated in 1.4
> +above::
>
> 00:00.0 PCI bridge: Texas Instruments Device 8888 (rev 01)
> 01:00.0 Unassigned class [ff00]: Texas Instruments Device b500
>
> -2.2 Using Endpoint Test function Device
> +
> +Using Endpoint Test function Device
> +-----------------------------------
>
> pcitest.sh added in tools/pci/ can be used to run all the default PCI endpoint
> -tests. To compile this tool the following commands should be used:
> +tests. To compile this tool the following commands should be used::
>
> # cd <kernel-dir>
> # make -C tools/pci
>
> -or if you desire to compile and install in your system:
> +or if you desire to compile and install in your system::
>
> # cd <kernel-dir>
> # make -C tools/pci install
>
> The tool and script will be located in <rootfs>/usr/bin/
>
> -2.2.1 pcitest.sh Output
> +
> +pcitest.sh Output
> +~~~~~~~~~~~~~~~~~
> +::
> +
> # pcitest.sh
> BAR tests
>



Thanks,
Mauro

2019-04-25 05:10:59

by Changbin Du

Subject: Re: [PATCH v4 21/63] Documentation: ACPI: move cppc_sysfs.txt to admin-guide/acpi and convert to reST

On Wed, Apr 24, 2019 at 11:48:44AM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:50 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > add it to Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > .../acpi/cppc_sysfs.rst} | 71 ++++++++++---------
> > Documentation/admin-guide/acpi/index.rst | 1 +
> > 2 files changed, 40 insertions(+), 32 deletions(-)
> > rename Documentation/{acpi/cppc_sysfs.txt => admin-guide/acpi/cppc_sysfs.rst} (51%)
> >
> > diff --git a/Documentation/acpi/cppc_sysfs.txt b/Documentation/admin-guide/acpi/cppc_sysfs.rst
> > similarity index 51%
> > rename from Documentation/acpi/cppc_sysfs.txt
> > rename to Documentation/admin-guide/acpi/cppc_sysfs.rst
> > index f20fb445135d..a4b99afbe331 100644
> > --- a/Documentation/acpi/cppc_sysfs.txt
> > +++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst
> > @@ -1,5 +1,11 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> >
> > - Collaborative Processor Performance Control (CPPC)
> > +==================================================
> > +Collaborative Processor Performance Control (CPPC)
> > +==================================================
> > +
> > +CPPC
> > +====
> >
> > CPPC defined in the ACPI spec describes a mechanism for the OS to manage the
> > performance of a logical processor on a contigious and abstract performance
> > @@ -10,31 +16,28 @@ For more details on CPPC please refer to the ACPI specification at:
> >
> > http://uefi.org/specifications
> >
> > -Some of the CPPC registers are exposed via sysfs under:
> > -
> > -/sys/devices/system/cpu/cpuX/acpi_cppc/
> > -
>
>
> > -for each cpu X
>
> Hmm... removed by mistake?
>
I confirmed that no content was removed.

> > +Some of the CPPC registers are exposed via sysfs under::
> >
> > ---------------------------------------------------------------------------------
> > + /sys/devices/system/cpu/cpuX/acpi_cppc/
>
> Did you parse this with Sphinx? It doesn't sound a valid ReST construction
> to my eyes, as:
>
> 1) I've seen some versions of Sphinx to abort with severe errors when
> there's no blank line after the horizontal bar markup;
>
> 2) It will very likely ignore the "::" (I didn't test it myself), as you're
> not indenting the horizontal bar. End of indentation will mean the end
> of an (empty) literal block.
>
> So, I would stick with:
>
>
> Some of the CPPC registers are exposed via sysfs under:
>
> /sys/devices/system/cpu/cpuX/acpi_cppc/
>
> ---------------------------------------------------------------------------------
>
> for each cpu X::
>
>
> or:
>
> Some of the CPPC registers are exposed via sysfs under:
>
> /sys/devices/system/cpu/cpuX/acpi_cppc/
>
> for each cpu X
>
> --------------------------------------------------------------------------------
>
> ::
>
> (which is closer to the original author's intent)
>
> Same applies to the other similar changes on this document.
>
I didn't see any warnings here and the generated HTML is good, so I think it
is OK.

> >
> > -$ ls -lR /sys/devices/system/cpu/cpu0/acpi_cppc/
> > -/sys/devices/system/cpu/cpu0/acpi_cppc/:
> > -total 0
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 feedback_ctrs
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 highest_perf
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_freq
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_nonlinear_perf
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_perf
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_freq
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_perf
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 reference_perf
> > --r--r--r-- 1 root root 65536 Mar 5 19:38 wraparound_time
> > +for each cpu X::
> >
> > ---------------------------------------------------------------------------------
> > + $ ls -lR /sys/devices/system/cpu/cpu0/acpi_cppc/
> > + /sys/devices/system/cpu/cpu0/acpi_cppc/:
> > + total 0
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 feedback_ctrs
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 highest_perf
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_freq
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_nonlinear_perf
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 lowest_perf
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_freq
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 nominal_perf
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 reference_perf
> > + -r--r--r-- 1 root root 65536 Mar 5 19:38 wraparound_time
> >
> > * highest_perf : Highest performance of this processor (abstract scale).
> > -* nominal_perf : Highest sustained performance of this processor (abstract scale).
> > +* nominal_perf : Highest sustained performance of this processor
> > + (abstract scale).
> > * lowest_nonlinear_perf : Lowest performance of this processor with nonlinear
> > power savings (abstract scale).
> > * lowest_perf : Lowest performance of this processor (abstract scale).
> > @@ -48,22 +51,26 @@ total 0
> > * feedback_ctrs : Includes both Reference and delivered performance counter.
> > Reference counter ticks up proportional to processor's reference performance.
> > Delivered counter ticks up proportional to processor's delivered performance.
> > -* wraparound_time: Minimum time for the feedback counters to wraparound (seconds).
> > +* wraparound_time: Minimum time for the feedback counters to wraparound
> > + (seconds).
> > * reference_perf : Performance level at which reference performance counter
> > accumulates (abstract scale).
> >
> > ---------------------------------------------------------------------------------
> >
> > - Computing Average Delivered Performance
> > +Computing Average Delivered Performance
> > +=======================================
> > +
> > +Below describes the steps to compute the average performance delivered by
> > +taking two different snapshots of feedback counters at time T1 and T2.
> > +
> > + T1: Read feedback_ctrs as fbc_t1
> > + Wait or run some workload
> >
> > -Below describes the steps to compute the average performance delivered by taking
> > -two different snapshots of feedback counters at time T1 and T2.
> > + T2: Read feedback_ctrs as fbc_t2
> >
> > -T1: Read feedback_ctrs as fbc_t1
> > - Wait or run some workload
> > -T2: Read feedback_ctrs as fbc_t2
> > +::
> >
> > -delivered_counter_delta = fbc_t2[del] - fbc_t1[del]
> > -reference_counter_delta = fbc_t2[ref] - fbc_t1[ref]
> > + delivered_counter_delta = fbc_t2[del] - fbc_t1[del]
> > + reference_counter_delta = fbc_t2[ref] - fbc_t1[ref]
> >
> > -delivered_perf = (refernce_perf x delivered_counter_delta) / reference_counter_delta
> > + delivered_perf = (refernce_perf x delivered_counter_delta) / reference_counter_delta
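
As a rough illustration of the computation above (not part of the patch), here
is a minimal Python sketch. It assumes feedback_ctrs reads back as one line of
the form "ref:<count> del:<count>"; that format and both helper names are
assumptions made for this example, and counter wraparound is not handled.

```python
def parse_feedback_ctrs(text):
    """Parse an assumed 'ref:<n> del:<n>' line into a dict of ints."""
    fields = dict(item.split(":", 1) for item in text.split())
    return {"ref": int(fields["ref"]), "del": int(fields["del"])}


def delivered_perf(reference_perf, fbc_t1, fbc_t2):
    """Average delivered performance between snapshots T1 and T2."""
    delivered_counter_delta = fbc_t2["del"] - fbc_t1["del"]
    reference_counter_delta = fbc_t2["ref"] - fbc_t1["ref"]
    if reference_counter_delta == 0:
        raise ValueError("reference counter did not advance between snapshots")
    return reference_perf * delivered_counter_delta / reference_counter_delta
```

The two snapshots would come from reading feedback_ctrs before and after the
workload, with reference_perf read from the same acpi_cppc directory.
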
> > diff --git a/Documentation/admin-guide/acpi/index.rst b/Documentation/admin-guide/acpi/index.rst
> > index d68e9914c5ff..9049a7b9f065 100644
> > --- a/Documentation/admin-guide/acpi/index.rst
> > +++ b/Documentation/admin-guide/acpi/index.rst
> > @@ -10,3 +10,4 @@ the Linux ACPI support.
> >
> > initrd_table_override
> > dsdt-override
> > + cppc_sysfs
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-25 10:51:46

by Mauro Carvalho Chehab

Subject: Re: [PATCH v4 21/63] Documentation: ACPI: move cppc_sysfs.txt to admin-guide/acpi and convert to reST

Em Thu, 25 Apr 2019 01:22:34 +0800
Changbin Du <[email protected]> escreveu:

> On Wed, Apr 24, 2019 at 11:48:44AM -0300, Mauro Carvalho Chehab wrote:
> > Em Wed, 24 Apr 2019 00:28:50 +0800
> > Changbin Du <[email protected]> escreveu:
> >
> > > This converts the plain text documentation to reStructuredText format and
> > > add it to Sphinx TOC tree. No essential content change.
> > >
> > > Signed-off-by: Changbin Du <[email protected]>
> > > ---
> > > .../acpi/cppc_sysfs.rst} | 71 ++++++++++---------
> > > Documentation/admin-guide/acpi/index.rst | 1 +
> > > 2 files changed, 40 insertions(+), 32 deletions(-)
> > > rename Documentation/{acpi/cppc_sysfs.txt => admin-guide/acpi/cppc_sysfs.rst} (51%)
> > >
> > > diff --git a/Documentation/acpi/cppc_sysfs.txt b/Documentation/admin-guide/acpi/cppc_sysfs.rst
> > > similarity index 51%
> > > rename from Documentation/acpi/cppc_sysfs.txt
> > > rename to Documentation/admin-guide/acpi/cppc_sysfs.rst
> > > index f20fb445135d..a4b99afbe331 100644
> > > --- a/Documentation/acpi/cppc_sysfs.txt
> > > +++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst
> > > @@ -1,5 +1,11 @@
> > > +.. SPDX-License-Identifier: GPL-2.0
> > >
> > > - Collaborative Processor Performance Control (CPPC)
> > > +==================================================
> > > +Collaborative Processor Performance Control (CPPC)
> > > +==================================================
> > > +
> > > +CPPC
> > > +====
> > >
> > > CPPC defined in the ACPI spec describes a mechanism for the OS to manage the
> > > performance of a logical processor on a contigious and abstract performance
> > > @@ -10,31 +16,28 @@ For more details on CPPC please refer to the ACPI specification at:
> > >
> > > http://uefi.org/specifications
> > >
> > > -Some of the CPPC registers are exposed via sysfs under:
> > > -
> > > -/sys/devices/system/cpu/cpuX/acpi_cppc/
> > > -
> >
> >
> > > -for each cpu X
> >
> > Hmm... removed by mistake?
> >
> I confirmed that no content was removed.

In this patch, it looks like you removed the line: "for each cpu X"
(or am I reading it wrong?)

>
> > > +Some of the CPPC registers are exposed via sysfs under::
> > >
> > > ---------------------------------------------------------------------------------
> > > + /sys/devices/system/cpu/cpuX/acpi_cppc/
> >
> > Did you parse this with Sphinx? It doesn't sound a valid ReST construction
> > to my eyes, as:
> >
> > 1) I've seen some versions of Sphinx to abort with severe errors when
> > there's no blank line after the horizontal bar markup;
> >
> > 2) It will very likely ignore the "::" (I didn't test it myself), as you're
> > not indenting the horizontal bar. End of indentation will mean the end
> > of an (empty) literal block.
> >
> > So, I would stick with:
> >
> >
> > Some of the CPPC registers are exposed via sysfs under:
> >
> > /sys/devices/system/cpu/cpuX/acpi_cppc/
> >
> > ---------------------------------------------------------------------------------
> >
> > for each cpu X::
> >
> >
> > or:
> >
> > Some of the CPPC registers are exposed via sysfs under:
> >
> > /sys/devices/system/cpu/cpuX/acpi_cppc/
> >
> > for each cpu X
> >
> > --------------------------------------------------------------------------------
> >
> > ::
> >
> > (which is closer to the original author's intent)
> >
> > Same applies to the other similar changes on this document.
> >
> I didn't see any warnings here and the generated HTML is good, so I think it
> is OK.

Basically, what you're doing is:

<rst>

::

foo
literal-block bar

</rst>

(where "foo" is the horizontal bar markup)

I would avoid such a pattern for two reasons:

1) it looks like a violation of ReST syntax to start an
indented paragraph some non-blank lines after a non-indented
line. As such, I wouldn't be surprised if different versions of
Sphinx handled it differently. I'm even tempted to open a bug
against Sphinx asking for a fix, if the latest version accepts
such crazy markup.

2) It is very confusing for any human reading it.
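
A quick way to spot this pattern in a converted file is sketched below. It is
only a heuristic written for this discussion, not something Sphinx or docutils
provides: the function name is made up, directive lines starting with ".." are
skipped, and quoted literal blocks are not recognised.

```python
def find_suspect_literal_blocks(text):
    """Return 1-based line numbers of '::' markers whose next non-blank
    line is not indented further than the marker line itself."""
    lines = text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.endswith("::") or stripped.startswith(".."):
            continue
        # Skip blank lines between the marker and the following content.
        j = i + 1
        while j < len(lines) and not lines[j].strip():
            j += 1
        if j == len(lines):
            hits.append(i + 1)  # marker with nothing after it
            continue
        marker_indent = len(line) - len(line.lstrip())
        next_indent = len(lines[j]) - len(lines[j].lstrip())
        if next_indent <= marker_indent:
            hits.append(i + 1)
    return hits
```

Running it over the text of an .rst file returns the line numbers worth a
second look; an empty list only means this particular pattern was not found.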

Thanks,
Mauro

2019-04-25 16:01:45

by Changbin Du

Subject: Re: [PATCH v4 25/63] Documentation: add Linux PCI to Sphinx TOC tree

On Wed, Apr 24, 2019 at 12:03:43PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:28:54 +0800
> Changbin Du <[email protected]> escreveu:
>
> > Add a index.rst for PCI subsystem. More docs will be added later.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > Acked-by: Bjorn Helgaas <[email protected]>
> > ---
> > Documentation/PCI/index.rst | 9 +++++++++
>
> On a past discussion at docs ML, we've agreed to use lowercase for new
> stuff. My suggestion here would be to use lowercase for "pci".
>
> Also, there's already a pci directory under driver-api, added on this
> commit:
>
> commit fcc78f9c22474d60c65d522e50ea07006ec1b9fc
> Author: Logan Gunthorpe <[email protected]>
> Date: Thu Oct 4 15:27:39 2018 -0600
>
> docs-rst: Add a new directory for PCI documentation
>
> I would just add a new section at Documentation/driver-api/pci/index.rst
> with something like:
>
> Legacy PCI documentation
> ========================
>
> .. note::
>
> The files here were written a long time ago and need some serious
> work. Use their contents with caution.
>
> .. toctree::
> :maxdepth: 1
>
> <files converted from Documentation/PCI>
>
> And add those documents from Documentation/PCI into it.
>
Bjorn, Jonathan,
Do you agree with this? With this change, all of Documentation/PCI/* would be
moved to Documentation/driver-api/pci.

>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-25 17:09:11

by Changbin Du

Subject: Re: [PATCH v4 38/63] Documentation: x86: convert boot.txt to reST

On Wed, Apr 24, 2019 at 02:36:44PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:29:07 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > add it to Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/x86/boot.rst | 1205 +++++++++++++++++++++++++++++++++++
> > Documentation/x86/boot.txt | 1130 --------------------------------
> > Documentation/x86/index.rst | 2 +
> > 3 files changed, 1207 insertions(+), 1130 deletions(-)
> > create mode 100644 Documentation/x86/boot.rst
> > delete mode 100644 Documentation/x86/boot.txt
> >
> > diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
> > new file mode 100644
> > index 000000000000..9f55e832bc47
> > --- /dev/null
> > +++ b/Documentation/x86/boot.rst
> > @@ -0,0 +1,1205 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +===========================
> > +The Linux/x86 Boot Protocol
> > +===========================
> > +
> > +On the x86 platform, the Linux kernel uses a rather complicated boot
> > +convention. This has evolved partially due to historical aspects, as
> > +well as the desire in the early days to have the kernel itself be a
> > +bootable image, the complicated PC memory model and due to changed
> > +expectations in the PC industry caused by the effective demise of
> > +real-mode DOS as a mainstream operating system.
> > +
> > +Currently, the following versions of the Linux/x86 boot protocol exist.
> > +
> > +Old kernels:
> > + zImage/Image support only. Some very early kernels
> > + may not even support a command line.
> > +
> > +Protocol 2.00:
> > + (Kernel 1.3.73) Added bzImage and initrd support, as
> > + well as a formalized way to communicate between the
> > + boot loader and the kernel. setup.S made relocatable,
> > + although the traditional setup area still assumed writable.
> > +
> > +Protocol 2.01:
> > + (Kernel 1.3.76) Added a heap overrun warning.
> > +
> > +Protocol 2.02:
> > + (Kernel 2.4.0-test3-pre3) New command line protocol.
> > + Lower the conventional memory ceiling. No overwrite
> > + of the traditional setup area, thus making booting
> > + safe for systems which use the EBDA from SMM or 32-bit
> > + BIOS entry points. zImage deprecated but still supported.
> > +
> > +Protocol 2.03:
> > + (Kernel 2.4.18-pre1) Explicitly makes the highest possible
> > + initrd address available to the bootloader.
> > +
> > +Protocol 2.04:
> > + (Kernel 2.6.14) Extend the syssize field to four bytes.
> > +
> > +Protocol 2.05:
> > + (Kernel 2.6.20) Make protected mode kernel relocatable.
> > + Introduce relocatable_kernel and kernel_alignment fields.
> > +
> > +Protocol 2.06:
> > + (Kernel 2.6.22) Added a field that contains the size of
> > + the boot command line.
> > +
> > +Protocol 2.07:
> > + (Kernel 2.6.24) Added paravirtualised boot protocol.
> > + Introduced hardware_subarch and hardware_subarch_data
> > + and KEEP_SEGMENTS flag in load_flags.
> > +
> > +Protocol 2.08:
> > + (Kernel 2.6.26) Added crc32 checksum and ELF format
> > + payload. Introduced payload_offset and payload_length
> > + fields to aid in locating the payload.
> > +
> > +Protocol 2.09:
> > + (Kernel 2.6.26) Added a field of 64-bit physical
> > + pointer to single linked list of struct setup_data.
> > +
> > +Protocol 2.10:
> > + (Kernel 2.6.31) Added a protocol for relaxed alignment
> > + beyond the kernel_alignment added, new init_size and
> > + pref_address fields. Added extended boot loader IDs.
> > +
> > +Protocol 2.11:
> > + (Kernel 3.6) Added a field for offset of EFI handover
> > + protocol entry point.
> > +
> > +Protocol 2.12:
> > + (Kernel 3.8) Added the xloadflags field and extension fields
> > + to struct boot_params for loading bzImage and ramdisk
> > + above 4G in 64bit.
>
> This is a side note, but you should really try to avoid replacing too
> many lines, as it makes review a lot harder for no good reason.
>
> For example, this is the way I would convert this changelog table:
>
>
> @@ -10,6 +11,7 @@ real-mode DOS as a mainstream operating system.
>
> Currently, the following versions of the Linux/x86 boot protocol exist.
>
> +=============== ===============================================================
> Old kernels: zImage/Image support only. Some very early kernels
> may not even support a command line.
>
> @@ -64,33 +66,35 @@ Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
> Protocol 2.13: (Kernel 3.14) Support 32- and 64-bit flags being set in
> xloadflags to support booting a 64-bit kernel from 32-bit
> EFI
> +=============== ===============================================================
>
>
> This is simple enough, preserves the original author's intent and
> makes it a lot easier for reviewers to check what you changed.
>
Much better, thanks.

> > +
> > +MEMORY LAYOUT
> > +=============
> > +
> > +The traditional memory map for the kernel loader, used for Image or
> > +zImage kernels, typically looks like::
> > +
> > + | |
> > + 0A0000 +------------------------+
> > + | Reserved for BIOS | Do not use. Reserved for BIOS EBDA.
> > + 09A000 +------------------------+
> > + | Command line |
> > + | Stack/heap | For use by the kernel real-mode code.
> > + 098000 +------------------------+
> > + | Kernel setup | The kernel real-mode code.
> > + 090200 +------------------------+
> > + | Kernel boot sector | The kernel legacy boot sector.
> > + 090000 +------------------------+
> > + | Protected-mode kernel | The bulk of the kernel image.
> > + 010000 +------------------------+
> > + | Boot loader | <- Boot sector entry point 0000:7C00
> > + 001000 +------------------------+
> > + | Reserved for MBR/BIOS |
> > + 000800 +------------------------+
> > + | Typically used by MBR |
> > + 000600 +------------------------+
> > + | BIOS use only |
> > + 000000 +------------------------+
> > +
> > +
>
> I might be wrong, but it seems that you broke the above ascii
> artwork.
>
You are right; fixed.

> > +When using bzImage, the protected-mode kernel was relocated to
> > +0x100000 ("high memory"), and the kernel real-mode block (boot sector,
> > +setup, and stack/heap) was made relocatable to any address between
> > +0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
> > +2.01 the 0x90000+ memory range is still used internally by the kernel;
> > +the 2.02 protocol resolves that problem.
> > +
> > +It is desirable to keep the "memory ceiling" -- the highest point in
> > +low memory touched by the boot loader -- as low as possible, since
> > +some newer BIOSes have begun to allocate some rather large amounts of
> > +memory, called the Extended BIOS Data Area, near the top of low
> > +memory. The boot loader should use the "INT 12h" BIOS call to verify
> > +how much low memory is available.
> > +
> > +Unfortunately, if INT 12h reports that the amount of memory is too
> > +low, there is usually nothing the boot loader can do but to report an
> > +error to the user. The boot loader should therefore be designed to
> > +take up as little space in low memory as it reasonably can. For
> > +zImage or old bzImage kernels, which need data written into the
> > +0x90000 segment, the boot loader should make sure not to use memory
> > +above the 0x9A000 point; too many BIOSes will break above that point.
> > +
> > +For a modern bzImage kernel with boot protocol version >= 2.02, a
> > +memory layout like the following is suggested::
> > +
> > + ~ ~
> > + | Protected-mode kernel |
> > + 100000 +------------------------+
> > + | I/O memory hole |
> > + 0A0000 +------------------------+
> > + | Reserved for BIOS | Leave as much as possible unused
> > + ~ ~
> > + | Command line | (Can also be below the X+10000 mark)
> > + X+10000 +------------------------+
> > + | Stack/heap | For use by the kernel real-mode code.
> > + X+08000 +------------------------+
> > + | Kernel setup | The kernel real-mode code.
> > + | Kernel boot sector | The kernel legacy boot sector.
> > + X +------------------------+
> > + | Boot loader | <- Boot sector entry point 0000:7C00
> > + 001000 +------------------------+
> > + | Reserved for MBR/BIOS |
> > + 000800 +------------------------+
> > + | Typically used by MBR |
> > + 000600 +------------------------+
> > + | BIOS use only |
> > + 000000 +------------------------+
>
>
> Same here: it sounds to me that you mistakenly replaced some tabs
> by spaces.
>
> > +
> > +... where the address X is as low as the design of the boot loader
> > +permits.
>
> That seems to be the legend of the artwork. I would indent it, in
> order to be shown inside the artwork.
>
Agreed.

> > +
> > +
> > +THE REAL-MODE KERNEL HEADER
> > +===========================
> > +
> > +In the following text, and anywhere in the kernel boot sequence, "a
> > +sector" refers to 512 bytes. It is independent of the actual sector
> > +size of the underlying medium.
> > +
> > +The first step in loading a Linux kernel should be to load the
> > +real-mode code (boot sector and setup code) and then examine the
> > +following header at offset 0x01f1. The real-mode code can total up to
> > +32K, although the boot loader may choose to load only the first two
> > +sectors (1K) and then examine the bootup sector size.
> > +
> > +The header looks like::
> > +
> > + Offset Proto Name Meaning
> > + /Size
> > +
> > + 01F1/1 ALL(1 setup_sects The size of the setup in sectors
> > + 01F2/2 ALL root_flags If set, the root is mounted readonly
> > + 01F4/4 2.04+(2 syssize The size of the 32-bit code in 16-byte paras
> > + 01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only
> > + 01FA/2 ALL vid_mode Video mode control
> > + 01FC/2 ALL root_dev Default root device number
> > + 01FE/2 ALL boot_flag 0xAA55 magic number
> > + 0200/2 2.00+ jump Jump instruction
> > + 0202/4 2.00+ header Magic signature "HdrS"
> > + 0206/2 2.00+ version Boot protocol version supported
> > + 0208/4 2.00+ realmode_swtch Boot loader hook (see below)
> > + 020C/2 2.00+ start_sys_seg The load-low segment (0x1000) (obsolete)
> > + 020E/2 2.00+ kernel_version Pointer to kernel version string
> > + 0210/1 2.00+ type_of_loader Boot loader identifier
> > + 0211/1 2.00+ loadflags Boot protocol option flags
> > + 0212/2 2.00+ setup_move_size Move to high memory size (used with hooks)
> > + 0214/4 2.00+ code32_start Boot loader hook (see below)
> > + 0218/4 2.00+ ramdisk_image initrd load address (set by boot loader)
> > + 021C/4 2.00+ ramdisk_size initrd size (set by boot loader)
> > + 0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only
> > + 0224/2 2.01+ heap_end_ptr Free memory after setup end
> > + 0226/1 2.02+(3 ext_loader_ver Extended boot loader version
> > + 0227/1 2.02+(3 ext_loader_type Extended boot loader ID
> > + 0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line
> > + 022C/4 2.03+ initrd_addr_max Highest legal initrd address
> > + 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
> > + 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
> > + 0235/1 2.10+ min_alignment Minimum alignment, as a power of two
> > + 0236/2 2.12+ xloadflags Boot protocol option flags
> > + 0238/4 2.06+ cmdline_size Maximum size of the kernel command line
> > + 023C/4 2.07+ hardware_subarch Hardware subarchitecture
> > + 0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
> > + 0248/4 2.08+ payload_offset Offset of kernel payload
> > + 024C/4 2.08+ payload_length Length of kernel payload
> > + 0250/8 2.09+ setup_data 64-bit physical pointer to linked list
> > + of struct setup_data
> > + 0258/8 2.10+ pref_address Preferred loading address
> > + 0260/4 2.10+ init_size Linear memory required during initialization
> > + 0264/4 2.11+ handover_offset Offset of handover entry point
>
> This is a table. Please use table markup and fix the wrong indentation
> there, as that makes it a lot easier to read in HTML, EPUB and PDF output.
>
> E. g. something like:
>
> ====== ======== ===================== ========================================
> Offset Proto Name Meaning
> /Size
>
> 01F1/1 ALL(1) setup_sects The size of the setup in sectors
> 01F2/2 ALL root_flags If set, the root is mounted readonly
> 01F4/4 2.04+(2) syssize The size of the 32-bit code in 16-byte
> paras
> 01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only
> 01FA/2 ALL vid_mode Video mode control
> 01FC/2 ALL root_dev Default root device number
> 01FE/2 ALL boot_flag 0xAA55 magic number
> 0200/2 2.00+ jump Jump instruction
> 0202/4 2.00+ header Magic signature "HdrS"
> 0206/2 2.00+ version Boot protocol version supported
> 0208/4 2.00+ realmode_swtch Boot loader hook (see below)
> 020C/2 2.00+ start_sys_seg The load-low segment (0x1000) (obsolete)
> 020E/2 2.00+ kernel_version Pointer to kernel version string
> 0210/1 2.00+ type_of_loader Boot loader identifier
> 0211/1 2.00+ loadflags Boot protocol option flags
> 0212/2 2.00+ setup_move_size Move to high memory size
> (used with hooks)
> 0214/4 2.00+ code32_start Boot loader hook (see below)
> 0218/4 2.00+ ramdisk_image initrd load address (set by boot loader)
> 021C/4 2.00+ ramdisk_size initrd size (set by boot loader)
> 0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only
> 0224/2 2.01+ heap_end_ptr Free memory after setup end
> 0226/1 2.02+(3) ext_loader_ver Extended boot loader version
> 0227/1 2.02+(3) ext_loader_type Extended boot loader ID
> 0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line
> 022C/4 2.03+ initrd_addr_max Highest legal initrd address
> 0230/4 2.05+ kernel_alignment Physical addr alignment required for
> kernel
> 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
> 0235/1 2.10+ min_alignment Minimum alignment, as a power of two
> 0236/2 2.12+ xloadflags Boot protocol option flags
> 0238/4 2.06+ cmdline_size Maximum size of the kernel command line
> 023C/4 2.07+ hardware_subarch Hardware subarchitecture
> 0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
> 0248/4 2.08+ payload_offset Offset of kernel payload
> 024C/4 2.08+ payload_length Length of kernel payload
> 0250/8 2.09+ setup_data 64-bit physical pointer to linked list
> of struct setup_data
> 0258/8 2.10+ pref_address Preferred loading address
> 0260/4 2.10+ init_size Linear memory required during
> initialization
> 0264/4 2.11+ handover_offset Offset of handover entry point
> ====== ======== ===================== ========================================
>
>
done as table.

> > +
> > +(1) For backwards compatibility, if the setup_sects field contains 0, the
> > + real value is 4.
> > +
> > +(2) For boot protocol prior to 2.04, the upper two bytes of the syssize
> > + field are unusable, which means the size of a bzImage kernel
> > + cannot be determined.
> > +
> > +(3) Ignored, but safe to set, for boot protocols 2.02-2.09.
>
> Btw, (1), (2) and (3) here look like footnotes. Perhaps you could use
> ReST footnote markup, if that is OK with the x86 maintainers.
>
Turned into footnotes.
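
For anyone following the table and footnote (1) above, here is a rough C
sketch of how a loader might probe those first few fields. The struct and
function names (boot_hdr_info, rd16, probe_boot_header) are made up for
illustration; only the offsets, the 0xAA55 and "HdrS" magic values and the
"0 means 4" rule for setup_sects come from the quoted text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result structure; not part of any kernel header. */
struct boot_hdr_info {
	unsigned setup_sects;	/* footnote (1) applied: 0 is read as 4 */
	int has_boot_flag;	/* 0xAA55 found at offset 0x1FE */
	int has_hdrs;		/* "HdrS" found at offset 0x202 */
	uint16_t version;	/* field at 0x206, or 0 if "HdrS" is absent */
};

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer does not cover offset 0x207. */
static int probe_boot_header(const uint8_t *img, size_t len,
			     struct boot_hdr_info *out)
{
	if (len < 0x208)
		return -1;

	out->setup_sects = img[0x1F1] ? img[0x1F1] : 4;
	out->has_boot_flag = rd16(img + 0x1FE) == 0xAA55;
	out->has_hdrs = memcmp(img + 0x202, "HdrS", 4) == 0;
	out->version = out->has_hdrs ? rd16(img + 0x206) : 0;
	return 0;
}
```

A version of 0 here is just this sketch's way of flagging a missing "HdrS"
signature; the caller decides what to do with such an image.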

> > +
> > +If the "HdrS" (0x53726448) magic number is not found at offset 0x202,
> > +the boot protocol version is "old". Loading an old kernel, the
> > +following parameters should be assumed::
> > +
> > + Image type = zImage
> > + initrd not supported
> > + Real-mode kernel must be located at 0x90000.
> > +
> > +Otherwise, the "version" field contains the protocol version,
> > +e.g. protocol version 2.01 will contain 0x0201 in this field. When
> > +setting fields in the header, you must make sure only to set fields
> > +supported by the protocol version in use.
> > +
> > +
> > +DETAILS OF HEADER FIELDS
> > +========================
> > +
> > +Among the fields, some carry information from the kernel to the bootloader
> > +("read"), some are expected to be filled out by the bootloader
> > +("write"), and some are expected to be read and modified by the
> > +bootloader ("modify").
> > +
> > +All general purpose boot loaders should write the fields marked
> > +(obligatory). Boot loaders that want to load the kernel at a
> > +nonstandard address should fill in the fields marked (reloc); other
> > +boot loaders can ignore those fields.
> > +
> > +The byte order of all fields is little-endian (this is x86, after all.)
> > +::
> > +
> > + Field name: setup_sects
> > + Type: read
> > + Offset/size: 0x1f1/1
> > + Protocol: ALL
>
> Marking this as a literal block sounds plain wrong to me. I suspect that
> you could use this syntax instead:
>
> :Field name: setup_sects
> :Type: read
> :Offset/size: 0x1f1/1
> :Protocol: ALL
>
> Or:
>
> Field name: setup_sects
> -----------------------
>
> Type:
> read
> Offset/size:
> 0x1f1/1
> Protocol:
> ALL
>
> Or (my favorite):
>
> Field name: setup_sects
> -----------------------
>
> :Type: read
> :Offset/size: 0x1f1/1
> :Protocol: ALL
>
> It is more compact in text and will produce much better
> HTML/PDF output. It will also (IMHO) make it a lot easier for
> people to read as text and to search for a specific field.
>
> Of course, whatever we do here should be applied to all similar
> structs inside this file.
>
I converted them to tables. The output looks good. Thanks.

> > +
> > +The size of the setup code in 512-byte sectors. If this field is
> > +0, the real value is 4. The real-mode code consists of the boot
> > +sector (always one 512-byte sector) plus the setup code.
> > +::
> > +
> > + Field name: root_flags
> > + Type: modify (optional)
> > + Offset/size: 0x1f2/2
> > + Protocol: ALL
> > +
> > +If this field is nonzero, the root defaults to readonly. The use of
> > +this field is deprecated; use the "ro" or "rw" options on the
> > +command line instead.
> > +::
> > +
> > + Field name: syssize
> > + Type: read
> > + Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
> > + Protocol: 2.04+
> > +
> > +The size of the protected-mode code in units of 16-byte paragraphs.
> > +For protocol versions older than 2.04 this field is only two bytes
> > +wide, and therefore cannot be trusted for the size of a kernel if
> > +the LOADED_HIGH flag is set.
> > +::
> > +
> > + Field name: ram_size
> > + Type: kernel internal
> > + Offset/size: 0x1f8/2
> > + Protocol: ALL
> > +
> > +This field is obsolete.
> > +::
> > +
> > + Field name: vid_mode
> > + Type: modify (obligatory)
> > + Offset/size: 0x1fa/2
> > +
> > +Please see the section on SPECIAL COMMAND LINE OPTIONS.
> > +::
> > +
> > + Field name: root_dev
> > + Type: modify (optional)
> > + Offset/size: 0x1fc/2
> > + Protocol: ALL
> > +
> > +The default root device number. The use of this field is
> > +deprecated; use the "root=" option on the command line instead.
> > +::
> > +
> > + Field name: boot_flag
> > + Type: read
> > + Offset/size: 0x1fe/2
> > + Protocol: ALL
> > +
> > +Contains 0xAA55. This is the closest thing old Linux kernels have
> > +to a magic number.
> > +::
> > +
> > + Field name: jump
> > + Type: read
> > + Offset/size: 0x200/2
> > + Protocol: 2.00+
> > +
> > +Contains an x86 jump instruction, 0xEB followed by a signed offset
> > +relative to byte 0x202. This can be used to determine the size of
> > +the header.
> > +::
> > +
> > + Field name: header
> > + Type: read
> > + Offset/size: 0x202/4
> > + Protocol: 2.00+
> > +
> > +Contains the magic number "HdrS" (0x53726448).
> > +::
> > +
> > + Field name: version
> > + Type: read
> > + Offset/size: 0x206/2
> > + Protocol: 2.00+
> > +
> > +Contains the boot protocol version, in (major << 8)+minor format,
> > +e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
> > +10.17.
> > +::
> > +
> > + Field name: realmode_swtch
> > + Type: modify (optional)
> > + Offset/size: 0x208/4
> > + Protocol: 2.00+
> > +
> > +Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> > +::
> > +
> > + Field name: start_sys_seg
> > + Type: read
> > + Offset/size: 0x20c/2
> > + Protocol: 2.00+
> > +
> > +The load low segment (0x1000). Obsolete.
> > +::
> > +
> > + Field name: kernel_version
> > + Type: read
> > + Offset/size: 0x20e/2
> > + Protocol: 2.00+
> > +
> > +If set to a nonzero value, contains a pointer to a NUL-terminated
> > +human-readable kernel version number string, less 0x200. This can
> > +be used to display the kernel version to the user. This value
> > +should be less than (0x200*setup_sects).
> > +
> > +For example, if this value is set to 0x1c00, the kernel version
> > +number string can be found at offset 0x1e00 in the kernel file.
> > +This is a valid value if and only if the "setup_sects" field
> > +contains the value 15 or higher, as::
> > +
> > + 0x1c00 < 15*0x200 (= 0x1e00) but
> > + 0x1c00 >= 14*0x200 (= 0x1c00)
> > +
> > + 0x1c00 >> 9 = 14, so the minimum value for setup_sects is 15.
> > +
> > +::
> > +
> > + Field name: type_of_loader
> > + Type: write (obligatory)
> > + Offset/size: 0x210/1
> > + Protocol: 2.00+
> > +
> > +If your boot loader has an assigned id (see table below), enter
> > +0xTV here, where T is an identifier for the boot loader and V is
> > +a version number. Otherwise, enter 0xFF here.
> > +
> > +For boot loader IDs above T = 0xD, write T = 0xE to this field and
> > +write the extended ID minus 0x10 to the ext_loader_type field.
> > +Similarly, the ext_loader_ver field can be used to provide more than
> > +four bits for the bootloader version.
> > +
> > +For example, for T = 0x15, V = 0x234, write::
> > +
> > + type_of_loader <- 0xE4
> > + ext_loader_type <- 0x05
> > + ext_loader_ver <- 0x23
> > +
> > +Assigned boot loader ids (hexadecimal)::
> > +
> > + 0 LILO (0x00 reserved for pre-2.00 bootloader)
> > + 1 Loadlin
> > + 2 bootsect-loader (0x20, all other values reserved)
> > + 3 Syslinux
> > + 4 Etherboot/gPXE/iPXE
> > + 5 ELILO
> > + 7 GRUB
> > + 8 U-Boot
> > + 9 Xen
> > + A Gujin
> > + B Qemu
> > + C Arcturus Networks uCbootloader
> > + D kexec-tools
> > + E Extended (see ext_loader_type)
> > + F Special (0xFF = undefined)
> > + 10 Reserved
> > + 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de>
> > + 12 OVMF UEFI virtualization stack
>
> Clearly there's something wrong with the last 3 lines, as they aren't
> following the expected indentation.
>
> Anyway, IMO the best would be to use a table, instead:
>
> == =======================================
> 0 LILO
> (0x00 reserved for pre-2.00 bootloader)
> 1 Loadlin
> 2 bootsect-loader
> (0x20, all other values reserved)
> 3 Syslinux
> 4 Etherboot/gPXE/iPXE
> 5 ELILO
> 7 GRUB
> 8 U-Boot
> 9 Xen
> A Gujin
> B Qemu
> C Arcturus Networks uCbootloader
> D kexec-tools
> E Extended
> (see ext_loader_type)
> F Special
> (0xFF = undefined)
> 10 Reserved
> 11 Minimal Linux Bootloader
> <http://sebastian-plotz.blogspot.de>
> 12 OVMF UEFI virtualization stack
> == =======================================
>
done.
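
As a cross-check of the T = 0x15, V = 0x234 example quoted above, the
packing rule can be sketched in C roughly as follows. The struct and the
function name encode_loader_id are hypothetical, and the sketch assumes
the caller passes either a plain ID (0x0 to 0xD) or an extended one (0x10
and up), with a version that fits in 12 bits.

```c
#include <stdint.h>

/* Hypothetical container for the three header bytes involved. */
struct loader_id_fields {
	uint8_t type_of_loader;		/* offset 0x210: 0xTV */
	uint8_t ext_loader_type;	/* offset 0x227 */
	uint8_t ext_loader_ver;		/* offset 0x226 */
};

/*
 * Pack a boot loader ID and version as described for type_of_loader.
 * IDs of 0x10 and above are written as T = 0xE plus (id - 0x10) in
 * ext_loader_type; the low four version bits go into V and the rest
 * into ext_loader_ver.
 */
static struct loader_id_fields encode_loader_id(unsigned id, unsigned version)
{
	struct loader_id_fields f;
	unsigned t = id >= 0x10 ? 0xE : id;

	f.type_of_loader = (uint8_t)((t << 4) | (version & 0x0F));
	f.ext_loader_type = (uint8_t)(id >= 0x10 ? id - 0x10 : 0);
	f.ext_loader_ver = (uint8_t)(version >> 4);
	return f;
}
```

With id = 0x15 and version = 0x234 this gives 0xE4, 0x05 and 0x23, which
matches the three values in the quoted example.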

>
>
> > +
> > +Please contact <[email protected]> if you need a bootloader ID value assigned.
> > +::
> > +
> > + Field name: loadflags
> > + Type: modify (obligatory)
> > + Offset/size: 0x211/1
> > + Protocol: 2.00+
> > +
> > +This field is a bitmask.
> > +::
> > +
> > + Bit 0 (read): LOADED_HIGH
> > + - If 0, the protected-mode code is loaded at 0x10000.
> > + - If 1, the protected-mode code is loaded at 0x100000.
> > +
> > + Bit 1 (kernel internal): KASLR_FLAG
> > + - Used internally by the compressed kernel to communicate
> > + KASLR status to kernel proper.
> > + If 1, KASLR enabled.
> > + If 0, KASLR disabled.
>
> You need to either add blank lines or add a "- " before the
> two if's above.
>
done.

> > +
> > + Bit 5 (write): QUIET_FLAG
> > + - If 0, print early messages.
> > + - If 1, suppress early messages.
> > + This asks the kernel (decompressor and early
> > + kernel) not to write early messages that require
> > + accessing the display hardware directly.
> > +
> > + Bit 6 (write): KEEP_SEGMENTS
> > + Protocol: 2.07+
> > + - If 0, reload the segment registers in the 32bit entry point.
> > + - If 1, do not reload the segment registers in the 32bit entry point.
> > + Assume that %cs %ds %ss %es are all set to flat segments with
> > + a base of 0 (or the equivalent for their environment).
> > +
> > + Bit 7 (write): CAN_USE_HEAP
> > + Set this bit to 1 to indicate that the value entered in the
> > + heap_end_ptr is valid. If this field is clear, some setup code
> > + functionality will be disabled.
> > +
> > +::
> > +
> > + Field name: setup_move_size
> > + Type: modify (obligatory)
> > + Offset/size: 0x212/2
> > + Protocol: 2.00-2.01
> > +
> > +When using protocol 2.00 or 2.01, if the real mode kernel is not
> > +loaded at 0x90000, it gets moved there later in the loading
> > +sequence. Fill in this field if you want additional data (such as
> > +the kernel command line) moved in addition to the real-mode kernel
> > +itself.
> > +
> > +The unit is bytes starting with the beginning of the boot sector.
> > +
> > +This field can be ignored when the protocol is 2.02 or higher, or
> > +if the real-mode code is loaded at 0x90000.
> > +::
> > +
> > + Field name: code32_start
> > + Type: modify (optional, reloc)
> > + Offset/size: 0x214/4
> > + Protocol: 2.00+
> > +
> > +The address to jump to in protected mode. This defaults to the load
> > +address of the kernel, and can be used by the boot loader to
> > +determine the proper load address.
> > +
> > +This field can be modified for two purposes:
> > +
> > + 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> > +
> > + 2. if a bootloader which does not install a hook loads a
> > + relocatable kernel at a nonstandard address it will have to modify
> > + this field to point to the load address.
> > +
> > +::
> > +
> > + Field name: ramdisk_image
> > + Type: write (obligatory)
> > + Offset/size: 0x218/4
> > + Protocol: 2.00+
> > +
> > +The 32-bit linear address of the initial ramdisk or ramfs. Leave at
> > +zero if there is no initial ramdisk/ramfs.
> > +::
> > +
> > + Field name: ramdisk_size
> > + Type: write (obligatory)
> > + Offset/size: 0x21c/4
> > + Protocol: 2.00+
> > +
> > +Size of the initial ramdisk or ramfs. Leave at zero if there is no
> > +initial ramdisk/ramfs.
> > +::
> > +
> > + Field name: bootsect_kludge
> > + Type: kernel internal
> > + Offset/size: 0x220/4
> > + Protocol: 2.00+
> > +
> > +This field is obsolete.
> > +::
> > +
> > + Field name: heap_end_ptr
> > + Type: write (obligatory)
> > + Offset/size: 0x224/2
> > + Protocol: 2.01+
> > +
> > +Set this field to the offset (from the beginning of the real-mode
> > +code) of the end of the setup stack/heap, minus 0x0200.
> > +::
> > +
> > + Field name: ext_loader_ver
> > + Type: write (optional)
> > + Offset/size: 0x226/1
> > + Protocol: 2.02+
> > +
> > +This field is used as an extension of the version number in the
> > +type_of_loader field. The total version number is considered to be
> > +(type_of_loader & 0x0f) + (ext_loader_ver << 4).
> > +
> > +The use of this field is boot loader specific. If not written, it
> > +is zero.
> > +
> > +Kernels prior to 2.6.31 did not recognize this field, but it is safe
> > +to write for protocol version 2.02 or higher.
> > +::
> > +
> > + Field name: ext_loader_type
> > + Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0)
> > + Offset/size: 0x227/1
> > + Protocol: 2.02+
> > +
> > +This field is used as an extension of the type number in
> > +type_of_loader field. If the type in type_of_loader is 0xE, then
> > +the actual type is (ext_loader_type + 0x10).
> > +
> > +This field is ignored if the type in type_of_loader is not 0xE.
> > +
> > +Kernels prior to 2.6.31 did not recognize this field, but it is safe
> > +to write for protocol version 2.02 or higher.
> > +::
> > +
> > + Field name: cmd_line_ptr
> > + Type: write (obligatory)
> > + Offset/size: 0x228/4
> > + Protocol: 2.02+
> > +
> > +Set this field to the linear address of the kernel command line.
> > +The kernel command line can be located anywhere between the end of
> > +the setup heap and 0xA0000; it does not have to be located in the
> > +same 64K segment as the real-mode code itself.
> > +
> > +Fill in this field even if your boot loader does not support a
> > +command line, in which case you can point this to an empty string
> > +(or better yet, to the string "auto".) If this field is left at
> > +zero, the kernel will assume that your boot loader does not support
> > +the 2.02+ protocol.
> > +::
> > +
> > + Field name: initrd_addr_max
> > + Type: read
> > + Offset/size: 0x22c/4
> > + Protocol: 2.03+
> > +
> > +The maximum address that may be occupied by the initial
> > +ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
> > +field is not present, and the maximum address is 0x37FFFFFF. (This
> > +address is defined as the address of the highest safe byte, so if
> > +your ramdisk is exactly 131072 bytes long and this field is
> > +0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
> > +::
> > +
> > + Field name: kernel_alignment
> > + Type: read/modify (reloc)
> > + Offset/size: 0x230/4
> > + Protocol: 2.05+ (read), 2.10+ (modify)
> > +
> > +Alignment unit required by the kernel (if relocatable_kernel is
> > +true.) A relocatable kernel that is loaded at an alignment
> > +incompatible with the value in this field will be realigned during
> > +kernel initialization.
> > +
> > +Starting with protocol version 2.10, this reflects the kernel
> > +alignment preferred for optimal performance; it is possible for the
> > +loader to modify this field to permit a lesser alignment. See the
> > +min_alignment and pref_address fields below.
> > +::
> > +
> > + Field name: relocatable_kernel
> > + Type: read (reloc)
> > + Offset/size: 0x234/1
> > + Protocol: 2.05+
> > +
> > +If this field is nonzero, the protected-mode part of the kernel can
> > +be loaded at any address that satisfies the kernel_alignment field.
> > +After loading, the boot loader must set the code32_start field to
> > +point to the loaded code, or to a boot loader hook.
> > +::
> > +
> > + Field name: min_alignment
> > + Type: read (reloc)
> > + Offset/size: 0x235/1
> > + Protocol: 2.10+
> > +
> > +This field, if nonzero, indicates as a power of two the minimum
> > +alignment required, as opposed to preferred, by the kernel to boot.
> > +If a boot loader makes use of this field, it should update the
> > +kernel_alignment field with the alignment unit desired; typically::
> > +
> > + kernel_alignment = 1 << min_alignment
> > +
> > +There may be a considerable performance cost with an excessively
> > +misaligned kernel. Therefore, a loader should typically try each
> > +power-of-two alignment from kernel_alignment down to this alignment.
> > +::
> > +
> > + Field name: xloadflags
> > + Type: read
> > + Offset/size: 0x236/2
> > + Protocol: 2.12+
> > +
> > +This field is a bitmask.
> > +::
> > +
> > + Bit 0 (read): XLF_KERNEL_64
> > + - If 1, this kernel has the legacy 64-bit entry point at 0x200.
> > +
> > + Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G
> > + - If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.
>
> Please indent it the same way as Bit 0.
>
done.
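
Side note on the min_alignment text quoted a bit further up ("try each
power-of-two alignment from kernel_alignment down to this alignment"):
one possible reading of that loop, as a C sketch. The function name and
the idea of describing free memory as a single [region_start, region_end)
range are assumptions made for the example, and kernel_alignment is
assumed to be a nonzero power of two.

```c
#include <stdint.h>

/*
 * Try each power-of-two alignment from kernel_alignment down to
 * (1 << min_alignment) and return the first one for which an aligned
 * block of `size` bytes fits inside [region_start, region_end).
 * On success the aligned address is stored in *load_addr.
 * Returns 0 if no permitted alignment fits.
 * If min_alignment is 0 (field not provided), only kernel_alignment
 * itself is tried.
 */
static uint32_t pick_kernel_alignment(uint32_t kernel_alignment,
				      uint8_t min_alignment,
				      uint64_t region_start,
				      uint64_t region_end,
				      uint64_t size, uint64_t *load_addr)
{
	uint32_t lowest, align;

	if (kernel_alignment == 0 || min_alignment >= 32)
		return 0;

	lowest = min_alignment ? (uint32_t)1 << min_alignment
			       : kernel_alignment;

	for (align = kernel_alignment; align != 0 && align >= lowest;
	     align >>= 1) {
		uint64_t addr = (region_start + align - 1) &
				~((uint64_t)align - 1);

		if (addr <= region_end && size <= region_end - addr) {
			*load_addr = addr;
			return align;
		}
	}
	return 0;
}
```

Per the quoted text, a loader that settles on a lesser alignment would
then write the chosen value back into the kernel_alignment field.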

> > +
> > + Bit 2 (read): XLF_EFI_HANDOVER_32
> > + - If 1, the kernel supports the 32-bit EFI handoff entry point
> > + given at handover_offset.
> > +
> > + Bit 3 (read): XLF_EFI_HANDOVER_64
> > + - If 1, the kernel supports the 64-bit EFI handoff entry point
> > + given at handover_offset + 0x200.
> > +
> > + Bit 4 (read): XLF_EFI_KEXEC
> > + - If 1, the kernel supports kexec EFI boot with EFI runtime support.
> > +
> > +::
> > +
> > + Field name: cmdline_size
> > + Type: read
> > + Offset/size: 0x238/4
> > + Protocol: 2.06+
> > +
> > +The maximum size of the command line without the terminating
> > +zero. This means that the command line can contain at most
> > +cmdline_size characters. With protocol version 2.05 and earlier, the
> > +maximum size was 255.
> > +::
> > +
> > + Field name: hardware_subarch
> > + Type: write (optional, defaults to x86/PC)
> > + Offset/size: 0x23c/4
> > + Protocol: 2.07+
> > +
> > +In a paravirtualized environment the hardware low level architectural
> > +pieces such as interrupt handling, page table handling, and
> > +accessing process control registers need to be done differently.
> > +
> > +This field allows the bootloader to inform the kernel that we are in
> > +one of those environments.
> > +::
> > +
> > + 0x00000000 The default x86/PC environment
> > + 0x00000001 lguest
> > + 0x00000002 Xen
> > + 0x00000003 Moorestown MID
> > + 0x00000004 CE4100 TV Platform
>
> This is already a table. Just add the markups for it, instead of using '::'
>
> e. g.:
>
> + ========== ==============================
> 0x00000000 The default x86/PC environment
> 0x00000001 lguest
> 0x00000002 Xen
> 0x00000003 Moorestown MID
> 0x00000004 CE4100 TV Platform
> + ========== ==============================
>
done.
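
While in this area: the cmdline_size rule quoted just above (limit taken
from the field for 2.06+, fixed at 255 characters before that) boils down
to something like the following C sketch. Both helper names are
hypothetical; `version` is the value of the header's version field.

```c
#include <stdint.h>
#include <string.h>

/*
 * Maximum number of command line characters, not counting the
 * terminating zero: cmdline_size for protocol 2.06+, else 255.
 */
static uint32_t max_cmdline_chars(uint16_t version, uint32_t cmdline_size)
{
	return version >= 0x0206 ? cmdline_size : 255;
}

/* Returns 1 if the command line fits within the limit, 0 otherwise. */
static int cmdline_fits(const char *cmdline, uint16_t version,
			uint32_t cmdline_size)
{
	return strlen(cmdline) <= max_cmdline_chars(version, cmdline_size);
}
```

A loader could use such a check to warn the user before passing a command
line that is longer than the limit.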

>
> > +
> > +::
> > +
> > + Field name: hardware_subarch_data
> > + Type: write (subarch-dependent)
> > + Offset/size: 0x240/8
> > + Protocol: 2.07+
> > +
> > +A pointer to data that is specific to the hardware subarch.
> > +This field is currently unused for the default x86/PC environment;
> > +do not modify it.
> > +::
> > +
> > + Field name: payload_offset
> > + Type: read
> > + Offset/size: 0x248/4
> > + Protocol: 2.08+
> > +
> > +If non-zero then this field contains the offset from the beginning
> > +of the protected-mode code to the payload.
> > +
> > +The payload may be compressed. The format of both the compressed and
> > +uncompressed data should be determined using the standard magic
> > +numbers. The currently supported compression formats are gzip
> > +(magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
> > +(magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
> > +02 21). The uncompressed payload is currently always ELF (magic
> > +number 7F 45 4C 46).
> > +::
> > +
> > + Field name: payload_length
> > + Type: read
> > + Offset/size: 0x24c/4
> > + Protocol: 2.08+
> > +
> > +The length of the payload.
> > +::
> > +
> > + Field name: setup_data
> > + Type: write (special)
> > + Offset/size: 0x250/8
> > + Protocol: 2.09+
> > +
> > +The 64-bit physical pointer to a NULL-terminated singly linked list of
> > +struct setup_data. This is used to define a more extensible boot
> > +parameter passing mechanism. The definition of struct setup_data is
> > +as follows::
> > +
> > + struct setup_data {
> > + u64 next;
> > + u32 type;
> > + u32 len;
> > + u8 data[0];
> > + };
> > +
> > +Here, next is a 64-bit physical pointer to the next node of the
> > +linked list (the next field of the last node is 0); type identifies
> > +the contents of data; len is the length of the data field; and data
> > +holds the real payload.
> > +
> > +This list may be modified at a number of points during the bootup
> > +process. Therefore, when modifying this list one should always make
> > +sure to consider the case where the linked list already contains
> > +entries.
> > +::
> > +
> > + Field name: pref_address
> > + Type: read (reloc)
> > + Offset/size: 0x258/8
> > + Protocol: 2.10+
> > +
> > +This field, if nonzero, represents a preferred load address for the
> > +kernel. A relocating bootloader should attempt to load at this
> > +address if possible.
> > +
> > +A non-relocatable kernel will unconditionally move itself to this
> > +address and run there.
> > +::
> > +
> > + Field name: init_size
> > + Type: read
> > + Offset/size: 0x260/4
> > +
> > +This field indicates the amount of linear contiguous memory starting
> > +at the kernel runtime start address that the kernel needs before it
> > +is capable of examining its memory map. This is not the same thing
> > +as the total amount of memory the kernel needs to boot, but it can
> > +be used by a relocating boot loader to help select a safe load
> > +address for the kernel.
> > +
> > +The kernel runtime start address is determined by the following algorithm::
> > +
> > + if (relocatable_kernel)
> > + runtime_start = align_up(load_address, kernel_alignment)
> > + else
> > + runtime_start = pref_address
> > +
> > +::
> > +
> > + Field name: handover_offset
> > + Type: read
> > + Offset/size: 0x264/4
> > +
> > +This field is the offset from the beginning of the kernel image to
> > +the EFI handover protocol entry point. Boot loaders using the EFI
> > +handover protocol to boot the kernel should jump to this offset.
> > +
> > +See EFI HANDOVER PROTOCOL below for more details.
> > +
> > +
> > +THE IMAGE CHECKSUM
> > +==================
> > +
> > +From boot protocol version 2.08 onwards the CRC-32 is calculated over
> > +the entire file using the characteristic polynomial 0x04C11DB7 and an
> > +initial remainder of 0xffffffff. The checksum is appended to the
> > +file; therefore the CRC of the file up to the limit specified in the
> > +syssize field of the header is always 0.
> > +
> > +
> > +THE KERNEL COMMAND LINE
> > +=======================
> > +
> > +The kernel command line has become an important way for the boot
> > +loader to communicate with the kernel. Some of its options are also
> > +relevant to the boot loader itself, see "special command line options"
> > +below.
> > +
> > +The kernel command line is a null-terminated string. The maximum
> > +length can be retrieved from the field cmdline_size. Before protocol
> > +version 2.06, the maximum was 255 characters. A string that is too
> > +long will be automatically truncated by the kernel.
> > +
> > +If the boot protocol version is 2.02 or later, the address of the
> > +kernel command line is given by the header field cmd_line_ptr (see
> > +above.) This address can be anywhere between the end of the setup
> > +heap and 0xA0000.
> > +
> > +If the protocol version is *not* 2.02 or higher, the kernel
> > +command line is entered using the following protocol:
> > +
> > + - At offset 0x0020 (word), "cmd_line_magic", enter the magic
> > + number 0xA33F.
> > +
> > + - At offset 0x0022 (word), "cmd_line_offset", enter the offset
> > + of the kernel command line (relative to the start of the
> > + real-mode kernel).
> > +
> > + - The kernel command line *must* be within the memory region
> > + covered by setup_move_size, so you may need to adjust this
> > + field.
> > +
> > +
> > +MEMORY LAYOUT OF THE REAL-MODE CODE
> > +===================================
> > +
> > +The real-mode code requires a stack/heap to be set up, as well as
> > +memory allocated for the kernel command line. This needs to be done
> > +in the real-mode accessible memory in the bottom megabyte.
> > +
> > +It should be noted that modern machines often have a sizable Extended
> > +BIOS Data Area (EBDA). As a result, it is advisable to use as little
> > +of the low megabyte as possible.
> > +
> > +Unfortunately, under the following circumstances the 0x90000 memory
> > +segment has to be used:
> > +
> > + - When loading a zImage kernel ((loadflags & 0x01) == 0).
> > + - When loading a 2.01 or earlier boot protocol kernel.
> > +
> > + For the 2.00 and 2.01 boot protocols, the real-mode code
> > + can be loaded at another address, but it is internally
> > + relocated to 0x90000. For the "old" protocol, the
> > + real-mode code must be loaded at 0x90000.
> > +
> > +When loading at 0x90000, avoid using memory above 0x9a000.
> > +
> > +For boot protocol 2.02 or higher, the command line does not have to be
> > +located in the same 64K segment as the real-mode setup code; it is
> > +thus permitted to give the stack/heap the full 64K segment and locate
> > +the command line above it.
> > +
> > +The kernel command line should not be located below the real-mode
> > +code, nor should it be located in high memory.
> > +
> > +
> > +SAMPLE BOOT CONFIGURATION
> > +=========================
> > +
> > +As a sample configuration, assume the following layout of the real
> > +mode segment.
> > +
>
>
> > +When loading below 0x90000, use the entire segment::
> > +
> > + 0x0000-0x7fff Real mode kernel
> > + 0x8000-0xdfff Stack and heap
> > + 0xe000-0xffff Kernel command line
> > +
> > +When loading at 0x90000 OR the protocol version is 2.01 or earlier::
> > +
> > + 0x0000-0x7fff Real mode kernel
> > + 0x8000-0x97ff Stack and heap
> > + 0x9800-0x9fff Kernel command line
>
> Again, tables. Just do:
>
> When loading below 0x90000, use the entire segment:
>
> + ============= ===================
> 0x0000-0x7fff Real mode kernel
> 0x8000-0xdfff Stack and heap
> 0xe000-0xffff Kernel command line
> + ============= ===================
>
> When loading at 0x90000 OR the protocol version is 2.01 or earlier:
>
> + ============= ===================
> 0x0000-0x7fff Real mode kernel
> 0x8000-0x97ff Stack and heap
> 0x9800-0x9fff Kernel command line
> + ============= ===================
>
done.
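
Going back to the init_size field quoted further up, the "kernel runtime
start address" pseudocode there can be written out in C along these
lines. The function names are hypothetical, align_up assumes a nonzero
power-of-two alignment, and kernel_init_end is only my reading of how a
relocating loader might use init_size to size the region it must keep
free.

```c
#include <stdint.h>

/* Round addr up to a multiple of alignment (nonzero power of two). */
static uint64_t align_up(uint64_t addr, uint64_t alignment)
{
	return (addr + alignment - 1) & ~(alignment - 1);
}

/* Runtime start address as given by the quoted algorithm. */
static uint64_t kernel_runtime_start(int relocatable_kernel,
				     uint64_t load_address,
				     uint32_t kernel_alignment,
				     uint64_t pref_address)
{
	if (relocatable_kernel)
		return align_up(load_address, kernel_alignment);
	return pref_address;
}

/* First address past the init_size bytes needed from runtime_start. */
static uint64_t kernel_init_end(uint64_t runtime_start, uint32_t init_size)
{
	return runtime_start + init_size;
}
```

The range from kernel_runtime_start() to kernel_init_end() is what the
loader would want to keep clear of the initrd and other boot data.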

>
>
> > +
> > +Such a boot loader should enter the following fields in the header::
> > +
> > + unsigned long base_ptr; /* base address for real-mode segment */
> > +
> > + if ( setup_sects == 0 ) {
> > + setup_sects = 4;
> > + }
> > +
> > + if ( protocol >= 0x0200 ) {
> > + type_of_loader = <type code>;
> > + if ( loading_initrd ) {
> > + ramdisk_image = <initrd_address>;
> > + ramdisk_size = <initrd_size>;
> > + }
> > +
> > + if ( protocol >= 0x0202 && (loadflags & 0x01) )
> > + heap_end = 0xe000;
> > + else
> > + heap_end = 0x9800;
> > +
> > + if ( protocol >= 0x0201 ) {
> > + heap_end_ptr = heap_end - 0x200;
> > + loadflags |= 0x80; /* CAN_USE_HEAP */
> > + }
> > +
> > + if ( protocol >= 0x0202 ) {
> > + cmd_line_ptr = base_ptr + heap_end;
> > + strcpy(cmd_line_ptr, cmdline);
> > + } else {
> > + cmd_line_magic = 0xA33F;
> > + cmd_line_offset = heap_end;
> > + setup_move_size = heap_end + strlen(cmdline)+1;
> > + strcpy(base_ptr+cmd_line_offset, cmdline);
> > + }
> > + } else {
> > + /* Very old kernel */
> > +
> > + heap_end = 0x9800;
> > +
> > + cmd_line_magic = 0xA33F;
> > + cmd_line_offset = heap_end;
> > +
> > + /* A very old kernel MUST have its real-mode code
> > + loaded at 0x90000 */
> > +
> > + if ( base_ptr != 0x90000 ) {
> > + /* Copy the real-mode kernel */
> > + memcpy(0x90000, base_ptr, (setup_sects+1)*512);
> > + base_ptr = 0x90000; /* Relocated */
> > + }
> > +
> > + strcpy(0x90000+cmd_line_offset, cmdline);
> > +
> > + /* It is recommended to clear memory up to the 32K mark */
> > + memset(0x90000 + (setup_sects+1)*512, 0,
> > + (64-(setup_sects+1))*512);
> > + }
> > +
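
For reference, the heap placement logic in the sample above can be isolated into a small testable helper. This is only a sketch: struct heap_setup and plan_heap() are hypothetical names that do not appear in the patch, and the constants are taken from the quoted pseudocode.

```c
#include <assert.h>
#include <stdint.h>

#define LOADED_HIGH  0x01  /* loadflags bit 0 */
#define CAN_USE_HEAP 0x80  /* loadflags bit 7 */

/* Hypothetical container for the values the quoted pseudocode computes. */
struct heap_setup {
    uint16_t heap_end;     /* offset of the end of stack/heap in the segment */
    uint16_t heap_end_ptr; /* value for the header field (2.01+), else 0 */
    uint8_t  loadflags;    /* loadflags, with CAN_USE_HEAP set when valid */
};

static struct heap_setup plan_heap(uint16_t protocol, uint8_t loadflags)
{
    struct heap_setup s = { 0x9800, 0, loadflags };

    /* Only a 2.02+ bzImage kernel may use the whole 64K segment. */
    if (protocol >= 0x0202 && (loadflags & LOADED_HIGH))
        s.heap_end = 0xe000;

    /* heap_end_ptr only exists from protocol 2.01 onwards. */
    if (protocol >= 0x0201) {
        s.heap_end_ptr = (uint16_t)(s.heap_end - 0x200);
        s.loadflags |= CAN_USE_HEAP;
        assert(s.heap_end_ptr < s.heap_end);
    }
    return s;
}
```

With protocol 0x0202 and LOADED_HIGH set, this gives heap_end = 0xe000 and heap_end_ptr = 0xde00, matching the first layout table. Every other case falls back to the 0x9800 layout.
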
> > +
> > +LOADING THE REST OF THE KERNEL
> > +==============================
> > +
> > +The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512
> > +in the kernel file (again, if setup_sects == 0 the real value is 4.)
> > +It should be loaded at address 0x10000 for Image/zImage kernels and
> > +0x100000 for bzImage kernels.
> > +
> > +The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01
> > +bit (LOAD_HIGH) in the loadflags field is set::
> > +
> > + is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01);
> > + load_address = is_bzImage ? 0x100000 : 0x10000;
> > +
> > +Note that Image/zImage kernels can be up to 512K in size, and thus use
> > +the entire 0x10000-0x90000 range of memory. This means it is pretty
> > +much a requirement for these kernels to load the real-mode part at
> > +0x90000. bzImage kernels allow much more flexibility.
> > +
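
A compilable sketch of the arithmetic in this section may help readers check their loaders. The function names are made up for illustration. The rules themselves (setup_sects == 0 means 4, one extra sector for the boot sector, LOAD_HIGH selects 0x100000) come from the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Effective setup size in sectors: a stored value of 0 means 4. */
static uint32_t real_setup_sects(uint8_t setup_sects)
{
    return setup_sects ? setup_sects : 4;
}

/* File offset of the protected-mode kernel; +1 covers the boot sector. */
static uint32_t pm_kernel_offset(uint8_t setup_sects)
{
    return (real_setup_sects(setup_sects) + 1) * 512;
}

/* Bytes of protected-mode kernel in a kernel file of file_size bytes. */
static uint32_t pm_kernel_size(uint32_t file_size, uint8_t setup_sects)
{
    uint32_t offset = pm_kernel_offset(setup_sects);

    assert(file_size >= offset);
    return file_size - offset;
}

/* 0x100000 for bzImage (protocol >= 2.00 and LOAD_HIGH), else 0x10000. */
static uint32_t pm_kernel_load_address(uint16_t protocol, uint8_t loadflags)
{
    int is_bzimage = protocol >= 0x0200 && (loadflags & 0x01);

    return is_bzimage ? 0x100000 : 0x10000;
}
```

For example, a file with setup_sects == 0 has its protected-mode part at offset 5*512 = 2560.
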
> > +
> > +SPECIAL COMMAND LINE OPTIONS
> > +============================
> > +
> > +If the command line provided by the boot loader is entered by the
> > +user, the user may expect the following command line options to work.
> > +They should normally not be deleted from the kernel command line even
> > +though not all of them are actually meaningful to the kernel. Boot
> > +loader authors who need additional command line options for the boot
> > +loader itself should get them registered in
> > +Documentation/admin-guide/kernel-parameters.rst to make sure they will not
> > +conflict with actual kernel options now or in the future.
> > +
> > + vga=<mode>
> > + <mode> here is either an integer (in C notation, either
> > + decimal, octal, or hexadecimal) or one of the strings
> > + "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask"
> > + (meaning 0xFFFD). This value should be entered into the
> > + vid_mode field, as it is used by the kernel before the command
> > + line is parsed.
> > +
> > + mem=<size>
> > + <size> is an integer in C notation optionally followed by
> > + (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
> > + << 30, << 40, << 50 or << 60). This specifies the end of
> > + memory to the kernel. This affects the possible placement of
> > + an initrd, since an initrd should be placed near end of
> > + memory. Note that this is an option to *both* the kernel and
> > + the bootloader!
> > +
> > + initrd=<file>
> > + An initrd should be loaded. The meaning of <file> is
> > + obviously bootloader-dependent, and some boot loaders
> > + (e.g. LILO) do not have such a command.
> > +
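
Since vga= and mem= have to be interpreted by the boot loader itself, here is a minimal sketch of how the two values described above could be parsed. parse_vga_mode() and parse_mem_size() are hypothetical helpers, not code from any existing boot loader. Error handling and overflow checks are deliberately left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Translate the argument of vga= into a value for the vid_mode field. */
static uint16_t parse_vga_mode(const char *arg)
{
    assert(arg != NULL);
    if (strcmp(arg, "normal") == 0)
        return 0xFFFF;
    if (strcmp(arg, "ext") == 0)
        return 0xFFFE;
    if (strcmp(arg, "ask") == 0)
        return 0xFFFD;
    /* Base 0 accepts decimal, octal (0...) and hexadecimal (0x...). */
    return (uint16_t)strtoul(arg, NULL, 0);
}

/* Translate the argument of mem= into a byte count (no overflow check). */
static uint64_t parse_mem_size(const char *arg)
{
    char *end;
    uint64_t size;

    assert(arg != NULL);
    size = strtoull(arg, &end, 0);

    /* Each suffix adds another << 10 on top of the smaller ones. */
    switch (toupper((unsigned char)*end)) {
    case 'E': size <<= 10; /* fall through */
    case 'P': size <<= 10; /* fall through */
    case 'T': size <<= 10; /* fall through */
    case 'G': size <<= 10; /* fall through */
    case 'M': size <<= 10; /* fall through */
    case 'K': size <<= 10; break;
    default: break;
    }
    return size;
}
```

The cascading switch keeps the suffix table in one place. A real loader would also have to reject trailing garbage and values that overflow 64 bits.
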
> > +In addition, some boot loaders add the following options to the
> > +user-specified command line:
> > +
> > + BOOT_IMAGE=<file>
> > + The boot image which was loaded. Again, the meaning of <file>
> > + is obviously bootloader-dependent.
> > +
> > + auto
> > + The kernel was booted without explicit user intervention.
> > +
> > +If these options are added by the boot loader, it is highly
> > +recommended that they are located *first*, before the user-specified
> > +or configuration-specified command line. Otherwise, "init=/bin/sh"
> > +gets confused by the "auto" option.
> > +
> > +
> > +RUNNING THE KERNEL
> > +==================
> > +
> > +The kernel is started by jumping to the kernel entry point, which is
> > +located at *segment* offset 0x20 from the start of the real mode
> > +kernel. This means that if you loaded your real-mode kernel code at
> > +0x90000, the kernel entry point is 9020:0000.
> > +
> > +At entry, ds = es = ss should point to the start of the real-mode
> > +kernel code (0x9000 if the code is loaded at 0x90000), sp should be
> > +set up properly, normally pointing to the top of the heap, and
> > +interrupts should be disabled. Furthermore, to guard against bugs in
> > +the kernel, it is recommended that the boot loader sets fs = gs = ds =
> > +es = ss.
> > +
> > +In our example from above, we would do::
> > +
> > + /* Note: in the case of the "old" kernel protocol, base_ptr must
> > + be == 0x90000 at this point; see the previous sample code */
> > +
> > + seg = base_ptr >> 4;
> > +
> > + cli(); /* Enter with interrupts disabled! */
> > +
> > + /* Set up the real-mode kernel stack */
> > + _SS = seg;
> > + _SP = heap_end;
> > +
> > + _DS = _ES = _FS = _GS = seg;
> > + jmp_far(seg+0x20, 0); /* Run the kernel */
> > +
> > +If your boot sector accesses a floppy drive, it is recommended to
> > +switch off the floppy motor before running the kernel, since the
> > +kernel boot leaves interrupts off and thus the motor will not be
> > +switched off, especially if the loaded kernel has the floppy driver as
> > +a demand-loaded module!
> > +
> > +
> > +ADVANCED BOOT LOADER HOOKS
> > +==========================
> > +
> > +If the boot loader runs in a particularly hostile environment (such as
> > +LOADLIN, which runs under DOS) it may be impossible to follow the
> > +standard memory location requirements. Such a boot loader may use the
> > +following hooks that, if set, are invoked by the kernel at the
> > +appropriate time. The use of these hooks should probably be
> > +considered an absolutely last resort!
> > +
> > +IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
> > +%edi across invocation.
> > +
> > + realmode_swtch:
> > + A 16-bit real mode far subroutine invoked immediately before
> > + entering protected mode. The default routine disables NMI, so
> > + your routine should probably do so, too.
> > +
> > + code32_start:
> > + A 32-bit flat-mode routine *jumped* to immediately after the
> > + transition to protected mode, but before the kernel is
> > + uncompressed. No segments, except CS, are guaranteed to be
> > + set up (current kernels do, but older ones do not); you should
> > + set them up to BOOT_DS (0x18) yourself.
> > +
> > + After completing your hook, you should jump to the address
> > + that was in this field before your boot loader overwrote it
> > + (relocated, if appropriate.)
> > +
> > +
> > +32-bit BOOT PROTOCOL
> > +====================
> > +
> > +For machines with firmware other than a legacy BIOS, such as EFI or
> > +LinuxBIOS, and for kexec, the 16-bit real-mode setup code in the kernel,
> > +which depends on the legacy BIOS, cannot be used, so a 32-bit boot
> > +protocol needs to be defined.
> > +
> > +In the 32-bit boot protocol, the first step in loading a Linux kernel
> > +should be to set up the boot parameters (struct boot_params,
> > +traditionally known as the "zero page"). The memory for struct boot_params
> > +should be allocated and initialized to all zero. Then the setup header,
> > +starting at offset 0x01f1 of the kernel image, should be loaded into struct
> > +boot_params and examined. The end of the setup header can be calculated as
> > +follows::
> > +
> > + 0x0202 + byte value at offset 0x0201
> > +
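
To make the "end of setup header" formula concrete, here is a minimal sketch that copies the header into a zeroed boot_params buffer. It assumes struct boot_params is treated as an opaque 4096-byte array with the header at the same offset (0x01f1) as in the kernel image. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SETUP_HDR_START 0x01f1

/* End offset (exclusive) of the setup header within the kernel image. */
static size_t setup_header_end(const uint8_t *image)
{
    return 0x0202 + (size_t)image[0x0201];
}

/*
 * Copy the setup header from the image to the same offset in a zeroed
 * boot_params buffer. Returns the number of bytes copied, or 0 if the
 * image is too short to contain the header it announces.
 */
static size_t copy_setup_header(uint8_t *boot_params, const uint8_t *image,
                                size_t image_len)
{
    size_t end;

    assert(boot_params != NULL && image != NULL);
    if (image_len < 0x0202)
        return 0;
    end = setup_header_end(image);
    if (end > image_len)
        return 0;
    memcpy(boot_params + SETUP_HDR_START, image + SETUP_HDR_START,
           end - SETUP_HDR_START);
    return end - SETUP_HDR_START;
}
```

The byte at 0x0201 is the offset operand of the jump instruction at 0x0200, so the largest possible header end is 0x0202 + 0xff, well inside the 4096-byte buffer.
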
> > +In addition to reading, modifying and writing the setup header in struct
> > +boot_params as for the 16-bit boot protocol, the boot loader should
> > +also fill in the additional fields of struct boot_params as
> > +described in zero-page.txt.
> > +
> > +After setting up struct boot_params, the boot loader can load the
> > +32/64-bit kernel in the same way as for the 16-bit boot protocol.
> > +
> > +In the 32-bit boot protocol, the kernel is started by jumping to the
> > +32-bit kernel entry point, which is the start address of the loaded
> > +32/64-bit kernel.
> > +
> > +At entry, the CPU must be in 32-bit protected mode with paging
> > +disabled; a GDT must be loaded with the descriptors for selectors
> > +__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> > +segments; __BOOT_CS must have execute/read permission, and __BOOT_DS
> > +must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> > +must be __BOOT_DS; interrupts must be disabled; %esi must hold the base
> > +address of struct boot_params; %ebp, %edi and %ebx must be zero.
> > +
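
As an illustration of the GDT requirement, the sketch below builds a four-entry table in which selectors 0x10 and 0x18 describe flat 4G segments. The descriptor packing follows the standard x86 segment descriptor layout. gdt_entry() and build_boot_gdt() are hypothetical names, and loading the table with lgdt is left out.

```c
#include <assert.h>
#include <stdint.h>

#define BOOT_CS 0x10 /* selector for GDT index 2 */
#define BOOT_DS 0x18 /* selector for GDT index 3 */

/*
 * Pack an 8-byte segment descriptor. 'flags' carries the access byte in
 * bits 0-7 and the granularity/size nibble in bits 12-15.
 */
static uint64_t gdt_entry(uint16_t flags, uint32_t base, uint32_t limit)
{
    assert(limit <= 0xfffff); /* the limit field is 20 bits wide */
    return (((uint64_t)base  & 0xff000000ULL) << 32) |
           (((uint64_t)flags & 0x0000f0ffULL) << 40) |
           (((uint64_t)limit & 0x000f0000ULL) << 32) |
           (((uint64_t)base  & 0x00ffffffULL) << 16) |
           ((uint64_t)limit & 0x0000ffffULL);
}

/* Fill a 4-entry GDT: two unused slots, then __BOOT_CS and __BOOT_DS. */
static void build_boot_gdt(uint64_t gdt[4])
{
    gdt[0] = 0;
    gdt[1] = 0;
    /* 0x9a: present, ring 0, code, execute/read; 0xc: 4K pages, 32-bit */
    gdt[BOOT_CS / 8] = gdt_entry(0xc09a, 0, 0xfffff);
    /* 0x92: present, ring 0, data, read/write */
    gdt[BOOT_DS / 8] = gdt_entry(0xc092, 0, 0xfffff);
}
```

With a base of 0 and a limit of 0xfffff pages of 4K each, both segments cover the full 4G address space as the text requires.
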
> > +64-bit BOOT PROTOCOL
> > +====================
> > +
> > +For machines with 64-bit CPUs and a 64-bit kernel, a 64-bit boot loader can
> > +be used, which requires a 64-bit boot protocol.
> > +
> > +In the 64-bit boot protocol, the first step in loading a Linux kernel
> > +should be to set up the boot parameters (struct boot_params,
> > +traditionally known as the "zero page"). The memory for struct boot_params
> > +can be allocated anywhere (even above 4G) and must be initialized to all zero.
> > +Then the setup header, starting at offset 0x01f1 of the kernel image, should
> > +be loaded into struct boot_params and examined. The end of the setup header
> > +can be calculated as follows::
> > +
> > + 0x0202 + byte value at offset 0x0201
> > +
> > +In addition to reading, modifying and writing the setup header in struct
> > +boot_params as for the 16-bit boot protocol, the boot loader should
> > +also fill in the additional fields of struct boot_params as described
> > +in zero-page.txt.
> > +
> > +After setting up struct boot_params, the boot loader can load the
> > +64-bit kernel in the same way as for the 16-bit boot protocol, except
> > +that the kernel may be loaded above 4G.
> > +
> > +In the 64-bit boot protocol, the kernel is started by jumping to the
> > +64-bit kernel entry point, which is the start address of the loaded
> > +64-bit kernel plus 0x200.
> > +
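
The two entry rules differ only by the 0x200 offset, which is easy to get wrong. A tiny sketch, with kernel_entry_point() as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address to jump to once the kernel is loaded at load_addr:
 * the start of the loaded kernel for the 32-bit protocol,
 * the start plus 0x200 for the 64-bit protocol.
 */
static uint64_t kernel_entry_point(uint64_t load_addr, int use_64bit_protocol)
{
    /* The 32-bit protocol cannot address a kernel loaded above 4G. */
    assert(use_64bit_protocol || load_addr <= 0xffffffffULL);
    return use_64bit_protocol ? load_addr + 0x200 : load_addr;
}
```

For a kernel loaded at 0x100000 this gives 0x100000 for the 32-bit protocol and 0x100200 for the 64-bit one.
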
> > +At entry, the CPU must be in 64-bit mode with paging enabled.
> > +The range of setup_header.init_size bytes from the start address of the
> > +loaded kernel, the zero page and the command line buffer must be identity mapped;
> > +a GDT must be loaded with the descriptors for selectors
> > +__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> > +segments; __BOOT_CS must have execute/read permission, and __BOOT_DS
> > +must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> > +must be __BOOT_DS; interrupts must be disabled; %rsi must hold the base
> > +address of struct boot_params.
> > +
> > +EFI HANDOVER PROTOCOL
> > +=====================
> > +
> > +This protocol allows boot loaders to defer initialisation to the EFI
> > +boot stub. The boot loader is required to load the kernel/initrd(s)
> > +from the boot media and jump to the EFI handover protocol entry point
> > +which is hdr->handover_offset bytes from the beginning of
> > +startup_{32,64}.
> > +
> > +The function prototype for the handover entry point looks like this::
> > +
> > + efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
> > +
> > +'handle' is the EFI image handle passed to the boot loader by the EFI
> > +firmware, 'table' is the EFI system table - these are the first two
> > +arguments of the "handoff state" as described in section 2.3 of the
> > +UEFI specification. 'bp' is the boot loader-allocated boot params.
> > +
> > +The boot loader *must* fill out the following fields in bp::
> > +
> > + - hdr.code32_start
> > + - hdr.cmd_line_ptr
> > + - hdr.ramdisk_image (if applicable)
> > + - hdr.ramdisk_size (if applicable)
> > +
> > +All other fields should be zero.
> > diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
> > deleted file mode 100644
> > index f4c2a97bfdbd..000000000000
> > --- a/Documentation/x86/boot.txt
> > +++ /dev/null
> > @@ -1,1130 +0,0 @@
> > - THE LINUX/x86 BOOT PROTOCOL
> > - ---------------------------
> > -
> > -On the x86 platform, the Linux kernel uses a rather complicated boot
> > -convention. This has evolved partially due to historical aspects, as
> > -well as the desire in the early days to have the kernel itself be a
> > -bootable image, the complicated PC memory model and due to changed
> > -expectations in the PC industry caused by the effective demise of
> > -real-mode DOS as a mainstream operating system.
> > -
> > -Currently, the following versions of the Linux/x86 boot protocol exist.
> > -
> > -Old kernels: zImage/Image support only. Some very early kernels
> > - may not even support a command line.
> > -
> > -Protocol 2.00: (Kernel 1.3.73) Added bzImage and initrd support, as
> > - well as a formalized way to communicate between the
> > - boot loader and the kernel. setup.S made relocatable,
> > - although the traditional setup area still assumed
> > - writable.
> > -
> > -Protocol 2.01: (Kernel 1.3.76) Added a heap overrun warning.
> > -
> > -Protocol 2.02: (Kernel 2.4.0-test3-pre3) New command line protocol.
> > - Lower the conventional memory ceiling. No overwrite
> > - of the traditional setup area, thus making booting
> > - safe for systems which use the EBDA from SMM or 32-bit
> > - BIOS entry points. zImage deprecated but still
> > - supported.
> > -
> > -Protocol 2.03: (Kernel 2.4.18-pre1) Explicitly makes the highest possible
> > - initrd address available to the bootloader.
> > -
> > -Protocol 2.04: (Kernel 2.6.14) Extend the syssize field to four bytes.
> > -
> > -Protocol 2.05: (Kernel 2.6.20) Make protected mode kernel relocatable.
> > - Introduce relocatable_kernel and kernel_alignment fields.
> > -
> > -Protocol 2.06: (Kernel 2.6.22) Added a field that contains the size of
> > - the boot command line.
> > -
> > -Protocol 2.07: (Kernel 2.6.24) Added paravirtualised boot protocol.
> > - Introduced hardware_subarch and hardware_subarch_data
> > - and KEEP_SEGMENTS flag in load_flags.
> > -
> > -Protocol 2.08: (Kernel 2.6.26) Added crc32 checksum and ELF format
> > - payload. Introduced payload_offset and payload_length
> > - fields to aid in locating the payload.
> > -
> > -Protocol 2.09: (Kernel 2.6.26) Added a field of 64-bit physical
> > - pointer to single linked list of struct setup_data.
> > -
> > -Protocol 2.10: (Kernel 2.6.31) Added a protocol for relaxed alignment
> > - beyond the kernel_alignment added, new init_size and
> > - pref_address fields. Added extended boot loader IDs.
> > -
> > -Protocol 2.11: (Kernel 3.6) Added a field for offset of EFI handover
> > - protocol entry point.
> > -
> > -Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
> > - to struct boot_params for loading bzImage and ramdisk
> > - above 4G in 64bit.
> > -
> > -**** MEMORY LAYOUT
> > -
> > -The traditional memory map for the kernel loader, used for Image or
> > -zImage kernels, typically looks like:
> > -
> > - | |
> > -0A0000 +------------------------+
> > - | Reserved for BIOS | Do not use. Reserved for BIOS EBDA.
> > -09A000 +------------------------+
> > - | Command line |
> > - | Stack/heap | For use by the kernel real-mode code.
> > -098000 +------------------------+
> > - | Kernel setup | The kernel real-mode code.
> > -090200 +------------------------+
> > - | Kernel boot sector | The kernel legacy boot sector.
> > -090000 +------------------------+
> > - | Protected-mode kernel | The bulk of the kernel image.
> > -010000 +------------------------+
> > - | Boot loader | <- Boot sector entry point 0000:7C00
> > -001000 +------------------------+
> > - | Reserved for MBR/BIOS |
> > -000800 +------------------------+
> > - | Typically used by MBR |
> > -000600 +------------------------+
> > - | BIOS use only |
> > -000000 +------------------------+
> > -
> > -
> > -When using bzImage, the protected-mode kernel was relocated to
> > -0x100000 ("high memory"), and the kernel real-mode block (boot sector,
> > -setup, and stack/heap) was made relocatable to any address between
> > -0x10000 and end of low memory. Unfortunately, in protocols 2.00 and
> > -2.01 the 0x90000+ memory range is still used internally by the kernel;
> > -the 2.02 protocol resolves that problem.
> > -
> > -It is desirable to keep the "memory ceiling" -- the highest point in
> > -low memory touched by the boot loader -- as low as possible, since
> > -some newer BIOSes have begun to allocate some rather large amounts of
> > -memory, called the Extended BIOS Data Area, near the top of low
> > -memory. The boot loader should use the "INT 12h" BIOS call to verify
> > -how much low memory is available.
> > -
> > -Unfortunately, if INT 12h reports that the amount of memory is too
> > -low, there is usually nothing the boot loader can do but to report an
> > -error to the user. The boot loader should therefore be designed to
> > -take up as little space in low memory as it reasonably can. For
> > -zImage or old bzImage kernels, which need data written into the
> > -0x90000 segment, the boot loader should make sure not to use memory
> > -above the 0x9A000 point; too many BIOSes will break above that point.
> > -
> > -For a modern bzImage kernel with boot protocol version >= 2.02, a
> > -memory layout like the following is suggested:
> > -
> > - ~ ~
> > - | Protected-mode kernel |
> > -100000 +------------------------+
> > - | I/O memory hole |
> > -0A0000 +------------------------+
> > - | Reserved for BIOS | Leave as much as possible unused
> > - ~ ~
> > - | Command line | (Can also be below the X+10000 mark)
> > -X+10000 +------------------------+
> > - | Stack/heap | For use by the kernel real-mode code.
> > -X+08000 +------------------------+
> > - | Kernel setup | The kernel real-mode code.
> > - | Kernel boot sector | The kernel legacy boot sector.
> > -X +------------------------+
> > - | Boot loader | <- Boot sector entry point 0000:7C00
> > -001000 +------------------------+
> > - | Reserved for MBR/BIOS |
> > -000800 +------------------------+
> > - | Typically used by MBR |
> > -000600 +------------------------+
> > - | BIOS use only |
> > -000000 +------------------------+
> > -
> > -... where the address X is as low as the design of the boot loader
> > -permits.
> > -
> > -
> > -**** THE REAL-MODE KERNEL HEADER
> > -
> > -In the following text, and anywhere in the kernel boot sequence, "a
> > -sector" refers to 512 bytes. It is independent of the actual sector
> > -size of the underlying medium.
> > -
> > -The first step in loading a Linux kernel should be to load the
> > -real-mode code (boot sector and setup code) and then examine the
> > -following header at offset 0x01f1. The real-mode code can total up to
> > -32K, although the boot loader may choose to load only the first two
> > -sectors (1K) and then examine the bootup sector size.
> > -
> > -The header looks like:
> > -
> > -Offset Proto Name Meaning
> > -/Size
> > -
> > -01F1/1 ALL(1 setup_sects The size of the setup in sectors
> > -01F2/2 ALL root_flags If set, the root is mounted readonly
> > -01F4/4 2.04+(2 syssize The size of the 32-bit code in 16-byte paras
> > -01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only
> > -01FA/2 ALL vid_mode Video mode control
> > -01FC/2 ALL root_dev Default root device number
> > -01FE/2 ALL boot_flag 0xAA55 magic number
> > -0200/2 2.00+ jump Jump instruction
> > -0202/4 2.00+ header Magic signature "HdrS"
> > -0206/2 2.00+ version Boot protocol version supported
> > -0208/4 2.00+ realmode_swtch Boot loader hook (see below)
> > -020C/2 2.00+ start_sys_seg The load-low segment (0x1000) (obsolete)
> > -020E/2 2.00+ kernel_version Pointer to kernel version string
> > -0210/1 2.00+ type_of_loader Boot loader identifier
> > -0211/1 2.00+ loadflags Boot protocol option flags
> > -0212/2 2.00+ setup_move_size Move to high memory size (used with hooks)
> > -0214/4 2.00+ code32_start Boot loader hook (see below)
> > -0218/4 2.00+ ramdisk_image initrd load address (set by boot loader)
> > -021C/4 2.00+ ramdisk_size initrd size (set by boot loader)
> > -0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only
> > -0224/2 2.01+ heap_end_ptr Free memory after setup end
> > -0226/1 2.02+(3 ext_loader_ver Extended boot loader version
> > -0227/1 2.02+(3 ext_loader_type Extended boot loader ID
> > -0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line
> > -022C/4 2.03+ initrd_addr_max Highest legal initrd address
> > -0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
> > -0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
> > -0235/1 2.10+ min_alignment Minimum alignment, as a power of two
> > -0236/2 2.12+ xloadflags Boot protocol option flags
> > -0238/4 2.06+ cmdline_size Maximum size of the kernel command line
> > -023C/4 2.07+ hardware_subarch Hardware subarchitecture
> > -0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data
> > -0248/4 2.08+ payload_offset Offset of kernel payload
> > -024C/4 2.08+ payload_length Length of kernel payload
> > -0250/8 2.09+ setup_data 64-bit physical pointer to linked list
> > - of struct setup_data
> > -0258/8 2.10+ pref_address Preferred loading address
> > -0260/4 2.10+ init_size Linear memory required during initialization
> > -0264/4 2.11+ handover_offset Offset of handover entry point
> > -
> > -(1) For backwards compatibility, if the setup_sects field contains 0, the
> > - real value is 4.
> > -
> > -(2) For boot protocol prior to 2.04, the upper two bytes of the syssize
> > - field are unusable, which means the size of a bzImage kernel
> > - cannot be determined.
> > -
> > -(3) Ignored, but safe to set, for boot protocols 2.02-2.09.
> > -
> > -If the "HdrS" (0x53726448) magic number is not found at offset 0x202,
> > -the boot protocol version is "old". Loading an old kernel, the
> > -following parameters should be assumed:
> > -
> > - Image type = zImage
> > - initrd not supported
> > - Real-mode kernel must be located at 0x90000.
> > -
> > -Otherwise, the "version" field contains the protocol version,
> > -e.g. protocol version 2.01 will contain 0x0201 in this field. When
> > -setting fields in the header, you must make sure only to set fields
> > -supported by the protocol version in use.
> > -
> > -
> > -**** DETAILS OF HEADER FIELDS
> > -
> > -For each field, some are information from the kernel to the bootloader
> > -("read"), some are expected to be filled out by the bootloader
> > -("write"), and some are expected to be read and modified by the
> > -bootloader ("modify").
> > -
> > -All general purpose boot loaders should write the fields marked
> > -(obligatory). Boot loaders who want to load the kernel at a
> > -nonstandard address should fill in the fields marked (reloc); other
> > -boot loaders can ignore those fields.
> > -
> > -The byte order of all fields is littleendian (this is x86, after all.)
> > -
> > -Field name: setup_sects
> > -Type: read
> > -Offset/size: 0x1f1/1
> > -Protocol: ALL
> > -
> > - The size of the setup code in 512-byte sectors. If this field is
> > - 0, the real value is 4. The real-mode code consists of the boot
> > - sector (always one 512-byte sector) plus the setup code.
> > -
> > -Field name: root_flags
> > -Type: modify (optional)
> > -Offset/size: 0x1f2/2
> > -Protocol: ALL
> > -
> > - If this field is nonzero, the root defaults to readonly. The use of
> > - this field is deprecated; use the "ro" or "rw" options on the
> > - command line instead.
> > -
> > -Field name: syssize
> > -Type: read
> > -Offset/size: 0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
> > -Protocol: 2.04+
> > -
> > - The size of the protected-mode code in units of 16-byte paragraphs.
> > - For protocol versions older than 2.04 this field is only two bytes
> > - wide, and therefore cannot be trusted for the size of a kernel if
> > - the LOAD_HIGH flag is set.
> > -
> > -Field name: ram_size
> > -Type: kernel internal
> > -Offset/size: 0x1f8/2
> > -Protocol: ALL
> > -
> > - This field is obsolete.
> > -
> > -Field name: vid_mode
> > -Type: modify (obligatory)
> > -Offset/size: 0x1fa/2
> > -
> > - Please see the section on SPECIAL COMMAND LINE OPTIONS.
> > -
> > -Field name: root_dev
> > -Type: modify (optional)
> > -Offset/size: 0x1fc/2
> > -Protocol: ALL
> > -
> > - The default root device device number. The use of this field is
> > - deprecated, use the "root=" option on the command line instead.
> > -
> > -Field name: boot_flag
> > -Type: read
> > -Offset/size: 0x1fe/2
> > -Protocol: ALL
> > -
> > - Contains 0xAA55. This is the closest thing old Linux kernels have
> > - to a magic number.
> > -
> > -Field name: jump
> > -Type: read
> > -Offset/size: 0x200/2
> > -Protocol: 2.00+
> > -
> > - Contains an x86 jump instruction, 0xEB followed by a signed offset
> > - relative to byte 0x202. This can be used to determine the size of
> > - the header.
> > -
> > -Field name: header
> > -Type: read
> > -Offset/size: 0x202/4
> > -Protocol: 2.00+
> > -
> > - Contains the magic number "HdrS" (0x53726448).
> > -
> > -Field name: version
> > -Type: read
> > -Offset/size: 0x206/2
> > -Protocol: 2.00+
> > -
> > - Contains the boot protocol version, in (major << 8)+minor format,
> > - e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
> > - 10.17.
> > -
> > -Field name: realmode_swtch
> > -Type: modify (optional)
> > -Offset/size: 0x208/4
> > -Protocol: 2.00+
> > -
> > - Boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> > -
> > -Field name: start_sys_seg
> > -Type: read
> > -Offset/size: 0x20c/2
> > -Protocol: 2.00+
> > -
> > - The load low segment (0x1000). Obsolete.
> > -
> > -Field name: kernel_version
> > -Type: read
> > -Offset/size: 0x20e/2
> > -Protocol: 2.00+
> > -
> > - If set to a nonzero value, contains a pointer to a NUL-terminated
> > - human-readable kernel version number string, less 0x200. This can
> > - be used to display the kernel version to the user. This value
> > - should be less than (0x200*setup_sects).
> > -
> > - For example, if this value is set to 0x1c00, the kernel version
> > - number string can be found at offset 0x1e00 in the kernel file.
> > - This is a valid value if and only if the "setup_sects" field
> > - contains the value 15 or higher, as:
> > -
> > - 0x1c00 < 15*0x200 (= 0x1e00) but
> > - 0x1c00 >= 14*0x200 (= 0x1c00)
> > -
> > - 0x1c00 >> 9 = 14, so the minimum value for setup_secs is 15.
> > -
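
The kernel_version rule quoted above can be written down as a short check. This is a sketch with a hypothetical function name. It assumes that a stored setup_sects of 0 is treated as 4, as stated for that field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * File offset of the kernel version string, or -1 if the field is zero
 * or points outside the setup code.
 */
static long kernel_version_offset(uint16_t kernel_version, uint8_t setup_sects)
{
    unsigned sects = setup_sects ? setup_sects : 4;
    long off;

    if (kernel_version == 0 || kernel_version >= 0x200u * sects)
        return -1;

    /* The stored pointer is "less 0x200", so add the boot sector back. */
    off = (long)kernel_version + 0x200;
    assert(off < 0x200L * (sects + 1)); /* still inside the real-mode code */
    return off;
}
```

With the example value 0x1c00 this yields offset 0x1e00 when setup_sects is 15, and rejects the field when setup_sects is 14.
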
> > -Field name: type_of_loader
> > -Type: write (obligatory)
> > -Offset/size: 0x210/1
> > -Protocol: 2.00+
> > -
> > - If your boot loader has an assigned id (see table below), enter
> > - 0xTV here, where T is an identifier for the boot loader and V is
> > - a version number. Otherwise, enter 0xFF here.
> > -
> > - For boot loader IDs above T = 0xD, write T = 0xE to this field and
> > - write the extended ID minus 0x10 to the ext_loader_type field.
> > - Similarly, the ext_loader_ver field can be used to provide more than
> > - four bits for the bootloader version.
> > -
> > - For example, for T = 0x15, V = 0x234, write:
> > -
> > - type_of_loader <- 0xE4
> > - ext_loader_type <- 0x05
> > - ext_loader_ver <- 0x23
> > -
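
The split of a loader ID and version across the three fields can be sketched as follows. struct loader_id and encode_loader_id() are hypothetical names. The encoding itself is the one described in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical grouping of the three header fields involved. */
struct loader_id {
    uint8_t type_of_loader;
    uint8_t ext_loader_type;
    uint8_t ext_loader_ver;
};

/* Encode loader type T and version V into the header fields. */
static struct loader_id encode_loader_id(unsigned type, unsigned version)
{
    struct loader_id id;

    /* 0xE is the escape value itself; 12 bits of version are available. */
    assert(type != 0xE && type <= 0x10F && version <= 0xFFF);

    if (type >= 0x10) {
        /* Extended ID: T = 0xE in the high nibble, the rest in ext_loader_type. */
        id.type_of_loader = (uint8_t)(0xE0 | (version & 0x0F));
        id.ext_loader_type = (uint8_t)(type - 0x10);
    } else {
        id.type_of_loader = (uint8_t)((type << 4) | (version & 0x0F));
        id.ext_loader_type = 0;
    }
    /* Version bits above the low nibble go to ext_loader_ver. */
    id.ext_loader_ver = (uint8_t)(version >> 4);
    return id;
}
```

For T = 0x15 and V = 0x234 this produces 0xE4, 0x05 and 0x23, the same values as the example in the text.
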
> > - Assigned boot loader ids (hexadecimal):
> > -
> > - 0 LILO (0x00 reserved for pre-2.00 bootloader)
> > - 1 Loadlin
> > - 2 bootsect-loader (0x20, all other values reserved)
> > - 3 Syslinux
> > - 4 Etherboot/gPXE/iPXE
> > - 5 ELILO
> > - 7 GRUB
> > - 8 U-Boot
> > - 9 Xen
> > - A Gujin
> > - B Qemu
> > - C Arcturus Networks uCbootloader
> > - D kexec-tools
> > - E Extended (see ext_loader_type)
> > - F Special (0xFF = undefined)
> > - 10 Reserved
> > - 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de>
> > - 12 OVMF UEFI virtualization stack
> > -
> > - Please contact <[email protected]> if you need a bootloader ID
> > - value assigned.
> > -
> > -Field name: loadflags
> > -Type: modify (obligatory)
> > -Offset/size: 0x211/1
> > -Protocol: 2.00+
> > -
> > - This field is a bitmask.
> > -
> > - Bit 0 (read): LOADED_HIGH
> > - - If 0, the protected-mode code is loaded at 0x10000.
> > - - If 1, the protected-mode code is loaded at 0x100000.
> > -
> > - Bit 1 (kernel internal): KASLR_FLAG
> > - - Used internally by the compressed kernel to communicate
> > - KASLR status to kernel proper.
> > - If 1, KASLR enabled.
> > - If 0, KASLR disabled.
> > -
> > - Bit 5 (write): QUIET_FLAG
> > - - If 0, print early messages.
> > - - If 1, suppress early messages.
> > - This requests to the kernel (decompressor and early
> > - kernel) to not write early messages that require
> > - accessing the display hardware directly.
> > -
> > - Bit 6 (write): KEEP_SEGMENTS
> > - Protocol: 2.07+
> > - - If 0, reload the segment registers in the 32bit entry point.
> > - - If 1, do not reload the segment registers in the 32bit entry point.
> > - Assume that %cs %ds %ss %es are all set to flat segments with
> > - a base of 0 (or the equivalent for their environment).
> > -
> > - Bit 7 (write): CAN_USE_HEAP
> > - Set this bit to 1 to indicate that the value entered in the
> > - heap_end_ptr is valid. If this field is clear, some setup code
> > - functionality will be disabled.
> > -
> > -Field name: setup_move_size
> > -Type: modify (obligatory)
> > -Offset/size: 0x212/2
> > -Protocol: 2.00-2.01
> > -
> > - When using protocol 2.00 or 2.01, if the real mode kernel is not
> > - loaded at 0x90000, it gets moved there later in the loading
> > - sequence. Fill in this field if you want additional data (such as
> > - the kernel command line) moved in addition to the real-mode kernel
> > - itself.
> > -
> > - The unit is bytes starting with the beginning of the boot sector.
> > -
> > - This field is can be ignored when the protocol is 2.02 or higher, or
> > - if the real-mode code is loaded at 0x90000.
> > -
> > -Field name: code32_start
> > -Type: modify (optional, reloc)
> > -Offset/size: 0x214/4
> > -Protocol: 2.00+
> > -
> > - The address to jump to in protected mode. This defaults to the load
> > - address of the kernel, and can be used by the boot loader to
> > - determine the proper load address.
> > -
> > - This field can be modified for two purposes:
> > -
> > - 1. as a boot loader hook (see ADVANCED BOOT LOADER HOOKS below.)
> > -
> > - 2. if a bootloader which does not install a hook loads a
> > - relocatable kernel at a nonstandard address it will have to modify
> > - this field to point to the load address.
> > -
> > -Field name: ramdisk_image
> > -Type: write (obligatory)
> > -Offset/size: 0x218/4
> > -Protocol: 2.00+
> > -
> > - The 32-bit linear address of the initial ramdisk or ramfs. Leave at
> > - zero if there is no initial ramdisk/ramfs.
> > -
> > -Field name: ramdisk_size
> > -Type: write (obligatory)
> > -Offset/size: 0x21c/4
> > -Protocol: 2.00+
> > -
> > - Size of the initial ramdisk or ramfs. Leave at zero if there is no
> > - initial ramdisk/ramfs.
> > -
> > -Field name: bootsect_kludge
> > -Type: kernel internal
> > -Offset/size: 0x220/4
> > -Protocol: 2.00+
> > -
> > - This field is obsolete.
> > -
> > -Field name: heap_end_ptr
> > -Type: write (obligatory)
> > -Offset/size: 0x224/2
> > -Protocol: 2.01+
> > -
> > - Set this field to the offset (from the beginning of the real-mode
> > - code) of the end of the setup stack/heap, minus 0x0200.
> > -
> > -Field name: ext_loader_ver
> > -Type: write (optional)
> > -Offset/size: 0x226/1
> > -Protocol: 2.02+
> > -
> > - This field is used as an extension of the version number in the
> > - type_of_loader field. The total version number is considered to be
> > - (type_of_loader & 0x0f) + (ext_loader_ver << 4).
> > -
> > - The use of this field is boot loader specific. If not written, it
> > - is zero.
> > -
> > - Kernels prior to 2.6.31 did not recognize this field, but it is safe
> > - to write for protocol version 2.02 or higher.
> > -
> > -Field name: ext_loader_type
> > -Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0)
> > -Offset/size: 0x227/1
> > -Protocol: 2.02+
> > -
> > - This field is used as an extension of the type number in
> > - type_of_loader field. If the type in type_of_loader is 0xE, then
> > - the actual type is (ext_loader_type + 0x10).
> > -
> > - This field is ignored if the type in type_of_loader is not 0xE.
> > -
> > - Kernels prior to 2.6.31 did not recognize this field, but it is safe
> > - to write for protocol version 2.02 or higher.
> > -
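For reference, the two extension fields combine with type_of_loader roughly as sketched here. This only illustrates the arithmetic quoted above; the name `decode_loader` is made up for the example and is not a kernel or boot loader API.

```python
def decode_loader(type_of_loader, ext_loader_ver=0, ext_loader_type=0):
    """Return (loader_type, loader_version) from the three header bytes.

    Hypothetical helper: it only mirrors the formulas in the field
    descriptions quoted above.
    """
    # The low nibble of type_of_loader holds the low four version bits;
    # ext_loader_ver supplies the bits above them.
    version = (type_of_loader & 0x0F) + (ext_loader_ver << 4)

    # The high nibble is the loader type; 0xE means "see ext_loader_type".
    loader_type = (type_of_loader >> 4) & 0x0F
    if loader_type == 0xE:
        loader_type = ext_loader_type + 0x10
    return loader_type, version
```

With type_of_loader = 0xE3, ext_loader_ver = 0x01 and ext_loader_type = 0x01, this yields type 0x11 and version 0x13.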
> > -Field name: cmd_line_ptr
> > -Type: write (obligatory)
> > -Offset/size: 0x228/4
> > -Protocol: 2.02+
> > -
> > - Set this field to the linear address of the kernel command line.
> > - The kernel command line can be located anywhere between the end of
> > - the setup heap and 0xA0000; it does not have to be located in the
> > - same 64K segment as the real-mode code itself.
> > -
> > - Fill in this field even if your boot loader does not support a
> > - command line, in which case you can point this to an empty string
> > - (or better yet, to the string "auto".) If this field is left at
> > - zero, the kernel will assume that your boot loader does not support
> > - the 2.02+ protocol.
> > -
> > -Field name: initrd_addr_max
> > -Type: read
> > -Offset/size: 0x22c/4
> > -Protocol: 2.03+
> > -
> > - The maximum address that may be occupied by the initial
> > - ramdisk/ramfs contents. For boot protocols 2.02 or earlier, this
> > - field is not present, and the maximum address is 0x37FFFFFF. (This
> > - address is defined as the address of the highest safe byte, so if
> > - your ramdisk is exactly 131072 bytes long and this field is
> > - 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
> > -
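The "highest safe byte" wording is easy to get off by one, so here is a small sketch of the arithmetic. The function name is hypothetical, and no page alignment is applied because the text above does not ask for any.

```python
def highest_ramdisk_start(initrd_addr_max, ramdisk_size):
    """Highest start address such that the last ramdisk byte is still safe.

    Hypothetical helper; initrd_addr_max is the address of the highest
    usable byte, so the end of the ramdisk is inclusive.
    """
    if ramdisk_size <= 0:
        raise ValueError("ramdisk_size must be positive")
    return initrd_addr_max - ramdisk_size + 1
```

For the example in the field description, `highest_ramdisk_start(0x37FFFFFF, 131072)` evaluates to 0x37FE0000.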
> > -Field name: kernel_alignment
> > -Type: read/modify (reloc)
> > -Offset/size: 0x230/4
> > -Protocol: 2.05+ (read), 2.10+ (modify)
> > -
> > - Alignment unit required by the kernel (if relocatable_kernel is
> > - true.) A relocatable kernel that is loaded at an alignment
> > - incompatible with the value in this field will be realigned during
> > - kernel initialization.
> > -
> > - Starting with protocol version 2.10, this reflects the kernel
> > - alignment preferred for optimal performance; it is possible for the
> > - loader to modify this field to permit a lesser alignment. See the
> > - min_alignment and pref_address field below.
> > -
> > -Field name: relocatable_kernel
> > -Type: read (reloc)
> > -Offset/size: 0x234/1
> > -Protocol: 2.05+
> > -
> > - If this field is nonzero, the protected-mode part of the kernel can
> > - be loaded at any address that satisfies the kernel_alignment field.
> > - After loading, the boot loader must set the code32_start field to
> > - point to the loaded code, or to a boot loader hook.
> > -
> > -Field name: min_alignment
> > -Type: read (reloc)
> > -Offset/size: 0x235/1
> > -Protocol: 2.10+
> > -
> > - This field, if nonzero, indicates as a power of two the minimum
> > - alignment required, as opposed to preferred, by the kernel to boot.
> > - If a boot loader makes use of this field, it should update the
> > - kernel_alignment field with the alignment unit desired; typically:
> > -
> > - kernel_alignment = 1 << min_alignment
> > -
> > - There may be a considerable performance cost with an excessively
> > - misaligned kernel. Therefore, a loader should typically try each
> > - power-of-two alignment from kernel_alignment down to this alignment.
> > -
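The "try each power-of-two alignment" advice could look like the sketch below. It assumes kernel_alignment is itself a power of two and treats min_alignment == 0 as "field not usable"; the function name is invented for illustration.

```python
def alignment_candidates(kernel_alignment, min_alignment):
    """List alignments to try, from the preferred one down to the minimum.

    Hypothetical helper. kernel_alignment is a byte count (assumed to be a
    power of two); min_alignment is the exponent from the header field.
    """
    if min_alignment == 0:
        # The field is only meaningful when nonzero.
        return [kernel_alignment]

    floor = 1 << min_alignment
    candidates = []
    align = kernel_alignment
    while align >= floor:
        candidates.append(align)
        align >>= 1  # next smaller power of two
    return candidates
```

A loader would walk this list in order, pick the first alignment it can satisfy, and write that value back into kernel_alignment.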
> > -Field name: xloadflags
> > -Type: read
> > -Offset/size: 0x236/2
> > -Protocol: 2.12+
> > -
> > - This field is a bitmask.
> > -
> > - Bit 0 (read): XLF_KERNEL_64
> > - - If 1, this kernel has the legacy 64-bit entry point at 0x200.
> > -
> > - Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G
> > - - If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.
> > -
> > - Bit 2 (read): XLF_EFI_HANDOVER_32
> > - - If 1, the kernel supports the 32-bit EFI handoff entry point
> > - given at handover_offset.
> > -
> > - Bit 3 (read): XLF_EFI_HANDOVER_64
> > - - If 1, the kernel supports the 64-bit EFI handoff entry point
> > - given at handover_offset + 0x200.
> > -
> > - Bit 4 (read): XLF_EFI_KEXEC
> > - - If 1, the kernel supports kexec EFI boot with EFI runtime support.
> > -
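Decoding the bitmask is mechanical; a minimal sketch follows. The bit numbers and flag names come from the list above, while the function and table names are made up for the example.

```python
# Bit number -> flag name, as listed in the xloadflags description.
XLOADFLAGS_BITS = {
    0: "XLF_KERNEL_64",
    1: "XLF_CAN_BE_LOADED_ABOVE_4G",
    2: "XLF_EFI_HANDOVER_32",
    3: "XLF_EFI_HANDOVER_64",
    4: "XLF_EFI_KEXEC",
}


def decode_xloadflags(value):
    """Return the names of the flags set in a 16-bit xloadflags value."""
    return [name for bit, name in sorted(XLOADFLAGS_BITS.items())
            if value & (1 << bit)]
```

For instance, a value of 0x000B has bits 0, 1 and 3 set, so the result lists XLF_KERNEL_64, XLF_CAN_BE_LOADED_ABOVE_4G and XLF_EFI_HANDOVER_64.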
> > -Field name: cmdline_size
> > -Type: read
> > -Offset/size: 0x238/4
> > -Protocol: 2.06+
> > -
> > - The maximum size of the command line without the terminating
> > - zero. This means that the command line can contain at most
> > - cmdline_size characters. With protocol version 2.05 and earlier, the
> > - maximum size was 255.
> > -
> > -Field name: hardware_subarch
> > -Type: write (optional, defaults to x86/PC)
> > -Offset/size: 0x23c/4
> > -Protocol: 2.07+
> > -
> > - In a paravirtualized environment the hardware low level architectural
> > - pieces such as interrupt handling, page table handling, and
> > - accessing process control registers need to be done differently.
> > -
> > - This field allows the bootloader to inform the kernel that we are in
> > - one of those environments.
> > -
> > - 0x00000000 The default x86/PC environment
> > - 0x00000001 lguest
> > - 0x00000002 Xen
> > - 0x00000003 Moorestown MID
> > - 0x00000004 CE4100 TV Platform
> > -
> > -Field name: hardware_subarch_data
> > -Type: write (subarch-dependent)
> > -Offset/size: 0x240/8
> > -Protocol: 2.07+
> > -
> > - A pointer to data that is specific to the hardware subarch.
> > - This field is currently unused for the default x86/PC environment;
> > - do not modify.
> > -
> > -Field name: payload_offset
> > -Type: read
> > -Offset/size: 0x248/4
> > -Protocol: 2.08+
> > -
> > - If non-zero then this field contains the offset from the beginning
> > - of the protected-mode code to the payload.
> > -
> > - The payload may be compressed. The format of both the compressed and
> > - uncompressed data should be determined using the standard magic
> > - numbers. The currently supported compression formats are gzip
> > - (magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA
> > - (magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number
> > - 02 21). The uncompressed payload is currently always ELF (magic
> > - number 7F 45 4C 46).
> > -
> > -Field name: payload_length
> > -Type: read
> > -Offset/size: 0x24c/4
> > -Protocol: 2.08+
> > -
> > - The length of the payload.
> > -
> > -Field name: setup_data
> > -Type: write (special)
> > -Offset/size: 0x250/8
> > -Protocol: 2.09+
> > -
> > - The 64-bit physical pointer to a NULL-terminated singly linked list of
> > - struct setup_data. This is used to define a more extensible boot
> > - parameters passing mechanism. The definition of struct setup_data is
> > - as follows:
> > -
> > -    struct setup_data {
> > -        u64 next;
> > -        u32 type;
> > -        u32 len;
> > -        u8  data[0];
> > -    };
> > -
> > - Where, the next is a 64-bit physical pointer to the next node of
> > - linked list, the next field of the last node is 0; the type is used
> > - to identify the contents of data; the len is the length of data
> > - field; the data holds the real payload.
> > -
> > - This list may be modified at a number of points during the bootup
> > - process. Therefore, when modifying this list one should always make
> > - sure to consider the case where the linked list already contains
> > - entries.
> > -
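To make the list layout concrete, here is a sketch that walks such a list. It models physical memory as a flat byte buffer indexed by address, which is purely an assumption for illustration; the names `SETUP_DATA_HEADER` and `walk_setup_data` are hypothetical.

```python
import struct

# next (u64), type (u32), len (u32); little-endian, as on x86.
SETUP_DATA_HEADER = struct.Struct("<QII")


def walk_setup_data(memory, head):
    """Return [(type, data_bytes), ...] for each node in the list.

    memory: bytes-like object standing in for physical memory (assumption).
    head:   address of the first node, 0 for an empty list.
    """
    entries = []
    seen = set()
    addr = head
    while addr != 0:
        if addr in seen:
            raise ValueError("setup_data list contains a loop")
        seen.add(addr)

        next_addr, node_type, length = SETUP_DATA_HEADER.unpack_from(memory, addr)
        start = addr + SETUP_DATA_HEADER.size
        entries.append((node_type, bytes(memory[start:start + length])))
        addr = next_addr  # 0 terminates the list
    return entries
```

Code that appends a node would walk to the last entry in the same way and store the new node's address in that entry's next field, which covers the "list already contains entries" case mentioned above.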
> > -Field name: pref_address
> > -Type: read (reloc)
> > -Offset/size: 0x258/8
> > -Protocol: 2.10+
> > -
> > - This field, if nonzero, represents a preferred load address for the
> > - kernel. A relocating bootloader should attempt to load at this
> > - address if possible.
> > -
> > - A non-relocatable kernel will unconditionally move itself to this
> > - address and run there.
> > -
> > -Field name: init_size
> > -Type: read
> > -Offset/size: 0x260/4
> > -
> > - This field indicates the amount of linear contiguous memory starting
> > - at the kernel runtime start address that the kernel needs before it
> > - is capable of examining its memory map. This is not the same thing
> > - as the total amount of memory the kernel needs to boot, but it can
> > - be used by a relocating boot loader to help select a safe load
> > - address for the kernel.
> > -
> > - The kernel runtime start address is determined by the following algorithm:
> > -
> > -    if (relocatable_kernel)
> > -        runtime_start = align_up(load_address, kernel_alignment)
> > -    else
> > -        runtime_start = pref_address
> > -
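The algorithm above leaves align_up undefined. A runnable sketch, assuming kernel_alignment is a power of two (the mask trick below relies on that):

```python
def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


def runtime_start(relocatable_kernel, load_address, kernel_alignment, pref_address):
    """Mirror the runtime start address rule from the init_size description."""
    if relocatable_kernel:
        return align_up(load_address, kernel_alignment)
    return pref_address
```

A relocating loader could then check that the range from runtime_start to runtime_start + init_size is free before committing to a load address.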
> > -Field name: handover_offset
> > -Type: read
> > -Offset/size: 0x264/4
> > -
> > - This field is the offset from the beginning of the kernel image to
> > - the EFI handover protocol entry point. Boot loaders using the EFI
> > - handover protocol to boot the kernel should jump to this offset.
> > -
> > - See EFI HANDOVER PROTOCOL below for more details.
> > -
> > -
> > -**** THE IMAGE CHECKSUM
> > -
> > -From boot protocol version 2.08 onwards the CRC-32 is calculated over
> > -the entire file using the characteristic polynomial 0x04C11DB7 and an
> > -initial remainder of 0xffffffff. The checksum is appended to the
> > -file; therefore the CRC of the file up to the limit specified in the
> > -syssize field of the header is always 0.
> > -
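The "CRC of the whole file is 0" property can be demonstrated in a few lines. This sketch assumes the bit-reflected (zlib-style) form of the 0x04C11DB7 polynomial, no final inversion, and a little-endian appended value. Those details are my assumptions; the text above does not spell them out.

```python
import struct
import zlib


def raw_crc32(data):
    """CRC-32 register after data, initial value 0xffffffff, no final XOR.

    zlib.crc32() inverts the register at the end, so undo that here.
    """
    return zlib.crc32(data) ^ 0xFFFFFFFF


def append_checksum(image):
    """Return image with its raw CRC appended as 4 little-endian bytes."""
    return image + struct.pack("<I", raw_crc32(image))
```

Under these assumptions, `raw_crc32(append_checksum(image))` comes out as 0 for any image, which is the check a boot loader could run over the file.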
> > -
> > -**** THE KERNEL COMMAND LINE
> > -
> > -The kernel command line has become an important way for the boot
> > -loader to communicate with the kernel. Some of its options are also
> > -relevant to the boot loader itself, see "special command line options"
> > -below.
> > -
> > -The kernel command line is a null-terminated string. The maximum
> > -length can be retrieved from the field cmdline_size. Before protocol
> > -version 2.06, the maximum was 255 characters. A string that is too
> > -long will be automatically truncated by the kernel.
> > -
> > -If the boot protocol version is 2.02 or later, the address of the
> > -kernel command line is given by the header field cmd_line_ptr (see
> > -above.) This address can be anywhere between the end of the setup
> > -heap and 0xA0000.
> > -
> > -If the protocol version is *not* 2.02 or higher, the kernel
> > -command line is entered using the following protocol:
> > -
> > - At offset 0x0020 (word), "cmd_line_magic", enter the magic
> > - number 0xA33F.
> > -
> > - At offset 0x0022 (word), "cmd_line_offset", enter the offset
> > - of the kernel command line (relative to the start of the
> > - real-mode kernel).
> > -
> > - The kernel command line *must* be within the memory region
> > - covered by setup_move_size, so you may need to adjust this
> > - field.
> > -
> > -
> > -**** MEMORY LAYOUT OF THE REAL-MODE CODE
> > -
> > -The real-mode code requires a stack/heap to be set up, as well as
> > -memory allocated for the kernel command line. This needs to be done
> > -in the real-mode accessible memory in bottom megabyte.
> > -
> > -It should be noted that modern machines often have a sizable Extended
> > -BIOS Data Area (EBDA). As a result, it is advisable to use as little
> > -of the low megabyte as possible.
> > -
> > -Unfortunately, under the following circumstances the 0x90000 memory
> > -segment has to be used:
> > -
> > - - When loading a zImage kernel ((loadflags & 0x01) == 0).
> > - - When loading a 2.01 or earlier boot protocol kernel.
> > -
> > - -> For the 2.00 and 2.01 boot protocols, the real-mode code
> > - can be loaded at another address, but it is internally
> > - relocated to 0x90000. For the "old" protocol, the
> > - real-mode code must be loaded at 0x90000.
> > -
> > -When loading at 0x90000, avoid using memory above 0x9a000.
> > -
> > -For boot protocol 2.02 or higher, the command line does not have to be
> > -located in the same 64K segment as the real-mode setup code; it is
> > -thus permitted to give the stack/heap the full 64K segment and locate
> > -the command line above it.
> > -
> > -The kernel command line should not be located below the real-mode
> > -code, nor should it be located in high memory.
> > -
> > -
> > -**** SAMPLE BOOT CONFIGURATION
> > -
> > -As a sample configuration, assume the following layout of the real
> > -mode segment:
> > -
> > - When loading below 0x90000, use the entire segment:
> > -
> > - 0x0000-0x7fff Real mode kernel
> > - 0x8000-0xdfff Stack and heap
> > - 0xe000-0xffff Kernel command line
> > -
> > - When loading at 0x90000 OR the protocol version is 2.01 or earlier:
> > -
> > - 0x0000-0x7fff Real mode kernel
> > - 0x8000-0x97ff Stack and heap
> > - 0x9800-0x9fff Kernel command line
> > -
> > -Such a boot loader should enter the following fields in the header:
> > -
> > -    unsigned long base_ptr; /* base address for real-mode segment */
> > -
> > -    if ( setup_sects == 0 ) {
> > -        setup_sects = 4;
> > -    }
> > -
> > -    if ( protocol >= 0x0200 ) {
> > -        type_of_loader = <type code>;
> > -        if ( loading_initrd ) {
> > -            ramdisk_image = <initrd_address>;
> > -            ramdisk_size = <initrd_size>;
> > -        }
> > -
> > -        if ( protocol >= 0x0202 && loadflags & 0x01 )
> > -            heap_end = 0xe000;
> > -        else
> > -            heap_end = 0x9800;
> > -
> > -        if ( protocol >= 0x0201 ) {
> > -            heap_end_ptr = heap_end - 0x200;
> > -            loadflags |= 0x80; /* CAN_USE_HEAP */
> > -        }
> > -
> > -        if ( protocol >= 0x0202 ) {
> > -            cmd_line_ptr = base_ptr + heap_end;
> > -            strcpy(cmd_line_ptr, cmdline);
> > -        } else {
> > -            cmd_line_magic = 0xA33F;
> > -            cmd_line_offset = heap_end;
> > -            setup_move_size = heap_end + strlen(cmdline)+1;
> > -            strcpy(base_ptr+cmd_line_offset, cmdline);
> > -        }
> > -    } else {
> > -        /* Very old kernel */
> > -
> > -        heap_end = 0x9800;
> > -
> > -        cmd_line_magic = 0xA33F;
> > -        cmd_line_offset = heap_end;
> > -
> > -        /* A very old kernel MUST have its real-mode code
> > -           loaded at 0x90000 */
> > -
> > -        if ( base_ptr != 0x90000 ) {
> > -            /* Copy the real-mode kernel */
> > -            memcpy(0x90000, base_ptr, (setup_sects+1)*512);
> > -            base_ptr = 0x90000; /* Relocated */
> > -        }
> > -
> > -        strcpy(0x90000+cmd_line_offset, cmdline);
> > -
> > -        /* It is recommended to clear memory up to the 32K mark */
> > -        memset(0x90000 + (setup_sects+1)*512, 0,
> > -               (64-(setup_sects+1))*512);
> > -    }
> > -
> > -
> > -**** LOADING THE REST OF THE KERNEL
> > -
> > -The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512
> > -in the kernel file (again, if setup_sects == 0 the real value is 4.)
> > -It should be loaded at address 0x10000 for Image/zImage kernels and
> > -0x100000 for bzImage kernels.
> > -
> > -The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01
> > -bit (LOAD_HIGH) in the loadflags field is set:
> > -
> > - is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01);
> > - load_address = is_bzImage ? 0x100000 : 0x10000;
> > -
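The two rules in this section (where the protected-mode code starts in the file, and where it gets loaded) fit in a short sketch. The function names are hypothetical; the constants are the ones quoted above.

```python
def protected_mode_offset(setup_sects):
    """File offset of the 32-bit (non-real-mode) kernel code."""
    if setup_sects == 0:
        setup_sects = 4  # a stored value of 0 means 4
    return (setup_sects + 1) * 512


def load_address(protocol, loadflags):
    """0x100000 for bzImage kernels, 0x10000 for Image/zImage kernels."""
    is_bzimage = protocol >= 0x0200 and bool(loadflags & 0x01)  # LOAD_HIGH
    return 0x100000 if is_bzimage else 0x10000
```

For example, a header with setup_sects = 0 puts the protected-mode code at file offset 2560, and a protocol 2.00+ header with LOAD_HIGH set is loaded at 0x100000.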
> > -Note that Image/zImage kernels can be up to 512K in size, and thus use
> > -the entire 0x10000-0x90000 range of memory. This means it is pretty
> > -much a requirement for these kernels to load the real-mode part at
> > -0x90000. bzImage kernels allow much more flexibility.
> > -
> > -
> > -**** SPECIAL COMMAND LINE OPTIONS
> > -
> > -If the command line provided by the boot loader is entered by the
> > -user, the user may expect the following command line options to work.
> > -They should normally not be deleted from the kernel command line even
> > -though not all of them are actually meaningful to the kernel. Boot
> > -loader authors who need additional command line options for the boot
> > -loader itself should get them registered in
> > -Documentation/admin-guide/kernel-parameters.rst to make sure they will not
> > -conflict with actual kernel options now or in the future.
> > -
> > - vga=<mode>
> > - <mode> here is either an integer (in C notation, either
> > - decimal, octal, or hexadecimal) or one of the strings
> > - "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask"
> > - (meaning 0xFFFD). This value should be entered into the
> > - vid_mode field, as it is used by the kernel before the command
> > - line is parsed.
> > -
> > - mem=<size>
> > - <size> is an integer in C notation optionally followed by
> > - (case insensitive) K, M, G, T, P or E (meaning << 10, << 20,
> > - << 30, << 40, << 50 or << 60). This specifies the end of
> > - memory to the kernel. This affects the possible placement of
> > - an initrd, since an initrd should be placed near end of
> > - memory. Note that this is an option to *both* the kernel and
> > - the bootloader!
> > -
> > - initrd=<file>
> > - An initrd should be loaded. The meaning of <file> is
> > - obviously bootloader-dependent, and some boot loaders
> > - (e.g. LILO) do not have such a command.
> > -
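Since a boot loader has to interpret mem= itself, a parser for the size syntax described above might look like this sketch. It reads the number first and only then looks for a suffix, so "0x1e" is the number 30 rather than "0x1" with an E suffix. That ordering is my assumption about the intended behaviour, and `parse_mem` is a hypothetical name.

```python
import re

# C-notation integer (hex, octal or decimal), then an optional size suffix.
_MEM_RE = re.compile(r"(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)([kKmMgGtTpPeE]?)")
_SUFFIXES = "kmgtpe"  # shifts of 10, 20, 30, 40, 50, 60 bits


def parse_mem(value):
    """Parse the <size> part of mem=<size> into a byte count."""
    match = _MEM_RE.fullmatch(value)
    if match is None:
        raise ValueError("not a valid size: %r" % value)
    digits, suffix = match.groups()

    if digits.lower().startswith("0x"):
        number = int(digits, 16)
    elif len(digits) > 1 and digits.startswith("0"):
        number = int(digits, 8)
    else:
        number = int(digits, 10)

    shift = 10 * (_SUFFIXES.index(suffix.lower()) + 1) if suffix else 0
    return number << shift
```

The result is the end-of-memory address that the loader should respect when placing an initrd.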
> > -In addition, some boot loaders add the following options to the
> > -user-specified command line:
> > -
> > - BOOT_IMAGE=<file>
> > - The boot image which was loaded. Again, the meaning of <file>
> > - is obviously bootloader-dependent.
> > -
> > - auto
> > - The kernel was booted without explicit user intervention.
> > -
> > -If these options are added by the boot loader, it is highly
> > -recommended that they are located *first*, before the user-specified
> > -or configuration-specified command line. Otherwise, "init=/bin/sh"
> > -gets confused by the "auto" option.
> > -
> > -
> > -**** RUNNING THE KERNEL
> > -
> > -The kernel is started by jumping to the kernel entry point, which is
> > -located at *segment* offset 0x20 from the start of the real mode
> > -kernel. This means that if you loaded your real-mode kernel code at
> > -0x90000, the kernel entry point is 9020:0000.
> > -
> > -At entry, ds = es = ss should point to the start of the real-mode
> > -kernel code (0x9000 if the code is loaded at 0x90000), sp should be
> > -set up properly, normally pointing to the top of the heap, and
> > -interrupts should be disabled. Furthermore, to guard against bugs in
> > -the kernel, it is recommended that the boot loader sets fs = gs = ds =
> > -es = ss.
> > -
> > -In our example from above, we would do:
> > -
> > -    /* Note: in the case of the "old" kernel protocol, base_ptr must
> > -       be == 0x90000 at this point; see the previous sample code */
> > -
> > -    seg = base_ptr >> 4;
> > -
> > -    cli(); /* Enter with interrupts disabled! */
> > -
> > -    /* Set up the real-mode kernel stack */
> > -    _SS = seg;
> > -    _SP = heap_end;
> > -
> > -    _DS = _ES = _FS = _GS = seg;
> > -    jmp_far(seg+0x20, 0); /* Run the kernel */
> > -
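The segment arithmetic behind the jump target can be checked with a tiny sketch. The function name is made up, and the paragraph-alignment check is an added assumption (a real-mode segment base is always a multiple of 16).

```python
def real_mode_entry(base_ptr):
    """Return (segment, offset) of the kernel entry point.

    base_ptr is the linear address where the real-mode code was loaded.
    """
    if base_ptr & 0xF:
        raise ValueError("base_ptr must be 16-byte aligned")
    seg = base_ptr >> 4
    return seg + 0x20, 0x0000  # *segment* offset 0x20, i.e. 0x200 bytes in
```

For code loaded at 0x90000 this gives 9020:0000, matching the text above.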
> > -If your boot sector accesses a floppy drive, it is recommended to
> > -switch off the floppy motor before running the kernel, since the
> > -kernel boot leaves interrupts off and thus the motor will not be
> > -switched off, especially if the loaded kernel has the floppy driver as
> > -a demand-loaded module!
> > -
> > -
> > -**** ADVANCED BOOT LOADER HOOKS
> > -
> > -If the boot loader runs in a particularly hostile environment (such as
> > -LOADLIN, which runs under DOS) it may be impossible to follow the
> > -standard memory location requirements. Such a boot loader may use the
> > -following hooks that, if set, are invoked by the kernel at the
> > -appropriate time. The use of these hooks should probably be
> > -considered an absolutely last resort!
> > -
> > -IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
> > -%edi across invocation.
> > -
> > - realmode_swtch:
> > - A 16-bit real mode far subroutine invoked immediately before
> > - entering protected mode. The default routine disables NMI, so
> > - your routine should probably do so, too.
> > -
> > - code32_start:
> > - A 32-bit flat-mode routine *jumped* to immediately after the
> > - transition to protected mode, but before the kernel is
> > - uncompressed. No segments, except CS, are guaranteed to be
> > - set up (current kernels do, but older ones do not); you should
> > - set them up to BOOT_DS (0x18) yourself.
> > -
> > - After completing your hook, you should jump to the address
> > - that was in this field before your boot loader overwrote it
> > - (relocated, if appropriate.)
> > -
> > -
> > -**** 32-bit BOOT PROTOCOL
> > -
> > -For machines with a BIOS other than legacy BIOS, such as EFI or
> > -LinuxBIOS, and for kexec, the 16-bit real mode setup code in the kernel
> > -based on legacy BIOS cannot be used, so a 32-bit boot protocol needs
> > -to be defined.
> > -
> > -In 32-bit boot protocol, the first step in loading a Linux kernel
> > -should be to setup the boot parameters (struct boot_params,
> > -traditionally known as "zero page"). The memory for struct boot_params
> > -should be allocated and initialized to all zero. Then the setup header
> > -from offset 0x01f1 of kernel image on should be loaded into struct
> > -boot_params and examined. The end of setup header can be calculated as
> > -follows:
> > -
> > - 0x0202 + byte value at offset 0x0201
> > -
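As an illustration of that formula, this sketch cuts the setup header out of a kernel image held in memory. The helper names are hypothetical; the offsets 0x01f1, 0x0201 and 0x0202 are the ones given above.

```python
def setup_header_end(image):
    """End offset (exclusive) of the setup header within the kernel image."""
    return 0x0202 + image[0x0201]


def extract_setup_header(image):
    """Return the setup header bytes, starting at offset 0x01f1."""
    return bytes(image[0x01F1:setup_header_end(image)])
```

The returned bytes are what a loader would copy into the zeroed struct boot_params at the same offset, 0x01f1.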
> > -In addition to read/modify/write the setup header of the struct
> > -boot_params as that of 16-bit boot protocol, the boot loader should
> > -also fill the additional fields of the struct boot_params as that
> > -described in zero-page.txt.
> > -
> > -After setting up the struct boot_params, the boot loader can load the
> > -32/64-bit kernel in the same way as that of 16-bit boot protocol.
> > -
> > -In 32-bit boot protocol, the kernel is started by jumping to the
> > -32-bit kernel entry point, which is the start address of loaded
> > -32/64-bit kernel.
> > -
> > -At entry, the CPU must be in 32-bit protected mode with paging
> > -disabled; a GDT must be loaded with the descriptors for selectors
> > -__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> > -segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
> > -must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> > -must be __BOOT_DS; interrupt must be disabled; %esi must hold the base
> > -address of the struct boot_params; %ebp, %edi and %ebx must be zero.
> > -
> > -**** 64-bit BOOT PROTOCOL
> > -
> > -For machines with 64-bit CPUs and a 64-bit kernel, a 64-bit boot loader
> > -can be used, which requires a 64-bit boot protocol.
> > -
> > -In 64-bit boot protocol, the first step in loading a Linux kernel
> > -should be to setup the boot parameters (struct boot_params,
> > -traditionally known as "zero page"). The memory for struct boot_params
> > -could be allocated anywhere (even above 4G) and initialized to all zero.
> > -Then, the setup header at offset 0x01f1 of kernel image on should be
> > -loaded into struct boot_params and examined. The end of setup header
> > -can be calculated as follows:
> > -
> > - 0x0202 + byte value at offset 0x0201
> > -
> > -In addition to read/modify/write the setup header of the struct
> > -boot_params as that of 16-bit boot protocol, the boot loader should
> > -also fill the additional fields of the struct boot_params as described
> > -in zero-page.txt.
> > -
> > -After setting up the struct boot_params, the boot loader can load
> > -64-bit kernel in the same way as that of 16-bit boot protocol, but
> > -kernel could be loaded above 4G.
> > -
> > -In 64-bit boot protocol, the kernel is started by jumping to the
> > -64-bit kernel entry point, which is the start address of loaded
> > -64-bit kernel plus 0x200.
> > -
> > -At entry, the CPU must be in 64-bit mode with paging enabled.
> > -The range with setup_header.init_size from start address of loaded
> > -kernel and zero page and command line buffer get ident mapping;
> > -a GDT must be loaded with the descriptors for selectors
> > -__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
> > -segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
> > -must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
> > -must be __BOOT_DS; interrupt must be disabled; %rsi must hold the base
> > -address of the struct boot_params.
> > -
> > -**** EFI HANDOVER PROTOCOL
> > -
> > -This protocol allows boot loaders to defer initialisation to the EFI
> > -boot stub. The boot loader is required to load the kernel/initrd(s)
> > -from the boot media and jump to the EFI handover protocol entry point
> > -which is hdr->handover_offset bytes from the beginning of
> > -startup_{32,64}.
> > -
> > -The function prototype for the handover entry point looks like this,
> > -
> > - efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
> > -
> > -'handle' is the EFI image handle passed to the boot loader by the EFI
> > -firmware, 'table' is the EFI system table - these are the first two
> > -arguments of the "handoff state" as described in section 2.3 of the
> > -UEFI specification. 'bp' is the boot loader-allocated boot params.
> > -
> > -The boot loader *must* fill out the following fields in bp,
> > -
> > - o hdr.code32_start
> > - o hdr.cmd_line_ptr
> > - o hdr.ramdisk_image (if applicable)
> > - o hdr.ramdisk_size (if applicable)
> > -
> > -All other fields should be zero.
> > diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> > index 7612d3142b2a..8f08caf4fbbb 100644
> > --- a/Documentation/x86/index.rst
> > +++ b/Documentation/x86/index.rst
> > @@ -7,3 +7,5 @@ Linux x86 Support
> > .. toctree::
> > :maxdepth: 2
> > :numbered:
> > +
> > + boot
>
>
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du

2019-04-26 14:25:16

by Changbin Du

Subject: Re: [PATCH v4 39/63] Documentation: x86: convert topology.txt to reST

On Wed, Apr 24, 2019 at 02:44:07PM -0300, Mauro Carvalho Chehab wrote:
> Em Wed, 24 Apr 2019 00:29:08 +0800
> Changbin Du <[email protected]> escreveu:
>
> > This converts the plain text documentation to reStructuredText format and
> > add it to Sphinx TOC tree. No essential content change.
> >
> > Signed-off-by: Changbin Du <[email protected]>
> > ---
> > Documentation/x86/index.rst | 1 +
> > Documentation/x86/topology.rst | 228 +++++++++++++++++++++++++++++++++
> > Documentation/x86/topology.txt | 217 -------------------------------
> > 3 files changed, 229 insertions(+), 217 deletions(-)
> > create mode 100644 Documentation/x86/topology.rst
> > delete mode 100644 Documentation/x86/topology.txt
>
> Why? Please preserve as much as possible of the original file;
> it is really hard to see what you're doing. Most of those x86
> files are already almost in ReST format (like this one). There's
> absolutely **no reason** to make such radical changes that the
> result falls below the 50% similarity threshold git needs in order
> to recognize it as a change to the same file!
>
My editor changed the indentation. I will redo this conversion. Thanks.

> I'll give this one a quick review, but it is really hard to be
> sure that nothing is missing when the similarity is this low.
>
> >
> > diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> > index 8f08caf4fbbb..2033791e53bc 100644
> > --- a/Documentation/x86/index.rst
> > +++ b/Documentation/x86/index.rst
> > @@ -9,3 +9,4 @@ Linux x86 Support
> > :numbered:
> >
> > boot
> > + topology
> > diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
> > new file mode 100644
> > index 000000000000..1df5f56f4882
> > --- /dev/null
> > +++ b/Documentation/x86/topology.rst
> > @@ -0,0 +1,228 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +============
> > +x86 Topology
> > +============
> > +
> > +This documents and clarifies the main aspects of x86 topology modelling and
> > +representation in the kernel. Update/change when doing changes to the
> > +respective code.
> > +
> > +The architecture-agnostic topology definitions are in
> > +Documentation/cputopology.txt. This file holds x86-specific
> > +differences/specialities which must not necessarily apply to the generic
> > +definitions. Thus, the way to read up on Linux topology on x86 is to start
> > +with the generic one and look at this one in parallel for the x86 specifics.
> > +
> > +Needless to say, code should use the generic functions - this file is *only*
> > +here to *document* the inner workings of x86 topology.
> > +
> > +Started by Thomas Gleixner <[email protected]> and Borislav Petkov <[email protected]>.
> > +
> > +The main aim of the topology facilities is to present adequate interfaces to
> > +code which needs to know/query/use the structure of the running system wrt
> > +threads, cores, packages, etc.
> > +
> > +The kernel does not care about the concept of physical sockets because a
> > +socket has no relevance to software. It's an electromechanical component. In
> > +the past a socket always contained a single package (see below), but with the
> > +advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
> > +there might be still references to sockets in the code, but they are of
> > +historical nature and should be cleaned up.
> > +
> > +The topology of a system is described in the units of:
> > +
> > + - packages
> > + - cores
> > + - threads
> > +
> > +Package
> > +=======
> > +
> > +Packages contain a number of cores plus shared resources, e.g. DRAM
> > +controller, shared caches etc.
> > +
> > +AMD nomenclature for package is 'Node'.
> > +
> > +Package-related topology information in the kernel:
> > +
> > + - cpuinfo_x86.x86_max_cores:
> > +
> > + The number of cores in a package. This information is retrieved via CPUID.
> > +
> > + - cpuinfo_x86.phys_proc_id:
> > +
> > + The physical ID of the package. This information is retrieved via CPUID
> > + and deduced from the APIC IDs of the cores in the package.
> > +
> > + - cpuinfo_x86.logical_id:
> > +
> > + The logical ID of the package. As we do not trust BIOSes to enumerate the
> > + packages in a consistent way, we introduced the concept of logical package
> > + ID so we can sanely calculate the number of maximum possible packages in
> > + the system and have the packages enumerated linearly.
> > +
> > + - topology_max_packages():
> > +
> > + The maximum possible number of packages in the system. Helpful for per
> > + package facilities to preallocate per package information.
> > +
> > + - cpu_llc_id:
> > +
> > + A per-CPU variable containing:
> > +
> > + - On Intel, the first APIC ID of the list of CPUs sharing the Last Level
> > + Cache.
> > +
> > + - On AMD, the Node ID or Core Complex ID containing the Last Level
> > + Cache. In general, it is a number identifying an LLC uniquely on the
> > + system.
> > +
> > +Cores
> > +=====
> > +
> > +A core consists of 1 or more threads. It does not matter whether the threads
> > +are SMT- or CMT-type threads.
> > +
> > +AMD's nomenclature for a CMT core is "Compute Unit". The kernel always uses
> > +"core".
> > +
> > +Core-related topology information in the kernel:
> > +
> > + - smp_num_siblings:
> > +
> > + The number of threads in a core. The number of threads in a package can be
> > + calculated by::
> > +
> > + threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
> > +
> > +
> > +Threads
> > +=======
> > +
> > +A thread is a single scheduling unit. It's the equivalent to a logical Linux
> > +CPU.
> > +
> > +AMD's nomenclature for CMT threads is "Compute Unit Core". The kernel always
> > +uses "thread".
> > +
> > +Thread-related topology information in the kernel:
> > +
> > + - topology_core_cpumask():
> > +
> > + The cpumask contains all online threads in the package to which a thread
> > + belongs.
> > +
> > + The number of online threads is also printed in /proc/cpuinfo "siblings."
> > +
> > + - topology_sibling_cpumask():
> > +
> > + The cpumask contains all online threads in the core to which a thread
> > + belongs.
> > +
> > + - topology_logical_package_id():
> > +
> > + The logical package ID to which a thread belongs.
> > +
> > + - topology_physical_package_id():
> > +
> > + The physical package ID to which a thread belongs.
> > +
> > + - topology_core_id():
> > +
> > + The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
> > + "core_id."
> > +
> > +
> > +
> > +System topology examples
> > +========================
> > +
> > +.. note:: The alternative Linux CPU enumeration depends on how the BIOS
> > + enumerates the threads. Many BIOSes enumerate all threads 0 first and
> > + then all threads 1. That has the "advantage" that the logical Linux CPU
> > + numbers of threads 0 stay the same whether threads are enabled or not.
> > + That's merely an implementation detail and has no practical impact.
> > +
> > +1) Single Package, Single Core
> > +::
>
> I would just place the :: at the end of the line above. The same
> applies to similar cases in this file.
>
Sure.

> > +
> > + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
> > +
> > +2) Single Package, Dual Core
> > +
> > + a) One thread per core
> > + ::
> > +
> > + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
> > + -> [core 1] -> [thread 0] -> Linux CPU 1
>
> Something got broken here.
>
> > +
> > + b) Two threads per core
> > + ::
> > +
> > + [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
> > + -> [thread 1] -> Linux CPU 1
> > + -> [core 1] -> [thread 0] -> Linux CPU 2
> > + -> [thread 1] -> Linux CPU 3
>
> And here... This one, for example, should be, instead:
>
>   [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
>                           -> [thread 1] -> Linux CPU 1
>               -> [core 1] -> [thread 0] -> Linux CPU 2
>                           -> [thread 1] -> Linux CPU 3
>
> Clearly something is messing with the tabs in your x86
> conversion.
>
Sorry for this mistake. I will check them one by one.
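A rough way to do that check is to flag every line that still contains a tab and to expand tabs at 8-column stops before re-indenting, so the diagrams keep their alignment. This is only a sketch of the idea with made-up helper names, not a tool that exists in the tree.

```python
def lines_with_tabs(text):
    """Return the 1-based numbers of the lines that contain a tab."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if "\t" in line]


def expand_tabs(text, tabsize=8):
    """Expand tabs to spaces line by line, using fixed tab stops."""
    return "\n".join(line.expandtabs(tabsize) for line in text.splitlines())
```

Running the expansion on the original .txt file before changing its indentation should keep continuation lines such as "-> [core 1]" under the right column.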

> I'll stop my review here, as it seems pointless to continue
> while there is so much broken whitespace in your
> conversion.
>
> Thanks,
> Mauro

--
Cheers,
Changbin Du