2023-06-29 17:33:30

by Lizhi Hou

Subject: [PATCH V10 0/5] Generate device tree node for pci devices

This patch series introduces OF overlay support for PCI devices, which
primarily addresses two use cases. First, it provides a data-driven method
to describe hardware peripherals that are present in a PCI endpoint and
hence can be accessed by the PCI host. Second, it allows reuse of an
OF-compatible driver -- often used in SoC platforms -- in a PCI host based
system.

Two device series rely on this patch series:

1) Xilinx Alveo Accelerator cards (FPGA based device)
2) Microchip LAN9662 Ethernet Controller

Please see: https://lore.kernel.org/lkml/[email protected]/

Normally, the PCI core discovers PCI devices and their BARs using the
PCI enumeration process. However, the process does not provide a way to
discover the hardware peripherals that are present in a PCI device, and
which can be accessed through the PCI BARs. Also, the enumeration process
does not provide a way to associate MSI-X vectors of a PCI device with the
hardware peripherals that are present in the device. PCI device drivers
often use header files to describe the hardware peripherals and their
resources as there is no standard data-driven way to do so. This patch
series proposes to use a flattened device tree blob to describe the
peripherals in a data-driven way. Based on previous discussion, using
a device tree overlay is the best way to unflatten the blob and populate
platform devices. To use a device tree overlay, there are three obvious
problems that need to be resolved.

First, we need to create a base tree for non-DT systems such as x86_64. A
patch series has been submitted for this:
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

Second, a device tree node corresponding to the PCI endpoint is required
for overlaying the flattened device tree blob for that PCI endpoint.
Because PCI is a self-discoverable bus, a device tree node is usually not
created for PCI devices. This series adds support to generate a device
tree node for a PCI device which advertises itself using PCI quirks
infrastructure.

Third, we need to generate device tree nodes for PCI bridges since a child
PCI endpoint may choose to have a device tree node created.

This patch series is made up of five patches.

The first patch adds OF interfaces to create device tree nodes and
properties dynamically through a changeset.

The second patch introduces a kernel option, CONFIG_PCI_DYNAMIC_OF_NODES.
When the option is turned on, the kernel will generate device tree nodes
for all PCI bridges unconditionally. The patch also adds basic properties
(such as 'reg', 'compatible' and 'device_type') to the dynamically
generated device tree nodes. More properties can be added in the future.

The third patch shows how to use the PCI quirks infrastructure,
DECLARE_PCI_FIXUP_FINAL, to generate a device tree node for a device.
Specifically, the patch generates a device tree node for the Xilinx Alveo
U50 PCIe accelerator device.

The fourth patch extends of_overlay_fdt_apply() so that the caller can
specify the target node for the overlay.

The fifth patch adds a PCI test driver, pci_dt_testdrv, to the OF unittest.
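
As an illustration of the 'compatible' property, the userspace sketch below
builds the three strings in the same formats that of_pci_prop_compatible()
in patch 2 uses ("pci%x,%x", "pciclass,%06x" and "pciclass,%04x"). The
helper names, the buffer sizes and the class code 0x120000 are assumptions
made for this example only; they are not part of the series.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define COMPAT_NUM	3	/* number of generated compatible strings */
#define COMPAT_LEN	32	/* assumed buffer size, ample for these formats */

/*
 * Hypothetical helper: fill @out with the compatible strings, most specific
 * first. @class_code is the 24-bit class (base class, sub-class, prog-if).
 */
void pci_compat_strings(uint16_t vendor, uint16_t device, uint32_t class_code,
			char out[COMPAT_NUM][COMPAT_LEN])
{
	snprintf(out[0], COMPAT_LEN, "pci%x,%x",
		 (unsigned int)vendor, (unsigned int)device);
	snprintf(out[1], COMPAT_LEN, "pciclass,%06x",
		 (unsigned int)class_code);
	snprintf(out[2], COMPAT_LEN, "pciclass,%04x",
		 (unsigned int)(class_code >> 8));
}

/* Hypothetical helper: return 1 if @want equals any generated entry. */
int pci_compat_matches(char list[COMPAT_NUM][COMPAT_LEN], const char *want)
{
	int i;

	for (i = 0; i < COMPAT_NUM; i++) {
		if (strcmp(list[i], want) == 0)
			return 1;
	}
	return 0;
}
```

With vendor 0x10ee, device 0x5020 and the example class code above, the
list contains "pci10ee,5020", "pciclass,120000" and "pciclass,1200", so an
OF driver could match on either the exact device or its class.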

Here is an example of the device tree nodes generated within ARM64 QEMU.

# lspci -t
-[0000:00]-+-00.0
           +-01.0
           +-03.0-[01-03]----00.0-[02-03]----00.0-[03]----00.0
           +-03.1-[04]--
           \-04.0-[05-06]----00.0-[06]--

Without CONFIG_PCI_DYNAMIC_OF_NODES

# tree /sys/firmware/devicetree/base/pcie@10000000/
/sys/firmware/devicetree/base/pcie@10000000/
|-- #address-cells
|-- #interrupt-cells
|-- #size-cells
|-- bus-range
|-- compatible
|-- device_type
|-- dma-coherent
|-- interrupt-map
|-- interrupt-map-mask
|-- linux,pci-domain
|-- msi-map
|-- name
|-- ranges
`-- reg

With CONFIG_PCI_DYNAMIC_OF_NODES

# tree /sys/firmware/devicetree/base/pcie@10000000/
/sys/firmware/devicetree/base/pcie@10000000/
|-- #address-cells
|-- #interrupt-cells
|-- #size-cells
|-- bus-range
|-- compatible
|-- device_type
|-- dma-coherent
|-- interrupt-map
|-- interrupt-map-mask
|-- linux,pci-domain
|-- msi-map
|-- name
|-- pci@3,0
|   |-- #address-cells
|   |-- #size-cells
|   |-- compatible
|   |-- device_type
|   |-- dynamic
|   |-- pci@0,0
|   |   |-- #address-cells
|   |   |-- #size-cells
|   |   |-- compatible
|   |   |-- device_type
|   |   |-- dynamic
|   |   |-- pci@0,0
|   |   |   |-- #address-cells
|   |   |   |-- #size-cells
|   |   |   |-- compatible
|   |   |   |-- dev@0,0
|   |   |   |   |-- #address-cells
|   |   |   |   |-- #size-cells
|   |   |   |   |-- compatible
|   |   |   |   |-- dynamic
|   |   |   |   |-- ranges
|   |   |   |   `-- reg
|   |   |   |-- device_type
|   |   |   |-- dynamic
|   |   |   |-- ranges
|   |   |   `-- reg
|   |   |-- ranges
|   |   `-- reg
|   |-- ranges
|   `-- reg
|-- pci@3,1
|   |-- #address-cells
|   |-- #size-cells
|   |-- compatible
|   |-- device_type
|   |-- dynamic
|   |-- ranges
|   `-- reg
|-- pci@4,0
|   |-- #address-cells
|   |-- #size-cells
|   |-- compatible
|   |-- device_type
|   |-- dynamic
|   |-- pci@0,0
|   |   |-- #address-cells
|   |   |-- #size-cells
|   |   |-- compatible
|   |   |-- device_type
|   |   |-- dynamic
|   |   `-- reg
|   |-- ranges
|   `-- reg
|-- ranges
`-- reg
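
The node names in the tree above (pci@3,0, pci@3,1, dev@0,0) follow the
"<type>@<slot>,<function>" pattern that of_pci_make_dev_node() in patch 2
builds with kasprintf(). The userspace sketch below reproduces only that
formatting. pci_node_name() is a hypothetical helper written for this
example; the devfn macros use the same encoding as the kernel ones.

```c
#include <stdio.h>
#include <stddef.h>

/* Same devfn encoding as the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC macros. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: format the node name as "pci@S,F" for a bridge and
 * "dev@S,F" for any other device, with slot and function printed in hex.
 * Returns the snprintf() result, i.e. the length of the full name.
 */
int pci_node_name(char *buf, size_t len, int is_bridge, unsigned int devfn)
{
	return snprintf(buf, len, "%s@%x,%x", is_bridge ? "pci" : "dev",
			PCI_SLOT(devfn), PCI_FUNC(devfn));
}
```

For example, the bridge that lspci lists as 03.1 has devfn PCI_DEVFN(3, 1)
and gets the name "pci@3,1".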

Changes since v9:
- Introduced 'dynamic' property to identify dynamically generated device tree
  nodes for PCI devices
- Added 'bus-range' property to remove dtc warnings
- Minor code review fixes

Changes since v8:
- Added patches to create a unit test verifying address translation.
  The test relies on the QEMU PCI Test Device; please see
  https://github.com/houlz0507/xoclv2/blob/pci-dt-0329/pci-dt-patch-0329/README
  for test setup
- Minor code review fixes

Changes since v7:
- Modified dynamic node creation interfaces
- Added unittest for newly added interfaces

Changes since v6:
- Removed single line wrapper functions
- Added Signed-off-by Clément Léger <[email protected]>

Changes since v5:
- Fixed code review comments
- Fixed incorrect 'ranges' and 'reg' properties

Changes since RFC v4:
- Fixed code review comments

Changes since RFC v3:
- Split the Xilinx Alveo U50 PCI quirk to a separate patch
- Minor changes in commit description and code comment

Changes since RFC v2:
- Merged patch 3 with patch 2
- Added OF interfaces of_changeset_add_prop_* and use them to create
properties.
- Added '#address-cells', '#size-cells' and 'ranges' properties.

Changes since RFC v1:
- Added one patch to create basic properties.
- To move DT related code out of PCI subsystem, replaced of_node_alloc()
with of_create_node()/of_destroy_node()

Lizhi Hou (5):
of: dynamic: Add interfaces for creating device node dynamically
PCI: Create device tree node for bridge
PCI: Add quirks to generate device tree node for Xilinx Alveo U50
of: overlay: Extend of_overlay_fdt_apply() to specify the target node
of: unittest: Add pci_dt_testdrv pci driver

drivers/of/dynamic.c | 187 ++++++++++++++
drivers/of/overlay.c | 42 +++-
drivers/of/unittest-data/Makefile | 3 +-
.../of/unittest-data/overlay_pci_node.dtso | 22 ++
drivers/of/unittest.c | 211 +++++++++++++++-
drivers/pci/Kconfig | 12 +
drivers/pci/Makefile | 1 +
drivers/pci/bus.c | 2 +
drivers/pci/of.c | 88 +++++++
drivers/pci/of_property.c | 235 ++++++++++++++++++
drivers/pci/pci.h | 15 ++
drivers/pci/quirks.c | 13 +
drivers/pci/remove.c | 1 +
include/linux/of.h | 28 ++-
14 files changed, 845 insertions(+), 15 deletions(-)
create mode 100644 drivers/of/unittest-data/overlay_pci_node.dtso
create mode 100644 drivers/pci/of_property.c

--
2.34.1



2023-06-29 17:37:06

by Lizhi Hou

Subject: [PATCH V10 4/5] of: overlay: Extend of_overlay_fdt_apply() to specify the target node

Currently, an overlay fdt fragment must specify its exact location in the
base DT. In other words, the base DT location for the fragment is already
known when the fdt fragment is generated.

There is a new use case in which the base DT location is unknown when the
fdt fragment is generated. For example, an add-on device provides an fdt
overlay with its firmware to describe its downstream devices. Because the
add-on device can be plugged into different systems, its firmware cannot
know the overlay location in the base DT. Instead, the device driver loads
the overlay fdt and applies it to the base DT at runtime. For this case,
extend of_overlay_fdt_apply() so that the device driver can specify the
target node to which the overlay fdt is applied:
int of_overlay_fdt_apply(..., struct device_node *base);
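
To make the lookup concrete: when a base node is passed, find_target()
builds the path to search by printing the base node with "%pOF" and
appending the fragment's "target-path" string. The userspace sketch below
models only that string composition. overlay_target_path() and the example
paths are assumptions for illustration, not kernel code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: compose the path that is looked up for a fragment.
 * With a base path, the fragment's target-path is appended to it. Without
 * one (NULL), target-path is used unchanged, as before this patch.
 * Returns 0 on success, -1 if the result does not fit in @buf.
 */
int overlay_target_path(char *buf, size_t len, const char *base_path,
			const char *target_path)
{
	size_t need;

	if (!base_path)
		base_path = "";		/* no base node given */

	need = strlen(base_path) + strlen(target_path) + 1;
	if (need > len)
		return -1;

	snprintf(buf, len, "%s%s", base_path, target_path);
	return 0;
}
```

With a base of "/pcie@10000000/pci@3,0" and a target-path of "/dev@0,0"
(both made-up values), the composed path is
"/pcie@10000000/pci@3,0/dev@0,0". Existing callers that pass NULL keep the
old behaviour.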

Signed-off-by: Lizhi Hou <[email protected]>
---
drivers/of/overlay.c | 42 +++++++++++++++++++++++++++++++-----------
drivers/of/unittest.c | 3 ++-
include/linux/of.h | 2 +-
3 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 7feb643f1370..6f3ae30c878d 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -682,9 +682,11 @@ static int build_changeset(struct overlay_changeset *ovcs)
* 1) "target" property containing the phandle of the target
* 2) "target-path" property containing the path of the target
*/
-static struct device_node *find_target(struct device_node *info_node)
+static struct device_node *find_target(struct device_node *info_node,
+ struct device_node *target_base)
{
struct device_node *node;
+ char *target_path;
const char *path;
u32 val;
int ret;
@@ -700,10 +702,23 @@ static struct device_node *find_target(struct device_node *info_node)

ret = of_property_read_string(info_node, "target-path", &path);
if (!ret) {
- node = of_find_node_by_path(path);
- if (!node)
- pr_err("find target, node: %pOF, path '%s' not found\n",
- info_node, path);
+ if (target_base) {
+ target_path = kasprintf(GFP_KERNEL, "%pOF%s", target_base, path);
+ if (!target_path)
+ return NULL;
+ node = of_find_node_by_path(target_path);
+ if (!node) {
+ pr_err("find target, node: %pOF, path '%s' not found\n",
+ info_node, target_path);
+ }
+ kfree(target_path);
+ } else {
+ node = of_find_node_by_path(path);
+ if (!node) {
+ pr_err("find target, node: %pOF, path '%s' not found\n",
+ info_node, path);
+ }
+ }
return node;
}

@@ -715,6 +730,7 @@ static struct device_node *find_target(struct device_node *info_node)
/**
* init_overlay_changeset() - initialize overlay changeset from overlay tree
* @ovcs: Overlay changeset to build
+ * @target_base: Point to the target node to apply overlay
*
* Initialize @ovcs. Populate @ovcs->fragments with node information from
* the top level of @overlay_root. The relevant top level nodes are the
@@ -725,7 +741,8 @@ static struct device_node *find_target(struct device_node *info_node)
* detected in @overlay_root. On error return, the caller of
* init_overlay_changeset() must call free_overlay_changeset().
*/
-static int init_overlay_changeset(struct overlay_changeset *ovcs)
+static int init_overlay_changeset(struct overlay_changeset *ovcs,
+ struct device_node *target_base)
{
struct device_node *node, *overlay_node;
struct fragment *fragment;
@@ -786,7 +803,7 @@ static int init_overlay_changeset(struct overlay_changeset *ovcs)

fragment = &fragments[cnt];
fragment->overlay = overlay_node;
- fragment->target = find_target(node);
+ fragment->target = find_target(node, target_base);
if (!fragment->target) {
of_node_put(fragment->overlay);
ret = -EINVAL;
@@ -877,6 +894,7 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
*
* of_overlay_apply() - Create and apply an overlay changeset
* @ovcs: overlay changeset
+ * @base: point to the target node to apply overlay
*
* Creates and applies an overlay changeset.
*
@@ -900,7 +918,8 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
* the caller of of_overlay_apply() must call free_overlay_changeset().
*/

-static int of_overlay_apply(struct overlay_changeset *ovcs)
+static int of_overlay_apply(struct overlay_changeset *ovcs,
+ struct device_node *base)
{
int ret = 0, ret_revert, ret_tmp;

@@ -908,7 +927,7 @@ static int of_overlay_apply(struct overlay_changeset *ovcs)
if (ret)
goto out;

- ret = init_overlay_changeset(ovcs);
+ ret = init_overlay_changeset(ovcs, base);
if (ret)
goto out;

@@ -952,6 +971,7 @@ static int of_overlay_apply(struct overlay_changeset *ovcs)
* @overlay_fdt: pointer to overlay FDT
* @overlay_fdt_size: number of bytes in @overlay_fdt
* @ret_ovcs_id: pointer for returning created changeset id
+ * @base: pointer for the target node to apply overlay
*
* Creates and applies an overlay changeset.
*
@@ -967,7 +987,7 @@ static int of_overlay_apply(struct overlay_changeset *ovcs)
*/

int of_overlay_fdt_apply(const void *overlay_fdt, u32 overlay_fdt_size,
- int *ret_ovcs_id)
+ int *ret_ovcs_id, struct device_node *base)
{
void *new_fdt;
void *new_fdt_align;
@@ -1037,7 +1057,7 @@ int of_overlay_fdt_apply(const void *overlay_fdt, u32 overlay_fdt_size,
}
ovcs->overlay_mem = overlay_mem;

- ret = of_overlay_apply(ovcs);
+ ret = of_overlay_apply(ovcs, base);
/*
* If of_overlay_apply() error, calling free_overlay_changeset() may
* result in a memory leak if the apply partly succeeded, so do NOT
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 1193a574fa36..4a0774954b93 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -3506,7 +3506,8 @@ static int __init overlay_data_apply(const char *overlay_name, int *ovcs_id)
if (!size)
pr_err("no overlay data for %s\n", overlay_name);

- ret = of_overlay_fdt_apply(info->dtbo_begin, size, &info->ovcs_id);
+ ret = of_overlay_fdt_apply(info->dtbo_begin, size, &info->ovcs_id,
+ NULL);
if (ovcs_id)
*ovcs_id = info->ovcs_id;
if (ret < 0)
diff --git a/include/linux/of.h b/include/linux/of.h
index 703152181a44..c989a5400da4 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -1671,7 +1671,7 @@ struct of_overlay_notify_data {
#ifdef CONFIG_OF_OVERLAY

int of_overlay_fdt_apply(const void *overlay_fdt, u32 overlay_fdt_size,
- int *ovcs_id);
+ int *ovcs_id, struct device_node *target_base);
int of_overlay_remove(int *ovcs_id);
int of_overlay_remove_all(void);

--
2.34.1


2023-06-29 17:37:34

by Lizhi Hou

Subject: [PATCH V10 3/5] PCI: Add quirks to generate device tree node for Xilinx Alveo U50

The Xilinx Alveo U50 PCI card exposes multiple hardware peripherals on
its PCI BAR. The card firmware provides a flattened device tree to
describe the hardware peripherals on its BARs. This allows the U50 driver
to load the flattened device tree and generate the device tree nodes for
the hardware peripherals underneath.

To generate a device tree node for the U50 card, add PCI quirks to call
of_pci_make_dev_node() for U50.
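
For readers unfamiliar with PCI quirks, the sketch below is a simplified
userspace model of what the two DECLARE_PCI_FIXUP_FINAL() entries express:
a table keyed by vendor and device ID whose hook runs for each matching
device. It is not the kernel's fixup implementation. All struct and
function names are hypothetical, and the hook only sets a flag instead of
creating a node.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_XILINX	0x10ee

/* Hypothetical stand-in for struct pci_dev. */
struct pci_dev_model {
	uint16_t vendor;
	uint16_t device;
	int has_of_node;
};

typedef void (*fixup_hook)(struct pci_dev_model *dev);

struct fixup_entry {
	uint16_t vendor;
	uint16_t device;
	fixup_hook hook;
};

/* Stand-in for of_pci_make_dev_node(): just records that a node exists. */
void make_dev_node_model(struct pci_dev_model *dev)
{
	dev->has_of_node = 1;
}

/* The two device IDs registered by this patch. */
static const struct fixup_entry final_fixups[] = {
	{ PCI_VENDOR_ID_XILINX, 0x5020, make_dev_node_model },
	{ PCI_VENDOR_ID_XILINX, 0x5021, make_dev_node_model },
};

/* Run every hook whose IDs match @dev; return how many hooks ran. */
int run_final_fixups(struct pci_dev_model *dev)
{
	size_t i;
	int ran = 0;

	for (i = 0; i < sizeof(final_fixups) / sizeof(final_fixups[0]); i++) {
		if (final_fixups[i].vendor == dev->vendor &&
		    final_fixups[i].device == dev->device) {
			final_fixups[i].hook(dev);
			ran++;
		}
	}
	return ran;
}
```

A device with IDs 0x10ee:0x5020 gets a node in this model, while any other
device is left untouched. Supporting another card would mean adding one
more table entry, which is the purpose of the quirk approach.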

Signed-off-by: Lizhi Hou <[email protected]>
---
drivers/pci/quirks.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index c525867760bf..7776012eb03f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6041,3 +6041,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
#endif
+
+/*
+ * For a PCI device with multiple downstream devices, its driver may use
+ * a flattened device tree to describe the downstream devices.
+ *
+ * To overlay the flattened device tree, the PCI device and all its ancestor
+ * devices need to have device tree nodes on system base device tree. Thus,
+ * before driver probing, it might need to add a device tree node as the final
+ * fixup.
+ */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
--
2.34.1


2023-06-29 17:38:32

by Lizhi Hou

Subject: [PATCH V10 2/5] PCI: Create device tree node for bridge

A PCI endpoint device such as the Xilinx Alveo PCI card maps the register
spaces from multiple hardware peripherals to its PCI BAR. Normally,
the PCI core discovers devices and BARs using the PCI enumeration process.
There is no infrastructure to discover the hardware peripherals that are
present in a PCI device, and which can be accessed through the PCI BARs.

The device tree framework requires a device tree node for the PCI device
so that it can generate the device tree nodes for the hardware peripherals
underneath. Because PCI is a self-discoverable bus, there might not be a
device tree node created for PCI devices. Furthermore, if the PCI device
is hot pluggable, the device tree nodes for its parent bridges are required
when it is plugged in. Add support to generate device tree nodes for PCI
bridges.

Add an of_pci_make_dev_node() interface that can be used to create a device
tree node for a PCI device.

Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on,
the kernel will generate device tree nodes for PCI bridges unconditionally.

Initially, add the basic properties for the dynamically generated device
tree nodes, which include #address-cells, #size-cells, device_type,
compatible, ranges and reg.
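
The 'reg' and 'ranges' properties use three-cell PCI addresses whose first
cell packs the bus, device, function and address-space code. The userspace
sketch below shows that packing with the same bit positions as the
OF_PCI_ADDR_FIELD_* masks in of_property.c. The macro and function names
here are assumptions for the example; the patch itself uses GENMASK() and
FIELD_PREP().

```c
#include <stdint.h>

/* Bit positions mirror the OF_PCI_ADDR_FIELD_* masks in of_property.c. */
#define PHYS_HI_NONRELOC	(UINT32_C(1) << 31)
#define PHYS_HI_PREFETCH	(UINT32_C(1) << 30)
#define PHYS_HI_SS_SHIFT	24	/* space code, bits 25:24 */
#define PHYS_HI_BUS_SHIFT	16	/* bus number, bits 23:16 */
#define PHYS_HI_DEV_SHIFT	11	/* device number, bits 15:11 */
#define PHYS_HI_FUNC_SHIFT	8	/* function number, bits 10:8 */

#define SPACE_CONFIG	0x0
#define SPACE_IO	0x1
#define SPACE_MEM32	0x2
#define SPACE_MEM64	0x3

/* Hypothetical helper: build the first (phys.hi) cell of a PCI address. */
uint32_t pci_phys_hi(unsigned int bus, unsigned int dev, unsigned int func,
		     unsigned int space, int prefetch, int nonreloc)
{
	uint32_t hi;

	hi = ((uint32_t)(bus & 0xff) << PHYS_HI_BUS_SHIFT) |
	     ((uint32_t)(dev & 0x1f) << PHYS_HI_DEV_SHIFT) |
	     ((uint32_t)(func & 0x07) << PHYS_HI_FUNC_SHIFT) |
	     ((uint32_t)(space & 0x03) << PHYS_HI_SS_SHIFT);
	if (prefetch)
		hi |= PHYS_HI_PREFETCH;
	if (nonreloc)
		hi |= PHYS_HI_NONRELOC;

	return hi;
}
```

In the patch, 'ranges' entries are written as non-relocatable addresses
with the space code taken from the resource flags, while 'reg' describes
configuration space with no flags set. For instance, a non-relocatable
32-bit memory window on bus 1, device 0, function 0 encodes as 0x82010000.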

Signed-off-by: Lizhi Hou <[email protected]>
---
drivers/pci/Kconfig | 12 ++
drivers/pci/Makefile | 1 +
drivers/pci/bus.c | 2 +
drivers/pci/of.c | 88 ++++++++++++++
drivers/pci/of_property.c | 235 ++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.h | 15 +++
drivers/pci/remove.c | 1 +
7 files changed, 354 insertions(+)
create mode 100644 drivers/pci/of_property.c

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 9309f2469b41..7264a5cee6bf 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -193,6 +193,18 @@ config PCI_HYPERV
The PCI device frontend driver allows the kernel to import arbitrary
PCI devices from a PCI backend to support PCI driver domains.

+config PCI_DYNAMIC_OF_NODES
+ bool "Create device tree nodes for PCI devices"
+ depends on OF
+ select OF_DYNAMIC
+ help
+ This option enables support for generating device tree nodes for some
+ PCI devices. Thus, the driver of this kind can load and overlay
+ flattened device tree for its downstream devices.
+
+ Once this option is selected, the device tree nodes will be generated
+ for all PCI bridges.
+
choice
prompt "PCI Express hierarchy optimization setting"
default PCIE_BUS_DEFAULT
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 2680e4c92f0a..cc8b4e01e29d 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -32,6 +32,7 @@ obj-$(CONFIG_PCI_P2PDMA) += p2pdma.o
obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o
obj-$(CONFIG_VGA_ARB) += vgaarb.o
obj-$(CONFIG_PCI_DOE) += doe.o
+obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o

# Endpoint library must be initialized before its users
obj-$(CONFIG_PCI_ENDPOINT) += endpoint/
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 5bc81cc0a2de..ab7d06cd0099 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -340,6 +340,8 @@ void pci_bus_add_device(struct pci_dev *dev)
*/
pcibios_bus_add_device(dev);
pci_fixup_device(pci_fixup_final, dev);
+ if (pci_is_bridge(dev))
+ of_pci_make_dev_node(dev);
pci_create_sysfs_dev_files(dev);
pci_proc_attach_device(dev);
pci_bridge_d3_update(dev);
diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 2c25f4fa0225..9786ae407948 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -487,6 +487,15 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
} else {
/* We found a P2P bridge, check if it has a node */
ppnode = pci_device_to_OF_node(ppdev);
+#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
+ /*
+ * Interrupt mapping is not supported for dynamic
+ * generated bridge node. Thus, set ppnode to NULL
+ * to do standard swizzling.
+ */
+ if (of_property_present(ppnode, "dynamic"))
+ ppnode = NULL;
+#endif
}

/*
@@ -617,6 +626,85 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
return pci_parse_request_of_pci_ranges(dev, bridge);
}

+#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
+
+void of_pci_remove_node(struct pci_dev *pdev)
+{
+ struct device_node *np;
+
+ np = pci_device_to_OF_node(pdev);
+ if (!np || !of_node_check_flag(np, OF_DYNAMIC))
+ return;
+ pdev->dev.of_node = NULL;
+
+ of_changeset_revert(np->data);
+ of_changeset_destroy(np->data);
+ of_node_put(np);
+}
+
+void of_pci_make_dev_node(struct pci_dev *pdev)
+{
+ struct device_node *ppnode, *np = NULL;
+ const char *pci_type;
+ struct of_changeset *cset;
+ const char *name;
+ int ret;
+
+ /*
+ * If there is already a device tree node linked to this device,
+ * return immediately.
+ */
+ if (pci_device_to_OF_node(pdev))
+ return;
+
+ /* Check if there is device tree node for parent device */
+ if (!pdev->bus->self)
+ ppnode = pdev->bus->dev.of_node;
+ else
+ ppnode = pdev->bus->self->dev.of_node;
+ if (!ppnode)
+ return;
+
+ if (pci_is_bridge(pdev))
+ pci_type = "pci";
+ else
+ pci_type = "dev";
+
+ name = kasprintf(GFP_KERNEL, "%s@%x,%x", pci_type,
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+ if (!name)
+ return;
+
+ cset = kmalloc(sizeof(*cset), GFP_KERNEL);
+ if (!cset)
+ goto failed;
+ of_changeset_init(cset);
+
+ np = of_changeset_create_node(ppnode, name, cset);
+ if (!np)
+ goto failed;
+ np->data = cset;
+
+ ret = of_pci_add_properties(pdev, cset, np);
+ if (ret)
+ goto failed;
+
+ ret = of_changeset_apply(cset);
+ if (ret)
+ goto failed;
+
+ pdev->dev.of_node = np;
+ kfree(name);
+
+ return;
+
+failed:
+ if (np)
+ of_node_put(np);
+ kfree(name);
+}
+#endif
+
#endif /* CONFIG_PCI */

/**
diff --git a/drivers/pci/of_property.c b/drivers/pci/of_property.c
new file mode 100644
index 000000000000..1432f9eed3af
--- /dev/null
+++ b/drivers/pci/of_property.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2022-2023, Advanced Micro Devices, Inc.
+ */
+
+#include <linux/pci.h>
+#include <linux/of.h>
+#include <linux/bitfield.h>
+#include <linux/bits.h>
+#include "pci.h"
+
+#define OF_PCI_ADDRESS_CELLS 3
+#define OF_PCI_SIZE_CELLS 2
+
+struct of_pci_addr_pair {
+ u32 phys_addr[OF_PCI_ADDRESS_CELLS];
+ u32 size[OF_PCI_SIZE_CELLS];
+};
+
+/*
+ * Each entry in the ranges table is a tuple containing the child address,
+ * the parent address, and the size of the region in the child address space.
+ * Thus, for PCI, in each entry parent address is an address on the primary
+ * side and the child address is the corresponding address on the secondary
+ * side.
+ */
+struct of_pci_range {
+ u32 child_addr[OF_PCI_ADDRESS_CELLS];
+ u32 parent_addr[OF_PCI_ADDRESS_CELLS];
+ u32 size[OF_PCI_SIZE_CELLS];
+};
+
+#define OF_PCI_ADDR_SPACE_IO 0x1
+#define OF_PCI_ADDR_SPACE_MEM32 0x2
+#define OF_PCI_ADDR_SPACE_MEM64 0x3
+
+#define OF_PCI_ADDR_FIELD_NONRELOC BIT(31)
+#define OF_PCI_ADDR_FIELD_SS GENMASK(25, 24)
+#define OF_PCI_ADDR_FIELD_PREFETCH BIT(30)
+#define OF_PCI_ADDR_FIELD_BUS GENMASK(23, 16)
+#define OF_PCI_ADDR_FIELD_DEV GENMASK(15, 11)
+#define OF_PCI_ADDR_FIELD_FUNC GENMASK(10, 8)
+#define OF_PCI_ADDR_FIELD_REG GENMASK(7, 0)
+
+enum of_pci_prop_compatible {
+ PROP_COMPAT_PCI_VVVV_DDDD,
+ PROP_COMPAT_PCICLASS_CCSSPP,
+ PROP_COMPAT_PCICLASS_CCSS,
+ PROP_COMPAT_NUM,
+};
+
+static void of_pci_set_address(struct pci_dev *pdev, u32 *prop, u64 addr,
+ u32 reg_num, u32 flags, bool reloc)
+{
+ prop[0] = FIELD_PREP(OF_PCI_ADDR_FIELD_BUS, pdev->bus->number) |
+ FIELD_PREP(OF_PCI_ADDR_FIELD_DEV, PCI_SLOT(pdev->devfn)) |
+ FIELD_PREP(OF_PCI_ADDR_FIELD_FUNC, PCI_FUNC(pdev->devfn));
+ prop[0] |= flags | reg_num;
+ if (!reloc) {
+ prop[0] |= OF_PCI_ADDR_FIELD_NONRELOC;
+ prop[1] = upper_32_bits(addr);
+ prop[2] = lower_32_bits(addr);
+ }
+}
+
+static int of_pci_get_addr_flags(struct resource *res, u32 *flags)
+{
+ u32 ss;
+
+ if (res->flags & IORESOURCE_IO)
+ ss = OF_PCI_ADDR_SPACE_IO;
+ else if (res->flags & IORESOURCE_MEM_64)
+ ss = OF_PCI_ADDR_SPACE_MEM64;
+ else if (res->flags & IORESOURCE_MEM)
+ ss = OF_PCI_ADDR_SPACE_MEM32;
+ else
+ return -EINVAL;
+
+ *flags = 0;
+ if (res->flags & IORESOURCE_PREFETCH)
+ *flags |= OF_PCI_ADDR_FIELD_PREFETCH;
+
+ *flags |= FIELD_PREP(OF_PCI_ADDR_FIELD_SS, ss);
+
+ return 0;
+}
+
+static int of_pci_prop_bus_range(struct pci_dev *pdev,
+ struct of_changeset *ocs,
+ struct device_node *np)
+{
+ u32 bus_range[] = { pdev->subordinate->busn_res.start,
+ pdev->subordinate->busn_res.end };
+
+ return of_changeset_add_prop_u32_array(ocs, np, "bus-range", bus_range,
+ ARRAY_SIZE(bus_range));
+}
+
+static int of_pci_prop_ranges(struct pci_dev *pdev, struct of_changeset *ocs,
+ struct device_node *np)
+{
+ struct of_pci_range *rp;
+ struct resource *res;
+ u32 flags, num;
+ int i, j, ret;
+ u64 val64;
+
+ if (pci_is_bridge(pdev)) {
+ num = PCI_BRIDGE_RESOURCE_NUM;
+ res = &pdev->resource[PCI_BRIDGE_RESOURCES];
+ } else {
+ num = PCI_STD_NUM_BARS;
+ res = &pdev->resource[PCI_STD_RESOURCES];
+ }
+
+ rp = kcalloc(num, sizeof(*rp), GFP_KERNEL);
+ if (!rp)
+ return -ENOMEM;
+
+ for (i = 0, j = 0; j < num; j++) {
+ if (!resource_size(&res[j]))
+ continue;
+
+ if (of_pci_get_addr_flags(&res[j], &flags))
+ continue;
+
+ val64 = res[j].start;
+ of_pci_set_address(pdev, rp[i].parent_addr, val64, 0, flags,
+ false);
+ if (pci_is_bridge(pdev)) {
+ memcpy(rp[i].child_addr, rp[i].parent_addr,
+ sizeof(rp[i].child_addr));
+ } else {
+ /*
+ * For endpoint device, the lower 64-bits of child
+ * address is always zero.
+ */
+ rp[i].child_addr[0] = j;
+ }
+
+ val64 = resource_size(&res[j]);
+ rp[i].size[0] = upper_32_bits(val64);
+ rp[i].size[1] = lower_32_bits(val64);
+
+ i++;
+ }
+
+ ret = of_changeset_add_prop_u32_array(ocs, np, "ranges", (u32 *)rp,
+ i * sizeof(*rp) / sizeof(u32));
+ kfree(rp);
+
+ return ret;
+}
+
+static int of_pci_prop_reg(struct pci_dev *pdev, struct of_changeset *ocs,
+ struct device_node *np)
+{
+ struct of_pci_addr_pair reg = { 0 };
+
+ /* configuration space */
+ of_pci_set_address(pdev, reg.phys_addr, 0, 0, 0, true);
+
+ return of_changeset_add_prop_u32_array(ocs, np, "reg", (u32 *)&reg,
+ sizeof(reg) / sizeof(u32));
+}
+
+static int of_pci_prop_compatible(struct pci_dev *pdev,
+ struct of_changeset *ocs,
+ struct device_node *np)
+{
+ const char *compat_strs[PROP_COMPAT_NUM] = { 0 };
+ int i, ret;
+
+ compat_strs[PROP_COMPAT_PCI_VVVV_DDDD] =
+ kasprintf(GFP_KERNEL, "pci%x,%x", pdev->vendor, pdev->device);
+ compat_strs[PROP_COMPAT_PCICLASS_CCSSPP] =
+ kasprintf(GFP_KERNEL, "pciclass,%06x", pdev->class);
+ compat_strs[PROP_COMPAT_PCICLASS_CCSS] =
+ kasprintf(GFP_KERNEL, "pciclass,%04x", pdev->class >> 8);
+
+ ret = of_changeset_add_prop_string_array(ocs, np, "compatible",
+ compat_strs, PROP_COMPAT_NUM);
+ for (i = 0; i < PROP_COMPAT_NUM; i++)
+ kfree(compat_strs[i]);
+
+ return ret;
+}
+
+int of_pci_add_properties(struct pci_dev *pdev, struct of_changeset *ocs,
+ struct device_node *np)
+{
+ int ret;
+ /*
+ * The added properties will be released when the
+ * changeset is destroyed.
+ */
+ if (pci_is_bridge(pdev)) {
+ ret = of_changeset_add_prop_string(ocs, np, "device_type",
+ "pci");
+ if (ret)
+ return ret;
+
+ ret = of_pci_prop_bus_range(pdev, ocs, np);
+ if (ret)
+ return ret;
+ }
+
+ ret = of_changeset_add_empty_prop(ocs, np, "dynamic");
+ if (ret)
+ return ret;
+
+ ret = of_pci_prop_ranges(pdev, ocs, np);
+ if (ret)
+ return ret;
+
+ ret = of_changeset_add_prop_u32(ocs, np, "#address-cells",
+ OF_PCI_ADDRESS_CELLS);
+ if (ret)
+ return ret;
+
+ ret = of_changeset_add_prop_u32(ocs, np, "#size-cells",
+ OF_PCI_SIZE_CELLS);
+ if (ret)
+ return ret;
+
+ ret = of_pci_prop_reg(pdev, ocs, np);
+ if (ret)
+ return ret;
+
+ ret = of_pci_prop_compatible(pdev, ocs, np);
+ if (ret)
+ return ret;
+
+ return 0;
+}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2475098f6518..686836afee1c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -678,6 +678,21 @@ static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_br

#endif /* CONFIG_OF */

+struct of_changeset;
+
+#ifdef CONFIG_PCI_DYNAMIC_OF_NODES
+void of_pci_make_dev_node(struct pci_dev *pdev);
+void of_pci_remove_node(struct pci_dev *pdev);
+int of_pci_add_properties(struct pci_dev *pdev, struct of_changeset *ocs,
+ struct device_node *np);
+#else
+static inline void
+of_pci_make_dev_node(struct pci_dev *pdev) { }
+
+static inline void
+of_pci_remove_node(struct pci_dev *pdev) { }
+#endif
+
#ifdef CONFIG_PCIEAER
void pci_no_aer(void);
void pci_aer_init(struct pci_dev *dev);
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index d68aee29386b..d749ea8250d6 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -22,6 +22,7 @@ static void pci_stop_dev(struct pci_dev *dev)
device_release_driver(&dev->dev);
pci_proc_detach_device(dev);
pci_remove_sysfs_dev_files(dev);
+ of_pci_remove_node(dev);

pci_dev_assign_added(dev, false);
}
--
2.34.1


2023-06-29 17:38:42

by Lizhi Hou

Subject: [PATCH V10 1/5] of: dynamic: Add interfaces for creating device node dynamically

of_changeset_create_node() creates a device node dynamically and attaches
the newly created node to a changeset.

Expand the of_changeset APIs to handle specific types of properties:
of_changeset_add_empty_prop()
of_changeset_add_prop_string()
of_changeset_add_prop_string_array()
of_changeset_add_prop_u32_array()
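
As background for of_changeset_add_prop_string_array(): a string list
property value is the strings laid out back to back, each with its
terminating NUL, and the property length is the sum of (strlen + 1) over
all entries. The userspace sketch below shows that layout.
pack_string_list() is a hypothetical helper that uses malloc() in place of
the kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: pack @count strings into one buffer, each followed
 * by its NUL terminator. Stores the total length in @len_out and returns
 * the buffer, which the caller must free(). Returns NULL on allocation
 * failure.
 */
char *pack_string_list(const char **strs, size_t count, size_t *len_out)
{
	size_t len = 0, i;
	char *buf, *p;

	for (i = 0; i < count; i++)
		len += strlen(strs[i]) + 1;

	buf = malloc(len ? len : 1);	/* avoid a zero-sized allocation */
	if (!buf)
		return NULL;

	p = buf;
	for (i = 0; i < count; i++) {
		size_t n = strlen(strs[i]) + 1;

		memcpy(p, strs[i], n);	/* copies the string and its NUL */
		p += n;
	}

	*len_out = len;
	return buf;
}
```

Packing "pci10ee,5020" and "pciclass,120000" gives a 29 byte value: 13
bytes for the first string and 16 for the second, terminators included.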

Signed-off-by: Clément Léger <[email protected]>
Signed-off-by: Lizhi Hou <[email protected]>
---
drivers/of/dynamic.c | 187 ++++++++++++++++++++++++++++++++++++++++++
drivers/of/unittest.c | 19 ++++-
include/linux/of.h | 26 ++++++
3 files changed, 231 insertions(+), 1 deletion(-)

diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index e311d406b170..4d0720a09be7 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -487,6 +487,38 @@ struct device_node *__of_node_dup(const struct device_node *np,
return NULL;
}

+/**
+ * of_changeset_create_node - Dynamically create a device node and attach to
+ * a given changeset.
+ *
+ * @parent: Pointer to parent device node
+ * @full_name: Node full name
+ * @cset: Pointer to changeset
+ *
+ * Return: Pointer to the created device node or NULL in case of an error.
+ */
+struct device_node *of_changeset_create_node(struct device_node *parent,
+ const char *full_name,
+ struct of_changeset *cset)
+{
+ struct device_node *np;
+ int ret;
+
+ np = __of_node_dup(NULL, full_name);
+ if (!np)
+ return NULL;
+ np->parent = parent;
+
+ ret = of_changeset_attach_node(cset, np);
+ if (ret) {
+ of_node_put(np);
+ return NULL;
+ }
+
+ return np;
+}
+EXPORT_SYMBOL(of_changeset_create_node);
+
static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
{
if (ce->action == OF_RECONFIG_ATTACH_NODE &&
@@ -960,3 +992,158 @@ int of_changeset_action(struct of_changeset *ocs, unsigned long action,
return 0;
}
EXPORT_SYMBOL_GPL(of_changeset_action);
+
+static int of_changeset_add_prop_helper(struct of_changeset *ocs,
+ struct device_node *np,
+ const struct property *pp)
+{
+ struct property *new_pp;
+ int ret;
+
+ new_pp = __of_prop_dup(pp, GFP_KERNEL);
+ if (!new_pp)
+ return -ENOMEM;
+
+ ret = of_changeset_add_property(ocs, np, new_pp);
+ if (ret) {
+ kfree(new_pp->name);
+ kfree(new_pp->value);
+ kfree(new_pp);
+ }
+
+ return ret;
+}
+
+/**
+ * of_changeset_add_empty_prop - Add an empty property to a changeset
+ *
+ * @ocs: changeset pointer
+ * @np: device node pointer
+ * @prop_name: name of the property to be added
+ *
+ * Create an empty property and add it to a changeset.
+ *
+ * Return: 0 on success, a negative error value in case of an error.
+ */
+int of_changeset_add_empty_prop(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name)
+{
+ struct property prop = { 0 };
+
+ prop.name = (char *)prop_name;
+
+ return of_changeset_add_prop_helper(ocs, np, &prop);
+}
+EXPORT_SYMBOL_GPL(of_changeset_add_empty_prop);
+
+/**
+ * of_changeset_add_prop_string - Add a string property to a changeset
+ *
+ * @ocs: changeset pointer
+ * @np: device node pointer
+ * @prop_name: name of the property to be added
+ * @str: pointer to null terminated string
+ *
+ * Create a string property and add it to a changeset.
+ *
+ * Return: 0 on success, a negative error value in case of an error.
+ */
+int of_changeset_add_prop_string(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name, const char *str)
+{
+ struct property prop;
+
+ prop.name = (char *)prop_name;
+ prop.length = strlen(str) + 1;
+ prop.value = (void *)str;
+
+ return of_changeset_add_prop_helper(ocs, np, &prop);
+}
+EXPORT_SYMBOL_GPL(of_changeset_add_prop_string);
+
+/**
+ * of_changeset_add_prop_string_array - Add a string list property to
+ * a changeset
+ *
+ * @ocs: changeset pointer
+ * @np: device node pointer
+ * @prop_name: name of the property to be added
+ * @str_array: pointer to an array of null terminated strings
+ * @sz: number of string array elements
+ *
+ * Create a string list property and add it to a changeset.
+ *
+ * Return: 0 on success, a negative error value in case of an error.
+ */
+int of_changeset_add_prop_string_array(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name,
+ const char **str_array, size_t sz)
+{
+ struct property prop;
+ int i, ret;
+ char *vp;
+
+ prop.name = (char *)prop_name;
+
+ prop.length = 0;
+ for (i = 0; i < sz; i++)
+ prop.length += strlen(str_array[i]) + 1;
+
+ prop.value = kmalloc(prop.length, GFP_KERNEL);
+ if (!prop.value)
+ return -ENOMEM;
+
+ vp = prop.value;
+ for (i = 0; i < sz; i++) {
+ vp += snprintf(vp, (char *)prop.value + prop.length - vp, "%s",
+ str_array[i]) + 1;
+ }
+ ret = of_changeset_add_prop_helper(ocs, np, &prop);
+ kfree(prop.value);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(of_changeset_add_prop_string_array);
+
+/**
+ * of_changeset_add_prop_u32_array - Add a property of 32 bit integers
+ * to a changeset
+ *
+ * @ocs: changeset pointer
+ * @np: device node pointer
+ * @prop_name: name of the property to be added
+ * @array: pointer to an array of 32 bit integers
+ * @sz: number of array elements
+ *
+ * Create a property of 32 bit integers and add it to a changeset.
+ *
+ * Return: 0 on success, a negative error value in case of an error.
+ */
+int of_changeset_add_prop_u32_array(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name,
+ const u32 *array, size_t sz)
+{
+ struct property prop;
+ __be32 *val;
+ int i, ret;
+
+ val = kcalloc(sz, sizeof(__be32), GFP_KERNEL);
+ if (!val)
+ return -ENOMEM;
+
+ for (i = 0; i < sz; i++)
+ val[i] = cpu_to_be32(array[i]);
+ prop.name = (char *)prop_name;
+ prop.length = sizeof(u32) * sz;
+ prop.value = (void *)val;
+
+ ret = of_changeset_add_prop_helper(ocs, np, &prop);
+ kfree(val);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(of_changeset_add_prop_u32_array);
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 2191c0136531..1193a574fa36 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -802,7 +802,9 @@ static void __init of_unittest_changeset(void)
struct property *ppname_n21, pname_n21 = { .name = "name", .length = 3, .value = "n21" };
struct property *ppupdate, pupdate = { .name = "prop-update", .length = 5, .value = "abcd" };
struct property *ppremove;
- struct device_node *n1, *n2, *n21, *nchangeset, *nremove, *parent, *np;
+ struct device_node *n1, *n2, *n21, *n22, *nchangeset, *nremove, *parent, *np;
+ static const char * const str_array[] = { "str1", "str2", "str3" };
+ const u32 u32_array[] = { 1, 2, 3 };
struct of_changeset chgset;

n1 = __of_node_dup(NULL, "n1");
@@ -857,6 +859,17 @@ static void __init of_unittest_changeset(void)
unittest(!of_changeset_add_property(&chgset, parent, ppadd), "fail add prop prop-add\n");
unittest(!of_changeset_update_property(&chgset, parent, ppupdate), "fail update prop\n");
unittest(!of_changeset_remove_property(&chgset, parent, ppremove), "fail remove prop\n");
+ n22 = of_changeset_create_node(n2, "n22", &chgset);
+ unittest(n22, "fail create n22\n");
+ unittest(!of_changeset_add_prop_string(&chgset, n22, "prop-str", "abcd"),
+ "fail add prop prop-str\n");
+ unittest(!of_changeset_add_prop_string_array(&chgset, n22, "prop-str-array",
+ (const char **)str_array,
+ ARRAY_SIZE(str_array)),
+ "fail add prop prop-str-array\n");
+ unittest(!of_changeset_add_prop_u32_array(&chgset, n22, "prop-u32-array",
+ u32_array, ARRAY_SIZE(u32_array)),
+ "fail add prop prop-u32-array\n");

unittest(!of_changeset_apply(&chgset), "apply failed\n");

@@ -866,6 +879,9 @@ static void __init of_unittest_changeset(void)
unittest((np = of_find_node_by_path("/testcase-data/changeset/n2/n21")),
"'%pOF' not added\n", n21);
of_node_put(np);
+ unittest((np = of_find_node_by_path("/testcase-data/changeset/n2/n22")),
+ "'%pOF' not added\n", n22);
+ of_node_put(np);

unittest(!of_changeset_revert(&chgset), "revert failed\n");

@@ -874,6 +890,7 @@ static void __init of_unittest_changeset(void)
of_node_put(n1);
of_node_put(n2);
of_node_put(n21);
+ of_node_put(n22);
#endif
}

diff --git a/include/linux/of.h b/include/linux/of.h
index 6ecde0515677..703152181a44 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -1580,6 +1580,32 @@ static inline int of_changeset_update_property(struct of_changeset *ocs,
{
return of_changeset_action(ocs, OF_RECONFIG_UPDATE_PROPERTY, np, prop);
}
+
+struct device_node *of_changeset_create_node(struct device_node *parent,
+ const char *full_name,
+ struct of_changeset *cset);
+int of_changeset_add_empty_prop(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name);
+int of_changeset_add_prop_string(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name, const char *str);
+int of_changeset_add_prop_string_array(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name,
+ const char **str_array, size_t sz);
+int of_changeset_add_prop_u32_array(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name,
+ const u32 *array, size_t sz);
+static inline int of_changeset_add_prop_u32(struct of_changeset *ocs,
+ struct device_node *np,
+ const char *prop_name,
+ const u32 val)
+{
+ return of_changeset_add_prop_u32_array(ocs, np, prop_name, &val, 1);
+}
+
#else /* CONFIG_OF_DYNAMIC */
static inline int of_reconfig_notifier_register(struct notifier_block *nb)
{
--
2.34.1
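
For readers following along, the value layouts produced by of_changeset_add_prop_string_array() and of_changeset_add_prop_u32_array() above can be sketched in userspace. This is only an illustration, not code from the patch: pack_string_list() and put_be32() are hypothetical names, and malloc() stands in for the kernel allocators. The string list is stored as the strings back to back, each with its terminating NUL, and each 32-bit cell is stored big-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace illustration only; these helpers are hypothetical and are
 * not part of the patch.
 */

/* Pack n strings back to back, each keeping its terminating NUL. */
static char *pack_string_list(const char *const *strs, size_t n, size_t *len)
{
	size_t total = 0, off = 0, i;
	char *buf;

	assert(strs && len);

	/* First pass: total length, counting one NUL per string. */
	for (i = 0; i < n; i++)
		total += strlen(strs[i]) + 1;

	buf = malloc(total ? total : 1);
	if (!buf)
		return NULL;

	/* Second pass: copy each string including its NUL. */
	for (i = 0; i < n; i++) {
		size_t l = strlen(strs[i]) + 1;

		memcpy(buf + off, strs[i], l);
		off += l;
	}

	*len = total;
	return buf;
}

/* Store one 32-bit cell most significant byte first (big-endian). */
static void put_be32(uint8_t *dst, uint32_t val)
{
	dst[0] = (uint8_t)(val >> 24);
	dst[1] = (uint8_t)(val >> 16);
	dst[2] = (uint8_t)(val >> 8);
	dst[3] = (uint8_t)val;
}
```

With the unittest input { "str1", "str2", "str3" } the packed value is 15 bytes long, and the cell for the value 1 is the byte sequence 00 00 00 01 regardless of host endianness.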


2023-06-29 17:38:46

by Lizhi Hou

Subject: [PATCH V10 5/5] of: unittest: Add pci_dt_testdrv pci driver

pci_dt_testdrv is bound to the QEMU PCI Test Device. It reads the
overlay_pci_node FDT fragment and applies it to the Test Device. Then it
calls of_platform_default_populate() to populate the platform
devices.

Signed-off-by: Lizhi Hou <[email protected]>
---
drivers/of/unittest-data/Makefile | 3 +-
.../of/unittest-data/overlay_pci_node.dtso | 22 ++
drivers/of/unittest.c | 189 ++++++++++++++++++
drivers/pci/quirks.c | 1 +
4 files changed, 214 insertions(+), 1 deletion(-)
create mode 100644 drivers/of/unittest-data/overlay_pci_node.dtso

diff --git a/drivers/of/unittest-data/Makefile b/drivers/of/unittest-data/Makefile
index ea5f4da68e23..1aa875088159 100644
--- a/drivers/of/unittest-data/Makefile
+++ b/drivers/of/unittest-data/Makefile
@@ -32,7 +32,8 @@ obj-$(CONFIG_OF_OVERLAY) += overlay.dtbo.o \
overlay_gpio_02b.dtbo.o \
overlay_gpio_03.dtbo.o \
overlay_gpio_04a.dtbo.o \
- overlay_gpio_04b.dtbo.o
+ overlay_gpio_04b.dtbo.o \
+ overlay_pci_node.dtbo.o

# enable creation of __symbols__ node
DTC_FLAGS_overlay += -@
diff --git a/drivers/of/unittest-data/overlay_pci_node.dtso b/drivers/of/unittest-data/overlay_pci_node.dtso
new file mode 100644
index 000000000000..c05e52e9e44a
--- /dev/null
+++ b/drivers/of/unittest-data/overlay_pci_node.dtso
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+/dts-v1/;
+/ {
+ fragment@0 {
+ target-path="";
+ __overlay__ {
+ #address-cells = <3>;
+ #size-cells = <2>;
+ pci-ep-bus@0 {
+ compatible = "simple-bus";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0x0 0x0 0x0 0x0 0x1000>;
+ reg = <0 0 0 0 0>;
+ unittest-pci@100 {
+ compatible = "unittest-pci";
+ reg = <0x100 0x200>;
+ };
+ };
+ };
+ };
+};
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 4a0774954b93..ead54e47c063 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -22,6 +22,7 @@
#include <linux/slab.h>
#include <linux/device.h>
#include <linux/platform_device.h>
+#include <linux/pci.h>
#include <linux/kernel.h>

#include <linux/i2c.h>
@@ -3352,6 +3353,7 @@ OVERLAY_INFO_EXTERN(overlay_gpio_02b);
OVERLAY_INFO_EXTERN(overlay_gpio_03);
OVERLAY_INFO_EXTERN(overlay_gpio_04a);
OVERLAY_INFO_EXTERN(overlay_gpio_04b);
+OVERLAY_INFO_EXTERN(overlay_pci_node);
OVERLAY_INFO_EXTERN(overlay_bad_add_dup_node);
OVERLAY_INFO_EXTERN(overlay_bad_add_dup_prop);
OVERLAY_INFO_EXTERN(overlay_bad_phandle);
@@ -3387,6 +3389,7 @@ static struct overlay_info overlays[] = {
OVERLAY_INFO(overlay_gpio_03, 0),
OVERLAY_INFO(overlay_gpio_04a, 0),
OVERLAY_INFO(overlay_gpio_04b, 0),
+ OVERLAY_INFO(overlay_pci_node, 0),
OVERLAY_INFO(overlay_bad_add_dup_node, -EINVAL),
OVERLAY_INFO(overlay_bad_add_dup_prop, -EINVAL),
OVERLAY_INFO(overlay_bad_phandle, -EINVAL),
@@ -3757,6 +3760,191 @@ static inline __init void of_unittest_overlay_high_level(void) {}

#endif

+#ifdef CONFIG_PCI_DYNAMIC_OF_NODES
+
+int of_unittest_pci_dev_num;
+int of_unittest_pci_child_num;
+
+/*
+ * PCI device tree node test driver
+ */
+static const struct pci_device_id testdrv_pci_ids[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_REDHAT, 0x5), }, /* QEMU PCI Test Device */
+ { 0, }
+};
+
+static int testdrv_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct overlay_info *info;
+ struct device_node *dn;
+ int ret, ovcs_id;
+ u32 size;
+
+ dn = pdev->dev.of_node;
+ if (!dn) {
+ dev_err(&pdev->dev, "device tree node for bus endpoint not found\n");
+ return -EINVAL;
+ }
+
+ for (info = overlays; info && info->name; info++) {
+ if (!strcmp(info->name, "overlay_pci_node"))
+ break;
+ }
+ if (!info || !info->name) {
+ dev_err(&pdev->dev, "no overlay data for overlay_pci_node\n");
+ return -ENODEV;
+ }
+
+ size = info->dtbo_end - info->dtbo_begin;
+ ret = of_overlay_fdt_apply(info->dtbo_begin, size, &ovcs_id, dn);
+ of_node_put(dn);
+ if (ret)
+ return ret;
+
+ of_platform_default_populate(dn, NULL, &pdev->dev);
+ pci_set_drvdata(pdev, (void *)(uintptr_t)ovcs_id);
+
+ return 0;
+}
+
+static void testdrv_remove(struct pci_dev *pdev)
+{
+ int ovcs_id = (int)(uintptr_t)pci_get_drvdata(pdev);
+
+ of_platform_depopulate(&pdev->dev);
+ of_overlay_remove(&ovcs_id);
+}
+
+static struct pci_driver testdrv_driver = {
+ .name = "pci_dt_testdrv",
+ .id_table = testdrv_pci_ids,
+ .probe = testdrv_probe,
+ .remove = testdrv_remove,
+};
+
+static int unittest_pci_probe(struct platform_device *pdev)
+{
+ struct resource *res;
+ struct device *dev;
+ u64 exp_addr;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res)
+ return -ENODEV;
+
+ dev = &pdev->dev;
+ while (dev && !dev_is_pci(dev))
+ dev = dev->parent;
+ if (!dev) {
+ pr_err("unable to find parent device\n");
+ return -ENODEV;
+ }
+
+ exp_addr = pci_resource_start(to_pci_dev(dev), 0) + 0x100;
+ unittest(res->start == exp_addr, "Incorrect translated address %llx, expected %llx\n",
+ (u64)res->start, exp_addr);
+
+ of_unittest_pci_child_num++;
+
+ return 0;
+}
+
+static const struct of_device_id unittest_pci_of_match[] = {
+ { .compatible = "unittest-pci" },
+ { }
+};
+
+static struct platform_driver unittest_pci_driver = {
+ .probe = unittest_pci_probe,
+ .driver = {
+ .name = "unittest-pci",
+ .of_match_table = unittest_pci_of_match,
+ },
+};
+
+static int of_unittest_pci_node_verify(struct pci_dev *pdev, bool add)
+{
+ struct device_node *pnp, *np = NULL;
+ struct device *child_dev;
+ char *path = NULL;
+ const __be32 *reg;
+ int rc = 0;
+
+ pnp = pdev->dev.of_node;
+ unittest(pnp, "Failed creating PCI dt node\n");
+ if (!pnp)
+ return -ENODEV;
+
+ if (add) {
+ path = kasprintf(GFP_KERNEL, "%pOF/pci-ep-bus@0/unittest-pci@100", pnp);
+ np = of_find_node_by_path(path);
+ unittest(np, "Failed to get unittest-pci node under PCI node\n");
+ if (!np) {
+ rc = -ENODEV;
+ goto failed;
+ }
+
+ reg = of_get_property(np, "reg", NULL);
+ unittest(reg, "Failed to get reg property\n");
+ if (!reg)
+ rc = -ENODEV;
+ } else {
+ path = kasprintf(GFP_KERNEL, "%pOF/pci-ep-bus@0", pnp);
+ np = of_find_node_by_path(path);
+ unittest(!np, "Child device tree node is not removed\n");
+ child_dev = device_find_any_child(&pdev->dev);
+ unittest(!child_dev, "Child device is not removed\n");
+ }
+
+failed:
+ kfree(path);
+ if (np)
+ of_node_put(np);
+
+ return rc;
+}
+
+static void __init of_unittest_pci_node(void)
+{
+ struct pci_dev *pdev = NULL;
+ int rc;
+
+ rc = pci_register_driver(&testdrv_driver);
+ unittest(!rc, "Failed to register pci test driver; rc = %d\n", rc);
+ if (rc)
+ return;
+
+ rc = platform_driver_register(&unittest_pci_driver);
+ if (unittest(!rc, "Failed to register unittest pci driver\n")) {
+ pci_unregister_driver(&testdrv_driver);
+ return;
+ }
+
+ while ((pdev = pci_get_device(PCI_VENDOR_ID_REDHAT, 0x5, pdev)) != NULL) {
+ of_unittest_pci_node_verify(pdev, true);
+ of_unittest_pci_dev_num++;
+ }
+ if (pdev)
+ pci_dev_put(pdev);
+
+ unittest(of_unittest_pci_dev_num,
+ "No test PCI device found. Please run QEMU with '-device pci-testdev'\n");
+ unittest(of_unittest_pci_dev_num == of_unittest_pci_child_num,
+ "Child device number %d does not match expected %d\n", of_unittest_pci_child_num,
+ of_unittest_pci_dev_num);
+
+ platform_driver_unregister(&unittest_pci_driver);
+ pci_unregister_driver(&testdrv_driver);
+
+ while ((pdev = pci_get_device(PCI_VENDOR_ID_REDHAT, 0x5, pdev)) != NULL)
+ of_unittest_pci_node_verify(pdev, false);
+ if (pdev)
+ pci_dev_put(pdev);
+}
+#else
+static void __init of_unittest_pci_node(void) { }
+#endif
+
static int __init of_unittest(void)
{
struct device_node *np;
@@ -3807,6 +3995,7 @@ static int __init of_unittest(void)
of_unittest_platform_populate();
of_unittest_overlay();
of_unittest_lifecycle();
+ of_unittest_pci_node();

/* Double check linkage after removing testcase data */
of_unittest_check_tree_linkage();
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 7776012eb03f..24860e1b76d4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6053,3 +6053,4 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
*/
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT, 0x0005, of_pci_make_dev_node);
--
2.34.1
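
The address check in unittest_pci_probe() relies on "ranges" translation: the overlay maps child address 0x0 of pci-ep-bus@0 to a 0x1000-byte window, and unittest-pci@100 sits at offset 0x100 inside it, so the resource is expected at BAR 0 start + 0x100. A minimal userspace sketch of that lookup follows. It is an assumption-laden illustration, not kernel code: struct range_entry and translate() are hypothetical, addresses are flattened to 64 bits, the two translation levels are collapsed into one entry, and the BAR 0 base used in the note below is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "ranges" entry, flattened to 64-bit values for illustration. */
struct range_entry {
	uint64_t child;   /* start of the window in the child address space */
	uint64_t parent;  /* corresponding start in the parent address space */
	uint64_t size;    /* length of the window */
};

/*
 * Translate a child-bus address to the parent address space.
 * Returns 0 and stores the result in *out, or -1 if no entry covers addr.
 */
static int translate(const struct range_entry *r, size_t n,
		     uint64_t addr, uint64_t *out)
{
	size_t i;

	assert(r && out);

	for (i = 0; i < n; i++) {
		if (addr >= r[i].child && addr - r[i].child < r[i].size) {
			*out = r[i].parent + (addr - r[i].child);
			return 0;
		}
	}
	return -1;
}
```

With a hypothetical BAR 0 base of 0xfe000000 and the single entry { 0x0, 0xfe000000, 0x1000 }, address 0x100 translates to 0xfe000100, while 0x1000 falls outside the window and is rejected.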


2023-06-29 20:51:56

by Bjorn Helgaas

Subject: Re: [PATCH V10 3/5] PCI: Add quirks to generate device tree node for Xilinx Alveo U50

On Thu, Jun 29, 2023 at 10:19:48AM -0700, Lizhi Hou wrote:
> The Xilinx Alveo U50 PCI card exposes multiple hardware peripherals on
> its PCI BAR. The card firmware provides a flattened device tree to
> describe the hardware peripherals on its BARs. This allows U50 driver to
> load the flattened device tree and generate the device tree node for
> hardware peripherals underneath.
>
> To generate device tree node for U50 card, add PCI quirks to call
> of_pci_make_dev_node() for U50.
>
> Signed-off-by: Lizhi Hou <[email protected]>

Acked-by: Bjorn Helgaas <[email protected]>

I already gave my ack for v9, so ideally you would add that before
posting the v10. But here it is again :)

> ---
> drivers/pci/quirks.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index c525867760bf..7776012eb03f 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -6041,3 +6041,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size);
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size);
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
> #endif
> +
> +/*
> + * For a PCI device with multiple downstream devices, its driver may use
> + * a flattened device tree to describe the downstream devices.
> + *
> + * To overlay the flattened device tree, the PCI device and all its ancestor
> + * devices need to have device tree nodes on system base device tree. Thus,
> + * before driver probing, it might need to add a device tree node as the final
> + * fixup.
> + */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
> --
> 2.34.1
>

2023-06-29 22:57:32

by Bjorn Helgaas

Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

On Thu, Jun 29, 2023 at 10:19:47AM -0700, Lizhi Hou wrote:
> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> spaces from multiple hardware peripherals to its PCI BAR. Normally,
> the PCI core discovers devices and BARs using the PCI enumeration process.
> There is no infrastructure to discover the hardware peripherals that are
> present in a PCI device, and which can be accessed through the PCI BARs.

IIUC this is basically a multi-function device except that instead of
each device being a separate PCI Function, they all appear in a single
Function. That would mean all the devices share the same config space
so a single PCI Command register controls all of them, they all share
the same IRQs (either INTx or MSI/MSI-X), any MMIO registers are likely
in a shared BAR, etc., right?

Obviously PCI enumeration only sees the single Function and binds a
single driver to it. But IIUC, you want to use existing drivers for
each of these sub-devices, so this series adds a DT node for the
single Function (using the quirks that call of_pci_make_dev_node()).
And I assume that when the PCI driver claims the single Function, it
will use that DT node to add platform devices, and those existing
drivers can claim those?

I don't see the PCI driver for the single Function in this series. Is
that coming? Is this series useful without it?

> Apparently, the device tree framework requires a device tree node for the
> PCI device. Thus, it can generate the device tree nodes for hardware
> peripherals underneath. Because PCI is self discoverable bus, there might
> not be a device tree node created for PCI devices. Furthermore, if the PCI
> device is hot pluggable, when it is plugged in, the device tree nodes for
> its parent bridges are required. Add support to generate device tree node
> for PCI bridges.

Can you remind me why hot-adding a PCI device requires DT nodes for
parent bridges? I don't think we have those today, so maybe the DT
node for the PCI device requires a DT parent? How far up does that
go? From this patch, I guess a Root Port would be the top DT node on
a PCIe system, since that's the top PCI-to-PCI bridge?

This patch adds a DT node for *every* PCI bridge in the system. We
only actually need that node for these unusual devices. Is there some
way the driver for the single PCI Function could add that node when it
is needed? Sorry if you've answered this in the past; maybe the
answer could be in the commit log or a code comment in case somebody
else wonders.

> @@ -340,6 +340,8 @@ void pci_bus_add_device(struct pci_dev *dev)
> */
> pcibios_bus_add_device(dev);
> pci_fixup_device(pci_fixup_final, dev);
> + if (pci_is_bridge(dev))
> + of_pci_make_dev_node(dev);

It'd be nice to have a clue here about why we need this, since this is
executed for *every* system, even ACPI platforms that typically don't
use OF things.

> pci_create_sysfs_dev_files(dev);
> pci_proc_attach_device(dev);
> pci_bridge_d3_update(dev);
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 2c25f4fa0225..9786ae407948 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -487,6 +487,15 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
> } else {
> /* We found a P2P bridge, check if it has a node */
> ppnode = pci_device_to_OF_node(ppdev);
> +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)

I would use plain #ifdef here instead of IS_ENABLED(), as you did in
pci.h below. IS_ENABLED() is true if the Kconfig symbol is set to
either "y" or "m".

Using IS_ENABLED() suggests that the config option *could* be a
module, which is not the case here because CONFIG_PCI_DYNAMIC_OF_NODES
is a bool.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/kconfig.h?id=v6.4#n69

> @@ -617,6 +626,85 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
> return pci_parse_request_of_pci_ranges(dev, bridge);
> }
>
> +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)

Same here, of course.

> +void of_pci_remove_node(struct pci_dev *pdev)
> +{
> + struct device_node *np;
> +
> + np = pci_device_to_OF_node(pdev);
> + if (!np || !of_node_check_flag(np, OF_DYNAMIC))

> + * Each entry in the ranges table is a tuple containing the child address,
> + * the parent address, and the size of the region in the child address space.
> + * Thus, for PCI, in each entry parent address is an address on the primary
> + * side and the child address is the corresponding address on the secondary
> + * side.
> + */
> +struct of_pci_range {
> + u32 child_addr[OF_PCI_ADDRESS_CELLS];
> + u32 parent_addr[OF_PCI_ADDRESS_CELLS];
> + u32 size[OF_PCI_SIZE_CELLS];

> + if (pci_is_bridge(pdev)) {
> + memcpy(rp[i].child_addr, rp[i].parent_addr,
> + sizeof(rp[i].child_addr));
> + } else {
> + /*
> + * For endpoint device, the lower 64-bits of child
> + * address is always zero.

I think this connects with the secondary side comment above, right? I
think I would comment this as:

/*
* PCI-PCI bridges don't translate addresses, so the child
* (secondary side) address is identical to the parent (primary
* side) address.
*/

and

/*
* Non-bridges have no child (secondary side) address, so clear it
* out.
*/

> + */
> + rp[i].child_addr[0] = j;

> + ret = of_changeset_add_empty_prop(ocs, np, "dynamic");

It seems slightly confusing to use a "dynamic" property here when we
also have the OF_DYNAMIC dynamic flag above. I think they have
different meanings, don't they?

Bjorn

2023-06-30 00:23:40

by Rob Herring

Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

On Thu, Jun 29, 2023 at 05:56:31PM -0500, Bjorn Helgaas wrote:
> On Thu, Jun 29, 2023 at 10:19:47AM -0700, Lizhi Hou wrote:
> > The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> > spaces from multiple hardware peripherals to its PCI BAR. Normally,
> > the PCI core discovers devices and BARs using the PCI enumeration process.
> > There is no infrastructure to discover the hardware peripherals that are
> > present in a PCI device, and which can be accessed through the PCI BARs.
>
> IIUC this is basically a multi-function device except that instead of
> each device being a separate PCI Function, they all appear in a single
> Function. That would mean all the devices share the same config space
> so a single PCI Command register controls all of them, they all share
> the same IRQs (either INTx or MSI/MSI-X), any MMIO registers are likely
> in a shared BAR, etc., right?
>
> Obviously PCI enumeration only sees the single Function and binds a
> single driver to it. But IIUC, you want to use existing drivers for
> each of these sub-devices, so this series adds a DT node for the
> single Function (using the quirks that call of_pci_make_dev_node()).
> And I assume that when the PCI driver claims the single Function, it
> will use that DT node to add platform devices, and those existing
> drivers can claim those?
>
> I don't see the PCI driver for the single Function in this series. Is
> that coming? Is this series useful without it?
>
> > Apparently, the device tree framework requires a device tree node for the
> > PCI device. Thus, it can generate the device tree nodes for hardware
> > peripherals underneath. Because PCI is self discoverable bus, there might
> > not be a device tree node created for PCI devices. Furthermore, if the PCI
> > device is hot pluggable, when it is plugged in, the device tree nodes for
> > its parent bridges are required. Add support to generate device tree node
> > for PCI bridges.
>
> Can you remind me why hot-adding a PCI device requires DT nodes for
> parent bridges? I don't think we have those today, so maybe the DT
> node for the PCI device requires a DT parent? How far up does that
> go? From this patch, I guess a Root Port would be the top DT node on
> a PCIe system, since that's the top PCI-to-PCI bridge?
>
> This patch adds a DT node for *every* PCI bridge in the system. We
> only actually need that node for these unusual devices. Is there some
> way the driver for the single PCI Function could add that node when it
> is needed? Sorry if you've answered this in the past; maybe the
> answer could be in the commit log or a code comment in case somebody
> else wonders.
>
> > @@ -340,6 +340,8 @@ void pci_bus_add_device(struct pci_dev *dev)
> > */
> > pcibios_bus_add_device(dev);
> > pci_fixup_device(pci_fixup_final, dev);
> > + if (pci_is_bridge(dev))
> > + of_pci_make_dev_node(dev);
>
> It'd be nice to have a clue here about why we need this, since this is
> executed for *every* system, even ACPI platforms that typically don't
> use OF things.
>
> > pci_create_sysfs_dev_files(dev);
> > pci_proc_attach_device(dev);
> > pci_bridge_d3_update(dev);
> > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > index 2c25f4fa0225..9786ae407948 100644
> > --- a/drivers/pci/of.c
> > +++ b/drivers/pci/of.c
> > @@ -487,6 +487,15 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
> > } else {
> > /* We found a P2P bridge, check if it has a node */
> > ppnode = pci_device_to_OF_node(ppdev);
> > +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
>
> I would use plain #ifdef here instead of IS_ENABLED(), as you did in
> pci.h below. IS_ENABLED() is true if the Kconfig symbol is set to
> either "y" or "m".

Actually, IS_ENABLED() with a C 'if' rather than a preprocessor #if
would work here and is preferred.

But this code and the "dynamic" property need more discussion.

Rob
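
To make the IS_ENABLED() point concrete, here is a userspace re-creation of the preprocessor trick. It is a sketch only: the macro names are local to this example rather than the kernel's, and CONFIG_EXAMPLE_TRISTATE and CONFIG_EXAMPLE_DISABLED are hypothetical options. It shows that the macro evaluates to 1 for both "y" and "m" and to 0 for an undefined symbol, and that it can be used in a plain C 'if' so both branches are still compiled and type-checked.

```c
#include <assert.h>

/* Userspace re-creation; these names are local to this sketch. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED(x) IS_DEFINED_1(x)
#define IS_DEFINED_1(val) IS_DEFINED_2(ARG_PLACEHOLDER_##val)
/* If the symbol expanded to 1, the placeholder inserts "0," and shifts 1 into second place. */
#define IS_DEFINED_2(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
/* True for built-in ("y", CONFIG_FOO=1) or module ("m", CONFIG_FOO_MODULE=1). */
#define IS_ENABLED(option) (IS_DEFINED(option) || IS_DEFINED(option##_MODULE))

#define CONFIG_PCI_DYNAMIC_OF_NODES 1     /* bool option set to "y" */
#define CONFIG_EXAMPLE_TRISTATE_MODULE 1  /* hypothetical tristate set to "m" */
/* CONFIG_EXAMPLE_DISABLED is deliberately left undefined. */

static int dynamic_node_path_taken(void)
{
	/* Both branches are parsed; the compiler discards the dead one. */
	if (IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES))
		return 1;
	return 0;
}
```

The practical difference from #ifdef is that code inside the 'if' body is always seen by the compiler, so it cannot silently bit-rot when the option is off.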

2023-06-30 00:24:35

by Rob Herring

Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

On Thu, Jun 29, 2023 at 05:56:31PM -0500, Bjorn Helgaas wrote:
> On Thu, Jun 29, 2023 at 10:19:47AM -0700, Lizhi Hou wrote:
> > The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> > spaces from multiple hardware peripherals to its PCI BAR. Normally,
> > the PCI core discovers devices and BARs using the PCI enumeration process.
> > There is no infrastructure to discover the hardware peripherals that are
> > present in a PCI device, and which can be accessed through the PCI BARs.
>
> IIUC this is basically a multi-function device except that instead of
> each device being a separate PCI Function, they all appear in a single
> Function. That would mean all the devices share the same config space
> so a single PCI Command register controls all of them, they all share
> the same IRQs (either INTx or MSI/MSI-X), any MMIO registers are likely
> in a shared BAR, etc., right?

Could be multiple BARs, but yes.

> Obviously PCI enumeration only sees the single Function and binds a
> single driver to it. But IIUC, you want to use existing drivers for
> each of these sub-devices, so this series adds a DT node for the
> single Function (using the quirks that call of_pci_make_dev_node()).
> And I assume that when the PCI driver claims the single Function, it
> will use that DT node to add platform devices, and those existing
> drivers can claim those?

Yes. It will call some variant of of_platform_populate().

> I don't see the PCI driver for the single Function in this series. Is
> that coming? Is this series useful without it?

https://lore.kernel.org/all/[email protected]/

I asked for things to be split up as the original series did a lot
of new things at once. This series only works with the QEMU PCI test
device which the DT unittest will use.

> > Apparently, the device tree framework requires a device tree node for the
> > PCI device. Thus, it can generate the device tree nodes for hardware
> > peripherals underneath. Because PCI is self discoverable bus, there might
> > not be a device tree node created for PCI devices. Furthermore, if the PCI
> > device is hot pluggable, when it is plugged in, the device tree nodes for
> > its parent bridges are required. Add support to generate device tree node
> > for PCI bridges.
>
> Can you remind me why hot-adding a PCI device requires DT nodes for
> parent bridges?

Because the PCI device needs a DT node and we can't just put PCI devices
in the DT root. We have to create the bus hierarchy.

> I don't think we have those today, so maybe the DT
> node for the PCI device requires a DT parent? How far up does that
> go?

All the way.

> From this patch, I guess a Root Port would be the top DT node on
> a PCIe system, since that's the top PCI-to-PCI bridge?

Yes. Plus above the host bridge could have a hierarchy of nodes.

> This patch adds a DT node for *every* PCI bridge in the system. We
> only actually need that node for these unusual devices. Is there some
> way the driver for the single PCI Function could add that node when it
> is needed? Sorry if you've answered this in the past; maybe the
> answer could be in the commit log or a code comment in case somebody
> else wonders.

This was discussed early on. I don't think it would work to create the
nodes at the time we discover we have a device that wants a DT node. The
issue is decisions are made in the code based on whether there's a DT
node for a PCI device or not. It might work, but I think it's fragile to
have nodes attached to devices at different points in time.

>
> > @@ -340,6 +340,8 @@ void pci_bus_add_device(struct pci_dev *dev)
> > */
> > pcibios_bus_add_device(dev);
> > pci_fixup_device(pci_fixup_final, dev);
> > + if (pci_is_bridge(dev))
> > + of_pci_make_dev_node(dev);
>
> It'd be nice to have a clue here about why we need this, since this is
> executed for *every* system, even ACPI platforms that typically don't
> use OF things.
>
> > pci_create_sysfs_dev_files(dev);
> > pci_proc_attach_device(dev);
> > pci_bridge_d3_update(dev);
> > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > index 2c25f4fa0225..9786ae407948 100644
> > --- a/drivers/pci/of.c
> > +++ b/drivers/pci/of.c
> > @@ -487,6 +487,15 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
> > } else {
> > /* We found a P2P bridge, check if it has a node */
> > ppnode = pci_device_to_OF_node(ppdev);
> > +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
>
> I would use plain #ifdef here instead of IS_ENABLED(), as you did in
> pci.h below. IS_ENABLED() is true if the Kconfig symbol is set to
> either "y" or "m".
>
> Using IS_ENABLED() suggests that the config option *could* be a
> module, which is not the case here because CONFIG_PCI_DYNAMIC_OF_NODES
> is a bool.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/kconfig.h?id=v6.4#n69
>
> > @@ -617,6 +626,85 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
> > return pci_parse_request_of_pci_ranges(dev, bridge);
> > }
> >
> > +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
>
> Same here, of course.
>
> > +void of_pci_remove_node(struct pci_dev *pdev)
> > +{
> > + struct device_node *np;
> > +
> > + np = pci_device_to_OF_node(pdev);
> > + if (!np || !of_node_check_flag(np, OF_DYNAMIC))
>
> > + * Each entry in the ranges table is a tuple containing the child address,
> > + * the parent address, and the size of the region in the child address space.
> > + * Thus, for PCI, in each entry parent address is an address on the primary
> > + * side and the child address is the corresponding address on the secondary
> > + * side.
> > + */
> > +struct of_pci_range {
> > + u32 child_addr[OF_PCI_ADDRESS_CELLS];
> > + u32 parent_addr[OF_PCI_ADDRESS_CELLS];
> > + u32 size[OF_PCI_SIZE_CELLS];
>
> > + if (pci_is_bridge(pdev)) {
> > + memcpy(rp[i].child_addr, rp[i].parent_addr,
> > + sizeof(rp[i].child_addr));
> > + } else {
> > + /*
> > + * For endpoint device, the lower 64-bits of child
> > + * address is always zero.
>
> I think this connects with the secondary side comment above, right? I
> think I would comment this as:
>
> /*
> * PCI-PCI bridges don't translate addresses, so the child
> * (secondary side) address is identical to the parent (primary
> * side) address.
> */
>
> and
>
> /*
> * Non-bridges have no child (secondary side) address, so clear it
> * out.
> */
>
> > + */
> > + rp[i].child_addr[0] = j;
>
> > + ret = of_changeset_add_empty_prop(ocs, np, "dynamic");
>
> It seems slightly confusing to use a "dynamic" property here when we
> also have the OF_DYNAMIC dynamic flag above. I think they have
> different meanings, don't they?

Hum, what's the property for? It's new in this version. Any DT property
needs to be documented, but I don't see why we need it.

Rob
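
For reference, the two comments Bjorn suggests above can be read against a
small userspace sketch of how one "ranges" entry might be filled. This is
not the patch code: demo_fill_range() and its parameters are hypothetical,
the parent address encoding is reduced to a flags cell plus a 64-bit
address, and treating the value stored in child_addr[0] as a BAR index is
an assumption drawn from the quoted excerpt.

```c
#include <stdint.h>
#include <string.h>

#define OF_PCI_ADDRESS_CELLS	3
#define OF_PCI_SIZE_CELLS	2

/* Same field order as the quoted struct: child, parent, size. */
struct of_pci_range {
	uint32_t child_addr[OF_PCI_ADDRESS_CELLS];
	uint32_t parent_addr[OF_PCI_ADDRESS_CELLS];
	uint32_t size[OF_PCI_SIZE_CELLS];
};

/* Hypothetical helper: fill one ranges entry for a bridge or an endpoint. */
void demo_fill_range(struct of_pci_range *r, int is_bridge, uint32_t flags,
		     uint64_t addr, uint64_t size, uint32_t bar_index)
{
	memset(r, 0, sizeof(*r));

	/* Simplified parent (primary side) address: flags, high, low. */
	r->parent_addr[0] = flags;
	r->parent_addr[1] = (uint32_t)(addr >> 32);
	r->parent_addr[2] = (uint32_t)addr;
	r->size[0] = (uint32_t)(size >> 32);
	r->size[1] = (uint32_t)size;

	if (is_bridge) {
		/*
		 * PCI-PCI bridges don't translate addresses, so the child
		 * (secondary side) address is identical to the parent one.
		 */
		memcpy(r->child_addr, r->parent_addr, sizeof(r->child_addr));
	} else {
		/*
		 * Non-bridges have no secondary side: the lower 64 bits stay
		 * zero and only the first cell is set (assumed BAR index).
		 */
		r->child_addr[0] = bar_index;
	}
}
```

Calling demo_fill_range() with is_bridge set produces an identity mapping;
with is_bridge clear, only the first child cell is non-zero.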

2023-06-30 15:01:21

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

On Thu, Jun 29, 2023 at 05:55:51PM -0600, Rob Herring wrote:
> On Thu, Jun 29, 2023 at 05:56:31PM -0500, Bjorn Helgaas wrote:
> > On Thu, Jun 29, 2023 at 10:19:47AM -0700, Lizhi Hou wrote:
> > > The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> > > spaces from multiple hardware peripherals to its PCI BAR. Normally,
> > > the PCI core discovers devices and BARs using the PCI enumeration process.
> > > There is no infrastructure to discover the hardware peripherals that are
> > > present in a PCI device, and which can be accessed through the PCI BARs.

> > > --- a/drivers/pci/of.c
> > > +++ b/drivers/pci/of.c
> > > @@ -487,6 +487,15 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
> > > } else {
> > > /* We found a P2P bridge, check if it has a node */
> > > ppnode = pci_device_to_OF_node(ppdev);
> > > +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES)
> >
> > I would use plain #ifdef here instead of IS_ENABLED(), as you did in
> > pci.h below. IS_ENABLED() is true if the Kconfig symbol is set to
> > either "y" or "m".
>
> Actually, IS_ENABLED() with a C 'if' rather than a preprocessor #if
> would work here and is preferred.

Makes sense; I see the justification at [1]. I do wish it didn't have
to be different between this usage and the "#ifdef
CONFIG_PCI_DYNAMIC_OF_NODES" in pci.h for the stubs. But this is OK
by me.

Bjorn

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=v6.4#n1162
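
The distinction discussed above can be tried outside the kernel with a
simplified imitation of the IS_ENABLED() family. The DEMO_* helpers, the
CONFIG_DEMO_* symbols and the demo_use_node_*() functions are made up for
this sketch; the real macros live in include/linux/kconfig.h and differ in
detail.

```c
/* Simplified userspace imitation of the kernel's IS_ENABLED() family. */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_(x)
#define DEMO_IS_DEFINED_(val) DEMO_IS_DEFINED__(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED__(arg1_or_junk) DEMO_SECOND(arg1_or_junk 1, 0, 0)

#define IS_BUILTIN(option) DEMO_IS_DEFINED(option)
#define IS_MODULE(option) DEMO_IS_DEFINED(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* Hypothetical Kconfig results, as autoconf.h would define them. */
#define CONFIG_DEMO_DYNAMIC_NODES 1	/* bool symbol set to "y" */
#define CONFIG_DEMO_TRISTATE_MODULE 1	/* tristate symbol set to "m" */
/* CONFIG_DEMO_DISABLED is deliberately left undefined ("n"). */

/* Preprocessor form: the guarded lines vanish when the option is off. */
int demo_use_node_ifdef(int node_is_generated)
{
#ifdef CONFIG_DEMO_DYNAMIC_NODES
	if (node_is_generated)
		return 0;
#endif
	return 1;
}

/*
 * C form: the branch is always parsed and type-checked, and the compiler
 * can drop it as dead code when the option is off.
 */
int demo_use_node_if(int node_is_generated)
{
	if (IS_ENABLED(CONFIG_DEMO_DYNAMIC_NODES) && node_is_generated)
		return 0;
	return 1;
}
```

For a bool symbol both forms behave the same at run time; the difference is
that the C form keeps the guarded code visible to the compiler in every
configuration.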

2023-06-30 17:24:34

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

On Thu, Jun 29, 2023 at 05:52:26PM -0600, Rob Herring wrote:
> On Thu, Jun 29, 2023 at 05:56:31PM -0500, Bjorn Helgaas wrote:
> > On Thu, Jun 29, 2023 at 10:19:47AM -0700, Lizhi Hou wrote:
> > > The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> > > spaces from multiple hardware peripherals to its PCI BAR. Normally,
> > > the PCI core discovers devices and BARs using the PCI enumeration process.
> > > There is no infrastructure to discover the hardware peripherals that are
> > > present in a PCI device, and which can be accessed through the PCI BARs.
> >
> > IIUC this is basically a multi-function device except that instead of
> > each device being a separate PCI Function, they all appear in a single
> > Function. That would mean all the devices share the same config space
> > so a single PCI Command register controls all of them, they all share
> > the same IRQs (either INTx or MSI/MSI-X), any MMIO registers are likely
> > in a shared BAR, etc., right?
>
> Could be multiple BARs, but yes.

Where does the PCI glue live? E.g., who ioremaps the BARs? Who sets
up PCI interrupts? Who enables bus mastering? The platform driver
that claims the DT node wouldn't know that this is part of a PCI
device, so I guess the PCI driver must do all that stuff? I don't see
it in the xmgmt-drv.c from
https://lore.kernel.org/all/[email protected]/

> > Obviously PCI enumeration only sees the single Function and binds a
> > single driver to it. But IIUC, you want to use existing drivers for
> > each of these sub-devices, so this series adds a DT node for the
> > single Function (using the quirks that call of_pci_make_dev_node()).
> > And I assume that when the PCI driver claims the single Function, it
> > will use that DT node to add platform devices, and those existing
> > drivers can claim those?
>
> Yes. It will call some variant of of_platform_populate().
>
> > I don't see the PCI driver for the single Function in this series. Is
> > that coming? Is this series useful without it?
>
> https://lore.kernel.org/all/[email protected]/
>
> I asked for things to be split up as the original series did a lot
> of new things at once. This series only works with the QEMU PCI test
> device which the DT unittest will use.
>
> > > Apparently, the device tree framework requires a device tree node for the
> > > PCI device. Thus, it can generate the device tree nodes for hardware
> > > peripherals underneath. Because PCI is self discoverable bus, there might
> > > not be a device tree node created for PCI devices. Furthermore, if the PCI
> > > device is hot pluggable, when it is plugged in, the device tree nodes for
> > > its parent bridges are required. Add support to generate device tree node
> > > for PCI bridges.
> >
> > Can you remind me why hot-adding a PCI device requires DT nodes for
> > parent bridges?
>
> Because the PCI device needs a DT node and we can't just put PCI devices
> in the DT root. We have to create the bus hierarchy.
>
> > I don't think we have those today, so maybe the DT
> > node for the PCI device requires a DT parent? How far up does that
> > go?
>
> All the way.
>
> > From this patch, I guess a Root Port would be the top DT node on
> > a PCIe system, since that's the top PCI-to-PCI bridge?
>
> Yes. Plus above the host bridge could have a hierarchy of nodes.

I'm missing something if it goes "all the way up," i.e., to a single
system root, but a Root Port is the top DT node. If a Root Port is
the top, there would be several roots.

> > This patch adds a DT node for *every* PCI bridge in the system. We
> > only actually need that node for these unusual devices. Is there some
> > way the driver for the single PCI Function could add that node when it
> > is needed? Sorry if you've answered this in the past; maybe the
> > answer could be in the commit log or a code comment in case somebody
> > else wonders.
>
> This was discussed early on. I don't think it would work to create the
> nodes at the time we discover we have a device that wants a DT node. The
> issue is decisions are made in the code based on whether there's a DT
> node for a PCI device or not. It might work, but I think it's fragile to
> have nodes attached to devices at different points in time.

Ah. So I guess the problem is we enumerate a PCI bridge, we might do
something based on the fact that it doesn't have a DT node, then add a
DT node for it later.

Bjorn

2023-06-30 18:46:10

by Lizhi Hou

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge


On 6/29/23 16:52, Rob Herring wrote:
>>> + rp[i].child_addr[0] = j;
>>> + ret = of_changeset_add_empty_prop(ocs, np, "dynamic");
>> It seems slightly confusing to use a "dynamic" property here when we
>> also have the OF_DYNAMIC dynamic flag above. I think they have
>> different meanings, don't they?
> Hum, what's the property for? It's new in this version. Any DT property
> needs to be documented, but I don't see why we need it.

This is mentioned in my previous reply for V9:

https://lore.kernel.org/lkml/[email protected]/

As we discussed before, "interrupt-map" was intended to be used here.
After thinking about it more, it may not work for the cases where ppnode
is not dynamically generated and does not have "interrupt-map". For
example, on the IBM ppc system, the device tree has nodes for PCI bridges
and they do not have "interrupt-map".

Based on previous discussions, OF_DYNAMIC should not be used here. So I
think adding "dynamic" might be a way to identify the dynamically added
node. Or we can introduce a new flag, e.g. OF_IRQ_SWIZZLING.


Thanks,

Lizhi
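
For context on why a bridge node without "interrupt-map" matters: when
of_irq_parse_pci() finds no node for a parent bridge, it falls back to the
conventional INTx swizzle and keeps walking up the hierarchy. A minimal
userspace sketch of that arithmetic follows. The demo_* names are
hypothetical, ARI handling is ignored, and the array of slot numbers stands
in for the chain of pci_dev structures the kernel actually walks.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Conventional PCI-PCI bridge swizzle. Pins are numbered 1..4 for
 * INTA..INTD; slot is the device number of the device below the bridge.
 */
uint8_t demo_swizzle_pin(uint8_t slot, uint8_t pin)
{
	return (uint8_t)((((pin - 1) + slot) % 4) + 1);
}

/*
 * Apply the swizzle once per bridge level that has no DT node.
 * slots[0] is the slot of the originating device, slots[1] the slot of
 * the bridge above it on its own parent bus, and so on.
 */
uint8_t demo_swizzle_path(const uint8_t *slots, size_t depth, uint8_t pin)
{
	size_t i;

	for (i = 0; i < depth; i++)
		pin = demo_swizzle_pin(slots[i], pin);

	return pin;
}
```

A device in slot 1 raising INTA appears as INTB on the bridge above it; if
that bridge sits in slot 3 of the next bus up, the pin wraps back to INTA.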


2023-06-30 20:42:58

by Lizhi Hou

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge


On 6/30/23 09:48, Bjorn Helgaas wrote:
> On Thu, Jun 29, 2023 at 05:52:26PM -0600, Rob Herring wrote:
>> On Thu, Jun 29, 2023 at 05:56:31PM -0500, Bjorn Helgaas wrote:
>>> On Thu, Jun 29, 2023 at 10:19:47AM -0700, Lizhi Hou wrote:
>>>> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
>>>> spaces from multiple hardware peripherals to its PCI BAR. Normally,
>>>> the PCI core discovers devices and BARs using the PCI enumeration process.
>>>> There is no infrastructure to discover the hardware peripherals that are
>>>> present in a PCI device, and which can be accessed through the PCI BARs.
>>> IIUC this is basically a multi-function device except that instead of
>>> each device being a separate PCI Function, they all appear in a single
>>> Function. That would mean all the devices share the same config space
>>> so a single PCI Command register controls all of them, they all share
>>> the same IRQs (either INTx or MSI/MSI-X), any MMIO registers are likely
>>> in a shared BAR, etc., right?
>> Could be multiple BARs, but yes.
> Where does the PCI glue live? E.g., who ioremaps the BARs? Who sets
> up PCI interrupts? Who enables bus mastering? The platform driver
> that claims the DT node wouldn't know that this is part of a PCI
> device, so I guess the PCI driver must do all that stuff? I don't see
> it in the xmgmt-drv.c from
> https://lore.kernel.org/all/[email protected]/
>
Yes, the PCI driver will do all that stuff. This xmgmt-drv.c was created
just to populate the devices based on fdt input. It was removed after
the unittest was created, which can populate devices and verify the
address translation.


Thanks,

Lizhi


2023-07-18 16:06:50

by Lizhi Hou

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

Hi Rob,

Do you have any comment on this?

Thanks,

Lizhi

On 6/30/23 11:24, Lizhi Hou wrote:
>
> On 6/29/23 16:52, Rob Herring wrote:
>>>> +            rp[i].child_addr[0] = j;
>>>> +    ret = of_changeset_add_empty_prop(ocs, np, "dynamic");
>>> It seems slightly confusing to use a "dynamic" property here when we
>>> also have the OF_DYNAMIC dynamic flag above.  I think they have
>>> different meanings, don't they?
>> Hum, what's the property for? It's new in this version. Any DT property
>> needs to be documented, but I don't see why we need it.
>
> This is mentioned in my previous reply for V9
>
> https://lore.kernel.org/lkml/[email protected]/
>
>
> As we discussed before, "interrupt-map" was intended to be used here.
>
> And after thinking it more, it may not work for the cases where ppnode
>
> is not dynamically generated and it does not have "interrupt-map".
>
> For example the IBM ppc system, its device tree has nodes for pci bridge
>
> and it does not have "interrupt-map".
>
> Based on previous discussions, OF_DYNAMIC should not be used here.
>
> So I think adding "dynamic" might be a way to identify the dynamically
>
> added node. Or we can introduce a new flag e.g OF_IRQ_SWIZZLING.
>
>
> Thanks,
>
> Lizhi
>

2023-07-18 18:19:23

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge

On Fri, Jun 30, 2023 at 12:25 PM Lizhi Hou <[email protected]> wrote:
>
>
> On 6/29/23 16:52, Rob Herring wrote:
> >>> + rp[i].child_addr[0] = j;
> >>> + ret = of_changeset_add_empty_prop(ocs, np, "dynamic");
> >> It seems slightly confusing to use a "dynamic" property here when we
> >> also have the OF_DYNAMIC dynamic flag above. I think they have
> >> different meanings, don't they?
> > Hum, what's the property for? It's new in this version. Any DT property
> > needs to be documented, but I don't see why we need it.
>
> This is mentioned in my previous reply for V9
>
> https://lore.kernel.org/lkml/[email protected]/
>
> As we discussed before, "interrupt-map" was intended to be used here.
>
> And after thinking it more, it may not work for the cases where ppnode
>
> is not dynamically generated and it does not have "interrupt-map".
>
> For example the IBM ppc system, its device tree has nodes for pci bridge
>
> and it does not have "interrupt-map".

How do you know? I ask because usually the only way I have visibility
there is when I break something. In traditional OpenFirmware, which
IBM PPC is, all PCI devices have a DT node because it's the firmware
telling the OS "these are the devices I discovered and this is how I
configured them".

> Based on previous discussions, OF_DYNAMIC should not be used here.

For the same reasons, I don't think the behavior should change based
on being dynamic. Now maybe the behavior when it's an ACPI system with
DT overlays has to change, but that's a problem for later. I don't yet
know if we'd handle that here somehow or elsewhere so that this node
looks like a normal DT system.

This should all work the same whether we've generated the nodes or
they were already present in the FDT when we booted.

> So I think adding "dynamic" might be a way to identify the dynamically
>
> added node. Or we can introduce a new flag e.g OF_IRQ_SWIZZLING.

I hope not. The flags tend to be hacks.

Rob

2023-07-24 18:35:36

by Lizhi Hou

[permalink] [raw]
Subject: Re: [PATCH V10 2/5] PCI: Create device tree node for bridge


On 7/18/23 11:15, Rob Herring wrote:
> On Fri, Jun 30, 2023 at 12:25 PM Lizhi Hou <[email protected]> wrote:
>>
>> On 6/29/23 16:52, Rob Herring wrote:
>>>>> + rp[i].child_addr[0] = j;
>>>>> + ret = of_changeset_add_empty_prop(ocs, np, "dynamic");
>>>> It seems slightly confusing to use a "dynamic" property here when we
>>>> also have the OF_DYNAMIC dynamic flag above. I think they have
>>>> different meanings, don't they?
>>> Hum, what's the property for? It's new in this version. Any DT property
>>> needs to be documented, but I don't see why we need it.
>> This is mentioned in my previous reply for V9
>>
>> https://lore.kernel.org/lkml/[email protected]/
>>
>> As we discussed before, "interrupt-map" was intended to be used here.
>>
>> And after thinking it more, it may not work for the cases where ppnode
>>
>> is not dynamically generated and it does not have "interrupt-map".
>>
>> For example the IBM ppc system, its device tree has nodes for pci bridge
>>
>> and it does not have "interrupt-map".
> How do you know? I ask because usually the only way I have visibility
> there is when I break something. In traditional OpenFirmware, which
> IBM PPC is, all PCI devices have a DT node because it's the firmware
> telling the OS "these are the devices I discovered and this is how I
> configured them".

I configured a ppc VM and added a bridge to the VM:

qemu-system-ppc -L pc-bios -boot c -prom-env "boot-device=hd:,\yaboot"
-prom-env "boot-args=conf=hd:,\yaboot.conf" -M mac99 -m 1024 -hda
debian10.qcow2 -nographic -device pci-bridge,chassis_nr=1,id=pci.9

# ls /proc/device-tree/pci\@f2000000/pci1b36\,1\@f/
66mhz-capable       class-code    fast-back-to-back  min-grant    vendor-id
assigned-addresses  device-id     interrupts         name
bus-range           devsel-speed  linux,phandle      reg
cache-line-size     ethernet@1    max-latency        revision-id

The bridge node does not have 'interrupt-map'. That is why I was concerned
about using 'interrupt-map'.

To debug further whether it really breaks anything, I added a NIC
device under the bridge. Even without my patch, it fails anyway:

     [    0.086586] pci 0000:01:01.0: of_irq_parse_pci: failed with rc=-22

So I set up another power10 VM and saw that 'interrupt-map' is created for
the PCI bridge, and the NIC device under the bridge works fine.

Maybe using 'interrupt-map' will not break anything in the real world.
I will re-create a patchset which uses 'interrupt-map' (like V9) and
checks it only when CONFIG_PCI_DYNAMIC_OF_NODES is turned on.


Thanks,

Lizhi

>
>> Based on previous discussions, OF_DYNAMIC should not be used here.
> For the same reasons, I don't think the behavior should change based
> on being dynamic. Now maybe the behavior when it's an ACPI system with
> DT overlays has to change, but that's a problem for later. I don't yet
> know if we'd handle that here somehow or elsewhere so that this node
> looks like a normal DT system.
>
> This should all work the same whether we've generated the nodes or
> they were already present in the FDT when we booted.
>
>> So I think adding "dynamic" might be a way to identify the dynamically
>>
>> added node. Or we can introduce a new flag e.g OF_IRQ_SWIZZLING.
> I hope not. The flags tend to be hacks.
>
> Rob