2019-09-27 00:27:22

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 00/11] of: dma-ranges fixes and improvements

This series fixes several issues related to 'dma-ranges'. Primarily,
'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
devices not described in the DT. A common case needing dma-ranges is a
32-bit PCIe bridge on a 64-bit system. This affects several platforms
including Broadcom, NXP, Renesas, and Arm Juno. There's been several
attempts to fix these issues, most recently earlier this week[1].

In the process, I found several bugs in the address translation. It
appears that things have happened to work as various DTs happen to use
1:1 addresses.

First 3 patches are just some clean-up. The 4th patch adds a unittest
exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
making it work on either a struct device child node or a struct
device_node parent node so that it works on bus leaf nodes like PCI
bridges. Patches 10 and 11 fix 2 issues with address translation for
dma-ranges.

My testing on this has been with QEMU virt machine hacked up to set PCI
dma-ranges and the unittest. Nicolas reports this series resolves the
issues on Rpi4 and NXP Layerscape platforms.

Rob

[1] https://lore.kernel.org/linux-arm-kernel/[email protected]/

Rob Herring (5):
of: Remove unused of_find_matching_node_by_address()
of: Make of_dma_get_range() private
of/unittest: Add dma-ranges address translation tests
of/address: Translate 'dma-ranges' for parent nodes missing
'dma-ranges'
of/address: Fix of_pci_range_parser_one translation of DMA addresses

Robin Murphy (6):
of: address: Report of_dma_get_range() errors meaningfully
of: Ratify of_dma_configure() interface
of/address: Introduce of_get_next_dma_parent() helper
of: address: Follow DMA parent for "dma-coherent"
of: Factor out #{addr,size}-cells parsing
of: Make of_dma_get_range() work on bus nodes

drivers/of/address.c | 83 +++++++++----------
drivers/of/base.c | 32 ++++---
drivers/of/device.c | 12 ++-
drivers/of/of_private.h | 14 ++++
drivers/of/unittest-data/testcases.dts | 1 +
drivers/of/unittest-data/tests-address.dtsi | 48 +++++++++++
drivers/of/unittest.c | 92 +++++++++++++++++++++
include/linux/of_address.h | 21 +----
include/linux/of_device.h | 4 +-
9 files changed, 227 insertions(+), 80 deletions(-)
create mode 100644 drivers/of/unittest-data/tests-address.dtsi

--
2.20.1


2019-09-27 00:27:30

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 05/11] of: Ratify of_dma_configure() interface

From: Robin Murphy <[email protected]>

For various DMA masters not directly represented in DT, we pass the OF
node of their parent or bridge device as the master_np argument to
of_dma_configure(), such that they can inherit the appropriate DMA
configuration from whatever is described there. This creates an
ambiguity for properties which are not valid for a device itself but
only for its parent bus, since we don't know whether to start looking
for those at master_np or master_np->parent.

Fortunately, the de-facto interface since the prototype change in
1f5c69aa51f9 ("of: Move of_dma_configure() to device.c to help re-use")
is pretty clear-cut: either master_np is redundant with dev->of_node, or
dev->of_node is NULL and master_np is already the relevant parent. Let's
formally ratify that so we can start relying on it.

Signed-off-by: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/device.c | 12 ++++++++++--
include/linux/of_device.h | 4 ++--
2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index da8158392010..a45261e21144 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -75,7 +75,8 @@ int of_device_add(struct platform_device *ofdev)
/**
* of_dma_configure - Setup DMA configuration
* @dev: Device to apply DMA configuration
- * @np: Pointer to OF node having DMA configuration
+ * @parent: OF node of parent device having DMA configuration, if
+ * @dev->of_node is NULL (ignored otherwise)
* @force_dma: Whether device is to be set up by of_dma_configure() even if
* DMA capability is not explicitly described by firmware.
*
@@ -86,15 +87,22 @@ int of_device_add(struct platform_device *ofdev)
* can use a platform bus notifier and handle BUS_NOTIFY_ADD_DEVICE events
* to fix up DMA configuration.
*/
-int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
+int of_dma_configure(struct device *dev, struct device_node *parent, bool force_dma)
{
u64 dma_addr, paddr, size = 0;
int ret;
bool coherent;
unsigned long offset;
const struct iommu_ops *iommu;
+ struct device_node *np;
u64 mask;

+ np = dev->of_node;
+ if (!np)
+ np = parent;
+ if (!np)
+ return -ENODEV;
+
ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
if (ret < 0) {
/*
diff --git a/include/linux/of_device.h b/include/linux/of_device.h
index 8d31e39dd564..a4fe418e57f6 100644
--- a/include/linux/of_device.h
+++ b/include/linux/of_device.h
@@ -56,7 +56,7 @@ static inline struct device_node *of_cpu_device_node_get(int cpu)
}

int of_dma_configure(struct device *dev,
- struct device_node *np,
+ struct device_node *parent,
bool force_dma);
#else /* CONFIG_OF */

@@ -107,7 +107,7 @@ static inline struct device_node *of_cpu_device_node_get(int cpu)
}

static inline int of_dma_configure(struct device *dev,
- struct device_node *np,
+ struct device_node *parent,
bool force_dma)
{
return 0;
--
2.20.1

2019-09-27 00:27:38

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 06/11] of/address: Introduce of_get_next_dma_parent() helper

From: Robin Murphy <[email protected]>

Add of_get_next_dma_parent() helper which is similar to
__of_get_dma_parent(), but can be used in iterators and decrements the
ref count on the prior parent.

Signed-off-by: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 53d2656c2269..e9188c82fdae 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -695,6 +695,16 @@ static struct device_node *__of_get_dma_parent(const struct device_node *np)
return of_node_get(args.np);
}

+static struct device_node *of_get_next_dma_parent(struct device_node *np)
+{
+ struct device_node *parent;
+
+ parent = __of_get_dma_parent(np);
+ of_node_put(np);
+
+ return parent;
+}
+
u64 of_translate_dma_address(struct device_node *dev, const __be32 *in_addr)
{
struct device_node *host;
--
2.20.1

2019-09-27 00:27:38

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 04/11] of/unittest: Add dma-ranges address translation tests

The functions for parsing 'dma-ranges' ranges are buggy and fail to
handle several conditions. Add new tests for of_dma_get_range() and
for_each_of_pci_range().

With this test, we get 5 new failures which are fixed in subsequent
commits:

OF: translation of DMA address(0) to CPU address failed node(/testcase-data/address-tests/device@70000000)
FAIL of_unittest_dma_ranges_one():798 of_dma_get_range failed on node /testcase-data/address-tests/device@70000000 rc=-22
OF: translation of DMA address(10000000) to CPU address failed node(/testcase-data/address-tests/bus@80000000/device@1000)
FAIL of_unittest_dma_ranges_one():798 of_dma_get_range failed on node /testcase-data/address-tests/bus@80000000/device@1000 rc=-22
OF: translation of DMA address(0) to CPU address failed node(/testcase-data/address-tests/pci@90000000)
FAIL of_unittest_dma_ranges_one():798 of_dma_get_range failed on node /testcase-data/address-tests/pci@90000000 rc=-22
FAIL of_unittest_pci_dma_ranges():851 for_each_of_pci_range wrong CPU addr (d0000000) on node /testcase-data/address-tests/pci@90000000
FAIL of_unittest_pci_dma_ranges():861 for_each_of_pci_range wrong CPU addr (ffffffffffffffff) on node /testcase-data/address-tests/pci@90000000

Cc: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/unittest-data/testcases.dts | 1 +
drivers/of/unittest-data/tests-address.dtsi | 48 +++++++++++
drivers/of/unittest.c | 92 +++++++++++++++++++++
3 files changed, 141 insertions(+)
create mode 100644 drivers/of/unittest-data/tests-address.dtsi

diff --git a/drivers/of/unittest-data/testcases.dts b/drivers/of/unittest-data/testcases.dts
index 55fe0ee20109..a85b5e1c381a 100644
--- a/drivers/of/unittest-data/testcases.dts
+++ b/drivers/of/unittest-data/testcases.dts
@@ -15,5 +15,6 @@
#include "tests-phandle.dtsi"
#include "tests-interrupts.dtsi"
#include "tests-match.dtsi"
+#include "tests-address.dtsi"
#include "tests-platform.dtsi"
#include "tests-overlay.dtsi"
diff --git a/drivers/of/unittest-data/tests-address.dtsi b/drivers/of/unittest-data/tests-address.dtsi
new file mode 100644
index 000000000000..3fe5d3987beb
--- /dev/null
+++ b/drivers/of/unittest-data/tests-address.dtsi
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/ {
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ testcase-data {
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges;
+
+ address-tests {
+ #address-cells = <1>;
+ #size-cells = <1>;
+ /* ranges here is to make sure we don't use it for
+ * dma-ranges translation */
+ ranges = <0x70000000 0x70000000 0x40000000>,
+ <0x00000000 0xd0000000 0x20000000>;
+ dma-ranges = <0x0 0x20000000 0x40000000>;
+
+ device@70000000 {
+ reg = <0x70000000 0x1000>;
+ };
+
+ bus@80000000 {
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0x0 0x80000000 0x100000>;
+ dma-ranges = <0x10000000 0x0 0x40000000>;
+
+ device@1000 {
+ reg = <0x1000 0x1000>;
+ };
+ };
+
+ pci@90000000 {
+ device_type = "pci";
+ #address-cells = <3>;
+ #size-cells = <2>;
+ reg = <0x90000000 0x1000>;
+ ranges = <0x42000000 0x0 0x40000000 0x40000000 0x0 0x10000000>;
+ dma-ranges = <0x42000000 0x0 0x80000000 0x00000000 0x0 0x10000000>,
+ <0x42000000 0x0 0xc0000000 0x20000000 0x0 0x10000000>;
+ };
+
+ };
+ };
+};
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index e6b175370f2e..3969075194c5 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -12,6 +12,7 @@
#include <linux/hashtable.h>
#include <linux/libfdt.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_fdt.h>
#include <linux/of_irq.h>
#include <linux/of_platform.h>
@@ -779,6 +780,95 @@ static void __init of_unittest_changeset(void)
#endif
}

+static void __init of_unittest_dma_ranges_one(const char *path,
+ u64 expect_dma_addr, u64 expect_paddr, u64 expect_size)
+{
+ struct device_node *np;
+ u64 dma_addr, paddr, size;
+ int rc;
+
+ np = of_find_node_by_path(path);
+ if (!np) {
+ pr_err("missing testcase data\n");
+ return;
+ }
+
+ rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
+
+ unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
+ if (!rc) {
+ unittest(size == expect_size,
+ "of_dma_get_range wrong size on node %pOF size=%llx\n", np, size);
+ unittest(paddr == expect_paddr,
+ "of_dma_get_range wrong phys addr (%llx) on node %pOF", paddr, np);
+ unittest(dma_addr == expect_dma_addr,
+ "of_dma_get_range wrong DMA addr (%llx) on node %pOF", dma_addr, np);
+ }
+ of_node_put(np);
+}
+
+static void __init of_unittest_parse_dma_ranges(void)
+{
+ of_unittest_dma_ranges_one("/testcase-data/address-tests/device@70000000",
+ 0x0, 0x20000000, 0x40000000);
+ of_unittest_dma_ranges_one("/testcase-data/address-tests/bus@80000000/device@1000",
+ 0x10000000, 0x20000000, 0x40000000);
+ of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
+ 0x80000000, 0x20000000, 0x10000000);
+}
+
+static void __init of_unittest_pci_dma_ranges(void)
+{
+ struct device_node *np;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
+ int i = 0;
+
+ if (!IS_ENABLED(CONFIG_PCI))
+ return;
+
+ np = of_find_node_by_path("/testcase-data/address-tests/pci@90000000");
+ if (!np) {
+ pr_err("missing testcase data\n");
+ return;
+ }
+
+ if (of_pci_dma_range_parser_init(&parser, np)) {
+ pr_err("missing dma-ranges property\n");
+ return;
+ }
+
+ /*
+ * Get the dma-ranges from the device tree
+ */
+ for_each_of_pci_range(&parser, &range) {
+ if (!i) {
+ unittest(range.size == 0x10000000,
+ "for_each_of_pci_range wrong size on node %pOF size=%llx\n",
+ np, range.size);
+ unittest(range.cpu_addr == 0x20000000,
+ "for_each_of_pci_range wrong CPU addr (%llx) on node %pOF",
+ range.cpu_addr, np);
+ unittest(range.pci_addr == 0x80000000,
+ "for_each_of_pci_range wrong DMA addr (%llx) on node %pOF",
+ range.pci_addr, np);
+ } else {
+ unittest(range.size == 0x10000000,
+ "for_each_of_pci_range wrong size on node %pOF size=%llx\n",
+ np, range.size);
+ unittest(range.cpu_addr == 0x40000000,
+ "for_each_of_pci_range wrong CPU addr (%llx) on node %pOF",
+ range.cpu_addr, np);
+ unittest(range.pci_addr == 0xc0000000,
+ "for_each_of_pci_range wrong DMA addr (%llx) on node %pOF",
+ range.pci_addr, np);
+ }
+ i++;
+ }
+
+ of_node_put(np);
+}
+
static void __init of_unittest_parse_interrupts(void)
{
struct device_node *np;
@@ -2552,6 +2642,8 @@ static int __init of_unittest(void)
of_unittest_changeset();
of_unittest_parse_interrupts();
of_unittest_parse_interrupts_extended();
+ of_unittest_parse_dma_ranges();
+ of_unittest_pci_dma_ranges();
of_unittest_match_node();
of_unittest_platform_populate();
of_unittest_overlay();
--
2.20.1

2019-09-27 00:27:58

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 10/11] of/address: Translate 'dma-ranges' for parent nodes missing 'dma-ranges'

'dma-ranges' frequently exists without parent nodes having 'dma-ranges'.
While this is an error for 'ranges', this is fine because DMA capable
devices always have a translatable DMA address. Also, with no
'dma-ranges' at all, the assumption is that DMA addresses are 1:1 with
no restrictions unless perhaps the device itself has implicit
restrictions.

Cc: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index e918001c7798..5b835d332709 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -519,9 +519,13 @@ static int of_translate_one(struct device_node *parent, struct of_bus *bus,
*
* As far as we know, this damage only exists on Apple machines, so
* This code is only enabled on powerpc. --gcl
+ *
+ * This quirk also applies for 'dma-ranges' which frequently exist in
+ * child nodes without 'dma-ranges' in the parent nodes. --RobH
*/
ranges = of_get_property(parent, rprop, &rlen);
- if (ranges == NULL && !of_empty_ranges_quirk(parent)) {
+ if (ranges == NULL && !of_empty_ranges_quirk(parent) &&
+ strcmp(rprop, "dma-ranges")) {
pr_debug("no ranges; cannot translate\n");
return 1;
}
--
2.20.1

2019-09-27 00:28:01

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 11/11] of/address: Fix of_pci_range_parser_one translation of DMA addresses

of_pci_range_parser_one() has a bug when parsing dma-ranges. When it
translates the parent address (aka cpu address in the code), 'ranges' is
always being used. This happens to work because most users are just 1:1
translation.

Cc: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 15 ++++++++++++---
include/linux/of_address.h | 1 +
2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 5b835d332709..54011a355b81 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -243,6 +243,7 @@ static int parser_init(struct of_pci_range_parser *parser,
parser->node = node;
parser->pna = of_n_addr_cells(node);
parser->np = parser->pna + na + ns;
+ parser->dma = !strcmp(name, "dma-ranges");

parser->range = of_get_property(node, name, &rlen);
if (parser->range == NULL)
@@ -281,7 +282,11 @@ struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser *parser,
range->pci_space = be32_to_cpup(parser->range);
range->flags = of_bus_pci_get_flags(parser->range);
range->pci_addr = of_read_number(parser->range + 1, ns);
- range->cpu_addr = of_translate_address(parser->node,
+ if (parser->dma)
+ range->cpu_addr = of_translate_dma_address(parser->node,
+ parser->range + na);
+ else
+ range->cpu_addr = of_translate_address(parser->node,
parser->range + na);
range->size = of_read_number(parser->range + parser->pna + na, ns);

@@ -294,8 +299,12 @@ struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser *parser,

flags = of_bus_pci_get_flags(parser->range);
pci_addr = of_read_number(parser->range + 1, ns);
- cpu_addr = of_translate_address(parser->node,
- parser->range + na);
+ if (parser->dma)
+ cpu_addr = of_translate_dma_address(parser->node,
+ parser->range + na);
+ else
+ cpu_addr = of_translate_address(parser->node,
+ parser->range + na);
size = of_read_number(parser->range + parser->pna + na, ns);

if (flags != range->flags)
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index ddda3936039c..eac7ab109df4 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -12,6 +12,7 @@ struct of_pci_range_parser {
const __be32 *end;
int np;
int pna;
+ bool dma;
};

struct of_pci_range {
--
2.20.1

2019-09-27 00:29:29

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 02/11] of: Make of_dma_get_range() private

of_dma_get_range() is only used within the DT core code, so remove the
export and move the header declaration to the private header.

Cc: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 1 -
drivers/of/of_private.h | 11 +++++++++++
include/linux/of_address.h | 8 --------
3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 0c3cf515c510..8e354d12fb04 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -972,7 +972,6 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz

return ret;
}
-EXPORT_SYMBOL_GPL(of_dma_get_range);

/**
* of_dma_is_coherent - Check if device is coherent
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 24786818e32e..f8c58615c393 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -158,4 +158,15 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
#define for_each_transaction_entry_reverse(_oft, _te) \
list_for_each_entry_reverse(_te, &(_oft)->te_list, node)

+#ifdef CONFIG_OF_ADDRESS
+extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size);
+#else
+static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size)
+{
+ return -ENODEV;
+}
+#endif
+
#endif /* _LINUX_OF_PRIVATE_H */
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index e317f375374a..ddda3936039c 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -51,8 +51,6 @@ extern int of_pci_dma_range_parser_init(struct of_pci_range_parser *parser,
extern struct of_pci_range *of_pci_range_parser_one(
struct of_pci_range_parser *parser,
struct of_pci_range *range);
-extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
- u64 *paddr, u64 *size);
extern bool of_dma_is_coherent(struct device_node *np);
#else /* CONFIG_OF_ADDRESS */
static inline void __iomem *of_io_request_and_map(struct device_node *device,
@@ -92,12 +90,6 @@ static inline struct of_pci_range *of_pci_range_parser_one(
return NULL;
}

-static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
- u64 *paddr, u64 *size)
-{
- return -ENODEV;
-}
-
static inline bool of_dma_is_coherent(struct device_node *np)
{
return false;
--
2.20.1

2019-09-27 00:30:11

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 07/11] of: address: Follow DMA parent for "dma-coherent"

From: Robin Murphy <[email protected]>

Much like for address translation, when checking for DMA coherence we
should be sure to walk up the DMA hierarchy, rather than the MMIO one,
now that we can accommodate them being different.

Signed-off-by: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index e9188c82fdae..3fd34f7ad772 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -999,7 +999,7 @@ bool of_dma_is_coherent(struct device_node *np)
of_node_put(node);
return true;
}
- node = of_get_next_parent(node);
+ node = of_get_next_dma_parent(node);
}
of_node_put(node);
return false;
--
2.20.1

2019-09-27 00:30:18

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 09/11] of: Make of_dma_get_range() work on bus nodes

From: Robin Murphy <[email protected]>

Since the "dma-ranges" property is only valid for a node representing a
bus, of_dma_get_range() currently assumes the node passed in is a leaf
representing a device, and starts the walk from its parent. In cases
like PCI host controllers on typical FDT systems, however, where the PCI
endpoints are probed dynamically the initial leaf node represents the
'bus' itself, and this logic means we fail to consider any "dma-ranges"
describing the host bridge itself. Rework the logic such that
of_dma_get_range() works correctly starting from a bus node containing
"dma-ranges".

Signed-off-by: Robin Murphy <[email protected]>
[robh: Allow for the bus child node to still be passed in]
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 24 +++++-------------------
1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 887c0413f648..e918001c7798 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -922,18 +922,9 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
if (!node)
return -EINVAL;

- while (1) {
- struct device_node *parent;
-
- naddr = of_n_addr_cells(node);
- nsize = of_n_size_cells(node);
-
- parent = __of_get_dma_parent(node);
- of_node_put(node);
-
- node = parent;
- if (!node)
- break;
+ while (node) {
+ naddr = of_bus_n_addr_cells(node);
+ nsize = of_bus_n_size_cells(node);

ranges = of_get_property(node, "dma-ranges", &len);

@@ -941,12 +932,7 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
if (ranges && len > 0)
break;

- /*
- * At least empty ranges has to be defined for parent node if
- * DMA is supported
- */
- if (!ranges)
- break;
+ node = of_get_next_dma_parent(node);
}

if (!ranges) {
@@ -965,7 +951,7 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
* size : nsize cells
*/
dmaaddr = of_read_number(ranges, naddr);
- *paddr = of_translate_dma_address(np, ranges);
+ *paddr = of_translate_dma_address(node, ranges + naddr);
if (*paddr == OF_BAD_ADDR) {
pr_err("translation of DMA address(%llx) to CPU address failed node(%pOF)\n",
dmaaddr, np);
--
2.20.1

2019-09-27 00:30:52

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 08/11] of: Factor out #{addr,size}-cells parsing

From: Robin Murphy <[email protected]>

In some cases such as PCI host controllers, we may have a "parent bus"
which is an OF leaf node, but still need to correctly parse ranges from
the point of view of that bus. For that, factor out variants of the
"#addr-cells" and "#size-cells" parsers which do not assume they have a
device node and thus immediately traverse upwards before reading the
relevant property.

Signed-off-by: Robin Murphy <[email protected]>
[robh: don't make of_bus_n_{addr,size}_cells() public]
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 2 ++
drivers/of/base.c | 32 ++++++++++++++++++++++----------
drivers/of/of_private.h | 3 +++
3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 3fd34f7ad772..887c0413f648 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -14,6 +14,8 @@
#include <linux/slab.h>
#include <linux/string.h>

+#include "of_private.h"
+
/* Max address size we deal with */
#define OF_MAX_ADDR_CELLS 4
#define OF_CHECK_ADDR_COUNT(na) ((na) > 0 && (na) <= OF_MAX_ADDR_CELLS)
diff --git a/drivers/of/base.c b/drivers/of/base.c
index 55e7f5bb0549..12b2e9287117 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -86,34 +86,46 @@ static bool __of_node_is_type(const struct device_node *np, const char *type)
return np && match && type && !strcmp(match, type);
}

-int of_n_addr_cells(struct device_node *np)
+int of_bus_n_addr_cells(struct device_node *np)
{
u32 cells;

- do {
- if (np->parent)
- np = np->parent;
+ for (; np; np = np->parent)
if (!of_property_read_u32(np, "#address-cells", &cells))
return cells;
- } while (np->parent);
+
/* No #address-cells property for the root node */
return OF_ROOT_NODE_ADDR_CELLS_DEFAULT;
}
+
+int of_n_addr_cells(struct device_node *np)
+{
+ if (np->parent)
+ np = np->parent;
+
+ return of_bus_n_addr_cells(np);
+}
EXPORT_SYMBOL(of_n_addr_cells);

-int of_n_size_cells(struct device_node *np)
+int of_bus_n_size_cells(struct device_node *np)
{
u32 cells;

- do {
- if (np->parent)
- np = np->parent;
+ for (; np; np = np->parent)
if (!of_property_read_u32(np, "#size-cells", &cells))
return cells;
- } while (np->parent);
+
/* No #size-cells property for the root node */
return OF_ROOT_NODE_SIZE_CELLS_DEFAULT;
}
+
+int of_n_size_cells(struct device_node *np)
+{
+ if (np->parent)
+ np = np->parent;
+
+ return of_bus_n_size_cells(np);
+}
EXPORT_SYMBOL(of_n_size_cells);

#ifdef CONFIG_NUMA
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index f8c58615c393..66294d29942a 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -158,6 +158,9 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
#define for_each_transaction_entry_reverse(_oft, _te) \
list_for_each_entry_reverse(_te, &(_oft)->te_list, node)

+extern int of_bus_n_addr_cells(struct device_node *np);
+extern int of_bus_n_size_cells(struct device_node *np);
+
#ifdef CONFIG_OF_ADDRESS
extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
u64 *paddr, u64 *size);
--
2.20.1

2019-09-27 00:31:46

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH 01/11] of: Remove unused of_find_matching_node_by_address()

of_find_matching_node_by_address() is unused, so remove it.

Cc: Robin Murphy <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
---
drivers/of/address.c | 19 -------------------
include/linux/of_address.h | 12 ------------
2 files changed, 31 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 978427a9d5e6..0c3cf515c510 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -826,25 +826,6 @@ int of_address_to_resource(struct device_node *dev, int index,
}
EXPORT_SYMBOL_GPL(of_address_to_resource);

-struct device_node *of_find_matching_node_by_address(struct device_node *from,
- const struct of_device_id *matches,
- u64 base_address)
-{
- struct device_node *dn = of_find_matching_node(from, matches);
- struct resource res;
-
- while (dn) {
- if (!of_address_to_resource(dn, 0, &res) &&
- res.start == base_address)
- return dn;
-
- dn = of_find_matching_node(dn, matches);
- }
-
- return NULL;
-}
-
-
/**
* of_iomap - Maps the memory mapped IO for a given device_node
* @device: the device whose io range will be mapped
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 30e40fb6936b..e317f375374a 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -33,10 +33,6 @@ extern u64 of_translate_dma_address(struct device_node *dev,
extern u64 of_translate_address(struct device_node *np, const __be32 *addr);
extern int of_address_to_resource(struct device_node *dev, int index,
struct resource *r);
-extern struct device_node *of_find_matching_node_by_address(
- struct device_node *from,
- const struct of_device_id *matches,
- u64 base_address);
extern void __iomem *of_iomap(struct device_node *device, int index);
void __iomem *of_io_request_and_map(struct device_node *device,
int index, const char *name);
@@ -71,14 +67,6 @@ static inline u64 of_translate_address(struct device_node *np,
return OF_BAD_ADDR;
}

-static inline struct device_node *of_find_matching_node_by_address(
- struct device_node *from,
- const struct of_device_id *matches,
- u64 base_address)
-{
- return NULL;
-}
-
static inline const __be32 *of_get_address(struct device_node *dev, int index,
u64 *size, unsigned int *flags)
{
--
2.20.1

2019-09-27 09:18:16

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 01/11] of: Remove unused of_find_matching_node_by_address()

On Fri, Sep 27, 2019 at 2:25 AM Rob Herring <[email protected]> wrote:
> of_find_matching_node_by_address() is unused, so remove it.
>
> Cc: Robin Murphy <[email protected]>
> Signed-off-by: Rob Herring <[email protected]>

Reviewed-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-09-27 09:19:51

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 02/11] of: Make of_dma_get_range() private

On Fri, Sep 27, 2019 at 2:25 AM Rob Herring <[email protected]> wrote:
> of_dma_get_range() is only used within the DT core code, so remove the
> export and move the header declaration to the private header.
>
> Cc: Robin Murphy <[email protected]>
> Signed-off-by: Rob Herring <[email protected]>

Reviewed-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-09-27 09:25:11

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 06/11] of/address: Introduce of_get_next_dma_parent() helper

On Fri, Sep 27, 2019 at 2:25 AM Rob Herring <[email protected]> wrote:
> From: Robin Murphy <[email protected]>
>
> Add of_get_next_dma_parent() helper which is similar to
> __of_get_dma_parent(), but can be used in iterators and decrements the
> ref count on the prior parent.
>
> Signed-off-by: Robin Murphy <[email protected]>
> Signed-off-by: Rob Herring <[email protected]>

Reviewed-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-09-27 09:29:30

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 08/11] of: Factor out #{addr,size}-cells parsing

On Fri, Sep 27, 2019 at 2:25 AM Rob Herring <[email protected]> wrote:
> From: Robin Murphy <[email protected]>
>
> In some cases such as PCI host controllers, we may have a "parent bus"
> which is an OF leaf node, but still need to correctly parse ranges from
> the point of view of that bus. For that, factor out variants of the
> "#addr-cells" and "#size-cells" parsers which do not assume they have a
> device node and thus immediately traverse upwards before reading the
> relevant property.
>
> Signed-off-by: Robin Murphy <[email protected]>
> [robh: don't make of_bus_n_{addr,size}_cells() public]
> Signed-off-by: Rob Herring <[email protected]>

Reviewed-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-09-29 11:17:10

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On Fri, Sep 27, 2019 at 2:24 AM Rob Herring <[email protected]> wrote:
>
> This series fixes several issues related to 'dma-ranges'. Primarily,
> 'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
> devices not described in the DT. A common case needing dma-ranges is a
> 32-bit PCIe bridge on a 64-bit system. This affects several platforms
> including Broadcom, NXP, Renesas, and Arm Juno. There's been several
> attempts to fix these issues, most recently earlier this week[1].
>
> In the process, I found several bugs in the address translation. It
> appears that things have happened to work as various DTs happen to use
> 1:1 addresses.
>
> First 3 patches are just some clean-up. The 4th patch adds a unittest
> exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
> making it work on either a struct device child node or a struct
> device_node parent node so that it works on bus leaf nodes like PCI
> bridges. Patches 10 and 11 fix 2 issues with address translation for
> dma-ranges.
>
> My testing on this has been with QEMU virt machine hacked up to set PCI
> dma-ranges and the unittest. Nicolas reports this series resolves the
> issues on Rpi4 and NXP Layerscape platforms.

I've only looked briefly, but this all seems reasonable. Adding Christoph
to Cc here to draw his attention to it as he's done a lot of reworks on
the dma-mapping interfaces recently.

On a semi-related note, Thierry asked about one aspect of the dma-ranges
property recently, which is the behavior of dma_set_mask() and related
functions when a driver sets a mask that is larger than the memory
area in the bus-ranges but smaller than the available physical RAM.
As I understood Thierry's problem and the current code, the generic
dma_set_mask() will either reject the new mask entirely or override
the mask set by of_dma_configure, but it fails to set a correct mask
within the limitations of the parent bus in this case.

We had discussed and proposed patches for this in the past, but
it seems that never got anywhere. Maybe now that a number of
people have looked at this logic, we can figure it out for good.

Arnd

> [1] https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> Rob Herring (5):
> of: Remove unused of_find_matching_node_by_address()
> of: Make of_dma_get_range() private
> of/unittest: Add dma-ranges address translation tests
> of/address: Translate 'dma-ranges' for parent nodes missing
> 'dma-ranges'
> of/address: Fix of_pci_range_parser_one translation of DMA addresses
>
> Robin Murphy (6):
> of: address: Report of_dma_get_range() errors meaningfully
> of: Ratify of_dma_configure() interface
> of/address: Introduce of_get_next_dma_parent() helper
> of: address: Follow DMA parent for "dma-coherent"
> of: Factor out #{addr,size}-cells parsing
> of: Make of_dma_get_range() work on bus nodes
>
> drivers/of/address.c | 83 +++++++++----------
> drivers/of/base.c | 32 ++++---
> drivers/of/device.c | 12 ++-
> drivers/of/of_private.h | 14 ++++
> drivers/of/unittest-data/testcases.dts | 1 +
> drivers/of/unittest-data/tests-address.dtsi | 48 +++++++++++
> drivers/of/unittest.c | 92 +++++++++++++++++++++
> include/linux/of_address.h | 21 +----
> include/linux/of_device.h | 4 +-
> 9 files changed, 227 insertions(+), 80 deletions(-)
> create mode 100644 drivers/of/unittest-data/tests-address.dtsi
>
> --
> 2.20.1

2019-09-30 08:24:24

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On Sun, Sep 29, 2019 at 01:16:20PM +0200, Arnd Bergmann wrote:
> On a semi-related note, Thierry asked about one aspect of the dma-ranges
> property recently, which is the behavior of dma_set_mask() and related
> functions when a driver sets a mask that is larger than the memory
> area in the bus-ranges but smaller than the available physical RAM.
> As I understood Thierry's problem and the current code, the generic
> dma_set_mask() will either reject the new mask entirely or override
> the mask set by of_dma_configure, but it fails to set a correct mask
> within the limitations of the parent bus in this case.

There days dma_set_mask will only reject a mask if it is too small
to be supported by the hardware. Larger than required masks are now
always accepted.

2019-09-30 08:59:07

by Thierry Reding

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On Mon, Sep 30, 2019 at 01:20:55AM -0700, Christoph Hellwig wrote:
> On Sun, Sep 29, 2019 at 01:16:20PM +0200, Arnd Bergmann wrote:
> > On a semi-related note, Thierry asked about one aspect of the dma-ranges
> > property recently, which is the behavior of dma_set_mask() and related
> > functions when a driver sets a mask that is larger than the memory
> > area in the bus-ranges but smaller than the available physical RAM.
> > As I understood Thierry's problem and the current code, the generic
> > dma_set_mask() will either reject the new mask entirely or override
> > the mask set by of_dma_configure, but it fails to set a correct mask
> > within the limitations of the parent bus in this case.
>
> There days dma_set_mask will only reject a mask if it is too small
> to be supported by the hardware. Larger than required masks are now
> always accepted.

Summarizing why this came up: the memory subsystem on Tegra194 has a
mechanism controlled by bit 39 of physical addresses. This is used to
support two variants of sector ordering for block linear formats. The
GPU uses a slightly different ordering than other MSS clients, so the
drivers have to set this bit depending on who they interoperate with.

I was running into this as I was adding support for IOMMU support for
the Ethernet controller on Tegra194. The controller has a HW feature
register that contains how many address bits it supports. This is 40
for Tegra194, corresponding to the number of address bits to the MSS.
Without IOMMU support that's not a problem because there are no systems
with 40 bits of system memory. However, if we enable IOMMU support, the
DMA/IOMMU code will allocate from the top of a 48-bit (constrained to
40 bits via the DMA mask) input address space. This causes bit 39 to be
set, which in turn will make the MSS reorder sectors and break network
communications.

Since this reordering takes place at the MSS level, this applies to all
MSS clients. Most of these clients always want bit 39 to be 0, whereas
the clients that can and want to make use of the reordering always want
bit 39 to be under their control, so they can control in a fine-grained
way when to set it.

This means that effectively all MSS clients can only address 39 bits, so
instead of hard-coding that for each driver I thought it'd make sense to
have a central place to configure this, so that all devices by default
are restricted to 39-bit addressing. However, with the current DMA API
implementation this causes a problem because the default 39-bit DMA mask
would get overwritten by the driver (as in the example of the Ethernet
controller setting a 40-bit DMA mask because that's what the hardware
supports).

I realize that this is somewhat exotic. On one hand it is correct for a
driver to say that the hardware supports 40-bit addressing (i.e. the
Ethernet controller can address bit 39), but from a system integration
point of view, using bit 39 is wrong, except in a very restricted set of
cases.

If I understand correctly, describing this with a dma-ranges property is
the right thing to do, but it wouldn't work with the current
implementation because drivers can still override a lower DMA mask with
a higher one.

Thierry


Attachments:
(No filename) (3.19 kB)
signature.asc (849.00 B)
Download all attachments

2019-09-30 09:22:43

by Nicolas Saenz Julienne

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On Thu, 2019-09-26 at 19:24 -0500, Rob Herring wrote:
> This series fixes several issues related to 'dma-ranges'. Primarily,
> 'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
> devices not described in the DT. A common case needing dma-ranges is a
> 32-bit PCIe bridge on a 64-bit system. This affects several platforms
> including Broadcom, NXP, Renesas, and Arm Juno. There's been several
> attempts to fix these issues, most recently earlier this week[1].
>
> In the process, I found several bugs in the address translation. It
> appears that things have happened to work as various DTs happen to use
> 1:1 addresses.
>
> First 3 patches are just some clean-up. The 4th patch adds a unittest
> exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
> making it work on either a struct device child node or a struct
> device_node parent node so that it works on bus leaf nodes like PCI
> bridges. Patches 10 and 11 fix 2 issues with address translation for
> dma-ranges.
>
> My testing on this has been with QEMU virt machine hacked up to set PCI
> dma-ranges and the unittest. Nicolas reports this series resolves the
> issues on Rpi4 and NXP Layerscape platforms.
>
> Rob
>
> [1]
>
https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> Rob Herring (5):
> of: Remove unused of_find_matching_node_by_address()
> of: Make of_dma_get_range() private
> of/unittest: Add dma-ranges address translation tests
> of/address: Translate 'dma-ranges' for parent nodes missing
> 'dma-ranges'
> of/address: Fix of_pci_range_parser_one translation of DMA addresses
>
> Robin Murphy (6):
> of: address: Report of_dma_get_range() errors meaningfully
> of: Ratify of_dma_configure() interface
> of/address: Introduce of_get_next_dma_parent() helper
> of: address: Follow DMA parent for "dma-coherent"
> of: Factor out #{addr,size}-cells parsing
> of: Make of_dma_get_range() work on bus nodes

Re-tested the whole series. Verified both the unittests run fine and PCIe's
behaviour is fixed.

Tested-by: Nicolas Saenz Julienne <[email protected]>

Also for the whole series:

Reviewed-by: Nicolas Saenz Julienne <[email protected]>

Regards,
Nicolas


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2019-09-30 09:56:01

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On 2019-09-30 9:56 am, Thierry Reding wrote:
> On Mon, Sep 30, 2019 at 01:20:55AM -0700, Christoph Hellwig wrote:
>> On Sun, Sep 29, 2019 at 01:16:20PM +0200, Arnd Bergmann wrote:
>>> On a semi-related note, Thierry asked about one aspect of the dma-ranges
>>> property recently, which is the behavior of dma_set_mask() and related
>>> functions when a driver sets a mask that is larger than the memory
>>> area in the bus-ranges but smaller than the available physical RAM.
>>> As I understood Thierry's problem and the current code, the generic
>>> dma_set_mask() will either reject the new mask entirely or override
>>> the mask set by of_dma_configure, but it fails to set a correct mask
>>> within the limitations of the parent bus in this case.
>>
>> There days dma_set_mask will only reject a mask if it is too small
>> to be supported by the hardware. Larger than required masks are now
>> always accepted.
>
> Summarizing why this came up: the memory subsystem on Tegra194 has a
> mechanism controlled by bit 39 of physical addresses. This is used to
> support two variants of sector ordering for block linear formats. The
> GPU uses a slightly different ordering than other MSS clients, so the
> drivers have to set this bit depending on who they interoperate with.
>
> I was running into this as I was adding support for IOMMU support for
> the Ethernet controller on Tegra194. The controller has a HW feature
> register that contains how many address bits it supports. This is 40
> for Tegra194, corresponding to the number of address bits to the MSS.
> Without IOMMU support that's not a problem because there are no systems
> with 40 bits of system memory. However, if we enable IOMMU support, the
> DMA/IOMMU code will allocate from the top of a 48-bit (constrained to
> 40 bits via the DMA mask) input address space. This causes bit 39 to be
> set, which in turn will make the MSS reorder sectors and break network
> communications.
>
> Since this reordering takes place at the MSS level, this applies to all
> MSS clients. Most of these clients always want bit 39 to be 0, whereas
> the clients that can and want to make use of the reordering always want
> bit 39 to be under their control, so they can control in a fine-grained
> way when to set it.
>
> This means that effectively all MSS clients can only address 39 bits, so
> instead of hard-coding that for each driver I thought it'd make sense to
> have a central place to configure this, so that all devices by default
> are restricted to 39-bit addressing. However, with the current DMA API
> implementation this causes a problem because the default 39-bit DMA mask
> would get overwritten by the driver (as in the example of the Ethernet
> controller setting a 40-bit DMA mask because that's what the hardware
> supports).
>
> I realize that this is somewhat exotic. On one hand it is correct for a
> driver to say that the hardware supports 40-bit addressing (i.e. the
> Ethernet controller can address bit 39), but from a system integration
> point of view, using bit 39 is wrong, except in a very restricted set of
> cases.
>
> If I understand correctly, describing this with a dma-ranges property is
> the right thing to do, but it wouldn't work with the current
> implementation because drivers can still override a lower DMA mask with
> a higher one.

But that sounds like exactly the situation for which we introduced
bus_dma_mask. If "dma-ranges" is found, then we should initialise that
to reflect the limitation. Drivers may subsequently set a larger mask
based on what the device is natively capable of, but the DMA API
internals should quietly clamp that down to the bus mask wherever it
matters.

Since that change, the initial value of dma_mask and coherent_dma_mask
doesn't really matter much, as we expect drivers to reset them anyway
(and in general they have to be able to enlarge them from a 32-bit
default value).

As far as I'm aware this has been working fine (albeit in equivalent
ACPI form) for at least one SoC with 64-bit device masks, a 48-bit
IOMMU, and a 44-bit interconnect in between - indeed if I avoid
distraction long enough to set up the big new box under my desk, the
sending of future emails will depend on it ;)

Robin.

2019-09-30 12:47:14

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On 9/27/19 2:24 AM, Rob Herring wrote:
> This series fixes several issues related to 'dma-ranges'. Primarily,
> 'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
> devices not described in the DT. A common case needing dma-ranges is a
> 32-bit PCIe bridge on a 64-bit system. This affects several platforms
> including Broadcom, NXP, Renesas, and Arm Juno. There's been several
> attempts to fix these issues, most recently earlier this week[1].
>
> In the process, I found several bugs in the address translation. It
> appears that things have happened to work as various DTs happen to use
> 1:1 addresses.
>
> First 3 patches are just some clean-up. The 4th patch adds a unittest
> exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
> making it work on either a struct device child node or a struct
> device_node parent node so that it works on bus leaf nodes like PCI
> bridges. Patches 10 and 11 fix 2 issues with address translation for
> dma-ranges.
>
> My testing on this has been with QEMU virt machine hacked up to set PCI
> dma-ranges and the unittest. Nicolas reports this series resolves the
> issues on Rpi4 and NXP Layerscape platforms.

With the following patches applied:
https://patchwork.ozlabs.org/patch/1144870/
https://patchwork.ozlabs.org/patch/1144871/
on R8A7795 Salvator-XS
Tested-by: Marek Vasut <[email protected]>

--
Best regards,
Marek Vasut

2019-09-30 12:50:18

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 01/11] of: Remove unused of_find_matching_node_by_address()

On Thu, Sep 26, 2019 at 07:24:45PM -0500, Rob Herring wrote:
> of_find_matching_node_by_address() is unused, so remove it.
>
> Cc: Robin Murphy <[email protected]>
> Signed-off-by: Rob Herring <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2019-09-30 12:54:10

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On 30/09/2019 13:40, Marek Vasut wrote:
> On 9/27/19 2:24 AM, Rob Herring wrote:
>> This series fixes several issues related to 'dma-ranges'. Primarily,
>> 'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
>> devices not described in the DT. A common case needing dma-ranges is a
>> 32-bit PCIe bridge on a 64-bit system. This affects several platforms
>> including Broadcom, NXP, Renesas, and Arm Juno. There's been several
>> attempts to fix these issues, most recently earlier this week[1].
>>
>> In the process, I found several bugs in the address translation. It
>> appears that things have happened to work as various DTs happen to use
>> 1:1 addresses.
>>
>> First 3 patches are just some clean-up. The 4th patch adds a unittest
>> exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
>> making it work on either a struct device child node or a struct
>> device_node parent node so that it works on bus leaf nodes like PCI
>> bridges. Patches 10 and 11 fix 2 issues with address translation for
>> dma-ranges.
>>
>> My testing on this has been with QEMU virt machine hacked up to set PCI
>> dma-ranges and the unittest. Nicolas reports this series resolves the
>> issues on Rpi4 and NXP Layerscape platforms.
>
> With the following patches applied:
> https://patchwork.ozlabs.org/patch/1144870/
> https://patchwork.ozlabs.org/patch/1144871/

Can you try it without those additional patches? This series aims to
make the parsing work properly generically, such that we shouldn't need
to add an additional PCI-specific version of almost the same code.

Robin.

2019-09-30 12:54:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 02/11] of: Make of_dma_get_range() private

> +#ifdef CONFIG_OF_ADDRESS
> +extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> + u64 *paddr, u64 *size);

No need for the extern here.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2019-09-30 12:56:42

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On 9/30/19 2:52 PM, Robin Murphy wrote:
> On 30/09/2019 13:40, Marek Vasut wrote:
>> On 9/27/19 2:24 AM, Rob Herring wrote:
>>> This series fixes several issues related to 'dma-ranges'. Primarily,
>>> 'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
>>> devices not described in the DT. A common case needing dma-ranges is a
>>> 32-bit PCIe bridge on a 64-bit system. This affects several platforms
>>> including Broadcom, NXP, Renesas, and Arm Juno. There's been several
>>> attempts to fix these issues, most recently earlier this week[1].
>>>
>>> In the process, I found several bugs in the address translation. It
>>> appears that things have happened to work as various DTs happen to use
>>> 1:1 addresses.
>>>
>>> First 3 patches are just some clean-up. The 4th patch adds a unittest
>>> exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
>>> making it work on either a struct device child node or a struct
>>> device_node parent node so that it works on bus leaf nodes like PCI
>>> bridges. Patches 10 and 11 fix 2 issues with address translation for
>>> dma-ranges.
>>>
>>> My testing on this has been with QEMU virt machine hacked up to set PCI
>>> dma-ranges and the unittest. Nicolas reports this series resolves the
>>> issues on Rpi4 and NXP Layerscape platforms.
>>
>> With the following patches applied:
>>        https://patchwork.ozlabs.org/patch/1144870/
>>        https://patchwork.ozlabs.org/patch/1144871/
>
> Can you try it without those additional patches? This series aims to
> make the parsing work properly generically, such that we shouldn't need
> to add an additional PCI-specific version of almost the same code.

Seems to work even without those.

--
Best regards,
Marek Vasut

2019-09-30 13:01:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 05/11] of: Ratify of_dma_configure() interface

On Thu, Sep 26, 2019 at 07:24:49PM -0500, Rob Herring wrote:
> -int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
> +int of_dma_configure(struct device *dev, struct device_node *parent, bool force_dma)

This creates a > 80 char line.

> {
> u64 dma_addr, paddr, size = 0;
> int ret;
> bool coherent;
> unsigned long offset;
> const struct iommu_ops *iommu;
> + struct device_node *np;
> u64 mask;
>
> + np = dev->of_node;
> + if (!np)
> + np = parent;
> + if (!np)
> + return -ENODEV;

I have to say I find the older calling convention simpler to understand.
If we want to enforce the invariant I'd rather do that explicitly:

if (dev->of_node && np != dev->of_node)
return -EINVAL;

2019-09-30 13:07:57

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On 30/09/2019 13:54, Marek Vasut wrote:
> On 9/30/19 2:52 PM, Robin Murphy wrote:
>> On 30/09/2019 13:40, Marek Vasut wrote:
>>> On 9/27/19 2:24 AM, Rob Herring wrote:
>>>> This series fixes several issues related to 'dma-ranges'. Primarily,
>>>> 'dma-ranges' in a PCI bridge node does correctly set dma masks for PCI
>>>> devices not described in the DT. A common case needing dma-ranges is a
>>>> 32-bit PCIe bridge on a 64-bit system. This affects several platforms
>>>> including Broadcom, NXP, Renesas, and Arm Juno. There's been several
>>>> attempts to fix these issues, most recently earlier this week[1].
>>>>
>>>> In the process, I found several bugs in the address translation. It
>>>> appears that things have happened to work as various DTs happen to use
>>>> 1:1 addresses.
>>>>
>>>> First 3 patches are just some clean-up. The 4th patch adds a unittest
>>>> exhibiting the issues. Patches 5-9 rework how of_dma_configure() works
>>>> making it work on either a struct device child node or a struct
>>>> device_node parent node so that it works on bus leaf nodes like PCI
>>>> bridges. Patches 10 and 11 fix 2 issues with address translation for
>>>> dma-ranges.
>>>>
>>>> My testing on this has been with QEMU virt machine hacked up to set PCI
>>>> dma-ranges and the unittest. Nicolas reports this series resolves the
>>>> issues on Rpi4 and NXP Layerscape platforms.
>>>
>>> With the following patches applied:
>>>        https://patchwork.ozlabs.org/patch/1144870/
>>>        https://patchwork.ozlabs.org/patch/1144871/
>>
>> Can you try it without those additional patches? This series aims to
>> make the parsing work properly generically, such that we shouldn't need
>> to add an additional PCI-specific version of almost the same code.
>
> Seems to work even without those.

Great, thanks for confirming!

Robin.

2019-09-30 13:36:24

by Thierry Reding

[permalink] [raw]
Subject: Re: [PATCH 00/11] of: dma-ranges fixes and improvements

On Mon, Sep 30, 2019 at 10:55:15AM +0100, Robin Murphy wrote:
> On 2019-09-30 9:56 am, Thierry Reding wrote:
> > On Mon, Sep 30, 2019 at 01:20:55AM -0700, Christoph Hellwig wrote:
> > > On Sun, Sep 29, 2019 at 01:16:20PM +0200, Arnd Bergmann wrote:
> > > > On a semi-related note, Thierry asked about one aspect of the dma-ranges
> > > > property recently, which is the behavior of dma_set_mask() and related
> > > > functions when a driver sets a mask that is larger than the memory
> > > > area in the bus-ranges but smaller than the available physical RAM.
> > > > As I understood Thierry's problem and the current code, the generic
> > > > dma_set_mask() will either reject the new mask entirely or override
> > > > the mask set by of_dma_configure, but it fails to set a correct mask
> > > > within the limitations of the parent bus in this case.
> > >
> > > There days dma_set_mask will only reject a mask if it is too small
> > > to be supported by the hardware. Larger than required masks are now
> > > always accepted.
> >
> > Summarizing why this came up: the memory subsystem on Tegra194 has a
> > mechanism controlled by bit 39 of physical addresses. This is used to
> > support two variants of sector ordering for block linear formats. The
> > GPU uses a slightly different ordering than other MSS clients, so the
> > drivers have to set this bit depending on who they interoperate with.
> >
> > I was running into this as I was adding support for IOMMU support for
> > the Ethernet controller on Tegra194. The controller has a HW feature
> > register that contains how many address bits it supports. This is 40
> > for Tegra194, corresponding to the number of address bits to the MSS.
> > Without IOMMU support that's not a problem because there are no systems
> > with 40 bits of system memory. However, if we enable IOMMU support, the
> > DMA/IOMMU code will allocate from the top of a 48-bit (constrained to
> > 40 bits via the DMA mask) input address space. This causes bit 39 to be
> > set, which in turn will make the MSS reorder sectors and break network
> > communications.
> >
> > Since this reordering takes place at the MSS level, this applies to all
> > MSS clients. Most of these clients always want bit 39 to be 0, whereas
> > the clients that can and want to make use of the reordering always want
> > bit 39 to be under their control, so they can control in a fine-grained
> > way when to set it.
> >
> > This means that effectively all MSS clients can only address 39 bits, so
> > instead of hard-coding that for each driver I thought it'd make sense to
> > have a central place to configure this, so that all devices by default
> > are restricted to 39-bit addressing. However, with the current DMA API
> > implementation this causes a problem because the default 39-bit DMA mask
> > would get overwritten by the driver (as in the example of the Ethernet
> > controller setting a 40-bit DMA mask because that's what the hardware
> > supports).
> >
> > I realize that this is somewhat exotic. On one hand it is correct for a
> > driver to say that the hardware supports 40-bit addressing (i.e. the
> > Ethernet controller can address bit 39), but from a system integration
> > point of view, using bit 39 is wrong, except in a very restricted set of
> > cases.
> >
> > If I understand correctly, describing this with a dma-ranges property is
> > the right thing to do, but it wouldn't work with the current
> > implementation because drivers can still override a lower DMA mask with
> > a higher one.
>
> But that sounds like exactly the situation for which we introduced
> bus_dma_mask. If "dma-ranges" is found, then we should initialise that to
> reflect the limitation. Drivers may subsequently set a larger mask based on
> what the device is natively capable of, but the DMA API internals should
> quietly clamp that down to the bus mask wherever it matters.
>
> Since that change, the initial value of dma_mask and coherent_dma_mask
> doesn't really matter much, as we expect drivers to reset them anyway (and
> in general they have to be able to enlarge them from a 32-bit default
> value).
>
> As far as I'm aware this has been working fine (albeit in equivalent ACPI
> form) for at least one SoC with 64-bit device masks, a 48-bit IOMMU, and a
> 44-bit interconnect in between - indeed if I avoid distraction long enough
> to set up the big new box under my desk, the sending of future emails will
> depend on it ;)

After applying this series it does indeed seem to be working. The only
thing I had to do was add a dma-ranges property to the DMA parent. I
ended up doing that via an interconnects property because the default
DMA parent on Tegra194 is /cbb which restricts #address-cells = <1> and
#size-cells = <1>, so it can't actually translate anything beyond 32
bits of system memory.

So I basically ended up making the memory controller an interconnect
provider, increasing #address-cells = <2> and #size-cells = <2> again
and then using a dma-ranges property like this:

dma-ranges = <0x0 0x0 0x0 0x80 0x0>;

to specify that only 39 bits should be used for addressing, leaving the
special bit 39 up to the driver to set as required.

Coincidentally making the memory controller an interconnect provider is
something that I was planning to do anyway in order to support memory
frequency scaling, so this all actually fits together pretty elegantly.

Thierry


Attachments:
(No filename) (5.39 kB)
signature.asc (849.00 B)
Download all attachments

2019-09-30 13:36:25

by Nicolas Saenz Julienne

[permalink] [raw]
Subject: Re: [PATCH 05/11] of: Ratify of_dma_configure() interface

On Mon, 2019-09-30 at 05:57 -0700, Christoph Hellwig wrote:
> On Thu, Sep 26, 2019 at 07:24:49PM -0500, Rob Herring wrote:
> > -int of_dma_configure(struct device *dev, struct device_node *np, bool
> > force_dma)
> > +int of_dma_configure(struct device *dev, struct device_node *parent, bool
> > force_dma)
>
> This creates a > 80 char line.
>
> > {
> > u64 dma_addr, paddr, size = 0;
> > int ret;
> > bool coherent;
> > unsigned long offset;
> > const struct iommu_ops *iommu;
> > + struct device_node *np;
> > u64 mask;
> >
> > + np = dev->of_node;
> > + if (!np)
> > + np = parent;
> > + if (!np)
> > + return -ENODEV;
>
> I have to say I find the older calling convention simpler to understand.
> If we want to enforce the invariant I'd rather do that explicitly:
>
> if (dev->of_node && np != dev->of_node)
> return -EINVAL;

As is, this would break Freescale Layerscape fsl-mc bus' dma_configure():

static int fsl_mc_dma_configure(struct device *dev)
{
struct device *dma_dev = dev;

while (dev_is_fsl_mc(dma_dev))
dma_dev = dma_dev->parent;

return of_dma_configure(dev, dma_dev->of_node, 0);
}

But I think that with this series, given the fact that we now treat the lack of
dma-ranges as a 1:1 mapping instead of an error, we could rewrite the function
like this:

static int fsl_mc_dma_configure(struct device *dev)
{
return of_dma_configure(dev, false, 0);
}

If needed I can test this.

Regards,
Nicolas


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2019-09-30 21:27:53

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH 05/11] of: Ratify of_dma_configure() interface

On Mon, Sep 30, 2019 at 8:32 AM Nicolas Saenz Julienne
<[email protected]> wrote:
>
> On Mon, 2019-09-30 at 05:57 -0700, Christoph Hellwig wrote:
> > On Thu, Sep 26, 2019 at 07:24:49PM -0500, Rob Herring wrote:
> > > -int of_dma_configure(struct device *dev, struct device_node *np, bool
> > > force_dma)
> > > +int of_dma_configure(struct device *dev, struct device_node *parent, bool
> > > force_dma)
> >
> > This creates a > 80 char line.
> >
> > > {
> > > u64 dma_addr, paddr, size = 0;
> > > int ret;
> > > bool coherent;
> > > unsigned long offset;
> > > const struct iommu_ops *iommu;
> > > + struct device_node *np;
> > > u64 mask;
> > >
> > > + np = dev->of_node;
> > > + if (!np)
> > > + np = parent;
> > > + if (!np)
> > > + return -ENODEV;
> >
> > I have to say I find the older calling convention simpler to understand.
> > If we want to enforce the invariant I'd rather do that explicitly:
> >
> > if (dev->of_node && np != dev->of_node)
> > return -EINVAL;
>
> As is, this would break Freescale Layerscape fsl-mc bus' dma_configure():

This may break PCI too for devices that have a DT node.

> static int fsl_mc_dma_configure(struct device *dev)
> {
> struct device *dma_dev = dev;
>
> while (dev_is_fsl_mc(dma_dev))
> dma_dev = dma_dev->parent;
>
> return of_dma_configure(dev, dma_dev->of_node, 0);
> }
>
> But I think that with this series, given the fact that we now treat the lack of
> dma-ranges as a 1:1 mapping instead of an error, we could rewrite the function
> like this:

Now, I'm reconsidering allowing this abuse... It's better if the code
which understands the bus structure in DT for a specific bus passes in
the right thing. Maybe I should go back to Robin's version (below).
OTOH, the existing assumption that 'dma-ranges' was in the immediate
parent was an assumption on the bus structure which maybe doesn't
always apply.

diff --git a/drivers/of/device.c b/drivers/of/device.c
index a45261e21144..6951450bb8f3 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -98,12 +98,15 @@ int of_dma_configure(struct device *dev, struct
device_node *parent, bool force_
u64 mask;

np = dev->of_node;
- if (!np)
- np = parent;
+ if (np)
+ parent = of_get_dma_parent(np);
+ else
+ np = of_node_get(parent);
if (!np)
return -ENODEV;

- ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+ ret = of_dma_get_range(parent, &dma_addr, &paddr, &size);
+ of_node_put(parent);
if (ret < 0) {
/*
* For legacy reasons, we have to assume some devices need

2019-10-01 15:46:15

by Nicolas Saenz Julienne

[permalink] [raw]
Subject: Re: [PATCH 05/11] of: Ratify of_dma_configure() interface

On Mon, 2019-09-30 at 16:24 -0500, Rob Herring wrote:
> On Mon, Sep 30, 2019 at 8:32 AM Nicolas Saenz Julienne
> <[email protected]> wrote:
> > On Mon, 2019-09-30 at 05:57 -0700, Christoph Hellwig wrote:
> > > On Thu, Sep 26, 2019 at 07:24:49PM -0500, Rob Herring wrote:
> > > > -int of_dma_configure(struct device *dev, struct device_node *np, bool
> > > > force_dma)
> > > > +int of_dma_configure(struct device *dev, struct device_node *parent,
> > > > bool
> > > > force_dma)
> > >
> > > This creates a > 80 char line.
> > >
> > > > {
> > > > u64 dma_addr, paddr, size = 0;
> > > > int ret;
> > > > bool coherent;
> > > > unsigned long offset;
> > > > const struct iommu_ops *iommu;
> > > > + struct device_node *np;
> > > > u64 mask;
> > > >
> > > > + np = dev->of_node;
> > > > + if (!np)
> > > > + np = parent;
> > > > + if (!np)
> > > > + return -ENODEV;
> > >
> > > I have to say I find the older calling convention simpler to understand.
> > > If we want to enforce the invariant I'd rather do that explicitly:
> > >
> > > if (dev->of_node && np != dev->of_node)
> > > return -EINVAL;
> >
> > As is, this would break Freescale Layerscape fsl-mc bus' dma_configure():
>
> This may break PCI too for devices that have a DT node.
>
> > static int fsl_mc_dma_configure(struct device *dev)
> > {
> > struct device *dma_dev = dev;
> >
> > while (dev_is_fsl_mc(dma_dev))
> > dma_dev = dma_dev->parent;
> >
> > return of_dma_configure(dev, dma_dev->of_node, 0);
> > }
> >
> > But I think that with this series, given the fact that we now treat the lack
> > of
> > dma-ranges as a 1:1 mapping instead of an error, we could rewrite the
> > function
> > like this:
>
> Now, I'm reconsidering allowing this abuse... It's better if the code
> which understands the bus structure in DT for a specific bus passes in
> the right thing. Maybe I should go back to Robin's version (below).
> OTOH, the existing assumption that 'dma-ranges' was in the immediate
> parent was an assumption on the bus structure which maybe doesn't
> always apply.
>
> diff --git a/drivers/of/device.c b/drivers/of/device.c
> index a45261e21144..6951450bb8f3 100644
> --- a/drivers/of/device.c
> +++ b/drivers/of/device.c
> @@ -98,12 +98,15 @@ int of_dma_configure(struct device *dev, struct
> device_node *parent, bool force_
> u64 mask;
>
> np = dev->of_node;
> - if (!np)
> - np = parent;
> + if (np)
> + parent = of_get_dma_parent(np);
> + else
> + np = of_node_get(parent);
> if (!np)
> return -ENODEV;
>
> - ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
> + ret = of_dma_get_range(parent, &dma_addr, &paddr, &size);
> + of_node_put(parent);
> if (ret < 0) {
> /*
> * For legacy reasons, we have to assume some devices need

I spent some time thinking about your comments and researching. I came to the
realization that both these solutions break the usage in
drivers/gpu/drm/sun4i/sun4i_backend.c:805. In that specific case both
'dev->of_node' and 'parent' exist yet the device receiving the configuration
and 'parent' aren't related in any way.

IOW we can't just use 'dev->of_node' as a starting point to walk upwards the
tree. We always have to respect whatever DT node the bus provided, and start
there. This clashes with the current solutions, as they are based on the fact
that we can use dev->of_node when present.

My guess at this point, if we're forced to honor that behaviour, is that we
have to create a new API for the PCI use case. Something the likes of
of_dma_configure_parent().

Regards,
Nicolas


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2019-10-04 07:56:14

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH 05/11] of: Ratify of_dma_configure() interface

On Tue, Oct 1, 2019 at 10:43 AM Nicolas Saenz Julienne
<[email protected]> wrote:
>
> On Mon, 2019-09-30 at 16:24 -0500, Rob Herring wrote:
> > On Mon, Sep 30, 2019 at 8:32 AM Nicolas Saenz Julienne
> > <[email protected]> wrote:
> > > On Mon, 2019-09-30 at 05:57 -0700, Christoph Hellwig wrote:
> > > > On Thu, Sep 26, 2019 at 07:24:49PM -0500, Rob Herring wrote:
> > > > > -int of_dma_configure(struct device *dev, struct device_node *np, bool
> > > > > force_dma)
> > > > > +int of_dma_configure(struct device *dev, struct device_node *parent,
> > > > > bool
> > > > > force_dma)
> > > >
> > > > This creates a > 80 char line.
> > > >
> > > > > {
> > > > > u64 dma_addr, paddr, size = 0;
> > > > > int ret;
> > > > > bool coherent;
> > > > > unsigned long offset;
> > > > > const struct iommu_ops *iommu;
> > > > > + struct device_node *np;
> > > > > u64 mask;
> > > > >
> > > > > + np = dev->of_node;
> > > > > + if (!np)
> > > > > + np = parent;
> > > > > + if (!np)
> > > > > + return -ENODEV;
> > > >
> > > > I have to say I find the older calling convention simpler to understand.
> > > > If we want to enforce the invariant I'd rather do that explicitly:
> > > >
> > > > if (dev->of_node && np != dev->of_node)
> > > > return -EINVAL;
> > >
> > > As is, this would break Freescale Layerscape fsl-mc bus' dma_configure():
> >
> > This may break PCI too for devices that have a DT node.
> >
> > > static int fsl_mc_dma_configure(struct device *dev)
> > > {
> > > struct device *dma_dev = dev;
> > >
> > > while (dev_is_fsl_mc(dma_dev))
> > > dma_dev = dma_dev->parent;
> > >
> > > return of_dma_configure(dev, dma_dev->of_node, 0);
> > > }
> > >
> > > But I think that with this series, given the fact that we now treat the lack
> > > of
> > > dma-ranges as a 1:1 mapping instead of an error, we could rewrite the
> > > function
> > > like this:
> >
> > Now, I'm reconsidering allowing this abuse... It's better if the code
> > which understands the bus structure in DT for a specific bus passes in
> > the right thing. Maybe I should go back to Robin's version (below).
> > OTOH, the existing assumption that 'dma-ranges' was in the immediate
> > parent was an assumption on the bus structure which maybe doesn't
> > always apply.
> >
> > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > index a45261e21144..6951450bb8f3 100644
> > --- a/drivers/of/device.c
> > +++ b/drivers/of/device.c
> > @@ -98,12 +98,15 @@ int of_dma_configure(struct device *dev, struct
> > device_node *parent, bool force_
> > u64 mask;
> >
> > np = dev->of_node;
> > - if (!np)
> > - np = parent;
> > + if (np)
> > + parent = of_get_dma_parent(np);
> > + else
> > + np = of_node_get(parent);
> > if (!np)
> > return -ENODEV;
> >
> > - ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
> > + ret = of_dma_get_range(parent, &dma_addr, &paddr, &size);
> > + of_node_put(parent);
> > if (ret < 0) {
> > /*
> > * For legacy reasons, we have to assume some devices need
>
> I spent some time thinking about your comments and researching. I came to the
> realization that both these solutions break the usage in
> drivers/gpu/drm/sun4i/sun4i_backend.c:805. In that specific case both
> 'dev->of_node' and 'parent' exist yet the device receiving the configuration
> and 'parent' aren't related in any way.

I knew there was some reason I didn't like those virtual DT nodes...

That does seem to be the oddest case. Several of the others are just
non-DT child platform devices. Perhaps we need a "copy the DMA config
from another struct device (or parent struct device)" function to
avoid using a DT function on a non-DT device.

> IOW we can't just use 'dev->of_node' as a starting point to walk upwards the
> tree. We always have to respect whatever DT node the bus provided, and start
> there. This clashes with the current solutions, as they are based on the fact
> that we can use dev->of_node when present.

Yes, you are right.

> My guess at this point, if we're forced to honor that behaviour, is that we
> have to create a new API for the PCI use case. Something the likes of
> of_dma_configure_parent().

I think of_dma_configure just has to work with the device_node of
either the device or the device parent and dev->of_node is never used
unless the caller sets it.

Rob

2019-10-07 17:54:43

by Nicolas Saenz Julienne

[permalink] [raw]
Subject: Re: [PATCH 05/11] of: Ratify of_dma_configure() interface

On Thu, 2019-10-03 at 20:53 -0500, Rob Herring wrote:
> > > > But I think that with this series, given the fact that we now treat the
> > > > lack
> > > > of
> > > > dma-ranges as a 1:1 mapping instead of an error, we could rewrite the
> > > > function
> > > > like this:
> > >
> > > Now, I'm reconsidering allowing this abuse... It's better if the code
> > > which understands the bus structure in DT for a specific bus passes in
> > > the right thing. Maybe I should go back to Robin's version (below).
> > > OTOH, the existing assumption that 'dma-ranges' was in the immediate
> > > parent was an assumption on the bus structure which maybe doesn't
> > > always apply.
> > >
> > > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > > index a45261e21144..6951450bb8f3 100644
> > > --- a/drivers/of/device.c
> > > +++ b/drivers/of/device.c
> > > @@ -98,12 +98,15 @@ int of_dma_configure(struct device *dev, struct
> > > device_node *parent, bool force_
> > > u64 mask;
> > >
> > > np = dev->of_node;
> > > - if (!np)
> > > - np = parent;
> > > + if (np)
> > > + parent = of_get_dma_parent(np);
> > > + else
> > > + np = of_node_get(parent);
> > > if (!np)
> > > return -ENODEV;
> > >
> > > - ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
> > > + ret = of_dma_get_range(parent, &dma_addr, &paddr, &size);
> > > + of_node_put(parent);
> > > if (ret < 0) {
> > > /*
> > > * For legacy reasons, we have to assume some devices need
> >
> > I spent some time thinking about your comments and researching. I came to
> > the
> > realization that both these solutions break the usage in
> > drivers/gpu/drm/sun4i/sun4i_backend.c:805. In that specific case both
> > 'dev->of_node' and 'parent' exist yet the device receiving the configuration
> > and 'parent' aren't related in any way.
>
> I knew there was some reason I didn't like those virtual DT nodes...
>
> That does seem to be the oddest case. Several of the others are just
> non-DT child platform devices. Perhaps we need a "copy the DMA config
> from another struct device (or parent struct device)" function to
> avoid using a DT function on a non-DT device.
>
> > IOW we can't just use 'dev->of_node' as a starting point to walk upwards the
> > tree. We always have to respect whatever DT node the bus provided, and start
> > there. This clashes with the current solutions, as they are based on the
> > fact
> > that we can use dev->of_node when present.
>
> Yes, you are right.
>
> > My guess at this point, if we're forced to honor that behaviour, is that we
> > have to create a new API for the PCI use case. Something the likes of
> > of_dma_configure_parent().
>
> I think of_dma_configure just has to work with the device_node of
> either the device or the device parent and dev->of_node is never used
> unless the caller sets it.

Fine, so given the following two distinct uses of
of_dma_configure(struct device *dev, struct device_node *np, bool ...):

- dev->of_node == np: Platform bus' typical use, we imperatively have to start
parsing dma-ranges from np's DMA parent, as the device we're configuring
might be a bus containing dma-ranges himself. For example a platform PCIe bus.

- dev->of_node != np: Here the bus is pulling some trick. The device might or
might not be represented in DT and np might be a bus or a device. But one
thing I realised is that device being configured never represents a memory
mapped bus. Assuming this assumption is acceptable, we can traverse the DT
tree starting from np and get a correct configuration as long as dma-ranges
not being present is interpreted as a 1:1 mapping.

The resulting code, which I tested on an RPi4, Freescale Layerscape and passes
OF's unit tests, looks like this:

int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
{
u64 dma_addr, paddr, size = 0;
struct device_node *parent;
u64 mask;
int ret;

if (!np)
return -ENODEV;

parent = of_node_get(np);
if (dev->of_node == parent)
parent = of_get_next_dma_parent(np);

ret = of_dma_get_range(parent, &dma_addr, &paddr, &size);
of_node_put(parent);

[...]
}

Would that be acceptable?

Regards,
Nicolas


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part