This is my resurrected attempt at adding support for generic PCI host
bridge controllers that use device tree information to configure
themselves. I've tagged it as v8, although the patches have now been
reshuffled in order to ease adoption, so referring to the older
versions might require a bit of hoop jumping.
Changes from v7:
- Reordered the patches so that fixes and non-controversial patches
from v7 can be accepted more easily. If agreed, I can split the
series again into patches that can be upstreamed quickly and ones
that still need discussion.
- Moved the of_create_host_bridge() function to drivers/of/of_pci.c
to better reflect its target use.
- Added the function to remap the bus I/O resources that used to be
provided in my arm64 patch series and (re)named it pci_remap_iospace()
- Removed error code checking from parsing and mapping of IRQ from DT
in recognition that some PCI devices will not have legacy IRQ mappings.
v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
Best regards,
Liviu
Liviu Dudau (9):
Fix ioport_map() for !CONFIG_GENERIC_IOMAP cases.
pci: Export find_pci_host_bridge() function.
pci: Introduce pci_register_io_range() helper function.
pci: OF: Fix the conversion of IO ranges into IO resources.
pci: Create pci_host_bridge before its associated bus in pci_create_root_bus.
pci: Introduce a domain number for pci_host_bridge.
pci: of: Parse and map the IRQ when adding the PCI device.
pci: Add support for creating a generic host_bridge from device tree
pci: Remap I/O bus resources into CPU space with pci_remap_iospace()
drivers/of/address.c | 108 ++++++++++++++++++++++++++++++++++++
drivers/of/of_pci.c | 135 +++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/host-bridge.c | 21 ++++++-
drivers/pci/pci.c | 37 +++++++++++++
drivers/pci/probe.c | 68 ++++++++++++++++-------
include/asm-generic/io.h | 2 +-
include/linux/of_address.h | 14 +----
include/linux/of_pci.h | 10 ++++
include/linux/pci.h | 15 +++++
9 files changed, 376 insertions(+), 34 deletions(-)
--
2.0.0
This is a useful function and we should make it visible outside the
generic PCI code. Export it as a GPL symbol.
Signed-off-by: Liviu Dudau <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
drivers/pci/host-bridge.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index 0e5f3c9..36c669e 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
return bus;
}
-static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
+struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
{
struct pci_bus *root_bus = find_pci_root_bus(bus);
return to_pci_host_bridge(root_bus->bridge);
}
+EXPORT_SYMBOL_GPL(find_pci_host_bridge);
void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
void (*release_fn)(struct pci_host_bridge *),
--
2.0.0
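As an illustration of why the export is useful (a sketch only; the foo_* names are hypothetical and not part of the patch, the only real interface used is find_pci_host_bridge()), a host controller driver could walk back from any PCI device to its own private data via the host bridge device:

static struct foo_pcie *foo_pcie_from_pdev(struct pci_dev *pdev)
{
        /* find the pci_host_bridge that owns the root bus of this device */
        struct pci_host_bridge *bridge = find_pci_host_bridge(pdev->bus);

        /* the host controller driver is the parent of the bridge device */
        return dev_get_drvdata(bridge->dev.parent);
}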
Introduce a default implementation for remapping PCI bus I/O resources
onto the CPU address space. Architectures with special needs may
provide their own version, but most should be able to use this one.
Signed-off-by: Liviu Dudau <[email protected]>
---
drivers/pci/pci.c | 33 +++++++++++++++++++++++++++++++++
include/linux/pci.h | 3 +++
2 files changed, 36 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8e65dc3..a90df97 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2708,6 +2708,39 @@ int pci_request_regions_exclusive(struct pci_dev *pdev, const char *res_name)
}
EXPORT_SYMBOL(pci_request_regions_exclusive);
+/**
+ * pci_remap_iospace - Remap the memory mapped I/O space
+ * @res: Resource describing the I/O space
+ * @phys_addr: physical address where the range will be mapped.
+ *
+ * Remap the memory mapped I/O space described by the @res
+ * into the CPU physical address space. Only architectures
+ * that have memory mapped IO defined (and hence PCI_IOBASE)
+ * should call this function.
+ */
+int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
+{
+ int err = -ENODEV;
+
+#ifdef PCI_IOBASE
+ if (!(res->flags & IORESOURCE_IO))
+ return -EINVAL;
+
+ if (res->end > IO_SPACE_LIMIT)
+ return -EINVAL;
+
+ err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
+ res->end + 1 + (unsigned long)PCI_IOBASE,
+ phys_addr, __pgprot(PROT_DEVICE_nGnRE));
+#else
+ /* this architecture does not have memory mapped I/O space,
+ so this function should never be called */
+ WARN_ON(1);
+#endif
+
+ return err;
+}
+
static void __pci_set_master(struct pci_dev *dev, bool enable)
{
u16 old_cmd, cmd;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 556dc5f..65fb1fc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1100,6 +1100,9 @@ int __must_check pci_bus_alloc_resource(struct pci_bus *bus,
resource_size_t),
void *alignf_data);
+
+int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr);
+
static inline dma_addr_t pci_bus_address(struct pci_dev *pdev, int bar)
{
struct pci_bus_region region;
--
2.0.0
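As a usage illustration (not part of the patch; the window below is invented and a real driver would obtain it from its DT ranges), a host bridge driver could back its logical I/O ports with the physical window like this:

static int foo_pcie_setup_io(struct device *dev)
{
        struct resource io_res = {
                .name  = "foo PCIe I/O",
                .start = 0,             /* logical port base */
                .end   = 0xffff,        /* 64K of I/O space */
                .flags = IORESOURCE_IO,
        };
        int err;

        /* make ports 0x0000-0xffff resolve to CPU address 0x4ff00000 onwards */
        err = pci_remap_iospace(&io_res, 0x4ff00000);
        if (err)
                dev_err(dev, "failed to remap I/O range: %d\n", err);

        return err;
}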
The inline version of ioport_map() used when CONFIG_GENERIC_IOMAP is not
set is wrong. It returns a mapped (i.e. virtual) address that can start from
zero and completely ignores the PCI_IOBASE and IO_SPACE_LIMIT values that
most architectures without CONFIG_GENERIC_IOMAP define.
Signed-off-by: Liviu Dudau <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
include/asm-generic/io.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 975e1cc..2e2161b 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -331,7 +331,7 @@ static inline void iounmap(void __iomem *addr)
#ifndef CONFIG_GENERIC_IOMAP
static inline void __iomem *ioport_map(unsigned long port, unsigned int nr)
{
- return (void __iomem *) port;
+ return (void __iomem *)(PCI_IOBASE + (port & IO_SPACE_LIMIT));
}
static inline void ioport_unmap(void __iomem *p)
--
2.0.0
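To make the effect concrete (an illustrative fragment, not part of the patch): with this change a legacy port number is treated as an offset into the PCI_IOBASE window instead of being cast directly to a pointer.

        /* before: ioport_map(0x60, 8) returned (void __iomem *)0x60        */
        /* after:  it returns PCI_IOBASE + 0x60, i.e. a real virtual mapping */
        void __iomem *p = ioport_map(0x60, 8);
        u8 val = ioread8(p);
        ioport_unmap(p);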
The ranges property for a host bridge controller in DT describes
the mapping between the PCI bus addresses and the CPU physical addresses.
The resource framework, however, expects I/O resources to start at a
pseudo "port" address of 0 (zero) and to have a maximum size of IO_SPACE_LIMIT.
The conversion from PCI ranges to resources failed to take that into account.
While at it, move the function into drivers/of/address.c, as it now
depends on the pci_address_to_pio() code, and make it return an error code.
Signed-off-by: Liviu Dudau <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
drivers/of/address.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/of_address.h | 13 ++-----------
2 files changed, 49 insertions(+), 11 deletions(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1345733..cbbaed2 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -872,3 +872,50 @@ bool of_dma_is_coherent(struct device_node *np)
return false;
}
EXPORT_SYMBOL_GPL(of_dma_is_coherent);
+
+/*
+ * of_pci_range_to_resource - Create a resource from an of_pci_range
+ * @range: the PCI range that describes the resource
+ * @np: device node where the range belongs to
+ * @res: pointer to a valid resource that will be updated to
+ * reflect the values contained in the range.
+ *
+ * Returns EINVAL if the range cannot be converted to resource.
+ *
+ * Note that if the range is an IO range, the resource will be converted
+ * using pci_address_to_pio() which can fail if it is called too early or
+ * if the range cannot be matched to any host bridge IO space (our case here).
+ * To guard against that we try to register the IO range first.
+ * If that fails we know that pci_address_to_pio() will do too.
+ */
+int of_pci_range_to_resource(struct of_pci_range *range,
+ struct device_node *np, struct resource *res)
+{
+ int err;
+ res->flags = range->flags;
+ res->parent = res->child = res->sibling = NULL;
+ res->name = np->full_name;
+
+ if (res->flags & IORESOURCE_IO) {
+ unsigned long port = -1;
+ err = pci_register_io_range(range->cpu_addr, range->size);
+ if (err)
+ goto invalid_range;
+ port = pci_address_to_pio(range->cpu_addr);
+ if (port == (unsigned long)-1) {
+ err = -EINVAL;
+ goto invalid_range;
+ }
+ res->start = port;
+ } else {
+ res->start = range->cpu_addr;
+ }
+ res->end = res->start + range->size - 1;
+ return 0;
+
+invalid_range:
+ res->start = (resource_size_t)OF_BAD_ADDR;
+ res->end = (resource_size_t)OF_BAD_ADDR;
+ return err;
+}
+
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index ac4aac4..33c0420 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -23,17 +23,8 @@ struct of_pci_range {
#define for_each_of_pci_range(parser, range) \
for (; of_pci_range_parser_one(parser, range);)
-static inline void of_pci_range_to_resource(struct of_pci_range *range,
- struct device_node *np,
- struct resource *res)
-{
- res->flags = range->flags;
- res->start = range->cpu_addr;
- res->end = range->cpu_addr + range->size - 1;
- res->parent = res->child = res->sibling = NULL;
- res->name = np->full_name;
-}
-
+extern int of_pci_range_to_resource(struct of_pci_range *range,
+ struct device_node *np, struct resource *res);
/* Translate a DMA address from device space to CPU space */
extern u64 of_translate_dma_address(struct device_node *dev,
const __be32 *in_addr);
--
2.0.0
Before commit 7b5436635800, the pci_host_bridge was created before the root bus.
That commit added a needless dependency on the bus for pci_alloc_host_bridge(),
so the creation order was changed for no good reason. Revert the order of
creation, as we are going to depend on the pci_host_bridge structure to retrieve
the domain number of the root bus.
Signed-off-by: Liviu Dudau <[email protected]>
Acked-by: Grant Likely <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
drivers/pci/probe.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e3cf8a2..2c92662 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -515,7 +515,7 @@ static void pci_release_host_bridge_dev(struct device *dev)
kfree(bridge);
}
-static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
+static struct pci_host_bridge *pci_alloc_host_bridge(void)
{
struct pci_host_bridge *bridge;
@@ -524,7 +524,6 @@ static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
return NULL;
INIT_LIST_HEAD(&bridge->windows);
- bridge->bus = b;
return bridge;
}
@@ -1761,9 +1760,16 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
char bus_addr[64];
char *fmt;
+ bridge = pci_alloc_host_bridge();
+ if (!bridge)
+ return NULL;
+
+ bridge->dev.parent = parent;
+ bridge->dev.release = pci_release_host_bridge_dev;
+
b = pci_alloc_bus();
if (!b)
- return NULL;
+ goto err_out;
b->sysdata = sysdata;
b->ops = ops;
@@ -1772,26 +1778,19 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
if (b2) {
/* If we already got to this bus through a different bridge, ignore it */
dev_dbg(&b2->dev, "bus already known\n");
- goto err_out;
+ goto err_bus_out;
}
- bridge = pci_alloc_host_bridge(b);
- if (!bridge)
- goto err_out;
-
- bridge->dev.parent = parent;
- bridge->dev.release = pci_release_host_bridge_dev;
+ bridge->bus = b;
dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
error = pcibios_root_bridge_prepare(bridge);
- if (error) {
- kfree(bridge);
+ if (error)
goto err_out;
- }
error = device_register(&bridge->dev);
if (error) {
put_device(&bridge->dev);
- goto err_out;
+ goto err_bus_out;
}
b->bridge = get_device(&bridge->dev);
device_enable_async_suspend(b->bridge);
@@ -1848,8 +1847,10 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
class_dev_reg_err:
put_device(&bridge->dev);
device_unregister(&bridge->dev);
-err_out:
+err_bus_out:
kfree(b);
+err_out:
+ kfree(bridge);
return NULL;
}
--
2.0.0
Enhance the default implementation of pcibios_add_device() to
parse and map the IRQ of the device if a DT binding is available.
Signed-off-by: Liviu Dudau <[email protected]>
---
drivers/pci/pci.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 63a54a3..8e65dc3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -17,6 +17,7 @@
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/log2.h>
+#include <linux/of_pci.h>
#include <linux/pci-aspm.h>
#include <linux/pm_wakeup.h>
#include <linux/interrupt.h>
@@ -1453,6 +1454,9 @@ EXPORT_SYMBOL(pcim_pin_device);
*/
int __weak pcibios_add_device(struct pci_dev *dev)
{
+#ifdef CONFIG_OF
+ dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
+#endif
return 0;
}
--
2.0.0
Several platforms use a rather generic version of parsing
the device tree to find the host bridge ranges. Move the common code
into the generic PCI code and use it to create a pci_host_bridge
structure that can be used by arch code.
Based on early attempts by Andrew Murray to unify the code.
Used powerpc and microblaze PCI code as starting point.
Signed-off-by: Liviu Dudau <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
drivers/of/of_pci.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/host-bridge.c | 18 +++++++
include/linux/of_pci.h | 10 ++++
include/linux/pci.h | 8 +++
4 files changed, 171 insertions(+)
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..55d8320 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,6 +89,141 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
}
EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
+/**
+ * pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
+ * @dev: device node of the host bridge having the range property
+ * @resources: list where the range of resources will be added after DT parsing
+ * @io_base: pointer to a variable that will contain the physical address for
+ * the start of the I/O range.
+ *
+ * It is the callers job to free the @resources list if an error is returned.
+ *
+ * This function will parse the "ranges" property of a PCI host bridge device
+ * node and setup the resource mapping based on its content. It is expected
+ * that the property conforms with the Power ePAPR document.
+ *
+ * Each architecture is then offered the chance of applying their own
+ * filtering of pci_host_bridge_windows based on their own restrictions by
+ * calling pcibios_fixup_bridge_ranges(). The filtered list of windows
+ * can then be used when creating a pci_host_bridge structure.
+ */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
+ struct list_head *resources, resource_size_t *io_base)
+{
+ struct resource *res;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
+ int err;
+
+ pr_info("PCI host bridge %s ranges:\n", dev->full_name);
+
+ /* Check for ranges property */
+ err = of_pci_range_parser_init(&parser, dev);
+ if (err)
+ return err;
+
+ pr_debug("Parsing ranges property...\n");
+ for_each_of_pci_range(&parser, &range) {
+ /* Read next ranges element */
+ pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
+ range.pci_space, range.pci_addr, range.cpu_addr, range.size);
+
+ /*
+ * If we failed translation or got a zero-sized region
+ * then skip this range
+ */
+ if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+ continue;
+
+ res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ err = of_pci_range_to_resource(&range, dev, res);
+ if (err)
+ return err;
+
+ if (resource_type(res) == IORESOURCE_IO)
+ *io_base = range.cpu_addr;
+
+ pci_add_resource_offset(resources, res,
+ res->start - range.pci_addr);
+ }
+
+ /* Apply architecture specific fixups for the ranges */
+ return pcibios_fixup_bridge_ranges(resources);
+}
+
+static atomic_t domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * of_create_pci_host_bridge - Create a PCI host bridge structure using
+ * information passed in the DT.
+ * @parent: device owning this host bridge
+ * @ops: pci_ops associated with the host controller
+ * @host_data: opaque data structure used by the host controller.
+ *
+ * returns a pointer to the newly created pci_host_bridge structure, or
+ * NULL if the call failed.
+ *
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used that will put each host bridge
+ * in a new domain.
+ */
+struct pci_host_bridge *
+of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
+{
+ int err, domain, busno;
+ struct resource *bus_range;
+ struct pci_bus *root_bus;
+ struct pci_host_bridge *bridge;
+ resource_size_t io_base;
+ LIST_HEAD(res);
+
+ bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
+ if (!bus_range)
+ return ERR_PTR(-ENOMEM);
+
+ domain = of_alias_get_id(parent->of_node, "pci-domain");
+ if (domain == -ENODEV)
+ domain = atomic_inc_return(&domain_nr);
+
+ err = of_pci_parse_bus_range(parent->of_node, bus_range);
+ if (err) {
+ dev_info(parent, "No bus range for %s, using default [0-255]\n",
+ parent->of_node->full_name);
+ bus_range->start = 0;
+ bus_range->end = 255;
+ bus_range->flags = IORESOURCE_BUS;
+ }
+ busno = bus_range->start;
+ pci_add_resource(&res, bus_range);
+
+ /* now parse the rest of host bridge bus ranges */
+ err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base);
+ if (err)
+ goto err_create;
+
+ /* then create the root bus */
+ root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
+ ops, host_data, &res);
+ if (IS_ERR(root_bus)) {
+ err = PTR_ERR(root_bus);
+ goto err_create;
+ }
+
+ bridge = to_pci_host_bridge(root_bus->bridge);
+ bridge->io_base = io_base;
+
+ return bridge;
+
+err_create:
+ pci_free_resource_list(&res);
+ return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(of_create_pci_host_bridge);
+
#ifdef CONFIG_PCI_MSI
static LIST_HEAD(of_pci_msi_chip_list);
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index 36c669e..cfee5d1 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -5,6 +5,9 @@
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_pci.h>
+#include <linux/slab.h>
#include "pci.h"
@@ -83,3 +86,18 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
res->end = region->end + offset;
}
EXPORT_SYMBOL(pcibios_bus_to_resource);
+
+/**
+ * Simple version of the platform specific code for filtering the list
+ * of resources obtained from the ranges declaration in DT.
+ *
+ * Platforms can override this function in order to impose stronger
+ * constraints onto the list of resources that a host bridge can use.
+ * The filtered list will then be used to create a root bus and associate
+ * it with the host bridge.
+ *
+ */
+int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
+{
+ return 0;
+}
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index dde3a4a..71e36d0 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -15,6 +15,9 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
int of_pci_get_devfn(struct device_node *np);
int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
+struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
+ struct pci_ops *ops, void *host_data);
+
#else
static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
{
@@ -43,6 +46,13 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
{
return -EINVAL;
}
+
+static inline struct pci_host_bridge *
+of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
+ void *host_data)
+{
+ return NULL;
+}
#endif
#if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7e7b939..556dc5f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -402,6 +402,7 @@ struct pci_host_bridge {
struct device dev;
struct pci_bus *bus; /* root bus */
int domain_nr;
+ resource_size_t io_base; /* physical address for the start of I/O area */
struct list_head windows; /* pci_host_bridge_windows */
void (*release_fn)(struct pci_host_bridge *);
void *release_data;
@@ -1809,8 +1810,15 @@ static inline void pci_set_of_node(struct pci_dev *dev) { }
static inline void pci_release_of_node(struct pci_dev *dev) { }
static inline void pci_set_bus_of_node(struct pci_bus *bus) { }
static inline void pci_release_bus_of_node(struct pci_bus *bus) { }
+
#endif /* CONFIG_OF */
+/* Used by architecture code to apply any quirks to the list of
+ * pci_host_bridge resource ranges before they are being used
+ * by of_create_pci_host_bridge()
+ */
+extern int pcibios_fixup_bridge_ranges(struct list_head *resources);
+
#ifdef CONFIG_EEH
static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
{
--
2.0.0
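To sketch how a host controller driver would consume this (the foo_* names, the pci_ops structure and the enumeration sequence below are placeholders, not part of the patch; pci_remap_iospace() is the helper added elsewhere in this series):

static int foo_pcie_probe(struct platform_device *pdev)
{
        struct pci_host_bridge_window *win;
        struct pci_host_bridge *bridge;
        struct foo_pcie *pcie;

        pcie = devm_kzalloc(&pdev->dev, sizeof(*pcie), GFP_KERNEL);
        if (!pcie)
                return -ENOMEM;

        /* parse bus-range and ranges from DT, create bridge + root bus */
        bridge = of_create_pci_host_bridge(&pdev->dev, &foo_pcie_ops, pcie);
        if (IS_ERR(bridge))
                return PTR_ERR(bridge);

        /* back any I/O window with the physical address recorded in io_base */
        list_for_each_entry(win, &bridge->windows, list)
                if (resource_type(win->res) == IORESOURCE_IO)
                        pci_remap_iospace(win->res, bridge->io_base);

        pci_scan_child_bus(bridge->bus);
        pci_bus_add_devices(bridge->bus);

        return 0;
}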
Make it easier to discover the domain number of a bus by storing
the number in pci_host_bridge for the root bus. Several architectures
have their own way of storing this information, so it makes sense
to try to unify the code. While at it, add a new function that
creates a root bus in a given domain and make pci_create_root_bus()
a wrapper around it.
Signed-off-by: Liviu Dudau <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
drivers/pci/probe.c | 41 +++++++++++++++++++++++++++++++++--------
include/linux/pci.h | 4 ++++
2 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2c92662..abf5e82 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1748,8 +1748,9 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
{
}
-struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
- struct pci_ops *ops, void *sysdata, struct list_head *resources)
+struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
+ int domain, int bus, struct pci_ops *ops, void *sysdata,
+ struct list_head *resources)
{
int error;
struct pci_host_bridge *bridge;
@@ -1762,27 +1763,31 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
bridge = pci_alloc_host_bridge();
if (!bridge)
- return NULL;
+ return ERR_PTR(-ENOMEM);
bridge->dev.parent = parent;
bridge->dev.release = pci_release_host_bridge_dev;
+ bridge->domain_nr = domain;
b = pci_alloc_bus();
- if (!b)
+ if (!b) {
+ error = -ENOMEM;
goto err_out;
+ }
b->sysdata = sysdata;
b->ops = ops;
b->number = b->busn_res.start = bus;
- b2 = pci_find_bus(pci_domain_nr(b), bus);
+ b2 = pci_find_bus(bridge->domain_nr, bus);
if (b2) {
/* If we already got to this bus through a different bridge, ignore it */
dev_dbg(&b2->dev, "bus already known\n");
+ error = -EEXIST;
goto err_bus_out;
}
bridge->bus = b;
- dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
+ dev_set_name(&bridge->dev, "pci%04x:%02x", bridge->domain_nr, bus);
error = pcibios_root_bridge_prepare(bridge);
if (error)
goto err_out;
@@ -1801,7 +1806,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
b->dev.class = &pcibus_class;
b->dev.parent = b->bridge;
- dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
+ dev_set_name(&b->dev, "%04x:%02x", bridge->domain_nr, bus);
error = device_register(&b->dev);
if (error)
goto class_dev_reg_err;
@@ -1851,7 +1856,27 @@ err_bus_out:
kfree(b);
err_out:
kfree(bridge);
- return NULL;
+ return ERR_PTR(error);
+}
+
+struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
+ struct pci_ops *ops, void *sysdata, struct list_head *resources)
+{
+ int domain_nr;
+ struct pci_bus *b = pci_alloc_bus();
+ if (!b)
+ return NULL;
+
+ b->sysdata = sysdata;
+ domain_nr = pci_domain_nr(b);
+ kfree(b);
+
+ b = pci_create_root_bus_in_domain(parent, domain_nr, bus,
+ ops, sysdata, resources);
+ if (IS_ERR(b))
+ return NULL;
+
+ return b;
}
int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 466bcd1..7e7b939 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -401,6 +401,7 @@ struct pci_host_bridge_window {
struct pci_host_bridge {
struct device dev;
struct pci_bus *bus; /* root bus */
+ int domain_nr;
struct list_head windows; /* pci_host_bridge_windows */
void (*release_fn)(struct pci_host_bridge *);
void *release_data;
@@ -769,6 +770,9 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
struct pci_ops *ops, void *sysdata,
struct list_head *resources);
+struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
+ int domain, int bus, struct pci_ops *ops,
+ void *sysdata, struct list_head *resources);
int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
void pci_bus_release_busn_res(struct pci_bus *b);
--
2.0.0
Some architectures do not have a simple view of the PCI I/O space
and instead use a range of CPU addresses that map to bus addresses. For
some architectures these ranges will be expressed by OF bindings
in a device tree file.
Introduce a pci_register_io_range() helper function with a generic
implementation that can be used by such architectures to keep track
of the I/O ranges described by the PCI bindings. If the PCI_IOBASE
macro is not defined, that signals a lack of support for memory-mapped
PCI I/O and we return an error.
Signed-off-by: Liviu Dudau <[email protected]>
---
drivers/of/address.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/of_address.h | 1 +
2 files changed, 62 insertions(+)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 5edfcb0..1345733 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -5,6 +5,7 @@
#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/pci_regs.h>
+#include <linux/slab.h>
#include <linux/string.h>
/* Max address size we deal with */
@@ -601,12 +602,72 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
}
EXPORT_SYMBOL(of_get_address);
+struct io_range {
+ struct list_head list;
+ phys_addr_t start;
+ resource_size_t size;
+};
+
+static LIST_HEAD(io_range_list);
+
+/*
+ * Record the PCI IO range (expressed as CPU physical address + size).
+ * Return a negative value if an error has occured, zero otherwise
+ */
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifdef PCI_IOBASE
+ struct io_range *res;
+ resource_size_t allocated_size = 0;
+
+ /* check if the range hasn't been previously recorded */
+ list_for_each_entry(res, &io_range_list, list) {
+ if (addr >= res->start && addr + size <= res->start + size)
+ return 0;
+ allocated_size += res->size;
+ }
+
+ /* range not registed yet, check for available space */
+ if (allocated_size + size - 1 > IO_SPACE_LIMIT)
+ return -E2BIG;
+
+ /* add the range to the list */
+ res = kzalloc(sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ res->start = addr;
+ res->size = size;
+
+ list_add_tail(&res->list, &io_range_list);
+
+ return 0;
+#else
+ return -EINVAL;
+#endif
+}
+
unsigned long __weak pci_address_to_pio(phys_addr_t address)
{
+#ifdef PCI_IOBASE
+ struct io_range *res;
+ resource_size_t offset = 0;
+
+ list_for_each_entry(res, &io_range_list, list) {
+ if (address >= res->start &&
+ address < res->start + res->size) {
+ return res->start - address + offset;
+ }
+ offset += res->size;
+ }
+
+ return (unsigned long)-1;
+#else
if (address > IO_SPACE_LIMIT)
return (unsigned long)-1;
return (unsigned long) address;
+#endif
}
static int __of_address_to_resource(struct device_node *dev,
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index c13b878..ac4aac4 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -55,6 +55,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
extern const __be32 *of_get_address(struct device_node *dev, int index,
u64 *size, unsigned int *flags);
+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
extern unsigned long pci_address_to_pio(phys_addr_t addr);
extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
--
2.0.0
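Taken together with pci_address_to_pio() and the pci_remap_iospace() helper added elsewhere in the series, the intended flow for a single I/O range would look roughly like this (an illustrative fragment; error handling trimmed, variable names invented):

        /* range.cpu_addr/range.size come from one "ranges" entry in DT */
        err = pci_register_io_range(range.cpu_addr, range.size);

        /* translate the CPU physical address into a logical port number */
        port = pci_address_to_pio(range.cpu_addr);

        res.flags = IORESOURCE_IO;
        res.start = port;
        res.end   = port + range.size - 1;

        /* later, in the host driver: back the logical ports with the window */
        err = pci_remap_iospace(&res, range.cpu_addr);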
On Tuesday 01 July 2014 19:43:28 Liviu Dudau wrote:
> +/*
> + * Record the PCI IO range (expressed as CPU physical address + size).
> + * Return a negative value if an error has occured, zero otherwise
> + */
> +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> +{
> +#ifdef PCI_IOBASE
> + struct io_range *res;
> + resource_size_t allocated_size = 0;
> +
> + /* check if the range hasn't been previously recorded */
> + list_for_each_entry(res, &io_range_list, list) {
> + if (addr >= res->start && addr + size <= res->start + size)
> + return 0;
> + allocated_size += res->size;
> + }
> +
> + /* range not registed yet, check for available space */
> + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> + return -E2BIG;
> +
> + /* add the range to the list */
> + res = kzalloc(sizeof(*res), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + res->start = addr;
> + res->size = size;
> +
> + list_add_tail(&res->list, &io_range_list);
> +
> + return 0;
> +#else
> + return -EINVAL;
> +#endif
> +}
> +
> unsigned long __weak pci_address_to_pio(phys_addr_t address)
> {
> +#ifdef PCI_IOBASE
> + struct io_range *res;
> + resource_size_t offset = 0;
> +
> + list_for_each_entry(res, &io_range_list, list) {
> + if (address >= res->start &&
> + address < res->start + res->size) {
> + return res->start - address + offset;
> + }
> + offset += res->size;
> + }
> +
> + return (unsigned long)-1;
> +#else
> if (address > IO_SPACE_LIMIT)
> return (unsigned long)-1;
>
> return (unsigned long) address;
> +#endif
> }
This still conflicts with the other allocator you have in patch 9
for pci_remap_iospace: nothing guarantees that the mapping is the
same for both.
Also, this is a completely pointless exercise at this moment, because
nobody cares about the result of pci_address_to_pio on architectures
that don't already provide this function. If we ever get a proper
Open Firmware implementation that wants to put hardcoded PCI devices
into DT, we can add an implementation, but for now this seems overkill.
The allocator in pci_register_io_range seems reasonable, why not merge
this function with pci_remap_iospace() as I have asked you multiple
times before? Just make it return the io_offset so the caller can
put that into the PCI host resources.
Arnd
On Tue, Jul 01, 2014 at 09:36:10PM +0200, Arnd Bergmann wrote:
> On Tuesday 01 July 2014 19:43:28 Liviu Dudau wrote:
> > [...]
>
> This still conflicts with the other allocator you have in patch 9
> for pci_remap_iospace: nothing guarantees that the mapping is the
> same for both.
>
> Also, this is a completely pointless exercise at this moment, because
> nobody cares about the result of pci_address_to_pio on architectures
> that don't already provide this function. If we ever get a proper
> Open Firmware implementation that wants to put hardcoded PCI devices
> into DT, we can add an implementation, but for now this seems overkill.
>
> The allocator in pci_register_io_range seems reasonable, why not merge
> this function with pci_remap_iospace() as I have asked you multiple
> times before? Just make it return the io_offset so the caller can
> put that into the PCI host resources.
Hi Arnd,
While I agree with you that at some point the allocators were inconsistent
with each other, for this version I would respectfully disagree.
The allocator in pci_register_io_range() only makes sure that the ranges
do not overlap; it doesn't do any mapping whatsoever, while
pci_remap_iospace() only does an ioremap_page_range(). The idea is that
you get the offset out of pci_address_to_pio() and apply it to
pci_remap_iospace().
Why do you think there are conflicts?
Best regards,
Liviu
>
> Arnd
Several platforms use a rather generic version of parsing
the device tree to find the host bridge ranges. Move the common code
into the generic PCI code and use it to create a pci_host_bridge
structure that can be used by arch code.
Based on early attempts by Andrew Murray to unify the code.
Used powerpc and microblaze PCI code as starting point.
Signed-off-by: Liviu Dudau <[email protected]>
Tested-by: Tanmay Inamdar <[email protected]>
---
drivers/of/of_pci.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/host-bridge.c | 15 +++++
include/linux/of_pci.h | 10 ++++
include/linux/pci.h | 8 +++
4 files changed, 169 insertions(+)
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 8481996..e81402a 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -1,6 +1,7 @@
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_pci.h>
static inline int __of_pci_pci_compare(struct device_node *node,
@@ -89,6 +90,141 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
}
EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
+/**
+ * pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
+ * @dev: device node of the host bridge having the range property
+ * @resources: list where the range of resources will be added after DT parsing
+ * @io_base: pointer to a variable that will contain the physical address for
+ * the start of the I/O range.
+ *
+ * It is the callers job to free the @resources list if an error is returned.
+ *
+ * This function will parse the "ranges" property of a PCI host bridge device
+ * node and setup the resource mapping based on its content. It is expected
+ * that the property conforms with the Power ePAPR document.
+ *
+ * Each architecture is then offered the chance of applying their own
+ * filtering of pci_host_bridge_windows based on their own restrictions by
+ * calling pcibios_fixup_bridge_ranges(). The filtered list of windows
+ * can then be used when creating a pci_host_bridge structure.
+ */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
+ struct list_head *resources, resource_size_t *io_base)
+{
+ struct resource *res;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
+ int err;
+
+ pr_info("PCI host bridge %s ranges:\n", dev->full_name);
+
+ /* Check for ranges property */
+ err = of_pci_range_parser_init(&parser, dev);
+ if (err)
+ return err;
+
+ pr_debug("Parsing ranges property...\n");
+ for_each_of_pci_range(&parser, &range) {
+ /* Read next ranges element */
+ pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
+ range.pci_space, range.pci_addr, range.cpu_addr, range.size);
+
+ /*
+ * If we failed translation or got a zero-sized region
+ * then skip this range
+ */
+ if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+ continue;
+
+ res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ err = of_pci_range_to_resource(&range, dev, res);
+ if (err)
+ return err;
+
+ if (resource_type(res) == IORESOURCE_IO)
+ *io_base = range.cpu_addr;
+
+ pci_add_resource_offset(resources, res,
+ res->start - range.pci_addr);
+ }
+
+ /* Apply architecture specific fixups for the ranges */
+ return pcibios_fixup_bridge_ranges(resources);
+}
+
+static atomic_t domain_nr = ATOMIC_INIT(-1);
+
+/**
+ * of_create_pci_host_bridge - Create a PCI host bridge structure using
+ * information passed in the DT.
+ * @parent: device owning this host bridge
+ * @ops: pci_ops associated with the host controller
+ * @host_data: opaque data structure used by the host controller.
+ *
+ * returns a pointer to the newly created pci_host_bridge structure, or
+ * NULL if the call failed.
+ *
+ * This function will try to obtain the host bridge domain number by
+ * using of_alias_get_id() call with "pci-domain" as a stem. If that
+ * fails, a local allocator will be used that will put each host bridge
+ * in a new domain.
+ */
+struct pci_host_bridge *
+of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
+{
+ int err, domain, busno;
+ struct resource *bus_range;
+ struct pci_bus *root_bus;
+ struct pci_host_bridge *bridge;
+ resource_size_t io_base = 0;
+ LIST_HEAD(res);
+
+ bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
+ if (!bus_range)
+ return ERR_PTR(-ENOMEM);
+
+ domain = of_alias_get_id(parent->of_node, "pci-domain");
+ if (domain == -ENODEV)
+ domain = atomic_inc_return(&domain_nr);
+
+ err = of_pci_parse_bus_range(parent->of_node, bus_range);
+ if (err) {
+ dev_info(parent, "No bus range for %s, using default [0-255]\n",
+ parent->of_node->full_name);
+ bus_range->start = 0;
+ bus_range->end = 255;
+ bus_range->flags = IORESOURCE_BUS;
+ }
+ busno = bus_range->start;
+ pci_add_resource(&res, bus_range);
+
+ /* now parse the rest of host bridge bus ranges */
+ err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base);
+ if (err)
+ goto err_create;
+
+ /* then create the root bus */
+ root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
+ ops, host_data, &res);
+ if (IS_ERR(root_bus)) {
+ err = PTR_ERR(root_bus);
+ goto err_create;
+ }
+
+ bridge = to_pci_host_bridge(root_bus->bridge);
+ bridge->io_base = io_base;
+
+ return bridge;
+
+err_create:
+ pci_free_resource_list(&res);
+ return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(of_create_pci_host_bridge);
+
#ifdef CONFIG_PCI_MSI
static LIST_HEAD(of_pci_msi_chip_list);
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
index 36c669e..54ceafd 100644
--- a/drivers/pci/host-bridge.c
+++ b/drivers/pci/host-bridge.c
@@ -83,3 +83,18 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
res->end = region->end + offset;
}
EXPORT_SYMBOL(pcibios_bus_to_resource);
+
+/**
+ * Simple version of the platform specific code for filtering the list
+ * of resources obtained from the ranges declaration in DT.
+ *
+ * Platforms can override this function in order to impose stronger
+ * constraints onto the list of resources that a host bridge can use.
+ * The filtered list will then be used to create a root bus and associate
+ * it with the host bridge.
+ *
+ */
+int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
+{
+ return 0;
+}
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index dde3a4a..71e36d0 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -15,6 +15,9 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
int of_pci_get_devfn(struct device_node *np);
int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
+struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
+ struct pci_ops *ops, void *host_data);
+
#else
static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
{
@@ -43,6 +46,13 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
{
return -EINVAL;
}
+
+static inline struct pci_host_bridge *
+of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
+ void *host_data)
+{
+ return NULL;
+}
#endif
#if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7e7b939..556dc5f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -402,6 +402,7 @@ struct pci_host_bridge {
struct device dev;
struct pci_bus *bus; /* root bus */
int domain_nr;
+ resource_size_t io_base; /* physical address for the start of I/O area */
struct list_head windows; /* pci_host_bridge_windows */
void (*release_fn)(struct pci_host_bridge *);
void *release_data;
@@ -1809,8 +1810,15 @@ static inline void pci_set_of_node(struct pci_dev *dev) { }
static inline void pci_release_of_node(struct pci_dev *dev) { }
static inline void pci_set_bus_of_node(struct pci_bus *bus) { }
static inline void pci_release_bus_of_node(struct pci_bus *bus) { }
+
#endif /* CONFIG_OF */
+/* Used by architecture code to apply any quirks to the list of
+ * pci_host_bridge resource ranges before they are being used
+ * by of_create_pci_host_bridge()
+ */
+extern int pcibios_fixup_bridge_ranges(struct list_head *resources);
+
#ifdef CONFIG_EEH
static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
{
--
2.0.0
It is clearly not my day today! This will fail compilation on ARCH=arm due to
a missing include of <linux/slab.h>.
I will send an update tomorrow after more testing. Sorry!
Liviu
On Tue, Jul 01, 2014 at 09:50:50PM +0100, Liviu Dudau wrote:
> Several platforms use a rather generic version of parsing
> the device tree to find the host bridge ranges. Move the common code
> into the generic PCI code and use it to create a pci_host_bridge
> structure that can be used by arch code.
>
> Based on early attempts by Andrew Murray to unify the code.
> Used powerpc and microblaze PCI code as starting point.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> [...]
On Tue, Jul 01, 2014 at 07:43:32PM +0100, Liviu Dudau wrote:
> Enhance the default implementation of pcibios_add_device() to
> parse and map the IRQ of the device if a DT binding is available.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> ---
> drivers/pci/pci.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 63a54a3..8e65dc3 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -17,6 +17,7 @@
> #include <linux/spinlock.h>
> #include <linux/string.h>
> #include <linux/log2.h>
> +#include <linux/of_pci.h>
> #include <linux/pci-aspm.h>
> #include <linux/pm_wakeup.h>
> #include <linux/interrupt.h>
> @@ -1453,6 +1454,9 @@ EXPORT_SYMBOL(pcim_pin_device);
> */
> int __weak pcibios_add_device(struct pci_dev *dev)
> {
> +#ifdef CONFIG_OF
> + dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
> +#endif
You could avoid the #ifdef by only assigning dev->irq if
of_irq_parse_and_map_pci returned something > 0.
Will
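A minimal rendering of that suggestion (an untested sketch, not a posted patch) would be:

int __weak pcibios_add_device(struct pci_dev *dev)
{
        int irq = of_irq_parse_and_map_pci(dev, 0, 0);

        /* only override dev->irq when the DT actually provided a mapping */
        if (irq > 0)
                dev->irq = irq;

        return 0;
}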
On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> Some architectures do not have a simple view of the PCI I/O space
> and instead use a range of CPU addresses that map to bus addresses. For
> some architectures these ranges will be expressed by OF bindings
> in a device tree file.
>
> Introduce a pci_register_io_range() helper function with a generic
> implementation that can be used by such architectures to keep track
> of the I/O ranges described by the PCI bindings. If the PCI_IOBASE
> macro is not defined, that signals a lack of support for memory-mapped
> PCI I/O and we return an error.
[...]
> +/*
> + * Record the PCI IO range (expressed as CPU physical address + size).
> + * Return a negative value if an error has occured, zero otherwise
> + */
> +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> +{
> +#ifdef PCI_IOBASE
> + struct io_range *res;
> + resource_size_t allocated_size = 0;
> +
> + /* check if the range hasn't been previously recorded */
> + list_for_each_entry(res, &io_range_list, list) {
> + if (addr >= res->start && addr + size <= res->start + size)
> + return 0;
> + allocated_size += res->size;
> + }
> +
> + /* range not registed yet, check for available space */
> + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> + return -E2BIG;
> +
> + /* add the range to the list */
> + res = kzalloc(sizeof(*res), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + res->start = addr;
> + res->size = size;
> +
> + list_add_tail(&res->list, &io_range_list);
> +
> + return 0;
Hopefully a stupid question, but how is this serialised? I'm just surprised
that adding to and searching a list are sufficient, unless there's a big
lock somewhere.
Will
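For illustration, one way to serialise the list would be a single mutex around the lookup-and-insert (a sketch only, not part of the posted patch):

static DEFINE_MUTEX(io_range_lock);

int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
{
        int err = 0;

        mutex_lock(&io_range_lock);
        /* ... the existing overlap check, allocation and list_add_tail() ... */
        mutex_unlock(&io_range_lock);

        return err;
}

pci_address_to_pio() would then have to take the same lock when walking the list.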
Hi Liviu,
On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> Several platforms use a rather generic version of parsing
> the device tree to find the host bridge ranges. Move the common code
> into the generic PCI code and use it to create a pci_host_bridge
> structure that can be used by arch code.
>
> Based on early attempts by Andrew Murray to unify the code.
> Used powerpc and microblaze PCI code as starting point.
I just had a quick look at this to see how it differs from the parsing in
pci-host-generic.c, and there are a few small differences worth discussing.
> +static int pci_host_bridge_of_get_ranges(struct device_node *dev,
> + struct list_head *resources, resource_size_t *io_base)
> +{
> + struct resource *res;
> + struct of_pci_range range;
> + struct of_pci_range_parser parser;
> + int err;
> +
> + pr_info("PCI host bridge %s ranges:\n", dev->full_name);
> +
> + /* Check for ranges property */
> + err = of_pci_range_parser_init(&parser, dev);
> + if (err)
> + return err;
> +
> + pr_debug("Parsing ranges property...\n");
> + for_each_of_pci_range(&parser, &range) {
> + /* Read next ranges element */
> + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
> + range.pci_space, range.pci_addr, range.cpu_addr, range.size);
> +
> + /*
> + * If we failed translation or got a zero-sized region
> + * then skip this range
> + */
> + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
> + continue;
> +
> + res = kzalloc(sizeof(struct resource), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + err = of_pci_range_to_resource(&range, dev, res);
> + if (err)
> + return err;
> +
> + if (resource_type(res) == IORESOURCE_IO)
> + *io_base = range.cpu_addr;
> +
> + pci_add_resource_offset(resources, res,
> + res->start - range.pci_addr);
Where do you request_resource before adding it?
> + }
> +
> + /* Apply architecture specific fixups for the ranges */
> + return pcibios_fixup_bridge_ranges(resources);
I currently mandate at least one non-prefetchable resource in the
device-tree. Should I simply drop this restriction, or do I have to somehow
hook this into the pcibios callback?
> +}
> +
> +static atomic_t domain_nr = ATOMIC_INIT(-1);
> +
> +/**
> + * of_create_pci_host_bridge - Create a PCI host bridge structure using
> + * information passed in the DT.
> + * @parent: device owning this host bridge
> + * @ops: pci_ops associated with the host controller
> + * @host_data: opaque data structure used by the host controller.
> + *
> + * returns a pointer to the newly created pci_host_bridge structure, or
> + * NULL if the call failed.
> + *
> + * This function will try to obtain the host bridge domain number by
> + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> + * fails, a local allocator will be used that will put each host bridge
> + * in a new domain.
> + */
> +struct pci_host_bridge *
> +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> +{
> + int err, domain, busno;
> + struct resource *bus_range;
> + struct pci_bus *root_bus;
> + struct pci_host_bridge *bridge;
> + resource_size_t io_base;
> + LIST_HEAD(res);
> +
> + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
> + if (!bus_range)
> + return ERR_PTR(-ENOMEM);
> +
> + domain = of_alias_get_id(parent->of_node, "pci-domain");
> + if (domain == -ENODEV)
> + domain = atomic_inc_return(&domain_nr);
> +
> + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> + if (err) {
> + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> + parent->of_node->full_name);
> + bus_range->start = 0;
> + bus_range->end = 255;
> + bus_range->flags = IORESOURCE_BUS;
What about bus_range->name?
> + }
> + busno = bus_range->start;
> + pci_add_resource(&res, bus_range);
I currently truncate the bus range to fit inside the Configuration Space
window I have (in the reg property). How can I continue to do that with this
patch?
Will
On Tuesday 01 July 2014 21:45:09 Liviu Dudau wrote:
> On Tue, Jul 01, 2014 at 09:36:10PM +0200, Arnd Bergmann wrote:
> > On Tuesday 01 July 2014 19:43:28 Liviu Dudau wrote:
> >
> > This still conflicts with the other allocator you have in patch 9
> > for pci_remap_iospace: nothing guarantees that the mapping is the
> > same for both.
> >
> > Also, this is a completely pointless exercise at this moment, because
> > nobody cares about the result of pci_address_to_pio on architectures
> > that don't already provide this function. If we ever get a proper
> > Open Firmware implementation that wants to put hardcoded PCI devices
> > into DT, we can add an implementation, but for now this seems overkill.
> >
> > The allocator in pci_register_io_range seems reasonable, why not merge
> > this function with pci_remap_iospace() as I have asked you multiple
> > times before? Just make it return the io_offset so the caller can
> > put that into the PCI host resources.
>
> Hi Arnd,
>
> While I agree with you that at some moment the allocators were inconsistent
> wrt each other, for this version I would respectfully disagree on this.
> The allocator in pci_register_io_range() only makes sure that the ranges
> are not overlapping, it doesn't do any mapping whatsoever, while
> pci_remap_iospace() does only an ioremap_page_range(). The idea is that
> you get the offset out of pci_address_to_pio() and apply it to
> pci_remap_iospace().
Ok, got it now, I'm sorry I didn't read this properly at first.
Your solution looks correct to me, just using different
tradeoffs to what I was expecting: You get a working pci_address_to_pio()
function, which is probably never needed, but in turn you need to
keep the state of each host bridge in a global list.
Arnd
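To make the split described in the exchange above concrete, the sequence would look roughly like the sketch below in a host controller probe path. The helper name is invented and the exact pci_remap_iospace() arguments (an I/O resource in port-number space plus the CPU physical address of the window) are an assumption, not code from the series:

static int hypothetical_setup_io(struct of_pci_range *range,
				 struct resource *res)
{
	unsigned long port;
	int err;

	/* record the CPU physical window so pci_address_to_pio() knows about it */
	err = pci_register_io_range(range->cpu_addr, range->size);
	if (err)
		return err;

	/* translate the CPU address into a logical port number */
	port = pci_address_to_pio(range->cpu_addr);
	if (port == (unsigned long)-1)
		return -EINVAL;

	res->flags = IORESOURCE_IO;
	res->start = port;
	res->end   = port + range->size - 1;

	/* map the window so that inb()/outb() on these ports reach the device */
	return pci_remap_iospace(res, range->cpu_addr);
}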
Some more detailed comments now
On Tuesday 01 July 2014 19:43:28 Liviu Dudau wrote:
> +/*
> + * Record the PCI IO range (expressed as CPU physical address + size).
> + * Return a negative value if an error has occured, zero otherwise
> + */
> +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> +{
> +#ifdef PCI_IOBASE
> + struct io_range *res;
I was confused by the variable naming here: A variable named 'res' is
normally a 'struct resource'. Maybe better call this 'range'.
> + resource_size_t allocated_size = 0;
> +
> + /* check if the range hasn't been previously recorded */
> + list_for_each_entry(res, &io_range_list, list) {
> + if (addr >= res->start && addr + size <= res->start + size)
> + return 0;
> + allocated_size += res->size;
> + }
A spin_lock around the list lookup should be sufficient to get around
the race that Will mentioned.
> + /* range not registed yet, check for available space */
> + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> + return -E2BIG;
It might be better to limit the size to 64K if it doesn't fit at first.
Arnd
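For what it is worth, a minimal sketch of that serialisation, reusing the struct io_range and io_range_list from the patch (untested, and the final respin may well differ):

static DEFINE_SPINLOCK(io_range_lock);

int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
{
	struct io_range *range, *new;
	resource_size_t allocated_size = 0;
	int err = 0;

	/* allocate before taking the lock so GFP_KERNEL remains safe */
	new = kzalloc(sizeof(*new), GFP_KERNEL);
	if (!new)
		return -ENOMEM;
	new->start = addr;
	new->size = size;

	spin_lock(&io_range_lock);
	list_for_each_entry(range, &io_range_list, list) {
		if (addr >= range->start &&
		    addr + size <= range->start + range->size)
			goto out;	/* already recorded, nothing to do */
		allocated_size += range->size;
	}

	/* this is also where the size could be clamped to 64K if it does not fit */
	if (allocated_size + size - 1 > IO_SPACE_LIMIT) {
		err = -E2BIG;
		goto out;
	}

	list_add_tail(&new->list, &io_range_list);
	new = NULL;	/* ownership passed to the list */
out:
	spin_unlock(&io_range_lock);
	kfree(new);
	return err;
}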
On Wed, Jul 02, 2014 at 01:38:04PM +0100, Arnd Bergmann wrote:
> Some more detailed comments now
>
> On Tuesday 01 July 2014 19:43:28 Liviu Dudau wrote:
> > +/*
> > + * Record the PCI IO range (expressed as CPU physical address + size).
> > + * Return a negative value if an error has occured, zero otherwise
> > + */
> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> > +{
> > +#ifdef PCI_IOBASE
> > + struct io_range *res;
>
> I was confused by the variable naming here: A variable named 'res' is
> normally a 'struct resource'. Maybe better call this 'range'.
>
> > + resource_size_t allocated_size = 0;
> > +
> > + /* check if the range hasn't been previously recorded */
> > + list_for_each_entry(res, &io_range_list, list) {
> > + if (addr >= res->start && addr + size <= res->start + size)
> > + return 0;
> > + allocated_size += res->size;
> > + }
>
> A spin_lock around the list lookup should be sufficient to get around
> the race that Will mentioned.
>
> > + /* range not registed yet, check for available space */
> > + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> > + return -E2BIG;
>
> It might be better to limit the size to 64K if it doesn't fit at first.
Thanks Arnd for review. Will update and post a new patch soon if I don't
get any other comments.
Best regards,
Liviu
>
>
> Arnd
>
>
On Wed, Jul 02, 2014 at 01:30:31PM +0100, Arnd Bergmann wrote:
> On Tuesday 01 July 2014 21:45:09 Liviu Dudau wrote:
> > On Tue, Jul 01, 2014 at 09:36:10PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 01 July 2014 19:43:28 Liviu Dudau wrote:
> > >
> > > This still conflicts with the other allocator you have in patch 9
> > > for pci_remap_iospace: nothing guarantees that the mapping is the
> > > same for both.
> > >
> > > Also, this is a completely pointless exercise at this moment, because
> > > nobody cares about the result of pci_address_to_pio on architectures
> > > that don't already provide this function. If we ever get a proper
> > > Open Firmware implementation that wants to put hardcoded PCI devices
> > > into DT, we can add an implementation, but for now this seems overkill.
> > >
> > > The allocator in pci_register_io_range seems reasonable, why not merge
> > > this function with pci_remap_iospace() as I have asked you multiple
> > > times before? Just make it return the io_offset so the caller can
> > > put that into the PCI host resources.
> >
> > Hi Arnd,
> >
> > While I agree with you that at some moment the allocators were inconsistent
> > wrt each other, for this version I would respectfully disagree on this.
> > The allocator in pci_register_io_range() only makes sure that the ranges
> > are not overlapping, it doesn't do any mapping whatsoever, while
> > pci_remap_iospace() does only an ioremap_page_range(). The idea is that
> > you get the offset out of pci_address_to_pio() and apply it to
> > pci_remap_iospace().
>
> Ok, got it now, I'm sorry I didn't read this properly at first.
>
> Your solution looks correct to me, just using different
> tradeoffs to what I was expecting: You get a working pci_address_to_pio()
> function, which is probably never needed, but in turn you need to
> keep the state of each host bridge in a global list.
Just a reminder that with my patchset I *do* start using pci_address_to_pio()
in order to correctly parse the IO ranges from DT.
Best regards,
Liviu
>
> Arnd
>
>
On Wednesday 02 July 2014 15:23:03 Liviu Dudau wrote:
> >
> > Your solution looks correct to me, just using different
> > tradeoffs to what I was expecting: You get a working pci_address_to_pio()
> > function, which is probably never needed, but in turn you need to
> > keep the state of each host bridge in a global list.
>
> Just a reminder that with my patchset I *do* start using pci_address_to_pio()
> in order to correctly parse the IO ranges from DT.
Yes, what I meant is that it would be easier not to do that. All existing
drivers expect of_pci_range_to_resource() to return the CPU address for
an I/O space register, not the Linux I/O port number that we want to
pass to the PCI core. This is suboptimal because it's not obvious how
it works, but it lets us get away without an extra registration step.
Once all probe functions in PCI host drivers have been changed to use
of_create_pci_host_bridge, that should not matter any more, because
there is only one place left that calls it and we only have to get it
right once.
Also, when you change that of_pci_range_to_resource, you also have to
audit all callers of that function and ensure they can deal with the new
behavior.
Arnd
On Wed, Jul 02, 2014 at 12:22:22PM +0100, Will Deacon wrote:
> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> > Some architectures do not have a simple view of the PCI I/O space
> > and instead use a range of CPU addresses that map to bus addresses. For
> > some architectures these ranges will be expressed by OF bindings
> > in a device tree file.
> >
> > Introduce a pci_register_io_range() helper function with a generic
> > implementation that can be used by such architectures to keep track
> > of the I/O ranges described by the PCI bindings. If the PCI_IOBASE
> > macro is not defined that signals lack of support for PCI and we
> > return an error.
>
> [...]
>
> > +/*
> > + * Record the PCI IO range (expressed as CPU physical address + size).
> > + * Return a negative value if an error has occured, zero otherwise
> > + */
> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> > +{
> > +#ifdef PCI_IOBASE
> > + struct io_range *res;
> > + resource_size_t allocated_size = 0;
> > +
> > + /* check if the range hasn't been previously recorded */
> > + list_for_each_entry(res, &io_range_list, list) {
> > + if (addr >= res->start && addr + size <= res->start + size)
> > + return 0;
> > + allocated_size += res->size;
> > + }
> > +
> > + /* range not registed yet, check for available space */
> > + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> > + return -E2BIG;
> > +
> > + /* add the range to the list */
> > + res = kzalloc(sizeof(*res), GFP_KERNEL);
> > + if (!res)
> > + return -ENOMEM;
> > +
> > + res->start = addr;
> > + res->size = size;
> > +
> > + list_add_tail(&res->list, &io_range_list);
> > +
> > + return 0;
>
> Hopefully a stupid question, but how is this serialised? I'm just surprised
> that adding to and searching a list are sufficient, unless there's a big
> lock somewhere.
Sorry, tripped into my own filters!
You are right, there is no serialisation here, will add one.
Best regards,
Liviu
>
> Will
On Wed, Jul 02, 2014 at 12:22:30PM +0100, Will Deacon wrote:
> Hi Liviu,
>
> On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > Several platforms use a rather generic version of parsing
> > the device tree to find the host bridge ranges. Move the common code
> > into the generic PCI code and use it to create a pci_host_bridge
> > structure that can be used by arch code.
> >
> > Based on early attempts by Andrew Murray to unify the code.
> > Used powerpc and microblaze PCI code as starting point.
>
> I just had a quick look at this to see how it differs from the parsing in
> pci-host-generic.c and there are a few small differences worth discussing.
>
> > +static int pci_host_bridge_of_get_ranges(struct device_node *dev,
> > + struct list_head *resources, resource_size_t *io_base)
> > +{
> > + struct resource *res;
> > + struct of_pci_range range;
> > + struct of_pci_range_parser parser;
> > + int err;
> > +
> > + pr_info("PCI host bridge %s ranges:\n", dev->full_name);
> > +
> > + /* Check for ranges property */
> > + err = of_pci_range_parser_init(&parser, dev);
> > + if (err)
> > + return err;
> > +
> > + pr_debug("Parsing ranges property...\n");
> > + for_each_of_pci_range(&parser, &range) {
> > + /* Read next ranges element */
> > + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
> > + range.pci_space, range.pci_addr, range.cpu_addr, range.size);
> > +
> > + /*
> > + * If we failed translation or got a zero-sized region
> > + * then skip this range
> > + */
> > + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
> > + continue;
> > +
> > + res = kzalloc(sizeof(struct resource), GFP_KERNEL);
> > + if (!res)
> > + return -ENOMEM;
> > +
> > + err = of_pci_range_to_resource(&range, dev, res);
> > + if (err)
> > + return err;
> > +
> > + if (resource_type(res) == IORESOURCE_IO)
> > + *io_base = range.cpu_addr;
> > +
> > + pci_add_resource_offset(resources, res,
> > + res->start - range.pci_addr);
>
> Where do you request_resource before adding it?
I don't, because I'm expecting that arch code might filter the list. When
the host bridge code calls pci_scan_root_bus(), the resources will be
requested.
>
> > + }
> > +
> > + /* Apply architecture specific fixups for the ranges */
> > + return pcibios_fixup_bridge_ranges(resources);
>
> I currently mandate at least one non-prefetchable resource in the
> device-tree. Should I simply drop this restriction, or do I have to somehow
> hook this into the pcibios callback?
I don't think I understand why you need at least one non-prefetchable resource
but if you want to mandate that then the pcibios_fixup_bridge_ranges() would
be the place to put that check.
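Something along these lines ought to do it, assuming the resource list carries the pci_host_bridge_window entries that pci_add_resource_offset() creates (an untested sketch):

int pcibios_fixup_bridge_ranges(struct list_head *resources)
{
	struct pci_host_bridge_window *window;

	list_for_each_entry(window, resources, list) {
		struct resource *res = window->res;

		/* accept the bridge as soon as one non-prefetchable MEM window is seen */
		if (resource_type(res) == IORESOURCE_MEM &&
		    !(res->flags & IORESOURCE_PREFETCH))
			return 0;
	}

	/* no non-prefetchable memory window described in the device tree */
	return -EINVAL;
}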
>
> > +}
> > +
> > +static atomic_t domain_nr = ATOMIC_INIT(-1);
> > +
> > +/**
> > + * of_create_pci_host_bridge - Create a PCI host bridge structure using
> > + * information passed in the DT.
> > + * @parent: device owning this host bridge
> > + * @ops: pci_ops associated with the host controller
> > + * @host_data: opaque data structure used by the host controller.
> > + *
> > + * returns a pointer to the newly created pci_host_bridge structure, or
> > + * NULL if the call failed.
> > + *
> > + * This function will try to obtain the host bridge domain number by
> > + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> > + * fails, a local allocator will be used that will put each host bridge
> > + * in a new domain.
> > + */
> > +struct pci_host_bridge *
> > +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> > +{
> > + int err, domain, busno;
> > + struct resource *bus_range;
> > + struct pci_bus *root_bus;
> > + struct pci_host_bridge *bridge;
> > + resource_size_t io_base;
> > + LIST_HEAD(res);
> > +
> > + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
> > + if (!bus_range)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + domain = of_alias_get_id(parent->of_node, "pci-domain");
> > + if (domain == -ENODEV)
> > + domain = atomic_inc_return(&domain_nr);
> > +
> > + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> > + if (err) {
> > + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> > + parent->of_node->full_name);
> > + bus_range->start = 0;
> > + bus_range->end = 255;
> > + bus_range->flags = IORESOURCE_BUS;
>
> What about bus_range->name?
Don't know! Is anyone using it?
>
> > + }
> > + busno = bus_range->start;
> > + pci_add_resource(&res, bus_range);
>
> I currently truncate the bus range to fit inside the Configuration Space
> window I have (in the reg property). How can I continue to do that with this
> patch?
Not easily, unless I add an argument to this function that will allow you to
pass in the max number for the bus range; then the code becomes:
+ err = of_pci_parse_bus_range(parent->of_node, bus_range);
+ if (err) {
+ dev_info(parent, "No bus range for %s, using default [0-%d]\n",
+ parent->of_node->full_name, max_range);
+ bus_range->start = 0;
+ bus_range->end = max_range;
+ bus_range->flags = IORESOURCE_BUS;
+ } else {
+ if (bus_range->end > bus_range->start + max_range) {
+ bus_range->end = bus_range->start + max_range;
+ }
+ }
Or something like that.
Best regards,
Liviu
>
> Will
On Wed, Jul 02, 2014 at 06:23:55PM +0100, Liviu Dudau wrote:
> On Wed, Jul 02, 2014 at 12:22:30PM +0100, Will Deacon wrote:
> > On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > > Several platforms use a rather generic version of parsing
> > > the device tree to find the host bridge ranges. Move the common code
> > > into the generic PCI code and use it to create a pci_host_bridge
> > > structure that can be used by arch code.
> > >
> > > Based on early attempts by Andrew Murray to unify the code.
> > > Used powerpc and microblaze PCI code as starting point.
> >
> > I just had a quick look at this to see how it differs from the parsing in
> > pci-host-generic.c and there are a few small differences worth discussing.
[...]
> > > + }
> > > +
> > > + /* Apply architecture specific fixups for the ranges */
> > > + return pcibios_fixup_bridge_ranges(resources);
> >
> > I currently mandate at least one non-prefetchable resource in the
> > device-tree. Should I simply drop this restriction, or do I have to somehow
> > hook this into the pcibios callback?
>
> I don't think I understand why you need at least one non-prefetchable resource
> but if you want to mandate that then the pcibios_fixup_bridge_ranges() would
> be the place to put that check.
I think it was Arnd's idea at the time:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/232225.html
and it's probably worth keeping if possible (just to avoid changes to the
behaviour of the existing driver).
However, that means I already need a host-controller callback via
pcibios_fixup_bridge_ranges...
> > > + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> > > + if (err) {
> > > + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> > > + parent->of_node->full_name);
> > > + bus_range->start = 0;
> > > + bus_range->end = 255;
> > > + bus_range->flags = IORESOURCE_BUS;
> >
> > What about bus_range->name?
>
> Don't know! Is anyone using it?
I guess /proc/iomem prints it out? I set it in my current driver, if you
want to take a look.
> >
> > > + }
> > > + busno = bus_range->start;
> > > + pci_add_resource(&res, bus_range);
> >
> > I currently truncate the bus range to fit inside the Configuration Space
> > window I have (in the reg property). How can I continue to do that with this
> > patch?
>
> Not easily, unless I add an argument to this function that will allow you to
> pass in the max number for the bus range; then the code becomes:
>
> + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> + if (err) {
> + dev_info(parent, "No bus range for %s, using default [0-%d]\n",
> + parent->of_node->full_name, max_range);
> + bus_range->start = 0;
> + bus_range->end = max_range;
> + bus_range->flags = IORESOURCE_BUS;
> + } else {
> + if (bus_range->end > bus_range->start + max_range) {
> + bus_range->end = bus_range->start + max_range;
> + }
> + }
>
> Or something like that.
Again, take a look at my driver (it's in mainline now) to see how I deal
with this.
Will
Hi,
On Tue, Jul 1, 2014 at 11:43 AM, Liviu Dudau <[email protected]> wrote:
> This is a useful function and we should make it visible outside the
> generic PCI code. Export it as a GPL symbol.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> ---
> drivers/pci/host-bridge.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> index 0e5f3c9..36c669e 100644
> --- a/drivers/pci/host-bridge.c
> +++ b/drivers/pci/host-bridge.c
> @@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
> return bus;
> }
>
> -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> {
> struct pci_bus *root_bus = find_pci_root_bus(bus);
>
> return to_pci_host_bridge(root_bus->bridge);
> }
> +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
Is there any specific reason behind making this symbol GPL? I think
the other functions in this file are just EXPORT_SYMBOL. Ultimately
companies which have non-GPL Linux modules (nvidia) will face issues
using this API.
The same applies to 'of_create_pci_host_bridge'.
>
> void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
> void (*release_fn)(struct pci_host_bridge *),
> --
> 2.0.0
>
On Wednesday 02 July 2014 18:31:13 Will Deacon wrote:
> > > > + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> > > > + if (err) {
> > > > + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> > > > + parent->of_node->full_name);
> > > > + bus_range->start = 0;
> > > > + bus_range->end = 255;
> > > > + bus_range->flags = IORESOURCE_BUS;
> > >
> > > What about bus_range->name?
> >
> > Don't know! Is anyone using it?
>
> I guess /proc/iomem prints it out? I set it in my current driver, if you
> want to take a look.
I don't think the bus resources show up anywhere in procfs. Anyway, it's
always a good idea to give resources a name, if only for debugging purposes.
Arnd
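In of_create_pci_host_bridge() that could be a one-liner, e.g. naming the default bus resource after the bridge's device node (a sketch, not code from the series):

	bus_range->name = parent->of_node->full_name;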
On Wednesday 02 July 2014 11:06:38 Tanmay Inamdar wrote:
> On Tue, Jul 1, 2014 at 11:43 AM, Liviu Dudau <[email protected]> wrote:
> > This is a useful function and we should make it visible outside the
> > generic PCI code. Export it as a GPL symbol.
> >
> > Signed-off-by: Liviu Dudau <[email protected]>
> > Tested-by: Tanmay Inamdar <[email protected]>
> > ---
> > drivers/pci/host-bridge.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> > index 0e5f3c9..36c669e 100644
> > --- a/drivers/pci/host-bridge.c
> > +++ b/drivers/pci/host-bridge.c
> > @@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
> > return bus;
> > }
> >
> > -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> > +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> > {
> > struct pci_bus *root_bus = find_pci_root_bus(bus);
> >
> > return to_pci_host_bridge(root_bus->bridge);
> > }
> > +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
>
> Is there any specific reason behind making this symbol GPL? I think
> the other functions in this file are just EXPORT_SYMBOL. Ultimately
> companies which have non-GPL Linux modules (nvidia) will face issues
> using this API.
>
> The same applies to 'of_create_pci_host_bridge'.
I think EXPORT_SYMBOL_GPL() is better here. The new symbols are unlikely
to be used by a peripheral device driver, and PCI host controllers are
already restricted by EXPORT_SYMBOL_GPL.
nvidia will certainly not do a PCI host controller driver that is not
upstream or not GPL-compatible.
Arnd
On Wed, Jul 2, 2014 at 12:12 PM, Arnd Bergmann <[email protected]> wrote:
> On Wednesday 02 July 2014 11:06:38 Tanmay Inamdar wrote:
>> On Tue, Jul 1, 2014 at 11:43 AM, Liviu Dudau <[email protected]> wrote:
>> > This is a useful function and we should make it visible outside the
>> > generic PCI code. Export it as a GPL symbol.
>> >
>> > Signed-off-by: Liviu Dudau <[email protected]>
>> > Tested-by: Tanmay Inamdar <[email protected]>
>> > ---
>> > drivers/pci/host-bridge.c | 3 ++-
>> > 1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
>> > index 0e5f3c9..36c669e 100644
>> > --- a/drivers/pci/host-bridge.c
>> > +++ b/drivers/pci/host-bridge.c
>> > @@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
>> > return bus;
>> > }
>> >
>> > -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
>> > +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
>> > {
>> > struct pci_bus *root_bus = find_pci_root_bus(bus);
>> >
>> > return to_pci_host_bridge(root_bus->bridge);
>> > }
>> > +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
>>
>> Is there any specific reason behind making this symbol GPL? I think
>> the other functions in this file are just EXPORT_SYMBOL. Ultimately
>> companies which have non-GPL Linux modules (nvidia) will face issues
>> using this API.
>>
>> The same applies to 'of_create_pci_host_bridge'.
>
> I think EXPORT_SYMBOL_GPL() is better here. The new symbols are unlikely
> to be used by a peripheral device driver, and PCI host controllers are
> already restricted by EXPORT_SYMBOL_GPL.
>
You are right as long as the functions are not used directly. But what
if GPL functions are called indirectly? For example, 'pci_domain_nr'
implementation in Liviu's V7 series calls 'find_pci_host_bridge'.
> nvidia will certainly not do a PCI host controller driver that is not
> upstream or not GPL-compatible.
>
> Arnd
On Wed, Jul 02, 2014 at 09:43:41PM +0100, Tanmay Inamdar wrote:
> On Wed, Jul 2, 2014 at 12:12 PM, Arnd Bergmann <[email protected]> wrote:
> > On Wednesday 02 July 2014 11:06:38 Tanmay Inamdar wrote:
> >> On Tue, Jul 1, 2014 at 11:43 AM, Liviu Dudau <[email protected]> wrote:
> >> > This is a useful function and we should make it visible outside the
> >> > generic PCI code. Export it as a GPL symbol.
> >> >
> >> > Signed-off-by: Liviu Dudau <[email protected]>
> >> > Tested-by: Tanmay Inamdar <[email protected]>
> >> > ---
> >> > drivers/pci/host-bridge.c | 3 ++-
> >> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> >> > index 0e5f3c9..36c669e 100644
> >> > --- a/drivers/pci/host-bridge.c
> >> > +++ b/drivers/pci/host-bridge.c
> >> > @@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
> >> > return bus;
> >> > }
> >> >
> >> > -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> >> > +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> >> > {
> >> > struct pci_bus *root_bus = find_pci_root_bus(bus);
> >> >
> >> > return to_pci_host_bridge(root_bus->bridge);
> >> > }
> >> > +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
> >>
> >> Is there any specific reason behind making this symbol GPL? I think
> >> the other functions in this file are just EXPORT_SYMBOL. Ultimately
> >> companies which have non-GPL Linux modules (nvidia) will face issues
> >> using this API.
> >>
> >> The same applies to 'of_create_pci_host_bridge'.
> >
> > I think EXPORT_SYMBOL_GPL() is better here. The new symbols are unlikely
> > to be used by a peripheral device driver, and PCI host controllers are
> > already restricted by EXPORT_SYMBOL_GPL.
> >
>
> You are right as long as the functions are not used directly. But what
> if GPL functions are called indirectly. For example, 'pci_domain_nr'
> implementation in Liviu's V7 series calls 'find_pci_host_bridge'.
I will not be drawn into the discussion of EXPORT_SYMBOL vs EXPORT_SYMBOL_GPL()
other than to say that I don't understand what is so secret in implementing
a standard. I do not want to support host bridge drivers that are not open
source.
Best regards,
Liviu
>
> > nvidia will certainly not do a PCI host controller driver that is not
> > upstream or not GPL-compatible.
> >
> > Arnd
>
On Wednesday 02 July 2014 13:43:41 Tanmay Inamdar wrote:
> On Wed, Jul 2, 2014 at 12:12 PM, Arnd Bergmann <[email protected]> wrote:
> > I think EXPORT_SYMBOL_GPL() is better here. The new symbols are unlikely
> > to be used by a peripheral device driver, and PCI host controllers are
> > already restricted by EXPORT_SYMBOL_GPL.
> >
>
> You are right as long as the functions are not used directly. But what
> if GPL functions are called indirectly. For example, 'pci_domain_nr'
> implementation in Liviu's V7 series calls 'find_pci_host_bridge'.
Good point. If pci_domain_nr() doesn't require access to an EXPORT_SYMBOL_GPL
symbol, it should not start doing that after this patch.
For of_create_pci_host_bridge() however, I can't think of any reason to use
a legacy EXPORT_SYMBOL.
Arnd
On Wed, Jul 2, 2014 at 6:17 AM, Will Deacon <[email protected]> wrote:
> On Tue, Jul 01, 2014 at 07:43:32PM +0100, Liviu Dudau wrote:
>> Enhance the default implementation of pcibios_add_device() to
>> parse and map the IRQ of the device if a DT binding is available.
>>
>> Signed-off-by: Liviu Dudau <[email protected]>
>> ---
>> drivers/pci/pci.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 63a54a3..8e65dc3 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -17,6 +17,7 @@
>> #include <linux/spinlock.h>
>> #include <linux/string.h>
>> #include <linux/log2.h>
>> +#include <linux/of_pci.h>
>> #include <linux/pci-aspm.h>
>> #include <linux/pm_wakeup.h>
>> #include <linux/interrupt.h>
>> @@ -1453,6 +1454,9 @@ EXPORT_SYMBOL(pcim_pin_device);
>> */
>> int __weak pcibios_add_device(struct pci_dev *dev)
>> {
>> +#ifdef CONFIG_OF
>> + dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
>> +#endif
>
> You could avoid the #ifdef by only assigning dev->irq if
> of_irq_parse_and_map_pci returned something > 0.
Perhaps it could just be unconditional. Presumably, dev->irq is
not already set to something valid and setting it to <= 0 should not
have any consequences.
Rob
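For reference, Will's variant would be something like the following sketch, relying on the !CONFIG_OF stub of of_irq_parse_and_map_pci() returning 0:

int __weak pcibios_add_device(struct pci_dev *dev)
{
	int irq = of_irq_parse_and_map_pci(dev, 0, 0);

	/* only override dev->irq when the DT actually described a legacy IRQ */
	if (irq > 0)
		dev->irq = irq;

	return 0;
}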
On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> The ranges property for a host bridge controller in DT describes
> the mapping between the PCI bus address and the CPU physical address.
> The resources framework however expects that the IO resources start
> at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
> The conversion from pci ranges to resources failed to take that into account.
I don't think this change is right. There are 2 resources: the PCI bus
addresses and cpu addresses. This function deals with the cpu
addresses. Returning pci addresses for i/o and cpu addresses for
memory is going to be error prone. We probably need both cpu and pci
resources exposed to host controllers.
Making the new function only deal with i/o bus resources and naming it
of_pci_range_to_io_resource would be better.
Rob
> In the process move the function into drivers/of/address.c as it now
> depends on pci_address_to_pio() code and make it return an error message.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> ---
> drivers/of/address.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/of_address.h | 13 ++-----------
> 2 files changed, 49 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 1345733..cbbaed2 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -872,3 +872,50 @@ bool of_dma_is_coherent(struct device_node *np)
> return false;
> }
> EXPORT_SYMBOL_GPL(of_dma_is_coherent);
> +
> +/*
> + * of_pci_range_to_resource - Create a resource from an of_pci_range
> + * @range: the PCI range that describes the resource
> + * @np: device node where the range belongs to
> + * @res: pointer to a valid resource that will be updated to
> + * reflect the values contained in the range.
> + *
> + * Returns EINVAL if the range cannot be converted to resource.
> + *
> + * Note that if the range is an IO range, the resource will be converted
> + * using pci_address_to_pio() which can fail if it is called too early or
> + * if the range cannot be matched to any host bridge IO space (our case here).
> + * To guard against that we try to register the IO range first.
> + * If that fails we know that pci_address_to_pio() will do too.
> + */
> +int of_pci_range_to_resource(struct of_pci_range *range,
> + struct device_node *np, struct resource *res)
> +{
> + int err;
> + res->flags = range->flags;
> + res->parent = res->child = res->sibling = NULL;
> + res->name = np->full_name;
> +
> + if (res->flags & IORESOURCE_IO) {
> + unsigned long port = -1;
> + err = pci_register_io_range(range->cpu_addr, range->size);
> + if (err)
> + goto invalid_range;
> + port = pci_address_to_pio(range->cpu_addr);
> + if (port == (unsigned long)-1) {
> + err = -EINVAL;
> + goto invalid_range;
> + }
> + res->start = port;
> + } else {
> + res->start = range->cpu_addr;
> + }
> + res->end = res->start + range->size - 1;
> + return 0;
> +
> +invalid_range:
> + res->start = (resource_size_t)OF_BAD_ADDR;
> + res->end = (resource_size_t)OF_BAD_ADDR;
> + return err;
> +}
> +
> diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> index ac4aac4..33c0420 100644
> --- a/include/linux/of_address.h
> +++ b/include/linux/of_address.h
> @@ -23,17 +23,8 @@ struct of_pci_range {
> #define for_each_of_pci_range(parser, range) \
> for (; of_pci_range_parser_one(parser, range);)
>
> -static inline void of_pci_range_to_resource(struct of_pci_range *range,
> - struct device_node *np,
> - struct resource *res)
> -{
> - res->flags = range->flags;
> - res->start = range->cpu_addr;
> - res->end = range->cpu_addr + range->size - 1;
> - res->parent = res->child = res->sibling = NULL;
> - res->name = np->full_name;
> -}
> -
> +extern int of_pci_range_to_resource(struct of_pci_range *range,
> + struct device_node *np, struct resource *res);
> /* Translate a DMA address from device space to CPU space */
> extern u64 of_translate_dma_address(struct device_node *dev,
> const __be32 *in_addr);
> --
> 2.0.0
>
>
On Saturday 05 July 2014 14:25:52 Rob Herring wrote:
> On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > The ranges property for a host bridge controller in DT describes
> > the mapping between the PCI bus address and the CPU physical address.
> > The resources framework however expects that the IO resources start
> > at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
> > The conversion from pci ranges to resources failed to take that into account.
>
> I don't think this change is right. There are 2 resources: the PCI bus
> addresses and cpu addresses. This function deals with the cpu
> addresses. Returning pci addresses for i/o and cpu addresses for
> memory is going to be error prone. We probably need both cpu and pci
> resources exposed to host controllers.
>
> Making the new function only deal with i/o bus resources and naming it
> of_pci_range_to_io_resource would be better.
I think you are correct that this change by itself will break existing
drivers that rely on the current behavior of of_pci_range_to_resource,
but there is also something wrong with the existing implementation:
of_pci_range_to_resource() at the moment returns the address in
cpu address space (i.e. IORESOURCE_MEM) but sets the res->flags
value to IORESOURCE_IO, which means it doesn't fit into the resource
tree. Liviu's version gets that part right, and it would be nice
to fix that eventually, however we do it here.
Arnd
On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> This is my resurected attempt at adding support for generic PCI host
> bridge controllers that make use of device tree information to
> configure themselves. I've tagged it as v8 although the patches
> have now been reshuffled in order to ease adoption so referring to
> the older versions might be a bit of a hoop jumping exercise.
>
> Changes from v7:
> - Reordered the patches so that fixes and non-controversial patches
> from v7 can be accepted more easily. If agreed I can split the
> series again into patches that can be upstreamed easily and ones
> that still need discussion.
> - Moved the of_create_host_bridge() function to drivers/of/of_pci.c
> to better reflect its target use.
> - Added the function to remap the bus I/O resources that used to be
> provided in my arm64 patch series and (re)named it pci_remap_iospace()
> - Removed error code checking from parsing and mapping of IRQ from DT
> in recognition that some PCI devices will not have legacy IRQ mappings.
>
> v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
Can you publish a branch for this series please.
Rob
On Sat, Jul 05, 2014 at 09:46:09PM +0100, Arnd Bergmann wrote:
> On Saturday 05 July 2014 14:25:52 Rob Herring wrote:
> > On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > > The ranges property for a host bridge controller in DT describes
> > > the mapping between the PCI bus address and the CPU physical address.
> > > The resources framework however expects that the IO resources start
> > > at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
> > > The conversion from pci ranges to resources failed to take that into account.
> >
> > I don't think this change is right. There are 2 resources: the PCI bus
> > addresses and cpu addresses. This function deals with the cpu
> > addresses. Returning pci addresses for i/o and cpu addresses for
> > memory is going to be error prone. We probably need both cpu and pci
> > resources exposed to host controllers.
> >
> > Making the new function only deal with i/o bus resources and naming it
> > of_pci_range_to_io_resource would be better.
>
> I think you are correct that this change by itself will break existing
> drivers that rely on the current behavior of of_pci_range_to_resource,
> but there is also something wrong with the existing implementation:
Either I'm very confused or I've managed to confuse everyone else. The I/O
resources described using CPU addresses *are* using "pseudo" port based
addresses (or at least that is my understanding and my reading of the code).
Can you point me to a function that is expecting the IO resource to have
the start address at the physical address of the mapped space?
I was trying to fix exactly this issue, that you cannot use the resource
structure returned by this function in any call that is expecting an IO
resource.
Rob, you can try this function with two host bridges. Patch [3/9] changes
pci_address_to_pio() to calculate the offset of the range based on already
registered ranges, so the first host bridge will have its IO resources
starting from zero, but the second host bridge will have .start offset
by the size of the IO space of the first bridge. That is not a PCI bus
address AFAICT.
Best regards,
Liviu
>
> of_pci_range_to_resource() at the moment returns the address in
> cpu address space (i.e. IORESOURCE_MEM) but sets the res->flags
> value to IORESOURCE_IO, which means it doesn't fit into the resource
> tree. Liviu's version gets that part right, and it would be nice
> to fix that eventually, however we do it here.
>
> Arnd
>
>
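To illustrate the behaviour Liviu describes with made-up numbers, assume two host bridges that each expose a 64K I/O window (addresses are hypothetical):

	unsigned long port;

	/* first host bridge: I/O window at CPU physical 0x40000000 */
	pci_register_io_range(0x40000000, 0x10000);
	port = pci_address_to_pio(0x40000000);	/* -> 0x0000, ports start at zero */

	/* second host bridge: I/O window at CPU physical 0x80000000 */
	pci_register_io_range(0x80000000, 0x10000);
	port = pci_address_to_pio(0x80000000);	/* -> 0x10000, offset by the first window */

So the second bridge's I/O resource starts at logical port 0x10000, which is neither a CPU physical address nor a PCI bus address.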
On Sun, Jul 06, 2014 at 04:23:43PM +0100, Rob Herring wrote:
> On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > This is my resurected attempt at adding support for generic PCI host
> > bridge controllers that make use of device tree information to
> > configure themselves. I've tagged it as v8 although the patches
> > have now been reshuffled in order to ease adoption so referring to
> > the older versions might be a bit of a hoop jumping exercise.
> >
> > Changes from v7:
> > - Reordered the patches so that fixes and non-controversial patches
> > from v7 can be accepted more easily. If agreed I can split the
> > series again into patches that can be upstreamed easily and ones
> > that still need discussion.
> > - Moved the of_create_host_bridge() function to drivers/of/of_pci.c
> > to better reflect its target use.
> > - Added the function to remap the bus I/O resources that used to be
> > provided in my arm64 patch series and (re)named it pci_remap_iospace()
> > - Removed error code checking from parsing and mapping of IRQ from DT
> > in recognition that some PCI devices will not have legacy IRQ mappings.
> >
> > v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
>
> Can you publish a branch for this series please.
Do you want a branch that has the series as published (+ the one obvious miss
on the header include) or a branch that has the comments rolled in, but is
not published yet as I'm waiting on answers from Bjorn regarding domain
number handling?
Best regards,
Liviu
>
> Rob
>
On Monday 07 July 2014, Liviu Dudau wrote:
> On Sat, Jul 05, 2014 at 09:46:09PM +0100, Arnd Bergmann wrote:
> > On Saturday 05 July 2014 14:25:52 Rob Herring wrote:
> > > On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > > > The ranges property for a host bridge controller in DT describes
> > > > the mapping between the PCI bus address and the CPU physical address.
> > > > The resources framework however expects that the IO resources start
> > > > at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
> > > > The conversion from pci ranges to resources failed to take that into account.
> > >
> > > I don't think this change is right. There are 2 resources: the PCI bus
> > > addresses and cpu addresses. This function deals with the cpu
> > > addresses. Returning pci addresses for i/o and cpu addresses for
> > > memory is going to be error prone. We probably need both cpu and pci
> > > resources exposed to host controllers.
> > >
> > > Making the new function only deal with i/o bus resources and naming it
> > > of_pci_range_to_io_resource would be better.
> >
> > I think you are correct that this change by itself will break existing
> > drivers that rely on the current behavior of of_pci_range_to_resource,
> > but there is also something wrong with the existing implementation:
>
> Either I'm very confused or I've managed to confuse everyone else. The I/O
> resources described using CPU addresses *are* using "pseudo" port based
> addresses (or at least that is my understanding and my reading of the code).
> Can you point me to a function that is expecting the IO resource to have
> the start address at the physical address of the mapped space?
pci_v3_preinit() in arch/arm/mach-integrator/pci_v3.c for instance takes
the resource returned by of_pci_range_to_resource and programs the
start and size into hardware registers that expect a physical address
as far as I can tell.
> I was trying to fix exactly this issue, that you cannot use the resource
> structure returned by this function in any call that is expecting an IO
> resource.
I looked at the other drivers briefly, and I think you indeed fix the Tegra
driver with this but break the integrator driver as mentioned above.
The other callers of of_pci_range_to_resource() are apparently not
impacted as they recalculate the values they get.
Arnd
On Tue, Jul 01, 2014 at 07:43:27PM +0100, Liviu Dudau wrote:
> This is a useful function and we should make it visible outside the
> generic PCI code. Export it as a GPL symbol.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> ---
> drivers/pci/host-bridge.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> index 0e5f3c9..36c669e 100644
> --- a/drivers/pci/host-bridge.c
> +++ b/drivers/pci/host-bridge.c
> @@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
> return bus;
> }
>
> -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> {
> struct pci_bus *root_bus = find_pci_root_bus(bus);
>
> return to_pci_host_bridge(root_bus->bridge);
> }
> +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
There's nothing in this series that uses find_pci_host_bridge(), so
how about we just wait until we have something that needs it?
Also, if/when we export this, I'd prefer a name that starts with "pci_"
as most of the other interfaces do.
> void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
> void (*release_fn)(struct pci_host_bridge *),
> --
> 2.0.0
>
On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> Some architectures do not have a simple view of the PCI I/O space
> and instead use a range of CPU addresses that map to bus addresses. For
> some architectures these ranges will be expressed by OF bindings
> in a device tree file.
>
> Introduce a pci_register_io_range() helper function with a generic
> implementation that can be used by such architectures to keep track
> of the I/O ranges described by the PCI bindings. If the PCI_IOBASE
> macro is not defined that signals lack of support for PCI and we
> return an error.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> ---
> drivers/of/address.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/of_address.h | 1 +
> 2 files changed, 62 insertions(+)
>
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 5edfcb0..1345733 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -5,6 +5,7 @@
> #include <linux/module.h>
> #include <linux/of_address.h>
> #include <linux/pci_regs.h>
> +#include <linux/slab.h>
> #include <linux/string.h>
>
> /* Max address size we deal with */
> @@ -601,12 +602,72 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
> }
> EXPORT_SYMBOL(of_get_address);
>
> +struct io_range {
> + struct list_head list;
> + phys_addr_t start;
> + resource_size_t size;
> +};
> +
> +static LIST_HEAD(io_range_list);
> +
> +/*
> + * Record the PCI IO range (expressed as CPU physical address + size).
> + * Return a negative value if an error has occured, zero otherwise
> + */
> +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
I don't understand the interface here. What's the mapping from CPU
physical address to bus I/O port? For example, I have the following
machine in mind:
HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
HWP0002:00: host bridge window [io 0x0000-0x0fff]
HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
e.g., "inb(0)" to access it.
Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
"inb(0x1000000)" to access it.
pci_register_io_range() seems sort of like it's intended to track the
memory-mapped IO port spaces, e.g., [mem 0xf8010000000-0xf8010000fff].
But I would think you'd want to keep track of at least the base port
number on the PCI bus, too. Or is that why it's weak?
Here's what these look like in /proc/iomem and /proc/ioports (note that
there are two resource structs for each memory-mapped IO port space: one
IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
driver), and one IORESOURCE_IO for the I/O port space (this becomes the
parent of a region used by a regular device driver)):
/proc/iomem:
PCI Bus 0000:00 I/O Ports 00000000-00000fff
PCI Bus 0001:00 I/O Ports 01000000-01000fff
/proc/ioports:
00000000-00000fff : PCI Bus 0000:00
01000000-01000fff : PCI Bus 0001:00
> +{
> +#ifdef PCI_IOBASE
> + struct io_range *res;
> + resource_size_t allocated_size = 0;
> +
> + /* check if the range hasn't been previously recorded */
> + list_for_each_entry(res, &io_range_list, list) {
> + if (addr >= res->start && addr + size <= res->start + size)
> + return 0;
> + allocated_size += res->size;
> + }
> +
> + /* range not registed yet, check for available space */
> + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> + return -E2BIG;
> +
> + /* add the range to the list */
> + res = kzalloc(sizeof(*res), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + res->start = addr;
> + res->size = size;
> +
> + list_add_tail(&res->list, &io_range_list);
> +
> + return 0;
> +#else
> + return -EINVAL;
> +#endif
> +}
> +
> unsigned long __weak pci_address_to_pio(phys_addr_t address)
> {
> +#ifdef PCI_IOBASE
> + struct io_range *res;
> + resource_size_t offset = 0;
> +
> + list_for_each_entry(res, &io_range_list, list) {
> + if (address >= res->start &&
> + address < res->start + res->size) {
> + return res->start - address + offset;
> + }
> + offset += res->size;
> + }
> +
> + return (unsigned long)-1;
> +#else
> if (address > IO_SPACE_LIMIT)
> return (unsigned long)-1;
>
> return (unsigned long) address;
> +#endif
> }
>
> static int __of_address_to_resource(struct device_node *dev,
> diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> index c13b878..ac4aac4 100644
> --- a/include/linux/of_address.h
> +++ b/include/linux/of_address.h
> @@ -55,6 +55,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
> extern const __be32 *of_get_address(struct device_node *dev, int index,
> u64 *size, unsigned int *flags);
>
> +extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
> extern unsigned long pci_address_to_pio(phys_addr_t addr);
>
> extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
> --
> 2.0.0
>
On Tue, Jul 01, 2014 at 07:43:31PM +0100, Liviu Dudau wrote:
> Make it easier to discover the domain number of a bus by storing
> the number in pci_host_bridge for the root bus. Several architectures
> have their own way of storing this information, so it makes sense
> to try to unify the code. While at this, add a new function that
> creates a root bus in a given domain and make pci_create_root_bus()
> a wrapper around this function.
"While at this" is always a good clue that maybe something should be
split into a separate patch :) This is a very good example, since it
adds a new interface that deserves its own changelog.
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> ---
> drivers/pci/probe.c | 41 +++++++++++++++++++++++++++++++++--------
> include/linux/pci.h | 4 ++++
> 2 files changed, 37 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2c92662..abf5e82 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1748,8 +1748,9 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
> {
> }
>
> -struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> - struct pci_ops *ops, void *sysdata, struct list_head *resources)
> +struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
> + int domain, int bus, struct pci_ops *ops, void *sysdata,
> + struct list_head *resources)
I don't think we should do it this way; this makes it possible to have a
host bridge where "bridge->domain_nr != pci_domain_nr(bridge->bus)".
I wonder if it would help to make a weak pci_domain_nr() function that
returns "bridge->domain_nr". Then each arch could individually drop its
pci_domain_nr() definition as it was converted, e.g., something like this:
- Convert every arch pci_domain_nr() from a #define to a non-inline
function
- Add bridge.domain_nr, initialized from pci_domain_nr()
- Add a weak generic pci_domain_nr() that returns bridge.domain_nr
- Add a way to create a host bridge in a specified domain, so we can
initialize bridge.domain_nr without using pci_domain_nr()
- Convert each arch to use the new creation mechanism and drop its
pci_domain_nr() implementation
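The weak helper in the third step could then be as simple as this sketch (assuming a bridge accessor like find_pci_host_bridge() is available):

int __weak pci_domain_nr(struct pci_bus *bus)
{
	struct pci_host_bridge *bridge = find_pci_host_bridge(bus);

	return bridge->domain_nr;
}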
> {
> int error;
> struct pci_host_bridge *bridge;
> @@ -1762,27 +1763,31 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>
> bridge = pci_alloc_host_bridge();
> if (!bridge)
> - return NULL;
> + return ERR_PTR(-ENOMEM);
>
> bridge->dev.parent = parent;
> bridge->dev.release = pci_release_host_bridge_dev;
> + bridge->domain_nr = domain;
>
> b = pci_alloc_bus();
> - if (!b)
> + if (!b) {
> + error = -ENOMEM;
> goto err_out;
> + }
>
> b->sysdata = sysdata;
> b->ops = ops;
> b->number = b->busn_res.start = bus;
> - b2 = pci_find_bus(pci_domain_nr(b), bus);
> + b2 = pci_find_bus(bridge->domain_nr, bus);
> if (b2) {
> /* If we already got to this bus through a different bridge, ignore it */
> dev_dbg(&b2->dev, "bus already known\n");
> + error = -EEXIST;
> goto err_bus_out;
> }
>
> bridge->bus = b;
> - dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
> + dev_set_name(&bridge->dev, "pci%04x:%02x", bridge->domain_nr, bus);
> error = pcibios_root_bridge_prepare(bridge);
> if (error)
> goto err_out;
> @@ -1801,7 +1806,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>
> b->dev.class = &pcibus_class;
> b->dev.parent = b->bridge;
> - dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
> + dev_set_name(&b->dev, "%04x:%02x", bridge->domain_nr, bus);
> error = device_register(&b->dev);
> if (error)
> goto class_dev_reg_err;
> @@ -1851,7 +1856,27 @@ err_bus_out:
> kfree(b);
> err_out:
> kfree(bridge);
> - return NULL;
> + return ERR_PTR(error);
> +}
> +
> +struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> + struct pci_ops *ops, void *sysdata, struct list_head *resources)
> +{
> + int domain_nr;
> + struct pci_bus *b = pci_alloc_bus();
> + if (!b)
> + return NULL;
> +
> + b->sysdata = sysdata;
> + domain_nr = pci_domain_nr(b);
> + kfree(b);
> +
> + b = pci_create_root_bus_in_domain(parent, domain_nr, bus,
> + ops, sysdata, resources);
> + if (IS_ERR(b))
> + return NULL;
> +
> + return b;
> }
>
> int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 466bcd1..7e7b939 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -401,6 +401,7 @@ struct pci_host_bridge_window {
> struct pci_host_bridge {
> struct device dev;
> struct pci_bus *bus; /* root bus */
> + int domain_nr;
> struct list_head windows; /* pci_host_bridge_windows */
> void (*release_fn)(struct pci_host_bridge *);
> void *release_data;
> @@ -769,6 +770,9 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
> struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> struct pci_ops *ops, void *sysdata,
> struct list_head *resources);
> +struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
> + int domain, int bus, struct pci_ops *ops,
> + void *sysdata, struct list_head *resources);
> int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
> int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
> void pci_bus_release_busn_res(struct pci_bus *b);
> --
> 2.0.0
>
On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> Several platforms use a rather generic version of parsing
> the device tree to find the host bridge ranges. Move the common code
> into the generic PCI code and use it to create a pci_host_bridge
> structure that can be used by arch code.
>
> Based on early attempts by Andrew Murray to unify the code.
> Used powerpc and microblaze PCI code as starting point.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> ---
> drivers/of/of_pci.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
> drivers/pci/host-bridge.c | 18 +++++++
> include/linux/of_pci.h | 10 ++++
> include/linux/pci.h | 8 +++
> 4 files changed, 171 insertions(+)
>
> diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> index 8481996..55d8320 100644
> --- a/drivers/of/of_pci.c
> +++ b/drivers/of/of_pci.c
> @@ -89,6 +89,141 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> }
> EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
>
> +/**
> + * pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
> + * @dev: device node of the host bridge having the range property
> + * @resources: list where the range of resources will be added after DT parsing
> + * @io_base: pointer to a variable that will contain the physical address for
> + * the start of the I/O range.
> + *
> + * It is the callers job to free the @resources list if an error is returned.
> + *
> + * This function will parse the "ranges" property of a PCI host bridge device
> + * node and setup the resource mapping based on its content. It is expected
> + * that the property conforms with the Power ePAPR document.
> + *
> + * Each architecture is then offered the chance of applying their own
> + * filtering of pci_host_bridge_windows based on their own restrictions by
> + * calling pcibios_fixup_bridge_ranges(). The filtered list of windows
> + * can then be used when creating a pci_host_bridge structure.
> + */
> +static int pci_host_bridge_of_get_ranges(struct device_node *dev,
> + struct list_head *resources, resource_size_t *io_base)
> +{
> + struct resource *res;
> + struct of_pci_range range;
> + struct of_pci_range_parser parser;
> + int err;
> +
> + pr_info("PCI host bridge %s ranges:\n", dev->full_name);
> +
> + /* Check for ranges property */
> + err = of_pci_range_parser_init(&parser, dev);
> + if (err)
> + return err;
> +
> + pr_debug("Parsing ranges property...\n");
> + for_each_of_pci_range(&parser, &range) {
> + /* Read next ranges element */
> + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
> + range.pci_space, range.pci_addr, range.cpu_addr, range.size);
If you're not trying to match other printk formats, you could try to match
the %pR format used elsewhere, e.g., "%#010llx-%#010llx" with
range.cpu_addr, range.cpu_addr + range.size - 1.
> +
> + /*
> + * If we failed translation or got a zero-sized region
> + * then skip this range
> + */
> + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
> + continue;
> +
> + res = kzalloc(sizeof(struct resource), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + err = of_pci_range_to_resource(&range, dev, res);
> + if (err)
> + return err;
> +
> + if (resource_type(res) == IORESOURCE_IO)
> + *io_base = range.cpu_addr;
> +
> + pci_add_resource_offset(resources, res,
> + res->start - range.pci_addr);
> + }
> +
> + /* Apply architecture specific fixups for the ranges */
> + return pcibios_fixup_bridge_ranges(resources);
> +}
> +
> +static atomic_t domain_nr = ATOMIC_INIT(-1);
> +
> +/**
> + * of_create_pci_host_bridge - Create a PCI host bridge structure using
> + * information passed in the DT.
> + * @parent: device owning this host bridge
> + * @ops: pci_ops associated with the host controller
> + * @host_data: opaque data structure used by the host controller.
> + *
> + * returns a pointer to the newly created pci_host_bridge structure, or
> + * NULL if the call failed.
> + *
> + * This function will try to obtain the host bridge domain number by
> + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> + * fails, a local allocator will be used that will put each host bridge
> + * in a new domain.
> + */
> +struct pci_host_bridge *
> +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> +{
> + int err, domain, busno;
> + struct resource *bus_range;
> + struct pci_bus *root_bus;
> + struct pci_host_bridge *bridge;
> + resource_size_t io_base;
> + LIST_HEAD(res);
> +
> + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
> + if (!bus_range)
> + return ERR_PTR(-ENOMEM);
> +
> + domain = of_alias_get_id(parent->of_node, "pci-domain");
> + if (domain == -ENODEV)
> + domain = atomic_inc_return(&domain_nr);
> +
> + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> + if (err) {
> + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> + parent->of_node->full_name);
> + bus_range->start = 0;
> + bus_range->end = 255;
> + bus_range->flags = IORESOURCE_BUS;
If you put the dev_info() down here, you can print &bus_range with %pR.
> + }
> + busno = bus_range->start;
> + pci_add_resource(&res, bus_range);
> +
> + /* now parse the rest of host bridge bus ranges */
> + err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base);
> + if (err)
> + goto err_create;
> +
> + /* then create the root bus */
> + root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
> + ops, host_data, &res);
> + if (IS_ERR(root_bus)) {
> + err = PTR_ERR(root_bus);
> + goto err_create;
> + }
> +
> + bridge = to_pci_host_bridge(root_bus->bridge);
> + bridge->io_base = io_base;
> +
> + return bridge;
> +
> +err_create:
> + pci_free_resource_list(&res);
> + return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL_GPL(of_create_pci_host_bridge);
> +
> #ifdef CONFIG_PCI_MSI
>
> static LIST_HEAD(of_pci_msi_chip_list);
> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> index 36c669e..cfee5d1 100644
> --- a/drivers/pci/host-bridge.c
> +++ b/drivers/pci/host-bridge.c
> @@ -5,6 +5,9 @@
> #include <linux/kernel.h>
> #include <linux/pci.h>
> #include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_pci.h>
> +#include <linux/slab.h>
>
> #include "pci.h"
>
> @@ -83,3 +86,18 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
> res->end = region->end + offset;
> }
> EXPORT_SYMBOL(pcibios_bus_to_resource);
> +
> +/**
> + * Simple version of the platform specific code for filtering the list
> + * of resources obtained from the ranges declaration in DT.
> + *
> + * Platforms can override this function in order to impose stronger
> + * constraints onto the list of resources that a host bridge can use.
> + * The filtered list will then be used to create a root bus and associate
> + * it with the host bridge.
> + *
> + */
> +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
> +{
> + return 0;
> +}
I'd wait to add this until there's a platform that needs to implement it.
Splitting it out will make this patch that much smaller and easier to
understand.
> diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> index dde3a4a..71e36d0 100644
> --- a/include/linux/of_pci.h
> +++ b/include/linux/of_pci.h
> @@ -15,6 +15,9 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
> int of_pci_get_devfn(struct device_node *np);
> int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
> int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
> +struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
> + struct pci_ops *ops, void *host_data);
> +
> #else
> static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
> {
> @@ -43,6 +46,13 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> {
> return -EINVAL;
> }
> +
> +static inline struct pci_host_bridge *
> +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
> + void *host_data)
> +{
> + return NULL;
> +}
> #endif
>
> #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 7e7b939..556dc5f 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -402,6 +402,7 @@ struct pci_host_bridge {
> struct device dev;
> struct pci_bus *bus; /* root bus */
> int domain_nr;
> + resource_size_t io_base; /* physical address for the start of I/O area */
I don't see where this is used yet.
As far as I know, there's nothing that prevents a host bridge from having
several I/O port apertures (or several memory-mapped I/O port spaces).
> struct list_head windows; /* pci_host_bridge_windows */
> void (*release_fn)(struct pci_host_bridge *);
> void *release_data;
> @@ -1809,8 +1810,15 @@ static inline void pci_set_of_node(struct pci_dev *dev) { }
> static inline void pci_release_of_node(struct pci_dev *dev) { }
> static inline void pci_set_bus_of_node(struct pci_bus *bus) { }
> static inline void pci_release_bus_of_node(struct pci_bus *bus) { }
> +
> #endif /* CONFIG_OF */
>
> +/* Used by architecture code to apply any quirks to the list of
> + * pci_host_bridge resource ranges before they are being used
> + * by of_create_pci_host_bridge()
> + */
> +extern int pcibios_fixup_bridge_ranges(struct list_head *resources);
> +
> #ifdef CONFIG_EEH
> static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
> {
> --
> 2.0.0
>
On Tuesday 08 July 2014, Bjorn Helgaas wrote:
> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> > +static LIST_HEAD(io_range_list);
> > +
> > +/*
> > + * Record the PCI IO range (expressed as CPU physical address + size).
> > + * Return a negative value if an error has occured, zero otherwise
> > + */
> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
>
> I don't understand the interface here. What's the mapping from CPU
> physical address to bus I/O port? For example, I have the following
> machine in mind:
>
> HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
> HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
> HWP0002:00: host bridge window [io 0x0000-0x0fff]
>
> HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
> HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
> HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
>
> The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
> the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
> e.g., "inb(0)" to access it.
>
> Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
> bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
> "inb(0x1000000)" to access it.
I guess you are thinking of the IA64 model here where you keep the virtual
I/O port numbers in a per-bus lookup table that gets accessed for each
inb() call. I've thought about this some more, and I believe there are good
reasons for sticking with the model used on arm32 and powerpc for the
generic OF implementation.
The idea is that there is a single virtual memory range for all I/O port
mappings and we use the MMU to do the translation rather than computing
it manually in the inb() implementation. The main advantage is that all
functions used in device drivers to (potentially) access I/O ports
become trivial this way, which helps for code size and in some cases
(e.g. SoC-internal registers with a low latency) it may even be performance
relevant.
What this scheme gives you is a set of functions that literally do:
/* architecture specific virtual address */
#define PCI_IOBASE (void __iomem *)0xabcd00000000000

static inline u32 inl(unsigned long port)
{
	return readl(port + PCI_IOBASE);
}

static inline void __iomem *ioport_map(unsigned long port, unsigned int nr)
{
	return port + PCI_IOBASE;
}

static inline unsigned int ioread32(void __iomem *p)
{
	return readl(p);
}
Since we want this to work on 32-bit machines, the virtual I/O space has
to be rather tightly packed, so Liviu's algorithm just picks the next
available address for each new I/O space.
> pci_register_io_range() seems sort of like it's intended to track the
> memory-mapped IO port spaces, e.g., [mem 0xf8010000000-0xf8010000fff].
> But I would think you'd want to keep track of at least the base port
> number on the PCI bus, too. Or is that why it's weak?
The PCI bus start address only gets factored in when the window is registered
with the PCI core in patch 8/9, where we go over all ranges doing
+ pci_add_resource_offset(resources, res,
+ res->start - range.pci_addr);
With Liviu's patch, this can be done in exactly the same way for both
MMIO and PIO spaces.
> Here's what these look like in /proc/iomem and /proc/ioports (note that
> there are two resource structs for each memory-mapped IO port space: one
> IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
> driver), and one IORESOURCE_IO for the I/O port space (this becomes the
> parent of a region used by a regular device driver):
>
> /proc/iomem:
> PCI Bus 0000:00 I/O Ports 00000000-00000fff
> PCI Bus 0001:00 I/O Ports 01000000-01000fff
>
> /proc/ioports:
> 00000000-00000fff : PCI Bus 0000:00
> 01000000-01000fff : PCI Bus 0001:00
The only difference I'd expect here is that the last line would be packed
more tightly, so instead it's
/proc/ioports:
00000000-00000fff : PCI Bus 0000:00
00001000-00001fff : PCI Bus 0001:00
In practice we'd probably have 64KB per host controller, and each of them
would be a separate domain. I think we normally don't register the
IORESOURCE_MEM resource, but I agree it's a good idea and we should
always do that.
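Roughly, a host bridge driver that always registers both resources could do
something like this (only a sketch; all the names below are made up for
illustration, nothing here is taken from an existing driver):

#include <linux/device.h>
#include <linux/ioport.h>

/* sketch: register both the memory-mapped window and the I/O port space */
static int sketch_register_io_window(struct device *dev, phys_addr_t cpu_addr,
				     resource_size_t size, unsigned long pio_start)
{
	struct resource *mmio = devm_kzalloc(dev, sizeof(*mmio), GFP_KERNEL);
	struct resource *pio = devm_kzalloc(dev, sizeof(*pio), GFP_KERNEL);

	if (!mmio || !pio)
		return -ENOMEM;

	/* the memory-mapped area, claimed only by the host bridge driver */
	mmio->name = "PCI Bus I/O ports (memory-mapped)";
	mmio->flags = IORESOURCE_MEM;
	mmio->start = cpu_addr;
	mmio->end = cpu_addr + size - 1;

	/* the I/O port space that device drivers request regions from */
	pio->name = "PCI Bus I/O ports";
	pio->flags = IORESOURCE_IO;
	pio->start = pio_start;
	pio->end = pio_start + size - 1;

	if (request_resource(&iomem_resource, mmio))
		return -EBUSY;
	if (request_resource(&ioport_resource, pio)) {
		release_resource(mmio);
		return -EBUSY;
	}

	return 0;
}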
Arnd
On Mon, Jul 07, 2014 at 10:22:00PM +0100, Arnd Bergmann wrote:
> On Monday 07 July 2014, Liviu Dudau wrote:
> > On Sat, Jul 05, 2014 at 09:46:09PM +0100, Arnd Bergmann wrote:
> > > On Saturday 05 July 2014 14:25:52 Rob Herring wrote:
> > > > On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > > > > The ranges property for a host bridge controller in DT describes
> > > > > the mapping between the PCI bus address and the CPU physical address.
> > > > > The resources framework however expects that the IO resources start
> > > > > at a pseudo "port" address 0 (zero) and have a maximum size of IO_SPACE_LIMIT.
> > > > > The conversion from pci ranges to resources failed to take that into account.
> > > >
> > > > I don't think this change is right. There are 2 resources: the PCI bus
> > > > addresses and cpu addresses. This function deals with the cpu
> > > > addresses. Returning pci addresses for i/o and cpu addresses for
> > > > memory is going to be error prone. We probably need both cpu and pci
> > > > resources exposed to host controllers.
> > > >
> > > > Making the new function only deal with i/o bus resources and naming it
> > > > of_pci_range_to_io_resource would be better.
> > >
> > > I think you are correct that this change by itself is will break existing
> > > drivers that rely on the current behavior of of_pci_range_to_resource,
> > > but there is also something wrong with the existing implementation:
> >
> > Either I'm very confused or I've managed to confuse everyone else. The I/O
> > resources described using CPU addresses *are* using "pseudo" port based
> > addresses (or at least that is my understanding and my reading of the code).
> > Can you point me to a function that is expecting the IO resource to have
> > the start address at the physical address of the mapped space?
>
> pci_v3_preinit() in arch/arm/mach-integrator/pci_v3.c for instance takes
> the resource returned by of_pci_range_to_resource and programs the
> start and size into hardware registers that expect a physical address
> as far as I can tell.
>
> > I was trying to fix exactly this issue, that you cannot use the resource
> > structure returned by this function in any call that is expecting an IO
> > resource.
>
> I looked at the other drivers briefly, and I think you indeed fix the Tegra
> driver with this but break the integrator driver as mentioned above.
> The other callers of of_pci_range_to_resource() are apparently not
> impacted as they recalculate the values they get.
I would argue that the integrator version is relying on broken assumptions. If it
tried to allocate that IO range or request the resource as currently returned by
of_pci_range_to_resource() (without my patch), it would fail. I know because I did
the same thing in my host bridge driver and it failed miserably. That's why I
tried to patch it.
I will lay out my argument here and people can tell me if I am wrong:
PCI IO resources (even if they are memory mapped on certain architectures) need
to emulate the x86 world "port" concept. Why do I think this? Because of this
structure at the beginning of kernel/resource.c:
struct resource ioport_resource = {
	.name	= "PCI IO",
	.start	= 0,
	.end	= IO_SPACE_LIMIT,
	.flags	= IORESOURCE_IO,
};
EXPORT_SYMBOL(ioport_resource);
The other resource that people seem to confuse it with is the next one in that
file:
struct resource iomem_resource = {
	.name	= "PCI mem",
	.start	= 0,
	.end	= -1,
	.flags	= IORESOURCE_MEM,
};
EXPORT_SYMBOL(iomem_resource);
Now, there are architectures that override the .start and .end values, but arm
is not one of them, and mach-integrator doesn't change them either. So one can
play with the ioport_resource values to move the "port" window wherever one
wants, but that doesn't change the "port access" way of addressing it.
If the IO space is memory mapped, then we use the port number, the io_offset
and PCI_IOBASE to get to the virtual address that, when accessed, will
generate the correct addresses on the bus, based on how the host bridge has
been configured.
This is the current level of my understanding of PCI IO.
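To put that into code, the path I have in mind looks roughly like this (sketch
only; "io_offset" stands for the per-bridge offset handed out when the range
was registered, and PCI_IOBASE is the architecture's virtual window as in
Arnd's snippet above):

/* sketch: turn a bus-local port number into a CPU virtual address */
static void __iomem *sketch_pio_to_virt(unsigned long bus_port,
					unsigned long io_offset)
{
	/* this is the port number that device drivers pass to inb()/outb() */
	unsigned long linux_port = bus_port + io_offset;

	/* PCI_IOBASE is the single virtual range the MMU maps onto the
	 * bridges' physical I/O apertures */
	return PCI_IOBASE + linux_port;
}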
Now, I believe Rob has switched entirely to using my series in some tests that
he has run and he hasn't encountered any issues, as long as one remembers, in
the host bridge driver, to add the io_base offset to the resource's .start. If
not, then I need to patch pci_v3.c.
Best regards,
Liviu
>
> Arnd
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 08, 2014 at 02:01:04AM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > Several platforms use a rather generic version of parsing
> > the device tree to find the host bridge ranges. Move the common code
> > into the generic PCI code and use it to create a pci_host_bridge
> > structure that can be used by arch code.
> >
> > Based on early attempts by Andrew Murray to unify the code.
> > Used powerpc and microblaze PCI code as starting point.
> >
> > Signed-off-by: Liviu Dudau <[email protected]>
> > Tested-by: Tanmay Inamdar <[email protected]>
> > ---
> > drivers/of/of_pci.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
> > drivers/pci/host-bridge.c | 18 +++++++
> > include/linux/of_pci.h | 10 ++++
> > include/linux/pci.h | 8 +++
> > 4 files changed, 171 insertions(+)
> >
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 8481996..55d8320 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
> > @@ -89,6 +89,141 @@ int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> > }
> > EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
> >
> > +/**
> > + * pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
> > + * @dev: device node of the host bridge having the range property
> > + * @resources: list where the range of resources will be added after DT parsing
> > + * @io_base: pointer to a variable that will contain the physical address for
> > + * the start of the I/O range.
> > + *
> > + * It is the callers job to free the @resources list if an error is returned.
> > + *
> > + * This function will parse the "ranges" property of a PCI host bridge device
> > + * node and setup the resource mapping based on its content. It is expected
> > + * that the property conforms with the Power ePAPR document.
> > + *
> > + * Each architecture is then offered the chance of applying their own
> > + * filtering of pci_host_bridge_windows based on their own restrictions by
> > + * calling pcibios_fixup_bridge_ranges(). The filtered list of windows
> > + * can then be used when creating a pci_host_bridge structure.
> > + */
> > +static int pci_host_bridge_of_get_ranges(struct device_node *dev,
> > + struct list_head *resources, resource_size_t *io_base)
> > +{
> > + struct resource *res;
> > + struct of_pci_range range;
> > + struct of_pci_range_parser parser;
> > + int err;
> > +
> > + pr_info("PCI host bridge %s ranges:\n", dev->full_name);
> > +
> > + /* Check for ranges property */
> > + err = of_pci_range_parser_init(&parser, dev);
> > + if (err)
> > + return err;
> > +
> > + pr_debug("Parsing ranges property...\n");
> > + for_each_of_pci_range(&parser, &range) {
> > + /* Read next ranges element */
> > + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
> > + range.pci_space, range.pci_addr, range.cpu_addr, range.size);
>
> If you're not trying to match other printk formats, you could try to match
> the %pR format used elsewhere, e.g., "%#010llx-%#010llx" with
> range.cpu_addr, range.cpu_addr + range.size - 1.
Yes, I'm not a big fan of the ugly output it generates, but the output closely matches the
ranges definition in the device tree file, so it is easy to validate that you are parsing the
right entry. I am happy to change it to shorten the CPU range message.
>
> > +
> > + /*
> > + * If we failed translation or got a zero-sized region
> > + * then skip this range
> > + */
> > + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
> > + continue;
> > +
> > + res = kzalloc(sizeof(struct resource), GFP_KERNEL);
> > + if (!res)
> > + return -ENOMEM;
> > +
> > + err = of_pci_range_to_resource(&range, dev, res);
> > + if (err)
> > + return err;
> > +
> > + if (resource_type(res) == IORESOURCE_IO)
> > + *io_base = range.cpu_addr;
> > +
> > + pci_add_resource_offset(resources, res,
> > + res->start - range.pci_addr);
> > + }
> > +
> > + /* Apply architecture specific fixups for the ranges */
> > + return pcibios_fixup_bridge_ranges(resources);
> > +}
> > +
> > +static atomic_t domain_nr = ATOMIC_INIT(-1);
> > +
> > +/**
> > + * of_create_pci_host_bridge - Create a PCI host bridge structure using
> > + * information passed in the DT.
> > + * @parent: device owning this host bridge
> > + * @ops: pci_ops associated with the host controller
> > + * @host_data: opaque data structure used by the host controller.
> > + *
> > + * returns a pointer to the newly created pci_host_bridge structure, or
> > + * NULL if the call failed.
> > + *
> > + * This function will try to obtain the host bridge domain number by
> > + * using of_alias_get_id() call with "pci-domain" as a stem. If that
> > + * fails, a local allocator will be used that will put each host bridge
> > + * in a new domain.
> > + */
> > +struct pci_host_bridge *
> > +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> > +{
> > + int err, domain, busno;
> > + struct resource *bus_range;
> > + struct pci_bus *root_bus;
> > + struct pci_host_bridge *bridge;
> > + resource_size_t io_base;
> > + LIST_HEAD(res);
> > +
> > + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
> > + if (!bus_range)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + domain = of_alias_get_id(parent->of_node, "pci-domain");
> > + if (domain == -ENODEV)
> > + domain = atomic_inc_return(&domain_nr);
> > +
> > + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> > + if (err) {
> > + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> > + parent->of_node->full_name);
> > + bus_range->start = 0;
> > + bus_range->end = 255;
> > + bus_range->flags = IORESOURCE_BUS;
>
> If you put the dev_info() down here, you can print &bus_range with %pR.
Sure, will do.
>
> > + }
> > + busno = bus_range->start;
> > + pci_add_resource(&res, bus_range);
> > +
> > + /* now parse the rest of host bridge bus ranges */
> > + err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base);
> > + if (err)
> > + goto err_create;
> > +
> > + /* then create the root bus */
> > + root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
> > + ops, host_data, &res);
> > + if (IS_ERR(root_bus)) {
> > + err = PTR_ERR(root_bus);
> > + goto err_create;
> > + }
> > +
> > + bridge = to_pci_host_bridge(root_bus->bridge);
> > + bridge->io_base = io_base;
> > +
> > + return bridge;
> > +
> > +err_create:
> > + pci_free_resource_list(&res);
> > + return ERR_PTR(err);
> > +}
> > +EXPORT_SYMBOL_GPL(of_create_pci_host_bridge);
> > +
> > #ifdef CONFIG_PCI_MSI
> >
> > static LIST_HEAD(of_pci_msi_chip_list);
> > diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> > index 36c669e..cfee5d1 100644
> > --- a/drivers/pci/host-bridge.c
> > +++ b/drivers/pci/host-bridge.c
> > @@ -5,6 +5,9 @@
> > #include <linux/kernel.h>
> > #include <linux/pci.h>
> > #include <linux/module.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_pci.h>
> > +#include <linux/slab.h>
> >
> > #include "pci.h"
> >
> > @@ -83,3 +86,18 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
> > res->end = region->end + offset;
> > }
> > EXPORT_SYMBOL(pcibios_bus_to_resource);
> > +
> > +/**
> > + * Simple version of the platform specific code for filtering the list
> > + * of resources obtained from the ranges declaration in DT.
> > + *
> > + * Platforms can override this function in order to impose stronger
> > + * constraints onto the list of resources that a host bridge can use.
> > + * The filtered list will then be used to create a root bus and associate
> > + * it with the host bridge.
> > + *
> > + */
> > +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
> > +{
> > + return 0;
> > +}
>
> I'd wait to add this until there's a platform that needs to implement it.
> Splitting it out will make this patch that much smaller and easier to
> understand.
I need this as this is the default implementation (i.e. do nothing). Otherwise the
link phase will give errors.
>
> > diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> > index dde3a4a..71e36d0 100644
> > --- a/include/linux/of_pci.h
> > +++ b/include/linux/of_pci.h
> > @@ -15,6 +15,9 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
> > int of_pci_get_devfn(struct device_node *np);
> > int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
> > int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
> > +struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
> > + struct pci_ops *ops, void *host_data);
> > +
> > #else
> > static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
> > {
> > @@ -43,6 +46,13 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> > {
> > return -EINVAL;
> > }
> > +
> > +static inline struct pci_host_bridge *
> > +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
> > + void *host_data)
> > +{
> > + return NULL;
> > +}
> > #endif
> >
> > #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 7e7b939..556dc5f 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -402,6 +402,7 @@ struct pci_host_bridge {
> > struct device dev;
> > struct pci_bus *bus; /* root bus */
> > int domain_nr;
> > + resource_size_t io_base; /* physical address for the start of I/O area */
>
> I don't see where this is used yet.
It's used in pci_host_bridge_of_get_ranges() (earlier in this patch).
>
> As far as I know, there's nothing that prevents a host bridge from having
> several I/O port apertures (or several memory-mapped I/O port spaces).
pci_register_io_range() will give a different offset for each aperture.
I just need to make sure I don't overwrite io_base when parsing multiple IO
ranges.
Thanks for reviewing this,
Liviu
>
> > struct list_head windows; /* pci_host_bridge_windows */
> > void (*release_fn)(struct pci_host_bridge *);
> > void *release_data;
> > @@ -1809,8 +1810,15 @@ static inline void pci_set_of_node(struct pci_dev *dev) { }
> > static inline void pci_release_of_node(struct pci_dev *dev) { }
> > static inline void pci_set_bus_of_node(struct pci_bus *bus) { }
> > static inline void pci_release_bus_of_node(struct pci_bus *bus) { }
> > +
> > #endif /* CONFIG_OF */
> >
> > +/* Used by architecture code to apply any quirks to the list of
> > + * pci_host_bridge resource ranges before they are being used
> > + * by of_create_pci_host_bridge()
> > + */
> > +extern int pcibios_fixup_bridge_ranges(struct list_head *resources);
> > +
> > #ifdef CONFIG_EEH
> > static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
> > {
> > --
> > 2.0.0
> >
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 08, 2014 at 01:14:18AM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> > Some architectures do not have a simple view of the PCI I/O space
> > and instead use a range of CPU addresses that map to bus addresses. For
> > some architectures these ranges will be expressed by OF bindings
> > in a device tree file.
> >
> > Introduce a pci_register_io_range() helper function with a generic
> > implementation that can be used by such architectures to keep track
> > of the I/O ranges described by the PCI bindings. If the PCI_IOBASE
> > macro is not defined that signals lack of support for PCI and we
> > return an error.
> >
> > Signed-off-by: Liviu Dudau <[email protected]>
> > ---
> > drivers/of/address.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/of_address.h | 1 +
> > 2 files changed, 62 insertions(+)
> >
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 5edfcb0..1345733 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -5,6 +5,7 @@
> > #include <linux/module.h>
> > #include <linux/of_address.h>
> > #include <linux/pci_regs.h>
> > +#include <linux/slab.h>
> > #include <linux/string.h>
> >
> > /* Max address size we deal with */
> > @@ -601,12 +602,72 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
> > }
> > EXPORT_SYMBOL(of_get_address);
> >
> > +struct io_range {
> > + struct list_head list;
> > + phys_addr_t start;
> > + resource_size_t size;
> > +};
> > +
> > +static LIST_HEAD(io_range_list);
> > +
> > +/*
> > + * Record the PCI IO range (expressed as CPU physical address + size).
> > + * Return a negative value if an error has occured, zero otherwise
> > + */
> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
>
> I don't understand the interface here. What's the mapping from CPU
> physical address to bus I/O port? For example, I have the following
> machine in mind:
>
> HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
> HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
> HWP0002:00: host bridge window [io 0x0000-0x0fff]
>
> HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
> HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
> HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
>
> The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
> the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
> e.g., "inb(0)" to access it.
>
> Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
> bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
> "inb(0x1000000)" to access it.
>
> pci_register_io_range() seems sort of like it's intended to track the
> memory-mapped IO port spaces, e.g., [mem 0xf8010000000-0xf8010000fff].
> But I would think you'd want to keep track of at least the base port
> number on the PCI bus, too. Or is that why it's weak?
It's weak in case the default implementation doesn't fit someone's requirements.
And yes, it is trying to track the memory-mapped IO port spaces. When
calling pci_address_to_pio() - which takes the CPU address - it will
return the port number (0x0000 - 0x0fff and 0x1000000 - 0x1000fff respectively).
pci_address_to_pio() uses the list built by calling pci_register_io_range()
to calculate the correct offsets (although in this case it would move your
second host bridge io ports to [io 0x1000 - 0x1fff] as it tries not to leave
gaps in the reservations).
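As a worked example with the two bridges from your machine (sketch only; it
assumes the registration list starts out empty, both calls succeed, and the
addresses are the ones from your mail):

#include <linux/of_address.h>

/* sketch: the two memory-mapped I/O apertures from your example */
static void sketch_two_bridges(void)
{
	pci_register_io_range(0xf8010000000ULL, 0x1000);
	pci_register_io_range(0xf8110000000ULL, 0x1000);

	/* the first registered range starts at Linux I/O port 0 */
	WARN_ON(pci_address_to_pio(0xf8010000000ULL) != 0x0000);

	/* the second range is packed right after the first, not at 0x1000000 */
	WARN_ON(pci_address_to_pio(0xf8110000000ULL) != 0x1000);
}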
>
> Here's what these look like in /proc/iomem and /proc/ioports (note that
> there are two resource structs for each memory-mapped IO port space: one
> IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
> driver), and one IORESOURCE_IO for the I/O port space (this becomes the
> parent of a region used by a regular device driver):
>
> /proc/iomem:
> PCI Bus 0000:00 I/O Ports 00000000-00000fff
> PCI Bus 0001:00 I/O Ports 01000000-01000fff
>
> /proc/ioports:
> 00000000-00000fff : PCI Bus 0000:00
> 01000000-01000fff : PCI Bus 0001:00
OK, I have a question whose answer might be obvious to you, but I have missed it
so far: how does the IORESOURCE_MEM area get created? Is it the host bridge
driver's job to do it? Is it something that the framework should do when it
notices that the IORESOURCE_IO is memory mapped?
Many thanks,
Liviu
>
> > +{
> > +#ifdef PCI_IOBASE
> > + struct io_range *res;
> > + resource_size_t allocated_size = 0;
> > +
> > + /* check if the range hasn't been previously recorded */
> > + list_for_each_entry(res, &io_range_list, list) {
> > + if (addr >= res->start && addr + size <= res->start + size)
> > + return 0;
> > + allocated_size += res->size;
> > + }
> > +
> > + /* range not registed yet, check for available space */
> > + if (allocated_size + size - 1 > IO_SPACE_LIMIT)
> > + return -E2BIG;
> > +
> > + /* add the range to the list */
> > + res = kzalloc(sizeof(*res), GFP_KERNEL);
> > + if (!res)
> > + return -ENOMEM;
> > +
> > + res->start = addr;
> > + res->size = size;
> > +
> > + list_add_tail(&res->list, &io_range_list);
> > +
> > + return 0;
> > +#else
> > + return -EINVAL;
> > +#endif
> > +}
> > +
> > unsigned long __weak pci_address_to_pio(phys_addr_t address)
> > {
> > +#ifdef PCI_IOBASE
> > + struct io_range *res;
> > + resource_size_t offset = 0;
> > +
> > + list_for_each_entry(res, &io_range_list, list) {
> > + if (address >= res->start &&
> > + address < res->start + res->size) {
> > + return res->start - address + offset;
> > + }
> > + offset += res->size;
> > + }
> > +
> > + return (unsigned long)-1;
> > +#else
> > if (address > IO_SPACE_LIMIT)
> > return (unsigned long)-1;
> >
> > return (unsigned long) address;
> > +#endif
> > }
> >
> > static int __of_address_to_resource(struct device_node *dev,
> > diff --git a/include/linux/of_address.h b/include/linux/of_address.h
> > index c13b878..ac4aac4 100644
> > --- a/include/linux/of_address.h
> > +++ b/include/linux/of_address.h
> > @@ -55,6 +55,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
> > extern const __be32 *of_get_address(struct device_node *dev, int index,
> > u64 *size, unsigned int *flags);
> >
> > +extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
> > extern unsigned long pci_address_to_pio(phys_addr_t addr);
> >
> > extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
> > --
> > 2.0.0
> >
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 08, 2014 at 12:27:49AM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 01, 2014 at 07:43:27PM +0100, Liviu Dudau wrote:
> > This is a useful function and we should make it visible outside the
> > generic PCI code. Export it as a GPL symbol.
> >
> > Signed-off-by: Liviu Dudau <[email protected]>
> > Tested-by: Tanmay Inamdar <[email protected]>
> > ---
> > drivers/pci/host-bridge.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> > index 0e5f3c9..36c669e 100644
> > --- a/drivers/pci/host-bridge.c
> > +++ b/drivers/pci/host-bridge.c
> > @@ -16,12 +16,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus)
> > return bus;
> > }
> >
> > -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> > +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus)
> > {
> > struct pci_bus *root_bus = find_pci_root_bus(bus);
> >
> > return to_pci_host_bridge(root_bus->bridge);
> > }
> > +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
>
> There's nothing in this series that uses find_pci_host_bridge(), so
> how about we just wait until we have something that needs it?
>
> Also, if/when we export this, I'd prefer a name that starts with "pci_"
> as most of the other interfaces do.
Understood.
I was using the function in the other patch series that adds PCIe support to arm64
in order to provide a pci_domain_nr() implementation. But I might not need it
in the end if I go ahead with separating pci_host_bridge creation from bus
creation.
Best regards,
Liviu
>
> > void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
> > void (*release_fn)(struct pci_host_bridge *),
> > --
> > 2.0.0
> >
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 08, 2014 at 01:59:54AM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 01, 2014 at 07:43:31PM +0100, Liviu Dudau wrote:
> > Make it easier to discover the domain number of a bus by storing
> > the number in pci_host_bridge for the root bus. Several architectures
> > have their own way of storing this information, so it makes sense
> > to try to unify the code. While at this, add a new function that
> > creates a root bus in a given domain and make pci_create_root_bus()
> > a wrapper around this function.
>
> "While at this" is always a good clue that maybe something should be
> split into a separate patch :) This is a very good example, since it
> adds a new interface that deserves its own changelog.
Yes, I'm coming to the same conclusion. :)
>
> > Signed-off-by: Liviu Dudau <[email protected]>
> > Tested-by: Tanmay Inamdar <[email protected]>
> > ---
> > drivers/pci/probe.c | 41 +++++++++++++++++++++++++++++++++--------
> > include/linux/pci.h | 4 ++++
> > 2 files changed, 37 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 2c92662..abf5e82 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1748,8 +1748,9 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
> > {
> > }
> >
> > -struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> > - struct pci_ops *ops, void *sysdata, struct list_head *resources)
> > +struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
> > + int domain, int bus, struct pci_ops *ops, void *sysdata,
> > + struct list_head *resources)
>
> I don't think we should do it this way; this makes it possible to have a
> host bridge where "bridge->domain_nr != pci_domain_nr(bridge->bus)".
>
> I wonder if it would help to make a weak pci_domain_nr() function that
> returns "bridge->domain_nr". Then each arch could individually drop its
> pci_domain_nr() definition as it was converted, e.g., something like this:
>
> - Convert every arch pci_domain_nr() from a #define to a non-inline
> function
> - Add bridge.domain_nr, initialized from pci_domain_nr()
> - Add a weak generic pci_domain_nr() that returns bridge.domain_nr
> - Add a way to create a host bridge in a specified domain, so we can
> initialize bridge.domain_nr without using pci_domain_nr()
> - Convert each arch to use the new creation mechanism and drop its
> pci_domain_nr() implementation
I will try to propose a patch implementing this.
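Something along these lines for the weak generic helper, I guess (just a
sketch, and it assumes find_pci_host_bridge() ends up declared somewhere the
core code can see it):

/* sketch: generic default that architectures would stop overriding */
int __weak pci_domain_nr(struct pci_bus *bus)
{
	struct pci_host_bridge *bridge = find_pci_host_bridge(bus);

	return bridge->domain_nr;
}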
Best regards,
Liviu
>
> > {
> > int error;
> > struct pci_host_bridge *bridge;
> > @@ -1762,27 +1763,31 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> >
> > bridge = pci_alloc_host_bridge();
> > if (!bridge)
> > - return NULL;
> > + return ERR_PTR(-ENOMEM);
> >
> > bridge->dev.parent = parent;
> > bridge->dev.release = pci_release_host_bridge_dev;
> > + bridge->domain_nr = domain;
> >
> > b = pci_alloc_bus();
> > - if (!b)
> > + if (!b) {
> > + error = -ENOMEM;
> > goto err_out;
> > + }
> >
> > b->sysdata = sysdata;
> > b->ops = ops;
> > b->number = b->busn_res.start = bus;
> > - b2 = pci_find_bus(pci_domain_nr(b), bus);
> > + b2 = pci_find_bus(bridge->domain_nr, bus);
> > if (b2) {
> > /* If we already got to this bus through a different bridge, ignore it */
> > dev_dbg(&b2->dev, "bus already known\n");
> > + error = -EEXIST;
> > goto err_bus_out;
> > }
> >
> > bridge->bus = b;
> > - dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
> > + dev_set_name(&bridge->dev, "pci%04x:%02x", bridge->domain_nr, bus);
> > error = pcibios_root_bridge_prepare(bridge);
> > if (error)
> > goto err_out;
> > @@ -1801,7 +1806,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> >
> > b->dev.class = &pcibus_class;
> > b->dev.parent = b->bridge;
> > - dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
> > + dev_set_name(&b->dev, "%04x:%02x", bridge->domain_nr, bus);
> > error = device_register(&b->dev);
> > if (error)
> > goto class_dev_reg_err;
> > @@ -1851,7 +1856,27 @@ err_bus_out:
> > kfree(b);
> > err_out:
> > kfree(bridge);
> > - return NULL;
> > + return ERR_PTR(error);
> > +}
> > +
> > +struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> > + struct pci_ops *ops, void *sysdata, struct list_head *resources)
> > +{
> > + int domain_nr;
> > + struct pci_bus *b = pci_alloc_bus();
> > + if (!b)
> > + return NULL;
> > +
> > + b->sysdata = sysdata;
> > + domain_nr = pci_domain_nr(b);
> > + kfree(b);
> > +
> > + b = pci_create_root_bus_in_domain(parent, domain_nr, bus,
> > + ops, sysdata, resources);
> > + if (IS_ERR(b))
> > + return NULL;
> > +
> > + return b;
> > }
> >
> > int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 466bcd1..7e7b939 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -401,6 +401,7 @@ struct pci_host_bridge_window {
> > struct pci_host_bridge {
> > struct device dev;
> > struct pci_bus *bus; /* root bus */
> > + int domain_nr;
> > struct list_head windows; /* pci_host_bridge_windows */
> > void (*release_fn)(struct pci_host_bridge *);
> > void *release_data;
> > @@ -769,6 +770,9 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
> > struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> > struct pci_ops *ops, void *sysdata,
> > struct list_head *resources);
> > +struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
> > + int domain, int bus, struct pci_ops *ops,
> > + void *sysdata, struct list_head *resources);
> > int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
> > int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
> > void pci_bus_release_busn_res(struct pci_bus *b);
> > --
> > 2.0.0
> >
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Sun, Jul 06, 2014 at 04:23:43PM +0100, Rob Herring wrote:
> On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > This is my resurected attempt at adding support for generic PCI host
> > bridge controllers that make use of device tree information to
> > configure themselves. I've tagged it as v8 although the patches
> > have now been reshuffled in order to ease adoption so referring to
> > the older versions might be a bit of a hoop jumping exercise.
> >
> > Changes from v7:
> > - Reordered the patches so that fixes and non-controversial patches
> > from v7 can be accepted more easily. If agreed I can split the
> > series again into patches that can be upstreamed easily and ones
> > that still need discussion.
> > - Moved the of_create_host_bridge() function to drivers/of/of_pci.c
> > to better reflect its target use.
> > - Added the function to remap the bus I/O resources that used to be
> > provided in my arm64 patch series and (re)named it pci_remap_iospace()
> > - Removed error code checking from parsing and mapping of IRQ from DT
> > in recognition that some PCI devices will not have legacy IRQ mappings.
> >
> > v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
>
> Can you publish a branch for this series please.
>
> Rob
>
Hi Rob,
I have pushed a branch that matches my v8 patchset plus one obvious missing header include
here: http://www.linux-arm.org/git?p=linux-ld.git;a=shortlog;h=refs/heads/for-upstream/pci_v8
Best regards,
Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 8, 2014 at 4:46 AM, Liviu Dudau <[email protected]> wrote:
> On Tue, Jul 08, 2014 at 01:59:54AM +0100, Bjorn Helgaas wrote:
>> I wonder if it would help to make a weak pci_domain_nr() function that
>> returns "bridge->domain_nr". Then each arch could individually drop its
>> pci_domain_nr() definition as it was converted, e.g., something like this:
>>
>> - Convert every arch pci_domain_nr() from a #define to a non-inline
>> function
>> - Add bridge.domain_nr, initialized from pci_domain_nr()
>> - Add a weak generic pci_domain_nr() that returns bridge.domain_nr
>> - Add a way to create a host bridge in a specified domain, so we can
>> initialize bridge.domain_nr without using pci_domain_nr()
>> - Convert each arch to use the new creation mechanism and drop its
>> pci_domain_nr() implementation
>
> I will try to propose a patch implementing this.
I think this is more of an extra credit, cleanup sort of thing. I
don't think it advances your primary goal of (I think) getting arm64
PCI support in. So my advice is to not worry about unifying domain
handling until later.
Bjorn
On Tue, Jul 8, 2014 at 1:00 AM, Arnd Bergmann <[email protected]> wrote:
> On Tuesday 08 July 2014, Bjorn Helgaas wrote:
>> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
>> > +static LIST_HEAD(io_range_list);
>> > +
>> > +/*
>> > + * Record the PCI IO range (expressed as CPU physical address + size).
>> > + * Return a negative value if an error has occured, zero otherwise
>> > + */
>> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
>>
>> I don't understand the interface here. What's the mapping from CPU
>> physical address to bus I/O port? For example, I have the following
>> machine in mind:
>>
>> HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
>> HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
>> HWP0002:00: host bridge window [io 0x0000-0x0fff]
>>
>> HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
>> HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
>> HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
>>
>> The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
>> the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
>> e.g., "inb(0)" to access it.
>>
>> Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
>> bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
>> "inb(0x1000000)" to access it.
>
> I guess you are thinking of the IA64 model here where you keep the virtual
> I/O port numbers in a per-bus lookup table that gets accessed for each
> inb() call. I've thought about this some more, and I believe there are good
> reasons for sticking with the model used on arm32 and powerpc for the
> generic OF implementation.
>
> The idea is that there is a single virtual memory range for all I/O port
> mappings and we use the MMU to do the translation rather than computing
> it manually in the inb() implemnetation. The main advantage is that all
> functions used in device drivers to (potentially) access I/O ports
> become trivial this way, which helps for code size and in some cases
> (e.g. SoC-internal registers with a low latency) it may even be performance
> relevant.
My example is from ia64, but I'm not advocating for the lookup table.
The point is that the hardware works similarly (at least for dense ia64
I/O port spaces) in terms of mapping CPU physical addresses to PCI I/O
space.
I think my confusion is because your pci_register_io_range() and
pci_address_to_pio() implementations assume that every io_range starts at
I/O port 0 on PCI (correct me if I'm wrong). I suspect that's why you
don't save the I/O port number in struct io_range.
Maybe that assumption is guaranteed by OF, but it doesn't hold for ACPI;
ACPI can describe several I/O port apertures for a single bridge, each
associated with a different CPU physical memory region.
If my speculation here is correct, a comment to the effect that each
io_range corresponds to a PCI I/O space range that starts at 0 might be
enough.
If you did add a PCI I/O port number argument to pci_register_io_range(),
we might be able to make an ACPI-based implementation of it. But I guess
that could be done if/when anybody ever wants to do that.
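For illustration only, the interface I'm imagining would carry the bus-side
port number as well, something like this (purely hypothetical, not something
the series defines):

#include <linux/list.h>
#include <linux/types.h>

struct io_range {
	struct list_head list;
	phys_addr_t start;		/* CPU physical address of the window */
	resource_size_t size;
	unsigned long pci_pio_start;	/* first I/O port on the PCI bus side */
};

int pci_register_io_range(phys_addr_t addr, resource_size_t size,
			  unsigned long pci_pio_start);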
>> Here's what these look like in /proc/iomem and /proc/ioports (note that
>> there are two resource structs for each memory-mapped IO port space: one
>> IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
>> driver), and one IORESOURCE_IO for the I/O port space (this becomes the
>> parent of a region used by a regular device driver):
>>
>> /proc/iomem:
>> PCI Bus 0000:00 I/O Ports 00000000-00000fff
>> PCI Bus 0001:00 I/O Ports 01000000-01000fff
Oops, I forgot the actual physical memory addresses here, but you got
the idea anyway. It should have been something like this:
/proc/iomem:
f8010000000-f8010000fff PCI Bus 0000:00 I/O Ports 00000000-00000fff
f8110000000-f8110000fff PCI Bus 0001:00 I/O Ports 01000000-01000fff
Bjorn
On Tue, Jul 08, 2014 at 11:29:40AM +0100, Liviu Dudau wrote:
> On Tue, Jul 08, 2014 at 02:01:04AM +0100, Bjorn Helgaas wrote:
> > On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > > ...
> > > + for_each_of_pci_range(&parser, &range) {
> > > + /* Read next ranges element */
> > > + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
> > > + range.pci_space, range.pci_addr, range.cpu_addr, range.size);
> >
> > If you're not trying to match other printk formats, you could try to match
> > the %pR format used elsewhere, e.g., "%#010llx-%#010llx" with
> > range.cpu_addr, range.cpu_addr + range.size - 1.
>
> Yes, not a big fan of the ugly output it generates, but the output matches closely the ranges
> definition in the device tree file so it is easy to validate that you are parsing the right
> entry. I am happy to change it to shorten the cpu range message.
If it already matches other device tree stuff, that's perfect. I'm not
familiar with that.
> > > +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
> > > +{
> > > + return 0;
> > > +}
> >
> > I'd wait to add this until there's a platform that needs to implement it.
> > Splitting it out will make this patch that much smaller and easier to
> > understand.
>
> I need this as this is the default implementation (i.e. do nothing). Otherwise the
> link phase will give errors.
I meant that you could remove the default implementation *and* the call of
it, since it currently does nothing.
> > > diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> > > index dde3a4a..71e36d0 100644
> > > --- a/include/linux/of_pci.h
> > > +++ b/include/linux/of_pci.h
> > > @@ -15,6 +15,9 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
> > > int of_pci_get_devfn(struct device_node *np);
> > > int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
> > > int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
> > > +struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
> > > + struct pci_ops *ops, void *host_data);
> > > +
> > > #else
> > > static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
> > > {
> > > @@ -43,6 +46,13 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> > > {
> > > return -EINVAL;
> > > }
> > > +
> > > +static inline struct pci_host_bridge *
> > > +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
> > > + void *host_data)
> > > +{
> > > + return NULL;
> > > +}
> > > #endif
> > >
> > > #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 7e7b939..556dc5f 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -402,6 +402,7 @@ struct pci_host_bridge {
> > > struct device dev;
> > > struct pci_bus *bus; /* root bus */
> > > int domain_nr;
> > > + resource_size_t io_base; /* physical address for the start of I/O area */
> >
> > I don't see where this is used yet.
>
> It's used in pci_host_bridge_of_get_ranges() (earlier in this patch).
of_create_pci_host_bridge() fills in bridge->io_base, but I don't see
anything that ever *reads* bridge->io_base.
> > As far as I know, there's nothing that prevents a host bridge from having
> > several I/O port apertures (or several memory-mapped I/O port spaces).
>
> The pci_register_io_range() will give different offsets for each apperture.
> I just need to make sure I don't overwrite io_base when parsing multiple IO
> ranges.
Let's continue this in the other thread where I'm trying to understand the
I/O apertures. Obviously I'm still missing something if you can indeed
have multiple I/O apertures per bridge (because then only one of them can
start at I/O address 0 on PCI).
Bjorn
On Tue, Jul 08, 2014 at 10:33:05PM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 08, 2014 at 11:29:40AM +0100, Liviu Dudau wrote:
> > On Tue, Jul 08, 2014 at 02:01:04AM +0100, Bjorn Helgaas wrote:
> > > On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > > > ...
> > > > + for_each_of_pci_range(&parser, &range) {
> > > > + /* Read next ranges element */
> > > > + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx cpu_addr:0x%016llx size:0x%016llx\n",
> > > > + range.pci_space, range.pci_addr, range.cpu_addr, range.size);
> > >
> > > If you're not trying to match other printk formats, you could try to match
> > > the %pR format used elsewhere, e.g., "%#010llx-%#010llx" with
> > > range.cpu_addr, range.cpu_addr + range.size - 1.
> >
> > Yes, not a big fan of the ugly output it generates, but the output matches closely the ranges
> > definition in the device tree file so it is easy to validate that you are parsing the right
> > entry. I am happy to change it to shorten the cpu range message.
>
> If it already matches other device tree stuff, that's perfect. I'm not
> familiar with that.
>
> > > > +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
> > > > +{
> > > > + return 0;
> > > > +}
> > >
> > > I'd wait to add this until there's a platform that needs to implement it.
> > > Splitting it out will make this patch that much smaller and easier to
> > > understand.
> >
> > I need this as this is the default implementation (i.e. do nothing). Otherwise the
> > link phase will give errors.
>
> I meant that you could remove the default implementation *and* the call of
> it, since it currently does nothing.
True. But it looks like converting Will's pci-host-generic.c driver will make use of this.
>
> > > > diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> > > > index dde3a4a..71e36d0 100644
> > > > --- a/include/linux/of_pci.h
> > > > +++ b/include/linux/of_pci.h
> > > > @@ -15,6 +15,9 @@ struct device_node *of_pci_find_child_device(struct device_node *parent,
> > > > int of_pci_get_devfn(struct device_node *np);
> > > > int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
> > > > int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
> > > > +struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
> > > > + struct pci_ops *ops, void *host_data);
> > > > +
> > > > #else
> > > > static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
> > > > {
> > > > @@ -43,6 +46,13 @@ of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> > > > {
> > > > return -EINVAL;
> > > > }
> > > > +
> > > > +static inline struct pci_host_bridge *
> > > > +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
> > > > + void *host_data)
> > > > +{
> > > > + return NULL;
> > > > +}
> > > > #endif
> > > >
> > > > #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
> > > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > > index 7e7b939..556dc5f 100644
> > > > --- a/include/linux/pci.h
> > > > +++ b/include/linux/pci.h
> > > > @@ -402,6 +402,7 @@ struct pci_host_bridge {
> > > > struct device dev;
> > > > struct pci_bus *bus; /* root bus */
> > > > int domain_nr;
> > > > + resource_size_t io_base; /* physical address for the start of I/O area */
> > >
> > > I don't see where this is used yet.
> >
> > It's used in pci_host_bridge_of_get_ranges() (earlier in this patch).
>
> of_create_pci_host_bridge() fills in bridge->io_base, but I don't see
> anything that ever *reads* bridge->io_base.
Ah, understood. It is used by the host bridge drivers to set their ATR registers to the
correct CPU address values.
>
> > > As far as I know, there's nothing that prevents a host bridge from having
> > > several I/O port apertures (or several memory-mapped I/O port spaces).
> >
> > The pci_register_io_range() will give different offsets for each aperture.
> > I just need to make sure I don't overwrite io_base when parsing multiple IO
> > ranges.
>
> Let's continue this in the other thread where I'm trying to understand the
> I/O apertures. Obviously I'm still missing something if you can indeed
> have multiple I/O apertures per bridge (because then only one of them can
> start at I/O address 0 on PCI).
Sure.
Thanks,
Liviu
>
> Bjorn
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 08, 2014 at 11:27:38PM +0100, Liviu Dudau wrote:
> On Tue, Jul 08, 2014 at 10:33:05PM +0100, Bjorn Helgaas wrote:
> > On Tue, Jul 08, 2014 at 11:29:40AM +0100, Liviu Dudau wrote:
> > > On Tue, Jul 08, 2014 at 02:01:04AM +0100, Bjorn Helgaas wrote:
> > > > On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > > > > ...
> > > > > +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
> > > > > +{
> > > > > + return 0;
> > > > > +}
> > > >
> > > > I'd wait to add this until there's a platform that needs to implement it.
> > > > Splitting it out will make this patch that much smaller and easier to
> > > > understand.
> > >
> > > I need this as this is the default implementation (i.e. do nothing). Otherwise the
> > > link phase will give errors.
> >
> > I meant that you could remove the default implementation *and* the call of
> > it, since it currently does nothing.
>
> True. But it looks like converting Will's pci-host-generic.c driver will make use of this.
I think we should move this part of the patch to the conversion of
that driver. That will keep related changes grouped together, which
makes them easier to review and easier to backport, e.g., to distro
kernels.
> > > > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > > > index 7e7b939..556dc5f 100644
> > > > > --- a/include/linux/pci.h
> > > > > +++ b/include/linux/pci.h
> > > > > @@ -402,6 +402,7 @@ struct pci_host_bridge {
> > > > > struct device dev;
> > > > > struct pci_bus *bus; /* root bus */
> > > > > int domain_nr;
> > > > > + resource_size_t io_base; /* physical address for the start of I/O area */
> > > >
> > > > I don't see where this is used yet.
> > >
> > > It's used in pci_host_bridge_of_get_ranges() (earlier in this patch).
> >
> > of_create_pci_host_bridge() fills in bridge->io_base, but I don't see
> > anything that ever *reads* bridge->io_base.
>
> Ah, understood. It is used by the host bridge drivers to set their ATR registers to the
> correct CPU address values.
Same applies here.
Bjorn
On Tue, Jul 08, 2014 at 10:29:51PM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 8, 2014 at 1:00 AM, Arnd Bergmann <[email protected]> wrote:
> > On Tuesday 08 July 2014, Bjorn Helgaas wrote:
> >> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> >> > +static LIST_HEAD(io_range_list);
> >> > +
> >> > +/*
> >> > + * Record the PCI IO range (expressed as CPU physical address + size).
> >> > + * Return a negative value if an error has occured, zero otherwise
> >> > + */
> >> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> >>
> >> I don't understand the interface here. What's the mapping from CPU
> >> physical address to bus I/O port? For example, I have the following
> >> machine in mind:
> >>
> >> HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
> >> HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
> >> HWP0002:00: host bridge window [io 0x0000-0x0fff]
> >>
> >> HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
> >> HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
> >> HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
> >>
> >> The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
> >> the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
> >> e.g., "inb(0)" to access it.
> >>
> >> Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
> >> bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
> >> "inb(0x1000000)" to access it.
> >
> > I guess you are thinking of the IA64 model here where you keep the virtual
> > I/O port numbers in a per-bus lookup table that gets accessed for each
> > inb() call. I've thought about this some more, and I believe there are good
> > reasons for sticking with the model used on arm32 and powerpc for the
> > generic OF implementation.
> >
> > The idea is that there is a single virtual memory range for all I/O port
> > mappings and we use the MMU to do the translation rather than computing
> > it manually in the inb() implementation. The main advantage is that all
> > functions used in device drivers to (potentially) access I/O ports
> > become trivial this way, which helps for code size and in some cases
> > (e.g. SoC-internal registers with a low latency) it may even be performance
> > relevant.
>
> My example is from ia64, but I'm not advocating for the lookup table.
> The point is that the hardware works similarly (at least for dense ia64
> I/O port spaces) in terms of mapping CPU physical addresses to PCI I/O
> space.
>
> I think my confusion is because your pci_register_io_range() and
> pci_address_to_pci() implementations assume that every io_range starts at
> I/O port 0 on PCI (correct me if I'm wrong). I suspect that's why you
> don't save the I/O port number in struct io_range.
>
> Maybe that assumption is guaranteed by OF, but it doesn't hold for ACPI;
> ACPI can describe several I/O port apertures for a single bridge, each
> associated with a different CPU physical memory region.
That is actually a good catch, I've completely missed the fact that
io_range->pci_addr could be non-zero.
>
> If my speculation here is correct, a comment to the effect that each
> io_range corresponds to a PCI I/O space range that starts at 0 might be
> enough.
>
> If you did add a PCI I/O port number argument to pci_register_io_range(),
> we might be able to make an ACPI-based implementation of it. But I guess
> that could be done if/when anybody ever wants to do that.
No, I think you are right, the PCI I/O port number needs to be recorded. I
need to add that to pci_register_io_range().
>
> >> Here's what these look like in /proc/iomem and /proc/ioports (note that
> >> there are two resource structs for each memory-mapped IO port space: one
> >> IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
> >> driver), and one IORESOURCE_IO for the I/O port space (this becomes the
> >> parent of a region used by a regular device driver):
> >>
> >> /proc/iomem:
> >> PCI Bus 0000:00 I/O Ports 00000000-00000fff
> >> PCI Bus 0001:00 I/O Ports 01000000-01000fff
>
> Oops, I forgot the actual physical memory addresses here, but you got
> the idea anyway. It should have been something like this:
>
> /proc/iomem:
> f8010000000-f8010000fff PCI Bus 0000:00 I/O Ports 00000000-00000fff
> f8110000000-f8110000fff PCI Bus 0001:00 I/O Ports 01000000-01000fff
>
> Bjorn
>
Thanks for being thorough with your review.
Best regards,
Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
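For readers following the thread, here is a minimal sketch of the bookkeeping model under discussion (illustrative only; the structure layout and helper names are not taken from the actual patches). Each memory-mapped aperture is recorded with its CPU physical address and size, and logical Linux port numbers are handed out cumulatively in registration order:

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/types.h>

struct io_range {
        struct list_head list;
        phys_addr_t start;      /* CPU physical address of the aperture */
        resource_size_t size;
};

static LIST_HEAD(io_range_list);

/* Record one memory-mapped I/O aperture. */
static int register_io_range_sketch(phys_addr_t addr, resource_size_t size)
{
        struct io_range *range = kzalloc(sizeof(*range), GFP_KERNEL);

        if (!range)
                return -ENOMEM;

        range->start = addr;
        range->size = size;
        list_add_tail(&range->list, &io_range_list);
        return 0;
}

/* Translate a logical Linux port number back to a CPU physical address. */
static phys_addr_t pio_to_address_sketch(unsigned long pio)
{
        struct io_range *range;
        unsigned long offset = 0;

        list_for_each_entry(range, &io_range_list, list) {
                if (pio < offset + range->size)
                        return range->start + (pio - offset);
                offset += range->size;
        }
        return (phys_addr_t)-1; /* not a registered port */
}

Note that the PCI bus I/O address of each aperture is deliberately not part of this bookkeeping; as Arnd explains further down the thread, the mapping between PCI port numbers and Linux port numbers is handled separately when the bridge resources are built.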
On Tue, Jul 08, 2014 at 07:41:50PM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 8, 2014 at 4:46 AM, Liviu Dudau <[email protected]> wrote:
> > On Tue, Jul 08, 2014 at 01:59:54AM +0100, Bjorn Helgaas wrote:
>
> >> I wonder if it would help to make a weak pci_domain_nr() function that
> >> returns "bridge->domain_nr". Then each arch could individually drop its
> >> pci_domain_nr() definition as it was converted, e.g., something like this:
> >>
> >> - Convert every arch pci_domain_nr() from a #define to a non-inline
> >> function
> >> - Add bridge.domain_nr, initialized from pci_domain_nr()
> >> - Add a weak generic pci_domain_nr() that returns bridge.domain_nr
> >> - Add a way to create a host bridge in a specified domain, so we can
> >> initialize bridge.domain_nr without using pci_domain_nr()
> >> - Convert each arch to use the new creation mechanism and drop its
> >> pci_domain_nr() implementation
> >
> > I will try to propose a patch implementing this.
>
> I think this is more of an extra credit, cleanup sort of thing. I
> don't think it advances your primary goal of (I think) getting arm64
> PCI support in. So my advice is to not worry about unifying domain
> handling until later.
Getting arm64 supported *is* my main goal. But like you have stated in your
review of v7, you wanted to see another architecture converted as a guarantee
of "genericity" (for lack of a better word) for my patches. The one architecture
I've set my eyes on is microblaze, and that one uses pci_scan_root_bus()
rather than pci_create_root_bus() so I don't have any opportunity to pass the
domain number or any additional info (like the sysdata pointer that we were
talking about) to the pci_host_bridge structure unless I do this cleanup.
Best regards,
Liviu
>
> Bjorn
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
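As a rough illustration of the direction outlined above (not an actual patch), once every architecture has a real pci_domain_nr() function and the domain number lives in struct pci_host_bridge, the generic weak implementation could be as small as:

#include <linux/pci.h>

int __weak pci_domain_nr(struct pci_bus *bus)
{
        /* walk up to the root bus; its bridge device is the host bridge */
        while (bus->parent)
                bus = bus->parent;

        return to_pci_host_bridge(bus->bridge)->domain_nr;
}

Architectures that still need their own numbering would keep overriding the weak symbol until they are converted.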
On Tue, Jul 08, 2014 at 11:37:37PM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 08, 2014 at 11:27:38PM +0100, Liviu Dudau wrote:
> > On Tue, Jul 08, 2014 at 10:33:05PM +0100, Bjorn Helgaas wrote:
> > > On Tue, Jul 08, 2014 at 11:29:40AM +0100, Liviu Dudau wrote:
> > > > On Tue, Jul 08, 2014 at 02:01:04AM +0100, Bjorn Helgaas wrote:
> > > > > On Tue, Jul 01, 2014 at 07:43:33PM +0100, Liviu Dudau wrote:
> > > > > > ...
> > > > > > +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources)
> > > > > > +{
> > > > > > + return 0;
> > > > > > +}
> > > > >
> > > > > I'd wait to add this until there's a platform that needs to implement it.
> > > > > Splitting it out will make this patch that much smaller and easier to
> > > > > understand.
> > > >
> > > > I need this as this is the default implementation (i.e. do nothing). Otherwise the
> > > > link phase will give errors.
> > >
> > > I meant that you could remove the default implementation *and* the call of
> > > it, since it currently does nothing.
> >
> > True. But it looks like converting Will's pci-host-generic.c driver will make use of this.
>
> I think we should move this part of the patch to the conversion of
> that driver. That will keep related changes grouped together, which
> makes them easier to review and easier to backport, e.g., to distro
> kernels.
>
> > > > > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > > > > index 7e7b939..556dc5f 100644
> > > > > > --- a/include/linux/pci.h
> > > > > > +++ b/include/linux/pci.h
> > > > > > @@ -402,6 +402,7 @@ struct pci_host_bridge {
> > > > > > struct device dev;
> > > > > > struct pci_bus *bus; /* root bus */
> > > > > > int domain_nr;
> > > > > > + resource_size_t io_base; /* physical address for the start of I/O area */
> > > > >
> > > > > I don't see where this is used yet.
> > > >
> > > > It's used in pci_host_bridge_of_get_ranges() (earlier in this patch).
> > >
> > > of_create_pci_host_bridge() fills in bridge->io_base, but I don't see
> > > anything that ever *reads* bridge->io_base.
> >
> > Ah, understood. It is used by the host bridge drivers to set their ATR registers to the
> > correct CPU address values.
>
> Same applies here.
Except I have no idea which host bridge driver is going to be submitted first. There are
at least a couple of public series for HB drivers based on my series, plus the one I have yet
to submit for my platform.
Maybe an associated documentation patch that explains the intended use for io_base could
persuade you of the value of keeping the io_base patch here?
Best regards,
Liviu
>
> Bjorn
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tuesday 08 July 2014, Bjorn Helgaas wrote:
> On Tue, Jul 8, 2014 at 1:00 AM, Arnd Bergmann <[email protected]> wrote:
> > On Tuesday 08 July 2014, Bjorn Helgaas wrote:
> >> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> >> > +static LIST_HEAD(io_range_list);
> >> > +
> >> > +/*
> >> > + * Record the PCI IO range (expressed as CPU physical address + size).
> >> > + * Return a negative value if an error has occured, zero otherwise
> >> > + */
> >> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> >>
> >> I don't understand the interface here. What's the mapping from CPU
> >> physical address to bus I/O port? For example, I have the following
> >> machine in mind:
> >>
> >> HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
> >> HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
> >> HWP0002:00: host bridge window [io 0x0000-0x0fff]
> >>
> >> HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
> >> HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
> >> HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
> >>
> >> The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
> >> the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
> >> e.g., "inb(0)" to access it.
> >>
> >> Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
> >> bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
> >> "inb(0x1000000)" to access it.
> >
> > I guess you are thinking of the IA64 model here where you keep the virtual
> > I/O port numbers in a per-bus lookup table that gets accessed for each
> > inb() call. I've thought about this some more, and I believe there are good
> > reasons for sticking with the model used on arm32 and powerpc for the
> > generic OF implementation.
> >
> > The idea is that there is a single virtual memory range for all I/O port
> > mappings and we use the MMU to do the translation rather than computing
> > it manually in the inb() implementation. The main advantage is that all
> > functions used in device drivers to (potentially) access I/O ports
> > become trivial this way, which helps for code size and in some cases
> > (e.g. SoC-internal registers with a low latency) it may even be performance
> > relevant.
>
> My example is from ia64, but I'm not advocating for the lookup table.
> The point is that the hardware works similarly (at least for dense ia64
> I/O port spaces) in terms of mapping CPU physical addresses to PCI I/O
> space.
>
> I think my confusion is because your pci_register_io_range() and
> pci_address_to_pci() implementations assume that every io_range starts at
> I/O port 0 on PCI (correct me if I'm wrong). I suspect that's why you
> don't save the I/O port number in struct io_range.
I think you are just misreading the code, but I agree it's hard to
understand and I made the same mistake in my initial reply to the
first version.
pci_register_io_range and pci_address_to_pci only worry about the mapping
between CPU physical and Linux I/O address, they do not care which PCI
port numbers are behind that. The mapping between PCI port numbers and
Linux port numbers is done correctly in patch 8/9 in the
pci_host_bridge_of_get_ranges() function.
> Maybe that assumption is guaranteed by OF, but it doesn't hold for ACPI;
> ACPI can describe several I/O port apertures for a single bridge, each
> associated with a different CPU physical memory region.
DT can have the same, although the common case is that each PCI host
bridge has 64KB of I/O ports starting at address 0. Most driver writers
get it wrong for the case where it starts at a different address, so
I really want to have a generic implementation that gets it right.
> If my speculation here is correct, a comment to the effect that each
> io_range corresponds to a PCI I/O space range that starts at 0 might be
> enough.
>
> If you did add a PCI I/O port number argument to pci_register_io_range(),
> we might be able to make an ACPI-based implementation of it. But I guess
> that could be done if/when anybody ever wants to do that.
I think we shouldn't worry about it before we actually need it. As far as
I understand, the only user of that code (unless someone wants to convert
ia64) would be ARM64 with ACPI, but that uses the SBSA hardware model that
recommends having no I/O space at all.
Arnd
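To make the contrast concrete, the arm32/powerpc-style accessor described above reduces to a fixed-offset MMIO access (simplified sketch; it assumes PCI_IOBASE is the fixed virtual base of the single I/O window and omits barriers and byte-order handling):

static inline u8 sketch_inb(unsigned long port)
{
        /*
         * One fixed offset for every host bridge; the MMU mapping set up
         * at probe time does the per-bridge translation, so no per-bus
         * lookup table is consulted on the fast path.
         */
        return readb(PCI_IOBASE + port);
}

The per-bus lookup-table model would instead have to find the right aperture for the port number on every access.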
On Tuesday 08 July 2014, Liviu Dudau wrote:
> > Here's what these look like in /proc/iomem and /proc/ioports (note that
> > there are two resource structs for each memory-mapped IO port space: one
> > IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
> > driver), and one IORESOURCE_IO for the I/O port space (this becomes the
> > parent of a region used by a regular device driver):
> >
> > /proc/iomem:
> > PCI Bus 0000:00 I/O Ports 00000000-00000fff
> > PCI Bus 0001:00 I/O Ports 01000000-01000fff
> >
> > /proc/ioports:
> > 00000000-00000fff : PCI Bus 0000:00
> > 01000000-01000fff : PCI Bus 0001:00
>
> OK, I have a question that might be obvious to you but I have missed the answer
> so far: how does the IORESOURCE_MEM area get created? Is it the host bridge
> driver's job to do it? Is it something that the framework should do when it
> notices that the IORESOURCE_IO is memory mapped?
The host bridge driver should either register the IORESOURCE_MEM resource
itself from its probe or setup function, or it should get registered behind
the covers in drivers using of_create_pci_host_bridge().
Your new pci_host_bridge_of_get_ranges already loops over all the
resources, so it would be a good place to put that.
Arnd
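A rough sketch of that registration (illustrative only; the helper name and call site are assumptions, not part of the series). The CPU physical window backing the I/O ports is claimed as an IORESOURCE_MEM resource so that it shows up in /proc/iomem alongside the IORESOURCE_IO resource handed to the PCI core:

static int claim_io_window_sketch(struct device *dev, phys_addr_t cpu_addr,
                                  resource_size_t size)
{
        struct resource *res;

        res = devm_kzalloc(dev, sizeof(*res), GFP_KERNEL);
        if (!res)
                return -ENOMEM;

        res->name = "PCI I/O window";
        res->flags = IORESOURCE_MEM;
        res->start = cpu_addr;
        res->end = cpu_addr + size - 1;

        /* make the memory-mapped window visible in /proc/iomem */
        return request_resource(&iomem_resource, res);
}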
On Wednesday 09 July 2014, Liviu Dudau wrote:
> > Maybe that assumption is guaranteed by OF, but it doesn't hold for ACPI;
> > ACPI can describe several I/O port apertures for a single bridge, each
> > associated with a different CPU physical memory region.
>
> That is actually a good catch, I've completely missed the fact that
> io_range->pci_addr could be non-zero.
Hmm, that's what I thought in my initial review, but you convinced me
that it's actually correct later on, and I still believe it is. Maybe
now you got confused by your own code?
Please have another look, I think your code in pci_host_bridge_of_get_ranges
sufficiently handles the registration to the PCI code with the correct
io_offset. The only thing that we might want to add is to record the
PCI address along with the bridge->io_base: For the host driver to
set up the mapping window correctly, you either need both of them, or
you assume they are already set up.
Arnd
On Wednesday 09 July 2014, Liviu Dudau wrote:
> On Tue, Jul 08, 2014 at 11:37:37PM +0100, Bjorn Helgaas wrote:
> > On Tue, Jul 08, 2014 at 11:27:38PM +0100, Liviu Dudau wrote:
> > > > > > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > > > > > index 7e7b939..556dc5f 100644
> > > > > > > --- a/include/linux/pci.h
> > > > > > > +++ b/include/linux/pci.h
> > > > > > > @@ -402,6 +402,7 @@ struct pci_host_bridge {
> > > > > > > struct device dev;
> > > > > > > struct pci_bus *bus; /* root bus */
> > > > > > > int domain_nr;
> > > > > > > + resource_size_t io_base; /* physical address for the start of I/O area */
> > > > > >
> > > > > > I don't see where this is used yet.
> > > > >
> > > > > It's used in pci_host_bridge_of_get_ranges() (earlier in this patch).
> > > >
> > > > of_create_pci_host_bridge() fills in bridge->io_base, but I don't see
> > > > anything that ever reads bridge->io_base.
> > >
> > > Ah, understood. It is used by the host bridge drivers to set their ATR registers to the
> > > correct CPU address values.
Actually, as we just discovered with one of the pci_dw drivers, it may
be the wrong number: what you program in the ATR registers is not
necessarily the same as the address visible to the CPU. What you need
instead is the address at the bus immediately above the PCI host bridge.
I think for now we can leave out this part from common code and add
the infrastructure later. Host drivers can have their own loop
around the ranges if they need to set up these registers. Ideally
at least on arm64, they should be set up by the firmware already.
Arnd
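For a host driver that does need to program such windows itself, the loop over the bridge resources could be sketched as below. This is purely hypothetical: program_atr_window() stands in for hardware-specific code, and per the caveat above the first argument may need to be a bus address rather than bridge->io_base or the CPU-visible address.

struct pci_host_bridge_window *window;

list_for_each_entry(window, &bridge->windows, list) {
        struct resource *res = window->res;
        resource_size_t pci_addr = res->start - window->offset;

        if (resource_type(res) == IORESOURCE_IO)
                /* res->start is a logical port number, not a CPU address */
                program_atr_window(bridge->io_base, pci_addr,
                                   resource_size(res));
        else if (resource_type(res) == IORESOURCE_MEM)
                program_atr_window(res->start, pci_addr,
                                   resource_size(res));
}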
On Tuesday 08 July 2014, Liviu Dudau wrote:
> On Mon, Jul 07, 2014 at 10:22:00PM +0100, Arnd Bergmann wrote:
> >
> > I looked at the other drivers briefly, and I think you indeed fix the Tegra
> > driver with this but break the integrator driver as mentioned above.
> > The other callers of of_pci_range_to_resource() are apparently not
> > impacted as they recalculate the values they get.
>
> I would argue that the integrator version has broken assumptions. If it tried
> to allocate that IO range or request the resource as currently returned by
> of_pci_range_to_resource (without my patch), it would fail. I know because I did
> the same thing in my host bridge driver and it failed miserably. That's why I
> tried to patch it.
The integrator code was just introduced and the reason for how it does things
is the way that of_pci_range_to_resource() works today. We tried to cope with
it and not change the existing behavior in order to not break any other drivers.
It's certainly not fair to call the integrator version broken, it just works
around the common code having a quirky interface. We should probably have
done of_pci_range_to_resource better than it is today (I would have argued
for it to return an IORESOURCE_MEM with the CPU address), but it took long
enough to get that merged and I was sick of arguing about it.
> If the IO space is memory mapped, then we use the port number, the io_offset
> and the PCI_IOBASE to get to the virtual address that, when accessed, will
> generate the correct addresses on the bus, based on what the host bridge has
> been configured.
>
> This is the current level of my understanding of PCI IO.
Your understanding is absolutely correct, and that's great because very few
people get that right. What I think we're really arguing about is what the
of_pci_range_to_resource is supposed to return. As you and Bjorn both pointed
out earlier, there are in fact two resources associated with the I/O window
and the flaw in the current implementation is that of_pci_range_to_resource
returns the numeric values for the IORESOURCE_MEM resource, but sets the
type to IORESOURCE_IO, which is offset from that by PCI_IOBASE.
You try to fix that by making it return the correct IORESOURCE_IO resource,
which is a reasonable approach but you must not break drivers that rely
on the broken resource while doing that.
The approach that I would have picked is to return the IORESOURCE_MEM
resource associated with the I/O window and pick a (basically random)
IORESOURCE_IO resource struct based on what hasn't been used and then
compute the appropriate io_offset from that. This approach of course
would also have required fixing up all drivers relying on the current
behavior.
To be clear, I'm fine with you (and Bjorn if he cares) picking the
approach you like here, either one of these works fine as long as the
host drivers use the interface in the way it is defined.
> Now, I believe Rob has switched entirely to using my series in some test that
> he has run and he hasn't encountered any issues, as long as one remembers in
> the host bridge driver to add the io_base offset to the .start resource. If
> not then I need to patch pci_v3.c.
The crazy part of all these discussions is that basically nobody ever uses
I/O port access, so it's very hard to test and we don't even notice when
we get it wrong, but we end up spending most of the time for PCI host controller
reviews trying to get these right.
I'm very thankful that you are doing this work to get it moved into common
code so hopefully this is the last time we ever have to worry about it because
all future drivers will be able to use the common implementation.
Arnd
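For reference, the relationship described above can be written down in a few lines. This is a sketch, not the patch itself; it assumes pci_address_to_pio() is the helper that maps a CPU physical address to a logical port number, and that range comes from one device tree "ranges" entry:

struct resource *io_res = kzalloc(sizeof(*io_res), GFP_KERNEL);
resource_size_t io_offset;
unsigned long port;

if (!io_res)
        return -ENOMEM;

/* logical Linux port corresponding to the start of the CPU window */
port = pci_address_to_pio(range.cpu_addr);

io_res->name  = "PCI I/O";
io_res->flags = IORESOURCE_IO;
io_res->start = port;
io_res->end   = port + range.size - 1;

/* offset between logical Linux port numbers and PCI bus I/O addresses */
io_offset = port - range.pci_addr;
pci_add_resource_offset(&resources, io_res, io_offset);

The CPU physical window itself, [range.cpu_addr, range.cpu_addr + range.size - 1], remains an IORESOURCE_MEM resource, which is the other of the two resources associated with the I/O window.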
On Tue, Jul 08, 2014 at 03:14:17PM +0100, Arnd Bergmann wrote:
> On Tuesday 08 July 2014, Liviu Dudau wrote:
> > > Here's what these look like in /proc/iomem and /proc/ioports (note that
> > > there are two resource structs for each memory-mapped IO port space: one
> > > IORESOURCE_MEM for the memory-mapped area (used only by the host bridge
> > > driver), and one IORESOURCE_IO for the I/O port space (this becomes the
> > > parent of a region used by a regular device driver):
> > >
> > > /proc/iomem:
> > > PCI Bus 0000:00 I/O Ports 00000000-00000fff
> > > PCI Bus 0001:00 I/O Ports 01000000-01000fff
> > >
> > > /proc/ioports:
> > > 00000000-00000fff : PCI Bus 0000:00
> > > 01000000-01000fff : PCI Bus 0001:00
> >
> > OK, I have a question that might be obvious to you but I have missed the answer
> > so far: how does the IORESOURCE_MEM area get created? Is it the host bridge
> > driver's job to do it? Is it something that the framework should do when it
> > notices that the IORESOURCE_IO is memory mapped?
>
> The host bridge driver should either register the IORESOURCE_MEM resource
> itself from its probe or setup function, or it should get registered behind
> the covers in drivers using of_create_pci_host_bridge().
>
> Your new pci_host_bridge_of_get_ranges already loops over all the
> resources, so it would be a good place to put that.
OK, so it is not something that I've missed, just something that x86-64 does and
my version doesn't yet.
Thanks for confirming that.
Liviu
>
> Arnd
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Wed, Jul 09, 2014 at 07:32:37AM +0100, Arnd Bergmann wrote:
> On Wednesday 09 July 2014, Liviu Dudau wrote:
> > > Maybe that assumption is guaranteed by OF, but it doesn't hold for ACPI;
> > > ACPI can describe several I/O port apertures for a single bridge, each
> > > associated with a different CPU physical memory region.
> >
> > That is actually a good catch, I've completely missed the fact that
> > io_range->pci_addr could be non-zero.
>
> Hmm, that's what I thought in my initial review, but you convinced me
> that it's actually correct later on, and I still believe it is. Maybe
> now you got confused by your own code?
Man, it has been too long. Yes, I am now confused by my own code, which
is not a good sign.
>
> Please have another look, I think your code in pci_host_bridge_of_get_ranges
> sufficiently handles the registration to the PCI code with the correct
> io_offset. The only thing that we might want to add is to record the
> PCI address along with the bridge->io_base: For the host driver to
> set up the mapping window correctly, you either need both of them, or
> you assume they are already set up.
Hmm, having another look at pci_host_bridge_of_get_ranges() I'm not convinced
that we need separate storage for pci_addr. The resource gets added to the
list of resources used by the bridge offset by range.pci_addr, so when
re-creating the PCI bus address that value should come into play.
I will double check but I think the code is correct as it is. Sorry for the
early confusion.
Best regards,
Liviu
>
> Arnd
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Wed, Jul 09, 2014 at 07:20:49AM +0100, Arnd Bergmann wrote:
> On Tuesday 08 July 2014, Bjorn Helgaas wrote:
> > On Tue, Jul 8, 2014 at 1:00 AM, Arnd Bergmann <[email protected]> wrote:
> > > On Tuesday 08 July 2014, Bjorn Helgaas wrote:
> > >> On Tue, Jul 01, 2014 at 07:43:28PM +0100, Liviu Dudau wrote:
> > >> > +static LIST_HEAD(io_range_list);
> > >> > +
> > >> > +/*
> > >> > + * Record the PCI IO range (expressed as CPU physical address + size).
> > >> > + * Return a negative value if an error has occured, zero otherwise
> > >> > + */
> > >> > +int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
> > >>
> > >> I don't understand the interface here. What's the mapping from CPU
> > >> physical address to bus I/O port? For example, I have the following
> > >> machine in mind:
> > >>
> > >> HWP0002:00: PCI Root Bridge (domain 0000 [bus 00-1b])
> > >> HWP0002:00: memory-mapped IO port space [mem 0xf8010000000-0xf8010000fff]
> > >> HWP0002:00: host bridge window [io 0x0000-0x0fff]
> > >>
> > >> HWP0002:09: PCI Root Bridge (domain 0001 [bus 00-1b])
> > >> HWP0002:09: memory-mapped IO port space [mem 0xf8110000000-0xf8110000fff]
> > >> HWP0002:09: host bridge window [io 0x1000000-0x1000fff] (PCI address [0x0-0xfff])
> > >>
> > >> The CPU physical memory [mem 0xf8010000000-0xf8010000fff] is translated by
> > >> the bridge to I/O ports 0x0000-0x0fff on PCI bus 0000:00. Drivers use,
> > >> e.g., "inb(0)" to access it.
> > >>
> > >> Similarly, [mem 0xf8110000000-0xf8110000fff] is translated by the second
> > >> bridge to I/O ports 0x0000-0x0fff on PCI bus 0001:00. Drivers use
> > >> "inb(0x1000000)" to access it.
> > >
> > > I guess you are thinking of the IA64 model here where you keep the virtual
> > > I/O port numbers in a per-bus lookup table that gets accessed for each
> > > inb() call. I've thought about this some more, and I believe there are good
> > > reasons for sticking with the model used on arm32 and powerpc for the
> > > generic OF implementation.
> > >
> > > The idea is that there is a single virtual memory range for all I/O port
> > > mappings and we use the MMU to do the translation rather than computing
> > > it manually in the inb() implementation. The main advantage is that all
> > > functions used in device drivers to (potentially) access I/O ports
> > > become trivial this way, which helps for code size and in some cases
> > > (e.g. SoC-internal registers with a low latency) it may even be performance
> > > relevant.
> >
> > My example is from ia64, but I'm not advocating for the lookup table.
> > The point is that the hardware works similarly (at least for dense ia64
> > I/O port spaces) in terms of mapping CPU physical addresses to PCI I/O
> > space.
> >
> > I think my confusion is because your pci_register_io_range() and
> > pci_address_to_pci() implementations assume that every io_range starts at
> > I/O port 0 on PCI (correct me if I'm wrong). I suspect that's why you
> > don't save the I/O port number in struct io_range.
>
> I think you are just misreading the code, but I agree it's hard to
> understand and I made the same mistake in my initial reply to the
> first version.
I am willing to make the code easier to understand and validate. Proof that
things are not that easy to check is that I also got confused last night
without having all the code in front of me. Any suggestions?
Best regards,
Liviu
>
> pci_register_io_range and pci_address_to_pci only worry about the mapping
> between CPU physical and Linux I/O address, they do not care which PCI
> port numbers are behind that. The mapping between PCI port numbers and
> Linux port numbers is done correctly in patch 8/9 in the
> pci_host_bridge_of_get_ranges() function.
>
> > Maybe that assumption is guaranteed by OF, but it doesn't hold for ACPI;
> > ACPI can describe several I/O port apertures for a single bridge, each
> > associated with a different CPU physical memory region.
>
> DT can have the same, although the common case is that each PCI host
> bridge has 64KB of I/O ports starting at address 0. Most driver writers
> get it wrong for the case where it starts at a different address, so
> I really want to have a generic implementation that gets it right.
>
> > If my speculation here is correct, a comment to the effect that each
> > io_range corresponds to a PCI I/O space range that starts at 0 might be
> > enough.
> >
> > If you did add a PCI I/O port number argument to pci_register_io_range(),
> > we might be able to make an ACPI-based implementation of it. But I guess
> > that could be done if/when anybody ever wants to do that.
>
> I think we shouldn't worry about it before we actually need it. As far as
> I understand, the only user of that code (unless someone wants to convert
> ia64) would be ARM64 with ACPI, but that uses the SBSA hardware model that
> recommends having no I/O space at all.
>
> Arnd
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Wed, Jul 09, 2014 at 09:31:50AM +0100, Arnd Bergmann wrote:
> On Tuesday 08 July 2014, Liviu Dudau wrote:
> > On Mon, Jul 07, 2014 at 10:22:00PM +0100, Arnd Bergmann wrote:
> > >
> > > I looked at the other drivers briefly, and I think you indeed fix the Tegra
> > > driver with this but break the integrator driver as mentioned above.
> > > The other callers of of_pci_range_to_resource() are apparently not
> > > impacted as they recalculate the values they get.
> >
> > I would argue that the integrator version has broken assumptions. If it tried
> > to allocate that IO range or request the resource as currently returned by
> > of_pci_range_to_resource (without my patch), it would fail. I know because I did
> > the same thing in my host bridge driver and it failed miserably. That's why I
> > tried to patch it.
>
> The integrator code was just introduced and the reason for how it does things
> is the way that of_pci_range_to_resource() works today. We tried to cope with
> it and not change the existing behavior in order to not break any other drivers.
>
> It's certainly not fair to call the integrator version broken, it just works
> around the common code having a quirky interface. We should probably have
> done of_pci_range_to_resource better than it is today (I would have argued
> for it to return an IORESOURCE_MEM with the CPU address), but it took long
> enough to get that merged and I was sick of arguing about it.
Understood. That is why I have carefully worded my email so as not to diss anyone.
I didn't say the code is broken, I said it has broken assumptions.
>
> > If the IO space is memory mapped, then we use the port number, the io_offset
> > and the PCI_IOBASE to get to the virtual address that, when accessed, will
> > generate the correct addresses on the bus, based on what the host bridge has
> > been configured.
> >
> > This is the current level of my understanding of PCI IO.
>
> Your understanding is absolutely correct, and that's great because very few
> people get that right. What I think we're really arguing about is what the
> of_pci_range_to_resource is supposed to return. As you and Bjorn both pointed
> out earlier, there are in fact two resources associated with the I/O window
> and the flaw in the current implementation is that of_pci_range_to_resource
> returns the numeric values for the IORESOURCE_MEM resource, but sets the
> type to IORESOURCE_IO, which is offset from that by PCI_IOBASE.
>
> You try to fix that by making it return the correct IORESOURCE_IO resource,
> which is a reasonable approach but you must not break drivers that rely
> on the broken resource while doing that.
Or I need to fix the existing drivers that rely on the old behaviour.
>
> The approach that I would have picked is to return the IORESOURCE_MEM
> resource associated with the I/O window and pick a (basically random)
> IORESOURCE_IO resource struct based on what hasn't been used and then
> compute the appropriate io_offset from that. This approach of course
> would also have required fixing up all drivers relying on the current
> behavior.
>
> To be clear, I'm fine with you (and Bjorn if he cares) picking the
> approach you like here, either one of these works fine as long as the
> host drivers use the interface in the way it is defined.
OK. Thanks for that. It does look like either way some existing code needs
fixing, so I will have a look at that, unless Bjorn votes for making a new
version of of_pci_range_to_resource().
>
> > Now, I believe Rob has switched entirely to using my series in some test that
> > he has run and he hasn't encountered any issues, as long as one remembers in
> > the host bridge driver to add the io_base offset to the .start resource. If
> > not then I need to patch pci_v3.c.
>
> The crazy part of all these discussions is that basically nobody ever uses
> I/O port access, so it's very hard to test and we don't even notice when
> we get it wrong, but we end up spending most of the time for PCI host controller
> reviews trying to get these right.
>
> I'm very thankful that you are doing this work to get it moved into common
> code so hopefully this is the last time we ever have to worry about it because
> all future drivers will be able to use the common implementation.
Ahh, we humans! We always hope for the best! :)
My only chance of succeeding is if I make it a no-brainer for people to use the
code. At the moment the interface for host bridge drivers is not too bad, but
it looks like the internals are still hard to comprehend.
Best regards,
Liviu
>
> Arnd
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Tue, Jul 8, 2014 at 4:48 PM, Liviu Dudau <[email protected]> wrote:
> On Tue, Jul 08, 2014 at 07:41:50PM +0100, Bjorn Helgaas wrote:
>> On Tue, Jul 8, 2014 at 4:46 AM, Liviu Dudau <[email protected]> wrote:
>> > On Tue, Jul 08, 2014 at 01:59:54AM +0100, Bjorn Helgaas wrote:
>>
>> >> I wonder if it would help to make a weak pci_domain_nr() function that
>> >> returns "bridge->domain_nr". Then each arch could individually drop its
>> >> pci_domain_nr() definition as it was converted, e.g., something like this:
>> >>
>> >> - Convert every arch pci_domain_nr() from a #define to a non-inline
>> >> function
>> >> - Add bridge.domain_nr, initialized from pci_domain_nr()
>> >> - Add a weak generic pci_domain_nr() that returns bridge.domain_nr
>> >> - Add a way to create a host bridge in a specified domain, so we can
>> >> initialize bridge.domain_nr without using pci_domain_nr()
>> >> - Convert each arch to use the new creation mechanism and drop its
>> >> pci_domain_nr() implementation
>> >
>> > I will try to propose a patch implementing this.
>>
>> I think this is more of an extra credit, cleanup sort of thing. I
>> don't think it advances your primary goal of (I think) getting arm64
>> PCI support in. So my advice is to not worry about unifying domain
>> handling until later.
>
> Getting arm64 supported *is* my main goal. But like you have stated in your
> review of v7, you wanted to see another architecture converted as a guarantee
> of "genericity" (for lack of a better word) for my patches. The one architecture
> I've set my eyes on is microblaze, and that one uses pci_scan_root_bus()
> rather than pci_create_root_bus() so I don't have any opportunity to pass the
> domain number or any additional info (like the sysdata pointer that we were
> talking about) to the pci_host_bridge structure unless I do this cleanup.
I think maybe I was too harsh about that, or maybe we had different
ideas about what "conversion" involved. My comment was in response to
"pci: Introduce pci_register_io_range() helper function", and I don't
remember why I was concerned about that; it's not even in drivers/pci,
and it doesn't have an obvious connection to putting the domain number
in struct pci_host_bridge.
The thing I'm more concerned about is adding new PCI interfaces, e.g.,
pci_create_root_bus_in_domain(), that are only used by one
architecture. Then it's hard to be sure that it's going to be useful
for other arches. If you can add arm64 using the existing PCI
interfaces, I don't have any problem with that.
Bjorn
On Wed, Jul 9, 2014 at 12:20 AM, Arnd Bergmann <[email protected]> wrote:
> On Tuesday 08 July 2014, Bjorn Helgaas wrote:
>> I think my confusion is because your pci_register_io_range() and
>> pci_address_to_pci() implementations assume that every io_range starts at
>> I/O port 0 on PCI (correct me if I'm wrong). I suspect that's why you
>> don't save the I/O port number in struct io_range.
>
> I think you are just misreading the code, but I agree it's hard to
> understand and I made the same mistake in my initial reply to the
> first version.
>
> pci_register_io_range and pci_address_to_pci only worry about the mapping
> between CPU physical and Linux I/O address, they do not care which PCI
> port numbers are behind that. The mapping between PCI port numbers and
> Linux port numbers is done correctly in patch 8/9 in the
> pci_host_bridge_of_get_ranges() function.
Ah, I see now. Thanks for explaining this again (I see you explained
it earlier; I just didn't understand it). Now that I see it, it *is*
very slick to handle both MMIO and PIO spaces the same way.
Bjorn
On Wed, Jul 09, 2014 at 04:10:04PM +0100, Bjorn Helgaas wrote:
> On Tue, Jul 8, 2014 at 4:48 PM, Liviu Dudau <[email protected]> wrote:
> > On Tue, Jul 08, 2014 at 07:41:50PM +0100, Bjorn Helgaas wrote:
> >> On Tue, Jul 8, 2014 at 4:46 AM, Liviu Dudau <[email protected]> wrote:
> >> > On Tue, Jul 08, 2014 at 01:59:54AM +0100, Bjorn Helgaas wrote:
> >>
> >> >> I wonder if it would help to make a weak pci_domain_nr() function that
> >> >> returns "bridge->domain_nr". Then each arch could individually drop its
> >> >> pci_domain_nr() definition as it was converted, e.g., something like this:
> >> >>
> >> >> - Convert every arch pci_domain_nr() from a #define to a non-inline
> >> >> function
> >> >> - Add bridge.domain_nr, initialized from pci_domain_nr()
> >> >> - Add a weak generic pci_domain_nr() that returns bridge.domain_nr
> >> >> - Add a way to create a host bridge in a specified domain, so we can
> >> >> initialize bridge.domain_nr without using pci_domain_nr()
> >> >> - Convert each arch to use the new creation mechanism and drop its
> >> >> pci_domain_nr() implementation
> >> >
> >> > I will try to propose a patch implementing this.
> >>
> >> I think this is more of an extra credit, cleanup sort of thing. I
> >> don't think it advances your primary goal of (I think) getting arm64
> >> PCI support in. So my advice is to not worry about unifying domain
> >> handling until later.
> >
> > Getting arm64 supported *is* my main goal. But like you have stated in your
> > review of v7, you wanted to see another architecture converted as a guarantee
> > of "genericity" (for lack of a better word) for my patches. The one architecture
> > I've set my eyes on is microblaze, and that one uses pci_scan_root_bus()
> > rather than pci_create_root_bus() so I don't have any opportunity to pass the
> > domain number or any additional info (like the sysdata pointer that we were
> > talking about) to the pci_host_bridge structure unless I do this cleanup.
>
> I think maybe I was too harsh about that, or maybe we had different
> ideas about what "conversion" involved. My comment was in response to
> "pci: Introduce pci_register_io_range() helper function", and I don't
> remember why I was concerned about that; it's not even in drivers/pci,
> and it doesn't have an obvious connection to putting the domain number
> in struct pci_host_bridge.
Well, to be honest I did move some of the code (as mentioned in the Changelog) from
drivers/pci into drivers/of. It makes more sense for it to live in OF, as it mostly
concerns architectures that use it.
>
> The thing I'm more concerned about is adding new PCI interfaces, e.g.,
> pci_create_root_bus_in_domain(), that are only used by one
> architecture. Then it's hard to be sure that it's going to be useful
> for other arches. If you can add arm64 using the existing PCI
> interfaces, I don't have any problem with that.
(No blame here or reproaches, I'm just restating the situation:) I (mostly) did try
that in my v7 series but it also got NAK-ed by Arnd and Catalin as it had too much
arm64 specific code in there.
I don't see a way out of adding new PCI interfaces if we want to have support in
the PCI framework for unifying existing architectures. Of course, there is the painful
alternative of changing the existing APIs and fixing arches in one go, but like you've
said, that is going to be messy. I don't think I (or the people and companies wanting PCIe
on arm64) should cop out and pick a quick fix that adds a sysdata structure to arm64
just to avoid new APIs, as this is not going to help anyone in the long term. What I can
do is to create a set of parallel APIs for pci_{scan,create}_root_bus() that take
a pci_host_bridge pointer and start converting architectures one by one to that API
while deprecating the existing one. That way we can add arm64 easily as it would be
the first architecture to use new code without breaking things *and* we provide a
migration path.
Best regards,
Liviu
>
> Bjorn
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
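To make the proposal concrete, the parallel entry points described above might look roughly like this (prototypes only; the names are invented for illustration and are not from any posted patch):

/*
 * Same job as pci_create_root_bus()/pci_scan_root_bus(), but the caller
 * hands in a pre-filled pci_host_bridge (domain number, io_base, ...)
 * instead of passing architecture-specific sysdata.
 */
struct pci_bus *pci_create_root_bus_with_bridge(struct pci_host_bridge *bridge,
                                                int busno, struct pci_ops *ops,
                                                struct list_head *resources);

struct pci_bus *pci_scan_root_bus_with_bridge(struct pci_host_bridge *bridge,
                                              int busno, struct pci_ops *ops,
                                              struct list_head *resources);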
On Thu, Jul 10, 2014 at 3:47 AM, Liviu Dudau <[email protected]> wrote:
> I don't see a way out of adding new PCI interfaces if we want to have support in
> the PCI framework for unifying existing architectures. Of course, there is the painful
> alternative of changing the existing APIs and fixing arches in one go, but like you've
> said, that is going to be messy. I don't think I (or the people and companies wanting PCIe
> on arm64) should cop out and pick a quick fix that adds a sysdata structure to arm64
> just to avoid new APIs, as this is not going to help anyone in the long term. What I can
> do is to create a set of parallel APIs for pci_{scan,create}_root_bus() that take
> a pci_host_bridge pointer and start converting architectures one by one to that API
> while deprecating the existing one. That way we can add arm64 easily as it would be
> the first architecture to use new code without breaking things *and* we provide a
> migration path.
A lot of the v7 discussion was about pci_register_io_range(). I
apologize, because I think I really derailed things there and it was
unwarranted. Arnd was right that migrating other arches should be a
separate effort. I *think* I was probably thinking about the proposal
of adding pci_create_root_bus_in_domain(), and my reservations about
that got transferred to the pci_register_io_range() discussion. In
any case, I'm completely fine with pci_register_io_range() now.
Most of the rest of the v7 discussion was about "Introduce a domain
number for pci_host_bridge." I think we should add arm64 using the
existing pci_scan_root_bus() and keep the domain number in the arm64
sysdata structure like every other arch does. Isn't that feasible?
We can worry about domain unification later.
I haven't followed closely enough to know what other objections people had.
Bjorn
Hi,
On Tue, Jul 8, 2014 at 10:18 AM, Liviu Dudau <[email protected]> wrote:
> On Sun, Jul 06, 2014 at 04:23:43PM +0100, Rob Herring wrote:
>> On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
>> > This is my resurected attempt at adding support for generic PCI host
>> > bridge controllers that make use of device tree information to
>> > configure themselves. I've tagged it as v8 although the patches
>> > have now been reshuffled in order to ease adoption so referring to
>> > the older versions might be a bit of a hoop jumping exercise.
>> >
>> > Changes from v7:
>> > - Reordered the patches so that fixes and non-controversial patches
>> > from v7 can be accepted more easily. If agreed I can split the
>> > series again into patches that can be upstreamed easily and ones
>> > that still need discussion.
>> > - Moved the of_create_host_bridge() function to drivers/of/of_pci.c
>> > to better reflect its target use.
>> > - Added the function to remap the bus I/O resources that used to be
>> > provided in my arm64 patch series and (re)named it pci_remap_iospace()
>> > - Removed error code checking from parsing and mapping of IRQ from DT
>> > in recognition that some PCI devices will not have legacy IRQ mappings.
>> >
>> > v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
>>
>> Can you publish a branch for this series please.
>>
>> Rob
>>
>
> Hi Rob,
>
> I have pushed a branch that matches my v8 patchset +1 obvious missing header include
> here: http://www.linux-arm.org/git?p=linux-ld.git;a=shortlog;h=refs/heads/for-upstream/pci_v8
>
I was still getting the following compilation errors after applying the arm64
PCI headers. Please let me know if I am missing something.
linux-git/drivers/of/of_pci.c: In function ‘pci_host_bridge_of_get_ranges’:
linux-git/drivers/of/of_pci.c:114:22: error: storage size of ‘range’ isn’t known
struct of_pci_range range;
^
linux-git/drivers/of/of_pci.c:115:29: error: storage size of ‘parser’
isn’t known
struct of_pci_range_parser parser;
^
linux-git/drivers/of/of_pci.c:121:2: error: implicit declaration of
function ‘of_pci_range_parser_init’
[-Werror=implicit-function-declaration]
err = of_pci_range_parser_init(&parser, dev);
The patch below fixes the errors.
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 55d8320..da88dac 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -2,6 +2,7 @@
#include <linux/export.h>
#include <linux/of.h>
#include <linux/of_pci.h>
+#include <linux/of_address.h>
static inline int __of_pci_pci_compare(struct device_node *node,
unsigned int data)
> Best regards,
> Liviu
>
>
> --
> ====================
> | I would like to |
> | fix the world, |
> | but they're not |
> | giving me the |
> \ source code! /
> ---------------
> ¯\_(ツ)_/¯
>
On Friday, July 11, 2014 9:44 AM, Tanmay Inamdar wrote:
> On Tue, Jul 8, 2014 at 10:18 AM, Liviu Dudau <[email protected]> wrote:
> > On Sun, Jul 06, 2014 at 04:23:43PM +0100, Rob Herring wrote:
> >> On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> >> > This is my resurected attempt at adding support for generic PCI host
> >> > bridge controllers that make use of device tree information to
> >> > configure themselves. I've tagged it as v8 although the patches
> >> > have now been reshuffled in order to ease adoption so referring to
> >> > the older versions might be a bit of a hoop jumping exercise.
> >> >
> >> > Changes from v7:
> >> > - Reordered the patches so that fixes and non-controversial patches
> >> > from v7 can be accepted more easily. If agreed I can split the
> >> > series again into patches that can be upstreamed easily and ones
> >> > that still need discussion.
> >> > - Moved the of_create_host_bridge() function to drivers/of/of_pci.c
> >> > to better reflect its target use.
> >> > - Added the function to remap the bus I/O resources that used to be
> >> > provided in my arm64 patch series and (re)named it pci_remap_iospace()
> >> > - Removed error code checking from parsing and mapping of IRQ from DT
> >> > in recognition that some PCI devices will not have legacy IRQ mappings.
> >> >
> >> > v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
> >>
> >> Can you publish a branch for this series please.
> >>
> >> Rob
> >>
> >
> > Hi Rob,
> >
> > I have pushed a branch that matches my v8 patchset +1 obvious missing header include
> > here: http://www.linux-arm.org/git?p=linux-ld.git;a=shortlog;h=refs/heads/for-upstream/pci_v8
> >
>
> I was still getting the following compilation errors after applying the arm64
> PCI headers. Please let me know if I am missing something.
>
> linux-git/drivers/of/of_pci.c: In function ‘pci_host_bridge_of_get_ranges’:
> linux-git/drivers/of/of_pci.c:114:22: error: storage size of ‘range’ isn’t known
> struct of_pci_range range;
> ^
> linux-git/drivers/of/of_pci.c:115:29: error: storage size of ‘parser’
> isn’t known
> struct of_pci_range_parser parser;
> ^
> linux-git/drivers/of/of_pci.c:121:2: error: implicit declaration of
> function ‘of_pci_range_parser_init’
> [-Werror=implicit-function-declaration]
> err = of_pci_range_parser_init(&parser, dev);
>
>
> The patch below fixes the errors.
>
> diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> index 55d8320..da88dac 100644
> --- a/drivers/of/of_pci.c
> +++ b/drivers/of/of_pci.c
> @@ -2,6 +2,7 @@
> #include <linux/export.h>
> #include <linux/of.h>
> #include <linux/of_pci.h>
> +#include <linux/of_address.h>
Yes, right. I also hit the build errors mentioned above.
"of_address.h" should be included in order to fix them.
However, for readability, the following ordering would be better.
#include <linux/of.h>
+#include <linux/of_address.h>
#include <linux/of_pci.h>
Best regards,
Jingoo Han
>
> static inline int __of_pci_pci_compare(struct device_node *node,
> unsigned int data)
>
>
> > Best regards,
> > Liviu
> >
> >
> > --
> > ====================
> > | I would like to |
> > | fix the world, |
> > | but they're not |
> > | giving me the |
> > \ source code! /
> > ---------------
> > ¯\_(ツ)_/¯
> >
On Wednesday, July 02, 2014 3:44 AM, Liviu Dudau wrote:
>
> Several platforms use a rather generic version of parsing
> the device tree to find the host bridge ranges. Move the common code
> into the generic PCI code and use it to create a pci_host_bridge
> structure that can be used by arch code.
>
> Based on early attempts by Andrew Murray to unify the code.
> Used powerpc and microblaze PCI code as starting point.
>
> Signed-off-by: Liviu Dudau <[email protected]>
> Tested-by: Tanmay Inamdar <[email protected]>
> ---
> drivers/of/of_pci.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
> drivers/pci/host-bridge.c | 18 +++++++
> include/linux/of_pci.h | 10 ++++
> include/linux/pci.h | 8 +++
> 4 files changed, 171 insertions(+)
>
> diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> index 8481996..55d8320 100644
> --- a/drivers/of/of_pci.c
> +++ b/drivers/of/of_pci.c
[.....]
> +struct pci_host_bridge *
> +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> +{
> + int err, domain, busno;
> + struct resource *bus_range;
> + struct pci_bus *root_bus;
> + struct pci_host_bridge *bridge;
> + resource_size_t io_base;
> + LIST_HEAD(res);
> +
> + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
> + if (!bus_range)
> + return ERR_PTR(-ENOMEM);
> +
> + domain = of_alias_get_id(parent->of_node, "pci-domain");
> + if (domain == -ENODEV)
> + domain = atomic_inc_return(&domain_nr);
> +
> + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> + if (err) {
> + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> + parent->of_node->full_name);
> + bus_range->start = 0;
> + bus_range->end = 255;
> + bus_range->flags = IORESOURCE_BUS;
> + }
> + busno = bus_range->start;
> + pci_add_resource(&res, bus_range);
> +
> + /* now parse the rest of host bridge bus ranges */
> + err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base);
> + if (err)
> + goto err_create;
> +
> + /* then create the root bus */
> + root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
> + ops, host_data, &res);
> + if (IS_ERR(root_bus)) {
> + err = PTR_ERR(root_bus);
> + goto err_create;
> + }
> +
> + bridge = to_pci_host_bridge(root_bus->bridge);
> + bridge->io_base = io_base;
Hi Liviu Dudau,
Would you fix the following warning?
drivers/of/of_pci.c: In function 'of_create_pci_host_bridge'
drivers/of/of_pci.c:218:18: warning: 'io_base' may be used uninitialized in this function [-Wmaybe-uninitialized]
bridge->io_base = io_base;
Best regards,
Jingoo Han
[.....]
On Fri, Jul 11, 2014 at 08:43:21AM +0100, Jingoo Han wrote:
> On Wednesday, July 02, 2014 3:44 AM, Liviu Dudau wrote:
> >
> > Several platforms use a rather generic version of parsing
> > the device tree to find the host bridge ranges. Move the common code
> > into the generic PCI code and use it to create a pci_host_bridge
> > structure that can be used by arch code.
> >
> > Based on early attempts by Andrew Murray to unify the code.
> > Used powerpc and microblaze PCI code as starting point.
> >
> > Signed-off-by: Liviu Dudau <[email protected]>
> > Tested-by: Tanmay Inamdar <[email protected]>
> > ---
> > drivers/of/of_pci.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
> > drivers/pci/host-bridge.c | 18 +++++++
> > include/linux/of_pci.h | 10 ++++
> > include/linux/pci.h | 8 +++
> > 4 files changed, 171 insertions(+)
> >
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 8481996..55d8320 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
>
> [.....]
>
> > +struct pci_host_bridge *
> > +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> > +{
> > + int err, domain, busno;
> > + struct resource *bus_range;
> > + struct pci_bus *root_bus;
> > + struct pci_host_bridge *bridge;
> > + resource_size_t io_base;
> > + LIST_HEAD(res);
> > +
> > + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL);
> > + if (!bus_range)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + domain = of_alias_get_id(parent->of_node, "pci-domain");
> > + if (domain == -ENODEV)
> > + domain = atomic_inc_return(&domain_nr);
> > +
> > + err = of_pci_parse_bus_range(parent->of_node, bus_range);
> > + if (err) {
> > + dev_info(parent, "No bus range for %s, using default [0-255]\n",
> > + parent->of_node->full_name);
> > + bus_range->start = 0;
> > + bus_range->end = 255;
> > + bus_range->flags = IORESOURCE_BUS;
> > + }
> > + busno = bus_range->start;
> > + pci_add_resource(&res, bus_range);
> > +
> > + /* now parse the rest of host bridge bus ranges */
> > + err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base);
> > + if (err)
> > + goto err_create;
> > +
> > + /* then create the root bus */
> > + root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
> > + ops, host_data, &res);
> > + if (IS_ERR(root_bus)) {
> > + err = PTR_ERR(root_bus);
> > + goto err_create;
> > + }
> > +
> > + bridge = to_pci_host_bridge(root_bus->bridge);
> > + bridge->io_base = io_base;
>
> Hi Liviu Dudau,
>
> Would you fix the following warning?
>
> drivers/of/of_pci.c: In function 'of_create_pci_host_bridge'
> drivers/of/of_pci.c:218:18: warning: 'io_base' may be used uninitialized in this function [-Wmaybe-uninitialized]
> bridge->io_base = io_base;
Yes, I have a simple fix which is to set the initial value to zero when declaring the variable.
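Something like this (untested, just initialising the variable at declaration time):

-	resource_size_t io_base;
+	resource_size_t io_base = 0;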
Best regards,
Liviu
>
> Best regards,
> Jingoo Han
>
> [.....]
>
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Fri, Jul 11, 2014 at 08:33:23AM +0100, Jingoo Han wrote:
> On Friday, July 11, 2014 9:44 AM, Tanmay Inamdar wrote:
> > On Tue, Jul 8, 2014 at 10:18 AM, Liviu Dudau <[email protected]> wrote:
> > > On Sun, Jul 06, 2014 at 04:23:43PM +0100, Rob Herring wrote:
> > >> On Tue, Jul 1, 2014 at 1:43 PM, Liviu Dudau <[email protected]> wrote:
> > >> > This is my resurected attempt at adding support for generic PCI host
> > >> > bridge controllers that make use of device tree information to
> > >> > configure themselves. I've tagged it as v8 although the patches
> > >> > have now been reshuffled in order to ease adoption so referring to
> > >> > the older versions might be a bit of a hoop jumping exercise.
> > >> >
> > >> > Changes from v7:
> > >> > - Reordered the patches so that fixes and non-controversial patches
> > >> > from v7 can be accepted more easily. If agreed I can split the
> > >> > series again into patches that can be upstreamed easily and ones
> > >> > that still need discussion.
> > >> > - Moved the of_create_host_bridge() function to drivers/of/of_pci.c
> > >> > to better reflect its target use.
> > >> > - Added the function to remap the bus I/O resources that used to be
> > >> > provided in my arm64 patch series and (re)named it pci_remap_iospace()
> > >> > - Removed error code checking from parsing and mapping of IRQ from DT
> > >> > in recognition that some PCI devices will not have legacy IRQ mappings.
> > >> >
> > >> > v7 thread here with all the historic information: https://lkml.org/lkml/2014/3/14/279
> > >>
> > >> Can you publish a branch for this series please.
> > >>
> > >> Rob
> > >>
> > >
> > > Hi Rob,
> > >
> > > I have pushed a brach that matches my v8 patchset +1 obvious missing header include
> > > here: http://www.linux-arm.org/git?p=linux-ld.git;a=shortlog;h=refs/heads/for-upstream/pci_v8
> > >
> >
> > I was still getting following compilation error after applying arm64
> > pci headers. Please let me know if I am missing something.
> >
> > linux-git/drivers/of/of_pci.c: In function ‘pci_host_bridge_of_get_ranges’:
> > linux-git/drivers/of/of_pci.c:114:22: error: storage size of ‘range’ isn’t known
> > struct of_pci_range range;
> > ^
> > linux-git/drivers/of/of_pci.c:115:29: error: storage size of ‘parser’
> > isn’t known
> > struct of_pci_range_parser parser;
> > ^
> > linux-git/drivers/of/of_pci.c:121:2: error: implicit declaration of
> > function ‘of_pci_range_parser_init’
> > [-Werror=implicit-function-declaration]
> > err = of_pci_range_parser_init(&parser, dev);
> >
> >
> > Below patch fixes the errors.
> >
> > diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> > index 55d8320..da88dac 100644
> > --- a/drivers/of/of_pci.c
> > +++ b/drivers/of/of_pci.c
> > @@ -2,6 +2,7 @@
> > #include <linux/export.h>
> > #include <linux/of.h>
> > #include <linux/of_pci.h>
> > +#include <linux/of_address.h>
>
> Yes, right. I also found the build errors as above mentioned.
> "of_address.h" should be included, in order to fix the build errors.
> However, for readability, the following would be better.
>
> #include <linux/of.h>
> +#include <linux/of_address.h>
> #include <linux/of_pci.h>
>
> Best regards,
> Jingoo Han
Thanks, guys! Like I said, it was not my day when I submitted the v8 series.
I cherry-picked the patches into a clean branch but forgot to be thorough
and was compiling in the old working directory.
Sorry,
Liviu
>
> >
> > static inline int __of_pci_pci_compare(struct device_node *node,
> > unsigned int data)
> >
> >
> > > Best regards,
> > > Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
> On Thu, Jul 10, 2014 at 3:47 AM, Liviu Dudau <[email protected]> wrote:
>
> > I don't see a way out of adding new PCI interfaces if we want to have support in
> > the PCI framework for unifying existing architectures. Of course, there is the painful
> > alternative of changing the existing APIs and fixing arches in one go, but like you've
> > said is going to be messy. I don't think I (or the people and companies wanting PCIe
> > on arm64) should cop out and pick a quick fix that adds sysdata structure into arm64
> > just to avoid new APIs, as this is not going to help anyone in long term. What I can
> > do is to create a set of parallel APIs for pci_{scan,create}_root_bus() that take
> > a pci_host_bridge pointer and start converting architectures one by one to that API
> > while deprecating the existing one. That way we can add arm64 easily as it would be
> > the first architecture to use new code without breaking things *and* we provide a
> > migration path.
>
> A lot of the v7 discussion was about pci_register_io_range(). I
> apologize, because I think I really derailed things there and it was
> unwarranted. Arnd was right that migrating other arches should be a
> separate effort. I *think* I was probably thinking about the proposal
> of adding pci_create_root_bus_in_domain(), and my reservations about
> that got transferred to the pci_register_io_range() discussion. In
> any case, I'm completely fine with pci_register_io_range() now.
>
> Most of the rest of the v7 discussion was about "Introduce a domain
> number for pci_host_bridge." I think we should add arm64 using the
> existing pci_scan_root_bus() and keep the domain number in the arm64
> sysdata structure like every other arch does. Isn't that feasible?
> We can worry about domain unification later.
Thanks!
I'm really not that keen to add sysdata support in the arch code, as it
requires initialisation code that I have tried to eliminate. What I'm
going to suggest for my v9 is a parallel set of APIs that arm64 will be
the first to use, without changing the existing pci_{scan,create}_root_bus()
functions; the conversion process will then migrate arches to the new API.
Best regards,
Liviu
>
> I haven't followed closely enough to know what other objections people had.
>
> Bjorn
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
> Most of the rest of the v7 discussion was about "Introduce a domain
> number for pci_host_bridge." I think we should add arm64 using the
> existing pci_scan_root_bus() and keep the domain number in the arm64
> sysdata structure like every other arch does. Isn't that feasible?
> We can worry about domain unification later.
I think that's what we were trying to avoid, adding an arm64-specific
pci_sys_data structure (and arm64-specific API). IIUC, avoiding this
would allow the host controller drivers to use the sysdata pointer for
their own private data structures.
Also since you can specify the domain number via DT (and in Liviu's
v8 patches read by of_create_pci_host_bridge), I think it would make
sense to have it stored in some generic data structures (e.g.
pci_host_bridge) rather than in an arm64 private sysdata.
(Liviu is thinking of an alternative API but maybe he could briefly
describe it here before posting a new series)
--
Catalin
On Fri, Jul 11, 2014 at 03:11:16PM +0100, Catalin Marinas wrote:
> On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
> > Most of the rest of the v7 discussion was about "Introduce a domain
> > number for pci_host_bridge." I think we should add arm64 using the
> > existing pci_scan_root_bus() and keep the domain number in the arm64
> > sysdata structure like every other arch does. Isn't that feasible?
> > We can worry about domain unification later.
>
> I think that's what we were trying to avoid, adding an arm64-specific
> pci_sys_data structure (and arm64-specific API). IIUC, avoiding this
> would allow the host controller drivers to use the sysdata pointer for
> their own private data structures.
>
> Also since you can specify the domain number via DT (and in Liviu's
> v8 patches read by of_create_pci_host_bridge), I think it would make
> sense to have it stored in some generic data structures (e.g.
> pci_host_bridge) rather than in an arm64 private sysdata.
>
> (Liviu is thinking of an alternative API but maybe he could briefly
> describe it here before posting a new series)
>
> --
> Catalin
My plan is to keep the domain number in the pci_host_bridge and split
the creation of the pci_host_bridge out of pci_create_root_bus().
The new function (tentatively called pci_create_new_root_bus()) will
no longer call pci_alloc_host_bridge() but will accept one as a
parameter, allowing the caller to set the domain_nr ahead of the
root bus creation.
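A rough sketch of the intended call sequence (names and signatures are
tentative and untested):

	struct pci_host_bridge *bridge;
	struct pci_bus *root_bus;

	bridge = pci_alloc_host_bridge(parent);	/* tentative: reworked to take the parent device and exported */
	if (!bridge)
		return NULL;
	bridge->domain_nr = domain;
	root_bus = pci_create_new_root_bus(bridge, busno, ops, host_data, &res);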
Best regards,
Liviu
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Fri, Jul 11, 2014 at 04:08:23PM +0100, Liviu Dudau wrote:
> On Fri, Jul 11, 2014 at 03:11:16PM +0100, Catalin Marinas wrote:
> > On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
> > > Most of the rest of the v7 discussion was about "Introduce a domain
> > > number for pci_host_bridge." I think we should add arm64 using the
> > > existing pci_scan_root_bus() and keep the domain number in the arm64
> > > sysdata structure like every other arch does. Isn't that feasible?
> > > We can worry about domain unification later.
> >
> > I think that's what we were trying to avoid, adding an arm64-specific
> > pci_sys_data structure (and arm64-specific API). IIUC, avoiding this
> > would allow the host controller drivers to use the sysdata pointer for
> > their own private data structures.
> >
> > Also since you can specify the domain number via DT (and in Liviu's
> > v8 patches read by of_create_pci_host_bridge), I think it would make
> > sense to have it stored in some generic data structures (e.g.
> > pci_host_bridge) rather than in an arm64 private sysdata.
> >
> > (Liviu is thinking of an alternative API but maybe he could briefly
> > describe it here before posting a new series)
>
> My plan is to keep the domain number in the pci_host_bridge and split
> the creation of the pci_host_bridge out of the pci_create_root_bus().
Wouldn't it make more sense to add domain_nr to the pci_bus structure
(well, only needed for the root bus)? It would simplify pci_domain_nr()
as well which only takes a pci_bus parameter.
> The new function (tentatively called pci_create_new_root_bus()) will
> no longer call pci_alloc_host_bridge() but will accept it as a
> parameter, allowing one to be able to set the domain_nr ahead of the
> root bus creation.
If we place domain_nr in pci_bus, this split wouldn't help but we still
need your original pci_create_root_bus_in_domain(). Are there other uses
of your proposal above?
Yet another alternative is to ignore PCI domains altogether (domain 0
always).
--
Catalin
On Fri, Jul 11, 2014 at 8:11 AM, Catalin Marinas
<[email protected]> wrote:
> On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
>> Most of the rest of the v7 discussion was about "Introduce a domain
>> number for pci_host_bridge." I think we should add arm64 using the
>> existing pci_scan_root_bus() and keep the domain number in the arm64
>> sysdata structure like every other arch does. Isn't that feasible?
>> We can worry about domain unification later.
>
> I think that's what we were trying to avoid, adding an arm64-specific
> pci_sys_data structure (and arm64-specific API). IIUC, avoiding this
> would allow the host controller drivers to use the sysdata pointer for
> their own private data structures.
>
> Also since you can specify the domain number via DT (and in Liviu's
> v8 patches read by of_create_pci_host_bridge), I think it would make
> sense to have it stored in some generic data structures (e.g.
> pci_host_bridge) rather than in an arm64 private sysdata.
It would definitely be nice to keep the domain in a generic data
structure rather than an arm64-specific one. But every other arch
keeps it in an arch-specific structure today, and I think following
that existing pattern is the quickest way forward.
But you mentioned arm64-specific API, too. What do you have in mind
there? I know there will be arm64 implementations of various
pcibios_*() functions (just like for every other arch), but it sounds
like there might be something else?
Bjorn
On Fri, Jul 11, 2014 at 06:02:56PM +0100, Bjorn Helgaas wrote:
> On Fri, Jul 11, 2014 at 8:11 AM, Catalin Marinas
> <[email protected]> wrote:
> > On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
> >> Most of the rest of the v7 discussion was about "Introduce a domain
> >> number for pci_host_bridge." I think we should add arm64 using the
> >> existing pci_scan_root_bus() and keep the domain number in the arm64
> >> sysdata structure like every other arch does. Isn't that feasible?
> >> We can worry about domain unification later.
> >
> > I think that's what we were trying to avoid, adding an arm64-specific
> > pci_sys_data structure (and arm64-specific API). IIUC, avoiding this
> > would allow the host controller drivers to use the sysdata pointer for
> > their own private data structures.
> >
> > Also since you can specify the domain number via DT (and in Liviu's
> > v8 patches read by of_create_pci_host_bridge), I think it would make
> > sense to have it stored in some generic data structures (e.g.
> > pci_host_bridge) rather than in an arm64 private sysdata.
>
> It would definitely be nice to keep the domain in a generic data
> structure rather than an arm64-specific one. But every other arch
> keeps it in an arch-specific structure today, and I think following
> that existing pattern is the quickest way forward.
In this case we end up with an arm64-specific struct pci_sys_data and,
I assume, some API that takes care of this data structure to populate the
domain nr.
In Liviu's implementation, of_create_pci_host_bridge() is called by the
host controller driver directly and reads the domain_nr from the DT. It
also gets a void *host_data which it stores as sysdata in the pci_bus
structure (but that's specific to the host controller driver rather than
arm64). Since sysdata is opaque to of_create_pci_host_bridge(), it
cannot set the domain_nr.
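(For illustration only, a host controller probe using Liviu's API would look
roughly like the untested sketch below; the foo_pcie names are made up:)

	static int foo_pcie_probe(struct platform_device *pdev)
	{
		struct foo_pcie *port;		/* driver-private data */
		struct pci_host_bridge *bridge;

		port = devm_kzalloc(&pdev->dev, sizeof(*port), GFP_KERNEL);
		if (!port)
			return -ENOMEM;

		/* domain_nr, bus range and I/O/MEM ranges come from pdev->dev.of_node */
		bridge = of_create_pci_host_bridge(&pdev->dev, &foo_pcie_ops, port);
		if (IS_ERR(bridge))
			return PTR_ERR(bridge);

		return 0;
	}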
If we go for arm64 sysdata, the host controller driver would have to
call some arm64 pci* function (maybe with its own private data as
argument) which would have to parse the DT, assign the domain nr and
eventually call pci_create_root_bus(). But it means that a significant
part of of_create_pci_host_bridge() would be moved under arch/arm64 (an
alternative is for each host driver to implement its own
of_create_pci_host_bridge()).
In addition, host drivers would retrieve their private data from the
arm64-specific sysdata structure, and we were trying to make host drivers
depend only on generic data structures and APIs rather than arch-specific
ones.
> But you mentioned arm64-specific API, too. What do you have in mind
> there? I know there will be arm64 implementations of various
> pcibios_*() functions (just like for every other arch), but it sounds
> like there might be something else?
Not much is left which is arm64-specific, just some callbacks:
http://linux-arm.org/git?p=linux-ld.git;a=commitdiff;h=82ebbce34506676528b9a7ae8f8fbc84b6b6248e
AFAICT, there isn't anything that host drivers need to call directly.
Basically we want to decouple the PCI host driver model from the arch
specific data structures/API.
--
Catalin
On Fri, Jul 11, 2014 at 07:02:06PM +0100, Catalin Marinas wrote:
> On Fri, Jul 11, 2014 at 06:02:56PM +0100, Bjorn Helgaas wrote:
> > On Fri, Jul 11, 2014 at 8:11 AM, Catalin Marinas
> > <[email protected]> wrote:
> > > On Thu, Jul 10, 2014 at 11:36:10PM +0100, Bjorn Helgaas wrote:
> > >> Most of the rest of the v7 discussion was about "Introduce a domain
> > >> number for pci_host_bridge." I think we should add arm64 using the
> > >> existing pci_scan_root_bus() and keep the domain number in the arm64
> > >> sysdata structure like every other arch does. Isn't that feasible?
> > >> We can worry about domain unification later.
> > >
> > > I think that's what we were trying to avoid, adding an arm64-specific
> > > pci_sys_data structure (and arm64-specific API). IIUC, avoiding this
> > > would allow the host controller drivers to use the sysdata pointer for
> > > their own private data structures.
> > >
> > > Also since you can specify the domain number via DT (and in Liviu's
> > > v8 patches read by of_create_pci_host_bridge), I think it would make
> > > sense to have it stored in some generic data structures (e.g.
> > > pci_host_bridge) rather than in an arm64 private sysdata.
> >
> > It would definitely be nice to keep the domain in a generic data
> > structure rather than an arm64-specific one. But every other arch
> > keeps it in an arch-specific structure today, and I think following
> > that existing pattern is the quickest way forward.
>
> In this case we end up with an arm64-specific struct pci_sys_data and I
> assume some API that takes care of this data structure to populate the
> domain nr.
>
> In Liviu's implementation, of_create_pci_host_bridge() is called by the
> host controller driver directly and reads the domain_nr from the DT. It
> also gets a void *host_data which it stores as sysdata in the pci_bus
> structure (but that's specific to the host controller driver rather than
> arm64). Since sysdata is opaque to of_create_pci_host_bridge(), it
> cannot set the domain_nr.
After some more thinking, I guess we could get away without changing the
API. On top of Liviu's tree here:
http://linux-arm.org/git?p=linux-ld.git;a=shortlog;h=refs/heads/for-upstream/pci_v8
I reverted "pci: Introduce a domain number for pci_host_bridge.":
http://linux-arm.org/git?p=linux-ld.git;a=commitdiff;h=b44e1c7d6b01c436f6f55662a1414e925161c9ca
and added this patch on top (if you agree with the idea, we can split it
nicely in arm64, OF and PCI specific parts). What we get is the
domain_nr in a generic structure and free the sysdata pointer for the
host controller driver.
----------------8<----------------------------------------
From b32606aa3997fc8a45014a64f99e921eef4872b0 Mon Sep 17 00:00:00 2001
From: Catalin Marinas <[email protected]>
Date: Mon, 14 Jul 2014 17:20:01 +0100
Subject: [PATCH] pci: Add support for generic domain_nr in pci_bus
This patch adds domain_nr in struct pci_bus if
CONFIG_PCI_DOMAINS_GENERIC is enabled. The default implementation for
pci_domain_nr() simply returns bus->domain_nr. For the root bus, the
core PCI code calls pci_set_domain_nr(bus, parent_device) while the
child buses inherit the domain nr of the parent bus.
This patch also adds an of_pci_set_domain_nr() implementation which
parses the device tree for the "pci-domain" property or sets domain_nr
to the next available value (this function could also be implemented
entirely in arm64).
Signed-off-by: Catalin Marinas <[email protected]>
---
arch/arm64/Kconfig | 3 +++
arch/arm64/include/asm/pci.h | 10 ----------
arch/arm64/kernel/pci.c | 5 +++++
drivers/of/of_pci.c | 20 +++++++++++++-------
drivers/pci/probe.c | 11 ++++++++---
include/linux/of_pci.h | 5 +++++
include/linux/pci.h | 15 +++++++++++++++
7 files changed, 49 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 48ed631adde2..2c884f7453ba 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -160,6 +160,9 @@ config PCI
config PCI_DOMAINS
def_bool PCI
+config PCI_DOMAINS_GENERIC
+ def_bool PCI
+
config PCI_SYSCALL
def_bool PCI
diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index 3f7856e92d66..4f091a5135b7 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -29,16 +29,6 @@ struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus);
extern int isa_dma_bridge_buggy;
#ifdef CONFIG_PCI
-static inline int pci_domain_nr(struct pci_bus *bus)
-{
- struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
-
- if (bridge)
- return bridge->domain_nr;
-
- return 0;
-}
-
static inline int pci_proc_domain(struct pci_bus *bus)
{
return 1;
diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index 955d6d1cb011..d5ed1afb0d88 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -36,3 +36,8 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res,
{
return res->start;
}
+
+void pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
+{
+ of_pci_set_domain_nr(bus, parent);
+}
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index e81402af5cde..54f06b748bf1 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -175,7 +175,7 @@ static atomic_t domain_nr = ATOMIC_INIT(-1);
struct pci_host_bridge *
of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
{
- int err, domain, busno;
+ int err, busno;
struct resource *bus_range;
struct pci_bus *root_bus;
struct pci_host_bridge *bridge;
@@ -186,10 +186,6 @@ of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host
if (!bus_range)
return ERR_PTR(-ENOMEM);
- domain = of_alias_get_id(parent->of_node, "pci-domain");
- if (domain == -ENODEV)
- domain = atomic_inc_return(&domain_nr);
-
err = of_pci_parse_bus_range(parent->of_node, bus_range);
if (err) {
dev_info(parent, "No bus range for %s, using default [0-255]\n",
@@ -207,8 +203,7 @@ of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host
goto err_create;
/* then create the root bus */
- root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
- ops, host_data, &res);
+ root_bus = pci_create_root_bus(parent, busno, ops, host_data, &res);
if (IS_ERR(root_bus)) {
err = PTR_ERR(root_bus);
goto err_create;
@@ -225,6 +220,17 @@ err_create:
}
EXPORT_SYMBOL_GPL(of_create_pci_host_bridge);
+void of_pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
+{
+ int domain;
+
+ domain = of_alias_get_id(parent->of_node, "pci-domain");
+ if (domain == -ENODEV)
+ domain = atomic_inc_return(&domain_nr);
+
+ bus->domain_nr = domain;
+}
+
#ifdef CONFIG_PCI_MSI
static LIST_HEAD(of_pci_msi_chip_list);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2c9266237edc..aa30a9e8915d 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -485,7 +485,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
}
}
-static struct pci_bus *pci_alloc_bus(void)
+static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
{
struct pci_bus *b;
@@ -500,6 +500,10 @@ static struct pci_bus *pci_alloc_bus(void)
INIT_LIST_HEAD(&b->resources);
b->max_bus_speed = PCI_SPEED_UNKNOWN;
b->cur_bus_speed = PCI_SPEED_UNKNOWN;
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+ if (parent)
+ b->domain_nr = parent->domain_nr;
+#endif
return b;
}
@@ -670,7 +674,7 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
/*
* Allocate a new bus, and inherit stuff from the parent..
*/
- child = pci_alloc_bus();
+ child = pci_alloc_bus(parent);
if (!child)
return NULL;
@@ -1767,13 +1771,14 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
bridge->dev.parent = parent;
bridge->dev.release = pci_release_host_bridge_dev;
- b = pci_alloc_bus();
+ b = pci_alloc_bus(NULL);
if (!b)
goto err_out;
b->sysdata = sysdata;
b->ops = ops;
b->number = b->busn_res.start = bus;
+ pci_set_domain_nr(b, parent);
b2 = pci_find_bus(pci_domain_nr(b), bus);
if (b2) {
/* If we already got to this bus through a different bridge, ignore it */
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 71e36d091db2..af16ac40c7a2 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -17,6 +17,7 @@ int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
struct pci_ops *ops, void *host_data);
+void of_pci_set_domain_nr(struct pci_bus *bus, struct device *parent);
#else
static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
@@ -53,6 +54,10 @@ of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
{
return NULL;
}
+
+static inline void of_pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
+{
+}
#endif
#if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d32b4ed1f411..9113f62c5038 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -457,6 +457,9 @@ struct pci_bus {
unsigned char primary; /* number of primary bridge */
unsigned char max_bus_speed; /* enum pci_bus_speed */
unsigned char cur_bus_speed; /* enum pci_bus_speed */
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+ int domain_nr;
+#endif
char name[48];
@@ -1292,6 +1295,18 @@ static inline int pci_domain_nr(struct pci_bus *bus) { return 0; }
static inline int pci_proc_domain(struct pci_bus *bus) { return 0; }
#endif /* CONFIG_PCI_DOMAINS */
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+static inline int pci_domain_nr(struct pci_bus *bus)
+{
+ return bus->domain_nr;
+}
+extern void pci_set_domain_nr(struct pci_bus *bus, struct device *parent);
+#else
+static inline void pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
+{
+}
+#endif
+
/* some architectures require additional setup to direct VGA traffic */
typedef int (*arch_set_vga_state_t)(struct pci_dev *pdev, bool decode,
unsigned int command_bits, u32 flags);
On Tue, Jul 01, 2014 at 07:43:34PM +0100, Liviu Dudau wrote:
> Introduce a default implementation for remapping PCI bus I/O resources
> onto the CPU address space. Architectures with special needs may
> provide their own version, but most should be able to use this one.
[...]
> +/**
> + * pci_remap_iospace - Remap the memory mapped I/O space
> + * @res: Resource describing the I/O space
> + * @phys_addr: physical address where the range will be mapped.
> + *
> + * Remap the memory mapped I/O space described by the @res
> + * into the CPU physical address space. Only architectures
> + * that have memory mapped IO defined (and hence PCI_IOBASE)
> + * should call this function.
> + */
> +int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> +{
> + int err = -ENODEV;
> +
> +#ifdef PCI_IOBASE
> + if (!(res->flags & IORESOURCE_IO))
> + return -EINVAL;
> +
> + if (res->end > IO_SPACE_LIMIT)
> + return -EINVAL;
> +
> + err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
> + res->end + 1 + (unsigned long)PCI_IOBASE,
> + phys_addr, __pgprot(PROT_DEVICE_nGnRE));
Except that PROT_DEVICE_nGnRE is arm64 only. I think that's a function
that should remain arch specific.
--
Catalin
On Mon, Jul 14, 2014 at 05:54:43PM +0100, Catalin Marinas wrote:
> On Tue, Jul 01, 2014 at 07:43:34PM +0100, Liviu Dudau wrote:
> > Introduce a default implementation for remapping PCI bus I/O resources
> > onto the CPU address space. Architectures with special needs may
> > provide their own version, but most should be able to use this one.
> [...]
> > +/**
> > + * pci_remap_iospace - Remap the memory mapped I/O space
> > + * @res: Resource describing the I/O space
> > + * @phys_addr: physical address where the range will be mapped.
> > + *
> > + * Remap the memory mapped I/O space described by the @res
> > + * into the CPU physical address space. Only architectures
> > + * that have memory mapped IO defined (and hence PCI_IOBASE)
> > + * should call this function.
> > + */
> > +int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> > +{
> > + int err = -ENODEV;
> > +
> > +#ifdef PCI_IOBASE
> > + if (!(res->flags & IORESOURCE_IO))
> > + return -EINVAL;
> > +
> > + if (res->end > IO_SPACE_LIMIT)
> > + return -EINVAL;
> > +
> > + err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
> > + res->end + 1 + (unsigned long)PCI_IOBASE,
> > + phys_addr, __pgprot(PROT_DEVICE_nGnRE));
>
> Except that PROT_DEVICE_nGnRE is arm64 only. I think that's a function
> that should remain arch specific.
Yes, I was following Arnd's suggestion and lost track of the fact that
PROT_DEVICE_nGnRE is arm64 specific.
Best regards,
Liviu
>
> --
> Catalin
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Monday 14 July 2014 17:54:43 Catalin Marinas wrote:
> On Tue, Jul 01, 2014 at 07:43:34PM +0100, Liviu Dudau wrote:
> > Introduce a default implementation for remapping PCI bus I/O resources
> > onto the CPU address space. Architectures with special needs may
> > provide their own version, but most should be able to use this one.
> [...]
> > +/**
> > + * pci_remap_iospace - Remap the memory mapped I/O space
> > + * @res: Resource describing the I/O space
> > + * @phys_addr: physical address where the range will be mapped.
> > + *
> > + * Remap the memory mapped I/O space described by the @res
> > + * into the CPU physical address space. Only architectures
> > + * that have memory mapped IO defined (and hence PCI_IOBASE)
> > + * should call this function.
> > + */
> > +int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> > +{
> > + int err = -ENODEV;
> > +
> > +#ifdef PCI_IOBASE
> > + if (!(res->flags & IORESOURCE_IO))
> > + return -EINVAL;
> > +
> > + if (res->end > IO_SPACE_LIMIT)
> > + return -EINVAL;
> > +
> > + err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
> > + res->end + 1 + (unsigned long)PCI_IOBASE,
> > + phys_addr, __pgprot(PROT_DEVICE_nGnRE));
>
> Except that PROT_DEVICE_nGnRE is arm64 only. I think that's a function
> that should remain arch specific.
>
How about #defining a macro with the correct pgprot value in asm/pci.h
or asm/pgtable.h?
We can provide a default for that in another architecture independent
location.
Arnd
On Mon, Jul 14, 2014 at 08:15:48PM +0200, Arnd Bergmann wrote:
> On Monday 14 July 2014 17:54:43 Catalin Marinas wrote:
> > On Tue, Jul 01, 2014 at 07:43:34PM +0100, Liviu Dudau wrote:
> > > Introduce a default implementation for remapping PCI bus I/O resources
> > > onto the CPU address space. Architectures with special needs may
> > > provide their own version, but most should be able to use this one.
> > [...]
> > > +/**
> > > + * pci_remap_iospace - Remap the memory mapped I/O space
> > > + * @res: Resource describing the I/O space
> > > + * @phys_addr: physical address where the range will be mapped.
> > > + *
> > > + * Remap the memory mapped I/O space described by the @res
> > > + * into the CPU physical address space. Only architectures
> > > + * that have memory mapped IO defined (and hence PCI_IOBASE)
> > > + * should call this function.
> > > + */
> > > +int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> > > +{
> > > + int err = -ENODEV;
> > > +
> > > +#ifdef PCI_IOBASE
> > > + if (!(res->flags & IORESOURCE_IO))
> > > + return -EINVAL;
> > > +
> > > + if (res->end > IO_SPACE_LIMIT)
> > > + return -EINVAL;
> > > +
> > > + err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
> > > + res->end + 1 + (unsigned long)PCI_IOBASE,
> > > + phys_addr, __pgprot(PROT_DEVICE_nGnRE));
> >
> > Except that PROT_DEVICE_nGnRE is arm64 only. I think that's a function
> > that should remain arch specific.
> >
>
> How about #defining a macro with the correct pgprot value in asm/pci.h
> or asm/pgtable.h?
> We can provide a default for that in another architecture independent
> location.
I was discussing the same thing with Catalin today. It is the most reasonable
approach, as the host bridge driver that is likely to call this function should
not need to be aware of the architectural flags used here.
Best regards,
Liviu
>
> Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
-------------------
.oooO
( )
\ ( Oooo.
\_) ( )
) /
(_/
One small step
for me ...
On Mon, Jul 14, 2014 at 07:15:48PM +0100, Arnd Bergmann wrote:
> On Monday 14 July 2014 17:54:43 Catalin Marinas wrote:
> > On Tue, Jul 01, 2014 at 07:43:34PM +0100, Liviu Dudau wrote:
> > > Introduce a default implementation for remapping PCI bus I/O resources
> > > onto the CPU address space. Architectures with special needs may
> > > provide their own version, but most should be able to use this one.
> > [...]
> > > +/**
> > > + * pci_remap_iospace - Remap the memory mapped I/O space
> > > + * @res: Resource describing the I/O space
> > > + * @phys_addr: physical address where the range will be mapped.
> > > + *
> > > + * Remap the memory mapped I/O space described by the @res
> > > + * into the CPU physical address space. Only architectures
> > > + * that have memory mapped IO defined (and hence PCI_IOBASE)
> > > + * should call this function.
> > > + */
> > > +int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> > > +{
> > > + int err = -ENODEV;
> > > +
> > > +#ifdef PCI_IOBASE
> > > + if (!(res->flags & IORESOURCE_IO))
> > > + return -EINVAL;
> > > +
> > > + if (res->end > IO_SPACE_LIMIT)
> > > + return -EINVAL;
> > > +
> > > + err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
> > > + res->end + 1 + (unsigned long)PCI_IOBASE,
> > > + phys_addr, __pgprot(PROT_DEVICE_nGnRE));
> >
> > Except that PROT_DEVICE_nGnRE is arm64 only. I think that's a function
> > that should remain arch specific.
> >
>
> How about #defining a macro with the correct pgprot value in asm/pci.h
> or asm/pgtable.h?
> We can provide a default for that in another architecture independent
> location.
That should work. We already have pgprot_noncached/writecombine in
asm-generic/pgtable.h which all architectures seem to include. The
default can be pgprot_noncached and we would invoke it as
pgprot_device(PAGE_KERNEL).
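Roughly (untested sketch):

	/* asm-generic/pgtable.h: default definition, overridable per arch */
	#ifndef pgprot_device
	#define pgprot_device(prot)	pgprot_noncached(prot)
	#endif

and pci_remap_iospace() would then use it like:

	err = ioremap_page_range(res->start + (unsigned long)PCI_IOBASE,
				 res->end + 1 + (unsigned long)PCI_IOBASE,
				 phys_addr, pgprot_device(PAGE_KERNEL));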
--
Catalin
On Wed, Jul 9, 2014 at 3:31 AM, Arnd Bergmann <[email protected]> wrote:
> On Tuesday 08 July 2014, Liviu Dudau wrote:
>> On Mon, Jul 07, 2014 at 10:22:00PM +0100, Arnd Bergmann wrote:
>> >
>> > I looked at the other drivers briefly, and I think you indeed fix the Tegra
>> > driver with this but break the integrator driver as mentioned above.
>> > The other callers of of_pci_range_to_resource() are apparently not
>> > impacted as they recalculate the values they get.
>>
>> I would argue that integrator version is having broken assumptions. If it would
>> try to allocate that IO range or request the resource as returned currently by
>> of_pci_range_to_resource (without my patch) it would fail. I know because I did
>> the same thing in my host bridge driver and it failed miserably. That's why I
>> tried to patch it.
>
> The integrator code was just introduced and the reason for how it does things
> is the way that of_pci_range_to_resource() works today. We tried to cope with
> it and not change the existing behavior in order to not break any other drivers.
>
> It's certainly not fair to call the integrator version broken, it just works
> around the common code having a quirky interface. We should probably have
> done of_pci_range_to_resource better than it is today (I would have argued
> for it to return an IORESOURCE_MEM with the CPU address), but it took long
> enough to get that merged and I was sick of arguing about it.
>
>> If the IO space is memory mapped, then we use the port number, the io_offset
>> and the PCI_IOBASE to get to the virtual address that, when accessed, will
>> generate the correct addresses on the bus, based on what the host bridge has
>> been configured.
>>
>> This is the current level of my understanding of PCI IO.
What is io_offset supposed to be and be based on?
> Your understanding is absolutely correct, and that's great because very few
> people get that right. What I think we're really arguing about is what the
> of_pci_range_to_resource is supposed to return. As you and Bjorn both pointed
> out earlier, there are in fact two resources associated with the I/O window
> and the flaw in the current implementation is that of_pci_range_to_resource
> returns the numeric values for the IORESOURCE_MEM resource, but sets the
> type to IORESOURCE_IO, which is offset from that by PCI_IOBASE.
>
> You try to fix that by making it return the correct IORESOURCE_IO resource,
> which is a reasonable approach but you must not break drivers that rely
> on the broken resource while doing that.
>
> The approach that I would have picked is to return the IORESOURCE_MEM
> resource associated with the I/O window and pick a (basically random)
> IORESOURCE_IO resource struct based on what hasn't been used and then
> compute the appropriate io_offset from that. This approach of course
> would also have required fixing up all drivers relying on the current
> behavior.
>
> To be clear, I'm fine with you (and Bjorn if he cares) picking the
> approach you like here, either one of these works fine as long as the
> host drivers use the interface in the way it is defined.
>
>> Now, I believe Rob has switched entirely to using my series in some test that
>> he has run and he hasn't encountered any issues, as long as one remembers in
>> the host bridge driver to add the io_base offset to the .start resource. If
>> not then I need to patch pci_v3.c.
>
> The crazy part of all these discussions is that basically nobody ever uses
> I/O port access, so it's very hard to test and we don't even notice when
> we get it wrong, but we end up spending most of the time for PCI host controller
> reviews trying to get these right.
FWIW, I test i/o accesses with Versatile QEMU. The LSI53xxxx device in
the model has a kconfig option to use i/o accesses. However, I have
seen in the past this is an area where 2 wrongs can make a right.
Rob
On Wed, Jul 16, 2014 at 03:35:37PM +0100, Rob Herring wrote:
> On Wed, Jul 9, 2014 at 3:31 AM, Arnd Bergmann <[email protected]> wrote:
> > On Tuesday 08 July 2014, Liviu Dudau wrote:
> >> On Mon, Jul 07, 2014 at 10:22:00PM +0100, Arnd Bergmann wrote:
> >> >
> >> > I looked at the other drivers briefly, and I think you indeed fix the Tegra
> >> > driver with this but break the integrator driver as mentioned above.
> >> > The other callers of of_pci_range_to_resource() are apparently not
> >> > impacted as they recalculate the values they get.
> >>
> >> I would argue that integrator version is having broken assumptions. If it would
> >> try to allocate that IO range or request the resource as returned currently by
> >> of_pci_range_to_resource (without my patch) it would fail. I know because I did
> >> the same thing in my host bridge driver and it failed miserably. That's why I
> >> tried to patch it.
> >
> > The integrator code was just introduced and the reason for how it does things
> > is the way that of_pci_range_to_resource() works today. We tried to cope with
> > it and not change the existing behavior in order to not break any other drivers.
> >
> > It's certainly not fair to call the integrator version broken, it just works
> > around the common code having a quirky interface. We should probably have
> > done of_pci_range_to_resource better than it is today (I would have argued
> > for it to return an IORESOURCE_MEM with the CPU address), but it took long
> > enough to get that merged and I was sick of arguing about it.
> >
> >> If the IO space is memory mapped, then we use the port number, the io_offset
> >> and the PCI_IOBASE to get to the virtual address that, when accessed, will
> >> generate the correct addresses on the bus, based on what the host bridge has
> >> been configured.
> >>
> >> This is the current level of my understanding of PCI IO.
>
> What is io_offset supposed to be and be based on?
io_offset is the per-host-bridge offset applied to the port number to get the
offset from PCI_IOBASE. Basically, the second host bridge will have port numbers
starting from zero just like the first one in the system, but its io_offset will
be >= the largest port number used by the first host bridge.
>
> > Your understanding is absolutely correct, and that's great because very few
> > people get that right. What I think we're really arguing about is what the
> > of_pci_range_to_resource is supposed to return. As you and Bjorn both pointed
> > out earlier, there are in fact two resources associated with the I/O window
> > and the flaw in the current implementation is that of_pci_range_to_resource
> > returns the numeric values for the IORESOURCE_MEM resource, but sets the
> > type to IORESOURCE_IO, which is offset from that by PCI_IOBASE.
> >
> > You try to fix that by making it return the correct IORESOURCE_IO resource,
> > which is a reasonable approach but you must not break drivers that rely
> > on the broken resource while doing that.
> >
> > The approach that I would have picked is to return the IORESOURCE_MEM
> > resource associated with the I/O window and pick a (basically random)
> > IORESOURCE_IO resource struct based on what hasn't been used and then
> > compute the appropriate io_offset from that. This approach of course
> > would also have required fixing up all drivers relying on the current
> > behavior.
> >
> > To be clear, I'm fine with you (and Bjorn if he cares) picking the
> > approach you like here, either one of these works fine as long as the
> > host drivers use the interface in the way it is defined.
> >
> >> Now, I believe Rob has switched entirely to using my series in some test that
> >> he has run and he hasn't encountered any issues, as long as one remembers in
> >> the host bridge driver to add the io_base offset to the .start resource. If
> >> not then I need to patch pci_v3.c.
> >
> > The crazy part of all these discussions is that basically nobody ever uses
> > I/O port access, so it's very hard to test and we don't even notice when
> > we get it wrong, but we end up spending most of the time for PCI host controller
> > reviews trying to get these right.
>
> FWIW, I test i/o accesses with Versatile QEMU. The LSI53xxxx device in
> the model has a kconfig option to use i/o accesses. However, I have
> seen in the past this is an area where 2 wrongs can make a right.
:)
Best regards,
Liviu
>
> Rob
>
--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
On Wednesday 16 July 2014 09:35:37 Rob Herring wrote:
> On Wed, Jul 9, 2014 at 3:31 AM, Arnd Bergmann <[email protected]> wrote:
> > On Tuesday 08 July 2014, Liviu Dudau wrote:
> >> On Mon, Jul 07, 2014 at 10:22:00PM +0100, Arnd Bergmann wrote:
> >> >
> >> > I looked at the other drivers briefly, and I think you indeed fix the Tegra
> >> > driver with this but break the integrator driver as mentioned above.
> >> > The other callers of of_pci_range_to_resource() are apparently not
> >> > impacted as they recalculate the values they get.
> >>
> >> I would argue that integrator version is having broken assumptions. If it would
> >> try to allocate that IO range or request the resource as returned currently by
> >> of_pci_range_to_resource (without my patch) it would fail. I know because I did
> >> the same thing in my host bridge driver and it failed miserably. That's why I
> >> tried to patch it.
> >
> > The integrator code was just introduced and the reason for how it does things
> > is the way that of_pci_range_to_resource() works today. We tried to cope with
> > it and not change the existing behavior in order to not break any other drivers.
> >
> > It's certainly not fair to call the integrator version broken, it just works
> > around the common code having a quirky interface. We should probably have
> > done of_pci_range_to_resource better than it is today (I would have argued
> > for it to return an IORESOURCE_MEM with the CPU address), but it took long
> > enough to get that merged and I was sick of arguing about it.
> >
> >> If the IO space is memory mapped, then we use the port number, the io_offset
> >> and the PCI_IOBASE to get to the virtual address that, when accessed, will
> >> generate the correct addresses on the bus, based on what the host bridge has
> >> been configured.
> >>
> >> This is the current level of my understanding of PCI IO.
>
> What is io_offset supposed to be and be based on?
(you probably know most of this, but I'll explain it the long way
to avoid ambiguity).
io_offset is a concept used internally to translate bus-specific I/O port
numbers into Linux-global ports.
A simple example would be having two PCI host bridges each with a
(hardware) port range from 0 to 0xffff. These numbers are programmed
into "BARs" in PCI device config space and they are used on the physical
address lines in PCI or in the packet header on PCIe.
In Linux, we have a single logical port range that is seen by device
drivers; in the example the first host bridge would use ports 0-0xffff
and the second one would use ports 0x10000-0x1ffff.
The PCI core uses the io_offset to translate between the two address
spaces when it does the resource allocation during bus probe, so a device
that gets Linux I/O port 0x10100 has its BAR programmed with 0x100 and
the struct resource filled with 0x10100.
When a PCI host bridge driver registers its root bus with the PCI core,
it passes the io_offset using the last argument to pci_add_resource_offset()
along with the Linux I/O port resource, so in the example the first
io_offset is zero, while the second one is 0x10000.
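Expressed as a (made-up, untested) fragment for the second host bridge in
the example:

	unsigned long io_offset  = 0x10000;			/* passed to pci_add_resource_offset() */
	unsigned long linux_port = 0x10100;			/* what the driver sees in struct resource */
	unsigned long bus_port   = linux_port - io_offset;	/* 0x100, programmed into the BAR */
	void __iomem *addr       = PCI_IOBASE + linux_port;	/* what inb()/outb() actually access */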
Note that there is no requirement for the I/O port range on the bus to
start at zero, and you can even have negative io_offset values to
deal with that, but this is the exception.
> >> Now, I believe Rob has switched entirely to using my series in some test that
> >> he has run and he hasn't encountered any issues, as long as one remembers in
> >> the host bridge driver to add the io_base offset to the .start resource. If
> >> not then I need to patch pci_v3.c.
> >
> > The crazy part of all these discussions is that basically nobody ever uses
> > I/O port access, so it's very hard to test and we don't even notice when
> > we get it wrong, but we end up spending most of the time for PCI host controller
> > reviews trying to get these right.
>
> FWIW, I test i/o accesses with Versatile QEMU. The LSI53xxxx device in
> the model has a kconfig option to use i/o accesses. However, I have
> seen in the past this is an area where 2 wrongs can make a right.
Can you point me to a git tree with your kernel and dts?
Arnd
On Mon, Jul 14, 2014 at 10:39 AM, Catalin Marinas
<[email protected]> wrote:
> ...
> Some more thinking, so I guess we could get away without changing the
> API. On top of Liviu's tree here:
>
> http://linux-arm.org/git?p=linux-ld.git;a=shortlog;h=refs/heads/for-upstream/pci_v8
>
> I reverted "pci: Introduce a domain number for pci_host_bridge.":
>
> http://linux-arm.org/git?p=linux-ld.git;a=commitdiff;h=b44e1c7d6b01c436f6f55662a1414e925161c9ca
>
> and added this patch on top (if you agree with the idea, we can split it
> nicely in arm64, OF and PCI specific parts). What we get is the
> domain_nr in a generic structure and free the sysdata pointer for the
> host controller driver.
>
> ----------------8<----------------------------------------
> From b32606aa3997fc8a45014a64f99e921eef4872b0 Mon Sep 17 00:00:00 2001
> From: Catalin Marinas <[email protected]>
> Date: Mon, 14 Jul 2014 17:20:01 +0100
> Subject: [PATCH] pci: Add support for generic domain_nr in pci_bus
>
> This patch adds domain_nr in struct pci_bus if
> CONFIG_PCI_DOMAINS_GENERIC is enabled. The default implementation for
> pci_domain_nr() simply returns bus->domain_nr. For the root bus, the
> core PCI code calls pci_set_domain_nr(bus, parent_device) while the
> child buses inherit the domain nr of the parent bus.
>
> This patch also adds an of_pci_set_domain_nr() implementation which
> parses the device tree for the "pci-domain" property or sets domain_nr
> to the next available value (this function could also be implemented
> entirely in arm64).
>
> Signed-off-by: Catalin Marinas <[email protected]>
I like this. It seems like a reasonable step forward. I don't really
like the pci_set_domain_nr() interface because the domain conceptually
exists before the root bus in the domain, but we can deal with that
later.
Tiny nit: please remove the "extern" on the pci_set_domain_nr()
declaration in include/linux/pci.h; we recently removed all the rest
(f39d5b72913e).
I'd really like to see all this stuff in v3.17, but I'm going to be on
vacation for the next three weeks and won't be able to do much until
Aug 11, which is probably going to be in the middle of the merge
window. But maybe the series can be integrated in -next via an ARM
tree or something. If it helps, you can add my
Acked-by: Bjorn Helgaas <[email protected]>
for this piece. I don't remember how many other PCI changes are
involved; maybe my ack on this will be enough? Even if we can't get
it in for the merge window, I'm open to trying to merge it after
v3.17-rc1 if it's isolated enough.
Bjorn
> ---
> arch/arm64/Kconfig | 3 +++
> arch/arm64/include/asm/pci.h | 10 ----------
> arch/arm64/kernel/pci.c | 5 +++++
> drivers/of/of_pci.c | 20 +++++++++++++-------
> drivers/pci/probe.c | 11 ++++++++---
> include/linux/of_pci.h | 5 +++++
> include/linux/pci.h | 15 +++++++++++++++
> 7 files changed, 49 insertions(+), 20 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 48ed631adde2..2c884f7453ba 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -160,6 +160,9 @@ config PCI
> config PCI_DOMAINS
> def_bool PCI
>
> +config PCI_DOMAINS_GENERIC
> + def_bool PCI
> +
> config PCI_SYSCALL
> def_bool PCI
>
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index 3f7856e92d66..4f091a5135b7 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -29,16 +29,6 @@ struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus);
> extern int isa_dma_bridge_buggy;
>
> #ifdef CONFIG_PCI
> -static inline int pci_domain_nr(struct pci_bus *bus)
> -{
> - struct pci_host_bridge *bridge = find_pci_host_bridge(bus);
> -
> - if (bridge)
> - return bridge->domain_nr;
> -
> - return 0;
> -}
> -
> static inline int pci_proc_domain(struct pci_bus *bus)
> {
> return 1;
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 955d6d1cb011..d5ed1afb0d88 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -36,3 +36,8 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res,
> {
> return res->start;
> }
> +
> +void pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
> +{
> + of_pci_set_domain_nr(bus, parent);
> +}
> diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
> index e81402af5cde..54f06b748bf1 100644
> --- a/drivers/of/of_pci.c
> +++ b/drivers/of/of_pci.c
> @@ -175,7 +175,7 @@ static atomic_t domain_nr = ATOMIC_INIT(-1);
> struct pci_host_bridge *
> of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data)
> {
> - int err, domain, busno;
> + int err, busno;
> struct resource *bus_range;
> struct pci_bus *root_bus;
> struct pci_host_bridge *bridge;
> @@ -186,10 +186,6 @@ of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host
> if (!bus_range)
> return ERR_PTR(-ENOMEM);
>
> - domain = of_alias_get_id(parent->of_node, "pci-domain");
> - if (domain == -ENODEV)
> - domain = atomic_inc_return(&domain_nr);
> -
> err = of_pci_parse_bus_range(parent->of_node, bus_range);
> if (err) {
> dev_info(parent, "No bus range for %s, using default [0-255]\n",
> @@ -207,8 +203,7 @@ of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host
> goto err_create;
>
> /* then create the root bus */
> - root_bus = pci_create_root_bus_in_domain(parent, domain, busno,
> - ops, host_data, &res);
> + root_bus = pci_create_root_bus(parent, busno, ops, host_data, &res);
> if (IS_ERR(root_bus)) {
> err = PTR_ERR(root_bus);
> goto err_create;
> @@ -225,6 +220,17 @@ err_create:
> }
> EXPORT_SYMBOL_GPL(of_create_pci_host_bridge);
>
> +void of_pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
> +{
> + int domain;
> +
> + domain = of_alias_get_id(parent->of_node, "pci-domain");
> + if (domain == -ENODEV)
> + domain = atomic_inc_return(&domain_nr);
> +
> + bus->domain_nr = domain;
> +}
> +
> #ifdef CONFIG_PCI_MSI
>
> static LIST_HEAD(of_pci_msi_chip_list);
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2c9266237edc..aa30a9e8915d 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -485,7 +485,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
> }
> }
>
> -static struct pci_bus *pci_alloc_bus(void)
> +static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
> {
> struct pci_bus *b;
>
> @@ -500,6 +500,10 @@ static struct pci_bus *pci_alloc_bus(void)
> INIT_LIST_HEAD(&b->resources);
> b->max_bus_speed = PCI_SPEED_UNKNOWN;
> b->cur_bus_speed = PCI_SPEED_UNKNOWN;
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> + if (parent)
> + b->domain_nr = parent->domain_nr;
> +#endif
> return b;
> }
>
> @@ -670,7 +674,7 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
> /*
> * Allocate a new bus, and inherit stuff from the parent..
> */
> - child = pci_alloc_bus();
> + child = pci_alloc_bus(parent);
> if (!child)
> return NULL;
>
> @@ -1767,13 +1771,14 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> bridge->dev.parent = parent;
> bridge->dev.release = pci_release_host_bridge_dev;
>
> - b = pci_alloc_bus();
> + b = pci_alloc_bus(NULL);
> if (!b)
> goto err_out;
>
> b->sysdata = sysdata;
> b->ops = ops;
> b->number = b->busn_res.start = bus;
> + pci_set_domain_nr(b, parent);
> b2 = pci_find_bus(pci_domain_nr(b), bus);
> if (b2) {
> /* If we already got to this bus through a different bridge, ignore it */
> diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
> index 71e36d091db2..af16ac40c7a2 100644
> --- a/include/linux/of_pci.h
> +++ b/include/linux/of_pci.h
> @@ -17,6 +17,7 @@ int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
> int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
> struct pci_host_bridge *of_create_pci_host_bridge(struct device *parent,
> struct pci_ops *ops, void *host_data);
> +void of_pci_set_domain_nr(struct pci_bus *bus, struct device *parent);
>
> #else
> static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
> @@ -53,6 +54,10 @@ of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops,
> {
> return NULL;
> }
> +
> +static inline void of_pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
> +{
> +}
> #endif
>
> #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index d32b4ed1f411..9113f62c5038 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -457,6 +457,9 @@ struct pci_bus {
> unsigned char primary; /* number of primary bridge */
> unsigned char max_bus_speed; /* enum pci_bus_speed */
> unsigned char cur_bus_speed; /* enum pci_bus_speed */
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> + int domain_nr;
> +#endif
>
> char name[48];
>
> @@ -1292,6 +1295,18 @@ static inline int pci_domain_nr(struct pci_bus *bus) { return 0; }
> static inline int pci_proc_domain(struct pci_bus *bus) { return 0; }
> #endif /* CONFIG_PCI_DOMAINS */
>
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> +static inline int pci_domain_nr(struct pci_bus *bus)
> +{
> + return bus->domain_nr;
> +}
> +extern void pci_set_domain_nr(struct pci_bus *bus, struct device *parent);
> +#else
> +static inline void pci_set_domain_nr(struct pci_bus *bus, struct device *parent)
> +{
> +}
> +#endif
> +
> /* some architectures require additional setup to direct VGA traffic */
> typedef int (*arch_set_vga_state_t)(struct pci_dev *pdev, bool decode,
> unsigned int command_bits, u32 flags);
On Tue, Jul 22, 2014 at 04:15:58AM +0100, Bjorn Helgaas wrote:
> On Mon, Jul 14, 2014 at 10:39 AM, Catalin Marinas
> <[email protected]> wrote:
> > From b32606aa3997fc8a45014a64f99e921eef4872b0 Mon Sep 17 00:00:00 2001
> > From: Catalin Marinas <[email protected]>
> > Date: Mon, 14 Jul 2014 17:20:01 +0100
> > Subject: [PATCH] pci: Add support for generic domain_nr in pci_bus
> >
> > This patch adds domain_nr in struct pci_bus if
> > CONFIG_PCI_DOMAINS_GENERIC is enabled. The default implementation for
> > pci_domain_nr() simply returns bus->domain_nr. For the root bus, the
> > core PCI code calls pci_set_domain_nr(bus, parent_device) while the
> > child buses inherit the domain nr of the parent bus.
> >
> > This patch also adds an of_pci_set_domain_nr() implementation which
> > parses the device tree for the "pci-domain" property or sets domain_nr
> > to the next available value (this function could also be implemented
> > entirely in arm64).
> >
> > Signed-off-by: Catalin Marinas <[email protected]>
>
> I like this. It seems like a reasonable step forward. I don't really
> like the pci_set_domain_nr() interface because the domain conceptually
> exists before the root bus in the domain, but we can deal with that
> later.
That's just the name I came up with. Maybe something like
pci_assign_domain_to_bus()?
> I'd really like to see all this stuff in v3.17, but I'm going to be on
> vacation for the next three weeks and won't be able to do much until
> Aug 11, which is probably going to be in the middle of the merge
> window. But maybe the series can be integrated in -next via an ARM
> tree or something. If it helps, you can add my
>
> Acked-by: Bjorn Helgaas <[email protected]>
>
> for this piece. I don't remember how many other PCI changes are
> involved; maybe my ack on this will be enough? Even if we can't get
> it in for the merge window, I'm open to trying to merge it after
> v3.17-rc1 if it's isolated enough.
Thanks for the Ack, but I think Liviu needs to address a few more things
with the OF part of his PCIe patches, and it's unlikely that they will be
ready for 3.17. My "deadline" for getting the generic (and arm64) PCIe
support in is 3.18.
(I'm off for two weeks starting now as well).
--
Catalin