2018-01-15 23:29:22

by Jim Quinlan

Subject: [PATCH v4 0/8] PCI: brcmstb: Add Broadcom Settopbox PCIe support

This patch series adds support for the Broadcom Settopbox PCIe host
controller. It is targeted to Broadcom Settopbox chips running on
ARM, ARM64, and MIPS platforms.

V4 Changes:
- Merged all BrcmSTB PCIe controller files into a single file.
- All new files now have the SPDX identifier.
- Removed the list of PCIe controllers.
- Removed "link-up" race.
- Removed probe of msi pseudo-device.
- Multiple comment text changes, as requested.
- "SSC" => "Spread Spectrum Clocking".
- Set 'memc' variable.
- Unnecessary variable initializations removed (e.g. rc_bar2_size).
- Added comment on "L23" link state.
- Removed use of "__refdata".
- Formatting of structure elements.

V3 Changes:
- Fold pcie-brcmstb-msi.c into pcie-brcmstb.c
- Use PCI_XXX constants for PCIe capability registers
- Removal of any unused constants
- Change s/pci/pcie/ for filenames, comment text
- Config space access now uses 8/16/32 read/writes
- Use proper multi-line comment style
- Use function names and structure that are common in other host drivers
- DT binding 'brcm,ssc' is now 'brcm,enable-ssc'
- Dropped DT binding 'xyz-supply'
- Not setting CRS support as Linux does it if it is advertised.
- Removed code that was considered "debug code".
- Use of_get_pcie_domain_nr()
- Variable 'bridge_setup_done' removed.

V2 Changes:
* Patch brcmstb-add-memory-API:
- fix DT_PROP_DATA_TO_U32 macro.
- dropped one EXPORT_SYMBOL, changed the other to GPL.
* Patch DT-docs-for-Brcmstb-PCIe:
- change 'brcm,gen' prop to standard 'max-link-speed'.
- rewrite bindings commit to omit standard prop defs.
- change props "supplies", "supply-names" to "xyz-supply"
* Patch removed: export-symbol-arch_setup_dma_ops [4/9]
* Patch brcmstb-add-dma-ranges:
- use get_dma_ops(); also use a const dma_map_ops structure.
- rewrite map_sg(), unmap_sg(), other calls like sync_sg_*()
- omit brcm_mapping_error(), but added code in brcm_dma_supported()
- put all of the notifier code in one compilation unit.

Florian Fainelli (1):
SOC: brcmstb: add memory API

Jim Quinlan (7):
dt-bindings: pci: Add DT docs for Brcmstb PCIe device
PCI: brcmstb: Add Broadcom STB PCIe host controller driver
PCI: brcmstb: Add dma-range mapping for inbound traffic
PCI/MSI: Enable PCI_MSI_IRQ_DOMAIN support for MIPS
PCI: brcmstb: Add MSI capability
MIPS: BMIPS: Add PCI bindings for 7425, 7435
MIPS: BMIPS: Enable PCI

.../devicetree/bindings/pci/brcmstb-pcie.txt | 59 +
arch/mips/Kconfig | 3 +
arch/mips/boot/dts/brcm/bcm7425.dtsi | 26 +
arch/mips/boot/dts/brcm/bcm7435.dtsi | 27 +
arch/mips/boot/dts/brcm/bcm97425svmb.dts | 4 +
arch/mips/boot/dts/brcm/bcm97435svmb.dts | 4 +
arch/mips/include/asm/Kbuild | 1 +
drivers/pci/Kconfig | 2 +-
drivers/pci/host/Kconfig | 9 +
drivers/pci/host/Makefile | 1 +
drivers/pci/host/pcie-brcmstb.c | 1830 ++++++++++++++++++++
drivers/soc/bcm/brcmstb/Makefile | 2 +-
drivers/soc/bcm/brcmstb/memory.c | 158 ++
include/soc/brcmstb/memory_api.h | 25 +
14 files changed, 2149 insertions(+), 2 deletions(-)
create mode 100644 Documentation/devicetree/bindings/pci/brcmstb-pcie.txt
create mode 100644 drivers/pci/host/pcie-brcmstb.c
create mode 100644 drivers/soc/bcm/brcmstb/memory.c
create mode 100644 include/soc/brcmstb/memory_api.h

--
1.9.0.138.g2de3478


2018-01-15 23:29:27

by Jim Quinlan

Subject: [PATCH v4 1/8] SOC: brcmstb: add memory API

From: Florian Fainelli <[email protected]>

This commit adds a memory API suitable for ascertaining the sizes of
each of the N memory controllers in a Broadcom STB chip. Its first
user will be the Broadcom STB PCIe root complex driver, which needs
to know these sizes to properly set up DMA mappings for inbound
regions.

We cannot use memblock, or any similar facility that Linux provides,
because it merges adjacent regions into a larger block. Here we need
per-memory-controller addresses and sizes, which is why we resort to
manual DT parsing.
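
As an illustration of the manual DT parsing described above, here is a
minimal user-space sketch of how one entry of a "reg" property can be
decoded into a 64-bit address and size. It is not the code from this
patch: the helper names (be32_cell, read_cells, reg_entry) are
hypothetical, and it assumes #address-cells and #size-cells are each 1
or 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit cell, as stored in a flattened device tree. */
static uint32_t be32_cell(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine 'cells' consecutive 32-bit cells, most significant cell first. */
static uint64_t read_cells(const uint8_t *p, int cells)
{
	uint64_t v = 0;
	int j;

	assert(cells >= 1 && cells <= 2);	/* result must fit in 64 bits */
	for (j = 0; j < cells; j++)
		v = (v << 32) | be32_cell(p + 4 * j);
	return v;
}

/*
 * Hypothetical helper: decode entry 'idx' of a "reg" property of 'len'
 * bytes. Returns 0 on success, -1 if the entry lies outside the property.
 */
static int reg_entry(const uint8_t *reg, size_t len, int addr_cells,
		     int size_cells, size_t idx,
		     uint64_t *addr, uint64_t *size)
{
	size_t entry_len = 4 * (size_t)(addr_cells + size_cells);

	if (entry_len == 0 || (idx + 1) * entry_len > len)
		return -1;

	reg += idx * entry_len;
	*addr = read_cells(reg, addr_cells);
	*size = read_cells(reg + 4 * addr_cells, size_cells);
	return 0;
}
```

A caller would walk idx from 0 until reg_entry() returns -1, mapping
each address to its memory controller and summing the sizes per
controller.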

Signed-off-by: Jim Quinlan <[email protected]>

Conflicts:
drivers/soc/bcm/brcmstb/Makefile
---
drivers/soc/bcm/brcmstb/Makefile | 2 +-
drivers/soc/bcm/brcmstb/memory.c | 158 +++++++++++++++++++++++++++++++++++++++
include/soc/brcmstb/memory_api.h | 25 +++++++
3 files changed, 184 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/bcm/brcmstb/memory.c
create mode 100644 include/soc/brcmstb/memory_api.h

diff --git a/drivers/soc/bcm/brcmstb/Makefile b/drivers/soc/bcm/brcmstb/Makefile
index 01687c2..e4ccd3a 100644
--- a/drivers/soc/bcm/brcmstb/Makefile
+++ b/drivers/soc/bcm/brcmstb/Makefile
@@ -1,2 +1,2 @@
-obj-y += common.o biuctrl.o
+obj-y += common.o biuctrl.o memory.o
obj-$(CONFIG_BRCMSTB_PM) += pm/
diff --git a/drivers/soc/bcm/brcmstb/memory.c b/drivers/soc/bcm/brcmstb/memory.c
new file mode 100644
index 0000000..65334b0
--- /dev/null
+++ b/drivers/soc/bcm/brcmstb/memory.c
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright © 2015-2017 Broadcom */
+
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/libfdt.h>
+#include <linux/of_address.h>
+#include <linux/of_fdt.h>
+#include <linux/sizes.h>
+#include <soc/brcmstb/memory_api.h>
+
+/* Macro to help extract property data */
+#define DT_PROP_DATA_TO_U32(b, offs) (fdt32_to_cpu(*(u32 *)((b) + (offs))))
+
+/* Constants used when retrieving memc info */
+#define NUM_BUS_RANGES 10
+#define BUS_RANGE_ULIMIT_SHIFT 4
+#define BUS_RANGE_LLIMIT_SHIFT 4
+#define BUS_RANGE_PA_SHIFT 12
+
+enum {
+ BUSNUM_MCP0 = 0x4,
+ BUSNUM_MCP1 = 0x5,
+ BUSNUM_MCP2 = 0x6,
+};
+
+/*
+ * If the DT nodes are handy, determine which MEMC holds the specified
+ * physical address.
+ */
+#ifdef CONFIG_ARCH_BRCMSTB
+int __brcmstb_memory_phys_addr_to_memc(phys_addr_t pa, void __iomem *base)
+{
+ int memc = -1;
+ int i;
+
+ for (i = 0; i < NUM_BUS_RANGES; i++, base += 8) {
+ const u64 ulimit_raw = readl(base);
+ const u64 llimit_raw = readl(base + 4);
+ const u64 ulimit =
+ ((ulimit_raw >> BUS_RANGE_ULIMIT_SHIFT)
+ << BUS_RANGE_PA_SHIFT) | 0xfff;
+ const u64 llimit = (llimit_raw >> BUS_RANGE_LLIMIT_SHIFT)
+ << BUS_RANGE_PA_SHIFT;
+ const u32 busnum = (u32)(ulimit_raw & 0xf);
+
+ if (pa >= llimit && pa <= ulimit) {
+ if (busnum >= BUSNUM_MCP0 && busnum <= BUSNUM_MCP2) {
+ memc = busnum - BUSNUM_MCP0;
+ break;
+ }
+ }
+ }
+
+ return memc;
+}
+
+int brcmstb_memory_phys_addr_to_memc(phys_addr_t pa)
+{
+ int memc = -1;
+ struct device_node *np;
+ void __iomem *cpubiuctrl;
+
+ np = of_find_compatible_node(NULL, NULL, "brcm,brcmstb-cpu-biu-ctrl");
+ if (!np)
+ return memc;
+
+ cpubiuctrl = of_iomap(np, 0);
+ if (!cpubiuctrl)
+ goto cleanup;
+
+ memc = __brcmstb_memory_phys_addr_to_memc(pa, cpubiuctrl);
+ iounmap(cpubiuctrl);
+
+cleanup:
+ of_node_put(np);
+
+ return memc;
+}
+
+#elif defined(CONFIG_MIPS)
+int brcmstb_memory_phys_addr_to_memc(phys_addr_t pa)
+{
+ /* The logic here is fairly simple and hardcoded: if pa < 0x5000_0000,
+ * then this is MEMC0, else MEMC1.
+ *
+ * For systems with 2GB on MEMC0, MEMC1 starts at 9000_0000, with 1GB
+ * on MEMC0, MEMC1 starts at 6000_0000.
+ */
+ if (pa >= 0x50000000ULL)
+ return 1;
+ else
+ return 0;
+}
+#endif
+
+u64 brcmstb_memory_memc_size(int memc)
+{
+ const void *fdt = initial_boot_params;
+ const int mem_offset = fdt_path_offset(fdt, "/memory");
+ int addr_cells = 1, size_cells = 1;
+ const struct fdt_property *prop;
+ int proplen, cellslen;
+ u64 memc_size = 0;
+ int i;
+
+ /* Get root size and address cells if specified */
+ prop = fdt_get_property(fdt, 0, "#size-cells", &proplen);
+ if (prop)
+ size_cells = DT_PROP_DATA_TO_U32(prop->data, 0);
+
+ prop = fdt_get_property(fdt, 0, "#address-cells", &proplen);
+ if (prop)
+ addr_cells = DT_PROP_DATA_TO_U32(prop->data, 0);
+
+ if (mem_offset < 0)
+ return -1;
+
+ prop = fdt_get_property(fdt, mem_offset, "reg", &proplen);
+ cellslen = (int)sizeof(u32) * (addr_cells + size_cells);
+ if ((proplen % cellslen) != 0)
+ return -1;
+
+ for (i = 0; i < proplen / cellslen; ++i) {
+ u64 addr = 0;
+ u64 size = 0;
+ int memc_idx;
+ int j;
+
+ for (j = 0; j < addr_cells; ++j) {
+ int offset = (cellslen * i) + (sizeof(u32) * j);
+
+ addr |= (u64)DT_PROP_DATA_TO_U32(prop->data, offset) <<
+ ((addr_cells - j - 1) * 32);
+ }
+ for (j = 0; j < size_cells; ++j) {
+ int offset = (cellslen * i) +
+ (sizeof(u32) * (j + addr_cells));
+
+ size |= (u64)DT_PROP_DATA_TO_U32(prop->data, offset) <<
+ ((size_cells - j - 1) * 32);
+ }
+
+ if ((phys_addr_t)addr != addr) {
+ pr_err("phys_addr_t is smaller than provided address 0x%llx!\n",
+ addr);
+ return -1;
+ }
+
+ memc_idx = brcmstb_memory_phys_addr_to_memc((phys_addr_t)addr);
+ if (memc_idx == memc)
+ memc_size += size;
+ }
+
+ return memc_size;
+}
+EXPORT_SYMBOL_GPL(brcmstb_memory_memc_size);
+
diff --git a/include/soc/brcmstb/memory_api.h b/include/soc/brcmstb/memory_api.h
new file mode 100644
index 0000000..d922906
--- /dev/null
+++ b/include/soc/brcmstb/memory_api.h
@@ -0,0 +1,25 @@
+#ifndef __MEMORY_API_H
+#define __MEMORY_API_H
+
+/*
+ * Bus Interface Unit control register setup, must happen early during boot,
+ * before SMP is brought up, called by machine entry point.
+ */
+void brcmstb_biuctrl_init(void);
+
+#ifdef CONFIG_SOC_BRCMSTB
+int brcmstb_memory_phys_addr_to_memc(phys_addr_t pa);
+u64 brcmstb_memory_memc_size(int memc);
+#else
+static inline int brcmstb_memory_phys_addr_to_memc(phys_addr_t pa)
+{
+ return -EINVAL;
+}
+
+static inline u64 brcmstb_memory_memc_size(int memc)
+{
+ return -1;
+}
+#endif
+
+#endif /* __MEMORY_API_H */
--
1.9.0.138.g2de3478
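
The bus-range lookup in __brcmstb_memory_phys_addr_to_memc() above can
be exercised outside the kernel. The sketch below is an approximation
under stated assumptions, not the driver code: the register table is
modelled as a plain array (regs[2*i] is the upper-limit word and
regs[2*i+1] the lower-limit word of range i), and the function name
phys_addr_to_memc is hypothetical. The shifts and bus numbers are taken
from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BUS_RANGES		10
#define BUS_RANGE_LIMIT_SHIFT	4	/* low nibble of the upper word is the bus number */
#define BUS_RANGE_PA_SHIFT	12	/* limits are expressed in 4KB units */
#define BUSNUM_MCP0		0x4
#define BUSNUM_MCP2		0x6

/* Return the memory controller index owning 'pa', or -1 if none matches. */
static int phys_addr_to_memc(uint64_t pa, const uint32_t *regs)
{
	int i;

	assert(regs != 0);
	for (i = 0; i < NUM_BUS_RANGES; i++) {
		uint64_t ulimit_raw = regs[2 * i];
		uint64_t llimit_raw = regs[2 * i + 1];
		/* Upper limit is inclusive, so fill in the low 12 bits. */
		uint64_t ulimit = ((ulimit_raw >> BUS_RANGE_LIMIT_SHIFT)
				   << BUS_RANGE_PA_SHIFT) | 0xfff;
		uint64_t llimit = (llimit_raw >> BUS_RANGE_LIMIT_SHIFT)
				  << BUS_RANGE_PA_SHIFT;
		uint32_t busnum = (uint32_t)(ulimit_raw & 0xf);

		if (pa >= llimit && pa <= ulimit &&
		    busnum >= BUSNUM_MCP0 && busnum <= BUSNUM_MCP2)
			return (int)(busnum - BUSNUM_MCP0);
	}
	return -1;
}
```

With a table describing 1GB on bus 0x4 followed by 1GB on bus 0x5,
addresses in the first gigabyte map to MEMC0, addresses in the second
to MEMC1, and anything above yields -1.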

2018-01-15 23:29:34

by Jim Quinlan

Subject: [PATCH v4 3/8] PCI: brcmstb: Add Broadcom STB PCIe host controller driver

This commit adds the basic Broadcom STB PCIe controller. Missing is
the ability to process MSI and also handle dma-ranges for inbound
memory accesses. These two functionalities are added in subsequent
commits.

The PCIe block contains an MDIO interface. This is a local interface
only accessible by the PCIe controller. It cannot be used or shared
by any other HW. As such, the small amount of code for this
controller is included in this driver as there is little upside to
putting it elsewhere.
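
To make the MDIO interface mentioned above more concrete, the following
stand-alone sketch shows how a command word for the MDIO address
register can be assembled and how a read result can be checked. It
mirrors the masks used by the driver in this patch, with one labelled
assumption: the port field is placed at bit 16, which is what the
0xf0000 port mask implies. The helper names mdio_rd_done and
mdio_rd_value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MDIO_PORT_MASK		0x000f0000u
#define MDIO_PORT_SHIFT		16	/* assumption: derived from the mask */
#define MDIO_REGAD_MASK		0x0000ffffu
#define MDIO_REGAD_SHIFT	0
#define MDIO_CMD_MASK		0xfff00000u
#define MDIO_CMD_SHIFT		20
#define MDIO_CMD_READ		0x1u
#define MDIO_CMD_WRITE		0x0u
#define MDIO_DATA_DONE_MASK	0x80000000u
#define MDIO_DATA_MASK		0x7fffffffu

/* Pack port, register address and command into one 32-bit word. */
static uint32_t mdio_form_pkt(uint32_t port, uint32_t regad, uint32_t cmd)
{
	uint32_t pkt = 0;

	assert(port <= 0xf && regad <= 0xffff);	/* fields must fit their masks */
	pkt |= (port << MDIO_PORT_SHIFT) & MDIO_PORT_MASK;
	pkt |= (regad << MDIO_REGAD_SHIFT) & MDIO_REGAD_MASK;
	pkt |= (cmd << MDIO_CMD_SHIFT) & MDIO_CMD_MASK;
	return pkt;
}

/* A read has completed once the top bit of the data register is set. */
static int mdio_rd_done(uint32_t data)
{
	return (data & MDIO_DATA_DONE_MASK) ? 1 : 0;
}

/* Strip the done flag to obtain the value that was read. */
static uint32_t mdio_rd_value(uint32_t data)
{
	return data & MDIO_DATA_MASK;
}
```

In the driver, the packed word is written to the MDIO address register
and the data register is then polled until the done condition holds.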

Signed-off-by: Jim Quinlan <[email protected]>
---
drivers/pci/host/Kconfig | 9 +
drivers/pci/host/Makefile | 1 +
drivers/pci/host/pcie-brcmstb.c | 1096 +++++++++++++++++++++++++++++++++++++++
3 files changed, 1106 insertions(+)
create mode 100644 drivers/pci/host/pcie-brcmstb.c

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 38d1298..8373209 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -226,4 +226,13 @@ config VMD
To compile this driver as a module, choose M here: the
module will be called vmd.

+config PCIE_BRCMSTB
+ tristate "Broadcom Brcmstb PCIe platform host driver"
+ depends on ARCH_BRCMSTB || BMIPS_GENERIC
+ depends on OF
+ depends on SOC_BRCMSTB
+ default ARCH_BRCMSTB || BMIPS_GENERIC
+ help
+ Adds support for Broadcom Settop Box PCIe host controller.
+
endmenu
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 34ec1d8..3f144bb 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o
obj-$(CONFIG_VMD) += vmd.o
+obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o

# The following drivers are for devices that use the generic ACPI
# pci_root.c driver but don't support standard ECAM config access.
diff --git a/drivers/pci/host/pcie-brcmstb.c b/drivers/pci/host/pcie-brcmstb.c
new file mode 100644
index 0000000..fd15ab1
--- /dev/null
+++ b/drivers/pci/host/pcie-brcmstb.c
@@ -0,0 +1,1096 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2009 - 2017 Broadcom */
+
+#include <linux/clk.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/ioport.h>
+#include <linux/irqdomain.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/log2.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/of_pci.h>
+#include <linux/of_platform.h>
+#include <linux/pci.h>
+#include <linux/printk.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <soc/brcmstb/memory_api.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+/* BRCM_PCIE_CAP_REGS - Offset for the mandatory capability config regs */
+#define BRCM_PCIE_CAP_REGS 0x00ac
+
+/*
+ * Broadcom Settop Box PCIe Register Offsets. The names are from
+ * the chip's RDB and we use them here so that a script can correlate
+ * this code and the RDB to prevent discrepancies.
+ */
+#define PCIE_RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1 0x0188
+#define PCIE_RC_CFG_PRIV1_ID_VAL3 0x043c
+#define PCIE_RC_DL_MDIO_ADDR 0x1100
+#define PCIE_RC_DL_MDIO_WR_DATA 0x1104
+#define PCIE_RC_DL_MDIO_RD_DATA 0x1108
+#define PCIE_MISC_MISC_CTRL 0x4008
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LO 0x400c
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_HI 0x4010
+#define PCIE_MISC_RC_BAR1_CONFIG_LO 0x402c
+#define PCIE_MISC_RC_BAR2_CONFIG_LO 0x4034
+#define PCIE_MISC_RC_BAR2_CONFIG_HI 0x4038
+#define PCIE_MISC_RC_BAR3_CONFIG_LO 0x403c
+#define PCIE_MISC_PCIE_CTRL 0x4064
+#define PCIE_MISC_PCIE_STATUS 0x4068
+#define PCIE_MISC_REVISION 0x406c
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT 0x4070
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_HI 0x4080
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LIMIT_HI 0x4084
+#define PCIE_MISC_HARD_PCIE_HARD_DEBUG 0x4204
+#define PCIE_INTR2_CPU_BASE 0x4300
+
+/*
+ * Broadcom Settop Box PCIe Register Field shift and mask info. The
+ * names are from the chip's RDB and we use them here so that a script
+ * can correlate this code and the RDB to prevent discrepancies.
+ */
+#define PCIE_RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1_ENDIAN_MODE_BAR2_MASK 0xc
+#define PCIE_RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1_ENDIAN_MODE_BAR2_SHIFT 0x2
+#define PCIE_RC_CFG_PRIV1_ID_VAL3_CLASS_CODE_MASK 0xffffff
+#define PCIE_RC_CFG_PRIV1_ID_VAL3_CLASS_CODE_SHIFT 0x0
+#define PCIE_MISC_MISC_CTRL_SCB_ACCESS_EN_MASK 0x1000
+#define PCIE_MISC_MISC_CTRL_SCB_ACCESS_EN_SHIFT 0xc
+#define PCIE_MISC_MISC_CTRL_CFG_READ_UR_MODE_MASK 0x2000
+#define PCIE_MISC_MISC_CTRL_CFG_READ_UR_MODE_SHIFT 0xd
+#define PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK 0x300000
+#define PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_SHIFT 0x14
+#define PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK 0xf8000000
+#define PCIE_MISC_MISC_CTRL_SCB0_SIZE_SHIFT 0x1b
+#define PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK 0x7c00000
+#define PCIE_MISC_MISC_CTRL_SCB1_SIZE_SHIFT 0x16
+#define PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK 0x1f
+#define PCIE_MISC_MISC_CTRL_SCB2_SIZE_SHIFT 0x0
+#define PCIE_MISC_RC_BAR1_CONFIG_LO_SIZE_MASK 0x1f
+#define PCIE_MISC_RC_BAR1_CONFIG_LO_SIZE_SHIFT 0x0
+#define PCIE_MISC_RC_BAR2_CONFIG_LO_SIZE_MASK 0x1f
+#define PCIE_MISC_RC_BAR2_CONFIG_LO_SIZE_SHIFT 0x0
+#define PCIE_MISC_RC_BAR3_CONFIG_LO_SIZE_MASK 0x1f
+#define PCIE_MISC_RC_BAR3_CONFIG_LO_SIZE_SHIFT 0x0
+#define PCIE_MISC_PCIE_CTRL_PCIE_PERSTB_MASK 0x4
+#define PCIE_MISC_PCIE_CTRL_PCIE_PERSTB_SHIFT 0x2
+#define PCIE_MISC_PCIE_CTRL_PCIE_L23_REQUEST_MASK 0x1
+#define PCIE_MISC_PCIE_CTRL_PCIE_L23_REQUEST_SHIFT 0x0
+#define PCIE_MISC_PCIE_STATUS_PCIE_PORT_MASK 0x80
+#define PCIE_MISC_PCIE_STATUS_PCIE_PORT_SHIFT 0x7
+#define PCIE_MISC_PCIE_STATUS_PCIE_DL_ACTIVE_MASK 0x20
+#define PCIE_MISC_PCIE_STATUS_PCIE_DL_ACTIVE_SHIFT 0x5
+#define PCIE_MISC_PCIE_STATUS_PCIE_PHYLINKUP_MASK 0x10
+#define PCIE_MISC_PCIE_STATUS_PCIE_PHYLINKUP_SHIFT 0x4
+#define PCIE_MISC_PCIE_STATUS_PCIE_LINK_IN_L23_MASK 0x40
+#define PCIE_MISC_PCIE_STATUS_PCIE_LINK_IN_L23_SHIFT 0x6
+#define PCIE_MISC_REVISION_MAJMIN_MASK 0xffff
+#define PCIE_MISC_REVISION_MAJMIN_SHIFT 0
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_LIMIT_MASK 0xfff00000
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_LIMIT_SHIFT 0x14
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_BASE_MASK 0xfff0
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_BASE_SHIFT 0x4
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_NUM_MASK_BITS 0xc
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_HI_BASE_MASK 0xff
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_HI_BASE_SHIFT 0x0
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LIMIT_HI_LIMIT_MASK 0xff
+#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LIMIT_HI_LIMIT_SHIFT 0x0
+#define PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK 0x2
+#define PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_SHIFT 0x1
+#define PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK 0x08000000
+#define PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_SHIFT 0x1b
+#define PCIE_RGR1_SW_INIT_1_PERST_MASK 0x1
+#define PCIE_RGR1_SW_INIT_1_PERST_SHIFT 0x0
+
+#define BRCM_NUM_PCIE_OUT_WINS 0x4
+#define BRCM_MAX_SCB 0x4
+
+#define BRCM_MSI_TARGET_ADDR_LT_4GB 0x0fffffffcULL
+#define BRCM_MSI_TARGET_ADDR_GT_4GB 0xffffffffcULL
+
+#define BURST_SIZE_128 0
+#define BURST_SIZE_256 1
+#define BURST_SIZE_512 2
+
+/* Offsets from PCIE_INTR2_CPU_BASE */
+#define STATUS 0x0
+#define SET 0x4
+#define CLR 0x8
+#define MASK_STATUS 0xc
+#define MASK_SET 0x10
+#define MASK_CLR 0x14
+
+#define PCIE_BUSNUM_SHIFT 20
+#define PCIE_SLOT_SHIFT 15
+#define PCIE_FUNC_SHIFT 12
+
+#if defined(__BIG_ENDIAN)
+#define DATA_ENDIAN 2 /* PCIe->DDR inbound traffic */
+#define MMIO_ENDIAN 2 /* CPU->PCIe outbound traffic */
+#else
+#define DATA_ENDIAN 0
+#define MMIO_ENDIAN 0
+#endif
+
+#define MDIO_PORT0 0x0
+#define MDIO_DATA_MASK 0x7fffffff
+#define MDIO_DATA_SHIFT 0x0
+#define MDIO_PORT_MASK 0xf0000
+#define MDIO_PORT_SHIFT 0x10
+#define MDIO_REGAD_MASK 0xffff
+#define MDIO_REGAD_SHIFT 0x0
+#define MDIO_CMD_MASK 0xfff00000
+#define MDIO_CMD_SHIFT 0x14
+#define MDIO_CMD_READ 0x1
+#define MDIO_CMD_WRITE 0x0
+#define MDIO_DATA_DONE_MASK 0x80000000
+#define MDIO_RD_DONE(x) (((x) & MDIO_DATA_DONE_MASK) ? 1 : 0)
+#define MDIO_WT_DONE(x) (((x) & MDIO_DATA_DONE_MASK) ? 0 : 1)
+#define SSC_REGS_ADDR 0x1100
+#define SET_ADDR_OFFSET 0x1f
+#define SSC_CNTL_OFFSET 0x2
+#define SSC_CNTL_OVRD_EN_MASK 0x8000
+#define SSC_CNTL_OVRD_EN_SHIFT 0xf
+#define SSC_CNTL_OVRD_VAL_MASK 0x4000
+#define SSC_CNTL_OVRD_VAL_SHIFT 0xe
+#define SSC_STATUS_OFFSET 0x1
+#define SSC_STATUS_SSC_MASK 0x400
+#define SSC_STATUS_SSC_SHIFT 0xa
+#define SSC_STATUS_PLL_LOCK_MASK 0x800
+#define SSC_STATUS_PLL_LOCK_SHIFT 0xb
+
+#define IDX_ADDR(pcie) \
+ ((pcie)->reg_offsets[EXT_CFG_INDEX])
+#define DATA_ADDR(pcie) \
+ ((pcie)->reg_offsets[EXT_CFG_DATA])
+#define PCIE_RGR1_SW_INIT_1(pcie) \
+ ((pcie)->reg_offsets[RGR1_SW_INIT_1])
+
+enum {
+ RGR1_SW_INIT_1,
+ EXT_CFG_INDEX,
+ EXT_CFG_DATA,
+};
+
+enum {
+ RGR1_SW_INIT_1_INIT_MASK,
+ RGR1_SW_INIT_1_INIT_SHIFT,
+ RGR1_SW_INIT_1_PERST_MASK,
+ RGR1_SW_INIT_1_PERST_SHIFT,
+};
+
+enum pcie_type {
+ BCM7425,
+ BCM7435,
+ GENERIC,
+ BCM7278,
+};
+
+struct brcm_window {
+ dma_addr_t pcie_addr;
+ phys_addr_t cpu_addr;
+ dma_addr_t size;
+};
+
+/* Internal PCIe Host Controller Information. */
+struct brcm_pcie {
+ struct device *dev;
+ void __iomem *base;
+ struct list_head resources;
+ int irq;
+ struct clk *clk;
+ struct pci_bus *root_bus;
+ struct device_node *dn;
+ int id;
+ bool suspended;
+ int num_out_wins;
+ bool ssc;
+ int gen;
+ struct brcm_window out_wins[BRCM_NUM_PCIE_OUT_WINS];
+ unsigned int rev;
+ const int *reg_offsets;
+ const int *reg_field_info;
+ enum pcie_type type;
+};
+
+struct pcie_cfg_data {
+ const int *reg_field_info;
+ const int *offsets;
+ const enum pcie_type type;
+};
+
+static const int pcie_reg_field_info[] = {
+ [RGR1_SW_INIT_1_INIT_MASK] = 0x2,
+ [RGR1_SW_INIT_1_INIT_SHIFT] = 0x1,
+};
+
+static const int pcie_reg_field_info_bcm7278[] = {
+ [RGR1_SW_INIT_1_INIT_MASK] = 0x1,
+ [RGR1_SW_INIT_1_INIT_SHIFT] = 0x0,
+};
+
+static const int pcie_offset_bcm7425[] = {
+ [RGR1_SW_INIT_1] = 0x8010,
+ [EXT_CFG_INDEX] = 0x8300,
+ [EXT_CFG_DATA] = 0x8304,
+};
+
+static const struct pcie_cfg_data bcm7425_cfg = {
+ .reg_field_info = pcie_reg_field_info,
+ .offsets = pcie_offset_bcm7425,
+ .type = BCM7425,
+};
+
+static const int pcie_offsets[] = {
+ [RGR1_SW_INIT_1] = 0x9210,
+ [EXT_CFG_INDEX] = 0x9000,
+ [EXT_CFG_DATA] = 0x9004,
+};
+
+static const struct pcie_cfg_data bcm7435_cfg = {
+ .reg_field_info = pcie_reg_field_info,
+ .offsets = pcie_offsets,
+ .type = BCM7435,
+};
+
+static const struct pcie_cfg_data generic_cfg = {
+ .reg_field_info = pcie_reg_field_info,
+ .offsets = pcie_offsets,
+ .type = GENERIC,
+};
+
+static const int pcie_offset_bcm7278[] = {
+ [RGR1_SW_INIT_1] = 0xc010,
+ [EXT_CFG_INDEX] = 0x9000,
+ [EXT_CFG_DATA] = 0x9004,
+};
+
+static const struct pcie_cfg_data bcm7278_cfg = {
+ .reg_field_info = pcie_reg_field_info_bcm7278,
+ .offsets = pcie_offset_bcm7278,
+ .type = BCM7278,
+};
+
+static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
+ int where);
+
+static struct pci_ops brcm_pcie_ops = {
+ .map_bus = brcm_pcie_map_conf,
+ .read = pci_generic_config_read,
+ .write = pci_generic_config_write,
+};
+
+#if defined(CONFIG_MIPS)
+/* Broadcom MIPS HW implicitly does the swapping if necessary */
+#define bcm_readl(a) __raw_readl(a)
+#define bcm_writel(d, a) __raw_writel(d, a)
+#define bcm_readw(a) __raw_readw(a)
+#define bcm_writew(d, a) __raw_writew(d, a)
+#else
+#define bcm_readl(a) readl(a)
+#define bcm_writel(d, a) writel(d, a)
+#define bcm_readw(a) readw(a)
+#define bcm_writew(d, a) writew(d, a)
+#endif
+
+/* These macros extract/insert fields to host controller's register set. */
+#define RD_FLD(base, reg, field) \
+ rd_fld(base + reg, reg##_##field##_MASK, reg##_##field##_SHIFT)
+#define WR_FLD(base, reg, field, val) \
+ wr_fld(base + reg, reg##_##field##_MASK, reg##_##field##_SHIFT, val)
+#define WR_FLD_RB(base, reg, field, val) \
+ wr_fld_rb(base + reg, reg##_##field##_MASK, reg##_##field##_SHIFT, val)
+#define WR_FLD_WITH_OFFSET(base, off, reg, field, val) \
+ wr_fld(base + reg + off, reg##_##field##_MASK, \
+ reg##_##field##_SHIFT, val)
+#define EXTRACT_FIELD(val, reg, field) \
+ ((val & reg##_##field##_MASK) >> reg##_##field##_SHIFT)
+#define INSERT_FIELD(val, reg, field, field_val) \
+ ((val & ~reg##_##field##_MASK) | \
+ (reg##_##field##_MASK & (field_val << reg##_##field##_SHIFT)))
+
+static phys_addr_t scb_size[BRCM_MAX_SCB];
+static int num_memc;
+static int num_pcie;
+static DEFINE_MUTEX(brcm_pcie_lock);
+
+static u32 rd_fld(void __iomem *p, u32 mask, int shift)
+{
+ return (bcm_readl(p) & mask) >> shift;
+}
+
+static void wr_fld(void __iomem *p, u32 mask, int shift, u32 val)
+{
+ u32 reg = bcm_readl(p);
+
+ reg = (reg & ~mask) | ((val << shift) & mask);
+ bcm_writel(reg, p);
+}
+
+static void wr_fld_rb(void __iomem *p, u32 mask, int shift, u32 val)
+{
+ wr_fld(p, mask, shift, val);
+ (void)bcm_readl(p);
+}
+
+static const char *link_speed_to_str(int s)
+{
+ switch (s) {
+ case 1:
+ return "2.5";
+ case 2:
+ return "5.0";
+ case 3:
+ return "8.0";
+ default:
+ break;
+ }
+ return "???";
+}
+
+/*
+ * The roundup_pow_of_two() from log2.h invokes
+ * __roundup_pow_of_two(unsigned long), but we really need
+ * such a function to take a native u64 since unsigned long
+ * is 32 bits on some configurations. So we provide this helper
+ * function below.
+ */
+static u64 roundup_pow_of_two_64(u64 n)
+{
+ return 1ULL << fls64(n - 1);
+}
+
+/*
+ * This is to convert the size of the inbound "BAR" region to the
+ * non-linear values of PCIE_X_MISC_RC_BAR[123]_CONFIG_LO.SIZE
+ */
+static int encode_ibar_size(u64 size)
+{
+ int log2_in = ilog2(size);
+
+ if (log2_in >= 12 && log2_in <= 15)
+ /* Covers 4KB to 32KB (inclusive) */
+ return (log2_in - 12) + 0x1c;
+ else if (log2_in >= 16 && log2_in <= 37)
+ /* Covers 64KB to 32GB (inclusive) */
+ return log2_in - 15;
+ /* Something is awry so disable */
+ return 0;
+}
+
+static u32 mdio_form_pkt(int port, int regad, int cmd)
+{
+ u32 pkt = 0;
+
+ pkt |= (port << MDIO_PORT_SHIFT) & MDIO_PORT_MASK;
+ pkt |= (regad << MDIO_REGAD_SHIFT) & MDIO_REGAD_MASK;
+ pkt |= (cmd << MDIO_CMD_SHIFT) & MDIO_CMD_MASK;
+
+ return pkt;
+}
+
+/* negative return value indicates error */
+static int mdio_read(void __iomem *base, u8 port, u8 regad)
+{
+ int tries;
+ u32 data;
+
+ bcm_writel(mdio_form_pkt(port, regad, MDIO_CMD_READ),
+ base + PCIE_RC_DL_MDIO_ADDR);
+ bcm_readl(base + PCIE_RC_DL_MDIO_ADDR);
+
+ data = bcm_readl(base + PCIE_RC_DL_MDIO_RD_DATA);
+ for (tries = 0; !MDIO_RD_DONE(data) && tries < 10; tries++) {
+ udelay(10);
+ data = bcm_readl(base + PCIE_RC_DL_MDIO_RD_DATA);
+ }
+
+ return MDIO_RD_DONE(data)
+ ? (data & MDIO_DATA_MASK) >> MDIO_DATA_SHIFT
+ : -EIO;
+}
+
+/* negative return value indicates error */
+static int mdio_write(void __iomem *base, u8 port, u8 regad, u16 wrdata)
+{
+ int tries;
+ u32 data;
+
+ bcm_writel(mdio_form_pkt(port, regad, MDIO_CMD_WRITE),
+ base + PCIE_RC_DL_MDIO_ADDR);
+ bcm_readl(base + PCIE_RC_DL_MDIO_ADDR);
+ bcm_writel(MDIO_DATA_DONE_MASK | wrdata,
+ base + PCIE_RC_DL_MDIO_WR_DATA);
+
+ data = bcm_readl(base + PCIE_RC_DL_MDIO_WR_DATA);
+ for (tries = 0; !MDIO_WT_DONE(data) && tries < 10; tries++) {
+ udelay(10);
+ data = bcm_readl(base + PCIE_RC_DL_MDIO_WR_DATA);
+ }
+
+ return MDIO_WT_DONE(data) ? 0 : -EIO;
+}
+
+/*
+ * Configures device for Spread Spectrum Clocking (SSC) mode; a negative
+ * return value indicates error.
+ */
+static int set_ssc(void __iomem *base)
+{
+ int tmp;
+ u16 wrdata;
+ int pll, ssc;
+
+ tmp = mdio_write(base, MDIO_PORT0, SET_ADDR_OFFSET, SSC_REGS_ADDR);
+ if (tmp < 0)
+ return tmp;
+
+ tmp = mdio_read(base, MDIO_PORT0, SSC_CNTL_OFFSET);
+ if (tmp < 0)
+ return tmp;
+
+ wrdata = INSERT_FIELD(tmp, SSC_CNTL_OVRD, EN, 1);
+ wrdata = INSERT_FIELD(wrdata, SSC_CNTL_OVRD, VAL, 1);
+ tmp = mdio_write(base, MDIO_PORT0, SSC_CNTL_OFFSET, wrdata);
+ if (tmp < 0)
+ return tmp;
+
+ usleep_range(1000, 2000);
+ tmp = mdio_read(base, MDIO_PORT0, SSC_STATUS_OFFSET);
+ if (tmp < 0)
+ return tmp;
+
+ ssc = EXTRACT_FIELD(tmp, SSC_STATUS, SSC);
+ pll = EXTRACT_FIELD(tmp, SSC_STATUS, PLL_LOCK);
+
+ return (ssc && pll) ? 0 : -EIO;
+}
+
+/* Limits operation to a specific generation (1, 2, or 3) */
+static void set_gen(void __iomem *base, int gen)
+{
+ u32 lnkcap = bcm_readl(base + BRCM_PCIE_CAP_REGS + PCI_EXP_LNKCAP);
+ u16 lnkctl2 = bcm_readw(base + BRCM_PCIE_CAP_REGS + PCI_EXP_LNKCTL2);
+
+ lnkcap = (lnkcap & ~PCI_EXP_LNKCAP_SLS) | gen;
+ bcm_writel(lnkcap, base + BRCM_PCIE_CAP_REGS + PCI_EXP_LNKCAP);
+
+ lnkctl2 = (lnkctl2 & ~0xf) | gen;
+ bcm_writew(lnkctl2, base + BRCM_PCIE_CAP_REGS + PCI_EXP_LNKCTL2);
+}
+
+static void brcm_pcie_set_outbound_win(struct brcm_pcie *pcie,
+ unsigned int win, phys_addr_t cpu_addr,
+ dma_addr_t pcie_addr, dma_addr_t size)
+{
+ void __iomem *base = pcie->base;
+ phys_addr_t cpu_addr_mb, limit_addr_mb;
+ u32 tmp;
+
+ /* Set the base of the pcie_addr window */
+ bcm_writel(lower_32_bits(pcie_addr) + MMIO_ENDIAN,
+ base + PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LO + (win * 8));
+ bcm_writel(upper_32_bits(pcie_addr),
+ base + PCIE_MISC_CPU_2_PCIE_MEM_WIN0_HI + (win * 8));
+
+ cpu_addr_mb = cpu_addr >> 20;
+ limit_addr_mb = (cpu_addr + size - 1) >> 20;
+
+ /* Write the addr base low register */
+ WR_FLD_WITH_OFFSET(base, (win * 4),
+ PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT,
+ BASE, cpu_addr_mb);
+ /* Write the addr limit low register */
+ WR_FLD_WITH_OFFSET(base, (win * 4),
+ PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT,
+ LIMIT, limit_addr_mb);
+
+ if (pcie->type != BCM7435 && pcie->type != BCM7425) {
+ /* Write the cpu addr high register */
+ tmp = (u32)(cpu_addr_mb >>
+ PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_NUM_MASK_BITS);
+ WR_FLD_WITH_OFFSET(base, (win * 8),
+ PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_HI,
+ BASE, tmp);
+ /* Write the cpu limit high register */
+ tmp = (u32)(limit_addr_mb >>
+ PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_NUM_MASK_BITS);
+ WR_FLD_WITH_OFFSET(base, (win * 8),
+ PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LIMIT_HI,
+ LIMIT, tmp);
+ }
+}
+
+/* Configuration space read/write support */
+static int cfg_index(int busnr, int devfn, int reg)
+{
+ return ((PCI_SLOT(devfn) & 0x1f) << PCIE_SLOT_SHIFT)
+ | ((PCI_FUNC(devfn) & 0x07) << PCIE_FUNC_SHIFT)
+ | (busnr << PCIE_BUSNUM_SHIFT)
+ | (reg & ~3);
+}
+
+/* The controller is capable of serving in both RC and EP roles */
+static bool brcm_pcie_rc_mode(struct brcm_pcie *pcie)
+{
+ void __iomem *base = pcie->base;
+ u32 val = bcm_readl(base + PCIE_MISC_PCIE_STATUS);
+
+ return !!EXTRACT_FIELD(val, PCIE_MISC_PCIE_STATUS, PCIE_PORT);
+}
+
+static bool brcm_pcie_link_up(struct brcm_pcie *pcie)
+{
+ void __iomem *base = pcie->base;
+ u32 val = bcm_readl(base + PCIE_MISC_PCIE_STATUS);
+ u32 dla = EXTRACT_FIELD(val, PCIE_MISC_PCIE_STATUS, PCIE_DL_ACTIVE);
+ u32 plu = EXTRACT_FIELD(val, PCIE_MISC_PCIE_STATUS, PCIE_PHYLINKUP);
+
+ return dla && plu;
+}
+
+static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
+ int where)
+{
+ struct brcm_pcie *pcie = bus->sysdata;
+ void __iomem *base = pcie->base;
+ int idx;
+
+ /* Accesses to the RC go right to the RC registers if slot==0 */
+ if (pci_is_root_bus(bus))
+ return PCI_SLOT(devfn) ? NULL : base + where;
+
+ /* For devices, write to the config space index register */
+ idx = cfg_index(bus->number, devfn, where);
+ bcm_writel(idx, pcie->base + IDX_ADDR(pcie));
+ return base + DATA_ADDR(pcie) + (where & 0x3);
+}
+
+static inline void brcm_pcie_bridge_sw_init_set(struct brcm_pcie *pcie,
+ unsigned int val)
+{
+ unsigned int shift = pcie->reg_field_info[RGR1_SW_INIT_1_INIT_SHIFT];
+ u32 mask = pcie->reg_field_info[RGR1_SW_INIT_1_INIT_MASK];
+
+ wr_fld_rb(pcie->base + PCIE_RGR1_SW_INIT_1(pcie), mask, shift, val);
+}
+
+static inline void brcm_pcie_perst_set(struct brcm_pcie *pcie,
+ unsigned int val)
+{
+ if (pcie->type != BCM7278)
+ wr_fld_rb(pcie->base + PCIE_RGR1_SW_INIT_1(pcie),
+ PCIE_RGR1_SW_INIT_1_PERST_MASK,
+ PCIE_RGR1_SW_INIT_1_PERST_SHIFT, val);
+ else
+ /* Assert = 0, de-assert = 1 on 7278 */
+ WR_FLD_RB(pcie->base, PCIE_MISC_PCIE_CTRL, PCIE_PERSTB, !val);
+}
+
+static int brcm_pcie_add_controller(struct brcm_pcie *pcie)
+{
+ int i, ret = 0;
+
+ mutex_lock(&brcm_pcie_lock);
+ if (num_pcie > 0) {
+ num_pcie++;
+ goto done;
+ }
+
+ /* Determine num_memc and their sizes */
+ for (i = 0, num_memc = 0; i < BRCM_MAX_SCB; i++) {
+ u64 size = brcmstb_memory_memc_size(i);
+
+ if (size == (u64)-1) {
+ dev_err(pcie->dev, "cannot get memc%d size\n", i);
+ ret = -EINVAL;
+ goto done;
+ } else if (size) {
+ scb_size[i] = roundup_pow_of_two_64(size);
+ num_memc++;
+ } else {
+ break;
+ }
+ }
+ if (!ret && num_memc == 0) {
+ ret = -EINVAL;
+ goto done;
+ }
+
+ num_pcie++;
+done:
+ mutex_unlock(&brcm_pcie_lock);
+ return ret;
+}
+
+static void brcm_pcie_remove_controller(struct brcm_pcie *pcie)
+{
+ mutex_lock(&brcm_pcie_lock);
+ if (--num_pcie == 0)
+ num_memc = 0;
+ mutex_unlock(&brcm_pcie_lock);
+}
+
+static int brcm_pcie_parse_request_of_pci_ranges(struct brcm_pcie *pcie)
+{
+ struct resource_entry *win;
+ int ret;
+
+ ret = of_pci_get_host_bridge_resources(pcie->dn, 0, 0xff,
+ &pcie->resources, NULL);
+ if (ret) {
+ dev_err(pcie->dev, "failed to get host resources\n");
+ return ret;
+ }
+
+ resource_list_for_each_entry(win, &pcie->resources) {
+ struct resource *parent, *res = win->res;
+ dma_addr_t offset = (dma_addr_t)win->offset;
+
+ if (resource_type(res) == IORESOURCE_IO) {
+ parent = &ioport_resource;
+ } else if (resource_type(res) == IORESOURCE_MEM) {
+ if (pcie->num_out_wins >= BRCM_NUM_PCIE_OUT_WINS) {
+ dev_err(pcie->dev, "too many outbound wins\n");
+ return -EINVAL;
+ }
+ pcie->out_wins[pcie->num_out_wins].cpu_addr
+ = (phys_addr_t)res->start;
+ pcie->out_wins[pcie->num_out_wins].pcie_addr
+ = (dma_addr_t)(res->start
+ - (phys_addr_t)offset);
+ pcie->out_wins[pcie->num_out_wins].size
+ = (dma_addr_t)(res->end - res->start + 1);
+ pcie->num_out_wins++;
+ parent = &iomem_resource;
+ } else {
+ continue;
+ }
+
+ ret = devm_request_resource(pcie->dev, parent, res);
+ if (ret) {
+ dev_err(pcie->dev, "failed to get res %pR\n", res);
+ return ret;
+ }
+ }
+ return 0;
+}
+
+static int brcm_pcie_setup(struct brcm_pcie *pcie)
+{
+ void __iomem *base = pcie->base;
+ unsigned int scb_size_val;
+ u64 rc_bar2_offset, rc_bar2_size, total_mem_size = 0;
+ u32 tmp, burst;
+ int i, j, ret, limit;
+ u16 nlw, cls, lnksta;
+ bool ssc_good = false;
+ struct device *dev = pcie->dev;
+
+ /* Reset the bridge */
+ brcm_pcie_bridge_sw_init_set(pcie, 1);
+
+ /*
+ * Ensure that the fundamental reset is asserted, except for 7278,
+ * which fails if we do this.
+ */
+ if (pcie->type != BCM7278)
+ brcm_pcie_perst_set(pcie, 1);
+
+ usleep_range(100, 200);
+
+ /* Take the bridge out of reset */
+ brcm_pcie_bridge_sw_init_set(pcie, 0);
+
+ WR_FLD_RB(base, PCIE_MISC_HARD_PCIE_HARD_DEBUG, SERDES_IDDQ, 0);
+ /* Wait for SerDes to be stable */
+ usleep_range(100, 200);
+
+ /* Grab the PCIe hw revision number */
+ tmp = bcm_readl(base + PCIE_MISC_REVISION);
+ pcie->rev = EXTRACT_FIELD(tmp, PCIE_MISC_REVISION, MAJMIN);
+
+ /* Set SCB_MAX_BURST_SIZE, CFG_READ_UR_MODE, SCB_ACCESS_EN */
+ tmp = INSERT_FIELD(0, PCIE_MISC_MISC_CTRL, SCB_ACCESS_EN, 1);
+ tmp = INSERT_FIELD(tmp, PCIE_MISC_MISC_CTRL, CFG_READ_UR_MODE, 1);
+ burst = (pcie->type == GENERIC || pcie->type == BCM7278)
+ ? BURST_SIZE_512 : BURST_SIZE_256;
+ tmp = INSERT_FIELD(tmp, PCIE_MISC_MISC_CTRL, MAX_BURST_SIZE, burst);
+ bcm_writel(tmp, base + PCIE_MISC_MISC_CTRL);
+
+ /*
+ * Set up inbound memory view for the EP (called RC_BAR2,
+ * not to be confused with the BARs that are advertised by
+ * the EP).
+ */
+ for (i = 0; i < num_memc; i++)
+ total_mem_size += scb_size[i];
+
+ /*
+ * The PCIe host controller by design must set the inbound
+ * viewport to be a contiguous arrangement of all of the
+ * system's memory. In addition, its size must be a power of
+ * two. To further complicate matters, the viewport must
+ * start on a pcie-address that is aligned on a multiple of its
+ * size. If a portion of the viewport does not represent
+ * system memory -- e.g. 3GB of memory requires a 4GB viewport
+ * -- we can map the outbound memory in or after 3GB and even
+ * though the viewport will overlap the outbound memory the
+ * controller will know to send outbound memory downstream and
+ * everything else upstream.
+ */
+ rc_bar2_size = roundup_pow_of_two_64(total_mem_size);
+
+ /*
+ * Set simple configuration based on memory sizes
+ * only. We always start the viewport at address 0.
+ */
+ rc_bar2_offset = 0;
+
+ tmp = lower_32_bits(rc_bar2_offset);
+ tmp = INSERT_FIELD(tmp, PCIE_MISC_RC_BAR2_CONFIG_LO, SIZE,
+ encode_ibar_size(rc_bar2_size));
+ bcm_writel(tmp, base + PCIE_MISC_RC_BAR2_CONFIG_LO);
+ bcm_writel(upper_32_bits(rc_bar2_offset),
+ base + PCIE_MISC_RC_BAR2_CONFIG_HI);
+
+ scb_size_val = scb_size[0]
+ ? ilog2(scb_size[0]) - 15 : 0xf; /* 0xf is 1GB */
+ WR_FLD(base, PCIE_MISC_MISC_CTRL, SCB0_SIZE, scb_size_val);
+
+ if (num_memc > 1) {
+ scb_size_val = scb_size[1]
+ ? ilog2(scb_size[1]) - 15 : 0xf; /* 0xf is 1GB */
+ WR_FLD(base, PCIE_MISC_MISC_CTRL, SCB1_SIZE, scb_size_val);
+ }
+
+ if (num_memc > 2) {
+ scb_size_val = scb_size[2]
+ ? ilog2(scb_size[2]) - 15 : 0xf; /* 0xf is 1GB */
+ WR_FLD(base, PCIE_MISC_MISC_CTRL, SCB2_SIZE, scb_size_val);
+ }
+
+ /* disable the PCIe->GISB memory window (RC_BAR1) */
+ WR_FLD(base, PCIE_MISC_RC_BAR1_CONFIG_LO, SIZE, 0);
+
+ /* disable the PCIe->SCB memory window (RC_BAR3) */
+ WR_FLD(base, PCIE_MISC_RC_BAR3_CONFIG_LO, SIZE, 0);
+
+ if (!pcie->suspended) {
+ /* clear any interrupts we find on boot */
+ bcm_writel(0xffffffff, base + PCIE_INTR2_CPU_BASE + CLR);
+ (void)bcm_readl(base + PCIE_INTR2_CPU_BASE + CLR);
+ }
+
+ /* Mask all interrupts since we are not handling any yet */
+ bcm_writel(0xffffffff, base + PCIE_INTR2_CPU_BASE + MASK_SET);
+ (void)bcm_readl(base + PCIE_INTR2_CPU_BASE + MASK_SET);
+
+ if (pcie->gen)
+ set_gen(base, pcie->gen);
+
+ /* Unassert the fundamental reset */
+ brcm_pcie_perst_set(pcie, 0);
+
+ /*
+ * Give the RC/EP time to wake up, before trying to configure RC.
+ * Intermittently check status for link-up, up to a total of 100ms
+ * when we don't know if the device is there, and up to 1000ms if
+ * we do know the device is there.
+ */
+ limit = pcie->suspended ? 1000 : 100;
+ for (i = 1, j = 0; j < limit && !brcm_pcie_link_up(pcie);
+ j += i, i = i * 2)
+ msleep(i + j > limit ? limit - j : i);
+
+ if (!brcm_pcie_link_up(pcie)) {
+ dev_info(dev, "link down\n");
+ return -ENODEV;
+ }
+
+ if (!brcm_pcie_rc_mode(pcie)) {
+ dev_err(dev, "PCIe misconfigured; is in EP mode\n");
+ return -EINVAL;
+ }
+
+ for (i = 0; i < pcie->num_out_wins; i++)
+ brcm_pcie_set_outbound_win(pcie, i, pcie->out_wins[i].cpu_addr,
+ pcie->out_wins[i].pcie_addr,
+ pcie->out_wins[i].size);
+
+ /*
+ * For config space accesses on the RC, show the right class for
+ * a PCIe-PCIe bridge (the default setting is to be EP mode).
+ */
+ WR_FLD_RB(base, PCIE_RC_CFG_PRIV1_ID_VAL3, CLASS_CODE, 0x060400);
+
+ if (pcie->ssc) {
+ ret = set_ssc(base);
+ if (ret == 0)
+ ssc_good = true;
+ else
+ dev_err(dev, "failed attempt to enter ssc mode\n");
+ }
+
+ lnksta = bcm_readw(base + BRCM_PCIE_CAP_REGS + PCI_EXP_LNKSTA);
+ cls = lnksta & PCI_EXP_LNKSTA_CLS;
+ nlw = (lnksta & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT;
+ dev_info(dev, "link up, %s Gbps x%u %s\n", link_speed_to_str(cls),
+ nlw, ssc_good ? "(SSC)" : "(!SSC)");
+
+ /* PCIe->SCB endian mode for BAR */
+ /* field ENDIAN_MODE_BAR2 = DATA_ENDIAN */
+ WR_FLD_RB(base, PCIE_RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1,
+ ENDIAN_MODE_BAR2, DATA_ENDIAN);
+
+ /*
+ * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1
+ * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1.
+ */
+ WR_FLD_RB(base, PCIE_MISC_HARD_PCIE_HARD_DEBUG, CLKREQ_DEBUG_ENABLE, 1);
+
+ return 0;
+}
+
+/* L23 is a low-power PCIe link state */
+static void enter_l23(struct brcm_pcie *pcie)
+{
+ void __iomem *base = pcie->base;
+ int tries, l23;
+
+ /* assert request for L23 */
+ WR_FLD_RB(base, PCIE_MISC_PCIE_CTRL, PCIE_L23_REQUEST, 1);
+ /* poll L23 status */
+ for (tries = 0, l23 = 0; tries < 1000 && !l23; tries++)
+ l23 = RD_FLD(base, PCIE_MISC_PCIE_STATUS, PCIE_LINK_IN_L23);
+ if (!l23)
+ dev_err(pcie->dev, "failed to enter L23\n");
+}
+
+static void turn_off(struct brcm_pcie *pcie)
+{
+ void __iomem *base = pcie->base;
+
+ if (brcm_pcie_link_up(pcie))
+ enter_l23(pcie);
+ /* Assert fundamental reset */
+ brcm_pcie_perst_set(pcie, 1);
+ /* Deassert request for L23 in case it was asserted */
+ WR_FLD_RB(base, PCIE_MISC_PCIE_CTRL, PCIE_L23_REQUEST, 0);
+ /* Turn off SerDes */
+ WR_FLD_RB(base, PCIE_MISC_HARD_PCIE_HARD_DEBUG, SERDES_IDDQ, 1);
+ /* Shutdown PCIe bridge */
+ brcm_pcie_bridge_sw_init_set(pcie, 1);
+}
+
+static int brcm_pcie_suspend(struct device *dev)
+{
+ struct brcm_pcie *pcie = dev_get_drvdata(dev);
+
+ turn_off(pcie);
+ clk_disable_unprepare(pcie->clk);
+ pcie->suspended = true;
+
+ return 0;
+}
+
+static int brcm_pcie_resume(struct device *dev)
+{
+ struct brcm_pcie *pcie = dev_get_drvdata(dev);
+ void __iomem *base;
+ int ret;
+
+ base = pcie->base;
+ clk_prepare_enable(pcie->clk);
+
+ /* Take bridge out of reset so we can access the SerDes reg */
+ brcm_pcie_bridge_sw_init_set(pcie, 0);
+
+ /* Turn on SerDes */
+ WR_FLD_RB(base, PCIE_MISC_HARD_PCIE_HARD_DEBUG, SERDES_IDDQ, 0);
+ /* Wait for SerDes to be stable */
+ usleep_range(100, 200);
+
+ ret = brcm_pcie_setup(pcie);
+ if (ret)
+ return ret;
+
+ pcie->suspended = false;
+
+ return 0;
+}
+
+static void _brcm_pcie_remove(struct brcm_pcie *pcie)
+{
+ turn_off(pcie);
+ clk_disable_unprepare(pcie->clk);
+ clk_put(pcie->clk);
+ brcm_pcie_remove_controller(pcie);
+}
+
+static int brcm_pcie_remove(struct platform_device *pdev)
+{
+ struct brcm_pcie *pcie = platform_get_drvdata(pdev);
+
+ pci_stop_root_bus(pcie->root_bus);
+ pci_remove_root_bus(pcie->root_bus);
+ _brcm_pcie_remove(pcie);
+
+ return 0;
+}
+
+static const struct of_device_id brcm_pcie_match[] = {
+ { .compatible = "brcm,bcm7425-pcie", .data = &bcm7425_cfg },
+ { .compatible = "brcm,bcm7435-pcie", .data = &bcm7435_cfg },
+ { .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
+ { .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },
+ {},
+};
+MODULE_DEVICE_TABLE(of, brcm_pcie_match);
+
+static int brcm_pcie_probe(struct platform_device *pdev)
+{
+ struct device_node *dn = pdev->dev.of_node;
+ const struct of_device_id *of_id;
+ const struct pcie_cfg_data *data;
+ int ret;
+ struct brcm_pcie *pcie;
+ struct resource *res;
+ void __iomem *base;
+ u32 tmp;
+ struct pci_host_bridge *bridge;
+ struct pci_bus *child;
+
+ bridge = devm_pci_alloc_host_bridge(&pdev->dev, sizeof(*pcie));
+ if (!bridge)
+ return -ENOMEM;
+
+ pcie = pci_host_bridge_priv(bridge);
+ INIT_LIST_HEAD(&pcie->resources);
+
+ of_id = of_match_node(brcm_pcie_match, dn);
+ if (!of_id) {
+ dev_err(&pdev->dev, "failed to look up compatible string\n");
+ return -EINVAL;
+ }
+
+ if (of_property_read_u32(dn, "dma-ranges", &tmp) == 0) {
+ dev_err(&pdev->dev, "cannot yet handle dma-ranges\n");
+ return -EINVAL;
+ }
+
+ data = of_id->data;
+ pcie->reg_offsets = data->offsets;
+ pcie->reg_field_info = data->reg_field_info;
+ pcie->type = data->type;
+ pcie->dn = dn;
+ pcie->dev = &pdev->dev;
+
+ /* We use the domain number as our controller number */
+ pcie->id = of_get_pci_domain_nr(dn);
+ if (pcie->id < 0)
+ return pcie->id;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res)
+ return -EINVAL;
+
+ base = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(base))
+ return PTR_ERR(base);
+
+ pcie->clk = of_clk_get_by_name(dn, "sw_pcie");
+ if (IS_ERR(pcie->clk)) {
+ dev_err(&pdev->dev, "could not get clock\n");
+ pcie->clk = NULL;
+ }
+ pcie->base = base;
+
+ ret = of_pci_get_max_link_speed(dn);
+ pcie->gen = (ret < 0) ? 0 : ret;
+
+ pcie->ssc = of_property_read_bool(dn, "brcm,enable-ssc");
+
+ ret = irq_of_parse_and_map(pdev->dev.of_node, 0);
+ if (ret == 0)
+ /* keep going, as we don't use this intr yet */
+ dev_warn(pcie->dev, "cannot get PCIe interrupt\n");
+ else
+ pcie->irq = ret;
+
+ ret = brcm_pcie_parse_request_of_pci_ranges(pcie);
+ if (ret)
+ return ret;
+
+ ret = clk_prepare_enable(pcie->clk);
+ if (ret) {
+ dev_err(&pdev->dev, "could not enable clock\n");
+ return ret;
+ }
+
+ ret = brcm_pcie_add_controller(pcie);
+ if (ret)
+ return ret;
+
+ ret = brcm_pcie_setup(pcie);
+ if (ret)
+ goto fail;
+
+ list_splice_init(&pcie->resources, &bridge->windows);
+ bridge->dev.parent = &pdev->dev;
+ bridge->busnr = 0;
+ bridge->ops = &brcm_pcie_ops;
+ bridge->sysdata = pcie;
+ bridge->map_irq = of_irq_parse_and_map_pci;
+ bridge->swizzle_irq = pci_common_swizzle;
+
+ ret = pci_scan_root_bus_bridge(bridge);
+ if (ret < 0) {
+ dev_err(pcie->dev, "Scanning root bridge failed\n");
+ goto fail;
+ }
+
+ pci_assign_unassigned_bus_resources(bridge->bus);
+ list_for_each_entry(child, &bridge->bus->children, node)
+ pcie_bus_configure_settings(child);
+ pci_bus_add_devices(bridge->bus);
+ platform_set_drvdata(pdev, pcie);
+ pcie->root_bus = bridge->bus;
+
+ return 0;
+
+fail:
+ _brcm_pcie_remove(pcie);
+ return ret;
+}
+
+static const struct dev_pm_ops brcm_pcie_pm_ops = {
+ .suspend_noirq = brcm_pcie_suspend,
+ .resume_noirq = brcm_pcie_resume,
+};
+
+static struct platform_driver brcm_pcie_driver = {
+ .probe = brcm_pcie_probe,
+ .remove = brcm_pcie_remove,
+ .driver = {
+ .name = "brcm-pcie",
+ .owner = THIS_MODULE,
+ .of_match_table = brcm_pcie_match,
+ .pm = &brcm_pcie_pm_ops,
+ },
+};
+
+module_platform_driver(brcm_pcie_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Broadcom STB PCIe RC driver");
+MODULE_AUTHOR("Broadcom");
--
1.9.0.138.g2de3478
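
Note: the inbound viewport arithmetic in brcm_pcie_setup() can be checked
outside the kernel. What follows is a minimal user-space sketch, not the
driver code. roundup_pow2_u64(), ilog2_u64(), scb_size_field() and
viewport_size() are hypothetical stand-ins for the kernel helpers and for
the expressions used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for roundup_pow_of_two_64(); v must be 1..2^63. */
uint64_t roundup_pow2_u64(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (UINT64_C(1) << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Hypothetical stand-in for ilog2(); v must be non-zero. */
unsigned int ilog2_u64(uint64_t v)
{
	unsigned int n = 0;

	assert(v != 0);
	while (v >>= 1)
		n++;
	return n;
}

/* Same expression the patch uses for the SCBn_SIZE fields. */
unsigned int scb_size_field(uint64_t scb_size)
{
	return scb_size ? ilog2_u64(scb_size) - 15 : 0xf;
}

/*
 * Each memory controller size is rounded up to a power of two, the
 * results are summed, and the sum is rounded up again to give the
 * RC_BAR2 viewport size.
 */
uint64_t viewport_size(const uint64_t *memc_size, int num_memc)
{
	uint64_t total = 0;
	int i;

	assert(memc_size != 0 && num_memc > 0);
	for (i = 0; i < num_memc; i++)
		total += roundup_pow2_u64(memc_size[i]);
	return roundup_pow2_u64(total);
}
```

With memory controllers of 2GB and 1GB the total is 3GB and the viewport
comes out as 4GB, which is the case described in the comment in
brcm_pcie_setup(). A 1GB SCB encodes as 0xf, in agreement with the
"0xf is 1GB" comment.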

2018-01-15 23:29:41

by Jim Quinlan

Subject: [PATCH v4 5/8] PCI/MSI: Enable PCI_MSI_IRQ_DOMAIN support for MIPS

Add MIPS as an arch that supports PCI_MSI_IRQ_DOMAIN and add
generation of msi.h in the MIPS arch.

Signed-off-by: Jim Quinlan <[email protected]>
---
arch/mips/include/asm/Kbuild | 1 +
drivers/pci/Kconfig | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/Kbuild b/arch/mips/include/asm/Kbuild
index b1f6669..92028be 100644
--- a/arch/mips/include/asm/Kbuild
+++ b/arch/mips/include/asm/Kbuild
@@ -9,6 +9,7 @@ generic-y += irq_work.h
generic-y += local64.h
generic-y += mcs_spinlock.h
generic-y += mm-arch-hooks.h
+generic-y += msi.h
generic-y += parport.h
generic-y += percpu.h
generic-y += preempt.h
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index bda1517..717616f 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -25,7 +25,7 @@ config PCI_MSI
If you don't know what to do here, say Y.

config PCI_MSI_IRQ_DOMAIN
- def_bool ARC || ARM || ARM64 || X86
+ def_bool ARC || ARM || ARM64 || MIPS || X86
depends on PCI_MSI
select GENERIC_MSI_IRQ_DOMAIN

--
1.9.0.138.g2de3478

2018-01-15 23:29:45

by Jim Quinlan

Subject: [PATCH v4 8/8] MIPS: BMIPS: Enable PCI

Adds the Kconfig hooks to enable the Broadcom STB PCIe root complex
driver for Broadcom MIPS systems.

Signed-off-by: Jim Quinlan <[email protected]>
---
arch/mips/Kconfig | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 350a990..fe17361 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -210,6 +210,9 @@ config BMIPS_GENERIC
select BOOT_RAW
select NO_EXCEPT_FILL
select USE_OF
+ select HW_HAS_PCI
+ select PCI_DRIVERS_GENERIC
+ select PCI
select CEVT_R4K
select CSRC_R4K
select SYNC_R4K
--
1.9.0.138.g2de3478

2018-01-15 23:30:05

by Jim Quinlan

Subject: [PATCH v4 7/8] MIPS: BMIPS: Add PCI bindings for 7425, 7435

Adds the PCIe nodes for the Broadcom STB PCIe root complex.

Signed-off-by: Jim Quinlan <[email protected]>
---
arch/mips/boot/dts/brcm/bcm7425.dtsi | 26 ++++++++++++++++++++++++++
arch/mips/boot/dts/brcm/bcm7435.dtsi | 27 +++++++++++++++++++++++++++
arch/mips/boot/dts/brcm/bcm97425svmb.dts | 4 ++++
arch/mips/boot/dts/brcm/bcm97435svmb.dts | 4 ++++
4 files changed, 61 insertions(+)
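
Note: each `ranges` entry in the nodes added by this patch has six cells.
The following is a minimal sketch of how one entry decodes. It assumes
the parent bus uses a single address cell, which is inferred from the
cell count and not stated in the patch. struct pci_range and
decode_range() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pci_range {
	unsigned int space;	/* 1 = I/O, 2 = 32-bit memory, 3 = 64-bit memory */
	uint64_t pci_addr;	/* address on the PCIe side */
	uint64_t cpu_addr;	/* address on the CPU side */
	uint64_t size;		/* window size in bytes */
};

/*
 * Cell layout assumed here: phys.hi, phys.mid, phys.lo (child address,
 * #address-cells = <3>), one parent address cell, then two size cells
 * (#size-cells = <2>). The space code lives in bits 25:24 of phys.hi.
 */
struct pci_range decode_range(const uint32_t cells[6])
{
	struct pci_range r;

	assert(cells != NULL);
	r.space = (cells[0] >> 24) & 0x3;
	r.pci_addr = ((uint64_t)cells[1] << 32) | cells[2];
	r.cpu_addr = cells[3];
	r.size = ((uint64_t)cells[4] << 32) | cells[5];
	return r;
}
```

Decoding the first entry, <0x2000000 0x0 0xd0000000 0xd0000000 0 0x08000000>,
gives a 32-bit memory window of 128MB with identical CPU and PCIe
addresses.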

diff --git a/arch/mips/boot/dts/brcm/bcm7425.dtsi b/arch/mips/boot/dts/brcm/bcm7425.dtsi
index e4fb9b6..02168d0 100644
--- a/arch/mips/boot/dts/brcm/bcm7425.dtsi
+++ b/arch/mips/boot/dts/brcm/bcm7425.dtsi
@@ -495,4 +495,30 @@
status = "disabled";
};
};
+
+ pcie: pcie@10410000 {
+ reg = <0x10410000 0x830c>;
+ compatible = "brcm,bcm7425-pcie";
+ interrupts = <37>, <37>;
+ interrupt-names = "pcie", "msi";
+ interrupt-parent = <&periph_intc>;
+ #address-cells = <3>;
+ #size-cells = <2>;
+ linux,pci-domain = <0>;
+ brcm,enable-ssc;
+ bus-range = <0x00 0xff>;
+ msi-controller;
+ #interrupt-cells = <1>;
+ /* 4x128MB windows */
+ ranges = <0x2000000 0x0 0xd0000000 0xd0000000 0 0x08000000>,
+ <0x2000000 0x0 0xd8000000 0xd8000000 0 0x08000000>,
+ <0x2000000 0x0 0xe0000000 0xe0000000 0 0x08000000>,
+ <0x2000000 0x0 0xe8000000 0xe8000000 0 0x08000000>;
+ interrupt-map-mask = <0 0 0 7>;
+ interrupt-map = <0 0 0 1 &periph_intc 33
+ 0 0 0 2 &periph_intc 34
+ 0 0 0 3 &periph_intc 35
+ 0 0 0 4 &periph_intc 36>;
+ };
+
};
diff --git a/arch/mips/boot/dts/brcm/bcm7435.dtsi b/arch/mips/boot/dts/brcm/bcm7435.dtsi
index 1484e89..84881224 100644
--- a/arch/mips/boot/dts/brcm/bcm7435.dtsi
+++ b/arch/mips/boot/dts/brcm/bcm7435.dtsi
@@ -510,4 +510,31 @@
status = "disabled";
};
};
+
+ pcie: pcie@10410000 {
+ reg = <0x10410000 0x930c>;
+ interrupts = <0x27>, <0x27>;
+ interrupt-names = "pcie", "msi";
+ interrupt-parent = <&periph_intc>;
+ compatible = "brcm,bcm7435-pcie";
+ #address-cells = <3>;
+ #size-cells = <2>;
+ linux,pci-domain = <0>;
+ brcm,enable-ssc;
+ bus-range = <0x00 0xff>;
+ msi-controller;
+ #interrupt-cells = <1>;
+ /* 4x128MB windows */
+ ranges = <0x2000000 0x0 0xd0000000 0xd0000000 0 0x08000000>,
+ <0x2000000 0x0 0xd8000000 0xd8000000 0 0x08000000>,
+ <0x2000000 0x0 0xe0000000 0xe0000000 0 0x08000000>,
+ <0x2000000 0x0 0xe8000000 0xe8000000 0 0x08000000>;
+ interrupt-map-mask = <0 0 0 7>;
+ interrupt-map = <0 0 0 1 &periph_intc 35
+ 0 0 0 2 &periph_intc 36
+ 0 0 0 3 &periph_intc 37
+ 0 0 0 4 &periph_intc 38>;
+ status = "disabled";
+ };
+
};
diff --git a/arch/mips/boot/dts/brcm/bcm97425svmb.dts b/arch/mips/boot/dts/brcm/bcm97425svmb.dts
index ce762c7..a958e56 100644
--- a/arch/mips/boot/dts/brcm/bcm97425svmb.dts
+++ b/arch/mips/boot/dts/brcm/bcm97425svmb.dts
@@ -144,3 +144,7 @@
&mspi {
status = "okay";
};
+
+&pcie {
+ status = "okay";
+};
diff --git a/arch/mips/boot/dts/brcm/bcm97435svmb.dts b/arch/mips/boot/dts/brcm/bcm97435svmb.dts
index d4dd31a..f41791e 100644
--- a/arch/mips/boot/dts/brcm/bcm97435svmb.dts
+++ b/arch/mips/boot/dts/brcm/bcm97435svmb.dts
@@ -120,3 +120,7 @@
&mspi {
status = "okay";
};
+
+&pcie {
+ status = "okay";
+};
--
1.9.0.138.g2de3478

2018-01-15 23:30:04

by Jim Quinlan

Subject: [PATCH v4 6/8] PCI: brcmstb: Add MSI capability

This commit adds MSI to the Broadcom STB PCIe host controller. It does
not add MSI-X since that functionality is not in the HW. The MSI
controller is physically located within the PCIe block; however, there
is no reason why the MSI controller could not be moved elsewhere in
the future.

Since the internal Brcmstb MSI controller is intertwined with the PCIe
controller, it is not its own platform device but rather part of the
PCIe platform device.

Signed-off-by: Jim Quinlan <[email protected]>
---
drivers/pci/host/pcie-brcmstb.c | 374 +++++++++++++++++++++++++++++++++++++---
1 file changed, 353 insertions(+), 21 deletions(-)
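
Note: the MSI data word that brcm_compose_msi_msg() hands to an endpoint
is derived from the value programmed into PCIE_MISC_MSI_DATA_CONFIG. A
minimal user-space sketch of that arithmetic follows. msi_msg_data() is
a hypothetical helper that mirrors the expression in the patch; it is
not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * data_config is the 32-bit value written to PCIE_MISC_MSI_DATA_CONFIG:
 * the upper 16 bits are the mask, the lower 16 bits are the data value.
 * The low bits that the mask leaves clear carry the MSI vector number.
 */
uint32_t msi_msg_data(uint32_t data_config, uint32_t hwirq)
{
	uint32_t mask = data_config >> 16;
	uint32_t value = data_config & 0xffff;

	assert(hwirq < 32);
	return (mask & value) | hwirq;
}
```

With the rev 3.3 and later setting of 0xffe06540, vector 5 yields a data
word of 0x6545. With the older setting of 0xfff86540, vector 3 yields
0x6543.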

diff --git a/drivers/pci/host/pcie-brcmstb.c b/drivers/pci/host/pcie-brcmstb.c
index 049f0db..6bc2c2b 100644
--- a/drivers/pci/host/pcie-brcmstb.c
+++ b/drivers/pci/host/pcie-brcmstb.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (C) 2009 - 2017 Broadcom */

+#include <linux/bitops.h>
#include <linux/clk.h>
#include <linux/compiler.h>
#include <linux/delay.h>
@@ -9,11 +10,13 @@
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/ioport.h>
+#include <linux/irqchip/chained_irq.h>
#include <linux/irqdomain.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/log2.h>
#include <linux/module.h>
+#include <linux/msi.h>
#include <linux/of_address.h>
#include <linux/of_irq.h>
#include <linux/of_pci.h>
@@ -46,6 +49,9 @@
#define PCIE_MISC_RC_BAR2_CONFIG_LO 0x4034
#define PCIE_MISC_RC_BAR2_CONFIG_HI 0x4038
#define PCIE_MISC_RC_BAR3_CONFIG_LO 0x403c
+#define PCIE_MISC_MSI_BAR_CONFIG_LO 0x4044
+#define PCIE_MISC_MSI_BAR_CONFIG_HI 0x4048
+#define PCIE_MISC_MSI_DATA_CONFIG 0x404c
#define PCIE_MISC_PCIE_CTRL 0x4064
#define PCIE_MISC_PCIE_STATUS 0x4068
#define PCIE_MISC_REVISION 0x406c
@@ -54,6 +60,7 @@
#define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LIMIT_HI 0x4084
#define PCIE_MISC_HARD_PCIE_HARD_DEBUG 0x4204
#define PCIE_INTR2_CPU_BASE 0x4300
+#define PCIE_MSI_INTR2_BASE 0x4500

/*
* Broadcom Settop Box PCIe Register Field shift and mask info. The
@@ -114,6 +121,8 @@

#define BRCM_NUM_PCIE_OUT_WINS 0x4
#define BRCM_MAX_SCB 0x4
+#define BRCM_INT_PCI_MSI_NR 32
+#define BRCM_PCIE_HW_REV_33 0x0303

#define BRCM_MSI_TARGET_ADDR_LT_4GB 0x0fffffffcULL
#define BRCM_MSI_TARGET_ADDR_GT_4GB 0xffffffffcULL
@@ -202,6 +211,33 @@ struct brcm_window {
dma_addr_t size;
};

+struct brcm_msi {
+ struct device *dev;
+ void __iomem *base;
+ struct device_node *dn;
+ struct irq_domain *msi_domain;
+ struct irq_domain *inner_domain;
+ struct mutex lock; /* guards the alloc/free operations */
+ u64 target_addr;
+ int irq;
+
+ /* intr_base is the base pointer for interrupt status/set/clr regs */
+ void __iomem *intr_base;
+
+ /* intr_legacy_mask indicates which bits are MSI interrupts */
+ u32 intr_legacy_mask;
+
+ /*
+ * intr_legacy_offset is the bit position of MSI_01. It is used
+ * to map the register bit position to a hwirq that starts at 0.
+ */
+ u32 intr_legacy_offset;
+
+ /* used indicates which MSI interrupts have been alloc'd */
+ unsigned long used;
+ unsigned int rev;
+};
+
/* Internal PCIe Host Controller Information.*/
struct brcm_pcie {
struct device *dev;
@@ -216,7 +252,10 @@ struct brcm_pcie {
int num_out_wins;
bool ssc;
int gen;
+ u64 msi_target_addr;
struct brcm_window out_wins[BRCM_NUM_PCIE_OUT_WINS];
+ struct brcm_msi *msi;
+ bool msi_internal;
unsigned int rev;
const int *reg_offsets;
const int *reg_field_info;
@@ -224,9 +263,9 @@ struct brcm_pcie {
};

struct pcie_cfg_data {
- const int *reg_field_info;
- const int *offsets;
- const enum pcie_type type;
+ const int *reg_field_info;
+ const int *offsets;
+ const enum pcie_type type;
};

static const int pcie_reg_field_info[] = {
@@ -827,6 +866,267 @@ static void brcm_pcie_set_outbound_win(struct brcm_pcie *pcie,
}
}

+static struct irq_chip brcm_msi_irq_chip = {
+ .name = "Brcm_MSI",
+ .irq_mask = pci_msi_mask_irq,
+ .irq_unmask = pci_msi_unmask_irq,
+};
+
+static struct msi_domain_info brcm_msi_domain_info = {
+ .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_PCI_MSIX),
+ .chip = &brcm_msi_irq_chip,
+};
+
+static void brcm_pcie_msi_isr(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct brcm_msi *msi;
+ unsigned long status, virq;
+ u32 mask, bit, hwirq;
+ struct device *dev;
+
+ chained_irq_enter(chip, desc);
+ msi = irq_desc_get_handler_data(desc);
+ mask = msi->intr_legacy_mask;
+ dev = msi->dev;
+
+ while ((status = bcm_readl(msi->intr_base + STATUS) & mask)) {
+ for_each_set_bit(bit, &status, BRCM_INT_PCI_MSI_NR) {
+ /* clear the interrupt */
+ bcm_writel(1 << bit, msi->intr_base + CLR);
+
+ /* Account for legacy interrupt offset */
+ hwirq = bit - msi->intr_legacy_offset;
+
+ virq = irq_find_mapping(msi->inner_domain, hwirq);
+ if (virq) {
+ if (msi->used & BIT(bit))
+ generic_handle_irq(virq);
+ else
+ dev_info(dev, "unhandled MSI %d\n",
+ hwirq);
+ } else {
+ /* Unknown MSI, just clear it */
+ dev_dbg(dev, "unexpected MSI\n");
+ }
+ }
+ }
+ chained_irq_exit(chip, desc);
+}
+
+static void brcm_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+{
+ struct brcm_msi *msi = irq_data_get_irq_chip_data(data);
+ u32 temp;
+
+ msg->address_lo = lower_32_bits(msi->target_addr);
+ msg->address_hi = upper_32_bits(msi->target_addr);
+ temp = bcm_readl(msi->base + PCIE_MISC_MSI_DATA_CONFIG);
+ msg->data = ((temp >> 16) & (temp & 0xffff)) | data->hwirq;
+}
+
+static int brcm_msi_set_affinity(struct irq_data *irq_data,
+ const struct cpumask *mask, bool force)
+{
+ return -EINVAL;
+}
+
+static struct irq_chip brcm_msi_bottom_irq_chip = {
+ .name = "Brcm_MSI",
+ .irq_compose_msi_msg = brcm_compose_msi_msg,
+ .irq_set_affinity = brcm_msi_set_affinity,
+};
+
+static int brcm_msi_alloc(struct brcm_msi *msi)
+{
+ int bit, hwirq;
+
+ mutex_lock(&msi->lock);
+ bit = ~msi->used ? ffz(msi->used) : -1;
+
+ if (bit >= 0 && bit < BRCM_INT_PCI_MSI_NR) {
+ msi->used |= (1 << bit);
+ hwirq = bit - msi->intr_legacy_offset;
+ } else {
+ hwirq = -ENOSPC;
+ }
+
+ mutex_unlock(&msi->lock);
+ return hwirq;
+}
+
+static void brcm_msi_free(struct brcm_msi *msi, unsigned long hwirq)
+{
+ mutex_lock(&msi->lock);
+ msi->used &= ~(1 << (hwirq + msi->intr_legacy_offset));
+ mutex_unlock(&msi->lock);
+}
+
+static int brcm_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs, void *args)
+{
+ struct brcm_msi *msi = domain->host_data;
+ int hwirq;
+
+ hwirq = brcm_msi_alloc(msi);
+
+ if (hwirq < 0)
+ return hwirq;
+
+ irq_domain_set_info(domain, virq, (irq_hw_number_t)hwirq,
+ &brcm_msi_bottom_irq_chip, domain->host_data,
+ handle_simple_irq, NULL, NULL);
+ return 0;
+}
+
+static void brcm_irq_domain_free(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs)
+{
+ struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+ struct brcm_msi *msi = irq_data_get_irq_chip_data(d);
+
+ brcm_msi_free(msi, d->hwirq);
+}
+
+static void brcm_msi_set_regs(struct brcm_msi *msi)
+{
+ u32 data_val, msi_lo, msi_hi;
+
+ if (msi->rev >= BRCM_PCIE_HW_REV_33) {
+ /*
+ * ffe0 -- least sig 5 bits are 0 indicating 32 msgs
+ * 6540 -- this is our arbitrary unique data value
+ */
+ data_val = 0xffe06540;
+ } else {
+ /*
+ * fff8 -- least sig 3 bits are 0 indicating 8 msgs
+ * 6540 -- this is our arbitrary unique data value
+ */
+ data_val = 0xfff86540;
+ }
+
+ /*
+ * Make sure we are not masking MSIs. Note that MSIs can be masked,
+ * but that occurs on the PCIe EP device
+ */
+ bcm_writel(0xffffffff & msi->intr_legacy_mask,
+ msi->intr_base + MASK_CLR);
+
+ msi_lo = lower_32_bits(msi->target_addr);
+ msi_hi = upper_32_bits(msi->target_addr);
+ /*
+ * The 0 bit of PCIE_MISC_MSI_BAR_CONFIG_LO is repurposed to MSI
+ * enable, which we set to 1.
+ */
+ bcm_writel(msi_lo | 1, msi->base + PCIE_MISC_MSI_BAR_CONFIG_LO);
+ bcm_writel(msi_hi, msi->base + PCIE_MISC_MSI_BAR_CONFIG_HI);
+ bcm_writel(data_val, msi->base + PCIE_MISC_MSI_DATA_CONFIG);
+}
+
+static const struct irq_domain_ops msi_domain_ops = {
+ .alloc = brcm_irq_domain_alloc,
+ .free = brcm_irq_domain_free,
+};
+
+static int brcm_allocate_domains(struct brcm_msi *msi)
+{
+ struct fwnode_handle *fwnode = of_node_to_fwnode(msi->dn);
+ struct device *dev = msi->dev;
+
+ msi->inner_domain = irq_domain_add_linear(NULL, BRCM_INT_PCI_MSI_NR,
+ &msi_domain_ops, msi);
+ if (!msi->inner_domain) {
+ dev_err(dev, "failed to create IRQ domain\n");
+ return -ENOMEM;
+ }
+
+ msi->msi_domain = pci_msi_create_irq_domain(fwnode,
+ &brcm_msi_domain_info,
+ msi->inner_domain);
+ if (!msi->msi_domain) {
+ dev_err(dev, "failed to create MSI domain\n");
+ irq_domain_remove(msi->inner_domain);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void brcm_free_domains(struct brcm_msi *msi)
+{
+ irq_domain_remove(msi->msi_domain);
+ irq_domain_remove(msi->inner_domain);
+}
+
+static void brcm_msi_remove(struct brcm_pcie *pcie)
+{
+ struct brcm_msi *msi = pcie->msi;
+
+ if (!msi)
+ return;
+ irq_set_chained_handler(msi->irq, NULL);
+ irq_set_handler_data(msi->irq, NULL);
+ brcm_free_domains(msi);
+}
+
+static int brcm_pcie_enable_msi(struct brcm_pcie *pcie)
+{
+ struct brcm_msi *msi;
+ int irq, ret;
+ struct device *dev = pcie->dev;
+
+ irq = irq_of_parse_and_map(dev->of_node, 1);
+ if (irq <= 0) {
+ dev_err(dev, "cannot map msi intr\n");
+ return -ENODEV;
+ }
+
+ msi = devm_kzalloc(dev, sizeof(struct brcm_msi), GFP_KERNEL);
+ if (!msi)
+ return -ENOMEM;
+
+ msi->dev = dev;
+ msi->base = pcie->base;
+ msi->rev = pcie->rev;
+ msi->dn = pcie->dn;
+ msi->target_addr = pcie->msi_target_addr;
+ msi->irq = irq;
+
+ ret = brcm_allocate_domains(msi);
+ if (ret)
+ return ret;
+
+ irq_set_chained_handler_and_data(msi->irq, brcm_pcie_msi_isr, msi);
+
+ if (msi->rev >= BRCM_PCIE_HW_REV_33) {
+ msi->intr_base = msi->base + PCIE_MSI_INTR2_BASE;
+ /*
+ * This version of PCIe hw has only 32 intr bits
+ * starting at bit position 0.
+ */
+ msi->intr_legacy_mask = 0xffffffff;
+ msi->intr_legacy_offset = 0x0;
+ msi->used = 0x0;
+
+ } else {
+ msi->intr_base = msi->base + PCIE_INTR2_CPU_BASE;
+ /*
+ * This version of PCIe hw has only 8 intr bits starting
+ * at bit position 24.
+ */
+ msi->intr_legacy_mask = 0xff000000;
+ msi->intr_legacy_offset = 24;
+ msi->used = 0x00ffffff;
+ }
+
+ brcm_msi_set_regs(msi);
+ pcie->msi = msi;
+
+ return 0;
+}
+
/* Configuration space read/write support */
static int cfg_index(int busnr, int devfn, int reg)
{
@@ -1071,6 +1371,7 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
u16 nlw, cls, lnksta;
bool ssc_good = false;
struct device *dev = pcie->dev;
+ u64 msi_target_addr;

/* Reset the bridge */
brcm_pcie_bridge_sw_init_set(pcie, 1);
@@ -1115,27 +1416,24 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
* The PCIe host controller by design must set the inbound
* viewport to be a contiguous arrangement of all of the
* system's memory. In addition, its size must be a power of
- * two. To further complicate matters, the viewport must
- * start on a pcie-address that is aligned on a multiple of its
- * size. If a portion of the viewport does not represent
- * system memory -- e.g. 3GB of memory requires a 4GB viewport
- * -- we can map the outbound memory in or after 3GB and even
- * though the viewport will overlap the outbound memory the
- * controller will know to send outbound memory downstream and
- * everything else upstream.
+ * two. Further, the MSI target address must NOT be placed
+ * inside this region, as the decoding logic will consider its
+ * address to be inbound memory traffic. To further
+ * complicate matters, the viewport must start on a
+ * pcie-address that is aligned on a multiple of its size.
+ * If a portion of the viewport does not represent system
+ * memory -- e.g. 3GB of memory requires a 4GB viewport --
+ * we can map the outbound memory in or after 3GB and even
+ * though the viewport will overlap the outbound memory
+ * the controller will know to send outbound memory downstream
+ * and everything else upstream.
*/
rc_bar2_size = roundup_pow_of_two_64(total_mem_size);

- /*
- * Set simple configuration based on memory sizes
- * only. We always start the viewport at address 0.
- */
- rc_bar2_offset = 0;
-
if (dma_ranges) {
/*
* The best-case scenario is to place the inbound
- * region in the first 4GB of pci-space, as some
+ * region in the first 4GB of pcie-space, as some
* legacy devices can only address 32bits.
* We would also like to put the MSI under 4GB
* as well, since some devices require a 32bit
@@ -1144,6 +1442,14 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
if (total_mem_size <= 0xc0000000ULL &&
rc_bar2_size <= 0x100000000ULL) {
rc_bar2_offset = 0;
+ /* If the viewport is less than 4GB we can fit
+ * the MSI target address under 4GB. Otherwise
+ * put it right below 64GB.
+ */
+ msi_target_addr =
+ (rc_bar2_size == 0x100000000ULL)
+ ? BRCM_MSI_TARGET_ADDR_GT_4GB
+ : BRCM_MSI_TARGET_ADDR_LT_4GB;
} else {
/*
* The system memory is 4GB or larger so we
@@ -1153,8 +1459,12 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
* start it at the 1x multiple of its size
*/
rc_bar2_offset = rc_bar2_size;
- }

+ /* Since we are starting the viewport at 4GB or
+ * higher, put the MSI target address below 4GB
+ */
+ msi_target_addr = BRCM_MSI_TARGET_ADDR_LT_4GB;
+ }
} else {
/*
* Set simple configuration based on memory sizes
@@ -1162,7 +1472,12 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
* and set the MSI target address accordingly.
*/
rc_bar2_offset = 0;
+
+ msi_target_addr = (rc_bar2_size >= 0x100000000ULL)
+ ? BRCM_MSI_TARGET_ADDR_GT_4GB
+ : BRCM_MSI_TARGET_ADDR_LT_4GB;
}
+ pcie->msi_target_addr = msi_target_addr;

tmp = lower_32_bits(rc_bar2_offset);
tmp = INSERT_FIELD(tmp, PCIE_MISC_RC_BAR2_CONFIG_LO, SIZE,
@@ -1332,6 +1647,9 @@ static int brcm_pcie_resume(struct device *dev)
if (ret)
return ret;

+ if (pcie->msi && pcie->msi_internal)
+ brcm_msi_set_regs(pcie->msi);
+
pcie->suspended = false;

return 0;
@@ -1339,6 +1657,7 @@ static int brcm_pcie_resume(struct device *dev)

static void _brcm_pcie_remove(struct brcm_pcie *pcie)
{
+ brcm_msi_remove(pcie);
turn_off(pcie);
clk_disable_unprepare(pcie->clk);
clk_put(pcie->clk);
@@ -1367,7 +1686,7 @@ static int brcm_pcie_remove(struct platform_device *pdev)

static int brcm_pcie_probe(struct platform_device *pdev)
{
- struct device_node *dn = pdev->dev.of_node;
+ struct device_node *dn = pdev->dev.of_node, *msi_dn;
const struct of_device_id *of_id;
const struct pcie_cfg_data *data;
int ret;
@@ -1447,6 +1766,20 @@ static int brcm_pcie_probe(struct platform_device *pdev)
if (ret)
goto fail;

+ msi_dn = of_parse_phandle(pcie->dn, "msi-parent", 0);
+ /* Use the internal MSI if no msi-parent property */
+ if (!msi_dn)
+ msi_dn = pcie->dn;
+
+ if (pci_msi_enabled() && msi_dn == pcie->dn) {
+ ret = brcm_pcie_enable_msi(pcie);
+ if (ret)
+ dev_err(pcie->dev,
+ "probe of internal MSI failed: %d\n", ret);
+ else
+ pcie->msi_internal = true;
+ }
+
list_splice_init(&pcie->resources, &bridge->windows);
bridge->dev.parent = &pdev->dev;
bridge->busnr = 0;
@@ -1469,7 +1802,6 @@ static int brcm_pcie_probe(struct platform_device *pdev)
pcie->root_bus = bridge->bus;

return 0;
-
fail:
_brcm_pcie_remove(pcie);
return ret;
--
1.9.0.138.g2de3478
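
Note: the vector bookkeeping in brcm_msi_alloc() and brcm_msi_free() can
be modelled in user space. The sketch below is an illustration under
stated assumptions, not the driver code: struct msi_alloc and the three
functions are hypothetical names, locking is omitted, and -1 is returned
where the driver returns -ENOSPC.

```c
#include <assert.h>
#include <stdint.h>

struct msi_alloc {
	uint32_t used;		/* one bit per interrupt register bit */
	unsigned int offset;	/* register bit position of hwirq 0 */
};

/* legacy != 0 models the older HW: 8 MSI bits starting at bit 24. */
void msi_alloc_init(struct msi_alloc *m, int legacy)
{
	assert(m != 0);
	m->used = legacy ? UINT32_C(0x00ffffff) : 0;
	m->offset = legacy ? 24 : 0;
}

/* Claim the lowest free register bit and return its hwirq, or -1. */
int msi_alloc_hwirq(struct msi_alloc *m)
{
	unsigned int bit;

	assert(m != 0);
	for (bit = 0; bit < 32; bit++) {
		if (!(m->used & (UINT32_C(1) << bit))) {
			m->used |= UINT32_C(1) << bit;
			return (int)(bit - m->offset);
		}
	}
	return -1;
}

/* Release a hwirq previously returned by msi_alloc_hwirq(). */
void msi_free_hwirq(struct msi_alloc *m, unsigned int hwirq)
{
	assert(m != 0 && hwirq + m->offset < 32);
	m->used &= ~(UINT32_C(1) << (hwirq + m->offset));
}
```

On the older HW the low 24 bits start out marked as used, so the first
allocation claims register bit 24 and returns hwirq 0, and the ninth
allocation fails.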

2018-01-15 23:30:53

by Jim Quinlan

Subject: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

The Broadcom STB PCIe host controller is intimately related to the
memory subsystem. This close relationship adds complexity to how cpu
system memory is mapped to PCIe memory. Ideally, this mapping is an
identity mapping, or an identity mapping off by a constant. Not so in
this case.

Consider the Broadcom reference board BCM97445LCC_4X8 which has 6 GB
of system memory. Here is how the PCIe controller maps the
system memory to PCIe memory:

memc0-a@[ 0....3fffffff] <=> pci@[ 0....3fffffff]
memc0-b@[100000000...13fffffff] <=> pci@[ 40000000....7fffffff]
memc1-a@[ 40000000....7fffffff] <=> pci@[ 80000000....bfffffff]
memc1-b@[300000000...33fffffff] <=> pci@[ c0000000....ffffffff]
memc2-a@[ 80000000....bfffffff] <=> pci@[100000000...13fffffff]
memc2-b@[c00000000...c3fffffff] <=> pci@[140000000...17fffffff]

Although there are some "gaps" that can be added between the
individual mappings by software, the permutation of memory regions for
the most part is fixed by HW. The solution of having something close
to an identity mapping is not possible.

The idea behind this HW design is that the same PCIe module can
act as an RC or EP, and if it acts as an EP it concatenates all
of system memory into a BAR so anything can be accessed. Unfortunately,
when the PCIe block is in the role of an RC it also presents this
"BAR" to downstream PCIe devices, rather than offering an identity map
between its system memory and PCIe space.

Suppose that an endpoint driver allocs some DMA memory. Suppose this
memory is located at 0x6000_0000, which is in the middle of memc1-a.
The driver wants a dma_addr_t value that it can pass on to the EP to
use. Without doing any custom mapping, the EP will use this value for
DMA: the driver will get a dma_addr_t equal to 0x6000_0000. But this
won't work; the device needs a dma_addr_t that reflects the PCIe space
address, namely 0xa000_0000.
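
As a rough illustration only (this is not part of the patch), the table
above can be written as a lookup in plain userspace C. The names
example_range, example_ranges and example_cpu_to_pci() are hypothetical;
the driver itself keeps its ranges in struct of_pci_range. Each region in
the table is 1 GB (0x40000000) in size.

```c
#include <assert.h>	/* for the check shown in the usage note */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range record; the patch itself uses struct of_pci_range. */
struct example_range {
	uint64_t cpu_addr;	/* start of the region in CPU physical space */
	uint64_t pci_addr;	/* start of the same region in PCIe space */
	uint64_t size;		/* length of the region in bytes */
};

/* The six 1 GB regions of the BCM97445LCC_4X8 table in the text. */
static const struct example_range example_ranges[] = {
	{ 0x000000000ULL, 0x000000000ULL, 0x40000000ULL },	/* memc0-a */
	{ 0x100000000ULL, 0x040000000ULL, 0x40000000ULL },	/* memc0-b */
	{ 0x040000000ULL, 0x080000000ULL, 0x40000000ULL },	/* memc1-a */
	{ 0x300000000ULL, 0x0c0000000ULL, 0x40000000ULL },	/* memc1-b */
	{ 0x080000000ULL, 0x100000000ULL, 0x40000000ULL },	/* memc2-a */
	{ 0xc00000000ULL, 0x140000000ULL, 0x40000000ULL },	/* memc2-b */
};

/*
 * Translate a CPU physical address into the PCIe-space address that an
 * endpoint must use. Addresses outside every range are returned
 * unchanged, which is the same fallback brcm_to_pci() uses.
 */
uint64_t example_cpu_to_pci(uint64_t addr)
{
	size_t i;

	for (i = 0; i < sizeof(example_ranges) / sizeof(example_ranges[0]); i++) {
		const struct example_range *r = &example_ranges[i];

		if (addr >= r->cpu_addr && addr - r->cpu_addr < r->size)
			return addr - r->cpu_addr + r->pci_addr;
	}
	return addr;
}
```

With this table, example_cpu_to_pci(0x60000000ULL) falls in memc1-a and
yields 0x60000000 - 0x40000000 + 0x80000000 = 0xa0000000, matching the
value quoted above.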

So, essentially the solution to this problem must modify the
dma_addr_t returned by the DMA routines. There are two
ways (I know of) of doing this:

(a) overriding/redefining the dma_to_phys() and phys_to_dma() calls
that are used by the dma_ops routines. This is the approach of

arch/mips/cavium-octeon/dma-octeon.c

In ARM and ARM64 these two routines are defined in asm/dma-mapping.h
as static inline functions.

(b) Subscribe to a notifier that notifies when a device is added to a
bus. When this happens, set_dma_ops() can be called for the device.
This method is mentioned in:

http://lxr.free-electrons.com/source/drivers/of/platform.c?v=3.16#L152

where it says as a comment

"In case if platform code need to use own special DMA
configuration, it can use Platform bus notifier and
handle BUS_NOTIFY_ADD_DEVICE event to fix up DMA
configuration."

Solution (b) is what this commit does. It uses its own set of
dma_ops which are wrappers around the arch_dma_ops. The
wrappers translate the dma addresses before/after invoking
the arch_dma_ops, as appropriate.
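
The wrapping pattern can be sketched in userspace C as follows. This is
a simplified model under stated assumptions, not the driver code: struct
example_dma_ops is a hypothetical two-entry stand-in for struct
dma_map_ops, the "arch" ops are an identity mapping, and only the
memc1-a window is modelled.

```c
#include <assert.h>	/* for the checks shown in the usage note */
#include <stdint.h>

typedef uint64_t example_dma_addr_t;

/* Hypothetical, much-reduced stand-in for struct dma_map_ops. */
struct example_dma_ops {
	example_dma_addr_t (*map)(uint64_t phys);
	uint64_t (*unmap)(example_dma_addr_t handle);
};

/* "Arch" ops: assumed here to be a plain identity mapping. */
static example_dma_addr_t arch_map(uint64_t phys) { return phys; }
static uint64_t arch_unmap(example_dma_addr_t handle) { return handle; }

static const struct example_dma_ops example_arch_ops = {
	.map = arch_map,
	.unmap = arch_unmap,
};

/* One window only (memc1-a): CPU 0x40000000 <=> PCIe 0x80000000, 1 GB. */
#define EX_CPU_BASE	0x40000000ULL
#define EX_PCI_BASE	0x80000000ULL
#define EX_SIZE		0x40000000ULL

static example_dma_addr_t ex_to_pci(example_dma_addr_t a)
{
	if (a >= EX_CPU_BASE && a - EX_CPU_BASE < EX_SIZE)
		return a - EX_CPU_BASE + EX_PCI_BASE;
	return a;
}

static example_dma_addr_t ex_to_cpu(example_dma_addr_t a)
{
	if (a >= EX_PCI_BASE && a - EX_PCI_BASE < EX_SIZE)
		return a - EX_PCI_BASE + EX_CPU_BASE;
	return a;
}

/* Wrapper: call the arch op first, then translate its result. */
static example_dma_addr_t wrapped_map(uint64_t phys)
{
	return ex_to_pci(example_arch_ops.map(phys));
}

/* Wrapper: translate the handle back before calling the arch op. */
static uint64_t wrapped_unmap(example_dma_addr_t handle)
{
	return example_arch_ops.unmap(ex_to_cpu(handle));
}

const struct example_dma_ops example_wrapped_ops = {
	.map = wrapped_map,
	.unmap = wrapped_unmap,
};
```

In this model example_wrapped_ops.map(0x60000000ULL) returns 0xa0000000
and example_wrapped_ops.unmap(0xa0000000ULL) returns 0x60000000, so the
arch layer only ever sees CPU-side addresses.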

Signed-off-by: Jim Quinlan <[email protected]>
---
drivers/pci/host/pcie-brcmstb.c | 420 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 411 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/host/pcie-brcmstb.c b/drivers/pci/host/pcie-brcmstb.c
index fd15ab1..049f0db 100644
--- a/drivers/pci/host/pcie-brcmstb.c
+++ b/drivers/pci/host/pcie-brcmstb.c
@@ -4,6 +4,7 @@
#include <linux/clk.h>
#include <linux/compiler.h>
#include <linux/delay.h>
+#include <linux/dma-mapping.h>
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/io.h>
@@ -318,11 +319,307 @@ static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
((val & ~reg##_##field##_MASK) | \
(reg##_##field##_MASK & (field_val << reg##_##field##_SHIFT)))

+static const struct dma_map_ops *arch_dma_ops;
+static const struct dma_map_ops *brcm_dma_ops_ptr;
+static struct of_pci_range *dma_ranges;
+static int num_dma_ranges;
+
static phys_addr_t scb_size[BRCM_MAX_SCB];
static int num_memc;
static int num_pcie;
static DEFINE_MUTEX(brcm_pcie_lock);

+static dma_addr_t brcm_to_pci(dma_addr_t addr)
+{
+ struct of_pci_range *p;
+
+ if (!num_dma_ranges)
+ return addr;
+
+ for (p = dma_ranges; p < &dma_ranges[num_dma_ranges]; p++)
+ if (addr >= p->cpu_addr && addr < (p->cpu_addr + p->size))
+ return addr - p->cpu_addr + p->pci_addr;
+
+ return addr;
+}
+
+static dma_addr_t brcm_to_cpu(dma_addr_t addr)
+{
+ struct of_pci_range *p;
+
+ if (!num_dma_ranges)
+ return addr;
+
+ for (p = dma_ranges; p < &dma_ranges[num_dma_ranges]; p++)
+ if (addr >= p->pci_addr && addr < (p->pci_addr + p->size))
+ return addr - p->pci_addr + p->cpu_addr;
+
+ return addr;
+}
+
+static void *brcm_alloc(struct device *dev, size_t size, dma_addr_t *handle,
+ gfp_t gfp, unsigned long attrs)
+{
+ void *ret;
+
+ ret = arch_dma_ops->alloc(dev, size, handle, gfp, attrs);
+ if (ret)
+ *handle = brcm_to_pci(*handle);
+ return ret;
+}
+
+static void brcm_free(struct device *dev, size_t size, void *cpu_addr,
+ dma_addr_t handle, unsigned long attrs)
+{
+ handle = brcm_to_cpu(handle);
+ arch_dma_ops->free(dev, size, cpu_addr, handle, attrs);
+}
+
+static int brcm_mmap(struct device *dev, struct vm_area_struct *vma,
+ void *cpu_addr, dma_addr_t dma_addr, size_t size,
+ unsigned long attrs)
+{
+ dma_addr = brcm_to_cpu(dma_addr);
+ return arch_dma_ops->mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+}
+
+static int brcm_get_sgtable(struct device *dev, struct sg_table *sgt,
+ void *cpu_addr, dma_addr_t handle, size_t size,
+ unsigned long attrs)
+{
+ handle = brcm_to_cpu(handle);
+ return arch_dma_ops->get_sgtable(dev, sgt, cpu_addr, handle, size,
+ attrs);
+}
+
+static dma_addr_t brcm_map_page(struct device *dev, struct page *page,
+ unsigned long offset, size_t size,
+ enum dma_data_direction dir,
+ unsigned long attrs)
+{
+ return brcm_to_pci(arch_dma_ops->map_page(dev, page, offset, size,
+ dir, attrs));
+}
+
+static void brcm_unmap_page(struct device *dev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir,
+ unsigned long attrs)
+{
+ handle = brcm_to_cpu(handle);
+ arch_dma_ops->unmap_page(dev, handle, size, dir, attrs);
+}
+
+static int brcm_map_sg(struct device *dev, struct scatterlist *sgl,
+ int nents, enum dma_data_direction dir,
+ unsigned long attrs)
+{
+ int i, j;
+ struct scatterlist *sg;
+
+ for_each_sg(sgl, sg, nents, i) {
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+ sg->dma_length = sg->length;
+#endif
+ sg->dma_address =
+ brcm_dma_ops_ptr->map_page(dev, sg_page(sg), sg->offset,
+ sg->length, dir, attrs);
+ if (dma_mapping_error(dev, sg->dma_address))
+ goto bad_mapping;
+ }
+ return nents;
+
+bad_mapping:
+ for_each_sg(sgl, sg, i, j)
+ brcm_dma_ops_ptr->unmap_page(dev, sg_dma_address(sg),
+ sg_dma_len(sg), dir, attrs);
+ return 0;
+}
+
+static void brcm_unmap_sg(struct device *dev,
+ struct scatterlist *sgl, int nents,
+ enum dma_data_direction dir,
+ unsigned long attrs)
+{
+ int i;
+ struct scatterlist *sg;
+
+ for_each_sg(sgl, sg, nents, i)
+ brcm_dma_ops_ptr->unmap_page(dev, sg_dma_address(sg),
+ sg_dma_len(sg), dir, attrs);
+}
+
+static void brcm_sync_single_for_cpu(struct device *dev,
+ dma_addr_t handle, size_t size,
+ enum dma_data_direction dir)
+{
+ handle = brcm_to_cpu(handle);
+ arch_dma_ops->sync_single_for_cpu(dev, handle, size, dir);
+}
+
+static void brcm_sync_single_for_device(struct device *dev,
+ dma_addr_t handle, size_t size,
+ enum dma_data_direction dir)
+{
+ handle = brcm_to_cpu(handle);
+ arch_dma_ops->sync_single_for_device(dev, handle, size, dir);
+}
+
+static dma_addr_t brcm_map_resource(struct device *dev, phys_addr_t phys,
+ size_t size,
+ enum dma_data_direction dir,
+ unsigned long attrs)
+{
+ if (arch_dma_ops->map_resource)
+ return brcm_to_pci(arch_dma_ops->map_resource
+ (dev, phys, size, dir, attrs));
+ return brcm_to_pci((dma_addr_t)phys);
+}
+
+static void brcm_unmap_resource(struct device *dev, dma_addr_t handle,
+ size_t size, enum dma_data_direction dir,
+ unsigned long attrs)
+{
+ if (arch_dma_ops->unmap_resource)
+ arch_dma_ops->unmap_resource(dev, brcm_to_cpu(handle), size,
+ dir, attrs);
+}
+
+void brcm_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
+ int nents, enum dma_data_direction dir)
+{
+ struct scatterlist *sg;
+ int i;
+
+ for_each_sg(sgl, sg, nents, i)
+ brcm_dma_ops_ptr->sync_single_for_cpu(dev, sg_dma_address(sg),
+ sg->length, dir);
+}
+
+void brcm_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
+ int nents, enum dma_data_direction dir)
+{
+ struct scatterlist *sg;
+ int i;
+
+ for_each_sg(sgl, sg, nents, i)
+ brcm_dma_ops_ptr->sync_single_for_device(dev,
+ sg_dma_address(sg),
+ sg->length, dir);
+}
+
+static int brcm_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+ return arch_dma_ops->mapping_error(dev, dma_addr);
+}
+
+static int brcm_dma_supported(struct device *dev, u64 mask)
+{
+ if (num_dma_ranges) {
+ /*
+ * It is our translated addresses that the EP will "see", so
+ * we check all of the ranges for the largest possible value.
+ */
+ int i;
+
+ for (i = 0; i < num_dma_ranges; i++)
+ if (dma_ranges[i].pci_addr + dma_ranges[i].size - 1
+ > mask)
+ return 0;
+ return 1;
+ }
+
+ return arch_dma_ops->dma_supported(dev, mask);
+}
+
+#ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
+u64 brcm_get_required_mask(struct device *dev)
+{
+ return arch_dma_ops->get_required_mask(dev);
+}
+#endif
+
+static const struct dma_map_ops brcm_dma_ops = {
+ .alloc = brcm_alloc,
+ .free = brcm_free,
+ .mmap = brcm_mmap,
+ .get_sgtable = brcm_get_sgtable,
+ .map_page = brcm_map_page,
+ .unmap_page = brcm_unmap_page,
+ .map_sg = brcm_map_sg,
+ .unmap_sg = brcm_unmap_sg,
+ .map_resource = brcm_map_resource,
+ .unmap_resource = brcm_unmap_resource,
+ .sync_single_for_cpu = brcm_sync_single_for_cpu,
+ .sync_single_for_device = brcm_sync_single_for_device,
+ .sync_sg_for_cpu = brcm_sync_sg_for_cpu,
+ .sync_sg_for_device = brcm_sync_sg_for_device,
+ .mapping_error = brcm_mapping_error,
+ .dma_supported = brcm_dma_supported,
+#ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
+ .get_required_mask = brcm_get_required_mask,
+#endif
+};
+
+static void brcm_set_dma_ops(struct device *dev)
+{
+ int ret;
+
+ if (IS_ENABLED(CONFIG_ARM64)) {
+ /*
+ * We are going to invoke get_dma_ops(). That
+ * function, at this point in time, invokes
+ * get_arch_dma_ops(), and for ARM64 that function
+ * returns a pointer to dummy_dma_ops. So then we'd
+ * like to call arch_setup_dma_ops(), but that isn't
+ * exported. Instead, we call of_dma_configure(),
+ * which is exported, and this calls
+ * arch_setup_dma_ops(). Once we do this the call to
+ * get_dma_ops() will work properly because
+ * dev->dma_ops will be set.
+ */
+ ret = of_dma_configure(dev, dev->of_node);
+ if (ret) {
+ dev_err(dev, "of_dma_configure() failed: %d\n", ret);
+ return;
+ }
+ }
+
+ arch_dma_ops = get_dma_ops(dev);
+ if (!arch_dma_ops) {
+ dev_err(dev, "failed to get arch_dma_ops\n");
+ return;
+ }
+
+ set_dma_ops(dev, &brcm_dma_ops);
+}
+
+static int brcmstb_platform_notifier(struct notifier_block *nb,
+ unsigned long event, void *__dev)
+{
+ struct device *dev = __dev;
+
+ brcm_dma_ops_ptr = &brcm_dma_ops;
+ if (event != BUS_NOTIFY_ADD_DEVICE)
+ return NOTIFY_DONE;
+
+ brcm_set_dma_ops(dev);
+ return NOTIFY_OK;
+}
+
+static struct notifier_block brcmstb_platform_nb = {
+ .notifier_call = brcmstb_platform_notifier,
+};
+
+static int brcm_register_notifier(void)
+{
+ return bus_register_notifier(&pci_bus_type, &brcmstb_platform_nb);
+}
+
+static int brcm_unregister_notifier(void)
+{
+ return bus_unregister_notifier(&pci_bus_type, &brcmstb_platform_nb);
+}
+
static u32 rd_fld(void __iomem *p, u32 mask, int shift)
{
return (bcm_readl(p) & mask) >> shift;
@@ -596,9 +893,71 @@ static inline void brcm_pcie_perst_set(struct brcm_pcie *pcie,
WR_FLD_RB(pcie->base, PCIE_MISC_PCIE_CTRL, PCIE_PERSTB, !val);
}

+static int pci_dma_range_parser_init(struct of_pci_range_parser *parser,
+ struct device_node *node)
+{
+ const int na = 3, ns = 2;
+ int rlen;
+
+ parser->node = node;
+ parser->pna = of_n_addr_cells(node);
+ parser->np = parser->pna + na + ns;
+
+ parser->range = of_get_property(node, "dma-ranges", &rlen);
+ if (!parser->range)
+ return -ENOENT;
+
+ parser->end = parser->range + rlen / sizeof(__be32);
+
+ return 0;
+}
+
+static int brcm_pcie_parse_map_dma_ranges(struct brcm_pcie *pcie)
+{
+ int i;
+ struct of_pci_range_parser parser;
+ struct device_node *dn = pcie->dn;
+
+ /*
+ * Parse dma-ranges property if present. If there are multiple
+ * PCIe controllers, we only have to parse from one of them since
+ * the others will have an identical mapping.
+ */
+ if (!pci_dma_range_parser_init(&parser, dn)) {
+ unsigned int max_ranges
+ = (parser.end - parser.range) / parser.np;
+
+ dma_ranges = kcalloc(max_ranges, sizeof(struct of_pci_range),
+ GFP_KERNEL);
+ if (!dma_ranges)
+ return -ENOMEM;
+
+ for (i = 0; of_pci_range_parser_one(&parser, dma_ranges + i);
+ i++)
+ num_dma_ranges++;
+ }
+
+ for (i = 0, num_memc = 0; i < BRCM_MAX_SCB; i++) {
+ u64 size = brcmstb_memory_memc_size(i);
+
+ if (size == (u64)-1) {
+ dev_err(pcie->dev, "cannot get memc%d size\n", i);
+ return -EINVAL;
+ } else if (size) {
+ scb_size[i] = roundup_pow_of_two_64(size);
+ num_memc++;
+ } else {
+ break;
+ }
+ }
+
+ return 0;
+}
+
static int brcm_pcie_add_controller(struct brcm_pcie *pcie)
{
int i, ret = 0;
+ struct device *dev = pcie->dev;

mutex_lock(&brcm_pcie_lock);
if (num_pcie > 0) {
@@ -606,12 +965,21 @@ static int brcm_pcie_add_controller(struct brcm_pcie *pcie)
goto done;
}

+ ret = brcm_register_notifier();
+ if (ret) {
+ dev_err(dev, "failed to register pci bus notifier\n");
+ goto done;
+ }
+ ret = brcm_pcie_parse_map_dma_ranges(pcie);
+ if (ret)
+ goto done;
+
/* Determine num_memc and their sizes */
for (i = 0, num_memc = 0; i < BRCM_MAX_SCB; i++) {
u64 size = brcmstb_memory_memc_size(i);

if (size == (u64)-1) {
- dev_err(pcie->dev, "cannot get memc%d size\n", i);
+ dev_err(dev, "cannot get memc%d size\n", i);
ret = -EINVAL;
goto done;
} else if (size) {
@@ -635,8 +1003,16 @@ static int brcm_pcie_add_controller(struct brcm_pcie *pcie)
static void brcm_pcie_remove_controller(struct brcm_pcie *pcie)
{
mutex_lock(&brcm_pcie_lock);
- if (--num_pcie == 0)
- num_memc = 0;
+ if (--num_pcie > 0)
+ goto out;
+
+ if (brcm_unregister_notifier())
+ dev_err(pcie->dev, "failed to unregister pci bus notifier\n");
+ kfree(dma_ranges);
+ dma_ranges = NULL;
+ num_dma_ranges = 0;
+ num_memc = 0;
+out:
mutex_unlock(&brcm_pcie_lock);
}

@@ -756,6 +1132,38 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
*/
rc_bar2_offset = 0;

+ if (dma_ranges) {
+ /*
+ * The best-case scenario is to place the inbound
+ * region in the first 4GB of pci-space, as some
+ * legacy devices can only address 32bits.
+ * We would also like to put the MSI under 4GB
+ * as well, since some devices require a 32bit
+ * MSI target address.
+ */
+ if (total_mem_size <= 0xc0000000ULL &&
+ rc_bar2_size <= 0x100000000ULL) {
+ rc_bar2_offset = 0;
+ } else {
+ /*
+ * The system memory is 4GB or larger so we
+ * cannot start the inbound region at location
+ * 0 (since we have to allow some space for
+ * outbound memory @ 3GB). So instead we
+ * start it at the 1x multiple of its size
+ */
+ rc_bar2_offset = rc_bar2_size;
+ }
+
+ } else {
+ /*
+ * Set simple configuration based on memory sizes
+ * only. We always start the viewport at address 0,
+ * and set the MSI target address accordingly.
+ */
+ rc_bar2_offset = 0;
+ }
+
tmp = lower_32_bits(rc_bar2_offset);
tmp = INSERT_FIELD(tmp, PCIE_MISC_RC_BAR2_CONFIG_LO, SIZE,
encode_ibar_size(rc_bar2_size));
@@ -966,7 +1374,6 @@ static int brcm_pcie_probe(struct platform_device *pdev)
struct brcm_pcie *pcie;
struct resource *res;
void __iomem *base;
- u32 tmp;
struct pci_host_bridge *bridge;
struct pci_bus *child;

@@ -983,11 +1390,6 @@ static int brcm_pcie_probe(struct platform_device *pdev)
return -EINVAL;
}

- if (of_property_read_u32(dn, "dma-ranges", &tmp) == 0) {
- dev_err(&pdev->dev, "cannot yet handle dma-ranges\n");
- return -EINVAL;
- }
-
data = of_id->data;
pcie->reg_offsets = data->offsets;
pcie->reg_field_info = data->reg_field_info;
--
1.9.0.138.g2de3478

2018-01-15 23:31:18

by Jim Quinlan

Subject: [PATCH v4 2/8] dt-bindings: pci: Add DT docs for Brcmstb PCIe device

Add the DT bindings description for the Brcmstb PCIe device. This
node can be used by almost all Broadcom settop box chips, using
ARM, ARM64, or MIPS CPU architectures.

Signed-off-by: Jim Quinlan <[email protected]>
---
.../devicetree/bindings/pci/brcmstb-pcie.txt | 59 ++++++++++++++++++++++
1 file changed, 59 insertions(+)
create mode 100644 Documentation/devicetree/bindings/pci/brcmstb-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/brcmstb-pcie.txt b/Documentation/devicetree/bindings/pci/brcmstb-pcie.txt
new file mode 100644
index 0000000..a1a9ad5
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/brcmstb-pcie.txt
@@ -0,0 +1,59 @@
+Brcmstb PCIe Host Controller Device Tree Bindings
+
+Required Properties:
+- compatible
+ "brcm,bcm7425-pcie" -- for 7425 family MIPS-based SOCs.
+ "brcm,bcm7435-pcie" -- for 7435 family MIPS-based SOCs.
+ "brcm,bcm7445-pcie" -- for 7445 and later ARM based SOCs (not including
+ the 7278).
+ "brcm,bcm7278-pcie" -- for 7278 family ARM-based SOCs.
+
+- reg -- the register start address and length for the PCIe reg block.
+- interrupts -- two interrupts are specified; the first interrupt is for
+ the PCI host controller and the second is for MSI if the built-in
+ MSI controller is to be used.
+- interrupt-names -- names of the interrupts (above): "pcie" and "msi".
+- #address-cells -- set to <3>.
+- #size-cells -- set to <2>.
+- #interrupt-cells: set to <1>.
+- interrupt-map-mask and interrupt-map, standard PCI properties to define the
+ mapping of the PCIe interface to interrupt numbers.
+- ranges: ranges for the PCI memory and I/O regions.
+- linux,pci-domain -- should be unique per host controller.
+
+Optional Properties:
+- clocks -- phandle of pcie clock.
+- clock-names -- set to "sw_pcie" if clocks is used.
+- dma-ranges -- Specifies the inbound memory mapping regions when
+ an "identity map" is not possible.
+- msi-controller -- this property is typically specified to have the
+ PCIe controller use its internal MSI controller.
+- msi-parent -- set to use an external MSI interrupt controller.
+- brcm,enable-ssc -- (boolean) indicates usage of spread-spectrum clocking.
+- max-link-speed -- (integer) indicates desired generation of link:
+ 1 => 2.5 Gbps (gen1), 2 => 5.0 Gbps (gen2), 3 => 8.0 Gbps (gen3).
+
+Example Node:
+
+pcie0: pcie@f0460000 {
+ reg = <0x0 0xf0460000 0x0 0x9310>;
+ interrupts = <0x0 0x0 0x4>;
+ compatible = "brcm,bcm7445-pcie";
+ #address-cells = <3>;
+ #size-cells = <2>;
+ ranges = <0x02000000 0x00000000 0x00000000 0x00000000 0xc0000000 0x00000000 0x08000000
+ 0x02000000 0x00000000 0x08000000 0x00000000 0xc8000000 0x00000000 0x08000000>;
+ #interrupt-cells = <1>;
+ interrupt-map-mask = <0 0 0 7>;
+ interrupt-map = <0 0 0 1 &intc 0 47 3
+ 0 0 0 2 &intc 0 48 3
+ 0 0 0 3 &intc 0 49 3
+ 0 0 0 4 &intc 0 50 3>;
+ clocks = <&sw_pcie0>;
+ clock-names = "sw_pcie";
+ msi-parent = <&pcie0>; /* use PCIe's internal MSI controller */
+ msi-controller; /* use PCIe's internal MSI controller */
+ brcm,enable-ssc;
+ max-link-speed = <1>;
+ linux,pci-domain = <0>;
+ };
--
1.9.0.138.g2de3478

2018-01-18 02:16:38

by Rob Herring

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Mon, Jan 15, 2018 at 5:28 PM, Jim Quinlan <[email protected]> wrote:
> The Broadcom STB PCIe host controller is intimately related to the
> memory subsystem. This close relationship adds complexity to how cpu
> system memory is mapped to PCIe memory. Ideally, this mapping is an
> identity mapping, or an identity mapping off by a constant. Not so in
> this case.
>
> Consider the Broadcom reference board BCM97445LCC_4X8 which has 6 GB
> of system memory. Here is how the PCIe controller maps the
> system memory to PCIe memory:
>
> memc0-a@[ 0....3fffffff] <=> pci@[ 0....3fffffff]
> memc0-b@[100000000...13fffffff] <=> pci@[ 40000000....7fffffff]
> memc1-a@[ 40000000....7fffffff] <=> pci@[ 80000000....bfffffff]
> memc1-b@[300000000...33fffffff] <=> pci@[ c0000000....ffffffff]
> memc2-a@[ 80000000....bfffffff] <=> pci@[100000000...13fffffff]
> memc2-b@[c00000000...c3fffffff] <=> pci@[140000000...17fffffff]
>
> Although there are some "gaps" that can be added between the
> individual mappings by software, the permutation of memory regions for
> the most part is fixed by HW. The solution of having something close
> to an identity mapping is not possible.
>
> The idea behind this HW design is that the same PCIe module can
> act as an RC or EP, and if it acts as an EP it concatenates all
> of system memory into a BAR so anything can be accessed. Unfortunately,
> when the PCIe block is in the role of an RC it also presents this
> "BAR" to downstream PCIe devices, rather than offering an identity map
> between its system memory and PCIe space.
>
> Suppose that an endpoint driver allocs some DMA memory. Suppose this
> memory is located at 0x6000_0000, which is in the middle of memc1-a.
> The driver wants a dma_addr_t value that it can pass on to the EP to
> use. Without doing any custom mapping, the EP will use this value for
> DMA: the driver will get a dma_addr_t equal to 0x6000_0000. But this
> won't work; the device needs a dma_addr_t that reflects the PCIe space
> address, namely 0xa000_0000.
>
> So, essentially the solution to this problem must modify the
> dma_addr_t returned by the DMA routines. There are two
> ways (I know of) of doing this:
>
> (a) overriding/redefining the dma_to_phys() and phys_to_dma() calls
> that are used by the dma_ops routines. This is the approach of
>
> arch/mips/cavium-octeon/dma-octeon.c

MIPS is rarely an example to follow. :)

> In ARM and ARM64 these two routines are defined in asm/dma-mapping.h
> as static inline functions.
>
> (b) Subscribe to a notifier that notifies when a device is added to a
> bus. When this happens, set_dma_ops() can be called for the device.
> This method is mentioned in:
>
> http://lxr.free-electrons.com/source/drivers/of/platform.c?v=3.16#L152

Why refer to an external website when you can just refer to the source
of the project this patch applies to directly?

> where it says as a comment
>
> "In case if platform code need to use own special DMA
> configuration, it can use Platform bus notifier and
> handle BUS_NOTIFY_ADD_DEVICE event to fix up DMA
> configuration."

In the current tree, this comment is in drivers/of/device.c.

> Solution (b) is what this commit does. It uses its own set of
> dma_ops which are wrappers around the arch_dma_ops. The
> wrappers translate the dma addresses before/after invoking
> the arch_dma_ops, as appropriate.

2018-01-18 07:32:10

by Christoph Hellwig

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Wed, Jan 17, 2018 at 08:15:33PM -0600, Rob Herring wrote:
> > (a) overriding/redefining the dma_to_phys() and phys_to_dma() calls
> > that are used by the dma_ops routines. This is the approach of
> >
> > arch/mips/cavium-octeon/dma-octeon.c
>
> MIPS is rarely an example to follow. :)

But in this case it actually is the example to follow as told previously.

NAK again for these chained dma ops that only create problems.

2018-01-18 15:14:33

by Florian Fainelli

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On January 17, 2018 11:31:23 PM PST, Christoph Hellwig <[email protected]> wrote:
>On Wed, Jan 17, 2018 at 08:15:33PM -0600, Rob Herring wrote:
>> > (a) overriding/redefining the dma_to_phys() and phys_to_dma() calls
>> > that are used by the dma_ops routines. This is the approach of
>> >
>> > arch/mips/cavium-octeon/dma-octeon.c
>>
>> MIPS is rarely an example to follow. :)
>
>But in this case it actually is the example to follow as told
>previously.
>
>NAK again for these chained dma ops that only create problems.

Care to explain what should be done instead?

--
Florian

2018-01-18 15:24:40

by Christoph Hellwig

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Thu, Jan 18, 2018 at 07:09:23AM -0800, Florian Fainelli wrote:
> >But in this case it actually is the example to follow as told
> >previously.
> >
> >NAK again for these chained dma ops that only create problems.
>
> Care to explain what should be done instead?

Override phys_to_dma and dma_to_phys as mips and x86 do for similar
situations.

Bonus points for finding some generic way of doing it instead of
hiding it in arch code.

2018-01-19 19:21:22

by Rob Herring (Arm)

Subject: Re: [PATCH v4 2/8] dt-bindings: pci: Add DT docs for Brcmstb PCIe device

On Mon, Jan 15, 2018 at 06:28:39PM -0500, Jim Quinlan wrote:
> Add the DT bindings description for the Brcmstb PCIe device. This
> node can be used by almost all Broadcom settop box chips, using
> ARM, ARM64, or MIPS CPU architectures.
>
> Signed-off-by: Jim Quinlan <[email protected]>
> ---
> .../devicetree/bindings/pci/brcmstb-pcie.txt | 59 ++++++++++++++++++++++
> 1 file changed, 59 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/pci/brcmstb-pcie.txt

I acked v3. Please add acks when posting new versions.

2018-01-19 19:50:34

by Florian Fainelli

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On 01/18/2018 07:23 AM, Christoph Hellwig wrote:
> On Thu, Jan 18, 2018 at 07:09:23AM -0800, Florian Fainelli wrote:
>>> But in this case it actually is the example to follow as told
>>> previously.
>>>
>>> NAK again for these chained dma ops that only create problems.
>>
>> Care to explain what should be done instead?
>
> Override phys_to_dma and dma_to_phys as mips and x86 do for similar
> situations.

How can this work well in the context of a loadable module for instance?
For MIPS, this would mean that we have to override phys_to_dma() and
dma_to_phys() in the platform that is *susceptible* to use this PCIe
controller (arch/mips/bmips) which is fine, but there, we essentially
need to find a way to make this dynamic based on whether the PCIe
controller is loaded or not.

As you might have seen from this patch, what needs to be done is highly
dependent on the processor architecture and its memory controller
physical memory map, so I don't see how we are in any better situation
if we need to replicate 3 times across MIPS, ARM and ARM64 how the
addresses need to be mangled.

Are you suggesting we somehow decouple the memory mangling part into a
portion that can be built into the kernel image (so phys_to_dma() and
dma_to_phys() is resolved at vmlinux link time) and can be selected by
different architectures that need it? If so, yikes.

>
> Bonus points for finding some generic way of doing it instead of
> hiding it in arch code.
>

I can see value in having a generic mechanism, ala X86_DMA_REMAP
allowing architectures to have the ability to override phys_to_dma() and
dma_to_phys() but right now, especially if we look at
arch/x86/pci/sta2x11-fixup.c this really appears to be quite messy and
as ugly as stacking operations...

What is the actual problem you want to avoid with the stacking of DMA
operations, is it because it becomes harder to audit, or are there
other reasons?
--
Florian

2018-01-23 13:21:34

by Christoph Hellwig

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Fri, Jan 19, 2018 at 11:47:54AM -0800, Florian Fainelli wrote:
> How can this work well in the context of a loadable module for instance?
> For MIPS, this would mean that we have to override phys_to_dma() and
> dma_to_phys() in the platform that is *susceptible* to use this PCIe
> controller (arch/mips/bmips) which is fine, but there, we essentially
> need to find a way to make this dynamic based on whether the PCIe
> controller is loaded or not.
>
> As you might have seen from this patch, what needs to be done is highly
> dependent on the processor architecture and its memory controller
> physical memory map, so I don't see how we are in any better situation
> if we need to replicate 3 times across MIPS, ARM and ARM64 how the
> addresses need to be mangled.
>
> Are you suggesting we somehow decouple the memory mangling part into a
> portion that can be built into the kernel image (so phys_to_dma() and
> dma_to_phys() is resolved at vmlinux link time) and can be selected by
> different architectures that need it? If so, yikes.

On architectures with crazy PCIe controllers (this seems to include
mips, arm, arm64 and x86 thanks to the weird SOCs) we will need
a few different memory maps, yes. Take a look at
arch/x86/pci/sta2x11-fixup.c, preferably from a tree where the worst
issues are fixed:

http://git.infradead.org/users/hch/misc.git/blob/refs/heads/dma-direct-all:/arch/x86/pci/sta2x11-fixup.c

Overriding phys_to_dma and dma_to_phys is required if you need to
support swiotlb, and chances are with a broken PCIe controller on
arm64 or mips64 you eventually will.

This sta2x11 code should probably be lifted to common code in
one form or another eventually, although it will need another
fair round of cleanups for now.

> I can see value in having a generic mechanism, ala X86_DMA_REMAP
> allowing architectures to have the ability to override phys_to_dma() and
> dma_to_phys() but right now, especially if we look at
> arch/x86/pci/sta2x11-fixup.c this really appears to be quite messy and
> as ugly as stacking operations...
>
> What is the actual problem you want to avoid with the stacking of DMA
> operations, is it because it becomes harder to audit, or are there
> other reasons?

Audit, consolidate into a single dma-direct implementation and properly
support swiotlb out of the box.

2018-01-24 20:05:54

by Florian Fainelli

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On 01/23/2018 05:20 AM, Christoph Hellwig wrote:
> On Fri, Jan 19, 2018 at 11:47:54AM -0800, Florian Fainelli wrote:
>> How can this work well in the context of a loadable module for instance?
>> For MIPS, this would mean that we have to override phys_to_dma() and
>> dma_to_phys() in the platform that is *susceptible* to use this PCIe
>> controller (arch/mips/bmips) which is fine, but there, we essentially
>> need to find a way to make this dynamic based on whether the PCIe
>> controller is loaded or not.
>>
>> As you might have seen from this patch, what needs to be done is highly
>> dependent on the processor architecture and its memory controller
>> physical memory map, so I don't see how we are in any better situation
>> if we need to replicate 3 times across MIPS, ARM and ARM64 how the
>> addresses need to be mangled.
>>
>> Are you suggesting we somehow decouple the memory mangling part into a
>> portion that can be built into the kernel image (so phys_to_dma() and
>> dma_to_phys() is resolved at vmlinux link time) and can be selected by
>> different architectures that need it? If so, yikes.
>
> On architectures with crazy PCIe controllers (this seems to include
> mips, arm, arm64 and x86 thanks to the weird SOCs) we will need
> a few different memory maps, yes. Take a look at
> arch/x86/pci/sta2x11-fixup.c, preferably from a tree where the worst
> issues are fixed:
>
> http://git.infradead.org/users/hch/misc.git/blob/refs/heads/dma-direct-all:/arch/x86/pci/sta2x11-fixup.c
>
> Overriding phys_to_dma and dma_to_phys is required if you need to
> support swiotlb, and chances are with a broken PCIe controller on
> arm64 or mips64 you eventually will.
>
> This sta2x11 code should probably be lifted to common code in
> one form or another eventually, although it will need another
> fair round of cleanups for now.

This looks nicer than the current shape, but this still requires to
register a PCI fixup to override phys_to_dma() and dma_to_phys(), and it
would appear that you have dodged my question about how this is supposed
to fit with an entirely modular PCIe root complex driver? Are you
suggesting that we split the module into a built-in part and a modular part?

>
>> I can see value in having a generic mechanism, ala X86_DMA_REMAP
>> allowing architectures to have the ability to override phys_to_dma() and
>> dma_to_phys() but right now, especially if we look at
>> arch/x86/pci/sta2x11-fixup.c this really appears to be quite messy and
>> equally ugly than stacking operations...
>>
>> What is the actual problem you want to avoid with the stacking of DMA
>> operations, is it because it becomes harder to audit, or are there
>> other reasons?
>
> Audit, consolidate into a single dma-direct implementation and properly
> support swiotlb out of the box.
>

OK.
--
Florian

2018-01-26 07:54:40

by Christoph Hellwig

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Wed, Jan 24, 2018 at 12:04:58PM -0800, Florian Fainelli wrote:
> This looks nicer than the current shape, but this still requires to
> register a PCI fixup to override phys_to_dma() and dma_to_phys(), and it
> would appear that you have dodged my question about how this is supposed
> to fit with an entirely modular PCIe root complex driver? Are you
> suggesting that we split the module into a built-in part and a modular part?

I don't think entirely modular PCI root bridges should be a focal point
for the design. If we happen to support them by other design choices:
fine, but they should not be a priority.

That being said if we have core dma mapping or PCIe code that has
a list of offsets and the root complex only populates them it should
work just fine.
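
A minimal userspace sketch of the "list of offsets" idea described above, assuming a small fixed table that built-in code owns and that the root complex driver only fills in at probe time and empties at remove time. Every name here (struct dma_window, dma_window_add(), sketch_phys_to_dma(), and so on) is hypothetical and not an existing kernel interface, and falling back to an identity mapping when no window matches is an assumption of this sketch, not something the thread specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one inbound window, CPU physical range -> PCIe bus range. */
struct dma_window {
	uint64_t cpu_start;	/* first CPU physical address covered */
	uint64_t pci_start;	/* bus address that cpu_start maps to */
	uint64_t size;		/* window length in bytes */
};

#define MAX_DMA_WINDOWS 4

/* Table owned by built-in code; the RC driver only populates it. */
static struct dma_window dma_windows[MAX_DMA_WINDOWS];
static size_t num_dma_windows;

/* Would be called by the RC driver at probe time. Returns 0 on success. */
int dma_window_add(uint64_t cpu_start, uint64_t pci_start, uint64_t size)
{
	if (num_dma_windows >= MAX_DMA_WINDOWS || size == 0)
		return -1;
	dma_windows[num_dma_windows++] =
		(struct dma_window){ cpu_start, pci_start, size };
	return 0;
}

/* Would be called at remove time; an empty table means identity mapping. */
void dma_windows_clear(void)
{
	num_dma_windows = 0;
}

/* Translate a CPU physical address to a bus address (identity if no match). */
uint64_t sketch_phys_to_dma(uint64_t paddr)
{
	for (size_t i = 0; i < num_dma_windows; i++) {
		const struct dma_window *w = &dma_windows[i];

		if (paddr >= w->cpu_start && paddr - w->cpu_start < w->size)
			return paddr - w->cpu_start + w->pci_start;
	}
	return paddr;
}

/* Translate a bus address back to a CPU physical address. */
uint64_t sketch_dma_to_phys(uint64_t dma_addr)
{
	for (size_t i = 0; i < num_dma_windows; i++) {
		const struct dma_window *w = &dma_windows[i];

		if (dma_addr >= w->pci_start && dma_addr - w->pci_start < w->size)
			return dma_addr - w->pci_start + w->cpu_start;
	}
	return dma_addr;
}
```

In this shape the translation helpers are resolved at link time, while the module only calls dma_window_add() and dma_windows_clear(), which is one way the built-in/modular split asked about earlier in the thread could look.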

2018-01-26 17:47:35

by Jim Quinlan

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Fri, Jan 26, 2018 at 2:53 AM, Christoph Hellwig <[email protected]> wrote:
> On Wed, Jan 24, 2018 at 12:04:58PM -0800, Florian Fainelli wrote:
>> This looks nicer than the current shape, but this still requires to
>> register a PCI fixup to override phys_to_dma() and dma_to_phys(), and it
>> would appear that you have dodged my question about how this is supposed
>> to fit with an entirely modular PCIe root complex driver? Are you
>> suggesting that we split the module into a built-in part and a modular part?
>
> I don't think entirely modular PCI root bridges should be a focal point
> for the design. If we happen to support them by other design choices:
> fine, but they should not be a priority.

I disagree. If there is one common thing our customers request it is
the ability to remove (or control the insmod of after boot) the pcie
RC driver. I didn't add this in as a "nice-to-have".

>
> That being said if we have core dma mapping or PCIe code that has
> a list of offsets and the root complex only populates them it should
> work just fine.

I'm looking at arch/arm/include/asm/dma-mapping.h. In addition to
overriding dma_to_phys() and phys_to_dma(), it looks like I may have
to define __arch_pfn_to_dma(), __arch_dma_to_pfn(),
__arch_dma_to_virt(), __arch_virt_to_dma(). Do you agree or is this
not necessary? If it is, this seems more intrusive than our
pcie-brcmstb-dma.c solution which doesn't require tentacles into
major include files and Kconfigs.

Another issue is that our function wrappers -- depending upon whether
we are dealing with a pci device or not -- will have to possibly call
the actual ARM and ARM64 definitions of these functions, which have
been of course #ifdef'd out. This means that our code must contain
identical copies of these functions' code and that the code must
somehow be kept in sync. Do you see a solution to this?

Jim

2018-02-12 14:47:01

by Jim Quinlan

Subject: Re: [PATCH v4 4/8] PCI: brcmstb: Add dma-range mapping for inbound traffic

On Fri, Jan 26, 2018 at 12:46 PM, Jim Quinlan <[email protected]> wrote:
> On Fri, Jan 26, 2018 at 2:53 AM, Christoph Hellwig <[email protected]> wrote:
>> On Wed, Jan 24, 2018 at 12:04:58PM -0800, Florian Fainelli wrote:
>>> This looks nicer than the current shape, but this still requires to
>>> register a PCI fixup to override phys_to_dma() and dma_to_phys(), and it
>>> would appear that you have dodged my question about how this is supposed
>>> to fit with an entirely modular PCIe root complex driver? Are you
>>> suggesting that we split the module into a built-in part and a modular part?
>>
>> I don't think entirely modular PCI root bridges should be a focal point
>> for the design. If we happen to support them by other design choices:
>> fine, but they should not be a priority.
>
> I disagree. If there is one common thing our customers request it is
> the ability to remove (or control the insmod of after boot) the pcie
> RC driver. I didn't add this in as a "nice-to-have".
>
>>
>> That being said if we have core dma mapping or PCIe code that has
>> a list of offsets and the root complex only populates them it should
>> work just fine.
>
> I'm looking at arch/arm/include/asm/dma-mapping.h. In addition to
> overriding dma_to_phys() and phys_to_dma(), it looks like I may have
> to define __arch_pfn_to_dma(), __arch_dma_to_pfn(),
> __arch_dma_to_virt(), __arch_virt_to_dma(). Do you agree or is this
> not necessary? If it is, this seems more intrusive than our
> pcie-brcmstb-dma.c solution which doesn't require tentacles into
> major include files and Kconfigs.
>
> Another issue is that our function wrappers -- depending upon whether
> we are dealing with a pci device or not -- will have to possibly call
> the actual ARM and ARM64 definitions of these functions, which have
> been of course #ifdef'd out. This means that our code must contain
> identical copies of these functions' code and that the code must
> somehow be kept in sync. Do you see a solution to this?
>
> Jim

Christoph,
Could you please respond to my comments? Even a negative response is
better than none. The problem with doing what you suggested is with
ARM -- ARM64 and MIPS are relatively uncomplicated. With ARM, I have to
define the aforementioned functions -- the only way of doing this is
to define arch/arm/mach-bcm/include/mach/memory.h, and for that to be
picked up we no longer can have CONFIG_ARCH_MULTIPLATFORM=y, which is
an unacceptable cost to pay for just an unusual PCIe RC controller.

Regarding my current submission -- you are right, SWIOTLB will not
work for EPs that require it. However, we don't care about these
devices, and can just bail out with EPs when the dma_mask is <=
0xffff_ffff or if swiotlb_force == SWIOTLB_FORCE. Note that this
would only affect PCIe DMA. We also have no plan of using MIPS64.

-- Jim
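
The bail-out condition described above could be sketched roughly as follows. This is a standalone illustration, not driver code: the enum is a local stand-in for the kernel's swiotlb_force setting, and sketch_ep_dma_supported() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's swiotlb_force setting (assumption). */
enum sketch_swiotlb_force {
	SKETCH_SWIOTLB_NORMAL,
	SKETCH_SWIOTLB_FORCE,
	SKETCH_SWIOTLB_NO_FORCE,
};

/*
 * Hypothetical check: refuse PCIe DMA for an endpoint that would need
 * bounce buffering, i.e. when swiotlb is forced on or when the device
 * cannot address more than 32 bits.
 */
bool sketch_ep_dma_supported(uint64_t dma_mask, enum sketch_swiotlb_force force)
{
	if (force == SKETCH_SWIOTLB_FORCE)
		return false;
	if (dma_mask <= 0xffffffffULL)
		return false;
	return true;
}
```

Under these assumptions a 32-bit-only endpoint is rejected up front instead of silently depending on SWIOTLB, and non-PCIe DMA is untouched because the check would only run for devices behind this root complex.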

2018-03-09 14:45:41

by James Hogan

Subject: Re: [PATCH v4 7/8] MIPS: BMIPS: Add PCI bindings for 7425, 7435

On Mon, Jan 15, 2018 at 06:28:44PM -0500, Jim Quinlan wrote:
> diff --git a/arch/mips/boot/dts/brcm/bcm7425.dtsi b/arch/mips/boot/dts/brcm/bcm7425.dtsi
> index e4fb9b6..02168d0 100644
> --- a/arch/mips/boot/dts/brcm/bcm7425.dtsi
> +++ b/arch/mips/boot/dts/brcm/bcm7425.dtsi
> @@ -495,4 +495,30 @@
> status = "disabled";
> };
> };
> +
> + pcie: pcie@10410000 {
> + reg = <0x10410000 0x830c>;
> + compatible = "brcm,bcm7425-pcie";
> + interrupts = <37>, <37>;
> + interrupt-names = "pcie", "msi";
> + interrupt-parent = <&periph_intc>;
> + #address-cells = <3>;
> + #size-cells = <2>;
> + linux,pci-domain = <0>;
> + brcm,enable-ssc;
> + bus-range = <0x00 0xff>;
> + msi-controller;
> + #interrupt-cells = <1>;
> + /* 4x128mb windows */
> + ranges = <0x2000000 0x0 0xd0000000 0xd0000000 0 0x08000000>,
> + <0x2000000 0x0 0xd8000000 0xd8000000 0 0x08000000>,
> + <0x2000000 0x0 0xe0000000 0xe0000000 0 0x08000000>,
> + <0x2000000 0x0 0xe8000000 0xe8000000 0 0x08000000>;
> + interrupt-map-mask = <0 0 0 7>;
> + interrupt-map = <0 0 0 1 &periph_intc 33
> + 0 0 0 2 &periph_intc 34
> + 0 0 0 3 &periph_intc 35
> + 0 0 0 4 &periph_intc 36>;

no status = "disabled" like the other dtsi?

> + };
> +
> };
> diff --git a/arch/mips/boot/dts/brcm/bcm7435.dtsi b/arch/mips/boot/dts/brcm/bcm7435.dtsi
> index 1484e89..84881224 100644
> --- a/arch/mips/boot/dts/brcm/bcm7435.dtsi
> +++ b/arch/mips/boot/dts/brcm/bcm7435.dtsi
> @@ -510,4 +510,31 @@
> status = "disabled";
> };
> };
> +
> + pcie: pcie@10410000 {
> + reg = <0x10410000 0x930c>;
> + interrupts = <0x27>, <0x27>;
> + interrupt-names = "pcie", "msi";
> + interrupt-parent = <&periph_intc>;
> + compatible = "brcm,bcm7435-pcie";

Might be nice to be consistent in your property ordering between these
two dtsi files. I for one would prefer compatible to be near the top
too, if only for consistency with most other nodes in these files.

> + #address-cells = <3>;
> + #size-cells = <2>;
> + linux,pci-domain = <0>;
> + brcm,enable-ssc;
> + bus-range = <0x00 0xff>;
> + msi-controller;
> + #interrupt-cells = <1>;
> + /* 4x128mb windows */
> + ranges = <0x2000000 0x0 0xd0000000 0xd0000000 0 0x08000000>,
> + <0x2000000 0x0 0xd8000000 0xd8000000 0 0x08000000>,
> + <0x2000000 0x0 0xe0000000 0xe0000000 0 0x08000000>,
> + <0x2000000 0x0 0xe8000000 0xe8000000 0 0x08000000>;
> + interrupt-map-mask = <0 0 0 7>;
> + interrupt-map = <0 0 0 1 &periph_intc 35
> + 0 0 0 2 &periph_intc 36
> + 0 0 0 3 &periph_intc 37
> + 0 0 0 4 &periph_intc 38>;
> + status = "disabled";
> + };
> +
> };

Cheers
James



2018-03-09 15:08:48

by James Hogan

Subject: Re: [PATCH v4 1/8] SOC: brcmstb: add memory API

On Mon, Jan 15, 2018 at 06:28:38PM -0500, Jim Quinlan wrote:
> From: Florian Fainelli <[email protected]>
>
> This commit adds a memory API suitable for ascertaining the sizes of
> each of the N memory controllers in a Broadcom STB chip. Its first
> user will be the Broadcom STB PCIe root complex driver, which needs
> to know these sizes to properly set up DMA mappings for inbound
> regions.
>
> We cannot use memblock here or anything like what Linux provides
> because it collapses adjacent regions within a larger block, and here
> we actually need per-memory controller addresses and sizes, which is
> why we resort to manual DT parsing.
>
> Signed-off-by: Jim Quinlan <[email protected]>
>
> Conflicts:
> drivers/soc/bcm/brcmstb/Makefile

That can go.

> +++ b/drivers/soc/bcm/brcmstb/memory.c
...
> +/* Macro to help extract property data */
> +#define DT_PROP_DATA_TO_U32(b, offs) (fdt32_to_cpu(*(u32*)(b + offs)))

Checkpatch complains about missing whitespace after u32.
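
For illustration, a standalone sketch of the macro with the cast written in the "(type *)" form that checkpatch expects and with the macro arguments parenthesized. It is not the patch's code: sketch_fdt32_to_cpu() is a hypothetical byte-wise stand-in for fdt32_to_cpu() so the example compiles outside the kernel, and it takes a byte pointer rather than a u32 value.

```c
#include <stdint.h>

/* Stand-in for fdt32_to_cpu(): decode one big-endian 32-bit FDT cell. */
static inline uint32_t sketch_fdt32_to_cpu(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Macro to help extract property data; note the space in "(const uint8_t *)". */
#define DT_PROP_DATA_TO_U32(b, offs) \
	(sketch_fdt32_to_cpu((const uint8_t *)(b) + (offs)))
```

Decoding byte by byte also sidesteps any alignment assumption about the property buffer, which is a choice of this sketch and not a point raised in the review.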

Cheers
James


Download all attachments