Hi,
This is v4 of my Apple M1 DART IOMMU driver series, a follow-up to the previous
versions [1][2][3].
Short summary: this series adds support for the IOMMU found in Apple's new M1
SoC, which is required to use DMA on most peripherals such as the display
controller, the USB ports or the internal PCIe bus (which is used for WiFi,
Ethernet and more USB ports).
So far this code has been tested by multiple people with dwc3 in host and
device mode (both of which only require device tree changes on top of this
patchset) and with PCIe (using a yet-to-be-finalized patchset).
== Testing this patchset with the USB-C controller ==
The two USB-C ports on the M1 machines are exposed as two separate dwc3
controllers which sit behind a DART. Now that my USB PHY bringup code has been
merged into our bootloader m1n1, you can easily test this patchset yourself:
1) Follow the instructions at [4] to setup our bootloader m1n1 on your M1
machine which will allow you to boot kernels using a normal USB cable.
Note that you'll still need a special setup to expose the UART for very
low-level debugging.
2) Either apply this patchset and add the DART and dwc3 nodes as done in [5]
or alternatively just pull from [6] to get the tree that I've been using.
(That tree contains two commits already in linux-next, this patchset and
a final commit to add the dart+dwc3 nodes)
3) Boot the kernel through our bootloader m1n1. You'll need a version after
commit [7] which enables the USB PHY and the USB PD chip.
Note that the dwc3 controller has a quirk where each root port can currently
only be used once. The most stable way to test is to connect the USB
device(s) before booting the kernel.
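For reference, the DART node added in step 2 follows the binding introduced
later in this series. A minimal sketch of what such a pair of nodes could look
like (the DART node mirrors the binding example below; the dwc3 node, its unit
address, interrupt number and stream id are purely illustrative — see [5] for
the real nodes):

```dts
dart_usb: iommu@82f80000 {
	compatible = "apple,t8103-dart";
	reg = <0x82f80000 0x4000>;
	interrupts = <1 781 4>;
	#iommu-cells = <1>;
};

usb_drd: usb@82280000 {
	/* illustrative address/interrupt; see [5] for the real node */
	compatible = "snps,dwc3";
	reg = <0x82280000 0x100000>;
	interrupts = <1 777 4>;
	dr_mode = "host";
	/* the single iommu-cell selects the stream id within the DART */
	iommus = <&dart_usb 1>;
};
```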
It's also possible to test the PCIe bus but this requires a more complex setup
for now. I can write a quick howto if anyone is interested though.
(tl;dr: Mark Kettenis has a u-boot fork that includes PCIe bringup code and
Marc Zyngier has a WIP patchset to add a PCIe driver)
== Hardwired 16K IOMMU pagesize ==
This IOMMU comes with a hard-wired pagesize of 16K, which makes booting a
kernel with a 4K pagesize challenging: right now such a kernel will hit a
BUG_ON if a translated domain is used.
For dwc3 this is not an issue: as long as the IOMMU is set to bypass mode,
dwc3 works just fine; translated mode simply isn't supported then.
The most controversial part, on which I'd like to get feedback, is the handling
of the PCIe DARTs. These DARTs do not support hardware bypass mode and also
limit the iova space to 32 bit. To still allow booting kernels with a 4K
pagesize I manually program a software bypass mode. With correctly configured
dma-ranges in the PCIe nodes this then also allows a 4K kernel to boot.
In the long term, I'd like to extend the dma-iommu framework itself to
support IOMMU pagesizes with a larger granule than the CPU pagesize, if that is
something you agree with.
This will be important later to support the Thunderbolt DARTs, since I would be
very uncomfortable having those run in (software or hardware) bypass mode.
== Project Blurb ==
Asahi Linux is an open community project dedicated to developing and
maintaining mainline support for Apple Silicon on Linux. Feel free to
drop by #asahi and #asahi-dev on OFTC to chat with us, or check
our website for more information on the project:
https://asahilinux.org/
== Changes ==
Changes for v4:
- Addressed Rob Herring's remark about the incorrect phandles in the device
tree binding example and added his reviewed-by tag
- Take the software linear mapping range from the bus instead of hardcoding
it in the driver
- Use def_domain_type to force bypass mode if there's a pagesize mismatch
between the DART (hardwired to 16KB) and the kernel (may use 4K)
- Added lockdep_assert_held instead of comments as suggested by Rouven Czerwinski
- rebased on 5.13-rc7
Changes for v3:
- fixed name of the iommu node in the device tree binding example
pointed out by Arnd Bergmann
- remove hardware specific checks from io-pgtable.c as pointed out by
Will Deacon
- introduced a fake bypass mode by programming static linear pagetables
if the DART does not support regular bypass mode as proposed by Alex
Graf
- added checks to enforce bypass mode if there is a pagesize mismatch
between the DART HW and the CPU.
- fixed usage of GFP_KERNEL during a held spinlock found by Julia Lawall
- rebased on v5.13-rc3
Changes for v2:
- fixed devicetree binding linting issues pointed out by Rob Herring and
reworked that file.
- made DART-specific code in io-pgtable.c unconditional and removed flag from
Kconfig as proposed by Robin Murphy.
- allowed multiple DART nodes in the "iommus" property as proposed by
Rob Herring and Robin Murphy. This resulted in significant changes
to apple-iommu-dart.c.
- the domain aperture is now forced to 32 bit if translation is enabled, after
the original suggestion to limit the aperture by Mark Kettenis and the
follow-up discussion and investigation with Mark Kettenis, Arnd Bergmann,
Robin Murphy and Rob Herring. This change also simplified the code
in io-pgtable.c and made some of the improvements suggested during review
obsolete.
- added support for bypassed and isolated domain modes.
- reject IOMMU_MMIO and IOMMU_NOEXEC since it's unknown how to set these up
for now or if the hardware even supports these flags.
- renamed some registers to be less confusing (mainly s/DOMAIN/STREAM/ to
prevent confusion with Linux's iommu domain concept).
[1] https://lore.kernel.org/linux-iommu/[email protected]/
[2] https://lore.kernel.org/linux-iommu/[email protected]/
[3] https://lore.kernel.org/linux-iommu/[email protected]/
[4] https://github.com/AsahiLinux/docs/wiki/Developer-Quickstart
[5] https://github.com/AsahiLinux/linux/commit/7d4ebb0b22e9bfec849e2af86ddeb46ec29d7feb
[6] https://github.com/AsahiLinux/linux/tree/dart/dev
[7] https://github.com/AsahiLinux/m1n1/commit/9529ec2b4fd6550f9cfd66d9f2448b90804699a1
Sven Peter (3):
iommu: io-pgtable: add DART pagetable format
dt-bindings: iommu: add DART iommu bindings
iommu: dart: Add DART iommu driver
.../devicetree/bindings/iommu/apple,dart.yaml | 81 ++
MAINTAINERS | 7 +
drivers/iommu/Kconfig | 15 +
drivers/iommu/Makefile | 1 +
drivers/iommu/apple-dart-iommu.c | 1058 +++++++++++++++++
drivers/iommu/io-pgtable-arm.c | 62 +
drivers/iommu/io-pgtable.c | 1 +
include/linux/io-pgtable.h | 7 +
8 files changed, 1232 insertions(+)
create mode 100644 Documentation/devicetree/bindings/iommu/apple,dart.yaml
create mode 100644 drivers/iommu/apple-dart-iommu.c
--
2.25.1
DART (Device Address Resolution Table) is the iommu found on Apple
ARM SoCs such as the M1.
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Sven Peter <[email protected]>
---
.../devicetree/bindings/iommu/apple,dart.yaml | 81 +++++++++++++++++++
MAINTAINERS | 6 ++
2 files changed, 87 insertions(+)
create mode 100644 Documentation/devicetree/bindings/iommu/apple,dart.yaml
diff --git a/Documentation/devicetree/bindings/iommu/apple,dart.yaml b/Documentation/devicetree/bindings/iommu/apple,dart.yaml
new file mode 100644
index 000000000000..94aa9e9afa59
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/apple,dart.yaml
@@ -0,0 +1,81 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/apple,dart.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple DART IOMMU
+
+maintainers:
+ - Sven Peter <[email protected]>
+
+description: |+
+ Apple SoCs may contain an implementation of their Device Address
+ Resolution Table which provides a mandatory layer of address
+ translations for various masters.
+
+ Each DART instance is capable of handling up to 16 different streams
+ with individual pagetables and page-level read/write protection flags.
+
+ This DART IOMMU also raises interrupts in response to various
+ fault conditions.
+
+properties:
+ compatible:
+ const: apple,t8103-dart
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ description:
+ Reference to the gate clock phandle if required for this IOMMU.
+ Optional since not all IOMMUs are attached to a clock gate.
+
+ '#iommu-cells':
+ const: 1
+ description:
+ Has to be one. The single cell describes the stream id emitted by
+ a master to the IOMMU.
+
+required:
+ - compatible
+ - reg
+ - '#iommu-cells'
+ - interrupts
+
+additionalProperties: false
+
+examples:
+ - |+
+ dart1: iommu@82f80000 {
+ compatible = "apple,t8103-dart";
+ reg = <0x82f80000 0x4000>;
+ interrupts = <1 781 4>;
+ #iommu-cells = <1>;
+ };
+
+ master1 {
+ iommus = <&dart1 0>;
+ };
+
+ - |+
+ dart2a: iommu@82f00000 {
+ compatible = "apple,t8103-dart";
+ reg = <0x82f00000 0x4000>;
+ interrupts = <1 781 4>;
+ #iommu-cells = <1>;
+ };
+ dart2b: iommu@82f80000 {
+ compatible = "apple,t8103-dart";
+ reg = <0x82f80000 0x4000>;
+ interrupts = <1 781 4>;
+ #iommu-cells = <1>;
+ };
+
+ master2 {
+ iommus = <&dart2a 0>, <&dart2b 1>;
+ };
diff --git a/MAINTAINERS b/MAINTAINERS
index 8c5ee008301a..29e5541c8f21 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1240,6 +1240,12 @@ L: [email protected]
S: Odd fixes
F: drivers/input/mouse/bcm5974.c
+APPLE DART IOMMU DRIVER
+M: Sven Peter <[email protected]>
+L: [email protected]
+S: Maintained
+F: Documentation/devicetree/bindings/iommu/apple,dart.yaml
+
APPLE SMC DRIVER
M: Henrik Rydberg <[email protected]>
L: [email protected]
--
2.25.1
Apple's new SoCs use IOMMUs for almost all peripherals. These Device
Address Resolution Tables must be set up before the peripherals can
act as DMA masters.
Signed-off-by: Sven Peter <[email protected]>
---
MAINTAINERS | 1 +
drivers/iommu/Kconfig | 15 +
drivers/iommu/Makefile | 1 +
drivers/iommu/apple-dart-iommu.c | 1058 ++++++++++++++++++++++++++++++
4 files changed, 1075 insertions(+)
create mode 100644 drivers/iommu/apple-dart-iommu.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 29e5541c8f21..c1ffaa56b5f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1245,6 +1245,7 @@ M: Sven Peter <[email protected]>
L: [email protected]
S: Maintained
F: Documentation/devicetree/bindings/iommu/apple,dart.yaml
+F: drivers/iommu/apple-dart-iommu.c
APPLE SMC DRIVER
M: Henrik Rydberg <[email protected]>
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 1f111b399bca..87882c628b46 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -249,6 +249,21 @@ config SPAPR_TCE_IOMMU
Enables bits of IOMMU API required by VFIO. The iommu_ops
is not implemented as it is not necessary for VFIO.
+config IOMMU_APPLE_DART
+ tristate "Apple DART IOMMU Support"
+ depends on ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
+ select IOMMU_API
+ select IOMMU_IO_PGTABLE
+ select IOMMU_IO_PGTABLE_LPAE
+ default ARCH_APPLE
+ help
+ Support for Apple DART (Device Address Resolution Table) IOMMUs
+ found in Apple ARM SoCs like the M1.
+ This IOMMU is required for most peripherals using DMA to access
+ the main memory.
+
+ Say Y here if you are using an Apple SoC with a DART IOMMU.
+
# ARM IOMMU support
config ARM_SMMU
tristate "ARM Ltd. System MMU (SMMU) Support"
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index c0fb0ba88143..8c813f0ebc54 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o io-pgfault.o
obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
+obj-$(CONFIG_IOMMU_APPLE_DART) += apple-dart-iommu.o
diff --git a/drivers/iommu/apple-dart-iommu.c b/drivers/iommu/apple-dart-iommu.c
new file mode 100644
index 000000000000..637ba6e7cef9
--- /dev/null
+++ b/drivers/iommu/apple-dart-iommu.c
@@ -0,0 +1,1058 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Apple DART (Device Address Resolution Table) IOMMU driver
+ *
+ * Copyright (C) 2021 The Asahi Linux Contributors
+ *
+ * Based on arm/arm-smmu/arm-smmu.c and arm/arm-smmu-v3/arm-smmu-v3.c
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2015 ARM Limited
+ * and on exynos-iommu.c
+ * Copyright (c) 2011,2016 Samsung Electronics Co., Ltd.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/dma-direct.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io-pgtable.h>
+#include <linux/iopoll.h>
+#include <linux/list.h>
+#include <linux/lockdep.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_iommu.h>
+#include <linux/of_platform.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/ratelimit.h>
+#include <linux/slab.h>
+
+#define DART_MAX_STREAMS 16
+#define DART_MAX_TTBR 4
+
+#define DART_STREAM_ALL 0xffff
+
+#define DART_PARAMS1 0x00
+#define DART_PARAMS_PAGE_SHIFT GENMASK(27, 24)
+
+#define DART_PARAMS2 0x04
+#define DART_PARAMS_BYPASS_SUPPORT BIT(0)
+
+#define DART_STREAM_COMMAND 0x20
+#define DART_STREAM_COMMAND_BUSY BIT(2)
+#define DART_STREAM_COMMAND_INVALIDATE BIT(20)
+
+#define DART_STREAM_SELECT 0x34
+
+#define DART_ERROR 0x40
+#define DART_ERROR_STREAM GENMASK(27, 24)
+#define DART_ERROR_CODE GENMASK(23, 0)
+#define DART_ERROR_FLAG BIT(31)
+#define DART_ERROR_READ_FAULT BIT(4)
+#define DART_ERROR_WRITE_FAULT BIT(3)
+#define DART_ERROR_NO_PTE BIT(2)
+#define DART_ERROR_NO_PMD BIT(1)
+#define DART_ERROR_NO_TTBR BIT(0)
+
+#define DART_CONFIG 0x60
+#define DART_CONFIG_LOCK BIT(15)
+
+#define DART_STREAM_COMMAND_BUSY_TIMEOUT 100
+
+#define DART_STREAM_REMAP 0x80
+
+#define DART_ERROR_ADDR_HI 0x54
+#define DART_ERROR_ADDR_LO 0x50
+
+#define DART_TCR(sid) (0x100 + 4 * (sid))
+#define DART_TCR_TRANSLATE_ENABLE BIT(7)
+#define DART_TCR_BYPASS0_ENABLE BIT(8)
+#define DART_TCR_BYPASS1_ENABLE BIT(12)
+
+#define DART_TTBR(sid, idx) (0x200 + 16 * (sid) + 4 * (idx))
+#define DART_TTBR_VALID BIT(31)
+#define DART_TTBR_SHIFT 12
+
+/*
+ * Private structure associated with each DART device.
+ *
+ * @dev: device struct
+ * @regs: mapped MMIO region
+ * @irq: interrupt number, can be shared with other DARTs
+ * @clks: clocks associated with this DART
+ * @num_clks: number of @clks
+ * @lock: lock for @used_sids and hardware operations involving this dart
+ * @used_sids: bitmap of streams attached to a domain
+ * @pgsize: pagesize supported by this DART
+ * @supports_bypass: indicates if this DART supports bypass mode
+ * @force_bypass: force bypass mode due to pagesize mismatch?
+ * @sw_bypass_cpu_start: offset into cpu address space in software bypass mode
+ * @sw_bypass_dma_start: offset into dma address space in software bypass mode
+ * @sw_bypass_len: length of iova space in software bypass mode
+ * @iommu: iommu core device
+ */
+struct apple_dart {
+ struct device *dev;
+
+ void __iomem *regs;
+
+ int irq;
+ struct clk_bulk_data *clks;
+ int num_clks;
+
+ spinlock_t lock;
+
+ u32 used_sids;
+ u32 pgsize;
+
+ u32 supports_bypass : 1;
+ u32 force_bypass : 1;
+
+ u64 sw_bypass_cpu_start;
+ u64 sw_bypass_dma_start;
+ u64 sw_bypass_len;
+
+ struct iommu_device iommu;
+};
+
+/*
+ * This structure is used to identify a single stream attached to a domain.
+ * It's used as a list inside that domain to be able to attach multiple
+ * streams to a single domain. Since multiple devices can use a single stream
+ * it additionally keeps track of how many devices are represented by this
+ * stream. Once that number reaches zero it is detached from the IOMMU domain
+ * and all translations from this stream are disabled.
+ *
+ * @dart: DART instance to which this stream belongs
+ * @sid: stream id within the DART instance
+ * @num_devices: count of devices attached to this stream
+ * @stream_head: list head for the next stream
+ */
+struct apple_dart_stream {
+ struct apple_dart *dart;
+ u32 sid;
+
+ u32 num_devices;
+
+ struct list_head stream_head;
+};
+
+/*
+ * This structure is attached to each iommu domain handled by a DART.
+ * A single domain is used to represent a single virtual address space.
+ * It is always allocated together with a page table.
+ *
+ * Streams are the smallest units the DART hardware can differentiate.
+ * These are pointed to the page table of a domain whenever a device is
+ * attached to it. A single stream can only be assigned to a single domain.
+ *
+ * Devices are assigned to at least a single and sometimes multiple individual
+ * streams (using the iommus property in the device tree). Multiple devices
+ * can theoretically be represented by the same stream, though this is usually
+ * not the case.
+ *
+ * We only keep track of streams here and just count how many devices are
+ * represented by each stream. When the last device is removed the whole stream
+ * is removed from the domain.
+ *
+ * @dart: pointer to the DART instance
+ * @pgtbl_ops: pagetable ops allocated by io-pgtable
+ * @type: domain type IOMMU_DOMAIN_{IDENTITY,DMA,UNMANAGED,BLOCKED}
+ * @sw_bypass_cpu_start: offset into cpu address space in software bypass mode
+ * @sw_bypass_dma_start: offset into dma address space in software bypass mode
+ * @sw_bypass_len: length of iova space in software bypass mode
+ * @streams: list of streams attached to this domain
+ * @lock: spinlock for operations involving the list of streams
+ * @domain: core iommu domain pointer
+ */
+struct apple_dart_domain {
+ struct apple_dart *dart;
+ struct io_pgtable_ops *pgtbl_ops;
+
+ unsigned int type;
+
+ u64 sw_bypass_cpu_start;
+ u64 sw_bypass_dma_start;
+ u64 sw_bypass_len;
+
+ struct list_head streams;
+
+ spinlock_t lock;
+
+ struct iommu_domain domain;
+};
+
+/*
+ * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
+ * and contains a list of streams bound to this device as defined in the
+ * device tree. Multiple DART instances can be attached to a single device
+ * and each stream is identified by its stream id.
+ * It's usually referenced by a pointer called *cfg.
+ *
+ * A dynamic array instead of a linked list is used here since in almost
+ * all cases a device will just be attached to a single stream and streams
+ * are never removed after they have been added.
+ *
+ * @num_streams: number of streams attached
+ * @streams: array of structs to identify attached streams and the device link
+ * to the iommu
+ */
+struct apple_dart_master_cfg {
+ int num_streams;
+ struct {
+ struct apple_dart *dart;
+ u32 sid;
+
+ struct device_link *link;
+ } streams[];
+};
+
+static struct platform_driver apple_dart_driver;
+static const struct iommu_ops apple_dart_iommu_ops;
+static const struct iommu_flush_ops apple_dart_tlb_ops;
+
+static struct apple_dart_domain *to_dart_domain(struct iommu_domain *dom)
+{
+ return container_of(dom, struct apple_dart_domain, domain);
+}
+
+static void apple_dart_hw_enable_translation(struct apple_dart *dart, u16 sid)
+{
+ writel(DART_TCR_TRANSLATE_ENABLE, dart->regs + DART_TCR(sid));
+}
+
+static void apple_dart_hw_disable_dma(struct apple_dart *dart, u16 sid)
+{
+ writel(0, dart->regs + DART_TCR(sid));
+}
+
+static void apple_dart_hw_enable_bypass(struct apple_dart *dart, u16 sid)
+{
+ WARN_ON(!dart->supports_bypass);
+ writel(DART_TCR_BYPASS0_ENABLE | DART_TCR_BYPASS1_ENABLE,
+ dart->regs + DART_TCR(sid));
+}
+
+static void apple_dart_hw_set_ttbr(struct apple_dart *dart, u16 sid, u16 idx,
+ phys_addr_t paddr)
+{
+ writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
+ dart->regs + DART_TTBR(sid, idx));
+}
+
+static void apple_dart_hw_clear_ttbr(struct apple_dart *dart, u16 sid, u16 idx)
+{
+ writel(0, dart->regs + DART_TTBR(sid, idx));
+}
+
+static void apple_dart_hw_clear_all_ttbrs(struct apple_dart *dart, u16 sid)
+{
+ int i;
+
+ for (i = 0; i < DART_MAX_TTBR; ++i)
+ apple_dart_hw_clear_ttbr(dart, sid, i);
+}
+
+static int apple_dart_hw_stream_command(struct apple_dart *dart, u16 sid_bitmap,
+ u32 command)
+{
+ unsigned long flags;
+ int ret;
+ u32 command_reg;
+
+ spin_lock_irqsave(&dart->lock, flags);
+
+ writel(sid_bitmap, dart->regs + DART_STREAM_SELECT);
+ writel(command, dart->regs + DART_STREAM_COMMAND);
+
+ ret = readl_poll_timeout_atomic(
+ dart->regs + DART_STREAM_COMMAND, command_reg,
+ !(command_reg & DART_STREAM_COMMAND_BUSY), 1,
+ DART_STREAM_COMMAND_BUSY_TIMEOUT);
+
+ spin_unlock_irqrestore(&dart->lock, flags);
+
+ if (ret) {
+ dev_err(dart->dev,
+ "busy bit did not clear after command %x for streams %x\n",
+ command, sid_bitmap);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int apple_dart_hw_invalidate_tlb_global(struct apple_dart *dart)
+{
+ return apple_dart_hw_stream_command(dart, DART_STREAM_ALL,
+ DART_STREAM_COMMAND_INVALIDATE);
+}
+
+static int apple_dart_hw_invalidate_tlb_stream(struct apple_dart *dart, u16 sid)
+{
+ return apple_dart_hw_stream_command(dart, 1 << sid,
+ DART_STREAM_COMMAND_INVALIDATE);
+}
+
+static int apple_dart_hw_reset(struct apple_dart *dart)
+{
+ int sid;
+ u32 config;
+
+ config = readl(dart->regs + DART_CONFIG);
+ if (config & DART_CONFIG_LOCK) {
+ dev_err(dart->dev, "DART is locked down until reboot: %08x\n",
+ config);
+ return -EINVAL;
+ }
+
+ for (sid = 0; sid < DART_MAX_STREAMS; ++sid) {
+ apple_dart_hw_disable_dma(dart, sid);
+ apple_dart_hw_clear_all_ttbrs(dart, sid);
+ }
+
+ /* restore stream identity map */
+ writel(0x03020100, dart->regs + DART_STREAM_REMAP);
+ writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
+ writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
+ writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
+
+ /* clear any pending errors before the interrupt is unmasked */
+ writel(readl(dart->regs + DART_ERROR), dart->regs + DART_ERROR);
+
+ return apple_dart_hw_invalidate_tlb_global(dart);
+}
+
+static void apple_dart_domain_flush_tlb(struct apple_dart_domain *domain)
+{
+ unsigned long flags;
+ struct apple_dart_stream *stream;
+ struct apple_dart *dart = domain->dart;
+
+ if (!dart)
+ return;
+
+ spin_lock_irqsave(&domain->lock, flags);
+ list_for_each_entry(stream, &domain->streams, stream_head) {
+ apple_dart_hw_invalidate_tlb_stream(stream->dart, stream->sid);
+ }
+ spin_unlock_irqrestore(&domain->lock, flags);
+}
+
+static void apple_dart_flush_iotlb_all(struct iommu_domain *domain)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+ apple_dart_domain_flush_tlb(dart_domain);
+}
+
+static void apple_dart_iotlb_sync(struct iommu_domain *domain,
+ struct iommu_iotlb_gather *gather)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+ apple_dart_domain_flush_tlb(dart_domain);
+}
+
+static void apple_dart_iotlb_sync_map(struct iommu_domain *domain,
+ unsigned long iova, size_t size)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+ apple_dart_domain_flush_tlb(dart_domain);
+}
+
+static void apple_dart_tlb_flush_all(void *cookie)
+{
+ struct apple_dart_domain *domain = cookie;
+
+ apple_dart_domain_flush_tlb(domain);
+}
+
+static void apple_dart_tlb_flush_walk(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+ struct apple_dart_domain *domain = cookie;
+
+ apple_dart_domain_flush_tlb(domain);
+}
+
+static const struct iommu_flush_ops apple_dart_tlb_ops = {
+ .tlb_flush_all = apple_dart_tlb_flush_all,
+ .tlb_flush_walk = apple_dart_tlb_flush_walk,
+ .tlb_add_page = NULL,
+};
+
+static phys_addr_t apple_dart_iova_to_phys(struct iommu_domain *domain,
+ dma_addr_t iova)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+ struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
+ if (domain->type == IOMMU_DOMAIN_IDENTITY &&
+ dart_domain->dart->supports_bypass)
+ return iova;
+ if (!ops)
+ return -ENODEV;
+
+ return ops->iova_to_phys(ops, iova);
+}
+
+static int apple_dart_map(struct iommu_domain *domain, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+ struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
+ if (!ops)
+ return -ENODEV;
+ if (prot & IOMMU_MMIO)
+ return -EINVAL;
+ if (prot & IOMMU_NOEXEC)
+ return -EINVAL;
+
+ return ops->map(ops, iova, paddr, size, prot, gfp);
+}
+
+static size_t apple_dart_unmap(struct iommu_domain *domain, unsigned long iova,
+ size_t size, struct iommu_iotlb_gather *gather)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+ struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
+ if (!ops)
+ return 0;
+
+ return ops->unmap(ops, iova, size, gather);
+}
+
+static int apple_dart_prepare_sw_bypass(struct apple_dart *dart,
+ struct apple_dart_domain *dart_domain,
+ struct device *dev)
+{
+ lockdep_assert_held(&dart_domain->lock);
+
+ if (dart->supports_bypass)
+ return 0;
+ if (dart_domain->type != IOMMU_DOMAIN_IDENTITY)
+ return 0;
+
+ // use the bus region from the first attached dev for the bypass range
+ if (!dart->sw_bypass_len) {
+ const struct bus_dma_region *dma_rgn = dev->dma_range_map;
+
+ if (!dma_rgn)
+ return -EINVAL;
+
+ dart->sw_bypass_len = dma_rgn->size;
+ dart->sw_bypass_cpu_start = dma_rgn->cpu_start;
+ dart->sw_bypass_dma_start = dma_rgn->dma_start;
+ }
+
+ // ensure that we don't mix different bypass setups
+ if (dart_domain->sw_bypass_len) {
+ if (dart->sw_bypass_len != dart_domain->sw_bypass_len)
+ return -EINVAL;
+ if (dart->sw_bypass_cpu_start !=
+ dart_domain->sw_bypass_cpu_start)
+ return -EINVAL;
+ if (dart->sw_bypass_dma_start !=
+ dart_domain->sw_bypass_dma_start)
+ return -EINVAL;
+ } else {
+ dart_domain->sw_bypass_len = dart->sw_bypass_len;
+ dart_domain->sw_bypass_cpu_start = dart->sw_bypass_cpu_start;
+ dart_domain->sw_bypass_dma_start = dart->sw_bypass_dma_start;
+ }
+
+ return 0;
+}
+
+static int apple_dart_domain_needs_pgtbl_ops(struct apple_dart *dart,
+ struct iommu_domain *domain)
+{
+ if (domain->type == IOMMU_DOMAIN_DMA)
+ return 1;
+ if (domain->type == IOMMU_DOMAIN_UNMANAGED)
+ return 1;
+ if (!dart->supports_bypass && domain->type == IOMMU_DOMAIN_IDENTITY)
+ return 1;
+ return 0;
+}
+
+static int apple_dart_finalize_domain(struct iommu_domain *domain)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+ struct apple_dart *dart = dart_domain->dart;
+ struct io_pgtable_cfg pgtbl_cfg;
+
+ lockdep_assert_held(&dart_domain->lock);
+
+ if (dart_domain->pgtbl_ops)
+ return 0;
+ if (!apple_dart_domain_needs_pgtbl_ops(dart, domain))
+ return 0;
+
+ pgtbl_cfg = (struct io_pgtable_cfg){
+ .pgsize_bitmap = dart->pgsize,
+ .ias = 32,
+ .oas = 36,
+ .coherent_walk = 1,
+ .tlb = &apple_dart_tlb_ops,
+ .iommu_dev = dart->dev,
+ };
+
+ dart_domain->pgtbl_ops =
+ alloc_io_pgtable_ops(ARM_APPLE_DART, &pgtbl_cfg, domain);
+ if (!dart_domain->pgtbl_ops)
+ return -ENOMEM;
+
+ domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+ domain->geometry.aperture_start = 0;
+ domain->geometry.aperture_end = DMA_BIT_MASK(32);
+ domain->geometry.force_aperture = true;
+
+ /*
+ * Some DARTs come without hardware bypass support but we may still
+ * be forced to use bypass mode (to e.g. allow kernels with 4K pages to
+ * boot). If we reach this point with an identity domain we have to set up
+ * bypass mode in software. This is done by creating a static pagetable
+ * for a linear map specified by dma-ranges in the device tree.
+ */
+ if (domain->type == IOMMU_DOMAIN_IDENTITY) {
+ u64 offset;
+ int ret;
+
+ for (offset = 0; offset < dart_domain->sw_bypass_len;
+ offset += dart->pgsize) {
+ ret = dart_domain->pgtbl_ops->map(
+ dart_domain->pgtbl_ops,
+ dart_domain->sw_bypass_dma_start + offset,
+ dart_domain->sw_bypass_cpu_start + offset,
+ dart->pgsize, IOMMU_READ | IOMMU_WRITE,
+ GFP_ATOMIC);
+ if (ret < 0) {
+ free_io_pgtable_ops(dart_domain->pgtbl_ops);
+ dart_domain->pgtbl_ops = NULL;
+ return -EINVAL;
+ }
+ }
+ }
+
+ return 0;
+}
+
+static void
+apple_dart_stream_setup_translation(struct apple_dart_domain *domain,
+ struct apple_dart *dart, u32 sid)
+{
+ int i;
+ struct io_pgtable_cfg *pgtbl_cfg =
+ &io_pgtable_ops_to_pgtable(domain->pgtbl_ops)->cfg;
+
+ for (i = 0; i < pgtbl_cfg->apple_dart_cfg.n_ttbrs; ++i)
+ apple_dart_hw_set_ttbr(dart, sid, i,
+ pgtbl_cfg->apple_dart_cfg.ttbr[i]);
+ for (; i < DART_MAX_TTBR; ++i)
+ apple_dart_hw_clear_ttbr(dart, sid, i);
+
+ apple_dart_hw_enable_translation(dart, sid);
+ apple_dart_hw_invalidate_tlb_stream(dart, sid);
+}
+
+static int apple_dart_attach_stream(struct apple_dart_domain *domain,
+ struct apple_dart *dart, u32 sid)
+{
+ unsigned long flags;
+ struct apple_dart_stream *stream;
+ int ret;
+
+ lockdep_assert_held(&domain->lock);
+
+ if (WARN_ON(dart->force_bypass &&
+ domain->type != IOMMU_DOMAIN_IDENTITY))
+ return -EINVAL;
+
+ /*
+ * we can't mix and match DARTs that support bypass mode with those that don't
+ * because the iova space in fake bypass mode generally has an offset
+ */
+ if (WARN_ON(domain->type == IOMMU_DOMAIN_IDENTITY &&
+ (domain->dart->supports_bypass != dart->supports_bypass)))
+ return -EINVAL;
+
+ list_for_each_entry(stream, &domain->streams, stream_head) {
+ if (stream->dart == dart && stream->sid == sid) {
+ stream->num_devices++;
+ return 0;
+ }
+ }
+
+ spin_lock_irqsave(&dart->lock, flags);
+
+ if (WARN_ON(dart->used_sids & BIT(sid))) {
+ ret = -EINVAL;
+ goto error;
+ }
+
+ stream = kzalloc(sizeof(*stream), GFP_ATOMIC);
+ if (!stream) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ stream->dart = dart;
+ stream->sid = sid;
+ stream->num_devices = 1;
+ list_add(&stream->stream_head, &domain->streams);
+
+ dart->used_sids |= BIT(sid);
+ spin_unlock_irqrestore(&dart->lock, flags);
+
+ apple_dart_hw_clear_all_ttbrs(stream->dart, stream->sid);
+
+ switch (domain->type) {
+ case IOMMU_DOMAIN_IDENTITY:
+ if (stream->dart->supports_bypass)
+ apple_dart_hw_enable_bypass(stream->dart, stream->sid);
+ else
+ apple_dart_stream_setup_translation(
+ domain, stream->dart, stream->sid);
+ break;
+ case IOMMU_DOMAIN_BLOCKED:
+ apple_dart_hw_disable_dma(stream->dart, stream->sid);
+ break;
+ case IOMMU_DOMAIN_UNMANAGED:
+ case IOMMU_DOMAIN_DMA:
+ apple_dart_stream_setup_translation(domain, stream->dart,
+ stream->sid);
+ break;
+ }
+
+ return 0;
+
+error:
+ spin_unlock_irqrestore(&dart->lock, flags);
+ return ret;
+}
+
+static void apple_dart_disable_stream(struct apple_dart *dart, u32 sid)
+{
+ unsigned long flags;
+
+ apple_dart_hw_disable_dma(dart, sid);
+ apple_dart_hw_clear_all_ttbrs(dart, sid);
+ apple_dart_hw_invalidate_tlb_stream(dart, sid);
+
+ spin_lock_irqsave(&dart->lock, flags);
+ dart->used_sids &= ~BIT(sid);
+ spin_unlock_irqrestore(&dart->lock, flags);
+}
+
+static void apple_dart_detach_stream(struct apple_dart_domain *domain,
+ struct apple_dart *dart, u32 sid)
+{
+ struct apple_dart_stream *stream;
+
+ lockdep_assert_held(&domain->lock);
+
+ list_for_each_entry(stream, &domain->streams, stream_head) {
+ if (stream->dart == dart && stream->sid == sid) {
+ stream->num_devices--;
+
+ if (stream->num_devices == 0) {
+ apple_dart_disable_stream(dart, sid);
+ list_del(&stream->stream_head);
+ kfree(stream);
+ }
+ return;
+ }
+ }
+}
+
+static int apple_dart_attach_dev(struct iommu_domain *domain,
+ struct device *dev)
+{
+ int ret;
+ int i, j;
+ unsigned long flags;
+ struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+ struct apple_dart *dart = cfg->streams[0].dart;
+
+ if (WARN_ON(dart->force_bypass &&
+ dart_domain->type != IOMMU_DOMAIN_IDENTITY)) {
+ dev_warn(
+ dev,
+ "IOMMU must be in bypass mode but trying to attach to translated domain.\n");
+ return -EINVAL;
+ }
+
+ spin_lock_irqsave(&dart_domain->lock, flags);
+
+ ret = apple_dart_prepare_sw_bypass(dart, dart_domain, dev);
+ if (ret)
+ goto out;
+
+ if (!dart_domain->dart)
+ dart_domain->dart = dart;
+
+ ret = apple_dart_finalize_domain(domain);
+ if (ret)
+ goto out;
+
+ for (i = 0; i < cfg->num_streams; ++i) {
+ ret = apple_dart_attach_stream(
+ dart_domain, cfg->streams[i].dart, cfg->streams[i].sid);
+ if (ret) {
+ /* try to undo what we did before returning */
+ for (j = 0; j < i; ++j)
+ apple_dart_detach_stream(dart_domain,
+ cfg->streams[j].dart,
+ cfg->streams[j].sid);
+
+ goto out;
+ }
+ }
+
+ ret = 0;
+
+out:
+ spin_unlock_irqrestore(&dart_domain->lock, flags);
+ return ret;
+}
+
+static void apple_dart_detach_dev(struct iommu_domain *domain,
+ struct device *dev)
+{
+ int i;
+ unsigned long flags;
+ struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+ spin_lock_irqsave(&dart_domain->lock, flags);
+
+ for (i = 0; i < cfg->num_streams; ++i)
+ apple_dart_detach_stream(dart_domain, cfg->streams[i].dart,
+ cfg->streams[i].sid);
+
+ spin_unlock_irqrestore(&dart_domain->lock, flags);
+}
+
+static struct iommu_device *apple_dart_probe_device(struct device *dev)
+{
+ struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ int i;
+
+ if (!cfg)
+ return ERR_PTR(-ENODEV);
+
+ for (i = 0; i < cfg->num_streams; ++i) {
+ cfg->streams[i].link =
+ device_link_add(dev, cfg->streams[i].dart->dev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+ }
+
+ return &cfg->streams[0].dart->iommu;
+}
+
+static void apple_dart_release_device(struct device *dev)
+{
+ struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ int i;
+
+ if (!cfg)
+ return;
+
+ for (i = 0; i < cfg->num_streams; ++i)
+ device_link_del(cfg->streams[i].link);
+
+ dev_iommu_priv_set(dev, NULL);
+ kfree(cfg);
+}
+
+static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
+{
+ struct apple_dart_domain *dart_domain;
+
+ if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED &&
+ type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED)
+ return NULL;
+
+ dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL);
+ if (!dart_domain)
+ return NULL;
+
+ INIT_LIST_HEAD(&dart_domain->streams);
+ spin_lock_init(&dart_domain->lock);
+ iommu_get_dma_cookie(&dart_domain->domain);
+ dart_domain->type = type;
+
+ return &dart_domain->domain;
+}
+
+static void apple_dart_domain_free(struct iommu_domain *domain)
+{
+ struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+ WARN_ON(!list_empty(&dart_domain->streams));
+
+ kfree(dart_domain);
+}
+
+static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+ struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
+ struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ unsigned int num_streams = cfg ? cfg->num_streams : 0;
+ struct apple_dart_master_cfg *cfg_new;
+ struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
+
+ if (args->args_count != 1)
+ return -EINVAL;
+
+ cfg_new = krealloc(cfg, struct_size(cfg, streams, num_streams + 1),
+ GFP_KERNEL);
+ if (!cfg_new)
+ return -ENOMEM;
+
+ cfg = cfg_new;
+ dev_iommu_priv_set(dev, cfg);
+
+ cfg->num_streams = num_streams;
+ cfg->streams[cfg->num_streams].dart = dart;
+ cfg->streams[cfg->num_streams].sid = args->args[0];
+ cfg->num_streams++;
+
+ return 0;
+}
+
+static struct iommu_group *apple_dart_device_group(struct device *dev)
+{
+#ifdef CONFIG_PCI
+ struct iommu_group *group;
+
+ if (dev_is_pci(dev))
+ group = pci_device_group(dev);
+ else
+ group = generic_device_group(dev);
+
+ return group;
+#else
+ return generic_device_group(dev);
+#endif
+}
+
+static int apple_dart_def_domain_type(struct device *dev)
+{
+ struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+ struct apple_dart *dart = cfg->streams[0].dart;
+
+ if (dart->force_bypass)
+ return IOMMU_DOMAIN_IDENTITY;
+ if (!dart->supports_bypass)
+ return IOMMU_DOMAIN_DMA;
+
+ return 0;
+}
+
+static const struct iommu_ops apple_dart_iommu_ops = {
+ .domain_alloc = apple_dart_domain_alloc,
+ .domain_free = apple_dart_domain_free,
+ .attach_dev = apple_dart_attach_dev,
+ .detach_dev = apple_dart_detach_dev,
+ .map = apple_dart_map,
+ .unmap = apple_dart_unmap,
+ .flush_iotlb_all = apple_dart_flush_iotlb_all,
+ .iotlb_sync = apple_dart_iotlb_sync,
+ .iotlb_sync_map = apple_dart_iotlb_sync_map,
+ .iova_to_phys = apple_dart_iova_to_phys,
+ .probe_device = apple_dart_probe_device,
+ .release_device = apple_dart_release_device,
+ .device_group = apple_dart_device_group,
+ .of_xlate = apple_dart_of_xlate,
+ .def_domain_type = apple_dart_def_domain_type,
+ .pgsize_bitmap = -1UL, /* Restricted during dart probe */
+};
+
+static irqreturn_t apple_dart_irq(int irq, void *dev)
+{
+ struct apple_dart *dart = dev;
+ static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
+ DEFAULT_RATELIMIT_BURST);
+ const char *fault_name = NULL;
+ u32 error = readl(dart->regs + DART_ERROR);
+ u32 error_code = FIELD_GET(DART_ERROR_CODE, error);
+ u32 addr_lo = readl(dart->regs + DART_ERROR_ADDR_LO);
+ u32 addr_hi = readl(dart->regs + DART_ERROR_ADDR_HI);
+ u64 addr = addr_lo | (((u64)addr_hi) << 32);
+ u8 stream_idx = FIELD_GET(DART_ERROR_STREAM, error);
+
+ if (!(error & DART_ERROR_FLAG))
+ return IRQ_NONE;
+
+ if (error_code & DART_ERROR_READ_FAULT)
+ fault_name = "READ FAULT";
+ else if (error_code & DART_ERROR_WRITE_FAULT)
+ fault_name = "WRITE FAULT";
+ else if (error_code & DART_ERROR_NO_PTE)
+ fault_name = "NO PTE FOR IOVA";
+ else if (error_code & DART_ERROR_NO_PMD)
+ fault_name = "NO PMD FOR IOVA";
+ else if (error_code & DART_ERROR_NO_TTBR)
+ fault_name = "NO TTBR FOR IOVA";
+
+ if (WARN_ON(fault_name == NULL))
+ fault_name = "unknown";
+
+ if (__ratelimit(&rs)) {
+ dev_err(dart->dev,
+ "translation fault: status:0x%x stream:%d code:0x%x (%s) at 0x%llx",
+ error, stream_idx, error_code, fault_name, addr);
+ }
+
+ writel(error, dart->regs + DART_ERROR);
+ return IRQ_HANDLED;
+}
+
+static int apple_dart_probe(struct platform_device *pdev)
+{
+ int ret;
+ u32 dart_params[2];
+ struct resource *res;
+ struct apple_dart *dart;
+ struct device *dev = &pdev->dev;
+
+ dart = devm_kzalloc(dev, sizeof(*dart), GFP_KERNEL);
+ if (!dart)
+ return -ENOMEM;
+
+ dart->dev = dev;
+ spin_lock_init(&dart->lock);
+
+ if (pdev->num_resources < 1)
+ return -ENODEV;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res)
+ return -ENODEV;
+ if (resource_size(res) < 0x4000) {
+ dev_err(dev, "MMIO region too small (%pr)\n", res);
+ return -EINVAL;
+ }
+
+ dart->regs = devm_ioremap_resource(dev, res);
+ if (IS_ERR(dart->regs))
+ return PTR_ERR(dart->regs);
+
+ ret = devm_clk_bulk_get_all(dev, &dart->clks);
+ if (ret < 0)
+ return ret;
+ dart->num_clks = ret;
+
+ ret = clk_bulk_prepare_enable(dart->num_clks, dart->clks);
+ if (ret)
+ return ret;
+
+ ret = apple_dart_hw_reset(dart);
+ if (ret)
+ goto err_clk_disable;
+
+ dart_params[0] = readl(dart->regs + DART_PARAMS1);
+ dart_params[1] = readl(dart->regs + DART_PARAMS2);
+ dart->pgsize = 1 << FIELD_GET(DART_PARAMS_PAGE_SHIFT, dart_params[0]);
+ dart->supports_bypass = dart_params[1] & DART_PARAMS_BYPASS_SUPPORT;
+ dart->force_bypass = dart->pgsize > PAGE_SIZE;
+
+ dart->irq = platform_get_irq(pdev, 0);
+ if (dart->irq < 0) {
+ ret = dart->irq;
+ goto err_clk_disable;
+ }
+
+ ret = devm_request_irq(dart->dev, dart->irq, apple_dart_irq,
+ IRQF_SHARED, "apple-dart fault handler", dart);
+ if (ret)
+ goto err_clk_disable;
+
+ platform_set_drvdata(pdev, dart);
+
+ ret = iommu_device_sysfs_add(&dart->iommu, dev, NULL, "apple-dart.%s",
+ dev_name(&pdev->dev));
+ if (ret)
+ goto err_clk_disable;
+
+ ret = iommu_device_register(&dart->iommu, &apple_dart_iommu_ops, dev);
+ if (ret)
+ goto err_clk_disable;
+
+ if (dev->bus->iommu_ops != &apple_dart_iommu_ops) {
+ ret = bus_set_iommu(dev->bus, &apple_dart_iommu_ops);
+ if (ret)
+ goto err_clk_disable;
+ }
+#ifdef CONFIG_PCI
+ if (dev->bus->iommu_ops != pci_bus_type.iommu_ops) {
+ ret = bus_set_iommu(&pci_bus_type, &apple_dart_iommu_ops);
+ if (ret)
+ goto err_clk_disable;
+ }
+#endif
+
+ dev_info(
+ &pdev->dev,
+ "DART [pagesize %x, bypass support: %d, bypass forced: %d] initialized\n",
+ dart->pgsize, dart->supports_bypass, dart->force_bypass);
+ return 0;
+
+err_clk_disable:
+ clk_bulk_disable(dart->num_clks, dart->clks);
+ clk_bulk_unprepare(dart->num_clks, dart->clks);
+
+ return ret;
+}
+
+static int apple_dart_remove(struct platform_device *pdev)
+{
+ struct apple_dart *dart = platform_get_drvdata(pdev);
+
+ devm_free_irq(dart->dev, dart->irq, dart);
+
+ iommu_device_unregister(&dart->iommu);
+ iommu_device_sysfs_remove(&dart->iommu);
+
+ clk_bulk_disable(dart->num_clks, dart->clks);
+ clk_bulk_unprepare(dart->num_clks, dart->clks);
+
+ return 0;
+}
+
+static void apple_dart_shutdown(struct platform_device *pdev)
+{
+ apple_dart_remove(pdev);
+}
+
+static const struct of_device_id apple_dart_of_match[] = {
+ { .compatible = "apple,t8103-dart", .data = NULL },
+ {},
+};
+MODULE_DEVICE_TABLE(of, apple_dart_of_match);
+
+static struct platform_driver apple_dart_driver = {
+ .driver = {
+ .name = "apple-dart",
+ .of_match_table = apple_dart_of_match,
+ },
+ .probe = apple_dart_probe,
+ .remove = apple_dart_remove,
+ .shutdown = apple_dart_shutdown,
+};
+module_platform_driver(apple_dart_driver);
+
+MODULE_DESCRIPTION("IOMMU API for Apple's DART");
+MODULE_AUTHOR("Sven Peter <[email protected]>");
+MODULE_LICENSE("GPL v2");
--
2.25.1
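The attach/detach path above keeps a per-stream device count instead of a generic refcount, and tears the hardware stream down only when the last device detaches (all under the domain lock). That accounting can be modelled in isolation with a small standalone sketch; the `model_*` names and the `hw_disabled` array standing in for `apple_dart_disable_stream()` are invented for illustration, not driver code.

```c
/*
 * Simplified model of the stream accounting in apple_dart_attach_stream /
 * apple_dart_detach_stream: streams live in a list, a per-stream device
 * count decides when the (simulated) hardware stream is disabled.
 */
#include <stdlib.h>

struct model_stream {
	unsigned sid;
	unsigned num_devices;
	struct model_stream *next;
};

static struct model_stream *streams;
static int hw_disabled[16]; /* stands in for apple_dart_disable_stream() */

static void model_attach(unsigned sid)
{
	struct model_stream *s;

	for (s = streams; s; s = s->next) {
		if (s->sid == sid) {
			s->num_devices++;
			return;
		}
	}

	/* first device on this stream: create the list entry */
	s = calloc(1, sizeof(*s));
	s->sid = sid;
	s->num_devices = 1;
	s->next = streams;
	streams = s;
}

static void model_detach(unsigned sid)
{
	struct model_stream **p, *s;

	for (p = &streams; (s = *p); p = &s->next) {
		if (s->sid != sid)
			continue;
		if (--s->num_devices == 0) {
			/* last device gone: disable and drop the stream */
			hw_disabled[sid] = 1;
			*p = s->next;
			free(s);
		}
		return;
	}
}
```

In the driver the same invariant holds because both paths run with `domain->lock` held (hence the `lockdep_assert_held()` in `apple_dart_detach_stream()`); this sketch omits the locking.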
Apple's DART iommu uses a pagetable format that shares some
similarities with the ones already implemented by io-pgtable.c.
Add a new format variant to support the required differences
so that we don't have to duplicate the pagetable handling code.
Signed-off-by: Sven Peter <[email protected]>
---
drivers/iommu/io-pgtable-arm.c | 62 ++++++++++++++++++++++++++++++++++
drivers/iommu/io-pgtable.c | 1 +
include/linux/io-pgtable.h | 7 ++++
3 files changed, 70 insertions(+)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..1dd5c45b4b5b 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -127,6 +127,9 @@
#define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
#define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
+#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
+#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
+
/* IOPTE accessors */
#define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
@@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
{
arm_lpae_iopte pte;
+ if (data->iop.fmt == ARM_APPLE_DART) {
+ pte = 0;
+ if (!(prot & IOMMU_WRITE))
+ pte |= APPLE_DART_PTE_PROT_NO_WRITE;
+ if (!(prot & IOMMU_READ))
+ pte |= APPLE_DART_PTE_PROT_NO_READ;
+ return pte;
+ }
+
if (data->iop.fmt == ARM_64_LPAE_S1 ||
data->iop.fmt == ARM_32_LPAE_S1) {
pte = ARM_LPAE_PTE_nG;
@@ -1043,6 +1055,51 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
return NULL;
}
+static struct io_pgtable *
+apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
+{
+ struct arm_lpae_io_pgtable *data;
+ int i;
+
+ if (cfg->oas > 36)
+ return NULL;
+
+ data = arm_lpae_alloc_pgtable(cfg);
+ if (!data)
+ return NULL;
+
+ /*
+ * Apple's DART always requires three levels with the first level being
+ * stored in four MMIO registers. We always concatenate the first and
+ * second level so that we only have to setup the MMIO registers once.
+ * This results in an effective two level pagetable.
+ */
+ if (data->start_level < 1)
+ return NULL;
+ if (data->start_level == 1 && data->pgd_bits > 2)
+ return NULL;
+ if (data->start_level > 1)
+ data->pgd_bits = 0;
+ data->start_level = 2;
+ cfg->apple_dart_cfg.n_ttbrs = 1 << data->pgd_bits;
+ data->pgd_bits += data->bits_per_level;
+
+ data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data), GFP_KERNEL,
+ cfg);
+ if (!data->pgd)
+ goto out_free_data;
+
+ for (i = 0; i < cfg->apple_dart_cfg.n_ttbrs; ++i)
+ cfg->apple_dart_cfg.ttbr[i] =
+ virt_to_phys(data->pgd + i * ARM_LPAE_GRANULE(data));
+
+ return &data->iop;
+
+out_free_data:
+ kfree(data);
+ return NULL;
+}
+
struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
.alloc = arm_64_lpae_alloc_pgtable_s1,
.free = arm_lpae_free_pgtable,
@@ -1068,6 +1125,11 @@ struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
.free = arm_lpae_free_pgtable,
};
+struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns = {
+ .alloc = apple_dart_alloc_pgtable,
+ .free = arm_lpae_free_pgtable,
+};
+
#ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
static struct io_pgtable_cfg *cfg_cookie __initdata;
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 6e9917ce980f..fd8e6bd6caf9 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -20,6 +20,7 @@ io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
[ARM_64_LPAE_S1] = &io_pgtable_arm_64_lpae_s1_init_fns,
[ARM_64_LPAE_S2] = &io_pgtable_arm_64_lpae_s2_init_fns,
[ARM_MALI_LPAE] = &io_pgtable_arm_mali_lpae_init_fns,
+ [ARM_APPLE_DART] = &io_pgtable_apple_dart_init_fns,
#endif
#ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S
[ARM_V7S] = &io_pgtable_arm_v7s_init_fns,
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 4d40dfa75b55..a4bfac7f85f7 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -16,6 +16,7 @@ enum io_pgtable_fmt {
ARM_V7S,
ARM_MALI_LPAE,
AMD_IOMMU_V1,
+ ARM_APPLE_DART,
IO_PGTABLE_NUM_FMTS,
};
@@ -136,6 +137,11 @@ struct io_pgtable_cfg {
u64 transtab;
u64 memattr;
} arm_mali_lpae_cfg;
+
+ struct {
+ u64 ttbr[4];
+ u32 n_ttbrs;
+ } apple_dart_cfg;
};
};
@@ -246,5 +252,6 @@ extern struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns;
extern struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns;
extern struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns;
extern struct io_pgtable_init_fns io_pgtable_amd_iommu_v1_init_fns;
+extern struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns;
#endif /* __IO_PGTABLE_H */
--
2.25.1
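The level/TTBR arithmetic in `apple_dart_alloc_pgtable()` can be illustrated with a back-of-the-envelope sketch. Assuming the 16K granule the DART hardwires (page shift 14, hence 11 bits per table level), each TTBR roots a two-level table covering 11 + 11 + 14 = 36 bits of VA, and up to four concatenated TTBRs extend that by two more bits; `pgd_bits > 2` is rejected. `model_n_ttbrs()` is an invented helper that mirrors those checks, not the driver code itself.

```c
/*
 * Model of the TTBR-count arithmetic for a 16K granule: one TTBR covers
 * 36 bits of IOVA, the four MMIO TTBR registers act as a concatenated
 * top level for up to 38 bits. Returns -1 where the real code bails out.
 */
static int model_n_ttbrs(int ias)
{
	const int covered_per_ttbr = 11 + 11 + 14; /* bits mapped per TTBR */

	if (ias <= covered_per_ttbr)
		return 1;			/* pgd_bits == 0 case */
	if (ias > covered_per_ttbr + 2)
		return -1;			/* pgd_bits > 2: unsupported */
	return 1 << (ias - covered_per_ttbr);	/* 2 or 4 TTBRs */
}
```

So a typical 32-bit IOVA space needs a single TTBR, while the full 38-bit case consumes all four registers.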
On 27.06.21 16:34, Sven Peter wrote:
>
> Apple's DART iommu uses a pagetable format that shares some
> similarities with the ones already implemented by io-pgtable.c.
> Add a new format variant to support the required differences
> so that we don't have to duplicate the pagetable handling code.
>
> Signed-off-by: Sven Peter <[email protected]>
> ---
> drivers/iommu/io-pgtable-arm.c | 62 ++++++++++++++++++++++++++++++++++
> drivers/iommu/io-pgtable.c | 1 +
> include/linux/io-pgtable.h | 7 ++++
> 3 files changed, 70 insertions(+)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 87def58e79b5..1dd5c45b4b5b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -127,6 +127,9 @@
> #define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
> #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
>
> +#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
> +#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
> +
> /* IOPTE accessors */
> #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
>
> @@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> {
> arm_lpae_iopte pte;
>
> + if (data->iop.fmt == ARM_APPLE_DART) {
> + pte = 0;
> + if (!(prot & IOMMU_WRITE))
> + pte |= APPLE_DART_PTE_PROT_NO_WRITE;
> + if (!(prot & IOMMU_READ))
> + pte |= APPLE_DART_PTE_PROT_NO_READ;
> + return pte;
What about the other bits, such as shareability, XN, etc? Do they not
exist on DART? Or have they not been reverse engineered and 0s happen to
"just work"?
> + }
> +
> if (data->iop.fmt == ARM_64_LPAE_S1 ||
> data->iop.fmt == ARM_32_LPAE_S1) {
> pte = ARM_LPAE_PTE_nG;
> @@ -1043,6 +1055,51 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> return NULL;
> }
>
> +static struct io_pgtable *
> +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> +{
> + struct arm_lpae_io_pgtable *data;
> + int i;
> +
> + if (cfg->oas > 36)
> + return NULL;
> +
> + data = arm_lpae_alloc_pgtable(cfg);
> + if (!data)
> + return NULL;
> +
> + /*
> + * Apple's DART always requires three levels with the first level being
> + * stored in four MMIO registers. We always concatenate the first and
> + * second level so that we only have to setup the MMIO registers once.
> + * This results in an effective two level pagetable.
> + */
> + if (data->start_level < 1)
> + return NULL;
> + if (data->start_level == 1 && data->pgd_bits > 2)
> + return NULL;
> + if (data->start_level > 1)
> + data->pgd_bits = 0;
> + data->start_level = 2;
> + cfg->apple_dart_cfg.n_ttbrs = 1 << data->pgd_bits;
Maybe add a BUG_ON if n_ttbrs > ARRAY_SIZE(ttbr)? Or alternatively, do a
normal runtime check and bail out then.
Alex
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
On Mon, Jun 28, 2021, at 12:54, Alexander Graf wrote:
>
>
> On 27.06.21 16:34, Sven Peter wrote:
> >
> > Apple's DART iommu uses a pagetable format that shares some
> > similarities with the ones already implemented by io-pgtable.c.
> > Add a new format variant to support the required differences
> > so that we don't have to duplicate the pagetable handling code.
> >
> > Signed-off-by: Sven Peter <[email protected]>
> > ---
> > drivers/iommu/io-pgtable-arm.c | 62 ++++++++++++++++++++++++++++++++++
> > drivers/iommu/io-pgtable.c | 1 +
> > include/linux/io-pgtable.h | 7 ++++
> > 3 files changed, 70 insertions(+)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 87def58e79b5..1dd5c45b4b5b 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -127,6 +127,9 @@
> > #define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
> > #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
> >
> > +#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
> > +#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
> > +
> > /* IOPTE accessors */
> > #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
> >
> > @@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> > {
> > arm_lpae_iopte pte;
> >
> > + if (data->iop.fmt == ARM_APPLE_DART) {
> > + pte = 0;
> > + if (!(prot & IOMMU_WRITE))
> > + pte |= APPLE_DART_PTE_PROT_NO_WRITE;
> > + if (!(prot & IOMMU_READ))
> > + pte |= APPLE_DART_PTE_PROT_NO_READ;
> > + return pte;
>
> What about the other bits, such as shareability, XN, etc? Do they not
> exist on DART? Or have they not been reverse engineered and 0s happen to
> "just work"?
I'm fairly certain they don't exist (or are at least not used by XNU).
The co-processors that can run code also either use an entire separate iommu
(e.g. the GPU) or only use DART as a "second stage" and have their own
MMU which e.g. handles XN (e.g. the SEP or AOP).
>
> > + }
> > +
> > if (data->iop.fmt == ARM_64_LPAE_S1 ||
> > data->iop.fmt == ARM_32_LPAE_S1) {
> > pte = ARM_LPAE_PTE_nG;
> > @@ -1043,6 +1055,51 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> > return NULL;
> > }
> >
> > +static struct io_pgtable *
> > +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> > +{
> > + struct arm_lpae_io_pgtable *data;
> > + int i;
> > +
> > + if (cfg->oas > 36)
> > + return NULL;
> > +
> > + data = arm_lpae_alloc_pgtable(cfg);
> > + if (!data)
> > + return NULL;
> > +
> > + /*
> > + * Apple's DART always requires three levels with the first level being
> > + * stored in four MMIO registers. We always concatenate the first and
> > + * second level so that we only have to setup the MMIO registers once.
> > + * This results in an effective two level pagetable.
> > + */
> > + if (data->start_level < 1)
> > + return NULL;
> > + if (data->start_level == 1 && data->pgd_bits > 2)
> > + return NULL;
> > + if (data->start_level > 1)
> > + data->pgd_bits = 0;
> > + data->start_level = 2;
> > + cfg->apple_dart_cfg.n_ttbrs = 1 << data->pgd_bits;
>
> Maybe add a BUG_ON if n_ttbrs > ARRAY_SIZE(ttbr)? Or alternatively, do a
> normal runtime check and bail out then.
n_ttbrs can't be larger than 4 at this point due to the
previous checks.
I can add a BUG_ON though just to make it explicit and be safe in case those
checks or the array size ever change.
Sven
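The explicit bound check discussed here could look like the following standalone sketch; a runtime check that bails out (as Alex suggested as the alternative to `BUG_ON`) keeps the array size and the earlier level checks decoupled. The `model_*` names and the fixed slot count are invented for illustration.

```c
/*
 * Hypothetical explicit bound check for the TTBR array: refuse to
 * proceed instead of relying on the preceding start_level/pgd_bits
 * checks to keep n_ttbrs within ttbr[4].
 */
#define MODEL_TTBR_SLOTS 4 /* matches u64 ttbr[4] in apple_dart_cfg */

static int model_set_n_ttbrs(unsigned int pgd_bits, unsigned int *n_ttbrs)
{
	unsigned int n = 1u << pgd_bits;

	if (n > MODEL_TTBR_SLOTS)
		return -1; /* would overflow ttbr[]; caller frees and bails */
	*n_ttbrs = n;
	return 0;
}
```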
On 29.06.21 09:37, Sven Peter wrote:
>
> On Mon, Jun 28, 2021, at 12:54, Alexander Graf wrote:
>>
>>
>> On 27.06.21 16:34, Sven Peter wrote:
>>>
>>> Apple's DART iommu uses a pagetable format that shares some
>>> similarities with the ones already implemented by io-pgtable.c.
>>> Add a new format variant to support the required differences
>>> so that we don't have to duplicate the pagetable handling code.
>>>
>>> Signed-off-by: Sven Peter <[email protected]>
>>> ---
>>> drivers/iommu/io-pgtable-arm.c | 62 ++++++++++++++++++++++++++++++++++
>>> drivers/iommu/io-pgtable.c | 1 +
>>> include/linux/io-pgtable.h | 7 ++++
>>> 3 files changed, 70 insertions(+)
>>>
>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>> index 87def58e79b5..1dd5c45b4b5b 100644
>>> --- a/drivers/iommu/io-pgtable-arm.c
>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>> @@ -127,6 +127,9 @@
>>> #define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
>>> #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
>>>
>>> +#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
>>> +#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
>>> +
>>> /* IOPTE accessors */
>>> #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
>>>
>>> @@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>> {
>>> arm_lpae_iopte pte;
>>>
>>> + if (data->iop.fmt == ARM_APPLE_DART) {
>>> + pte = 0;
>>> + if (!(prot & IOMMU_WRITE))
>>> + pte |= APPLE_DART_PTE_PROT_NO_WRITE;
>>> + if (!(prot & IOMMU_READ))
>>> + pte |= APPLE_DART_PTE_PROT_NO_READ;
>>> + return pte;
>>
>> What about the other bits, such as sharability, XN, etc? Do they not
>> exist on DART? Or have they not been reverse engineered and 0s happen to
>> "just work"?
>
> I'm fairly certain they don't exist (or are at least not used by XNU).
>
> The co-processors that can run code also either use an entire separate iommu
> (e.g. the GPU) or only use DART as a "second stage" and have their own
> MMU which e.g. handles XN (e.g. the SEP or AOP).
Ok :).
>
>>
>>> + }
>>> +
>>> if (data->iop.fmt == ARM_64_LPAE_S1 ||
>>> data->iop.fmt == ARM_32_LPAE_S1) {
>>> pte = ARM_LPAE_PTE_nG;
>>> @@ -1043,6 +1055,51 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
>>> return NULL;
>>> }
>>>
>>> +static struct io_pgtable *
>>> +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
>>> +{
>>> + struct arm_lpae_io_pgtable *data;
>>> + int i;
>>> +
>>> + if (cfg->oas > 36)
>>> + return NULL;
>>> +
>>> + data = arm_lpae_alloc_pgtable(cfg);
>>> + if (!data)
>>> + return NULL;
>>> +
>>> + /*
>>> + * Apple's DART always requires three levels with the first level being
>>> + * stored in four MMIO registers. We always concatenate the first and
>>> + * second level so that we only have to setup the MMIO registers once.
>>> + * This results in an effective two level pagetable.
>>> + */
>>> + if (data->start_level < 1)
>>> + return NULL;
>>> + if (data->start_level == 1 && data->pgd_bits > 2)
>>> + return NULL;
>>> + if (data->start_level > 1)
>>> + data->pgd_bits = 0;
>>> + data->start_level = 2;
>>> + cfg->apple_dart_cfg.n_ttbrs = 1 << data->pgd_bits;
>>
>> Maybe add a BUG_ON if n_ttbrs > ARRAY_SIZE(ttbr)? Or alternatively, do a
>> normal runtime check and bail out then.
>
> n_ttbrs can't actually be larger than 4 at this point already due to the
> previous checks.
> I can add a BUG_ON though just to make it explicit and be safe in case those
> checks or the array size ever change.
Ah, now I see it too. No worries then - I agree that you have all cases
covered.
Reviewed-by: Alexander Graf <[email protected]>
Alex
Looks really good! Just a few minor comments. With them addressed,
Reviewed-by: Alyssa Rosenzweig <[email protected]>
> + Say Y here if you are using an Apple SoC with a DART IOMMU.
Nit: Do we need to spell out "with a DART IOMMU"? Don't all the apple
socs need DART?
> +/*
> + * This structure is used to identify a single stream attached to a domain.
> + * It's used as a list inside that domain to be able to attach multiple
> + * streams to a single domain. Since multiple devices can use a single stream
> + * it additionally keeps track of how many devices are represented by this
> + * stream. Once that number reaches zero it is detached from the IOMMU domain
> + * and all translations from this stream are disabled.
> + *
> + * @dart: DART instance to which this stream belongs
> + * @sid: stream id within the DART instance
> + * @num_devices: count of devices attached to this stream
> + * @stream_head: list head for the next stream
> + */
> +struct apple_dart_stream {
> + struct apple_dart *dart;
> + u32 sid;
> +
> + u32 num_devices;
> +
> + struct list_head stream_head;
> +};
It wasn't obvious to me why we can get away without reference counting.
Looking ahead it looks like we assert locks in each case. Maybe add
that to the comment?
```
> +static void apple_dart_hw_set_ttbr(struct apple_dart *dart, u16 sid, u16 idx,
> + phys_addr_t paddr)
> +{
> + writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> + dart->regs + DART_TTBR(sid, idx));
> +}
```
Should we be checking alignment here? Something like
BUG_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Hi,
On Wed, Jun 30, 2021, at 15:49, Alyssa Rosenzweig wrote:
> Looks really good! Just a few minor comments. With them addressed,
>
> Reviewed-by: Alyssa Rosenzweig <[email protected]>
Thanks!
>
> > + Say Y here if you are using an Apple SoC with a DART IOMMU.
>
> Nit: Do we need to spell out "with a DART IOMMU"? Don't all the apple
> socs need DART?
Good point, I'll remove it.
>
> > +/*
> > + * This structure is used to identify a single stream attached to a domain.
> > + * It's used as a list inside that domain to be able to attach multiple
> > + * streams to a single domain. Since multiple devices can use a single stream
> > + * it additionally keeps track of how many devices are represented by this
> > + * stream. Once that number reaches zero it is detached from the IOMMU domain
> > + * and all translations from this stream are disabled.
> > + *
> > + * @dart: DART instance to which this stream belongs
> > + * @sid: stream id within the DART instance
> > + * @num_devices: count of devices attached to this stream
> > + * @stream_head: list head for the next stream
> > + */
> > +struct apple_dart_stream {
> > + struct apple_dart *dart;
> > + u32 sid;
> > +
> > + u32 num_devices;
> > +
> > + struct list_head stream_head;
> > +};
>
> It wasn't obvious to me why we can get away without reference counting.
> Looking ahead it looks like we assert locks in each case. Maybe add
> that to the comment?
Sure, I'll add that to the comment.
>
> ```
> > +static void apple_dart_hw_set_ttbr(struct apple_dart *dart, u16 sid, u16 idx,
> > + phys_addr_t paddr)
> > +{
> > + writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> > + dart->regs + DART_TTBR(sid, idx));
> > +}
> ```
>
> Should we be checking alignment here? Something like
>
> BUG_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
>
Sure, right now paddr will always be aligned but adding that
BUG_ON doesn't hurt :)
Best,
Sven
> > Should we be checking alignment here? Something like
> >
> > BUG_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
> >
>
> Sure, right now paddr will always be aligned but adding that
> BUG_ON doesn't hurt :)
Probably should have suggested WARN_ON instead of BUG_ON but yes.
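The check the reviewers converged on — report and continue rather than halt — amounts to masking off the bits below the TTBR shift and warning if any are set. A minimal, testable form of that predicate is sketched below; `model_ttbr_misaligned()` is an invented helper, and the shift is left as a parameter since `DART_TTBR_SHIFT` is defined elsewhere in the patch.

```c
/*
 * Model of the suggested alignment check before programming a TTBR
 * register: corresponds to WARN_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1))
 * in apple_dart_hw_set_ttbr(). Low bits below the shift would be
 * silently dropped by the register write, so they must be zero.
 */
#include <stdint.h>

static int model_ttbr_misaligned(uint64_t paddr, unsigned int shift)
{
	return (paddr & ((1ull << shift) - 1)) != 0;
}
```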
On 2021-06-27 15:34, Sven Peter wrote:
> Apple's DART iommu uses a pagetable format that shares some
> similarities with the ones already implemented by io-pgtable.c.
> Add a new format variant to support the required differences
> so that we don't have to duplicate the pagetable handling code.
>
> Signed-off-by: Sven Peter <[email protected]>
> ---
> drivers/iommu/io-pgtable-arm.c | 62 ++++++++++++++++++++++++++++++++++
> drivers/iommu/io-pgtable.c | 1 +
> include/linux/io-pgtable.h | 7 ++++
> 3 files changed, 70 insertions(+)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 87def58e79b5..1dd5c45b4b5b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -127,6 +127,9 @@
> #define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
> #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
>
> +#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
> +#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
> +
> /* IOPTE accessors */
> #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
>
> @@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> {
> arm_lpae_iopte pte;
>
> + if (data->iop.fmt == ARM_APPLE_DART) {
> + pte = 0;
> + if (!(prot & IOMMU_WRITE))
> + pte |= APPLE_DART_PTE_PROT_NO_WRITE;
> + if (!(prot & IOMMU_READ))
> + pte |= APPLE_DART_PTE_PROT_NO_READ;
> + return pte;
> + }
> +
> if (data->iop.fmt == ARM_64_LPAE_S1 ||
> data->iop.fmt == ARM_32_LPAE_S1) {
> pte = ARM_LPAE_PTE_nG;
> @@ -1043,6 +1055,51 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> return NULL;
> }
>
> +static struct io_pgtable *
> +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> +{
> + struct arm_lpae_io_pgtable *data;
> + int i;
> +
> + if (cfg->oas > 36)
> + return NULL;
> +
> + data = arm_lpae_alloc_pgtable(cfg);
> + if (!data)
> + return NULL;
> +
> + /*
> + * Apple's DART always requires three levels with the first level being
> + * stored in four MMIO registers. We always concatenate the first and
> + * second level so that we only have to setup the MMIO registers once.
> + * This results in an effective two level pagetable.
> + */
Nit: I appreciate the effort to document the weirdness, but this comment
did rather mislead me initially, and now that I (think I) understand how
things work it seems a bit backwards. Could we say something like:
"The table format itself always uses two levels, but the total VA
space is mapped by four separate tables, making the MMIO registers
an effective "level 1". For simplicity, though, we treat this
equivalently to LPAE stage 2 concatenation at level 2, with the
additional TTBRs each just pointing at consecutive pages."
?
> + if (data->start_level < 1)
> + return NULL;
> + if (data->start_level == 1 && data->pgd_bits > 2)
> + return NULL;
> + if (data->start_level > 1)
> + data->pgd_bits = 0;
> + data->start_level = 2;
> + cfg->apple_dart_cfg.n_ttbrs = 1 << data->pgd_bits;
> + data->pgd_bits += data->bits_per_level;
> +
> + data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data), GFP_KERNEL,
> + cfg);
> + if (!data->pgd)
> + goto out_free_data;
> +
> + for (i = 0; i < cfg->apple_dart_cfg.n_ttbrs; ++i)
> + cfg->apple_dart_cfg.ttbr[i] =
> + virt_to_phys(data->pgd + i * ARM_LPAE_GRANULE(data));
> +
> + return &data->iop;
> +
> +out_free_data:
> + kfree(data);
> + return NULL;
> +}
> +
> struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
> .alloc = arm_64_lpae_alloc_pgtable_s1,
> .free = arm_lpae_free_pgtable,
> @@ -1068,6 +1125,11 @@ struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
> .free = arm_lpae_free_pgtable,
> };
>
> +struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns = {
> + .alloc = apple_dart_alloc_pgtable,
> + .free = arm_lpae_free_pgtable,
> +};
> +
> #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
>
> static struct io_pgtable_cfg *cfg_cookie __initdata;
> diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
> index 6e9917ce980f..fd8e6bd6caf9 100644
> --- a/drivers/iommu/io-pgtable.c
> +++ b/drivers/iommu/io-pgtable.c
> @@ -20,6 +20,7 @@ io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
> [ARM_64_LPAE_S1] = &io_pgtable_arm_64_lpae_s1_init_fns,
> [ARM_64_LPAE_S2] = &io_pgtable_arm_64_lpae_s2_init_fns,
> [ARM_MALI_LPAE] = &io_pgtable_arm_mali_lpae_init_fns,
> + [ARM_APPLE_DART] = &io_pgtable_apple_dart_init_fns,
> #endif
> #ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S
> [ARM_V7S] = &io_pgtable_arm_v7s_init_fns,
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 4d40dfa75b55..a4bfac7f85f7 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -16,6 +16,7 @@ enum io_pgtable_fmt {
> ARM_V7S,
> ARM_MALI_LPAE,
> AMD_IOMMU_V1,
> + ARM_APPLE_DART,
s/ARM_// - this is pure Apple ;)
With that fixed and hopefully a somewhat clarified comment above,
Reviewed-by: Robin Murphy <[email protected]>
> IO_PGTABLE_NUM_FMTS,
> };
>
> @@ -136,6 +137,11 @@ struct io_pgtable_cfg {
> u64 transtab;
> u64 memattr;
> } arm_mali_lpae_cfg;
> +
> + struct {
> + u64 ttbr[4];
> + u32 n_ttbrs;
> + } apple_dart_cfg;
> };
> };
>
> @@ -246,5 +252,6 @@ extern struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns;
> extern struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns;
> extern struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns;
> extern struct io_pgtable_init_fns io_pgtable_amd_iommu_v1_init_fns;
> +extern struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns;
>
> #endif /* __IO_PGTABLE_H */
>
^^ Nit: the subsystem style for the subject format should be
"iommu/dart: Add..." - similarly on patch #1, which I just realised I
missed (sorry!)
On 2021-06-27 15:34, Sven Peter wrote:
> Apple's new SoCs use iommus for almost all peripherals. These Device
> Address Resolution Tables must be setup before these peripherals can
> act as DMA masters.
>
> Signed-off-by: Sven Peter <[email protected]>
> ---
> MAINTAINERS | 1 +
> drivers/iommu/Kconfig | 15 +
> drivers/iommu/Makefile | 1 +
> drivers/iommu/apple-dart-iommu.c | 1058 ++++++++++++++++++++++++++++++
I'd be inclined to drop "-iommu" from the filename, unless there's some
other "apple-dart" functionality that might lead to a module name clash
in future?
> 4 files changed, 1075 insertions(+)
> create mode 100644 drivers/iommu/apple-dart-iommu.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 29e5541c8f21..c1ffaa56b5f9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1245,6 +1245,7 @@ M: Sven Peter <[email protected]>
> L: [email protected]
> S: Maintained
> F: Documentation/devicetree/bindings/iommu/apple,dart.yaml
> +F: drivers/iommu/apple-dart-iommu.c
>
> APPLE SMC DRIVER
> M: Henrik Rydberg <[email protected]>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 1f111b399bca..87882c628b46 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -249,6 +249,21 @@ config SPAPR_TCE_IOMMU
> Enables bits of IOMMU API required by VFIO. The iommu_ops
> is not implemented as it is not necessary for VFIO.
>
> +config IOMMU_APPLE_DART
> + tristate "Apple DART IOMMU Support"
> + depends on ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
> + select IOMMU_API
> + select IOMMU_IO_PGTABLE
This is redundant - the individual formats already select it.
> + select IOMMU_IO_PGTABLE_LPAE
> + default ARCH_APPLE
> + help
> + Support for Apple DART (Device Address Resolution Table) IOMMUs
> + found in Apple ARM SoCs like the M1.
> + This IOMMU is required for most peripherals using DMA to access
> + the main memory.
> +
> + Say Y here if you are using an Apple SoC with a DART IOMMU.
> +
> # ARM IOMMU support
> config ARM_SMMU
> tristate "ARM Ltd. System MMU (SMMU) Support"
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index c0fb0ba88143..8c813f0ebc54 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -29,3 +29,4 @@ obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
> obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o io-pgfault.o
> obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
> +obj-$(CONFIG_IOMMU_APPLE_DART) += apple-dart-iommu.o
> diff --git a/drivers/iommu/apple-dart-iommu.c b/drivers/iommu/apple-dart-iommu.c
> new file mode 100644
> index 000000000000..637ba6e7cef9
> --- /dev/null
> +++ b/drivers/iommu/apple-dart-iommu.c
> @@ -0,0 +1,1058 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Apple DART (Device Address Resolution Table) IOMMU driver
> + *
> + * Copyright (C) 2021 The Asahi Linux Contributors
> + *
> + * Based on arm/arm-smmu/arm-ssmu.c and arm/arm-smmu-v3/arm-smmu-v3.c
> + * Copyright (C) 2013 ARM Limited
> + * Copyright (C) 2015 ARM Limited
> + * and on exynos-iommu.c
> + * Copyright (c) 2011,2016 Samsung Electronics Co., Ltd.
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/clk.h>
> +#include <linux/dma-direct.h>
Oh, right, bus_dma_region. Fair enough :)
> +#include <linux/dma-iommu.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/err.h>
> +#include <linux/interrupt.h>
> +#include <linux/io-pgtable.h>
> +#include <linux/iopoll.h>
> +#include <linux/list.h>
> +#include <linux/lockdep.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_iommu.h>
> +#include <linux/of_platform.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/ratelimit.h>
> +#include <linux/slab.h>
> +#include <linux/pci.h>
Redundant duplicate
> +#define DART_MAX_STREAMS 16
> +#define DART_MAX_TTBR 4
> +
> +#define DART_STREAM_ALL 0xffff
> +
> +#define DART_PARAMS1 0x00
> +#define DART_PARAMS_PAGE_SHIFT GENMASK(27, 24)
> +
> +#define DART_PARAMS2 0x04
> +#define DART_PARAMS_BYPASS_SUPPORT BIT(0)
> +
> +#define DART_STREAM_COMMAND 0x20
> +#define DART_STREAM_COMMAND_BUSY BIT(2)
> +#define DART_STREAM_COMMAND_INVALIDATE BIT(20)
> +
> +#define DART_STREAM_SELECT 0x34
> +
> +#define DART_ERROR 0x40
> +#define DART_ERROR_STREAM GENMASK(27, 24)
> +#define DART_ERROR_CODE GENMASK(23, 0)
> +#define DART_ERROR_FLAG BIT(31)
> +#define DART_ERROR_READ_FAULT BIT(4)
> +#define DART_ERROR_WRITE_FAULT BIT(3)
> +#define DART_ERROR_NO_PTE BIT(2)
> +#define DART_ERROR_NO_PMD BIT(1)
> +#define DART_ERROR_NO_TTBR BIT(0)
> +
> +#define DART_CONFIG 0x60
> +#define DART_CONFIG_LOCK BIT(15)
> +
> +#define DART_STREAM_COMMAND_BUSY_TIMEOUT 100
> +
> +#define DART_STREAM_REMAP 0x80
> +
> +#define DART_ERROR_ADDR_HI 0x54
> +#define DART_ERROR_ADDR_LO 0x50
> +
> +#define DART_TCR(sid) (0x100 + 4 * (sid))
> +#define DART_TCR_TRANSLATE_ENABLE BIT(7)
> +#define DART_TCR_BYPASS0_ENABLE BIT(8)
> +#define DART_TCR_BYPASS1_ENABLE BIT(12)
> +
> +#define DART_TTBR(sid, idx) (0x200 + 16 * (sid) + 4 * (idx))
> +#define DART_TTBR_VALID BIT(31)
> +#define DART_TTBR_SHIFT 12
> +
> +/*
> + * Private structure associated with each DART device.
> + *
> + * @dev: device struct
> + * @regs: mapped MMIO region
> + * @irq: interrupt number, can be shared with other DARTs
> + * @clks: clocks associated with this DART
> + * @num_clks: number of @clks
> + * @lock: lock for @used_sids and hardware operations involving this dart
> + * @used_sids: bitmap of streams attached to a domain
> + * @pgsize: pagesize supported by this DART
> + * @supports_bypass: indicates if this DART supports bypass mode
> + * @force_bypass: force bypass mode due to pagesize mismatch?
> + * @sw_bypass_cpu_start: offset into cpu address space in software bypass mode
> + * @sw_bypass_dma_start: offset into dma address space in software bypass mode
> + * @sw_bypass_len: length of iova space in software bypass mode
> + * @iommu: iommu core device
> + */
> +struct apple_dart {
> + struct device *dev;
> +
> + void __iomem *regs;
> +
> + int irq;
> + struct clk_bulk_data *clks;
> + int num_clks;
> +
> + spinlock_t lock;
> +
> + u32 used_sids;
> + u32 pgsize;
> +
> + u32 supports_bypass : 1;
> + u32 force_bypass : 1;
> +
> + u64 sw_bypass_cpu_start;
> + u64 sw_bypass_dma_start;
> + u64 sw_bypass_len;
> +
> + struct iommu_device iommu;
> +};
> +
> +/*
> + * This structure is used to identify a single stream attached to a domain.
> + * It's used as a list inside that domain to be able to attach multiple
> + * streams to a single domain. Since multiple devices can use a single stream
> + * it additionally keeps track of how many devices are represented by this
> + * stream. Once that number reaches zero it is detached from the IOMMU domain
> + * and all translations from this stream are disabled.
That sounds a lot like something you should be doing properly with groups.
> + * @dart: DART instance to which this stream belongs
> + * @sid: stream id within the DART instance
> + * @num_devices: count of devices attached to this stream
> + * @stream_head: list head for the next stream
> + */
> +struct apple_dart_stream {
> + struct apple_dart *dart;
> + u32 sid;
What are the actual SID values like? If they're large and sparse then
maybe a list makes sense, but if they're small and dense then an array
hanging off the apple_dart structure itself might be more efficient.
Given DART_MAX_STREAMS, I'm thinking the latter, and considerably so.
The impression I'm getting so far is that this seems conceptually a bit
like arm-smmu with stream indexing.
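Something like this completely untested userspace sketch (all names invented) is the shape I have in mind — a small refcount array hanging off the DART, indexed directly by SID:

```c
#include <stdint.h>

#define DART_MAX_STREAMS 16

/* Hypothetical alternative to the linked list: since SIDs are small
 * and dense (< DART_MAX_STREAMS), a per-DART array of device
 * refcounts indexed by SID avoids the heap-allocated list entries. */
struct dart_stream_refs {
	uint32_t num_devices[DART_MAX_STREAMS];
};

/* Returns the new refcount, or -1 on an out-of-range SID. */
static int stream_attach(struct dart_stream_refs *r, uint32_t sid)
{
	if (sid >= DART_MAX_STREAMS)
		return -1;
	return (int)++r->num_devices[sid];
}

/* Returns the new refcount (0 means: disable translation for this
 * stream), or -1 on underflow / out-of-range SID. */
static int stream_detach(struct dart_stream_refs *r, uint32_t sid)
{
	if (sid >= DART_MAX_STREAMS || r->num_devices[sid] == 0)
		return -1;
	return (int)--r->num_devices[sid];
}
```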
> + u32 num_devices;
> +
> + struct list_head stream_head;
> +};
> +
> +/*
> + * This structure is attached to each iommu domain handled by a DART.
> + * A single domain is used to represent a single virtual address space.
> + * It is always allocated together with a page table.
> + *
> + * Streams are the smallest units the DART hardware can differentiate.
> + * These are pointed to the page table of a domain whenever a device is
> + * attached to it. A single stream can only be assigned to a single domain.
> + *
> + * Devices are assigned to at least a single and sometimes multiple individual
> + * streams (using the iommus property in the device tree). Multiple devices
> + * can theoretically be represented by the same stream, though this is usually
> + * not the case.
> + *
> + * We only keep track of streams here and just count how many devices are
> + * represented by each stream. When the last device is removed the whole stream
> + * is removed from the domain.
> + *
> + * @dart: pointer to the DART instance
> + * @pgtbl_ops: pagetable ops allocated by io-pgtable
> + * @type: domain type IOMMU_DOMAIN_IDENTITY_{IDENTITY,DMA,UNMANAGED,BLOCKED}
> + * @sw_bypass_cpu_start: offset into cpu address space in software bypass mode
> + * @sw_bypass_dma_start: offset into dma address space in software bypass mode
> + * @sw_bypass_len: length of iova space in software bypass mode
> + * @streams: list of streams attached to this domain
> + * @lock: spinlock for operations involving the list of streams
> + * @domain: core iommu domain pointer
> + */
> +struct apple_dart_domain {
> + struct apple_dart *dart;
> + struct io_pgtable_ops *pgtbl_ops;
> +
> + unsigned int type;
Given that this is assigned from domain->type it appears to be redundant.
> + u64 sw_bypass_cpu_start;
> + u64 sw_bypass_dma_start;
> + u64 sw_bypass_len;
> +
> + struct list_head streams;
I'm starting to think this could just be a bitmap, in a u16 even.
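i.e. something like this (untested sketch, helper names made up) — with at most 16 streams per DART, the set of attached SIDs fits in one u16:

```c
#include <stdint.h>

/* Hypothetical sketch: track which of the (at most 16) streams are
 * attached to a domain in a single u16 bitmap, one bit per SID,
 * instead of a list of allocated stream structures. */
static inline uint16_t stream_bitmap_add(uint16_t map, unsigned int sid)
{
	return map | (uint16_t)(1u << sid);
}

static inline uint16_t stream_bitmap_del(uint16_t map, unsigned int sid)
{
	return map & (uint16_t)~(1u << sid);
}

static inline int stream_bitmap_test(uint16_t map, unsigned int sid)
{
	return !!(map & (1u << sid));
}
```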
> +
> + spinlock_t lock;
> +
> + struct iommu_domain domain;
> +};
> +
> +/*
> + * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
> + * and contains a list of streams bound to this device as defined in the
> + * device tree. Multiple DART instances can be attached to a single device
> + * and each stream is identified by its stream id.
> + * It's usually reference by a pointer called *cfg.
> + *
> + * A dynamic array instead of a linked list is used here since in almost
> + * all cases a device will just be attached to a single stream and streams
> + * are never removed after they have been added.
> + *
> + * @num_streams: number of streams attached
> + * @streams: array of structs to identify attached streams and the device link
> + * to the iommu
> + */
> +struct apple_dart_master_cfg {
> + int num_streams;
> + struct {
> + struct apple_dart *dart;
> + u32 sid;
Can't you use the fwspec for this?
> + struct device_link *link;
Is it necessary to use stateless links, or could you use
DL_FLAG_AUTOREMOVE_SUPPLIER and not have to keep track of them manually?
> + } streams[];
> +};
> +
> +static struct platform_driver apple_dart_driver;
> +static const struct iommu_ops apple_dart_iommu_ops;
> +static const struct iommu_flush_ops apple_dart_tlb_ops;
> +
> +static struct apple_dart_domain *to_dart_domain(struct iommu_domain *dom)
> +{
> + return container_of(dom, struct apple_dart_domain, domain);
> +}
> +
> +static void apple_dart_hw_enable_translation(struct apple_dart *dart, u16 sid)
> +{
> + writel(DART_TCR_TRANSLATE_ENABLE, dart->regs + DART_TCR(sid));
> +}
> +
> +static void apple_dart_hw_disable_dma(struct apple_dart *dart, u16 sid)
> +{
> + writel(0, dart->regs + DART_TCR(sid));
> +}
> +
> +static void apple_dart_hw_enable_bypass(struct apple_dart *dart, u16 sid)
> +{
> + WARN_ON(!dart->supports_bypass);
> + writel(DART_TCR_BYPASS0_ENABLE | DART_TCR_BYPASS1_ENABLE,
> + dart->regs + DART_TCR(sid));
> +}
> +
> +static void apple_dart_hw_set_ttbr(struct apple_dart *dart, u16 sid, u16 idx,
> + phys_addr_t paddr)
> +{
> + writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> + dart->regs + DART_TTBR(sid, idx));
> +}
> +
> +static void apple_dart_hw_clear_ttbr(struct apple_dart *dart, u16 sid, u16 idx)
> +{
> + writel(0, dart->regs + DART_TTBR(sid, idx));
> +}
> +
> +static void apple_dart_hw_clear_all_ttbrs(struct apple_dart *dart, u16 sid)
> +{
> + int i;
> +
> + for (i = 0; i < 4; ++i)
> + apple_dart_hw_clear_ttbr(dart, sid, i);
> +}
> +
> +static int apple_dart_hw_stream_command(struct apple_dart *dart, u16 sid_bitmap,
> + u32 command)
> +{
> + unsigned long flags;
> + int ret;
> + u32 command_reg;
> +
> + spin_lock_irqsave(&dart->lock, flags);
> +
> + writel(sid_bitmap, dart->regs + DART_STREAM_SELECT);
> + writel(command, dart->regs + DART_STREAM_COMMAND);
> +
> + ret = readl_poll_timeout_atomic(
> + dart->regs + DART_STREAM_COMMAND, command_reg,
> + !(command_reg & DART_STREAM_COMMAND_BUSY), 1,
> + DART_STREAM_COMMAND_BUSY_TIMEOUT);
> +
> + spin_unlock_irqrestore(&dart->lock, flags);
> +
> + if (ret) {
> + dev_err(dart->dev,
> + "busy bit did not clear after command %x for streams %x\n",
> + command, sid_bitmap);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int apple_dart_hw_invalidate_tlb_global(struct apple_dart *dart)
> +{
> + return apple_dart_hw_stream_command(dart, DART_STREAM_ALL,
> + DART_STREAM_COMMAND_INVALIDATE);
> +}
> +
> +static int apple_dart_hw_invalidate_tlb_stream(struct apple_dart *dart, u16 sid)
> +{
> + return apple_dart_hw_stream_command(dart, 1 << sid,
> + DART_STREAM_COMMAND_INVALIDATE);
> +}
> +
> +static int apple_dart_hw_reset(struct apple_dart *dart)
> +{
> + int sid;
> + u32 config;
> +
> + config = readl(dart->regs + DART_CONFIG);
> + if (config & DART_CONFIG_LOCK) {
> + dev_err(dart->dev, "DART is locked down until reboot: %08x\n",
> + config);
> + return -EINVAL;
> + }
> +
> + for (sid = 0; sid < DART_MAX_STREAMS; ++sid) {
> + apple_dart_hw_disable_dma(dart, sid);
> + apple_dart_hw_clear_all_ttbrs(dart, sid);
> + }
> +
> + /* restore stream identity map */
> + writel(0x03020100, dart->regs + DART_STREAM_REMAP);
> + writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
> + writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
> + writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
Any hint of what the magic numbers mean?
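FWIW, they look like a byte-packed identity map — byte i of the 16-byte register block holding stream index i — so word n would be (purely speculative, illustrative userspace code):

```c
#include <stdint.h>

/* Speculative decoding of the DART_STREAM_REMAP constants: each
 * 32-bit word packs four 8-bit stream indices, so writing index i
 * into byte i restores a 1:1 stream remapping. */
static uint32_t remap_identity_word(unsigned int word)
{
	uint32_t v = 0;
	unsigned int byte;

	for (byte = 0; byte < 4; byte++)
		v |= (uint32_t)(4 * word + byte) << (8 * byte);
	return v;
}
```

If that's right, a comment (or a loop computing the values) would be clearer than the raw constants.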
> + /* clear any pending errors before the interrupt is unmasked */
> + writel(readl(dart->regs + DART_ERROR), dart->regs + DART_ERROR);
> +
> + return apple_dart_hw_invalidate_tlb_global(dart);
> +}
> +
> +static void apple_dart_domain_flush_tlb(struct apple_dart_domain *domain)
> +{
> + unsigned long flags;
> + struct apple_dart_stream *stream;
> + struct apple_dart *dart = domain->dart;
> +
> + if (!dart)
> + return;
Can that happen? Feels like it's probably a bug elsewhere if it could :/
> + spin_lock_irqsave(&domain->lock, flags);
> + list_for_each_entry(stream, &domain->streams, stream_head) {
> + apple_dart_hw_invalidate_tlb_stream(stream->dart, stream->sid);
> + }
> + spin_unlock_irqrestore(&domain->lock, flags);
> +}
> +
> +static void apple_dart_flush_iotlb_all(struct iommu_domain *domain)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> +
> + apple_dart_domain_flush_tlb(dart_domain);
> +}
> +
> +static void apple_dart_iotlb_sync(struct iommu_domain *domain,
> + struct iommu_iotlb_gather *gather)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> +
> + apple_dart_domain_flush_tlb(dart_domain);
> +}
> +
> +static void apple_dart_iotlb_sync_map(struct iommu_domain *domain,
> + unsigned long iova, size_t size)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> +
> + apple_dart_domain_flush_tlb(dart_domain);
> +}
> +
> +static void apple_dart_tlb_flush_all(void *cookie)
> +{
> + struct apple_dart_domain *domain = cookie;
> +
> + apple_dart_domain_flush_tlb(domain);
> +}
> +
> +static void apple_dart_tlb_flush_walk(unsigned long iova, size_t size,
> + size_t granule, void *cookie)
> +{
> + struct apple_dart_domain *domain = cookie;
> +
> + apple_dart_domain_flush_tlb(domain);
> +}
> +
> +static const struct iommu_flush_ops apple_dart_tlb_ops = {
> + .tlb_flush_all = apple_dart_tlb_flush_all,
> + .tlb_flush_walk = apple_dart_tlb_flush_walk,
> + .tlb_add_page = NULL,
> +};
> +
> +static phys_addr_t apple_dart_iova_to_phys(struct iommu_domain *domain,
> + dma_addr_t iova)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> + struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
> +
> + if (domain->type == IOMMU_DOMAIN_IDENTITY &&
> + dart_domain->dart->supports_bypass)
That second check seems redundant - if you don't support bypass surely
you shouldn't have allowed attaching an identity domain in the first
place? And even if you fake one with a pagetable you shouldn't need to
walk it, for obvious reasons ;)
TBH, dealing with identity domains in iova_to_phys at all irks me - it's
largely due to dubious hacks in networking drivers which hopefully you
should never have to deal with on M1 anyway, and either way it's not
like they can't check the domain type themselves and save a pointless
indirect call altogether :(
> + return iova;
> + if (!ops)
> + return -ENODEV;
> +
> + return ops->iova_to_phys(ops, iova);
> +}
> +
> +static int apple_dart_map(struct iommu_domain *domain, unsigned long iova,
> + phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> + struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
> +
> + if (!ops)
> + return -ENODEV;
> + if (prot & IOMMU_MMIO)
> + return -EINVAL;
> + if (prot & IOMMU_NOEXEC)
> + return -EINVAL;
Hmm, I guess the usual expectation is just to ignore any prot flags you
can't enforce - after all, some IOMMUs don't even have a notion of read
or write permissions.
> + return ops->map(ops, iova, paddr, size, prot, gfp);
> +}
> +
> +static size_t apple_dart_unmap(struct iommu_domain *domain, unsigned long iova,
> + size_t size, struct iommu_iotlb_gather *gather)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> + struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
> +
> + if (!ops)
> + return 0;
That should never legitimately happen, since no previous mapping could
have succeeded either.
> + return ops->unmap(ops, iova, size, gather);
> +}
> +
> +static int apple_dart_prepare_sw_bypass(struct apple_dart *dart,
> + struct apple_dart_domain *dart_domain,
> + struct device *dev)
> +{
> + lockdep_assert_held(&dart_domain->lock);
> +
> + if (dart->supports_bypass)
> + return 0;
> + if (dart_domain->type != IOMMU_DOMAIN_IDENTITY)
> + return 0;
> +
> + // use the bus region from the first attached dev for the bypass range
> + if (!dart->sw_bypass_len) {
> + const struct bus_dma_region *dma_rgn = dev->dma_range_map;
> +
> + if (!dma_rgn)
> + return -EINVAL;
> +
> + dart->sw_bypass_len = dma_rgn->size;
> + dart->sw_bypass_cpu_start = dma_rgn->cpu_start;
> + dart->sw_bypass_dma_start = dma_rgn->dma_start;
> + }
> +
> + // ensure that we don't mix different bypass setups
> + if (dart_domain->sw_bypass_len) {
> + if (dart->sw_bypass_len != dart_domain->sw_bypass_len)
> + return -EINVAL;
> + if (dart->sw_bypass_cpu_start !=
> + dart_domain->sw_bypass_cpu_start)
> + return -EINVAL;
> + if (dart->sw_bypass_dma_start !=
> + dart_domain->sw_bypass_dma_start)
> + return -EINVAL;
> + } else {
> + dart_domain->sw_bypass_len = dart->sw_bypass_len;
> + dart_domain->sw_bypass_cpu_start = dart->sw_bypass_cpu_start;
> + dart_domain->sw_bypass_dma_start = dart->sw_bypass_dma_start;
> + }
> +
> + return 0;
> +}
> +
> +static int apple_dart_domain_needs_pgtbl_ops(struct apple_dart *dart,
> + struct iommu_domain *domain)
> +{
> + if (domain->type == IOMMU_DOMAIN_DMA)
> + return 1;
> + if (domain->type == IOMMU_DOMAIN_UNMANAGED)
> + return 1;
> + if (!dart->supports_bypass && domain->type == IOMMU_DOMAIN_IDENTITY)
> + return 1;
> + return 0;
> +}
> +
> +static int apple_dart_finalize_domain(struct iommu_domain *domain)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> + struct apple_dart *dart = dart_domain->dart;
> + struct io_pgtable_cfg pgtbl_cfg;
> +
> + lockdep_assert_held(&dart_domain->lock);
> +
> + if (dart_domain->pgtbl_ops)
> + return 0;
> + if (!apple_dart_domain_needs_pgtbl_ops(dart, domain))
> + return 0;
> +
> + pgtbl_cfg = (struct io_pgtable_cfg){
> + .pgsize_bitmap = dart->pgsize,
> + .ias = 32,
> + .oas = 36,
> + .coherent_walk = 1,
> + .tlb = &apple_dart_tlb_ops,
> + .iommu_dev = dart->dev,
> + };
> +
> + dart_domain->pgtbl_ops =
> + alloc_io_pgtable_ops(ARM_APPLE_DART, &pgtbl_cfg, domain);
> + if (!dart_domain->pgtbl_ops)
> + return -ENOMEM;
> +
> + domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> + domain->geometry.aperture_start = 0;
> + domain->geometry.aperture_end = DMA_BIT_MASK(32);
> + domain->geometry.force_aperture = true;
> +
> + /*
> + * Some DARTs come without hardware bypass support but we may still
> + * be forced to use bypass mode (to e.g. allow kernels with 4K pages to
> + * boot). If we reach this point with an identity domain we have to setup
> + * bypass mode in software. This is done by creating a static pagetable
> + * for a linear map specified by dma-ranges in the device tree.
> + */
> + if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> + u64 offset;
> + int ret;
> +
> + for (offset = 0; offset < dart_domain->sw_bypass_len;
> + offset += dart->pgsize) {
> + ret = dart_domain->pgtbl_ops->map(
> + dart_domain->pgtbl_ops,
> + dart_domain->sw_bypass_dma_start + offset,
> + dart_domain->sw_bypass_cpu_start + offset,
> + dart->pgsize, IOMMU_READ | IOMMU_WRITE,
> + GFP_ATOMIC);
> + if (ret < 0) {
> + free_io_pgtable_ops(dart_domain->pgtbl_ops);
> + dart_domain->pgtbl_ops = NULL;
> + return -EINVAL;
> + }
> + }
Could you set up a single per-DART pagetable in prepare_sw_bypass (or
even better at probe time if you think you're likely to need it) and
just share that between all fake identity domains? That could be a
follow-up optimisation, though.
> + }
> +
> + return 0;
> +}
> +
> +static void
> +apple_dart_stream_setup_translation(struct apple_dart_domain *domain,
> + struct apple_dart *dart, u32 sid)
> +{
> + int i;
> + struct io_pgtable_cfg *pgtbl_cfg =
> + &io_pgtable_ops_to_pgtable(domain->pgtbl_ops)->cfg;
> +
> + for (i = 0; i < pgtbl_cfg->apple_dart_cfg.n_ttbrs; ++i)
> + apple_dart_hw_set_ttbr(dart, sid, i,
> + pgtbl_cfg->apple_dart_cfg.ttbr[i]);
> + for (; i < DART_MAX_TTBR; ++i)
> + apple_dart_hw_clear_ttbr(dart, sid, i);
> +
> + apple_dart_hw_enable_translation(dart, sid);
> + apple_dart_hw_invalidate_tlb_stream(dart, sid);
> +}
> +
> +static int apple_dart_attach_stream(struct apple_dart_domain *domain,
> + struct apple_dart *dart, u32 sid)
> +{
> + unsigned long flags;
> + struct apple_dart_stream *stream;
> + int ret;
> +
> + lockdep_assert_held(&domain->lock);
> +
> + if (WARN_ON(dart->force_bypass &&
> + domain->type != IOMMU_DOMAIN_IDENTITY))
> + return -EINVAL;
Ideally you shouldn't allow that to happen, but I guess if you have
mixed capabilities across different instances then in principle an
unmanaged domain could still slip through. But then again a user of an
unmanaged domain might be OK with using larger pages anyway. Either way
I'm not sure it's worthy of a WARN (similarly below) since it doesn't
represent a "this should never happen" condition if the user has got
their hands on a VFIO driver and is mucking about, it's just a normal
failure because you can't support the attachment.
> + /*
> + * we can't mix and match DARTs that support bypass mode with those who don't
> + * because the iova space in fake bypass mode generally has an offset
> + */
Erm, something doesn't sound right there... IOMMU_DOMAIN_IDENTITY should
be exactly what it says, regardless of how it's implemented. If you
can't provide a true identity mapping then you're probably better off
not pretending to support them in the first place.
> + if (WARN_ON(domain->type == IOMMU_DOMAIN_IDENTITY &&
> + (domain->dart->supports_bypass != dart->supports_bypass)))
> + return -EINVAL;
> +
> + list_for_each_entry(stream, &domain->streams, stream_head) {
> + if (stream->dart == dart && stream->sid == sid) {
> + stream->num_devices++;
> + return 0;
> + }
> + }
> +
> + spin_lock_irqsave(&dart->lock, flags);
> +
> + if (WARN_ON(dart->used_sids & BIT(sid))) {
> + ret = -EINVAL;
> + goto error;
> + }
> +
> + stream = kzalloc(sizeof(*stream), GFP_ATOMIC);
> + if (!stream) {
> + ret = -ENOMEM;
> + goto error;
> + }
Couldn't you do this outside the lock? (Assuming, per my other
comments, it doesn't get refactored out of existence anyway.)
> + stream->dart = dart;
> + stream->sid = sid;
> + stream->num_devices = 1;
> + list_add(&stream->stream_head, &domain->streams);
> +
> + dart->used_sids |= BIT(sid);
> + spin_unlock_irqrestore(&dart->lock, flags);
> +
> + apple_dart_hw_clear_all_ttbrs(stream->dart, stream->sid);
> +
> + switch (domain->type) {
> + case IOMMU_DOMAIN_IDENTITY:
> + if (stream->dart->supports_bypass)
> + apple_dart_hw_enable_bypass(stream->dart, stream->sid);
> + else
> + apple_dart_stream_setup_translation(
> + domain, stream->dart, stream->sid);
> + break;
> + case IOMMU_DOMAIN_BLOCKED:
> + apple_dart_hw_disable_dma(stream->dart, stream->sid);
> + break;
> + case IOMMU_DOMAIN_UNMANAGED:
> + case IOMMU_DOMAIN_DMA:
> + apple_dart_stream_setup_translation(domain, stream->dart,
> + stream->sid);
> + break;
> + }
> +
> + return 0;
> +
> +error:
> + spin_unlock_irqrestore(&dart->lock, flags);
> + return ret;
> +}
> +
> +static void apple_dart_disable_stream(struct apple_dart *dart, u32 sid)
> +{
> + unsigned long flags;
> +
> + apple_dart_hw_disable_dma(dart, sid);
> + apple_dart_hw_clear_all_ttbrs(dart, sid);
> + apple_dart_hw_invalidate_tlb_stream(dart, sid);
> +
> + spin_lock_irqsave(&dart->lock, flags);
> + dart->used_sids &= ~BIT(sid);
> + spin_unlock_irqrestore(&dart->lock, flags);
> +}
> +
> +static void apple_dart_detach_stream(struct apple_dart_domain *domain,
> + struct apple_dart *dart, u32 sid)
> +{
> + struct apple_dart_stream *stream;
> +
> + lockdep_assert_held(&domain->lock);
> +
> + list_for_each_entry(stream, &domain->streams, stream_head) {
> + if (stream->dart == dart && stream->sid == sid) {
> + stream->num_devices--;
> +
> + if (stream->num_devices == 0) {
> + apple_dart_disable_stream(dart, sid);
> + list_del(&stream->stream_head);
> + kfree(stream);
> + }
> + return;
> + }
> + }
> +}
> +
> +static int apple_dart_attach_dev(struct iommu_domain *domain,
> + struct device *dev)
> +{
> + int ret;
> + int i, j;
> + unsigned long flags;
> + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> + struct apple_dart *dart = cfg->streams[0].dart;
> +
> + if (WARN_ON(dart->force_bypass &&
> + dart_domain->type != IOMMU_DOMAIN_IDENTITY)) {
> + dev_warn(
> + dev,
> + "IOMMU must be in bypass mode but trying to attach to translated domain.\n");
> + return -EINVAL;
> + }
Again, a bit excessive with the warnings. In fact, transpose my comment
from apple_dart_attach_stream() to here, because this means the
equivalent warning there is literally unreachable :/
> + spin_lock_irqsave(&dart_domain->lock, flags);
> +
> + ret = apple_dart_prepare_sw_bypass(dart, dart_domain, dev);
> + if (ret)
> + goto out;
> +
> + if (!dart_domain->dart)
> + dart_domain->dart = dart;
> +
> + ret = apple_dart_finalize_domain(domain);
> + if (ret)
> + goto out;
> +
> + for (i = 0; i < cfg->num_streams; ++i) {
> + ret = apple_dart_attach_stream(
> + dart_domain, cfg->streams[i].dart, cfg->streams[i].sid);
> + if (ret) {
> + /* try to undo what we did before returning */
> + for (j = 0; j < i; ++j)
> + apple_dart_detach_stream(dart_domain,
> + cfg->streams[j].dart,
> + cfg->streams[j].sid);
> +
> + goto out;
> + }
> + }
> +
> + ret = 0;
> +
> +out:
> + spin_unlock_irqrestore(&dart_domain->lock, flags);
> + return ret;
> +}
> +
> +static void apple_dart_detach_dev(struct iommu_domain *domain,
> + struct device *dev)
> +{
> + int i;
> + unsigned long flags;
> + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> +
> + spin_lock_irqsave(&dart_domain->lock, flags);
> +
> + for (i = 0; i < cfg->num_streams; ++i)
> + apple_dart_detach_stream(dart_domain, cfg->streams[i].dart,
> + cfg->streams[i].sid);
> +
> + spin_unlock_irqrestore(&dart_domain->lock, flags);
> +}
> +
> +static struct iommu_device *apple_dart_probe_device(struct device *dev)
> +{
> + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> + int i;
> +
> + if (!cfg)
> + return ERR_PTR(-ENODEV);
> +
> + for (i = 0; i < cfg->num_streams; ++i) {
> + cfg->streams[i].link =
> + device_link_add(dev, cfg->streams[i].dart->dev,
> + DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
> + }
> +
> + return &cfg->streams[0].dart->iommu;
> +}
> +
> +static void apple_dart_release_device(struct device *dev)
> +{
> + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> + int i;
> +
> + if (!cfg)
> + return;
Shouldn't happen - if it's disappeared since probe_device succeeded
you've got bigger problems anyway.
> +
> + for (i = 0; i < cfg->num_streams; ++i)
> + device_link_del(cfg->streams[i].link);
> +
> + dev_iommu_priv_set(dev, NULL);
> + kfree(cfg);
> +}
> +
> +static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
> +{
> + struct apple_dart_domain *dart_domain;
> +
> + if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED &&
> + type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED)
> + return NULL;
I want to say there's not much point in that, but then I realise I've
spent the last couple of days writing patches to add a new domain type :)
> + dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL);
> + if (!dart_domain)
> + return NULL;
> +
> + INIT_LIST_HEAD(&dart_domain->streams);
> + spin_lock_init(&dart_domain->lock);
> + iommu_get_dma_cookie(&dart_domain->domain);
> + dart_domain->type = type;
Yeah, this is "useful" for a handful of CPU cycles until we return and
iommu_domain_alloc() sets dart_domain->domain->type to the same thing,
all before *its* caller even knows the domain exists.
> + return &dart_domain->domain;
> +}
> +
> +static void apple_dart_domain_free(struct iommu_domain *domain)
> +{
> + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> +
> + WARN_ON(!list_empty(&dart_domain->streams));
Why? This code is perfectly legal API usage:
	d = iommu_domain_alloc(bus);
	if (d)
		iommu_domain_free(d);
Sure it looks pointless, but it's the kind of thing that can
legitimately happen (with a lot more going on in between) if an
unmanaged domain user tears itself down before it gets round to
attaching, due to probe deferral or some other error condition.
> + kfree(dart_domain);
> +}
> +
> +static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
> +{
> + struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> + unsigned int num_streams = cfg ? cfg->num_streams : 0;
> + struct apple_dart_master_cfg *cfg_new;
> + struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
> +
> + if (args->args_count != 1)
> + return -EINVAL;
> +
> + cfg_new = krealloc(cfg, struct_size(cfg, streams, num_streams + 1),
> + GFP_KERNEL);
> + if (!cfg_new)
> + return -ENOMEM;
> +
> + cfg = cfg_new;
> + dev_iommu_priv_set(dev, cfg);
> +
> + cfg->num_streams = num_streams;
> + cfg->streams[cfg->num_streams].dart = dart;
> + cfg->streams[cfg->num_streams].sid = args->args[0];
> + cfg->num_streams++;
Yeah, this is way too reminiscent of the fwspec code for comfort. Even
if you can't use autoremove links for some reason, an array of 16
device_link pointers hung off apple_dart still wins over these little
pointer-heavy structures if you need more than a few of them.
> + return 0;
> +}
> +
> +static struct iommu_group *apple_dart_device_group(struct device *dev)
> +{
> +#ifdef CONFIG_PCI
> + struct iommu_group *group;
> +
> + if (dev_is_pci(dev))
> + group = pci_device_group(dev);
> + else
> + group = generic_device_group(dev);
...and this is where it gets bad :(
If you can have multiple devices behind the same stream such that the
IOMMU can't tell them apart, you *have* to ensure they get put in the
same group, so that the IOMMU core knows the topology (and reflects it
correctly to userspace) and doesn't try to do things that then
unexpectedly fail. This is the point where you need to check if a stream
is already known, and return the existing group if so, and then you
won't need to check and refcount all the time in attach/detach because
the IOMMU core will do the right thing for you.
Many drivers only run on systems where devices don't alias at the IOMMU
level (aliasing at the PCI level is already taken care of), or use a
single group because effectively everything aliases, so it's not the
most common scenario, but as I mentioned before, arm-smmu is one that
does - take a look at the flow through that in the "!smmu->smrs" cases
for the closest example.
> +
> + return group;
> +#else
> + return generic_device_group(dev);
> +#endif
> +}
> +
> +static int apple_dart_def_domain_type(struct device *dev)
> +{
> + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> + struct apple_dart *dart = cfg->streams[0].dart;
> +
> + if (dart->force_bypass)
> + return IOMMU_DOMAIN_IDENTITY;
> + if (!dart->supports_bypass)
> + return IOMMU_DOMAIN_DMA;
> +
> + return 0;
> +}
> +
> +static const struct iommu_ops apple_dart_iommu_ops = {
> + .domain_alloc = apple_dart_domain_alloc,
> + .domain_free = apple_dart_domain_free,
> + .attach_dev = apple_dart_attach_dev,
> + .detach_dev = apple_dart_detach_dev,
> + .map = apple_dart_map,
> + .unmap = apple_dart_unmap,
> + .flush_iotlb_all = apple_dart_flush_iotlb_all,
> + .iotlb_sync = apple_dart_iotlb_sync,
> + .iotlb_sync_map = apple_dart_iotlb_sync_map,
> + .iova_to_phys = apple_dart_iova_to_phys,
> + .probe_device = apple_dart_probe_device,
> + .release_device = apple_dart_release_device,
> + .device_group = apple_dart_device_group,
> + .of_xlate = apple_dart_of_xlate,
> + .def_domain_type = apple_dart_def_domain_type,
> + .pgsize_bitmap = -1UL, /* Restricted during dart probe */
> +};
> +
> +static irqreturn_t apple_dart_irq(int irq, void *dev)
> +{
> + struct apple_dart *dart = dev;
> + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> + DEFAULT_RATELIMIT_BURST);
> + const char *fault_name = NULL;
> + u32 error = readl(dart->regs + DART_ERROR);
> + u32 error_code = FIELD_GET(DART_ERROR_CODE, error);
> + u32 addr_lo = readl(dart->regs + DART_ERROR_ADDR_LO);
> + u32 addr_hi = readl(dart->regs + DART_ERROR_ADDR_HI);
> + u64 addr = addr_lo | (((u64)addr_hi) << 32);
> + u8 stream_idx = FIELD_GET(DART_ERROR_STREAM, error);
> +
> + if (!(error & DART_ERROR_FLAG))
> + return IRQ_NONE;
> +
> + if (error_code & DART_ERROR_READ_FAULT)
> + fault_name = "READ FAULT";
> + else if (error_code & DART_ERROR_WRITE_FAULT)
> + fault_name = "WRITE FAULT";
> + else if (error_code & DART_ERROR_NO_PTE)
> + fault_name = "NO PTE FOR IOVA";
> + else if (error_code & DART_ERROR_NO_PMD)
> + fault_name = "NO PMD FOR IOVA";
> + else if (error_code & DART_ERROR_NO_TTBR)
> + fault_name = "NO TTBR FOR IOVA";
Can multiple bits be set at once or is there a strict precedence?
> + if (WARN_ON(fault_name == NULL))
You're already logging a clear and attributable message below; I can
guarantee that a big noisy backtrace showing that you got here from
el0_irq() or el1_irq() is not useful over and above that.
> + fault_name = "unknown";
> +
> + if (__ratelimit(&rs)) {
Just use dev_err_ratelimited() to hide the guts if you're not doing
anything tricky.
> + dev_err(dart->dev,
> + "translation fault: status:0x%x stream:%d code:0x%x (%s) at 0x%llx",
> + error, stream_idx, error_code, fault_name, addr);
> + }
> +
> + writel(error, dart->regs + DART_ERROR);
> + return IRQ_HANDLED;
> +}
> +
> +static int apple_dart_probe(struct platform_device *pdev)
> +{
> + int ret;
> + u32 dart_params[2];
> + struct resource *res;
> + struct apple_dart *dart;
> + struct device *dev = &pdev->dev;
> +
> + dart = devm_kzalloc(dev, sizeof(*dart), GFP_KERNEL);
> + if (!dart)
> + return -ENOMEM;
> +
> + dart->dev = dev;
> + spin_lock_init(&dart->lock);
> +
> + if (pdev->num_resources < 1)
> + return -ENODEV;
But you have 2 resources (one MEM and one IRQ)? And anyway their
respective absences would hardly go unnoticed below.
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + if (resource_size(res) < 0x4000) {
> + dev_err(dev, "MMIO region too small (%pr)\n", res);
> + return -EINVAL;
> + }
> +
> + dart->regs = devm_ioremap_resource(dev, res);
> + if (IS_ERR(dart->regs))
> + return PTR_ERR(dart->regs);
> +
> + ret = devm_clk_bulk_get_all(dev, &dart->clks);
> + if (ret < 0)
> + return ret;
> + dart->num_clks = ret;
> +
> + ret = clk_bulk_prepare_enable(dart->num_clks, dart->clks);
> + if (ret)
> + return ret;
> +
> + ret = apple_dart_hw_reset(dart);
> + if (ret)
> + goto err_clk_disable;
> +
> + dart_params[0] = readl(dart->regs + DART_PARAMS1);
> + dart_params[1] = readl(dart->regs + DART_PARAMS2);
> + dart->pgsize = 1 << FIELD_GET(DART_PARAMS_PAGE_SHIFT, dart_params[0]);
> + dart->supports_bypass = dart_params[1] & DART_PARAMS_BYPASS_SUPPORT;
> + dart->force_bypass = dart->pgsize > PAGE_SIZE;
> +
> + dart->irq = platform_get_irq(pdev, 0);
> + if (dart->irq < 0) {
> + ret = -ENODEV;
> + goto err_clk_disable;
> + }
FWIW I'd get the IRQ earlier when there's still nothing to clean up on
failure - it's only the request which needs to wait until you've
actually set up enough to be able to handle it if it does fire.
> + ret = devm_request_irq(dart->dev, dart->irq, apple_dart_irq,
> + IRQF_SHARED, "apple-dart fault handler", dart);
Be very careful with this pattern of mixing devres-managed IRQs with
explicitly-managed clocks, especially when IRQF_SHARED is in play. In
the failure path here, and in remove, you have a period where the clocks
have been disabled but the IRQ is still live - try CONFIG_DEBUG_SHIRQ
and don't be surprised if you deadlock trying to read an unclocked register.
If you can't also offload the clock management to devres to guarantee
ordering relative to the IRQ (I think I saw some patches recently), it's
probably safest to manually manage the latter.
> + if (ret)
> + goto err_clk_disable;
> +
> + platform_set_drvdata(pdev, dart);
> +
> + ret = iommu_device_sysfs_add(&dart->iommu, dev, NULL, "apple-dart.%s",
> + dev_name(&pdev->dev));
> + if (ret)
> + goto err_clk_disable;
> +
> + ret = iommu_device_register(&dart->iommu, &apple_dart_iommu_ops, dev);
> + if (ret)
> + goto err_clk_disable;
> +
> + if (dev->bus->iommu_ops != &apple_dart_iommu_ops) {
> + ret = bus_set_iommu(dev->bus, &apple_dart_iommu_ops);
> + if (ret)
> + goto err_clk_disable;
> + }
> +#ifdef CONFIG_PCI
> + if (dev->bus->iommu_ops != pci_bus_type.iommu_ops) {
But it's still a platform device, not a PCI device?
> + ret = bus_set_iommu(&pci_bus_type, &apple_dart_iommu_ops);
> + if (ret)
> + goto err_clk_disable;
And the platform bus ops?
> + }
> +#endif
> +
> + dev_info(
> + &pdev->dev,
> + "DART [pagesize %x, bypass support: %d, bypass forced: %d] initialized\n",
> + dart->pgsize, dart->supports_bypass, dart->force_bypass);
> + return 0;
> +
> +err_clk_disable:
> + clk_bulk_disable(dart->num_clks, dart->clks);
> + clk_bulk_unprepare(dart->num_clks, dart->clks);
No need to open-code clk_bulk_disable_unprepare() ;)
> + return ret;
> +}
> +
> +static int apple_dart_remove(struct platform_device *pdev)
> +{
> + struct apple_dart *dart = platform_get_drvdata(pdev);
> +
> + devm_free_irq(dart->dev, dart->irq, dart);
> +
> + iommu_device_unregister(&dart->iommu);
> + iommu_device_sysfs_remove(&dart->iommu);
> +
> + clk_bulk_disable(dart->num_clks, dart->clks);
> + clk_bulk_unprepare(dart->num_clks, dart->clks);
Ditto.
And again the bus ops are still installed - that'll get really fun if
this is a module unload...
> + return 0;
> +}
> +
> +static void apple_dart_shutdown(struct platform_device *pdev)
> +{
> + apple_dart_remove(pdev);
The main reason for doing something on shutdown is in the case of kexec,
to put the hardware back into a disabled or otherwise sane state so as
not to trip up whatever the subsequent payload is. If you're not doing
that (which may be legitimate if the expectation is that software must
always fully reset and initialise a DART before I/O can work) then
there's not much point in doing anything, really. Stuff like tidying up
sysfs is a complete waste of time when the world's about to end ;)
Robin.
> +}
> +
> +static const struct of_device_id apple_dart_of_match[] = {
> + { .compatible = "apple,t8103-dart", .data = NULL },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, apple_dart_of_match);
> +
> +static struct platform_driver apple_dart_driver = {
> + .driver = {
> + .name = "apple-dart",
> + .of_match_table = apple_dart_of_match,
> + },
> + .probe = apple_dart_probe,
> + .remove = apple_dart_remove,
> + .shutdown = apple_dart_shutdown,
> +};
> +module_platform_driver(apple_dart_driver);
> +
> +MODULE_DESCRIPTION("IOMMU API for Apple's DART");
> +MODULE_AUTHOR("Sven Peter <[email protected]>");
> +MODULE_LICENSE("GPL v2");
>
Hi,
On Tue, Jul 13, 2021, at 21:17, Robin Murphy wrote:
> On 2021-06-27 15:34, Sven Peter wrote:
> > Apple's DART iommu uses a pagetable format that shares some
> > similarities with the ones already implemented by io-pgtable.c.
> > Add a new format variant to support the required differences
> > so that we don't have to duplicate the pagetable handling code.
> >
> > Signed-off-by: Sven Peter <[email protected]>
> > ---
> > drivers/iommu/io-pgtable-arm.c | 62 ++++++++++++++++++++++++++++++++++
> > drivers/iommu/io-pgtable.c | 1 +
> > include/linux/io-pgtable.h | 7 ++++
> > 3 files changed, 70 insertions(+)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 87def58e79b5..1dd5c45b4b5b 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -127,6 +127,9 @@
> > #define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
> > #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
> >
> > +#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
> > +#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
> > +
> > /* IOPTE accessors */
> > #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
> >
> > @@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> > {
> > arm_lpae_iopte pte;
> >
> > + if (data->iop.fmt == ARM_APPLE_DART) {
> > + pte = 0;
> > + if (!(prot & IOMMU_WRITE))
> > + pte |= APPLE_DART_PTE_PROT_NO_WRITE;
> > + if (!(prot & IOMMU_READ))
> > + pte |= APPLE_DART_PTE_PROT_NO_READ;
> > + return pte;
> > + }
> > +
> > if (data->iop.fmt == ARM_64_LPAE_S1 ||
> > data->iop.fmt == ARM_32_LPAE_S1) {
> > pte = ARM_LPAE_PTE_nG;
> > @@ -1043,6 +1055,51 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> > return NULL;
> > }
> >
> > +static struct io_pgtable *
> > +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> > +{
> > + struct arm_lpae_io_pgtable *data;
> > + int i;
> > +
> > + if (cfg->oas > 36)
> > + return NULL;
> > +
> > + data = arm_lpae_alloc_pgtable(cfg);
> > + if (!data)
> > + return NULL;
> > +
> > + /*
> > + * Apple's DART always requires three levels with the first level being
> > + * stored in four MMIO registers. We always concatenate the first and
> > + * second level so that we only have to setup the MMIO registers once.
> > + * This results in an effective two level pagetable.
> > + */
>
> Nit: I appreciate the effort to document the weirdness, but this comment
> did rather mislead me initially, and now that I (think I) understand how
> things work it seems a bit backwards. Could we say something like:
>
> "The table format itself always uses two levels, but the total VA
> space is mapped by four separate tables, making the MMIO registers
> an effective "level 1". For simplicity, though, we treat this
> equivalently to LPAE stage 2 concatenation at level 2, with the
> additional TTBRs each just pointing at consecutive pages."
>
> ?
>
Sure, your version is much easier to understand! Thanks.
> > + if (data->start_level < 1)
> > + return NULL;
> > + if (data->start_level == 1 && data->pgd_bits > 2)
> > + return NULL;
> > + if (data->start_level > 1)
> > + data->pgd_bits = 0;
> > + data->start_level = 2;
> > + cfg->apple_dart_cfg.n_ttbrs = 1 << data->pgd_bits;
> > + data->pgd_bits += data->bits_per_level;
> > +
> > + data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data), GFP_KERNEL,
> > + cfg);
> > + if (!data->pgd)
> > + goto out_free_data;
> > +
> > + for (i = 0; i < cfg->apple_dart_cfg.n_ttbrs; ++i)
> > + cfg->apple_dart_cfg.ttbr[i] =
> > + virt_to_phys(data->pgd + i * ARM_LPAE_GRANULE(data));
> > +
> > + return &data->iop;
> > +
> > +out_free_data:
> > + kfree(data);
> > + return NULL;
> > +}
> > +
> > struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
> > .alloc = arm_64_lpae_alloc_pgtable_s1,
> > .free = arm_lpae_free_pgtable,
> > @@ -1068,6 +1125,11 @@ struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
> > .free = arm_lpae_free_pgtable,
> > };
> >
> > +struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns = {
> > + .alloc = apple_dart_alloc_pgtable,
> > + .free = arm_lpae_free_pgtable,
> > +};
> > +
> > #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
> >
> > static struct io_pgtable_cfg *cfg_cookie __initdata;
> > diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
> > index 6e9917ce980f..fd8e6bd6caf9 100644
> > --- a/drivers/iommu/io-pgtable.c
> > +++ b/drivers/iommu/io-pgtable.c
> > @@ -20,6 +20,7 @@ io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
> > [ARM_64_LPAE_S1] = &io_pgtable_arm_64_lpae_s1_init_fns,
> > [ARM_64_LPAE_S2] = &io_pgtable_arm_64_lpae_s2_init_fns,
> > [ARM_MALI_LPAE] = &io_pgtable_arm_mali_lpae_init_fns,
> > + [ARM_APPLE_DART] = &io_pgtable_apple_dart_init_fns,
> > #endif
> > #ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S
> > [ARM_V7S] = &io_pgtable_arm_v7s_init_fns,
> > diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> > index 4d40dfa75b55..a4bfac7f85f7 100644
> > --- a/include/linux/io-pgtable.h
> > +++ b/include/linux/io-pgtable.h
> > @@ -16,6 +16,7 @@ enum io_pgtable_fmt {
> > ARM_V7S,
> > ARM_MALI_LPAE,
> > AMD_IOMMU_V1,
> > + ARM_APPLE_DART,
>
> s/ARM_// - this is pure Apple ;)
>
> With that fixed and hopefully a somewhat clarified comment above,
>
> Reviewed-by: Robin Murphy <[email protected]>
True, I'll remove the ARM_ prefix and also change the commit message to match the
subsystem style as mentioned in your reply for the third patch.
Thanks for the review!
>
> > IO_PGTABLE_NUM_FMTS,
> > };
> >
> > @@ -136,6 +137,11 @@ struct io_pgtable_cfg {
> > u64 transtab;
> > u64 memattr;
> > } arm_mali_lpae_cfg;
> > +
> > + struct {
> > + u64 ttbr[4];
> > + u32 n_ttbrs;
> > + } apple_dart_cfg;
> > };
> > };
> >
> > @@ -246,5 +252,6 @@ extern struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns;
> > extern struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns;
> > extern struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns;
> > extern struct io_pgtable_init_fns io_pgtable_amd_iommu_v1_init_fns;
> > +extern struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns;
> >
> > #endif /* __IO_PGTABLE_H */
> >
>
Sven
On 2021-06-27 15:34, Sven Peter wrote:
[...]
> In the long term, I'd like to extend the dma-iommu framework itself to
> support iommu pagesizes with a larger granule than the CPU pagesize if that is
> something you agree with.
BTW this isn't something we can fully support in general. IOMMU API
users may expect this to work:
iommu_map(domain, iova, page_to_phys(p1), PAGE_SIZE, prot);
iommu_map(domain, iova + PAGE_SIZE, page_to_phys(p2), PAGE_SIZE, prot);
Although they do in principle have visibility of pgsize_bitmap, I still
doubt anyone is really prepared for CPU-page-aligned mappings to fail.
Even at the DMA API level you could hide *some* of it (at the cost of
effectively only having 1/4 of the usable address space), but there are
still cases like v4l2, which has a hard requirement that a page-aligned
scatterlist can be mapped into a contiguous region of DMA addresses.
> This would be important to later support the thunderbolt DARTs since I would be
> very uncomfortable to have these running in (software or hardware) bypass mode.
Funnily enough that's the one case that would be relatively workable,
since untrusted devices are currently subject to bounce-buffering of the
entire DMA request, so it doesn't matter so much how the bounce buffer
itself is mapped. Even with the possible future optimisation of only
bouncing the non-page-aligned start and end parts of a buffer I think it
still works (the physical alignment just has to be considered in terms
of the IOMMU granule).
Robin.
On Wed, Jul 14, 2021 at 8:21 PM Robin Murphy <[email protected]> wrote:
>
> On 2021-06-27 15:34, Sven Peter wrote:
> [...]
> > In the long term, I'd like to extend the dma-iommu framework itself to
> > support iommu pagesizes with a larger granule than the CPU pagesize if that is
> > something you agree with.
>
> BTW this isn't something we can fully support in general. IOMMU API
> users may expect this to work:
>
> iommu_map(domain, iova, page_to_phys(p1), PAGE_SIZE, prot);
> iommu_map(domain, iova + PAGE_SIZE, page_to_phys(p2), PAGE_SIZE, prot);
>
> Although they do in principle have visibility of pgsize_bitmap, I still
> doubt anyone is really prepared for CPU-page-aligned mappings to fail.
> Even at the DMA API level you could hide *some* of it (at the cost of
> effectively only having 1/4 of the usable address space), but there are
> still cases like v4l2, which has a hard requirement that a page-aligned
> scatterlist can be mapped into a contiguous region of DMA addresses.
I think that was the same conclusion we had earlier: supporting the
dma-mapping interfaces should be possible with large iotlb pages, but any
driver directly using the IOMMU API, such as VFIO, would not work.
The question is how we can best allow one but not the other.
Arnd
On Wed, Jul 14, 2021 at 10:51:34PM +0200, Arnd Bergmann wrote:
> The question is how we can best allow one but not the other.
By only allowing to allocate domains of type IDENTITY and DMA, but fail
to allocate UNMANAGED domains.
Regards,
Joerg
Hi,
Awesome, thanks a lot for the detailed review!
On Wed, Jul 14, 2021, at 01:23, Robin Murphy wrote:
> ^^ Nit: the subsystem style for the subject format should be
> "iommu/dart: Add..." - similarly on patch #1, which I just realised I
> missed (sorry!)
Sure!
>
> On 2021-06-27 15:34, Sven Peter wrote:
> > Apple's new SoCs use iommus for almost all peripherals. These Device
> > Address Resolution Tables must be setup before these peripherals can
> > act as DMA masters.
> >
> > Signed-off-by: Sven Peter <[email protected]>
> > ---
> > MAINTAINERS | 1 +
> > drivers/iommu/Kconfig | 15 +
> > drivers/iommu/Makefile | 1 +
> > drivers/iommu/apple-dart-iommu.c | 1058 ++++++++++++++++++++++++++++++
>
> I'd be inclined to drop "-iommu" from the filename, unless there's some
> other "apple-dart" functionality that might lead to a module name clash
> in future?
Sure, DART should only be an iommu.
>
> > 4 files changed, 1075 insertions(+)
> > create mode 100644 drivers/iommu/apple-dart-iommu.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 29e5541c8f21..c1ffaa56b5f9 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1245,6 +1245,7 @@ M: Sven Peter <[email protected]>
> > L: [email protected]
> > S: Maintained
> > F: Documentation/devicetree/bindings/iommu/apple,dart.yaml
> > +F: drivers/iommu/apple-dart-iommu.c
> >
> > APPLE SMC DRIVER
> > M: Henrik Rydberg <[email protected]>
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index 1f111b399bca..87882c628b46 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -249,6 +249,21 @@ config SPAPR_TCE_IOMMU
> > Enables bits of IOMMU API required by VFIO. The iommu_ops
> > is not implemented as it is not necessary for VFIO.
> >
> > +config IOMMU_APPLE_DART
> > + tristate "Apple DART IOMMU Support"
> > + depends on ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
> > + select IOMMU_API
> > + select IOMMU_IO_PGTABLE
>
> This is redundant - the individual formats already select it.
Removed for the next version.
>
[...]
> > +#include <linux/pci.h>
>
> Redundant duplicate
Whoops, removed for the next version as well.
>
> > +#define DART_MAX_STREAMS 16
[...]
> > +
> > +/*
> > + * This structure is used to identify a single stream attached to a domain.
> > + * It's used as a list inside that domain to be able to attach multiple
> > + * streams to a single domain. Since multiple devices can use a single stream
> > + * it additionally keeps track of how many devices are represented by this
> > + * stream. Once that number reaches zero it is detached from the IOMMU domain
> > + * and all translations from this stream are disabled.
>
> That sounds a lot like something you should be doing properly with groups.
The hint to look at arm-smmu for a similar flow was very helpful, thanks!
Now that I understand how these groups works I completely agree that this
needs to be reworked and done properly.
>
> > + * @dart: DART instance to which this stream belongs
> > + * @sid: stream id within the DART instance
> > + * @num_devices: count of devices attached to this stream
> > + * @stream_head: list head for the next stream
> > + */
> > +struct apple_dart_stream {
> > + struct apple_dart *dart;
> > + u32 sid;
>
> What are the actual SID values like? If they're large and sparse then
> maybe a list makes sense, but if they're small and dense then an array
> hanging off the apple_dart structure itself might be more efficient.
> Given DART_MAX_STREAMS, I'm thinking the latter, and considerably so.
>
> The impression I'm getting so far is that this seems conceptually a bit
> like arm-smmu with stream indexing.
There are two (very similar) types of DARTs.
The one supported with this series has up to 16 stream ids which will be
integers <16. There's another variant used for Thunderbolt for which I will
add support in a follow-up that supports up to 64 stream ids then.
So at worst this is an array with 64 entries if this structure won't
disappear completely.
And yes, this is conceptually a bit like arm-smmu's stream indexing I think.
>
> > + u32 num_devices;
> > +
> > + struct list_head stream_head;
> > +};
> > +
> > +/*
> > + * This structure is attached to each iommu domain handled by a DART.
> > + * A single domain is used to represent a single virtual address space.
> > + * It is always allocated together with a page table.
> > + *
> > + * Streams are the smallest units the DART hardware can differentiate.
> > + * These are pointed to the page table of a domain whenever a device is
> > + * attached to it. A single stream can only be assigned to a single domain.
> > + *
> > + * Devices are assigned to at least a single and sometimes multiple individual
> > + * streams (using the iommus property in the device tree). Multiple devices
> > + * can theoretically be represented by the same stream, though this is usually
> > + * not the case.
> > + *
> > + * We only keep track of streams here and just count how many devices are
> > + * represented by each stream. When the last device is removed the whole stream
> > + * is removed from the domain.
> > + *
> > + * @dart: pointer to the DART instance
> > + * @pgtbl_ops: pagetable ops allocated by io-pgtable
> > + * @type: domain type IOMMU_DOMAIN_{IDENTITY,DMA,UNMANAGED,BLOCKED}
> > + * @sw_bypass_cpu_start: offset into cpu address space in software bypass mode
> > + * @sw_bypass_dma_start: offset into dma address space in software bypass mode
> > + * @sw_bypass_len: length of iova space in software bypass mode
> > + * @streams: list of streams attached to this domain
> > + * @lock: spinlock for operations involving the list of streams
> > + * @domain: core iommu domain pointer
> > + */
> > +struct apple_dart_domain {
> > + struct apple_dart *dart;
> > + struct io_pgtable_ops *pgtbl_ops;
> > +
> > + unsigned int type;
>
> Given that this is assigned from domain->type it appears to be redundant.
Yup, removed.
>
> > + u64 sw_bypass_cpu_start;
> > + u64 sw_bypass_dma_start;
> > + u64 sw_bypass_len;
> > +
> > + struct list_head streams;
>
> I'm staring to think this could just be a bitmap, in a u16 even.
The problem is that these streams may come from two different
DART instances. That is required for e.g. the dwc3 controller which
has a weird quirk where DMA transactions go through two separate
DARTs with no clear pattern (e.g. some xhci control structures use the
first dart while other structures use the second one).
Both of them need to point to the same pagetable.
In the device tree the node will have an entry like this:
dwc3_0: usb@382280000{
...
iommus = <&dwc3_0_dart_0 0>, <&dwc3_0_dart_1 1>;
};
There's no need for a linked list though once I do this properly with
groups. I can just use an array allocated when the first device is
attached, which just contains apple_dart* and streamid values.
>
> > +
> > + spinlock_t lock;
> > +
> > + struct iommu_domain domain;
> > +};
> > +
> > +/*
> > + * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
> > + * and contains a list of streams bound to this device as defined in the
> > + * device tree. Multiple DART instances can be attached to a single device
> > + * and each stream is identified by its stream id.
> > + * It's usually referenced by a pointer called *cfg.
> > + *
> > + * A dynamic array instead of a linked list is used here since in almost
> > + * all cases a device will just be attached to a single stream and streams
> > + * are never removed after they have been added.
> > + *
> > + * @num_streams: number of streams attached
> > + * @streams: array of structs to identify attached streams and the device link
> > + * to the iommu
> > + */
> > +struct apple_dart_master_cfg {
> > + int num_streams;
> > + struct {
> > + struct apple_dart *dart;
> > + u32 sid;
>
> Can't you use the fwspec for this?
I'd be happy to use the fwspec code if that's somehow possible.
I'm not sure how though since I need to store both the reference to the DART
_and_ to the stream id. As far as I can tell the fwspec code would only allow
to store the stream ids.
(see also the previous comment regarding the dwc3 node which requires stream
ids from two separate DART instances)
>
> > + struct device_link *link;
>
> Is it necessary to use stateless links, or could you use
> DL_FLAG_AUTOREMOVE_SUPPLIER and not have to keep track of them manually?
I'll just use DL_FLAG_AUTOREMOVE_SUPPLIER. No idea why I went for stateless links.
>
[...]
> > + /* restore stream identity map */
> > + writel(0x03020100, dart->regs + DART_STREAM_REMAP);
> > + writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
> > + writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
> > + writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
>
> Any hint of what the magic numbers mean?
Yes, it's just 0,1,2,3,...,0xe,0xf, but I can't do 8-bit writes to the bus,
and 32-bit writes then require these slightly awkward "swapped" numbers.
I'll add a comment since it's not obvious at first glance.
>
> > + /* clear any pending errors before the interrupt is unmasked */
> > + writel(readl(dart->regs + DART_ERROR), dart->regs + DART_ERROR);
> > +
> > + return apple_dart_hw_invalidate_tlb_global(dart);
> > +}
> > +
> > +static void apple_dart_domain_flush_tlb(struct apple_dart_domain *domain)
> > +{
> > + unsigned long flags;
> > + struct apple_dart_stream *stream;
> > + struct apple_dart *dart = domain->dart;
> > +
> > + if (!dart)
> > + return;
>
> Can that happen? Feels like it's probably a bug elsewhere if it could :/
No, this can't happen. I'll remove it.
>
> > + spin_lock_irqsave(&domain->lock, flags);
> > + list_for_each_entry(stream, &domain->streams, stream_head) {
> > + apple_dart_hw_invalidate_tlb_stream(stream->dart, stream->sid);
> > + }
> > + spin_unlock_irqrestore(&domain->lock, flags);
> > +}
> > +
> > +static void apple_dart_flush_iotlb_all(struct iommu_domain *domain)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > +
> > + apple_dart_domain_flush_tlb(dart_domain);
> > +}
> > +
> > +static void apple_dart_iotlb_sync(struct iommu_domain *domain,
> > + struct iommu_iotlb_gather *gather)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > +
> > + apple_dart_domain_flush_tlb(dart_domain);
> > +}
> > +
> > +static void apple_dart_iotlb_sync_map(struct iommu_domain *domain,
> > + unsigned long iova, size_t size)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > +
> > + apple_dart_domain_flush_tlb(dart_domain);
> > +}
> > +
> > +static void apple_dart_tlb_flush_all(void *cookie)
> > +{
> > + struct apple_dart_domain *domain = cookie;
> > +
> > + apple_dart_domain_flush_tlb(domain);
> > +}
> > +
> > +static void apple_dart_tlb_flush_walk(unsigned long iova, size_t size,
> > + size_t granule, void *cookie)
> > +{
> > + struct apple_dart_domain *domain = cookie;
> > +
> > + apple_dart_domain_flush_tlb(domain);
> > +}
> > +
> > +static const struct iommu_flush_ops apple_dart_tlb_ops = {
> > + .tlb_flush_all = apple_dart_tlb_flush_all,
> > + .tlb_flush_walk = apple_dart_tlb_flush_walk,
> > + .tlb_add_page = NULL,
> > +};
> > +
> > +static phys_addr_t apple_dart_iova_to_phys(struct iommu_domain *domain,
> > + dma_addr_t iova)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > + struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
> > +
> > + if (domain->type == IOMMU_DOMAIN_IDENTITY &&
> > + dart_domain->dart->supports_bypass)
>
> That second check seems redundant - if you don't support bypass surely
> you shouldn't have allowed attaching an identity domain in the first
> place? And even if you fake one with a pagetable you shouldn't need to
> walk it, for obvious reasons ;)
True, and with the patch you sent I don't need this here either way.
>
> TBH, dealing with identity domains in iova_to_phys at all irks me - it's
> largely due to dubious hacks in networking drivers which hopefully you
> should never have to deal with on M1 anyway, and either way it's not
> like they can't check the domain type themselves and save a pointless
> indirect call altogether :(
>
> > + return iova;
> > + if (!ops)
> > + return -ENODEV;
> > +
> > + return ops->iova_to_phys(ops, iova);
> > +}
> > +
> > +static int apple_dart_map(struct iommu_domain *domain, unsigned long iova,
> > + phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > + struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
> > +
> > + if (!ops)
> > + return -ENODEV;
> > + if (prot & IOMMU_MMIO)
> > + return -EINVAL;
> > + if (prot & IOMMU_NOEXEC)
> > + return -EINVAL;
>
> Hmm, I guess the usual expectation is just to ignore any prot flags you
> can't enforce - after all, some IOMMUs don't even have a notion of read
> or write permissions.
Sure, I'll just remove those checks.
>
> > + return ops->map(ops, iova, paddr, size, prot, gfp);
> > +}
> > +
> > +static size_t apple_dart_unmap(struct iommu_domain *domain, unsigned long iova,
> > + size_t size, struct iommu_iotlb_gather *gather)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > + struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
> > +
> > + if (!ops)
> > + return 0;
>
> That should never legitimately happen, since no previous mapping could
> have succeeded either.
Ack, removed.
>
> > + return ops->unmap(ops, iova, size, gather);
> > +}
> > +
> > +static int apple_dart_prepare_sw_bypass(struct apple_dart *dart,
> > + struct apple_dart_domain *dart_domain,
> > + struct device *dev)
> > +{
> > + lockdep_assert_held(&dart_domain->lock);
> > +
> > + if (dart->supports_bypass)
> > + return 0;
> > + if (dart_domain->type != IOMMU_DOMAIN_IDENTITY)
> > + return 0;
> > +
> > + // use the bus region from the first attached dev for the bypass range
> > + if (!dart->sw_bypass_len) {
> > + const struct bus_dma_region *dma_rgn = dev->dma_range_map;
> > +
> > + if (!dma_rgn)
> > + return -EINVAL;
> > +
> > + dart->sw_bypass_len = dma_rgn->size;
> > + dart->sw_bypass_cpu_start = dma_rgn->cpu_start;
> > + dart->sw_bypass_dma_start = dma_rgn->dma_start;
> > + }
> > +
> > + // ensure that we don't mix different bypass setups
> > + if (dart_domain->sw_bypass_len) {
> > + if (dart->sw_bypass_len != dart_domain->sw_bypass_len)
> > + return -EINVAL;
> > + if (dart->sw_bypass_cpu_start !=
> > + dart_domain->sw_bypass_cpu_start)
> > + return -EINVAL;
> > + if (dart->sw_bypass_dma_start !=
> > + dart_domain->sw_bypass_dma_start)
> > + return -EINVAL;
> > + } else {
> > + dart_domain->sw_bypass_len = dart->sw_bypass_len;
> > + dart_domain->sw_bypass_cpu_start = dart->sw_bypass_cpu_start;
> > + dart_domain->sw_bypass_dma_start = dart->sw_bypass_dma_start;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int apple_dart_domain_needs_pgtbl_ops(struct apple_dart *dart,
> > + struct iommu_domain *domain)
> > +{
> > + if (domain->type == IOMMU_DOMAIN_DMA)
> > + return 1;
> > + if (domain->type == IOMMU_DOMAIN_UNMANAGED)
> > + return 1;
> > + if (!dart->supports_bypass && domain->type == IOMMU_DOMAIN_IDENTITY)
> > + return 1;
> > + return 0;
> > +}
> > +
> > +static int apple_dart_finalize_domain(struct iommu_domain *domain)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > + struct apple_dart *dart = dart_domain->dart;
> > + struct io_pgtable_cfg pgtbl_cfg;
> > +
> > + lockdep_assert_held(&dart_domain->lock);
> > +
> > + if (dart_domain->pgtbl_ops)
> > + return 0;
> > + if (!apple_dart_domain_needs_pgtbl_ops(dart, domain))
> > + return 0;
> > +
> > + pgtbl_cfg = (struct io_pgtable_cfg){
> > + .pgsize_bitmap = dart->pgsize,
> > + .ias = 32,
> > + .oas = 36,
> > + .coherent_walk = 1,
> > + .tlb = &apple_dart_tlb_ops,
> > + .iommu_dev = dart->dev,
> > + };
> > +
> > + dart_domain->pgtbl_ops =
> > + alloc_io_pgtable_ops(ARM_APPLE_DART, &pgtbl_cfg, domain);
> > + if (!dart_domain->pgtbl_ops)
> > + return -ENOMEM;
> > +
> > + domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> > + domain->geometry.aperture_start = 0;
> > + domain->geometry.aperture_end = DMA_BIT_MASK(32);
> > + domain->geometry.force_aperture = true;
> > +
> > + /*
> > + * Some DARTs come without hardware bypass support but we may still
> > + * be forced to use bypass mode (to e.g. allow kernels with 4K pages to
> > + * boot). If we reach this point with an identity domain we have to setup
> > + * bypass mode in software. This is done by creating a static pagetable
> > + * for a linear map specified by dma-ranges in the device tree.
> > + */
> > + if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> > + u64 offset;
> > + int ret;
> > +
> > + for (offset = 0; offset < dart_domain->sw_bypass_len;
> > + offset += dart->pgsize) {
> > + ret = dart_domain->pgtbl_ops->map(
> > + dart_domain->pgtbl_ops,
> > + dart_domain->sw_bypass_dma_start + offset,
> > + dart_domain->sw_bypass_cpu_start + offset,
> > + dart->pgsize, IOMMU_READ | IOMMU_WRITE,
> > + GFP_ATOMIC);
> > + if (ret < 0) {
> > + free_io_pgtable_ops(dart_domain->pgtbl_ops);
> > + dart_domain->pgtbl_ops = NULL;
> > + return -EINVAL;
> > + }
> > + }
>
> Could you set up a single per-DART pagetable in prepare_sw_bypass (or
> even better at probe time if you think you're likely to need it) and
> just share that between all fake identity domains? That could be a
> follow-up optimisation, though.
I'll see if that's possible. So essentially I want to setup an identity
mapping with respect to bus_dma_region from the first attached device.
Right now this is always mapping the entire 4G VA space to RAM
starting at 0x8_0000_0000.
See also my reply to another comment further down since software
bypass mode might have to disappear anyway.
>
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static void
> > +apple_dart_stream_setup_translation(struct apple_dart_domain *domain,
> > + struct apple_dart *dart, u32 sid)
> > +{
> > + int i;
> > + struct io_pgtable_cfg *pgtbl_cfg =
> > + &io_pgtable_ops_to_pgtable(domain->pgtbl_ops)->cfg;
> > +
> > + for (i = 0; i < pgtbl_cfg->apple_dart_cfg.n_ttbrs; ++i)
> > + apple_dart_hw_set_ttbr(dart, sid, i,
> > + pgtbl_cfg->apple_dart_cfg.ttbr[i]);
> > + for (; i < DART_MAX_TTBR; ++i)
> > + apple_dart_hw_clear_ttbr(dart, sid, i);
> > +
> > + apple_dart_hw_enable_translation(dart, sid);
> > + apple_dart_hw_invalidate_tlb_stream(dart, sid);
> > +}
> > +
> > +static int apple_dart_attach_stream(struct apple_dart_domain *domain,
> > + struct apple_dart *dart, u32 sid)
> > +{
> > + unsigned long flags;
> > + struct apple_dart_stream *stream;
> > + int ret;
> > +
> > + lockdep_assert_held(&domain->lock);
> > +
> > + if (WARN_ON(dart->force_bypass &&
> > + domain->type != IOMMU_DOMAIN_IDENTITY))
> > + return -EINVAL;
>
> Ideally you shouldn't allow that to happen, but I guess if you have
> mixed capabilities across different instances then in principle an
> unmanaged domain could still slip through. But then again a user of an
> unmanaged domain might be OK with using larger pages anyway. Either way
> I'm not sure it's worthy of a WARN (similarly below) since it doesn't
> represent a "this should never happen" condition if the user has got
> their hands on a VFIO driver and is mucking about, it's just a normal
> failure because you can't support the attachment.
Makes sense, will remove that WARN (and the others below as well).
>
> > + /*
> > + * we can't mix and match DARTs that support bypass mode with those who don't
> > + * because the iova space in fake bypass mode generally has an offset
> > + */
>
> Erm, something doesn't sound right there... IOMMU_DOMAIN_IDENTITY should
> be exactly what it says, regardless of how it's implemented. If you
> can't provide a true identity mapping then you're probably better off
> not pretending to support them in the first place.
Some background: the PCIe DART only supports a 32bit VA space but RAM
on these machines starts at 0x8_0000_0000. I have something like
dma-ranges = <0x42000000 0 0 0x8 0 0 0xffff0000>;
in the pcie nodes to add that offset to dma addresses.
What I want to do here then is to setup an identity mapping with respect
to the DMA layer understanding of addresses encoded in bus_dma_region.
Now this will always just be a constant offset of 0x8_0000_0000 for
all M1s but I didn't want to hardcode that.
The code here is just there to guard against a situation where someone
somehow manages to attach two devices with different offsets to the same
domain.
If that's not how the abstraction is supposed to work and/or too big of a hack
I'll just remove the software bypass mode altogether.
PCIe won't work on 4k kernels then but the only people using this so far
build their own kernels with patches either way and won't complain.
And by the time Linux is actually useful for "normal" setups
the dma-iommu layer can hopefully just handle a larger page granularity.
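For reference, the constant-offset translation described in this reply can be modeled as below. The base address comes from the dma-ranges example; the `SW_BYPASS_*` names mirror the driver fields but this is only an illustrative sketch, not driver code:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of the software-bypass offset described above:
 * RAM starts at CPU address 0x8_0000_0000 while the PCIe DART only
 * has a 32-bit IOVA space, so DMA address 0 is mapped to that base.
 * The constants mirror the dma-ranges example; this is not driver code. */
#define SW_BYPASS_CPU_START 0x800000000ULL
#define SW_BYPASS_DMA_START 0x0ULL

static uint64_t cpu_to_dma(uint64_t cpu)
{
	return cpu - SW_BYPASS_CPU_START + SW_BYPASS_DMA_START;
}

static uint64_t dma_to_cpu(uint64_t dma)
{
	return dma - SW_BYPASS_DMA_START + SW_BYPASS_CPU_START;
}
```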
>
> > + if (WARN_ON(domain->type == IOMMU_DOMAIN_IDENTITY &&
> > + (domain->dart->supports_bypass != dart->supports_bypass)))
> > + return -EINVAL;
> > +
> > + list_for_each_entry(stream, &domain->streams, stream_head) {
> > + if (stream->dart == dart && stream->sid == sid) {
> > + stream->num_devices++;
> > + return 0;
> > + }
> > + }
> > +
> > + spin_lock_irqsave(&dart->lock, flags);
> > +
> > + if (WARN_ON(dart->used_sids & BIT(sid))) {
> > + ret = -EINVAL;
> > + goto error;
> > + }
> > +
> > + stream = kzalloc(sizeof(*stream), GFP_ATOMIC);
> > + if (!stream) {
> > + ret = -ENOMEM;
> > + goto error;
> > + }
>
> Couldn't you do this outside the lock? (If, calling back to other
> comments, it can't get refactored out of existence anyway)
Probably, but I'll first see if I can just refactor it away.
>
> > + stream->dart = dart;
> > + stream->sid = sid;
> > + stream->num_devices = 1;
> > + list_add(&stream->stream_head, &domain->streams);
> > +
> > + dart->used_sids |= BIT(sid);
> > + spin_unlock_irqrestore(&dart->lock, flags);
> > +
> > + apple_dart_hw_clear_all_ttbrs(stream->dart, stream->sid);
> > +
> > + switch (domain->type) {
> > + case IOMMU_DOMAIN_IDENTITY:
> > + if (stream->dart->supports_bypass)
> > + apple_dart_hw_enable_bypass(stream->dart, stream->sid);
> > + else
> > + apple_dart_stream_setup_translation(
> > + domain, stream->dart, stream->sid);
> > + break;
> > + case IOMMU_DOMAIN_BLOCKED:
> > + apple_dart_hw_disable_dma(stream->dart, stream->sid);
> > + break;
> > + case IOMMU_DOMAIN_UNMANAGED:
> > + case IOMMU_DOMAIN_DMA:
> > + apple_dart_stream_setup_translation(domain, stream->dart,
> > + stream->sid);
> > + break;
> > + }
> > +
> > + return 0;
> > +
> > +error:
> > + spin_unlock_irqrestore(&dart->lock, flags);
> > + return ret;
> > +}
> > +
> > +static void apple_dart_disable_stream(struct apple_dart *dart, u32 sid)
> > +{
> > + unsigned long flags;
> > +
> > + apple_dart_hw_disable_dma(dart, sid);
> > + apple_dart_hw_clear_all_ttbrs(dart, sid);
> > + apple_dart_hw_invalidate_tlb_stream(dart, sid);
> > +
> > + spin_lock_irqsave(&dart->lock, flags);
> > + dart->used_sids &= ~BIT(sid);
> > + spin_unlock_irqrestore(&dart->lock, flags);
> > +}
> > +
> > +static void apple_dart_detach_stream(struct apple_dart_domain *domain,
> > + struct apple_dart *dart, u32 sid)
> > +{
> > + struct apple_dart_stream *stream;
> > +
> > + lockdep_assert_held(&domain->lock);
> > +
> > + list_for_each_entry(stream, &domain->streams, stream_head) {
> > + if (stream->dart == dart && stream->sid == sid) {
> > + stream->num_devices--;
> > +
> > + if (stream->num_devices == 0) {
> > + apple_dart_disable_stream(dart, sid);
> > + list_del(&stream->stream_head);
> > + kfree(stream);
> > + }
> > + return;
> > + }
> > + }
> > +}
> > +
> > +static int apple_dart_attach_dev(struct iommu_domain *domain,
> > + struct device *dev)
> > +{
> > + int ret;
> > + int i, j;
> > + unsigned long flags;
> > + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > + struct apple_dart *dart = cfg->streams[0].dart;
> > +
> > + if (WARN_ON(dart->force_bypass &&
> > + dart_domain->type != IOMMU_DOMAIN_IDENTITY)) {
> > + dev_warn(
> > + dev,
> > + "IOMMU must be in bypass mode but trying to attach to translated domain.\n");
> > + return -EINVAL;
> > + }
>
> Again, a bit excessive with the warnings. In fact, transpose my comment
> from apple_dart_attach_stream() to here, because this means the
> equivalent warning there is literally unreachable :/
Okay, I'll go through the code paths again, get rid of these warnings and
make sure I don't check the same thing more than once.
>
[...]
> > +
> > +static void apple_dart_release_device(struct device *dev)
> > +{
> > + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> > + int i;
> > +
> > + if (!cfg)
> > + return;
>
> Shouldn't happen - if it's disappeared since probe_device succeeded
> you've got bigger problems anyway.
Ok, will remove it.
>
> > +
> > + for (i = 0; i < cfg->num_streams; ++i)
> > + device_link_del(cfg->streams[i].link);
> > +
> > + dev_iommu_priv_set(dev, NULL);
> > + kfree(cfg);
> > +}
> > +
> > +static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
> > +{
> > + struct apple_dart_domain *dart_domain;
> > +
> > + if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED &&
> > + type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED)
> > + return NULL;
>
> I want to say there's not much point in that, but then I realise I've
> spent the last couple of days writing patches to add a new domain type :)
Hah! Just because I'm curious: What is that new domain type going to be? :)
>
> > + dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL);
> > + if (!dart_domain)
> > + return NULL;
> > +
> > + INIT_LIST_HEAD(&dart_domain->streams);
> > + spin_lock_init(&dart_domain->lock);
> > + iommu_get_dma_cookie(&dart_domain->domain);
> > + dart_domain->type = type;
>
> Yeah, this is "useful" for a handful of CPU cycles until we return and
> iommu_domain_alloc() sets dart_domain->domain->type to the same thing,
> all before *its* caller even knows the domain exists.
True, will remove it.
>
> > + return &dart_domain->domain;
> > +}
> > +
> > +static void apple_dart_domain_free(struct iommu_domain *domain)
> > +{
> > + struct apple_dart_domain *dart_domain = to_dart_domain(domain);
> > +
> > + WARN_ON(!list_empty(&dart_domain->streams));
>
> Why? This code is perfectly legal API usage:
>
> d = iommu_domain_alloc(bus)
> if (d)
> iommu_domain_free(d);
>
> Sure it looks pointless, but it's the kind of thing that can
> legitimately happen (with a lot more going on in between) if an
> unmanaged domain user tears itself down before it gets round to
> attaching, due to probe deferral or some other error condition.
Ah, makes sense. I'll remove the warning!
>
> > + kfree(dart_domain);
> > +}
> > +
> > +static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
> > +{
> > + struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> > + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> > + unsigned int num_streams = cfg ? cfg->num_streams : 0;
> > + struct apple_dart_master_cfg *cfg_new;
> > + struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
> > +
> > + if (args->args_count != 1)
> > + return -EINVAL;
> > +
> > + cfg_new = krealloc(cfg, struct_size(cfg, streams, num_streams + 1),
> > + GFP_KERNEL);
> > + if (!cfg_new)
> > + return -ENOMEM;
> > +
> > + cfg = cfg_new;
> > + dev_iommu_priv_set(dev, cfg);
> > +
> > + cfg->num_streams = num_streams;
> > + cfg->streams[cfg->num_streams].dart = dart;
> > + cfg->streams[cfg->num_streams].sid = args->args[0];
> > + cfg->num_streams++;
>
> Yeah, this is way too reminiscent of the fwspec code for comfort. Even
> if you can't use autoremove links for some reason, an array of 16
> device_link pointers hung off apple_dart still wins over these little
> pointer-heavy structures if you need more than a few of them.
I can get rid of the links, but I'll still need some way to store
both the apple_dart and the sid here. Like mentioned above, I'll
be happy to reuse the fwspec code but I don't see how yet.
>
> > + return 0;
> > +}
> > +
> > +static struct iommu_group *apple_dart_device_group(struct device *dev)
> > +{
> > +#ifdef CONFIG_PCI
> > + struct iommu_group *group;
> > +
> > + if (dev_is_pci(dev))
> > + group = pci_device_group(dev);
> > + else
> > + group = generic_device_group(dev);
>
> ...and this is where it gets bad :(
>
> If you can have multiple devices behind the same stream such that the
> IOMMU can't tell them apart, you *have* to ensure they get put in the
> same group, so that the IOMMU core knows the topology (and reflects it
> correctly to userspace) and doesn't try to do things that then
> unexpectedly fail. This is the point where you need to check if a stream
> is already known, and return the existing group if so, and then you
> won't need to check and refcount all the time in attach/detach because
> the IOMMU core will do the right thing for you.
>
> Many drivers only run on systems where devices don't alias at the IOMMU
> level (aliasing at the PCI level is already taken care of), or use a
> single group because effectively everything aliases, so it's not the
> most common scenario, but as I mentioned before arm-smmu is one that
> does - take a look at the flow though that in the "!smmu->smrs" cases
> for the closest example.
Okay, this is very good to know. Thanks again for the pointer to the
arm-smmu code, it really helped me understand how iommu_groups are
supposed to work. I'll do this the proper way for v5, which should also
simplify this driver :-)
>
> > +
> > + return group;
> > +#else
> > + return generic_device_group(dev);
> > +#endif
> > +}
> > +
> > +static int apple_dart_def_domain_type(struct device *dev)
> > +{
> > + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> > + struct apple_dart *dart = cfg->streams[0].dart;
> > +
> > + if (dart->force_bypass)
> > + return IOMMU_DOMAIN_IDENTITY;
> > + if (!dart->supports_bypass)
> > + return IOMMU_DOMAIN_DMA;
> > +
> > + return 0;
> > +}
> > +
> > +static const struct iommu_ops apple_dart_iommu_ops = {
> > + .domain_alloc = apple_dart_domain_alloc,
> > + .domain_free = apple_dart_domain_free,
> > + .attach_dev = apple_dart_attach_dev,
> > + .detach_dev = apple_dart_detach_dev,
> > + .map = apple_dart_map,
> > + .unmap = apple_dart_unmap,
> > + .flush_iotlb_all = apple_dart_flush_iotlb_all,
> > + .iotlb_sync = apple_dart_iotlb_sync,
> > + .iotlb_sync_map = apple_dart_iotlb_sync_map,
> > + .iova_to_phys = apple_dart_iova_to_phys,
> > + .probe_device = apple_dart_probe_device,
> > + .release_device = apple_dart_release_device,
> > + .device_group = apple_dart_device_group,
> > + .of_xlate = apple_dart_of_xlate,
> > + .def_domain_type = apple_dart_def_domain_type,
> > + .pgsize_bitmap = -1UL, /* Restricted during dart probe */
> > +};
> > +
> > +static irqreturn_t apple_dart_irq(int irq, void *dev)
> > +{
> > + struct apple_dart *dart = dev;
> > + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> > + DEFAULT_RATELIMIT_BURST);
> > + const char *fault_name = NULL;
> > + u32 error = readl(dart->regs + DART_ERROR);
> > + u32 error_code = FIELD_GET(DART_ERROR_CODE, error);
> > + u32 addr_lo = readl(dart->regs + DART_ERROR_ADDR_LO);
> > + u32 addr_hi = readl(dart->regs + DART_ERROR_ADDR_HI);
> > + u64 addr = addr_lo | (((u64)addr_hi) << 32);
> > + u8 stream_idx = FIELD_GET(DART_ERROR_STREAM, error);
> > +
> > + if (!(error & DART_ERROR_FLAG))
> > + return IRQ_NONE;
> > +
> > + if (error_code & DART_ERROR_READ_FAULT)
> > + fault_name = "READ FAULT";
> > + else if (error_code & DART_ERROR_WRITE_FAULT)
> > + fault_name = "WRITE FAULT";
> > + else if (error_code & DART_ERROR_NO_PTE)
> > + fault_name = "NO PTE FOR IOVA";
> > + else if (error_code & DART_ERROR_NO_PMD)
> > + fault_name = "NO PMD FOR IOVA";
> > + else if (error_code & DART_ERROR_NO_TTBR)
> > + fault_name = "NO TTBR FOR IOVA";
>
> Can multiple bits be set at once or is there a strict precedence?
I'll double check and either add a comment that there's a precedence or
print names for all bits that are set.
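If there does turn out to be a precedence, an if/else chain like the quoted one already enforces it: only the first matching name is reported even when the hardware sets several bits. A minimal standalone model (the bit positions here are invented for illustration; the real DART_ERROR_* values live elsewhere in the driver):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the fault-name decode above: an if/else chain gives a strict
 * precedence, so only the highest-priority name is printed even if
 * several bits are set at once. Bit positions are assumptions, not the
 * real DART_ERROR_* definitions. */
#define ERR_READ_FAULT  (1u << 4)
#define ERR_WRITE_FAULT (1u << 3)
#define ERR_NO_PTE      (1u << 2)

static const char *fault_name(uint32_t code)
{
	if (code & ERR_READ_FAULT)
		return "READ FAULT";
	if (code & ERR_WRITE_FAULT)
		return "WRITE FAULT";
	if (code & ERR_NO_PTE)
		return "NO PTE FOR IOVA";
	return "unknown";
}
```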
>
> > + if (WARN_ON(fault_name == NULL))
>
> You're already logging a clear and attributable message below; I can
> guarantee that a big noisy backtrace showing that you got here from
> el0_irq() or el1_irq() is not useful over and above that.
>
> > + fault_name = "unknown";
> > +
> > + if (__ratelimit(&rs)) {
>
> Just use dev_err_ratelimited() to hide the guts if you're not doing
> anything tricky.
Ack.
>
> > + dev_err(dart->dev,
> > + "translation fault: status:0x%x stream:%d code:0x%x (%s) at 0x%llx",
> > + error, stream_idx, error_code, fault_name, addr);
> > + }
> > +
> > + writel(error, dart->regs + DART_ERROR);
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static int apple_dart_probe(struct platform_device *pdev)
> > +{
> > + int ret;
> > + u32 dart_params[2];
> > + struct resource *res;
> > + struct apple_dart *dart;
> > + struct device *dev = &pdev->dev;
> > +
> > + dart = devm_kzalloc(dev, sizeof(*dart), GFP_KERNEL);
> > + if (!dart)
> > + return -ENOMEM;
> > +
> > + dart->dev = dev;
> > + spin_lock_init(&dart->lock);
> > +
> > + if (pdev->num_resources < 1)
> > + return -ENODEV;
>
> But you have 2 resources (one MEM and one IRQ)? And anyway their
> respective absences would hardly go unnoticed below.
Probably a leftover from when I just had the MEM resource.
I'll just remove the check here.
>
> > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + if (resource_size(res) < 0x4000) {
> > + dev_err(dev, "MMIO region too small (%pr)\n", res);
> > + return -EINVAL;
> > + }
> > +
> > + dart->regs = devm_ioremap_resource(dev, res);
> > + if (IS_ERR(dart->regs))
> > + return PTR_ERR(dart->regs);
> > +
> > + ret = devm_clk_bulk_get_all(dev, &dart->clks);
> > + if (ret < 0)
> > + return ret;
> > + dart->num_clks = ret;
> > +
> > + ret = clk_bulk_prepare_enable(dart->num_clks, dart->clks);
> > + if (ret)
> > + return ret;
> > +
> > + ret = apple_dart_hw_reset(dart);
> > + if (ret)
> > + goto err_clk_disable;
> > +
> > + dart_params[0] = readl(dart->regs + DART_PARAMS1);
> > + dart_params[1] = readl(dart->regs + DART_PARAMS2);
> > + dart->pgsize = 1 << FIELD_GET(DART_PARAMS_PAGE_SHIFT, dart_params[0]);
> > + dart->supports_bypass = dart_params[1] & DART_PARAMS_BYPASS_SUPPORT;
> > + dart->force_bypass = dart->pgsize > PAGE_SIZE;
> > +
> > + dart->irq = platform_get_irq(pdev, 0);
> > + if (dart->irq < 0) {
> > + ret = -ENODEV;
> > + goto err_clk_disable;
> > + }
>
> FWIW I'd get the IRQ earlier when there's still nothing to clean up on
> failure - it's only the request which needs to wait until you've
> actually set up enough to be able to handle it if it does fire.
Good point, will move it further above.
>
> > + ret = devm_request_irq(dart->dev, dart->irq, apple_dart_irq,
> > + IRQF_SHARED, "apple-dart fault handler", dart);
>
> Be very careful with this pattern of mixing devres-managed IRQs with
> explicitly-managed clocks, especially when IRQF_SHARED is in play. In
> the failure path here, and in remove, you have a period where the clocks
> have been disabled but the IRQ is still live - try CONFIG_DEBUG_SHIRQ
> and don't be surprised if you deadlock trying to read an unclocked register.
Good catch, I didn't even think about that situation.
"Luckily" the clocks are usually shared with the master device(s) attached to
the iommu, so they are already on long before apple_dart_probe is called.
>
> If you can't also offload the clock management to devres to guarantee
> ordering relative to the IRQ (I think I saw some patches recently), it's
> probably safest to manually manage the latter.
Okay, will take a look and see if I can offload it and otherwise manage it
manually then.
>
> > + if (ret)
> > + goto err_clk_disable;
> > +
> > + platform_set_drvdata(pdev, dart);
> > +
> > + ret = iommu_device_sysfs_add(&dart->iommu, dev, NULL, "apple-dart.%s",
> > + dev_name(&pdev->dev));
> > + if (ret)
> > + goto err_clk_disable;
> > +
> > + ret = iommu_device_register(&dart->iommu, &apple_dart_iommu_ops, dev);
> > + if (ret)
> > + goto err_clk_disable;
> > +
> > + if (dev->bus->iommu_ops != &apple_dart_iommu_ops) {
> > + ret = bus_set_iommu(dev->bus, &apple_dart_iommu_ops);
> > + if (ret)
> > + goto err_clk_disable;
> > + }
> > +#ifdef CONFIG_PCI
> > + if (dev->bus->iommu_ops != pci_bus_type.iommu_ops) {
>
> But it's still a platform device, not a PCI device?
Er, yes, I will fix this code here by doing something similar to what
arm-smmu does:
	if (!iommu_present(&platform_bus_type)) {
		ret = bus_set_iommu(&platform_bus_type, &apple_dart_iommu_ops);
		if (ret)
			goto err_clk_disable;
	}

#ifdef CONFIG_PCI
	if (!iommu_present(&pci_bus_type)) {
		ret = bus_set_iommu(&pci_bus_type, &apple_dart_iommu_ops);
		if (ret)
			goto err_reset_platform_ops;
	}
#endif
>
> > + ret = bus_set_iommu(&pci_bus_type, &apple_dart_iommu_ops);
> > + if (ret)
> > + goto err_clk_disable;
>
> And the platform bus ops?
ugh, good catch. will clean them up as well.
>
> > + }
> > +#endif
> > +
> > + dev_info(
> > + &pdev->dev,
> > + "DART [pagesize %x, bypass support: %d, bypass forced: %d] initialized\n",
> > + dart->pgsize, dart->supports_bypass, dart->force_bypass);
> > + return 0;
> > +
> > +err_clk_disable:
> > + clk_bulk_disable(dart->num_clks, dart->clks);
> > + clk_bulk_unprepare(dart->num_clks, dart->clks);
>
> No need to open-code clk_bulk_disable_unprepare() ;)
True :-)
>
> > + return ret;
> > +}
> > +
> > +static int apple_dart_remove(struct platform_device *pdev)
> > +{
> > + struct apple_dart *dart = platform_get_drvdata(pdev);
> > +
> > + devm_free_irq(dart->dev, dart->irq, dart);
> > +
> > + iommu_device_unregister(&dart->iommu);
> > + iommu_device_sysfs_remove(&dart->iommu);
> > +
> > + clk_bulk_disable(dart->num_clks, dart->clks);
> > + clk_bulk_unprepare(dart->num_clks, dart->clks);
>
> Ditto.
>
> And again the bus ops are still installed - that'll get really fun if
> this is a module unload...
Ugh, yeah. I'll fix that as well. I'll have to see how to make this work
correctly with multiple DART instances. I guess I should only remove the
bus ops once the last one is removed. Now that I think about it, this
could also get tricky in the cleanup paths of apple_dart_probe.
Maybe just add a module_init that sets up the bus ops when it finds at
least one DART node and module_exit to tear them down again?
>
> > + return 0;
> > +}
> > +
> > +static void apple_dart_shutdown(struct platform_device *pdev)
> > +{
> > + apple_dart_remove(pdev);
>
> The main reason for doing something on shutdown is in the case of kexec,
> to put the hardware back into a disabled or otherwise sane state so as
> not to trip up whatever the subsequent payload is. If you're not doing
> that (which may be legitimate if the expectation is that software must
> always fully reset and initialise a DART before I/O can work) then
> there's not much point in doing anything, really. Stuff like tidying up
> sysfs is a complete waste of time when the world's about to end ;)
Makes sense, I'll see if I can put the DARTs back into a sane state
for whatever the next payload is here then.
>
> Robin.
>
> > +}
> > +
> > +static const struct of_device_id apple_dart_of_match[] = {
> > + { .compatible = "apple,t8103-dart", .data = NULL },
> > + {},
> > +};
> > +MODULE_DEVICE_TABLE(of, apple_dart_of_match);
> > +
> > +static struct platform_driver apple_dart_driver = {
> > + .driver = {
> > + .name = "apple-dart",
> > + .of_match_table = apple_dart_of_match,
> > + },
> > + .probe = apple_dart_probe,
> > + .remove = apple_dart_remove,
> > + .shutdown = apple_dart_shutdown,
> > +};
> > +module_platform_driver(apple_dart_driver);
> > +
> > +MODULE_DESCRIPTION("IOMMU API for Apple's DART");
> > +MODULE_AUTHOR("Sven Peter <[email protected]>");
> > +MODULE_LICENSE("GPL v2");
> >
>
Sven
On Wed, Jul 14, 2021 at 07:19:50PM +0100, Robin Murphy wrote:
> Even at the DMA API level you could hide *some* of it (at the cost of
> effectively only having 1/4 of the usable address space), but there are
> still cases like where v4l2 has a hard requirement that a page-aligned
> scatterlist can be mapped into a contiguous region of DMA addresses.
Where does v4l2 make that broken assumption? Plenty of dma mapping
implementations including dma-direct do not support that.
Drivers need to call dma_get_merge_boundary() to check for that kind of
behavior.
On 2021-07-16 07:24, Christoph Hellwig wrote:
> On Wed, Jul 14, 2021 at 07:19:50PM +0100, Robin Murphy wrote:
>> Even at the DMA API level you could hide *some* of it (at the cost of
>> effectively only having 1/4 of the usable address space), but there are
>> still cases like where v4l2 has a hard requirement that a page-aligned
>> scatterlist can be mapped into a contiguous region of DMA addresses.
>
> Where does v4l2 make that broken assumption? Plenty of dma mapping
> implementations including dma-direct do not support that.
See vb2_dc_get_contiguous_size() and its callers. I still remember
spending an entire work day on writing one email at the culmination of
this discussion:
https://lore.kernel.org/linux-iommu/[email protected]/
809eac54cdd6 was framed as an efficiency improvement because it
technically was one (and something I had wanted to implement anyway),
but it was also very much to save myself from any further email debates
or customer calls about "regressing" code ported from 32-bit platforms...
Robin.
On 2021-07-15 17:41, Sven Peter via iommu wrote:
[...]
>>> + u64 sw_bypass_cpu_start;
>>> + u64 sw_bypass_dma_start;
>>> + u64 sw_bypass_len;
>>> +
>>> + struct list_head streams;
>>
>> I'm starting to think this could just be a bitmap, in a u16 even.
>
> The problem is that these streams may come from two different
> DART instances. That is required for e.g. the dwc3 controller which
> has a weird quirk where DMA transactions go through two separate
> DARTs with no clear pattern (e.g. some xhci control structures use the
> first dart while other structures use the second one).
Ah right, I do remember discussing that situation, but I think I
misinterpreted dart_domain->dart representing "the DART instance" here
to mean we weren't trying to accommodate that just yet.
> Both of them need to point to the same pagetable.
> In the device tree the node will have an entry like this:
>
> dwc3_0: usb@382280000{
> ...
> iommus = <&dwc3_0_dart_0 0>, <&dwc3_0_dart_1 1>;
> };
>
> There's no need for a linked list though once I do this properly with
> groups. I can just use an array allocated when the first device is
> attached, which just contains apple_dart* and streamid values.
>
>
>>
>>> +
>>> + spinlock_t lock;
>>> +
>>> + struct iommu_domain domain;
>>> +};
>>> +
>>> +/*
>>> + * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
>>> + * and contains a list of streams bound to this device as defined in the
>>> + * device tree. Multiple DART instances can be attached to a single device
>>> + * and each stream is identified by its stream id.
>>> + * It's usually reference by a pointer called *cfg.
>>> + *
>>> + * A dynamic array instead of a linked list is used here since in almost
>>> + * all cases a device will just be attached to a single stream and streams
>>> + * are never removed after they have been added.
>>> + *
>>> + * @num_streams: number of streams attached
>>> + * @streams: array of structs to identify attached streams and the device link
>>> + * to the iommu
>>> + */
>>> +struct apple_dart_master_cfg {
>>> + int num_streams;
>>> + struct {
>>> + struct apple_dart *dart;
>>> + u32 sid;
>>
>> Can't you use the fwspec for this?
>
>
> I'd be happy to use the fwspec code if that's somehow possible.
> I'm not sure how though since I need to store both the reference to the DART
> _and_ to the stream id. As far as I can tell the fwspec code would only allow
> to store the stream ids.
> (see also the previous comment regarding the dwc3 node which requires stream
> ids from two separate DART instances)
Hmm, yes, as above I was overlooking that, although there are still
various ideas that come to mind; the question becomes whether they're
actually worthwhile or just too-clever-for-their-own-good hacks. The
exact format of fwspec->ids is not fixed (other than the ACPI IORT code
having a common understanding with the Arm SMMU drivers) so in principle
you could munge some sort of DART instance index or indeed anything, but
if it remains cleaner to manage your own data internally then by all
means keep doing that.
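One way the "munge an instance index into fwspec->ids" idea could look, purely as a sketch; the 16/16 bit split is an arbitrary assumption for illustration, not anything the fwspec code mandates:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of packing a small DART instance index together with the
 * stream ID into a single fwspec ID, as floated above. The 16/16 bit
 * split and the helper names are assumptions for illustration only. */
static uint32_t dart_pack_id(uint32_t dart_idx, uint32_t sid)
{
	return (dart_idx << 16) | (sid & 0xffff);
}

static uint32_t dart_unpack_sid(uint32_t id)
{
	return id & 0xffff;
}

static uint32_t dart_unpack_idx(uint32_t id)
{
	return id >> 16;
}
```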
>>> + struct device_link *link;
>>
>> Is it necessary to use stateless links, or could you use
>> DL_FLAG_AUTOREMOVE_SUPPLIER and not have to keep track of them manually?
>
> I'll just use DL_FLAG_AUTOREMOVE_SUPPLIER. No idea why I went for stateless links.
>
>>
> [...]
>>> +	/* restore stream identity map */
>>> +	writel(0x03020100, dart->regs + DART_STREAM_REMAP);
>>> +	writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
>>> +	writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
>>> +	writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
>>
>> Any hint of what the magic numbers mean?
>
> Yes, it's just 0,1,2,3...,0xe,0xf but I can't do 8bit writes to the bus
> and 32 bit writes then require these slightly awkward "swapped" numbers.
> I'll add a comment since it's not obvious at first glance.
Sure, I guessed that much from "identity map" - it was more a question
of why that means 0x03020100... rather than, say, 0x0c0d0e0f... or
0x76543210..., and perhaps the reason for "restoring" it in the first place.
[...]
>>> +	/*
>>> +	 * we can't mix and match DARTs that support bypass mode with those who don't
>>> +	 * because the iova space in fake bypass mode generally has an offset
>>> +	 */
>>
>> Erm, something doesn't sound right there... IOMMU_DOMAIN_IDENTITY should
>> be exactly what it says, regardless of how it's implemented. If you
>> can't provide a true identity mapping then you're probably better off
>> not pretending to support them in the first place.
>
> Some background: the PCIe DART only supports a 32bit VA space but RAM
> on these machines starts at 0x8_0000_0000. I have something like
> dma-ranges = <0x42000000 0 0 0x8 0 0 0xffff0000>;
> in the pcie nodes to add that offset to dma addresses.
>
> What I want to do here then is to setup an identity mapping with respect
> to the DMA layer understanding of addresses encoded in bus_dma_region.
> Now this will always just be a constant offset of 0x8_0000_0000 for
> all M1s but I didn't want to hardcode that.
> The code here is just there to guard against a situation where someone
> somehow manages to attach two devices with different offsets to the same
> domain.
Urgh, *now* I think I get it - the addressing limitation WRT the
physical memory map layout had also slipped my mind. So you describe the
RC *as if* it had a physical bus offset, rely on iommu-dma ignoring it
when active (which is more by luck than design - we don't expect to ever
see a device with a real hard-wired offset upstream of an IOMMU,
although I did initially try to support it back in the very early days),
and otherwise statically program a translation such that anyone else who
*does* respect bus_dma_regions finds things work as expected.
That actually seems like an even stronger argument for having the
fake-bypass table belong to the DART rather than the domain, and at that
point you shouldn't even need the mismatch restriction, since as long as
you haven't described the fake offset for any devices who *can* achieve
real bypass, then "attach to an identity domain" simply comes down to
doing the appropriate thing for each individual stream, regardless of
whether it's the same nominal identity domain that another device is
using or a distinct one (it's highly unlikely that two groups would ever
get attached to one identity domain rather than simply having their own
anyway, but it is technically possible).
> If that's not how the abstraction is supposed to work and/or too big of a hack
> I'll just remove the software bypass mode altogether.
> PCIe won't work on 4k kernels then but the only people using this so far
> build their own kernels with patches either way and won't complain.
> And by the time Linux will actually be useful for "normal" setups
> the dma-iommu layer can hopefully just handle a larger page granularity.
It's certainly... "creative", and TBH I don't hate it (in a "play the
hand you've been given" kind of way), but the one significant downside
is that if the DART driver isn't loaded for any reason, PCI DMA will
look like it should be usable but then just silently (or not so
silently) fail.
FWIW if you do want to keep the option open, I'd be inclined to have the
DT just give an "honest" description of just the 32-bit limitation, then
have the DART driver's .probe_device sneakily modify the bus_dma_region
to match the relevant fake-bypass table as appropriate. It's possible
other folks might hate that even more though :D
>>> +	if (WARN_ON(domain->type == IOMMU_DOMAIN_IDENTITY &&
>>> +		    (domain->dart->supports_bypass != dart->supports_bypass)))
>>> +		return -EINVAL;
>>> +
>>> +	list_for_each_entry(stream, &domain->streams, stream_head) {
>>> +		if (stream->dart == dart && stream->sid == sid) {
>>> +			stream->num_devices++;
>>> +			return 0;
>>> +		}
>>> +	}
>>> +
>>> +	spin_lock_irqsave(&dart->lock, flags);
>>> +
>>> +	if (WARN_ON(dart->used_sids & BIT(sid))) {
>>> +		ret = -EINVAL;
>>> +		goto error;
>>> +	}
>>> +
>>> +	stream = kzalloc(sizeof(*stream), GFP_ATOMIC);
>>> +	if (!stream) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	}
>>
>> Couldn't you do this outside the lock? (If, calling back to other
>> comments, it can't get refactored out of existence anyway)
>
> Probably, but I'll first see if I can just refactor it away.
Actually I missed that we're already under dart_domain->lock at this
point anyway, so it's not going to make much difference, but it does
mean that the spin_lock_irqsave() above could just be spin_lock(),
unless it's possible to relax the domain locking a bit such that we
don't have to do the whole domain init with IRQs masked.
[...]
>>> +static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
>>> +{
>>> +	struct apple_dart_domain *dart_domain;
>>> +
>>> +	if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED &&
>>> +	    type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED)
>>> +		return NULL;
>>
>> I want to say there's not much point in that, but then I realise I've
>> spent the last couple of days writing patches to add a new domain type :)
>
> Hah! Just because I'm curious: What is that new domain type going to be? :)
Splitting IOMMU_DOMAIN_DMA into two to replace iommu_dma_strict being an
orthogonal thing.
[...]
>>> +static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
>>> +{
>>> +	struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
>>> +	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
>>> +	unsigned int num_streams = cfg ? cfg->num_streams : 0;
>>> +	struct apple_dart_master_cfg *cfg_new;
>>> +	struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
>>> +
>>> +	if (args->args_count != 1)
>>> +		return -EINVAL;
>>> +
>>> +	cfg_new = krealloc(cfg, struct_size(cfg, streams, num_streams + 1),
>>> +			   GFP_KERNEL);
>>> +	if (!cfg_new)
>>> +		return -ENOMEM;
>>> +
>>> +	cfg = cfg_new;
>>> +	dev_iommu_priv_set(dev, cfg);
>>> +
>>> +	cfg->num_streams = num_streams;
>>> +	cfg->streams[cfg->num_streams].dart = dart;
>>> +	cfg->streams[cfg->num_streams].sid = args->args[0];
>>> +	cfg->num_streams++;
>>
>> Yeah, this is way too reminiscent of the fwspec code for comfort. Even
>> if you can't use autoremove links for some reason, an array of 16
>> device_link pointers hung off apple_dart still wins over these little
>> pointer-heavy structures if you need more than a few of them.
>
> I can get rid of the links, but I'll still need some way to store
> both the apple_dart and the sid here. Like mentioned above, I'll
> be happy to reuse the fwspec code but I don't see how yet.
As before, if you can fit in some kind of DART instance identifier which
isn't impractical to unpack then it makes sense to use the fwspec since
it's already there. However if you still need to allocate something
per-device rather than just stashing an existing pointer in iommu_priv,
then you may as well keep everything together there. If the worst known
case could still fit in just two DART pointers and two 64-bit bitmaps,
I'd be inclined to just have that as a fixed structure and save all the
extra bother - you're not cross-architecture like the fwspec code, and
arm64's minimum kmalloc granularity has just gone back up to 128 bytes
(but even at 64 bytes you'd have had plenty of room).
[...]
>>> +static int apple_dart_remove(struct platform_device *pdev)
>>> +{
>>> +	struct apple_dart *dart = platform_get_drvdata(pdev);
>>> +
>>> +	devm_free_irq(dart->dev, dart->irq, dart);
>>> +
>>> +	iommu_device_unregister(&dart->iommu);
>>> +	iommu_device_sysfs_remove(&dart->iommu);
>>> +
>>> +	clk_bulk_disable(dart->num_clks, dart->clks);
>>> +	clk_bulk_unprepare(dart->num_clks, dart->clks);
>>
>> Ditto.
>>
>> And again the bus ops are still installed - that'll get really fun if
>> this is a module unload...
>
> Ugh, yeah. I'll fix that as well. I'll have to see how to make this work
> correctly with multiple DART instances. I guess I should only remove the
> bus ops once the last one is removed. Now that I think about it, this
> could also get tricky in the cleanup paths of apple_dart_probe.
>
> Maybe just add a module_init that sets up the bus ops when it finds at
> least one DART node and module_exit to tear them down again?
Actually by this point it was late and I wasn't thinking as clearly as I
could have been, apologies ;)
I believe a module unload is in fact the *only* time you should expect
to see .remove called - you want to set .suppress_bind_attrs in your
driver data because there's basically no way to prevent manual unbinding
from blowing up - so it should be fine to unconditionally clear the ops
at this point (being removed means you must have successfully probed, so
any ops must be yours).
Cheers,
Robin.
On Mon, Jul 19, 2021, at 20:15, Robin Murphy wrote:
> On 2021-07-15 17:41, Sven Peter via iommu wrote:
> [...]
> >>> +	u64 sw_bypass_cpu_start;
> >>> +	u64 sw_bypass_dma_start;
> >>> +	u64 sw_bypass_len;
> >>> +
> >>> +	struct list_head streams;
> >>
> >> I'm staring to think this could just be a bitmap, in a u16 even.
> >
> > The problem is that these streams may come from two different
> > DART instances. That is required for e.g. the dwc3 controller which
> > has a weird quirk where DMA transactions go through two separate
> > DARTs with no clear pattern (e.g. some xhci control structures use the
> > first dart while other structures use the second one).
>
> Ah right, I do remember discussing that situation, but I think I
> misinterpreted dart_domain->dart representing "the DART instance" here
> to mean we weren't trying to accommodate that just yet.
>
> > Both of them need to point to the same pagetable.
> > In the device tree the node will have an entry like this:
> >
> > dwc3_0: usb@382280000 {
> > 	...
> > 	iommus = <&dwc3_0_dart_0 0>, <&dwc3_0_dart_1 1>;
> > };
> >
> > There's no need for a linked list though once I do this properly with
> > groups. I can just use an array allocated when the first device is
> > attached, which just contains apple_dart* and streamid values.
> >
> >
> >>
> >>> +
> >>> +	spinlock_t lock;
> >>> +
> >>> +	struct iommu_domain domain;
> >>> +};
> >>> +
> >>> +/*
> >>> + * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
> >>> + * and contains a list of streams bound to this device as defined in the
> >>> + * device tree. Multiple DART instances can be attached to a single device
> >>> + * and each stream is identified by its stream id.
> >>> + * It's usually referenced by a pointer called *cfg.
> >>> + *
> >>> + * A dynamic array instead of a linked list is used here since in almost
> >>> + * all cases a device will just be attached to a single stream and streams
> >>> + * are never removed after they have been added.
> >>> + *
> >>> + * @num_streams: number of streams attached
> >>> + * @streams: array of structs to identify attached streams and the device link
> >>> + * to the iommu
> >>> + */
> >>> +struct apple_dart_master_cfg {
> >>> +	int num_streams;
> >>> +	struct {
> >>> +		struct apple_dart *dart;
> >>> +		u32 sid;
> >>
> >> Can't you use the fwspec for this?
> >
> >
> > I'd be happy to use the fwspec code if that's somehow possible.
> > I'm not sure how though since I need to store both the reference to the DART
> > _and_ to the stream id. As far as I can tell the fwspec code would only allow
> > to store the stream ids.
> > (see also the previous comment regarding the dwc3 node which requires stream
> > ids from two separate DART instances)
>
> Hmm, yes, as above I was overlooking that, although there are still
> various ideas that come to mind; the question becomes whether they're
> actually worthwhile or just too-clever-for-their-own-good hacks. The
> exact format of fwspec->ids is not fixed (other than the ACPI IORT code
> having a common understanding with the Arm SMMU drivers) so in principle
> you could munge some sort of DART instance index or indeed anything, but
> if it remains cleaner to manage your own data internally then by all
> means keep doing that.
Yeah, I can think of some hacks as well (like storing a global id->apple_dart* map
or stuffing the 64bit pointer into two ints) and I've tried a few of them in the past
days but didn't like any of them.
I do like the idea to just put two (struct apple_dart *dart, u16 sidmap)
in there though which will be plenty for all current configurations.
>
> >>> +		struct device_link *link;
> >>
> >> Is it necessary to use stateless links, or could you use
> >> DL_FLAG_AUTOREMOVE_SUPPLIER and not have to keep track of them manually?
> >
> > I'll just use DL_FLAG_AUTOREMOVE_SUPPLIER. No idea why I went for stateless links.
> >
> >>
> > [...]
> >>> +	/* restore stream identity map */
> >>> +	writel(0x03020100, dart->regs + DART_STREAM_REMAP);
> >>> +	writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
> >>> +	writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
> >>> +	writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
> >>
> >> Any hint of what the magic numbers mean?
> >
> > Yes, it's just 0,1,2,3...,0xe,0xf but I can't do 8bit writes to the bus
> > and 32 bit writes then require these slightly awkward "swapped" numbers.
> > I'll add a comment since it's not obvious at first glance.
>
> Sure, I guessed that much from "identity map" - it was more a question
> of why that means 0x03020100... rather than, say, 0x0c0d0e0f... or
> 0x76543210..., and perhaps the reason for "restoring" it in the first place.
So what this feature does is to allow the DART to take an incoming DMA stream
tagged with id i and pretend that it's actually been tagged with
readb(dart->regs + 0x80 + i) instead. That's as much as I can figure out by
poking the hardware. More details are probably only available to Apple.
Now the reason I thought I needed this was that I assumed we were handed these DARTs
in an unclean state because Apple makes use of this internally:
In their device tree they have a sid-remap property which I believe is a hack to make
their driver simpler. The dwc3 controller requires stream 0 of dartA and stream 1 of
dartB to be configured the same way. They configure dartB to remap stream 1 to stream 0
and then just mirror all MMIO writes from dartA to dartB and pretend that dwc3 only
needs a single DART.
As it actually turns out though, iBoot doesn't use the USB DARTs and we already get
them in the sane state. I can just drop this code. (And if we actually need it
for other DARTs I can also just restore those in our bootloader or add it in a
follow up).
>
> [...]
> >>> +	/*
> >>> +	 * we can't mix and match DARTs that support bypass mode with those who don't
> >>> +	 * because the iova space in fake bypass mode generally has an offset
> >>> +	 */
> >>
> >> Erm, something doesn't sound right there... IOMMU_DOMAIN_IDENTITY should
> >> be exactly what it says, regardless of how it's implemented. If you
> >> can't provide a true identity mapping then you're probably better off
> >> not pretending to support them in the first place.
> >
> > Some background: the PCIe DART only supports a 32bit VA space but RAM
> > on these machines starts at 0x8_0000_0000. I have something like
> > dma-ranges = <0x42000000 0 0 0x8 0 0 0xffff0000>;
> > in the pcie nodes to add that offset to dma addresses.
> >
> > What I want to do here then is to setup an identity mapping with respect
> > to the DMA layer understanding of addresses encoded in bus_dma_region.
> > Now this will always just be a constant offset of 0x8_0000_0000 for
> > all M1s but I didn't want to hardcode that.
> > The code here is just there to guard against a situation where someone
> > somehow manages to attach two devices with different offsets to the same
> > domain.
>
> Urgh, *now* I think I get it - the addressing limitation WRT the
> physical memory map layout had also slipped my mind. So you describe the
> RC *as if* it had a physical bus offset, rely on iommu-dma ignoring it
> when active (which is more by luck than design - we don't expect to ever
> see a device with a real hard-wired offset upstream of an IOMMU,
> although I did initially try to support it back in the very early days),
> and otherwise statically program a translation such that anyone else who
> *does* respect bus_dma_regions finds things work as expected.
Yes, exactly. It's not very nice but it works...
>
> That actually seems like an even stronger argument for having the
> fake-bypass table belong to the DART rather than the domain, and at that
> point you shouldn't even need the mismatch restriction, since as long as
> you haven't described the fake offset for any devices who *can* achieve
> real bypass, then "attach to an identity domain" simply comes down to
> doing the appropriate thing for each individual stream, regardless of
> whether it's the same nominal identity domain that another device is
> using or a distinct one (it's highly unlikely that two groups would ever
> get attached to one identity domain rather than simply having their own
> anyway, but it is technically possible).
>
Agreed. That sounds a lot nicer actually.
> > If that's not how the abstraction is supposed to work and/or too big of a hack
> > I'll just remove the software bypass mode altogether.
> > PCIe won't work on 4k kernels then but the only people using this so far
> > build their own kernels with patches either way and won't complain.
> > And by the time Linux will actually be useful for "normal" setups
> > the dma-iommu layer can hopefully just handle a larger page granularity.
>
> It's certainly... "creative", and TBH I don't hate it (in a "play the
> hand you've been given" kind of way), but the one significant downside
> is that if the DART driver isn't loaded for any reason, PCI DMA will
> look like it should be usable but then just silently (or not so
> silently) fail.
Good point!
>
> FWIW if you do want to keep the option open, I'd be inclined to have the
> DT just give an "honest" description of just the 32-bit limitation, then
> have the DART driver's .probe_device sneakily modify the bus_dma_region
> to match the relevant fake-bypass table as appropriate. It's possible
> other folks might hate that even more though :D
I've given that one a try and I kinda like it so far :D
I'll keep it for v5 and just drop it in case someone complains.
>
> >>> +	if (WARN_ON(domain->type == IOMMU_DOMAIN_IDENTITY &&
> >>> +		    (domain->dart->supports_bypass != dart->supports_bypass)))
> >>> +		return -EINVAL;
> >>> +
> >>> +	list_for_each_entry(stream, &domain->streams, stream_head) {
> >>> +		if (stream->dart == dart && stream->sid == sid) {
> >>> +			stream->num_devices++;
> >>> +			return 0;
> >>> +		}
> >>> +	}
> >>> +
> >>> +	spin_lock_irqsave(&dart->lock, flags);
> >>> +
> >>> +	if (WARN_ON(dart->used_sids & BIT(sid))) {
> >>> +		ret = -EINVAL;
> >>> +		goto error;
> >>> +	}
> >>> +
> >>> +	stream = kzalloc(sizeof(*stream), GFP_ATOMIC);
> >>> +	if (!stream) {
> >>> +		ret = -ENOMEM;
> >>> +		goto error;
> >>> +	}
> >>
> >> Couldn't you do this outside the lock? (If, calling back to other
> >> comments, it can't get refactored out of existence anyway)
> >
> > Probably, but I'll first see if I can just refactor it away.
>
> Actually I missed that we're already under dart_domain->lock at this
> point anyway, so it's not going to make much difference, but it does
> mean that the spin_lock_irqsave() above could just be spin_lock(),
> unless it's possible to relax the domain locking a bit such that we
> don't have to do the whole domain init with IRQs masked.
I can relax the locking quite a bit.
Right now, I only need a spinlock around the TLB flush MMIO writes
and a mutex to protect domain initialization.
>
> [...]
> >>> +static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
> >>> +{
> >>> +	struct apple_dart_domain *dart_domain;
> >>> +
> >>> +	if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED &&
> >>> +	    type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED)
> >>> +		return NULL;
> >>
> >> I want to say there's not much point in that, but then I realise I've
> >> spent the last couple of days writing patches to add a new domain type :)
> >
> > Hah! Just because I'm curious: What is that new domain type going to be? :)
>
> Splitting IOMMU_DOMAIN_DMA into two to replace iommu_dma_strict being an
> orthogonal thing.
>
> [...]
> >>> +static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
> >>> +{
> >>> +	struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> >>> +	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
> >>> +	unsigned int num_streams = cfg ? cfg->num_streams : 0;
> >>> +	struct apple_dart_master_cfg *cfg_new;
> >>> +	struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
> >>> +
> >>> +	if (args->args_count != 1)
> >>> +		return -EINVAL;
> >>> +
> >>> +	cfg_new = krealloc(cfg, struct_size(cfg, streams, num_streams + 1),
> >>> +			   GFP_KERNEL);
> >>> +	if (!cfg_new)
> >>> +		return -ENOMEM;
> >>> +
> >>> +	cfg = cfg_new;
> >>> +	dev_iommu_priv_set(dev, cfg);
> >>> +
> >>> +	cfg->num_streams = num_streams;
> >>> +	cfg->streams[cfg->num_streams].dart = dart;
> >>> +	cfg->streams[cfg->num_streams].sid = args->args[0];
> >>> +	cfg->num_streams++;
> >>
> >> Yeah, this is way too reminiscent of the fwspec code for comfort. Even
> >> if you can't use autoremove links for some reason, an array of 16
> >> device_link pointers hung off apple_dart still wins over these little
> >> pointer-heavy structures if you need more than a few of them.
> >
> > I can get rid of the links, but I'll still need some way to store
> > both the apple_dart and the sid here. Like mentioned above, I'll
> > be happy to reuse the fwspec code but I don't see how yet.
>
> As before, if you can fit in some kind of DART instance identifier which
> isn't impractical to unpack then it makes sense to use the fwspec since
> it's already there. However if you still need to allocate something
> per-device rather than just stashing an existing pointer in iommu_priv,
> then you may as well keep everything together there. If the worst known
> case could still fit in just two DART pointers and two 64-bit bitmaps,
> I'd be inclined to just have that as a fixed structure and save all the
> extra bother - you're not cross-architecture like the fwspec code, and
> arm64's minimum kmalloc granularity has just gone back up to 128 bytes
> (but even at 64 bytes you'd have had plenty of room).
That's a very good point, I somehow tried to make this part as general
as possible and didn't realize that this only has to work on essentially
one SoC for now. I also don't expect Apple to require more than two
DARTs for a single master in the future.
I've tried the fixed structure now and I really like it so far.
>
> [...]
> >>> +static int apple_dart_remove(struct platform_device *pdev)
> >>> +{
> >>> +	struct apple_dart *dart = platform_get_drvdata(pdev);
> >>> +
> >>> +	devm_free_irq(dart->dev, dart->irq, dart);
> >>> +
> >>> +	iommu_device_unregister(&dart->iommu);
> >>> +	iommu_device_sysfs_remove(&dart->iommu);
> >>> +
> >>> +	clk_bulk_disable(dart->num_clks, dart->clks);
> >>> +	clk_bulk_unprepare(dart->num_clks, dart->clks);
> >>
> >> Ditto.
> >>
> >> And again the bus ops are still installed - that'll get really fun if
> >> this is a module unload...
> >
> > Ugh, yeah. I'll fix that as well. I'll have to see how to make this work
> > correctly with multiple DART instances. I guess I should only remove the
> > bus ops once the last one is removed. Now that I think about it, this
> > could also get tricky in the cleanup paths of apple_dart_probe.
> >
> > Maybe just add a module_init that sets up the bus ops when it finds at
> > least one DART node and module_exit to tear them down again?
>
> Actually by this point it was late and I wasn't thinking as clearly as I
> could have been, apologies ;)
>
> I believe a module unload is in fact the *only* time you should expect
> to see .remove called - you want to set .suppress_bind_attrs in your
> driver data because there's basically no way to prevent manual unbinding
> from blowing up - so it should be fine to unconditionally clear the ops
> at this point (being removed means you must have successfully probed, so
> any ops must be yours).
Makes sense, thanks!
I'll let my current version simmer for a bit and wait until it's been
tested by a few people and will send it in a week or so then!
Best,
Sven
> I'll let my current version simmer for a bit and wait until it's been
> tested by a few people and will send it in a week or so then!
New version has my T-b :)