This series implements mitigations for lack of DMA access control on
systems without an IOMMU, which could result in the DMA accessing the
system memory at unexpected times and/or unexpected addresses, possibly
leading to data leakage or corruption.
For example, we plan to use the PCIe bus for Wi-Fi on one MTK platform, and
that PCIe bus is not behind an IOMMU. As PCIe, by design, gives the device
full access to system memory, a vulnerability in the Wi-Fi firmware could
easily escalate to a full system exploit (remote Wi-Fi exploits: [1a] and
[1b], which show a full chain of exploits; [2]; [3]).
To mitigate these security concerns, we introduce restricted DMA. Restricted
DMA is implemented with per-device swiotlb and coherent memory pools. The
feature on its own provides a basic level of protection against
the DMA overwriting buffer contents at unexpected times. However, to
protect against general data leakage and system memory corruption, the
system needs to provide a way to restrict the DMA to a predefined memory
region (this is usually done at firmware level, e.g. in ATF on some ARM
platforms).
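For illustration, the intended device-tree setup, mirroring the example added
in the dt-bindings patch later in this series (addresses, sizes and node
names are just examples), looks roughly like this:

    reserved-memory {
        wifi_coherent_mem_region: wifi_coherent_mem_region {
            compatible = "shared-dma-pool";
            reg = <0x50000000 0x400000>;
        };

        wifi_device_swiotlb_region: wifi_device_swiotlb_region {
            compatible = "device-swiotlb-pool";
            reg = <0x50400000 0x4000000>;
        };
    };

    pcie_wifi: pcie_wifi@0,0 {
        memory-region = <&wifi_coherent_mem_region>,
                        <&wifi_device_swiotlb_region>;
        restricted-dma = <0>, <1>;
        /* ... */
    };

Drivers keep using the regular DMA API: coherent allocations are served from
the first region, while streaming DMA to/from arbitrary system memory is
bounced through the second one.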
[1a] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
[1b] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
[2] https://blade.tencent.com/en/advisories/qualpwn/
[3] https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
Claire Chang (5):
swiotlb: Add io_tlb_mem struct
swiotlb: Add device swiotlb pool
swiotlb: Use device swiotlb pool if available
dt-bindings: of: Add plumbing for restricted DMA pool
of: Add plumbing for restricted DMA pool
.../reserved-memory/reserved-memory.txt | 35 ++
drivers/iommu/intel/iommu.c | 8 +-
drivers/of/address.c | 39 ++
drivers/of/device.c | 3 +
drivers/of/of_private.h | 6 +
drivers/xen/swiotlb-xen.c | 4 +-
include/linux/device.h | 4 +
include/linux/dma-direct.h | 8 +-
include/linux/swiotlb.h | 49 +-
kernel/dma/direct.c | 8 +-
kernel/dma/swiotlb.c | 418 +++++++++++-------
11 files changed, 393 insertions(+), 189 deletions(-)
--
v1: https://lore.kernel.org/patchwork/cover/1271660/
Changes in v2:
- build on top of swiotlb
2.28.0.rc0.142.g3c755180ce-goog
Added a new struct, io_tlb_mem, as the IO TLB memory pool descriptor and
moved relevant global variables into that struct.
This will be useful later to allow for per-device swiotlb regions.
Signed-off-by: Claire Chang <[email protected]>
---
drivers/iommu/intel/iommu.c | 2 +-
drivers/xen/swiotlb-xen.c | 4 +-
include/linux/swiotlb.h | 38 ++++-
kernel/dma/swiotlb.c | 286 +++++++++++++++++-------------------
4 files changed, 172 insertions(+), 158 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3f7c04cf89b3..44c9230251eb 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3736,7 +3736,7 @@ bounce_map_single(struct device *dev, phys_addr_t paddr, size_t size,
*/
if (!IS_ALIGNED(paddr | size, VTD_PAGE_SIZE)) {
tlb_addr = swiotlb_tbl_map_single(dev,
- __phys_to_dma(dev, io_tlb_start),
+ __phys_to_dma(dev, io_tlb_default_mem.start),
paddr, size, aligned_size, dir, attrs);
if (tlb_addr == DMA_MAPPING_ERROR) {
goto swiotlb_error;
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b6d27762c6f8..62452424ec8a 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -190,8 +190,8 @@ int __ref xen_swiotlb_init(int verbose, bool early)
/*
* IO TLB memory already allocated. Just use it.
*/
- if (io_tlb_start != 0) {
- xen_io_tlb_start = phys_to_virt(io_tlb_start);
+ if (io_tlb_default_mem.start != 0) {
+ xen_io_tlb_start = phys_to_virt(io_tlb_default_mem.start);
goto end;
}
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 046bb94bd4d6..ab0d571d0826 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -69,11 +69,45 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
#ifdef CONFIG_SWIOTLB
extern enum swiotlb_force swiotlb_force;
-extern phys_addr_t io_tlb_start, io_tlb_end;
+
+/**
+ * struct io_tlb_mem - IO TLB Memory Pool Descriptor
+ *
+ * @start: The start address of the swiotlb memory pool. Used to do a quick
+ * range check to see if the memory was in fact allocated by this
+ * API. For device private swiotlb, this is device tree adjustable.
+ * @end: The end address of the swiotlb memory pool. Used to do a quick
+ * range check to see if the memory was in fact allocated by this
+ * API. For device private swiotlb, this is device tree adjustable.
+ * @nslabs: The number of IO TLB blocks (in groups of 64) between @start and
+ * @end. For system swiotlb, this is command line adjustable via
+ * setup_io_tlb_npages.
+ * @used:	The number of used IO TLB blocks.
+ * @list: The free list describing the number of free entries available
+ * from each index.
+ * @index: The index to start searching in the next round.
+ * @orig_addr: The original address corresponding to a mapped entry for the
+ * sync operations.
+ * @lock: The lock to protect the above data structures in the map and
+ * unmap calls.
+ * @debugfs: The dentry to debugfs.
+ */
+struct io_tlb_mem {
+ phys_addr_t start;
+ phys_addr_t end;
+ unsigned long nslabs;
+ unsigned long used;
+ unsigned int *list;
+ unsigned int index;
+ phys_addr_t *orig_addr;
+ spinlock_t lock;
+ struct dentry *debugfs;
+};
+extern struct io_tlb_mem io_tlb_default_mem;
static inline bool is_swiotlb_buffer(phys_addr_t paddr)
{
- return paddr >= io_tlb_start && paddr < io_tlb_end;
+ return paddr >= io_tlb_default_mem.start && paddr < io_tlb_default_mem.end;
}
void __init swiotlb_exit(void);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c19379fabd20..f83911fa14ce 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -61,33 +61,11 @@
* allocate a contiguous 1MB, we're probably in trouble anyway.
*/
#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
+#define INVALID_PHYS_ADDR (~(phys_addr_t)0)
enum swiotlb_force swiotlb_force;
-/*
- * Used to do a quick range check in swiotlb_tbl_unmap_single and
- * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
- * API.
- */
-phys_addr_t io_tlb_start, io_tlb_end;
-
-/*
- * The number of IO TLB blocks (in groups of 64) between io_tlb_start and
- * io_tlb_end. This is command line adjustable via setup_io_tlb_npages.
- */
-static unsigned long io_tlb_nslabs;
-
-/*
- * The number of used IO TLB block
- */
-static unsigned long io_tlb_used;
-
-/*
- * This is a free list describing the number of free entries available from
- * each index
- */
-static unsigned int *io_tlb_list;
-static unsigned int io_tlb_index;
+struct io_tlb_mem io_tlb_default_mem;
/*
* Max segment that we can provide which (if pages are contingous) will
@@ -95,27 +73,17 @@ static unsigned int io_tlb_index;
*/
unsigned int max_segment;
-/*
- * We need to save away the original address corresponding to a mapped entry
- * for the sync operations.
- */
-#define INVALID_PHYS_ADDR (~(phys_addr_t)0)
-static phys_addr_t *io_tlb_orig_addr;
-
-/*
- * Protect the above data structures in the map and unmap calls
- */
-static DEFINE_SPINLOCK(io_tlb_lock);
-
static int late_alloc;
static int __init
setup_io_tlb_npages(char *str)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+
if (isdigit(*str)) {
- io_tlb_nslabs = simple_strtoul(str, &str, 0);
+ mem->nslabs = simple_strtoul(str, &str, 0);
/* avoid tail segment of size < IO_TLB_SEGSIZE */
- io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+ mem->nslabs = ALIGN(mem->nslabs, IO_TLB_SEGSIZE);
}
if (*str == ',')
++str;
@@ -123,7 +91,7 @@ setup_io_tlb_npages(char *str)
swiotlb_force = SWIOTLB_FORCE;
} else if (!strcmp(str, "noforce")) {
swiotlb_force = SWIOTLB_NO_FORCE;
- io_tlb_nslabs = 1;
+ mem->nslabs = 1;
}
return 0;
@@ -134,7 +102,7 @@ static bool no_iotlb_memory;
unsigned long swiotlb_nr_tbl(void)
{
- return unlikely(no_iotlb_memory) ? 0 : io_tlb_nslabs;
+ return unlikely(no_iotlb_memory) ? 0 : io_tlb_default_mem.nslabs;
}
EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
@@ -158,14 +126,14 @@ unsigned long swiotlb_size_or_default(void)
{
unsigned long size;
- size = io_tlb_nslabs << IO_TLB_SHIFT;
+ size = io_tlb_default_mem.nslabs << IO_TLB_SHIFT;
return size ? size : (IO_TLB_DEFAULT_SIZE);
}
void swiotlb_print_info(void)
{
- unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT;
+ unsigned long bytes = io_tlb_default_mem.nslabs << IO_TLB_SHIFT;
if (no_iotlb_memory) {
pr_warn("No low mem\n");
@@ -173,9 +141,9 @@ void swiotlb_print_info(void)
}
pr_info("mapped [mem %#010llx-%#010llx] (%luMB)\n",
- (unsigned long long)io_tlb_start,
- (unsigned long long)io_tlb_end,
- bytes >> 20);
+ (unsigned long long)io_tlb_default_mem.start,
+ (unsigned long long)io_tlb_default_mem.end,
+ bytes >> 20);
}
/*
@@ -192,50 +160,52 @@ void __init swiotlb_update_mem_attributes(void)
if (no_iotlb_memory || late_alloc)
return;
- vaddr = phys_to_virt(io_tlb_start);
- bytes = PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT);
+ vaddr = phys_to_virt(io_tlb_default_mem.start);
+ bytes = PAGE_ALIGN(io_tlb_default_mem.nslabs << IO_TLB_SHIFT);
set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
memset(vaddr, 0, bytes);
}
int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long i, bytes;
size_t alloc_size;
bytes = nslabs << IO_TLB_SHIFT;
- io_tlb_nslabs = nslabs;
- io_tlb_start = __pa(tlb);
- io_tlb_end = io_tlb_start + bytes;
+ mem->nslabs = nslabs;
+ mem->start = __pa(tlb);
+ mem->end = mem->start + bytes;
/*
* Allocate and initialize the free list array. This array is used
* to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
- * between io_tlb_start and io_tlb_end.
+ * between io_tlb_default_mem.start and io_tlb_default_mem.end.
*/
- alloc_size = PAGE_ALIGN(io_tlb_nslabs * sizeof(int));
- io_tlb_list = memblock_alloc(alloc_size, PAGE_SIZE);
- if (!io_tlb_list)
+ alloc_size = PAGE_ALIGN(mem->nslabs * sizeof(int));
+ mem->list = memblock_alloc(alloc_size, PAGE_SIZE);
+ if (!mem->list)
panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
__func__, alloc_size, PAGE_SIZE);
- alloc_size = PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t));
- io_tlb_orig_addr = memblock_alloc(alloc_size, PAGE_SIZE);
- if (!io_tlb_orig_addr)
+ alloc_size = PAGE_ALIGN(mem->nslabs * sizeof(phys_addr_t));
+ mem->orig_addr = memblock_alloc(alloc_size, PAGE_SIZE);
+ if (!mem->orig_addr)
panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
__func__, alloc_size, PAGE_SIZE);
- for (i = 0; i < io_tlb_nslabs; i++) {
- io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
- io_tlb_orig_addr[i] = INVALID_PHYS_ADDR;
+ for (i = 0; i < mem->nslabs; i++) {
+ mem->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+ mem->orig_addr[i] = INVALID_PHYS_ADDR;
}
- io_tlb_index = 0;
if (verbose)
swiotlb_print_info();
- swiotlb_set_max_segment(io_tlb_nslabs << IO_TLB_SHIFT);
+ swiotlb_set_max_segment(mem->nslabs << IO_TLB_SHIFT);
+ spin_lock_init(&mem->lock);
+
return 0;
}
@@ -246,25 +216,26 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
void __init
swiotlb_init(int verbose)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
size_t default_size = IO_TLB_DEFAULT_SIZE;
unsigned char *vstart;
unsigned long bytes;
- if (!io_tlb_nslabs) {
- io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
- io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+ if (!mem->nslabs) {
+ mem->nslabs = (default_size >> IO_TLB_SHIFT);
+ mem->nslabs = ALIGN(mem->nslabs, IO_TLB_SEGSIZE);
}
- bytes = io_tlb_nslabs << IO_TLB_SHIFT;
+ bytes = mem->nslabs << IO_TLB_SHIFT;
/* Get IO TLB memory from the low pages */
vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
- if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, verbose))
+ if (vstart && !swiotlb_init_with_tbl(vstart, mem->nslabs, verbose))
return;
- if (io_tlb_start)
- memblock_free_early(io_tlb_start,
- PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+ if (mem->start)
+ memblock_free_early(mem->start,
+ PAGE_ALIGN(mem->nslabs << IO_TLB_SHIFT));
pr_warn("Cannot allocate buffer");
no_iotlb_memory = true;
}
@@ -277,22 +248,23 @@ swiotlb_init(int verbose)
int
swiotlb_late_init_with_default_size(size_t default_size)
{
- unsigned long bytes, req_nslabs = io_tlb_nslabs;
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+ unsigned long bytes, req_nslabs = mem->nslabs;
unsigned char *vstart = NULL;
unsigned int order;
int rc = 0;
- if (!io_tlb_nslabs) {
- io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
- io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+ if (!mem->nslabs) {
+ mem->nslabs = (default_size >> IO_TLB_SHIFT);
+ mem->nslabs = ALIGN(mem->nslabs, IO_TLB_SEGSIZE);
}
/*
* Get IO TLB memory from the low pages
*/
- order = get_order(io_tlb_nslabs << IO_TLB_SHIFT);
- io_tlb_nslabs = SLABS_PER_PAGE << order;
- bytes = io_tlb_nslabs << IO_TLB_SHIFT;
+ order = get_order(mem->nslabs << IO_TLB_SHIFT);
+ mem->nslabs = SLABS_PER_PAGE << order;
+ bytes = mem->nslabs << IO_TLB_SHIFT;
while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
vstart = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
@@ -303,15 +275,15 @@ swiotlb_late_init_with_default_size(size_t default_size)
}
if (!vstart) {
- io_tlb_nslabs = req_nslabs;
+ mem->nslabs = req_nslabs;
return -ENOMEM;
}
if (order != get_order(bytes)) {
pr_warn("only able to allocate %ld MB\n",
(PAGE_SIZE << order) >> 20);
- io_tlb_nslabs = SLABS_PER_PAGE << order;
+ mem->nslabs = SLABS_PER_PAGE << order;
}
- rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs);
+ rc = swiotlb_late_init_with_tbl(vstart, mem->nslabs);
if (rc)
free_pages((unsigned long)vstart, order);
@@ -320,22 +292,23 @@ swiotlb_late_init_with_default_size(size_t default_size)
static void swiotlb_cleanup(void)
{
- io_tlb_end = 0;
- io_tlb_start = 0;
- io_tlb_nslabs = 0;
+ io_tlb_default_mem.end = 0;
+ io_tlb_default_mem.start = 0;
+ io_tlb_default_mem.nslabs = 0;
max_segment = 0;
}
int
swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long i, bytes;
bytes = nslabs << IO_TLB_SHIFT;
- io_tlb_nslabs = nslabs;
- io_tlb_start = virt_to_phys(tlb);
- io_tlb_end = io_tlb_start + bytes;
+ mem->nslabs = nslabs;
+ mem->start = virt_to_phys(tlb);
+ mem->end = mem->start + bytes;
set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
memset(tlb, 0, bytes);
@@ -343,38 +316,39 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
/*
* Allocate and initialize the free list array. This array is used
* to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
- * between io_tlb_start and io_tlb_end.
+ * between io_tlb_default_mem.start and io_tlb_default_mem.end.
*/
- io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL,
- get_order(io_tlb_nslabs * sizeof(int)));
- if (!io_tlb_list)
+ mem->list = (unsigned int *)__get_free_pages(GFP_KERNEL,
+ get_order(mem->nslabs * sizeof(int)));
+ if (!mem->list)
goto cleanup3;
- io_tlb_orig_addr = (phys_addr_t *)
+ mem->orig_addr = (phys_addr_t *)
__get_free_pages(GFP_KERNEL,
- get_order(io_tlb_nslabs *
+ get_order(mem->nslabs *
sizeof(phys_addr_t)));
- if (!io_tlb_orig_addr)
+ if (!mem->orig_addr)
goto cleanup4;
- for (i = 0; i < io_tlb_nslabs; i++) {
- io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
- io_tlb_orig_addr[i] = INVALID_PHYS_ADDR;
+ for (i = 0; i < mem->nslabs; i++) {
+ mem->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+ mem->orig_addr[i] = INVALID_PHYS_ADDR;
}
- io_tlb_index = 0;
+ mem->index = 0;
swiotlb_print_info();
late_alloc = 1;
- swiotlb_set_max_segment(io_tlb_nslabs << IO_TLB_SHIFT);
+ swiotlb_set_max_segment(mem->nslabs << IO_TLB_SHIFT);
+ spin_lock_init(&mem->lock);
return 0;
cleanup4:
- free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
- sizeof(int)));
- io_tlb_list = NULL;
+ free_pages((unsigned long)mem->list, get_order(mem->nslabs *
+ sizeof(int)));
+ mem->list = NULL;
cleanup3:
swiotlb_cleanup();
return -ENOMEM;
@@ -382,23 +356,25 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
void __init swiotlb_exit(void)
{
- if (!io_tlb_orig_addr)
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+
+ if (!mem->orig_addr)
return;
if (late_alloc) {
- free_pages((unsigned long)io_tlb_orig_addr,
- get_order(io_tlb_nslabs * sizeof(phys_addr_t)));
- free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
- sizeof(int)));
- free_pages((unsigned long)phys_to_virt(io_tlb_start),
- get_order(io_tlb_nslabs << IO_TLB_SHIFT));
+ free_pages((unsigned long)mem->orig_addr,
+ get_order(mem->nslabs * sizeof(phys_addr_t)));
+ free_pages((unsigned long)mem->list, get_order(mem->nslabs *
+ sizeof(int)));
+ free_pages((unsigned long)phys_to_virt(mem->start),
+ get_order(mem->nslabs << IO_TLB_SHIFT));
} else {
- memblock_free_late(__pa(io_tlb_orig_addr),
- PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t)));
- memblock_free_late(__pa(io_tlb_list),
- PAGE_ALIGN(io_tlb_nslabs * sizeof(int)));
- memblock_free_late(io_tlb_start,
- PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+ memblock_free_late(__pa(mem->orig_addr),
+ PAGE_ALIGN(mem->nslabs * sizeof(phys_addr_t)));
+ memblock_free_late(__pa(mem->list),
+ PAGE_ALIGN(mem->nslabs * sizeof(int)));
+ memblock_free_late(mem->start,
+ PAGE_ALIGN(mem->nslabs << IO_TLB_SHIFT));
}
swiotlb_cleanup();
}
@@ -451,6 +427,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
enum dma_data_direction dir,
unsigned long attrs)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long flags;
phys_addr_t tlb_addr;
unsigned int nslots, stride, index, wrap;
@@ -501,13 +478,13 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
* Find suitable number of IO TLB entries size that will fit this
* request and allocate a buffer from that IO TLB pool.
*/
- spin_lock_irqsave(&io_tlb_lock, flags);
+ spin_lock_irqsave(&mem->lock, flags);
- if (unlikely(nslots > io_tlb_nslabs - io_tlb_used))
+ if (unlikely(nslots > mem->nslabs - mem->used))
goto not_found;
- index = ALIGN(io_tlb_index, stride);
- if (index >= io_tlb_nslabs)
+ index = ALIGN(mem->index, stride);
+ if (index >= mem->nslabs)
index = 0;
wrap = index;
@@ -515,7 +492,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
while (iommu_is_span_boundary(index, nslots, offset_slots,
max_slots)) {
index += stride;
- if (index >= io_tlb_nslabs)
+ if (index >= mem->nslabs)
index = 0;
if (index == wrap)
goto not_found;
@@ -526,40 +503,40 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
* contiguous buffers, we allocate the buffers from that slot
* and mark the entries as '0' indicating unavailable.
*/
- if (io_tlb_list[index] >= nslots) {
+ if (mem->list[index] >= nslots) {
int count = 0;
for (i = index; i < (int) (index + nslots); i++)
- io_tlb_list[i] = 0;
- for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--)
- io_tlb_list[i] = ++count;
- tlb_addr = io_tlb_start + (index << IO_TLB_SHIFT);
+ mem->list[i] = 0;
+ for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && mem->list[i]; i--)
+ mem->list[i] = ++count;
+ tlb_addr = mem->start + (index << IO_TLB_SHIFT);
/*
* Update the indices to avoid searching in the next
* round.
*/
- io_tlb_index = ((index + nslots) < io_tlb_nslabs
- ? (index + nslots) : 0);
+ mem->index = ((index + nslots) < mem->nslabs
+ ? (index + nslots) : 0);
goto found;
}
index += stride;
- if (index >= io_tlb_nslabs)
+ if (index >= mem->nslabs)
index = 0;
} while (index != wrap);
not_found:
- tmp_io_tlb_used = io_tlb_used;
+ tmp_io_tlb_used = mem->used;
- spin_unlock_irqrestore(&io_tlb_lock, flags);
+ spin_unlock_irqrestore(&mem->lock, flags);
if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
- alloc_size, io_tlb_nslabs, tmp_io_tlb_used);
+ alloc_size, mem->nslabs, tmp_io_tlb_used);
return (phys_addr_t)DMA_MAPPING_ERROR;
found:
- io_tlb_used += nslots;
- spin_unlock_irqrestore(&io_tlb_lock, flags);
+ mem->used += nslots;
+ spin_unlock_irqrestore(&mem->lock, flags);
/*
* Save away the mapping from the original address to the DMA address.
@@ -567,7 +544,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
* needed.
*/
for (i = 0; i < nslots; i++)
- io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
+ mem->orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
@@ -582,10 +559,11 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
size_t mapping_size, size_t alloc_size,
enum dma_data_direction dir, unsigned long attrs)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long flags;
int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
- int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
- phys_addr_t orig_addr = io_tlb_orig_addr[index];
+ int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
+ phys_addr_t orig_addr = mem->orig_addr[index];
/*
* First, sync the memory before unmapping the entry
@@ -601,36 +579,37 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
* While returning the entries to the free list, we merge the entries
* with slots below and above the pool being returned.
*/
- spin_lock_irqsave(&io_tlb_lock, flags);
+ spin_lock_irqsave(&mem->lock, flags);
{
count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
- io_tlb_list[index + nslots] : 0);
+ mem->list[index + nslots] : 0);
/*
* Step 1: return the slots to the free list, merging the
* slots with superceeding slots
*/
for (i = index + nslots - 1; i >= index; i--) {
- io_tlb_list[i] = ++count;
- io_tlb_orig_addr[i] = INVALID_PHYS_ADDR;
+ mem->list[i] = ++count;
+ mem->orig_addr[i] = INVALID_PHYS_ADDR;
}
/*
* Step 2: merge the returned slots with the preceding slots,
* if available (non zero)
*/
- for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--)
- io_tlb_list[i] = ++count;
+ for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && mem->list[i]; i--)
+ mem->list[i] = ++count;
- io_tlb_used -= nslots;
+ mem->used -= nslots;
}
- spin_unlock_irqrestore(&io_tlb_lock, flags);
+ spin_unlock_irqrestore(&mem->lock, flags);
}
void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir,
enum dma_sync_target target)
{
- int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
- phys_addr_t orig_addr = io_tlb_orig_addr[index];
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+ int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
+ phys_addr_t orig_addr = mem->orig_addr[index];
if (orig_addr == INVALID_PHYS_ADDR)
return;
@@ -663,6 +642,7 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
enum dma_data_direction dir, unsigned long attrs)
{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
phys_addr_t swiotlb_addr;
dma_addr_t dma_addr;
@@ -670,7 +650,7 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
swiotlb_force);
swiotlb_addr = swiotlb_tbl_map_single(dev,
- __phys_to_dma(dev, io_tlb_start),
+ __phys_to_dma(dev, mem->start),
paddr, size, size, dir, attrs);
if (swiotlb_addr == (phys_addr_t)DMA_MAPPING_ERROR)
return DMA_MAPPING_ERROR;
@@ -699,21 +679,21 @@ size_t swiotlb_max_mapping_size(struct device *dev)
bool is_swiotlb_active(void)
{
/*
- * When SWIOTLB is initialized, even if io_tlb_start points to physical
- * address zero, io_tlb_end surely doesn't.
+ * When SWIOTLB is initialized, even if io_tlb_default_mem.start points
+ * to physical address zero, io_tlb_default_mem.end surely doesn't.
*/
- return io_tlb_end != 0;
+ return io_tlb_default_mem.end != 0;
}
#ifdef CONFIG_DEBUG_FS
static int __init swiotlb_create_debugfs(void)
{
- struct dentry *root;
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
- root = debugfs_create_dir("swiotlb", NULL);
- debugfs_create_ulong("io_tlb_nslabs", 0400, root, &io_tlb_nslabs);
- debugfs_create_ulong("io_tlb_used", 0400, root, &io_tlb_used);
+ mem->debugfs = debugfs_create_dir("swiotlb", NULL);
+ debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, &mem->nslabs);
+ debugfs_create_ulong("io_tlb_used", 0400, mem->debugfs, &mem->used);
return 0;
}
--
2.28.0.rc0.142.g3c755180ce-goog
Introduce a new compatible string, device-swiotlb-pool, for restricted
DMA. One can specify the address and length of the device swiotlb memory
region with a device-swiotlb-pool reserved-memory node in the device tree.
Signed-off-by: Claire Chang <[email protected]>
---
.../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
index 4dd20de6977f..78850896e1d0 100644
--- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
@@ -51,6 +51,24 @@ compatible (optional) - standard definition
used as a shared pool of DMA buffers for a set of devices. It can
be used by an operating system to instantiate the necessary pool
management subsystem if necessary.
+ - device-swiotlb-pool: This indicates a region of memory meant to be
+ used as a pool of device swiotlb buffers for a given device. When
+ using this, the no-map and reusable properties must not be set, so the
+ operating system can create a virtual mapping that will be used for
+ synchronization. Also, there must be a restricted-dma property in the
+ device node to specify the indexes of reserved-memory nodes. One can
+ specify two reserved-memory nodes in the device tree. One with
+ shared-dma-pool to handle the coherent DMA buffer allocation, and
+ another one with device-swiotlb-pool for regular DMA to/from system
+ memory, which would be subject to bouncing. The main purpose for
+ restricted DMA is to mitigate the lack of DMA access control on
+ systems without an IOMMU, which could result in the DMA accessing the
+ system memory at unexpected times and/or unexpected addresses,
+ possibly leading to data leakage or corruption. The feature on its own
+ provides a basic level of protection against the DMA overwriting buffer
+ contents at unexpected times. However, to protect against general data
+ leakage and system memory corruption, the system needs to provide a
+ way to restrict the DMA to a predefined memory region.
- vendor specific string in the form <vendor>,[<device>-]<usage>
no-map (optional) - empty property
- Indicates the operating system must not create a virtual mapping
@@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
compatible = "acme,multimedia-memory";
reg = <0x77000000 0x4000000>;
};
+
+ wifi_coherent_mem_region: wifi_coherent_mem_region {
+ compatible = "shared-dma-pool";
+ reg = <0x50000000 0x400000>;
+ };
+
+ wifi_device_swiotlb_region: wifi_device_swiotlb_region {
+ compatible = "device-swiotlb-pool";
+ reg = <0x50400000 0x4000000>;
+ };
};
/* ... */
@@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
memory-region = <&multimedia_reserved>;
/* ... */
};
+
+ pcie_wifi: pcie_wifi@0,0 {
+ memory-region = <&wifi_coherent_mem_region>,
+ <&wifi_device_swiotlb_region>;
+ restricted-dma = <0>, <1>;
+ /* ... */
+ };
};
--
2.28.0.rc0.142.g3c755180ce-goog
If a device is not behind an IOMMU, we look up the device node and set
up restricted DMA when the restricted-dma property is present.
One can specify two reserved-memory nodes in the device tree. One with
shared-dma-pool to handle the coherent DMA buffer allocation, and
another one with device-swiotlb-pool for regular DMA to/from system
memory, which would be subject to bouncing.
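For reference, the consumer device node this plumbing expects (taken from the
example in the dt-bindings patch of this series) looks like:

    pcie_wifi: pcie_wifi@0,0 {
        memory-region = <&wifi_coherent_mem_region>,
                        <&wifi_device_swiotlb_region>;
        restricted-dma = <0>, <1>;
        /* ... */
    };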
Signed-off-by: Claire Chang <[email protected]>
---
drivers/of/address.c | 39 +++++++++++++++++++++++++++++++++++++++
drivers/of/device.c | 3 +++
drivers/of/of_private.h | 6 ++++++
3 files changed, 48 insertions(+)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 381dc9be7b22..1285f914481f 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -8,6 +8,7 @@
#include <linux/logic_pio.h>
#include <linux/module.h>
#include <linux/of_address.h>
+#include <linux/of_reserved_mem.h>
#include <linux/pci.h>
#include <linux/pci_regs.h>
#include <linux/sizes.h>
@@ -1009,6 +1010,44 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
return ret;
}
+int of_dma_set_restricted_buffer(struct device *dev)
+{
+ int length, size, ret, i;
+ u32 idx[2];
+
+ if (!dev || !dev->of_node)
+ return -EINVAL;
+
+ if (!of_get_property(dev->of_node, "restricted-dma", &length))
+ return 0;
+
+ size = length / sizeof(idx[0]);
+ if (size > ARRAY_SIZE(idx)) {
+ dev_err(dev,
+ "restricted-dma expected less than or equal to %d indexs, but got %d\n",
+ ARRAY_SIZE(idx), size);
+ return -EINVAL;
+ }
+
+ ret = of_property_read_u32_array(dev->of_node, "restricted-dma", idx,
+ size);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < size; i++) {
+ ret = of_reserved_mem_device_init_by_idx(dev, dev->of_node,
+ idx[i]);
+ if (ret) {
+ dev_err(dev,
+ "of_reserved_mem_device_init_by_idx() failed with %d\n",
+ ret);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
/**
* of_dma_is_coherent - Check if device is coherent
* @np: device node
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 27203bfd0b22..83d6cf8a8256 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -169,6 +169,9 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
+ if (!iommu)
+ return of_dma_set_restricted_buffer(dev);
+
return 0;
}
EXPORT_SYMBOL_GPL(of_dma_configure);
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index edc682249c00..f2e3adfb7d85 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -160,12 +160,18 @@ extern int of_bus_n_size_cells(struct device_node *np);
#ifdef CONFIG_OF_ADDRESS
extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
u64 *paddr, u64 *size);
+extern int of_dma_set_restricted_buffer(struct device *dev);
#else
static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
u64 *paddr, u64 *size)
{
return -ENODEV;
}
+
+static inline int of_dma_set_restricted_buffer(struct device *dev)
+{
+ return -ENODEV;
+}
#endif
#endif /* _LINUX_OF_PRIVATE_H */
--
2.28.0.rc0.142.g3c755180ce-goog
Regardless of the global swiotlb setting, the device swiotlb pool is
preferred if available.
The device swiotlb pools provide a basic level of protection against
the DMA overwriting buffer contents at unexpected times. However, to
protect against general data leakage and system memory corruption, the
system needs to provide a way to restrict the DMA to a predefined memory
region.
Signed-off-by: Claire Chang <[email protected]>
---
drivers/iommu/intel/iommu.c | 6 +++---
include/linux/dma-direct.h | 8 ++++----
include/linux/swiotlb.h | 13 ++++++++-----
kernel/dma/direct.c | 8 ++++----
kernel/dma/swiotlb.c | 18 +++++++++++-------
5 files changed, 30 insertions(+), 23 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 44c9230251eb..37d6583cf628 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3684,7 +3684,7 @@ bounce_sync_single(struct device *dev, dma_addr_t addr, size_t size,
return;
tlb_addr = intel_iommu_iova_to_phys(&domain->domain, addr);
- if (is_swiotlb_buffer(tlb_addr))
+ if (is_swiotlb_buffer(dev, tlb_addr))
swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
}
@@ -3768,7 +3768,7 @@ bounce_map_single(struct device *dev, phys_addr_t paddr, size_t size,
return (phys_addr_t)iova_pfn << PAGE_SHIFT;
mapping_error:
- if (is_swiotlb_buffer(tlb_addr))
+ if (is_swiotlb_buffer(dev, tlb_addr))
swiotlb_tbl_unmap_single(dev, tlb_addr, size,
aligned_size, dir, attrs);
swiotlb_error:
@@ -3796,7 +3796,7 @@ bounce_unmap_single(struct device *dev, dma_addr_t dev_addr, size_t size,
return;
intel_unmap(dev, dev_addr, size);
- if (is_swiotlb_buffer(tlb_addr))
+ if (is_swiotlb_buffer(dev, tlb_addr))
swiotlb_tbl_unmap_single(dev, tlb_addr, size,
aligned_size, dir, attrs);
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 5a3ce2a24794..1cf920ddb2f6 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -134,7 +134,7 @@ static inline void dma_direct_sync_single_for_device(struct device *dev,
{
phys_addr_t paddr = dma_to_phys(dev, addr);
- if (unlikely(is_swiotlb_buffer(paddr)))
+ if (unlikely(is_swiotlb_buffer(dev, paddr)))
swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_DEVICE);
if (!dev_is_dma_coherent(dev))
@@ -151,7 +151,7 @@ static inline void dma_direct_sync_single_for_cpu(struct device *dev,
arch_sync_dma_for_cpu_all();
}
- if (unlikely(is_swiotlb_buffer(paddr)))
+ if (unlikely(is_swiotlb_buffer(dev, paddr)))
swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_CPU);
}
@@ -162,7 +162,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
phys_addr_t phys = page_to_phys(page) + offset;
dma_addr_t dma_addr = phys_to_dma(dev, phys);
- if (unlikely(swiotlb_force == SWIOTLB_FORCE))
+ if (unlikely(swiotlb_force == SWIOTLB_FORCE || dev->dma_io_tlb_mem))
return swiotlb_map(dev, phys, size, dir, attrs);
if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
@@ -188,7 +188,7 @@ static inline void dma_direct_unmap_page(struct device *dev, dma_addr_t addr,
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
dma_direct_sync_single_for_cpu(dev, addr, size, dir);
- if (unlikely(is_swiotlb_buffer(phys)))
+ if (unlikely(is_swiotlb_buffer(dev, phys)))
swiotlb_tbl_unmap_single(dev, phys, size, size, dir, attrs);
}
#endif /* _LINUX_DMA_DIRECT_H */
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index ab0d571d0826..8a50b3af7c3f 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -105,18 +105,21 @@ struct io_tlb_mem {
};
extern struct io_tlb_mem io_tlb_default_mem;
-static inline bool is_swiotlb_buffer(phys_addr_t paddr)
+static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
{
- return paddr >= io_tlb_default_mem.start && paddr < io_tlb_default_mem.end;
+ struct io_tlb_mem *mem =
+ dev->dma_io_tlb_mem ? dev->dma_io_tlb_mem : &io_tlb_default_mem;
+
+ return paddr >= mem->start && paddr < mem->end;
}
void __init swiotlb_exit(void);
unsigned int swiotlb_max_segment(void);
size_t swiotlb_max_mapping_size(struct device *dev);
-bool is_swiotlb_active(void);
+bool is_swiotlb_active(struct device *dev);
#else
#define swiotlb_force SWIOTLB_NO_FORCE
-static inline bool is_swiotlb_buffer(phys_addr_t paddr)
+static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
{
return false;
}
@@ -132,7 +135,7 @@ static inline size_t swiotlb_max_mapping_size(struct device *dev)
return SIZE_MAX;
}
-static inline bool is_swiotlb_active(void)
+static inline bool is_swiotlb_active(struct device *dev)
{
return false;
}
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index bb0041e99659..b31312d68196 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -311,7 +311,7 @@ void dma_direct_sync_sg_for_device(struct device *dev,
for_each_sg(sgl, sg, nents, i) {
phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg));
- if (unlikely(is_swiotlb_buffer(paddr)))
+ if (unlikely(is_swiotlb_buffer(dev, paddr)))
swiotlb_tbl_sync_single(dev, paddr, sg->length,
dir, SYNC_FOR_DEVICE);
@@ -337,7 +337,7 @@ void dma_direct_sync_sg_for_cpu(struct device *dev,
if (!dev_is_dma_coherent(dev))
arch_sync_dma_for_cpu(paddr, sg->length, dir);
- if (unlikely(is_swiotlb_buffer(paddr)))
+ if (unlikely(is_swiotlb_buffer(dev, paddr)))
swiotlb_tbl_sync_single(dev, paddr, sg->length, dir,
SYNC_FOR_CPU);
}
@@ -460,7 +460,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
size_t dma_direct_max_mapping_size(struct device *dev)
{
/* If SWIOTLB is active, use its maximum mapping size */
- if (is_swiotlb_active() &&
+ if (is_swiotlb_active(dev) &&
(dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
return swiotlb_max_mapping_size(dev);
return SIZE_MAX;
@@ -469,5 +469,5 @@ size_t dma_direct_max_mapping_size(struct device *dev)
bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr)
{
return !dev_is_dma_coherent(dev) ||
- is_swiotlb_buffer(dma_to_phys(dev, dma_addr));
+ is_swiotlb_buffer(dev, dma_to_phys(dev, dma_addr));
}
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index eaa101b3e75b..fbc9367bccb4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -449,7 +449,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
enum dma_data_direction dir,
unsigned long attrs)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem ? hwdev->dma_io_tlb_mem :
+ &io_tlb_default_mem;
unsigned long flags;
phys_addr_t tlb_addr;
unsigned int nslots, stride, index, wrap;
@@ -459,7 +460,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
unsigned long max_slots;
unsigned long tmp_io_tlb_used;
- if (no_iotlb_memory)
+ if (!hwdev->dma_io_tlb_mem && no_iotlb_memory)
panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
if (mem_encrypt_active())
@@ -581,7 +582,8 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
size_t mapping_size, size_t alloc_size,
enum dma_data_direction dir, unsigned long attrs)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem ? hwdev->dma_io_tlb_mem :
+ &io_tlb_default_mem;
unsigned long flags;
int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
@@ -629,7 +631,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir,
enum dma_sync_target target)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem ? hwdev->dma_io_tlb_mem :
+ &io_tlb_default_mem;
int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
phys_addr_t orig_addr = mem->orig_addr[index];
@@ -664,7 +667,8 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
enum dma_data_direction dir, unsigned long attrs)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem =
+ dev->dma_io_tlb_mem ? dev->dma_io_tlb_mem : &io_tlb_default_mem;
phys_addr_t swiotlb_addr;
dma_addr_t dma_addr;
@@ -698,13 +702,13 @@ size_t swiotlb_max_mapping_size(struct device *dev)
return ((size_t)1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;
}
-bool is_swiotlb_active(void)
+bool is_swiotlb_active(struct device *dev)
{
/*
* When SWIOTLB is initialized, even if io_tlb_default_mem.start points
* to physical address zero, io_tlb_default_mem.end surely doesn't.
*/
- return io_tlb_default_mem.end != 0;
+ return dev->dma_io_tlb_mem || io_tlb_default_mem.end != 0;
}
#ifdef CONFIG_DEBUG_FS
--
2.28.0.rc0.142.g3c755180ce-goog
Add the initialization function to create device swiotlb pools from
matching reserved-memory nodes in the device tree.
Signed-off-by: Claire Chang <[email protected]>
---
include/linux/device.h | 4 ++
kernel/dma/swiotlb.c | 148 +++++++++++++++++++++++++++++++++--------
2 files changed, 126 insertions(+), 26 deletions(-)
diff --git a/include/linux/device.h b/include/linux/device.h
index 79ce404619e6..f40f711e43e9 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -575,6 +575,10 @@ struct device {
struct cma *cma_area; /* contiguous memory area for dma
allocations */
#endif
+#ifdef CONFIG_SWIOTLB
+ struct io_tlb_mem *dma_io_tlb_mem;
+#endif
+
/* arch specific additions */
struct dev_archdata archdata;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f83911fa14ce..eaa101b3e75b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -36,6 +36,10 @@
#include <linux/scatterlist.h>
#include <linux/mem_encrypt.h>
#include <linux/set_memory.h>
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_fdt.h>
+#include <linux/of_reserved_mem.h>
#ifdef CONFIG_DEBUG_FS
#include <linux/debugfs.h>
#endif
@@ -298,20 +302,14 @@ static void swiotlb_cleanup(void)
max_segment = 0;
}
-int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+static int swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
+				    size_t size)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
- unsigned long i, bytes;
-
- bytes = nslabs << IO_TLB_SHIFT;
+ unsigned long i;
- mem->nslabs = nslabs;
- mem->start = virt_to_phys(tlb);
- mem->end = mem->start + bytes;
-
- set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
- memset(tlb, 0, bytes);
+ mem->nslabs = size >> IO_TLB_SHIFT;
+ mem->start = start;
+ mem->end = mem->start + size;
/*
* Allocate and initialize the free list array. This array is used
@@ -336,11 +334,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
}
mem->index = 0;
- swiotlb_print_info();
-
- late_alloc = 1;
-
- swiotlb_set_max_segment(mem->nslabs << IO_TLB_SHIFT);
spin_lock_init(&mem->lock);
return 0;
@@ -354,6 +347,38 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
return -ENOMEM;
}
+int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+{
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+ unsigned long bytes;
+ int ret;
+
+ bytes = nslabs << IO_TLB_SHIFT;
+
+ set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
+ memset(tlb, 0, bytes);
+
+ ret = swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), bytes);
+ if (ret)
+ return ret;
+
+ swiotlb_print_info();
+
+ late_alloc = 1;
+
+ swiotlb_set_max_segment(mem->nslabs << IO_TLB_SHIFT);
+
+ return 0;
+}
+
+static void swiotlb_free_pages(struct io_tlb_mem *mem)
+{
+ free_pages((unsigned long)mem->orig_addr,
+ get_order(mem->nslabs * sizeof(phys_addr_t)));
+ free_pages((unsigned long)mem->list,
+ get_order(mem->nslabs * sizeof(int)));
+}
+
void __init swiotlb_exit(void)
{
struct io_tlb_mem *mem = &io_tlb_default_mem;
@@ -362,10 +387,7 @@ void __init swiotlb_exit(void)
return;
if (late_alloc) {
- free_pages((unsigned long)mem->orig_addr,
- get_order(mem->nslabs * sizeof(phys_addr_t)));
- free_pages((unsigned long)mem->list, get_order(mem->nslabs *
- sizeof(int)));
+ swiotlb_free_pages(mem);
free_pages((unsigned long)phys_to_virt(mem->start),
get_order(mem->nslabs << IO_TLB_SHIFT));
} else {
@@ -687,16 +709,90 @@ bool is_swiotlb_active(void)
#ifdef CONFIG_DEBUG_FS
-static int __init swiotlb_create_debugfs(void)
+static void swiotlb_create_debugfs(struct io_tlb_mem *mem, const char *name,
+ struct dentry *node)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
-
- mem->debugfs = debugfs_create_dir("swiotlb", NULL);
+ mem->debugfs = debugfs_create_dir(name, node);
debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, &mem->nslabs);
debugfs_create_ulong("io_tlb_used", 0400, mem->debugfs, &mem->used);
+}
+
+static int __init swiotlb_create_default_debugfs(void)
+{
+ swiotlb_create_debugfs(&io_tlb_default_mem, "swiotlb", NULL);
+
return 0;
}
-late_initcall(swiotlb_create_debugfs);
+late_initcall(swiotlb_create_default_debugfs);
#endif
+
+static int device_swiotlb_init(struct reserved_mem *rmem,
+ struct device *dev)
+{
+ struct io_tlb_mem *mem;
+ int ret;
+
+ if (dev->dma_io_tlb_mem)
+ return 0;
+
+ mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+ if (!mem)
+ return -ENOMEM;
+
+ if (IS_ERR(devm_memremap(dev, rmem->base, rmem->size, MEMREMAP_WB))) {
+ ret = -EINVAL;
+ goto cleanup;
+ }
+
+ ret = swiotlb_init_io_tlb_mem(mem, rmem->base, rmem->size);
+ if (ret)
+ goto cleanup;
+
+ swiotlb_create_debugfs(mem, dev_name(dev), io_tlb_default_mem.debugfs);
+
+ dev->dma_io_tlb_mem = mem;
+
+ return 0;
+
+cleanup:
+ kfree(mem);
+
+ return ret;
+}
+
+static void device_swiotlb_release(struct reserved_mem *rmem,
+ struct device *dev)
+{
+ struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+
+ dev->dma_io_tlb_mem = NULL;
+
+ debugfs_remove_recursive(mem->debugfs);
+ swiotlb_free_pages(mem);
+ kfree(mem);
+}
+
+static const struct reserved_mem_ops rmem_device_swiotlb_ops = {
+ .device_init = device_swiotlb_init,
+ .device_release = device_swiotlb_release,
+};
+
+static int __init device_swiotlb_pool_setup(struct reserved_mem *rmem)
+{
+ unsigned long node = rmem->fdt_node;
+
+ if (of_get_flat_dt_prop(node, "reusable", NULL) ||
+ of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
+ of_get_flat_dt_prop(node, "linux,dma-default", NULL) ||
+ of_get_flat_dt_prop(node, "no-map", NULL))
+ return -EINVAL;
+
+ rmem->ops = &rmem_device_swiotlb_ops;
+ pr_info("Reserved memory: created device swiotlb memory pool at %pa, size %ld MiB\n",
+ &rmem->base, (unsigned long)rmem->size / SZ_1M);
+ return 0;
+}
+
+RESERVEDMEM_OF_DECLARE(dma, "device-swiotlb-pool", device_swiotlb_pool_setup);
--
2.28.0.rc0.142.g3c755180ce-goog
It seems that I didn't rebase the patchset properly. There are some
build test errors.
Sorry about that. Please kindly ignore those rebase issues. I'll fix
them in the next version.
On Tue, Jul 28, 2020 at 1:01 PM Claire Chang <[email protected]> wrote:
>
> This series implements mitigations for lack of DMA access control on
> systems without an IOMMU, which could result in the DMA accessing the
> system memory at unexpected times and/or unexpected addresses, possibly
> leading to data leakage or corruption.
>
> For example, we plan to use the PCI-e bus for Wi-Fi on one MTK platform and
> that PCI-e bus is not behind an IOMMU. As PCI-e, by design, gives the
> device full access to system memory, a vulnerability in the Wi-Fi firmware
> could easily escalate to a full system exploit (remote wifi exploits: [1a],
> [1b] that shows a full chain of exploits; [2], [3]).
>
> To mitigate the security concerns, we introduce restricted DMA. The
> restricted DMA is implemented by per-device swiotlb and coherent memory
> pools. The feature on its own provides a basic level of protection against
> the DMA overwriting buffer contents at unexpected times. However, to
> protect against general data leakage and system memory corruption, the
> system needs to provide a way to restrict the DMA to a predefined memory
> region (this is usually done at firmware level, e.g. in ATF on some ARM
> platforms).
>
> [1a] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
> [1b] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
> [2] https://blade.tencent.com/en/advisories/qualpwn/
> [3] https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
>
>
> Claire Chang (5):
> swiotlb: Add io_tlb_mem struct
> swiotlb: Add device swiotlb pool
> swiotlb: Use device swiotlb pool if available
> dt-bindings: of: Add plumbing for restricted DMA pool
> of: Add plumbing for restricted DMA pool
>
> .../reserved-memory/reserved-memory.txt | 35 ++
> drivers/iommu/intel/iommu.c | 8 +-
> drivers/of/address.c | 39 ++
> drivers/of/device.c | 3 +
> drivers/of/of_private.h | 6 +
> drivers/xen/swiotlb-xen.c | 4 +-
> include/linux/device.h | 4 +
> include/linux/dma-direct.h | 8 +-
> include/linux/swiotlb.h | 49 +-
> kernel/dma/direct.c | 8 +-
> kernel/dma/swiotlb.c | 418 +++++++++++-------
> 11 files changed, 393 insertions(+), 189 deletions(-)
>
> --
> v1: https://lore.kernel.org/patchwork/cover/1271660/
> Changes in v2:
> - build on top of swiotlb
>
> 2.28.0.rc0.142.g3c755180ce-goog
>
On Tue, Jul 28, 2020 at 01:01:39PM +0800, Claire Chang wrote:
> Introduce the new compatible string, device-swiotlb-pool, for restricted
> DMA. One can specify the address and length of the device swiotlb memory
> region by device-swiotlb-pool in the device tree.
>
> Signed-off-by: Claire Chang <[email protected]>
> ---
> .../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> index 4dd20de6977f..78850896e1d0 100644
> --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> @@ -51,6 +51,24 @@ compatible (optional) - standard definition
> used as a shared pool of DMA buffers for a set of devices. It can
> be used by an operating system to instantiate the necessary pool
> management subsystem if necessary.
> + - device-swiotlb-pool: This indicates a region of memory meant to be
swiotlb is a Linux thing. The binding should be independent.
> + used as a pool of device swiotlb buffers for a given device. When
> + using this, the no-map and reusable properties must not be set, so the
> + operating system can create a virtual mapping that will be used for
> + synchronization. Also, there must be a restricted-dma property in the
> + device node to specify the indexes of reserved-memory nodes. One can
> + specify two reserved-memory nodes in the device tree. One with
> + shared-dma-pool to handle the coherent DMA buffer allocation, and
> + another one with device-swiotlb-pool for regular DMA to/from system
> + memory, which would be subject to bouncing. The main purpose for
> + restricted DMA is to mitigate the lack of DMA access control on
> + systems without an IOMMU, which could result in the DMA accessing the
> + system memory at unexpected times and/or unexpected addresses,
> + possibly leading to data leakage or corruption. The feature on its own
> + provides a basic level of protection against the DMA overwriting buffer
> + contents at unexpected times. However, to protect against general data
> + leakage and system memory corruption, the system needs to provide a
> + way to restrict the DMA to a predefined memory region.
I'm pretty sure we already support per device carveouts and I don't
understand how this is different.
What is the last sentence supposed to imply? You need an IOMMU?
> - vendor specific string in the form <vendor>,[<device>-]<usage>
> no-map (optional) - empty property
> - Indicates the operating system must not create a virtual mapping
> @@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> compatible = "acme,multimedia-memory";
> reg = <0x77000000 0x4000000>;
> };
> +
> + wifi_coherent_mem_region: wifi_coherent_mem_region {
> + compatible = "shared-dma-pool";
> + reg = <0x50000000 0x400000>;
> + };
> +
> + wifi_device_swiotlb_region: wifi_device_swiotlb_region {
> + compatible = "device-swiotlb-pool";
> + reg = <0x50400000 0x4000000>;
> + };
> };
>
> /* ... */
> @@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> memory-region = <&multimedia_reserved>;
> /* ... */
> };
> +
> + pcie_wifi: pcie_wifi@0,0 {
> + memory-region = <&wifi_coherent_mem_region>,
> + <&wifi_device_swiotlb_region>;
> + restricted-dma = <0>, <1>;
> + /* ... */
> + };
> };
> --
> 2.28.0.rc0.142.g3c755180ce-goog
>
On Sat, Aug 1, 2020 at 4:58 AM Rob Herring <[email protected]> wrote:
>
> On Tue, Jul 28, 2020 at 01:01:39PM +0800, Claire Chang wrote:
> > Introduce the new compatible string, device-swiotlb-pool, for restricted
> > DMA. One can specify the address and length of the device swiotlb memory
> > region by device-swiotlb-pool in the device tree.
> >
> > Signed-off-by: Claire Chang <[email protected]>
> > ---
> > .../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
> > 1 file changed, 35 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > index 4dd20de6977f..78850896e1d0 100644
> > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > @@ -51,6 +51,24 @@ compatible (optional) - standard definition
> > used as a shared pool of DMA buffers for a set of devices. It can
> > be used by an operating system to instantiate the necessary pool
> > management subsystem if necessary.
> > + - device-swiotlb-pool: This indicates a region of memory meant to be
>
> swiotlb is a Linux thing. The binding should be independent.
Got it. Thanks for pointing this out.
>
> > + used as a pool of device swiotlb buffers for a given device. When
> > + using this, the no-map and reusable properties must not be set, so the
> > + operating system can create a virtual mapping that will be used for
> > + synchronization. Also, there must be a restricted-dma property in the
> > + device node to specify the indexes of reserved-memory nodes. One can
> > + specify two reserved-memory nodes in the device tree. One with
> > + shared-dma-pool to handle the coherent DMA buffer allocation, and
> > + another one with device-swiotlb-pool for regular DMA to/from system
> > + memory, which would be subject to bouncing. The main purpose for
> > + restricted DMA is to mitigate the lack of DMA access control on
> > + systems without an IOMMU, which could result in the DMA accessing the
> > + system memory at unexpected times and/or unexpected addresses,
> > + possibly leading to data leakage or corruption. The feature on its own
> > + provides a basic level of protection against the DMA overwriting buffer
> > + contents at unexpected times. However, to protect against general data
> > + leakage and system memory corruption, the system needs to provide a
> > + way to restrict the DMA to a predefined memory region.
>
> I'm pretty sure we already support per device carveouts and I don't
> understand how this is different.
We use this to bounce streaming DMA in and out of a specially allocated region.
I'll try to merge this with the existing one (i.e., shared-dma-pool)
to see if that
makes things clearer.
>
> What is the last sentence supposed to imply? You need an IOMMU?
The main purpose is to mitigate the lack of DMA access control on
systems without an IOMMU.
For example, we plan to use this plus an MPU for our PCIe Wi-Fi, which is
not behind an IOMMU.
>
> > - vendor specific string in the form <vendor>,[<device>-]<usage>
> > no-map (optional) - empty property
> > - Indicates the operating system must not create a virtual mapping
> > @@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > compatible = "acme,multimedia-memory";
> > reg = <0x77000000 0x4000000>;
> > };
> > +
> > + wifi_coherent_mem_region: wifi_coherent_mem_region {
> > + compatible = "shared-dma-pool";
> > + reg = <0x50000000 0x400000>;
> > + };
> > +
> > + wifi_device_swiotlb_region: wifi_device_swiotlb_region {
> > + compatible = "device-swiotlb-pool";
> > + reg = <0x50400000 0x4000000>;
> > + };
> > };
> >
> > /* ... */
> > @@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > memory-region = <&multimedia_reserved>;
> > /* ... */
> > };
> > +
> > + pcie_wifi: pcie_wifi@0,0 {
> > + memory-region = <&wifi_coherent_mem_region>,
> > + <&wifi_device_swiotlb_region>;
> > + restricted-dma = <0>, <1>;
> > + /* ... */
> > + };
> > };
> > --
> > 2.28.0.rc0.142.g3c755180ce-goog
> >
Hi Claire and Rob,
On Mon, Aug 3, 2020 at 4:26 PM Claire Chang <[email protected]> wrote:
>
> On Sat, Aug 1, 2020 at 4:58 AM Rob Herring <[email protected]> wrote:
> >
> > On Tue, Jul 28, 2020 at 01:01:39PM +0800, Claire Chang wrote:
> > > Introduce the new compatible string, device-swiotlb-pool, for restricted
> > > DMA. One can specify the address and length of the device swiotlb memory
> > > region by device-swiotlb-pool in the device tree.
> > >
> > > Signed-off-by: Claire Chang <[email protected]>
> > > ---
> > > .../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
> > > 1 file changed, 35 insertions(+)
> > >
> > > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > index 4dd20de6977f..78850896e1d0 100644
> > > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > @@ -51,6 +51,24 @@ compatible (optional) - standard definition
> > > used as a shared pool of DMA buffers for a set of devices. It can
> > > be used by an operating system to instantiate the necessary pool
> > > management subsystem if necessary.
> > > + - device-swiotlb-pool: This indicates a region of memory meant to be
> >
> > swiotlb is a Linux thing. The binding should be independent.
> Got it. Thanks for pointing this out.
>
> >
> > > + used as a pool of device swiotlb buffers for a given device. When
> > > + using this, the no-map and reusable properties must not be set, so the
> > > + operating system can create a virtual mapping that will be used for
> > > + synchronization. Also, there must be a restricted-dma property in the
> > > + device node to specify the indexes of reserved-memory nodes. One can
> > > + specify two reserved-memory nodes in the device tree. One with
> > > + shared-dma-pool to handle the coherent DMA buffer allocation, and
> > > + another one with device-swiotlb-pool for regular DMA to/from system
> > > + memory, which would be subject to bouncing. The main purpose for
> > > + restricted DMA is to mitigate the lack of DMA access control on
> > > + systems without an IOMMU, which could result in the DMA accessing the
> > > + system memory at unexpected times and/or unexpected addresses,
> > > + possibly leading to data leakage or corruption. The feature on its own
> > > + provides a basic level of protection against the DMA overwriting buffer
> > > + contents at unexpected times. However, to protect against general data
> > > + leakage and system memory corruption, the system needs to provide a
> > > + way to restrict the DMA to a predefined memory region.
> >
> > I'm pretty sure we already support per device carveouts and I don't
> > understand how this is different.
> We use this to bounce streaming DMA in and out of a specially allocated region.
> I'll try to merge this with the existing one (i.e., shared-dma-pool)
> to see if that
> makes things clearer.
>
Indeed, from the firmware point of view, this is just a carveout, for
which we have the "shared-dma-pool" compatible string defined already.
However, depending on the device and firmware setup, the way the
carveout is used may change. I can see the following scenarios:
1) coherent DMA (dma_alloc_*) within a reserved pool and no
non-coherent DMA (dma_map_*).
This is how the "memory-region" property is handled today in Linux for
devices which can only DMA from/to the given memory region. However,
I'm not sure whether the absence of non-coherent DMA is actually enforced
in any way by the DMA subsystem.
2) coherent DMA from a reserved pool and non-coherent DMA from system memory
This is the case for systems which have some dedicated part of memory
that is guaranteed to be coherent with the DMA, but where the device can
still do non-coherent DMA to any part of the system memory. Linux handles it
the same way as 1), which is what made me believe that 1) might not
actually be handled correctly.
3) coherent DMA and bounced non-coherent DMA within a reserved pool
4) coherent DMA within one pool and bounced non-coherent within another pool
These are the two cases we're interested in. Basically they make it
possible for non-coherent DMA from arbitrary system memory to be
bounced through a reserved pool, which the device has access to. The
current series implements 4), but I'd argue that it:
- is problematic from the firmware point of view, because on most of
the systems, both pools would be just some carveouts and the fact that
Linux would use one for coherent and the other for non-coherent DMA
would be an OS implementation detail,
- suffers from the static memory split between coherent and
non-coherent DMA, which could either result in some wasted memory or
in DMA breaking after a kernel update if the driver changes its
allocation pattern,
and so we should rather go with 3).
Now, from the firmware point of view, it doesn't matter how the OS
uses the carveout, but I think it's still necessary to tell the OS
about the device DMA capability. Right now we use "memory-region" for
any kind of reserved memory, but looking at the above scenarios, there
are 2 cases:
a) the memory region is preferred for the device, e.g. it enables
coherency, but the device can still DMA across the rest of the system
memory. This is the case in scenario 2) and is kind of assumed in the
Linux DMA subsystem, although it's certainly not the case for a lot of
hardware, even if they use the "memory-region" binding.
b) the memory region is the only region that the device can access.
This is the case in scenarios 1), 3) and 4).
For this, I'd like to propose a "restricted-dma-region" binding (feel free
to suggest a better name), explicitly specified to be the only DMA-able
memory for this device, and have Linux use the given pool both for coherent
DMA allocations and for bouncing non-coherent DMA.
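As a rough illustration only (the node name, address, and exact syntax are
placeholders borrowed from the example in the patch, not a finalized
binding), a board DTS could then look something like:

        reserved-memory {
                #address-cells = <1>;
                #size-cells = <1>;
                ranges;

                wifi_restricted_region: wifi_restricted_region {
                        compatible = "shared-dma-pool";
                        reg = <0x50000000 0x4400000>;
                };
        };

        pcie_wifi: pcie_wifi@0,0 {
                restricted-dma-region = <&wifi_restricted_region>;
                /* ... */
        };

with Linux then doing both dma_alloc_* and the bouncing of dma_map_* buffers
within that single region.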
What do you think?
Best regards,
Tomasz
> >
> > What is the last sentence supposed to imply? You need an IOMMU?
> The main purpose is to mitigate the lack of DMA access control on
> systems without an IOMMU.
> For example, we plan to use this plus a MPU for our PCIe WiFi which is
> not behind an IOMMU.
>
> >
> > > - vendor specific string in the form <vendor>,[<device>-]<usage>
> > > no-map (optional) - empty property
> > > - Indicates the operating system must not create a virtual mapping
> > > @@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > compatible = "acme,multimedia-memory";
> > > reg = <0x77000000 0x4000000>;
> > > };
> > > +
> > > + wifi_coherent_mem_region: wifi_coherent_mem_region {
> > > + compatible = "shared-dma-pool";
> > > + reg = <0x50000000 0x400000>;
> > > + };
> > > +
> > > + wifi_device_swiotlb_region: wifi_device_swiotlb_region {
> > > + compatible = "device-swiotlb-pool";
> > > + reg = <0x50400000 0x4000000>;
> > > + };
> > > };
> > >
> > > /* ... */
> > > @@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > memory-region = <&multimedia_reserved>;
> > > /* ... */
> > > };
> > > +
> > > + pcie_wifi: pcie_wifi@0,0 {
> > > + memory-region = <&wifi_coherent_mem_region>,
> > > + <&wifi_device_swiotlb_region>;
> > > + restricted-dma = <0>, <1>;
> > > + /* ... */
> > > + };
> > > };
> > > --
> > > 2.28.0.rc0.142.g3c755180ce-goog
> > >
On Mon, Aug 3, 2020 at 5:15 PM Tomasz Figa <[email protected]> wrote:
>
> Hi Claire and Rob,
>
> On Mon, Aug 3, 2020 at 4:26 PM Claire Chang <[email protected]> wrote:
> >
> > On Sat, Aug 1, 2020 at 4:58 AM Rob Herring <[email protected]> wrote:
> > >
> > > On Tue, Jul 28, 2020 at 01:01:39PM +0800, Claire Chang wrote:
> > > > Introduce the new compatible string, device-swiotlb-pool, for restricted
> > > > DMA. One can specify the address and length of the device swiotlb memory
> > > > region by device-swiotlb-pool in the device tree.
> > > >
> > > > Signed-off-by: Claire Chang <[email protected]>
> > > > ---
> > > > .../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
> > > > 1 file changed, 35 insertions(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > index 4dd20de6977f..78850896e1d0 100644
> > > > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > @@ -51,6 +51,24 @@ compatible (optional) - standard definition
> > > > used as a shared pool of DMA buffers for a set of devices. It can
> > > > be used by an operating system to instantiate the necessary pool
> > > > management subsystem if necessary.
> > > > + - device-swiotlb-pool: This indicates a region of memory meant to be
> > >
> > > swiotlb is a Linux thing. The binding should be independent.
> > Got it. Thanks for pointing this out.
> >
> > >
> > > > + used as a pool of device swiotlb buffers for a given device. When
> > > > + using this, the no-map and reusable properties must not be set, so the
> > > > + operating system can create a virtual mapping that will be used for
> > > > + synchronization. Also, there must be a restricted-dma property in the
> > > > + device node to specify the indexes of reserved-memory nodes. One can
> > > > + specify two reserved-memory nodes in the device tree. One with
> > > > + shared-dma-pool to handle the coherent DMA buffer allocation, and
> > > > + another one with device-swiotlb-pool for regular DMA to/from system
> > > > + memory, which would be subject to bouncing. The main purpose for
> > > > + restricted DMA is to mitigate the lack of DMA access control on
> > > > + systems without an IOMMU, which could result in the DMA accessing the
> > > > + system memory at unexpected times and/or unexpected addresses,
> > > > + possibly leading to data leakage or corruption. The feature on its own
> > > > + provides a basic level of protection against the DMA overwriting buffer
> > > > + contents at unexpected times. However, to protect against general data
> > > > + leakage and system memory corruption, the system needs to provide a
> > > > + way to restrict the DMA to a predefined memory region.
> > >
> > > I'm pretty sure we already support per device carveouts and I don't
> > > understand how this is different.
> > We use this to bounce streaming DMA in and out of a specially allocated region.
> > I'll try to merge this with the existing one (i.e., shared-dma-pool)
> > to see if that
> > makes things clearer.
> >
>
> Indeed, from the firmware point of view, this is just a carveout, for
> which we have the "shared-dma-pool" compatible string defined already.
>
> However, depending on the device and firmware setup, the way the
> carveout is used may change. I can see the following scenarios:
>
> 1) coherent DMA (dma_alloc_*) within a reserved pool and no
> non-coherent DMA (dma_map_*).
>
> This is how the "memory-region" property is handled today in Linux for
> devices which can only DMA from/to the given memory region. However,
> I'm not sure if no non-coherent DMA is actually enforced in any way by
> the DMA subsystem.
>
> 2) coherent DMA from a reserved pool and non-coherent DMA from system memory
>
> This is the case for the systems which have some dedicated part of
> memory which is guaranteed to be coherent with the DMA, but still can
> do non-coherent DMA to any part of the system memory. Linux handles it
> the same way as 1), which is what made me believe that 1) might not
> actually be handled correctly.
>
> 3) coherent DMA and bounced non-coherent DMA within a reserved pool
> 4) coherent DMA within one pool and bounced non-coherent within another pool
>
> These are the two cases we're interested in. Basically they make it
> possible for non-coherent DMA from arbitrary system memory to be
> bounced through a reserved pool, which the device has access to. The
> current series implements 4), but I'd argue that it:
>
> - is problematic from the firmware point of view, because on most of
> the systems, both pools would be just some carveouts and the fact that
> Linux would use one for coherent and the other for non-coherent DMA
> would be an OS implementation detail,
> - suffers from the static memory split between coherent and
> non-coherent DMA, which could either result in some wasted memory or
> the DMA stopped working after a kernel update if the driver changes
> its allocation pattern,
>
> and so we should rather go with 3).
>
> Now, from the firmware point of view, it doesn't matter how the OS
> uses the carveout, but I think it's still necessary to tell the OS
> about the device DMA capability. Right now we use "memory-region" for
> any kind of reserved memory, but looking at the above scenarios, there
> are 2 cases:
>
> a) the memory region is preferred for the device, e.g. it enables
> coherency, but the device can still DMA across the rest of the system
> memory. This is the case in scenario 2) and is kind of assumed in the
> Linux DMA subsystem, although it's certainly not the case for a lot of
> hardware, even if they use the "memory-region" binding.
>
> b) the memory region is the only region that the device can access.
> This is the case in scenarios 1), 3) and 4).
>
> For this, I'd like to propose a "restricted-dma-region" (feel free to
> suggest a better name) binding, which is explicitly specified to be
> the only DMA-able memory for this device and make Linux use the given
> pool for coherent DMA allocations and bouncing non-coherent DMA.
>
> What do you think?
Rob, Robin, we'd appreciate your feedback on this when you have a
chance to take a look again. Thanks!
Best regards,
Tomasz
>
> Best regards,
> Tomasz
>
> > >
> > > What is the last sentence supposed to imply? You need an IOMMU?
> > The main purpose is to mitigate the lack of DMA access control on
> > systems without an IOMMU.
> > For example, we plan to use this plus a MPU for our PCIe WiFi which is
> > not behind an IOMMU.
> >
> > >
> > > > - vendor specific string in the form <vendor>,[<device>-]<usage>
> > > > no-map (optional) - empty property
> > > > - Indicates the operating system must not create a virtual mapping
> > > > @@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > > compatible = "acme,multimedia-memory";
> > > > reg = <0x77000000 0x4000000>;
> > > > };
> > > > +
> > > > + wifi_coherent_mem_region: wifi_coherent_mem_region {
> > > > + compatible = "shared-dma-pool";
> > > > + reg = <0x50000000 0x400000>;
> > > > + };
> > > > +
> > > > + wifi_device_swiotlb_region: wifi_device_swiotlb_region {
> > > > + compatible = "device-swiotlb-pool";
> > > > + reg = <0x50400000 0x4000000>;
> > > > + };
> > > > };
> > > >
> > > > /* ... */
> > > > @@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > > memory-region = <&multimedia_reserved>;
> > > > /* ... */
> > > > };
> > > > +
> > > > + pcie_wifi: pcie_wifi@0,0 {
> > > > + memory-region = <&wifi_coherent_mem_region>,
> > > > + <&wifi_device_swiotlb_region>;
> > > > + restricted-dma = <0>, <1>;
> > > > + /* ... */
> > > > + };
> > > > };
> > > > --
> > > > 2.28.0.rc0.142.g3c755180ce-goog
> > > >
On Tue, Aug 11, 2020 at 11:15 AM Tomasz Figa <[email protected]> wrote:
>
> On Mon, Aug 3, 2020 at 5:15 PM Tomasz Figa <[email protected]> wrote:
> >
> > Hi Claire and Rob,
> >
> > On Mon, Aug 3, 2020 at 4:26 PM Claire Chang <[email protected]> wrote:
> > >
> > > On Sat, Aug 1, 2020 at 4:58 AM Rob Herring <[email protected]> wrote:
> > > >
> > > > On Tue, Jul 28, 2020 at 01:01:39PM +0800, Claire Chang wrote:
> > > > > Introduce the new compatible string, device-swiotlb-pool, for restricted
> > > > > DMA. One can specify the address and length of the device swiotlb memory
> > > > > region by device-swiotlb-pool in the device tree.
> > > > >
> > > > > Signed-off-by: Claire Chang <[email protected]>
> > > > > ---
> > > > > .../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
> > > > > 1 file changed, 35 insertions(+)
> > > > >
> > > > > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > index 4dd20de6977f..78850896e1d0 100644
> > > > > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > @@ -51,6 +51,24 @@ compatible (optional) - standard definition
> > > > > used as a shared pool of DMA buffers for a set of devices. It can
> > > > > be used by an operating system to instantiate the necessary pool
> > > > > management subsystem if necessary.
> > > > > + - device-swiotlb-pool: This indicates a region of memory meant to be
> > > >
> > > > swiotlb is a Linux thing. The binding should be independent.
> > > Got it. Thanks for pointing this out.
> > >
> > > >
> > > > > + used as a pool of device swiotlb buffers for a given device. When
> > > > > + using this, the no-map and reusable properties must not be set, so the
> > > > > + operating system can create a virtual mapping that will be used for
> > > > > + synchronization. Also, there must be a restricted-dma property in the
> > > > > + device node to specify the indexes of reserved-memory nodes. One can
> > > > > + specify two reserved-memory nodes in the device tree. One with
> > > > > + shared-dma-pool to handle the coherent DMA buffer allocation, and
> > > > > + another one with device-swiotlb-pool for regular DMA to/from system
> > > > > + memory, which would be subject to bouncing. The main purpose for
> > > > > + restricted DMA is to mitigate the lack of DMA access control on
> > > > > + systems without an IOMMU, which could result in the DMA accessing the
> > > > > + system memory at unexpected times and/or unexpected addresses,
> > > > > + possibly leading to data leakage or corruption. The feature on its own
> > > > > + provides a basic level of protection against the DMA overwriting buffer
> > > > > + contents at unexpected times. However, to protect against general data
> > > > > + leakage and system memory corruption, the system needs to provide a
> > > > > + way to restrict the DMA to a predefined memory region.
> > > >
> > > > I'm pretty sure we already support per device carveouts and I don't
> > > > understand how this is different.
> > > We use this to bounce streaming DMA in and out of a specially allocated region.
> > > I'll try to merge this with the existing one (i.e., shared-dma-pool)
> > > to see if that
> > > makes things clearer.
> > >
> >
> > Indeed, from the firmware point of view, this is just a carveout, for
> > which we have the "shared-dma-pool" compatible string defined already.
> >
> > However, depending on the device and firmware setup, the way the
> > carveout is used may change. I can see the following scenarios:
> >
> > 1) coherent DMA (dma_alloc_*) within a reserved pool and no
> > non-coherent DMA (dma_map_*).
> >
> > This is how the "memory-region" property is handled today in Linux for
> > devices which can only DMA from/to the given memory region. However,
> > I'm not sure if no non-coherent DMA is actually enforced in any way by
> > the DMA subsystem.
> >
> > 2) coherent DMA from a reserved pool and non-coherent DMA from system memory
> >
> > This is the case for the systems which have some dedicated part of
> > memory which is guaranteed to be coherent with the DMA, but still can
> > do non-coherent DMA to any part of the system memory. Linux handles it
> > the same way as 1), which is what made me believe that 1) might not
> > actually be handled correctly.
> >
> > 3) coherent DMA and bounced non-coherent DMA within a reserved pool
> > 4) coherent DMA within one pool and bounced non-coherent within another pool
> >
> > These are the two cases we're interested in. Basically they make it
> > possible for non-coherent DMA from arbitrary system memory to be
> > bounced through a reserved pool, which the device has access to. The
> > current series implements 4), but I'd argue that it:
> >
> > - is problematic from the firmware point of view, because on most of
> > the systems, both pools would be just some carveouts and the fact that
> > Linux would use one for coherent and the other for non-coherent DMA
> > would be an OS implementation detail,
> > - suffers from the static memory split between coherent and
> > non-coherent DMA, which could either result in some wasted memory or
> > the DMA stopped working after a kernel update if the driver changes
> > its allocation pattern,
> >
> > and so we should rather go with 3).
> >
> > Now, from the firmware point of view, it doesn't matter how the OS
> > uses the carveout, but I think it's still necessary to tell the OS
> > about the device DMA capability. Right now we use "memory-region" for
> > any kind of reserved memory, but looking at the above scenarios, there
> > are 2 cases:
> >
> > a) the memory region is preferred for the device, e.g. it enables
> > coherency, but the device can still DMA across the rest of the system
> > memory. This is the case in scenario 2) and is kind of assumed in the
> > Linux DMA subsystem, although it's certainly not the case for a lot of
> > hardware, even if they use the "memory-region" binding.
> >
> > b) the memory region is the only region that the device can access.
> > This is the case in scenarios 1), 3) and 4).
> >
> > For this, I'd like to propose a "restricted-dma-region" (feel free to
> > suggest a better name) binding, which is explicitly specified to be
> > the only DMA-able memory for this device and make Linux use the given
> > pool for coherent DMA allocations and bouncing non-coherent DMA.
> >
> > What do you think?
>
> Rob, Robin, we'd appreciate your feedback on this when you have a
> chance to take a look again. Thanks!
Gentle ping.
>
> Best regards,
> Tomasz
>
> >
> > Best regards,
> > Tomasz
> >
> > > >
> > > > What is the last sentence supposed to imply? You need an IOMMU?
> > > The main purpose is to mitigate the lack of DMA access control on
> > > systems without an IOMMU.
> > > For example, we plan to use this plus a MPU for our PCIe WiFi which is
> > > not behind an IOMMU.
> > >
> > > >
> > > > > - vendor specific string in the form <vendor>,[<device>-]<usage>
> > > > > no-map (optional) - empty property
> > > > > - Indicates the operating system must not create a virtual mapping
> > > > > @@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > > > compatible = "acme,multimedia-memory";
> > > > > reg = <0x77000000 0x4000000>;
> > > > > };
> > > > > +
> > > > > + wifi_coherent_mem_region: wifi_coherent_mem_region {
> > > > > + compatible = "shared-dma-pool";
> > > > > + reg = <0x50000000 0x400000>;
> > > > > + };
> > > > > +
> > > > > + wifi_device_swiotlb_region: wifi_device_swiotlb_region {
> > > > > + compatible = "device-swiotlb-pool";
> > > > > + reg = <0x50400000 0x4000000>;
> > > > > + };
> > > > > };
> > > > >
> > > > > /* ... */
> > > > > @@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > > > memory-region = <&multimedia_reserved>;
> > > > > /* ... */
> > > > > };
> > > > > +
> > > > > + pcie_wifi: pcie_wifi@0,0 {
> > > > > + memory-region = <&wifi_coherent_mem_region>,
> > > > > + <&wifi_device_swiotlb_region>;
> > > > > + restricted-dma = <0>, <1>;
> > > > > + /* ... */
> > > > > + };
> > > > > };
> > > > > --
> > > > > 2.28.0.rc0.142.g3c755180ce-goog
> > > > >
On Tue, Aug 25, 2020 at 1:30 AM Tomasz Figa <[email protected]> wrote:
>
> On Tue, Aug 11, 2020 at 11:15 AM Tomasz Figa <[email protected]> wrote:
> >
> > On Mon, Aug 3, 2020 at 5:15 PM Tomasz Figa <[email protected]> wrote:
> > >
> > > Hi Claire and Rob,
> > >
> > > On Mon, Aug 3, 2020 at 4:26 PM Claire Chang <[email protected]> wrote:
> > > >
> > > > On Sat, Aug 1, 2020 at 4:58 AM Rob Herring <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jul 28, 2020 at 01:01:39PM +0800, Claire Chang wrote:
> > > > > > Introduce the new compatible string, device-swiotlb-pool, for restricted
> > > > > > DMA. One can specify the address and length of the device swiotlb memory
> > > > > > region by device-swiotlb-pool in the device tree.
> > > > > >
> > > > > > Signed-off-by: Claire Chang <[email protected]>
> > > > > > ---
> > > > > > .../reserved-memory/reserved-memory.txt | 35 +++++++++++++++++++
> > > > > > 1 file changed, 35 insertions(+)
> > > > > >
> > > > > > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > > index 4dd20de6977f..78850896e1d0 100644
> > > > > > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > > @@ -51,6 +51,24 @@ compatible (optional) - standard definition
> > > > > > used as a shared pool of DMA buffers for a set of devices. It can
> > > > > > be used by an operating system to instantiate the necessary pool
> > > > > > management subsystem if necessary.
> > > > > > + - device-swiotlb-pool: This indicates a region of memory meant to be
> > > > >
> > > > > swiotlb is a Linux thing. The binding should be independent.
> > > > Got it. Thanks for pointing this out.
> > > >
> > > > >
> > > > > > + used as a pool of device swiotlb buffers for a given device. When
> > > > > > + using this, the no-map and reusable properties must not be set, so the
> > > > > > + operating system can create a virtual mapping that will be used for
> > > > > > + synchronization. Also, there must be a restricted-dma property in the
> > > > > > + device node to specify the indexes of reserved-memory nodes. One can
> > > > > > + specify two reserved-memory nodes in the device tree. One with
> > > > > > + shared-dma-pool to handle the coherent DMA buffer allocation, and
> > > > > > + another one with device-swiotlb-pool for regular DMA to/from system
> > > > > > + memory, which would be subject to bouncing. The main purpose for
> > > > > > + restricted DMA is to mitigate the lack of DMA access control on
> > > > > > + systems without an IOMMU, which could result in the DMA accessing the
> > > > > > + system memory at unexpected times and/or unexpected addresses,
> > > > > > + possibly leading to data leakage or corruption. The feature on its own
> > > > > > + provides a basic level of protection against the DMA overwriting buffer
> > > > > > + contents at unexpected times. However, to protect against general data
> > > > > > + leakage and system memory corruption, the system needs to provide a
> > > > > > + way to restrict the DMA to a predefined memory region.
> > > > >
> > > > > I'm pretty sure we already support per device carveouts and I don't
> > > > > understand how this is different.
> > > > We use this to bounce streaming DMA in and out of a specially allocated region.
> > > > I'll try to merge this with the existing one (i.e., shared-dma-pool)
> > > > to see if that
> > > > makes things clearer.
> > > >
> > >
> > > Indeed, from the firmware point of view, this is just a carveout, for
> > > which we have the "shared-dma-pool" compatible string defined already.
> > >
> > > However, depending on the device and firmware setup, the way the
> > > carveout is used may change. I can see the following scenarios:
> > >
> > > 1) coherent DMA (dma_alloc_*) within a reserved pool and no
> > > non-coherent DMA (dma_map_*).
> > >
> > > This is how the "memory-region" property is handled today in Linux for
> > > devices which can only DMA from/to the given memory region. However,
> > > I'm not sure if no non-coherent DMA is actually enforced in any way by
> > > the DMA subsystem.
> > >
> > > 2) coherent DMA from a reserved pool and non-coherent DMA from system memory
> > >
> > > This is the case for the systems which have some dedicated part of
> > > memory which is guaranteed to be coherent with the DMA, but still can
> > > do non-coherent DMA to any part of the system memory. Linux handles it
> > > the same way as 1), which is what made me believe that 1) might not
> > > actually be handled correctly.
> > >
> > > 3) coherent DMA and bounced non-coherent DMA within a reserved pool
> > > 4) coherent DMA within one pool and bounced non-coherent within another pool
> > >
> > > These are the two cases we're interested in. Basically they make it
> > > possible for non-coherent DMA from arbitrary system memory to be
> > > bounced through a reserved pool, which the device has access to. The
> > > current series implements 4), but I'd argue that it:
> > >
> > > - is problematic from the firmware point of view, because on most of
> > > the systems, both pools would be just some carveouts and the fact that
> > > Linux would use one for coherent and the other for non-coherent DMA
> > > would be an OS implementation detail,
> > > - suffers from the static memory split between coherent and
> > > non-coherent DMA, which could either result in some wasted memory or
> > > the DMA stopped working after a kernel update if the driver changes
> > > its allocation pattern,
> > >
> > > and so we should rather go with 3).
> > >
> > > Now, from the firmware point of view, it doesn't matter how the OS
> > > uses the carveout, but I think it's still necessary to tell the OS
> > > about the device DMA capability. Right now we use "memory-region" for
> > > any kind of reserved memory, but looking at the above scenarios, there
> > > are 2 cases:
> > >
> > > a) the memory region is preferred for the device, e.g. it enables
> > > coherency, but the device can still DMA across the rest of the system
> > > memory. This is the case in scenario 2) and is kind of assumed in the
> > > Linux DMA subsystem, although it's certainly not the case for a lot of
> > > hardware, even if they use the "memory-region" binding.
> > >
> > > b) the memory region is the only region that the device can access.
> > > This is the case in scenarios 1), 3) and 4).
> > >
> > > For this, I'd like to propose a "restricted-dma-region" (feel free to
> > > suggest a better name) binding, which is explicitly specified to be
> > > the only DMA-able memory for this device and make Linux use the given
> > > pool for coherent DMA allocations and bouncing non-coherent DMA.
> > >
> > > What do you think?
> >
> > Rob, Robin, we'd appreciate your feedback on this when you have a
> > chance to take a look again. Thanks!
>
> Gentle ping.
The "restricted-dma-region" idea sounds good to me and I'm happy to
submit a new version implementing it.
Rob, Robin, please kindly let us know if you have any concerns about
it. Thanks!
Best regards,
Claire
>
> >
> > Best regards,
> > Tomasz
> >
> > >
> > > Best regards,
> > > Tomasz
> > >
> > > > >
> > > > > What is the last sentence supposed to imply? You need an IOMMU?
> > > > The main purpose is to mitigate the lack of DMA access control on
> > > > systems without an IOMMU.
> > > > For example, we plan to use this plus a MPU for our PCIe WiFi which is
> > > > not behind an IOMMU.
> > > >
> > > > >
> > > > > > - vendor specific string in the form <vendor>,[<device>-]<usage>
> > > > > > no-map (optional) - empty property
> > > > > > - Indicates the operating system must not create a virtual mapping
> > > > > > @@ -117,6 +135,16 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > > > > compatible = "acme,multimedia-memory";
> > > > > > reg = <0x77000000 0x4000000>;
> > > > > > };
> > > > > > +
> > > > > > + wifi_coherent_mem_region: wifi_coherent_mem_region {
> > > > > > + compatible = "shared-dma-pool";
> > > > > > + reg = <0x50000000 0x400000>;
> > > > > > + };
> > > > > > +
> > > > > > + wifi_device_swiotlb_region: wifi_device_swiotlb_region {
> > > > > > + compatible = "device-swiotlb-pool";
> > > > > > + reg = <0x50400000 0x4000000>;
> > > > > > + };
> > > > > > };
> > > > > >
> > > > > > /* ... */
> > > > > > @@ -135,4 +163,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
> > > > > > memory-region = <&multimedia_reserved>;
> > > > > > /* ... */
> > > > > > };
> > > > > > +
> > > > > > + pcie_wifi: pcie_wifi@0,0 {
> > > > > > + memory-region = <&wifi_coherent_mem_region>,
> > > > > > + <&wifi_device_swiotlb_region>;
> > > > > > + restricted-dma = <0>, <1>;
> > > > > > + /* ... */
> > > > > > + };
> > > > > > };
> > > > > > --
> > > > > > 2.28.0.rc0.142.g3c755180ce-goog
> > > > > >