2023-08-09 18:03:05

by Marek Szyprowski

Subject: [PATCH v2] arm: dma-mapping: fix potential endless loop in __dma_page_dev_to_cpu()

The D-cache cleaning loop should not call folio_next() past the end of
the requested region and then rely on the parameters of the folio it
returns. Simply stop looping once the 'left' counter reaches zero.

This fixes the following endless loop, observed as an RCU stall on the
32-bit ARM Exynos5422-based Odroid-XU3lite board:

--->8---
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 0-....: (27320 ticks this GP) idle=e414/1/0x40000002 softirq=36/36 fqs=13044
rcu: (t=27385 jiffies g=-1067 q=34 ncpus=8)
CPU: 0 PID: 93 Comm: kworker/0:1H Not tainted 6.5.0-rc5-next-20230807 #6981
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: mmc_complete mmc_blk_mq_complete_work
PC is at _set_bit+0x28/0x44
LR is at __dma_page_dev_to_cpu+0xdc/0x170
..
_set_bit from __dma_page_dev_to_cpu+0xdc/0x170
__dma_page_dev_to_cpu from dma_direct_unmap_sg+0x100/0x130
dma_direct_unmap_sg from dw_mci_post_req+0x68/0x6c
dw_mci_post_req from mmc_blk_mq_post_req+0x34/0x100
mmc_blk_mq_post_req from mmc_blk_mq_complete_work+0x50/0x60
mmc_blk_mq_complete_work from process_one_work+0x20c/0x4d8
process_one_work from worker_thread+0x58/0x54c
worker_thread from kthread+0xe0/0xfc
kthread from ret_from_fork+0x14/0x2c
--->8---
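
The termination issue described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: 'struct fake_folio', walk_old() and walk_new() are
hypothetical names, folios are modelled as adjacent array elements, and
'long' stands in for ssize_t. Each function returns the index of the
last folio the loop ends up looking at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only its size matters here. */
struct fake_folio {
	size_t size;
};

/*
 * Model of the loop before the fix: it always advances to the next
 * folio, even when 'left' has just dropped to zero, so the loop
 * condition is then evaluated on a folio outside the region.
 */
static size_t walk_old(const struct fake_folio *folios, long left)
{
	size_t i = 0;

	assert(left > 0);
	while (left >= (long)folios[i].size) {
		left -= (long)folios[i].size;
		i++;			/* models folio_next() */
	}
	return i;
}

/*
 * Model of the loop after the fix: stop as soon as 'left' reaches
 * zero, before stepping to a folio beyond the requested region.
 */
static size_t walk_new(const struct fake_folio *folios, long left)
{
	size_t i = 0;

	assert(left > 0);
	while (left >= (long)folios[i].size) {
		left -= (long)folios[i].size;
		if (!left)
			break;
		i++;			/* models folio_next() */
	}
	return i;
}
```

For a region of exactly three 4096-byte folios (array indices 0..2),
walk_old() ends on index 3, one element past the region, while
walk_new() ends on index 2. When the region does not end on a folio
boundary, both variants stop on the same folio.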

While touching this code, move the set_bit() operation, which deals with
atomics, below the update of 'left'. The new order helps the compiler a
bit to produce code that computes folio_size() only once.

Fixes: cc24e9c0895c ("arm: implement the new page table range API")
Signed-off-by: Marek Szyprowski <[email protected]>
---
v2:
- changed the code and explanation as suggested by Russell and Matthew

v1:
- https://lore.kernel.org/all/[email protected]/
---
arch/arm/mm/dma-mapping.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 70cb7e63a9a5..0474840224d9 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -719,8 +719,10 @@ static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
}

while (left >= (ssize_t)folio_size(folio)) {
- set_bit(PG_dcache_clean, &folio->flags);
left -= folio_size(folio);
+ set_bit(PG_dcache_clean, &folio->flags);
+ if (!left)
+ break;
folio = folio_next(folio);
}
}
--
2.34.1