Often devices allocate DMA buffers before they are runtime-PM resumed.
This is the case for example with v4l2 devices, where buffers are
allocated during 'VIDIOC_REQBUFS' and runtime resume happens later,
usually during 'VIDIOC_STREAMON'.
In such cases the partial TLB flush done at allocation time will fail,
since the iommu is runtime suspended. This prints a warning and falls
back to a full flush. But there is actually no need to flush the TLB
before the consumer device is turned on.
Fix the warning by skipping the partial flush when allocating and
instead doing a full flush in runtime resume.
In order to do the full flush from the resume callback, the test:

	if (pm_runtime_get_if_in_use(data->dev) <= 0)
		continue;

needs to be removed from the flush-all function, since
pm_runtime_get_if_in_use() returns 0 while the device is resuming
and the flush would be skipped.
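To make the failure mode concrete, here is a minimal sketch (a
hypothetical helper, not the actual driver code) of a flush-all that
keeps that check: while ->runtime_resume() runs, the device is still
RPM_RESUMING rather than RPM_ACTIVE, so pm_runtime_get_if_in_use()
returns 0 and the flush is silently skipped.

	/* sketch only: flush-all with the status check left in place */
	static void sketch_tlb_flush_all(struct mtk_iommu_data *data)
	{
		if (pm_runtime_get_if_in_use(data->dev) <= 0)
			return;		/* 0 while resuming -> flush skipped */

		/* ... write the invalidate registers ... */
		pm_runtime_put(data->dev);
	}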
This patchset is a combination of 4 patches already sent in a different
patchset [1], plus a warning fix from Sebastian Reichel.

[1] https://lore.kernel.org/linux-devicetree/[email protected]/
Changes since v1:
-----------------
* Added preparation patches to remove the unneeded 'for_each_m4u' usage
  and to add a spinlock protecting access to the TLB control registers.
* Removed the PM runtime status check, as explained above.
* Refactored commit logs and inline documentation.
* Moved the call to the full flush to the bottom of the resume callback,
  after all registers are updated.
Sebastian Reichel (1):
iommu/mediatek: Always check runtime PM status in tlb flush range
callback
Yong Wu (4):
iommu/mediatek: Remove for_each_m4u in tlb_sync_all
iommu/mediatek: Remove the power status checking in tlb flush all
iommu/mediatek: Add tlb_lock in tlb_flush_all
iommu/mediatek: Always tlb_flush_all when each PM resume
drivers/iommu/mtk_iommu.c | 42 ++++++++++++++++++++-------------------
1 file changed, 22 insertions(+), 20 deletions(-)
--
2.17.1
From: Yong Wu <[email protected]>
tlb_sync_all is called from these three places:
a) flush_iotlb_all: it will be called for each IOMMU HW.
b) tlb_flush_range_sync: it already has for_each_m4u.
c) the ISR: on an IOMMU HW translation fault, only that HW needs to be
   flushed.
Thus, there is no need for for_each_m4u in tlb_sync_all. Remove it.
Signed-off-by: Yong Wu <[email protected]>
Reviewed-by: Dafna Hirschfeld <[email protected]>
---
drivers/iommu/mtk_iommu.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 507123ae7485..342aa562ab6a 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -210,17 +210,15 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
{
- for_each_m4u(data) {
- if (pm_runtime_get_if_in_use(data->dev) <= 0)
- continue;
+ if (pm_runtime_get_if_in_use(data->dev) <= 0)
+ return;
- writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
- data->base + data->plat_data->inv_sel_reg);
- writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
- wmb(); /* Make sure the tlb flush all done */
+ writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
+ data->base + data->plat_data->inv_sel_reg);
+ writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
+ wmb(); /* Make sure the tlb flush all done */
- pm_runtime_put(data->dev);
- }
+ pm_runtime_put(data->dev);
}
static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
--
2.17.1
From: Sebastian Reichel <[email protected]>
In the case of v4l2_reqbufs() it is possible that a TLB flush is done
without runtime PM being enabled. In that case the "Partial TLB flush
timed out, falling back to full flush" warning is printed.
Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation")
introduced has_pm as an optimization to avoid checking runtime PM
when there is no power domain attached. But even without a PM domain
there is still the device driver's runtime PM suspend handler, which
disables the clock. Thus flushing should also be avoided when there
is no PM domain involved.
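For illustration only (a simplified sketch, not a verbatim copy of the
driver; the function name is hypothetical and the clock field name is
assumed), this is roughly the shape of the runtime-PM suspend handler
the commit message refers to: once the clock is gated, a TLB flush
cannot take effect, so the flush must be skipped while the device is
suspended regardless of whether a power domain exists.

	/* sketch, not the actual driver code */
	static int __maybe_unused sketch_runtime_suspend(struct device *dev)
	{
		struct mtk_iommu_data *data = dev_get_drvdata(dev);

		/* ... save the hardware registers ... */
		clk_disable_unprepare(data->bclk); /* assumed clock field */
		return 0;
	}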
Signed-off-by: Sebastian Reichel <[email protected]>
Reviewed-by: Dafna Hirschfeld <[email protected]>
---
drivers/iommu/mtk_iommu.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 342aa562ab6a..dd2c08c54df4 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -225,16 +225,13 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
size_t granule,
struct mtk_iommu_data *data)
{
- bool has_pm = !!data->dev->pm_domain;
unsigned long flags;
int ret;
u32 tmp;
for_each_m4u(data) {
- if (has_pm) {
- if (pm_runtime_get_if_in_use(data->dev) <= 0)
- continue;
- }
+ if (pm_runtime_get_if_in_use(data->dev) <= 0)
+ continue;
spin_lock_irqsave(&data->tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
@@ -259,8 +256,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
spin_unlock_irqrestore(&data->tlb_lock, flags);
- if (has_pm)
- pm_runtime_put(data->dev);
+ pm_runtime_put(data->dev);
}
}
--
2.17.1
From: Yong Wu <[email protected]>
tlb_flush_all touches the registers controlling TLB operations.
Protect it with the tlb_lock spinlock.
This also requires the range_sync function to release that spinlock
before calling tlb_flush_all.
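To make the resulting lock ordering explicit, here is a simplified
sketch of the range-sync path after this patch (condensed from the diff
below, not verbatim): the spinlock is dropped before the timeout
fallback, because tlb_flush_all now takes tlb_lock itself.

	spin_lock_irqsave(&data->tlb_lock, flags);
	/* ... trigger the range invalidation, poll REG_MMU_CPE_DONE ... */
	writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
	spin_unlock_irqrestore(&data->tlb_lock, flags);

	if (ret) {
		dev_warn(data->dev,
			 "Partial TLB flush timed out, falling back to full flush\n");
		mtk_iommu_tlb_flush_all(data);	/* takes tlb_lock internally */
	}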
Signed-off-by: Yong Wu <[email protected]>
[refactor commit log]
Signed-off-by: Dafna Hirschfeld <[email protected]>
---
drivers/iommu/mtk_iommu.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index e30ac68fab48..195a411e3087 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -210,10 +210,14 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&data->tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
data->base + data->plat_data->inv_sel_reg);
writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
wmb(); /* Make sure the tlb flush all done */
+ spin_unlock_irqrestore(&data->tlb_lock, flags);
}
static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
@@ -242,14 +246,16 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
/* tlb sync */
ret = readl_poll_timeout_atomic(data->base + REG_MMU_CPE_DONE,
tmp, tmp != 0, 10, 1000);
+
+ /* Clear the CPE status */
+ writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
+ spin_unlock_irqrestore(&data->tlb_lock, flags);
+
if (ret) {
dev_warn(data->dev,
"Partial TLB flush timed out, falling back to full flush\n");
mtk_iommu_tlb_flush_all(data);
}
- /* Clear the CPE status */
- writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
- spin_unlock_irqrestore(&data->tlb_lock, flags);
pm_runtime_put(data->dev);
}
--
2.17.1
From: Yong Wu <[email protected]>
Prepare for 2 HWs that share a pgtable but sit in different power
domains.
When there are 2 M4U HWs, there may be a problem in flush_range, where
we get the PM status via the m4u dev: that does not reflect the real
power-domain status of the HW, since other HWs may also use that
power domain.
DMA allocation is often done while the allocating device is runtime
suspended. In such a case the iommu will also be suspended and the
partial flushing of the tlb will not be executed.
Therefore, add a tlb_flush_all in the pm_runtime_resume callback to
make sure the tlb is always clean.
In the other cases, the iommu's power should be active via the device
link with smi.
Signed-off-by: Yong Wu <[email protected]>
[move the call to mtk_iommu_tlb_flush_all to the bottom of resume cb, improve doc/log]
Signed-off-by: Dafna Hirschfeld <[email protected]>
---
drivers/iommu/mtk_iommu.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 195a411e3087..4799cd06511b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -997,6 +997,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + REG_MMU_PT_BASE_ADDR);
+
+ /*
+ * Users may allocate dma buffer before they call pm_runtime_get,
+ * in which case it will lack the necessary tlb flush.
+ * Thus, make sure to update the tlb after each PM resume.
+ */
+ mtk_iommu_tlb_flush_all(data);
return 0;
}
--
2.17.1
From: Yong Wu <[email protected]>
To simplify the code, remove the power status check in
tlb_flush_all, i.e. remove this:

	if (pm_runtime_get_if_in_use(data->dev) <= 0)
		return;

mtk_iommu_tlb_flush_all is called from
a) the isr
b) the tlb flush range fail case
c) iommu_create_device_direct_mappings
In the first two cases, the power and clock are always enabled.
In the third case the tlb flush is unnecessary, because a later patch
in the series adds a full flush in the pm_runtime_resume callback.
In addition, writing the tlb control registers when the iommu is not
resumed is ok: the write is simply ignored.
Signed-off-by: Yong Wu <[email protected]>
[refactor commit log]
Signed-off-by: Dafna Hirschfeld <[email protected]>
---
drivers/iommu/mtk_iommu.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index dd2c08c54df4..e30ac68fab48 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -210,15 +210,10 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
{
- if (pm_runtime_get_if_in_use(data->dev) <= 0)
- return;
-
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
data->base + data->plat_data->inv_sel_reg);
writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
wmb(); /* Make sure the tlb flush all done */
-
- pm_runtime_put(data->dev);
}
static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
--
2.17.1
On Wed, 2021-12-08 at 14:07 +0200, Dafna Hirschfeld wrote:
> From: Sebastian Reichel <[email protected]>
>
> In the case of v4l2_reqbufs() it is possible that a TLB flush is done
> without runtime PM being enabled. In that case the "Partial TLB flush
> timed out, falling back to full flush" warning is printed.
>
> Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation")
> introduced has_pm as an optimization to avoid checking runtime PM
> when there is no power domain attached. But even without a PM domain
> there is still the device driver's runtime PM suspend handler, which
> disables the clock. Thus flushing should also be avoided when there
> is no PM domain involved.
>
> Signed-off-by: Sebastian Reichel <[email protected]>
> Reviewed-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: Yong Wu <[email protected]>
> ---
> drivers/iommu/mtk_iommu.c | 10 +++-------
> 1 file changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 342aa562ab6a..dd2c08c54df4 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -225,16 +225,13 @@ static void
> mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
> size_t granule,
> struct mtk_iommu_data *data)
> {
> - bool has_pm = !!data->dev->pm_domain;
> unsigned long flags;
> int ret;
> u32 tmp;
>
> for_each_m4u(data) {
> - if (has_pm) {
> - if (pm_runtime_get_if_in_use(data->dev) <= 0)
> - continue;
> - }
> + if (pm_runtime_get_if_in_use(data->dev) <= 0)
> + continue;
>
> spin_lock_irqsave(&data->tlb_lock, flags);
> writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
> @@ -259,8 +256,7 @@ static void
> mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
> writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
> spin_unlock_irqrestore(&data->tlb_lock, flags);
>
> - if (has_pm)
> - pm_runtime_put(data->dev);
> + pm_runtime_put(data->dev);
> }
> }
>
On 08/12/21 13:07, Dafna Hirschfeld wrote:
> From: Yong Wu <[email protected]>
>
> tlb_sync_all is called from these three places:
> a) flush_iotlb_all: it will be called for each IOMMU HW.
> b) tlb_flush_range_sync: it already has for_each_m4u.
> c) the ISR: on an IOMMU HW translation fault, only that HW needs to be
>    flushed.
>
> Thus, there is no need for for_each_m4u in tlb_sync_all. Remove it.
>
> Signed-off-by: Yong Wu <[email protected]>
> Reviewed-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
On 08/12/21 13:07, Dafna Hirschfeld wrote:
> From: Sebastian Reichel <[email protected]>
>
> In the case of v4l2_reqbufs() it is possible that a TLB flush is done
> without runtime PM being enabled. In that case the "Partial TLB flush
> timed out, falling back to full flush" warning is printed.
>
> Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation")
> introduced has_pm as an optimization to avoid checking runtime PM
> when there is no power domain attached. But even without a PM domain
> there is still the device driver's runtime PM suspend handler, which
> disables the clock. Thus flushing should also be avoided when there
> is no PM domain involved.
>
> Signed-off-by: Sebastian Reichel <[email protected]>
> Reviewed-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
On 08/12/21 13:07, Dafna Hirschfeld wrote:
> From: Yong Wu <[email protected]>
>
> To simplify the code, remove the power status check in
> tlb_flush_all, i.e. remove this:
>
> 	if (pm_runtime_get_if_in_use(data->dev) <= 0)
> 		return;
>
> mtk_iommu_tlb_flush_all is called from
> a) the isr
> b) the tlb flush range fail case
> c) iommu_create_device_direct_mappings
>
> In the first two cases, the power and clock are always enabled.
> In the third case the tlb flush is unnecessary, because a later patch
> in the series adds a full flush in the pm_runtime_resume callback.
>
> In addition, writing the tlb control registers when the iommu is not
> resumed is ok: the write is simply ignored.
>
> Signed-off-by: Yong Wu <[email protected]>
> [refactor commit log]
> Signed-off-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
On 08/12/21 13:07, Dafna Hirschfeld wrote:
> From: Yong Wu <[email protected]>
>
> tlb_flush_all touches the registers controlling TLB operations.
> Protect it with the tlb_lock spinlock.
> This also requires the range_sync function to release that spinlock
> before calling tlb_flush_all.
>
> Signed-off-by: Yong Wu <[email protected]>
> [refactor commit log]
> Signed-off-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
On 08/12/21 13:07, Dafna Hirschfeld wrote:
> From: Yong Wu <[email protected]>
>
> Prepare for 2 HWs that share a pgtable but sit in different power
> domains.
>
> When there are 2 M4U HWs, there may be a problem in flush_range, where
> we get the PM status via the m4u dev: that does not reflect the real
> power-domain status of the HW, since other HWs may also use that
> power domain.
>
> DMA allocation is often done while the allocating device is runtime
> suspended. In such a case the iommu will also be suspended and the
> partial flushing of the tlb will not be executed.
> Therefore, add a tlb_flush_all in the pm_runtime_resume callback to
> make sure the tlb is always clean.
>
> In the other cases, the iommu's power should be active via the device
> link with smi.
>
> Signed-off-by: Yong Wu <[email protected]>
> [move the call to mtk_iommu_tlb_flush_all to the bottom of resume cb, improve doc/log]
> Signed-off-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
On Wed, Dec 08, 2021 at 02:07:39PM +0200, Dafna Hirschfeld wrote:
> Sebastian Reichel (1):
> iommu/mediatek: Always check runtime PM status in tlb flush range
> callback
>
> Yong Wu (4):
> iommu/mediatek: Remove for_each_m4u in tlb_sync_all
> iommu/mediatek: Remove the power status checking in tlb flush all
> iommu/mediatek: Add tlb_lock in tlb_flush_all
> iommu/mediatek: Always tlb_flush_all when each PM resume
>
> drivers/iommu/mtk_iommu.c | 42 ++++++++++++++++++++-------------------
> 1 file changed, 22 insertions(+), 20 deletions(-)
Applied, thanks.