From: Masayoshi Mizuma <[email protected]>
If the CPU cache line size is greater than ARCH_DMA_MINALIGN (128), a
warning is printed and the kernel is tainted with TAINT_CPU_OUT_OF_SPEC.
This is problematic in two respects.
First, as discussed in the thread [1], the CPU cache line size is only a
problem for non-coherent devices.
Second, the kernel should not be tainted with TAINT_CPU_OUT_OF_SPEC in
this case, because according to the CTR_EL0.CWG specification the maximum
cache writeback granule is 2048 bytes (CWG == 0b1001).
This patch series tries to:
- Show the warning only if the device is non-coherent and
  ARCH_DMA_MINALIGN is smaller than the CPU cache line size.
- Show the warning and taint the kernel with TAINT_CPU_OUT_OF_SPEC if the
  cache line size is greater than the architectural maximum.
[1] https://lore.kernel.org/linux-arm-kernel/20180514145703.celnlobzn3uh5tc2@localhost/
Masayoshi Mizuma (2):
arm64/mm: check cpu cache line size with non-coherent device
arm64/mm: show TAINT_CPU_OUT_OF_SPEC warning if the cache size is over
the spec.
arch/arm64/include/asm/cache.h | 2 ++
arch/arm64/mm/dma-mapping.c | 9 +++++----
arch/arm64/mm/init.c | 5 +++++
3 files changed, 12 insertions(+), 4 deletions(-)
--
2.20.1
From: Masayoshi Mizuma <[email protected]>
As discussed in the thread [1], the CPU cache line size is only a problem
for non-coherent devices, and the coherent flag is already available in
struct device.
Show the warning only if the device is non-coherent and ARCH_DMA_MINALIGN
is smaller than the CPU cache line size.
[1] https://lore.kernel.org/linux-arm-kernel/20180514145703.celnlobzn3uh5tc2@localhost/
Signed-off-by: Masayoshi Mizuma <[email protected]>
Reviewed-by: Hidetoshi Seto <[email protected]>
Tested-by: Zhang Lei <[email protected]>
---
arch/arm64/mm/dma-mapping.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 674860e3e478..c0c09890c845 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -91,10 +91,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
static int __init arm64_dma_init(void)
{
- WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
- TAINT_CPU_OUT_OF_SPEC,
- "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
- ARCH_DMA_MINALIGN, cache_line_size());
return dma_atomic_pool_init(GFP_DMA32, __pgprot(PROT_NORMAL_NC));
}
arch_initcall(arm64_dma_init);
@@ -473,6 +469,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
const struct iommu_ops *iommu, bool coherent)
{
dev->dma_coherent = coherent;
+
+ if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
+ dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
+ ARCH_DMA_MINALIGN, cache_line_size());
+
__iommu_setup_dma_ops(dev, dma_base, size, iommu);
#ifdef CONFIG_XEN
--
2.20.1
From: Masayoshi Mizuma <[email protected]>
Show the warning and taint the kernel with TAINT_CPU_OUT_OF_SPEC if the
cache line size is greater than the architectural maximum.
Signed-off-by: Masayoshi Mizuma <[email protected]>
Reviewed-by: Hidetoshi Seto <[email protected]>
Tested-by: Zhang Lei <[email protected]>
---
arch/arm64/include/asm/cache.h | 2 ++
arch/arm64/mm/init.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index 926434f413fa..636e277fefc9 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -91,6 +91,8 @@ static inline u32 cache_type_cwg(void)
#define __read_mostly __attribute__((__section__(".data..read_mostly")))
+#define ARM64_MAX_CACHE_LINE_SIZE 2048
+
static inline int cache_line_size(void)
{
u32 cwg = cache_type_cwg();
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index d2adffb81b5d..df621d90b19c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -562,6 +562,11 @@ void __init mem_init(void)
*/
sysctl_overcommit_memory = OVERCOMMIT_ALWAYS;
}
+
+ WARN_TAINT(cache_line_size() > ARM64_MAX_CACHE_LINE_SIZE,
+ TAINT_CPU_OUT_OF_SPEC,
+ "CTR_EL0.CWG is greater than the spec (%d > %d)",
+ cache_line_size(), ARM64_MAX_CACHE_LINE_SIZE);
}
void free_initmem(void)
--
2.20.1
On Tue, Jun 11, 2019 at 11:17:31AM -0400, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma <[email protected]>
>
> Show the warning and taint the kernel with TAINT_CPU_OUT_OF_SPEC if the
> cache line size is greater than the architectural maximum.
In general the "out of spec" part is a misnomer, we tend to apply it to
CPU features that are not supported by the kernel rather than some CPU
feature not compliant with the architecture (we call the latter errata).
I suggest you drop this patch.
--
Catalin
On Tue, Jun 11, 2019 at 04:41:06PM +0100, Catalin Marinas wrote:
> On Tue, Jun 11, 2019 at 11:17:31AM -0400, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma <[email protected]>
> >
> > Show the warning and taint the kernel with TAINT_CPU_OUT_OF_SPEC if the
> > cache line size is greater than the architectural maximum.
>
> In general the "out of spec" part is a misnomer, we tend to apply it to
> CPU features that are not supported by the kernel rather than some CPU
> feature not compliant with the architecture (we call the latter errata).
>
> I suggest you drop this patch.
Thank you for your comments. I agree with you, so I'll drop this patch.
Thanks,
Masa
On Tue, Jun 11, 2019 at 11:17:30AM -0400, Masayoshi Mizuma wrote:
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -91,10 +91,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
>
> static int __init arm64_dma_init(void)
> {
> - WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
> - TAINT_CPU_OUT_OF_SPEC,
> - "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> - ARCH_DMA_MINALIGN, cache_line_size());
> return dma_atomic_pool_init(GFP_DMA32, __pgprot(PROT_NORMAL_NC));
> }
> arch_initcall(arm64_dma_init);
> @@ -473,6 +469,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> const struct iommu_ops *iommu, bool coherent)
> {
> dev->dma_coherent = coherent;
> +
> + if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
> + dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> + ARCH_DMA_MINALIGN, cache_line_size());
I'm ok in principle with this patch, with the minor issue that since
commit 7b8c87b297a7 ("arm64: cacheinfo: Update cache_line_size detected
from DT or PPTT") queued for 5.3 cache_line_size() gets the information
from DT or ACPI. The reason for this change is that the information is
used for performance tuning rather than DMA coherency.
You can go for a direct cache_type_cwg() check in here, unless Robin
(cc'ed) has a better idea.
--
Catalin
On Tue, Jun 11, 2019 at 07:00:07PM +0100, Catalin Marinas wrote:
> On Tue, Jun 11, 2019 at 11:17:30AM -0400, Masayoshi Mizuma wrote:
> > --- a/arch/arm64/mm/dma-mapping.c
> > +++ b/arch/arm64/mm/dma-mapping.c
> > @@ -91,10 +91,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
> >
> > static int __init arm64_dma_init(void)
> > {
> > - WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
> > - TAINT_CPU_OUT_OF_SPEC,
> > - "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> > - ARCH_DMA_MINALIGN, cache_line_size());
> > return dma_atomic_pool_init(GFP_DMA32, __pgprot(PROT_NORMAL_NC));
> > }
> > arch_initcall(arm64_dma_init);
> > @@ -473,6 +469,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > const struct iommu_ops *iommu, bool coherent)
> > {
> > dev->dma_coherent = coherent;
> > +
> > + if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
> > + dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> > + ARCH_DMA_MINALIGN, cache_line_size());
>
> I'm ok in principle with this patch, with the minor issue that since
> commit 7b8c87b297a7 ("arm64: cacheinfo: Update cache_line_size detected
> from DT or PPTT") queued for 5.3 cache_line_size() gets the information
> from DT or ACPI. The reason for this change is that the information is
> used for performance tuning rather than DMA coherency.
>
> You can go for a direct cache_type_cwg() check in here, unless Robin
> (cc'ed) has a better idea.
Got it, thanks.
I believe coherency_max_size is zero when coherent is false, so I'll
modify the patch as follows. Does that make sense?
@@ -57,6 +53,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
const struct iommu_ops *iommu, bool coherent)
{
dev->dma_coherent = coherent;
+
+ if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
+ dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
+ ARCH_DMA_MINALIGN, (4 << cache_type_cwg()));
+
if (iommu)
iommu_setup_dma_ops(dev, dma_base, size);
Thanks,
Masa
On Tue, Jun 11, 2019 at 06:02:47PM -0400, Masayoshi Mizuma wrote:
> On Tue, Jun 11, 2019 at 07:00:07PM +0100, Catalin Marinas wrote:
> > On Tue, Jun 11, 2019 at 11:17:30AM -0400, Masayoshi Mizuma wrote:
> > > --- a/arch/arm64/mm/dma-mapping.c
> > > +++ b/arch/arm64/mm/dma-mapping.c
> > > @@ -91,10 +91,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
> > >
> > > static int __init arm64_dma_init(void)
> > > {
> > > - WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
> > > - TAINT_CPU_OUT_OF_SPEC,
> > > - "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> > > - ARCH_DMA_MINALIGN, cache_line_size());
> > > return dma_atomic_pool_init(GFP_DMA32, __pgprot(PROT_NORMAL_NC));
> > > }
> > > arch_initcall(arm64_dma_init);
> > > @@ -473,6 +469,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > > const struct iommu_ops *iommu, bool coherent)
> > > {
> > > dev->dma_coherent = coherent;
> > > +
> > > + if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
> > > + dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> > > + ARCH_DMA_MINALIGN, cache_line_size());
> >
> > I'm ok in principle with this patch, with the minor issue that since
> > commit 7b8c87b297a7 ("arm64: cacheinfo: Update cache_line_size detected
> > from DT or PPTT") queued for 5.3 cache_line_size() gets the information
> > from DT or ACPI. The reason for this change is that the information is
> > used for performance tuning rather than DMA coherency.
> >
> > You can go for a direct cache_type_cwg() check in here, unless Robin
> > (cc'ed) has a better idea.
>
> Got it, thanks.
> I believe coherency_max_size is zero when coherent is false, so I'll
> modify the patch as follows. Does that make sense?
The coherency_max_size gives you the largest cache line in the system,
independent of whether a device is coherent or not. You may have a
device that does not snoop L1/L2 but there is a transparent L3 (system
cache) with a larger cache line that the device may be able to snoop.
The coherency_max_size and therefore cache_line_size() would give you
this L3 value but the device would work fine since CWG <=
ARCH_DMA_MINALIGN.
>
> @@ -57,6 +53,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> const struct iommu_ops *iommu, bool coherent)
> {
> dev->dma_coherent = coherent;
> +
> + if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
> + dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> + ARCH_DMA_MINALIGN, (4 << cache_type_cwg()));
> +
> if (iommu)
> iommu_setup_dma_ops(dev, dma_base, size);
I think the easiest here is to add a local variable:
int cls = 4 << cache_type_cwg();
and check it against ARCH_DMA_MINALIGN.
--
Catalin
On 13/06/2019 16:54, Catalin Marinas wrote:
> On Tue, Jun 11, 2019 at 06:02:47PM -0400, Masayoshi Mizuma wrote:
>> On Tue, Jun 11, 2019 at 07:00:07PM +0100, Catalin Marinas wrote:
>>> On Tue, Jun 11, 2019 at 11:17:30AM -0400, Masayoshi Mizuma wrote:
>>>> --- a/arch/arm64/mm/dma-mapping.c
>>>> +++ b/arch/arm64/mm/dma-mapping.c
>>>> @@ -91,10 +91,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
>>>>
>>>> static int __init arm64_dma_init(void)
>>>> {
>>>> - WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
>>>> - TAINT_CPU_OUT_OF_SPEC,
>>>> - "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
>>>> - ARCH_DMA_MINALIGN, cache_line_size());
>>>> return dma_atomic_pool_init(GFP_DMA32, __pgprot(PROT_NORMAL_NC));
>>>> }
>>>> arch_initcall(arm64_dma_init);
>>>> @@ -473,6 +469,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>>>> const struct iommu_ops *iommu, bool coherent)
>>>> {
>>>> dev->dma_coherent = coherent;
>>>> +
>>>> + if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
>>>> + dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
>>>> + ARCH_DMA_MINALIGN, cache_line_size());
>>>
>>> I'm ok in principle with this patch, with the minor issue that since
>>> commit 7b8c87b297a7 ("arm64: cacheinfo: Update cache_line_size detected
>>> from DT or PPTT") queued for 5.3 cache_line_size() gets the information
>>> from DT or ACPI. The reason for this change is that the information is
>>> used for performance tuning rather than DMA coherency.
>>>
>>> You can go for a direct cache_type_cwg() check in here, unless Robin
>>> (cc'ed) has a better idea.
>>
>> Got it, thanks.
>> I believe coherency_max_size is zero when coherent is false, so I'll
>> modify the patch as follows. Does that make sense?
>
> The coherency_max_size gives you the largest cache line in the system,
> independent of whether a device is coherent or not. You may have a
> device that does not snoop L1/L2 but there is a transparent L3 (system
> cache) with a larger cache line that the device may be able to snoop.
> The coherency_max_size and therefore cache_line_size() would give you
> this L3 value but the device would work fine since CWG <=
> ARCH_DMA_MINALIGN.
>
>>
>> @@ -57,6 +53,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>> const struct iommu_ops *iommu, bool coherent)
>> {
>> dev->dma_coherent = coherent;
>> +
>> + if (!coherent && (cache_line_size() > ARCH_DMA_MINALIGN))
>> + dev_WARN(dev, "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
>> + ARCH_DMA_MINALIGN, (4 << cache_type_cwg()));
>> +
>> if (iommu)
>> iommu_setup_dma_ops(dev, dma_base, size);
>
> I think the easiest here is to add a local variable:
>
> int cls = 4 << cache_type_cwg();
>
> and check it against ARCH_DMA_MINALIGN.
>
Agreed, and I'd say we should keep the taint too, since if this situation
were ever hit, the resulting crashes would be weird and random and not
obviously DMA-related.
Robin.
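For illustration, here is a rough sketch of how Catalin's local-variable
suggestion and Robin's request to keep the taint might combine in
arch_setup_dma_ops() (untested and hypothetical; not a patch posted in
this thread):

	void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
				const struct iommu_ops *iommu, bool coherent)
	{
		/*
		 * Derive the line size directly from CTR_EL0.CWG rather than
		 * cache_line_size(), which may now come from DT or ACPI.
		 */
		int cls = 4 << cache_type_cwg();

		dev->dma_coherent = coherent;

		/*
		 * Warn and taint only for non-coherent devices whose writeback
		 * granule exceeds the kernel's guaranteed DMA alignment.
		 */
		WARN_TAINT(!coherent && cls > ARCH_DMA_MINALIGN,
			   TAINT_CPU_OUT_OF_SPEC,
			   "%s: ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
			   dev_name(dev), ARCH_DMA_MINALIGN, cls);

		if (iommu)
			iommu_setup_dma_ops(dev, dma_base, size);
		/* ... rest of the function unchanged ... */
	}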