This is what I have so far, which at least resolves the most obvious
problems. I still haven't got very far with the USB corruption issue
I see on Juno with -rc1, but I'm yet to confirm whether that's actually
attributable to the SWIOTLB changes or something else entirely.
Robin.
Robin Murphy (2):
swiotlb: Make DIRECT_MAPPING_ERROR viable
swiotlb: Skip cache maintenance on map error
include/linux/dma-direct.h | 2 +-
kernel/dma/swiotlb.c | 3 ++-
2 files changed, 3 insertions(+), 2 deletions(-)
--
2.19.1.dirty
With the overflow buffer removed, we no longer have a unique address
which is guaranteed not to be a valid DMA target to use as an error
token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
an unlikely DMA target, but unfortunately there are already SWIOTLB
users with DMA-able memory at physical address 0 which now gets falsely
treated as a mapping failure and leads to all manner of misbehaviour.
The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
commonly-used all-bits-set value, since the last single byte of memory
is by far the least-likely-valid DMA target.
Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer")
Reported-by: John Stultz <[email protected]>
Signed-off-by: Robin Murphy <[email protected]>
---
include/linux/dma-direct.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index bd73e7a91410..9de9c7ab39d6 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -5,7 +5,7 @@
#include <linux/dma-mapping.h>
#include <linux/mem_encrypt.h>
-#define DIRECT_MAPPING_ERROR 0
+#define DIRECT_MAPPING_ERROR ~(dma_addr_t)0
#ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
#include <asm/dma-direct.h>
--
2.19.1.dirty
On Tue, Nov 20, 2018 at 02:09:52PM +0000, Robin Murphy wrote:
> With the overflow buffer removed, we no longer have a unique address
> which is guaranteed not to be a valid DMA target to use as an error
> token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
> an unlikely DMA target, but unfortunately there are already SWIOTLB
> users with DMA-able memory at physical address 0 which now gets falsely
> treated as a mapping failure and leads to all manner of misbehaviour.
>
> The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
> commonly-used all-bits-set value, since the last single byte of memory
> is by far the least-likely-valid DMA target.
Are all the callers checking for DIRECT_MAPPING_ERROR or is it more of
a comparison (as in if (!ret)) ?
>
> Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer")
> Reported-by: John Stultz <[email protected]>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> include/linux/dma-direct.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index bd73e7a91410..9de9c7ab39d6 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -5,7 +5,7 @@
> #include <linux/dma-mapping.h>
> #include <linux/mem_encrypt.h>
>
> -#define DIRECT_MAPPING_ERROR 0
> +#define DIRECT_MAPPING_ERROR ~(dma_addr_t)0
>
> #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
> #include <asm/dma-direct.h>
> --
> 2.19.1.dirty
>
On Tue, Nov 20, 2018 at 02:09:53PM +0000, Robin Murphy wrote:
> If swiotlb_bounce_page() failed, calling arch_sync_dma_for_device() may
> lead to such delights as performing cache maintenance on whatever
> address phys_to_virt(SWIOTLB_MAP_ERROR) looks like, which is typically
> outside the kernel memory map and goes about as well as expected.
>
> Don't do that.
+Stefano
>
> Fixes: a4a4330db46a ("swiotlb: add support for non-coherent DMA")
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> kernel/dma/swiotlb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 5731daa09a32..045930e32c0e 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -679,7 +679,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
> }
>
> if (!dev_is_dma_coherent(dev) &&
> - (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> + (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0 &&
> + dev_addr != DIRECT_MAPPING_ERROR)
> arch_sync_dma_for_device(dev, phys, size, dir);
>
> return dev_addr;
> --
> 2.19.1.dirty
>
On 20/11/2018 14:49, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 20, 2018 at 02:09:52PM +0000, Robin Murphy wrote:
>> With the overflow buffer removed, we no longer have a unique address
>> which is guaranteed not to be a valid DMA target to use as an error
>> token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
>> an unlikely DMA target, but unfortunately there are already SWIOTLB
>> users with DMA-able memory at physical address 0 which now gets falsely
>> treated as a mapping failure and leads to all manner of misbehaviour.
>>
>> The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
>> commonly-used all-bits-set value, since the last single byte of memory
>> is by far the least-likely-valid DMA target.
>
> Are all the callers checking for DIRECT_MAPPING_ERROR or is it more of
> a comparison (as in if (!ret)) ?
dma_direct_map_page() and dma_direct_mapping_error() were already doing
the right thing, and external callers must rely on the latter via
dma_mapping_error() rather than trying to inspect the actual value
themselves, since that varies between implementations anyway. AFAICS all
the new return paths from swiotlb_map_page() are also robust in
referencing the macro explicitly, so I think we're good.
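
To illustrate the point about callers, here is a minimal userspace sketch, not kernel code. The names toy_map(), toy_mapping_error() and fragile_check() are hypothetical, and dma_addr_t is assumed to be a 64-bit integer here (the real width depends on the architecture). It shows why a check that goes through the macro keeps working when the token changes, while a bare "!ret" test does not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's dma_addr_t; assumed 64-bit for this sketch. */
typedef uint64_t dma_addr_t;

/* All-bits-set token as proposed in the patch, written with parentheses. */
#define DIRECT_MAPPING_ERROR (~(dma_addr_t)0)

/* Hypothetical mapping routine: identity-maps unless told to fail. */
static dma_addr_t toy_map(uint64_t phys, bool fail)
{
	return fail ? DIRECT_MAPPING_ERROR : (dma_addr_t)phys;
}

/* Modelled loosely on dma_direct_mapping_error(): compare with the macro. */
static bool toy_mapping_error(dma_addr_t addr)
{
	return addr == DIRECT_MAPPING_ERROR;
}

/* Fragile caller-side test that silently assumes the error token is 0. */
static bool fragile_check(dma_addr_t addr)
{
	return !addr;
}

/* Small self-check exercising both styles of test. */
static void toy_demo(void)
{
	/* A valid mapping at address 0 is not reported as an error. */
	assert(!toy_mapping_error(toy_map(0, false)));
	/* A genuine failure is reported. */
	assert(toy_mapping_error(toy_map(0x1000, true)));
	/* The fragile test flags the valid mapping at 0 ... */
	assert(fragile_check(toy_map(0, false)));
	/* ... and misses the genuine failure. */
	assert(!fragile_check(toy_map(0x1000, true)));
}
```

Calling toy_demo() from a test harness runs all four checks; the last two show how a "!ret" style caller would get both cases wrong once the token is no longer 0.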
Thanks,
Robin.
>> Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer")
>> Reported-by: John Stultz <[email protected]>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>> include/linux/dma-direct.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
>> index bd73e7a91410..9de9c7ab39d6 100644
>> --- a/include/linux/dma-direct.h
>> +++ b/include/linux/dma-direct.h
>> @@ -5,7 +5,7 @@
>> #include <linux/dma-mapping.h>
>> #include <linux/mem_encrypt.h>
>>
>> -#define DIRECT_MAPPING_ERROR 0
>> +#define DIRECT_MAPPING_ERROR ~(dma_addr_t)0
>>
>> #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
>> #include <asm/dma-direct.h>
>> --
>> 2.19.1.dirty
>>
If swiotlb_bounce_page() failed, calling arch_sync_dma_for_device() may
lead to such delights as performing cache maintenance on whatever
address phys_to_virt(SWIOTLB_MAP_ERROR) looks like, which is typically
outside the kernel memory map and goes about as well as expected.
Don't do that.
Fixes: a4a4330db46a ("swiotlb: add support for non-coherent DMA")
Signed-off-by: Robin Murphy <[email protected]>
---
kernel/dma/swiotlb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 5731daa09a32..045930e32c0e 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -679,7 +679,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
}
if (!dev_is_dma_coherent(dev) &&
- (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+ (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0 &&
+ dev_addr != DIRECT_MAPPING_ERROR)
arch_sync_dma_for_device(dev, phys, size, dir);
return dev_addr;
--
2.19.1.dirty
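
As a rough userspace model of the guard added above (not the kernel implementation): toy_map_page() and toy_sync_for_device() are hypothetical stand-ins, the coherent/skip_sync/bounce_fails flags replace the real device and attrs checks, and dma_addr_t is assumed to be 64-bit. The sketch only counts how often the simulated cache maintenance would run:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's dma_addr_t; assumed 64-bit for this sketch. */
typedef uint64_t dma_addr_t;
#define DIRECT_MAPPING_ERROR (~(dma_addr_t)0)

/* Counts simulated cache-maintenance operations. */
static int sync_calls;

/* Hypothetical stand-in for arch_sync_dma_for_device(). */
static void toy_sync_for_device(uint64_t phys)
{
	(void)phys;
	sync_calls++;
}

/* Hypothetical stand-in for the tail of swiotlb_map_page(). */
static dma_addr_t toy_map_page(uint64_t phys, bool coherent,
			       bool skip_sync, bool bounce_fails)
{
	dma_addr_t dev_addr =
		bounce_fails ? DIRECT_MAPPING_ERROR : (dma_addr_t)phys;

	/* Same three-way condition as the patched code: no sync on error. */
	if (!coherent && !skip_sync && dev_addr != DIRECT_MAPPING_ERROR)
		toy_sync_for_device(phys);

	return dev_addr;
}

/* Small self-check covering the failure, success and coherent cases. */
static void toy_map_demo(void)
{
	sync_calls = 0;
	/* Failed bounce on a non-coherent device: no maintenance. */
	assert(toy_map_page(0x1000, false, false, true) == DIRECT_MAPPING_ERROR);
	assert(sync_calls == 0);
	/* Successful mapping on a non-coherent device: one sync. */
	assert(toy_map_page(0x1000, false, false, false) == 0x1000);
	assert(sync_calls == 1);
	/* Coherent device: nothing further to sync. */
	toy_map_page(0x2000, true, false, false);
	assert(sync_calls == 1);
}
```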
The subject line should say dma-direct. This isn't really swiotlb
specific.
> +#define DIRECT_MAPPING_ERROR ~(dma_addr_t)0
Please add parentheses around the value like the other *MAPPING_ERROR
definitions so that it can be safely used in any context.
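
The general hazard behind this convention is easiest to see with a binary operator. The sketch below uses hypothetical macros (UNSAFE_TOKEN, SAFE_TOKEN) that are not from the kernel; for a unary expression such as ~(dma_addr_t)0 a failing context is much harder to construct, so the parentheses there are mostly about consistency and defensive style:

```c
#include <assert.h>

/* Hypothetical macros for illustration only, not taken from the kernel. */
#define UNSAFE_TOKEN 1 + 2	/* expands textually, with no grouping */
#define SAFE_TOKEN   (1 + 2)	/* grouping survives any surrounding use */

/* Expands to 1 + 2 * 2, so the multiplication binds first. */
static int unsafe_scaled(void)
{
	return UNSAFE_TOKEN * 2;
}

/* Expands to (1 + 2) * 2, which is what the reader expects. */
static int safe_scaled(void)
{
	return SAFE_TOKEN * 2;
}

/* Small self-check showing the two results differ. */
static void macro_demo(void)
{
	assert(unsafe_scaled() == 5);
	assert(safe_scaled() == 6);
}
```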
Otherwise looks good:
Reviewed-by: Christoph Hellwig <[email protected]>
On Tue, Nov 20, 2018 at 03:01:33PM +0000, Robin Murphy wrote:
> On 20/11/2018 14:49, Konrad Rzeszutek Wilk wrote:
> > On Tue, Nov 20, 2018 at 02:09:52PM +0000, Robin Murphy wrote:
> > > With the overflow buffer removed, we no longer have a unique address
> > > which is guaranteed not to be a valid DMA target to use as an error
> > > token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
> > > an unlikely DMA target, but unfortunately there are already SWIOTLB
> > > users with DMA-able memory at physical address 0 which now gets falsely
> > > treated as a mapping failure and leads to all manner of misbehaviour.
> > >
> > > The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
> > > commonly-used all-bits-set value, since the last single byte of memory
> > > is by far the least-likely-valid DMA target.
> >
> > Are all the callers checking for DIRECT_MAPPING_ERROR or is it more of
> > a comparison (as in if (!ret)) ?
>
> dma_direct_map_page() and dma_direct_mapping_error() were already doing the
> right thing, and external callers must rely on the latter via
> dma_mapping_error() rather than trying to inspect the actual value
> themselves, since that varies between implementations anyway. AFAICS all the
> new return paths from swiotlb_map_page() are also robust in referencing the
> macro explicitly, so I think we're good.
Cool! Thank you for checking.
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
Thank you!
>
> Thanks,
> Robin.
>
> > > Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer")
> > > Reported-by: John Stultz <[email protected]>
> > > Signed-off-by: Robin Murphy <[email protected]>
> > > ---
> > > include/linux/dma-direct.h | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> > > index bd73e7a91410..9de9c7ab39d6 100644
> > > --- a/include/linux/dma-direct.h
> > > +++ b/include/linux/dma-direct.h
> > > @@ -5,7 +5,7 @@
> > > #include <linux/dma-mapping.h>
> > > #include <linux/mem_encrypt.h>
> > > -#define DIRECT_MAPPING_ERROR 0
> > > +#define DIRECT_MAPPING_ERROR ~(dma_addr_t)0
> > > #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
> > > #include <asm/dma-direct.h>
> > > --
> > > 2.19.1.dirty
> > >
On Tue, Nov 20, 2018 at 02:09:53PM +0000, Robin Murphy wrote:
> If swiotlb_bounce_page() failed, calling arch_sync_dma_for_device() may
> lead to such delights as performing cache maintenance on whatever
> address phys_to_virt(SWIOTLB_MAP_ERROR) looks like, which is typically
> outside the kernel memory map and goes about as well as expected.
>
> Don't do that.
>
> Fixes: a4a4330db46a ("swiotlb: add support for non-coherent DMA")
> Signed-off-by: Robin Murphy <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Tue, Nov 20, 2018 at 02:09:51PM +0000, Robin Murphy wrote:
> This is what I have so far, which at least resolves the most obvious
> problems. I still haven't got very far with the USB corruption issue
> I see on Juno with -rc1, but I'm yet to confirm whether that's actually
> attributable to the SWIOTLB changes or something else entirely.
This looks good modulo the minor nitpicks.
Konrad, are you ok with me picking up both through the dma-mapping
tree?
On Tue, Nov 20, 2018 at 05:08:18PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 20, 2018 at 02:09:51PM +0000, Robin Murphy wrote:
> > This is what I have so far, which at least resolves the most obvious
> > problems. I still haven't got very far with the USB corruption issue
> > I see on Juno with -rc1, but I'm yet to confirm whether that's actually
> > attributable to the SWIOTLB changes or something else entirely.
>
> This looks good modulo the minor nitpicks.
>
> Konrad, are you ok with me picking up both through the dma-mapping
> tree?
Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
On Tue, Nov 20, 2018 at 6:10 AM Robin Murphy <[email protected]> wrote:
>
> This is what I have so far, which at least resolves the most obvious
> problems. I still haven't got very far with the USB corruption issue
> I see on Juno with -rc1, but I'm yet to confirm whether that's actually
> attributable to the SWIOTLB changes or something else entirely.
>
> Robin.
>
> Robin Murphy (2):
> swiotlb: Make DIRECT_MAPPING_ERROR viable
> swiotlb: Skip cache maintenance on map error
>
> include/linux/dma-direct.h | 2 +-
> kernel/dma/swiotlb.c | 3 ++-
> 2 files changed, 3 insertions(+), 2 deletions(-)
Thanks so much for chasing this down!
Unfortunately AOSP is giving me grief this week, so I've not been able
to test the full environment, but I don't seem to be hitting the io
hangs I was seeing earlier with this patch set.
For both:
Tested-by: John Stultz <[email protected]>
thanks
-john
On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > Konrad, are you ok with me picking up both through the dma-mapping
> > tree?
>
> Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
Stefano, can you take a look asap? This is a pretty trivial fix for
a nasty bug that breaks boot on real life systems. I'd like to merge
it by tomorrow so that I can send it off to Linus for the next rc
(I will be offline for about two days after Friday morning)
On Wed, Nov 21, 2018 at 02:03:31PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > > Konrad, are you ok with me picking up both through the dma-mapping
> > > tree?
> >
> > Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
>
> Stefano, can you take a look asap? This is a pretty trivial fix for
> a nasty bug that breaks boot on real life systems. I'd like to merge
> it by tomorrow so that I can send it off to Linus for the next rc
> (I will be offline for about two days after Friday morning)
It is Turkey Day in the US so he may be busy catching those pesky turkeys.
How about Tuesday? And I can also send the GIT PULL.
On Wed, 21 Nov 2018, Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 21, 2018 at 02:03:31PM +0100, Christoph Hellwig wrote:
> > On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > Konrad, are you ok with me picking up both through the dma-mapping
> > > > tree?
> > >
> > > Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
> >
> > Stefano, can you take a look asap? This is a pretty trivial fix for
> > a nasty bug that breaks boot on real life systems. I'd like to merge
> > it by tomorrow so that I can send it off to Linus for the next rc
> > (I will be offline for about two days after Friday morning)
>
> It is Turkey Day in the US so he may be busy catching those pesky turkeys.
> How about Tuesday? And I can also send the GIT PULL.
I have just come back to the office. I'll give it a look today.
On Tue, 27 Nov 2018, Stefano Stabellini wrote:
> On Wed, 21 Nov 2018, Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 21, 2018 at 02:03:31PM +0100, Christoph Hellwig wrote:
> > > On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > Konrad, are you ok with me picking up both through the dma-mapping
> > > > > tree?
> > > >
> > > > Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
> > >
> > > Stefano, can you take a look asap? This is a pretty trivial fix for
> > > a nasty bug that breaks boot on real life systems. I'd like to merge
> > > it by tomorrow so that I can send it off to Linus for the next rc
> > > (I will be offline for about two days after Friday morning)
> >
> > It is Turkey Day in the US so he may be busy catching those pesky turkeys.
> > How about Tuesday? And I can also send the GIT PULL.
>
> I have just come back to the office. I'll give it a look today.
Everything looks good to me, thanks for keeping me in the loop.