2016-10-21 09:53:54

by Naga Sureshkumar Relli

Subject: UBIFS with dma on 4.6 kernel is not working

Hi,

This is regarding UBIFS on 4.6 kernel.
We have tested UBIFS with our ZynqMP SoC QSPI controller, and UBIFS is not working with DMA on this kernel.
Controller driver: https://github.com/torvalds/linux/commits/master/drivers/spi/spi-zynqmp-gqspi.c
If I replace all vmalloc allocations in fs/ubifs/ with kmalloc, then UBIFS works fine with DMA.
However, on kernels before 4.6, UBIFS works fine with DMA without changing vmalloc to kmalloc.
So is there any DMA-related change in UBIFS in the 4.6 kernel?

Could you give me some information regarding this?
Why does UBIFS work with DMA on kernels before 4.6 but not on 4.6?
Nowadays, most QSPI controllers have internal DMAs.

Could you please provide some info regarding this DMA issue?
We can change our controller driver to operate in IO mode (which doesn't use DMA), but performance-wise that is not preferred.

Thanks,
Naga Sureshkumar Relli



2016-10-21 09:29:28

by Richard Weinberger

Subject: Re: UBIFS with dma on 4.6 kernel is not working

Hi!

On 21.10.2016 11:21, Naga Sureshkumar Relli wrote:
> Hi,
>
> This is regarding UBIFS on 4.6 kernel.
> We have tested UBIFS with our ZynqMP SoC QSPI controller, and UBIFS is not working with DMA on this kernel.
> Controller driver: https://github.com/torvalds/linux/commits/master/drivers/spi/spi-zynqmp-gqspi.c
> If I replace all vmalloc allocations in fs/ubifs/ with kmalloc, then UBIFS works fine with DMA.

No, it will sooner or later OOM. Both UBI and UBIFS need rather large buffers, that's why we have to use
vmalloc().

> However, on kernels before 4.6, UBIFS works fine with DMA without changing vmalloc to kmalloc.
> So is there any DMA-related change in UBIFS in the 4.6 kernel?

I'm not aware of such a change.
Do you see this with vanilla kernels? Maybe some other internal stuff has changed.
git bisect can help.

DMA to vmalloc'ed memory is not good; it may work by chance if you transfer less than PAGE_SIZE.
Especially on ARM.

> Could you give me some information regarding this?
> Why does UBIFS work with DMA on kernels before 4.6 but not on 4.6?
> Nowadays, most QSPI controllers have internal DMAs.
>
> Could you please provide some info regarding this DMA issue?
> We can change our controller driver to operate in IO mode (which doesn't use DMA), but performance-wise that is not preferred.

Most MTD drivers use a bounce buffer.
How much does your performance degrade?

Thanks,
//richard

2016-10-21 11:57:20

by Cyrille Pitchen

Subject: Re: UBIFS with dma on 4.6 kernel is not working

Hi all,

On 21/10/2016 at 11:29, Richard Weinberger wrote:
> Hi!
>
> On 21.10.2016 11:21, Naga Sureshkumar Relli wrote:
>> Hi,
>>
>> This is regarding UBIFS on 4.6 kernel.
>> We have tested UBIFS with our ZynqMP SoC QSPI controller, and UBIFS is not working with DMA on this kernel.
>> Controller driver: https://github.com/torvalds/linux/commits/master/drivers/spi/spi-zynqmp-gqspi.c
>> If I replace all vmalloc allocations in fs/ubifs/ with kmalloc, then UBIFS works fine with DMA.
>
> No, it will sooner or later OOM. Both UBI and UBIFS need rather large buffers, that's why we have to use
> vmalloc().
>
>> However, on kernels before 4.6, UBIFS works fine with DMA without changing vmalloc to kmalloc.
>> So is there any DMA-related change in UBIFS in the 4.6 kernel?
>
> I'm not aware of such a change.
> Do you see this with vanilla kernels? Maybe some other internal stuff has changed.
> git bisect can help.
>
> DMA to vmalloc'ed memory is not good; it may work by chance if you transfer less than PAGE_SIZE.
> Especially on ARM.
>

Indeed, we have the very same issue with the Atmel SPI (not QSPI) controller driver
(drivers/spi/spi-atmel.c). Both the Atmel and Zynq drivers call dma_map_single()
on an address pointing inside the rx_buf or tx_buf members of the struct
spi_transfer.

dma_map_single() can only map *physically* contiguous memory pages. Memory
returned by kmalloc() is guaranteed to be physically contiguous, which is why
it worked when the memory was allocated with kmalloc(). Now that the memory
is allocated with vmalloc(), crashes occur: with vmalloc() there is no
guarantee that the memory pages are physically contiguous.
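
To make this concrete, the failing pattern is roughly the following (a sketch
with illustrative names, not the actual driver code):

#include <linux/dma-mapping.h>
#include <linux/spi/spi.h>

/*
 * Sketch of the failing pattern: dma_map_single() translates the CPU
 * address with virt_to_page(), which is only valid for lowmem (kmalloc)
 * addresses. When xfer->tx_buf comes from vmalloc(), the returned DMA
 * address is bogus and the transfer corrupts memory or crashes.
 */
static dma_addr_t map_xfer_buf(struct device *dev, struct spi_transfer *xfer)
{
	return dma_map_single(dev, (void *)xfer->tx_buf, xfer->len,
			      DMA_TO_DEVICE);
}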

I have started to fix the Atmel SPI controller driver to get rid of the
dma_map_single() calls. To avoid the use of a bounce buffer and the resulting
performance drop, I've tried to rely on the spi_map_msg() function from the SPI
framework. I've just implemented the master->can_dma() handler. Then the SPI
framework calls spi_map_msg() just before master->transfer_one() in
__spi_pump_messages(). spi_map_msg() calls __spi_map_msg(), which in turn calls
master->can_dma() and eventually spi_map_buf() to build a scatter-gather list
in xfer->tx_sg or xfer->rx_sg and map it with dma_map_sg().
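
To give an idea, the handler itself boils down to something like this (an
untested sketch; the DMA_MIN_BYTES threshold and function names are
placeholders, not the actual spi-atmel code):

#include <linux/spi/spi.h>

/* Transfers smaller than this are not worth the DMA setup cost. */
#define DMA_MIN_BYTES	16

/*
 * When this returns true, the SPI core runs spi_map_buf()/dma_map_sg()
 * for us and fills xfer->tx_sg / xfer->rx_sg, so the driver never hands
 * the (possibly vmalloc'ed) virtual address to dma_map_single().
 */
static bool atmel_spi_can_dma(struct spi_master *master,
			      struct spi_device *spi,
			      struct spi_transfer *xfer)
{
	return xfer->len >= DMA_MIN_BYTES;
}

static void atmel_spi_setup_dma(struct spi_master *master)
{
	/* The core also needs master->dma_tx / master->dma_rx channels so
	 * that __spi_map_msg() knows which device to map the buffers for. */
	master->can_dma = atmel_spi_can_dma;
}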

For the few tests I have run so far, it works on a sama5d2. However, I warn you
that there is a theoretical risk/limitation in mapping vmalloc'ed memory for DMA
purposes. I've read that the DMA API doesn't handle cache aliases for memory
areas other than the DMA-capable areas, and vmalloc'ed memory falls into one
of those non-DMA-capable areas.
Hence, depending on the cache model (PIPT, VIPT or VIVT) you might expect cache
coherency issues if the same physical memory page is mapped to two or more
different virtual memory addresses. It is safe with a Physically Indexed
Physically Tagged cache model but unsafe with Virtually Indexed cache models.
As the name says, the cache line is indexed by the virtual address, so even
if the physical page is the same, each virtual address may have its own
cache line. Then, when you call dma_map_sg() on the first virtual address, the
function is likely not to be aware of the second virtual address, so it would
not clean/invalidate the cache line associated with that second virtual
address.

This is theoretical; honestly, I don't know whether there is a good reason to
map the same physical page twice in virtual memory...
Anyway, I'm not an expert in DMA and cache aliases, so if anyone has real
knowledge of the actual issues we may face using spi_map_msg(), it would be
nice to share it with us! :)

Best regards,

Cyrille

>> Could you give me some information regarding this?
>> Why does UBIFS work with DMA on kernels before 4.6 but not on 4.6?
>> Nowadays, most QSPI controllers have internal DMAs.
>>
>> Could you please provide some info regarding this DMA issue?
>> We can change our controller driver to operate in IO mode (which doesn't use DMA), but performance-wise that is not preferred.
>
> Most MTD drivers use a bounce buffer.
> How much does your performance degrade?
>
> Thanks,
> //richard
>

2016-10-21 12:53:20

by Christoph Hellwig

Subject: Re: UBIFS with dma on 4.6 kernel is not working

On Fri, Oct 21, 2016 at 11:29:16AM +0200, Richard Weinberger wrote:
> DMA to vmalloc'ed memory is not good; it may work by chance if you
> transfer less than PAGE_SIZE.
> Especially on ARM.

DMA to vmalloc'ed or vmapped memory is perfectly fine; you just have to be
very careful.

I would suggest not exposing the vmalloc address to the lower layers
that do DMA, but instead exposing the pages, either as an array or a
scatterlist. Either allocate the pages using the normal page allocator
and then use vm_map_ram to generate a virtual address for them (that
is what XFS does for its large metadata objects, for example), or, if
you can't do that, iterate over the vmalloc address in page-size chunks
and use vmalloc_to_page (we still do that for one piece of legacy
cruft in XFS, but I'd rather avoid it for new designs).
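
For the vmalloc_to_page variant, roughly something like this (untested
sketch, the function name is made up):

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/scatterlist.h>

/*
 * Walk a vmalloc'ed buffer in PAGE_SIZE chunks and build a scatterlist
 * from the underlying pages; the result can then be handed to
 * dma_map_sg(). Assumes sgl has at least nents entries available.
 */
static int vmalloc_buf_to_sg(void *buf, size_t len,
			     struct scatterlist *sgl, int nents)
{
	int i = 0;

	sg_init_table(sgl, nents);

	while (len) {
		struct page *page = vmalloc_to_page(buf);
		size_t chunk = min_t(size_t, len,
				     PAGE_SIZE - offset_in_page(buf));

		if (!page || i >= nents)
			return -EINVAL;

		sg_set_page(&sgl[i++], page, chunk, offset_in_page(buf));
		buf += chunk;
		len -= chunk;
	}

	if (i)
		sg_mark_end(&sgl[i - 1]);

	return i;	/* number of scatterlist entries used */
}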

2016-10-21 13:08:05

by Richard Weinberger

Subject: Re: UBIFS with dma on 4.6 kernel is not working

Christoph,

On 21.10.2016 14:53, Christoph Hellwig wrote:
> On Fri, Oct 21, 2016 at 11:29:16AM +0200, Richard Weinberger wrote:
>> DMA to vmalloc'ed memory is not good; it may work by chance if you
>> transfer less than PAGE_SIZE.
>> Especially on ARM.
>
> DMA to vmalloc'ed or vmapped memory is perfectly fine; you just have to be
> very careful.
>
> I would suggest not exposing the vmalloc address to the lower layers
> that do DMA, but instead exposing the pages, either as an array or a
> scatterlist. Either allocate the pages using the normal page allocator
> and then use vm_map_ram to generate a virtual address for them (that
> is what XFS does for its large metadata objects, for example), or, if
> you can't do that, iterate over the vmalloc address in page-size chunks
> and use vmalloc_to_page (we still do that for one piece of legacy
> cruft in XFS, but I'd rather avoid it for new designs).
>

Hmm, I thought this was still problematic on VIVT architectures.
Boris tried to provide a solution for that some time ago:
http://www.spinics.net/lists/arm-kernel/msg494025.html

Thanks,
//richard

2016-10-21 13:15:10

by Christoph Hellwig

Subject: Re: UBIFS with dma on 4.6 kernel is not working

On Fri, Oct 21, 2016 at 03:07:57PM +0200, Richard Weinberger wrote:
> Hmm, I thought this was still problematic on VIVT architectures.
> Boris tried to provide a solution for that some time ago:
> http://www.spinics.net/lists/arm-kernel/msg494025.html

Things have been working fine for approximately 10 years in XFS when using
flush_kernel_vmap_range before doing I/O using the physical addresses, and
then invalidate_kernel_vmap_range when completing the I/O and going back to
using the virtual mapping.

Of course, all this assumes that at least the higher level that did the
vm_map_ram operation knows about this dance between virtually mapped and
physical addresses.
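
In code, the dance looks roughly like this (untested sketch; do_dma_io() is
just a placeholder for whatever actually submits the transfer using the
physical pages):

#include <linux/highmem.h>
#include <linux/types.h>

/* Hypothetical submission path working on the physical pages. */
extern int do_dma_io(void *buf, int len);

static int dma_io_on_vmapped_buf(void *buf, int len, bool is_read)
{
	int ret;

	/* Write back dirty cache lines of the virtual mapping before the
	 * device looks at (or overwrites) the physical pages. */
	flush_kernel_vmap_range(buf, len);

	ret = do_dma_io(buf, len);

	/* After a device-to-memory transfer, drop stale cache lines of the
	 * virtual mapping before the CPU reads the buffer again. */
	if (is_read)
		invalidate_kernel_vmap_range(buf, len);

	return ret;
}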

2016-10-24 07:08:41

by Richard Weinberger

Subject: Re: UBIFS with dma on 4.6 kernel is not working

Christoph,

On 21.10.2016 15:15, Christoph Hellwig wrote:
> On Fri, Oct 21, 2016 at 03:07:57PM +0200, Richard Weinberger wrote:
>> Hmm, I thought this was still problematic on VIVT architectures.
>> Boris tried to provide a solution for that some time ago:
>> http://www.spinics.net/lists/arm-kernel/msg494025.html
>
> Things have been working fine for approximately 10 years in XFS when using
> flush_kernel_vmap_range before doing I/O using the physical addresses, and
> then invalidate_kernel_vmap_range when completing the I/O and going back to
> using the virtual mapping.
>
> Of course, all this assumes that at least the higher level that did the
> vm_map_ram operation knows about this dance between virtually mapped and
> physical addresses.

Good to know, I was clearly wrong.

Let's see whether the costs of flush_kernel_vmap_range and invalidate_kernel_vmap_range
are smaller than the speedup gained from DMA on embedded platforms.
We'll have to test it.

Thanks,
//richard

2016-10-25 07:25:26

by Naga Sureshkumar Relli

Subject: RE: UBIFS with dma on 4.6 kernel is not working

Hi,

Thanks everybody for your valuable information.

I am not familiar with all of these DMA-related APIs, but where should this DMA handling be done?
Is it in UBI/UBIFS (at the time of the vmalloc allocations), or in the controller driver?

Also, is there a way to know whether the memory allocated using vmalloc is physically contiguous or not?
Based on that, I can switch my driver between DMA and non-DMA mode for UBIFS use.
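
Roughly, something like the following is what I have in mind (an untested
sketch; I don't know if is_vmalloc_addr() is the right check, and the
function name is made up):

#include <linux/mm.h>

/*
 * Pick the transfer mode per buffer: fall back to IO mode whenever the
 * buffer is not a lowmem (kmalloc) address that dma_map_single() can
 * handle.
 */
static bool zynqmp_qspi_can_use_dma(const void *buf)
{
	return buf && !is_vmalloc_addr(buf);
}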

Thanks,
Naga Sureshkumar Relli

-----Original Message-----
From: Christoph Hellwig [mailto:[email protected]]
Sent: Friday, October 21, 2016 6:45 PM
To: Richard Weinberger <[email protected]>
Cc: Christoph Hellwig <[email protected]>; Naga Sureshkumar Relli <[email protected]>; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; Punnaiah Choudary Kalluri <[email protected]>; [email protected]; [email protected]; Boris Brezillon <[email protected]>
Subject: Re: UBIFS with dma on 4.6 kernel is not working

On Fri, Oct 21, 2016 at 03:07:57PM +0200, Richard Weinberger wrote:
> Hmm, I thought this was still problematic on VIVT architectures.
> Boris tried to provide a solution for that some time ago:
> http://www.spinics.net/lists/arm-kernel/msg494025.html

Things have been working fine for approximately 10 years in XFS when using flush_kernel_vmap_range before doing I/O using the physical addresses, and then invalidate_kernel_vmap_range when completing the I/O and going back to using the virtual mapping.

Of course, all this assumes that at least the higher level that did the vm_map_ram operation knows about this dance between virtually mapped and physical addresses.