From: Michael Zoran <[email protected]>
Hi,
I wrote an alternative implementation of vchiq that uses dma_map_sg
instead of dma_map_page, and I've included some benchmarks.
As you can see, for larger message sizes the dma_map_sg
implementation is faster than the original, unportable dmac_map_area
implementation. So it's really a question of how important
portability and getting this driver accepted upstream are vs.
performance with small message sizes.
(also included in patch 2/2)
Test                       dmac_map_area  dma_map_page  dma_map_sg
vchiq_test -b 4 10000      51us/iter      76us/iter     76us/iter
vchiq_test -b 8 10000      70us/iter      82us/iter     91us/iter
vchiq_test -b 16 10000     94us/iter      118us/iter    121us/iter
vchiq_test -b 32 10000     146us/iter     173us/iter    187us/iter
vchiq_test -b 64 10000     263us/iter     328us/iter    299us/iter
vchiq_test -b 128 10000    529us/iter     631us/iter    595us/iter
vchiq_test -b 256 10000    2285us/iter    2275us/iter   2001us/iter
vchiq_test -b 512 10000    4372us/iter    4616us/iter   4123us/iter
So for message sizes >= 64KB, dma_map_sg is faster than dma_map_page.
For message sizes >= 256KB, dma_map_sg is the fastest of the three
implementations.
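
The two thresholds can be rechecked mechanically from the figures in
the table. The following is only a small userspace sketch, not part of
the patches: the struct and function names are made up for
illustration, and it assumes the -b argument is the message size in KB.

```c
#include <stddef.h>

/* One row of the benchmark table: message size in KB, then us/iter
 * for each implementation. All names here are hypothetical. */
struct bench_row {
	unsigned int size_kb;
	unsigned int dmac_map_area_us;
	unsigned int dma_map_page_us;
	unsigned int dma_map_sg_us;
};

/* Figures copied from the benchmark table in this mail. */
static const struct bench_row bench_rows[] = {
	{   4,   51,   76,   76 },
	{   8,   70,   82,   91 },
	{  16,   94,  118,  121 },
	{  32,  146,  173,  187 },
	{  64,  263,  328,  299 },
	{ 128,  529,  631,  595 },
	{ 256, 2285, 2275, 2001 },
	{ 512, 4372, 4616, 4123 },
};

#define BENCH_ROW_COUNT (sizeof(bench_rows) / sizeof(bench_rows[0]))

/* Smallest size (KB) from which dma_map_sg beats dma_map_page on that
 * row and on every larger row. Returns 0 if it never does. */
static unsigned int sg_beats_page_from(void)
{
	unsigned int threshold = 0;
	size_t i;

	/* Walk from the largest size down until the claim stops holding. */
	for (i = BENCH_ROW_COUNT; i-- > 0; ) {
		if (bench_rows[i].dma_map_sg_us >= bench_rows[i].dma_map_page_us)
			break;
		threshold = bench_rows[i].size_kb;
	}
	return threshold;
}

/* Smallest size (KB) from which dma_map_sg is the fastest of all three
 * implementations on that row and on every larger row. Returns 0 if
 * it never is. */
static unsigned int sg_fastest_from(void)
{
	unsigned int threshold = 0;
	size_t i;

	for (i = BENCH_ROW_COUNT; i-- > 0; ) {
		const struct bench_row *r = &bench_rows[i];

		if (r->dma_map_sg_us >= r->dma_map_page_us ||
		    r->dma_map_sg_us >= r->dmac_map_area_us)
			break;
		threshold = r->size_kb;
	}
	return threshold;
}
```

With the numbers above, sg_beats_page_from() gives 64 and
sg_fastest_from() gives 256, matching the two statements.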
This series removes all the errors and all the warnings except:
In file included from drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c:34:0:
drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c: In function ‘parse_rx_slots’:
drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c:1622:29: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
DEBUG_VALUE(PARSE_HEADER, (int)header);
^
drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h:189:33: note: in definition of macro ‘DEBUG_VALUE’
do { debug_ptr[DEBUG_ ## d] = (v); dsb(sy); } while (0)
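
For context, the remaining warning appears when pointers are wider than
int, as on a 64-bit build, because (int)header then drops bits in a
single cast. The following is only a userspace sketch of one way such a
cast can be written so the narrowing is explicit. The helper name is
hypothetical, and this is not necessarily how the driver should be
fixed.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the driver: keep only the low 32
 * bits of a pointer for use as a debug value. Casting through
 * uintptr_t first converts between same-sized types, so
 * -Wpointer-to-int-cast is not triggered; the narrowing to 32 bits
 * is then an ordinary, well-defined unsigned conversion. */
static inline uint32_t ptr_low32(const void *ptr)
{
	return (uint32_t)(uintptr_t)ptr;
}
```

On a 32-bit build this returns the full pointer value; on a 64-bit
build the upper 32 bits are discarded, so the result is only useful as
a debugging hint.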
Thanks.
Michael Zoran (2):
staging: vc04_services: Fix unportable cast in vchiq_copy_from_user
staging: vc04_services: Replace dmac_map_area with dmac_map_sg
.../interface/vchiq_arm/vchiq_2835_arm.c | 155 ++++++++++++---------
1 file changed, 92 insertions(+), 63 deletions(-)
--
2.9.3