2016-04-26 16:30:51

by Robin Murphy

Subject: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

The current logic in pl330_get_desc() contains a clear race condition,
whereby if the descriptor pool is empty, we will create a new
descriptor, add it to the pool with the lock held, *release the lock*,
then try to remove it from the pool again.
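
To illustrate, the sequence is roughly this (a paraphrase of the
pre-patch logic, not the literal code):

	desc = pluck_desc(pl330);	/* pool empty: returns NULL */
	if (!desc) {
		add_desc(pl330, GFP_ATOMIC, 1);	/* add one desc under
						 * pool_lock, then drop
						 * the lock */
		/* <-- window: another thread can pluck that desc here */
		desc = pluck_desc(pl330);	/* pool may be empty again */
	}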

Furthermore, if that second attempt then finds the pool empty because
someone else got there first, we conclude that something must have gone
terribly wrong and it's the absolute end of the world. In practice,
however, this can be hit relatively easily by dmatest with a suitably
aggressive number of threads and channels on a multi-core system, and
the fallout is a veritable storm of this sort of thing:

...
[ 709.328891] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
** 10 printk messages dropped ** [ 709.352280] dma-pl330 7ff00000.dma: __pl330_prep_dma_memcpy:2493 Unable to fetch desc
** 11 printk messages dropped ** [ 709.372930] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
** 2 printk messages dropped ** [ 709.372953] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
** 8 printk messages dropped ** [ 709.374157] dma-pl330 7ff00000.dma: __pl330_prep_dma_memcpy:2493 Unable to fetch desc
** 41 printk messages dropped ** [ 709.393095] dmatest: dma0chan4-copy1: result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
...

Down with this sort of thing! The most sensible thing to do is avoid the
allocate-add-remove dance entirely and simply use the freshly-allocated
descriptor straight away. We can then further simplify the code and save
unnecessary work by also inlining the initial pool allocation and
removing add_desc() entirely.

Signed-off-by: Robin Murphy <[email protected]>
---

I'm also seeing what looks like another occasional race under the same
conditions where pl330_tx_submit() blows up from dma_cookie_assign()
dereferencing a bogus tx->chan, but I think that's beyond my ability to
figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
when using a large number of small buffers and dmatest.noverify=1. This
one was some obvious low-hanging fruit.

Robin.

drivers/dma/pl330.c | 45 +++++++++++++--------------------------------
1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 372b4359da97..a196adb358f1 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2393,29 +2393,6 @@ static inline void _init_desc(struct dma_pl330_desc *desc)
 	INIT_LIST_HEAD(&desc->node);
 }
 
-/* Returns the number of descriptors added to the DMAC pool */
-static int add_desc(struct pl330_dmac *pl330, gfp_t flg, int count)
-{
-	struct dma_pl330_desc *desc;
-	unsigned long flags;
-	int i;
-
-	desc = kcalloc(count, sizeof(*desc), flg);
-	if (!desc)
-		return 0;
-
-	spin_lock_irqsave(&pl330->pool_lock, flags);
-
-	for (i = 0; i < count; i++) {
-		_init_desc(&desc[i]);
-		list_add_tail(&desc[i].node, &pl330->desc_pool);
-	}
-
-	spin_unlock_irqrestore(&pl330->pool_lock, flags);
-
-	return count;
-}
-
 static struct dma_pl330_desc *pluck_desc(struct pl330_dmac *pl330)
 {
 	struct dma_pl330_desc *desc = NULL;
@@ -2428,9 +2405,6 @@ static struct dma_pl330_desc *pluck_desc(struct pl330_dmac *pl330)
 				struct dma_pl330_desc, node);
 
 		list_del_init(&desc->node);
-
-		desc->status = PREP;
-		desc->txd.callback = NULL;
 	}
 
 	spin_unlock_irqrestore(&pl330->pool_lock, flags);
@@ -2449,19 +2423,18 @@ static struct dma_pl330_desc *pl330_get_desc(struct dma_pl330_chan *pch)
 
 	/* If the DMAC pool is empty, alloc new */
 	if (!desc) {
-		if (!add_desc(pl330, GFP_ATOMIC, 1))
-			return NULL;
-
-		/* Try again */
-		desc = pluck_desc(pl330);
+		desc = kzalloc(sizeof(*desc), GFP_ATOMIC);
 		if (!desc) {
 			dev_err(pch->dmac->ddma.dev,
 				"%s:%d ALERT!\n", __func__, __LINE__);
 			return NULL;
 		}
+		_init_desc(desc);
 	}
 
 	/* Initialize the descriptor */
+	desc->status = PREP;
+	desc->txd.callback = NULL;
 	desc->pchan = pch;
 	desc->txd.cookie = 0;
 	async_tx_ack(&desc->txd);
@@ -2814,6 +2787,7 @@ pl330_probe(struct amba_device *adev, const struct amba_id *id)
 	struct pl330_config *pcfg;
 	struct pl330_dmac *pl330;
 	struct dma_pl330_chan *pch, *_p;
+	struct dma_pl330_desc *desc;
 	struct dma_device *pd;
 	struct resource *res;
 	int i, ret, irq;
@@ -2874,8 +2848,15 @@ pl330_probe(struct amba_device *adev, const struct amba_id *id)
 	spin_lock_init(&pl330->pool_lock);
 
 	/* Create a descriptor pool of default size */
-	if (!add_desc(pl330, GFP_KERNEL, NR_DEFAULT_DESC))
+	desc = kcalloc(NR_DEFAULT_DESC, sizeof(*desc), GFP_KERNEL);
+	if (desc) {
+		for (i = 0; i < NR_DEFAULT_DESC; i++) {
+			_init_desc(&desc[i]);
+			list_add_tail(&desc[i].node, &pl330->desc_pool);
+		}
+	} else {
 		dev_warn(&adev->dev, "unable to allocate desc\n");
+	}
 
 	INIT_LIST_HEAD(&pd->channels);

--
2.8.1.dirty


2016-04-27 03:26:55

by Jassi Brar

Subject: Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

On Tue, Apr 26, 2016 at 10:00 PM, Robin Murphy <[email protected]> wrote:
> The current logic in pl330_get_desc() contains a clear race condition,
> whereby if the descriptor pool is empty, we will create a new
> descriptor, add it to the pool with the lock held, *release the lock*,
> then try to remove it from the pool again.
>
> Furthermore, if that second attempt then finds the pool empty because
> someone else got there first, we conclude that something must have gone
> terribly wrong and it's the absolute end of the world.
>
We may retry or simply increase the number of descriptors allocated in
a batch from 1 to, say, NR_DEFAULT_DESC.

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 372b435..7179a4d 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2449,7 +2449,7 @@ static struct dma_pl330_desc *pl330_get_desc(struct dma_pl330_chan *pch)
 
 	/* If the DMAC pool is empty, alloc new */
 	if (!desc) {
-		if (!add_desc(pl330, GFP_ATOMIC, 1))
+		if (!add_desc(pl330, GFP_ATOMIC, NR_DEFAULT_DESC))
 			return NULL;
 
 		/* Try again */

> ...
> [ 709.328891] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
> ** 10 printk messages dropped ** [ 709.352280] dma-pl330 7ff00000.dma: __pl330_prep_dma_memcpy:2493 Unable to fetch desc
> ** 11 printk messages dropped ** [ 709.372930] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
> ** 2 printk messages dropped ** [ 709.372953] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
> ** 8 printk messages dropped ** [ 709.374157] dma-pl330 7ff00000.dma: __pl330_prep_dma_memcpy:2493 Unable to fetch desc
> ** 41 printk messages dropped ** [ 709.393095] dmatest: dma0chan4-copy1: result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
> ...
>
> Down with this sort of thing! The most sensible thing to do is avoid the
> allocate-add-remove dance entirely and simply use the freshly-allocated
> descriptor straight away.
>
... but you still try to first pluck from the list.
Instead of churning the code, I would suggest either checking in a
loop that we have a desc, or allocating 2 or NR_DEFAULT_DESC
descriptors there. That way we probably get more descriptors for the
same cost of memory.

>
> I'm also seeing what looks like another occasional race under the same
> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
> dereferencing a bogus tx->chan, but I think that's beyond my ability to
> figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
> when using a large number of small buffers and dmatest.noverify=1. This
> one was some obvious low-hanging fruit.
>
Sorry, that part of the code has changed a lot since I wrote the driver,
so more details will help me.

Thanks.

2016-04-27 13:47:50

by Robin Murphy

Subject: Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

Hi Jassi,

On 27/04/16 04:26, Jassi Brar wrote:
> On Tue, Apr 26, 2016 at 10:00 PM, Robin Murphy <[email protected]> wrote:
>> The current logic in pl330_get_desc() contains a clear race condition,
>> whereby if the descriptor pool is empty, we will create a new
>> descriptor, add it to the pool with the lock held, *release the lock*,
>> then try to remove it from the pool again.
>>
>> Furthermore, if that second attempt then finds the pool empty because
>> someone else got there first, we conclude that something must have gone
>> terribly wrong and it's the absolute end of the world.
>>
> We may retry or simply increase the number of descriptors allocated in
> a batch from 1 to, say, NR_DEFAULT_DESC.
>
> diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
> index 372b435..7179a4d 100644
> --- a/drivers/dma/pl330.c
> +++ b/drivers/dma/pl330.c
> @@ -2449,7 +2449,7 @@ static struct dma_pl330_desc *pl330_get_desc(struct dma_pl330_chan *pch)
>
>  	/* If the DMAC pool is empty, alloc new */
>  	if (!desc) {
> -		if (!add_desc(pl330, GFP_ATOMIC, 1))
> +		if (!add_desc(pl330, GFP_ATOMIC, NR_DEFAULT_DESC))
>  			return NULL;
>
>  		/* Try again */

I'm not sure I see how leaving the obvious race in place, and merely
making it n times harder to hit, is a better idea than actually
fixing it.

>> ...
>> [ 709.328891] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
>> ** 10 printk messages dropped ** [ 709.352280] dma-pl330 7ff00000.dma: __pl330_prep_dma_memcpy:2493 Unable to fetch desc
>> ** 11 printk messages dropped ** [ 709.372930] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
>> ** 2 printk messages dropped ** [ 709.372953] dma-pl330 7ff00000.dma: pl330_get_desc:2459 ALERT!
>> ** 8 printk messages dropped ** [ 709.374157] dma-pl330 7ff00000.dma: __pl330_prep_dma_memcpy:2493 Unable to fetch desc
>> ** 41 printk messages dropped ** [ 709.393095] dmatest: dma0chan4-copy1: result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
>> ...
>>
>> Down with this sort of thing! The most sensible thing to do is avoid the
>> allocate-add-remove dance entirely and simply use the freshly-allocated
>> descriptor straight away.
>>
> ... but you still try to first pluck from the list.

Well, that's the point of having a pool, right? Try to reuse a free
descriptor if there is one, otherwise expand the pool. It's just that in
the latter case, race or not, it's absolutely pointless to add the new
descriptor to the pool only to immediately remove it again, when you can
simply start using it instead - either way there's still an extra
descriptor released back to the pool once we're done with it.

> Instead of churning the code, I would suggest either checking in a
> loop that we have a desc, or allocating 2 or NR_DEFAULT_DESC
> descriptors there. That way we probably get more descriptors for the
> same cost of memory.

Having had a quick look into how beneficial that might be, I discover
that what's provoking the add_desc() race is something else causing
desc_pool->head to get wedged pointing at itself, so list_empty()
forever returns true and we kick off a storm of allocations while
desc_pool->tail fills up with unused descriptors. Looks like I'm going
to have to set aside some time to properly debug and fix this driver if
I want to use it for stress-testing IOMMU code...
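
(For reference, a self-pointing head is indistinguishable from an
empty list: list_empty() in include/linux/list.h is essentially just

	static inline int list_empty(const struct list_head *head)
	{
		return head->next == head;	/* head points at itself */
	}

modulo a READ_ONCE() on head->next in newer kernels, so once head gets
wedged pointing at itself, every subsequent check reports the pool
empty.)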

>>
>> I'm also seeing what looks like another occasional race under the same
>> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
>> dereferencing a bogus tx->chan, but I think that's beyond my ability to
>> figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
>> when using a large number of small buffers and dmatest.noverify=1. This
>> one was some obvious low-hanging fruit.
>>
> Sorry, that part of the code has changed a lot since I wrote the driver,
> so more details will help me.

Here's the actual splat:

[ 220.649267] Unable to handle kernel paging request at virtual address 100341338
[ 220.666576] pgd = ffffff8008dcb000
[ 220.679805] [100341338] *pgd=0000000000000000, *pud=0000000000000000
[ 220.696116] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 220.711579] Modules linked in:
[ 220.724459] CPU: 3 PID: 1249 Comm: dma0chan5-copy0 Not tainted 4.6.0-rc4+ #527
[ 220.741652] Hardware name: ARM Juno development board (r1) (DT)
[ 220.757512] task: ffffffc976255400 ti: ffffffc975aa4000 task.ti: ffffffc975aa4000
[ 220.774979] PC is at pl330_tx_submit+0x68/0x108
[ 220.789412] LR is at pl330_tx_submit+0x2c/0x108
[ 220.803704] pc : [<ffffff80083dae08>] lr : [<ffffff80083dadcc>] pstate: 800001c5
[ 220.821078] sp : ffffffc975aa7c90
[ 220.834236] x29: ffffffc975aa7c90 x28: ffffffc975e81800
[ 220.849303] x27: ffffffc97626c018 x26: ffffff8008cd9000
[ 220.864314] x25: 0000000000000000 x24: 0000000000000000
[ 220.879300] x23: ffffff8008d8d410 x22: ffffff8008ce52f0
[ 220.894266] x21: ffffffc976220d18 x20: ffffffc079138610
[ 220.909139] x19: ffffffc976220c60 x18: 0000000000000010
[ 220.923857] x17: 0000000000000007 x16: 0000000000000001
[ 220.938418] x15: 0000000000000001 x14: 0001020304050607
[ 220.952886] x13: 08090a0b0c0d0e0f x12: 1011121314151617
[ 220.967305] x11: 18191a1b1c1d1e1f x10: 0001020304050607
[ 220.981603] x9 : 0000000000000000 x8 : 0000000000000010
[ 220.995783] x7 : 0000000000000001 x6 : ffffffc976220ce0
[ 221.009899] x5 : 0000000100341330 x4 : ffffffc079138600
[ 221.023775] x3 : 0000000000000000 x2 : ffffffc97626c158
[ 221.037417] x1 : ffffffc97636a790 x0 : 0000000000000140
[ 221.050845]
[ 221.060126] Process dma0chan5-copy0 (pid: 1249, stack limit = 0xffffffc975aa4020)
[ 221.075704] Stack: (0xffffffc975aa7c90 to 0xffffffc975aa8000)
[ 221.089574] 7c80: ffffffc975aa7cd0 ffffff80083d9e60
[ 221.105653] 7ca0: ffffffc9758b88c0 0000000000002b12 00000000000020b8 ffffff8008ce52f0
[ 221.121765] 7cc0: 00000000f9f82b12 ffffffc0790e5b00 ffffffc975aa7e30 ffffff80080d7138
[ 221.137833] 7ce0: ffffffc975e81780 ffffff8008d90b00 ffffff8008af0d40 ffffffc975e81800
[ 221.153811] 7d00: ffffff80083d91c0 0000000000000000 0000000000000000 0000000000000000
[ 221.169777] 7d20: 0000000000000000 0000000000000000 ffffff8008ca7000 000000323ef9fd9c
[ 221.185714] 7d40: 0000000000000008 ffffffc975aa7e20 ffffff8008b265d8 0000000000000000
[ 221.201753] 7d60: 0000000000000010 0000000000000000 0000000175aa7dc0 0000000000000000
[ 221.217898] 7d80: 0000000000000259 0000000000000001 ffffff8008b265f0 ffffffc9758b88e8
[ 221.234155] 7da0: ffffff8000002b12 ffffffc975aa7cd0 ffffffc976220c88 00000000005d99c3
[ 221.250577] 7dc0: 0000000000000001 000002f400000259 ffffff8008d12be0 ffffffc975aa7cc0
[ 221.267166] 7de0: 0000000000000000 ffffffc975aa7df0 ffffff800bcc0bcc ffffffc975aa7df8
[ 221.283840] 7e00: ffffffc975aa7df8 c5b6eabfb5a2c4b8 ffffffc900000000 ffffff80080f2048
[ 221.300672] 7e20: ffffffc975aa7e20 ffffffc975aa7e20 0000000000000000 ffffff8008085e10
[ 221.317405] 7e40: ffffff80080d7068 ffffffc975e81780 0000000000000000 0000000000000000
[ 221.334211] 7e60: 0000000000000000 ffffff80080df900 ffffff80080d7068 0000000000000000
[ 221.351119] 7e80: 0000000000000000 ffffffc975e81800 0000000000000000 0000000000000000
[ 221.368086] 7ea0: ffffffc975aa7ea0 ffffffc975aa7ea0 0000000000000000 ffffff8000000000
[ 221.385128] 7ec0: ffffffc975aa7ec0 ffffffc975aa7ec0 0000000000000000 0000000000000000
[ 221.402194] 7ee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.419345] 7f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.436492] 7f20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.453629] 7f40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.470764] 7f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.487893] 7f80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.505008] 7fa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 221.522116] 7fc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000005
[ 221.539221] 7fe0: 0000000000000000 0000000000000000 88898a8b8c8d8e8f 8081828384858687
[ 221.556415] Call trace:
[ 221.568286] Exception stack(0xffffffc975aa7ad0 to 0xffffffc975aa7bf0)
[ 221.584423] 7ac0: ffffffc976220c60 ffffffc079138610
[ 221.602172] 7ae0: ffffffc975aa7c90 ffffff80083dae08 ffffffc976812d00 ffffffc976aa2600
[ 221.620055] 7b00: ffffffc975aa7b30 ffffff80080ebfa0 ffffffc975aa7b30 ffffff80080ec050
[ 221.638012] 7b20: 00000000ffffb237 ffffffc976255400 ffffffc975aa7bd0 ffffff8008881408
[ 221.656053] 7b40: ffffff8008ca7000 ffffff8008cd9000 ffffffc97feef380 ffffffc976255400
[ 221.674207] 7b60: ffffffc975aa7b90 ffffff80080e6fb8 0000000000000140 ffffffc97636a790
[ 221.692454] 7b80: ffffffc97626c158 0000000000000000 ffffffc079138600 0000000100341330
[ 221.710838] 7ba0: ffffffc976220ce0 0000000000000001 0000000000000010 0000000000000000
[ 221.729306] 7bc0: 0001020304050607 18191a1b1c1d1e1f 1011121314151617 08090a0b0c0d0e0f
[ 221.747943] 7be0: 0001020304050607 0000000000000001
[ 221.763694] [<ffffff80083dae08>] pl330_tx_submit+0x68/0x108
[ 221.780284] [<ffffff80083d9e60>] dmatest_func+0xca0/0x1248
[ 221.796853] [<ffffff80080d7138>] kthread+0xd0/0xe8
[ 221.812785] [<ffffff8008085e10>] ret_from_fork+0x10/0x40
[ 221.829336] Code: f9402083 f9002043 f9401045 3902205f (b94008a3)
[ 221.846910] ---[ end trace a413951a16f706c7 ]---

The faulting instruction is the initial access to chan->cookie in
dma_cookie_assign(), and tx->chan (x5 here) seems to always be
0x100341330, implying that tx is pointing somewhere near a struct
pl330_config because that 'address' is clearly the first two words of
one of those (.periph_id, .mode). I'm going to have to put this to one
side for now and maybe come back to it later, but I thought it was worth
mentioning just in case it looked familiar to anyone. I can usually hit
it within a couple of runs of dmatest with 8 channels,
threads_per_chan=6, test_buf_size=16384 and iterations=5000 (the point
of which, if anyone's wondering, is to batter the TLBs of the downstream
SMMU with recycled addresses as much as possible).
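
(For context, the first members of struct pl330_config in the driver
are, from memory - worth double-checking against the source:

	struct pl330_config {
		u32		periph_id;
		unsigned int	mode;
		/* ... */
	};

so two adjacent 32-bit words there lining up as a plausible-looking
64-bit 'pointer' fits what x5 shows.)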

Thanks,
Robin.

2016-04-27 16:14:20

by Jassi Brar

[permalink] [raw]
Subject: Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

On 27 April 2016 at 19:17, Robin Murphy <[email protected]> wrote:

>> Instead of churning the code, I would suggest either checking in a
>> loop that we have a desc, or allocating 2 or NR_DEFAULT_DESC
>> descriptors there. That way we probably get more descriptors for the
>> same cost of memory.
>
>
> Having had a quick look into how beneficial that might be, I discover that
> what's provoking the add_desc() race is something else causing
> desc_pool->head to get wedged pointing at itself, so list_empty() forever
> returns true and we kick off a storm of allocations while desc_pool->tail
> fills up with unused descriptors. Looks like I'm going to have to set aside
> some time to properly debug and fix this driver if I want to use it for
> stress-testing IOMMU code...
>
OK, thanks.
And yes, we still need to fix that potential race in pl330_get_desc,
probably by retrying in a loop.
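
Something like the below, perhaps - an untested sketch on top of the
current code, just for discussion:

	/* keep retrying until we get a desc or allocation really fails */
	while (!(desc = pluck_desc(pl330))) {
		if (!add_desc(pl330, GFP_ATOMIC, 1))
			return NULL;
		/* another thread may win the race for the desc we just
		 * added - that's fine, just go around again */
	}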

>>>
>>> I'm also seeing what looks like another occasional race under the same
>>> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
>>> dereferencing a bogus tx->chan, but I think that's beyond my ability to
>>> figure out right now. Similarly the storm of WARNs from
>>> pl330_issue_pending()
>>> when using a large number of small buffers and dmatest.noverify=1. This
>>> one was some obvious low-hanging fruit.
>>>
>> Sorry, that part of the code has changed a lot since I wrote the driver,
>> so more details will help me.
>
>
> Here's the actual splat:
>
> [ 220.649267] Unable to handle kernel paging request at virtual address 100341338
> [ ... register and stack dump snipped ... ]
> [ 221.763694] [<ffffff80083dae08>] pl330_tx_submit+0x68/0x108
> [ 221.780284] [<ffffff80083d9e60>] dmatest_func+0xca0/0x1248
> [ 221.796853] [<ffffff80080d7138>] kthread+0xd0/0xe8
> [ 221.812785] [<ffffff8008085e10>] ret_from_fork+0x10/0x40
> [ 221.829336] Code: f9402083 f9002043 f9401045 3902205f (b94008a3)
> [ 221.846910] ---[ end trace a413951a16f706c7 ]---
>
> The faulting instruction is the initial access to chan->cookie in
> dma_cookie_assign(), and tx->chan (x5 here) seems to always be 0x100341330,
> implying that tx is pointing somewhere near a struct pl330_config because
> that 'address' is clearly the first two words of one of those (.periph_id,
> .mode). I'm going to have to put this to one side for now and maybe come
> back to it later, but I thought it was worth mentioning just in case it
> looked familiar to anyone. I can usually hit it within a couple of runs of
> dmatest with 8 channels, threads_per_chan=6, test_buf_size=16384 and
> iterations=5000 (the point of which, if anyone's wondering, is to batter the
> TLBs of the downstream SMMU with recycled addresses as much as possible).
>
I have tested it just as hard on an old platform and didn't hit the
issue.
Let me see if I can guess where the problem is; it might 'disappear'
once we fix the real problem that this patch intended to fix.

Thanks.