LinuxLists.cc - [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

2022-02-15 06:31:22

Subject: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

In cyclic mode, there is no additional descriptor pushed to collect
outstanding data on channel teardown. Therefore no need to wait for this
descriptor to come back.

Without this terminating aplay cmd outputs false error msg like:
[ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!

Signed-off-by: Vignesh Raghavendra <[email protected]>
---
drivers/dma/ti/k3-udma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 9abb08d353ca0..c9a1b2f312603 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan)

vchan_synchronize(&uc->vc);

- if (uc->state == UDMA_CHAN_IS_TERMINATING) {
+ if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) {
timeout = wait_for_completion_timeout(&uc->teardown_completed,
timeout);
if (!timeout) {
--
2.35.1

2022-02-21 10:03:14

by Péter Ujfalusi

[permalink] [raw]

Subject: Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

Hi Vignesh,

On 15/02/2022 06:41, Vignesh Raghavendra wrote:
> In cyclic mode, there is no additional descriptor pushed to collect
> outstanding data on channel teardown. Therefore no need to wait for this
> descriptor to come back.
>
> Without this terminating aplay cmd outputs false error msg like:
> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!

are you sure it is aplay? It is MEM_TO_DEV, we only use the flush
descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the
peripheral to flush out the FIFO.

I have not seen this on am654, j721e. I can not recall seeing this on
the capture side either.

The cyclic TR should be able to drain the DEV_TO_MEM by itself and the
TR should terminate.

> Signed-off-by: Vignesh Raghavendra <[email protected]>
> ---
> drivers/dma/ti/k3-udma.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
> index 9abb08d353ca0..c9a1b2f312603 100644
> --- a/drivers/dma/ti/k3-udma.c
> +++ b/drivers/dma/ti/k3-udma.c
> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan)
>
> vchan_synchronize(&uc->vc);
>
> - if (uc->state == UDMA_CHAN_IS_TERMINATING) {
> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) {
> timeout = wait_for_completion_timeout(&uc->teardown_completed,
> timeout);
> if (!timeout) {

--
Péter

2022-02-28 10:58:59

by Vignesh Raghavendra

[permalink] [raw]

Subject: Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

Hi Peter,

On 21/02/22 1:42 am, Péter Ujfalusi wrote:
> Hi Vignesh,
>
> On 15/02/2022 06:41, Vignesh Raghavendra wrote:
>> In cyclic mode, there is no additional descriptor pushed to collect
>> outstanding data on channel teardown. Therefore no need to wait for this
>> descriptor to come back.
>>
>> Without this terminating aplay cmd outputs false error msg like:
>> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!
>
> are you sure it is aplay? It is MEM_TO_DEV, we only use the flush
> descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the
> peripheral to flush out the FIFO.
>

Yes, this is with aplay. You are right that MEM_TO_DEV should have
worked w/o this patch.

> I have not seen this on am654, j721e. I can not recall seeing this on
> the capture side either.
>

I dont see it either

> The cyclic TR should be able to drain the DEV_TO_MEM by itself and the
> TR should terminate.
>

You are right. There seems to be a trobule with McASP + BCDMA on AM62
which needs more investigation. I see

RT c0000000 peer RT 90000000
BCNT 5dc00, peer BCNT 46400

So there is some data stuck in pipe which prevents channel from
disabling and TDCM being signaled. My guess is McASP is no longer
requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA
req enable state? Wondering what else can prevent draining of data.

One difference is that AM62 has ti,tlv320aic3106 codec (codec is the
master) where J7 uses PCM.

Regards
Vignesh

>
>> Signed-off-by: Vignesh Raghavendra <[email protected]>
>> ---
>> drivers/dma/ti/k3-udma.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
>> index 9abb08d353ca0..c9a1b2f312603 100644
>> --- a/drivers/dma/ti/k3-udma.c
>> +++ b/drivers/dma/ti/k3-udma.c
>> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan)
>>
>> vchan_synchronize(&uc->vc);
>>
>> - if (uc->state == UDMA_CHAN_IS_TERMINATING) {
>> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) {
>> timeout = wait_for_completion_timeout(&uc->teardown_completed,
>> timeout);
>> if (!timeout) {
>

2022-02-28 17:48:37

by Vignesh Raghavendra

[permalink] [raw]

Subject: Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

On 28/02/22 2:52 pm, Vignesh Raghavendra wrote:
> Hi Peter,
>
> On 21/02/22 1:42 am, Péter Ujfalusi wrote:
>> Hi Vignesh,
>>
>> On 15/02/2022 06:41, Vignesh Raghavendra wrote:
>>> In cyclic mode, there is no additional descriptor pushed to collect
>>> outstanding data on channel teardown. Therefore no need to wait for this
>>> descriptor to come back.
>>>
>>> Without this terminating aplay cmd outputs false error msg like:
>>> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!
>>
>> are you sure it is aplay? It is MEM_TO_DEV, we only use the flush
>> descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the
>> peripheral to flush out the FIFO.
>>
>
> Yes, this is with aplay. You are right that MEM_TO_DEV should have
> worked w/o this patch.
>
>
>> I have not seen this on am654, j721e. I can not recall seeing this on
>> the capture side either.
>>
>
> I dont see it either
>
>> The cyclic TR should be able to drain the DEV_TO_MEM by itself and the
>> TR should terminate.
>>
>
> You are right. There seems to be a trobule with McASP + BCDMA on AM62
> which needs more investigation. I see
>
> RT c0000000 peer RT 90000000
> BCNT 5dc00, peer BCNT 46400
>
> So there is some data stuck in pipe which prevents channel from
> disabling and TDCM being signaled. My guess is McASP is no longer
> requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA
> req enable state? Wondering what else can prevent draining of data.
>
> One difference is that AM62 has ti,tlv320aic3106 codec (codec is the
> master) where J7 uses PCM.
>

I see couple of issues with DMA usage by McASP/sound:

McASP TX FIFO events are disabled first and then DMA channel is stopped.
This does not work for K3 SoCs as some data remains stuck in DMA pipe
and channel never goes to disable state.

I see .stop_dma_first flag in snd_soc_dai_link to force DMA to be
stopped first, but I am not quite familiar on where to set this flag?
Even so, snd_dmaengine_pcm_trigger() calls dmaengine_terminate_async()
and does not call dmaengine_synchronize() before disabling McASP TX, so
channel teardown would still be unsuccessful.

Alternately, we could reduce dev_warn() in udma_synchronize() to
dev_dbg() as channel is still recoverable via udma_reset_chan() which
is done immediately after.
There is a further dev_warn() message to indicate if channel refused to
stop even after a reset?

Regards
Vignesh

> Regards
> Vignesh
>
>
>>
>>> Signed-off-by: Vignesh Raghavendra <[email protected]>
>>> ---
>>> drivers/dma/ti/k3-udma.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
>>> index 9abb08d353ca0..c9a1b2f312603 100644
>>> --- a/drivers/dma/ti/k3-udma.c
>>> +++ b/drivers/dma/ti/k3-udma.c
>>> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan)
>>>
>>> vchan_synchronize(&uc->vc);
>>>
>>> - if (uc->state == UDMA_CHAN_IS_TERMINATING) {
>>> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) {
>>> timeout = wait_for_completion_timeout(&uc->teardown_completed,
>>> timeout);
>>> if (!timeout) {
>>

2022-03-01 20:19:14

by Péter Ujfalusi

[permalink] [raw]

Subject: Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

Hi Vignesh,

On 28/02/2022 13:18, Vignesh Raghavendra wrote:
>
>
> On 28/02/22 2:52 pm, Vignesh Raghavendra wrote:
>> Hi Peter,
>>
>> On 21/02/22 1:42 am, Péter Ujfalusi wrote:
>>> Hi Vignesh,
>>>
>>> On 15/02/2022 06:41, Vignesh Raghavendra wrote:
>>>> In cyclic mode, there is no additional descriptor pushed to collect
>>>> outstanding data on channel teardown. Therefore no need to wait for this
>>>> descriptor to come back.
>>>>
>>>> Without this terminating aplay cmd outputs false error msg like:
>>>> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!
>>>
>>> are you sure it is aplay? It is MEM_TO_DEV, we only use the flush
>>> descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the
>>> peripheral to flush out the FIFO.
>>>
>>
>> Yes, this is with aplay. You are right that MEM_TO_DEV should have
>> worked w/o this patch.
>>
>>
>>> I have not seen this on am654, j721e. I can not recall seeing this on
>>> the capture side either.
>>>
>>
>> I dont see it either
>>
>>> The cyclic TR should be able to drain the DEV_TO_MEM by itself and the
>>> TR should terminate.
>>>
>>
>> You are right. There seems to be a trobule with McASP + BCDMA on AM62
>> which needs more investigation. I see
>>
>> RT c0000000 peer RT 90000000
>> BCNT 5dc00, peer BCNT 46400

In case of MEM_TO_DEV stop we set the peer DMA (PDMA) to flush and set
the UDMA/BCDMA/PKTDMA to tdown.
If the flush is set for the PDMA, it will (should) disconnect it's
trigger from the peripheral and 'free run' to flush all data.

Afaik the PDMA on am62 is the same as it is on j721e, no?

>> So there is some data stuck in pipe which prevents channel from
>> disabling and TDCM being signaled. My guess is McASP is no longer
>> requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA
>> req enable state? Wondering what else can prevent draining of data.
>>
>> One difference is that AM62 has ti,tlv320aic3106 codec (codec is the
>> master) where J7 uses PCM.
>>
>
> I see couple of issues with DMA usage by McASP/sound:
>
> McASP TX FIFO events are disabled first and then DMA channel is stopped.
> This does not work for K3 SoCs as some data remains stuck in DMA pipe
> and channel never goes to disable state.

You can not really stop the DMA first because the McASP would undeflow
right away. The McASP FIFO should be in bypass mode for K3 devices, the
PDMA can handle the feeding of McASP just fine.

One of the reasons to have the FLUSH on the peer (PDMA) side is exactly
this: to be able to drain the MEM_TO_DEV DMA FIFO to /dev/null even if
the peripheral is long gone (disabled, even powered down).

> I see .stop_dma_first flag in snd_soc_dai_link to force DMA to be
> stopped first, but I am not quite familiar on where to set this flag?

The stop_dma_first might work, but it is actually added to support pxa
(I think?) where they have two separate DMAs on two side of an external
FIFO. The DMAengine side need to be stopped first and then they have
open coded DMA code to busy loop to wait for the other DMA (non
DMAengine) to drain out the data.

> Even so, snd_dmaengine_pcm_trigger() calls dmaengine_terminate_async()
> and does not call dmaengine_synchronize() before disabling McASP TX, so
> channel teardown would still be unsuccessful.

We can not do a dmaengine_synchronize() in pcm.trigger as it is in
atomic context and we can not sleep.

> Alternately, we could reduce dev_warn() in udma_synchronize() to
> dev_dbg() as channel is still recoverable via udma_reset_chan() which
> is done immediately after.
> There is a further dev_warn() message to indicate if channel refused to
> stop even after a reset?

The problem is that it is also possible that after a forced shutdown the
channel is not going to work anymore (we have this issue with am65, if I
recall right).

I would consult with the hardware team to understand what is going on,
make sure that the McASP AFIFO is disabled (evnums are 0).

If it is really something in the hardware that behaves differently in
am62 then add a quirk to handle it and implement a workaround.

>
> Regards
> Vignesh
>
>> Regards
>> Vignesh
>>
>>
>>>
>>>> Signed-off-by: Vignesh Raghavendra <[email protected]>
>>>> ---
>>>> drivers/dma/ti/k3-udma.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
>>>> index 9abb08d353ca0..c9a1b2f312603 100644
>>>> --- a/drivers/dma/ti/k3-udma.c
>>>> +++ b/drivers/dma/ti/k3-udma.c
>>>> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan)
>>>>
>>>> vchan_synchronize(&uc->vc);
>>>>
>>>> - if (uc->state == UDMA_CHAN_IS_TERMINATING) {
>>>> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) {
>>>> timeout = wait_for_completion_timeout(&uc->teardown_completed,
>>>> timeout);
>>>> if (!timeout) {
>>>

--
Péter

2023-08-07 16:08:38

by Francesco Dolcini

[permalink] [raw]

Subject: Re: [PATCH] dmaengine: ti: k3-udma: Avoid false error msg on chan teardown

Hello Vignesh,
reviving this old email, since it seems that it raised a valid issue
that is still not resolved.

On Tue, Feb 15, 2022 at 10:11:12AM +0530, Vignesh Raghavendra wrote:
> In cyclic mode, there is no additional descriptor pushed to collect
> outstanding data on channel teardown. Therefore no need to wait for this
> descriptor to come back.
>
> Without this terminating aplay cmd outputs false error msg like:
> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout!
I am experiencing a similar issue on Toradex Verdin AM62:

```
kern :warn : [ 40.196661] ti-udma 485c0100.dma-controller: chan2 teardown timeout!
kern :warn : [ 41.077013] kauditd_printk_skb: 14 callbacks suppressed
kern :warn : [ 46.404672] ti-udma 485c0100.dma-controller: chan2 teardown timeout!
kern :warn : [ 50.852643] ti-udma 485c0100.dma-controller: chan1 teardown timeout!
kern :warn : [ 56.932709] ti-udma 485c0100.dma-controller: chan1 teardown timeout!
kern :warn : [ 64.516797] ti-udma 485c0100.dma-controller: chan1 teardown timeout!
kern :warn : [ 70.404724] ti-udma 485c0100.dma-controller: chan1 teardown timeout!
kern :warn : [ 71.588933] ti-udma 485c0100.dma-controller: chan2 teardown timeout!
kern :warn : [ 75.908818] ti-udma 485c0100.dma-controller: chan1 teardown timeout!
kern :warn : [ 77.064761] ti-udma 485c0100.dma-controller: chan2 teardown timeout!
```

I would say just the same, since this happens everytime aplay terminates.

Any idea to progress on this?

Francesco