2020-04-28 14:36:34

by Rafał Hibner

Subject: [PATCH] dma: zynqmp_dma: Initialize descriptor list after freeing during reset

During zynqmp_dma_reset, list elements are not formally removed from
their list, so the list head is never reinitialized.

Signed-off-by: Rafal Hibner <[email protected]>

This bug causes a kernel panic when a transaction is unsuccessful:
------------[ cut here ]------------
[ 62.710730] list_add corruption. prev->next should be next (ffffffc07d308040), but was dead000000000100. (prev=ffffffc0540b40b8).
[ 62.727960] WARNING: CPU: 0 PID: 1845 at /home/rafalh11/zynq_proj/zynq_platform/KERNEL/linux-xlnx/lib/list_debug.c:28 __list_add_valid+0x74/0xa0
[ 62.746374] Modules linked in: secom_mx_rpmsg(O) rpmsg_char secom_mx_retune(O) ad9680(O) ad5686_spi(O) ad5686(O) ad7298(O) ltc5594(O) lmx2594(O) ad8366(O) secom_mx(O) axi_jesd204_rx(O) axi_adxcvr(O) xilinx_transceiver(O) cf_axi_adc_core(O) hmc7044(O)
[ 62.768546] CPU: 0 PID: 1845 Comm: mx Tainted: G O 4.19.0 #10
[ 62.775582] Hardware name: xlnx,zynqmp (DT)
[ 62.779750] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 62.784524] pc : __list_add_valid+0x74/0xa0
[ 62.788690] lr : __list_add_valid+0x74/0xa0
[ 62.792855] sp : ffffff8008003da0
[ 62.796154] x29: ffffff8008003da0 x28: 0000000000000020
[ 62.801457] x27: 0000000000000000 x26: dead000000000100
[ 62.806760] x25: ffffffc07d308040 x24: ffffffc0540b40a0
[ 62.812064] x23: ffffffc07d308018 x22: 0000000000000000
[ 62.817359] x21: ffffffc07d308018 x20: ffffffc0540b40b8
[ 62.822654] x19: ffffffc0540b40b8 x18: ffffffffffffffff
[ 62.827957] x17: 0000000000000000 x16: 0000000000000000
[ 62.833252] x15: ffffff8008f88648 x14: 3034303830336437
[ 62.838546] x13: ffffff8008f88670 x12: ffffff80085d0ec0
[ 62.843841] x11: ffffff8008f65018 x10: 0000000000000006
[ 62.849136] x9 : 657270202e6e6f69 x8 : 3462303435306366
[ 62.854431] x7 : 66666666663d7665 x6 : 00000000000001a2
[ 62.859726] x5 : 0000000000000064 x4 : 0000000000000000
[ 62.865020] x3 : 0000000000000000 x2 : ffffffffffffffff
[ 62.870315] x1 : 0cbee765ba0ec900 x0 : 0000000000000000
[ 62.875611] Call trace:
[ 62.878042] __list_add_valid+0x74/0xa0
[ 62.881864] zynqmp_dma_free_descriptor+0x48/0x100
[ 62.886644] zynqmp_dma_chan_desc_cleanup+0xb4/0xf8
[ 62.891505] zynqmp_dma_do_tasklet+0x68/0x110
[ 62.895847] tasklet_action_common.isra.3+0x7c/0x168
[ 62.900801] tasklet_action+0x24/0x30
[ 62.904447] __do_softirq+0x10c/0x200
[ 62.908092] irq_exit+0xac/0xc0
[ 62.911219] __handle_domain_irq+0x60/0xb0
[ 62.915305] gic_handle_irq+0x64/0xc0
[ 62.918951] el1_irq+0xb0/0x140
[ 62.922078] schedule_timeout+0x218/0x3a0
[ 62.926077] wait_for_common+0x170/0x268
[ 62.929983] wait_for_completion_timeout+0x10/0x18
[ 62.934761] ioctl+0x64/0x270 [secom_mx_retune]
[ 62.939281] do_vfs_ioctl+0xb8/0x900
[ 62.942838] ksys_ioctl+0x44/0x90
[ 62.946137] __arm64_sys_ioctl+0x1c/0x28
[ 62.950044] el0_svc_common+0x60/0xe8
[ 62.953689] el0_svc_handler+0x6c/0x88
[ 62.957421] el0_svc+0x8/0xc
[ 62.960284] ---[ end trace 91b4fdfe685446c4 ]---
[ 62.964934] ------------[ cut here ]------------

---
drivers/dma/xilinx/zynqmp_dma.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/dma/xilinx/zynqmp_dma.c b/drivers/dma/xilinx/zynqmp_dma.c
index 931e41a72..a2b0be07e 100644
--- a/drivers/dma/xilinx/zynqmp_dma.c
+++ b/drivers/dma/xilinx/zynqmp_dma.c
@@ -453,6 +453,7 @@ static void zynqmp_dma_free_desc_list(struct zynqmp_dma_chan *chan,

 	list_for_each_entry_safe(desc, next, list, node)
 		zynqmp_dma_free_descriptor(chan, desc);
+	INIT_LIST_HEAD(list);
 }

/**
--
2.20.1


2020-05-02 12:36:57

by Vinod Koul

Subject: Re: [PATCH] dma: zynqmp_dma: Initialize descriptor list after freeing during reset

Hello Rafał,

On 28-04-20, 16:32, Rafał Hibner wrote:
> List elements are not formally removed from list therefore
> list head is not initialized during zynqmp_dma_reset.
>
> Signed-off-by: Rafal Hibner <[email protected]>
>
> This BUG causes kernel panic when transaction is unsuccessful:
> ------------[ cut here ]------------
> [ 62.710730] list_add corruption. prev->next should be next (ffffffc07d308040), but was dead000000000100. (prev=ffffffc0540b40b8).
> [ 62.727960] WARNING: CPU: 0 PID: 1845 at /home/rafalh11/zynq_proj/zynq_platform/KERNEL/linux-xlnx/lib/list_debug.c:28 __list_add_valid+0x74/0xa0
> [ 62.746374] Modules linked in: secom_mx_rpmsg(O) rpmsg_char secom_mx_retune(O) ad9680(O) ad5686_spi(O) ad5686(O) ad7298(O) ltc5594(O) lmx2594(O) ad8366(O) secom_mx(O) axi_jesd204_rx(O) axi_adxcvr(O) xilinx_transceiver(O) cf_axi_adc_core(O) hmc7044(O)
> [ 62.768546] CPU: 0 PID: 1845 Comm: mx Tainted: G O 4.19.0 #10
> [ 62.775582] Hardware name: xlnx,zynqmp (DT)
> [ 62.779750] pstate: 60000005 (nZCv daif -PAN -UAO)
> [ 62.784524] pc : __list_add_valid+0x74/0xa0
> [ 62.788690] lr : __list_add_valid+0x74/0xa0
> [ 62.792855] sp : ffffff8008003da0
> [ 62.796154] x29: ffffff8008003da0 x28: 0000000000000020
> [ 62.801457] x27: 0000000000000000 x26: dead000000000100
> [ 62.806760] x25: ffffffc07d308040 x24: ffffffc0540b40a0
> [ 62.812064] x23: ffffffc07d308018 x22: 0000000000000000
> [ 62.817359] x21: ffffffc07d308018 x20: ffffffc0540b40b8
> [ 62.822654] x19: ffffffc0540b40b8 x18: ffffffffffffffff
> [ 62.827957] x17: 0000000000000000 x16: 0000000000000000
> [ 62.833252] x15: ffffff8008f88648 x14: 3034303830336437
> [ 62.838546] x13: ffffff8008f88670 x12: ffffff80085d0ec0
> [ 62.843841] x11: ffffff8008f65018 x10: 0000000000000006
> [ 62.849136] x9 : 657270202e6e6f69 x8 : 3462303435306366
> [ 62.854431] x7 : 66666666663d7665 x6 : 00000000000001a2
> [ 62.859726] x5 : 0000000000000064 x4 : 0000000000000000
> [ 62.865020] x3 : 0000000000000000 x2 : ffffffffffffffff
> [ 62.870315] x1 : 0cbee765ba0ec900 x0 : 0000000000000000
> [ 62.875611] Call trace:
> [ 62.878042] __list_add_valid+0x74/0xa0
> [ 62.881864] zynqmp_dma_free_descriptor+0x48/0x100
> [ 62.886644] zynqmp_dma_chan_desc_cleanup+0xb4/0xf8
> [ 62.891505] zynqmp_dma_do_tasklet+0x68/0x110
> [ 62.895847] tasklet_action_common.isra.3+0x7c/0x168
> [ 62.900801] tasklet_action+0x24/0x30
> [ 62.904447] __do_softirq+0x10c/0x200
> [ 62.908092] irq_exit+0xac/0xc0
> [ 62.911219] __handle_domain_irq+0x60/0xb0
> [ 62.915305] gic_handle_irq+0x64/0xc0
> [ 62.918951] el1_irq+0xb0/0x140
> [ 62.922078] schedule_timeout+0x218/0x3a0
> [ 62.926077] wait_for_common+0x170/0x268
> [ 62.929983] wait_for_completion_timeout+0x10/0x18
> [ 62.934761] ioctl+0x64/0x270 [secom_mx_retune]
> [ 62.939281] do_vfs_ioctl+0xb8/0x900
> [ 62.942838] ksys_ioctl+0x44/0x90
> [ 62.946137] __arm64_sys_ioctl+0x1c/0x28
> [ 62.950044] el0_svc_common+0x60/0xe8
> [ 62.953689] el0_svc_handler+0x6c/0x88
> [ 62.957421] el0_svc+0x8/0xc
> [ 62.960284] ---[ end trace 91b4fdfe685446c4 ]---
> [ 62.964934] ------------[ cut here ]------------
>
> ---
> drivers/dma/xilinx/zynqmp_dma.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/dma/xilinx/zynqmp_dma.c b/drivers/dma/xilinx/zynqmp_dma.c
> index 931e41a72..a2b0be07e 100644
> --- a/drivers/dma/xilinx/zynqmp_dma.c
> +++ b/drivers/dma/xilinx/zynqmp_dma.c
> @@ -453,6 +453,7 @@ static void zynqmp_dma_free_desc_list(struct zynqmp_dma_chan *chan,
>
>  	list_for_each_entry_safe(desc, next, list, node)
>  		zynqmp_dma_free_descriptor(chan, desc);
> +	INIT_LIST_HEAD(list);

Would it not be better to use list_del_init() where we delete it rather
than do the init here?

Thanks
--
~Vinod

2020-05-02 13:13:11

by Rafał Hibner

Subject: Re: [PATCH] dma: zynqmp_dma: Initialize descriptor list after freeing during reset

Hello Vinod,

On 02.05.2020 14:32, Vinod Koul wrote:
> Would it not be better to use list_del_init() where we delete it rather
> than do the init here?
>

The problem is not that the list element itself is uninitialized.
The problem is that during fault conditions (zynqmp_dma_reset) all
elements are moved to the free list, but the head of the list they
came from is not reinitialized.

In the normal flow, elements are removed with list_del and then
returned to the free list by zynqmp_dma_free_descriptor.

static void zynqmp_dma_chan_desc_cleanup(struct zynqmp_dma_chan *chan)
{
    ...
    list_for_each_entry_safe(desc, next, &chan->done_list, node) {
        ...
        list_del(&desc->node);
        ...
        zynqmp_dma_free_descriptor(chan, desc);
    }
}

zynqmp_dma_free_descriptor does not delete elements from the
list by itself.
I am not the author of this driver, so I fixed it with a
non-intrusive change.

Anyway, I do not see how using list_del_init would fix the bug.

Regards, Rafal
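
The failure mode described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not driver code: `struct list_head`, `init_list_head`, `list_empty`, and `list_add_tail` are simplified stand-ins for the kernel helpers with similar names, and `head_is_stale` and the `pending` list are hypothetical names used only for illustration. It re-adds every node to a free list without unlinking it first, as the reset path is described to do, and reports whether the source head still points at the moved nodes.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Link node at the tail of head. It does NOT unlink node from any old list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/*
 * Hypothetical helper: returns true if the source head still looks
 * non-empty after all of its nodes were re-added to another list.
 */
static bool head_is_stale(bool reinit_head)
{
    struct list_head pending, free_list, a, b;
    struct list_head *pos, *next;

    init_list_head(&pending);
    init_list_head(&free_list);
    list_add_tail(&a, &pending);
    list_add_tail(&b, &pending);

    /* Safe iteration: remember the successor before relinking pos. */
    for (pos = pending.next, next = pos->next; pos != &pending;
         pos = next, next = pos->next)
        list_add_tail(pos, &free_list);

    assert(!list_empty(&free_list));

    /* Models the INIT_LIST_HEAD(list) call added by the patch. */
    if (reinit_head)
        init_list_head(&pending);

    return !list_empty(&pending);
}
```

Under these assumptions, `head_is_stale(false)` returns true because the old head still references nodes that now live on the free list, while `head_is_stale(true)` returns false.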

2020-05-04 05:35:32

by Harini Katakam

Subject: RE: [PATCH] dma: zynqmp_dma: Initialize descriptor list after freeing during reset

Hi Vinod,

> -----Original Message-----
> From: Vinod Koul [mailto:[email protected]]
> Sent: Monday, May 4, 2020 10:46 AM
> To: Rafał Hibner <[email protected]>
> Cc: Appana Durga Kedareswara Rao <[email protected]>; Radhey Shyam
> Pandey <[email protected]>; Harini Katakam <[email protected]>; Dan
> Williams <[email protected]>; Michal Simek <[email protected]>; open
> list:DMA GENERIC OFFLOAD ENGINE SUBSYSTEM
> <[email protected]>; moderated list:ARM/ZYNQ ARCHITECTURE
> <[email protected]>; open list <linux-
> [email protected]>
> Subject: Re: [PATCH] dma: zynqmp_dma: Initialize descriptor list after freeing
> during reset
>
> On 02-05-20, 15:00, Rafał Hibner wrote:
> > Hello Vinod,
> >
> > On 02.05.2020 14:32, Vinod Koul wrote:
> > > Would it not be better to use list_del_init() where we delete it
> > > rather than do the init here?
> > >
> >
> > It is not a problem of list element itself not being initialized.
> > The problem is that during fault conditions (zynqmp_dma_reset) all
> > elements are moved to free list. List head however is not reinitialized.
> >
> > In normal flow elements are removed by list_del and resubmitted to
> > free list with zynqmp_dma_free_descriptor.
> >
> > static void zynqmp_dma_chan_desc_cleanup(struct zynqmp_dma_chan *chan)
> > {
> >     ...
> >     list_for_each_entry_safe(desc, next, &chan->done_list, node) {
> >         ...
> >         list_del(&desc->node);
> >         ...
> >         zynqmp_dma_free_descriptor(chan, desc);
> >     }
> > }
> >
> > The zynqmp_dma_free_descriptor does not delete elements from the list
> > by itself.
> > I am not the author of this driver so I fixed it by doing non intrusive
> > changes.
> >
> > Anyways, I do not see how using list_del_init would fix the bug.
>
> Looking at this, I think it would make sense to do list_splice_init() before we
> send the list to be freed.
>
> Radhey/Appana are cced, they should test this.

I tested this patch (I took over from Appana Kedar).
Once this discussion concludes and Rafal sends a v2, I'll test again and add my tag.

Regards,
Harini

2020-05-04 05:56:19

by Vinod Koul

Subject: Re: [PATCH] dma: zynqmp_dma: Initialize descriptor list after freeing during reset

On 02-05-20, 15:00, Rafał Hibner wrote:
> Hello Vinod,
>
> On 02.05.2020 14:32, Vinod Koul wrote:
> > Would it not be better to use list_del_init() where we delete it rather
> > than do the init here?
> >
>
> It is not a problem of list element itself not being initialized.
> The problem is that during fault conditions (zynqmp_dma_reset) all
> elements are moved to free list. List head however is not reinitialized.
>
> In normal flow elements are removed by list_del and resubmitted to
> free list with zynqmp_dma_free_descriptor.
>
> static void zynqmp_dma_chan_desc_cleanup(struct zynqmp_dma_chan *chan)
> {
>     ...
>     list_for_each_entry_safe(desc, next, &chan->done_list, node) {
>         ...
>         list_del(&desc->node);
>         ...
>         zynqmp_dma_free_descriptor(chan, desc);
>     }
> }
>
> The zynqmp_dma_free_descriptor does not delete elements from the
> list by itself.
> I am not the author of this driver so I fixed it by
> doing non intrusive changes.
>
> Anyways, I do not see how using list_del_init would fix the bug.

Looking at this, I think it would make sense to call list_splice_init()
before we send the list to be freed.

Radhey/Appana are CCed; they should test this.

--
~Vinod
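
The thread does not spell out how list_splice_init() would be applied, so the following is only one possible reading, sketched in userspace under stated assumptions. The list helpers are simplified stand-ins for the kernel ones, and `source_empty_after_splice`, `pending`, and `tmp` are hypothetical names. The idea shown is to move all entries onto a temporary head, which also reinitializes the source head, and only then hand the temporary list to the freeing loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Move every entry of list to the front of head, then reinitialize list. */
static void list_splice_init(struct list_head *list, struct list_head *head)
{
    if (!list_empty(list)) {
        struct list_head *first = list->next;
        struct list_head *last = list->prev;
        struct list_head *at = head->next;

        first->prev = head;
        head->next = first;
        last->next = at;
        at->prev = last;
        init_list_head(list);
    }
}

static int list_count(const struct list_head *head)
{
    const struct list_head *pos;
    int n = 0;

    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}

/* Hypothetical helper: splice to a temporary head, then free from it. */
static bool source_empty_after_splice(void)
{
    struct list_head pending, free_list, tmp, a, b;
    struct list_head *pos, *next;

    init_list_head(&pending);
    init_list_head(&free_list);
    init_list_head(&tmp);
    list_add_tail(&a, &pending);
    list_add_tail(&b, &pending);

    /* pending is empty and valid again from this point on. */
    list_splice_init(&pending, &tmp);

    /* Re-add without unlinking; only the throwaway tmp head goes stale. */
    for (pos = tmp.next, next = pos->next; pos != &tmp;
         pos = next, next = pos->next)
        list_add_tail(pos, &free_list);

    assert(list_count(&free_list) == 2);
    return list_empty(&pending);
}
```

In this sketch the stale pointers end up in the temporary head, which is discarded, so the long-lived source head stays consistent. Whether this is preferable to the other fixes in the thread is a design question the sketch does not settle.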

2020-05-06 10:37:07

by Rafał Hibner

Subject: [PATCH V2] dma: zynqmp_dma: Move list_del inside zynqmp_dma_free_descriptor.

List elements are not formally removed from the list during zynqmp_dma_reset.

Signed-off-by: Rafal Hibner <[email protected]>
---
> Looking at this, I think it would make sense to do list_splice_init() before we
> send the list to be freed.

Moving list_del inside zynqmp_dma_free_descriptor fixes the problem just as well
and avoids INIT_LIST_HEAD(list).
I still don't see how list_splice_init would help here.

Regards,
Rafal

---
drivers/dma/xilinx/zynqmp_dma.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/dma/xilinx/zynqmp_dma.c b/drivers/dma/xilinx/zynqmp_dma.c
index 931e41a72..df0403335 100644
--- a/drivers/dma/xilinx/zynqmp_dma.c
+++ b/drivers/dma/xilinx/zynqmp_dma.c
@@ -434,6 +434,7 @@ static void zynqmp_dma_free_descriptor(struct zynqmp_dma_chan *chan,
 	struct zynqmp_dma_desc_sw *child, *next;

 	chan->desc_free_cnt++;
+	list_del(&sdesc->node);
 	list_add_tail(&sdesc->node, &chan->free_list);
 	list_for_each_entry_safe(child, next, &sdesc->tx_list, node) {
 		chan->desc_free_cnt++;
@@ -604,8 +605,6 @@ static void zynqmp_dma_chan_desc_cleanup(struct zynqmp_dma_chan *chan)
 		dma_async_tx_callback callback;
 		void *callback_param;

-		list_del(&desc->node);
-
 		callback = desc->async_tx.callback;
 		callback_param = desc->async_tx.callback_param;
 		if (callback) {
--
2.20.1
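
The effect of unlinking inside the free helper can be illustrated with a small userspace sketch. This is not the driver code: the list helpers are simplified stand-ins for the kernel ones (this `list_del` clears the pointers instead of poisoning them), and `free_node`, `source_empty_after_free`, and `pending` are hypothetical names. Because each node is unlinked before it is re-added, the source head is empty once the loop finishes and needs no reinitialization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink node from its current list and clear its pointers. */
static void list_del(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = NULL;
    node->prev = NULL;
}

/* Hypothetical analogue of the V2 helper: unlink first, then re-add. */
static void free_node(struct list_head *node, struct list_head *free_list)
{
    list_del(node);
    list_add_tail(node, free_list);
}

static bool source_empty_after_free(void)
{
    struct list_head pending, free_list, a, b;
    struct list_head *pos, *next;

    init_list_head(&pending);
    init_list_head(&free_list);
    list_add_tail(&a, &pending);
    list_add_tail(&b, &pending);

    /* Safe iteration: the successor is saved before pos is unlinked. */
    for (pos = pending.next, next = pos->next; pos != &pending;
         pos = next, next = pos->next)
        free_node(pos, &free_list);

    /* Both nodes arrived on the free list in their original order. */
    assert(free_list.next == &a && free_list.prev == &b);
    return list_empty(&pending);
}
```

Here `source_empty_after_free()` returns true without any explicit reinitialization of the source head, which matches the claim that this approach avoids INIT_LIST_HEAD(list).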

2020-05-15 05:53:08

by Vinod Koul

Subject: Re: [PATCH V2] dma: zynqmp_dma: Move list_del inside zynqmp_dma_free_descriptor.

On 06-05-20, 12:28, Rafał Hibner wrote:
> List elements are not formally removed from list during zynqmp_dma_reset.

Applied after fixing subsystem name to dmaengine, thanks
--
~Vinod