2023-05-23 16:33:30

by Benjamin Gaignard

Subject: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

This fixes the following issue observed on Odroid-M1 board:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
Mem abort info:
...
Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
Hardware name: Hardkernel ODROID-M1 (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
...
Call trace:
hantro_try_fmt+0xa0/0x278 [hantro_vpu]
hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
hantro_reset_fmts+0x18/0x38 [hantro_vpu]
hantro_open+0xd4/0x20c [hantro_vpu]
v4l2_open+0x80/0x120 [videodev]
chrdev_open+0xc0/0x22c
do_dentry_open+0x13c/0x48c
vfs_open+0x2c/0x38
path_openat+0x550/0x934
do_filp_open+0x80/0x12c
do_sys_openat2+0xb4/0x168
__arm64_sys_openat+0x64/0xac
invoke_syscall+0x48/0x114
el0_svc_common+0x100/0x120
do_el0_svc+0x3c/0xa8
el0_svc+0x40/0xa8
el0t_64_sync_handler+0xb8/0xbc
el0t_64_sync+0x190/0x194
Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
---[ end trace 0000000000000000 ]---

Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")

Signed-off-by: Benjamin Gaignard <[email protected]>
---
drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
index 835518534e3b..61cfaaf4e927 100644
--- a/drivers/media/platform/verisilicon/hantro_v4l2.c
+++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
@@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
 	if (!raw_vpu_fmt)
 		return -EINVAL;

-	if (ctx->is_encoder)
+	if (ctx->is_encoder) {
 		encoded_fmt = &ctx->dst_fmt;
-	else
+		ctx->vpu_src_fmt = raw_vpu_fmt;
+	} else {
 		encoded_fmt = &ctx->src_fmt;
+	}

 	hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
 	raw_fmt.width = encoded_fmt->width;
--
2.34.1



2023-05-23 16:36:37

by Ezequiel Garcia

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

Hi Benjamin,

Thanks for the patch.

On Tue, May 23, 2023 at 1:25 PM Benjamin Gaignard
<[email protected]> wrote:
>
> This fixes the following issue observed on Odroid-M1 board:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008

What pointer is NULL? ctx->src_fmt ?

> Mem abort info:
> ...
> Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> Hardware name: Hardkernel ODROID-M1 (DT)
> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> ...
> Call trace:
> hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> hantro_open+0xd4/0x20c [hantro_vpu]
> v4l2_open+0x80/0x120 [videodev]
> chrdev_open+0xc0/0x22c
> do_dentry_open+0x13c/0x48c
> vfs_open+0x2c/0x38
> path_openat+0x550/0x934
> do_filp_open+0x80/0x12c
> do_sys_openat2+0xb4/0x168
> __arm64_sys_openat+0x64/0xac
> invoke_syscall+0x48/0x114
> el0_svc_common+0x100/0x120
> do_el0_svc+0x3c/0xa8
> el0_svc+0x40/0xa8
> el0t_64_sync_handler+0xb8/0xbc
> el0t_64_sync+0x190/0x194
> Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> ---[ end trace 0000000000000000 ]---
>
> Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
>
> Signed-off-by: Benjamin Gaignard <[email protected]>
> ---
> drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> index 835518534e3b..61cfaaf4e927 100644
> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> if (!raw_vpu_fmt)
> return -EINVAL;
>
> - if (ctx->is_encoder)
> + if (ctx->is_encoder) {
> encoded_fmt = &ctx->dst_fmt;
> - else
> + ctx->vpu_src_fmt = raw_vpu_fmt;
> + } else {
> encoded_fmt = &ctx->src_fmt;
> + }
>
> hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> raw_fmt.width = encoded_fmt->width;
> --
> 2.34.1
>

2023-05-23 16:48:30

by Benjamin Gaignard

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver


On 23/05/2023 at 18:28, Ezequiel Garcia wrote:
> Hi Benjamin,
>
> Thanks for the patch.
>
> On Tue, May 23, 2023 at 1:25 PM Benjamin Gaignard
> <[email protected]> wrote:
>> This fixes the following issue observed on Odroid-M1 board:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> What pointer is NULL? ctx->src_fmt ?

It is ctx->vpu_src_fmt (not ctx->src_fmt): that pointer was NULL when
probing the encoder.
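
To illustrate the failure mode, here is a minimal user-space sketch. It
is not the driver code: every sketch_* name, the struct layouts and the
apply_fix switch are hypothetical stand-ins, and the NULL check in
sketch_try_fmt() exists only so the sketch can report the problem
instead of crashing. The reported fault address 0x8 is consistent with
reading a member at offset 8 through a NULL struct pointer; the sketch
assumes a 64-bit build where a u32 member follows one pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a VPU format descriptor. */
struct sketch_fmt {
	const char *name;	/* offset 0 on a 64-bit build */
	uint32_t fourcc;	/* offset 8 on a 64-bit build */
};

/* Hypothetical, simplified stand-in for the per-open context. */
struct sketch_ctx {
	int is_encoder;
	const struct sketch_fmt *vpu_src_fmt;	/* NULL until selected */
};

/* Arbitrary example descriptor; the fourcc value is made up. */
static const struct sketch_fmt sketch_raw_fmt = { "raw", 0x1234 };

/*
 * Consumer that reads through ctx->vpu_src_fmt. Without the NULL
 * check, the load of ->fourcc would fault when the pointer is unset.
 */
static int sketch_try_fmt(const struct sketch_ctx *ctx, uint32_t *fourcc)
{
	if (!ctx->vpu_src_fmt)
		return -1;
	*fourcc = ctx->vpu_src_fmt->fourcc;
	return 0;
}

/*
 * Reset path. With apply_fix set, the encoder case selects the raw
 * format before the consumer runs, mirroring the idea of the patch.
 */
static int sketch_reset_raw_fmt(struct sketch_ctx *ctx, int apply_fix)
{
	uint32_t fourcc;

	if (ctx->is_encoder && apply_fix)
		ctx->vpu_src_fmt = &sketch_raw_fmt;

	return sketch_try_fmt(ctx, &fourcc);
}
```

Calling sketch_reset_raw_fmt() on a fresh encoder context with
apply_fix == 0 reports the unset pointer; with apply_fix == 1 the
pointer is set first and the lookup succeeds.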

>
>> Mem abort info:
>> ...
>> Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
>> CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
>> Hardware name: Hardkernel ODROID-M1 (DT)
>> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
>> lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
>> ...
>> Call trace:
>> hantro_try_fmt+0xa0/0x278 [hantro_vpu]
>> hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
>> hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
>> hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
>> hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
>> hantro_reset_fmts+0x18/0x38 [hantro_vpu]
>> hantro_open+0xd4/0x20c [hantro_vpu]
>> v4l2_open+0x80/0x120 [videodev]
>> chrdev_open+0xc0/0x22c
>> do_dentry_open+0x13c/0x48c
>> vfs_open+0x2c/0x38
>> path_openat+0x550/0x934
>> do_filp_open+0x80/0x12c
>> do_sys_openat2+0xb4/0x168
>> __arm64_sys_openat+0x64/0xac
>> invoke_syscall+0x48/0x114
>> el0_svc_common+0x100/0x120
>> do_el0_svc+0x3c/0xa8
>> el0_svc+0x40/0xa8
>> el0t_64_sync_handler+0xb8/0xbc
>> el0t_64_sync+0x190/0x194
>> Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
>> ---[ end trace 0000000000000000 ]---
>>
>> Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
>>
>> Signed-off-by: Benjamin Gaignard <[email protected]>
>> ---
>> drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
>> index 835518534e3b..61cfaaf4e927 100644
>> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
>> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
>> @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
>> if (!raw_vpu_fmt)
>> return -EINVAL;
>>
>> - if (ctx->is_encoder)
>> + if (ctx->is_encoder) {
>> encoded_fmt = &ctx->dst_fmt;
>> - else
>> + ctx->vpu_src_fmt = raw_vpu_fmt;
>> + } else {
>> encoded_fmt = &ctx->src_fmt;
>> + }
>>
>> hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
>> raw_fmt.width = encoded_fmt->width;
>> --
>> 2.34.1
>>

2023-05-23 17:03:26

by Benjamin Gaignard

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver


On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
> This fixes the following issue observed on Odroid-M1 board:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> Mem abort info:
> ...
> Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> Hardware name: Hardkernel ODROID-M1 (DT)
> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> ...
> Call trace:
> hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> hantro_open+0xd4/0x20c [hantro_vpu]
> v4l2_open+0x80/0x120 [videodev]
> chrdev_open+0xc0/0x22c
> do_dentry_open+0x13c/0x48c
> vfs_open+0x2c/0x38
> path_openat+0x550/0x934
> do_filp_open+0x80/0x12c
> do_sys_openat2+0xb4/0x168
> __arm64_sys_openat+0x64/0xac
> invoke_syscall+0x48/0x114
> el0_svc_common+0x100/0x120
> do_el0_svc+0x3c/0xa8
> el0_svc+0x40/0xa8
> el0t_64_sync_handler+0xb8/0xbc
> el0t_64_sync+0x190/0x194
> Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> ---[ end trace 0000000000000000 ]---
>
> Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
>
> Signed-off-by: Benjamin Gaignard <[email protected]>
> ---

Diederik, Marek, Michael,
I have tested this patch on my boards and I see no regressions on the
decoder side and no more crashes when probing the encoder.
Could you test it on your side to confirm it is OK?

Thorsten, I am trying out the regzbot commands; please tell me if they
are correct.

#regzbot ^introduced db6f68b51e5c
#regzbot title media: verisilicon: null pointer dereference in try_fmt
#regzbot ignore-activity


> drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> index 835518534e3b..61cfaaf4e927 100644
> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> if (!raw_vpu_fmt)
> return -EINVAL;
>
> - if (ctx->is_encoder)
> + if (ctx->is_encoder) {
> encoded_fmt = &ctx->dst_fmt;
> - else
> + ctx->vpu_src_fmt = raw_vpu_fmt;
> + } else {
> encoded_fmt = &ctx->src_fmt;
> + }
>
> hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> raw_fmt.width = encoded_fmt->width;

2023-05-23 17:19:55

by Michael Tretter

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On Tue, 23 May 2023 18:36:09 +0200, Benjamin Gaignard wrote:
>
> On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
> > This fixes the following issue observed on Odroid-M1 board:
> >
> > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > Mem abort info:
> > ...
> > Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> > CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> > Hardware name: Hardkernel ODROID-M1 (DT)
> > pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> > ...
> > Call trace:
> > hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> > hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> > hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> > hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> > hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> > hantro_open+0xd4/0x20c [hantro_vpu]
> > v4l2_open+0x80/0x120 [videodev]
> > chrdev_open+0xc0/0x22c
> > do_dentry_open+0x13c/0x48c
> > vfs_open+0x2c/0x38
> > path_openat+0x550/0x934
> > do_filp_open+0x80/0x12c
> > do_sys_openat2+0xb4/0x168
> > __arm64_sys_openat+0x64/0xac
> > invoke_syscall+0x48/0x114
> > el0_svc_common+0x100/0x120
> > do_el0_svc+0x3c/0xa8
> > el0_svc+0x40/0xa8
> > el0t_64_sync_handler+0xb8/0xbc
> > el0t_64_sync+0x190/0x194
> > Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> > ---[ end trace 0000000000000000 ]---
> >
> > Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")

This patch partially reverts the previous commit. I wonder whether the reason
for resetting the context format only if the targeted queue is not busy still
stands.

> >
> > Signed-off-by: Benjamin Gaignard <[email protected]>

Tested-by: Michael Tretter <[email protected]>

> > ---
>
> Diederick, Marek, Michael,
> I have tested this patch on my boards and I see no regressions on
> decoder part and no more crash when probing the encoder.
> Could you test it on your side to confirm it is ok ?
>
> Thorsten, I try/test regzbot commands, please tell me if it is correct.
>
> #regzbot ^introduced db6f68b51e5c
> #regzbot title media: verisilicon: null pointer dereference in try_fmt
> #regzbot ignore-activity
>
>
> > drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > index 835518534e3b..61cfaaf4e927 100644
> > --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> > +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> > if (!raw_vpu_fmt)
> > return -EINVAL;
> > - if (ctx->is_encoder)
> > + if (ctx->is_encoder) {
> > encoded_fmt = &ctx->dst_fmt;
> > - else
> > + ctx->vpu_src_fmt = raw_vpu_fmt;
> > + } else {
> > encoded_fmt = &ctx->src_fmt;
> > + }
> > hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> > raw_fmt.width = encoded_fmt->width;
>

2023-05-23 17:46:42

by Ezequiel Garcia

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

Hi guys,

After reviewing the format logic (hantro_reset_encoded_fmt and
hantro_reset_raw_fmt), it seems to me that trying to support decoders,
encoders and so many different SoC variants is getting increasingly
fragile. This driver is becoming a big fat monolith, and regressions
like this will be increasingly frequent.

The only codec that supports encoding right now is JPEG, so I think
it's a good idea to remove it from this driver for good
and split it into its own driver.

Anyone volunteering? :-)

Thanks,
Ezequiel

On Tue, May 23, 2023 at 2:06 PM Michael Tretter
<[email protected]> wrote:
>
> On Tue, 23 May 2023 18:36:09 +0200, Benjamin Gaignard wrote:
> >
> > On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
> > > This fixes the following issue observed on Odroid-M1 board:
> > >
> > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > > Mem abort info:
> > > ...
> > > Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> > > CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> > > Hardware name: Hardkernel ODROID-M1 (DT)
> > > pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > > lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> > > ...
> > > Call trace:
> > > hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > > hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> > > hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> > > hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> > > hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> > > hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> > > hantro_open+0xd4/0x20c [hantro_vpu]
> > > v4l2_open+0x80/0x120 [videodev]
> > > chrdev_open+0xc0/0x22c
> > > do_dentry_open+0x13c/0x48c
> > > vfs_open+0x2c/0x38
> > > path_openat+0x550/0x934
> > > do_filp_open+0x80/0x12c
> > > do_sys_openat2+0xb4/0x168
> > > __arm64_sys_openat+0x64/0xac
> > > invoke_syscall+0x48/0x114
> > > el0_svc_common+0x100/0x120
> > > do_el0_svc+0x3c/0xa8
> > > el0_svc+0x40/0xa8
> > > el0t_64_sync_handler+0xb8/0xbc
> > > el0t_64_sync+0x190/0x194
> > > Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> > > ---[ end trace 0000000000000000 ]---
> > >
> > > Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
>
> This patch partially reverts the previous commit. I wonder whether the reason
> for resetting the context format only if the targeted queue is not busy still
> stands.
>
> > >
> > > Signed-off-by: Benjamin Gaignard <[email protected]>
>
> Tested-by: Michael Tretter <[email protected]>
>
> > > ---
> >
> > Diederick, Marek, Michael,
> > I have tested this patch on my boards and I see no regressions on
> > decoder part and no more crash when probing the encoder.
> > Could you test it on your side to confirm it is ok ?
> >
> > Thorsten, I try/test regzbot commands, please tell me if it is correct.
> >
> > #regzbot ^introduced db6f68b51e5c
> > #regzbot title media: verisilicon: null pointer dereference in try_fmt
> > #regzbot ignore-activity
> >
> >
> > > drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > index 835518534e3b..61cfaaf4e927 100644
> > > --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> > > if (!raw_vpu_fmt)
> > > return -EINVAL;
> > > - if (ctx->is_encoder)
> > > + if (ctx->is_encoder) {
> > > encoded_fmt = &ctx->dst_fmt;
> > > - else
> > > + ctx->vpu_src_fmt = raw_vpu_fmt;
> > > + } else {
> > > encoded_fmt = &ctx->src_fmt;
> > > + }
> > > hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> > > raw_fmt.width = encoded_fmt->width;
> >

2023-05-23 20:55:47

by Diederik de Haas

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On Tuesday, 23 May 2023 18:36:09 CEST Benjamin Gaignard wrote:
> Diederik, Marek, Michael,
> I have tested this patch on my boards and I see no regressions on
> decoder part and no more crash when probing the encoder.
> Could you test it on your side to confirm it is ok ?

With this patch I'm (also) not seeing the crash

Tested-by: Diederik de Haas <[email protected]>



2023-05-23 23:29:15

by Nicolas Dufresne

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On Tuesday, 23 May 2023 at 14:36 -0300, Ezequiel Garcia wrote:
> Hi guys,
>
> After reviewing the format logic (hantro_reset_encoded_fmt and
> hantro_reset_raw_fmt).
> It seems to me trying to support Decoders, Encoders and so many
> different SoC Variants, is getting increasingly fragile.
> This driver is becoming a big fat monolith. Regressions like this will
> be increasingly frequent.
>
> The only codec that supports encoding right now is JPEG, so I think
> it's a good idea to remove it for good,
> and split it to its own driver.
>
> Anyone volunteering? :-)

We won't have that luxury with VP8 and H.264, as the decoder and encoder share
the same cache memory. They must be time sliced. Note that this driver is only
missing VP8/H.264 encoding before it becomes maintenance only (there won't be
any interesting feature left), so I would not start on a big refactoring, as
this may cause more trouble than good. Anything newer like VC8000 or VC9000
should be a new driver, and with an encoder/decoder split.

regards,
Nicolas

p.s. This is my personal opinion. In general, we should improve the helpers if
there is too much boilerplate rather than creating monolithic drivers, and on
that, I believe I agree. But the H1/G1 combo has hardware dependencies which
have been solved that way, and changing that now is a large amount of work for
a relatively quiet driver. Feel free to split G2 away from that driver; that
would make sense, as it's not sharing anything.

>
> Thanks,
> Ezequiel
>
> On Tue, May 23, 2023 at 2:06 PM Michael Tretter
> <[email protected]> wrote:
> >
> > On Tue, 23 May 2023 18:36:09 +0200, Benjamin Gaignard wrote:
> > >
> > > On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
> > > > This fixes the following issue observed on Odroid-M1 board:
> > > >
> > > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > > > Mem abort info:
> > > > ...
> > > > Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> > > > CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> > > > Hardware name: Hardkernel ODROID-M1 (DT)
> > > > pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > > > lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> > > > ...
> > > > Call trace:
> > > > hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > > > hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> > > > hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> > > > hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> > > > hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> > > > hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> > > > hantro_open+0xd4/0x20c [hantro_vpu]
> > > > v4l2_open+0x80/0x120 [videodev]
> > > > chrdev_open+0xc0/0x22c
> > > > do_dentry_open+0x13c/0x48c
> > > > vfs_open+0x2c/0x38
> > > > path_openat+0x550/0x934
> > > > do_filp_open+0x80/0x12c
> > > > do_sys_openat2+0xb4/0x168
> > > > __arm64_sys_openat+0x64/0xac
> > > > invoke_syscall+0x48/0x114
> > > > el0_svc_common+0x100/0x120
> > > > do_el0_svc+0x3c/0xa8
> > > > el0_svc+0x40/0xa8
> > > > el0t_64_sync_handler+0xb8/0xbc
> > > > el0t_64_sync+0x190/0x194
> > > > Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> > > > ---[ end trace 0000000000000000 ]---
> > > >
> > > > Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
> >
> > This patch partially reverts the previous commit. I wonder whether the reason
> > for resetting the context format only if the targeted queue is not busy still
> > stands.
> >
> > > >
> > > > Signed-off-by: Benjamin Gaignard <[email protected]>
> >
> > Tested-by: Michael Tretter <[email protected]>
> >
> > > > ---
> > >
> > > Diederick, Marek, Michael,
> > > I have tested this patch on my boards and I see no regressions on
> > > decoder part and no more crash when probing the encoder.
> > > Could you test it on your side to confirm it is ok ?
> > >
> > > Thorsten, I try/test regzbot commands, please tell me if it is correct.
> > >
> > > #regzbot ^introduced db6f68b51e5c
> > > #regzbot title media: verisilicon: null pointer dereference in try_fmt
> > > #regzbot ignore-activity
> > >
> > >
> > > > drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > > index 835518534e3b..61cfaaf4e927 100644
> > > > --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > > +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > > @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> > > > if (!raw_vpu_fmt)
> > > > return -EINVAL;
> > > > - if (ctx->is_encoder)
> > > > + if (ctx->is_encoder) {
> > > > encoded_fmt = &ctx->dst_fmt;
> > > > - else
> > > > + ctx->vpu_src_fmt = raw_vpu_fmt;
> > > > + } else {
> > > > encoded_fmt = &ctx->src_fmt;
> > > > + }
> > > > hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> > > > raw_fmt.width = encoded_fmt->width;
> > >


2023-05-24 08:05:44

by Marek Szyprowski

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On 23.05.2023 18:25, Benjamin Gaignard wrote:
> This fixes the following issue observed on Odroid-M1 board:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> Mem abort info:
> ...
> Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> Hardware name: Hardkernel ODROID-M1 (DT)
> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> ...
> Call trace:
> hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> hantro_open+0xd4/0x20c [hantro_vpu]
> v4l2_open+0x80/0x120 [videodev]
> chrdev_open+0xc0/0x22c
> do_dentry_open+0x13c/0x48c
> vfs_open+0x2c/0x38
> path_openat+0x550/0x934
> do_filp_open+0x80/0x12c
> do_sys_openat2+0xb4/0x168
> __arm64_sys_openat+0x64/0xac
> invoke_syscall+0x48/0x114
> el0_svc_common+0x100/0x120
> do_el0_svc+0x3c/0xa8
> el0_svc+0x40/0xa8
> el0t_64_sync_handler+0xb8/0xbc
> el0t_64_sync+0x190/0x194
> Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> ---[ end trace 0000000000000000 ]---
>
> Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
>
> Signed-off-by: Benjamin Gaignard <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
> ---
> drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> index 835518534e3b..61cfaaf4e927 100644
> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> if (!raw_vpu_fmt)
> return -EINVAL;
>
> - if (ctx->is_encoder)
> + if (ctx->is_encoder) {
> encoded_fmt = &ctx->dst_fmt;
> - else
> + ctx->vpu_src_fmt = raw_vpu_fmt;
> + } else {
> encoded_fmt = &ctx->src_fmt;
> + }
>
> hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> raw_fmt.width = encoded_fmt->width;

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland


Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On 23.05.23 18:36, Benjamin Gaignard wrote:
>
> On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
>> This fixes the following issue observed on Odroid-M1 board:
> [...]
> Diederick, Marek, Michael,
> I have tested this patch on my boards and I see no regressions on
> decoder part and no more crash when probing the encoder.
> Could you test it on your side to confirm it is ok ?

They all did, so that is done. Thx for your help, everybody!

/me now hopes this patch will be quickly reviewed, accepted and sent to
Linus to prevent even more people running into this...

> Thorsten, I try/test regzbot commands, please tell me if it is correct.
>
> #regzbot ^introduced db6f68b51e5c
> #regzbot title media: verisilicon: null pointer dereference in try_fmt
> #regzbot ignore-activity

Thx for this; it just means we now track this regression twice. No worries,
let me fix this and also tell regzbot about the fix:

#regzbot dup-of: https://lore.kernel.org/lkml/4995215.LvFx2qVVIh@bagend/
#regzbot fix: media: verisilicon: Additional fix for the crash when
opening the driver

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

2023-05-24 09:28:13

by Hans Verkuil

Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On 24/05/2023 11:06, Thorsten Leemhuis wrote:
> On 23.05.23 18:36, Benjamin Gaignard wrote:
>>
>> On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
>>> This fixes the following issue observed on Odroid-M1 board:
>> [...]
>> Diederick, Marek, Michael,
>> I have tested this patch on my boards and I see no regressions on
>> decoder part and no more crash when probing the encoder.
>> Could you test it on your side to confirm it is ok ?
>
> They all did, so that is done. Thx for your help, everybody!
>
> /me now hopes this patch will be quickly reviewed, accepted and sent to
> Linus to prevent even more people running into this...

I plan to make a PR with 6.4 fixes today or tomorrow.

Regards,

Hans

>
>> Thorsten, I try/test regzbot commands, please tell me if it is correct.
>>
>> #regzbot ^introduced db6f68b51e5c
>> #regzbot title media: verisilicon: null pointer dereference in try_fmt
>> #regzbot ignore-activity
>
> Thx for this, we just now track this regression two times. No worries,
> let me fix this and also tell regzbot about the fix:
>
> #regzbot dup-of: https://lore.kernel.org/lkml/4995215.LvFx2qVVIh@bagend/
> #regzbot fix: media: verisilicon: Additional fix for the crash when
> opening the driver
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.


Subject: Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

On 24.05.23 11:15, Hans Verkuil wrote:
> On 24/05/2023 11:06, Thorsten Leemhuis wrote:
>> On 23.05.23 18:36, Benjamin Gaignard wrote:
>>>
>>> On 23/05/2023 at 18:25, Benjamin Gaignard wrote:
>>>> This fixes the following issue observed on Odroid-M1 board:
>>> [...]
>>> Diederick, Marek, Michael,
>>> I have tested this patch on my boards and I see no regressions on
>>> decoder part and no more crash when probing the encoder.
>>> Could you test it on your side to confirm it is ok ?
>>
>> They all did, so that is done. Thx for your help, everybody!
>>
>> /me now hopes this patch will be quickly reviewed, accepted and sent to
>> Linus to prevent even more people running into this...
>
> I plan to make a PR with 6.4 fixes today or tomorrow.

Ahh, fabulous, many thx!

Ciao, Thorsten