2019-11-06 22:36:00

by Jonas Karlman

[permalink] [raw]
Subject: [PATCH v3 2/5] media: hantro: Reduce H264 extra space for motion vectors

A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
macroblock with additional 32 bytes on multi-core variants.

Memory layout is as follow:

+---------------------------+
| Y-plane 256 bytes x MBs |
+---------------------------+
| UV-plane 128 bytes x MBs |
+---------------------------+
| MV buffer 64 bytes x MBs |
+---------------------------+
| MC sync 32 bytes |
+---------------------------+

Reduce the extra space allocated now that motion vector buffer offset no
longer is based on the extra space.

Only allocate extra space for 64 bytes x MBs of motion vector buffer
and 32 bytes for multi-core sync.

Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
Signed-off-by: Jonas Karlman <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
---
Changes in v3:
- add memory layout to code comment (Boris)
Changes in v2:
- updated commit message
---
drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index 3dae52abb96c..c8c896a06f58 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -240,14 +240,30 @@ static int vidioc_try_fmt(struct file *file, void *priv, struct v4l2_format *f,
v4l2_fill_pixfmt_mp(pix_mp, fmt->fourcc, pix_mp->width,
pix_mp->height);
/*
+ * A decoded 8-bit 4:2:0 NV12 frame may need memory for up to
+ * 448 bytes per macroblock with additional 32 bytes on
+ * multi-core variants.
+ *
* The H264 decoder needs extra space on the output buffers
* to store motion vectors. This is needed for reference
* frames.
+ *
+ * Memory layout is as follow:
+ *
+ * +---------------------------+
+ * | Y-plane 256 bytes x MBs |
+ * +---------------------------+
+ * | UV-plane 128 bytes x MBs |
+ * +---------------------------+
+ * | MV buffer 64 bytes x MBs |
+ * +---------------------------+
+ * | MC sync 32 bytes |
+ * +---------------------------+
*/
if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
pix_mp->plane_fmt[0].sizeimage +=
- 128 * DIV_ROUND_UP(pix_mp->width, 16) *
- DIV_ROUND_UP(pix_mp->height, 16);
+ 64 * MB_WIDTH(pix_mp->width) *
+ MB_WIDTH(pix_mp->height) + 32;
} else if (!pix_mp->plane_fmt[0].sizeimage) {
/*
* For coded formats the application can specify
--
2.17.1


2019-11-20 13:38:02

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] media: hantro: Reduce H264 extra space for motion vectors

Hi Jonas,

On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman <[email protected]> wrote:
>
> A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
> macroblock with additional 32 bytes on multi-core variants.
>
> Memory layout is as follow:
>
> +---------------------------+
> | Y-plane 256 bytes x MBs |
> +---------------------------+
> | UV-plane 128 bytes x MBs |
> +---------------------------+
> | MV buffer 64 bytes x MBs |
> +---------------------------+
> | MC sync 32 bytes |
> +---------------------------+
>
> Reduce the extra space allocated now that motion vector buffer offset no
> longer is based on the extra space.
>
> Only allocate extra space for 64 bytes x MBs of motion vector buffer
> and 32 bytes for multi-core sync.
>
> Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
> Signed-off-by: Jonas Karlman <[email protected]>
> Reviewed-by: Boris Brezillon <[email protected]>
> ---
> Changes in v3:
> - add memory layout to code comment (Boris)
> Changes in v2:
> - updated commit message
> ---
> drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>

Thanks for the patch!

What platform did you test it on and how? Was it tested with IOMMU enabled?

Best regards,
Tomasz

> diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
> index 3dae52abb96c..c8c896a06f58 100644
> --- a/drivers/staging/media/hantro/hantro_v4l2.c
> +++ b/drivers/staging/media/hantro/hantro_v4l2.c
> @@ -240,14 +240,30 @@ static int vidioc_try_fmt(struct file *file, void *priv, struct v4l2_format *f,
> v4l2_fill_pixfmt_mp(pix_mp, fmt->fourcc, pix_mp->width,
> pix_mp->height);
> /*
> + * A decoded 8-bit 4:2:0 NV12 frame may need memory for up to
> + * 448 bytes per macroblock with additional 32 bytes on
> + * multi-core variants.
> + *
> * The H264 decoder needs extra space on the output buffers
> * to store motion vectors. This is needed for reference
> * frames.
> + *
> + * Memory layout is as follow:
> + *
> + * +---------------------------+
> + * | Y-plane 256 bytes x MBs |
> + * +---------------------------+
> + * | UV-plane 128 bytes x MBs |
> + * +---------------------------+
> + * | MV buffer 64 bytes x MBs |
> + * +---------------------------+
> + * | MC sync 32 bytes |
> + * +---------------------------+
> */
> if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
> pix_mp->plane_fmt[0].sizeimage +=
> - 128 * DIV_ROUND_UP(pix_mp->width, 16) *
> - DIV_ROUND_UP(pix_mp->height, 16);
> + 64 * MB_WIDTH(pix_mp->width) *
> + MB_WIDTH(pix_mp->height) + 32;
> } else if (!pix_mp->plane_fmt[0].sizeimage) {
> /*
> * For coded formats the application can specify
> --
> 2.17.1
>

2019-12-09 18:12:39

by Ezequiel Garcia

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] media: hantro: Reduce H264 extra space for motion vectors

On Wed, 2019-11-20 at 21:44 +0900, Tomasz Figa wrote:
> Hi Jonas,
>
> On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman <[email protected]> wrote:
> > A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
> > macroblock with additional 32 bytes on multi-core variants.
> >
> > Memory layout is as follow:
> >
> > +---------------------------+
> > > Y-plane 256 bytes x MBs |
> > +---------------------------+
> > > UV-plane 128 bytes x MBs |
> > +---------------------------+
> > > MV buffer 64 bytes x MBs |
> > +---------------------------+
> > > MC sync 32 bytes |
> > +---------------------------+
> >
> > Reduce the extra space allocated now that motion vector buffer offset no
> > longer is based on the extra space.
> >
> > Only allocate extra space for 64 bytes x MBs of motion vector buffer
> > and 32 bytes for multi-core sync.
> >
> > Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
> > Signed-off-by: Jonas Karlman <[email protected]>
> > Reviewed-by: Boris Brezillon <[email protected]>
> > ---
> > Changes in v3:
> > - add memory layout to code comment (Boris)
> > Changes in v2:
> > - updated commit message
> > ---
> > drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++--
> > 1 file changed, 18 insertions(+), 2 deletions(-)
> >
>
> Thanks for the patch!
>
> What platform did you test it on and how? Was it tested with IOMMU enabled?

Hello Tomasz,

Please note that this series has been picked-up and is merged
in v5.5-rc1.

IIRC, we tested these patches on RK3399 and RK3288 (that means
with an IOMMU). I've just ran some more extensive tests on RK3288,
on media/master; and I plan to test some more on RK3399 later this week.

Do you have any specific concern in mind?

Thanks,
Ezequiel

2020-01-08 14:21:12

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] media: hantro: Reduce H264 extra space for motion vectors

On Tue, Dec 10, 2019 at 3:11 AM Ezequiel Garcia <[email protected]> wrote:
>
> On Wed, 2019-11-20 at 21:44 +0900, Tomasz Figa wrote:
> > Hi Jonas,
> >
> > On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman <[email protected]> wrote:
> > > A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
> > > macroblock with additional 32 bytes on multi-core variants.
> > >
> > > Memory layout is as follow:
> > >
> > > +---------------------------+
> > > > Y-plane 256 bytes x MBs |
> > > +---------------------------+
> > > > UV-plane 128 bytes x MBs |
> > > +---------------------------+
> > > > MV buffer 64 bytes x MBs |
> > > +---------------------------+
> > > > MC sync 32 bytes |
> > > +---------------------------+
> > >
> > > Reduce the extra space allocated now that motion vector buffer offset no
> > > longer is based on the extra space.
> > >
> > > Only allocate extra space for 64 bytes x MBs of motion vector buffer
> > > and 32 bytes for multi-core sync.
> > >
> > > Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
> > > Signed-off-by: Jonas Karlman <[email protected]>
> > > Reviewed-by: Boris Brezillon <[email protected]>
> > > ---
> > > Changes in v3:
> > > - add memory layout to code comment (Boris)
> > > Changes in v2:
> > > - updated commit message
> > > ---
> > > drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++--
> > > 1 file changed, 18 insertions(+), 2 deletions(-)
> > >
> >
> > Thanks for the patch!
> >
> > What platform did you test it on and how? Was it tested with IOMMU enabled?
>
> Hello Tomasz,
>
> Please note that this series has been picked-up and is merged
> in v5.5-rc1.
>
> IIRC, we tested these patches on RK3399 and RK3288 (that means
> with an IOMMU). I've just ran some more extensive tests on RK3288,
> on media/master; and I plan to test some more on RK3399 later this week.
>
> Do you have any specific concern in mind?

I specifically want to know whether we're 100% sure that those sizes
are correct. The IOMMU still works on page granularity so it's
possible that the allocation could be just big enough by luck.

Best regards,
Tomasz

2020-01-08 15:16:49

by Jonas Karlman

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] media: hantro: Reduce H264 extra space for motion vectors

On 2020-01-08 13:59, Tomasz Figa wrote:
> On Tue, Dec 10, 2019 at 3:11 AM Ezequiel Garcia <[email protected]> wrote:
>>
>> On Wed, 2019-11-20 at 21:44 +0900, Tomasz Figa wrote:
>>> Hi Jonas,
>>>
>>> On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman <[email protected]> wrote:
>>>> A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
>>>> macroblock with additional 32 bytes on multi-core variants.
>>>>
>>>> Memory layout is as follow:
>>>>
>>>> +---------------------------+
>>>>> Y-plane 256 bytes x MBs |
>>>> +---------------------------+
>>>>> UV-plane 128 bytes x MBs |
>>>> +---------------------------+
>>>>> MV buffer 64 bytes x MBs |
>>>> +---------------------------+
>>>>> MC sync 32 bytes |
>>>> +---------------------------+
>>>>
>>>> Reduce the extra space allocated now that motion vector buffer offset no
>>>> longer is based on the extra space.
>>>>
>>>> Only allocate extra space for 64 bytes x MBs of motion vector buffer
>>>> and 32 bytes for multi-core sync.
>>>>
>>>> Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
>>>> Signed-off-by: Jonas Karlman <[email protected]>
>>>> Reviewed-by: Boris Brezillon <[email protected]>
>>>> ---
>>>> Changes in v3:
>>>> - add memory layout to code comment (Boris)
>>>> Changes in v2:
>>>> - updated commit message
>>>> ---
>>>> drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++--
>>>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>>>
>>>
>>> Thanks for the patch!
>>>
>>> What platform did you test it on and how? Was it tested with IOMMU enabled?
>>
>> Hello Tomasz,
>>
>> Please note that this series has been picked-up and is merged
>> in v5.5-rc1.
>>
>> IIRC, we tested these patches on RK3399 and RK3288 (that means
>> with an IOMMU). I've just ran some more extensive tests on RK3288,
>> on media/master; and I plan to test some more on RK3399 later this week.
>>
>> Do you have any specific concern in mind?
>
> I specifically want to know whether we're 100% sure that those sizes
> are correct. The IOMMU still works on page granularity so it's
> possible that the allocation could be just big enough by luck.

One of my RK3288 TRM [1] contains the following:

Direct mode motion vector write:
Its base addr is right after decode output picture data
Its length is mbwidth*mbheight*64

Also both the mpp library and imx-vpu-hantro code both use mbwidth*mbheight*64.
So I feel confident that the buffer size is correct.

[1] Rockchip RK3288TRM V1.1 Part3-Graphic and multi-media.pdf

Regards,
Jonas

>
> Best regards,
> Tomasz
>

2020-01-16 05:13:35

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] media: hantro: Reduce H264 extra space for motion vectors

On Thu, Jan 9, 2020 at 12:10 AM Jonas Karlman <[email protected]> wrote:
>
> On 2020-01-08 13:59, Tomasz Figa wrote:
> > On Tue, Dec 10, 2019 at 3:11 AM Ezequiel Garcia <[email protected]> wrote:
> >>
> >> On Wed, 2019-11-20 at 21:44 +0900, Tomasz Figa wrote:
> >>> Hi Jonas,
> >>>
> >>> On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman <[email protected]> wrote:
> >>>> A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per
> >>>> macroblock with additional 32 bytes on multi-core variants.
> >>>>
> >>>> Memory layout is as follow:
> >>>>
> >>>> +---------------------------+
> >>>>> Y-plane 256 bytes x MBs |
> >>>> +---------------------------+
> >>>>> UV-plane 128 bytes x MBs |
> >>>> +---------------------------+
> >>>>> MV buffer 64 bytes x MBs |
> >>>> +---------------------------+
> >>>>> MC sync 32 bytes |
> >>>> +---------------------------+
> >>>>
> >>>> Reduce the extra space allocated now that motion vector buffer offset no
> >>>> longer is based on the extra space.
> >>>>
> >>>> Only allocate extra space for 64 bytes x MBs of motion vector buffer
> >>>> and 32 bytes for multi-core sync.
> >>>>
> >>>> Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
> >>>> Signed-off-by: Jonas Karlman <[email protected]>
> >>>> Reviewed-by: Boris Brezillon <[email protected]>
> >>>> ---
> >>>> Changes in v3:
> >>>> - add memory layout to code comment (Boris)
> >>>> Changes in v2:
> >>>> - updated commit message
> >>>> ---
> >>>> drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++--
> >>>> 1 file changed, 18 insertions(+), 2 deletions(-)
> >>>>
> >>>
> >>> Thanks for the patch!
> >>>
> >>> What platform did you test it on and how? Was it tested with IOMMU enabled?
> >>
> >> Hello Tomasz,
> >>
> >> Please note that this series has been picked-up and is merged
> >> in v5.5-rc1.
> >>
> >> IIRC, we tested these patches on RK3399 and RK3288 (that means
> >> with an IOMMU). I've just ran some more extensive tests on RK3288,
> >> on media/master; and I plan to test some more on RK3399 later this week.
> >>
> >> Do you have any specific concern in mind?
> >
> > I specifically want to know whether we're 100% sure that those sizes
> > are correct. The IOMMU still works on page granularity so it's
> > possible that the allocation could be just big enough by luck.
>
> One of my RK3288 TRM [1] contains the following:
>
> Direct mode motion vector write:
> Its base addr is right after decode output picture data
> Its length is mbwidth*mbheight*64
>
> Also both the mpp library and imx-vpu-hantro code both use mbwidth*mbheight*64.
> So I feel confident that the buffer size is correct.
>
> [1] Rockchip RK3288TRM V1.1 Part3-Graphic and multi-media.pdf

Fair enough. Thanks!

Best regards,
Tomasz