2022-08-08 16:42:03

by Hsia-Jun Li

[permalink] [raw]
Subject: [PATCH 0/2] Add pixel formats used in Synatpics SoC

From: "Hsia-Jun(Randy) Li" <[email protected]>

Those pixel formats are used in Synaptics's VideoSmart series SoCs,
likes VS640, VS680. I just disclose the pixel formats used in the video
codecs and display pipeline this time. Actually any device with a MTR
module could support those tiled and compressed pixel formats. The more
detail about MTR module could be found in the first patch of this serial
of mail.

We may not be able to post any drivers here in a short time, the most of
work in this platform is done in the Trusted Execution Environment and
we didn't use the optee framework.

Please notice that, the memory planes used for video codecs would be 5
when the compression is invoked while it would be 4 for display, the
extra planes in the video codecs is for the decoding internally usage,
it can't append the luma or chroma buffer as many other drivers do,
because this buffer could be only accessed by the video codecs itself,
it requests a different memory security attributes. Any other reason is
described in the v4l pixel formats's patch. I don't know whether a
different numbers of memory planes between drm and v4l2 is acceptable.

I only posted the compression fourcc for the v4l2, because it is really
hard to put the uncompression version of pixel formats under the fourcc.
I would be better that we could have something likes format modifers in
drm here.

https://synaptics.com/products/multimedia-solutions

Hsia-Jun(Randy) Li (2):
drm/fourcc: Add Synaptics VideoSmart tiled modifiers
[WIP]: media: Add Synaptics compressed tiled format

drivers/media/v4l2-core/v4l2-common.c | 1 +
drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++
include/uapi/linux/videodev2.h | 2 ++
4 files changed, 54 insertions(+)

--
2.17.1


2022-08-08 16:44:45

by Hsia-Jun Li

[permalink] [raw]
Subject: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

From: "Hsia-Jun(Randy) Li" <[email protected]>

The most of detail has been written in the drm.
Please notice that the tiled formats here request
one more plane for storing the motion vector metadata.
This buffer won't be compressed, so you can't append
it to luma or chroma plane.

Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
---
drivers/media/v4l2-core/v4l2-common.c | 1 +
drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
include/uapi/linux/videodev2.h | 2 ++
3 files changed, 5 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
index e0fbe6ba4b6c..f645278b3055 100644
--- a/drivers/media/v4l2-core/v4l2-common.c
+++ b/drivers/media/v4l2-core/v4l2-common.c
@@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
{ .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
+ { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
};
unsigned int i;

diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index e6fd355a2e92..8f65964aff08 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
+ case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
+ case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
default:
if (fmt->description[0])
return;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 01e630f2ec78..7e928cb69e7c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -661,6 +661,8 @@ struct v4l2_pix_format {
#define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
#define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
#define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
+#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
+#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */

/* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
#define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
--
2.17.1

2022-08-18 06:44:09

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Hi Randy,

On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>
> From: "Hsia-Jun(Randy) Li" <[email protected]>
>
> The most of detail has been written in the drm.
> Please notice that the tiled formats here request
> one more plane for storing the motion vector metadata.
> This buffer won't be compressed, so you can't append
> it to luma or chroma plane.

Does the motion vector buffer need to be exposed to userspace? Is the
decoder stateless (requires userspace to specify the reference frames)
or stateful (manages the entire decoding process internally)?

Best regards,
Tomasz

>
> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> ---
> drivers/media/v4l2-core/v4l2-common.c | 1 +
> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> include/uapi/linux/videodev2.h | 2 ++
> 3 files changed, 5 insertions(+)
>
> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> index e0fbe6ba4b6c..f645278b3055 100644
> --- a/drivers/media/v4l2-core/v4l2-common.c
> +++ b/drivers/media/v4l2-core/v4l2-common.c
> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> };
> unsigned int i;
>
> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> index e6fd355a2e92..8f65964aff08 100644
> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> default:
> if (fmt->description[0])
> return;
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 01e630f2ec78..7e928cb69e7c 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>
> /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
> --
> 2.17.1
>

2022-08-18 06:57:36

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format



On 8/18/22 14:06, Tomasz Figa wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> Hi Randy,
>
> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>>
>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>
>> The most of detail has been written in the drm.
>> Please notice that the tiled formats here request
>> one more plane for storing the motion vector metadata.
>> This buffer won't be compressed, so you can't append
>> it to luma or chroma plane.
>
> Does the motion vector buffer need to be exposed to userspace? Is the
> decoder stateless (requires userspace to specify the reference frames)
> or stateful (manages the entire decoding process internally)?
>
No, users don't need to access them at all. Just they need a different
dma-heap.

You would only get the stateful version of both encoder and decoder.
> Best regards,
> Tomasz
>
>>
>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
>> ---
>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>> include/uapi/linux/videodev2.h | 2 ++
>> 3 files changed, 5 insertions(+)
>>
>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>> index e0fbe6ba4b6c..f645278b3055 100644
>> --- a/drivers/media/v4l2-core/v4l2-common.c
>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>> };
>> unsigned int i;
>>
>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>> index e6fd355a2e92..8f65964aff08 100644
>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
>> default:
>> if (fmt->description[0])
>> return;
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index 01e630f2ec78..7e928cb69e7c 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>>
>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=vmpysqneiHK3UXcq6UOewdMwobFa70zKB3RuOgYT02aFw9fCs6qd7j-U1sYSey79&s=yblzF1GwanMEJFC3yt9nBAQjaaAEJKKlNgj4k64v5eE&e= */
>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>> --
>> 2.17.1
>>

--
Hsia-Jun(Randy) Li

2022-08-18 23:12:47

by Laurent Pinchart

[permalink] [raw]
Subject: Re: [PATCH 0/2] Add pixel formats used in Synatpics SoC

Hi Hsia-Jun,

On Tue, Aug 09, 2022 at 12:27:48AM +0800, Hsia-Jun Li wrote:
> From: "Hsia-Jun(Randy) Li" <[email protected]>
>
> Those pixel formats are used in Synaptics's VideoSmart series SoCs,
> likes VS640, VS680. I just disclose the pixel formats used in the video
> codecs and display pipeline this time. Actually any device with a MTR
> module could support those tiled and compressed pixel formats. The more
> detail about MTR module could be found in the first patch of this serial
> of mail.
>
> We may not be able to post any drivers here in a short time, the most of
> work in this platform is done in the Trusted Execution Environment and
> we didn't use the optee framework.

Is that so for the display side too, or only for the video decoder ?

> Please notice that, the memory planes used for video codecs would be 5
> when the compression is invoked while it would be 4 for display, the
> extra planes in the video codecs is for the decoding internally usage,
> it can't append the luma or chroma buffer as many other drivers do,
> because this buffer could be only accessed by the video codecs itself,
> it requests a different memory security attributes. Any other reason is
> described in the v4l pixel formats's patch. I don't know whether a
> different numbers of memory planes between drm and v4l2 is acceptable.

I don't think that's a problem as such, as long as both the V4L2 and DRM
formats make sense on their own.

> I only posted the compression fourcc for the v4l2, because it is really
> hard to put the uncompression version of pixel formats under the fourcc.
> I would be better that we could have something likes format modifers in
> drm here.

Agreed, we need modifiers support in V4L2. This has been discussed
previously ([1]), and a proposal ([2]) has been submitted two years ago,
it needs to be revived.

[1] https://lore.kernel.org/linux-media/[email protected]/
[2] https://lore.kernel.org/linux-media/[email protected]/

> https://synaptics.com/products/multimedia-solutions
>
> Hsia-Jun(Randy) Li (2):
> drm/fourcc: Add Synaptics VideoSmart tiled modifiers
> [WIP]: media: Add Synaptics compressed tiled format
>
> drivers/media/v4l2-core/v4l2-common.c | 1 +
> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++
> include/uapi/linux/videodev2.h | 2 ++
> 4 files changed, 54 insertions(+)

--
Regards,

Laurent Pinchart

2022-08-18 23:52:32

by Laurent Pinchart

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> On 8/18/22 14:06, Tomasz Figa wrote:
> > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> >>
> >> From: "Hsia-Jun(Randy) Li" <[email protected]>
> >>
> >> The most of detail has been written in the drm.

This patch still needs a description of the format, which should go to
Documentation/userspace-api/media/v4l/.

> >> Please notice that the tiled formats here request
> >> one more plane for storing the motion vector metadata.
> >> This buffer won't be compressed, so you can't append
> >> it to luma or chroma plane.
> >
> > Does the motion vector buffer need to be exposed to userspace? Is the
> > decoder stateless (requires userspace to specify the reference frames)
> > or stateful (manages the entire decoding process internally)?
>
> No, users don't need to access them at all. Just they need a different
> dma-heap.
>
> You would only get the stateful version of both encoder and decoder.

Shouldn't the motion vectors be stored in a separate V4L2 buffer,
submitted through a different queue then ?

> >> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> >> ---
> >> drivers/media/v4l2-core/v4l2-common.c | 1 +
> >> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> >> include/uapi/linux/videodev2.h | 2 ++
> >> 3 files changed, 5 insertions(+)
> >>
> >> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> >> index e0fbe6ba4b6c..f645278b3055 100644
> >> --- a/drivers/media/v4l2-core/v4l2-common.c
> >> +++ b/drivers/media/v4l2-core/v4l2-common.c
> >> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> >> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> >> };
> >> unsigned int i;
> >>
> >> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> >> index e6fd355a2e92..8f65964aff08 100644
> >> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> >> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> >> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> >> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> >> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> >> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> >> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> >> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> >> default:
> >> if (fmt->description[0])
> >> return;
> >> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >> index 01e630f2ec78..7e928cb69e7c 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> >> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> >> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> >> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> >> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> >> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> >>
> >> /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
> >> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */

--
Regards,

Laurent Pinchart

2022-08-18 23:54:21

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 0/2] Add pixel formats used in Synatpics SoC



On 8/19/22 07:08, Laurent Pinchart wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> Hi Hsia-Jun,
>
> On Tue, Aug 09, 2022 at 12:27:48AM +0800, Hsia-Jun Li wrote:
>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>
>> Those pixel formats are used in Synaptics's VideoSmart series SoCs,
>> likes VS640, VS680. I just disclose the pixel formats used in the video
>> codecs and display pipeline this time. Actually any device with a MTR
>> module could support those tiled and compressed pixel formats. The more
>> detail about MTR module could be found in the first patch of this serial
>> of mail.
>>
>> We may not be able to post any drivers here in a short time, the most of
>> work in this platform is done in the Trusted Execution Environment and
>> we didn't use the optee framework.
>
> Is that so for the display side too, or only for the video decoder ?
These pixel formats are using in both video decoder and display(Not the
GPU). Besides, ISP and NPU in vs680 support some patterns of them.

Please notice that after I reviewed the compression options of our
platform, I found using modifies are not enough to store all the
compression options here. I would post a second version here.

I may use the same way that Intel, I would try to disclose more details
here, hoping we could find a better way to describe them.
>
>> Please notice that, the memory planes used for video codecs would be 5
>> when the compression is invoked while it would be 4 for display, the
>> extra planes in the video codecs is for the decoding internally usage,
>> it can't append the luma or chroma buffer as many other drivers do,
>> because this buffer could be only accessed by the video codecs itself,
>> it requests a different memory security attributes. Any other reason is
>> described in the v4l pixel formats's patch. I don't know whether a
>> different numbers of memory planes between drm and v4l2 is acceptable.
>
> I don't think that's a problem as such, as long as both the V4L2 and DRM
> formats make sense on their own.
>
>> I only posted the compression fourcc for the v4l2, because it is really
>> hard to put the uncompression version of pixel formats under the fourcc.
>> I would be better that we could have something likes format modifers in
>> drm here.
>
> Agreed, we need modifiers support in V4L2. This has been discussed
> previously ([1]), and a proposal ([2]) has been submitted two years ago,
> it needs to be revived.
Thank you, I have found those v4l2_ext_pix_format, I would relay my
comment in the email that posting synaptics v4l2 pixel formats.
>
> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dmedia_20170821155203.GB38943-40e107564-2Dlin.cambridge.arm.com_&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=Ktu-e-R1Mn89Laxioh6RlL6Y2aycZ9NrJTIyONaDdRQvnlv-Nd570KldQ51vmigK&s=_7eMTIYwWUOWkXijcRfotLJlpR7G5yx-ZXuTwh9uZw4&e=
> [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dmedia_20200804192939.2251988-2D1-2Dhelen.koike-40collabora.com_&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=Ktu-e-R1Mn89Laxioh6RlL6Y2aycZ9NrJTIyONaDdRQvnlv-Nd570KldQ51vmigK&s=f1dbc5ciUeIkO6VMtlRuEvXqJad2NsoaDBFyNUsSdpg&e=
>
>> https://synaptics.com/products/multimedia-solutions
>>
>> Hsia-Jun(Randy) Li (2):
>> drm/fourcc: Add Synaptics VideoSmart tiled modifiers
>> [WIP]: media: Add Synaptics compressed tiled format
>>
>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>> include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++
>> include/uapi/linux/videodev2.h | 2 ++
>> 4 files changed, 54 insertions(+)
>
> --
> Regards,
>
> Laurent Pinchart

--
Hsia-Jun(Randy) Li

2022-08-19 00:08:14

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format



On 8/19/22 07:13, Laurent Pinchart wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>> On 8/18/22 14:06, Tomasz Figa wrote:
>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>>>>
>>>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>>>
>>>> The most of detail has been written in the drm.
>
> This patch still needs a description of the format, which should go to
> Documentation/userspace-api/media/v4l/.

I just want t tell people we need an extra plane for MVTP and I don't
have enough space here to place all the pixel formats.

Besides, I was thinking a modifer in v4l2_ext_pix_format is not enough.
Let's take a compression NV12 tile format as an example, we need
1. luma planes
2. chroma planes
3. compression meta data for luma
4. compression meta data for chroma
5. mvtp
and a single data planer version would be
1. luma and chroma data
2. compression meta data
3. mvtp

You see, we actually have 3 kind of data here(not including the
compression options that I am thinking of storing them somewhere else).
>
>>>> Please notice that the tiled formats here request
>>>> one more plane for storing the motion vector metadata.
>>>> This buffer won't be compressed, so you can't append
>>>> it to luma or chroma plane.
>>>
>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>> decoder stateless (requires userspace to specify the reference frames)
>>> or stateful (manages the entire decoding process internally)?
>>
>> No, users don't need to access them at all. Just they need a different
>> dma-heap.
>>
>> You would only get the stateful version of both encoder and decoder.
>
> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> submitted through a different queue then ?
Yes, I like that.
Proposal: A third buffer type for the reconstruction buffers in V4L2 M2M
encoder
https://www.spinics.net/lists/linux-media/msg214565.html

Although the major usage for the decoder here is producing the non-tile
buffers, the decoder of us could product the NV12 or the pixel formats
that GPU likes, but it must happen at the same time a frame is decoded.
Still the reference buffer or we call them the real decoded frame would
stay in a tiled format. More than one queue would be need here.
>
>>>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
>>>> ---
>>>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>>>> include/uapi/linux/videodev2.h | 2 ++
>>>> 3 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>> };
>>>> unsigned int i;
>>>>
>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>> index e6fd355a2e92..8f65964aff08 100644
>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
>>>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
>>>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
>>>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>> default:
>>>> if (fmt->description[0])
>>>> return;
>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>> --- a/include/uapi/linux/videodev2.h
>>>> +++ b/include/uapi/linux/videodev2.h
>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
>>>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>>>>
>>>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=87M5Aa5fG3kdTTjlJrLIgv0E7O10QAj_4RqDlVsCAFdbfJzJ_P0s8wkBqaR0VBUO&s=8AsoiLPt9hQnn4ta51tT-RUXRLoKKYrePdAwtdvxuDo&e= */
>>>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>
> --
> Regards,
>
> Laurent Pinchart

--
Hsia-Jun(Randy) Li

2022-08-19 15:32:33

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Le jeudi 18 août 2022 à 14:33 +0800, Hsia-Jun Li a écrit :
>
> On 8/18/22 14:06, Tomasz Figa wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > Hi Randy,
> >
> > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> > >
> > > From: "Hsia-Jun(Randy) Li" <[email protected]>
> > >
> > > The most of detail has been written in the drm.
> > > Please notice that the tiled formats here request
> > > one more plane for storing the motion vector metadata.
> > > This buffer won't be compressed, so you can't append
> > > it to luma or chroma plane.
> >
> > Does the motion vector buffer need to be exposed to userspace? Is the
> > decoder stateless (requires userspace to specify the reference frames)
> > or stateful (manages the entire decoding process internally)?
> >
> No, users don't need to access them at all. Just they need a different
> dma-heap.
>
> You would only get the stateful version of both encoder and decoder.

Can't you just allocate and manage these internally in the kernel driver without
adding kernel APIs ? This is notably what Mediatek and (downstream) RPi HEVC
driver do, as it allow reducing quite a lot the memory usage. In Hantro, we bind
them due to HW limitation.

Nicolas

> > Best regards,
> > Tomasz
> >
> > >
> > > Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> > > ---
> > > drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> > > include/uapi/linux/videodev2.h | 2 ++
> > > 3 files changed, 5 insertions(+)
> > >
> > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > index e0fbe6ba4b6c..f645278b3055 100644
> > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > };
> > > unsigned int i;
> > >
> > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > index e6fd355a2e92..8f65964aff08 100644
> > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> > > case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> > > case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> > > + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > default:
> > > if (fmt->description[0])
> > > return;
> > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > index 01e630f2ec78..7e928cb69e7c 100644
> > > --- a/include/uapi/linux/videodev2.h
> > > +++ b/include/uapi/linux/videodev2.h
> > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> > > #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> > >
> > > /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=vmpysqneiHK3UXcq6UOewdMwobFa70zKB3RuOgYT02aFw9fCs6qd7j-U1sYSey79&s=yblzF1GwanMEJFC3yt9nBAQjaaAEJKKlNgj4k64v5eE&e= */
> > > #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
> > > --
> > > 2.17.1
> > >
>

2022-08-19 15:47:24

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > On 8/18/22 14:06, Tomasz Figa wrote:
> > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> > > >
> > > > From: "Hsia-Jun(Randy) Li" <[email protected]>
> > > >
> > > > The most of detail has been written in the drm.
>
> This patch still needs a description of the format, which should go to
> Documentation/userspace-api/media/v4l/.
>
> > > > Please notice that the tiled formats here request
> > > > one more plane for storing the motion vector metadata.
> > > > This buffer won't be compressed, so you can't append
> > > > it to luma or chroma plane.
> > >
> > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > decoder stateless (requires userspace to specify the reference frames)
> > > or stateful (manages the entire decoding process internally)?
> >
> > No, users don't need to access them at all. Just they need a different
> > dma-heap.
> >
> > You would only get the stateful version of both encoder and decoder.
>
> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> submitted through a different queue then ?

Imho, I believe these should be invisible to users and pooled separately to
reduce the overhead. The number of reference is usually lower then the number of
allocated display buffers.

>
> > > > Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> > > > ---
> > > > drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> > > > include/uapi/linux/videodev2.h | 2 ++
> > > > 3 files changed, 5 insertions(+)
> > > >
> > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > };
> > > > unsigned int i;
> > > >
> > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > index e6fd355a2e92..8f65964aff08 100644
> > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> > > > case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> > > > case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> > > > + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > default:
> > > > if (fmt->description[0])
> > > > return;
> > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > --- a/include/uapi/linux/videodev2.h
> > > > +++ b/include/uapi/linux/videodev2.h
> > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> > > > #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> > > >
> > > > /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
> > > > #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>

2022-08-19 17:41:15

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format



On 8/19/22 23:28, Nicolas Dufresne wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>>>>>
>>>>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>>>>
>>>>> The most of detail has been written in the drm.
>>
>> This patch still needs a description of the format, which should go to
>> Documentation/userspace-api/media/v4l/.
>>
>>>>> Please notice that the tiled formats here request
>>>>> one more plane for storing the motion vector metadata.
>>>>> This buffer won't be compressed, so you can't append
>>>>> it to luma or chroma plane.
>>>>
>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>> decoder stateless (requires userspace to specify the reference frames)
>>>> or stateful (manages the entire decoding process internally)?
>>>
>>> No, users don't need to access them at all. Just they need a different
>>> dma-heap.
>>>
>>> You would only get the stateful version of both encoder and decoder.
>>
>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>> submitted through a different queue then ?
>
> Imho, I believe these should be invisible to users and pooled separately to
> reduce the overhead. The number of reference is usually lower then the number of
> allocated display buffers.
>
You can't. The motion vector buffer can't share with the luma and chroma
data planes, nor the data plane for the compression meta data.

You could consider this as a security requirement(the memory region for
the MV could only be accessed by the decoder) or hardware limitation.

It is also not very easy to manage such a large buffer that would change
when the resolution changed.
>>
>>>>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
>>>>> ---
>>>>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>>>>> include/uapi/linux/videodev2.h | 2 ++
>>>>> 3 files changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>> };
>>>>> unsigned int i;
>>>>>
>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
>>>>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
>>>>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
>>>>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>> default:
>>>>> if (fmt->description[0])
>>>>> return;
>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>> --- a/include/uapi/linux/videodev2.h
>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
>>>>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>>>>>
>>>>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
>>>>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>>
>

--
Hsia-Jun(Randy) Li

2022-08-19 19:35:36

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
>
> On 8/19/22 23:28, Nicolas Dufresne wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> > > On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > > > On 8/18/22 14:06, Tomasz Figa wrote:
> > > > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> > > > > >
> > > > > > From: "Hsia-Jun(Randy) Li" <[email protected]>
> > > > > >
> > > > > > The most of detail has been written in the drm.
> > >
> > > This patch still needs a description of the format, which should go to
> > > Documentation/userspace-api/media/v4l/.
> > >
> > > > > > Please notice that the tiled formats here request
> > > > > > one more plane for storing the motion vector metadata.
> > > > > > This buffer won't be compressed, so you can't append
> > > > > > it to luma or chroma plane.
> > > > >
> > > > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > > > decoder stateless (requires userspace to specify the reference frames)
> > > > > or stateful (manages the entire decoding process internally)?
> > > >
> > > > No, users don't need to access them at all. Just they need a different
> > > > dma-heap.
> > > >
> > > > You would only get the stateful version of both encoder and decoder.
> > >
> > > Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> > > submitted through a different queue then ?
> >
> > Imho, I believe these should be invisible to users and pooled separately to
> > reduce the overhead. The number of reference is usually lower then the number of
> > allocated display buffers.
> >
> You can't. The motion vector buffer can't share with the luma and chroma
> data planes, nor the data plane for the compression meta data.
>
> You could consider this as a security requirement(the memory region for
> the MV could only be accessed by the decoder) or hardware limitation.
>
> It is also not very easy to manage such a large buffer that would change
> when the resolution changed.

Your argument are just aiming toward the fact that you should not let the user
allocate these in the first place. They should not be bound to the v4l2 buffer.
Allocate these in your driver, and leave to your user the pixel buffer (and
compress meta) allocation work.

Other driver handle this just fine, if your v4l2 driver implement the v4l2
resolution change mechanism, is should be very simple to manage.

> > >
> > > > > > Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> > > > > > ---
> > > > > > drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > > > drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> > > > > > include/uapi/linux/videodev2.h | 2 ++
> > > > > > 3 files changed, 5 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > > > { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > > > };
> > > > > > unsigned int i;
> > > > > >
> > > > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > index e6fd355a2e92..8f65964aff08 100644
> > > > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > > > case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> > > > > > case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> > > > > > case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> > > > > > + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > > > + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > > > default:
> > > > > > if (fmt->description[0])
> > > > > > return;
> > > > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > > > --- a/include/uapi/linux/videodev2.h
> > > > > > +++ b/include/uapi/linux/videodev2.h
> > > > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > > > #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> > > > > > #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > > > #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> > > > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> > > > > >
> > > > > > /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
> > > > > > #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
> > >
> >
>

2022-08-20 00:13:27

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format



On 8/20/22 03:17, Nicolas Dufresne wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
>>
>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>
>>>
>>> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>>>>>>>
>>>>>>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>>>>>>
>>>>>>> The most of detail has been written in the drm.
>>>>
>>>> This patch still needs a description of the format, which should go to
>>>> Documentation/userspace-api/media/v4l/.
>>>>
>>>>>>> Please notice that the tiled formats here request
>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>> This buffer won't be compressed, so you can't append
>>>>>>> it to luma or chroma plane.
>>>>>>
>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>
>>>>> No, users don't need to access them at all. Just they need a different
>>>>> dma-heap.
>>>>>
>>>>> You would only get the stateful version of both encoder and decoder.
>>>>
>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>> submitted through a different queue then ?
>>>
>>> Imho, I believe these should be invisible to users and pooled separately to
>>> reduce the overhead. The number of reference is usually lower then the number of
>>> allocated display buffers.
>>>
>> You can't. The motion vector buffer can't share with the luma and chroma
>> data planes, nor the data plane for the compression meta data.
>>
>> You could consider this as a security requirement(the memory region for
>> the MV could only be accessed by the decoder) or hardware limitation.
>>
>> It is also not very easy to manage such a large buffer that would change
>> when the resolution changed.
>
> Your argument are just aiming toward the fact that you should not let the user
> allocate these in the first place. They should not be bound to the v4l2 buffer.
> Allocate these in your driver, and leave to your user the pixel buffer (and
> compress meta) allocation work.
>
What I want to say is that userspace could allocate buffers then make
the v4l2 decoder import these buffers, but each planes should come from
the right DMA-heaps. Usually the userspace would know better the memory
occupation, it would bring some flexibility here.

Currently, they are another thing bothers me, I need to allocate a small
piece of memory(less than 128KiB) as the compression metadata buffers as
I mentioned here. And these pieces of memory should be located in a
small region, or the performance could be badly hurt, besides, we don't
support IOMMU for this kind of data.

Any idea about assign a small piece of memory from a pre-allocated
memory or select region(I don't think I could reserve them in a
DMA-heap) for a plane in the MMAP type buffer ?

Besides, I am not very satisfied with the dynamic resolution change
steps if I understand it correct. Buffers reallocation should happen
when we receive the event not until the drain is done. A resolution
rising is very common when you are playing a network stream, it would be
better that the decoder decided how many buffers it need for the
previous sequence while the userspace could reallocate the reset of
buffers in the CAPTURE queue.
> Other driver handle this just fine, if your v4l2 driver implement the v4l2
> resolution change mechanism, is should be very simple to manage.
>
>>>>
>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
>>>>>>> ---
>>>>>>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>>>>>>> include/uapi/linux/videodev2.h | 2 ++
>>>>>>> 3 files changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>> };
>>>>>>> unsigned int i;
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
>>>>>>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>>>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>>> default:
>>>>>>> if (fmt->description[0])
>>>>>>> return;
>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
>>>>>>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>
>>>>>>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
>>>>>>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>>>>
>>>
>>
>

--
Hsia-Jun(Randy) Li

2022-08-22 14:30:33

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Le samedi 20 août 2022 à 08:10 +0800, Hsia-Jun Li a écrit :
>
> On 8/20/22 03:17, Nicolas Dufresne wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
> > >
> > > On 8/19/22 23:28, Nicolas Dufresne wrote:
> > > > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > > >
> > > >
> > > > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> > > > > On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > > > > > On 8/18/22 14:06, Tomasz Figa wrote:
> > > > > > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> > > > > > > >
> > > > > > > > From: "Hsia-Jun(Randy) Li" <[email protected]>
> > > > > > > >
> > > > > > > > The most of detail has been written in the drm.
> > > > >
> > > > > This patch still needs a description of the format, which should go to
> > > > > Documentation/userspace-api/media/v4l/.
> > > > >
> > > > > > > > Please notice that the tiled formats here request
> > > > > > > > one more plane for storing the motion vector metadata.
> > > > > > > > This buffer won't be compressed, so you can't append
> > > > > > > > it to luma or chroma plane.
> > > > > > >
> > > > > > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > > > > > decoder stateless (requires userspace to specify the reference frames)
> > > > > > > or stateful (manages the entire decoding process internally)?
> > > > > >
> > > > > > No, users don't need to access them at all. Just they need a different
> > > > > > dma-heap.
> > > > > >
> > > > > > You would only get the stateful version of both encoder and decoder.
> > > > >
> > > > > Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> > > > > submitted through a different queue then ?
> > > >
> > > > Imho, I believe these should be invisible to users and pooled separately to
> > > > reduce the overhead. The number of reference is usually lower then the number of
> > > > allocated display buffers.
> > > >
> > > You can't. The motion vector buffer can't share with the luma and chroma
> > > data planes, nor the data plane for the compression meta data.
> > >
> > > You could consider this as a security requirement(the memory region for
> > > the MV could only be accessed by the decoder) or hardware limitation.
> > >
> > > It is also not very easy to manage such a large buffer that would change
> > > when the resolution changed.
> >
> > Your argument are just aiming toward the fact that you should not let the user
> > allocate these in the first place. They should not be bound to the v4l2 buffer.
> > Allocate these in your driver, and leave to your user the pixel buffer (and
> > compress meta) allocation work.
> >
> What I want to say is that userspace could allocate buffers then make
> the v4l2 decoder import these buffers, but each planes should come from
> the right DMA-heaps. Usually the userspace would know better the memory
> occupation, it would bring some flexibility here.
>
> Currently, they are another thing bothers me, I need to allocate a small
> piece of memory(less than 128KiB) as the compression metadata buffers as
> I mentioned here. And these pieces of memory should be located in a
> small region, or the performance could be badly hurt, besides, we don't
> support IOMMU for this kind of data.
>
> Any idea about assign a small piece of memory from a pre-allocated
> memory or select region(I don't think I could reserve them in a
> DMA-heap) for a plane in the MMAP type buffer ?

A V4L2 driver should first implement the V4L2 semantic before adding optional
use case like buffer importation. For this reason, your V4L2 driver should know
all the memory requirements and how to allocate that memory. Later on, your
importing driver will have to validate that the userland did it right at
importation. This is to follow V4L2 semantic and security model. If you move
simply trust the userland (gralloc), you are not doing it right.

>
> Besides, I am not very satisfied with the dynamic resolution change
> steps if I understand it correct. Buffers reallocation should happen
> when we receive the event not until the drain is done. A resolution
> rising is very common when you are playing a network stream, it would be
> better that the decoder decided how many buffers it need for the
> previous sequence while the userspace could reallocate the reset of
> buffers in the CAPTURE queue.
> > Other driver handle this just fine, if your v4l2 driver implement the v4l2
> > resolution change mechanism, is should be very simple to manage.

This is a limitation of the queue design of V4L2. While streaming the buffers
associated with the queue must currently be large enough to support the selected
format. "large enough" in your case is complex, and validation must be
programmed in your driver.

There was discussion on perhaps extending on CREATE_BUFS feature, that let you
allocate at run-time for a different format/resolution (no drivers currently
allow that). The rules around that aren't specified (and will have to be defined
before a driver starts making use of that). Note that to be usable, a
DELETE_BUF(s) ioctl would probably be needed too.

In current state, If your driver can support it, userland does not strictly need
to re-allocate if the resolution is changed to smaller. In most SVC scenarios,
the largest resolution is known in advance, so pre-allocation can happen to the
appropriate resolution and queue size. Re-allocation is then rarely triggered at
run time. Unlike your system, IOMMU system are not as affected by allocation
latency and manages to do gapless transition despite this inefficiency.

Note that all this is pure recommendation. What I'm seeing here is a pixel
format documented with Android assumptions rather then mainline, and sent
without the associated implementation. This simply raises some question on the
viability of the whole. This is not a critic but just some verification that
ensure you are following the V4L2 spec.

> >
> > > > >
> > > > > > > > Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> > > > > > > > ---
> > > > > > > > drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > > > > > drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> > > > > > > > include/uapi/linux/videodev2.h | 2 ++
> > > > > > > > 3 files changed, 5 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > > > > > { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > > > > > };
> > > > > > > > unsigned int i;
> > > > > > > >
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > index e6fd355a2e92..8f65964aff08 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > > > > > case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> > > > > > > > case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> > > > > > > > case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> > > > > > > > + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > > > > > + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > > > > > default:
> > > > > > > > if (fmt->description[0])
> > > > > > > > return;
> > > > > > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > > > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > > > > > --- a/include/uapi/linux/videodev2.h
> > > > > > > > +++ b/include/uapi/linux/videodev2.h
> > > > > > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > > > > > #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> > > > > > > > #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > > > > > #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> > > > > > > >
> > > > > > > > /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
> > > > > > > > #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
> > > > >
> > > >
> > >
> >
>

2022-08-23 06:46:57

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

On Sat, Aug 20, 2022 at 12:44 AM Hsia-Jun Li <[email protected]> wrote:
>
>
>
> On 8/19/22 23:28, Nicolas Dufresne wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> >> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> >>> On 8/18/22 14:06, Tomasz Figa wrote:
> >>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> >>>>>
> >>>>> From: "Hsia-Jun(Randy) Li" <[email protected]>
> >>>>>
> >>>>> The most of detail has been written in the drm.
> >>
> >> This patch still needs a description of the format, which should go to
> >> Documentation/userspace-api/media/v4l/.
> >>
> >>>>> Please notice that the tiled formats here request
> >>>>> one more plane for storing the motion vector metadata.
> >>>>> This buffer won't be compressed, so you can't append
> >>>>> it to luma or chroma plane.
> >>>>
> >>>> Does the motion vector buffer need to be exposed to userspace? Is the
> >>>> decoder stateless (requires userspace to specify the reference frames)
> >>>> or stateful (manages the entire decoding process internally)?
> >>>
> >>> No, users don't need to access them at all. Just they need a different
> >>> dma-heap.
> >>>
> >>> You would only get the stateful version of both encoder and decoder.
> >>
> >> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> >> submitted through a different queue then ?
> >
> > Imho, I believe these should be invisible to users and pooled separately to
> > reduce the overhead. The number of reference is usually lower then the number of
> > allocated display buffers.
> >
> You can't. The motion vector buffer can't share with the luma and chroma
> data planes, nor the data plane for the compression meta data.

I believe what Nicolas is suggesting is to just keep the MV buffer
handling completely separate from video buffers. Just keep a map
between frame buffer and MV buffer in the driver and use the right
buffer when triggering a decode.

>
> You could consider this as a security requirement(the memory region for
> the MV could only be accessed by the decoder) or hardware limitation.
>
> It is also not very easy to manage such a large buffer that would change
> when the resolution changed.

How does it differ from managing additional planes of video buffers?

Best regards,
Tomasz

> >>
> >>>>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> >>>>> ---
> >>>>> drivers/media/v4l2-core/v4l2-common.c | 1 +
> >>>>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> >>>>> include/uapi/linux/videodev2.h | 2 ++
> >>>>> 3 files changed, 5 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> >>>>> index e0fbe6ba4b6c..f645278b3055 100644
> >>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
> >>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
> >>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> >>>>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>>>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>>>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>>>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> >>>>> };
> >>>>> unsigned int i;
> >>>>>
> >>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> >>>>> index e6fd355a2e92..8f65964aff08 100644
> >>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> >>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> >>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> >>>>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> >>>>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> >>>>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> >>>>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> >>>>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> >>>>> default:
> >>>>> if (fmt->description[0])
> >>>>> return;
> >>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >>>>> index 01e630f2ec78..7e928cb69e7c 100644
> >>>>> --- a/include/uapi/linux/videodev2.h
> >>>>> +++ b/include/uapi/linux/videodev2.h
> >>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> >>>>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> >>>>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> >>>>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> >>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> >>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> >>>>>
> >>>>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
> >>>>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
> >>
> >
>
> --
> Hsia-Jun(Randy) Li

2022-08-23 07:12:00

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format



On 8/23/22 14:05, Tomasz Figa wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> On Sat, Aug 20, 2022 at 12:44 AM Hsia-Jun Li <[email protected]> wrote:
>>
>>
>>
>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>
>>>
>>> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>>>>>>>
>>>>>>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>>>>>>
>>>>>>> The most of detail has been written in the drm.
>>>>
>>>> This patch still needs a description of the format, which should go to
>>>> Documentation/userspace-api/media/v4l/.
>>>>
>>>>>>> Please notice that the tiled formats here request
>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>> This buffer won't be compressed, so you can't append
>>>>>>> it to luma or chroma plane.
>>>>>>
>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>
>>>>> No, users don't need to access them at all. Just they need a different
>>>>> dma-heap.
>>>>>
>>>>> You would only get the stateful version of both encoder and decoder.
>>>>
>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>> submitted through a different queue then ?
>>>
>>> Imho, I believe these should be invisible to users and pooled separately to
>>> reduce the overhead. The number of reference is usually lower then the number of
>>> allocated display buffers.
>>>
>> You can't. The motion vector buffer can't share with the luma and chroma
>> data planes, nor the data plane for the compression meta data.
>
> I believe what Nicolas is suggesting is to just keep the MV buffer
> handling completely separate from video buffers. Just keep a map
> between frame buffer and MV buffer in the driver and use the right
> buffer when triggering a decode.
>
>>
>> You could consider this as a security requirement(the memory region for
>> the MV could only be accessed by the decoder) or hardware limitation.
>>
>> It is also not very easy to manage such a large buffer that would change
>> when the resolution changed.
>
> How does it differ from managing additional planes of video buffers?
I should say I am not against his suggestion if I could make a DMA-heap
v4l2 allocator merge into kernel in the future. Although I think we need
two heaps here one for the normal video and one for the secure video, I
don't have much idea on how to determine whether we are decoding a
secure or non-secure video here (The design here is that the kernel
didn't know, only hardware and TEE care about that).

Just one place that I think it would be more simple for me to manage the
buffer here. When the decoder goes to the drain stage, then the MV
buffer goes when the data buffer goes and create when the data buffer
creates.
I know that is not a lot of work to doing the mapping between them. I
just need to convince the other accepting that do not allocator the MV
buffer outside.
>
> Best regards,
> Tomasz
>
>>>>
>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
>>>>>>> ---
>>>>>>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>>>>>>> include/uapi/linux/videodev2.h | 2 ++
>>>>>>> 3 files changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>> };
>>>>>>> unsigned int i;
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
>>>>>>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>>>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>>> default:
>>>>>>> if (fmt->description[0])
>>>>>>> return;
>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
>>>>>>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>
>>>>>>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
>>>>>>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>>>>
>>>
>>
>> --
>> Hsia-Jun(Randy) Li

--
Hsia-Jun(Randy) Li

2022-08-23 07:47:00

by Hsia-Jun Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format



On 8/22/22 22:15, Nicolas Dufresne wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> Le samedi 20 août 2022 à 08:10 +0800, Hsia-Jun Li a écrit :
>>
>> On 8/20/22 03:17, Nicolas Dufresne wrote:
>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>
>>>
>>> Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
>>>>
>>>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>>>
>>>>>
>>>>> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>>>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> From: "Hsia-Jun(Randy) Li" <[email protected]>
>>>>>>>>>
>>>>>>>>> The most of detail has been written in the drm.
>>>>>>
>>>>>> This patch still needs a description of the format, which should go to
>>>>>> Documentation/userspace-api/media/v4l/.
>>>>>>
>>>>>>>>> Please notice that the tiled formats here request
>>>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>>>> This buffer won't be compressed, so you can't append
>>>>>>>>> it to luma or chroma plane.
>>>>>>>>
>>>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>>>
>>>>>>> No, users don't need to access them at all. Just they need a different
>>>>>>> dma-heap.
>>>>>>>
>>>>>>> You would only get the stateful version of both encoder and decoder.
>>>>>>
>>>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>>>> submitted through a different queue then ?
>>>>>
>>>>> Imho, I believe these should be invisible to users and pooled separately to
>>>>> reduce the overhead. The number of reference is usually lower then the number of
>>>>> allocated display buffers.
>>>>>
>>>> You can't. The motion vector buffer can't share with the luma and chroma
>>>> data planes, nor the data plane for the compression meta data.
>>>>
>>>> You could consider this as a security requirement(the memory region for
>>>> the MV could only be accessed by the decoder) or hardware limitation.
>>>>
>>>> It is also not very easy to manage such a large buffer that would change
>>>> when the resolution changed.
>>>
>>> Your argument are just aiming toward the fact that you should not let the user
>>> allocate these in the first place. They should not be bound to the v4l2 buffer.
>>> Allocate these in your driver, and leave to your user the pixel buffer (and
>>> compress meta) allocation work.
>>>
>> What I want to say is that userspace could allocate buffers then make
>> the v4l2 decoder import these buffers, but each planes should come from
>> the right DMA-heaps. Usually the userspace would know better the memory
>> occupation, it would bring some flexibility here.
>>
>> Currently, they are another thing bothers me, I need to allocate a small
>> piece of memory(less than 128KiB) as the compression metadata buffers as
>> I mentioned here. And these pieces of memory should be located in a
>> small region, or the performance could be badly hurt, besides, we don't
>> support IOMMU for this kind of data.
>>
>> Any idea about assign a small piece of memory from a pre-allocated
>> memory or select region(I don't think I could reserve them in a
>> DMA-heap) for a plane in the MMAP type buffer ?
>
> A V4L2 driver should first implement the V4L2 semantic before adding optional
> use case like buffer importation. For this reason, your V4L2 driver should know
> all the memory requirements and how to allocate that memory.
Yes, that is what I intend to. Or I just smuggle those things somewhere.
Later on, your
> importing driver will have to validate that the userland did it right at
> importation. This is to follow V4L2 semantic and security model. If you move
> simply trust the userland (gralloc), you are not doing it right.
>
Yes, that is what I try to describe in the other thread
https://lore.kernel.org/linux-media/[email protected]/
I don't have the problem that let the userspace decided where and how to
allocate the memory, but we need a new protocol here to let the
userspace do it right.
>>
>> Besides, I am not very satisfied with the dynamic resolution change
>> steps if I understand it correct. Buffers reallocation should happen
>> when we receive the event not until the drain is done. A resolution
>> rising is very common when you are playing a network stream, it would be
>> better that the decoder decided how many buffers it need for the
>> previous sequence while the userspace could reallocate the reset of
>> buffers in the CAPTURE queue.
>>> Other driver handle this just fine, if your v4l2 driver implement the v4l2
>>> resolution change mechanism, is should be very simple to manage.
>
> This is a limitation of the queue design of V4L2. While streaming the buffers
> associated with the queue must currently be large enough to support the selected
> format. "large enough" in your case is complex, and validation must be
> programmed in your driver.
>
> There was discussion on perhaps extending on CREATE_BUFS feature, that let you
> allocate at run-time for a different format/resolution (no drivers currently
> allow that). The rules around that aren't specified (and will have to be defined
> before a driver starts making use of that). Note that to be usable, a
> DELETE_BUF(s) ioctl would probably be needed too.
>
> In current state, If your driver can support it, userland does not strictly need
> to re-allocate if the resolution is changed to smaller. In most SVC scenarios,
> the largest resolution is known in advance, so pre-allocation can happen to the
When you play a video from Youtube, you may notice that starting
resolution is low, then after it received more data knowning the
bandwidth is enough, it would switch to a higher resolution. I don't
think it would inform the codecs2 or OMX there is a higher target
resolution.

Besides, for the case of SVC in a conference system, the remote(gatway)
would not tell you there is a higer resolution or frame rate because you
can't receive it in negotiate stage, it could be permanently(device
capability) or just bandwidth problem. Whether we know there is a higher
requirement video depends on the transport protocols used here.

The basic idea of SVC is that the low layer didn't depends on the upper
layer, we can't tell how the bitstream usually.
> appropriate resolution and queue size. Re-allocation is then rarely triggered at
> run time. Unlike your system, IOMMU system are not as affected by allocation
> latency and manages to do gapless transition despite this inefficiency.
>
> Note that all this is pure recommendation. What I'm seeing here is a pixel
> format documented with Android assumptions rather then mainline, and sent
> without the associated implementation. This simply raises some question on the
Because this implementation is very complex as you could see now, I
didn't see the exist implementation here could decode DRM video or has
the security restriction here.

And you see even before this decoder driver is done, we have had enough
problems, even just the definition of pixel formats and data exchange
mechanism.
> viability of the whole. This is not a critic but just some verification that
> ensure you are following the V4L2 spec.
I really want to those recommendations here, I want to make everything
right at the first place. Or we would make a driver we would know it
won't follow the v4l2 spec.
>
>>>
>>>>>>
>>>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
>>>>>>>>> ---
>>>>>>>>> drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>>>> drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
>>>>>>>>> include/uapi/linux/videodev2.h | 2 ++
>>>>>>>>> 3 files changed, 5 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>>>> { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>>> { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>>> { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>>> + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>>>> };
>>>>>>>>> unsigned int i;
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>>>> case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
>>>>>>>>> case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>>>> case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>>>> + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>>>>>> + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>>>>> default:
>>>>>>>>> if (fmt->description[0])
>>>>>>>>> return;
>>>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>>>> #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
>>>>>>>>> #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>>>> #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
>>>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>>>
>>>>>>>>> /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
>>>>>>>>> #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
>>>>>>
>>>>>
>>>>
>>>
>>
>

--
Hsia-Jun(Randy) Li

2022-08-23 17:23:43

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Le mardi 23 août 2022 à 15:03 +0800, Hsia-Jun Li a écrit :
>
> On 8/23/22 14:05, Tomasz Figa wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > On Sat, Aug 20, 2022 at 12:44 AM Hsia-Jun Li <[email protected]> wrote:
> > >
> > >
> > >
> > > On 8/19/22 23:28, Nicolas Dufresne wrote:
> > > > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > > >
> > > >
> > > > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> > > > > On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > > > > > On 8/18/22 14:06, Tomasz Figa wrote:
> > > > > > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <[email protected]> wrote:
> > > > > > > >
> > > > > > > > From: "Hsia-Jun(Randy) Li" <[email protected]>
> > > > > > > >
> > > > > > > > The most of detail has been written in the drm.
> > > > >
> > > > > This patch still needs a description of the format, which should go to
> > > > > Documentation/userspace-api/media/v4l/.
> > > > >
> > > > > > > > Please notice that the tiled formats here request
> > > > > > > > one more plane for storing the motion vector metadata.
> > > > > > > > This buffer won't be compressed, so you can't append
> > > > > > > > it to luma or chroma plane.
> > > > > > >
> > > > > > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > > > > > decoder stateless (requires userspace to specify the reference frames)
> > > > > > > or stateful (manages the entire decoding process internally)?
> > > > > >
> > > > > > No, users don't need to access them at all. Just they need a different
> > > > > > dma-heap.
> > > > > >
> > > > > > You would only get the stateful version of both encoder and decoder.
> > > > >
> > > > > Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> > > > > submitted through a different queue then ?
> > > >
> > > > Imho, I believe these should be invisible to users and pooled separately to
> > > > reduce the overhead. The number of reference is usually lower then the number of
> > > > allocated display buffers.
> > > >
> > > You can't. The motion vector buffer can't share with the luma and chroma
> > > data planes, nor the data plane for the compression meta data.
> >
> > I believe what Nicolas is suggesting is to just keep the MV buffer
> > handling completely separate from video buffers. Just keep a map
> > between frame buffer and MV buffer in the driver and use the right
> > buffer when triggering a decode.
> >
> > >
> > > You could consider this as a security requirement(the memory region for
> > > the MV could only be accessed by the decoder) or hardware limitation.
> > >
> > > It is also not very easy to manage such a large buffer that would change
> > > when the resolution changed.
> >
> > How does it differ from managing additional planes of video buffers?
> I should say I am not against his suggestion if I could make a DMA-heap
> v4l2 allocator merge into kernel in the future. Although I think we need
> two heaps here one for the normal video and one for the secure video, I
> don't have much idea on how to determine whether we are decoding a
> secure or non-secure video here (The design here is that the kernel
> didn't know, only hardware and TEE care about that).

Its always nice when "the design" get discussed upstream, so we can raise any
known issues and improve it. Here, not knowing if we are handling secure or non-
secure memory in kernel driver would indeed require external allocation for
everything, and V4L2 does not currently work like this. There is a few use cases
(not all of them might apply to your driver, but they exists).

1. Secondary buffers

When a CODEC is combined with a post-processor, the driver is then responsible
for reference frame allocation. In both known secure memory approach (NXP secure
bit and secondary mmu), the driver must know, as it won't be allowed produce any
non-secure buffer while secure (and vis-versa). It would be very difficult to
make secondary buffers externally allocated, since the fact secondary buffers
are used is no known by userspace. You slightly mention about adding a new queue
type, this seems like an option, though one will have to figure-out how to make
this work in a backward compatible manner.

2. Internally managed feedback buffers

Existing case of feedback buffers is VP9 decoders. I initially thought that
would only be a challenge for stateless decoders, but it turns out that Amlogic
stateful drivers also needs to take care. In VP9, the bitstream is further
compressed using probability obtained through decoding. Those probability can be
further tuned with updates placed in the compressed header. In Amlogic and
existing VP9 stateless decoder, the merging of the feedback and compressed
header updates is done using the CPU, hence that feedback buffer cannot be
secure. With lets say NXP secure domain HW, this is impossible. The OPT-TEE
needs to be involved to abstract the programming of the HW and copy back the
secure buffers to non-secure, making sure it is not being tricked into
delivering a copy of the wrong data. For the MMU approach, no copy is needed,
but to be sure the memory being mapped into the Linux Kernel MMU is the right
one, some level of abstraction of the CODEC is needed.

In short, you need a mix of secure and non-secure memory. This is a huge
challenge that isn't well covered by any secure memory design at the moment, its
not even clear if the HW can work. Remember that these feedback buffers are not
exposed to userspace, hence cannot be allocated there. Recent discussion shows
that NXP might be just giving up on their stateless codec so they can solve this
with a full codec abstraction (stateful codec).

Feedback buffers also exist in stateless encoders, but we don't have yet
existing drivers for that. Encoders also have to deal with secure memory,
notably when encoding from HDCP enabled HDMI receivers. Though this task is
quite likely limited to dedicated system, which can be considered secure as a
whole, time will define this.

>
> Just one place that I think it would be more simple for me to manage the
> buffer here. When the decoder goes to the drain stage, then the MV
> buffer goes when the data buffer goes and create when the data buffer
> creates.
> I know that is not a lot of work to doing the mapping between them. I
> just need to convince the other accepting that do not allocator the MV
> buffer outside.

Its also a big memory saver if you manage to convince them.

> >
> > Best regards,
> > Tomasz
> >
> > > > >
> > > > > > > > Signed-off-by: Hsia-Jun(Randy) Li <[email protected]>
> > > > > > > > ---
> > > > > > > > drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > > > > > drivers/media/v4l2-core/v4l2-ioctl.c | 2 ++
> > > > > > > > include/uapi/linux/videodev2.h | 2 ++
> > > > > > > > 3 files changed, 5 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > > > > > { .format = V4L2_PIX_FMT_SGBRG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > { .format = V4L2_PIX_FMT_SGRBG12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > { .format = V4L2_PIX_FMT_SRGGB12, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > + { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > > > > > };
> > > > > > > > unsigned int i;
> > > > > > > >
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > index e6fd355a2e92..8f65964aff08 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > > > > > case V4L2_PIX_FMT_MT21C: descr = "Mediatek Compressed Format"; break;
> > > > > > > > case V4L2_PIX_FMT_QC08C: descr = "QCOM Compressed 8-bit Format"; break;
> > > > > > > > case V4L2_PIX_FMT_QC10C: descr = "QCOM Compressed 10-bit Format"; break;
> > > > > > > > + case V4L2_PIX_FMT_NV12M_V4H1C: descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > > > > > + case V4L2_PIX_FMT_NV12M_10_V4H3P8C: descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > > > > > default:
> > > > > > > > if (fmt->description[0])
> > > > > > > > return;
> > > > > > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > > > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > > > > > --- a/include/uapi/linux/videodev2.h
> > > > > > > > +++ b/include/uapi/linux/videodev2.h
> > > > > > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > > > > > #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12 Y/CbCr 4:2:0 16x16 tiles */
> > > > > > > > #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > > > > > #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2') /* 12 Y/CbCr 4:2:0 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0') /* 12 Y/CbCr 4:2:0 10-bits tiles */
> > > > > > > >
> > > > > > > > /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e= */
> > > > > > > > #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
> > > > >
> > > >
> > >
> > > --
> > > Hsia-Jun(Randy) Li
>

2022-08-23 19:07:43

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Le mardi 23 août 2022 à 15:40 +0800, Hsia-Jun Li a écrit :
> > In current state, If your driver can support it, userland does not strictly
> > need
> > to re-allocate if the resolution is changed to smaller. In most SVC
> > scenarios,
> > the largest resolution is known in advance, so pre-allocation can happen to
> > the
> When you play a video from Youtube, you may notice that starting
> resolution is low, then after it received more data knowning the
> bandwidth is enough, it would switch to a higher resolution. I don't
> think it would inform the codecs2 or OMX there is a higher target
> resolution.
>
> Besides, for the case of SVC in a conference system, the remote(gatway)
> would not tell you there is a higer resolution or frame rate because you
> can't receive it in negotiate stage, it could be permanently(device
> capability) or just bandwidth problem. Whether we know there is a higher
> requirement video depends on the transport protocols used here.
>
> The basic idea of SVC is that the low layer didn't depends on the upper
> layer, we can't tell how the bitstream usually.

I'm not saying against the fact the for drivers without IOMMU (hitting directly
into the CMA allocator), allocation latency is massive challenge, and a
mechanism to smoothly reallocate (rather then mass-reallocation) is needed in
the long run. This is what I'm referring to when saying that folks have
considered extending CREATE_BUFS() with a DELETE_BUFS() ioctl.

Note that there is tones of software trickery you can use to mitigate this. The
most simple one is to use CREATE_BUFS() instead of REQBUFS(). Instead of
reallocating all the buffers you need in one go, you would allocate them one by
one. This will distribute allocation latency. For stateful CODEC, most OMX
focused firmware needs to be modified for that, since they stick with the old
OMX spec which did not allow run-time allocation.

Another trick is to use a second codec session. Both stateful/stateless CODEC
have support for concurrent decoding. On the MSE requirement, is that the stream
transition happens only on keyframe boundary. Meaning, there is no need to reuse
the same session, you can create a new decoder in parallel, and that before the
drain is complete (after the event, before the last buffer). This will compress
the "setup" latency, to the cost of some extra memory usage. Specially in the
MSE case, this is nearly always possible since browsers do require support for
more then 1 concurrent decode. This method also works with OMX style CODEC
without any modification.

regards,
Nicolas