2024-03-04 15:59:17

by Louis Chauvet

Subject: [PATCH v4 00/14] drm/vkms: Reimplement line-per-line pixel conversion for plane reading

This patchset is the second version of [1]. It is almost a complete
rewrite to use a line-by-line algorithm for the composition.
It can be divided in four parts:
- PATCH 1 to 4: no functional change is intended, only some formatting and
documenting (PATCH 2 is taken from [2])
- PATCH 5 to 8: Some preparation work not directly related to the
line-by-line algorithm
- PATCH 9: main patch for this series, it reintroduces the
line-by-line algorithm
- PATCH 10 to 13: taken from Arthur's series [2], sometimes adapted
to use the line-by-line algorithm.

PATCH 9 aims to restore the line-by-line pixel reading algorithm. It
was introduced in 8ba1648567e2 ("drm: vkms: Refactor the plane composer to
accept new formats") but removed in 8ba1648567e2 ("drm: vkms: Refactor the
plane composer to accept new formats") in an over-simplification effort.
At that time, nobody noticed the performance impact of this commit. After
the first iteration of my series, people noticed a performance impact, and
it was indeed the case. Pekka suggested reimplementing the line-by-line
algorithm.

Experiments on my side showed a great improvement with the line-by-line
algorithm, and the performance is the same as with the original
line-by-line algorithm. I focused my effort on making the code work for
all the rotations and translations. Using the drm_rect_* helpers avoids
reimplementing existing logic.
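
As an illustration only (this is not code from the series), the idea of
reading one output line in a given direction can be sketched in userspace C
as below. The enum, the function names and the 32-bit pixel layout are
assumptions made for the example; the series itself uses its own
pixel_read_direction enum and per-format callbacks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical enum for this example, modeled on the idea of a read direction. */
enum read_direction {
	READ_LEFT_TO_RIGHT,
	READ_RIGHT_TO_LEFT,
	READ_TOP_TO_BOTTOM,
	READ_BOTTOM_TO_TOP,
};

/* Byte offset between two consecutive pixels of the line being read. */
static ptrdiff_t step_for_direction(enum read_direction dir, ptrdiff_t pitch,
				    ptrdiff_t cpp)
{
	switch (dir) {
	case READ_LEFT_TO_RIGHT:
		return cpp;
	case READ_RIGHT_TO_LEFT:
		return -cpp;
	case READ_TOP_TO_BOTTOM:
		return pitch;
	case READ_BOTTOM_TO_TOP:
		return -pitch;
	}
	return 0;
}

/*
 * Read count 32-bit pixels starting at (x, y) and walking in direction dir.
 * The caller must ensure that every visited pixel is inside the buffer.
 */
static void read_line_u32(const uint8_t *buf, ptrdiff_t pitch, int x, int y,
			  enum read_direction dir, int count, uint32_t *out)
{
	ptrdiff_t offset = y * pitch + x * (ptrdiff_t)sizeof(uint32_t);
	ptrdiff_t step = step_for_direction(dir, pitch, sizeof(uint32_t));

	for (int i = 0; i < count; i++) {
		memcpy(&out[i], buf + offset, sizeof(uint32_t));
		offset += step;
	}
}
```

With this shape, a rotated plane only changes the start coordinates and the
direction; the inner loop stays a pointer walk with a constant step.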

The only "complex" part remaining is the clipping of the coordinates to
avoid reading/writing outside of src/dst. I therefore added a lot of
comments to help anyone who later wants to add features (framebuffer
resizing for example).
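
A minimal sketch of that clipping, for the unrotated case only, could look
like the following. The struct and helper names are hypothetical and written
for userspace; in the kernel the drm_rect_* helpers play this role, and the
rotated cases need more care than shown here.

```c
#include <stdbool.h>

/* Hypothetical rectangle, x2/y2 exclusive (same convention as struct drm_rect). */
struct rect {
	int x1, y1, x2, y2;
};

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Shrink r to its intersection with clip. Returns false if nothing is left. */
static bool rect_clip(struct rect *r, const struct rect *clip)
{
	r->x1 = max_int(r->x1, clip->x1);
	r->y1 = max_int(r->y1, clip->y1);
	r->x2 = min_int(r->x2, clip->x2);
	r->y2 = min_int(r->y2, clip->y2);
	return r->x1 < r->x2 && r->y1 < r->y2;
}

/*
 * For an unrotated plane, return the first source column to read once dst
 * has been clipped: the columns cut on the left of dst are skipped in src.
 */
static int first_src_x(const struct rect *src, const struct rect *dst,
		       const struct rect *clipped_dst)
{
	return src->x1 + (clipped_dst->x1 - dst->x1);
}
```

For a plane placed at x = -10 on the CRTC, the clipped destination starts at
x = 0 and the read starts 10 pixels inside the source.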

The YUV part is not mandatory for this series, but as my first effort was
to help the integration of YUV, I decided to rebase Arthur's series on
mine to help. I took [3], [4], [5] and [6] and adapted them to use the
line-by-line reading. They were also updated to use 32.32 fixed-point
values for the YUV conversion instead of 8.8 fixed-point values.

My series was mainly tested with:
- kms_plane (for color conversions)
- kms_rotation_crc (for rotations of planes)
- kms_cursor_crc (for translations)
The benchmark used to measure the improvement was done with:
- kms_fb_stress

[1]: https://lore.kernel.org/r/[email protected]
[2]: https://lore.kernel.org/all/[email protected]/
[3]: https://lore.kernel.org/all/[email protected]/
[4]: https://lore.kernel.org/all/[email protected]/
[5]: https://lore.kernel.org/all/[email protected]/
[6]: https://lore.kernel.org/all/[email protected]/

To: Rodrigo Siqueira <[email protected]>
To: Melissa Wen <[email protected]>
To: Maíra Canal <[email protected]>
To: Haneen Mohammed <[email protected]>
To: Daniel Vetter <[email protected]>
To: Maarten Lankhorst <[email protected]>
To: Maxime Ripard <[email protected]>
To: Thomas Zimmermann <[email protected]>
To: David Airlie <[email protected]>
To: [email protected]
To: Jonathan Corbet <[email protected]>
To: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Louis Chauvet <[email protected]>

Note: after my changes, these tests seem to pass, so [7] may need
updating (I did not check, maybe it was already the case):
- kms_cursor_legacy@flip-vs-cursor-atomic
- kms_pipe_crc_basic@nonblocking-crc
- kms_pipe_crc_basic@nonblocking-crc-frame-sequence
- kms_writeback@writeback-pixel-formats
- kms_writeback@writeback-invalid-parameters
- kms_flip@flip-vs-absolute-wf_vblank-interruptible
And these tests pass, I did not investigate why the runners fail:
- kms_flip@flip-vs-expired-vblank-interruptible
- kms_flip@flip-vs-expired-vblank
- kms_flip@plain-flip-fb-recreate
- kms_flip@plain-flip-fb-recreate-interruptible
- kms_flip@plain-flip-ts-check-interruptible
- kms_cursor_legacy@cursorA-vs-flipA-toggle
- kms_pipe_crc_basic@nonblocking-crc
- kms_prop_blob@invalid-get-prop
- kms_flip@flip-vs-absolute-wf_vblank-interruptible
- kms_invalid_mode@zero-hdisplay
- kms_invalid_mode@bad-vtotal
- kms_cursor_crc.* (everything is SUCCEED or SKIP, but no fails)

[7]: https://lore.kernel.org/all/[email protected]/

Changes in v4:
- PATCH 3/14: Update comments for get_pixel_* functions
- PATCH 4/14: Add WARN when trying to get unsupported pixel_* functions
- PATCH 5/14: Create dummy pixel reader/writer to avoid NULL
function pointers and kernel OOPS
- PATCH 6/14: Added the usage of const pointers when needed
- PATCH 7/14: Extraction of pixel accessors modification
- PATCH 8/14: Extraction of the blending function modification
- PATCH 9/14: Extraction of the pixel_read_direction enum
- PATCH 10/14: Update direction_for_rotation documentation
- PATCH 10/14: Rename conversion functions to be explicit
- PATCH 10/14: Replace while(count) by while(out_pixel<end) in read_line
callbacks. It avoids a new variable+addition in the composition hot path.
- PATCH 11/14: Rename conversion functions to be explicit
- PATCH 11/14: Update the documentation for get_subsampling_offset
- PATCH 11/14: Add the matrix_conversion structure to remove a test from
the hot path.
- PATCH 11/14: Update matrix values to use 32.32 fixed-point values for
conversion
- PATCH 12/14: Update commit message
- PATCH 14/14: Change kunit expected value
- Link to v3: https://lore.kernel.org/r/[email protected]
Changes in v3:
- Correction of remaining git-rebase artefacts
- Added Pekka in copy of this patch
- Link to v2: https://lore.kernel.org/r/[email protected]
Changes in v2:
- Rebased the series on top of drm-misc/drm-misc-net
- Extract the typedef for pixel_read/pixel_write
- Introduce the line-by-line algorithm per pixel format
- Add some documentation for existing and new code
- Port the series [1] to use line-by-line algorithm
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Arthur Grillo (5):
drm/vkms: Use drm_frame directly
drm/vkms: Add YUV support
drm/vkms: Add range and encoding properties to the plane
drm/vkms: Drop YUV formats TODO
drm/vkms: Create KUnit tests for YUV conversions

Louis Chauvet (9):
drm/vkms: Code formatting
drm/vkms: write/update the documentation for pixel conversion and pixel write functions
drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions
drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers
drm/vkms: Use const for input pointers in pixel_read an pixel_write functions
drm/vkms: Update pixels accessor to support packed and multi-plane formats.
drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend
drm/vkms: Introduce pixel_read_direction enum
drm/vkms: Re-introduce line-per-line composition algorithm

Documentation/gpu/vkms.rst | 3 +-
drivers/gpu/drm/vkms/Kconfig | 15 +
drivers/gpu/drm/vkms/Makefile | 1 +
drivers/gpu/drm/vkms/tests/.kunitconfig | 4 +
drivers/gpu/drm/vkms/tests/Makefile | 3 +
drivers/gpu/drm/vkms/tests/vkms_format_test.c | 158 ++++++
drivers/gpu/drm/vkms/vkms_composer.c | 251 ++++++--
drivers/gpu/drm/vkms/vkms_crtc.c | 6 +-
drivers/gpu/drm/vkms/vkms_drv.c | 3 +-
drivers/gpu/drm/vkms/vkms_drv.h | 86 ++-
drivers/gpu/drm/vkms/vkms_formats.c | 790 ++++++++++++++++++++++----
drivers/gpu/drm/vkms/vkms_formats.h | 12 +-
drivers/gpu/drm/vkms/vkms_plane.c | 45 +-
drivers/gpu/drm/vkms/vkms_writeback.c | 5 -
14 files changed, 1167 insertions(+), 215 deletions(-)
---
base-commit: c079e2e113f2ec2803ba859bbb442a6ab82c96bd
change-id: 20240201-yuv-1337d90d9576

Best regards,
--
Louis Chauvet <[email protected]>



2024-03-04 16:00:26

by Louis Chauvet

Subject: [PATCH v4 03/14] drm/vkms: write/update the documentation for pixel conversion and pixel write functions

Add some documentation on pixel conversion functions.
Update outdated comments for pixel_write functions.

Signed-off-by: Louis Chauvet <[email protected]>
---
drivers/gpu/drm/vkms/vkms_composer.c | 7 ++++
drivers/gpu/drm/vkms/vkms_drv.h | 13 ++++++++
drivers/gpu/drm/vkms/vkms_formats.c | 62 ++++++++++++++++++++++++++++++------
3 files changed, 73 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index c6d9b4a65809..da0651a94c9b 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -189,6 +189,13 @@ static void blend(struct vkms_writeback_job *wb,

size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;

+ /*
+ * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
+ * complexity to avoid poor blending performance.
+ *
+ * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
+ * buffer.
+ */
for (size_t y = 0; y < crtc_y_limit; y++) {
fill_background(&background_color, output_buffer);

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index b4b357447292..18086423a3a7 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -25,6 +25,17 @@

#define VKMS_LUT_SIZE 256

+/**
+ * struct vkms_frame_info - structure to store the state of a frame
+ *
+ * @fb: backing drm framebuffer
+ * @src: source rectangle of this frame in the source framebuffer
+ * @dst: destination rectangle in the crtc buffer
+ * @map: see drm_shadow_plane_state@data
+ * @rotation: rotation applied to the source.
+ *
+ * @src and @dst should have the same size modulo the rotation.
+ */
struct vkms_frame_info {
struct drm_framebuffer *fb;
struct drm_rect src, dst;
@@ -52,6 +63,8 @@ struct vkms_writeback_job {
* vkms_plane_state - Driver specific plane state
* @base: base plane state
* @frame_info: data required for composing computation
+ * @pixel_read: function to read a pixel in this plane. The creator of a vkms_plane_state must
+ * ensure that this pointer is valid
*/
struct vkms_plane_state {
struct drm_shadow_plane_state base;
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 172830a3936a..6e3dc8682ff9 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -9,6 +9,18 @@

#include "vkms_formats.h"

+/**
+ * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane
+ *
+ * @frame_info: Buffer metadata
+ * @x: The x coordinate of the wanted pixel in the buffer
+ * @y: The y coordinate of the wanted pixel in the buffer
+ *
+ * The caller must ensure that the framebuffer associated with this request uses a pixel format
+ * where block_h == block_w == 1.
+ * If this requirement is not fulfilled, the resulting offset can point to another pixel or
+ * outside of the buffer.
+ */
static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y)
{
struct drm_framebuffer *fb = frame_info->fb;
@@ -17,18 +29,22 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
+ (x * fb->format->cpp[0]);
}

-/*
- * packed_pixels_addr - Get the pointer to pixel of a given pair of coordinates
+/**
+ * packed_pixels_addr() - Get the pointer to the block containing the pixel at the given
+ * coordinates
*
* @frame_info: Buffer metadata
- * @x: The x(width) coordinate of the 2D buffer
- * @y: The y(Heigth) coordinate of the 2D buffer
+ * @x: The x(width) coordinate inside the plane
+ * @y: The y(height) coordinate inside the plane
*
* Takes the information stored in the frame_info, a pair of coordinates, and
* returns the address of the first color channel.
* This function assumes the channels are packed together, i.e. a color channel
* comes immediately after another in the memory. And therefore, this function
* doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
+ *
+ * The caller must ensure that the framebuffer associated with this request uses a pixel format
+ * where block_h == block_w == 1, otherwise the returned pointer can be outside the buffer.
*/
static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
int x, int y)
@@ -53,6 +69,13 @@ static int get_x_position(const struct vkms_frame_info *frame_info, int limit, i
return x;
}

+/*
+ * The following functions take pixel data from the buffer and convert them to the format
+ * ARGB16161616 in out_pixel.
+ *
+ * They are used in the `vkms_compose_row` function to handle multiple formats.
+ */
+
static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
{
/*
@@ -145,12 +168,11 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
}

/*
- * The following functions take an line of argb_u16 pixels from the
- * src_buffer, convert them to a specific format, and store them in the
- * destination.
+ * The following functions take one argb_u16 pixel and convert it to a specific format. The
+ * result is stored in @dst_pixels.
*
- * They are used in the `compose_active_planes` to convert and store a line
- * from the src_buffer to the writeback buffer.
+ * They are used in the `vkms_writeback_row` to convert and store a pixel from the src_buffer to
+ * the writeback buffer.
*/
static void argb_u16_to_ARGB8888(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
{
@@ -216,6 +238,14 @@ static void argb_u16_to_RGB565(u8 *dst_pixels, struct pixel_argb_u16 *in_pixel)
*pixels = cpu_to_le16(r << 11 | g << 5 | b);
}

+/**
+ * Generic loop for all supported writeback format. It is executed just after the blending to
+ * write a line in the writeback buffer.
+ *
+ * @wb: Job where to insert the final image
+ * @src_buffer: Line to write
+ * @y: Row to write in the writeback buffer
+ */
void vkms_writeback_row(struct vkms_writeback_job *wb,
const struct line_buffer *src_buffer, int y)
{
@@ -229,6 +259,13 @@ void vkms_writeback_row(struct vkms_writeback_job *wb,
wb->pixel_write(dst_pixels, &in_pixels[x]);
}

+/**
+ * Retrieve the correct read_pixel function for a specific format.
+ * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
+ * pointer is valid before using it in a vkms_plane_state.
+ *
+ * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
+ */
void *get_pixel_conversion_function(u32 format)
{
switch (format) {
@@ -247,6 +284,13 @@ void *get_pixel_conversion_function(u32 format)
}
}

+/**
+ * Retrieve the correct write_pixel function for a specific format.
+ * The returned pointer is NULL for unsupported pixel formats. The caller must ensure that the
+ * pointer is valid before using it in a vkms_writeback_job.
+ *
+ * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
+ */
void *get_pixel_write_function(u32 format)
{
switch (format) {

--
2.43.0


2024-03-04 16:02:58

by Louis Chauvet

Subject: [PATCH v4 11/14] drm/vkms: Add YUV support

From: Arthur Grillo <[email protected]>

Add support for the YUV formats below:

- NV12/NV16/NV24
- NV21/NV61/NV42
- YUV420/YUV422/YUV444
- YVU420/YVU422/YVU444

The conversion from YUV to RGB is done with fixed-point arithmetic, using
32.32 fixed-point values.

To do the conversion, a specific matrix must be used for each color range
(DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
the `conversion_matrix` struct, along with the specific y_offset needed.
This matrix is queried only once, in `vkms_plane_atomic_update`, and
stored in a `vkms_plane_state`. The conversion matrices for each
encoding and range were obtained by rounding the values of the original
conversion matrices multiplied by 2^32. This is done to avoid the use of
floating-point operations.
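
To make the arithmetic concrete, here is a small userspace sketch of the red
channel for BT.601 full range, where r = y + 1.402 * (cr - 128) and
6021544149 is 1.402 * 2^32 rounded. The helper names are hypothetical, and
the round-to-nearest step (adding one half before the shift) is a choice made
for this example; it is not claimed to match the rounding used in the patch.

```c
#include <stdint.h>

#define FIXED_DEPTH 32 /* number of fractional bits, as in a 32.32 format */

/* Round a 32.32 fixed-point value to the nearest integer, clamped to 0..255. */
static uint8_t fixed_to_u8(int64_t value)
{
	const int64_t max = ((int64_t)256 << FIXED_DEPTH) - 1;

	value += (int64_t)1 << (FIXED_DEPTH - 1); /* add one half */
	if (value < 0)
		value = 0;
	if (value > max)
		value = max;
	return (uint8_t)(value >> FIXED_DEPTH);
}

/* Red channel only: first row of the BT.601 full-range matrix. */
static uint8_t bt601_full_red(uint8_t y, uint8_t cb, uint8_t cr)
{
	/* 1.0, 0.0 and 1.402, each multiplied by 2^32 and rounded */
	const int64_t row[3] = { 4294967296LL, 0, 6021544149LL };
	int64_t acc = row[0] * y + row[1] * (cb - 128) + row[2] * (cr - 128);

	return fixed_to_u8(acc);
}
```

All intermediate values stay in s64 integers: with 8-bit inputs the products
are far below the 64-bit limit, so no floating-point unit is needed.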

The same reading function is used for YUV and YVU formats. As the only
difference between those two categories of formats is the order of the
chroma fields, a simple swap of the conversion matrix columns allows using
the same function.
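
The column swap can be checked with a few lines of userspace C. The struct
and function names below are hypothetical and only serve the example:
multiplying the swapped matrix by (y, v, u) gives the same result as
multiplying the original matrix by (y, u, v).

```c
#include <stdint.h>

/* Hypothetical 3x3 fixed-point matrix for this example. */
struct matrix3 {
	int64_t m[3][3];
};

/* Return a copy of in with the two chroma columns (1 and 2) exchanged. */
static struct matrix3 swap_chroma_columns(struct matrix3 in)
{
	for (int row = 0; row < 3; row++) {
		int64_t tmp = in.m[row][1];

		in.m[row][1] = in.m[row][2];
		in.m[row][2] = tmp;
	}
	return in;
}

/* out = m * v, all values kept in fixed point. */
static void mat_mul_vec(const struct matrix3 *m, const int64_t v[3],
			int64_t out[3])
{
	for (int row = 0; row < 3; row++)
		out[row] = m->m[row][0] * v[0] + m->m[row][1] * v[1] +
			   m->m[row][2] * v[2];
}
```

Applied to the BT.601 full-range YUV values from this patch, the swap yields
the corresponding YVU table, which is why one read callback can serve both
families.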

Signed-off-by: Arthur Grillo <[email protected]>
[Louis Chauvet:
- Adapted Arthur's work
- Implemented the read_line_t callbacks for yuv
- add struct conversion_matrix
- remove struct pixel_yuv_u8
- update the commit message]
Signed-off-by: Louis Chauvet <[email protected]>
---
drivers/gpu/drm/vkms/vkms_drv.h | 23 +++
drivers/gpu/drm/vkms/vkms_formats.c | 374 ++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/vkms/vkms_formats.h | 4 +
drivers/gpu/drm/vkms/vkms_plane.c | 16 +-
4 files changed, 416 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 23e1d247468d..393b76e7c694 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -99,6 +99,28 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
int y_start, enum pixel_read_direction direction, int count,
struct pixel_argb_u16 out_pixel[]);

+
+/**
+ * CONVERSION_MATRIX_FLOAT_DEPTH - Number of fractional bits of the conversion matrix values
+ */
+#define CONVERSION_MATRIX_FLOAT_DEPTH 32
+
+/**
+ * struct conversion_matrix - Matrix to use for a specific encoding and range
+ *
+ * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
+ * used to compute rgb values from yuv values:
+ * [[r],[g],[b]] = @matrix * [[y],[u],[v]]
+ * OR for yvu formats:
+ * [[r],[g],[b]] = @matrix * [[y],[v],[u]]
+ * The values of the matrix are signed 32.CONVERSION_MATRIX_FLOAT_DEPTH fixed-point numbers
+ * @y_offset: Offset to apply on the y value.
+ */
+struct conversion_matrix {
+ s64 matrix[3][3];
+ s64 y_offset;
+};
+
/**
* vkms_plane_state - Driver specific plane state
* @base: base plane state
@@ -110,6 +132,7 @@ struct vkms_plane_state {
struct drm_shadow_plane_state base;
struct vkms_frame_info *frame_info;
pixel_read_line_t pixel_read_line;
+ struct conversion_matrix *conversion_matrix;
};

struct vkms_plane {
diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
index 87af3962ee12..d9b70d9b99ef 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.c
+++ b/drivers/gpu/drm/vkms/vkms_formats.c
@@ -90,6 +90,45 @@ static int get_step_1x1(struct drm_framebuffer *fb, enum pixel_read_direction di
return 0;
}

+/**
+ * get_subsampling() - Get the subsampling divisor value on a specific direction
+ */
+static int get_subsampling(const struct drm_format_info *format,
+ enum pixel_read_direction direction)
+{
+ switch (direction) {
+ case READ_BOTTOM_TO_TOP:
+ case READ_TOP_TO_BOTTOM:
+ return format->vsub;
+ case READ_RIGHT_TO_LEFT:
+ case READ_LEFT_TO_RIGHT:
+ return format->hsub;
+ }
+ WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
+ return 1;
+}
+
+/**
+ * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
+ * x_start and y_start values
+ */
+static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
+{
+ switch (direction) {
+ case READ_BOTTOM_TO_TOP:
+ return -y_start;
+ case READ_TOP_TO_BOTTOM:
+ return y_start;
+ case READ_RIGHT_TO_LEFT:
+ return -x_start;
+ case READ_LEFT_TO_RIGHT:
+ return x_start;
+ }
+ WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
+ return 0;
+}
+
+
/*
* The following functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
* ARGB16161616 in out_pixel.
@@ -146,6 +185,40 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
return out_pixel;
}

+static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
+ struct conversion_matrix *matrix)
+{
+ u8 r, g, b;
+ s64 y_16, cb_16, cr_16;
+ s64 r_16, g_16, b_16;
+
+ y_16 = y - matrix->y_offset;
+ cb_16 = cb - 128;
+ cr_16 = cr - 128;
+
+ r_16 = matrix->matrix[0][0] * y_16 + matrix->matrix[0][1] * cb_16 +
+ matrix->matrix[0][2] * cr_16;
+ g_16 = matrix->matrix[1][0] * y_16 + matrix->matrix[1][1] * cb_16 +
+ matrix->matrix[1][2] * cr_16;
+ b_16 = matrix->matrix[2][0] * y_16 + matrix->matrix[2][1] * cb_16 +
+ matrix->matrix[2][2] * cr_16;
+
+ // rounding the values
+ r_16 = r_16 + (1LL << (CONVERSION_MATRIX_FLOAT_DEPTH - 4));
+ g_16 = g_16 + (1LL << (CONVERSION_MATRIX_FLOAT_DEPTH - 4));
+ b_16 = b_16 + (1LL << (CONVERSION_MATRIX_FLOAT_DEPTH - 4));
+
+ r_16 = clamp(r_16, 0, (1LL << (CONVERSION_MATRIX_FLOAT_DEPTH + 8)) - 1);
+ g_16 = clamp(g_16, 0, (1LL << (CONVERSION_MATRIX_FLOAT_DEPTH + 8)) - 1);
+ b_16 = clamp(b_16, 0, (1LL << (CONVERSION_MATRIX_FLOAT_DEPTH + 8)) - 1);
+
+ r = r_16 >> CONVERSION_MATRIX_FLOAT_DEPTH;
+ g = g_16 >> CONVERSION_MATRIX_FLOAT_DEPTH;
+ b = b_16 >> CONVERSION_MATRIX_FLOAT_DEPTH;
+
+ return argb_u16_from_u8888(255, r, g, b);
+}
+
/*
* The following functions are read_line function for each pixel format supported by VKMS.
*
@@ -263,6 +336,70 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
}
}

+/*
+ * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
+ * (column inversion)
+ */
+static void semi_planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
+ int y_start, enum pixel_read_direction direction, int count,
+ struct pixel_argb_u16 out_pixel[])
+{
+ u8 *y_plane = packed_pixels_addr(plane->frame_info, x_start, y_start, 0);
+ u8 *uv_plane = packed_pixels_addr(plane->frame_info,
+ x_start / plane->frame_info->fb->format->hsub,
+ y_start / plane->frame_info->fb->format->vsub,
+ 1);
+ int step_y = get_step_1x1(plane->frame_info->fb, direction, 0);
+ int step_uv = get_step_1x1(plane->frame_info->fb, direction, 1);
+ int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
+ int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
+ struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
+
+ for (int i = 0; i < count; i++) {
+ *out_pixel = argb_u16_from_yuv888(y_plane[0], uv_plane[0], uv_plane[1],
+ conversion_matrix);
+ out_pixel += 1;
+ y_plane += step_y;
+ if ((i + subsampling_offset + 1) % subsampling == 0)
+ uv_plane += step_uv;
+ }
+}
+
+/*
+ * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
+ * (column inversion)
+ */
+static void planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
+ int y_start, enum pixel_read_direction direction, int count,
+ struct pixel_argb_u16 out_pixel[])
+{
+ u8 *y_plane = packed_pixels_addr(plane->frame_info, x_start, y_start, 0);
+ u8 *u_plane = packed_pixels_addr(plane->frame_info,
+ x_start / plane->frame_info->fb->format->hsub,
+ y_start / plane->frame_info->fb->format->vsub,
+ 1);
+ u8 *v_plane = packed_pixels_addr(plane->frame_info,
+ x_start / plane->frame_info->fb->format->hsub,
+ y_start / plane->frame_info->fb->format->vsub,
+ 2);
+ int step_y = get_step_1x1(plane->frame_info->fb, direction, 0);
+ int step_u = get_step_1x1(plane->frame_info->fb, direction, 1);
+ int step_v = get_step_1x1(plane->frame_info->fb, direction, 2);
+ int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
+ int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
+ struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
+
+ for (int i = 0; i < count; i++) {
+ *out_pixel = argb_u16_from_yuv888(*y_plane, *u_plane, *v_plane, conversion_matrix);
+ out_pixel += 1;
+ y_plane += step_y;
+ if ((i + subsampling_offset + 1) % subsampling == 0) {
+ u_plane += step_u;
+ v_plane += step_v;
+ }
+ }
+}
+
/*
* The following functions take one argb_u16 pixel and convert it to a specific format. The
* result is stored in @out_pixel.
@@ -384,6 +521,20 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
return &XRGB16161616_read_line;
case DRM_FORMAT_RGB565:
return &RGB565_read_line;
+ case DRM_FORMAT_NV12:
+ case DRM_FORMAT_NV16:
+ case DRM_FORMAT_NV24:
+ case DRM_FORMAT_NV21:
+ case DRM_FORMAT_NV61:
+ case DRM_FORMAT_NV42:
+ return &semi_planar_yuv_read_line;
+ case DRM_FORMAT_YUV420:
+ case DRM_FORMAT_YUV422:
+ case DRM_FORMAT_YUV444:
+ case DRM_FORMAT_YVU420:
+ case DRM_FORMAT_YVU422:
+ case DRM_FORMAT_YVU444:
+ return &planar_yuv_read_line;
default:
/*
* This is a bug in vkms_plane_atomic_check. All the supported
@@ -400,6 +551,229 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
}
}

+/**
+ * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
+ * given encoding and range.
+ *
+ * If the combination is not supported, a warning is emitted and a simple diagonal matrix, which
+ * acts as a "no-op", is returned. The same matrix is returned for RGB formats.
+ *
+ * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
+ * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
+ * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
+ */
+struct conversion_matrix *
+get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
+ enum drm_color_range range)
+{
+ static struct conversion_matrix no_operation = {
+ .matrix = {
+ { 4294967296, 0, 0, },
+ { 0, 4294967296, 0, },
+ { 0, 0, 4294967296, },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yuv_bt601_full = {
+ .matrix = {
+ { 4294967296, 0, 6021544149 },
+ { 4294967296, -1478054095, -3067191994 },
+ { 4294967296, 7610682049, 0 },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yuv_bt601_limited = {
+ .matrix = {
+ { 5020601039, 0, 6881764740 },
+ { 5020601039, -1689204679, -3505362278 },
+ { 5020601039, 8697922339, 0 },
+ },
+ .y_offset = 16,
+ };
+ static struct conversion_matrix yuv_bt709_full = {
+ .matrix = {
+ { 4294967296, 0, 6763714498 },
+ { 4294967296, -804551626, -2010578443 },
+ { 4294967296, 7969741314, 0 },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yuv_bt709_limited = {
+ .matrix = {
+ { 5020601039, 0, 7729959424 },
+ { 5020601039, -919487572, -2297803934 },
+ { 5020601039, 9108275786, 0 },
+ },
+ .y_offset = 16,
+ };
+ static struct conversion_matrix yuv_bt2020_full = {
+ .matrix = {
+ { 4294967296, 0, 6333358775 },
+ { 4294967296, -706750298, -2453942994 },
+ { 4294967296, 8080551471, 0 },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yuv_bt2020_limited = {
+ .matrix = {
+ { 5020601039, 0, 7238124312 },
+ { 5020601039, -807714626, -2804506279 },
+ { 5020601039, 9234915964, 0 },
+ },
+ .y_offset = 16,
+ };
+ static struct conversion_matrix yvu_bt601_full = {
+ .matrix = {
+ { 4294967296, 6021544149, 0 },
+ { 4294967296, -3067191994, -1478054095 },
+ { 4294967296, 0, 7610682049 },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yvu_bt601_limited = {
+ .matrix = {
+ { 5020601039, 6881764740, 0 },
+ { 5020601039, -3505362278, -1689204679 },
+ { 5020601039, 0, 8697922339 },
+ },
+ .y_offset = 16,
+ };
+ static struct conversion_matrix yvu_bt709_full = {
+ .matrix = {
+ { 4294967296, 6763714498, 0 },
+ { 4294967296, -2010578443, -804551626 },
+ { 4294967296, 0, 7969741314 },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yvu_bt709_limited = {
+ .matrix = {
+ { 5020601039, 7729959424, 0 },
+ { 5020601039, -2297803934, -919487572 },
+ { 5020601039, 0, 9108275786 },
+ },
+ .y_offset = 16,
+ };
+ static struct conversion_matrix yvu_bt2020_full = {
+ .matrix = {
+ { 4294967296, 6333358775, 0 },
+ { 4294967296, -2453942994, -706750298 },
+ { 4294967296, 0, 8080551471 },
+ },
+ .y_offset = 0,
+ };
+ static struct conversion_matrix yvu_bt2020_limited = {
+ .matrix = {
+ { 5020601039, 7238124312, 0 },
+ { 5020601039, -2804506279, -807714626 },
+ { 5020601039, 0, 9234915964 },
+ },
+ .y_offset = 16,
+ };
+
+ /* Breaking in this switch means that the color format+encoding+range is not supported */
+ switch (format) {
+ case DRM_FORMAT_NV12:
+ case DRM_FORMAT_NV16:
+ case DRM_FORMAT_NV24:
+ case DRM_FORMAT_YUV420:
+ case DRM_FORMAT_YUV422:
+ case DRM_FORMAT_YUV444:
+ switch (encoding) {
+ case DRM_COLOR_YCBCR_BT601:
+ switch (range) {
+ case DRM_COLOR_YCBCR_LIMITED_RANGE:
+ return &yuv_bt601_limited;
+ case DRM_COLOR_YCBCR_FULL_RANGE:
+ return &yuv_bt601_full;
+ case DRM_COLOR_RANGE_MAX:
+ break;
+ }
+ break;
+ case DRM_COLOR_YCBCR_BT709:
+ switch (range) {
+ case DRM_COLOR_YCBCR_LIMITED_RANGE:
+ return &yuv_bt709_limited;
+ case DRM_COLOR_YCBCR_FULL_RANGE:
+ return &yuv_bt709_full;
+ case DRM_COLOR_RANGE_MAX:
+ break;
+ }
+ break;
+ case DRM_COLOR_YCBCR_BT2020:
+ switch (range) {
+ case DRM_COLOR_YCBCR_LIMITED_RANGE:
+ return &yuv_bt2020_limited;
+ case DRM_COLOR_YCBCR_FULL_RANGE:
+ return &yuv_bt2020_full;
+ case DRM_COLOR_RANGE_MAX:
+ break;
+ }
+ break;
+ case DRM_COLOR_ENCODING_MAX:
+ break;
+ }
+ break;
+ case DRM_FORMAT_YVU420:
+ case DRM_FORMAT_YVU422:
+ case DRM_FORMAT_YVU444:
+ case DRM_FORMAT_NV21:
+ case DRM_FORMAT_NV61:
+ case DRM_FORMAT_NV42:
+ switch (encoding) {
+ case DRM_COLOR_YCBCR_BT601:
+ switch (range) {
+ case DRM_COLOR_YCBCR_LIMITED_RANGE:
+ return &yvu_bt601_limited;
+ case DRM_COLOR_YCBCR_FULL_RANGE:
+ return &yvu_bt601_full;
+ case DRM_COLOR_RANGE_MAX:
+ break;
+ }
+ break;
+ case DRM_COLOR_YCBCR_BT709:
+ switch (range) {
+ case DRM_COLOR_YCBCR_LIMITED_RANGE:
+ return &yvu_bt709_limited;
+ case DRM_COLOR_YCBCR_FULL_RANGE:
+ return &yvu_bt709_full;
+ case DRM_COLOR_RANGE_MAX:
+ break;
+ }
+ break;
+ case DRM_COLOR_YCBCR_BT2020:
+ switch (range) {
+ case DRM_COLOR_YCBCR_LIMITED_RANGE:
+ return &yvu_bt2020_limited;
+ case DRM_COLOR_YCBCR_FULL_RANGE:
+ return &yvu_bt2020_full;
+ case DRM_COLOR_RANGE_MAX:
+ break;
+ }
+ break;
+ case DRM_COLOR_ENCODING_MAX:
+ break;
+ }
+ break;
+ case DRM_FORMAT_ARGB8888:
+ case DRM_FORMAT_XRGB8888:
+ case DRM_FORMAT_ARGB16161616:
+ case DRM_FORMAT_XRGB16161616:
+ case DRM_FORMAT_RGB565:
+ /*
+ * Those formats are supported, but they don't need a conversion matrix. Return
+ * a valid pointer to avoid kernel panic in case this matrix is used/checked
+ * somewhere.
+ */
+ return &no_operation;
+ default:
+ break;
+ }
+ WARN(true, "Unsupported encoding (%d), range (%d) and format (%4cc) combination\n",
+ encoding, range, format);
+ return &no_operation;
+}
+
/**
* Retrieve the correct write_pixel function for a specific format.
* If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
index 8d2bef95ff79..e1d324764b17 100644
--- a/drivers/gpu/drm/vkms/vkms_formats.h
+++ b/drivers/gpu/drm/vkms/vkms_formats.h
@@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);

pixel_write_t get_pixel_write_function(u32 format);

+struct conversion_matrix *
+get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
+ enum drm_color_range range);
+
#endif /* _VKMS_FORMATS_H_ */
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 8875bed76410..93d0a39fa8c5 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
DRM_FORMAT_XRGB8888,
DRM_FORMAT_XRGB16161616,
DRM_FORMAT_ARGB16161616,
- DRM_FORMAT_RGB565
+ DRM_FORMAT_RGB565,
+ DRM_FORMAT_NV12,
+ DRM_FORMAT_NV16,
+ DRM_FORMAT_NV24,
+ DRM_FORMAT_NV21,
+ DRM_FORMAT_NV61,
+ DRM_FORMAT_NV42,
+ DRM_FORMAT_YUV420,
+ DRM_FORMAT_YUV422,
+ DRM_FORMAT_YUV444,
+ DRM_FORMAT_YVU420,
+ DRM_FORMAT_YVU422,
+ DRM_FORMAT_YVU444
};

static struct drm_plane_state *
@@ -123,6 +135,8 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,


vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
+ vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
+ (fmt, new_state->color_encoding, new_state->color_range);
}

static int vkms_plane_atomic_check(struct drm_plane *plane,

--
2.43.0