Received: by 2002:ab2:6a05:0:b0:1f8:1780:a4ed with SMTP id w5csp1967407lqo; Mon, 13 May 2024 04:15:10 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVM3laH8FYxJmRa3kQY4/mu37+NBp899HSdWVuvt273hCao5TES9JngbZOLzCe60E/9joC0j7LMfGBgew6XOuQceySrDyum4w1qpTuwrg== X-Google-Smtp-Source: AGHT+IGvabeOmaajAfnj78R7dllMNHx/+ZOcu+ntOiJ5ILmFfYoyw9WTnqPS5nRkDhI2WJa4Zow1 X-Received: by 2002:a05:622a:1456:b0:43d:fca8:910d with SMTP id d75a77b69052e-43dfdabf02bmr106067241cf.24.1715598910415; Mon, 13 May 2024 04:15:10 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1715598910; cv=pass; d=google.com; s=arc-20160816; b=PllFTXeJO+QAViS3rwD8crgbb+Zwm+tGJljAVmLwNtCQON/nuUEoMbCS7yGfCa52qs xTUZnxJcrw2AgiFQqKQsv+CsyfZisJCZkCIQwKb4CsBDsxRo0u69aX6m/VmcwZuZwwcM s91OwJuFDytiFkQJXOkd9N7BcDpHfXRDDXBU44/tmcHsOKvK5ocn6/l81EKlzgZ5mksw xfu3gHg0z7CvmYozqzLcXRZuKn10QfDMnOUVVa46bqyLo8cjg8nxKpdMF/gLudFqi5Xz QRHjVANwclMpnNWkRpI0U4BoYQwIyVz01C5NfCf+9p1DaZfCZBSzzG8/oJJKAVZWRXHz Er4A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=4tqbZKsvoCGqzBnkK7ySUuL7yWh5FclO6wHYU2I4h8Q=; fh=VxQoP5gQJM8M5WdURpOhy8n1wMbJ2KLm37NA5wrNepI=; b=QFdFbqVfXzwfmUd7MmtPs562Ft6cvRpU2c54nZp3iXv/4IjXckAzZwTcSdX/R12/gN 3/UT7YN/CE/Wh1PCBwSEPYHizPG/sbzsP9DXYIQ92ZFxSvuheChES30ejPdikLYur4VY PPvmsYbXpGBc1W94mL223ZVUYAI5GFowbMVvjZcd2vK+OTzqLIiY3qDRgm7TUiXDcaCR 5b0qvhnwTV7Li3Xb4/ovkCINVaNZIhPpgwFF9HOeaB52nlQ63FVPkYFkQPaSsxNtNTNN zWdd1ZPodjvNwAMW41HE4ClFCcuzdggz8lh/7CbWlYPA6VVWbLxB/xr09eUfkmfSNeS0 4UVA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b="YRq/UkVo"; arc=pass (i=1 spf=pass spfdomain=collabora.com dkim=pass dkdomain=collabora.com dmarc=pass fromdomain=collabora.com); spf=pass (google.com: domain of linux-kernel+bounces-177490-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-177490-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id d75a77b69052e-43df566f8dfsi77137311cf.358.2024.05.13.04.15.10 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 May 2024 04:15:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-177490-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b="YRq/UkVo"; arc=pass (i=1 spf=pass spfdomain=collabora.com dkim=pass dkdomain=collabora.com dmarc=pass fromdomain=collabora.com); spf=pass (google.com: domain of linux-kernel+bounces-177490-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-177490-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 00FCF1C2296A for ; Mon, 13 May 2024 11:15:10 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5391B14BFAB; Mon, 13 May 2024 11:15:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="YRq/UkVo" Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4FFE7548E1 for ; Mon, 13 May 2024 11:15:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=46.235.227.194 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715598904; cv=none; b=gNizm9VpTsF5R4TJbDwCL/r/5nmNTP+oAU+tgz/7i3LSTNmUh6uCCj1gfvjQpL0Cn2aREGPTdW9c/EZbySseUsJRwQC6tjuArtsvLdLX3yNGWJ33srbiyny+hJtmwhWgDB5DXvo4wJ99DhsZ2pYqxKULXxeoThbVO+2el+skJRI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715598904; c=relaxed/simple; bh=7iUwfobNnT2nxOvxAP5Xkuao5iOl4azg6uUFMsPAI3U=; h=Date:From:To:Cc:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nDanyHdGXSe95n5GzSWOC1Ce8j7mRwBX3WadcKvgFE3eFDeLZv4KlZ+KduUALUtGUssfByo6XHohiJyEMi6WaacBpQYPQF1/3FqSXkecuraeoWs2mP8n9q8AUblq4qAOWRrAHC2xYUwAQ2DBnEs9uHc+KDNjKneMRAW48cUG9jA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=YRq/UkVo; arc=none smtp.client-ip=46.235.227.194 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1715598900; bh=7iUwfobNnT2nxOvxAP5Xkuao5iOl4azg6uUFMsPAI3U=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=YRq/UkVolqToE0Zlrc+rL7szh0o3Mmcpdfa+NnBhXsC30vub2FeUMwrM8yn5EjDGL V+PfcEEKojXLPVlMlxk1OW5grEeA/hJe5C1K48msL6ZnZ5LG+MypjGgaLy1OmTg3Ws MMAtX2eudEklJUbfqoCuMpxVociOK8AIDWpUKWb9mw74T6qq8SbccupIFvyu8gtuG0 5bpoFcBE8xowQHlzh7fEC6Et2A3h8yWm5KK+4C6h6hgqJa/1f6SBJI51/eRP0PyI9V 1mFT4svPol6d82/CoPqkzvHYEYOut3r8VANa1zrQQyGpdiHQUz4b1hGQzgX9e1mMML tIBDM3UXiDa+Q== Received: from eldfell (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (prime256v1) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: pq) by madrid.collaboradmins.com (Postfix) with ESMTPSA id 0B764378212E; Mon, 13 May 2024 11:14:58 +0000 (UTC) Date: Mon, 13 May 2024 14:14:49 +0300 From: Pekka Paalanen To: Louis Chauvet Cc: Rodrigo Siqueira , Melissa Wen , =?UTF-8?B?TWHDrXJh?= Canal , Haneen Mohammed , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , rdunlap@infradead.org, arthurgrillo@riseup.net, Jonathan Corbet , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, jeremie.dautheribes@bootlin.com, miquel.raynal@bootlin.com, thomas.petazzoni@bootlin.com, seanpaul@google.com, marcheu@google.com, nicolejadeyee@google.com Subject: Re: [PATCH v6 07/17] drm/vkms: Update pixels accessor to support packed and multi-plane formats. Message-ID: <20240513141449.662ea39d.pekka.paalanen@collabora.com> In-Reply-To: References: <20240409-yuv-v6-0-de1c5728fd70@bootlin.com> <20240409-yuv-v6-7-de1c5728fd70@bootlin.com> <20240422140757.576e363b.pekka.paalanen@collabora.com> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; x86_64-pc-linux-gnu) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: multipart/signed; boundary="Sig_/+HtRlSbEj54KW42m5d4eUu+"; protocol="application/pgp-signature"; micalg=pgp-sha256 --Sig_/+HtRlSbEj54KW42m5d4eUu+ Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On Mon, 13 May 2024 09:15:13 +0200 Louis Chauvet wrote: > Le 22/04/24 - 14:07, Pekka Paalanen a =C3=A9crit : > > On Tue, 09 Apr 2024 15:25:25 +0200 > > Louis Chauvet wrote: > > =20 > > > Introduce the usage of block_h/block_w to compute the offset and the > > > pointer of a pixel. The previous implementation was specialized for > > > planes with block_h =3D=3D block_w =3D=3D 1. To avoid confusion and a= llow easier > > > implementation of tiled formats. It also remove the usage of the > > > deprecated format field `cpp`. > > >=20 > > > Introduce the plane_index parameter to get an offset/pointer on a > > > different plane. > > >=20 > > > Signed-off-by: Louis Chauvet > > > --- > > > drivers/gpu/drm/vkms/vkms_formats.c | 110 ++++++++++++++++++++++++++= ++-------- > > > 1 file changed, 87 insertions(+), 23 deletions(-) > > >=20 > > > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vk= ms/vkms_formats.c > > > index 69cf9733fec5..9a1400ad4db6 100644 > > > --- a/drivers/gpu/drm/vkms/vkms_formats.c > > > +++ b/drivers/gpu/drm/vkms/vkms_formats.c > > > @@ -10,22 +10,43 @@ > > > #include "vkms_formats.h" > > > =20 > > > /** > > > - * pixel_offset() - Get the offset of the pixel at coordinates x/y i= n the first plane > > > + * packed_pixels_offset() - Get the offset of the block containing t= he pixel at coordinates x/y > > > * > > > * @frame_info: Buffer metadata > > > * @x: The x coordinate of the wanted pixel in the buffer > > > * @y: The y coordinate of the wanted pixel in the buffer > > > + * @plane_index: The index of the plane to use > > > + * @offset: The returned offset inside the buffer of the block > > > + * @rem_x,@rem_y: The returned coordinate of the requested pixel in = the block > > > * > > > - * The caller must ensure that the framebuffer associated with this = request uses a pixel format > > > - * where block_h =3D=3D block_w =3D=3D 1. > > > - * If this requirement is not fulfilled, the resulting offset can po= int to an other pixel or > > > - * outside of the buffer. > > > + * As some pixel formats store multiple pixels in a block (DRM_FORMA= T_R* for example), some > > > + * pixels are not individually addressable. This function return 3 v= alues: the offset of the > > > + * whole block, and the coordinate of the requested pixel inside thi= s block. > > > + * For example, if the format is DRM_FORMAT_R1 and the requested coo= rdinate is 13,5, the offset > > > + * will point to the byte 5*pitches + 13/8 (second byte of the 5th l= ine), and the rem_x/rem_y > > > + * coordinates will be (13 % 8, 5 % 1) =3D (5, 0) > > > + * > > > + * With this function, the caller just have to extract the correct p= ixel from the block. > > > */ > > > -static size_t pixel_offset(const struct vkms_frame_info *frame_info,= int x, int y) > > > +static void packed_pixels_offset(const struct vkms_frame_info *frame= _info, int x, int y, > > > + int plane_index, int *offset, int *rem_x, int *rem_y) > > > { > > > struct drm_framebuffer *fb =3D frame_info->fb; > > > + const struct drm_format_info *format =3D frame_info->fb->format; > > > + /* Directly using x and y to multiply pitches and format->ccp is no= t sufficient because > > > + * in some formats a block can represent multiple pixels. > > > + * > > > + * Dividing x and y by the block size allows to extract the correct= offset of the block > > > + * containing the pixel. > > > + */ > > > =20 > > > - return fb->offsets[0] + (y * fb->pitches[0]) + (x * fb->format->cpp= [0]); > > > + int block_x =3D x / drm_format_info_block_width(format, plane_index= ); > > > + int block_y =3D y / drm_format_info_block_height(format, plane_inde= x); > > > + *rem_x =3D x % drm_format_info_block_width(format, plane_index); > > > + *rem_y =3D y % drm_format_info_block_height(format, plane_index); > > > + *offset =3D fb->offsets[plane_index] + > > > + block_y * fb->pitches[plane_index] + > > > + block_x * format->char_per_block[plane_index]; =20 > >=20 > > I started thinking... is > >=20 > > + block_y * fb->pitches[plane_index] + > >=20 > > correct, or should it be > >=20 > > + y * fb->pitches[plane_index] + > >=20 > > ? =20 >=20 > The documentation is not very clear about that: >=20 > * @pitches: Line stride per buffer. For userspace created object= this > * is copied from drm_mode_fb_cmd2. >=20 > If I look at the drm_mode_fb_cmd2, there is this documentation: >=20 > /** @pitches: Pitch (aka. stride) in bytes, one per plane. */ >=20 > For me, I interpret "stride" as it is used in matrix calculation, where it > means "the number of bytes between two number adjacent verticaly" > (&matrix[x,y] + stride =3D=3D &matrix[x,y+1]). >=20 > So in a graphic context, I interpret a stride as the number of byte to > reach the next line of blocks (as pixels can not always be accessed > individually). >=20 > So, for me, buffer_size_in_byte >=3D stride * number_of_lines. This is the definition, yes. Even for blocky formats, it is still number of 1-pixel-high lines, even though one cannot address a line as such. For blocky formats it is a theoretical value, and computing with it only makes sense when your number of lines is a multiple of block height. That's my recollection. This has been hashed in issues like https://gitlab.freedesktop.org/wayland/weston/-/issues/896 > > I'm looking at drm_format_info_min_pitch() which sounds like it should > > be the latter? Because of > > > > return DIV_ROUND_UP_ULL((u64)buffer_width * info->char_per_bloc= k[plane], > > drm_format_info_block_width(info, plane) * > > drm_format_info_block_height(info, plane)); > > > > in drm_format_info_min_pitch(). =20 >=20 > This function doesn't make sense to me. I can't understand how it could > work. >=20 > If I consider the X0L2 format, with block_h =3D=3D block_w =3D=3D 2, > char_per_block =3D 8, and a framebuffer of 1 * 10 pixels, the result of > drm_format_info_min_pitch is 2. If buffer_width is 1, then buffer_width / block_w is 1 because of rounding up. Cannot have images with partial blocks. That is a block-row of one block. That block takes char_per_block bytes, that is, 8 bytes here. But block height is 2, and stride is computed for 1-pixel-high line, so we divide by block_h, and the result is stride =3D 4 bytes. However, 1 * 8 / (2 * 2) =3D 2. I think the code is bugged, and the round-up happens at a wrong point. The correct form would be div_round_up(div_round_up(buffer_width, block_w) * char_per_block, block_h) in my opinion. There must always be an integer number of blocks on a block-row. If a block-row has multiple blocks, doing a div_round_up(char_per_block, block_h) would over-estimate the per-block bytes and add the extra for each block rather than averaging it out. So, multiplying by char_per_block before dividing by block_h gives a stricter minimum stride. The condition buffer_size_bytes >=3D buffer_height * stride is necessary but not always sufficient, because the buffer must hold an integer number of block-rows. > However, for this format and this framebuffer, I think the stride should > be at least 8 bytes (the buffer is "1 block width"). I believe the stride is 4 bytes, because stride for a 1-pixel-high line on average, rounded up. Stride allows for unused blocks per block-row, but its use for buffer size checking with buffer_height is incomplete, I believe. For the question "is buffer_size enough?" it may produce false-positives, but it cannot produce false-negatives. Thanks, pq > If pitch equals 2 (as suggested by the test), it implies that > height * pitch is not valid for calculating the minimum size of the buffer > (in our case, 10 * 2 =3D 20 bytes, but the minimum framebuffer size should > be 5 blocks * 8 bytes_per_block =3D 40 bytes). And I don't understand what > the 2 represents in this context. > Is it the number of pixels on a line (which is strange, because pitch=20 > should be in byte)? The width in byte of the first line, but by using the= =20 > "byte per pixel" value (which make no sense when using block formats)? >=20 > If pitch equals 8 (as I would expect), height * pitch is not optimal (but > at least sufficient to contain the entire framebuffer), and height * pitch > / block_h is optimal (which is logical, because height/block_h is the=20 > number of block per columns). >=20 > > Btw. maybe this should check that the result is not negative (e.g. due > > to overflow)? Or does that even work since signed overflow is undefined > > behavior (UB) and compilers may assume UB does not happen, causing the > > check to be eliminated as dead code? > > > > Otherwise this patch looks ok to me. > >=20 > >=20 > > Thanks, > > pq > > =20 >=20 > [...] >=20 > -- > Louis Chauvet, Bootlin > Embedded Linux and Kernel engineering > https://bootlin.com --Sig_/+HtRlSbEj54KW42m5d4eUu+ Content-Type: application/pgp-signature Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEJQjwWQChkWOYOIONI1/ltBGqqqcFAmZB9ioACgkQI1/ltBGq qqcUJA/+IdlwBdr+FJmO3CcHiS+nrrHFGN++uZ9PuwM5Lgej732L7EpemDQfTiUH wOm+QjnvbS+RxHlKDyCqD1fmHyH+nf0i1cuo6UtqeobUmaPzsKlNYdliB5nMO/ux czIQuObrdhSiEXeEgM8SHrY5iuKnMCwpYm9nQ4jBXCDZrzMUYgEHQXLvwRYvzVsf Koeoo0LMnsKzA6gDwWJpqV559x5GTFX0io2fYDHUsb6oND22fOOkPQtVE21wYjyg 3dM58JGblCUJeP5XMrGayfCOFgB96D6p9ldjnhoXV8tAdpZnv+Xi8goDzs8gTtZV dINLToRWldysnAD6O7QfEPAyycp4bMupkMSZY7tyB8lHr1n/HflJndAczZwq6uza XeRzn88hZ0j/JA/lOvHq7kG1Bwyo3FLCmFeb5jdpiqYLvSB1EbFqonZMTjjr91YB bThcslH1m7mrbePXGnDMbHN2gAmRTH96uncMYCL3vZ2KPK6kxekq3FQVpclcz/1i qJ9H0ngeEZaTJGJTbrl6GKvGj0kxKctPVNm+v5iTzabo560gdLQPbJeD/cy2nN3J tMc3qKc00en9U3xCkZQyUAtcvpiOIRiQsJiuyuauA6l4DmEiIARIc8xRSjhGIDdm ASvVvGuGd1JON7tZHRS8BlJ1yrlg8RRZWaTIzmk1RVQ9INsndTk= =gM+J -----END PGP SIGNATURE----- --Sig_/+HtRlSbEj54KW42m5d4eUu+--