Date: Fri, 8 Sep 2023 13:11:59 -0100
From: Melissa Wen
To: Harry Wentland
Cc: amd-gfx@lists.freedesktop.org, Rodrigo Siqueira, sunpeng.li@amd.com,
    Alex Deucher, dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
    Xinhui.Pan@amd.com, airlied@gmail.com, daniel@ffwll.ch, Joshua Ashton,
    Sebastian Wick, Xaver Hugl, Shashank Sharma, Nicholas Kazlauskas,
    sungjoon.kim@amd.com, Alex Hung, Pekka Paalanen, Simon Ser,
    kernel-dev@igalia.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 01/34] drm/amd/display: fix segment distribution for linear LUTs
Message-ID: <20230908141159.6hfne5r7hxi6bycs@mail.igalia.com>
References: <20230810160314.48225-1-mwen@igalia.com>
    <20230810160314.48225-2-mwen@igalia.com>
    <7e11c23d-2824-4f32-b863-13cc631a6d40@amd.com>
In-Reply-To: <7e11c23d-2824-4f32-b863-13cc631a6d40@amd.com>

On 09/06, Harry Wentland wrote:
> On 2023-08-10 12:02, Melissa Wen wrote:
> > From: Harry Wentland
> >
> > The region and segment calculation was incapable of dealing
> > with regions of more than 16 segments. We first fix this.
> >
> > Now that we can support regions up to 256 elements we can
> > define a better segment distribution for near-linear LUTs
> > for our maximum of 256 HW-supported points.
> >
> > With these changes an "identity" LUT looks visually
> > indistinguishable from bypass and allows us to use
> > our 3DLUT.
> >
>
> Have you had a chance to test whether this patch makes a
> difference? I haven't had the time yet.

Last time I tested there was a banding issue on plane shaper LUT
PQ -> Display Native, but it seems I don't have this use case on the
tester anymore, so I wasn't able to double-check whether the issue
persists. Maybe Joshua can provide some input here.

Something I noticed is that the shaper LUT is the only 1D LUT in the
DCN30 pipeline that uses cm_helper_translate_curve_to_hw_format(); all
others (dpp-degamma/dpp-blend/mpc-regamma) call
cm3_helper_translate_curve_*.

We can drop it from this series until we get the steps to report the
issue properly.

Melissa

>
> Harry
>
> > Signed-off-by: Harry Wentland
> > Signed-off-by: Melissa Wen
> > ---
> >  .../amd/display/dc/dcn10/dcn10_cm_common.c    | 93 +++++++++++++++----
> >  1 file changed, 75 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
> > index 3538973bd0c6..04b2e04b68f3 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
> > @@ -349,20 +349,37 @@ bool cm_helper_translate_curve_to_hw_format(struct dc_context *ctx,
> >  		 * segment is from 2^-10 to 2^1
> >  		 * There are less than 256 points, for optimization
> >  		 */
> > -		seg_distr[0] = 3;
> > -		seg_distr[1] = 4;
> > -		seg_distr[2] = 4;
> > -		seg_distr[3] = 4;
> > -		seg_distr[4] = 4;
> > -		seg_distr[5] = 4;
> > -		seg_distr[6] = 4;
> > -		seg_distr[7] = 4;
> > -		seg_distr[8] = 4;
> > -		seg_distr[9] = 4;
> > -		seg_distr[10] = 1;
> > +		if (output_tf->tf == TRANSFER_FUNCTION_LINEAR) {
> > +			seg_distr[0] = 0; /* 2 */
> > +			seg_distr[1] = 1; /* 4 */
> > +			seg_distr[2] = 2; /* 4 */
> > +			seg_distr[3] = 3; /* 8 */
> > +			seg_distr[4] = 4; /* 16 */
> > +			seg_distr[5] = 5; /* 32 */
> > +			seg_distr[6] = 6; /* 64 */
> > +			seg_distr[7] = 7; /* 128 */
> > +
> > +			region_start = -8;
> > +			region_end = 1;
> > +		} else {
> > +			seg_distr[0] = 3; /* 8 */
> > +			seg_distr[1] = 4; /* 16 */
> > +			seg_distr[2] = 4;
> > +			seg_distr[3] = 4;
> > +			seg_distr[4] = 4;
> > +			seg_distr[5] = 4;
> > +			seg_distr[6] = 4;
> > +			seg_distr[7] = 4;
> > +			seg_distr[8] = 4;
> > +			seg_distr[9] = 4;
> > +			seg_distr[10] = 1; /* 2 */
> > +			/* total = 8*16 + 8 + 64 + 2 = */
> > +
> > +			region_start = -10;
> > +			region_end = 1;
> > +		}
> > +
> >
> > -		region_start = -10;
> > -		region_end = 1;
> >  	}
> >
> >  	for (i = region_end - region_start; i < MAX_REGIONS_NUMBER ; i++)
> > @@ -375,16 +392,56 @@ bool cm_helper_translate_curve_to_hw_format(struct dc_context *ctx,
> >
> >  	j = 0;
> >  	for (k = 0; k < (region_end - region_start); k++) {
> > -		increment = NUMBER_SW_SEGMENTS / (1 << seg_distr[k]);
> > +		/*
> > +		 * We're using an ugly-ish hack here. Our HW allows for
> > +		 * 256 segments per region but SW_SEGMENTS is 16.
> > +		 * SW_SEGMENTS has some undocumented relationship to
> > +		 * the number of points in the tf_pts struct, which
> > +		 * is 512, unlike what's suggested TRANSFER_FUNC_POINTS.
> > +		 *
> > +		 * In order to work past this dilemma we'll scale our
> > +		 * increment by (1 << 4) and then do the inverse (1 >> 4)
> > +		 * when accessing the elements in tf_pts.
> > +		 *
> > +		 * TODO: find a better way using SW_SEGMENTS and
> > +		 *       TRANSFER_FUNC_POINTS definitions
> > +		 */
> > +		increment = (NUMBER_SW_SEGMENTS << 4) / (1 << seg_distr[k]);
> >  		start_index = (region_start + k + MAX_LOW_POINT) *
> >  		    NUMBER_SW_SEGMENTS;
> > -		for (i = start_index; i < start_index + NUMBER_SW_SEGMENTS;
> > +		for (i = (start_index << 4); i < (start_index << 4) + (NUMBER_SW_SEGMENTS << 4);
> >  		    i += increment) {
> > +			struct fixed31_32 in_plus_one, in;
> > +			struct fixed31_32 value, red_value, green_value, blue_value;
> > +			uint32_t t = i & 0xf;
> > +
> >  			if (j == hw_points - 1)
> >  				break;
> > -			rgb_resulted[j].red = output_tf->tf_pts.red[i];
> > -			rgb_resulted[j].green = output_tf->tf_pts.green[i];
> > -			rgb_resulted[j].blue = output_tf->tf_pts.blue[i];
> > +
> > +			in_plus_one = output_tf->tf_pts.red[(i >> 4) + 1];
> > +			in = output_tf->tf_pts.red[i >> 4];
> > +			value = dc_fixpt_sub(in_plus_one, in);
> > +			value = dc_fixpt_shr(dc_fixpt_mul_int(value, t), 4);
> > +			value = dc_fixpt_add(in, value);
> > +			red_value = value;
> > +
> > +			in_plus_one = output_tf->tf_pts.green[(i >> 4) + 1];
> > +			in = output_tf->tf_pts.green[i >> 4];
> > +			value = dc_fixpt_sub(in_plus_one, in);
> > +			value = dc_fixpt_shr(dc_fixpt_mul_int(value, t), 4);
> > +			value = dc_fixpt_add(in, value);
> > +			green_value = value;
> > +
> > +			in_plus_one = output_tf->tf_pts.blue[(i >> 4) + 1];
> > +			in = output_tf->tf_pts.blue[i >> 4];
> > +			value = dc_fixpt_sub(in_plus_one, in);
> > +			value = dc_fixpt_shr(dc_fixpt_mul_int(value, t), 4);
> > +			value = dc_fixpt_add(in, value);
> > +			blue_value = value;
> > +
> > +			rgb_resulted[j].red = red_value;
> > +			rgb_resulted[j].green = green_value;
> > +			rgb_resulted[j].blue = blue_value;
> >  			j++;
> >  		}
> >  	}
>
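
For anyone following the arithmetic in the quoted hunk, here is a
minimal standalone sketch of the two ideas, not the driver code itself.
It assumes that region k contributes (1 << seg_distr[k]) points, which
is what the increment computation in the loop implies. The names
count_hw_points() and lerp_scaled_index() are hypothetical, a plain
int64_t stands in for struct fixed31_32, and a division by 16 stands in
for dc_fixpt_shr(..., 4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: total number of points implied by a segment
 * distribution, assuming region k contributes (1 << seg_distr[k]) points.
 */
static unsigned int count_hw_points(const int *seg_distr, size_t regions)
{
	unsigned int total = 0;
	size_t k;

	for (k = 0; k < regions; k++) {
		/* Keep the shift well defined for a 32-bit unsigned int. */
		assert(seg_distr[k] >= 0 && seg_distr[k] < 31);
		total += 1u << seg_distr[k];
	}
	return total;
}

/*
 * Hypothetical stand-in for the per-channel interpolation in the patch.
 * The index i is scaled by 16: the upper bits (i >> 4) select the entry
 * in pts, and the low 4 bits give the fractional position t/16 between
 * that entry and the next one. pts must have at least (i >> 4) + 2 entries.
 */
static int64_t lerp_scaled_index(const int64_t *pts, uint32_t i)
{
	uint32_t t = i & 0xf;
	int64_t in = pts[i >> 4];
	int64_t in_plus_one = pts[(i >> 4) + 1];

	/* in + (in_plus_one - in) * t / 16 */
	return in + ((in_plus_one - in) * (int64_t)t) / 16;
}
```

With the linear distribution from the patch (seg_distr = 0..7),
count_hw_points() returns 255, which stays within the 256 HW-supported
points mentioned in the commit message. With the existing non-linear
distribution (3, nine 4s, 1) it returns 154.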