From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Mukul Joshi, Felix Kuehling, Alex Deucher, Sasha Levin,
	christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@gmail.com,
	daniel@ffwll.ch, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.5 25/41] drm/amdkfd: Update cache info reporting for GFX v9.4.3
Date: Sun, 24 Sep 2023 09:15:13 -0400
Message-Id: <20230924131529.1275335-25-sashal@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230924131529.1275335-1-sashal@kernel.org>
References: <20230924131529.1275335-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.5.5
Content-Transfer-Encoding: 8bit

From: Mukul Joshi

[ Upstream commit 0752e66e91fa86fa5481b04b22053363833ffb85 ]

Update cache info reporting in sysfs to report the correct number of
CUs and associated cache
information based on different spatial partitioning modes.

Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/amd/amdkfd/kfd_crat.h     |  4 ++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 82 +++++++++++++----------
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +-
 3 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
index fc719389b5d65..4684711aa695a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
@@ -79,6 +79,10 @@ struct crat_header {
 #define CRAT_SUBTYPE_IOLINK_AFFINITY 5
 #define CRAT_SUBTYPE_MAX 6
 
+/*
+ * Do not change the value of CRAT_SIBLINGMAP_SIZE from 32
+ * as it breaks the ABI.
+ */
 #define CRAT_SIBLINGMAP_SIZE 32
 
 /*
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index ea67a353beb00..5582191022106 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1650,14 +1650,17 @@ static int fill_in_l1_pcache(struct kfd_cache_properties **props_ext,
 static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
 				struct kfd_gpu_cache_info *pcache_info,
 				struct kfd_cu_info *cu_info,
-				int cache_type, unsigned int cu_processor_id)
+				int cache_type, unsigned int cu_processor_id,
+				struct kfd_node *knode)
 {
 	unsigned int cu_sibling_map_mask;
 	int first_active_cu;
-	int i, j, k;
+	int i, j, k, xcc, start, end;
 	struct kfd_cache_properties *pcache = NULL;
 
-	cu_sibling_map_mask = cu_info->cu_bitmap[0][0][0];
+	start = ffs(knode->xcc_mask) - 1;
+	end = start + NUM_XCC(knode->xcc_mask);
+	cu_sibling_map_mask = cu_info->cu_bitmap[start][0][0];
 	cu_sibling_map_mask &=
 		((1 << pcache_info[cache_type].num_cu_shared) - 1);
 	first_active_cu = ffs(cu_sibling_map_mask);
@@ -1692,16 +1695,18 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
 		cu_sibling_map_mask = cu_sibling_map_mask >> (first_active_cu - 1);
 		k = 0;
 
-		for (i = 0; i < cu_info->num_shader_engines; i++) {
-			for (j = 0; j < cu_info->num_shader_arrays_per_engine; j++) {
-				pcache->sibling_map[k] = (uint8_t)(cu_sibling_map_mask & 0xFF);
-				pcache->sibling_map[k+1] = (uint8_t)((cu_sibling_map_mask >> 8) & 0xFF);
-				pcache->sibling_map[k+2] = (uint8_t)((cu_sibling_map_mask >> 16) & 0xFF);
-				pcache->sibling_map[k+3] = (uint8_t)((cu_sibling_map_mask >> 24) & 0xFF);
-				k += 4;
-
-				cu_sibling_map_mask = cu_info->cu_bitmap[0][i % 4][j + i / 4];
-				cu_sibling_map_mask &= ((1 << pcache_info[cache_type].num_cu_shared) - 1);
+		for (xcc = start; xcc < end; xcc++) {
+			for (i = 0; i < cu_info->num_shader_engines; i++) {
+				for (j = 0; j < cu_info->num_shader_arrays_per_engine; j++) {
+					pcache->sibling_map[k] = (uint8_t)(cu_sibling_map_mask & 0xFF);
+					pcache->sibling_map[k+1] = (uint8_t)((cu_sibling_map_mask >> 8) & 0xFF);
+					pcache->sibling_map[k+2] = (uint8_t)((cu_sibling_map_mask >> 16) & 0xFF);
+					pcache->sibling_map[k+3] = (uint8_t)((cu_sibling_map_mask >> 24) & 0xFF);
+					k += 4;
+
+					cu_sibling_map_mask = cu_info->cu_bitmap[xcc][i % 4][j + i / 4];
+					cu_sibling_map_mask &= ((1 << pcache_info[cache_type].num_cu_shared) - 1);
+				}
 			}
 		}
 		pcache->sibling_map_size = k;
@@ -1719,7 +1724,7 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
 static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct kfd_node *kdev)
 {
 	struct kfd_gpu_cache_info *pcache_info = NULL;
-	int i, j, k;
+	int i, j, k, xcc, start, end;
 	int ct = 0;
 	unsigned int cu_processor_id;
 	int ret;
@@ -1753,37 +1758,42 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct
 	 * then it will consider only one CU from
 	 * the shared unit
 	 */
+	start = ffs(kdev->xcc_mask) - 1;
+	end = start + NUM_XCC(kdev->xcc_mask);
+
 	for (ct = 0; ct < num_of_cache_types; ct++) {
 		cu_processor_id = gpu_processor_id;
 		if (pcache_info[ct].cache_level == 1) {
-			for (i = 0; i < pcu_info->num_shader_engines; i++) {
-				for (j = 0; j < pcu_info->num_shader_arrays_per_engine; j++) {
-					for (k = 0; k < pcu_info->num_cu_per_sh; k += pcache_info[ct].num_cu_shared) {
-
-						ret = fill_in_l1_pcache(&props_ext, pcache_info, pcu_info,
-									pcu_info->cu_bitmap[0][i % 4][j + i / 4], ct,
-									cu_processor_id, k);
-
-						if (ret < 0)
-							break;
-
-						if (!ret) {
-							num_of_entries++;
-							list_add_tail(&props_ext->list, &dev->cache_props);
+			for (xcc = start; xcc < end; xcc++) {
+				for (i = 0; i < pcu_info->num_shader_engines; i++) {
+					for (j = 0; j < pcu_info->num_shader_arrays_per_engine; j++) {
+						for (k = 0; k < pcu_info->num_cu_per_sh; k += pcache_info[ct].num_cu_shared) {
+
+							ret = fill_in_l1_pcache(&props_ext, pcache_info, pcu_info,
+										pcu_info->cu_bitmap[xcc][i % 4][j + i / 4], ct,
+										cu_processor_id, k);
+
+							if (ret < 0)
+								break;
+
+							if (!ret) {
+								num_of_entries++;
+								list_add_tail(&props_ext->list, &dev->cache_props);
+							}
+
+							/* Move to next CU block */
+							num_cu_shared = ((k + pcache_info[ct].num_cu_shared) <=
+								pcu_info->num_cu_per_sh) ?
+								pcache_info[ct].num_cu_shared :
+								(pcu_info->num_cu_per_sh - k);
+							cu_processor_id += num_cu_shared;
 						}
-
-						/* Move to next CU block */
-						num_cu_shared = ((k + pcache_info[ct].num_cu_shared) <=
-							pcu_info->num_cu_per_sh) ?
-							pcache_info[ct].num_cu_shared :
-							(pcu_info->num_cu_per_sh - k);
-						cu_processor_id += num_cu_shared;
 					}
 				}
 			}
 		} else {
 			ret = fill_in_l2_l3_pcache(&props_ext, pcache_info,
-					pcu_info, ct, cu_processor_id);
+					pcu_info, ct, cu_processor_id, kdev);
 
 			if (ret < 0)
 				break;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
index cba2cd5ed9d19..46927263e014d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
@@ -86,7 +86,7 @@ struct kfd_mem_properties {
 	struct attribute attr;
 };
 
-#define CACHE_SIBLINGMAP_SIZE 64
+#define CACHE_SIBLINGMAP_SIZE 128
 
 struct kfd_cache_properties {
 	struct list_head list;
-- 
2.40.1
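
For readers who have not followed the GFX v9.4.3 partitioning work: the core of
the patch is replacing the hard-coded cu_bitmap[0] lookup with a walk over every
XCC present in the node's xcc_mask. Below is a minimal user-space sketch (not
kernel code) of that start/end computation; NUM_XCC() here is an illustrative
popcount stand-in for the kernel macro of the same name, and the mask value is
purely hypothetical.

/*
 * Derive the first active XCC and the XCC count from xcc_mask, then visit
 * each partition's CU bitmap instead of only index 0.
 */
#include <stdio.h>

#define NUM_XCC(mask)	__builtin_popcount(mask)	/* stand-in, not the kernel macro */

int main(void)
{
	unsigned int xcc_mask = 0x0f;			/* hypothetical: four XCCs, ids 0..3 */
	int start = __builtin_ffs(xcc_mask) - 1;	/* index of the first active XCC */
	int end = start + NUM_XCC(xcc_mask);		/* one past the last XCC */

	for (int xcc = start; xcc < end; xcc++)
		printf("read cu_bitmap[%d][se][sa]\n", xcc);	/* per-XCC CU bitmap lookup */

	return 0;
}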