Message-ID: <9e886927-df5a-e264-8d8d-c83045bac732@gmail.com>
Date: Mon, 19 Dec 2022 09:37:44 +0100
Subject: Re: [PATCH v6] drm: Optimise for continuous memory allocation
From: Christian König
To: xinhui pan, amd-gfx@lists.freedesktop.org
Cc: arunpravin.paneerselvam@amd.com, intel-gfx@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
    matthew.auld@intel.com, daniel@ffwll.ch, christian.koenig@amd.com
References: <20221218065708.93332-1-xinhui.pan@amd.com>
In-Reply-To: <20221218065708.93332-1-xinhui.pan@amd.com>

On 18.12.22 07:57, xinhui pan wrote:
> Optimise a little when continuous memory request fails.
>
> There are memory holes and continuous memory request usually fails when
> order is too big.
> Currently buddy only look for exactly order memory for such request.
> Now we can try again to look for several smaller continuous memory on
> failure.

I'm still pretty sure that this is illegal. The order is not only the
minimum size we need for a linear allocation, but also the minimum
alignment we need.

So if you look at a block combination like 010 when searching for an
order-2 allocation, you satisfy the contiguity constraint but not the
alignment constraint, and that's illegal.

On top of that, this adds a huge additional CPU overhead for contiguous
allocations.

Regards,
Christian.

>
> Signed-off-by: xinhui pan
> ---
> change from v5:
> reworked
> ---
>  drivers/gpu/drm/drm_buddy.c | 161 ++++++++++++++++++++++++++++++++++--
>  1 file changed, 154 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index 11bb59399471..6c795e1b3247 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -386,6 +386,140 @@ alloc_range_bias(struct drm_buddy *mm,
>  	return ERR_PTR(err);
>  }
>
> +static void __continuous_block_in_tree(struct drm_buddy_block *top_block,
> +				       struct list_head *fbl,
> +				       int left,
> +				       int min_order)
> +{
> +	/*
> +	 * Look for continuous memory of
> +	 * [top_block) when left is true or (top_block] when left is false.
> +	 * The list of fbl looks like (top_block1][free_block][...][top_blockX).
> +	 * Memory offset is in ascending order.
> +	 */
> +	while (top_block) {
> +		struct drm_buddy_block *block = top_block;
> +		int order;
> +
> +		while (drm_buddy_block_is_split(block))
> +			block = left ? block->left : block->right;
> +
> +		order = drm_buddy_block_order(block);
> +		if (order < min_order || !drm_buddy_block_is_free(block))
> +			return;
> +
> +		if (left)
> +			list_add_tail(&block->tmp_link, fbl);
> +		else
> +			list_add(&block->tmp_link, fbl);
> +
> +		if (order == min_order)
> +			return;
> +
> +		top_block = __get_buddy(block);
> +	}
> +}
> +
> +static bool __free_block_in_order(struct list_head *fbl,
> +				  struct drm_buddy_block *cur,
> +				  int order,
> +				  struct drm_buddy_block **first,
> +				  struct drm_buddy_block **last)
> +{
> +	struct drm_buddy_block *fb = cur, *lb = list_next_entry(cur, tmp_link);
> +	u64 pages = BIT(order);
> +	u64 cur_pages = 0;
> +
> +	/*
> +	 * Look for continuous memory which satisfies the requested order.
> +	 * Memory in the list fbl is already ordered as below.
> +	 * 1) Memory offsets are in ascending order.
> +	 * 2) Memory sizes are in ascending order from left to middle and
> +	 *    in descending order from middle to right.
> +	 * So walk through the list fbl from the middle to both sides to
> +	 * choose the bigger memory.
> +	 * This is because one memory of order X is composed of 2 of order X-1,
> +	 * or 1 of order X-1 and 2 of order X-2, etc. It looks like below:
> +	 *     n
> +	 *   { ∑ (X - y) } + { 2 * (X-n-1) }
> +	 *    y=1
> +	 * And the last 2 memories of order (X-n-1) are at the two sides of the list.
> +	 */
> +	list_for_each_entry_from_reverse(fb, fbl, tmp_link) {
> +		int prev_order = drm_buddy_block_order(fb);
> +
> +		list_for_each_entry_from(lb, fbl, tmp_link) {
> +			int next_order = drm_buddy_block_order(lb);
> +
> +			if (prev_order <= next_order)
> +				cur_pages += BIT(next_order);
> +			else
> +				break;
> +		}
> +
> +		cur_pages += BIT(prev_order);
> +		if (pages == cur_pages) {
> +			*first = fb;
> +			*last = list_prev_entry(lb, tmp_link);
> +			return true;
> +		}
> +		BUG_ON(pages < cur_pages);
> +	}
> +
> +	*first = *last = NULL;
> +	return false;
> +}
> +
> +static struct drm_buddy_block *
> +find_continuous_blocks(struct drm_buddy *mm,
> +		       int order,
> +		       unsigned long flags,
> +		       struct drm_buddy_block **lb)
> +{
> +	struct list_head *head = &mm->free_list[order - 1];
> +	struct drm_buddy_block *free_block, *first = NULL, *last = NULL;
> +
> +	/*
> +	 * Look for continuous free memory in buddy and buddy-in-law.
> +	 * IOW, the leftmost blocks to the right of the free block and the
> +	 * rightmost blocks to the left of the free block.
> +	 */
> +	list_for_each_entry(free_block, head, link) {
> +		struct drm_buddy_block *buddy, *parent, *block;
> +		int left, min_order = 0;
> +		LIST_HEAD(fbl);
> +
> +		parent = free_block->parent;
> +		if (!parent)
> +			continue;
> +
> +		left = parent->left == free_block;
> +		list_add(&free_block->tmp_link, &fbl);
> +		buddy = __get_buddy(free_block);
> +		__continuous_block_in_tree(buddy, &fbl, left, min_order);
> +
> +		while (parent && !((parent->left == block) ^ left)) {
> +			block = parent;
> +			parent = parent->parent;
> +		}
> +
> +		if (!parent)
> +			continue;
> +
> +		buddy = __get_buddy(block);
> +		__continuous_block_in_tree(buddy, &fbl, !left, min_order);
> +
> +		/* The list head of fbl is invalid outside.
> +		 * Walk through the list from first to last only.
> +		 */
> +		if (__free_block_in_order(&fbl, free_block, order, &first, &last))
> +			break;
> +	}
> +
> +	*lb = last;
> +	return first;
> +}
> +
>  static struct drm_buddy_block *
>  get_maxblock(struct list_head *head)
>  {
> @@ -637,7 +771,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>  			   struct list_head *blocks,
>  			   unsigned long flags)
>  {
> -	struct drm_buddy_block *block = NULL;
> +	struct drm_buddy_block *block = NULL, *last_block = NULL;
>  	unsigned int min_order, order;
>  	unsigned long pages;
>  	LIST_HEAD(allocated);
> @@ -689,17 +823,30 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
>  				break;
>
>  			if (order-- == min_order) {
> +				if (!(flags & DRM_BUDDY_RANGE_ALLOCATION) &&
> +				    min_order != 0 && pages == BIT(min_order)) {
> +					block = find_continuous_blocks(mm,
> +								       min_order,
> +								       flags,
> +								       &last_block);
> +					if (block)
> +						break;
> +				}
>  				err = -ENOSPC;
>  				goto err_free;
>  			}
>  		} while (1);
>
> -		mark_allocated(block);
> -		mm->avail -= drm_buddy_block_size(mm, block);
> -		kmemleak_update_trace(block);
> -		list_add_tail(&block->link, &allocated);
> -
> -		pages -= BIT(order);
> +		do {
> +			mark_allocated(block);
> +			mm->avail -= drm_buddy_block_size(mm, block);
> +			kmemleak_update_trace(block);
> +			list_add_tail(&block->link, &allocated);
> +			pages -= BIT(drm_buddy_block_order(block));
> +			if (block == last_block || !last_block)
> +				break;
> +			block = list_next_entry(block, tmp_link);
> +		} while (block);
>
>  		if (!pages)
>  			break;
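
To make the alignment point above concrete: in a buddy allocator an order-n block
is not just 1 << n contiguous pages, it must also start at an offset that is a
multiple of 1 << n pages. The following standalone user-space sketch is not
drm_buddy code; the struct and helper names (toy_block, toy_block_is_legal) are
made up purely for illustration of that invariant:

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a buddy block; offsets and sizes are in pages. */
struct toy_block {
	uint64_t offset;     /* start offset in pages */
	unsigned int order;  /* block spans 1 << order pages */
};

/*
 * Buddy invariant: a block of a given order must start at an offset
 * that is aligned to its own size.
 */
static bool toy_block_is_legal(const struct toy_block *b)
{
	return (b->offset & ((1ULL << b->order) - 1)) == 0;
}

int main(void)
{
	/*
	 * Pages 1..4 are free and contiguous (a "010"-like hole pattern),
	 * but treating them as one order-2 block would place a 4-page
	 * block at offset 1, which is not 4-page aligned.
	 */
	struct toy_block illegal = { .offset = 1, .order = 2 };
	struct toy_block legal   = { .offset = 4, .order = 2 };

	printf("offset 1, order 2: %s\n",
	       toy_block_is_legal(&illegal) ? "legal" : "illegal");
	printf("offset 4, order 2: %s\n",
	       toy_block_is_legal(&legal) ? "legal" : "illegal");

	/* Splitting preserves the invariant: both order-1 halves of an
	 * aligned order-2 block are themselves 2-page aligned. */
	struct toy_block lo = { .offset = 4, .order = 1 };
	struct toy_block hi = { .offset = 6, .order = 1 };
	assert(toy_block_is_legal(&lo) && toy_block_is_legal(&hi));

	return 0;
}

In other words, stitching together several smaller free neighbours can satisfy
the size of an order-n request while still violating its alignment, which is
the objection raised in the reply above.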