From: Rob Clark
Date: Mon, 4 Dec 2023 13:54:00 -0800
Subject: Re: [RFC] drm/scheduler: Unwrap job dependencies
References: <20230322224403.35742-1-robdclark@gmail.com>
To: Christian König
Cc: dri-devel@lists.freedesktop.org, Rob Clark, Luben Tuikov, David Airlie,
    Daniel Vetter, Sumit Semwal, open list,
    "open list:DMA BUFFER SHARING FRAMEWORK",
    "moderated list:DMA BUFFER SHARING FRAMEWORK"

On Thu, Mar 23, 2023 at 2:30 PM Rob Clark wrote:
>
> On Thu, Mar 23, 2023 at 7:03 AM Christian König wrote:
> >
> > Am 23.03.23 um 14:54 schrieb Rob Clark:
> > > On Thu, Mar 23, 2023 at 12:35 AM Christian König wrote:
> > >> Am 22.03.23 um 23:44 schrieb Rob Clark:
> > >>> From: Rob Clark
> > >>>
> > >>> Container fences have burner contexts, which makes the trick to store at
> > >>> most one fence per context somewhat useless if we don't unwrap array or
> > >>> chain fences.
> > >> Mhm, we intentionally kept them not unwrapped since this way they only
> > >> occupy one fence slot.
> > >>
> > >> But it might be better to unwrap them if you add many of those dependencies.
> > >>
> > >>> Signed-off-by: Rob Clark
> > >>> ---
> > >>> tbh, I'm not sure why we weren't doing this already, unless there is
> > >>> something I'm overlooking
> > >>>
> > >>>  drivers/gpu/drm/scheduler/sched_main.c | 43 +++++++++++++++++----------
> > >>>  1 file changed, 28 insertions(+), 15 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > >>> index c2ee44d6224b..f59e5335afbb 100644
> > >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> > >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > >>> @@ -41,20 +41,21 @@
> > >>>   *    4. Entities themselves maintain a queue of jobs that will be scheduled on
> > >>>   *       the hardware.
> > >>>   *
> > >>>   * The jobs in a entity are always scheduled in the order that they were pushed.
> > >>>   */
> > >>>
> > >>>  #include <linux/kthread.h>
> > >>>  #include <linux/wait.h>
> > >>>  #include <linux/sched.h>
> > >>>  #include <linux/completion.h>
> > >>> +#include <linux/dma-fence-unwrap.h>
> > >>>  #include <linux/dma-resv.h>
> > >>>  #include <uapi/linux/sched/types.h>
> > >>>
> > >>>  #include <drm/drm_print.h>
> > >>>  #include <drm/drm_gem.h>
> > >>>  #include <drm/gpu_scheduler.h>
> > >>>  #include <drm/spsc_queue.h>
> > >>>
> > >>>  #define CREATE_TRACE_POINTS
> > >>>  #include "gpu_scheduler_trace.h"
> > >>> @@ -665,41 +666,27 @@ void drm_sched_job_arm(struct drm_sched_job *job)
> > >>>       sched = entity->rq->sched;
> > >>>
> > >>>       job->sched = sched;
> > >>>       job->s_priority = entity->rq - sched->sched_rq;
> > >>>       job->id = atomic64_inc_return(&sched->job_id_count);
> > >>>
> > >>>       drm_sched_fence_init(job->s_fence, job->entity);
> > >>>  }
> > >>>  EXPORT_SYMBOL(drm_sched_job_arm);
> > >>>
> > >>> -/**
> > >>> - * drm_sched_job_add_dependency - adds the fence as a job dependency
> > >>> - * @job: scheduler job to add the dependencies to
> > >>> - * @fence: the dma_fence to add to the list of dependencies.
> > >>> - *
> > >>> - * Note that @fence is consumed in both the success and error cases.
> > >>> - *
> > >>> - * Returns:
> > >>> - * 0 on success, or an error on failing to expand the array.
> > >>> - */
> > >>> -int drm_sched_job_add_dependency(struct drm_sched_job *job,
> > >>> -                                struct dma_fence *fence)
> > >>> +static int _add_dependency(struct drm_sched_job *job, struct dma_fence *fence)
> > >> Please keep the drm_sched_job_ prefix here even for static functions.
> > >> The symbol _add_dependency just sucks in a backtrace, especially when
> > >> it's tail optimized.
> > >>
> > >>>  {
> > >>>       struct dma_fence *entry;
> > >>>       unsigned long index;
> > >>>       u32 id = 0;
> > >>>       int ret;
> > >>>
> > >>> -     if (!fence)
> > >>> -             return 0;
> > >>> -
> > >>>       /* Deduplicate if we already depend on a fence from the same context.
> > >>>        * This lets the size of the array of deps scale with the number of
> > >>>        * engines involved, rather than the number of BOs.
> > >>>        */
> > >>>       xa_for_each(&job->dependencies, index, entry) {
> > >>>               if (entry->context != fence->context)
> > >>>                       continue;
> > >>>
> > >>>               if (dma_fence_is_later(fence, entry)) {
> > >>>                       dma_fence_put(entry);
> > >>> @@ -709,20 +696,46 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
> > >>>               }
> > >>>               return 0;
> > >>>       }
> > >>>
> > >>>       ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
> > >>>       if (ret != 0)
> > >>>               dma_fence_put(fence);
> > >>>
> > >>>       return ret;
> > >>>  }
> > >>> +
> > >>> +/**
> > >>> + * drm_sched_job_add_dependency - adds the fence as a job dependency
> > >>> + * @job: scheduler job to add the dependencies to
> > >>> + * @fence: the dma_fence to add to the list of dependencies.
> > >>> + *
> > >>> + * Note that @fence is consumed in both the success and error cases.
> > >>> + *
> > >>> + * Returns:
> > >>> + * 0 on success, or an error on failing to expand the array.
> > >>> + */
> > >>> +int drm_sched_job_add_dependency(struct drm_sched_job *job,
> > >>> +                                struct dma_fence *fence)
> > >> Maybe name the new function drm_sched_job_unwrap_add_dependency or
> > >> something like this.
> > >>
> > >> I need to double check, but I think for some cases we don't need or
> > >> don't even want this in the driver.
> > > I'd be curious to know the cases where you don't want this.. one thing
> > > I was thinking about, what if you have a container fence with two
> > > contained fences. One is on the same ctx as the job, one is not but
> > > signals sooner. You end up artificially waiting on both, which seems
> > > sub-optimal.
> >
> > Well resv objects don't contain other containers for example.
>
> I suppose I have the explicit sync case more in mind, where the
> dependent fence ends up being a chain or array (if userspace is
> merging fence fd's).
>
> > Then we also have an use case in amdgpu where fence need to be
> > explicitly waited for even when they are from the same ctx as the job
> > because otherwise we wouldn't see everything cache coherent.
>
> This was the kinda weird case I wanted to make sure I wasn't breaking.
> I remember seeing something fly by for this, but can't find it now or
> remember what amdgpu's solution was..
>
> > On the other hand we currently handle that amdgpu use case differently
> > and the extra overhead of unwrapping fences even if they can't be
> > containers is probably negligible.
> >
> > > Anyways, I can make this a new entrypoint which unwraps, and/or rename
> > > the internal static function, if we think this is a good idea.
> >
> > If you think that's unnecessary keeping your original approach is fine
> > with me as well.
> >
> I'm going to assume unnecessary until someone speaks up with their
> weird specific case ;-)

So, this patch turns out to blow up spectacularly with dma_fence
refcnt underflows when I enable DRIVER_SYNCOBJ_TIMELINE .. I think,
because it starts unwrapping fence chains, possibly in parallel with
fence signaling on the retire path.

Is it supposed to be permissible to unwrap a fence chain concurrently?

BR,
-R

> BR,
> -R
>
> > Regards,
> > Christian.
> >
> > >
> > > BR,
> > > -R
> > >
> > >> Christian.
> > >>
> > >>> +{
> > >>> +     struct dma_fence_unwrap iter;
> > >>> +     struct dma_fence *f;
> > >>> +     int ret = 0;
> > >>> +
> > >>> +     dma_fence_unwrap_for_each (f, &iter, fence) {
> > >>> +             ret = _add_dependency(job, f);
> > >>> +             if (ret)
> > >>> +                     break;
> > >>> +     }
> > >>> +
> > >>> +     return ret;
> > >>> +}
> > >>>  EXPORT_SYMBOL(drm_sched_job_add_dependency);
> > >>>
> > >>>  /**
> > >>>   * drm_sched_job_add_resv_dependencies - add all fences from the resv to the job
> > >>>   * @job: scheduler job to add the dependencies to
> > >>>   * @resv: the dma_resv object to get the fences from
> > >>>   * @usage: the dma_resv_usage to use to filter the fences
> > >>>   *
> > >>>   * This adds all fences matching the given usage from @resv to @job.
> > >>>   * Must be called with the @resv lock held.
> >
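[Editor's illustration, not part of the thread] As a minimal sketch of the unwrapping idea being discussed, a driver could also do the unwrap on its own side of the existing drm_sched_job_add_dependency() interface rather than changing the helper itself. The function name example_add_in_fence() is hypothetical; the sketch assumes the documented convention that drm_sched_job_add_dependency() consumes one fence reference in both the success and error cases, which is why an extra reference is taken per unwrapped leaf:

#include <linux/dma-fence.h>
#include <linux/dma-fence-unwrap.h>
#include <drm/gpu_scheduler.h>

/*
 * Hypothetical helper: add every leaf fence inside a possibly merged
 * in-fence (dma_fence_array/dma_fence_chain) as a separate dependency.
 * Consumes the caller's reference on @in_fence, mirroring the
 * drm_sched_job_add_dependency() convention.
 */
static int example_add_in_fence(struct drm_sched_job *job,
				struct dma_fence *in_fence)
{
	struct dma_fence_unwrap iter;
	struct dma_fence *f;
	int ret = 0;

	dma_fence_unwrap_for_each(f, &iter, in_fence) {
		/* add_dependency() eats a reference, so take one per leaf */
		ret = drm_sched_job_add_dependency(job, dma_fence_get(f));
		if (ret)
			break;
	}

	/* drop the caller's reference on the container itself */
	dma_fence_put(in_fence);
	return ret;
}

Whether such unwrapping is safe against a chain being signaled concurrently on the retire path is exactly the open question raised in the mail above; this sketch does not attempt to answer it.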