Message-ID: <37a48008-0700-7c60-30d6-ac2f4152114c@gmail.com>
Date: Fri, 29 Apr 2022 11:03:26 +0800
Subject: Re: [PATCH] gpu: drm: remove redundant dma_fence_put() when drm_sched_job_add_dependency() fails
To: Andrey Grodzovsky, yuq825@gmail.com, airlied@linux.ie, daniel@ffwll.ch
Cc: dri-devel@lists.freedesktop.org, lima@lists.freedesktop.org, linux-kernel@vger.kernel.org
References: <20220425083645.25922-1-hbh25y@gmail.com> <79b198d0-eff2-d658-4b5e-9084a834fc93@gmail.com> <88dd5d67-7dd5-2f58-5254-adaa941deb0f@gmail.com> <65b6cc23-1a77-7df0-5768-f81cd03b6514@amd.com> <9b91e06c-50fd-b567-7d2b-a2597d01c4dc@amd.com>
From: Hangyu Hua
In-Reply-To: <9b91e06c-50fd-b567-7d2b-a2597d01c4dc@amd.com>

On 2022/4/28 23:27, Andrey Grodzovsky wrote:
>
> On 2022-04-28 04:56, Hangyu Hua wrote:
>> On 2022/4/27 22:43, Andrey Grodzovsky wrote:
>>>
>>> On 2022-04-26 22:31, Hangyu Hua wrote:
>>>> On 2022/4/26 22:55, Andrey Grodzovsky wrote:
>>>>>
>>>>> On 2022-04-25 22:54, Hangyu Hua wrote:
>>>>>> On 2022/4/25 23:42, Andrey Grodzovsky wrote:
>>>>>>> On 2022-04-25 04:36, Hangyu Hua wrote:
>>>>>>>
>>>>>>>> When drm_sched_job_add_dependency() fails, dma_fence_put() will be
>>>>>>>> called internally. Calling it again after
>>>>>>>> drm_sched_job_add_dependency() finishes may result in a dangling
>>>>>>>> pointer.
>>>>>>>>
>>>>>>>> Fix this by removing redundant dma_fence_put().
>>>>>>>>
>>>>>>>> Signed-off-by: Hangyu Hua
>>>>>>>> ---
>>>>>>>>   drivers/gpu/drm/lima/lima_gem.c        | 1 -
>>>>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 1 -
>>>>>>>>   2 files changed, 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
>>>>>>>> index 55bb1ec3c4f7..99c8e7f6bb1c 100644
>>>>>>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>>>>>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>>>>>>> @@ -291,7 +291,6 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
>>>>>>>>           err = drm_sched_job_add_dependency(&submit->task->base, fence);
>>>>>>>>           if (err) {
>>>>>>>> -            dma_fence_put(fence);
>>>>>>>>               return err;
>>>>>>>
>>>>>>>
>>>>>>> Makes sense here
>>>>>>>
>>>>>>>
>>>>>>>>           }
>>>>>>>>       }
>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> index b81fceb0b8a2..ebab9eca37a8 100644
>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> @@ -708,7 +708,6 @@ int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>>>>>           dma_fence_get(fence);
>>>>>>>>           ret = drm_sched_job_add_dependency(job, fence);
>>>>>>>>           if (ret) {
>>>>>>>> -            dma_fence_put(fence);
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Not sure about this one, since if you look at the relevant commits -
>>>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies' and
>>>>>>> 'drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder' -
>>>>>>> you will see that the dma_fence_put here balances the extra
>>>>>>> dma_fence_get above.
>>>>>>>
>>>>>>> Andrey
>>>>>>>
>>>>>>
>>>>>> I don't think so. I checked the call chain and found no additional
>>>>>> dma_fence_get().
>>>>>> But dma_fence_get() needs to be called before
>>>>>> drm_sched_job_add_dependency() to keep the counter balanced.
>>>>>
>>>>>
>>>>> I don't say there is an additional get, I just say that
>>>>> drm_sched_job_add_dependency doesn't grab an extra reference to the
>>>>> fences it stores, so this needs to be done outside, and for that
>>>>> drm_sched_job_add_implicit_dependencies->dma_fence_get is called
>>>>> and, if this addition fails, you just call dma_fence_put to keep the
>>>>> counter balanced.
>>>>>
>>>>
>>>> drm_sched_job_add_implicit_dependencies() will call
>>>> drm_sched_job_add_dependency(). And drm_sched_job_add_dependency()
>>>> already calls dma_fence_put() when it fails. Calling dma_fence_put()
>>>> twice doesn't make sense.
>>>>
>>>> dma_fence_get() is in [2]. But dma_fence_put() will be called in [1]
>>>> and [3] when xa_alloc() fails.
>>>
>>>
>>> The way I see it, [2] and [3] are matching *get* and *put*
>>> respectively. The [1] *put* is against the original
>>> dma_fence_init->kref_init of the fence, which always sets the refcount
>>> to 1.
>>> Also in support of this see commit 'drm/scheduler: fix
>>> drm_sched_job_add_implicit_dependencies harder' - it says there
>>> "drm_sched_job_add_dependency() could drop the last ref" - this last
>>> ref is the original refcount set by dma_fence_init->kref.
>>>
>>> Andrey
>>
>>
>> You can see that drm_sched_job_add_dependency() has three return paths:
>> [4], [5] and [1]. [4] and [5] will return 0. [1] will return an error.
>>
>> There will be three weird problems if you're right:
>>
>> 1. The [5] path will trigger a refcount leak, because ret is 0 in the *if* [6].
>
>
> Terminology confusion issue - [5] is a 'put' so it cannot cause a leak
> by definition; an extra unbalanced 'get' will cause a leak because memory
> is never released, while an extra put will just probably cause a warning in
> kref_put or maybe a double free.
>
>
>> Otherwise [2] and [5] are the matching *get* and *put* here.
>
>
> Exactly, they are matching - so until this point all good and no 'leak'
> then, no?
>

In fact, I just want to prove that [2] and [3] are not a matching pair
when the path goes through [4] or [5]. It's less likely when the path is
[1]. But it doesn't matter, please see my explanation below.

>
>>
>> 2. The [4] path needs an additional dma_fence_get() to add the fence as a
>> job dependency. fence is from obj->resv. Taking msm as an example,
>> obj->resv is from etnaviv_ioctl_gem_submit()->submit_lookup_objects().
>> It is not possible that an object has *refcount == 1* but is
>> referenced in two places. So the dma_fence_get() called in [2] is for [4].
>> By the way, [3] doesn't execute in this case.
>
>
> Still don't see the problem - [2] is the additional dma_fence_get() you
> need here (just as you say above).
>
>
>>
>> 3. This one is a doubt. You can see in "[PATCH] drm/scheduler: fix
>> drm_sched_job_add_implicit_dependencies harder" that
>> drm_sched_job_add_dependency() could drop the last ref, so we need to do
>> the dma_fence_get() first. But the last ref will still be dropped in [3]
>> if drm_sched_job_add_dependency() goes down path [1]. And there is only a
>> *return* between [1] and [3]. Is this necessary? I think Rob Clark
>> wants to avoid the last ref being dropped in
>> drm_sched_job_add_implicit_dependencies() because fence is still used
>> by obj->resv.
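The ownership behaviour I am assuming can be summarised in a tiny
stand-alone mock (plain user-space C, not kernel code; toy_fence,
toy_add_dependency and the three flags are invented for this sketch and
only stand in for the real conditions). It only restates what the listing
further down already shows: every return path of the real function either
keeps the reference handed in or drops it internally. The disagreement is
about which conceptual reference that internal put consumes, and the mock
does not settle that.

#include <errno.h>
#include <stdbool.h>

struct toy_fence { int refcount; };

static void toy_get(struct toy_fence *f) { f->refcount++; }
static void toy_put(struct toy_fence *f) { f->refcount--; }

/* Mirrors the shape of drm_sched_job_add_dependency(): on every return
 * path the reference handed in by the caller is either kept by the job
 * or dropped inside the function. */
static int toy_add_dependency(struct toy_fence *fence, bool same_context,
			      bool is_later, bool alloc_fails)
{
	if (same_context) {
		if (is_later)
			return 0;	/* [4]: stored over the old entry, job keeps the ref */
		toy_put(fence);		/* [5]: older than the stored entry, ref dropped here */
		return 0;
	}
	if (alloc_fails) {
		toy_put(fence);		/* [1]: xa_alloc() failed, ref dropped here too */
		return -ENOMEM;
	}
	return 0;			/* new entry allocated, job keeps the ref */
}

int main(void)
{
	struct toy_fence fence = { .refcount = 1 };	/* ref held elsewhere, e.g. by obj->resv */

	toy_get(&fence);				/* the caller's extra get, i.e. [2] */
	toy_add_dependency(&fence, false, false, true);	/* force the failing path [1] */

	/* At this point the refcount is back to 1: one more toy_put()
	 * here (the disputed [3]) would drop it to 0. */
	return fence.refcount == 1 ? 0 : 1;
}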
>
>
> In the scenario above - if we go through path [1], the refcount before [1]
> starts is 2 - one from the original kref_init and one from [2] - and so it's
> balanced against 2 puts (one from [1] and one from [3]), so I still don't
> see a problem.
>

We can't directly drop the last refcount and release the fence in
drm_sched_job_add_implicit_dependencies(). fence is from obj->resv.
Taking msm as an example, obj->resv is from
msm_ioctl_gem_submit()->submit_lookup_objects().

static int submit_lookup_objects(struct msm_gem_submit *submit,
		struct drm_msm_gem_submit *args, struct drm_file *file)
{
	...
	for (i = 0; i < args->nr_bos; i++) {
		struct drm_gem_object *obj;

		/* normally use drm_gem_object_lookup(), but for bulk lookup
		 * all under single table_lock just hit object_idr directly:
		 */
		obj = idr_find(&file->object_idr, submit->bos[i].handle); <---- we find obj here by a user-controllable handle
		if (!obj) {
			DRM_ERROR("invalid handle %u at index %u\n",
				  submit->bos[i].handle, i);
			ret = -EINVAL;
			goto out_unlock;
		}

		drm_gem_object_get(obj);

		submit->bos[i].obj = to_msm_bo(obj); <---- we store it
	}
	...
}

Taking msm as an example, the path to call
drm_sched_job_add_implicit_dependencies() is
msm_ioctl_gem_submit()->submit_fence_sync().

static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
{
	int i, ret = 0;

	for (i = 0; i < submit->nr_bos; i++) {
		struct drm_gem_object *obj = &submit->bos[i].obj->base; <---- get the obj
		...
		ret = drm_sched_job_add_implicit_dependencies(&submit->base,
							      obj, write);
		if (ret)
			break;
	}

	return ret;
}

If the fence is released in drm_sched_job_add_implicit_dependencies(), a
dangling pointer will be left in obj->resv.

Specific scenario:

refcount = 1  init, obj->resv->fence_excl = fence
refcount = 1  before drm_sched_job_add_implicit_dependencies
refcount = 2  in [2]
refcount = 1  in [1]
refcount = 0  in [3] <--- fence released, but the fence is still in obj->resv

Thanks,
Hangyu

> I suggest that you give a specific scenario from a fence ref-count
> perspective that your patch fixes. I might be wrong, but unless you give
> a specific case where the 'put' in [3] is redundant I just can't see it.
>
> Andrey
>
>
>>
>>
>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>                                  struct dma_fence *fence)
>> {
>>         ...
>>         xa_for_each(&job->dependencies, index, entry) {
>>                 if (entry->context != fence->context)
>>                         continue;
>>
>>                 if (dma_fence_is_later(fence, entry)) {
>>                         dma_fence_put(entry);
>>                         xa_store(&job->dependencies, index, fence, GFP_KERNEL);    <---- [4]
>>                 } else {
>>                         dma_fence_put(fence);    <---- [5]
>>                 }
>>                 return 0;
>>         }
>>
>>         ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
>>         if (ret != 0)
>>                 dma_fence_put(fence);   <---- [1]
>>
>>         return ret;
>> }
>>
>>
>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>                                             struct drm_gem_object *obj,
>>                                             bool write)
>> {
>>         struct dma_resv_iter cursor;
>>         struct dma_fence *fence;
>>         int ret;
>>
>>         dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>                 /* Make sure to grab an additional ref on the added fence */
>>                 dma_fence_get(fence);   <---- [2]
>>                 ret = drm_sched_job_add_dependency(job, fence);
>>                 if (ret) {      <---- [6]
>>                         dma_fence_put(fence);   <---- [3]
>>
>>                         return ret;
>>                 }
>>         }
>>         return 0;
>> }
>>
>> Thanks,
>> Hangyu
>>
>>>
>>>>
>>>>
>>>> int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>>>                  struct dma_fence *fence)
>>>> {
>>>>     ...
>>>>     ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
>>>>     if (ret != 0)
>>>>         dma_fence_put(fence);    <--- [1]
>>>>
>>>>     return ret;
>>>> }
>>>> EXPORT_SYMBOL(drm_sched_job_add_dependency);
>>>>
>>>>
>>>> int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
>>>>                         struct drm_gem_object *obj,
>>>>                         bool write)
>>>> {
>>>>     struct dma_resv_iter cursor;
>>>>     struct dma_fence *fence;
>>>>     int ret;
>>>>
>>>>     dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
>>>>         /* Make sure to grab an additional ref on the added fence */
>>>>         dma_fence_get(fence);    <--- [2]
>>>>         ret = drm_sched_job_add_dependency(job, fence);
>>>>         if (ret) {
>>>>             dma_fence_put(fence);    <--- [3]
>>>>             return ret;
>>>>         }
>>>>     }
>>>>     return 0;
>>>> }
>>>>
>>>>
>>>>>
>>>>>> On the other hand, dma_fence_get() and dma_fence_put() are
>>>>>> meaningless here if there is an extra dma_fence_get(), because the
>>>>>> counter will not decrease to 0 during drm_sched_job_add_dependency().
>>>>>>
>>>>>> I checked the call chain as follows:
>>>>>>
>>>>>> msm_ioctl_gem_submit()
>>>>>> -> submit_fence_sync()
>>>>>> -> drm_sched_job_add_implicit_dependencies()
>>>>>
>>>>>
>>>>> Can you maybe trace or print one such example of a problematic
>>>>> refcount that you are trying to fix? I still don't see where the
>>>>> problem is.
>>>>>
>>>>> Andrey
>>>>>
>>>>
>>>> I also wish I could. System logs would make this easy, but I don't
>>>> have a corresponding GPU physical device.
>>>> drm_sched_job_add_implicit_dependencies() is only used by a few drivers.
>>>>
>>>> Thanks.
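Since I can't produce a trace on real hardware, here is a small user-space
mock instead (all of mock_fence, mock_resv, mock_get and mock_put are
invented for this sketch, it is not kernel code and not a real trace). It
just replays the refcount arithmetic of the failing path under the same
assumption as the scenario above, namely that obj->resv holds the only
other reference to the fence:

#include <stdio.h>
#include <stdlib.h>

struct mock_fence { int refcount; };
struct mock_resv  { struct mock_fence *fence_excl; };

static struct mock_fence *mock_fence_alloc(void)
{
	struct mock_fence *f = calloc(1, sizeof(*f));

	f->refcount = 1;                 /* dma_fence_init(): kref starts at 1 */
	return f;
}

static void mock_get(struct mock_fence *f) { f->refcount++; }

static void mock_put(struct mock_fence *f)
{
	if (--f->refcount == 0)
		free(f);                 /* fence released */
}

int main(void)
{
	struct mock_resv resv;
	struct mock_fence *fence = mock_fence_alloc();

	resv.fence_excl = fence;         /* obj->resv points at the fence, refcount == 1 */
	printf("before add_implicit_dependencies: %d\n", fence->refcount);

	mock_get(fence);                 /* [2] get in add_implicit_dependencies */
	printf("after [2]: %d\n", fence->refcount);

	mock_put(fence);                 /* [1] put inside add_dependency, xa_alloc() failed */
	printf("after [1]: %d\n", fence->refcount);

	mock_put(fence);                 /* [3] put in the caller: refcount reaches 0 and the
					  * fence is freed while resv.fence_excl still points
					  * at it - the dangling pointer described above */
	return 0;
}

This only demonstrates the bookkeeping under that assumption; it does not
by itself settle which conceptual reference the put in [1] consumes.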
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Hangyu
>>>>>>
>>>>>>>
>>>>>>>>               return ret;
>>>>>>>>           }
>>>>>>>>       }