Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp53629rwl; Thu, 13 Apr 2023 20:47:06 -0700 (PDT) X-Google-Smtp-Source: AKy350aI2EsriMWaF9wToMCO/LQUb7msnrixeM6uA+6oMSyuf9AT7J8Zq2Tw++xNN8X925GiuYkM X-Received: by 2002:a17:90b:1a90:b0:247:1129:cbcc with SMTP id ng16-20020a17090b1a9000b002471129cbccmr4194315pjb.27.1681444025788; Thu, 13 Apr 2023 20:47:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681444025; cv=none; d=google.com; s=arc-20160816; b=dYc7HikUHZxQ9IrzElrxZc5/N7+5N6u8qXOa4lWmDtl4MNI/zYjQCRu+PwRcvz/CLV z0VUryOLM1ZwnxfLRo7einWSyipdHUy8ivUKnGAn450Z8TX/ya67lQjCHyBQ6MWJ1BX0 LT4cCIVDEu9q5OiCqzvId6CvHceEMNwAXSiW2No4ORMe+c0gr7Ig1NmgO7mX6wk+iHy/ mg6GKWn5vP86+Ytspwj84CfONS/Nwm4IjqhyTsP7QJIMx0A6pgLA5WYFlHIFwKJpDs8M Mfs+mWRhe1H4RY2i6lPWXHLbtGuYBwcuqCKM0YxoZijVag+htAa0YBrtzEULKiuWS9Ss TNIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :content-language:references:cc:to:subject:from:mime-version:date :dkim-signature:message-id; bh=W9bW0zXOdOvJDmsTubWdBZzJB+s+MAqkuEXByAddXCg=; b=lHkoTEYFQ0E96oa0ePnNkK8ekZKWY0wn/VShGqVPWyh5vb5NpHOPP4JGVRDg96jzlt zI+K4P7ry/atMgFrF9xkQDki+TuOFTEwhVmoaa+JC7usr4NUwtfKAu8MD+AH1GBVGVhR FBcxnlTYMgQO5nXZgws0af1TULSCbn7BObh04+T+pyzIEq1hm8B6rLBWzgufIIrTPkO3 1Wky021GvY4smDj6qXrCYI6cErCFU+RdU8ldEjXKxTE8RGHxQXVeoEZTRvXQrLCXmQMF K5PJE9pABnxn4YNhUiE6Uxo8SKxBb5dURwZ5WeElAKZlDdAQsZRCaKlYu5kkxeEAJc5R b5ig== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=vPCE9SWC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id nm3-20020a17090b19c300b00244b61cd1fesi3313931pjb.52.2023.04.13.20.46.51; Thu, 13 Apr 2023 20:47:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=vPCE9SWC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229546AbjDNDlS (ORCPT + 99 others); Thu, 13 Apr 2023 23:41:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229540AbjDNDkr (ORCPT ); Thu, 13 Apr 2023 23:40:47 -0400 Received: from out-18.mta0.migadu.com (out-18.mta0.migadu.com [IPv6:2001:41d0:1004:224b::12]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9185449F5 for ; Thu, 13 Apr 2023 20:40:43 -0700 (PDT) Message-ID: <230b310a-26ef-34f1-4c3b-c2360088ce04@linux.dev> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1681443640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=W9bW0zXOdOvJDmsTubWdBZzJB+s+MAqkuEXByAddXCg=; b=vPCE9SWC22zDhK6VkGzGIi5qxo+uPsqcBEGiAiYRs2UNHolmGGYdT03CwSYPDj6WaMAQUR 1mK2wZKgFSKJzW3M8HCIvqm5HE/7m+aGlsUrt8o6Ev0WV8mpzzFJiQzYlVEQHVlAF/xEoj 0ZeC1D+4JccfckKUuckT1a8tgMAYksE= Date: Fri, 14 Apr 2023 11:40:35 +0800 MIME-Version: 1.0 X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Guoqing Jiang Subject: Re: [PATCH for-next 2/3] RDMA/rtrs: Fix rxe_dealloc_pd warning To: "Zhijian Li (Fujitsu)" , Leon Romanovsky Cc: "haris.iqbal@ionos.com" , "jinpu.wang@ionos.com" , "jgg@ziepe.ca" , "linux-rdma@vger.kernel.org" , "linux-kernel@vger.kernel.org" References: <1681108984-2-1-git-send-email-lizhijian@fujitsu.com> <1681108984-2-3-git-send-email-lizhijian@fujitsu.com> <20230410120809.GN182481@unreal> <0d9c57db-bca3-adb4-71fd-7362e4842917@linux.dev> <85323eb2-cfc7-d1b8-3a75-3fa63dde29db@fujitsu.com> <20230411122651.GV182481@unreal> <8fe62e38-e43d-3d6c-624f-1c8ce5859788@linux.dev> Content-Language: en-US In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/13/23 16:12, Zhijian Li (Fujitsu) wrote: > On 13/04/2023 15:35, Guoqing Jiang wrote: >> Hi, >> >> I take a closer look today. >> >> On 4/12/23 09:15, Zhijian Li (Fujitsu) wrote: >>> On 11/04/2023 20:26, Leon Romanovsky wrote: >>>> On Tue, Apr 11, 2023 at 02:43:46AM +0000, Zhijian Li (Fujitsu) wrote: >>>>> On 10/04/2023 21:10, Guoqing Jiang wrote: >>>>>> On 4/10/23 20:08, Leon Romanovsky wrote: >>>>>>> On Mon, Apr 10, 2023 at 06:43:03AM +0000, Li Zhijian wrote: >>>>>>>> The warning occurs when destroying PD whose reference count is not zero. >>>>>>>> >>>>>>>> Precodition: clt_path->s.con_num is 2. >>>>>>>> So 2 cm connection will be created as below: >>>>>>>> CPU0                                              CPU1 >>>>>>>> init_conns {                              | >>>>>>>>      create_cm() // a. con[0] created        | >>>>>>>>                                              |  a'. rtrs_clt_rdma_cm_handler() { >>>>>>>>                                              |    rtrs_rdma_addr_resolved() >>>>>>>>                                              |      create_con_cq_qp(con); << con[0] >>>>>>>>                                              |  } >>>>>>>>                                              | in this moment, refcnt of PD was increased to 2+ >> What do you mean "refcnt of PD"? usecnt in struct ib_pd or dev_ref. > I mean usecnt in struct ib_pd > > > >>>>>>>>                                              | >>>>>>>>      create_cm() // b. cid = 1, failed       | >>>>>>>>        destroy_con_cq_qp()                   | >>>>>>>>          rtrs_ib_dev_put()                   | >>>>>>>>            dev_free()                        | >>>>>>>>              ib_dealloc_pd(dev->ib_pd) << PD | >>>>>>>>               is destroyed, but refcnt is    | >>>>>>>>               still greater than 0           | >> Assuming you mean "pd->usecnt". We only allocate pd in con[0] by rtrs_ib_dev_find_or_add, >> if con[1] failed to create cm, then alloc_path_reqs -> ib_alloc_mr -> atomic_inc(&pd->usecnt) The above can't be invoked, right? >> can't be triggered. Is there other places could increase the refcnt? > Yes, when create a qp, it will also associate to this PD, that also mean refcnt of PD will be increased. > > When con[0](create_con_cq_qp) succeeded, refcnt of PD will be 2. and then when con[1] failed, since > QP didn't create, refcnt of PD is still 2. con[1]'s cleanup will destroy the PD(ib_dealloc_pd) since dev_ref = 1, after that its > refcnt is still 1. I can see the path increase usecnt to 1. rtrs_cq_qp_create -> create_qp                     -> rdma_create_qp                         -> ib_create_qp                             -> create_qp                             -> ib_qp_usecnt_inc which increases pd->usecnt Where is another place to increase usecnt to 2? >> Then what is the appropriate time to call destroy_con_cq_qp for this scenario? >> Otherwise there could be memory leak. > we must ensure QP in con[0] is closed before destroying the PD. > Currently destroy_con_cq_qp() subroutine will close the opened QP first. Let me try another way, with below change, rtrs_ib_dev_put can't be called from destroy_con_cq_qp, right? + if (!con->has_dev) + return; if (clt_path->s.dev_ref && !--clt_path->s.dev_ref) { rtrs_ib_dev_put(clt_path->s.dev); clt_path->s.dev = NULL; Then when will you dealloc pd and free rtrs_ib_dev? Thanks, Guoqing