Received: by 2002:a05:7412:37c9:b0:e2:908c:2ebd with SMTP id jz9csp1063024rdb; Tue, 19 Sep 2023 20:36:53 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGHLaKd7tFE/VhJW0okzTASv3rwk0Zg1F5gAuOug25O0uxp7nYFLQzdH+fpe5vNvE8d5Lv7 X-Received: by 2002:a25:3619:0:b0:d81:cb92:337c with SMTP id d25-20020a253619000000b00d81cb92337cmr1340565yba.54.1695181013003; Tue, 19 Sep 2023 20:36:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695181012; cv=none; d=google.com; s=arc-20160816; b=kq9P2gxetmHSBqaAxxgkB4B4vepuN332g5vkLBVs4Ga20mb8HFukRVAKsuLBbmwu8b ing5PWx4/CAgm7bzSM02PKGjizFHaQ1iUa5C1/PZ2tcXz8AL9X3lr0Dx9/l7bt9GACpK tzZCnFZ5j8JyW0U0h4h/QNaKCu11vkvvBVJJ0Jwyh7sIRHzs0VNXBqhnFe5zOqAsNqb7 oZBg4DI1SNOy/6kdHu/EW1Wqbjrm46CrsnhJrHDN5rvAFLU4RbjX1pO7VjqBgv9YrvUW cbAIuEoUzDuV+zZS1c/uNA8zbjpFHZUAnGlT9Z/ZGVa7N/VB9WV9+LgPt3ImDKTjFcX9 iYNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=Ezk2RrQWB6Mt4o3Hx35J5J/7s6xG3C0CNFPAmm0NBII=; fh=KWvyQxL3Ff+3WPSMjlYu+P4255AmcMULAsFol6M1vNI=; b=utuCkC0x/KFSRgjNkjQBzXHsr3HHGLayuBypC1XmjO8Koej23agW0LRS4P/QvX/FKD xHHtrc/mCZwaF3HlPsGpc1DeycmOAaqVVTfxKPRbNUtu68MOSqaD6OBlf7CBXnbuBbIQ lwB57peaaxmjZTpTEWa5Frdg3oevXLX2ThGvfgJnAlQj81UON72X3kcLL3IxY7DTsZTh rFz3Jp1tmdjnYamgGSivNCy6leQZ/mFbDNon34KyrinV5pJkAlfzTmDxCRKqd4FfsYAE +VG2tcb5ZO/gBPmJNKzpr0fSIxa1c5DKCVKj5Zfn0o0/ppHwWpJO71D9SwKha5iLQ8Vw g0SA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Return-Path: Received: from agentk.vger.email (agentk.vger.email. [23.128.96.32]) by mx.google.com with ESMTPS id p127-20020a634285000000b00573fe49ad3esi10731555pga.672.2023.09.19.20.36.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Sep 2023 20:36:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) client-ip=23.128.96.32; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=hisilicon.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 1C39E80E0A74; Tue, 19 Sep 2023 20:33:35 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231948AbjITDdU (ORCPT + 99 others); Tue, 19 Sep 2023 23:33:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231675AbjITDdT (ORCPT ); Tue, 19 Sep 2023 23:33:19 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BED96B0; Tue, 19 Sep 2023 20:33:09 -0700 (PDT) Received: from kwepemi500006.china.huawei.com (unknown [172.30.72.56]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4Rr3t90M35zJsWs; Wed, 20 Sep 2023 11:29:21 +0800 (CST) Received: from localhost.localdomain (10.67.165.2) by kwepemi500006.china.huawei.com (7.221.188.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Wed, 20 Sep 2023 11:33:06 +0800 From: Junxian Huang To: , CC: , , , Subject: [PATCH for-next] RDMA/hns: Support SRQ record doorbell Date: Wed, 20 Sep 2023 11:30:05 +0800 Message-ID: <20230920033005.1557-1-huangjunxian6@hisilicon.com> X-Mailer: git-send-email 2.30.0 MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII X-Originating-IP: [10.67.165.2] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemi500006.china.huawei.com (7.221.188.68) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Tue, 19 Sep 2023 20:33:35 -0700 (PDT) From: Yangyang Li Compared with normal doorbell, using record doorbell can shorten the process of ringing the doorbell and reduce the latency. Add a flag HNS_ROCE_CAP_FLAG_SRQ_RECORD_DB to allow FW to enable/disable SRQ record doorbell. If the flag above is set, allocate the dma buffer for SRQ record doorbell and write the buffer address into SRQC during SRQ creation. For userspace SRQ, add a flag HNS_ROCE_RSP_SRQ_CAP_RECORD_DB to notify userspace whether the SRQ record doorbell is enabled. Signed-off-by: Yangyang Li Signed-off-by: Junxian Huang --- drivers/infiniband/hw/hns/hns_roce_device.h | 3 + drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 30 ++++++-- drivers/infiniband/hw/hns/hns_roce_srq.c | 85 ++++++++++++++++++++- include/uapi/rdma/hns-abi.h | 13 +++- 4 files changed, 120 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 7f0d0288beb1..668cef06a269 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -146,6 +146,7 @@ enum { HNS_ROCE_CAP_FLAG_SDI_MODE = BIT(14), HNS_ROCE_CAP_FLAG_STASH = BIT(17), HNS_ROCE_CAP_FLAG_CQE_INLINE = BIT(19), + HNS_ROCE_CAP_FLAG_SRQ_RECORD_DB = BIT(22), }; #define HNS_ROCE_DB_TYPE_COUNT 2 @@ -453,6 +454,8 @@ struct hns_roce_srq { spinlock_t lock; struct mutex mutex; void (*event)(struct hns_roce_srq *srq, enum hns_roce_event event); + struct hns_roce_db rdb; + u32 cap_flags; }; struct hns_roce_uar_table { diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index d82daff2d9bd..6418c681eb5a 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -941,11 +941,16 @@ static void fill_wqe_idx(struct hns_roce_srq *srq, unsigned int wqe_idx) idx_que->head++; } -static void update_srq_db(struct hns_roce_v2_db *db, struct hns_roce_srq *srq) +static void update_srq_db(struct hns_roce_srq *srq) { - hr_reg_write(db, DB_TAG, srq->srqn); - hr_reg_write(db, DB_CMD, HNS_ROCE_V2_SRQ_DB); - hr_reg_write(db, DB_PI, srq->idx_que.head); + struct hns_roce_dev *hr_dev = to_hr_dev(srq->ibsrq.device); + struct hns_roce_v2_db db; + + hr_reg_write(&db, DB_TAG, srq->srqn); + hr_reg_write(&db, DB_CMD, HNS_ROCE_V2_SRQ_DB); + hr_reg_write(&db, DB_PI, srq->idx_que.head); + + hns_roce_write64(hr_dev, (__le32 *)&db, srq->db_reg); } static int hns_roce_v2_post_srq_recv(struct ib_srq *ibsrq, @@ -954,7 +959,6 @@ static int hns_roce_v2_post_srq_recv(struct ib_srq *ibsrq, { struct hns_roce_dev *hr_dev = to_hr_dev(ibsrq->device); struct hns_roce_srq *srq = to_hr_srq(ibsrq); - struct hns_roce_v2_db srq_db; unsigned long flags; int ret = 0; u32 max_sge; @@ -985,9 +989,11 @@ static int hns_roce_v2_post_srq_recv(struct ib_srq *ibsrq, } if (likely(nreq)) { - update_srq_db(&srq_db, srq); - - hns_roce_write64(hr_dev, (__le32 *)&srq_db, srq->db_reg); + if (srq->cap_flags & HNS_ROCE_SRQ_CAP_RECORD_DB) + *srq->rdb.db_record = srq->idx_que.head & + V2_DB_PRODUCER_IDX_M; + else + update_srq_db(srq); } spin_unlock_irqrestore(&srq->lock, flags); @@ -5606,6 +5612,14 @@ static int hns_roce_v2_write_srqc(struct hns_roce_srq *srq, void *mb_buf) hr_reg_write(ctx, SRQC_WQE_BUF_PG_SZ, to_hr_hw_page_shift(srq->buf_mtr.hem_cfg.buf_pg_shift)); + if (srq->cap_flags & HNS_ROCE_SRQ_CAP_RECORD_DB) { + hr_reg_enable(ctx, SRQC_DB_RECORD_EN); + hr_reg_write(ctx, SRQC_DB_RECORD_ADDR_L, + lower_32_bits(srq->rdb.dma) >> 1); + hr_reg_write(ctx, SRQC_DB_RECORD_ADDR_H, + upper_32_bits(srq->rdb.dma)); + } + return hns_roce_v2_write_srqc_index_queue(srq, ctx); } diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c index 8dae98f827eb..4e2d1c8e164a 100644 --- a/drivers/infiniband/hw/hns/hns_roce_srq.c +++ b/drivers/infiniband/hw/hns/hns_roce_srq.c @@ -5,6 +5,7 @@ #include #include +#include #include "hns_roce_device.h" #include "hns_roce_cmd.h" #include "hns_roce_hem.h" @@ -387,6 +388,79 @@ static void free_srq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq) free_srq_idx(hr_dev, srq); } +static int get_srq_ucmd(struct hns_roce_srq *srq, struct ib_udata *udata, + struct hns_roce_ib_create_srq *ucmd) +{ + struct ib_device *ibdev = srq->ibsrq.device; + int ret; + + ret = ib_copy_from_udata(ucmd, udata, min(udata->inlen, sizeof(*ucmd))); + if (ret) { + ibdev_err(ibdev, "failed to copy SRQ udata, ret = %d.\n", ret); + return ret; + } + + return 0; +} + +static void free_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq, + struct ib_udata *udata) +{ + struct hns_roce_ucontext *uctx; + + if (!(srq->cap_flags & HNS_ROCE_SRQ_CAP_RECORD_DB)) + return; + + srq->cap_flags &= ~HNS_ROCE_SRQ_CAP_RECORD_DB; + if (udata) { + uctx = rdma_udata_to_drv_context(udata, + struct hns_roce_ucontext, + ibucontext); + hns_roce_db_unmap_user(uctx, &srq->rdb); + } else { + hns_roce_free_db(hr_dev, &srq->rdb); + } +} + +static int alloc_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq, + struct ib_udata *udata, + struct hns_roce_ib_create_srq_resp *resp) +{ + struct hns_roce_ib_create_srq ucmd = {}; + struct hns_roce_ucontext *uctx; + int ret; + + if (udata) { + ret = get_srq_ucmd(srq, udata, &ucmd); + if (ret) + return ret; + + if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SRQ_RECORD_DB) && + (ucmd.req_cap_flags & HNS_ROCE_SRQ_CAP_RECORD_DB)) { + uctx = rdma_udata_to_drv_context(udata, + struct hns_roce_ucontext, ibucontext); + ret = hns_roce_db_map_user(uctx, ucmd.db_addr, + &srq->rdb); + if (ret) + return ret; + + srq->cap_flags |= HNS_ROCE_RSP_SRQ_CAP_RECORD_DB; + } + } else { + if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SRQ_RECORD_DB) { + ret = hns_roce_alloc_db(hr_dev, &srq->rdb, 1); + if (ret) + return ret; + + *srq->rdb.db_record = 0; + srq->cap_flags |= HNS_ROCE_RSP_SRQ_CAP_RECORD_DB; + } + srq->db_reg = hr_dev->reg_base + SRQ_DB_REG; + } + + return 0; +} + int hns_roce_create_srq(struct ib_srq *ib_srq, struct ib_srq_init_attr *init_attr, struct ib_udata *udata) @@ -407,15 +481,20 @@ int hns_roce_create_srq(struct ib_srq *ib_srq, if (ret) return ret; - ret = alloc_srqn(hr_dev, srq); + ret = alloc_srq_db(hr_dev, srq, udata, &resp); if (ret) goto err_srq_buf; + ret = alloc_srqn(hr_dev, srq); + if (ret) + goto err_srq_db; + ret = alloc_srqc(hr_dev, srq); if (ret) goto err_srqn; if (udata) { + resp.cap_flags = srq->cap_flags; resp.srqn = srq->srqn; if (ib_copy_to_udata(udata, &resp, min(udata->outlen, sizeof(resp)))) { @@ -424,7 +503,6 @@ int hns_roce_create_srq(struct ib_srq *ib_srq, } } - srq->db_reg = hr_dev->reg_base + SRQ_DB_REG; srq->event = hns_roce_ib_srq_event; refcount_set(&srq->refcount, 1); init_completion(&srq->free); @@ -435,6 +513,8 @@ int hns_roce_create_srq(struct ib_srq *ib_srq, free_srqc(hr_dev, srq); err_srqn: free_srqn(hr_dev, srq); +err_srq_db: + free_srq_db(hr_dev, srq, udata); err_srq_buf: free_srq_buf(hr_dev, srq); @@ -448,6 +528,7 @@ int hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) free_srqc(hr_dev, srq); free_srqn(hr_dev, srq); + free_srq_db(hr_dev, srq, udata); free_srq_buf(hr_dev, srq); return 0; } diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h index 2e68a8b0c92c..c68bb0cd80e1 100644 --- a/include/uapi/rdma/hns-abi.h +++ b/include/uapi/rdma/hns-abi.h @@ -52,15 +52,26 @@ struct hns_roce_ib_create_cq_resp { __aligned_u64 cap_flags; }; +enum hns_roce_srq_cap_flags { + HNS_ROCE_SRQ_CAP_RECORD_DB = 1 << 0, +}; + +enum hns_roce_srq_cap_flags_resp { + HNS_ROCE_RSP_SRQ_CAP_RECORD_DB = 1 << 0, +}; + + struct hns_roce_ib_create_srq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __aligned_u64 que_addr; + __u32 req_cap_flags; /* Use enum hns_roce_srq_cap_flags */ + __u32 reserved; }; struct hns_roce_ib_create_srq_resp { __u32 srqn; - __u32 reserved; + __u32 cap_flags; /* Use enum hns_roce_srq_cap_flags */ }; struct hns_roce_ib_create_qp { -- 2.30.0