From: Gal Pressman <galpress@amazon.com>
To: Sumit Semwal, Christian König, Doug Ledford, Jason Gunthorpe
CC: Oded Gabbay, Tomer Tayar, Yossi Leybovich, Alexander Matushevsky,
	Leon Romanovsky, Jianxin Xiong, Firas Jahjah, Gal Pressman
Subject: [RFC PATCH 2/2] RDMA/efa: Add support for dmabuf memory regions
Date: Thu, 7 Oct 2021 13:43:00 +0300
Message-ID: <20211007104301.76693-3-galpress@amazon.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211007104301.76693-1-galpress@amazon.com>
References: <20211007104301.76693-1-galpress@amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Implement a dmabuf importer for the EFA driver. As ODP is not
supported, dmabuf memory regions always pin the underlying buffers,
which prevents the move_notify callback from ever being called.
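For context, userspace reaches this path through the ibv_reg_dmabuf_mr()
verb in rdma-core, which lands in the driver via the .reg_user_mr_dmabuf
device op added below. A minimal sketch (illustrative only, not part of
this patch; the helper name and access flags are placeholders):

/* Illustrative userspace sketch: register a dmabuf-backed MR.
 * Assumes `fd` is a dmabuf file descriptor obtained from an exporter
 * (e.g. a GPU driver) and `pd` is a PD on an EFA device.
 */
#include <infiniband/verbs.h>

static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, int fd, size_t len)
{
	/* offset 0 within the dmabuf, iova 0 for simplicity */
	return ibv_reg_dmabuf_mr(pd, 0, len, 0, fd,
				 IBV_ACCESS_LOCAL_WRITE);
}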
Signed-off-by: Gal Pressman <galpress@amazon.com>
---
 drivers/infiniband/hw/efa/efa.h       |   4 +
 drivers/infiniband/hw/efa/efa_main.c  |   1 +
 drivers/infiniband/hw/efa/efa_verbs.c | 166 +++++++++++++++++++++-----
 3 files changed, 141 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h
index 2b8ca099b381..407d7c4baa16 100644
--- a/drivers/infiniband/hw/efa/efa.h
+++ b/drivers/infiniband/hw/efa/efa.h
@@ -141,6 +141,10 @@ int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
 			 u64 virt_addr, int access_flags,
 			 struct ib_udata *udata);
+struct ib_mr *efa_reg_user_mr_dmabuf(struct ib_pd *ibpd, u64 start,
+				     u64 length, u64 virt_addr,
+				     int fd, int access_flags,
+				     struct ib_udata *udata);
 int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
 int efa_get_port_immutable(struct ib_device *ibdev, u32 port_num,
 			   struct ib_port_immutable *immutable);
diff --git a/drivers/infiniband/hw/efa/efa_main.c b/drivers/infiniband/hw/efa/efa_main.c
index 203e6ddcacbc..72cd7d952a07 100644
--- a/drivers/infiniband/hw/efa/efa_main.c
+++ b/drivers/infiniband/hw/efa/efa_main.c
@@ -267,6 +267,7 @@ static const struct ib_device_ops efa_dev_ops = {
 	.query_port = efa_query_port,
 	.query_qp = efa_query_qp,
 	.reg_user_mr = efa_reg_mr,
+	.reg_user_mr_dmabuf = efa_reg_user_mr_dmabuf,
 
 	INIT_RDMA_OBJ_SIZE(ib_ah, efa_ah, ibah),
 	INIT_RDMA_OBJ_SIZE(ib_cq, efa_cq, ibcq),
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index be6d3ff0f1be..ca907853a84f 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -3,6 +3,8 @@
  * Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
  */
 
+#include <linux/dma-buf.h>
+#include <linux/dma-resv.h>
 #include <linux/vmalloc.h>
 #include <linux/log2.h>
 
@@ -1491,26 +1493,29 @@ static int efa_create_pbl(struct efa_dev *dev,
 	return 0;
 }
 
-struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
-			 u64 virt_addr, int access_flags,
-			 struct ib_udata *udata)
+static void efa_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
+{
+	WARN_ONCE(1,
+		  "Invalidate callback should not be called when memory is pinned\n");
+}
+
+static const struct dma_buf_attach_ops efa_dmabuf_attach_ops = {
+	.allow_peer2peer = true,
+	.move_notify = efa_dmabuf_invalidate_cb,
+};
+
+static struct efa_mr *efa_alloc_mr(struct ib_pd *ibpd, int access_flags,
+				   struct ib_udata *udata)
 {
 	struct efa_dev *dev = to_edev(ibpd->device);
-	struct efa_com_reg_mr_params params = {};
-	struct efa_com_reg_mr_result result = {};
-	struct pbl_context pbl;
 	int supp_access_flags;
-	unsigned int pg_sz;
 	struct efa_mr *mr;
-	int inline_size;
-	int err;
 
 	if (udata && udata->inlen &&
 	    !ib_is_udata_cleared(udata, 0, sizeof(udata->inlen))) {
 		ibdev_dbg(&dev->ibdev,
 			  "Incompatible ABI params, udata not cleared\n");
-		err = -EINVAL;
-		goto err_out;
+		return ERR_PTR(-EINVAL);
 	}
 
 	supp_access_flags =
@@ -1522,23 +1527,26 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
 		ibdev_dbg(&dev->ibdev,
 			  "Unsupported access flags[%#x], supported[%#x]\n",
 			  access_flags, supp_access_flags);
-		err = -EOPNOTSUPP;
-		goto err_out;
+		return ERR_PTR(-EOPNOTSUPP);
 	}
 
 	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
-	if (!mr) {
-		err = -ENOMEM;
-		goto err_out;
-	}
+	if (!mr)
+		return ERR_PTR(-ENOMEM);
 
-	mr->umem = ib_umem_get(ibpd->device, start, length, access_flags);
-	if (IS_ERR(mr->umem)) {
-		err = PTR_ERR(mr->umem);
-		ibdev_dbg(&dev->ibdev,
-			  "Failed to pin and map user space memory[%d]\n", err);
-		goto err_free;
-	}
+	return mr;
+}
+
+static int efa_register_mr(struct ib_pd *ibpd, struct efa_mr *mr, u64 start,
+			   u64 length, u64 virt_addr, int access_flags)
+{
+	struct efa_dev *dev = to_edev(ibpd->device);
+	struct efa_com_reg_mr_params params = {};
+	struct efa_com_reg_mr_result result = {};
+	struct pbl_context pbl;
+	unsigned int pg_sz;
+	int inline_size;
+	int err;
 
 	params.pd = to_epd(ibpd)->pdn;
 	params.iova = virt_addr;
@@ -1549,10 +1557,9 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
 				       dev->dev_attr.page_size_cap,
 				       virt_addr);
 	if (!pg_sz) {
-		err = -EOPNOTSUPP;
 		ibdev_dbg(&dev->ibdev, "Failed to find a suitable page size in page_size_cap %#llx\n",
 			  dev->dev_attr.page_size_cap);
-		goto err_unmap;
+		return -EOPNOTSUPP;
 	}
 
 	params.page_shift = order_base_2(pg_sz);
@@ -1566,21 +1573,21 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
 	if (params.page_num <= inline_size) {
 		err = efa_create_inline_pbl(dev, mr, &params);
 		if (err)
-			goto err_unmap;
+			return err;
 
 		err = efa_com_register_mr(&dev->edev, &params, &result);
 		if (err)
-			goto err_unmap;
+			return err;
 	} else {
 		err = efa_create_pbl(dev, &pbl, mr, &params);
 		if (err)
-			goto err_unmap;
+			return err;
 
 		err = efa_com_register_mr(&dev->edev, &params, &result);
 		pbl_destroy(dev, &pbl);
 		if (err)
-			goto err_unmap;
+			return err;
 	}
 
 	mr->ibmr.lkey = result.l_key;
@@ -1588,9 +1595,98 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
 	mr->ibmr.length = length;
 	ibdev_dbg(&dev->ibdev, "Registered mr[%d]\n", mr->ibmr.lkey);
 
+	return 0;
+}
+
+struct ib_mr *efa_reg_user_mr_dmabuf(struct ib_pd *ibpd, u64 start,
+				     u64 length, u64 virt_addr,
+				     int fd, int access_flags,
+				     struct ib_udata *udata)
+{
+	struct efa_dev *dev = to_edev(ibpd->device);
+	struct ib_umem_dmabuf *umem_dmabuf;
+	struct efa_mr *mr;
+	int err;
+
+	mr = efa_alloc_mr(ibpd, access_flags, udata);
+	if (IS_ERR(mr)) {
+		err = PTR_ERR(mr);
+		goto err_out;
+	}
+
+	umem_dmabuf = ib_umem_dmabuf_get(ibpd->device, start, length, fd,
+					 access_flags, &efa_dmabuf_attach_ops);
+	if (IS_ERR(umem_dmabuf)) {
+		err = PTR_ERR(umem_dmabuf);
+		ibdev_dbg(&dev->ibdev, "Failed to get dmabuf[%d]\n", err);
+		goto err_free;
+	}
+
+	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
+	err = dma_buf_pin(umem_dmabuf->attach);
+	if (err) {
+		ibdev_dbg(&dev->ibdev, "Failed to pin dmabuf memory\n");
+		goto err_release;
+	}
+
+	err = ib_umem_dmabuf_map_pages(umem_dmabuf);
+	if (err) {
+		ibdev_dbg(&dev->ibdev, "Failed to map dmabuf pages\n");
+		goto err_unpin;
+	}
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+
+	mr->umem = &umem_dmabuf->umem;
+	err = efa_register_mr(ibpd, mr, start, length, virt_addr, access_flags);
+	if (err)
+		goto err_unmap;
+
 	return &mr->ibmr;
 
 err_unmap:
+	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
+	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
+err_unpin:
+	dma_buf_unpin(umem_dmabuf->attach);
+err_release:
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+	ib_umem_release(mr->umem);
+err_free:
+	kfree(mr);
+err_out:
+	atomic64_inc(&dev->stats.reg_mr_err);
+	return ERR_PTR(err);
+}
+
+struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
+			 u64 virt_addr, int access_flags,
+			 struct ib_udata *udata)
+{
+	struct efa_dev *dev = to_edev(ibpd->device);
+	struct efa_mr *mr;
+	int err;
+
+	mr = efa_alloc_mr(ibpd, access_flags, udata);
+	if (IS_ERR(mr)) {
+		err = PTR_ERR(mr);
+		goto err_out;
+	}
+
+	mr->umem = ib_umem_get(ibpd->device, start, length, access_flags);
+	if (IS_ERR(mr->umem)) {
+		err = PTR_ERR(mr->umem);
+		ibdev_dbg(&dev->ibdev,
+			  "Failed to pin and map user space memory[%d]\n", err);
+		goto err_free;
+	}
+
+	err = efa_register_mr(ibpd, mr, start, length, virt_addr, access_flags);
+	if (err)
+		goto err_release;
+
+	return &mr->ibmr;
+
+err_release:
 	ib_umem_release(mr->umem);
 err_free:
 	kfree(mr);
@@ -1603,6 +1699,7 @@ int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
 	struct efa_dev *dev = to_edev(ibmr->device);
 	struct efa_com_dereg_mr_params params;
+	struct ib_umem_dmabuf *umem_dmabuf;
 	struct efa_mr *mr = to_emr(ibmr);
 	int err;
 
@@ -1613,6 +1710,15 @@ int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 	if (err)
 		return err;
 
+	if (mr->umem->is_dmabuf) {
+		umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
+
+		dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
+		ib_umem_dmabuf_unmap_pages(umem_dmabuf);
+		dma_buf_unpin(umem_dmabuf->attach);
+		dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+	}
+
 	ib_umem_release(mr->umem);
 	kfree(mr);
 
-- 
2.33.0
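Editorial note (not part of the patch): efa_dmabuf_invalidate_cb() can
afford to be a plain WARN because dma_buf_pin() forbids the exporter
from moving the buffer, so move_notify should never fire. For contrast,
a dynamic importer that did support invalidation would have to tear
down the mapping in the callback, roughly like the hypothetical sketch
below; the hardware-invalidation step is an assumption about what such
a driver would need:

/* Hypothetical move_notify for a driver that supports invalidation
 * (EFA does not, hence pin + WARN above).
 */
static void example_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
{
	struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;

	/* The exporter calls move_notify with the dma_resv lock held,
	 * so the pages can be unmapped directly.
	 */
	ib_umem_dmabuf_unmap_pages(umem_dmabuf);

	/* Driver-specific and assumed: quiesce outstanding DMA and
	 * invalidate the HW MR before the exporter moves the buffer.
	 */
}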