Date: Tue, 14 Sep 2021 16:18:31 +0200
From: Daniel Vetter
To: Oded Gabbay
Cc: linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org, jgg@ziepe.ca,
    christian.koenig@amd.com, daniel.vetter@ffwll.ch, galpress@amazon.com,
    sleybo@amazon.com, dri-devel@lists.freedesktop.org,
    linux-rdma@vger.kernel.org, linux-media@vger.kernel.org,
    dledford@redhat.com, airlied@gmail.com, alexander.deucher@amd.com,
    leonro@nvidia.com, hch@lst.de, amd-gfx@lists.freedesktop.org,
    linaro-mm-sig@lists.linaro.org
Subject: Re: [PATCH v6 0/2] Add p2p via dmabuf to habanalabs
In-Reply-To: <20210912165309.98695-1-ogabbay@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Sep 12, 2021 at 07:53:07PM +0300, Oded Gabbay wrote:
> Hi,
> Re-sending this patch-set following the release of our user-space TPC
> compiler and runtime library.
>
> I would appreciate a review on this.

I think the big open question we have is the entire revoke discussion.
Having the option to let dma-bufs hang around that map to random local
memory ranges, without a clear ownership link and a way to kill them,
sounds bad to me.

I think there are a few options:
- We require revoke support. But I've heard rdma really doesn't like that,
  I guess because taking out an MR while holding the dma_resv_lock would
  be an inversion, so it can't be done. Jason, can you recap what exactly
  the hold-up was again that makes this a no-go?

- The other option I discussed is closer to the exclusive device ownership
  model we've had for gpus in drm of the really old kind. Roughly this
  would work like this, in terms of drm_device:
  - Only the current owner (drm_master in current drm code, but we should
    probably rename that to drm_owner) is allowed to use the accel driver.
    So all ioctls would fail if you're not drm_master.
  - On dropmaster/file close we'd revoke as much as possible, e.g.
    in-flight commands, mmaps, anything really that can be revoked.
  - For non-revokable things like these dma-bufs we'd keep a drm_master
    reference around. This would prevent the next open from acquiring
    ownership rights, which at least prevents all the nasty potential
    problems.
  - The admin (or, well, the container orchestrator) then has the
    responsibility to shoot down all processes until the problem goes away
    (i.e. until you hit the one with the rdma MR which keeps the dma-buf
    alive).

- Not sure there's another reasonable way to do this without inviting some
  problems once we get outside of the "single kernel instance per tenant"
  use-case.

Wrt implementation there's the trouble of this reinventing a bunch of drm
stuff and concepts, but that's maybe for after we've figured out
semantics.
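
To make the exclusive-ownership option above a bit more concrete, here is a
rough userspace model of the reference counting it implies. This is only a
sketch under assumptions: struct owner, struct device and all dev_*()
helpers are hypothetical names made up for illustration, not drm or
habanalabs APIs, and real code would also need locking. The point it tries
to show is that a non-revokable export keeps the old owner pinned, so the
next open fails until the importer lets go of the buffer.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct owner {
	int refs;        /* 1 for the open file, +1 per non-revokable export */
	bool file_open;  /* false once the owning file has been closed */
};

struct device {
	struct owner *cur;  /* NULL when nobody owns the device */
};

/* open(): succeeds only if no owner, live or pinned, is left over. */
static bool dev_open(struct device *d, struct owner *o)
{
	if (d->cur)
		return false;  /* previous owner is still referenced */
	o->refs = 1;
	o->file_open = true;
	d->cur = o;
	return true;
}

/* Every ioctl would first check that the caller is the live owner. */
static bool dev_ioctl_allowed(const struct device *d, const struct owner *o)
{
	return d->cur == o && o->file_open;
}

/* Drop one owner reference; the last one frees up the device. */
static void owner_put(struct device *d, struct owner *o)
{
	assert(o->refs > 0);
	if (--o->refs == 0 && d->cur == o)
		d->cur = NULL;
}

/* Exporting a non-revokable buffer pins the owner. */
static bool dev_export(struct device *d, struct owner *o)
{
	if (!dev_ioctl_allowed(d, o))
		return false;
	o->refs++;
	return true;
}

/* The importer is finally done with the buffer (e.g. MR torn down). */
static void dev_export_release(struct device *d, struct owner *o)
{
	owner_put(d, o);
}

/* close()/dropmaster: revoke what can be revoked, drop the file's ref. */
static void dev_close(struct device *d, struct owner *o)
{
	if (!o->file_open)
		return;
	o->file_open = false;  /* in-flight commands, mmaps etc. end here */
	owner_put(d, o);
}
```

In this model, a process that exports a buffer and then closes its file
leaves d->cur set, so a second dev_open() keeps failing until
dev_export_release() runs. That is the "shoot down processes until the
problem goes away" situation from the list above.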

Also would be great if you have a pull request for the userspace runtime
that shows a bit how this all gets used and tied together. Or maybe some
pointers, since I guess retconning a PR in github is maybe a bit much.

Cheers, Daniel

>
> Thanks,
> Oded
>
> Oded Gabbay (1):
>   habanalabs: define uAPI to export FD for DMA-BUF
>
> Tomer Tayar (1):
>   habanalabs: add support for dma-buf exporter
>
>  drivers/misc/habanalabs/Kconfig             |   1 +
>  drivers/misc/habanalabs/common/habanalabs.h |  22 +
>  drivers/misc/habanalabs/common/memory.c     | 522 +++++++++++++++++++-
>  drivers/misc/habanalabs/gaudi/gaudi.c       |   1 +
>  drivers/misc/habanalabs/goya/goya.c         |   1 +
>  include/uapi/misc/habanalabs.h              |  28 +-
>  6 files changed, 570 insertions(+), 5 deletions(-)
>
> --
> 2.17.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch