Date: Fri, 2 Jun 2017 08:58:31 -0600
From: Alex Williamson
To: "Chen, Xiaoguang"
Cc: "kraxel@redhat.com", "chris@chris-wilson.co.uk",
 "intel-gfx@lists.freedesktop.org", "linux-kernel@vger.kernel.org",
 "zhenyuw@linux.intel.com", "Lv, Zhiyuan",
 "intel-gvt-dev@lists.freedesktop.org", "Wang, Zhi A", "Tian, Kevin"
Subject: Re: [PATCH v6 6/6] drm/i915/gvt: Adding interface so user space can get the dma-buf
Message-ID: <20170602085831.41454a1f@t450s.home>
References: <1495874332-2851-1-git-send-email-xiaoguang.chen@intel.com>
 <1495874332-2851-7-git-send-email-xiaoguang.chen@intel.com>
 <20170601120820.3358f7dd@w520.home>
 <20170601213459.461ecf36@w520.home>

On Fri, 2 Jun 2017 09:31:41 +0000
"Chen, Xiaoguang" wrote:

> >-----Original Message-----
> >From: Alex Williamson [mailto:alex.williamson@redhat.com]
> >Sent: Friday, June 02, 2017 11:35 AM
> >To: Chen, Xiaoguang
> >Cc: kraxel@redhat.com; chris@chris-wilson.co.uk;
> >intel-gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org;
> >zhenyuw@linux.intel.com; Lv, Zhiyuan;
> >intel-gvt-dev@lists.freedesktop.org; Wang, Zhi A; Tian, Kevin
> >Subject: Re: [PATCH v6 6/6] drm/i915/gvt: Adding interface so user space can get
> >the dma-buf
> >
> >On Fri, 2 Jun 2017 03:24:35 +0000
> >"Chen, Xiaoguang" wrote:
> >
> >> Hi Alex,
> >>
> >> >-----Original Message-----
> >> >From: Alex Williamson [mailto:alex.williamson@redhat.com]
> >> >Sent: Friday, June 02, 2017 2:08 AM
> >> >To: Chen, Xiaoguang
> >> >Cc: kraxel@redhat.com; chris@chris-wilson.co.uk;
> >> >intel-gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org;
> >> >zhenyuw@linux.intel.com; Lv, Zhiyuan;
> >> >intel-gvt-dev@lists.freedesktop.org; Wang, Zhi A; Tian, Kevin
> >> >Subject: Re: [PATCH v6 6/6] drm/i915/gvt: Adding interface so user
> >> >space can get the dma-buf
> >> >
> >> >On Sat, 27 May 2017 16:38:52 +0800
> >> >Xiaoguang Chen wrote:
> >> >
> >> >> User space should create the management fd for the dma-buf operation first.
> >> >> Then user space can query the plane information and create a dma-buf if
> >> >> necessary using the management fd.
> >> >>
> >> >> Signed-off-by: Xiaoguang Chen
> >> >> ---
> >> >>  drivers/gpu/drm/i915/gvt/dmabuf.c |  12 ++++
> >> >>  drivers/gpu/drm/i915/gvt/dmabuf.h |   5 ++
> >> >>  drivers/gpu/drm/i915/gvt/gvt.c    |   2 +
> >> >>  drivers/gpu/drm/i915/gvt/gvt.h    |   5 ++
> >> >>  drivers/gpu/drm/i915/gvt/kvmgt.c  | 144 ++++++++++++++++++++++++++++++++++++++
> >> >>  drivers/gpu/drm/i915/gvt/vgpu.c   |   1 +
> >> >>  6 files changed, 169 insertions(+)
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.c
> >> >> b/drivers/gpu/drm/i915/gvt/dmabuf.c
> >> >> index c831e91..9759e9a 100644
> >> >> --- a/drivers/gpu/drm/i915/gvt/dmabuf.c
> >> >> +++ b/drivers/gpu/drm/i915/gvt/dmabuf.c
> >> >> @@ -226,6 +226,7 @@ int intel_vgpu_create_dmabuf(struct intel_vgpu *vgpu, void *args)
> >> >>  	struct vfio_vgpu_dmabuf_info *gvt_dmabuf = args;
> >> >>  	struct intel_vgpu_fb_info *fb_info;
> >> >>  	int ret;
> >> >> +	struct intel_vgpu_dmabuf_obj *dmabuf_obj;
> >> >>
> >> >>  	ret = intel_vgpu_get_plane_info(dev, vgpu, &gvt_dmabuf->plane_info);
> >> >>  	if (ret != 0)
> >> >> @@ -263,6 +264,17 @@ int intel_vgpu_create_dmabuf(struct intel_vgpu *vgpu, void *args)
> >> >>  		gvt_vgpu_err("create dma-buf fd failed ret:%d\n", ret);
> >> >>  		return ret;
> >> >>  	}
> >> >> +	dmabuf_obj = kmalloc(sizeof(*dmabuf_obj), GFP_KERNEL);
> >> >> +	if (dmabuf_obj == NULL) {
> >> >> +		kfree(fb_info);
> >> >> +		i915_gem_object_put(obj);
> >> >> +		gvt_vgpu_err("alloc dmabuf_obj failed\n");
> >> >> +		return -ENOMEM;
> >> >> +	}
> >> >> +	dmabuf_obj->obj = obj;
> >> >> +	INIT_LIST_HEAD(&dmabuf_obj->list);
> >> >> +	list_add_tail(&dmabuf_obj->list, &vgpu->dmabuf_obj_list_head);
> >> >> +
> >> >>  	gvt_dmabuf->fd = ret;
> >> >>
> >> >>  	return 0;
> >> >> diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.h
> >> >> b/drivers/gpu/drm/i915/gvt/dmabuf.h
> >> >> index 8be9979..cafa781 100644
> >> >> --- a/drivers/gpu/drm/i915/gvt/dmabuf.h
> >> >> +++ b/drivers/gpu/drm/i915/gvt/dmabuf.h
> >> >> @@ -31,6 +31,11 @@ struct intel_vgpu_fb_info {
> >> >>  	uint32_t fb_size;
> >> >>  };
> >> >>
> >> >> +struct intel_vgpu_dmabuf_obj {
> >> >> +	struct drm_i915_gem_object *obj;
> >> >> +	struct list_head list;
> >> >> +};
> >> >> +
> >> >>  int intel_vgpu_query_plane(struct intel_vgpu *vgpu, void *args);
> >> >>  int intel_vgpu_create_dmabuf(struct intel_vgpu *vgpu, void *args);
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/gvt/gvt.c
> >> >> b/drivers/gpu/drm/i915/gvt/gvt.c
> >> >> index 2032917..dbc3f86 100644
> >> >> --- a/drivers/gpu/drm/i915/gvt/gvt.c
> >> >> +++ b/drivers/gpu/drm/i915/gvt/gvt.c
> >> >> @@ -54,6 +54,8 @@ static const struct intel_gvt_ops intel_gvt_ops = {
> >> >>  	.vgpu_reset = intel_gvt_reset_vgpu,
> >> >>  	.vgpu_activate = intel_gvt_activate_vgpu,
> >> >>  	.vgpu_deactivate = intel_gvt_deactivate_vgpu,
> >> >> +	.vgpu_query_plane = intel_vgpu_query_plane,
> >> >> +	.vgpu_create_dmabuf = intel_vgpu_create_dmabuf,
> >> >>  };
> >> >>
> >> >>  /**
> >> >> diff --git a/drivers/gpu/drm/i915/gvt/gvt.h
> >> >> b/drivers/gpu/drm/i915/gvt/gvt.h
> >> >> index 763a8c5..a855797 100644
> >> >> --- a/drivers/gpu/drm/i915/gvt/gvt.h
> >> >> +++ b/drivers/gpu/drm/i915/gvt/gvt.h
> >> >> @@ -185,8 +185,11 @@ struct intel_vgpu {
> >> >>  		struct kvm *kvm;
> >> >>  		struct work_struct release_work;
> >> >>  		atomic_t released;
> >> >> +		struct vfio_device *vfio_device;
> >> >>  	} vdev;
> >> >>  #endif
> >> >> +	int dmabuf_mgr_fd;
> >> >> +	struct list_head dmabuf_obj_list_head;
> >> >>  };
> >> >>
> >> >>  struct intel_gvt_gm {
> >> >> @@ -467,6 +470,8 @@ struct intel_gvt_ops {
> >> >>  	void (*vgpu_reset)(struct intel_vgpu *);
> >> >>  	void (*vgpu_activate)(struct intel_vgpu *);
> >> >>  	void (*vgpu_deactivate)(struct intel_vgpu *);
> >> >> +	int (*vgpu_query_plane)(struct intel_vgpu *vgpu, void *);
> >> >> +	int (*vgpu_create_dmabuf)(struct intel_vgpu *vgpu, void *);
> >> >>  };
> >> >>
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c
> >> >> b/drivers/gpu/drm/i915/gvt/kvmgt.c
> >> >> index 389f072..a079080 100644
> >> >> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> >> >> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> >> >> @@ -41,6 +41,7 @@
> >> >>  #include
> >> >>  #include
> >> >>  #include
> >> >> +#include
> >> >>
> >> >>  #include "i915_drv.h"
> >> >>  #include "gvt.h"
> >> >> @@ -524,6 +525,125 @@ static int intel_vgpu_reg_init_opregion(struct intel_vgpu *vgpu)
> >> >>  	return ret;
> >> >>  }
> >> >>
> >> >> +static int kvmgt_get_vfio_device(struct intel_vgpu *vgpu)
> >> >> +{
> >> >> +	struct vfio_device *device;
> >> >> +
> >> >> +	device = vfio_device_get_from_dev(mdev_dev(vgpu->vdev.mdev));
> >> >> +	if (device == NULL)
> >> >> +		return -ENODEV;
> >> >> +	vgpu->vdev.vfio_device = device;
> >> >> +
> >> >> +	return 0;
> >> >> +}
> >> >> +
> >> >> +static void kvmgt_put_vfio_device(struct intel_vgpu *vgpu)
> >> >> +{
> >> >> +	vfio_device_put(vgpu->vdev.vfio_device);
> >> >> +}
> >> >> +
> >> >> +static int intel_vgpu_dmabuf_mgr_fd_mmap(struct file *file,
> >> >> +			struct vm_area_struct *vma)
> >> >> +{
> >> >> +	return -EPERM;
> >> >> +}
> >> >> +
> >> >> +static int intel_vgpu_dmabuf_mgr_fd_release(struct inode *inode,
> >> >> +			struct file *filp)
> >> >> +{
> >> >> +	struct intel_vgpu *vgpu = filp->private_data;
> >> >> +	struct intel_vgpu_dmabuf_obj *obj;
> >> >> +	struct list_head *pos;
> >> >> +
> >> >> +	if (WARN_ON(!vgpu->vdev.vfio_device))
> >> >> +		return -ENODEV;
> >> >> +
> >> >> +	list_for_each(pos, &vgpu->dmabuf_obj_list_head) {
> >> >> +		obj = container_of(pos, struct intel_vgpu_dmabuf_obj, list);
> >> >> +		if (WARN_ON(!obj))
> >> >> +			return -ENODEV;
> >> >> +		kfree(obj->obj->gvt_info);
> >> >> +		i915_gem_object_put(obj->obj);
> >> >> +		kfree(obj);
> >> >> +		kvmgt_put_vfio_device(vgpu);
> >> >
> >> >Can we do this?  If I understand, we're releasing all the references
> >> >and allocations for the dmabuf fds on release of the manager fd.
> >> >What happens if the user continues trying to access those dmabuf fds
> >> >after this?
> >> I think we can do this here.
> >> The dma-buf's release function dma_buf_release() will be called by the kernel,
> >> which means all the created dmabufs will become invalid even if we do not
> >> release all the references and allocations here.
> >
> >Are you assuming that the user has closed the dmabuf fds?  They could close the
> >manager fd first, should the dmabuf fd continue to work?
> If the guest VM is shut down it is fine: the system will release the allocated fds, including the management fd and the dmabuf fds.
> But if the user deliberately calls close() on the management fd, there are problems with the current implementation.
>
> Usually the vendor of a dma-buf defines its own dma_buf_ops (map, unmap, release, ...) so the vendor can release the references and allocations in the release callback.
> The calling sequence is: dma-buf framework's release() -> vendor's dma-buf release() (i915's in our case).
>
> But in our case we did not create the dma-buf from scratch; we call an existing i915 function, i915_gem_prime_export(), to create a dma-buf which uses i915's dma_buf_ops.
> In order to use this function we must create a gem object first (the reference count of the gem object is now 1!).
>
> When i915's dma-buf release() callback is called it will try to free the gem object associated with the dma-buf if its ref count is 0. But in our case the ref count is 1, so the free callback is not invoked and we cannot release the allocations there.
> So we have to release our dma-buf related allocations in the management fd's release callback, which means the dma-bufs' life cycle must be the same as the management fd's.
>
> The reason we call i915_gem_prime_export() to create a dma-buf is that we do not need to change i915's dma-buf framework.
>
> So either: 1) we keep the current implementation, in which the dma-bufs have the same lifecycle as the management fd and the user must not close the management fd first, or
> 2) we create the dma-buf from scratch, which requires changing the dma-buf infrastructure of i915.

We cannot simply say that the user isn't allowed to release them in
that order.  If they do, how does it break, and what might mmaps of
those dmabufs give the user access to after the underlying object is
removed?  If the policy is that the management fd cannot be released
until the dmabufs are closed, then that needs to be actively enforced.
Otherwise it needs to be supported.  Requiring changes to dmabuf
handling in the i915 code isn't an excuse IMO.  Thanks,

Alex

> >> >> +	}
> >> >> +	kvmgt_put_vfio_device(vgpu);
> >> >> +
> >> >> +	return 0;
> >> >> +}
> >> >> +
> >> >> +static long intel_vgpu_dmabuf_mgr_fd_ioctl(struct file *filp,
> >> >> +			unsigned int ioctl, unsigned long arg)
> >> >> +{
> >> >> +	struct intel_vgpu *vgpu = filp->private_data;
> >> >> +	int minsz;
> >> >> +	int ret;
> >> >> +	struct fd f;
> >> >> +
> >> >> +	f = fdget(vgpu->dmabuf_mgr_fd);
> >> >> +	if (!f.file)
> >> >> +		return -EBADF;
> >> >> +
> >> >> +	if (ioctl == VFIO_DEVICE_QUERY_PLANE) {
> >> >> +		struct vfio_vgpu_plane_info info;
> >> >> +
> >> >> +		minsz = offsetofend(struct vfio_vgpu_plane_info, resv);
> >> >> +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
> >> >> +			fdput(f);
> >> >> +			return -EFAULT;
> >> >> +		}
> >> >> +		if (info.argsz < minsz) {
> >> >> +			fdput(f);
> >> >> +			return -EINVAL;
> >> >> +		}
> >> >> +		ret = intel_gvt_ops->vgpu_query_plane(vgpu, &info);
> >> >> +		if (ret != 0) {
> >> >> +			fdput(f);
> >> >> +			gvt_vgpu_err("query plane failed:%d\n", ret);
> >> >> +			return -EINVAL;
> >> >> +		}
> >> >> +		fdput(f);
> >> >> +		return copy_to_user((void __user *)arg, &info, minsz) ?
> >> >> +			-EFAULT : 0;
> >> >> +	} else if (ioctl == VFIO_DEVICE_CREATE_DMABUF) {
> >> >> +		struct vfio_vgpu_dmabuf_info dmabuf;
> >> >> +
> >> >> +		minsz = offsetofend(struct vfio_vgpu_dmabuf_info, resv);
> >> >> +		if (copy_from_user(&dmabuf, (void __user *)arg, minsz)) {
> >> >> +			fdput(f);
> >> >> +			return -EFAULT;
> >> >> +		}
> >> >> +		if (dmabuf.argsz < minsz) {
> >> >> +			fdput(f);
> >> >> +			return -EINVAL;
> >> >> +		}
> >> >> +		ret = kvmgt_get_vfio_device(vgpu);
> >> >> +		if (ret != 0)
> >> >> +			return ret;
> >> >
> >> >Missed an fdput, though I'm not sure I understand the value of the
> >> >original fdget or the dmabuf_mgr_fd field at all.  dmabuf_mgr_fd is
> >> >only used here, presumably to add a reference to the fd while we're
> >> >in the ioctl, but we're in the ioctl function of that fd, so I think
> >> >there are already references elsewhere.
> >> Makes sense.  Fdget/fdput can be removed.
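Stepping back to the lifecycle question above: one way the "actively enforced" option could look, purely as an illustration and not part of the posted patch, is to have each exported dma-buf pin the manager's struct file, so the manager's release callback cannot run while any dma-buf created through it is still open. The mgr_filp field and the intel_vgpu_dmabuf_obj_destroy() hook below are hypothetical, and list removal/locking are omitted; notably, it still presupposes a per-buffer release notification, which is exactly the piece the posted patch lacks when reusing i915's dma_buf_ops.

#include <linux/file.h>		/* get_file(), fput() */

struct intel_vgpu_dmabuf_obj {
	struct drm_i915_gem_object *obj;
	struct file *mgr_filp;		/* pinned reference on the manager fd's file */
	struct list_head list;
};

/* Called when a dma-buf is created through the manager fd. */
static void intel_vgpu_dmabuf_obj_pin_mgr(struct intel_vgpu_dmabuf_obj *dmabuf_obj,
					  struct file *mgr_filp)
{
	/*
	 * Holding a file reference defers intel_vgpu_dmabuf_mgr_fd_release()
	 * until every dma-buf created through the manager has dropped it,
	 * even if userspace close()s the manager fd first.
	 */
	dmabuf_obj->mgr_filp = get_file(mgr_filp);
}

/* Hypothetical per-dma-buf teardown hook, run when the dma-buf itself goes away. */
static void intel_vgpu_dmabuf_obj_destroy(struct intel_vgpu_dmabuf_obj *dmabuf_obj)
{
	i915_gem_object_put(dmabuf_obj->obj);
	fput(dmabuf_obj->mgr_filp);	/* may now allow the manager release to run */
	kfree(dmabuf_obj);
}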
> >> >
> >> >> +
> >> >> +		ret = intel_gvt_ops->vgpu_create_dmabuf(vgpu, &dmabuf);
> >> >> +		if (ret != 0) {
> >> >> +			kvmgt_put_vfio_device(vgpu);
> >> >> +			fdput(f);
> >> >> +			return -EINVAL;
> >> >
> >> >Why not return the errno that vgpu_create_dmabuf provided?
> >> Will change to use the returned errno.
> >>
> >> >
> >> >> +		}
> >> >> +		fdput(f);
> >> >> +		return copy_to_user((void __user *)arg, &dmabuf, minsz) ?
> >> >> +			-EFAULT : 0;
> >> >> +	}
> >> >> +
> >> >> +	fdput(f);
> >> >> +	gvt_vgpu_err("unsupported dmabuf operation\n");
> >> >> +
> >> >> +	return -EINVAL;
> >> >> +}
> >> >> +
> >> >> +static const struct file_operations intel_vgpu_dmabuf_mgr_fd_ops = {
> >> >> +	.release	= intel_vgpu_dmabuf_mgr_fd_release,
> >> >> +	.unlocked_ioctl	= intel_vgpu_dmabuf_mgr_fd_ioctl,
> >> >> +	.mmap		= intel_vgpu_dmabuf_mgr_fd_mmap,
> >> >> +	.llseek		= noop_llseek,
> >> >> +};
> >> >>  static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev)
> >> >>  {
> >> >>  	struct intel_vgpu *vgpu = NULL;
> >> >> @@ -1259,6 +1379,30 @@ static long intel_vgpu_ioctl(struct mdev_device *mdev, unsigned int cmd,
> >> >>  	} else if (cmd == VFIO_DEVICE_RESET) {
> >> >>  		intel_gvt_ops->vgpu_reset(vgpu);
> >> >>  		return 0;
> >> >> +	} else if (cmd == VFIO_DEVICE_GET_FD) {
> >> >> +		int fd;
> >> >> +		u32 type;
> >> >> +		int ret;
> >> >> +
> >> >> +		if (copy_from_user(&type, (void __user *)arg, sizeof(type)))
> >> >> +			return -EINVAL;
> >> >> +		if (type != VFIO_DEVICE_DMABUF_MGR_FD)
> >> >> +			return -EINVAL;
> >> >> +
> >> >> +		ret = kvmgt_get_vfio_device(vgpu);
> >> >> +		if (ret != 0)
> >> >> +			return ret;
> >> >> +
> >> >> +		fd = anon_inode_getfd("intel-vgpu-dmabuf-mgr-fd",
> >> >> +				&intel_vgpu_dmabuf_mgr_fd_ops,
> >> >> +				vgpu, O_RDWR | O_CLOEXEC);
> >> >> +		if (fd < 0) {
> >> >> +			gvt_vgpu_err("create dmabuf mgr fd failed\n");
> >> >> +			return -EINVAL;
> >> >
> >> >Error path leaks vfio_device reference.
> >> Will correct in the next version.
> >>
> >> >
> >> >> +		}
> >> >> +		vgpu->dmabuf_mgr_fd = fd;
> >> >
> >> >As above, unclear value of this field, additionally, what if the user
> >> >calls VFIO_DEVICE_GET_FD more than once?
> >> Ah, good question.
> >> VFIO_DEVICE_GET_FD should only be called once.
> >> And we should add a check: if vgpu->dmabuf_mgr_fd is not 0, which means
> >> VFIO_DEVICE_GET_FD has been called before, we should return an error.
> >
> >Except we no longer need that fd, and we should probably use an atomic 'opened'
> >so it's not racy.  Thanks,
> >
> >Alex
> >
> >> >> +
> >> >> +		return fd;
> >> >> +	}
> >> >>
> >> >>  	return 0;
> >> >> diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c
> >> >> b/drivers/gpu/drm/i915/gvt/vgpu.c
> >> >> index 6e3cbd8..af6fc74 100644
> >> >> --- a/drivers/gpu/drm/i915/gvt/vgpu.c
> >> >> +++ b/drivers/gpu/drm/i915/gvt/vgpu.c
> >> >> @@ -346,6 +346,7 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
> >> >>  	vgpu->gvt = gvt;
> >> >>  	vgpu->sched_ctl.weight = param->weight;
> >> >>  	bitmap_zero(vgpu->tlb_handle_pending, I915_NUM_ENGINES);
> >> >> +	INIT_LIST_HEAD(&vgpu->dmabuf_obj_list_head);
> >> >>
> >> >>  	intel_vgpu_init_cfg_space(vgpu, param->primary);
> >> >>
> >> >
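To make the last point concrete, here is a minimal sketch of the "atomic 'opened'" idea for the VFIO_DEVICE_GET_FD hunk, folding in the other feedback on it (return the real errno, drop the vfio_device reference on the error path). The dmabuf_mgr_opened field is a hypothetical atomic_t that would replace dmabuf_mgr_fd in struct intel_vgpu; this is not the posted patch, only one way a next revision could look.

	} else if (cmd == VFIO_DEVICE_GET_FD) {
		int fd;
		u32 type;
		int ret;

		if (copy_from_user(&type, (void __user *)arg, sizeof(type)))
			return -EFAULT;
		if (type != VFIO_DEVICE_DMABUF_MGR_FD)
			return -EINVAL;

		/* Only one manager fd per vGPU; cmpxchg keeps the check race-free. */
		if (atomic_cmpxchg(&vgpu->dmabuf_mgr_opened, 0, 1) != 0)
			return -EBUSY;

		ret = kvmgt_get_vfio_device(vgpu);
		if (ret != 0) {
			atomic_set(&vgpu->dmabuf_mgr_opened, 0);
			return ret;
		}

		fd = anon_inode_getfd("intel-vgpu-dmabuf-mgr-fd",
				      &intel_vgpu_dmabuf_mgr_fd_ops,
				      vgpu, O_RDWR | O_CLOEXEC);
		if (fd < 0) {
			/* Don't leak the vfio_device reference on the error path. */
			kvmgt_put_vfio_device(vgpu);
			atomic_set(&vgpu->dmabuf_mgr_opened, 0);
			return fd;
		}

		return fd;
	}

Whether the manager fd should become re-openable after release (i.e. whether the release callback resets the flag) ties back to the lifecycle discussion earlier in the thread.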