Date: Mon, 9 Apr 2018 10:12:27 +0200
From: Daniel Vetter
To: Matt Roper
Cc: Daniel Vetter, Gerd Hoffmann, Oleksandr Andrushchenko, Dongwon Kim,
    dri-devel, Tomeu Vizoso, David Airlie, open list, qemu-devel@nongnu.org,
    "moderated list:DMA BUFFER SHARING FRAMEWORK",
    "open list:DMA BUFFER SHARING FRAMEWORK"
Subject: Re: [RfC PATCH] Add udmabuf misc device
Message-ID: <20180409081226.GE31310@phenom.ffwll.local>
In-Reply-To: <20180406001117.GD31612@mdroper-desk.amr.corp.intel.com>
References: <20180313154826.20436-1-kraxel@redhat.com>
 <20180313161035.GL4788@phenom.ffwll.local>
 <20180314080301.366zycak3whqvvqx@sirius.home.kraxel.org>
 <20180406001117.GD31612@mdroper-desk.amr.corp.intel.com>

On Thu, Apr 05, 2018 at 05:11:17PM -0700, Matt Roper wrote:
> On Thu, Apr 05, 2018 at 10:32:04PM +0200, Daniel Vetter wrote:
> > Pulling this out of the shadows again.
> >
> > We now also have xen-zcopy from Oleksandr and the hyper dmabuf stuff
> > from Matt and Dongwon.
> >
> > At least from the intel side there seems to be the idea to just have 1
> > special device that can handle cross-guest/host sharing for all kinds
> > of hypervisors, so I guess you all need to work together :-)
> >
> > Or we throw out the idea that hyper dmabuf will be cross-hypervisor
> > (not sure how useful/reasonable that is, someone please convince me
> > one way or the other).
> >
> > Cheers, Daniel
>
> Dongwon (DW) is the one doing all the real work on hyper_dmabuf, but I'm
> familiar with the use cases he's trying to address, and I think there
> are a couple of high-level goals of his work that are worth calling out as
> we discuss the various options for sharing buffers produced in one VM
> with a consumer running in another VM:
>
> * We should try to keep the interface/usage separate from the
>   underlying hypervisor implementation details.  I.e., in DW's design
>   the sink/source drivers that handle the actual buffer passing in the
>   two VM's should provide a generic interface that does not depend on a
>   specific hypervisor.  Behind the scenes there could be various
>   implementations for specific hypervisors (Xen, KVM, ACRN, etc.), and
>   some of those backends may have additional restrictions, but it would
>   be best if userspace didn't have to know the specific hypervisor
>   running on the system and could just query the general capabilities
>   available to it.  We've already got projects in flight that are
>   wanting this functionality on Xen and ACRN today.

Two comments on this:

- Just because it's in drivers/gpu doesn't mean you can't use it for
  anything else. E.g. the xen-zcopy driver can very much be used for any
  dma-buf, there's nothing gpu specific with it - well besides that it
  reuses some useful DRM ioctls, but if that annoys you just do a
  #define TOTALLY_GENERIC DRM and be done :-)

- Especially the kvm memory and hypervisor model seems totally different
  from other hypervisors, e.g. no real use for guest-guest sharing
  (which doesn't go through the host) and other cases. So trying to make
  something 100% generic seems like a bad idea.

Wrt making it generic: Just use generic interfaces - if you can somehow
use xen-front for the display sharing, then a) no need for hyper-dmabuf
and b) it's already fully generic since it looks like a normal drm
device to the guest userspace.

> * The general interface should be able to express sharing from any
>   guest:guest, not just guest:host.  Arbitrary G:G sharing might be
>   something some hypervisors simply aren't able to support, but the
>   userspace API itself shouldn't make assumptions or restrict that.  I
>   think ideally the sharing API would include some kind of
>   query_targets interface that would return a list of VM's that your
>   current OS is allowed to share with; that list would depend on the
>   policy established by the system integrator, but obviously wouldn't
>   include targets that the hypervisor itself wouldn't be capable of
>   handling.

Uh ... has a proper security architect analyzed this idea?
> * A lot of the initial use cases are in the realm of graphics, but this
>   shouldn't be a graphics-specific API.  Buffers might contain other
>   types of content as well (e.g., audio).  Really the content producer
>   could potentially be any driver (or userspace) running in the VM that
>   knows how to import/export dma_buf's (or maybe just import given
>   danvet's suggestion that we should make the sink driver do all the
>   actual memory allocation for any buffers that may be shared).

See above, just because it uses drm ioctls doesn't make it gfx specific.

Otoh making it even more graphics specific might be even better, i.e.
just sharing the backend tech (grant tables or whatever), but having
dedicated front-ends for each use-case so there's less code to type.

> * We need to be able to handle cross-VM coordination of buffer usage as
>   well, so I think we'd want to include fence forwarding support in the
>   API as well to signal back and forth about production/consumption
>   completion.  And of course document really well what should happen
>   if, for example, the entire VM you're sharing with/from dies.

Implicit fencing has been proven to be a bad idea. Please do explicit
passing of dma_fences (plus assorted protocol).
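For the kernel<->userspace half of explicit fencing the existing
sync_file machinery already does the job; roughly like the sketch below.
The cross-VM transport/protocol around it is the part hyper_dmabuf would
still have to define, and the two function names here are made up -
only sync_file_create()/sync_file_get_fence() and the dma_fence calls
are the real, existing APIs:

#include <linux/dma-fence.h>
#include <linux/sync_file.h>
#include <linux/file.h>
#include <linux/fcntl.h>
#include <linux/jiffies.h>
#include <linux/errno.h>

/*
 * Producer side: wrap the "rendering done" fence in a sync_file and hand
 * the fd to userspace, which forwards it alongside the shared buffer.
 * sync_file_create() grabs its own fence reference, the caller keeps its.
 */
static int export_done_fence(struct dma_fence *fence)
{
	struct sync_file *sync = sync_file_create(fence);
	int fd;

	if (!sync)
		return -ENOMEM;

	fd = get_unused_fd_flags(O_CLOEXEC);
	if (fd < 0) {
		fput(sync->file);
		return fd;
	}
	fd_install(fd, sync->file);
	return fd;
}

/*
 * Consumer side: resolve the received fd back into a dma_fence and wait
 * on it (or better, hand it to the consuming driver as an in-fence).
 */
static int wait_for_producer(int fence_fd)
{
	struct dma_fence *fence = sync_file_get_fence(fence_fd);
	long ret;

	if (!fence)
		return -EINVAL;

	ret = dma_fence_wait_timeout(fence, true, msecs_to_jiffies(1000));
	dma_fence_put(fence);
	if (ret < 0)
		return ret;
	return ret == 0 ? -ETIMEDOUT : 0;
}

The "what if the other VM dies" question then becomes making sure any
forwarded fence eventually signals (or gets completed with an error),
which is exactly the kind of protocol detail that needs documenting.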
> * The sharing API could be used to share multiple kinds of content in a
>   single system.  The sharing sink driver running in the content
>   producer's VM should accept some additional metadata that will be
>   passed over to the target VM as well.  The sharing source driver
>   running in the content consumer's VM would then be able to use this
>   metadata to determine the purpose of a new buffer that arrives and
>   filter/dispatch it to the appropriate consumer.

If you want metadata, why not use xen-front or something similar to have
a well-defined means to transfer everything? One of the key design
decisions of dma-buf was to _not_ have metadata, just buffer sharing.
-Daniel

>
>
> For reference, the terminology I'm using is
>  /----------\  dma_buf   /------\  HV  /--------\  dma_buf   /----------\
>  | Producer |----------->| Sink |  HV  | Source |----------->| Consumer |
>  \----------/   ioctls   \------/  HV  \--------/  uevents   \----------/
>
>
> In the realm of graphics, "Producer" could potentially be something like
> an EGL client that sends the buffer at context setup and then signals
> with fences on each SwapBuffers.  "Consumer" could be a Wayland client
> that proxies the buffers into surfaces or dispatches them to other
> userspace software that's waiting for buffers.
>
> With the hyper_dmabuf approach, there are a lot of ABI details that need
> to be worked out and really clearly documented before we worry too much
> about the backend hypervisor-specific stuff.
>
> I'm not super familiar with xen-zcopy and udmabuf, but it sounds like
> they're approaching similar problems from slightly different directions,
> so we should make sure we can come up with something that satisfies
> everyone's requirements.
>
>
> Matt
>
> > On Wed, Mar 14, 2018 at 9:03 AM, Gerd Hoffmann wrote:
> > >   Hi,
> > >
> > >> Either mlock account (because it's mlocked de facto), and get_user_pages
> > >> won't do that for you.
> > >>
> > >> Or you write the full-blown userptr implementation, including mmu_notifier
> > >> support (see i915 or amdgpu), but that also requires Christian Königs
> > >> latest ->invalidate_mapping RFC for dma-buf (since atm exporting userptr
> > >> buffers is a no-go).
> > >
> > > I guess I'll look at mlock accounting for starters then.  Easier for
> > > now, and leaves the door open to switch to userptr later as this should
> > > be transparent to userspace.
> > >
> > >> > Known issue:  Driver API isn't complete yet.  Need to add some flags, for
> > >> > example to support read-only buffers.
> > >>
> > >> dma-buf has no concept of read-only.  I don't think we can even enforce
> > >> that (not many iommus can enforce this iirc), so we pretty much need to
> > >> require r/w memory.
> > >
> > > Ah, ok.  Just saw the 'write' arg for get_user_pages_fast and figured we
> > > might support that, but if iommus can't handle that anyway it's
> > > pointless indeed.
> > >
> > >> > Cc: David Airlie
> > >> > Cc: Tomeu Vizoso
> > >> > Signed-off-by: Gerd Hoffmann
> > >>
> > >> btw there's also the hyperdmabuf stuff from the xen folks, but imo their
> > >> solution of forwarding the entire dma-buf api is over the top.  This here
> > >> looks _much_ better, pls cc all the hyperdmabuf people on your next
> > >> version.
> > >
> > > Fun fact: googling for "hyperdmabuf" found me your mail and nothing else :-o
> > > (Trying "hyper dmabuf" instead worked better then).
> > >
> > > Yes, will cc them on the next version.  Not sure it'll help much on xen
> > > though due to the memory management being very different.  Basically xen
> > > owns the memory, not the kernel of the control domain (dom0), so
> > > creating dmabufs for guest memory chunks isn't that simple ...
> > >
> > > Also it's not clear whether they really need guest -> guest exports or
> > > only guest -> dom0 exports.
> > >
> > >> Overall I like the idea, but too lazy to review.
> > >
> > > Cool.  General comments on the idea were all I was looking for for the
> > > moment.  Spare your review cycles for the next version ;)
> > >
> > >> Oh, some kselftests for this stuff would be lovely.
> > >
> > > I'll look into it.
> > >
> > > thanks,
> > >   Gerd
> > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
> --
> Matt Roper
> Graphics Software Engineer
> IoTG Platform Enabling & Development
> Intel Corporation
> (916) 356-2795

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch