Date: Fri, 6 Apr 2018 11:07:47 +0200
From: Gerd Hoffmann <kraxel@redhat.com>
To: Oleksandr Andrushchenko
Cc: Matt Roper, Daniel Vetter, Dongwon Kim, dri-devel, Tomeu Vizoso,
 David Airlie, open list, qemu-devel@nongnu.org,
 "moderated list:DMA BUFFER SHARING FRAMEWORK",
 "open list:DMA BUFFER SHARING FRAMEWORK"
Subject: Re: [RfC PATCH] Add udmabuf misc device
Message-ID: <20180406090747.gwiegu22z4noj23i@sirius.home.kraxel.org>
References: <20180313154826.20436-1-kraxel@redhat.com>
 <20180313161035.GL4788@phenom.ffwll.local>
 <20180314080301.366zycak3whqvvqx@sirius.home.kraxel.org>
 <20180406001117.GD31612@mdroper-desk.amr.corp.intel.com>
 <2411d2c1-33c0-2ba5-67ea-3bb9af5d5ec9@epam.com>
In-Reply-To: <2411d2c1-33c0-2ba5-67ea-3bb9af5d5ec9@epam.com>

  Hi,

> > * The general interface should be able to express sharing from any
> >   guest:guest, not just guest:host.  Arbitrary G:G sharing might be
> >   something some hypervisors simply aren't able to support, but the
> >   userspace API itself shouldn't make assumptions or restrict that.
> >   I think ideally the sharing API would include some kind of
> >   query_targets interface that would return a list of VMs that your
> >   current OS is allowed to share with; that list would depend on the
> >   policy established by the system integrator, but obviously
> >   wouldn't include targets that the hypervisor itself wouldn't be
> >   capable of handling.

> Can you give a use-case for this?  I mean that the system integrator
> is the one who defines which guests/hosts talk to each other, but
> querying means that it is possible that VMs have some sort of
> discovery mechanism, so they can decide on their own whom to connect
> to.

Note that vsock (created by vmware; these days it also has a virtio
transport for kvm) started with support for both guest <=> host and
guest <=> guest communication.  But later on guest <=> guest was
dropped.  As far as I know the reasons were (a) lack of use cases and
(b) security.

So I, likewise, would like to know more details on the use cases you
have in mind here.  Unless we have a compelling use case I'd suggest
to drop the guest <=> guest requirement, as it makes the whole thing
a lot more complex.
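Just to make sure we are talking about the same thing, below is
roughly what I read "query_targets" as.  This is a completely made-up
sketch -- neither the device node nor the ioctl exists anywhere, all
names are invented for illustration only:

/* hypothetical query_targets-style interface, invented names */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>

struct xshare_targets {
	uint32_t count;		/* in: room in ids[]; out: entries filled */
	uint32_t ids[64];	/* out: VMs this guest may share with,
				 *      per the integrator's policy */
};
#define XSHARE_QUERY_TARGETS _IOWR('x', 0x01, struct xshare_targets)

int main(void)
{
	struct xshare_targets t = { .count = 64 };
	int fd = open("/dev/xshare", O_RDWR);	/* hypothetical device */

	if (fd < 0 || ioctl(fd, XSHARE_QUERY_TARGETS, &t) < 0)
		return 1;
	for (uint32_t i = 0; i < t.count; i++)
		printf("may share with vm %u\n", t.ids[i]);
	close(fd);
	return 0;
}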
> > * The sharing API could be used to share multiple kinds of content
> >   in a single system.  The sharing sink driver running in the
> >   content producer's VM should accept some additional metadata that
> >   will be passed over to the target VM as well.

Not sure this should be part of hyper-dmabuf.  A dma-buf is nothing
but a block of data, period.  Therefore protocols with dma-buf support
(wayland for example) typically already send metadata describing the
content, so duplicating that in hyper-dmabuf looks pointless.

> 1. We are targeting ARM, and one of the major requirements for
> buffer sharing is the ability to allocate physically contiguous
> buffers, which gets even more complicated for systems not backed by
> an IOMMU.  So, for some use-cases it is enough to make the buffers
> contiguous in terms of IPA, and sometimes those need to be
> contiguous in terms of PA.

Which pretty much implies the host must do the allocation.

> 2. For Xen we would love to see a UAPI to create a dma-buf from the
> grant references provided, so we can use this generic solution to
> implement zero-copying without breaking the existing Xen protocols.
> This can probably be extended to other hypervisors as well.

I'm not sure we can create something which works on both kvm and xen.
The memory management model is quite different ...

On xen the hypervisor manages all memory.  Guests can allow other
guests to access specific pages (using grant tables).  In theory any
guest <=> guest communication is possible.  In practice it is mostly
guest <=> dom0, because guests access their virtual hardware that
way.  dom0 is the privileged guest which owns any hardware not managed
by xen itself.

Xen guests can ask the hypervisor to update the mapping of guest
physical pages.  They can balloon down (unmap and free pages).  They
can balloon up (ask the hypervisor to map fresh pages).  They can map
pages exported by other guests using grant tables.  xen-zcopy makes
heavy use of this: it balloons down to make room in the guest physical
address space, then maps the exported pages there, and finally
composes a dma-buf.
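In kernel terms that flow boils down to something like the sketch
below.  This is simplified pseudo-driver code, not the actual
xen-zcopy implementation: error handling, cleanup and the dma_buf_ops
are all elided.

/*
 * Simplified sketch of the balloon-down + grant-map + export flow
 * described above.  Not the actual xen-zcopy code.
 */
#include <linux/dma-buf.h>
#include <linux/fcntl.h>
#include <linux/slab.h>
#include <xen/balloon.h>
#include <xen/grant_table.h>

static const struct dma_buf_ops sketch_ops;	/* map/unmap/release elided */

static struct dma_buf *sketch_import(domid_t otherend,
				     grant_ref_t *refs, int count)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
	struct gnttab_map_grant_ref *map;
	struct page **pages;
	int i;

	pages = kcalloc(count, sizeof(*pages), GFP_KERNEL);
	map = kcalloc(count, sizeof(*map), GFP_KERNEL);

	/* 1: balloon down, freeing up guest physical address space */
	alloc_xenballooned_pages(count, pages);

	/* 2: map the pages granted by the other domain into the holes */
	for (i = 0; i < count; i++)
		gnttab_set_map_op(&map[i],
				  (unsigned long)pfn_to_kaddr(page_to_pfn(pages[i])),
				  GNTMAP_host_map, refs[i], otherend);
	gnttab_map_refs(map, NULL, pages, count);

	/* 3: compose a dma-buf backed by the mapped pages */
	exp_info.ops   = &sketch_ops;
	exp_info.size  = count << PAGE_SHIFT;
	exp_info.flags = O_RDWR;
	exp_info.priv  = pages;
	return dma_buf_export(&exp_info);
}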
On kvm qemu manages all guest memory.  qemu also has all guest memory
mapped, so a grant-table like mechanism isn't needed to implement
virtual devices.  qemu can decide how it backs memory for the guest.
qemu propagates the guest memory map to the kvm driver in the linux
kernel.

kvm guests have some control over the guest memory map, for example
they can map pci bars wherever they want in their guest physical
address space by programming the base registers accordingly, but
unlike xen guests they can't ask the host to remap individual pages.

Due to qemu having all guest memory mapped, virtual devices are
typically designed to have the guest allocate resources, then notify
the host where they are located.  This is where the udmabuf idea comes
from: the guest tells the host (qemu) where the gem object is, and
qemu can then create a dmabuf backed by those pages to pass on to
other processes, such as the wayland display server.  Possibly even
without the guest explicitly asking for it, i.e. export the
framebuffer placed by the guest in the (virtual) vga pci memory bar as
a dma-buf.  And I can imagine that this is useful outside
virtualization too.

I fail to see any common ground for xen-zcopy and udmabuf ...

Besides that, the udmabuf idea has its own share of issues, for
example the fork() issue pointed out by Christian König[1].  So I
still need to find something which can work for kvm ...

cheers,
  Gerd

[1] https://www.spinics.net/lists/dri-devel/msg169442.html
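P.S.  To make the udmabuf flow concrete: below is a minimal userspace
sketch of how qemu (or any process owning the pages) could drive such
a device.  The struct and ioctl layout follow the memfd-based variant
of the udmabuf interface; the RfC patch may differ in detail, so
treat the specifics as an assumption rather than the patch's actual
UAPI.

/* sketch only -- layout assumed from the memfd-based udmabuf UAPI */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

struct udmabuf_create {
	uint32_t memfd;
	uint32_t flags;
	uint64_t offset;
	uint64_t size;
};
#define UDMABUF_CREATE _IOW('u', 0x42, struct udmabuf_create)

int main(void)
{
	struct udmabuf_create create = { 0 };
	size_t size = 4 * 4096;	/* stand-in for guest framebuffer pages */
	int devfd, memfd, buffd;

	/* qemu would back guest RAM with a memfd; seal so pages stay put */
	memfd = memfd_create("guest-ram", MFD_ALLOW_SEALING);
	ftruncate(memfd, size);
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	/* turn the pages at [offset, offset+size) into a dma-buf fd */
	devfd = open("/dev/udmabuf", O_RDWR);
	create.memfd  = memfd;
	create.offset = 0;
	create.size   = size;
	buffd = ioctl(devfd, UDMABUF_CREATE, &create);
	if (buffd < 0) {
		perror("UDMABUF_CREATE");
		return 1;
	}

	/* buffd can now be handed to other processes, e.g. a wayland
	 * display server */
	printf("dma-buf fd: %d\n", buffd);
	return 0;
}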