Subject: Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt
To: "Michael S.
 Tsirkin"
Cc: Christoph Hellwig, Will Deacon, Anshuman Khandual,
 virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, aik@ozlabs.ru, robh@kernel.org,
 joe@perches.com, elfring@users.sourceforge.net,
 david@gibson.dropbear.id.au, jasowang@redhat.com, mpe@ellerman.id.au,
 linuxram@us.ibm.com, haren@linux.vnet.ibm.com, paulus@samba.org,
 srikar@linux.vnet.ibm.com, robin.murphy@arm.com,
 jean-philippe.brucker@arm.com, marc.zyngier@arm.com
Date: Thu, 02 Aug 2018 16:13:09 -0500
In-Reply-To: <20180802225738-mutt-send-email-mst@kernel.org>
References: <20180730155633-mutt-send-email-mst@kernel.org>
 <20180731173052.GA17153@infradead.org>
 <3d6e81511571260de1c8047aaffa8ac4df093d2e.camel@kernel.crashing.org>
 <20180801081637.GA14438@arm.com>
 <20180801083639.GF26378@infradead.org>
 <26c1d3d50d8e081eed44fe9940fbefed34598cbd.camel@kernel.crashing.org>
 <20180802182959-mutt-send-email-mst@kernel.org>
 <82ccef6ec3d95ee43f3990a4a2d0aea87eb45e89.camel@kernel.crashing.org>
 <20180802200646-mutt-send-email-mst@kernel.org>
 <20180802225738-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.28.4 (3.28.4-1.fc28)
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2018-08-02 at 23:52 +0300, Michael S. Tsirkin wrote:
> > Yes, this is the purpose of Anshuman original patch (I haven't looked
> > at the details of the patch in a while but that's what I told him to
> > implement ;-) :
> >
> >  - Make virtio always use DMA ops to simplify the code path (with a set
> >    of "transparent" ops for legacy)
> >
> >       and
> >
> >  - Provide an arch hook allowing us to "override" those "transparent"
> >    DMA ops with some custom ones that do the appropriate swiotlb gunk.
> >
> > Cheers,
> > Ben.
> >
>
> Right but as I tried to say doing that brings us to a bunch of issues
> with using DMA APIs in virtio.
> Put simply DMA APIs weren't designed for
> guest to hypervisor communication.

I'm not sure I see the problem, see below

> When we do (as is the case with PLATFORM_IOMMU right now) this adds a
> bunch of overhead which we need to get rid of if we are to switch to
> PLATFORM_IOMMU by default. We need to fix that.

So let's differentiate the two problems of having an IOMMU (real or
emulated) which indeed adds overhead etc... and using the DMA API.

At the moment, virtio does this all over the place:

	if (use_dma_api)
		dma_map/alloc_something(...)
	else
		use_pa

The idea of the patch set is to do two, somewhat orthogonal, changes
that together achieve what we want. Let me know where you think there
is "a bunch of issues" because I'm missing it:

 1- Replace the above if/else constructs with just calling the DMA API,
and have virtio, at initialization, hook up its own dma_ops that just
"return pa" (roughly) when the IOMMU stuff isn't used.

This adds an indirect function call to the path that previously didn't
have one (the else case above). Is that a significant/measurable
overhead?

This change stands alone, and imho "cleans" up virtio by avoiding all
that if/else "2 path" and unless it adds a measurable overhead, should
probably be done.

 2- Make virtio use the DMA API with our custom platform-provided
swiotlb callbacks when needed, that is when not using IOMMU *and*
running on a secure VM in our case.

This benefits from -1- by making us just plumb in a different set of
DMA ops we would have cooked up specifically for virtio in our arch
code (or in virtio itself but built arch-conditionally in a separate
file).

Now, -2- doesn't strictly need -1-. We could have just done another
xen-like hack that forces the DMA API "ON" for virtio when running in a
secure VM.
The problem if we do that however is that we also then need the arch
PCI code to make sure it hooks up the virtio PCI devices with the
special "magic" DMA ops that avoid the iommu but still do swiotlb, ie,
not the same as other PCI devices. So it will have to play games such
as checking vendor/device IDs for virtio, checking the IOMMU flag,
etc... from the arch code which really bloody sucks when assigning PCI
DMA ops.

However, if we do it the way we plan here, on top of -1-, with a hook
called from virtio into the arch to "override" the virtio DMA ops, then
we avoid the problem completely: The arch hook would only be called by
virtio if the IOMMU flag is *not* set. IE only when using that special
"hypervisor" iommu bypass. If the IOMMU flag is set, virtio uses normal
PCI dma ops as usual.

That way, we have a very clear semantic: This hook is purely about
replacing those "null" DMA ops that just return PA introduced in -1-
with some arch provided specially cooked up DMA ops for non-IOMMU
virtio that know about the arch special requirements. For us bounce
buffering.

Is there something I'm missing ?

Cheers,
Ben.