Date: Mon, 6 Aug 2018 15:05:08 +0100
From: Will Deacon
To: "Michael S. Tsirkin"
Cc: Benjamin Herrenschmidt, Christoph Hellwig, Anshuman Khandual,
    virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, aik@ozlabs.ru, robh@kernel.org,
    joe@perches.com, elfring@users.sourceforge.net,
    david@gibson.dropbear.id.au, jasowang@redhat.com, mpe@ellerman.id.au,
    linuxram@us.ibm.com, haren@linux.vnet.ibm.com, paulus@samba.org,
    srikar@linux.vnet.ibm.com, robin.murphy@arm.com,
    jean-philippe.brucker@arm.com, marc.zyngier@arm.com
Subject: Re: [RFC 0/4] Virtio uses DMA API for all devices
Message-ID: <20180806140507.GB15078@arm.com>
In-Reply-To: <20180805032504-mutt-send-email-mst@kernel.org>
References: <20180720035941.6844-1-khandual@linux.vnet.ibm.com>
    <20180727095804.GA25592@arm.com>
    <20180730093414.GD26245@infradead.org>
    <20180730125100-mutt-send-email-mst@kernel.org>
    <20180730111802.GA9830@infradead.org>
    <20180730155633-mutt-send-email-mst@kernel.org>
    <20180731173052.GA17153@infradead.org>
    <3d6e81511571260de1c8047aaffa8ac4df093d2e.camel@kernel.crashing.org>
    <20180801081637.GA14438@arm.com>
    <20180805032504-mutt-send-email-mst@kernel.org>

Hi Michael,

On Sun, Aug 05, 2018 at 03:27:42AM +0300, Michael S. Tsirkin wrote:
> On Wed, Aug 01, 2018 at 09:16:38AM +0100, Will Deacon wrote:
> > On Tue, Jul 31, 2018 at 03:36:22PM -0500, Benjamin Herrenschmidt wrote:
> > > On Tue, 2018-07-31 at 10:30 -0700, Christoph Hellwig wrote:
> > > > > However the question people raise is that DMA API is already full of
> > > > > arch-specific tricks the likes of which are outlined in your post linked
> > > > > above. How is this one much worse?
> > > >
> > > > None of these warts is visible to the driver, they are all handled in
> > > > the architecture (possibly on a per-bus basis).
> > > >
> > > > So for virtio we really need to decide if it has one set of behavior
> > > > as specified in the virtio spec, or if it behaves exactly as if it
> > > > was on a PCI bus, or in fact probably both as you lined up. But no
> > > > magic arch specific behavior inbetween.
> > >
> > > The only arch specific behaviour is needed in the case where it doesn't
> > > behave like PCI. In this case, the PCI DMA ops are not suitable, but in
> > > our secure VMs, we still need to make it use swiotlb in order to bounce
> > > through non-secure pages.
> >
> > On arm/arm64, the problem we have is that legacy virtio devices on the MMIO
> > transport (so definitely not PCI) have historically been advertised by qemu
> > as not being cache coherent, but because the virtio core has bypassed DMA
> > ops then everything has happened to work. If we blindly enable the arch DMA
> > ops, we'll plumb in the non-coherent ops and start getting data corruption,
> > so we do need a way to quirk virtio as being "always coherent" if we want to
> > use the DMA ops (which we do, because our emulation platforms have an IOMMU
> > for all virtio devices).
> >
> > Will
>
> Right that's not very different from placing the device within the IOMMU
> domain but in fact bypassing the IOMMU

Hmm, I'm not sure I follow you here -- the IOMMU bypassing is handled inside
the IOMMU driver, so we'd still end up with non-coherent DMA ops for the guest
accesses. The presence of an IOMMU doesn't imply coherency for us. Or am I
missing your point here?

> I wonder whether anyone ever needs a non coherent virtio-mmio. If yes we
> can extend PLATFORM_IOMMU to cover that or add another bit.

I think that's probably the right way around: assume that legacy virtio-mmio
devices are coherent by default.

> What exactly do the non-coherent ops do that causes the corruption?

The non-coherent ops mean that the guest ends up allocating the vring queues
using non-cacheable mappings, whereas qemu/hypervisor uses a cacheable mapping
despite not advertising the devices as being cache-coherent. This hits
something in the architecture known as "mismatched aliases", which means that
coherency is lost between the guest and the hypervisor, consequently resulting
in data not being visible and ordering not being guaranteed. The usual symptom
is that the device appears to lock up iirc, because the guest and the
hypervisor are unable to communicate with each other.

Does that help to clarify things?

Thanks,

Will