Message-ID: <6c707d6d33ac25a42265c2e9b521c2416d72c739.camel@kernel.crashing.org>
Subject: Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt
To: Christoph Hellwig
Cc: "Michael S. Tsirkin", Will Deacon, Anshuman Khandual,
    virtualization@lists.linux-foundation.org,
    linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    aik@ozlabs.ru, robh@kernel.org, joe@perches.com,
    elfring@users.sourceforge.net, david@gibson.dropbear.id.au,
    jasowang@redhat.com, mpe@ellerman.id.au, linuxram@us.ibm.com,
    haren@linux.vnet.ibm.com, paulus@samba.org,
    srikar@linux.vnet.ibm.com, robin.murphy@arm.com,
    jean-philippe.brucker@arm.com, marc.zyngier@arm.com
Date: Tue, 07 Aug 2018 05:52:12 +1000
In-Reply-To: <20180806094243.GA16032@infradead.org>
References: <20180802225738-mutt-send-email-mst@kernel.org>
    <20180803070507.GA1344@infradead.org>
    <20180803160246.GA13794@infradead.org>
    <22310f58605169fe9de83abf78b59f593ff7fbb7.camel@kernel.crashing.org>
    <20180804082120.GB4421@infradead.org>
    <20180805072930.GB23288@infradead.org>
    <20180806094243.GA16032@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2018-08-06 at 02:42 -0700, Christoph Hellwig wrote:
> On Mon, Aug 06, 2018 at 07:16:47AM +1000, Benjamin Herrenschmidt wrote:
> > Who would set this bit ? qemu ? Under what circumstances ?
>
> I don't really care who sets what. The implementation might not even
> involve qemu.
>
> It is your job to write a coherent interface specification that does
> not depend on the used components. The hypervisor might be PAPR,
> Linux + qemu, VMware, Hyperv or something so secret that you'd have
> to shoot me if you had to tell me. The guest might be Linux, FreeBSD,
> AIX, OS400 or a Hipster project of the day in Rust. As long as we
> properly specify the interface it simply does not matter.

That's the point, Christoph. The interface is today's interface. It does
NOT change. That information is not part of the interface.
It's the VM itself that is stashing away its memory in a secret place,
and thus needs to do bounce buffering. There is no change to the virtio
interface per se.

> > What would be the effect of this bit while VIRTIO_F_IOMMU is NOT set,
> > ie, what would qemu do and what would Linux do ? I'm not sure I fully
> > understand your idea.
>
> In a perfect world we'd just reuse VIRTIO_F_IOMMU and clarify the
> description which currently is rather vague but basically captures
> the use case. Currently it is:
>
> VIRTIO_F_IOMMU_PLATFORM(33)
>     This feature indicates that the device is behind an IOMMU that
>     translates bus addresses from the device into physical addresses in
>     memory. If this feature bit is set to 0, then the device emits
>     physical addresses which are not translated further, even though an
>     IOMMU may be present.
>
> And I'd change it to something like:
>
> VIRTIO_F_PLATFORM_DMA(33)
>     This feature indicates that the device emits platform specific
>     bus addresses that might not be identical to physical addresses.
>     The translation of physical to bus address is platform specific
>     and defined by the platform specification for the bus that the virtio
>     device is attached to.
>     If this feature bit is set to 0, then the device emits
>     physical addresses which are not translated further, even if
>     the platform would normally require translations for the bus that
>     the virtio device is attached to.
>
> If we can't change the definition any more we should deprecate the
> old VIRTIO_F_IOMMU_PLATFORM bit, and require the VIRTIO_F_IOMMU_PLATFORM
> and VIRTIO_F_PLATFORM_DMA to be not set at the same time.

But this doesn't really change our problem, does it ?

None of what happens in our case is part of the "interface". The
suggestion to force the iommu ON was simply a "workaround": by doing
so, we get to override the DMA ops, but that's just a trick.

Fundamentally, what we need to solve is pretty much entirely a guest
problem.

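As an illustration only, the distinction drawn above can be sketched in C.
This is not the actual kernel code: `struct guest_state`, its
`memory_protected` field and `guest_must_use_dma_api()` are hypothetical
names invented for this sketch. Only the bit number 33 comes from the
spec text quoted above. The point it tries to show is that the second
condition never appears in the negotiated feature bits, so the
hypervisor never sees it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number taken from the spec text quoted in this thread. */
#define VIRTIO_F_IOMMU_PLATFORM 33

/*
 * Hypothetical guest-local state. Nothing in here is negotiated with
 * the device or the hypervisor.
 */
struct guest_state {
	/* The guest has made most of its pages inaccessible to the host. */
	bool memory_protected;
};

/* Test one bit of the negotiated 64-bit virtio feature word. */
static bool feature_negotiated(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1ULL;
}

/*
 * Hypothetical helper: should the guest route virtio buffers through
 * its DMA layer (and hence through bounce buffers) instead of handing
 * guest physical addresses straight to the device?
 */
static bool guest_must_use_dma_api(uint64_t features,
				   const struct guest_state *gs)
{
	/* Interface-level reason: the device asked for translation. */
	if (feature_negotiated(features, VIRTIO_F_IOMMU_PLATFORM))
		return true;

	/* Guest-local reason: invisible to the virtio interface. */
	return gs->memory_protected;
}
```

With no feature bits negotiated, a guest with `memory_protected` set
would still take the DMA path in this sketch, while an ordinary guest
would not.
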
> > I'm trying to understand because the limitation is not a device side
> > limitation, it's not a qemu limitation, it's actually more of a VM
> > limitation. It has most of its memory pages made inaccessible for
> > security reasons. The platform from a qemu/KVM perspective is almost
> > entirely normal.
>
> Well, find a way to describe this either in the qemu specification using
> new feature bits, or by using something like the above.

But again, why do you want to involve the interface, and thus the
hypervisor, for something that is essentially what the guest is doing to
itself ?

It really is something we need to solve locally to the guest, it's not
part of the interface.

Cheers,
Ben.