Message-ID: <2f1d48cf029c1f0903f3cffea946ae5b85f60ec0.camel@kernel.crashing.org>
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices
From: Benjamin Herrenschmidt
To: "Michael S. Tsirkin"
Cc: Anshuman Khandual, virtualization@lists.linux-foundation.org,
    linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    aik@ozlabs.ru, robh@kernel.org, joe@perches.com,
    elfring@users.sourceforge.net, david@gibson.dropbear.id.au,
    jasowang@redhat.com, mpe@ellerman.id.au, hch@infradead.org
Date: Tue, 29 May 2018 09:56:24 +1000
In-Reply-To: <6fff9f5d67361653e6072570a857cf0d1009a123.camel@kernel.crashing.org>
References: <20180522063317.20956-1-khandual@linux.vnet.ibm.com>
    <20180523213703-mutt-send-email-mst@kernel.org>
    <20180525202300-mutt-send-email-mst@kernel.org>
    <6fff9f5d67361653e6072570a857cf0d1009a123.camel@kernel.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2018-05-29 at 09:48 +1000, Benjamin Herrenschmidt wrote:
> > Well it's not supposed to be much slower for the static case.
> >
> > vhost has a cache so should be fine.
> >
> > A while ago Paolo implemented a translation cache which should be
> > perfect for this case - most of the code got merged but
> > never enabled because of stability issues.
> >
> > If all else fails, we could teach QEMU to handle the no-iommu case
> > as if VIRTIO_F_IOMMU_PLATFORM was off.
>
> Any serious reason why not just getting that 2 line patch allowing our
> arch code to force virtio to use the DMA API ?
>
> It's not particularly invasive and solves our problem rather nicely
> without adding overhead or additional knowledge to qemu/libvirt/mgmnt
> tools etc... that it doesn't need etc....
>
> The guest knows it's going secure so the guest arch code can do the
> right thing rather trivially.
>
> Long term we should probably make virtio always use the DMA API anyway,
> and interpose "1:1" dma_ops for the traditional virtio case, that would
> reduce code clutter significantly.
> In that case, it would become just a
> matter of having a platform hook to override the dma_ops used.

To elaborate a bit...

What we are trying to solve here is entirely a guest problem, so I don't
think involving qemu in the solution is the right thing to do.

The guest can only allow external parties (qemu, potentially PCI devices,
etc.) access to some restricted portions of memory (insecure memory).
Thus the guest needs to do some bounce buffering/swiotlb type tricks.

This is completely orthogonal to whether there is an actual iommu between
the guest and the device (or emulated device/virtio).

This is why I think the solution should reside in the guest kernel, by
proper manipulation (by the arch code) of the dma ops.

I don't think forcing the addition of an emulated iommu in the middle,
just to work around the fact that virtio "cheats" and doesn't use the DMA
API unless there is one, is the right "fix".

The right long term fix is to always use the DMA API, reducing code paths
etc., and just have a single point where virtio can "choose" alternate
DMA ops (via an arch hook to deal with our case).

In the meantime, having the hook we propose gets us going, but if you
agree with the approach, we should also work on the long term approach.

Cheers,
Ben.
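
For illustration only, the long term shape described above (virtio always
going through a set of DMA ops, with one arch hook able to substitute
bounce-buffering ops) might look roughly like the userspace model below.
This is a sketch under stated assumptions, not kernel code and not the
proposed patch: every name in it (struct dma_ops as laid out here,
arch_virtio_dma_ops, virtio_get_dma_ops, virtio_map_buffer, shared_pool,
guest_is_secure) is hypothetical, and the "bounce" path is a plain memcpy
into a static array standing in for swiotlb and insecure memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef uint64_t dma_addr_t;

struct dma_ops {
    const char *name;
    /* Returns the address the device should use, or 0 on failure. */
    dma_addr_t (*map)(const void *buf, size_t len);
};

/* "1:1" ops: the device is handed the buffer's own address. */
static dma_addr_t direct_map(const void *buf, size_t len)
{
    (void)len;
    return (dma_addr_t)(uintptr_t)buf;
}

/* Stand-in for the insecure memory that external parties may access. */
static unsigned char shared_pool[4096];
static size_t shared_used;

/* Bounce ops: copy the data into the shared pool and expose the copy. */
static dma_addr_t bounce_map(const void *buf, size_t len)
{
    if (len > sizeof(shared_pool) - shared_used)
        return 0; /* pool exhausted */
    unsigned char *slot = shared_pool + shared_used;
    memcpy(slot, buf, len);
    shared_used += len;
    return (dma_addr_t)(uintptr_t)slot;
}

static const struct dma_ops direct_ops = { "direct", direct_map };
static const struct dma_ops bounce_ops = { "bounce", bounce_map };

/* Hypothetical arch hook: non-NULL means "override the default ops". */
static bool guest_is_secure;

static const struct dma_ops *arch_virtio_dma_ops(void)
{
    return guest_is_secure ? &bounce_ops : NULL;
}

/* Single point where virtio picks its ops; default is the 1:1 case. */
static const struct dma_ops *virtio_get_dma_ops(void)
{
    const struct dma_ops *ops = arch_virtio_dma_ops();
    return ops ? ops : &direct_ops;
}

/* Every mapping goes through the selected ops, with no special casing. */
static dma_addr_t virtio_map_buffer(const void *buf, size_t len)
{
    const struct dma_ops *ops = virtio_get_dma_ops();
    assert(ops->map != NULL);
    return ops->map(buf, len);
}

/* Helper for checking whether an address points into the shared pool. */
static bool in_shared_pool(dma_addr_t addr)
{
    uintptr_t base = (uintptr_t)shared_pool;
    return addr >= base && addr < base + sizeof(shared_pool);
}
```

With guest_is_secure left false, virtio_map_buffer() hands back the
buffer's own address (the traditional virtio behaviour); once the arch
code sets it, the same call returns an address inside shared_pool, and
nothing on the qemu side has to know about the difference.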