Date: Wed, 24 Apr 2019 21:18:54 -0400
From: "Michael S. Tsirkin"
To: Thiago Jung Bauermann
Cc: virtualization@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
    iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
    Jason Wang, Christoph Hellwig, David Gibson, Alexey Kardashevskiy,
    Paul Mackerras, Benjamin Herrenschmidt, Ram Pai, Jean-Philippe Brucker,
    Michael Roth, Mike Anderson
Subject: Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
Message-ID: <20190424210813-mutt-send-email-mst@kernel.org>
References: <20190129134750-mutt-send-email-mst@kernel.org>
    <877eefxvyb.fsf@morokweng.localdomain>
    <20190204144048-mutt-send-email-mst@kernel.org>
    <87ef71seve.fsf@morokweng.localdomain>
    <20190320171027-mutt-send-email-mst@kernel.org>
    <87tvfvbwpb.fsf@morokweng.localdomain>
    <20190323165456-mutt-send-email-mst@kernel.org>
    <87a7go71hz.fsf@morokweng.localdomain>
    <20190419190258-mutt-send-email-mst@kernel.org>
    <875zr228zf.fsf@morokweng.localdomain>
In-Reply-To: <875zr228zf.fsf@morokweng.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 24, 2019 at 10:01:56PM -0300, Thiago Jung Bauermann wrote:
>
> Michael S. Tsirkin writes:
>
> > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote:
> >>
> >> Michael S. Tsirkin writes:
> >>
> >> > On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote:
> >> >>
> >> >> Michael S. Tsirkin writes:
> >> >>
> >> >> > On Wed, Mar 20, 2019 at 01:13:41PM -0300, Thiago Jung Bauermann wrote:
> >> >> >> From what I understand of the ACCESS_PLATFORM definition, the host will
> >> >> >> only ever try to access memory addresses that are supplied to it by the
> >> >> >> guest, so all of the secure guest memory that the host cares about is
> >> >> >> accessible:
> >> >> >>
> >> >> >>     If this feature bit is set to 0, then the device has same access to
> >> >> >>     memory addresses supplied to it as the driver has. In particular,
> >> >> >>     the device will always use physical addresses matching addresses
> >> >> >>     used by the driver (typically meaning physical addresses used by the
> >> >> >>     CPU) and not translated further, and can access any address supplied
> >> >> >>     to it by the driver. When clear, this overrides any
> >> >> >>     platform-specific description of whether device access is limited or
> >> >> >>     translated in any way, e.g. whether an IOMMU may be present.
> >> >> >>
> >> >> >> All of the above is true for POWER guests, whether they are secure
> >> >> >> guests or not.
> >> >> >>
> >> >> >> Or are you saying that a virtio device may want to access memory
> >> >> >> addresses that weren't supplied to it by the driver?
> >> >> >
> >> >> > Your logic would apply to IOMMUs as well. For your mode, there are
> >> >> > specific encrypted memory regions that driver has access to but device
> >> >> > does not. that seems to violate the constraint.
> >> >>
> >> >> Right, if there's a pre-configured 1:1 mapping in the IOMMU such that
> >> >> the device can ignore the IOMMU for all practical purposes I would
> >> >> indeed say that the logic would apply to IOMMUs as well. :-)
> >> >>
> >> >> I guess I'm still struggling with the purpose of signalling to the
> >> >> driver that the host may not have access to memory addresses that it
> >> >> will never try to access.
> >> >
> >> > For example, one of the benefits is to signal to host that driver does
> >> > not expect ability to access all memory. If it does, host can
> >> > fail initialization gracefully.
> >>
> >> But why would the ability to access all memory be necessary or even
> >> useful? When would the host access memory that the driver didn't tell it
> >> to access?
> >
> > When I say all memory I mean even memory not allowed by the IOMMU.
>
> Yes, but why? How is that memory relevant?

It's relevant when the driver is not trusted to supply only correct
addresses. The feature was originally designed to support userspace
drivers within guests.

> >> >> >> >> > But the name "sev_active" makes me scared because at least AMD guys who
> >> >> >> >> > were doing the sensible thing and setting ACCESS_PLATFORM
> >> >> >> >>
> >> >> >> >> My understanding is, AMD guest-platform knows in advance that their
> >> >> >> >> guest will run in secure mode and hence sets the flag at the time of VM
> >> >> >> >> instantiation. Unfortunately we don't have that luxury on our platforms.
> >> >> >> >
> >> >> >> > Well you do have that luxury. It looks like that there are existing
> >> >> >> > guests that already acknowledge ACCESS_PLATFORM and you are not happy
> >> >> >> > with how that path is slow. So you are trying to optimize for
> >> >> >> > them by clearing ACCESS_PLATFORM and then you have lost ability
> >> >> >> > to invoke DMA API.
> >> >> >> >
> >> >> >> > For example if there was another flag just like ACCESS_PLATFORM
> >> >> >> > just not yet used by anyone, you would be all fine using that right?
> >> >> >>
> >> >> >> Yes, a new flag sounds like a great idea. What about the definition
> >> >> >> below?
> >> >> >>
> >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_IOMMU This feature has the same meaning as
> >> >> >>     VIRTIO_F_ACCESS_PLATFORM both when set and when not set, with the
> >> >> >>     exception that the IOMMU is explicitly defined to be off or bypassed
> >> >> >>     when accessing memory addresses supplied to the device by the
> >> >> >>     driver. This flag should be set by the guest if offered, but to
> >> >> >>     allow for backward-compatibility device implementations allow for it
> >> >> >>     to be left unset by the guest. It is an error to set both this flag
> >> >> >>     and VIRTIO_F_ACCESS_PLATFORM.
> >> >> >
> >> >> > It looks kind of narrow but it's an option.
> >> >>
> >> >> Great!
> >> >>
> >> >> > I wonder how we'll define what's an iommu though.
> >> >>
> >> >> Hm, it didn't occur to me it could be an issue. I'll try.
> >>
> >> I rephrased it in terms of address translation. What do you think of
> >> this version? The flag name is slightly different too:
> >>
> >>
> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same
> >>     meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set,
> >>     with the exception that address translation is guaranteed to be
> >>     unnecessary when accessing memory addresses supplied to the device
> >>     by the driver. Which is to say, the device will always use physical
> >>     addresses matching addresses used by the driver (typically meaning
> >>     physical addresses used by the CPU) and not translated further. This
> >>     flag should be set by the guest if offered, but to allow for
> >>     backward-compatibility device implementations allow for it to be
> >>     left unset by the guest. It is an error to set both this flag and
> >>     VIRTIO_F_ACCESS_PLATFORM.
> >
> > Thanks, I'll think about this approach. Will respond next week.
>
> Thanks!
>
> >> >> > Another idea is maybe something like virtio-iommu?
> >> >>
> >> >> You mean, have legacy guests use virtio-iommu to request an IOMMU
> >> >> bypass? If so, it's an interesting idea for new guests but it doesn't
> >> >> help with guests that are out today in the field, which don't have a
> >> >> virtio-iommu driver.
> >> >
> >> > I presume legacy guests don't use encrypted memory so why do we
> >> > worry about them at all?
> >>
> >> They don't use encrypted memory, but a host machine will run a mix of
> >> secure and legacy guests. And since the hypervisor doesn't know whether
> >> a guest will be secure or not at the time it is launched, legacy guests
> >> will have to be launched with the same configuration as secure guests.
> >
> > OK and so I think the issue is that hosts generally fail if they set
> > ACCESS_PLATFORM and guests do not negotiate it.
> > So you can not just set ACCESS_PLATFORM for everyone.
> > Is that the issue here?
>
> Yes, that is one half of the issue. The other is that even if hosts
> didn't fail, existing legacy guests wouldn't "take the initiative" of
> not negotiating ACCESS_PLATFORM to get the improved performance. They'd
> have to be modified to do that.

So there's a non-encrypted guest; the hypervisor wants to set
ACCESS_PLATFORM to allow encrypted guests, but that will slow down legacy
guests since their vIOMMU emulation is very slow.

So enabling support for encryption slows down non-encrypted guests.

Not great but not the end of the world, considering even older guests
that don't support ACCESS_PLATFORM are completely broken and you do not
seem to be too worried by that.

For future non-encrypted guests, bypassing the emulated IOMMU when that
emulated IOMMU is very slow might be solvable in some other way, e.g.
with virtio-iommu. Which reminds me, could you look at virtio-iommu as a
solution for some of the issues? Review of that patchset from that POV
would be appreciated.

> --
> Thiago Jung Bauermann
> IBM Linux Technology Center
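
For reference, the rule in the proposed definition quoted above ("It is
an error to set both this flag and VIRTIO_F_ACCESS_PLATFORM") could be
checked roughly as follows. This is only a sketch, not code from the
patch or from any existing driver. Bit 33 is the number the virtio spec
assigns to VIRTIO_F_ACCESS_PLATFORM; the bit number used for
VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION is a made-up placeholder, since
that flag was only proposed in this thread, and all helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assigned by the virtio spec (Linux names it VIRTIO_F_IOMMU_PLATFORM). */
#define VIRTIO_F_ACCESS_PLATFORM 33

/*
 * ASSUMPTION: placeholder bit number. The flag is only a proposal in this
 * thread and has no assigned number anywhere.
 */
#define VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION 62

/* Turn a feature bit number into a 64-bit mask. */
static inline uint64_t feature_mask(unsigned int bit)
{
	assert(bit < 64);
	return UINT64_C(1) << bit;
}

/* True if the given feature bit is set in the feature word. */
static inline bool has_feature(uint64_t features, unsigned int bit)
{
	return (features & feature_mask(bit)) != 0;
}

/*
 * Hypothetical helper: returns false when the driver acknowledged both
 * flags, which the proposed definition calls an error.
 */
static inline bool access_flags_valid(uint64_t negotiated)
{
	return !(has_feature(negotiated, VIRTIO_F_ACCESS_PLATFORM) &&
		 has_feature(negotiated, VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION));
}
```

A device model could call access_flags_valid() on the feature set the
driver writes back and fail initialization gracefully when it returns
false.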