Date: Wed, 4 Dec 2019 14:36:18 +1100
From: David Gibson
To: Alexey Kardashevskiy
Cc: Ram Pai, linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au,
 benh@kernel.crashing.org, paulus@ozlabs.org, mdroth@linux.vnet.ibm.com,
 hch@lst.de, andmike@us.ibm.com, sukadev@linux.vnet.ibm.com, mst@redhat.com,
 ram.n.pai@gmail.com, cai@lca.pw, tglx@linutronix.de, bauerman@linux.ibm.com,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.
Message-ID: <20191204033618.GA5031@umbus.fritz.box>
References: <1575269124-17885-2-git-send-email-linuxram@us.ibm.com>
 <20191203020850.GA12354@oc0525413822.ibm.com>
 <0b56ce3e-6c32-5f3b-e7cc-0d419a61d71d@ozlabs.ru>
 <20191203040509.GB12354@oc0525413822.ibm.com>
 <20191203165204.GA5079@oc0525413822.ibm.com>
 <3a17372a-fcee-efbf-0a05-282ffb1adc90@ozlabs.ru>
 <20191204004958.GB5063@oc0525413822.ibm.com>
 <5963ff32-2119-be7c-d1e5-63457888a73b@ozlabs.ru>
In-Reply-To: <5963ff32-2119-be7c-d1e5-63457888a73b@ozlabs.ru>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 04, 2019 at 12:08:09PM +1100, Alexey Kardashevskiy wrote:
>
>
> On 04/12/2019 11:49, Ram Pai wrote:
> > On Wed, Dec 04, 2019 at 11:04:04AM +1100, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 04/12/2019 03:52, Ram Pai wrote:
> >>> On Tue, Dec 03, 2019 at 03:24:37PM +1100, Alexey Kardashevskiy wrote:
> >>>>
> >>>>
> >>>> On 03/12/2019 15:05, Ram Pai wrote:
> >>>>> On Tue, Dec 03, 2019 at 01:15:04PM +1100, Alexey Kardashevskiy wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 03/12/2019 13:08, Ram Pai wrote:
> >>>>>>> On Tue, Dec 03, 2019 at 11:56:43AM +1100, Alexey Kardashevskiy wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 02/12/2019 17:45, Ram Pai wrote:
> >>>>>>>>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of
> >>>>>>>>> its parameters. One page is dedicated per cpu, for the lifetime of the
> >>>>>>>>> kernel for this purpose. On secure VMs, contents of this page, when
> >>>>>>>>> accessed by the hypervisor, retrieves encrypted TCE entries. Hypervisor
> >>>>>>>>> needs to know the unencrypted entries, to update the TCE table
> >>>>>>>>> accordingly. There is nothing secret or sensitive about these entries.
> >>>>>>>>> Hence share the page with the hypervisor.
> >>>>>>>>
> >>>>>>>> This unsecures a page in the guest in a random place which creates an
> >>>>>>>> additional attack surface which is hard to exploit indeed but
> >>>>>>>> nevertheless it is there.
> >>>>>>>> A safer option would be not to use the
> >>>>>>>> hcall-multi-tce hyperrtas option (which translates FW_FEATURE_MULTITCE
> >>>>>>>> in the guest).
> >>>>>>>
> >>>>>>>
> >>>>>>> Hmm... How do we not use it? AFAICT hcall-multi-tce option gets invoked
> >>>>>>> automatically when IOMMU option is enabled.
> >>>>>>
> >>>>>> It is advertised by QEMU but the guest does not have to use it.
> >>>>>
> >>>>> Are you suggesting that even normal-guest, not use hcall-multi-tce?
> >>>>> or just secure-guest?
> >>>>
> >>>>
> >>>> Just secure.
> >>>
> >>> hmm.. how are the TCE entries communicated to the hypervisor, if
> >>> hcall-multi-tce is disabled?
> >>
> >> Via H_PUT_TCE which updates 1 entry at once (sets or clears).
> >> hcall-multi-tce enables H_PUT_TCE_INDIRECT (512 entries at once) and
> >> H_STUFF_TCE (clearing, up to 4bln at once? many), these are simply an
> >> optimization.
> >
> > Do you still think, secure-VM should use H_PUT_TCE and not
> > H_PUT_TCE_INDIRECT? And normal VM should use H_PUT_TCE_INDIRECT?
> > Is there any advantage of special casing it for secure-VMs.
>
>
> Reducing the amount of insecure memory at random location.

The other approach we could use for that - which would still allow
H_PUT_TCE_INDIRECT - would be to allocate the TCE buffer page from the
same pool that we use for the bounce buffers.  I assume there must
already be some sort of allocator for that?

> > In fact, we could make use of as much optimization as possible.
> >
> >
> >>
> >>>>>> Is not this for pci+swiotlb?
> > ..snip..
> >>>>> This patch is purely to help the hypervisor setup the TCE table, in the
> >>>>> presence of a IOMMU.
> >>>>
> >>>> Then the hypervisor should be able to access the guest pages mapped for
> >>>> DMA and these pages should be made unsecure for this to work. Where/when
> >>>> does this happen?
> >>>
> >>> This happens in the SWIOTLB code. The code to do that is already
> >>> upstream.
> >>>
> >>> The sharing of the pages containing the SWIOTLB bounce buffers is done
> >>> in init_svm() which calls swiotlb_update_mem_attributes() which calls
> >>> set_memory_decrypted(). In the case of pseries, set_memory_decrypted() calls
> >>> uv_share_page().
> >>
> >>
> >> This does not seem enough as when you enforce iommu_platform=on, QEMU
> >> starts accessing virtio buffers via IOMMU so bounce buffers have to be
> >> mapped explicitly, via H_PUT_TCE&co, where does this happen?
> >>
> >
> > I think, it happens at boot time. Every page of the guest memory is TCE
> > mapped, if iommu is enabled. SWIOTLB pages get implicitly TCE-mapped
> > as part of that operation.
>
>
> Ah I see. This works via the huge dma window. Ok, makes sense now.
>
> It just seems like a waste that we could map swiotlb 1:1 via the always
> existing small DMA window but instead we rely on a huge window to map
> these small buffers. This way we are wasting the entire 32bit window and
> most of the huge window. We may fix it in the future (not right now) but
> for now I would still avoid unsecuring additional memory. Thanks,
>
>

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
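
To put rough numbers on the H_PUT_TCE versus H_PUT_TCE_INDIRECT trade-off
discussed in the thread, here is a minimal sketch in Python (not kernel
code). It assumes 8-byte TCE entries and a 4096-byte buffer page, which
reproduces the "512 entries at once" figure quoted above; the constant and
function names are made up for illustration only.

```python
# Illustrative arithmetic only; names and sizes below are assumptions,
# chosen so that one buffer page holds the 512 entries quoted in the thread.
TCE_ENTRY_SIZE = 8       # assumed: bytes per TCE entry
TCE_PAGE_SIZE = 4096     # assumed: size of the per-cpu TCE buffer page
ENTRIES_PER_INDIRECT = TCE_PAGE_SIZE // TCE_ENTRY_SIZE


def hcalls_put_tce(nentries: int) -> int:
    """H_PUT_TCE updates one entry per hcall."""
    return nentries


def hcalls_put_tce_indirect(nentries: int) -> int:
    """H_PUT_TCE_INDIRECT passes up to one page of entries per hcall."""
    # Ceiling division without floating point.
    return -(-nentries // ENTRIES_PER_INDIRECT)


if __name__ == "__main__":
    n = 262144  # e.g. 1 GiB mapped in 4 KiB IOMMU pages (assumed sizes)
    print(hcalls_put_tce(n), hcalls_put_tce_indirect(n))
```

Under these assumptions, mapping 262144 entries takes 262144 H_PUT_TCE
calls but 512 H_PUT_TCE_INDIRECT calls, which is why the multi-TCE hcalls
are described above as an optimization and not a functional requirement.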