We are happy to announce the second version of the Arm Confidential
Compute Architecture (CCA) support for the Linux stack. The intention is
to seek early feedback in the following areas:
* KVM integration of the Arm CCA;
* KVM UABI for managing the Realms, seeking to generalise the
operations where possible with other Confidential Compute solutions;
* Linux Guest support for Realms.
See the previous RFC[1] for a more detailed overview of Arm's CCA
solution, or visit the Arm CCA Landing page[2].
This series is based on the final RMM v1.0 (EAC5) specification[3].
Quick-start guide
=================
The easiest way of getting started with the stack is by using
Shrinkwrap[4]. Currently Shrinkwrap has a configuration for the initial
v1.0-EAC5 release[5], so the following overlay needs to be applied to
the standard 'cca-3world.yaml' file. Note that the 'rmm' component needs
updating to 'main' because there are fixes that are needed and are not
yet in a tagged release. The following will create an overlay file and
build a working environment:
cat<<EOT >cca-v2.yaml
build:
  linux:
    repo:
      revision: cca-full/v2
  kvmtool:
    repo:
      kvmtool:
        revision: cca/v2
  rmm:
    repo:
      revision: main
  kvm-unit-tests:
    repo:
      revision: cca/v2
EOT
shrinkwrap build cca-3world.yaml --overlay buildroot.yaml --btvar GUEST_ROOTFS='${artifact:BUILDROOT}' --overlay cca-v2.yaml
You will then want to modify the 'guest-disk.img' to include the files
necessary for the realm guest (see the documentation in cca-3world.yaml
for details of other options):
cd ~/.shrinkwrap/package/cca-3world
/sbin/e2fsck -fp rootfs.ext2
/sbin/resize2fs rootfs.ext2 256M
mkdir mnt
sudo mount rootfs.ext2 mnt/
sudo mkdir mnt/cca
sudo cp guest-disk.img KVMTOOL_EFI.fd lkvm Image mnt/cca/
sudo umount mnt
rmdir mnt/
Finally you can run the FVP with the host:
shrinkwrap run cca-3world.yaml --rtvar ROOTFS=$HOME/.shrinkwrap/package/cca-3world/rootfs.ext2
And once the host kernel has booted, log in (user name 'root') and start
a realm guest:
cd /cca
./lkvm run --realm --restricted_mem -c 2 -m 256 -k Image -p earlycon
Be patient and you should end up in a realm guest with the host's
filesystem mounted via p9.
It's also possible to use EFI within the realm guest; again, see
cca-3world.yaml within Shrinkwrap for more details.
A branch of kvm-unit-tests including realm-specific tests is provided
here:
https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca/-/tree/cca/v2
[1] Previous RFC
https://lore.kernel.org/r/20230127112248.136810-1-suzuki.poulose%40arm.com
[2] Arm CCA Landing page (See Key Resources section for various documentation)
https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
[3] RMM v1.0-EAC5 specification
https://developer.arm.com/documentation/den0137/1-0eac5/
[4] Shrinkwrap
https://git.gitlab.arm.com/tooling/shrinkwrap
[5] Linux support for Arm CCA RMM v1.0-EAC5
https://lore.kernel.org/r/fb259449-026e-4083-a02b-f8a4ebea1f87%40arm.com
This series adds support for running Linux in a protected VM under the
Arm Confidential Compute Architecture (CCA). The purpose of this series
is to gather feedback on the proposed changes to the architecture code
for CCA.
The ABI to the RMM from a realm (the RSI) is based on the final RMM v1.0
(EAC5) specification[1].
This series is based on v6.9-rc1. It is also available as a git
repository:
https://gitlab.arm.com/linux-arm/linux-cca cca-guest/v2
Introduction
============
A more general introduction to Arm CCA is available on the Arm
website[2], and links to the other components involved are available in
the overall cover letter.
Arm Confidential Compute Architecture adds two new 'worlds' to the
architecture: Root and Realm. A new software component known as the RMM
(Realm Management Monitor) runs in Realm EL2 and is trusted by both the
Normal World and VMs running within Realms. This enables mutual
distrust between the Realm VMs and the Normal World.
Virtual machines running within a Realm can decide on a (4k)
page-by-page granularity whether to share a page with the (Normal World)
host or to keep it private (protected). This protection is provided by
the hardware, and attempts by the Normal World to access a page which
isn't shared will trigger a Granule Protection Fault.
Realm VMs can communicate with the RMM via another SMC interface known
as RSI (Realm Services Interface). This series adds wrappers for the
full set of RSI commands and uses them to manage the Realm IPA State
(RIPAS) and to discover the configuration of the realm.
The VM running within the Realm needs to ensure that memory that it is
going to use is marked as 'RIPAS_RAM' (i.e. protected memory accessible
only to the guest). This could be provided by the VMM (and subject to
measurement to ensure it is set up correctly) or the VM can set it
itself. This series includes a patch which will iterate over all
described RAM and set the RIPAS. This is a relatively cheap operation,
and doesn't require memory donation from the host. Instead, memory can
be dynamically provided by the host on fault. An alternative would be to
update booting.rst and state this as a requirement, but this would
reduce the flexibility of the VMM to manage the available memory to the
guest (as the initial RIPAS state is part of the guest's measurement).
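As a rough illustration of the loop described above (not the code from
this series), the following C sketch walks a list of RAM ranges and asks
a stand-in for the RSI call to mark each one as RIPAS_RAM. The names
mem_range, fake_rsi_set_ripas and mark_all_ram are hypothetical, and the
16-granule limit per call is an assumption made only to show the caller
coping with partial progress.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE UINT64_C(4096)	/* 4k granule, as in the text above */

enum ripas { RIPAS_EMPTY, RIPAS_RAM };

/* Hypothetical description of one block of RAM given to the guest. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Bookkeeping for the stand-in below: number of granules marked so far. */
static uint64_t granules_marked;

/*
 * Stand-in for the real RSI request (an assumption, not the RMM ABI).
 * It handles at most 16 granules per call and returns the address it
 * reached, so the caller has to loop until the whole range is done.
 */
static uint64_t fake_rsi_set_ripas(uint64_t base, uint64_t top,
				   enum ripas state)
{
	uint64_t limit = base + 16 * GRANULE_SIZE;
	uint64_t reached = top < limit ? top : limit;

	(void)state;
	granules_marked += (reached - base) / GRANULE_SIZE;
	return reached;
}

/* Mark every described RAM range as RIPAS_RAM, one chunk at a time. */
static void mark_all_ram(const struct mem_range *ranges, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		uint64_t base = ranges[i].base;
		uint64_t top = base + ranges[i].size;

		/* Ranges are expected to be granule aligned. */
		assert(base % GRANULE_SIZE == 0 && top % GRANULE_SIZE == 0);

		while (base < top)
			base = fake_rsi_set_ripas(base, top, RIPAS_RAM);
	}
}
```

Note that nothing in this loop touches the memory itself; it only changes
state, which is consistent with the operation being cheap and with the
host providing pages later on fault.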
Within the Realm the most-significant active bit of the IPA is used to
select whether the access is to protected memory or to memory shared
with the host. This series treats this bit as if it is an attribute bit in
the page tables and will modify it when sharing/unsharing memory with
the host.
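The address arithmetic can be sketched in a few lines of standalone C.
This is only an illustration under the assumption that the shared flag is
bit (ipa_bits - 1); the helper names are made up for the example and are
not the ones used in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask of the most-significant active IPA bit for a given IPA width. */
static inline uint64_t realm_shared_bit(unsigned int ipa_bits)
{
	assert(ipa_bits >= 2 && ipa_bits <= 64);
	return UINT64_C(1) << (ipa_bits - 1);
}

/* Alias of an address in the half of the IPA space shared with the host. */
static inline uint64_t realm_ipa_to_shared(uint64_t ipa, unsigned int ipa_bits)
{
	return ipa | realm_shared_bit(ipa_bits);
}

/* Alias of an address in the protected half of the IPA space. */
static inline uint64_t realm_ipa_to_protected(uint64_t ipa,
					      unsigned int ipa_bits)
{
	return ipa & ~realm_shared_bit(ipa_bits);
}

/* True if the address refers to the shared alias. */
static inline bool realm_ipa_is_shared(uint64_t ipa, unsigned int ipa_bits)
{
	return (ipa & realm_shared_bit(ipa_bits)) != 0;
}
```

For example, with a 33-bit IPA space the page at 0x80000000 has its
shared alias at 0x180000000; sharing or unsharing a page amounts to
flipping that one bit in the page table entry.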
This top bit usage also necessitates that the IPA width is made more
dynamic in the guest. The VMM will choose a width (and therefore which
bit controls the shared flag) and the guest must be able to identify
this bit to mask it out when necessary. PHYS_MASK_SHIFT/PHYS_MASK are
therefore made dynamic.
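A minimal sketch of what "dynamic" means here, assuming the usable
address bits are everything below the shared bit. The variable and
function names are hypothetical stand-ins for PHYS_MASK_SHIFT/PHYS_MASK
and do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for PHYS_MASK_SHIFT: a variable instead of a constant. */
static unsigned int phys_mask_shift = 48;	/* illustrative default */

/* Record the IPA width chosen by the VMM once it has been discovered. */
static inline void realm_set_ipa_width(unsigned int ipa_bits)
{
	assert(ipa_bits >= 2 && ipa_bits <= 52);
	/* The top bit is the shared flag, so it is not an address bit. */
	phys_mask_shift = ipa_bits - 1;
}

/* Stand-in for PHYS_MASK, derived from the runtime shift. */
static inline uint64_t phys_mask(void)
{
	return (UINT64_C(1) << phys_mask_shift) - 1;
}

/* Strip the shared flag (and anything above it) from an address. */
static inline uint64_t phys_without_shared_bit(uint64_t addr)
{
	return addr & phys_mask();
}
```

With a 40-bit IPA space the mask covers bits 0 to 38, so an address with
bit 39 set is reduced to its protected alias before being used as a
physical address.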
To allow virtio to communicate with the host the shared buffers must be
placed in memory which has this top IPA bit set. This is achieved by
implementing the set_memory_{encrypted,decrypted} APIs for arm64 and
forcing the use of bounce buffers. For now all device access is
considered to require the memory to be shared. At this stage there is
no support for real devices to be assigned to a realm guest; obviously
if device assignment is added this will have to change.
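To make the bounce-buffer idea concrete, here is a toy model in plain C.
It is a sketch only: shared_slot stands in for a buffer that has already
been made visible to the host, and the function names are invented for
the example rather than taken from the kernel's DMA code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SLOT_SIZE 256

/* Stands in for memory mapped with the shared (top IPA) bit set. */
static unsigned char shared_slot[BOUNCE_SLOT_SIZE];

/*
 * Copy data from private guest memory into the shared slot so that the
 * host-emulated device can read it. Returns the number of bytes staged.
 */
static size_t bounce_to_device(const void *private_buf, size_t len)
{
	assert(len <= BOUNCE_SLOT_SIZE);
	memcpy(shared_slot, private_buf, len);
	return len;
}

/*
 * Copy data written by the device back into private guest memory once
 * the transfer has completed.
 */
static void bounce_from_device(void *private_buf, size_t len)
{
	assert(len <= BOUNCE_SLOT_SIZE);
	memcpy(private_buf, shared_slot, len);
}
```

The point of the extra copy is that the driver's own buffers stay in
protected memory; only the staging area is ever exposed to the host.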
Finally the GIC is (largely) emulated by the (untrusted) host. The RMM
provides some management (including register save/restore) but the
ITS buffers must be placed into shared memory for the host to emulate.
There is likely to be future work to harden the GIC driver against a
malicious host (along with any other drivers used within a Realm guest).
[1] https://developer.arm.com/documentation/den0137/1-0eac5/
[2] https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
Sami Mujawar (2):
arm64: rsi: Interfaces to query attestation token
virt: arm-cca-guest: TSM_REPORT support for realms
Steven Price (5):
arm64: realm: Query IPA size from the RMM
arm64: Mark all I/O as non-secure shared
arm64: Make the PHYS_MASK_SHIFT dynamic
arm64: Enforce bounce buffers for realm DMA
arm64: realm: Support nonsecure ITS emulation shared
Suzuki K Poulose (7):
arm64: rsi: Add RSI definitions
arm64: Detect if in a realm and set RIPAS RAM
fixmap: Allow architecture overriding set_fixmap_io
arm64: Override set_fixmap_io
arm64: Enable memory encrypt for Realms
arm64: Force device mappings to be non-secure shared
efi: arm64: Map Device with Prot Shared
arch/arm64/Kconfig | 3 +
arch/arm64/include/asm/fixmap.h | 4 +-
arch/arm64/include/asm/io.h | 6 +-
arch/arm64/include/asm/kvm_arm.h | 2 +-
arch/arm64/include/asm/mem_encrypt.h | 19 ++
arch/arm64/include/asm/pgtable-hwdef.h | 4 +-
arch/arm64/include/asm/pgtable-prot.h | 3 +
arch/arm64/include/asm/pgtable.h | 7 +-
arch/arm64/include/asm/rsi.h | 46 ++++
arch/arm64/include/asm/rsi_cmds.h | 143 ++++++++++++
arch/arm64/include/asm/rsi_smc.h | 136 ++++++++++++
arch/arm64/kernel/Makefile | 3 +-
arch/arm64/kernel/efi.c | 2 +-
arch/arm64/kernel/rsi.c | 85 +++++++
arch/arm64/kernel/setup.c | 3 +
arch/arm64/mm/init.c | 13 +-
arch/arm64/mm/mmu.c | 13 ++
arch/arm64/mm/pageattr.c | 48 +++-
drivers/irqchip/irq-gic-v3-its.c | 95 ++++++--
drivers/virt/coco/Kconfig | 2 +
drivers/virt/coco/Makefile | 1 +
drivers/virt/coco/arm-cca-guest/Kconfig | 11 +
drivers/virt/coco/arm-cca-guest/Makefile | 2 +
.../virt/coco/arm-cca-guest/arm-cca-guest.c | 208 ++++++++++++++++++
include/asm-generic/fixmap.h | 2 +
25 files changed, 822 insertions(+), 39 deletions(-)
create mode 100644 arch/arm64/include/asm/mem_encrypt.h
create mode 100644 arch/arm64/include/asm/rsi.h
create mode 100644 arch/arm64/include/asm/rsi_cmds.h
create mode 100644 arch/arm64/include/asm/rsi_smc.h
create mode 100644 arch/arm64/kernel/rsi.c
create mode 100644 drivers/virt/coco/arm-cca-guest/Kconfig
create mode 100644 drivers/virt/coco/arm-cca-guest/Makefile
create mode 100644 drivers/virt/coco/arm-cca-guest/arm-cca-guest.c
--
2.34.1
From: Suzuki K Poulose <[email protected]>
Override the set_fixmap_io to set shared permission for the host
in case of a CC guest. For now we mark it shared unconditionally.
Future changes could filter the physical address and make the
decision accordingly.
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
---
arch/arm64/include/asm/fixmap.h | 4 +++-
arch/arm64/mm/mmu.c | 13 +++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 87e307804b99..f765943b088c 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -107,7 +107,9 @@ void __init early_fixmap_init(void);
#define __late_set_fixmap __set_fixmap
#define __late_clear_fixmap(idx) __set_fixmap((idx), 0, FIXMAP_PAGE_CLEAR)
-extern void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
+#define set_fixmap_io set_fixmap_io
+void set_fixmap_io(enum fixed_addresses idx, phys_addr_t phys);
+void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
#include <asm-generic/fixmap.h>
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 495b732d5af3..79d84db9ffcb 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1179,6 +1179,19 @@ void vmemmap_free(unsigned long start, unsigned long end,
}
#endif /* CONFIG_MEMORY_HOTPLUG */
+void set_fixmap_io(enum fixed_addresses idx, phys_addr_t phys)
+{
+ pgprot_t prot = FIXMAP_PAGE_IO;
+
+ /*
+ * For now we consider all I/O as non-secure. For future
+ * filter the I/O base for setting appropriate permissions.
+ */
+ prot = __pgprot(pgprot_val(prot) | PROT_NS_SHARED);
+
+ return __set_fixmap(idx, phys, prot);
+}
+
int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
{
pud_t new_pud = pfn_pud(__phys_to_pfn(phys), mk_pud_sect_prot(prot));
--
2.34.1
From: Suzuki K Poulose <[email protected]>
Device mappings (currently) need to be emulated by the VMM so must be
mapped shared with the host.
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
---
arch/arm64/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f5376bd567a1..db71c564ec21 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -598,7 +598,7 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
#define pgprot_writecombine(prot) \
__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
#define pgprot_device(prot) \
- __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
+ __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | PROT_NS_SHARED)
#define pgprot_tagged(prot) \
__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
#define pgprot_mhp pgprot_tagged
--
2.34.1
On Fri, Apr 12, 2024 at 09:40:56AM +0100, Steven Price wrote:
> We are happy to announce the second version of the Arm Confidential
> Compute Architecture (CCA) support for the Linux stack. The intention is
> to seek early feedback in the following areas:
> * KVM integration of the Arm CCA;
> * KVM UABI for managing the Realms, seeking to generalise the
> operations where possible with other Confidential Compute solutions;
> * Linux Guest support for Realms.
>
> See the previous RFC[1] for a more detailed overview of Arm's CCA
> solution, or visit the Arm CCA Landing page[2].
>
> This series is based on the final RMM v1.0 (EAC5) specification[3].
Instructions for building and running the CCA stack on QEMU, both as
system emulation and VMM, are available here:
https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+RME+stack+for+QEMU
I'll send out the QEMU VMM patches shortly:
https://git.codelinaro.org/linaro/dcap/qemu.git branch cca/v2
Thanks,
Jean
Hi Steven,
On Fri, Apr 12, 2024 at 09:40:56AM +0100, Steven Price wrote:
> We are happy to announce the second version of the Arm Confidential
> Compute Architecture (CCA) support for the Linux stack. The intention is
> to seek early feedback in the following areas:
> * KVM integration of the Arm CCA;
> * KVM UABI for managing the Realms, seeking to generalise the
> operations where possible with other Confidential Compute solutions;
> * Linux Guest support for Realms.
>
> See the previous RFC[1] for a more detailed overview of Arm's CCA
> solution, or visit the Arm CCA Landing page[2].
>
> This series is based on the final RMM v1.0 (EAC5) specification[3].
It's great to see the updated "V2" series. Since you said you would like
"early" feedback on V2, does that mean it's likely to be followed by
V3 and V4, with large code-base changes anticipated from the current form
(V2)? Do you have a rough timeframe for getting this Arm CCA support landed
in mainline? Do you Arm folk expect this to be a multi-year project?
Thanks,
Itaru.
On 11/04/2024 19:54, Itaru Kitayama wrote:
> Hi Steven,
>
> On Fri, Apr 12, 2024 at 09:40:56AM +0100, Steven Price wrote:
>> We are happy to announce the second version of the Arm Confidential
>> Compute Architecture (CCA) support for the Linux stack. The intention is
>> to seek early feedback in the following areas:
>> * KVM integration of the Arm CCA;
>> * KVM UABI for managing the Realms, seeking to generalise the
>> operations where possible with other Confidential Compute solutions;
>> * Linux Guest support for Realms.
>>
>> See the previous RFC[1] for a more detailed overview of Arm's CCA
>> solution, or visit the Arm CCA Landing page[2].
>>
>> This series is based on the final RMM v1.0 (EAC5) specification[3].
>
> It's great to see the updated "V2" series. Since you said you would like
> "early" feedback on V2, does that mean it's likely to be followed by
> V3 and V4, with large code-base changes anticipated from the current form
> (V2)? Do you have a rough timeframe for getting this Arm CCA support landed
> in mainline? Do you Arm folk expect this to be a multi-year project?
I probably should have expanded on that wording a bit, sorry! ;)
I decided to drop the 'RFC' tag as I believe this is now in a state
where it's not got any known bugs. The previous RFC didn't use
guest_memfd and had a known issue where a malicious VMM could bring down
the host kernel - so was obviously not ready for merging. But, of
course, "no known bugs" and ready to merge are somewhat different
milestones.
The support for running in a guest is (I believe) in a good state and I
don't expect to have to iterate much on that before merging - but, as
always, that depends on the feedback received.
The host support I expect to take longer. The key thing here is that
there are other CoCo solutions and we don't want to deviate
unnecessarily from what gets merged for them. Most obviously there is
some overlap between pKVM and Arm's CCA as they both touch the Arm arch
code in similar ways. At the moment we've got a hacked up version of the
kvmtool based on pKVM's branch for testing this, but if you've been
following the threads on pKVM you will be aware that there is a question
over whether the guest_memfd support meets pKVM's needs. So there are
definite questions as to what long term approach works best here. There
is even the possibility that if pKVM can solve the issues using
anonymous memory then it may make sense to also switch Arm's CCA back to
using anonymous memory rather than guest_memfd. Although I expect we'll
want to keep guest_memfd as an option at the very least to match where
x86 is heading.
I'd also expect some minor iteration on the exact form the uAPI takes.
Of particular note is that Intel is planning to introduce KVM_MAP_MEMORY[1]
which looks very similar to KVM_CAP_ARM_RME_POPULATE_REALM. It will
probably make sense for us to switch (although KVM_MAP_MEMORY has
restrictions which are unnecessary for Arm CCA - e.g. it's run on a vcpu
for x86 but not for Arm CCA).
In terms of timescales - honestly I don't really know. I certainly hope
this won't be as long as "multi-year"! Although the wider CoCo effort is
certainly going to take multiple years. This series is for "CCA v1.0",
there will be more versions of the RMM specification which will add more
features in the future. Equally there is likely to be a lot of work
needed in guest hardening which is largely generic across all CoCo
solutions.
Steve
[1]
https://lore.kernel.org/r/9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata%40intel.com
On Fri, Apr 12, 2024 at 09:42:05AM +0100, Steven Price wrote:
> Override the set_fixmap_io to set shared permission for the host
> in case of a CC guest. For now we mark it shared unconditionally.
> Future changes could filter the physical address and make the
> decision accordingly.
[...]
> +void set_fixmap_io(enum fixed_addresses idx, phys_addr_t phys)
> +{
> + pgprot_t prot = FIXMAP_PAGE_IO;
> +
> + /*
> + * For now we consider all I/O as non-secure. For future
> + * filter the I/O base for setting appropriate permissions.
> + */
> + prot = __pgprot(pgprot_val(prot) | PROT_NS_SHARED);
> +
> + return __set_fixmap(idx, phys, prot);
> +}
I looked through the patches and could not find any place where this
function does anything different as per the commit log suggestion. Can
we just update FIXMAP_PAGE_IO for now until you have a clear use-case?
--
Catalin
On 13/05/2024 17:14, Catalin Marinas wrote:
> On Fri, Apr 12, 2024 at 09:42:05AM +0100, Steven Price wrote:
>> Override the set_fixmap_io to set shared permission for the host
>> in case of a CC guest. For now we mark it shared unconditionally.
>> Future changes could filter the physical address and make the
>> decision accordingly.
> [...]
>> +void set_fixmap_io(enum fixed_addresses idx, phys_addr_t phys)
>> +{
>> + pgprot_t prot = FIXMAP_PAGE_IO;
>> +
>> + /*
>> + * For now we consider all I/O as non-secure. For future
>> + * filter the I/O base for setting appropriate permissions.
>> + */
>> + prot = __pgprot(pgprot_val(prot) | PROT_NS_SHARED);
>> +
>> + return __set_fixmap(idx, phys, prot);
>> +}
>
> I looked through the patches and could not find any place where this
> function does anything different as per the commit log suggestion. Can
> we just update FIXMAP_PAGE_IO for now until you have a clear use-case?
>
This gets used by the earlycon mapping. The commit description could be
made clearer.
We may have to revisit this code to optionally apply the PROT_NS_SHARED
attribute, depending on whether this is a "protected MMIO" or not.
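A minimal sketch of that idea, assuming a hypothetical table of MMIO
ranges that have been verified as protected. Nothing like this table
exists in the series today; EX_NS_SHARED is an arbitrary illustrative bit
rather than the real PROT_NS_SHARED value, and all names are made up for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the shared attribute bit (assumption). */
#define EX_NS_SHARED (UINT64_C(1) << 47)

/* Hypothetical description of one protected MMIO window. */
struct mmio_range {
	uint64_t base;
	uint64_t size;
};

/* True if [phys, phys + size) lies entirely inside a protected window. */
static bool is_protected_mmio(const struct mmio_range *tbl, size_t n,
			      uint64_t phys, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (phys >= tbl[i].base && size <= tbl[i].size &&
		    phys - tbl[i].base <= tbl[i].size - size)
			return true;
	}
	return false;
}

/*
 * Pick the attributes for an I/O mapping: leave protected windows
 * alone and mark everything else as shared with the host.
 */
static uint64_t io_prot(const struct mmio_range *tbl, size_t n,
			uint64_t base_prot, uint64_t phys, uint64_t size)
{
	if (is_protected_mmio(tbl, n, phys, size))
		return base_prot;
	return base_prot | EX_NS_SHARED;
}
```

With an empty table this degenerates to the current behaviour, where
every I/O mapping gets the shared attribute unconditionally.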
Suzuki
On Fri, Apr 12, 2024 at 09:42:09AM +0100, Steven Price wrote:
> From: Suzuki K Poulose <[email protected]>
>
> Device mappings (currently) need to be emulated by the VMM so must be
> mapped shared with the host.
You say "currently". What's the plan when the device is not emulated?
How would the guest distinguish what's emulated and what's not to avoid
setting the PROT_NS_SHARED bit?
> Signed-off-by: Suzuki K Poulose <[email protected]>
> Signed-off-by: Steven Price <[email protected]>
> ---
> arch/arm64/include/asm/pgtable.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index f5376bd567a1..db71c564ec21 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -598,7 +598,7 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
> #define pgprot_writecombine(prot) \
> __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
> #define pgprot_device(prot) \
> - __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
> + __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | PROT_NS_SHARED)
This pgprot_device() is not the only one used to map device resources.
pgprot_writecombine() is another commonly used macro. It feels like a hack to
plug one but not the other and without any way for the guest to figure
out what's emulated.
Can the DT actually place those emulated ranges in the higher IPA space
so that we avoid randomly adding this attribute for devices?
--
Catalin
On 15/05/2024 10:01, Catalin Marinas wrote:
> On Fri, Apr 12, 2024 at 09:42:09AM +0100, Steven Price wrote:
>> From: Suzuki K Poulose <[email protected]>
>>
>> Device mappings (currently) need to be emulated by the VMM so must be
>> mapped shared with the host.
>
> You say "currently". What's the plan when the device is not emulated?
> How would the guest distinguish what's emulated and what's not to avoid
> setting the PROT_NS_SHARED bit?
Arm CCA plans to add support for passing through real devices,
which support PCI-TDISP protocol. This would involve the Realm
authenticating the device and explicitly requesting "protected"
mapping *after* the verification (with the help of RMM).
>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> Signed-off-by: Steven Price <[email protected]>
>> ---
>> arch/arm64/include/asm/pgtable.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index f5376bd567a1..db71c564ec21 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -598,7 +598,7 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
>> #define pgprot_writecombine(prot) \
>> __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
>> #define pgprot_device(prot) \
>> - __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
>> + __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | PROT_NS_SHARED)
>
> This pgprot_device() is not the only one used to map device resources.
> pgprot_writecombine() is another commonly used macro. It feels like a hack to
> plug one but not the other and without any way for the guest to figure
> out what's emulated.
Agree. I have been exploring hooking this into ioremap_prot() where we
could apply the attribute accordingly. We will change it in the next
version.
>
> Can the DT actually place those emulated ranges in the higher IPA space
> so that we avoid randomly adding this attribute for devices?
It can, but then we kind of break the "Realm" view of the IPA space.
i.e., right now it only knows about the "lower IPA" half and uses the
top bit as a protection attr to access the IPA as shared.
Expanding IPA size view kind of breaks "sharing memory", where, we
must "use a different PA" for a page that is now shared.
Suzuki
On Wed, May 15, 2024 at 12:00:49PM +0100, Suzuki K Poulose wrote:
> On 15/05/2024 10:01, Catalin Marinas wrote:
> > On Fri, Apr 12, 2024 at 09:42:09AM +0100, Steven Price wrote:
> > > From: Suzuki K Poulose <[email protected]>
> > >
> > > Device mappings (currently) need to be emulated by the VMM so must be
> > > mapped shared with the host.
> >
> > You say "currently". What's the plan when the device is not emulated?
> > How would the guest distinguish what's emulated and what's not to avoid
> > setting the PROT_NS_SHARED bit?
>
> Arm CCA plans to add support for passing through real devices,
> which support PCI-TDISP protocol. This would involve the Realm
> authenticating the device and explicitly requesting "protected"
> mapping *after* the verification (with the help of RMM).
I'd have to do some reading, no clue how this works.
> > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > > index f5376bd567a1..db71c564ec21 100644
> > > --- a/arch/arm64/include/asm/pgtable.h
> > > +++ b/arch/arm64/include/asm/pgtable.h
> > > @@ -598,7 +598,7 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
> > > #define pgprot_writecombine(prot) \
> > > __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
> > > #define pgprot_device(prot) \
> > > - __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
> > > + __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | PROT_NS_SHARED)
> >
> > This pgprot_device() is not the only one used to map device resources.
> > > pgprot_writecombine() is another commonly used macro. It feels like a hack to
> > plug one but not the other and without any way for the guest to figure
> > out what's emulated.
>
> Agree. I have been exploring hooking this into ioremap_prot() where we
> could apply the attribute accordingly. We will change it in the next
> version.
pgprot_* at least has the advantage that it covers other places.
ioremap_prot() would handle the kernel mappings but you have devices
mapped in user-space via remap_pfn_range() for example. The protection
bits may come from dma_pgprot() with either write-combine or cacheable
attributes. One may map device I/O as well (not sure what DPDK does). We
could restrict those to protected devices but we need to go through the
use-cases.
All this needs some thinking, especially if at some point we'll have
protected devices. Just hijacking the low-level pgprot macros doesn't
feel like a great approach.
> > Can the DT actually place those emulated ranges in the higher IPA space
> > so that we avoid randomly adding this attribute for devices?
>
> It can, but then we kind of break the "Realm" view of the IPA space. i.e.,
> right now it only knows about the "lower IPA" half and uses the top bit as a
> protection attr to access the IPA as shared.
>
> Expanding IPA size view kind of breaks "sharing memory", where, we
> must "use a different PA" for a page that is now shared.
True, I did not realise that the IPA split is transparent to the host.
An option would be additional DT/ACPI attributes for those devices.
That's not great either though as we can't handle those attributes in
the arch code only and probably we don't want to change generic drivers.
Yet another option would be to query the RMM somehow.
--
Catalin
On Mon, Apr 15, 2024 at 09:14:47AM +0100, Steven Price wrote:
> The support for running in a guest is (I believe) in a good state
> and I don't expect to have to iterate much on that before merging -
> but, as always, that depends on the feedback received.
All the stuff I've been hearing about CC is that timely guest support
is a really important thing. Right now the majority of the CC world is
running on proprietary hypervisors, it is the guest enablement that is
something a wide group of people will be able to actually consume and
use.
It needs to get into mainline to be able to reach distros about a year
before anyone offers an ARM CC VM to the public. Various x86 guest
only parts for CC are already merged.
The KVM side is absolutely really important as well, but x86 has
managed for a long time now with KVM being out of tree. The KVM side
is far more complex at least.
So I'd split out the guest side and just send it, I saw a few comments
already, but it looks like it shouldn't be an issue to make it this
cycle or next? Keep sending guest enablement updates when the spec is
stable and you have some way to do basic testing.
Jason