Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp4068001pxb; Tue, 17 Nov 2020 10:22:38 -0800 (PST) X-Google-Smtp-Source: ABdhPJy6ZhK5MHYLeehRkQ6TbI1KRoeYSTq0lla6r/4fSSPxaT9pXtkea+zVNk8XrEG21kPtifvI X-Received: by 2002:aa7:da44:: with SMTP id w4mr22254669eds.131.1605637357876; Tue, 17 Nov 2020 10:22:37 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1605637357; cv=none; d=google.com; s=arc-20160816; b=0QTMX67MXnoNc7wJBZ+hWHiO+yuEtA8LICh2hPef+jf91m9+/pwVlPWA+B7BuTt0oN xzp6PnUfsdlYtEMTdddPPimtq6Javwhgie4EwN//8e88xfcfUVqKQxjDqeWsN1Grt2+B L1iDdZ92f02xVi+UXSm1ftOMPclyiEwK4KDOc+/8V767tJVRaEUneEoT+iJd8gyOGm7R 0Nyqr1F/xk60hDwBp4l8/sbWC0XHrmxS0MFMHnDpM/7mVRLx0S/5pEYlCEMYoqTOrCZh xJFW4WHkRAtOPLE3d3oA/whagf6ppbYogQWV0mWB/wnnJLltyQW0g7Fv6pDhXVMMS5k9 Ku3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:references:mime-version :message-id:in-reply-to:date:sender:dkim-signature; bh=UnISW/7ysLvEOA4M0KYS2DiSYKHuk/dOZvBw6jTbu2M=; b=DB8g1+mUgGGPIXqZ+MeWK8+gJT5YgBq1HE+98V6CipUw1QIRpX1Ig/2WbzQZt6PDLy XR4VR2fmfgH92aGvk6863KT6tWAxHhMddSM9LR4rU1Z/3OwT2KVBC9oFw1YMds/s9IyC 31ov3YMNzyckQMV7hGZ4B8QDFu0PyDSHoAXXA0R84o0I+mJk4yEYgon9bdZEoKyUAvPR qO4cLYRz5bshsHVZdVo1wxcC2WQO5Bh2W3e/jVIUYkL0ZzVWnDH+a7RTMriLaHIMPxm+ VFdDwlg9zajmVPmpsRocZkO2DN2NKhkVeg5dYhNgqePJBXgTH73zNayB5kjSBvOArg2v ER4g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=Y46xsWr+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id s15si14708278edj.165.2020.11.17.10.22.14; Tue, 17 Nov 2020 10:22:37 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=Y46xsWr+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731179AbgKQSRZ (ORCPT + 99 others); Tue, 17 Nov 2020 13:17:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731150AbgKQSRX (ORCPT ); Tue, 17 Nov 2020 13:17:23 -0500 Received: from mail-wr1-x44a.google.com (mail-wr1-x44a.google.com [IPv6:2a00:1450:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 19067C0617A6 for ; Tue, 17 Nov 2020 10:17:23 -0800 (PST) Received: by mail-wr1-x44a.google.com with SMTP id 67so13499677wra.2 for ; Tue, 17 Nov 2020 10:17:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=UnISW/7ysLvEOA4M0KYS2DiSYKHuk/dOZvBw6jTbu2M=; b=Y46xsWr+fpscULTkv6wJmNLh4Rny8VDfXImWw0g4FGuDGcH5Dj65BmOnRZesT6ZFth nzmvzgrCAMT35AMiTIAi2DFYOrh/Aq6nY55zm/m6DPUxjPGsHta+ja7jqxqUfCU4FnQb MHtfB5aoKisNqH6shJKNLDU/NzfYKajoHo+mIn3KhtSthuEGflYJZp2bEpYphw8Ro2WW /CkJ+W+rUFgkwTPjM8pjl1ZQjOacdVnCazJPTUwiatafcU5KRWPDo1VSlIpvGL2urLm2 2XqyygWV30YhIq8Z5vT4EcO3G69skl7KBQjf4G1HIF/tPT4vHuT52snzhwlVedT2HSLA 53Cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=UnISW/7ysLvEOA4M0KYS2DiSYKHuk/dOZvBw6jTbu2M=; b=oiIbx550LKlP00ohRsZGvxQFNak9Rpm6r180F/fTNl4GlN3VVXofLjMCeepnzxwzHc +JbCgzj1+S/Y95Pw+8EQuI6e7SNiKJRADSlzuQ00lxxMqr1u3tw0cUcQx0PJT/mwXuUP WMa0VpA1agDThyXpdmbcOunFf9JAgPdmJkNusI45QSHTrStv9O+VrBYLMYLNAQ5jwjZW U8vR8j7yjk2ZVtIz1uAKnQr8p3QtcCDlaQYE3rS1MvY6ERods9OUQOuC48pcMKkiQhtB WxxUfFS3Y7bB2L+c5doihKMT0Fg6emtdbPshUVCFGtN+LFxiygmqoIVAmgONmyTMRLHC NzzA== X-Gm-Message-State: AOAM531OFFYTCKLZqKTodSv+gBvJhBAUugqlrXvYsrVcnHFNj88V22KJ 71o9pv1//hQeSi7cNJGGBSG5DKCfjLrI Sender: "qperret via sendgmr" X-Received: from luke.lon.corp.google.com ([2a00:79e0:d:210:f693:9fff:fef4:a7ef]) (user=qperret job=sendgmr) by 2002:a1c:9ad3:: with SMTP id c202mr346589wme.43.1605637041767; Tue, 17 Nov 2020 10:17:21 -0800 (PST) Date: Tue, 17 Nov 2020 18:16:07 +0000 In-Reply-To: <20201117181607.1761516-1-qperret@google.com> Message-Id: <20201117181607.1761516-28-qperret@google.com> Mime-Version: 1.0 References: <20201117181607.1761516-1-qperret@google.com> X-Mailer: git-send-email 2.29.2.299.gdc1121823c-goog Subject: [RFC PATCH 27/27] KVM: arm64: Wrap the host with a stage 2 From: Quentin Perret To: Catalin Marinas , Will Deacon , Marc Zyngier , James Morse , Julien Thierry , Suzuki K Poulose , Rob Herring , Frank Rowand Cc: "moderated list:ARM64 PORT (AARCH64 ARCHITECTURE)" , open list , "open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE" , kernel-team@android.com, android-kvm@google.com, Quentin Perret Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When KVM runs in protected nVHE mode, make use of a stage 2 page-table to give the hypervisor some control over the host memory accesses. At the moment all memory aborts from the host will be instantly idmapped RWX at stage 2 in a lazy fashion. Later patches will make use of that infrastructure to implement access control restrictions to e.g. protect guest memory from the host. Signed-off-by: Quentin Perret --- arch/arm64/include/asm/kvm_cpufeature.h | 2 + arch/arm64/kernel/image-vars.h | 3 + arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 33 +++ arch/arm64/kvm/hyp/nvhe/Makefile | 2 +- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 + arch/arm64/kvm/hyp/nvhe/mem_protect.c | 191 ++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/setup.c | 6 + arch/arm64/kvm/hyp/nvhe/switch.c | 7 +- arch/arm64/kvm/hyp/nvhe/tlb.c | 4 +- 9 files changed, 247 insertions(+), 7 deletions(-) create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h create mode 100644 arch/arm64/kvm/hyp/nvhe/mem_protect.c diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h index d34f85cba358..74043a149322 100644 --- a/arch/arm64/include/asm/kvm_cpufeature.h +++ b/arch/arm64/include/asm/kvm_cpufeature.h @@ -15,3 +15,5 @@ #endif KVM_HYP_CPU_FTR_REG(SYS_CTR_EL0, arm64_ftr_reg_ctrel0) +KVM_HYP_CPU_FTR_REG(SYS_ID_AA64MMFR0_EL1, arm64_ftr_reg_id_aa64mmfr0_el1) +KVM_HYP_CPU_FTR_REG(SYS_ID_AA64MMFR1_EL1, arm64_ftr_reg_id_aa64mmfr1_el1) diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index f2d43e6cd86d..e2652278dd63 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -137,6 +137,9 @@ KVM_NVHE_ALIAS(__hyp_data_ro_after_init_end); KVM_NVHE_ALIAS(__hyp_bss_start); KVM_NVHE_ALIAS(__hyp_bss_end); +/* pKVM static key */ +KVM_NVHE_ALIAS(kvm_protected_mode_initialized); + #endif /* CONFIG_KVM */ #endif /* __ARM64_KERNEL_IMAGE_VARS_H */ diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h new file mode 100644 index 000000000000..a22ef118a610 --- /dev/null +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2020 Google LLC + * Author: Quentin Perret + */ + +#ifndef __KVM_NVHE_MEM_PROTECT__ +#define __KVM_NVHE_MEM_PROTECT__ +#include +#include +#include +#include +#include + +struct host_kvm { + struct kvm_arch arch; + struct kvm_pgtable pgt; + struct kvm_pgtable_mm_ops mm_ops; + hyp_spinlock_t lock; +}; +extern struct host_kvm host_kvm; + +int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool); +void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt); + +static __always_inline void __load_host_stage2(void) +{ + if (static_branch_likely(&kvm_protected_mode_initialized)) + __load_stage2(&host_kvm.arch.mmu, host_kvm.arch.vtcr); + else + write_sysreg(0, vttbr_el2); +} +#endif /* __KVM_NVHE_MEM_PROTECT__ */ diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile index d7381a503182..c3e2f98555c4 100644 --- a/arch/arm64/kvm/hyp/nvhe/Makefile +++ b/arch/arm64/kvm/hyp/nvhe/Makefile @@ -11,7 +11,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs)) obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \ hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \ - cache.o cpufeature.o setup.o mm.o + cache.o cpufeature.o setup.o mm.o mem_protect.o obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \ ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o obj-y += $(lib-objs) diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index a0bfe0d26da6..5d0cb17e03a1 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -13,6 +13,7 @@ #include #include +#include #include DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params); @@ -234,6 +235,11 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) case ESR_ELx_EC_SMC64: handle_host_smc(host_ctxt); break; + case ESR_ELx_EC_IABT_LOW: + fallthrough; + case ESR_ELx_EC_DABT_LOW: + handle_host_mem_abort(host_ctxt); + break; default: hyp_panic(); } diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c new file mode 100644 index 000000000000..0cd3eb178f3b --- /dev/null +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -0,0 +1,191 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2020 Google LLC + * Author: Quentin Perret + */ + +#include +#include +#include +#include +#include +#include +#include + +#include + +#include +#include +#include +#include + +extern unsigned long hyp_nr_cpus; +struct host_kvm host_kvm; + +struct hyp_pool host_s2_mem; +struct hyp_pool host_s2_dev; + +static void *host_s2_zalloc_pages_exact(size_t size) +{ + return hyp_alloc_pages(&host_s2_mem, HYP_GFP_ZERO, get_order(size)); +} + +static void *host_s2_zalloc_page(void *pool) +{ + return hyp_alloc_pages(pool, HYP_GFP_ZERO, 0); +} + +static int prepare_s2_pools(void *mem_pgt_pool, void *dev_pgt_pool) +{ + unsigned long nr_pages; + int ret; + + nr_pages = host_s2_mem_pgtable_size() >> PAGE_SHIFT; + ret = hyp_pool_init(&host_s2_mem, __hyp_pa(mem_pgt_pool), nr_pages, 0); + if (ret) + return ret; + + nr_pages = host_s2_dev_pgtable_size() >> PAGE_SHIFT; + ret = hyp_pool_init(&host_s2_dev, __hyp_pa(dev_pgt_pool), nr_pages, 0); + if (ret) + return ret; + + host_kvm.mm_ops.zalloc_pages_exact = host_s2_zalloc_pages_exact; + host_kvm.mm_ops.zalloc_page = host_s2_zalloc_page; + host_kvm.mm_ops.phys_to_virt = hyp_phys_to_virt; + host_kvm.mm_ops.virt_to_phys = hyp_virt_to_phys; + host_kvm.mm_ops.page_count = hyp_page_count; + host_kvm.mm_ops.get_page = hyp_get_page; + host_kvm.mm_ops.put_page = hyp_put_page; + + return 0; +} + +static void prepare_host_vtcr(void) +{ + u32 parange, phys_shift; + u64 mmfr0, mmfr1; + + mmfr0 = arm64_ftr_reg_id_aa64mmfr0_el1.sys_val; + mmfr1 = arm64_ftr_reg_id_aa64mmfr1_el1.sys_val; + + /* The host stage 2 is id-mapped, so use parange for T0SZ */ + parange = kvm_get_parange(mmfr0); + phys_shift = id_aa64mmfr0_parange_to_phys_shift(parange); + + host_kvm.arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift); +} + +int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool) +{ + struct kvm_s2_mmu *mmu = &host_kvm.arch.mmu; + struct kvm_nvhe_init_params *params; + int ret, i; + + prepare_host_vtcr(); + hyp_spin_lock_init(&host_kvm.lock); + + ret = prepare_s2_pools(mem_pgt_pool, dev_pgt_pool); + if (ret) + return ret; + + ret = kvm_pgtable_stage2_init(&host_kvm.pgt, &host_kvm.arch, + &host_kvm.mm_ops); + if (ret) + return ret; + + mmu->pgd_phys = __hyp_pa(host_kvm.pgt.pgd); + mmu->arch = &host_kvm.arch; + mmu->pgt = &host_kvm.pgt; + mmu->vmid.vmid_gen = 0; + mmu->vmid.vmid = 0; + + for (i = 0; i < hyp_nr_cpus; i++) { + params = per_cpu_ptr(&kvm_init_params, i); + params->vttbr = kvm_get_vttbr(mmu); + params->vtcr = host_kvm.arch.vtcr; + params->hcr_el2 |= HCR_VM; + __flush_dcache_area(params, sizeof(*params)); + } + + write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2); + __load_stage2(&host_kvm.arch.mmu, host_kvm.arch.vtcr); + + return 0; +} + +static void host_stage2_unmap_dev_all(void) +{ + struct kvm_pgtable *pgt = &host_kvm.pgt; + struct hyp_memblock_region *reg; + u64 addr = 0; + int i; + + /* Unmap all non-memory regions to recycle the pages */ + for (i = 0; i < hyp_memblock_nr; i++, addr = reg->end) { + reg = &hyp_memory[i]; + kvm_pgtable_stage2_unmap(pgt, addr, reg->start - addr); + } + kvm_pgtable_stage2_unmap(pgt, addr, ULONG_MAX); +} + +static bool ipa_is_memory(u64 ipa) +{ + int cur, left = 0, right = hyp_memblock_nr; + struct hyp_memblock_region *reg; + + /* The list of memblock regions is sorted, binary search it */ + while (left < right) { + cur = (left + right) >> 1; + reg = &hyp_memory[cur]; + if (ipa < reg->start) + right = cur; + else if (ipa >= reg->end) + left = cur + 1; + else + return true; + } + + return false; +} + +static int __host_stage2_map(u64 ipa, u64 size, enum kvm_pgtable_prot prot, + struct hyp_pool *p) +{ + return kvm_pgtable_stage2_map(&host_kvm.pgt, ipa, size, ipa, prot, p); +} + +static int host_stage2_map(u64 ipa, u64 size, enum kvm_pgtable_prot prot) +{ + int ret, is_memory = ipa_is_memory(ipa); + struct hyp_pool *pool; + + pool = is_memory ? &host_s2_mem : &host_s2_dev; + + hyp_spin_lock(&host_kvm.lock); + ret = __host_stage2_map(ipa, size, prot, pool); + if (ret == -ENOMEM && !is_memory) { + host_stage2_unmap_dev_all(); + ret = __host_stage2_map(ipa, size, prot, pool); + } + hyp_spin_unlock(&host_kvm.lock); + + return ret; +} + +void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt) +{ + enum kvm_pgtable_prot prot; + u64 far, hpfar, esr, ipa; + int ret; + + esr = read_sysreg_el2(SYS_ESR); + if (!__get_fault_info(esr, &far, &hpfar)) + hyp_panic(); + + prot = KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W | KVM_PGTABLE_PROT_X; + ipa = (hpfar & HPFAR_MASK) << 8; + ret = host_stage2_map(ipa, PAGE_SIZE, prot); + if (ret) + hyp_panic(); +} diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index b73e6b08cfba..3e079027e608 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -12,6 +12,7 @@ #include #include #include +#include #include struct hyp_pool hpool; @@ -161,6 +162,11 @@ void __noreturn __kvm_hyp_protect_finalise(void) if (ret) goto out; + /* Wrap the host with a stage 2 */ + ret = kvm_host_prepare_stage2(host_s2_mem_pgt_base, host_s2_dev_pgt_base); + if (ret) + goto out; + hyp_pgtable_mm_ops.zalloc_page = hyp_zalloc_hyp_page; hyp_pgtable_mm_ops.phys_to_virt = hyp_phys_to_virt; hyp_pgtable_mm_ops.virt_to_phys = hyp_virt_to_phys; diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c index 979a76cdf9fb..31bc1a843bf8 100644 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -28,6 +28,8 @@ #include #include +#include + /* Non-VHE specific context */ DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data); DEFINE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt); @@ -102,11 +104,6 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu) write_sysreg(__kvm_hyp_host_vector, vbar_el2); } -static void __load_host_stage2(void) -{ - write_sysreg(0, vttbr_el2); -} - /* Save VGICv3 state on non-VHE systems */ static void __hyp_vgic_save_state(struct kvm_vcpu *vcpu) { diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c index fbde89a2c6e8..255a23a1b2db 100644 --- a/arch/arm64/kvm/hyp/nvhe/tlb.c +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c @@ -8,6 +8,8 @@ #include #include +#include + struct tlb_inv_context { u64 tcr; }; @@ -43,7 +45,7 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu, static void __tlb_switch_to_host(struct tlb_inv_context *cxt) { - write_sysreg(0, vttbr_el2); + __load_host_stage2(); if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { /* Ensure write of the host VMID */ -- 2.29.2.299.gdc1121823c-goog