Date: Wed, 31 Oct 2018 17:00:49 +0100
From: Christoffer Dall
To: Lukas Braun
Cc: Marc Zyngier, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
    Ralph Palutke
Subject: Re: [PATCH v2] KVM: arm/arm64: Check memslot bounds before mapping hugepages
Message-ID: <20181031160049.GF12057@e113682-lin.lund.arm.com>
References: <20180926151135.16083-1-koomi@moshbit.net>
In-Reply-To: <20180926151135.16083-1-koomi@moshbit.net>
On Wed, Sep 26, 2018 at 05:11:35PM +0200, Lukas Braun wrote:
> Userspace can create a memslot with memory backed by (transparent)
> hugepages, but with bounds that do not align with hugepages.
> In that case, we cannot map the entire region in the guest as hugepages
> without exposing additional host memory to the guest and potentially
> interfering with other memslots.
> Consequently, this patch adds a bounds check when populating guest page
> tables and forces the creation of regular PTEs if mapping an entire
> hugepage would violate the memslot's bounds.
>
> Signed-off-by: Lukas Braun

It took me fairly long to understand why we didn't catch this when we
introduced the 'offset check', which indicates that this function is
just getting too long to read.

I don't object to the fix below as such, but it does pile on to the
complexity. Here's an alternative approach (untested, of course); it
slightly limits the functionality we have today, in favor of
simplicity. (Also not sure if it's too large for cc'ing stable.)
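To make the failure mode concrete, here is a rough userspace sketch
(untested, error handling omitted, and all addresses and sizes made up
for illustration) of a memslot that passes the existing offset check
but still has PMD-unaligned bounds, so a stage2 huge mapping at either
edge would cover host memory outside the slot:

#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#define PMD_SIZE	(2UL << 20)	/* 2M stage2 PMD, assuming 4K pages */

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, 0);

	/*
	 * THP-backed anonymous memory; for the sake of the example,
	 * assume mmap() happens to return a PMD-aligned address.
	 */
	void *mem = mmap(NULL, 4 * PMD_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	madvise(mem, 4 * PMD_SIZE, MADV_HUGEPAGE);

	/*
	 * hva and gpa have the *same* offset within a PMD (0x1000), so
	 * the offset check passes, but the slot's start and end are not
	 * PMD-aligned: a PMD mapping at either edge spills past the slot.
	 */
	struct kvm_userspace_memory_region slot = {
		.slot            = 0,
		.guest_phys_addr = 0x80000000UL + 0x1000,
		.memory_size     = 2 * PMD_SIZE,
		.userspace_addr  = (unsigned long)mem + 0x1000,
	};
	return ioctl(vm, KVM_SET_USER_MEMORY_REGION, &slot);
}

With the change below, such a slot simply gets PTE mappings everywhere,
while a slot whose hva, gpa, and size are all PMD-aligned keeps its
huge mappings.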
Thanks,

    Christoffer


diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
index 8b68099348e5..ccb003dfdb97 100644
--- a/arch/arm64/include/asm/stage2_pgtable.h
+++ b/arch/arm64/include/asm/stage2_pgtable.h
@@ -119,6 +119,11 @@ static inline phys_addr_t stage2_pmd_addr_end(phys_addr_t addr, phys_addr_t end)
 	return (boundary - 1 < end - 1) ? boundary : end;
 }
 
+static inline bool stage2_pmd_aligned(phys_addr_t addr)
+{
+	return !(addr & ~S2_PMD_MASK);
+}
+
 #endif	/* STAGE2_PGTABLE_LEVELS > 2 */
 
 #define stage2_pte_table_empty(ptep) kvm_page_empty(ptep)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed162a6c57c5..e5709ccee224 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1466,6 +1466,22 @@ static void kvm_send_hwpoison_signal(unsigned long address,
 	send_sig_info(SIGBUS, &info, current);
 }
 
+static bool memslot_supports_pmd_mappings(struct kvm_memory_slot *memslot)
+{
+	gpa_t gpa_start, gpa_end;
+	hva_t hva_start, hva_end;
+	size_t size;
+
+	size = memslot->npages * PAGE_SIZE;
+	gpa_start = memslot->base_gfn << PAGE_SHIFT;
+	gpa_end = gpa_start + size;
+	hva_start = memslot->userspace_addr;
+	hva_end = hva_start + size;
+
+	return stage2_pmd_aligned(gpa_start) && stage2_pmd_aligned(gpa_end) &&
+	       stage2_pmd_aligned(hva_start) && stage2_pmd_aligned(hva_end);
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -1491,6 +1507,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
+	/*
+	 * We limit PMD-size mappings to situations where both the userspace
+	 * region and the GPA region are aligned to the stage2 PMD size and
+	 * the total size is a multiple of it.  This ensures that we won't
+	 * map unintended host memory and that we'll map the intended user
+	 * pages (not skewed by mismatching PMD offsets).
+	 *
+	 * We miss out on the opportunity to map non-edge PMD regions in
+	 * unaligned memslots. Oh well...
+	 */
+	if (!memslot_supports_pmd_mappings(memslot))
+		force_pte = true;
+
+	if (logging_active)
+		force_pte = true;
+
 	/* Let's check if we will get back a huge page backed by hugetlbfs */
 	down_read(&current->mm->mmap_sem);
 	vma = find_vma_intersection(current->mm, hva, hva + 1);
@@ -1500,22 +1532,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+	if (vma_kernel_pagesize(vma) == PMD_SIZE && !force_pte) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
-	} else {
-		/*
-		 * Pages belonging to memslots that don't have the same
-		 * alignment for userspace and IPA cannot be mapped using
-		 * block descriptors even if the pages belong to a THP for
-		 * the process, because the stage-2 block descriptor will
-		 * cover more than a single THP and we loose atomicity for
-		 * unmapping, updates, and splits of the THP or other pages
-		 * in the stage-2 block range.
-		 */
-		if ((memslot->userspace_addr & ~PMD_MASK) !=
-		    ((memslot->base_gfn << PAGE_SHIFT) & ~PMD_MASK))
-			force_pte = true;
 	}
 	up_read(&current->mm->mmap_sem);
 
@@ -1554,7 +1573,6 @@
 	 * should not be mapped with huge pages (it introduces churn
 	 * and performance degradation), so force a pte mapping.
 	 */
-	force_pte = true;
 	flags |= KVM_S2_FLAG_LOGGING_ACTIVE;
 
 	/*