From: Keqian Zhu
To: Will Deacon, Marc Zyngier
CC: Catalin Marinas, Mark Rutland, James Morse, Suzuki K Poulose, Julien Thierry
Subject: [RFC PATCH v2 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO
Date: Tue, 16 Mar 2021 21:43:38 +0800
Message-ID: <20210316134338.18052-3-zhukeqian1@huawei.com>
X-Mailer: git-send-email 2.8.4.windows.1
In-Reply-To: <20210316134338.18052-1-zhukeqian1@huawei.com>
References: <20210316134338.18052-1-zhukeqian1@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The MMIO region of a device may be huge (GB level), so try to use block
mapping in stage2 to speed up both map and unmap.

Compared to normal memory mapping, we should consider two more points
when trying block mapping for an MMIO region:

1. For normal memory mapping, the PA (host physical address) and HVA have
   the same alignment within PUD_SIZE or PMD_SIZE when we use the HVA to
   request a hugepage, so we don't need to consider PA alignment when
   verifying block mapping. But for device memory mapping, the PA and HVA
   may have different alignment.

2. For normal memory mapping, we are sure the hugepage size properly fits
   into the vma, so we don't check whether the mapping size exceeds the
   boundary of the vma. But for device memory mapping, we should pay
   attention to this.

This adds device_rough_page_shift() to check these two points when
selecting the block mapping size.

Signed-off-by: Keqian Zhu
---
Mainly for RFC, not fully tested. I will fully test it once the code
logic is well accepted.
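For illustration only (not part of the patch): below is a minimal userspace
sketch of the size-selection idea behind device_rough_page_shift(), assuming
an arm64 4K granule (2MiB PMD and 1GiB PUD blocks). pick_block_shift(), the
macro values and the sample addresses are hypothetical stand-ins for the real
helper and a real PFN-mapped BAR.

/* block_shift_sketch.c - standalone illustration, not kernel code. */
#include <stdint.h>
#include <stdio.h>

/* Assumed arm64 4K-granule geometry: 4K pages, 2MiB PMD and 1GiB PUD blocks. */
#define PAGE_SHIFT      12
#define PMD_SHIFT       21
#define PUD_SHIFT       30

#define BLOCK_SIZE(shift)       (1ULL << (shift))
#define ALIGN_DOWN(x, a)        ((x) & ~((uint64_t)(a) - 1))
#define ALIGN_UP(x, a)          (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

/*
 * Pick the largest block shift such that:
 *  1. hva and pa have the same offset within the block size, so one
 *     stage2 block entry can cover both; and
 *  2. the block boundaries surrounding hva stay inside the intersection
 *     [sec_start, sec_end) of the vma and the memslot.
 * This mirrors the checks in device_rough_page_shift().
 */
static int pick_block_shift(uint64_t hva, uint64_t pa,
                            uint64_t sec_start, uint64_t sec_end)
{
        const int shifts[] = { PUD_SHIFT, PMD_SHIFT };

        for (size_t i = 0; i < sizeof(shifts) / sizeof(shifts[0]); i++) {
                uint64_t size = BLOCK_SIZE(shifts[i]);

                if ((hva & (size - 1)) == (pa & (size - 1)) &&
                    ALIGN_DOWN(hva, size) >= sec_start &&
                    ALIGN_UP(hva, size) <= sec_end)
                        return shifts[i];
        }
        return PAGE_SHIFT;
}

int main(void)
{
        /* Hypothetical BAR mapping: hva and pa share a 2MiB offset, not a 1GiB one. */
        uint64_t hva = 0x7f40201000ULL;
        uint64_t pa  = 0x2010201000ULL;
        uint64_t sec_start = 0x7f40000000ULL;   /* vma/memslot intersection start */
        uint64_t sec_end   = 0x7f80000000ULL;   /* vma/memslot intersection end */

        printf("block shift = %d\n", pick_block_shift(hva, pa, sec_start, sec_end));
        return 0;       /* prints "block shift = 21", i.e. a 2MiB block */
}

With these sample addresses the sketch returns PMD_SHIFT (21), because hva
and pa only share their offset within 2MiB, so a 2MiB block is the largest
one that can map both with a single stage2 entry inside the section.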
---
 arch/arm64/kvm/mmu.c | 42 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c59af5ca01b0..224aa15eb4d9 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -624,6 +624,36 @@ static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
         send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
 }

+/*
+ * Find a mapping size that properly fits inside the intersection of vma and
+ * memslot. And hva and pa have the same alignment to this mapping size.
+ * It's rough because there are still other restrictions, which will be
+ * checked by the following fault_supports_stage2_huge_mapping().
+ */
+static short device_rough_page_shift(struct kvm_memory_slot *memslot,
+                                     struct vm_area_struct *vma,
+                                     unsigned long hva)
+{
+        size_t size = memslot->npages * PAGE_SIZE;
+        hva_t sec_start = max(memslot->userspace_addr, vma->vm_start);
+        hva_t sec_end = min(memslot->userspace_addr + size, vma->vm_end);
+        phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
+
+#ifndef __PAGETABLE_PMD_FOLDED
+        if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
+            ALIGN_DOWN(hva, PUD_SIZE) >= sec_start &&
+            ALIGN(hva, PUD_SIZE) <= sec_end)
+                return PUD_SHIFT;
+#endif
+
+        if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
+            ALIGN_DOWN(hva, PMD_SIZE) >= sec_start &&
+            ALIGN(hva, PMD_SIZE) <= sec_end)
+                return PMD_SHIFT;
+
+        return PAGE_SHIFT;
+}
+
 static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
                                                unsigned long hva,
                                                unsigned long map_size)
@@ -769,7 +799,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
                 return -EFAULT;
         }

-        /* Let's check if we will get back a huge page backed by hugetlbfs */
+        /*
+         * Let's check if we will get back a huge page backed by hugetlbfs, or
+         * get block mapping for device MMIO region.
+         */
         mmap_read_lock(current->mm);
         vma = find_vma_intersection(current->mm, hva, hva + 1);
         if (unlikely(!vma)) {
@@ -780,11 +813,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,

         if (is_vm_hugetlb_page(vma))
                 vma_shift = huge_page_shift(hstate_vma(vma));
+        else if (vma->vm_flags & VM_PFNMAP)
+                vma_shift = device_rough_page_shift(memslot, vma, hva);
         else
                 vma_shift = PAGE_SHIFT;

-        if (logging_active ||
-            (vma->vm_flags & VM_PFNMAP)) {
+        if (logging_active) {
                 force_pte = true;
                 vma_shift = PAGE_SHIFT;
         }
@@ -855,7 +889,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,

         if (kvm_is_device_pfn(pfn)) {
                 device = true;
-                force_pte = true;
+                force_pte = (vma_pagesize == PAGE_SIZE);
         } else if (logging_active && !write_fault) {
                 /*
                  * Only actually map the page as writable if this was a write
-- 
2.19.1