Subject: Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
To: Suzuki K Poulose
References: <1554909297-6753-1-git-send-email-suzuki.poulose@arm.com>
 <1554909832-7169-1-git-send-email-suzuki.poulose@arm.com>
 <1554909832-7169-3-git-send-email-suzuki.poulose@arm.com>
From: Zenghui Yu
Date: Thu, 11 Apr 2019 09:59:06 +0800
In-Reply-To: <1554909832-7169-3-git-send-email-suzuki.poulose@arm.com>

Hi Suzuki,

On 2019/4/10 23:23, Suzuki K Poulose wrote:
> We support mapping host memory backed by PMD transparent hugepages
> at stage2 as huge pages. However the checks are now spread across
> two different places. Let us unify the handling of the THPs to
> keep the code cleaner (and future proof for PUD THP support).
> This patch moves transparent_hugepage_adjust() closer to the caller
> to avoid a forward declaration for fault_supports_stage2_huge_mapping().
>
> Also, since we already handle the case where the host VA and the guest
> PA may not be aligned, the explicit VM_BUG_ON() is not required.
>
> Cc: Marc Zyngier
> Cc: Christoffer Dall
> Cc: Zenghui Yu
> Signed-off-by: Suzuki K Poulose
> ---
>  virt/kvm/arm/mmu.c | 123 +++++++++++++++++++++++++++--------------------------
>  1 file changed, 62 insertions(+), 61 deletions(-)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 6d73322..714eec2 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>  	return ret;
>  }
>
> -static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
> -{
> -	kvm_pfn_t pfn = *pfnp;
> -	gfn_t gfn = *ipap >> PAGE_SHIFT;
> -	struct page *page = pfn_to_page(pfn);
> -
> -	/*
> -	 * PageTransCompoundMap() returns true for THP and
> -	 * hugetlbfs. Make sure the adjustment is done only for THP
> -	 * pages.
> -	 */
> -	if (!PageHuge(page) && PageTransCompoundMap(page)) {
> -		unsigned long mask;
> -		/*
> -		 * The address we faulted on is backed by a transparent huge
> -		 * page. However, because we map the compound huge page and
> -		 * not the individual tail page, we need to transfer the
> -		 * refcount to the head page. We have to be careful that the
> -		 * THP doesn't start to split while we are adjusting the
> -		 * refcounts.
> -		 *
> -		 * We are sure this doesn't happen, because mmu_notifier_retry
> -		 * was successful and we are holding the mmu_lock, so if this
> -		 * THP is trying to split, it will be blocked in the mmu
> -		 * notifier before touching any of the pages, specifically
> -		 * before being able to call __split_huge_page_refcount().
> -		 *
> -		 * We can therefore safely transfer the refcount from PG_tail
> -		 * to PG_head and switch the pfn from a tail page to the head
> -		 * page accordingly.
> -		 */
> -		mask = PTRS_PER_PMD - 1;
> -		VM_BUG_ON((gfn & mask) != (pfn & mask));
> -		if (pfn & mask) {
> -			*ipap &= PMD_MASK;
> -			kvm_release_pfn_clean(pfn);
> -			pfn &= ~mask;
> -			kvm_get_pfn(pfn);
> -			*pfnp = pfn;
> -		}
> -
> -		return true;
> -	}
> -
> -	return false;
> -}
> -
>  /**
>   * stage2_wp_ptes - write protect PMD range
>   * @pmd:	pointer to pmd entry
> @@ -1677,6 +1630,61 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>  		   (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>  }
>
> +/*
> + * Check if the given hva is backed by a transparent huge page (THP)
> + * and whether it can be mapped using block mapping in stage2. If so, adjust
> + * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
> + * supported. This will need to be updated to support other THP sizes.
> + *
> + * Returns the size of the mapping.
> + */
> +static unsigned long
> +transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> +			    unsigned long hva, kvm_pfn_t *pfnp,
> +			    phys_addr_t *ipap)
> +{
> +	kvm_pfn_t pfn = *pfnp;
> +	struct page *page = pfn_to_page(pfn);
> +
> +	/*
> +	 * PageTransCompoundMap() returns true for THP and
> +	 * hugetlbfs. Make sure the adjustment is done only for THP
> +	 * pages. Also make sure that the HVA and IPA are sufficiently
> +	 * aligned and that the block map is contained within the memslot.
> +	 */
> +	if (!PageHuge(page) && PageTransCompoundMap(page) &&

By the time we get here, we have already made sure that only normal-size
pages are involved and no hugetlbfs page can show up, so "!PageHuge(page)"
will always return true and could simply be dropped.
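Just to illustrate what I mean (only a sketch, untested, and assuming that
hugetlbfs faults really can never reach this path), the check could then be
reduced to:

	/* hugetlbfs pages never get here, so no PageHuge() filtering needed */
	if (PageTransCompoundMap(page) &&
	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
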
> +	    fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
> +		/*
> +		 * The address we faulted on is backed by a transparent huge
> +		 * page. However, because we map the compound huge page and
> +		 * not the individual tail page, we need to transfer the
> +		 * refcount to the head page. We have to be careful that the
> +		 * THP doesn't start to split while we are adjusting the
> +		 * refcounts.
> +		 *
> +		 * We are sure this doesn't happen, because mmu_notifier_retry
> +		 * was successful and we are holding the mmu_lock, so if this
> +		 * THP is trying to split, it will be blocked in the mmu
> +		 * notifier before touching any of the pages, specifically
> +		 * before being able to call __split_huge_page_refcount().
> +		 *
> +		 * We can therefore safely transfer the refcount from PG_tail
> +		 * to PG_head and switch the pfn from a tail page to the head
> +		 * page accordingly.
> +		 */
> +		*ipap &= PMD_MASK;
> +		kvm_release_pfn_clean(pfn);
> +		pfn &= ~(PTRS_PER_PMD - 1);
> +		kvm_get_pfn(pfn);
> +		*pfnp = pfn;
> +
> +		return PMD_SIZE;
> +	}
> +
> +	/* Use page mapping if we cannot use block mapping */
> +	return PAGE_SIZE;
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
>  			  unsigned long fault_status)
> @@ -1780,20 +1788,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (mmu_notifier_retry(kvm, mmu_seq))
>  		goto out_unlock;
>
> -	if (vma_pagesize == PAGE_SIZE && !force_pte) {
> -		/*
> -		 * Only PMD_SIZE transparent hugepages(THP) are
> -		 * currently supported. This code will need to be
> -		 * updated to support other THP sizes.
> -		 *
> -		 * Make sure the host VA and the guest IPA are sufficiently
> -		 * aligned and that the block is contained within the memslot.
> -		 */
> -		if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
> -		    transparent_hugepage_adjust(&pfn, &fault_ipa))
> -			vma_pagesize = PMD_SIZE;
> -	}
> -
> +	/*
> +	 * If we are not forced to use page mapping, check if we are
> +	 * backed by a THP and thus use block mapping if possible.
> +	 */
> +	if (vma_pagesize == PAGE_SIZE && !force_pte)
> +		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
> +							   &pfn, &fault_ipa);
>  	if (writable)
>  		kvm_set_pfn_dirty(pfn);
>

thanks,

zenghui
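
P.S. Mostly as a note to myself, a quick sketch of the adjustment math,
assuming 4K base pages (so PTRS_PER_PMD == 512 and PMD_SIZE == 2MB) and with
made-up numbers, just to check that I read it right:

	fault_ipa = 0x40145000, pfn = 0x12345  (a tail page of the THP)
	pfn  &= ~(PTRS_PER_PMD - 1)  ->  pfn  = 0x12200      (head page)
	*ipap &= PMD_MASK            ->  ipa  = 0x40000000   (2MB aligned)

i.e. the refcount is transferred from the tail page to the head page, and
the whole 2MB region can then be mapped with a single stage2 block entry.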