Subject: Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
To: Suzuki K Poulose
References: <1554909297-6753-1-git-send-email-suzuki.poulose@arm.com> <1554909832-7169-1-git-send-email-suzuki.poulose@arm.com> <1554909832-7169-3-git-send-email-suzuki.poulose@arm.com> <1bac514c-e4f0-a609-96ed-f48ef3461da1@arm.com>
From: Zenghui Yu
Date: Fri, 12 Apr 2019 17:34:53 +0800
In-Reply-To: <1bac514c-e4f0-a609-96ed-f48ef3461da1@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019/4/11 23:16, Suzuki K Poulose wrote:
> Hi Zhengui,
>
> On 11/04/2019 02:59, Zenghui Yu wrote:
>> Hi Suzuki,
>>
>> On 2019/4/10 23:23, Suzuki K Poulose wrote:
>>> We support mapping host memory backed by PMD transparent hugepages
>>> at stage2 as huge pages. However the checks are now spread across
>>> two different places. Let us unify the handling of the THPs to
>>> keep the code cleaner (and future proof for PUD THP support).
>>> This patch moves transparent_hugepage_adjust() closer to the caller
>>> to avoid a forward declaration for
>>> fault_supports_stage2_huge_mappings().
>>>
>>> Also, since we already handle the case where the host VA and the guest
>>> PA may not be aligned, the explicit VM_BUG_ON() is not required.
>>>
>>> Cc: Marc Zyngier
>>> Cc: Christoffer Dall
>>> Cc: Zneghui Yu
>>> Signed-off-by: Suzuki K Poulose
>>> ---
>>>    virt/kvm/arm/mmu.c | 123
>>> +++++++++++++++++++++++++++--------------------------
>>>    1 file changed, 62 insertions(+), 61 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 6d73322..714eec2 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm,
>>> phys_addr_t guest_ipa,
>>>        return ret;
>>>    }
>>> -static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t
>>> *ipap)
>>> -{
>>> -    kvm_pfn_t pfn = *pfnp;
>>> -    gfn_t gfn = *ipap >> PAGE_SHIFT;
>>> -    struct page *page = pfn_to_page(pfn);
>>> -
>>> -    /*
>>> -     * PageTransCompoundMap() returns true for THP and
>>> -     * hugetlbfs. Make sure the adjustment is done only for THP
>>> -     * pages.
>>> -     */
>>> -    if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>> -        unsigned long mask;
>>> -        /*
>>> -         * The address we faulted on is backed by a transparent huge
>>> -         * page.  However, because we map the compound huge page and
>>> -         * not the individual tail page, we need to transfer the
>>> -         * refcount to the head page.  We have to be careful that the
>>> -         * THP doesn't start to split while we are adjusting the
>>> -         * refcounts.
>>> -         *
>>> -         * We are sure this doesn't happen, because mmu_notifier_retry
>>> -         * was successful and we are holding the mmu_lock, so if this
>>> -         * THP is trying to split, it will be blocked in the mmu
>>> -         * notifier before touching any of the pages, specifically
>>> -         * before being able to call __split_huge_page_refcount().
>>> -         *
>>> -         * We can therefore safely transfer the refcount from PG_tail
>>> -         * to PG_head and switch the pfn from a tail page to the head
>>> -         * page accordingly.
>>> -         */
>>> -        mask = PTRS_PER_PMD - 1;
>>> -        VM_BUG_ON((gfn & mask) != (pfn & mask));
>>> -        if (pfn & mask) {
>>> -            *ipap &= PMD_MASK;
>>> -            kvm_release_pfn_clean(pfn);
>>> -            pfn &= ~mask;
>>> -            kvm_get_pfn(pfn);
>>> -            *pfnp = pfn;
>>> -        }
>>> -
>>> -        return true;
>>> -    }
>>> -
>>> -    return false;
>>> -}
>>> -
>>>    /**
>>>     * stage2_wp_ptes - write protect PMD range
>>>     * @pmd:    pointer to pmd entry
>>> @@ -1677,6 +1630,61 @@ static bool
>>> fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>>               (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>>>    }
>>> +/*
>>> + * Check if the given hva is backed by a transparent huge page (THP)
>>> + * and whether it can be mapped using block mapping in stage2. If
>>> so, adjust
>>> + * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
>>> + * supported. This will need to be updated to support other THP sizes.
>>> + *
>>> + * Returns the size of the mapping.
>>> + */
>>> +static unsigned long
>>> +transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
>>> +                unsigned long hva, kvm_pfn_t *pfnp,
>>> +                phys_addr_t *ipap)
>>> +{
>>> +    kvm_pfn_t pfn = *pfnp;
>>> +    struct page *page = pfn_to_page(pfn);
>>> +
>>> +    /*
>>> +     * PageTransCompoundMap() returns true for THP and
>>> +     * hugetlbfs. Make sure the adjustment is done only for THP
>>> +     * pages. Also make sure that the HVA and IPA are sufficiently
>>> +     * aligned and that the  block map is contained within the memslot.
>>> +     */
>>> +    if (!PageHuge(page) && PageTransCompoundMap(page) &&
>>
>> We managed to get here, ensure that we only play with normal size pages
>> and no hugetlbfs pages will be involved.  "!PageHuge(page)" will always
>> return true and we can let it go.
>
> I think that is a bit tricky. If someone ever modifies the user_mem_abort()
> and we end up in getting called with a HugeTLB backed page things could go
> wrong.

That will be bad. I'm not sure if it's possible in the future.

> I could do remove the check, but would like to add a WARN_ON_ONCE() to make
> sure our assumption is held.
>
> i.e,
>     WARN_ON_ONCE(PageHuge(page));

But this is a careful approach. I think this will be valuable both for
developers and the code itself.

Thanks!

zenghui

>
>     if (PageTransCompoundMap(page) &&
>> +
> fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
>
> ...
>
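
For readers following the thread, here is a rough userspace sketch of the
arithmetic being discussed: the alignment and containment check, followed by
rounding the PFN and IPA down to the start of the block. It is not the kernel
code. The struct, the sketch_* names, the is_thp flag (standing in for the
page-type tests), the 4 KiB page and 2 MiB PMD sizes, and the page-size
fallback return value are all assumptions made for illustration. The refcount
transfer (kvm_release_pfn_clean()/kvm_get_pfn()) from the quoted patch is
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages and 2 MiB PMD block mappings. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PAGE_SIZE    (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_PMD_SIZE     (UINT64_C(1) << 21)
#define SKETCH_PMD_MASK     (~(SKETCH_PMD_SIZE - 1))
#define SKETCH_PTRS_PER_PMD (SKETCH_PMD_SIZE >> SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the memslot fields the check needs. */
struct memslot_sketch {
    uint64_t base_gfn;        /* first guest frame number of the slot */
    uint64_t npages;          /* slot length in pages */
    uint64_t userspace_addr;  /* host VA where the slot starts */
};

/*
 * True if a map_size block around hva can be used: guest PA and host VA
 * must share the same offset within a block, and the whole block must
 * lie inside the slot.
 */
static bool sketch_supports_huge_mapping(const struct memslot_sketch *slot,
                                         uint64_t hva, uint64_t map_size)
{
    /* map_size must be a power of two for the masking below to work. */
    assert(map_size != 0 && (map_size & (map_size - 1)) == 0);

    uint64_t gpa_start = slot->base_gfn << SKETCH_PAGE_SHIFT;
    uint64_t uaddr_start = slot->userspace_addr;
    uint64_t uaddr_end = uaddr_start + (slot->npages << SKETCH_PAGE_SHIFT);
    uint64_t block_start = hva & ~(map_size - 1);

    if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
        return false;

    return block_start >= uaddr_start && block_start + map_size <= uaddr_end;
}

/*
 * If the page is a THP and a block mapping is usable, round *pfn and *ipa
 * down to the block start and return the block size; otherwise leave them
 * untouched and return the page size (assumed fallback).
 */
static uint64_t sketch_thp_adjust(const struct memslot_sketch *slot,
                                  uint64_t hva, bool is_thp,
                                  uint64_t *pfn, uint64_t *ipa)
{
    if (is_thp && sketch_supports_huge_mapping(slot, hva, SKETCH_PMD_SIZE)) {
        *ipa &= SKETCH_PMD_MASK;
        *pfn &= ~(SKETCH_PTRS_PER_PMD - 1);
        return SKETCH_PMD_SIZE;
    }
    return SKETCH_PAGE_SIZE;
}
```

With a slot whose guest base and host base are both 2 MiB aligned, a fault
in the middle of a block gets its PFN and IPA rounded down to the block
start. If the two bases have different offsets within a block, the check
fails and the values are left alone, which is why the old VM_BUG_ON() on
mismatched low bits is no longer needed on that path.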