Subject: Re: [PATCH] kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP
From: Zenghui Yu
To: Suzuki K Poulose
Date: Mon, 8 Apr 2019 23:11:10 +0800
Message-ID: <44ffad26-a407-4603-0bb6-145f7290adbe@huawei.com>
References: <1554203176-3958-1-git-send-email-suzuki.poulose@arm.com> <2ea55b9c-09da-c3d0-3616-aa6be85b5a46@huawei.com>
Hi Suzuki,

Thanks for the reply.

On 2019/4/8 18:35, Suzuki K Poulose wrote:
> On 04/08/2019 04:50 AM, Zenghui Yu wrote:
>>
>>
>> On 2019/4/2 19:06, Suzuki K Poulose wrote:
>>> With commit a80868f398554842b14, we no longer ensure that the
>>> THP page is properly aligned in the guest IPA. Skip the stage2
>>> huge mapping for unaligned IPA backed by transparent hugepages.
>>>
>>> Fixes: a80868f398554842b14 ("KVM: arm/arm64: Enforce PTE mappings at stage2 when needed")
>>> Reported-by: Eric Auger
>>> Cc: Marc Zyngier
>>> Cc: Christoffer Dall
>>> Cc: Zenghui Yu
>>> Cc: Zheng Xiang
>>> Tested-by: Eric Auger
>>> Signed-off-by: Suzuki K Poulose
>>
>> Hi Suzuki,
>>
>> Why not make use of fault_supports_stage2_huge_mapping()?  Let it do
>> some checks for us.
>>
>> fault_supports_stage2_huge_mapping() was intended to do a *two-step*
>> check to tell us whether we can create stage2 huge block mappings, and
>> this check covers both hugetlbfs and THP.  With commit a80868f398554842b14,
>> we pass PAGE_SIZE as "map_size" for normal size pages (which turned out
>> to be almost meaningless), and unfortunately the THP check no longer
>> works.
>
> That's correct.
>
>>
>> So we want to rework the *THP* check process.  Your patch fixes the first
>> checking step, but the second is still missing, am I wrong?
>
> It fixes the step explicitly for THP by making sure that the GPA and
> the HVA are aligned to the map size.

Yes, I understand how your patch fixes the issue. But what I'm really
concerned about here is the *second* checking step in
fault_supports_stage2_huge_mapping(). We have to check whether we are
mapping a non-block-aligned or non-block-sized memslot; if so, we cannot
create block mappings for the beginning and end of this memslot. This is
what the second part of fault_supports_stage2_huge_mapping() does.

I haven't seen this checking step in your patch, did I miss something?

>>
>> Can you please take a look at the diff below?
>>
>>
>> thanks,
>>
>> zenghui
>>
>>> ---
>>>   virt/kvm/arm/mmu.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 27c9583..4a22f5b 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1412,7 +1412,9 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>>>            * page accordingly.
>>>            */
>>>           mask = PTRS_PER_PMD - 1;
>>> -        VM_BUG_ON((gfn & mask) != (pfn & mask));
>>
>> Somehow, I'd prefer keeping the VM_BUG_ON() here, let it report some
>> potential issues in the future (of course I hope none :) )
>
> I don't think this calls for a VM_BUG_ON(). It is simply a case where
> the GPA is not aligned to the HVA, but for a normal VMA that could be
> made THP.
>
> We had this VM_BUG_ON(), which would have never hit because we would
> have set force_pte if they were not aligned.

Yes, I agree.

>>> +        /* Skip memslots with unaligned IPA and user address */
>>> +        if ((gfn & mask) != (pfn & mask))
>>> +            return false;
>>>           if (pfn & mask) {
>>>               *ipap &= PMD_MASK;
>>>               kvm_release_pfn_clean(pfn);
>>>
>>
>> ---8>---
>>
>> Rework fault_supports_stage2_huge_mapping(), let it check THP again.
>>
>> Signed-off-by: Zenghui Yu
>> ---
>>   virt/kvm/arm/mmu.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 27c9583..5e1b258 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1632,6 +1632,15 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>       uaddr_end = uaddr_start + size;
>>
>>       /*
>> +     * If the memslot is _not_ backed by hugetlbfs, then check if it
>> +     * can be backed by transparent hugepages.
>> +     *
>> +     * Currently only PMD_SIZE THPs are supported, revisit it later.
>> +     */
>> +    if (map_size == PAGE_SIZE)
>> +        map_size = PMD_SIZE;
>> +
>
> This looks hackish. What if we support PUD_SIZE huge pages in the future?

Yes, this might make the code a little difficult to understand. But by
doing so, we follow the same logic as before commit a80868f398554842b14;
that is, we do the two-step check for normal size pages in
fault_supports_stage2_huge_mapping() to decide whether we can create THP
mappings for these pages.

As for PUD_SIZE THPs, to be honest, I have no idea now :(


thanks,

zenghui

>> +    /*
>>        * Pages belonging to memslots that don't have the same alignment
>>        * within a PMD/PUD for userspace and IPA cannot be mapped with stage-2
>>        * PMD/PUD entries, because we'll end up mapping the wrong pages.
>> @@ -1643,7 +1652,7 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>        *    |abcde|fgh  Stage-1 block  |    Stage-1 block tv|xyz|
>>        *    +-----+--------------------+--------------------+---+
>>        *
>> -     *    memslot->base_gfn << PAGE_SIZE:
>> +     *    memslot->base_gfn << PAGE_SHIFT:
>
> That could be fixed.
>
>>        *      +---+--------------------+--------------------+-----+
>>        *      |abc|def  Stage-2 block  |    Stage-2 block   |tvxyz|
>>        *      +---+--------------------+--------------------+-----+
>
>
> But personally I don't think this is the right way to fix it. And as
> mentioned, the VM_BUG_ON() is unnecessary and could easily be triggered
> by the user by using unaligned GPA/HVA. All we need to do is use page
> mappings in such cases, which we do with my patch.
>
>
> Cheers
> Suzuki
>
> .
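
For readers following the thread outside the kernel tree, below is a minimal,
standalone C sketch of the *two-step* check being discussed. It is not the
kernel code: the struct, the constants, and the function name are simplified
stand-ins, and the real fault_supports_stage2_huge_mapping() lives in
virt/kvm/arm/mmu.c with different types and helpers. The sketch only
illustrates the arithmetic: step 1 rejects slots whose guest PA and host VA
disagree on their offset within a block, step 2 rejects the partial blocks at
the edges of an unaligned or oddly sized memslot.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT	12UL
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PMD_SIZE	(1UL << 21)		/* 2MiB block with 4K pages */

/* Simplified stand-in for struct kvm_memory_slot. */
struct memslot {
	uint64_t base_gfn;		/* first guest frame number */
	uint64_t npages;		/* slot size in pages */
	uint64_t userspace_addr;	/* HVA backing base_gfn */
};

static bool supports_huge_mapping(const struct memslot *slot,
				  uint64_t hva, uint64_t map_size)
{
	uint64_t gpa_start = slot->base_gfn << PAGE_SHIFT;
	uint64_t uaddr_start = slot->userspace_addr;
	uint64_t uaddr_end = uaddr_start + (slot->npages << PAGE_SHIFT);

	/*
	 * Step 1: guest PA and host VA must share the same offset within
	 * a block of map_size, otherwise a block mapping would map the
	 * wrong pages.
	 */
	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
		return false;

	/*
	 * Step 2: the whole block containing the faulting HVA must lie
	 * inside the memslot, so the partial blocks at either end of the
	 * slot fall back to page mappings.
	 */
	return (hva & ~(map_size - 1)) >= uaddr_start &&
	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
}

int main(void)
{
	struct memslot slot = {
		.base_gfn = 0x100,		/* guest PA starts at 0x100000 */
		.npages = 1024,			/* 4MiB slot */
		.userspace_addr = 0x7f0000100000ULL,
	};

	/*
	 * The idea in the discussion: when the slot is not hugetlbfs backed,
	 * repeat the same check with PMD_SIZE to decide whether a THP block
	 * mapping would be safe at this address.
	 */
	printf("PMD block mapping allowed: %d\n",
	       supports_huge_mapping(&slot, 0x7f0000200000ULL, PMD_SIZE));
	return 0;
}

With the example slot above, the check passes because the guest PA and the
HVA are both 2MiB-congruent and the faulting address sits in a block fully
contained in the slot; shifting userspace_addr by one page would make step 1
fail and force PTE mappings, which is the behaviour both patches aim for.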