Message-ID: <5d0b9341-78b5-0959-2517-0fb1fe83a205@intel.com>
Date: Fri, 8 Jul 2022 11:29:49 +0800
Subject: Re: [PATCH v6 6/8] KVM: Handle page fault for private memory
To: Sean Christopherson
Cc: Michael Roth, Vishal Annapurve, Chao Peng, "Nikunj A. Dadhania", kvm list, LKML, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, qemu-devel@nongnu.org, Paolo Bonzini, Jonathan Corbet, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86, "H . Peter Anvin", Hugh Dickins, Jeff Layton, "J . Bruce Fields", Andrew Morton, Mike Rapoport, Steven Price, "Maciej S . Szmigiero", Vlastimil Babka, Yu Zhang, "Kirill A .
Shutemov", Andy Lutomirski, Jun Nakajima, Dave Hansen, Andi Kleen, David Hildenbrand, aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com, Quentin Perret, mhocko@suse.com
References: <20220519153713.819591-1-chao.p.peng@linux.intel.com> <20220519153713.819591-7-chao.p.peng@linux.intel.com> <20220624090246.GA2181919@chaop.bj.intel.com> <20220630222140.of4md7bufd5jv5bh@amd.com> <4fe3b47d-e94a-890a-5b87-6dfb7763bc7e@intel.com>
From: Xiaoyao Li
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/8/2022 4:08 AM, Sean Christopherson wrote:
> On Fri, Jul 01, 2022, Xiaoyao Li wrote:
>> On 7/1/2022 6:21 AM, Michael Roth wrote:
>>> On Thu, Jun 30, 2022 at 12:14:13PM -0700, Vishal Annapurve wrote:
>>>> With transparent_hugepages=always setting I see issues with the
>>>> current implementation.
>
> ...
>
>>>> Looks like with transparent huge pages enabled kvm tried to handle the
>>>> shared memory fault on 0x84d gfn by coalescing nearby 4K pages
>>>> to form a contiguous 2MB page mapping at gfn 0x800, since level 2 was
>>>> requested in kvm_mmu_spte_requested.
>>>> This caused the private memory contents from regions 0x800-0x84c and
>>>> 0x86e-0xa00 to get unmapped from the guest leading to guest vm
>>>> shutdown.
>>>
>>> Interesting... seems like that wouldn't be an issue for non-UPM SEV, since
>>> the private pages would still be mapped as part of that 2M mapping, and
>>> it's completely up to the guest as to whether it wants to access as
>>> private or shared.
>>> But for UPM it makes sense this would cause issues.
>>>
>>>>
>>>> Does getting the mapping level as per the fault access type help
>>>> address the above issue? Any such coalescing should not cross between
>>>> private to
>>>> shared or shared to private memory regions.
>>>
>>> Doesn't seem like changing the check to fault->is_private would help in
>>> your particular case, since the subsequent host_pfn_mapping_level() call
>>> only seems to limit the mapping level to whatever the mapping level is
>>> for the HVA in the host page table.
>>>
>>> Seems like with UPM we need some additional handling here that also
>>> checks that the entire 2M HVA range is backed by non-private memory.
>>>
>>> Non-UPM SNP hypervisor patches already have a similar hook added to
>>> host_pfn_mapping_level() which implements such a check via RMP table, so
>>> UPM might need something similar:
>>>
>>> https://github.com/AMDESE/linux/commit/ae4475bc740eb0b9d031a76412b0117339794139
>>>
>>> -Mike
>>>
>>
>> For TDX, we try to track the page type (shared, private, mixed) of each gfn
>> at given level. Only when the type is shared/private, can it be mapped at
>> that level. When it's mixed, i.e., it contains both shared pages and private
>> pages at given level, it has to go to next smaller level.
>>
>> https://github.com/intel/tdx/commit/ed97f4042eb69a210d9e972ccca6a84234028cad
>
> Hmm, so a new slot->arch.page_attr array shouldn't be necessary, KVM can instead
> update slot->arch.lpage_info on shared<->private conversions. Detecting whether
> a given range is partially mapped could get nasty if KVM defers tracking to the
> backing store, but if KVM itself does the tracking as was previously suggested[*],
> then updating lpage_info should be relatively straightfoward, e.g. use
> xa_for_each_range() to see if a given 2mb/1gb range is completely covered (fully
> shared) or not covered at all (fully private).
>
> [*] https://lore.kernel.org/all/YofeZps9YXgtP3f1@google.com

Yes, slot->arch.page_attr was introduced to help identify whether a page
is completely shared or completely private at a given level. It seems an
XArray can serve the same purpose, though I know nothing about it. Looking
forward to seeing the patch that uses XArray.

Yes, updating slot->arch.lpage_info is a good way to reuse the existing
logic, and Isaku has already applied it to slot->arch.lpage_info in the
2MB support patches.
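
For illustration only, the shared/private/mixed rule discussed in this thread can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not KVM code: every name below (struct shared_run, classify_range(), max_mapping_pages()) is hypothetical, and a single contiguous run of shared gfns stands in for whatever structure (page_attr array, XArray, lpage_info) ends up doing the tracking. With the numbers from Vishal's report, and assuming the shared run is gfns 0x84d-0x86d, the 2MB-aligned range at gfn 0x800 is mixed, so a fault on 0x84d must fall back to a 4K mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; all names here are hypothetical, not KVM's. */
typedef uint64_t gfn_t;

enum range_attr { RANGE_PRIVATE, RANGE_SHARED, RANGE_MIXED };

#define PAGES_PER_2M 512u

/*
 * Stand-in for the tracking structure: one contiguous run of shared
 * gfns [start, end] (inclusive); every other gfn is treated as private.
 */
struct shared_run {
	gfn_t start;
	gfn_t end;
};

static bool gfn_is_shared(const struct shared_run *r, gfn_t gfn)
{
	return gfn >= r->start && gfn <= r->end;
}

/* Classify [base, base + npages) as fully private, fully shared, or mixed. */
static enum range_attr classify_range(const struct shared_run *r,
				      gfn_t base, uint64_t npages)
{
	uint64_t shared = 0;

	assert(npages > 0);

	for (uint64_t i = 0; i < npages; i++)
		if (gfn_is_shared(r, base + i))
			shared++;

	if (shared == 0)
		return RANGE_PRIVATE;
	if (shared == npages)
		return RANGE_SHARED;
	return RANGE_MIXED;
}

/*
 * Number of 4K pages a mapping for gfn may cover: 512 when the enclosing
 * 2MB-aligned range is uniformly shared or private, otherwise 1.
 */
static uint64_t max_mapping_pages(const struct shared_run *r, gfn_t gfn)
{
	gfn_t base = gfn & ~(gfn_t)(PAGES_PER_2M - 1);

	if (classify_range(r, base, PAGES_PER_2M) == RANGE_MIXED)
		return 1;
	return PAGES_PER_2M;
}
```

The linear scan is only for clarity; the point of updating lpage_info on shared<->private conversions, as suggested above, is that the fault path would not need to rescan the range each time.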