Received-SPF: pass (google.com: domain of linux-kernel+bounces-104206-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99;
Date: Fri, 15 Mar 2024 17:14:13 +0800
From: Chao Peng <chao.p.peng@linux.intel.com>
To: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: seanjc@google.com, pbonzini@redhat.com, isaku.yamahata@intel.com,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] KVM: x86/mmu: x86: Don't overflow lpage_info when
 checking attributes
Message-ID: <20240315091413.GA221655@chaop.bj.intel.com>
Reply-To: Chao Peng <chao.p.peng@linux.intel.com>
References: <20240314212902.2762507-1-rick.p.edgecombe@intel.com>
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset=gb2312
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20240314212902.2762507-1-rick.p.edgecombe@intel.com>

On Thu, Mar 14, 2024 at 02:29:02PM -0700, Rick Edgecombe wrote:
> Fix KVM_SET_MEMORY_ATTRIBUTES to not overflow lpage_info array and trigger
> KASAN splat, as seen in the private_mem_conversions_test selftest.
> 
> When memory attributes are set on a GFN range, that range will have
> specific properties applied to the TDP. A huge page cannot be used when
> the attributes are inconsistent, so they are disabled for those the
> specific huge pages. For internal KVM reasons, huge pages are also not
> allowed to span adjacent memslots regardless of whether the backing memory
> could be mapped as huge.
> 
> What GFNs support which huge page sizes is tracked by an array of arrays
> 'lpage_info' on the memslot, of ??kvm_lpage_info?? structs. Each index of
> lpage_info contains a vmalloc allocated array of these for a specific
> supported page size. The kvm_lpage_info denotes whether a specific huge
> page (GFN and page size) on the memslot is supported. These arrays include
> indices for unaligned head and tail huge pages.
> 
> Preventing huge pages from spanning adjacent memslot is covered by
> incrementing the count in head and tail kvm_lpage_info when the memslot is
> allocated, but disallowing huge pages for memory that has mixed attributes
> has to be done in a more complicated way. During the
> KVM_SET_MEMORY_ATTRIBUTES ioctl KVM updates lpage_info for each memslot in
> the range that has mismatched attributes. KVM does this a memslot at a
> time, and marks a special bit, KVM_LPAGE_MIXED_FLAG, in the kvm_lpage_info
> for any huge page. This bit is essentially a permanently elevated count.
> So huge pages will not be mapped for the GFN at that page size if the
> count is elevated in either case: a huge head or tail page unaligned to
> the memslot or if KVM_LPAGE_MIXED_FLAG is set because it has mixed
> attributes.
> 
> To determine whether a huge page has consistent attributes, the
> KVM_SET_MEMORY_ATTRIBUTES operation checks an xarray to make sure it
> consistently has the incoming attribute. Since level - 1 huge pages are
> aligned to level huge pages, it employs an optimization. As long as the
> level - 1 huge pages are checked first, it can just check these and assume
> that if each level - 1 huge page contained within the level sized huge
> page is not mixed, then the level size huge page is not mixed. This
> optimization happens in the helper hugepage_has_attrs().
> 
> Unfortunately, although the kvm_lpage_info array representing page size
> 'level' will contain an entry for an unaligned tail page of size level,
> the array for level - 1  will not contain an entry for each GFN at page
> size level. The level - 1 array will only contain an index for any
> unaligned region covered by level - 1 huge page size, which can be a
> smaller region. So this causes the optimization to overflow the level - 1
> kvm_lpage_info and perform a vmalloc out of bounds read.
> 
> In some cases of head and tail pages where an overflow could happen,
> callers skip the operation completely as KVM_LPAGE_MIXED_FLAG is not
> required to prevent huge pages as discussed earlier. But for memslots that
> are smaller than the 1GB page size, it does call hugepage_has_attrs(). In
> this case the huge page is both the head and tail page. The issue can be
> observed simply by compiling the kernel with CONFIG_KASAN_VMALLOC and
> running the selftest ??private_mem_conversions_test??, which produces the
> output like the following:
> 
> BUG: KASAN: vmalloc-out-of-bounds in hugepage_has_attrs+0x7e/0x110
> Read of size 4 at addr ffffc900000a3008 by task private_mem_con/169
> Call Trace:
>   dump_stack_lvl
>   print_report
>   ? __virt_addr_valid
>   ? hugepage_has_attrs
>   ? hugepage_has_attrs
>   kasan_report
>   ? hugepage_has_attrs
>   hugepage_has_attrs
>   kvm_arch_post_set_memory_attributes
>   kvm_vm_ioctl
> 
> It is a little ambiguous whether the unaligned head page (in the bug case 
> also the tail page) should be expected to have KVM_LPAGE_MIXED_FLAG set. 
> It is not functionally required, as the unaligned head/tail pages will 
> already have their kvm_lpage_info count incremented. The comments imply 
> not setting it on unaligned head pages is intentional, so fix the callers 
> to skip trying to set KVM_LPAGE_MIXED_FLAG in this case, and in doing so 
> not call hugepage_has_attrs().
> 
> Cc: stable@vger.kernel.org
> Fixes: 90b4fe17981e ("KVM: x86: Disallow hugepages when memory attributes are mixed")
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

Reviewed-by: Chao Peng <chao.p.peng@linux.intel.com>

> ---
> v2:
>  - Drop function rename (Sean)
>  - Clarify in commit log that this is only head pages that are also tail
>    pages (Sean)
> ---
>  arch/x86/kvm/mmu/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 0544700ca50b..42e7de604bb6 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -7388,7 +7388,8 @@ bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
>  			 * by the memslot, KVM can't use a hugepage due to the
>  			 * misaligned address regardless of memory attributes.
>  			 */
> -			if (gfn >= slot->base_gfn) {
> +			if (gfn >= slot->base_gfn &&
> +			    gfn + nr_pages <= slot->base_gfn + slot->npages) {
>  				if (hugepage_has_attrs(kvm, slot, gfn, level, attrs))
>  					hugepage_clear_mixed(slot, gfn, level);
>  				else
> -- 
> 2.34.1