Received: by 2002:ab2:2994:0:b0:1ef:ca3e:3cd5 with SMTP id n20csp952780lqb; Fri, 15 Mar 2024 10:47:36 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCUOH52TkyGfdf8xu/lc1T57oJcTLE0aBOeCaiEBCqj6gy5rz4smzWCR47+AjOlXMe2EFGDpZF/7xw5CR4Ls5DQSqR3kUgs0IR+THG9vaA== X-Google-Smtp-Source: AGHT+IEREYV/N1/Ud6b1fsEMtcn1PDJvA7bSb5VEp9rrUepeh9cQsMBDFb/21OdS7OeXENnbCdej X-Received: by 2002:a17:90a:bb89:b0:29b:8118:9822 with SMTP id v9-20020a17090abb8900b0029b81189822mr4950783pjr.17.1710524855719; Fri, 15 Mar 2024 10:47:35 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1710524855; cv=pass; d=google.com; s=arc-20160816; b=ZTQe4/Gu7JWOXIMWfc9iKYvtwGmf8tUO4BaKMc3FKiBXv9ZXSlMLK4seCuLYBbRKKm 7hpLo+6xEbcJTUj44w0spDRa2rQ2JXGH+YH4NLlEM7PBcFQXoyYo68kXx+r35N4KyBz8 KpuO/nOEvKyS4scQ2MV8S2W53X5U7M91JOh7zV7XvFA6c/20UOb/HOWiPl5/Xl6YIr1t Y02VPmjn6iDfRllcfbavaVyQl+sF7Dj07qQi5Hw4HP0MTcj1IxT1aMaJQCmpKiEejT5a OOedGID+esDTvhf1aiRjrkxcz2wtxx7w7MM5oJPJ/OvOgMDxOiebZznXsUrwHk93HX0M Sdcg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:list-unsubscribe:list-subscribe:list-id:precedence :references:reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=hoJNHYHBvc+NiTzUMXR7SPofvce6CeMpj6LS6d87N84=; fh=X77QSQ+PT1btzqynWfD1ByluRwdirq0tP3h+cZx2G+M=; b=MRDTwPLZko/iyq6mKRzrl80YSCYSUkVptjKdiKr1SWoxgCuLUIA8s8D5cFIaN3flqZ CnBVoy3pj6V0njLaGXQmAcV1yaFoz+j5TiQYp6cwgOm2hdrg3RSGpV6awk5WjNgoJI8M +35bjojua/PJnwAfUPbfuD3hvnVapcQjjkmixkT4FqTft0JaIFeTK51ckKFD/hG0BB53 JaOePhLYrD7xQS75RN/p6iDQCSgkp0cwIk1l/nlCNS8oilwbI4lsgeCVuBeBMcNkYnY4 rtU7Y0uBl2lnwuokfbdZhVpZQ8/uJf/rRbJBGOLz/qL91oLeypd6b4rBGMKS0LIdxM9A +EhA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=UI4O34qY; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-104206-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-104206-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [139.178.88.99]) by mx.google.com with ESMTPS id gi8-20020a17090b110800b0029c6f8dd5casi3007582pjb.185.2024.03.15.10.47.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Mar 2024 10:47:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-104206-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=UI4O34qY; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-104206-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-104206-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id C3B56282AD3 for ; Fri, 15 Mar 2024 09:28:51 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id DF61A17C6E; Fri, 15 Mar 2024 09:28:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UI4O34qY" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32FF31799D; Fri, 15 Mar 2024 09:28:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710494923; cv=none; b=AgLxNULnba71ZLY9gfgp7KKey30OZEL8cMSpi9UETBG+W+5I6LxZkwQ0nXs71dbUg6ObvE9XqR+arCPX/YcECc67r3AIe/7qSw3IUErgr+F3J7oseJZSLEWS0IB9MnyLFwXJ6YzTr1DPTphdHxgBnSFO0Boop1oLGaqTPC39S30= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710494923; c=relaxed/simple; bh=Jw1py/JdYhpsbyL85xC8WpCBa6kvGQQhAkylKoj7qcM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Xb1E5//mdhMJx9S1CAzzSKuBTOh3wN/SsKXUuBEWREF0dYPd9hE6YPstBNdMG4bKCK/m6rE8x0EHw3Sx9dNMqh4VW4Ak3TlolmArsGDGNy4j8vQpbot6z4C05Q9rWAXdBxNqBughbL2UB4UBjh5jdGceinNjijyNE0Qbg6n1sks= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=UI4O34qY; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1710494921; x=1742030921; h=date:from:to:cc:subject:message-id:reply-to:references: mime-version:content-transfer-encoding:in-reply-to; bh=Jw1py/JdYhpsbyL85xC8WpCBa6kvGQQhAkylKoj7qcM=; b=UI4O34qYxhklY71jQniQFVxB4TgYlJkJmVP2WnWTCFZ3uGK1xxk3MlUA 5G5O8tdnGfb+5B3N5HbJ/39Xe4Ddy9pBTbiqAY1PnRv6S4x9AxvJDmo6A V623zUncoAnNw0k30QV6Now4o0m/imm3na4RYB0wkAnmmlfyr+eiLtGJR 6RYJkf6xoQaxvxS6s332PnR+4/ubuUy6ynrDkvtysUUiWM2woKtP6pK5A 0QtO/jPfuFOaOjg089LXluxIqPxUr5I9Pb95UCUHyj/o5Bgi+OVaGiGcZ MKd8F/jSZzGRGumk50l4Pwa+y/FsIPHI6wJTeMrt5BTuQCYzI9cqu3R+8 g==; X-IronPort-AV: E=McAfee;i="6600,9927,11013"; a="22874660" X-IronPort-AV: E=Sophos;i="6.07,128,1708416000"; d="scan'208";a="22874660" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2024 02:28:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,128,1708416000"; d="scan'208";a="43509447" Received: from chaop.bj.intel.com (HELO localhost) ([10.240.193.80]) by orviesa002.jf.intel.com with ESMTP; 15 Mar 2024 02:28:38 -0700 Date: Fri, 15 Mar 2024 17:14:13 +0800 From: Chao Peng To: Rick Edgecombe Cc: seanjc@google.com, pbonzini@redhat.com, isaku.yamahata@intel.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] KVM: x86/mmu: x86: Don't overflow lpage_info when checking attributes Message-ID: <20240315091413.GA221655@chaop.bj.intel.com> Reply-To: Chao Peng References: <20240314212902.2762507-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=gb2312 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20240314212902.2762507-1-rick.p.edgecombe@intel.com> On Thu, Mar 14, 2024 at 02:29:02PM -0700, Rick Edgecombe wrote: > Fix KVM_SET_MEMORY_ATTRIBUTES to not overflow lpage_info array and trigger > KASAN splat, as seen in the private_mem_conversions_test selftest. > > When memory attributes are set on a GFN range, that range will have > specific properties applied to the TDP. A huge page cannot be used when > the attributes are inconsistent, so they are disabled for those the > specific huge pages. For internal KVM reasons, huge pages are also not > allowed to span adjacent memslots regardless of whether the backing memory > could be mapped as huge. > > What GFNs support which huge page sizes is tracked by an array of arrays > 'lpage_info' on the memslot, of ??kvm_lpage_info?? structs. Each index of > lpage_info contains a vmalloc allocated array of these for a specific > supported page size. The kvm_lpage_info denotes whether a specific huge > page (GFN and page size) on the memslot is supported. These arrays include > indices for unaligned head and tail huge pages. > > Preventing huge pages from spanning adjacent memslot is covered by > incrementing the count in head and tail kvm_lpage_info when the memslot is > allocated, but disallowing huge pages for memory that has mixed attributes > has to be done in a more complicated way. During the > KVM_SET_MEMORY_ATTRIBUTES ioctl KVM updates lpage_info for each memslot in > the range that has mismatched attributes. KVM does this a memslot at a > time, and marks a special bit, KVM_LPAGE_MIXED_FLAG, in the kvm_lpage_info > for any huge page. This bit is essentially a permanently elevated count. > So huge pages will not be mapped for the GFN at that page size if the > count is elevated in either case: a huge head or tail page unaligned to > the memslot or if KVM_LPAGE_MIXED_FLAG is set because it has mixed > attributes. > > To determine whether a huge page has consistent attributes, the > KVM_SET_MEMORY_ATTRIBUTES operation checks an xarray to make sure it > consistently has the incoming attribute. Since level - 1 huge pages are > aligned to level huge pages, it employs an optimization. As long as the > level - 1 huge pages are checked first, it can just check these and assume > that if each level - 1 huge page contained within the level sized huge > page is not mixed, then the level size huge page is not mixed. This > optimization happens in the helper hugepage_has_attrs(). > > Unfortunately, although the kvm_lpage_info array representing page size > 'level' will contain an entry for an unaligned tail page of size level, > the array for level - 1 will not contain an entry for each GFN at page > size level. The level - 1 array will only contain an index for any > unaligned region covered by level - 1 huge page size, which can be a > smaller region. So this causes the optimization to overflow the level - 1 > kvm_lpage_info and perform a vmalloc out of bounds read. > > In some cases of head and tail pages where an overflow could happen, > callers skip the operation completely as KVM_LPAGE_MIXED_FLAG is not > required to prevent huge pages as discussed earlier. But for memslots that > are smaller than the 1GB page size, it does call hugepage_has_attrs(). In > this case the huge page is both the head and tail page. The issue can be > observed simply by compiling the kernel with CONFIG_KASAN_VMALLOC and > running the selftest ??private_mem_conversions_test??, which produces the > output like the following: > > BUG: KASAN: vmalloc-out-of-bounds in hugepage_has_attrs+0x7e/0x110 > Read of size 4 at addr ffffc900000a3008 by task private_mem_con/169 > Call Trace: > dump_stack_lvl > print_report > ? __virt_addr_valid > ? hugepage_has_attrs > ? hugepage_has_attrs > kasan_report > ? hugepage_has_attrs > hugepage_has_attrs > kvm_arch_post_set_memory_attributes > kvm_vm_ioctl > > It is a little ambiguous whether the unaligned head page (in the bug case > also the tail page) should be expected to have KVM_LPAGE_MIXED_FLAG set. > It is not functionally required, as the unaligned head/tail pages will > already have their kvm_lpage_info count incremented. The comments imply > not setting it on unaligned head pages is intentional, so fix the callers > to skip trying to set KVM_LPAGE_MIXED_FLAG in this case, and in doing so > not call hugepage_has_attrs(). > > Cc: stable@vger.kernel.org > Fixes: 90b4fe17981e ("KVM: x86: Disallow hugepages when memory attributes are mixed") > Signed-off-by: Rick Edgecombe Reviewed-by: Chao Peng > --- > v2: > - Drop function rename (Sean) > - Clarify in commit log that this is only head pages that are also tail > pages (Sean) > --- > arch/x86/kvm/mmu/mmu.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index 0544700ca50b..42e7de604bb6 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -7388,7 +7388,8 @@ bool kvm_arch_post_set_memory_attributes(struct kvm *kvm, > * by the memslot, KVM can't use a hugepage due to the > * misaligned address regardless of memory attributes. > */ > - if (gfn >= slot->base_gfn) { > + if (gfn >= slot->base_gfn && > + gfn + nr_pages <= slot->base_gfn + slot->npages) { > if (hugepage_has_attrs(kvm, slot, gfn, level, attrs)) > hugepage_clear_mixed(slot, gfn, level); > else > -- > 2.34.1