From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org,
    stable@vger.kernel.org
Cc: Greg Kroah-Hartman, Lai Jiangshan, Paolo Bonzini, Ben Hutchings
Subject: [PATCH 4.9 5/9] KVM: X86: MMU: Use the correct inherited permissions to get shadow page
Date: Thu, 27 Jan 2022 19:08:23 +0100
Message-Id: <20220127180257.391063648@linuxfoundation.org>
X-Mailer: git-send-email 2.35.0
In-Reply-To: <20220127180257.225641300@linuxfoundation.org>
References: <20220127180257.225641300@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Lai Jiangshan

commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream.

When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logical AND of its
parents' permissions.  Two guest PxE entries that point at the same table
gfn need to be shadowed with different shadow pages if their parents'
permissions are different.  KVM currently uses the effective permissions of
the last non-leaf entry for all non-leaf entries.  Because all non-leaf
SPTEs have full ("uwx") permissions, and the effective permissions are
recorded only in role.access and merged into the leaves, this can lead to
incorrect reuse of a shadow page and eventually to a missing guest
protection page fault.

For example, here is a shared pagetable:

   pgd[]   pud[]        pmd[]            virtual address pointers
                     /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
        /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
                     \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)

pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 point to the same page.
- ptr2 and ptr4 point to the same page.

(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)

- First, the guest reads from ptr1 and KVM prepares a shadow
  page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
  "u--" comes from the effective permissions of pgd, pud1 and
  pmd1, which are stored in pt->access.  "u--" is also used to get
  the pagetable for pud1, instead of "uw-".

- Then the guest writes to ptr2 and KVM reuses pud1, which is present.
  The hypervisor sets up a shadow page for ptr2 with pt->access "uw-",
  even though the pud1 pmd (because of the incorrect argument to
  kvm_mmu_get_page in the previous step) has role.access="u--".

- Then the guest reads from ptr3.  The hypervisor reuses pud1's
  shadow pmd for pud2, because both use "u--" for their permissions.
  Thus, the shadow pmd already includes entries for both pmd1 and pmd2.

- At last, the guest writes to ptr4.  This causes no vmexit or pagefault,
  because pud1's shadow page structures included a "uw-" page even though
  its role.access was "u--".

Any kind of shared pagetable might have a similar problem in a virtual
machine without TDP enabled if the permissions inherited from different
ancestors differ.

In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.

The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.

The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit.  So there is no fixes tag here.

Signed-off-by: Lai Jiangshan
Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com>
Cc: stable@vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini
[bwh: Backported to 4.9:
 - Keep passing vcpu argument to gpte_access functions
 - Adjust filenames, context]
Signed-off-by: Ben Hutchings
Signed-off-by: Greg Kroah-Hartman
---
 Documentation/virtual/kvm/mmu.txt |    4 ++--
 arch/x86/kvm/paging_tmpl.h        |   14 +++++++++-----
 2 files changed, 11 insertions(+), 7 deletions(-)

--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -152,8 +152,8 @@ Shadow pages contain the following infor
     shadow pages) so role.quadrant takes values in the range 0..3.  Each
     quadrant maps 1GB virtual address space.
   role.access:
-    Inherited guest access permissions in the form uwx.  Note execute
-    permission is positive, not negative.
+    Inherited guest access permissions from the parent ptes in the form uwx.
+    Note execute permission is positive, not negative.
   role.invalid:
     The page is invalid and should not be used.  It is a root page that is
     currently pinned (by a cpu hardware register pointing to it); once it is
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -100,8 +100,8 @@ struct guest_walker {
 	gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
 	pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
 	bool pte_writable[PT_MAX_FULL_LEVELS];
-	unsigned pt_access;
-	unsigned pte_access;
+	unsigned int pt_access[PT_MAX_FULL_LEVELS];
+	unsigned int pte_access;
 	gfn_t gfn;
 	struct x86_exception fault;
 };
@@ -380,13 +380,15 @@ retry_walk:
 		}
 
 		walker->ptes[walker->level - 1] = pte;
+
+		/* Convert to ACC_*_MASK flags for struct guest_walker.  */
+		walker->pt_access[walker->level - 1] = FNAME(gpte_access)(vcpu, pt_access ^ walk_nx_mask);
 	} while (!is_last_gpte(mmu, walker->level, pte));
 
 	pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
 	accessed_dirty = pte_access & PT_GUEST_ACCESSED_MASK;
 
 	/* Convert to ACC_*_MASK flags for struct guest_walker.  */
-	walker->pt_access = FNAME(gpte_access)(vcpu, pt_access ^ walk_nx_mask);
 	walker->pte_access = FNAME(gpte_access)(vcpu, pte_access ^ walk_nx_mask);
 	errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
 	if (unlikely(errcode))
@@ -424,7 +426,8 @@ retry_walk:
 	}
 
 	pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
-		 __func__, (u64)pte, walker->pte_access, walker->pt_access);
+		 __func__, (u64)pte, walker->pte_access,
+		 walker->pt_access[walker->level - 1]);
 	return 1;
 
 error:
@@ -586,7 +589,7 @@ static int FNAME(fetch)(struct kvm_vcpu
 {
 	struct kvm_mmu_page *sp = NULL;
 	struct kvm_shadow_walk_iterator it;
-	unsigned direct_access, access = gw->pt_access;
+	unsigned int direct_access, access;
 	int top_level, ret;
 	gfn_t gfn, base_gfn;
 
@@ -618,6 +621,7 @@ static int FNAME(fetch)(struct kvm_vcpu
 		sp = NULL;
 		if (!is_shadow_present_pte(*it.sptep)) {
 			table_gfn = gw->table_gfn[it.level - 2];
+			access = gw->pt_access[it.level - 2];
 			sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
 					      false, access);
 		}