References: <20190917085304.16987-1-weijiang.yang@intel.com> <20190917085304.16987-2-weijiang.yang@intel.com>
In-Reply-To: <20190917085304.16987-2-weijiang.yang@intel.com>
From: Jim Mattson
Date: Fri, 11 Oct 2019 13:31:08 -0700
Subject: Re: [PATCH v5 1/9] Documentation: Introduce EPT based Subpage Protection
To: Yang Weijiang
Cc: kvm list, LKML, Paolo Bonzini, Sean Christopherson, "Michael S. Tsirkin", Radim Krčmář, yu.c.zhang@intel.com, alazar@bitdefender.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 17, 2019 at 1:52 AM Yang Weijiang wrote:
>
> Co-developed-by: yi.z.zhang@linux.intel.com
> Signed-off-by: yi.z.zhang@linux.intel.com
> Signed-off-by: Yang Weijiang
> ---
> Documentation/virtual/kvm/spp_kvm.txt | 178 ++++++++++++++++++++++++++
> 1 file changed, 178 insertions(+)
> create mode 100644 Documentation/virtual/kvm/spp_kvm.txt
>
> diff --git a/Documentation/virtual/kvm/spp_kvm.txt b/Documentation/virtual/kvm/spp_kvm.txt
> new file mode 100644
> index 000000000000..1bd1c11d0a99
> --- /dev/null
> +++ b/Documentation/virtual/kvm/spp_kvm.txt
> @@ -0,0 +1,178 @@
> +EPT-Based Sub-Page Protection (SPP) for KVM
> +====================================================
> +
> +1.Overview
> + EPT-based Sub-Page Protection(SPP) allows VMM to specify
> + fine-grained(128byte per sub-page) write-protection for guest physical
> + memory. When it's enabled, the CPU enforces write-access permission
> + for the sub-pages within a 4KB page, if corresponding bit is set in
> + permission vector, write to sub-page region is allowed, otherwise,
> + it's prevented with a EPT violation.
> +
> + *Note*: In current implementation, SPP is exclusive with nested flag,
> + if it's on, SPP feature won't work.
> +
> +2.SPP Operation
> + Sub-Page Protection Table (SPPT) is introduced to manage sub-page
> + write-access permission.
> +
> + It is active when:
> + a) nested flag is turned off.
> + b) "sub-page write protection" VM-execution control is 1.
> + c) SPP is initialized with KVM_INIT_SPP ioctl.
> + d) Sub-page permissions are set with KVM_SUBPAGES_SET_ACCESS ioctl.
> + see below sections for details.
> +
> + __________________________________________________________________________
> +
> + How SPP hardware works:
> + __________________________________________________________________________
> +
> + Guest write access --> GPA --> Walk EPT --> EPT leaf entry -----|
> + |---------------------------------------------------------------|
> + |-> if VMexec_control.spp && ept_leaf_entry.spp_bit (bit 61)
> + |
> + |-> --> EPT legacy behavior
> + |
> + |
> + |-> --> if ept_leaf_entry.writable
> + |
> + |-> --> Ignore SPP
> + |
> + |-> --> GPA --> Walk SPP 4-level table--|
> + |
> + |------------<----------get-the-SPPT-point-from-VMCS-filed-----<------|
/filed/field/
> + |
> + Walk SPP L4E table
> + |
> + |---> if-entry-misconfiguration ------------>-------|-------<---------|
> + | | |
> + else | |
> + | | |
> + | |------------------SPP VMexit<-----------------| |
> + | | |
> + | |-> exit_qualification & sppt_misconfig --> sppt misconfig |
> + | | |
> + | |-> exit_qualification & sppt_miss --> sppt miss |
> + |---| |
> + | |
> + walk SPPT L3E--|--> if-entry-misconfiguration------------>------------|
> + | |
> + else |
> + | |
> + | |
> + walk SPPT L2E --|--> if-entry-misconfiguration-------->-------|
> + | |
> + else |
> + | |
> + | |
> + walk SPPT L1E --|-> if-entry-misconfiguration--->----|
> + |
> + else
> + |
> + |-> if sub-page writable
> + |-> allow, write access
> + |-> disallow, EPT violation
> + ______________________________________________________________________________
> +
> +3.IOCTL Interfaces
> +
> + KVM_INIT_SPP:
> + Allocate storage for sub-page permission vectors and SPPT root page.
> +
> + KVM_SUBPAGES_GET_ACCESS:
> + Get sub-page write permission vectors for given continuous guest pages.
/continuous/contiguous/
> +
> + KVM_SUBPAGES_SET_ACCESS
> + Set SPP bit in EPT leaf entries for given continuous guest pages. The
/continuous/contiguous/
> + actual SPPT setup is triggered when SPP miss vm-exit is handled.
> +
> + /* for KVM_SUBPAGES_GET_ACCESS and KVM_SUBPAGES_SET_ACCESS */
> + struct kvm_subpage_info {
> + __u64 gfn; /* the first page gfn of the continuous pages */
/continuous/contiguous/
> + __u64 npages; /* number of 4K pages */
> + __u64 *access_map; /* sub-page write-access bitmap array */
> + };
> +
> + #define KVM_SUBPAGES_GET_ACCESS _IOR(KVMIO, 0x49, __u64)
> + #define KVM_SUBPAGES_SET_ACCESS _IOW(KVMIO, 0x4a, __u64)
> + #define KVM_INIT_SPP _IOW(KVMIO, 0x4b, __u64)

The ioctls should be documented in api.txt.

> +4.Set Sub-Page Permission
> +
> + * To enable SPP protection, system admin sets sub-page permission via

Why system admin? Can't any kvm user do this?

> + KVM_SUBPAGES_SET_ACCESS ioctl:
> + (1) It first stores the access permissions in bitmap array.
> +
> + (2) Then, if the target 4KB page is mapped as PT_PAGE_TABLE_LEVEL entry in EPT,
/page is/pages are/
> + it sets SPP bit of the corresponding entry to mark sub-page protection.
> + If the 4KB page is mapped as PT_DIRECTORY_LEVEL or PT_PDPE_LEVEL, it
/page is/pages are/
> + zapps the hugepage entry and let following memroy access to trigger EPT
/zapps/zaps/, /entry/entries/, /memroy/memory/
> + page fault, there the gfn is check against SPP permission bitmap and
/page fault/violation/
> + proper level is selected to set up EPT entry.
> +
> +
> + The SPPT paging structure format is as below:
> +
> + Format of the SPPT L4E, L3E, L2E:
> + | Bit | Contents |
> + | :----- | :------------------------------------------------------------------------|
> + | 0 | Valid entry when set; indicates whether the entry is present |
> + | 11:1 | Reserved (0) |
> + | N-1:12 | Physical address of 4KB aligned SPPT LX-1 Table referenced by this entry |
> + | 51:N | Reserved (0) |
> + | 63:52 | Reserved (0) |
> + Note: N is the physical address width supported by the processor. X is the page level
> +
> + Format of the SPPT L1E:
> + | Bit | Contents |
> + | :---- | :---------------------------------------------------------------- |
> + | 0+2i | Write permission for i-th 128 byte sub-page region. |
> + | 1+2i | Reserved (0). |
> + Note: 0<=i<=31
> +
> +5.SPPT-induced VM exit
> +
> + * SPPT miss and misconfiguration induced VM exit
> +
> + A SPPT missing VM exit occurs when walk the SPPT, there is no SPPT
> + misconfiguration but a paging-structure entry is not
> + present in any of L4E/L3E/L2E entries.
> +
> + A SPPT misconfiguration VM exit occurs when reserved bits or unsupported values
> + are set in SPPT entry.
> +
> + *NOTE* SPPT miss and SPPT misconfigurations can occur only due to an
> + attempt to write memory with a guest physical address.

Can you clarify what this means? For instance, setting an A or D bit
in a PTE is an attempt to "write memory with a guest physical
address," but per the SDM, it is not an operation that is eligible for
sub-page write permissions.

> + * SPP permission induced VM exit
> + SPP sub-page permission induced violation is reported as EPT violation
> + thesefore causes VM exit.
/thesefore/therefore/
> +
> +6.SPPT-induced VM exit handling
> +
> + #define EXIT_REASON_SPP 66
> +
> + static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> + ...
> + [EXIT_REASON_SPP] = handle_spp,
> + ...
> + };
> +
> + New exit qualification for SPPT-induced vmexits.
> +
> + | Bit | Contents |
> + | :---- | :---------------------------------------------------------------- |
> + | 10:0 | Reserved (0). |
> + | 11 | SPPT VM exit type. Set for SPPT Miss, cleared for SPPT Misconfig. |
> + | 12 | NMI unblocking due to IRET |
> + | 63:13 | Reserved (0) |
> +
> + In addition to the exit qualification, guest linear address and guest
> + physical address fields will be reported.
> +
> + * SPPT miss and misconfiguration induced VM exit
> + Set up SPPT entries correctly.
> +
> + * SPP permission induced VM exit
> + This kind of VM exit is left to VMI tool to handle.
> --
> 2.17.2
>
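
As a reading aid for the two bit layouts quoted above (the SPPT L1E
format and the SPPT exit qualification), here is a minimal sketch in C.
The helper names, the macro names, and the idea of a 32-bit per-page
write bitmap are assumptions made for illustration only; none of them
are taken from the patch, and this is not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values below follow the quoted tables; the names are hypothetical. */
#define SPP_SUBPAGE_SIZE   128u          /* bytes per sub-page */
#define SPP_SUBPAGES       32u           /* 4096 / 128 sub-pages per 4KB page */
#define SPP_EXIT_QUAL_MISS (1ull << 11)  /* set: SPPT miss, clear: misconfig */

/*
 * Expand an assumed 32-bit write bitmap (bit i = sub-page i writable)
 * into the L1E layout: write permission at bit 2i, bit 2i+1 reserved (0).
 */
static uint64_t spp_l1e_from_bitmap(uint32_t bitmap)
{
    uint64_t l1e = 0;

    for (unsigned int i = 0; i < SPP_SUBPAGES; i++)
        if (bitmap & (1u << i))
            l1e |= 1ull << (2 * i);
    return l1e;
}

/* Does the L1E allow a write at this byte offset within the 4KB page? */
static bool spp_write_allowed(uint64_t l1e, unsigned int offset)
{
    unsigned int i = (offset & 0xfffu) / SPP_SUBPAGE_SIZE;

    return ((l1e >> (2 * i)) & 1) != 0;
}

/* Classify an SPPT-induced exit from bit 11 of the exit qualification. */
static bool spp_exit_is_miss(uint64_t exit_qual)
{
    return (exit_qual & SPP_EXIT_QUAL_MISS) != 0;
}
```

With this layout, a bitmap of 0x2 (only sub-page 1 writable) becomes an
L1E of 0x4, so a write at offset 128 is allowed while one at offset 127
is not.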