From: Peter Xu <peterx@redhat.com>
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Sean Christopherson, "Dr. David Alan Gilbert",
	Andrew Jones, Paolo Bonzini
David Alan Gilbert" , Andrew Jones , Paolo Bonzini Subject: [PATCH v11 02/13] KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR] Date: Wed, 8 Jul 2020 15:33:57 -0400 Message-Id: <20200708193408.242909-3-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200708193408.242909-1-peterx@redhat.com> References: <20200708193408.242909-1-peterx@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Originally, we have three code paths that can dirty a page without vcpu context for X86: - init_rmode_identity_map - init_rmode_tss - kvmgt_rw_gpa init_rmode_identity_map and init_rmode_tss will be setup on destination VM no matter what (and the guest cannot even see them), so it does not make sense to track them at all. To do this, allow __x86_set_memory_region() to return the userspace address that just allocated to the caller. Then in both of the functions we directly write to the userspace address instead of calling kvm_write_*() APIs. Another trivial change is that we don't need to explicitly clear the identity page table root in init_rmode_identity_map() because no matter what we'll write to the whole page with 4M huge page entries. Suggested-by: Paolo Bonzini Signed-off-by: Peter Xu --- arch/x86/include/asm/kvm_host.h | 3 +- arch/x86/kvm/svm/avic.c | 9 ++-- arch/x86/kvm/vmx/vmx.c | 88 ++++++++++++++++----------------- arch/x86/kvm/x86.c | 37 +++++++++++--- 4 files changed, 81 insertions(+), 56 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 97cb005c7aa7..f26bc2bdacf4 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1647,7 +1647,8 @@ void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu); int kvm_is_in_guest(void); -int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size); +void __user *__x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, + u32 size); bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu); bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index ac830cd50830..f9d665bbfd68 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -235,7 +235,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu, */ static int avic_update_access_page(struct kvm *kvm, bool activate) { - int ret = 0; + void __user *ret; + int r = 0; mutex_lock(&kvm->slots_lock); /* @@ -251,13 +252,15 @@ static int avic_update_access_page(struct kvm *kvm, bool activate) APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, APIC_DEFAULT_PHYS_BASE, activate ? 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 97cb005c7aa7..f26bc2bdacf4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1647,7 +1647,8 @@ void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu);
 
 int kvm_is_in_guest(void);
 
-int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size);
+void __user *__x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa,
+				     u32 size);
 
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu);
 bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index ac830cd50830..f9d665bbfd68 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -235,7 +235,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
  */
 static int avic_update_access_page(struct kvm *kvm, bool activate)
 {
-	int ret = 0;
+	void __user *ret;
+	int r = 0;
 
 	mutex_lock(&kvm->slots_lock);
 	/*
@@ -251,13 +252,15 @@ static int avic_update_access_page(struct kvm *kvm, bool activate)
 				      APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
 				      APIC_DEFAULT_PHYS_BASE,
 				      activate ? PAGE_SIZE : 0);
-	if (ret)
+	if (IS_ERR(ret)) {
+		r = PTR_ERR(ret);
 		goto out;
+	}
 
 	kvm->arch.apic_access_page_done = activate;
 out:
 	mutex_unlock(&kvm->slots_lock);
-	return ret;
+	return r;
 }
 
 static int avic_init_backing_page(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8187ca152ad2..6e5208a735ec 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3552,42 +3552,33 @@ static bool guest_state_valid(struct kvm_vcpu *vcpu)
 	return true;
 }
 
-static int init_rmode_tss(struct kvm *kvm)
+static int init_rmode_tss(struct kvm *kvm, void __user *ua)
 {
-	gfn_t fn;
-	u16 data = 0;
-	int idx, r;
+	const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0)));
+	u16 data;
+	int i;
+
+	for (i = 0; i < 3; i++) {
+		if (__copy_to_user(ua + PAGE_SIZE * i, zero_page, PAGE_SIZE))
+			return -EFAULT;
+	}
 
-	idx = srcu_read_lock(&kvm->srcu);
-	fn = to_kvm_vmx(kvm)->tss_addr >> PAGE_SHIFT;
-	r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
-	if (r < 0)
-		goto out;
 	data = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
-	r = kvm_write_guest_page(kvm, fn++, &data,
-				 TSS_IOPB_BASE_OFFSET, sizeof(u16));
-	if (r < 0)
-		goto out;
-	r = kvm_clear_guest_page(kvm, fn++, 0, PAGE_SIZE);
-	if (r < 0)
-		goto out;
-	r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
-	if (r < 0)
-		goto out;
+	if (__copy_to_user(ua + TSS_IOPB_BASE_OFFSET, &data, sizeof(u16)))
+		return -EFAULT;
+
 	data = ~0;
-	r = kvm_write_guest_page(kvm, fn, &data,
-				 RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1,
-				 sizeof(u8));
-out:
-	srcu_read_unlock(&kvm->srcu, idx);
-	return r;
+	if (__copy_to_user(ua + RMODE_TSS_SIZE - 1, &data, sizeof(u8)))
+		return -EFAULT;
+
+	return 0;
 }
 
 static int init_rmode_identity_map(struct kvm *kvm)
 {
 	struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
 	int i, r = 0;
-	kvm_pfn_t identity_map_pfn;
+	void __user *uaddr;
 	u32 tmp;
 
 	/* Protect kvm_vmx->ept_identity_pagetable_done. */
@@ -3598,24 +3589,24 @@ static int init_rmode_identity_map(struct kvm *kvm)
 	if (!kvm_vmx->ept_identity_map_addr)
 		kvm_vmx->ept_identity_map_addr = VMX_EPT_IDENTITY_PAGETABLE_ADDR;
-	identity_map_pfn = kvm_vmx->ept_identity_map_addr >> PAGE_SHIFT;
 
-	r = __x86_set_memory_region(kvm, IDENTITY_PAGETABLE_PRIVATE_MEMSLOT,
-				    kvm_vmx->ept_identity_map_addr, PAGE_SIZE);
-	if (r < 0)
+	uaddr = __x86_set_memory_region(kvm,
+					IDENTITY_PAGETABLE_PRIVATE_MEMSLOT,
+					kvm_vmx->ept_identity_map_addr,
+					PAGE_SIZE);
+	if (IS_ERR(uaddr)) {
+		r = PTR_ERR(uaddr);
 		goto out;
+	}
 
-	r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
-	if (r < 0)
-		goto out;
 	/* Set up identity-mapping pagetable for EPT in real mode */
 	for (i = 0; i < PT32_ENT_PER_PAGE; i++) {
 		tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
 			_PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
-		r = kvm_write_guest_page(kvm, identity_map_pfn,
-				&tmp, i * sizeof(tmp), sizeof(tmp));
-		if (r < 0)
+		if (__copy_to_user(uaddr + i * sizeof(tmp), &tmp, sizeof(tmp))) {
+			r = -EFAULT;
 			goto out;
+		}
 	}
 	kvm_vmx->ept_identity_pagetable_done = true;
@@ -3642,19 +3633,22 @@ static void seg_setup(int seg)
 static int alloc_apic_access_page(struct kvm *kvm)
 {
 	struct page *page;
-	int r = 0;
+	void __user *hva;
+	int ret = 0;
 
 	mutex_lock(&kvm->slots_lock);
 	if (kvm->arch.apic_access_page_done)
 		goto out;
-	r = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
-				    APIC_DEFAULT_PHYS_BASE, PAGE_SIZE);
-	if (r)
+	hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
+				      APIC_DEFAULT_PHYS_BASE, PAGE_SIZE);
+	if (IS_ERR(hva)) {
+		ret = PTR_ERR(hva);
 		goto out;
+	}
+
 	page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
 	if (is_error_page(page)) {
-		r = -EFAULT;
+		ret = -EFAULT;
 		goto out;
 	}
 
@@ -3666,7 +3660,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
 	kvm->arch.apic_access_page_done = true;
 out:
 	mutex_unlock(&kvm->slots_lock);
-	return r;
+	return ret;
 }
 
 int allocate_vpid(void)
@@ -4616,7 +4610,7 @@ static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
 {
-	int ret;
+	void __user *ret;
 
 	if (enable_unrestricted_guest)
 		return 0;
@@ -4626,10 +4620,12 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
 				      PAGE_SIZE * 3);
 	mutex_unlock(&kvm->slots_lock);
 
-	if (ret)
-		return ret;
+	if (IS_ERR(ret))
+		return PTR_ERR(ret);
+
 	to_kvm_vmx(kvm)->tss_addr = addr;
-	return init_rmode_tss(kvm);
+
+	return init_rmode_tss(kvm, ret);
 }
 
 static int vmx_set_identity_map_addr(struct kvm *kvm, u64 ident_addr)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 09ee54f5e385..65be4977f608 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9959,7 +9959,32 @@ void kvm_arch_sync_events(struct kvm *kvm)
 	kvm_free_pit(kvm);
 }
 
-int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
+#define ERR_PTR_USR(e) ((void __user *)ERR_PTR(e))
+
+/**
+ * __x86_set_memory_region: Setup KVM internal memory slot
+ *
+ * @kvm: the kvm pointer to the VM.
+ * @id: the slot ID to setup.
+ * @gpa: the GPA to install the slot (unused when @size == 0).
+ * @size: the size of the slot.  Set to zero to uninstall a slot.
+ *
+ * This function helps to setup a KVM internal memory slot.  Specify
+ * @size > 0 to install a new slot, while @size == 0 to uninstall a
+ * slot.  The return code can be one of the following:
+ *
+ *   HVA:           on success (uninstall will return a bogus HVA)
+ *   -errno:        on error
+ *
+ * The caller should always use IS_ERR() to check the return value
+ * before use.  Note, the KVM internal memory slots are guaranteed to
+ * remain valid and unchanged until the VM is destroyed, i.e., the
+ * GPA->HVA translation will not change.  However, the HVA is a user
+ * address, i.e. its accessibility is not guaranteed, and must be
+ * accessed via __copy_{to,from}_user().
+ */
+void __user * __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa,
+				      u32 size)
 {
 	int i, r;
 	unsigned long hva, uninitialized_var(old_npages);
@@ -9968,12 +9993,12 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
 
 	/* Called with kvm->slots_lock held.  */
 	if (WARN_ON(id >= KVM_MEM_SLOTS_NUM))
-		return -EINVAL;
+		return ERR_PTR_USR(-EINVAL);
 
 	slot = id_to_memslot(slots, id);
 	if (size) {
 		if (slot && slot->npages)
-			return -EEXIST;
+			return ERR_PTR_USR(-EEXIST);
 
 		/*
 		 * MAP_SHARED to prevent internal slot pages from being moved
@@ -9982,7 +10007,7 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
 		hva = vm_mmap(NULL, 0, size, PROT_READ | PROT_WRITE,
 			      MAP_SHARED | MAP_ANONYMOUS, 0);
 		if (IS_ERR((void *)hva))
-			return PTR_ERR((void *)hva);
+			return (void __user *)hva;
 	} else {
 		if (!slot || !slot->npages)
 			return 0;
@@ -10001,13 +10026,13 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
 		m.memory_size = size;
 		r = __kvm_set_memory_region(kvm, &m);
 		if (r < 0)
-			return r;
+			return ERR_PTR_USR(r);
 	}
 
 	if (!size)
 		vm_munmap(hva, old_npages * PAGE_SIZE);
 
-	return 0;
+	return (void __user *)hva;
 }
 EXPORT_SYMBOL_GPL(__x86_set_memory_region);
-- 
2.26.2