Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751762AbdG1H3z (ORCPT ); Fri, 28 Jul 2017 03:29:55 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:51765 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751577AbdG1H3x (ORCPT ); Fri, 28 Jul 2017 03:29:53 -0400 Subject: Re: [PATCH] KVM: nVMX: do not pin the VMCS12 To: Paolo Bonzini , David Matlack Cc: "linux-kernel@vger.kernel.org" , kvm list , Jim Mattson References: <1501163686-13648-1-git-send-email-pbonzini@redhat.com> <78593b07-5d67-ac7e-eb5b-fdefee2a3477@redhat.com> From: Christian Borntraeger Date: Fri, 28 Jul 2017 09:29:47 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <78593b07-5d67-ac7e-eb5b-fdefee2a3477@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 17072807-0040-0000-0000-000003C8750F X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17072807-0041-0000-0000-000025C608DA Message-Id: <032b942d-c5fe-d7cb-4cea-0f4ae137f083@de.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-07-28_03:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1706020000 definitions=main-1707280118 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4142 Lines: 108 On 07/28/2017 08:57 AM, Paolo Bonzini wrote: > On 27/07/2017 19:20, David Matlack wrote: >>> + kvm_vcpu_write_guest_page(&vmx->vcpu, >>> + vmx->nested.current_vmptr >> PAGE_SHIFT, >>> + vmx->nested.cached_vmcs12, 0, VMCS12_SIZE); >> Have you hit any "suspicious RCU usage" error messages during VM >> teardown with this patch? We did when we replaced memcpy with >> kvm_write_guest a while back. IIRC it was due to kvm->srcu not being >> held in one of the teardown paths. kvm_write_guest() expects it to be >> held in order to access memslots. >> >> We fixed this by skipping the VMCS12 flush during VMXOFF. I'll send >> that patch along with a few other nVMX dirty tracking related patches >> I've been meaning to get upstreamed. > > Oh, right. I had this other (untested) patch in the queue after > Christian recently annotated everything with RCU checks: > So you make the checks not trigger for users_count == 0 to cope with the teardown pathes? Since for users_count==0 all file descriptors are gone, no memslot/bus can be changed by userspace so this makes sense. > Paolo > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 890b706d1943..07e3b02a1be3 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -477,7 +477,8 @@ struct kvm { > static inline struct kvm_io_bus *kvm_get_bus(struct kvm *kvm, enum kvm_bus idx) > { > return srcu_dereference_check(kvm->buses[idx], &kvm->srcu, > - lockdep_is_held(&kvm->slots_lock)); > + lockdep_is_held(&kvm->slots_lock) || > + !refcount_read(&kvm->users_count)); > } > > static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i) > @@ -570,7 +571,8 @@ void kvm_put_kvm(struct kvm *kvm); > static inline struct kvm_memslots *__kvm_memslots(struct kvm *kvm, int as_id) > { > return srcu_dereference_check(kvm->memslots[as_id], &kvm->srcu, > - lockdep_is_held(&kvm->slots_lock)); > + lockdep_is_held(&kvm->slots_lock) || > + !refcount_read(&kvm->users_count)); > } > > static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm) > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index f3f74271f1a9..6a21c98b22bf 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -655,7 +655,6 @@ static struct kvm *kvm_create_vm(unsigned long type) > mutex_init(&kvm->lock); > mutex_init(&kvm->irq_lock); > mutex_init(&kvm->slots_lock); > - refcount_set(&kvm->users_count, 1); > INIT_LIST_HEAD(&kvm->devices); > > r = kvm_arch_init_vm(kvm, type); > @@ -701,6 +700,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > if (r) > goto out_err; > > + refcount_set(&kvm->users_count, 1); > spin_lock(&kvm_lock); > list_add(&kvm->vm_list, &vm_list); > spin_unlock(&kvm_lock); > @@ -717,10 +717,9 @@ static struct kvm *kvm_create_vm(unsigned long type) > hardware_disable_all(); > out_err_no_disable: > for (i = 0; i < KVM_NR_BUSES; i++) > - kfree(rcu_access_pointer(kvm->buses[i])); > + kfree(kvm_get_bus(kvm, i)); > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) > - kvm_free_memslots(kvm, > - rcu_dereference_protected(kvm->memslots[i], 1)); > + kvm_free_memslots(kvm, __kvm_memslots(kvm, i)); > kvm_arch_free_vm(kvm); > mmdrop(current->mm); > return ERR_PTR(r); > @@ -754,9 +754,8 @@ static void kvm_destroy_vm(struct kvm *kvm) > spin_unlock(&kvm_lock); > kvm_free_irq_routing(kvm); > for (i = 0; i < KVM_NR_BUSES; i++) { > - struct kvm_io_bus *bus; > + struct kvm_io_bus *bus = kvm_get_bus(kvm, i); > > - bus = rcu_dereference_protected(kvm->buses[i], 1); > if (bus) > kvm_io_bus_destroy(bus); > kvm->buses[i] = NULL; > @@ -770,8 +769,7 @@ static void kvm_destroy_vm(struct kvm *kvm) > kvm_arch_destroy_vm(kvm); > kvm_destroy_devices(kvm); > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) > - kvm_free_memslots(kvm, > - rcu_dereference_protected(kvm->memslots[i], 1)); > + kvm_free_memslots(kvm, __kvm_memslots(kvm, i)); > cleanup_srcu_struct(&kvm->irq_srcu); > cleanup_srcu_struct(&kvm->srcu); > kvm_arch_free_vm(kvm); >