From: Paolo Bonzini
To: Ben Gardon
Cc: LKML, kvm, Peter Xu, Sean Christopherson, Peter Shier,
    Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li,
    Vitaly Kuznetsov, Xiao Guangrong
Subject: Re: [PATCH 5/6] KVM: x86/mmu: Protect kvm->memslots with a mutex
Date: Wed, 28 Apr 2021 19:46:39 +0200
References: <20210427223635.2711774-1-bgardon@google.com>
 <20210427223635.2711774-6-bgardon@google.com>
 <997f9fe3-847b-8216-c629-1ad5fdd2ffae@redhat.com>

On 28/04/21 18:40, Ben Gardon wrote:
> On Tue, Apr 27, 2021 at 11:25 PM Paolo Bonzini wrote:
>>
>> On 28/04/21 00:36, Ben Gardon wrote:
>>> +void kvm_arch_assign_memslots(struct kvm *kvm, int as_id,
>>> +                              struct kvm_memslots *slots)
>>> +{
>>> +        mutex_lock(&kvm->arch.memslot_assignment_lock);
>>> +        rcu_assign_pointer(kvm->memslots[as_id], slots);
>>> +        mutex_unlock(&kvm->arch.memslot_assignment_lock);
>>> +}
>>
>> Does the assignment also need the lock, or only the rmap allocation?  I
>> would prefer the hook to be something like kvm_arch_setup_new_memslots.
>
> The assignment does need to be under the lock to prevent the following race:
>
> 1. Thread 1 (installing a new memslot): Acquires memslot assignment
>    lock (or perhaps in this case rmap_allocation_lock would be more apt).
> 2. Thread 1: Checks alloc_memslot_rmaps (it is false).
> 3. Thread 1: Doesn't allocate memslot rmaps for the new slot.
> 4. Thread 1: Releases memslot assignment lock.
> 5. Thread 2 (allocating a shadow root): Acquires memslot assignment lock.
> 6. Thread 2: Sets alloc_memslot_rmaps = true.
> 7. Thread 2: Allocates rmaps for all existing slots.
> 8. Thread 2: Releases memslot assignment lock.
> 9. Thread 2: Sets shadow_mmu_active = true.
> 10. Thread 1: Installs the new memslots.
> 11. Thread 3: Null pointer dereference when trying to access rmaps on
>     the new slot.

... because thread 3 would be under mmu_lock and therefore cannot
allocate the rmap itself (you have to do it in mmu_alloc_shadow_roots,
as in patch 6).

Related to this, your solution does not have to protect kvm_dup_memslots
with the new lock, because the first update of the memslots will not go
through kvm_arch_prepare_memory_region, but it _will_ go through
install_new_memslots and therefore through the new hook.

But overall, I think I'd prefer to have a kvm->slots_arch_lock mutex in
generic code, and place the calls to kvm_dup_memslots and
kvm_arch_prepare_memory_region inside that mutex.  That makes the new
lock decently intuitive, and easily documented as "Architecture code
can use slots_arch_lock if the contents of struct kvm_arch_memory_slot
need to be written outside kvm_arch_prepare_memory_region.  Unlike
slots_lock, slots_arch_lock can be taken inside a ``kvm->srcu``
read-side critical section".
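For instance, with that rule the x86 side could populate the rmaps from
vCPU context along these lines (a rough sketch only, completely
untested; memslot_rmaps_allocated and alloc_memslot_rmap below are
invented names, not something from your series):

static int kvm_alloc_memslot_rmaps(struct kvm *kvm)
{
        struct kvm_memory_slot *slot;
        int i, r = 0;

        /*
         * Called from KVM_RUN context, i.e. inside a kvm->srcu
         * read-side critical section; that is fine for
         * slots_arch_lock, unlike slots_lock.
         */
        mutex_lock(&kvm->slots_arch_lock);
        if (kvm->arch.memslot_rmaps_allocated)
                goto out_unlock;

        for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
                /*
                 * kvm_set_memslot would hold slots_arch_lock from
                 * kvm_dup_memslots until install_new_memslots, so no
                 * new slot can appear while we walk the current ones.
                 */
                kvm_for_each_memslot(slot, __kvm_memslots(kvm, i)) {
                        r = alloc_memslot_rmap(kvm, slot);
                        if (r)
                                goto out_unlock;
                }
        }

        /*
         * Ordering against lock-free readers of the flag is glossed
         * over here; the real thing would need release/acquire.
         */
        kvm->arch.memslot_rmaps_allocated = true;

out_unlock:
        mutex_unlock(&kvm->slots_arch_lock);
        return r;
}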
I admit I haven't thought about it very thoroughly, but if something
like this is enough, it is relatively pretty:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9b8e30dd5b9b..6e5106365597 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1333,6 +1333,7 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
 
 	rcu_assign_pointer(kvm->memslots[as_id], slots);
+	mutex_unlock(&kvm->slots_arch_lock);
 	synchronize_srcu_expedited(&kvm->srcu);
 
 	/*
@@ -1399,6 +1398,7 @@ static int kvm_set_memslot(struct kvm *kvm,
 	struct kvm_memslots *slots;
 	int r;
 
+	mutex_lock(&kvm->slots_arch_lock);
 	slots = kvm_dup_memslots(__kvm_memslots(kvm, as_id), change);
 	if (!slots)
 		return -ENOMEM;
@@ -1427,6 +1427,7 @@ static int kvm_set_memslot(struct kvm *kvm,
 		 * - kvm_is_visible_gfn (mmu_check_root)
 		 */
 		kvm_arch_flush_shadow_memslot(kvm, slot);
+		mutex_lock(&kvm->slots_arch_lock);
 	}
 
 	r = kvm_arch_prepare_memory_region(kvm, new, mem, change);

It does make the critical section a bit larger, so that the
initialization of the shadow page (which is in KVM_RUN context) contends
with slightly more code than necessary.  However, it's anything but a
performance-critical situation, as it will only happen once per VM.

WDYT?

Paolo

> Putting the assignment under the lock prevents 5-8 from happening
> between 2 and 10.
>
> I'm open to other ideas as far as how to prevent this race though.  I
> admit this solution is not the most elegant looking.
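For completeness, with the diff above the two critical sections line up
like this (again only a sketch, reusing the invented names from before):

        /*
         * Thread 1: kvm_set_memslot (generic code).  Steps 2-10 of the
         * race all run with slots_arch_lock held:
         */
        mutex_lock(&kvm->slots_arch_lock);
        slots = kvm_dup_memslots(__kvm_memslots(kvm, as_id), change);
        /*
         * Steps 2-3: kvm_arch_prepare_memory_region checks the rmap
         * flag and allocates (or skips) the new slot's rmaps, still
         * under the lock.
         */
        r = kvm_arch_prepare_memory_region(kvm, new, mem, change);
        /*
         * Step 10: publish; install_new_memslots drops the lock right
         * after rcu_assign_pointer.
         */
        install_new_memslots(kvm, as_id, slots);

        /*
         * Thread 2: mmu_alloc_shadow_roots (arch code).  Steps 5-8
         * block on the same mutex until thread 1 has published the new
         * slot, so step 7 covers it and step 11's NULL dereference
         * cannot happen:
         */
        kvm_alloc_memslot_rmaps(kvm);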