Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp884495pxb; Wed, 27 Oct 2021 14:26:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxvKV/qjJZ7uBNccRKltHphixAxh/DwCsmzKOZqh0he3i47t+0VHhG6F3M5pm773P//Ps3L X-Received: by 2002:a17:90b:3b4f:: with SMTP id ot15mr155332pjb.189.1635370007495; Wed, 27 Oct 2021 14:26:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635370007; cv=none; d=google.com; s=arc-20160816; b=0FYwyrreCzzmXrE81Z1jLc+SMWzviiArnmBwsQwU/Sgv0t/nIvb0vsxe7j0tBCo33Y 3yp0M6QgoGGLdI9vL0r3Ehwo4S7en9YfqQNaNpabpwv2KGCPXNEGyI+VqlOFCKpHztge NWpYdfRqzYjJ8s5m2YsAEfx2ADReXQSLLA8UKZBWjlSKglk20Hew7h5pqnt8dQ3E9obI oNDGR1NwR7OgBJm4nBqkS8HQ95CY7HOKrREkxbQ+uI0mjhocmDW58I6GmURAPatkc7kf ZlWD3JR8Gltgmyu3EVNaHEF9YR/ovmJdEsCPJQz44x2BfzQ4UbgMsVhsqCBhyruyKL70 a65g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:subject:from :references:cc:to; bh=o+SBBYCskCB1h+sviGlzwTPB2X/UB4U7d9FxWwQWt1c=; b=kmC6UfX2Ky7/X2f/m0OU6sG4l6xlQfpx4bbzEG8dQcpwN7O/WbnIEPgPbGhUNKObjM 6cqxDugUETouLJePxrPR1G2VZFvhcikKbGQ40nne7EZkxbisdu1c7fXC5Jj0CytRRSzU 8RjuSBI1UQXF9Z9a5JmjJq5W3LmJGff6DCa3C0X/Vc9cdsSUdFYgjY1fIb8BIB+wIepm EBqWPjdUucW+L6anAD+oJbdKqcFaz4Q9j1wVwM2EaMsU0h2dtJDJbcE3MeIUbMn7wdJ0 vgtnGcplVfZwVa5rSl3pJItNbCL1onMzVcMGSQV4t6x+39SnCTF3lA90b7zu7/53uN+c 9n2g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j190si1225421pgd.491.2021.10.27.14.26.34; Wed, 27 Oct 2021 14:26:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237539AbhJ0Nu4 (ORCPT + 97 others); Wed, 27 Oct 2021 09:50:56 -0400 Received: from vps-vb.mhejs.net ([37.28.154.113]:41838 "EHLO vps-vb.mhejs.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231458AbhJ0Nuz (ORCPT ); Wed, 27 Oct 2021 09:50:55 -0400 Received: from MUA by vps-vb.mhejs.net with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1mfjHu-0001RD-DR; Wed, 27 Oct 2021 15:48:18 +0200 To: Sean Christopherson Cc: Paolo Bonzini , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Igor Mammedov , Marc Zyngier , James Morse , Julien Thierry , Suzuki K Poulose , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Christian Borntraeger , Janosch Frank , David Hildenbrand , Cornelia Huck , Claudio Imbrenda , Joerg Roedel , kvm@vger.kernel.org, linux-kernel@vger.kernel.org References: <4f8718fc8da57ab799e95ef7c2060f8be0f2391f.1632171479.git.maciej.szmigiero@oracle.com> From: "Maciej S. Szmigiero" Subject: Re: [PATCH v5 13/13] KVM: Optimize overlapping memslots check Message-ID: <4222ead3-f80f-0992-569f-9e1a7adbabcc@maciej.szmigiero.name> Date: Wed, 27 Oct 2021 15:48:12 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 26.10.2021 20:59, Sean Christopherson wrote: > On Mon, Sep 20, 2021, Maciej S. Szmigiero wrote: >> From: "Maciej S. Szmigiero" >> >> Do a quick lookup for possibly overlapping gfns when creating or moving >> a memslot instead of performing a linear scan of the whole memslot set. >> >> Signed-off-by: Maciej S. Szmigiero >> --- >> virt/kvm/kvm_main.c | 36 +++++++++++++++++++++++++++--------- >> 1 file changed, 27 insertions(+), 9 deletions(-) >> >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >> index 5fea467d6fec..78dad8c6376f 100644 >> --- a/virt/kvm/kvm_main.c >> +++ b/virt/kvm/kvm_main.c >> @@ -1667,6 +1667,30 @@ static int kvm_delete_memslot(struct kvm *kvm, >> return kvm_set_memslot(kvm, mem, old, &new, as_id, KVM_MR_DELETE); >> } >> >> +static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, >> + struct kvm_memory_slot *nslot) >> +{ >> + int idx = slots->node_idx; >> + gfn_t nend = nslot->base_gfn + nslot->npages; >> + struct rb_node *node; >> + >> + kvm_for_each_memslot_in_gfn_range(node, slots, nslot->base_gfn, nend) { >> + struct kvm_memory_slot *cslot; >> + gfn_t cend; >> + >> + cslot = container_of(node, struct kvm_memory_slot, gfn_node[idx]); >> + cend = cslot->base_gfn + cslot->npages; >> + if (cslot->id == nslot->id) >> + continue; >> + >> + /* kvm_for_each_in_gfn_no_more() guarantees that cslot->base_gfn < nend */ >> + if (cend > nslot->base_gfn) > > Hmm, IMO the need for this check means that kvm_for_each_memslot_in_gfn_range() > is flawed. The user of kvm_for_each...() should not be responsible for skipping > memslots that do not actually overlap the requested range. I.e. this function > should be no more than: > > static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, > struct kvm_memory_slot *slot) > { > gfn_t start = slot->base_gfn; > gfn_t end = start + slot->npages; > > kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) { > if (iter.slot->id != slot->id) > return true; > } > > return false; > } > > > and I suspect kvm_zap_gfn_range() could be further simplified as well. > > Looking back at the introduction of the helper, its comment's highlighting of > "possibily" now makes sense. > > /* Iterate over each memslot *possibly* intersecting [start, end) range */ > #define kvm_for_each_memslot_in_gfn_range(node, slots, start, end) \ > > That's an unnecessarily bad API. It's a very solvable problem for the iterator > helpers to advance until there's actually overlap, not doing so violates the > principle of least surprise, and unless I'm missing something, there's no use > case for an "approximate" iteration. In principle this can be done, however this will complicate the gfn iterator logic - especially the kvm_memslot_iter_start() part, which will already get messier from open-coding kvm_memslots_gfn_upper_bound() there. At the same kvm_zap_gfn_range() will still need to do the memslot range <-> request range merging by itself as it does not want to process the whole returned memslot, but rather just the part that's actually overlapping its requested range. In the worst case, the current code can return one memslot too much, so I don't think it's worth bringing additional complexity just to detect and skip it - it's not that uncommon to design an API that needs extra checking from its caller to cover some corner cases. For example, see pthread_cond_wait() or kernel waitqueues with their spurious wakeups or atomic_compare_exchange_weak() from C11. And these are higher level APIs than a very limited internal KVM one with just two callers. In case of kvm_zap_gfn_range() the necessary checking is already there and has to be kept due to the above range merging. Also, a code that is simpler is easier to understand, maintain and so less prone to subtle bugs. >> + return true; >> + } >> + >> + return false; >> +} >> + >> /* >> * Allocate some memory and give it an address in the guest physical address >> * space. >> @@ -1752,16 +1776,10 @@ int __kvm_set_memory_region(struct kvm *kvm, >> } >> >> if ((change == KVM_MR_CREATE) || (change == KVM_MR_MOVE)) { >> - int bkt; >> - >> /* Check for overlaps */ > > This comment can be dropped, the new function is fairly self-documenting. Will drop it. >> - kvm_for_each_memslot(tmp, bkt, __kvm_memslots(kvm, as_id)) { >> - if (tmp->id == id) >> - continue; >> - if (!((new.base_gfn + new.npages <= tmp->base_gfn) || >> - (new.base_gfn >= tmp->base_gfn + tmp->npages))) >> - return -EEXIST; >> - } >> + if (kvm_check_memslot_overlap(__kvm_memslots(kvm, as_id), >> + &new)) > > And then with the comment dropped, the wrap can be avoided by folding the check > into the outer if statement, e.g. > > if (((change == KVM_MR_CREATE) || (change == KVM_MR_MOVE)) && > kvm_check_memslot_overlap(__kvm_memslots(kvm, as_id), &new)) > return -EEXIST; > Will fold it. Thanks, Maciej