From: Paolo Bonzini
To: "Maciej S. Szmigiero", Sean Christopherson
Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Igor Mammedov,
    Marc Zyngier, James Morse, Julien Thierry, Suzuki K Poulose,
    Huacai Chen, Aleksandar Markovic, Paul Mackerras,
    Christian Borntraeger, Janosch Frank, David Hildenbrand,
    Cornelia Huck, Claudio Imbrenda, Joerg Roedel,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] KVM: Scalable memslots implementation
Date: Wed, 3 Feb 2021 14:41:02 +0100
Message-ID: <4bdcb44c-c35d-45b2-c0c1-e857e0fd383e@redhat.com>
In-Reply-To: <9e6ca093-35c3-7cca-443b-9f635df4891d@maciej.szmigiero.name>
References: <4d748e0fd50bac68ece6952129aed319502b6853.1612140117.git.maciej.szmigiero@oracle.com>
    <9e6ca093-35c3-7cca-443b-9f635df4891d@maciej.szmigiero.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/02/21 23:42, Maciej S. Szmigiero wrote:
>> I'm not opposed to using more sophisticated storage for the gfn
>> lookups, but only if there's a good reason for doing so. IMO, the
>> rbtree isn't simpler, just different.

And it also has worse cache utilization than an array, due to memory
footprint (as you point out below) but also pointer chasing.

>> Memslot modifications are
>> unlikely to be a hot path (and if it is, x86's "zap everything"
>> implementation is a far bigger problem), and it's hard to beat the
>> memory footprint of a raw array. That doesn't leave much
>> motivation for such a big change to some of KVM's scariest (for me)
>> code.
>>
>
> Improvements can be done step-by-step,
> kvm_mmu_invalidate_zap_pages_in_memslot() can be rewritten, too in
> the future, if necessary. After all, complains are that this change
> alone is too big.
>
> I think that if you look not at the patch itself but at the
> resulting code the new implementation looks rather straightforward,
> there are comments at every step in kvm_set_memslot() to explain
> exactly what is being done. Not only it already scales well, but it
> is also flexible to accommodate further improvements or even new
> operations.
>
> The new implementation also uses standard kernel {rb,interval}-tree
> and hash table implementation as its basic data structures, so it
> automatically benefits from any generic improvements to these.
>
> All for the low price of just 174 net lines of code added.

I think the best thing to do here is to provide a patch series that
splits the individual changes so that they can be reviewed and their
separate merits evaluated.

Another thing that I dislike about KVM_SET_USER_MEMORY_REGION is that
IMO userspace should provide all memslots at once, for an atomic switch
of the whole memory array. (Or at least I would like to see the code;
it might be a bit tricky because you'll need code to compute the
difference between the old and new arrays and invoke
kvm_arch_prepare/commit_memory_region). I'm not sure how that would
interact with the active/inactive pair that you introduce here.

Paolo
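
The "compute the difference between the old and new arrays" step can be illustrated with a minimal user-space sketch in C. This is not KVM code: struct slot, enum slot_change, diff_slots() and count_change() are hypothetical names invented for the illustration, both arrays are assumed to be sorted by slot id with no duplicates, and the callback only stands in for wherever kvm_arch_prepare/commit_memory_region would be invoked (their real arguments are not modelled here).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memslot (not struct kvm_memory_slot). */
struct slot {
    int      id;        /* slot identifier, used as the sort key */
    uint64_t base_gfn;  /* first guest frame number covered */
    uint64_t npages;    /* length of the slot in pages */
};

enum slot_change { SLOT_CREATE, SLOT_DELETE, SLOT_MODIFY };

/* Called once per detected change; prev or next is NULL when not applicable. */
typedef void (*change_fn)(enum slot_change change, const struct slot *prev,
                          const struct slot *next, void *opaque);

/*
 * Walk two arrays sorted by ascending id (assumption: no duplicate ids)
 * and report every created, deleted or modified slot through cb.
 * Returns the number of changes reported.
 */
size_t diff_slots(const struct slot *prev, size_t n_prev,
                  const struct slot *next, size_t n_next,
                  change_fn cb, void *opaque)
{
    size_t i = 0, j = 0, changes = 0;

    while (i < n_prev || j < n_next) {
        if (j == n_next || (i < n_prev && prev[i].id < next[j].id)) {
            /* Present only in the old array: the slot is going away. */
            cb(SLOT_DELETE, &prev[i], NULL, opaque);
            i++;
            changes++;
        } else if (i == n_prev || next[j].id < prev[i].id) {
            /* Present only in the new array: the slot is being created. */
            cb(SLOT_CREATE, NULL, &next[j], opaque);
            j++;
            changes++;
        } else {
            /* Same id in both arrays: report only if the layout differs. */
            if (prev[i].base_gfn != next[j].base_gfn ||
                prev[i].npages != next[j].npages) {
                cb(SLOT_MODIFY, &prev[i], &next[j], opaque);
                changes++;
            }
            i++;
            j++;
        }
    }
    return changes;
}

/* Example callback: tally the changes instead of acting on them. */
struct change_counts {
    size_t created, deleted, modified;
};

void count_change(enum slot_change change, const struct slot *prev,
                  const struct slot *next, void *opaque)
{
    struct change_counts *c = opaque;

    (void)prev;
    (void)next;
    if (change == SLOT_CREATE)
        c->created++;
    else if (change == SLOT_DELETE)
        c->deleted++;
    else
        c->modified++;
}
```

For an atomic switch, one plausible arrangement (again an assumption, not something the patch does) is to run diff_slots() twice: a first pass whose callback performs the "prepare" work and can still fail cleanly, and a second pass that commits once every change has been prepared.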