To: Vitaly Kuznetsov
Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Igor Mammedov, Marc Zyngier,
 James Morse, Julien Thierry, Suzuki K Poulose, Huacai Chen,
 Aleksandar Markovic, Paul Mackerras, Christian Borntraeger,
 Janosch Frank, David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
 Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 Sean Christopherson
References: <4d748e0fd50bac68ece6952129aed319502b6853.1612140117.git.maciej.szmigiero@oracle.com>
 <9e6ca093-35c3-7cca-443b-9f635df4891d@maciej.szmigiero.name>
 <875z39p6z1.fsf@vitty.brq.redhat.com>
From: "Maciej S. Szmigiero"
Subject: Re: [PATCH 2/2] KVM: Scalable memslots implementation
Date: Wed, 3 Feb 2021 21:10:38 +0100
In-Reply-To: <875z39p6z1.fsf@vitty.brq.redhat.com>

On 03.02.2021 15:21, Vitaly Kuznetsov wrote:
> "Maciej S. Szmigiero" writes:
>
>> On 03.02.2021 00:43, Sean Christopherson wrote:
>>> On Tue, Feb 02, 2021, Maciej S. Szmigiero wrote:
>>>> On 02.02.2021 02:33, Sean Christopherson wrote:
>
> ...
>
>>>>
>>>> I guess you mean to still turn id_to_index into a hash table, since
>>>> otherwise a VMM which uses just two memslots but numbered 0 and 508
>>>> will have a 509-entry id_to_index array allocated.
>>>
>>> That should be irrelevant for the purposes of optimizing hva lookups,
>>> and mostly irrelevant for optimizing memslot updates.  Using a hash
>>> table is almost a pure memory optimization, it really only matters
>>> when the max number of memslots skyrockets, which is a separate
>>> discussion from optimizing hva lookups.
>>
>> While I agree this is a separate thing from scalable hva lookups, it
>> still matters for the overall design.
>>
>> The current id_to_index array is fundamentally "pay the cost of the
>> max number of memslots possible regardless of how many you use".
>>
>> And it's not only that it takes more memory, it also forces memslot
>> create / delete / move operations to be O(n) since the indices have
>> to be updated.
>
> FWIW, I don't see a fundamental disagreement between you and Sean here,
> it's just that we may want to eat this elephant one bite at a time
> instead of trying to swallow it unchewed :-)
>
> E.g. as a first step, we may want to introduce helper functions so we
> don't work with id_to_index directly, and then suggest a better
> implementation (using an rbtree, a dynamically allocated array, ...)
> for these helpers. This is definitely more work but it's likely worth
> it.

That sounds like a good idea, I will have a look at it - thanks.

>>
>> By the way, I think nobody argues here for a bazillion of memslots.
>> It is enough to simply remove the current cap and allow the maximum
>> number permitted by the existing KVM API, that is 32k, as Vitaly's
>> patches recently did.
>
> Yeah, there's no immediate need even for 32k as KVM_MAX_VCPUS is '288',
> we can get away with e.g. 1000, but id_to_index is the only thing which
> may make us consider something lower than 32k: if only a few slots are
> used, there's no penalty (of course slot *modify* operations are O(n),
> so for 32k it'll take a long time, but these configurations are
> currently illegal and even 'slow' is better :-)
>

Yes, id_to_index has this problem of depending on the maximum number of
memslots allowed (current KVM code) or the ID of the highest memslot
currently present (a possible alternative, dynamically resized
implementation) rather than just the total number of memslots currently
in use.
So it's a "pay all the cost upfront"-type implementation.

Each slot modify operation is O(n) for the current code due to copying
of the memslots array and possibly updating id_to_index indices (and
minor things like kvm_mmu_calculate_default_mmu_pages() for x86), but
it is O(log(n)) for the new implementation - that's one of its main
benefits.

Thanks,
Maciej
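
PS: to make the helper-function idea concrete, here is a minimal
self-contained userspace sketch (NOT actual KVM code - toy_memslots,
slot_index_get and the other names are made up for the example) of
hiding the id_to_index lookup behind accessors, so the fixed array
backend could later be swapped for a hash table or rbtree without
touching any caller:

/*
 * Toy sketch of accessor helpers around an id_to_index-style mapping.
 * Callers never touch the array directly, so only these two helpers
 * would need to change if the backing store became dynamically sized.
 */
#include <stdio.h>
#include <string.h>

#define TOY_MAX_SLOTS 512	/* stand-in for the compile-time maximum */

struct toy_memslots {
	/* Today: fixed array indexed by slot id, -1 = unused.      */
	/* Tomorrow: could be a hash table or rbtree keyed by id.   */
	short id_to_index[TOY_MAX_SLOTS];
};

static void slots_init(struct toy_memslots *s)
{
	/* Setting every byte to 0xff makes every short entry -1. */
	memset(s->id_to_index, -1, sizeof(s->id_to_index));
}

/* All lookups go through these helpers, never through the array. */
static int slot_index_get(const struct toy_memslots *s, int id)
{
	return (id >= 0 && id < TOY_MAX_SLOTS) ? s->id_to_index[id] : -1;
}

static void slot_index_set(struct toy_memslots *s, int id, short idx)
{
	if (id >= 0 && id < TOY_MAX_SLOTS)
		s->id_to_index[id] = idx;
}

int main(void)
{
	struct toy_memslots slots;

	slots_init(&slots);

	/* A VMM using only ids 0 and 508 still pays for all 512   */
	/* entries with the array backend - the point made above.  */
	slot_index_set(&slots, 0, 0);
	slot_index_set(&slots, 508, 1);

	printf("id 0   -> index %d\n", slot_index_get(&slots, 0));
	printf("id 508 -> index %d\n", slot_index_get(&slots, 508));
	printf("id 42  -> index %d\n", slot_index_get(&slots, 42));
	return 0;
}

With every caller funneled through such helpers, replacing the fixed
array with a structure keyed by slot ID becomes a local change, and the
memory cost would then scale with the number of slots actually in use
rather than with the compile-time maximum.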