Date: Wed, 18 Jan 2023 17:43:46 +0000
From: Sean Christopherson
To: Vipin Sharma
Cc: David Matlack, pbonzini@redhat.com, bgardon@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches
References: <20221222023457.1764-1-vipinsh@google.com> <20221222023457.1764-2-vipinsh@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

@all, trim your replies!

On Tue, Jan 03, 2023, Vipin Sharma wrote:
> On Tue, Jan 3, 2023 at 10:01 AM Vipin Sharma wrote:
> >
> > On Thu, Dec 29, 2022 at 1:55 PM David Matlack wrote:
> > > > @@ -6646,66 +6690,49 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
> > > >  static unsigned long
> > > >  mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > >  {
> > > > -	struct kvm *kvm;
> > > > -	int nr_to_scan = sc->nr_to_scan;
> > > > +	struct kvm_mmu_memory_cache *cache;
> > > > +	struct kvm *kvm, *first_kvm = NULL;
> > > >  	unsigned long freed = 0;
> > > > +	/* spinlock for memory cache */
> > > > +	spinlock_t *cache_lock;
> > > > +	struct kvm_vcpu *vcpu;
> > > > +	unsigned long i;
> > > >
> > > >  	mutex_lock(&kvm_lock);
> > > >
> > > >  	list_for_each_entry(kvm, &vm_list, vm_list) {
> > > > -		int idx;
> > > > -		LIST_HEAD(invalid_list);
> > > > -
> > > > -		/*
> > > > -		 * Never scan more than sc->nr_to_scan VM instances.
> > > > -		 * Will not hit this condition practically since we do not try
> > > > -		 * to shrink more than one VM and it is very unlikely to see
> > > > -		 * !n_used_mmu_pages so many times.
> > > > -		 */
> > > > -		if (!nr_to_scan--)
> > > > +		if (first_kvm == kvm)
> > > >  			break;
> > > > -		/*
> > > > -		 * n_used_mmu_pages is accessed without holding kvm->mmu_lock
> > > > -		 * here. We may skip a VM instance errorneosly, but we do not
> > > > -		 * want to shrink a VM that only started to populate its MMU
> > > > -		 * anyway.
> > > > -		 */
> > > > -		if (!kvm->arch.n_used_mmu_pages &&
> > > > -		    !kvm_has_zapped_obsolete_pages(kvm))
> > > > -			continue;
> > > > +		if (!first_kvm)
> > > > +			first_kvm = kvm;
> > > > +		list_move_tail(&kvm->vm_list, &vm_list);
> > > >
> > > > -		idx = srcu_read_lock(&kvm->srcu);
> > > > -		write_lock(&kvm->mmu_lock);
> > > > +		kvm_for_each_vcpu(i, vcpu, kvm) {
> > >
> > > What protects this from racing with vCPU creation/deletion?
> > >
>
> vCPU deletion:
> We take kvm_lock in mmu_shrink_scan(), the same lock is taken in
> kvm_destroy_vm() to remove a vm from vm_list. So, once we are
> iterating vm_list we will not see any VM removal, which means no
> vcpu removal.
>
> I didn't find any other code for vCPU deletion except failures during
> VM and VCPU set up. A VM is only added to vm_list after successful
> creation.

Yep, KVM doesn't support destroying/freeing a vCPU after it's been added.

> vCPU creation:
> I think it will work.
>
> kvm_vm_ioctl_create_vcpus() initializes the vcpu, adds it to
> kvm->vcpu_array which is of the type xarray and is managed by RCU.
> After this online_vcpus is incremented. So, kvm_for_each_vcpu() which
> uses RCU to read entries, if it sees incremented online_vcpus value
> then it will also see all of the vcpu initialization.

Yep.  The shrinker may race with a vCPU creation, e.g. not process a
just-created vCPU, but that's totally ok in this case since the shrinker
path is best effort (and purging the caches of a newly created vCPU is
likely pointless).

> @Sean, Paolo
>
> Is the above explanation correct, kvm_for_each_vcpu() is safe without any lock?

Well, in this case, you do need to hold kvm_lock ;-)

But yes, iterating over vCPUs without holding the per-VM kvm->lock is safe,
the caller just needs to ensure the VM can't be destroyed, i.e. either needs
to hold a reference to the VM or needs to hold kvm_lock.
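
For anyone following along, the "publish the vCPU, then bump online_vcpus"
ordering discussed above can be modeled in userspace roughly as below.  This
is only a sketch under stated assumptions, not KVM code: toy_vm, toy_vcpu,
toy_create_vcpu() and toy_purge_caches() are hypothetical names, C11
release/acquire atomics stand in for the kernel's barriers, a fixed array
stands in for the xarray, and a single creator at a time is assumed.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_VCPUS 8

/* Hypothetical stand-in for struct kvm_vcpu. */
struct toy_vcpu {
	int id;
	int cache_objects;	/* pretend fill level of a per-vCPU cache */
};

/* Hypothetical stand-in for struct kvm. */
struct toy_vm {
	struct toy_vcpu *vcpus[MAX_VCPUS];
	atomic_int online_vcpus;
};

/*
 * Creator side: fully initialize the vCPU and store its pointer first,
 * then publish the new count with release semantics.  Assumes creators
 * are serialized by the caller.
 */
static int toy_create_vcpu(struct toy_vm *vm, struct toy_vcpu *vcpu,
			   int cache_objects)
{
	int idx = atomic_load_explicit(&vm->online_vcpus,
				       memory_order_relaxed);

	if (idx >= MAX_VCPUS)
		return -1;

	vcpu->id = idx;
	vcpu->cache_objects = cache_objects;
	vm->vcpus[idx] = vcpu;

	atomic_store_explicit(&vm->online_vcpus, idx + 1,
			      memory_order_release);
	return idx;
}

/*
 * Shrinker-like reader: the acquire load of the count guarantees that
 * every vCPU below that count is visible and initialized.  A vCPU that
 * is published after the load is simply skipped (best effort).  The
 * caller must keep the VM alive for the duration of the walk.
 */
static int toy_purge_caches(struct toy_vm *vm)
{
	int n = atomic_load_explicit(&vm->online_vcpus,
				     memory_order_acquire);
	int freed = 0;

	for (int i = 0; i < n; i++) {
		freed += vm->vcpus[i]->cache_objects;
		vm->vcpus[i]->cache_objects = 0;
	}
	return freed;
}
```

The reader takes no per-VM lock here.  The only thing it relies on is that
the toy_vm itself can't go away during the walk, which is the part kvm_lock
(or a VM reference) provides in the real code.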