Received: by 2002:a05:6358:111d:b0:dc:6189:e246 with SMTP id f29csp732708rwi; Wed, 2 Nov 2022 17:58:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7OMq7fOAQchhprFyZBaciN84XDkKSpeaHy0pNhGMP7x23L99I3BZ9+BgTAMM0z8BKZTVgU X-Received: by 2002:a17:907:7f8e:b0:7aa:7598:126d with SMTP id qk14-20020a1709077f8e00b007aa7598126dmr26958967ejc.289.1667437109556; Wed, 02 Nov 2022 17:58:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667437109; cv=none; d=google.com; s=arc-20160816; b=bQW7wPyejd7ukDIC+GoWKm5BEQlqvjWakVrQdJxEJzioz0UtyKevECkzQPz7IuSsXj SgZk3BwSgealBZ+0R8t5YesOy6u5hSf9EVPOOhEosQaR4CGaiKJyVoPdmP8uCu1k+m4B AXbLXT5BkMVVCml6onuL2yfoAxnjGroDtJiDTKwj3aYKgNmB2RYhc1d3aCCgLvX+TYEb t6tXPkmuhp+QRjXwvyJaSQcDJ21nmCqgtLELezG6MDSW00eB+Y9D5okeXG0mqEdw+vOf lm31p3RVKZTMtWpu0mgu6SZ/S6E2g2kEnduqeDmg9CeApPmAnCK5i+vHlPkeKCsVNN4p FGgQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:dkim-signature; bh=+sxIRPFhQ/bRU1EUsINusyHUfwnJnra57CB3HRbIrGQ=; b=0DDwz/UFY+1EIE9Jlt68qmmp42+vrMHorS2AfqSOttD+/CZ992AH2eY3CYvzq2Lzjh ZUlirDdEEfFhzHodBN85RXnygNdKbDeON4S+EYzGS4aJUb422RbHGhXDJEVTDBsN/vEo ZgnWojkCShXi3TsgL7qHC1gtcGfMu8VIzCd/ZmZhRmDmYvvfZi/5saB7lLpcdHuo8N7l MgXGJCfv8PPKAAV2Co0acDJyVhpnmB+WWNtf1RpKPjuJFI6W506xmRBzXKn77hCKyxs3 lLp99nc8aDcNbv8pZTyTbJjLwtqD0KzVpgBbQYisOOMhMalZVYhaJz9rwIKuHttyNeeD umPQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=jv6dI7qG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d9-20020a50cd49000000b00452dfa2bb27si17863875edj.134.2022.11.02.17.58.06; Wed, 02 Nov 2022 17:58:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=jv6dI7qG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230371AbiKBX02 (ORCPT + 97 others); Wed, 2 Nov 2022 19:26:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231524AbiKBXYv (ORCPT ); Wed, 2 Nov 2022 19:24:51 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E45E11A2A for ; Wed, 2 Nov 2022 16:20:37 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-3697bd55974so721777b3.15 for ; Wed, 02 Nov 2022 16:20:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=+sxIRPFhQ/bRU1EUsINusyHUfwnJnra57CB3HRbIrGQ=; b=jv6dI7qGHV2gUZ3Nk9vOoHe84FBP9TkOY9A9eVQMhtjmGk9u0jlVKDO6SkkJVrwt12 m0CdSNNgCHma9W4plIHPNP31YE0isW6rHxg8SLzN06JNj7B4eH/JVFDKLvwVk+pTrSfc cd1f3arotAsI9PLIVZ+gM5iOHqG0gkQonoBeG+igtzlakyKPUg7bVk8HEG4Qt0gpuy/J SANWh5EPPIcvcI1b1i5q00zNbhYEA45rs8BxPx1RbxWAx78g+tAqKNfe3EdC5ZhIPY91 Xb4J0zHYKnAPUlNccijsQBbbLfGEdn+GfBR6MlkOjmTN4KzH5w39TKzZQ1G6FhxzCFCP aqaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=+sxIRPFhQ/bRU1EUsINusyHUfwnJnra57CB3HRbIrGQ=; b=f8OCET7DZzBCdxT9P+hLs3sdSWAaFrn+2qPwrgytDjgjZVcRkSIq44KJRBcU1t9KJn aXH7vx5LwVzDTuukWrfmwslTN7iWSafrDo07u9G/kfc+k8TQYX/xRIb0caXr+91b6N6y SksbTlYzWOMEmKBLwjubeNhNrMjCydvcA9ZSHGSx1WVWNn6XiCpEr6vIETX4NKfCgbvN HwqTmJDN/M1VavtTnX+05Sgk0mGt823t8STbROpqo4r9lhNCb7vq4Enx5rYnjpoPCJNc X53rPQ3XpgmemdVGT4oDOtFGE9HcB95aB5A9+kUoZQJyKChIVg9YMp868TV8FSJkKqGd FQoA== X-Gm-Message-State: ACrzQf2JZKAnEEZ9kxqHCgn71Hqoqcf8wb0d7su4FOFBO7Z/8bputsVt 9DUdOgTun4LUozvvnxCvSx0KjJwDy3s= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:690c:80a:b0:36b:6ff3:ee05 with SMTP id bx10-20020a05690c080a00b0036b6ff3ee05mr183596ywb.495.1667431220753; Wed, 02 Nov 2022 16:20:20 -0700 (PDT) Reply-To: Sean Christopherson Date: Wed, 2 Nov 2022 23:19:06 +0000 In-Reply-To: <20221102231911.3107438-1-seanjc@google.com> Mime-Version: 1.0 References: <20221102231911.3107438-1-seanjc@google.com> X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog Message-ID: <20221102231911.3107438-40-seanjc@google.com> Subject: [PATCH 39/44] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock From: Sean Christopherson To: Paolo Bonzini , Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , Matthew Rosato , Eric Farman , Sean Christopherson , Vitaly Kuznetsov Cc: James Morse , Alexandru Elisei , Suzuki K Poulose , Oliver Upton , Atish Patra , David Hildenbrand , kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, Isaku Yamahata , Fabiano Rosas , Michael Ellerman , Chao Gao , Thomas Gleixner , Yuan Yao Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Isaku Yamahata Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock now that KVM hooks CPU hotplug during the ONLINE phase, which can sleep. Previously, KVM hooked the STARTING phase, which is not allowed to sleep and thus could not take kvm_lock (a mutex). Explicitly disable preemptions/IRQs in the CPU hotplug paths as needed to keep arch code happy, e.g. x86 expects IRQs to be disabled during hardware enabling, and expects preemption to be disabled during hardware disabling. There are no preemption/interrupt concerns in the hotplug path, i.e. the extra disabling is done purely to allow x86 to keep its sanity checks, which are targeted primiarily at the "enable/disable all" paths. Opportunistically update KVM's locking documentation. Signed-off-by: Isaku Yamahata Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson --- Documentation/virt/kvm/locking.rst | 18 ++++++------ virt/kvm/kvm_main.c | 44 +++++++++++++++++++++--------- 2 files changed, 40 insertions(+), 22 deletions(-) diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst index 845a561629f1..4feaf527575b 100644 --- a/Documentation/virt/kvm/locking.rst +++ b/Documentation/virt/kvm/locking.rst @@ -9,6 +9,8 @@ KVM Lock Overview The acquisition orders for mutexes are as follows: +- cpus_read_lock() is taken outside kvm_lock + - kvm->lock is taken outside vcpu->mutex - kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock @@ -29,6 +31,8 @@ The acquisition orders for mutexes are as follows: On x86: +- kvm_lock is taken outside kvm->mmu_lock + - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock - kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock and @@ -216,15 +220,11 @@ time it will be set using the Dirty tracking mechanism described above. :Type: mutex :Arch: any :Protects: - vm_list - -``kvm_count_lock`` -^^^^^^^^^^^^^^^^^^ - -:Type: raw_spinlock_t -:Arch: any -:Protects: - hardware virtualization enable/disable -:Comment: 'raw' because hardware enabling/disabling must be atomic /wrt - migration. + - kvm_usage_count + - hardware virtualization enable/disable + - module probing (x86 only) +:Comment: KVM also disables CPU hotplug via cpus_read_lock() during + enable/disable. ``kvm->mn_invalidate_lock`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4e765ef9f4bd..c8d92e6c3922 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -100,7 +100,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink); */ DEFINE_MUTEX(kvm_lock); -static DEFINE_RAW_SPINLOCK(kvm_count_lock); LIST_HEAD(vm_list); static cpumask_var_t cpus_hardware_enabled; @@ -5028,9 +5027,10 @@ static void hardware_enable_nolock(void *junk) static int kvm_online_cpu(unsigned int cpu) { + unsigned long flags; int ret = 0; - raw_spin_lock(&kvm_count_lock); + mutex_lock(&kvm_lock); /* * Abort the CPU online process if hardware virtualization cannot * be enabled. Otherwise running VMs would encounter unrecoverable @@ -5039,13 +5039,16 @@ static int kvm_online_cpu(unsigned int cpu) if (kvm_usage_count) { WARN_ON_ONCE(atomic_read(&hardware_enable_failed)); + local_irq_save(flags); hardware_enable_nolock(NULL); + local_irq_restore(flags); + if (atomic_read(&hardware_enable_failed)) { atomic_set(&hardware_enable_failed, 0); ret = -EIO; } } - raw_spin_unlock(&kvm_count_lock); + mutex_unlock(&kvm_lock); return ret; } @@ -5061,10 +5064,13 @@ static void hardware_disable_nolock(void *junk) static int kvm_offline_cpu(unsigned int cpu) { - raw_spin_lock(&kvm_count_lock); - if (kvm_usage_count) + mutex_lock(&kvm_lock); + if (kvm_usage_count) { + preempt_disable(); hardware_disable_nolock(NULL); - raw_spin_unlock(&kvm_count_lock); + preempt_enable(); + } + mutex_unlock(&kvm_lock); return 0; } @@ -5079,9 +5085,11 @@ static void hardware_disable_all_nolock(void) static void hardware_disable_all(void) { - raw_spin_lock(&kvm_count_lock); + cpus_read_lock(); + mutex_lock(&kvm_lock); hardware_disable_all_nolock(); - raw_spin_unlock(&kvm_count_lock); + mutex_unlock(&kvm_lock); + cpus_read_unlock(); } static int hardware_enable_all(void) @@ -5097,7 +5105,7 @@ static int hardware_enable_all(void) * Disable CPU hotplug to prevent scenarios where KVM sees */ cpus_read_lock(); - raw_spin_lock(&kvm_count_lock); + mutex_lock(&kvm_lock); kvm_usage_count++; if (kvm_usage_count == 1) { @@ -5110,7 +5118,7 @@ static int hardware_enable_all(void) } } - raw_spin_unlock(&kvm_count_lock); + mutex_unlock(&kvm_lock); cpus_read_unlock(); return r; @@ -5716,6 +5724,15 @@ static void kvm_init_debug(void) static int kvm_suspend(void) { + /* + * Secondary CPUs and CPU hotplug are disabled across the suspend/resume + * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count + * is stable. Assert that kvm_lock is not held as a paranoid sanity + * check that the system isn't suspended when KVM is enabling hardware. + */ + lockdep_assert_not_held(&kvm_lock); + lockdep_assert_irqs_disabled(); + if (kvm_usage_count) hardware_disable_nolock(NULL); return 0; @@ -5723,10 +5740,11 @@ static int kvm_suspend(void) static void kvm_resume(void) { - if (kvm_usage_count) { - lockdep_assert_not_held(&kvm_count_lock); + lockdep_assert_not_held(&kvm_lock); + lockdep_assert_irqs_disabled(); + + if (kvm_usage_count) hardware_enable_nolock(NULL); - } } static struct syscore_ops kvm_syscore_ops = { -- 2.38.1.431.g37b22c650d-goog