Date: Wed, 12 Oct 2022 20:14:16 +0000
From: Sean Christopherson
To: isaku.yamahata@intel.com
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini,
	Thomas Gleixner, Marc Zyngier, Will Deacon, isaku.yamahata@gmail.com,
	Kai Huang, Chao Gao, Atish Patra, Shaokun Zhang, Daniel Lezcano,
	Huang Ying, Huacai Chen, Dave Hansen, Borislav Petkov
Subject: Re: [PATCH v5 09/30] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock
References: <92836b09c8e0f19f8e506008e45993881d22b6d1.1663869838.git.isaku.yamahata@intel.com>
In-Reply-To: <92836b09c8e0f19f8e506008e45993881d22b6d1.1663869838.git.isaku.yamahata@intel.com>

On Thu, Sep 22, 2022, isaku.yamahata@intel.com wrote:
> From: Isaku Yamahata
>
> Because kvm_count_lock unnecessarily complicates the KVM locking convention
> Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock for
> simplicity. kvm_arch_hardware_enable/disable() callbacks depend on
> non-preemptiblity with the spin lock. Add preempt_disable/enable()
> around hardware enable/disable callback to keep the assumption.

There's the other "minor" wrinkle that prior to patch 7, "KVM: Rename and move
CPUHP_AP_KVM_STARTING to ONLINE section", kvm_online_cpu() was called with IRQs
disabled and couldn't sleep, i.e. couldn't acquire a mutex. That's very
important to capture in the changelog.

> Because kvm_suspend() and kvm_resume() is called with interrupt disabled,
> they don't need preempt_disable/enable() pair.
>
> Opportunistically add some comments on locking.
>
> Suggested-by: Sean Christopherson
> Signed-off-by: Isaku Yamahata

...

> @@ -5028,13 +5029,20 @@ static int kvm_online_cpu(unsigned int cpu)
>  	if (kvm_usage_count) {
>  		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
>
> +		/*
> +		 * arch callback kvm_arch_hardware_eanble() assumes that

s/eanble/enable

Though even better would be to avoid function names entirely.

> +		 * preemption is disabled for historical reason. Disable
> +		 * preemption until all arch callbacks are fixed.
> +		 */

Probably better to put this comment above the WARN_ON_ONCE() in
hardware_enable_nolock() since that's where the oddity and dependency on arch
behavior lies. And then it can be turned into a FIXME, e.g.

	/*
	 * FIXME: drop the "preemption disabled" requirement here and in the
	 * disable path once all arch code plays nice with preemption.
	 */

> +		preempt_disable();
>  		hardware_enable_nolock(NULL);
> +		preempt_enable();
>  		if (atomic_read(&hardware_enable_failed)) {
>  			atomic_set(&hardware_enable_failed, 0);
>  			ret = -EIO;
>  		}
>  	}
> -	raw_spin_unlock(&kvm_count_lock);
> +	mutex_unlock(&kvm_lock);
>  	return ret;
>  }
>
> @@ -5042,6 +5050,8 @@ static void hardware_disable_nolock(void *junk)
>  {
>  	int cpu = raw_smp_processor_id();
>
> +	WARN_ON_ONCE(preemptible());
> +
>  	if (!cpumask_test_cpu(cpu, cpus_hardware_enabled))
>  		return;
>  	cpumask_clear_cpu(cpu, cpus_hardware_enabled);
> @@ -5050,10 +5060,18 @@ static void hardware_disable_nolock(void *junk)
>
>  static int kvm_offline_cpu(unsigned int cpu)
>  {
> -	raw_spin_lock(&kvm_count_lock);
> -	if (kvm_usage_count)
> +	mutex_lock(&kvm_lock);
> +	if (kvm_usage_count) {
> +		/*
> +		 * arch callback kvm_arch_hardware_disable() assumes that
> +		 * preemption is disabled for historical reason. Disable
> +		 * preemption until all arch callbacks are fixed.
> +		 */

I vote to drop this comment and instead document everything in the enable FIXME
(see above).

> +		preempt_disable();
>  		hardware_disable_nolock(NULL);
> -	raw_spin_unlock(&kvm_count_lock);
> +		preempt_enable();
> +	}
> +	mutex_unlock(&kvm_lock);
>  	return 0;
>  }

...

> @@ -5708,15 +5728,27 @@ static void kvm_init_debug(void)
>
>  static int kvm_suspend(void)
>  {
> -	if (kvm_usage_count)
> +	/*
> +	 * The caller ensures that CPU hotplug is disabled by
> +	 * cpu_hotplug_disable() and other CPUs are offlined. No need for
> +	 * locking.

Disabling CPU hotplug prevents racing with kvm_online_cpu()/kvm_offline_cpu(),
but doesn't prevent racing with hardware_enable_all()/hardware_disable_all().
And the lockdep doesn't mesh with the comment, which explains why kvm_lock
doesn't _need_ to be held, but not why kvm_lock _can't_ be held.

Maybe this?

	/*
	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
	 * is stable. Assert that kvm_lock is not held as a paranoid sanity
	 * check that the system isn't suspended when KVM is enabling hardware.
	 */

> +	 */
> +	lockdep_assert_not_held(&kvm_lock);
> +
> +	if (kvm_usage_count) {
> +		/*
> +		 * Because kvm_suspend() is called with interrupt disabled, no
> +		 * need to disable preemption.
> +		 */

Add a lockdep and drop the comment, e.g. below the lockdep_assert_not_held(),
add

	lockdep_assert_irqs_disabled();

That covers the "why doesn't this disable preemption" _and_ enforces that IRQs
are indeed disabled.

>  		hardware_disable_nolock(NULL);
> +	}
>  	return 0;
>  }
>
>  static void kvm_resume(void)
>  {
>  	if (kvm_usage_count) {
> -		lockdep_assert_not_held(&kvm_count_lock);
> +		lockdep_assert_not_held(&kvm_lock);
>  		hardware_enable_nolock(NULL);
>  	}
>  }
> --
> 2.25.1
>
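
For illustration only, folding the two kvm_suspend() suggestions above together
might look roughly like the sketch below. It is untested against the real tree.
Everything prefixed with sketch_, plus the userspace definitions of kvm_lock and
the two lockdep macros, are stand-ins assumed here so that the snippet compiles
outside the kernel; they are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins, NOT the kernel definitions.  kvm_lock is
 * modelled as a "currently held" flag and a failed assertion just bumps a
 * counter instead of splatting.
 */
bool kvm_lock;				/* true == "held" in this model */
bool sketch_irqs_disabled;		/* models the IRQ state of the caller */
int sketch_nr_warnings;			/* number of failed assertions */
int sketch_nr_hw_disable_calls;		/* times hardware was "disabled" */
int kvm_usage_count;

#define SKETCH_WARN_ON(cond) \
	do { if (cond) sketch_nr_warnings++; } while (0)
#define lockdep_assert_not_held(lock)	SKETCH_WARN_ON(*(lock))
#define lockdep_assert_irqs_disabled()	SKETCH_WARN_ON(!sketch_irqs_disabled)

/* Stub for the real per-CPU disable path; only counts invocations. */
static void hardware_disable_nolock(void *junk)
{
	(void)junk;
	sketch_nr_hw_disable_calls++;
}

/* static in the kernel; left non-static here so it can be poked at. */
int kvm_suspend(void)
{
	/*
	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
	 * is stable.  Assert that kvm_lock is not held as a paranoid sanity
	 * check that the system isn't suspended when KVM is enabling hardware.
	 */
	lockdep_assert_not_held(&kvm_lock);
	lockdep_assert_irqs_disabled();

	if (kvm_usage_count)
		hardware_disable_nolock(NULL);
	return 0;
}
```

The point of the second assertion is that the "IRQs are disabled, so preemption
is too" assumption becomes something that is checked rather than something
that is only described in a comment.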