Received: by 2002:a05:6358:4e97:b0:b3:742d:4702 with SMTP id ce23csp3380930rwb; Tue, 16 Aug 2022 01:44:53 -0700 (PDT) X-Google-Smtp-Source: AA6agR6z9JnobUE1jVXtl7ca9WOEftpNZw7pSiH4qk9BvBOh6HKVqvMJqJzraB/DKeUwIiWjX03H X-Received: by 2002:a05:6a00:787:b0:52d:47ca:e9fb with SMTP id g7-20020a056a00078700b0052d47cae9fbmr19928032pfu.39.1660639492886; Tue, 16 Aug 2022 01:44:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660639492; cv=none; d=google.com; s=arc-20160816; b=socKVSyFD93pg2BxrdJn/zbsngslzyJZq2uWwhmJ56ub9ayWMsz7qjgZajcbjKJ1Hr n120yC8nt4gY3HWqdG4MArj58LQR7iZOiQs221QuUX9Zg5+gXBzCfo0STsVEZjS66155 lmhUqxu0L3LfvG445QRTfdrUxLIAojbk1XfKTA/SrWPhGuuGqVOm5l8H29uIi7XyaHif xLohadtC9nzMXgCWeGFZDzw0yjepenzufYIrUTx/y4L5KfDkzPhJqn5ITZOn3YDZCJ4Q PvMhSQs29B1zzYv/DK52jufyL6twJ7A2zMozXtlg1A2bzGVM6QzF4tKIx8eDm6KI/CT/ cLag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:references:mime-version :message-id:in-reply-to:date:reply-to:dkim-signature; bh=/mb/4kA68DuEk+etiSxItbDhS1o312Y9/FSzPon3SwU=; b=V931vJRhZp1fjoeP1LsBUNwVALJR+GZAWlDYxLaDBMwzLC27eBv3+9ZFtNUcAiDbl4 jAyUoOLbImcQ7B1oXgVOPw+hjL3bq9Bp+V0C1M/A3ntzClCQOCY459qopJFsIcYbDMVw LLcgtfT7LhIyqOpM+KK8KqT3qgg2S8xqxJN/e6TcmuNAqu8ROvESy/tJgLyPfISvMiXM znCh9uC21v9IiqktoLMTVA//IF2/xgBQpjGhfKSUR4Ah2Vfydv0Bs9sEzVe734H/03BO /QrJcVyjb07f7b7QNPx7EiCFTyVw8A6ZkWxj1zNMMQblETYX9DzfzVVXUmHSChkpWJsZ ukww== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b="E/qmMTDg"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id rm11-20020a17090b3ecb00b001f50c96fe7bsi13422907pjb.62.2022.08.16.01.44.32; Tue, 16 Aug 2022 01:44:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b="E/qmMTDg"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233100AbiHPI3K (ORCPT + 99 others); Tue, 16 Aug 2022 04:29:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40164 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233017AbiHPI21 (ORCPT ); Tue, 16 Aug 2022 04:28:27 -0400 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 18DCCB5A54 for ; Mon, 15 Aug 2022 22:39:45 -0700 (PDT) Received: by mail-pl1-x649.google.com with SMTP id h12-20020a170902f54c00b0016f8858ce9bso6053919plf.9 for ; Mon, 15 Aug 2022 22:39:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:reply-to:from:to:cc; bh=/mb/4kA68DuEk+etiSxItbDhS1o312Y9/FSzPon3SwU=; b=E/qmMTDg2mnLIRqmZ/h0BV2xJPkV0YxMHFVoLTvK/m0l5eKcjvYe7Zs4cjq3foQ8DG XcXMqto2Rrn733cK+JUdoPaHanFJx4f8VMzwOJpE/N9fgyhTZK+y5sIbR6ieG9QOOL37 RW6GJwft95tp/9Bs9OQrOQ0z95G6TVpIyueG0TfruZQabzNGWUa6bqAEET4qItWN3XsO ingzWu6BKh2Ndo7UBJshY/BIRkdfcw0/C0mLSdqDlYhbJZX+VGJIkxs245ImtKPCkNHs kIddymjEooD8ywYCmBhhdT/cwcNipE9ZDjxD1O+xSnKLKwBMf2HEUfiaPgsyRkoOXDnP E6hA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc; bh=/mb/4kA68DuEk+etiSxItbDhS1o312Y9/FSzPon3SwU=; b=yiPJYf1nLh7Ft7/z4gEb6I1ZmcYG18S0GJLiGPl+NjIM8tyH3psmPu8U/Gb9qkESsX JDxnx6C3IU3LiA1AW7nkVkWOZRy4uR4t4Z7fRXHcE6VMrp/lEIy888ZvG70H71YyrqSV dJsYaG5C8Jrzlq9WpQuv3l6WyL/VY/pWEvgnZZkAtZbLhohaFAiVvMTpIkJFxJSmykxQ 2zap2YDZ4jO6HKolpA29g8NJi0r9D20gdYxiDpRcbqnGcrloiHMWuY54yZ3JtQQgPEy+ LC+MyIPNee3hEHBXFG5UXoLkjUeWQIAOFhOCdY4CawZ7EYuEBll2eQMVR8SEeltBTURm QviA== X-Gm-Message-State: ACgBeo2wK6KfAluWE6o2JaWhYEsnTPsaSqvsb6itCd0L11Xrdr67Uumm VwqwkDbw7HdFXN8eWk1usTVVregK++4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6a00:4147:b0:52e:2d56:17c8 with SMTP id bv7-20020a056a00414700b0052e2d5617c8mr19678854pfb.51.1660628384621; Mon, 15 Aug 2022 22:39:44 -0700 (PDT) Reply-To: Sean Christopherson Date: Tue, 16 Aug 2022 05:39:36 +0000 In-Reply-To: <20220816053937.2477106-1-seanjc@google.com> Message-Id: <20220816053937.2477106-3-seanjc@google.com> Mime-Version: 1.0 References: <20220816053937.2477106-1-seanjc@google.com> X-Mailer: git-send-email 2.37.1.595.g718a3a8f04-goog Subject: [PATCH 2/3] KVM: Unconditionally get a ref to /dev/kvm module when creating a VM From: Sean Christopherson To: Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, syzbot+744e173caec2e1627ee0@syzkaller.appspotmail.com, Oliver Upton , Sean Christopherson , David Matlack Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Unconditionally get a reference to the /dev/kvm module when creating a VM instead of using try_get_module(), which will fail if the module is in the process of being forcefully unloaded. The error handling when try_get_module() fails doesn't properly unwind all that has been done, e.g. doesn't call kvm_arch_pre_destroy_vm() and doesn't remove the VM from the global list. Not removing VMs from the global list tends to be fatal, e.g. leads to use-after-free explosions. The obvious alternative would be to add proper unwinding, but the justification for using try_get_module(), "rmmod --wait", is completely bogus as support for "rmmod --wait", i.e. delete_module() without O_NONBLOCK, was removed by commit 3f2b9c9cdf38 ("module: remove rmmod --wait option.") nearly a decade ago. It's still possible for try_get_module() to fail due to the module dying (more like being killed), as the module will be tagged MODULE_STATE_GOING by "rmmod --force", i.e. delete_module(..., O_TRUNC), but playing nice with forced unloading is an exercise in futility and gives a falsea sense of security. Using try_get_module() only prevents acquiring _new_ references, it doesn't magically put the references held by other VMs, and forced unloading doesn't wait, i.e. "rmmod --force" on KVM is all but guaranteed to cause spectacular fireworks; the window where KVM will fail try_get_module() is tiny compared to the window where KVM is building and running the VM with an elevated module refcount. Addressing KVM's inability to play nice with "rmmod --force" is firmly out-of-scope. Forcefully unloading any module taints kernel (for obvious reasons) _and_ requires the kernel to be built with CONFIG_MODULE_FORCE_UNLOAD=y, which is off by default and comes with the amusing disclaimer that it's "mainly for kernel developers and desperate users". In other words, KVM is free to scoff at bug reports due to using "rmmod --force" while VMs may be running. Fixes: 5f6de5cbebee ("KVM: Prevent module exit until all VMs are freed") Cc: stable@vger.kernel.org Cc: David Matlack Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index ee5f48cc100b..15e304e059d4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1134,6 +1134,9 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname) if (!kvm) return ERR_PTR(-ENOMEM); + /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */ + __module_get(kvm_chardev_ops.owner); + KVM_MMU_LOCK_INIT(kvm); mmgrab(current->mm); kvm->mm = current->mm; @@ -1226,16 +1229,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname) preempt_notifier_inc(); kvm_init_pm_notifier(kvm); - /* - * When the fd passed to this ioctl() is opened it pins the module, - * but try_module_get() also prevents getting a reference if the module - * is in MODULE_STATE_GOING (e.g. if someone ran "rmmod --wait"). - */ - if (!try_module_get(kvm_chardev_ops.owner)) { - r = -ENODEV; - goto out_err; - } - return kvm; out_err: @@ -1259,6 +1252,7 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname) out_err_no_srcu: kvm_arch_free_vm(kvm); mmdrop(current->mm); + module_put(kvm_chardev_ops.owner); return ERR_PTR(r); } -- 2.37.1.595.g718a3a8f04-goog