From: Jim Mattson
Date: Sat, 26 Feb 2022 06:24:54 -0800
Subject: Re: [PATCH v3] KVM: VMX: Enable Notify VM exit
To: Xiaoyao Li
Cc: Paolo Bonzini, Chenyi Qiang, Sean Christopherson, Vitaly Kuznetsov,
 Wanpeng Li, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <71736b9d-9ed4-ea02-e702-74cae0340d66@intel.com>
References: <20220223062412.22334-1-chenyi.qiang@intel.com>
 <88eb9a9a-fbe3-8e2c-02bd-4bdfc855b67f@intel.com>
 <6a839b88-392d-886d-836d-ca04cf700dce@intel.com>
 <7859e03f-10fa-dbc2-ed3c-5c09e62f9016@redhat.com>
 <71736b9d-9ed4-ea02-e702-74cae0340d66@intel.com>

On Fri, Feb 25, 2022 at 10:24 PM Xiaoyao Li wrote:
>
> On 2/26/2022 12:53 PM, Jim Mattson wrote:
> > On Fri, Feb 25, 2022 at 8:25 PM Jim Mattson wrote:
> >>
> >> On Fri, Feb 25, 2022 at 8:07 PM Xiaoyao Li wrote:
> >>>
> >>> On 2/25/2022 11:13 PM, Paolo Bonzini wrote:
> >>>> On 2/25/22 16:12, Xiaoyao Li wrote:
> >>>>>>>>
> >>>>>>>
> >>>>>>> I don't like the idea of making things up without notifying userspace
> >>>>>>> that this is fictional. How is my customer running nested VMs supposed
> >>>>>>> to know that L2 didn't actually shutdown, but L0 killed it because the
> >>>>>>> notify window was exceeded? If this information isn't reported to
> >>>>>>> userspace, I have no way of getting the information to the customer.
> >>>>>>
> >>>>>> Then, maybe a dedicated software define VM exit for it instead of
> >>>>>> reusing triple fault?
> >>>>>>
> >>>>>
> >>>>> Second thought, we can even just return Notify VM exit to L1 to tell
> >>>>> L2 causes Notify VM exit, even thought Notify VM exit is not exposed
> >>>>> to L1.
> >>>>
> >>>> That might cause NULL pointer dereferences or other nasty occurrences.
> >>>
> >>> IMO, a well written VMM (in L1) should handle it correctly.
> >>>
> >>> L0 KVM reports no Notify VM Exit support to L1, so L1 runs without
> >>> setting Notify VM exit. If a L2 causes notify_vm_exit with
> >>> invalid_vm_context, L0 just reflects it to L1. In L1's view, there is no
> >>> support of Notify VM Exit from VMX MSR capability.
> >>> Following L1 handler is possible:
> >>>
> >>> a) if (notify_vm_exit available & notify_vm_exit enabled) {
> >>>        handle in b)
> >>>    } else {
> >>>        report unexpected vm exit reason to userspace;
> >>>    }
> >>>
> >>> b) similar handler like we implement in KVM:
> >>>    if (!vm_context_invalid)
> >>>        re-enter guest;
> >>>    else
> >>>        report to userspace;
> >>>
> >>> c) no Notify VM Exit related code (e.g. old KVM), it's treated as
> >>>    unsupported exit reason
> >>>
> >>> As long as it belongs to any case above, I think L1 can handle it
> >>> correctly. Any nasty occurrence should be caused by incorrect handler in
> >>> L1 VMM, in my opinion.
> >>
> >> Please test some common hypervisors (e.g. ESXi and Hyper-V).
> >
> > I took a look at KVM in Linux v4.9 (one of our more popular guests),
> > and it will not handle this case well:
> >
> >         if (exit_reason < kvm_vmx_max_exit_handlers
> >             && kvm_vmx_exit_handlers[exit_reason])
> >                 return kvm_vmx_exit_handlers[exit_reason](vcpu);
> >         else {
> >                 WARN_ONCE(1, "vmx: unexpected exit reason 0x%x\n", exit_reason);
> >                 kvm_queue_exception(vcpu, UD_VECTOR);
> >                 return 1;
> >         }
> >
> > At least there's an L1 kernel log message for the first unexpected
> > NOTIFY VM-exit, but after that, there is silence. Just a completely
> > inexplicable #UD in L2, assuming that L2 is resumable at this point.
>
> At least there is a message to tell L1 a notify VM exit is triggered in
> L2. Yes, the inexplicable #UD won't be hit unless L2 triggers Notify VM
> exit with invalid_context, which is malicious to L0 and L1.

There is only an L1 kernel log message *the first time*. That's not
good enough. And this is just one of the myriad of possible L1
hypervisors.

> If we use triple_fault (i.e., shutdown), then no info to tell L1 that
> it's caused by Notify VM exit with invalid context. Triple fault needs
> to be extended and L1 kernel needs to be enlightened. It doesn't help
> old guest kernel.
>
> If we use Machine Check, it's somewhat same inexplicable to L2 unless
> it's enlightened. But it doesn't help old guest kernel.
>
> Anyway, for Notify VM exit with invalid context from L2, I don't see a
> good solution to tell L1 VMM it's a "Notify VM exit with invalid context
> from L2" and keep all kinds of L1 VMM happy, especially for those with
> old kernel versions.

I agree that there is no way to make every conceivable L1 happy.
That's why the information needs to be surfaced to the L0 userspace. I
contend that any time L0 kvm violates the architectural specification
in its emulation of L1 or L2, the L0 userspace *must* be informed.
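
To make the "only the first time" point concrete, here is a minimal
user-space sketch of the v4.9-style dispatch quoted above. It is a
simplified model, not kernel code: struct vcpu_model, handle_exit(),
the handler table size, and the counters are all hypothetical names
introduced for illustration, and the once-only flag merely
approximates what WARN_ONCE does.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model: how many exit reasons this "L1" knows about. */
#define MAX_EXIT_HANDLERS 8

/* Hypothetical bookkeeping standing in for the L1 log and L2 state. */
struct vcpu_model {
    int warnings_logged;  /* messages that reached the L1 kernel log */
    int ud_injected;      /* #UD exceptions queued for L2 */
};

typedef int (*exit_handler_t)(struct vcpu_model *);

/* A handler for an exit reason that L1 does understand. */
int handle_known_exit(struct vcpu_model *v)
{
    (void)v;
    return 1;
}

/* Sparse table: only reason 0 has a handler in this model. */
exit_handler_t exit_handlers[MAX_EXIT_HANDLERS] = {
    [0] = handle_known_exit,
};

/* Approximates WARN_ONCE: the message is emitted a single time. */
bool warned_once;

int handle_exit(struct vcpu_model *v, unsigned int exit_reason)
{
    if (exit_reason < MAX_EXIT_HANDLERS && exit_handlers[exit_reason])
        return exit_handlers[exit_reason](v);

    if (!warned_once) {
        warned_once = true;
        fprintf(stderr, "vmx: unexpected exit reason 0x%x\n", exit_reason);
        v->warnings_logged++;
    }
    /* Every unexpected exit still injects #UD, logged or not. */
    v->ud_injected++;
    return 1;
}
```

Calling handle_exit() three times with an exit reason that has no
handler leaves warnings_logged at 1 and ud_injected at 3 in this
model: one log line, three unexplained #UDs in L2.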