Date: Wed, 6 Apr 2022 15:13:35 +0200
From: "Maciej S. Szmigiero"
To: Sean Christopherson
Cc: Maxim Levitsky, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/8] KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction
References: <20220402010903.727604-1-seanjc@google.com> <20220402010903.727604-6-seanjc@google.com> <7caee33a-da0f-00be-3195-82c3d1cd4cb4@maciej.szmigiero.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6.04.2022 03:48, Sean Christopherson wrote:
> On Mon, Apr 04, 2022, Maciej S. Szmigiero wrote:
(..)
>> Also, I'm not sure that even the proposed updated code above will
>> actually restore the L1-requested next_rip correctly on L1 -> L2
>> re-injection (will review once the full version is available).
>
> Spoiler alert, it doesn't. Save yourself the review time. :-)
>
> The missing piece is stashing away the injected event on nested VMRUN. Those
> events don't get routed through the normal interrupt/exception injection code and
> so the next_rip info is lost on the subsequent #NPF.
>
> Treating soft interrupts/exceptions like they were injected by KVM (which they
> are, technically) works and doesn't seem too gross. E.g. when prepping vmcb02
>
>     if (svm->nrips_enabled)
>         vmcb02->control.next_rip = svm->nested.ctl.next_rip;
>     else if (boot_cpu_has(X86_FEATURE_NRIPS))
>         vmcb02->control.next_rip = vmcb12_rip;
>
>     if (is_evtinj_soft(vmcb02->control.event_inj)) {
>         svm->soft_int_injected = true;
>         svm->soft_int_csbase = svm->vmcb->save.cs.base;
>         svm->soft_int_old_rip = vmcb12_rip;
>         if (svm->nrips_enabled)
>             svm->soft_int_next_rip = svm->nested.ctl.next_rip;
>         else
>             svm->soft_int_next_rip = vmcb12_rip;
>     }
>
> And then the VMRUN error path just needs to clear soft_int_injected.

I am also a fan of parsing EVENTINJ from VMCB12 into relevant KVM
injection structures (much like EXITINTINFO is parsed), as I said to
Maxim two days ago [1].
Not only for software {interrupts,exceptions} but for all incoming
events (again, just like EXITINTINFO).

However, there is another issue related to L1 -> L2 event re-injection
using the standard KVM event injection mechanism: it mixes the L1
injection state with the L2 one.

Specifically for SVM:
* When re-injecting an NMI into L2, NMI blocking is enabled in
vcpu->arch.hflags (shared between L1 and L2) and the IRET intercept is
enabled.

This is incorrect, since it is L1 that is responsible for enforcing NMI
blocking for NMIs that it injects into its L2.
Also, *L2* being the target of such an injection definitely should not
block further NMIs for *L1*.

* When re-injecting a *hardware* IRQ into L2, GIF is checked (previously
even on the BUG_ON() level), while L1 should be able to inject even when
L2 GIF is off.

With the code in my previous patch set I planned to use
exit_during_event_injection() to detect such a case, but if we implement
VMCB12 EVENTINJ parsing we can simply add a flag that the relevant event
comes from L1, so its normal injection side-effects should be skipped.

By the way, the relevant VMX code also looks rather suspicious,
especially for the !enable_vnmi case.

Thanks,
Maciej

[1]: https://lore.kernel.org/kvm/7d67bc6f-00ac-7c07-f6c2-c41b2f0d35a1@maciej.szmigiero.name/
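
To make the "flag that the event comes from L1" idea a bit more concrete,
here is a rough sketch written as a standalone userspace model, not as KVM
code. The EVENTINJ bit layout follows the AMD APM and the macro names mirror
the ones in the kernel's svm.h. Everything else (struct parsed_event,
struct vcpu_model, parse_eventinj(), inject_nmi() and the from_l1 field) is
made up for illustration and only models the NMI case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EVENTINJ layout per the AMD APM; names mirror the kernel's svm.h. */
#define SVM_EVTINJ_VEC_MASK   0xffu
#define SVM_EVTINJ_TYPE_SHIFT 8
#define SVM_EVTINJ_TYPE_MASK  (7u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_INTR  (0u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_NMI   (2u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_EXEPT (3u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_SOFT  (4u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_VALID_ERR  (1u << 11)
#define SVM_EVTINJ_VALID      (1u << 31)

/* Hypothetical: a parsed event that remembers who requested it. */
struct parsed_event {
    bool     valid;
    uint8_t  vector;
    uint32_t type;           /* one of SVM_EVTINJ_TYPE_* */
    bool     has_error_code;
    uint32_t error_code;
    bool     from_l1;        /* true: L1 asked for this via VMCB12 EVENTINJ */
};

/* Hypothetical stand-in for the NMI state shared between L1 and L2. */
struct vcpu_model {
    bool nmi_masked;
    bool iret_intercepted;
};

/* Parse a VMCB12 EVENTINJ value; such events always originate from L1. */
static struct parsed_event parse_eventinj(uint32_t event_inj, uint32_t err)
{
    struct parsed_event ev = {0};

    ev.valid = (event_inj & SVM_EVTINJ_VALID) != 0;
    if (!ev.valid)
        return ev;

    ev.vector = (uint8_t)(event_inj & SVM_EVTINJ_VEC_MASK);
    ev.type = event_inj & SVM_EVTINJ_TYPE_MASK;
    ev.has_error_code = (event_inj & SVM_EVTINJ_VALID_ERR) != 0;
    ev.error_code = ev.has_error_code ? err : 0;
    ev.from_l1 = true;
    return ev;
}

/* Model of NMI injection: skip the side effects for L1-provided events. */
static void inject_nmi(struct vcpu_model *vcpu, const struct parsed_event *ev)
{
    assert(ev->valid && ev->type == SVM_EVTINJ_TYPE_NMI);

    /* (Handing the event to the hardware would happen here.) */

    if (ev->from_l1)
        return; /* L1 enforces NMI blocking for its own L2. */

    vcpu->nmi_masked = true;
    vcpu->iret_intercepted = true;
}
```

In this model an NMI parsed from EVENTINJ leaves nmi_masked and
iret_intercepted untouched, while the same event with from_l1 cleared (an
NMI originated by KVM itself) sets both. A GIF check for hardware IRQs
could be gated on the same flag.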