Message-ID: <0591cb18-77e1-4e98-a405-4a39cfb512e1@gmail.com>
Date: Wed, 13 Dec 2023 16:24:58 +0800
Subject: Re: [PATCH v2] KVM: x86/intr: Explicitly check NMI from guest to eliminate false positives
From: Like Xu
To: Dongli Zhang
Cc: Paolo Bonzini, Andi Kleen, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Sean Christopherson
References: <20231206032054.55070-1-likexu@tencent.com> <6d3417f7-062e-9934-01ab-20e3a46656a7@oracle.com>
In-Reply-To: <6d3417f7-062e-9934-01ab-20e3a46656a7@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 13/12/2023 3:28 pm, Dongli Zhang wrote:
> Hi Like,
>
> On 12/5/23 19:20, Like Xu wrote:
>> From: Like Xu
>>
>> Explicitly checking the source of external interrupt is indeed NMI and not
>> other types in the kvm_arch_pmi_in_guest(), which prevents perf-kvm false
>> positive samples generated in perf/core NMI mode after vm-exit but before
>> kvm_before_interrupt() from being incorrectly labelled as guest samples:
>
> About the before kvm_before_interrupt() ...
>
>>
>> # test: perf-record + cpu-cycles:HP (which collects host-only precise samples)
>> # Symbol                                   Overhead       sys       usr  guest sys  guest usr
>> # .......................................  ........  ........  ........  .........  .........
>> #
>> # Before:
>>   [g] entry_SYSCALL_64                       24.63%     0.00%     0.00%     24.63%      0.00%
>>   [g] syscall_return_via_sysret              23.23%     0.00%     0.00%     23.23%      0.00%
>>   [g] files_lookup_fd_raw                     6.35%     0.00%     0.00%      6.35%      0.00%
>> # After:
>>   [k] perf_adjust_freq_unthr_context         57.23%    57.23%     0.00%      0.00%      0.00%
>>   [k] __vmx_vcpu_run                          4.09%     4.09%     0.00%      0.00%      0.00%
>>   [k] vmx_update_host_rsp                     3.17%     3.17%     0.00%      0.00%      0.00%
>>
>> In the above case, perf records the samples labelled '[g]', the RIPs behind
>> the weird samples are actually being queried by perf_instruction_pointer()
>> after determining whether it's in GUEST state or not, and here's the issue:
>>
>> If vm-exit is caused by a non-NMI interrupt (such as hrtimer_interrupt) and
>> at least one PMU counter is enabled on host, the kvm_arch_pmi_in_guest()
>> will remain true (KVM_HANDLING_IRQ is set) until kvm_before_interrupt().
>
> ... and here.
>
> Would you mind helping why kvm_arch_pmi_in_guest() remains true before
> *kvm_before_interrupt()*.
>
> According to the source code, the vcpu->arch.handling_intr_from_guest
> is set to non-zero only at kvm_before_interrupt(), and cleared at
> kvm_after_interrupt().
>
> Or would you mean kvm_after_interrupt()?

Oops, it should refer to kvm_after_interrupt(), as the code does. Thank you.

>
> Thank you very much!
>
> Dongli Zhang
>
>>
>> During this window, if a PMI occurs on host (since the KVM instructions on
>> host are being executed), the control flow, with the help of the host NMI
>> context, will be transferred to perf/core to generate performance samples,
>> thus perf_instruction_pointer() and perf_guest_get_ip() is called.
>>
>> Since kvm_arch_pmi_in_guest() only checks if there is an interrupt, it may
>> cause perf/core to mistakenly assume that the source RIP of the host NMI
>> belongs to the guest world and use perf_guest_get_ip() to get the RIP of
>> a vCPU that has already exited by a non-NMI interrupt.
>>
>> Error samples are recorded and presented to the end-user via perf-report.
>> Such false positive samples could be eliminated by explicitly determining
>> if the exit reason is KVM_HANDLING_NMI.
>>
>> Note that when vm-exit is indeed triggered by PMI and before HANDLING_NMI
>> is cleared, it's also still possible that another PMI is generated on host.
>> Also for perf/core timer mode, the false positives are still possible since
>> that non-NMI sources of interrupts are not always being used by perf/core.
>> In both cases above, perf/core should correctly distinguish between real
>> RIP sources or even need to generate two samples, belonging to host and
>> guest separately, but that's perf/core's story for interested warriors.
>>
>> Fixes: dd60d217062f ("KVM: x86: Fix perf timer mode IP reporting")
>> Signed-off-by: Like Xu
>> ---
>> V1 -> V2 Changelog:
>> - Refine commit message to cover both perf/core timer and NMI modes;
>> - Use in_nmi() to distinguish whether it's NMI mode or not; (Sean)
>> V1: https://lore.kernel.org/kvm/20231204074535.9567-1-likexu@tencent.com/
>>  arch/x86/include/asm/kvm_host.h | 10 +++++++++-
>>  arch/x86/kvm/x86.h              |  6 ------
>>  2 files changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index c8c7e2475a18..167d592e08d0 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1868,8 +1868,16 @@ static inline int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn,
>>  }
>>  #endif /* CONFIG_HYPERV */
>>
>> +enum kvm_intr_type {
>> +	/* Values are arbitrary, but must be non-zero. */
>> +	KVM_HANDLING_IRQ = 1,
>> +	KVM_HANDLING_NMI,
>> +};
>> +
>> +/* Enable perf NMI and timer modes to work, and minimise false positives. */
>>  #define kvm_arch_pmi_in_guest(vcpu) \
>> -	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
>> +	((vcpu) && (vcpu)->arch.handling_intr_from_guest && \
>> +	 (in_nmi() == ((vcpu)->arch.handling_intr_from_guest == KVM_HANDLING_NMI)))
>>
>>  void __init kvm_mmu_x86_module_init(void);
>>  int kvm_mmu_vendor_module_init(void);
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index 2f7e19166658..4dc38092d599 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -431,12 +431,6 @@ static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm)
>>  	return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED;
>>  }
>>
>> -enum kvm_intr_type {
>> -	/* Values are arbitrary, but must be non-zero. */
>> -	KVM_HANDLING_IRQ = 1,
>> -	KVM_HANDLING_NMI,
>> -};
>> -
>>  static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu,
>>  						 enum kvm_intr_type intr)
>>  {
>>
>> base-commit: 1ab097653e4dd8d23272d028a61352c23486fd4a