Message-ID: <8548ea3c-7ec6-4027-927f-fce9bbd5e6a3@gmail.com>
Date: Thu, 7 Mar 2024 20:02:27 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [Patch v3] KVM: x86/pmu: Manipulate FIXED_CTR_CTRL MSR with macros
To: Sean Christopherson
Cc: Mingwei Zhang, Paolo Bonzini, Kan Liang, Like Xu, kvm@vger.kernel.org,
 linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
 Zhenyu Wang, Zhang Xiong, Lv Zhiyuan, Dapeng Mi
References: <20230824020546.1108516-1-dapeng1.mi@linux.intel.com>
 <2677739b-bc84-43ee-ba56-a5e243148ceb@gmail.com>
From: Like Xu

On 7/3/2024 4:17 am, Sean Christopherson wrote:
> On Wed, Mar 06, 2024, Like Xu wrote:
>>>> @@ -595,7 +600,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
>>>>  		pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
>>>>  	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
>>>>  		pmu->fixed_ctr_ctrl_mask &=
>>>> -			~(1ULL << (INTEL_PMC_IDX_FIXED + i * 4));
>>>
>>> OMG, this might just win the award for most obfuscated PMU code in KVM, which is
>>> saying something. The fact that INTEL_PMC_IDX_FIXED happens to be 32, the same
>>> bit number as ICL_FIXED_0_ADAPTIVE, is 100% coincidence. Good riddance.
>>>
>>> Argh, and this goofy code helped introduce a real bug. reprogram_fixed_counters()
>>> doesn't account for the upper 32 bits of IA32_FIXED_CTR_CTRL.
>>>
>>> Wait, WTF? Nothing in KVM accounts for the upper bits. This can't possibly work.
>>>
>>> IIUC, because KVM _always_ sets precise_ip to a non-zero bit for PEBS events,
>>> perf will _always_ generate an adaptive record, even if the guest requested a
>>> basic record. Ugh, and KVM will always generate adaptive records even if the
>>> guest doesn't support them. This is all completely broken. It probably kinda
>>> sorta works because the Basic info is always stored in the record, and generating
>>> more info requires a non-zero MSR_PEBS_DATA_CFG, but ugh.
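For readers following the bit arithmetic in the quoted hunk, here is a minimal standalone sketch of the IA32_FIXED_CTR_CTRL layout as the thread describes it: a 4-bit control field per fixed counter in the low half, and a per-counter adaptive-PEBS bit starting at bit 32 with the same 4-bit stride. This is not KVM code. The macro and helper names are made up for illustration, and the bound of 8 counters is only an assumption that keeps the shifts inside 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names, not kernel definitions. */
#define FIXED_CTRL_FIELD_BITS   4   /* control bits per fixed counter */
#define FIXED_ADAPTIVE_BASE_BIT 32  /* bit number of ICL_FIXED_0_ADAPTIVE per the thread */
#define FIXED_MAX_SKETCH_CTRS   8   /* assumption: keeps every shift below bit 64 */

/* Mask of the low 4-bit control field that belongs to fixed counter idx. */
static inline uint64_t fixed_ctrl_field_mask(unsigned int idx)
{
	assert(idx < FIXED_MAX_SKETCH_CTRS);
	return 0xfULL << (idx * FIXED_CTRL_FIELD_BITS);
}

/* Adaptive-PEBS enable bit for fixed counter idx, located in the upper 32 bits. */
static inline uint64_t fixed_adaptive_bit(unsigned int idx)
{
	assert(idx < FIXED_MAX_SKETCH_CTRS);
	return 1ULL << (FIXED_ADAPTIVE_BASE_BIT + idx * FIXED_CTRL_FIELD_BITS);
}

/* Disable one fixed counter by clearing both its control field and its adaptive bit. */
static inline uint64_t fixed_ctrl_disable(uint64_t ctrl, unsigned int idx)
{
	return ctrl & ~(fixed_ctrl_field_mask(idx) | fixed_adaptive_bit(idx));
}
```

The point of the sketch is only that any code which masks or compares the low 32 bits alone silently ignores the adaptive bits, which is the class of bug described in the quoted text.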
>>
>> Yep, it works at least on machines with both adaptive and pebs_full features.
>
> *AND* if the guest uses PEBS exactly the same way that the Linux kernel uses PEBS.
>
>> Mingwei or others are encouraged to construct use cases in KUT::pmu_pebs.flat
>> that violate guest-pebs rules (e.g., leak host state), as we all recognize
>> that testing is the right way to condemn legacy code, not just lengthy emails.
>
> *sigh*
>
> diff --git a/x86/pmu_pebs.c b/x86/pmu_pebs.c
> index f7b52b90..43e7a207 100644
> --- a/x86/pmu_pebs.c
> +++ b/x86/pmu_pebs.c
> @@ -212,8 +212,12 @@ static void pebs_enable(u64 bitmask, u64 pebs_data_cfg)
>  	u64 baseline_extra_ctrl = 0, fixed_ctr_ctrl = 0;
>  	unsigned int idx;
>
> -	if (has_baseline)
> -		wrmsr(MSR_PEBS_DATA_CFG, pebs_data_cfg);
> +	if (has_baseline) {
> +		if (pebs_data_cfg)
> +			wrmsr(MSR_PEBS_DATA_CFG, pebs_data_cfg);
> +		else
> +			wrmsr(MSR_PEBS_DATA_CFG, 0xf);
> +	}
>
>  	ds = (struct debug_store *)ds_bufer;
>  	ds->pebs_index = ds->pebs_buffer_base = (unsigned long)pebs_buffer;
> @@ -224,7 +228,7 @@ static void pebs_enable(u64 bitmask, u64 pebs_data_cfg)
>  	for (idx = 0; idx < pmu.nr_fixed_counters; idx++) {
>  		if (!(BIT_ULL(FIXED_CNT_INDEX + idx) & bitmask))
>  			continue;
> -		if (has_baseline)
> +		if (has_baseline && pebs_data_cfg)
>  			baseline_extra_ctrl = BIT(FIXED_CNT_INDEX + idx * 4);
>  		wrmsr(MSR_PERF_FIXED_CTRx(idx), ctr_start_val);
>  		fixed_ctr_ctrl |= (0xbULL << (idx * 4) | baseline_extra_ctrl);
> @@ -235,7 +239,7 @@ static void pebs_enable(u64 bitmask, u64 pebs_data_cfg)
>  	for (idx = 0; idx < max_nr_gp_events; idx++) {
>  		if (!(BIT_ULL(idx) & bitmask))
>  			continue;
> -		if (has_baseline)
> +		if (has_baseline && pebs_data_cfg)
>  			baseline_extra_ctrl = ICL_EVENTSEL_ADAPTIVE;
>  		wrmsr(MSR_GP_EVENT_SELECTx(idx), EVNTSEL_EN | EVNTSEL_OS | EVNTSEL_USR |
>  			intel_arch_events[idx] | baseline_extra_ctrl);
>
> FAIL: Multiple (0x700000055): PEBS record (written seq 0) is verified (including size, counters and cfg).
> FAIL: The pebs_record_size (488) doesn't match with MSR_PEBS_DATA_CFG (32).
> FAIL: The pebs_data_cfg (0xf) doesn't match with MSR_PEBS_DATA_CFG (0x0).
> FAIL: GP counter 0 (0xfffffffffffe): PEBS record (written seq 0) is verified (including size, counters and cfg).
> FAIL: The pebs_record_size (488) doesn't match with MSR_PEBS_DATA_CFG (32).
> FAIL: The pebs_data_cfg (0xf) doesn't match with MSR_PEBS_DATA_CFG (0x0).
> FAIL: Multiple (0x700000055): PEBS record (written seq 0) is verified (including size, counters and cfg).
> FAIL: The pebs_record_size (488) doesn't match with MSR_PEBS_DATA_CFG (32).
> FAIL: The pebs_data_cfg (0xf) doesn't match with MSR_PEBS_DATA_CFG (0x0).
>
>>> Oh great, and it gets worse. intel_pmu_disable_fixed() doesn't clear the upper
>>> bits either, i.e. leaves ICL_FIXED_0_ADAPTIVE set. Unless I'm misreading the code,
>>> intel_pmu_enable_fixed() effectively doesn't clear ICL_FIXED_0_ADAPTIVE either,
>>> as it only modifies the bit when it wants to set ICL_FIXED_0_ADAPTIVE.
>>>
>>> *sigh*
>>>
>>> I'm _very_ tempted to disable KVM PEBS support for the current PMU, and make it
>>> available only when the so-called passthrough PMU is available[*]. Because I
>>> don't see how this can possibly be functionally correct, nor do I see a way
>>> to make it functionally correct without a rather large and invasive series.
>>
>> Considering that I've tried the idea myself, I have no inclination towards
>> "passthrough PMU", and I'd like to be able to take the time to review that
>> patchset while we all wait for a clear statement from that perf-core man,
>> who doesn't really care about virtualization and doesn't want to lose control
>> of global hardware resources.
>>
>> Before we actually get to that ideal state you want, we have to deal with
>> some intermediate state and face any users that rely on the current code,
>
> It's not an ideal state, it's simply the only way I see to get things like adaptive
> PEBS to work safely, reliably, and correctly without taking on an absurd amount of
> complexity.

The upstream guest_pebs feature has always been broken from my perspective, but
unfortunately I didn't get enough time to polish it (slow path, host/guest
multiplexing, live migration support, security defense), so I'm really glad that
Google is seriously considering using this or that PMU feature.

>
>> you had urged to merge in a KVM document for vPMU, not sure how far
>> along that part of the work is.
>
>>> Ouch. And after chatting with Mingwei, who asked the very good question of
>>> "can this leak host state?", I am pretty sure that yes, this can leak host state.
>>
>> The Basic Info has a tsc field, I suspect it's the host-state-tsc.
>
> It's not, the CPU offsets it correctly, at least on ICX (I haven't checked scaling).

And one more thing to check: the tsc fields from the guest Intel_PT or Timed LBR, if any.

>
>>> When PERF_CAP_PEBS_BASELINE is enabled for the guest, i.e. when the guest has
>>> access to adaptive records, KVM gives the guest full access to MSR_PEBS_DATA_CFG
>>>
>>> pmu->pebs_data_cfg_mask = ~0xff00000full;
>>>
>>> which makes sense in a vacuum, because AFAICT the architecture doesn't allow
>>> exposing a subset of the four adaptive controls.
>>>
>>> GPRs and XMMs are always context switched and thus benign, but IIUC, Memory Info
>>> provides data that might not otherwise be available to the guest, e.g. if host
>>> userspace has disallowed equivalent events via KVM_SET_PMU_EVENT_FILTER.
>>
>> Indeed, KVM_SET_PMU_EVENT_FILTER doesn't work in harmony with
>> guest-pebs, and I believe there is a big problem here, especially with the
>> lack of targeted testing.
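As a reading aid for the mask quoted above, the following standalone sketch shows what `~0xff00000full` permits: bits 3:0 select the four adaptive groups named in the thread (Memory Info, GPRs, XMMs, LBRs) and bits 31:24 are left writable as well. The bit order of the four groups and the reading of bits 31:24 as an LBR entry count are my understanding of the SDM, not something stated in this thread, so treat them as assumptions. All macro and function names are hypothetical and are not KVM's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reserved bits per the quoted mask: everything except bits 3:0 and 31:24. */
#define PEBS_DATA_CFG_RSVD_MASK (~0xff00000fULL)

/* Assumed bit order of the four adaptive groups (illustrative names). */
#define CFG_MEMINFO (1ULL << 0)
#define CFG_GPRS    (1ULL << 1)
#define CFG_XMMS    (1ULL << 2)
#define CFG_LBRS    (1ULL << 3)

/* A write is acceptable under the mask only if it touches no reserved bit. */
static inline bool pebs_data_cfg_valid(uint64_t data)
{
	return (data & PEBS_DATA_CFG_RSVD_MASK) == 0;
}

/* True if the value asks for LBR entries to be appended to each record. */
static inline bool pebs_data_cfg_wants_lbrs(uint64_t data)
{
	return (data & CFG_LBRS) != 0;
}
```

Under this mask the value 0xf used in the modified test is accepted as-is, which is why nothing stops a guest from requesting Memory Info and LBR groups once adaptive PEBS is exposed.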
>>
>> One reason for this is that we don't use this cockamamie API in our
>
> Libeling APIs because they aren't useful for _your_ security goals doesn't mean
> you get to ignore their existence when contributing upstream. New features don't
> necessarily have to fully support existing capabilities, e.g. I would be a-ok
> making KVM_SET_PMU_EVENT_FILTER mutually exclusive with exposing adaptive PEBS
> to the guest. That way the user can at least know that filtering won't work if
> adaptive PEBS is exposed to the guest.

Yes, that should be considered.

>
>> large-scale production environments, and users of vPMU want to get real
>> runtime information about physical cpus, not just virtualised hardware
>> architecture interfaces.
>>
>>>
>>> And unless I'm missing something, LBRs are a full leak of host state. Nothing
>>> in the SDM suggests that PEBS records honor MSR intercepts, so unless KVM is
>>> also passing through LBRs, i.e. is context switching all LBR MSRs, the guest can
>>> use PEBS to read host LBRs at will.
>>
>> KVM is also passing through LBRs when guest uses LBR but not at the
>> granularity of vm-exit/entry. I'm not sure if the LBR_EN bit is required
>> to get LBR information via PEBS, also not confirmed whether PEBS-lbr
>> can be enabled at the same time as independent LBR;
>>
>> I recall that PEBS-assist, per cpu-arch, would clean up this part of the
>> record when crossing root/non-root boundaries, or not generate record.
>
> Nope. The MSRs definitely leak to the guest. The only hard part was figuring
> out how to get perf to utilize LBRs without consuming every counter (`perf top`
> was the extent of my knowledge, until now...).
>
> E.g.
>
>   perf record -b -e instructions

Consulting "Ian Rogers" is the fastest route.
>
> and
>
> FAIL: PEBS LBR record 0 isn't empty, got from = 'ffffffffc0a00ccb', to = 'ffffffffc0a010af', info = '2'
> FAIL: PEBS LBR record 1 isn't empty, got from = 'ffffffffc0a010aa', to = 'ffffffffc0a00cb0', info = '6'
> FAIL: PEBS LBR record 2 isn't empty, got from = 'ffffffffc0a00e06', to = 'ffffffffc0a01090', info = '1'
> FAIL: PEBS LBR record 3 isn't empty, got from = 'ffffffffc0a00df4', to = 'ffffffffc0a00e00', info = '2'
> FAIL: PEBS LBR record 4 isn't empty, got from = 'ffffffffc0a00dbc', to = 'ffffffffc0a00de0', info = '1'
> FAIL: PEBS LBR record 5 isn't empty, got from = 'ffffffffc0a00f63', to = 'ffffffffc0a00db5', info = '1'
> FAIL: PEBS LBR record 6 isn't empty, got from = 'ffffffffc0903f23', to = 'ffffffffc0a00f61', info = '11'
> FAIL: PEBS LBR record 7 isn't empty, got from = 'ffffffffc0a00f5c', to = 'ffffffffc0903f10', info = '1'
> FAIL: PEBS LBR record 8 isn't empty, got from = 'ffffffffc0a00db0', to = 'ffffffffc0a00f55', info = '1'
> FAIL: PEBS LBR record 9 isn't empty, got from = 'ffffffff8f6b2c23', to = 'ffffffffc0a00da6', info = 'a'
> FAIL: PEBS LBR record 10 isn't empty, got from = 'ffffffffc0a00da1', to = 'ffffffff8f6b2b60', info = '7'
> FAIL: PEBS LBR record 11 isn't empty, got from = 'ffffffff8eba1b85', to = 'ffffffffc0a00d9a', info = '6'
> FAIL: PEBS LBR record 12 isn't empty, got from = 'ffffffff8eba1b8c', to = 'ffffffff8eba1b3f', info = '1'
> FAIL: PEBS LBR record 13 isn't empty, got from = 'ffffffff8eba1b2d', to = 'ffffffff8eba1b87', info = 'e'
> FAIL: PEBS LBR record 14 isn't empty, got from = 'ffffffff8eba1aff', to = 'ffffffff8eba1b18', info = '7'
> FAIL: PEBS LBR record 15 isn't empty, got from = 'ffffffff8eb996e0', to = 'ffffffff8eba1aeb', info = '2'
> FAIL: PEBS LBR record 16 isn't empty, got from = 'ffffffff8eb9963a', to = 'ffffffff8eb996cc', info = '1'
> FAIL: PEBS LBR record 17 isn't empty, got from = 'ffffffff8eb995ef', to = 'ffffffff8eb9962f', info = '1'
> FAIL: PEBS LBR record 18 isn't empty, got from = 'ffffffff8f6b30a1', to = 'ffffffff8eb995df', info = '4'
> FAIL: PEBS LBR record 19 isn't empty, got from = 'ffffffff8eb995da', to = 'ffffffff8f6b3070', info = '2'
> FAIL: PEBS LBR record 20 isn't empty, got from = 'ffffffff8eba1ae6', to = 'ffffffff8eb995c0', info = '6'
> FAIL: PEBS LBR record 21 isn't empty, got from = 'ffffffffc0a00d95', to = 'ffffffff8eba1a90', info = '2'
> FAIL: PEBS LBR record 22 isn't empty, got from = 'ffffffff8eb69135', to = 'ffffffffc0a00d89', info = '2'
> FAIL: PEBS LBR record 23 isn't empty, got from = 'ffffffff8eb690d6', to = 'ffffffff8eb69104', info = '2'
> FAIL: PEBS LBR record 24 isn't empty, got from = 'ffffffff8eb69102', to = 'ffffffff8eb690c3', info = '1'
> FAIL: PEBS LBR record 25 isn't empty, got from = 'ffffffff8eb6f442', to = 'ffffffff8eb69100', info = '2'
> FAIL: PEBS LBR record 26 isn't empty, got from = 'ffffffff8eb6f3f3', to = 'ffffffff8eb6f429', info = '5'
> FAIL: PEBS LBR record 27 isn't empty, got from = 'ffffffff8eb6f3a6', to = 'ffffffff8eb6f3c3', info = '2'
> FAIL: PEBS LBR record 28 isn't empty, got from = 'ffffffff8eb690fb', to = 'ffffffff8eb6f390', info = '2'
> FAIL: PEBS LBR record 29 isn't empty, got from = 'ffffffff8eb690c1', to = 'ffffffff8eb690d8', info = '2'
> FAIL: PEBS LBR record 30 isn't empty, got from = 'ffffffff8eb6907b', to = 'ffffffff8eb690b1', info = '1'
> FAIL: PEBS LBR record 31 isn't empty, got from = 'ffffffff8eb690af', to = 'ffffffff8eb6906e', info = '1'

Guilty Guilty Guilty

>
>>> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
>>> index 41a4533f9989..a2f827fa0ca1 100644
>>> --- a/arch/x86/kvm/vmx/capabilities.h
>>> +++ b/arch/x86/kvm/vmx/capabilities.h
>>> @@ -392,7 +392,7 @@ static inline bool vmx_pt_mode_is_host_guest(void)
>>>  static inline bool vmx_pebs_supported(void)
>>>  {
>>> -	return boot_cpu_has(X86_FEATURE_PEBS) && kvm_pmu_cap.pebs_ept;
>>> +	return false;
>>
>> As you know, user-space VMM may disable guest-pebs by filtering out the
>> MSR_IA32_PERF_CAPABILITIES.PERF_CAP_PEBS_FORMAT or CPUID.PDCM.
>
> Relying on userspace to fudge around KVM bugs is not acceptable for upstream.
> PMU virtualization is already a bit dicey in that KVM relies on userspace to
> filter out sensitive events, but leaking host addresses and failing to correctly
> virtualize the architecture is an entirely different level of wrong.

The leak is real and I would do the same thing (drop ADAPTIVE) on the kernel side.

>
>> In the end, if our great KVM maintainers insist on doing this,
>> there is obviously nothing I can do about it.
>
> Bullshit. There is plenty you could have done to prevent this. It took me what,
> 6 hours total? To (a) realize the code is buggy, (b) type up an email, and (c)
> modify tests to confirm the bugs. And at least half of that time was due to me
> floundering around trying to generate LBR records because I know nothing about
> the perf tool.
>
> If you were more interested in ensuring KVM is maintainable, secure, and functionally
> correct, we wouldn't be where we are today.

I was actually really trying to communicate my todo list to my successors in this
direction, but it took a long time for them to warm up.

Very pleased to know that you have so much manpower invested in vPMU, which was
unimaginable a few years ago when I was the only one doing these things.

We have benefited from your efforts here or there, and hope to still have the
opportunity to contribute back.

Thank you, Sean.