Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp559890ybi; Tue, 16 Jul 2019 01:44:59 -0700 (PDT) X-Google-Smtp-Source: APXvYqxEqRHd4C5JMHCGxpJ10ULCbCdHXoZG5WzR/8YM1S7P4IeLPo1yPgPYQ2PiAlpl1PCD783d X-Received: by 2002:a17:90a:109:: with SMTP id b9mr32979682pjb.112.1563266699339; Tue, 16 Jul 2019 01:44:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1563266699; cv=none; d=google.com; s=arc-20160816; b=UPPtO6sA5lwMC6ohNFLA1xVEvpm7LHnMYoWULV1QbA1tE3ZwMtzsC0KhvsdvLz/j1f UJL9rlSCAEVZCpvAKAiyO+R3Ytv1+ZySn/h9A/CmraPXNkiByQy20px5YUMXBFMdGMEC 3jyaVffOZlbyhJ7raA1h43Vw5nJuJjN3L7zc5oc581vBqW9cb5c7Mwu0TS6zbDoZ9TXs mC/esfvSs60E2ganSBEy9yRnzAHBPDLwEdw9f2grcR1hnfO7zCsLIUNs62GpsWNOhs26 qux4vG3lWTgLXtJCn4oEDHF0vjv/70Voba+tuT2FD0F9IGW+7jCtytn0RmBRnntZP4Ay cqRg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:in-reply-to :references:subject:cc:to:mime-version:user-agent:from:date :message-id; bh=kPfzuEypzqtuez7N556Q3sQ+a3X3tvfQcWWy1IOjlj8=; b=L//H0Xk2mEe+8+M+/IGYFJI3U/pxGaPnS6DV1ev8ciSyd0X9LXvvNtWKcjjYObonIF dXaTazrGWjXLhjg2m/7Jd80SzmGUt+258drMsmrfI2sbGHrK/H3zsS7strZPnbVNyuc6 ttXMn+iazU6y4OqOx9xR4ofE8TGtMCvk8gZXfHgFsBwAGv5Zkl8bkYnNTo6GltBQJbdX 2X4cdJBeGoL1i0s62dLdsj/Aa5QQy1/S1uS7ZHEXVBWKbpX8ahLhzUEBz9Iqq84qXhvI vhJqxjsm3YsYzotnfQhur+GrMGXsVtSIoi7wOulAf8IRjHahgSRFjj/HpDhkkzyMa3TL AyQw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o7si14846787pgq.459.2019.07.16.01.44.43; Tue, 16 Jul 2019 01:44:59 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731403AbfGPIoX (ORCPT + 99 others); Tue, 16 Jul 2019 04:44:23 -0400 Received: from mga17.intel.com ([192.55.52.151]:24132 "EHLO mga17.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726536AbfGPIoX (ORCPT ); Tue, 16 Jul 2019 04:44:23 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Jul 2019 01:44:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,497,1557212400"; d="scan'208";a="167588676" Received: from unknown (HELO [10.239.13.7]) ([10.239.13.7]) by fmsmga008.fm.intel.com with ESMTP; 16 Jul 2019 01:44:21 -0700 Message-ID: <5D2D8FB4.3020505@intel.com> Date: Tue, 16 Jul 2019 16:49:56 +0800 From: Wei Wang User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Eric Hankland CC: Paolo Bonzini , rkrcmar@redhat.com, linux-kernel@vger.kernel.org, Stephane Eranian , kvm@vger.kernel.org Subject: Re: [PATCH v2] KVM: x86: PMU Event Filter References: <5D27FE26.1050002@intel.com> In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 07/16/2019 08:10 AM, Eric Hankland wrote: >> I think just disabling guest cpuid might not be enough, since guest >> could write to the msr without checking the cpuid. >> >> Why not just add a bitmap for fixed counter? >> e.g. fixed_counter_reject_bitmap >> >> At the beginning of reprogram_fixed_counter, we could add the check: >> >> if (test_bit(idx, &kvm->arch.fixed_counter_reject_bitmap)) >> return -EACCES; >> >> (Please test with your old guest and see if they have issues if we >> inject #GP when >> they try to set the fixed_ctrl msr. If there is, we could drop -EACCESS >> above) >> >> The bitmap could be set at kvm_vm_ioctl_set_pmu_event_filter. > intel_pmu_refresh() checks the guest cpuid and sets the number of > fixed counters according to that: > pmu->nr_arch_fixed_counters = min_t(int, edx.split.num_counters_fixed, > INTEL_PMC_MAX_FIXED); > > and reprogram_fixed_counters()/get_fixed_pmc() respect this so the > guest can't just ignore the cpuid. Yes, but as you noticed, we couldn't disable fixed counter 2 while keeping counter 3 running via that. > Adding a bitmap does let you do things like disable the first counter > but keep the second and third, but given that there are only three and > the events are likely to be on a whitelist anyway, it seemed like > adding the bitmap wasn't worth it. If you still feel the same way even > though we can disable them via the cpuid, I can add this in. We need the design to be architecturally clean. For example, if the hardware later comes up with fixed counter 5 and 6. 5 is something we really want to expose to the guest to use while 6 isn't, can our design here (further) support that? We don't want to re-design this at that time. However, extending what we have would be acceptable. So, if you hesitate to add the bitmap method that I described, please add GP tags to the ACTIONs defined, e.g. enum kvm_pmu_action_type { KVM_PMU_EVENT_ACTION_GP_NONE = 0, KVM_PMU_EVENT_ACTION_GP_ACCEPT = 1, KVM_PMU_EVENT_ACTION_GP_REJECT = 2, KVM_PMU_EVENT_ACTION_MAX }; and add comments to explain something like below: Those GP actions are for the filtering of guest events running on the virtual general purpose counters. The actions to filter guest events running on the virtual fixed function counters are not added currently as they all seem fine to be used by the guest so far, but that could be supported on demand in the future via adding new actions. >> I think it would be better to add more, please see below: >> >> enum kvm_pmu_action_type { >> KVM_PMU_EVENT_ACTION_NONE = 0, >> KVM_PMU_EVENT_ACTION_ACCEPT = 1, >> KVM_PMU_EVENT_ACTION_REJECT = 2, >> KVM_PMU_EVENT_ACTION_MAX >> }; >> >> and do a check in kvm_vm_ioctl_set_pmu_event_filter() >> if (filter->action >= KVM_PMU_EVENT_ACTION_MAX) >> return -EINVAL; >> >> This is for detecting the case that we add a new action in >> userspace, while the kvm hasn't been updated to support that. >> >> KVM_PMU_EVENT_ACTION_NONE is for userspace to remove >> the filter after they set it. > We can achieve the same result by using a reject action with an empty > set of events - is there some advantage to "none" over that? I can add > that check for valid actions. Yes, we could also make it work via passing nevents=0. I slightly prefer the use of the NONE action here. The advantage is simpler (less) userspace command. For QEMU, people could disable the filter list via qmp command, e.g. pmu-filter=none, instead of pmu-filter=accept,nevents=0. It also seems more straightforward when people read the usage manual - a NONE command to cancel the filtering, instead of "first setting this, then setting that.." I don't see advantages of using nevents over NONE action. Anyway, this one isn't a critical issue, and it's up to you here. > >>> +#define KVM_PMU_EVENT_FILTER_MAX_EVENTS 63 >> Why is this limit needed? > Serves to keep the filters on the smaller side and ensures the size > calculation can't overflow if users attempt to. Keeping the filter > under 4k is nicer for allocation - also, if we want really large > filters we might want to do something smarter than a linear traversal > of the filter when guests program counters. I think 63 is too small, and it looks like a random number being put here. From the SDM table 19-3, it seems there are roughly 300 events. Functionally, the design should support to filter most of them. (optimization with smarter traversal is another story that could be done later) Maybe #define KVM_PMU_EVENT_FILTER_MAX_EVENTS (PAGE_SIZE - sizeof(struct kvm_pmu_event_filter)) / sizeof (__u64) ? and please add some comments above about the consideration that we set this number. >>> + kvfree(filter); >> Probably better to have it conditionally? >> >> if (filter) { >> synchronize_srcu(); >> kfree(filter) >> } >> >> You may want to factor it out, so that kvm_pmu_destroy could reuse. > Do you mean kvm_arch_destroy_vm? It looks like that's where kvm_arch > members are freed. I can do that. Sounds good. Best, Wei