Subject: Re: [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer
To: Peter Zijlstra
Cc: Sean Christopherson, Paolo Bonzini, eranian@google.com, andi@firstfloor.org, kan.liang@linux.intel.com, wei.w.wang@intel.com, Wanpeng Li, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, x86@kernel.org,
 linux-kernel@vger.kernel.org, Andi Kleen, Like Xu
References: <20210329054137.120994-1-like.xu@linux.intel.com> <20210329054137.120994-9-like.xu@linux.intel.com> <610bfd14-3250-0542-2d93-cbd15f2b4e16@intel.com>
From: "Xu, Like"
Date: Thu, 8 Apr 2021 16:44:12 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/4/8 15:52, Peter Zijlstra wrote:
> On Thu, Apr 08, 2021 at 01:39:49PM +0800, Xu, Like wrote:
>> Hi Peter,
>>
>> Thanks for your detailed comments.
>>
>> If you have more comments for other patches, please let me know.
>>
>> On 2021/4/7 23:39, Peter Zijlstra wrote:
>>> On Mon, Mar 29, 2021 at 01:41:29PM +0800, Like Xu wrote:
>>>> @@ -3869,10 +3876,12 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>>>>  		if (arr[1].guest)
>>>>  			arr[0].guest |= arr[1].guest;
>>>> -		else
>>>> +		else {
>>>>  			arr[1].guest = arr[1].host;
>>>> +			arr[2].guest = arr[2].host;
>>>> +		}
>>> What's all this gibberish?
>>>
>>> The way I read that it says:
>>>
>>> 	if guest has PEBS_ENABLED
>>> 		guest GLOBAL_CTRL |= PEBS_ENABLED
>>> 	otherwise
>>> 		guest PEBS_ENABLED = host PEBS_ENABLED
>>> 		guest DS_AREA = host DS_AREA
>>>
>>> which is just completely random garbage afaict. Why would you leak host
>>> msrs into the guest?
>> In fact, this is not a leak at all.
>>
>> When we do "arr[i].guest = arr[i].host;" assignment in the
>> intel_guest_get_msrs(), the KVM will check "if (msrs[i].host ==
>> msrs[i].guest)" and if so, it disables the atomic switch for this msr
>> during vmx transaction in the caller atomic_switch_perf_msrs().
> Another marvel of bad coding style that function is :-( Lots of missing
> {} and indentation fail.

Sorry for that and I'll fix them.
>
> This is terrible though, why would we clear the guest MSRs when it
> changes PEBS_ENABLED.

The values of arr[1].host and arr[1].guest depend on how host perf has
arranged its events:

        arr[1].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
        arr[1].guest = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;

rather than on the guest value of PEBS_ENABLE.

When the value of this MSR differs across the vmx-transaction,
we load arr[1].host after vm-exit and arr[1].guest before vm-entry.

If the value of this MSR is the same before and after the vmx-transaction,
we do nothing and keep the original value in the register.

> The guest had better clear them itself.

I don't understand what you are referring to here.

Can you explain what you think the correct behavior is here?

> Removing
> guest DS_AREA just because we don't have any bits set in PEBS_ENABLED is
> wrong and could very break all sorts of drivers.

Apart from PEBS, no feature that relies on DS_AREA is available in the guest.

Can you explain your concerns about the DS_AREA switch in more detail?

>
>> In that case, the msr value doesn't change and any guest write will be
>> trapped. If the next check is "msrs[i].host != msrs[i].guest", the
>> atomic switch will be triggered again.
>>
>> Compared to before, this part of the logic has not changed, which helps to
>> reduce overhead.
> It's unreadable garbage at best. If you don't want it changed, then
> don't add it to the arr[] thing in the first place.

Thanks, adding GLOBAL_CTRL to arr[] in the last step is a better choice.
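
To make the PEBS_ENABLE handling above concrete, here is a minimal userspace
sketch of how the two values are derived and when the switch is skipped.
It is only a model under my own assumptions: struct msr_pair,
pebs_enable_pair() and needs_atomic_switch() are hypothetical names, not
kernel symbols, and the real code lives in intel_guest_get_msrs() and
atomic_switch_perf_msrs().

```c
#include <stdint.h>

/* Hypothetical stand-in for one arr[] entry; not the kernel struct. */
struct msr_pair {
    uint64_t host;   /* value to load after vm-exit */
    uint64_t guest;  /* value to load before vm-entry */
};

/*
 * Both values come from host perf state only:
 *   guest_mask: counters used by guest-only events
 *   host_mask:  counters used by host-only events
 * The guest's own PEBS_ENABLE write is not an input here.
 */
struct msr_pair pebs_enable_pair(uint64_t pebs_enabled,
                                 uint64_t guest_mask,
                                 uint64_t host_mask)
{
    struct msr_pair p;

    p.host = pebs_enabled & ~guest_mask;
    p.guest = pebs_enabled & ~host_mask;
    return p;
}

/* Returns 1 if the MSR has to be switched across the vmx-transaction. */
int needs_atomic_switch(const struct msr_pair *p)
{
    /* Equal values: leave the register alone, no switch needed. */
    return p->host != p->guest;
}
```

For example, pebs_enable_pair(0x3, 0x2, 0x1) gives host = 0x1 and
guest = 0x2, so the MSR is switched; pebs_enable_pair(0x1, 0x0, 0x0)
gives 0x1 on both sides, so the switch is skipped.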
>
>>> Why would you change guest GLOBAL_CTRL implicitly;
>> This is because in the early part of this function, we have operations:
>>
>>     if (x86_pmu.flags & PMU_FL_PEBS_ALL)
>>         arr[0].guest &= ~cpuc->pebs_enabled;
>>     else
>>         arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
>>
>> and if guest has PEBS_ENABLED, we need these bits back for PEBS counters:
>>
>>     arr[0].guest |= arr[1].guest;
> I don't think that's right, who's to say they were set in the first
> place? The guest's GLOBAL_CTRL could have had the bits cleared at VMEXIT
> time.

Please note that the guest's GLOBAL_CTRL value is stored in
pmu->global_ctrl, while the value actually loaded into the GLOBAL_CTRL MSR
after vm-entry is "x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask".

> You can't unconditionally add PEBS_ENABLED into GLOBAL_CTRL,
> that's wrong.

The MSR values before and after the vmx-transaction are always determined
in the context of host perf, which means the PEBS perf_events created by
KVM are all scheduled and used legitimately; this does not depend on the
guest values at all.

>
>>> guest had better wrmsr that himself to control when stuff is enabled.
>> When vm_entry, the msr value of GLOBAL_CTRL on the hardware may be
>> different from trapped value "pmu->global_ctrl" written by the guest.
>>
>> If the perf scheduler cross maps guest counter X to the host counter Y,
>> we have to enable the bit Y in GLOBAL_CTRL before vm_entry rather than X.
> Sure, but I don't see that happening here.

Feel free to ask questions if we're not on the same page or if more KVM
context would help.
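
In case it helps, the GLOBAL_CTRL computation discussed above can be modelled
in userspace roughly as follows. This is a sketch under my own assumptions,
not the patch itself: struct perf_state, strip_pebs_bits() and
guest_global_ctrl() are hypothetical names, and the fields only mirror the
cpuc/x86_pmu values quoted in this thread.

```c
#include <stdint.h>

/* Hypothetical snapshot of host perf state; not struct cpu_hw_events. */
struct perf_state {
    uint64_t intel_ctrl;        /* models x86_pmu.intel_ctrl */
    uint64_t host_mask;         /* models cpuc->intel_ctrl_host_mask */
    uint64_t pebs_enabled;      /* models cpuc->pebs_enabled */
    uint64_t pebs_counter_mask; /* models PEBS_COUNTER_MASK */
    int pebs_all;               /* models x86_pmu.flags & PMU_FL_PEBS_ALL */
};

/* Early step: drop every counter that host perf has PEBS-enabled. */
uint64_t strip_pebs_bits(const struct perf_state *s, uint64_t ctrl)
{
    if (s->pebs_all)
        return ctrl & ~s->pebs_enabled;
    return ctrl & ~(s->pebs_enabled & s->pebs_counter_mask);
}

/* GLOBAL_CTRL value to load before vm-entry, from host perf state only. */
uint64_t guest_global_ctrl(const struct perf_state *s)
{
    uint64_t guest = s->intel_ctrl & ~s->host_mask;
    uint64_t guest_pebs = s->pebs_enabled & ~s->host_mask; /* arr[1].guest */

    guest = strip_pebs_bits(s, guest);

    /* Put back only the PEBS counters that are not host-only. */
    if (guest_pebs)
        guest |= guest_pebs;

    return guest;
}
```

With intel_ctrl = 0xf, host_mask = 0x1, pebs_enabled = 0x6 and
pebs_counter_mask = 0xf, the stripped value is 0x8 and the value loaded for
the guest is 0xe: the bits come from what host perf scheduled, not from the
guest's trapped pmu->global_ctrl.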