Subject: Re: [PATCH v2 04/17] perf: x86/ds: Handle guest PEBS overflow PMI and inject it to guest
To: Peter Zijlstra, Like Xu, "Kleen, Andi"
Cc: Paolo Bonzini, kvm@vger.kernel.org, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Kan Liang,
 luwei.kang@intel.com, Thomas Gleixner, wei.w.wang@intel.com, Tony Luck, Stephane Eranian, Mark Gross, Srinivas Pandruvada, linux-kernel@vger.kernel.org
References: <20201109021254.79755-1-like.xu@linux.intel.com> <20201109021254.79755-5-like.xu@linux.intel.com> <20201117143529.GJ3121406@hirez.programming.kicks-ass.net> <20201118180721.GA3121392@hirez.programming.kicks-ass.net>
From: "Xu, Like"
Message-ID: <682011d8-934f-4c76-69b0-788f71d91961@intel.com>
Date: Fri, 27 Nov 2020 10:14:49 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0
MIME-Version: 1.0
In-Reply-To: <20201118180721.GA3121392@hirez.programming.kicks-ass.net>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

On 2020/11/19 2:07, Peter Zijlstra wrote:
> On Thu, Nov 19, 2020 at 12:15:09AM +0800, Like Xu wrote:
>
>>> ISTR there was lots of fail trying to virtualize it earlier. What's
>>> changed? There's 0 clues here.
>> Ah, now we have EPT-friendly PEBS facilities supported since Ice Lake
>> which makes guest PEBS feature possible w/o guest memory pinned.
> OK.
>
>>> Why are the host and guest DS area separate, why can't we map them to
>>> the exact same physical pages?
>> If we map both guest and host DS_AREA to the exact same physical pages,
>> - the guest can access the host PEBS records, which means that the host
>> IP maybe leaked, because we cannot predict the time guest drains records and
>> it would be over-designed to clean it up before each vm-entry;
>> - different tasks/vcpus on the same pcpu cannot share the same PEBS DS
>> settings from the same physical page. For example, some require large
>> PEBS and reset values, while others do not.
>>
>> Like many guest msrs, we use the separate guest DS_AREA for the guest's
>> own use and it avoids mutual interference as little as possible.
> OK, but the code here wanted to inspect the guest DS from the host. It
> states this is somehow complicated/expensive. But surely we can at the
> very least map the first guest DS page somewhere so we can at least
> access the control bits without too much magic.

We note that the SDM assumes a contiguous, present memory mapping for
the DS save area and the PEBS buffer area. We have therefore revisited
your suggestion and taken it a step further.

When PEBS is enabled, KVM will cache the following values:
- gva ds_area (kvm msr trap)
- hva1 for "gva ds_area" (walk guest page table)
- hva2 for "gva pebs_buffer_base" via hva1 (walk guest page table)

If the "gva ds_area" cache hits,
- access PEBS "interrupt threshold" and "Counter Reset[]" via hva1
- get "gva2 pebs_buffer_base" via __copy_from_user(hva1)

If the "gva2 pebs_buffer_base" cache hits,
- get "gva2 pebs_index" via __copy_from_user(hva2)
- rewrite the guest PEBS records via hva2 and pebs_index

If either cache misses, set up the cached values again by walking the
page tables.

Do you agree with this optimization? We look forward to your
confirmation before taking the next step.

Thanks,
Like Xu
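
The two-level cache described above can be sketched in user-space C as
follows. This is only an illustration under stated assumptions, not the
KVM implementation: struct ds_area is a simplified, hypothetical subset
of the DS save area, mock_gva_to_hva() stands in for a guest page table
walk, memcpy() stands in for __copy_from_user(), and every name in the
block is invented for the example. It covers only the hit/miss logic
that resolves hva1 and hva2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified subset of the PEBS fields in the DS save area. */
struct ds_area {
	uint64_t pebs_buffer_base;		/* gva of the PEBS buffer */
	uint64_t pebs_index;
	uint64_t pebs_absolute_maximum;
	uint64_t pebs_interrupt_threshold;
	uint64_t pebs_counter_reset[4];
};

/* Mock guest memory: gva GUEST_BASE maps to guest_mem[0]. */
#define GUEST_BASE 0x1000u
static uint8_t guest_mem[8192];
static unsigned int walk_count;		/* counts simulated page table walks */

/* Stand-in for walking the guest page table (assumption, not a KVM API). */
static void *mock_gva_to_hva(uint64_t gva)
{
	walk_count++;
	if (gva < GUEST_BASE || gva - GUEST_BASE >= sizeof(guest_mem))
		return NULL;
	return guest_mem + (gva - GUEST_BASE);
}

/* The cached values: gva ds_area, hva1, gva pebs_buffer_base, hva2. */
struct pebs_cache {
	bool ds_valid;
	uint64_t gva_ds_area;
	void *hva1;
	bool buf_valid;
	uint64_t gva_pebs_base;
	void *hva2;
};

/* Return hva2 for the guest PEBS buffer, walking tables only on a miss. */
static void *pebs_buffer_hva(struct pebs_cache *c, uint64_t gva_ds_area)
{
	struct ds_area ds;

	assert(c != NULL);

	/* Level 1: gva ds_area -> hva1. */
	if (!c->ds_valid || c->gva_ds_area != gva_ds_area) {
		c->hva1 = mock_gva_to_hva(gva_ds_area);
		c->ds_valid = c->hva1 != NULL;
		c->buf_valid = false;	/* hva2 depends on hva1 */
		if (!c->ds_valid)
			return NULL;
		c->gva_ds_area = gva_ds_area;
	}

	/* Read the guest DS fields through hva1 (models __copy_from_user). */
	memcpy(&ds, c->hva1, sizeof(ds));

	/* Level 2: gva pebs_buffer_base -> hva2. */
	if (!c->buf_valid || c->gva_pebs_base != ds.pebs_buffer_base) {
		c->hva2 = mock_gva_to_hva(ds.pebs_buffer_base);
		c->buf_valid = c->hva2 != NULL;
		if (!c->buf_valid)
			return NULL;
		c->gva_pebs_base = ds.pebs_buffer_base;
	}

	return c->hva2;
}
```

With a zero-initialized struct pebs_cache, the first call performs two
simulated walks, a repeated call with the same ds_area performs none,
and a change of pebs_buffer_base in guest memory triggers one more walk
for hva2 only. The sketch does not model invalidation when the guest
remaps either page without changing the addresses.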