X-Mailing-List: linux-kernel@vger.kernel.org
From: Stephane Eranian
Date: Thu, 6 Jun 2024 00:57:35 -0700
Subject: [RFC] perf_events: exclude_guest impact on time_enabled/time_running
To: Peter Zijlstra
Cc: LKML, Ian Rogers, "Liang, Kan", Andi Kleen, Ingo Molnar,
    "Narayan, Ananth", "Bangoria, Ravikumar", Namhyung Kim,
    Mingwei Zhang, Dapeng Mi, Zhang Xiong

Hi Peter,

In the context of the new vPMU passthru patch series, we have to look
closer at the definition and implementation of the exclude_guest filter
in the perf_event_attr structure. This filter has been in the kernel for
many years.
See patch:
https://lore.kernel.org/all/20240506053020.3911940-8-mizhang@google.com/

The presumed definition of the filter is that the user does not want the
event to count while the processor is running in guest mode (i.e., inside
the virtual machine guest OS or guest user code). The perf tool sets it by
default on all core PMU events:

$ perf stat -vv -e cycles sleep 0
------------------------------------------------------------
perf_event_attr:
  size                             112
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

In the kernel, the way this is treated differs between AMD and Intel
because AMD provides a hardware filter for guest vs. host in the PMU
counters whereas Intel does not. For the latter, the kernel simply
disables the event in the hardware counters, i.e., the event is not
descheduled. Both approaches produce pretty much the same desired effect:
the event is not counted while in guest mode.

The issue I would like to raise has to do with the effects on
time_enabled and time_running for exclude_guest=1 events.

Given the event is not scheduled out while in guest mode, even though it
is stopped, both time_enabled and time_running continue ticking while in
guest mode. If a measurement is 10s long but only 5s are in non-guest
mode, then time_enabled=10s, time_running=10s. The count is reported as
covering 10s of non-guest mode, when only 5s were actually monitored, and
the user has no way of determining this.

If we look at vPMU passthru, the host event must have exclude_guest=1 to
avoid going into an error state on context switch to the vCPU thread
(with vPMU enabled). But this time, the event is scheduled out, which
means that time_enabled keeps counting but time_running stops. On context
switch back in, the host event is scheduled again and time_running
restarts ticking.
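As a rough illustration of the two accounting behaviours described above, here is a toy model in Python. It is not kernel code: the `Times` class, the `account` function, and the `scheduled_out_in_guest` flag are hypothetical names introduced only for this sketch, and the timeline is the 10s example from the text.

```python
from dataclasses import dataclass


@dataclass
class Times:
    time_enabled: float  # seconds the event was enabled
    time_running: float  # seconds the event was scheduled on a counter


def account(intervals, scheduled_out_in_guest):
    """Toy model of time accounting for an exclude_guest=1 event.

    intervals: list of (mode, seconds), where mode is "host" or "guest".
    scheduled_out_in_guest: False models the event being stopped in place
    (it stays scheduled); True models the event being scheduled out while
    the guest runs, as described for vPMU passthru.
    """
    enabled = 0.0
    running = 0.0
    for mode, seconds in intervals:
        enabled += seconds  # time_enabled ticks in both modes
        if mode == "host" or not scheduled_out_in_guest:
            running += seconds  # still scheduled, so time_running ticks
    return Times(enabled, running)


# Hypothetical 10s measurement: 5s in host mode, 5s in guest mode.
timeline = [("host", 5.0), ("guest", 5.0)]
stopped_in_place = account(timeline, scheduled_out_in_guest=False)
passthru = account(timeline, scheduled_out_in_guest=True)
print(stopped_in_place)
print(passthru)
```

In this model the only difference between the two cases is whether time_running advances during the guest interval; the raw count would be the same in both.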
For a 10s measurement, where 5s were in the guest, the event will come
out with time_enabled=10s, time_running=5s, and the tool will scale it up
because it thinks the event was multiplexed, when in fact it was not.
This is not the intended outcome here. The tool should not scale the
count: it was not multiplexed, it was descheduled because the filter
forced it out. Note that if the event had been multiplexed while running
on the host, then the scaling would be appropriate.

In that case, I argue, time_running should be updated to cover the time
the event was not running. That would bring us back to the case I was
describing earlier.

It boils down to the exact definition of exclude_guest and expected
impact on time_enabled and time_running. Then, with or without vPMU
passthru, we can fix the kernel to ensure a uniform behavior.

What are your thoughts on this problem?
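To put numbers on the scaling concern, the sketch below applies the usual multiplexing extrapolation (raw count multiplied by time_enabled / time_running) to the two cases discussed above. The function name `scale_count` and the raw count of 1,000,000 are assumptions made for illustration; this is not taken from the perf tool source.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Extrapolate a raw count as if the event had run the whole time.

    Times are in nanoseconds. Returns 0 when the event never ran,
    since there is nothing to extrapolate from.
    """
    if time_running == 0:
        return 0
    return raw_count * time_enabled // time_running


SEC = 1_000_000_000  # nanoseconds per second
raw = 1_000_000      # hypothetical count, all collected during 5s of host time

# Event stopped in place while in guest mode: both times read 10s.
not_scaled = scale_count(raw, 10 * SEC, 10 * SEC)

# Event scheduled out while in guest mode: time_running reads 5s.
scaled_up = scale_count(raw, 10 * SEC, 5 * SEC)

print(not_scaled, scaled_up)
```

With the same raw count, the second case is reported as twice the first, even though no multiplexing took place in either.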