Date: Mon, 23 Oct 2023 16:40:00 -0700
From: Sean Christopherson
Reply-To: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Mingwei Zhang,
	Roman Kagan, Jim Mattson, Dapeng Mi, Like Xu
Subject: [PATCH 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter
Message-ID: <20231023234000.2499267-7-seanjc@google.com>
In-Reply-To: <20231023234000.2499267-1-seanjc@google.com>
References: <20231023234000.2499267-1-seanjc@google.com>
X-Mailer: git-send-email 2.42.0.758.gaed0368e0e-goog

Explicitly track emulated counter events instead of using the common
counter value that's shared with the hardware counter owned by perf.
Bumping the common counter requires snapshotting the pre-increment value
in order to detect overflow from emulation, and the snapshot approach is
inherently flawed.

Snapshotting the previous counter at every increment assumes that there is
at most one emulated counter event per emulated instruction (or rather,
between checks for KVM_REQ_PMU).  That mostly holds true today because
KVM only emulates (branch) instructions retired, but the approach will
fall apart if KVM ever supports event types that don't have a 1:1
relationship with instructions.

And KVM already has a relevant bug, as handle_invalid_guest_state()
emulates multiple instructions without checking KVM_REQ_PMU, i.e. could
miss an overflow event due to clobbering pmc->prev_counter.  Not checking
KVM_REQ_PMU is problematic in both cases, but at least with the emulated
counter approach, the resulting behavior is delayed overflow detection,
as opposed to completely lost detection.

Cc: Mingwei Zhang
Cc: Roman Kagan
Cc: Jim Mattson
Cc: Dapeng Mi
Cc: Like Xu
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h | 17 +++++++++++++++-
 arch/x86/kvm/pmu.c              | 36 +++++++++++++++++++++++----------
 arch/x86/kvm/pmu.h              |  3 ++-
 3 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d7036982332e..d8bc9ba88cfc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -500,8 +500,23 @@ struct kvm_pmc {
 	u8 idx;
 	bool is_paused;
 	bool intr;
+	/*
+	 * Base value of the PMC counter, relative to the *consumed* count in
+	 * the associated perf_event. This value includes counter updates from
+	 * the perf_event and emulated_count since the last time the counter
+	 * was reprogrammed, but it is *not* the current value as seen by the
+	 * guest or userspace.
+	 *
+	 * The count is relative to the associated perf_event so that KVM
+	 * doesn't need to reprogram the perf_event every time the guest writes
+	 * to the counter.
+	 */
 	u64 counter;
-	u64 prev_counter;
+	/*
+	 * PMC events triggered by KVM emulation that haven't been fully
+	 * processed, i.e. haven't undergone overflow detection.
+	 */
+	u64 emulated_counter;
 	u64 eventsel;
 	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 3725d001239d..f02cee222e9a 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -127,9 +127,9 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
 	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
 
 	/*
-	 * Ignore overflow events for counters that are scheduled to be
-	 * reprogrammed, e.g. if a PMI for the previous event races with KVM's
-	 * handling of a related guest WRMSR.
+	 * Ignore asynchronous overflow events for counters that are scheduled
+	 * to be reprogrammed, e.g. if a PMI for the previous event races with
+	 * KVM's handling of a related guest WRMSR.
 	 */
 	if (test_and_set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi))
 		return;
@@ -226,13 +226,19 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 
 static void pmc_pause_counter(struct kvm_pmc *pmc)
 {
-	u64 counter = pmc->counter;
+	/*
+	 * Accumulate emulated events, even if the PMC was already paused, e.g.
+	 * if KVM emulated an event after a WRMSR, but before reprogramming, or
+	 * if KVM couldn't create a perf event.
+	 */
+	u64 counter = pmc->counter + pmc->emulated_counter;
 
-	if (!pmc->perf_event || pmc->is_paused)
-		return;
+	pmc->emulated_counter = 0;
 
 	/* update counter, reset event value to avoid redundant accumulation */
-	counter += perf_event_pause(pmc->perf_event, true);
+	if (pmc->perf_event && !pmc->is_paused)
+		counter += perf_event_pause(pmc->perf_event, true);
+
 	pmc->counter = counter & pmc_bitmask(pmc);
 	pmc->is_paused = true;
 }
@@ -289,6 +295,14 @@ static void pmc_update_sample_period(struct kvm_pmc *pmc)
 
 void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
 {
+	/*
+	 * Drop any unconsumed accumulated counts, the WRMSR is a write, not a
+	 * read-modify-write. Adjust the counter value so that its value is
+	 * relative to the current perf_event (if there is one), as reading the
+	 * current count is faster than pausing and reprogramming the event in
+	 * order to reset it to '0'.
+	 */
+	pmc->emulated_counter = 0;
 	pmc->counter += val - pmc_read_counter(pmc);
 	pmc->counter &= pmc_bitmask(pmc);
 	pmc_update_sample_period(pmc);
@@ -426,6 +440,7 @@ static bool pmc_event_is_allowed(struct kvm_pmc *pmc)
 static void reprogram_counter(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+	u64 prev_counter = pmc->counter;
 	u64 eventsel = pmc->eventsel;
 	u64 new_config = eventsel;
 	u8 fixed_ctr_ctrl;
@@ -435,7 +450,7 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 	if (!pmc_event_is_allowed(pmc))
 		goto reprogram_complete;
 
-	if (pmc->counter < pmc->prev_counter)
+	if (pmc->counter < prev_counter)
 		__kvm_perf_overflow(pmc, false);
 
 	if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)
@@ -475,7 +490,6 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 
 reprogram_complete:
 	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
-	pmc->prev_counter = 0;
 }
 
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
@@ -701,6 +715,7 @@ static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 
 		pmc_stop_counter(pmc);
 		pmc->counter = 0;
+		pmc->emulated_counter = 0;
 
 		if (pmc_is_gp(pmc))
 			pmc->eventsel = 0;
@@ -772,8 +787,7 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
 
 static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
 {
-	pmc->prev_counter = pmc->counter;
-	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
+	pmc->emulated_counter++;
 	kvm_pmu_request_counter_reprogram(pmc);
 }
 
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index cae85e550f60..7caeb3d8d4fd 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -66,7 +66,8 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
 {
 	u64 counter, enabled, running;
 
-	counter = pmc->counter;
+	counter = pmc->counter + pmc->emulated_counter;
+
 	if (pmc->perf_event && !pmc->is_paused)
 		counter += perf_event_read_value(pmc->perf_event,
 						 &enabled, &running);
-- 
2.42.0.758.gaed0368e0e-goog
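
For illustration only, here is a rough userspace sketch of why the snapshot
scheme can lose an overflow when two emulated events land between checks,
while the accumulate-then-compare scheme still sees it. This is not kernel
code: struct sketch_pmc, SKETCH_PMC_MASK (an 8-bit width picked to make
wraparound easy to trigger), and all function names are hypothetical
stand-ins, and the perf_event contribution to the count is left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 8-bit counter width, chosen only to make wraparound easy. */
#define SKETCH_PMC_MASK 0xffULL

/* Simplified stand-in for struct kvm_pmc; not the kernel definition. */
struct sketch_pmc {
	uint64_t counter;
	uint64_t prev_counter;     /* used by the snapshot scheme only */
	uint64_t emulated_counter; /* used by the accumulate scheme only */
};

/* Snapshot scheme: bump the shared counter, remember the previous value. */
static void snapshot_incr(struct sketch_pmc *pmc)
{
	pmc->prev_counter = pmc->counter;
	pmc->counter = (pmc->counter + 1) & SKETCH_PMC_MASK;
}

/* Snapshot scheme: overflow is inferred from the latest snapshot alone. */
static bool snapshot_check_overflow(struct sketch_pmc *pmc)
{
	bool overflow = pmc->counter < pmc->prev_counter;

	pmc->prev_counter = 0;
	return overflow;
}

/* Accumulate scheme: only record that an emulated event happened. */
static void emulated_incr(struct sketch_pmc *pmc)
{
	pmc->emulated_counter++;
}

/* Accumulate scheme: fold pending events into the counter, then compare. */
static bool emulated_check_overflow(struct sketch_pmc *pmc)
{
	uint64_t prev_counter = pmc->counter;

	pmc->counter = (pmc->counter + pmc->emulated_counter) & SKETCH_PMC_MASK;
	pmc->emulated_counter = 0;
	return pmc->counter < prev_counter;
}

/* Two events between checks, starting one event short of wrapping. */
static bool snapshot_scenario(void)
{
	struct sketch_pmc pmc = { .counter = SKETCH_PMC_MASK };

	snapshot_incr(&pmc);	/* counter wraps to 0, prev_counter = 0xff */
	snapshot_incr(&pmc);	/* prev_counter is clobbered with 0 */
	return snapshot_check_overflow(&pmc);
}

static bool emulated_scenario(void)
{
	struct sketch_pmc pmc = { .counter = SKETCH_PMC_MASK };

	emulated_incr(&pmc);
	emulated_incr(&pmc);
	assert(pmc.emulated_counter == 2);	/* both events still pending */
	return emulated_check_overflow(&pmc);
}
```

In this model snapshot_scenario() reports no overflow, because the second
increment overwrites the snapshot taken at the wrap, whereas
emulated_scenario() reports one once the pending events are folded in. The
fold step only loosely mirrors what pmc_pause_counter() and
reprogram_counter() do in the patch.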