References: <20230504120042.785651-1-rkagan@amazon.de>
From: Mingwei Zhang
Date: Fri, 30 Jun 2023 10:32:23 -0700
Subject: Re: [PATCH] KVM: x86: vPMU: truncate counter value to allowed width
To: Jim Mattson
Cc: Sean Christopherson, Roman Kagan, Paolo Bonzini, Eric Hankland,
    kvm@vger.kernel.org, Dave Hansen, Like Xu, x86@kernel.org,
    Thomas Gleixner, linux-kernel@vger.kernel.org, "H. Peter Anvin",
    Borislav Petkov, Ingo Molnar
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 30, 2023 at 10:16 AM Jim Mattson wrote:
>
> On Fri, Jun 30, 2023 at 10:08 AM Mingwei Zhang wrote:
> >
> > On Fri, Jun 30, 2023 at 8:45 AM Sean Christopherson wrote:
> > >
> > > On Fri, Jun 30, 2023, Roman Kagan wrote:
> > > > On Fri, Jun 30, 2023 at 07:28:29AM -0700, Sean Christopherson wrote:
> > > > > On Fri, Jun 30, 2023, Roman Kagan wrote:
> > > > > > On Thu, Jun 29, 2023 at 05:11:06PM -0700, Sean Christopherson wrote:
> > > > > > > @@ -74,6 +74,14 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
> > > > > > >         return counter & pmc_bitmask(pmc);
> > > > > > >  }
> > > > > > >
> > > > > > > +static inline void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
> > > > > > > +{
> > > > > > > +       if (pmc->perf_event && !pmc->is_paused)
> > > > > > > +               perf_event_set_count(pmc->perf_event, val);
> > > > > > > +
> > > > > > > +       pmc->counter = val;
> > > > > >
> > > > > > Doesn't this still have the original problem of storing wider value than
> > > > > > allowed?
> > > > >
> > > > > Yes, this was just to fix the counter offset weirdness. My plan is to apply your
> > > > > patch on top. Sorry for not making that clear.
> > > >
> > > > Ah, got it, thanks!
> > > >
> > > > Also I'm now chasing a problem that we occasionally see
> > > >
> > > > [3939579.462832] Uhhuh. NMI received for unknown reason 30 on CPU 43.
> > > > [3939579.462836] Do you have a strange power saving mode enabled?
> > > > [3939579.462836] Dazed and confused, but trying to continue
> > > >
> > > > in the guests when perf is used. These messages disappear when
> > > > 9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions") is
> > > > reverted. I haven't yet figured out where exactly the culprit is.
> > >
> > > Can you try reverting de0f619564f4 ("KVM: x86/pmu: Defer counter emulated overflow
> > > via pmc->prev_counter")? I suspect the problem is the prev_counter mess.
> >
> > For sure it is a prev_counter issue. I have done some instrumentation as follows:
> >
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index 48a0528080ab..946663a42326 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -322,8 +322,11 @@ static void reprogram_counter(struct kvm_pmc *pmc)
> >         if (!pmc_event_is_allowed(pmc))
> >                 goto reprogram_complete;
> >
> > -       if (pmc->counter < pmc->prev_counter)
> > +       if (pmc->counter < pmc->prev_counter) {
> > +               pr_info("pmc->counter: %llx\tpmc->prev_counter: %llx\n",
> > +                       pmc->counter, pmc->prev_counter);
> >                 __kvm_perf_overflow(pmc, false);
> > +       }
> >
> >         if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)
> >                 printk_once("kvm pmu: pin control bit is ignored\n");
> >
> > I find some interesting changes on prev_counter:
> >
> > [  +7.295348] pmc->counter: 12 pmc->prev_counter: fffffffffb3d
> > [  +0.622991] pmc->counter: 3 pmc->prev_counter: fffffffffb1a
> > [  +6.943282] pmc->counter: 1 pmc->prev_counter: fffffffff746
> > [  +4.483523] pmc->counter: 0 pmc->prev_counter: ffffffffffff
> > [ +12.817772] pmc->counter: 0 pmc->prev_counter: ffffffffffff
> > [ +21.721233] pmc->counter: 0 pmc->prev_counter: ffffffffffff
> >
> > The first 3 logs will generate this:
> >
> > [ +11.811925] Uhhuh. NMI received for unknown reason 20 on CPU 2.
> > [  +0.000003] Dazed and confused, but trying to continue
> >
> > While the last 3 logs won't.
> > This is quite reasonable as looking into
> > de0f619564f4 ("KVM: x86/pmu: Defer counter emulated overflow via
> > pmc->prev_counter"), counter and prev_counter should only have 1 diff
> > in value.
>
> prev_counter isn't actually sync'ed at this point, is it? This comes
> back to that "setting a running counter" nonsense. We want to add 1 to
> the current counter, but we don't actually know what the current
> counter is.
>
> My interpretation of the above is that, in the first three cases, PMU
> hardware has already detected an overflow. In the last three cases,
> software counting has detected an overflow.
>
> If the last three occur while executing the guest's PMI handler (i.e.
> NMIs are blocked), then this could corroborate my conjecture about
> IA32_DEBUGCTL.Freeze_PerfMon_On_PMI.
>

I see. I wonder if we can just do this:

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 48a0528080ab..8d28158e58f2 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -322,7 +322,7 @@ static void reprogram_counter(struct kvm_pmc *pmc)
        if (!pmc_event_is_allowed(pmc))
                goto reprogram_complete;

-       if (pmc->counter < pmc->prev_counter)
+       if (pmc->counter == 0)
                __kvm_perf_overflow(pmc, false);

        if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)

Since this is software emulation, we (KVM) should only ever need to
handle an overflow caused by adding one, right?
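
To illustrate the difference between the two checks, here is a minimal
user-space sketch. It is not KVM code: struct toy_pmc, TOY_PMC_MASK and
the toy_* helpers are made-up names, and the 48-bit counter width is
only an assumption based on the prev_counter values in the log above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two kvm_pmc fields discussed above. */
struct toy_pmc {
	uint64_t counter;
	uint64_t prev_counter;
};

/* Assumed counter width: the logged prev_counter values fit in 48 bits. */
#define TOY_PMC_MASK ((1ULL << 48) - 1)

/* Emulate one counted event: remember the old value, add one, wrap. */
static void toy_pmc_increment(struct toy_pmc *pmc)
{
	pmc->prev_counter = pmc->counter;
	pmc->counter = (pmc->counter + 1) & TOY_PMC_MASK;
}

/* Existing check: any step backwards relative to prev_counter. */
static bool toy_overflow_lt_prev(const struct toy_pmc *pmc)
{
	return pmc->counter < pmc->prev_counter;
}

/* Proposed check: only a wrap that lands exactly on zero. */
static bool toy_overflow_is_zero(const struct toy_pmc *pmc)
{
	return pmc->counter == 0;
}
```

With the first logged pair (counter 0x12, prev_counter 0xfffffffffb3d)
the existing comparison reports an overflow while the zero check does
not. After a single toy_pmc_increment() from 0xffffffffffff, both
checks agree.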