From: David Matlack
Date: Mon, 15 Nov 2021 11:20:10 -0800
Subject: Re: [PATCH] KVM: x86/mmu: Update number of zapped pages even if page list is stable
To: Sean Christopherson
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon
References: <20211111221448.2683827-1-seanjc@google.com>
In-Reply-To: <20211111221448.2683827-1-seanjc@google.com>

On Thu, Nov 11, 2021 at 2:14 PM Sean Christopherson wrote:
>
> When zapping obsolete pages, update the running count of zapped pages
> regardless of whether or not the list has become unstable due to zapping
> a shadow page with its own child shadow pages. If the VM is backed by
> mostly 4kb pages, KVM can zap an absurd number of SPTEs without bumping
> the batch count and thus without yielding. In the worst case scenario,
> this can cause an RCU stall.
>
>   rcu: INFO: rcu_sched self-detected stall on CPU
>   rcu: 52-....: (20999 ticks this GP) idle=7be/1/0x4000000000000000
> softirq=15759/15759 fqs=5058
>   (t=21016 jiffies g=66453 q=238577)
>   NMI backtrace for cpu 52
>   Call Trace:
>   ...
>   mark_page_accessed+0x266/0x2f0
>   kvm_set_pfn_accessed+0x31/0x40
>   handle_removed_tdp_mmu_page+0x259/0x2e0
>   __handle_changed_spte+0x223/0x2c0
>   handle_removed_tdp_mmu_page+0x1c1/0x2e0
>   __handle_changed_spte+0x223/0x2c0
>   handle_removed_tdp_mmu_page+0x1c1/0x2e0
>   __handle_changed_spte+0x223/0x2c0
>   zap_gfn_range+0x141/0x3b0
>   kvm_tdp_mmu_zap_invalidated_roots+0xc8/0x130

This is a useful patch but I don't see the connection with this stall.
The stall is detected in kvm_tdp_mmu_zap_invalidated_roots, which runs
after kvm_zap_obsolete_pages. How would rescheduling during
kvm_zap_obsolete_pages help?

>   kvm_mmu_zap_all_fast+0x121/0x190
>   kvm_mmu_invalidate_zap_pages_in_memslot+0xe/0x10
>   kvm_page_track_flush_slot+0x5c/0x80
>   kvm_arch_flush_shadow_memslot+0xe/0x10
>   kvm_set_memslot+0x172/0x4e0
>   __kvm_set_memory_region+0x337/0x590
>   kvm_vm_ioctl+0x49c/0xf80
>
> Fixes: fbb158cb88b6 ("KVM: x86/mmu: Revert "Revert "KVM: MMU: zap pages in batch""")
> Reported-by: David Matlack
> Cc: Ben Gardon
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson
> ---
>
> I haven't actually verified this makes David's RCU stall go away, but I did
> verify that "batch" stays at "0" before and increments as expected after,
> and that KVM does yield as expected after.
>
>  arch/x86/kvm/mmu/mmu.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 33794379949e..89480fab09c6 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5575,6 +5575,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
>  {
>  	struct kvm_mmu_page *sp, *node;
>  	int nr_zapped, batch = 0;
> +	bool unstable;

nit: Declare unstable in the body of the loop. (So should nr_zapped
and batch but that's unrelated to your change.)

>
>  restart:
>  	list_for_each_entry_safe_reverse(sp, node,
> @@ -5606,11 +5607,12 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
>  			goto restart;
>  		}
>
> -		if (__kvm_mmu_prepare_zap_page(kvm, sp,
> -				&kvm->arch.zapped_obsolete_pages, &nr_zapped)) {
> -			batch += nr_zapped;
> +		unstable = __kvm_mmu_prepare_zap_page(kvm, sp,
> +				&kvm->arch.zapped_obsolete_pages, &nr_zapped);
> +		batch += nr_zapped;
> +
> +		if (unstable)
>  			goto restart;
> -		}
>  	}
>
>  	/*
> --
> 2.34.0.rc1.387.gb447b232ab-goog
>
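
For readers following the second hunk, the accounting difference can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code: fake_prepare_zap(), count_batch_old() and count_batch_new() are hypothetical names, the "children" array is made-up input, and the sketch assumes a zap reports the list as unstable only when the page had child shadow pages. It models neither the "goto restart" nor the yield that the batch count feeds.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for __kvm_mmu_prepare_zap_page(): stores the number
 * of pages zapped (the page itself plus its children) in *nr_zapped and
 * returns true if the list became unstable. Assumption: the list is unstable
 * only when child pages were zapped.
 */
static bool fake_prepare_zap(int children, int *nr_zapped)
{
	*nr_zapped = 1 + children;
	return children > 0;
}

/* Pre-patch accounting: batch grows only when the list became unstable. */
static int count_batch_old(const int *children, size_t n)
{
	int nr_zapped, batch = 0;

	for (size_t i = 0; i < n; i++) {
		if (fake_prepare_zap(children[i], &nr_zapped))
			batch += nr_zapped;	/* kernel code restarts the walk here */
	}
	return batch;
}

/* Post-patch accounting: batch grows on every zap, stable list or not. */
static int count_batch_new(const int *children, size_t n)
{
	int nr_zapped, batch = 0;

	for (size_t i = 0; i < n; i++) {
		bool unstable = fake_prepare_zap(children[i], &nr_zapped);

		batch += nr_zapped;
		(void)unstable;		/* kernel code restarts the walk if set */
	}
	return batch;
}
```

With five pages that have no children, count_batch_old() returns 0 while count_batch_new() returns 5, which mirrors the "batch stays at 0 before and increments as expected after" observation in the patch notes.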