From: Vipin Sharma
Date: Thu, 9 Mar 2023 11:52:26 -0800
Subject: Re: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches globally
To: Zhi Wang
Cc: Mingwei Zhang, seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com, jmattson@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org

On Thu, Mar 9, 2023 at 4:52 AM Zhi Wang wrote:
>
> On Thu, 9 Mar 2023 05:18:11 +0000
> Mingwei Zhang wrote:
>
> > > > 1) Previously mmu_topup_memory_caches() worked fine without a lock.
> > > > 2) IMHO I was suspecting that this lock affects the parallelization
> > > > of TDP MMU fault handling.
> > > >
> > > > TDP MMU fault handling is intended to be optimized for parallel fault
> > > > handling by taking a read lock and operating on the page table via
> > > > atomic operations. Multiple fault handlers can enter the TDP MMU fault
> > > > path because of the read_lock(&vcpu->kvm->mmu_lock) below.
> > > >
> > > > With this lock, it seems part of the benefit of parallelization is
> > > > gone, because the lock can contend earlier, above. Will this cause a
> > > > performance regression?
> > >
> > > This is a per-vCPU lock; with this lock each vCPU will still be able
> > > to perform parallel fault handling without contending for a lock.
> > >
> > I am curious how effective it is to try acquiring this per-vCPU
> > lock. If a vCPU thread stays within the (host) kernel (VMX
> > root/non-root) for the vast majority of the time, won't the shrinker
> > always fail to make any progress?
>
> IMHO the lock is to prevent the faulting path from being disturbed by the
> shrinker. I guess even if a vCPU thread stays in the host kernel, the
> shrinker should still be able to harvest the pages from the cache as long
> as there is no flood of faults.

Yes, the lock is to prevent the faulting path from being disturbed by
the shrinker.

In this new approach, the shrinker goes through each vCPU of each VM
alive on the host. All of these vCPUs collectively being in the fault
path while the shrinker is invoked seems unlikely.

Say we free the cache during the fault path: when a vCPU then asks for a
page from the cache, it will dynamically allocate a page via GFP_ATOMIC,
which has a higher chance of failing if the host is already under memory
pressure.

The shrinker by default should run at a lower priority, and based on the
discussions referenced in patch 1, it seems it was not of much practical
use before either.

> I am curious about the effectiveness as well. It would be nice if there
> were some unit tests that people could run by themselves to see the
> results: when the shrinker isn't triggered, is faulting still as
> effective as before, and when the shrinker is triggered, what actually
> happens when system memory is under different levels of pressure
> (i.e. how much is faulting slowed down)?

I am not sure what the right test to measure this would be. My manual
testing was to run dirty_log_perf_test with and without the shrinker,
and I didn't notice much difference. I did print some logs to see that
the shrinker was getting invoked and that caches were freed by it, and,
when a VM was freed, to verify that the page accounting is right with
patch 9 of the series.
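
For readers following the thread, here is a rough sketch of the scheme
being discussed, assuming a per-vCPU mutex around the x86 shadow-page
cache. It is illustrative only and not code from the series: the field
name mmu_cache_lock is invented, the real patches cover more caches than
the single one shown, and the sketch is written as if it lived in
arch/x86/kvm/mmu/mmu.c (which is where PT64_ROOT_MAX_LEVEL and the cache
fields come from). The point it tries to capture is that the fault path
only takes its own vCPU's lock, while the shrinker uses mutex_trylock()
and simply skips a vCPU that is busy faulting.

#include <linux/kvm_host.h>
#include <linux/mutex.h>
#include <linux/shrinker.h>

/*
 * Fault path: top up this vCPU's cache under its own lock only, so
 * concurrent TDP MMU faults on other vCPUs never contend here. Only the
 * topup is shown; the series keeps the cache protected while the fault
 * handler consumes pages from it.
 */
static int mmu_topup_caches_sketch(struct kvm_vcpu *vcpu)
{
	int r;

	mutex_lock(&vcpu->arch.mmu_cache_lock);   /* illustrative field name */
	r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache,
				       PT64_ROOT_MAX_LEVEL);
	mutex_unlock(&vcpu->arch.mmu_cache_lock);
	return r;
}

/*
 * Shrinker: walk every VM and every vCPU, harvesting cached pages from
 * any vCPU whose lock can be taken without waiting.
 */
static unsigned long mmu_shrink_scan_sketch(struct shrinker *shrink,
					    struct shrink_control *sc)
{
	struct kvm *kvm;
	struct kvm_vcpu *vcpu;
	unsigned long freed = 0;
	unsigned long i;

	mutex_lock(&kvm_lock);
	list_for_each_entry(kvm, &vm_list, vm_list) {
		kvm_for_each_vcpu(i, vcpu, kvm) {
			/* Never block a vCPU that is in the fault path. */
			if (!mutex_trylock(&vcpu->arch.mmu_cache_lock))
				continue;
			freed += vcpu->arch.mmu_shadow_page_cache.nobjs;
			kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
			mutex_unlock(&vcpu->arch.mmu_cache_lock);
			if (freed >= sc->nr_to_scan)
				goto out;
		}
	}
out:
	mutex_unlock(&kvm_lock);
	return freed;
}

Without the lock, the shrinker could empty a cache between a vCPU's topup
and its use, forcing the GFP_ATOMIC fallback mentioned above; with the
trylock scheme the shrinker only harvests caches of vCPUs that are not in
the middle of a fault.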