References: <20230526234435.662652-1-yuzhao@google.com> <20230526234435.662652-7-yuzhao@google.com>
From: Yu Zhao
Date: Tue, 20 Jun 2023 02:00:46 -0600
Subject: Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe
To: Nicholas Piggin
Cc: Andrew Morton, Paolo Bonzini, Alistair Popple, Anup Patel, Ben Gardon,
    Borislav Petkov, Catalin Marinas, Chao Peng, Christophe Leroy,
    Dave Hansen, Fabiano Rosas, Gaosheng Cui, Gavin Shan, "H. Peter Anvin",
    Ingo Molnar, James Morse, "Jason A. Donenfeld", Jason Gunthorpe,
    Jonathan Corbet, Marc Zyngier, Masami Hiramatsu, Michael Ellerman,
    Michael Larabel, Mike Rapoport, Oliver Upton, Paul Mackerras, Peter Xu,
    Sean Christopherson, Steven Rostedt, Suzuki K Poulose, Thomas Gleixner,
    Thomas Huth, Will Deacon, Zenghui Yu, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-trace-kernel@vger.kernel.org, x86@kernel.org, linux-mm@google.com

On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin wrote:
>
> On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote:
> > KVM page tables are currently not RCU safe against remapping, i.e.,
> > kvmppc_unmap_free_pmd_entry_table() et al. The previous
>
> Minor nit but the "page table" is not RCU-safe against something. It
> is RCU-freed, and therefore some algorithm that accesses it can have
> the existence guarantee provided by RCU (usually there still needs
> to be more to it).
>
> > mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with
> > that operation.
> >
> > However, the new mmu_notifier_ops member test_clear_young() provides
> > a fast path that does not take kvm->mmu_lock. To implement
> > kvm_arch_test_clear_young() for that path, orphan page tables need to
> > be freed by RCU.
>
> Short version: clear the referenced bit using RCU instead of MMU lock
> to protect against page table freeing, and there is no problem with
> clearing the bit in a table that has been freed.
>
> Seems reasonable.

Thanks. All above points taken.

> > Unmapping, specifically kvm_unmap_radix(), does not free page tables,
> > hence not a concern.
>
> Not sure if you really need to make the distinction about why the page
> table is freed, we might free them via unmapping. The point is just
> anything that frees them while there can be concurrent access, right?

Correct.

> > Signed-off-by: Yu Zhao
> > ---
> >  arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > index 461307b89c3a..3b65b3b11041 100644
> > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > @@ -1469,13 +1469,15 @@ int kvmppc_radix_init(void)
> >  {
> >         unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE;
> >
> > -       kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
> > +       kvm_pte_cache = kmem_cache_create("kvm-pte", size, size,
> > +                                         SLAB_TYPESAFE_BY_RCU, pte_ctor);
> >         if (!kvm_pte_cache)
> >                 return -ENOMEM;
> >
> >         size = sizeof(void *) << RADIX_PMD_INDEX_SIZE;
> >
> > -       kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor);
> > +       kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size,
> > +                                         SLAB_TYPESAFE_BY_RCU, pmd_ctor);
> >         if (!kvm_pmd_cache) {
> >                 kmem_cache_destroy(kvm_pte_cache);
> >                 return -ENOMEM;
>
> KVM PPC HV radix PUD level page tables use the arch/powerpc allocators
> (for some reason), which are not RCU freed. I think you need them too?

We don't.
The use of the arch/powerpc allocator for PUD tables seems
appropriate to me because, unlike PMD/PTE tables, we never free PUD
tables during the lifetime of a VM:
* We don't free PUD/PMD/PTE tables when they become empty, i.e., not
  mapping any pages but still attached. (We could in theory, as
  x86/aarch64 do.)
* We have to free PMD/PTE tables when we replace them with 1GB/2MB
  pages. (Otherwise we'd lose track of detached tables.) And we
  currently don't support huge pages at P4D level, so we never detach
  and free PUD tables.
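
To make the lifetime argument above concrete, here is a toy userspace
model of it. This is only a sketch and not kernel code: the enum and the
helper names (table_level, replacing_huge_page(), freed_while_vm_alive(),
needs_rcu_free()) are made up for illustration. It assumes the only way
a table is detached while the VM is alive is by being replaced with a
huge page, as described above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of the radix page-table levels discussed above.
 * These names are illustrative and are not kernel identifiers.
 */
enum table_level { PTE_TABLE, PMD_TABLE, PUD_TABLE };

/*
 * Size of the huge page that can replace a whole table at this level,
 * or NULL if no such mapping is supported.
 */
static const char *replacing_huge_page(enum table_level level)
{
	switch (level) {
	case PTE_TABLE:
		return "2MB";	/* a 2MB page replaces a PTE table */
	case PMD_TABLE:
		return "1GB";	/* a 1GB page replaces a PMD table */
	case PUD_TABLE:
		return NULL;	/* would need a huge page at P4D level */
	}
	return NULL;
}

/* A table is detached and freed mid-lifetime only when a huge page replaces it. */
static bool freed_while_vm_alive(enum table_level level)
{
	return replacing_huge_page(level) != NULL;
}

/*
 * Only tables that can be freed while a lockless walker may still be
 * reading them need an RCU-deferred free.
 */
static bool needs_rcu_free(enum table_level level)
{
	return freed_while_vm_alive(level);
}
```

In this model needs_rcu_free() is true for PTE_TABLE and PMD_TABLE and
false for PUD_TABLE, which matches the patch touching only kvm_pte_cache
and kvm_pmd_cache.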