Date: Fri, 24 Nov 2023 13:16:35 -0500
From: Peter Xu
To: Christophe Leroy, "Aneesh Kumar K.V", Michael Ellerman
Cc: Christoph Hellwig, "linux-kernel@vger.kernel.org",
 "linux-mm@kvack.org", Andrea Arcangeli, James Houghton,
 Lorenzo Stoakes, David Hildenbrand, Vlastimil Babka, John Hubbard,
 Yang Shi, Rik van Riel, Hugh Dickins, Matthew Wilcox, Jason Gunthorpe,
 Axel Rasmussen, "Kirill A . Shutemov", Andrew Morton,
 "linuxppc-dev@lists.ozlabs.org", Mike Rapoport, Mike Kravetz
Subject: Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing
References: <20231116012908.392077-1-peterx@redhat.com>
 <20231116012908.392077-7-peterx@redhat.com>
 <57be0ed0-f1d7-4583-9a5f-3ed7deb0ea97@csgroup.eu>
 <1a1cbd2c-ef59-4b73-bffc-a375bf81243c@csgroup.eu>
In-Reply-To: <1a1cbd2c-ef59-4b73-bffc-a375bf81243c@csgroup.eu>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Christophe, Michael, Aneesh,

[I'll reply altogether here]

On Fri, Nov 24, 2023 at 07:03:11AM +0000, Christophe Leroy wrote:
> I added that code with commit e17eae2b8399 ("mm: pagewalk: fix walk for
> hugepage tables") because I was getting crazy displays when dumping
> /sys/kernel/debug/pagetables
>
> Huge pages can be used for many thing.
>
> On powerpc 8xx, there are 4 possible page size: 4k, 16k, 512k and 8M.
> Each PGD entry addresses 4M areas, so hugepd is used for anything using
> 8M pages. Could have used regular page tables instead, but it is not
> worth allocating a 4k table when the HW will only read first entry.
>
> At the time being, linear memory mapping is performed with 8M pages, so
> ptdump_walk_pgd() will walk into huge page directories.
>
> Also, huge pages can be used in vmalloc() and in vmap(). At the time
> being we support 512k pages there on the 8xx. 8M pages will be supported
> once vmalloc() and vmap() support hugepd, as explained in commit
> a6a8f7c4aa7e ("powerpc/8xx: add support for huge pages on VMAP and VMALLOC")
>
> So yes as a conclusion hugepd is used outside hugetlbfs, hope it
> clarifies things.

Yes it does, thanks a lot for all of your replies.

So I think this is what I missed: on Freescale ppc 8xx there's a special
hugepd_populate_kernel() defined to install kernel pgtables for hugepd.
Obviously I didn't check further than hugepd_populate() when I first
looked, and stopped at the first definition of hugepd_populate() I found,
the one for 64-bit ppc.

For this specific patch: I suppose the change to reuse the fast-gup
function is still fine, because it is still true when there's a VMA
present (GUP applies only to user mappings, nothing like KASAN should ever
pop up). So AFAIU it's still true that hugepd is only used for hugetlb
pages in this case, even on Freescale 8xx, and nothing should explode yet.

So maybe I can still keep the code changes. However, the comment at least
definitely needs fixing (I'm going to add some comments, which hch
requested and I agree with); that is not yet in the patch I posted here,
but I'll refine it locally.

For the whole work: the purpose of it is to start merging hugetlb pgtable
processing with generic mm.
That is my take on previous lsfmm discussions in the community about how
we should move forward with hugetlb in the future, to avoid duplicating
code from generic mm. Hugetlb is kind of blocked on adding new
(especially large) features in general because of such complexity. This
is all about that, but a small step towards it.

I see there seems to be a trend of making hugepd more general.
Christophe's fix for dumping pgtables is exactly what I would also look
for if I keep going. I hope that's the right way to go.

I'll also need to think more about how this will affect my plan.
Currently it all seems fine: I won't ever try to change any code specific
to kernel mappings. I suppose any hugetlbfs based test should still cover
all the code I will touch on hugepd. Then it should just work for kernel
mappings on Freescale; it'll be great if e.g. Christophe can help me
double check that, if the series can stabilize in a few versions.

If any of you have any hints on testing they'll be more than welcome,
either specific test cases or general pointers; currently I'm still at
the phase of looking for a valid ppc system - QEMU tcg ppc64 emulation on
x86 is slow enough to make me give up already.

Considering how special hugepd is on ppc and the possibility that I'll
break it, there's yet another option, which is to apply the new logic
only to archs with !ARCH_HAS_HUGEPD. It'll make my life easier, but that
also means that even if my attempt works out, anything new will by
default rule ppc out. And we'll have a bunch of "#ifdef ARCH_HAS_HUGEPD"
in generic code, which is not preferred either. For gup, it might be
relatively easy compared to the rest. I'm still hesitating on the long
term plan.

Please let me know if you have any thoughts on any of the above.

Thanks!

--
Peter Xu