Date: Wed, 29 Mar 2023 10:13:23 +0200
From: Michal Hocko
To: Mike Rapoport
Cc: linux-mm@kvack.org, Andrew Morton, Dave Hansen, Peter Zijlstra, Rick Edgecombe, Song Liu, Thomas Gleixner, Vlastimil Babka, linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
References: <20230308094106.227365-1-rppt@kernel.org> <20230308094106.227365-2-rppt@kernel.org>

On Wed 29-03-23 10:28:02, Mike Rapoport wrote:
> On Tue, Mar 28, 2023 at 05:24:49PM +0200, Michal Hocko wrote:
> > On Tue 28-03-23 18:11:20, Mike Rapoport wrote:
> > > On Tue, Mar 28, 2023 at 09:39:37AM +0200, Michal Hocko wrote:
> > [...]
> > > > OK, so you want to reduce that direct map fragmentation?
> > >
> > > Yes.
> > >
> > > > Is that a real problem?
> > >
> > > A while ago Intel folks published a report [1] that showed better performance
> > > with large pages in the direct map for the majority of benchmarks.
> > >
> > > > My impression is that modules are mostly a static thing. BPF
> > > > might be a different thing though. I have a recollection that BPF guys
> > > > were dealing with direct map fragmentation as well.
> > >
> > > Modules are indeed static, but module_alloc() is used by anything that
> > > allocates code pages, e.g. kprobes, ftrace and BPF. Besides, Thomas
> > > mentioned that having code in 2M pages reduces iTLB pressure [2], but
> > > that's not only about avoiding the splits in the direct map but also about
> > > using large mappings in the modules address space.
> > >
> > > BPF guys suggested an allocator for executable memory [3] mainly because
> > > they've seen performance improvement of 0.6% - 0.9% in their setups [4].
> >
> > These are fair arguments and it would have been better to have them in
> > the RFC. Also it is not really clear to me what is the actual benefit of
> > the unmapping for those usecases. I do get they want to benefit from
> > caching on the same permission setup but do they need unmapping as well?
>
> The pages allocated with module_alloc() get different permissions depending
> on whether they belong to text, rodata, data etc. The permissions are
> updated in both vmalloc address space and in the direct map. The updates to
> the direct map cause splits of the large pages.

That much is clear (wouldn't hurt to mention that in the changelog
though).

> If we cache large pages as
> unmapped we take out the entire 2M page from the direct map and then
> if/when it becomes free it can be returned to the direct map as a 2M page.
>
> Generally, the unmapped allocations are intended for use-cases that anyway
> map the memory elsewhere than direct map and need to modify direct mappings
> of the memory, be it module_alloc(), secretmem, PKS page tables or maybe
> even some of the encrypted VM memory.

I believe we are still not on the same page here. I do understand that
you want to re-use the caching capability of the unmapped_pages_alloc
for modules allocations as well. My question is whether module_alloc
really benefits from the fact that the memory is unmapped?

I guess you want to say that it does because it wouldn't have to change
the permission for the direct map but I do not see that anywhere in the
patch... Also consider the slow path where module_alloc needs to
allocate a fresh (huge)page and unmap it. Does the overhead of the
unmapping suit the needs of all module_alloc users? Module loader is likely
not interesting as it tends to be rather static but what about BPF
programs?

[...]
> > > I also think vmalloc with unmapped pages can provide backing pages for
> > > execmem_alloc() Song proposed.
> >
> > Why would you need to have execmem_alloc have its memory virtually
> > mapped into vmalloc space?
>
> Currently all the memory allocated for code is managed in a subset of
> vmalloc() space. The intention of execmem_alloc() was to replace
> module_alloc() for the code pages, so it's natural that it will use the
> same virtual ranges.

Ohh, sorry my bad. I have only now realized I have misread and thought
about secretmem_alloc...
-- 
Michal Hocko
SUSE Labs
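
The split-versus-unmap argument in this thread can be illustrated with a toy
user-space model. This is only a sketch under stated assumptions: the names
DirectMap and UnmappedCache, the region count, and the premise that each
uncached allocation lands in a different 2M region are hypothetical and do
not come from the patch. It also does not model the cost of unmapping a
fresh huge page in the slow path (e.g. TLB flushes), which is the open
question for BPF-style users.

```python
PAGES_PER_2M = 512  # number of 4 KiB pages in one 2 MiB large page


class DirectMap:
    """Toy model: each 2 MiB region is 'large', 'split' or 'unmapped'."""

    def __init__(self, n_regions):
        self.state = ["large"] * n_regions
        self.splits = 0  # how many large mappings were broken into 4 KiB entries

    def set_perms_4k(self, region):
        # Changing permissions on part of a large mapping forces a split.
        if self.state[region] == "large":
            self.state[region] = "split"
            self.splits += 1

    def unmap_2m(self, region):
        # Removing the whole region needs no split.
        assert self.state[region] == "large"
        self.state[region] = "unmapped"

    def remap_2m(self, region):
        # A fully free region goes back as a single large mapping.
        assert self.state[region] == "unmapped"
        self.state[region] = "large"


class UnmappedCache:
    """Hands out 4 KiB pages from regions that were unmapped as a whole."""

    def __init__(self, dmap):
        self.dmap = dmap
        self.used = {}        # region -> pages currently handed out
        self.next_region = 0  # toy bump allocator for fresh regions

    def alloc_page(self):
        # Fast path: reuse a region that is already out of the direct map.
        for region, used in self.used.items():
            if used < PAGES_PER_2M:
                self.used[region] = used + 1
                return region
        # Slow path: take a fresh 2 MiB region out of the direct map.
        region = self.next_region
        self.next_region += 1
        self.dmap.unmap_2m(region)
        self.used[region] = 1
        return region

    def free_page(self, region):
        self.used[region] -= 1
        if self.used[region] == 0:
            del self.used[region]
            self.dmap.remap_2m(region)


def demo():
    # Baseline (assumption): four allocations hit four different regions,
    # and each permission update splits one large mapping.
    base = DirectMap(8)
    for region in range(4):
        base.set_perms_4k(region)

    # Cached: the same four allocations come from one unmapped region.
    cached = DirectMap(8)
    cache = UnmappedCache(cached)
    pages = [cache.alloc_page() for _ in range(4)]
    for region in pages:
        cache.free_page(region)
    return base.splits, cached.splits, cached.state[0]


if __name__ == "__main__":
    print(demo())
```

In this model the baseline ends with four split regions, while the cached
variant causes no splits and the region returns to the direct map as one
large mapping once all of its pages are freed.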