Date: Thu, 30 Mar 2023 08:13:48 +0300
From: Mike Rapoport
To: Michal Hocko
Cc: linux-mm@kvack.org, Andrew Morton, Dave Hansen, Peter Zijlstra, Rick Edgecombe, Song Liu, Thomas Gleixner, Vlastimil Babka, linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
References: <20230308094106.227365-2-rppt@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 29, 2023 at 10:13:23AM +0200, Michal Hocko wrote:
> On Wed 29-03-23 10:28:02, Mike Rapoport wrote:
> > On Tue, Mar 28, 2023 at 05:24:49PM +0200, Michal Hocko wrote:
> > > On Tue 28-03-23 18:11:20, Mike Rapoport wrote:
> > > > On Tue, Mar 28, 2023 at 09:39:37AM +0200, Michal Hocko wrote:
> > > [...]
> > > > > OK, so you want to reduce that direct map fragmentation?
> > > >
> > > > Yes.
> > > >
> > > > > Is that a real problem?
> > > >
> > > > A while ago Intel folks published a report [1] that showed better performance
> > > > with large pages in the direct map for the majority of benchmarks.
> > > >
> > > > > My impression is that modules are mostly a static thing. BPF
> > > > > might be a different thing though. I have a recollection that BPF guys
> > > > > were dealing with direct map fragmentation as well.
> > > >
> > > > Modules are indeed static, but module_alloc() is used by anything that
> > > > allocates code pages, e.g. kprobes, ftrace and BPF. Besides, Thomas
> > > > mentioned that having code in 2M pages reduces iTLB pressure [2], but
> > > > that's not only about avoiding the splits in the direct map but also about
> > > > using large mappings in the modules address space.
> > > >
> > > > BPF guys suggested an allocator for executable memory [3] mainly because
> > > > they've seen performance improvement of 0.6% - 0.9% in their setups [4].
> > >
> > > These are fair arguments and it would have been better to have them in
> > > the RFC. Also it is not really clear to me what is the actual benefit of
> > > the unmapping for those usecases. I do get they want to benefit from
> > > caching on the same permission setup but do they need unmapping as well?
> >
> > The pages allocated with module_alloc() get different permissions depending
> > on whether they belong to text, rodata, data etc. The permissions are
> > updated in both the vmalloc address space and in the direct map. The updates to
> > the direct map cause splits of the large pages.
>
> That much is clear (wouldn't hurt to mention that in the changelog
> though).
>
> > If we cache large pages as
> > unmapped we take out the entire 2M page from the direct map and then
> > if/when it becomes free it can be returned to the direct map as a 2M page.
> >
> > Generally, the unmapped allocations are intended for use-cases that anyway
> > map the memory elsewhere than the direct map and need to modify direct mappings
> > of the memory, be it module_alloc(), secretmem, PKS page tables or maybe
> > even some of the encrypted VM memory.
>
> I believe we are still not on the same page here. I do understand that
> you want to re-use the caching capability of the unmapped_pages_alloc
> for modules allocations as well. My question is whether module_alloc
> really benefits from the fact that the memory is unmapped?
>
> I guess you want to say that it does because it wouldn't have to change
> the permission for the direct map but I do not see that anywhere in the
> patch...

This happens automagically outside the patch :)

Currently, to change memory permissions the modules code calls the set_memory
APIs and passes a vmalloc'ed address to them. The set_memory functions look up
the direct map alias and update the permissions there as well.
If the memory allocated with module_alloc() is unmapped in the direct map,
there won't be an alias for the set_memory APIs to update.

> Also consider the slow path where module_alloc needs to
> allocate a fresh (huge)page and unmap it. Does the overhead of the
> unmapping suit the needs of all module_alloc users? Module loader is likely
> not interesting as it tends to be rather static but what about BPF
> programs?

The overhead of unmapping pages in the direct map on the allocation path will
be offset by the reduced overhead of updating permissions in the direct map
after the allocation. Both use the same APIs, and where today the
permission update causes a split of a large page, unmapping of a large page
won't.

Of course in a loaded system unmapped_alloc() won't always be able to
allocate large pages to replenish the cache, but still there will be fewer
updates to the direct map.

> --
> Michal Hocko
> SUSE Labs

-- 
Sincerely yours,
Mike.
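
The argument about splits versus whole-page unmapping can be illustrated
with a toy userspace model. This is only a sketch and not kernel code: the
DirectMap class, its method names and its three region states are
hypothetical, and it assumes x86-64 with 512 4K pages per 2M large page. It
counts how many direct map modifications and large-page splits occur when
permissions of individual pages change, with and without unmapping the 2M
page first.

```python
PAGES_PER_2M = 512  # assumption: 4K base pages, 2M large pages (x86-64)


class DirectMap:
    """Toy model: every 2M region is 'large', 'split' or 'unmapped'."""

    def __init__(self, nr_large_pages):
        self.state = ["large"] * nr_large_pages
        self.updates = 0  # number of direct map modifications

    def set_permissions(self, pfn):
        """Model a permission change on one 4K page via its direct map alias."""
        region = pfn // PAGES_PER_2M
        if self.state[region] == "unmapped":
            return  # no alias in the direct map, nothing to update
        if self.state[region] == "large":
            self.state[region] = "split"  # a 4K update splits the 2M mapping
        self.updates += 1

    def unmap_large(self, region):
        """Take a whole 2M page out of the direct map; no split is needed."""
        self.state[region] = "unmapped"
        self.updates += 1

    def remap_large(self, region):
        """Return a free 2M page to the direct map as one large mapping."""
        self.state[region] = "large"
        self.updates += 1

    def split_regions(self):
        return self.state.count("split")


def run(nr_code_pages, unmap_first):
    """Change permissions on nr_code_pages pages taken from region 0."""
    dm = DirectMap(nr_large_pages=4)
    if unmap_first:
        dm.unmap_large(0)
    for pfn in range(nr_code_pages):
        dm.set_permissions(pfn)
    return dm


if __name__ == "__main__":
    for unmap_first in (False, True):
        dm = run(nr_code_pages=8, unmap_first=unmap_first)
        print(f"unmap_first={unmap_first}: "
              f"updates={dm.updates}, split regions={dm.split_regions()}")
```

In this model the per-page permission changes split the large page and touch
the direct map once per page, while unmapping the 2M page up front costs a
single update and leaves nothing to split. The model ignores TLB flushes and
the cost of replenishing the cache under memory pressure.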