Date: Mon, 27 Mar 2023 16:31:45 +0200
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
From: Vlastimil Babka
To: Michal Hocko, Mike Rapoport
Cc: linux-mm@kvack.org, Andrew Morton, Dave Hansen, Peter Zijlstra,
 Rick Edgecombe, Song Liu, Thomas Gleixner, linux-kernel@vger.kernel.org,
 x86@kernel.org, Mel Gorman
References: <20230308094106.227365-1-rppt@kernel.org>
 <20230308094106.227365-2-rppt@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/27/23 15:43, Michal Hocko wrote:
> On Sat 25-03-23 09:38:12, Mike Rapoport wrote:
>> On Fri, Mar 24, 2023 at 09:37:31AM +0100, Michal Hocko wrote:
>> > On Wed 08-03-23 11:41:02, Mike Rapoport wrote:
>> > > From: "Mike Rapoport (IBM)"
>> > >
>> > > When set_memory or set_direct_map APIs used to change attribute or
>> > > permissions for chunks of several pages, the large PMD that maps these
>> > > pages in the direct map must be split. Fragmenting the direct map in such
>> > > manner causes TLB pressure and, eventually, performance degradation.
>> > >
>> > > To avoid excessive direct map fragmentation, add ability to allocate
>> > > "unmapped" pages with __GFP_UNMAPPED flag that will cause removal of the
>> > > allocated pages from the direct map and use a cache of the unmapped pages.
>> > >
>> > > This cache is replenished with higher order pages with preference for
>> > > PMD_SIZE pages when possible so that there will be fewer splits of large
>> > > pages in the direct map.
>> > >
>> > > The cache is implemented as a buddy allocator, so it can serve high order
>> > > allocations of unmapped pages.
>> >
>> > Why do we need a dedicated gfp flag for all this when a dedicated
>> > allocator is used anyway. What prevents users to call unmapped_pages_{alloc,free}?
>>
>> Using unmapped_pages_{alloc,free} adds complexity to the users which IMO
>> outweighs the cost of a dedicated gfp flag.
>
> Aren't those users rare and very special anyway?

I think it's mostly about the freeing, which can happen from a generic
context that is not aware of the special allocation. So it's not about how
rare the users are, but about how complex it would be to exhaustively
determine those contexts and do something in them.

>> For modules we'd have to make x86::module_{alloc,free}() take care of
>> mapping and unmapping the allocated pages in the modules virtual address
>> range. This also might become relevant for another architectures in future
>> and then we'll have several complex module_alloc()s.
>
> The module_alloc use is lacking any justification. More context would be
> more than useful. Also vmalloc support for the proposed __GFP_UNMAPPED
> likely needs more explanation as well.
>
>> And for secretmem while using unmapped_pages_alloc() is easy, the free path
>> becomes really complex because actual page freeing for fd-based memory is
>> deeply buried in the page cache code.
>
> Why is that a problem? You already hook into the page freeing path and
> special case unmapped memory.

But the proposal of unmapped_pages_free() would suggest this would no longer
be the case?

But maybe we could, as a compromise, provide unmapped_pages_alloc() to get
rid of the new __GFP flag, provide unmapped_pages_free() to annotate places
that are known to free unmapped memory explicitly, but the generic page
freeing would also keep the hook?

>> My gut feeling is that for PKS using a gfp flag would save a lot of hassle
>> as well.
>
> Well, my take on this is that this is not a generic page allocator
> functionality. It is clearly an allocator on top of the page allocator.
> In general gfp flags are scarce and convenience argument usually fires
> back later on in hard to predict ways. So I've learned to be careful
> here. I am not saying this is a no-go but right now I do not see any
> actual advantage. The vmalloc usecase could be interesting in that
> regards but it is not really clear to me whether this is a good idea in
> the first place.
>
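
To make the compromise above a bit more concrete, here is a rough user-space
toy model of it, not kernel code and not what the RFC implements. Only the
names unmapped_pages_alloc() and unmapped_pages_free() come from this thread;
struct toy_page, its "unmapped" field, toy_alloc_pages(), toy_free_pages(),
run_demo() and the two counters are made up for illustration. The idea shown
is just: an explicit alloc entry point instead of a gfp flag, an explicit free
for call sites that know what they hold, and a hook kept in the generic free
path for contexts that don't.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page; "unmapped" models a page marker. */
struct toy_page {
	bool unmapped;
	unsigned int order;
};

static int nr_unmapped_cached;	/* pages handed back to the unmapped cache */
static int nr_generic_freed;	/* pages handed back to the regular allocator */

/* Explicit entry point, used instead of a dedicated gfp flag. */
static struct toy_page *unmapped_pages_alloc(unsigned int order)
{
	struct toy_page *page = malloc(sizeof(*page));

	if (!page)
		return NULL;
	page->unmapped = true;
	page->order = order;
	return page;
}

/* Explicit free, annotating sites known to free unmapped memory. */
static void unmapped_pages_free(struct toy_page *page)
{
	assert(page->unmapped);
	nr_unmapped_cached++;
	free(page);
}

/* Ordinary allocation, for comparison. */
static struct toy_page *toy_alloc_pages(unsigned int order)
{
	struct toy_page *page = malloc(sizeof(*page));

	if (!page)
		return NULL;
	page->unmapped = false;
	page->order = order;
	return page;
}

/* Generic free path: keeps the hook, so unaware callers still work. */
static void toy_free_pages(struct toy_page *page)
{
	if (page->unmapped) {
		unmapped_pages_free(page);
		return;
	}
	nr_generic_freed++;
	free(page);
}

/* One annotated free, one free from a generic context, one normal page. */
void run_demo(void)
{
	struct toy_page *a = unmapped_pages_alloc(0);
	struct toy_page *b = unmapped_pages_alloc(9);
	struct toy_page *c = toy_alloc_pages(0);

	if (!a || !b || !c) {
		free(a);
		free(b);
		free(c);
		return;
	}
	unmapped_pages_free(a);	/* caller knows the page is unmapped */
	toy_free_pages(b);	/* generic context, caught by the hook */
	toy_free_pages(c);	/* regular page, regular path */
}
```

Calling run_demo() from a main() exercises all three paths. In this model the
hook in toy_free_pages() is what keeps a case like secretmem working when the
actual free happens deep inside generic code, while the explicit
unmapped_pages_free() calls only document intent.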