From: Song Liu
Date: Thu, 18 May 2023 13:51:08 -0700
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
To: Kent Overstreet
Cc: Mike Rapoport, linux-mm@kvack.org, Andrew Morton, Dave Hansen, Peter Zijlstra, Rick Edgecombe, Thomas Gleixner, Vlastimil Babka, linux-kernel@vger.kernel.org, x86@kernel.org
References: <20230308094106.227365-1-rppt@kernel.org> <20230308094106.227365-2-rppt@kernel.org> <20230518152354.GD4967@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 18, 2023 at 12:15 PM Kent Overstreet wrote:
>
> On Thu, May 18, 2023 at 12:03:03PM -0700, Song Liu wrote:
> > On Thu, May 18, 2023 at 11:47 AM Song Liu wrote:
> > >
> > > On Thu, May 18, 2023 at 10:24 AM Kent Overstreet
> > > wrote:
> > > >
> > > > On Thu, May 18, 2023 at 10:00:39AM -0700, Song Liu wrote:
> > > > > On Thu, May 18, 2023 at 9:48 AM Kent Overstreet
> > > > > wrote:
> > > > > >
> > > > > > On Thu, May 18, 2023 at 09:33:20AM -0700, Song Liu wrote:
> > > > > > > I am working on patches based on the discussion in [1]. I am planning to
> > > > > > > send v1 for review in a week or so.
> > > > > >
> > > > > > Hey Song, I was reviewing that thread too,
> > > > > >
> > > > > > Are you taking a different approach based on Thomas's feedback? I think
> > > > > > he had some fair points in that thread.
> > > > >
> > > > > Yes, the API is based on Thomas's suggestion, like 90% from the discussions.
> > > > >
> > > > > >
> > > > > > My own feeling is that the buddy allocator is our tool for allocating
> > > > > > larger variable sized physically contiguous allocations, so I'd like to
> > > > > > see something based on that - I think we could do a hybrid buddy/slab
> > > > > > allocator approach, like we have for regular memory allocations.
> > > > >
> > > > > I am planning to implement the allocator based on this (reuse
> > > > > vmap_area logic):
> > > >
> > > > Ah, you're still doing vmap_area approach.
> > > >
> > > > Mike's approach looks like it'll be _much_ lighter weight and higher
> > > > performance, to me. vmalloc is known to be slow compared to the buddy
> > > > allocator, and with Mike's approach we're only modifying mappings once
> > > > per 2 MB chunk.
> > > >
> > > > I don't see anything in your code for sub-page sized allocations too, so
> > > > perhaps I should keep going with my slab allocator.
> > >
> > > The vmap_area approach handles sub-page allocations. In 5/5 of set [2],
> > > we showed that multiple BPF programs share the same page with some
> > > kernel text (_etext).
> > >
> > > > Could you share your thoughts on your approach vs. Mike's? I'm newer to
> > > > this area of the code than you two so maybe there's an angle I've missed
> > > > :)
> > >
> > > AFAICT, tree based solution (vmap_area) is more efficient than bitmap
> > > based solution.
>
> Tree based requires quite a bit of overhead for the rbtree pointers, and
> additional vmap_area structs.
>
> With a buddy allocator based approach, there's no additional state that
> needs to be allocated, since it all fits in struct page.

To allocate memory for text, we will allocate 2MiB, make it ROX, and then
use it for many small allocations. IIUC, buddy allocator will use
unallocated parts of this page for metadata. I guess this may be a
problem, as the whole page is ROX now, and we have to use text_poke to
write to it.

OTOH, if we allocate extra memory for metadata (tree based solution),
all the metadata operations can be regular read/write.

Thanks,
Song
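
The metadata concern in the reply above can be made concrete with a small userspace sketch. It is only a toy model under stated assumptions, not kernel code: `RoxChunk`, `poke`, `InBandAllocator`, `OutOfBandAllocator` and `metadata_pokes` are all hypothetical names, `poke` merely stands in for a text_poke-style slow write, and a plain Python list stands in for the tree. The in-band variant models the "IIUC" reading in the message, where free-list state is kept inside the free parts of the ROX chunk; it does not model the kernel buddy allocator, whose state Kent describes as living in struct page.

```python
import struct

CHUNK_SIZE = 2 * 1024 * 1024   # one 2 MiB chunk, as in the message
HDR = struct.Struct("<qq")     # hypothetical in-band header: (next_offset, length)
ALIGN = HDR.size               # 16 bytes, so any free remainder can hold a header


class RoxChunk:
    """Toy stand-in for a read-only + executable region.

    Reads are free; every write must go through poke(), which is counted.
    """

    def __init__(self, size=CHUNK_SIZE):
        self.size = size
        self._mem = bytearray(size)
        self.pokes = 0

    def poke(self, off, data):
        if off < 0 or off + len(data) > self.size:
            raise ValueError("poke outside chunk")
        self._mem[off:off + len(data)] = data
        self.pokes += 1

    def read(self, off, length):
        return bytes(self._mem[off:off + length])


def round_up(n):
    """Round a request up to a multiple of ALIGN."""
    return (n + ALIGN - 1) // ALIGN * ALIGN


class InBandAllocator:
    """First-fit free list threaded through the free blocks themselves."""

    def __init__(self, chunk):
        self.chunk = chunk
        self.head = 0
        chunk.poke(0, HDR.pack(-1, chunk.size))  # whole chunk is one free block

    def _hdr(self, off):
        return HDR.unpack(self.chunk.read(off, HDR.size))

    def alloc(self, length):
        length = round_up(length)
        prev, off = None, self.head
        while off != -1:
            nxt, flen = self._hdr(off)
            if flen > length:
                # Shrink the free block in place and hand out its tail.
                self.chunk.poke(off, HDR.pack(nxt, flen - length))
                return off + flen - length
            if flen == length:
                # Exact fit: unlink the block from the list.
                if prev is None:
                    self.head = nxt
                else:
                    self.chunk.poke(prev, HDR.pack(nxt, self._hdr(prev)[1]))
                return off
            prev, off = off, nxt
        raise MemoryError("chunk exhausted")

    def free(self, off, length):
        # Push the block on the front of the list (no coalescing in this toy).
        self.chunk.poke(off, HDR.pack(self.head, round_up(length)))
        self.head = off


class OutOfBandAllocator:
    """Same first-fit policy, but free extents live in ordinary memory."""

    def __init__(self, chunk):
        self.chunk = chunk
        self.free_extents = [(0, chunk.size)]  # list as a stand-in for a tree

    def alloc(self, length):
        length = round_up(length)
        for i, (off, flen) in enumerate(self.free_extents):
            if flen >= length:
                if flen == length:
                    del self.free_extents[i]
                else:
                    self.free_extents[i] = (off, flen - length)
                return off + flen - length
        raise MemoryError("chunk exhausted")

    def free(self, off, length):
        self.free_extents.append((off, round_up(length)))


def metadata_pokes(allocator_cls, sizes):
    """Allocate and then free each size; no payload is written, so every
    counted poke is a metadata update."""
    chunk = RoxChunk()
    allocator = allocator_cls(chunk)
    blocks = [(allocator.alloc(s), s) for s in sizes]
    for off, s in blocks:
        allocator.free(off, s)
    return chunk.pokes
```

With the request sizes `[100, 4096, 64]`, `metadata_pokes(InBandAllocator, ...)` counts 7 slow writes (one to set up the free list, one per allocation, one per free), while `metadata_pokes(OutOfBandAllocator, ...)` counts none. The price of the out-of-band variant is the extra memory for the extent records, which is the overhead Kent raises for the rbtree pointers and vmap_area structs.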