From: Song Liu
Date: Fri, 19 May 2023 08:42:38 -0700
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
To: Mike Rapoport
Cc: Kent Overstreet, linux-mm@kvack.org, Andrew Morton, Dave Hansen,
    Peter Zijlstra, Rick Edgecombe, Thomas Gleixner, Vlastimil Babka,
    linux-kernel@vger.kernel.org, x86@kernel.org
In-Reply-To: <20230519082945.GE4967@kernel.org>

On Fri, May 19, 2023 at 1:30 AM Mike Rapoport wrote:
>
> Hi Kent,
>
> On Thu, May 18, 2023 at 01:23:56PM -0400, Kent Overstreet wrote:
> > On Thu, May 18, 2023 at 10:00:39AM -0700, Song Liu wrote:
> > > On Thu, May 18, 2023 at 9:48 AM Kent Overstreet wrote:
> > > >
> > > > On Thu, May 18, 2023 at 09:33:20AM -0700, Song Liu wrote:
> > > > > I am working on patches based on the discussion in [1]. I am planning to
> > > > > send v1 for review in a week or so.
> > > >
> > > > Hey Song, I was reviewing that thread too,
> > > >
> > > > Are you taking a different approach based on Thomas's feedback? I think
> > > > he had some fair points in that thread.
> > >
> > > Yes, the API is based on Thomas's suggestion, like 90% from the discussions.
> > >
> > > >
> > > > My own feeling is that the buddy allocator is our tool for allocating
> > > > larger variable sized physically contiguous allocations, so I'd like to
> > > > see something based on that - I think we could do a hybrid buddy/slab
> > > > allocator approach, like we have for regular memory allocations.
> > >
> > > I am planning to implement the allocator based on this (reuse
> > > vmap_area logic):
> >
> > Ah, you're still doing vmap_area approach.
> >
> > Mike's approach looks like it'll be _much_ lighter weight and higher
> > performance, to me. vmalloc is known to be slow compared to the buddy
> > allocator, and with Mike's approach we're only modifying mappings once
> > per 2 MB chunk.
> >
> > I don't see anything in your code for sub-page sized allocations too, so
> > perhaps I should keep going with my slab allocator.
>
> Your allocator implicitly relies on vmalloc because of module_alloc ;-)
>
> What I was thinking is that we can replace module_alloc() calls in your
> allocator with something based on my unmapped_alloc(). If we make the part
> that refills the cache also take care of creating the mapping in the
> module address space, that should cover everything.

Here is what I found as I worked more on the code:

1. It takes quite some work to create a clean interface and make sure
   all the users of module_alloc can use the new allocator on all archs
   (archs with text poke need to work with ROX memory from the allocator;
   archs without text poke need to have a clean fallback mechanism, etc.).
   Most of this work is independent of the actual allocator, so we can
   do this part and plug in whatever allocator we want (buddy+slab,
   vmap-based, or any other solution).

2. A vmap_area-based solution will work, and it will be a single solution
   for both < PAGE_SIZE and > PAGE_SIZE allocations. Given that
   module_alloc is not in any hot path (AFAIK), I don't see any practical
   issues with this solution. It will be a little tricky to place and
   name the code, as it uses vmalloc logic but is technically a module
   allocator.

I will prioritize building the interface and migrating users to it. If
we do this part right, changing the underlying allocator should be
straightforward.

Thanks,
Song
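
For illustration only, here is a minimal userspace sketch of the split
described in point 1: callers go through one interface, archs with text
poke get the ROX backend, archs without it take a fallback path, and the
backend behind either can be swapped. Every name here (exec_backend,
arch_caps, exec_mem_alloc, ...) is hypothetical and not taken from any
posted patch; calloc()/free() merely stand in for a real allocator that
would manage page permissions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backend hooks: buddy+slab, vmap-based, or anything else. */
struct exec_backend {
	const char *name;
	void *(*alloc)(size_t size);
	void (*free)(void *ptr);
};

/* Userspace stand-ins; real backends would hand out ROX or RW+X memory. */
static void *stub_alloc(size_t size) { return calloc(1, size); }
static void stub_free(void *ptr) { free(ptr); }

static const struct exec_backend rox_backend = {
	.name = "rox", .alloc = stub_alloc, .free = stub_free,
};
static const struct exec_backend fallback_backend = {
	.name = "fallback", .alloc = stub_alloc, .free = stub_free,
};

/* Hypothetical per-arch capability description. */
struct arch_caps {
	bool has_text_poke;
};

/* Handle returned to callers; remembers which backend owns the memory. */
struct exec_mem {
	void *addr;
	size_t size;
	const struct exec_backend *backend;
};

/* Single entry point: the arch capability picks the backend, not the caller. */
static int exec_mem_alloc(struct exec_mem *m, const struct arch_caps *caps,
			  size_t size)
{
	const struct exec_backend *be =
		caps->has_text_poke ? &rox_backend : &fallback_backend;
	void *p = be->alloc(size);

	if (!p)
		return -1;
	m->addr = p;
	m->size = size;
	m->backend = be;
	return 0;
}

/* Free through the backend that produced the allocation. */
static void exec_mem_free(struct exec_mem *m)
{
	if (!m->addr)
		return;
	assert(m->backend != NULL);
	m->backend->free(m->addr);
	m->addr = NULL;
	m->size = 0;
}
```

In this sketch, replacing the allocator means changing only what
rox_backend points at; callers of exec_mem_alloc() and exec_mem_free()
stay untouched, which is the property point 1 is after.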