Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp5413219rwd; Mon, 5 Jun 2023 03:21:00 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6nZ6FX8fO5XsaSLZagMYvcU1IpKEbWxXr2qmihvU/FQqYT+IcM0Wg8lioc5DZKeljYP2Pu X-Received: by 2002:a54:470f:0:b0:39a:c1ea:6c32 with SMTP id k15-20020a54470f000000b0039ac1ea6c32mr1371121oik.31.1685960460037; Mon, 05 Jun 2023 03:21:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685960460; cv=none; d=google.com; s=arc-20160816; b=KtWi3rwVlMIjdfTIzU3Spl77UzCMjCM+ljScPIiQVuOeIqPrDGcnWd/9XV2tHWHJzQ K3HH1MxelG1jXdmvcxW7/ItNVITvNxEmsfCT4Tq9teUgWaJML1G0kvMgNFZz7u2l+j2E +2FBnKSQb/55sajPUInJwuP/JA097aCGQ5p+wzfBTN0k2V/d2fhI2iLBVtfFmBk7E51O nvJb70tMXenGVPrD1TT5sLIQWkz29mISXu7MpQT3iNQX+vwYUvwzTa1s0OH8j6bWUP34 j6GnusbJxICiwI323CLbeQrAYyq+EM6fDseGqVJMVz8LbdR++muVWEGw6vy1HVVvNrrx 3xhQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=C1VCZNhXlukhxo57mpZNqduvMgsXvxcNkNXX6OQtUz0=; b=wuGOQLigaq9NlPM4/Sf9U3CaA9GYAhI6gpmqnfmYC6SDMCbz0dSMTvA51Glh+ORkQC Mc7KtlKywQZS83gsuvPVTP0//Ggs19CLGqs61tkVAXVrT0GEsNtOZDnbveGYT3abL4Sh ecpX1vVrIApldRg4htto7srfen9VFGUiU9A4TWTE0NWuLXua4IDg2ZqN34D+YDDrfXa/ nRxwOWQXGztOCy8NIqx+1TyRpXHdnypFeYDmXf/YSAHZGVrH4bsnvg1Tixa6BTOMSplJ mKXfHYCpj1V65nl67Ji5n5YsCMmoCah+UVRCwVIvbAEx58pxvVrd8BG1C+iuarsE6XKj i13w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f1-20020aa79d81000000b0063b652ff9cdsi1526547pfq.404.2023.06.05.03.20.46; Mon, 05 Jun 2023 03:20:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231451AbjFEKJ4 (ORCPT + 99 others); Mon, 5 Jun 2023 06:09:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231816AbjFEKJs (ORCPT ); Mon, 5 Jun 2023 06:09:48 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id EDA05ED; Mon, 5 Jun 2023 03:09:42 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2B9CBD75; Mon, 5 Jun 2023 03:10:28 -0700 (PDT) Received: from FVFF77S0Q05N (unknown [10.57.24.244]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8F8793F793; Mon, 5 Jun 2023 03:09:37 -0700 (PDT) Date: Mon, 5 Jun 2023 11:09:34 +0100 From: Mark Rutland To: Mike Rapoport Cc: Kent Overstreet , linux-kernel@vger.kernel.org, Andrew Morton , Catalin Marinas , Christophe Leroy , "David S. Miller" , Dinh Nguyen , Heiko Carstens , Helge Deller , Huacai Chen , Luis Chamberlain , Michael Ellerman , "Naveen N. Rao" , Palmer Dabbelt , Russell King , Song Liu , Steven Rostedt , Thomas Bogendoerfer , Thomas Gleixner , Will Deacon , bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org, linux-mm@kvack.org, linux-modules@vger.kernel.org, linux-parisc@vger.kernel.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev, netdev@vger.kernel.org, sparclinux@vger.kernel.org, x86@kernel.org Subject: Re: [PATCH 00/13] mm: jit/text allocator Message-ID: References: <20230601101257.530867-1-rppt@kernel.org> <20230605092040.GB3460@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230605092040.GB3460@kernel.org> X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 05, 2023 at 12:20:40PM +0300, Mike Rapoport wrote: > On Fri, Jun 02, 2023 at 10:35:09AM +0100, Mark Rutland wrote: > > On Thu, Jun 01, 2023 at 02:14:56PM -0400, Kent Overstreet wrote: > > > On Thu, Jun 01, 2023 at 05:12:03PM +0100, Mark Rutland wrote: > > > > For a while I have wanted to give kprobes its own allocator so that it can work > > > > even with CONFIG_MODULES=n, and so that it doesn't have to waste VA space in > > > > the modules area. > > > > > > > > Given that, I think these should have their own allocator functions that can be > > > > provided independently, even if those happen to use common infrastructure. > > > > > > How much memory can kprobes conceivably use? I think we also want to try > > > to push back on combinatorial new allocators, if we can. > > > > That depends on who's using it, and how (e.g. via BPF). > > > > To be clear, I'm not necessarily asking for entirely different allocators, but > > I do thinkg that we want wrappers that can at least pass distinct start+end > > parameters to a common allocator, and for arm64's modules code I'd expect that > > we'd keep the range falblack logic out of the common allcoator, and just call > > it twice. > > > > > > > Several architectures override module_alloc() because of various > > > > > constraints where the executable memory can be located and this causes > > > > > additional obstacles for improvements of code allocation. > > > > > > > > > > This set splits code allocation from modules by introducing > > > > > jit_text_alloc(), jit_data_alloc() and jit_free() APIs, replaces call > > > > > sites of module_alloc() and module_memfree() with the new APIs and > > > > > implements core text and related allocation in a central place. > > > > > > > > > > Instead of architecture specific overrides for module_alloc(), the > > > > > architectures that require non-default behaviour for text allocation must > > > > > fill jit_alloc_params structure and implement jit_alloc_arch_params() that > > > > > returns a pointer to that structure. If an architecture does not implement > > > > > jit_alloc_arch_params(), the defaults compatible with the current > > > > > modules::module_alloc() are used. > > > > > > > > As above, I suspect that each of the callsites should probably be using common > > > > infrastructure, but I don't think that a single jit_alloc_arch_params() makes > > > > sense, since the parameters for each case may need to be distinct. > > > > > > I don't see how that follows. The whole point of function parameters is > > > that they may be different :) > > > > What I mean is that jit_alloc_arch_params() tries to aggregate common > > parameters, but they aren't actually common (e.g. the actual start+end range > > for allocation). > > jit_alloc_arch_params() tries to aggregate architecture constraints and > requirements for allocations of executable memory and this exactly what > the first 6 patches of this set do. > > A while ago Thomas suggested to use a structure that parametrizes > architecture constraints by the memory type used in modules [1] and Song > implemented the infrastructure for it and x86 part [2]. > > I liked the idea of defining parameters in a single structure, but I > thought that approaching the problem from the arch side rather than from > modules perspective will be better starting point, hence these patches. > > I don't see a fundamental reason why a single structure cannot describe > what is needed for different code allocation cases, be it modules, kprobes > or bpf. There is of course an assumption that the core allocations will be > the same for all the users, and it seems to me that something like > > * allocate physical memory if allocator caches are empty > * map it in vmalloc or modules address space > * return memory from the allocator cache to the caller > > will work for all usecases. > > We might need separate caches for different cases on different > architectures, and a way to specify what cache should be used in the > allocator API, but that does not contradict a single structure for arch > specific parameters, but only makes it more elaborate, e.g. something like > > enum jit_type { > JIT_MODULES_TEXT, > JIT_MODULES_DATA, > JIT_KPROBES, > JIT_FTRACE, > JIT_BPF, > JIT_TYPE_MAX, > }; > > struct jit_alloc_params { > struct jit_range ranges[JIT_TYPE_MAX]; > /* ... */ > }; > > > > Can you give more detail on what parameters you need? If the only extra > > > parameter is just "does this allocation need to live close to kernel > > > text", that's not that big of a deal. > > > > My thinking was that we at least need the start + end for each caller. That > > might be it, tbh. > > Do you mean that modules will have something like > > jit_text_alloc(size, MODULES_START, MODULES_END); > > and kprobes will have > > jit_text_alloc(size, KPROBES_START, KPROBES_END); > ? Yes. > It sill can be achieved with a single jit_alloc_arch_params(), just by > adding enum jit_type parameter to jit_text_alloc(). That feels backwards to me; it centralizes a bunch of information about distinct users to be able to shove that into a static array, when the callsites can pass that information. What's *actually* common after separating out the ranges? Is it just the permissions? If we want this to be able to share allocations and so on, why can't we do this like a kmem_cache, and have the callsite pass a pointer to the allocator data? That would make it easy for callsites to share an allocator or use a distinct one. Thanks, Mark. > [1] https://lore.kernel.org/linux-mm/87v8mndy3y.ffs@tglx/ > [2] https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org > > > Thanks, > > Mark. > > -- > Sincerely yours, > Mike.