Date: Thu, 8 Jun 2023 21:41:16 +0300
From: Mike Rapoport
To: Song Liu
Cc: Mark Rutland, Kent Overstreet, linux-kernel@vger.kernel.org,
	Andrew Morton, Catalin Marinas, Christophe Leroy, "David S. Miller",
	Dinh Nguyen, Heiko Carstens, Helge Deller, Huacai Chen,
	Luis Chamberlain, Michael Ellerman, "Naveen N. Rao", Palmer Dabbelt,
	Russell King, Steven Rostedt, Thomas Bogendoerfer, Thomas Gleixner,
	Will Deacon, bpf@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-modules@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-s390@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
	netdev@vger.kernel.org, sparclinux@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH 00/13] mm: jit/text allocator
Message-ID: <20230608184116.GJ52412@kernel.org>
References: <20230601101257.530867-1-rppt@kernel.org>
	<20230605092040.GB3460@kernel.org>

On Tue, Jun 06, 2023 at 11:21:59AM -0700, Song Liu wrote:
> On Mon, Jun 5, 2023 at 3:09 AM Mark Rutland wrote:
>
> [...]
>
> > > > > Can you give more detail on what parameters you need? If the only extra
> > > > > parameter is just "does this allocation need to live close to kernel
> > > > > text", that's not that big of a deal.
> > > >
> > > > My thinking was that we at least need the start + end for each caller. That
> > > > might be it, tbh.
> > >
> > > Do you mean that modules will have something like
> > >
> > >	jit_text_alloc(size, MODULES_START, MODULES_END);
> > >
> > > and kprobes will have
> > >
> > >	jit_text_alloc(size, KPROBES_START, KPROBES_END);
> > > ?
> >
> > Yes.
>
> How about we start with two APIs:
>	jit_text_alloc(size);
>	jit_text_alloc_range(size, start, end);
>
> AFAICT, arm64 is the only arch that requires the latter API. And TBH, I am
> not quite convinced it is needed.

Right now arm64 and riscv override bpf and kprobes allocations to use the
entire vmalloc address space, but the ability to allocate generated code
outside of the modules area may be useful for other architectures.

Still, start + end for the callers feels backwards to me because it is the
architectures, not the callers, that define the ranges, so we still need a
way for architectures to define how they want to allocate memory for the
generated code.

> > > It still can be achieved with a single jit_alloc_arch_params(), just by
> > > adding enum jit_type parameter to jit_text_alloc().
> >
> > That feels backwards to me; it centralizes a bunch of information about
> > distinct users to be able to shove that into a static array, when the callsites
> > can pass that information.
>
> I think we only have two types of users: module and everything else (ftrace,
> kprobe, bpf stuff). The key differences are:
>
>   1. module uses text and data; while everything else only uses text.
>   2. module code is generated by the compiler, and thus has stronger
>      requirements in address ranges; everything else is generated via some
>      JIT or manually written assembly, so they are more flexible with address
>      ranges (in JIT, we can avoid using instructions that require a specific
>      address range).
>
> The next question is, can we have the two types of users share the same
> address ranges? If not, we can reserve the preferred range for modules,
> and let everything else use the other range. I don't see reasons to further
> separate users in the "everything else" group.

I agree that we can define only two types, modules and everything else, and
let the architectures define whether they need different ranges for these two
types or want the same range for everything.

With only two types we can have two API calls for alloc, and a single
structure that defines the ranges etc. from the architecture side rather than
having them spread all over.

Something along these lines:

	struct execmem_range {
		unsigned long	start;
		unsigned long	end;
		unsigned long	fallback_start;
		unsigned long	fallback_end;
		pgprot_t	pgprot;
		unsigned int	alignment;
	};

	struct execmem_modules_range {
		enum execmem_module_flags flags;
		struct execmem_range text;
		struct execmem_range data;
	};

	struct execmem_jit_range {
		struct execmem_range text;
	};

	struct execmem_params {
		struct execmem_modules_range	modules;
		struct execmem_jit_range	jit;
	};

	struct execmem_params *execmem_arch_params(void);

	void *execmem_text_alloc(size_t size);
	void *execmem_data_alloc(size_t size);
	void execmem_free(void *ptr);

	void *jit_text_alloc(size_t size);
	void jit_free(void *ptr);

Modules, or anything else that must live close to the kernel image, can use
execmem_*_alloc(), and callers that don't generally care about relative
addressing will use jit_text_alloc(), presuming that the arch will restrict
the jit range if necessary. For example, below, on arm64 jit can be anywhere
in vmalloc, while on x86 and s390 it shares the modules range.

	struct execmem_params arm64_execmem = {
		.modules = {
			.flags = KASAN,
			.text = {
				.start = MODULES_VADDR,
				.end = MODULES_END,
				.pgprot = PAGE_KERNEL_ROX,
				.fallback_start = VMALLOC_START,
				.fallback_end = VMALLOC_END,
			},
		},
		.jit = {
			.text = {
				.start = VMALLOC_START,
				.end = VMALLOC_END,
				.pgprot = PAGE_KERNEL_ROX,
			},
		},
	};

	/* x86 and s390 */
	struct execmem_params cisc_execmem = {
		.modules = {
			.flags = KASAN,
			.text = {
				.start = MODULES_VADDR,
				.end = MODULES_END,
				.pgprot = PAGE_KERNEL_ROX,
			},
		},
		.jit = {},	/* implies reusing .modules */
	};

	struct execmem_params default_execmem = {
		.modules = {
			.flags = KASAN,
			.text = {
				.start = VMALLOC_START,
				.end = VMALLOC_END,
				.pgprot = PAGE_KERNEL_EXEC,
			},
		},
	};

-- 
Sincerely yours,
Mike.
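
As a rough illustration only (not part of the posted series), here is a userspace sketch of how generic code might consume such per-architecture ranges. It covers two behaviours from the examples above: an empty jit range implies reusing the modules range, and a range may carry a fallback that is tried when the primary range cannot satisfy the request. Everything here is an assumption made for the sketch: the structs are trimmed copies (no pgprot, alignment, flags or data range), and execmem_range_is_empty(), execmem_jit_range(), execmem_alloc_from(), range_alloc_fn and toy_range_alloc() are hypothetical names, with the toy allocator standing in for whatever the kernel would actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trimmed userspace copy of the proposed struct (pgprot, alignment omitted). */
struct execmem_range {
	unsigned long start;
	unsigned long end;
	unsigned long fallback_start;
	unsigned long fallback_end;
};

/* Flattened stand-in for struct execmem_params: one text range per type. */
struct execmem_params {
	struct execmem_range modules_text;
	struct execmem_range jit_text;
};

/* Hypothetical allocator hook: returns an address in [start, end) or 0. */
typedef unsigned long (*range_alloc_fn)(size_t size, unsigned long start,
					unsigned long end);

/* Assumption: a zero-initialized range means "not set by the arch". */
bool execmem_range_is_empty(const struct execmem_range *r)
{
	return r->start == 0 && r->end == 0;
}

/* An empty jit range implies reusing the modules range. */
const struct execmem_range *execmem_jit_range(const struct execmem_params *p)
{
	assert(p != NULL);
	return execmem_range_is_empty(&p->jit_text) ? &p->modules_text
						     : &p->jit_text;
}

/* Try the primary range first, then the fallback range if one is defined. */
unsigned long execmem_alloc_from(const struct execmem_range *r, size_t size,
				 range_alloc_fn alloc)
{
	unsigned long addr = alloc(size, r->start, r->end);

	if (!addr && r->fallback_start < r->fallback_end)
		addr = alloc(size, r->fallback_start, r->fallback_end);
	return addr;
}

/* Toy allocator for the demo: succeeds only if the request fits the range. */
unsigned long toy_range_alloc(size_t size, unsigned long start,
			      unsigned long end)
{
	return (end > start && size <= end - start) ? start : 0;
}
```

With an arm64-like table, a request too large for the modules range is retried in the fallback range. With an x86-like table whose jit range is left empty, execmem_jit_range() returns the modules range, so callers never need to pass start + end themselves.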