Date: Thu, 7 Jul 2022 13:11:38 -0700
From: Luis Chamberlain
To: Song Liu
Cc: Song Liu, lkml, netdev@vger.kernel.org, x86@vger.kernel.org, dave.hansen@linux.intel.com, rick.p.edgecombe@intel.com, Kernel Team, daniel@iogearbox.net
Subject: Re: [PATCH v5 bpf-next 1/5] module: introduce module_alloc_huge

On Wed, Jul 06, 2022 at 04:39:13AM +0000, Song Liu wrote:
> > On Jul 1, 2022, at 4:20 PM, Luis Chamberlain wrote:
> >
> > On Fri, Jun 24, 2022 at 02:57:08PM -0700, Song Liu wrote:
> >> +void *module_alloc_huge(unsigned long size)
> >> +{
> >> +	gfp_t gfp_mask = GFP_KERNEL;
> >> +	void *p;
> >> +
> >> +	if (PAGE_ALIGN(size) > MODULES_LEN)
> >> +		return NULL;
> >> +
> >> +	p = __vmalloc_node_range(size, MODULE_ALIGN,
> >> +				 MODULES_VADDR + get_module_load_offset(),
> >> +				 MODULES_END, gfp_mask, PAGE_KERNEL,
> >> +				 VM_DEFER_KMEMLEAK | VM_ALLOW_HUGE_VMAP,
> >> +				 NUMA_NO_NODE, __builtin_return_address(0));
> >> +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> >> +		vfree(p);
> >> +		return NULL;
> >> +	}
> >> +
> >> +	return p;
> >> +}
> >
> > 1) When things like kernel/bpf/core.c start using a module alloc it
> >    is time to consider generalizing this.
>
> I am not quite sure what the generalization would look like. IMHO, the
> ideal case would have:
>   a) A kernel_text_rw_allocator, similar to current module_alloc.
>   b) A kernel_text_ro_allocator, similar to current bpf_prog_pack_alloc.
>      This is built on top of kernel_text_rw_allocator. Different
>      allocations could share a page, thus it requires text_poke like
>      support from the arch.
>   c) If the arch supports text_poke, kprobes, ftrace trampolines, and
>      bpf trampolines should use kernel_text_ro_allocator.
>   d) Major archs should support CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC,
>      and they should use kernel_text_ro_allocator for module text.
>
> Does this sound reasonable to you?

Yes, and the respective free call may have arch-specific handling which
removes the exec permissions.

As far as the bikeshedding goes, I do think this is generic, so either
vmalloc_text_*() or vmalloc_exec_*() suffices. Take your pick for a
starter and let the world throw in their preference.

> I tried to enable CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC for x86_64,
> but that doesn't really work. Do we have plan to make this combination
> work?

Oh nice. Good stuff. Perhaps it just requires a little love from mm
folks. Don't beat yourself up if it does not work yet. We can work
towards that later.

> > 2) How we free is important, and each arch does something funky for
> >    this. This is not addressed here.
>
> How should we address this? IIUC, x86_64 just calls vfree.

That's not the case for all archs, is it? I'm talking about the generic
module_alloc() too. I'd like to see that go the way we discussed above.

> > And yes I welcome generalizing generic module_alloc() too as suggested
> > before. The concern on my part is the sloppiness this enables.
>
> One question I have is, does module_alloc (or kernel_text_*_allocator
> above) belong to module code, or mm code (maybe vmalloc)?

The evolution of its uses indicates it is growing outside of modules, and
so mm should be the new home.

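
For illustration, the alloc/free pairing discussed above (a generic
allocator whose free path runs an arch-specific step that removes exec)
can be modeled in userspace. This is only a sketch under stated
assumptions: mmap()/mprotect() stand in for vmalloc and the kernel's
page-permission helpers, and text_alloc_rw(), text_seal_rx(),
arch_text_prepare_free() and text_free() are hypothetical names, not
existing or proposed kernel APIs.

```c
/* Userspace analogy only. All function names here are hypothetical. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round size up to a whole number of pages. */
static size_t text_page_align(size_t size)
{
	long page = sysconf(_SC_PAGESIZE);

	assert(page > 0);
	return (size + (size_t)page - 1) & ~((size_t)page - 1);
}

/* Generic allocation: hand out writable, non-executable memory. */
void *text_alloc_rw(size_t size)
{
	void *p;

	if (size == 0)
		return NULL;
	p = mmap(NULL, text_page_align(size), PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Seal the region once code has been written: read + exec, no write. */
int text_seal_rx(void *p, size_t size)
{
	return mprotect(p, text_page_align(size), PROT_READ | PROT_EXEC);
}

/*
 * Arch hook (weak symbol, a GCC/Clang extension). The default drops exec
 * before the pages are returned. An arch with extra requirements would
 * supply its own strong definition.
 */
__attribute__((weak)) int arch_text_prepare_free(void *p, size_t size)
{
	return mprotect(p, text_page_align(size), PROT_READ | PROT_WRITE);
}

/* Generic free: always runs the arch hook before unmapping. */
int text_free(void *p, size_t size)
{
	if (!p)
		return 0;
	if (arch_text_prepare_free(p, size) < 0)
		return -1;
	return munmap(p, text_page_align(size));
}
```

A caller would use text_alloc_rw(), write its code, call text_seal_rx(),
and later call text_free() with the same size. Routing every free through
one hook is the point of the sketch: callers no longer open-code per-arch
cleanup.
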
> I am planning to let BPF trampoline use bpf_prog_pack on x86_64, which
> is another baby step of c) above.

OK!

  Luis
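
As a rough illustration of item b) in the quoted list (several
allocations sharing one page, with writes funneled through a
text_poke-like helper), here is a toy userspace model loosely in the
spirit of bpf_prog_pack. It is a sketch, not the kernel implementation:
struct text_pack, text_pack_alloc(), text_pack_free() and
text_pack_copy() are hypothetical names, the 4096-byte pack and 64-byte
chunk sizes are assumptions chosen for the example, and a plain byte
array stands in for a read-only executable page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACK_SIZE  4096u			/* one shared "page" (assumed) */
#define CHUNK_SIZE 64u				/* allocation granularity (assumed) */
#define NCHUNKS    (PACK_SIZE / CHUNK_SIZE)	/* 64 chunks, one bit each */

struct text_pack {
	uint8_t  mem[PACK_SIZE];	/* stands in for a RO+X page */
	uint64_t used;			/* bit i set => chunk i is allocated */
};

/* Number of chunks needed to hold size bytes. */
static unsigned int pack_chunks(size_t size)
{
	return (unsigned int)((size + CHUNK_SIZE - 1) / CHUNK_SIZE);
}

/* Find n consecutive free chunks; return the first index or -1. */
static int pack_find_free(const struct text_pack *pack, unsigned int n)
{
	unsigned int start, i;

	for (start = 0; start + n <= NCHUNKS; start++) {
		for (i = 0; i < n; i++)
			if (pack->used & (UINT64_C(1) << (start + i)))
				break;
		if (i == n)
			return (int)start;
	}
	return -1;
}

/* First-fit allocation of whole chunks inside the shared page. */
void *text_pack_alloc(struct text_pack *pack, size_t size)
{
	unsigned int n, i;
	int start;

	if (size == 0 || size > PACK_SIZE)
		return NULL;
	n = pack_chunks(size);
	start = pack_find_free(pack, n);
	if (start < 0)
		return NULL;
	for (i = 0; i < n; i++)
		pack->used |= UINT64_C(1) << ((unsigned int)start + i);
	return pack->mem + (size_t)start * CHUNK_SIZE;
}

/* Release the chunks backing an earlier allocation of the same size. */
void text_pack_free(struct text_pack *pack, void *p, size_t size)
{
	unsigned int start = (unsigned int)(((uint8_t *)p - pack->mem) / CHUNK_SIZE);
	unsigned int n = pack_chunks(size), i;

	for (i = 0; i < n; i++)
		pack->used &= ~(UINT64_C(1) << (start + i));
}

/* text_poke-like helper: the only sanctioned way to write into the pack. */
int text_pack_copy(struct text_pack *pack, void *dst, const void *src, size_t len)
{
	uintptr_t base = (uintptr_t)pack->mem, d = (uintptr_t)dst;

	if (d < base || len > PACK_SIZE || d - base > PACK_SIZE - len)
		return -1;	/* destination is not inside this pack */
	memcpy(dst, src, len);
	return 0;
}
```

Because neighbours share a page, the page itself can never be flipped to
writable for one user without exposing the others, which is why the
model routes every write through a single copy helper.
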