Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp293259ybh; Wed, 15 Jul 2020 02:03:38 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwobhFwDvtpnLhrzUaqjladeRZ8cHNQINmGtxkUrivuRC672WqzC0RUngZYOGMvC+Yo6W/c X-Received: by 2002:a05:6402:1841:: with SMTP id v1mr8593056edy.198.1594803817716; Wed, 15 Jul 2020 02:03:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1594803817; cv=none; d=google.com; s=arc-20160816; b=mxdI7b4n+LGYIByL05TdQZMd1/1EHejxpkZtkIe1Vkimq1DzCG7Os8NRG7aofma1P5 WvyOyefs6MdjEPJ6lxDtfJB8Kq4y8mBvWMevzrULLDxPeTgo8OkqmIZZoJacHv0Rb2vI Z9CGNQZVM8rEHl40kX3U57Fmmynb87m/lDgQSWVOlU9kS6bCeT+3ZwYYznCMgyfJFlD9 kct9qCv5HIBwzJbDVR4dSKH0UPzMxSQUBmpIN4iW5lWcOsvHIo/u5//j4rjsdylBKUJI uH3/JEl/iTPD5NQuh5wabdpMHkX+KQnfFBD20kHR2Ih1im3v/gZ8S9bUcjJkV3tAAvIP 5vVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=GI+LofYfQHrKQaIVgBtIAswcjwI2AeT6g2dEd/lm2xs=; b=wcby8Oy8x1qj65SpY37L5kbXLgcrkeLHRzlwcFg1WP735DMZrHOSSzv6m4c+bdF5Mk y85f3T/+wLKglHjDnuGu4yItS2bXHNkdYDyJywgx4h0D/JzXGDRwtH46k8hRo0d+PLCC imFtY0o8SIKwSdJkjjZF93hQlD8afs1Cch4pK30ccO0EogmfCt5UIAb5oNcmyGWJfE8y F/2bjDAtDrwV7KLLFPfdum+cz56R40i9DEj6aV8NWZwvNGql96BfWN1AeqH+FmIAFAvh KNOg32wRdU0WGcV4Wp5CT9E+5T5TIL/9PpYmWt49G0EgveXSC7TG2BYJdRaAeoPu72gd Cebg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=Ncwm3q4t; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id be25si895517edb.319.2020.07.15.02.03.14; Wed, 15 Jul 2020 02:03:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=Ncwm3q4t; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730030AbgGOI1t (ORCPT + 99 others); Wed, 15 Jul 2020 04:27:49 -0400 Received: from mail.kernel.org ([198.145.29.99]:37426 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726396AbgGOI1q (ORCPT ); Wed, 15 Jul 2020 04:27:46 -0400 Received: from devnote2 (NE2965lan1.rev.em-net.ne.jp [210.141.244.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B24BC206D5; Wed, 15 Jul 2020 08:27:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1594801665; bh=qGb0YPwemNdz/a7HEDVtuPhoJwQFsspP4paClTm11fQ=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=Ncwm3q4tsI1/f2ZO8r9V+Hj+XAonXZp3IyfdE9+UjSMNG9PMt6+7R98W0cxec7iKb vgNIhxA9sadskAYqzAAUQT9xCY8SHU1ZC3dtG52sroPSnLoU/EmtvgkVFfyRYHDOiT De9ah3Lz8fVdTUx2nukJuXaTvy2+0j4tmS2hWl/E= Date: Wed, 15 Jul 2020 17:27:32 +0900 From: Masami Hiramatsu To: Jarkko Sakkinen Cc: linux-kernel@vger.kernel.org, x86@vger.kernel.org, Andi Kleen , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H. Peter Anvin" , "Naveen N. Rao" , Anil S Keshavamurthy , "David S. Miller" , Masami Hiramatsu , Jessica Yu , Andrew Morton , "Aneesh Kumar K.V" , Kees Cook , Will Deacon , Sami Tolvanen , Alexandre Ghiti , Masahiro Yamada , Peter Collingbourne , Frederic Weisbecker , Krzysztof Kozlowski , Arnd Bergmann , Stephen Boyd , Andy Lutomirski , Josh Poimboeuf , Miroslav Benes , Babu Moger , Omar Sandoval , Nayna Jain , Marco Elver , Brian Gerst , Jiri Kosina , Joe Lawrence , Mike Rapoport Subject: Re: [PATCH v3 1/3] kprobes: Add text_alloc() and text_free() Message-Id: <20200715172732.35110a4e53e84fcec9aeac83@kernel.org> In-Reply-To: <20200714223239.1543716-2-jarkko.sakkinen@linux.intel.com> References: <20200714223239.1543716-1-jarkko.sakkinen@linux.intel.com> <20200714223239.1543716-2-jarkko.sakkinen@linux.intel.com> X-Mailer: Sylpheed 3.7.0 (GTK+ 2.24.32; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Jarkko, On Wed, 15 Jul 2020 01:32:27 +0300 Jarkko Sakkinen wrote: > Introduce new API for allocating space for code generaed at run-time > leveraging from module_alloc() and module_memfree() code. Use this to > perform memory allocations in the kprobes code in order to loose the > bound between kprobes and modules subsystem. > > Initially, use this API only with arch/x86 and define a new config > flag CONFIG_ARCH_HAS_TEXT_ALLOC to promote the availability of the > new API. This might need to be extended for the text_alloc() flags as we discuss in previous thread. (e.g. bpf on arm64) > Cc: Andi Kleen > Suggested-by: Peter Zijlstra > Signed-off-by: Jarkko Sakkinen > --- > arch/Kconfig | 2 +- > arch/x86/Kconfig | 3 ++ > arch/x86/kernel/Makefile | 1 + > arch/x86/kernel/module.c | 49 --------------------------- > arch/x86/kernel/text.c | 71 ++++++++++++++++++++++++++++++++++++++++ I think this would better be arch/x86/mm/text_alloc.c (text.c is too generic, and this only provides memory allocation.) > include/linux/text.h | 17 ++++++++++ "text_alloc.h" would be better, or puting the prototype in vmalloc.h will be easier to use. > kernel/kprobes.c | 11 +++++++ > kernel/module.c | 10 ++++++ > 8 files changed, 114 insertions(+), 50 deletions(-) > create mode 100644 arch/x86/kernel/text.c > create mode 100644 include/linux/text.h > > diff --git a/arch/Kconfig b/arch/Kconfig > index 8cc35dc556c7..e3d6b6868ccb 100644 > --- a/arch/Kconfig > +++ b/arch/Kconfig > @@ -61,7 +61,7 @@ config OPROFILE_NMI_TIMER > > config KPROBES > bool "Kprobes" > - depends on MODULES > + depends on MODULES || ARCH_HAS_TEXT_ALLOC > depends on HAVE_KPROBES > select KALLSYMS > help > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index 0dea7fdd7a00..a4ee5d1300f6 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -2035,6 +2035,9 @@ config KEXEC_FILE > config ARCH_HAS_KEXEC_PURGATORY > def_bool KEXEC_FILE > > +config ARCH_HAS_TEXT_ALLOC > + def_bool y > + > config KEXEC_SIG > bool "Verify kernel signature during kexec_file_load() syscall" > depends on KEXEC_FILE > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index e77261db2391..2878e4b753a0 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -68,6 +68,7 @@ obj-y += tsc.o tsc_msr.o io_delay.o rtc.o > obj-y += pci-iommu_table.o > obj-y += resource.o > obj-y += irqflags.o > +obj-y += text.o > > obj-y += process.o > obj-y += fpu/ > diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c > index 34b153cbd4ac..261df078f127 100644 > --- a/arch/x86/kernel/module.c > +++ b/arch/x86/kernel/module.c > @@ -36,55 +36,6 @@ do { \ > } while (0) > #endif > > -#ifdef CONFIG_RANDOMIZE_BASE > -static unsigned long module_load_offset; > - > -/* Mutex protects the module_load_offset. */ > -static DEFINE_MUTEX(module_kaslr_mutex); > - > -static unsigned long int get_module_load_offset(void) > -{ > - if (kaslr_enabled()) { > - mutex_lock(&module_kaslr_mutex); > - /* > - * Calculate the module_load_offset the first time this > - * code is called. Once calculated it stays the same until > - * reboot. > - */ > - if (module_load_offset == 0) > - module_load_offset = > - (get_random_int() % 1024 + 1) * PAGE_SIZE; > - mutex_unlock(&module_kaslr_mutex); > - } > - return module_load_offset; > -} > -#else > -static unsigned long int get_module_load_offset(void) > -{ > - return 0; > -} > -#endif > - > -void *module_alloc(unsigned long size) > -{ > - void *p; > - > - if (PAGE_ALIGN(size) > MODULES_LEN) > - return NULL; > - > - p = __vmalloc_node_range(size, MODULE_ALIGN, > - MODULES_VADDR + get_module_load_offset(), > - MODULES_END, GFP_KERNEL, > - PAGE_KERNEL, 0, NUMA_NO_NODE, > - __builtin_return_address(0)); > - if (p && (kasan_module_alloc(p, size) < 0)) { > - vfree(p); > - return NULL; > - } > - > - return p; > -} Please don't touch this module_alloc() at all. Then we can just call __vmalloc_node_range() in the text_alloc(). > - > #ifdef CONFIG_X86_32 > int apply_relocate(Elf32_Shdr *sechdrs, > const char *strtab, > diff --git a/arch/x86/kernel/text.c b/arch/x86/kernel/text.c > new file mode 100644 > index 000000000000..986b92ff1434 > --- /dev/null > +++ b/arch/x86/kernel/text.c > @@ -0,0 +1,71 @@ > +// SPDX-License-Identifier: GPL-2.0-or-later > +/* > + * Kernel module help for x86. > + * Copyright (C) 2001 Rusty Russell. > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#ifdef CONFIG_RANDOMIZE_BASE > +static unsigned long module_load_offset; > + > +/* Mutex protects the module_load_offset. */ > +static DEFINE_MUTEX(module_kaslr_mutex); > + > +static unsigned long get_module_load_offset(void) > +{ > + if (kaslr_enabled()) { > + mutex_lock(&module_kaslr_mutex); > + /* > + * Calculate the module_load_offset the first time this > + * code is called. Once calculated it stays the same until > + * reboot. > + */ > + if (module_load_offset == 0) > + module_load_offset = > + (get_random_int() % 1024 + 1) * PAGE_SIZE; > + mutex_unlock(&module_kaslr_mutex); > + } > + return module_load_offset; > +} > +#else > +static unsigned long get_module_load_offset(void) > +{ > + return 0; > +} > +#endif So as I pointed, we can ignore this for kprobes (and other dynamic allocated trampoline code). > + > +void *text_alloc(unsigned long size) > +{ > + void *p; > + > + if (PAGE_ALIGN(size) > MODULES_LEN) > + return NULL; > + > + p = __vmalloc_node_range(size, MODULE_ALIGN, > + MODULES_VADDR + get_module_load_offset(), > + MODULES_END, GFP_KERNEL, > + PAGE_KERNEL, 0, NUMA_NO_NODE, > + __builtin_return_address(0)); > + if (p && (kasan_module_alloc(p, size) < 0)) { > + vfree(p); > + return NULL; > + } > + > + return p; > +} > + > +void text_free(void *region) > +{ > + /* > + * This memory may be RO, and freeing RO memory in an interrupt is not > + * supported by vmalloc. > + */ > + WARN_ON(in_interrupt()); > + > + vfree(region); > +} > diff --git a/include/linux/text.h b/include/linux/text.h > new file mode 100644 > index 000000000000..a27d4a42ecda > --- /dev/null > +++ b/include/linux/text.h > @@ -0,0 +1,17 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > + > +#ifndef _LINUX_TEXT_H > +#define _LINUX_TEXT_H > + > +/* > + * An allocator used for allocating modules, core sections and init sections. > + * Returns NULL on failure. > + */ > +void *text_alloc(unsigned long size); > + > +/* > + * Free memory returned from text_alloc(). > + */ > +void text_free(void *region); Hmm, if this is this short, in this version we might better move these to vmalloc.h. > + > +#endif /* _LINUX_TEXT_H */ > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > index 2e97febeef77..fa7687eb0c0e 100644 > --- a/kernel/kprobes.c > +++ b/kernel/kprobes.c > @@ -35,6 +35,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -111,12 +112,20 @@ enum kprobe_slot_state { > > void __weak *alloc_insn_page(void) > { > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC > + return text_alloc(PAGE_SIZE); > +#else > return module_alloc(PAGE_SIZE); > +#endif > } > > void __weak free_insn_page(void *page) > { > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC > + text_free(page); > +#else > module_memfree(page); > +#endif > } > > struct kprobe_insn_cache kprobe_insn_slots = { > @@ -1608,6 +1617,7 @@ static int check_kprobe_address_safe(struct kprobe *p, > goto out; > } > > +#ifdef CONFIG_MODULES > /* > * If the module freed .init.text, we couldn't insert > * kprobes in there. > @@ -1618,6 +1628,7 @@ static int check_kprobe_address_safe(struct kprobe *p, > *probed_mod = NULL; > ret = -ENOENT; > } > +#endif This change is not related to text_alloc(). Please move this to 3/3. > } > out: > preempt_enable(); > diff --git a/kernel/module.c b/kernel/module.c > index aa183c9ac0a2..8adeb126b02c 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -56,6 +56,7 @@ > #include > #include > #include > +#include > #include > #include "module-internal.h" > > @@ -2151,7 +2152,12 @@ void __weak module_memfree(void *module_region) > * supported by vmalloc. > */ > WARN_ON(in_interrupt()); > + > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC > + text_free(module_region); > +#else > vfree(module_region); > +#endif > } > > void __weak module_arch_cleanup(struct module *mod) > @@ -2786,9 +2792,13 @@ static void dynamic_debug_remove(struct module *mod, struct _ddebug *debug) > > void * __weak module_alloc(unsigned long size) > { > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC > + return text_alloc(size); > +#else > return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, > GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, > NUMA_NO_NODE, __builtin_return_address(0)); > +#endif > } Please don't touch kernel/module.c too. This seems to make things complicated. Thank you, -- Masami Hiramatsu